What I've built

Apps, open-source tools, research code, and technical articles.

Open Source

Developer tools, libraries, and experiments, built in public.

Open Source • JAX CT

TomoJAX

Differentiable CT reconstruction

A JAX toolkit for differentiable parallel-beam CT projection, reconstruction, and alignment.

Open

Open Source • Code Search

SIFS

SIFS Is Fast Search

A fast hybrid code-search tool for agents, available as a CLI, a Rust crate, and a local MCP server.

Open

Open Source • macOS CLI

clipmem

Local clipboard memory

A local macOS clipboard history backed by SQLite, with a JSON-first CLI for agents and humans.

Open

Open Source • LLM Benchmark

apocalypse-bench

LLM benchmark for survival tasks

A benchmark runner with structured judge rubrics and a Next.js dashboard for LLM survival tasks.

Open

Open Source • React Native UI

react-native-dotgrid

Animated dot-matrix display

An animated dot-matrix component for React Native with Skia rendering and UI-thread animations.

Open

Apps

Native apps for focused workflows and everyday use.

Apps

BrainFlow

AI voice notes • iOS / Android

Voice notes become structured summaries, tasks, and tags, built for long brain-dump recordings.

Open

Apps

Gravity Notes

Minimal Notes • iOS / Android

A simple notes app built on the Append-&-Rescue method: fast capture, fast search, no distractions. macOS is coming soon.

Open

Research

Scientific computing and machine-learning work from imaging research.

Research • ML Segmentation

ParticleSegmentation

Classical ML segmentation

Classical ML pipelines for particle segmentation in synchrotron X-ray tomography data.

Open

Research • Deep Learning

ScrambledSeg

Deep-learning segmentation

A deep-learning segmentation model for augmented synchrotron image data, aimed at better generalisation.

Open

Articles

Technical essays and benchmark write-ups.

Optimising scientific software with large language models

AI can do stuff now

Frontier large language models are now capable of performing impressive and creative feats of intellect. The amount of time models can work for, along with their general tenacity, has increased dramatically over the past several years.

Autoresearch

Andrej Karpathy's autoresearch method gives an agent a program to optimise, tooling, and a test. With minor prompt engineering, the agent can make edits, run experiments, and accumulate improvements over time.
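
The edit, run, keep-if-better cycle described above can be sketched as a greedy loop. This is a minimal illustration, not the autoresearch implementation: the function names `autoresearch_loop`, `toy_propose`, and `toy_error` are hypothetical, and a random perturbation of a single number stands in for the agent editing a program.

```python
import random
from typing import Callable, List, Tuple


def autoresearch_loop(
    initial: float,
    propose: Callable[[float], float],
    evaluate: Callable[[float], float],
    n_experiments: int,
) -> Tuple[float, float, List[Tuple[int, float]]]:
    """Greedy loop: keep a candidate only if it lowers the test error."""
    best, best_error = initial, evaluate(initial)
    kept: List[Tuple[int, float]] = []  # (experiment index, error) of accepted edits
    for i in range(n_experiments):
        candidate = propose(best)       # stand-in for "the agent makes an edit"
        error = evaluate(candidate)     # stand-in for "run the experiment"
        if error < best_error:          # accumulate improvements, discard the rest
            best, best_error = candidate, error
            kept.append((i, error))
    return best, best_error, kept


# Toy problem (assumption): minimise (x - 3)^2 by random perturbation.
rng = random.Random(0)


def toy_error(x: float) -> float:
    return (x - 3.0) ** 2


def toy_propose(x: float) -> float:
    return x + rng.uniform(-0.5, 0.5)


initial_error = toy_error(0.0)
best, best_error, kept = autoresearch_loop(0.0, toy_propose, toy_error, 50)
print(f"kept {len(kept)} of 50 experiments, error {initial_error:.3f} -> {best_error:.3f}")
```

Only accepted edits change the baseline, so a long run can contain many experiments and comparatively few kept tweaks.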

TomoJAX

TomoJAX uses JAX to reconstruct and align tomography data by differentiating a loss between measured and predicted projections with respect to geometric parameters for each view.
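
As a rough sketch of that idea, in notation assumed here rather than taken from TomoJAX: let x be the volume, y_i the measured projection for view i, θ_i the geometric parameters of that view, and P(θ_i) the forward projector. A squared-error loss and a plain gradient step with step size η (both assumptions) would read:

```latex
\mathcal{L}(x, \theta) = \sum_{i=1}^{N} \bigl\| P(\theta_i)\, x - y_i \bigr\|_2^2,
\qquad
\theta_i \leftarrow \theta_i - \eta \, \frac{\partial \mathcal{L}}{\partial \theta_i}
```

Because the projector is written in JAX, the derivative with respect to each θ_i can come from automatic differentiation instead of a hand-derived formula.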

Auto-tomo-research-JAX

Using GPT-5.4, Codex CLI, and Modal GPU runs, the agent explored 213 experiments and found 33 tweaks that lowered reconstruction error, including several substantial improvements.

Article • 26th March 2026

Optimising scientific software with large language models

Technical article

How I used GPT-5.4, Codex CLI, and an autoresearch agent to improve TomoJAX, a JAX-based tomography tool for alignment and reconstruction.

Open

From finger to photon and back: the complete journey of a keystroke through an LLM

Part I: The mechanical-to-electrical boundary

Your finger descends over approximately 10 milliseconds, an eternity in computer time, but constrained by the biomechanics of muscle contraction and the travel distance of a typical key. During this press, you're closing a circuit in a keyboard matrix. Most keyboards don't give each key its own dedicated wire. Instead, keys sit at intersections of a row-column matrix while the keyboard controller rapidly scans columns and reads rows to detect which intersections are closed.
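
The column-scan idea can be sketched in a few lines. This is a simulation under stated assumptions, not keyboard firmware: `scan_matrix` is a hypothetical name, and a set of `(row, col)` pairs stands in for the physically closed switches that real firmware would sense through GPIO pins.

```python
from typing import List, Set, Tuple


def scan_matrix(closed: Set[Tuple[int, int]], n_rows: int, n_cols: int) -> List[Tuple[int, int]]:
    """Return the (row, col) intersections detected as pressed in one scan pass."""
    pressed = []
    for col in range(n_cols):          # drive one column at a time
        for row in range(n_rows):      # read every row line while that column is active
            if (row, col) in closed:   # a closed switch connects this row to the column
                pressed.append((row, col))
    return pressed


# Two keys held down on a hypothetical 4x4 matrix.
held = {(1, 2), (0, 0)}
detected = scan_matrix(held, n_rows=4, n_cols=4)
print(detected)
```

A 4x4 matrix needs 8 wires for 16 keys, which is the saving that makes the scan worthwhile.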

The USB HID report

Once the firmware confirms a genuine keypress, it constructs a Human Interface Device report. The keyboard doesn't send ASCII characters. It sends position-based key codes (HID usage IDs): the code 0x04 identifies a key position, which happens to be A on a US QWERTY layout. Your operating system handles the mapping from these codes to characters.
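
To make the report concrete, here is a minimal decoder sketch. It assumes the 8-byte boot-protocol keyboard layout (one modifier byte, one reserved byte, six key-code slots); real keyboards may describe a different report format. The function name `decode_boot_report` and the letters-only `US_QWERTY` table are illustrative.

```python
from typing import List

# HID keyboard usage IDs 0x04..0x1D cover the letter positions a..z on a US layout.
US_QWERTY = {0x04 + i: chr(ord("a") + i) for i in range(26)}

SHIFT_MASK = 0x02 | 0x20  # left shift and right shift bits in the modifier byte


def decode_boot_report(report: bytes) -> List[str]:
    """Map the key codes in one 8-byte boot keyboard report to characters."""
    if len(report) != 8:
        raise ValueError("boot keyboard reports are 8 bytes long")
    shift = bool(report[0] & SHIFT_MASK)   # byte 0: modifier bitmap
    chars = []
    for usage in report[2:8]:              # bytes 2..7: up to six pressed keys
        if usage == 0:                     # 0 means "no key in this slot"
            continue
        ch = US_QWERTY.get(usage)          # the layout mapping is the OS's job
        if ch is not None:
            chars.append(ch.upper() if shift else ch)
    return chars


plain = decode_boot_report(bytes([0x00, 0, 0x04, 0, 0, 0, 0, 0]))
shifted = decode_boot_report(bytes([0x02, 0, 0x04, 0, 0, 0, 0, 0]))
print(plain, shifted)
```

Swapping `US_QWERTY` for another table changes the character without changing the report, which is why the layout lives in the operating system and not in the keyboard.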

Operating system event handling

Your USB host controller receives the HID report when it completes a scheduled poll. This triggers a hardware interrupt: the CPU saves its current state and jumps to the interrupt handler. The handler copies the report into a kernel buffer, acknowledges the interrupt, and schedules deferred work for the slower processing that turns a raw physical signal into a usable input event.

The physical network layer

Your packet leaves the kernel as bytes in a DMA buffer. The network interface card reads this buffer directly and begins serialization. Each bit transition drives electrons through a differential pair before optical conversion pushes photons into glass fiber, where the signal travels through metro networks, backbone links, amplifiers, switches, and datacenter fabric.

Article • 12th January 2026

From finger to photon and back: the complete journey of a keystroke through an LLM

Technical article

What actually happens when you press a key and an AI responds? I mean physics-wise. The version where we trace individual electrons through transistor channels, watch photons bounce through 3,000 kilometres of glass fibre, and count the floating-point operations that separate your question from the model's answer.

Open

Apocalypse-bench: Would your LLM kill you?

Welcome to apocalypse-bench.

I spent the past few days building a benchmark for a question nobody was asking: how useful are LLMs when you need to not just survive, but rebuild civilisation from the ground up? Not chatbots. Not coding helpers. Actual field guides for situations where getting the answer wrong means the survivors die, and humanity is lost for good.

How the scoring works

You can't grade 1,830 survival answers by hand, so I used an LLM-as-judge approach. Each candidate answer gets sent to a separate judge model along with the original question and a structured rubric. The judge returns scores for each criterion and flags whether any auto-fail conditions were triggered.
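
A sketch of how such judge output might be aggregated. The field names, the score scale, and the rule that an auto-fail zeroes the answer are assumptions made for illustration; the excerpt only says the judge returns per-criterion scores and an auto-fail flag.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class JudgeResult:
    """One judged answer: per-criterion rubric scores plus an auto-fail flag."""
    criterion_scores: Dict[str, float]
    auto_fail: bool


def answer_score(result: JudgeResult) -> float:
    """Mean rubric score, or 0.0 when an auto-fail condition fired (assumed rule)."""
    if result.auto_fail or not result.criterion_scores:
        return 0.0
    scores = result.criterion_scores.values()
    return sum(scores) / len(scores)


def auto_fail_rate(results: List[JudgeResult]) -> float:
    """Fraction of answers that triggered at least one auto-fail condition."""
    if not results:
        return 0.0
    return sum(r.auto_fail for r in results) / len(results)


# Hypothetical judge outputs for two answers.
results = [
    JudgeResult({"safety": 8.0, "feasibility": 6.0}, auto_fail=False),
    JudgeResult({"safety": 9.0, "feasibility": 9.0}, auto_fail=True),
]
scores = [answer_score(r) for r in results]
print(scores, auto_fail_rate(results))
```

Tracking the auto-fail rate separately from the mean keeps a well-written but dangerous answer from hiding inside an average.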

The quick overview

If you just want the survival rankings, OpenAI gpt-oss-20b leads the pack, but mean score only tells part of the story. The more important question is how often each model's advice would either get you killed, or be completely useless. Chemistry was brutal across the board. Ethics and Organisation saw almost no auto-fails.

The models: worst to best

Llama 3.1 finished last in every difficulty tier. Liquid was brilliant at irrigation and dangerous in medicine. Nemotron knew textbook definitions but lacked the imagination to improvise. Qwen3 was brilliant, inventive, and unstable. The results show that survival competence lives somewhere different from ordinary benchmark performance.

Article • 22nd December 2025

Apocalypse-bench: Would your LLM kill you?

Technical article

The dust has settled from whatever satisfying calamity ended civilisation as you knew it. Survivors crawl from the rubble clutching their loved ones and their MacBook Pros. The grid is dead. The internet is a memory.

Open

Planning a software project?

Send me a short message about your product, data, or automation problem and I'll tell you how I would approach it.

Email • WhatsApp