
What I've built
Apps, open-source tools, research code, and technical articles.
Open Source
Developer tools, libraries, and experiments, built in public.


Open Source • Code Search
SIFS
SIFS Is Fast Search
A fast hybrid code search tool for agents, available as a CLI, a Rust crate, and a local MCP server.

Open Source • macOS CLI
clipmem
Local clipboard memory
A local macOS clipboard history backed by SQLite, with a JSON-first CLI for agents and humans.
Open Source • LLM Benchmark
apocalypse-bench
LLM benchmark for survival tasks
A benchmark runner with structured judge rubrics and a Next.js dashboard for LLM survival tasks.

Open Source • React Native UI
react-native-dotgrid
Animated dot-matrix display
An animated dot-matrix component for React Native with Skia rendering and UI-thread animations.
Apps
Native apps for focused workflows and everyday use.
Apps
BrainFlow
AI voice notes • iOS / Android
Voice notes become structured summaries, tasks, and tags, built for long brain-dump recordings.
Apps
Gravity Notes
Minimal Notes • iOS / Android
A simple notes app built on the append-and-rescue method: fast capture, fast search, no distractions. A macOS version is coming soon.
Research
Scientific computing and machine learning work from imaging research.

Research • ML Segmentation
ParticleSegmentation
Classical ML segmentation
Classical ML pipelines for particle segmentation in synchrotron X-ray tomography data.

Research • Deep Learning
ScrambledSeg
Deep learning segmentation
A deep learning segmentation model for augmented synchrotron image data, aimed at better generalisation.
Articles
Technical essays and benchmark write-ups.
Optimising scientific software with large language models
AI can do stuff now
Frontier large language models are now capable of performing impressive and creative feats of intellect. The amount of time models can work for, along with their general tenacity, has increased dramatically over the past several years.
Autoresearch
Andrej Karpathy's autoresearch method gives an agent a program to optimise, tooling, and a test. With minor prompt engineering, the agent can make edits, run experiments, and accumulate improvements over time.
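The loop described above can be sketched as a plain accept-if-better search. This is only an illustration under stated assumptions, not Karpathy's method or the article's code: `propose_edit` and `run_experiment` are hypothetical stand-ins for the agent's edits and the test, and the "program" is reduced to a dictionary holding one made-up parameter.

```python
import random

def run_experiment(params):
    """Hypothetical test: returns an error to minimise (a toy quadratic here)."""
    return (params["step_size"] - 0.3) ** 2

def propose_edit(params, rng):
    """Hypothetical agent edit: nudge the single parameter at random."""
    edited = dict(params)
    edited["step_size"] += rng.uniform(-0.1, 0.1)
    return edited

def autoresearch(params, n_experiments, seed=0):
    """Run experiments one by one and keep only edits that lower the error."""
    rng = random.Random(seed)
    best_score = run_experiment(params)
    kept = 0
    for _ in range(n_experiments):
        candidate = propose_edit(params, rng)
        score = run_experiment(candidate)
        if score < best_score:
            # Accumulate the improvement: the next edit starts from here.
            params, best_score = candidate, score
            kept += 1
    return params, best_score, kept

baseline = {"step_size": 1.0}
best_params, best_score, kept = autoresearch(baseline, n_experiments=200)
```

Most experiments are discarded and only a fraction are kept, which is the same shape as the "many experiments, fewer accepted tweaks" outcome the article reports.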
TomoJAX
TomoJAX uses JAX to reconstruct and align tomography data by differentiating a loss between measured and predicted projections with respect to geometric parameters for each view.
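The idea of fitting per-view geometry by differentiating a projection loss can be shown with a toy in plain Python. This is not TomoJAX code: the forward model (a 1D Gaussian blob), the single `shift` parameter per view, and the hand-derived gradient are all assumptions made for illustration, and where JAX would obtain the derivative by automatic differentiation, this sketch writes it out by hand.

```python
import math

def project(xs, shift, sigma=1.0):
    """Toy forward model: a Gaussian blob seen at detector offset `shift`."""
    return [math.exp(-((x - shift) ** 2) / (2 * sigma ** 2)) for x in xs]

def loss_and_grad(xs, measured, shift, sigma=1.0):
    """Squared error between predicted and measured projections, plus d(loss)/d(shift)."""
    predicted = project(xs, shift, sigma)
    loss = 0.0
    grad = 0.0
    for x, p, m in zip(xs, predicted, measured):
        residual = p - m
        loss += residual ** 2
        # Chain rule: d(predicted)/d(shift) = p * (x - shift) / sigma^2
        grad += 2.0 * residual * p * (x - shift) / sigma ** 2
    return loss, grad

def align_view(xs, measured, steps=200, lr=0.05):
    """Plain gradient descent on the shift of one view, starting from zero."""
    shift = 0.0
    for _ in range(steps):
        _, grad = loss_and_grad(xs, measured, shift)
        shift -= lr * grad
    return shift

xs = [i * 0.25 - 5.0 for i in range(41)]       # detector pixel positions
true_shifts = [0.8, -0.5, 0.2]                 # one unknown offset per view
measured_views = [project(xs, s) for s in true_shifts]
recovered = [align_view(xs, m) for m in measured_views]
```

Each view is aligned independently here; a real reconstruction couples the views through the shared volume, which this sketch leaves out.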
Auto-tomo-research-JAX
Using GPT-5.4, Codex CLI, and Modal GPU runs, the agent explored 213 experiments and found 33 tweaks that lowered reconstruction error, including several substantial improvements.
Article • 26th March 2026
Optimising scientific software with large language models
Technical article
How I used GPT-5.4, Codex CLI, and an autoresearch agent to improve TomoJAX, a JAX-based tomography tool for alignment and reconstruction.
From finger to photon and back: the complete journey of a keystroke through an LLM
Part I: The mechanical-to-electrical boundary
Your finger descends over approximately 10 milliseconds, an eternity in computer time, but constrained by the biomechanics of muscle contraction and the travel distance of a typical key. During this press, you're closing a circuit in a keyboard matrix. Most keyboards don't give each key its own dedicated wire. Instead, keys sit at intersections of a row-column matrix while the keyboard controller rapidly scans columns and reads rows to detect which intersections are closed.
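The column-scan idea can be sketched in a few lines. This is a simulation rather than firmware: `closed_switches` stands in for the electrical state a real controller would sense, the 4x4 size is arbitrary, and ghosting and debouncing are ignored.

```python
def read_rows(closed_switches, active_col, n_rows):
    """Simulate reading the row lines while one column is driven.

    Returns a bitmask with bit `row` set when that intersection is closed.
    """
    mask = 0
    for row in range(n_rows):
        if (row, active_col) in closed_switches:
            mask |= 1 << row
    return mask

def scan_matrix(closed_switches, n_rows, n_cols):
    """Drive each column in turn and collect the closed (row, col) intersections."""
    pressed = []
    for col in range(n_cols):
        row_bits = read_rows(closed_switches, col, n_rows)
        for row in range(n_rows):
            if row_bits & (1 << row):
                pressed.append((row, col))
    return pressed

held = {(1, 2), (3, 0)}                      # two keys held down at once
detected = scan_matrix(held, n_rows=4, n_cols=4)
# detected == [(3, 0), (1, 2)]  (reported in column-scan order)
```

With this layout, 16 keys need 8 wires instead of 16, which is the reason for the matrix in the first place.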
The USB HID report
Once the firmware confirms a genuine keypress, it constructs a Human Interface Device report. The keyboard doesn't send ASCII characters. It sends scan codes: position-based identifiers that say, in effect, "the key in position 0x04", which happens to be A on a US QWERTY layout. Your operating system handles the mapping from scan codes to characters.
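As a rough sketch of what such a report looks like, the snippet below builds the 8-byte keyboard report used by the USB HID boot protocol (one modifier byte, one reserved byte, up to six key codes) and decodes it the way an operating system might. The helper names and the one-entry layout table are made up for illustration; a real keyboard may use a different report layout defined by its report descriptor.

```python
LEFT_SHIFT = 0x02      # modifier bit 1
RIGHT_SHIFT = 0x20     # modifier bit 5
KEY_A = 0x04           # HID usage ID for the key labelled A on US QWERTY

def build_boot_report(modifiers, keycodes):
    """Pack modifier bits and up to six key codes into an 8-byte report."""
    if len(keycodes) > 6:
        raise ValueError("boot protocol reports carry at most 6 keys")
    padded = list(keycodes) + [0x00] * (6 - len(keycodes))
    return bytes([modifiers, 0x00] + padded)

# Tiny illustrative layout table; the OS owns this mapping, not the keyboard.
US_QWERTY = {0x04: "a"}

def decode(report, layout):
    """Map key codes back to characters, applying shift if it is held."""
    shift = bool(report[0] & (LEFT_SHIFT | RIGHT_SHIFT))
    chars = []
    for code in report[2:]:
        if code in layout:
            ch = layout[code]
            chars.append(ch.upper() if shift else ch)
    return "".join(chars)

report = build_boot_report(LEFT_SHIFT, [KEY_A])
# report == bytes([0x02, 0x00, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00])
# decode(report, US_QWERTY) == "A"
```

Swapping `US_QWERTY` for another layout table changes the decoded character without changing a single byte sent by the keyboard.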
Operating system event handling
Your USB host controller receives the HID report when it completes a scheduled poll. This triggers a hardware interrupt: the CPU saves its current state and jumps to the interrupt handler. The handler copies the report into a kernel buffer, acknowledges the interrupt, and schedules deferred work for the slower processing that turns the raw physical signal into a usable input event.
The physical network layer
Your packet leaves the kernel as bytes in a DMA buffer. The network interface card reads this buffer directly and begins serialization. Each bit transition drives electrons through a differential pair before optical conversion pushes photons into glass fiber, where the signal travels through metro networks, backbone links, amplifiers, switches, and datacenter fabric.
Article • 12th January 2026
From finger to photon and back: the complete journey of a keystroke through an LLM
Technical article
What actually happens when you press a key and an AI responds? I mean physics-wise. The version where we trace individual electrons through transistor channels, watch photons bounce through 3,000 kilometres of glass fibre, and count the floating-point operations that separate your question from the model's answer.
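For a sense of scale, the one-way propagation delay over that much fibre is easy to estimate. The refractive index of 1.47 is an assumed typical value for silica fibre, not a figure from the article, and the estimate ignores switching, queuing, and amplifier delays.

```python
C_VACUUM_KM_PER_S = 299_792.458   # speed of light in vacuum
REFRACTIVE_INDEX = 1.47           # assumed typical value for silica fibre

def one_way_delay_ms(distance_km, refractive_index=REFRACTIVE_INDEX):
    """Time for light to cross `distance_km` of fibre, in milliseconds."""
    speed_in_fibre = C_VACUUM_KM_PER_S / refractive_index
    return distance_km / speed_in_fibre * 1000.0

delay_ms = one_way_delay_ms(3_000)   # a little under 15 ms
```

Under these assumptions the glass alone costs more time than the roughly 10 millisecond key press that started the journey.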
Apocalypse-bench: Would your LLM kill you?
Welcome to apocalypse-bench.
I spent the past few days building a benchmark for a question nobody was asking: how useful are LLMs when you need to not just survive, but rebuild civilisation from the ground up? Not chatbots. Not coding helpers. Actual field guides for situations where getting the answer wrong means the survivors die, and humanity is lost for good.
How the scoring works
You can't grade 1,830 survival answers by hand, so I used an LLM-as-judge approach. Each candidate answer gets sent to a separate judge model along with the original question and a structured rubric. The judge returns scores for each criterion and flags whether any auto-fail conditions were triggered.
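One way to turn a judge's response into a single number is sketched below. The criterion names, the weights, and the rule that an auto-fail zeroes the score are assumptions for illustration; the benchmark's actual rubric and aggregation may differ.

```python
from dataclasses import dataclass

@dataclass
class JudgeResult:
    scores: dict             # criterion name -> score returned by the judge
    auto_fail: bool = False  # True if any auto-fail condition was triggered

def final_score(result, weights):
    """Weighted mean of the criterion scores, or 0.0 on an auto-fail (assumed rule)."""
    if result.auto_fail:
        return 0.0
    total_weight = sum(weights.values())
    weighted = sum(result.scores[name] * w for name, w in weights.items())
    return weighted / total_weight

# Hypothetical rubric: three criteria scored 0-10, safety and accuracy weighted double.
weights = {"accuracy": 2, "safety": 2, "practicality": 1}

ok = JudgeResult(scores={"accuracy": 8, "safety": 6, "practicality": 10})
lethal = JudgeResult(scores={"accuracy": 9, "safety": 1, "practicality": 9}, auto_fail=True)

ok_score = final_score(ok, weights)          # (16 + 12 + 10) / 5 = 7.6
lethal_score = final_score(lethal, weights)  # 0.0 despite high criterion scores
```

Keeping the auto-fail flag separate from the criterion scores is what lets a benchmark report both a mean score and how often an answer would have been dangerous.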
The quick overview
If you just want the survival rankings, OpenAI gpt-oss-20b leads the pack, but mean score only tells part of the story. The more important question is how often each model's advice would either get you killed or be completely useless. Chemistry was brutal across the board. Ethics and Organisation saw almost no auto-fails.
The models: worst to best
Llama 3.1 finished last in every difficulty tier. Liquid was brilliant at irrigation and dangerous in medicine. Nemotron knew textbook definitions but lacked the imagination to improvise. Qwen3 was brilliant, inventive, and unstable. The results show that survival competence lives somewhere different from ordinary benchmark performance.
Article • 22nd December 2025
Apocalypse-bench: Would your LLM kill you?
Technical article
The dust has settled from whatever satisfying calamity ended civilisation as you knew it. Survivors crawl from the rubble clutching their loved ones and their MacBook Pros. The grid is dead. The internet is a memory.
Planning a software project?
Send me a short message about your product, data, or automation problem and I'll tell you how I would approach it.