Building, testing, and thinking about AI
RSS FeedLongform essays on AI, human agency, and the systems we build with them — written from an unusual vantage point: an operator running a production system on LLMs every day, not a frontier lab or academia.
Start with the essay below, or read more about this site .
Featured
-
How do you know a correction held? Instrumenting an agent in production
Functional scars make a correction persist. They don't tell you whether the system as a whole is getting better. So I instrumented every session — deterministic, zero-token, never blocking — and let a monthly pass turn the evidence into mechanism changes. The first thing the data caught was me.
-
The same scar, two agents
fscars 0.4.0 promotes the Codex adapter from instructions to native hooks. A correction you write once now fires deterministically in both Claude Code and Codex — and the scar itself does not change. Here is how each side works, and the one honest limitation.
-
A month of functional scars: 934 fires, one broken validation loop, and what it cost
I wired ten functional scars into my own workspace and let them run for a month. Half the signal came from one hook with documented false positives, and 3,838 captured opportunities sat unread. Here are the numbers and what they forced me to build.
-
Functional Scars — turning corrections into a primitive
fscars 0.1.0 is out — a bolt-on correction primitive for AI coding agents, built on the framework from the Lucy Syndrome paper. Apache 2.0, pip install fscars.
-
The Lucy Syndrome: Why LLMs Forget Corrections
LLMs don't remember yesterday. That gap has a name, a causal mechanism, and a fix that doesn't require better memory.
-
The Lucy Syndrome and AI
LLMs don't remember yesterday — and that gap has a name. A five-part essay on the Lucy Syndrome, functional scars, and what it takes for a production system to actually learn.
-
Questions and answers
Questions about the Lucy Syndrome essay — its scope, its method, and what functional scars actually look like in operation. Compiled from real conversations and updated as new questions arrive.
-
Where this came from
The informal companion to the Lucy Syndrome essay — how the observation started, how the system around it took shape, and why an operator in Paraguay ended up writing about model amnesia.
Recent Posts
-
Anatomy of a four-minute boot
Where my Claude Code sessions actually spent their first four minutes, and what cut it to thirty seconds without losing any context. Hooks were three seconds of the problem. The model's diligence was the rest.
-
Calibrate against your own voice
AI detectors flag non-native English as machine-written. callus scores your draft against your own voice instead. pip install callus.
-
The Pixar precedent: vibe coding has been here before
Debates over disruptive tools follow a pattern. The vibe coding fight is at a recognisable stage of it. Notes from someone who passed through an earlier version of the same argument.
-
From memory to scar: a four-layer progression
Anthropic shipped managed memory stores in April. They sit at the third of four layers. The fourth, hooks, is the one that closes the Lucy loop.