Anton, chapter 5: LangGraph excised, agents standardized

Monday morning I open the editor and the shape of the week is already clear in my head. Every agent in Anton is a LangGraph subgraph. The parent is a LangGraph state machine. Conversation history is a LangChain message array. The trace viewer parses LangGraph checkpoints. Skills are wrapped as LangChain tools. The framework is not on the side of the system. It is the system.

Two weeks ago I picked LangGraph and I thought it was a great choice. It was. Structure, observability, checkpointing, a clean way to express subgraphs per domain. It got me to a working assistant fast. What I notice now is that none of those things are pulling their weight anymore. Every time I add a delegate I am editing a StateGraph builder. The runtime keeps imposing a node-and-edge mental model on what is, conceptually, just an LLM looping over tools until it is done. LangChain's APIs evolve and break things on unrelated weeks. And there are now three separate representations of the same idea in the codebase: the LangGraph graph, the domain module registry, and the UI's architecture view. They drift. I reconcile them by hand.

The friction has crossed the value. That is the moment.

The decision

The decision lands in one commit and it is austere: two primitives only. Skills and agents. No pseudo-agents. No special endpoints. No classify-then-dispatch pipelines registered as agents. If it does not need reasoning, it is a skill. If it does, it is an agent that loops on a real LLM. Anything that does not fit one of those two shapes does not get to exist.

Excision

The next commit is the brutal one. LangGraph comes out. In its place I write runAgent() in a new packages/agent/, a few hundred lines that do exactly what is actually needed: an LLM call, tool dispatch, a loop bound, trace emission, a permission filter, a validation pass. That is the whole runtime. Reading it back I am almost embarrassed at how small it is. Two weeks of framework, replaced by a function I can hold in my head.

Then the rename. graph becomes agent everywhere: package names, file names, doc copy, UI labels. The "no graph terminology" rule goes into MEMORY.md. Documents that still say "subgraph" are now misleading rather than out-of-date, which is a stronger reason to fix them. A sweep through docs consolidates the lot and resolves the inconsistencies the rename leaves behind.

What falls out of this is what I was actually after. Every agent now has the same one-line shape: a thin function that hands its input to runAgent with a config. Every delegate handler is one line: call the agent, return its text. The Invoke tab, the schedules system, anything that wants to talk to an agent, all see the same surface. Uniformity from the outside is what makes everything else easy from the inside.

The same week, skill naming gets standardized to verb_entity: get_event, create_event, update_event, list_events. Aliases that grew over the past two weeks get folded back into the canonical name (update_event_by_title becomes a path inside update_event). The web domain dissolves: it was duplicating documents and research, and once I look at it without the LangGraph frame there is no reason for it to be its own thing. Naming consistency is a small win on its own. Combined with the new runtime it means the LLM sees one coherent tool surface and the system prompt can describe what an agent does in a paragraph instead of enumerating thirty idiosyncratic commands.

Replication

The 25th, with the runtime quiet and the renames done, I write the replication engine. The framing in the commit message is the Von Neumann probe: clone the entire Anton stack to a new server with one command. The mechanism is unromantic, rsync plus docker compose plus a seed orchestration, but the property it gives me is the one I want. Three reasons it matters now: every household should be able to run its own Anton, the Spark could die and I want a clone to come up cleanly, and I want to be able to spin up a copy to test invasive changes without holding my breath. The replication script does the first cut of all three.

It also surfaces the secret-management problem in a way I cannot ignore anymore. Vaultwarden auth does not survive cloning cleanly. A clone comes up missing the credentials it needs to be useful, and the only way to fix it is by hand on each machine. That defeats the point. I leave it open for now. It is the next problem.

A second transport

The Telegram bridge lands the same week. Same agent backend, same skill surface, different transport. The fact that I can add a whole new way for users to talk to Anton without touching the agent loop is the validation that splitting transport from worker on day one was the right call. The new bridge is a small app that enqueues jobs the same way WhatsApp does. The agents do not know which one they are answering.

Two refinements to the retrieval layer round out the week. Calendar queries now weight calendar facts more heavily, media queries weight media facts: the provenance tags I added the previous week finally do something useful. And document-derived facts stop leaking across users. One person's PDFs cannot show up in another person's recall. The household has more than one human in it; the memory has to know that.

By Thursday night the codebase looks like what I wanted it to look like two weeks ago and could not have known to ask for. Two primitives. One runtime. One naming convention. A replication path. A second transport. The friction that was building all of last week is gone, and what is left is small enough to keep entirely in my head. Which is the only size I trust.