Beads Changed How I Work With Coding Agents

A colleague dropped a link in our chat a few weeks ago. "Have you heard of Gas Town?"

"Mad Max Gas Town?"

"Yes and no."

The link went to Steve Yegge's Gas Town manifesto. If you haven't read it, it's an unhinged multi-part blog series about orchestrating swarms of AI coding agents, and the entire thing is dressed in Mad Max Fury Road nomenclature. I blew right past the warnings about monkeys ripping my face off and couldn't stop reading.

Getting Past the War Rigs

I'll admit I struggled in places. Yegge's naming convention is committed. Pole Cats are ephemeral worker agents. War Rigs are project repositories. Deacons are health monitors. Dogs are the Deacon's investigation crew. The Refinery is a merge queue processor. Mapping these to the mental models I already had for how agent orchestration should work took effort.

But here's the thing. Underneath all the Fury Road cosplay, I kept seeing working solutions to problems I'd been theorizing about for months. How do you give agents persistent memory across sessions? How do you coordinate multiple agents without them stepping on each other? How do you track work in a way that agents can actually parse instead of humans squinting at Jira tickets?

I'm not at Stage 8 yet. Yegge describes eight stages of developer evolution with AI tools, from barely using Copilot all the way up to building your own orchestration system. Gas Town is built for people at Stage 6 and above, running multiple agents in parallel. I'm solidly in the "getting there" camp. But the foundation of the whole thing, in my opinion, is beads. And beads were simple enough that I could start using them immediately.

What Beads Actually Are

Beads is an issue tracker. But not the kind you're thinking of. It's not Jira. It's not Linear. It's not GitHub Issues. It's an issue tracker built from the ground up for AI agent consumption.

Issues are stored in a .beads/issues.jsonl file in your repo, one JSON line per issue. There's a SQLite database alongside it as a queryable cache. Dependencies between issues are first-class citizens with semantic types, not free-text descriptions that an agent has to parse. And the whole thing is git-backed, so your work state persists, syncs, and has history.
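To make that concrete, here's a rough sketch of what lines in that file might look like and how trivially they parse. The field names are my own illustration, not beads' documented schema:

```typescript
// Hypothetical shape of an issue line in .beads/issues.jsonl.
// Field names are illustrative; check the bd docs for the real schema.
interface Bead {
  id: string;
  title: string;
  status: "open" | "in_progress" | "closed";
  issue_type: string;
  deps: string[]; // ids of issues this one is blocked by
}

// Two example lines, as they would sit in the JSONL file.
const jsonl = [
  '{"id":"proj-1","title":"Add auth middleware","status":"closed","issue_type":"task","deps":[]}',
  '{"id":"proj-2","title":"Protect admin routes","status":"open","issue_type":"task","deps":["proj-1"]}',
].join("\n");

// One JSON object per line: split, skip blanks, parse.
const issues: Bead[] = jsonl
  .split("\n")
  .filter((line) => line.trim().length > 0)
  .map((line) => JSON.parse(line) as Bead);
```

That one-object-per-line format is the point: an agent can grep it, diff it, and append to it without a parser fighting back, and git merges it line by line.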

The command that sold me is bd ready --json. It returns only unblocked work. Tasks where every dependency has been resolved. An agent doesn't need to understand your whole project backlog. It just asks "what's ready?" and gets back a structured list of things it can start working on right now.
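The semantics are easy to picture: an issue is ready when it's open and every dependency is closed. Here's a toy model of that filter, my own sketch rather than beads' implementation:

```typescript
// Toy model of the "what's ready?" query: open issues whose every
// dependency is already closed. Not beads' actual implementation.
interface Issue {
  id: string;
  status: "open" | "closed";
  deps: string[]; // ids this issue is blocked by
}

function ready(issues: Issue[]): Issue[] {
  const closed = new Set(
    issues.filter((i) => i.status === "closed").map((i) => i.id)
  );
  return issues.filter(
    (i) => i.status === "open" && i.deps.every((d) => closed.has(d))
  );
}

const backlog: Issue[] = [
  { id: "a", status: "closed", deps: [] },
  { id: "b", status: "open", deps: ["a"] }, // unblocked: a is closed
  { id: "c", status: "open", deps: ["b"] }, // blocked: b is still open
];
// ready(backlog) yields only "b"
```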

Yegge calls it a solution to the "50 First Dates" problem. Every time you start a new session with a coding agent, it wakes up with amnesia. It doesn't know what you were working on yesterday. It doesn't know what's blocked, what's done, what's in progress. Beads gives agents a persistent, structured memory that survives session boundaries. That alone is worth the setup.

Starting Small

I didn't dive into Gas Town. I cloned the beads repo, installed it, and ran opencode. I just started asking the agent to help me understand how to use it.

The docs could use some work, honestly. And some Gas Town-specific concepts have leaked into beads that I don't think belong there. Beads should stand on its own as a tool, and in practice it does, but you'll run into references that assume you're running the full Gas Town stack. Those things aren't in my way, though. Working with the process I've built around beads, you'd never know Gas Town was involved.

The more I used beads the more the idea just made sense to me. Re-envisioning issue tracking for agent consumption instead of human consumption. Once I wrapped my head around that framing, I started looking at all kinds of other problems the same way. ADRs, specs, project documentation, all of it could be rethought for how agents process information rather than how humans scan it. But I'm jumping ahead.

The Workflow I Built

After poking around with beads in opencode, I wrote a handful of custom agents and slash commands. Two full skills. The result is a workflow I've been running for about two weeks now, and I'm really happy with it.

Here's how it works.

I have a command that imports a spec. The spec gets stored as an Epic-type bead. Think of it as the top-level container for a body of work, with the full specification attached.

I then have a decompose command that breaks that epic into child issues. Each child bead is a discrete unit of work with clearly defined scope. The command also maps the dependencies between these issues, making sure the dependency graph is accurate but not overly constrained. You don't want every bead blocking every other bead. You want the minimum set of real dependencies so the agent has maximum parallelism.

Here's where it gets interesting. In the Gas Town manifesto, Yegge describes implementing Jeffrey Emanuel's "Rule of Five," the observation that if you make an LLM review something five times with different focus areas each time, it generates superior outcomes. The implementation itself counts as the first review, so I have the decompose process loop over the beads four additional times. Each pass tightens the scope, refines the dependencies, and catches issues the previous pass missed. By the fourth loop, the beads are tightly scoped with a clean dependency graph. Not too granular, not too vague. Each one is a concrete unit of work that an agent can pick up and execute without ambiguity.
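In pseudo-form, the decompose loop looks something like the sketch below. The focus areas are paraphrased from my own prompts, and refinePass stands in for a real agent invocation, so treat the whole thing as illustrative:

```typescript
// Rule of Five sketch: implementation counts as pass one, then four
// more passes, each with a different review focus. refinePass is a
// stand-in for prompting an agent to rework the bead list.
type BeadDraft = { title: string; scope: string; deps: string[] };

const focuses = [
  "tighten scope: split beads that do more than one thing",
  "refine dependencies: drop edges that are not real blockers",
  "check granularity: merge beads too small to stand alone",
  "hunt ambiguity: every bead must be executable without questions",
];

function refinePass(beads: BeadDraft[], focus: string): BeadDraft[] {
  // In the real workflow this is an LLM call; here it just records
  // that the pass happened.
  return beads.map((b) => ({
    ...b,
    scope: `${b.scope} [reviewed: ${focus.split(":")[0]}]`,
  }));
}

function ruleOfFive(beads: BeadDraft[]): BeadDraft[] {
  return focuses.reduce((acc, focus) => refinePass(acc, focus), beads);
}
```

The design choice that matters is giving each pass a single focus. One pass asked to check everything at once tends to skim; four narrow passes each catch a different class of problem.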

The result is a set of beads wired together by their dependency graph. I can point an agent at the epic and tell it to start working on ready beads. It pulls the first unblocked bead, implements it, marks it done, and the next set of beads that were waiting on it become ready. The agent just chews through the beads until the full spec is implemented. It's almost like magic.
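The whole execution loop reduces to a few lines. Assuming a toy issue model (open/closed status plus a deps list), an agent session is just: pull a ready bead, do the work, close it, repeat:

```typescript
// Toy simulation of an agent chewing through an epic. implement is a
// stand-in for the agent actually doing the work on a bead.
interface WorkBead {
  id: string;
  status: "open" | "closed";
  deps: string[];
}

function runEpic(
  beads: WorkBead[],
  implement: (b: WorkBead) => void
): string[] {
  const order: string[] = [];
  for (;;) {
    const closed = new Set(
      beads.filter((b) => b.status === "closed").map((b) => b.id)
    );
    const next = beads.find(
      (b) => b.status === "open" && b.deps.every((d) => closed.has(d))
    );
    if (!next) break; // no ready beads: epic done (or fully blocked)
    implement(next);
    next.status = "closed";
    order.push(next.id);
  }
  return order;
}
```

Closing one bead is what unblocks the next, so the dependency graph itself is the scheduler; nobody has to decide what comes next.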

The Review Loop

This is the part I'm most proud of. That same decompose command doesn't just create the work beads. It also adds a final bead at the end that's blocked by every other bead in the epic. This is a review bead.

The review bead instructs a different agent, one configured specifically for code review, to look over all the work that's been done for the epic. Every bead's implementation, tested against the original spec. And not just the code. The review agent is also instructed to verify that documentation is up to date. If the implementation changed behavior, the docs need to reflect it. That alone is a huge win. I can't count how many times I've shipped something and forgotten to update the README or the API docs. The review agent doesn't forget.

If the review agent finds issues, it files them as new beads. And if any of those issues are relevant to the current epic, it creates them as children of the epic and makes the review bead dependent on them. So now the review has to stop. The agent goes and fixes the issue. Once that fix bead is done, the review bead becomes ready again and the review agent starts over from the top.
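Mechanically, the whole pattern is just dependency edges. Here's a toy sketch of it, my own illustration rather than beads internals:

```typescript
// Toy model of the review-bead pattern: the review is blocked by every
// work bead, and each filed fix re-blocks it until the fix lands.
interface TrackedBead {
  id: string;
  status: "open" | "closed";
  deps: string[];
}

// The review bead depends on every work bead in the epic.
function addReviewBead(epic: TrackedBead[]): TrackedBead {
  const review: TrackedBead = {
    id: "review",
    status: "open",
    deps: epic.map((b) => b.id),
  };
  epic.push(review);
  return review;
}

// A found issue becomes a new bead, and the review blocks on it,
// forcing a fresh review pass once the fix is done.
function fileFix(epic: TrackedBead[], review: TrackedBead, fixId: string): void {
  epic.push({ id: fixId, status: "open", deps: [] });
  review.deps.push(fixId);
}

function isReady(b: TrackedBead, epic: TrackedBead[]): boolean {
  const closed = new Set(
    epic.filter((x) => x.status === "closed").map((x) => x.id)
  );
  return b.status === "open" && b.deps.every((d) => closed.has(d));
}
```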

It's a self-healing review loop. The review can't complete until every issue it found has been resolved. And if fixing one issue reveals another, that gets filed too. The loop continues until the review agent has nothing left to flag.

This has been working amazingly well. The implementation agent does 90% of the work correctly on the first pass. The review agent catches the remaining 10%, mostly edge cases and spec misalignments. The fix-and-re-review cycle usually takes one or two iterations before the review passes clean.

Surviving Compaction

There's a practical problem with long-running agent sessions: compaction. When your context window fills up, the agent compacts its memory and loses the thread of what it was doing. If you're halfway through an epic with 20 beads, compaction can effectively kill the session. The agent wakes up not knowing what it was working on.

I built a custom plugin that hooks into opencode's events. When a compaction event fires, the plugin immediately re-tasks the agent: here's the epic, here's where you left off, get back to work. The agent gets slapped in the face right after going dim and picks up from the next ready bead. It doesn't need to remember the previous session because the state is in beads. What's done is done. What's ready is ready. The agent just queries bd ready and keeps going.
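The plugin is only a few dozen lines. The sketch below shows the shape of the idea; the hook and event names are placeholders, not opencode's actual plugin API, and the prompt is paraphrased:

```typescript
// Sketch of a compaction-recovery hook. The hook and event names here
// are placeholders for illustration, not opencode's real API surface.
type AgentEvent = { type: string; sessionId: string };

interface Client {
  prompt(sessionId: string, text: string): Promise<void>;
}

function compactionRetask(client: Client, epicId: string) {
  return {
    async onEvent(event: AgentEvent): Promise<void> {
      if (event.type !== "session.compacted") return;
      // Re-task immediately: the durable state lives in beads, not in
      // the agent's context window.
      await client.prompt(
        event.sessionId,
        `Context was compacted. You are working epic ${epicId}. ` +
          `Run \`bd ready --json\`, pick the next ready bead, and continue.`
      );
    },
  };
}
```

Notice the re-task prompt carries almost no context. It doesn't need to, because everything the agent needs to resume is a bd query away.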

This is one of those things that sounds trivial but completely changes the experience. Without the compaction hook, I'd have to notice the agent went quiet, manually re-prompt it, give it context about what it was doing. With the hook, it's seamless. The agent might lose its memory but it never loses the work state.

Why This Works

I think there are a few reasons this workflow is effective.

Beads give agents the right abstraction. A bead is small enough that an agent can hold the entire context in its working memory. The description, the acceptance criteria, the dependencies. It doesn't need to understand the whole project. It just needs to understand this one bead and implement it.

The dependency graph means agents don't step on each other's work. When I eventually scale this to multiple parallel agents, the graph ensures they're working on independent beads. No merge conflicts from two agents touching the same code. No race conditions from one agent building on code another agent hasn't finished yet.

And the review loop catches what the implementation agent misses without requiring me to manually review every line. I still look at the final output. But the review agent does the tedious pass first, and by the time I'm looking at it, the obvious issues are already fixed.

What I'd Change About Beads

The tool isn't perfect. The docs assume too much familiarity with Gas Town concepts, which makes the initial learning curve steeper than it needs to be. I had to experiment more than I should have to figure out basic workflows.

Some Gas Town-specific terminology and behavior has made its way into the beads codebase where it doesn't belong. Beads is useful as a standalone tool, and the more it depends on Gas Town abstractions, the less accessible it becomes to people who just want a better issue tracker for agents.

But these are minor complaints. The core concept, git-backed, dependency-aware, agent-optimized issue tracking, is solid. The implementation is good enough to build real workflows on top of. And the community is moving fast. There are already Rust ports, TUI viewers, and IDE integrations being built.

ADRs as Constraints

Using beads for two weeks has changed how I think about tooling. The question I keep coming back to: what else are we building for human consumption that should be rebuilt for agent consumption?

Architecture Decision Records are the example I can't stop thinking about. For humans, ADRs serve an archaeology purpose. Why did we choose Postgres over Mongo? Why are we using event sourcing for this service? You read the ADR, you understand the reasoning, and you follow the decision. The context, the alternatives considered, the tradeoffs, all of that helps humans internalize the "why" so they can apply the decision correctly in new situations.

An agent doesn't need any of that. It doesn't need to know why you chose Postgres. It doesn't need to know that you evaluated Mongo and rejected it. It just needs to know: use Postgres. More specifically, it needs the constraints that fell out of that decision. Use Postgres. Use connection pooling through PgBouncer. Don't use database-level triggers. Keep migration scripts idempotent. Those are the actionable constraints. The rest is human context.

Beads has a bead type called decision, which is also aliased to adr. I'm not sure what the original intent was, but these beads don't show up when you run bd ready. Which means I can load content into them without breaking anything in the workflow. They're invisible to the worker agents unless something explicitly references them.

So here's what I'm playing with. I extract the constraints from each ADR into decision beads. Just the constraints, not the full decision record. When I import a new spec and create an epic, the importer searches via progressive disclosure for any decision beads that might be relevant to the work. If it finds relevant ADRs, it adds a relates-to dependency between the epic and the decision.

Then when the decompose command runs, it looks at only the filtered list of related decisions and puts those constraints into context. It doesn't see every ADR in the system. It sees the handful that are relevant to this specific epic. And when it decomposes the spec into child beads, the work it scopes is constrained by those ADRs. Use Postgres. Keep migrations idempotent. Connection pooling through PgBouncer.
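Mechanically, this is just one more set of edges in the graph. Here's a toy sketch of the context assembly; the field names and the relates-to shape are my own illustration, not beads' schema:

```typescript
// Toy model: decision beads carry constraints, and an epic points at
// the relevant ones via relates-to edges. Decompose only sees the
// constraints reachable from the epic, never the whole ADR corpus.
interface DecisionBead {
  id: string;
  type: "decision";
  constraints: string[];
}

interface Epic {
  id: string;
  relatesTo: string[]; // ids of related decision beads
}

function constraintsFor(epic: Epic, decisions: DecisionBead[]): string[] {
  const related = new Set(epic.relatesTo);
  return decisions
    .filter((d) => related.has(d.id))
    .flatMap((d) => d.constraints);
}

const decisions: DecisionBead[] = [
  {
    id: "adr-db",
    type: "decision",
    constraints: ["Use Postgres", "Keep migrations idempotent"],
  },
  {
    id: "adr-auth",
    type: "decision",
    constraints: ["Sessions live in Redis"],
  },
];

const epic: Epic = { id: "epic-1", relatesTo: ["adr-db"] };
// constraintsFor(epic, decisions) → ["Use Postgres", "Keep migrations idempotent"]
```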

By the time the worker agent starts implementing the code, the knowledge is embedded in the work itself. The worker doesn't have to know a damn thing about why those constraints exist. It just follows them because they're part of the bead's scope.

This is the pattern I think will spread to everything. Issue trackers were the obvious first target for agent-first redesign. ADRs are next. Runbooks, deployment configs, incident response playbooks, all of it could be decomposed into structured, machine-readable constraints that agents consume without needing the human narrative wrapped around them.

For now, I'm just a guy with a set of custom slash commands and a beads-powered workflow that lets me import a spec and watch agents build it. Two weeks in and I'm shipping faster than I ever have.

The monkeys haven't ripped my face off yet.