The Pattern We Started Noticing
Episode 2 gave the loop memory. Agents now start tasks with context from similar past work — friction points, patterns, gotchas already documented. That fixed one class of repeated mistakes. A different class persisted.

As the loop took on larger tasks — interface changes, service refactors, shared utility updates — we’d watch the same sequence play out. The agent edits a function signature. Clean implementation. Correct logic. Then it runs the build. Twenty-three errors across fifteen files.

The agent didn’t make a bad decision. It made a decision without knowing the scope of what it was touching. The function it changed was called by 47 modules. Three of them used positional argument ordering that no longer matched. One was a test file that mocked the old signature. One was a macro in another crate that depended on the parameter count. None of that was visible to the agent when it made the change. It found out the way you’d find out if you were new to the codebase and nobody had told you about the dependents: by compiling and watching things break.

This is the reactive loop: change → compile → find breakage → fix → compile again. For a human engineer it’s annoying. For an autonomous loop running without human review, it’s a trust problem.

Why This Is an Information Problem, Not an Intelligence Problem
The reactive loop isn’t a model capability failure. The model can reason about change impact when it has the relevant context. The problem is that the relevant context — the full dependency graph of what calls what, what types flow where, what contracts exist across module boundaries — isn’t available to the agent when it’s deciding where to apply a change.

You can partially address this by providing more files upfront: “here are the callers of this function.” But that requires someone to know which files to include. In a codebase with dozens of repositories and hundreds of modules, nobody has that complete picture ready to hand. The information exists in the codebase. It just isn’t in a form the agent can query.

The reactive loop is a symptom of a missing capability: the agent doesn’t have a structured representation of the codebase it’s operating on.

What Proactive Blast Radius Awareness Looks Like
The difference in practice: Same change. Same agent. Different starting information. With blast radius awareness, before writing anything, the agent calls impact().
The tool returns: 47 direct callers, 8 test files, 3 macro dependents, highest
risk in billing_handler.rs, auth_middleware.rs, macro/derive.rs. The agent
reviews those files first, drafts a migration approach that handles all callers,
and makes the change in one pass.
Two compiler cycles instead of six. Eighteen minutes instead of forty-five.
More importantly: the agent made a decision about how to sequence the work based
on actual knowledge of the codebase, not reactive discovery.
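To make that concrete, a response from such a tool might be shaped like the following sketch. The field names and the symbol name are hypothetical, chosen only to mirror the scenario above; they are not the actual tool schema:

```python
# Hypothetical shape of an impact() response. "process_payment" and
# every field name here are illustrative assumptions, not real schema.
impact_report = {
    "symbol": "process_payment",
    "direct_callers": 47,
    "test_files": 8,
    "macro_dependents": 3,
    "highest_risk": [
        "billing_handler.rs",
        "auth_middleware.rs",
        "macro/derive.rs",
    ],
}

# The agent can gate its plan on the report before editing anything,
# e.g. requiring a migration plan once the caller count is non-trivial.
needs_migration_plan = impact_report["direct_callers"] > 10
```

The point is not the exact fields but that scope is data the agent can branch on, rather than something it discovers from compiler output.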
The Architecture
Three components: a structured code representation, a graph query layer, and an MCP interface for the agent.

Component 1: Tree-sitter AST Parsing
Tree-sitter produces concrete syntax trees for source code. It’s fast (incremental on file changes), language-agnostic, and queryable. For each file, the parser extracts symbols and their relationships.

Component 2: KuzuDB Graph
KuzuDB is an embedded graph database — no separate service, runs in-process. Nodes are symbols (functions, types, modules). Edges are relationships (calls, implements, imports).

Component 3: MCP Server
Four tools exposed to the agent: impact analysis (impact), targeted context (context), flexible queries (query), and post-change verification (detect_changes). The loop calls impact before planning and detect_changes before closing.
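The three components compose into a short pipeline. The following sketch uses Python’s ast module as a stand-in for Tree-sitter and a plain in-memory reverse-call map as a stand-in for KuzuDB; the file names, function names, and single-language scope are illustrative assumptions, not the real system:

```python
import ast
from collections import defaultdict

# Toy "codebase"; the real input is many files across languages.
SOURCE = {
    "billing.py": "def charge(x):\n    return validate(x)\n",
    "orders.py":  "def place_order(o):\n    return charge(o)\n",
    "api.py":     "def handler(req):\n    return place_order(req)\n",
}

# Component 1 stand-in: extract (caller, callee) pairs from one file.
def extract(code):
    pairs = []
    for fn in ast.walk(ast.parse(code)):
        if isinstance(fn, ast.FunctionDef):
            for call in ast.walk(fn):
                if isinstance(call, ast.Call) and isinstance(call.func, ast.Name):
                    pairs.append((fn.name, call.func.id))
    return pairs

# Component 2 stand-in: reverse call graph (callee -> direct callers).
callers = defaultdict(set)
for code in SOURCE.values():
    for caller, callee in extract(code):
        callers[callee].add(caller)

# Component 3 stand-in: the impact tool walks transitive callers,
# i.e. the blast radius of changing a symbol's signature.
def impact(symbol):
    seen, frontier = set(), [symbol]
    while frontier:
        for c in callers[frontier.pop()]:
            if c not in seen:
                seen.add(c)
                frontier.append(c)
    return sorted(seen)

print(impact("charge"))  # → ['handler', 'place_order']
```

An agent that queries the blast radius of charge before editing it knows every function that needs review without a single compile cycle; the graph store and parser change, but the query shape stays this simple.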
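The post-change verification step can be approximated the same way: snapshot the symbol table before and after an edit and diff it. A minimal sketch, again with Python’s ast module standing in for Tree-sitter and parameter count as a deliberately crude signature (the real tool would compare richer symbol data):

```python
import ast

def signatures(code):
    """Map function name -> parameter count (a crude signature proxy)."""
    return {fn.name: len(fn.args.args)
            for fn in ast.walk(ast.parse(code))
            if isinstance(fn, ast.FunctionDef)}

# Hypothetical before/after of an edit that widens one signature
# and introduces one new function.
before = signatures("def charge(amount):\n    return amount\n")
after = signatures(
    "def charge(amount, currency):\n    return amount\n"
    "def refund(amount):\n    return -amount\n"
)

added = after.keys() - before.keys()
changed = {name for name in before.keys() & after.keys()
           if before[name] != after[name]}
print(added, changed)  # → {'refund'} {'charge'}
```

A changed signature is exactly the case from the opening scenario: every caller of charge now needs review, and the diff surfaces that before the build does.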
Why This Changes the Trust Equation
The reactive loop isn’t just slow. It actively undermines confidence in autonomous operation. When an agent makes cascading errors — edits a function, breaks 15 callers, generates fixes that introduce new issues — it’s not because the model reasoned badly. It’s because the model reasoned without the information it needed. The failure mode isn’t intelligence; it’s information architecture.

This matters for where you draw the human oversight boundary. If agents fail because they reason poorly, the answer is better models. If they fail because they reason without crucial context, the answer is better tooling. Blast radius awareness addresses the second category. An agent that can ask “how many things break if I change this signature” before changing it can operate with materially less oversight than one that discovers breakage reactively.

The enforcement layer described in Claude Code hooks addresses a complementary failure mode — agents that violate architectural constraints, rather than ones that misscope changes. The two layers target different parts of the autonomy problem.
Constraints Worth Knowing
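The first constraint discussed in this section, graph freshness, reduces in sketch form to swapping out one file’s edges when that file changes. A minimal illustration, using Python’s ast module as a stand-in for Tree-sitter; all file names, function names, and data are hypothetical:

```python
import ast

def extract(code):
    """(caller, callee) pairs from one file's source."""
    pairs = []
    for fn in ast.walk(ast.parse(code)):
        if isinstance(fn, ast.FunctionDef):
            for call in ast.walk(fn):
                if isinstance(call, ast.Call) and isinstance(call.func, ast.Name):
                    pairs.append((fn.name, call.func.id))
    return pairs

class CodeGraph:
    """Edges keyed by file, so one file's edges can be replaced atomically."""
    def __init__(self):
        self.by_file = {}

    def reindex(self, filename, code):
        # Incremental update: re-parse only the changed file and swap
        # its edges, instead of rebuilding the whole graph per query.
        self.by_file[filename] = extract(code)

    def callers(self, symbol):
        return sorted({caller for edges in self.by_file.values()
                       for caller, callee in edges if callee == symbol})

g = CodeGraph()
g.reindex("orders.py", "def place_order(o):\n    return charge(o)\n")
g.reindex("api.py", "def handler(r):\n    return charge(r)\n")
print(g.callers("charge"))  # → ['handler', 'place_order']

# A file-watch or git hook triggers re-parsing of just that file:
g.reindex("api.py", "def handler(r):\n    return place_order(r)\n")
print(g.callers("charge"))  # → ['place_order']
```

Keying edges by source file is what makes the update incremental: an edit invalidates one key, not the graph.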
The design is clear. A few real constraints in the implementation:

Graph freshness. The graph needs to stay synchronized with the codebase. The right approach is incremental: watch for file changes (or hook into git), re-parse modified files, update affected nodes and edges. Full rebuild on every query is too slow for large codebases.

Multi-language support. Tree-sitter supports most major languages, but the extraction logic differs by language. A Rust codebase and a TypeScript codebase need different parsers. Manageable, but not free.

Cross-repo resolution. In a polyrepo structure, a function call might cross repository boundaries. Full blast radius awareness across repos requires a shared graph spanning repositories — more complex to maintain, but necessary for accurate impact analysis in a distributed codebase.

These are engineering problems, not architecture problems. The design handles them at the cost of implementation complexity.

What’s Next
The loop has memory (Episode 2). It has blast radius awareness (this episode). The next phase is proving the product actually works — not just that the issue count dropped. End-to-end verification: using the product as a real user, across real environments, with real data. The loop will miss things. Production-grade software isn’t correct implementations in isolation — it’s correct behavior under the full combination of real constraints. That’s what the coming episodes will close.

All content represents personal learning from personal and side projects. Code examples are sanitized and generalized. No proprietary information is shared. Opinions are my own and do not reflect my employer’s views.