What Comes After Vibe Coding
Why agents need a deterministic engineering substrate for orientation, not more memory
By Andrei Roman
Principal Architect, The Foundry
TL;DR: A new layer of understanding is coming. It is not memory, and repo-graph is my first serious attempt at building that layer.
1. The Bottleneck Moved
I'm looking back on two years of headlines treating software engineering as if its main bottleneck was typing. It was not.
AI made that mistake obvious. Code generation got cheap. System understanding did not. The result is familiar by now: agents that look magical in greenfield demos and get lost in real repositories. The problem is not that the models cannot write code. The problem is that most codebases have no deterministic, queryable representation of their own structure, ownership, boundaries, runtime context, or verification state.
That is why agents look brilliant in empty projects and unstable in old ones.
2. Why Agents Fail in Real Repos
I wrote about this in "Missing Links in Agentic Coding." The short version: in a new repo, an agent can get away with local reasoning. In a ten-year codebase, local reasoning is not enough. The hard part is not producing syntax. The hard part is entering a system whose real structure is distributed across code, conventions, build files, runtime wiring, dead abstractions, and undocumented historical scars. Agents compensate with guesses. Guess accumulation becomes architectural drift. Drift becomes the next developer's nightmare.
This is not a completely new problem. Michael Feathers framed the older version of it years ago in Working Effectively with Legacy Code: the real danger is not writing code, but changing systems whose behavior you cannot safely see, isolate, and verify. What AI changes is the scale and speed of the failure mode. The old legacy-code problem was already about missing structure and weak feedback. Agents just make that absence impossible to ignore.
3. The Missing Layer: Engineering Orientation
What agents need first is not prose. They need orientation.
They need to know what modules exist, what they own, where the seams are, what runtime each module belongs to, which edges are unresolved, which facts are inferred, and what changed since the last known state. That is not "documentation." That is operational intelligence. And it does not exist in most repositories.
The category I am pointing at is a deterministic repo orientation system. That sounds heavy. In plain English: a system that gives fast, queryable truth about the current codebase. Not a chat UI. Not a generic vector search layer. Not "AI memory." Not just code indexing. A queryable graph of what exists, what owns what, what boundaries matter, what can be trusted, what changed, and what policy and evidence apply.
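To make "queryable truth" concrete, here is a minimal sketch of what such a store could look like. Every name in it (`Module`, `Edge`, `RepoGraph`, the ownership and runtime fields) is illustrative, not repo-graph's actual API; the point is that each question has one deterministic answer derived from the current repo state, with unresolved structure kept as data rather than dropped.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a deterministic repo-orientation store.
# All names here are illustrative, not repo-graph's real schema.

@dataclass(frozen=True)
class Module:
    name: str
    owner: str      # team or person that owns the module
    runtime: str    # e.g. "web", "worker", "cli"

@dataclass(frozen=True)
class Edge:
    src: str        # calling module
    dst: str        # called module
    resolved: bool  # statically proven, or only inferred?

@dataclass
class RepoGraph:
    modules: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def owner_of(self, name: str) -> str:
        # Deterministic: same repo state, same answer. No retrieval ranking.
        return self.modules[name].owner

    def callers_of(self, name: str) -> list:
        return [e.src for e in self.edges if e.dst == name]

    def unresolved_edges(self) -> list:
        # Uncertainty is preserved as data, not discarded.
        return [e for e in self.edges if not e.resolved]

g = RepoGraph()
g.modules["billing"] = Module("billing", owner="payments-team", runtime="worker")
g.modules["api"] = Module("api", owner="platform-team", runtime="web")
g.edges.append(Edge("api", "billing", resolved=True))
g.edges.append(Edge("jobs", "billing", resolved=False))  # config-driven wiring

print(g.owner_of("billing"))      # payments-team
print(g.callers_of("billing"))    # ['api', 'jobs']
print(len(g.unresolved_edges()))  # 1
```

Note the contrast with retrieval: a vector store might return documents that mention billing; this returns the owner of billing, and a different answer would mean the repo changed.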
4. Why Chat, Search, Breadcrumbs and RAG Are Not Enough
Current tools fill adjacent slots. None of them fill this one.
IDE and chat tools are good at generating code, patching, and explaining local snippets. They are bad at stable repo-scale truth, preserving uncertainty, and linking policy to evidence. They operate at the file level; an orientation system operates at the architecture level.
Grep, ripgrep, and tree-sitter scripts are good at ad hoc discovery and one-off exploration. They are bad at persistent identity, trust surfaces, and versioned declarations. They answer "where is this string" but not "who owns this module" or "is this boundary enforced."
Vector and RAG solutions are good at fuzzy retrieval and semantic recall. They are bad at deterministic structural answers, graph completeness, and boundary reasoning. They retrieve things that look similar. They cannot tell you whether a dependency edge is resolved or inferred.
The current stopgap is what I call "breadcrumbs": leaving Markdown files everywhere so the agent stumbles upon them, or, in the more organized version, storing them in Obsidian.
But the question "who calls this?" should be a deterministic query. The question "is this safe to change?" should not be answered from vibes or README files. It should combine graph facts, boundary facts, declared constraints, and evidence.
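The "is this safe to change?" question can be sketched as a function over exactly those four inputs. This is a hypothetical illustration, assuming made-up field names and a made-up `billing` module, not repo-graph's implementation; what matters is that the answer is a computed verdict with reasons, not a retrieval result.

```python
# Hypothetical sketch: answering "is this safe to change?" from facts,
# not vibes. Inputs and field names are illustrative assumptions.

def safe_to_change(module, graph_facts, boundaries, constraints, evidence):
    reasons = []
    # Graph facts: unresolved inbound edges mean callers we cannot see.
    if any(not e["resolved"] for e in graph_facts.get(module, [])):
        reasons.append("unresolved inbound edges")
    # Boundary facts: modules on a forbidden boundary need extra care.
    if module in boundaries.get("forbidden", []):
        reasons.append("crosses a forbidden boundary")
    # Declared constraints with no attached evidence are unmet obligations.
    for c in constraints.get(module, []):
        if c not in evidence.get(module, []):
            reasons.append(f"missing evidence for: {c}")
    return (len(reasons) == 0, reasons)

ok, why = safe_to_change(
    "billing",
    graph_facts={"billing": [{"src": "jobs", "resolved": False}]},
    boundaries={"forbidden": ["billing"]},
    constraints={"billing": ["integration-tests-pass"]},
    evidence={"billing": []},
)
print(ok)   # False
print(why)  # three concrete reasons, not a hunch
```

An agent that receives `why` can act on each reason; an agent that receives similar-looking snippets can only guess.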
5. What I Started Building Instead
That realization is what triggered repo-graph. Not a chatbot. Not a Copilot wrapper. A deterministic multi-language indexing and trust-analysis system that already does the unglamorous but necessary work: preserve unresolved structure instead of dropping it, classify what is unknown, attach declarations and obligations, and compute gate outcomes over the result.
The previous article forced the system into existence. Writing about why agents fail in legacy codebases made the gap concrete enough to build against. Repo-graph is the concrete implementation: working deterministic indexing, a stable graph, unresolved structure preserved as first-class data, trust analysis, declarations and policy overlays, obligation and gate evaluation, and multi-language extractors. It already does these things. It is not a slide deck.
I am building this in a way that actually proves the point. I am not deeply familiar with every implementation detail in the stack. What I have instead is a strong architectural target. Claude Code does much of the code-writing work. Codex reviews it against the stated vision. I supply intention, constraints, and judgment. It feels very weird, as if they are hiding something from me.
6. The Three-Layer Truth Model
Most tools collapse the world too early. They either pretend unresolved structure does not matter, or they smear everything into one low-confidence soup. The right model has three layers.
Layer one: extracted facts. What the code actually declares. Functions, types, imports, exports, call edges, file boundaries. Deterministic. Machine-readable. No inference.
Layer two: explicit unresolved observations. Edges that exist in code but cannot be fully resolved from static analysis alone. Dynamic dispatch, runtime wiring, config-driven injection, convention-based routing. Instead of dropping these or pretending they are resolved, the system marks them as unresolved with confidence metadata. The uncertainty is data, not a bug.
Layer three: interpretation layers on top. Trust analysis, boundary classification, module ownership, runtime model assignment, requirements traceability, waivers, policy overlays. These are declared, versioned, queryable. They sit on top of the graph, not inside it. You can change a policy without re-indexing the repo. You can ask "does this change violate a boundary?" and get a deterministic answer instead of a probabilistic guess.
Suppose an agent wants to change a payment flow.
Layer one tells it which modules, symbols, and edges are statically known.
Layer two tells it that one callback path is unresolved because runtime wiring happens through configuration.
Layer three tells it that this module crosses a forbidden boundary and that changes to it require specific evidence before merge.
That is a very different world from searching around and hoping.
Once you separate those three layers, uncertainty stops being a defect. It becomes part of the substrate. An agent can know what it knows, know what it does not know, and know what constraints apply - before it writes code.
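The payment-flow scenario above can be written down directly in the three-layer shape. Everything in this sketch is an invented illustration, including the module names, the confidence value, and the policy schema; it is not repo-graph's actual data model. The structural point it shows: layer three can change (bump the policy version, add a boundary) without touching layers one or two, and boundary checks stay deterministic.

```python
# Hypothetical encoding of the three-layer truth model for the payment
# flow scenario. All structures and names are illustrative assumptions.

# Layer 1: extracted facts. Deterministic, machine-readable, no inference.
facts = {
    "symbols": ["payments.charge", "payments.refund"],
    "call_edges": [("checkout.submit", "payments.charge")],
}

# Layer 2: explicit unresolved observations, with confidence metadata.
# The uncertainty is data, not a bug.
unresolved = [
    {
        "edge": ("?", "payments.on_settled"),
        "reason": "callback wired through configuration at runtime",
        "confidence": 0.4,
    }
]

# Layer 3: interpretation overlays. Declared, versioned, queryable.
# Changing a policy here never requires re-indexing layers 1 or 2.
policy = {
    "version": 3,
    "forbidden_boundaries": [("checkout", "payments")],
    "obligations": {"payments": ["evidence:integration-tests"]},
}

def boundary_violations(edges, policy):
    """Deterministic check: does any call edge cross a forbidden boundary?"""
    forbidden = set(policy["forbidden_boundaries"])
    return [
        (src, dst) for src, dst in edges
        if (src.split(".")[0], dst.split(".")[0]) in forbidden
    ]

print(boundary_violations(facts["call_edges"], policy))
# [('checkout.submit', 'payments.charge')]
```

An agent reading this state knows what is proven (layer one), what is explicitly unknown and why (layer two), and which constraints bind the change (layer three) before it writes a line.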
7. Why This Matters Economically
There is an economic reason this layer matters now. A lot of "AI productivity" is, right now, actually waste: too many tokens, too many tool calls, too much re-discovery of facts the system should already know.
A surprising amount of AI-assisted engineering is just repeated rediscovery: reading the same files, reconstructing the same ownership facts, re-inferring the same call paths, re-checking the same boundaries.
If an agent needs to burn context window just to figure out ownership, callers, entrypoints, or boundary exposure, your problem is not model quality. Your problem is missing infrastructure. Deterministic discovery reduces token cost because the agent stops exploring what a graph already knows. The savings compound: fewer hallucinations, fewer retries, fewer wasted cycles on re-orientation.
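A back-of-envelope calculation shows how the orientation tax compounds. Every number below is an assumption I made up for illustration, not a measurement; plug in your own figures, but the shape of the result holds whenever exploratory file-reading dwarfs the size of a compact structural answer.

```python
# Back-of-envelope sketch of why orientation cost compounds.
# Every number here is an illustrative assumption, not a measurement.

files_read_for_orientation = 12   # files an agent opens just to orient
tokens_per_file = 1_500           # average tokens per file read
tasks_per_day = 40                # agent tasks across a team, per day

exploration_tokens = files_read_for_orientation * tokens_per_file
graph_answer_tokens = 300         # compact structural answer from a graph

daily_waste = (exploration_tokens - graph_answer_tokens) * tasks_per_day
print(exploration_tokens)  # 18000 tokens per task spent re-discovering
print(daily_waste)         # 708000 tokens/day of avoidable re-orientation
```

And this counts only the direct token spend, not the retries and hallucination cleanup that bad orientation causes downstream.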
Token-capped pricing is already arriving. When every API call costs real money and providers start enforcing usage ceilings, the teams that burn context on orientation will hit walls before the teams that can pre-load structural truth. This is not speculative. The cost pressure is already visible.
8. Why Process Matters More After AI, Not Less
Historically, strong engineering process was expensive because humans had to maintain it manually. Traceability matrices, verification obligations, review gates, policy overlays - all of it felt like bureaucratic drag. The cost of maintaining process exceeded its apparent value for most teams outside regulated industries.
AI changes that cost curve. Once generation becomes cheap and stochastic, process stops being compliance theater and becomes an entropy-control system. The point is not paperwork. The point is constraining stochastic code generation with deterministic choke points. Gates that ask: does this change satisfy its declared obligation? Does it violate a boundary? Is there evidence, or only inference?
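Those three gate questions can be sketched as one deterministic evaluation over a proposed change. The structure and field names below are hypothetical assumptions for illustration; the point is that a gate is a pure function from declared facts to a verdict, which is exactly what makes it a choke point for stochastic generation.

```python
# Hypothetical sketch of a deterministic gate over a proposed change.
# The change structure and field names are illustrative assumptions.

def evaluate_gate(change):
    failures = []
    # Does this change satisfy its declared obligations?
    for ob in change["obligations"]:
        if ob not in change["evidence"]:
            failures.append(f"unmet obligation: {ob}")
    # Does it violate a declared boundary?
    failures += [f"boundary violation: {b}" for b in change["boundary_hits"]]
    # Is the supporting structure proven, or only inferred?
    if change["inferred_edges"]:
        failures.append("relies on inferred (unresolved) edges")
    return ("pass", []) if not failures else ("fail", failures)

verdict, detail = evaluate_gate({
    "obligations": ["integration-tests-pass"],
    "evidence": ["integration-tests-pass"],
    "boundary_hits": [],
    "inferred_edges": [],
})
print(verdict)  # pass
```

Because the inputs are declarations and evidence rather than model output, the same change always gets the same verdict, and a failing verdict comes with the specific obligation or boundary that caused it.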
High-assurance engineering discipline - the kind that aerospace and medical device teams use - is about to become cheap enough to matter everywhere. Not because humans suddenly enjoy filling out compliance forms. Because the substrate that enforces traceability can now be maintained by the same machines that generate the code. Process becomes the containment vessel for entropy. The cheaper generation gets, the more you need it.
The point is not to force engineers to fill in templates. The point is to make requirements, obligations, and boundary checks cheap enough that the machine can carry the clerical load and the engineer only has to supply judgment.
9. What Exists Already and What Is Still Missing
To be clear, this is not done.
Today repo-graph is a working deterministic graph-and-trust engine with governance features. It indexes multi-language codebases, preserves unresolved structure as first-class data, computes trust surfaces, evaluates policy gates, and tracks obligations against evidence. That part is real and running.
It is not yet the finished architectural-orientation substrate. Runtime and build model integration is incomplete. Cross-repo intelligence is not built. Module discovery is only partially realized. The in-memory daemon architecture that would make this a live, always-current system-of-record is still ahead. This is an honest assessment of where the line is between built and aspirational.
10. What Comes Next
The first generation of AI coding tools optimized for generation. The next generation will optimize for control. Not better autocomplete. Not more vibes. Queryable truth about systems: modules, seams, runtimes, requirements, evidence, and uncertainty.
The teams that build that layer will get more than faster code. They will get engineering systems that remain steerable after generation becomes cheap.
The next important developer tool is not another chat window attached to an editor. It is the layer that lets agents and engineers enter a codebase without hallucinating its shape.
Once code is cheap, truth becomes the scarce resource.