% man Spec-Driven Development

TL;DR: Spec-driven development is the old dream of describing what you want and letting the machine produce the software. The dream went unrealized for forty years; modern AI has finally unblocked it — though not in the way most people assume. I practice spec-driven development for a living — here’s how it works today, where I see it heading, and what I’m building next.

Spec-driven development (SDD) is not a new idea. It is one of the oldest dreams in engineering: describe what you want, and let the machine produce the software. The industry has chased it for decades — through 4GLs, CASE tools, and UML-driven modeling — and never made it real.

To different people, “describe what you want” always meant different things. To executives it was software development without the engineering bottleneck — repeatedly promised, never delivered. To product managers (PMs) it was wave after wave of tools that improved so little that, decades later, they still write requirements in Word. To engineers it was a parade of technologies that never panned out. Most don’t even remember the failed attempts; today we’re being sold the same dream in new packaging. I’ve been in the field long enough to know better, so before we assume this time is different, let’s dissect SDD for what it actually is.

Start with the spec

The term is simple: a specification is how two parties communicate formally and record their alignment. Party A writes down what they want and hands it to Party B. B reads it and asks questions. A revises until the questions stop. Only then can B safely act on it, inside the alignment they’ve reached. The spec is the durable record of that alignment.

In software, “the spec” has two incarnations. The PM writes the product spec (the PRD), and the architect writes the engineering spec. Both feed into coding. The engineering spec isn’t just a restatement of the PRD. It adds the architecture decisions and the technical specifics that pin down how the product must actually work, detail a PRD would never carry, while recasting the PRD’s intent in terms engineers find easier to act on. Each recasting risks drifting from the original intent. And when engineers talk about spec-driven development, they almost always mean it from the coder’s seat: give me the full spec of what to build, and I’ll build it.

Over the wall, and back again

Here is how the work has flowed for as long as anyone can remember. Stay with the chain — its length is the point.

  • An executive wants a system that moves a business objective, and explains it — usually verbally and informally, in a meeting — to the PM.
  • The PM writes up the features that system needs and hands the write-up to engineering.
  • The architect lays down the technical foundation those features will sit on. Now all that’s left is to code it.
  • The coder writes the code — lots of it.
  • The tester checks that the code matches what the PM wrote.
  • The PM finally plays with the actual system and, often for the first time, sees the gaps that were invisible on paper — gaps that could trace back to their own write-up, the architecture, or the code. Each gap restarts the cycle, on a smaller scope each time. Once it’s good enough, the PM takes it to the executive, who may be satisfied — or may send everyone back to the drawing board. The loop ends when the result is finally right, or when time, budget, or patience runs out.

Every handoff in that chain is a chance to lose or distort the original intent. The work moves mostly in one direction, and alignment does get checked — but by hand, at great effort, and never reliably. The people who could have caught a problem early are the last to see the result. A solo builder never needed this chain; it exists only because a team has to pass work from one person to the next, and until recently there was no better way to keep everyone aligned. Hold onto that picture — the fix is coming.

How the code quietly becomes the source of truth

Once the spec is coded and accepted, the code becomes the engineers’ source of truth, and it quietly frames how they approach everything that comes after. A new requirement arrives and the engineers reach straight for the code — how do we modify what’s already here? — rarely stepping back to consider a clean rebuild. A rebuild means a budget overrun; it means killing the thing they made — their baby — and it means admitting, out loud, that the earlier attempt missed. So the original spec freezes in time, and every change rides on a new, smaller spec describing only what’s different. This is spec-first development: a full spec goes to an engineer who codes it, and from then on every change is patched onto code that has become the single living truth.

Teams with deeper pockets go a step further. Every time a change is made, they loop back and update the original spec as well — the PM revising the PRD, the architect revising the engineering spec. This is known as spec-anchored development. It helps with onboarding and QA alike, because there is finally one up-to-date place to learn the state of the project and to test the build against. But it is costly and a little idealistic — nothing actually guarantees the spec matches the code, and the two drift further apart with every release.

Spec-as-code, and why it never worked before

There is one idea that promises to fix all of that at once: spec-as-code. If the spec truly holds everything needed to build, producing the code becomes pure automation — and automating the coding removes the lossy handoffs that plague the process. The spec-versus-code drift disappears as a side effect: the moment code is generated, it stops being the source of truth — the spec is — so there is nothing left to drift. This was the promise of CASE tools in the 1980s–90s and of model-driven architecture with UML in the 2000s. They did a good job capturing the structure of a system and could generate database schemas reliably, but they couldn’t describe the behavior — what the application actually does — because they relied on a formal language, and no formal language can capture everything from gaming to banking. The one place where a formal language does work is inside a narrow domain. For instance, my Rishon platform uses a domain-specific language (DSL) to do exactly that for business applications. In the general case, though, such a universal language is impossible to build — the variety of what software can do has no end — so the tools never closed the gap. Practitioners fell back on what they always used — English — but the informal nature of human language is precisely what leaves requirements incomplete, which has ranked among the most-cited causes of project failure for decades.

So the idea of spec-as-code fell out of favor, until modern AI rekindled it by removing the biggest blockers. First, you no longer need a formal grammar to write the spec; plain English is enough. Second, AI generates the code. For the first time, the automation that spec-as-code always assumed actually exists.

Why the common narrative doesn’t hold

So: point AI at the spec and let it write the code. As a story it’s clean, and on the surface it works. In practice, it rarely holds up — and the reason is easy to miss. To see it, you have to look closely at what coding actually is, and very few people do, coders included.

A programmer does not simply convert a spec into code. They could, if the spec were mathematically complete — but it never is. There are always gaps, smaller or larger depending on who wrote the spec. Those gaps surface only when someone actually builds and hits the question: how should this behave, exactly? The big, obvious gaps get escalated to the architect or the PM. The ones that appear insignificant to the coder get filled with a quick assumption, and the work moves on.

This is the part of the job nobody names: coding was never mostly typing — it was deciding, a steady stream of small calls the spec never made. Across the teams I’ve led, I’ve watched ungrounded guesses like these miscalculate sales tax, mishandle free gifts on returned purchases, and accrue incorrect interest on backdated financial transactions. Juniors guess wrong more often than seniors, which is why their code frequently passes spec-gated testing and fails much later in production.
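Here is a hypothetical illustration of one such small call (made up for this purpose, not one of the incidents above). Suppose the spec says only “charge 8.25% sales tax”. It never says whether tax is rounded per line item or once per order, and the two readings disagree by a cent on the same cart. The rate, the prices, and the function names are all assumptions of this sketch.

```python
from decimal import Decimal, ROUND_HALF_UP

TAX_RATE = Decimal("0.0825")  # hypothetical rate, chosen only for the example
CENT = Decimal("0.01")


def tax_per_line(prices):
    """Reading 1: round the tax on each line item, then add the results."""
    return sum((p * TAX_RATE).quantize(CENT, rounding=ROUND_HALF_UP) for p in prices)


def tax_per_order(prices):
    """Reading 2: add the prices first, then round the tax once."""
    return (sum(prices) * TAX_RATE).quantize(CENT, rounding=ROUND_HALF_UP)


cart = [Decimal("0.99")] * 3  # three items at 0.99 each

print(tax_per_line(cart))   # 0.24
print(tax_per_order(cart))  # 0.25
```

Both functions satisfy the sentence in the spec. Only one of them matches what the business meant, and nothing in a spec-gated test would tell them apart unless the rounding rule had been written down.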

Now put AI in the coder’s seat. It guesses just as readily, and gets it wrong for a sharper reason: the answers it needed were never in its training data, and it was not in the meeting when the context was shared with the team.

When you’re vibe-coding and watching closely, you can still catch a bad assumption as it happens; not reliably, but it is possible. Try to live on the bleeding edge by handing the work to an autonomous agent, and that safety net is gone. If you’re counting on manual review to catch bad assumptions afterward, reconsider: reading someone else’s code is far harder than writing your own, and it doesn’t scale, which erases the productivity that agentic AI coding promised. Going down this path lands us exactly where CASE and UML did: nowhere.

Making spec-as-code real

The way out isn’t in the coding. It is in the spec — not written off the top of someone’s head, but developed in an environment that agentic AI makes practical. Here is what that environment has to provide:

  • Intent over instruction. Record not just what was decided but why. A bare decision is rigid; the intent behind it lets the spec adapt when conditions change. Decisions are still captured — but always with their reasoning, so they can be questioned later instead of blindly followed.
  • Precise and comprehensive, in plain English. The spec stays human-readable but has to be exact and complete. People can’t do that reliably on their own, so the process has to help them get there first, then enforce that standard.
  • Always consistent. Any decision or fact that would contradict what’s already recorded is caught and resolved before it’s admitted to the spec, so the spec is never in an inconsistent state.
  • A workflow for compromises. Many decisions are forced by outside limits — a deadline, a vendor, a regulation. A compromise should never outlive its cause: once that limit is gone, the trade-off is flagged for another look, since it may no longer be needed.
  • Technical debt and rollout schedule inferred from business priorities. How much technical debt to take on, and how to sequence the rollout, are not purely technical calls; they follow from business priorities, risk, budget, and timing — reasoned out by weighing the pros and cons each option carries.
  • One shared environment. The spec is no longer built in rigid phases; it becomes the product of collaborative work — PMs, engineering, architects, and ideally executives — all contributing. The environment organizes the flow that replaces the chain of traditional handoffs.
  • Approval workflow. The decisions that carry weight — the ones that put revenue or legal liability on the line, or affect the architecture — have defined owners. The system knows who must sign off, so such a decision can’t be made quietly by whoever happens to hit it first.
  • Decisions surfaced early, by design. Those decisions don’t come out on their own; a deliberate process drives them out. The same questions that used to ambush the coder mid-build — or slip by entirely — are raised and answered while the spec is being written. This one is deceptively hard, and it is where I have personally burned the most midnight oil.
  • Built-in evals. Every behavior is recorded together with the check that proves it correct. It’s test-first thinking, moved from code to the spec. Some cases will always go unwritten, but the method makes capturing them the habit rather than the afterthought.
  • Non-functional and compliance requirements. Security, performance, data residency, and regulatory constraints are first-class decisions, not afterthoughts. They’re often negotiable and the trade-offs around them shift over time, so they’re recorded with owners and re-checked continuously, never settled once and forgotten.
  • Traceability and auditability. End to end, every behavior traces back to the decision behind it, and every decision back to who made it and why — so the reasoning lives in the record instead of in people’s memories.
  • Code as artifact. The spec is the source of truth, and the code is generated from it — most of it disposable, regenerated at will. Where a piece of code carries lasting consequences — a database schema, say, since changing it forces data migrations — that piece (in any form) becomes a decision in the spec, not left buried in the code.
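
As a rough illustration of a few items on this list (intent recorded with each decision, consistency on admission, approval for weighty decisions, compromises tied to their cause, and built-in evals), a decision record might look like the sketch below. It is not the method described in the book. Every name here (`Decision`, `Spec`, `admit`, `lift_limit`, `failing_checks`) is hypothetical, and the consistency rule is deliberately naive: it only catches two different values recorded for the same key.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Decision:
    key: str                                    # topic being decided, e.g. "tax.rounding"
    value: str                                  # what was decided
    intent: str                                 # why it was decided
    owner: str                                  # who answers for it
    check: Optional[Callable[[], bool]] = None  # eval that proves the behavior
    forced_by: Optional[str] = None             # outside limit behind a compromise
    approved: bool = False                      # sign-off for weighty decisions


class Spec:
    def __init__(self, needs_approval=()):
        self.decisions = {}
        self.active_limits = set()
        self.needs_approval = tuple(needs_approval)  # key prefixes that need sign-off

    def admit(self, d: Decision) -> None:
        """Record a decision only if it has reasoning, is consistent, and is approved."""
        if not d.intent.strip():
            raise ValueError(f"{d.key}: a decision needs its reasoning")
        current = self.decisions.get(d.key)
        if current is not None and current.value != d.value:
            raise ValueError(f"{d.key}: contradicts the decision owned by {current.owner}")
        if self.needs_approval and d.key.startswith(self.needs_approval) and not d.approved:
            raise PermissionError(f"{d.key}: requires sign-off before it is admitted")
        if d.forced_by:
            self.active_limits.add(d.forced_by)
        self.decisions[d.key] = d

    def lift_limit(self, limit: str) -> list:
        """Mark an outside limit as gone and return the compromises to revisit."""
        self.active_limits.discard(limit)
        return [k for k, d in self.decisions.items() if d.forced_by == limit]

    def failing_checks(self) -> list:
        """Run every recorded eval and return the keys whose check fails."""
        return [k for k, d in self.decisions.items() if d.check and not d.check()]


spec = Spec(needs_approval=("tax.",))
spec.admit(Decision(
    key="tax.rounding", value="per order",
    intent="matches how invoices are audited", owner="finance lead",
    check=lambda: round(2.97 * 0.0825, 2) == 0.25,  # toy eval for the rounding rule
    approved=True,
))
spec.admit(Decision(
    key="export.format", value="CSV only",
    intent="the vendor API cannot accept JSON yet", owner="architect",
    forced_by="vendor-api-v1",
))
```

With this in place, admitting a second `tax.rounding` decision with a different value raises an error instead of silently overwriting the first, and `spec.lift_limit("vendor-api-v1")` returns `["export.format"]` as the compromise worth another look. A real environment would need a far richer notion of contradiction than matching keys.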

This is not a complete list, but it is a start. I work this way today, from solo and small-team builds all the way to enterprise scale, on greenfield and brownfield projects alike.

How I practice this today

In my projects today, the spec is stored as an ontology graph (ground truths, decisions, preview artifacts, and more, all linked together), with protocols in place that keep every part consistent with the rest. Both the graph and the protocols are enforced today through skills run by an agentic AI client: Claude, Codex, Antigravity, Cursor, whichever you prefer.
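
For readers who want a mental model, the sketch below shows one possible shape for such a graph. It is an assumption made for illustration, not the schema used in these projects: the node kinds are taken from the sentence above, while the class `SpecGraph`, its methods, and the two rules it checks (no dangling links, and every decision must rest on at least one ground truth) are invented here.

```python
from collections import defaultdict

NODE_KINDS = {"ground_truth", "decision", "preview_artifact"}


class SpecGraph:
    def __init__(self):
        self.kind = {}                 # node id -> node kind
        self.edges = defaultdict(set)  # node id -> ids of the nodes it relies on

    def add_node(self, node_id, kind):
        if kind not in NODE_KINDS:
            raise ValueError(f"unknown node kind: {kind}")
        self.kind[node_id] = kind

    def link(self, src, dst):
        """Record that src relies on dst; validity is checked later."""
        self.edges[src].add(dst)

    def violations(self):
        """Return a list of human-readable problems found by the two toy rules."""
        problems = []
        # Rule 1: every link must connect two nodes that exist.
        for src in sorted(self.edges):
            for dst in sorted(self.edges[src]):
                if src not in self.kind or dst not in self.kind:
                    problems.append(f"dangling link {src} -> {dst}")
        # Rule 2: every decision must rely on at least one ground truth.
        for node_id in sorted(self.kind):
            if self.kind[node_id] != "decision":
                continue
            targets = self.edges.get(node_id, ())
            if not any(self.kind.get(t) == "ground_truth" for t in targets):
                problems.append(f"decision {node_id} rests on no ground truth")
        return problems


graph = SpecGraph()
graph.add_node("gt:state-tax-law", "ground_truth")
graph.add_node("d:tax-rounding", "decision")
graph.link("d:tax-rounding", "gt:state-tax-law")
print(graph.violations())  # []
```

A check like `violations()` is the kind of thing a protocol could run before any change is accepted, so that the graph is never stored in a state that breaks its own rules.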

Agentic Spec-Driven Development (book cover): #1 Best Seller in Software Engineering, Amazon, June 2026

I could have shared the AI skills as a black box. Instead I wrote the method down, so you can see how each piece works and adapt it to your own needs. Agentic Spec-Driven Development walks through every skill with worked examples.

The book isn’t the end of the road — it’s the beginning. I’m already building the next generation of the skills the method relies on, and I’ll release them when they’re ready.

What I’m building next

The journey doesn’t stop at instructions written for an AI to follow. Three moves are already underway.

From skills to tools. The next step replaces pure-AI skill instructions with a mix of agentic tools and plugins for the popular AI clients — Claude, Codex, Antigravity, Cursor. That turns today’s Git-based sharing into true team-wide collaboration, cuts token cost, improves reliability, and enables efficient multi-model routing. When it’s ready, it becomes a commercial product.

Domain knowledgebases. In parallel, I see real value in knowledgebases that assist spec-building for select verticals — think of them as modules in an ERP system, where the spec configures a module instead of defining it from scratch. The same idea works for legacy codebases: the spec takes over the job once held by a veteran maintainer.

A runnable spec. The last move makes the spec not just codable but runnable — record a piece of spec and immediately see what it does, without waiting for a build. Getting there most likely means migrating the whole environment (Claude and the others) into a custom spec-focused agent — in effect, the spec-writing IDE Sean Grove imagined in 2025. That agent would interpret the spec and run the application it describes. A product manager could enter a new decision or a constraint and, moments later, get their hands on the working app it produces — then make a correction and try again. The loop stays immediate as long as nothing calls for an engineering decision. When something does, the question goes onto the engineering queue, and the app is ready the moment engineering answers.
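
The loop described above can be sketched in a few lines. This is a toy model under stated assumptions, not a design for the actual product: the class `RunnableSpec`, its methods, and the idea of marking certain topics as engineering-owned are all hypothetical, and “running the app” is reduced to a dictionary of applied decisions.

```python
from collections import deque


class RunnableSpec:
    def __init__(self, engineering_topics):
        self.engineering_topics = set(engineering_topics)  # topics that need an engineering call
        self.applied = {}     # decisions the running app already reflects
        self.queue = deque()  # questions waiting on engineering, oldest first

    def enter(self, topic, value):
        """A PM enters a decision. Returns True if the app reflects it right away."""
        if topic in self.engineering_topics:
            self.queue.append((topic, value))
            return False
        self.applied[topic] = value
        return True

    def answer_next(self, approve):
        """Engineering answers the oldest queued question and returns its topic."""
        topic, value = self.queue.popleft()
        if approve:
            self.applied[topic] = value
        return topic


app = RunnableSpec(engineering_topics={"storage.schema"})
immediate = app.enter("checkout.button_label", "Buy now")       # applied at once
deferred = app.enter("storage.schema", "split the orders table")  # waits for engineering
```

Here `immediate` is `True` and `deferred` is `False`. The schema change sits in `app.queue` until `app.answer_next(approve=True)` is called, at which point it joins the applied decisions.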

Building this from scratch would be an enormous undertaking — if not for the existing Rishon codebase, which already executes an ontology graph for the business domain (see the demo).

If this could help your team

What everyone is after is straightforward: working software assembled from a spec the whole team shapes, with AI doing the build and very little ceremony in between. Getting there reliably is harder than it looks, and the guidance out there is mostly direction and motivational talks, not instruction. That gap is exactly why I locked myself away and wrote the book — the actual how-to, not another pep talk.

If your team is already practicing spec-driven development and just wants practical ideas, the book may be all you need — yours to put to work today. If you want hands-on help getting there faster and without the hiccups, I’m open to working with a few teams directly, as a fractional CTO or an advisor on AI adoption in engineering.

Drop me a line.

Either way, I wish you the best of luck with your projects.