← cd .. 03/05/2026

% cat 100% AI Code at Anthropic. 19% Slower Everywhere Else. Why?

Watch on YouTube →

Hello, and welcome!

Today’s topic: why some companies report successful use of AI in development — up to 100% of code in the best cases — while most experience a measured slowdown of up to 19%. Understanding this is crucial to your plans for adopting AI in engineering. I’m not talking about the top-down approach of developing new apps from a single prompt. I’m referring to the bottom-up use of AI by developers.

I didn’t pull these numbers out of thin air. The “100% AI-generated code” claim comes from Boris Cherny, head of Claude Code at Anthropic. He shared that he personally stopped writing code manually in November 2025. Anthropic-wide, the figure is 70–90%. The “19% slower” claim comes from a July 2025 METR study—a randomized controlled trial involving 16 experienced open-source developers working on their own mature repos. Developers predicted AI would speed them up 24%, estimated afterward it had helped by 20%, but were actually 19% slower. It’s a small sample — only 16 developers — but it’s arguably the best-designed study we have so far.

This gap is hardly incidental. The 100% figure comes from greenfield work on relatively simple architectures. The 19% figure comes from maintenance on mature, complex repos. These are fundamentally different tasks — and the difference is exactly the point. Companies seeing near-total AI code generation are all AI-native labs — Anthropic, OpenAI, and the like. Leaving aside the obvious commercial benefits of such claims, let’s analyze the applications they’re building.

Take Claude Code. It achieves a lot, but most of the work is done by the LLM behind it, by the know-how for controlling that LLM, and by the rules and skills, which can be complex but are technically just text files. The Claude Code (or Cowork) application itself is architecturally straightforward: a fairly basic user interface and a collection of tools compliant with a unified API. Not to challenge the ingenuity of the design itself, but I can easily see how these apps could be 100% coded by AI.

When we look at what most engineers deal with in enterprise settings, we see a very different picture: complex data models, multiple database tables — not just text files — multi-tier topologies, serverless execution that limits every piece of processing to minutes, multi-user concurrency, fine-grained data security, a bulk of legacy code in several programming languages, unrecorded knowledge and conventions, several generations of software architecture that evolved over decades, and only God knows what else. All of this goes way beyond the context and attention limits of today’s AI and exceeds the reasoning capabilities of both AI and human developers when taken with a brute-force, head-on approach.

This is where teams inspired by Boris’ success hit a wall. Can they adopt AI successfully? Yes, of course — but it requires a refined, well-thought-through approach. I will go much deeper in upcoming posts, but here are the core ideas.

Actually, the arrival of AI hasn’t changed the fundamental principles of good software development. Not a single bit. What has changed is who needs to master them. Let me explain.

In my work, I was frequently asked to build an architecture for a fairly large team to code against, including a good portion of junior contributors. To make it work, an architect must build plenty of guardrails into the software architecture — making sure developers are forced to comply with certain rules. This is different from asking them to obey rules and regulations, because those can — and will — be ignored. Enforcement is required for rules to be effective.

Enforcement comes in many forms, from subjective code reviews to automated code analysis and quality assessment. But when guardrails are embedded into the architecture, enforcement processes become largely unnecessary.

For instance, you might choose a programming language like Rust that enforces immutability of data structures in your code. Alternatively, you might define interfaces that ensure immutability at compile time — for example, using read-only members in all interfaces in TypeScript gets you close. The interfaces are “baked” into the architecture, and developers build everything against them, without the ability to modify them. Immutable data structures automatically prevent many side effects caused by code attempting to modify function parameters. This also makes the code more thread-safe.
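To make the TypeScript variant concrete, here’s a minimal sketch. The `Order` interface and `addItem` function are hypothetical names for illustration; the point is that `readonly` members and `ReadonlyArray` let the compiler reject mutation attempts, so the guardrail lives in the architecture rather than in a review checklist.

```typescript
// A read-only interface "baked" into the architecture. Consumers receive
// Order values through this contract and cannot mutate them; tsc rejects
// any write at compile time.
interface Order {
  readonly id: string;
  readonly items: ReadonlyArray<string>;
  readonly total: number;
}

// A pure update: returns a new Order instead of modifying its argument.
function addItem(order: Order, item: string, price: number): Order {
  return {
    ...order,
    items: [...order.items, item],
    total: order.total + price,
  };
}

const original: Order = { id: "o-1", items: ["book"], total: 20 };
const updated = addItem(original, "pen", 3);

// original.total = 0;        // compile error: total is read-only
// original.items.push("x");  // compile error: no push on ReadonlyArray

console.log(original.total, updated.total); // original is untouched
```

Note that immutability here is enforced at the type level only — the runtime objects are ordinary JavaScript — but for code written against these interfaces (by a human or an AI agent), the mutation path simply doesn’t compile.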

Another example: API-level contracts between components. A database can easily be corrupted when manipulated directly from various places in the codebase. By segregating all data access into a component with a clearly defined API, such corruption becomes impossible. It also allows us to reason about components independently, without analyzing the entire codebase, thereby reducing complexity.
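A sketch of the same idea, again with hypothetical names (`AccountStore`, `deposit`, and so on): the underlying store is `private`, so every invariant — positive amounts, existing accounts — is checked in exactly one place, and no other code can reach around the API to corrupt the data.

```typescript
// All persistence goes through one component with a narrow API.
// The backing Map is private, so the rest of the codebase cannot
// touch it directly and bypass the invariants enforced here.
class AccountStore {
  private balances = new Map<string, number>();

  open(id: string): void {
    if (this.balances.has(id)) throw new Error("account exists");
    this.balances.set(id, 0);
  }

  deposit(id: string, amount: number): void {
    if (amount <= 0) throw new Error("amount must be positive");
    const current = this.balances.get(id);
    if (current === undefined) throw new Error("no such account");
    this.balances.set(id, current + amount);
  }

  balance(id: string): number {
    const current = this.balances.get(id);
    if (current === undefined) throw new Error("no such account");
    return current;
  }
}

const store = new AccountStore();
store.open("alice");
store.deposit("alice", 50);
// store.balances.set("alice", -1); // compile error: balances is private
```

In a real system the `Map` would be a database client, but the architectural property is the same: you can reason about data integrity by reading this one class, whether the callers were written by a junior engineer or an AI agent.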

To be clear, architectural guardrails won’t catch every bug. The code might compile, pass all type checks, and still be subtly wrong. What good architecture does is dramatically reduce the surface area for such bugs and minimize the blast radius and the related business risks from software failures. You’ll still need testing and validation — but you’ll need far less of it.

These, and many others, are well-understood principles of software development proven by decades of work by millions of engineers. There are many books and classes on the matter — pick the ones that resonate with you.

So what changes with the arrival of AI coding? Surprisingly, very little. Instead of distributing tasks to a team of engineers, you give them to AI agents. Those agents — or more precisely, the LLMs under the hood — were trained on fairly mediocre, low-complexity code from the public domain and therefore exhibit clear traits of inexperienced engineers. To control an AI developer, we use the same principles as with humans. And as with humans, you have two choices: enforcement versus architecture.

Enforcement makes you work harder because of review overhead. Since AI works much faster than humans, reviews will max you out way sooner than you think. In practice, I doubt you can juggle more than about 6 AI agents concurrently — let me know if you can; I tried and failed. This isn’t just my experience; there’s hard science behind it. The human brain can hold roughly 4 to 6 items in working memory. If you want to go deeper, look up George Miller’s “Magical Number Seven” from the 1950s, Nelson Cowan’s “Magical Mystery Four” from 2010, or Douglas Ross’s “Structured Analysis and Design Technique” from the 1970s, which limits every system diagram to 3 to 6 activities — no more.

In today’s AI era, this becomes super relevant because the team lead no longer has the luxury of assigning a task and putting it aside for a few days. An AI agent comes back with results in minutes, not days, and you must efficiently juggle the work of several agents concurrently.

This puts a hard limit on the enforcement approach. My conclusion: it isn’t scalable, and we should minimize manual enforcement — including reviews — wherever possible.

The viable alternative is to control the AI agents’ undesired creative freedom through system design. There’s only one problem with it. In the past, a few architects distributed work to the rest of the engineers. Most engineers were just coders with no pressing need to sharpen their architectural skills. Today, to control AI coding effectively, every engineer must develop real architectural thinking — and get good at it. This won’t happen overnight. In the near term, senior architects can bridge the gap by producing better scaffolding and tighter contracts for the rest of the team to work within. But the direction is clear: engineers who never develop these skills will find themselves increasingly sidelined. Sorry that I have to break it to you like this.

The good news: the knowledge is out there. Pick a good book, or take classes — but make sure they teach not abstract patterns but explain why those patterns are important. You need to develop good intuition for architectural matters; otherwise, you’re still running the risk of severely overloading your prefrontal cortex ;)

Keep in mind that the architecture of a system is ultimately constrained by the architecture of the mind that must understand it. Every decomposition heuristic in software engineering — whether it’s about module size, team size, API surface, diagram complexity, or abstraction depth — is implicitly a theory about human reasoning capacity. The ones that survive in practice are the ones that happen to respect the ~4–6 item limit, whether their inventors knew the science or not. I see a similar limitation in AI results, which could be explained by its training data.

One recent trend is the development of AI skills — essentially, text files that automatically feed information into the AI as needed. I’d be cautious here. I expect great benefit for coding, but much less for system design. The reason: architects balance clearly defined rules with things that cannot be well expressed in a text file — dealing with uncertainties, technical debt tolerance, real-world business risks, the skill level of the human team, and so on.

The other major gap is knowing when to break the rules. Architecture principles are heuristics, not laws. Sometimes the right call is to violate the dependency rule because the deadline matters more. Sometimes the right call is to duplicate code rather than create a premature abstraction. A good architect has calibrated judgment about when principles serve the goal and when they become obstacles. Current AI tools are rule-followers, not judgment-exercisers.

Last point, but not least. Regardless of whether you use AI with reviews or lean heavily on architecture, you must be extremely proficient with code to exercise good judgment. The only way to acquire and maintain that skill, as far as I can tell, is to keep coding — even if you use AI heavily. Boris stopped writing code. His context — building an architecturally simple AI tool inside an AI-native lab — makes that viable. Your context is almost certainly different. So you shouldn’t.

If you’re looking for a place to start, I wrote a book a few years ago called “Become an Awesome Software Architect.” I’m also working on several masterclasses that go deeper into AI-specific workflows.

Have a good one!