Reading Code Is Dying as a Gatekeeper

An essay on what changes when AI writes most of the code — and what doesn't.

"All it will take for a software team to commit seppuku next year is for one person to insist at least one human understands each PR..."

"To be a good software engineer next year you need to understand that reading code is a failure mode. To verify systems you need to be using dynamics, not statics."

— Greg Fodor (@gfodor), December 2024


This is a provocation worth taking seriously, not a hot take to dismiss.

Static code review — a human reading a diff line by line — does not scale. It assumes code is written by humans, at human pace, in human-readable patterns. AI destroys all three assumptions. It produces code faster than we can review, in volumes that make "a smart person looked at it" economically unviable.

We tolerated static review because the alternatives were worse. Now, dynamic verification offers a better path: evidence of behaviour rather than inspection of structure.

  • Property-based testing checks invariants across thousands of generated inputs (a minimal sketch follows this list).
  • Fuzzing finds crashes by throwing random data at interfaces.
  • Runtime observability shows what the system actually does, not what you thought it would do.
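
To make the first of these concrete, here is a minimal property-based test in TypeScript using the fast-check library. The function under test, sortAscending, is a stand-in for whatever the AI generated; only the invariants come from a human.

```typescript
import fc from "fast-check";

// Stand-in for AI-generated code the reviewer may never read.
function sortAscending(xs: number[]): number[] {
  return [...xs].sort((a, b) => a - b);
}

// The human states what must always hold; fast-check generates the inputs.
fc.assert(
  fc.property(fc.array(fc.integer()), (xs) => {
    const out = sortAscending(xs);
    const ordered = out.every((v, i) => i === 0 || out[i - 1] <= v);
    const lengthPreserved = out.length === xs.length;
    return ordered && lengthPreserved;
  })
);
```

The division of labour is the point: a person decides which invariants matter; the machine hunts for counterexamples.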

The direction of travel is clear. Static review as the primary gatekeeper is dying. But "dynamics over statics" is a slogan, not a strategy. And slogans obscure the hard questions.

Where the argument breaks down

1. The Circular Verification Trap

If the system tests itself, who defines "correct"?

Tests encode assumptions. If AI writes both the implementation and the tests, a dangerous loop emerges: the implementation satisfies the tests, but the tests assert incorrect behaviour. Green checkmarks everywhere; working software nowhere.
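
A contrived TypeScript illustration of the loop (the discount rule and numbers are invented for this example): suppose the human-written spec caps discounts at 50%, but the generated implementation and the generated test both assume 100%.

```typescript
// Spec (human-written): "A discount may never exceed 50% of the price."

// AI-generated implementation: caps at 100% instead of 50%.
function applyDiscount(price: number, discountPct: number): number {
  const capped = Math.min(discountPct, 100); // wrong cap
  return price * (1 - capped / 100);
}

// AI-generated test: encodes the same wrong assumption, so it passes.
console.assert(applyDiscount(100, 75) === 25, "75% discount applied in full");

// A test derived from the spec would have failed, and caught the bug:
// console.assert(applyDiscount(100, 75) === 50, "discount capped at 50%");
```

Every check is green; the behaviour is wrong.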

Reading doesn't disappear; it relocates. If you aren't reading the implementation, you must read the specification.

2. The Tooling Gap

Fodor assumes you can verify purely through behaviour. But when behaviour is wrong in ways your tests didn't anticipate, you have to debug a system you've never read.

The promise is "just-in-time understanding": ask the AI to explain it. But current tools hallucinate connections and oversimplify complexity. If you rely on AI to explain systems that AI built, and the explanation is wrong, you're blind twice over.

3. Bugs vs. Malice

Dynamic verification finds bugs (crashes, anomalies). It fails at finding malice.

Logic bombs and backdoors are designed to pass tests. More immediately, AI models frequently hallucinate package names. Attackers register these names to inject malware. If no human reads package.json, you are open to supply chain attacks that dynamic verification will never catch.
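
Cheap counter-measures exist, but they are static ones. Here is a sketch of a CI check in TypeScript that flags any dependency a human has not vetted; the allowlist file name and its contents are placeholders, not a prescription.

```typescript
import { readFileSync } from "node:fs";

// Human-curated list of vetted package names (placeholder file).
const allowlist = new Set<string>(
  JSON.parse(readFileSync("vetted-packages.json", "utf8"))
);

const pkg = JSON.parse(readFileSync("package.json", "utf8"));
const deps: Record<string, string> = {
  ...pkg.dependencies,
  ...pkg.devDependencies,
};

const unvetted = Object.keys(deps).filter((name) => !allowlist.has(name));
if (unvetted.length > 0) {
  console.error(`Dependencies not yet vetted by a human: ${unvetted.join(", ")}`);
  process.exit(1); // block the merge until someone looks
}
```

Note what this is: a static check on a file a human curated. Dynamic verification alone never looks at the dependency list.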

4. The Cost of Verification

Static analysis is effectively free. Dynamic verification is expensive.

Ephemeral environments, fuzzing, and end-to-end tests consume massive compute. "YOLO teams" might win simply because their infrastructure bill is lower. The winning strategy isn't "maximise dynamics"; it's "spend verification budget where it produces the highest return."

What actually changes

The human role doesn't disappear. It shifts from inspector to designer.

  • From diff reviewer to test reviewer. If you're not reading the code, you must rigorously review the tests. Do they encode the actual requirements? (A contrast is sketched below.)
  • From code reader to results reader. You study evidence of behaviour: traces, metrics, mutation reports.
  • From gatekeeper to architect. Attention focuses on system shape, boundaries, and failure modes.

A useful frame: humans design experiments; machines explore state space.
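
What reviewing the tests means in practice, as a contrived TypeScript contrast (the username rule is invented for illustration): reject tests that merely restate the implementation; keep the ones that state the requirement.

```typescript
// Requirement: "usernames are 3 to 20 characters, lowercase letters and digits only."
function isValidUsername(name: string): boolean {
  return /^[a-z0-9]{3,20}$/.test(name);
}

// Weak test: mirrors the implementation, so it can only confirm what the
// code already does, right or wrong.
console.assert(isValidUsername("abc") === /^[a-z0-9]{3,20}$/.test("abc"));

// Requirement-encoding tests: stated independently of how the code works.
console.assert(isValidUsername("ab") === false, "too short");
console.assert(isValidUsername("user_1") === false, "underscore not allowed");
console.assert(isValidUsername("valid123") === true, "happy path");
```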

The Accountability Floor

One thing cannot be delegated: responsibility.

When the system goes down at 3 AM, "The AI did it" is not an acceptable post-mortem. Meaningful responsibility requires enough understanding to accept risk. Even if AI surpasses us at every technical task, the need for a human to say "I trust this fix enough to deploy it" remains.

On writing this with AI

This essay was written by AI (Claude, Gemini, GPT), guided by Will.

If the thesis is that we should verify behaviour rather than inspect structure, then hiding the provenance is unnecessary. The goal is quality, not human purity. Rounds of revision and editorial judgment produce that quality, regardless of who or what types the characters.

Conclusion

Fodor is directionally right. Teams that insist on "one human understands every PR" will be slow, and likely dead.

But "dynamics over statics" is incomplete. Someone still has to define correctness, catch malice, and own the risk. The engineers who thrive won't be those who read everything, nor those who read nothing. They'll be the ones who know where human attention provides the highest leverage.


For concrete practices on how to work this way, see the companion piece: AI-Assisted Development: A Practical Guide.