Prompts Are Source Code Now
Tobi Lütke points out that throwing away prompts while keeping AI-generated code is like throwing away source and keeping binaries. He's right, and the implications run deeper than version control.
Tobi Lütke, CEO of Shopify, posted an observation to this effect: throwing away your prompts while keeping the AI-generated code is like throwing away the source code and keeping the binary, at least for small tools.
This is one of those observations that seems obvious once stated and invisible before. For small tools especially, the prompt is the source. The code is the compiled output.
The Analogy Unpacked
When you compile C to a binary, you lose:
- Readability. The binary works, but good luck understanding why.
- Modifiability. Want to change something? You're reverse-engineering, not editing.
- Portability. Recompile for a different architecture? Not without the source.
- Intent. Comments, variable names, structure — the why — is gone.
When you throw away the prompt that generated your AI code, you lose the same things:
- Regeneration. The model improves next month. You could get better code for free — if you had the prompt.
- Iteration. "Make it handle edge case X" is easy with the original prompt. Starting from generated code, you're doing archaeology.
- Context. Why does this function exist? What constraints was it designed for? The prompt knew. The code doesn't say.
- Transferability. This prompt worked for Claude. Slightly modified, it might work for the next model. The generated code is frozen in time.
This connects to what's still worth knowing in AI-assisted development. Knowing how to prompt — what context to include, what constraints to specify — is becoming more valuable than knowing how to write the code itself. The prompt is where the work happens.
What This Changes
If prompts are source, they need source control.
Not as an afterthought — not a prompts/ folder you forget to commit. As the primary artifact. The code it generates is a build output.
This sounds extreme until you consider the workflow:
- Write prompt
- Generate code
- Test code
- Tweak prompt
- Regenerate
- Repeat until satisfied
Where's the work? In the prompt. The code is a side effect.
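A minimal sketch of that loop as a script, assuming the Anthropic Python SDK; the file paths, model name, and test command are all placeholders, not prescriptions:

```python
# regenerate.py: the prompt is the source, the code is the build artifact.
# Paths, model name, and test command below are illustrative placeholders.
import subprocess

import anthropic

PROMPT_FILE = "prompts/csv_report.md"   # the source you edit
OUTPUT_FILE = "tools/csv_report.py"     # the build output you never hand-edit

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def regenerate() -> None:
    """Compile the prompt: send it to the model, write out the code."""
    with open(PROMPT_FILE) as f:
        prompt = f.read()
    message = client.messages.create(
        model="claude-sonnet-4-5",  # swap in whatever model you use
        max_tokens=4096,
        messages=[{"role": "user", "content": prompt}],
    )
    with open(OUTPUT_FILE, "w") as f:
        f.write(message.content[0].text)

def tests_pass() -> bool:
    """Run whatever harness you have; pytest is just an example."""
    return subprocess.run(["pytest", "tests/"]).returncode == 0

if __name__ == "__main__":
    regenerate()
    if not tests_pass():
        print("Tests failed. Tweak the prompt and rerun, not the generated code.")
```

The script's one real design decision is what it makes primary: the prompt file is the thing you edit; the output file is never touched by hand.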
This is structurally similar to the Rust inversion. Rust's strict compiler was seen as a barrier when humans wrote code. Now that AI writes code, it's an asset — free verification. The same inversion is happening with prompts: they were seen as ephemeral inputs, but now they're the durable artifact.
Map and Territory
There's a useful distinction from general semantics: the map is not the territory.
In software, this pattern appears wherever you have a generative layer and its output:
- The map: lightweight, structural, captures intent
- The territory: rich, specific, the rendered result
Schema definitions and database contents. Build configurations and compiled artifacts. Templates and rendered pages.
The map is what you need to regenerate or modify the structure. The territory is what you read and use. Losing the territory is inconvenient; losing the map means starting over.
Prompts and generated code have the same relationship:
- Prompts: lightweight, structural, generative
- Generated code: specific, executable, the output
The prompt is the map. The code is the territory. If you only keep the territory, you lose the ability to regenerate when conditions change.
The "Small Tools" Qualifier
Tobi's careful framing matters: "at least for small tools."
For a 50-line script that parses CSV files, the prompt is almost certainly more valuable than the output. Regenerating from prompt takes seconds. Modifying the prompt to handle a new format takes minutes. The code is ephemeral.
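To make that concrete, here is a hypothetical prompt for such a tool; these few lines are the entire source, and the 50 lines of Python they produce are the build output:

```
Write a Python script that reads a CSV file passed as its first
argument, validates that every row has the same number of columns
as the header, and prints a summary: row count, column names, and
any rows that failed validation. Use only the standard library.
Exit nonzero if validation fails.
```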
For a large system, it's murkier. The prompt that generated your authentication module might be useful, but:
- You've likely hand-modified the output
- The code now has runtime context the prompt didn't know about
- Integration points evolved beyond the original generation
Scale changes the ratio. But it doesn't eliminate the value.
The Deeper Pattern
This connects to a shift I keep noticing: the locus of understanding is moving.
In the old model, understanding lived in the code. You read it to know what happened. Comments explained intent. Architecture revealed purpose.
In the new model, understanding lives in the prompt — or, more precisely, in the context you provided. The code is a projection of that understanding into executable form. Readable, maybe. But not the source of truth about intent.
This is part of why reading code is dying as a gatekeeper. The code is downstream of the prompt. If you want to understand what happened, you need the prompt chain, not just the diff.
Practical Implications
Version your prompts. If you're generating code from prompts, those prompts belong in git. Not as documentation — as the source.
Comment the prompts, not (just) the code. Why did you ask for it this way? What alternatives did you try? What didn't work? The prompt is where this belongs.
Treat regeneration as recompilation. New model? Regenerate. Found a better approach? Regenerate. The cost is low. The benefit is compounding.
Build prompt libraries. Just like you'd build utility functions, build reusable prompts. "Parse this format." "Handle these edge cases." "Write tests in this style." Composable understanding.
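A sketch of how the last two practices combine, assuming a prompts/ directory of markdown files; every fragment, path, and function name here is illustrative rather than standard:

```python
# prompt_lib.py: reusable prompt fragments plus a make-style rebuild step.
# Every fragment, path, and name here is illustrative, not a standard.
from pathlib import Path
from typing import Callable

FRAGMENTS = {
    "style": "Use only the standard library. Prefer small pure functions.",
    "tests": "Include pytest tests covering the edge cases you handle.",
    "errors": "Fail loudly: raise on malformed input, never guess.",
}

def compose(task: str, *fragment_names: str) -> str:
    """Build a full prompt from a task description plus shared fragments."""
    parts = [task] + [FRAGMENTS[name] for name in fragment_names]
    return "\n\n".join(parts)

def build_all(generate: Callable[[str], str]) -> None:
    """Regenerate every tool from its prompt, like make for prompts.
    `generate` is whatever function calls your model of choice."""
    for prompt_path in sorted(Path("prompts").glob("*.md")):
        code = generate(compose(prompt_path.read_text(), "style", "errors"))
        Path("tools", prompt_path.stem + ".py").write_text(code)
```

Fragments at the prompt layer play the role utility functions play in code: write the constraint once, reuse it everywhere, and every regeneration picks up the improvement.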
Reality Checks
This analysis might be wrong in several ways:
First-output-is-final-output. If AI gets good enough that you don't iterate — if the first generation is always correct — then prompts stop mattering. You wouldn't save intermediate compiler inputs either if compilation were instant and perfect.
Code becomes self-explanatory. If AI generates code with enough comments and context that intent is obvious, the prompt becomes redundant. The binary includes the source, effectively.
The workflow doesn't generalize. Maybe most people don't iterate on prompts the way I described. Maybe they write once, accept the output, and move on. In that case, saving prompts is overhead without benefit.
Tooling never materializes. Git works for prompts, but specialized tools (prompt versioning, diff visualization, chain tracking) might never arrive. Without tooling, the practice stays niche.
If you're working differently and think I've missed something, I'd like to hear about it.
The Inheritance Connection
There's a version of this observation that extends beyond code.
When I work with AI assistants across sessions, the context documents I maintain — what happened, what we decided, why — serve the same function as saved prompts. A new session can regenerate understanding from those documents. Without them, every session starts cold.
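In practice that can be as simple as one markdown file per project; this skeleton is illustrative, reusing the hypothetical tool from earlier:

```
# Context: csv-report tool
## What happened
- Built the parser from prompts/csv_report.md; prompt v3 is current.
## What we decided
- Standard library only; fail loudly on malformed rows.
## Why
- The tool runs in CI with no network access.
```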
The pattern is general: the generative layer persists; the output is downstream.
- Prompts generate code
- Context docs generate oriented instances
- The map generates the territory
In each case, throwing away the generative layer while keeping the output is throwing away source and keeping binary. The output works, but you've lost the ability to regenerate, modify, or understand.
The Minimum Version
If you're using AI to write code:
- Save your prompts somewhere
- Put them in version control
- When you revisit code you don't understand, check if there's a prompt
That's it. No new tooling required. Just the recognition that you have two artifacts now, and only one of them is truly source.
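If you want one zero-tooling convention to support that last step, stamp each generated file with a pointer back to its prompt. The header format below is made up; any breadcrumb that survives in the file works:

```python
# Generated from: prompts/csv_report.md (committed alongside this file)
# Model: claude-sonnet-4-5, January 2026
# To change this tool, edit the prompt and regenerate; don't patch by hand.
```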
Written January 2026. The code was easy. The prompt took longer.