Prompts are config, not code
June 3, 2026
Here's a question worth sitting with: when you change one sentence in a production prompt, why does it take a full deploy to ship it? The model didn't change. Your code didn't change. One string did. Yet most teams route that one-sentence edit through the same pipeline they'd use for a schema migration: pull request, review, build, deploy, wait. We think that's backwards. A prompt is configuration, and it should move at the speed of configuration.
Key takeaways
- 88% of organizations now run AI in at least one business function (McKinsey, 2025), but most still ship prompt changes through a full code deploy.
- A prompt is behavior you tune constantly — that makes it config, not code.
- Treating prompts as config means versioning, instant rollout, and one-click rollback: the same pattern feature flags brought to releases.
- Prompts still belong in code when a change touches parsing, tools, or schemas — not just wording.
What does treating prompts as config actually mean?
It means the text that steers your model lives outside your application bundle, where you can edit, version, and roll it out without rebuilding or redeploying anything. The code stays stable. The prompt becomes a value you change at runtime — like a feature flag or an environment variable, not a constant baked into your release.
The distinction is between the harness and the instruction. Your code decides how you call the model: how you parse its output, wire up tools, and handle errors. The prompt decides what you ask of it. These two things change on completely different clocks. The harness might be stable for months. The prompt gets tuned every week, sometimes every day.
When you bundle them together, the slow clock wins. Every wording tweak inherits all the ceremony of a code change, even though nothing about your code is actually different.
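To make the harness-versus-instruction split concrete, here is a minimal sketch. It assumes a hypothetical in-memory `PROMPT_STORE` standing in for whatever external store holds your prompts, and a `fake_model` callable standing in for a real model client. None of these names come from a real library.

```python
from typing import Callable, Dict

# Hypothetical stand-in for an external prompt store (a database, a config
# service, etc.). In production this would live outside the application bundle.
PROMPT_STORE: Dict[str, str] = {
    "support_reply": "Answer the customer politely. Question: {question}",
}


def get_prompt(name: str, default: str) -> str:
    """The instruction: looked up at call time, with a bundled fallback."""
    return PROMPT_STORE.get(name, default)


def answer(question: str, call_model: Callable[[str], str]) -> str:
    """The harness: stable code that fills the template, calls the model,
    and post-processes the output."""
    template = get_prompt("support_reply", default="Question: {question}")
    raw = call_model(template.format(question=question))
    return raw.strip()


def fake_model(prompt: str) -> str:
    """Illustrative model stub that echoes the prompt it received."""
    return f"  [model saw: {prompt}]  "
```

Editing the entry in `PROMPT_STORE` changes what `answer` sends on the very next call, while `answer` itself stays untouched. That is the two clocks running separately.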
Why does shipping prompts as code slow teams down?
Because deploys are the bottleneck. Only 16.2% of organizations deploy on demand — everyone else batches changes into release windows (DORA, 2025). When a prompt edit rides that same train, a fix you wrote in thirty seconds can sit for hours, or until the next sprint, before it reaches a single user.
The real cost isn't latency, though. It's that friction kills iteration. If every phrasing change means a deploy, people stop making them. Prompts go stale. Edge cases pile up. And here's the part that stings: the people who understand the problem best — support leads, domain experts, the PM who read every angry support ticket — usually can't touch the prompt at all, because it's buried three folders deep in a repo they don't have commit access to.
The feature-flag lesson: decouple deploy from release
Feature flags solved a version of this problem a decade ago by separating deployment from release. You ship code dark, then flip it on when you're ready (DORA, 2025). The deploy and the decision to expose a change stopped being the same event.
Prompts need exactly that separation. The model integration ships once and rarely changes. The instructions that govern its behavior should change independently, as often as you like — with gradual rollout when you want to be careful and an instant kill switch when something goes wrong.
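One way gradual rollout and a kill switch could look for prompts, sketched under assumptions: the `Rollout` record, the `bucket` helper, and the hash-based bucketing are illustrative choices, not a description of any particular product.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Rollout:
    stable: str            # prompt text currently trusted in production
    candidate: str         # new prompt text being rolled out
    percent: int = 0       # share of users (0-100) who get the candidate
    killed: bool = False   # kill switch: when True, everyone gets stable


def bucket(user_id: str) -> int:
    """Map a user ID to a deterministic bucket in the range 0-99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def pick_prompt(rollout: Rollout, user_id: str) -> str:
    """Choose which prompt a user sees, without any redeploy."""
    if rollout.killed:
        return rollout.stable
    if bucket(user_id) < rollout.percent:
        return rollout.candidate
    return rollout.stable
```

Hashing the user ID keeps each user on the same variant across requests, so raising `percent` only ever moves users from the stable prompt to the candidate, never back and forth.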
Our take: the prompt is the highest-impact, fastest-changing surface in an AI product, and it's the one most teams have the least direct control over. That inversion is the whole problem.
What does it take to treat prompts as config?
Pulling prompts out of your codebase only works if you replace what the codebase gave you: history, review, and a safety net. Four things make a prompt safe to manage as config.
- Versioning — every change is a numbered version with a diff and an author, so you can see exactly what shipped and when.
- Evaluation — you can score a new prompt against real examples before it goes live, the way a test suite gates code.
- Instant rollout and rollback — promote a version to production, and revert in one click if it misbehaves. No redeploy.
- An audit trail — who changed what, when, and why, because a prompt edit is a production change even when no code moved.
Miss any one of these and "prompts as config" becomes "prompts in a Google Doc," which is worse than where you started. The gap between those two outcomes is the reason we built PromptVault.
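The four requirements above can be sketched in a few dozen lines. This is a toy, in-memory illustration, not PromptVault's implementation: `PromptRegistry`, its method names, and the 0.8 evaluation threshold are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class PromptVersion:
    number: int
    text: str
    author: str


@dataclass
class PromptRegistry:
    versions: List[PromptVersion] = field(default_factory=list)
    history: List[int] = field(default_factory=list)  # promoted versions, newest last
    audit: List[str] = field(default_factory=list)

    def _log(self, who: str, action: str, why: str) -> None:
        stamp = datetime.now(timezone.utc).isoformat()
        self.audit.append(f"{stamp} {who} {action}: {why}")

    def save(self, text: str, author: str, why: str) -> int:
        """Versioning: every change gets a number and an author."""
        number = len(self.versions) + 1
        self.versions.append(PromptVersion(number, text, author))
        self._log(author, f"saved v{number}", why)
        return number

    def promote(self, number: int, who: str, why: str,
                eval_score: float, threshold: float = 0.8) -> None:
        """Evaluation gate plus rollout: only passing versions go live."""
        if not 1 <= number <= len(self.versions):
            raise LookupError(f"no such version: {number}")
        if eval_score < threshold:
            raise ValueError(f"v{number} scored {eval_score}, below {threshold}")
        self.history.append(number)
        self._log(who, f"promoted v{number}", why)

    def rollback(self, who: str, why: str) -> int:
        """Rollback: return to the previously promoted version."""
        if len(self.history) < 2:
            raise ValueError("nothing to roll back to")
        self.history.pop()
        self._log(who, f"rolled back to v{self.history[-1]}", why)
        return self.history[-1]

    def live_text(self) -> str:
        if not self.history:
            raise LookupError("no version has been promoted")
        return self.versions[self.history[-1] - 1].text
```

The `eval_score` here is passed in by the caller; how you compute it (exact-match checks, a rubric, human review) is a separate decision the sketch deliberately leaves open.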
When are prompts better left in code?
Not every prompt belongs in config — and pretending otherwise would be dishonest. When a change is coupled to your code, version it with the code that depends on it. A new output schema your parser relies on, a tool definition, a structural rewrite that needs matching application logic: those aren't config changes, even though they happen to involve prompt text.
The test is simple. If the change is just words, it's config. If the change forces a code change to stay correct, ship them together. Most day-to-day prompt work falls firmly in the first bucket — which is exactly why the first bucket deserves better tooling.
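A rough automated version of that test, assuming your prompts are `str.format`-style templates: if two versions use the same set of placeholders, the edit is probably just words. The helper names are hypothetical, and the check is only a heuristic. It will not notice a change to an output format that is described in prose.

```python
from string import Formatter
from typing import Set


def placeholders(template: str) -> Set[str]:
    """Collect the named fields a str.format-style template expects."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def is_wording_only(old: str, new: str) -> bool:
    """True when both versions need exactly the same inputs from the code."""
    return placeholders(old) == placeholders(new)
```

A version that adds a field such as `{language}` fails the check, which is the signal to ship it together with the code that supplies that value.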
Frequently asked questions
Isn't a prompt just part of my code?
It lives in your code today, but so did copy strings and feature values before config systems pulled them out. The real test is how often something changes and who needs to change it. Prompts change constantly and are often best edited by non-engineers — that's the signature of config, not code.
Doesn't moving prompts out of the repo make them harder to track?
Only if you move them somewhere with no history. The goal isn't to abandon version control — it's to get prompt-native version control: diffs, authorship, and rollback without a deploy. Done right, you end up with more traceability than a string constant buried in a source file ever gave you.
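Diffs do not require a Git repository. As a small sketch, Python's standard `difflib` module can produce a unified diff between two prompt versions; the `prompt_diff` wrapper and the `v1`/`v2` labels are illustrative.

```python
import difflib


def prompt_diff(old: str, new: str,
                old_label: str = "v1", new_label: str = "v2") -> str:
    """Return a unified diff of two prompt versions, or "" if identical."""
    lines = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile=old_label,
        tofile=new_label,
        lineterm="",
    )
    return "\n".join(lines)
```

Stored alongside the author and timestamp of each version, output like this gives reviewers the same line-by-line view they would get from a pull request.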
How is this different from a feature flag?
A feature flag is usually a boolean — on or off. A prompt is a rich, evolving value you tune for quality, not just toggle. The decoupling principle is the same — change behavior without redeploying — but prompts also need the versioning and evaluation that simple flags don't.
The shift is small to describe, large in practice
Stop treating the most-edited part of your AI product like source code, and start treating it like what it actually is: configuration. You'll iterate faster, recover faster, and let the people who understand the problem actually shape the prompt that serves it.
That's the idea behind PromptVault, and it's what we'll keep writing about here. If you're ready to put it into practice, our guide to prompt versioning best practices covers the mechanics — and there's more on the way.