Production Agents: From Demo to Deployment | Intentional / Deliberate / Engineering

Your agent works beautifully in development. It demos perfectly. Then you deploy it.

And it:

Books the same flight twice when the API times out
Loses all progress when a user closes their browser
Burns through your monthly API budget in 3 hours
Sends 47 follow-up emails because it didn’t know it was waiting
Does the wrong thing without crashing — and you don’t find out until a customer complains

You’re not alone. Only 2% of organizations have successfully deployed agentic AI at scale. Gartner predicts that more than 40% of agentic AI projects will be canceled by the end of 2027, citing cost overruns and inadequate risk controls.

The problem isn’t your agent’s reasoning. It’s everything around the reasoning that tutorials don’t teach.

So I wrote the series I wished existed when I started shipping agents.

What This Series Covers

Nine parts on what actually breaks in production:

Part	Topic	What You’ll Learn
0	Overview	Why 98% haven’t deployed, the six capabilities tutorials skip
1	Idempotency & Safe Retries	The Stripe pattern, error classification, preventing duplicate bookings
2	State Persistence	Checkpointing, crash recovery, resumable workflows
3	Human in the Loop	Approval gates, escalation patterns, async handoffs
4	Cost Control	Token budgets, circuit breakers, preventing runaway loops
5	Observability	Silent failures, semantic monitoring, the metrics that matter
6	Durable Execution	Temporal, Inngest, Restate — when to use each
7	Security & Sandboxing	Tool permissions, prompt injection defense, blast radius
8	Testing & Evaluation	Task completion metrics, trajectory quality, regression testing
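To make the Part 1 topic concrete, here is a minimal sketch of an idempotency key guarding a retried booking, in the spirit of the `Idempotency-Key` header Stripe's API uses. It is an illustration under stated assumptions, not the series' own code: `FakeBookingAPI`, `idempotency_key`, and `book_with_retries` are hypothetical names, and the fake service simulates a timeout that happens after the booking has already been recorded.

```python
import hashlib
import json


class FakeBookingAPI:
    """Hypothetical stand-in for a booking service that dedupes on an idempotency key."""

    def __init__(self):
        self.bookings = {}      # idempotency key -> stored booking
        self.fail_next = False  # when True, the next call "times out" after recording

    def book(self, key, request):
        # Record the booking only the first time this key is seen.
        if key not in self.bookings:
            confirmation = f"CONF-{len(self.bookings) + 1}"
            self.bookings[key] = {"confirmation": confirmation, **request}
        if self.fail_next:
            self.fail_next = False
            raise TimeoutError("response lost after the booking was recorded")
        return self.bookings[key]


def idempotency_key(task_id, request):
    """Derive a stable key from the task and request so every retry sends the same key."""
    payload = json.dumps({"task": task_id, "request": request}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def book_with_retries(api, task_id, request, max_attempts=3):
    """Retry on timeout; the shared key makes a repeated call return the original booking."""
    key = idempotency_key(task_id, request)
    for _ in range(max_attempts):
        try:
            return api.book(key, request)
        except TimeoutError:
            continue  # safe to retry because the service dedupes on the key
    raise RuntimeError("booking failed after retries")


api = FakeBookingAPI()
api.fail_next = True  # first attempt records the booking, then times out
result = book_with_retries(api, "task-1", {"flight": "UA100", "passenger": "Ada"})
```

The design choice worth noticing is that the key is derived from the task and the request, not generated per attempt. A fresh random key on each retry would defeat the deduplication and book the flight twice.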

The Tutorial vs Production Gap

[Figure: What Tutorials Teach vs What Production Needs]

Why This Structure?

Each part follows a pattern:

What can go wrong — real production failures
Why it happens — the underlying cause
How to prevent it — patterns that work
Implementation — code you can use
Trade-offs — nothing is free

No hand-waving. Just mechanics.
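As a taste of what the "Implementation" step can look like, here is a minimal sketch of a budget guard of the kind the Cost Control topic points at: it stops an agent loop once it exceeds a token or step limit. The names `TokenBudget`, `BudgetExceeded`, and `run_agent` are hypothetical, and the step function is assumed to report how many tokens each step consumed.

```python
class BudgetExceeded(Exception):
    """Raised when an agent run goes over its token or step limit."""


class TokenBudget:
    def __init__(self, max_tokens, max_steps):
        self.max_tokens = max_tokens
        self.max_steps = max_steps
        self.tokens_used = 0
        self.steps = 0

    def charge(self, tokens):
        """Record one step's usage and fail fast once either limit is exceeded."""
        self.steps += 1
        self.tokens_used += tokens
        if self.tokens_used > self.max_tokens:
            raise BudgetExceeded(f"token limit exceeded: {self.tokens_used} > {self.max_tokens}")
        if self.steps > self.max_steps:
            raise BudgetExceeded(f"step limit exceeded: {self.steps} > {self.max_steps}")


def run_agent(step_fn, budget):
    """Call step_fn until it reports completion; step_fn returns (done, tokens_used)."""
    while True:
        done, tokens = step_fn()
        budget.charge(tokens)
        if done:
            return budget.steps


def stuck_step():
    # Simulates an agent that never finishes and spends 400 tokens per step.
    return False, 400


budget = TokenBudget(max_tokens=1000, max_steps=10)
try:
    run_agent(stuck_step, budget)
    stopped = False
except BudgetExceeded:
    stopped = True
```

The trade-off is that a hard limit can also cut off a legitimate long-running task, so the limits need to be set per workflow and not globally.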

Who This Is For

You should read this if:

You’ve built agents that work in demos but fail in production
You’re about to deploy your first agent and want to avoid the pitfalls
You’re debugging production agent issues and need a framework
You’re evaluating whether to build or buy agent infrastructure

You probably don’t need this if:

You’re building simple single-turn LLM applications
You’re doing research rather than building production systems

The Cost of Getting It Wrong

[Figure: Production Failure Costs]

Start Here

If you’re new to production agents: Start from the overview

If you’re debugging duplicate operations: Idempotency patterns

If you’re dealing with cost issues: Cost control

If you’re evaluating frameworks: Durable execution

This complements the AI Engineering Fundamentals series. That one covers how LLMs work. This one covers how to ship them.

→ Browse the full Production Agents series