Last time, we walked through the agent-versus-workflow question — which shape of automation actually fits the work you're trying to change. If you've been following this series, you've probably narrowed your thinking. You have a candidate process. You have a rough sense of whether it's a workflow you want to harden or a more open-ended task you want an agent to handle. Good. The harder part starts now.
Because picking the right tool was never the bottleneck.

The graveyard nobody warns you about
There's a number that has been making the rounds in industry reports lately, and I keep coming back to it: IDC says 88% of AI proofs-of-concept are abandoned before they ever reach production. Some surveys put it closer to 95%. RAND found over 80% of AI projects fail to ship. And in 2025, 42% of companies told researchers they had walked away from most of their AI initiatives — up from 17% the year before.
88% sounds abstract, so try this instead: out of every 100 pilots, 88 die. The vast majority of the AI work happening inside companies right now is, statistically, going nowhere.
I used to read those numbers and assume it was a technology problem. Models too unreliable, hallucinations too frequent, integrations too brittle. Then I started sitting in on more post-mortems, and the pattern was almost embarrassingly consistent. The model worked. The demo worked. The problem was that nobody had ever drawn a line from working in a sandbox to working in the business. The pilot was designed as a science experiment, not as the first mile of a production system.
A logistics manager I spoke with last spring put it more bluntly than I would have. "We had this beautiful little chatbot that could answer warehouse queries," he said. "And then someone asked, okay, where does it actually live? Whose laptop? Whose budget? Who fixes it at 2 a.m. when it goes down?" Nobody had an answer. The pilot quietly died on the conference room whiteboard where it was born.
This episode is about the four stages between idea and impact, and — more importantly — the seams between them where most projects fall through.
Stage 1: Discovery and PoC
The first stage isn't really about building. It's about proving that a specific business problem can plausibly be solved with a specific kind of AI capability. That's it. The deliverable is evidence, not software.
A good PoC, in my experience, has three traits. It targets a single, narrow process — not a department, not a workflow family, one process. It uses real data, not hand-curated samples. And it has a defined exit criterion before a single line of code is written: "if this thing can correctly classify 85% of inbound tickets, we move forward; if not, we kill it."
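To make that concrete, here's what an exit criterion looks like as code rather than a slide. This is a minimal sketch in Python, assuming a labeled holdout set of real tickets and the 85% bar from the example above; `classify_ticket` and `holdout` are hypothetical names standing in for whatever your PoC actually produces.

```python
# Minimal PoC exit-criterion check. The threshold is agreed on
# before the PoC starts; this runs once, at the end.
EXIT_THRESHOLD = 0.85  # the bar everyone signed off on up front

def evaluate_poc(classify_ticket, holdout):
    """Score the candidate on real labeled data and return a go/kill decision.

    classify_ticket: hypothetical predict function from the PoC model
    holdout: iterable of (ticket, true_label) pairs of real data
    """
    correct = sum(
        1 for ticket, label in holdout
        if classify_ticket(ticket) == label
    )
    accuracy = correct / len(holdout)
    return accuracy, accuracy >= EXIT_THRESHOLD

# Usage (hypothetical names):
# accuracy, go = evaluate_poc(model.predict, holdout_pairs)
# print(f"{accuracy:.1%} -> {'proceed' if go else 'kill'}")
```

The point isn't the ten lines of code. It's that the decision rule exists in writing before anyone has a result to rationalize.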
That last point is the one most teams skip. They run a PoC for six weeks, the model produces something interesting, and then everyone sort of nods and asks, "so… is this good?" Nobody knows, because nobody agreed in advance what good would look like. So the project enters a kind of zombie state — alive enough to keep getting funded, dead enough that nothing happens. I've watched at least a dozen pilots die this exact way.
One thing worth saying out loud: a PoC that fails its exit criterion is not a failed project. It's a successful experiment. You spent two months and a small budget to learn that this particular shape of AI doesn't fit this particular problem. That's worth something. The failure mode isn't the negative result — it's pretending the negative result was actually positive because someone's quarterly objectives need it to be.
Stage 2: Pilot
This is where most projects die, and it's worth understanding why.
A pilot is the first time the system meets actual users doing actual work. It's no longer about whether the model can do the task — it's about whether the model fits inside the human and organizational system around the task. Different question entirely.
Pilots fail for reasons that have almost nothing to do with AI. The data turns out to be messier than the PoC sample suggested. The users don't trust the output and quietly route around it. The integration with the existing CRM or ERP requires three months of work nobody scoped. The team that owns the upstream process doesn't want to change anything because their KPIs aren't aligned with the project's success. Pick any of these and you have a textbook stalled pilot.
The fix isn't more sophisticated technology. It's structuring the pilot like a real product launch — with a defined user group, a feedback loop that runs weekly rather than quarterly, and an explicit owner who is responsible for the thing actually getting used. Research from a few of the major consultancies suggests structured pilots cut time-to-deployment by roughly 40% compared to the ad-hoc approach. I believe it. Most of the slippage isn't engineering time — it's the weeks lost waiting for someone to decide whether to keep going.
One pattern I've come to trust: if you can't name the person whose job gets measurably easier when the pilot succeeds, you don't have a pilot. You have a demo with a bigger budget.
Stage 3: Production
If the pilot works, production is where the boring, expensive, unglamorous work happens. And it really is expensive — a fact that catches finance teams off guard with depressing regularity, because the PoC budget rarely included anything resembling production cost.
Production means the system runs without someone babysitting it. That sounds simple until you list everything it implies. Monitoring for model drift. A rollback plan when an upgraded model behaves worse than the old one. Logging that's good enough to debug a wrong answer six weeks after the fact. Access controls. A retraining cadence. Someone, somewhere, on call. A line item in next year's budget that didn't exist this year.
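To pick one item off that list, here's roughly what "monitoring for model drift" can look like in practice. A sketch, assuming your model emits a numeric score and you keep a reference window of past scores; the PSI metric itself is standard, but the 0.1 and 0.25 cutoffs are industry rules of thumb, not gospel.

```python
import numpy as np

def psi(reference: np.ndarray, current: np.ndarray, bins: int = 10) -> float:
    """Population Stability Index between two score distributions.

    Higher means the current window has drifted further from the reference.
    """
    edges = np.quantile(reference, np.linspace(0, 1, bins + 1))
    # Clip both windows into the reference range so every value lands in a bin.
    ref = np.clip(reference, edges[0], edges[-1])
    cur = np.clip(current, edges[0], edges[-1])
    ref_pct = np.histogram(ref, edges)[0] / len(ref)
    cur_pct = np.histogram(cur, edges)[0] / len(cur)
    # Floor empty bins to avoid log(0).
    ref_pct = np.clip(ref_pct, 1e-6, None)
    cur_pct = np.clip(cur_pct, 1e-6, None)
    return float(np.sum((cur_pct - ref_pct) * np.log(cur_pct / ref_pct)))

# Rule-of-thumb reading: < 0.1 stable, 0.1-0.25 investigate, > 0.25 alert.
# drift = psi(last_month_scores, this_week_scores)
```

Run something like this on a schedule, alert on the threshold, and "monitoring" stops being an aspiration on a checklist.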
The companies that make it through this stage cleanly tend to share one habit: they treat the AI system as software, not as a science project. That means version control, CI/CD, observability — the same discipline you'd apply to any other production service. The companies that struggle are usually the ones still running the model out of a notebook a data scientist wrote eight months ago and hasn't touched since.
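One way that discipline shows up in CI, sketched here as a pytest-style test: the build fails if the candidate model's evaluation regresses past a tolerance against the production baseline. The file paths, the metric name, and the two-point tolerance are all assumptions for illustration, not a standard.

```python
import json

BASELINE = "metrics/production.json"   # hypothetical: written at the last release
CANDIDATE = "metrics/candidate.json"   # hypothetical: written by this build's eval job
MAX_REGRESSION = 0.02                  # tolerated accuracy drop before blocking

def test_candidate_does_not_regress():
    with open(BASELINE) as f:
        baseline = json.load(f)["accuracy"]
    with open(CANDIDATE) as f:
        candidate = json.load(f)["accuracy"]
    assert candidate >= baseline - MAX_REGRESSION, (
        f"candidate accuracy {candidate:.3f} fell more than {MAX_REGRESSION:.0%} "
        f"below baseline {baseline:.3f}; blocking deploy"
    )
```

Wire that into the same pipeline that deploys the rest of your services, and the notebook-from-eight-months-ago problem mostly solves itself.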
I'll come back to the cost and timeline reality in the next episode, because it deserves its own conversation. The short version: the PoC is the cheap part. Production is where the real money shows up, and pretending otherwise is one of the main reasons projects die at this exact transition.
Stage 4: Scale
Scale is where AI stops being a project and starts being a capability.
The interesting thing about this stage is how few companies actually reach it. Recent surveys suggest fewer than 10% of organizations have scaled AI agents beyond their initial pilot. Not deployed broadly — just gotten past the first one. Most of the industry is still stuck somewhere between stages two and three, running one production system and hoping the second one will be easier.
It usually isn't, on the first try. But it gets meaningfully easier on the third or fourth, and that's the whole point of scale: you build a repeatable path. Standard patterns for data prep. A reusable evaluation framework. A short list of vendors and tools you actually trust. An internal team that has now shipped something twice and knows where the landmines are.
This is the stage where I've seen organizations transform — not because the AI itself is transformative, but because they finally have the operational muscle to deploy it without each project being a heroic, bespoke effort. The first project takes nine months. The fourth takes six weeks. That's the curve worth aiming for.
So where should you actually be?
If you've been with us through the earlier episodes of this series, from episode 1's five questions through the five-area self-assessment, you probably already know which stage your organization is in. Most SMBs I talk to are somewhere between "we should probably do something" and "we ran a PoC last year and it sort of fizzled." Both are fine starting points. The mistake isn't being early; it's mistaking your stage for a later one.
A useful exercise: write down, in one sentence, the exit criterion for your current stage. If you can't, that's the work. The technology question — workflow versus agent, the topic from last episode — only gets interesting once you know what success looks like at the stage you're in.
Honestly? If reading this is making you reconsider the structure of a project you're already running, that's a good sign. The teams that quietly course-correct at stage two ship; the ones that push forward without admitting the pilot is broken end up in the 88%.
If any of this maps to a project you're stuck on and you want to talk it through, our team at 5years+ has spent the last few years walking SMBs through exactly these transitions across the Korean and Japanese markets — feel free to reach out if a conversation would help. No pitch; we mostly end up doing whiteboard work first.
Next time, we'll get into the part everyone wants to know but few people talk about honestly: what AI projects actually cost, and how long they actually take. By company size, by scope, with real ranges. See you in episode 5.