
The Missing Phase: Why AI Agents Fail Before They Start

Most companies jump straight to building agents. But there's an entire phase that has to come first — and skipping it is why your first demo ends with "it's not mature yet."

Sahar Carmel

• Director of AI Enablement

May 21, 2026 • 11 min read

Oil painting: Ancient Greece Olympic games where philosophers use a drone, an ATM, a PC, and a game console. A pink octopus in a laurel wreath, the Squid Club mascot, hides in plain sight.

A friend of mine — VP R&D at a mid-sized company — spent weeks building an agent for someone on the marketing team. Clean architecture, good tooling, real use case. When the demo came, the marketing guy wanted to test the hardest thing he could think of. The agent failed.

"It's not mature."

That was it. Project shelved.

Here's the thing: the agent probably wasn't broken. The organization wasn't ready for it.

There's a difference. And almost nobody talks about it.

The Phase That Doesn't Exist in Most Roadmaps

When companies decide to "do AI," the roadmap usually looks like this:

Identify use cases
Build agents
Deploy
Measure ROI

It's clean. It looks like software development. It's wrong.

What's missing is an entire phase that has to come before agents exist: individual fluency. Every person in the organization needs to develop a working relationship with AI models themselves — not through training sessions, not through watching demos, but by actually using them on real work that matters to them.

When I started rolling out AI at Mixtiles, I didn't build a single agent for the first several months. Instead, I taught 17 technical people to work with Claude Code. Then 5 analysts. Then a workshop for 40-50 non-technical employees — all in one room, the technical people coaching everyone else, each person bringing the task they least wanted to do.

That workshop was the readiness phase. The agents came later.

What Actually Happens When You Skip It

The failure mode my friend hit is predictable in retrospect. When someone encounters an AI agent without having any personal experience with AI, they treat it like traditional software. And traditional software either works or it doesn't.

A user who's never worked directly with an LLM will test the agent the same way they'd test a database: find the hardest edge case, see if it breaks. When it does, and it will, because an LLM's output is probabilistic rather than deterministic, the verdict is: "not ready."

A user who's spent even two weeks working with Claude Code or ChatGPT on their own tasks has a completely different mental model. They've seen hallucinations. They've learned to prompt. They've hit the edge cases themselves and figured out how to route around them. When they see an agent fail, they don't conclude that agents are broken — they diagnose what went wrong and think about how to fix the prompt.

The difference isn't intelligence or technical aptitude. It's exposure. The first person is encountering AI for the first time, under pressure, in a high-stakes demo. The second person has already built their own tolerance.

This is what Everett Rogers called the "innovation-decision process" in Diffusion of Innovations, his foundational work on technology diffusion, and it's worth revisiting here. Rogers described adoption not as a single moment but as a multi-stage process: knowledge, persuasion, decision, implementation, confirmation. Organizations routinely try to compress this into a single demo, and it rarely works. The persuasion stage depends on direct experience, not observation.

The Field Intelligence You're Missing

There's a second reason the readiness phase matters, and it's less obvious than the first.

When you deploy agents without letting people work with AI first, you're flying blind on what to build.

At that Mixtiles workshop, each person brought their own worst task. The constraint was specific: not "what would be useful," but "what do you most not want to do." That's a completely different question.

A woman from finance came in with a process she'd been manually running for months: comparing expenses across two Excel files, reconciling discrepancies, flagging anomalies. By the time she finished the workshop, she'd built a workflow that was almost ready for deployment. Not because I told her what to build — because I gave her the tools and let her solve her own problem.

That's field intelligence. Not what people say they need in a requirements meeting. What they actually build when you give them the capability.

This distinction is enormous. In traditional enterprise software deployments, requirements gathering is famously unreliable — users describe their current process, not their actual need, and the gap between those two things is where expensive projects go to die. When people work directly with AI tools on their own problems, they bypass that translation layer entirely. They discover the workflow through iteration, not specification.

The copywriting insight was similar. Multiple departments at Mixtiles started developing their own AI workflows for content production independently. I didn't design those workflows. I watched them emerge, then extracted the logic and formalized it into agents. The agents were downstream of organic workflow discovery, not upstream.

This is bottom-up automation. It's slower at the start and dramatically more likely to stick.
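The finance reconciliation described above is the kind of workflow that surfaces this way. As a rough illustration only, and not the workflow that was actually built at the workshop, the core comparison might look like the Python sketch below. The function name, the expense IDs, the amounts, and the one-cent tolerance are all hypothetical, and it assumes the two spreadsheets have already been exported into dictionaries keyed by expense ID.

```python
from decimal import Decimal


def reconcile(ledger_a, ledger_b, tolerance=Decimal("0.01")):
    """Compare two expense ledgers (dicts of expense id -> amount).

    Returns the ids that appear on only one side, plus the ids whose
    amounts differ by more than `tolerance`.
    """
    ids_a, ids_b = set(ledger_a), set(ledger_b)
    mismatched = {
        key: (ledger_a[key], ledger_b[key])
        for key in sorted(ids_a & ids_b)
        if abs(ledger_a[key] - ledger_b[key]) > tolerance
    }
    return {
        "only_in_a": sorted(ids_a - ids_b),
        "only_in_b": sorted(ids_b - ids_a),
        "mismatched": mismatched,
    }


# Hypothetical sample data standing in for the two Excel exports.
finance_export = {
    "E-100": Decimal("120.00"),
    "E-101": Decimal("45.50"),
    "E-102": Decimal("9.99"),
}
bank_export = {
    "E-100": Decimal("120.00"),
    "E-101": Decimal("54.50"),  # transposed digits, a typical discrepancy
    "E-103": Decimal("300.00"),
}

report = reconcile(finance_export, bank_export)
for section, items in report.items():
    print(section, items)
```

The point is less the code than where it came from: a script like this is small enough for the person who owns the task to arrive at with an AI tool, and specific enough that nobody would have written it up in a requirements meeting.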

Why the Most Impressive Use Case is the Worst Starting Point

There's a strong instinct, when introducing agents to an organization, to lead with the thing that will generate the most awe. The demo that makes the VP's jaw drop. The use case that sounds like science fiction.

Resist this instinct.

The best starting point is the most painful task — the one nobody wants to touch, the one that gets passed around, the one that's been a source of ambient misery for years. There are two reasons for this.

First, the bar is underground. When something is universally dreaded, any improvement is a win. An agent that handles 60% of a hated task well and leaves the other 40% for a person to fix is still a massive improvement over the status quo, provided its mistakes are easy to spot. People calibrate their expectations to match the reality of what they were dealing with before. An impressive-sounding use case has a high bar to clear; a miserable one has a negative bar.

Second, there's no threat. When I built an agent for the analysts that could run SQL queries on behalf of anyone in the company, the initial reaction wasn't excitement. It was fear.

"Wait — that's what I do."

That's a natural response. And if I'd led with capability — "look what this agent can do" — I'd have lost them immediately. Instead, I had a conversation. We worked through the logic together. Analysts weren't hired to write SQL. They were hired to do deep analysis — to interpret data, identify patterns, surface insights that inform decisions. Writing SQL is a means to that end, not the end itself. The agent wasn't replacing the analysts. It was freeing them from the part of their job they didn't sign up for.

Same with developers. The pitch wasn't "this AI can write your code." It was: "you didn't get into engineering to write boilerplate. You got into it to design systems and solve hard problems. Let the agent handle the repetitive parts so you can do the actual work."

Amy Edmondson's research on psychological safety is useful here. Her central finding is that people are far more willing to experiment, speak up, and learn new ways of working when they feel safe to fail. An agent demo in front of leadership is not a psychologically safe environment. An individual using an AI tool on their own work, with no one watching, is. The readiness phase is also a safety-building phase.

A Historical Parallel Worth Sitting With

Johannes Gutenberg introduced the movable-type printing press to Europe around 1440. By most measures, it was one of the most consequential technological inventions in Western history: the enabling infrastructure for the Scientific Revolution, the Reformation, and the Enlightenment.

But the printing press didn't immediately transform society. For the first several decades, it mostly produced Bibles and official documents. The mass-market impact took generations to arrive, and the bottleneck wasn't the machine. European literacy rates in 1450 are commonly estimated at somewhere around 10-20% in urban areas, and far lower in rural ones. The technology was ready. The users weren't.

The printing press required a parallel investment in literacy — in building the human capacity to extract value from the tool — before it could deliver its systemic promise. That investment took over a century.

We're at an analogous moment with AI agents. The technology is, by most reasonable measures, ready enough. The agents can do real work. The tools are mature. What's not ready, in most organizations, is the human layer.

You can't skip the literacy phase. Gutenberg couldn't, and neither can you.

What Organizational Readiness Actually Looks Like

Based on what worked at Mixtiles, here's a practical framework:

Phase 1: Individual fluency. Every person who will eventually interact with agents needs direct, hands-on experience with AI tools on their own work. Not a demo. Not a workshop where they watch someone else. Actual usage, on actual tasks that matter to them. The technical people first — they'll become the peer coaches for everyone else. Then the analysts. Then the broader organization. The sequence matters because the technical people can help translate.

Phase 2: Organic workflow discovery. Give people tools and space to build their own solutions. Resist the urge to specify what they should build. The most valuable output of this phase isn't the workflows themselves — it's the intelligence about what people actually need. Watch what emerges. Document it. This is your real requirements gathering.

Phase 3: Agent extraction. Now build agents. You're not designing from scratch — you're formalizing what already exists. The workflows are already tested, because they've been running in practice. The users already understand how AI fails, because they've experienced it themselves. When you deploy the agent, they don't test it to destruction. They meet it.

The Real Measure of Agent Success

There's a metrics question that most organizations get wrong when evaluating AI agents: they measure the agent in isolation.

Does the agent complete the task? How often does it fail? What's the error rate?

These are the wrong questions. The right question is: how does the team respond when the agent fails?

An organizationally mature team, one that's gone through the readiness phase, treats agent failures as debugging exercises. They iterate on the prompt, adjust the workflow, flag the edge case for improvement. Their tolerance for imperfection is calibrated to the reality of how AI works.

An organizationally immature team treats the same failure as a verdict: "it's not mature." Project shelved.

The agent didn't change. The organization's readiness determined whether it survived its first encounter with reality.

What to Do Monday Morning

If you're a technical leader thinking about agent deployments, here's a practical starting point:

Don't build agents yet. Run a workshop instead. Get your team using AI tools directly — Claude Code for engineers, whatever fits for other roles. The constraint: real tasks, not toy problems. Real work that needs to get done.

Start with the most painful task, not the most impressive one. Find the thing everyone hates doing. The thing that's been a low-grade source of misery. Build the first agent there.

Watch what people build before you decide what to automate. The organic workflows are your requirements document.

Have the job-redefinition conversation before you deploy. Don't let people discover that their role overlaps with an agent in a demo. Have the conversation directly: "You weren't hired to do X. You were hired to do Y. The agent handles X so you can do more Y."

Measure organizational readiness as a first-class metric. Track how many people on your team have meaningful hands-on AI experience. That number is a better leading indicator of agent success than any technical benchmark.
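One possible way to track that last number is sketched below in Python. This is an assumption-laden illustration, not a metric definition used at Mixtiles: the `TeamMember` fields, the `readiness_ratio` name, and the two-week threshold (borrowed loosely from the "even two weeks" observation earlier) are all hypothetical choices you would adjust for your own team.

```python
from dataclasses import dataclass


@dataclass
class TeamMember:
    name: str
    hands_on_weeks: int  # weeks spent using AI tools on their own real tasks


def readiness_ratio(members, min_weeks=2):
    """Return the fraction of the team with at least `min_weeks` of hands-on use."""
    if not members:
        return 0.0
    fluent = sum(1 for m in members if m.hands_on_weeks >= min_weeks)
    return fluent / len(members)


# Hypothetical team snapshot.
team = [
    TeamMember("analyst_1", 0),
    TeamMember("analyst_2", 2),
    TeamMember("engineer_1", 5),
    TeamMember("marketer_1", 1),
]

print(f"Readiness: {readiness_ratio(team):.0%}")
```

A single ratio hides a lot, so it is probably most useful as a trend line: if it isn't rising, the agent launch date shouldn't be either.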

The Uncomfortable Truth

The VP R&D's agent probably worked fine. The problem wasn't technical maturity — it was organizational maturity. And organizational maturity isn't something you can sprint to. You can't compress it into a training session or a pilot program. It's built through accumulated experience, through individual encounters with AI tools on real problems, through conversations about what roles actually mean.

Agents don't fail because they're bad. They fail because you arrived with them before the organization was ready.

The missing phase isn't a delay. It's the work.

Sahar Carmel is the Director of AI Enablement at Mixtiles and the founder of Squid Club, a community of AI-first developers in Israel. He writes about organizational transformation, AI adoption, and what it actually takes to make agents work.
