Engineering · February 11, 2026 · 3 min read

How Not to Build an Agent: Part 1 of Building Agents That Do Real Work

Most agents collapse the moment they leave the demo. Here is why, and the mistakes every team keeps making.

Building an Agent Feels Incredible — Until It Hits Production

In demos, everything works. In production, agents hallucinate tool calls, forget constraints, deadlock, or quietly burn API credits at 3 AM.

This is Part 1 of a series about building agents that do real work, not impressive demos. We start with the most important lesson: most agent failures are not model failures. They are design failures.

Your Stack Doesn't Matter (But Your Thinking Does)

People argue endlessly about frameworks: LangChain vs. CrewAI vs. custom stacks. It does not matter.

An LLM is not alive. It has no intent, no goals, no understanding of your business.

Think of it like a massive encyclopedia in a library. It contains everything — but it will not find anything unless you give it exact instructions.

If you do not deeply understand the manual workflow you are automating, your agent never will.

Agents do not fail because they are dumb. They fail because we give them vague responsibilities.

Over-Engineering Is the Silent Killer

Not every workflow needs an agent. This is where engineer brain betrays us.

We build planners, memory layers, and multi-agent orchestration for tasks that could be handled by a script, a form, or a simple automation.

Every extra layer adds surface area for failure. Complexity is not power. It is fragility. If the simplest version does not work reliably, the agent version will be worse.
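To make the point concrete, here is a hedged sketch of a task that often gets pitched as "needs an agent" — deciding a refund request — handled by a few deterministic rules instead. The function name and thresholds are invented for illustration:

```python
# A sketch, not a real policy: a refund decision as plain rules.
# No planner, no memory layer, no orchestration -- and it never hallucinates.

def route_refund(amount: float, days_since_purchase: int) -> str:
    """Decide a refund request deterministically."""
    if days_since_purchase > 30:
        return "deny"          # outside the refund window
    if amount <= 50:
        return "auto_approve"  # small amounts need no human
    return "human_review"      # everything else goes to a person
```

If a version like this covers 95% of cases, an agent has to beat it on the remaining 5% to earn its complexity. Usually it does not.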

Bad Prompting Is a Design Bug, Not a Skill Issue

Bad prompting is not about clever wording. It is about missing structure.

Giving an agent vague tool descriptions is like handing a world-class chef a terrible recipe. The chef is not the problem.

In production, "the agent will infer it" is another way of saying "this will break."

If the agent can guess, it will guess. Guessing in production is a bug.
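The fix is structure, not wordsmithing. Here is a hedged sketch contrasting a vague tool definition with one that removes room for guessing — the tool name, fields, and enums are hypothetical, in the JSON-Schema style most tool-calling APIs use:

```python
# Hypothetical tool definitions for illustration; not a real API.

# Vague: the agent must guess which service, which environment, which commit.
vague_tool = {
    "name": "deploy",
    "description": "Deploys stuff.",
}

# Precise: every choice the agent could guess at is an explicit, constrained field.
precise_tool = {
    "name": "deploy_service",
    "description": (
        "Deploy exactly one service to one environment. "
        "Fails if the given commit has no passing CI run."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "service": {"type": "string", "enum": ["api", "worker", "web"]},
            "environment": {"type": "string", "enum": ["staging", "production"]},
            "commit_sha": {"type": "string", "pattern": "^[0-9a-f]{40}$"},
        },
        "required": ["service", "environment", "commit_sha"],
    },
}
```

Every `enum`, `pattern`, and `required` entry is a guess the agent no longer gets to make.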

We Are Ignoring 20 Years of Multi-Agent Research

Here is the uncomfortable truth: most modern "agentic AI" systems are reinventing problems that were solved decades ago.

That is why we see agents arguing over definitions, agents violating rules, and agents waiting on each other forever.

Classic multi-agent systems used explicit belief and goal models, shared state, norms and constraints, and commitment protocols.

LLMs do not have these unless you build them into the architecture.

If your agent is not allowed to deploy after 5 PM, that should not live in a prompt. It should be a hard constraint the system enforces.

The Core Lesson

LLMs are not agents. Agents are systems.

Prompting is not engineering. Architecture is.

Next in this series: How to design your first real agent without over-engineering or magical thinking.