AI agent architectures
Five architecture patterns for production AI agents. Each solves different problems. Pick the simplest one that works, then layer complexity only when you have evidence it's needed.
Updated . 2026-05-02 . 9 min read
Pattern 01
ReAct (Reason + Act)
The agent alternates between reasoning (thinking about what to do) and acting (calling a tool or API). Each observation feeds back into the next reasoning step.
Best for
- General-purpose agents that need to handle diverse tasks
- Customer support agents that look up data and take actions
- Research agents that search, synthesize, and answer
Trade-offs
- Simple to implement. The default choice for most agent projects.
- Can get stuck in loops if reasoning goes off track.
- Each step adds latency. Complex tasks mean many LLM calls.
Example
Support agent: user asks about order status. Agent reasons it needs order ID, calls order API, reasons about the result, formulates response.
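The loop in this example can be sketched as follows. This is a minimal illustration, not a production implementation: `fake_llm` is a scripted stand-in for a real model call, `get_order_status` and its order data are hypothetical, and the `max_steps` cap is one simple way to guard against the looping failure mode noted in the trade-offs.

```python
import re

# Hypothetical tool: stands in for a real order API.
def get_order_status(order_id: str) -> str:
    orders = {"A123": "shipped", "B456": "processing"}
    return orders.get(order_id, "not found")

TOOLS = {"get_order_status": get_order_status}

def fake_llm(question: str, history: list) -> dict:
    """Scripted stand-in for a model call (an assumption, not a real API).

    Returns either an action to take or a final answer.
    """
    if not history:
        match = re.search(r"\b[A-Z]\d{3}\b", question)
        if match is None:
            return {"thought": "No order ID given.",
                    "final": "Please share your order ID."}
        return {"thought": "I need to look up this order.",
                "action": "get_order_status", "input": match.group()}
    observation = history[-1]["observation"]
    return {"thought": "I have the status.",
            "final": f"Your order is {observation}."}

def react(question: str, max_steps: int = 5) -> str:
    history = []
    for _ in range(max_steps):
        step = fake_llm(question, history)          # reason
        if "final" in step:
            return step["final"]
        observation = TOOLS[step["action"]](step["input"])  # act
        # The observation feeds the next reasoning step.
        history.append({**step, "observation": observation})
    return "Stopped: step budget exhausted."

answer = react("Where is order A123?")
```

In a real agent the scripted policy would be replaced by a model call that receives the question and the accumulated history; the loop structure stays the same.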
Pattern 02
Plan-and-Execute
A planner LLM creates a full plan upfront, then an executor LLM carries out each step. The planner can revise the plan if execution hits unexpected results.
Best for
- Complex multi-step tasks where order matters
- Data pipeline orchestration and ETL workflows
- Tasks with dependencies between steps
Trade-offs
- Often handles long, dependency-heavy tasks better than ReAct because the whole plan is laid out before any step runs.
- Planning step adds upfront latency but can reduce the total number of steps.
- Plan revision is expensive. Bad plans waste execution budget.
Example
Data migration agent: plans extraction order based on foreign keys, executes table-by-table, revises if a dependency fails.
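The migration example can be sketched like this. It is a simplified illustration under stated assumptions: the planner is replaced by a topological sort from Python's standard `graphlib` module rather than an LLM, and the `FOREIGN_KEYS` schema, the `flaky_migrate` executor, and the `max_replans` budget are all hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical schema: each table maps to the tables it references.
FOREIGN_KEYS = {
    "customers": set(),
    "products": set(),
    "orders": {"customers"},
    "order_items": {"orders", "products"},
}

def make_plan(deps, done=frozenset()):
    """Planner stand-in: order the remaining tables so parents come first."""
    remaining = {t: parents - done for t, parents in deps.items() if t not in done}
    return list(TopologicalSorter(remaining).static_order())

def execute(deps, migrate, max_replans=2):
    """Executor: run the plan step by step, replanning after a failure."""
    done = []
    plan = make_plan(deps)
    replans = 0
    while plan:
        table = plan.pop(0)
        try:
            migrate(table)
        except RuntimeError:
            if replans >= max_replans:
                raise  # replanning budget exhausted
            replans += 1
            plan = make_plan(deps, frozenset(done))  # revise the plan
            continue
        done.append(table)
    return done

attempts = {}

def flaky_migrate(table):
    """Hypothetical executor step that fails once on the orders table."""
    attempts[table] = attempts.get(table, 0) + 1
    if table == "orders" and attempts[table] == 1:
        raise RuntimeError("transient failure copying orders")

migrated = execute(FOREIGN_KEYS, flaky_migrate)
```

Keeping the planner and executor as separate functions mirrors the pattern: the executor never decides ordering, it only reports success or failure back to the planner.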
Pattern 03
Multi-Agent Systems
Multiple specialized agents collaborate. A supervisor routes tasks to the right specialist. Agents can hand off to each other or work in parallel.
Best for
- Complex workflows spanning multiple domains
- Systems where different steps need different expertise or tools
- Parallel processing of independent subtasks
Trade-offs
- Can handle problems too broad for a single agent, since each specialist keeps its own prompt, tools, and context.
- Hardest to debug. Failures cascade between agents.
- Coordination adds overhead, and the system is only as good as the supervisor's routing decisions.
Example
Insurance claims: intake agent extracts data, fraud agent checks patterns, routing agent assigns to adjuster, comms agent drafts response.
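A rough sketch of the supervisor routing in this example follows. Everything here is illustrative: the four agents are plain functions rather than LLM-backed specialists, the supervisor is a rule table rather than a model, and the `Claim` fields, fraud threshold, and queue names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    text: str
    data: dict = field(default_factory=dict)
    trace: list = field(default_factory=list)

def intake_agent(claim):
    # Toy extraction: take the first integer in the text as the amount.
    words = claim.text.replace("$", " ").split()
    claim.data["amount"] = next((int(w) for w in words if w.isdigit()), 0)

def fraud_agent(claim):
    # Hypothetical rule: large claims are flagged for review.
    claim.data["fraud_flag"] = claim.data["amount"] > 10_000

def routing_agent(claim):
    flagged = claim.data["fraud_flag"]
    claim.data["adjuster"] = "special-investigations" if flagged else "standard-queue"

def comms_agent(claim):
    claim.data["reply"] = f"Your claim was assigned to {claim.data['adjuster']}."

AGENTS = {"intake": intake_agent, "fraud": fraud_agent,
          "routing": routing_agent, "comms": comms_agent}

# Each specialist is responsible for producing one field of the claim.
RESPONSIBILITIES = [("amount", "intake"), ("fraud_flag", "fraud"),
                    ("adjuster", "routing"), ("reply", "comms")]

def supervisor(claim):
    """Pick the next specialist from the claim state; None when finished."""
    for key, agent_name in RESPONSIBILITIES:
        if key not in claim.data:
            return agent_name
    return None

def run(claim, max_hops=10):
    for _ in range(max_hops):
        name = supervisor(claim)
        if name is None:
            break
        AGENTS[name](claim)
        claim.trace.append(name)  # the trace helps when debugging handoffs
    return claim

small = run(Claim("Water damage in kitchen, repair estimate $2500"))
large = run(Claim("Stolen jewelry worth $48000"))
```

Recording which agent ran at each hop is a cheap way to ease the debugging difficulty noted in the trade-offs.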
Pattern 04
Reflexion
The agent executes, evaluates its own output, and self-corrects. A critic module (often the same LLM with a different prompt) reviews results and suggests improvements.
Best for
- Code generation where output can be tested
- Writing tasks where quality can be self-evaluated
- Any task with programmatic success criteria
Trade-offs
- Can substantially improve output quality on tasks with clear success criteria.
- Each critique-and-retry round adds LLM calls, so inference cost per task can double or triple.
- Only works when the agent can meaningfully evaluate its own output.
Example
Code agent: writes function, runs tests, reads failures, rewrites function. Repeats until tests pass or budget exhausted.
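The write, test, and rewrite cycle in this example might look like the sketch below. It assumes a scripted `fake_generate` in place of a real model (it simply returns a better draft once it has seen feedback), and the `slugify` task, `CANDIDATES`, and `TESTS` are hypothetical. The evaluator is ordinary test code, which is what makes the success criterion programmatic.

```python
# Hypothetical drafts a model might produce: the first one has a bug.
CANDIDATES = [
    "def slugify(s):\n    return s.replace(' ', '-')",
    "def slugify(s):\n    return s.strip().lower().replace(' ', '-')",
]

TESTS = [("Hello World", "hello-world"), ("  Trim me ", "trim-me")]

def fake_generate(feedback: list) -> str:
    """Stand-in for an LLM call: the draft improves once feedback exists."""
    return CANDIDATES[min(len(feedback), len(CANDIDATES) - 1)]

def evaluate(source: str) -> list:
    """Critic: run the tests and return human-readable failure messages."""
    namespace = {}
    exec(source, namespace)  # acceptable for a sketch; sandbox real model output
    fn = namespace["slugify"]
    failures = []
    for arg, expected in TESTS:
        got = fn(arg)
        if got != expected:
            failures.append(f"slugify({arg!r}) returned {got!r}, expected {expected!r}")
    return failures

def reflexion(max_attempts: int = 3):
    feedback = []
    for attempt in range(1, max_attempts + 1):
        source = fake_generate(feedback)
        failures = evaluate(source)
        if not failures:
            return source, attempt
        feedback.append("; ".join(failures))  # fed into the next draft
    return None, max_attempts  # budget exhausted

final_source, attempts_used = reflexion()
```

The `max_attempts` parameter is the budget from the example: without it, a task the model cannot solve would retry indefinitely.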
Pattern 05
Tool-Augmented Generation
The LLM decides when to call external tools (APIs, databases, calculators, code interpreters) to augment its responses. Simpler than full agents: no loops, no planning.
Best for
- Chatbots that need real-time data (weather, stock prices, order status)
- Assistants that perform calculations or lookups
- RAG systems with structured data sources
Trade-offs
- Simplest architecture. Easy to build, test, and debug.
- No multi-step reasoning. One tool call per turn.
- Not an 'agent' in the strict sense, but as a rule of thumb it covers roughly 80% of use cases.
Example
Finance assistant: user asks for portfolio performance. LLM calls portfolio API, formats the data, returns a summary.
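A single-turn version of this example can be sketched as follows. The names are assumptions for illustration: `fake_llm_decide` is a keyword-based stand-in for the model's tool-selection step, and `get_portfolio` with its figures is a hypothetical API. Note there is no loop: at most one tool call, then a formatted reply.

```python
# Hypothetical tool: stands in for a real portfolio API.
def get_portfolio(user_id: str) -> dict:
    return {"start_value": 10_000.0, "end_value": 10_750.0}

TOOLS = {"get_portfolio": get_portfolio}

def fake_llm_decide(message: str):
    """Stand-in for the model deciding whether a tool call is needed."""
    if "portfolio" in message.lower():
        return {"tool": "get_portfolio", "args": {"user_id": "u1"}}
    return None

def answer(message: str) -> str:
    call = fake_llm_decide(message)
    if call is None:
        # No tool needed: reply directly.
        return "I can help with portfolio questions."
    data = TOOLS[call["tool"]](**call["args"])  # exactly one tool call
    change = (data["end_value"] - data["start_value"]) / data["start_value"] * 100
    return f"Your portfolio moved {change:+.1f}% to ${data['end_value']:,.2f}."

summary = answer("How is my portfolio doing?")
```

Because the control flow is a single branch, each path can be unit-tested directly, which is the main reason this pattern is easy to debug.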
Decision framework
How to pick the right architecture.
Start with tool-augmented generation. If the task needs multi-step reasoning, upgrade to ReAct. If steps have dependencies, use plan-and-execute. If the domain is broad enough to need specialists, go multi-agent. Add reflexion when output quality matters and you can evaluate it programmatically.
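The escalation path above can be written down as a small helper. This is one possible reading rather than a definitive rule set: the function name, the flag names, and the precedence (specialists over dependencies over multi-step reasoning) are assumptions made for illustration.

```python
def pick_architecture(
    multi_step: bool = False,
    step_dependencies: bool = False,
    needs_specialists: bool = False,
    quality_critical: bool = False,
    programmatic_eval: bool = False,
) -> list:
    """Return the base pattern, plus reflexion when it applies."""
    if needs_specialists:
        base = "multi-agent"
    elif step_dependencies:
        base = "plan-and-execute"
    elif multi_step:
        base = "ReAct"
    else:
        base = "tool-augmented generation"  # the default starting point
    layers = [base]
    # Reflexion is layered on top only when output can be checked in code.
    if quality_critical and programmatic_eval:
        layers.append("reflexion")
    return layers

choice = pick_architecture(multi_step=True, quality_critical=True,
                           programmatic_eval=True)
```

Reflexion is modeled as an add-on rather than a fifth base option because the text describes it as something to add once the output can be evaluated programmatically.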
Most production agents we build at AISD start as ReAct and stay there. The architecture that ships and works is better than the architecture that's theoretically optimal but takes 3x longer to build.