LLM orchestration patterns
LLM orchestration is the art of composing language model calls with tools, data sources, and control flow to solve real problems. Six patterns cover 90% of production use cases. Picking the right one is the highest-leverage architecture decision in an AI build.
Updated · 2026-05-02 · 10 min read
The six patterns
From simple chains to multi-agent systems
These patterns are ordered by complexity. Start at the top. Move down only when the simpler pattern genuinely can't solve the problem. Most teams over-architect on the first build.
Pattern 01
Sequential chain
Model A's output feeds into Model B's input. Simple, predictable, easy to debug. Best for pipelines where each step has a clear single purpose.
When to use
Summarize → classify → route. Extract → validate → store.
Tradeoff
Latency compounds. No parallelism. Error in step 1 cascades.
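A minimal sketch of the chain. `call_model` is a hypothetical stand-in for a real LLM client, stubbed here so the shape of the pipeline is visible: each step consumes the previous step's output.

```python
def call_model(task: str, text: str) -> str:
    # Hypothetical LLM call; swap in a real client SDK in production.
    return f"{task}({text})"

def sequential_chain(ticket: str) -> str:
    summary = call_model("summarize", ticket)   # step 1
    label = call_model("classify", summary)     # step 2 consumes step 1's output
    return call_model("route", label)           # step 3 consumes step 2's output
```

Because the steps are strictly ordered, total latency is the sum of all three calls, and a bad summary in step 1 poisons everything after it.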
Pattern 02
Router / classifier dispatch
A lightweight model classifies the input and routes it to a specialized handler. Each handler can use a different model, prompt, or tool set.
When to use
Multi-intent inputs (support tickets, mixed queries). Cost optimization by routing simple tasks to cheaper models.
Tradeoff
Router accuracy is a single point of failure. Misrouting is invisible without eval.
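A sketch of the dispatch, with the classifier stubbed as a keyword check (a real build would use a small model). The handler names and intents are illustrative assumptions; the key structural point is the fallback handler, which contains the damage when routing misfires.

```python
def classify_intent(text: str) -> str:
    # Hypothetical lightweight classifier; a cheap model in production.
    if "refund" in text.lower():
        return "billing"
    if "error" in text.lower():
        return "technical"
    return "general"

HANDLERS = {
    "billing": lambda t: f"billing-handler:{t}",
    "technical": lambda t: f"tech-handler:{t}",
    "general": lambda t: f"general-handler:{t}",
}

def route(ticket: str) -> str:
    intent = classify_intent(ticket)
    # Fall back to the general handler on unknown intents,
    # so a misclassification degrades rather than crashes.
    return HANDLERS.get(intent, HANDLERS["general"])(ticket)
```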
Pattern 03
Map-reduce / fan-out
Split a large input into chunks, process each in parallel, then merge the results. The classic pattern for long-document analysis.
When to use
Document summarization. Multi-source research. Batch classification.
Tradeoff
Merge step is hard. Parallel calls multiply cost. Chunk boundaries lose context.
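The fan-out can be sketched with a thread pool; the per-chunk summarizer is a stub standing in for an LLM call, and the chunk size and overlap values are illustrative. Overlapping chunks is one common mitigation for the boundary-context loss noted above.

```python
from concurrent.futures import ThreadPoolExecutor

def summarize_chunk(chunk: str) -> str:
    # Hypothetical per-chunk LLM call, stubbed for illustration.
    return chunk[:20]

def chunk_text(text: str, size: int = 1000, overlap: int = 100) -> list[str]:
    # Overlap softens context loss at chunk boundaries.
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

def map_reduce(doc: str) -> str:
    chunks = chunk_text(doc)
    with ThreadPoolExecutor() as pool:          # map: process chunks in parallel
        partials = list(pool.map(summarize_chunk, chunks))
    # reduce: one final call merges the partial summaries
    return summarize_chunk(" | ".join(partials))
```

Note that the reduce call is itself an LLM call, so cost is (chunks + 1) inferences per document.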
Pattern 04
ReAct (Reason + Act)
The model thinks, decides on a tool call, observes the result, and loops. The foundation of most agent architectures. Each iteration is a full LLM inference.
When to use
Open-ended tasks requiring tool use. Customer support agents. Research agents.
Tradeoff
Cost and latency scale with loop count. Needs safety bounds (max iterations, budget caps).
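A sketch of the loop, assuming a text protocol where the model either emits `tool:argument` or `FINAL: answer` (that protocol is an assumption for illustration; real frameworks use structured tool-call formats). The hard iteration cap is the safety bound mentioned above.

```python
def react_agent(question, tools, llm, max_iters=5):
    """Think -> act -> observe loop with a hard iteration cap."""
    transcript = [f"Question: {question}"]
    for _ in range(max_iters):
        # Each iteration is a full LLM inference over the growing transcript.
        decision = llm("\n".join(transcript))
        if decision.startswith("FINAL:"):
            return decision[len("FINAL:"):].strip()
        tool_name, _, arg = decision.partition(":")
        observation = tools[tool_name](arg)     # execute the chosen tool
        transcript.append(f"Action: {decision}")
        transcript.append(f"Observation: {observation}")
    return "budget exhausted"                   # safety bound, not an answer
```

Because the transcript grows every turn, token cost per iteration also grows, which is why budget caps matter as much as iteration caps.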
Pattern 05
Plan-and-execute
A planner model generates a full plan upfront. An executor model runs each step. The planner can revise the plan based on executor results.
When to use
Complex multi-step tasks where planning quality matters. Coding agents. Workflow generation.
Tradeoff
Planning step is expensive. Plan revision adds latency. Over-planning on simple tasks.
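The planner/executor split can be sketched as two callables; both are hypothetical stand-ins for model calls. The executor receives the results so far as context, which is what lets a fuller version of this pattern revise the plan mid-run.

```python
def plan_and_execute(goal, planner, executor):
    # planner: goal -> ordered list of steps (one expensive upfront call)
    # executor: (step, prior results) -> result (one cheaper call per step)
    plan = planner(goal)
    results = []
    for step in plan:
        results.append(executor(step, results))
    return results
```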
Pattern 06
Multi-agent collaboration
Multiple specialized agents communicate through a shared workspace or message bus. Each agent has its own tools, prompts, and model. A supervisor agent coordinates.
When to use
Complex workflows spanning multiple domains. Enterprise process automation.
Tradeoff
Debugging is hard. Cost multiplies. Coordination overhead. Start with a single agent first.
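A minimal sketch of the supervisor shape, assuming each agent is a named callable (standing in for its own model, prompt, and tools) and the shared workspace is a message log. The class and method names are illustrative, not a real framework's API.

```python
class Agent:
    def __init__(self, name, handle):
        # handle: hypothetical per-agent LLM + tool stack, as a callable
        self.name, self.handle = name, handle

class Supervisor:
    """Coordinates specialist agents through a shared message log."""
    def __init__(self, agents):
        self.agents = {a.name: a for a in agents}
        self.log = []   # shared workspace: every exchange is recorded here

    def dispatch(self, to, message):
        reply = self.agents[to].handle(message, self.log)
        self.log.append((to, reply))
        return reply
```

The shared log is also your debugging surface: without it, tracing which agent produced a bad intermediate result is nearly impossible.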
Decision framework
How to pick the right pattern
- 01 Start with a sequential chain. If it works, stop. Most production AI runs on simple chains.
- 02 If inputs are heterogeneous, add a router. Route by intent, not by complexity.
- 03 If inputs are large, split with map-reduce. Keep chunk sizes under 4K tokens for accuracy.
- 04 If the task requires tool use, use ReAct. Cap iterations at 5-10 for cost control.
- 05 If planning quality matters (coding, multi-step workflows), try plan-and-execute.
- 06 If the workflow spans multiple domains with different tool sets, consider multi-agent. But exhaust single-agent options first.