
Multi-agent orchestration patterns: 6 patterns with tradeoffs

Six multi-agent orchestration patterns with explicit tradeoffs. When it's worth it, when it's not, and how to avoid the 'crew' anti-pattern where 5 agents do the work of 1.

May 15, 2026 · 11 min · ai-agents

Multi-agent is oversold. In 80% of the cases we see, a well-designed single agent would beat a three-agent “crew.” But there are genuine cases where multi-agent pays off. Here are six patterns, with their tradeoffs.

Pattern 1 · Sequential pipeline

Agent A → output → Agent B → output → Agent C.

When it’s worth it: tasks with distinct phases, where each phase benefits from a different model. Researcher (Sonnet) → writer (Opus) → reviewer (Haiku for a quick check).

Tradeoff:

✅ Each phase runs on the model best suited to it.
✅ High auditability (each output is a checkpoint).
❌ Latency adds up. 3 sequential agents = 3× the latency of 1.
❌ Error propagates. Failure in A becomes garbage in B and C.

Mitigation: validation between stages, with retry.
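
A minimal sketch of that mitigation, assuming each stage is a plain function from text to text and each validator is a simple predicate. The names `run_pipeline`, `StageFailed`, and the stub stages are hypothetical; in a real system each stage would call a model.

```python
from typing import Callable

Stage = Callable[[str], str]
Validator = Callable[[str], bool]


class StageFailed(Exception):
    """Raised when a stage never produces output that passes validation."""


def run_pipeline(
    task: str,
    stages: list[tuple[str, Stage, Validator]],
    max_retries: int = 2,
) -> str:
    """Run stages in order, validating each output before passing it on."""
    current = task
    for name, stage, is_valid in stages:
        for _attempt in range(max_retries + 1):
            output = stage(current)
            if is_valid(output):
                current = output  # checkpoint: only validated output moves on
                break
        else:
            # Stop here instead of feeding garbage to the later stages.
            raise StageFailed(f"stage {name!r} failed validation")
    return current


# Stand-in stages (hypothetical); each would normally call a different model.
def researcher(task: str) -> str:
    return f"notes on {task}"


def writer(notes: str) -> str:
    return f"draft based on {notes}"


def reviewer(draft: str) -> str:
    return f"approved: {draft}"


def non_empty(text: str) -> bool:
    return bool(text.strip())


result = run_pipeline(
    "vector databases",
    [
        ("research", researcher, non_empty),
        ("write", writer, non_empty),
        ("review", reviewer, non_empty),
    ],
)
```

The point of the design is that a failing stage halts the run at a named checkpoint, so the error surfaces where it happened rather than two stages later.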

Pattern 2 · Specialist routing (router + experts)

Router agent classifies request → routes to appropriate expert (legal, financial, support). Each expert is specialized.

When it’s worth it: domain where real specialization exists and the router classifies with high confidence.

Tradeoff:

✅ Each expert can have a short, focused system prompt.
✅ Cost-efficient (simple expert can use Haiku).
❌ A router that misclassifies breaks the whole system: the wrong expert answers with full confidence.
❌ Cross-domain requests (for example, one touching both legal and financial) have no clean destination.

Mitigation: router with confidence threshold; when low, escalate to multi-expert or human.
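
The threshold idea can be sketched as follows. This is an illustration under assumptions: the router is assumed to return a label plus a confidence score, the 0.8 threshold is arbitrary, and `Route`, `dispatch`, and the keyword-based router are hypothetical stand-ins for a model-backed classifier.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Route:
    label: str
    confidence: float  # 0.0 to 1.0, as reported by the router


def dispatch(
    request: str,
    classify: Callable[[str], Route],
    experts: dict[str, Callable[[str], str]],
    escalate: Callable[[str], str],
    threshold: float = 0.8,  # assumed value; tune against real traffic
) -> str:
    """Send the request to one expert, or escalate when the router is unsure."""
    route = classify(request)
    expert = experts.get(route.label)
    if expert is None or route.confidence < threshold:
        return escalate(request)
    return expert(request)


# Toy router (hypothetical): confident only when exactly one domain matches.
def keyword_router(request: str) -> Route:
    text = request.lower()
    hits = [label for label in ("legal", "financial") if label in text]
    if len(hits) == 1:
        return Route(hits[0], 0.95)
    return Route(hits[0] if hits else "unknown", 0.4)


def legal_expert(request: str) -> str:
    return f"[legal] {request}"


def financial_expert(request: str) -> str:
    return f"[financial] {request}"


def human_review(request: str) -> str:
    return f"[human review] {request}"


EXPERTS = {"legal": legal_expert, "financial": financial_expert}

clear = dispatch("Is this legal under the contract?", keyword_router, EXPERTS, human_review)
mixed = dispatch("Legal and financial exposure of the merger", keyword_router, EXPERTS, human_review)
```

Here `clear` goes to the legal expert, while `mixed` touches two domains, gets a low confidence score, and is escalated instead of being forced into one bucket.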

Pattern 3 · Debate / dialectical

Two agents argue opposing positions; a third decides.

When it’s worth it: decisions with sensitive tradeoffs (approve/reject complex request, choose between alternatives).

Tradeoff:

✅ Captures nuance a single agent skips.
❌ Very expensive. 3× the cost and higher latency.
❌ Risk of “theatrical debate”: agents generate arguments without real adversarial pressure.

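Used in: mostly research settings (debate has been studied as an AI alignment technique, and Anthropic’s Constitutional AI training uses a related critique-and-revise loop); rarely in direct commercial production.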
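
For reference, the control flow of the pattern is small. The sketch below assumes one advocate call per position and a judge that must pick one of the argued positions; `debate`, the canned advocate, and the rule-based judge are all hypothetical stubs standing in for model calls.

```python
from typing import Callable

Advocate = Callable[[str, str], str]          # (question, position) -> argument
Judge = Callable[[str, dict[str, str]], str]  # (question, arguments) -> position


def debate(
    question: str,
    positions: tuple[str, str],
    advocate: Advocate,
    judge: Judge,
) -> tuple[str, dict[str, str]]:
    """Collect one argument per position, then let the judge pick a side."""
    arguments = {position: advocate(question, position) for position in positions}
    verdict = judge(question, arguments)
    if verdict not in arguments:
        raise ValueError(f"judge returned an unknown position: {verdict!r}")
    # Return the arguments too, so the decision can be audited later.
    return verdict, arguments


# Stub advocate (hypothetical): canned arguments instead of a model call.
def stub_advocate(question: str, position: str) -> str:
    canned = {
        "approve": "The request fits policy and the customer has a clean history.",
        "reject": "The amount exceeds the limit, which is a compliance risk.",
    }
    return canned[position]


# Stub judge (hypothetical): any argument citing a compliance risk wins.
def stub_judge(question: str, arguments: dict[str, str]) -> str:
    for position, argument in arguments.items():
        if "compliance risk" in argument:
            return position
    return next(iter(arguments))


verdict, arguments = debate(
    "Approve the refund request?",
    ("approve", "reject"),
    stub_advocate,
    stub_judge,
)
```

Three calls for one decision is exactly where the 3× cost comes from, and nothing in this structure by itself prevents the theatrical-debate failure mode.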

Pattern 4 · Hierarchical (manager + workers)

Manager agent breaks task into subtasks, distributes to workers, aggregates.

When it’s worth it: tasks that decompose well (research across N parallel topics, code across N modules).

Tradeoff:

✅ Real parallelism.
❌ Coordination costs. Manager needs large context window.
❌ When subtasks are interdependent, manager becomes bottleneck.
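
A minimal sketch of the fan-out and aggregation, assuming subtasks are independent and workers are async callables. `run_manager`, the comma-splitting planner, and the sleeping worker are hypothetical; a real manager would ask a model to produce the plan.

```python
import asyncio
from typing import Awaitable, Callable

Worker = Callable[[str], Awaitable[str]]


async def run_manager(
    task: str,
    plan: Callable[[str], list[str]],
    worker: Worker,
    aggregate: Callable[[list[str]], str],
    max_parallel: int = 4,  # assumed cap to avoid hammering rate limits
) -> str:
    """Split the task, run workers concurrently, then merge their results."""
    subtasks = plan(task)
    semaphore = asyncio.Semaphore(max_parallel)

    async def run_one(subtask: str) -> str:
        async with semaphore:
            return await worker(subtask)

    # gather returns results in the same order as the subtasks.
    results = await asyncio.gather(*(run_one(s) for s in subtasks))
    return aggregate(list(results))


# Stubs (hypothetical) standing in for the manager's planning and model calls.
def plan_by_topic(task: str) -> list[str]:
    return [topic.strip() for topic in task.split(",")]


async def research_worker(topic: str) -> str:
    await asyncio.sleep(0.01)  # stands in for a model call
    return f"summary of {topic}"


def join_report(parts: list[str]) -> str:
    return "\n".join(parts)


report = asyncio.run(
    run_manager("pricing, churn, onboarding", plan_by_topic, research_worker, join_report)
)
```

This only works cleanly because the three topics do not depend on each other. Once worker B needs worker A's output, the manager has to sequence them and the parallelism disappears.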

Pattern 5 · Adversarial / red team

One agent generates, another tries to break. Iterates until stable.

When it’s worth it: producing robust output (code with edge cases covered, text without legal ambiguity).

Tradeoff:

✅ Final output much better than single-agent.
❌ Infinite iterations if design is bad. Time-box mandatory.
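
One way to enforce the time-box is to cap both the number of revision rounds and the wall-clock time. The sketch below is an assumption about how such a loop could look, not a reference implementation; `adversarial_loop`, the default limits, and the stub writer and attacker are hypothetical.

```python
import time
from typing import Callable

Reviser = Callable[[str, str, list[str]], str]  # (task, previous draft, findings) -> draft
Attacker = Callable[[str], list[str]]           # draft -> list of problems found


def adversarial_loop(
    task: str,
    revise: Reviser,
    attack: Attacker,
    max_rounds: int = 5,        # assumed limit on revision rounds
    max_seconds: float = 60.0,  # assumed wall-clock budget
) -> tuple[str, list[str]]:
    """Revise until the attacker finds nothing, or the budget runs out."""
    deadline = time.monotonic() + max_seconds
    draft = revise(task, "", [])
    findings = attack(draft)
    rounds = 0
    while findings and rounds < max_rounds and time.monotonic() < deadline:
        draft = revise(task, draft, findings)
        findings = attack(draft)
        rounds += 1
    # An empty findings list means the output is stable.
    return draft, findings


# Stub writer (hypothetical): fixes one finding per round.
def stub_writer(task: str, previous: str, findings: list[str]) -> str:
    if not previous:
        return f"{task}: handles the happy path"
    return f"{previous}; handles {findings[0]}"


# Stub attacker (hypothetical): reports edge cases the draft does not mention.
def stub_attacker(draft: str) -> list[str]:
    cases = ["empty input", "negative numbers"]
    return [case for case in cases if case not in draft]


final, open_findings = adversarial_loop("parse_amount", stub_writer, stub_attacker)
partial, remaining = adversarial_loop("parse_amount", stub_writer, stub_attacker, max_rounds=1)
```

With the default budget the loop converges after two revisions. With `max_rounds=1` it stops early and returns the unresolved findings, so the caller can decide whether to ship, retry, or escalate.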

Pattern 6 · Single agent with advisor escalation

One main agent. An advisor (another model) is consulted only when confidence is low or stakes are high.

When it’s worth it: most cases.

Tradeoff:

✅ Simplicity. Capping the advisor at max_uses=1 per interaction keeps cost under control.
✅ Covers 80% of multi-agent’s value with a fraction of the overhead.
❌ Doesn’t cover tasks needing real parallelism.

This is, quite literally, our default pattern.
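
A minimal sketch of the escalation logic, assuming the main agent reports a confidence score with each draft. Here `max_uses` is just a plain function parameter that mirrors the cap mentioned above, not a claim about any particular API; `answer_with_advisor`, `Draft`, the 0.7 threshold, and both stubs are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Draft:
    answer: str
    confidence: float  # self-reported by the main agent, 0.0 to 1.0


def answer_with_advisor(
    request: str,
    agent: Callable[[str, str], Draft],    # (request, advice) -> draft
    advisor: Callable[[str, Draft], str],  # (request, draft) -> advice
    high_stakes: bool = False,
    min_confidence: float = 0.7,  # assumed threshold
    max_uses: int = 1,            # cap on advisor calls per interaction
) -> tuple[Draft, int]:
    """Answer alone when possible; consult the advisor only when needed."""
    draft = agent(request, "")
    uses = 0
    for attempt in range(max_uses):
        forced = high_stakes and attempt == 0
        if not forced and draft.confidence >= min_confidence:
            break
        advice = advisor(request, draft)
        draft = agent(request, advice)
        uses += 1
    return draft, uses


# Stub agent (hypothetical): unsure about refunds unless it has advice.
def stub_agent(request: str, advice: str) -> Draft:
    if advice:
        return Draft(f"{request} -> answer (revised with: {advice})", 0.85)
    confidence = 0.5 if "refund" in request else 0.9
    return Draft(f"{request} -> answer", confidence)


def stub_advisor(request: str, draft: Draft) -> str:
    return "check the refund policy window"


easy, easy_uses = answer_with_advisor("reset password", stub_agent, stub_advisor)
hard, hard_uses = answer_with_advisor("refund after 45 days", stub_agent, stub_advisor)
```

The easy request never touches the advisor, which is why the average overhead stays well below that of a standing multi-agent setup.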

The anti-pattern: “crew” for everything

Frameworks like CrewAI popularize “delegate tasks to a crew of specialized agents.” In real production, we see:

5 agents consuming 5× the tokens to deliver what 1 would.
A bug in one agent cascades to the others (vector 5 of Prompt Infection Taxonomy).
Debugging becomes a nightmare (which agent erred?).

Heuristic: start with 1 agent. Add a second only when a specific bottleneck demonstrates decomposition would solve it.

Cost matrix (relative estimate)

Pattern	Tokens	Latency	Complexity
Single agent	1×	1×	Low
Single + advisor (rarely invoked)	1.5×	1.5×	Medium
Sequential pipeline	3×	3×	Medium
Specialist routing	1.5-2×	1.5×	High
Debate (3 agents)	3×	2×	High
Hierarchical	2-5×	2-3×	Very high
Adversarial loop	2-10×	2-5×	Very high
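
To turn the matrix into rough numbers for a specific workload, multiply a measured single-agent baseline by the ranges above. The sketch below copies the multipliers from the table; the baseline of 4,000 tokens and 6 seconds is a made-up example, and `estimate` is a hypothetical helper.

```python
# (tokens low, tokens high), (latency low, latency high), relative to one agent.
MULTIPLIERS = {
    "single agent": ((1.0, 1.0), (1.0, 1.0)),
    "single + advisor": ((1.5, 1.5), (1.5, 1.5)),
    "sequential pipeline": ((3.0, 3.0), (3.0, 3.0)),
    "specialist routing": ((1.5, 2.0), (1.5, 1.5)),
    "debate (3 agents)": ((3.0, 3.0), (2.0, 2.0)),
    "hierarchical": ((2.0, 5.0), (2.0, 3.0)),
    "adversarial loop": ((2.0, 10.0), (2.0, 5.0)),
}


def estimate(pattern: str, base_tokens: int, base_latency_s: float) -> dict:
    """Scale a single-agent baseline by the pattern's relative multipliers."""
    (tok_lo, tok_hi), (lat_lo, lat_hi) = MULTIPLIERS[pattern]
    return {
        "tokens": (round(base_tokens * tok_lo), round(base_tokens * tok_hi)),
        "latency_s": (base_latency_s * lat_lo, base_latency_s * lat_hi),
    }


# Hypothetical baseline: 4,000 tokens and 6 seconds for the single agent.
hierarchical = estimate("hierarchical", base_tokens=4000, base_latency_s=6.0)
# hierarchical == {"tokens": (8000, 20000), "latency_s": (12.0, 18.0)}
```

These are relative estimates, so the output is only as good as the baseline measurement and should be treated as a planning range rather than a forecast.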

Where to go deeper

For the delegation framework (which task to delegate to which agent): Agent Trust Stack. For safety in multi-agent (cross-agent propagation): Prompt Infection Taxonomy.