
Multi-agent orchestration patterns: 6 patterns with tradeoffs

Six multi-agent orchestration patterns with explicit tradeoffs. When it's worth it, when it's not, and how to avoid the 'crew' anti-pattern where 5 agents do the work of 1.

Multi-agent is oversold. In roughly 80% of the cases we see, a well-designed single agent would beat a three-agent “crew.” But there are genuine cases where multi-agent pays off. Here are six patterns, with tradeoffs.

Pattern 1 · Sequential pipeline

Agent A → output → Agent B → output → Agent C.

When it’s worth it: tasks with distinct phases, where each phase benefits from a different model. Example: researcher (Sonnet) → writer (Opus) → reviewer (Haiku, for a quick check).

Tradeoff:

  • ✅ Each phase runs on the model best suited to it.
  • ✅ High auditability (each output is a checkpoint).
  • ❌ Latency adds up. 3 sequential agents take roughly 3× the latency of 1.
  • ❌ Errors propagate. A failure in A becomes garbage input for B and C.

Mitigation: validate the output between stages, and retry a stage that fails validation.
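
The mitigation can be sketched in a few lines. This is a minimal illustration, not a reference implementation: `run_pipeline`, `StageFailed`, the `(name, agent, validator)` stage shape, and the retry count are all assumptions made for the example, and each `agent` is a plain function standing in for a model call.

```python
from typing import Callable, List, Tuple

# A stage is (name, agent, validator). All three are hypothetical stand-ins:
# `agent` would wrap a model call, `validator` checks that stage's output.
Stage = Tuple[str, Callable[[str], str], Callable[[str], bool]]


class StageFailed(Exception):
    """Raised when a stage keeps producing invalid output."""


def run_pipeline(stages: List[Stage], task: str, max_retries: int = 2) -> str:
    data = task
    for name, agent, is_valid in stages:
        for _attempt in range(max_retries + 1):
            output = agent(data)
            if is_valid(output):
                data = output  # checkpoint: only validated output moves on
                break
        else:
            # Stop here instead of feeding garbage to the next stage.
            raise StageFailed(f"stage {name!r} failed after {max_retries + 1} attempts")
    return data


# Toy usage: a flaky "researcher" that returns nothing on its first call.
calls = {"n": 0}


def flaky_researcher(text: str) -> str:
    calls["n"] += 1
    return "" if calls["n"] == 1 else f"notes({text})"


stages: List[Stage] = [
    ("research", flaky_researcher, lambda out: bool(out)),
    ("write", lambda notes: f"draft[{notes}]", lambda out: out.startswith("draft")),
]
result = run_pipeline(stages, "topic")
```

The validator is the important design choice: a cheap deterministic check (schema, length, required fields) catches many failures without adding another model call to the latency budget.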

Pattern 2 · Specialist routing (router + experts)

A router agent classifies the request and routes it to the appropriate expert (legal, financial, support). Each expert is specialized.

When it’s worth it: domains where real specialization exists and the router classifies with high confidence.

Tradeoff:

  • ✅ Each expert can have a short, focused system prompt.
  • ✅ Cost-efficient (a simple expert can use Haiku).
  • ❌ A router that misclassifies breaks the whole system: the answer comes from the wrong expert.
  • ❌ Cross-domain requests (for example, one touching both legal and financial) have no single correct destination.

Mitigation: give the router a confidence threshold; when confidence is low, escalate to multiple experts or to a human.
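
A minimal sketch of threshold-based routing follows. The `route` function, the `(domain, confidence)` classifier contract, and the 0.8 threshold are illustrative assumptions; in a real system the classifier would be a model call and the threshold would be tuned on labeled traffic.

```python
from typing import Callable, Dict, Tuple

# Hypothetical classifier contract: returns (domain, confidence in [0, 1]).
Classifier = Callable[[str], Tuple[str, float]]
Expert = Callable[[str], str]


def route(request: str, classify: Classifier, experts: Dict[str, Expert],
          escalate: Expert, threshold: float = 0.8) -> str:
    domain, confidence = classify(request)
    # Low confidence or unknown domain: do not guess, escalate instead.
    if confidence < threshold or domain not in experts:
        return escalate(request)
    return experts[domain](request)


# Toy stand-in for the router model: keyword matching with a fake confidence.
def toy_classify(request: str) -> Tuple[str, float]:
    hits = [d for d in ("legal", "financial") if d in request]
    if len(hits) == 1:
        return hits[0], 0.95
    # Cross-domain or no match: report low confidence.
    return (hits[0] if hits else "unknown"), 0.4


experts: Dict[str, Expert] = {
    "legal": lambda r: "legal-expert",
    "financial": lambda r: "financial-expert",
}


def escalate(request: str) -> str:
    return "human-review"


single = route("legal question", toy_classify, experts, escalate)
cross = route("legal and financial question", toy_classify, experts, escalate)
```

Here the cross-domain request falls below the threshold and goes to the escalation path instead of being forced onto a single expert.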

Pattern 3 · Debate / dialectical

Two agents argue opposing positions; a third decides.

When it’s worth it: decisions with sensitive tradeoffs (approving or rejecting a complex request, choosing between alternatives).

Tradeoff:

  • ✅ Captures nuance a single agent skips.
  • ❌ Very expensive: about 3× the cost, plus higher latency.
  • ❌ Risk of “theatrical debate”: agents generate arguments without real adversarial pressure.

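Used in: research more than products. Anthropic’s Constitutional AI training uses a related critique-and-revise loop (a model critiquing its own output rather than a two-sided debate); the pattern is rare in direct commercial production.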
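
The shape of the pattern, two advocates plus a judge, can be sketched as below. Everything here is an assumption for illustration: the `debate` function, the `Advocate` and `Judge` signatures, and the toy judge that simply prefers the longer argument. Each callable stands in for one model call, which is where the roughly 3× cost comes from.

```python
from typing import Callable, Dict, Sequence

Advocate = Callable[[str, str], str]          # (question, stance) -> argument
Judge = Callable[[str, Dict[str, str]], str]  # (question, arguments) -> chosen stance


def debate(question: str, advocate: Advocate, judge: Judge,
           stances: Sequence[str] = ("approve", "reject")) -> str:
    # Each advocate call is assigned its stance up front, so both sides get argued.
    arguments = {stance: advocate(question, stance) for stance in stances}
    verdict = judge(question, arguments)
    if verdict not in arguments:
        raise ValueError(f"judge returned unknown stance: {verdict!r}")
    return verdict


# Toy stand-ins: canned arguments, and a judge that picks the longest one.
def toy_advocate(question: str, stance: str) -> str:
    reasons = {"approve": ["fast"], "reject": ["risky", "costly"]}[stance]
    return "; ".join(reasons)


def toy_judge(question: str, arguments: Dict[str, str]) -> str:
    return max(arguments, key=lambda stance: len(arguments[stance]))


verdict = debate("ship the change?", toy_advocate, toy_judge)
```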

Pattern 4 · Hierarchical (manager + workers)

A manager agent breaks the task into subtasks, distributes them to workers, and aggregates the results.

When it’s worth it: tasks that decompose well (research across N parallel topics, code across N modules).

Tradeoff:

  • ✅ Real parallelism.
  • ❌ Coordination costs. The manager needs a large context window.
  • ❌ When subtasks are interdependent, the manager becomes the bottleneck.
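
A minimal fan-out/fan-in sketch, assuming the subtasks are independent. The `manage` function and its `split`, `worker`, and `aggregate` parameters are hypothetical names for this example; a thread pool is used on the assumption that workers are I/O-bound model calls.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def manage(task: str,
           split: Callable[[str], List[str]],
           worker: Callable[[str], str],
           aggregate: Callable[[List[str]], str],
           max_workers: int = 4) -> str:
    subtasks = split(task)
    # Independent subtasks run in parallel; map() returns results in subtask order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(worker, subtasks))
    return aggregate(results)


# Toy usage: plain functions stand in for the manager's and workers' model calls.
report = manage(
    "a,b,c",
    split=lambda t: t.split(","),
    worker=lambda topic: f"summary({topic})",
    aggregate=lambda parts: " | ".join(parts),
)
```

If one worker needs another worker's output, this shape no longer applies: the dependency has to flow back through the manager, which is the bottleneck described above.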

Pattern 5 · Adversarial / red team

One agent generates, another tries to break the output. The loop iterates until the output is stable.

When it’s worth it: producing robust output (code with edge cases covered, text without legal ambiguity).

Tradeoff:

  • ✅ Final output is much better than a single agent’s.
  • ❌ Can iterate forever if the design is bad. A time-box is mandatory.
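
The time-box can be as simple as a round limit. The sketch below is an illustration under assumptions: `adversarial_loop`, its `generate` and `attack` signatures, and the default of 3 rounds are invented for the example, and both agents are plain functions standing in for model calls.

```python
from typing import Callable, List, Optional, Tuple

Generate = Callable[[str, Optional[str], List[str]], str]  # (task, previous, findings) -> candidate
Attack = Callable[[str], List[str]]                         # candidate -> list of problems found


def adversarial_loop(task: str, generate: Generate, attack: Attack,
                     max_rounds: int = 3) -> Tuple[str, bool]:
    """Return (candidate, stable). stable is False when the time-box ran out."""
    candidate = generate(task, None, [])
    for _ in range(max_rounds):
        findings = attack(candidate)
        if not findings:
            return candidate, True  # red team found nothing: stable
        candidate = generate(task, candidate, findings)
    # Time-box exhausted: the last revision has not been re-attacked.
    return candidate, False


# Toy stand-ins: the attacker reports one missing edge case per round.
REQUIRED = ["empty input", "negative number"]


def toy_generate(task: str, previous: Optional[str], findings: List[str]) -> str:
    base = previous if previous is not None else f"solution({task})"
    return base + "".join(f" +handles:{f}" for f in findings)


def toy_attack(candidate: str) -> List[str]:
    missing = [c for c in REQUIRED if f"handles:{c}" not in candidate]
    return missing[:1]


final, stable = adversarial_loop("parse", toy_generate, toy_attack)
cut_short, cut_stable = adversarial_loop("parse", toy_generate, toy_attack, max_rounds=1)
```

Returning the `stable` flag rather than raising lets the caller decide what to do with an output that ran out of rounds, for example sending it to human review.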

Pattern 6 · Single agent with advisor escalation

One main agent. An advisor (another model) is consulted when confidence is low or the stakes are high. Iterates until stable.

When it’s worth it: most cases.

Tradeoff:

  • ✅ Simplicity. Capping the advisor at max_uses=1 per interaction keeps cost under control.
  • ✅ Covers 80% of multi-agent’s value with a fraction of the overhead.
  • ❌ Doesn’t cover tasks needing real parallelism.

This is literally our default pattern.
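
A minimal sketch of the escalation logic, under stated assumptions: the `AdvisedAgent` class, the self-reported confidence score, and the 0.7 threshold are hypothetical, and `max_uses` here is an ordinary field that mirrors the cap mentioned above rather than any specific library's parameter.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class AdvisedAgent:
    # Both callables are hypothetical stand-ins for model calls.
    main: Callable[[str, str], Tuple[str, float]]  # (request, advice) -> (answer, confidence)
    advisor: Callable[[str, str], str]             # (request, draft) -> advice
    threshold: float = 0.7                          # assumed confidence cutoff
    max_uses: int = 1                               # advisor calls allowed per interaction

    def answer(self, request: str, high_stakes: bool = False) -> Tuple[str, int]:
        draft, confidence = self.main(request, "")
        uses = 0
        # Escalate only when needed, and never beyond the per-interaction cap.
        while (confidence < self.threshold or high_stakes) and uses < self.max_uses:
            advice = self.advisor(request, draft)
            uses += 1
            draft, confidence = self.main(request, advice)
        return draft, uses


# Toy stand-ins: the main agent is confident only on "easy" requests.
def toy_main(request: str, advice: str) -> Tuple[str, float]:
    if advice:
        return f"{request}: revised with {advice}", 0.9
    confidence = 0.9 if "easy" in request else 0.4
    return f"{request}: first draft", confidence


def toy_advisor(request: str, draft: str) -> str:
    return "advisor notes"


agent = AdvisedAgent(main=toy_main, advisor=toy_advisor)
easy = agent.answer("easy question")
hard = agent.answer("hard question")
```

In the toy run, the easy request never touches the advisor and the hard one consults it exactly once, which is what keeps the token overhead close to a single agent's.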

The anti-pattern: “crew” for everything

Frameworks like CrewAI popularize “delegate tasks to a crew of specialized agents.” In real production, we see:

  • 5 agents consuming 5× the tokens to deliver what 1 would.
  • A bug in one agent cascades to the others (vector 5 of the Prompt Infection Taxonomy).
  • Debugging becomes a nightmare (which agent made the error?).

Heuristic: start with 1 agent. Add a second only when you hit a specific bottleneck that decomposition demonstrably solves.

Cost matrix (relative estimate)

Pattern                  Tokens   Latency  Complexity
Single agent             1×       1×       Low
Single + advisor (rare)  1.5×     1.5×     Medium
Sequential pipeline      -        -        Medium
Specialist routing       1.5-2×   1.5×     High
Debate (3 agents)        -        -        High
Hierarchical             2-5×     2-3×     Very high
Adversarial loop         2-10×    2-5×     Very high

Where to go deeper

For the delegation framework (which task to delegate to which agent), see Agent Trust Stack. For safety in multi-agent systems (cross-agent propagation), see Prompt Infection Taxonomy.