Agent Trust Stack: five dimensions for deciding which task to delegate to which agent
Reversibility · Blast radius · Auditability · Cost · Time
The Agent Trust Stack is Automation Labs' five-dimension framework that structures the delegation decision for AI agents: when to trust, when to require confirmation, when to block.
The five dimensions
1 · Reversibility (“can it be undone?”). Scale 0 to 3: fully reversible (running a SELECT query, reading a file), reversible at low cost (creating a local branch, generating a draft), reversible at high cost (creating a public issue, posting to a community), irreversible (prod deploy, bank transfer, bulk send). Low reversibility (a high score) always requires a durable pause.
2 · Blast radius (“who is affected?”). Scale 0 to 3: agent only (isolated sandbox), agent + local environment (worktree), agent + shared system (repo, staging DB), agent + real users (production, community, finance). High blast radius + low reversibility is the mandatory durable-pause zone.
3 · Auditability (“can the event be reconstructed?”). Scale 0 to 3: trace + verifiable provenance (signature, hash, corpus link), structured trace (input, output, tool calls, timing), partial trace (LLM call log only), no trace. Tasks with low auditability (a high score) should be blocked in regulated environments (LGPD, SOC 2).
4 · Cost (“how much to run?”). Scale 0 to 3: trivial (<$0.01), normal ($0.01-$1), expensive ($1-$100), high risk ($100+). High cost without confidence gating is a recipe for runaway loops.
5 · Time (“how long can the agent run?”). Scale 0 to 3: instant (<10s), normal (10s-2min), long (2-30min), background (30min+). High time requires periodic checkpoint or advisor consult.
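The five dimensions can be sketched as a simple score card. A minimal sketch: the TaskScore name and the field encoding are illustrative assumptions, with every dimension scored 0 (safest) to 3 (riskiest) so that the sums line up with the thresholds in "How to apply":

```python
from dataclasses import dataclass

@dataclass
class TaskScore:
    # Each dimension scored 0 (safest) to 3 (riskiest).
    reversibility: int  # 0 = fully reversible … 3 = irreversible
    blast_radius: int   # 0 = agent only … 3 = agent + real users
    auditability: int   # 0 = verifiable provenance … 3 = no trace
    cost: int           # 0 = trivial (<$0.01) … 3 = $100+
    time: int           # 0 = instant (<10s) … 3 = background (30min+)

    def total(self) -> int:
        """Sum the five dimensions into a single 0-15 trust score."""
        return (self.reversibility + self.blast_radius
                + self.auditability + self.cost + self.time)

print(TaskScore(3, 3, 2, 2, 2).total())  # → 12
```

Encoding the scale consistently (higher always riskier) is what makes the five numbers safe to add up into one score.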
How to apply
For each task type an agent will receive, score it on each dimension and sum the five scores. Total 0-5: full autonomy. Total 6-10: autonomy with a durable pause on specific actions. Total 11-15: block; requires a human in the loop.
The common error is running every task under the same agent with the same policy. The Trust Stack forces decomposition: the same Claude Code agent that has autonomy for git status should have a durable pause on git push --force and a block on rm -rf. The policy applies to the task, not the agent.
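The thresholds above can be sketched as a small policy function. This is a minimal sketch: the function name, the example tasks, and their per-dimension scores are illustrative assumptions, with each dimension scored 0 (safest) to 3 (riskiest):

```python
def policy(total: int) -> str:
    """Map a summed 0-15 trust score to a delegation policy."""
    if total <= 5:
        return "full_autonomy"
    if total <= 10:
        return "durable_pause"  # autonomy, but pause on specific actions
    return "block"              # human in the loop required

# Hypothetical per-task sums (reversibility + blast radius +
# auditability + cost + time):
select_query   = 0 + 0 + 0 + 0 + 0   # read-only, instant, fully traceable
staging_deploy = 2 + 2 + 1 + 1 + 2   # shared system, reversible at some cost
prod_bulk_send = 3 + 3 + 2 + 2 + 2   # irreversible, hits real users

print(policy(select_query))    # → full_autonomy
print(policy(staging_deploy))  # → durable_pause
print(policy(prod_bulk_send))  # → block
```

Keeping the policy a pure function of the score makes the delegation decision auditable in itself: the same sum always yields the same answer.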
Use cases applying Agent Trust Stack
- OpenClaw — a gateway applying the Agent Trust Stack at the channel level: a WhatsApp message has lower reversibility than an internal ticket, so the autonomy threshold rises.
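The channel-level idea can be sketched as a per-channel reversibility floor. This is a sketch under stated assumptions: the channel names, values, and function are illustrative, not OpenClaw configuration:

```python
# Per-channel minimum reversibility score (0 = safest … 3 = riskiest).
# Illustrative values; not OpenClaw's actual configuration.
CHANNEL_REVERSIBILITY_FLOOR = {
    "whatsapp": 3,         # a sent message cannot be unsent
    "internal_ticket": 1,  # tickets can be edited or closed cheaply
}

def effective_reversibility(task_score: int, channel: str) -> int:
    """Raise a task's reversibility score to the channel's floor."""
    return max(task_score, CHANNEL_REVERSIBILITY_FLOOR.get(channel, 0))

print(effective_reversibility(1, "whatsapp"))         # → 3
print(effective_reversibility(1, "internal_ticket"))  # → 1
```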
Related posts
- Agent Trust Stack: when to trust which agent with what task
- Harness Stack — the Trust Stack assumes a harness is in place; without one, all dimensions degrade.
- Prompt Infection Taxonomy — vectors that reduce Auditability even with a well-built harness.
When to use
- Deciding autonomy policy for a Claude Code, Cowork, or n8n agent.
- Incident audit when an agent did something it shouldn't have — which dimension was ignored?
- Brief for an AI engineer on safe delegation in a multi-agent system.
When NOT to use
- Purely conversational chatbot without real-world effects — unnecessary overhead.
- Decision about which model to use (Claude vs Copilot vs Gemini) — that's a different decision, use AI Agency Ladder.