Agent Trust Stack: five dimensions for deciding which task to delegate to which agent
Reversibility · Blast radius · Auditability · Cost · Time
The Agent Trust Stack is Automation Labs' five-dimension framework that structures the delegation decision for AI agents: when to trust, when to require confirmation, when to block.
The five dimensions
1 · Reversibility (“can it be undone?”). Scale 0 to 3: fully reversible (running a SELECT query, reading a file), reversible at low cost (creating a local branch, generating a draft), reversible at high cost (creating a public issue, posting to a community), irreversible (prod deploy, bank transfer, bulk send). Low reversibility (a high score) always requires a durable pause.
2 · Blast radius (“who is affected?”). Scale 0 to 3: agent only (isolated sandbox), agent + local environment (worktree), agent + shared system (repo, staging DB), agent + real users (production, community, finance). High blast radius + low reversibility is the mandatory durable-pause zone.
3 · Auditability (“can the event be reconstructed?”). Scale 0 to 3: trace + verifiable provenance (signature, hash, corpus link), structured trace (input, output, tool calls, timing), partial trace (LLM call log only), no trace. Tasks with low auditability (a high score) should be blocked in regulated environments (LGPD, SOC 2).
4 · Cost (“how much to run?”). Scale 0 to 3: trivial (<$0.01), normal ($0.01-$1), expensive ($1-$100), high risk ($100+). High cost without confidence gating is a recipe for runaway loops.
5 · Time (“how long can the agent run?”). Scale 0 to 3: instant (<10s), normal (10s-2min), long (2-30min), background (30min+). High time requires periodic checkpoint or advisor consult.
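The five dimensions can be sketched as a simple score card. A minimal sketch: the TaskScore name and the field encoding are illustrative assumptions, with every dimension scored 0 (safest) to 3 (riskiest) so that the sums line up with the thresholds in "How to apply":

```python
from dataclasses import dataclass

@dataclass
class TaskScore:
    # Each dimension scored 0 (safest) to 3 (riskiest).
    reversibility: int  # 0 = fully reversible … 3 = irreversible
    blast_radius: int   # 0 = agent only … 3 = agent + real users
    auditability: int   # 0 = verifiable provenance … 3 = no trace
    cost: int           # 0 = trivial (<$0.01) … 3 = $100+
    time: int           # 0 = instant (<10s) … 3 = background (30min+)

    def total(self) -> int:
        """Sum the five dimensions into a single 0-15 trust score."""
        return (self.reversibility + self.blast_radius
                + self.auditability + self.cost + self.time)

print(TaskScore(3, 3, 2, 2, 2).total())  # → 12
```

Encoding the scale consistently (higher always riskier) is what makes the five numbers safe to add up into one score.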
How to apply
For each task type an agent will receive, score it on each dimension and sum the five scores. Total 0-5: full autonomy. Total 6-10: autonomy with a durable pause on specific actions. Total 11-15: block; requires a human in the loop.
The common error is running every task under the same agent with the same policy. The Trust Stack forces decomposition: the same Claude Code agent that has autonomy for git status should have a durable pause on git push --force and a block on rm -rf. The policy applies to the task, not the agent.
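The thresholds above can be sketched as a small policy function. This is a minimal sketch: the function name, the example tasks, and their per-dimension scores are illustrative assumptions, with each dimension scored 0 (safest) to 3 (riskiest):

```python
def policy(total: int) -> str:
    """Map a summed 0-15 trust score to a delegation policy."""
    if total <= 5:
        return "full_autonomy"
    if total <= 10:
        return "durable_pause"  # autonomy, but pause on specific actions
    return "block"              # human in the loop required

# Hypothetical per-task sums (reversibility + blast radius +
# auditability + cost + time):
select_query   = 0 + 0 + 0 + 0 + 0   # read-only, instant, fully traceable
staging_deploy = 2 + 2 + 1 + 1 + 2   # shared system, reversible at some cost
prod_bulk_send = 3 + 3 + 2 + 2 + 2   # irreversible, hits real users

print(policy(select_query))    # → full_autonomy
print(policy(staging_deploy))  # → durable_pause
print(policy(prod_bulk_send))  # → block
```

Keeping the policy a pure function of the score makes the delegation decision auditable in itself: the same sum always yields the same answer.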
Use cases applying Agent Trust Stack
- OpenClaw — a gateway applying the Agent Trust Stack at the channel level: a WhatsApp message has lower reversibility than an internal ticket, so the autonomy threshold rises.
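The channel-level idea can be sketched as a per-channel reversibility floor. This is a sketch under stated assumptions: the channel names, values, and function are illustrative, not OpenClaw configuration:

```python
# Per-channel minimum reversibility score (0 = safest … 3 = riskiest).
# Illustrative values; not OpenClaw's actual configuration.
CHANNEL_REVERSIBILITY_FLOOR = {
    "whatsapp": 3,         # a sent message cannot be unsent
    "internal_ticket": 1,  # tickets can be edited or closed cheaply
}

def effective_reversibility(task_score: int, channel: str) -> int:
    """Raise a task's reversibility score to the channel's floor."""
    return max(task_score, CHANNEL_REVERSIBILITY_FLOOR.get(channel, 0))

print(effective_reversibility(1, "whatsapp"))         # → 3
print(effective_reversibility(1, "internal_ticket"))  # → 1
```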
Related posts
- Agent Trust Stack: when to trust which agent with what task
- Harness Stack — the Trust Stack assumes a harness is in place; without one, all dimensions degrade.
- Prompt Infection Taxonomy — vectors that reduce Auditability even with a well-built harness.
When to use
- Deciding autonomy policy for a Claude Code, Cowork, or n8n agent.
- Incident audit when an agent did something it shouldn't have — which dimension was ignored?
- Brief for an AI engineer on safe delegation in a multi-agent system.
When NOT to use
- Purely conversational chatbot without real-world effects — unnecessary overhead.
- Decision about which model to use (Claude vs Copilot vs Gemini) — that's a different decision, use AI Agency Ladder.