Agent Trust Stack — Policy file in YAML
How to translate the Agent Trust Stack into a versioned YAML file: the complete schema plus 3 real examples (HR scoring, financial transaction, content generation), in a form the agent can read at runtime.
Why a policy file in YAML
The Agent Trust Stack defines 5 dimensions for deciding what to delegate to AI: Reversibility, Blast radius, Auditability, Cost, Time sensitivity. The framework exists as text. But for companies operating agents at scale, text doesn’t scale: nothing guarantees that every agent applies a written document the same way.
Solution: encode the policy in a YAML file that is versioned in git and read by the agent at runtime. Advantages:
- Versioning: every change goes through a PR and review.
- Auditability: git keeps the complete history of policy changes.
- Reuse: the same policy applies to multiple agents.
- Compliance-as-code: legal and the DPO can review the YAML policy directly.
This post presents the complete schema and 3 real examples.
Complete schema
# policy.yaml
version: "1.0"
metadata:
  name: "Production agent policy"
  owner: "ivan@skillab.co"
  updated: "2026-05-15"
  policy_id: "POL-2026-001"
actions:
  - name: <action_name>
    description: <one-line description>
    trust_dimensions:
      reversibility: <high|medium|low>        # each dimension is scored 1-3
      blast_radius: <personal|internal|external>
      auditability: <full|partial|none>
      cost_of_error: <low|medium|high>        # USD ranges
      time_sensitivity: <not-sensitive|sensitive|critical>
    total_score: <auto-computed>              # sum of the 5 dimensions (5-15)
    autonomy_level: <auto-computed>           # 1=autonomous + log, 4=don't delegate
    required_gates:
      - <gate1>
      - <gate2>
    review_required: <true|false>
    reviewer_role: <role|null>
    audit_retention: <duration>
    notes: <free-form>
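As a minimal sketch, the enumerated values in the schema above could be checked when the file is loaded, so a typo such as `reversibility: hihg` fails fast. The helper name `validateTrustDimensions` and the error-message format are illustrative assumptions, not part of the framework.

```typescript
// Allowed values, copied from the schema above.
const ALLOWED = {
  reversibility: ["high", "medium", "low"],
  blast_radius: ["personal", "internal", "external"],
  auditability: ["full", "partial", "none"],
  cost_of_error: ["low", "medium", "high"],
  time_sensitivity: ["not-sensitive", "sensitive", "critical"],
} as const;

type Dimension = keyof typeof ALLOWED;

// Hypothetical helper: returns a list of problems, empty when the block is valid.
function validateTrustDimensions(dims: Record<string, unknown>): string[] {
  const errors: string[] = [];
  for (const key of Object.keys(ALLOWED) as Dimension[]) {
    const value = dims[key];
    const allowed: readonly string[] = ALLOWED[key];
    if (typeof value !== "string" || allowed.indexOf(value) === -1) {
      errors.push(`${key}: expected one of ${allowed.join("|")}, got ${String(value)}`);
    }
  }
  return errors;
}
```

Returning a list instead of throwing on the first problem lets a reviewer see every invalid field of an action in one pass.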
Computed fields
total_score is the sum of the 5 dimensions (each scored 1-3 by severity), so it ranges from 5 to 15.
autonomy_level derives from total_score:
- 5-7: Level 1 (autonomous + log)
- 8-10: Level 2 (assistive, human approves)
- 11-13: Level 3 (auxiliary, human decides)
- 14-15: Level 4 (don’t delegate)
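The two computed fields can be sketched as small pure functions. The post does not spell out which label maps to which 1-3 score, so this sketch assumes the scores have already been assigned and takes them as numbers; the function names are illustrative.

```typescript
// Sum of the five dimension scores, each already mapped to 1-3.
function totalScore(scores: number[]): number {
  const valid =
    scores.length === 5 &&
    scores.every((s) => Number.isInteger(s) && s >= 1 && s <= 3);
  if (!valid) {
    throw new RangeError("expected five integer scores between 1 and 3");
  }
  return scores.reduce((sum, s) => sum + s, 0);
}

// Bands taken from the list above: 5-7, 8-10, 11-13, 14-15.
function autonomyLevel(total: number): 1 | 2 | 3 | 4 {
  if (!Number.isInteger(total) || total < 5 || total > 15) {
    throw new RangeError("total_score must be an integer between 5 and 15");
  }
  if (total <= 7) return 1;
  if (total <= 10) return 2;
  if (total <= 13) return 3;
  return 4;
}
```

Computing these in code, rather than typing them by hand, keeps total_score and autonomy_level from drifting away from the dimensions they summarize.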
Required gates
List of specific gates the agent needs to satisfy:
- schema_validation: validate against the schema (always)
- dry_run: generate a preview before executing
- idempotency_check: verify the action is not a duplicate
- human_approval: human-in-the-loop
- sandbox: execute in an isolated sandbox
- confidence_threshold:<N>: execute only if confidence > N
- audit_log: complete log + retention
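Because one gate carries a parameter (`confidence_threshold:<N>`), the agent has to split gate strings before applying them. A minimal sketch, assuming the parameter is always numeric; `GateSpec`, `parseGate` and `passesConfidence` are hypothetical names.

```typescript
interface GateSpec {
  name: string;
  param?: number; // only set for parameterized gates such as confidence_threshold
}

// Splits "confidence_threshold:0.7" into { name, param }; plain gates have no param.
function parseGate(raw: string): GateSpec {
  const idx = raw.indexOf(":");
  if (idx === -1) return { name: raw.trim() };
  const name = raw.slice(0, idx).trim();
  const rawParam = raw.slice(idx + 1).trim();
  const param = Number(rawParam);
  if (rawParam === "" || Number.isNaN(param)) {
    throw new Error(`Gate ${name} has a missing or non-numeric parameter`);
  }
  return { name, param };
}

// The gate passes only when confidence is strictly above the threshold.
function passesConfidence(gate: GateSpec, confidence: number): boolean {
  if (gate.name !== "confidence_threshold" || gate.param === undefined) {
    throw new Error("not a parameterized confidence_threshold gate");
  }
  return confidence > gate.param;
}
```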
Example 1 — HR Scoring (high-risk)
actions:
  - name: candidate_score
    description: "AI generates 0-100 fit score for job candidate"
    trust_dimensions:
      reversibility: high         # candidate can be re-reviewed manually
      blast_radius: external      # affects external person's career
      auditability: full          # legal requirement (GDPR Art. 22, LGPD Art. 20)
      cost_of_error: high         # discrimination lawsuit risk
      time_sensitivity: not-sensitive
    total_score: 12
    autonomy_level: 3             # auxiliary only
    required_gates:
      - schema_validation
      - human_approval
      - audit_log
      - confidence_threshold:0.7
    review_required: true
    reviewer_role: "recruiting_manager"
    audit_retention: "5y"
    notes: |
      GDPR Art. 22 and LGPD Art. 20 require human review on automated
      decisions affecting individuals. Brazilian TST case 2025 (Banco
      do Brasil) reinforced that AI score alone cannot decide hiring.
      Reviewer must add written justification.
Application: when the agent is prompted to “score candidate”, it runs the scoring pipeline. The final decision, however, requires human_approval first: the agent pauses, sends a preview to the recruiter, and waits for approval. The audit log is retained for 5 years.
Example 2 — Financial Transaction (medium-risk reversible)
actions:
  - name: invoice_classification
    description: "AI categorizes invoice into cost center"
    trust_dimensions:
      reversibility: high         # reclassify in monthly close
      blast_radius: internal      # affects company's books
      auditability: full          # accounting audit requirement
      cost_of_error: medium       # tax implication possible
      time_sensitivity: not-sensitive
    total_score: 8
    autonomy_level: 2             # assistive
    required_gates:
      - schema_validation
      - confidence_threshold:0.85
      - audit_log
    review_required: false        # if confidence > 0.85, auto-execute
    reviewer_role: "accountant"   # for low-confidence cases
    audit_retention: "5y"         # tax authority requirement
    notes: |
      Example from a vertical accounting SaaS. High-confidence classifications
      auto-execute with logging. Low-confidence (<0.85) routed to human queue.
      Monthly close review covers ~5% of total volume.
Application: the agent classifies the invoice. If confidence > 0.85, it executes directly and logs the result. Otherwise, the invoice goes to the manual queue for an accountant.
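The routing rule for this example fits in a few lines. This is a sketch only: the route names are hypothetical, and sending a confidence of exactly 0.85 to the human queue is an assumption that follows from the gate's strict `confidence > N` wording.

```typescript
type Route = "auto_execute" | "human_queue";

// Routes a classification by model confidence (0 to 1).
// Assumption: a confidence exactly at the threshold goes to the human queue.
function routeByConfidence(confidence: number, threshold: number = 0.85): Route {
  if (confidence < 0 || confidence > 1) {
    throw new RangeError("confidence must be between 0 and 1");
  }
  return confidence > threshold ? "auto_execute" : "human_queue";
}
```

Keeping the threshold as a parameter means a policy change such as 0.85 → 0.90 needs no code change, only a new value read from policy.yaml.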
Example 3 — Content Generation (low-risk, high volume)
actions:
  - name: marketing_copy_draft
    description: "AI generates first draft of marketing copy"
    trust_dimensions:
      reversibility: high         # discard and regenerate
      blast_radius: personal      # internal team only until published
      auditability: partial       # log prompt + response, not full chain
      cost_of_error: low          # marketing draft, easy to fix
      time_sensitivity: sensitive # campaign deadlines
    total_score: 6
    autonomy_level: 1             # autonomous + sampling
    required_gates:
      - schema_validation
      - audit_log
    review_required: false
    reviewer_role: null
    audit_retention: "1y"
    sampling_review:
      enabled: true
      rate: 0.1                   # 10% randomly sampled for human review
    notes: |
      Drafts go directly to writer. Sampling review catches systematic
      bias or quality drift. Once draft is published, separate workflow
      handles external publication.
Application: the agent generates the draft and writes a summarized log (prompt + response). 10% of outputs are randomly sampled for human review.
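The sampling_review block could be applied with a helper like the one below. It is a sketch under assumptions: `shouldSample` is a hypothetical name, and the random source is injectable purely so the behavior can be tested deterministically.

```typescript
// Decides whether one output is picked for human review.
// `rate` is the sampling_review.rate value from the policy (0.1 = 10%).
function shouldSample(rate: number, random: () => number = Math.random): boolean {
  if (rate < 0 || rate > 1) {
    throw new RangeError("rate must be between 0 and 1");
  }
  // Math.random() returns a value in [0, 1), so rate 0 never samples
  // and rate 1 always samples.
  return random() < rate;
}
```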
How the agent reads the policy
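// Pseudo-code TypeScript: applyGate, executeReal and auditLog are
// application-specific and only declared here.
import yaml from 'js-yaml';
import { readFileSync } from 'fs';

interface PolicyAction {
  name: string;
  autonomy_level: number;
  required_gates: string[];
  audit_retention: string;
}

interface Policy {
  metadata: { policy_id: string };
  actions: PolicyAction[];
}

declare function applyGate(gate: string, args: unknown): Promise<boolean>;
declare function executeReal(actionName: string, args: unknown): Promise<unknown>;
declare function auditLog(entry: Record<string, unknown>): Promise<void>;

// Load the policy once at startup. The cast trusts the file to match the schema.
const policy = yaml.load(readFileSync('policy.yaml', 'utf8')) as Policy;

async function executeAction(actionName: string, args: unknown) {
  // Deny by default: an action missing from the policy never runs
  const policyEntry = policy.actions.find(a => a.name === actionName);
  if (!policyEntry) {
    throw new Error(`Action ${actionName} not in policy`);
  }

  // Apply gates in order; stop at the first failure
  for (const gate of policyEntry.required_gates) {
    const passed = await applyGate(gate, args);
    if (!passed) {
      return { rejected: true, reason: `Gate ${gate} failed` };
    }
  }

  // Check autonomy level: level 3 and above never auto-executes
  if (policyEntry.autonomy_level >= 3) {
    return { pending: true, reason: 'Awaiting human approval' };
  }

  // Execute
  const result = await executeReal(actionName, args);

  // Log per policy
  await auditLog({
    action: actionName,
    args,
    result,
    policy_id: policy.metadata.policy_id,
    retention: policyEntry.audit_retention,
  });

  return { executed: true, result };
}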
Policy versioning
A policy change = PR + DPO review + change log in the commit. Examples:
- “Increase confidence threshold from 0.85 → 0.90 in invoice_classification” → PR.
- “Add action cancel_subscription with autonomy_level=3” → PR.
- “Change audit_retention from 1y → 5y in response to new regulation” → PR + communication.
Every policy version is persisted, so the audit trail can point to the policy version that was applied to a decision on day X.
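One way to answer "which policy applied on day X" is to keep a small history of versions with their effective dates. The sketch below assumes such a history exists; the `PolicyVersion` shape (including the `commit` field) and the function name are hypothetical, and dates are assumed to be ISO 8601 strings (YYYY-MM-DD).

```typescript
interface PolicyVersion {
  version: string;       // the policy file's version field
  commit: string;        // git commit that introduced this version
  effectiveFrom: string; // ISO date, e.g. "2026-05-15"
}

// Returns the latest version whose effectiveFrom is on or before `day`,
// or undefined when no version was in force yet.
function policyVersionAt(history: PolicyVersion[], day: string): PolicyVersion | undefined {
  // ISO dates compare correctly as plain strings.
  const applicable = history
    .filter((v) => v.effectiveFrom <= day)
    .sort((a, b) =>
      a.effectiveFrom < b.effectiveFrom ? -1 : a.effectiveFrom > b.effectiveFrom ? 1 : 0
    );
  return applicable[applicable.length - 1];
}
```

Storing the resolved version (or commit) in each audit record at decision time is simpler still; a lookup like this is mainly useful for checking old records after the fact.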
Anti-patterns
- Giant, disorganized policy file. Break it into multiple files by domain.
- Policy edited directly without a PR. This nullifies the versioning gain.
- Gates implemented but not tested. CI must have a test for each gate.
- Policy not enforced in code. It has to run at runtime, not sit in a doc.
FAQ
What if the policy has a bug and blocks legitimate use? Open a PR and roll back. Important: monitor for actions being rejected in unusual volume so that such bugs are detected.
Can I use JSON instead of YAML? Yes. YAML is more readable for non-developers (DPO, legal). JSON is better for programmatic generation. Choose according to your use.
Other languages? Tools such as OPA (Open Policy Agent) with Rego exist. They are more powerful but more complex. Simple YAML covers 90% of cases.
How do I test the policy? Write a test suite of (action, args, expected_outcome) cases and run it in CI on every change.
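A table-driven runner for such cases might look like the sketch below. The outcome names mirror the executed / pending / rejected results of the pseudo-code earlier in the post; the `decide` callback stands in for whatever function evaluates the policy in your system, and every name here is illustrative.

```typescript
type Outcome = "executed" | "pending" | "rejected";

interface PolicyTestCase {
  action: string;
  args: unknown;
  expected: Outcome;
}

// Runs every case through `decide` and returns one message per mismatch.
// An empty result means the policy behaves as the cases expect.
function runPolicyTests(
  cases: PolicyTestCase[],
  decide: (action: string, args: unknown) => Outcome
): string[] {
  const failures: string[] = [];
  for (const c of cases) {
    const actual = decide(c.action, c.args);
    if (actual !== c.expected) {
      failures.push(`${c.action}: expected ${c.expected}, got ${actual}`);
    }
  }
  return failures;
}
```

In CI, a non-empty list of failures would fail the build, which blocks a policy PR that changes behavior nobody intended.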
Next steps
- Encode 1 action of yours in policy.yaml this week. Start with the highest-risk action.
Also read
- Agent Trust Stack — framework hub — canonical framework.
- AI for business: the only decision matrix you need — practical application of Trust Stack.
- Harness Stack — Verification deep dive — verification gates that go in the policy file.
By Ivan Prado · SkilLab AI · May 2026. Translated and adapted from the PT-BR original.