🟠 Builder

Agent Trust Stack — Policy file in YAML

How to translate the Agent Trust Stack into a versioned YAML file: the complete schema plus 3 real examples (HR scoring, financial transaction, content generation), in a form the agent can read at runtime.

May 16, 2026 · 10 min · ai-agency-skills

Why a policy file in YAML

The Agent Trust Stack defines 5 dimensions for deciding what to delegate to AI: Reversibility, Blast radius, Auditability, Cost, Time sensitivity. The framework exists as text. But for companies operating agents at scale, text doesn’t scale.

Solution: encode the policy in a YAML file versioned in git, read by the agent at runtime. Advantages:

Versioning: every change goes through a PR and review.
Auditability: complete history of policy changes.
Reuse: the same policy applies to multiple agents.
Compliance-as-code: legal and the DPO can review the YAML policy directly.

This post presents the complete schema plus 3 real examples.

Complete schema

# policy.yaml
version: "1.0"
metadata:
  name: "Production agent policy"
  owner: "ivan@skillab.co"
  updated: "2026-05-15"
  policy_id: "POL-2026-001"

actions:
  - name: <action_name>
    description: <one-line description>
    trust_dimensions:
      reversibility: <high|medium|low>      # each dimension scores 1-3
      blast_radius: <personal|internal|external>
      auditability: <full|partial|none>
      cost_of_error: <low|medium|high>      # USD ranges
      time_sensitivity: <not-sensitive|sensitive|critical>
    total_score: <auto-computed>
    autonomy_level: <auto-computed>   # 1=fully auto, 4=don't delegate
    required_gates:
      - <gate1>
      - <gate2>
    review_required: <true|false>
    reviewer_role: <role|null>
    audit_retention: <duration>
    notes: <free-form>
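
Before an agent trusts a loaded policy file, the enum fields in this schema can be checked mechanically. The following is a minimal sketch of such a check, assuming the allowed values listed above; `ALLOWED` and `validateTrustDimensions` are hypothetical names, not part of the Trust Stack.

```typescript
// Allowed values per trust dimension, copied from the schema above.
const ALLOWED = {
  reversibility: ["high", "medium", "low"],
  blast_radius: ["personal", "internal", "external"],
  auditability: ["full", "partial", "none"],
  cost_of_error: ["low", "medium", "high"],
  time_sensitivity: ["not-sensitive", "sensitive", "critical"],
} as const;

type Dimension = keyof typeof ALLOWED;

// Hypothetical helper: returns a list of problems, empty when the input is valid.
function validateTrustDimensions(raw: Record<string, unknown>): string[] {
  const errors: string[] = [];
  for (const dim of Object.keys(ALLOWED) as Dimension[]) {
    const value = raw[dim];
    const allowed: readonly string[] = ALLOWED[dim];
    if (typeof value !== "string" || !allowed.includes(value)) {
      errors.push(`${dim}: expected one of ${allowed.join("|")}, got ${String(value)}`);
    }
  }
  return errors;
}
```

Returning a list of errors rather than throwing on the first one lets a reviewer see every problem in a policy PR at once.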

Computed fields

total_score is the sum of the 5 dimensions (each scored 1-3 depending on severity), so it ranges from 5 to 15.

autonomy_level derives from total_score:

5-7: Level 1 (autonomous + log)
8-10: Level 2 (assistive, human approves)
11-13: Level 3 (auxiliary, human decides)
14-15: Level 4 (don’t delegate)
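
The bands above can be sketched as a small function. The post does not spell out which label (high, medium, low, and so on) maps to which score, so this sketch takes the numeric 1-3 scores as input rather than guessing that mapping; `totalScore` and `autonomyLevel` are hypothetical names.

```typescript
type Score = 1 | 2 | 3;

interface DimensionScores {
  reversibility: Score;
  blast_radius: Score;
  auditability: Score;
  cost_of_error: Score;
  time_sensitivity: Score;
}

// Sum of the five per-dimension scores, so the result is always in 5..15.
function totalScore(s: DimensionScores): number {
  return (
    s.reversibility +
    s.blast_radius +
    s.auditability +
    s.cost_of_error +
    s.time_sensitivity
  );
}

// Maps a total score to the autonomy level using the bands listed above.
function autonomyLevel(total: number): 1 | 2 | 3 | 4 {
  if (!Number.isInteger(total) || total < 5 || total > 15) {
    throw new RangeError(`total_score must be an integer in 5..15, got ${total}`);
  }
  if (total <= 7) return 1;
  if (total <= 10) return 2;
  if (total <= 13) return 3;
  return 4;
}
```

Computing both fields in code, instead of typing them by hand into the YAML, keeps total_score and autonomy_level from drifting apart when a dimension changes.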

Required gates

List of specific gates the agent needs to satisfy:

schema_validation: validate the input against the schema (always)
dry_run: generate a preview before executing
idempotency_check: verify the action is not a duplicate
human_approval: human-in-the-loop
sandbox: execute in an isolated sandbox
confidence_threshold:<N>: only execute if confidence > N
audit_log: complete log + retention
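
Because gates are plain strings, and confidence_threshold carries its parameter after a colon, the agent has to parse them before applying them. A minimal sketch of that parsing follows; `parseGate` and `KNOWN_GATES` are hypothetical names, and treating confidence as a number between 0 and 1 is an assumption based on the values used in the examples.

```typescript
// Gate names taken from the list above.
const KNOWN_GATES: readonly string[] = [
  "schema_validation",
  "dry_run",
  "idempotency_check",
  "human_approval",
  "sandbox",
  "confidence_threshold",
  "audit_log",
];

interface ParsedGate {
  name: string;
  threshold?: number; // only set for confidence_threshold
}

// Hypothetical helper: turns "confidence_threshold:0.85" into a structured gate.
function parseGate(spec: string): ParsedGate {
  const parts = spec.split(":");
  const name = parts[0] ?? "";
  const param = parts[1];
  if (!KNOWN_GATES.includes(name) || parts.length > 2) {
    throw new Error(`Unknown or malformed gate: ${spec}`);
  }
  if (name === "confidence_threshold") {
    // Assumption: confidence is expressed on a 0..1 scale.
    const threshold =
      param === undefined || param.trim() === "" ? NaN : Number(param);
    if (Number.isNaN(threshold) || threshold < 0 || threshold > 1) {
      throw new Error(`confidence_threshold needs a number in [0, 1], got: ${spec}`);
    }
    return { name, threshold };
  }
  if (param !== undefined) {
    throw new Error(`Gate ${name} takes no parameter`);
  }
  return { name };
}
```

Rejecting unknown gate names at load time means a typo in the policy file fails loudly instead of silently skipping a check.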

Example 1 — HR Scoring (high-risk)

actions:
  - name: candidate_score
    description: "AI generates 0-100 fit score for job candidate"
    trust_dimensions:
      reversibility: high      # candidate can be re-reviewed manually
      blast_radius: external   # affects external person's career
      auditability: full       # legal requirement (GDPR Art. 22, LGPD Art. 20)
      cost_of_error: high      # discrimination lawsuit risk
      time_sensitivity: not-sensitive
    total_score: 12
    autonomy_level: 3   # auxiliary only
    required_gates:
      - schema_validation
      - human_approval
      - audit_log
      - confidence_threshold:0.7
    review_required: true
    reviewer_role: "recruiting_manager"
    audit_retention: "5y"
    notes: |
      GDPR Art. 22 and LGPD Art. 20 require human review on automated
      decisions affecting individuals. Brazilian TST case 2025 (Banco
      do Brasil) reinforced that AI score alone cannot decide hiring.
      Reviewer must add written justification.

Application: when the agent is prompted to “score candidate”, it runs the scoring pipeline. Acting on the final decision requires human_approval first: the agent pauses, sends a preview to the recruiter, and waits for approval. The audit log is retained for 5 years.

Example 2 — Financial Transaction (medium-risk reversible)

actions:
  - name: invoice_classification
    description: "AI categorizes invoice into cost center"
    trust_dimensions:
      reversibility: high      # reclassify in monthly close
      blast_radius: internal   # affects company's books
      auditability: full       # accounting audit requirement
      cost_of_error: medium    # tax implication possible
      time_sensitivity: not-sensitive
    total_score: 8
    autonomy_level: 2   # assistive
    required_gates:
      - schema_validation
      - confidence_threshold:0.85
      - audit_log
    review_required: false  # if confidence > 0.85, auto-execute
    reviewer_role: "accountant"  # for low-confidence cases
    audit_retention: "5y"  # tax authority requirement
    notes: |
      Example from an accounting vertical SaaS. High-confidence classifications auto-execute
      with logging. Low-confidence (<0.85) routed to human queue.
      Monthly close review covers ~5% of total volume.

Application: the agent classifies the invoice. If confidence is above 0.85, it executes directly and logs the result. Otherwise the invoice goes to the manual queue.
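
The routing rule for this example can be sketched in a few lines. The function name `routeByConfidence` and the route labels are hypothetical; the strict comparison follows the gate definition ("only execute if confidence > N"), so a confidence exactly equal to the threshold goes to the human queue.

```typescript
interface RoutingDecision {
  route: "auto_execute" | "human_queue";
  reviewer_role: string | null; // who picks it up when a human is needed
}

// Hypothetical helper: decides whether a classification runs on its own
// or is handed to the reviewer named in the policy entry.
function routeByConfidence(
  confidence: number,
  threshold: number,
  reviewerRole: string,
): RoutingDecision {
  if (confidence > threshold) {
    return { route: "auto_execute", reviewer_role: null };
  }
  return { route: "human_queue", reviewer_role: reviewerRole };
}
```

For the invoice policy above, `routeByConfidence(0.91, 0.85, "accountant")` would auto-execute, while a confidence of 0.85 or lower would land in the accountant's queue.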

Example 3 — Content Generation (low-risk, high volume)

actions:
  - name: marketing_copy_draft
    description: "AI generates first draft of marketing copy"
    trust_dimensions:
      reversibility: high      # discard and regenerate
      blast_radius: personal   # internal team only until published
      auditability: partial    # log prompt + response, not full chain
      cost_of_error: low       # marketing draft, easy to fix
      time_sensitivity: sensitive  # campaign deadlines
    total_score: 6
    autonomy_level: 1   # autonomous + sampling
    required_gates:
      - schema_validation
      - audit_log
    review_required: false
    reviewer_role: null
    audit_retention: "1y"
    sampling_review:
      enabled: true
      rate: 0.1  # 10% randomly sampled for human review
    notes: |
      Drafts go directly to writer. Sampling review catches systematic
      bias or quality drift. Once draft is published, separate workflow
      handles external publication.

Application: the agent generates the draft and writes a summarized log. A random 10% of outputs go to human review.
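
The sampling_review setting can be applied with a single random draw per output. This is a sketch under that assumption; `shouldSample` is a hypothetical name, and the random source is injectable so the behavior can be tested deterministically.

```typescript
// Hypothetical helper: returns true when this output should be sent to
// human review. `rate` is the sampling_review.rate value from the policy.
function shouldSample(rate: number, random: () => number = Math.random): boolean {
  if (Number.isNaN(rate) || rate < 0 || rate > 1) {
    throw new RangeError(`sampling rate must be in [0, 1], got ${rate}`);
  }
  // Math.random() returns a value in [0, 1), so rate 0 never samples
  // and rate 1 always samples.
  return random() < rate;
}
```

Independent draws give roughly 10% over a large volume, not exactly 1 in every 10 outputs; a counter-based scheme would be needed if an exact quota matters.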

How the agent reads the policy

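// TypeScript sketch. applyGate, executeReal and auditLog are placeholders
// for your own implementations; only their signatures are declared here.
import yaml from 'js-yaml';
import { readFileSync } from 'fs';

// Only the fields this function reads; the full schema has more.
interface PolicyAction {
  name: string;
  autonomy_level: number;
  required_gates: string[];
  audit_retention: string;
}

interface Policy {
  metadata: { policy_id: string };
  actions: PolicyAction[];
}

declare function applyGate(gate: string, args: unknown): Promise<boolean>;
declare function executeReal(actionName: string, args: unknown): Promise<unknown>;
declare function auditLog(entry: Record<string, unknown>): Promise<void>;

// yaml.load returns an untyped value, so the shape is asserted here.
// Validate it against the schema before relying on it.
const policy = yaml.load(readFileSync('policy.yaml', 'utf8')) as Policy;

async function executeAction(actionName: string, args: unknown) {
  // Actions missing from the policy are denied by default
  const policyEntry = policy.actions.find(a => a.name === actionName);
  if (!policyEntry) {
    throw new Error(`Action ${actionName} not in policy`);
  }

  // Apply gates in order; the first failure rejects the action
  for (const gate of policyEntry.required_gates) {
    const passed = await applyGate(gate, args);
    if (!passed) {
      return { rejected: true, reason: `Gate ${gate} failed` };
    }
  }

  // Levels 3 and 4 never auto-execute: hand off to a human
  if (policyEntry.autonomy_level >= 3) {
    return { pending: true, reason: 'Awaiting human approval' };
  }

  // Execute
  const result = await executeReal(actionName, args);

  // Log with the policy ID and the retention period set for this action
  await auditLog({
    action: actionName,
    args,
    result,
    policy_id: policy.metadata.policy_id,
    retention: policyEntry.audit_retention,
  });

  return { executed: true, result };
}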

Policy versioning

Policy change = PR + DPO review + change log in commit. Examples:

“Increase confidence threshold from 0.85 → 0.90 in invoice_classification” → PR.
“Add action cancel_subscription with autonomy_level=3” → PR.
“Change audit_retention from 1y → 5y in response to new regulation” → PR + communication.

Each policy version is persisted, so the audit trail can point to the policy version that was applied to a decision on day X.
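
Answering "which policy applied on day X" needs a record of when each version took effect. The sketch below assumes such a record exists as a list; the `PolicyVersion` shape, its `commit` and `effective_from` fields, and `policyVersionOn` are all hypothetical and not part of the schema above.

```typescript
interface PolicyVersion {
  version: string;        // the `version` field of policy.yaml
  commit: string;         // git commit that introduced it (illustrative)
  effective_from: string; // ISO date, YYYY-MM-DD
}

// Hypothetical helper: returns the latest version whose effective date is
// on or before `day`, or undefined if no version was in force yet.
function policyVersionOn(
  history: PolicyVersion[],
  day: string,
): PolicyVersion | undefined {
  // ISO dates in YYYY-MM-DD form sort correctly as plain strings.
  const sorted = [...history].sort((a, b) =>
    a.effective_from < b.effective_from ? -1 : a.effective_from > b.effective_from ? 1 : 0,
  );
  let match: PolicyVersion | undefined;
  for (const v of sorted) {
    if (v.effective_from <= day) match = v;
  }
  return match;
}
```

Storing the resolved version (or its commit) in each audit log entry at write time avoids having to reconstruct it later.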

Anti-patterns

Giant disorganized policy file. Break it into multiple files by domain.
Policy edited directly without a PR. Nullifies the versioning gain.
Gates implemented but not tested. CI must have a test for each gate.
Policy not enforced in code. It has to run at runtime, not sit in a doc.

FAQ

What if the policy has a bug and blocks legitimate use? Fix it with a PR, or roll back. Important: monitor for actions being rejected in unusual volume, so that such bugs are detected quickly.
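
One simple form of that monitor compares the rejection rate in the current window against a baseline. This is a sketch only: `rejectionSpike` is a hypothetical name, and the default multiplier of 3 and minimum sample of 20 are arbitrary assumptions to tune against your own traffic.

```typescript
interface WindowCounts {
  total: number;    // actions attempted in the window
  rejected: number; // actions rejected by a gate in the window
}

// Hypothetical helper: flags a window whose rejection rate is more than
// `factor` times the baseline rate. Small windows are ignored to avoid
// alerting on a handful of events.
function rejectionSpike(
  current: WindowCounts,
  baselineRate: number,
  factor = 3,
  minTotal = 20,
): boolean {
  if (current.total < minTotal) return false;
  return current.rejected / current.total > baselineRate * factor;
}
```

With a 5% baseline, a window of 100 actions with 30 rejections would be flagged, while one with 10 rejections would not.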

Can I use JSON instead of YAML? Yes. YAML is more readable for non-developers (DPO, legal). JSON is better for programmatic generation. Choose based on your use case.

Other languages? Tools such as OPA (Open Policy Agent) with its Rego language exist. They are more powerful but more complex. Simple YAML covers most cases.

How do I test the policy? Write a test suite of (action, args, expected_outcome) triples. Run it in CI on every change.
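
A table-driven suite of this kind can be sketched as follows. The outcome labels, `runPolicyTests`, and the stand-in `demoEvaluate` function are hypothetical; a real suite would call the same code path the agent uses at runtime instead of the stand-in.

```typescript
type Outcome = "executed" | "pending" | "rejected";

interface PolicyTestCase {
  action: string;
  args: Record<string, unknown>;
  expected_outcome: Outcome;
}

type Evaluate = (action: string, args: Record<string, unknown>) => Outcome;

// Runs every case and returns a description of each mismatch.
function runPolicyTests(cases: PolicyTestCase[], evaluate: Evaluate): string[] {
  const failures: string[] = [];
  for (const c of cases) {
    const actual = evaluate(c.action, c.args);
    if (actual !== c.expected_outcome) {
      failures.push(`${c.action}: expected ${c.expected_outcome}, got ${actual}`);
    }
  }
  return failures;
}

// Stand-in evaluator for illustration only (assumed behavior, loosely
// following the three examples in this post).
const demoEvaluate: Evaluate = (action, args) => {
  if (action === "candidate_score") return "pending";
  if (action === "invoice_classification") {
    const confidence = args["confidence"];
    return typeof confidence === "number" && confidence > 0.85 ? "executed" : "pending";
  }
  return "rejected";
};

const failures = runPolicyTests(
  [
    { action: "candidate_score", args: { candidate_id: "c-1" }, expected_outcome: "pending" },
    { action: "invoice_classification", args: { confidence: 0.95 }, expected_outcome: "executed" },
    { action: "invoice_classification", args: { confidence: 0.6 }, expected_outcome: "pending" },
    { action: "unknown_action", args: {}, expected_outcome: "rejected" },
  ],
  demoEvaluate,
);
```

In CI, a non-empty `failures` list would fail the build, which blocks a policy PR that changes behavior without updating the expected outcomes.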

Next steps

Encode one of your own actions in policy.yaml this week. Start with the highest-risk action.