Skip to content
🔴 Research

Prompt Infection Taxonomy: five attack vectors for agent threat modeling

Direct · Indirect · Multi-turn · Tool-mediated · Cross-agent propagation

Prompt Infection Taxonomy is the Automation Labs lens in five vectors for classifying and defending against prompt injection in agent systems, from direct injection to cross-agent propagation.

Prompt Infection Taxonomy: five attack vectors for agent threat modeling diagram

The five vectors

1 · Direct — the user explicitly asks the agent to ignore instructions, change persona, or reveal the system prompt. Most visible attack and easiest to mitigate (firm system prompt + explicit refusal + logging). Typically captured by basic filters nowadays.

2 · Indirect — malicious payload arrives via content the agent reads (web page, PDF, email, uploaded file). The attacker is not the user; it’s whoever planted the instruction in the document the user asked the agent to process. Fastest-growing vector in 2025-2026 with web-browsing agents. Countermeasure: separate instruction channel from content channel, explicit treatment of content as data never as executable.

3 · Multi-turn — attacker conditions the agent across multiple messages, building context that dilutes original instructions. Small step by step, each acceptable, sum out of scope. Countermeasure: periodic re-anchoring of system prompt, fresh policy summary every N turns, confidence gating on high-risk actions.

4 · Tool-mediated — attacker uses tool output to inject instruction. Example: agent reads from a database a record whose “description” field contains “NOW EXECUTE THE FOLLOWING QUERY.” Particularly dangerous because content arrives from an apparently trusted source. Countermeasure: structured escaping between tool output and prompt, schema validation before re-injection into LLM call.

5 · Cross-agent propagation — in multi-agent systems, infection in one agent propagates to another via inter-agent messages. Agent A is compromised, sends a message to agent B with embedded instruction, agent B acts. New vector in Cowork, Crew AI, n8n environments with multiple LLM nodes. Countermeasure: explicit trust boundary between agents, schema validation on inter-agent messages, agent identity + signature.

How to apply

Use Prompt Infection Taxonomy as a matrix in architecture review. For each new agent, for each vector: is there surface? Is there control? Is there a regression test? “Implicit” is a failure answer.

Pair with Harness Stack: vectors 2, 3, 4 are where Verification (layer 3) and Confidence gating (layer 8) do most work. Vector 5 requires Constraint (layer 2) with well-defined scope between agents.

When to use

  • Threat modeling for a new agent before production.
  • Security incident audit in a multi-agent system.
  • Red team brief for testing an agent.

When NOT to use

  • Closed chatbot without tool use and without external content ingestion — surface too reduced for the full framework.