The 10 categories
1. Prompt Injection. 2. Sensitive Info Disclosure. 3. Supply Chain. 4. Data + Model Poisoning. 5. Improper Output Handling. 6. Excessive Agency. 7. System Prompt Leakage. 8. Vector + Embedding Weaknesses. 9. Misinformation. 10. Unbounded Consumption.
Advertisement
Prompt Injection (LLM01)
Direct + indirect. Most exploited. Attacker input overrides developer instructions.
Advertisement
Sensitive Info Disclosure (LLM02)
PII, credentials, or proprietary data leaked in responses. From training or context.
Excessive Agency (LLM06)
Agent has more tools/permissions than task requires. Compromised agent = larger blast radius.