Attack vector
Summarizer LLM might preserve 'ignore instructions and…' from source. Downstream LLM treats as instruction.
Advertisement
Amplification
Attacker doesn't need direct access to downstream LLM. Attack propagates via data pipeline.
Advertisement
Defenses
Delimit LLM-generated content as untrusted. Filter output of first LLM for instruction-like patterns. Never chain LLMs on untrusted data without intermediate sanitization.
Design
Trust boundaries between LLM stages. Explicit contracts. Downstream never gets raw upstream output — always via filter.