Deduplication

Exact and near-duplicate removal eliminates many attack payloads that rely on repetition to be memorized.
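
As a minimal sketch of the idea, assuming documents are plain strings: exact duplicates are caught by hashing normalized text, and near-duplicates by Jaccard similarity over word shingles. The function names, the 3-word shingle size, and the 0.8 threshold are illustrative assumptions, not values from a specific pipeline. The pairwise comparison is quadratic, so a large corpus would likely need an approximate method such as MinHash with locality-sensitive hashing instead.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants hash identically."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles of the normalized text."""
    words = normalize(text).split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def deduplicate(docs, threshold: float = 0.8):
    """Keep the first copy of each exact or near-duplicate document."""
    seen_hashes = set()
    kept = []  # (document, shingle set) pairs retained so far
    for doc in docs:
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest in seen_hashes:
            continue  # exact duplicate after normalization
        seen_hashes.add(digest)
        sh = shingles(doc)
        # Assumed threshold: at or above it, treat the pair as near-duplicates.
        if any(jaccard(sh, other) >= threshold for _, other in kept):
            continue
        kept.append((doc, sh))
    return [doc for doc, _ in kept]
```

A payload repeated with small edits (a changed word, extra whitespace) collapses to a single copy, which removes the repetition the attack depends on.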

Quality classifier

Filter out low-quality or adversarial-looking content using a model trained on quality signals.
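
The sketch below shows the shape of such a filter: a few quality signals are extracted per document and combined into a score, and documents under a threshold are dropped. The signals, the weights, the bias, and the 0.5 threshold are all hypothetical and hand-set here; they stand in for a trained classifier, which would learn its parameters from labeled data.

```python
import math


def quality_signals(text: str) -> dict:
    """Compute a few simple, illustrative quality signals for one document."""
    words = text.split()
    n_chars = max(len(text), 1)
    n_words = max(len(words), 1)
    return {
        # Share of characters that are letters or whitespace.
        "alpha_frac": sum(c.isalpha() or c.isspace() for c in text) / n_chars,
        # Share of distinct words; low values indicate heavy repetition.
        "unique_frac": len({w.lower() for w in words}) / n_words,
        # Share of uppercase characters; high values suggest shouting or spam.
        "upper_frac": sum(c.isupper() for c in text) / n_chars,
    }


# Hypothetical hand-set parameters standing in for learned ones.
WEIGHTS = {"alpha_frac": 4.0, "unique_frac": 3.0, "upper_frac": -5.0}
BIAS = -5.0


def quality_score(text: str) -> float:
    """Logistic score in (0, 1); higher means the text looks higher quality."""
    signals = quality_signals(text)
    z = BIAS + sum(WEIGHTS[name] * signals[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def filter_corpus(docs, threshold: float = 0.5):
    """Keep only documents scoring at or above the assumed threshold."""
    return [doc for doc in docs if quality_score(doc) >= threshold]
```

In practice the threshold trades recall for precision: a stricter cutoff removes more adversarial text but also more legitimate content.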

Domain reputation

Weight data by source reputation, for example Wikipedia above a random blog. This reduces the impact of an adversary publishing from low-reputation sources.
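
One way to apply this, sketched under assumptions: each document carries a `url` field, a lookup table maps hostnames to a reputation weight, and training examples are drawn in proportion to that weight. The table contents, the default weight for unknown hosts, and the function names are hypothetical.

```python
import random
from urllib.parse import urlparse

# Hypothetical reputation table; real scores would come from a curated list.
REPUTATION = {"en.wikipedia.org": 1.0, "arxiv.org": 0.8}
DEFAULT_REPUTATION = 0.1  # assumed weight for hosts not in the table


def source_weight(url: str) -> float:
    """Look up the sampling weight for the hostname of a document's URL."""
    host = (urlparse(url).hostname or "").lower()
    return REPUTATION.get(host, DEFAULT_REPUTATION)


def sample_by_reputation(docs, k: int, seed: int = 0):
    """Draw k documents with replacement, proportionally to source weight."""
    rng = random.Random(seed)
    weights = [source_weight(doc["url"]) for doc in docs]
    return rng.choices(docs, weights=weights, k=k)
```

With these assumed weights, a document from an unlisted blog is drawn about one tenth as often as a Wikipedia article, so an adversary needs far more low-reputation pages to have the same influence.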

Continuous audit

Sample the dataset and have an LLM review the sample for policy issues, so that adversarial patterns can be discovered.
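
A rough sketch of one audit pass follows. The reviewer is passed in as a callable so that any LLM client can be plugged in; `stub_review` is a keyword-matching placeholder, not a real model call, and all names and the example trigger phrase are assumptions for illustration.

```python
import random
from typing import Callable, Iterable, List, Optional, Tuple


def audit_sample(
    dataset: Iterable[str],
    review: Callable[[str], Optional[str]],
    sample_size: int,
    seed: Optional[int] = None,
) -> List[Tuple[str, str]]:
    """Review a random sample and return (document, issue) pairs that were flagged."""
    rng = random.Random(seed)
    docs = list(dataset)
    sample = rng.sample(docs, min(sample_size, len(docs)))
    findings = []
    for doc in sample:
        issue = review(doc)  # None means the reviewer found no policy issue
        if issue is not None:
            findings.append((doc, issue))
    return findings


def stub_review(doc: str) -> Optional[str]:
    """Placeholder reviewer; a real audit would call an LLM here instead."""
    if "ignore previous instructions" in doc.lower():
        return "instruction-injection"
    return None
```

Running this on a schedule, and feeding flagged patterns back into the earlier filters, is what makes the audit continuous rather than a one-off check.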