Deduplication
Exact and near-duplicate removal eliminates many attack payloads that rely on repetition to be memorized.
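As a rough illustration of combining exact and near-duplicate removal, the sketch below hashes normalized text for exact matches and compares word-shingle sets with Jaccard similarity for near matches. The function names, the shingle size of 3, and the 0.8 threshold are assumptions chosen for the example, not values from the source. The pairwise comparison is quadratic, so a real pipeline would likely swap in something like MinHash with locality-sensitive hashing.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants hash identically."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles of the normalized text."""
    words = normalize(text).split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def deduplicate(docs, threshold=0.8):
    """Keep the first copy of each document; drop exact and near duplicates.

    threshold is an assumed cut-off, not a recommended value.
    """
    seen_hashes = set()
    kept = []  # list of (document, shingle set) pairs
    for doc in docs:
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest in seen_hashes:
            continue  # exact duplicate after normalization
        sh = shingles(doc)
        if any(jaccard(sh, other) >= threshold for _, other in kept):
            continue  # near duplicate of a document already kept
        seen_hashes.add(digest)
        kept.append((doc, sh))
    return [doc for doc, _ in kept]
```

Calling deduplicate on a list that contains a sentence, a re-cased copy of it, and a copy with one extra word appended would keep only the first of the three.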
Quality classifier
Filter out low-quality or adversarial-looking content using a model trained on quality signals.
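The source does not say which quality signals or which model are used. Purely as an illustration, the sketch below computes a few simple text signals and applies hand-set thresholds as a stand-in for a trained classifier. The signal names and every threshold are hypothetical; a trained model would learn its decision rule from labeled data instead.

```python
import re
from collections import Counter


def quality_signals(text: str) -> dict:
    """Compute a few illustrative (hypothetical) quality signals for a document."""
    words = re.findall(r"\w+", text.lower())
    n = len(words)
    if n == 0:
        return {"n_words": 0, "unique_ratio": 0.0,
                "top_word_share": 1.0, "alpha_ratio": 0.0}
    counts = Counter(words)
    non_space = [c for c in text if not c.isspace()]
    alpha = sum(c.isalpha() for c in non_space)
    return {
        "n_words": n,
        "unique_ratio": len(counts) / n,                    # vocabulary diversity
        "top_word_share": counts.most_common(1)[0][1] / n,  # repetition of one word
        "alpha_ratio": alpha / len(non_space),              # letters vs. symbols
    }


def looks_low_quality(text, min_words=5, min_unique=0.3,
                      max_top_share=0.5, min_alpha=0.6):
    """Hand-set threshold rule standing in for a trained classifier.

    All thresholds are assumptions for the example.
    """
    s = quality_signals(text)
    return (
        s["n_words"] < min_words
        or s["unique_ratio"] < min_unique
        or s["top_word_share"] > max_top_share
        or s["alpha_ratio"] < min_alpha
    )
```

In practice the same signals could be fed as features to a learned model, with the filtering threshold tuned on held-out examples.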
Domain reputation
Weight data by source reputation, e.g. Wikipedia above a random blog. This reduces the impact of a low-reputation adversary.
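One way reputation weighting might look in practice is weighted sampling, where each document is drawn with probability proportional to its source's score. The sketch below assumes a hypothetical REPUTATION table, made-up scores, and documents represented as dicts with a "url" key; none of these come from the source.

```python
import random
from urllib.parse import urlparse

# Hypothetical reputation scores; real values would come from a curated list.
REPUTATION = {
    "en.wikipedia.org": 1.0,
    "random-blog.example": 0.1,
}
DEFAULT_REPUTATION = 0.05  # assumed score for unknown sources


def source_weight(url: str) -> float:
    """Look up the reputation score for a URL's host."""
    host = urlparse(url).hostname or ""
    return REPUTATION.get(host, DEFAULT_REPUTATION)


def sample_documents(docs, k, seed=0):
    """Draw k documents (with replacement), weighted by source reputation."""
    rng = random.Random(seed)
    weights = [source_weight(d["url"]) for d in docs]
    return rng.choices(docs, weights=weights, k=k)
```

With these assumed scores, a Wikipedia page is ten times as likely to be drawn as a page from the example blog, so content planted on a low-reputation site contributes proportionally less to the training mix.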
Continuous audit
Regularly sample the dataset and have an LLM review the samples for policy issues. This is how new adversarial patterns are discovered.
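The audit loop can be sketched as random sampling plus a pluggable reviewer. In the sketch below, audit_sample and keyword_reviewer are hypothetical names, and the keyword check is only a placeholder for the LLM review step, whose prompt and API are not described in the source.

```python
import random
from typing import Callable, List, Optional


def audit_sample(docs: List[str], review: Callable[[str], bool],
                 sample_size: int = 100, seed: Optional[int] = None) -> List[str]:
    """Randomly sample documents and return the ones the reviewer flags.

    review is any callable returning True for a policy issue; in a real
    pipeline it would wrap a call to an LLM.
    """
    rng = random.Random(seed)
    k = min(sample_size, len(docs))
    sampled = rng.sample(docs, k)
    return [d for d in sampled if review(d)]


def keyword_reviewer(text: str) -> bool:
    """Placeholder reviewer (assumption): flags one known injection phrase."""
    return "ignore previous instructions" in text.lower()
```

Flagged documents can then be inspected by hand, and any recurring pattern turned into a new filter rule for the earlier stages.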