Providers

Perspective API (Google). OpenAI Moderation (free). Azure Content Safety. Amazon Bedrock Guardrails. Each has own categories.

Advertisement

Open-source

Detoxify. Llama Guard. ToxDECT. Local deployment for latency + privacy.

Advertisement

Streaming filter

Classify each chunk as streams. Cut stream on toxic. Some latency: can't classify partial sentence perfectly. Buffer + classify per sentence.

False positives

Filters over-block medical/legal discussion, marginalized language. Tune thresholds per domain. Human review for edge cases.