Categories

Four harm categories, plus jailbreak detection and protected-material detection. Each category is scored with a severity level from 0 to 6.
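
As an illustration of how a caller might act on these scores, here is a minimal sketch that assumes the result arrives as a mapping from category name to an integer severity in the 0-6 range. The category names, threshold values, and the function `blocked_categories` are hypothetical and are not taken from any service's API.

```python
from typing import Dict, List

# Hypothetical per-category blocking thresholds (assumed values, not the
# defaults of any real service). Severity at or above the threshold blocks.
THRESHOLDS: Dict[str, int] = {
    "hate": 4,
    "sexual": 4,
    "violence": 4,
    "self_harm": 2,
}

MAX_SEVERITY = 6  # top of the 0-6 scale described above


def blocked_categories(severities: Dict[str, int],
                       thresholds: Dict[str, int] = THRESHOLDS) -> List[str]:
    """Return the categories whose severity meets or exceeds its threshold."""
    blocked = []
    for category, severity in severities.items():
        if not 0 <= severity <= MAX_SEVERITY:
            raise ValueError(f"severity out of range for {category}: {severity}")
        # Categories without a configured threshold are never blocked here.
        limit = thresholds.get(category)
        if limit is not None and severity >= limit:
            blocked.append(category)
    return sorted(blocked)
```

Keeping the thresholds in a plain mapping makes it easy to be stricter for one category than another without touching the decision logic.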

Prompt shields

Detects both direct and indirect prompt injection, with a separate classifier optimized for each.
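
The split between the two channels can be sketched as follows. This is an assumption-laden illustration, not the product's interface: the `Classifier` type, `ShieldResult`, and `shield` are hypothetical names, and the classifiers are passed in as plain callables standing in for trained models.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# A classifier here is any callable that maps text to True when an attack is
# suspected. Real classifiers would be trained models; these are placeholders.
Classifier = Callable[[str], bool]


@dataclass
class ShieldResult:
    direct_attack: bool              # attack found in the user's own prompt
    indirect_attack_docs: List[int]  # indices of documents flagged as attacks

    @property
    def attack_detected(self) -> bool:
        return self.direct_attack or bool(self.indirect_attack_docs)


def shield(user_prompt: str, documents: Sequence[str],
           direct_clf: Classifier, indirect_clf: Classifier) -> ShieldResult:
    """Route each input to the classifier meant for its channel."""
    # Third-party content (documents, emails, tool output) goes to the
    # indirect classifier; the user's prompt goes to the direct one.
    flagged = [i for i, doc in enumerate(documents) if indirect_clf(doc)]
    return ShieldResult(direct_clf(user_prompt), flagged)
```

Reporting the flagged document indices, rather than a single boolean, lets the caller drop only the suspicious documents and still answer from the rest.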

Groundedness detection

NLI-based check for RAG. Detects claims in a response that the retrieved sources do not support, and recommends regeneration when any are found.
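
A minimal sketch of such a check, assuming each sentence of the response is treated as one claim and that an NLI model is available behind a simple `entails(premise, hypothesis)` callable. That interface, the naive sentence splitter, and all function names here are hypothetical; no particular model or service is implied.

```python
import re
from typing import Callable, List, Sequence

# Hypothetical NLI interface: returns True when `premise` entails `hypothesis`.
# In practice this would wrap an NLI model; none is bundled here.
Entails = Callable[[str, str], bool]


def split_claims(response: str) -> List[str]:
    """Naive sentence split; treats each sentence as one claim."""
    parts = re.split(r"(?<=[.!?])\s+", response.strip())
    return [p for p in parts if p]


def ungrounded_claims(response: str, sources: Sequence[str],
                      entails: Entails) -> List[str]:
    """Return the claims that no source passage entails."""
    return [claim for claim in split_claims(response)
            if not any(entails(src, claim) for src in sources)]


def should_regenerate(response: str, sources: Sequence[str],
                      entails: Entails) -> bool:
    """Recommend regeneration when at least one claim is ungrounded."""
    return bool(ungrounded_claims(response, sources, entails))
```

Returning the ungrounded claims themselves, not just a flag, gives the caller something concrete to feed back into the regeneration prompt.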

Copilot integration

Powers safety for Microsoft Copilot. Battle-tested across billions of queries.