Kirchenbauer et al. (2023)

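Hash the preceding token (with a secret key) to seed a pseudorandom split of the vocabulary into a 'green' list and a 'red' list. Add a positive bias to green-token logits before sampling. The detector recomputes the split and tests whether the green fraction is higher than chance.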
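A minimal sketch of the three steps above, under stated assumptions: the key, the green fraction `GAMMA`, the bias `DELTA`, and all function names are illustrative choices, not values taken from these notes. SHA-256 stands in for whatever hash a real implementation uses.

```python
import hashlib
import math
import random

GAMMA = 0.5         # assumed fraction of the vocabulary marked green
DELTA = 2.0         # assumed bias added to green-token logits
KEY = b"demo-key"   # hypothetical secret key

def green_list(prev_token, vocab_size, gamma=GAMMA, key=KEY):
    """Seed a PRNG from hash(key, prev_token) and pick the green token ids."""
    digest = hashlib.sha256(key + prev_token.to_bytes(8, "big")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return set(rng.sample(range(vocab_size), int(gamma * vocab_size)))

def bias_logits(logits, prev_token, delta=DELTA):
    """Add delta to every green-token logit; red logits are unchanged."""
    green = green_list(prev_token, len(logits))
    return [x + delta if i in green else x for i, x in enumerate(logits)]

def z_score(tokens, vocab_size, gamma=GAMMA):
    """One-sided z statistic for the green count over the scored tokens.

    The first token has no predecessor, so it is not scored.
    """
    scored = len(tokens) - 1
    hits = sum(
        tokens[i] in green_list(tokens[i - 1], vocab_size, gamma)
        for i in range(1, len(tokens))
    )
    return (hits - gamma * scored) / math.sqrt(scored * gamma * (1 - gamma))
```

Unwatermarked text should land near z = 0, since each token is green with probability about `GAMMA`. Text sampled from biased logits lands well above it, and the gap grows with the square root of the length.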

Stealth

Human readers do not notice the bias. Hard to detect without the hash key, since the green/red split cannot be recomputed. Costs a small perplexity increase.

Robustness

Partially survives paraphrase. Heavy rewriting removes it. Not unbreakable.
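A rough way to see why light paraphrase weakens the signal and heavy rewriting erases it, as a sketch under a simplifying assumption: every rewritten position falls back to the chance green rate `gamma`. The function name and the numbers are hypothetical. In practice a single replaced token can also disturb the following position, because the split is seeded by the preceding token, so this estimate is optimistic.

```python
import math

def expected_z(num_tokens, green_rate, rewritten, gamma=0.5):
    """Expected detector z-score after a fraction of positions is rewritten.

    green_rate: green fraction of the original watermarked text.
    rewritten:  fraction of scored positions assumed to revert to chance.
    """
    mixed = (1 - rewritten) * green_rate + rewritten * gamma
    return (mixed - gamma) * math.sqrt(num_tokens) / math.sqrt(gamma * (1 - gamma))

if __name__ == "__main__":
    for frac in (0.0, 0.5, 0.75, 1.0):
        print(frac, round(expected_z(400, 0.8, frac), 2))
```

For 400 tokens at an 80% green rate, the expected score falls from about 12 with no rewriting to about 6 at 50% and about 3 at 75%. With a detection threshold around z = 4 (an assumed value here), the first two cases are still flagged and the third is not.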

Deployment

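Google DeepMind has deployed a related scheme (SynthID-Text) in Gemini. OpenAI has reported building a text watermark but has not released it. Not universally on. Trade-off with user preference for natural output.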