Randomized smoothing

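Wrap a base classifier in Gaussian input noise and predict by majority vote over the noisy copies. The smoothed classifier is certifiably robust within an L2 ball around each input. The trade: some clean accuracy in exchange for a provable guarantee.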
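
A minimal sketch of the idea, assuming a Gaussian-noise smoothing scheme with a Monte Carlo vote. The function name `smoothed_predict`, its parameters, and the toy classifier are hypothetical, chosen for illustration only. The lower confidence bound uses Hoeffding's inequality to stay within the standard library; published implementations typically use a tighter Clopper-Pearson bound. The radius formula is the simplified one that treats everything except the top class as a single runner-up.

```python
import math
import random
from collections import Counter
from statistics import NormalDist


def smoothed_predict(base_classifier, x, sigma=0.5, n0=100, n=1000,
                     alpha=0.001, rng=None):
    """Return (label, certified L2 radius), or (None, 0.0) to abstain.

    base_classifier: callable mapping a list of floats to a hashable label.
    sigma: standard deviation of the Gaussian noise added to each coordinate.
    n0 / n: sample counts for picking the candidate label / estimating its
            probability (two separate batches, so the bound stays valid).
    alpha: allowed probability that the certificate is wrong.
    """
    rng = rng or random.Random(0)

    def vote(num_samples):
        counts = Counter()
        for _ in range(num_samples):
            noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
            counts[base_classifier(noisy)] += 1
        return counts

    # Step 1: guess the top class from a small batch of noisy copies.
    candidate = vote(n0).most_common(1)[0][0]

    # Step 2: estimate how often that class wins on a fresh, larger batch.
    p_hat = vote(n)[candidate] / n

    # One-sided Hoeffding lower bound on the true top-class probability
    # (an assumption of this sketch, looser than Clopper-Pearson).
    p_lower = p_hat - math.sqrt(math.log(1.0 / alpha) / (2.0 * n))

    if p_lower <= 0.5:
        return None, 0.0  # cannot certify: abstain

    # Certified radius: sigma * Phi^{-1}(p_lower).
    radius = sigma * NormalDist().inv_cdf(p_lower)
    return candidate, radius


# Toy base classifier (hypothetical): label 1 if the first coordinate is positive.
def toy_classifier(v):
    return 1 if v[0] > 0 else 0


label, radius = smoothed_predict(toy_classifier, [3.0, 0.0])
```

The accuracy cost shows up in `sigma`: larger noise can certify a larger radius but makes the base classifier's job harder, and inputs whose vote is too close are abstained on rather than certified.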

For LLMs

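Very hard. The token space is discrete, so certificates defined over a continuous L2 ball do not transfer directly to text inputs. Some progress on classifier heads (safety and moderation classifiers).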

Applications

Safety-critical classification: harm detection, medical questions, autonomous vehicles. Anywhere formal guarantees are demanded.

Practical status

Research frontier. Not production-ready for general-purpose LLMs. Watch for specific, narrow applications.