Randomized smoothing
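Wrap a classifier with Gaussian noise: predict the majority class over many noisy copies of the input. Gives certifiable robustness within an L2 ball around the input. Trades accuracy for guarantees.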
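A minimal sketch of the idea, not a reference implementation. Assumptions: `base_classifier` is a hypothetical toy stand-in for a trained model, `smoothed_predict` and its parameters are illustrative names, and the lower confidence bound uses a simple Hoeffding inequality in place of the tighter binomial bounds usually used in practice. It also reuses one set of samples for both picking the class and estimating its probability, which is a simplification. The radius is sigma times the inverse normal CDF of the lower bound on the top-class probability.

```python
import math
import random
from collections import Counter
from statistics import NormalDist


def base_classifier(x):
    """Hypothetical toy model: a linear rule on a 2-D point."""
    return 1 if x[0] + x[1] > 0 else 0


def smoothed_predict(f, x, sigma=0.5, n=1000, alpha=0.001, rng=None):
    """Majority vote of f under Gaussian noise, plus a certified L2 radius.

    Returns (label, radius), or (None, 0.0) when the vote is too close
    to certify and the smoothed classifier abstains.
    """
    rng = rng or random.Random(0)
    # Classify n noisy copies of x and count the votes per class.
    votes = Counter(
        f([xi + rng.gauss(0.0, sigma) for xi in x]) for _ in range(n)
    )
    label, count = votes.most_common(1)[0]
    # One-sided Hoeffding lower bound on the top-class probability,
    # holding with probability at least 1 - alpha (assumed simplification).
    p_lower = count / n - math.sqrt(math.log(1.0 / alpha) / (2.0 * n))
    if p_lower <= 0.5:
        return None, 0.0  # abstain: no guarantee can be given
    # Certified radius: sigma * Phi^{-1}(p_lower).
    radius = sigma * NormalDist().inv_cdf(p_lower)
    return label, radius


label, radius = smoothed_predict(base_classifier, [2.0, 2.0])
print(label, round(radius, 3))
```

The accuracy trade-off shows up in `sigma`: larger noise can certify a larger radius, but the base classifier sees more heavily corrupted inputs and is wrong or abstains more often.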
For LLMs
Very hard. Token space is discrete, so continuous L2 noise does not apply directly to text inputs. Some progress on classifier heads (safety and moderation classifiers).
Applications
Safety-critical classification: harm detection, medical questions, autonomous vehicles. Anywhere formal guarantees are demanded.
Practical status
Research frontier. Not production-ready for general-purpose LLMs. Watch for specific narrow applications.