Exact unlearning
Retrain from scratch without the data to be forgotten → correct by construction but expensive. Impractical for LLMs.
Approximate methods
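Fine-tune to forget (e.g., gradient ascent on the forget set). Task arithmetic (fine-tune on the forget data, then subtract the resulting task vector from the base weights). Fine-tune toward 'blank slate' responses on specific topics. Note: SISA (sharded training, retrain only the affected shards) is usually classed as a cheaper form of exact unlearning, not as fine-tuning to forget.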
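The task-arithmetic idea above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: weights are plain lists of floats rather than tensors, and the names `task_vector`, `negate_task`, and `alpha` are hypothetical. The parameter values are made up.

```python
def task_vector(base, finetuned):
    """Per-parameter difference: finetuned weights minus base weights."""
    return {
        name: [f - b for f, b in zip(finetuned[name], base[name])]
        for name in base
    }


def negate_task(base, vector, alpha=1.0):
    """Subtract alpha * task vector from the base weights."""
    return {
        name: [b - alpha * v for b, v in zip(base[name], vector[name])]
        for name in base
    }


# Toy weights (assumed values, for illustration only).
base = {"layer.weight": [0.5, -1.0, 2.0]}
# Hypothetical result of fine-tuning the base model on the forget data.
finetuned_on_forget = {"layer.weight": [1.0, -1.5, 2.0]}

tau = task_vector(base, finetuned_on_forget)
unlearned = negate_task(base, tau, alpha=0.5)
print(unlearned)
```

The scale `alpha` is assumed to be a tunable knob: larger values push harder away from the forget data but also move further from the base model.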
Verification
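A membership inference attack should classify the forgotten examples as 'not in training' after unlearning, i.e., do no better than chance at telling them apart from held-out data. Used as a metric for success.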
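One simple way to run this check is a loss-threshold attack: examples with unusually low loss are guessed to be training members. The sketch below assumes per-example losses are already available and scores the attack by AUC, where a value near 0.5 means the forget set looks like unseen data. The function name `mia_auc` and all loss values are hypothetical.

```python
def mia_auc(forget_losses, holdout_losses):
    """AUC of a loss-based membership attack.

    Probability that a random forget example has lower loss than a
    random held-out example (ties count as half).
    """
    wins = 0.0
    for f in forget_losses:
        for h in holdout_losses:
            if f < h:
                wins += 1.0
            elif f == h:
                wins += 0.5
    return wins / (len(forget_losses) * len(holdout_losses))


# Made-up per-example losses, for illustration only.
holdout = [1.0, 1.2, 0.9]
forget_before = [0.1, 0.2, 0.3]   # memorised: much lower loss than held-out
forget_after = [1.0, 1.1, 0.9]    # after unlearning: similar to held-out

auc_before = mia_auc(forget_before, holdout)
auc_after = mia_auc(forget_after, holdout)
print(auc_before, auc_after)
```

An AUC well above 0.5 after unlearning would suggest the model still treats the forgotten examples as training members. A value near 0.5 is evidence of success under this particular attack only, not a guarantee.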
Catastrophic forgetting risk
Removing the unlearning target may also damage related capabilities. Balance retention against forgetting.
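The retention/forgetting balance can be framed as a selection problem: among candidate unlearned models, keep those that forget enough, then pick the one that retains the most. The sketch below assumes accuracy on a forget set and on a retain set has already been measured. The candidate numbers, the `max_forget_acc` threshold, and the function name `pick_candidate` are all hypothetical.

```python
def pick_candidate(candidates, max_forget_acc):
    """Return the candidate with the best retain accuracy among those
    whose forget accuracy is at or below the threshold, or None."""
    ok = [c for c in candidates if c["forget_acc"] <= max_forget_acc]
    if not ok:
        return None
    return max(ok, key=lambda c: c["retain_acc"])


# Made-up evaluation results for different unlearning strengths.
candidates = [
    {"alpha": 0.0, "forget_acc": 0.90, "retain_acc": 0.85},
    {"alpha": 0.5, "forget_acc": 0.40, "retain_acc": 0.83},
    {"alpha": 1.0, "forget_acc": 0.10, "retain_acc": 0.80},
    {"alpha": 2.0, "forget_acc": 0.05, "retain_acc": 0.55},
]

best = pick_candidate(candidates, max_forget_acc=0.15)
print(best)
```

In this toy data the strongest setting forgets slightly more but loses much more retained capability, so the moderate setting is selected.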