Perplexity filtering
A small language model scores how predictable each token is in context (its per-token perplexity, or surprisal). The lowest-scoring tokens are dropped first: the most predictable tokens carry the least information.
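A minimal sketch of the idea, not a production method: a toy bigram model with add-one smoothing stands in for the small language model, and tokens whose surprisal falls below a threshold are dropped. The corpus, the threshold, and the function names are all assumptions made for illustration; a real system would take token probabilities from an actual small LM.

```python
import math
from collections import Counter

def train_bigram(corpus_tokens):
    """Count unigrams and bigrams; a toy stand-in for a small language model."""
    unigrams = Counter(corpus_tokens)
    bigrams = Counter(zip(corpus_tokens, corpus_tokens[1:]))
    return unigrams, bigrams

def surprisal(prev, tok, unigrams, bigrams):
    """Return -log2 P(tok | prev) using add-one smoothing."""
    vocab = len(unigrams) + 1  # one extra slot for unseen tokens
    p = (bigrams[(prev, tok)] + 1) / (unigrams[prev] + vocab)
    return -math.log2(p)

def filter_predictable(tokens, unigrams, bigrams, threshold):
    """Keep the first token plus every token with surprisal >= threshold."""
    kept = tokens[:1]  # the first token has no context, so it is kept as-is
    for prev, tok in zip(tokens, tokens[1:]):
        if surprisal(prev, tok, unigrams, bigrams) >= threshold:
            kept.append(tok)
    return kept

# Hypothetical training text and prompt, for illustration only.
corpus = "the cat sat on the mat the cat sat on the rug".split()
unigrams, bigrams = train_bigram(corpus)
prompt = "the cat sat on the sofa".split()
print(filter_predictable(prompt, unigrams, bigrams, threshold=2.0))
# prints ['the', 'sofa']
```

The unseen word "sofa" is the most surprising token and survives, while the familiar sequence "cat sat on the" is predictable and drops.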
Ratio control
Set a target compression ratio such as 2x, 5x, or 10x. Higher ratios lose more information. The sweet spot is task-specific.
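One way to turn a target ratio into a token budget, sketched under assumptions: given one informativeness score per token (for example a surprisal value), keep the top `ceil(n / ratio)` tokens and preserve their original order. The function name `compress_to_ratio` and the scores below are hypothetical, chosen only to illustrate the mechanism.

```python
import math

def compress_to_ratio(tokens, scores, ratio):
    """Keep the ceil(len(tokens) / ratio) highest-scoring tokens, in original order."""
    if ratio < 1:
        raise ValueError("ratio must be >= 1")
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    keep = math.ceil(len(tokens) / ratio)
    # Rank token positions by score, highest first; ties keep earlier tokens.
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    chosen = sorted(ranked[:keep])  # restore original word order
    return [tokens[i] for i in chosen]

# Made-up scores for illustration; higher means less predictable.
tokens = ["Please", "send", "the", "invoice", "to", "Alice", "by", "Friday"]
scores = [1.0, 2.5, 0.3, 4.0, 0.4, 5.0, 0.6, 4.5]

print(compress_to_ratio(tokens, scores, 2))  # prints ['send', 'invoice', 'Alice', 'Friday']
print(compress_to_ratio(tokens, scores, 4))  # prints ['Alice', 'Friday']
```

At 2x the instruction is still recoverable; at 4x the action is gone, which is the kind of loss that makes the right ratio depend on the task.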
What survives
Named entities, numbers, and key verbs tend to survive. What drops: filler words, redundant phrasing, and easily predicted grammatical structure.
Cost/quality trade
Save 50-80% of input tokens, which corresponds to roughly 2x to 5x compression. Expect a small quality hit on most tasks. Measure the impact per use case.
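The link between compression ratio and token savings is simple arithmetic: a ratio of r leaves 1/r of the tokens, so the saving is 1 - 1/r. The helper name below is hypothetical, and the calculation ignores any cost of running the small scoring model.

```python
def input_token_savings(ratio):
    """Fraction of input tokens removed at a given compression ratio."""
    if ratio < 1:
        raise ValueError("ratio must be >= 1")
    return 1 - 1 / ratio

for ratio in (2, 5, 10):
    print(f"{ratio}x -> {input_token_savings(ratio):.0%} fewer input tokens")
# prints:
# 2x -> 50% fewer input tokens
# 5x -> 80% fewer input tokens
# 10x -> 90% fewer input tokens
```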