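ppl = exp(loss), where loss is the mean per-token cross-entropy in nats. Lower = better. ppl=N means the model is, on average, as uncertain as if it were choosing uniformly between N equally likely tokens.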
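The relationship can be checked with a few lines of Python. This is a minimal sketch, assuming the loss is a mean per-token cross-entropy in nats; the function names and the `token_probs` list are hypothetical and only for illustration.

```python
import math

def perplexity(mean_loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_loss_nats)

def loss_from_perplexity(ppl: float) -> float:
    """Inverse mapping: perplexity back to mean loss in nats."""
    return math.log(ppl)

# Hypothetical probabilities the model assigned to the correct next token.
# Every value is 1/4, i.e. a uniform guess among 4 tokens.
token_probs = [0.25, 0.25, 0.25, 0.25]

# Mean negative log-likelihood over the sequence.
mean_loss = -sum(math.log(p) for p in token_probs) / len(token_probs)

# A uniform guess among 4 tokens gives a perplexity of about 4.
uniform_ppl = perplexity(mean_loss)
print(f"loss={mean_loss:.4f} nats, ppl={uniform_ppl:.4f}")
```

Read in the other direction, `loss_from_perplexity(30)` gives the loss value that corresponds to a perplexity of 30.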
What you're seeing
Approximate values, which depend on the evaluation text and tokenizer: GPT-2 ~30, GPT-3 ~20, Llama 3 8B ~6, GPT-4 ~5.
★ KEY TAKEAWAY
Perplexity = exp(loss). GPT-2 ~30, Llama 3 8B ~6, GPT-4 ~5. Lower = better, but only compare values measured on the same domain and with the same tokenizer.
▶ WHAT TO TRY
- Slide Loss and see which model class your value matches.
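- Note: comparing perplexity across tokenizers is meaningless, because perplexity is per token and different tokenizers split the same text into different numbers of tokens.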
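To see why the tokenizer matters, here is a small illustrative sketch. It assumes two hypothetical tokenizers that split the same text into 100 and 150 tokens, and a model that assigns the same total negative log-likelihood to the text in both cases; all numbers are made up for the arithmetic.

```python
import math

def per_token_perplexity(total_nll_nats: float, n_tokens: int) -> float:
    """Perplexity from a total negative log-likelihood and a token count."""
    return math.exp(total_nll_nats / n_tokens)

# Hypothetical total negative log-likelihood of one text, in nats.
total_nll = 300.0

# Same text, same total likelihood, different token counts.
ppl_coarse = per_token_perplexity(total_nll, 100)  # exp(3.0), about 20.1
ppl_fine = per_token_perplexity(total_nll, 150)    # exp(2.0), about 7.39

print(f"100 tokens: ppl={ppl_coarse:.2f}")
print(f"150 tokens: ppl={ppl_fine:.2f}")
```

The text is modeled equally well in both cases, yet the finer tokenizer reports a much lower perplexity simply because the same total loss is spread over more tokens.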