Truncate the distribution by k (fixed count) or p (cumulative mass), and reshape it with temperature T (sharpness).
What you're seeing
Greedy = top-1 = the T→0 limit. Top-p adapts to the distribution's shape; top-k keeps a fixed count. Typical production setting: top-p with T between 0.7 and 0.9.
★ KEY TAKEAWAY
Top-k truncates to a fixed count; top-p adapts to distribution shape. Combine with temperature for production sampling.
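A minimal sketch of how the three knobs can be combined, using only the Python standard library. The function names (`softmax`, `truncate`, `sample`) and the order of operations (temperature first, then top-k, then top-p, then renormalize) are assumptions made for illustration; real inference libraries may apply the filters in a different order or on logits rather than probabilities.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn logits into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def truncate(probs, top_k=None, top_p=None):
    """Return {token_index: renormalized probability} for surviving tokens."""
    # Rank token indices from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k is not None:
        order = order[:top_k]  # fixed count
    if top_p is not None:
        kept, mass = [], 0.0
        for i in order:
            kept.append(i)
            mass += probs[i]
            if mass >= top_p:  # smallest prefix whose mass reaches p
                break
        order = kept
    total = sum(probs[i] for i in order)
    return {i: probs[i] / total for i in order}

def sample(logits, temperature=0.8, top_k=None, top_p=0.9, rng=random):
    """Assumed pipeline: temperature, then truncation, then a random draw."""
    kept = truncate(softmax(logits, temperature), top_k=top_k, top_p=top_p)
    tokens = list(kept)
    weights = [kept[i] for i in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

# Hypothetical logits for a five-token vocabulary.
logits = [2.0, 1.0, 0.5, 0.0, -1.0]
print(sample(logits))                         # random draw from the kept set
print(list(truncate(softmax(logits), top_k=1)))  # top-1 is the greedy choice
```

With `top_k=1`, or with a temperature close to zero, only the highest-probability token survives, which is the greedy case.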
▶ WHAT TO TRY
- Set Top-p to 0.9 — see how many tokens it keeps (varies with distribution).
- Set Top-k to 5 — always exactly 5.
- Drop T low: the distribution sharpens, so top-p keeps fewer tokens while top-k still keeps exactly k.
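The experiments above can be approximated offline with a short script. This is only a sketch: the logits are made up for illustration, and the helper names (`top_p_count`, `top_k_count`) are hypothetical, not taken from the demo itself.

```python
import math

def softmax(logits, temperature):
    """Probabilities from logits scaled by 1/temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_count(probs, p):
    """Number of tokens in the smallest top-ranked set with mass >= p."""
    mass = 0.0
    for n, q in enumerate(sorted(probs, reverse=True), start=1):
        mass += q
        if mass >= p:
            return n
    return len(probs)

def top_k_count(probs, k):
    """Top-k keeps k tokens, capped by the vocabulary size."""
    return min(k, len(probs))

# Hypothetical logits for a ten-token vocabulary.
logits = [3.0, 2.5, 2.0, 1.5, 1.0, 0.5, 0.0, -0.5, -1.0, -1.5]

for T in (2.0, 1.0, 0.5, 0.1):
    probs = softmax(logits, T)
    print(f"T={T}: top-p(0.9) keeps {top_p_count(probs, 0.9)}, "
          f"top-k(5) keeps {top_k_count(probs, 5)}")
```

As T drops, the top-p count should shrink toward 1, while the top-k count stays at 5 for any vocabulary of at least five tokens.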