▶ Interactive Lab

Temperature & Sampling

Adjust temperature; see how it reshapes the next-token distribution.

Temperature divides logits before softmax. Low T → peaked (deterministic). High T → flat (random).

What you're seeing

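Given raw logits z, the next-token probability distribution is softmax(z/T). At T=1 this is the model's learned distribution. T<1 sharpens it (more deterministic). T>1 flattens it (more diverse). As T→0 the distribution collapses onto the argmax, so T=0 is treated as pure greedy decoding (always the argmax), since softmax(z/T) itself is undefined at T=0.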
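
To make the scaling concrete, here is a minimal sketch of softmax(z/T) in plain Python. The function name `softmax_with_temperature` and the example logits are hypothetical, chosen only for illustration; T=0 is handled as a special case that returns a one-hot vector on the argmax, which is one common convention rather than the only possible one.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Return softmax(z / T); T == 0 is treated as the greedy limit."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if temperature == 0:
        # softmax(z / T) is undefined at T = 0, so use the T -> 0 limit:
        # all probability mass on the argmax (assumed convention).
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]  # hypothetical raw logits for three tokens
for t in (0, 0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Running the loop should show the top token's probability shrinking as T grows, while the probabilities always sum to 1.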

Use low T for code and factual answers, and high T for creative writing. Beyond roughly T=1.5, outputs from most models tend to become incoherent.
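
The trade-off between consistency and diversity shows up when you actually draw samples. The sketch below is an illustration under stated assumptions: the four-token vocabulary, the logits, and the helper names `temperature_probs` and `sample_tokens` are all hypothetical, and a real model samples over a far larger vocabulary.

```python
import math
import random
from collections import Counter

def temperature_probs(logits, temperature):
    # softmax(z / T) with max subtraction for numerical stability
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_tokens(tokens, logits, temperature, n, rng):
    """Draw n tokens; T == 0 falls back to greedy (argmax) decoding."""
    if temperature == 0:
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [tokens[best]] * n
    probs = temperature_probs(logits, temperature)
    return rng.choices(tokens, weights=probs, k=n)

tokens = ["the", "a", "one", "zebra"]  # hypothetical vocabulary
logits = [3.0, 1.5, 0.5, 0.0]          # hypothetical logits
rng = random.Random(0)                 # seeded so runs are repeatable
for t in (0, 0.1, 0.7, 2.0):
    counts = Counter(sample_tokens(tokens, logits, t, 1000, rng))
    print(t, dict(counts))
```

At low T nearly every draw is the top token, which suits code and factual answers. At high T the unlikely tokens appear much more often, which adds variety but also raises the chance of an off-topic pick.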

★ KEY TAKEAWAY
Temperature divides logits before softmax. T<1 sharpens; T>1 flattens; T=0 is greedy.
▶ WHAT TO TRY
  • Slide Temperature from 0.1 (peaked) to 2.0 (flat).
  • Watch the top-prob and entropy readouts change.
  • Compare typical settings: T=0.7-0.8 for default chat, T=0-0.3 for code.
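
If you want to reproduce the top-prob and entropy readouts outside the lab, the following sketch computes both for a few temperatures. It assumes entropy is reported in bits (the lab's actual unit is not stated), and the logits and the helper names `softmax_t` and `entropy_bits` are hypothetical.

```python
import math

def softmax_t(logits, temperature):
    # softmax(z / T) for T > 0, with max subtraction for stability
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy_bits(probs):
    # Shannon entropy in bits (assumed unit); zero-probability terms add 0
    return -sum(p * math.log2(p) for p in probs if p > 0)

logits = [4.0, 2.0, 1.0, 0.5, 0.0]  # hypothetical logits for five tokens
for t in (0.1, 0.5, 1.0, 2.0):
    probs = softmax_t(logits, t)
    print(f"T={t:<4} top-prob={max(probs):.3f} "
          f"entropy={entropy_bits(probs):.3f} bits")
```

As T rises, the top probability falls and the entropy climbs toward its maximum of log2(N) bits for N tokens, which corresponds to a perfectly flat distribution.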