Temperature & Sampling — Belgavi.AI Lab

Temperature 1.00

Temperature divides logits before softmax. Low T → peaked (deterministic). High T → flat (random).

What you're seeing

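Given raw logits z, the next-token probability distribution is softmax(z/T). At T=1 this is the model's learned distribution. T<1 sharpens it (more deterministic); T>1 flattens it (more diverse). The formula is undefined at T=0 because it divides by zero, but in the limit T→0 all probability moves to the argmax, so T=0 is treated as pure greedy decoding (always the argmax).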
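As a minimal sketch of the formula above, the following computes softmax(z/T) in plain Python. The function name softmax_with_temperature and the example logits are assumptions made for illustration, and treating T=0 as a one-hot argmax is a convention for the limit case rather than something the formula itself defines.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Return softmax(z / T) as a list of probabilities."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if temperature == 0:
        # Limit T -> 0: all mass on the argmax (greedy).
        # Ties are broken in favour of the first index (an arbitrary choice).
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical example values
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(f"T={t}: {[round(p, 3) for p in probs]}")
```

With these logits, the largest probability shrinks as T grows, which is the sharpening and flattening described above.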

Use low T for code and factual answers, and higher T for creative writing. Beyond roughly T=1.5, outputs from most models tend to become incoherent.
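To make the difference between low and high T concrete, here is a small sampling sketch. The five-token vocabulary, the logits, and the helper name sample_tokens are all hypothetical, and the random seed is fixed only so that repeated runs give the same draws.

```python
import math
import random

def sample_tokens(tokens, logits, temperature, n, rng):
    """Draw n tokens from softmax(logits / temperature); temperature must be > 0."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    # random.choices normalises the weights internally.
    return rng.choices(tokens, weights=weights, k=n)

tokens = ["the", "a", "this", "one", "zebra"]  # hypothetical vocabulary
logits = [3.0, 1.5, 0.5, 0.0, -1.0]            # hypothetical logits

rng = random.Random(0)
low_t = sample_tokens(tokens, logits, 0.1, 200, rng)
high_t = sample_tokens(tokens, logits, 2.0, 200, rng)

print("T=0.1 distinct tokens:", sorted(set(low_t)))
print("T=2.0 distinct tokens:", sorted(set(high_t)))
```

At T=0.1 nearly every draw is the top-scoring token, while at T=2.0 the draws spread across the vocabulary, including low-scoring tokens.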

★ KEY TAKEAWAY

Temperature divides logits before softmax. T<1 sharpens; T>1 flattens; T=0 (the limit T→0) is greedy.

▶ WHAT TO TRY

Slide Temperature from 0.1 (peaked) to 2.0 (flat).
Watch the top-prob and entropy readouts change.
Typical settings: T=0.7 to 0.8 for default chat; T=0 to 0.3 for code.
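The slider sweep above can be approximated offline with a short script. This is a sketch under stated assumptions: the logits are made up, the helper names softmax_t and entropy_bits are hypothetical, and entropy is reported in bits, which may not match the unit the page's readout uses.

```python
import math

def softmax_t(logits, temperature):
    """softmax(z / T) for temperature > 0."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy_bits(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

logits = [3.0, 1.5, 0.5, 0.0, -1.0]  # hypothetical logits
readouts = []
for t in (0.1, 0.5, 1.0, 1.5, 2.0):
    probs = softmax_t(logits, t)
    readouts.append((t, max(probs), entropy_bits(probs)))
    print(f"T={t:.1f}  top-prob={max(probs):.3f}  entropy={entropy_bits(probs):.3f} bits")
```

As T rises, the top probability falls and the entropy climbs toward its maximum of log2(5) bits for five tokens, which is the uniform distribution.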