Large Language Models Labs

Interactive labs

All 5 labs in this category

See causal, bidirectional, and prefix-LM attention masks side by side.
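
As a rough illustration of the three mask types, here is a minimal sketch in plain Python. The function name `attention_mask` and the `prefix_len` parameter are hypothetical, chosen for this example only; a 1 at row i, column j means query position i may attend to key position j.

```python
def attention_mask(seq_len, kind, prefix_len=0):
    """Build a seq_len x seq_len 0/1 mask (hypothetical helper for illustration)."""
    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            if kind == "bidirectional":
                allowed = True                       # every position sees every position
            elif kind == "causal":
                allowed = j <= i                     # only the current and earlier positions
            elif kind == "prefix":
                allowed = j <= i or j < prefix_len   # full attention inside the prefix, causal after it
            else:
                raise ValueError(f"unknown mask kind: {kind}")
            row.append(1 if allowed else 0)
        mask.append(row)
    return mask


if __name__ == "__main__":
    for kind in ("causal", "bidirectional", "prefix"):
        print(kind)
        for row in attention_mask(4, kind, prefix_len=2):
            print(" ", row)
```

Printing the three masks for the same sequence length makes the difference visible: the causal mask is lower triangular, the bidirectional mask is all ones, and the prefix-LM mask is lower triangular with a fully visible block covering the prefix columns.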

See how KV cache grows with context length, batch size, and precision.
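
The growth can be estimated with simple arithmetic. The sketch below assumes a standard transformer that stores one key and one value vector per layer, per KV head, per token; the function name and the example configuration are hypothetical and do not describe any particular model.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, batch_size, bytes_per_value):
    """Estimate KV cache size in bytes under the assumptions stated above."""
    # The leading factor of 2 accounts for storing both keys and values.
    return 2 * n_layers * n_kv_heads * head_dim * context_len * batch_size * bytes_per_value


if __name__ == "__main__":
    # Hypothetical configuration: 32 layers, 32 KV heads of dimension 128, 16-bit values.
    size = kv_cache_bytes(32, 32, 128, context_len=4096, batch_size=1, bytes_per_value=2)
    print(size / 2**30, "GiB")
```

Under these assumptions the cache is linear in each factor: doubling the context length or the batch size doubles the cache, and halving the precision (for example from 16-bit to 8-bit values) halves it.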

A draft model proposes K tokens; the main model verifies them in parallel.
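
A minimal sketch of one propose-and-verify step follows. It uses the simplified greedy-matching variant (a drafted token is accepted only if it equals the main model's own greedy choice), not the probabilistic accept/reject rule used with sampling. The names `speculative_step`, `draft_next`, and `target_next` are hypothetical, and the verification loop stands in for what would be a single batched forward pass in a real system.

```python
def speculative_step(prefix, draft_next, target_next, k):
    """Run one speculative step and return the tokens to append to prefix.

    draft_next and target_next are callables mapping a token list to the
    next token (assumed greedy, deterministic models for this sketch).
    """
    # 1. The draft model proposes k tokens autoregressively.
    proposed = []
    ctx = list(prefix)
    for _ in range(k):
        tok = draft_next(ctx)
        proposed.append(tok)
        ctx.append(tok)

    # 2. The main model checks each proposed position.
    accepted = []
    ctx = list(prefix)
    for tok in proposed:
        expected = target_next(ctx)
        if tok != expected:
            # First mismatch: keep the main model's token and stop.
            accepted.append(expected)
            return accepted
        accepted.append(tok)
        ctx.append(tok)

    # 3. All k proposals matched, so the main model contributes one extra token.
    accepted.append(target_next(ctx))
    return accepted
```

Each step therefore yields between 1 and K + 1 tokens, and in this greedy variant the output is identical to what the main model would have produced on its own.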

Adjust temperature; see how it reshapes the next-token distribution.
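
The effect can be sketched with a temperature-scaled softmax: logits are divided by the temperature before normalization. The function name and the example logits are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                              # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.1]                        # made-up example values
    for t in (0.5, 1.0, 2.0):
        print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```

Temperatures below 1 sharpen the distribution toward the highest-scoring token, while temperatures above 1 flatten it toward uniform.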

See how each method truncates the candidate set differently.
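
The blurb does not name the methods being compared. As an assumption, the sketch below uses two common truncation schemes, top-k and top-p (nucleus) filtering; the function names and the example distribution are hypothetical.

```python
def _renormalize(probs, keep):
    """Return {token_index: probability} over the kept indices, rescaled to sum to 1."""
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}


def top_k_filter(probs, k):
    """Keep the k most probable tokens (assumed method, see the note above)."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return _renormalize(probs, order[:k])


def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return _renormalize(probs, keep)


if __name__ == "__main__":
    probs = [0.5, 0.3, 0.15, 0.05]                  # made-up example distribution
    print(sorted(top_k_filter(probs, 2)))
    print(sorted(top_p_filter(probs, 0.9)))
```

The contrast is that top-k always keeps a fixed number of candidates, whereas top-p keeps a variable number that depends on how concentrated the distribution is.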