Front-load value

Ask for answer first, explanation after. User sees answer immediately while explanation streams.

Advertisement

Streaming markdown

Progressive markdown renderer needs valid partial docs. Prefer flat lists + short paragraphs over deep nesting that breaks mid-stream.

Advertisement

JSON streaming

Emit fields in order client can consume. 'title' first, 'body' second. Streaming JSON parsers (jsonrepair) handle partials.

Thinking tag

o1/Claude thinking: hidden reasoning delays visible tokens. First-token latency spikes. UX: show 'thinking…' state during reasoning.