Format
Classify sentiment.
Input: I love this!
Output: positive
Input: This is terrible.
Output: negative
Input: {user_input}
Output:
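A minimal sketch of filling this template in code. The helper name build_prompt and the EXAMPLES list are illustrative assumptions, not part of any library; the final "Output:" is left open for the model to complete.

```python
# Hypothetical helper: names here are illustrative, not from any library.
EXAMPLES = [
    ("I love this!", "positive"),
    ("This is terrible.", "negative"),
]

def build_prompt(examples, user_input, instruction="Classify sentiment."):
    """Render the instruction, the worked examples, then the open query."""
    lines = [instruction]
    for text, label in examples:
        lines.append(f"Input: {text}")
        lines.append(f"Output: {label}")
    lines.append(f"Input: {user_input}")
    lines.append("Output:")  # left open for the model to complete
    return "\n".join(lines)

prompt = build_prompt(EXAMPLES, "Not bad at all.")
print(prompt)
```

Keeping the examples in a list rather than a hard-coded string makes it easy to change their number or order later.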
How many examples
Diminishing returns after ~5-8 examples. Modern LLMs often get by with 2-3. Watch total context length and cost: every example is resent with each query.
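One way to keep the example count and prompt size in check is to cap both, sketched below. The function names, the default limits, and the word-count stand-in for a tokenizer are all assumptions made for illustration; a real tokenizer will give different counts.

```python
def rough_tokens(text):
    # Crude stand-in for a real tokenizer: counts whitespace-separated words.
    return len(text.split())

def take_within_budget(examples, max_examples=3, max_tokens=50):
    """Keep examples in order until either limit would be exceeded."""
    kept, used = [], 0
    for text, label in examples:
        cost = rough_tokens(f"Input: {text}\nOutput: {label}")
        if len(kept) >= max_examples or used + cost > max_tokens:
            break
        kept.append((text, label))
        used += cost
    return kept, used

pool = [
    ("I love this!", "positive"),
    ("This is terrible.", "negative"),
    ("Best purchase ever.", "positive"),
    ("It broke in a week.", "negative"),
]
kept, used = take_within_budget(pool, max_examples=3, max_tokens=50)
print(len(kept), "examples,", used, "rough tokens")
```

Whether 2, 3, or more examples are worth their cost is best checked on a held-out set for the task at hand.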
Example diversity
Cover the label space. Include edge cases. Avoid one-sided sets such as all-positive examples: the model will overpredict positive.
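A quick check for label coverage might look like the sketch below. The function label_report and the sample data are hypothetical; it only counts labels and does not judge whether edge cases are covered.

```python
from collections import Counter

def label_report(examples, label_space):
    """Count examples per label and list labels that have no example."""
    counts = Counter(label for _, label in examples)
    missing = [label for label in label_space if counts[label] == 0]
    return counts, missing

# Deliberately skewed set: every example is positive.
skewed = [
    ("I love this!", "positive"),
    ("Best purchase ever.", "positive"),
    ("Works great.", "positive"),
]
counts, missing = label_report(skewed, ["positive", "negative"])
print(dict(counts))  # {'positive': 3}
print(missing)       # ['negative']
```

A non-empty missing list, or one label dominating the counts, is a signal to rebalance the examples before sending the prompt.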
Order matters
Recency bias: last example influences output most. Randomize order across queries to reduce bias.
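Randomizing the order per query can be sketched as follows, using Python's standard random module. The helper name shuffled_examples is an assumption, and the fixed seed is there only to make the demo repeatable.

```python
import random

def shuffled_examples(examples, rng=random):
    """Return a new list with the examples in random order; input is untouched."""
    return rng.sample(examples, k=len(examples))

examples = [
    ("I love this!", "positive"),
    ("This is terrible.", "negative"),
    ("Best purchase ever.", "positive"),
    ("It broke in a week.", "negative"),
]

rng = random.Random(0)  # seeded only so the demo is repeatable
for query in ["Not bad at all.", "Would not buy again."]:
    order = shuffled_examples(examples, rng)
    # The label in the last slot now varies from query to query.
    print(query, "-> last example label:", order[-1][1])
```

Shuffling per query spreads the last-position effect across labels instead of always favoring the same one.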