Why it works
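Sampling chain-of-thought (CoT) at a nonzero temperature explores multiple reasoning paths. Wrong paths tend to diverge toward different answers, while the correct answer recurs across samples, so a majority vote over final answers is more robust than any single sample.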
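A minimal sketch of the vote itself, assuming the final answers have already been extracted from each sampled reasoning path as strings. The function name `majority_vote` and the sample answers are illustrative, not taken from any library.

```python
from collections import Counter

def majority_vote(answers):
    """Return the most frequent answer and its share of the votes."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(answers)
    # most_common breaks ties by first occurrence in the input.
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(answers)

# Hypothetical final answers extracted from 7 sampled reasoning paths.
sampled = ["18", "18", "26", "18", "14", "18", "9"]
answer, share = majority_vote(sampled)
print(answer, share)
```

The wrong answers here are scattered across three different values, so none of them accumulates enough votes to compete with the repeated answer.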
Typical N
5 to 40 samples. Returns diminish after about 20. Cost scales linearly: N × the cost of a single CoT sample.
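The trade-off can be illustrated with a toy simulation. This is a sketch under strong assumptions, not a measurement: each sample is independently correct with probability `p_correct`, and wrong samples are spread uniformly over `n_wrong` distinct wrong answers. All names and default values are hypothetical.

```python
import random
from collections import Counter

def vote_accuracy(n_samples, p_correct=0.5, n_wrong=4, trials=2000, seed=0):
    """Estimate how often the majority answer is correct in the toy model."""
    rng = random.Random(seed)
    wrong = [f"wrong_{i}" for i in range(n_wrong)]
    hits = 0
    for _ in range(trials):
        answers = [
            "correct" if rng.random() < p_correct else rng.choice(wrong)
            for _ in range(n_samples)
        ]
        winner = Counter(answers).most_common(1)[0][0]
        hits += winner == "correct"
    return hits / trials

def total_cost(n_samples, cost_per_sample):
    """Cost grows linearly with the number of samples."""
    return n_samples * cost_per_sample

# Print estimated accuracy next to cost for a few values of N.
for n in (1, 5, 10, 20, 40):
    print(n, vote_accuracy(n), total_cost(n, cost_per_sample=1.0))
```

In this model the cost column keeps growing at the same rate while the accuracy column flattens out, which is the pattern the rule of thumb describes. Real models will not match these assumptions exactly.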
Only for final-answer tasks
Math with a numeric answer, multiple choice, and extraction with a canonical form. Voting requires answers that can be compared for equality, so it is not useful for open-ended text generation.
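Votes only count together if equivalent answers map to the same key. The sketch below shows one possible canonicalization for numeric answers, assuming answers arrive as short strings; the helper name `canonical_number` and the set of stripped characters are illustrative choices.

```python
import re
from collections import Counter
from decimal import Decimal, InvalidOperation

def canonical_number(text):
    """Map a raw numeric answer to a canonical string, or None if unparseable."""
    # Drop thousands separators, whitespace, and a currency sign.
    cleaned = re.sub(r"[,\s$]", "", text)
    try:
        value = Decimal(cleaned)
    except InvalidOperation:
        return None
    if not value.is_finite():
        return None
    # normalize() removes trailing zeros; "f" avoids exponent notation.
    return format(value.normalize(), "f")

# Hypothetical raw answers from five samples.
raw = ["1,000", "1000.0", " $1000", "999", "n/a"]
keys = [canonical_number(r) for r in raw]
votes = Counter(k for k in keys if k is not None)
print(votes.most_common(1)[0][0])
```

Without this step, the three spellings of the same value would split into three separate one-vote answers.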
Weighted variants
Weight each sample's vote by confidence, coherence, or verifier score. Typically only a marginal improvement over a uniform majority vote.
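A weighted vote can be sketched as summing a per-sample weight for each answer instead of counting. The weights below are hypothetical verifier scores in [0, 1]; how such scores are produced is outside this sketch.

```python
from collections import defaultdict

def weighted_vote(samples):
    """Return the answer with the largest total weight.

    samples: iterable of (answer, weight) pairs.
    """
    totals = defaultdict(float)
    for answer, weight in samples:
        totals[answer] += weight
    if not totals:
        raise ValueError("need at least one sample")
    return max(totals, key=totals.get)

# Hypothetical (answer, verifier score) pairs.
scored = [("18", 0.9), ("18", 0.8), ("26", 0.95), ("26", 0.3), ("26", 0.2)]
print(weighted_vote(scored))

# Setting every weight to 1.0 recovers the uniform majority vote.
uniform = [(answer, 1.0) for answer, _ in scored]
print(weighted_vote(uniform))
```

In this constructed example the two schemes disagree, because two of the three votes for "26" carry low scores. Cases like this are the only ones where weighting changes the outcome.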