Why it works

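Sampling chain-of-thought (CoT) at a nonzero temperature explores multiple reasoning paths. Wrong paths tend to diverge to different answers, while the correct answer tends to appear repeatedly. A majority vote over the final answers is therefore more robust than any single sample.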
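
The voting step can be sketched in a few lines. This is a minimal illustration, assuming the final answers have already been extracted from each sampled reasoning path as strings; the function name `majority_vote` and the sample list are hypothetical.

```python
from collections import Counter

def majority_vote(answers):
    """Return the most frequent final answer and its share of the votes."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(answers)
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(answers)

# Hypothetical final answers extracted from 7 sampled reasoning paths.
# The wrong answers disagree with each other; the correct one repeats.
samples = ["18", "18", "21", "18", "17", "18", "24"]
answer, share = majority_vote(samples)
print(answer, round(share, 2))
```

The vote share is a rough agreement signal: a low share suggests the samples did not converge on one answer.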

Typical N

Typically 5-40 samples, with diminishing returns after about 20. Cost scales linearly: N samples cost N times a single CoT call.
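
A toy calculation shows why returns diminish while cost keeps growing linearly. It rests on strong assumptions that are not from the text: samples are independent, each is correct with the same probability `p`, and there are only two possible answers, so the vote is right exactly when a strict majority is right. Real samples are correlated, so treat the numbers as illustrative only.

```python
from math import comb

def majority_accuracy(n, p):
    """P(a strict majority of n independent samples is correct), binary toy model."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(n // 2 + 1, n + 1))

p = 0.6  # assumed per-sample accuracy, chosen only for illustration
for n in (1, 5, 11, 21, 41):
    # Cost is expressed in multiples of one CoT call.
    print(f"N={n:>2}  cost={n}x  majority accuracy={majority_accuracy(n, p):.3f}")
```

Under these assumptions, going from 5 to 21 samples buys more accuracy than going from 21 to 41, even though the second step costs more.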

Only for final-answer tasks

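Voting needs answers that can be compared for equality: math problems with a numeric answer, multiple-choice questions, and extraction tasks whose output has a canonical form. It is not useful for open-ended text generation, where samples rarely match exactly.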
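
Before voting, answers usually need to be mapped to a canonical form so that equivalent outputs count as the same vote. The sketch below is one hypothetical normalizer, not a standard routine: it assumes answers are either plain numbers (possibly with a `$`, thousands separators, or a trailing period) or short labels such as multiple-choice letters.

```python
from decimal import Decimal, InvalidOperation

def canonicalize(raw):
    """Map a raw final answer to a canonical string for voting."""
    text = raw.strip().rstrip(".").replace(",", "").lstrip("$")
    try:
        # Numeric answers: drop trailing zeros, print in fixed-point notation.
        return format(Decimal(text).normalize(), "f")
    except InvalidOperation:
        # Non-numeric answers (e.g. choice labels): compare case-insensitively.
        return text.lower()

# Hypothetical raw answers that should all count as the same vote.
raw_answers = ["18", "18.0", " $18 ", "18."]
print({canonicalize(a) for a in raw_answers})
```

Tasks with richer answer formats (fractions, units, dates) would need their own rules.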

Weighted variants

Weight each sample's vote by its confidence, coherence, or verifier score instead of counting all samples equally. The improvement over a uniform majority vote is usually marginal.
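
A weighted vote can be sketched by summing a score per answer rather than counting samples. The function name `weighted_vote` and the scores below are hypothetical; where the weights come from (model confidence, a coherence measure, a verifier) is left open.

```python
from collections import defaultdict

def weighted_vote(scored_answers):
    """Return the answer whose samples have the largest total weight."""
    if not scored_answers:
        raise ValueError("need at least one scored answer")
    totals = defaultdict(float)
    for answer, weight in scored_answers:
        totals[answer] += weight
    return max(totals, key=totals.get)

# Hypothetical (answer, score) pairs. "21" has more votes (3 vs 2),
# but "18" has the higher total weight (1.6 vs 1.2).
scored = [("18", 0.9), ("21", 0.4), ("18", 0.7), ("21", 0.5), ("21", 0.3)]
print(weighted_vote(scored))
```

With all weights set to 1.0 this reduces to the uniform majority vote.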