Self-Consistency — Sample K, Pick Majority

Why it works

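Chain-of-thought (CoT) sampling at a nonzero temperature explores multiple reasoning paths. Wrong paths tend to diverge toward different answers, while the correct answer appears repeatedly. A majority vote over the final answers is therefore more robust than any single sample.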
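The voting step can be sketched in a few lines. This is a minimal illustration, not a reference implementation: `sample_answer` is a hypothetical callable standing in for "run one CoT completion at nonzero temperature and return its final answer", and the canned answers below replace real model calls.

```python
from collections import Counter
from typing import Callable


def self_consistency(sample_answer: Callable[[], str], k: int = 10) -> str:
    """Draw k sampled answers and return the most frequent one."""
    # sample_answer is a hypothetical stand-in for one sampled CoT run.
    answers = [sample_answer() for _ in range(k)]
    # most_common(1) returns [(answer, count)]; ties go to the answer seen first.
    return Counter(answers).most_common(1)[0][0]


# Canned answers in place of model calls: the wrong ones disagree with each other.
canned = iter(["42", "41", "42", "17", "42"])
print(self_consistency(lambda: next(canned), k=5))  # prints 42
```

Only the final answers are compared; the reasoning text of each sample is discarded before voting.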

Typical sample counts are 5 to 40, with diminishing returns after about 20. Cost scales linearly: K samples cost K × a single CoT run.

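Works where final answers can be compared exactly: math problems with a numeric answer, multiple-choice questions, and extraction tasks whose output can be normalized to a canonical form. Not useful for open-ended text generation, where samples rarely match exactly and there is nothing to vote on.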
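Voting only works if equivalent answers compare equal, so answers are usually normalized first. The sketch below shows one possible canonical form for numeric answers. The function name `canonical_number`, the regular expression, and the "take the last number in the text" rule are all assumptions made for illustration, and real answer formats may need more care.

```python
import re
from fractions import Fraction
from typing import Optional


def canonical_number(text: str) -> Optional[str]:
    """Return the last number found in text in a normalized form, or None."""
    # Matches integers and decimals, optionally negative, with thousands commas.
    matches = re.findall(r"-?\d[\d,]*(?:\.\d+)?", text)
    if not matches:
        return None
    cleaned = matches[-1].replace(",", "")
    # Fraction parses decimal strings exactly, so "4", "4.0" and "4.00" agree.
    return str(Fraction(cleaned))


print(canonical_number("So the answer is 4.0"))  # prints 4
print(canonical_number("x = 4."))                # prints 4
```

Non-integer values come back as exact fractions (for example, "0.50" becomes "1/2"). That is fine for voting, since only equality between answers matters.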

A variant weights each sample by confidence, coherence, or a verifier score instead of counting every vote equally. The improvement over uniform majority voting is marginal.
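For comparison with uniform voting, a weighted vote can be sketched as follows. The weights here are made-up numbers for illustration. In practice they would come from whatever confidence, coherence, or verifier score is available, and `weighted_vote` is a hypothetical helper name.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def weighted_vote(scored: Iterable[Tuple[str, float]]) -> str:
    """Sum the weights per answer and return the answer with the largest total."""
    totals: Dict[str, float] = defaultdict(float)
    for answer, weight in scored:
        totals[answer] += weight
    # max() raises ValueError on empty input, so the caller must pass at least one sample.
    return max(totals, key=totals.get)


# Illustrative weights only: one high-scoring "A" outweighs two low-scoring "B"s.
samples = [("A", 0.9), ("B", 0.3), ("B", 0.3)]
print(weighted_vote(samples))  # prints A
```

With every weight set to 1.0 this reduces to the uniform majority vote, which would pick "B" for the samples above.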