Debate Prompting — Multiple Personas Discuss

Setup

Agent A argues position X. Agent B argues the counter-position. Each rebuts the other's points over 2-4 rounds. A judge then scores both sides and selects the winner.
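The setup above can be sketched as a short loop. This is a minimal illustration, not a reference implementation: `run_debate`, the `LLM` type alias, and the prompt wording are all hypothetical, and each agent is assumed to be any callable that takes a prompt string and returns a completion string.

```python
from typing import Callable, List, Tuple

# Assumption: a model is any function mapping a prompt string to a reply string.
LLM = Callable[[str], str]


def run_debate(
    question: str,
    agent_a: LLM,
    agent_b: LLM,
    judge: LLM,
    rounds: int = 3,
) -> Tuple[str, List[Tuple[str, str]]]:
    """Run a two-agent debate and return (judge verdict, transcript)."""
    transcript: List[Tuple[str, str]] = []

    def render() -> str:
        # Flatten the transcript so every turn sees the full history.
        return "\n".join(f"{who}: {text}" for who, text in transcript)

    for r in range(1, rounds + 1):
        # A speaks first in each round, then B rebuts.
        for name, agent, stance in (
            ("A", agent_a, "for the position"),
            ("B", agent_b, "against the position"),
        ):
            prompt = (
                f"Question: {question}\n"
                f"You are Agent {name}, arguing {stance}.\n"
                f"Round {r} of {rounds}. Rebut the other side's latest points.\n"
                f"Transcript so far:\n{render()}"
            )
            transcript.append((name, agent(prompt)))

    # The judge sees the whole debate once, after the final round.
    verdict = judge(
        f"Question: {question}\n"
        f"Transcript:\n{render()}\n"
        "Score each side and answer with the winner: A or B."
    )
    return verdict.strip(), transcript
```

Passing the agents and judge as plain callables keeps the loop independent of any particular model API, so the same function works with stubs in tests and with real model clients.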

Use for contentious questions with multiple legitimate views, and for complex reasoning where a single chain-of-thought (CoT) pass would miss counterarguments.

Pitfall: agents tend to converge to consensus (agreement bias). To counter this, prompt each agent to hold its position firmly, or use different base models for the two agents.
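One way to apply the prompt-based mitigation is to wrap each agent so that a position-holding instruction is prepended to every turn. The sketch below is illustrative only: `hold_position` and the instruction wording are hypothetical, and the agent is again assumed to be a callable from prompt string to reply string.

```python
from typing import Callable

# Assumption: a model is any function mapping a prompt string to a reply string.
LLM = Callable[[str], str]


def hold_position(agent: LLM, stance: str) -> LLM:
    """Return an agent that is reminded of its assigned stance on every call."""
    instruction = (
        f"Your assigned position is: {stance}. "
        "Defend it throughout the debate. Do not switch sides or settle for a "
        "compromise; answer the opponent's points with counterarguments."
    )

    def wrapped(prompt: str) -> str:
        # Repeat the instruction each turn so it is not diluted by a long transcript.
        return agent(instruction + "\n\n" + prompt)

    return wrapped
```

The alternative mitigation, using different base models, needs no wrapper: pass two different model callables as the two agents.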

Irving et al. (2018, "AI safety via debate"): human judges aided by debate could supervise superhuman models. This is a theoretical proposal and remains an area of active research.