Monte Carlo Tree Search (MCTS)

Four phases

Selection (walk tree via UCB), Expansion (add new node), Simulation (random rollout), Backpropagation (update statistics up tree).
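
The four phases above can be sketched on a toy game. This is a minimal illustration, not a reference implementation: the game (a single pile of stones, take 1 to 3 per turn, whoever takes the last stone wins), the `Node` class, and the names `rollout` and `mcts` are all assumptions made up for this example.

```python
import math
import random

class Node:
    """One state in the search tree (hypothetical toy structure)."""
    def __init__(self, stones, parent=None, move=None):
        self.stones = stones      # stones left in the pile
        self.parent = parent
        self.move = move          # move that led from parent to this node
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits = 0
        self.wins = 0.0           # wins for the player who moved INTO this node

def ucb(node, c=math.sqrt(2)):
    # win_rate + c * sqrt(ln(parent_visits) / node_visits)
    return (node.wins / node.visits
            + c * math.sqrt(math.log(node.parent.visits) / node.visits))

def rollout(stones, rng):
    """Random playout. Returns True if the player to move at `stones` wins."""
    turn = 0                      # 0 = the player to move at the start
    while stones > 0:
        stones -= rng.choice([m for m in (1, 2, 3) if m <= stones])
        if stones == 0:
            return turn == 0      # whoever took the last stone wins
        turn ^= 1
    return False                  # no stones at the start: previous player already won

def mcts(root_stones, iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(root_stones)
    for _ in range(iterations):
        node = root
        # 1. Selection: walk down via UCB while the node is fully expanded.
        while not node.untried and node.children:
            node = max(node.children, key=ucb)
        # 2. Expansion: add one new child for an untried move.
        if node.untried:
            move = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation: random rollout from the new node's state.
        #    The player to move there is the opponent of the player who moved in.
        mover_won = not rollout(node.stones, rng)
        # 4. Backpropagation: update statistics up the tree, flipping perspective.
        while node is not None:
            node.visits += 1
            if mover_won:
                node.wins += 1
            mover_won = not mover_won
            node = node.parent
    # Pick the most-visited move at the root.
    return max(root.children, key=lambda n: n.visits).move

if __name__ == "__main__":
    print(mcts(5))
```

In this toy game the winning strategy is to leave the opponent a multiple of 4 stones, so from a pile of 5 the search should settle on taking 1. Returning the most-visited child rather than the one with the highest win rate is one common choice, not the only one.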

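UCB(node) = win_rate + c · √(ln(parent_visits) / node_visits). The first term favors nodes that have done well so far; the second grows for nodes that have been visited rarely. Together they balance visited-good vs unvisited-uncertain. c is the exploration constant (√2 is a common default).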
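
A small worked example of the formula, as a sketch. The function signature, the default c = √2, and the choice to score unvisited nodes as infinity are assumptions for illustration; the visit counts are made-up numbers.

```python
import math

def ucb(wins, node_visits, parent_visits, c=math.sqrt(2)):
    """UCB score: win_rate + c * sqrt(ln(parent_visits) / node_visits)."""
    if node_visits == 0:
        return math.inf  # assumption: unvisited nodes are always tried first
    win_rate = wins / node_visits
    exploration = c * math.sqrt(math.log(parent_visits) / node_visits)
    return win_rate + exploration

# Two siblings under a parent with 110 visits (made-up numbers).
well_explored = ucb(wins=60, node_visits=100, parent_visits=110)
barely_explored = ucb(wins=3, node_visits=10, parent_visits=110)
print(well_explored, barely_explored)
```

With these numbers the scores come out at roughly 0.91 and 1.27: the rarely visited node wins the comparison despite its lower win rate, because its exploration term is still large.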

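Minimax: perfect play if the evaluation function is accurate. MCTS: no evaluation function needed; it estimates move values from simulations. Anytime: the search can be stopped at any point and still return the best move found so far.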
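
The anytime property can be sketched as a loop that runs iterations until a time budget expires and then reports the current best move. The names `anytime_search`, `one_iteration`, and `best_move` are hypothetical, and the toy search here is flat Monte Carlo over three moves with made-up win probabilities, not a full tree search.

```python
import random
import time

def anytime_search(run_one_iteration, current_best, budget_seconds):
    """Run search iterations until the time budget expires, then answer."""
    deadline = time.monotonic() + budget_seconds
    iterations = 0
    while time.monotonic() < deadline:
        run_one_iteration()
        iterations += 1
    return current_best(), iterations

# Toy stand-in for a search: three moves with hidden win probabilities.
rng = random.Random(0)
true_win_prob = {"a": 0.3, "b": 0.7, "c": 0.5}   # made-up values
stats = {m: [0, 0] for m in true_win_prob}        # move -> [wins, visits]

def one_iteration():
    # Simulate one random move and record whether it won.
    move = rng.choice(list(stats))
    stats[move][1] += 1
    if rng.random() < true_win_prob[move]:
        stats[move][0] += 1

def best_move():
    # Best observed win rate among moves tried so far.
    rates = {m: w / n for m, (w, n) in stats.items() if n > 0}
    return max(rates, key=rates.get) if rates else None

move, iterations = anytime_search(one_iteration, best_move, budget_seconds=0.05)
print(move, iterations)
```

A larger budget gives more iterations and a more reliable answer, but any budget that allows at least one iteration yields a usable move.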

AlphaGo: MCTS + neural network policy/value. The neural net guides the search, MCTS refines its estimates. Beat Lee Sedol in 2016.