▶ Interactive Lab

Backprop Chain Rule

Watch gradients flow backwards through a 3-layer net.

Forward: compute outputs. Backward: compute gradients via chain rule.

What you're seeing

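3-layer network: x → h1 → h2 → ŷ. The forward pass computes activations layer by layer. The backward pass computes ∂L/∂params with the chain rule: it starts at the loss and works back toward the input, multiplying in each layer's local Jacobian until it reaches that layer's weights.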
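Written out, the chain rule for this network looks roughly like the following. The weight names W_1, W_2, W_3 (one per layer) are an assumption for illustration; the lab itself only names x, h1, h2, ŷ and the loss L.

```latex
\begin{aligned}
\frac{\partial L}{\partial W_3} &= \frac{\partial L}{\partial \hat{y}} \, \frac{\partial \hat{y}}{\partial W_3} \\
\frac{\partial L}{\partial W_2} &= \frac{\partial L}{\partial \hat{y}} \, \frac{\partial \hat{y}}{\partial h_2} \, \frac{\partial h_2}{\partial W_2} \\
\frac{\partial L}{\partial W_1} &= \frac{\partial L}{\partial \hat{y}} \, \frac{\partial \hat{y}}{\partial h_2} \, \frac{\partial h_2}{\partial h_1} \, \frac{\partial h_1}{\partial W_1}
\end{aligned}
```

Each line reuses the leading factors of the line above it, which is why the backward pass can carry one running gradient from the loss toward the input instead of recomputing the product for every layer.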

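PyTorch's autograd records a computation graph during the forward pass and traverses it in reverse when you call backward. You write only the forward pass; the backward pass is free in the sense that you write no gradient code, though it still costs compute and memory.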
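A minimal sketch of what this looks like in PyTorch. The layer widths, the tanh activations, the random data, and the MSE loss are all assumptions chosen for illustration; the lab does not specify them.

```python
import torch

torch.manual_seed(0)

# Hypothetical data: a batch of 4 inputs with 3 features, and 4 scalar targets.
x = torch.randn(4, 3)
y = torch.randn(4, 1)

# Three linear layers, matching x -> h1 -> h2 -> y_hat.
model = torch.nn.Sequential(
    torch.nn.Linear(3, 5), torch.nn.Tanh(),   # x  -> h1
    torch.nn.Linear(5, 5), torch.nn.Tanh(),   # h1 -> h2
    torch.nn.Linear(5, 1),                    # h2 -> y_hat
)

# Forward pass: autograd records the operations as they run.
y_hat = model(x)
loss = torch.nn.functional.mse_loss(y_hat, y)

# Backward pass: autograd walks the recorded graph in reverse and
# stores dL/dparam in each parameter's .grad attribute.
loss.backward()

# Each gradient has the same shape as the parameter it belongs to.
grad_shapes = {name: tuple(p.grad.shape) for name, p in model.named_parameters()}
```

No gradient formulas appear anywhere in the script; the only backward-related line is the call to `loss.backward()`.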

★ KEY TAKEAWAY
Backward = the forward graph, traversed in reverse. Each layer's gradient is the chain rule applied to its local Jacobian.
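To see the reverse traversal without any library, here is a hand-written sketch for a scalar version of the network. The scalar weights w1, w2, w3, the tanh activations, and the squared-error loss are assumptions made to keep the example small. A finite-difference estimate is included as a sanity check on the hand-derived gradients.

```python
import math

def forward(x, w1, w2, w3):
    """Forward pass: x -> h1 -> h2 -> y_hat, one scalar per layer."""
    h1 = math.tanh(w1 * x)
    h2 = math.tanh(w2 * h1)
    y_hat = w3 * h2
    return h1, h2, y_hat

def loss_fn(y_hat, y):
    return 0.5 * (y_hat - y) ** 2

def backward(x, y, w1, w2, w3):
    """Backward pass: visit the same layers in reverse order.

    A single running gradient is carried from the loss toward the input;
    at each layer it is multiplied by that layer's local derivative.
    """
    h1, h2, y_hat = forward(x, w1, w2, w3)

    dL_dyhat = y_hat - y                 # derivative of the loss itself

    dL_dw3 = dL_dyhat * h2               # layer 3: y_hat = w3 * h2
    dL_dh2 = dL_dyhat * w3

    dL_dz2 = dL_dh2 * (1 - h2 ** 2)      # layer 2: tanh'(z) = 1 - tanh(z)^2
    dL_dw2 = dL_dz2 * h1
    dL_dh1 = dL_dz2 * w2

    dL_dz1 = dL_dh1 * (1 - h1 ** 2)      # layer 1
    dL_dw1 = dL_dz1 * x

    return dL_dw1, dL_dw2, dL_dw3

def numeric_grad(x, y, ws, i, eps=1e-6):
    """Central finite difference of the loss with respect to ws[i]."""
    up = list(ws)
    up[i] += eps
    dn = list(ws)
    dn[i] -= eps
    loss_up = loss_fn(forward(x, *up)[2], y)
    loss_dn = loss_fn(forward(x, *dn)[2], y)
    return (loss_up - loss_dn) / (2 * eps)

# Arbitrary example values.
x, y = 0.5, 1.0
ws = (0.8, -1.2, 0.3)

analytic = backward(x, y, *ws)
numeric = tuple(numeric_grad(x, y, ws, i) for i in range(3))
```

The `analytic` and `numeric` tuples should agree to several decimal places. The intermediate values `dL_dh2` and `dL_dh1` are the running gradient handed from one layer to the one before it.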
▶ WHAT TO TRY
  • Click Forward pass to see activations flow left-to-right.
  • Click Backward pass to see gradients flow back right-to-left.
  • This is what PyTorch's autograd does for you, automatically, for any computation graph.