▶ Interactive Lab

LoRA — Low-Rank Decomposition

Replace ΔW (d×d) with A·B (d×r · r×d).

Advertisement
Full ΔW: d×d params. LoRA A·B: 2·d·r params. Savings = d²/(2dr) = d/(2r).

What you're seeing

r=8-32 captures most fine-tuning gains. Up to ~250× fewer trainable params.

★ KEY TAKEAWAY
LoRA: replace full ΔW (d×d) with A·B (d×r and r×d). Up to 250× fewer trainable params. Standard fine-tuning recipe in 2026.
▶ WHAT TO TRY
  • Slide r down to 4 — see how few params get trained.
  • Up to r=64 captures most fine-tuning gains; diminishing returns after.