▶ Interactive Lab

RoPE Extension Strategies

Linear, NTK-aware, YaRN compared.

Different strategies for stretching RoPE beyond trained context.

What you're seeing

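Linear (position interpolation, PI): divide every position by the scale factor. NTK-aware: enlarge the frequency base, so the fastest dimensions stay nearly unchanged and the slowest are interpolated. YaRN: per-dimension interpolation (NTK-by-parts) plus an attention temperature.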
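The three rules can be sketched as transformations of the RoPE inverse frequencies. This is a minimal illustration, not the lab's own code: the function names, the head dimension of 64, the training length of 4096, and the YaRN ramp bounds (`beta_fast=32`, `beta_slow=1`, the values suggested for LLaMA-style models in the YaRN paper) are all assumptions chosen for the example.

```python
import math

def rope_inv_freq(dim, base=10000.0):
    """Standard RoPE inverse frequencies: base^(-2i/dim) for i in [0, dim/2)."""
    return [base ** (-2.0 * i / dim) for i in range(dim // 2)]

def linear_pi(dim, scale, base=10000.0):
    """Linear PI: dividing positions by `scale` equals dividing every frequency by `scale`."""
    return [f / scale for f in rope_inv_freq(dim, base)]

def ntk_aware(dim, scale, base=10000.0):
    """NTK-aware: enlarge the base so the slowest dimension is interpolated by
    exactly `scale` while the fastest dimension is left untouched."""
    new_base = base * scale ** (dim / (dim - 2))
    return rope_inv_freq(dim, new_base)

def yarn(dim, scale, base=10000.0, train_len=4096, beta_fast=32.0, beta_slow=1.0):
    """YaRN (NTK-by-parts): blend per dimension between keeping the frequency
    and interpolating it, based on how many full rotations that dimension
    completes over the training length. Ramp bounds are assumed defaults."""
    out = []
    for f in rope_inv_freq(dim, base):
        rotations = train_len * f / (2.0 * math.pi)
        # ramp = 1 -> keep the original frequency, ramp = 0 -> fully interpolate
        ramp = min(1.0, max(0.0, (rotations - beta_slow) / (beta_fast - beta_slow)))
        out.append((1.0 - ramp) * f / scale + ramp * f)
    return out

def yarn_attention_factor(scale):
    """Factor applied to queries and keys for YaRN's attention temperature,
    using the paper's recommended fit 0.1 * ln(scale) + 1."""
    return 0.1 * math.log(scale) + 1.0

if __name__ == "__main__":
    dim, scale = 64, 4.0
    for name, freqs in [("linear", linear_pi(dim, scale)),
                        ("ntk", ntk_aware(dim, scale)),
                        ("yarn", yarn(dim, scale))]:
        print(f"{name:6s} fastest={freqs[0]:.4f} slowest={freqs[-1]:.3e}")
```

In this sketch, linear PI slows every dimension by the same factor, while NTK-aware and YaRN leave the fastest dimension at its original frequency and slow only the low-frequency end.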

★ KEY TAKEAWAY
RoPE extension stretches a model trained on a short context to a longer one. NTK-aware and YaRN preserve high-frequency (local position) information; linear PI compresses every frequency equally and blurs it.
▶ WHAT TO TRY
  • Switch between extension strategies and see how the RoPE angle pattern changes past the training length.
  • For comparison: Phi-3's long-context variants use LongRoPE, a method aimed at extreme lengths (128K+).
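To make the "past training length" behaviour concrete, the sketch below computes rotation angles at the last position of a 4x-extended context. It is an illustration under assumed values (head dimension 64, training length 4096, scale 4) with a hypothetical `angle` helper, not the code behind the lab.

```python
def angle(pos, i, dim, base=10000.0, pos_scale=1.0):
    """Rotation angle of RoPE dimension pair `i` at position `pos`.
    `pos_scale` > 1 applies linear position interpolation."""
    return (pos / pos_scale) * base ** (-2.0 * i / dim)

dim, train_len, scale = 64, 4096, 4.0
last_pos = int(train_len * scale) - 1
slowest = dim // 2 - 1                              # lowest-frequency pair
ntk_base = 10000.0 * scale ** (dim / (dim - 2))     # NTK-aware base

# Upper bound on the slowest pair's angle during training.
trained_bound = angle(train_len, slowest, dim)

# Slowest pair at the end of the extended context.
unscaled = angle(last_pos, slowest, dim)                    # leaves the trained range
linear = angle(last_pos, slowest, dim, pos_scale=scale)     # pulled back inside it
ntk = angle(last_pos, slowest, dim, base=ntk_base)          # same effect on this pair

# Fastest pair: angle gap between two adjacent tokens.
linear_step = angle(1, 0, dim, pos_scale=scale)   # shrinks to 1/scale radians
ntk_step = angle(1, 0, dim, base=ntk_base)        # stays at 1 radian

if __name__ == "__main__":
    print(f"trained bound {trained_bound:.4f}")
    print(f"unscaled {unscaled:.4f}  linear {linear:.4f}  ntk {ntk:.4f}")
    print(f"adjacent-token step: linear {linear_step:.2f}  ntk {ntk_step:.2f}")
```

Under these assumptions, both linear PI and NTK-aware scaling bring the slowest pair back inside the range seen in training, but only linear PI also shrinks the adjacent-token step of the fastest pair.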