Rotation mechanics, frequency spectra, and distance-dependent attention decay.
RoPE encodes position by rotating Q and K vectors in 2D subspaces. Each consecutive pair (x_{2i}, x_{2i+1}) at position m is multiplied by the rotation matrix

    R(m·θ_i) = [ cos(m·θ_i)  −sin(m·θ_i) ]
               [ sin(m·θ_i)   cos(m·θ_i) ]

where m is the position and θ_i = 10000^(−2i/d) for dimension pair i, with d the head dimension.
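A minimal NumPy sketch of this rotation, assuming a single head vector of even dimension (the function name `rope_rotate` is ours, not a library API):

```python
import numpy as np

def rope_rotate(x: np.ndarray, m: int, base: float = 10000.0) -> np.ndarray:
    """Rotate each 2D pair (x[2i], x[2i+1]) by angle m * theta_i."""
    d = x.shape[-1]
    theta = base ** (-2.0 * np.arange(d // 2) / d)  # theta_i = base^(-2i/d)
    cos, sin = np.cos(m * theta), np.sin(m * theta)
    out = np.empty_like(x)
    out[0::2] = x[0::2] * cos - x[1::2] * sin
    out[1::2] = x[0::2] * sin + x[1::2] * cos
    return out

x = np.random.default_rng(0).standard_normal(64)
y = rope_rotate(x, 7)  # pure rotation: the vector's norm is preserved
```

Because each 2D block is a pure rotation, the transform preserves vector norms, so it changes only the phase information that attention later compares.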
The dot product between positions m and n then depends only on the content vectors and their relative offset:

    ⟨f_q(x_m, m), f_k(x_n, n)⟩ = g(x_m, x_n, m − n)
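This relative-position property can be checked numerically: shifting both positions by the same amount leaves the score unchanged. A small sketch (the helper `rope` is ours, using the equivalent complex-number form of the rotation):

```python
import numpy as np

def rope(x: np.ndarray, m: int, base: float = 10000.0) -> np.ndarray:
    # Treat consecutive pairs as complex numbers; rotate by e^{i * m * theta_i}.
    d = x.shape[-1]
    theta = base ** (-2.0 * np.arange(d // 2) / d)
    z = (x[0::2] + 1j * x[1::2]) * np.exp(1j * m * theta)
    out = np.empty_like(x)
    out[0::2], out[1::2] = z.real, z.imag
    return out

rng = np.random.default_rng(1)
q, k = rng.standard_normal(64), rng.standard_normal(64)

# Same relative offset m - n = -7, absolute positions shifted by 1000:
s_near = rope(q, 5) @ rope(k, 12)
s_far = rope(q, 1005) @ rope(k, 1012)
```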
Because the magnitude of this inner product is upper-bounded by a quantity that shrinks as the relative distance |m − n| grows, attention naturally decays with distance. This makes attention patterns predictable: recent tokens tend to receive systematically higher scores than distant ones.
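One way to visualize the decay (a heuristic proxy, not the exact bound from the RoPE paper): the magnitude of the summed phasors |Σ_j e^(i·s·θ_j)| is largest at relative distance s = 0, where all terms align, and the terms increasingly cancel as s grows.

```python
import numpy as np

d, base = 64, 10000.0
theta = base ** (-2.0 * np.arange(d // 2) / d)

# |sum_j exp(i * s * theta_j)| as a rough proxy for the attention-score
# upper bound at relative distance s.
s = np.arange(512)
bound = np.abs(np.exp(1j * np.outer(s, theta)).sum(axis=1))
```

The curve is not strictly monotone, but its envelope falls off with distance, which is the regularity the prefetcher relies on.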
We exploit this for RoPE-aware prefetching — tokens near the current decode position are prefetched with higher priority.
Combined with EMA scores, this achieves prefetch hit rates above 90%.
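A minimal sketch of how such a ranking could combine the two signals. All names here (`Token`, `priority`, `beta`, the 1/distance recency term) are illustrative assumptions, not the system's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class Token:
    pos: int     # absolute position in the sequence
    ema: float   # EMA of this token's past attention mass, in [0, 1]

def priority(t: Token, decode_pos: int, beta: float = 0.5) -> float:
    # Blend a RoPE-style recency term (higher near the current decode
    # position) with the token's historical importance (EMA score).
    dist = max(decode_pos - t.pos, 1)
    recency = 1.0 / dist  # stand-in for the decaying attention bound
    return beta * recency + (1.0 - beta) * t.ema

def prefetch_order(tokens: list[Token], decode_pos: int, k: int) -> list[Token]:
    # Prefetch the k highest-priority offloaded tokens first.
    return sorted(tokens, key=lambda t: priority(t, decode_pos), reverse=True)[:k]

toks = [Token(pos=990, ema=0.1), Token(pos=100, ema=0.1), Token(pos=100, ema=0.9)]
top = prefetch_order(toks, decode_pos=1000, k=2)
```

Under this weighting, a distant token with a high EMA score can still outrank a nearby low-importance one, which is what lets the prefetcher keep historically "heavy" tokens warm.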