Rotary Encoding
RoPE encodes position by rotating Q/K vectors. Nearby positions have similar rotations → higher dot product.
Locality Bias
On average, ~70% of attention mass falls within ±W positions of the query, even for long contexts.
Predictable Access
If GPU requests position P, it will likely need P±W soon. Prefetch proactively.