© 2025 Subramaniyam (Sam) Pooni
All Rights Reserved
Proprietary & Confidential
Technical Appendix

Deep Dives and Reference Material

Comprehensive technical background covering transformer fundamentals, attention mechanisms, CXL protocols, and implementation details.

A. Transformer Architecture Fundamentals
   Layer structure, parameter counts, memory footprint by precision

B. Attention Mechanism Deep Dive
   Q/K/V computation, multi-head attention, grouped-query attention

C. KV-Cache Structure and Mathematics
   Size formulas, growth analysis, multi-user scaling

D. Rotary Position Embeddings (RoPE)
   Rotation mechanics, frequency spectra, locality properties

E. Attention Head Specialization
   Recency heads, anchor heads, retrieval heads, syntactic heads

F. CXL Technology Primer
   Protocol details, latency breakdown, comparison with PCIe

G. EMA Scoring Algorithm
   Mathematical foundation, α parameter selection, practical examples

H. Memory Hierarchy and Caching Theory
   Three-tier design, effective latency formulas, hit-rate analysis

I. Bandwidth and Latency Calculations
   Decode bandwidth requirements, prefetch window sizing

J. Implementation Reference
   Code examples, configuration parameters, API reference