© 2025 Subramaniyam (Sam) Pooni
All Rights Reserved
Proprietary & Confidential
KV-Cache Architecture
Chapter 2
Background: LLM Inference Fundamentals
Understanding prefill vs decode, the roofline model, and why KV-cache exists.
8 figures in chapter · 2 inference phases · 1000× recompute savings
2.1 Inference Fundamentals
Visual Appendix — Background Figures
2.2 Prefill vs Decode Phases
Figures 2.1–2.3 — Inference Phases
Section 2.1 — Inference Fundamentals Deep Dive
2.3 Why KV-Cache Exists
Figure 2.4 — With vs Without KV-Cache
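As a rough illustration of the with/without comparison, the sketch below counts how many token positions must be pushed through the model over a whole generation run. It assumes a simplified cost model in which every step without a cache reprocesses the full sequence so far, while a cached run processes the prompt once and then one token per step. The function name `token_passes` and the example lengths are hypothetical, chosen only for illustration.

```python
def token_passes(prompt_len: int, new_tokens: int, use_cache: bool) -> int:
    """Count token positions processed while generating `new_tokens` tokens.

    Simplified model (an assumption, not a measurement): step 0 always
    processes the whole prompt; later steps process either the full
    sequence so far (no cache) or only the newest token (with cache).
    """
    if new_tokens < 1:
        raise ValueError("new_tokens must be at least 1")
    if use_cache:
        # Prefill over the prompt, then one token per remaining decode step.
        return prompt_len + (new_tokens - 1)
    # Without a cache, step i re-processes prompt_len + i positions.
    return sum(prompt_len + step for step in range(new_tokens))


if __name__ == "__main__":
    # Hypothetical lengths, for illustration only.
    without = token_passes(prompt_len=1000, new_tokens=1000, use_cache=False)
    with_cache = token_passes(prompt_len=1000, new_tokens=1000, use_cache=True)
    print(f"without cache: {without} positions, with cache: {with_cache} positions")
```

The gap widens with sequence length, because the uncached total grows quadratically in the number of generated tokens while the cached total grows linearly.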
Figure 2.5 — KV-Cache Growth
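The growth of the cache can be estimated with the standard size formula: two tensors (keys and values) per layer, each holding one vector per KV head per token. The sketch below is a minimal calculator under that formula. The function name `kv_cache_bytes` and the model dimensions in the usage lines are hypothetical and do not describe any specific model.

```python
def kv_cache_bytes(
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    seq_len: int,
    batch_size: int = 1,
    bytes_per_element: int = 2,  # assumes 16-bit storage (e.g. FP16/BF16)
) -> int:
    """Estimate KV-cache size in bytes for a decoder-only transformer."""
    # Factor of 2 covers the separate key and value tensors.
    return (
        2 * num_layers * num_kv_heads * head_dim
        * seq_len * batch_size * bytes_per_element
    )


if __name__ == "__main__":
    # Hypothetical configuration: 32 layers, 32 KV heads, head_dim 128.
    for seq_len in (1024, 4096, 16384):
        size = kv_cache_bytes(32, 32, 128, seq_len)
        print(f"seq_len={seq_len:>6}: {size / 2**30:.2f} GiB")
```

Under these assumptions the cache grows linearly with both sequence length and batch size, which is why long contexts and large batches compete for the same memory.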
2.4 Roofline Model
Figure 2.7 — Roofline Analysis
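The roofline model bounds attainable throughput by the smaller of peak compute and memory bandwidth times arithmetic intensity (FLOPs per byte moved). The sketch below encodes that bound. The function names and the hardware figures in the usage lines are hypothetical placeholders, not specifications of a real accelerator.

```python
def attainable_flops(
    peak_flops: float, mem_bandwidth: float, arithmetic_intensity: float
) -> float:
    """Roofline bound: min(peak compute, bandwidth * FLOPs-per-byte)."""
    return min(peak_flops, mem_bandwidth * arithmetic_intensity)


def ridge_point(peak_flops: float, mem_bandwidth: float) -> float:
    """Arithmetic intensity at which a kernel stops being memory-bound."""
    return peak_flops / mem_bandwidth


if __name__ == "__main__":
    # Hypothetical accelerator: 100 TFLOP/s peak, 1 TB/s memory bandwidth.
    peak, bandwidth = 100e12, 1e12
    print("ridge point (FLOP/byte):", ridge_point(peak, bandwidth))
    # Low intensity (decode-like) vs high intensity (prefill-like) workloads.
    for intensity in (2.0, 500.0):
        bound = attainable_flops(peak, bandwidth, intensity)
        regime = "memory-bound" if intensity < ridge_point(peak, bandwidth) else "compute-bound"
        print(f"intensity={intensity}: {bound:.3g} FLOP/s ({regime})")
```

A workload whose intensity falls below the ridge point is limited by memory traffic rather than arithmetic, which is the usual framing for why decode and prefill behave so differently.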
2.5 Current Approaches
Figure 2.8 — Approaches Comparison Matrix