Two-tier cache model, CXL vs PCIe comparison, and the 65× latency improvement.
| Tier | Media | Latency | Capacity |
|---|---|---|---|
| Tier 1 | Endpoint DDR5 | 250 ns | 1 TB |
| Tier 2 | Endpoint NVMe | 25 μs | 16 TB |
| Fallback | Recompute | 50 ms | ∞ |
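
The lookup order implied by the table can be sketched as follows. This is a minimal illustration, not the system's actual implementation: the function name `tiered_lookup`, the dictionary-backed tiers, and the `recompute` callback are all hypothetical, and the returned latency is simply the nominal figure from the table for whichever tier served the request.

```python
from typing import Callable, Dict, Tuple

# Nominal latencies from the tier table, in nanoseconds.
TIER1_DDR5_NS = 250
TIER2_NVME_NS = 25_000
RECOMPUTE_NS = 50_000_000


def tiered_lookup(
    key: str,
    dram: Dict[str, bytes],
    flash: Dict[str, bytes],
    recompute: Callable[[str], bytes],
) -> Tuple[bytes, str, int]:
    """Return (value, source, nominal latency in ns) for one lookup.

    Hypothetical model: plain dicts stand in for the DDR5 and NVMe tiers.
    """
    if key in dram:
        return dram[key], "tier1", TIER1_DDR5_NS
    if key in flash:
        return flash[key], "tier2", TIER2_NVME_NS
    # Miss in both tiers: fall back to recomputing the value.
    return recompute(key), "recompute", RECOMPUTE_NS
```

The sketch deliberately omits promotion, eviction, and the cost of probing a tier that misses, none of which the table specifies.
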
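
With an 85% DRAM hit rate, a 14% flash hit rate, and a 1% miss rate, the expected access latency is 0.85 × 250 ns + 0.14 × 25 μs + 0.01 × 50 ms ≈ 504 μs. The 1% of accesses that fall back to recompute contribute about 500 μs of that, so the miss rate dominates the average.
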
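
One way to work out the expected latency for this hit distribution is a probability-weighted sum over the three outcomes. The sketch below uses only the figures from the tier table and the stated hit rates; the dictionary keys and the function name `expected_latency_ns` are illustrative choices, not names from any real API.

```python
# Latencies from the tier table, in nanoseconds.
LATENCY_NS = {"dram": 250, "flash": 25_000, "recompute": 50_000_000}
# Hit distribution stated in the text.
PROBABILITY = {"dram": 0.85, "flash": 0.14, "recompute": 0.01}


def expected_latency_ns(latency, probability):
    """Probability-weighted mean latency over all outcomes."""
    assert abs(sum(probability.values()) - 1.0) < 1e-9
    return sum(probability[k] * latency[k] for k in latency)


mean_ns = expected_latency_ns(LATENCY_NS, PROBABILITY)
# Per-outcome contribution to the mean, in nanoseconds.
contributions = {k: PROBABILITY[k] * LATENCY_NS[k] for k in LATENCY_NS}

print(f"expected latency: {mean_ns / 1000:.1f} us")  # expected latency: 503.7 us
for name, ns in contributions.items():
    print(f"  {name}: {ns / 1000:.2f} us")
```

The breakdown shows DRAM hits contributing roughly 0.2 μs, flash hits 3.5 μs, and recompute 500 μs, so lowering the miss rate moves the average far more than speeding up either cache tier.
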

Per-access overhead, CXL.mem versus PCIe DMA:

| Component | CXL.mem | PCIe DMA |
|---|---|---|
| CPU involvement | None | Required (interrupt) |
| Protocol overhead | 100 ns | 2-5 μs |
| Memory access | 100 ns | 100 ns |
| TLB management | Hardware | Software (1-2 μs) |
| Total | ~250 ns | 5-16 μs |
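
CXL.mem eliminates CPU interrupt handling, explicit DMA setup, and software TLB management. The result is ~250 ns versus 16+ μs at the upper end of the PCIe range, which is roughly 65× faster (16 μs / 250 ns = 64×). Against the 5 μs lower bound, the improvement is about 20×.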
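
The speedup can be checked directly from the "Total" row of the comparison table. This is a small arithmetic sketch; the constant and function names are illustrative, and the inputs are the table's approximate totals rather than measurements.

```python
CXL_TOTAL_NS = 250               # "~250 ns" total from the table
PCIE_TOTAL_NS = (5_000, 16_000)  # "5-16 us" total from the table


def speedup(baseline_ns, improved_ns):
    """How many times faster the improved path is than the baseline."""
    return baseline_ns / improved_ns


low, high = (speedup(t, CXL_TOTAL_NS) for t in PCIE_TOTAL_NS)
print(f"speedup range: {low:.0f}x to {high:.0f}x")  # speedup range: 20x to 64x
```

The 16 μs upper bound gives 64×, which is consistent with the quoted ~65× for "16+ μs"; the lower bound of the PCIe range gives 20×.
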