Main Chapters
Understanding why GPU-NVMe integration requires fundamental changes
Motivation: AI & Storage
Why storage is the critical bottleneck for AI infrastructure
- GPU-centric AI infrastructure
- Training pipeline data flows
- CPU-mediated vs GPUDirect paths
Implementation Challenges
Current NVMe limitations & GPU-optimized recommendations
- NVMe assumes a CPU-mediated control plane
- Doorbell serialization crisis
- Interrupt-driven I/O vs GPU polling
Solutions Architecture
Technology roadmap & emerging standards
- GPUDirect Storage deep dive
- CXL memory semantics
- NVMe protocol enhancements (shadow doorbells, batched submission)
Advanced & Hard Truths
Real-world architecture & honest assessments
- What actually works today
- Production deployment patterns
- Cost vs performance tradeoffs