AI Infrastructure
Use this track for the platform layer beneath AI products: backend APIs, model serving, vector retrieval infrastructure, MLOps, and cost/latency economics.
This is not the starting point for AI Engineering. Start with ../01_ai_engineering/ for agent/product architecture, then come here when the system needs concrete infrastructure decisions.
| Module | Focus | Folder |
|---|---|---|
| 00 | AI backend API engineering | 00_ai_backend_api_engineering/ |
| 01 | Model gateway and provider operations | 01_model_gateway_provider_ops/ (placeholder) |
| 02 | Inference serving systems | 02_inference_serving_systems/ |
| 03 | Vector retrieval infrastructure | 03_vector_retrieval_infrastructure/ |
| 04 | ML platform operations | 04_ml_platform_operations/ |
| 05 | Agent performance economics | 05_agent_performance_economics/ |
| 06 | AI runbooks and on-call operations | 06_ai_runbooks_oncall/ (placeholder) |
| 07 | Tool execution sandboxes | 07_tool_execution_sandboxes/ (placeholder) |
| 08 | Distributed training systems: memory wall, data/tensor/pipeline parallelism, ZeRO/FSDP, 3D parallelism, checkpointing at scale | 08_distributed_training_systems/ |
| 09 | GPU acceleration stack: roofline, CUDA/kernel fusion, NCCL, TensorRT-LLM, Triton, NIM, NeMo, MIG/cluster scheduling | 09_gpu_acceleration_stack/ |