AI Infrastructure

Use this track for the platform layer beneath AI products: backend APIs, model serving, vector retrieval infrastructure, MLOps, and cost/latency economics.

This is not the starting point for AI Engineering. Start with ../01_ai_engineering/ for agent/product architecture, then come here when the system needs concrete infrastructure decisions.

| Module | Focus | Folder |
| --- | --- | --- |
| 00 | AI backend API engineering | 00_ai_backend_api_engineering/ |
| 01 | Model gateway and provider operations | 01_model_gateway_provider_ops/ (placeholder) |
| 02 | Inference serving systems | 02_inference_serving_systems/ |
| 03 | Vector retrieval infrastructure | 03_vector_retrieval_infrastructure/ |
| 04 | ML platform operations | 04_ml_platform_operations/ |
| 05 | Agent performance economics | 05_agent_performance_economics/ |
| 06 | AI runbooks and on-call operations | 06_ai_runbooks_oncall/ (placeholder) |
| 07 | Tool execution sandboxes | 07_tool_execution_sandboxes/ (placeholder) |
| 08 | Distributed training systems: memory wall, data/tensor/pipeline parallelism, ZeRO/FSDP, 3D parallelism, checkpointing at scale | 08_distributed_training_systems/ |
| 09 | GPU acceleration stack: roofline, CUDA/kernel fusion, NCCL, TensorRT-LLM, Triton, NIM, NeMo, MIG/cluster scheduling | 09_gpu_acceleration_stack/ |