ML Platform Operations

The chapters in this module, in reading order.

| #  | Chapter |
|----|---------|
| 00 | MLOps & Production — The Five-Year-Old Version |
| 01 | The notebook that worked once — applause in dev, silence in prod |
| 02 | Memory for your training runs — the run is the unit, not the file |
| 03 | The warehouse that holds approved models — experiments are not deployable by default |
| 04 | Full chain of evidence — reproducibility is a stack, not a wish |
| 05 | CI/CD for ML — make retraining boring on purpose |
| 06 | Quality gates for ML — speed with a sober bouncer |
| 07 | Feature stores — one recipe for training and serving |
| 08 | Serving infrastructure — where latency, quality, and cost argue politely |
| 09 | Deployment strategies — change the model without shaking the factory |
| 10 | Monitoring & Drift — when the factory alarms actually matter |
| 11 | Incident Response — runbooks beat brave Slack threads |
| 12 | Cost Optimization in Serving — GPU money burns while you sleep |
| 13 | Honest Admission — MLOps is still messy in the real world |