ML Platform Operations

The chapters in this module, in reading order.

#	Chapter
00	MLOps & Production — The Five-Year-Old Version
01	The notebook that worked once — applause in dev, silence in prod
02	Memory for your training runs — the run is the unit, not the file
03	The warehouse that holds approved models — experiments are not deployable by default
04	Full chain of evidence — reproducibility is a stack, not a wish
05	CI/CD for ML — make retraining boring on purpose
06	Quality gates for ML — speed with a sober bouncer
07	Feature stores — one recipe for training and serving
08	Serving infrastructure — where latency, quality, and cost argue politely
09	Deployment strategies — change the model without shaking the factory
10	Monitoring & Drift — when the factory alarms actually matter
11	Incident Response — runbooks beat brave Slack threads
12	Cost Optimization in Serving — GPU money burns while you sleep
13	Honest Admission — MLOps is still messy in the real world