06. Module 17 Review: MLOps & Production

Focus: lifecycle management, CI/CD for ML, serving infrastructure, monitoring, rollback, and production cost discipline.

Review loop

Re-read the summary sections in 02_explainer.md §6.1-§6.7.
Use 04_daily_recall.md to answer everything cold.
Revisit 03_study_material.md for tables and tool comparisons.
Re-open your deliverables from 05_hands_on_lab.md and defend each design choice aloud.

Which part of the lifecycle still feels hand-wavy to me?
If production quality dropped tomorrow, where would I look first?
Which rollback target would I forget under pressure: code, model, prompt, or data?
Which cost assumption in my 05_hands_on_lab.md deliverables is least trustworthy?
Which MLOps habit would most improve the team I am likely to join next?

Why did the opening failure go unnoticed for weeks? See 02_explainer.md §1.3-§1.4.
What belongs in every tracked run? See 02_explainer.md §2.2-§2.3.
Registry versus artifact store — what is the difference? See 02_explainer.md §2.5 and §2.9.
What exactly is the quality gate? See 02_explainer.md §3.5.
When is automated retraining dangerous? See 02_explainer.md §3.7.
vLLM vs TGI vs Triton — what would make you choose each? See 02_explainer.md §4.3.
Why is token-level work often better than QPS for scaling? See 02_explainer.md §4.4-§4.5.
Data drift vs model drift vs vendor drift — clean distinction? See 02_explainer.md §5.3-§5.5.
Shadow vs canary vs blue-green — how do the risks differ? See 02_explainer.md §4.10 and §5.8.
What exactly can you roll back during an AI incident? See 02_explainer.md §5.9-§5.10.
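
For the token-level versus QPS question above, a small arithmetic sketch can help when answering cold. It is illustrative only: the `Request` type, the token counts, and the two traffic windows are hypothetical and are not taken from 02_explainer.md. It shows two windows with identical request rates whose token workloads differ by a large factor, which is one reason request count alone can be a weak scaling signal for LLM serving.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """One inference request, measured in tokens (hypothetical numbers)."""
    prompt_tokens: int
    output_tokens: int


def token_work(requests):
    """Total tokens processed (prompt + generated) across the requests."""
    return sum(r.prompt_tokens + r.output_tokens for r in requests)


# Two hypothetical one-second windows with the same request count.
short_chat = [Request(prompt_tokens=50, output_tokens=100) for _ in range(10)]
long_docs = [Request(prompt_tokens=4000, output_tokens=800) for _ in range(10)]

qps_short = len(short_chat)          # requests in the first window
qps_long = len(long_docs)            # requests in the second window
work_short = token_work(short_chat)  # 10 * (50 + 100)
work_long = token_work(long_docs)    # 10 * (4000 + 800)
ratio = work_long / work_short

print(f"QPS: {qps_short} vs {qps_long}")
print(f"Token work: {work_short} vs {work_long} ({ratio:.0f}x)")
```

An autoscaler keyed only to QPS would see these two windows as identical, while one keyed to tokens per second would see the second as far heavier. Real systems usually weigh prompt and generated tokens differently, which this sketch does not attempt.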

Design a minimal MLOps stack for a five-person startup team.
Your model quality dropped after a data refresh. Walk the first thirty minutes of triage.
Your GPU bill doubled without traffic doubling. What are your first three hypotheses?
You must deploy a new model version with zero downtime. Which strategy do you choose and why?
Give a two-minute bridge from this module into ../00_realtime_voice_agents/.
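
For the zero-downtime deployment scenario above, one option worth being able to sketch is a canary split. The following is a minimal sketch, assuming requests carry a stable string ID; the `route` function, the `canary_percent` parameter, and the hash-bucket scheme are hypothetical illustrations, not the method prescribed in 02_explainer.md.

```python
import hashlib


def route(request_id: str, canary_percent: int) -> str:
    """Return "canary" or "stable" for a request, deterministically.

    The request ID is hashed into one of 100 buckets; buckets below
    canary_percent go to the new model version. (Hypothetical scheme.)
    """
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return "canary" if bucket < canary_percent else "stable"


# Send roughly 5% of hypothetical traffic to the new version.
ids = [f"req-{i}" for i in range(10_000)]
canary_share = sum(route(i, 5) == "canary" for i in ids) / len(ids)
print(f"canary share: {canary_share:.1%}")
```

Because the bucket depends only on the ID, the same request ID always lands on the same version at a given percentage, and raising the percentage only adds traffic to the canary. In this sketch, rolling back amounts to setting `canary_percent` to 0, with the stable version still running throughout.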

[ ] I completed the reading flow through 02_explainer.md and 03_study_material.md.
[ ] I finished the work in 05_hands_on_lab.md.
[ ] I can explain the factory analogy without notes.
[ ] I can describe a promotion gate and a rollback path clearly.
[ ] I can compare at least two credible serving stacks.
[ ] I feel operationally prepared for ../00_realtime_voice_agents/.