06. Module 17 Review: MLOps & Production
Focus: lifecycle management, CI/CD for ML, serving infrastructure, monitoring, rollback, and production cost discipline.
Review loop
- Re-read the summary sections in 02_explainer.md §6.1-§6.7.
- Use 04_daily_recall.md to answer everything cold.
- Revisit 03_study_material.md for tables and tool comparisons.
- Re-open your deliverables from 05_hands_on_lab.md and defend each design choice aloud.
Reflection prompts
- Which part of the lifecycle still feels hand-wavy to me?
- If production quality dropped tomorrow, where would I look first?
- Which rollback target would I forget under pressure: code, model, prompt, or data?
- Which cost assumption in my hands-on lab is least trustworthy?
- Which MLOps habit would most improve the team I am likely to join next?
Embedded checkpoint
Conceptual
- Why did the opening failure go unnoticed for weeks? See 02_explainer.md §1.3-§1.4.
- What belongs in every tracked run? See 02_explainer.md §2.2-§2.3.
- Registry versus artifact store: what is the difference? See 02_explainer.md §2.5 and §2.9.
- What exactly is the quality gate? See 02_explainer.md §3.5.
- When is automated retraining dangerous? See 02_explainer.md §3.7.
- vLLM vs TGI vs Triton: what would make you choose each? See 02_explainer.md §4.3.
- Why is token-level work often a better scaling signal than QPS? See 02_explainer.md §4.4-§4.5.
- Data drift vs model drift vs vendor drift: can you state a clean distinction? See 02_explainer.md §5.3-§5.5.
- Shadow vs canary vs blue-green: how do the risks differ? See 02_explainer.md §4.10 and §5.8.
- What exactly can you roll back during an AI incident? See 02_explainer.md §5.9-§5.10.
Applied
- Design a minimal MLOps stack for a five-person startup team.
- Your model quality dropped after a data refresh. Walk the first thirty minutes of triage.
- Your GPU bill doubled without traffic doubling. What are your first three hypotheses?
- You must deploy a new model version with zero downtime. Which strategy do you choose and why?
- Give a two-minute bridge from this module into ../00_realtime_voice_agents/.
Self-evaluation
| Section | Score | Out of |
|---|---|---|
| Conceptual | __ | 10 |
| Applied | __ | 10 |
| Reflection honesty | __ | 5 |
| Total | __ | 25 |
Completion gate
- [ ] I completed the reading flow through 02_explainer.md and 03_study_material.md.
- [ ] I finished the work in 05_hands_on_lab.md.
- [ ] I can explain the factory analogy without notes.
- [ ] I can describe a promotion gate and a rollback path clearly.
- [ ] I can compare at least two credible serving stacks.
- [ ] I feel operationally prepared for ../00_realtime_voice_agents/.