06. Module 17 Review — MLOps & Production

Focus: lifecycle management, CI/CD for ML, serving infrastructure, monitoring, rollback, and production cost discipline.

Review loop

  1. Re-read the summary sections in 02_explainer.md §6.1-§6.7.
  2. Use 04_daily_recall.md to answer everything cold.
  3. Revisit 03_study_material.md for tables and tool comparisons.
  4. Re-open your deliverables from 05_hands_on_lab.md and defend each design choice aloud.

Reflection prompts

  • Which part of the lifecycle still feels hand-wavy to me?
  • If production quality dropped tomorrow, where would I look first?
  • Which rollback target would I forget under pressure: code, model, prompt, or data?
  • Which cost assumption in my 05_hands_on_lab.md deliverables is least trustworthy?
  • Which MLOps habit would most improve the team I am likely to join next?

Embedded checkpoint

Conceptual

  1. Why did the opening failure go unnoticed for weeks? See 02_explainer.md §1.3-§1.4.
  2. What belongs in every tracked run? See 02_explainer.md §2.2-§2.3.
  3. Registry versus artifact store — what is the difference? See 02_explainer.md §2.5 and §2.9.
  4. What exactly is the quality gate? See 02_explainer.md §3.5.
  5. When is automated retraining dangerous? See 02_explainer.md §3.7.
  6. vLLM vs TGI vs Triton — what would make you choose each? See 02_explainer.md §4.3.
  7. Why is token-level work often better than QPS for scaling? See 02_explainer.md §4.4-§4.5.
  8. Can you cleanly distinguish data drift, model drift, and vendor drift? See 02_explainer.md §5.3-§5.5.
  9. Shadow vs canary vs blue-green — how do the risks differ? See 02_explainer.md §4.10 and §5.8.
  10. What exactly can you roll back during an AI incident? See 02_explainer.md §5.9-§5.10.
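
For question 7, a small numeric sketch can help you test your intuition before checking the explainer. It compares two hypothetical traffic profiles that have identical QPS but very different token volumes. The profile names and every number below are illustrative assumptions, not figures from 02_explainer.md.

```python
from dataclasses import dataclass


@dataclass
class TrafficProfile:
    """Hypothetical workload description; all values are made up for illustration."""
    name: str
    requests_per_second: float
    avg_prompt_tokens: int
    avg_output_tokens: int

    def tokens_per_second(self) -> float:
        # Total tokens the serving layer must process each second.
        return self.requests_per_second * (self.avg_prompt_tokens + self.avg_output_tokens)


# Same QPS, different request shapes (assumed numbers).
chat = TrafficProfile("short_chat", 20.0, 200, 100)
summarize = TrafficProfile("long_summaries", 20.0, 6000, 500)

for profile in (chat, summarize):
    print(f"{profile.name}: {profile.requests_per_second:.0f} QPS, "
          f"{profile.tokens_per_second():,.0f} tokens/s")

# How much more token work the second profile generates at identical QPS.
ratio = summarize.tokens_per_second() / chat.tokens_per_second()
print(f"token load ratio at equal QPS: {ratio:.1f}x")
```

An autoscaler that watches only QPS would treat both profiles as the same load. Use that gap as a starting point for your own answer.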

Applied

  1. Design a minimal MLOps stack for a five-person startup team.
  2. Your model quality dropped after a data refresh. Walk the first thirty minutes of triage.
  3. Your GPU bill doubled without traffic doubling. What are your first three hypotheses?
  4. You must deploy a new model version with zero downtime. Which strategy do you choose and why?
  5. Give a two-minute bridge from this module into ../00_realtime_voice_agents/.
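
For applied question 3, one way to frame the triage is to normalize the bill by work served before forming hypotheses. The sketch below is a minimal helper under stated assumptions: the function name, the flat hourly GPU price, and all usage figures are hypothetical and do not come from the course material.

```python
def cost_per_1k_tokens(gpu_hours: float, usd_per_gpu_hour: float, tokens_served: int) -> float:
    """Return GPU spend per 1,000 tokens served (assumes a flat hourly price)."""
    if tokens_served <= 0:
        raise ValueError("tokens_served must be positive")
    return gpu_hours * usd_per_gpu_hour / tokens_served * 1000


# Illustrative numbers only: GPU hours doubled while token traffic grew 10%.
last_month = cost_per_1k_tokens(gpu_hours=1000, usd_per_gpu_hour=2.0, tokens_served=500_000_000)
this_month = cost_per_1k_tokens(gpu_hours=2000, usd_per_gpu_hour=2.0, tokens_served=550_000_000)

print(f"last month: ${last_month:.4f} per 1k tokens")
print(f"this month: ${this_month:.4f} per 1k tokens")
print(f"unit cost change: {this_month / last_month:.2f}x")
```

If unit cost rose while traffic stayed roughly flat, the extra spend comes from how work is served and not from how much work arrived. That narrows which hypotheses are worth checking first.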

Self-evaluation

Section              Score   Out of
Conceptual           __      10
Applied              __      10
Reflection honesty   __      5
Total                __      25

Completion gate