AI Evals Release Gates

The chapters in this module, in reading order.

#   Chapter
00  Evals & Production — The Five-Year-Old Version
01  Shipping on vibes — when a flawless demo hides a 38-point quality drop
02  Eval taxonomy — four axes, one decision per cell
03  Golden datasets — the labelled tray that turns every eval claim into evidence
04  Synthetic generation — manufacturing the cases you cannot hand-write fast enough
05  The metrics zoo — three families, one honest truth, many lying numbers
06  LLM as judge — verification is cheaper than generation
07  Rubric design — when two careful readers score the same chat and disagree
08  Judge calibration — the rubric is anchored, but the judge still drifts
09  Drift detection — when 78% quietly becomes 64% and nobody pages
10  A/B testing — when the offline winner loses the live argument
11  Logging & tracing — when the A/B winner has a 9% bug nobody can name
12  Alerting & dashboards — turning ten thousand traces into one glance and one page
13  Eval-driven development — when the test is written before the prompt
14  Honest admission — five things evals still cannot do for you