AI Product Evals
Use this track for the measurement and release layer: evals, metrics, experimentation, release gates, judge calibration, drift checks, dashboards, and feedback loops.
This track is deliberately separate from agent architecture. Evals are the proof layer: they tell you whether an AI product is ready to ship, has regressed, should be rolled back, or has improved.
| Module | Focus | Folder |
|---|---|---|
| 00 | AI evals and release gates | 00_ai_evals_release_gates/ |
| 01 | Dataset and golden set operations | 01_dataset_golden_set_operations/ (placeholder) |
| 02 | Telemetry and feedback loops | 02_telemetry_feedback_loops/ (placeholder) |
| 03 | AI release management | 03_ai_release_management/ (placeholder) |
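As a rough illustration of the release-gate idea this track covers, the sketch below compares a candidate's eval metrics against a baseline and returns a ship or block decision. It is a minimal sketch under stated assumptions, not the track's implementation: `GateRule`, `evaluate_gate`, the metric names, and the thresholds are all hypothetical, and every metric is assumed to be "higher is better".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateRule:
    """Hypothetical rule: a metric may not fall more than `max_drop` below baseline."""
    metric: str
    max_drop: float


def evaluate_gate(baseline, candidate, rules):
    """Return (decision, failures), where decision is "ship" or "block".

    Assumes every metric is higher-is-better and that `baseline`
    contains a value for every metric named in `rules`.
    """
    failures = []
    for rule in rules:
        if rule.metric not in candidate:
            # A metric that was never measured cannot pass the gate.
            failures.append(f"{rule.metric}: missing from candidate results")
            continue
        drop = baseline[rule.metric] - candidate[rule.metric]
        if drop > rule.max_drop:
            failures.append(
                f"{rule.metric}: dropped {drop:.3f} (allowed {rule.max_drop:.3f})"
            )
    return ("block" if failures else "ship"), failures


# Illustrative numbers only; these are not measured results.
rules = [GateRule("accuracy", 0.01), GateRule("groundedness", 0.02)]
baseline = {"accuracy": 0.90, "groundedness": 0.85}
candidate = {"accuracy": 0.91, "groundedness": 0.78}

decision, failures = evaluate_gate(baseline, candidate, rules)
print(decision, failures)
```

In this example the candidate improves on accuracy but falls well short on groundedness, so the gate blocks the release. Treating a missing metric as a failure is a design choice made for the sketch: it keeps a partial eval run from passing silently.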