
00. AI release management — First-principles overview

Modules 01 and 02 of this category taught you to build evals and feedback loops. This module is the discipline of using them — together with prompts, models, and agent code as versioned artefacts — to ship AI changes safely. Canary, rollback, comms, change windows; the production engineering of releasing AI changes that touch real users.


A platform engineer at a Bengaluru SaaS company ships a prompt update at 14:00 IST on a Friday. By 16:00, customer-support tickets are spiking. The prompt change was reviewed; the eval passed; the canary was at 50%; the rollback flag was prepared but not tested. The rollback runs into a configuration issue; reverting takes 90 minutes; in that window, ten thousand customer conversations get the regressed prompt. The postmortem identifies the discipline gaps: the canary should have been at 5%, not 50%, for a change of this surface area; the rollback flag should have been rehearsed; the release should not have shipped late on Friday with reduced staffing.
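The postmortem's point about canary size is largely arithmetic. The following is a minimal sketch, assuming traffic is spread evenly over the window and that exposure scales linearly with the canary fraction. Both are simplifications, and the function name and the 5-minute rehearsed rollback are hypothetical, not details from the incident.

```python
def exposed_conversations(total_per_minute: float,
                          canary_fraction: float,
                          minutes_exposed: float) -> float:
    """Conversations served by the new version before the rollback completes."""
    return total_per_minute * canary_fraction * minutes_exposed


# Back out the implied traffic rate from the scenario: 10,000 conversations
# received the regressed prompt at a 50% canary during the 90-minute revert.
implied_rate = 10_000 / (0.50 * 90)  # conversations per minute, all traffic

at_50 = exposed_conversations(implied_rate, 0.50, 90)
at_5 = exposed_conversations(implied_rate, 0.05, 90)
# Hypothetical: a rehearsed rollback that completes in 5 minutes.
at_5_rehearsed = exposed_conversations(implied_rate, 0.05, 5)

print(round(at_50), round(at_5), round(at_5_rehearsed))
```

Under these assumptions, the 5% canary alone cuts exposure from 10,000 to roughly 1,000 conversations, and a rehearsed rollback brings it down to a few dozen. Canary size and rollback speed multiply, which is why the postmortem names both.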

This is the AI release management problem. The system is non-deterministic, the changes are often to prompts or models rather than to code, the blast radius is large, and the conventional software-release disciplines (canary, rollback, freeze) need adaptation. This module teaches that adaptation.


What AI release management is

AI release management is the production discipline of shipping prompt, model, agent, and data changes safely — with eval gates, feedback monitoring, canary rollouts, rehearsed rollbacks, version control, and stakeholder communication.

Six surfaces.

| Surface | One-liner | Pressure it answers |
|---|---|---|
| Change types | Recognising prompt, model, agent code, eval, data as distinct change classes | targeted discipline per type |
| Release gates | Eval and feedback signals as preconditions to ship | quality: ship only what passes |
| Canary | Gradual rollout with monitoring at each step | blast control: detect before wide |
| Rollback | Tested, fast reversal when canary or production shows regression | recovery: undo cleanly |
| Versioning | Prompts, models, agents as versioned artefacts | traceability: know what shipped when |
| Communication | Stakeholders informed before, during, after | trust: surprise erodes confidence |
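To make the interplay of gates, canary, and rollback concrete, here is a minimal sketch of a rollout loop. It is an illustration under stated assumptions, not a reference implementation: the `Release` type, the `run_canary` function, the version names, and the step percentages are all hypothetical, and the two gates are passed in as plain callables standing in for a real eval suite and a real feedback monitor.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Release:
    prior_version: str      # what the rollback flips traffic back to
    candidate_version: str  # the prompt, model, or agent version being shipped


def run_canary(
    release: Release,
    eval_gate: Callable[[str], bool],
    feedback_gate: Callable[[str, int], bool],
    steps: Sequence[int] = (1, 5, 25, 50, 100),  # illustrative percentages
) -> tuple[str, list[str]]:
    """Return the version left serving traffic and a log of what happened."""
    log: list[str] = []
    # Release gate: the candidate takes no traffic unless the eval passes.
    if not eval_gate(release.candidate_version):
        log.append("eval gate failed; not shipped")
        return release.prior_version, log
    for percent in steps:
        log.append(f"canary at {percent}%")
        # Feedback gate: the live signal must hold before the rollout widens.
        if not feedback_gate(release.candidate_version, percent):
            log.append(f"feedback gate failed at {percent}%; rolled back")
            return release.prior_version, log
    log.append("promoted")
    return release.candidate_version, log


release = Release(prior_version="prompt-v41", candidate_version="prompt-v42")
serving, log = run_canary(
    release,
    eval_gate=lambda version: True,
    # Simulated regression that only becomes visible at the 25% step.
    feedback_gate=lambda version, percent: percent < 25,
)
print(serving)
print(log)
```

In this simulated run the regression surfaces at the 25% step and traffic returns to the prior version. The sketch treats rollback as a simple return value; in production it is a real operation that can fail, which is why the rollback itself needs rehearsal.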

A seventh concern — incident response when a release goes wrong — runs across the surfaces and is its own chapter (10).


The recurring vocabulary

| Name | Surface | What it is |
|---|---|---|
| the change type | Change types | Prompt, model, agent code, eval, data |
| the eval gate | Release gates | The eval check that must pass before promotion |
| the feedback gate | Release gates | The feedback profile that must hold during canary |
| the canary step | Canary | A specific traffic percentage in the rollout |
| the rollback flag | Rollback | The flag that flips traffic back to the prior version |
| the change window | Communication | The agreed time period for shipping; respects freezes |
| the freeze period | Communication | A time period during which non-critical changes do not ship |
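The change window and the freeze period can be expressed as a small pre-ship check. The sketch below is one possible shape, with every policy value invented for illustration: the Monday to Thursday 10:00 to 15:00 IST window, the year-end freeze dates, and the `may_ship` function are assumptions, as is the choice to let critical changes bypass the check entirely.

```python
from datetime import datetime, time, timedelta, timezone

IST = timezone(timedelta(hours=5, minutes=30))

# Hypothetical policy: ship Monday to Thursday, 10:00 to 15:00 IST.
WINDOW_DAYS = {0, 1, 2, 3}  # datetime.weekday(): Monday is 0
WINDOW_START = time(10, 0)
WINDOW_END = time(15, 0)

# Hypothetical freeze periods as (start, end) pairs, end exclusive.
FREEZES = [
    (datetime(2025, 12, 24, tzinfo=IST), datetime(2026, 1, 2, tzinfo=IST)),
]


def may_ship(at: datetime, critical: bool = False) -> tuple[bool, str]:
    """Check a timezone-aware ship time against the window and the freezes."""
    local = at.astimezone(IST)
    if critical:
        # Assumption: critical changes go through a separate emergency process.
        return True, "critical change: handled by the emergency process"
    if any(start <= local < end for start, end in FREEZES):
        return False, "inside a freeze period"
    in_window = (
        local.weekday() in WINDOW_DAYS
        and WINDOW_START <= local.time() < WINDOW_END
    )
    if not in_window:
        return False, "outside the change window"
    return True, "inside the change window"


friday_release = datetime(2025, 3, 7, 14, 0, tzinfo=IST)   # a Friday, 14:00 IST
tuesday_release = datetime(2025, 3, 4, 11, 0, tzinfo=IST)  # a Tuesday, 11:00 IST
print(may_ship(friday_release))
print(may_ship(tuesday_release))
```

Under this hypothetical policy, a Friday 14:00 IST release like the one in the opening scenario is rejected before it ships, while the Tuesday morning release passes.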

The journey

This module has three acts.

Act 1 — Recognise (files 01–03). Why AI releases differ; the types of changes; the gates that precede shipping.

Act 2 — Roll out (files 04–07). Canary discipline, rollback discipline, versioning, communication.

Act 3 — Operate (files 08–11). Change windows, coordinated multi-change releases, emergency changes, postmortems.

Synthesis (files 12–13). Architect checklist and honest admission.


Memory map

| # | File | What it adds |
|---|---|---|
| 01 | why-ai-releases-are-different | the reframe: prompts/models as releases, not just deploys |
| 02 | the-change-types | prompt, model, agent code, eval, data: distinct disciplines |
| 03 | release-gates | eval + feedback as preconditions |
| | milestone: gates exist | |
| 04 | canary-rollouts | traffic shifts with monitoring at each step |
| 05 | rollback-discipline | tested, fast reversal |
| 06 | versioning-prompts-and-models | the artefact discipline |
| 07 | release-communication | stakeholders before, during, after |
| | milestone: rollouts work | |
| 08 | change-windows-and-freezes | when shipping happens; when it does not |
| 09 | coordinated-multi-change | when several changes ship together |
| 10 | emergency-changes | the discipline for the urgent override |
| 11 | release-postmortem | when something went wrong |
| | milestone: operations are mature | |
| 12 | architect-checklist | 20 items |
| 13 | honest-admission | what release management cannot solve |

How this module relates to its neighbours


Top resources

  • Google SRE — release engineering — https://sre.google/sre-book/release-engineering/
  • Netflix — canary analysis — https://netflixtechblog.com/automated-canary-analysis-at-netflix-with-kayenta-3260bc7acc69
  • Anthropic — model deprecations — https://docs.anthropic.com/en/docs/about-claude/model-deprecations
  • OpenAI — production best practices — https://platform.openai.com/docs/guides/production-best-practices

What's coming

  1. 01-why-ai-releases-are-different.md — Why prompt and model changes are releases, not just deploys.
  2. 02-the-change-types.md — Prompt, model, agent code, eval, data; distinct disciplines.
  3. 03-release-gates.md — Eval and feedback gates.
  4. 04-canary-rollouts.md — Traffic shifts with monitoring.
  5. 05-rollback-discipline.md — Tested, fast reversal.
  6. 06-versioning-prompts-and-models.md — Artefact versioning.
  7. 07-release-communication.md — Stakeholders before, during, after.
  8. 08-change-windows-and-freezes.md — When to ship.
  9. 09-coordinated-multi-change.md — Several changes together.
  10. 10-emergency-changes.md — The urgent override.
  11. 11-release-postmortem.md — When it went wrong.
  12. 12-architect-checklist.md — Twenty items.
  13. 13-honest-admission.md — Limits.

Bridge. Before designing the gates or the canary, we need to see why AI releases are not the same as software deploys. The first chapter is that reframe. → 01-why-ai-releases-are-different.md