00. AI Incident Response: The Fire Station for AI Systems

A production AI system eventually wakes someone up. This module teaches what happens between the first bad complaint and the moment the system is safe again.

Imagine your AI product is a city.

Some days the city works normally. Search finds the right policy. The agent calls the right tool. The chatbot refuses dangerous requests. Costs stay inside budget. Latency stays boring.

Then one night the emergency line rings.

A customer says the refund agent approved a refund it should have denied. Another customer says the assistant leaked private text from a different tenant. Support says the model is giving confident answers from stale documents. Finance says token spend doubled in one hour. Nobody knows yet whether this is a prompt bug, a model rollout, a retrieval issue, a tool loop, a guardrail bypass, or one angry user with a weird case.

Debugging asks, "What broke?"

Incident response asks a different question first: "How do we keep the fire from spreading while we investigate?"

That is the whole module.

The core picture

complaint / alert
      │
      ▼
┌──────────────────┐
│ alarm bell       │  detect and page
└────────┬─────────┘
         ▼
┌──────────────────┐
│ fire captain     │  declare severity and owner
└────────┬─────────┘
         ▼
┌──────────────────┐
│ snapshot room    │  freeze prompts, traces, model, retrieval state
└────────┬─────────┘
         ▼
┌──────────────────┐
│ firebreak        │  rollback, kill switch, rate limit, disable tool
└────────┬─────────┘
         ▼
┌──────────────────┐
│ status board     │  internal and customer communication
└────────┬─────────┘
         ▼
┌──────────────────┐
│ after-action     │  postmortem, eval lock, drill
└──────────────────┘

The mistake beginners make is treating incidents like debugging sessions with more stress. That is backwards. A good incident response buys time for debugging. It limits harm, preserves evidence, gives one person decision authority, and communicates uncertainty without pretending the root cause is known.

The placeholders you will see called back

Placeholder	Meaning
alarm bell	The alert, complaint, eval failure, cost spike, or safety report that starts the incident.
fire captain	The single accountable incident lead who declares severity and makes tradeoff calls.
runbook wall	The rehearsed response steps: page, snapshot, mitigate, communicate, verify, close.
snapshot room	The frozen evidence package: prompt, model version, retrieval results, traces, tool outputs, config, flags, and user-visible output.
firebreak	Any containment action that stops spread: rollback, kill switch, tool disable, rate limit, degraded mode, or traffic split.
status board	The communication surface for internal updates, customer status, decisions, and timestamps.
after-action lock	The postmortem change that prevents the same incident class from quietly returning.
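
The snapshot room row above can be made concrete with a small sketch. This is one possible shape for the evidence package, not a prescribed schema: the class name `Snapshot`, every field name, and the `fingerprint` helper are assumptions chosen to mirror the list in the table.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: evidence cannot be edited after capture
class Snapshot:
    """Evidence for one affected request. All field names are illustrative."""
    incident_id: str
    prompt: str
    model_version: str
    retrieval_results: tuple[str, ...]
    traces: tuple[str, ...]
    tool_outputs: tuple[str, ...]
    config: str  # serialized config and feature flags at capture time
    user_visible_output: str
    captured_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def fingerprint(self) -> str:
        """SHA-256 over the serialized fields, so later edits are detectable."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()
```

Recording the fingerprint on the status board at capture time gives the postmortem a way to confirm that the evidence it reads is the evidence that was frozen.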

Why AI incidents are different from ordinary backend incidents

Ordinary incidents often start with a crisp signal. Error rate jumps. CPU saturates. Database writes fail. The dashboard is red.

AI incidents can stay green while the product is wrong.

The API returns 200. Latency is normal. JSON parses. The model sounds confident. The agent completes the workflow. Yet the answer is unsafe, stale, ungrounded, biased, expensive, or an action the agent was never authorized to take.
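
A minimal sketch of that gap, assuming a hypothetical `Response` record: `transport_healthy` is roughly what a conventional dashboard checks, while `soft_failures` runs two example semantic checks (grounding and tenant isolation). The field names, the latency budget, and the two checks are illustrative assumptions, not a complete detector.

```python
from dataclasses import dataclass


@dataclass
class Response:
    """One served answer plus the metadata needed to judge it (hypothetical)."""
    status_code: int
    latency_ms: float
    answer: str
    cited_doc_ids: list[str]
    tenant_id: str
    doc_tenant_ids: list[str]  # owners of the documents retrieval returned


def transport_healthy(r: Response, latency_budget_ms: float = 2000.0) -> bool:
    """What a status-code and latency dashboard sees."""
    return r.status_code == 200 and r.latency_ms <= latency_budget_ms


def soft_failures(r: Response) -> list[str]:
    """Semantic problems that never show up as an HTTP error."""
    problems = []
    if not r.cited_doc_ids:
        problems.append("ungrounded: answer cites no retrieved document")
    if any(owner != r.tenant_id for owner in r.doc_tenant_ids):
        problems.append("tenant leak: retrieved a document from another tenant")
    return problems


r = Response(200, 340.0, "Refund approved.", [], "tenant-a", ["tenant-a", "tenant-b"])
print(transport_healthy(r))  # the dashboard stays green
print(soft_failures(r))      # two problems the dashboard cannot see
```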

That changes which incident response muscles you need.

You need the usual SRE habits: severity, paging, owners, rollback, communication, postmortems. You also need AI-specific artifacts: prompt diffs, model identifiers, retrieval snapshots, eval slices, judge outputs, tool call traces, tenant filters, memory state, safety classifier decisions, and human-review samples.

The fire captain does not need to solve the root cause in the first five minutes. The captain needs to decide whether to pull the firebreak before more users are harmed.
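
One way to picture the captain's call is a tool-level kill switch. The sketch below uses an in-memory `Firebreaks` class as a stand-in for a real feature-flag service; the class, the tool names `issue_refund` and `lookup_order`, and the decision rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Firebreaks:
    """In-memory stand-in for a feature-flag service (hypothetical)."""
    disabled_tools: set[str] = field(default_factory=set)
    log: list[str] = field(default_factory=list)

    def disable_tool(self, tool: str, reason: str) -> None:
        """Block a tool and record when and why, for the status board."""
        self.disabled_tools.add(tool)
        stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
        self.log.append(f"{stamp} disabled {tool}: {reason}")

    def tool_allowed(self, tool: str) -> bool:
        return tool not in self.disabled_tools


def should_pull_firebreak(harm_ongoing: bool, root_cause_known: bool) -> bool:
    """Containment depends on ongoing harm, not on having a diagnosis.

    root_cause_known is accepted and deliberately ignored to make that explicit.
    """
    return harm_ongoing


flags = Firebreaks()
if should_pull_firebreak(harm_ongoing=True, root_cause_known=False):
    flags.disable_tool("issue_refund", "refunds approved against policy")
```

The agent runtime would consult `tool_allowed` before each tool call, so the refund path stops while read-only tools keep working.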

Memory map

Concept	Prerequisite	Pressure family	Recurs later as	Layer touched
Severity declaration	debugging taxonomy	user harm + business risk	rollback decision	product → ops
Snapshot package	trace reading	evidence preservation	postmortem proof	app → model → retrieval → tools
Firebreak	feature flags + deployment	blast-radius control	guardrail fallback	API → workflow → infra
Soft failure detection	evals + human review	semantic ambiguity	safety monitoring	quality → policy
War-room communication	incident process	coordination under uncertainty	customer trust	team → stakeholder
Eval lock	regression testing	recurrence prevention	release gate	eval → CI/CD
Drill	runbook ownership	readiness	operational muscle	team → process

What is coming

01-what-counts-as-ai-incident.md — why a green API can still be a sev.
02-first-fifteen-minutes.md — the first actions before root cause is known.
03-severity-and-blast-radius.md — how to classify harm, spread, and urgency.
04-snapshot-the-system.md — what evidence to freeze before mitigation changes the scene.
05-war-room-roles-and-comms.md — who joins, who decides, and how updates are written.
06-rollback-and-kill-switches.md — prompt rollback, model rollback, tool disable, and degraded mode.
07-soft-failure-detection.md — detecting plausible-but-wrong failures that dashboards miss.
08-ai-specific-incident-patterns.md — prompt regression, retrieval poisoning, tool loops, judge drift, and cost runaway.
09-postmortem-evals-and-locks.md — turning the incident into an eval, guardrail, or process lock.
10-incident-drills-and-readiness.md — practicing before the real customer outage.
11-honest-admission.md — what incident response still cannot guarantee.

The rule to carry

AI incident response is not heroic debugging. It is controlled containment under uncertainty.

If you remember one sentence, remember this:

Teacher voice. Snapshot before you mutate, contain before you theorize, communicate before certainty, and convert every incident into a lock.

That sentence is the runbook wall.
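
The ordering in that sentence can be sketched as a guard that refuses out-of-order steps, using the runbook wall sequence from the placeholder table (page, snapshot, mitigate, communicate, verify, close). `RunbookTracker` is a hypothetical helper, and strict sequencing is a simplification: in a real incident, communication usually runs alongside the other steps.

```python
RUNBOOK = ("page", "snapshot", "mitigate", "communicate", "verify", "close")


class RunbookTracker:
    """Refuses any step taken before the steps that must precede it."""

    def __init__(self) -> None:
        self.completed: list[str] = []

    def complete(self, step: str) -> None:
        if len(self.completed) == len(RUNBOOK):
            raise ValueError("incident already closed")
        expected = RUNBOOK[len(self.completed)]
        if step != expected:
            raise ValueError(f"cannot run {step!r} yet; next step is {expected!r}")
        self.completed.append(step)


tracker = RunbookTracker()
tracker.complete("page")
try:
    tracker.complete("mitigate")  # skips the snapshot, so it is rejected
except ValueError as err:
    print(err)
```

The guard exists for the snapshot-before-mitigate pair in particular: a rollback changes the scene, so the evidence has to be frozen first.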

Bridge. Before we can run the first fifteen minutes, we need to know what counts as an AI incident in the first place. A silent bad answer can be more dangerous than a loud 500. → 01-what-counts-as-ai-incident.md