00. AI Incident Response — The Fire Station for AI Systems

A production AI system eventually wakes someone up. This module teaches what happens between the first complaint and the moment the system is safe again.


Imagine your AI product is a city.

Some days the city works normally. Search finds the right policy. The agent calls the right tool. The chatbot refuses dangerous requests. Costs stay inside budget. Latency stays boring.

Then one night the emergency line rings.

A customer says the refund agent approved a refund it should have denied. Another customer says the assistant leaked private text from a different tenant. Support says the model is giving confident answers from stale documents. Finance says token spend doubled in one hour. Nobody knows yet whether this is a prompt bug, a model rollout, a retrieval issue, a tool loop, a guardrail bypass, or one angry user with a weird case.

Debugging asks, "What broke?"

Incident response asks a different question first: "How do we keep the fire from spreading while we investigate?"

That is the whole module.


The core picture

complaint / alert
         ▼
┌──────────────────┐
│ alarm bell       │  detect and page
└────────┬─────────┘
         ▼
┌──────────────────┐
│ fire captain     │  declare severity and owner
└────────┬─────────┘
         ▼
┌──────────────────┐
│ snapshot room    │  freeze prompts, traces, model, retrieval state
└────────┬─────────┘
         ▼
┌──────────────────┐
│ firebreak        │  rollback, kill switch, rate limit, disable tool
└────────┬─────────┘
         ▼
┌──────────────────┐
│ status board     │  internal and customer communication
└────────┬─────────┘
         ▼
┌──────────────────┐
│ after-action     │  postmortem, eval lock, drill
└──────────────────┘

The mistake beginners make is treating incidents like debugging sessions with more stress. That is backwards. A good incident response buys time for debugging. It limits harm, preserves evidence, gives one person decision authority, and communicates uncertainty without pretending the root cause is known.


The placeholders you will see called back

| Placeholder | Meaning |
| --- | --- |
| alarm bell | The alert, complaint, eval failure, cost spike, or safety report that starts the incident. |
| fire captain | The single accountable incident lead who declares severity and makes tradeoff calls. |
| runbook wall | The rehearsed response steps: page, snapshot, mitigate, communicate, verify, close. |
| snapshot room | The frozen evidence package: prompt, model version, retrieval results, traces, tool outputs, config, flags, and user-visible output. |
| firebreak | Any containment action that stops spread: rollback, kill switch, tool disable, rate limit, degraded mode, or traffic split. |
| status board | The communication surface for internal updates, customer status, decisions, and timestamps. |
| after-action lock | The postmortem change that prevents the same incident class from quietly returning. |
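
To make the snapshot room concrete, here is a minimal sketch of an evidence package as an immutable record. The class name `IncidentSnapshot`, its field names, and the `fingerprint` helper are assumptions made for illustration; they are not part of any real library or of this module's tooling. The fields mirror the snapshot room's contents listed above.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: evidence cannot be edited after capture
class IncidentSnapshot:
    """Hypothetical evidence package; all field names are illustrative."""
    incident_id: str
    prompt: str
    model_version: str
    retrieval_results: tuple[str, ...]
    tool_outputs: tuple[str, ...]
    config_flags: tuple[tuple[str, str], ...]
    user_visible_output: str
    trace_ids: tuple[str, ...] = ()
    captured_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def fingerprint(self) -> str:
        """Hash the full content so later drift or tampering is detectable."""
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


snap = IncidentSnapshot(
    incident_id="INC-example",
    prompt="You are a refund agent...",
    model_version="model-version-placeholder",
    retrieval_results=("refund-policy-v1",),
    tool_outputs=("approve_refund: ok",),
    config_flags=(("refund_tool_enabled", "true"),),
    user_visible_output="Your refund has been approved.",
)
print(snap.fingerprint())  # store this next to the snapshot itself
```

Freezing the record and storing its hash is one way to show in the postmortem that the evidence was captured before any firebreak changed the scene.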

Why AI incidents are different from ordinary backend incidents

Ordinary incidents often start with a crisp signal. Error rate jumps. CPU saturates. Database writes fail. The dashboard is red.

AI incidents can stay green while the product is wrong.

The API returns 200. Latency is normal. JSON parses. The model sounds confident. The agent completes the workflow. Yet the answer is unsafe, stale, ungrounded, biased, expensive, or outside what the system was authorized to do.

That changes which incident response muscles you need.
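
A small sketch of the gap between "green" and "right" follows. It assumes a hypothetical response shape (a JSON body with `answer` and `cited_doc_ids`) and a hypothetical set of current document IDs; real systems will differ. The transport check is what a typical dashboard sees, and the grounding check is the kind of semantic check that catches a soft failure.

```python
import json
from dataclasses import dataclass


@dataclass
class ModelResponse:
    status_code: int
    latency_ms: float
    body: str  # assumed JSON string with "answer" and "cited_doc_ids"


def transport_healthy(resp: ModelResponse, max_latency_ms: float = 2000.0) -> bool:
    """What a dashboard usually checks: status, latency, parseable JSON."""
    if resp.status_code != 200 or resp.latency_ms > max_latency_ms:
        return False
    try:
        json.loads(resp.body)
    except json.JSONDecodeError:
        return False
    return True


def grounded(resp: ModelResponse, current_doc_ids: set[str]) -> bool:
    """Semantic check: every citation must point at a current document."""
    cited = json.loads(resp.body).get("cited_doc_ids", [])
    return bool(cited) and all(doc_id in current_doc_ids for doc_id in cited)


resp = ModelResponse(
    status_code=200,
    latency_ms=340.0,
    body=json.dumps({
        "answer": "Refunds are allowed within 90 days.",
        "cited_doc_ids": ["refund-policy-v1"],  # superseded document
    }),
)
current_docs = {"refund-policy-v2"}

print(transport_healthy(resp))       # True: the dashboard stays green
print(grounded(resp, current_docs))  # False: the answer cites a stale policy
```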

You need the usual SRE habits: severity, paging, owners, rollback, communication, postmortems. You also need AI-specific artifacts: prompt diffs, model identifiers, retrieval snapshots, eval slices, judge outputs, tool call traces, tenant filters, memory state, safety classifier decisions, and human-review samples.

The fire captain does not need to solve the root cause in the first five minutes. The captain needs to decide whether to pull the firebreak before more users are harmed.
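
The captain's early decision can be sketched as a function that never consults the root cause. The harm categories and the rule below are illustrative assumptions, not a recommended policy; each team has to define its own severity ladder. The point is only that `root_cause_known` plays no part in the containment call.

```python
from dataclasses import dataclass
from enum import Enum


class Harm(Enum):
    QUALITY = 1    # wrong or stale answer, nothing irreversible
    FINANCIAL = 2  # e.g. an unauthorized refund or runaway token spend
    DATA_LEAK = 3  # e.g. cross-tenant text exposure


@dataclass
class Triage:
    harm: Harm
    still_spreading: bool   # is new traffic still hitting the bad path?
    root_cause_known: bool  # recorded, but deliberately unused below


def pull_firebreak(t: Triage) -> bool:
    """Hypothetical rule: contain severe harm, or any harm still spreading."""
    if t.harm in (Harm.FINANCIAL, Harm.DATA_LEAK):
        return True
    return t.still_spreading


leak = Triage(Harm.DATA_LEAK, still_spreading=False, root_cause_known=False)
stale = Triage(Harm.QUALITY, still_spreading=False, root_cause_known=False)

print(pull_firebreak(leak))   # True: contain now, investigate after
print(pull_firebreak(stale))  # False: monitor and keep investigating
```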


Memory map

| Concept | Prerequisite | Pressure family | Recurs later as | Layer touched |
| --- | --- | --- | --- | --- |
| Severity declaration | debugging taxonomy | user harm + business risk | rollback decision | product → ops |
| Snapshot package | trace reading | evidence preservation | postmortem proof | app → model → retrieval → tools |
| Firebreak | feature flags + deployment | blast-radius control | guardrail fallback | API → workflow → infra |
| Soft failure detection | evals + human review | semantic ambiguity | safety monitoring | quality → policy |
| War-room communication | incident process | coordination under uncertainty | customer trust | team → stakeholder |
| Eval lock | regression testing | recurrence prevention | release gate | eval → CI/CD |
| Drill | runbook ownership | readiness | operational muscle | team → process |

What is coming

  1. 01-what-counts-as-ai-incident.md — why a green API can still be a sev.
  2. 02-first-fifteen-minutes.md — the first actions before root cause is known.
  3. 03-severity-and-blast-radius.md — how to classify harm, spread, and urgency.
  4. 04-snapshot-the-system.md — what evidence to freeze before mitigation changes the scene.
  5. 05-war-room-roles-and-comms.md — who joins, who decides, and how updates are written.
  6. 06-rollback-and-kill-switches.md — prompt rollback, model rollback, tool disable, and degraded mode.
  7. 07-soft-failure-detection.md — detecting plausible-but-wrong failures that dashboards miss.
  8. 08-ai-specific-incident-patterns.md — prompt regression, retrieval poisoning, tool loops, judge drift, and cost runaway.
  9. 09-postmortem-evals-and-locks.md — turning the incident into an eval, guardrail, or process lock.
  10. 10-incident-drills-and-readiness.md — practicing before the real customer outage.
  11. 11-honest-admission.md — what incident response still cannot guarantee.

The rule to carry

AI incident response is not heroic debugging. It is controlled containment under uncertainty.

If you remember one sentence, remember this:

Teacher voice. Snapshot before you mutate, contain before you theorize, communicate before certainty, and convert every incident into a lock.

That sentence is the runbook wall.
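
One way to make that ordering hard to skip is to encode it in the incident record itself. The sketch below is an assumption-laden illustration: the `Incident` class and its method names are hypothetical and do not come from any real incident tool. It refuses containment before a snapshot, allows status updates at any time, and refuses to close without a lock.

```python
class Incident:
    """Hypothetical incident record that enforces the runbook order."""

    def __init__(self, incident_id: str) -> None:
        self.incident_id = incident_id
        self.log: list[str] = []
        self.locks: list[str] = []
        self.snapshot_taken = False
        self.contained = False

    def snapshot(self) -> None:
        self.snapshot_taken = True
        self.log.append("snapshot")

    def contain(self, action: str) -> None:
        # Snapshot before you mutate: containment changes the scene.
        if not self.snapshot_taken:
            raise RuntimeError("take a snapshot before applying a firebreak")
        self.contained = True
        self.log.append(f"contain:{action}")

    def communicate(self, message: str) -> None:
        # Allowed at any point: updates do not wait for certainty.
        self.log.append(f"status:{message}")

    def add_lock(self, description: str) -> None:
        self.locks.append(description)
        self.log.append(f"lock:{description}")

    def close(self) -> None:
        # Convert every incident into a lock before closing.
        if not self.contained or not self.locks:
            raise RuntimeError("cannot close without containment and a lock")
        self.log.append("closed")


inc = Incident("INC-example")
inc.communicate("investigating, cause unknown")
inc.snapshot()
inc.contain("disable refund tool")
inc.add_lock("regression eval for refund approvals")
inc.close()
print(inc.log)
```

Calling `contain` before `snapshot`, or `close` before `add_lock`, raises a `RuntimeError`, so the sequence is enforced by the record and not left to memory during a stressful page.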


Bridge. Before we can run the first fifteen minutes, we need to know what counts as an AI incident in the first place. A silent bad answer can be more dangerous than a loud 500. → 01-what-counts-as-ai-incident.md