Agent Observability Debugging

The chapters in this module, in reading order.

| # | Chapter |
|----|---------|
| 00 | Debugging Agents in Production — The Five-Year-Old Version |
| 01 | Failure taxonomy — name the bug in ten seconds |
| 02 | From complaint to trace — the first move when a user reports a bug |
| 03 | Reading a trace — anatomy of the case file |
| 04 | LLM-specific traces — watch the model like a subsystem, not a magic box |
| 05 | Reproducing the failure — freeze the scene before the trail goes cold |
| 06 | The layer-isolation lineup — five suspects, one at a time |
| 07 | Prompt-layer bugs — the first suspect in the lineup |
| 08 | Tool-layer bugs — where physical reality bites the agent |
| 09 | Loop-layer bugs — when the control flow itself is broken |
| 10 | Memory-layer bugs — the fourth suspect lies through what it remembers |
| 11 | Model-layer bugs — when the suspect is the brain itself |
| 12 | Multi-agent handoff bugs — the seams are the crime scene |
| 13 | Drift detection — the cold case |
| 14 | Latency and cost regressions — when the answer is right but the bill is wrong |
| 15 | Debugging tools workflow — LangSmith, Phoenix, Braintrust in practice |
| 16 | Span tagging for debugging — small labels, huge investigation power |
| 17 | Regression eval as a lock — turning a fixed bug into a permanent guardrail |
| 18 | Postmortem for agents — the detective's case-closed report |
| 19 | Data privacy and retention — observe enough to debug, not enough to betray users |
| 20 | Honest admission — what debugging agents in production still cannot solve |