Agent Observability Debugging

The chapters in this module, in reading order.

No. Chapter
00 Debugging Agents in Production — The Five-Year-Old Version
01 Failure taxonomy — name the bug in ten seconds
02 From complaint to trace — the first move when a user reports a bug
03 Reading a trace — anatomy of the case file
04 LLM-specific traces — watch the model like a subsystem, not a magic box
05 Reproducing the failure — freeze the scene before the trail goes cold
06 The layer-isolation lineup — five suspects, one at a time
07 Prompt-layer bugs — the first suspect in the lineup
08 Tool-layer bugs — where physical reality bites the agent
09 Loop-layer bugs — when the control flow itself is broken
10 Memory-layer bugs — the fourth suspect lies through what it remembers
11 Model-layer bugs — when the suspect is the brain itself
12 Multi-agent handoff bugs — the seams are the crime scene
13 Drift detection — the cold case
14 Latency and cost regressions — when the answer is right but the bill is wrong
15 Debugging tools workflow — LangSmith, Phoenix, Braintrust in practice
16 Span tagging for debugging — small labels, huge investigation power
17 Regression eval as a lock — turning a fixed bug into a permanent guardrail
18 Postmortem for agents — the detective's case-closed report
19 Data privacy and retention — observe enough to debug, not enough to betray users
20 Honest admission — what debugging agents in production still cannot solve