Skip to content

03. Week 9 — Agents & Tool Calling

For deep understanding see 02_explainer.md — narrative with the handyman analogy, loop mechanics, tool schemas, guardrails, retrieval prompts, and the module 10 bridge. This file is the quick-reference glossary.

Section 1 — LLM call vs tool use vs agent vs multi-agent

Pattern Loop? Tools? State? Best use
LLM call No Usually no Minimal rewrite,
summarize,
classify
Single tool use Usually one step Yes Minimal one lookup,
one calculation
Agent Yes Yes Yes uncertain,
branching workflows
Multi-agent Many loops Yes Shared or coordinated large,
specialized workflows

See explainer chapter 1 for the failure story.

Section 2 — ReAct loop

ReAct = think → act → observe → repeat.

user request
think
act (tool call or answer)
observe
stop or repeat

Minimal loop skeleton

for step in range(MAX_STEPS):
    response = call_model(messages, tools)
    if response.final_text:
        return response.final_text
    for tool_call in response.tool_calls:
        tool_result = execute_tool(tool_call)
        messages.append(tool_call.as_message())
        messages.append(tool_result.as_message())

Why it works:

  • breaks big uncertainty into smaller steps
  • grounds on fresh external observations
  • supports retries and recovery
  • creates debuggable traces

See explainer chapter 2.

Section 3 — Tool schema design

Design rules

Rule Why
Narrow scope Easier selection
Clear verb lookup_invoice beats billing_tool
When-to-use description Helps routing
Typed arguments Fewer hallucinated arguments
Structured errors Cleaner recovery
Idempotency Safer retries

Good description template

What the tool does.
When to use it.
When not to use it.
Whether it is read-only or has side effects.

Pydantic example

from typing import Literal
from pydantic import BaseModel, Field

class LookupAccountArgs(BaseModel):
    user_id: str = Field(..., description="Stable user identifier like u_145")

class EscalateArgs(BaseModel):
    category: Literal["billing", "outage", "security", "legal"]
    summary: str = Field(..., min_length=20)

See explainer chapter 3.

Section 4 — Failure modes & guardrails

Failure mode Guardrail
Infinite loop max-step cap,
repeated-call detection
Wrong tool chosen better names,
better descriptions,
smaller tool subset
Hallucinated arguments schema validation,
clarifying questions
Raw tool crashes structured error returns
Duplicate side effects idempotency keys,
read-before-write
Unsafe write action human approval gate
Cost blow-up budget cap,
cheap router,
summarization

Safe policy table

Situation Action
Missing required ID ask clarifying question
Tool failed permanently explain and escalate
High-risk write HITL gate
Step cap reached partial progress + safe next step
Verified evidence present answer directly

See explainer chapter 4.

Section 5 — Advanced patterns

Parallel tool calls

Use for independent read-only work. Typical implementation: asyncio.gather. Do not parallelize dependent or risky writes.

Tool chaining

One tool's output becomes the next tool's input. Choose boundaries where logs would matter if the flow failed.

Dynamic tool selection

Route first. Then expose only the relevant tool subset. This improves accuracy and safety.

State across turns

Keep these separate:

  • messages — conversational history
  • structured state — machine-readable control state
  • memory — durable facts worth carrying forward

See explainer chapter 5.

Section 6 — Foundation-gap audit for module 10

Before starting module 10, you should be able to explain:

  1. Single-agent loop mechanics
  2. Tool schema design
  3. Error handling in loops
  4. When to stop or give up
  5. State across turns

If these are shaky, re-read explainer §5.7 and revision checkpoint items.

Reading list

  1. ReAct
  2. Toolformer
  3. Anthropic: Building Effective Agents
  4. LangGraph agent docs
  5. Provider docs for structured tool calling

Reference material

YouTube

Blogs

Self-check

Use these after reading the explainer.

  1. What makes a loop an agent instead of one-off tool use?
  2. Why do descriptions affect tool routing?
  3. What is the minimum structured error format you want from tools?
  4. When is parallelism helpful, and when is it dangerous?
  5. What will module 10 build on from module 9?