01. Architecture decision tree — Which shape does this agent need?

~15 min read. You have a task you want an LLM to handle. But do you need one model call, a looping agent, a planner with workers, or a swarm of peers? By the end of this page you will be able to pick the right topology in under a minute — and explain why the others are wrong.

This page builds on the first-principles overview in 00-first-principles.md. There, the leash (how much autonomy the agent gets) is decision number one. This file turns that single knob into a four-way switch with concrete criteria for each position.

1) The demo that died on complexity

A founder walks on stage. One model call. send_slack_message(channel="#eng", text="standup at 10"). Audience claps. Headlines: "AI agent." Three days later the PM says: "Pick a free slot, message four people, wait for acks, tell me who's missing." Same model. Same SDK. Demo collapses.

Why? Not because the model got dumber. Because the shape was wrong. V1 needed one tool call — fire and forget. V2 needed a loop — judgment between steps, recovery on failure, a stop condition. The team was using a bicycle frame to build a truck. No amount of better pedalling fixes that.

The engineering team's instinct was "upgrade the model." They tried GPT-4, Claude Opus, Gemini Ultra. Every model failed the same way: it answered the first sub-question and stopped, or it hallucinated the remaining steps without actually executing them. The bug was not intelligence. The bug was topology.

The lesson: the first architecture decision is not "which model" or "which framework." It is which topology. Get the shape wrong and you debug forever. Get it right and a mediocre model often suffices.

This file gives you the decision tree. Four topologies. Three branching criteria. One table of tradeoffs. By section 7 you'll have a four-question audit that identifies the correct shape for any task in under a minute.

2) The four topologies — one diagram

┌─────────────────┐  ┌─────────────────────┐  ┌──────────────────────────┐  ┌────────────────────────────┐
│  SINGLE CALL    │  │  REACT LOOP         │  │  ORCHESTRATOR            │  │  MULTI-AGENT               │
│  (one shot)     │  │  (one agent loops)  │  │  (planner + workers)     │  │  (peers negotiate)         │
├─────────────────┤  ├─────────────────────┤  ├──────────────────────────┤  ├────────────────────────────┤
│                 │  │                     │  │                          │  │                            │
│  User ──→ LLM  │  │  User ──→ Agent     │  │  User ──→ Planner       │  │  User ──→ Coordinator     │
│        ──→ Tool│  │        ┌──→ Think   │  │        ──→ decompose    │  │        ──→ Agent A        │
│        ──→ Done│  │        │   Act      │  │        ──→ Worker A     │  │             ↕              │
│                 │  │        │   Observe  │  │        ──→ Worker B     │  │           Agent B          │
│                 │  │        └── loop?    │  │        ──→ Worker C     │  │             ↕              │
│                 │  │            ↓ done   │  │        ←── merge        │  │           Agent C          │
│                 │  │          answer     │  │        ──→ answer       │  │        ←── consensus       │
│                 │  │                     │  │                          │  │        ──→ answer          │
└─────────────────┘  └─────────────────────┘  └──────────────────────────┘  └────────────────────────────┘

Single call

One prompt in, one response out. The LLM may call a tool and may phrase the tool's result as the final answer, but it never uses that result to choose another action. Data flows in one direction. Zero judgment between steps. Think of it as a pure function: f(input) → output.

User ──→ [prompt + tools] ──→ LLM ──→ tool_call ──→ execute ──→ LLM ──→ final answer
                                                     (no feedback loop)
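A minimal sketch of the single-call shape, assuming a hypothetical `fake_llm` stub in place of a real model client and a made-up `send_slack_message` tool body. The names are illustrative only; the point is structural: one decision, one execution, nothing loops back.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ToolCall:
    name: str
    args: dict


def fake_llm(prompt: str) -> ToolCall:
    """Hypothetical stand-in for a model call that returns one tool request."""
    return ToolCall("send_slack_message", {"channel": "#eng", "text": prompt})


def send_slack_message(channel: str, text: str) -> str:
    """Hypothetical tool; a real one would call a chat API."""
    return f"posted to {channel}: {text}"


TOOLS: Dict[str, Callable[..., str]] = {"send_slack_message": send_slack_message}


def single_call(prompt: str) -> str:
    call = fake_llm(prompt)                 # one model decision
    return TOOLS[call.name](**call.args)    # one execution, then done
```

Calling `single_call("standup at 10")` runs the tool exactly once. There is no place in this function where a surprising tool result could change what happens next.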

ReAct loop (single agent)

One agent with one toolbelt. It thinks, acts, observes the result, then decides: loop or stop. All reasoning lives in one context window. The back-arrow — from observation back to thinking — is the defining feature. The agent reacts to what it learns.

User ──→ Agent ──→ Think ──→ Act ──→ Observe ──┐
                     ↑                          │
                     └────── not done ──────────┘
                              done → answer
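The loop above can be sketched in a few lines. This is an illustration under stated assumptions, not a reference implementation: `Policy` stands in for the model, and `scripted_policy`, `find_free_slot`, and `book_slot` are hypothetical names invented for the demo.

```python
from typing import Callable, Dict, List, Tuple

# A policy stands in for the model: given the observations so far, it
# returns either ("act", tool_name) or ("stop", final_answer).
Policy = Callable[[List[str]], Tuple[str, str]]


def react_loop(policy: Policy, tools: Dict[str, Callable[[], str]],
               max_iters: int = 10) -> str:
    history: List[str] = []
    for _ in range(max_iters):
        kind, value = policy(history)                 # Think
        if kind == "stop":
            return value                              # done -> answer
        observation = tools[value]()                  # Act
        history.append(f"{value} -> {observation}")   # Observe, then loop
    raise RuntimeError("iteration budget exhausted without a stop decision")


def scripted_policy(history: List[str]) -> Tuple[str, str]:
    """Hypothetical policy: find a slot, book it, then stop."""
    if not history:
        return ("act", "find_free_slot")
    if len(history) == 1:
        return ("act", "book_slot")
    return ("stop", f"done after {len(history)} steps")


demo_tools = {"find_free_slot": lambda: "10:00", "book_slot": lambda: "booked"}
```

The `max_iters` cap is a design choice in this sketch: without it, a policy that never returns "stop" would run forever.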

Orchestrator (planner + workers)

A planner LLM decomposes the task into sub-tasks. Separate worker agents (or tool calls) execute each sub-task independently. The planner merges results. Workers do not talk to each other — they report up. The planner is the only entity with full context.

User ──→ Planner ──→ [sub-task A] ──→ Worker A ──→ result A ──┐
                 ──→ [sub-task B] ──→ Worker B ──→ result B ──┤──→ Planner merges ──→ answer
                 ──→ [sub-task C] ──→ Worker C ──→ result C ──┘
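A rough sketch of the dispatch-and-merge flow, assuming a hard-coded `plan` function in place of a planner LLM and three hypothetical workers. It uses the standard library's `ThreadPoolExecutor` only to show that workers run independently and report up.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def plan(task: str) -> List[str]:
    """Hypothetical planner; a real one would be an LLM call."""
    return ["research", "write", "seo"]


# Hypothetical workers: each sees only the task, never another worker.
WORKERS: Dict[str, Callable[[str], str]] = {
    "research": lambda task: f"facts about {task}",
    "write": lambda task: f"draft on {task}",
    "seo": lambda task: f"keywords for {task}",
}


def orchestrate(task: str) -> str:
    sub_tasks = plan(task)
    # Workers execute concurrently; map() returns results in plan order.
    with ThreadPoolExecutor(max_workers=len(sub_tasks)) as pool:
        results = list(pool.map(lambda name: WORKERS[name](task), sub_tasks))
    # Only the planner side sees every result and merges them.
    return " | ".join(results)
```

Note what is absent: no worker receives another worker's output. That absence is what separates this shape from the peer network described next.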

Multi-agent (peer network)

Multiple agents with distinct roles communicate laterally. They share intermediate state, negotiate disagreements, and converge on a joint output. No single agent holds the full plan. Messages flow peer-to-peer, not just up and down.

User ──→ Agent A ←──→ Agent B
              ↕            ↕
           Agent C ←──→ Agent D
                   ↓
              consensus ──→ answer
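As a small illustration of lateral exchange, here is a two-peer sketch (a writer and a critic) that iterates on a shared draft until the critic has no objection. Both peers are hypothetical rule-based stubs, and the round budget is an assumption added so the exchange cannot run forever.

```python
from typing import Optional, Tuple


def writer(draft: str, feedback: Optional[str]) -> str:
    """Hypothetical peer: revises the shared draft using the critic's note."""
    return draft if feedback is None else draft + " +tests"


def critic(draft: str) -> Optional[str]:
    """Hypothetical peer: returns an objection, or None when satisfied."""
    return None if "+tests" in draft else "missing tests"


def negotiate(initial: str, max_rounds: int = 5) -> Tuple[str, int]:
    draft: str = initial
    feedback: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        draft = writer(draft, feedback)   # peer A acts on shared state
        feedback = critic(draft)          # peer B reacts to A's intermediate work
        if feedback is None:              # convergence criterion: no objections
            return draft, round_no
    raise RuntimeError("no consensus within the round budget")
```

With these stubs, `negotiate("patch")` converges in the second round. A real system would have more peers and model-backed roles, but the two ingredients stay the same: shared intermediate state and an explicit convergence test.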

One-line mental models:

- Single call = a calculator. You press = once.
- ReAct = a craftsman. Works, inspects, adjusts, repeats.
- Orchestrator = a manager. Delegates, collects reports, synthesizes.
- Multi-agent = a committee. Debates, negotiates, votes.

3) The decision tree — pick your topology

Walk this flowchart top-to-bottom. Stop at the first leaf that fits.

Does the task need JUDGMENT between steps?
(i.e., does step 2's action depend on step 1's result?)
│
├── NO ──→ ★ SINGLE CALL
│          (cheapest, fastest, sufficient for one-step work)
│          Examples: classify a ticket, extract fields from an invoice,
│                   generate a title, translate a paragraph
│
└── YES ──→ Does the task span multiple DOMAINS requiring
             different tool sets or expertise?
             │
             ├── NO ──→ ★ SINGLE REACT AGENT
             │          (one loop, one toolbelt, one context window)
             │          Examples: refund processing, research + summarize,
             │                   code edit + test + fix, book a meeting
             │
             └── YES ──→ Do the domains need to SHARE STATE
                          or NEGOTIATE during execution?
                          │
                          ├── NO ──→ ★ ORCHESTRATOR
                          │          (planner dispatches, workers execute alone)
                          │          Examples: "write a blog post" (research worker
                          │                   + writing worker + SEO worker),
                          │                   multi-service deployment pipeline,
                          │                   parallel data enrichment
                          │
                          └── YES ──→ ★ MULTI-AGENT
                                     (peers communicate, reach consensus)
                                     Examples: adversarial code review (writer +
                                              critic iterate), debate-based
                                              fact-checking, collaborative game AI,
                                              multi-party negotiation simulation

Three decision criteria, applied in order:

1. Judgment between steps? If no, stop at single call. If yes, you need a loop. Test: can you define the entire task as one deterministic function with no conditionals on runtime data? If yes → single call.
2. How many domains / tool sets? If one coherent toolbelt covers it, one ReAct agent suffices. If you need fundamentally different capabilities (code vs web search vs database vs image generation), decompose. Test: would one system prompt and one tool list make sense, or would you naturally write separate prompts?
3. Shared state between sub-tasks? If workers can execute independently and merge, use an orchestrator. If they must exchange intermediate results and negotiate, you need multi-agent. Test: can you define a clean input/output contract for each sub-task? If yes → orchestrator. If sub-tasks need to read each other's scratchpads → multi-agent.
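The three criteria can be written down as a short function. This is only a restatement of the tree in code; the function and parameter names are invented for illustration.

```python
def pick_topology(judgment_between_steps: bool,
                  multiple_domains: bool = False,
                  shared_state: bool = False) -> str:
    """Walk the three criteria in order and stop at the first leaf that fits."""
    if not judgment_between_steps:
        return "single call"
    if not multiple_domains:
        return "react loop"
    if not shared_state:
        return "orchestrator"
    return "multi-agent"
```

The order of the checks matters: a task with no judgment between steps returns "single call" no matter what the later answers are, which mirrors "stop at the first leaf that fits".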

Quick-reference — task pattern → topology:

Task pattern	Topology	Example
"Do X" (one action, no follow-up)	Single call	Classify this email
"Do X, then decide Y based on result"	ReAct	Process this refund
"Do X and Y and Z in parallel, merge"	Orchestrator	Research + write + proofread
"Argue about X until consensus"	Multi-agent	Red team this prompt

4) Tradeoff table — what each shape costs you

Property	Single call	ReAct loop	Orchestrator	Multi-agent
Tokens per task	500–2k	5k–25k	15k–80k	30k–200k+
Latency (p50)	1–3 s	8–25 s	15–45 s	30–120 s
Latency (p95)	3–6 s	20–60 s	40–120 s	60–300 s
Cost per task	$0.005–$0.02	$0.05–$0.30	$0.20–$1.50	$0.50–$5.00+
Failure shape	Silent wrong answer	Stuck in loop / wrong stop	Worker fails silently, planner merges garbage	Deadlock, circular negotiation
Debug difficulty	Trivial (one call)	Medium (trace the loop)	Hard (which worker? which merge?)	Very hard (reconstruct multi-party state)
When to pick	Task is genuinely one step	Multi-step, one domain	Multi-domain, independent sub-tasks	Sub-tasks must cross-pollinate
When NOT to pick	Any step depends on a prior result	Toolbelt exceeds ~15 tools or spans unrelated domains	Sub-tasks need each other's intermediate results	Latency budget < 30 s, or coordination cost > task value

Ranges assume Claude Sonnet 4 / GPT-5-class models, 5–10 tools per agent, short system prompts. Adjust for your stack.

Reading the table — three insights:

Cost grows super-linearly. By the ranges above, an orchestrator costs about 4–5× a ReAct loop ($0.20–$1.50 against $0.05–$0.30), not the 3× you might guess from three workers, because the planner call adds overhead on top of worker costs and each worker maintains its own context.
Failure shapes change qualitatively. Single-call fails silently (wrong answer, looks confident). ReAct fails visibly (stuck loop, easy to trace). Orchestrator and multi-agent fail structurally (wrong decomposition, deadlock) — harder to even recognize as failures.
Debug difficulty is the hidden cost. When a ReAct agent fails, you read one trace. When a multi-agent system fails, you reconstruct a conversation graph. Teams underestimate this at design time and pay for it in on-call hours.
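To check the cost multipliers yourself, the arithmetic is just a ratio of the table's range endpoints. The sketch below copies the "Cost per task" row into a dictionary; `cost_ratio` is a hypothetical helper name, and the open-ended "+" on the multi-agent range is dropped.

```python
# USD ranges copied from the "Cost per task" row of the tradeoff table.
COST_PER_TASK = {
    "single call": (0.005, 0.02),
    "react loop": (0.05, 0.30),
    "orchestrator": (0.20, 1.50),
    "multi-agent": (0.50, 5.00),  # the table's upper bound is open-ended
}


def cost_ratio(upper: str, lower: str) -> tuple:
    """Ratio of low ends and of high ends, rounded to two decimals."""
    low_upper, high_upper = COST_PER_TASK[upper]
    low_lower, high_lower = COST_PER_TASK[lower]
    return (round(low_upper / low_lower, 2), round(high_upper / high_lower, 2))
```

`cost_ratio("orchestrator", "react loop")` gives `(4.0, 5.0)`, and `cost_ratio("react loop", "single call")` gives `(10.0, 15.0)`, so the jump from a single call to a loop is the steepest step on the ladder by these numbers.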

The core heuristic: pick the simplest topology that can handle the task's branching and domain needs. Every step up the complexity ladder multiplies cost, latency, and debug surface.

5) The refund task through all four shapes

Task: "Customer Aisha wants a refund on order 4481. Decide eligibility, issue if qualified, email the outcome."

Shape A — Single call (mega-tool)

User ──→ LLM ──→ handle_refund(order=4481, email=aisha@x.com)
              ──→ { decision: "approved", refund_id: "rf_77", email_sent: true }
              ──→ "Done!"

Works on the happy path. Fails when: order status is ambiguous (8% of orders), policy changed last week (VIP override), email service times out (no retry possible). The hard-coded tool can't adapt because judgment between steps was baked out.

Why do teams build this? Because it's fast to ship. One tool, one test, one deploy. The problem only surfaces in production when the happy path is no longer 100% of traffic. By then the architecture is load-bearing and hard to change.

Failure trace in production:

08:41:12  handle_refund(4481) called
08:41:13  internal: get_order → delivery_status = "unknown"
08:41:13  internal: hard-coded branch → status != "delivered" → DENY
08:41:13  return { decision: "denied", reason: "not delivered" }
08:41:13  LLM says: "Sorry Aisha, your order wasn't delivered so we can't refund."

Reality: the order WAS delivered. The shipping API had a stale cache.
A loop would have called get_delivery_proof() and found the signature.

Verdict: wrong topology. The task needs mid-flight decisions.
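The failure in the trace comes down to one hard-coded branch. A minimal sketch, assuming a hypothetical `get_order` stub that returns the stale status and an invented body for `handle_refund`:

```python
def get_order(order_id: int) -> dict:
    """Hypothetical lookup that returns the stale status seen in the trace."""
    return {"id": order_id, "delivery_status": "unknown"}


def handle_refund(order_id: int) -> dict:
    order = get_order(order_id)
    # Hard-coded branch: anything other than "delivered" is denied.
    # "unknown" gets no follow-up check, because nothing in this tool
    # can react to the lookup result and go fetch more evidence.
    if order["delivery_status"] != "delivered":
        return {"decision": "denied", "reason": "not delivered"}
    return {"decision": "approved", "reason": "delivered"}
```

You could patch this by adding another branch for "unknown", but each patch is another conditional on runtime data inside the tool, which is exactly the judgment the single-call shape has no room for.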

Shape B — ReAct loop (correct for this task)

ITER 1  think → "Need order details."      try → get_order(4481)         → delivered, 9 days late
ITER 2  think → "Check refund policy."      try → get_refund_policy()     → threshold=7d, VIP override
ITER 3  think → "Check customer tier."      try → get_customer_tier(...)  → tier=gold, eligible
ITER 4  think → "Issue refund."             try → issue_refund(4481)      → rf_77, ₹1,250
ITER 5  think → "Email Aisha."             try → send_email(...)          → FAILED (timeout)
ITER 6  think → "Retry email."             try → send_email(...)          → sent ✓  STOP

Six iterations. Each decision informed by the last result. Handles the VIP edge case, retries the email. One agent, one toolbelt, one context window. Cost: ~$0.12. Latency: ~18 s.

Verdict: correct topology. One domain (customer service), judgment between steps, recoverable failures.
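The retry in iterations 5 and 6 is the part a single call cannot do. The sketch below isolates just that behaviour. It is a simplification: the model's judgment is replaced by a fixed step order, the tool results are canned strings taken from the trace, and `FlakyEmail` is a hypothetical tool that times out once.

```python
from typing import Callable, List


class FlakyEmail:
    """Hypothetical email tool that times out on its first attempt."""

    def __init__(self) -> None:
        self.attempts = 0

    def __call__(self) -> str:
        self.attempts += 1
        if self.attempts == 1:
            raise TimeoutError("timeout")
        return "sent"


def run_steps(steps: List[Callable[[], str]], max_iters: int = 10) -> List[str]:
    """Run steps in order; a failed step is observed, then retried."""
    trace: List[str] = []
    index = 0
    while index < len(steps):
        if len(trace) >= max_iters:
            raise RuntimeError("iteration budget exhausted")
        try:
            result = steps[index]()        # Act
            index += 1                     # success: advance to the next step
        except TimeoutError as exc:
            result = f"FAILED ({exc})"     # Observe the failure; same step next time
        trace.append(f"ITER {len(trace) + 1}: {result}")
    return trace


def refund_steps() -> List[Callable[[], str]]:
    return [
        lambda: "delivered, 9 days late",      # get_order(4481)
        lambda: "threshold=7d, VIP override",  # get_refund_policy()
        lambda: "tier=gold, eligible",         # get_customer_tier(...)
        lambda: "rf_77",                       # issue_refund(4481)
        FlakyEmail(),                          # send_email(...)
    ]
```

Running `run_steps(refund_steps())` yields six trace entries, with the fifth a failure and the sixth a success. Blind retry is only reasonable for steps that are safe to repeat; a real loop would not retry `issue_refund` this way without first checking whether the refund went through.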

Shape C — Orchestrator (overkill but instructive)

Planner decomposes:
  Sub-task 1: "Determine eligibility"  → Worker A (tools: get_order, get_policy, get_tier)
  Sub-task 2: "Execute refund"         → Worker B (tool: issue_refund)
  Sub-task 3: "Notify customer"        → Worker C (tool: send_email)

Planner merges results → final answer

Problem: Worker B cannot run until Worker A decides eligibility. Worker C cannot run until Worker B succeeds. The sub-tasks are serial, not parallel. The orchestrator adds a planner call (~2k tokens, ~3 s) that contributes nothing — it just re-serializes what a single ReAct agent would do naturally. Worse: if Worker A returns an edge case (VIP override), the planner must re-plan — but it already dispatched workers with fixed instructions. The rigid decomposition can't absorb surprises mid-flight.

Verdict: wrong topology. Sub-tasks are sequential and share one domain. Orchestrators shine when sub-tasks are independent and parallel — think "research + write + review" where all three need different tools and can run concurrently.
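One way to test "serial or parallel?" before building anything is to write the sub-task dependencies down and group them into stages. The helper below is a sketch with invented names; the refund dependencies come from the paragraph above, while the enrichment sub-tasks are hypothetical.

```python
from typing import Dict, List, Set


def parallel_stages(deps: Dict[str, List[str]]) -> List[List[str]]:
    """Group sub-tasks into stages; tasks within one stage can run concurrently."""
    remaining = {task: set(needs) for task, needs in deps.items()}
    done: Set[str] = set()
    stages: List[List[str]] = []
    while remaining:
        # A task is ready once everything it needs has finished.
        ready = sorted(t for t, needs in remaining.items() if needs <= done)
        if not ready:
            raise ValueError("dependency cycle")
        stages.append(ready)
        done.update(ready)
        for task in ready:
            del remaining[task]
    return stages


refund = parallel_stages({
    "eligibility": [],
    "refund": ["eligibility"],
    "notify": ["refund"],
})
enrichment = parallel_stages({
    "enrich_company": [],
    "enrich_contact": [],
    "enrich_geo": [],
})
```

The refund task comes out as three stages of one task each, so there is nothing for a planner to dispatch in parallel. The enrichment example collapses into a single stage of three, which is the shape an orchestrator pays off on.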

Shape D — Multi-agent (absurd for this task)

Agent "Policy Expert" checks eligibility
Agent "Finance Agent" issues refund
Agent "Comms Agent" drafts and sends email
  ↕ negotiate on wording, timing, edge cases

Three agents, three context windows, message-passing overhead, consensus protocol. For a routine refund. The coordination cost ($2+, 90 s) exceeds the task value.

What would justify multi-agent here? Only if the agents had genuinely conflicting objectives — e.g., "Policy Expert" wants to deny (loss prevention), "Customer Success Agent" wants to approve (retention), and they must negotiate a compromise. That's a different problem. For a straightforward "check policy, execute, notify" flow, the agents would just… agree with each other. Expensive agreement is still just agreement.

Verdict: wrong topology. No negotiation needed. No disagreement to resolve. Multi-agent is for tasks where agents must argue or iterate on each other's work.

Summary — same task, four shapes, one correct answer

Topology	Works?	Why / why not
Single call	✗	Can't handle ambiguous order status, policy changes, email retry
ReAct loop	✓	Sequential decisions in one domain. Handles surprises.
Orchestrator	✗	Sub-tasks are serial, not parallel. Adds planner overhead for nothing.
Multi-agent	✗	No negotiation needed. 10× cost for zero benefit.

The takeaway: topology selection is not about task difficulty. A "simple" refund needs a loop. A "complex" blog post (research + write + SEO) might only need an orchestrator with no loops inside workers. Match the shape to the structure of the task, not its perceived difficulty.

6) When each topology breaks

Understanding failure modes is how you know when to upgrade — and more importantly, when to downgrade. Most teams over-build; knowing the break points prevents that.

Single call breaks when…

The second step depends on the first step's result (you need a loop)
The tool can return unexpected states the hard-coded path doesn't handle
Verification is required after an action (refund issued → confirm status)
Concrete signal: you find yourself writing massive if/else trees inside the tool to handle "what if the first call returned X?" — that logic belongs in a loop, not in a tool.

ReAct loop breaks when…

The toolbelt grows past ~15 tools (model struggles to select correctly — tool confusion is the #1 ReAct failure mode)
The task spans unrelated domains (code gen + legal research + image creation — no single system prompt can guide all three)
Sub-tasks are naturally parallel but the single loop serializes them (waiting 60 s when 20 s of parallel work would do)
Context window fills up (>50 iterations → truncation → amnesia → loops on forgotten steps)
Concrete signal: the agent keeps picking the wrong tool, or keeps re-doing work it already did 20 iterations ago.

Orchestrator breaks when…

Sub-tasks need each other's intermediate results (not just final outputs) — Worker B needs Worker A's scratchpad, not just its conclusion
Workers need to negotiate or iterate on shared artifacts (e.g., one writes code, another reviews, they go back and forth)
The planner cannot know the correct decomposition upfront (the task is too novel or underspecified to split before starting)
The task is inherently serial (planner adds overhead without parallelism gain)
Concrete signal: the planner's decomposition keeps being wrong, or workers return partial results that the planner can't merge into a coherent answer.

Multi-agent breaks when…

Latency budget is under 30 seconds (coordination overhead alone exceeds it)
The coordination cost exceeds the task's value (a $2 multi-agent run to handle a $0.50 refund)
Agents enter circular negotiation with no convergence criterion (infinite "I disagree" → "but consider…" loops)
Debugging a failure requires reconstructing multi-party message history across 4+ agents
A single knowledgeable agent could do the job alone — Occam's razor
Concrete signal: agents are "agreeing" with each other without adding value, or they deadlock because neither can yield without violating its role prompt.

Pattern: each topology's failure mode is the signal to move to the next one — or, more often, to step back to a simpler one. In production, you'll downgrade topology more often than you upgrade it.

Migration signals — when to change topology

You shipped with one topology. Now production data tells you it's wrong. Here's what the signals look like:

Current topology	Signal to upgrade	Signal to downgrade
Single call	Success rate on real queries < 60%; users complain "it doesn't follow through"	N/A (already simplest)
ReAct loop	Tool confusion > 20% of iterations; latency unacceptable due to serialized sub-tasks	Success rate is the same as single-call because the task doesn't actually need a loop
Orchestrator	Planner decomposition accuracy < 70%; workers keep needing each other's intermediate data	Workers always execute serially; planner adds overhead but no parallelism
Multi-agent	N/A (already most complex)	Agents always agree; no actual negotiation happening; could be orchestrator at 1/5 the cost

The most common upgrade in practice: ReAct → Orchestrator, triggered by toolbelt bloat or latency from serialization. The most common downgrade: Multi-agent → Orchestrator, triggered by the realization that the "agents" are just workers who never actually disagree.
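The numeric thresholds in the migration table can be turned into a simple check to run against production metrics. This is a sketch, not a monitoring tool: the function name, parameter names, and the choice to test the orchestrator's downgrade signal before its upgrade signal are all assumptions, and only the signals with a stated threshold or a yes/no condition are covered.

```python
from typing import Optional


def migration_hint(topology: str,
                   success_rate: Optional[float] = None,
                   tool_confusion: Optional[float] = None,
                   planner_accuracy: Optional[float] = None,
                   always_serial: bool = False,
                   always_agree: bool = False) -> str:
    """Map the table's signals to a suggested move; rates are fractions in [0, 1]."""
    if topology == "single call":
        if success_rate is not None and success_rate < 0.60:
            return "upgrade to react loop"
    elif topology == "react loop":
        if tool_confusion is not None and tool_confusion > 0.20:
            return "upgrade to orchestrator"
    elif topology == "orchestrator":
        if always_serial:                  # planner adds no parallelism
            return "downgrade to react loop"
        if planner_accuracy is not None and planner_accuracy < 0.70:
            return "upgrade to multi-agent"
    elif topology == "multi-agent":
        if always_agree:                   # no real negotiation happening
            return "downgrade to orchestrator"
    return "stay"
```

For example, `migration_hint("react loop", tool_confusion=0.25)` suggests moving to an orchestrator, while the same agent at 10% tool confusion gets "stay".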

7) The four-question topology audit

When someone shows you a system — or you're designing one — ask these four questions in order. They function as a decision tree in interview form.

#	Question	What the answer tells you
1	Does step 2 depend on step 1's result?	NO → single call. YES → you need a loop.
2	How many distinct tool sets / domains does it span?	1 → single ReAct. Multiple → decompose.
3	Do sub-tasks need each other's intermediate state?	NO → orchestrator. YES → multi-agent.
4	What's the latency budget?	< 5 s → single call or aggressive caching. < 30 s → ReAct. < 120 s → orchestrator. Unbounded → multi-agent is feasible.

How to use this in practice:

Question 4 acts as a constraint that can downgrade your topology choice. If the task "needs" an orchestrator but must respond in 3 seconds, you either cache aggressively, pre-compute sub-tasks, or accept reduced capability. Latency is the great simplifier — it forces you toward simpler topologies, which is usually fine because latency-sensitive tasks tend to be simpler anyway.

Example application: "Build a customer support bot that handles refund requests."

1. Does step 2 depend on step 1? YES (need order status before deciding eligibility) → needs a loop.
2. How many domains? ONE (customer service tools) → single ReAct.
3. N/A (not decomposing).
4. Latency budget? 30 seconds acceptable for support → ReAct confirmed.
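Question 4 works as a cap on whatever questions 1 to 3 selected. A small sketch of that cap, assuming the thresholds from the audit table are read as "the most complex topology that fits the budget" and that anything at or above 120 s counts as effectively unbounded (both readings are assumptions, as are the function names):

```python
ORDER = ["single call", "react loop", "orchestrator", "multi-agent"]


def latency_cap(budget_s: float) -> str:
    """Most complex topology that fits the budget, per question 4."""
    if budget_s < 5:
        return "single call"
    if budget_s < 30:
        return "react loop"
    if budget_s < 120:
        return "orchestrator"
    return "multi-agent"


def apply_latency_budget(wanted: str, budget_s: float) -> str:
    """Keep the wanted topology if it fits; otherwise downgrade to the cap."""
    cap = latency_cap(budget_s)
    return wanted if ORDER.index(wanted) <= ORDER.index(cap) else cap
```

For the support bot, `apply_latency_budget("react loop", 30)` leaves the choice unchanged. A task that "needs" an orchestrator but must answer in 3 seconds comes back as "single call", which is the cue to cache, pre-compute, or accept reduced capability.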

Rule of thumb. If you can't explain why you need the next topology up, you don't. Complexity is not a feature — it is a liability you must justify.

8) Where each topology lives in the wild

Single-call systems (high volume, low complexity, tight latency):

- OpenAI "get_weather" function calling demo: one tool, one response, the canonical teaching example
- Notion AI "improve writing": single transform call, no iteration needed
- Linear's AI title suggestions: one structured-output call per ticket
- Stripe's invoice field auto-fill: one extraction call, deterministic

ReAct loop agents (the workhorse; most production agents are here):

- Claude Code: reads files, runs commands, reacts to test output, loops until done
- Cursor agent mode: plans edits, runs builds, inspects errors, retries across files
- Aider: propose-apply-test loop for pair programming, stops when tests pass
- Intercom Fin 2: policy lookup → case lookup → draft → verify → send
- Perplexity Comet: multi-step research loops with iterative retrieval

Orchestrator systems (multi-domain, parallel dispatch):

- Cognition Devin: planner decomposes coding tasks into sub-goals, workers execute
- GitHub Copilot coding agent: plans changes across files, dispatches edits, runs tests in parallel
- LangGraph production apps (Klarna support, Elastic AI Assistant): explicit state-graph with dispatch nodes

Multi-agent systems (rare, adversarial or negotiation-heavy):

- AutoGen deployments: multi-agent debate and conversational planning
- CrewAI production flows: role-based agents with lateral communication
- Adversarial red-teaming setups: attacker agent vs defender agent iterate until breach or timeout

Distribution in production: if you audit 100 shipped agent products, you'll find roughly 40% are single-call (but called "agents" in marketing), 45% are ReAct loops, 12% are orchestrators, and 3% are genuine multi-agent. The industry over-uses the term "multi-agent" for what are actually orchestrators with fancy names.

How to identify topology from the outside: Look at the latency. < 3 s is almost certainly single-call. 8–30 s is usually ReAct. > 30 s with visible parallel progress is an orchestrator. > 60 s with back-and-forth updates is multi-agent. Latency is a reliable proxy because each topology has a characteristic timing signature.
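The timing signatures can be written as a rough classifier. This is a heuristic sketch with invented names; the 3 to 8 second gap that the text leaves open is returned as "unclear" rather than guessed at.

```python
def guess_topology(latency_s: float,
                   parallel_progress: bool = False,
                   back_and_forth: bool = False) -> str:
    """Guess a system's topology from observed latency and visible behaviour."""
    if latency_s < 3:
        return "single call"
    if latency_s > 60 and back_and_forth:
        return "multi-agent"
    if latency_s > 30 and parallel_progress:
        return "orchestrator"
    if 8 <= latency_s <= 30:
        return "react loop"
    return "unclear"  # e.g. 3 to 8 s, or slow with no visible structure
```

A 45-second response with visible parallel progress maps to "orchestrator"; a 45-second response with no visible structure stays "unclear", since the latency alone does not settle it.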

Pause and recall

Close the file. Answer from memory. Check back only after writing your answers.

Name the four agent topologies in order of complexity.
What is the first decision criterion in the tree? (One sentence.)
A task needs 3 tools from the same domain and step 3 depends on step 2. Which topology?
An orchestrator dispatches 4 workers. Worker C needs Worker B's intermediate result (not final output). What breaks?
What's the approximate cost range for a ReAct loop vs an orchestrator, per task?
When does a ReAct loop break down? (Name two conditions.)
The refund task needs 6 iterations. Why is an orchestrator wrong for it?
What does question 4 (latency budget) do to your topology choice?

Interview Q&A

Q1. Given this task: "Summarize this PDF into 3 bullet points" — which topology and why? A. Single call. The task is one step with no branching. The PDF goes in, bullets come out. A loop adds cost and latency for zero benefit. Trap to avoid: "ReAct, because summarization is hard." Difficulty ≠ multi-step. One-shot difficulty is solved by a better model, not a loop.

Q2. Which topology fits this task: "Research competitor pricing, then draft a strategy memo, then get it reviewed for legal compliance"? A. Orchestrator. Three distinct domains (market research, business writing, legal review), each with different tool sets. The sub-tasks run in sequence, so the payoff here is domain separation rather than parallelism: each worker gets its own prompt and tools and needs only the previous worker's final output, not its intermediate reasoning. Trap to avoid: "Multi-agent, because there are multiple roles." Roles ≠ negotiation. Workers report up; they don't argue with each other.

Q3. When would you move from a ReAct agent to an orchestrator? A. When the single agent's toolbelt grows past ~15 tools spanning unrelated domains, when sub-tasks are parallelizable and you're wasting latency serializing them, or when different sub-tasks need different system prompts or model sizes. Key insight: The move is about domain separation and parallelism, not about task difficulty.

Q4. What is the most common mistake teams make when choosing a topology? A. Over-building. They reach for multi-agent or orchestrator when a single ReAct loop would suffice. The result: 5× the cost, 3× the latency, 10× the debug surface — for the same success rate. The correct default is the simplest topology that handles the task's branching needs. Corollary: Under-building (single-call for multi-step work) is the second most common. It manifests as "the model is dumb" complaints that are really architecture complaints.

Q5. How do you debug a failure in an orchestrator vs a ReAct loop? A. In a ReAct loop: trace the iteration log — which think step went wrong? In an orchestrator: first identify which worker failed (or if the planner's decomposition was wrong), then trace that worker's internal loop. Orchestrator debugging has two layers; ReAct has one. This is why you don't orchestrate unless you must. Key insight: Debug cost grows super-linearly with topology complexity.

Q6. When is multi-agent actually the right answer? A. When the task genuinely requires adversarial iteration (writer + critic), negotiation between parties with different objectives, or synthesis from agents that must see and react to each other's intermediate work. If you can define a clean input/output contract between sub-tasks, orchestrator is simpler and sufficient. Real examples: adversarial red-teaming, debate-based fact-checking, collaborative design critique.

Q7. A PM says "we need multi-agent because we have 5 microservices." What's wrong with this reasoning? A. Number of services ≠ number of agents. If those 5 services are called sequentially by one agent with 5 tools, that's a ReAct loop. If they're called in parallel by a planner, that's an orchestrator. Multi-agent only applies if the services need to negotiate with each other — which microservices almost never do. The PM is confusing system architecture with agent topology. Key insight: Agent topology maps to task structure, not infrastructure structure.

Apply now (10 min)

Step 1. Take three real tasks from your current work (or your team's backlog). For each, walk the decision tree from section 3. Write down: which topology, and which decision node determined it. Example format:

Task: "Auto-label incoming GitHub issues based on content"
Q1: Judgment between steps? NO — one classification call suffices.
→ Topology: Single call.

Step 2. For one of those tasks, sketch what happens if you use the wrong topology one level up. What do you pay in cost/latency/debug for no benefit? This builds intuition for "simplest that works."

Step 3. Find one system you've seen called a "multi-agent" system. Apply the four-question audit from section 7. Is it genuinely multi-agent (peers negotiate), or is it an orchestrator wearing a multi-agent label? Most "multi-agent" demos are orchestrators.

Operational memory

This chapter gave you a decision tree for picking among four agent topologies: single call, ReAct loop, orchestrator, and multi-agent. The tree uses three criteria applied in order — judgment between steps, number of domains, and whether domains share state — plus a latency constraint that can downgrade your choice.

The core rule: pick the simplest topology that handles the task's branching and domain needs. Every step up multiplies cost, latency, and debug surface. Most production work lives in single-call (high volume, low complexity) or ReAct (multi-step, one domain). Orchestrators appear when domains diverge and parallelize. Multi-agent is rare and reserved for genuine negotiation.

The refund example crystallized this: a sequential, single-domain, multi-step task is textbook ReAct. An orchestrator adds a useless planner. Multi-agent adds useless negotiation. Single-call can't handle the edge cases. Matching topology to task structure — not to task difficulty — is the skill.

Remember:

Single call → no judgment between steps. Cheapest. Use for one-shot work.
ReAct loop → judgment between steps, one domain, one toolbelt. The workhorse topology for production agents.
Orchestrator → multiple domains, independent sub-tasks, planner dispatches. Parallelism is the payoff.
Multi-agent → sub-tasks share intermediate state and negotiate. Rare in production. If agents always agree, downgrade to orchestrator.
The four-question audit: (1) judgment between steps? (2) how many domains? (3) shared state? (4) latency budget?
Over-building is the #1 topology mistake. If you can't explain why you need the next level, you don't.
Migration signals: tool confusion in ReAct → upgrade to orchestrator. Agents always agree in multi-agent → downgrade to orchestrator.
Debug cost grows super-linearly with topology complexity. This is the hidden tax you pay forever.
Latency is a topology identifier: < 3 s = single call, 8–30 s = ReAct, 30–120 s = orchestrator, > 60 s with back-and-forth = multi-agent.

Bridge. We chose the shape. For any shape beyond a single call, the agent runs a loop. But a loop without an exit is a billing time bomb. The next file opens that loop, names its three beats — Think, Act, Observe — and then wires the wall that stops it. → 02-react-loop-and-stopping.md