05. Hierarchical & Peer-to-Peer — scaling the org chart
~10 min read. When one boss is not enough. When departments need to talk directly.
Built on the ELI5 in 00-eli5.md. The org chart — who talks to whom — can grow vertically with sub-managers or laterally with direct department-to-department links.
1) Hierarchical — managers under the CEO
See the picture first.
   ┌─────────────────┐
   │    Chief CEO    │
   └──────┬──────────┘
     ┌────┴────┐
     ▼         ▼
┌──────────┐ ┌──────────┐
│ Research │ │ Delivery │
│ Manager  │ │ Manager  │
└──┬────┬──┘ └──┬────┬──┘
   ▼    ▼       ▼    ▼
Search Fact  Writer Publisher
       Check
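The tree above can be sketched in a few lines of Python. This is a minimal sketch, not a real framework: `Node` and `run` are hypothetical names, workers are stubbed as strings, and the assumption is that a manager talks only to its own reports and passes one compressed summary upward.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One box in the org chart: a worker (no reports) or a manager."""
    name: str
    reports: list["Node"] = field(default_factory=list)

    def run(self, task: str) -> str:
        # A leaf worker does the task itself (stubbed here as a string).
        if not self.reports:
            return f"{self.name} done"
        # A manager delegates to its own reports only, then compresses
        # their results into a single summary for the layer above.
        results = [child.run(task) for child in self.reports]
        return f"{self.name}[{'; '.join(results)}]"


ceo = Node("CEO", [
    Node("Research Manager", [Node("Search"), Node("Fact Check")]),
    Node("Delivery Manager", [Node("Writer"), Node("Publisher")]),
])

summary = ceo.run("publish article")
print(summary)
```

The CEO never sees Search or Publisher directly. It only sees two manager summaries, which is the point of the extra layer.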
2) Peer-to-peer — departments talk directly
Picture first again.
┌──────────┐     ┌──────────┐
│ Research │◀──▶ │  Writer  │
└─────┬────┘     └────┬─────┘
      │               │
      ▼               ▼
┌──────────┐     ┌──────────┐
│Fact Check│◀──▶ │ Reviewer │
└──────────┘     └──────────┘
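One direct link from the picture, Writer and Reviewer, can be sketched as a bounded loop. This is a hedged illustration only: `peer_exchange`, the word-limit review rule, and the writer's "drop the last word" revision are all made-up stand-ins for real agent calls. What matters is the hard round limit and the message log.

```python
def peer_exchange(
    draft: str, word_limit: int, max_rounds: int = 3
) -> tuple[str, list[tuple[int, str, str]]]:
    """Writer and reviewer talk directly, with a hard round limit."""
    log: list[tuple[int, str, str]] = []
    for round_no in range(1, max_rounds + 1):
        words = draft.split()
        # Reviewer stub: approve once the draft fits the word limit.
        if len(words) <= word_limit:
            log.append((round_no, "reviewer", "approve"))
            return draft, log  # stopping rule: the peers agree
        log.append((round_no, "reviewer", "too long"))
        # Writer stub: revise by dropping the last word, no orchestrator involved.
        draft = " ".join(words[:-1])
        log.append((round_no, "writer", "revised"))
    # Stopping rule: the message budget is spent, so the loop cannot run forever.
    log.append((max_rounds, "system", "round limit reached"))
    return draft, log


final, log = peer_exchange("Agents scale by choosing the right topology", word_limit=5)
print(final)
for entry in log:
    print(entry)
```

Without the log, nobody outside the pair can see what was negotiated. Without `max_rounds`, two peers can ping-pong indefinitely.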
3) The decision ladder
So what to do before drawing a fancy system? Climb this ladder in order. This prevents premature complexity.
1. Can one agent do this reliably? → stay single.
2. Is the task naturally stage-based? → pipeline.
3. Is there one obvious coordinator? → orchestrator-worker.
4. Do specialists need critique or negotiation? → debate or peer-to-peer.
5. Does scale grow beyond one coordinator? → hierarchical.
Look. Each rung adds cost, latency, and observability burden. So do not jump to the top because it looks senior. Start simple. Add structure only where repeated failure demands it.
- If one agent already works, extra roles are theatre.
- If the task is stage-based, use a clean pipeline.
- If one router can delegate well, keep the CEO central.
- If specialists must challenge each other, allow direct exchange.
- If even that coordinator becomes overloaded, add manager layers.
That is the whole logic. The ladder protects you from architecture vanity.
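The ladder can be written down as an ordered lookup. This is one possible reading, sketched under assumptions: the question keys, `LADDER`, and `choose_topology` are hypothetical names, the yes/no answers are the designer's own judgment, and the fallback when no rung matches is an assumption rather than something the ladder states.

```python
# Rungs in climbing order: (question key, topology to stop at).
LADDER = [
    ("one_agent_reliable", "single agent"),
    ("stage_based", "pipeline"),
    ("one_obvious_coordinator", "orchestrator-worker"),
    ("needs_critique", "debate or peer-to-peer"),
    ("exceeds_one_coordinator", "hierarchical"),
]


def choose_topology(answers: dict[str, bool]) -> str:
    """Climb the ladder in order and stop at the first rung answered 'yes'."""
    for question, topology in LADDER:
        if answers.get(question, False):
            return topology
    # Assumption: if no rung matches, fall back to the cheapest option.
    return "single agent"


print(choose_topology({"one_obvious_coordinator": True}))
print(choose_topology({"stage_based": True, "exceeds_one_coordinator": True}))
```

Because the loop returns on the first "yes", a stage-based task stops at pipeline even if someone is also worried about scale. That ordering is what keeps the higher rungs out of reach until the lower ones have been ruled out.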
4) Worked example — choosing topology for three different tasks
Now compare three jobs. Same design question. Different answers.
Task A: Customer support routing
Picture the flow. A customer issue must go to billing, refund, or tech. One router reads the ticket and sends it correctly. That is orchestrator-worker.
Why this fits:
- There is one clear coordination point.
- Specialists do not need long debate with each other.
- Delegation is clean and easy to audit.
Why pipeline would fail: A ticket should not pass through billing, then refund, then tech blindly. That adds waste and confusion.
Why hierarchical would fail: Adding sub-managers would create overhead before scale demands it. You would be managing the managers too early.
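Task A can be sketched as one router and three workers. Treat this as an illustration under assumptions: the keyword rules stand in for whatever classifier or model call a real router would use, and `route`, `handle`, and the worker functions are hypothetical names.

```python
def billing(ticket: str) -> str:
    return f"billing handled: {ticket}"


def refund(ticket: str) -> str:
    return f"refund handled: {ticket}"


def tech(ticket: str) -> str:
    return f"tech handled: {ticket}"


WORKERS = {"billing": billing, "refund": refund, "tech": tech}


def route(ticket: str) -> str:
    """Pick exactly one department (keyword rules as a stand-in classifier)."""
    text = ticket.lower()
    if "refund" in text:
        return "refund"
    if "invoice" in text or "charge" in text:
        return "billing"
    return "tech"  # assumed default when nothing else matches


def handle(ticket: str) -> tuple[str, dict[str, str]]:
    """Orchestrator: route once, call one worker, keep an audit record."""
    dept = route(ticket)
    audit = {"ticket": ticket, "routed_to": dept}
    return WORKERS[dept](ticket), audit


reply, audit = handle("Charged twice on my invoice")
print(reply)
print(audit)
```

Each ticket touches one worker and leaves one audit record. The workers never call each other, which is why the delegation stays easy to trace.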
Task B: Research paper summarization pipeline
Picture the order. OCR extracts text. Parser cleans structure. Summarizer writes the digest. Formatter turns it into the final template. That is pipeline.
Why this fits:
- The steps are sequential and strongly ordered.
- Each stage transforms one artifact into the next.
- Local negotiation adds little value here.
Why peer-to-peer would fail: If OCR and formatter keep chatting directly, state gets messy and step boundaries blur.
Why hierarchical would fail: A manager layer adds summaries where strict ordering was enough. The job is process-heavy, not org-heavy.
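Task B reduces to a list of stages applied in order. A minimal sketch, assuming each stage is a pure function from one artifact to the next. The four stage bodies are toy stand-ins (decoding bytes is not real OCR, and keeping the first sentence is not a real summarizer), and all names are hypothetical.

```python
def ocr(scan: bytes) -> str:
    # Stand-in for real OCR: decode bytes into raw text.
    return scan.decode("utf-8")


def parse(raw: str) -> str:
    # Clean structure: collapse runs of whitespace and line breaks.
    return " ".join(raw.split())


def summarize(text: str) -> str:
    # Stand-in summarizer: keep only the first sentence.
    return text.split(". ")[0].rstrip(".") + "."


def format_digest(summary: str) -> str:
    # Final template.
    return f"SUMMARY: {summary}"


STAGES = [ocr, parse, summarize, format_digest]


def run_pipeline(artifact):
    """Each stage sees only the previous stage's output, in fixed order."""
    for stage in STAGES:
        artifact = stage(artifact)
    return artifact


digest = run_pipeline(b"Agents   scale by\ntopology. More text follows.")
print(digest)
```

No stage knows about any other stage. The order lives in one list, so the step boundaries cannot blur.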
Task C: Legal contract review with 50+ clause types
Now the picture changes. One reviewer cannot own indemnity, privacy, liability, termination, IP, employment, and jurisdiction deeply together. One top coordinator also becomes overloaded fast. That is where hierarchical wins.
Why this fits:
- Clause families form natural sub-domains.
- Local specialists need domain-specific coordination.
- The top layer should receive summaries, not raw clause chatter.
Why orchestrator-worker would fail: One coordinator would become the bottleneck for fifty specialties. Routing and synthesis quality would collapse.
Why peer-to-peer alone would fail: Direct links across many legal specialists become chatty and hard to bound. You need manager layers to compress the conversation.
See the reasoning. Topology is chosen by failure mode, not by taste.
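The bottleneck claim in Task C can be put into rough numbers. This sketch makes several assumptions: fan-out (direct reports per coordinator) is used as a crude proxy for load, the 50 clause specialists are split evenly, and the seven managers correspond to the seven clause families named above. `max_fan_out` is a hypothetical helper.

```python
import math


def max_fan_out(specialists: int, managers: int = 0) -> int:
    """Largest number of direct reports any single coordinator handles."""
    if managers == 0:
        # Flat orchestrator-worker: one router owns every specialist.
        return specialists
    # Two layers: the top node owns the managers, and each manager owns
    # an even share of the specialists (rounded up).
    per_manager = math.ceil(specialists / managers)
    return max(managers, per_manager)


flat = max_fan_out(50)
layered = max_fan_out(50, managers=7)
print(flat, layered)
```

Under these assumptions the busiest coordinator drops from 50 direct reports to 8. The price is one extra layer of summaries, which is why the layer is only worth adding once the flat router is visibly overloaded.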
5) The golden rule — start with the smallest viable org chart
Do not start hierarchical because it feels sophisticated. Do not start peer-to-peer because it resembles human teams. Start with the smallest structure that works. Measure it. Watch where it fails. Then add one layer or one link. Not five. Premature architecture is expensive to unwind. Extra managers mean extra summaries. Extra peer links mean extra coordination drift. Every new edge costs tokens, latency, and debugging time. So what to do? Begin with one agent, a pipeline, or one orchestrator. Promote complexity only after repeated evidence. That keeps the org chart honest. That keeps the department narrow. That keeps your system explainable during failure. Simple, no? Small first. Structure second. Proof before prestige.
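To see why every new edge matters, count them. A small sketch using standard graph counts: a star with one orchestrator has one link per worker, while a full mesh of n agents has n(n-1)/2 links. The function names are hypothetical, and a real system rarely wires a full mesh, so read the mesh number as an upper bound.

```python
def star_edges(workers: int) -> int:
    """One orchestrator linked to each worker."""
    return workers


def full_mesh_edges(agents: int) -> int:
    """Every agent linked directly to every other agent."""
    return agents * (agents - 1) // 2


for n in (4, 10, 20):
    print(n, star_edges(n), full_mesh_edges(n))
```

Star links grow linearly and mesh links grow quadratically. That is why the rule says to add one link at a time, not five.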
Where this lives in the wild
- Large enterprise AI platforms — a platform architect may use hierarchical teams where legal, finance, and HR managers each coordinate their own specialist agents.
- Collaborative coding systems — a software engineer may use peer-to-peer loops where a planner agent and a coder agent iterate without always going through one orchestrator.
- Multi-modal AI products — a multimodal systems engineer often uses a hierarchical design where a vision manager and a language manager each supervise their own sub-specialists.
- Game AI systems — a gameplay engineer may allow NPC agents to negotiate and cooperate directly, with bounded message counts and stopping rules.
- Research automation platforms — a research engineer may use literature managers for search and relevance, plus synthesis managers for writing and citation assembly.
Pause and recall
- What exact failure signal tells you one coordinator is no longer enough?
- Why can peer-to-peer improve iteration but still damage observability?
- How does the decision ladder stop premature topology complexity?
- In the legal review example, why is hierarchical better than one flat router?
Interview Q&A
Q: Why choose hierarchical over orchestrator-worker for a very large specialist system?
A: Because one coordinator eventually becomes a routing and summarization bottleneck. Hierarchical design pushes local coordination downward and keeps the top layer focused on compressed decisions.
Common wrong answer to avoid: "Because hierarchical is more advanced" — maturity is not the reason; bottleneck control and summary compression are.
Q: Why choose peer-to-peer instead of forcing every specialist exchange through one orchestrator?
A: Because some tasks need iterative critique, clarification, and local negotiation. Direct exchange can reduce central bottlenecks when bounded carefully.
Common wrong answer to avoid: "Because direct communication is always faster" — it can also create loops, drift, and traceability problems.
Q: Why not use peer-to-peer for the legal contract example?
A: Because too many clause specialists talking freely create chatty meshes that are hard to bound. Manager layers compress and structure the discussion.
Common wrong answer to avoid: "Because lawyers do not collaborate" — they do; the issue is scaling coordination, not banning collaboration.
Q: Why not start hierarchical from day one just to be safe?
A: Because every extra layer adds latency, summary loss risk, and debugging cost. You should add layers only when observed failures justify them.
Common wrong answer to avoid: "Because simple systems are always best" — simplicity is a starting strategy, not an absolute rule.
Apply now (5 min)
Exercise: Take one workflow you know. Choose between single agent, pipeline, orchestrator-worker, peer-to-peer, or hierarchical. Then justify the choice using the five-step decision ladder.
Sketch from memory: Draw one tree version and one peer-link version of the same workflow. Then mark where latency, loop risk, and summary loss would appear.
Bridge. We have four org chart shapes. But none of them work unless we know HOW to split work. Bad splits create chatty agents with vague jobs. Good splits create specialists with testable outputs. Next: the art of task decomposition. → 06-task-decomposition.md