04. Workflow patterns — sequential, parallel, DAG, and conditional shapes¶
~18 min read. The workflow graph has nodes (steps) and executors (agents/tools). Now it needs edges — the connections that determine whether steps run in sequence, in parallel, or conditionally. The shape you pick determines latency, failure propagation, and recovery complexity. Pick wrong and you either waste time (false sequencing) or create merge chaos (false parallelism).
Built on the first-principles overview in 00-first-principles.md. The workflow graph — the declared structure of steps and edges — takes concrete shape here. The pressure is durability vs latency: parallel execution reduces wall-clock time but makes checkpointing, merging, and failure recovery harder.
What file 03 established and what remains¶
File 03 assigned executors to steps. Each node in the workflow graph now has a typed contract and a routed executor. But the graph has no edges yet — no specification of which steps depend on which, which can run concurrently, and what conditions determine branching. Without explicit edges, the dispatch loop defaults to sequential execution, which is safe but slow. This file gives the graph its shape.
The workflow that took 34 seconds because nobody drew the edges¶
A customer-onboarding workflow: verify identity → check sanctions list → pull credit data → compute risk score → generate welcome letter → create account records → send notification. Seven steps. Sequential execution. Each step waits for the prior one to finish.
sequential (current):
verify ─→ sanctions ─→ credit ─→ score ─→ letter ─→ records ─→ notify
4s        3s           6s        2s       3s        2s         2s     = 22s total
+ orchestration overhead per step ≈ 34s wall-clock
An engineer draws the actual dependency graph:
actual dependencies:
                   ┌── sanctions (3s) ──┐
verify (4s) ───────┤                    ├──→ score (2s) ─→ letter ─→ records ─→ notify
                   └── credit (6s) ─────┘
Sanctions check and credit pull are independent — they both need only the verified identity. They can run in parallel. The risk score needs both their outputs. Everything after the score is sequential.
parallel-aware:
verify (4s) ──→ [sanctions (3s) ‖ credit (6s)] ──→ score (2s) ─→ letter ─→ records ─→ notify
4s              6s (parallel max)                  2s            3s        2s         2s = 19s
wall-clock savings: 22s → 19s of step time; assuming the same ≈12s of orchestration overhead, ≈34s → ≈31s (about 9% faster, same cost)
The steps didn't change. The executors didn't change. The edges changed. Shape is the free latency optimisation — it costs nothing in tokens or money, only in design attention.
Teacher voice. Default-sequential is the most common waste pattern in workflow orchestration. Any two steps that share no data dependency and no write conflict can run in parallel. The engineer's job is to identify true dependencies and parallelise everything else.
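The step-time arithmetic in the onboarding example can be reproduced with a small critical-path calculation. This is a minimal sketch: the durations come from the example above, while the `DEPENDS_ON` mapping and both function names are illustrative, and orchestration overhead is deliberately left out.

```python
from functools import lru_cache

# Step durations in seconds, taken from the onboarding example.
DURATION = {
    "verify": 4, "sanctions": 3, "credit": 6, "score": 2,
    "letter": 3, "records": 2, "notify": 2,
}

# Each step maps to the steps whose output it needs (its true dependencies).
DEPENDS_ON = {
    "verify": [],
    "sanctions": ["verify"],
    "credit": ["verify"],
    "score": ["sanctions", "credit"],
    "letter": ["score"],
    "records": ["letter"],
    "notify": ["records"],
}


def sequential_latency(duration):
    """Every step waits for the previous one, so latencies simply add up."""
    return sum(duration.values())


def critical_path_latency(duration, depends_on):
    """Finish time of the last step when independent steps overlap."""

    @lru_cache(maxsize=None)
    def finish(step):
        # A step starts once its slowest dependency has finished.
        start = max((finish(dep) for dep in depends_on[step]), default=0)
        return start + duration[step]

    return max(finish(step) for step in duration)


print(sequential_latency(DURATION))                 # 22
print(critical_path_latency(DURATION, DEPENDS_ON))  # 19
```

The gap between the two numbers is the time lost to false sequencing; when they are equal, the graph has no parallelism left to exploit.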
The invariant: edges encode data dependencies, not temporal intuition¶
The chapter protects one rule: an edge between two steps must represent a genuine data dependency (step B reads step A's output) or a side-effect ordering constraint (step B must not fire until step A's side-effect is committed). Any edge that exists for no structural reason wastes latency.
If you cannot name what data step B needs from step A, the edge should not exist. If both steps read from the same prior state without needing each other's output, they are parallel-safe.
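One way to make this rule checkable is to declare what each step reads and writes, then derive whether an edge is justified. The sketch below is an assumption-laden illustration: the `Step` class, the state key names, and both helper functions are hypothetical, and side-effect ordering constraints are not modelled (those still have to be declared by hand).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    reads: frozenset = frozenset()    # state keys this step consumes
    writes: frozenset = frozenset()   # state keys this step produces


def edge_reason(a: Step, b: Step):
    """Return why b must wait for a, or None if no edge a -> b is justified."""
    needed = a.writes & b.reads
    if needed:
        return f"data dependency: {b.name} reads {sorted(needed)} from {a.name}"
    conflict = a.writes & b.writes
    if conflict:
        return f"write conflict on {sorted(conflict)}"
    return None


def parallel_safe(a: Step, b: Step) -> bool:
    """Two steps can run concurrently if neither direction needs an edge."""
    return edge_reason(a, b) is None and edge_reason(b, a) is None


# Hypothetical read/write sets for the onboarding steps.
verify = Step("verify", writes=frozenset({"identity"}))
sanctions = Step("sanctions", reads=frozenset({"identity"}),
                 writes=frozenset({"sanctions_result"}))
credit = Step("credit", reads=frozenset({"identity"}),
              writes=frozenset({"credit_report"}))
score = Step("score", reads=frozenset({"sanctions_result", "credit_report"}),
             writes=frozenset({"risk_score"}))

print(edge_reason(verify, sanctions))
print(parallel_safe(sanctions, credit))   # True
print(parallel_safe(credit, score))       # False
```

If `edge_reason` returns `None` for an edge that exists in the graph, that edge is a candidate for removal unless a side-effect ordering constraint explains it.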
Four patterns — and most workflows combine them¶
Sequential¶
Each step waits for the previous. Latency accumulates honestly. Use when every step genuinely needs the prior step's output.
Strengths: Simplest to reason about. Easiest to checkpoint (one active step at a time). Easiest to debug (linear trace).
Weakness: Wastes time when steps are independent. Serialises work that could overlap.
Use when: Each step materially transforms state that the next step reads. Side-effect ordering is critical (charge card before send confirmation).
Parallel (fan-out / fan-in)¶
Independent branches execute concurrently. A merge step collects all outputs. Wall-clock = max(branch latencies) + merge time.
Strengths: Reduces latency when branches are independent. Cost is unchanged (same total tokens).
Weakness: Merge logic is complex. Failure in one branch must be handled (wait? cancel siblings? proceed with partial results?). Checkpointing must snapshot all in-flight branches.
Use when: Multiple data sources needed independently. Research steps that don't share state. Redundant executors for quality comparison.
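A minimal fan-out / fan-in sketch with Python's `asyncio` shows the wall-clock rule in practice. The `branch` coroutine is a stand-in for a real step, and the sleep durations are arbitrary placeholders rather than measurements.

```python
import asyncio
import time


async def branch(name: str, seconds: float) -> dict:
    """Stand-in for an independent step such as an external API call."""
    await asyncio.sleep(seconds)
    return {name: "ok"}


async def fan_out_fan_in():
    started = time.perf_counter()
    # Fan-out: both branches start together. Fan-in: gather waits for all.
    results = await asyncio.gather(
        branch("sanctions", 0.05),
        branch("credit", 0.10),
    )
    merged = {}
    for result in results:   # merge step: combine the branch outputs
        merged.update(result)
    return merged, time.perf_counter() - started


merged, elapsed = asyncio.run(fan_out_fan_in())
print(merged)   # {'sanctions': 'ok', 'credit': 'ok'}
print(f"elapsed is close to the slowest branch: {elapsed:.2f}s")
```

Elapsed time tracks the slower branch (about 0.10s here) rather than the sum of both, which is the whole benefit of the pattern.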
DAG (directed acyclic graph)¶
Steps form an arbitrary dependency graph with no cycles. Some paths parallelise naturally; others are sequential. The dispatch loop topologically sorts and executes.
Strengths: Expresses real-world dependencies accurately. Allows partial parallelism where available. Supports reuse (one node feeds multiple downstream).
Weakness: More complex to visualise. Checkpoint strategy must handle multiple active fronts. Failure in a shared node affects all downstream paths.
Use when: Dependencies are not purely linear but also not embarrassingly parallel. Most real workflows.
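The topological dispatch described above can be sketched with the standard library's `graphlib.TopologicalSorter` (Python 3.9+). The `execution_waves` helper and the `ONBOARDING` mapping are illustrative. Grouping steps into waves is also a simplification: a real dispatch loop would more likely call `done()` as each individual step finishes instead of waiting for a whole wave.

```python
from graphlib import TopologicalSorter


def execution_waves(depends_on):
    """Group steps into waves; steps in the same wave can run concurrently."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()                # raises CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())   # steps whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)                  # mark finished, unlocking successors
    return waves


# Each step maps to the set of steps it depends on.
ONBOARDING = {
    "verify": set(),
    "sanctions": {"verify"},
    "credit": {"verify"},
    "score": {"sanctions", "credit"},
    "letter": {"score"},
    "records": {"letter"},
    "notify": {"records"},
}

for wave in execution_waves(ONBOARDING):
    print(wave)
# ['verify']
# ['credit', 'sanctions']
# ['score']
# ['letter']
# ['records']
# ['notify']
```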
Conditional (branching)¶
A predicate at a branch point determines which subgraph executes. The unchosen path never fires — saving cost and latency.
Strengths: Avoids executing unnecessary steps. Enables risk-proportional behaviour (high-confidence → fast path; low-confidence → human review path).
Weakness: Branch predicates must be deterministic and testable. Hidden branches (inside prompts) make the workflow unpredictable. Testing must cover all branches.
Use when: Different inputs require different treatment. Confidence thresholds. Error classes. Tenant tiers. Policy variations.
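A branch predicate is easiest to test when it is a small pure function of workflow state, kept outside any prompt. The sketch below assumes a hypothetical `risk_score` field where a higher value means higher risk, plus an arbitrary default threshold of 0.7; neither comes from the source.

```python
from typing import Literal

Route = Literal["human_review", "send_offer"]


def route_after_policy(state: dict, threshold: float = 0.7) -> Route:
    """Deterministic predicate: the same state always yields the same route."""
    # Assumption: higher risk_score means higher risk, so it goes to a human.
    if state["risk_score"] > threshold:
        return "human_review"
    return "send_offer"


# One test case per outcome, plus the boundary value.
print(route_after_policy({"risk_score": 0.9}))   # human_review
print(route_after_policy({"risk_score": 0.2}))   # send_offer
print(route_after_policy({"risk_score": 0.7}))   # send_offer
```

Because the function has no hidden inputs, every branch can be exercised with a one-line test, including the boundary case where the score equals the threshold.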
Threaded example — loan workflow graph with mixed patterns¶
From files 01–03, the loan-approval workflow. Now with full edge specification:
┌──────────────────────────────────────────────────────────────────────────┐
│                                                                          │
│  [1. parse docs] ─→ [2. check eligibility]                               │
│                            │                                             │
│                            ▼                                             │
│                        eligible? ─── no ──→ [reject + notify] ──→ END    │
│                            │ yes                                         │
│                            ▼                                             │
│        ┌── [3a. credit bureau] ──┐                                       │
│        │                         │                                       │
│        └── [3b. internal hist.] ─┴──→ [3c. score] ─→ [3d. policy]        │
│                                                            │             │
│                                                            ▼             │
│                                                   score > threshold?     │
│                                                    ┌───────┴────────┐    │
│                                                    ▼                ▼    │
│                                                [4. human        [5. send │
│                                                 review]          offer]  │
│                                                    │                │    │
│                                                    └───────┬────────┘    │
│                                                            ▼             │
│                                                       [6. create         │
│                                                        records]          │
└──────────────────────────────────────────────────────────────────────────┘
Pattern inventory:
- Steps 1→2: sequential (2 needs 1's output)
- After eligibility: conditional (ineligible → reject path; eligible → continue)
- Steps 3a ‖ 3b: parallel (independent data sources)
- Steps 3c→3d: sequential (3d validates 3c's output)
- After policy check: conditional (high risk → human; low risk → auto-approve)
- Steps 5→6: sequential (records need the offer details)
This is a mixed-pattern DAG with conditional branches — the most common shape in production. No single pattern describes the whole workflow.
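The loan graph can also be written down as data, which makes the mixed shape easy to inspect and makes each branch straightforward to cover in tests. This is a sketch only: the step identifiers, the `eligible` and `high_risk` state keys, and the `steps_that_fire` helper are hypothetical, and the traversal reports which steps fire without modelling the wait at the fan-in.

```python
# Unconditional edges: step -> next steps.
EDGES = {
    "parse_docs": ["check_eligibility"],
    "credit_bureau": ["score"],
    "internal_history": ["score"],
    "score": ["policy"],
    "human_review": ["create_records"],
    "send_offer": ["create_records"],
}

# Conditional edges: step -> function(state) -> next steps.
CONDITIONAL = {
    "check_eligibility": lambda s: (
        ["credit_bureau", "internal_history"] if s["eligible"] else ["reject_notify"]
    ),
    "policy": lambda s: ["human_review"] if s["high_risk"] else ["send_offer"],
}


def steps_that_fire(state, start="parse_docs"):
    """Walk the graph breadth-first and list every step that would execute."""
    fired, frontier = [], [start]
    while frontier:
        step = frontier.pop(0)
        if step in fired:          # a fan-in target is reached from two branches
            continue
        fired.append(step)
        if step in CONDITIONAL:
            frontier.extend(CONDITIONAL[step](state))
        else:
            frontier.extend(EDGES.get(step, []))
    return fired


print(steps_that_fire({"eligible": False, "high_risk": False}))
# ['parse_docs', 'check_eligibility', 'reject_notify']
print(steps_that_fire({"eligible": True, "high_risk": True}))
# ['parse_docs', 'check_eligibility', 'credit_bureau', 'internal_history',
#  'score', 'policy', 'human_review', 'create_records']
```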
The merge problem — where parallelism gets hard¶
Fan-out is easy. Fan-in (merging) is where the complexity hides.
| Merge strategy | When to use | Risk |
|---|---|---|
| Wait for all | All branches are required for the next step | One slow branch delays everything |
| Wait for first | Redundant executors; first valid result wins | Wasted compute on losing branches |
| Wait for quorum | Majority agreement needed (voting) | Minority branch may have the right answer |
| Proceed with partial | Some branches are optional enrichment | Next step must handle missing data |
For the loan workflow: step 3c (risk scoring) needs both credit bureau data AND internal history. The merge strategy is "wait for all" — neither is optional. If the credit bureau times out, the workflow cannot proceed with partial data (regulatory requirement). The dispatch loop must wait, retry, or escalate.
Teacher voice. The merge policy is not a detail — it is often the hardest design decision in a parallel workflow. "Wait for all" is the default, but it means your P95 latency is determined by your slowest branch plus your flakiest API. Design the merge before designing the fan-out.
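The four merge strategies from the table map onto a handful of `asyncio` primitives. The sketch below is one possible reading, not a reference implementation: the helper names are made up, "first finished" stands in for "first valid", and the branch values and delays are placeholders.

```python
import asyncio
from collections import Counter


async def wait_for_all(tasks):
    """Every branch is required: return all results, or raise on a failure."""
    return await asyncio.gather(*tasks)


async def wait_for_first(tasks):
    """Redundant executors: take the first branch to finish, cancel the rest."""
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    return next(iter(done)).result()


async def wait_for_quorum(tasks, quorum):
    """Voting: return a value as soon as `quorum` branches agree on it."""
    votes = Counter()
    try:
        for finished in asyncio.as_completed(tasks):
            value = await finished
            votes[value] += 1
            if votes[value] >= quorum:
                return value
        raise RuntimeError("no value reached quorum")
    finally:
        for task in tasks:
            task.cancel()   # no effect on branches that already finished


async def proceed_with_partial(tasks, timeout):
    """Optional enrichment: keep whatever succeeded before the deadline."""
    done, pending = await asyncio.wait(tasks, timeout=timeout)
    for task in pending:
        task.cancel()
    return sorted(t.result() for t in done if t.exception() is None)


async def branch(value, seconds):
    await asyncio.sleep(seconds)
    return value


async def demo():
    def make(specs):
        return [asyncio.create_task(branch(v, s)) for v, s in specs]

    first = await wait_for_first(make([("slow", 0.2), ("fast", 0.01)]))
    partial = await proceed_with_partial(
        make([("core", 0.01), ("extra", 0.5)]), timeout=0.1)
    voted = await wait_for_quorum(
        make([("approve", 0.01), ("reject", 0.02), ("approve", 0.03)]), quorum=2)
    everything = await wait_for_all(make([("a", 0.02), ("b", 0.01)]))
    return first, partial, voted, everything


print(asyncio.run(demo()))   # ('fast', ['core'], 'approve', ['a', 'b'])
```

For the loan workflow only `wait_for_all` applies; the other three are shown because the choice between them changes both latency and what the next step must tolerate.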
Failure propagation by pattern¶
Each pattern has a characteristic failure shape:
| Pattern | Failure shape | Recovery strategy |
|---|---|---|
| Sequential | Single point of failure per step; failure stops the chain | Checkpoint before each step; resume from last good checkpoint |
| Parallel | One branch failure can block the merge | Retry the failed branch independently; other branches' checkpoints are preserved |
| DAG | Shared-node failure affects all downstream paths | Checkpoint at every node; replay only the failed subgraph |
| Conditional | Wrong predicate sends the workflow down the incorrect path | Checkpoint before the branch; correction requires rollback to branch point |
In the loan workflow: if step 3a (credit bureau) fails while 3b (internal history) succeeds, only 3a needs to retry. Step 3b's checkpoint is preserved. This is why parallel decomposition at step 3 is better than a monolithic "gather all data" step — failure recovery is granular.
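Granular recovery at the fan-out can be sketched as a retry loop that re-runs only the branches without a checkpoint. Everything here is illustrative: the in-memory `checkpoint` dict stands in for durable storage, the branch functions and their return values are invented, and the first-call timeout is simulated.

```python
import asyncio


async def run_branches_with_retry(branches, checkpoint, max_attempts=3):
    """Run branches concurrently; retry only those with no checkpoint yet."""
    for _ in range(max_attempts):
        pending = [name for name in branches if name not in checkpoint]
        if not pending:
            break
        results = await asyncio.gather(
            *(branches[name]() for name in pending), return_exceptions=True
        )
        for name, result in zip(pending, results):
            if not isinstance(result, BaseException):
                checkpoint[name] = result   # successful output is preserved
    missing = [name for name in branches if name not in checkpoint]
    if missing:
        raise RuntimeError(f"branches still failing after retries: {missing}")
    return checkpoint


calls = {"credit_bureau": 0, "internal_history": 0}


async def credit_bureau():
    calls["credit_bureau"] += 1
    if calls["credit_bureau"] == 1:
        raise TimeoutError("bureau timed out")   # simulated transient failure
    return {"bureau_score": 700}                 # made-up value


async def internal_history():
    calls["internal_history"] += 1
    return {"late_payments": 0}                  # made-up value


checkpoint = asyncio.run(run_branches_with_retry(
    {"credit_bureau": credit_bureau, "internal_history": internal_history}, {}))
print(calls)   # {'credit_bureau': 2, 'internal_history': 1}
```

The call counts show the point: the failed branch ran twice, while the branch that succeeded was never repeated.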
Operational signals¶
| Signal | Meaning |
|---|---|
| Healthy: parallel branches complete within similar time windows | Fan-out is balanced; no single branch dominates latency |
| First degrading: merge step latency increases while most branch times are stable | One branch is getting slower; the merge waits for the straggler |
| Misleading: total step count is low but wall-clock is high | False sequencing — steps that could parallelise are waiting unnecessarily |
| Expert inspects: branch coverage in production vs test | Conditional branches that never fire in tests but fire in production indicate untested paths |
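The branch-coverage check in the last row reduces to a set comparison between what production telemetry reports and what the test suite exercises. A minimal sketch, assuming hypothetical branch labels and made-up counts:

```python
def untested_production_branches(production_counts, tested_branches):
    """Branches that fired in production but have no corresponding test."""
    return sorted(
        branch
        for branch, count in production_counts.items()
        if count > 0 and branch not in tested_branches
    )


# Made-up telemetry: how often each branch outcome fired in production.
production_counts = {
    "eligible:yes": 120,
    "eligible:no": 8,
    "threshold:human_review": 3,
    "threshold:send_offer": 117,
}
# Branch outcomes that the test suite currently forces.
tested_branches = {"eligible:yes", "threshold:send_offer"}

print(untested_production_branches(production_counts, tested_branches))
# ['eligible:no', 'threshold:human_review']
```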
Boundary of applicability¶
Sequential excels when side-effect ordering is critical and each step materially transforms state for the next. Don't parallelise what must be ordered.
Parallel excels when multiple independent data sources are needed and merge logic is well-defined. The savings scale with branch count and latency.
DAG excels for real-world workflows where dependencies are neither purely linear nor embarrassingly parallel. Most production workflows are DAGs.
Conditional pathology: over-branching creates a combinatorial explosion of test cases. More than 3–4 nested conditions make the workflow hard to reason about. Flatten branches or add an explicit routing step that collapses conditions.
Where this lives in the wild¶
- LangGraph — nodes and edges form an explicit graph. Conditional edges use Python functions that evaluate state. Parallel branches are expressed as multiple outgoing edges from a single node.
- Temporal — in the Python SDK, workflow functions use `asyncio.gather()` for parallel activities and `if/else` for conditional paths. The SDK makes the graph shape visible in code.
- Inngest — step functions implicitly create a DAG. Steps run in the order they are awaited; steps that do not depend on each other's output can be started together (for example with `Promise.all` in the TypeScript SDK) so that they run in parallel.
- Apache Airflow — DAGs are the core abstraction. Tasks define dependencies explicitly. The scheduler handles parallelism within the DAG's constraints.
- OpenAI Agents SDK — handoff chains create sequential flows; parallel tool calls within an agent create implicit fan-out/fan-in.
Recall¶
- What invariant determines whether an edge should exist between two steps?
- In the onboarding example, what latency savings did parallelising sanctions + credit provide?
- What are the four merge strategies for parallel branches?
- Why is the merge policy often harder to design than the fan-out?
- How does failure recovery differ between sequential and parallel patterns?
- What does the signal "low step count but high wall-clock" indicate?
Interview Q&A¶
Q: How do you decide whether two steps should be sequential or parallel?
A: Check for data dependency: does step B read step A's output? Check for write conflict: do both modify the same state? If neither applies, they are parallel-safe. The edge should only exist if removing it would allow step B to execute without the data it needs.
Common wrong answer to avoid: "Parallelise everything for speed." Parallelism without a well-defined merge strategy creates more problems than sequential execution. And some steps have genuine ordering requirements (side-effect before notification).
Q: What's the most common latency waste in workflow orchestration?
A: False sequencing — steps that have no data dependency on each other but are executed serially because the team wrote them in order. The fix is free: identify true dependencies, parallelise the rest.
Common wrong answer to avoid: "Slow models." Model latency is often secondary to unnecessary sequencing. A 3s model call run in parallel with a 6s API call costs 6s total; sequentially it costs 9s.
Q: When should you avoid parallelism even if steps are independent?
A: When the merge strategy is undefined, or when downstream steps would have to handle partial results and the team hasn't built that logic yet. Also when shared resources (rate-limited APIs, database connection pools) would be overwhelmed by concurrent access. Parallelism without merge discipline creates worse failures than plain sequential execution.
Common wrong answer to avoid: "When it's too complex to implement." Complexity is a real cost but not the primary concern — undefined merge behaviour and resource contention are.
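When a shared resource is the concern, parallelism does not have to be all or nothing; a semaphore can cap how many branches run at once. The sketch below uses `asyncio.Semaphore`; the `run_bounded` helper, the `call_api` stand-in, and the limit of 2 are illustrative choices.

```python
import asyncio


async def run_bounded(jobs, limit):
    """Run independent jobs in parallel, but never more than `limit` at once."""
    semaphore = asyncio.Semaphore(limit)
    running = peak = 0

    async def guarded(job):
        nonlocal running, peak
        async with semaphore:          # blocks while `limit` jobs are in flight
            running += 1
            peak = max(peak, running)  # track the highest observed concurrency
            try:
                return await job()
            finally:
                running -= 1

    results = await asyncio.gather(*(guarded(job) for job in jobs))
    return results, peak


async def call_api(i):
    """Stand-in for a call to a rate-limited API."""
    await asyncio.sleep(0.01)
    return i


jobs = [lambda i=i: call_api(i) for i in range(6)]
results, peak = asyncio.run(run_bounded(jobs, limit=2))
print(results, peak)   # [0, 1, 2, 3, 4, 5] 2
```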
Q: How do you test conditional branches that rarely fire in production?
A: Construct synthetic test cases that force each branch predicate to every outcome, so that each path runs at least once. Monitor branch coverage in production telemetry. Flag any branch that has fired in production but has no corresponding test case: that's an untested production path.
Common wrong answer to avoid: "Test the happy path and log the rest." Untested branches are where production incidents hide. They must have test coverage proportional to their blast radius.
Design/debug exercise (10 min)¶
Modeled example. The loan workflow has: 3 sequential edges (parse→eligibility, score→policy, offer→records), 1 parallel pair (credit ‖ history), 2 conditional branches (eligible?, threshold?). Each conditional has a test case for both outcomes. The merge after parallel branches uses "wait for all" because both data sources are regulatory requirements.
Your turn. Take a workflow you've built. Draw the dependency graph. Label each edge: data-dependency or side-effect-ordering. Identify any false-sequential edges (steps that could parallelise). Specify the merge strategy for any fan-in point.
From memory. Sketch the four patterns (sequential, parallel, DAG, conditional). Draw the loan workflow's full DAG with its two conditional branches. Label the merge strategy at the fan-in after steps 3a and 3b.
Operational memory¶
This chapter established that workflow edges must represent genuine data dependencies or side-effect ordering constraints — any other edge wastes latency. The four patterns (sequential, parallel, DAG, conditional) combine in real workflows; most production workflows are mixed-pattern DAGs with conditional branches.
The merge problem is the hidden complexity of parallelism. Every fan-out creates a merge point where the dispatch loop must decide: wait for all? First wins? Quorum? Partial? The merge strategy determines P95 latency and failure behaviour more than any other workflow design choice.
Failure propagation differs by pattern: sequential fails at one point and resumes linearly; parallel allows independent branch retry; DAGs require subgraph replay; conditionals may need rollback to the branch point. Choosing a pattern also means choosing its recovery strategy.
Remember:
- Edges encode data dependencies. If you can't name what data B needs from A, the edge shouldn't exist.
- Default-sequential is the most common waste — audit every edge for necessity.
- Fan-out is easy; merge is where the complexity lives. Design the merge before the fan-out.
- "Wait for all" means your P95 = slowest branch + flakiest API. Plan accordingly.
- Each pattern has a characteristic failure shape. Pick the pattern whose failure mode you can tolerate.
- Conditional branches must have testable predicates and production coverage monitoring.
- The loan workflow is a mixed DAG: sequential where ordering matters, parallel where data sources are independent, conditional where risk thresholds determine the path.
Bridge. The graph has shape — nodes, edges, patterns. But as steps execute, they produce intermediate state: retrieved documents, computed scores, approval decisions. Where does that state live? How do steps share it without drowning each other in context? Next: state and context management across workflow steps. → 05-state-context-management.md