08. Idempotency & deduplication: do the action once, even if the request shouts twice

~15 min read. Safe retries require the system to remember whether the side effect already happened.

Built on the ELI5 in 00-eli5.md. The retry dose — safe repetition rather than blind repetition — depends on idempotency, which keeps one intended action from happening twice.

1) First picture: repeated request is not repeated permission

Network failure creates uncertainty. The client may not know whether the first action succeeded. If it retries blindly, double side effects can happen.

client sends refund request
      │
      ├── server processes refund
      └── acknowledgment is lost

client thinks "maybe it failed"
      │
      └── sends same refund again

The simple version: Two identical requests are not two business intentions. The user wanted one refund. The transport layer created two messages. Idempotency tells the system to honor the intent, not the message count.

2) What idempotent means in practice

An operation is idempotent if performing it several times leaves the system in the same final state as performing it once. Read-only retrieval is usually idempotent. Charging a card is not naturally idempotent. Creating a ticket is not naturally idempotent. Sending an email is not naturally idempotent. So what can we do?

Attach an idempotency key to the intended action.

intent: refund order ORD-4413 for ₹400
idempotency_key: refund:ORD-4413:400:client_req_98

Now the server can say, "I have already processed this intent." For example, request one arrives. Server starts processing refund. Refund succeeds. Response times out before client sees it. Client retries with same idempotency key. Server checks key store. Existing result found. Server returns previous success response. No second refund.

That is safe retry. The retry dose stayed useful because idempotency existed first.
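
The retry flow above can be sketched in a few lines. This is a minimal in-memory illustration, not a production design: the `RefundService` class, its field names, and the `rf_00001` ID format are assumptions made up for the example, and a real key store would need to be durable.

```python
from dataclasses import dataclass, field


@dataclass
class RefundService:
    """Toy in-memory refund service (hypothetical); a real store would be durable."""
    results: dict = field(default_factory=dict)   # idempotency key -> stored response
    refunds_issued: int = 0                       # counts real side effects

    def refund(self, idempotency_key: str, order_id: str, amount: int) -> dict:
        # Known key: replay the stored response, do not execute again.
        if idempotency_key in self.results:
            return self.results[idempotency_key]

        # New key: perform the side effect once and remember the outcome.
        self.refunds_issued += 1
        response = {
            "status": "completed",
            "order_id": order_id,
            "amount": amount,
            "refund_id": f"rf_{self.refunds_issued:05d}",
        }
        self.results[idempotency_key] = response
        return response


service = RefundService()
key = "refund:ORD-4413:400:client_req_98"

first = service.refund(key, "ORD-4413", 400)   # imagine this response is lost in transit
retry = service.refund(key, "ORD-4413", 400)   # client retries with the same key

print(first == retry)          # the retry sees the original response
print(service.refunds_issued)  # only one refund was executed
```

The client never has to know whether the first attempt landed. It repeats the same key, and the server decides whether to execute or replay.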

3) Deduplication is the operational partner of idempotency

Now what is the difference? Idempotency is the contract. Deduplication is the mechanism that enforces or supports it. Common dedup methods:

request ID store,
message queue dedup window,
tool execution log,
exactly-once application-level guard.

incoming action
    │
    ├── seen key before? ─ yes ─→ return stored result
    └── no ─────────────────────→ execute + store result

The dedup store must keep enough detail. Not only whether the key existed, but what outcome happened. Why? Because a retry may need the original response, not just a bare rejection. For example, a tool execution store keeps:

idempotency key,
status completed,
refund ID rf_20391,
timestamp,
response body.

A duplicate request arrives 40 seconds later. Server replays the stored response. Client sees consistent behavior.
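
The stored record and the replay 40 seconds later might look like the sketch below. The names `ExecutionRecord`, `ToolExecutionStore`, and `issue_refund` are hypothetical, and the clock is passed in as a plain number so the example stays deterministic.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class ExecutionRecord:
    idempotency_key: str
    status: str
    refund_id: str
    timestamp: float      # when the action completed, in seconds
    response_body: dict   # replayed verbatim to duplicates


class ToolExecutionStore:
    def __init__(self) -> None:
        self._records: Dict[str, ExecutionRecord] = {}

    def run(self, key: str, action: Callable[[], dict], now: float) -> dict:
        record = self._records.get(key)
        if record is not None:
            # Seen before: answer from the stored outcome.
            return record.response_body
        body = action()  # first time: execute the side effect
        self._records[key] = ExecutionRecord(
            idempotency_key=key,
            status="completed",
            refund_id=body["refund_id"],
            timestamp=now,
            response_body=body,
        )
        return body


calls = []


def issue_refund() -> dict:
    calls.append("refund")  # stands in for the real side effect
    return {"status": "completed", "refund_id": "rf_20391"}


store = ToolExecutionStore()
original = store.run("refund:ORD-4413:400:client_req_98", issue_refund, now=0.0)
duplicate = store.run("refund:ORD-4413:400:client_req_98", issue_refund, now=40.0)

print(duplicate == original, len(calls))
```

Because the record holds the full response body, the duplicate gets the same answer as the original caller rather than a bare "already seen" rejection.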

4) Exactly-once execution is hard, so design honestly

Teams love the phrase, "exactly once." Reality is tougher. Distributed systems usually give you:

at least once,
at most once, or
application-level exactly-once semantics for a narrow action.

transport world
┌──────────────────────────────┐
│ message may arrive twice     │
│ acknowledgment may be lost   │
│ worker may crash mid-action  │
└──────────────────────────────┘

The practical response: Be precise. Say, "This payment endpoint provides application-level idempotency for 24 hours." That is honest. Do not promise impossible magic.

For example, a worker writes to the database, then crashes before updating the dedup store. A replay arrives and finds no record of the key, so the dedup check alone would let it run again. Now you still need a reconciliation step. This is why money workflows often combine:

idempotency key,
durable transaction log,
downstream reconciliation job.

The vitals monitor should watch duplicate-attempt rates, because rising duplicates often mean transport trouble upstream.
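
The crash scenario and the reconciliation step can be sketched as follows. Everything here is an illustrative assumption: the in-memory `transaction_log` and `dedup_store`, the `crash_before_dedup` flag that simulates the crash, and the `reconcile` function. A real job would read durable storage and might also compare against the payment provider.

```python
from typing import Dict, List, Tuple

# Durable transaction log: written together with the refund itself.
transaction_log: List[Tuple[str, str]] = []   # (idempotency_key, refund_id)
# Dedup store: updated in a separate step that a crash can skip.
dedup_store: Dict[str, dict] = {}


def process_refund(key: str, refund_id: str, crash_before_dedup: bool = False) -> None:
    transaction_log.append((key, refund_id))   # the side effect is now on record
    if crash_before_dedup:
        return                                  # simulated crash: dedup store never updated
    dedup_store[key] = {"status": "completed", "refund_id": refund_id}


def reconcile() -> List[str]:
    """Backfill dedup entries for logged refunds the dedup store missed."""
    repaired = []
    for key, refund_id in transaction_log:
        if key not in dedup_store:
            dedup_store[key] = {"status": "completed", "refund_id": refund_id}
            repaired.append(key)
    return repaired


process_refund("refund:ORD-4413:400:client_req_98", "rf_20391", crash_before_dedup=True)
print(reconcile())   # keys whose outcome had to be backfilled
print(reconcile())   # a second pass finds nothing left to repair
```

The sketch leaves a gap between the crash and the reconciliation run, during which a replay could still slip through. That is the reason to state the guarantee narrowly instead of promising exactly-once.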

5) AI agents need tool-call idempotency more than chat-only apps

Pure chat systems mostly produce text. Agents produce actions. That changes the risk. One loop bug can call the same tool several times. One timeout can replay a dangerous action.

agent plan
  ├── check order
  ├── issue refund
  ├── verify refund
  └── summarize to user

bug: verify step misreads state
  └── issue refund again

See the problem? Tool-call dedup should happen below the agent. Do not trust the agent alone to remember. For example, an AI assistant calls create_jira_ticket twice because the first acknowledgment was delayed. Without dedup, you get duplicate tickets. With a dedup key based on conversation ID, issue type, and normalized summary, the second call returns the first ticket ID. Now the user sees one ticket, not two. The triage desk should mark write tools as idempotency-required dependencies.
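
One way to put dedup below the agent is a thin wrapper around the write tool, sketched here. The `DedupedTool` wrapper, the `ticket_key` helper, and the `TICKET-n` IDs are hypothetical, and `create_jira_ticket` is a local stand-in that does not call any real Jira API.

```python
import re
from typing import Callable, Dict


def ticket_key(conversation_id: str, issue_type: str, summary: str) -> str:
    # Normalize the summary so trivial differences do not defeat dedup.
    normalized = re.sub(r"\s+", " ", summary).strip().lower()
    return f"ticket:{conversation_id}:{issue_type}:{normalized}"


class DedupedTool:
    """Wraps a write tool so dedup happens below the agent."""

    def __init__(self, tool: Callable[[str, str], str]) -> None:
        self._tool = tool
        self._results: Dict[str, str] = {}   # dedup key -> ticket ID

    def __call__(self, conversation_id: str, issue_type: str, summary: str) -> str:
        key = ticket_key(conversation_id, issue_type, summary)
        if key not in self._results:
            self._results[key] = self._tool(issue_type, summary)
        return self._results[key]


created = []


def create_jira_ticket(issue_type: str, summary: str) -> str:
    # Hypothetical stand-in; no real ticketing system is contacted.
    ticket_id = f"TICKET-{len(created) + 1}"
    created.append(ticket_id)
    return ticket_id


tool = DedupedTool(create_jira_ticket)
first = tool("conv-17", "bug", "Checkout page  crashes")
second = tool("conv-17", "bug", "checkout page crashes ")   # agent repeats the call

print(first, second, len(created))
```

The agent only ever sees the wrapped tool, so a loop bug or a delayed acknowledgment cannot reach the underlying write path twice for the same intent.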

6) Dedup windows and key design matter

Poor key design creates new bugs. If the key is too broad, legitimate new requests get blocked. If the key is too narrow, duplicates slip through.

too broad key   = refund:ORD-4413
problem         = blocks later valid partial refund

too narrow key  = random_uuid_every_time
problem         = duplicate retries look new

The simple version: Key design should reflect business intent. For example, a better refund key:

refund:ORD-4413:amount-400:reason-damaged-item

This allows a later separate refund for a different amount or reason, while deduping identical retries. The retry dose becomes safe only because the key encodes intent cleanly.
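
The three key designs can be compared directly. The helper names `broad_key`, `narrow_key`, and `intent_key` are made up for this sketch, and the second refund (150 for a late delivery) is an invented example of a later valid request.

```python
import uuid


def broad_key(order_id: str) -> str:
    # Too broad: every refund on the order shares one key.
    return f"refund:{order_id}"


def narrow_key() -> str:
    # Too narrow: a fresh random key on every attempt.
    return str(uuid.uuid4())


def intent_key(order_id: str, amount: int, reason: str) -> str:
    # Encodes the business intent: which order, how much, and why.
    return f"refund:{order_id}:amount-{amount}:reason-{reason}"


original = intent_key("ORD-4413", 400, "damaged-item")
retry = intent_key("ORD-4413", 400, "damaged-item")
later = intent_key("ORD-4413", 150, "late-delivery")

print(original)
print("retry deduped:", original == retry)
print("later refund allowed:", original != later)
print("broad key blocks later refund:", broad_key("ORD-4413") == broad_key("ORD-4413"))
print("random key misses retry:", narrow_key() != narrow_key())
```

Two deliberate refunds with the same amount and reason would still share this key. The earlier example key adds a client request ID (client_req_98) for exactly that case.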

7) Store outcomes, not just locks

One more senior point. A lock only says, "Someone is working." That is not enough after crashes. Outcome storage is stronger.

bad memory
key exists = true

better memory
key exists = true
status = completed
result = refund_id rf_20391

If you only store a transient lock, a retry after crash may not know what happened. If you store the outcome, you can answer deterministically. That is much better reliability.
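
A sketch of the "better memory" follows, assuming a two-state record (in progress, completed). The `OutcomeStore` class, the `Status` enum, and the wording of the returned messages are illustrative choices, not a prescribed interface.

```python
from enum import Enum
from typing import Dict


class Status(Enum):
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"


class OutcomeStore:
    def __init__(self) -> None:
        self._entries: Dict[str, dict] = {}

    def begin(self, key: str) -> None:
        # Equivalent to taking the lock: someone is working on this key.
        self._entries[key] = {"status": Status.IN_PROGRESS, "result": None}

    def complete(self, key: str, result: str) -> None:
        # The part a bare lock lacks: what actually happened.
        self._entries[key] = {"status": Status.COMPLETED, "result": result}

    def lookup(self, key: str) -> str:
        entry = self._entries.get(key)
        if entry is None:
            return "new: safe to execute"
        if entry["status"] is Status.COMPLETED:
            return f"replay: {entry['result']}"
        return "unknown: started but never finished, reconcile before retrying"


store = OutcomeStore()

store.begin("refund:ORD-4413:400:client_req_98")
store.complete("refund:ORD-4413:400:client_req_98", "refund_id rf_20391")

store.begin("refund:ORD-5520:250:client_req_12")   # worker crashes before completing

print(store.lookup("refund:ORD-4413:400:client_req_98"))
print(store.lookup("refund:ORD-5520:250:client_req_12"))
print(store.lookup("refund:ORD-9999:100:client_req_01"))
```

With only the in-progress flag, the first two keys would look identical to a retry. The stored status and result are what make the answer deterministic.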

Where this lives in the wild

Stripe payment workflows — API reliability engineer: uses idempotency keys so client retries after network failure do not create double charges or double refunds.
GitHub Copilot workspace actions — agent platform engineer: deduplicates repository-modifying operations so repeated tool calls do not apply the same patch twice.
Intercom Fin — workflow automation owner: protects ticket-creation and escalation actions with dedup keys so a support conversation does not open duplicate internal cases.
Klarna assistant — transaction safety engineer: stores action outcomes for payment-adjacent tool calls so late retries can replay the original result safely.
Jira-integrated support bots — enterprise automation engineer: replays the original ticket ID on duplicate create attempts rather than creating multiple incident records.

Pause and recall

Why is a repeated message not the same as a repeated business intention?
What is the difference between idempotency and deduplication?
Why is exactly-once execution a dangerous phrase unless defined carefully?
Why should AI agent tool calls be treated as idempotency-sensitive by default?

Interview Q&A

Q: Why is idempotency the prerequisite for safe retries on side-effecting operations?
A: Without it, the system cannot distinguish a recovery retry from a second real action, so duplicates become possible.
Common wrong answer to avoid: "Because retries are uncommon on write paths." They happen exactly when the system is uncertain.

Q: Why should dedup logic store outcomes instead of only storing a lock bit?
A: Retries need a deterministic answer after crashes or delayed acknowledgments, which a lock alone cannot provide.
Common wrong answer to avoid: "Because outcome storage is faster than locking." Speed is not the main reason.

Q: Why is a randomly generated key per retry a bad idempotency strategy?
A: It makes identical intent look like fresh intent, defeating deduplication entirely.
Common wrong answer to avoid: "Because random keys are hard to debug." Debuggability is secondary to correctness.

Q: Why is application-level exactly-once semantics more honest than claiming global exactly-once execution?
A: Distributed transport can still duplicate messages, so correctness usually comes from business-level dedup and replay rules.
Common wrong answer to avoid: "Because exactly-once is impossible everywhere." Some narrow exactly-once behavior is achievable with careful scoping.

Apply now (5 min)

Exercise. Pick one write tool in an AI workflow. Design an idempotency key for it. Write what fields belong in the key, where you store the result, and how long the dedup window should last.

Sketch from memory. Draw the duplicate-request path. Show where the retry dose enters, where the dedup store checks the key, and where the original result is replayed instead of executing again.

Bridge. Once retries and side effects are safe, we still need to know when the machine should stop acting alone. Next we call the senior doctor through human escalation. → 09-human-escalation.md