09. Caching Patterns Deep Dive: Reading Fast Without Touching the Database

~14 min read. The database is the bottleneck. The cache is the relief valve. Know when to open it.

Built on the ELI5 in 00-eli5.md. The reservation desk — a clipboard showing today's bookings without checking the full catalogue — is your cache. Stale clipboard causes problems. Empty clipboard causes stampedes.

1. Why Caches Exist and What They Trade Away

A cache is a fast, small, expensive store in front of a slow, large, cheap store.

Without cache:
  Request → Database (5–50 ms)  ──→  Response

With cache:
  Request → Cache hit  (0.1 ms) ──→  Response     (99% of requests)
  Request → Cache miss → DB → Cache write → Response  (1% of requests)
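
The effect of the hit rate on average latency can be worked out directly. The sketch below is a rough model, not a benchmark: it assumes a miss pays for the failed cache lookup plus the database query, that the cache write-back is off the critical path, and it picks 20 ms as an illustrative database latency from the 5–50 ms range above. The function name is made up for this example.

```python
def effective_latency_ms(hit_rate: float, cache_ms: float, db_ms: float) -> float:
    """Average read latency for a given cache hit rate.

    Assumption: a miss costs one cache lookup plus one database query.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    miss_ms = cache_ms + db_ms
    return hit_rate * cache_ms + (1.0 - hit_rate) * miss_ms


# 0.1 ms cache from the diagram; 20 ms database latency is an assumed value.
for rate in (0.99, 0.90, 0.50):
    print(f"hit rate {rate:.0%}: {effective_latency_ms(rate, 0.1, 20.0):.2f} ms")
```

Under these assumptions the average is about 0.30 ms at a 99% hit rate, 2.10 ms at 90%, and 10.10 ms at 50%, which is why a few points of hit rate matter more than shaving the cache's own latency.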

The trade-off is consistency. Cached data can become stale. Every caching strategy is a different answer to the question: "How stale is acceptable, and who is responsible for keeping the cache fresh?"

2. Cache-Aside (Lazy Loading)

The application drives all cache interactions. The cache does not know about the database.

Read path:
  App ─→ Cache: GET book:42
    Hit:  return cached value               ◄──── fast path
    Miss: App ─→ DB: SELECT * WHERE id=42
          App ─→ Cache: SET book:42 value TTL=300s
          App ─→ return value               ◄──── slow path (once)

Write path:
  App ─→ DB: UPDATE book SET title=... WHERE id=42
  App ─→ Cache: DELETE book:42              ◄──── invalidate, not update

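Why delete instead of update on write? Deleting is simpler and safer. Updating the cache in place means the writer must rebuild the full cached object, often by re-reading the database. It also opens a race: two concurrent writers can apply their cache updates in the opposite order to their database commits, leaving the older value cached. After a delete, the next read repopulates the entry from the database.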
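
The read and write paths above can be sketched in a few lines. This is a minimal illustration, not a production client: `TTLCache` is a tiny in-process stand-in for Redis or Memcached, the "database" is a plain dict, and the class and method names (`BookStore`, `get_book`, `update_title`) are invented for the example.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """In-process stand-in for an external cache (illustration only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: dict = {}  # key -> (value, expires_at)

    def get(self, key: str) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired entries behave like a miss
            return None
        return value

    def set(self, key: str, value: Any, ttl: float) -> None:
        self._data[key] = (value, self._clock() + ttl)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


class BookStore:
    """Cache-aside: the application talks to both stores; they never talk to each other."""

    def __init__(self, db: dict, cache: TTLCache, ttl: float = 300.0) -> None:
        self.db, self.cache, self.ttl = db, cache, ttl
        self.db_reads = 0  # counter so the effect of the cache is visible

    def get_book(self, book_id: int) -> Optional[dict]:
        key = f"book:{book_id}"
        cached = self.cache.get(key)
        if cached is not None:
            return cached  # fast path
        self.db_reads += 1
        row = self.db.get(book_id)  # stands in for SELECT ... WHERE id = ?
        if row is None:
            return None
        self.cache.set(key, dict(row), self.ttl)
        return dict(row)

    def update_title(self, book_id: int, title: str) -> None:
        self.db[book_id]["title"] = title     # stands in for UPDATE ... WHERE id = ?
        self.cache.delete(f"book:{book_id}")  # invalidate, do not update
```

Two consecutive `get_book(42)` calls cost one database read; after `update_title(42, ...)` the next read misses and picks up the new title.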

When to use it:
- Most production systems start here. Simple to reason about.
- Good when reads vastly outnumber writes.
- The reservation desk clipboard is updated only when a patron asks. Stale data is corrected on the next read after the entry is invalidated or expires.

Weakness: Cold start. After a cache restart, or a deployment that starts with an empty cache, every key is a miss. The database absorbs the spike until the cache refills.

3. Write-Through

Every write goes to the cache and the database synchronously before returning to the client.

Write path:
  App ─→ Cache: SET book:42 value
  App ─→ DB:    UPDATE book SET ... WHERE id=42
  Both must succeed before app responds

Read path:
  App ─→ Cache: GET book:42  →  always a hit (assuming prior write)
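
A minimal sketch of the write-through paths, using plain dicts as stand-ins for the cache and the database. One deliberate difference from the diagram: the diagram lists the cache write first, while this sketch writes the database first so that a failed database write cannot leave a value in the cache that was never persisted. That ordering is a choice made for the example, and the class and method names are hypothetical.

```python
from typing import Any


class WriteThroughStore:
    def __init__(self) -> None:
        self.db: dict = {}     # stands in for the database
        self.cache: dict = {}  # stands in for Redis/Memcached

    def _db_write(self, key: str, value: Any) -> None:
        self.db[key] = value   # stands in for UPDATE ... WHERE id = ?

    def write(self, key: str, value: Any) -> None:
        """Return only after both stores hold the new value."""
        self._db_write(key, value)  # if this raises, the cache is left untouched
        self.cache[key] = value

    def read(self, key: str) -> Any:
        if key in self.cache:
            return self.cache[key]  # the normal case after a prior write
        value = self.db[key]        # only for keys evicted or written elsewhere
        self.cache[key] = value
        return value
```

Real stores cannot update both systems atomically, so whichever order is chosen, the failure of the second write still needs handling (retry, or delete the cache entry).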

Advantages:
- Cache is always warm. No cold-start miss spike.
- Data in cache is never stale with respect to your own writes.

Disadvantages:
- Higher write latency. Each write pays for the cache write plus the database write, in sequence.
- Cache fills with data that is never read. You pay storage cost for unpopular items. Combining write-through with a TTL evicts unused entries over time.

When to use it: Workloads where read-after-write consistency matters and write volume is manageable. Examples: authentication tokens, session state, and the reservation desk, where every booking must be instantly visible.

4. Write-Behind (Write-Back)

Write to the cache immediately, acknowledge to client, then flush to the database asynchronously.

Write path:
  App ─→ Cache: SET book:42 value  →  ACK immediately
  Background: Cache ─→ DB: UPDATE (batched, delayed by seconds–minutes)

Risk:
  Cache crashes before flush  →  data loss

Advantages:
- Lowest write latency possible. The client gets confirmation in under a millisecond.
- Batching writes to the DB reduces database load significantly.

Disadvantages:
- Data loss if cache node fails before flush. Acceptable for analytics counters. Unacceptable for financial transactions.
- Complexity: the cache must manage a write queue and retry logic.
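
The write queue can be sketched for the counter use case. This is an illustration under simplifying assumptions: the database is a dict, `flush()` is called explicitly rather than by a background timer, and retry logic is left out. The class name and methods are invented for the example.

```python
import threading
from collections import Counter


class WriteBehindCounter:
    """View counter: increments hit memory; flush() pushes batched deltas to the DB."""

    def __init__(self, db: dict) -> None:
        self.db = db                  # stands in for the database table
        self.cache = Counter(db)      # totals served to readers
        self._pending = Counter()     # deltas not yet persisted
        self._lock = threading.Lock()

    def increment(self, key: str, amount: int = 1) -> int:
        with self._lock:
            self.cache[key] += amount
            self._pending[key] += amount
            return self.cache[key]    # ACK without touching the DB

    def flush(self) -> int:
        """Persist pending deltas in one batch; returns the number of keys written."""
        with self._lock:
            batch, self._pending = self._pending, Counter()
        for key, delta in batch.items():
            # One write per key, however many increments it absorbed.
            self.db[key] = self.db.get(key, 0) + delta
        return len(batch)
```

Everything held in `_pending` is exactly what would be lost if the process died before the next flush, which makes the size of the risk window equal to the flush interval.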

When to use it: High-frequency counters (views, likes, play counts), non-critical metadata. Netflix view counts, ad impression counters.

5. Invalidation Strategies

Cache invalidation is famously hard. There are only a few viable strategies.

Strategy       Trigger                   Staleness window
──────────────────────────────────────────────────────────
TTL expiry     Timer                     0 to TTL value
Event-based    Write event (pub/sub)     Near-zero (ms)
Version tags   Tag invalidation          Near-zero (ms)
Manual purge   Admin operation           Until someone purges

TTL Math

Setting TTL too short: cache hit rate drops, database load rises. Setting TTL too long: stale data shown to users, consistency errors.

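Optimal TTL ≈ (object change frequency)⁻¹ × acceptable staleness ratio

where the staleness ratio is the fraction of the interval between changes
during which a stale value is tolerable.

Example: book title changes once per week (interval = 168 hours).
         Acceptable staleness: 1 hour, so the ratio is 1/168.
         TTL = 168 hours × 1/168 = 1 hour = 3600 seconds.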

For the reservation desk, a reservation changes when made or cancelled. TTL of 30 seconds is reasonable for display; use event-based invalidation for the actual booking check.
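
Event-based invalidation for the booking check might look like the sketch below. It assumes an in-process `EventBus` standing in for a real pub/sub channel, dicts standing in for the cache and the database, and a topic name (`reservation.changed`) invented for the example.

```python
from collections import defaultdict
from typing import Callable


class EventBus:
    """In-process stand-in for a pub/sub channel (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)  # topic -> list of handlers

    def subscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> None:
        for handler in self._subscribers[topic]:
            handler(message)


cache: dict = {}
db: dict = {"slot:18:00": "free"}
bus = EventBus()

# Every cache node subscribes and drops the key named in the event.
bus.subscribe("reservation.changed", lambda key: cache.pop(key, None))


def read_slot(key: str) -> str:
    if key not in cache:
        cache[key] = db[key]  # miss: load from the database
    return cache[key]


def book_slot(key: str) -> None:
    db[key] = "booked"                       # commit the change first
    bus.publish("reservation.changed", key)  # then tell caches to invalidate
```

With a real broker, delivery is asynchronous and can fail, so a TTL is usually kept as a backstop for events that never arrive.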

Cache Tags (Used by CDNs and Varnish)

Associate a cached response with one or more tags. When a tagged resource changes, invalidate all responses carrying that tag in a single purge operation.

Page "branch/north/catalogue" tagged with ["branch:north", "catalogue:v42"]
When catalogue updates: PURGE tag "catalogue:v42" → all tagged pages evict
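
The bookkeeping behind tag purges is a reverse index from tag to keys. The sketch below shows that idea in a single process; it is not how any particular CDN or Varnish implements it, and the `TaggedCache` class is made up for the example.

```python
from collections import defaultdict


class TaggedCache:
    def __init__(self) -> None:
        self._pages = {}                      # key -> cached response
        self._keys_by_tag = defaultdict(set)  # tag -> keys carrying that tag

    def set(self, key, value, tags):
        self._pages[key] = value
        for tag in tags:
            self._keys_by_tag[tag].add(key)

    def get(self, key):
        return self._pages.get(key)

    def purge_tag(self, tag):
        """Evict every entry carrying `tag`; returns how many were removed."""
        removed = 0
        for key in self._keys_by_tag.pop(tag, set()):
            if key in self._pages:  # may already be gone via another tag
                del self._pages[key]
                removed += 1
        return removed


edge = TaggedCache()
edge.set("branch/north/catalogue", "<html>north</html>", ["branch:north", "catalogue:v42"])
edge.set("branch/south/catalogue", "<html>south</html>", ["branch:south", "catalogue:v42"])
edge.set("branch/north/hours", "<html>hours</html>", ["branch:north"])
```

Calling `edge.purge_tag("catalogue:v42")` evicts both catalogue pages and leaves the opening-hours page cached.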

6. Cache Stampede and Thundering Herd

These are two faces of the same problem: too many clients request the same missing key simultaneously.

Cache Stampede:
  Key expires at T=0
  1000 requests arrive at T=0
  All find cache miss simultaneously
  All query the database simultaneously
  Database collapses under 1000× normal load

Prevention Strategies

Probabilistic Early Recomputation (PER), also called probabilistic early expiration: As the TTL nears expiry, each read has a small probability of recomputing the value early. Recomputation is spread over time, so no single expiry moment triggers a stampede.

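# Python sketch; cache, fetch_from_db, threshold, recompute_probability and NEW_TTL are placeholders
import random

remaining_ttl = cache.ttl(key)  # seconds left before the entry expires
if remaining_ttl < threshold and random.random() < recompute_probability:
    # Refresh early on a small fraction of reads, before the key expires for everyone.
    value = fetch_from_db(key)
    cache.set(key, value, ttl=NEW_TTL)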

Mutex / Lock on miss: First requester acquires a lock and fetches from DB. Others wait or return stale data.

Miss → acquire lock:
  Lock acquired:  fetch DB, populate cache, release lock
  Lock failed:    return stale value (if available) or wait
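
The lock-on-miss flow can be sketched for a single process with one lock per key. This version implements only the "wait" branch: there is no stale fallback and no TTL, and the per-key lock table is never pruned. In a multi-server deployment the lock would have to live in a shared store instead. The `SingleFlightCache` name is invented for the example.

```python
import threading


class SingleFlightCache:
    def __init__(self, loader):
        self._loader = loader           # function that reads the database
        self._values = {}
        self._locks = {}                # one lock per key
        self._guard = threading.Lock()  # protects the two dicts above

    def get(self, key):
        with self._guard:
            if key in self._values:
                return self._values[key]
            lock = self._locks.setdefault(key, threading.Lock())
        with lock:  # only one thread per key gets past here at a time
            with self._guard:
                if key in self._values:  # filled while we were waiting
                    return self._values[key]
            value = self._loader(key)    # the single database call
            with self._guard:
                self._values[key] = value
            return value
```

However many threads miss on the same key at once, the loader runs once; the rest block on the key's lock and then read the freshly cached value.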

Background refresh: A separate process refreshes hot keys before they expire. Application always reads from cache.

Thundering Herd applies the same scenario to service restarts: all clients reconnect simultaneously. Mitigate with jittered reconnect delays and slow-start traffic ramping.
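
One common way to jitter reconnects is exponential backoff with "full jitter", where each client waits a random time between zero and a capped exponential ceiling. The sketch below assumes that scheme; the function name and the default `base` and `cap` values are illustrative, not taken from any particular client library.

```python
import random


def reconnect_delay(attempt, base=0.5, cap=30.0, rng=None):
    """Seconds to wait before reconnect attempt number `attempt` (0-based).

    Exponential backoff with full jitter: pick uniformly between 0 and the
    capped exponential ceiling, so clients that failed together spread out.
    """
    rng = rng or random  # allow a seeded random.Random for reproducible runs
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0.0, ceiling)


# Example schedule for one client (values differ on every run).
delays = [reconnect_delay(n) for n in range(6)]
```

Without the random component, every client would retry at the same instants and simply recreate the herd a moment later.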

7. Cache Warm-Up

A cold cache after deployment can overload the database. Three strategies:

Pre-populate: script fetches top-N keys from DB before traffic shifts to new deployment.
Shadow traffic: copy of production traffic warms the cache organically before cutover.
Circuit breaker: allow cold traffic but shed load if DB latency exceeds a threshold; cache fills gradually.
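
A pre-populate script can be as small as the sketch below. It assumes the list of hot keys is already known (for example from access logs), that the cache accepts dict-style updates, and that `load_many` is a hypothetical helper wrapping one batched database query.

```python
import time


def warm_cache(cache, load_many, hot_keys, batch_size=100, pause_s=0.0):
    """Pre-populate `cache` with `hot_keys` before traffic is shifted.

    load_many(keys) -> dict of key -> value; stands in for one batched DB query.
    Batching plus an optional pause keeps the warm-up itself from overloading the DB.
    """
    loaded = 0
    for start in range(0, len(hot_keys), batch_size):
        batch = hot_keys[start:start + batch_size]
        rows = load_many(batch)
        cache.update(rows)
        loaded += len(rows)
        if pause_s:
            time.sleep(pause_s)
    return loaded
```

The return value gives the deployment script something to check before it shifts traffic, for instance refusing to cut over if far fewer keys loaded than expected.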

Where this lives in the wild

Twitter / X (Infrastructure) — Massive Redis cluster for home timeline caching. Cache-aside with aggressive TTL prevents DB overload on cache miss.

Facebook (Memcache Team) — Use lease tokens to prevent stampedes: a miss grants one client a lease to fetch from DB; others wait or return stale data.

Cloudflare (CDN Engineering) — Cache tags for edge invalidation purge a tag across 200+ PoPs globally within 150 ms. The reservation desk equivalent: one cancel propagates everywhere.

Uber (Real-time Data) — Write-behind caching for surge pricing. Price updates hit Redis instantly, flush to MySQL in batches. Small data loss risk is acceptable for pricing metadata.

Shopify (Platform) — Cache stampede hit during Black Friday. Mutex locks with stale-while-revalidate were added to prevent recurrence.

Pause and recall

What is the key difference between cache-aside and write-through? Draw the write path for both.
Why does write-behind have lower write latency than write-through? What is the safety trade-off?
Cache stampede: describe the scenario in two sentences and name two prevention strategies.
The reservation desk must always show the latest booking status. Which invalidation strategy fits best and why?

Interview Q&A

Q: Compare cache-aside and write-through. When would you choose each?

Cache-aside is lazy: the application populates the cache on a read miss. Simple and flexible. Use when reads greatly outnumber writes and some staleness is tolerable. Write-through populates the cache synchronously on every write. Higher write latency but no cold-start misses. Use when read-after-write consistency is critical and write volume is manageable.

Common wrong answer to avoid: Saying write-through is strictly better because it is more consistent. Write-through adds a cache write to the latency of every database write and wastes cache space on data that is never read again. For most read-heavy systems, cache-aside with smart invalidation is the better starting point.

Q: What is a cache stampede and how do you prevent it?

A cache stampede occurs when a popular key expires and many concurrent requests simultaneously find a miss, all querying the database at once. Prevention approaches include probabilistic early recomputation, mutex locks on cache miss (with stale fallback), and proactive background refresh of hot keys.

Common wrong answer to avoid: Saying "increase the TTL." A longer TTL makes stampedes less frequent but does not eliminate them, and it increases staleness. The correct answer addresses what happens at the moment of expiry, not how often expiry occurs.

Q: How do you handle cache warm-up after a fresh deployment?

Pre-populate by running a warm-up script fetching the top-N most-read keys before shifting traffic. Alternatively, use shadow traffic to warm organically. Combine with circuit breakers that shed load if DB latency spikes before the cache is warm.

Common wrong answer to avoid: Saying "the cache fills naturally." On high-traffic systems the database cannot absorb full cold-start load. Without warm-up, a deployment can become an outage.

Q: What is write-behind caching and when is it appropriate?

Write-behind writes to the cache immediately and acknowledges the client, then asynchronously flushes to the database. Lowest possible write latency; reduces DB load through batching. Appropriate for non-critical high-frequency counters (view counts, likes) where occasional data loss on cache failure is tolerable.

Common wrong answer to avoid: Recommending write-behind for financial or inventory data. A cache crash before the flush loses confirmed writes. For anything where data loss is unacceptable, use write-through or direct DB writes with cache invalidation.

Apply now (5 min)

Exercise: The reservation desk system gets 50,000 reservation lookups per second and about 200 reservation changes per second. A cache stampede hit production last week when a popular time slot expired. Design a caching strategy: choose a pattern (cache-aside, write-through, or write-behind), an invalidation mechanism, a TTL value, and a stampede prevention mechanism. Justify each choice in one sentence.

Sketch from memory: Draw the cache-aside read path and write path (with invalidation). Then mark exactly where a stampede can happen and where your chosen prevention mechanism kicks in.

Bridge. You have mastered in-memory caching for hot data. Next: where does the cold, large, and cheap data live — and how do modern data lakes handle petabytes? → 10-object-storage-and-data-lakes.md