00. Redis: ELI5

Redis is a tiny brain that thinks in microseconds. But it forgets everything if you don't tell it to remember.

Picture a server with two stores beside it.

One store is the disk database. Big shelves. Long aisles. Every request walks the aisles. Reads take milliseconds. Sometimes tens of milliseconds.

The other store is a single small desk right next to the server. Everything on the desk fits in RAM. One clerk sits at the desk. He handles one request at a time. But the desk is so close, and the clerk so fast, that he finishes most requests in microseconds.

That desk is redis. The clerk is single-threaded. The shelf labels are typed — list shelf, set shelf, hash shelf, sorted-set shelf. The clerk knows each shelf's commands by heart.

See. That is the whole mental model. One fast clerk. Typed shelves. Memory-first, disk-optional.

The picture

   ┌──────────── REDIS PROCESS (single-threaded) ────────────┐
   │                                                          │
   │   client A ──┐                                            │
   │   client B ──┼──► [ socket buffer ] ──► event loop        │
   │   client C ──┘                              │             │
   │                                             ▼             │
   │                              ┌──── command dispatch ────┐ │
   │                              │  GET  SET  INCR  LPUSH   │ │
   │                              │  HSET ZADD SADD EXPIRE   │ │
   │                              └────────────┬─────────────┘ │
   │                                           ▼               │
   │            ┌──────────────── KEYSPACE ──────────────┐     │
   │            │  "user:42:session"   → STRING          │     │
   │            │  "feed:42"           → LIST  (800 IDs) │     │
   │            │  "online:users"      → SET             │     │
   │            │  "cart:42"           → HASH            │     │
   │            │  "leaderboard"       → SORTED SET      │     │
   │            │  "rate:ip:1.2.3.4"   → STRING + TTL    │     │
   │            └────────────────────────────────────────┘     │
   │                                           │               │
   │                                           ▼               │
   │                              ┌── optional persistence ──┐ │
   │                              │   RDB snapshot  /  AOF   │ │
   │                              └──────────────────────────┘ │
   └──────────────────────────────────────────────────────────┘

One loop. One command at a time. Each key has a type. Persistence is optional and lives below the loop.

What problem does it solve

Take a plain app. Stateless server, postgres behind it. Every page load runs a few queries. Each query takes 5-20 ms. Multiply by 50 page loads per second. The database starts to sweat.

Now what to do? Put redis in front.

The session lives in redis — one GET, 0.5 ms, no SQL. The rendered product card lives in redis under a key — one GET, no join. The rate-limit counter lives in redis — one INCR, no row lock. The background job queue lives in redis — one LPUSH from web, one BRPOP from worker.

Same database, a fraction of the load. Same page, several times faster. How much depends on your cache hit rate. That is the redis tax cut. The hard rule: redis is not your truth. Truth lives in postgres. Redis just remembers the answers for a little while.
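The "put redis in front" move is usually called cache-aside. Here is a minimal sketch, assuming a client object with redis-py-style `get` and `set(..., ex=...)` methods. The key format, the `load_from_db` callback, and the 60-second TTL are hypothetical choices for illustration, not part of any real app.

```python
import json


def get_product_card(client, product_id, load_from_db, ttl_seconds=60):
    """Cache-aside read: try the cache first, fall back to the database."""
    key = f"product:{product_id}:card"  # hypothetical key naming scheme

    cached = client.get(key)
    if cached is not None:
        # Cache hit: one GET, no SQL query, no join.
        return json.loads(cached)

    # Cache miss: ask the source of truth, then remember the answer briefly.
    card = load_from_db(product_id)
    client.set(key, json.dumps(card), ex=ttl_seconds)
    return card
```

With redis-py, `client` could be a `redis.Redis(...)` connection. The TTL is what keeps redis from becoming the truth: a stale answer expires on its own and the next request goes back to postgres.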

The three things you actually deal with

1) Keys and values, but values are typed

Memcached only holds opaque blobs. Redis holds typed values. The key is always a string. The value is a string, list, hash, set, sorted set, or stream. HyperLogLogs and bitmaps are stored as strings underneath but get their own command families. Each type has its own commands. LPUSH works on lists. ZADD works on sorted sets. Run a command against a key holding the wrong type and you get a WRONGTYPE error. The type is the contract.

This is why redis can do leaderboards, feeds, queues, and counters in one command that other key-value stores need a transaction for.
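To make "the type is the contract" concrete, here is a toy model in plain Python. It is a sketch only, not how redis is implemented: `ToyKeyspace` and `WrongTypeError` are hypothetical names, and the sorted set is a plain dict re-sorted on every lookup, where real redis keeps an ordered structure for O(log N) operations.

```python
from collections import deque


class WrongTypeError(Exception):
    """Stands in for the WRONGTYPE error reply."""


class ToyKeyspace:
    """Toy model only: each key maps to one typed value, checked per command."""

    def __init__(self):
        self._data = {}

    def _check(self, key, expected_type):
        value = self._data.get(key)
        if value is not None and not isinstance(value, expected_type):
            raise WrongTypeError(
                "WRONGTYPE Operation against a key holding the wrong kind of value"
            )
        return value

    def set(self, key, value):
        # SET replaces whatever the key held before, whatever its type.
        self._data[key] = str(value)

    def lpush(self, key, *items):
        lst = self._check(key, deque)
        if lst is None:
            lst = self._data[key] = deque()
        for item in items:
            lst.appendleft(item)  # new items land at the head
        return len(lst)

    def zadd(self, key, member, score):
        zset = self._check(key, dict)
        if zset is None:
            zset = self._data[key] = {}
        zset[member] = float(score)

    def zrevrank(self, key, member):
        zset = self._check(key, dict)
        if zset is None or member not in zset:
            return None
        # Highest score gets rank 0; the toy sorts on every call.
        ordered = sorted(zset, key=lambda m: (zset[m], m), reverse=True)
        return ordered.index(member)
```

Calling `lpush` on a key that was written with `set` raises `WrongTypeError`, which mirrors the contract described above: the command family must match the type already stored at the key.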

2) Commands: one at a time, atomic per command

Each command runs to completion before the next one starts. No locks. No race conditions inside a single command. INCR counter always advances by one — never lost, never doubled. LPUSH feed:42 tweet_id always lands at the head.

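Need multiple commands as one unit? Use MULTI/EXEC or a Lua script. Both run with the rest of the world frozen out: no other client's command runs in between. One caveat: MULTI/EXEC does not roll back. If one queued command fails at runtime, the others still apply.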
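As an illustration of INCR plus MULTI/EXEC, here is a sketch of a fixed-window rate limiter. It assumes a client with a redis-py-style `pipeline(transaction=True)`, which wraps the queued commands in MULTI/EXEC. The function name, key format, limit, and window size are hypothetical; `now` is injectable only to make the sketch easy to test.

```python
import time


def allow_request(client, client_ip, limit=100, window_seconds=60, now=None):
    """Fixed-window limiter: count requests per IP per time window."""
    now = time.time() if now is None else now
    window_id = int(now // window_seconds)
    key = f"rate:ip:{client_ip}:{window_id}"  # one counter per window

    pipe = client.pipeline(transaction=True)  # queued, then sent as MULTI/EXEC
    pipe.incr(key)                            # atomic: never lost, never doubled
    pipe.expire(key, window_seconds * 2)      # old windows clean themselves up
    count, _ = pipe.execute()

    return count <= limit
```

Putting the window number in the key means the EXPIRE can be re-issued on every request without stretching the window. A fixed window lets a burst straddle two windows; sliding-window or token-bucket variants trade more bookkeeping for smoother limits.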

3) Persistence: pick your forgetting

Memory is fast but volatile. Redis offers two ways to remember after a crash:

RDB snapshot. Periodic dump of the whole keyspace to disk. Cheap, but you lose the last N minutes on crash.
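AOF (append-only file). Every write command is appended to a log and replayed on restart. How much you lose depends on the fsync policy: with the default every-second setting (appendfsync everysec), a crash costs about one second of writes.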

A common setup is AOF with every-second fsync plus periodic RDB snapshots (hourly, for example). Pure cache deployments often run with persistence off: if redis dies, the cache simply rebuilds from postgres on the next request.
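The difference between the two is easier to see in a toy model. The sketch below is plain Python, not redis code: `ToyStore` is a hypothetical class, the "log" is an in-memory list standing in for the AOF on disk, and only three commands are supported.

```python
class ToyStore:
    """Toy model only: an in-memory dict plus an append-only command log."""

    def __init__(self):
        self.data = {}
        self.log = []  # stands in for the AOF on disk

    @staticmethod
    def _apply(data, command, key, *args):
        if command == "SET":
            data[key] = args[0]
        elif command == "INCR":
            data[key] = int(data.get(key, 0)) + 1
        elif command == "DEL":
            data.pop(key, None)
        else:
            raise ValueError(f"unsupported command: {command}")

    def execute(self, command, key, *args):
        # Apply the write in memory, then append it to the log.
        self._apply(self.data, command, key, *args)
        self.log.append((command, key, *args))

    def snapshot(self):
        """RDB-style: copy the whole keyspace at one point in time."""
        return dict(self.data)

    @classmethod
    def restore_from_log(cls, log):
        """AOF-style: rebuild state by replaying every logged write in order."""
        store = cls()
        for command, key, *args in log:
            store.execute(command, key, *args)
        return store
```

A snapshot taken before the last few writes knows nothing about them, so restoring from it loses everything after that point. Replaying the log recovers every write that reached the log, which is why the fsync policy decides how much an AOF can lose.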

Where this lives in the wild

Twitter (X) home timelines. Fan-out service writes each new tweet's ID into a per-user redis list capped at the last 800 entries; reading a timeline is a single list fetch, while each new tweet costs O(followers) writes.
GitHub API rate limiter. Sharded redis clusters with Lua scripts enforce per-user request quotas atomically; client-side sharding sends writes to a primary and reads to replicas.
Stack Overflow L1/L2 cache. Redis serves as the shared L2 across web servers; pub/sub messages broadcast invalidations so each server's in-process L1 stays consistent.
Pinterest follower graph. User-id space split into 8192 virtual shards, each a redis DB, holding billions of follow relationships in master-slave pairs with EBS-backed AOF.
Uber hot-driver location store. Redis (with GEO commands) keeps only the most recent position per active driver in memory, fronting hexagonal H3 indexes for fast nearby-driver lookups.
Instagram feed cache. Top ~100 post IDs per user are cached as redis lists; randomized TTLs prevent stampedes when many users' caches expire together.
Snapchat user-service cache (KeyDB). Snap's multithreaded redis fork caches frequently requested user objects in GKE to cut cross-cloud AWS-to-GCP P99 latency from 49-133 ms.
Airbnb listing and search cache. Redis stores listing metadata, review aggregates, and precomputed ticket-trend results so search and pricing endpoints meet latency budgets.
Shopify background-job queues. Each pod's own redis node holds queues for webhook delivery, email sends, payment retries, and inventory syncs, isolating one shop's burst from others.
Slack event-handling bots. SET NX claims each incoming Slack event ID so retried deliveries are silently dropped; per-thread leases ensure only one worker processes a thread.
Stripe rate limiter. A redis cluster counts in-flight requests per type; the limiter reserves a fraction of capacity for critical traffic and rejects non-critical overage with 503.
Sidekiq background jobs (Ruby on Rails ecosystem). Sidekiq pushes JSON-serialised jobs onto redis lists; workers pop with blocking commands and store retry state in sorted sets keyed by execute-after timestamp.
Celery task broker (Python ecosystem). Celery RPUSHes serialised task messages to a redis list and uses a second redis DB as the result backend for fetching task return values.
DoorDash microservices cache (L3). Redis is the third layer of DashPass's caching stack beneath request-local maps and Caffeine in-process caches; runtime knobs flip layers on or off per service.
Netflix Counter Abstraction. Increment/decrement traffic for hundreds of thousands of counters per second per region lands in an in-memory store (EVCache, the memcached cousin of redis) and serves microsecond reads.
Real-time gaming leaderboards. Sorted sets with ZADD/ZREVRANK give O(log N) score updates and O(log N) rank lookups for tens of millions of players on a single instance.
Web analytics unique-visitor counts. HyperLogLog under PFADD/PFCOUNT estimates daily/weekly uniques in a fixed 12 KB per key, instead of hundreds of MB for an exact set.
Django session storage. With django-redis as the cache backend, Django's cache-based session engine keeps logged-in user sessions in redis as serialized values with a TTL, freeing postgres from per-request session lookups during traffic spikes.
Distributed locking for Slack bots. Redis-backed leases using SET key value NX PX ttl coordinate which one of many worker pods owns a given thread or job.
Discord bot gateway caches. Bots use redis hashes to cache guild, channel, and presence data received from Discord's gateway so command handlers do not re-fetch on every event.
Stack Exchange real-time websockets. Redis pub/sub fans out vote counts, new-answer notifications, and inbox badges from the web tier out to all connected websocket servers.
CDN edge cache control. Many edge platforms use redis to share per-URL invalidation timestamps and surrogate keys across PoPs so a single purge propagates in milliseconds.
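Several entries above lean on the same primitive: SET with NX and a TTL, where only the first caller wins. Here is a minimal sketch of event de-duplication in that style, assuming a client with a redis-py-style `set(..., nx=True, px=...)`. The function names, the key format, and the 60-second TTL are hypothetical.

```python
def claim_event(client, event_id, ttl_ms=60_000):
    """Return True only for the first caller to claim this event ID."""
    key = f"event:seen:{event_id}"  # hypothetical key naming scheme
    # SET key value NX PX ttl: succeeds only if the key does not exist yet.
    return bool(client.set(key, "1", nx=True, px=ttl_ms))


def handle_event(client, event_id, process):
    """Run process(event_id) at most once per claim window."""
    if not claim_event(client, event_id):
        return False  # duplicate delivery: drop it silently
    process(event_id)
    return True
```

The TTL bounds how long a claim lives, so memory does not grow forever. The trade-off in this sketch is that a worker that crashes after claiming leaves the event unprocessed until the claim expires; a real lease would also store an owner token and release it on completion.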

Where to go next

The mental model is set. Now we open the box.

01-data-structures-single-thread-loop.md — strings, lists, hashes, sets, sorted sets, streams. The single-threaded event loop and the RESP protocol.
02-commands-persistence-clients.md — day-to-day commands you actually run, RDB vs AOF tradeoffs, pipelining, connection pooling, client libraries.
03-cluster-eviction-cache-stampedes.md — cluster mode and hash slots, eviction policies (LRU/LFU/allkeys/volatile), and how production teams beat the thundering-herd problem.
04-cache-patterns-vs-memcached-interview.md — cache-aside vs write-through vs write-behind, redis vs memcached tradeoffs, and the interview frames that come up at staff level.

Bridge. First the desk itself — what shelves it has, why the clerk is single-threaded, and what RESP looks like on the wire. → 01-data-structures-single-thread-loop.md