01. Data structures and the single-thread loop — how Redis stays fast¶

~18 min read. A team adds a per-user rate limiter — 100 requests per minute per user_id. The naive version uses a counter and a TTL. It under-counts during bursts and over-counts during clock drift. The version that survives a Friday traffic spike uses a sorted set, four commands, and one round-trip. By the end of this page you will know exactly why Redis can run that limiter at 200k ops/sec on a single core, and why the very thing that makes it fast — one clerk, one queue — is also the rope you can hang the system with.

Builds on: 00-eli5.md.

The mental model from the ELI5 was a single clerk at a desk. That clerk is not a metaphor. It is a literal design choice — Redis runs the core command path on one thread by deliberate engineering. Everything that makes Redis enviable for the small things — atomic INCR, lock-free counters, RESP simplicity — flows from that choice. Everything that makes Redis dangerous in production — head-of-line blocking on a slow KEYS *, a Lua script that runs eight milliseconds and freezes every other client — also flows from that choice. The data structures sit on top of this loop. Each one is shaped so that the common operation finishes in nanoseconds, because if the clerk pauses, every client waits.

We will trace one task across every section: a per-user rate limiter, 100 requests per minute per user_id, sliding window, using a sorted set. You will see why the sorted-set version is the right answer, why the single-thread loop is what makes its correctness free, and where the costs hide.

1) The single-threaded event loop and why it's fast¶

Redis uses a classic reactor pattern. One thread runs epoll_wait (on Linux) or kqueue (on BSD/macOS), collects ready file descriptors, and dispatches them in a tight loop. No locks. No context switches between commands. No mutex contention. The loop looks like this.

   ┌────────────────────────────────────────────────────────────────┐
   │           REDIS MAIN THREAD — the event loop                   │
   │                                                                │
   │   ┌─ epoll_wait() ────────────────────────────────────┐        │
   │   │  blocks for ~0 ms when clients are active         │        │
   │   └──────────────────┬────────────────────────────────┘        │
   │                      │ ready fds                                │
   │                      ▼                                          │
   │   ┌─ readQueryFromClient() ──────────────────────────┐         │
   │   │  parse RESP from socket buffer into argv[]       │         │
   │   └──────────────────┬───────────────────────────────┘         │
   │                      ▼                                          │
   │   ┌─ processCommand() ───────────────────────────────┐         │
   │   │  lookupCommand("ZADD") → ptr to zaddCommand()    │         │
   │   │  call zaddCommand(c) ── O(log N), no locks       │         │
   │   │  result goes into client's output buffer         │         │
   │   └──────────────────┬───────────────────────────────┘         │
   │                      ▼                                          │
   │   ┌─ sendReplyToClient() (next loop iteration) ──────┐         │
   │   │  drain output buffer to socket                   │         │
   │   └──────────────────────────────────────────────────┘         │
   │                                                                │
   │   then: AOF append (if enabled), expired-key sweep, cron       │
   └────────────────────────────────────────────────────────────────┘

Why this is fast. Each command call is a direct function pointer — lookupCommand is an O(1) hash lookup in a static command table. No method dispatch through a class hierarchy, no JIT warm-up, no GC pause. Memory allocation is via jemalloc with size-class buckets tuned for small objects. Persistence (AOF, RDB) is moved to background threads or fork()ed child processes so the main thread never blocks on disk.

Why this matters for our rate limiter. Imagine two requests from the same user_id arrive in the same millisecond. Each wants to read the user's recent-request set, count it, and add a new entry. In a multi-threaded store you would need a transaction or a row lock. In Redis, the second command literally cannot start until the first has returned — the clerk is busy. So a four-command Lua script — ZREMRANGEBYSCORE, ZCARD, ZADD, EXPIRE — runs as one indivisible block. No TOCTOU race. The correctness of the rate limit is free, paid for by the single thread.

Teacher voice. Single-threadedness is not a limitation Redis is overcoming. It is the property it is exploiting. The day you forget this is the day someone ships a Lua script that scans a million keys, and you find out what a 4-second p99 looks like.

The price is real. One slow command blocks every other client. KEYS * on a 10-million-key database freezes the world for seconds. A Lua script with a tight loop runs to completion before any other request is served. So Redis 6 added I/O threads to parallelise just the network read/write — the command execution itself remains single-threaded by design. KeyDB (Snap's fork, used to cache the User Service in GKE) goes further and parallelises commands too, at the cost of a more complex codebase.

Mini-FAQ. "Then why is Redis advertised as multithreaded since 6.0?" The threading is for socket I/O — reading bytes off the wire and writing replies back. The command itself still runs on one thread. If you blamed contention for a perf problem on Redis 5, it is still on one thread in 7.

2) Strings — the everyday key/value¶

Strings are Redis's primitive. Every key in the keyspace is a string. Most values are strings too. But the underlying implementation is SDS — Simple Dynamic String — a struct that stores the byte length, the allocated capacity, and the buffer. That length-prefix design gives you O(1) STRLEN and safe binary content (a string can hold a JPEG, a protobuf, a JSON blob; the NUL byte is not a terminator). Integer-valued strings get a special OBJ_ENCODING_INT encoding — the value is stored inline as a long, not as ASCII digits. That is why INCR is so cheap: it is a CPU register increment, not a parse-add-format cycle.

key "rate:user:42:count"
  ├─ type:     STRING
  ├─ encoding: int   (because "37" fits a long)
  ├─ value:    37
  └─ ttl:      54 sec remaining

The everyday commands. GET, SET, MGET, MSET, INCR, DECR, INCRBY, EXPIRE, TTL, SETNX, GETSET. All O(1). APPEND and GETRANGE add string-slice semantics; BITCOUNT, SETBIT, BITOP turn a string into a bitmap (useful for "did user X log in on day Y?" — one byte per million users, scaled to 125 KB per day).

Now back to our rate limiter. The first thing every engineer tries is the string counter version.

INCR rate:user:42                  → returns 1
EXPIRE rate:user:42 60             → set 60-second TTL on first hit
... 99 more INCRs ...
INCR rate:user:42                  → returns 101 → reject

It works for fixed windows. It does not work for sliding windows. The bucket flips at the minute boundary — 100 requests in the last second of one minute plus 100 more in the first second of the next minute counts as 200 spread across 120 seconds, but really happened in 2 seconds. Stripe's published rate limiter design avoids this exact failure by counting per-type, per-shape with overage budgets, all coordinated through Redis. The string-counter version is fine for "approximate, generous bucket"; for correctness, you need section 5.

Teacher voice. A string in Redis is a length-prefixed byte buffer. An integer-valued string is one CPU instruction. That is why INCR shows up in every counter and rate-limiter design on the planet — including the one GitHub Engineering describes in their sharded, replicated rate limiter post, where Lua scripts on Redis guarantee atomicity for storage logic.

3) Lists — LPUSH/RPOP, queues, capped lists¶

A Redis list is a doubly-linked sequence of elements. The two ends are O(1); accessing the middle by index is O(N). Push to head with LPUSH, pop from tail with RPOP, and you have a queue. Push to head with LPUSH, trim with LTRIM 0 799, and you have a capped feed. This is exactly how Twitter's home timeline service stores per-user timelines — each new tweet's ID is LPUSHed into a per-follower list capped at the last 800 entries. The fanout is large on write; the read is O(1) per follower.

LPUSH feed:42 tweet_id_991      → "991" at head
LTRIM feed:42 0 799             → keep only top 800
LRANGE feed:42 0 49             → fetch first 50 for display

Lists also serve as the job queue substrate for Sidekiq (Ruby) and the Celery redis broker (Python). Sidekiq pushes JSON-serialised job payloads onto lists named queue:default, queue:critical, etc., and workers BRPOP with a blocking timeout. The blocking pop is the key — the worker thread parks inside Redis until a job appears, no busy-wait. Shopify runs dedicated Redis per pod for these queues so one shop's webhook burst cannot starve another's.

Internally, the list does not stay a linked list. Small lists (under list-max-listpack-size entries, default 128 in Redis 7) live in a listpack — a compact, contiguous byte array. Large lists are a quicklist — a linked list of listpack nodes. The encoding transition is invisible to commands but visible in the memory footprint. A list of 100 small integers takes maybe 600 bytes as a listpack; the same data as separate nodes would be five times that.

Mini-FAQ. "Are lists the right thing for our rate limiter?" No. Counting last-N-events by scanning a list is O(N) and you must scan to evict old entries. The sorted set in section 5 does the same job in O(log N) and gives you ordering by timestamp for free.

4) Hashes — per-field updates, ziplist/listpack memory tricks¶

A hash is a key-to-fields-to-values map. HSET user:42 name "Aisha" tier "gold" credits 1250. The big win over storing JSON in a string is partial update: HINCRBY user:42 credits -50 decrements one field without re-serialising the whole document. The big win over many separate keys is metadata overhead — every Redis key carries a robj header (around 56 bytes); a hash bundles many fields under one key.

Small hashes — under hash-max-listpack-entries (default 128) and with each value smaller than hash-max-listpack-value (default 64 bytes) — are stored as a listpack, the same compact byte array used for small lists. The lookup is linear-scan over the listpack, but for 128 entries that is roughly 200 ns on modern hardware, well under the cost of even a single L3 miss for a hashtable probe. Cross the threshold and Redis converts the encoding to a hashtable, paying memory for O(1) field access.

encoding choice
─────────────────
size ≤ 128 fields AND each value ≤ 64 bytes → listpack (compact, scan)
size  > 128 fields OR  any value  > 64 bytes → hashtable (bigger, O(1))

Where hashes live in the wild. Discord bot gateways cache guild and channel objects as hashes — HGET guild:444 name is faster than parsing a JSON blob and lets you update presence on one field without a write-back. Django session storage in django-redis uses one hash per session, with field-level TTL semantics simulated via key-level EXPIRE. Pinterest's follower model uses Redis Hash to store per-board explicit follower counts and uses Sorted Set with timestamp scores for the actual follower lists.

For our rate limiter, hashes are not the natural fit either. Rate limits care about when events happened, not what fields a user has. We come to the right structure next.

5) Sorted sets — leaderboards, rate limiters¶

A sorted set is a hash table plus a skip list, sharing the same elements. The hash table gives O(1) membership and score lookup. The skip list gives O(log N) ordered iteration. Add an element with ZADD key score member. Remove a range by score with ZREMRANGEBYSCORE. Count members in a score range with ZCOUNT. Read top-K by rank with ZRANGE / ZREVRANGE.

key "rate:user:42"     (sorted set)
  ┌─ skip list (by score) ─┐    ┌─ hash table (by member) ─┐
  │  score=1715958000.123  │    │  "req_a8f3" → 1715958000 │
  │  score=1715958000.301  │    │  "req_b71c" → 1715958000 │
  │  score=1715958001.044  │    │  "req_c92d" → 1715958001 │
  │  ... 97 more ...       │    │  ... 97 more ...         │
  └────────────────────────┘    └──────────────────────────┘

Here is our rate limiter, finally correct. Use now as the score. Use a unique request id (or a high-resolution timestamp) as the member so duplicate-score collisions do not lose entries. On every request, run these four commands in one Lua script (atomic on the single thread).

local now    = tonumber(ARGV[1])      -- current ms
local window = tonumber(ARGV[2])      -- 60000 ms
local limit  = tonumber(ARGV[3])      -- 100
local key    = KEYS[1]

redis.call("ZREMRANGEBYSCORE", key, 0, now - window)   -- evict old
local count = redis.call("ZCARD", key)                  -- count current
if count >= limit then
  return 0                                              -- reject
end
redis.call("ZADD", key, now, ARGV[4])                   -- record this hit
redis.call("PEXPIRE", key, window)                       -- garbage-collect later
return 1                                                 -- allow

Four O(log N) operations on a set of size at most 100. That is ~30 microseconds round-trip including network. The sliding window is exact — every individual request's timestamp is stored, every expired one is dropped, there is no flip-bucket failure mode. This pattern appears in the Redis documentation, in dozens of production rate limiters, and in the sliding-window-log variant of the rate-limit-redis library used by Express and a hundred Node.js services.

Sorted sets also power real-time leaderboards — Discord guild rankings, gaming scoreboards, Reddit's hot-post ordering. ZADD leaderboard 4729 player:42 updates one player's score in O(log N); ZREVRANGE leaderboard 0 9 WITHSCORES fetches the top 10. Pinterest stores users-followed and follower lists as sorted sets keyed by timestamp so "show me the people I followed most recently" is one ZREVRANGE. Twitter's timeline service uses sorted sets in the same shape — score is the tweet's creation time, member is the tweet id, and ZREVRANGE paginates from newest to oldest.

Teacher voice. Whenever you catch yourself saying "I need the top-K by some changing score" or "I need to count events in a sliding window," the answer is almost always a sorted set. Anything else is reinventing the skip list.

Encoding-wise, small sorted sets (under 128 entries, values short) also use a listpack with a sorted layout. Cross the threshold and Redis builds the proper skip-list-plus-hashtable. Our rate-limiter sets, capped at 100 entries each, mostly stay in listpack form — even cheaper memory-wise.

6) Streams — XADD/XREAD, consumer groups¶

Streams are Redis 5's answer to the "we keep abusing lists as message queues" problem. A stream is an append-only log with auto-generated IDs (milliseconds-sequence like 1715958000123-0). XADD stream * field value field value appends. XREAD COUNT 10 STREAMS stream $ reads new entries. The killer feature is consumer groups: many workers can join one group with XREADGROUP, and Redis tracks which entries each consumer has claimed and acknowledged. Unacknowledged entries sit in the PEL — Pending Entries List — and can be reassigned with XCLAIM if a worker dies.

producer:  XADD orders * order_id 4481 user 42 amount 1250
                                  └─ Redis assigns ID "1715958000123-0"

consumer:  XREADGROUP GROUP fulfilment worker_1 COUNT 10 STREAMS orders >
           └─ returns entries not yet claimed by worker_1
              entries move to PEL

consumer:  XACK orders fulfilment 1715958000123-0
           └─ removes from PEL

This is "the celery-killer" only in the sense that it removes two of celery's reasons to exist: durability of the queue and per-consumer offset tracking. In practice, Streams shine when (a) you want at-least-once delivery with explicit ack, (b) you need many consumers reading a single stream in parallel, (c) you want a history of past events to replay. Where they hurt: the PEL is a separate radix tree, and millions of unacknowledged entries balloon memory and slow XREADGROUP. Either you ack aggressively or you set MAXLEN to bound the stream.

In production, Redis Streams underpin dashboards that ingest telemetry (the Redis-published fast data ingest pipeline pattern), event buses for microservices that do not need Kafka's scale, and CDC-lite pipelines where one Redis instance forwards changes to consumers. They are explicitly not a replacement for Kafka at high partition count — one Redis instance is one stream's home; sharding requires you to manage that yourself.

Mini-FAQ. "Are Streams in our rate limiter?" No, but they could be its observability backplane. Every reject decision could be XADDed to an audit stream that a downstream worker reads to populate dashboards. Sorted set for the decision, stream for the audit log.

7) Comparison table¶

Versions: Redis 7.2 on a c7g.large (2 vCPU Graviton, single instance, no cluster, AOF off, payloads ≤ 200 bytes). Numbers approximate, measured by redis-benchmark and corroborated by published case studies. Adjust by half for c7g.xlarge, double for tiny pipe-thin VMs.

Structure	Key ops	Time complexity	Typical ops/sec (single core)	Memory overhead (per entry)	Encoding transition (Redis 7.2)
String (int)	`INCR`, `GET`, `SET`, `EXPIRE`	O(1)	~250k INCR/sec	~56 B header + 8 B value	always int if parsable as long
String (raw)	`GET`, `SET`, `APPEND`	O(1), O(N) for APPEND	~180k GET/sec	~56 B header + SDS length	embstr ≤ 44 B, raw beyond
List	`LPUSH`, `RPOP`, `LRANGE`	O(1) ends, O(N) middle	~200k push/sec	~20 B per entry in listpack	listpack ≤ 128 entries, then quicklist
Hash	`HSET`, `HGET`, `HINCRBY`	O(1)	~180k HSET/sec	~12 B per field in listpack	listpack ≤ 128 fields & values ≤ 64 B
Set	`SADD`, `SISMEMBER`, `SCARD`	O(1)	~180k SADD/sec	~20 B per member in listpack	listpack/intset ≤ 128, then hashtable
Sorted set	`ZADD`, `ZRANGE`, `ZREMRANGEBYSCORE`	O(log N)	~100k ZADD/sec	~64 B per member (skiplist+ht)	listpack ≤ 128, then skiplist+hashtable
Stream	`XADD`, `XREADGROUP`, `XACK`	O(1) append, O(log N) read	~120k XADD/sec	~30 B per entry + PEL overhead	radix-tree macro-nodes always
HyperLogLog	`PFADD`, `PFCOUNT`	O(1)	~250k PFADD/sec	12 KB fixed per key	sparse → dense as cardinality grows

Our four-command Lua rate limiter (ZREMRANGEBYSCORE + ZCARD + ZADD + PEXPIRE) costs roughly 4 × O(log 100) ≈ 28 hash/skip operations per request. On the bench above, a single c7g.large sustains roughly 60k–80k rate-limit decisions/sec — enough for most APIs, and you scale by sharding by user_id once you outgrow it. GitHub's API rate limiter scaled exactly this way: client-side sharded, replicated Redis backend, Lua scripts for atomicity.

Where this lives in the wild¶

These structures show up everywhere — sometimes as the primary store, sometimes as a pure cache in front of Postgres or DynamoDB. The line matters: when Redis is the truth, you need persistence and replication tuned; when Redis is the cache, you tolerate cold restarts.

Companies running Redis (or a fork) as a primary store for ephemeral truth:

Twitter stores home timelines as sorted sets / lists of tweet IDs scored by time; sustains ~39M QPS across 10k+ instances with 105 TB RAM, per High Scalability's published profile.
Pinterest keeps the follower graph as sharded Redis with sorted-set followers and follower counts; AOF every second on EBS, BGSAVE hourly to S3, billions of edges across thousands of shards.
Snapchat (KeyDB) caches User Service objects in a multithreaded Redis fork in GKE to cut cross-cloud P99 from 49–133 ms to 1.5–2.1 ms per the Google Cloud case study.
Netflix EVCache (memcached cousin) runs the Distributed Counter Abstraction's hot path — hundreds of thousands of counter updates per second per region in microseconds.
GitHub runs a sharded, replicated dedicated Redis cluster for API rate limiting with Lua-script atomicity, per their engineering blog on scaling the limiter.
Sidekiq deployments at Shopify, GitHub, Stripe push JSON job payloads onto Redis lists and pop with BRPOP; queues for webhooks, emails, payment retries.
Celery deployments at thousands of Python shops use Redis lists as the broker and a second Redis DB as the result backend.
Discord bots and gateway caches (RainCache, redis-discord-cache) store guild, channel, presence as hashes keyed by snowflake ID across thousands of bot deployments.
Klarna and many e-commerce backends cache Postgres rows in Redis but, per Klarna's own incident report, treat the CDN/Redis cache configuration as a sensitive surface — a wrong header once leaked PII.
Slack bot platforms use SET key value NX PX ttl to claim each Slack event ID, ensuring webhook retries do not double-trigger bots; per-thread leases coordinate workers.

Companies running Redis primarily as a cache or rate-limit substrate in front of a SQL store:

Stack Overflow uses Redis as the shared L2 across web servers, with StackExchange.Redis and Protobuf serialisation, backing per-site key-prefixed L1 in-memory caches.
DoorDash runs Redis Lettuce as the third layer beneath request-local maps and Caffeine in-process caches, with runtime knobs to flip layers per service.
Airbnb migrated from self-managed Redis on EC2 Classic to fully-managed ElastiCache for listings, search, images and payments — multi-AZ, automatic failover.
Instagram caches the top ~100 post IDs per user feed as Redis lists, with randomised TTLs to break stampede correlations.
Uber uses Redis to cache hot driver locations (with GEO commands fronting H3 hex indexes) and as a regional building block of its Global Rate Limiter — though the cross-region control loop is its own thing, per Uber's rate-limiter blog.
Stripe runs Redis-backed rate limiters that count in-flight requests per type and reserve fractional capacity for critical traffic.
Django apps with django-redis keep logged-in user sessions in Redis hashes with TTL — postgres never sees per-request session reads.
Spotify, Instagram, Reddit use Redis HyperLogLog (PFADD/PFCOUNT) to count daily/weekly unique visitors in 12 KB per key instead of giant sets.
DoorDash feature store clients combined client-side cache with Redis to lift hit ratio to ~70%, per their engineering post.
Many CDN edge platforms (Cloudflare, Fastly stack components) use Redis to share per-URL invalidation timestamps so a purge propagates across PoPs in milliseconds.
LangGraph apps (e.g., Klarna's support stack, Elastic AI Assistant) keep short-lived agent state — pending tool calls, memory snippets — in Redis hashes with TTL.
Symfony/Laravel-heavy PHP shops use APCu as L1 and Redis as L2 — the pattern surfaced in production architecture posts well before becoming common Redis advice.

The two lists tell you the deployment posture. Truth-as-Redis shops invest heavily in AOF + replication + cluster failover drills. Cache-as-Redis shops tolerate cold restarts but spend their attention on stampede patterns and TTL randomisation.

Pause and recall¶

Why does a single-threaded server help the correctness of a Lua-script rate limiter, rather than hurt it?
What is the encoding Redis 7 uses for a small hash of 50 fields with short values, and when does it convert to a hashtable?
In the four-command sliding-window rate limiter, what would go wrong if you used score = now and member = now instead of a unique request id?
Why is LPUSH + LTRIM the right shape for Twitter's home-timeline lists, but the wrong shape for a sliding-window rate limiter?
What does the PEL (Pending Entries List) cost you if you forget to XACK?
Roughly how many sliding-window rate-limit decisions per second can one c7g.large sustain in the configuration above, and what is the bottleneck?
Why does Redis 6's "multithreaded" mode not let you run two ZADDs in parallel?
Name three production systems that use sorted sets for their core feature and explain the score they use.

Interview Q&A¶

Q1. Why is Redis single-threaded if multithreaded would obviously go faster? A. Because the cost model is upside-down for an in-memory store. Disk-bound systems benefit from threads because threads can overlap I/O waits; an in-memory store has no waits to overlap. Threads instead introduce lock contention, cache-line bouncing, and context-switch overhead — all more expensive than the work itself for sub-microsecond commands. Single-threaded execution also gives you free atomicity per command and per Lua script. Redis 6 added I/O threads (parallel socket read/write) but kept command execution single-threaded for exactly these reasons. Common wrong answer to avoid: "It's a historical artefact and they should fix it." It is a deliberate engineering choice with measurable benefits, not a legacy bug.

Q2. Design a rate limiter: 100 requests per minute per user_id, sliding window, low latency. A. Use a sorted set keyed by rate:user:<id>. On each request run a Lua script that (a) ZREMRANGEBYSCORE to drop entries older than now - 60s, (b) ZCARD to count the remaining, (c) reject if count ≥ 100, (d) otherwise ZADD with score=now, member=unique request id, and PEXPIRE to garbage-collect the key. The Lua script runs atomically on the single thread; no race condition. Each request is ~30 µs round-trip. Shard by user_id if you outgrow one node. Common wrong answer to avoid: "INCR a counter with a 60-second TTL." That is a fixed window, not sliding. You will under-count bursts that straddle a minute boundary and over-count when clients hit just before and after rollover.

Q3. Your colleague proposes storing each user's session as a JSON string under session:<id>. What is the problem if you need to update last_seen on every request? A. A string update is read-the-whole-blob, parse JSON, mutate, re-serialise, write-the-whole-blob. For a 4 KB session that is 4 KB of network on read plus 4 KB on write, plus parse cost. A hash with HSET session:<id> last_seen <ts> writes ~30 bytes and updates one field server-side. Hashes also let you set field-level encoding tricks (listpack) that pack many fields into ~12 bytes overhead each, so memory is also better. Common wrong answer to avoid: "Strings are fine, Redis is fast either way." It is fast, but the bandwidth and CPU bill are 100× higher for the string approach.

Q4. What happens to other clients when one client runs KEYS * on a 10-million-key database? A. They wait. KEYS is O(N) over the entire keyspace and runs on the single command thread. While it iterates — easily several seconds at 10M keys — no other command is dispatched. Every other client's request piles up in the socket buffer. Latency p99 spikes. The fix is SCAN, which uses a cursor and breaks the work into bounded chunks, yielding the thread between iterations. Common wrong answer to avoid: "Nothing, Redis handles it." The single-thread model means there is no "handles it in parallel."

Q5. When would you reach for a Redis Stream over a list-as-queue? A. When you need durable, ordered, at-least-once delivery to multiple parallel consumers with explicit acknowledgement, and possibly the ability to replay past events. Lists give you one consumer per item via BRPOP — once popped, gone. Streams give you consumer groups with per-consumer offsets and a PEL for redelivery. The trade-off: Streams use more memory per entry, the PEL needs management, and you cannot trivially shard one stream across nodes. Common wrong answer to avoid: "Streams replace Kafka." They replace lists-as-queues at small-to-medium scale; Kafka still wins at billions of events/day with many partitions.

Q6. Why does the four-command rate-limiter Lua script not suffer from a TOCTOU race? A. Because Redis is single-threaded and Lua scripts run atomically on the command thread. From the script's first ZREMRANGEBYSCORE to its last PEXPIRE, no other client's command interleaves. Without Lua, running those four commands as separate round-trips would allow two concurrent requests to both read ZCARD = 99 and both ZADD, ending at 101 — bypassing the limit. The atomicity is only free because of the single thread. Common wrong answer to avoid: "We use a Redis transaction with MULTI/EXEC." MULTI/EXEC queues commands but cannot branch on intermediate values (you cannot read ZCARD then conditionally ZADD). Lua is required for the if-then.

Q7. Sorted set has a hash table and a skip list internally. Why both? A. Each structure pays for what it gives you. The hash table gives O(1) "what is the score of member X?" and O(1) "is member X present?" The skip list gives O(log N) ordered iteration — "give me members with score between A and B" or "give me rank N". Without the hash, score lookup would be O(N). Without the skip list, range queries would be O(N log N). Sharing the same member pointers between the two costs only one extra pointer per entry. That is why sorted-set memory overhead per entry (~64 B) is roughly double a hash entry's. Common wrong answer to avoid: "Redis uses a red-black tree." It uses a skip list, deliberately — simpler implementation, better cache behaviour for range scans, easier concurrency at the cost of a non-issue here.

Q8. Your Redis CPU is pegged at 100% on one core; the other seven are idle. What do you investigate, in what order? A. First, SLOWLOG GET 20 to look for individual slow commands — usually KEYS, SMEMBERS on huge sets, big LRANGE, or a Lua script with a tight loop. Second, INFO commandstats to see which command type dominates total CPU. Third, check for hot keys with redis-cli --hotkeys or sampling. Fourth, consider whether the workload is genuinely command-CPU-bound (true single-thread limit, time to shard or shard with cluster mode) or whether it's network-bound (Redis 6 I/O threads or pipelining might help). Avoid the temptation to "just enable threading" — command execution is still single-threaded; you cannot fix a slow Lua script with more threads. Common wrong answer to avoid: "Move to a bigger instance with more cores." More cores help only if you shard the keyspace; for one node, the bottleneck remains one command at a time.

Apply now (10 min)¶

Step 1 — model the exercise. Take the rate-limiter task. Below is what a one-row design audit looks like for one candidate implementation, so you can copy the shape.

Question	What I would look for	Red flag answer
What structure stores per-user state?	Sorted set keyed by `rate:user:<id>`	"A string counter with TTL"
How is correctness under concurrency guaranteed?	Single Lua script doing read-then-write	"We use MULTI/EXEC" or "we don't think it's an issue"
What stops the set from growing forever?	`ZREMRANGEBYSCORE` on every call plus `PEXPIRE` on the key	"Manual cleanup once a day"
What happens if Redis restarts cold?	Limits reset; we accept one minute of laxness	"We replicate to a secondary just for the limiter"
How does it shard at scale?	Client-side hash of `user_id` to one of N Redis nodes	"We use a global counter on a single primary"

Step 2 — your turn. Take any rate limit, leaderboard, queue, or feed in your own stack. Fill the same five rows. For each red-flag answer, sketch the fix using one of the seven data structures above. If a row is green but the structure choice is non-obvious, write one sentence on why this structure and not the next one over.

Step 3 — sketch from memory. Redraw the event-loop diagram from section 1 — epoll_wait → readQueryFromClient → processCommand → sendReplyToClient — and trace the four-command rate-limiter Lua script through it. Label every arrow with what is moving (RESP bytes, parsed argv, output buffer, AOF append). If you can do this cold, you have internalised the loop.

Bridge. We have the shapes. Now the wire. The next file opens RESP, walks the day-to-day commands you actually type, contrasts RDB and AOF persistence, and shows how a client like redis-py or Lettuce turns your code into framed bytes the event loop can dispatch. → 02-commands-persistence-clients.md