02. NoSQL Document and Key-Value: Flexible boxes can share one shelf

~15 min read. Some workloads reward shape flexibility more than join discipline.

Builds on the ELI5 in 00-eli5.md. The shelf where each box can hold different items now becomes documents, keys, and access-pattern-first design.

See. A document database asks a different question from relational design. Can one request usually read one self-contained object? If yes, documents can feel natural. MongoDB stores JSON-like documents. Each document can have nested fields. Different documents in the same collection can vary slightly. That is the big attraction. A product document can hold title, price, tags, dimensions, and media together. A blog post document can hold title, body, author, and comments together. Same idea. Different shape.

products
┌────────────────────────────────────────────┐
│ _id: 501                                   │
│ title: "Bluetooth Speaker"                 │
│ price: 2499                                │
│ specs: { battery: "10h", color: "black" }  │
│ tags: ["audio", "portable"]                │
└────────────────────────────────────────────┘

Why is this useful? Because one API call often wants the whole object. No join assembly is needed. Worked example. Suppose an ecommerce catalog has 2 million SKUs. Fashion items need size and fabric fields. Electronics need battery and wattage fields. Furniture needs material and dimensions fields. A rigid relational table can still model this. But it may produce hundreds of nullable columns or separate subtype tables. A document model stores each variant more directly. Now the caution. Schema flexibility is not schema absence. You still need required fields, validation, naming conventions, and migration discipline. Otherwise, the same logical field appears as phone, phoneNo, and contact_number. Then analytics teams cry quietly. So what to do? Use documents when one record is consumed as one unit. Use documents when nested attributes belong together naturally. Do not use documents just to postpone thinking.
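Here is a minimal sketch of "schema flexibility is not schema absence". It is plain Python, not a MongoDB feature. The required-field list, the alias map, and the sample shirt document are assumptions made up for illustration. The idea: documents may vary in shape, but a small core stays mandatory, and stray spellings collapse to one canonical name.

```python
# Hypothetical rules for a product document; field choices are illustrative.
REQUIRED_FIELDS = {"_id": int, "title": str, "price": int}

# Map stray spellings of one logical field onto a single canonical name.
FIELD_ALIASES = {"phoneNo": "phone", "contact_number": "phone"}


def normalize(doc: dict) -> dict:
    """Return a copy of doc with aliased field names rewritten to canonical ones."""
    return {FIELD_ALIASES.get(key, key): value for key, value in doc.items()}


def validate(doc: dict) -> list:
    """Return a list of problems; an empty list means the document passes."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in doc:
            problems.append(f"missing required field: {field}")
        elif not isinstance(doc[field], expected_type):
            problems.append(f"wrong type for field: {field}")
    return problems


# The speaker matches the diagram above; the shirt is a made-up variant.
speaker = {
    "_id": 501,
    "title": "Bluetooth Speaker",
    "price": 2499,
    "specs": {"battery": "10h", "color": "black"},
    "tags": ["audio", "portable"],
}
shirt = {"_id": 502, "title": "Linen Shirt", "specs": {"size": "M", "fabric": "linen"}}

speaker_problems = validate(normalize(speaker))  # nested fields vary freely
shirt_problems = validate(normalize(shirt))      # the core field "price" is missing
```

Both documents carry different nested specs, and that is fine. Only the shared core is enforced. In a real deployment the same rule would live in the database's own validation layer or in the write path, not in ad hoc scripts.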

2) DynamoDB teaches access-pattern-first design, not table-first design

DynamoDB design is often described as single-table. That phrase confuses people. It does not mean one messy bucket without structure. It means one carefully designed table serving known access patterns. You choose the partition key and sort key around the queries first. Suppose we are building a library app. The queries are these:

1. Get member profile by member ID.
2. List active loans for a member.
3. List holds for a member.
4. Get book metadata by book ID.

In DynamoDB, we may store multiple entity types together.

PK            | SK              | entity_type | attributes
MEMBER#42     | PROFILE         | member      | name, city
MEMBER#42     | LOAN#2026-05-01 | loan        | book_id, due_date
MEMBER#42     | HOLD#2026-05-03 | hold        | book_id, status
BOOK#9001     | PROFILE         | book        | title, author

Now one partition groups related access neatly. Fetch member profile and loans with one targeted query pattern. That is why single-table can be powerful. Worked example with numbers. Assume 500,000 members exist. Average active loans per member are 3. Average holds per member are 1. Each member therefore owns about 5 items: 1 profile, 3 loans, and 1 hold. Total member-related items become about 2.5 million. Querying one member partition usually touches only those 5 nearby items. That is predictable and fast. Simple, no? But ad hoc querying becomes harder. If tomorrow product asks, “List all overdue loans by city and section,” you may need a new index or exported view. DynamoDB rewards clarity about access patterns upfront. It punishes vague analytics fantasies later. That is not a flaw. That is the contract. Design from user flows backward. Write keys from query verbs backward. If you cannot list top queries, you are not ready.
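The key layout can be sketched in plain Python. This is an in-memory stand-in, not the DynamoDB client API. The `query` helper is hypothetical and only mimics two behaviours: match one partition key exactly, then filter the sort key by prefix and return items in sort-key order. The item count uses one profile, three loans, and one hold per member.

```python
# In-memory stand-in for the single table shown above.
TABLE = [
    {"PK": "MEMBER#42", "SK": "PROFILE", "entity_type": "member"},
    {"PK": "MEMBER#42", "SK": "LOAN#2026-05-01", "entity_type": "loan"},
    {"PK": "MEMBER#42", "SK": "HOLD#2026-05-03", "entity_type": "hold"},
    {"PK": "BOOK#9001", "SK": "PROFILE", "entity_type": "book"},
]


def query(pk: str, sk_prefix: str = "") -> list:
    """Return items in one partition whose sort key starts with sk_prefix,
    ordered by sort key. Hypothetical helper for illustration only."""
    matches = [
        item for item in TABLE
        if item["PK"] == pk and item["SK"].startswith(sk_prefix)
    ]
    return sorted(matches, key=lambda item: item["SK"])


member_items = query("MEMBER#42")           # profile, loans, and holds together
member_loans = query("MEMBER#42", "LOAN#")  # only the loans
book_items = query("BOOK#9001")             # book metadata by book ID

# Sizing from the worked example.
MEMBERS = 500_000
ITEMS_PER_MEMBER = 1 + 3 + 1  # profile + average loans + average holds
total_member_items = MEMBERS * ITEMS_PER_MEMBER
```

Notice what the helper cannot do. There is no way to ask for "all overdue loans by city" without scanning every partition. That is the same limitation the text describes, shown in miniature.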

3) Redis is more than string values at the reservation desk

People say Redis and imagine only cache entries. That is incomplete. Redis offers strings, hashes, lists, sets, sorted sets, bitmaps, and streams. Each structure fits a different fast-state job. Strings fit session tokens or simple counters. Hashes fit compact object fragments. Sets fit unique membership checks. Sorted sets fit leaderboards and time-ordered queues. Lists fit lightweight work queues.

reservation desk uses
┌───────────────┬────────────────────────┐
│ structure     │ good for               │
├───────────────┼────────────────────────┤
│ string        │ session, counter       │
│ hash          │ user profile fragment  │
│ set           │ feature flags, uniques │
│ sorted set    │ ranking, recent items  │
└───────────────┴────────────────────────┘
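Why does a sorted set beat a plain string for ranking? A small sketch helps. The class below is a plain-Python stand-in, not the Redis client API. The class name, method names, and player names are made up for illustration. It keeps the two properties that matter: members are unique, and each member carries a score that decides its rank. The tie-break by name is an assumption of this sketch.

```python
class SortedSetSketch:
    """Plain-Python stand-in for sorted-set behaviour: unique members with scores."""

    def __init__(self) -> None:
        self._scores = {}

    def add(self, member: str, score: float) -> None:
        # Adding an existing member overwrites its score, so members stay unique.
        self._scores[member] = score

    def top(self, count: int) -> list:
        # Highest score first; ties are broken by member name for a stable order.
        ranked = sorted(self._scores.items(), key=lambda pair: (-pair[1], pair[0]))
        return ranked[:count]


board = SortedSetSketch()
board.add("asha", 120)
board.add("ben", 95)
board.add("chen", 140)
board.add("ben", 150)  # ben's score is replaced, not duplicated

leaders = board.top(2)
```

With a plain string you would have to read, parse, re-sort, and rewrite the whole ranking on every update. With a scored set, each update touches one member. The same shape serves "recent items" if the score is a timestamp.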

Worked example. A reading app serves 40,000 requests per second. 20,000 requests ask for session state. 10,000 requests ask for “recently viewed books”. 10,000 requests increment reading streak counters. Those are classic Redis shapes. Store sessions as strings with TTL. Store recently viewed books in a sorted set per user. Store streak counters as integers with expiry logic. Now memory math. Suppose one session payload averages 1.5 KB. Five million active sessions need about 7.5 GB raw memory. With replication factor 2, you plan near 15 GB before overhead. That is perfectly reasonable for a hot-state tier. But remember the boundary. Redis is amazing for speed. Redis is usually not your long-term system of record. Expired state, counters, leaderboards, and rate limits love it. Complex historical reporting does not.
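The memory math above is easy to check in a few lines. This sketch assumes 1 KB means 1,000 bytes and reads "replication factor 2" as two full copies of the data. Per-key overhead is left out, as in the text.

```python
# Request mix from the worked example, in requests per second.
SESSION_READS = 20_000
RECENT_VIEWS = 10_000
STREAK_INCREMENTS = 10_000
total_rps = SESSION_READS + RECENT_VIEWS + STREAK_INCREMENTS

# Session memory, before any per-key overhead.
SESSION_BYTES = 1_500        # 1.5 KB per session, assuming 1 KB = 1,000 bytes
ACTIVE_SESSIONS = 5_000_000
COPIES = 2                   # assumption: replication factor 2 means two full copies

raw_gb = SESSION_BYTES * ACTIVE_SESSIONS / 1_000_000_000
replicated_gb = raw_gb * COPIES
```

Change one constant and the plan changes with it. That is the point of writing the math down instead of guessing.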

4) Flexible schema helps only when access patterns stay disciplined

Now combine the families. MongoDB helps when document shape varies and one document answers one request. DynamoDB helps when access patterns are known and latency must stay predictable. Redis helps when data is hot, transient, or structure-specific. All three can sit in one system. Example architecture for an education app:

catalog docs      → MongoDB
student sessions  → Redis
course progress   → DynamoDB

Why split like this? Course catalog pages need nested flexible content. Student sessions need millisecond expiry-aware reads. Course progress needs keyed lookups by learner and course. Worked example. Assume 3 million monthly active learners. Each learner has 12 progress items on average. That is 36 million progress records. Most reads are “get progress for learner X in course Y”. DynamoDB-style key access fits neatly. Catalog data changes weekly, not every second. MongoDB documents fit authors, chapters, and media arrays. Session tokens expire after thirty minutes. Redis handles that elegantly with TTL. So what to do? Pick the NoSQL family by data shape and query shape together. Do not say “NoSQL” and stop thinking. Say document, key-value, or in-memory structure. Then state the dominant read and write patterns clearly. That is the interview-grade answer.
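Two pieces of this example can be sketched quickly. Both are plain Python under stated assumptions, not client code for any store. `TtlStoreSketch` is a hypothetical stand-in for key expiry, with the current time passed in explicitly so the behaviour is easy to follow. `progress_key` shows one possible key layout for "learner X in course Y"; the `LEARNER#` and `COURSE#` prefixes are illustrative, not prescribed by the text.

```python
from typing import Optional


class TtlStoreSketch:
    """Plain-Python stand-in for key expiry; the caller supplies the current time."""

    def __init__(self) -> None:
        self._items = {}

    def set(self, key: str, value: str, ttl_seconds: float, now: float) -> None:
        # Store the value together with the moment it stops being valid.
        self._items[key] = (value, now + ttl_seconds)

    def get(self, key: str, now: float) -> Optional[str]:
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if now >= expires_at:
            # Drop the expired entry lazily, on read.
            del self._items[key]
            return None
        return value


def progress_key(learner_id: int, course_id: str) -> tuple:
    """Hypothetical partition key and sort key for one progress item."""
    return (f"LEARNER#{learner_id}", f"COURSE#{course_id}")


SESSION_TTL_SECONDS = 30 * 60  # sessions expire after thirty minutes

sessions = TtlStoreSketch()
sessions.set("session:abc", "learner-7", SESSION_TTL_SECONDS, now=0)
before_expiry = sessions.get("session:abc", now=1_799)  # one second before expiry
after_expiry = sessions.get("session:abc", now=1_800)   # exactly at expiry

total_progress_records = 3_000_000 * 12  # learners x average progress items
```

The session lookup and the progress lookup are both direct key reads. The difference is lifetime: sessions vanish on their own, while progress records must stay.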

Where this lives in the wild

An Amazon retail backend engineer often designs DynamoDB keys around carts, sessions, and metadata reads that are known upfront.
An Adobe Experience Manager engineer benefits from MongoDB-style documents when content objects carry nested, evolving fields.
A Discord platform engineer uses Redis structures for counters, ephemeral presence, and hot gateway-adjacent state.
A Duolingo product engineer can store flexible lesson content as documents while keeping rapid learner state in keyed stores.
A DoorDash growth engineer leans on Redis for rate limits, experiments, and short-lived request context near the application edge.

Pause and recall

When does a document model beat a highly normalized relational model?
Why does DynamoDB force you to list access patterns before design?
Which Redis structure fits a leaderboard better than a plain string?
Why is schema flexibility not the same as zero discipline?

Interview Q&A

Q: Why choose MongoDB for a product catalog?
A: Because one product often reads as one nested object, and attributes vary by category. Documents reduce modelling friction when fields evolve frequently.
Common wrong answer to avoid: “Because MongoDB has no schema.”

Q: Why does DynamoDB design begin with access patterns?
A: Because keys determine latency, partitioning, and query feasibility. If the main queries are unclear, the table shape will be guesswork.
Common wrong answer to avoid: “Because single-table means everything goes into one random partition.”

Q: Why use Redis for sessions instead of the main database?
A: Sessions are hot, short-lived, and keyed by direct lookup. TTL support and in-memory speed make Redis a better fit for that workload.
Common wrong answer to avoid: “Because SQL databases cannot handle session reads.”

Q: Can one system use MongoDB, DynamoDB, and Redis together?
A: Yes, if each store owns a different workload shape and operational boundary. Polyglot persistence is justified when access patterns genuinely differ.
Common wrong answer to avoid: “Using more than one database is automatically overengineering.”

Apply now (5 min)

Exercise: Take one app you use daily. List three read paths that want one whole document. List two read paths that want direct key lookup with strict latency. Then name one place where TTL clearly matters.

Sketch from memory: Draw three boxes for MongoDB, DynamoDB, and Redis. Under each box, write one data shape and one example key. Then circle the store that would punish ad hoc queries most.

Bridge. Flexible boxes help, but some workloads are even more specialized. Write-heavy timelines and relationship traversals need different shelf designs altogether. → 03-wide-column-and-graph.md