00. Search & Information Retrieval — The Five-Year-Old Version

Imagine a busy post office where finding one letter fast is the whole game.


Think of search like a post office sorting room. Each letter is one document sitting in the building. Workers read the words on every letter and drop the letter number into labelled sorting bins. One bin says python. One bin says snake. One bin says tutorial. So the room is not arranged by whole letters anymore. It is arranged by words. That is the big trick.

Now a user walks in with an address label. That address label is just the search query. The clerk reads the words on it. Then she checks the matching sorting bins. She pulls out the possible letter IDs from those bins. Simple, no?
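The sorting room can be sketched in a few lines of Python. This is only a minimal illustration: the three letters are invented for the example, the function names `build_sorting_bins` and `candidates` are hypothetical, and splitting on whitespace stands in for a real tokenizer.

```python
from collections import defaultdict


def build_sorting_bins(letters):
    """Map each word to the sorted list of letter IDs that contain it."""
    bins = defaultdict(set)
    for letter_id, text in letters.items():
        # Naive tokenizer (assumption): lowercase, split on whitespace.
        for word in text.lower().split():
            bins[word].add(letter_id)
    return {word: sorted(ids) for word, ids in bins.items()}


def candidates(bins, address_label):
    """Collect letter IDs from every bin named on the address label."""
    found = set()
    for word in address_label.lower().split():
        found.update(bins.get(word, []))
    return sorted(found)


# Hypothetical letters, invented for illustration.
letters = {
    1: "python tutorial for beginners",
    2: "python snake care",
    3: "snake photos",
}

bins = build_sorting_bins(letters)
print(bins["python"])                        # [1, 2]
print(bins["snake"])                         # [2, 3]
print(candidates(bins, "python tutorial"))   # [1, 2]
```

Notice that the clerk never opens a letter at query time. She only reads the bins, which is why the lookup stays fast even when the building holds many letters.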

But not all letters are equally useful. A letter mentioning the exact topic many times may help more. A letter using only vague common words may help less. So the clerk stamps every candidate with a postmark score. That score says, “How likely is this letter to satisfy the address label?” Then the clerk lays the letters in a delivery route. Best first. Weak ones later.

Sometimes the first pass is rough. The room can quickly shortlist 100 letters. But for the top few, a specialist checks them again. That specialist is the express lane. It is slower, but more careful. It rereads the address label and the whole letter together. Then it adjusts the delivery route before the user sees anything.

Look. That whole story is search and information retrieval in kid words.
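Here is one way the rough first pass and the express lane could fit together, as a hedged sketch rather than a real system. Everything named here is hypothetical: `first_pass_score` just counts matching words, and `express_lane_score` is a toy stand-in that rewards an exact phrase match, where a production reranker would usually be a model that reads the label and the letter together. For brevity the first pass scores every letter directly instead of going through the sorting bins.

```python
def first_pass_score(label_words, letter_text):
    """Cheap postmark score: how many label words appear in the letter."""
    words = set(letter_text.lower().split())
    return sum(1 for w in label_words if w in words)


def express_lane_score(address_label, letter_text):
    """Toy stand-in for a careful reranker (assumption):
    1.0 if the label appears as an exact phrase, else 0.0."""
    return 1.0 if address_label.lower() in letter_text.lower() else 0.0


def search(address_label, letters, shortlist_size=100, rerank_size=3):
    label_words = address_label.lower().split()

    # Stage 1: cheap score for every letter; keep the matches, best first.
    rough = {lid: first_pass_score(label_words, text)
             for lid, text in letters.items()}
    shortlist = sorted(
        (lid for lid in rough if rough[lid] > 0),
        key=lambda lid: (-rough[lid], lid),
    )[:shortlist_size]

    # Stage 2: only the top few get the slower, more careful check.
    # sorted() is stable, so express-lane ties keep their first-pass order.
    head = sorted(
        shortlist[:rerank_size],
        key=lambda lid: -express_lane_score(address_label, letters[lid]),
    )
    return head + shortlist[rerank_size:]


# Hypothetical letters, invented for illustration.
letters = {
    1: "repair manual for a car",
    2: "car repair guide",
    3: "bike repair",
}
print(search("car repair", letters))  # [2, 1, 3]
```

Letters 1 and 2 both contain both words, so the rough pass cannot separate them. The express lane sees that only Letter 2 has the exact phrase and moves it to the front.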

┌──────────────┐   words   ┌────────────────────┐
│ letter stack │ ────────→ │ sorting bins       │
└──────────────┘           │ word → letter IDs  │
                           └─────────┬──────────┘
address label ───────────────────────┤
                           candidate letters
                              postmark score
                               delivery route
                               express lane
Tiny worked example now. Suppose there are 3 letters. Letter 1 has car repair. Letter 2 has car insurance. Letter 3 has bike repair.

The address label is car repair. The car sorting bin holds [1, 2]. The repair sorting bin holds [1, 3]. The overlap is just [1]. So Letter 1 is the strongest match.

If we give 1 point per matching word, then: Letter 1 gets 2 points. Letter 2 gets 1 point. Letter 3 gets 1 point. Letters 2 and 3 tie, so we break the tie by letter number. So the delivery route becomes 1 → 2 → 3. See.
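The same tiny example can be replayed in Python. This is a minimal sketch using the three letters from the text. The variable names are made up for illustration, and ties are broken by the lower letter number here, which is one simple choice among several.

```python
letters = {1: "car repair", 2: "car insurance", 3: "bike repair"}
address_label = "car repair"

# Build the sorting bins: word -> letter IDs.
bins = {}
for letter_id, text in letters.items():
    for word in text.split():
        bins.setdefault(word, []).append(letter_id)

# Letters that appear in both the "car" bin and the "repair" bin.
overlap = set(bins["car"]) & set(bins["repair"])

# Postmark score: one point per matching word.
scores = {}
for word in address_label.split():
    for letter_id in bins.get(word, []):
        scores[letter_id] = scores.get(letter_id, 0) + 1

# Delivery route: best score first, lower letter ID wins a tie.
route = sorted(scores, key=lambda lid: (-scores[lid], lid))

print(bins["car"], bins["repair"])  # [1, 2] [1, 3]
print(sorted(overlap))              # [1]
print(scores)                       # {1: 2, 2: 1, 3: 1}
print(route)                        # [1, 2, 3]
```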

The bins find candidates. The score orders them. The express lane fixes the tricky ties.


The placeholders you will see called back

| Placeholder | Meaning |
|---|---|
| sorting bins | The inverted index; bins labelled by word, holding doc IDs |
| address label | The query; what the user writes on the envelope |
| letter | A document in the corpus |
| postmark score | Relevance score like TF-IDF, BM25, cosine, or another ranking number |
| express lane | The reranker; a slower, more careful second check of the most promising results |
| delivery route | The final ranking; the order results are handed to the user |


Top resources


What's coming

  1. 01-keyword-search-failure.md — why literal matching breaks fast.

  2. 02-inverted-index.md — how the sorting bins are built.

  3. 03-tf-idf-scoring.md — how rare words earn a bigger postmark score.

  4. 04-bm25.md — the scoring formula most teams actually ship.

  5. 05-query-understanding.md — how we clean and enrich the address label.

  6. 06-dense-retrieval.md — how vectors find meaning beyond exact words.

  7. 07-sparse-vs-dense.md — when sorting bins win, and when vectors win.

  8. 08-hybrid-search-fusion.md — how both worlds are combined.

  9. 09-learning-to-rank.md — how models learn a smarter delivery route.

  10. 10-cross-encoder-reranking.md — why the express lane is slow but sharp.

  11. 11-evaluation-metrics-ir.md — how to measure ranking quality.

  12. 12-search-relevance-tuning.md — which knobs teams tune in production.

  13. 13-honest-admission.md — what search people still cannot answer cleanly.


Memory map

| Concept | Prerequisite | Pressure family | Recurs later as | Layer touched |
|---|---|---|---|---|
| Exact-match failure | clean documents | ambiguity, data quality | query rewriting in RAG | user query -> index |
| Inverted index | tokenization | latency, memory | sparse retrieval systems | text -> postings -> candidates |
| TF-IDF and BM25 | term statistics | relevance, calibration | lexical ranking baselines | index -> scorer -> ranking |
| Query understanding | user intent | ambiguity, safety | routing and clarification | API -> parser -> retrieval |
| Dense retrieval | embeddings | semantic mismatch | vector databases and RAG | model -> vector index -> candidates |
| Sparse vs dense choice | lexical and vector search | precision, recall | hybrid retrieval | retriever branches -> fusion |
| Learning to rank | judged examples | relevance, feedback bias | production ranking models | features -> model -> route |
| Cross-encoder reranking | candidate generation | bounded compute, precision | RAG reranking | shortlist -> model -> top-n |
| IR metrics | judged lists | evaluation, trust | RAG evals | labels -> dashboard -> release |
| Relevance tuning | all prior retrieval | operator attention | search quality loops | config -> experiment -> rollout |

Bridge. First we see how a perfectly organized room can still fail when the address label uses the wrong words. → 01-keyword-search-failure.md