00. Advanced RAG: The Five-Year-Old Version

Module 07 gave you basic retrieval. This module teaches the retrieval system to think before it searches.

Imagine the librarian from Module 07 got promoted.

Now that person is the head researcher for the whole library.

A child asks a question, and the head researcher does not run to the shelves immediately.

First, they clean the question so the shelves can understand it better.

That helper is the rewriter.

Then they imagine what a good answer might sound like.

Not to trust the imaginary answer.

Only to search in the right neighborhood.

That helper is the hypothesis.

Sometimes one question is secretly three smaller questions.

The head researcher breaks it apart, answers each part, and then combines the pieces.

That helper is the multi-step plan.

Then many pages come back from the shelves.

Some are related, but only a few are truly useful.

So the head researcher reads the question and each page together.

That helper is the cross-checker.

And sometimes the head researcher skips the shelf search, reads the table of contents, and reasons about which section to open.

That helper is the contents map.

Finally, the draft answer is not trusted blindly.

The head researcher asks, “Do we really have enough proof, or should we search again?”

That last stoplight is the confidence gate.

See the difference?

Basic RAG says, “Search once and answer.”

Advanced RAG says, “Think a little, search better, check again, then answer.”

Simple, no?

This whole module is about that promotion.
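
Here is a toy sketch of that promotion as code. It shows only the control flow: the rewriter, one search, the cross-checker, and the confidence gate with a single retry. Everything in it is an assumption made for illustration: the `PAGES` list, the word-overlap score, the `tighten` retry step, and the `0.3` threshold are all made up, and a real cross-checker would be a stronger model than the first-pass search. The hypothesis and the multi-step plan are left out to keep it short.

```python
import re

# Toy shelves. A real system would query an index instead (assumption).
PAGES = [
    "Reranking scores the question and each page together.",
    "HyDE searches with an imagined answer instead of the raw question.",
    "Libraries lend books for three weeks.",
]
FILLER = {"um", "uh", "please", "like"}
STOPWORDS = {"how", "does", "do", "what", "is", "the", "a", "an", "of"}


def tokens(text):
    """Lowercase words with punctuation stripped."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rewrite(question):
    """The rewriter: drop filler words so the shelves see a cleaner query."""
    return " ".join(w for w in tokens(question) if w not in FILLER)


def tighten(query):
    """Hypothetical retry step: drop stopwords so only content words remain."""
    return " ".join(w for w in query.split() if w not in STOPWORDS)


def overlap(query, page):
    """Toy relevance: fraction of query words that appear on the page."""
    q = set(tokens(query))
    return len(q & set(tokens(page))) / len(q) if q else 0.0


def retrieve(query, k=2):
    """First pass: the k pages with the highest toy relevance."""
    return sorted(PAGES, key=lambda p: overlap(query, p), reverse=True)[:k]


def cross_check(query, pages):
    """The cross-checker: score the query against each page, best first."""
    return sorted(((overlap(query, p), p) for p in pages), reverse=True)


def answer(question, threshold=0.3, max_tries=2):
    """Return (page, tries_used), or (None, tries_used) when abstaining."""
    query = rewrite(question)
    for attempt in range(1, max_tries + 1):
        score, page = cross_check(query, retrieve(query))[0]
        if score >= threshold:  # the confidence gate: enough proof?
            return page, attempt
        query = tighten(query)  # not enough proof: reshape and search again
    return None, max_tries
```

With these toy pages, `answer("Um, how does reranking score pages?")` fails the gate on the first try, tightens the query, and returns the reranking page on the second try. `answer("What is the weather tomorrow?")` never finds enough proof, so it abstains and returns `None`.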

You will learn how to reshape messy questions.

You will learn how to search by imagined answers.

You will learn how to split hard questions into smaller ones.

You will learn how to mix dense and sparse search.

You will learn how to rerank, filter, retry, route, and abstain.

That is why advanced RAG feels less like a lookup box.

It feels more like a careful research workflow.
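
Mixing dense and sparse search can be sketched in a few lines. This example uses reciprocal rank fusion, which is one common way to merge two ranked lists. The module text does not say which fusion method it teaches, so treat the choice as an assumption. The document IDs and the two input rankings are invented, and `k=60` is only a conventional default.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists: each list adds 1 / (k + rank) to a document's score."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for stable output.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from a dense (vector) search and a sparse (keyword) search.
dense = ["doc_a", "doc_b", "doc_c"]
sparse = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([dense, sparse])
print(fused)  # ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

Pages that both searches like rise to the top, while a page found by only one search still stays on the list. That is the sense in which the two searches cover each other.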

The placeholders you will see called back

Placeholder	Meaning
the rewriter	query transformation — reshaping the question for better search
the hypothesis	HyDE — imagining an answer to guide retrieval
the multi-step plan	decomposition — splitting complex queries into sub-questions
the cross-checker	reranker — second-pass deep scoring of candidates
the confidence gate	self-evaluation — deciding to answer or search again
the contents map	vectorless RAG — reading the table of contents and reasoning which section to open, instead of matching vectors

Top resources

Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks — the original framing for retrieval plus generation.
HyDE: Precise Zero-Shot Dense Retrieval without Relevance Labels — the paper behind hypothetical answer search.
Self-RAG — retrieval, critique, and refinement inside one loop.
Corrective Retrieval Augmented Generation — retrieval quality checks with correction paths.
Pinecone Hybrid Search Guide — dense plus sparse retrieval in practice.
Cohere Rerank Docs — practical second-pass ranking with cross-encoders.
LangChain MultiQuery Retriever — one way to implement query expansion.

What's coming

01-opening-failure.md — why one-shot retrieval collapses under strong generation.
02-query-rewriting.md — how to preserve intent while making search easier.
03-query-expansion.md — how multiple query variants widen recall.
04-query-decomposition.md — how multi-hop questions become smaller, solvable hops.
05-hyde-hypothetical-embeddings.md — how imagined answers can pull real evidence closer.
06-parent-child-retrieval.md — how precise chunks and wide context cooperate.
07-hybrid-retrieval.md — how dense and sparse search cover each other’s blind spots.
08-cross-encoder-reranking.md — how a deeper second pass rescues the shortlist.
09-metadata-filtering-mmr.md — how filters and diversity stop noisy duplication.
10-corrective-rag.md — how systems judge retrieval quality before trusting it.
11-iterative-retrieval.md — how search becomes a loop instead of a single shot.
12-routing-strategies.md — how different queries take different retrieval paths.
13-confidence-gates.md — how the system decides to answer, retry, or abstain.
14-vectorless-rag.md — how reasoning over document structure replaces similarity search.
15-honest-admission.md — what even advanced RAG still cannot solve cleanly.

Memory map

Concept	Prerequisite	Pressure family	Recurs later as	Layer touched
Opening evidence failure	basic RAG pipeline	faithfulness, data quality	confidence gates and honest admission	user complaint -> retrieval -> answer risk
Query rewriting	raw query retrieval	ambiguity, latency	routing pre-step and corrective retry	API text -> retriever input -> trace
Query expansion	rewriting	recall, cost	iterative retrieval branches	query planner -> index fanout -> reranker load
Query decomposition	expansion	reasoning, coordination	agent planning and graph traversal	user task -> sub-question plan -> synthesis
HyDE	embeddings	semantic mismatch	generated retrieval probes	prompt -> embedding space -> candidate set
Parent-child retrieval	chunking	locality, context budget	citation pack construction	index chunk -> parent document -> prompt window
Hybrid retrieval	dense retrieval + sparse search	coverage, literal precision	reranking candidate pool	vector index + lexical index -> fusion
Cross-encoder reranking	candidate generation	bounded compute, precision	answer evidence ordering	model scoring -> latency -> quality signal
Metadata filtering and MMR	ranking	data quality, diversity	route-specific retrieval policies	metadata schema -> shortlist composition
Corrective and iterative RAG	retrieval metrics	feedback loops, operator attention	self-reflective agent loops	evaluator -> retry route -> logs
Routing and confidence gates	all prior controls	cost, safety, trust	production fallback policy	product API -> policy -> escalation

Bridge. Basic RAG fetches once and hopes for the best. The first thing we need to understand is why that hope breaks so fast. → 01-opening-failure.md