01. Week 8 — Advanced RAG

Key concepts to master

  • Basic RAG fails when the user query is not retrieval-shaped.
  • Query rewriting turns user language into retriever language.
  • Query expansion improves recall with synonyms, aliases, and missing entities.
  • Query decomposition breaks multi-hop questions into answerable sub-queries.
  • Step-back prompting asks for the higher-level principle before searching details.
  • HyDE embeds a hypothetical answer, not the raw question.
  • Parent-child retrieval keeps chunk precision without losing document-level context.
  • Fusion retrieval combines dense and sparse search, then merges rankings.
  • Cross-encoder reranking boosts precision after cheap retrieval.
  • Metadata filtering drops out-of-scope chunks; MMR drops near-duplicate context.
  • Corrective loops decide when to retry, switch strategy, or abstain.
  • The confidence gate is the precursor to agent self-evaluation in Module 09.
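
The "merges rankings" step of fusion retrieval can be sketched with reciprocal rank fusion (RRF). RRF is one common merging method, chosen here for illustration; the module itself does not name a specific one. The document IDs and the two hit lists are made up, and `k=60` is a commonly used smoothing constant, not a tuned value.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of doc IDs into one fused ranking."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); high ranks count more
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical results from a dense retriever and a sparse (keyword) retriever
dense_hits = ["doc_a", "doc_b", "doc_c"]
sparse_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

Documents that both retrievers rank well (`doc_a`, `doc_b`) rise above documents that only one retriever found, which is the behavior fusion is meant to deliver.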

🧠 Mental models

  • Query rewriting: "Translate user language into the retriever's dialect."
  • HyDE: "Write a plausible answer ghost first, then search for documents that resemble it."
  • Hybrid search: "Use both a keyword flashlight and a semantic compass."
  • Reranking: "Do expensive background checks only on the shortlist."
  • Multi-hop retrieval: "Cross the river stone by stone; each retrieved fact enables the next jump."
  • Confidence gate: "A circuit breaker that stops the system from bluffing when evidence is weak."
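
The reranking mental model ("expensive checks only on the shortlist") can be sketched as a two-stage pipeline. Everything below is a toy under stated assumptions: `cheap_score` stands in for ANN or keyword retrieval, `expensive_score` stands in for a cross-encoder (it is plain Jaccard similarity here, not a real model), and the corpus is invented.

```python
def tokens(text):
    return set(text.lower().split())

def cheap_score(query, doc):
    # Stage 1 stand-in: raw keyword overlap (a real system would use ANN or BM25)
    return len(tokens(query) & tokens(doc))

expensive_calls = []  # records which docs reached the expensive stage

def expensive_score(query, doc):
    # Stage 2 stand-in for a cross-encoder: Jaccard similarity, with call logging
    expensive_calls.append(doc)
    q, d = tokens(query), tokens(doc)
    return len(q & d) / len(q | d)

def retrieve_then_rerank(query, docs, shortlist_size=3, top_n=2):
    # Cheap pass over the whole corpus, keep only the shortlist
    shortlist = sorted(docs, key=lambda d: cheap_score(query, d), reverse=True)[:shortlist_size]
    # Expensive pass over the shortlist only
    reranked = sorted(shortlist, key=lambda d: expensive_score(query, d), reverse=True)
    return reranked[:top_n]

corpus = [
    "reset password via email link",
    "password policy requires twelve characters",
    "email notification settings",
    "quarterly revenue report",
    "how to reset a forgotten password",
]

top = retrieve_then_rerank("reset password", corpus)
print(top)
print(len(expensive_calls), "expensive calls for", len(corpus), "documents")
```

The expensive scorer runs once per shortlisted document, so its cost scales with `shortlist_size` and not with corpus size.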

⚠️ Common traps

  • Stacking rewrite, expansion, and decomposition blindly and drifting farther from the real need.
  • Reranking too many candidates, which can dominate latency and erase ANN speed gains.
  • Optimizing precision on easy head queries while recall collapses on messy long-tail questions.
  • Evaluating only final answers and never labeling whether retrieval itself found the right evidence.
  • Letting corrective loops retry indefinitely instead of abstaining, switching strategy, or escalating.
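
The last trap (unbounded retries) is easiest to avoid when the loop has an explicit budget and an explicit abstain branch. The sketch below is a minimal illustration, not a reference design: the strategy names, the stub retrievers, the `coverage` confidence function, and the `0.7` threshold are all assumptions made for the example.

```python
def coverage(query, docs):
    """Toy confidence score: fraction of query terms that appear in the retrieved text."""
    terms = set(query.lower().split())
    if not terms:
        return 0.0
    found = set(" ".join(docs).lower().split())
    return len(terms & found) / len(terms)

def corrective_retrieve(query, strategies, confidence, threshold=0.7, max_attempts=3):
    """Try strategies in order; abstain if none clears the threshold within budget."""
    attempts = []
    for name, retrieve in strategies[:max_attempts]:
        docs = retrieve(query)
        score = confidence(query, docs)
        attempts.append((name, score))
        if score >= threshold:
            return {"action": "answer", "docs": docs, "attempts": attempts}
    # Budget exhausted: stop retrying and abstain instead of bluffing
    return {"action": "abstain", "docs": [], "attempts": attempts}

# Hypothetical retrieval strategies, stubbed with fixed results
strategies = [
    ("dense", lambda q: ["refund policy overview"]),
    ("hybrid", lambda q: ["refund window is 30 days for annual plans"]),
]

answered = corrective_retrieve("refund window annual plans", strategies, coverage)
abstained = corrective_retrieve("warranty transfer", strategies, coverage)
print(answered["action"], answered["attempts"])
print(abstained["action"], abstained["attempts"])
```

Because the loop iterates over a finite strategy list capped by `max_attempts`, it cannot retry indefinitely; switching strategy is the retry, and abstaining is the fallback.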

🔗 Prerequisites & connections

Builds on: Module 07 (RAG Fundamentals). Basic chunking, embeddings, ANN retrieval, and faithfulness metrics are assumed.

Feeds into: Module 09 (Agents & Tool Calling). Corrective loops, confidence gates, and multi-step retrieval become agent planning patterns.

💬 Interview phrasing

  • "Baseline RAG fails on multi-step business questions. What would you add first and why?"
  • "When does HyDE help, and when can it make retrieval worse?"
  • "Why does hybrid search usually beat dense-only retrieval in production?"
  • "How would you evaluate retrieval quality separately from generation quality?"
  • "When should the system retry retrieval versus answer 'I don't know'?"
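
For the question on evaluating retrieval separately from generation, a typical answer rests on labeled relevant documents per query and rank-based metrics. The sketch below shows two standard ones, recall@k and reciprocal rank; the document IDs and relevance labels are invented for illustration.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the labeled relevant docs that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = set(retrieved[:k]) & set(relevant)
    return len(hits) / len(relevant)

def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant doc, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0

# Hypothetical retriever output and human relevance labels for one query
retrieved = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}

print(recall_at_k(retrieved, relevant, k=3))
print(reciprocal_rank(retrieved, relevant))
```

Neither metric looks at the generated answer, so a drop in these numbers points at retrieval even when final answers still look fluent.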

⏱️ Difficulty markers

  • 🟢 metadata filtering and MMR
  • 🟡 query rewriting and expansion
  • 🟡 hybrid dense + sparse retrieval
  • 🔴 cross-encoder reranking trade-offs
  • 🔴 multi-hop retrieval orchestration
  • 🔴 retrieval evaluation and confidence gating

Self-check questions

For full Q&A and interview-style answers, see explainer §6.3.

  1. Why does basic RAG often fail on multi-step business questions? (§1.2)
  2. Rewrite vs expand vs decompose — when does each help? (§2.1-§2.3)
  3. What is step-back prompting, and why can it improve recall? (§2.4)
  4. Why can HyDE beat direct embedding of the user query? (§3.1)
  5. Parent-child retrieval vs flat chunk retrieval — what trade-off changes? (§3.3)
  6. Hybrid retrieval: why does dense + sparse usually beat either alone? (§3.4)
  7. Why rerank top-K instead of cross-encoding the whole corpus? (§4.1)
  8. What does MMR optimize that simple top-score sorting does not? (§4.4)
  9. What is the confidence gate, and what actions can it trigger? (§5.2-§5.4)
  10. Why is this module the bridge into agents and tool calling? (§6.5-§6.6)
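
As a study aid for question 8, here is a minimal sketch of maximal marginal relevance (MMR) in pure Python. The vectors are tiny hand-made embeddings, and `lam=0.5` is an arbitrary illustrative balance between relevance and novelty, not a recommended setting.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def mmr(query_vec, doc_vecs, k, lam=0.5):
    """Greedily pick k doc indices, trading query relevance against redundancy."""
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        def mmr_score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            # Similarity to the closest already-selected doc (0.0 if none yet)
            redundancy = max((cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected

query = [1.0, 1.0, 0.0]
docs = [
    [1.0, 0.10, 0.0],  # index 0: most relevant
    [1.0, 0.05, 0.0],  # index 1: near-duplicate of index 0
    [0.0, 1.00, 0.0],  # index 2: less relevant, but covers a different direction
]

diverse = mmr(query, docs, k=2, lam=0.5)
relevance_only = mmr(query, docs, k=2, lam=1.0)
print(diverse, relevance_only)
```

With `lam=1.0` the redundancy term vanishes and the function reduces to plain top-score sorting, which keeps the near-duplicate; with `lam=0.5` the second slot goes to the document that adds new information.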

Health check

  • [ ] All 6 explainer chapters read at least once
  • [ ] Can transform one raw question into rewrite + expansion + decomposition
  • [ ] Can explain HyDE, reranking, and MMR without notes
  • [ ] Assignment shipped with one corrective loop and eval results
  • [ ] Daily-recall prompts answerable from memory
  • [ ] Ready to start Module 09 with loop-thinking already internalized