13. Evaluating ingestion quality — how to know your pipeline is working
~13 min read. The eval most RAG teams skip, only to wonder later why retrieval is "broken".
[Stub — to be written]
Outline:
- The four failure types: text loss, structural loss, hallucination, ordering error
- A golden-document set: 20-50 hand-curated docs with hand-extracted expected text
- Character-level recall: did we get the words back
- Structural recall: did we keep the heading hierarchy
- Numeric fidelity: did the digits in tables survive
- Reading-order tests: are paragraphs in the right sequence
- Diffing extracted output against ground truth
- Continuous evaluation: pin a few customer docs as canaries
- The "RAG eval will hide ingestion bugs" trap — why upstream eval matters separately
- Cross-reference to module 24 (evals in production)
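
The checks named in the outline can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not the module's final implementation: every function name below (`char_recall`, `structural_recall`, `numeric_fidelity`, `reading_order_score`, `diff_report`) is hypothetical, and the sketch assumes that both the hand-extracted golden text and the pipeline output are Markdown strings whose headings use `#` prefixes.

```python
import difflib
import re
from collections import Counter

# Assumption: a "number" is a run of digits with optional , or . groups.
NUMBER = re.compile(r"\d+(?:[.,]\d+)*")
# Assumption: headings are Markdown ATX headings ("# Title" to "###### Title").
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def _normalize(text: str) -> str:
    """Collapse all whitespace runs so layout differences are not penalized."""
    return " ".join(text.split())


def char_recall(expected: str, extracted: str) -> float:
    """Share of expected characters found, in order, in the extracted text."""
    exp, ext = _normalize(expected), _normalize(extracted)
    if not exp:
        return 1.0
    matcher = difflib.SequenceMatcher(None, exp, ext, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(exp)


def _headings(markdown: str) -> Counter:
    """Multiset of (level, title) pairs for every heading line."""
    found = Counter()
    for line in markdown.splitlines():
        m = HEADING.match(line)
        if m:
            found[(len(m.group(1)), m.group(2))] += 1
    return found


def structural_recall(expected: str, extracted: str) -> float:
    """Share of expected headings that survive with the same level and title."""
    exp = _headings(expected)
    if not exp:
        return 1.0
    kept = exp & _headings(extracted)  # multiset intersection
    return sum(kept.values()) / sum(exp.values())


def numeric_fidelity(expected: str, extracted: str) -> float:
    """Share of expected numeric tokens that appear unchanged in the output."""
    exp = Counter(NUMBER.findall(expected))
    if not exp:
        return 1.0
    kept = exp & Counter(NUMBER.findall(extracted))
    return sum(kept.values()) / sum(exp.values())


def reading_order_score(expected_paragraphs: list[str], extracted: str) -> float:
    """Share of adjacent expected paragraphs that keep their relative order.

    Each paragraph is located by its first 40 normalized characters;
    paragraphs that cannot be found are skipped (char_recall covers loss).
    """
    ext = _normalize(extracted)
    positions = []
    for para in expected_paragraphs:
        pos = ext.find(_normalize(para)[:40])
        if pos != -1:
            positions.append(pos)
    pairs = list(zip(positions, positions[1:]))
    if not pairs:
        return 1.0
    return sum(a < b for a, b in pairs) / len(pairs)


def diff_report(expected: str, extracted: str) -> str:
    """Unified diff of golden text against extracted output, for inspection."""
    lines = difflib.unified_diff(
        expected.splitlines(), extracted.splitlines(),
        fromfile="expected", tofile="extracted", lineterm="",
    )
    return "\n".join(lines)
```

Run over a golden-document set, each function yields one score per document, and `diff_report` shows where a low score comes from. Keeping the scores separate is deliberate in this sketch: a document can score well on character recall while a single changed digit in a table, or two swapped paragraphs, still shows up in `numeric_fidelity` or `reading_order_score`.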