13. Evaluating ingestion quality — how to know your pipeline is working¶

~13 min read. The eval most RAG teams skip, and then wonder why retrieval is "broken".

[Stub — to be written]

Outline:

The four failure types: text loss, structural loss, hallucination, ordering error
A golden-document set: 20-50 hand-curated docs with hand-extracted expected text
Character-level recall: did we get the words back
Structural recall: did we keep the heading hierarchy
Numeric fidelity: did the digits in tables survive
Reading-order tests: are paragraphs in the right sequence
Diffing extracted output against ground truth
Continuous evaluation: pin a few customer docs as canaries
The "RAG eval will hide ingestion bugs" trap — why upstream eval matters separately
Cross-reference to module 24 (evals in production)