00. Capstone Project: The Five-Year-Old Version
Integration is the real test: knowing every piece is not the same as building the whole house.
Think of learning AI engineering like learning the trades. In earlier modules you learned carpentry — how to shape transformers and attention layers into something useful. You learned plumbing — how RAG pipelines pull knowledge from storage and deliver it to the model. You learned wiring — how agents decide, call tools, and loop back. You learned painting — how generation models turn tokens into readable, human output.
Now someone hands you a plot of land and says: build the house.
The hard part is not any single trade. The hard part is cooperation. The carpenter cannot build doors before the plumber runs pipes through the walls. The wiring must wait for the framing. Every piece was fine on its own. Together, they must fit in the right order, at the right tolerance. That is the capstone.
See, most people stop once each component works on its own. They show a RAG demo that retrieves correctly. They show a prompt that produces a good answer. They show an agent that calls a tool. Then they wire them together and the whole thing falls apart. Why? Because each component was tested alone, never as part of the chain.
The capstone forces you to think in chains, not components.
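To make "tested alone, not in the chain" concrete, here is a minimal sketch. Everything in it is hypothetical and invented for illustration: `retrieve`, `build_prompt`, `run_chain`, the toy corpus, and the word-overlap ranking that stands in for a real retriever. The two components each pass their own check, yet the chain drops the one passage that answers the question, because the retriever returns results best-first while the prompt builder assumes best-last.

```python
"""Toy illustration: two components pass their own checks, the chain still fails."""

# Hypothetical corpus and question, invented for this sketch.
CORPUS = [
    "Refunds are accepted within 30 days.",
    "Shipping takes 5 business days.",
    "Support is open on weekdays.",
]
QUESTION = "how long do refunds take"


def retrieve(query: str, corpus: list[str]) -> list[str]:
    """Rank passages by word overlap with the query, best match FIRST."""
    words = set(query.lower().split())
    return sorted(
        corpus,
        key=lambda passage: len(words & set(passage.lower().split())),
        reverse=True,
    )


def build_prompt(question: str, passages: list[str], keep: int = 1) -> str:
    """Keep the last `keep` passages, assuming the caller sorted best match LAST."""
    context = "\n".join(passages[-keep:])
    return f"Context:\n{context}\n\nQuestion: {question}"


def run_chain(question: str, corpus: list[str]) -> str:
    """Wire the two components together the way a demo would."""
    return build_prompt(question, retrieve(question, corpus))


# Component checks: each one passes under its own assumptions.
retriever_ok = retrieve(QUESTION, CORPUS)[0] == CORPUS[0]
prompt_ok = "best" in build_prompt("q", ["filler", "best"])

# Chain check: does the fact the user needs actually reach the model?
chain_ok = "30 days" in run_chain(QUESTION, CORPUS)

if __name__ == "__main__":
    print(f"retriever_ok={retriever_ok} prompt_ok={prompt_ok} chain_ok={chain_ok}")
```

Neither component is wrong in isolation. The bug lives in the unstated contract between them, which is why only an end-to-end check catches it.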
So what do we do? We treat the capstone as a systems problem, not a model problem. We start with the user standing at the front door: what do they need the house to do? Then we draw the blueprint. Then we pour the foundation. Then we run the plumbing and the wiring. Then we inspect everything before move-in day. Simple, no?
Each of the thirteen files below teaches one layer of this build. We move in exactly the order a real product team would move. We do not skip to the fun parts. We do not treat deployment as an afterthought. We treat every layer as load-bearing.
The placeholders you will see called back
| Placeholder | Meaning |
|---|---|
| the blueprint | system design — what you are building and why |
| the foundation | infrastructure choices that everything rests on |
| the plumbing | data pipelines connecting components |
| the inspection | evaluation suite checking the whole house |
| the move-in day | deployment — making it real for users |
These words appear in bold throughout all 13 files. When you see them, connect back to this house-building mental model. They are shorthand for whole engineering concepts.
Top resources
- Eugene Yan — LLM Patterns — production AI system patterns beyond the demo stage.
- Hamel Husain — LLM Evaluation Guide — practical evaluation design, not academic theory.
- Chip Huyen — Designing ML Systems — the canonical reference for ML system thinking.
- Shreya Shankar — Evaluation Honesty — evaluation failure modes explained clearly.
- OpenAI Cookbook — Production Best Practices — concrete prompt and system design patterns.
- LangSmith Docs — Tracing and Observability — observability tooling for LLM chains in production.
- Phil Schmid — Deployment Walkthroughs — deployment and fine-tuning guides with real code.
- The Pragmatic Engineer on AI — honest engineering takes on AI in real products.
- Anthropic — Building Effective Agents — when to use agents and when not to.
- Martin Fowler — Strangler Fig Pattern — safe incremental deployment, applies directly to AI rollouts.
What's coming
- 01-opening-failure.md — Parts pass, system fails: why integration is the hardest challenge.
- 02-system-design-blueprint.md — Start with the user job, not the model capability.
- 03-architecture-choices.md — Single pipeline vs RAG vs agent vs multi-agent: choosing correctly.
- 04-data-pipeline-design.md — Retrieval, context assembly, freshness, chunking for the project.
- 05-implementation-strategy.md — Build order, vertical slices, iteration discipline.
- 06-prompt-engineering-project.md — System prompts, few-shot, chain-of-thought for the project.
- 07-evaluation-design.md — End-to-end evals, not just component checks.
- 08-monitoring-observability.md — Traces, dashboards, alerting on quality drift.
- 09-cost-latency-management.md — Token budgets, caching, model routing for production.
- 10-deployment-strategy.md — Staging, canary, rollback, CI/CD for AI systems.
- 11-presentation-portfolio.md — How to present a capstone: demo, writeup, what interviewers want.
- 12-integration-debugging.md — When the whole system breaks: systematic debugging approach.
- 13-honest-admission.md — What we do not fully understand about building AI systems.
Bridge. Every house starts with a failure — the first time you stand back and see what does not fit. → 01-opening-failure.md