
03. Week 15 — Capstone Project Study Material

How to use this file

Use this document with the narrative in 02_explainer.md. The explainer gives the mental picture. This file gives the operational checklist, matrices, and references.

Cross-reference map

| Need | Start here | Then go here |
| --- | --- | --- |
| Understand why capstones fail | 02_explainer.md, Chapter 1 | Section 2 below |
| Choose architecture | 02_explainer.md, Chapter 2 | Sections 3-4 below |
| Plan implementation | 02_explainer.md, Chapter 3 | Section 5 below |
| Build evals and monitoring | 02_explainer.md, Chapter 4 | Sections 6-7 below |
| Package and present the work | 02_explainer.md, Chapter 5 | Sections 8-9 below |
| Self-test at the end | 04_daily_recall.md | 06_revision.md |

Section 1 — What a strong capstone demonstrates

A strong capstone is not a bigger hands-on lab. It is a proof of judgment: it shows you can balance user value, engineering complexity, quality, latency, and cost.

A hiring manager should be able to point at your project and say:

  • This person can ship an AI feature.
  • This person can explain trade-offs.
  • This person knows what to instrument.
  • This person knows where the risks are.

Section 2 — Capstone idea filter

| Filter | Green light | Red flag |
| --- | --- | --- |
| User | Specific persona with a real workflow | "Anyone who uses AI" |
| Scope | One narrow job-to-be-done | Full product platform |
| Data | Reachable data source or realistic mock | Undefined future dataset |
| Evaluation | Can create a gold set in days | Needs months of labeling |
| Demo | Easy to explain in two minutes | Needs long setup and context |
| Portfolio value | Maps to target companies | Interesting but irrelevant |

Section 3 — Architecture choice matrix

| Pattern | When to choose it | Benefits | Risks |
| --- | --- | --- | --- |
| Single request pipeline | One user action, low branching | Simple, debuggable, fast MVP | Can become rigid |
| RAG pipeline | User needs grounded answers | Strong factuality, inspectable context | Retrieval quality becomes the bottleneck |
| Tool-using agent | User action needs external systems | Can act, not just answer | Higher latency, more failure modes |
| Event-driven async stage | Slow background enrichment | Keeps user flow responsive | Harder observability |
| Multi-agent split | Different specialists are genuinely needed | Clear division of labor | Coordination overhead explodes |

Default rule: start with the simplest pipeline that can satisfy the user story. Add sophistication only after measuring need.

Section 4 — Contracts between components

Write contracts before wiring components together. If the contracts are vague, integration pain is guaranteed.

Minimum contracts to define:

  1. User request contract: input fields, auth context, session identifiers.
  2. Retrieval contract: query, filters, top-k, returned chunk schema.
  3. Tool contract: arguments, timeout, retry policy, safe fallback.
  4. Response contract: answer, citations, confidence, refusal reason.
  5. Telemetry contract: latency, tokens, cost, error code, trace id.
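As one possible illustration, the response and telemetry contracts from the list above could be pinned down as typed records. This is a minimal sketch, not a prescribed schema: the class names, field names, the 0-to-1 confidence range, and the rule that a response carries either an answer or a refusal reason are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Citation:
    source_id: str  # hypothetical identifier of the retrieved chunk
    quote: str      # supporting text that can be shown to the user


@dataclass(frozen=True)
class Response:
    answer: Optional[str]                  # None when the system refuses
    citations: tuple[Citation, ...] = ()
    confidence: float = 0.0                # assumed to lie in [0, 1]
    refusal_reason: Optional[str] = None

    def __post_init__(self) -> None:
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")
        # Assumed rule: exactly one of answer / refusal_reason is set.
        if (self.answer is None) == (self.refusal_reason is None):
            raise ValueError("set exactly one of answer or refusal_reason")


@dataclass(frozen=True)
class Telemetry:
    trace_id: str
    latency_ms: float
    tokens: int
    cost_usd: float
    error_code: Optional[str] = None       # None means the request succeeded
```

Rejecting malformed records at construction time means a vague contract shows up as an immediate error in one component rather than as a confusing failure two hops later.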

Suggested request envelope:

{
  "request_id": "uuid",
  "user_id": "string",
  "task_type": "ask|act|summarize",
  "input": "user message",
  "context": {
    "session_id": "string",
    "locale": "en-IN"
  }
}
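A contract is only useful if something enforces it at the boundary. The sketch below parses the envelope shown above and rejects requests that break it. It assumes that "ask|act|summarize" means one of three allowed values and that a missing locale defaults to "en-IN"; the function and class names are hypothetical.

```python
import json
from dataclasses import dataclass

# Assumption: the envelope's "ask|act|summarize" lists the allowed values.
ALLOWED_TASK_TYPES = {"ask", "act", "summarize"}


@dataclass(frozen=True)
class RequestEnvelope:
    request_id: str
    user_id: str
    task_type: str
    input: str
    session_id: str
    locale: str


def parse_envelope(raw: str) -> RequestEnvelope:
    """Parse a JSON request and reject anything that breaks the contract."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("envelope must be a JSON object")
    for key in ("request_id", "user_id", "task_type", "input", "context"):
        if key not in data:
            raise ValueError(f"missing field: {key}")
    if data["task_type"] not in ALLOWED_TASK_TYPES:
        raise ValueError(f"unknown task_type: {data['task_type']}")
    context = data["context"]
    if not isinstance(context, dict) or "session_id" not in context:
        raise ValueError("context.session_id is required")
    return RequestEnvelope(
        request_id=str(data["request_id"]),
        user_id=str(data["user_id"]),
        task_type=data["task_type"],
        input=str(data["input"]),
        session_id=str(context["session_id"]),
        locale=str(context.get("locale", "en-IN")),  # default is an assumption
    )
```

Calling `parse_envelope` once at the entry point lets every downstream component trust the fields it receives.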

Section 5 — Implementation sequencing

Phase 1: prove the user path

  • Hard-code weak points if needed.
  • Use the best available model first.
  • Get a visible output quickly.

Phase 2: replace critical stubs with real components

  • Swap mock retrieval for real retrieval.
  • Add tool safety checks.
  • Move prompts into versioned files.
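Swapping a stub for a real component is easier when both sit behind the same interface from Phase 1 onward. The following sketch shows one way to do that for retrieval. `Retriever`, `MockRetriever`, `KeywordRetriever`, and `answer` are hypothetical names, and the keyword-overlap scorer is only a stand-in for whatever real retrieval the project uses.

```python
from typing import Protocol


class Retriever(Protocol):
    def retrieve(self, query: str, top_k: int) -> list[dict]: ...


class MockRetriever:
    """Phase 1 stub: returns hard-coded chunks regardless of the query."""

    def __init__(self, chunks: list[dict]) -> None:
        self._chunks = chunks

    def retrieve(self, query: str, top_k: int) -> list[dict]:
        return self._chunks[:top_k]


class KeywordRetriever:
    """Phase 2 stand-in for real retrieval: ranks chunks by keyword overlap."""

    def __init__(self, chunks: list[dict]) -> None:
        self._chunks = chunks

    def retrieve(self, query: str, top_k: int) -> list[dict]:
        terms = set(query.lower().split())
        ranked = sorted(
            self._chunks,
            key=lambda c: len(terms & set(c["text"].lower().split())),
            reverse=True,
        )
        return ranked[:top_k]


def answer(query: str, retriever: Retriever) -> dict:
    """The user path depends only on the Retriever interface."""
    chunks = retriever.retrieve(query, top_k=2)
    # A real system would call a model here; this sketch echoes the top chunk.
    return {
        "answer": chunks[0]["text"] if chunks else None,
        "citations": [c["id"] for c in chunks],
    }
```

Because `answer` never names a concrete retriever, moving from Phase 1 to Phase 2 changes one constructor call rather than the user path itself.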

Phase 3: add inspection

  • Replay tests.
  • Gold queries.
  • Latency logging.
  • Cost logging.
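Latency and cost logging can start as a small wrapper around each stage. The sketch below is one way to do it with a context manager. The record fields mirror the telemetry contract from Section 4, while the `traced` name and the per-token price are assumptions; the price in particular is a placeholder, not a real provider rate.

```python
import json
import logging
import time
import uuid
from contextlib import contextmanager

logger = logging.getLogger("capstone.telemetry")

# Hypothetical price for illustration only; substitute your provider's rate.
USD_PER_1K_TOKENS = 0.002


@contextmanager
def traced(stage: str, records: list):
    """Time one stage and append a telemetry record, even if the stage fails."""
    record = {
        "trace_id": str(uuid.uuid4()),
        "stage": stage,
        "tokens": 0,          # the caller fills this in inside the with-block
        "error_code": None,
    }
    start = time.perf_counter()
    try:
        yield record
    except Exception as exc:
        record["error_code"] = type(exc).__name__
        raise
    finally:
        record["latency_ms"] = (time.perf_counter() - start) * 1000
        record["cost_usd"] = record["tokens"] / 1000 * USD_PER_1K_TOKENS
        records.append(record)
        logger.info(json.dumps(record))
```

A stage is then wrapped as `with traced("generate", records) as rec:` and sets `rec["tokens"]` once the token count is known. Failed stages are recorded too, which keeps error rate and cost honest.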

Phase 4: package the system

  • Containerize.
  • Add startup scripts.
  • Write README and architecture notes.
  • Record the demo.

Section 6 — System-level evaluation

Component evals are not enough. Capstones tend to fail at the handoffs between components, so measure the full chain.

| Eval type | What it catches | Example |
| --- | --- | --- |
| End-to-end gold set | Broken handoffs, wrong final answer | Retrieval okay, answer still wrong |
| Latency budget test | Slow composite workflows | Tool retry makes response unusable |
| Cost test | Hidden expensive paths | Agent loop burns tokens |
| Failure injection | Missing fallbacks | Retriever outage causes crash |
| Human review sample | User trust issues | Tone or action confidence is wrong |
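Two of these eval types, the end-to-end gold set and failure injection, can share one small harness. The sketch below is a toy version under stated assumptions: grading is a case-insensitive substring match, the "system" is a stub that wraps a retriever, and every name and document string is made up for the example.

```python
from typing import Callable


def run_gold_set(system: Callable[[str], str], gold: list[tuple[str, str]]) -> float:
    """Return the fraction of gold queries whose final answer contains the expected text."""
    if not gold:
        raise ValueError("gold set is empty")
    passed = 0
    for query, expected in gold:
        try:
            final_answer = system(query)
        except Exception:
            continue  # a crash counts as a failed case
        if expected.lower() in final_answer.lower():
            passed += 1
    return passed / len(gold)


def make_system(retriever: Callable[[str], str]) -> Callable[[str], str]:
    """Build a toy end-to-end system with a safe fallback for retriever outages."""
    def system(query: str) -> str:
        try:
            context = retriever(query)
        except ConnectionError:
            return "Sorry, I cannot look that up right now."
        return f"Based on our docs: {context}"
    return system


# Made-up documents and gold set, for illustration only.
DOCS = {
    "refund window": "Refunds are accepted within 30 days.",
    "shipping cost": "Shipping is free over $50.",
}
GOLD = [("refund window", "30 days"), ("shipping cost", "free")]


def healthy_retriever(query: str) -> str:
    return DOCS[query]


def broken_retriever(query: str) -> str:
    # Failure injection: simulate a retriever outage.
    raise ConnectionError("retriever outage")
```

Running `run_gold_set(make_system(healthy_retriever), GOLD)` scores the full chain, and swapping in `broken_retriever` checks that an outage degrades to the fallback message instead of crashing.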

Section 7 — Metrics to track from week one

| Area | Metric | Why it matters |
| --- | --- | --- |
| Quality | task success rate | Tells you if the system solves the job |
| Quality | citation faithfulness | Important for grounded systems |
| Reliability | error rate | Users feel this immediately |
| Reliability | fallback rate | Reveals brittle dependencies |
| Latency | p50 / p95 end-to-end | One slow hop ruins the experience |
| Cost | dollars per successful task | Honest portfolio metric |
| Operations | tokens per request | Explains cost swings |
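Most of these metrics fall out of per-request logs with a few lines of arithmetic. The sketch below assumes each record is a dictionary with `success`, `error`, `fallback`, `latency_ms`, `tokens`, and `cost_usd` keys (a hypothetical log shape) and uses the nearest-rank definition of a percentile, which is one of several common conventions.

```python
import math


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% of data at or below it."""
    ordered = sorted(values)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


def summarize(records: list[dict]) -> dict:
    """Aggregate per-request records into summary metrics."""
    n = len(records)
    if n == 0:
        raise ValueError("no records to summarize")
    successes = sum(1 for r in records if r["success"])
    latencies = [r["latency_ms"] for r in records]
    total_cost = sum(r["cost_usd"] for r in records)
    return {
        "task_success_rate": successes / n,
        "error_rate": sum(1 for r in records if r["error"]) / n,
        "fallback_rate": sum(1 for r in records if r["fallback"]) / n,
        "p50_ms": percentile(latencies, 50),
        "p95_ms": percentile(latencies, 95),
        "tokens_per_request": sum(r["tokens"] for r in records) / n,
        # Total spend, including failed requests, divided by successful tasks.
        "usd_per_successful_task": total_cost / successes if successes else float("inf"),
    }
```

Dividing total spend by successful tasks, rather than by all requests, is a deliberate choice in this sketch: money spent on failed attempts still counts against the tasks that worked.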

Section 8 — Deployment basics

You do not need perfect infrastructure this week. You do need a credible path.

Minimal deployment story:

  • Application packaged with a reproducible environment.
  • Config separated from code.
  • Secrets stored outside the repo.
  • One command to run locally.
  • One command or workflow to deploy.
  • Health check route and basic logs.
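Three of these items (config separated from code, secrets outside the repo, a health check route) fit in a short standard-library sketch. The environment variable names, the `/healthz` path, and the default values below are assumptions chosen for illustration; a real project would likely use its web framework's equivalents.

```python
import json
import os
from dataclasses import dataclass
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    model_name: str
    api_key: str
    port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read config from environment variables so nothing secret lives in the repo."""
    api_key = env.get("CAPSTONE_API_KEY")  # hypothetical variable name
    if not api_key:
        raise RuntimeError("CAPSTONE_API_KEY is not set")
    return Settings(
        model_name=env.get("CAPSTONE_MODEL", "placeholder-model"),
        api_key=api_key,
        port=int(env.get("CAPSTONE_PORT", "8000")),
    )


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path == "/healthz":
            body = json.dumps({"status": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)


def main() -> None:
    """Entry point: fail fast on missing config, then serve the health route."""
    settings = load_settings()
    HTTPServer(("0.0.0.0", settings.port), HealthHandler).serve_forever()
```

Calling `main()` from the project's run script gives the "one command to run locally" path, and failing at startup when the key is missing surfaces a misconfigured deploy before the first user request.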

Section 9 — Demo and portfolio packaging

The demo is part of the engineering work. If users cannot understand the value quickly, the project underperforms.

Use this order in the demo:

  1. Problem statement.
  2. Input from the user.
  3. Visible system action.
  4. Result with evidence.
  5. One failure mode and your mitigation.
  6. One number on quality.
  7. One number on latency or cost.

Section 10 — Reference material

YouTube

Blogs

Section 11 — What Module 16 will assume

Module 16 assumes you have already felt the pain of:

  • Integration challenges.
  • Cost and latency trade-offs.
  • Deployment basics.
  • Explaining system decisions to other engineers.

That is why this module matters. Next week turns your lived decisions into reusable principles.