04. Week 0 — Daily Recall

Spaced practice. Answer from memory. If stuck, jump straight to the referenced section in 02_explainer.md.

Monday (after ELI5 + chapter 1)

In the doctor analogy, what are the symptom list, the diagnosis, the confidence meter, the specialist committee, and the overthinking trap? (§ELI5)
Why does the rule “if temperature > 100, sick” fail on many real patients? (§ELI5)
A model scores 99% accuracy in training but only 60% in production. Explain this failure in plain language. (§1.1)
What is overfitting, exactly? (§1.1)
Why should a Lead AI Engineer ask about leakage and split design immediately? (§1.2)

Bias vs variance — define each in one sentence. (§2.1)
What pattern in train vs validation scores signals underfitting? (§2.1)
What pattern signals overfitting? (§2.1)
Draw the bias-variance curve from memory. What does the middle of the curve represent? (§2.1)
Why can one linear boundary fail even when the task feels simple? (§2.2)
Why does L1 regularization often set some weights exactly to zero? (§2.3)
In one sentence, when would you prefer L2 over L1? (§2.3)

Write the linear regression equation. What does each term mean? (§3.1)
Write the MSE objective from memory. Why does it punish large errors strongly? (§3.1)
Gradient descent — what is being updated, and in which direction? (§3.2)
Do one update step: w=5, gradient 1.2, learning rate 0.1. What is the new weight? (§3.2)
Logistic regression — what changes relative to linear regression? (§3.3)
Why does logistic regression still produce a linear decision boundary? (§3.3)
Compute sigmoid(0) and explain why it matters. (§3.3)
Give one example where an interaction feature makes a linear model much better. (§3.4)
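
Once you have answered the two arithmetic questions above from memory, a minimal sketch like the following can be used to check them. It assumes plain gradient descent (new weight = old weight minus learning rate times gradient) and the standard logistic sigmoid. The function names `gradient_step` and `sigmoid` are illustrative and are not taken from 02_explainer.md.

```python
import math


def gradient_step(w, gradient, learning_rate):
    """One plain gradient-descent update: move the weight against the gradient."""
    return w - learning_rate * gradient


def sigmoid(z):
    """Logistic function: squashes any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Values from the update-step question: w=5, gradient 1.2, learning rate 0.1.
new_w = gradient_step(w=5.0, gradient=1.2, learning_rate=0.1)
print(round(new_w, 2))  # 4.88

# Sigmoid at zero sits exactly halfway between the two classes.
print(sigmoid(0.0))  # 0.5
```

The weight decreases because the gradient is positive and the update moves against it. A sigmoid output of 0.5 corresponds to a linear score of zero, which is where a default 0.5 threshold places the decision boundary.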

What kind of questions does a decision tree ask? (§4.1)
Geometrically, what kind of boundary does a tree draw? (§4.1)
Random forest — which problem is it mainly reducing? (§4.2)
Why does averaging many trees help? (§4.2)
Gradient boosting — what does each new tree learn? (§4.3)
In one sentence, why does XGBoost often dominate tabular tasks? (§4.4)
When should deep learning beat boosting instead? (§4.4)
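
For the question on why averaging many trees helps, the following toy simulation may be useful after you have attempted an answer. It is a sketch under a strong simplifying assumption: each "tree" is modeled as the true value plus independent Gaussian noise, which real trees are not (their errors are correlated, so the variance reduction in an actual random forest is smaller). All names and the numbers 10.0, 2.0, and 50 are hypothetical choices for illustration.

```python
import random
import statistics


def noisy_prediction(rng, true_value=10.0, noise_sd=2.0):
    """Stand-in for one high-variance model: truth plus independent noise."""
    return rng.gauss(true_value, noise_sd)


def ensemble_prediction(rng, n_models):
    """Average the predictions of n_models independent noisy models."""
    return statistics.fmean(noisy_prediction(rng) for _ in range(n_models))


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers

# Repeat each experiment many times to estimate how much predictions scatter.
single = [noisy_prediction(rng) for _ in range(2000)]
averaged = [ensemble_prediction(rng, n_models=50) for _ in range(2000)]

var_single = statistics.pvariance(single)
var_averaged = statistics.pvariance(averaged)

print(f"variance of one model:        {var_single:.3f}")
print(f"variance of 50-model average: {var_averaged:.3f}")
```

Under the independence assumption the variance of the average shrinks roughly in proportion to the number of models, while the mean (and therefore the bias) stays the same. This is the sense in which bagging targets variance rather than bias.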

Why must train, validation, and test play different roles? (§5.1)
What mistake turns the test set into a fake validation set? (§5.1)
When do you need stratified k-fold? Group k-fold? Time-series split? (§5.1)
Given TP=18, FP=6, FN=12, TN=64, compute precision. (§5.2)
Using the same numbers, compute recall and F1. (§5.2)
ROC-AUC vs PR-AUC — when does PR-AUC matter more? (§5.2, §5.4)
What is calibration? Give the 0.9-confidence example. (§5.3)
Why is 99% accuracy meaningless on a 1%-positive fraud problem? (§5.4)
How does threshold choice trade off precision and recall? (§5.4)
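
To check the confusion-matrix questions above after working them by hand, a minimal sketch follows. It uses the standard definitions of precision, recall, F1, and accuracy. The function name `classification_metrics` is illustrative, and the second example (10 fraud cases in 1,000 transactions, with a model that always predicts "not fraud") is a hypothetical instance of the 1%-positive scenario, not data from the source.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall, F1, and accuracy from confusion-matrix counts.

    Ratios with a zero denominator are reported as 0.0 by convention here.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Counts from the worksheet: TP=18, FP=6, FN=12, TN=64.
worksheet = classification_metrics(tp=18, fp=6, fn=12, tn=64)
print({name: round(value, 3) for name, value in worksheet.items()})
# {'precision': 0.75, 'recall': 0.6, 'f1': 0.667, 'accuracy': 0.82}

# Hypothetical 1%-positive case: a model that never flags fraud.
always_negative = classification_metrics(tp=0, fp=0, fn=10, tn=990)
print({name: round(value, 3) for name, value in always_negative.items()})
# {'precision': 0.0, 'recall': 0.0, 'f1': 0.0, 'accuracy': 0.99}
```

The second result shows why accuracy alone misleads on imbalanced data: the always-negative model reaches 99% accuracy while catching none of the positive cases.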

Explain the entire module using the doctor analogy only. (§ELI5)
Draw the bias-variance curve, L1 diamond, and L2 circle from memory. (§2.1, §2.3)
Compare logistic regression, random forest, and XGBoost in under 90 seconds. (§3.3, §4.2-§4.4)
Design an evaluation plan for a churn model with monthly data. (§5.1)
Define calibration, class imbalance, and leakage without notes. (§5.1, §5.3, §5.4)
Re-answer the self-check questions in 01_weekly_plan.md cold.