Exercise 10: Backprop By Hand

Timebox: 60 minutes

Goal

Implement the forward and backward passes of a 2-layer MLP in pure NumPy, without torch.autograd or TensorFlow. This is a common foundations question in AI engineering interview loops.

Work in

backprop.py

Tasks

forward(x, W1, b1, W2, b2) → returns logits and intermediates needed for backward.
cross_entropy(logits, y) → scalar mean loss, computed in a numerically stable way (e.g., by subtracting the row-wise max logit before exponentiating).
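backward(intermediates, y) → gradients of the loss w.r.t. W1, b1, W2, b2. The intermediates must include W2, because the gradient flowing into the hidden layer depends on it.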
step(params, grads, lr) → vanilla SGD update.
Train a tiny binary classifier on a synthetic 2D dataset (e.g., XOR) to convergence.
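
One possible shape for the four functions above, shown as a minimal sketch and not as a reference solution. The exercise fixes only the function signatures. Everything else here is an assumption: the tanh hidden activation, the dict-based cache, the `log_softmax` and `train_xor` helpers, the hidden size, the learning rate, and the step count.

```python
import numpy as np

def forward(x, W1, b1, W2, b2):
    """Forward pass: x -> tanh hidden layer -> logits."""
    h = np.tanh(x @ W1 + b1)            # hidden activations, shape (N, H)
    logits = h @ W2 + b2                # class scores, shape (N, C)
    # Cache everything backward() needs, including W2.
    return logits, {"x": x, "h": h, "W2": W2, "logits": logits}

def log_softmax(logits):
    """Row-wise log-softmax; subtracting the max keeps exp() from overflowing."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def cross_entropy(logits, y):
    """Mean negative log-likelihood of integer labels y, shape (N,)."""
    n = logits.shape[0]
    return -log_softmax(logits)[np.arange(n), y].mean()

def backward(intermediates, y):
    """Gradients of the mean cross-entropy w.r.t. W1, b1, W2, b2."""
    x, h, W2 = intermediates["x"], intermediates["h"], intermediates["W2"]
    n = x.shape[0]
    dlogits = np.exp(log_softmax(intermediates["logits"]))  # softmax probabilities
    dlogits[np.arange(n), y] -= 1.0     # softmax(z) - onehot(y)
    dlogits /= n                        # the loss is a mean over the batch
    dW2 = h.T @ dlogits
    db2 = dlogits.sum(axis=0)
    dh = dlogits @ W2.T                 # backprop into the hidden activations
    dpre = dh * (1.0 - h ** 2)          # tanh'(a) = 1 - tanh(a)^2
    dW1 = x.T @ dpre
    db1 = dpre.sum(axis=0)
    return {"W1": dW1, "b1": db1, "W2": dW2, "b2": db2}

def step(params, grads, lr):
    """Vanilla SGD: move each parameter against its gradient."""
    return {k: params[k] - lr * grads[k] for k in params}

def train_xor(hidden=8, lr=0.2, steps=3000, seed=0):
    """Full-batch training on the four XOR points (hyperparameters are guesses)."""
    rng = np.random.default_rng(seed)
    x = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
    y = np.array([0, 1, 1, 0])
    params = {
        "W1": rng.normal(0.0, 1.0, (2, hidden)), "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 1.0, (hidden, 2)), "b2": np.zeros(2),
    }
    losses = []
    for _ in range(steps):
        logits, cache = forward(x, params["W1"], params["b1"],
                                params["W2"], params["b2"])
        losses.append(cross_entropy(logits, y))
        params = step(params, backward(cache, y), lr)
    return params, losses

if __name__ == "__main__":
    _, losses = train_xor()
    print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.4f}")
```

If the loss stalls instead of approaching zero, a different seed, a larger hidden layer, or a smaller learning rate are reasonable first things to try.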

Done when

Loss trends downward over training (individual steps may be noisy, but a smoothed loss curve should decrease)
Analytic gradients agree with a finite-difference estimate to within 1e-4
You can derive the cross-entropy + softmax gradient on a whiteboard
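
The whiteboard result to aim for is that, for a mean cross-entropy loss over N examples, the gradient with respect to the logits z is (softmax(z) - onehot(y)) / N. The sketch below checks that formula against central differences. The helper names `softmax_ce_grad`, `numerical_grad`, and `max_rel_error` are hypothetical, and the choice of relative error and of eps = 1e-5 is an assumption; the exercise states only the 1e-4 tolerance.

```python
import numpy as np

def cross_entropy(logits, y):
    """Numerically stable mean cross-entropy for integer labels y."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(y)), y].mean()

def softmax_ce_grad(logits, y):
    """Analytic gradient of the mean loss: (softmax(z) - onehot(y)) / N."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    p = np.exp(shifted)
    p /= p.sum(axis=1, keepdims=True)
    p[np.arange(len(y)), y] -= 1.0
    return p / len(y)

def numerical_grad(f, w, eps=1e-5):
    """Central-difference gradient of the scalar f() w.r.t. array w.

    f takes no arguments and reads w, which is perturbed in place
    one entry at a time and restored afterwards.
    """
    grad = np.zeros_like(w)
    for idx in np.ndindex(w.shape):
        orig = w[idx]
        w[idx] = orig + eps
        f_plus = f()
        w[idx] = orig - eps
        f_minus = f()
        w[idx] = orig                   # restore the original entry
        grad[idx] = (f_plus - f_minus) / (2 * eps)
    return grad

def max_rel_error(a, b):
    """Largest elementwise relative error, guarded against division by zero."""
    return np.max(np.abs(a - b) / np.maximum(1e-8, np.abs(a) + np.abs(b)))

rng = np.random.default_rng(0)
logits = rng.normal(size=(5, 3))
y = rng.integers(0, 3, size=5)
analytic = softmax_ce_grad(logits, y)
numeric = numerical_grad(lambda: cross_entropy(logits, y), logits)
print("max relative error:", max_rel_error(analytic, numeric))
```

The same `numerical_grad` can be pointed at W1, b1, W2, or b2 by passing a closure that reruns the forward pass and loss, which is how the check would extend to the full network.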

Stretch

Replace SGD with momentum, then with Adam
Compare ReLU and tanh hidden activations and report the differences in convergence
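
For the optimizer stretch goal, the two update rules can be sketched as variants of step that also carry optimizer state. This is an illustration under assumptions: the names `step_momentum` and `step_adam`, the dict-of-arrays parameter layout, and the default hyperparameters are not specified by the exercise. The momentum form shown (v = beta * v + g, then w -= lr * v) is one common convention among several.

```python
import numpy as np

def step_momentum(params, grads, state, lr=0.1, beta=0.9):
    """SGD with momentum. state maps parameter name -> velocity array."""
    new_params, new_state = {}, {}
    for k in params:
        prev_v = state.get(k, np.zeros_like(params[k]))
        v = beta * prev_v + grads[k]    # accumulate a decaying sum of gradients
        new_state[k] = v
        new_params[k] = params[k] - lr * v
    return new_params, new_state

def step_adam(params, grads, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam with bias correction. state holds the step count t and moments m, v."""
    t = state.get("t", 0) + 1
    m_prev, v_prev = state.get("m", {}), state.get("v", {})
    new_params, m, v = {}, {}, {}
    for k in params:
        g = grads[k]
        m[k] = beta1 * m_prev.get(k, 0.0) + (1 - beta1) * g       # first moment
        v[k] = beta2 * v_prev.get(k, 0.0) + (1 - beta2) * g * g   # second moment
        m_hat = m[k] / (1 - beta1 ** t)  # correct the bias toward zero at small t
        v_hat = v[k] / (1 - beta2 ** t)
        new_params[k] = params[k] - lr * m_hat / (np.sqrt(v_hat) + eps)
    return new_params, {"t": t, "m": m, "v": v}
```

Both functions start from an empty state dict (`state = {}`) and return the updated state alongside the new parameters, so a training loop only has to thread one extra value through each iteration.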