Exercise 08 — Sampling Strategies¶
Timebox: 30-45 minutes
Goal¶
Implement greedy, temperature, top-k, and top-p (nucleus) sampling from a logits array. Cover the most common live-coding question on decoding.
Work in¶
sampling.py
Tasks¶
greedy(logits)— argmax.temperature_sample(logits, T)— scale logits, softmax, sample.top_k(logits, k, T)— keep top-k logits, mask the rest, sample.top_p(logits, p, T)— nucleus: smallest set whose softmax sums to ≥ p.- Combine:
sample(logits, T, k=None, p=None)that applies whichever filters were given.
Done when¶
- Pure NumPy or pure PyTorch, no library helpers
- A unit test confirms each function does what it says on a hand-picked logits array
- You can explain to a peer why top-p is preferred when output entropy varies a lot
Stretch¶
- Add repetition penalty
- Add
min_p(any token belowmin_p × max_probmasked)