# LLM Training Pipeline

The chapters in this module, in reading order.

| #  | Chapter |
|----|---------|
| 00 | LLM Training Lifecycle: The Five-Year-Old Version |
| 01 | Base Model Product Contract: fluency is not usefulness |
| 02 | Curriculum Data Mix: the model learns what it repeatedly reads |
| 03 | Next-Token Training Loop: the tiny contract repeated billions of times |
| 04 | Memory and Parallelism: training breaks where bytes multiply |
| 05 | PyTorch and Hugging Face Tooling: abstractions with escape hatches |
| 06 | SFT Behavior Copying: same loss, different scenes |
| 07 | Chat Protocol and Data Quality: behavior lives in tiny boundaries |
| 08 | Preferences, Reward Models, PPO, and DPO: choosing better answers without breaking the model |
| 09 | Lifecycle Decisions and Evals: choosing the next knob without fooling yourself |
| 10 | Honest Admission: the lifecycle works, but the knobs are not fully understood |