04. Week 14 — Daily Recall
Companion files: Weekly Plan · Explainer · Study Material · Assignment · Revision
How to use: Answer each question out loud, from memory, before checking 02_explainer.md or 03_study_material.md. Aim for 1-2 sentences per answer. Terse is fine; wrong is not.
Monday
- [W13] CLIP — what does it learn and how?
- [W13] ViT — how does it apply transformers to images?
- Diffusion forward process — what happens in T steps? What is x_T?
Tuesday
- Write the closed-form expression for x_t given x_0. Name every variable.
- Stable Diffusion architecture — text encoder, U-Net, VAE. Role of each component.
- Latent diffusion — why work in latent space instead of pixel space? What is the speedup?
Wednesday
- Classifier-free guidance — what is trained differently? What is computed at inference?
- Guidance scale w: what happens at w = 1, w = 7, and w = 20?
- [W13] GAN generator vs discriminator — what does each do?
Thursday
- Text-to-image eval — how do you measure quality? (FID, CLIP score, human eval — limitations of each)
- Diffusion vs GAN — trade-offs (quality, diversity, training stability, inference speed).
- [W11] RAG vs fine-tuning vs prompting — decision framework.
Friday
- DiT vs U-Net — architectural difference. Which scales better and why?
- Name three inference speed-up techniques (DDIM, LCM, consistency models). One sentence each.
- Image generation safety — what production guardrails are needed? Name three.
After completing the week: run the four Retrieval Prompts in 02_explainer.md as a final self-test before starting 06_revision.md.