Interview Bank

A topic-organized question bank for AI engineer interviews (2026). Built from verbatim questions in public interview reports, with structured answer outlines, follow-ups, and traps.

This is the drilling material. For the "what to study and why" framing, see applied_ai_interview_focus.md.

How to use

Pick a topic file from the tree below.
Each entry starts with `### Q: "..."`, the actual question phrasing from a real loop.
Tags tell you who asks it and how often. Filter by `very-common` first when cramming.
The answer outline is a skeleton, not a script. Practice speaking it out loud.
Pressure-test yourself with the follow-ups before moving on.
The "Numbers to drop" bullet is the senior signal — keep concrete numbers handy.
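
The filtering step above can be scripted. What follows is a minimal sketch, not tooling that ships with the bank: it relies on the `### Q: "..."` heading format described above and assumes (this is not confirmed by this page) that each entry's tags appear somewhere in its body as backticked values such as `very-common`. The names `iter_entries` and `with_tag` are hypothetical.

```python
import re
from typing import Iterator

# Matches the entry heading format described above: ### Q: "..."
QUESTION_RE = re.compile(r'^### Q:\s*"(?P<question>.+)"\s*$')


def iter_entries(markdown: str) -> Iterator[tuple[str, str]]:
    """Yield (question, body) pairs from the text of one topic file."""
    question = None
    body: list[str] = []
    for line in markdown.splitlines():
        match = QUESTION_RE.match(line)
        if match:
            # A new heading closes the previous entry, if there was one.
            if question is not None:
                yield question, "\n".join(body)
            question, body = match.group("question"), []
        elif question is not None:
            body.append(line)
    if question is not None:
        yield question, "\n".join(body)


def with_tag(markdown: str, tag: str) -> list[str]:
    """Return the questions whose entry body contains the backticked tag."""
    # ASSUMPTION: tags are written as backticked values inside the entry body.
    # Including the backticks keeps `common` from matching `very-common`.
    needle = f"`{tag}`"
    return [q for q, body in iter_entries(markdown) if needle in body]
```

Reading a topic file with `pathlib.Path(...).read_text()` and passing the text to `with_tag(text, "very-common")` would give the short list to drill first. If the topic files store tags differently, only the `needle` check needs to change.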

Tagging legend

Axis	Values	Meaning
Seniority	`screen` · `mid` · `senior` · `staff`	Loop level the question typically appears in
Frequency	`very-common` · `common` · `occasional`	How often it shows up in real 2026 reports
Type	`conceptual` · `scenario` · `design` · `debugging` · `coding`	What skill it tests
Source	free-text	Named report, repo, or company loop the phrasing came from
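
When adding or reviewing entries, the legend can double as a checklist. The sketch below encodes the three closed axes from the table and flags values outside them. It assumes that every entry carries all four axes and that tags have already been parsed into a dict keyed by lowercase axis name; both are illustrative assumptions, and `LEGEND` and `tag_problems` are hypothetical names.

```python
# Allowed values per axis, copied from the tagging legend above.
LEGEND = {
    "seniority": {"screen", "mid", "senior", "staff"},
    "frequency": {"very-common", "common", "occasional"},
    "type": {"conceptual", "scenario", "design", "debugging", "coding"},
}


def tag_problems(tags: dict[str, str]) -> list[str]:
    """Return human-readable problems for one entry's tags (empty if none)."""
    problems = []
    for axis, allowed in LEGEND.items():
        value = tags.get(axis)
        if value is None:
            # ASSUMPTION: every entry is expected to set every axis.
            problems.append(f"missing {axis}")
        elif value not in allowed:
            problems.append(f"unknown {axis}: {value}")
    # Source is free text, so only check that it is present and non-blank.
    if not tags.get("source", "").strip():
        problems.append("missing source")
    return problems
```

An entry tagged with a seniority of `junior`, for example, would be reported as `unknown seniority: junior`, since the legend has no such level.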

Cramming order

The order mirrors the tier structure in applied_ai_interview_focus.md:

Tier 1 (every loop): rag/rag-fundamentals.md, rag/rag-advanced.md, agents/agents-design.md, conceptual/prompt-engineering.md, conceptual/evals-production.md, cross-cutting-tradeoffs.md
Tier 2 (senior loops): conceptual/fine-tuning-adaptation.md, production/cost-latency-optimization.md, conceptual/safety-guardrails.md, production/mlops-deployment.md, conceptual/observability-tracing.md, rag/retrieval-and-ranking.md, system-design/ai-system-design.md
Tier 3 (role-specific + coding): everything else — including production/forward-deployed-engineering.md for FDE / solutions-engineer / customer-facing loops

File tree

cross-cutting-tradeoffs.md         # "X vs Y" decision questions (RAG vs FT, etc.)

conceptual/
  llm-fundamentals.md              # tokenization, attention, KV cache, scaling
  prompt-engineering.md            # system prompts, CoT, structured output, versioning
  fine-tuning-adaptation.md        # LoRA/QLoRA, DPO/RLHF, distillation, quantization
  evals-production.md              # golden sets, LLM-as-judge, drift, RAGAS
  safety-guardrails.md             # prompt injection, PII, red-teaming, moderation
  observability-tracing.md         # spans, OTel, LangSmith, debugging traces

rag/
  rag-fundamentals.md              # the end-to-end pipeline, citations, hallucinations
  rag-advanced.md                  # graph RAG, agentic RAG, multi-modal, hybrid scoring
  retrieval-and-ranking.md         # BM25, dense, hybrid, rerankers, RRF
  vector-databases.md              # pgvector, Pinecone, Weaviate, Qdrant trade-offs

agents/
  agents-design.md                 # ReAct, tools, schemas, stopping rules, MCP
  agents-debugging-production.md   # trace-driven debugging, loops, partial-failure
  multi-agent-orchestration.md     # planner/worker, hierarchical, communication
  memory-systems.md                # short/long-term, episodic vs semantic, eviction

production/
  cost-latency-optimization.md     # router models, caching, batching, distillation
  mlops-deployment.md              # canary, shadow, blue-green, traffic spikes
  inference-serving.md             # vLLM, TGI, TensorRT-LLM, SGLang
  incident-response.md             # postmortems, partial-failure playbooks
  forward-deployed-engineering.md  # FDE/solutions: client integration, adoption, trust

system-design/
  ai-system-design.md              # design ChatGPT, 10M-doc Q&A, voice, etc.

coding/
  ml-coding-rounds.md              # MHA, LoRA, beam search, top-p, autoregressive loop
  classic-algo.md                  # LRU, trie, union-find — AI-eng flavored
  practical-takehomes.md           # JSON+LLM, web crawler, doc NLP