Interview Bank

A topic-organized question bank for AI engineer interviews (2026). Built from verbatim questions in public interview reports, with structured answer outlines, follow-ups, and traps.

This is the drilling material. For the "what to study and why" framing, see applied_ai_interview_focus.md.

How to use

  1. Pick a topic file from the tree below.
  2. Each entry starts with ### Q: "..." — the actual question phrasing from a real loop.
  3. Tags tell you who asks it and how often. Filter by very-common first when cramming.
  4. The answer outline is a skeleton, not a script. Practice speaking it out loud.
  5. Pressure-test yourself with the follow-ups before moving on.
  6. The "Numbers to drop" bullet is the senior signal — keep concrete numbers handy.
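
If you would rather filter entries with a script than by eye, steps 2 and 3 can be sketched roughly as below. This is a minimal sketch, not tooling shipped with the bank: the function names `split_entries` and `filter_by_frequency` are hypothetical, and it assumes only that each entry starts with a `### Q:` heading and that the frequency value (for example `very-common`) appears verbatim somewhere in the entry's text. The exact layout of the tag line inside the topic files is not specified here, so adjust the matching if your files differ.

```python
import re
from typing import Dict, List

# Assumption: every entry begins with a heading line such as  ### Q: "question text"
ENTRY_HEADING = re.compile(r'^### Q:\s*"?(.*?)"?\s*$')


def split_entries(markdown_text: str) -> List[Dict]:
    """Split a topic file into entries keyed by their '### Q:' heading."""
    entries: List[Dict] = []
    current = None
    for line in markdown_text.splitlines():
        match = ENTRY_HEADING.match(line)
        if match:
            # A new question heading starts a new entry.
            current = {"question": match.group(1), "body": []}
            entries.append(current)
        elif current is not None:
            # Everything until the next heading belongs to the current entry.
            current["body"].append(line)
    return entries


def filter_by_frequency(entries: List[Dict], frequency: str = "very-common") -> List[Dict]:
    """Keep entries whose body mentions the given frequency value as a whole token.

    Assumption: the tag value appears verbatim in the entry text. The lookarounds
    stop 'common' from matching inside 'very-common'.
    """
    token = re.compile(r"(?<![\w-])" + re.escape(frequency) + r"(?![\w-])")
    return [e for e in entries if any(token.search(line) for line in e["body"])]
```

For example, `filter_by_frequency(split_entries(open("rag/rag-fundamentals.md").read()))` would list the very-common entries in that file, provided the assumptions above hold for it.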

Tagging legend

| Axis | Values | Meaning |
|---|---|---|
| Seniority | screen · mid · senior · staff | Loop level the question typically appears in |
| Frequency | very-common · common · occasional | How often it shows up in real 2026 reports |
| Type | conceptual · scenario · design · debugging · coding | What skill it tests |
| Source | free-text | Named report, repo, or company loop the phrasing came from |

Cramming order

Mirror the tier structure from applied_ai_interview_focus.md:

  1. Tier 1 (every loop): rag/rag-fundamentals.md, rag/rag-advanced.md, agents/agents-design.md, conceptual/prompt-engineering.md, conceptual/evals-production.md, cross-cutting-tradeoffs.md
  2. Tier 2 (senior loops): conceptual/fine-tuning-adaptation.md, production/cost-latency-optimization.md, conceptual/safety-guardrails.md, production/mlops-deployment.md, conceptual/observability-tracing.md, rag/retrieval-and-ranking.md, system-design/ai-system-design.md
  3. Tier 3 (role-specific + coding): everything else — including production/forward-deployed-engineering.md for FDE / solutions-engineer / customer-facing loops

File tree

cross-cutting-tradeoffs.md         # "X vs Y" decision questions (RAG vs FT, etc.)

conceptual/
  llm-fundamentals.md              # tokenization, attention, KV cache, scaling
  prompt-engineering.md            # system prompts, CoT, structured output, versioning
  fine-tuning-adaptation.md        # LoRA/QLoRA, DPO/RLHF, distillation, quantization
  evals-production.md              # golden sets, LLM-as-judge, drift, RAGAS
  safety-guardrails.md             # prompt injection, PII, red-teaming, moderation
  observability-tracing.md         # spans, OTel, LangSmith, debugging traces

rag/
  rag-fundamentals.md              # the end-to-end pipeline, citations, hallucinations
  rag-advanced.md                  # graph RAG, agentic RAG, multi-modal, hybrid scoring
  retrieval-and-ranking.md         # BM25, dense, hybrid, rerankers, RRF
  vector-databases.md              # pgvector, Pinecone, Weaviate, Qdrant trade-offs

agents/
  agents-design.md                 # ReAct, tools, schemas, stopping rules, MCP
  agents-debugging-production.md   # trace-driven debugging, loops, partial-failure
  multi-agent-orchestration.md     # planner/worker, hierarchical, communication
  memory-systems.md                # short/long-term, episodic vs semantic, eviction

production/
  cost-latency-optimization.md     # router models, caching, batching, distillation
  mlops-deployment.md              # canary, shadow, blue-green, traffic spikes
  inference-serving.md             # vLLM, TGI, TensorRT-LLM, SGLang
  incident-response.md             # postmortems, partial-failure playbooks
  forward-deployed-engineering.md  # FDE/solutions: client integration, adoption, trust

system-design/
  ai-system-design.md              # design ChatGPT, 10M-doc Q&A, voice, etc.

coding/
  ml-coding-rounds.md              # MHA, LoRA, beam search, top-p, autoregressive loop
  classic-algo.md                  # LRU, trie, union-find — AI-eng flavored
  practical-takehomes.md           # JSON+LLM, web crawler, doc NLP