01. Containers and Images — Pack once, run clean¶

⏱️ Estimated time: 18 min | Level: intermediate

ELI5 callback: Think of a busy shipping port. The dock manager must place every container on the right ship. Heavy ML work needs a cargo crane, and port security keeps lanes and permissions clean.

Why containers beat snowflake servers¶

Containers solve environment drift between laptop, CI, and production. A container packages code, runtime, libraries, and startup command together. Keep the analogy close. The dock manager reads the manifest, the container carries one workload unit, the ship offers capacity, the cargo crane handles ML-heavy lifts, and port security blocks unsafe access. Simple, no? One artifact then moves cleanly from build stage to runtime stage. See. Start with consistency, not magic. Now watch.

build flow
┌────────────┐    ┌────────────┐    ┌────────────┐
│ source     │ -> │ image      │ -> │ runtime    │
└─────┬──────┘    └─────┬──────┘    └─────┬──────┘
      │                 │                 │
      v                 v                 v
  repeatable | portable | fast start
Same recipe, different machines.

Containers share the host kernel, so they start faster than VMs. They are lighter than full virtual machines but still reasonably isolated through Linux namespaces and cgroups. Isolation helps operations, but a shared kernel is not a complete security boundary by itself. Kubernetes later schedules pods, which usually wrap one main container. In interviews, explain reproducibility before dropping Docker vocabulary everywhere. Separate build concerns from deployment concerns from day one. So what to do? Use one main process per container unless two processes are tightly coupled. Write logs to stdout and stderr, not to files hidden inside the container. Treat the writable filesystem as disposable unless storage is attached. Pass configuration through env vars or mounted files, not rebuilds.
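A minimal sketch of the last two habits, assuming a Python service: configuration arrives through an environment mapping, and logs go to stdout. The names `APP_PORT`, `LOG_LEVEL`, `load_config`, and `make_logger` are hypothetical and chosen only for illustration.

```python
import logging
import sys


def load_config(env):
    """Read settings from an environment mapping and fail fast on bad input."""
    if "APP_PORT" not in env:
        raise KeyError("missing required config: APP_PORT")
    port = int(env["APP_PORT"])  # raises ValueError if the value is malformed
    return {"port": port, "log_level": env.get("LOG_LEVEL", "INFO")}


def make_logger(level):
    """Log to stdout so the container runtime can collect the stream."""
    logger = logging.getLogger("app")
    logger.handlers.clear()
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(level)
    return logger


# In a real container you would pass os.environ; a plain dict keeps the demo
# independent of the machine it runs on.
config = load_config({"APP_PORT": "8080"})
logger = make_logger(config["log_level"])
logger.info("starting on port %d", config["port"])
```

The same image can then run in dev, staging, and production with different env vars and no rebuild.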

How images and layers actually work¶

An image is a read-only template, not a running process. The Dockerfile is the recipe that builds it. Instructions that change the filesystem, such as RUN, COPY, and ADD, each create a cached image layer. Good layer order speeds rebuilds and shrinks network transfer time. Bad layer order buries secrets, cache junk, and giant toolchains inside the image. See. Layers reward discipline. Now watch.

layer stack
┌────────────┐    ┌────────────┐    ┌────────────┐
│ base       │ -> │ deps       │ -> │ app        │
└─────┬──────┘    └─────┬──────┘    └─────┬──────┘
      │                 │                 │
      v                 v                 v
  OS libs | framework | code
Earlier layers should change less often.

Put rarely changing steps first, like OS packages and language runtimes. Put fast-changing app code later so cache hits stay high. Multi-stage builds keep compilers and test tools out of production. Copy only needed files with a tight build context. Use .dockerignore aggressively to avoid surprise bloat in layers. Pin releases by immutable digest when you need confidence. So what to do? Never bake credentials into an image layer. They stay in the layer history even if a later step deletes the file. Scan images for CVEs before promotion to higher environments. Prefer slim base images, but keep debugging practical for teams. Rebuild often because base layers age even when code does not.
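To see why order matters, here is a toy model of layer caching, not the real builder's algorithm. It assumes each layer's cache key depends on its own instruction, its input content, and the key of the layer before it. The step lists and the `python:3.12-slim` base are illustrative only.

```python
import hashlib


def layer_keys(steps):
    """Return one cache key per step; each key chains in all earlier steps."""
    keys = []
    parent = ""
    for instruction, content in steps:
        raw = f"{parent}|{instruction}|{content}".encode()
        parent = hashlib.sha256(raw).hexdigest()
        keys.append(parent)
    return keys


def cache_hits(old_steps, new_steps):
    """Count the leading layers a rebuild can reuse from the previous build."""
    hits = 0
    for old, new in zip(layer_keys(old_steps), layer_keys(new_steps)):
        if old != new:
            break
        hits += 1
    return hits


def good_order(app_version):
    # Dependencies are copied and installed before the fast-changing code.
    return [
        ("FROM", "python:3.12-slim"),
        ("COPY requirements.txt", "deps v1"),
        ("RUN install deps", ""),
        ("COPY src", app_version),
    ]


def bad_order(app_version):
    # Everything is copied at once, so any code change precedes the install.
    return [
        ("FROM", "python:3.12-slim"),
        ("COPY everything", "deps v1 + " + app_version),
        ("RUN install deps", ""),
    ]


print(cache_hits(good_order("app v1"), good_order("app v2")))  # 3
print(cache_hits(bad_order("app v1"), bad_order("app v2")))    # 1
```

With the good order, a code change reuses the base and dependency layers. With the bad order, the same change forces the dependency install to run again.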

Registries, tags, and pull behavior¶

A registry stores images so clusters can pull them later. Tags are convenient labels, but digests are the real identity. Pull policy decides when a node fetches a fresh copy. Image provenance matters because supply-chain compromises tend to stay silent at first. See. Shipping labels matter only when the manifest is trustworthy. Now watch.

push and pull
┌────────────┐    ┌────────────┐    ┌────────────┐
│ build      │ -> │ push       │ -> │ pull       │
└─────┬──────┘    └─────┬──────┘    └─────┬──────┘
      │                 │                 │
      v                 v                 v
  CI artifact | registry | cluster node
Promotion should move one exact artifact.

The latest tag is convenient for demos and painful for debugging. Private registries need auth, mirrors, cleanup, and retention rules. Regional mirrors reduce cold-start delay during sudden traffic spikes. Pre-pulled images help big fleets when many pods start together. Signed images improve trust when many teams share one platform. Admission policies can block unapproved registries or mutable tags. So what to do? Use immutable release tags for staging and production. Set retention rules, or registry bills quietly keep growing. Monitor pull failures such as ErrImagePull and ImagePullBackOff, because at a glance they look like random pod crashes. Separate dev, staging, and prod repositories with clear rules.
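The difference between a tag and a digest can be sketched with a toy in-memory registry. This is a simplification: real registries compute the digest over the image manifest, while this sketch hashes the raw bytes directly. `ToyRegistry` and its methods are hypothetical names.

```python
import hashlib


class ToyRegistry:
    """In-memory stand-in for a registry: tags are pointers, digests are content."""

    def __init__(self):
        self.blobs = {}  # digest -> content
        self.tags = {}   # tag -> digest (mutable pointer)

    def push(self, tag, content):
        digest = "sha256:" + hashlib.sha256(content).hexdigest()
        self.blobs[digest] = content
        self.tags[tag] = digest  # re-pushing a tag silently moves it
        return digest

    def pull(self, reference):
        # A reference is either a tag or a digest.
        digest = self.tags.get(reference, reference)
        return self.blobs[digest]


registry = ToyRegistry()
first = registry.push("latest", b"build 1")
second = registry.push("latest", b"build 2")

print(registry.pull("latest"))  # b'build 2' (the tag moved)
print(registry.pull(first))     # b'build 1' (the digest did not)
```

Anything deployed by `latest` changed underneath you on the second push, while a deployment pinned to `first` still gets the exact bytes it was tested with.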

Runtime details that surprise teams¶

Running a container really means starting a constrained Linux process. ENTRYPOINT sets the executable, while CMD supplies default arguments. Small runtime assumptions break fast once you deploy at cluster scale. Your image should be boring to start, stop, and inspect. See. Runtime defaults become production incidents when nobody tests them. Now watch.

startup path
┌────────────┐    ┌────────────┐    ┌────────────┐
│ entrypoint │ -> │ process    │ -> │ signals    │
└─────┬──────┘    └─────┬──────┘    └─────┬──────┘
      │                 │                 │
      v                 v                 v
  PID 1 | work loop | shutdown
Start and stop behavior must be explicit.
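How ENTRYPOINT and CMD combine can be sketched as a small function. It models only the exec (JSON array) form, where arguments passed at run time replace CMD but leave ENTRYPOINT in place. `resolve_argv` and the example commands are hypothetical.

```python
def resolve_argv(entrypoint, cmd, run_args=None):
    """Return the argv the container would start with (exec form only)."""
    # Run-time arguments replace CMD entirely; otherwise CMD supplies defaults.
    args = list(run_args) if run_args else list(cmd)
    return list(entrypoint) + args


entrypoint = ["python", "-m", "app"]
cmd = ["--port", "8080"]

print(resolve_argv(entrypoint, cmd))
# ['python', '-m', 'app', '--port', '8080']
print(resolve_argv(entrypoint, cmd, ["--port", "9090"]))
# ['python', '-m', 'app', '--port', '9090']
```

Keeping the executable in ENTRYPOINT and only tunable defaults in CMD means operators can change arguments without having to remember the full command.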

PID 1 gets no default signal handlers from the kernel, so an app without its own SIGTERM handler can simply ignore the signal. Test your shutdown code. Your app should exit cleanly on SIGTERM within the termination grace period (30 seconds by default), before Kubernetes sends SIGKILL. Time zones, certificates, and file paths should be explicit choices. Health endpoints should stay cheap, quick, and deterministic. Avoid writing important state inside the container's own writable filesystem. Prefer non-root users whenever the base image supports them. So what to do? Test stop behavior with a short termination window locally. Make startup logs obvious and structured for debugging. Fail fast when required config is missing or malformed. Keep startup commands boring enough for sleepy operators.
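A minimal sketch of explicit SIGTERM handling, assuming a Python worker running on Linux or another POSIX system. The handler only sets a flag, and the work loop checks that flag between units of work. `stop`, `handle_sigterm`, and `work_loop` are hypothetical names.

```python
import os
import signal
import threading

stop = threading.Event()


def handle_sigterm(signum, frame):
    """Record the stop request; the loop decides when it is safe to exit."""
    stop.set()


signal.signal(signal.SIGTERM, handle_sigterm)


def work_loop(max_ticks=100, tick_seconds=0.01):
    """Run until SIGTERM arrives or max_ticks is reached; return ticks done."""
    ticks = 0
    # Event.wait returns True as soon as the stop flag is set.
    while ticks < max_ticks and not stop.wait(tick_seconds):
        ticks += 1  # one unit of real work would go here
    return ticks


# Simulate the orchestrator's stop signal by signalling our own process.
os.kill(os.getpid(), signal.SIGTERM)
print("ticks completed before exit:", work_loop())
```

Because the loop returns instead of being killed mid-task, cleanup code after it (flushing buffers, closing connections) gets a chance to run inside the grace period.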

Image standards for platform teams¶

Platform teams should standardize image rules before cluster growth accelerates. Golden base images reduce repeated security and compliance work. CI should build once and promote the same artifact everywhere. Developers need fast feedback before bad images ever reach Kubernetes. See. Standardization saves engineering hours and reduces midnight surprises. Now watch.

promotion path
┌────────────┐    ┌────────────┐    ┌────────────┐
│ PR         │ -> │ scan       │ -> │ release    │
└─────┬──────┘    └─────┬──────┘    └─────┬──────┘
      │                 │                 │
      v                 v                 v
  lint build | policy check | signed image
Build once, then only promote.

Publish one base-image policy and update cadence per language stack. Measure image size, build time, and vulnerability count each week. Teach teams to separate build-time and run-time dependencies clearly. Store SBOM metadata with every promoted image for audits. Use admission rules to enforce minimum hygiene across teams. Prefer documented patterns over clever Dockerfiles nobody can debug. So what to do? One reference image per language stack is enough initially. Document escape hatches for truly special workloads. Review base-image owners and patch windows explicitly. Keep local developer flow close to the CI flow.
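The kind of hygiene rule an admission check enforces can be sketched as a plain function over image references. This is an illustration of the rule logic only, not a real admission controller. The registry `registry.example.com` and the function name `image_violations` are hypothetical, and the parsing covers only common reference shapes.

```python
APPROVED_REGISTRIES = {"registry.example.com"}  # hypothetical allow-list


def image_violations(reference, approved=APPROVED_REGISTRIES):
    """Return a list of policy violations for one image reference."""
    violations = []
    name, _, digest = reference.partition("@")

    # The registry host is the part before the first slash, if any.
    registry = name.split("/", 1)[0] if "/" in name else ""
    if registry not in approved:
        violations.append("unapproved registry")

    if not digest.startswith("sha256:"):
        violations.append("not pinned by digest")
        # The tag lives in the last path segment; no tag means "latest".
        last_segment = name.rsplit("/", 1)[-1]
        tag = last_segment.partition(":")[2] or "latest"
        if tag == "latest":
            violations.append("mutable latest tag")
    return violations


pinned = "registry.example.com/team/app@sha256:" + "a" * 64
print(image_violations(pinned))                                 # []
print(image_violations("registry.example.com/team/app:1.4.2"))  # ['not pinned by digest']
print(image_violations("nginx"))
# ['unapproved registry', 'not pinned by digest', 'mutable latest tag']
```

Running the same check in CI and at admission keeps the local developer flow close to what the cluster will actually accept.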

Where this lives in the wild¶

Flipkart seller services build language-specific base images and scan them in CI.
Swiggy backend teams promote one signed image across staging and production clusters.
ML batch jobs in Kubeflow pull large training images from regional private registries.
Platform groups at fintech companies enforce digest-only deployment for auditability.

Pause and recall¶

Why is an image not the same thing as a running container?
Why does layer order change both build speed and image size?
Why are tags weaker identities than image digests?
Which runtime details usually fail first under orchestration?

Interview Q&A¶

Q: Why do containers exist when VMs already isolate workloads?
A: Containers package application dependencies in a portable unit that starts fast and behaves consistently. VMs isolate more strongly, but they are heavier and slower for routine app packaging.
Common wrong answer to avoid: “Because containers are newer and therefore better.”

Q: Why are multi-stage builds so useful?
A: They keep compilers, test tools, and temporary artifacts out of production images. That shrinks attack surface and often cuts image pull time as well.
Common wrong answer to avoid: “They are only for making Dockerfiles look advanced.”

Q: Why should production deployments prefer digests over mutable tags?
A: Digests point to one exact artifact, so rollback and debugging stay trustworthy. Mutable tags can silently move underneath you and break audit trails.
Common wrong answer to avoid: “Tags are enough if the team is disciplined.”

Q: Why should containers avoid writing state into their own filesystem?
A: Pods can disappear anytime, so local writes vanish unless storage is attached. State belongs in external systems or explicit persistent volumes.
Common wrong answer to avoid: “Because containers are read-only by default.”

Apply now (5 min)¶

Take one service you know and sketch a tiny Dockerfile for it. Mark which lines should change rarely and which should change often. Circle one place where a secret might accidentally leak into layers. Now rewrite the Dockerfile as a multi-stage build on paper. Finally, decide one immutable tag and one registry promotion rule.

Bridge. Goods packed. Now the dock manager must schedule them onto ships. → 02-pods-services-ingress.md