03. Encryption, TLS, and KMS

⏱️ Estimated time: 34 min | Level: intermediate

ELI5 callback: In the apartment building, the front door checks identity, the elevator key limits movement, the wall keeps tenants apart, the audit records rule-following, and the safe protects valuables.

1) What encryption is protecting

Encryption protects confidentiality when storage, transit, or a machine boundary becomes untrusted.

See. It does not replace authorization, logging, or tenant isolation. It complements them.

So what to do? Separate three questions: data in transit, data at rest, and key custody.

In design interviews, say exactly who should not read the data and at which point.

Now watch. The hardest mistakes are usually about keys, not algorithms.

  • The front door reminds you that trusted identity still matters before data is served.
  • The elevator key reminds you that approved access must still be scoped carefully.
  • The wall reminds you that ciphertext alone does not create tenant separation.
  • The audit reminds you that key use and data access need evidence.
  • The safe reminds you that encryption is useful only when the key is controlled well.

  • At rest means disks, backups, snapshots, and object storage contents.
  • In transit means network paths between browser, API, service, and database.
  • In use is the harder zone because plaintext exists in memory while work happens.
  • Name the attacker model before choosing the control stack.

2) TLS 1.3 is your transit baseline

TLS protects data moving over networks by encrypting the channel and authenticating the endpoint.

TLS 1.3 removes older weak options, such as static RSA key exchange and CBC-mode cipher suites, and cuts the full handshake from two round trips to one. Good. Use it.

Simple, no? Certificates prove the server you reached is the one you intended to reach.

Without proper certificate validation, encryption becomes costume jewellery: you may be encrypting straight to an attacker sitting in the middle.

Mutual TLS adds client identity too, which is useful for service-to-service paths.
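A minimal sketch of both ideas using Python's standard `ssl` module: a client context that keeps certificate and hostname validation on and refuses anything below TLS 1.3, and a server context that demands a client certificate (mutual TLS). The function names and the file-path parameters are hypothetical, chosen only for illustration.

```python
import ssl
from typing import Optional


def make_client_context() -> ssl.SSLContext:
    """Client side: verify the server certificate and hostname, TLS 1.3 only."""
    # create_default_context() turns on certificate and hostname checks.
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx


def make_mtls_server_context(
    cert_file: Optional[str] = None,       # hypothetical path to the server certificate
    key_file: Optional[str] = None,        # hypothetical path to the server private key
    client_ca_file: Optional[str] = None,  # hypothetical CA bundle that signs client certs
) -> ssl.SSLContext:
    """Server side: require a valid client certificate (mutual TLS)."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    # With CERT_REQUIRED the handshake fails if the client presents no valid cert.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if cert_file and key_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if client_ca_file:
        ctx.load_verify_locations(cafile=client_ca_file)
    return ctx
```

The point to notice is what is not in the code: nothing disables verification. The classic mistake is setting `verify_mode` to `ssl.CERT_NONE` to silence an error during testing and then shipping it.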

  • Terminate TLS only where you trust the next hop and can explain that trust.
  • Re-encrypt internal hops when the network path is shared or loosely controlled.
  • Rotate certificates automatically because manual expiry handling always fails eventually.
  • Pinning can help in special clients, but operational mistakes can hurt availability.
  • Watch for mixed-content and downgrade gaps in browser-facing systems.
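Certificate rotation is easier to automate when expiry is something code can check. A small sketch, assuming the expiry timestamp comes in the `notAfter` format that `ssl.SSLSocket.getpeercert()` reports; the 30-day threshold and the function names are assumptions, not a recommendation from this lesson.

```python
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before a certificate's notAfter timestamp.

    `not_after` looks like "Jan  5 09:34:43 2018 GMT".
    `now` is seconds since the epoch; it defaults to the current time.
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / 86400


def needs_renewal(not_after: str, threshold_days: float = 30,
                  now: Optional[float] = None) -> bool:
    """True when the certificate is inside the renewal window (or already expired)."""
    return days_until_expiry(not_after, now) <= threshold_days
```

Passing `now` explicitly makes the check testable: you can simulate a date near expiry and confirm the alert fires.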

┌─────────┐  ClientHello   ┌────────────┐
│ Client  │───────────────▶│   Server   │
└─────────┘◀───────────────└────────────┘
             cert + key share
             Finished + traffic keys
             encrypted application data flows after that

3) Envelope encryption and KMS

Envelope encryption uses a data key for the payload and a master key to protect that data key.

This is practical because bulk data encryption stays fast while master keys stay tightly controlled.

Now watch. KMS gives managed key creation, policy, rotation, and usage logging.

Your app asks KMS to encrypt or decrypt small keys, not entire multi-gigabyte files.

That boundary keeps performance sane and security ownership clearer.
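The flow above can be sketched end to end. Everything here is an assumption made so the example runs on the standard library alone: `FakeKms` is a hypothetical in-memory stand-in for a real KMS, and `toy_seal` / `toy_open` are a stand-in cipher built from HMAC-SHA256, not AES. In real code, use an authenticated cipher such as AES-GCM from a vetted library and a real KMS client.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass
from typing import Dict, Tuple


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Stand-in keystream (HMAC-SHA256 in counter mode). NOT AES.
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return bytes(out[:length])


def _subkeys(key: bytes) -> Tuple[bytes, bytes]:
    # Derive separate encryption and MAC keys from one key.
    return (hmac.new(key, b"enc", hashlib.sha256).digest(),
            hmac.new(key, b"mac", hashlib.sha256).digest())


def toy_seal(key: bytes, plaintext: bytes) -> bytes:
    enc_key, mac_key = _subkeys(key)
    nonce = secrets.token_bytes(16)
    body = bytes(p ^ k for p, k in zip(plaintext, _keystream(enc_key, nonce, len(plaintext))))
    tag = hmac.new(mac_key, nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def toy_open(key: bytes, sealed: bytes) -> bytes:
    enc_key, mac_key = _subkeys(key)
    nonce, body, tag = sealed[:16], sealed[16:-32], sealed[-32:]
    expected = hmac.new(mac_key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("ciphertext failed authentication")
    return bytes(c ^ k for c, k in zip(body, _keystream(enc_key, nonce, len(body))))


class FakeKms:
    """Hypothetical in-memory KMS. Master keys never leave this object."""

    def __init__(self) -> None:
        self._master_keys: Dict[str, bytes] = {}

    def create_key(self, key_id: str) -> None:
        self._master_keys[key_id] = secrets.token_bytes(32)

    def generate_data_key(self, key_id: str) -> Tuple[bytes, bytes]:
        # Returns (plaintext data key, data key wrapped under the master key).
        data_key = secrets.token_bytes(32)
        return data_key, toy_seal(self._master_keys[key_id], data_key)

    def decrypt(self, key_id: str, wrapped_key: bytes) -> bytes:
        return toy_open(self._master_keys[key_id], wrapped_key)


@dataclass
class Envelope:
    key_id: str         # which master key wrapped the data key
    wrapped_key: bytes  # encrypted data key, stored beside the ciphertext
    ciphertext: bytes


def encrypt_object(kms: FakeKms, key_id: str, plaintext: bytes) -> Envelope:
    data_key, wrapped_key = kms.generate_data_key(key_id)  # fresh data key per object
    return Envelope(key_id, wrapped_key, toy_seal(data_key, plaintext))


def decrypt_object(kms: FakeKms, envelope: Envelope) -> bytes:
    # Only the small wrapped key goes to the KMS; the payload is opened locally.
    data_key = kms.decrypt(envelope.key_id, envelope.wrapped_key)
    return toy_open(data_key, envelope.ciphertext)
```

Notice what crosses the KMS boundary: a 32-byte key, never the payload. The `Envelope` carries the key ID and the wrapped key, so whoever holds the ciphertext still needs decrypt permission on the master key to read it.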

  • Generate a fresh data key per object, file, or record set when sensitivity justifies it.
  • Store the encrypted data key beside the ciphertext so decryption remains traceable.
  • Use customer-managed keys when compliance, separation, or revocation needs are stricter.
  • Limit which services may call decrypt because that permission is highly sensitive.
  • Cache decrypted data keys carefully and for short windows only.
  • Record key identifiers in metadata so re-encryption jobs can run safely later.
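The caching advice in the list above can be made concrete with a small time-bounded cache. This is a sketch under assumptions: the class name, the 60-second default, and the `unwrap` callback (standing in for a KMS decrypt call) are all hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class DataKeyCache:
    """Keeps decrypted data keys in memory for a short window only."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # wrapped key -> (expiry time, plaintext data key)
        self._entries: Dict[bytes, Tuple[float, bytes]] = {}

    def get(self, wrapped_key: bytes, unwrap: Callable[[bytes], bytes]) -> bytes:
        now = self._clock()
        entry = self._entries.get(wrapped_key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no KMS call needed
        plaintext_key = unwrap(wrapped_key)  # for example, a KMS decrypt call
        self._entries[wrapped_key] = (now + self._ttl, plaintext_key)
        return plaintext_key

    def purge_expired(self) -> None:
        """Drop expired keys so plaintext key material does not linger in memory."""
        now = self._clock()
        self._entries = {k: v for k, v in self._entries.items() if v[0] > now}
```

The trade-off is direct: a longer window means fewer KMS calls but a longer period in which a memory dump exposes usable keys, and a longer delay before a revoked decrypt permission actually bites.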

  • AES handles symmetric data encryption efficiently.
  • KMS or HSM-backed systems protect root key material and access policy.
  • Good systems separate encryption operations from business logic as much as possible.

4) Key lifecycle, rotation, and access control

A key management plan is incomplete without creation, rotation, disablement, deletion, and emergency response.

See. Encryption without a key lifecycle is just hopeful decoration.

Different data classes may need different keys, retention windows, and deletion handling.

So what to do? Classify data before you classify keys.

Every decrypt permission should feel expensive because it effectively unlocks confidentiality.
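One way to make decrypt "feel expensive" is to grant it separately from encrypt and record every attempt. A minimal sketch, assuming a hypothetical in-process policy table; real KMS products express this through their own policy languages, and the service names used in the usage note are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class KeyPolicy:
    # Encrypt and decrypt grants are listed separately on purpose.
    encrypt: Set[str] = field(default_factory=set)
    decrypt: Set[str] = field(default_factory=set)


@dataclass
class KeyAccessGuard:
    policies: Dict[str, KeyPolicy]
    # Each entry: (principal, action, key_id, allowed)
    audit_log: List[Tuple[str, str, str, bool]] = field(default_factory=list)

    def authorize(self, principal: str, action: str, key_id: str) -> bool:
        if action not in ("encrypt", "decrypt"):
            raise ValueError(f"unknown action: {action}")
        policy = self.policies.get(key_id)
        if policy is None:
            allowed = False  # unknown key: deny by default
        else:
            granted = policy.encrypt if action == "encrypt" else policy.decrypt
            allowed = principal in granted
        # Denied attempts are logged too; they are often the interesting ones.
        self.audit_log.append((principal, action, key_id, allowed))
        return allowed
```

For example, an upload service can be given `encrypt` on an invoices key while only the billing service holds `decrypt`. The upload path then cannot read back what it wrote, and the audit log shows who tried.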

  • Rotate keys by policy, but also prepare ad-hoc rotation for compromise events.
  • Scope key usage to environment, tenant class, or data domain where possible.
  • Separate key administrators from application operators when compliance demands it.
  • Test restore paths because backups are useless if decryption metadata is missing.
  • Plan re-encryption jobs so they do not overload storage and queues unexpectedly.
  • Make deletion semantics explicit because destroyed keys can make recovery impossible.
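Planning a re-encryption job is mostly a metadata question: which objects still name an old key, and how many do you touch at once. A sketch, assuming each record's metadata is a dictionary with hypothetical `object_id` and `key_id` fields.

```python
from typing import Dict, Iterable, Iterator, List


def plan_reencryption(records: Iterable[Dict[str, str]],
                      current_key_id: str,
                      batch_size: int) -> Iterator[List[str]]:
    """Yield batches of object IDs whose metadata names a key other than the current one."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[str] = []
    for record in records:
        if record["key_id"] != current_key_id:
            batch.append(record["object_id"])
            if len(batch) == batch_size:
                yield batch
                batch = []
    if batch:
        yield batch  # final partial batch
```

Because the function is a generator, a worker can pull one batch, re-encrypt it, pause, and pull the next, which keeps the load on storage and queues bounded. It only works if the key ID was recorded in metadata at write time.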

5) Common mistakes and practical checks

Teams often say everything is encrypted, then discover plaintext in logs, queues, exports, or memory dumps.

See. Encryption claims should always include scope and exclusions.

Hardcoding keys, reusing one key everywhere, and skipping certificate validation are classic own goals.

Now watch. Secrets management and encryption overlap, but they are not identical disciplines.

The shortest answer in an interview is: encrypt the data, protect the keys, and limit decrypt paths.

  • Ask who can read backups, snapshots, and replica storage.
  • Check whether internal service traffic is still plaintext after the ingress layer.
  • Review whether logs, analytics events, and dead-letter queues leak sensitive fields.
  • Make sure key IDs, not raw keys, appear in configs and dashboards.
  • Test certificate expiry alerts before production learns the date first.
  • Verify that rotation does not break older ciphertext unexpectedly.
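Log leaks are the easiest of these checks to automate. A rough sketch of a redaction filter for Python's standard `logging` module, assuming logs use a simple `field=value` style; the field names in `SENSITIVE_FIELDS` are hypothetical and the pattern is deliberately simplistic.

```python
import logging
import re

# Hypothetical field names; replace with the sensitive fields of your own system.
SENSITIVE_FIELDS = ("card_number", "ssn", "password")
_PATTERN = re.compile(r"\b(%s)=(\S+)" % "|".join(SENSITIVE_FIELDS))


def redact(message: str) -> str:
    """Replace the value of any sensitive field=value pair with a marker."""
    return _PATTERN.sub(r"\1=[REDACTED]", message)


class RedactingFilter(logging.Filter):
    """Scrubs sensitive values from a record before any handler formats it."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format first so values passed as %-style args are scrubbed as well.
        record.msg = redact(record.getMessage())
        record.args = ()
        return True  # keep the record, now redacted
```

Attach it with `handler.addFilter(RedactingFilter())`. A filter like this is a safety net, not a guarantee: structured payloads, exception traces, and dead-letter queues need their own review.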

Saying encrypted everywhere is meaningless without scope.

Trust boundary changes should trigger key and TLS reviews.

Not every data class deserves the same key strategy.

Certificate automation is reliability work and security work together.

Envelope encryption works because it separates speed from key custody.

Key metadata is part of the system design, not optional garnish.

Backup recovery must include key recovery assumptions or deliberate key destruction plans.

Transit protection fails if one internal hop quietly downgrades to plaintext.

The decrypt permission is often more important than the encrypt permission.

Encryption boundaries should be drawn on architecture diagrams, not just compliance slides.

Where this lives in the wild

  • HTTPS termination for public APIs and web applications.
  • Object storage systems encrypting files, backups, and tenant exports.
  • Database platforms using managed encryption keys for tables and snapshots.
  • Service meshes securing east-west traffic with mutual TLS.
  • Payment, healthcare, and enterprise systems with strict key access controls.

Pause and recall

  • What is the difference between encryption at rest and in transit?
  • Why does envelope encryption use two layers of keys?
  • Why is decrypt permission so sensitive?
  • What can go wrong during key rotation if metadata is poor?

Interview Q&A

Q: Why is TLS 1.3 preferred today?
A: It removes many older weak options, shortens the handshake, and gives a cleaner secure baseline.
Common wrong answer to avoid: "TLS is just about performance tuning."

Q: What problem does envelope encryption solve?
A: It lets systems encrypt large data efficiently while keeping master key operations centralized and tightly controlled.
Common wrong answer to avoid: "It is only used because cloud vendors like extra complexity."

Q: When should you use customer-managed keys?
A: Use them when compliance, separation of duties, revocation control, or tenant-specific policies require stronger ownership.
Common wrong answer to avoid: "Always use the vendor default key because encryption is encryption."

Q: What is a common encryption mistake in distributed systems?
A: Forgetting plaintext side channels such as logs, queues, snapshots, or unencrypted internal hops.
Common wrong answer to avoid: "If the database is encrypted, the system is done."

Apply now (5 min)

Pick one sensitive data path in your product, like file upload or invoice storage.

Mark where the data is in transit, at rest, and briefly in plaintext for processing.

Then write which key protects it and who may call decrypt.

If you cannot answer that clearly, your design is not ready.

Bridge. Data encrypted. But multi-tenant systems share infrastructure. → 04