03. Kafka Deep Dive — Partitions, offsets, groups, and the promises you actually get

~18 min read. Kafka sounds simple until ordering, commits, and duplicates arrive together.

Built on the ELI5 in 00-eli5.md. The bulletin board — split into sections with memory and reading positions — lets many readers move fast without losing shared facts.


Partition first, then think ordering

Kafka topics are split into partitions for parallelism. Order is guaranteed only inside one partition; across partitions, relative order is not guaranteed. So key choice is a business decision, not plumbing. Think of the bulletin board as several numbered columns. With the default partitioner, messages with the same key hash to the same partition, as long as the partition count stays fixed. That preserves per-entity ordering while scaling throughput. Bad keys, such as one key that most traffic shares, create hot partitions and slow consumers.

Diagram:

Topic Orders
┌─────────┬─────────┬─────────┐
│ P0      │ P1      │ P2      │
├─────────┼─────────┼─────────┤
│ o1,o4   │ o2,o5   │ o3,o6   │
└─────────┴─────────┴─────────┘

  • Use order_id when one order needs strict sequencing.
  • Do not use random keys when business ordering matters.
  • Watch partition skew during traffic spikes.

Worked example:

  1. Ecommerce emits all order events keyed by order_id.
  2. Order 981 always lands in the same partition.
  3. Created, Paid, and Shipped stay ordered there.
  4. Another order can process in parallel elsewhere.
  5. Throughput rises without breaking per-order sequence.
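
The worked example above can be sketched in a few lines. This is a minimal illustration, not Kafka's real partitioner: Kafka's default partitioner hashes the key bytes with murmur2, while `zlib.crc32` is used here only because it ships with Python. The names `partition_for`, `events`, and `order_981` are hypothetical.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3

def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition: hash the key bytes, then take the modulo.

    Stand-in hash: Kafka's default partitioner uses murmur2, not CRC32.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions

# Hypothetical events as (order_id, event_type), in the order they were produced.
events = [
    ("981", "Created"), ("512", "Created"), ("981", "Paid"),
    ("512", "Paid"), ("981", "Shipped"),
]

# Each partition is an append-only list, so arrival order is preserved inside it.
partitions = defaultdict(list)
for order_id, event_type in events:
    partitions[partition_for(order_id)].append((order_id, event_type))

# All events for order 981 sit in one partition, in production order.
order_981 = [e for (oid, e) in partitions[partition_for("981")] if oid == "981"]
print(order_981)  # ['Created', 'Paid', 'Shipped']
```

Because the partition is computed modulo the partition count, adding partitions later can move a key to a different partition, so per-key ordering assumptions are worth revisiting before resizing a topic.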

Kafka ordering is local, so your key defines the boundary.

Consumer groups share work with offsets

Within one group, each partition is read by exactly one consumer. Different groups may read the same partition independently. An offset is the reader position inside that partition. Kafka stores the records, and it tracks each group's committed offsets separately from them. That is why replay does not require republishing. You can reset offsets for backfills and bug fixes. Rebalances move partitions between consumers when membership changes. Those rebalances can pause progress briefly.

Diagram:

Orders Topic P0 P1 P2
Group Billing:   C1→P0  C2→P1  C3→P2
Group Analytics: C4→P0  C5→P1  C6→P2
Offsets:         120    118    121

  • More consumers than partitions gives some idle members.
  • Fewer consumers than partitions makes each member handle many partitions.
  • Offset lag is a key operational signal.
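
To make the lag signal concrete, here is a small sketch of the arithmetic. It assumes lag is measured per partition as the log end offset minus the group's committed offset. The committed offsets reuse the numbers from the diagram; the log end offsets and the name `consumer_lag` are invented for illustration.

```python
from typing import Dict

def consumer_lag(log_end: Dict[int, int], committed: Dict[int, int]) -> Dict[int, int]:
    """Lag per partition = log end offset minus the group's committed offset."""
    return {p: log_end[p] - committed[p] for p in log_end}

# Committed offsets for one group, taken from the diagram (P0, P1, P2).
billing_committed = {0: 120, 1: 118, 2: 121}

# Hypothetical log end offsets: P2 is receiving far more traffic than the rest.
log_end = {0: 125, 1: 118, 2: 400}

lag = consumer_lag(log_end, billing_committed)
worst_partition = max(lag, key=lag.get)

print(lag)              # {0: 5, 1: 0, 2: 279}
print(worst_partition)  # 2
```

A single partition with much higher lag than its siblings usually points at key skew or a slow consumer rather than a cluster-wide problem.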

Worked example:

  1. Billing group has three consumers for three partitions.
  2. One billing consumer crashes suddenly.
  3. Kafka reassigns its partition to another member.
  4. Processing resumes from the committed offset.
  5. Analytics group continues unaffected with its own offsets.
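
The crash-and-reassign sequence above can be simulated with a toy assignor. This is a sketch under simplifying assumptions: it redistributes every partition round-robin on each membership change, whereas Kafka's real assignors (range, round-robin, sticky, cooperative sticky) differ in detail. The `assign` function and the analytics offsets are hypothetical.

```python
from typing import Dict, List

def assign(partitions: List[int], consumers: List[str]) -> Dict[str, List[int]]:
    """Toy round-robin assignment: each partition goes to exactly one consumer."""
    assignment: Dict[str, List[int]] = {c: [] for c in consumers}
    for i, p in enumerate(sorted(partitions)):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

partitions = [0, 1, 2]

# Committed offsets are tracked per group, so each group has its own map.
committed = {
    "billing": {0: 120, 1: 118, 2: 121},
    "analytics": {0: 90, 1: 95, 2: 80},   # hypothetical numbers
}

before = assign(partitions, ["C1", "C2", "C3"])
# C2 crashes: the group rebalances over the surviving members.
after = assign(partitions, ["C1", "C3"])

# Whoever now owns P1 resumes from the billing group's committed offset.
new_owner = next(c for c, ps in after.items() if 1 in ps)
resume_offset = committed["billing"][1]

# A fourth consumer on three partitions would sit idle.
oversized = assign(partitions, ["C1", "C2", "C3", "C4"])

print(before)     # {'C1': [0], 'C2': [1], 'C3': [2]}
print(after)      # {'C1': [0, 2], 'C3': [1]}
print(new_owner, resume_offset)  # C3 118
print(oversized["C4"])           # []
```

Nothing in the analytics map changes during the billing rebalance, which is the point of keeping offsets per group.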

Offsets belong to consumer groups, not to the stored records.

Producer acknowledgments decide durability versus speed trade-offs

Producer acks decide when a send is considered successful. acks=0 is fastest but offers almost no safety, because the producer does not wait for any broker reply. acks=1 waits only for the partition leader to write the record. acks=all waits for all in-sync replicas and is safer. Safer acknowledgments usually increase latency slightly. Idempotent producers help avoid duplicates during retries. Batch size and linger also change throughput behavior. These knobs should match business risk, not ego.

Diagram:

Producer ─→ Leader
            ├→ Follower 1
            └→ Follower 2
acks=1  ▼ leader reply
acks=all▼ all ISR reply

  • Payments and ledgers usually prefer stronger acknowledgment settings.
  • Metrics pipelines may accept weaker settings for speed.
  • Tune retries with idempotence, not blind optimism.

Worked example:

  1. A payment ledger producer uses acks=all.
  2. Leader receives the record and replicates it.
  3. One follower is slow but still in sync.
  4. Producer waits before marking send successful.
  5. That extra wait buys stronger durability confidence.
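
A toy latency model shows why the slow follower in the example matters. This is only a sketch: the millisecond figures are invented, `ack_wait_ms` is a hypothetical helper, and real latency also depends on network, batching, and replica fetch timing.

```python
from typing import List

def ack_wait_ms(acks: str, leader_write_ms: float, isr_follower_ms: List[float]) -> float:
    """Toy model of how long a producer waits before a send counts as successful.

    acks="0":   do not wait for any broker reply.
    acks="1":   wait for the leader's local write only.
    acks="all": wait until every in-sync follower has also replicated the record.
    """
    if acks == "0":
        return 0.0
    if acks == "1":
        return leader_write_ms
    if acks == "all":
        slowest_follower = max(isr_follower_ms, default=0.0)
        return leader_write_ms + slowest_follower
    raise ValueError(f"unknown acks setting: {acks!r}")

# Hypothetical timings: one follower is slow but still in sync.
leader_ms = 2.0
followers_ms = [3.0, 15.0]

for setting in ("0", "1", "all"):
    print(setting, ack_wait_ms(setting, leader_ms, followers_ms))
# 0 0.0
# 1 2.0
# all 17.0
```

In the Java client the settings discussed in this section map to the producer properties `acks`, `enable.idempotence`, `batch.size`, and `linger.ms`.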

Acks are business risk settings hiding behind technical words.

Commit strategy shapes duplicates and gaps

Consumers usually poll, process, then commit offsets. Commit too early, and you may skip unprocessed records. Commit too late, and you may reprocess after crashes. Auto-commit is easy but often too blunt. Manual commits give control around real processing boundaries. Sync commit is safer but slower than async commit. Exactly-once semantics need careful producer and consumer coordination. They do not magically protect external side effects.

Diagram:

poll → process → write result → commit offset
       │ crash anywhere before the commit?
       └→ duplicate risk: records replay from the last committed offset
commit moved ahead of the write, then crash → gap risk
Best boundary: commit after durable result

  • At-least-once is common because duplicates are manageable.
  • Exactly-once in Kafka mainly covers Kafka-to-Kafka style flows.
  • Email, payments, and webhooks still need idempotent handlers.
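
The difference between committing early and committing late can be shown with a small simulation. It is a sketch with no Kafka client involved: `run` and `run_with_restart` are hypothetical helpers, records are plain strings, and the crash is injected between the two steps of one record.

```python
from typing import List, Optional, Tuple

def run(records: List[str], commit_first: bool,
        crash_at: Optional[int]) -> Tuple[List[str], int]:
    """Process records from index 0 until done or until a simulated crash.

    Returns (processed records, committed offset). The crash hits record
    `crash_at` between its two steps, whichever order they run in.
    """
    processed: List[str] = []
    committed = 0
    for offset, record in enumerate(records):
        if commit_first:
            committed = offset + 1            # commit before doing the work
            if offset == crash_at:
                return processed, committed   # crash: this record's work never ran
            processed.append(record)
        else:
            processed.append(record)          # do the work first
            if offset == crash_at:
                return processed, committed   # crash: work done, commit never ran
            committed = offset + 1
    return processed, committed

def run_with_restart(records: List[str], commit_first: bool, crash_at: int) -> List[str]:
    """Crash once, then resume from the committed offset and finish cleanly."""
    processed, committed = run(records, commit_first, crash_at)
    rest, _ = run(records[committed:], commit_first, crash_at=None)
    return processed + rest

records = ["a", "b", "c"]
early = run_with_restart(records, commit_first=True, crash_at=1)
late = run_with_restart(records, commit_first=False, crash_at=1)

print(early)  # ['a', 'c']            gap: "b" was committed but never processed
print(late)   # ['a', 'b', 'b', 'c']  duplicate: "b" was processed twice
```

Committing late trades gaps for duplicates, which is usually the better deal because a duplicate can be absorbed by an idempotent handler while a gap is silent.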

Worked example:

  1. Consumer reads OrderPaid at offset 400.
  2. It writes a projection row successfully.
  3. Then it commits offset 401, the position of the next record to read.
  4. If a crash happens before the commit, replay may repeat the work.
  5. Idempotent writes make that repetition safe.
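
One way to make the replay in this example harmless is to remember which records were already applied. The sketch below assumes an in-memory store and a dedupe key of (partition, offset); `OrderProjection` and its fields are hypothetical, and a real store would need to write the dedupe marker and the row in the same transaction.

```python
from typing import Dict, Set, Tuple

class OrderProjection:
    """In-memory stand-in for a projection table with a dedupe marker."""

    def __init__(self) -> None:
        self.rows: Dict[str, str] = {}           # order_id -> latest status
        self.paid_count = 0                      # would double-count without dedupe
        self.seen: Set[Tuple[int, int]] = set()  # (partition, offset) already applied

    def apply(self, partition: int, offset: int, order_id: str, status: str) -> bool:
        """Apply one record; return False if it was already applied."""
        marker = (partition, offset)
        if marker in self.seen:
            return False                         # replayed record: skip side effects
        self.seen.add(marker)
        self.rows[order_id] = status
        if status == "Paid":
            self.paid_count += 1
        return True

projection = OrderProjection()

# First delivery of OrderPaid at offset 400, then a crash before the commit.
first = projection.apply(partition=0, offset=400, order_id="981", status="Paid")
# After restart the same record is delivered again.
second = projection.apply(partition=0, offset=400, order_id="981", status="Paid")

print(first, second)          # True False
print(projection.paid_count)  # 1
```

The same idea applies to emails, payments, and webhooks: the handler needs a stable key per event so a repeat can be recognized and skipped.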

Commit after durable business work, not before wishful thinking.

Retention gives replay power, not infinite memory

Kafka keeps records according to time-based or size-based retention. Old data may expire even if some group never read it. Log compaction keeps at least the latest record per key, which helps snapshot-like topics such as account state. Retention must match replay, audit, and recovery needs. Infinite retention sounds nice until storage bills arrive. Short retention saves money but limits debugging and backfills. Treat retention as product policy, not an afterthought.

Diagram:

Time →
[0][1][2][3][4][5][6]
      └─────────────┘ retention window
expired records vanish; group offsets alone cannot restore them

  • Keep enough retention for delayed consumers and reprocessing.
  • Use compaction when latest keyed state matters more than every event.
  • Document replay expectations before incidents happen.

Worked example:

  1. Fraud team needs seven days of transaction event replay.
  2. Analytics asks for thirty days for experimentation.
  3. Platform sets tiered topics with different retention.
  4. Consumer teams learn the expiry contract clearly.
  5. Surprise backfill failures drop sharply.
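
Time-based retention and compaction can be contrasted with a small in-memory model. This is a simplified sketch: real Kafka deletes whole log segments rather than single records, and compaction leaves the active segment alone. `Record`, `apply_time_retention`, and `compact` are hypothetical names, and timestamps are whole days for readability.

```python
from typing import Dict, List, NamedTuple

class Record(NamedTuple):
    offset: int
    day: int      # day the record was written
    key: str
    value: str

def apply_time_retention(log: List[Record], now_day: int, retention_days: int) -> List[Record]:
    """Drop records older than the window, whether or not any group read them."""
    return [r for r in log if now_day - r.day < retention_days]

def compact(log: List[Record]) -> List[Record]:
    """Keep only the latest record per key, preserving offset order."""
    latest: Dict[str, Record] = {}
    for r in log:
        latest[r.key] = r          # later records overwrite earlier ones
    return sorted(latest.values(), key=lambda r: r.offset)

keys = ["acct-1", "acct-2", "acct-1", "acct-2", "acct-1", "acct-1", "acct-2"]
log = [Record(i, i, k, f"balance-v{i}") for i, k in enumerate(keys)]

# Seven-day retention evaluated on day 8: records from days 0 and 1 expire.
retained = apply_time_retention(log, now_day=8, retention_days=7)
retained_offsets = [r.offset for r in retained]

# A group whose committed offset is still 0 cannot get those records back.
committed = 0
lost = [r.offset for r in log if r.offset >= committed and r not in retained]

compacted_offsets = [r.offset for r in compact(log)]

print(retained_offsets)   # [2, 3, 4, 5, 6]
print(lost)               # [0, 1]
print(compacted_offsets)  # [5, 6]
```

When a committed offset points at expired data, where the consumer resumes is governed by its `auto.offset.reset` setting, which is one more reason to document the expiry contract for consumer teams.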

Retention is part of your interface, because replay depends on it.


Where this lives in the wild

Kafka topics often sit behind products where many consumers need the same ordered facts.

  • A LinkedIn infrastructure engineer keys activity events for stable partitioning. Feed, search, and analytics groups each track their own offsets.

  • An Uber payments platform engineer tunes acks and idempotent producers carefully. Ledger correctness matters more than shaving tiny latency.

  • A Netflix observability data engineer watches consumer lag during deployments. Rebalances and slow partitions show up before dashboards break.

  • A Swiggy marketplace backend engineer chooses delivery_id keys for event ordering. That preserves per-delivery state changes while scaling traffic.

  • A Confluent solutions architect explains retention and compaction trade-offs to customers. Replay expectations must match storage policy from day one.


Pause and recall

If Kafka still feels abstract, answer these before moving on.

  1. Why does Kafka guarantee order only within one partition?
  2. What exactly does an offset represent for a consumer group?
  3. Why does exactly-once still not protect an email side effect automatically?
  4. How does retention policy affect replay and debugging?

Say the answer aloud before reading ahead tomorrow.


Interview Q&A

Keep the answers operational, not just definitional.

Q: How do partitions affect ordering? A: Kafka preserves order only within each partition, so keying strategy defines the ordered unit. Common wrong answer to avoid: "Kafka guarantees global topic order."

Q: What is a consumer group? A: It is a set of consumers sharing partitions and offsets to scale one logical application. Common wrong answer to avoid: "Every consumer in a group reads every message."

Q: What does exactly-once semantics really mean in Kafka? A: It mainly coordinates Kafka reads and writes to avoid duplicates within supported transactional flows. Common wrong answer to avoid: "Exactly-once means no duplicate side effect anywhere."

Q: Why are retention settings important? A: They decide how long replay remains possible for bugs, audits, and new consumers. Common wrong answer to avoid: "Retention only matters for storage cost."

Keep answers crisp, then add trade-offs only when asked.


Apply now (5 min)

Take one event type, such as OrderPaid or TripCompleted. Choose a partition key and justify the ordering boundary. Write where the consumer should commit offset in its flow. State whether acks=1 or acks=all matches the business risk. Then state how many days of retention would help debugging.

Sketch from memory:

  • one topic with three partitions,
  • two consumer groups with separate offsets,
  • one processing path showing poll, work, and commit.


Bridge: 04-sqs-rabbitmq-pubsub.md