02. Queues vs Streams: One worker takes the job, or many readers follow the flow

~15 min read. Both move messages, but they teach your system very different habits.

Built on the ELI5 in 00-eli5.md. The bulletin board — a shared posting place for messages — now splits into task piles and replayable logs.

Queue means hand one job to one worker

A queue is like a task pile near a service counter. One message is usually consumed by one worker. When worker A takes it, worker B should not. This is perfect for background jobs and work distribution. Image resizing, invoice PDFs, and email sending fit nicely. You care about completion more than long memory. Ordering can matter, but worker throughput matters more. RabbitMQ feels natural for this mental model.

Diagram:

Producer ─→ Queue ─→ Worker A
                 ├→ Worker B
                 └→ Worker C
One job → one worker
No replay for everyone by default

Queues shine when work must be shared, not broadcast.
Ack means the worker finished, so the broker may remove the message.
Backlog size tells you how far workers are behind.
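
The three points above can be sketched with a tiny in-memory queue. This is a toy, not a real broker client: `ToyQueue`, `get`, `ack`, `nack`, and `backlog` are hypothetical names chosen for illustration, and a real broker adds persistence, timeouts, and networking.

```python
from collections import deque


class ToyQueue:
    """In-memory stand-in for a broker queue (hypothetical, not a real client API)."""

    def __init__(self):
        self._ready = deque()   # messages waiting for a worker
        self._unacked = {}      # delivery tag -> message handed out but not yet acked
        self._next_tag = 0

    def publish(self, message):
        self._ready.append(message)

    def get(self):
        """Hand the oldest ready message to a worker and remember it as unacked."""
        if not self._ready:
            return None
        self._next_tag += 1
        message = self._ready.popleft()
        self._unacked[self._next_tag] = message
        return self._next_tag, message

    def ack(self, tag):
        """Worker finished: the broker may now forget the message."""
        del self._unacked[tag]

    def nack(self, tag):
        """Worker failed: put the message back at the front for a retry."""
        self._ready.appendleft(self._unacked.pop(tag))

    def backlog(self):
        """Number of messages still waiting for a worker."""
        return len(self._ready)


q = ToyQueue()
for name in ["a.png", "b.png", "c.png"]:
    q.publish(name)

tag, job = q.get()       # a worker takes "a.png"
q.nack(tag)              # the worker failed, so the job goes back
tag, job = q.get()       # "a.png" is delivered again
q.ack(tag)               # finished, removed for good
print(job, q.backlog())  # a.png 2
```

The message leaves the queue only at `ack`, which is why a crashed worker does not lose the job.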

Worked example:
1. User uploads product images on a marketplace.
2. App puts ResizeImage jobs into a queue.
3. Any available worker takes one job.
4. After completion, the worker acks the job.
5. Another worker never needs that same job.
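
A minimal sketch of the worked example, assuming Python's standard `queue` and `threading` modules stand in for the broker and the workers. The job names and worker names are made up for illustration; `task_done()` plays the role of a local ack.

```python
import queue
import threading

jobs = queue.Queue()
for image_id in range(6):
    jobs.put(f"ResizeImage:{image_id}")

done = []                     # (worker, job) pairs, appended as jobs complete
done_lock = threading.Lock()


def worker(name):
    while True:
        try:
            job = jobs.get_nowait()   # only one worker receives each job
        except queue.Empty:
            return                    # nothing left to do
        with done_lock:
            done.append((name, job))
        jobs.task_done()              # local analogue of an ack


threads = [threading.Thread(target=worker, args=(f"worker-{c}",)) for c in "ABC"]
for t in threads:
    t.start()
for t in threads:
    t.join()

processed = sorted(job for _, job in done)
print(len(processed), len(set(processed)))  # 6 6
```

Which worker gets which job varies between runs, but every job is handled exactly once.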

Queue thinking is about fair work distribution and controlled retries.

Stream means append facts and let many readers follow

A stream keeps an ordered log of published facts. Many consumer groups can read the same record independently. No single consumer steals the record from others. You track reading position instead of deleting immediately. Think of the bulletin board as a long numbered tape. This helps analytics, fraud detection, and read models. Kafka feels natural for this model. New readers can start later and still replay.

Diagram:

Producer ─→ Stream [0][1][2][3][4]
                    ├→ Group A reads 0→4
                    ├→ Group B reads 0→4
                    └→ Group C starts later at 2

Streams optimize for durable ordered history.
Consumer groups hold their own reading position.
Replay becomes a product feature, not a disaster hack.
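
The diagram and points above can be sketched as an append-only list plus one read position per group. `ToyStream`, `subscribe`, `poll`, and `seek` are hypothetical names for this illustration, not the API of any real streaming client.

```python
class ToyStream:
    """Append-only log with one read position per consumer group (illustrative only)."""

    def __init__(self):
        self._log = []
        self._offsets = {}   # group name -> next offset to read

    def append(self, record):
        self._log.append(record)
        return len(self._log) - 1          # offset of the new record

    def subscribe(self, group, start=0):
        self._offsets[group] = start

    def poll(self, group):
        """Return every unread record for this group and advance its position."""
        start = self._offsets[group]
        records = self._log[start:]
        self._offsets[group] = len(self._log)
        return records

    def seek(self, group, offset):
        """Move a group's position, for example back to 0 to replay."""
        self._offsets[group] = offset


stream = ToyStream()
for n in range(5):
    stream.append(f"event-{n}")

stream.subscribe("A")
stream.subscribe("B")
stream.subscribe("C", start=2)   # Group C starts later at 2

a = stream.poll("A")
b = stream.poll("B")
c = stream.poll("C")
print(len(a), len(b), c[0])      # 5 5 event-2

stream.seek("A", 0)              # replay: reading never deleted anything
print(stream.poll("A") == a)     # True
```

Reading only moves a group's own offset, so one group never takes a record away from another.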

Worked example:
1. Checkout emits OrderPlaced into a stream.
2. Billing group reads it for invoicing.
3. Search group reads it for denormalized views.
4. Analytics group reads it for funnel metrics.
5. All groups can reprocess the same fact later.

Stream thinking is about durable facts and independent readers.

RabbitMQ and Kafka solve different default problems

RabbitMQ starts with routing and delivery to consumers now. Kafka starts with durable append-only storage and later reading. RabbitMQ queues often remove messages after acknowledgment. Kafka usually keeps records for the configured retention period, whether or not anyone has read them. RabbitMQ exchanges route by routing key, topic pattern, or headers. Kafka routes mostly by topic name and partition key. RabbitMQ feels like active dispatching. Kafka feels like shared event memory.

Diagram:

RabbitMQ: Producer → Exchange → Queue → Consumer
Kafka:    Producer → Topic Partitions → Consumer Groups
RabbitMQ: delivery-first
Kafka:    log-first
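
The "Topic Partitions" step can be sketched as a hash of the partition key. This is a toy under stated assumptions: `partition_for` and `NUM_PARTITIONS` are hypothetical, and `zlib.crc32` is only a stand-in, since real clients use their own hash function. The point is that equal keys land on the same partition, so their relative order is kept.

```python
import zlib

NUM_PARTITIONS = 3   # hypothetical partition count


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition; equal keys always land on the same partition."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


partitions = [[] for _ in range(NUM_PARTITIONS)]
events = [("acct-1", "debit 50"), ("acct-2", "credit 20"), ("acct-1", "credit 10")]
for key, value in events:
    partitions[partition_for(key)].append((key, value))

# Both acct-1 events sit in one partition, in the order they were produced.
acct1 = [v for p in partitions for k, v in p if k == "acct-1"]
print(acct1)   # ['debit 50', 'credit 10']
```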

Choose RabbitMQ when routing flexibility and task handoff dominate.
Choose Kafka when replay, multiple groups, and event history dominate.
Both can move messages well, but defaults shape team behavior.
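
Deletion timing is the default that differs most. A minimal sketch of the log side, assuming a hypothetical `RETENTION_SECONDS` window and a made-up `expire` helper: records leave because they are old, not because someone read them.

```python
RETENTION_SECONDS = 60   # hypothetical retention window


def expire(log, now):
    """Drop records older than the retention window; reads play no part."""
    return [(ts, rec) for ts, rec in log if now - ts <= RETENTION_SECONDS]


# (append time in seconds, record)
log = [(0, "tx-1"), (30, "tx-2"), (45, "tx-3")]

read_by_audit = [rec for _, rec in log]   # a group reads everything...
still_there = expire(log, now=50)         # ...and nothing is removed at t=50
later = expire(log, now=80)               # at t=80 only tx-1 has aged out

print(len(still_there), [rec for _, rec in later])  # 3 ['tx-2', 'tx-3']
```

A queue would instead drop each message as soon as its worker acknowledged it.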

Worked example:
1. A fintech team routes OTP, email, and webhook jobs differently.
2. RabbitMQ exchanges express those paths cleanly.
3. The same company stores transaction events for audits.
4. Kafka keeps that ordered history available for many readers.
5. One company may use both for different problems.
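
The routing half of this example can be sketched as a toy direct exchange: a message goes to every queue bound with a matching key. `ToyDirectExchange`, the routing keys, and the message texts are assumptions for illustration, not a real RabbitMQ client API.

```python
from collections import defaultdict


class ToyDirectExchange:
    """Routes a message to every queue bound with a matching key (in-memory toy)."""

    def __init__(self):
        self._bindings = defaultdict(list)   # routing key -> bound queues

    def bind(self, target_queue, routing_key):
        self._bindings[routing_key].append(target_queue)

    def publish(self, routing_key, message):
        targets = self._bindings.get(routing_key, [])
        for target_queue in targets:
            target_queue.append(message)
        return len(targets)   # 0 means no queue was bound for this key


otp_jobs, email_jobs, webhook_jobs = [], [], []
exchange = ToyDirectExchange()
exchange.bind(otp_jobs, "otp")
exchange.bind(email_jobs, "email")
exchange.bind(webhook_jobs, "webhook")

exchange.publish("otp", "send code to user 42")
exchange.publish("email", "send receipt to user 42")
exchange.publish("webhook", "notify merchant 7")
unrouted = exchange.publish("sms", "no queue is bound for this key")

print(len(otp_jobs), len(email_jobs), len(webhook_jobs), unrouted)  # 1 1 1 0
```

The producer only names a key; the bindings decide which worker pool sees the job.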

Brand names matter less than the failure model you need.

Choose by question, not by fashion

Ask whether the message is work or shared fact. Ask whether many teams need the same data later. Ask whether replay is normal or exceptional. Ask whether routing rules are rich and dynamic. Ask whether storage cost or broker operations matter more. Ask whether consumers must be online right now. These questions beat vendor debates in meetings. They also simplify interviews nicely.

Diagram:

Need one worker?      ─→ Queue
Need many readers?    ─→ Stream
Need rich routing?    ─→ RabbitMQ side
Need replayable facts?─→ Kafka side

A task queue can feed workers for CPU-heavy jobs.
A stream can feed analytics and stateful downstream projections.
Mixing them deliberately is better than forcing one tool everywhere.

Worked example:
1. A food delivery app creates PDF invoices nightly.
2. That belongs on a queue for worker distribution.
3. The same app emits DeliveryCompleted for reporting.
4. That belongs on a stream for many consumers.
5. One architecture can use both patterns happily.

Start with usage shape, then map to technology calmly.
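
The questions in this section can be written down as a small rule of thumb. This is a sketch, not a decision procedure: `MessageShape` and `suggest` are hypothetical names, and real choices also weigh storage cost and broker operations.

```python
from dataclasses import dataclass


@dataclass
class MessageShape:
    one_worker_per_message: bool   # is it a job that one worker should finish?
    many_readers: bool             # do several teams need the same data?
    replay_is_normal: bool         # will readers reprocess history routinely?
    rich_routing: bool             # are routing rules rich and dynamic?


def suggest(shape: MessageShape) -> str:
    """Rule-of-thumb mapping from usage shape to a pattern."""
    wants_stream = shape.many_readers or shape.replay_is_normal
    wants_queue = shape.one_worker_per_message or shape.rich_routing
    if wants_stream and wants_queue:
        return "both: queue for the job, stream for the fact"
    if wants_stream:
        return "stream (Kafka side)"
    if wants_queue:
        return "queue (RabbitMQ side)"
    return "unclear: revisit the questions"


nightly_invoice = MessageShape(True, False, False, False)
delivery_completed = MessageShape(False, True, True, False)
print(suggest(nightly_invoice))      # queue (RabbitMQ side)
print(suggest(delivery_completed))   # stream (Kafka side)
```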

Where this lives in the wild

These differences show up clearly when companies split job execution from event analytics.

An Amazon fulfillment SDE uses queues for pick-pack-print work packets. Workers should finish each packet once, not replay them forever.
A Netflix data platform engineer relies on streams for playback and quality events. Many downstream groups consume the same flow independently.
A Razorpay messaging engineer may route notification jobs through RabbitMQ. The same organisation can keep ledger events in Kafka.
A Swiggy analytics engineer reads delivery lifecycle streams for ETAs and dashboards. Dispatch workers still consume operational tasks from queue-like systems.
A LinkedIn feed infrastructure engineer leans on log semantics for activity events. Replay and many readers matter more than one-time worker pickup.

Pause and recall

If you can explain this without tool names, you truly understand it.

Why does a queue usually give one job to one worker?
Why can two consumer groups read the same stream independently?
What product need makes replay worth the extra complexity?
When would RabbitMQ feel more natural than Kafka?

Say the answer aloud before reading ahead tomorrow.

Interview Q&A

Interviewers love mental models that stay crisp under pressure.

Q: What is the core difference between a queue and a stream?
A: A queue distributes work to consumers, while a stream preserves ordered records for many readers.
Common wrong answer to avoid: "A stream is just a faster queue."

Q: Why is Kafka compared with a log?
A: Because records stay for retention, and consumers track positions independently.
Common wrong answer to avoid: "Kafka deletes each message after one consumer reads it."

Q: When is RabbitMQ a better fit?
A: When routing patterns, request handoff, and worker coordination matter most.
Common wrong answer to avoid: "RabbitMQ is only for legacy systems."

Q: Can the same system use both queues and streams?
A: Yes. Use queues for jobs and streams for durable shared facts.
Common wrong answer to avoid: "You must standardize on one broker everywhere."

Keep answers crisp, then add trade-offs only when asked.

Apply now (5 min)

Pick one existing async flow from your work or study notes. Separate messages into job handoff versus shared fact. For each message, write whether replay would help tomorrow. Now map one path to RabbitMQ-like behavior and another to Kafka-like behavior. State the operational win from each mapping clearly.

Sketch from memory:
- one queue with three workers,
- one stream with two consumer groups,
- one sentence explaining why deletion timing differs.

Bridge. → 03-kafka-deep-dive.md