04. SQS, RabbitMQ, and Google Pub/Sub: managed queue, flexible broker, or cloud fan-out pipe
~16 min read. Tool choice gets easier when you compare delivery models, not logos.
Built on the ELI5 in 00-eli5.md. The bulletin board — the shared place messages wait or get replayed — now becomes three cloud-flavoured operational choices.
SQS gives durable queue basics with simple knobs
Amazon SQS is a managed queue with low operator burden. Producers send messages; workers poll for them, process them, and then delete them. While a worker holds a message, the visibility timeout hides it from other consumers. If the worker finishes, it deletes the message. If the worker crashes or runs past the timeout, the message becomes visible again and another worker can pick it up. That redelivery is why SQS gives at-least-once delivery by default. The bulletin board is managed for you, not hand-tuned. Choose Standard or FIFO depending on your ordering and deduplication needs.
Diagram:
Producer ─→ SQS Queue ─→ Worker
                │
                ├→ visibility timeout hides message
                └→ no delete? message returns later
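The diagram can be played out with a toy in-memory queue. This is a minimal sketch, not the SQS API: `ToyQueue`, its methods, and the explicit `now` argument are all hypothetical names chosen for illustration, and the receipt handle is simplified to a fixed integer id.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    body: str
    visible_at: float = 0.0  # time at which the message can be received again


@dataclass
class ToyQueue:
    """In-memory stand-in for a queue with a visibility timeout (not the SQS API)."""
    visibility_timeout: float = 30.0
    messages: dict = field(default_factory=dict)  # handle -> Message
    next_id: int = 0

    def send(self, body: str) -> None:
        self.messages[self.next_id] = Message(body)
        self.next_id += 1

    def receive(self, now: float):
        """Return (handle, body) for the first visible message, or None."""
        for handle, msg in self.messages.items():
            if msg.visible_at <= now:
                msg.visible_at = now + self.visibility_timeout  # hide it
                return handle, msg.body
        return None

    def delete(self, handle: int) -> None:
        self.messages.pop(handle, None)


queue = ToyQueue(visibility_timeout=30.0)
queue.send("GeneratePDF:42")

first = queue.receive(now=0.0)    # worker A takes the message, then crashes
hidden = queue.receive(now=10.0)  # worker B sees nothing while it is hidden
retry = queue.receive(now=31.0)   # timeout passed without a delete: it is back
queue.delete(retry[0])            # worker B succeeds and deletes
empty = queue.receive(now=100.0)  # nothing left to deliver
```

Passing `now` explicitly keeps the sketch deterministic; a real worker would rely on the service's clock instead.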
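- Standard queues scale widely but may deliver duplicates and reorder messages.
- FIFO queues preserve order within a message group and deduplicate sends within a five-minute window, at lower throughput than Standard.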
- DLQs catch repeatedly failing messages for inspection.
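The DLQ idea can be sketched in a few lines: count how often a job has been received, and park it once it crosses a threshold. This is an illustrative in-memory loop, similar in spirit to the `maxReceiveCount` setting of an SQS redrive policy; `drain`, `MAX_RECEIVES`, and the job names are hypothetical.

```python
from collections import deque

MAX_RECEIVES = 3  # hypothetical threshold, playing the role of a max receive count


def drain(jobs, handler, max_receives=MAX_RECEIVES):
    """Process jobs; move any job that fails max_receives times to a DLQ list."""
    pending = deque((job, 0) for job in jobs)  # (job, receives so far)
    done, dead_letter = [], []
    while pending:
        job, receives = pending.popleft()
        receives += 1
        try:
            handler(job)
            done.append(job)
        except Exception:
            if receives >= max_receives:
                dead_letter.append(job)          # park it for inspection
            else:
                pending.append((job, receives))  # it will be retried later
    return done, dead_letter


attempts = []


def render(job):
    attempts.append(job)
    if job == "corrupt-report":
        raise ValueError("cannot render")


done, dead_letter = drain(["report-1", "corrupt-report", "report-2"], render)
```

The poison job is tried three times and then set aside, so it stops blocking healthy jobs while staying available for debugging.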
Worked example:
1. A report generation API pushes GeneratePDF jobs to SQS.
2. Lambda or ECS worker receives one job.
3. Visibility timeout starts while processing runs.
4. Success triggers DeleteMessage.
5. Failure lets the job reappear for retry.
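The worker side of this example might look like the sketch below. `process_batch` assumes an object shaped like a boto3 SQS client (`receive_message` and `delete_message` with these keyword arguments); the timeout values, `FakeSqs`, `render_pdf`, and the queue URL are assumptions made so the sketch runs without AWS.

```python
def process_batch(sqs, queue_url, handle_job):
    """Receive a batch, run handle_job on each message, delete only on success.

    `sqs` is expected to behave like a boto3 SQS client.
    Returns the number of messages deleted.
    """
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=20,     # long polling
        VisibilityTimeout=120,  # assumed to exceed worst-case processing time
    )
    deleted = 0
    for message in response.get("Messages", []):
        try:
            handle_job(message["Body"])
        except Exception:
            # No delete: the message reappears after the visibility timeout.
            continue
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
        deleted += 1
    return deleted


class FakeSqs:
    """Hypothetical in-memory stand-in so the sketch runs without AWS."""

    def __init__(self, bodies):
        self.inbox = [{"Body": b, "ReceiptHandle": f"rh-{i}"} for i, b in enumerate(bodies)]
        self.deleted = []

    def receive_message(self, **kwargs):
        return {"Messages": list(self.inbox)} if self.inbox else {}

    def delete_message(self, QueueUrl, ReceiptHandle):
        self.deleted.append(ReceiptHandle)
        self.inbox = [m for m in self.inbox if m["ReceiptHandle"] != ReceiptHandle]


def render_pdf(body):
    if body == "bad-job":
        raise RuntimeError("render failed")


fake = FakeSqs(["job-1", "bad-job", "job-2"])
deleted_count = process_batch(fake, "https://example.invalid/queue", render_pdf)
```

Deleting only after the handler returns is what keeps a crash from losing the job; the price is that a slow or failed delete can cause a duplicate.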
SQS is excellent when you want durable queueing without broker babysitting.
RabbitMQ shines when routing rules really matter
RabbitMQ is a broker built from exchanges, queues, and bindings. Producers publish to exchanges, not directly to queues. Each exchange then routes messages to its bound queues according to its type: direct, topic, fanout, or headers. This makes delivery patterns very expressive. Consumers acknowledge after processing, much like queue workers deleting a message. Unacked messages can be redelivered after a consumer fails. RabbitMQ works well for task queues and selective fan-out. It asks for more broker understanding than SQS.
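To make the exchange and binding idea concrete, here is a minimal in-memory sketch of direct and fanout routing. It is not a RabbitMQ client: `ToyExchange` and the queue names are hypothetical, and topic and headers exchanges are left out.

```python
from collections import defaultdict


class ToyExchange:
    """In-memory sketch of direct and fanout routing (not a RabbitMQ client)."""

    def __init__(self, kind):
        if kind not in ("direct", "fanout"):
            raise ValueError("this sketch only models direct and fanout")
        self.kind = kind
        self.bindings = []               # (queue name, binding key)
        self.queues = defaultdict(list)  # queue name -> delivered bodies

    def bind(self, queue, binding_key=""):
        self.bindings.append((queue, binding_key))

    def publish(self, routing_key, body):
        for queue, binding_key in self.bindings:
            # fanout ignores keys; direct needs an exact key match
            if self.kind == "fanout" or binding_key == routing_key:
                self.queues[queue].append(body)


direct = ToyExchange("direct")
direct.bind("email_queue", "email")
direct.bind("sms_queue", "sms")
direct.publish("email", "welcome mail")    # reaches email_queue only

fanout = ToyExchange("fanout")
fanout.bind("audit_queue")
fanout.bind("metrics_queue")
fanout.publish("ignored", "user signed up")  # reaches every bound queue
```

The producer never names a queue. Changing who receives a message means changing bindings, not producer code.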
Diagram:
Producer ─→ Exchange
            ├→ binding ─→ Queue A ─→ Consumer A
            └→ binding ─→ Queue B ─→ Consumer B
- Topic exchanges help route messages by patterns.
- Dead lettering can move poison messages safely aside.
- Operational tuning matters for queues, memory, and clustering.
Worked example:
1. A fintech app publishes notification tasks to an exchange.
2. The email queue binds email.* keys only.
3. The SMS queue binds sms.* keys only.
4. The webhook queue binds webhook.* keys only.
5. Each consumer scales independently.
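The routing in this example can be sketched with an AMQP-style topic matcher, where `*` stands for exactly one dot-separated word and `#` for zero or more. This is an illustrative reimplementation, assuming the three queues bind single-word wildcards such as `email.*`; `topic_matches`, `route`, and the queue names are hypothetical.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """AMQP-style topic match: '*' is exactly one word, '#' is zero or more."""

    def match(p, k):
        if not p:
            return not k
        head, rest = p[0], p[1:]
        if head == "#":
            # '#' may consume zero words, or one word and stay active
            return match(rest, k) or (bool(k) and match(p, k[1:]))
        if not k:
            return False
        return (head == "*" or head == k[0]) and match(rest, k[1:])

    return match(pattern.split("."), routing_key.split("."))


BINDINGS = {
    "email_queue": "email.*",
    "sms_queue": "sms.*",
    "webhook_queue": "webhook.*",
}


def route(routing_key):
    """Return the queues whose binding pattern matches the routing key."""
    return [q for q, pattern in BINDINGS.items() if topic_matches(pattern, routing_key)]


routed = route("sms.otp")
unrouted = route("push.alert")
```

A key that matches no binding is routed nowhere, which is worth deciding on explicitly in a real design.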
RabbitMQ rewards teams who need rich routing more than raw replay.
Google Pub/Sub feels like cloud fan-out with managed scaling
Google Pub/Sub uses topics and subscriptions as its core objects. You publish once, and every subscription receives its own copy independently. Subscribers acknowledge each message after processing; unacknowledged messages are redelivered once the ack deadline passes. Extending the ack deadline covers longer processing times. Push and pull subscriptions support different consumer styles. Retention and replay exist (a subscription can seek back to a timestamp or snapshot), but what can be replayed depends on the retention settings. It fits event fan-out across managed cloud services nicely. Operator burden stays much lower than with self-managed brokers.
Diagram:
Producer ─→ Topic
            ├→ Subscription A ─→ Service A
            ├→ Subscription B ─→ Service B
            └→ Subscription C ─→ Service C
- Great for multi-subscriber event distribution in GCP ecosystems.
- Delivery is at-least-once by default, so subscribers must tolerate duplicates.
- Ack deadlines and retry policies need thoughtful tuning.
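Why ack deadlines need tuning is easier to see with a toy lease. The sketch below is a simplified model, not the Pub/Sub client API: `Lease` and its methods are hypothetical, and time is passed in explicitly. It assumes an extension resets the deadline relative to the moment of the request.

```python
class Lease:
    """Sketch of an ack deadline that a subscriber can extend while working."""

    def __init__(self, now, deadline_seconds):
        self.expires_at = now + deadline_seconds
        self.acked = False

    def extend(self, now, extra_seconds):
        if now >= self.expires_at:
            return False  # too late: the message is already eligible for redelivery
        self.expires_at = now + extra_seconds
        return True

    def ack(self, now):
        # In this model a late ack does not count, so a redelivery can still happen.
        self.acked = now < self.expires_at
        return self.acked

    def redeliverable(self, now):
        return not self.acked and now >= self.expires_at


lease = Lease(now=0, deadline_seconds=10)
extended = lease.extend(now=8, extra_seconds=30)  # deadline moves to t=38
still_hidden = not lease.redeliverable(now=20)
acked = lease.ack(now=25)

late = Lease(now=0, deadline_seconds=10)
late_extend = late.extend(now=12, extra_seconds=30)  # missed the deadline
late_redeliver = late.redeliverable(now=12)
```

A deadline that is too short produces duplicates under load; one that is too long delays recovery after a crash.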
Worked example:
1. A retail app publishes InventoryChanged to Pub/Sub.
2. Search subscription updates product availability.
3. Pricing subscription recomputes campaign views.
4. Analytics subscription tracks stockout trends.
5. One publish feeds many independent services.
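The retail example can be modelled with a toy topic in which every subscription keeps its own backlog and its own acks. This is an in-memory sketch, not the Pub/Sub client library; `ToyTopic`, its methods, and the message body are hypothetical.

```python
class ToyTopic:
    """In-memory sketch: each subscription gets its own copy and its own acks."""

    def __init__(self):
        self.subscriptions = {}  # name -> {message id: body} awaiting ack
        self.next_id = 0

    def subscribe(self, name):
        self.subscriptions[name] = {}

    def publish(self, body):
        message_id = self.next_id
        self.next_id += 1
        for backlog in self.subscriptions.values():
            backlog[message_id] = body  # one publish, one copy per subscription
        return message_id

    def pull(self, name):
        return list(self.subscriptions[name].items())

    def ack(self, name, message_id):
        self.subscriptions[name].pop(message_id, None)


topic = ToyTopic()
for name in ("search", "pricing", "analytics"):
    topic.subscribe(name)

message_id = topic.publish("InventoryChanged:sku-123")
topic.ack("search", message_id)  # search is done; the others still hold their copy
```

Because acks are tracked per subscription, a slow analytics consumer never holds back search or pricing.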
Pub/Sub works best when managed fan-out and cloud integration dominate.
Compare by burden, routing, and failure handling
SQS is simplest when you mostly need queue semantics. RabbitMQ is strongest when routing patterns drive the architecture. Google Pub/Sub is attractive inside event-heavy GCP systems. Visibility timeout is central to SQS failure behavior; ack plus redelivery plays the same role in RabbitMQ and Pub/Sub. All three support dead-lettering, but the mechanics differ: SQS uses a redrive policy, RabbitMQ uses dead-letter exchanges, and Pub/Sub uses dead-letter topics. Cost, latency, and lock-in also shape decisions. So compare defaults before comparing marketing pages.
Diagram:
Need simple managed queue? ─→ SQS
Need rich routing control? ─→ RabbitMQ
Need managed cloud fan-out? ─→ Pub/Sub
Need careful duplicate handling? ─→ all three
- SQS minimizes broker operations.
- RabbitMQ maximizes routing flexibility.
- Pub/Sub maximizes managed fan-out convenience in GCP.
Worked example:
1. A startup runs all workloads on AWS and needs job queues.
2. SQS plus Lambda keeps operations simple.
3. Another team needs topic routing by business domains.
4. RabbitMQ exchanges express that routing cleanly.
5. A GCP-native analytics platform may choose Pub/Sub instead.
The best broker is the one whose failure model your team can run.
Where this lives in the wild
Real teams often choose these tools because cloud context and operator time matter.
- An Amazon backend engineer often pairs SQS with Lambda for asynchronous jobs. Visibility timeout and DLQ settings become daily operational levers.
- A fintech platform engineer using RabbitMQ may separate OTP, webhook, and email routes. Exchange bindings keep message paths explicit and reviewable.
- A Google Cloud data engineer uses Pub/Sub for event fan-out into Dataflow and other services. Managed scaling reduces time spent patching brokers.
- A Swiggy backend engineer on AWS may use SQS for restaurant ingestion retries. Simple queue semantics beat custom broker maintenance there.
- A media platform SRE may keep RabbitMQ for operational workflows that need selective routing. Replay depth matters less than precise delivery paths.
Pause and recall
Now check whether you can explain the tools without vendor fanboy energy.
- What does visibility timeout protect in SQS?
- Why do exchanges make RabbitMQ feel different from SQS?
- How does Pub/Sub let one publish reach many consumers?
- When would operator burden decide the tool choice strongly?
Say the answer aloud before reading ahead tomorrow.
Interview Q&A
Short operational comparisons usually score better than vague feature lists.
Q: What is visibility timeout in SQS?
A: It temporarily hides a received message so one worker can finish before others retry it.
Common wrong answer to avoid: "It deletes the message automatically after receipt."
Q: Why might a team choose RabbitMQ over SQS?
A: Because exchanges and bindings support richer routing and broker-level patterns.
Common wrong answer to avoid: "RabbitMQ is always more scalable, so always choose it."
Q: What is the core Pub/Sub mental model?
A: Publish to a topic once, then subscriptions receive independently and acknowledge separately.
Common wrong answer to avoid: "Pub/Sub behaves exactly like one shared queue."
Q: Do all three need idempotent consumers?
A: Yes. At-least-once delivery and redelivery risk make duplicates possible everywhere.
Common wrong answer to avoid: "Managed services remove the need for idempotency."
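The idempotency answer can be backed by a small sketch: remember which message ids were processed and skip repeats. This is a minimal in-memory illustration, assuming each message carries a stable id; `IdempotentConsumer` and the sample ids are hypothetical, and a real system would need a durable store for the seen set.

```python
class IdempotentConsumer:
    """Skips messages whose id was already processed (in-memory sketch)."""

    def __init__(self, handler):
        self.handler = handler
        self.seen = set()  # a real system would use a durable store

    def handle(self, message_id, body):
        if message_id in self.seen:
            return False           # duplicate delivery: do nothing
        self.handler(body)
        self.seen.add(message_id)  # record only after the handler succeeds
        return True


charged = []
consumer = IdempotentConsumer(charged.append)
results = [
    consumer.handle("m-1", "charge order 1"),
    consumer.handle("m-1", "charge order 1"),  # redelivered duplicate
    consumer.handle("m-2", "charge order 2"),
]
```

The same guard works unchanged whether the duplicate came from an SQS visibility timeout, a RabbitMQ redelivery, or a missed Pub/Sub ack deadline.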
Keep answers crisp, then add trade-offs only when asked.
Apply now (5 min)
List one workload needing simple job execution and one needing fan-out. Map the first to SQS, RabbitMQ, or Pub/Sub and give one reason. Map the second the same way, with a reason tied to the delivery model. Write one failure story involving a timeout, an ack, or a redelivery. Then say where a DLQ would sit in the design.
Sketch from memory:
- one SQS worker flow with visibility timeout,
- one RabbitMQ exchange routing to two queues,
- one Pub/Sub topic with three subscriptions.
Bridge. → 05-retries-dlq-idempotency.md