Message brokers are how loosely-coupled services exchange data without holding each other's network connections. They're the substrate that makes asynchronous architectures possible, and the source of half the operational complexity in distributed systems.
This lesson is the working vocabulary: queue vs. stream, push vs. pull, the major brokers, and when to use which.
Two Paradigms
Most message systems fall into one of two categories:
Queue (Job Queue)
Each message goes to one consumer
The producer pushes work; one of N workers picks it up and processes it. Once consumed, the message is gone. Used for task queues, background jobs, and work distribution.
Stream (Log)
Multiple consumers see all messages
The producer appends to a log; many consumers each track their own position, so the same data can be read multiple times independently. Used for event sourcing, analytics, and fan-out.
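The difference between the two paradigms can be sketched in a few lines of in-memory Python. This is an illustration only, not how any real broker is implemented; the class names WorkQueue, Log, and LogReader are hypothetical and chosen for this example.

```python
from collections import deque


class WorkQueue:
    """Queue semantics: a message goes to exactly one consumer, then it is gone."""

    def __init__(self):
        self._messages = deque()

    def publish(self, message):
        self._messages.append(message)

    def take(self):
        # Removes and returns the oldest message, or None if the queue is empty.
        return self._messages.popleft() if self._messages else None


class Log:
    """Stream semantics: messages are appended and retained after being read."""

    def __init__(self):
        self._entries = []

    def append(self, message):
        self._entries.append(message)

    def read_from(self, position, max_messages):
        return self._entries[position:position + max_messages]


class LogReader:
    """Each consumer tracks its own position in the shared log."""

    def __init__(self, log):
        self._log = log
        self.position = 0

    def poll(self, max_messages=10):
        batch = self._log.read_from(self.position, max_messages)
        self.position += len(batch)
        return batch


queue, log = WorkQueue(), Log()
for event in ["signup", "upgrade", "cancel"]:
    queue.publish(event)
    log.append(event)

# Queue: two workers each receive a different message.
worker_a = queue.take()
worker_b = queue.take()

# Stream: two independent readers each see the same messages.
analytics, audit = LogReader(log), LogReader(log)
analytics_batch = analytics.poll(max_messages=2)
audit_batch = audit.poll()
```

Here the two workers split the queue's messages between them, while the analytics and audit readers each get their own full pass over the log at their own pace.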
Pull-based (Kafka, Pulsar): the consumer controls the pace, which makes back-pressure easy, and the broker keeps little or no delivery state for that consumer. The cost is slight latency overhead from polling.
Modern systems mostly use pull-based models for the back-pressure benefits.
Delivery Semantics
What guarantees does the broker provide about message delivery?
At most once
Messages may be lost on failure. Simple, but rare in production.
At least once
Default for most systems. Messages can be duplicated. Consumers must be idempotent.
Exactly once
Theoretically perfect. Hard to implement; often unnecessary. We have a whole lesson on this.
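Under at-least-once delivery, the usual way to make a consumer idempotent is to deduplicate on a message ID. The sketch below assumes every message carries a unique ID supplied by the producer. The IdempotentHandler class is hypothetical, and the in-memory set stands in for what would need to be a durable store in production.

```python
class IdempotentHandler:
    """Wraps a handler so that redelivered messages are applied only once."""

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()  # assumption: a real system would persist this

    def handle(self, message_id, payload):
        if message_id in self._seen:
            return False  # duplicate delivery: skip the side effect
        self._handler(payload)
        # Recording the ID after the side effect is still at-least-once if the
        # process crashes between the two steps; a real system would make the
        # side effect and the ID record atomic (e.g. one database transaction).
        self._seen.add(message_id)
        return True


balance = {"total": 0}


def apply_charge(payload):
    balance["total"] += payload["amount"]


consumer = IdempotentHandler(apply_charge)

# The broker delivers message "m-1" twice (for example, after a lost ack).
first = consumer.handle("m-1", {"amount": 25})
second = consumer.handle("m-1", {"amount": 25})
third = consumer.handle("m-2", {"amount": 10})
```

The duplicate delivery of "m-1" is ignored, so the charge is applied once even though the message arrived twice.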
Ordering
When does message order matter?
No ordering
Maximum throughput
Messages can be processed in parallel. Workers pull from a queue freely. Right when each message is independent.
Ordered by partition
Practical compromise
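Kafka partitions: order is strict within a partition, and every message with the same key (e.g., user ID) lands in the same partition, so you get ordering per key and parallelism across keys. The right answer for most workloads where order matters.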
Global ordering is expensive and usually unnecessary. Most "we need ordering" requirements are actually "we need ordering per entity", which partition keys give you.
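Per-entity ordering through partition keys comes down to hashing the key to pick a partition. The sketch below is a simplified illustration: partition_for is a hypothetical helper, and CRC32 is used only as a convenient deterministic hash, not as the hash any particular broker client uses.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # assumption for the example


def partition_for(key, num_partitions):
    # Deterministic: the same key always maps to the same partition.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


events = [
    ("user-1", "login"),
    ("user-2", "login"),
    ("user-1", "add_to_cart"),
    ("user-2", "logout"),
    ("user-1", "checkout"),
]

# Each partition is an append-only list, so order within it is preserved.
partitions = defaultdict(list)
for key, action in events:
    partitions[partition_for(key, NUM_PARTITIONS)].append((key, action))


def actions_for(user):
    # All of a user's events live in one partition, in publish order.
    partition = partitions[partition_for(user, NUM_PARTITIONS)]
    return [action for key, action in partition if key == user]
```

Different partitions can be consumed in parallel by different workers, while each user's events are still seen in the order they were published.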
Common Patterns
Work queue
Many workers pull from one queue. Each message processed once. Used for background jobs, image processing, email sending.
Pub/sub fan-out
One publish, many subscribers. Each subscriber sees all messages. Used for notifications, cache invalidation, analytics.
Request/reply
Async RPC. Producer sends a message with a reply queue; consumer responds to that queue. RabbitMQ is good for this; Kafka is awkward.
Routing
Messages routed to different queues based on content. RabbitMQ exchange + binding rules. Useful for complex business workflows.
Dead-letter queue
Messages that repeatedly fail processing are moved here so they don't block the main queue. Required for production-grade handling.
Delay queues
"Process this message in 5 minutes." SQS supports this natively (delays of up to 15 minutes); Kafka requires patterns or extensions.
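The dead-letter pattern from the list above can be sketched as a retry counter plus a side list for messages that exhaust their attempts. This is a minimal in-memory illustration. The drain function and the max_attempts parameter are hypothetical, and managed brokers typically offer this as configuration rather than application code.

```python
from collections import deque


def drain(main_queue, handler, max_attempts=3):
    """Process (message, attempts) pairs; return messages that were dead-lettered."""
    dead_letters = []
    while main_queue:
        message, attempts = main_queue.popleft()
        try:
            handler(message)
        except Exception as exc:
            attempts += 1
            if attempts >= max_attempts:
                # Give up: park the message with the failure reason for inspection.
                dead_letters.append((message, repr(exc)))
            else:
                # Requeue at the back so other messages are not blocked.
                main_queue.append((message, attempts))
    return dead_letters


processed = []
calls = {"bad": 0}


def handler(message):
    if message == "bad":
        calls["bad"] += 1
        raise ValueError("cannot parse")
    processed.append(message)


main_queue = deque([("ok-1", 0), ("bad", 0), ("ok-2", 0)])
dead_letters = drain(main_queue, handler)
```

The poison message is retried three times and then parked, while the healthy messages around it are processed without waiting on it.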
Choosing for SaaS
Use case | Default choice
Background jobs / task queue | RabbitMQ or SQS
Event streaming, analytics, audit log | Kafka or Pulsar
Cross-region pub/sub | Google Pub/Sub or Pulsar
AWS-native, simple | SQS + SNS
GCP-native, simple | Pub/Sub
Both queue and stream needed | Pulsar (or Kafka with workarounds)
Hosted, low operational overhead | SQS / Pub/Sub / Confluent Cloud
For most SaaS, the practical answer is:
SQS / Cloud Pub/Sub for simple queue / pub-sub work.
Kafka (managed) when streaming is core to the architecture.
Both when you have both kinds of workloads; they coexist fine.
Recap
Two paradigms: queues distribute work; streams broadcast events.