Outbox Pattern
The outbox pattern solves one of the most common problems in distributed systems: how do you atomically write to a database and publish a message to a broker?
The naive answer (“just do both”) creates a class of subtle bugs that show up in production at the worst times. The outbox pattern is the clean, well-understood solution — and it’s worth implementing correctly the first time.
The Dual-Write Problem
A typical “write state and emit event” looks like:
def place_order(order_data):
    db.insert("orders", order_data)            # Step 1: write state
    broker.publish("OrderPlaced", order_data)  # Step 2: emit event
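Looks fine. But consider the failure modes:
- Step 1 succeeds, then the process crashes or the broker is unreachable before Step 2 completes: the order exists but no event is ever published.
- Step 2 times out after the broker has already accepted the message: a retry publishes a duplicate, and giving up leaves the caller unsure whether the event went out.
- The steps are swapped to publish first, and the database write then fails: consumers see an event for an order that does not exist.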
These are not theoretical. Production systems hit each of these regularly.
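One such failure, where the database write succeeds and the publish does not, can be reproduced with in-memory stand-ins. This is only a sketch: `FakeDB`, `FlakyBroker`, and `place_order_naive` are hypothetical names invented for illustration, not part of any real library.

```python
class FakeDB:
    """In-memory stand-in for a database table (hypothetical)."""

    def __init__(self):
        self.orders = []

    def insert(self, row):
        self.orders.append(row)


class FlakyBroker:
    """Stand-in broker whose publish always fails, simulating an outage."""

    def __init__(self):
        self.published = []

    def publish(self, event_type, payload):
        raise ConnectionError("broker unavailable")


def place_order_naive(db, broker, order_data):
    db.insert(order_data)                      # Step 1 is already durable
    broker.publish("OrderPlaced", order_data)  # Step 2 raises


db, broker = FakeDB(), FlakyBroker()
try:
    place_order_naive(db, broker, {"id": 42, "total": 99.99})
except ConnectionError:
    pass  # the caller sees an error, but the order row stays written

# The order exists, yet no event was ever published.
print(len(db.orders), len(broker.published))  # prints: 1 0
```

Catching the exception and deleting the order does not close the gap either, because the process can crash before the compensating delete runs.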
The Outbox Pattern
Instead of writing to two systems, write to one — the database — and treat the database as the source of truth for both state changes and events.
BEGIN;
-- The state change
INSERT INTO orders (id, customer_id, total, status)
VALUES (42, 'c-100', 99.99, 'placed');
-- The event, in the same transaction
INSERT INTO outbox (event_type, aggregate_id, payload)
VALUES ('OrderPlaced', 42, '{"id": 42, "customer_id": "c-100", ...}');
COMMIT;
Both writes are atomic — either both succeed or neither does. The order and the event are now guaranteed to be consistent.
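The same transaction can be sketched from application code. The example below uses Python's built-in `sqlite3` module as a stand-in for the production database, so the column types are simplified; the `place_order` helper and its `event_type` parameter are assumptions made for illustration. Passing `event_type=None` forces the outbox insert to fail, which shows that the order insert is rolled back with it.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY, customer_id TEXT, total REAL, status TEXT
    );
    CREATE TABLE outbox (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        event_type TEXT NOT NULL,
        aggregate_id TEXT NOT NULL,
        payload TEXT NOT NULL,
        published_at TEXT
    );
""")


def place_order(conn, order, event_type="OrderPlaced"):
    # "with conn" commits on success and rolls back if an exception escapes.
    with conn:
        conn.execute(
            "INSERT INTO orders (id, customer_id, total, status) "
            "VALUES (?, ?, ?, 'placed')",
            (order["id"], order["customer_id"], order["total"]),
        )
        conn.execute(
            "INSERT INTO outbox (event_type, aggregate_id, payload) "
            "VALUES (?, ?, ?)",
            (event_type, str(order["id"]), json.dumps(order)),
        )


place_order(conn, {"id": 42, "customer_id": "c-100", "total": 99.99})

# Simulate a failure on the second write: NULL event_type violates NOT NULL.
try:
    place_order(conn, {"id": 43, "customer_id": "c-200", "total": 5.0},
                event_type=None)
except sqlite3.IntegrityError:
    pass

orders_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
outbox_count = conn.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]
print(orders_count, outbox_count)  # prints: 1 1
```

Order 43 never becomes visible, because its outbox row could not be written.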
A separate process — the relay or publisher — reads from the outbox and publishes to the broker:
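import time

def relay():
    while True:
        events = db.query("SELECT * FROM outbox WHERE published_at IS NULL ORDER BY id LIMIT 100")
        for event in events:
            # Publish first, then mark; a crash in between causes a re-publish.
            broker.publish(event.event_type, event.payload)
            db.update("UPDATE outbox SET published_at = NOW() WHERE id = ?", event.id)
        if not events:
            time.sleep(1)  # back off when idle instead of hammering the database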
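Now there’s no atomicity problem: the state change and the event commit in one DB transaction, and the relay works only from committed rows. If the relay crashes after publishing an event but before marking it published, it republishes that event on restart. No event is lost, but some may be sent twice. That is at-least-once delivery.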
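The crash-and-restart behaviour can be sketched as a single relay pass that is run twice. SQLite stands in for the real database here, and `RecordingBroker`, `relay_once`, and the `crash_before_mark` flag are hypothetical names used only to simulate a crash between publishing and marking.

```python
import sqlite3


class RecordingBroker:
    """Hypothetical broker stand-in that records every publish call."""

    def __init__(self):
        self.published = []

    def publish(self, event_type, payload):
        self.published.append((event_type, payload))


def relay_once(conn, broker, crash_before_mark=False):
    """Publish one batch of unpublished outbox rows, oldest first."""
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox "
        "WHERE published_at IS NULL ORDER BY id LIMIT 100"
    ).fetchall()
    for row_id, event_type, payload in rows:
        broker.publish(event_type, payload)
        if crash_before_mark:
            raise RuntimeError("simulated crash after publish, before mark")
        with conn:
            conn.execute(
                "UPDATE outbox SET published_at = datetime('now') WHERE id = ?",
                (row_id,),
            )
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "event_type TEXT NOT NULL, payload TEXT NOT NULL, published_at TEXT)"
)
with conn:
    conn.execute(
        "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
        ("OrderPlaced", '{"id": 42}'),
    )

broker = RecordingBroker()
try:
    relay_once(conn, broker, crash_before_mark=True)  # publishes, then "crashes"
except RuntimeError:
    pass
relay_once(conn, broker)  # restart: the row is still unpublished, so it is sent again

remaining = conn.execute(
    "SELECT COUNT(*) FROM outbox WHERE published_at IS NULL"
).fetchone()[0]
print(len(broker.published), remaining)  # prints: 2 0
```

The event reaches the broker twice and the outbox ends up fully drained, which is why downstream consumers have to tolerate duplicates.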
Implementation Options
Polling-based relay
Simple. Periodically polls the outbox table for unpublished rows.
CDC-based (Change Data Capture)
Use database CDC (Postgres logical replication, MySQL binlog, MongoDB change streams), typically through a tool such as Debezium, to stream inserts into the outbox table directly to Kafka.
[ Postgres outbox table ] → [ Debezium ] → [ Kafka ]
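For high-throughput or latency-sensitive systems, CDC is usually the better fit: it reads the database log instead of repeatedly querying the table, at the cost of more operational work. For modest scale, polling is fine.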
At-Least-Once Delivery
The outbox pattern gives you at-least-once delivery. Failures during publishing can cause duplicates. Consumers must be idempotent.
def handle_order_placed(event):
    if has_processed(event.id):
        return  # already handled, skip
    create_fulfillment_record(event)
    mark_processed(event.id)
Combined with idempotent consumers (covered in Exactly-Once Semantics), the outbox pattern delivers effectively exactly-once processing: a message may arrive more than once, but its effect is applied once.
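A runnable sketch of the same idea, with in-memory stand-ins for the durable stores, shows one event delivered three times and applied once. The names `processed_ids` and `fulfillment_records`, and the event shape, are assumptions for illustration; a real consumer would persist both in its own database.

```python
processed_ids = set()      # stand-in for a durable "processed events" store
fulfillment_records = []   # stand-in for the consumer's own state


def handle_order_placed(event):
    """Apply the event unless its id was seen before; return True if applied."""
    if event["id"] in processed_ids:
        return False  # duplicate delivery, skip
    fulfillment_records.append({"order_id": event["order_id"]})
    processed_ids.add(event["id"])
    return True


event = {"id": "evt-1", "order_id": 42}
results = [handle_order_placed(event) for _ in range(3)]  # delivered three times
print(results, len(fulfillment_records))  # prints: [True, False, False] 1
```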
Outbox Cleanup
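The outbox table grows without bound unless old rows are removed. Manage it:
- Delete rows once they have been published and a retention window has passed, for example from a scheduled job.
- On large tables, consider partitioning by time so that old partitions can be dropped cheaply.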
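A scheduled purge of already-published rows is one common approach. The sketch below uses SQLite as a stand-in database; `purge_published` and the 7-day default retention are assumptions chosen for illustration, not recommended values.

```python
import sqlite3


def purge_published(conn, retention_days=7):
    """Delete rows published more than retention_days ago; return the count."""
    with conn:
        cur = conn.execute(
            "DELETE FROM outbox "
            "WHERE published_at IS NOT NULL "
            "AND published_at < datetime('now', ?)",
            (f"-{retention_days} days",),
        )
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE outbox (id INTEGER PRIMARY KEY, event_type TEXT, published_at TEXT)"
)
with conn:
    conn.executemany(
        "INSERT INTO outbox (event_type, published_at) VALUES (?, ?)",
        [
            ("OrderPlaced", "2000-01-01 00:00:00"),  # published long ago
            ("OrderPlaced", None),                   # not yet published
        ],
    )
    conn.execute(
        "INSERT INTO outbox (event_type, published_at) "
        "VALUES ('OrderPlaced', datetime('now'))"    # published just now
    )

deleted = purge_published(conn)
left = conn.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]
print(deleted, left)  # prints: 1 2
```

Unpublished rows are never touched, so a purge that runs while the relay is behind cannot drop events.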
Schema Considerations
A typical outbox schema:
CREATE TABLE outbox (
    id             BIGSERIAL PRIMARY KEY,
    event_type     VARCHAR(255) NOT NULL,
    aggregate_id   VARCHAR(255) NOT NULL,
    aggregate_type VARCHAR(255),
    payload        JSONB NOT NULL,
    metadata       JSONB,
    created_at     TIMESTAMP NOT NULL DEFAULT NOW(),
    published_at   TIMESTAMP NULL
);
CREATE INDEX idx_outbox_unpublished ON outbox (id) WHERE published_at IS NULL;
The partial index makes the relay’s “find unpublished” query efficient even as the table grows.
Inbox Pattern (the Mirror)
The inbox pattern is the consumer-side counterpart for idempotency.
CREATE TABLE inbox (
    message_id  VARCHAR(255) PRIMARY KEY,
    consumed_at TIMESTAMP NOT NULL
);
def handle_event(event):
    try:
        with transaction:
            # Insert the dedupe row first; a duplicate message_id aborts the transaction.
            db.insert("INSERT INTO inbox (message_id, consumed_at) VALUES (?, NOW())", event.message_id)
            apply_business_logic(event)
    except UniqueConstraintViolation:
        # already processed, ignore
        pass
The unique constraint catches duplicates. Combined with the outbox pattern, you get reliable end-to-end delivery.
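As a concrete sketch, the handler can be run against SQLite, which stands in for the consumer's database. The `fulfillment` table and the event shape are assumptions for illustration. The inbox row and the side effect share one transaction, so a duplicate delivery rolls back without touching state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inbox (message_id TEXT PRIMARY KEY, consumed_at TEXT NOT NULL);
    CREATE TABLE fulfillment (order_id INTEGER NOT NULL);
""")


def handle_event(conn, event):
    """Apply the event once; return False if it was already consumed."""
    try:
        with conn:  # one transaction: dedupe row and side effect commit together
            conn.execute(
                "INSERT INTO inbox (message_id, consumed_at) "
                "VALUES (?, datetime('now'))",
                (event["message_id"],),
            )
            conn.execute(
                "INSERT INTO fulfillment (order_id) VALUES (?)",
                (event["order_id"],),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # primary key violation: duplicate delivery


event = {"message_id": "msg-1", "order_id": 42}
first = handle_event(conn, event)
second = handle_event(conn, event)
applied = conn.execute("SELECT COUNT(*) FROM fulfillment").fetchone()[0]
print(first, second, applied)  # prints: True False 1
```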
When the Outbox Pattern Helps
When You Don’t Need It
Recap
- The outbox pattern solves the dual-write problem: how to atomically change state AND emit an event.
- Write the state change AND a row to an “outbox” table in one DB transaction.
- A separate relay process publishes from the outbox to the broker.
- Two relay options: polling (simple) or CDC (low-latency, more ops).
- At-least-once delivery; consumers must be idempotent.
- Pair with the inbox pattern on the consumer side for end-to-end reliability.
- Use it whenever state changes must reliably trigger downstream effects.