Distributed Coordination and Locking

9 min read · Updated 2026-04-25

Distributed locks coordinate processes accessing shared resources across multiple machines. They’re fundamental for preventing concurrent writes to the same record, ensuring single-leader behavior, and guarding non-idempotent operations.

The hard part: distributed environments don’t have shared memory or synchronized clocks. Network failures partition nodes without warning. Long GC pauses can make a healthy node look dead. Building a correct distributed lock is genuinely tricky.

Why It’s Hard

Partial failures
A node may crash while others keep running. Deciding whether the lock holder died or merely went silent is impossible to do reliably over an asynchronous network.
Clock skew
No global clock. Wall-clock timestamps are unreliable for ordering. NTP helps but doesn't guarantee tight bounds.
Network partitions
Nodes can be isolated for arbitrary periods. Messages can be delayed, reordered, or lost.
GC pauses
Long GC or VM pause makes a node appear dead. When it resumes, it doesn't know time has passed and may still think it holds the lock.
Atomicity
"Check if lock is free, then acquire it" must be a single atomic operation, not two steps.
Fencing
Without monotonically increasing tokens, a delayed operation from an old lock holder can corrupt state after a new holder takes over.

Common Pitfalls

Local locks in distributed services
Junior engineers reach for an OS mutex or a file lock. Other replicas have no idea it exists, so the lock effectively does nothing.
Single lock server with no failover
Centralized lock service crashes; whole system halts. Anything coordination-critical needs replication.
No TTL
If a client crashes while holding the lock, the lock is held forever. Always use leases / TTLs.
Wall-clock ordering
Using timestamps for lock priority breaks under clock skew. Use logical clocks or fencing tokens for anything correctness-critical.
No fencing tokens
A lock holder pauses, lock expires, new holder takes over. Old holder resumes and writes to storage with stale "I have the lock" assumption.
Non-atomic check-and-set
Separate "check available" + "acquire" creates race conditions. Use atomic compare-and-swap or single-call API.

The Implementation Options

RDBMS locks

Relational databases offer two approaches you can use without any extra infrastructure.

Row-level locks (SELECT ... FOR UPDATE)
Standard SQL row locking inside a transaction, coordinated by the database. Works only within a single database and carries SQL transaction overhead.
Advisory locks (Postgres pg_advisory_lock(key))
Lightweight, integer-keyed mutex, session- or transaction-scoped, auto-released on disconnect. Postgres-only; no fencing token.

Best for: scenarios where all clients use the same database, no need for cross-DB coordination, simple use cases like “guard one scheduled job.”
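
For example, guarding a nightly job with pg_try_advisory_lock might look like the following Go sketch using database/sql (runNightlyJob, jobKey, and doTheJob are illustrative names, not library APIs; advisory locks are session-scoped, so the sketch pins one pooled connection):

package example

import (
    "context"
    "database/sql"

    _ "github.com/lib/pq" // assumption: the lib/pq Postgres driver
)

// runNightlyJob runs doTheJob on at most one replica at a time by holding a
// Postgres advisory lock keyed by jobKey (any int64 the replicas agree on).
func runNightlyJob(ctx context.Context, db *sql.DB, jobKey int64) error {
    // Advisory locks are session-scoped: pin a single pooled connection so the
    // lock and the unlock happen on the same session.
    conn, err := db.Conn(ctx)
    if err != nil {
        return err
    }
    defer conn.Close()

    var got bool
    // pg_try_advisory_lock returns immediately: true means we hold the lock.
    if err := conn.QueryRowContext(ctx, "SELECT pg_try_advisory_lock($1)", jobKey).Scan(&got); err != nil {
        return err
    }
    if !got {
        return nil // another replica is already running the job
    }
    defer conn.ExecContext(ctx, "SELECT pg_advisory_unlock($1)", jobKey)

    return doTheJob(ctx)
}

// doTheJob is a stand-in for the actual scheduled work.
func doTheJob(ctx context.Context) error { return nil }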

Redis simple lock

The canonical Redis lock pattern:

SET resource_key my_random_value NX PX 30000

NX = set only if the key doesn't exist. PX 30000 = expire after 30,000 milliseconds (30 seconds). my_random_value = a unique token chosen by the client, so only the holder can release the lock.

To release, use a Lua script that checks the value matches before deleting:

if redis.call("get", KEYS[1]) == ARGV[1]
then return redis.call("del", KEYS[1])
else return 0
end

Pros: Very fast, simple. Lock auto-expires if client crashes. Cons: Single Redis instance is a SPOF. Replication is asynchronous — failover can produce two clients thinking they hold the lock. No fencing token. No safety beyond the TTL.
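
Put together, acquire and release look roughly like this in Go; a sketch assuming the go-redis v9 client, where withLock is an illustrative helper and token must be unique per acquisition:

package example

import (
    "context"
    "errors"
    "time"

    "github.com/redis/go-redis/v9" // assumption: the go-redis v9 client
)

// Release script from above: delete the key only if we still own it.
const releaseScript = `if redis.call("get", KEYS[1]) == ARGV[1]
then return redis.call("del", KEYS[1])
else return 0
end`

// withLock runs work while holding the lock, or returns an error if the
// lock is busy.
func withLock(ctx context.Context, rdb *redis.Client, key, token string, work func() error) error {
    // SET key token NX PX 30000: acquire only if the key does not already exist.
    ok, err := rdb.SetNX(ctx, key, token, 30*time.Second).Result()
    if err != nil {
        return err // Redis unreachable: fail closed, don't assume we hold the lock
    }
    if !ok {
        return errors.New("lock busy") // someone else holds it; retry later
    }
    // Best-effort release: the script deletes the key only if the token still matches.
    defer rdb.Eval(ctx, releaseScript, []string{key}, token)

    return work()
}

Passing a random token and releasing via the script (rather than a plain DEL) is what keeps a slow client from deleting a lock that has already expired and been re-acquired by someone else.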

Redlock

Redis’s lock algorithm spanning multiple independent masters. It uses N independent Redis nodes (typically 5); a code sketch follows the steps below:

  1. Client attempts SET NX PX on each Redis master, recording timing.
  2. If set succeeds on a majority (3 of 5) and total time < TTL, lock acquired.
  3. Effective validity = TTL − elapsed time.
  4. Failure → delete any partial locks, retry later.
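
A simplified Go sketch of the acquisition step, assuming go-redis v9 clients for each master (retries, backoff, and the step-4 cleanup of partial locks are left to the caller):

package example

import (
    "context"
    "time"

    "github.com/redis/go-redis/v9" // assumption: the go-redis v9 client
)

// acquireRedlock is a simplified sketch of steps 1-3 above. On failure the
// caller performs step 4: delete any partial locks and retry later.
func acquireRedlock(ctx context.Context, masters []*redis.Client, key, token string, ttl time.Duration) (time.Duration, bool) {
    start := time.Now()
    acquired := 0
    for _, m := range masters {
        // Step 1: SET key token NX PX <ttl> on each independent master.
        if ok, err := m.SetNX(ctx, key, token, ttl).Result(); err == nil && ok {
            acquired++
        }
    }
    elapsed := time.Since(start)
    validity := ttl - elapsed // step 3: effective validity shrinks by the elapsed time
    // Step 2: need a majority of masters, with time still left on the clock.
    if acquired >= len(masters)/2+1 && validity > 0 {
        return validity, true
    }
    return 0, false
}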

ZooKeeper locks

ZooKeeper’s classic lock recipe uses ephemeral sequential znodes:

  1. Client creates /locks/mylock/lock-NNNNN with EPHEMERAL + SEQUENTIAL flags.
  2. Sequential flag appends a monotonically-increasing suffix.
  3. Client lists children, checks if its znode has the lowest number.
  4. If yes → it has the lock.
  5. If no → set a watch on the immediate predecessor, wait for it to be deleted.
  6. Release = delete the znode (or it auto-deletes when client disconnects).
Fair
Whoever requested first gets the lock first.
No thundering herd
Each waiting client watches only its predecessor — single notification per release.
Self-cleaning
Crashed client's znode auto-deletes via ephemeral semantics.
Built-in fencing
The sequential number IS a fencing token.

The cost: you need to run and maintain a ZooKeeper ensemble.
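
The go-zookeeper/zk client ships this recipe as a Lock type; a minimal sketch, assuming an ensemble reachable at 127.0.0.1:2181:

package main

import (
    "time"

    "github.com/go-zookeeper/zk" // assumption: the go-zookeeper client
)

func main() {
    // Connect to the ensemble (address is illustrative).
    conn, _, err := zk.Connect([]string{"127.0.0.1:2181"}, 5*time.Second)
    if err != nil {
        panic(err)
    }
    defer conn.Close()

    // NewLock implements the ephemeral-sequential recipe described above.
    lock := zk.NewLock(conn, "/locks/mylock", zk.WorldACL(zk.PermAll))
    if err := lock.Lock(); err != nil { // blocks until our znode is the lowest
        panic(err)
    }
    defer lock.Unlock() // deletes our znode; the next waiter's watch fires

    // ... critical section ...
}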

etcd locks

etcd is a Raft-based KV store with built-in concurrency primitives. Its lock pattern combines a lease (TTL) with an atomic compare-and-swap, shown here with the Go clientv3 API (error handling elided):

lease, _ := cli.Grant(ctx, 10) // lease with a 10s TTL

resp, _ := cli.Txn(ctx).
    If(clientv3.Compare(clientv3.CreateRevision(key), "=", 0)). // key must not exist yet
    Then(clientv3.OpPut(key, "locked", clientv3.WithLease(lease.ID))).
    Commit()
// resp.Succeeded reports whether we won the race and now hold the lock

This atomically creates the key under the lease if it didn’t already exist. The lock is held until the client releases it or the lease expires.

Strongly consistent
Backed by Raft consensus. Real safety guarantees, not timing assumptions.
Lease IDs as fences
Use lease ID or revision number as fencing token for downstream operations.
Auto-renewal
Client can keep refreshing the lease via heartbeats while it's working.
Already running in K8s
If you have Kubernetes, you have etcd. Reusing it for app-level coordination is reasonable.
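
If you’d rather not hand-roll the transaction, the official Go client also ships a higher-level wrapper that manages the lease, keep-alives, and waiting contenders; a minimal sketch using its concurrency package (the endpoint is illustrative):

package main

import (
    "context"

    clientv3 "go.etcd.io/etcd/client/v3" // assumption: the etcd v3 Go client
    "go.etcd.io/etcd/client/v3/concurrency"
)

func main() {
    ctx := context.Background()
    cli, err := clientv3.New(clientv3.Config{Endpoints: []string{"127.0.0.1:2379"}})
    if err != nil {
        panic(err)
    }
    defer cli.Close()

    // The session owns a lease and keeps it alive with heartbeats.
    session, err := concurrency.NewSession(cli, concurrency.WithTTL(10))
    if err != nil {
        panic(err)
    }
    defer session.Close()

    mutex := concurrency.NewMutex(session, "/locks/mylock")
    if err := mutex.Lock(ctx); err != nil { // blocks until acquired
        panic(err)
    }
    defer mutex.Unlock(ctx)

    // ... critical section; the key's revision can serve as a fencing token ...
}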

Fencing Tokens: The Safety Mechanism

A fencing token is a monotonically-increasing number issued with the lock. The protected resource (database, storage system) checks the token on every operation and rejects operations with stale tokens.

1. Acquire lock → get token N
2. Operation: write_data(payload, token=N)
3. Storage checks: is N >= the highest token seen so far? If N is smaller, the write is rejected.

This protects against the “paused holder” problem. Even if your client paused for an hour past TTL, when it resumes and tries to write with token N, the storage rejects because someone else has token N+1.
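
On the storage side, the check is just a comparison against the highest token accepted so far. A minimal in-memory Go sketch (fencedStore and its fields are illustrative names, not any particular database’s API):

package example

import (
    "errors"
    "sync"
)

// fencedStore remembers the highest fencing token it has accepted per key
// and rejects writes that carry anything older.
type fencedStore struct {
    mu     sync.Mutex
    latest map[string]int64 // highest token accepted so far, per key
    data   map[string][]byte
}

func newFencedStore() *fencedStore {
    return &fencedStore{latest: map[string]int64{}, data: map[string][]byte{}}
}

var errStaleToken = errors.New("stale fencing token: a newer lock holder exists")

func (s *fencedStore) Write(key string, payload []byte, token int64) error {
    s.mu.Lock()
    defer s.mu.Unlock()
    if token < s.latest[key] {
        return errStaleToken // a paused, expired holder is trying to write
    }
    s.latest[key] = token
    s.data[key] = payload
    return nil
}

The token itself can come from any of the mechanisms above: a ZooKeeper sequence number or an etcd lease ID / revision both increase monotonically.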

Choosing the Right Tool

Use case → Recommendation
Simple mutex inside one Postgres app → Postgres advisory locks
Cache invalidation, transient locks → Redis simple lock with TTL
Multi-DB coordination, OK with rare anomalies → Redlock (with caveats)
Correctness-critical (financial, leader election) → ZooKeeper or etcd, with fencing tokens
Already running K8s → etcd Lock API (you have it anyway)

Recap