Partitioning and Replication Strategies

9 min read · Updated 2026-04-25

Database performance becomes critical when application user counts go from hundreds to millions. Successful operation under load depends on understanding how different database families scale and choosing the right approach for your workload.

This lesson is a tour of the main database families and how each one scales.

The Trade-offs to Balance

Read vs. write patterns

Read-heavy (content sites) vs. write-heavy (IoT data ingest). Different scaling strategies for each.

Vertical vs. horizontal

Bigger machines vs. more machines. Vertical is simpler; horizontal scales further but adds complexity.

Replication vs. sharding

Replication = multiple copies of all data (read scale + redundancy). Sharding = data split across nodes (write scale).

Consistency vs. performance

Strong consistency = synchronous coordination. Eventual consistency = faster, may serve stale data.

Relational Databases (Postgres, MySQL)

Modern Postgres and MySQL have evolved significantly — supporting JSON documents, full-text search, and geospatial queries that used to require NoSQL. They remain the foundational data layer for most SaaS platforms.

Single-master architecture

Postgres uses a single-master architecture: all writes go through one primary server, and reads scale out via streaming replication to multiple read replicas.
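To make the read/write split concrete, here is a minimal sketch of an application-side router. The class name, the server labels, and the "first keyword is SELECT" rule are all assumptions for illustration; real deployments usually rely on a driver or proxy feature for this and also have to account for replication lag.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the primary and rotate reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # With no replicas configured, reads fall back to the primary.
        self._readers = itertools.cycle(replicas or [primary])

    def route(self, sql):
        words = sql.split()
        first_word = words[0].upper() if words else ""
        # Simplification: anything that is not a plain SELECT is treated
        # as a write. SELECT ... FOR UPDATE would need the primary too.
        if first_word == "SELECT":
            return next(self._readers)
        return self.primary


router = ReadWriteRouter("primary", ["replica-1", "replica-2"])
print(router.route("INSERT INTO orders VALUES (1)"))
print(router.route("SELECT * FROM orders"))
```

A read that must observe its own preceding write should still go to the primary, because a replica may be slightly behind.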

Strengths

What Postgres scales well at

Read-heavy workloads. Multiple replicas, each handling thousands of QPS at 10-50ms. ACID transactions. Mature SQL. Aurora-style separation of compute/storage extends this further.

Limitations

Where it hits walls

Write throughput is bounded by the single primary. Vertical scaling has hard limits. Synchronous replication makes each commit wait for a replica to acknowledge it, so a slow or distant replica directly slows writes on the primary.

Scaling writes in relational databases

When write throughput maxes out a single primary:

Table partitioning

Postgres native partitioning by date or ID range. Single DB, multiple physical tables.
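As an illustration of date-range partitioning, the sketch below generates the DDL for one monthly partition. It assumes a hypothetical parent table that was already created with PARTITION BY RANGE on a date or timestamp column; the table name and naming scheme are made up for the example.

```python
from datetime import date


def monthly_partition_ddl(parent, year, month):
    """Build the CREATE TABLE statement for one monthly range partition."""
    start = date(year, month, 1)
    # The upper bound is exclusive, so it is the first day of the next month.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    name = f"{parent}_{start:%Y_%m}"
    return (
        f"CREATE TABLE {name} PARTITION OF {parent} "
        f"FOR VALUES FROM ('{start}') TO ('{end}');"
    )


print(monthly_partition_ddl("events", 2026, 4))
```

Because each partition is a separate physical table, old months can be detached or dropped without touching the rows in current partitions.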

Citus, Vitess

Distribute tables across multiple nodes while preserving SQL semantics. Citus for Postgres, Vitess for MySQL. Tens of thousands of writes/sec.

Application-level sharding

Facebook-style: thousands of MySQL instances; app-level routing by user ID. Maximum control, maximum operational complexity.
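A minimal sketch of routing by user ID follows. The shard names and the hash-then-modulo scheme are assumptions for illustration, not a description of any particular company's setup.

```python
import hashlib

# Hypothetical connection names, one per MySQL instance.
SHARDS = ["mysql-shard-0", "mysql-shard-1", "mysql-shard-2", "mysql-shard-3"]


def shard_for_user(user_id, shards=SHARDS):
    """Map a user ID to one shard, deterministically."""
    # A stable hash is used instead of Python's built-in hash(), which
    # is randomized per process for strings.
    digest = hashlib.sha256(str(user_id).encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


print(shard_for_user(42))
```

Plain modulo routing remaps most users when the shard count changes, which is one reason large deployments often keep a lookup directory from user ID to shard instead.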

Multi-leader replication

MySQL Group Replication in multi-primary mode, or Postgres add-ons such as BDR and pglogical. Several nodes accept writes, so conflicting writes must be detected and resolved.

Document Stores (MongoDB)

MongoDB was built for horizontal scaling from day one. Sharded clusters distribute writes across multiple primaries simultaneously.

Architecture

[ Client ]
    │
    ▼
[ mongos router ] ── shard key lookup
    │
    ├─► [ Shard A ─ Replica Set ]
    ├─► [ Shard B ─ Replica Set ]
    └─► [ Shard C ─ Replica Set ]

Each shard = a replica set with its own primary and secondaries.
mongos is the routing layer — stateless, horizontally scalable.
Queries with shard key route to a single shard. Cross-shard queries fan out.
Shard key choice is the critical decision — bad keys create hot spots.
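The hot-spot problem can be shown with a toy model. The sketch below compares range-based placement of a monotonically increasing key with hashed placement. The boundaries, shard count, and helper names are assumptions, and the code does not model MongoDB's actual chunk management.

```python
import bisect
import hashlib
from collections import Counter


def ranged_shard(key, boundaries):
    """Index of the range that holds key; boundaries are sorted upper bounds."""
    return bisect.bisect_right(boundaries, key)


def hashed_shard(key, shard_count):
    """Index chosen by hashing the key, which scatters nearby keys."""
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# New documents arrive with ever-increasing IDs, like timestamps or counters.
new_ids = range(2000, 3000)
ranged = Counter(ranged_shard(k, [1000, 2000]) for k in new_ids)
hashed = Counter(hashed_shard(k, 3) for k in new_ids)

print("ranged:", dict(ranged))  # every insert lands on the last shard
print("hashed:", dict(hashed))  # inserts spread across all shards
```

The trade-off is that hashed placement spreads writes but turns range queries on the key into cross-shard fan-out.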

Wide-Column Stores (Cassandra)

Despite the name, “wide-column” doesn’t mean columnar storage. It means flexible row schemas where rows can have different columns. Internally, Cassandra is row-oriented.

Peer-to-peer architecture

   [ Client ]
       │
       ▼
   [ Any node — coordinator ] ──┐
              │                  │
   ┌──────────┼──────────┐       │  consistent hashing
   ▼          ▼          ▼       │  → owner
 [Node 1]  [Node 2]  [Node 3]    │

No master. Every node is equal. Data distribution via consistent hashing. Any node can act as coordinator.
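A minimal consistent-hashing ring is sketched below to show how any node can work out which node owns a key. The MD5-based hash, the virtual-node count, and the class name are assumptions for illustration; Cassandra's own partitioner and token allocation differ in detail.

```python
import bisect
import hashlib


def _token(value):
    """Map a string to a position on the ring."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes, vnodes=64):
        # Each node owns several positions so load evens out.
        self._ring = sorted(
            (_token(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._tokens = [token for token, _ in self._ring]

    def owner(self, key):
        """Walk clockwise from the key's position to the next node token."""
        idx = bisect.bisect_right(self._tokens, _token(key)) % len(self._ring)
        return self._ring[idx][1]


keys = [f"user:{i}" for i in range(2000)]
before = HashRing(["n1", "n2", "n3"])
after = HashRing(["n1", "n2", "n3", "n4"])
moved = [k for k in keys if before.owner(k) != after.owner(k)]
print(f"{len(moved)} of {len(keys)} keys moved after adding n4")
```

Only keys that fall into ranges taken over by the new node change owner, which is what lets a cluster grow without reshuffling everything.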

Optimized for writes

Append-only commit log + memtable + SSTable architecture. Tens of thousands of writes per second per node, predictable latency.
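The write path can be sketched as a toy log-structured store. This is a simplified in-memory model under assumed names and a tiny flush threshold; it leaves out compaction, tombstones, and real disk I/O.

```python
import bisect


class TinyLSM:
    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only record of unflushed writes
        self.memtable = {}     # most recent writes, held in memory
        self.sstables = []     # immutable sorted runs, newest last
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        # A write is an append plus an in-memory update: no read, no seek.
        self.commit_log.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        items = sorted(self.memtable.items())
        self.sstables.append(([k for k, _ in items], [v for _, v in items]))
        self.memtable = {}
        self.commit_log.clear()  # flushed data no longer needs replay

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Search newest runs first so later writes shadow older ones.
        for keys, values in reversed(self.sstables):
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return values[i]
        return None


db = TinyLSM()
db.put("a", 1)
db.put("b", 2)   # reaches the limit and flushes to an SSTable
db.put("a", 3)   # newer value lives in the memtable
print(db.get("a"), db.get("b"))
```

Reads may have to consult several runs, which is the price paid for making writes sequential.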

Multi-datacenter native

Tunable replication strategies across DCs. A common mapping treats each region as a datacenter and each availability zone as a rack. Strong fit for global apps.

Tunable consistency

Set per query, with levels such as ANY (writes only), ONE, QUORUM, and ALL. Each operation chooses its own balance of latency and consistency.
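The arithmetic behind these levels is short: if the number of replicas that acknowledge a write plus the number consulted on a read exceeds the replication factor, the two sets must overlap, so the read reaches at least one replica holding the latest write. A small sketch with assumed function names:

```python
def quorum(replication_factor):
    """Smallest majority of replicas."""
    return replication_factor // 2 + 1


def read_overlaps_write(replication_factor, write_acks, read_replicas):
    """True when every read set must intersect every write set."""
    return write_acks + read_replicas > replication_factor


rf = 3
q = quorum(rf)
print(q)                               # majority size for three replicas
print(read_overlaps_write(rf, q, q))   # QUORUM writes with QUORUM reads
print(read_overlaps_write(rf, 1, 1))   # ONE writes with ONE reads
```

With a replication factor of 3, QUORUM for both reads and writes gives overlap while still tolerating one replica being down. ONE for both is faster but can return stale data.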

Linear scalability

Add a node, capacity increases proportionally. Designed for massive scale (Netflix, Apple, Discord).

The cost: limited query flexibility (must design table layout per query pattern), no joins, eventual consistency by default.

Key-Value Stores (Redis, DynamoDB)

Simplest data model: keys map to values. The simplicity enables extreme scale and low latency.

Redis

In-memory key-value store. Sub-millisecond latency. Hugely versatile (strings, hashes, lists, sets, streams, pub/sub).

Single-node: 100k+ ops/sec at sub-millisecond latency. Capacity is limited by RAM.
Sentinel: automatic failover for high availability.
Cluster: sharding across nodes via hash slots. Write capacity grows roughly linearly with the number of shards.
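Redis Cluster assigns each key to one of 16384 hash slots by taking a CRC16 of the key modulo 16384, and hashes only the part inside the first non-empty {...} when a key contains one. The sketch below reproduces that calculation using the standard library's CRC-CCITT routine. Treat it as an illustration rather than a replacement for asking a server with CLUSTER KEYSLOT.

```python
import binascii

HASH_SLOTS = 16384


def key_hash_slot(key: bytes) -> int:
    """Compute the cluster hash slot for a key, honoring hash tags."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag narrows what gets hashed.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    # crc_hqx with an initial value of 0 is the XMODEM variant of CRC16.
    return binascii.crc_hqx(key, 0) % HASH_SLOTS


# Keys sharing a hash tag land in the same slot, so they can be used
# together in multi-key commands.
print(key_hash_slot(b"{user:1}:cart") == key_hash_slot(b"{user:1}:profile"))
```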

Used as cache, session store, real-time analytics, leaderboards, message queues.

DynamoDB

AWS-managed key-value store with optional document features. Its design is inspired by the architecture described in Amazon's Dynamo paper.

Single-digit ms latency

Predictable performance at any scale, with an availability SLA.

Effectively unlimited scale

AWS handles partition management, replication, scaling. You don't see infrastructure.

Pay-per-request or provisioned

On-demand mode for variable load; provisioned mode for predictable savings.

Global tables

Multi-region active-active replication built in. Compliance-friendly for global SaaS.

The cost: limited query patterns (must design access patterns up front), no joins, and a 400 KB limit on item size.

NewSQL: SQL Without the Single-Master Limit

A category that emerged to combine SQL semantics with horizontal scale.

Spanner / CockroachDB

Distributed SQL with strong consistency. Spanner orders transactions with TrueTime; CockroachDB uses hybrid logical clocks (HLC). ACID transactions across regions.
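To give a feel for what a hybrid logical clock does, here is a minimal sketch based on the published HLC algorithm: each timestamp pairs a physical time with a logical counter, so timestamps keep increasing even when physical clocks are equal or behind. The class and method names are assumptions, and this is not CockroachDB's implementation.

```python
from dataclasses import dataclass


@dataclass(order=True, frozen=True)
class Timestamp:
    wall: int     # physical component, e.g. milliseconds
    logical: int  # counter that breaks ties within one wall value


class HybridLogicalClock:
    def __init__(self, physical_clock):
        self._now = physical_clock  # callable returning the current time
        self._last = Timestamp(0, 0)

    def tick(self):
        """Timestamp a local event or an outgoing message."""
        wall = max(self._last.wall, self._now())
        logical = self._last.logical + 1 if wall == self._last.wall else 0
        self._last = Timestamp(wall, logical)
        return self._last

    def observe(self, remote):
        """Merge a timestamp carried by an incoming message."""
        wall = max(self._last.wall, remote.wall, self._now())
        if wall == self._last.wall and wall == remote.wall:
            logical = max(self._last.logical, remote.logical) + 1
        elif wall == self._last.wall:
            logical = self._last.logical + 1
        elif wall == remote.wall:
            logical = remote.logical + 1
        else:
            logical = 0
        self._last = Timestamp(wall, logical)
        return self._last


clock = HybridLogicalClock(lambda: 100)  # frozen physical clock for the demo
first = clock.tick()
second = clock.tick()
merged = clock.observe(Timestamp(250, 4))  # message from a node running ahead
print(first < second < merged)
```

The point of the demo is that ordering is preserved even though the local physical clock never advances and the remote node's clock is far ahead.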

YugabyteDB

Postgres-compatible distributed SQL. Multi-region active-active. Open source.

TiDB

MySQL-compatible distributed SQL. Strong fit for migrations from MySQL.

When to consider

When you need both SQL semantics and more write throughput than a single Postgres primary can deliver. The operational tax is real, so adopt one only when you actually need it.

Choosing for Multi-Tenant SaaS

A reasonable pattern for a SaaS platform:

Use case	Database choice
Tenant data, transactional	Postgres (with sharding or Aurora)
Tenant data, append-heavy time series	Cassandra or DynamoDB
User sessions, caching	Redis
Search	Elasticsearch / OpenSearch
Analytics aggregates	ClickHouse / BigQuery / Snowflake
Object storage	S3

Most SaaS platforms run multiple databases — each chosen for what it’s best at. The pragmatic approach is “use the right database for the job,” not “force everything into one.”

Recap

Different database families have fundamentally different scaling architectures.
Relational (Postgres, MySQL) — single-master, scale reads via replicas, scale writes via sharding tools (Citus, Vitess) or app-level partitioning.
Document (MongoDB) — sharded clusters, multi-primary, native horizontal scale; shard key choice is critical.
Wide-column (Cassandra) — peer-to-peer, optimized for writes and multi-DC, tunable consistency.
Key-value (Redis, DynamoDB) — simple data model, extreme scale, predictable latency.
NewSQL (Spanner, CockroachDB, YugabyteDB) — SQL with horizontal write scale, at operational cost.
Real platforms run multiple databases. Pick the right one per access pattern.