Shard key choice

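Hashed user_id → even distribution across shards. Bad choice: timestamp (monotonically increasing, so all new writes land on the newest shard and create a hotspot). Choose carefully; changing the key later means moving existing data, which is painful.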
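
The contrast between the two key choices can be sketched as follows. This is a minimal illustration, not a production scheme: NUM_SHARDS, shard_for, range_shard_for, and the boundary values are all hypothetical names and numbers chosen for the example.

```python
import bisect
import hashlib
from typing import Sequence

NUM_SHARDS = 8  # hypothetical shard count, chosen only for illustration


def shard_for(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a user_id to a shard index using a stable hash.

    hashlib is used instead of the built-in hash() so the mapping is
    identical across processes and restarts.
    """
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def range_shard_for(ts: int, boundaries: Sequence[int]) -> int:
    """Map a timestamp to a shard by range (boundaries must be sorted).

    Any timestamp past the last boundary lands on the last shard, which
    is why a monotonically increasing key concentrates new writes there.
    """
    return bisect.bisect_right(boundaries, ts)


def spread(user_ids, num_shards: int = NUM_SHARDS):
    """Count how many of the given user_ids land on each shard."""
    counts = [0] * num_shards
    for uid in user_ids:
        counts[shard_for(uid, num_shards)] += 1
    return counts
```

Calling spread(range(80000)) should give roughly equal counts per shard, while range_shard_for returns the same (last) index for every recent timestamp.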

Routing layer

Application code (via a client library) or a proxy translates key → shard. A proxy keeps app code simple, at the cost of an extra network hop. Route lookups must be fast, so cache the shard map.
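
One way to keep route lookups fast is to hold the shard map in memory and refresh it only after a time-to-live expires. The sketch below assumes a hypothetical load_map callable that returns a dict from shard index (0..N-1) to a connection string; ShardRouter, ttl_seconds, and the injectable clock are likewise illustrative names, not part of any particular library.

```python
import hashlib
import time
from typing import Callable, Dict, Optional


class ShardRouter:
    """Translate a key to a shard address using a cached shard map."""

    def __init__(
        self,
        load_map: Callable[[], Dict[int, str]],  # hypothetical config-store lookup
        ttl_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load_map = load_map
        self._ttl = ttl_seconds
        self._clock = clock
        self._map: Optional[Dict[int, str]] = None
        self._loaded_at = 0.0

    def _shard_map(self) -> Dict[int, str]:
        # Reload only on first use or once the cached copy is older than the TTL.
        now = self._clock()
        if self._map is None or now - self._loaded_at > self._ttl:
            self._map = self._load_map()
            self._loaded_at = now
        return self._map

    def route(self, user_id: int) -> str:
        """Return the address of the shard that owns user_id."""
        shard_map = self._shard_map()
        digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
        # Assumes the map is keyed by consecutive indexes 0..N-1.
        index = int.from_bytes(digest[:8], "big") % len(shard_map)
        return shard_map[index]
```

A TTL is the simplest invalidation strategy; a real deployment might instead refresh the map when a shard rejects a request for a key it no longer owns.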

Cross-shard queries

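WHERE user_id = X: goes to one shard (user_id is the shard key). Aggregate across all users: fan out to every shard, then merge the partial results. Expensive: every shard does work, and latency is bounded by the slowest one.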
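
The fan-out-and-merge pattern might look like the sketch below. Shards are modeled as in-memory lists of (user_id, order_total) rows purely for illustration; partial_sum_count and global_average are hypothetical names, and a real system would run a query against each shard's database instead.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple

Row = Tuple[int, float]  # (user_id, order_total), an assumed row shape


def partial_sum_count(shard: Sequence[Row]) -> Tuple[float, int]:
    """Per-shard step: return (sum, count) rather than a local average."""
    totals = [total for _, total in shard]
    return sum(totals), len(totals)


def global_average(shards: List[Sequence[Row]]) -> float:
    """Fan out to every shard in parallel, then merge the partials."""
    workers = max(1, len(shards))  # ThreadPoolExecutor requires at least 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum_count, shards))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else 0.0
```

Each shard returns a (sum, count) pair because averaging the per-shard averages would give the wrong answer whenever shards hold different numbers of rows.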

Rebalancing

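Adding shards means moving data. Consistent hashing minimizes moves: only about 1/N of keys relocate when the N-th shard is added, whereas plain hash mod N remaps most keys. Range-based sharding is easier to rebalance incrementally, since a single range can be split or moved without touching the others.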
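
A minimal consistent-hash ring with virtual nodes is sketched below to show why few keys move. HashRing, its method names, and the choice of 100 virtual nodes per shard are assumptions made for the example, not a reference implementation.

```python
import bisect
import hashlib
from typing import Dict, Iterable, List


def _hash(value: str) -> int:
    """Stable 64-bit position on the ring."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Consistent-hash ring; each shard owns several points (virtual nodes)."""

    def __init__(self, shards: Iterable[str], vnodes: int = 100) -> None:
        self._vnodes = vnodes
        self._points: List[int] = []       # sorted ring positions
        self._owner: Dict[int, str] = {}   # ring position -> shard name
        for shard in shards:
            self.add_shard(shard)

    def add_shard(self, shard: str) -> None:
        # Virtual nodes spread one shard's ownership around the ring.
        for i in range(self._vnodes):
            point = _hash(f"{shard}#{i}")
            self._owner[point] = shard
            bisect.insort(self._points, point)

    def shard_for(self, key: str) -> str:
        # The owner is the first point clockwise from the key's position,
        # wrapping around to the start of the ring.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]
```

Adding a fifth shard to a four-shard ring should relocate roughly a fifth of the keys, all of them to the new shard. With plain hash mod N, going from 4 to 5 shards would remap about four fifths of the keys.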

Per-shard replicas

Each shard has its own replica set, so operational complexity multiplies with the shard count. Automate everything.