Shard key choice
user_id hashed → even distribution. Bad choice: timestamp (hotspots on newest). Choose carefully; changing is painful.
Advertisement
Shard key choice
user_id hashed → even distribution. Bad choice: timestamp (hotspots on newest). Choose carefully; changing is painful.
Advertisement
Routing layer
Application code or proxy translates key → shard. Keeps app code simple. Route lookups must be fast (cache the shard map).
Cross-shard queries
WHERE user_id = X: goes to one shard. Aggregate across all users: fan out to all shards, merge results. Expensive.
Rebalancing
Adding shards means moving data. Consistent hashing minimizes moves. Range-based sharding is easier to rebalance.
Per-shard replicas
Each shard has its own replica set. Adds complexity multiplied by shard count. Automate everything.