Shard Rebalancing Strategies

Any sharded system eventually needs to rebalance: adding capacity, splitting hotspots, retiring failed shards. Done poorly = downtime. Done well = invisible.

Shard rebalance: plan → dual-write → backfill → verify → cutover → cleanup

Dual-write window

Start writing to both old + new shard. Reads still come from old. No write is lost: the new shard sees every write from this point on.
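
A minimal sketch of the dual-write window, assuming each shard can be modeled as an in-memory dict. `DualWriteRouter` and the key names are hypothetical, chosen for illustration; a real router would also need to handle a failed write to either side.

```python
class DualWriteRouter:
    """Sends every write to both shards; serves reads from the old shard only."""

    def __init__(self, old_shard: dict, new_shard: dict):
        self.old = old_shard
        self.new = new_shard

    def write(self, key, value):
        # The old shard is still the source of truth, so write it first.
        self.old[key] = value
        # Mirror the write so the new shard sees everything from now on.
        self.new[key] = value

    def read(self, key):
        # Reads are unchanged during the dual-write window.
        return self.old.get(key)


old_shard = {"user:1": "alice"}  # data that existed before the window opened
new_shard = {}
router = DualWriteRouter(old_shard, new_shard)
router.write("user:2", "bob")
```

After this runs, the new shard holds only `user:2`. The row for `user:1` predates the window and still has to be copied by the backfill step.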

Copy existing data from old shard → new shard while dual-writes continue. Rate-limit the copy to avoid impact on live traffic.
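
The backfill can be sketched as a batched copy with a pause between batches. This is one possible shape, not a definitive implementation: shards are modeled as dicts, and `backfill`, `batch_size`, and `pause_s` are hypothetical names. The copy is insert-if-absent, an assumption made here so that a backfilled row never overwrites a value a dual-write already placed on the new shard.

```python
import time


def backfill(old_shard: dict, new_shard: dict,
             batch_size: int = 2, pause_s: float = 0.0) -> int:
    """Copy rows missing from new_shard in batches. Returns rows copied."""
    copied = 0
    keys = list(old_shard)  # snapshot of the keys to walk
    for start in range(0, len(keys), batch_size):
        for key in keys[start:start + batch_size]:
            # Insert-if-absent: skip rows a dual-write already delivered,
            # and rows that disappeared from the old shard since the snapshot.
            if key not in new_shard and key in old_shard:
                new_shard[key] = old_shard[key]
                copied += 1
        time.sleep(pause_s)  # crude rate limit between batches
    return copied


old_shard = {"a": 1, "b": 20, "c": 3}
new_shard = {"b": 20}  # "b" already arrived through a dual-write
copied = backfill(old_shard, new_shard)
```

In production the pause would normally be tuned against shard load or replication lag rather than fixed.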

Compare row counts and spot-check a sample of rows. When confident, atomically flip the routing metadata so reads go to the new shard.
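
A sketch of verify-then-flip, under the same dict-as-shard assumption. `verify`, `cutover`, the `routing` dict, and the range name are hypothetical. The flip is modeled as a single assignment; a real metadata store would need its own atomic primitive, such as a transaction or compare-and-swap.

```python
import random


def verify(old_shard: dict, new_shard: dict,
           sample_size: int = 100, seed: int = 0) -> bool:
    """Cheap consistency check: row counts first, then a random sample of rows."""
    if len(old_shard) != len(new_shard):
        return False
    rng = random.Random(seed)
    keys = list(old_shard)
    sample = rng.sample(keys, min(sample_size, len(keys)))
    return all(key in new_shard and new_shard[key] == old_shard[key]
               for key in sample)


def cutover(routing: dict, shard_range: str,
            old_shard: dict, new_shard: dict) -> bool:
    """Point reads at the new shard, but only if verification passes."""
    if not verify(old_shard, new_shard):
        return False
    # One write to one metadata entry: readers see "old" or "new", never a mix.
    routing[shard_range] = "new"
    return True


routing = {"users:0-999": "old"}
old_shard = {"a": 1, "b": 2}
new_shard = {"a": 1, "b": 2}
flipped = cutover(routing, "users:0-999", old_shard, new_shard)
```

Sampling only gives probabilistic confidence. A full checksum comparison is stricter but costs more I/O on both shards.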

Interim: reads from new. Writes still to both (in case rollback needed). Monitor.
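
The interim state can be sketched as a phase flag on the router. This is an illustration under assumptions: `Phase`, `PhasedRouter`, and `rollback` are hypothetical names, and shards are plain dicts. Because writes keep going to both shards, rolling back is only a change of read target.

```python
from enum import Enum


class Phase(Enum):
    DUAL_WRITE = "dual_write"  # reads: old, writes: both
    INTERIM = "interim"        # reads: new, writes: both
    DONE = "done"              # reads: new, writes: new only


class PhasedRouter:
    def __init__(self, old_shard: dict, new_shard: dict,
                 phase: Phase = Phase.DUAL_WRITE):
        self.old = old_shard
        self.new = new_shard
        self.phase = phase

    def write(self, key, value):
        self.new[key] = value
        if self.phase is not Phase.DONE:
            # Keep the old shard current so rollback stays possible.
            self.old[key] = value

    def read(self, key):
        source = self.old if self.phase is Phase.DUAL_WRITE else self.new
        return source.get(key)

    def rollback(self):
        if self.phase is Phase.DONE:
            raise RuntimeError("old shard no longer receives writes")
        self.phase = Phase.DUAL_WRITE  # reads go back to the old shard


old_shard, new_shard = {}, {}
router = PhasedRouter(old_shard, new_shard, Phase.INTERIM)
router.write("k", "v1")
router.rollback()
```

After the rollback, reads return `v1` from the old shard, because the interim write reached both sides.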

Once confident, stop writing to old. Delete the moved rows from the old shard. Reclaim capacity.
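
Cleanup can be sketched as a batched delete with a safety guard. `cleanup` and `moved_keys` are hypothetical names, and shards are again modeled as dicts. The guard, an assumption added here, refuses to delete any row the new shard does not actually hold.

```python
def cleanup(old_shard: dict, new_shard: dict, moved_keys,
            batch_size: int = 2) -> int:
    """Delete moved rows from the old shard in batches. Returns rows deleted."""
    moved = list(moved_keys)
    deleted = 0
    for start in range(0, len(moved), batch_size):
        for key in moved[start:start + batch_size]:
            # Only delete rows that are confirmed present on the new shard.
            if key in new_shard and key in old_shard:
                del old_shard[key]
                deleted += 1
    return deleted


old_shard = {"a": 1, "b": 2, "c": 3}
new_shard = {"a": 1, "b": 2}
deleted = cleanup(old_shard, new_shard, ["a", "b", "c"])
```

Here `c` is listed as moved but is missing from the new shard, so it stays on the old shard instead of being lost.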

Dual-write → backfill → verify → cutover → cleanup. Zero downtime, roll-back-able.