Anti-entropy repair is Cassandra's mechanism for ensuring replicas converge to the same data. Skipping it leads to silent divergence and deleted rows coming back to life (zombie data); running it wrong can saturate the cluster with I/O. The 2026 best-practice picture is clearer than it used to be.

Why you must repair

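Hinted handoff covers brief outages; read repair fixes only data that actually gets read. Cold data on a replica that was down longer than the hint window (3 hours by default) can stay stale forever. Tombstones can't be safely garbage-collected after gc_grace_seconds (default 864000, i.e. 10 days) unless every replica has seen them; a replica that missed the delete can otherwise bring the data back. Pick: one complete repair per node every week, so each cycle finishes well before gc_grace_seconds expires.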
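
A quick way to sanity-check a cadence against gc_grace_seconds is to do the arithmetic. The sketch below is a simplified model, not a Cassandra API: it assumes the worst-case gap between two completed repairs is the scheduling interval plus the repair's run time. The function name and parameters are made up for illustration; 864000 is Cassandra's default gc_grace_seconds.

```python
DEFAULT_GC_GRACE_SECONDS = 864_000  # Cassandra default: 10 days


def repair_fits_gc_grace(interval_days: float,
                         duration_hours: float,
                         gc_grace_seconds: int = DEFAULT_GC_GRACE_SECONDS) -> bool:
    """Return True if one repair cycle completes inside gc_grace_seconds.

    Simplified model (an assumption): data can go unrepaired for at most
    the scheduling interval plus the time the repair itself takes.
    """
    worst_case_gap = interval_days * 86_400 + duration_hours * 3_600
    return worst_case_gap < gc_grace_seconds


if __name__ == "__main__":
    # Weekly repair that takes 12 hours: 648000 s, inside the 10-day default.
    print(repair_fits_gc_grace(interval_days=7, duration_hours=12))   # True
    # Repair every 10 days: the gap alone already uses up gc_grace_seconds.
    print(repair_fits_gc_grace(interval_days=10, duration_hours=1))   # False
```

If a table overrides gc_grace_seconds, pass that table's value instead of the default; the tightest table sets the cadence for the cluster.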

Full repair — the hammer

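Running nodetool repair on every node walks every range that node replicates, computes Merkle trees, and streams the differences between replicas. Expensive: high I/O, and it can take hours. Run it during low-traffic windows. Since Cassandra 2.2 a bare nodetool repair is incremental by default, so pass -full (or --full) explicitly when you want a full repair.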
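
To make the Merkle tree step concrete, here is a toy sketch of the idea: hash rows into token buckets, build a hash tree over the buckets, then walk two trees from the root and descend only where hashes differ. This is an illustration under simplifying assumptions (SHA-256, a power-of-two number of leaves, in-memory dicts standing in for replicas), not Cassandra's actual implementation; every name in it is hypothetical.

```python
import hashlib


def leaf_hashes(rows, num_leaves, lo, hi):
    """Hash the rows in each of num_leaves equal-width token buckets over [lo, hi)."""
    width = (hi - lo) // num_leaves
    buckets = [hashlib.sha256() for _ in range(num_leaves)]
    for token in sorted(rows):
        idx = min((token - lo) // width, num_leaves - 1)
        buckets[idx].update(f"{token}:{rows[token]}".encode())
    return [b.digest() for b in buckets]


def build_tree(leaves):
    """Return tree levels, leaves first and root last (len(leaves) must be a power of two)."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([hashlib.sha256(prev[i] + prev[i + 1]).digest()
                       for i in range(0, len(prev), 2)])
    return levels


def differing_leaves(a, b):
    """Walk both trees top-down, descending only into subtrees whose hashes differ."""
    top = len(a) - 1
    if a[top][0] == b[top][0]:
        return []                      # roots match: replicas agree on this range
    frontier = [0]
    for level in range(top, 0, -1):
        frontier = [child
                    for node in frontier
                    for child in (2 * node, 2 * node + 1)
                    if a[level - 1][child] != b[level - 1][child]]
    return frontier                    # indices of leaf buckets that need streaming


if __name__ == "__main__":
    replica_a = {5: "a", 25: "b", 47: "c", 71: "d"}
    replica_b = {5: "a", 25: "b", 47: "stale"}          # one stale row, one missing row
    tree_a = build_tree(leaf_hashes(replica_a, 8, 0, 80))
    tree_b = build_tree(leaf_hashes(replica_b, 8, 0, 80))
    print(differing_leaves(tree_a, tree_b))             # [4, 7]
```

Only the buckets that differ get streamed, which is why repair cost tracks how far replicas have diverged, on top of the fixed cost of reading the data to build the trees.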

Incremental repair (4.0+)

Incremental repair marks SSTables as 'repaired' so future repairs skip them. Massive speedup after the initial baseline. Caveat: incremental repair before 4.0 had bugs that left clusters in worse shape, so only trust it on 4.0+. Run weekly.
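
The speedup is easiest to see in a toy model. The sketch below assumes each SSTable carries a repaired-at marker where 0 means unrepaired, and that a run only has to validate unrepaired SSTables. It ignores anticompaction and everything else a real repair does; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SSTable:
    name: str
    size_gb: float
    repaired_at: int = 0   # 0 = unrepaired (toy stand-in for SSTable metadata)


def incremental_repair(sstables, now):
    """Validate only unrepaired SSTables, mark them repaired, return GB scanned."""
    pending = [s for s in sstables if s.repaired_at == 0]
    scanned = sum(s.size_gb for s in pending)
    for s in pending:
        s.repaired_at = now
    return scanned


if __name__ == "__main__":
    tables = [SSTable("big-1", 100.0), SSTable("big-2", 50.0)]
    print(incremental_repair(tables, now=1))   # 150.0 -> the initial baseline
    tables.append(SSTable("new-3", 5.0))       # a week of fresh writes
    print(incremental_repair(tables, now=2))   # 5.0 -> only the new data
```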

Subrange repair

Subrange repair splits the token range into smaller chunks and repairs one chunk at a time, using nodetool repair with -st (start token) and -et (end token). Lower steady-state I/O. Tools: Cassandra Reaper, which schedules subrange repair across the cluster and pauses on overload. Best practice for large clusters.
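
For a sense of what a scheduler does under the hood, here is a minimal sketch that splits a token range into equal chunks and builds one nodetool repair command per chunk. It assumes the start and end tokens describe a range the target node actually owns; the keyspace name my_ks and both helper functions are hypothetical, and the commands are only built, not executed. Reaper's real segment logic is more involved.

```python
def split_range(start: int, end: int, pieces: int):
    """Split the token range (start, end] into contiguous (start, end] subranges."""
    if pieces < 1 or start >= end:
        raise ValueError("need pieces >= 1 and start < end")
    span = end - start
    bounds = [start + span * i // pieces for i in range(pieces + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def repair_commands(keyspace: str, start: int, end: int, pieces: int):
    """Build one full-repair nodetool invocation per subrange (argv lists)."""
    return [["nodetool", "repair", "--full",
             "-st", str(s), "-et", str(e), keyspace]
            for s, e in split_range(start, end, pieces)]


if __name__ == "__main__":
    print(split_range(0, 100, 4))   # [(0, 25), (25, 50), (50, 75), (75, 100)]
    for cmd in repair_commands("my_ks", 0, 100, 2):
        print(" ".join(cmd))
    # nodetool repair --full -st 0 -et 50 my_ks
    # nodetool repair --full -st 50 -et 100 my_ks
```

Running the chunks one at a time, with a pause between them, is what keeps steady-state I/O low; more pieces means smaller Merkle trees and less streaming per step, at the cost of more coordination overhead.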

Don't forget materialized views

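MVs are written through a separate path, so repairing the base table does not repair the view; repair them too. If you use nodetool repair -pr, run it on every node, since -pr covers only that node's primary ranges. Subtle: if the base table is repaired but the MV isn't, reads through the MV can return stale or missing rows even though the base table is correct.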

Use Reaper for subrange + incremental repair on 4.0+. Weekly cadence. Treat skipped repair as a future outage you've scheduled.