Consistency level (CL) is one of the few knobs you set per query. Used well, it lets one cluster serve workloads with very different consistency needs. Used badly, it lets you ship 'mostly works' code that breaks during a single node restart. Here are five concrete cases, each with the right answer.

1. User-facing read-after-write (auth tokens)

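Write CL=QUORUM, Read CL=QUORUM. With R replicas answering a read, W replicas acknowledging a write, and replication factor RF, R+W>RF means every read overlaps at least one replica that holds the latest acknowledged write, so the user always sees their write. ~3-5ms p99 in a single DC. Don't use ONE: the next read can land on a stale replica.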
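The R+W>RF rule is plain arithmetic, so it can be checked before picking a CL pair. The sketch below is only a model of that rule, not driver code; the function names `quorum` and `read_overlaps_write` are made up for illustration.

```python
def quorum(rf: int) -> int:
    """Replicas needed for a quorum: a strict majority of RF."""
    return rf // 2 + 1


def read_overlaps_write(rf: int, write_acks: int, read_acks: int) -> bool:
    """True when every read set must share a replica with every write set."""
    return read_acks + write_acks > rf


rf = 3
q = quorum(rf)  # 2 of 3 replicas

# QUORUM write + QUORUM read: 2 + 2 > 3, so the read sees the write.
print(read_overlaps_write(rf, write_acks=q, read_acks=q))

# ONE write + ONE read: 1 + 1 <= 3, so the read may hit a stale replica.
print(read_overlaps_write(rf, write_acks=1, read_acks=1))
```

With RF=3, QUORUM/QUORUM passes the check and ONE/ONE does not, which is the whole argument for case 1.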

2. Multi-DC reads (low-latency reads in user's region)

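Write CL=LOCAL_QUORUM in the writer's DC, with asynchronous replication to the other DCs. Read CL=LOCAL_QUORUM in the user's DC. User reads stay local (10-30ms vs 100-300ms cross-region). A read served from a different DC than the one that took the write can briefly return stale data until replication catches up. Acceptable for most product workloads.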
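To see why LOCAL_QUORUM keeps requests inside one region, it helps to count the acknowledgements each level needs. The following is a rough model under assumed settings: the DC names, the per-DC replication factors, and the helper `acks_required` are all hypothetical.

```python
from typing import Mapping


def quorum(rf: int) -> int:
    """Strict majority of rf replicas."""
    return rf // 2 + 1


def acks_required(cl: str, rf_by_dc: Mapping[str, int], local_dc: str) -> int:
    """Replica acknowledgements a coordinator in local_dc waits for."""
    if cl == "LOCAL_QUORUM":
        # Majority of the replicas in the coordinator's own DC only.
        return quorum(rf_by_dc[local_dc])
    if cl == "QUORUM":
        # Majority of all replicas across every DC.
        return quorum(sum(rf_by_dc.values()))
    if cl == "EACH_QUORUM":
        # A majority in every DC.
        return sum(quorum(rf) for rf in rf_by_dc.values())
    raise ValueError(f"unsupported consistency level: {cl}")


# Assumed topology: two DCs with RF=3 each.
rf_by_dc = {"us_east": 3, "eu_west": 3}
for cl in ("LOCAL_QUORUM", "QUORUM", "EACH_QUORUM"):
    print(cl, acks_required(cl, rf_by_dc, local_dc="us_east"))
```

In this assumed topology LOCAL_QUORUM needs 2 acknowledgements, all obtainable locally, while QUORUM needs 4 of 6 and therefore always waits on at least one cross-region replica.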

3. Telemetry ingest (write-once, low replay risk)

Write CL=ONE or LOCAL_ONE. Read CL=ONE. Maximum throughput, minimum latency. You may lose some writes during a node failure — fine for clickstream/metrics. Don't use for billing events.

4. Idempotency keys (must be unique)

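Use lightweight transactions (LWT): conditional writes (IF NOT EXISTS) whose Paxos phase runs at serial consistency SERIAL, plus CL=SERIAL reads. SERIAL is not a regular write CL; the commit of the conditional write still uses a normal level such as QUORUM. ~4x slower than a plain write; use only on the unique-check path. Once the key is accepted, downstream writes are cheap normal QUORUM.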
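The control flow around the unique check can be sketched without a cluster. The code below is an in-memory stand-in, assuming a hypothetical table keyed by idempotency key; `KeyTable.insert_if_not_exists` only imitates what a conditional `INSERT ... IF NOT EXISTS` reports (applied or not, plus the existing row), and `handle_request` is an illustrative name.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class KeyTable:
    """In-memory stand-in for a table written with INSERT ... IF NOT EXISTS."""
    rows: Dict[str, str] = field(default_factory=dict)

    def insert_if_not_exists(self, key: str, request_id: str) -> Tuple[bool, str]:
        # Returns (applied, owner): owner is whoever holds the key afterwards.
        if key in self.rows:
            return False, self.rows[key]
        self.rows[key] = request_id
        return True, request_id


def handle_request(table: KeyTable, key: str, request_id: str,
                   side_effects: List[str]) -> Tuple[bool, str]:
    """Run the expensive unique check once, then do the cheap work."""
    applied, owner = table.insert_if_not_exists(key, request_id)
    if applied:
        # In a real system these would be ordinary QUORUM writes.
        side_effects.append(request_id)
    return applied, owner


table = KeyTable()
effects: List[str] = []
print(handle_request(table, "order-42", "req-A", effects))  # first claim wins
print(handle_request(table, "order-42", "req-B", effects))  # duplicate is rejected
```

Only the first request for a given key performs the side effects; the retry gets back the original owner and can return the earlier result instead of repeating the work.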

5. Distributed leader election

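LWT with TTL on a leader_lock row. The winning conditional insert (IF NOT EXISTS) sets leader_id; the others see their compare-and-set (CAS) fail. The leader must refresh the row before the TTL expires, or the lock lapses and another node can take it. Don't try this with normal CL: race conditions are guaranteed.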
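The acquire-and-refresh cycle is easier to reason about as a small state machine. This is a single-process model only, assuming an injected clock; `LeaseTable`, `try_acquire`, and `refresh` are hypothetical names that mimic a conditional insert with TTL and a conditional update guarded by `leader_id`, not a real client API.

```python
from typing import Callable, Optional, Tuple


class LeaseTable:
    """Models one leader_lock row with a TTL and compare-and-set writes."""

    def __init__(self, clock: Callable[[], float]) -> None:
        self._clock = clock
        self._row: Optional[Tuple[str, float]] = None  # (leader_id, expires_at)

    def _live_row(self) -> Optional[Tuple[str, float]]:
        # A row whose TTL has elapsed behaves as if it were deleted.
        if self._row is not None and self._row[1] > self._clock():
            return self._row
        return None

    def leader(self) -> Optional[str]:
        row = self._live_row()
        return row[0] if row else None

    def try_acquire(self, node_id: str, ttl: float) -> bool:
        """Models a conditional insert: applied only if no live row exists."""
        if self._live_row() is not None:
            return False
        self._row = (node_id, self._clock() + ttl)
        return True

    def refresh(self, node_id: str, ttl: float) -> bool:
        """Models a conditional update: applied only if node_id still leads."""
        row = self._live_row()
        if row is None or row[0] != node_id:
            return False
        self._row = (node_id, self._clock() + ttl)
        return True


now = [0.0]
lock = LeaseTable(lambda: now[0])
print(lock.try_acquire("node-a", ttl=10))  # node-a wins the empty row
print(lock.try_acquire("node-b", ttl=10))  # node-b loses the CAS
now[0] = 20.0                              # node-a never refreshed
print(lock.try_acquire("node-b", ttl=10))  # lease expired, node-b takes over
```

A leader that gets a failed `refresh` must stop acting as leader immediately, since another node may already hold the row.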

Per-query CL is the feature. Match it to the workload's actual consistency need, not blanket QUORUM for everything.