If your producer pushes 1000 messages/sec and your consumer only processes 100/sec, something has to give. That "something" is backpressure — a signal flowing UPSTREAM from the slow consumer to the fast producer. Get this wrong and you OOM the server or drop messages silently.

Advertisement

The three strategies

BLOCK — the producer stops writing until the consumer catches up. Safest, but couples performance. DROP — discard messages when the buffer fills (head-drop, tail-drop, or priority-aware). BUFFER — accept everything and grow the queue. Eventually OOMs unless bounded.

HTTP/2 flow control (built-in)

Each HTTP/2 stream has a flow-control window. When the consumer doesn't send WINDOW_UPDATE frames, the producer's TCP send buffer fills and writes block. This is BLOCK strategy by default — works correctly if your application reads from the stream regularly.

Advertisement

Application-layer DROP

from collections import deque
from threading import Lock

class DropOldestQueue:
    def __init__(self, maxlen=1000):
        self.q = deque(maxlen=maxlen)   # auto-drops on overflow
        self.lock = Lock()
    def put(self, msg):
        with self.lock:
            self.q.append(msg)            # silently drops oldest
    def get(self):
        with self.lock:
            return self.q.popleft() if self.q else None

Reactive streams (RxJava / Project Reactor)

Reactive frameworks formalize backpressure with request(N) calls — the subscriber tells the publisher exactly how many items it can take. This gives clean DROP/BUFFER semantics and is the most flexible pattern when consumers vary in speed.

Monitoring is mandatory

Track queue_depth as a gauge, messages_dropped_total as a counter, and consumer_lag_seconds as a histogram. If you can't see backpressure happening, you can't tune it.

Pick BLOCK, DROP, or BUFFER explicitly and instrument it. Silent message loss in production is the worst outcome.