Throughput vs. Latency: The Highway Analogy

Understanding the fundamental trade-offs in system design.

In the world of concurrency and distributed systems, two metrics reign supreme: Throughput and Latency. While often discussed together, they represent distinct aspects of system performance, and optimizing for one often comes at the cost of the other.

Definitions

Latency: The time it takes for a single unit of work (a request, a packet, a car) to travel from start to finish. It is measured in time units (milliseconds, seconds).
Throughput: The number of units of work that can be processed per unit of time. It is measured as a rate (requests/sec, bits/sec).
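
As a rough illustration of the two definitions, the sketch below times a batch of sequential calls in Python. The helper name `measure` and the 10 ms `time.sleep` standing in for real work are assumptions made for illustration, not part of any particular system.

```python
import time

def measure(work, n_requests):
    """Run `work` n_requests times, one after another.

    Returns (mean latency in seconds, throughput in requests/second).
    `measure` is a hypothetical helper for this example.
    """
    latencies = []
    start = time.perf_counter()
    for _ in range(n_requests):
        t0 = time.perf_counter()
        work()
        latencies.append(time.perf_counter() - t0)  # latency of this one request
    elapsed = time.perf_counter() - start
    mean_latency = sum(latencies) / n_requests
    throughput = n_requests / elapsed               # requests completed per second
    return mean_latency, throughput

# Stand-in workload (assumption): each "request" takes roughly 10 ms.
mean_latency, throughput = measure(lambda: time.sleep(0.01), 5)
print(f"mean latency: {mean_latency * 1000:.1f} ms, throughput: {throughput:.1f} req/s")
```

Because the calls here run one at a time, throughput cannot exceed 1 / latency. The two metrics only move independently once work is processed concurrently.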

The Highway Analogy

Imagine a highway.

Latency is how long it takes a single car to drive from Point A to Point B. To reduce latency (go faster), you increase the speed limit.
Throughput is how many cars pass a specific point on the highway every minute. To increase throughput, you add more lanes.

Interestingly, pushing more traffic onto the highway than its lanes can carry (an attempt at higher throughput) leads to congestion, which drastically increases the time it takes any single car to arrive (higher latency).

Interactive Visualization

Below is a simulation of a highway. You can adjust the number of lanes (capacity) and the arrival rate of cars (load). Observe how increasing the load impacts the speed of individual cars once the capacity is reached.
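
For readers who want to reproduce the effect in code, here is a minimal sketch of a comparable toy model in Python. It is not the simulation described above: the function name `simulate_highway`, the fixed arrival pattern, and the rule that each lane passes exactly one car per tick are all simplifying assumptions.

```python
from collections import deque

def simulate_highway(lanes, arrivals_per_tick, ticks):
    """Discrete-time toy model (assumption: each lane passes one car per tick).

    Returns (throughput in cars/tick, mean latency in ticks).
    """
    queue = deque()   # arrival tick of every car still waiting
    latencies = []    # time in system for every car that got through
    for tick in range(ticks):
        # Load: new cars join the back of the queue.
        for _ in range(arrivals_per_tick):
            queue.append(tick)
        # Capacity: each lane serves at most one waiting car this tick.
        for _ in range(min(lanes, len(queue))):
            arrived = queue.popleft()
            latencies.append(tick - arrived + 1)  # +1: the crossing itself takes a tick
    throughput = len(latencies) / ticks
    mean_latency = sum(latencies) / len(latencies) if latencies else 0.0
    return throughput, mean_latency

under_capacity = simulate_highway(lanes=3, arrivals_per_tick=2, ticks=100)
over_capacity = simulate_highway(lanes=3, arrivals_per_tick=4, ticks=100)
print("under capacity:", under_capacity)
print("over capacity: ", over_capacity)
```

With 3 lanes and 2 arrivals per tick, every car gets through in a single tick. At 4 arrivals per tick, throughput is capped at 3 cars per tick while the queue grows by one car every tick, so the mean latency over 100 ticks rises to 13.5 ticks and keeps climbing the longer the run.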

Little's Law

These concepts are mathematically related by Little's Law:

L = λ * W

Where:

L is the average number of items in the system (Concurrency).
λ (lambda) is the average arrival rate. In a stable system, where items leave as fast as they arrive, this equals the throughput.
W is the average time an item spends in the system (Latency).
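
A short worked example may make the relationship concrete. The numbers below (200 requests per second, 50 ms average time in the system, a pool of 10 workers) are made up for illustration, and the function names are hypothetical.

```python
def items_in_system(arrival_rate, time_in_system):
    """Little's Law: L = lambda * W."""
    return arrival_rate * time_in_system

def max_throughput(concurrency_limit, time_in_system):
    """Little's Law rearranged: lambda = L / W."""
    return concurrency_limit / time_in_system

# Assumed workload: 200 requests/sec, each spending 0.05 s in the system.
in_flight = items_in_system(200, 0.05)   # about 10 requests in flight on average

# Assumed limit: 10 workers, each request occupying a worker for 0.05 s.
ceiling = max_throughput(10, 0.05)       # about 200 requests/sec at most

print(f"in flight: {in_flight:.1f}, throughput ceiling: {ceiling:.1f} req/s")
```

Read this way, Little's Law says that with concurrency fixed, the only way to raise throughput is to lower latency. With latency fixed, higher throughput requires more concurrency (more lanes).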

Summary

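When designing systems, decide which metric is more critical for your user experience. A real-time trading system needs low latency, because every order must be handled as quickly as possible. A batch processing system for payroll needs high throughput, because what matters is how many records are processed before the deadline, not how long any single record takes.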