Throughput vs. Latency: The Highway Analogy
Understanding the fundamental trade-offs in system design.
In the world of concurrency and distributed systems, two metrics reign supreme: Throughput and Latency. While often discussed together, they represent distinct aspects of system performance, and optimizing for one often comes at the cost of the other.
Definitions
- Latency: The time it takes for a single unit of work (a request, a packet, a car) to travel from start to finish. It is measured in time units (milliseconds, seconds).
- Throughput: The number of units of work that can be processed per unit of time. It is measured as a rate (requests/sec, bits/sec).
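To make the distinction concrete, here is a minimal Python sketch that measures both metrics for a serial workload. The names `measure`, `handler`, and `requests` are illustrative placeholders, not part of any particular framework.

```python
import time

def measure(handler, requests):
    """Measure per-request latency and overall throughput for a serial workload."""
    latencies = []
    start = time.perf_counter()
    for req in requests:
        t0 = time.perf_counter()
        handler(req)                                # process one unit of work
        latencies.append(time.perf_counter() - t0)  # latency of this request
    elapsed = time.perf_counter() - start

    avg_latency_ms = 1000 * sum(latencies) / len(latencies)
    throughput_rps = len(requests) / elapsed        # units of work per second
    return avg_latency_ms, throughput_rps

# Illustrative handler that takes roughly 5 ms per request.
latency_ms, rps = measure(lambda r: time.sleep(0.005), range(100))
print(f"avg latency: {latency_ms:.1f} ms, throughput: {rps:.0f} req/s")
```

Because this loop processes one request at a time, throughput is roughly the inverse of latency; it is concurrency (more workers, more lanes) that lets the two diverge.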
The Highway Analogy
Imagine a highway.
- Latency is how long it takes a single car to drive from Point A to Point B. To reduce latency (go faster), you increase the speed limit.
- Throughput is how many cars pass a specific point on the highway every minute. To increase throughput, you add more lanes.
Interestingly, heavy traffic (an attempt at higher throughput) often causes congestion, which drastically increases the time any single car takes to arrive (high latency).
Interactive Visualization
Below is a simulation of a highway. You can adjust the number of lanes (capacity) and the arrival rate of cars (load). Observe how increasing the load impacts the speed of individual cars once the capacity is reached.
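The interactive widget cannot be reproduced here, but a small queueing sketch in Python captures the same behaviour. The model below is an assumption of mine (fixed travel time per car, each car taking the first lane to free up), intended only to illustrate the dynamics rather than mirror the visualization's actual code.

```python
import random

def simulate_highway(lanes, arrival_rate, n_cars=1000, travel_time=1.0):
    """Toy queueing model of a highway section.

    `lanes` parallel servers, exponential inter-arrival times, and a fixed
    travel_time per car. Returns the average time a car spends waiting plus
    driving. All parameters are illustrative, not calibrated to real traffic.
    """
    free_at = [0.0] * lanes                            # when each lane next frees up
    clock, total_time = 0.0, 0.0
    for _ in range(n_cars):
        clock += random.expovariate(arrival_rate)      # next car arrives
        lane = min(range(lanes), key=lambda i: free_at[i])
        start = max(clock, free_at[lane])              # queue if the lane is busy
        free_at[lane] = start + travel_time
        total_time += free_at[lane] - clock            # waiting + driving
    return total_time / n_cars

# Three lanes can carry at most 3 cars per time unit; watch the average
# time per car climb as the arrival rate approaches that capacity.
for rate in (0.5, 1.5, 2.5, 2.9):
    print(f"arrival rate {rate:.1f}: avg time per car {simulate_highway(3, rate):.2f}")
```

Running it shows that the average time per car stays near the one time unit of free-flow driving until the arrival rate nears the three-car capacity, at which point queueing delay dominates.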
Little's Law
These concepts are mathematically related by Little's Law:
L = λ * W
Where:
- L is the average number of items in the system (Concurrency).
- λ (lambda) is the average arrival rate (Throughput).
- W is the average time an item spends in the system (Latency).
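As a quick worked example (the numbers are made up for illustration): a service that sustains 200 requests/sec while each request spends 50 ms in the system must, on average, be holding 10 requests in flight at once.

```python
def average_in_flight(arrival_rate_rps, avg_latency_s):
    """Little's Law: L = λ * W."""
    return arrival_rate_rps * avg_latency_s

# Illustrative numbers: 200 req/s with 50 ms spent per request
# implies about 10 requests in the system at any moment.
print(average_in_flight(200, 0.050))   # -> 10.0
```

Rearranging the same formula tells you how much concurrency (threads, connections, lanes) a target throughput and latency budget implies.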
Summary
When designing systems, decide what is more critical for your user experience. A real-time trading system needs low latency. A batch processing system for payroll needs high throughput.