Definitions

  • Latency: The time it takes for a single unit of work (a request, a packet, a car) to travel from start to finish. It is measured in time units (milliseconds, seconds).
  • Throughput: The number of units of work that can be processed per unit of time. It is measured as a rate (requests/sec, bits/sec).
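
As a rough illustration of the two definitions, the sketch below computes both metrics from a list of request timestamps. The function name summarize and the sample timestamps are hypothetical, made up for this example rather than taken from any real system.

```python
def summarize(requests):
    """Compute (average latency, throughput) from (start_s, end_s) pairs.

    `requests` is a hypothetical log: one (start, end) timestamp pair,
    in seconds, per completed unit of work.
    """
    latencies = [end - start for start, end in requests]
    avg_latency = sum(latencies) / len(latencies)

    # Throughput is completed work divided by the observed time window.
    first_start = min(start for start, _ in requests)
    last_end = max(end for _, end in requests)
    throughput = len(requests) / (last_end - first_start)
    return avg_latency, throughput


if __name__ == "__main__":
    # Four requests handled one after another, 2 s each.
    sequential = [(0, 2), (2, 4), (4, 6), (6, 8)]
    # The same four requests, but overlapping in time.
    overlapping = [(0, 2), (1, 3), (2, 4), (3, 5)]

    print(summarize(sequential))   # (2.0, 0.5)
    print(summarize(overlapping))  # (2.0, 0.8)
```

Both runs have the same latency per request (2 seconds), but the overlapping run finishes more requests per second, which shows that the two metrics can move independently.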

The Highway Analogy

Imagine a highway.

  • Latency is how long it takes a single car to drive from Point A to Point B. To reduce latency (go faster), you increase the speed limit.
  • Throughput is how many cars pass a specific point on the highway every minute. To increase throughput, you add more lanes.

The two are not independent, however. When the number of cars trying to use the highway approaches or exceeds what its lanes can carry, congestion sets in, and the time it takes any single car to arrive rises sharply (high latency).

Interactive Visualization

Below is a simulation of a highway. You can adjust the number of lanes (capacity) and the arrival rate of cars (load). Observe how increasing the load impacts the speed of individual cars once the capacity is reached.
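
A minimal text-only sketch of such a simulation is shown here, under simplifying assumptions that are not from the original: time advances in discrete ticks, a fixed number of cars arrives every tick, and each lane lets exactly one car through per tick. The function simulate and its parameters are hypothetical names chosen for this example.

```python
from collections import deque


def simulate(lanes, arrivals_per_tick, ticks=100):
    """Return (throughput in cars/tick, average latency in ticks).

    Assumptions (illustrative only): arrivals are perfectly regular,
    each lane passes one car per tick, and cars wait in a single
    first-in, first-out queue.
    """
    queue = deque()   # arrival tick of every car still waiting
    latencies = []    # latency of every car that got through

    for now in range(ticks):
        # New cars join the back of the queue.
        queue.extend([now] * arrivals_per_tick)

        # Each lane lets at most one waiting car through this tick.
        for _ in range(min(lanes, len(queue))):
            arrived = queue.popleft()
            latencies.append(now - arrived + 1)  # waiting time + 1 tick to pass

    throughput = len(latencies) / ticks
    avg_latency = sum(latencies) / len(latencies) if latencies else 0.0
    return throughput, avg_latency


if __name__ == "__main__":
    for load in (1, 2, 3, 4, 5):
        throughput, avg_latency = simulate(lanes=3, arrivals_per_tick=load)
        print(f"load={load} throughput={throughput:.1f} latency={avg_latency:.1f}")
```

With 3 lanes, throughput tracks the load until it reaches 3 cars per tick and then stays flat, while average latency starts to climb because the queue keeps growing. Cars still waiting when the run ends are not counted, so the reported latency understates the true delay under overload. Real traffic arrives in bursts, so in practice latency tends to rise before capacity is fully reached.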

Little's Law

These concepts are mathematically related by Little's Law:

L = λ * W

Where:

  • L is the average number of items in the system (Concurrency).
  • λ (lambda) is the average arrival rate, which in a stable system equals the rate at which items leave (Throughput).
  • W is the average time an item spends in the system (Latency).
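
The formula can be rearranged to solve for whichever quantity is unknown. The sketch below does this with made-up numbers; the function names and the sample values are assumptions for illustration only.

```python
def concurrency(arrival_rate, avg_time_in_system):
    """Little's Law: L = λ * W."""
    return arrival_rate * avg_time_in_system


def latency(items_in_system, arrival_rate):
    """Little's Law rearranged: W = L / λ."""
    return items_in_system / arrival_rate


if __name__ == "__main__":
    # Hypothetical service: 100 requests/sec arrive, each spends 0.25 s inside.
    print(concurrency(100, 0.25))  # 25.0 requests in flight on average

    # Hypothetical reverse case: 25 requests in flight at 100 requests/sec.
    print(latency(25, 100))        # 0.25 seconds per request
```

One practical reading: if the system can only hold a fixed number of items at once (L is capped), then pushing the arrival rate higher can only be absorbed by a lower time per item, otherwise work starts to queue.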