[Answered] How can you calculate p99 latency?

Answer

P99 latency is a term used in performance testing to denote the time below which 99% of the network requests are served. It's a good way to understand the tail-end behavior of your system, as it provides insights beyond just average or median latencies, which can often mask outliers.

To calculate P99 latency, you have to collect all request latencies, sort them in ascending order, then find the value below which 99% of the measures fall.

For instance, assuming latencies is an array storing all latencies in milliseconds:

latencies = [...]   # Array of latencies

# Sort latencies in ascending order
latencies.sort()

# Get the index for 99th percentile
p99_index = int(len(latencies) * 0.99)

# Retrieve the latency at that index
p99_latency = latencies[p99_index]

In this Python code snippet, we first sort the latencies, then find the index corresponding to the 99th percentile (length of the list times 0.99). The p99 latency is the value at this index in the sorted list.

However, in production environments where latencies are gathered over multiple instances and long periods of time, you might use statistical approximation algorithms, like t-digest or HDR Histogram, as sorting large amounts of data may not be feasible due to resource constraints.

Question: How can you calculate p99 latency?

Answer

Was this content helpful?

Next Steps

Other Common Database Performance Questions (and Answers)

Free System Design on AWS E-Book

Download this early release of O'Reilly's latest cloud infrastructure e-book: System Design on AWS.

Switch & save up to 80%