Question: How can you calculate p99 latency?


P99 latency is a term used in performance testing to denote the time below which 99% of the network requests are served. It's a good way to understand the tail-end behavior of your system, as it provides insights beyond just average or median latencies, which can often mask outliers.

To calculate P99 latency, you have to collect all request latencies, sort them in ascending order, then find the value below which 99% of the measures fall.

For instance, assuming latencies is an array storing all latencies in milliseconds:

latencies = [...] # Array of latencies # Sort latencies in ascending order latencies.sort() # Get the index for 99th percentile p99_index = int(len(latencies) * 0.99) # Retrieve the latency at that index p99_latency = latencies[p99_index]

In this Python code snippet, we first sort the latencies, then find the index corresponding to the 99th percentile (length of the list times 0.99). The p99 latency is the value at this index in the sorted list.

However, in production environments where latencies are gathered over multiple instances and long periods of time, you might use statistical approximation algorithms, like t-digest or HDR Histogram, as sorting large amounts of data may not be feasible due to resource constraints.

Was this content helpful?

White Paper

Free System Design on AWS E-Book

Download this early release of O'Reilly's latest cloud infrastructure e-book: System Design on AWS.

Free System Design on AWS E-Book

Start building today

Dragonfly is fully compatible with the Redis ecosystem and requires no code changes to implement.