Question: What is the difference between p95 and p99 latency in performance metrics?


"Latency" in computing refers to the delay before a transfer of data begins following an instruction for its transfer. It's a crucial aspect of understanding system performance and user experience, especially in distributed systems.

When we talk about "p95" and "p99" latencies, we are referring to percentile latencies. These numbers represent the response time at or below which 95% and 99% of requests complete, respectively.

  • p95 Latency: This value indicates that 95% of the requests were processed faster than this latency, and only 5% had a higher latency. In other words, 95 out of 100 requests have a latency equal to or lower than this value.
  • p99 Latency: Similarly, 99% of requests were processed faster than this value, and only 1% had a higher latency. It means that 99 out of 100 requests have a latency lower than or equal to the p99 latency.
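
To make these definitions concrete, here is a small worked example using the "nearest-rank" percentile method on a synthetic dataset (the values and the choice of method are illustrative; other percentile definitions, such as linear interpolation, give slightly different results):

```python
# Synthetic sample: 100 latency measurements of 1 ms, 2 ms, ..., 100 ms.
latencies_ms = sorted(range(1, 101))

# Nearest-rank method: the p95 value is the 95th of 100 sorted
# measurements, so 95% of requests completed at or below it.
p95 = latencies_ms[int(0.95 * len(latencies_ms)) - 1]
p99 = latencies_ms[int(0.99 * len(latencies_ms)) - 1]

print(p95, p99)  # 95 99
```

With this dataset the p95 latency is 95 ms and the p99 latency is 99 ms: 5 requests were slower than the p95 value, but only 1 was slower than the p99 value.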

It's worth mentioning that p99 latency represents more extreme outliers in your system's performance than p95 latency. Thus, if you're optimizing for the best performance under peak conditions, paying attention to p99 latency can be more important because it helps ensure good experiences even for those users who might otherwise have unusually long wait times.
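
The gap between the two metrics shows up clearly when a system has a small number of slow outliers. In the following sketch (the dataset is hypothetical), the p95 latency looks healthy while the p99 latency exposes the slow tail:

```python
import numpy as np

# Hypothetical sample: 97 fast requests and 3 slow outliers (milliseconds).
latencies = [10] * 97 + [500, 800, 1200]

p95 = float(np.percentile(latencies, 95))
p99 = float(np.percentile(latencies, 99))

# p95 stays at 10 ms, but p99 lands in the hundreds of milliseconds,
# revealing the outliers that p95 hides.
print(p95, p99)
```

This is why teams with strict reliability goals often alert on p99 (or even p99.9) rather than p95 alone.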

For example, consider a simple method for measuring request latency in Python using the time module:

```python
import time

def measure_latency(func):
    start_time = time.time()
    func()
    end_time = time.time()
    latency = end_time - start_time
    return latency
```
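
In practice, you might wrap this measurement in a decorator so that every call to an instrumented function records its latency automatically. A minimal sketch, where the `record` list and `timed` decorator are illustrative names rather than part of any library:

```python
import time
from functools import wraps

record = []  # collected latency samples, in seconds

def timed(func):
    """Decorator that appends each call's wall-clock latency to `record`."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        # perf_counter is monotonic, so it is better suited to interval
        # timing than time.time(), which can jump if the clock is adjusted.
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            record.append(time.perf_counter() - start)
    return wrapper

@timed
def handle_request():
    time.sleep(0.01)  # stand-in for real request-handling work

for _ in range(5):
    handle_request()

print(len(record))  # 5 samples collected
```

The samples accumulated in `record` are exactly the kind of list you would then feed into a percentile calculation.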

You could collect these latencies over time for all requests, then calculate the p95 and p99 latencies like so:

```python
import numpy as np

# Assume `latencies` is a list of latency measurements
p95_latency = np.percentile(latencies, 95)
p99_latency = np.percentile(latencies, 99)
```

In this code, np.percentile() calculates the desired percentile value (in our case, p95 or p99) from the given list of latencies.

While both p95 and p99 are useful metrics, they serve different purposes based on your performance optimization goals. In general, monitoring various percentiles can give you a more holistic view of your system's performance.
