Question: What is P99 latency?


"P99 latency" refers to the 99th percentile of latency measurements. In other words, it's a statistical measure indicating that 99% of the latency values fall below this threshold.

In the context of databases or network services, latency is generally defined as the time between a client issuing a request and receiving the response. P99 latency is therefore the threshold that only the slowest 1% of requests exceed: 99% of requests complete faster than this value.

P99 latency is an important metric because it gives insight into the worst-case performance of your system. It helps in detecting and diagnosing occasional outliers that could seriously impact the user experience even though they might not affect the average latency significantly.

For instance, consider a service where most requests are processed in 200 milliseconds (ms), but 1% take up to 5 seconds. The average latency might still look acceptable (0.99 × 200 ms + 0.01 × 5,000 ms = 248 ms), but users who hit the 5-second delays could find the service unacceptable.
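To make this concrete, here is a minimal sketch in Python that computes the average and the P99 (using the nearest-rank percentile definition) over a synthetic set of latency samples. The numbers are made up for illustration: roughly 1.5% of requests are given a 5-second latency so that the slow tail clearly shows up at the 99th percentile.

```python
import math
import random

# Synthetic sample: ~98.5% of requests near 200 ms, ~1.5% at 5 s.
random.seed(42)
latencies_ms = [random.gauss(200, 20) for _ in range(985)] + [5000.0] * 15

def percentile(values, pct):
    """Nearest-rank percentile: the smallest value such that at least
    pct% of all samples are less than or equal to it."""
    ordered = sorted(values)
    rank = math.ceil(pct / 100 * len(ordered)) - 1
    return ordered[rank]

avg = sum(latencies_ms) / len(latencies_ms)
p99 = percentile(latencies_ms, 99)
print(f"average: {avg:.0f} ms, P99: {p99:.0f} ms")
```

The average lands around 270 ms and looks healthy, while the P99 reports the full 5,000 ms tail, which is exactly the outlier behavior this metric is designed to expose.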

Here's how you might measure P99 latency in practice using a PromQL query. Prometheus is a widely used open-source monitoring and alerting toolkit:

histogram_quantile(0.99, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))

This query estimates the 99th percentile (0.99) of request durations from the http_request_duration_seconds histogram. The rate() function computes the per-second increase of each bucket counter over a 5-minute window, and the sum ... by (le) aggregates those rates across series while preserving the bucket boundaries (le) that histogram_quantile needs.
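Under the hood, histogram_quantile works from cumulative bucket counts and linearly interpolates within the bucket that contains the target rank. The following Python sketch mimics that estimation; the bucket boundaries and counts are made-up illustrative numbers, and it omits details the real function handles (such as the +Inf bucket and empty histograms).

```python
def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: list of (upper_bound_seconds, cumulative_count),
    sorted by upper bound, as in a Prometheus '_bucket' series.
    """
    total = buckets[-1][1]
    target = q * total  # rank of the observation we want to locate
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= target:
            # Linear interpolation inside this bucket, as Prometheus does.
            fraction = (target - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return buckets[-1][0]

# Hypothetical buckets: 1,000 requests total, 990th falls in (0.5s, 1.0s].
buckets = [(0.1, 600), (0.25, 900), (0.5, 985), (1.0, 995), (5.0, 1000)]
print(histogram_quantile(0.99, buckets))  # → 0.75
```

Because the result is interpolated within a bucket, the accuracy of a histogram-based P99 depends on how finely the bucket boundaries are chosen around the latencies you care about.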

Load-testing tools like Apache JMeter or Gatling can also be used to simulate load on a service and record response times, which can then be analyzed to find the P99 latency.
