This blog post covers monitoring in-memory datastores, focusing on Dragonfly. Learn how to use Prometheus and Grafana to monitor Dragonfly metrics and visualize them on a dashboard. Explore memory consumption, client-side metrics, server metrics and more.
June 21, 2023
Monitoring in-memory datastores requires a different approach than monitoring traditional disk-based databases. Since in-memory datastores store data in RAM, there is a greater risk of the datastore exceeding its allocated memory limit. When this happens, the datastore may have to evict data (e.g., Redis evicts keys based on the configured policy, such as LRU), or it may become unstable or crash, leading to data loss or downtime.
Therefore, tracking memory consumption is one of the key focus areas when monitoring in-memory datastores. However, there are other important metrics to be considered as well, including client-side metrics that provide insight into how client applications are using the datastore as well as other server-side metrics (for example, ones related to CPU usage).
Dragonfly is a modern in-memory datastore that implements novel algorithms and data structures on top of a multi-threaded, shared-nothing architecture. Thanks to its API compatibility, Dragonfly can act as a drop-in replacement for Redis. At the time of writing, Dragonfly has implemented more than 200 Redis commands which represents good coverage for the vast majority of use cases. Due to Dragonfly's hardware efficiency, you can run a single node instance on a small 8GB instance or scale vertically to large 768GB machines with 64 cores. This greatly reduces infrastructure costs as well as architectural complexity.
A Prometheus Exporter can be used in scenarios where it is not feasible to instrument a given system with Prometheus metrics directly. It collects metrics from a specific system or application and exposes them in a format that Prometheus can scrape and ingest for monitoring and analysis.
The Prometheus Redis Metrics Exporter extracts metrics from Redis databases and makes them available in a Prometheus-compatible format. We won't need it in our case, though, since Dragonfly exposes Prometheus-compatible metrics out of the box, available at http://<dragonfly-host>:6379/metrics by default.
Although 6379 is the default port, you can use the --bind-admin and --bind-admin-port flags to specify an alternate host and port, respectively.
Let's start by setting up a simple environment to monitor Dragonfly. In this demo, we will run a Dragonfly instance along with a Prometheus instance to collect metrics and Grafana to visualize them.
Before you begin, make sure you have the following installed: Docker, Docker Compose and Redis CLI.
To start with, let's save the Prometheus scrape configuration in a file called prometheus.yml. This file will be mounted into the Prometheus container later.
scrape_configs:
  - job_name: dragonfly_metrics
    static_configs:
      - targets: ['dragonfly:6379']
Next, we will create a docker-compose.yml file to define the services we need.
Use Docker Compose to start Dragonfly, Prometheus and Grafana:
docker compose -p monitor-dragonfly up
Verify that the containers are running:
docker compose -p monitor-dragonfly ps
For each container, you should see the STATUS as running:
To check the metrics exposed by Dragonfly, navigate to http://localhost:6379/metrics in your browser:
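The same endpoint can also be queried from the command line (assuming the default local port mapping used in this demo):

```shell
# fetch the metrics and show a few Dragonfly-specific samples
curl -s http://localhost:6379/metrics | grep '^dragonfly_' | head -n 20
```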
To see these metrics in Prometheus, navigate to http://localhost:9090. Select Status -> Targets from the left menu:
Search for metrics related to Dragonfly (they are exposed with a dragonfly_ prefix):
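For example, you can chart the client-connection metrics covered later in this post directly in the Prometheus expression browser; other names in the dragonfly_ family can be found by browsing the /metrics output:

```promql
# current number of client connections
dragonfly_connected_clients

# peak connections over the last five minutes (illustrative window)
max_over_time(dragonfly_connected_clients[5m])
```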
Most of these metrics are also available via the INFO command. Now let's dive deeper into some of them.
The dragonfly_connected_clients metric refers to the number of client connections that are currently established with Dragonfly. It includes both active and idle connections and monitoring it over time can provide insight into usage patterns and trends.
Let's connect to the Dragonfly instance using three different redis-cli clients and watch the value of dragonfly_connected_clients:
redis-cli -p 6379
In a few seconds, the value of dragonfly_connected_clients should be 3, and you should see that reflect in Prometheus:
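If you'd rather not open three separate terminals, a loop like the following keeps three connections open in the background; the port and channel name are illustrative:

```shell
# keep three connections open by subscribing each client to a channel
for i in 1 2 3; do
  redis-cli -p 6379 subscribe demo_channel > /dev/null &
done

# the server-side view of the same count
redis-cli -p 6379 info clients | grep connected_clients

# close the connections again
jobs -p | xargs kill
```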
Monitoring this metric can help identify potential performance and scalability issues, especially with clients that are not properly closing connections.

dragonfly_blocked_clients
The dragonfly_blocked_clients metric refers to the number of client connections that are currently blocked on a blocking call to Dragonfly, such as BLPOP, BRPOP, or BRPOPLPUSH.
Let's open a few redis-cli clients and run the following command in each to block on a list:
blpop test_list 15
There should be a spike in the dragonfly_blocked_clients metric.
It should come down to 0 after 15 seconds (since that's the timeout we specified in the blpop command).
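The same experiment can be scripted; this loop spawns three blocked clients at once (port and key name as above):

```shell
# spawn three clients blocked on BLPOP, each with a 15-second timeout
for i in 1 2 3; do
  redis-cli -p 6379 blpop test_list 15 &
done
wait  # returns once the timeouts expire (or earlier if the list receives a push)
```

Pushing an element with redis-cli -p 6379 lpush test_list hello unblocks one of the waiting clients immediately, which you can also observe in the metric.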
Let's look at Dragonfly server-related metrics. For most of these, you can note down the initial value and then perform some operations to observe the change in the metric value. These can be simple client operations using any Redis client, or load tests with the Redis benchmarking tool (e.g., redis-benchmark -t set -r 100000 -n 1000000).
Navigate to the Grafana console at http://localhost:3000/ (use admin/admin as credentials) and start by adding Prometheus as a data source.
From Add data source, select Prometheus.
Enter http://prometheus:9090 as the URL.
Select Save & Test.
Although you can build your own Grafana dashboard, let's leverage a ready-to-use dashboard for now:
To experiment with the dashboard, you can perform simple client operations using any Redis client or the Redis benchmarking tool as below:
redis-benchmark -t set -r 100000 -n 1000000
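To drive more panels on the dashboard, you can also run a broader mix of commands; the parameters below are illustrative:

```shell
# benchmark SET, GET, and LPUSH with 50 parallel clients over a 100k keyspace
redis-benchmark -t set,get,lpush -c 50 -r 100000 -n 500000
```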
You should now see the dashboard with metrics from Dragonfly:
Once you have completed the steps in this tutorial, use this command to stop the Docker containers:
docker compose -p monitor-dragonfly down -v
In this blog post, we explored how to monitor Dragonfly metrics using Prometheus and integrated Prometheus with Grafana to visualize the metrics on a dashboard.