Dragonfly

Using Redis Cluster: Key Features, Tutorial, and Best Practices

Redis Cluster is a distributed implementation of Redis, enabling automatic data sharding across multiple nodes.

September 26, 2025


What Is a Redis Cluster?

Redis Cluster is a distributed implementation of Redis, enabling data sharding across multiple nodes. This provides increased scalability and availability for Redis deployments. Data is divided into 16,384 hash slots, which are then distributed among the available Redis nodes. In practice, a Redis Cluster can support up to 1,000 nodes.

Due to the single-threaded nature of Redis, it is often unable to fully utilize the computing resources of a multi-core server. This often leads to premature horizontal scaling, either across multiple servers or by running multiple Redis instances on a single multi-core server.

Key features of Redis Cluster include:

  • Scalability: Data is automatically sharded across multiple nodes, allowing for horizontal scaling as your data needs grow.
  • High Availability: Redis Cluster uses a combination of primary nodes and replicas to recover from node failures without manual intervention. When a primary node fails, its replica can be promoted to take over, providing continuous service availability.
  • Distribution: Hash slots are used to distribute data across the cluster, ensuring even distribution and simplifying re-sharding when needed.

In this article:

  • Why Use Redis Cluster?
  • Key Features of Redis Cluster
  • Key Redis Cluster Components
  • Redis Cluster vs. Redis Sentinel
  • Tutorial: Create and Use a Redis Cluster
  • Best Practices for Managing Redis Clusters

Why Use Redis Cluster?

There are two common use cases for Redis Cluster: horizontal scalability on multiple server machines and deployment on a single node to make use of multiple cores, due to Redis’s single-threaded nature for data operations. Oftentimes, a Redis Cluster deployment involves both strategies (using multiple servers, each running multiple Redis processes) to fully utilize the hardware. This approach, however, adds significant operational complexity.

Redis Cluster Explained by Architecture Notes

Horizontal Scalability for Large Workloads

Redis Cluster is designed to overcome the limitations of single-instance Redis deployments by enabling horizontal scaling. Traditional Redis is single-threaded for data operations and confined to a single node’s memory and processing capacity. When workloads grow, a standalone Redis server can quickly become a bottleneck.

Redis Cluster distributes data across multiple primary nodes using hash slots, allowing workloads to be spread across multiple servers. This improves both read and write throughput and supports larger datasets.
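To make the slot math concrete, here is a minimal Python sketch of how 16,384 slots could be divided into contiguous, near-equal ranges among N primaries. The `split_slots` function is illustrative only; redis-cli computes similar (but not necessarily identical) default ranges:

```python
TOTAL_SLOTS = 16384  # fixed by the Redis Cluster design

def split_slots(n_primaries: int):
    """Divide the slot space into contiguous, near-equal ranges."""
    base, rem = divmod(TOTAL_SLOTS, n_primaries)
    ranges, start = [], 0
    for i in range(n_primaries):
        count = base + (1 if i < rem else 0)  # spread the remainder
        ranges.append((start, start + count - 1))
        start += count
    return ranges

print(split_slots(3))
# -> [(0, 5461), (5462, 10922), (10923, 16383)]
```

With three primaries, each node owns roughly 5,461 slots; adding a fourth primary means moving about a quarter of the slots (and their keys) onto the new node.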

Local Cluster Deployment to Utilize Multiple Cores

Because Redis processes commands on a single thread, it cannot efficiently take advantage of multi-core CPUs within a single process. Running a Redis Cluster on a single machine allows multiple Redis processes to operate in parallel, each utilizing a separate CPU core. 

This setup mimics horizontal scaling locally and is useful for performance testing or for maximizing CPU usage on powerful machines. Each instance manages a portion of the keyspace, and communication between nodes is handled internally.

Related content -> Redis and Dragonfly Architecture Comparison.


Key Features of Redis Cluster

Data Sharding

Redis Cluster splits data among multiple nodes using a concept called hash slots. Each key is assigned to one of 16,384 slots (by taking the CRC16 of the key modulo 16,384), and these slots are distributed among participating primary nodes. When the cluster expands or shrinks, the system can rebalance the slots between nodes.

HASH_SLOT = CRC16(key) mod 16384

This sharding mechanism enables horizontal scalability; by simply adding new nodes, the cluster’s data storage and throughput can grow. Furthermore, unlike client-side partitioning, Redis Cluster users are relieved from managing key distribution logic in the application layer, which reduces operational complexity and minimizes the risk of human error during cluster growth or rebalancing.
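The slot computation above can be sketched in a few lines of Python. The CRC16 variant Redis Cluster specifies is CRC16-CCITT (XMODEM); this is a reference-style sketch, not an optimized implementation:

```python
def crc16(data: bytes) -> int:
    """CRC16-CCITT (XMODEM), the variant used by Redis Cluster."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            # shift left; XOR with the polynomial on carry-out
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def hash_slot(key: bytes) -> int:
    return crc16(key) % 16384

print(hash_slot(b"foo"))  # -> 12182
```

The key "foo" lands in slot 12182, which matches the redirect output shown in the Redis cluster tutorial. You can cross-check any key against a live cluster with CLUSTER KEYSLOT.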

High Availability and Scalability

Redis Cluster resists failure and supports scaling. Each primary node can have one or more replicas, which serve as hot standbys. If a primary node is down, its replica can be automatically promoted as the new primary, ensuring continuous access to the data and minimal downtime.

Scalability with Redis Cluster is straightforward. New primary nodes are added by redistributing hash slots, and adding replicas increases redundancy. As the demand on your Redis deployment grows, both read and write operations can be distributed over a larger set of hardware, increasing total system throughput.

Data Distribution

Instead of all data residing on a single server, Redis Cluster partitions the keyspace and distributes it over multiple primary nodes. This distribution makes it less likely that any single node becomes a bottleneck and allows the system to handle larger datasets than could fit on a single machine. The hash slot mechanism distributes keys roughly evenly across nodes, assuming key patterns are random. As nodes are added or removed from the cluster, the slots must be manually rebalanced. Keeping hash slots balanced among nodes is crucial to maintaining steady performance.

Note that Redis Cluster supports all standard Redis operations, provided the keys belong to the same hash slot. Single-key commands like GET, SET, and INCR work identically to a standalone Redis instance. However, multi-key operations (such as MGET or transactions involving multiple keys) require all specified keys to reside in the same hash slot.


Key Redis Cluster Components 

Redis Cluster relies on several key components to maintain its distributed architecture and provide high availability and scalability. These components include primary nodes, replica nodes, and hash slots, each playing a critical role in the operation of the cluster.

  • Primary Nodes: Primary nodes hold the actual data in a Redis Cluster and are responsible for processing client requests. Each primary node manages a portion of the hash slots and handles operations on the keys that belong to those slots. When the cluster is first initialized, data is distributed among the primary nodes based on hash slots.
  • Replica Nodes: Replica nodes are copies of primary nodes that provide redundancy and fault tolerance. They maintain near real-time copies of the data stored on their corresponding primary node. If a primary node fails, one of its replicas is promoted to primary, minimizing service disruptions.
  • Hash Slots: Redis Cluster uses a partitioning strategy with 16,384 hash slots. Each key is assigned to a specific hash slot based on CRC16(key) mod 16384. These hash slots are distributed across the primary nodes, ensuring that the data is spread out evenly, assuming key patterns are random. The number of slots is fixed, and the cluster can dynamically adjust the assignment of slots when nodes are added or removed.
  • Cluster Nodes: Cluster nodes refer to both primary and replica nodes in the Redis Cluster. Each node is identified by a unique address and is responsible for a portion of the cluster’s overall workload. The nodes communicate with each other to exchange cluster topology information and manage failover procedures when necessary.
  • Gossip Protocol: Redis Cluster uses a gossip protocol to allow nodes to share information about the state of other nodes in the cluster. This communication helps the cluster remain aware of node failures, the addition of new nodes, and other status changes. The gossip protocol ensures that each node has up-to-date information to maintain cluster integrity and facilitate efficient operation. However, with many nodes, gossip traffic can introduce significant overhead, which is one reason Redis Cluster is practically limited to around 1,000 nodes.

Redis Cluster vs. Redis Sentinel 

Redis Cluster and Redis Sentinel are both solutions designed to improve the availability and scalability of Redis, but they serve different purposes and are suited to different use cases.

Purpose and Architecture

  • Redis Cluster is a distributed Redis implementation that provides data sharding and fault tolerance. It splits the data across multiple nodes (primary and replica) and enables horizontal scaling. Redis Cluster scales out your Redis instance by partitioning data and distributing it over multiple nodes. It also handles node failures by promoting replicas to primaries.
  • Redis Sentinel is a high availability solution (HA) for a single Redis primary instance with one or more replicas. It manages the monitoring, failover, and notification process, ensuring that Redis remains available and operational in case of a primary node failure. Redis Sentinel does not provide data sharding. Instead, it manages failover to ensure minimal downtime.

Scalability

  • Redis Cluster supports horizontal scaling by partitioning data across multiple nodes.
  • Redis Sentinel is primarily an HA solution, although it can scale read operations by allowing replicas to handle them. Note that replication is asynchronous, so replicas may serve slightly stale reads.

Fault Tolerance

  • Redis Cluster can provide fault tolerance by attaching replicas to primary nodes. If a primary node fails, one of its replicas can be automatically promoted as primary.
  • Redis Sentinel provides HA by monitoring Redis instances and performing failover when a primary node fails. Sentinel promotes a replica to primary, and clients are expected to query Sentinel to obtain the updated location of the new primary.

Tutorial: Create and Use a Redis Cluster

Creating and interacting with a Redis Cluster involves several key steps. Follow these instructions to set up your Redis Cluster, use it for basic operations, and perform tasks such as resharding, failover testing, and adding/removing nodes. These instructions are adapted from the Redis documentation.

1. Create a Redis Cluster

Before creating a Redis Cluster, make sure you have multiple Redis instances running. At least three primary nodes are required for a minimal cluster setup. Here’s how to start:

Requirements:

A minimum of three Redis instances running in cluster mode.

Configure each instance by setting the following directives in the redis.conf file:

port 7000
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000

Ensure each Redis instance uses a unique port (e.g., 7000 to 7005 for six nodes) if deploying on the same server. You can create directories named after the port number (e.g., redis-7000, redis-7001, etc.) and place the configuration file inside each directory.
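The per-port directory layout described above can be scripted. This is a hedged sketch: it assumes the six-node, ports 7000-7005 layout from this tutorial and writes the same four directives shown earlier into each directory:

```shell
# Generate one directory + config per node (ports 7000-7005).
for port in 7000 7001 7002 7003 7004 7005; do
  mkdir -p "redis-$port"
  cat > "redis-$port/redis.conf" <<EOF
port $port
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
EOF
done
```

Each instance keeps its own nodes.conf (written by Redis itself, not edited by hand), which is why every node needs a separate working directory.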

Starting Redis Instances:

Start each Redis instance in a separate terminal window using the following command:

$> cd redis-7000
$> redis-server ./redis.conf

Once the Redis instances are running, each assigns itself a unique node ID.

Creating the Cluster:

To create the cluster, use the redis-cli tool. Run the following command to initialize the cluster:

$> redis-cli --cluster create 127.0.0.1:7000 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 --cluster-replicas 1

This command sets up a six-node cluster with three primary nodes and three replicas. Confirm the proposed configuration by typing yes when prompted. You should see a message like [OK] All 16384 slots covered, indicating successful cluster creation.

2. Interact with the Cluster

Once the cluster is created, you can interact with it using a cluster-aware Redis client or the redis-cli tool.

Connecting to the Cluster:

You can connect to any of the cluster nodes using the following command:

$> redis-cli -c -p 7000

The -c flag tells redis-cli to enable cluster mode on the client side. When interacting with the cluster, if a key belongs to a different slot, the client will be redirected to the appropriate node handling that slot.

Example Redirected Commands:

# Assuming 'key_01' is managed by redis-7002.
redis-7000$> set key_01 value
#=> Redirected to slot [12182] located at 127.0.0.1:7002

# Assuming 'key_02' is managed by redis-7000.
redis-7002$> get key_02
#=> Redirected to slot [998] located at 127.0.0.1:7000
#=> "value"
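The redirect mechanism shown above can be modeled in a few lines of Python. Everything here is illustrative: real clients hash with CRC16 mod 16384, while this toy uses a trivial stand-in slot function, and the Node/Cluster classes are hypothetical:

```python
def toy_slot(key: str) -> int:
    return sum(key.encode()) % 16384  # stand-in, NOT the real CRC16

class Cluster:
    def __init__(self):
        self.nodes = []
    def owner(self, slot):
        return next(n for n in self.nodes if slot in n.slots)

class Node:
    def __init__(self, name, slots, cluster):
        self.name, self.slots, self.cluster = name, slots, cluster
        self.data = {}
        cluster.nodes.append(self)
    def get(self, key):
        slot = toy_slot(key)
        if slot not in self.slots:
            # wrong node: reply with a MOVED redirect to the owner
            return ("MOVED", slot, self.cluster.owner(slot))
        return ("OK", self.data.get(key))

def cluster_get(entry_node, key):
    """What a cluster-aware client (redis-cli -c) does: follow MOVED."""
    reply = entry_node.get(key)
    if reply[0] == "MOVED":
        reply = reply[2].get(key)
    return reply

cluster = Cluster()
node_a = Node("A", range(0, 500), cluster)      # owns slots 0-499
node_b = Node("B", range(500, 16384), cluster)  # owns the rest
node_a.data["a"] = "alpha"                      # 'a' hashes into A's range

print(cluster_get(node_b, "a"))  # -> ('OK', 'alpha'), via a redirect
```

Asking node B for a key it does not own yields a MOVED reply; the client then retries against the owning node, exactly as the redis-cli output above shows.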

3. Reshard the Cluster

Resharding moves hash slots from one set of nodes to another. To initiate resharding, use the redis-cli tool:

$> redis-cli --cluster reshard 127.0.0.1:7000

You will be prompted to specify how many slots to move, the source nodes, and the target node. Follow the instructions to redistribute data across the nodes.

4. Test Failover

To test failover, simulate a failure by crashing a primary node. Note that this requires additional setup to attach replica(s) to primary instances.

$> redis-cli -p 7002 debug segfault

Observe how the cluster reacts: the failover mechanism promotes a replica to primary and keeps the cluster operational. You can verify the new topology with redis-cli --cluster check 127.0.0.1:7000.

5. Add or Remove a Node

To add a new node to the cluster, start a new Redis instance with a unique port (e.g., 7006).

Add the node to the cluster using the redis-cli tool by specifying the new node and any existing node of the cluster:

$> redis-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000

To remove a node from the cluster, specify any existing node of the cluster and the node ID of the to-be-removed node:

$> redis-cli --cluster del-node 127.0.0.1:7000 <node-id>

You must ensure the node is empty (no data) before removing it from the cluster.

6. Upgrade Nodes

To upgrade a Redis node in the cluster:

  • Trigger a manual failover using the CLUSTER FAILOVER command to promote a replica.
  • Upgrade the node as you would for a standalone Redis instance.
  • Once the upgrade is complete, trigger another failover to re-promote the upgraded node back as a primary instance.
  • Replicas can be upgraded as well without failover. Just ensure they are attached back to the desired primary instances.

The rest of the shards (primary-replicas) within the cluster can be upgraded the same way until all of them are upgraded.


Best Practices for Managing Redis Clusters 

Here are some useful best practices to consider when working with Redis clusters.

1. Use Hashtags and Key Patterns for Multi-Key Operations

Redis Cluster only allows multi-key operations if all keys are in the same hash slot. This includes transactions (MULTI/EXEC), Lua scripts, and commands like MGET. To ensure co-location of related keys, use hashtags—a substring enclosed in {}—to explicitly define the part of the key used in the hash calculation.

For example, the keys user-profile:{1234} and user-session:{1234} will hash to the same slot and allow multi-key transactions. Plan key naming up front to avoid cross-slot errors like ERR CROSSSLOT Keys in request don't hash to the same slot.

Avoid using hashtags for unrelated data, which could lead to unbalanced clusters. Only group keys under a shared hash slot when they are logically related or necessary for atomic operations.
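The hash-tag rule can be sketched in Python: if the key contains a `{...}` section with a non-empty body, only that substring is hashed. The CRC16 variant is the XMODEM one Redis Cluster specifies; this is a reference-style sketch:

```python
def crc16(data: bytes) -> int:
    """CRC16-CCITT (XMODEM), as used by Redis Cluster."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def hash_slot(key: bytes) -> int:
    """Apply the hash-tag rule, then hash."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end > start + 1:          # non-empty {...} body only
            key = key[start + 1:end]
    return crc16(key) % 16384

# Both keys hash only the substring "1234", so they share a slot:
print(hash_slot(b"user-profile:{1234}") == hash_slot(b"user-session:{1234}"))
```

Note the edge case: an empty tag like `{}` does not trigger the rule, so the whole key is hashed.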

2. Reshard to Balance Cluster Load

Redis Cluster performance depends on even distribution of hash slots and traffic. Over time, certain nodes can become overloaded due to uneven data growth or skewed access patterns. Use the --cluster check command in redis-cli to inspect slot coverage and detect imbalances.

To redistribute load, initiate resharding with:

$> redis-cli --cluster reshard 127.0.0.1:7000

Specify how many slots to move and the source and destination nodes. Redis migrates keys live, without downtime, though latency may briefly spike.

For automation, use the non-interactive form:

$> redis-cli --cluster reshard <host>:<port> --cluster-from <source-node-id> --cluster-to <target-node-id> --cluster-slots <count> --cluster-yes

Regular (but not too frequent) resharding helps maintain performance as data grows or node resources change.

3. Monitor and Optimize Memory Usage

Redis is memory-resident, so effective memory management is critical. Use the INFO MEMORY command or the MEMORY USAGE command to track per-key memory usage. Set an appropriate maxmemory limit and apply eviction policies like allkeys-lru or volatile-lru to prevent out-of-memory errors.

4. Use Connection Pooling and Client-Side Pipelining

Opening a new connection for each Redis command adds significant overhead and can cause transient failures. Instead, use connection pooling in your Redis clients. Libraries like redis-py, jedis, and ioredis support pooling by default. A reasonable (usually small) number of persistent connections can serve most workloads efficiently.
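To illustrate the pooling idea itself (this is a generic sketch, not any particular client library's internals; the class and names are hypothetical), a pool simply hands out pre-built connections and takes them back:

```python
import queue

class SimplePool:
    """Hypothetical minimal pool: reuse pre-built connections
    instead of opening a new one per command."""
    def __init__(self, factory, size):
        # LIFO: the most recently used connection is reused first
        self._conns = queue.LifoQueue()
        for _ in range(size):
            self._conns.put(factory())

    def acquire(self):
        return self._conns.get()

    def release(self, conn):
        self._conns.put(conn)

class FakeConn:          # stand-in for a real client connection
    pass

pool = SimplePool(FakeConn, size=2)
conn = pool.acquire()    # borrow a connection...
pool.release(conn)       # ...and hand it back for reuse
```

Real client pools add health checks, timeouts, and thread safety on top, but the borrow/return cycle is the core of the idea.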

Also leverage pipelining (sending multiple commands in a single network roundtrip) to reduce round-trip latency and system calls. This is especially effective in high-throughput applications or when batching reads/writes:

import redis  # assumes the redis-py client and a server on localhost:6379

redis_client = redis.Redis(host='localhost', port=6379)

# Queue several commands, then send them in a single roundtrip.
pipe = redis_client.pipeline()
pipe.set('key1', 'value1')
pipe.set('key2', 'value2')
pipe.execute()

Together, pooling and pipelining reduce load on both the client and Redis server, improving throughput and latency.

5. Identify and Eliminate Hot Keys and Large Keys

Hot keys (those accessed significantly more than others) can create bottlenecks in a Redis Cluster. Use redis-cli --hotkeys to detect them and --bigkeys to find oversized keys. Large or hot keys can result in uneven load and degraded performance, as a single node might handle disproportionate traffic.

To mitigate this, break large keys into smaller parts. For example, if a sorted set is too large, consider segmenting it by time or another domain-relevant attribute. For composite data types, avoid full scans like ZRANGE key 0 -1. Instead, paginate using defined ranges such as ZRANGE key 0 99, ZRANGE key 100 199, and so on for static collections. SCAN-family commands are recommended too, since they provide cursors to ensure full iteration even if a collection is being modified during scanning.

When deleting large keys, use UNLINK instead of DEL to offload the deletion to a background thread and prevent blocking. For batch deletions, use SCAN with patterns and reasonable batch sizes to gradually delete keys that are no longer needed.


Dragonfly: Next-Gen In-Memory Data Store with Limitless Scalability

Dragonfly is a modern, source-available, multi-threaded, Redis-compatible in-memory data store that stands out by delivering unmatched performance and efficiency. Designed from the ground up to disrupt legacy technologies, Dragonfly redefines what an in-memory data store can achieve.

Dragonfly Scales Both Vertically and Horizontally

Dragonfly’s architecture allows a single instance to fully utilize a modern multi-core server, handling up to millions of requests per second (RPS) and 1TB of in-memory data. This high vertical scalability often eliminates the need for clustering—unlike Redis, which typically requires a cluster even on a powerful single server (premature horizontal scaling). As a result, Dragonfly significantly reduces operational overhead while delivering superior performance.

For workloads that exceed even these limits, Dragonfly offers a horizontal scaling solution: Dragonfly Swarm. Swarm seamlessly extends Dragonfly’s capabilities to handle 100 million+ RPS and 100 TB+ of memory capacity, providing a path for massive growth.

Key Advancements of Dragonfly

  • Multi-Threaded Architecture: Efficiently leverages modern multi-core processors to maximize throughput and minimize latency.
  • Unmatched Performance: Achieves 25x better performance than Redis, ensuring your applications run with extremely high throughput and consistent latency.
  • Cost Efficiency: Reduces hardware and operational costs without sacrificing performance, making it an ideal choice for budget-conscious enterprises.
  • Redis API Compatibility: Offers seamless integration with existing applications and frameworks running on Redis while overcoming its limitations.
  • Innovative Design: Built to scale vertically and horizontally, providing a robust solution for rapidly growing data needs.

Dragonfly Cloud

Dragonfly Cloud is a fully managed service from the creators of Dragonfly, handling all operations and delivering effortless scaling so you can focus on what matters without worrying about in-memory data infrastructure anymore.
