Scaling out Memcached primarily means adding more servers to your cluster and redistributing the keys among them. This partitioning of the key space across servers is usually referred to as "sharding".
Here is a basic example of how this works in practice:
When you start, you might have one server:
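For instance, the client's server list might contain a single node (the hostname and port here are placeholders, with 11211 being Memcached's default port):

```python
# A single Memcached node; the hostname is illustrative
servers = ["cache1.example.com:11211"]
```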
As you need to scale, you add more servers to your list:
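After scaling out, the same list might look like this (again with placeholder hostnames):

```python
# Three Memcached nodes after scaling out; hostnames are illustrative
servers = [
    "cache1.example.com:11211",
    "cache2.example.com:11211",
    "cache3.example.com:11211",
]
```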
How does Memcached redistribute keys when you add or remove servers? Strictly speaking, the servers don't: Memcached nodes are unaware of one another, and key distribution is handled entirely by the client library, most commonly via "consistent hashing". Each key is hashed onto a position on a hash ring, and each server owns the portion of the ring adjacent to its own points. When a new node is added, only the keys falling between the new node and its immediate neighbor on the ring move to it; every other key stays where it was, minimizing re-distribution.
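The following is a minimal sketch of a consistent-hash ring, not the algorithm any particular client ships (real implementations such as ketama differ in hash function and details). It assumes MD5 as the hash and 100 virtual points per server, and demonstrates that adding a node relocates only a fraction of the keys:

```python
import bisect
import hashlib

class HashRing:
    """Toy consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, servers, vnodes=100):
        self.vnodes = vnodes
        self.ring = []  # sorted list of (hash_position, server)
        for server in servers:
            self.add(server)

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, server):
        # Place `vnodes` points for this server around the ring
        for i in range(self.vnodes):
            bisect.insort(self.ring, (self._hash(f"{server}#{i}"), server))

    def get_server(self, key):
        h = self._hash(key)
        # First ring point clockwise from the key's position (wraps around)
        i = bisect.bisect(self.ring, (h, ""))
        return self.ring[i % len(self.ring)][1]

ring = HashRing(["cache1:11211", "cache2:11211", "cache3:11211"])
before = {k: ring.get_server(k) for k in (f"user:{n}" for n in range(1000))}

ring.add("cache4:11211")  # scale out by one node
after = {k: ring.get_server(k) for k in before}

moved = sum(1 for k in before if before[k] != after[k])
# Roughly a quarter of the keys move, and all of them move to the new
# node; keys never shuffle between the pre-existing servers.
print(f"{moved} of {len(before)} keys moved")
```

With naive modulo hashing (`hash(key) % len(servers)`) the same change would remap about three quarters of the keys, which is exactly the mass cache-miss storm consistent hashing avoids.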
Note that while Memcached itself does not replicate data (i.e., store the same data on multiple servers), some client libraries provide support for replication, which can improve read performance and provide some degree of failover capability.
To distribute memory and request load evenly across nodes, the hashing scheme itself acts as the load balancer: consistent hashing is commonly used with Memcached clients because it keeps the key distribution roughly uniform and minimizes the number of keys re-distributed when nodes are added or removed.
Keep in mind that Memcached is an in-memory cache: its total capacity is the aggregate RAM allotted across all Memcached instances, so scaling out increases capacity only in proportion to the RAM you add. Once an instance is full, it evicts the least recently used items to make room for new ones.
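The capacity math is simple addition; the per-node figures below are assumed numbers for illustration:

```python
# Aggregate cache capacity = sum of the RAM allotted to each instance
# (4 GiB per node here is an assumed, illustrative figure)
nodes_gib = {"cache1": 4, "cache2": 4, "cache3": 4}
total_gib = sum(nodes_gib.values())  # 12 GiB across the cluster

# Adding a fourth identical node grows capacity linearly
nodes_gib["cache4"] = 4
total_gib_after = sum(nodes_gib.values())  # 16 GiB
```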
Lastly, remember that scaling out adds operational complexity. Debugging becomes more intricate, since a piece of data may need to be traced across several servers, and network latency can become an issue as the cluster grows. Always monitor performance as you scale out.