
1.5 Years After the Valkey Fork: The In-Memory Data Landscape at the End of 2025

An analysis of the evolving in-memory data landscape after the Valkey fork, comparing solutions for context-rich AI/ML workloads and performance.

November 6, 2025


From Memcached to Redis to the AI Era

In‑memory data infrastructure became mission critical the moment we stopped accepting disk latency as a given. In the mid‑2000s, Memcached emerged as the caching backbone for the first generation of hyperscale web apps, most famously at Facebook, where it helped keep page rendering times low while scaling to hundreds of millions and then billions of users. That success reframed the caching layer as not just an optimization but a critical piece of infrastructure for scaling high-throughput applications and delivering modern user experiences.

Redis broadened in-memory use cases from just caching to a true data layer for server applications. Its rich data structures and dead‑simple programming model turned transient states, such as sessions, queues, counters, and streams, into first‑class primitives. Over time, Redis became the default data store for managing real-time, shared state in server-side architectures.
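To make this concrete, here is a minimal Python sketch using the redis-py client (key names and values are hypothetical) showing how sessions, counters, queues, and streams map directly onto Redis primitives:

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Session: a hash with a TTL, one field per session attribute.
r.hset("session:1234", mapping={"user_id": "42", "theme": "dark"})
r.expire("session:1234", 1800)  # expire the whole session after 30 minutes

# Counter: an atomic increment, e.g. page views per article.
views = r.incr("counter:article:99:views")

# Queue: a list used as a simple FIFO work queue.
r.lpush("queue:emails", '{"to": "user@example.com", "template": "welcome"}')
job = r.rpop("queue:emails")

# Stream: an append-only log of events that consumers can read back.
r.xadd("stream:clicks", {"user_id": "42", "page": "/pricing"})
events = r.xrange("stream:clicks", count=10)
```

Each of these patterns would otherwise require bespoke application code or a heavier datastore; in Redis they are single commands with well-defined semantics.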

The Evolution of In-Memory Data Infrastructure


In 2025, we’re squarely in the AI/ML context era. Retrieval‑augmented generation (RAG), feature stores, real‑time personalization, online experimentation, and low‑latency inference pipelines are all examples of use cases that require massive amounts of real-time context, pushing the in‑memory layer far harder than traditional web use cases ever did. These context-engineering-related workloads demand more data per user request, steadier tail latencies under fan-out, and cost-efficient scaling across terabyte-scale datasets serving tens or even hundreds of millions of requests per second. It’s through this lens that we will analyze what has changed since the Redis licensing shift and the Valkey fork and where innovation is happening in this ecosystem.

Why Valkey Exists

In March 2024, Redis moved from the permissive BSD license to a dual RSAL/SSPL model. That change constrained how cloud providers could offer Redis and triggered a community response. Within days, the Linux Foundation announced Valkey, a fork based on the last permissive release (7.2.4) and backed by major vendors and contributors. The stated goal was to preserve a BSD‑licensed, Redis‑compatible codebase.

AWS and Google launched first‑party Valkey services—ElastiCache for Valkey and Memorystore for Valkey—positioning Valkey as the “truly open” option without licensing friction. It’s important to say the quiet part out loud: the clouds didn’t fork Redis to innovate on architecture; they could have pursued that years ago if it were a priority. The fork’s primary drivers were governance and economics, not, at least initially, a new technical direction.

What Changed Technically in the Last 18 Months

Valkey has shipped meaningful performance work—most notably I/O threading—while keeping the core execution model the same. The I/O threading introduced in Valkey 8.0 parallelizes networking, request parsing, and response writing, which effectively reduces the load on the main thread and improves throughput, particularly for I/O-bound workloads. While this feature offers a clear advantage under I/O pressure, it does not alter the fundamental architecture: command execution itself remains single-threaded. As server core counts and datasets grow, this single-threaded execution path still inevitably becomes the primary bottleneck.

Moreover, the recent Valkey 9.0 release builds on its Redis roots with significant advancements in cluster and data management. Most notably, it introduces atomic slot migration in cluster mode, which moves entire data slots at once, improving latency during resharding and providing a rollback mechanism if a migration is cancelled. The release also adds support for multiple logical databases in cluster mode, making multi-tenancy easier. For data lifecycle control, Valkey 9.0 now supports hash field expiration, allowing individual fields within a hash to carry their own TTLs, just as Redis does. It also includes a new conditional deletion command, DELIFEQ, to simplify distributed locking. Finally, to enhance network resilience, Valkey 9.0 adds support for Multipath TCP (MPTCP), delivering more robust replication and client connections in supported environments.
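As a rough illustration of the data-control features, the sketch below uses redis-py's generic execute_command to call hash field expiration and the conditional delete described above. The exact argument forms may vary across releases and client support is still catching up, so treat this as an assumption-laden sketch; key names and the lock token are hypothetical.

```python
import redis

# Valkey speaks the same protocol as Redis, so redis-py works against it.
client = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Hash field expiration: give an individual field its own TTL.
# Assumed syntax: HEXPIRE key seconds FIELDS numfields field [field ...]
client.hset("user:42:ctx", mapping={"cart": "3 items", "ab_bucket": "B"})
client.execute_command("HEXPIRE", "user:42:ctx", 300, "FIELDS", 1, "cart")

# Conditional delete for distributed locking: release the lock only if it
# still holds this worker's token, replacing the usual Lua unlock script.
# Assumed syntax: DELIFEQ key value
token = "worker-7"
if client.set("lock:reindex", token, nx=True, ex=30):
    try:
        pass  # ... do the work guarded by the lock ...
    finally:
        client.execute_command("DELIFEQ", "lock:reindex", token)
```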

On the business side, the center of gravity has been cost. AWS priced ElastiCache for Valkey materially below its Redis engine—on the order of twenty percent lower for node‑based clusters, with even steeper discounts for serverless configurations—making “lift‑and‑shift to Valkey” attractive for many teams. Google took a similar path, bringing Memorystore for Valkey to general availability and pairing it with committed‑use discounts of roughly 20% for one‑year and 40% for three‑year terms. Put simply, Valkey’s adoption is fueled by a powerful combination: compelling economics and rapid innovation. In under two years, it has not only caught up to Redis Stack modules like JSON and search but has also surpassed the original in areas like new commands and cluster management.

The Market Landscape Today

The in-memory data store landscape is a much more diverse place than it was just a couple of years ago, and that is a good thing.

Redis remains the de facto standard, running in production for countless organizations due to its maturity and extensive ecosystem. Redis Inc., the company behind the Redis project, appears to have focused much of its recent efforts on real-time data ETL and agentic-AI workloads, as evidenced by its recent acquisitions of Decodable and Featureform.

Valkey, born from the Redis codebase, has rapidly established itself as a viable migration target. For many teams, its appeal lies in a combination of meaningful technical improvements and the compelling economic incentives offered by major cloud providers. It represents an evolutionary path with a clearer open-source governance model.

Beyond these two, a broader ecosystem is addressing specific developer needs as well. For example, projects like Microsoft Research’s Garnet are joining the Redis protocol ecosystem and pushing the boundaries of performance. Meanwhile, cloud-focused providers like Upstash and Momento are gaining traction by offering a developer-friendly, ops-free experience. Notably, Momento has recently pivoted and started building its service on top of Valkey, offering a tuned and robust solution for modern applications.

Dragonfly, the only solution to implement a fully multi-threaded architecture, has emerged as the leading option for large-scale, context-heavy workloads, which are becoming increasingly common as ML and AI use cases grow. Running these workloads is certainly possible on Redis and Valkey, but because neither can fully utilize multi-core servers, they must rely on clustered architectures for horizontal scaling, which introduces significant operational complexity and often results in degraded performance.

In-Memory Data Infrastructure for the AI/ML Era

Dragonfly is the in-memory data store purpose-built for the scale and performance requirements of the AI and ML era, and it is not a fork of Redis.

Dragonfly was designed from the ground up for today’s multi‑core, large‑memory machines. Instead of funneling all command execution through one main thread, Dragonfly uses a multi‑threaded, shared‑nothing architecture: the dataset is partitioned into independent shards, each owned by a dedicated thread that executes commands in parallel, so compute simply scales with available cores. The data structures and memory management are engineered for cache locality and low contention, which, combined with the threading model, translates into extremely high throughput. Operationally, a single Dragonfly node can push millions of operations per second over large RAM footprints, so many teams can postpone, or avoid entirely, the complex sharding topologies they’ve grown used to. Dragonfly also scales horizontally once the limits of a single large machine are reached, and its snapshotting and persistence paths are optimized to minimize pauses and spikes under load.
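To illustrate the shared-nothing idea in the abstract, here is a purely conceptual Python sketch: keys are hashed to shards, and each shard is owned by exactly one worker thread, so no locks are taken on the data itself. This is an illustration of the pattern only, not Dragonfly's actual implementation.

```python
import queue
import threading

NUM_SHARDS = 4  # conceptually, one shard per CPU core

class Shard:
    """One shard = one thread + one private dict. No other thread touches the data."""

    def __init__(self):
        self.data = {}
        self.inbox = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            op, key, value, reply = self.inbox.get()
            if op == "SET":
                self.data[key] = value
                reply.put("OK")
            elif op == "GET":
                reply.put(self.data.get(key))

shards = [Shard() for _ in range(NUM_SHARDS)]

def route(op, key, value=None):
    """Route a command to the single thread that owns this key's shard."""
    reply = queue.Queue()
    shards[hash(key) % NUM_SHARDS].inbox.put((op, key, value, reply))
    return reply.get()

route("SET", "user:42", "hello")
print(route("GET", "user:42"))  # -> hello
```

Because each shard is touched by only one thread, adding cores adds shards, and throughput grows without cross-thread contention on the data.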

Dragonfly speaks the Redis protocol and APIs, so existing clients and tooling continue to work. That keeps the migration costs low and makes it easy to evaluate performance and total cost without rewriting your application.
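In practice, an application already written against redis-py typically only needs its connection settings changed to point at a Dragonfly endpoint; the commands themselves stay the same. The hostname below is a placeholder for your own deployment:

```python
import redis

# Point the existing client at a Dragonfly instance instead of Redis;
# no application code changes are needed beyond the connection settings.
cache = redis.Redis(host="dragonfly.internal.example", port=6379)

cache.set("feature:user:42", "enabled", ex=60)
print(cache.get("feature:user:42"))
```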

These design choices show up where it matters most: AI/ML context. RAG pipelines, online feature stores, and real‑time personalization don’t care about impressive microbenchmarks; they care about whether P99 latencies stay flat as context windows widen. From Instacart’s 25TB, 125 million requests per second online feature store to ShareChat’s recommendation and personalization engine, hundreds of organizations have already adopted Dragonfly in production to scale their AI/ML-powered applications.

Conclusion: Innovation Is Coming From Outside the Redis/Valkey Lineage

In the roughly 1.5 years since the Valkey fork, the ecosystem’s governance and economics have changed, with evolutionary improvements but no revolutionary breakthroughs. Valkey’s I/O threading improvements are welcome, yet command execution remains single‑threaded, and the same operational trade‑offs appear as workloads scale. The real architectural innovation, the kind that unlocks much more context per user request, is coming from systems purpose-built for today’s intensive data demands. Dragonfly is Redis‑compatible but architecturally different, designed to scale computation across cores and memory on a single node before you’re forced into fragile, failure‑prone cluster sprawl. That is why platform and infrastructure teams pushing into AI/ML‑driven, real‑time experiences are adopting Dragonfly.

How to Evaluate, Practically

If you are already migrating to Valkey on AWS or GCP for cost reasons, that’s the perfect moment to run a Dragonfly canary on a realistic slice of traffic. Compare throughput, P50/P99 latencies, and the number of servers needed at your target dataset size. Make the test reflect the real shape of your workload, because that is where architectural differences surface. Getting started is straightforward: your existing Redis clients work as‑is, so you can run an apples‑to‑apples evaluation without changing application code.
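As a starting point, here is a minimal Python sketch of that kind of comparison: it sends the same simple workload to two endpoints and reports P50/P99 latencies. A real evaluation should use your actual commands, value sizes, and concurrency (dedicated tools like memtier_benchmark are the usual next step); the hostnames below are placeholders.

```python
import time
import statistics
import redis

def measure(host, port=6379, requests=10_000, value=b"x" * 1024):
    """Issue a simple SET/GET mix and return P50/P99 latencies in milliseconds."""
    client = redis.Redis(host=host, port=port)
    latencies = []
    for i in range(requests):
        start = time.perf_counter()
        client.set(f"bench:{i}", value)
        client.get(f"bench:{i}")
        latencies.append((time.perf_counter() - start) * 1000)
    cuts = statistics.quantiles(latencies, n=100)
    return cuts[49], cuts[98]  # P50, P99

# Placeholders: point these at your Valkey/Redis and Dragonfly canaries.
for name, host in [("valkey", "valkey.internal.example"),
                   ("dragonfly", "dragonfly.internal.example")]:
    p50, p99 = measure(host)
    print(f"{name}: p50={p50:.2f} ms  p99={p99:.2f} ms")
```

From there, scale the test toward your production concurrency, dataset size, and command mix before drawing conclusions about throughput, latency, and server count.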
