
Question: How to back up a MongoDB sharded cluster?

Answer

Backing up a MongoDB sharded cluster involves capturing the state of all shards as well as the config servers that store the cluster's metadata. A consistent backup must include data from all shards and the config server(s) at roughly the same point in time. Here are the general steps and methods to achieve this:

1. Use MongoDB's mongodump

mongodump is a utility for creating a binary export of the contents of a database. To back up a sharded cluster this way, you run mongodump against each shard and against the config servers. The approach is slow on large data sets and needs manual coordination, but it is straightforward when no automation tooling is available.

Example:

mongodump --host=<shard-host>:<port> --out=/path/to/backup/shard1
mongodump --host=<configsvr-host>:<port> --out=/path/to/backup/configsvr

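Repeat for each shard and for the config servers. Stop the balancer (sh.stopBalancer()) before you start so that chunks do not migrate between shards while the dumps are running, and restart it afterwards. Note that running mongodump directly on production systems can impact performance.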
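The "repeat for each shard" step can be scripted. The following is a minimal sketch, not a complete backup tool: the `run` and `dump_cluster` helpers, the `DRY_RUN` variable, and every host name and path are hypothetical names introduced here for illustration. It only uses the `--host` and `--out` options shown above.

```shell
#!/bin/sh
# Sketch: run mongodump once per shard and once for the config servers.
# Helper names, hosts and paths are placeholders, not part of MongoDB.
set -eu

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# dump_cluster <backup-root> <name=host:port>...
# Each component is dumped into <backup-root>/<name>.
dump_cluster() {
  dc_root="$1"; shift
  for dc_target in "$@"; do
    dc_name="${dc_target%%=*}"   # text before the first "="
    dc_host="${dc_target#*=}"    # text after the first "="
    run mongodump --host="$dc_host" --out="$dc_root/$dc_name"
  done
}

# Example call (placeholder addresses):
# dump_cluster /path/to/backup \
#   shard1=shard1.example.net:27018 \
#   configsvr=cfg.example.net:27019
```

Setting DRY_RUN=1 prints the mongodump commands instead of running them, which is a cheap way to check the target list before touching production.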

2. Filesystem Snapshot

If your MongoDB instances store their data on volumes that support snapshots (such as AWS EBS volumes, GCP Persistent Disks, or LVM logical volumes), you can create a consistent snapshot across all shards and config servers by:

  1. Freezing writes to the cluster, for example by stopping the balancer and running db.fsyncLock() on the nodes you are about to snapshot.
  2. Creating a filesystem snapshot of each shard and config server.
  3. Unlocking the cluster to resume writes.

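Because a snapshot usually completes much faster than a logical dump, this method keeps the write freeze short. It does require coordination across every shard and config server, and it might not be feasible in all environments.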
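The freeze, snapshot, unlock sequence above can be sketched as a small script. This is an illustration under assumptions, not a vetted procedure: `snapshot_cluster`, `run`, `DRY_RUN` and `SNAPSHOT_CMD` are hypothetical names, and `SNAPSHOT_CMD` stands for whatever command snapshots one node's data volume in your environment (an LVM or cloud CLI wrapper, for instance). It assumes you pass one member of each shard and of the config server replica set. The shell helpers `sh.stopBalancer()`, `db.fsyncLock()`, `db.fsyncUnlock()` and `sh.startBalancer()` are run through `mongosh --eval`.

```shell
#!/bin/sh
# Sketch: stop balancer -> lock nodes -> snapshot -> unlock -> start balancer.
# SNAPSHOT_CMD is a hypothetical command that receives one node address.
set -eu

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# snapshot_cluster <mongos-host:port> <node-host:port>...
snapshot_cluster() {
  sc_mongos="$1"; shift
  sc_status=0

  run mongosh --quiet "mongodb://$sc_mongos" --eval 'sh.stopBalancer()' || sc_status=1

  # Flush and block writes on every node that will be snapshotted.
  for sc_node in "$@"; do
    run mongosh --quiet "mongodb://$sc_node" --eval 'db.fsyncLock()' || sc_status=1
  done

  # Only snapshot if every earlier step succeeded.
  if [ "$sc_status" -eq 0 ]; then
    for sc_node in "$@"; do
      run "$SNAPSHOT_CMD" "$sc_node" || sc_status=1
    done
  fi

  # Always attempt to unlock, even if a snapshot failed.
  for sc_node in "$@"; do
    run mongosh --quiet "mongodb://$sc_node" --eval 'db.fsyncUnlock()' || sc_status=1
  done

  run mongosh --quiet "mongodb://$sc_mongos" --eval 'sh.startBalancer()' || sc_status=1
  return "$sc_status"
}
```

The unlock loop runs regardless of earlier failures so that a failed snapshot does not leave nodes locked. A production version would also want a timeout and alerting around the locked window.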

3. MongoDB Cloud Manager or Ops Manager

MongoDB offers its own solutions for managing backups in sharded clusters through Cloud Manager and Ops Manager. These tools provide continuous, online backups with point-in-time recovery features and minimize the operational burden of managing backups manually.

4. MongoDB Atlas

For users of MongoDB's Atlas cloud service, backup is even simpler. Atlas provides continuous, automated backups with point-in-time recovery out of the box for sharded clusters. It handles the complexity of ensuring consistent backups across shards and config servers.

Important Considerations

  • Consistency: For logical backups (mongodump), ensure the dumps of all shards and the config servers are taken at approximately the same time, with the balancer stopped, so that the restored data and cluster metadata agree.
  • Automation: Regardless of the method chosen, automating the backup process can help reduce human error and ensure backups are taken regularly.
  • Testing: Regularly test your backup and restore process to ensure that your backups are valid and that you can meet your Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO).
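As a first step toward the testing point above, a backup directory can be checked for completeness before any restore is attempted. The sketch below assumes the layout produced by the mongodump example earlier (one subdirectory per component under a common root) and uncompressed dumps, so that collections appear as `.bson` files. `verify_backup` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: check that every expected component of a backup exists
# and contains at least one .bson file. Assumes uncompressed dumps.
set -eu

# verify_backup <backup-root> <component>...
# Prints one status line per component; returns 1 if any is missing or empty.
verify_backup() {
  vb_root="$1"; shift
  vb_status=0
  for vb_name in "$@"; do
    vb_dir="$vb_root/$vb_name"
    if [ ! -d "$vb_dir" ]; then
      echo "missing: $vb_dir"
      vb_status=1
    elif [ -z "$(find "$vb_dir" -type f -name '*.bson' | head -n 1)" ]; then
      echo "empty: $vb_dir"
      vb_status=1
    else
      echo "ok: $vb_dir"
    fi
  done
  return "$vb_status"
}

# Example call (placeholder path and component names):
# verify_backup /path/to/backup shard1 configsvr
```

A check like this only shows that files are present. To confirm that a backup is usable, restore it with mongorestore into a scratch deployment and compare the result against expectations.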

Backup strategies can vary based on specific requirements and constraints of your deployment. Assessing your needs and understanding the trade-offs of each method are crucial steps in implementing an effective backup strategy for your MongoDB sharded cluster.
