Introducing Dragonfly Cloud! Learn More

Question: What is replication in MongoDB?

Answer

Replication is a key feature of MongoDB that ensures data redundancy and high availability. It involves copying data from one database server (the primary) to one or more servers (the secondaries). This process creates a group of MongoDB instances that hold the same data set, known as a replica set.

How Does Replication Work in MongoDB?

A MongoDB replica set consists of several nodes, usually at least three:

  • Primary node: Handles all write operations. All changes to data are recorded first on the primary.
  • Secondary nodes: Replica copies of the primary. They replicate the primary's oplog (operations log) and apply the operations to themselves in an asynchronous process, thus maintaining an up-to-date copy of the database.

If the primary node fails, one of the secondary nodes is automatically elected to become the new primary, ensuring minimal downtime and no data loss.

Setting Up a Basic Replica Set

  1. Start MongoDB Instances: For each member of your replica set, start a MongoDB instance with the --replSet option set to the name of your replica set.

    mongod --port 27017 --dbpath /srv/mongodb/db0 --replSet rs0 mongod --port 27018 --dbpath /srv/mongodb/db1 --replSet rs0 mongod --port 27019 --dbpath /srv/mongodb/db2 --replSet rs0
  2. Connect to One Instance: Connect to one of the instances using the MongoDB shell.

    mongo --port 27017
  3. Initialize the Replica Set: Use the rs.initiate() command to configure the replica set. You can pass a configuration object to this command to specify members and their roles.

    rs.initiate( { _id: 'rs0', members: [ { _id: 0, host: 'localhost:27017' }, { _id: 1, host: 'localhost:27018' }, { _id: 2, host: 'localhost:27019' } ] } )

Benefits of Replication

  • High Availability: Automatic failover to a secondary node if the primary becomes unavailable.
  • Data Redundancy: Multiple copies of data across different servers or data centers protect against data loss.
  • Read Scalability: Secondary nodes can be used to perform read operations, distributing the load.

Conclusion

Replication is a powerful feature of MongoDB, essential for maintaining data safety, achieving high availability, and scaling reads. Setting up a replica set requires careful planning but provides significant benefits for distributed applications.

Was this content helpful?

White Paper

Free System Design on AWS E-Book

Download this early release of O'Reilly's latest cloud infrastructure e-book: System Design on AWS.

Free System Design on AWS E-Book

Start building today 

Dragonfly is fully compatible with the Redis ecosystem and requires no code changes to implement.