Dragonfly Cloud announces new enterprise security features - learn more

Question: How can you use replication for backups in PostgreSQL?

Answer

PostgreSQL's built-in replication feature can be an invaluable tool for creating backups and ensuring data durability. This process involves setting up a primary server, which handles all the write operations, and one or more standby servers (replicas), which receive copies of the data changes from the primary. Here's how you can leverage replication for backups in PostgreSQL:

1. Setting Up Streaming Replication

Primary Server Configuration:

  1. Edit postgresql.conf:

    • Set wal_level to replica.
    • Specify the maximum number of wal senders by setting max_wal_senders to a value greater than the number of replicas.
    • Optionally, adjust max_replication_slots if you plan to use physical replication slots.
  2. Edit pg_hba.conf: Add a line to allow replication connections from the replicas, specifying the IP addresses or ranges of the replica servers.

  3. Create Replication Role: Create a dedicated user role for replication on the primary server:

    CREATE ROLE replicator REPLICATION LOGIN PASSWORD 'yourpassword';

Replica Server Configuration:

  1. Base Backup: Stop the PostgreSQL service on the replica if it's running. Then, use the pg_basebackup command to create a base backup of the primary server:

    pg_basebackup -h primary_host -D /var/lib/postgresql/12/main -U replicator -P -v -X stream

    Replace /var/lib/postgresql/12/main with the path to your PostgreSQL data directory.

  2. Edit postgresql.conf:

    • Set hot_standby to on to allow read queries on the replica.
  3. Create recovery.conf (for PostgreSQL versions before 12) or standby.signal (for PostgreSQL 12 and later): For version 12 and later, create an empty file named standby.signal in the main data directory and configure connection information in postgresql.conf:

    primary_conninfo = 'host=primary_host port=5432 user=replicator password=yourpassword'

2. Using Replicas for Backups

While the primary server handles all the write operations, the replicas can be used to offload read-only queries and to perform backups without affecting the performance of the primary server. To take a consistent backup from a replica:

  1. Ensure the replica is in a consistent state.
  2. You may temporarily pause replication if desired to make a static file-level backup, or use tools like pg_dump to take logical backups directly from the replica.

3. Failover and Promotion

In case the primary server fails, one of the replicas can be promoted to become the new primary. This is done by triggering a failover, manually or automatically using tools like Replication Manager.

Promotion Command:

pg_ctl promote -D /var/lib/postgresql/12/main

Conclusion

Using replication for backups in PostgreSQL not only helps in achieving high availability but also improves the backup strategy by distributing the load and minimizing downtime during backup operations. Properly configured, this approach ensures data safety and system resilience.

Was this content helpful?

White Paper

Free System Design on AWS E-Book

Download this early release of O'Reilly's latest cloud infrastructure e-book: System Design on AWS.

Free System Design on AWS E-Book

Switch & save up to 80% 

Dragonfly is fully compatible with the Redis ecosystem and requires no code changes to implement. Instantly experience up to a 25X boost in performance and 80% reduction in cost