Question: How do you set up replication in PostgreSQL?
Answer
Replication in PostgreSQL is a process that allows you to copy and distribute database objects or data from one database server (the primary) to one or more database servers (the standbys). The primary goal of replication is to improve the availability, failover capabilities, and read scalability. PostgreSQL supports several types of replication, but we'll focus on setting up streaming replication, which is a widely used method for achieving high availability.
Requirements:
Before proceeding, ensure both the primary and standby servers have PostgreSQL installed and configured with the same major version.
Step 1: Configure the Primary Server
Edit the postgresql.conf
file on the primary server:
listen_addresses = '*' # Listen on all interfaces wal_level = replica # Required for streaming replication max_wal_senders = 5 # Max number of WAL sender processes archive_mode = on # Enable archiving archive_command = 'cp %p /path/to/archive/%f' # Archive command (adjust accordingly)
Also, modify the pg_hba.conf
file to allow connections from the standby server(s):
# Type Database User Address Method host replication all <standby_IP_address>/32 md5
Restart PostgreSQL to apply these changes.
Step 2: Prepare the Standby Server
On the primary server, use the pg_basebackup
utility to create a base backup of the primary server's data directory:
pg_basebackup -h <primary_IP_address> -D /var/lib/postgresql/12/main -U replicator -P -v -R
This command will clone the primary server's data directory into the specified directory on the standby server. Replace /var/lib/postgresql/12/main
with your actual data directory path.
Step 3: Configure the Standby Server
On the standby server, make sure the postgresql.conf
includes the following lines:
hot_standby = on # Allows the standby to accept read-only queries
Additionally, create a recovery.conf
file inside the standby's data directory with the following content:
standby_mode = 'on' primary_conninfo = 'host=<primary_IP_address> port=5432 user=replicator password=yourpassword'
This tells the standby server where to connect to the primary.
Step 4: Start the Standby Server
Finally, start the PostgreSQL service on the standby server. It should connect to the primary and start replicating changes.
Verification
You can verify the replication status by running the following command on the primary server:
SELECT * FROM pg_stat_replication;
This query returns information about each replication connection to the primary server.
Conclusion
Streaming replication in PostgreSQL is a powerful feature for creating high-availability setups. This basic guide covers setting up a simple streaming replication configuration. Depending on your requirements, you might need to adjust configurations, such as handling failover automatically with tools like Repmgr or Patroni.
Was this content helpful?
Other Common PostgreSQL Questions (and Answers)
- How do you manage Postgres replication lag?
- How can I limit the number of rows updated in a PostgreSQL query?
- What is PostgreSQL replication and how does it work?
- How does sharding work in PostgreSQL?
- What is partitioning in PostgreSQL?
- How do you limit the number of rows deleted in PostgreSQL?
- How do you use the PARTITION OVER clause in PostgreSQL?
- How do you use the PARTITION BY clause in PostgreSQL?
- What are PostgreSQL replication slots and how do they work?
- How can you partition an existing table in PostgreSQL?
- What is PostgreSQL replication streaming?
- What are PostgreSQLs size limitations?
Free System Design on AWS E-Book
Download this early release of O'Reilly's latest cloud infrastructure e-book: System Design on AWS.
Start building today
Dragonfly is fully compatible with the Redis ecosystem and requires no code changes to implement.