Question: How do you set up geo-replication in PostgreSQL?
Answer
Geo-replication in PostgreSQL refers to the practice of copying and maintaining database objects across multiple geographical locations in real-time. This setup enhances data availability, disaster recovery, and read scalability. Here's a comprehensive guide on setting up geo-replication using PostgreSQL streaming replication.
Prerequisites
- PostgreSQL installed on at least two servers located in different geographical locations.
- Network connectivity between these servers.
- A superuser role or similar permissions on both servers.
Step 1: Configure the Master Server
On the primary (master) server, edit the postgresql.conf
file:
listen_addresses = '*' # Listen on all interfaces. wal_level = replica # Minimum required for replication. max_wal_senders = 5 # Max number of walsender processes. wal_keep_segments = 64 # In PG 10+ use wal_keep_size instead.
Also, modify the pg_hba.conf
file to allow connections from the secondary (replica) server:
# TYPE DATABASE USER ADDRESS METHOD host replication all slave_ip/32 trust
Restart PostgreSQL to apply these changes:
sudo systemctl restart postgresql
Step 2: Prepare the Master for Replication
Create a replication user on the master:
CREATE ROLE replicator REPLICATION LOGIN ENCRYPTED PASSWORD 'yourpassword';
Step 3: Back Up the Master Database
Use pg_basebackup
to create a base backup of the master database:
pg_basebackup -h master_ip -D /var/lib/postgresql/data -U replicator -vP -W
Step 4: Configure the Replica Server
Copy the master's data directory to the replica (ensure PostgreSQL is not running on the replica). Then, configure the postgresql.conf
similarly to the master, but tailored for the replica settings.
Create a recovery.conf
file (or use standby.signal
and edit postgresql.conf
directly in versions after PostgreSQL 12) in the PostgreSQL data directory with the following content:
standby_mode = 'on' primary_conninfo = 'host=master_ip port=5432 user=replicator password=yourpassword' trigger_file = '/tmp/postgresql.trigger.5432'
Start the PostgreSQL service on the replica:
sudo systemctl start postgresql
Step 5: Testing and Maintenance
Monitor the replication lag and ensure the systems are in sync. Use tools like pg_stat_replication
and repmgr
for monitoring and management.
This setup provides a basic geo-replication solution. For more complex requirements, consider solutions like Patroni, which extends PostgreSQL's built-in replication capabilities to provide automatic failover and easier management of replication clusters.
Was this content helpful?
Other Common PostgreSQL Questions (and Answers)
- How do you manage Postgres replication lag?
- How can I limit the number of rows updated in a PostgreSQL query?
- How does sharding work in PostgreSQL?
- How do you limit the number of rows deleted in PostgreSQL?
- How do you use the PARTITION OVER clause in PostgreSQL?
- What are PostgreSQL replication slots and how do they work?
- How can you partition an existing table in PostgreSQL?
- How do you partition a table by multiple columns in PostgreSQL?
- How do you check the replication status in PostgreSQL?
- What are the scaling limits of PostgreSQL?
- How do you scale Azure PostgreSQL?
- How do you use the limit clause in PostgreSQL to get the top N rows of a query result?
Free System Design on AWS E-Book
Download this early release of O'Reilly's latest cloud infrastructure e-book: System Design on AWS.
Switch & save up to 80%
Dragonfly is fully compatible with the Redis ecosystem and requires no code changes to implement. Instantly experience up to a 25X boost in performance and 80% reduction in cost