[Answered] How do you set up geo-replication in PostgreSQL?

Answer

Geo-replication in PostgreSQL refers to the practice of copying and maintaining database objects across multiple geographical locations in real-time. This setup enhances data availability, disaster recovery, and read scalability. Here's a comprehensive guide on setting up geo-replication using PostgreSQL streaming replication.

Prerequisites

PostgreSQL installed on at least two servers located in different geographical locations.
Network connectivity between these servers.
A superuser role or similar permissions on both servers.

Step 1: Configure the Master Server

On the primary (master) server, edit the postgresql.conf file:

listen_addresses = '*'                   # Listen on all interfaces.
wal_level = replica                      # Minimum required for replication.
max_wal_senders = 5                      # Max number of walsender processes.
wal_keep_segments = 64                   # In PG 10+ use wal_keep_size instead.

Also, modify the pg_hba.conf file to allow connections from the secondary (replica) server:

# TYPE  DATABASE        USER            ADDRESS                 METHOD
host    replication     all             slave_ip/32             trust

Restart PostgreSQL to apply these changes:

sudo systemctl restart postgresql

Step 2: Prepare the Master for Replication

Create a replication user on the master:

CREATE ROLE replicator REPLICATION LOGIN ENCRYPTED PASSWORD 'yourpassword';

Step 3: Back Up the Master Database

Use pg_basebackup to create a base backup of the master database:

pg_basebackup -h master_ip -D /var/lib/postgresql/data -U replicator -vP -W

Step 4: Configure the Replica Server

Copy the master's data directory to the replica (ensure PostgreSQL is not running on the replica). Then, configure the postgresql.conf similarly to the master, but tailored for the replica settings.

Create a recovery.conf file (or use standby.signal and edit postgresql.conf directly in versions after PostgreSQL 12) in the PostgreSQL data directory with the following content:

standby_mode          = 'on'
primary_conninfo      = 'host=master_ip port=5432 user=replicator password=yourpassword'
trigger_file = '/tmp/postgresql.trigger.5432'

Start the PostgreSQL service on the replica:

sudo systemctl start postgresql

Step 5: Testing and Maintenance

Monitor the replication lag and ensure the systems are in sync. Use tools like pg_stat_replication and repmgr for monitoring and management.

This setup provides a basic geo-replication solution. For more complex requirements, consider solutions like Patroni, which extends PostgreSQL's built-in replication capabilities to provide automatic failover and easier management of replication clusters.

Question: How do you set up geo-replication in PostgreSQL?

Answer

Prerequisites

Step 1: Configure the Master Server

Step 2: Prepare the Master for Replication

Step 3: Back Up the Master Database

Step 4: Configure the Replica Server

Step 5: Testing and Maintenance

Was this content helpful?

Next Steps

Other Common PostgreSQL Questions (and Answers)

Free System Design on AWS E-Book

Download this early release of O'Reilly's latest cloud infrastructure e-book: System Design on AWS.

Switch & save up to 80%