Question: How do you set up geo-replication in PostgreSQL?

Answer

Geo-replication in PostgreSQL refers to the practice of copying and maintaining database objects across multiple geographical locations in real-time. This setup enhances data availability, disaster recovery, and read scalability. Here's a comprehensive guide on setting up geo-replication using PostgreSQL streaming replication.

Prerequisites

  • PostgreSQL installed on at least two servers located in different geographical locations.
  • Network connectivity between these servers.
  • A superuser role or similar permissions on both servers.

Step 1: Configure the Master Server

On the primary (master) server, edit the postgresql.conf file:

listen_addresses = '*' # Listen on all interfaces. wal_level = replica # Minimum required for replication. max_wal_senders = 5 # Max number of walsender processes. wal_keep_segments = 64 # In PG 10+ use wal_keep_size instead.

Also, modify the pg_hba.conf file to allow connections from the secondary (replica) server:

# TYPE DATABASE USER ADDRESS METHOD host replication all slave_ip/32 trust

Restart PostgreSQL to apply these changes:

sudo systemctl restart postgresql

Step 2: Prepare the Master for Replication

Create a replication user on the master:

CREATE ROLE replicator REPLICATION LOGIN ENCRYPTED PASSWORD 'yourpassword';

Step 3: Back Up the Master Database

Use pg_basebackup to create a base backup of the master database:

pg_basebackup -h master_ip -D /var/lib/postgresql/data -U replicator -vP -W

Step 4: Configure the Replica Server

Copy the master's data directory to the replica (ensure PostgreSQL is not running on the replica). Then, configure the postgresql.conf similarly to the master, but tailored for the replica settings.

Create a recovery.conf file (or use standby.signal and edit postgresql.conf directly in versions after PostgreSQL 12) in the PostgreSQL data directory with the following content:

standby_mode = 'on' primary_conninfo = 'host=master_ip port=5432 user=replicator password=yourpassword' trigger_file = '/tmp/postgresql.trigger.5432'

Start the PostgreSQL service on the replica:

sudo systemctl start postgresql

Step 5: Testing and Maintenance

Monitor the replication lag and ensure the systems are in sync. Use tools like pg_stat_replication and repmgr for monitoring and management.

This setup provides a basic geo-replication solution. For more complex requirements, consider solutions like Patroni, which extends PostgreSQL's built-in replication capabilities to provide automatic failover and easier management of replication clusters.

Was this content helpful?

White Paper

Free System Design on AWS E-Book

Download this early release of O'Reilly's latest cloud infrastructure e-book: System Design on AWS.

Free System Design on AWS E-Book
Start building today

Dragonfly is fully compatible with the Redis ecosystem and requires no code changes to implement.