[Answered] How do you set up replication in PostgreSQL?

Answer

Replication in PostgreSQL is a process that allows you to copy and distribute database objects or data from one database server (the primary) to one or more database servers (the standbys). The primary goal of replication is to improve the availability, failover capabilities, and read scalability. PostgreSQL supports several types of replication, but we'll focus on setting up streaming replication, which is a widely used method for achieving high availability.

Requirements:

Before proceeding, ensure both the primary and standby servers have PostgreSQL installed and configured with the same major version.

Step 1: Configure the Primary Server

Edit the postgresql.conf file on the primary server:

listen_addresses = '*' # Listen on all interfaces
wal_level = replica    # Required for streaming replication
max_wal_senders = 5    # Max number of WAL sender processes
archive_mode = on      # Enable archiving
archive_command = 'cp %p /path/to/archive/%f' # Archive command (adjust accordingly)

Also, modify the pg_hba.conf file to allow connections from the standby server(s):

# Type  Database        User            Address                 Method
host    replication     all             <standby_IP_address>/32 md5

Restart PostgreSQL to apply these changes.

Step 2: Prepare the Standby Server

On the primary server, use the pg_basebackup utility to create a base backup of the primary server's data directory:

pg_basebackup -h <primary_IP_address> -D /var/lib/postgresql/12/main -U replicator -P -v -R

This command will clone the primary server's data directory into the specified directory on the standby server. Replace /var/lib/postgresql/12/main with your actual data directory path.

Step 3: Configure the Standby Server

On the standby server, make sure the postgresql.conf includes the following lines:

hot_standby = on # Allows the standby to accept read-only queries

Additionally, create a recovery.conf file inside the standby's data directory with the following content:

standby_mode = 'on'
primary_conninfo = 'host=<primary_IP_address> port=5432 user=replicator password=yourpassword'

This tells the standby server where to connect to the primary.

Step 4: Start the Standby Server

Finally, start the PostgreSQL service on the standby server. It should connect to the primary and start replicating changes.

Verification

You can verify the replication status by running the following command on the primary server:

SELECT * FROM pg_stat_replication;

This query returns information about each replication connection to the primary server.

Conclusion

Streaming replication in PostgreSQL is a powerful feature for creating high-availability setups. This basic guide covers setting up a simple streaming replication configuration. Depending on your requirements, you might need to adjust configurations, such as handling failover automatically with tools like Repmgr or Patroni.