Question: What is the PostgreSQL replication protocol?

Answer

PostgreSQL uses a streaming replication protocol to ensure data is copied (replicated) from a primary server to one or more standby servers. This replication is crucial for high availability, load balancing, and for ensuring data safety in case of hardware failure or other issues.

Streaming Replication

At its core, streaming replication allows real-time copying of WAL (Write-Ahead Logging) records from the primary server to the standby servers. Here's a simplified overview:

  1. Enable Replication on the Primary Server: You need to edit postgresql.conf to set up WAL level to replica or logical, specify the maximum number of WAL senders, and configure connection permissions in pg_hba.conf.

    # postgresql.conf wal_level = replica max_wal_senders = 5 max_replication_slots = 5
    # pg_hba.conf # TYPE DATABASE USER ADDRESS METHOD host replication all 192.168.1.0/24 md5
  2. Create a Standby Signal File on the Standby Server: This file tells the server to start in standby mode.

    touch /var/lib/postgresql/data/standby.signal
  3. Start the Replication Process: Using the pg_basebackup tool or other methods like pg_receivewal, you can initialize the data directory on the standby server with the current state of the primary server.

    pg_basebackup -h primary_host -D /var/lib/postgresql/data/ -U replicator -vP -W
  4. Keep the Data Synchronized: Once initialized, the standby server continuously connects to the primary server, receives WAL records, and applies them to stay in sync.

Logical Replication

PostgreSQL also supports logical replication, which enables replication at the level of database objects (tables, sequences). This method allows for more flexibility, such as replicating only specific tables or employing different schemas between primary and standby servers.

To set up logical replication:

  1. Configure Publisher: On the primary server, mark the database or table as a publication.

    CREATE PUBLICATION my_publication FOR TABLE my_table;
  2. Configure Subscriber: On the standby server, subscribe to the publication from the primary server.

    CREATE SUBSCRIPTION my_subscription CONNECTION 'host=primary_host dbname=mydb user=replicator password=mysecret' PUBLICATION my_publication;

Logical replication provides the flexibility needed for certain use cases but requires more initial setup and has some limitations, such as not replicating certain schema changes automatically.

Conclusion

Streaming replication and logical replication are powerful features provided by PostgreSQL for ensuring data redundancy, high availability, and flexibility in handling data across multiple servers. The choice between streaming and logical replication depends on your specific needs regarding data consistency, system architecture, and operational requirements.

Was this content helpful?

White Paper

Free System Design on AWS E-Book

Download this early release of O'Reilly's latest cloud infrastructure e-book: System Design on AWS.

Free System Design on AWS E-Book
Start building today

Dragonfly is fully compatible with the Redis ecosystem and requires no code changes to implement.