Introducing Dragonfly Cloud! Learn More

Question: How does failover work in PostgreSQL replication?

Answer

Failover in PostgreSQL replication is a critical aspect of database administration, ensuring high availability and disaster recovery. It involves automatically or manually switching to a standby server when the primary server fails or becomes unreachable.

Types of Replication in PostgreSQL

PostgreSQL supports several replication methods, including streaming replication and logical replication. Streaming replication is widely used for high availability setups, where a standby server keeps an up-to-date copy of the data from the primary server using WAL (Write-Ahead Logging) files.

Failover Process

Automatic Failover

Automatic failover can be implemented using tools like Patroni, which wraps around PostgreSQL and manages automatic failover and configuration. Patroni uses a distributed configuration store (like etcd, ZooKeeper, or Consul) for leader election and state management.

# Example Patroni configuration snippet for enabling automatic failover scope: postgres namespace: /db/ name: pg_node_1 restapi: listen: 0.0.0.0:8008 connect_address: 192.168.100.100:8008 etcd: hosts: 192.168.100.101:2379,192.168.100.102:2379,192.168.100.103:2379 bootstrap: dcs: ttl: 30 loop_wait: 10 retry_timeout : 10 maximum_lag_on_failover: 1048576 postgresql: use_pg_rewind: true

In this setup, if the primary server fails, Patroni will automatically promote one of the standby servers to be the new primary, based on pre-configured rules such as replication lag.

Manual Failover

Manual failover involves promoting a standby server to be the primary manually. This can be necessary in situations where you need to perform maintenance on the primary server or when automatic failover is not configured.

To manually promote a PostgreSQL standby server, you can use the pg_ctl command:

pg_ctl promote -D /path/to/standby/data/directory

This command converts the standby into a primary server, allowing it to accept write operations. After manual promotion, you need to ensure that other standbys (if any) start replicating from the new primary and reconfigure your application to connect to the new primary.

Conclusion

Failover in PostgreSQL replication is essential for maintaining data availability and integrity. Whether using automatic tools like Patroni or performing manual promotions, understanding and properly configuring failover processes are crucial for any production-grade PostgreSQL deployment.

Was this content helpful?

White Paper

Free System Design on AWS E-Book

Download this early release of O'Reilly's latest cloud infrastructure e-book: System Design on AWS.

Free System Design on AWS E-Book

Start building today 

Dragonfly is fully compatible with the Redis ecosystem and requires no code changes to implement.