Disaster Recovery 101: Architecting Auto-Failovers for PostgreSQL and C++ Deployments on OVH

Establishing a High-Availability PostgreSQL Cluster with Patroni and Keepalived

For mission-critical applications, a single PostgreSQL instance is a single point of failure. Architecting for high availability (HA) necessitates a robust failover mechanism. This section details the setup of a PostgreSQL cluster leveraging Patroni for automated leader election and PostgreSQL replication, complemented by Keepalived for virtual IP (VIP) management, ensuring seamless failover for our C++ application clients.

We’ll deploy this across multiple OVH instances, simulating distinct availability zones. The core components are:

PostgreSQL: The database itself, configured for streaming replication.
Patroni: A template for PostgreSQL HA. It manages PostgreSQL instances, handles leader election using a distributed configuration store (etcd, ZooKeeper, or Consul), and orchestrates failover. We’ll use etcd for this example due to its simplicity and widespread adoption.
etcd: The distributed key-value store Patroni relies on for leader election and cluster state management.
Keepalived: Provides a virtual IP address (VIP) that floats between the primary and standby PostgreSQL nodes. Clients connect to the VIP, abstracting the underlying database server.

Prerequisites and Initial Setup

Ensure you have at least three OVH instances. For this guide, we’ll assume the following:

Instance 1 (etcd-01): Runs etcd and Patroni.
Instance 2 (pg-01): Runs PostgreSQL (initial primary) and Patroni.
Instance 3 (pg-02): Runs PostgreSQL (initial standby) and Patroni.
Instance 4 (pg-03): Runs PostgreSQL (initial standby) and Patroni.
All instances have a static IP address assigned.
SSH access to all instances.
Root or sudo privileges.

1. Install etcd on etcd-01:

Installing etcd

On etcd-01:

On Debian/Ubuntu

Download the latest etcd release, extract, and install.

On CentOS/RHEL

Use the package manager.

Configuring etcd

Create an etcd configuration file (e.g., /etc/etcd/etcd.conf.yml) on etcd-01. This example sets up a single-node etcd cluster for simplicity, but in production, you’d want a multi-node etcd cluster for etcd’s own HA.

etcd Configuration (etcd.conf.yml)

name: etcd-01
data-dir: /var/lib/etcd
listen-client-urls: http://0.0.0.0:2379
advertise-client-urls: http://:2379
listen-peer-urls: http://0.0.0.0:2380
initial-advertise-peer-urls: http://:2380
initial-cluster: etcd-01=http://:2380
initial-cluster-token: my-etcd-cluster-token
initial-cluster-state: new

Replace <ETCD_NODE_IP> with the actual IP address of etcd-01. Start and enable the etcd service:

sudo systemctl start etcd
sudo systemctl enable etcd

Deploying PostgreSQL and Patroni

On each PostgreSQL node (pg-01, pg-02, pg-03), install PostgreSQL and Patroni. We’ll use Python 3 and pip for Patroni.

Installing PostgreSQL and Patroni

On each PostgreSQL node:

# Install PostgreSQL (version specific, e.g., PostgreSQL 14)
sudo apt update
sudo apt install postgresql postgresql-contrib postgresql-14-pglogical -y # pglogical is useful for advanced replication scenarios

# Install Patroni and dependencies
sudo apt install python3 python3-pip python3-venv -y
sudo pip3 install --upgrade pip
sudo pip3 install "patroni[etcd]" psycopg2-binary

Configuring Patroni

Create a Patroni configuration file (e.g., /etc/patroni/patroni.yml) on each PostgreSQL node. This configuration points to the etcd cluster and defines PostgreSQL settings.

Patroni Configuration (patroni.yml)

# Global settings
scope: my_postgres_cluster
namespace: /service/ # Prefix for etcd keys

# etcd configuration
etcd:
  host: :2379 # IP of your etcd instance
  protocol: http

# PostgreSQL configuration
postgresql:
  listen: 0.0.0.0:5432
  connect_address: :5432 # IP of the current node
  data_dir: /var/lib/postgresql/14/main # Adjust path based on your PostgreSQL installation
  pgpass: /etc/patroni/pgpass # Path to pgpass file for Patroni to connect to other nodes
  replication:
    username: replicator
    password: your_replication_password
    ssl: false # Set to true in production with proper certs
  parameters:
    max_connections: 100
    shared_buffers: 256MB
    effective_cache_size: 768MB
    maintenance_work_mem: 64MB
    wal_level: replica
    wal_sync_method: fsync
    wal_compression: off
    wal_buffers: 16MB
    backend_flush_after: 200MB
    checkpoint_completion_target: 0.9
    # ... other PostgreSQL parameters
  replication_mode: async # or sync, depending on your HA requirements

# Restapi configuration (for Patroni's API)
restapi:
  listen: 0.0.0.0:8008
  connect_address: :8008

# Tags for node identification
tags:
  nofailover: false
  clonefrom: false

# Replication settings for Patroni
replication:
  synchronous_mode: false # Set to true if using synchronous replication
  synchronous_node_count: 0 # Number of synchronous replicas required

# Logging
log:
  level: INFO
  dir: /var/log/patroni
  file: patroni.log

Important:

Replace <ETCD_NODE_IP> with the IP of etcd-01.
Replace <NODE_IP> with the actual IP address of the PostgreSQL node where this configuration is being applied.
Ensure the data_dir matches your PostgreSQL installation.
Create the /etc/patroni/pgpass file with appropriate permissions (chmod 0600) and content for the replicator user.
Create the /var/log/patroni directory and ensure the Patroni user has write permissions.

Creating the Replicator User and pgpass

On each PostgreSQL node, before starting Patroni, you need to create the replication user and the pgpass file. The first node to start will initialize the cluster.

On pg-01 (the intended initial primary):

# Create the replicator user and grant necessary privileges
sudo -u postgres psql -c "CREATE USER replicator WITH REPLICATION PASSWORD 'your_replication_password';"

# Create pgpass file for Patroni
sudo mkdir -p /etc/patroni
sudo chown postgres:postgres /etc/patroni
sudo su - postgres -c "echo '*:*:*:replicator:your_replication_password' > /etc/patroni/pgpass"
sudo chmod 0600 /etc/patroni/pgpass

# Create log directory
sudo mkdir -p /var/log/patroni
sudo chown postgres:postgres /var/log/patroni

On pg-02 and pg-03, you’ll also create the pgpass file and log directory, but the replicator user creation might be handled by Patroni during the initial bootstrap if it’s not already present. It’s safer to create it beforehand on all nodes.

Starting Patroni and PostgreSQL

Create a systemd service file for Patroni (e.g., /etc/systemd/system/patroni.service) on each PostgreSQL node.

Patroni Systemd Service File

[Unit]
Description=Patroni PostgreSQL High-Availability
After=network.target

[Service]
User=postgres
Group=postgres
ExecStart=/usr/local/bin/patroni /etc/patroni/patroni.yml
Restart=on-failure
RestartSec=5s

[Install]
WantedBy=multi-user.target

Reload systemd, start, and enable Patroni:

sudo systemctl daemon-reload
sudo systemctl start patroni
sudo systemctl enable patroni

Patroni will now start PostgreSQL on the first node it initializes (usually pg-01), configure it as the primary, and register its state in etcd. On other nodes, Patroni will detect the existing primary, configure PostgreSQL as a replica, and start it.

Verifying the Cluster State

Check the status of Patroni and PostgreSQL services:

sudo systemctl status patroni
sudo systemctl status postgresql # This might be managed by patroni, check logs

You can also query etcd directly to see the cluster state:

curl http://:2379/v2/keys/service/my_postgres_cluster/global/leader

This should return information about the current leader node. You can also check the Patroni API on each node:

curl http://:8008/patroni

Implementing Virtual IP with Keepalived

Keepalived provides a floating VIP. We’ll configure it on the PostgreSQL nodes to ensure clients always connect to the active primary.

Installing Keepalived

On each PostgreSQL node (pg-01, pg-02, pg-03):

sudo apt update
sudo apt install keepalived -y

Configuring Keepalived

Edit the Keepalived configuration file (/etc/keepalived/keepalived.conf) on each PostgreSQL node. This setup uses VRRP (Virtual Router Redundancy Protocol) to manage the VIP.

Keepalived Configuration (keepalived.conf)

On the primary node (e.g., pg-01):

vrrp_script check_postgres {
    script "/usr/local/bin/check_postgres_status.sh"
    interval 2
    weight 20
    fall 2
    rise 2
}

vrrp_instance VI_1 {
    state BACKUP # Start as BACKUP on all nodes except the initial MASTER
    interface eth0 # Replace with your actual network interface
    virtual_router_id 51
    priority 100 # Higher priority for the initial MASTER
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass your_vrrp_password
    }
    virtual_ipaddress {
        192.168.1.100/24 dev eth0 # Replace with your desired VIP and interface
    }
    track_script {
        check_postgres
    }
    notify_master "/usr/local/bin/patroni_promote.sh"
    notify_backup "/usr/local/bin/patroni_demote.sh"
    notify_fault "/usr/local/bin/patroni_stop.sh"
}

On the standby nodes (e.g., pg-02, pg-03):

vrrp_script check_postgres {
    script "/usr/local/bin/check_postgres_status.sh"
    interval 2
    weight 20
    fall 2
    rise 2
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0 # Replace with your actual network interface
    virtual_router_id 51
    priority 90 # Lower priority for standby nodes
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass your_vrrp_password
    }
    virtual_ipaddress {
        192.168.1.100/24 dev eth0 # Replace with your desired VIP and interface
    }
    track_script {
        check_postgres
    }
    notify_master "/usr/local/bin/patroni_promote.sh"
    notify_backup "/usr/local/bin/patroni_demote.sh"
    notify_fault "/usr/local/bin/patroni_stop.sh"
}

Key points:

virtual_router_id must be the same across all nodes in the VRRP group.
priority determines which node becomes MASTER. Higher priority wins.
virtual_ipaddress is the VIP that clients will connect to.
interface must be the correct network interface on your OVH instances.
auth_pass must be identical on all nodes.
state BACKUP is crucial for initial setup. The node with the highest priority and a passing script will transition to MASTER.

Custom Scripts for Keepalived and Patroni Integration

Keepalived needs scripts to check PostgreSQL health and to notify Patroni during state transitions. These scripts should be placed in /usr/local/bin/ and made executable.

check_postgres_status.sh

#!/bin/bash

PG_USER="postgres"
PG_HOST="127.0.0.1"
PG_PORT="5432"
PG_DB="postgres"

# Check if Patroni is running and healthy
if ! curl -s "http://127.0.0.1:8008/patroni" | grep -q '"state": "running"'; then
    exit 1
fi

# Check if PostgreSQL is running and accessible
if ! pg_isready -h $PG_HOST -p $PG_PORT -U $PG_USER -d $PG_DB &>/dev/null; then
    exit 1
fi

# Check if this node is the leader (only relevant for MASTER state)
# This check is implicitly handled by Patroni's API state.
# If Patroni reports 'running' and PostgreSQL is accessible, it's sufficient for Keepalived.

exit 0

patroni_promote.sh

#!/bin/bash
# This script is called by Keepalived when this node becomes MASTER.
# Patroni's API will handle the actual promotion if needed, but this script
# can be used to perform additional actions or simply confirm the state.

# In a typical Patroni setup, Keepalived's role is to ensure the VIP is on the
# node that Patroni has elected as leader. Patroni itself manages PostgreSQL promotion.
# This script might be simplified or used for logging/external notifications.

echo "Node is now MASTER. Patroni should be managing PostgreSQL state."
exit 0

patroni_demote.sh

#!/bin/bash
# This script is called by Keepalived when this node transitions to BACKUP.
# Patroni will handle demoting the PostgreSQL instance if it was previously MASTER.

echo "Node is now BACKUP. Patroni should be managing PostgreSQL state."
exit 0

patroni_stop.sh

#!/bin/bash
# This script is called by Keepalived when this node enters FAULT state.
# This usually means Keepalived itself has failed or the node is unreachable.
# Patroni should ideally stop PostgreSQL gracefully if possible.

echo "Node is in FAULT state. Patroni should manage PostgreSQL shutdown."
exit 0

Make these scripts executable:

sudo chmod +x /usr/local/bin/check_postgres_status.sh
sudo chmod +x /usr/local/bin/patroni_promote.sh
sudo chmod +x /usr/local/bin/patroni_demote.sh
sudo chmod +x /usr/local/bin/patroni_stop.sh

Starting Keepalived

On each PostgreSQL node:

sudo systemctl start keepalived
sudo systemctl enable keepalived

Monitor the Keepalived logs (e.g., /var/log/syslog or journalctl -u keepalived) for state transitions and VIP assignment. The VIP should appear on the node that Patroni has elected as the leader.

Integrating C++ Application Clients

Your C++ application clients should be configured to connect to the PostgreSQL VIP (e.g., 192.168.1.100:5432) instead of a specific PostgreSQL server’s IP address. This abstracts the underlying database infrastructure.

Connection String Example

When using a PostgreSQL client library (like libpq), the connection string would look like this:

"host=192.168.1.100 port=5432 dbname=your_db user=your_user password=your_password"

Ensure your C++ application’s database connection logic is parameterized to accept the host and port, allowing you to easily update it to point to the VIP.

Testing Failover

To test the automated failover:

Identify the current primary PostgreSQL node (check the VIP or Patroni API).
Gracefully stop the Patroni service on the primary node: sudo systemctl stop patroni.
Observe the logs on the other PostgreSQL nodes. Patroni on a standby node should detect the primary’s failure, initiate a leader election, and promote itself to become the new primary.
Keepalived should detect the loss of the primary node (via the check_postgres_status.sh script failing) and move the VIP to the new primary node.
Verify that your C++ application can still connect and perform operations using the VIP.

For a more aggressive test, you can simulate a network failure or forcefully reboot the primary node. Always ensure you have a robust backup strategy in place before performing such tests in a production environment.

Production Considerations and Enhancements

This setup provides a solid foundation for PostgreSQL HA. For production deployments, consider the following:

Multi-Node etcd Cluster: For etcd’s own high availability, deploy a 3 or 5-node etcd cluster.
Synchronous Replication: Configure PostgreSQL and Patroni for synchronous replication if zero data loss is a strict requirement. This will impact write performance.
Network Configuration: Ensure proper firewall rules are in place to allow communication between etcd nodes, PostgreSQL nodes, and Keepalived’s VRRP traffic (protocol 112).
Monitoring: Implement comprehensive monitoring for Patroni, etcd, PostgreSQL, and Keepalived. Tools like Prometheus with exporters are highly recommended.
Backups: Regularly back up your PostgreSQL data using tools like pg_dump or pg_basebackup.
Security: Use SSL for etcd and PostgreSQL communication. Secure Keepalived authentication.
Testing: Regularly test failover scenarios to ensure the system behaves as expected.
PostgreSQL Version: Ensure all nodes are running the same PostgreSQL version.
Resource Allocation: Adequately provision CPU, memory, and disk I/O for your PostgreSQL instances, especially for the primary node.
OVH Specifics: Understand OVH’s network configuration, including floating IPs and internal network segmentation, which might influence your VIP strategy. You might consider using OVH’s managed PostgreSQL service if it meets your requirements and simplifies HA management.