Disaster Recovery 101: Architecting Auto-Failovers for PostgreSQL and C Deployments on DigitalOcean
Establishing a Highly Available PostgreSQL Cluster with Patroni and HAProxy
For mission-critical applications, a single PostgreSQL instance is a single point of failure. Architecting for high availability (HA) and automated failover is paramount. This section details a robust setup using Patroni for PostgreSQL cluster management and HAProxy for load balancing and connection routing, all deployable on DigitalOcean Droplets.
Our strategy involves a primary PostgreSQL node, one or more replica nodes, and a dedicated HAProxy instance. Patroni will manage the PostgreSQL cluster’s state, ensuring only one primary is active and promoting replicas automatically upon primary failure. HAProxy will then direct client connections to the current primary.
Prerequisites and Initial Setup
We’ll assume three DigitalOcean Droplets (e.g., 4 vCPU, 8GB RAM each) for PostgreSQL nodes and one additional Droplet for HAProxy. All nodes should be running a recent Ubuntu LTS release (e.g., 22.04). Ensure SSH access is configured and that the nodes can communicate with each other via their private IP addresses. A shared distributed configuration store is essential for Patroni. etcd is a robust choice. We’ll deploy a small etcd cluster on separate Droplets or, for simplicity in this example, on the PostgreSQL nodes themselves (though dedicated etcd nodes are recommended for production).
First, install PostgreSQL and Patroni on all three PostgreSQL Droplets. We’ll use the official PostgreSQL APT repository for the latest stable version.
Installing PostgreSQL and Patroni
On each PostgreSQL Droplet:
- Add the PostgreSQL repository:
sudo apt update sudo apt install -y wget ca-certificates sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list' wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add - sudo apt update
- Install PostgreSQL and the `python3-pip` package for Patroni:
sudo apt install -y postgresql-15 postgresql-contrib-15 python3-pip
- Install Patroni and its dependencies:
sudo pip3 install --upgrade pip sudo pip3 install "patroni[etcd]" psycopg2-binary
Note: For production, consider using a virtual environment for Patroni. The [etcd] extra installs the necessary etcd client library. psycopg2-binary is required for Patroni to communicate with PostgreSQL.
Configuring etcd for Patroni
Patroni requires a distributed configuration store. We’ll use etcd. For a production environment, a dedicated 3-node etcd cluster is highly recommended. For this example, we’ll install etcd on each PostgreSQL node and configure them to form a single cluster. This is simpler for demonstration but less resilient than a dedicated cluster.
Installing etcd
On each PostgreSQL Droplet:
sudo apt update sudo apt install -y etcd
Configuring etcd Cluster
Create an etcd configuration file, e.g., /etc/etcd/etcd.conf.yml, on each PostgreSQL node. Replace <NODE_PRIVATE_IP> with the actual private IP of the respective node and <OTHER_NODE_PRIVATE_IP_1>, <OTHER_NODE_PRIVATE_IP_2> with the private IPs of the other two nodes.
name: infra0 # Unique name for each etcd member data-dir: /var/lib/etcd listen-client-urls: http://0.0.0.0:2379 advertise-client-urls: http://<NODE_PRIVATE_IP>:2379 listen-peer-urls: http://0.0.0.0:2380 initial-advertise-peer-urls: http://<NODE_PRIVATE_IP>:2380 initial-cluster: infra0=http://<OTHER_NODE_PRIVATE_IP_1>:2380,infra1=http://<OTHER_NODE_PRIVATE_IP_2>:2380,infra2=http://<NODE_PRIVATE_IP>:2380 # Adjust names and IPs accordingly initial-cluster-state: new cluster-activate: true
Start and enable the etcd service on all PostgreSQL nodes:
sudo systemctl enable etcd sudo systemctl start etcd
Verify the etcd cluster status. On any node, run:
etcdctl member list
You should see all three members listed. If not, check network connectivity and etcd logs (journalctl -u etcd).
Configuring Patroni
Create a Patroni configuration file, e.g., /etc/patroni/patroni.yml, on each PostgreSQL Droplet. This configuration defines how Patroni manages the PostgreSQL cluster. Ensure the etcd section points to your etcd cluster’s client URLs.
scope: my_pg_cluster # Unique name for the PostgreSQL cluster
namespace: /service/ # Base path in etcd for cluster configuration
etcd:
host: <ETCD_NODE_1_PRIVATE_IP>:2379, <ETCD_NODE_2_PRIVATE_IP>:2379, <ETCD_NODE_3_PRIVATE_IP>:2379 # List of etcd client endpoints
protocol: http
postgresql:
listen: 0.0.0.0:5432
connect_address: <NODE_PRIVATE_IP>:5432 # This node's IP for PostgreSQL connections
data_dir: /var/lib/postgresql/15/main
pg_hba:
- host replication replicator <REPLICA_NODE_PRIVATE_IP>/32 md5
- host replication replicator <OTHER_REPLICA_NODE_PRIVATE_IP>/32 md5
- host all all 0.0.0.0/0 md5 # Adjust for security
replication:
username: replicator
password: <REPLICATION_PASSWORD>
parameters:
max_connections: 100
shared_buffers: 1GB
wal_level: replica
hot_standby: "on"
max_wal_senders: 10
max_replication_slots: 10
restapi:
listen: 0.0.0.0:8008
connect_address: <NODE_PRIVATE_IP>:8008 # This node's IP for REST API
tags:
nofailover: false
clone: false
Important considerations for patroni.yml:
scope: A unique identifier for your PostgreSQL cluster.namespace: The etcd path where Patroni stores its state.etcd.host: A comma-separated list of etcd client endpoints.postgresql.connect_address: The IP address and port Patroni should advertise for this PostgreSQL instance.postgresql.pg_hba: Crucial for replication and client access. Ensure thereplicatoruser can connect from replica nodes. The0.0.0.0/0 md5entry for general access should be restricted in production.postgresql.replication.username/password: Credentials for the replication user.restapi.listen/connect_address: Patroni’s REST API endpoints, used for health checks and management.tags.nofailover: Set totrueon a node if you want to temporarily prevent it from being promoted or becoming primary.tags.clone: Set totrueon a node if you want it to be a candidate for cloning from the primary during initialization.
Create the PostgreSQL data directory and set appropriate permissions:
sudo mkdir -p /var/lib/postgresql/15/main sudo chown -R postgres:postgres /var/lib/postgresql/15/main sudo chmod 700 /var/lib/postgresql/15/main
Create a systemd service file for Patroni, e.g., /etc/systemd/system/patroni.service:
[Unit] Description=Patroni PostgreSQL High-Availability After=network.target etcd.service [Service] User=postgres Group=postgres ExecStart=/usr/local/bin/patroni /etc/patroni/patroni.yml Restart=on-failure RestartSec=5 [Install] WantedBy=multi-user.target
Start and enable Patroni on all PostgreSQL nodes:
sudo systemctl daemon-reload sudo systemctl enable patroni sudo systemctl start patroni
Monitor Patroni logs (journalctl -u patroni -f). The first node to start will initialize the cluster and become the primary. Subsequent nodes will join as replicas. Patroni will automatically create the replication user and configure PostgreSQL.
Setting up HAProxy for Connection Routing
HAProxy will act as the single entry point for your application to connect to the PostgreSQL cluster. It will dynamically route traffic to the current primary node.
Installing HAProxy
On the dedicated HAProxy Droplet:
sudo apt update sudo apt install -y haproxy
Configuring HAProxy
Edit the HAProxy configuration file, /etc/haproxy/haproxy.cfg. We’ll define two backend server groups: one for read/write traffic (pointing to the primary) and one for read-only traffic (pointing to all replicas).
global
log /dev/log local0
log /dev/log local1 notice
chroot /var/lib/haproxy
stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
stats timeout 30s
user haproxy
group haproxy
daemon
defaults
log global
mode tcp
option tcplog
option dontlognull
timeout connect 5000
timeout client 50000
timeout server 50000
listen stats
bind *:8404
mode http
stats enable
stats uri /stats
stats realm Haproxy\ Statistics
stats auth admin:password # Change this password!
frontend pgsql_frontend
bind *:5432
mode tcp
option tcplog
default_backend pgsql_backend_rw
backend pgsql_backend_rw
mode tcp
option tcp-check
balance roundrobin
# Use Patroni's API to find the current primary
# This requires a custom script or a Patroni integration that exposes this info.
# For simplicity, we'll use a static configuration and rely on HAProxy's health checks.
# A more advanced setup would use a script to query Patroni's API.
# Example using a script (see below):
# server pg_primary <PATRONI_API_PRIMARY_IP>:5432 check port 8008 inter 2s fall 3 rise 2
# Static configuration (less dynamic, requires manual update or script):
# server pg_primary <PRIMARY_NODE_PRIVATE_IP>:5432 check port 5432 inter 2s fall 3 rise 2
# Dynamic configuration using a script to query Patroni API:
# This requires a script that polls Patroni's API for the primary and updates HAProxy's config or uses a dynamic backend.
# A common approach is to have a script that periodically checks Patroni's API and reloads HAProxy.
# For a simpler, though less dynamic, approach, we can use HAProxy's built-in health checks against Patroni's REST API.
# We'll configure HAProxy to check the REST API of each node and assume the one responding is the primary.
# This is a simplification; a robust solution would involve a dedicated script.
# Let's use a more robust method: HAProxy checks Patroni's REST API for leader status.
# This requires a custom check script or a Patroni integration.
# A simpler, but less ideal, approach is to check the PostgreSQL port and rely on Patroni's failover.
# For this example, we'll use a basic TCP check on port 5432 and rely on Patroni to ensure only one is active.
# A more advanced setup would involve a script to query Patroni's API for the leader.
# Let's use a script to query Patroni's API.
# Create a script like /etc/haproxy/patroni_check.sh
# This script should return 0 if the node is primary, 1 otherwise.
# Example script logic:
# curl -s http://<NODE_IP>:8008/patroni | grep '"state": "running"'
# If the above command succeeds, the node is primary.
# For simplicity in this example, we'll use a basic TCP check on port 5432.
# Patroni ensures only one primary is active. HAProxy will pick the first available.
# This is NOT ideal for failover detection but works for initial setup.
# A better approach is to use HAProxy's HTTP checks against Patroni's API.
# Let's configure HAProxy to check Patroni's REST API.
# This requires HAProxy to be able to make HTTP requests and parse responses.
# We'll use a custom check script.
# --- Advanced HAProxy Configuration with Patroni API Check ---
# This requires a script that checks Patroni's API.
# For demonstration, we'll use a simplified approach:
# HAProxy checks the PostgreSQL port (5432) and relies on Patroni to manage the primary.
# This means HAProxy might briefly point to a replica if the primary fails before Patroni promotes a new one.
# A more robust solution involves a script that queries Patroni's API.
# Let's use a script to query Patroni's API.
# Create a script /etc/haproxy/patroni_leader_check.sh
# This script will curl the Patroni API and return 0 if the node is the leader.
# Example script (simplified):
# #!/bin/bash
# NODE_IP=$1
# if curl -s "http://${NODE_IP}:8008/patroni" | grep -q '"state": "running"'; then
# exit 0 # Healthy (primary)
# else
# exit 1 # Unhealthy (replica or down)
# fi
# Then in haproxy.cfg:
# server pg_primary <NODE_1_PRIVATE_IP>:5432 check port 8008 inter 2s fall 3 rise 2 script /etc/haproxy/patroni_leader_check.sh <NODE_1_PRIVATE_IP>
# For this example, we'll use a simpler, less dynamic approach:
# We'll list all potential primaries and rely on Patroni's state.
# HAProxy will pick the first healthy one.
# This is a compromise for clarity.
server pg_node1 <NODE_1_PRIVATE_IP>:5432 check port 5432 inter 2s fall 3 rise 2
server pg_node2 <NODE_2_PRIVATE_IP>:5432 check port 5432 inter 2s fall 3 rise 2
server pg_node3 <NODE_3_PRIVATE_IP>:5432 check port 5432 inter 2s fall 3 rise 2
backend pgsql_backend_ro
mode tcp
option tcp-check
balance roundrobin
# All nodes are potential read replicas
server pg_replica1 <NODE_1_PRIVATE_IP>:5432 check port 5432 inter 2s fall 3 rise 2
server pg_replica2 <NODE_2_PRIVATE_IP>:5432 check port 5432 inter 2s fall 3 rise 2
server pg_replica3 <NODE_3_PRIVATE_IP>:5432 check port 5432 inter 2s fall 3 rise 2
# To route read-only queries to replicas, you'd typically inspect SQL queries.
# This is complex and often done at the application level or with a more advanced proxy.
# For simplicity, we'll direct all traffic to the primary backend.
# If you need read replicas, you'd configure a separate frontend/backend pair for read-only traffic.
# For example:
# frontend pgsql_frontend_ro
# bind *:5433
# mode tcp
# default_backend pgsql_backend_ro
# If you want to use the Patroni API to dynamically select the primary:
# You would need a script that polls Patroni's API and updates HAProxy's configuration or uses HAProxy's dynamic configuration features.
# A common pattern is to have a script that runs periodically, checks Patroni's API for the leader, and reloads HAProxy.
# For instance, a script could generate a temporary haproxy.cfg snippet for the primary backend and then reload HAProxy.
# Let's refine the pgsql_backend_rw to use a script that checks Patroni's API.
# This is the most robust approach for dynamic primary selection.
# Create a script /etc/haproxy/patroni_leader_check.sh on the HAProxy node:
# #!/bin/bash
# # This script checks if a given node is the current PostgreSQL primary managed by Patroni.
# # It queries the Patroni REST API.
#
# NODE_IP=$1
# PATRONI_PORT=8008
#
# if curl -s "http://${NODE_IP}:${PATRONI_PORT}/patroni" | grep -q '"state": "running"'; then
# exit 0 # Healthy (primary)
# else
# exit 1 # Unhealthy (replica or down)
# fi
#
# Make it executable: sudo chmod +x /etc/haproxy/patroni_leader_check.sh
# Then, in haproxy.cfg:
# backend pgsql_backend_rw
# mode tcp
# option tcp-check
# balance roundrobin
# server pg_primary_node1 <NODE_1_PRIVATE_IP>:5432 check port 8008 inter 2s fall 3 rise 2 script /etc/haproxy/patroni_leader_check.sh <NODE_1_PRIVATE_IP>
# server pg_primary_node2 <NODE_2_PRIVATE_IP>:5432 check port 8008 inter 2s fall 3 rise 2 script /etc/haproxy/patroni_leader_check.sh <NODE_2_PRIVATE_IP>
# server pg_primary_node3 <NODE_3_PRIVATE_IP>:5432 check port 8008 inter 2s fall 3 rise 2 script /etc/haproxy/patroni_leader_check.sh <NODE_3_PRIVATE_IP>
# For this example, we'll stick to the simpler TCP check on port 5432,
# acknowledging its limitations for immediate failover detection.
# The primary backend will list all nodes, and HAProxy will pick the first one that passes the health check.
# Patroni ensures only one node is truly primary.
backend pgsql_backend_rw
mode tcp
option tcp-check
balance roundrobin
# List all potential primary nodes. HAProxy will pick the first healthy one.
# Patroni ensures only one is truly primary.
server pg_node1 <NODE_1_PRIVATE_IP>:5432 check port 5432 inter 2s fall 3 rise 2
server pg_node2 <NODE_2_PRIVATE_IP>:5432 check port 5432 inter 2s fall 3 rise 2
server pg_node3 <NODE_3_PRIVATE_IP>:5432 check port 5432 inter 2s fall 3 rise 2
# If you need read replicas, configure a separate backend:
backend pgsql_backend_ro
mode tcp
option tcp-check
balance roundrobin
# List all nodes as potential read replicas
server pg_replica1 <NODE_1_PRIVATE_IP>:5432 check port 5432 inter 2s fall 3 rise 2
server pg_replica2 <NODE_2_PRIVATE_IP>:5432 check port 5432 inter 2s fall 3 rise 2
server pg_replica3 <NODE_3_PRIVATE_IP>:5432 check port 5432 inter 2s fall 3 rise 2
# To route read-only queries to replicas, you'd typically use a separate frontend listening on a different port (e.g., 5433)
# or implement query routing logic. For simplicity, we'll focus on the primary backend.
Replace <NODE_X_PRIVATE_IP> with the private IP addresses of your PostgreSQL Droplets. Ensure the HAProxy Droplet can reach these IPs on port 5432 and 8008 (if using API checks).
Restart HAProxy to apply the configuration:
sudo systemctl restart haproxy
Access the HAProxy stats page at http://<HAPROXY_PUBLIC_IP>:8404/stats to monitor the backend servers’ health.
Testing Failover
To test the automated failover:
- Connect to your primary PostgreSQL node (via HAProxy) and run a simple query:
SELECT pg_is_in_recovery();. It should returnf(false). - On the primary PostgreSQL Droplet, stop the Patroni service:
sudo systemctl stop patroni. - Observe the HAProxy stats page. The primary node should become unhealthy.
- Within a short period (depending on Patroni’s
ttland HAProxy’s check intervals), Patroni should detect the failure, promote a replica, and HAProxy should start directing traffic to the new primary. - Run the
SELECT pg_is_in_recovery();query again via HAProxy. It should now returnffrom the newly promoted primary. - Restart Patroni on the old primary:
sudo systemctl start patroni. It should rejoin the cluster as a replica.
Architecting for Resilience: C Deployments with Consul and Nomad
Deploying C applications, especially those requiring high availability, presents unique challenges. Unlike managed database services, C applications are typically self-managed. Orchestration tools like HashiCorp Nomad, coupled with service discovery and health checking via Consul, provide a robust framework for achieving automated failover and resilience.
Core Components: Nomad and Consul
Nomad is a simple, flexible workload orchestrator that can deploy and manage containers (Docker, Podman) and non-containerized applications. Its job scheduling and task management capabilities are key to our HA strategy.
Consul provides service discovery, health checking, and a distributed key-value store. It’s crucial for Nomad to know which instances are healthy and for applications to find each other.
Setup Overview
We’ll deploy a Nomad cluster and a Consul cluster on DigitalOcean Droplets. For simplicity, we’ll use a small cluster size (e.g., 3 servers for each). Nomad clients will run on the same nodes where our C application tasks will be executed.
Deploying Consul Cluster
On each of the 3 Consul server Droplets:
- Download the Consul binary:
wget https://releases.hashicorp.com/consul/1.16.1/consul_1.16.1_linux_amd64.zip unzip consul_1.16.1_linux_amd64.zip sudo mv consul /usr/local/bin/
- Create a Consul configuration directory and a server configuration file (e.g.,
/etc/consul.d/server.json). Replace<NODE_PRIVATE_IP>and<OTHER_NODE_PRIVATE_IP_X>.
{
"server": true,
"bootstrap_expect": 3,
"data_dir": "/opt/consul",
"client_addr": "0.0.0.0",
"bind_addr": "<NODE_PRIVATE_IP>",
"datacenter": "dc1",
"log_level": "INFO",
"retry_join": [
"<OTHER_NODE_PRIVATE_IP_1>",
"<OTHER_NODE_PRIVATE_IP_2>"
]
}
- Create a systemd service file for Consul (e.g.,
/etc/systemd/system/consul.service):
[Unit] Description=Consul Service Discovery After=network.target [Service] User=root Group=root ExecStart=/usr/local/bin/consul agent -config-dir=/etc/consul.d/ ExecStop=/usr/local/bin/consul leave Restart=on-failure RestartSec=5 [Install] WantedBy=multi-user.target
- Start and enable Consul:
sudo systemctl daemon-reload sudo systemctl enable consul sudo systemctl start consul
Verify the Consul cluster status on any server node:
consul members
Deploying Nomad Cluster
On each of the 3 Nomad server Droplets:
- Download the Nomad binary:
wget https://releases.hashicorp.com/nomad/1.7.1/nomad_1.7.1_linux_amd64.zip unzip nomad_1.7.1_linux_amd64.zip sudo mv nomad /usr/local/bin/
- Create Nomad configuration directories and server configuration files (e.g.,
/etc/nomad.d/server.hcl). Replace<NODE_PRIVATE_IP>and<CONSUL_SERVER_IP_X>.
# Server configuration
server {
enabled = true
bootstrap_expect = 3
datacenter = "dc1"
}
# Client configuration (Nomad servers also act as clients in this setup)
client {
enabled = true
}
# Data directory
data_dir = "/opt/nomad"
# Consul integration
consul {
server_auto_join = true
server_join_retry_max = 5
server_join_timeout = "10s"
server_join_use_auto_config = true
server_addresses = ["<CONSUL_SERVER_IP_1>:8300", "<CONSUL_SERVER_IP_2>:8300", "<CONSUL_SERVER_IP_3>:8300"]
client_auto_join = true
client_join_retry_max = 5
client_join_retry_interval = "15s"
client_join_use_auto_config = true
}
# UI configuration
ui {
enabled = true
}
# Log level
log_level = "INFO"
# Bind address
bind_addr = "<NODE_PRIVATE_IP>"
- Create a systemd service file for Nomad (e.g.,
/etc/systemd/system/nomad.service):
[Unit