Disaster Recovery 101: Architecting Auto-Failovers for PostgreSQL and Python Deployments on Linode
Establishing a Highly Available PostgreSQL Cluster with Patroni
Achieving automated failover for PostgreSQL necessitates a robust cluster management solution. Patroni stands out as a leading choice, orchestrating PostgreSQL instances and managing high availability through distributed consensus. We’ll focus on a three-node setup for redundancy, leveraging etcd for cluster state management.
First, ensure etcd is installed and running on separate nodes or co-located with PostgreSQL nodes. For a production environment, etcd should be configured for TLS and have its own high availability setup. For this example, we’ll assume a basic etcd cluster accessible at etcd-node-1:2379, etcd-node-2:2379, and etcd-node-3:2379.
Patroni Configuration for PostgreSQL Nodes
Each PostgreSQL node will run a Patroni agent. The configuration file (e.g., patroni.yml) defines its role, connection details to etcd, and PostgreSQL parameters. Below is a representative configuration for a PostgreSQL node.
# patroni.yml
scope: my_pg_cluster
namespace: /service/
restapi:
listen: 0.0.0.0:8008
connect_address: 192.168.1.10:8008 # IP of this node's REST API
etcd:
hosts:
- etcd-node-1:2379
- etcd-node-2:2379
- etcd-node-3:2379
protocol: http # Use 'https' in production with TLS
username: # Optional etcd username
password: # Optional etcd password
postgresql:
listen: 0.0.0.0:5432
connect_address: 192.168.1.10:5432 # IP of this node's PostgreSQL
data_dir: /var/lib/postgresql/14/main
pg_hba:
- host all all 0.0.0.0/0 md5
parameters:
max_connections: 100
shared_buffers: 128MB
wal_level: replica
hot_standby: "on"
max_wal_senders: 10
max_replication_slots: 10
replication:
type: pg berdasarkan
master:
host: 192.168.1.10 # IP of the current master
port: 5432
user: replicator
password: your_replication_password
# Bootstrap configuration (only for the initial master)
bootstrap:
dcs:
ttl: 30
loop_wait: 10
retry_timeout: 10
maximum_lag_on_failover: 1048576 # 1MB
postgresql:
use_pg_rewind: true
use_slots: true
parameters:
recovery_target_timeline: 'latest'
Key parameters to note:
scope: A unique identifier for this PostgreSQL cluster.namespace: The etcd path where Patroni will store cluster state.restapi.listenandrestapi.connect_address: The address and port for Patroni’s REST API, used for health checks and management.etcd.hosts: List of etcd endpoints.postgresql.data_dir: The PostgreSQL data directory. Ensure this is correctly set.postgresql.pg_hba: Crucial for allowing replication and application connections. Adjust IP ranges as per your security policy.replication.type: Set topgbasedfor standard PostgreSQL replication.bootstrap.postgresql.use_pg_rewind: Essential for fast failovers by rewinding the old master to a consistent state.bootstrap.postgresql.use_slots: Utilizes PostgreSQL replication slots to prevent WAL file deletion on the primary before they are consumed by replicas.
Bootstrapping the Cluster
To initiate the cluster, start Patroni on one node with the bootstrap configuration enabled. This node will attempt to acquire a lock in etcd and initialize itself as the primary PostgreSQL server.
On the designated initial primary node:
sudo systemctl enable patroni sudo systemctl start patroni
Patroni will:
- Attempt to acquire a leader lock in etcd.
- If successful, initialize the PostgreSQL data directory (if empty).
- Start PostgreSQL as the primary.
- Register itself as the primary in etcd.
Adding Replicas and Configuring Failover
Once the primary is up and registered in etcd, start Patroni on the other nodes. These nodes will detect the existing primary via etcd and configure themselves as replicas, performing a base backup if necessary.
On replica nodes:
sudo systemctl enable patroni sudo systemctl start patroni
Patroni on replica nodes will:
- Discover the current primary from etcd.
- Configure PostgreSQL to replicate from the primary.
- Perform a base backup (e.g., using
pg_basebackup) if the data directory is empty. - Start PostgreSQL in standby mode.
- Periodically check the primary’s health via etcd.
When the primary fails (e.g., Patroni agent stops responding, PostgreSQL process dies), Patroni on the remaining nodes will detect this. A consensus mechanism (etcd’s leader election) will determine which replica is best suited to become the new primary (based on lag, etc.). The chosen replica will be promoted, and other nodes will reconfigure to replicate from the new primary.
Integrating Python Applications with Auto-Failover
Your Python applications need to be aware of the PostgreSQL cluster’s dynamic nature. Hardcoding a single database host is a recipe for disaster. The strategy involves using a load balancer or a service discovery mechanism that can query Patroni’s API.
Option 1: Using a Load Balancer (HAProxy)
HAProxy is an excellent choice for load balancing TCP connections and can be configured to health-check PostgreSQL instances via Patroni’s REST API. This allows HAProxy to direct traffic only to healthy primary instances.
First, install HAProxy on a dedicated server or co-locate it with your application servers. Configure HAProxy to monitor the Patroni REST API endpoints.
# haproxy.cfg
global
log /dev/log local0
log /dev/log local1 notice
daemon
defaults
log global
mode tcp
option tcplog
option dontlognull
timeout connect 5000
timeout client 50000
timeout server 50000
listen postgresql_cluster
bind *:5433 # Port your applications will connect to
mode tcp
option tcp-check
# Health check against Patroni's API to determine primary
tcp-check connect port 8008
tcp-check send GET /primary HTTP/1.0\r\n\r\n
tcp-check expect status 200
# Backend servers (all PostgreSQL nodes)
server pg_node1 192.168.1.10:5432 check port 8008 inter 2s fall 3 rise 2
server pg_node2 192.168.1.11:5432 check port 8008 inter 2s fall 3 rise 2
server pg_node3 192.168.1.12:5432 check port 8008 inter 2s fall 3 rise 2
In this HAProxy configuration:
mode tcp: HAProxy will operate at the transport layer, forwarding raw TCP packets.option tcp-check: Enables TCP-level health checks.tcp-check connect port 8008: HAProxy attempts to connect to the Patroni REST API port.tcp-check send GET /primary HTTP/1.0\r\n\r\n: Sends an HTTP GET request to the/primaryendpoint of Patroni. This endpoint returns the details of the current primary if the node is healthy and aware of the primary.tcp-check expect status 200: HAProxy expects a 200 OK HTTP status code from the/primaryendpoint. This signifies that the node is either the primary or a healthy replica aware of the primary.server ... check port 8008 inter 2s fall 3 rise 2: Configures each PostgreSQL node as a backend server. Thecheck port 8008combined with thetcp-checkdirectives ensures HAProxy only sends traffic to the actual PostgreSQL port (5432) of the node that is currently the primary.
Your Python application should now connect to your_haproxy_ip:5433. HAProxy will automatically route connections to the active PostgreSQL primary.
Option 2: Service Discovery with Python
For more dynamic environments or when HAProxy is not desired, your Python application can directly query Patroni’s API to find the current primary. This requires a custom connection logic within your application or a database connection pool library that supports dynamic host resolution.
Here’s a Python snippet demonstrating how to find the primary using the requests library:
import requests
import psycopg2
import time
PATRONI_API_URLS = [
"http://192.168.1.10:8008/primary",
"http://192.168.1.11:8008/primary",
"http://192.168.1.12:8008/primary",
]
def get_primary_db_connection_details():
"""
Queries Patroni API endpoints to find the current PostgreSQL primary.
Returns a dictionary with 'host' and 'port' or None if not found.
"""
for url in PATRONI_API_URLS:
try:
response = requests.get(url, timeout=2)
if response.status_code == 200:
data = response.json()
# Check if the 'role' is 'master' or if it's a replica reporting the master
# Patroni's /primary endpoint returns the master details.
# If a node is not the master, it might still respond with master info if it knows it.
# A more robust check might involve querying /cluster and checking roles.
# For simplicity, we assume /primary returns master info if available.
return {
"host": data.get("host"),
"port": data.get("port"),
"database": "your_db_name", # Add your database name
"user": "your_db_user", # Add your DB user
"password": "your_db_password" # Add your DB password
}
except requests.exceptions.RequestException:
# Ignore connection errors and try the next URL
continue
return None
def get_db_connection():
"""
Establishes a PostgreSQL connection, dynamically finding the primary.
Includes retry logic for failover scenarios.
"""
max_retries = 5
retry_delay = 5 # seconds
conn = None
db_details = None
for attempt in range(max_retries):
if conn is None:
db_details = get_primary_db_connection_details()
if db_details:
try:
print(f"Attempting to connect to primary at {db_details['host']}:{db_details['port']}...")
conn = psycopg2.connect(
host=db_details["host"],
port=db_details["port"],
database=db_details["database"],
user=db_details["user"],
password=db_details["password"]
)
print("Successfully connected to PostgreSQL.")
return conn
except psycopg2.OperationalError as e:
print(f"Connection failed: {e}")
conn = None # Ensure conn is None if connection fails
else:
print("Could not find primary database. Retrying...")
if conn is None:
print(f"Retrying connection in {retry_delay} seconds... (Attempt {attempt + 1}/{max_retries})")
time.sleep(retry_delay)
else:
# If we got here, conn is valid, so we should have returned already.
# This is a safeguard.
break
if conn is None:
raise Exception(f"Failed to connect to PostgreSQL after {max_retries} attempts.")
return conn
# Example usage:
if __name__ == "__main__":
try:
connection = get_db_connection()
cursor = connection.cursor()
cursor.execute("SELECT version();")
db_version = cursor.fetchone()
print(f"PostgreSQL version: {db_version[0]}")
cursor.close()
connection.close()
except Exception as e:
print(f"An error occurred: {e}")
This Python code:
- Iterates through a list of Patroni API endpoints.
- Makes a GET request to the
/primaryendpoint. - Parses the JSON response to extract the primary’s host and port.
- Uses
psycopg2to establish a connection. - Includes a retry mechanism with exponential backoff (simplified to fixed delay here) to handle transient connection issues during failover.
For production, consider integrating this logic into your database connection pool or ORM configuration. Libraries like SQLAlchemy can be configured with custom connection handlers or use connection strings that might be dynamically updated.
Monitoring and Alerting
Automated failover is only part of the solution. Comprehensive monitoring and alerting are critical to ensure the system is functioning as expected and to be notified of any issues that might require manual intervention.
Key Metrics to Monitor
- Patroni API Health: Regularly check the status of the
/healthendpoint on each Patroni node. A 200 OK indicates the agent is running and aware of the cluster state. - etcd Health: Monitor the health and performance of your etcd cluster. Unhealthy etcd can lead to cluster instability and failed failovers.
- PostgreSQL Replication Lag: Use Patroni’s API (e.g.,
/cluster) or direct SQL queries (pg_stat_replication) to monitor replication lag on standby servers. High lag can impact failover RPO. - PostgreSQL Instance Health: Monitor PostgreSQL process status, CPU, memory, disk I/O, and connection counts.
- HAProxy Statistics: If using HAProxy, monitor its backend server status and connection rates.
Tools like Prometheus with appropriate exporters (e.g., postgres_exporter, node_exporter, and custom Patroni/HAProxy exporters) are ideal for collecting and visualizing these metrics. Alerting can be configured via Alertmanager.
Example Prometheus Alert for Patroni Health
# prometheus.yml (alerting rules)
groups:
- name: postgresql_alerts
rules:
- alert: PatroniNodeUnhealthy
expr: up{job="patroni"} == 0
for: 5m
labels:
severity: critical
annotations:
summary: "Patroni agent on {{ $labels.instance }} is down."
description: "The Patroni agent on {{ $labels.instance }} has been down for 5 minutes. This may indicate a problem with the PostgreSQL node or the Patroni service itself."
- alert: PostgreSQLPrimaryNotElected
# This query checks if there's no node reporting itself as 'master' in the cluster state.
# It assumes Patroni's /cluster endpoint provides role information.
# Adjust the query based on your specific Patroni metrics exporter.
# A more direct approach might be to check if the 'patroni_role' metric is absent or not 'master' for any node.
expr: count(patroni_role{role="master"}) by (cluster) == 0
for: 10m
labels:
severity: critical
annotations:
summary: "No PostgreSQL primary elected in cluster {{ $labels.cluster }}."
description: "Patroni has failed to elect a primary for the PostgreSQL cluster {{ $labels.cluster }} for 10 minutes. Manual intervention may be required."
Implementing this architecture provides a resilient PostgreSQL deployment that can automatically recover from node failures, ensuring minimal downtime for your Python applications.