Disaster Recovery 101: Architecting Auto-Failovers for PostgreSQL and WooCommerce Deployments on DigitalOcean

Establishing a Highly Available PostgreSQL Cluster with Patroni

For a robust WooCommerce deployment, a highly available PostgreSQL database is non-negotiable. We’ll architect an automated failover solution using Patroni, a template for PostgreSQL HA. This setup leverages etcd for distributed configuration and leader election, ensuring a single, active primary at any given time.

Our target is a three-node PostgreSQL cluster on DigitalOcean Droplets. One node will be the primary, and the other two will be replicas. Patroni will manage the state and orchestrate failovers.

Prerequisites

Three DigitalOcean Droplets (e.g., 4 vCPU, 8GB RAM) for PostgreSQL nodes.
One DigitalOcean Droplet for etcd cluster (minimum 2 vCPU, 4GB RAM). A 3-node etcd cluster is recommended for production HA. For simplicity in this example, we’ll use a single etcd instance, but this is NOT production-ready.
SSH access to all Droplets.
Basic understanding of Linux system administration and PostgreSQL.

Setting up the etcd Cluster

First, we need a reliable etcd cluster for Patroni to coordinate. Install etcd on your dedicated etcd Droplet(s).

Installation (Ubuntu/Debian)

Download the latest etcd release. Replace v3.5.9 with the current version.

ETCD_VER=v3.5.9
DOWNLOAD_URL=https://github.com/etcd-io/etcd/releases/download
curl -L ${DOWNLOAD_URL}/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz -o etcd-${ETCD_VER}-linux-amd64.tar.gz
tar xzvf etcd-${ETCD_VER}-linux-amd64.tar.gz
sudo mv etcd-${ETCD_VER}-linux-amd64/etcd /usr/local/bin/
sudo mv etcd-${ETCD_VER}-linux-amd64/etcdctl /usr/local/bin/
rm -rf etcd-${ETCD_VER}-linux-amd64*

Configuration

Create an etcd systemd service file. Adjust ETCD_NAME and INITIAL_CLUSTER for a multi-node setup. For a single node, INITIAL_CLUSTER can be omitted or point to itself.

sudo nano /etc/systemd/system/etcd.service

[Unit]
Description=etcd key-value store
Documentation=man:etcd(1)

[Service]
User=etcd
ExecStart=/usr/local/bin/etcd \
  --name etcd0 \
  --data-dir /var/lib/etcd \
  --listen-client-urls http://0.0.0.0:2379 \
  --advertise-client-urls http://<ETCD_NODE_IP>:2379 \
  --listen-peer-urls http://0.0.0.0:2380 \
  --initial-advertise-peer-urls http://<ETCD_NODE_IP>:2380 \
  --initial-cluster etcd0=http://<ETCD_NODE_IP>:2380 \
  --initial-cluster-token etcd-cluster-1 \
  --initial-cluster-state new

[Install]
WantedBy=multi-user.target

Replace <ETCD_NODE_IP> with the actual IP address of your etcd Droplet. Create the etcd data directory and start the service.

sudo mkdir -p /var/lib/etcd
sudo chown etcd:etcd /var/lib/etcd
sudo systemctl daemon-reload
sudo systemctl enable etcd
sudo systemctl start etcd

Installing and Configuring Patroni on PostgreSQL Nodes

Patroni needs to be installed on each PostgreSQL Droplet. We’ll use pip for installation.

Installation (Ubuntu/Debian)

sudo apt update
sudo apt install -y python3-pip python3-dev build-essential libpq-dev
sudo pip3 install --upgrade pip
sudo pip3 install "patroni[etcd]" psycopg2-binary

PostgreSQL Setup

Install PostgreSQL on each node. Ensure the PostgreSQL service is stopped and disabled, as Patroni will manage it.

sudo apt install -y postgresql postgresql-contrib
sudo systemctl stop postgresql
sudo systemctl disable postgresql

Patroni requires a dedicated PostgreSQL user for replication and management. Create this user and grant necessary privileges.

-- Connect to PostgreSQL as the default postgres user
sudo -u postgres psql

-- Inside psql prompt:
CREATE USER replicator WITH REPLICATION PASSWORD 'your_replication_password';
ALTER USER replicator CREATEDB;
\q

Patroni Configuration File

Create the Patroni configuration file on each PostgreSQL node. This file defines the connection to etcd, PostgreSQL settings, and replication parameters.

sudo mkdir -p /etc/patroni
sudo nano /etc/patroni/patroni.yml

# patroni.yml
scope: my_pg_cluster # Unique name for your cluster
namespace: /service/ # etcd namespace for this cluster

restapi:
  listen: 0.0.0.0:8008
  connect_address: <NODE_IP>:8008 # IP of this node

etcd:
  host: <ETCD_NODE_IP>:2379
  protocol: http

postgresql:
  listen: 0.0.0.0:5432
  connect_address: <NODE_IP>:5432
  data_dir: /var/lib/postgresql/14/main # Adjust version if needed
  bin_dir: /usr/lib/postgresql/14/bin # Adjust version if needed
  pg_hba:
    - host    replication     replicator      0.0.0.0/0               md5
    - host    all             all             0.0.0.0/0               md5
  replication:
    username: replicator
    password: 'your_replication_password'
  parameters:
    max_connections: 100
    shared_buffers: 1GB
    effective_cache_size: 3GB
    maintenance_work_mem: 256MB
    checkpoint_completion_target: 0.9
    wal_buffers: 16MB
    default_statistics_target: 100
    random_page_cost: 1.1
    effective_io_concurrency: 200
    work_mem: 16MB
    min_wal_size: 1GB
    max_wal_size: 4GB
    logging_collector: on
    log_directory: /var/log/postgresql
    log_filename: postgresql-%a.log
    log_min_duration_statement: 250ms
    log_checkpoints: on
    log_connections: on
    log_disconnections: on
    log_lock_waits: on
    log_temp_files: 0
    log_autovacuum_min_duration: 0
    log_error_verbosity: default
    log_line_prefix: '%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h '

# For initial bootstrap, one node needs to be configured to initialize PG
# This is typically done by starting Patroni on one node first.
# The 'initdb' section is only needed on the FIRST node that starts.
# initdb:
#   - encoding: UTF8
#   - locale: en_US.UTF-8
#   - data-checksums

# Tags can be used to influence leader selection.
# The node with the highest 'nofailover' tag value will be preferred as leader.
# tags:
#   nofailover: 1
#   clonefrom: 1

Replace <NODE_IP> with the IP address of the specific PostgreSQL Droplet and <ETCD_NODE_IP> with your etcd Droplet’s IP. Ensure the data_dir and bin_dir paths match your PostgreSQL installation. For the first node that will initialize the cluster, uncomment and configure the initdb section. For subsequent nodes, ensure initdb is commented out.

Starting Patroni and PostgreSQL

Create a systemd service file for Patroni on each PostgreSQL node.

sudo nano /etc/systemd/system/patroni.service

[Unit]
Description=Patroni PostgreSQL HA
After=network.target

[Service]
User=postgres
Group=postgres
ExecStart=/usr/local/bin/patroni /etc/patroni/patroni.yml
Restart=on-failure
RestartSec=5s

[Install]
WantedBy=multi-user.target

Now, start Patroni and enable it to run on boot. Start it on one node first, then the others.

sudo systemctl daemon-reload
sudo systemctl enable patroni
sudo systemctl start patroni

After starting Patroni on the first node (with initdb uncommented), it will initialize PostgreSQL and register itself in etcd. Then, start Patroni on the other nodes. They will detect the existing cluster, connect as replicas, and Patroni will manage their states.

Verifying the Cluster and Failover

You can check the status of the cluster via Patroni’s REST API or by querying etcd.

Checking Cluster Status

Use patronictl (installed with Patroni) to inspect the cluster.

patronictl -c /etc/patroni/patroni.yml list

This command should show your cluster name, the number of nodes, the current leader, and the status of each member.

Simulating a Failover

To test automatic failover, stop the Patroni service on the current primary node.

# On the current primary node:
sudo systemctl stop patroni

Within seconds, Patroni on the other nodes will detect the primary’s absence, elect a new leader from the available replicas, and promote it. You can monitor this process using patronictl list on any node.

Integrating with WooCommerce

WooCommerce applications typically connect to a single database host. To ensure seamless failover, we need a mechanism to abstract the database endpoint.

Using a Load Balancer/Proxy

The most robust approach is to use a load balancer or a database proxy that can dynamically update its backend based on Patroni’s leader information. HAProxy is a strong candidate.

HAProxy Configuration for PostgreSQL

Install HAProxy on a separate Droplet or one of the existing nodes (though a dedicated node is recommended for production). Configure HAProxy to point to the current PostgreSQL primary.

sudo apt install -y haproxy

Edit the HAProxy configuration file:

sudo nano /etc/haproxy/haproxy.cfg

frontend pgsql_frontend
    bind *:5432
    mode tcp
    default_backend pgsql_backend

backend pgsql_backend
    mode tcp
    balance roundrobin
    option httpchk GET /primary
    http-check expect status 200
    # This is where the magic happens: we'll use a script to update these servers
    server pg_primary <CURRENT_PRIMARY_IP>:5432 check port 8008 inter 2s
    server pg_replica1 <REPLICA1_IP>:5432
    server pg_replica2 <REPLICA2_IP>:5432

The key here is the option httpchk GET /primary and check port 8008. HAProxy will query the Patroni REST API on each PostgreSQL node. The node that responds with HTTP 200 for the /primary endpoint is the current leader. HAProxy will automatically mark it as available and route traffic to it.

You’ll need to restart HAProxy after changes:

sudo systemctl restart haproxy

Your WooCommerce application should now connect to the HAProxy IP address on port 5432. When a failover occurs, HAProxy will automatically redirect traffic to the new primary.

Advanced Considerations and Best Practices

etcd HA: For production, deploy a 3 or 5-node etcd cluster for etcd’s own high availability.
Monitoring: Implement comprehensive monitoring for PostgreSQL, Patroni, and etcd. Prometheus and Grafana are excellent choices.
Backups: Automate regular PostgreSQL backups (e.g., using pg_basebackup or tools like Barman) and store them off-site.
Network Segmentation: Use DigitalOcean’s VPC and firewall rules to restrict access to PostgreSQL and etcd ports only from necessary sources.
Connection Pooling: For high-traffic WooCommerce sites, consider a connection pooler like PgBouncer placed behind HAProxy to manage database connections efficiently.
Read Replicas: For read-heavy workloads, you can configure additional PostgreSQL replicas specifically for read traffic, separate from the HA failover pool.
Configuration Management: Use tools like Ansible or Terraform to automate the deployment and configuration of your PostgreSQL and etcd clusters.
Testing: Regularly test your failover procedures. Simulate various failure scenarios (node failure, network partition, etcd failure) to ensure your system behaves as expected.

By implementing this Patroni-based auto-failover solution for PostgreSQL and integrating it with HAProxy, you significantly enhance the resilience and availability of your WooCommerce deployment on DigitalOcean, ensuring minimal downtime during database failures.

Disaster Recovery 101: Architecting Auto-Failovers for PostgreSQL and WooCommerce Deployments on DigitalOcean

Establishing a Highly Available PostgreSQL Cluster with Patroni

Prerequisites

Setting up the etcd Cluster

Installation (Ubuntu/Debian)

Configuration

Installing and Configuring Patroni on PostgreSQL Nodes

Installation (Ubuntu/Debian)

PostgreSQL Setup

Patroni Configuration File

Starting Patroni and PostgreSQL

Verifying the Cluster and Failover

Checking Cluster Status

Simulating a Failover

Integrating with WooCommerce

Using a Load Balancer/Proxy

HAProxy Configuration for PostgreSQL

Advanced Considerations and Best Practices

Recent Posts

Top Categories

Our Products

Our Services