Disaster Recovery 101: Architecting Auto-Failovers for MongoDB and WooCommerce Deployments on OVH

Establishing a Robust MongoDB Replica Set for High Availability

A foundational element for any disaster recovery strategy involving MongoDB is a properly configured replica set, which provides data redundancy and automatic failover. For this architecture, we’ll assume a primary, a secondary, and an arbiter node. The arbiter holds no data but votes in elections, so the set still has a majority (two of three votes) to elect a primary when one data-bearing node fails. We’ll deploy the three nodes across different OVH Availability Zones so that a single-zone outage cannot take out that majority.
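The majority arithmetic behind that three-member layout can be sketched in a few lines of Python. The function names are illustrative only and are not part of any MongoDB tooling; the rule they encode is simply that an election needs a strict majority of the voting members.

```python
def election_majority(voting_members: int) -> int:
    """Votes needed to elect a primary: a strict majority of voting members."""
    return voting_members // 2 + 1


def tolerated_failures(voting_members: int) -> int:
    """Voting members that can be unreachable while an election can still succeed."""
    return voting_members - election_majority(voting_members)


# Compare a two-member set with the three-member layout used in this article.
for n in (2, 3):
    print(f"{n} voters: majority={election_majority(n)}, "
          f"tolerated failures={tolerated_failures(n)}")
```

With two voters the set tolerates no failures, which is why the arbiter is added as a third vote.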

First, ensure MongoDB is installed on your chosen OVH instances. The configuration file, typically located at /etc/mongod.conf, needs to be adjusted for replica set operation. Key parameters include replication.replSetName, net.bindIp, and net.port.

MongoDB Configuration Snippet

# /etc/mongod.conf

storage:
  dbPath: /var/lib/mongodb
  journal:
    enabled: true # Remove on MongoDB 6.1+, where journaling is always on and this option no longer exists
net:
  port: 27017
  bindIp: 0.0.0.0 # Or specific IPs for security
replication:
  replSetName: "rs0" # The name of your replica set
processManagement:
  fork: true # Omit if your systemd unit runs mongod in the foreground, as the Debian/Ubuntu packages do
  pidFilePath: /var/run/mongodb/mongod.pid
systemLog:
  destination: file
  path: /var/log/mongodb/mongod.log
  logAppend: true
security:
  keyFile: /etc/mongodb-keyfile # Path to your keyfile
  authorization: enabled

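Before starting the service, create the keyfile that replica set members use to authenticate to each other. Generate it once on one node:

openssl rand -base64 756 | sudo tee /etc/mongodb-keyfile > /dev/null

Copy the file securely (for example, with scp over the private network) to the same path on every other member, including the arbiter. Then, on each node, restrict its permissions and ownership:

sudo chmod 400 /etc/mongodb-keyfile
sudo chown mongodb:mongodb /etc/mongodb-keyfile

With the configuration and keyfile in place, start and enable the MongoDB service on each node: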

sudo systemctl start mongod
sudo systemctl enable mongod

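Once all nodes are running, initiate the replica set from one of the data-bearing nodes (typically the one you intend to be the initial primary). Because authorization is enabled and no users exist yet, connect from that node itself so the localhost exception applies, and create an administrative user once the set has elected a primary. Open the MongoDB shell (mongosh on MongoDB 5.0 and later; older releases ship the legacy mongo shell instead):

mongosh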

Inside the MongoDB shell, initiate the replica set:

rs.initiate(
  {
    _id : "rs0",
    members: [
      { _id : 0, host : "mongo-node-1.example.com:27017" },
      { _id : 1, host : "mongo-node-2.example.com:27017" },
      { _id : 2, host : "mongo-arbiter.example.com:27017", arbiterOnly : true }
    ]
  }
)

Verify the replica set status:

rs.status()

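The output lists the state of every member. In a healthy set, exactly one member reports stateStr PRIMARY, the other data-bearing member reports SECONDARY, the arbiter reports ARBITER, and each has health 1. Run rs.status() again whenever you need to monitor or troubleshoot the replica set.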
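If you want to evaluate that output programmatically, a minimal sketch might look like the following. It assumes you already have the replSetGetStatus document as a Python dictionary (for example, from pymongo); the function name and the SAMPLE_STATUS data are hypothetical, and the sample is trimmed to the three fields the function reads.

```python
from typing import Any, Dict, List


def summarize_replica_set(status: Dict[str, Any]) -> Dict[str, Any]:
    """Reduces a replSetGetStatus document to a small health summary."""
    members: List[Dict[str, Any]] = status.get("members", [])
    primaries = [m["name"] for m in members if m.get("stateStr") == "PRIMARY"]
    unhealthy = [m["name"] for m in members if m.get("health") != 1]
    return {
        # Anything other than exactly one primary is treated as "no primary".
        "primary": primaries[0] if len(primaries) == 1 else None,
        "unhealthy": unhealthy,
        "ok": len(primaries) == 1 and not unhealthy,
    }


# Hand-written, heavily trimmed sample; real output has many more fields.
SAMPLE_STATUS = {
    "set": "rs0",
    "members": [
        {"name": "mongo-node-1.example.com:27017", "stateStr": "PRIMARY", "health": 1},
        {"name": "mongo-node-2.example.com:27017", "stateStr": "SECONDARY", "health": 1},
        {"name": "mongo-arbiter.example.com:27017", "stateStr": "ARBITER", "health": 1},
    ],
}

print(summarize_replica_set(SAMPLE_STATUS))
```

Keeping the evaluation in a pure function like this makes it easy to unit-test without a running replica set.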

Automating WooCommerce Application Failover with HAProxy

WooCommerce, being a PHP application, typically runs on web servers like Nginx or Apache, with PHP-FPM processing the dynamic content. To achieve automatic failover for the application layer, we’ll employ HAProxy. HAProxy is a high-performance TCP/HTTP load balancer and proxying solution that excels at health checking and redirecting traffic to healthy backend servers.

We’ll set up HAProxy to monitor multiple WooCommerce application instances. If a primary instance becomes unresponsive, HAProxy will automatically direct traffic to a secondary instance. This requires careful configuration of backend server definitions and health checks.

HAProxy Configuration for WooCommerce

# /etc/haproxy/haproxy.cfg

global
    log /dev/log    local0
    log /dev/log    local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    timeout connect 5000
    timeout client  50000
    timeout server  50000
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

listen stats
    bind *:8404
    mode http
    stats enable
    stats uri /stats
    stats realm Haproxy\ Statistics
    stats auth admin:YourSecurePassword # Change this!

frontend http_frontend
    bind *:80
    mode http
    default_backend webservers

backend webservers
    mode http
    balance roundrobin
    option httpchk GET /wp-cron.php HTTP/1.1\r\nHost:\ yourdomain.com # Customize host header
    http-check expect status 200 # Expect a 200 OK for health check
    server web1 192.168.1.10:80 check fall 3 rise 2 # Primary WooCommerce server
    server web2 192.168.1.11:80 check fall 3 rise 2 # Secondary WooCommerce server
    # Add more servers as needed, ideally in different OVH Availability Zones

In this configuration:

global and defaults sections set up general logging, timeouts, and error handling.
listen stats provides a web interface to monitor HAProxy’s status, including backend server health. Remember to change the default password.
frontend http_frontend listens on port 80 for incoming HTTP traffic.
backend webservers defines the pool of application servers.
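balance roundrobin distributes requests evenly across all healthy servers, so both instances serve traffic while they are up, and a server that fails its health checks is simply removed from rotation. For strict active/passive failover, add the backup keyword to the secondary’s server line; HAProxy then sends traffic to it only when no non-backup server is available.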
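option httpchk configures HAProxy to perform an HTTP GET request to /wp-cron.php on each backend server. The endpoint exists on every WordPress install and exercises PHP, which makes it a convenient check, but each request can also trigger WordPress’s scheduled tasks, so on a busy site you may prefer a dedicated lightweight health endpoint. Ensure your Host header is correctly set to your domain.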
http-check expect status 200 ensures that HAProxy considers a server healthy only if it responds with an HTTP 200 status code.
server web1 ... check fall 3 rise 2 defines a backend server. check enables health checking. fall 3 means the server is considered down after 3 consecutive failed checks. rise 2 means the server is considered up after 2 consecutive successful checks.
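To make the fall and rise counters concrete, here is a simplified Python model of the behaviour described above. It is an illustration of the counting rule only, not HAProxy's actual implementation, and the class name HealthState is hypothetical.

```python
class HealthState:
    """Simplified model of fall/rise counting for one backend server."""

    def __init__(self, fall: int = 3, rise: int = 2, up: bool = True) -> None:
        self.fall = fall
        self.rise = rise
        self.up = up
        self._streak = 0  # consecutive results that contradict the current state

    def record(self, check_passed: bool) -> bool:
        """Feeds one health-check result and returns whether the server is up."""
        if check_passed == self.up:
            self._streak = 0  # result agrees with current state: reset the counter
        else:
            self._streak += 1
            threshold = self.fall if self.up else self.rise
            if self._streak >= threshold:
                self.up = not self.up
                self._streak = 0
        return self.up


state = HealthState(fall=3, rise=2)
results = [False, False, True, False, False, False, True, True]
print([state.record(r) for r in results])
# → [True, True, True, True, True, False, False, True]
```

Note how the two early failures do not take the server down because a success resets the counter; only three failures in a row do.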

After installing and configuring HAProxy, start and enable the service:

sudo systemctl start haproxy
sudo systemctl enable haproxy

To test the failover, stop the web server process on one WooCommerce instance (e.g., sudo systemctl stop apache2 or sudo systemctl stop nginx). With fall 3 and HAProxy’s default 2-second check interval, the server should be marked down after roughly six seconds, and all traffic then goes to the remaining instance. You can watch the transition on the HAProxy stats page (port 8404, path /stats in the configuration above).
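The stats page can also be read by a script: HAProxy serves the same data as CSV when ;csv is appended to the stats URI (here, /stats;csv). The sketch below parses that CSV for one backend. The function name is hypothetical, and SAMPLE_CSV is a hand-written example trimmed to three columns; real output has many more columns, which the parser ignores.

```python
import csv
import io
from typing import Dict


def backend_status(csv_text: str, backend: str) -> Dict[str, str]:
    """Maps server name to status for one backend in HAProxy's CSV stats."""
    # The header line starts with "# "; strip it so DictReader sees clean names.
    reader = csv.DictReader(io.StringIO(csv_text.lstrip("# ")))
    return {
        row["svname"]: row["status"]
        for row in reader
        # Skip the aggregate FRONTEND/BACKEND rows and other proxies.
        if row["pxname"] == backend and row["svname"] not in ("FRONTEND", "BACKEND")
    }


# Illustrative sample only, as it might look after stopping the web server on web1.
SAMPLE_CSV = """# pxname,svname,status
http_frontend,FRONTEND,OPEN
webservers,web1,DOWN
webservers,web2,UP
webservers,BACKEND,UP
"""

print(backend_status(SAMPLE_CSV, "webservers"))
# → {'web1': 'DOWN', 'web2': 'UP'}
```

In practice you would fetch the CSV with an authenticated HTTP request using the credentials from the stats auth line, then pass the response text to this function.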

Orchestrating Database and Application Failover with a Health Check Script

While MongoDB replica sets handle database failover automatically, and HAProxy handles application server failover, coordinating these events and ensuring a seamless transition for the user often requires an external orchestration layer. This is particularly true if your application needs to know which MongoDB node is the current primary to ensure writes are directed correctly, or if you need to perform application-level readiness checks before a new application server takes over.

A common approach is to use a custom script that periodically checks the health of both the MongoDB replica set and the application servers. This script can then trigger actions, such as updating DNS records, notifying monitoring systems, or even reconfiguring HAProxy if dynamic configuration is required (though HAProxy’s built-in health checks are usually sufficient).
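On the question of directing writes to the current primary, it is worth knowing that MongoDB drivers given a replica-set connection string discover the primary themselves and follow it after an election. A minimal sketch of building such a connection string follows; the helper name is hypothetical, the hosts are this article's example names, and credentials are percent-encoded because characters such as @ or / would otherwise break the URI.

```python
from typing import Optional, Sequence
from urllib.parse import quote_plus


def replica_set_uri(hosts: Sequence[str], replica_set: str,
                    username: Optional[str] = None,
                    password: Optional[str] = None) -> str:
    """Builds a mongodb:// URI that lists several seed hosts of one replica set."""
    credentials = ""
    if username is not None and password is not None:
        credentials = f"{quote_plus(username)}:{quote_plus(password)}@"
    return (f"mongodb://{credentials}{','.join(hosts)}"
            f"/?replicaSet={quote_plus(replica_set)}")


# Only data-bearing members need to be listed as seeds.
print(replica_set_uri(
    ["mongo-node-1.example.com:27017", "mongo-node-2.example.com:27017"], "rs0"))
```

An application that connects this way usually does not need an external script to tell it where the primary is; the script remains useful for alerting and for application-level checks.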

Example Python Health Check Script

import pymongo
import requests
import time
import logging

# --- Configuration ---
MONGO_REPLICA_SET_NAME = "rs0"
MONGO_NODES = [
    "mongodb://mongo-node-1.example.com:27017/?replicaSet={}".format(MONGO_REPLICA_SET_NAME),
    "mongodb://mongo-node-2.example.com:27017/?replicaSet={}".format(MONGO_REPLICA_SET_NAME),
    "mongodb://mongo-arbiter.example.com:27017/?replicaSet={}".format(MONGO_REPLICA_SET_NAME)
]
APP_PRIMARY_URL = "http://web1.example.com/wp-cron.php" # Assuming web1 is primary
APP_SECONDARY_URL = "http://web2.example.com/wp-cron.php"
HEALTH_CHECK_INTERVAL = 30 # seconds
LOG_FILE = "/var/log/dr_health_check.log"

# --- Logging Setup ---
logging.basicConfig(filename=LOG_FILE, level=logging.INFO,
                    format='%(asctime)s - %(levelname)s - %(message)s')

def get_mongo_primary():
    """Returns the primary's "host:port" name, or None if no primary is found."""
    for mongo_uri in MONGO_NODES:
        client = None
        try:
            client = pymongo.MongoClient(mongo_uri, serverSelectionTimeoutMS=5000)
            # The ismaster command is cheap and does not require auth.
            client.admin.command('ismaster')
            # replSetGetStatus does require auth when authorization is enabled:
            # add credentials for a user with the clusterMonitor role to the URIs.
            status = client.admin.command('replSetGetStatus')
            for member in status['members']:
                if member['stateStr'] == 'PRIMARY':
                    return member['name']
        except pymongo.errors.ConnectionFailure as e:
            logging.warning(f"Could not connect to {mongo_uri}: {e}")
        except Exception as e:
            logging.error(f"An error occurred while checking MongoDB: {e}")
        finally:
            # Close on every path so the polling loop does not leak connections.
            if client is not None:
                client.close()
    return None

def check_app_health(url):
    """Checks if an application URL is healthy (returns 200 OK)."""
    try:
        response = requests.get(url, timeout=5)
        return response.status_code == 200
    except requests.exceptions.RequestException as e:
        logging.warning(f"Application health check failed for {url}: {e}")
        return False

def main():
    logging.info("Starting DR health check script.")
    while True:
        current_mongo_primary = get_mongo_primary()
        if current_mongo_primary:
            logging.info(f"Current MongoDB primary: {current_mongo_primary}")
            # In a more advanced setup, you might compare this to an expected primary
            # and trigger actions if it changes unexpectedly or is unavailable.
        else:
            logging.error("Could not determine MongoDB primary. Replica set might be down.")
            # Trigger critical alert or automated recovery if possible

        is_primary_app_healthy = check_app_health(APP_PRIMARY_URL)
        is_secondary_app_healthy = check_app_health(APP_SECONDARY_URL)

        if is_primary_app_healthy:
            logging.info(f"Application primary ({APP_PRIMARY_URL}) is healthy.")
        else:
            logging.warning(f"Application primary ({APP_PRIMARY_URL}) is unhealthy.")
            if is_secondary_app_healthy:
                logging.info(f"Application secondary ({APP_SECONDARY_URL}) is healthy. Traffic should be directed here.")
                # Here you would implement actions to switch traffic, e.g.,
                # - Update DNS (if HAProxy is not used or needs re-pointing)
                # - Trigger HAProxy reconfigure (if dynamic)
                # - Send alerts
            else:
                logging.error("Both application primary and secondary are unhealthy.")
                # Trigger critical alert

        time.sleep(HEALTH_CHECK_INTERVAL)

if __name__ == "__main__":
    main()

This Python script uses pymongo to query the MongoDB replica set status and requests to check the health of the application endpoints. It runs in an infinite loop, performing checks at a defined interval. If the MongoDB primary cannot be determined or if the primary application server is unhealthy while the secondary is healthy, it logs the events. In a production environment, you would extend this script to:

Send alerts via email, Slack, or PagerDuty.
Trigger automated DNS updates (e.g., using OVH’s API to change A records).
Execute commands to reconfigure HAProxy if its built-in health checks are insufficient or if you need to switch to a completely different set of servers.
Perform application-specific readiness checks beyond a simple HTTP 200.
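Before wiring in any of these actions, it helps to fire them only when something changes rather than on every polling cycle. The sketch below is one way to detect a change in the reported MongoDB primary between checks; ChangeTracker is a hypothetical helper, not part of the script above, and the returned message would be handed to whatever alerting channel you choose.

```python
from typing import Optional

_UNSET = object()  # sentinel meaning "no observation recorded yet"


class ChangeTracker:
    """Remembers the last observed value and reports when it changes."""

    def __init__(self) -> None:
        self._last = _UNSET

    def update(self, value: Optional[str]) -> Optional[str]:
        """Returns a message if value differs from the previous observation."""
        previous, self._last = self._last, value
        if previous is _UNSET or previous == value:
            return None  # first observation or no change: nothing to report
        return f"MongoDB primary changed: {previous} -> {value}"


tracker = ChangeTracker()
observations = [
    "mongo-node-1.example.com:27017",
    "mongo-node-1.example.com:27017",
    "mongo-node-2.example.com:27017",  # an election moved the primary
    None,                              # no primary could be determined
]
for observed in observations:
    message = tracker.update(observed)
    if message:
        print(message)
```

In the main loop, you would call tracker.update(current_mongo_primary) once per cycle and send an alert whenever it returns a message.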

To run this script reliably, consider deploying it using a process manager like systemd or a container orchestration platform. Ensure the necessary Python libraries (pymongo, requests) are installed in the environment where the script runs.

OVH Specific Considerations and Best Practices

When architecting disaster recovery solutions on OVH, several platform-specific aspects are crucial:

Availability Zones (AZs): Deploy your MongoDB nodes and application servers across different OVH Availability Zones. This is the most effective way to protect against datacenter-level failures.
Networking: Ensure your firewall rules (OVH Security Groups or instance-level firewalls) allow traffic between your MongoDB nodes, application servers, and HAProxy instances. For MongoDB, this typically means allowing port 27017, ideally only from the other replica set members and the hosts that run the application or the health check script. For HAProxy, it means allowing traffic on port 80 (or your application’s port) and restricting the stats port (e.g., 8404) to trusted addresses.
IP Addresses: Use private IP addresses for inter-server communication within OVH’s network for better security and performance. Public IPs should be managed by HAProxy or DNS for external access.
Monitoring: Leverage OVH’s monitoring tools in conjunction with your custom scripts and HAProxy stats. Monitor CPU, memory, disk I/O, and network traffic on all instances.
Backups: While replication provides high availability, it is not a substitute for backups. Implement a robust backup strategy for your MongoDB data, storing backups off-site or in a separate OVH region.
DNS Management: If you’re not solely relying on HAProxy for external access, consider using OVH’s DNS services. Automated failover might involve updating DNS records to point to a new HAProxy instance or a different set of application servers.
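The networking item above is easy to verify from any instance with a small TCP reachability check. The following is a minimal sketch using only Python's standard library; the function names are hypothetical, and the hosts in the usage note are the example addresses from this article.

```python
import socket
from typing import Dict, Iterable, Tuple


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Returns True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_all(targets: Iterable[Tuple[str, int]],
              timeout: float = 3.0) -> Dict[Tuple[str, int], bool]:
    """Checks every (host, port) pair and returns the result for each."""
    return {(host, port): port_open(host, port, timeout) for host, port in targets}
```

For example, calling check_all([("mongo-node-2.example.com", 27017), ("192.168.1.10", 80)]) from the HAProxy host shows at a glance whether a firewall rule is blocking one of the paths. A successful connection only proves the port is reachable, not that the service behind it is healthy.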

By combining MongoDB’s native replication, HAProxy’s intelligent load balancing and health checking, and a custom orchestration script, you can build a highly available and resilient WooCommerce deployment on OVH that automatically recovers from common failure scenarios.
