Disaster Recovery 101: Architecting Auto-Failovers for MongoDB and Shopify Deployments on Linode
Establishing a High-Availability MongoDB Replica Set on Linode
For a robust Shopify deployment, a highly available MongoDB backend is non-negotiable. This section details the architecture and configuration for a multi-node MongoDB replica set deployed across distinct Linode Availability Zones (AZs) to ensure resilience against single-node or even single-AZ failures. We’ll focus on a three-node setup for quorum and fault tolerance.
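The choice of three nodes follows from majority arithmetic: a replica set can elect a primary only while a majority of its voting members are reachable. The sketch below works that out for a few set sizes; the helper names are illustrative and not part of any MongoDB tooling.

```ruby
# Majority (quorum) arithmetic for a replica set with n voting members.
def majority(voting_members)
  voting_members / 2 + 1 # integer division
end

# How many members can be lost while a majority is still reachable.
def tolerated_failures(voting_members)
  voting_members - majority(voting_members)
end

[2, 3, 4, 5].each do |n|
  puts "#{n} members: majority #{majority(n)}, tolerates #{tolerated_failures(n)} failure(s)"
end
```

Two members tolerate no failures, and four tolerate no more than three do, which is why three is the smallest set size that survives the loss of one node.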
Prerequisites include:
- Three Linode instances (e.g., `mongo-primary`, `mongo-secondary-1`, `mongo-secondary-2`) provisioned in different Linode AZs.
- SSH access to all Linode instances.
- Basic understanding of Linux system administration.
Instance Preparation and MongoDB Installation
On each Linode instance, perform the following steps. We’ll use Ubuntu 22.04 LTS as the base OS.
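First, update the package list and install MongoDB. Ubuntu 22.04 does not ship a `mongodb` package in its default repositories, so add MongoDB's official apt repository first (see the MongoDB installation guide for the current signing key and source list), then install `mongodb-org`, which provides the `mongod` service used below:
sudo apt update
sudo apt upgrade -y
sudo apt install -y mongodb-org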
Next, configure MongoDB to listen on its private IP address and enable replica set functionality. Edit the MongoDB configuration file, typically located at /etc/mongod.conf.
On mongo-primary:
# /etc/mongod.conf on mongo-primary
storage:
  dbPath: /var/lib/mongodb
  journal:
    enabled: true # Omit on MongoDB 6.1+, where journaling is always on and this option was removed
systemLog:
  destination: file
  path: /var/log/mongodb/mongod.log
  logAppend: true
net:
  bindIp: 127.0.0.1,[PRIMARY_PRIVATE_IP] # Replace with mongo-primary's private IP (comma-separated, no spaces)
  port: 27017
processManagement:
  # fork is left unset: systemd supervises mongod in the foreground
  pidFilePath: /var/run/mongodb/mongod.pid
  timeZoneInfo: /usr/share/zoneinfo
replication:
  replSetName: rs0
On mongo-secondary-1 and mongo-secondary-2, use the same configuration but replace [PRIMARY_PRIVATE_IP] with their respective private IPs. Crucially, ensure the net.bindIp directive includes the instance’s private IP address. The replication.replSetName must be identical across all nodes.
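Because a mismatched replSetName or a missing bind address only shows up later as a member that never joins, it can help to check the three files before restarting. The following is a minimal sketch using Ruby's standard YAML library; the function name, the sample IPs, and the idea of collecting the configs into one hash are all assumptions made for illustration.

```ruby
require "yaml"

# Returns a list of human-readable problems; an empty list means the configs agree.
# `configs` maps each node's private IP to the text of its mongod.conf.
def replica_config_problems(configs)
  problems = []
  parsed = configs.transform_values { |text| YAML.safe_load(text) }

  set_names = parsed.values.map { |conf| conf.dig("replication", "replSetName") }.uniq
  if set_names.size != 1 || set_names.first.nil?
    problems << "replSetName missing or different across nodes: #{set_names.inspect}"
  end

  parsed.each do |private_ip, conf|
    bound = conf.dig("net", "bindIp").to_s.split(",").map(&:strip)
    problems << "#{private_ip}: bindIp does not include the private IP" unless bound.include?(private_ip)
  end
  problems
end

# Hypothetical config text, reduced to the two sections being checked.
def sample_conf(ip, set_name)
  <<~CONF
    net:
      bindIp: 127.0.0.1,#{ip}
      port: 27017
    replication:
      replSetName: #{set_name}
  CONF
end

configs = {
  "192.168.128.10" => sample_conf("192.168.128.10", "rs0"),
  "192.168.128.11" => sample_conf("192.168.128.11", "rs0"),
  "192.168.128.12" => sample_conf("192.168.128.99", "rs0") # wrong bindIp on purpose
}
puts replica_config_problems(configs)
```

In practice the hash values would come from reading each node's /etc/mongod.conf, for instance after copying the files to one machine.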
After modifying the configuration, restart the MongoDB service:
sudo systemctl restart mongod
sudo systemctl enable mongod
Initializing the Replica Set
Connect to the MongoDB instance on the designated primary node (mongo-primary) and initiate the replica set configuration.
mongosh --host [PRIMARY_PRIVATE_IP] # Connect to the primary
Once connected, run the following command. (`mongosh` replaces the legacy `mongo` shell, which was removed in MongoDB 6.0.)
rs.initiate(
  {
    _id: "rs0",
    configsvr: false,
    members: [
      { _id: 0, host: "[PRIMARY_PRIVATE_IP]:27017" },
      { _id: 1, host: "[SECONDARY1_PRIVATE_IP]:27017" },
      { _id: 2, host: "[SECONDARY2_PRIVATE_IP]:27017" }
    ]
  }
)
Replace [PRIMARY_PRIVATE_IP], [SECONDARY1_PRIVATE_IP], and [SECONDARY2_PRIVATE_IP] with the actual private IP addresses of your Linode instances. This command initializes the replica set named rs0 and adds the three members. The primary node will be elected automatically.
You can verify the replica set status by running rs.status() in the MongoDB shell. Once the initial election completes, one member should report `PRIMARY` and the other two `SECONDARY`.
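If you want to script that verification rather than read the output by eye, the check reduces to inspecting each member's `stateStr` field. The sketch below assumes the rs.status() output has already been captured as JSON; the sample document is hypothetical and heavily trimmed, and the function name is illustrative.

```ruby
require "json"

# Returns true when exactly one member is PRIMARY and every other member is SECONDARY.
def replica_set_settled?(members)
  states = members.map { |member| member["stateStr"] }
  states.count("PRIMARY") == 1 && (states - %w[PRIMARY SECONDARY]).empty?
end

# Hypothetical, trimmed-down rs.status() output; real output carries many more fields.
status = JSON.parse(<<~STATUS)
  {
    "set": "rs0",
    "members": [
      { "name": "192.168.128.10:27017", "stateStr": "PRIMARY" },
      { "name": "192.168.128.11:27017", "stateStr": "SECONDARY" },
      { "name": "192.168.128.12:27017", "stateStr": "SECONDARY" }
    ]
  }
STATUS

puts replica_set_settled?(status["members"])
```

A member that is still syncing or unreachable reports a different `stateStr`, so the function returns false until the set has settled.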
Configuring Shopify for MongoDB Failover
Shopify’s backend applications need to be configured to connect to the MongoDB replica set. This is typically done via a connection string that specifies multiple hosts and the replica set name. The driver will handle failover automatically.
A typical MongoDB connection string for a replica set looks like this:
mongodb://[PRIMARY_PRIVATE_IP]:27017,[SECONDARY1_PRIVATE_IP]:27017,[SECONDARY2_PRIVATE_IP]:27017/?replicaSet=rs0&readPreference=primaryPreferred
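The replicaSet=rs0 parameter is crucial: it tells the driver to discover the set's members and track which one is currently primary. The readPreference=primaryPreferred setting directs reads to the primary whenever possible and falls back to secondaries if the primary is unavailable; reads served by a secondary may be slightly stale because replication is asynchronous. Writes always go to the current primary. During an election there is briefly no primary, so writes fail until a new one is elected and the driver has found it.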
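Assembling the URI by hand is error-prone once the host list lives in configuration, so a small helper can build it from its parts. This is a sketch only: the function name, its keyword arguments, and the sample IPs are assumptions, and it covers just the two query parameters discussed above.

```ruby
require "uri"

# Builds a replica set connection string from a list of member hosts.
def replica_set_uri(hosts, replica_set:, read_preference: "primaryPreferred", port: 27017)
  seeds = hosts.map { |host| "#{host}:#{port}" }.join(",")
  query = URI.encode_www_form(replicaSet: replica_set, readPreference: read_preference)
  "mongodb://#{seeds}/?#{query}"
end

# Hypothetical private IPs for the three members.
hosts = %w[192.168.128.10 192.168.128.11 192.168.128.12]
puts replica_set_uri(hosts, replica_set: "rs0")
```

The host list could equally be read from an environment variable and split on commas, which keeps the addresses out of the application code.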
In your Shopify application’s configuration files (e.g., environment variables, configuration YAMLs, or application code), update the MongoDB connection string to use this replica set URI. Ensure your application instances can reach the MongoDB nodes via their private IP addresses. This often involves configuring Linode’s VPC networking or ensuring firewall rules allow traffic on port 27017 between your application servers and MongoDB nodes.
Automating Failover with Linode NodeBalancers and Health Checks
While MongoDB’s replica set handles internal failover, external access to your Shopify application needs to be resilient. Linode NodeBalancers are ideal for distributing traffic to your Shopify application servers and can be configured with health checks to automatically remove unhealthy instances from the pool.
Setting up a Linode NodeBalancer for Shopify App Servers
Assume you have multiple Linode instances running your Shopify application (e.g., `shopify-app-1`, `shopify-app-2`, `shopify-app-3`) in different AZs.
1. Create a NodeBalancer: Navigate to the NodeBalancers section in your Linode Cloud Manager and create a new NodeBalancer. Select the region that best suits your deployment.
2. Configure Frontend: Set up a frontend listener for your application’s port (e.g., port 80 for HTTP or 443 for HTTPS).
3. Add Backend Nodes: Add your Shopify application Linode instances as backend nodes. Specify their private IP addresses and the port your application listens on (e.g., 3000 for Ruby on Rails).
4. Configure Health Checks: This is the critical part for automation. Active health checks are set on the NodeBalancer's port configuration and apply to every backend node behind it:
- Type: HTTP Status, which requests a path from each backend and treats a 2xx or 3xx response as healthy.
- Path: A specific URL within your application that is guaranteed to return a 200 OK status code if the application is healthy. A common choice is a simple health check endpoint like /health or /status.
- Interval: How often to perform the check (e.g., 10 seconds).
- Timeout: How long to wait for a response (e.g., 5 seconds).
- Attempts: The number of consecutive failed checks before a node is considered unhealthy and taken out of rotation (e.g., 3).
A node that starts passing its checks again is returned to rotation; NodeBalancers have no separate healthy-threshold setting.
A sample health check configuration in the Linode Cloud Manager UI would look something like this:
Health Check Configuration:
Type: HTTP Status
Path: /health
Interval: 10s
Timeout: 5s
Attempts: 3
Ensure your Shopify application has an endpoint (e.g., a route in Rails or a controller action) that responds to GET /health with a 200 OK status and perhaps some basic status information. If an application instance becomes unresponsive or fails its health checks, the NodeBalancer will automatically stop sending traffic to it. When the instance recovers and passes health checks again, the NodeBalancer will re-add it to the pool.
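The check settings also bound how long a dead instance can keep receiving traffic. As a rough estimate only, assume checks fire on a fixed interval, the instance fails just after passing a check, and each failing check takes the full timeout to fail; the function name and this timing model are assumptions, not documented NodeBalancer behavior.

```ruby
# Rough upper bound, in seconds, on how long a failed backend stays in rotation.
# failed_checks is the number of consecutive failures required to mark a node unhealthy.
def worst_case_detection_seconds(interval:, timeout:, failed_checks:)
  failed_checks * interval + timeout
end

puts worst_case_detection_seconds(interval: 10, timeout: 5, failed_checks: 3)
```

With the example values this comes to 35 seconds. Shortening the interval tightens the bound at the cost of more probe traffic to every backend.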
Implementing Application-Level Health Checks
The health check endpoint should be lightweight and verify critical dependencies. For a Shopify application, this might involve checking:
- Database connectivity (to the MongoDB replica set).
- Cache connectivity (e.g., Redis).
- The ability to perform a simple, non-destructive operation.
Here’s a conceptual example of a health check endpoint in Ruby on Rails:
# app/controllers/health_controller.rb
class HealthController < ApplicationController
  # Skip authentication (or other relevant filters) for the probe.
  # raise: false avoids an error if the filter is not defined in this app.
  skip_before_action :authenticate_user!, raise: false

  # Reuse one client across requests; creating a Mongo::Client per check leaks connections.
  # The short server selection timeout keeps a failing check within the NodeBalancer's timeout.
  # Ensure your MongoDB connection string is configured correctly in config/database.yml
  def self.mongo_client
    @mongo_client ||= Mongo::Client.new(
      Rails.configuration.database_configuration[Rails.env]["uri"],
      server_selection_timeout: 2
    )
  end

  def show
    status = {
      database: check_database,
      cache: check_cache,
      time: Time.current
    }
    if status[:database] && status[:cache]
      render json: status, status: :ok
    else
      render json: status, status: :service_unavailable
    end
  end

  private

  def check_database
    # Attempt a simple read operation on MongoDB.
    # This is a simplified example; actual check might involve a specific query
    # or checking replica set status.
    self.class.mongo_client.database.command(ping: 1)
    true
  rescue Mongo::Error => e
    # Mongo::Error also covers NoServerAvailable, raised when no member can be selected
    Rails.logger.error "Database health check failed: #{e.message}"
    false
  end

  def check_cache
    # Attempt a simple operation on Redis
    Rails.cache.fetch("health_check", expires_in: 1.second) { "ok" }
    Rails.cache.read("health_check") == "ok"
  rescue Redis::CannotConnectError, Redis::TimeoutError => e
    Rails.logger.error "Cache health check failed: #{e.message}"
    false
  end
end
# config/routes.rb
Rails.application.routes.draw do
  get '/health', to: 'health#show'
  # ... other routes
end
This setup ensures that if a Shopify application instance fails, the NodeBalancer will quickly divert traffic to healthy instances, minimizing downtime. Combined with MongoDB’s internal replica set failover, this provides a robust, automated disaster recovery strategy.