Disaster Recovery 101: Architecting Auto-Failovers for MongoDB and PHP Deployments on OVH
Establishing a MongoDB Replica Set for High Availability
A robust disaster recovery strategy for MongoDB hinges on a properly configured replica set. This ensures data redundancy and automatic failover in case of node failure. We’ll focus on a three-node replica set, a common and effective configuration for production environments, deployed on OVH’s Public Cloud instances.
Each node in the replica set should ideally reside in a different availability zone within the same OVH region for maximum resilience against zone-level outages. For this example, we’ll assume three Ubuntu 22.04 LTS servers, each with a static IP address.
MongoDB Configuration for Replica Set Members
On each of the three MongoDB servers (e.g., `mongo1.example.com`, `mongo2.example.com`, `mongo3.example.com`), modify the MongoDB configuration file (`/etc/mongod.conf`). Ensure the following settings are present and correctly configured:
Node 1 (`mongo1.example.com`)
# /etc/mongod.conf on mongo1.example.com
storage:
dbPath: /var/lib/mongodb
journal:
enabled: true
systemLog:
destination: file
path: /var/log/mongodb/mongod.log
logAppend: true
net:
bindIp: 0.0.0.0
port: 27017
replication:
replSetName: myReplicaSet
sharding:
clusterRole: configsvr
processManagement:
fork: true
pidFilePath: /var/run/mongodb/mongod.pid
security:
keyFile: /etc/mongo-keyfile
Node 2 (`mongo2.example.com`)
# /etc/mongod.conf on mongo2.example.com
storage:
dbPath: /var/lib/mongodb
journal:
enabled: true
systemLog:
destination: file
path: /var/log/mongodb/mongod.log
logAppend: true
net:
bindIp: 0.0.0.0
port: 27017
replication:
replSetName: myReplicaSet
sharding:
clusterRole: configsvr
processManagement:
fork: true
pidFilePath: /var/run/mongodb/mongod.pid
security:
keyFile: /etc/mongo-keyfile
Node 3 (`mongo3.example.com`)
# /etc/mongod.conf on mongo3.example.com
storage:
dbPath: /var/lib/mongodb
journal:
enabled: true
systemLog:
destination: file
path: /var/log/mongodb/mongod.log
logAppend: true
net:
bindIp: 0.0.0.0
port: 27017
replication:
replSetName: myReplicaSet
sharding:
clusterRole: configsvr
processManagement:
fork: true
pidFilePath: /var/run/mongodb/mongod.pid
security:
keyFile: /etc/mongo-keyfile
Crucially, the replSetName must be identical across all members. The keyFile is essential for secure inter-node communication within the replica set. Generate a strong key file on one node and distribute it securely to the others.
Key File Generation and Distribution
On one of the nodes (e.g., `mongo1.example.com`), generate the key file:
# On mongo1.example.com openssl rand -base64 756 > /etc/mongo-keyfile sudo chown mongodb:mongodb /etc/mongo-keyfile sudo chmod 400 /etc/mongo-keyfile
Then, securely copy this key file to the other nodes:
# From mongo1.example.com to mongo2.example.com scp /etc/mongo-keyfile [email protected]:/tmp/mongo-keyfile # On mongo2.example.com after scp sudo mv /tmp/mongo-keyfile /etc/mongo-keyfile sudo chown mongodb:mongodb /etc/mongo-keyfile sudo chmod 400 /etc/mongo-keyfile # Repeat for mongo3.example.com
Initializing the Replica Set
After configuring and restarting MongoDB on all nodes, connect to one of the nodes (preferably the one intended to be the initial primary) using the `mongosh` client. Then, initiate the replica set configuration.
# Connect to mongo1.example.com
mongosh --host mongo1.example.com
# Inside the mongosh prompt:
rs.initiate(
{
_id: "myReplicaSet",
members: [
{ _id: 0, host: "mongo1.example.com:27017" },
{ _id: 1, host: "mongo2.example.com:27017" },
{ _id: 2, host: "mongo3.example.com:27017" }
]
}
)
Verify the replica set status:
rs.status()
You should see output indicating the state of each member (PRIMARY, SECONDARY). The replica set will automatically elect a primary if one is not initially designated or if the current primary fails.
PHP Application Integration with MongoDB Replica Sets
Your PHP application needs to be aware of the replica set to leverage its high availability features. This involves configuring the MongoDB connection string to include all replica set members and specifying the replica set name.
Connection String Configuration
Using the official MongoDB PHP driver (mongodb/mongodb), the connection string should be structured as follows. It’s best practice to store this in environment variables or a secure configuration management system.
<?php
// Example using environment variables
$mongoHost1 = getenv('MONGO_HOST_1') ?: 'mongo1.example.com';
$mongoHost2 = getenv('MONGO_HOST_2') ?: 'mongo2.example.com';
$mongoHost3 = getenv('MONGO_HOST_3') ?: 'mongo3.example.com';
$mongoPort = getenv('MONGO_PORT') ?: '27017';
$mongoReplicaSetName = getenv('MONGO_REPLICA_SET_NAME') ?: 'myReplicaSet';
$mongoDatabase = getenv('MONGO_DATABASE') ?: 'mydatabase';
$mongoUser = getenv('MONGO_USER');
$mongoPassword = getenv('MONGO_PASSWORD');
$connectionString = sprintf(
'mongodb://%s:%s,%s:%s,%s:%s/%s?replicaSet=%s',
$mongoHost1, $mongoPort,
$mongoHost2, $mongoPort,
$mongoHost3, $mongoPort,
$mongoDatabase,
$mongoReplicaSetName
);
// Add authentication if required
if ($mongoUser && $mongoPassword) {
$connectionString .= '&username=' . urlencode($mongoUser) . '&password=' . urlencode($mongoPassword);
}
try {
$client = new MongoDB\Client($connectionString);
$database = $client->selectDatabase($mongoDatabase);
// Perform database operations
$collection = $database->selectCollection('mycollection');
$result = $collection->insertOne(['name' => 'Test Document']);
echo "Successfully connected and inserted document.\n";
} catch (MongoDB\Driver\Exception\Exception $e) {
error_log("MongoDB connection error: " . $e->getMessage());
// Implement fallback or error handling logic here
echo "Failed to connect to MongoDB.\n";
}
?>
The key aspects here are:
- Listing all replica set members in the connection string.
- Specifying the
replicaSetparameter with the correct name. - The driver will automatically discover the current primary and direct write operations to it. Read operations can be configured to be distributed across secondaries using read preference options.
Read Preferences for Load Balancing and Availability
To distribute read load and improve availability, configure read preferences. The default is primary, meaning all reads go to the primary. For better performance and resilience, consider primaryPreferred, secondary, or secondaryPreferred.
<?php
// ... (previous connection setup)
use MongoDB\Driver\ReadPreference;
// Example: Read from secondaries if available, otherwise from primary
$readPreference = new ReadPreference(ReadPreference::RP_SECONDARY_PREFERRED);
try {
$client = new MongoDB\Client($connectionString, [], ['readPreference' => $readPreference]);
$database = $client->selectDatabase($mongoDatabase);
// Reads will now attempt to go to secondaries first
$collection = $database->selectCollection('mycollection');
$document = $collection->findOne(); // This read operation respects the read preference
print_r($document);
} catch (MongoDB\Driver\Exception\Exception $e) {
error_log("MongoDB connection error: " . $e->getMessage());
echo "Failed to connect to MongoDB.\n";
}
?>
The PHP driver, when connected to a replica set, will automatically handle failover. If the primary becomes unavailable, the driver will detect this and switch to a newly elected primary. This process is generally seamless for read operations, though writes might experience a brief interruption during the election period.
Automated Failover Monitoring and Alerting
While MongoDB’s replica set provides automatic failover, proactive monitoring and alerting are crucial for understanding system health and responding to incidents. OVH’s monitoring tools, combined with external services, can be leveraged.
OVH Instance and Network Monitoring
OVH’s control panel offers basic instance monitoring (CPU, RAM, disk, network traffic). Ensure these are configured to alert you on critical thresholds. For MongoDB-specific health, we need more granular checks.
MongoDB Health Checks with `mongostat` and `mongotop`
Regularly run `mongostat` and `mongotop` on each node to gauge performance and identify potential issues. These can be scripted and their output parsed for alerting.
# Example script to check replica set status and primary health # Run this from a separate monitoring server or one of the MongoDB nodes MONGO_HOSTS="mongo1.example.com:27017,mongo2.example.com:27017,mongo3.example.com:27017" REPLICA_SET_NAME="myReplicaSet" ALERT_EMAIL="[email protected]" # Check replica set status REPL_STATUS=$(mongosh --quiet --host $MONGO_HOSTS --eval "rs.status()" | grep '"stateStr" : "PRIMARY"') if [ -z "$REPL_STATUS" ]; then echo "CRITICAL: No MongoDB primary found in replica set $REPLICA_SET_NAME." | mail -s "MongoDB Alert: No Primary" $ALERT_EMAIL else echo "OK: MongoDB primary found." fi # Further checks could include: # - Checking oplog lag for secondaries # - Monitoring connection counts # - Checking disk space on MongoDB data directories
External Monitoring Services
Consider integrating with services like Prometheus and Grafana, or cloud-native solutions if available. Prometheus can scrape MongoDB metrics exposed via a `mongod` exporter. Grafana can then visualize these metrics and trigger alerts based on defined rules.
For instance, a Prometheus rule could look for a lack of a PRIMARY member in the replica set status, or excessive oplog lag on secondary members. These alerts can then be routed to Slack, PagerDuty, or email.
PHP Application-Level Failover Handling
While the MongoDB driver handles the database-level failover, your PHP application can implement more sophisticated error handling and fallback mechanisms.
Graceful Degradation and Retry Logic
When a MongoDB operation fails (e.g., due to a temporary network partition or during a primary election), catch the specific `MongoDB\Driver\Exception\Exception` and implement retry logic. Exponential backoff is a good strategy here.
<?php
// ... (connection setup)
$maxRetries = 3;
$baseDelayMs = 100; // 100ms
for ($attempt = 1; $attempt <= $maxRetries; $attempt++) {
try {
$client = new MongoDB\Client($connectionString);
$database = $client->selectDatabase($mongoDatabase);
$collection = $database->selectCollection('mycollection');
// Example: Insert operation
$result = $collection->insertOne(['data' => 'important data']);
echo "Operation successful on attempt {$attempt}.\n";
break; // Exit loop on success
} catch (MongoDB\Driver\Exception\ConnectionTimeoutException $e) {
error_log("MongoDB connection timeout on attempt {$attempt}: " . $e->getMessage());
// Fallback logic or specific handling for connection issues
} catch (MongoDB\Driver\Exception\WriteException $e) {
error_log("MongoDB write exception on attempt {$attempt}: " . $e->getMessage());
// Handle write concerns, etc.
} catch (MongoDB\Driver\Exception\Exception $e) {
error_log("MongoDB general exception on attempt {$attempt}: " . $e->getMessage());
// General error handling
}
if ($attempt < $maxRetries) {
$delay = $baseDelayMs * pow(2, $attempt - 1); // Exponential backoff
usleep($delay * 1000); // usleep takes microseconds
echo "Retrying operation in {$delay}ms...\n";
} else {
// All retries failed
error_log("MongoDB operation failed after {$maxRetries} attempts.");
// Implement critical error handling: e.g., return an error to the user,
// queue the operation for later processing, or trigger a higher-level alert.
echo "Operation failed permanently.\n";
}
}
?>
Circuit Breaker Pattern
For more complex scenarios, consider implementing a circuit breaker pattern. If repeated failures occur, the circuit breaker “opens,” preventing further calls to MongoDB for a period, thus avoiding overwhelming a struggling database. After a timeout, it enters a “half-open” state to test if the database has recovered.
While not built into the MongoDB driver, this logic can be implemented in your application layer using state machines and timers. Libraries like `php-circuit-breaker` can assist.
OVH Infrastructure Considerations for DR
Leveraging OVH’s infrastructure effectively is key to a robust disaster recovery plan.
Availability Zones and Regions
As mentioned, deploying MongoDB replica set members across different Availability Zones within the same OVH region is the first line of defense. For true disaster recovery against region-wide failures, consider a multi-region strategy. This is significantly more complex and involves:
- Cross-region data replication (e.g., using MongoDB Atlas global clusters, or custom solutions with tools like
mongo-syncor application-level replication). - Geo-distributed application instances.
- Global DNS load balancing (e.g., OVH’s Global DNS or third-party services) to direct traffic to the active region.
- Automated failover orchestration across regions.
Network Configuration and Security
Ensure your OVH Public Cloud network security groups and firewall rules allow MongoDB traffic (port 27017 by default) between replica set members. Restrict access to only necessary IPs. For inter-region communication, VPNs or dedicated network links might be required, adding complexity and cost.
Backup and Restore Strategy
While replica sets provide high availability, they are not a substitute for backups. Regularly back up your MongoDB data using tools like `mongodump`. Store these backups in a separate location, ideally in a different region or object storage (like OVH’s Object Storage). Test your restore process periodically.
# Example mongodump and mongorestore commands # On a node with mongodump installed: mongodump --host mongo1.example.com:27017 --username $MONGO_USER --password $MONGO_PASSWORD --authenticationDatabase admin --db mydatabase --out /path/to/backups/$(date +%Y-%m-%d_%H-%M-%S) # To restore: # mongorestore --host mongo1.example.com:27017 --username $MONGO_USER --password $MONGO_PASSWORD --authenticationDatabase admin --db mydatabase /path/to/backups/YYYY-MM-DD_HH-MM-SS/mydatabase
Automate these backup processes and verify their integrity. A comprehensive disaster recovery plan includes both high availability (automatic failover) and data durability (backups).