Disaster Recovery 101: Architecting Auto-Failovers for MySQL and C++ Deployments on Linode

Establishing a High-Availability MySQL Cluster with Orchestrator

For robust disaster recovery and automated failover, MySQL needs a cluster that can heal itself. We'll use Orchestrator, a widely used MySQL replication topology manager, to achieve this. Orchestrator monitors replication health, detects failures, and can automatically promote a replica to master. This setup assumes several Linode instances, each hosting a MySQL node, with Orchestrator running on one or more of them.

First, ensure your MySQL instances are configured for replication. This typically involves setting up `server-id` uniquely on each instance and enabling binary logging (`log_bin`).

MySQL Configuration Snippets

On each MySQL node, modify your `my.cnf` (or equivalent) to include:

[mysqld]
server-id = 1  # Unique ID for each server
log_bin = /var/log/mysql/mysql-bin.log
binlog_format = ROW
relay_log = /var/log/mysql/mysql-relay-bin.log
read_only = 0 # Set to 1 for replicas, 0 for master (Orchestrator will manage this)

After applying these changes, restart your MySQL service on each node.

Installing and Configuring Orchestrator

Orchestrator should run on at least one node, and ideally on several so that Orchestrator itself is highly available. For simplicity, this example focuses on a single Orchestrator instance. A production setup would typically run several Orchestrator nodes coordinated through its built-in Raft consensus mode (the `Raft*` settings in the configuration below) or through a shared, replicated backend database.

Download an Orchestrator release from its GitHub releases page. The commands below fetch a Linux amd64 tarball; substitute the version and file name listed on the releases page:

wget https://github.com/openark/orchestrator/releases/download/v3.2.7/orchestrator-3.2.7-linux-amd64.tar.gz
tar -xzf orchestrator-3.2.7-linux-amd64.tar.gz
sudo mv orchestrator-3.2.7-linux-amd64 /usr/local/bin/orchestrator

Create a configuration file, typically at `/etc/orchestrator/orchestrator.conf.json`:

{
  "Debug": false,
  "ListenAddress": ":3000",
  "MySQLTopologyUser": "orchestrator",
  "MySQLTopologyPassword": "your_orchestrator_db_password",
  "MySQLOrchestratorHostPort": "127.0.0.1:3306",
  "MySQLOrchestratorDatabaseName": "orchestrator",
  "GlobalWriteableExplorationIntervalHours": 1,
  "SlaveLagQuery": "SELECT * FROM mysql.slave_lag_check",
  "PromotionUser": "orchestrator_promote",
  "PromotionPassword": "your_promotion_password",
  "PromotionForgetMasterSeconds": 3600,
  "RecoveryPeriodBlockSeconds": 3600,
  "DiscoverByShowSlaveHosts": true,
  "DetectClusterAliases": true,
  "SnapshotTopologiesIntervalHours": 24,
  "PreemptionProcesses": 2,
  "PostponePromotionOnLagMinutes": 5,
  "AutoDiscoverOnStart": true,
  "MySQLDiscoveryIntervalSeconds": 10,
  "OrchestratorHost": "localhost",
  "OrchestratorPort": 3000,
  "HTTPPort": 8080,
  "RaftStorageDir": "/var/lib/orchestrator/raft",
  "RaftBind": "0.0.0.0:10000",
  "RaftServerID": 1,
  "RaftDiscoverViaGossip": true,
  "RaftGossipIntervalSeconds": 5,
  "RaftGossipPort": 7777,
  "RaftHeartbeatTimeoutSeconds": 10,
  "RaftLeaderTimeoutSeconds": 10,
  "RaftSnapshotCount": 1000,
  "RaftPurgeIntervalSeconds": 3600
}

You’ll need to create the MySQL users `orchestrator` and `orchestrator_promote` on your MySQL instances with appropriate privileges. The `orchestrator` user needs `REPLICATION CLIENT`, `REPLICATION SLAVE`, `SELECT`, `SHOW DATABASES`, `LOCK TABLES`, `PROCESS`, and `SUPER` privileges. The `orchestrator_promote` user needs `REPLICATION SLAVE`, `REPLICATION CLIENT`, `SUPER`, `RELOAD`, `PROCESS`, and `LOCK TABLES` privileges.

-- On your MySQL master (or any node to create users globally)
CREATE USER 'orchestrator'@'%' IDENTIFIED BY 'your_orchestrator_db_password';
GRANT REPLICATION CLIENT, REPLICATION SLAVE, SELECT, SHOW DATABASES, LOCK TABLES, PROCESS, SUPER ON *.* TO 'orchestrator'@'%';

CREATE USER 'orchestrator_promote'@'%' IDENTIFIED BY 'your_promotion_password';
GRANT REPLICATION SLAVE, REPLICATION CLIENT, SUPER, RELOAD, PROCESS, LOCK TABLES ON *.* TO 'orchestrator_promote'@'%';
FLUSH PRIVILEGES;

Create the backend database that Orchestrator uses to store its own state. Orchestrator deploys its schema tables itself the first time it starts, so no separate initialization command is needed:

-- On the MySQL instance named in `MySQLOrchestratorHostPort`
CREATE DATABASE IF NOT EXISTS orchestrator;
GRANT ALL PRIVILEGES ON orchestrator.* TO 'orchestrator'@'%';

Now, start the Orchestrator service. For systemd:

sudo systemctl enable orchestrator
sudo systemctl start orchestrator

Open the Orchestrator web UI at `http://your_orchestrator_ip:8080`. Orchestrator needs one seed instance before it can map the cluster: click “Discover DBs” and enter the host and port of your primary MySQL node. From there, Orchestrator crawls the replication topology and finds the remaining nodes.

Integrating C++ Application with MySQL Failover

Your C++ application needs to be aware of potential master changes. The most robust approach is to abstract database connection management. Instead of hardcoding a single master IP, your application should query Orchestrator’s API to determine the current master’s address.

Orchestrator exposes a REST API. We can use `curl` or a C++ HTTP client library (like `libcurl` or `cpprestsdk`) to fetch this information.

C++ Example: Dynamic MySQL Master Discovery

This C++ snippet demonstrates how to fetch the current master’s hostname/IP from Orchestrator’s API. For simplicity, we’ll use `curl` via `popen` for this example. In a production environment, a dedicated HTTP client library is recommended for better error handling and performance.

#include <iostream>
#include <string>
#include <cstdio>
#include <memory>
#include <array>
#include <stdexcept> // std::runtime_error

// Function to execute a command and capture its output
std::string exec(const char* cmd) {
    std::array<char, 128> buffer;
    std::string result = "";
    std::unique_ptr<FILE, decltype(&pclose)> pipe(popen(cmd, "r"), pclose);
    if (!pipe) {
        throw std::runtime_error("popen() failed!");
    }
    while (fgets(buffer.data(), buffer.size(), pipe.get()) != nullptr) {
        result += buffer.data();
    }
    return result;
}

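// Function to get the current MySQL master from Orchestrator.
// NOTE: the endpoint path and response shape used here are this example's assumptions;
// verify both against the API of your Orchestrator version.
std::string getCurrentMySQLMaster(const std::string& orchestratorApiUrl) {
    std::string command = "curl -s \"" + orchestratorApiUrl + "/api/who-is-master-of?candidate=" + orchestratorApiUrl + "\"";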
    std::string output = exec(command.c_str());

    // Basic JSON parsing (for a real app, use a JSON library like nlohmann/json)
    // Expected format: {"candidate": "...", "masterHost": "...", "masterPort": ...}
    size_t masterHostPos = output.find("\"masterHost\":");
    if (masterHostPos == std::string::npos) {
        throw std::runtime_error("Could not find 'masterHost' in Orchestrator API response.");
    }
    masterHostPos += std::string("\"masterHost\":").length();

    size_t startQuote = output.find('"', masterHostPos);
    if (startQuote == std::string::npos) {
        throw std::runtime_error("Could not find start quote for masterHost.");
    }
    size_t endQuote = output.find('"', startQuote + 1);
    if (endQuote == std::string::npos) {
        throw std::runtime_error("Could not find end quote for masterHost.");
    }

    return output.substr(startQuote + 1, endQuote - startQuote - 1);
}

int main() {
    std::string orchestratorApi = "http://your_orchestrator_ip:8080"; // Replace with your Orchestrator API endpoint

    try {
        std::string masterHost = getCurrentMySQLMaster(orchestratorApi);
        std::cout << "Current MySQL Master: " << masterHost << std::endl;

        // Now use 'masterHost' to establish your MySQL connection
        // Example: connect_to_mysql(masterHost, "your_user", "your_password", "your_database");

    } catch (const std::exception& e) {
        std::cerr << "Error: " << e.what() << std::endl;
        // Implement fallback or retry logic here
        return 1;
    }
    return 0;
}
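
Because `popen` hands the command string to a shell, the URL should be quoted before it is interpolated, especially if it comes from configuration. The following is a minimal sketch of one way to do that, assuming a POSIX shell. `shellQuote` and `buildCurlCommand` are hypothetical helper names that do not appear in the code above, and the 5-second timeout is an arbitrary illustrative value.

```cpp
#include <string>

// Hypothetical helper: wraps a value in single quotes for a POSIX shell,
// escaping any single quotes embedded in the value.
std::string shellQuote(const std::string& value) {
    std::string quoted = "'";
    for (char c : value) {
        if (c == '\'') {
            quoted += "'\\''";  // close the quote, emit an escaped quote, reopen
        } else {
            quoted += c;
        }
    }
    quoted += "'";
    return quoted;
}

// Hypothetical helper: builds the curl invocation for a given URL.
// --fail makes curl exit non-zero on HTTP errors; --max-time bounds the wait.
std::string buildCurlCommand(const std::string& url) {
    return "curl -s --fail --max-time 5 " + shellQuote(url);
}
```

The string returned by `buildCurlCommand` can be passed to `exec` in place of the hand-built command.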

In your application's connection logic, call `getCurrentMySQLMaster` periodically or whenever a connection error occurs. After a failed connection, the application should re-query Orchestrator for the master and reconnect to the address it returns. With this dynamic discovery, the application follows the active master after an automated failover instead of waiting for a manual configuration change.
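
The reconnect-on-failure flow described above might be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: `connectWithRediscovery` is a hypothetical helper, `resolveMaster` stands in for a call such as `getCurrentMySQLMaster`, and `tryConnect` stands in for whatever your MySQL client library uses to open a connection.

```cpp
#include <chrono>
#include <functional>
#include <stdexcept>
#include <string>
#include <thread>

// Hypothetical helper: asks resolveMaster for the current master before every
// attempt, so a failover that happened between attempts is picked up.
// Returns the host that accepted the connection, or throws after maxAttempts.
std::string connectWithRediscovery(
    const std::function<std::string()>& resolveMaster,
    const std::function<bool(const std::string&)>& tryConnect,
    int maxAttempts,
    std::chrono::milliseconds initialDelay) {
    std::chrono::milliseconds delay = initialDelay;
    for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
        try {
            const std::string host = resolveMaster();
            if (tryConnect(host)) {
                return host;
            }
        } catch (const std::exception&) {
            // Orchestrator itself may be briefly unreachable during a failover;
            // count this as a failed attempt and retry.
        }
        if (attempt < maxAttempts) {
            std::this_thread::sleep_for(delay);
            delay *= 2;  // exponential backoff between attempts
        }
    }
    throw std::runtime_error("could not connect to a MySQL master");
}
```

Passing the resolver and connector as callables keeps the retry policy separate from both the Orchestrator lookup and the MySQL client, so either can be swapped or mocked in tests.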

Automating Failover with Orchestrator Hooks

Orchestrator can execute custom scripts (hooks) before and after critical operations like promotions. This is crucial for notifying your application layer or performing other necessary state changes.

Create a script, for example, `/etc/orchestrator/hooks/mysql-failover-hook.sh`:

#!/bin/bash

# This script is executed by Orchestrator during failover events.
# It receives arguments detailing the event.

EVENT_TYPE="$1"
INSTANCE_KEY="$2" # e.g., "hostname:port"
CLUSTER_NAME="$3"
NEW_MASTER_KEY="$4" # Only present for promotion events

LOG_FILE="/var/log/orchestrator/hooks.log"

echo "$(date): Event Type: $EVENT_TYPE, Instance Key: $INSTANCE_KEY, Cluster Name: $CLUSTER_NAME, New Master Key: $NEW_MASTER_KEY" >> "$LOG_FILE"

if [ "$EVENT_TYPE" == "post-promotion" ]; then
    # The instance specified by INSTANCE_KEY has just been promoted to master.
    # The NEW_MASTER_KEY will be the same as INSTANCE_KEY in this case.

    # Example: Notify your application layer or a service discovery system.
    # This could involve sending an API call, updating a configuration file,
    # or publishing a message to a queue.

    # For demonstration, we'll just log the event.
    echo "$(date): Instance $INSTANCE_KEY promoted to master in cluster $CLUSTER_NAME." >> "$LOG_FILE"

    # In a real-world scenario, you might trigger a rolling restart of your C++ app instances
    # or update a load balancer configuration.
    # Example: curl -X POST http://your_app_api/notify_db_change?master=$INSTANCE_KEY

elif [ "$EVENT_TYPE" == "post-failure" ]; then
    # An instance has been marked as failed.
    echo "$(date): Instance $INSTANCE_KEY marked as failed in cluster $CLUSTER_NAME." >> "$LOG_FILE"
fi

exit 0

Make the script executable:

sudo chmod +x /etc/orchestrator/hooks/mysql-failover-hook.sh

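Configure Orchestrator to run this hook through its process-hook settings in `orchestrator.conf.json`, such as `OnFailureDetectionProcesses`, `PreFailoverProcesses`, and `PostMasterFailoverProcesses` (check the Orchestrator docs for the full list supported by your version). Orchestrator does not append positional arguments on its own. It substitutes placeholders such as `{failedHost}`, `{failureCluster}`, and `{successorHost}` into the command line, so each entry must spell out the arguments the script expects. For example, to run the script after a master promotion and when a failure is detected:

  // ... other settings ...
  "PostMasterFailoverProcesses": [
    "/etc/orchestrator/hooks/mysql-failover-hook.sh post-promotion {successorHost}:{successorPort} {failureCluster} {successorHost}:{successorPort}",
    "/usr/local/bin/my_other_notification_script post-promotion"
  ],
  "OnFailureDetectionProcesses": [
    "/etc/orchestrator/hooks/mysql-failover-hook.sh post-failure {failedHost}:{failedPort} {failureCluster}"
  ]
// ...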

Restart Orchestrator for the changes to take effect. When Orchestrator detects a master failure and promotes a replica, it will execute the configured hook script, allowing your C++ application or other systems to react accordingly.
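
On the application side, a notification from the hook has to be turned back into a host and port. The hook script above describes the instance key as "hostname:port", so a receiving C++ service could parse it roughly as below. This is a sketch under that assumption: `InstanceKey` and `parseInstanceKey` are hypothetical names, and the notification transport (HTTP call, queue message, and so on) is left out.

```cpp
#include <cstddef>
#include <string>

// Hypothetical type holding the two halves of a "hostname:port" key.
struct InstanceKey {
    std::string host;
    int port = 0;
};

// Parses "hostname:port" into out. Returns false, leaving out untouched,
// when the host or port is missing, the port is not purely numeric,
// or the port is outside 1..65535.
bool parseInstanceKey(const std::string& key, InstanceKey& out) {
    const std::size_t colon = key.rfind(':');
    if (colon == std::string::npos || colon == 0 || colon + 1 >= key.size()) {
        return false;
    }
    const std::string portText = key.substr(colon + 1);
    if (portText.size() > 5) {
        return false;  // more digits than any valid port has
    }
    int port = 0;
    for (char c : portText) {
        if (c < '0' || c > '9') {
            return false;
        }
        port = port * 10 + (c - '0');
    }
    if (port < 1 || port > 65535) {
        return false;
    }
    out.host = key.substr(0, colon);
    out.port = port;
    return true;
}
```

Rejecting malformed keys outright, rather than guessing a default port, keeps a bad notification from silently pointing the application at the wrong endpoint.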

Linode Specific Considerations

Ensure your Linode firewall rules allow traffic between your MySQL nodes for replication (typically port 3306) and between your application servers and the MySQL cluster. Also, ensure Orchestrator nodes can reach all MySQL nodes. For Orchestrator HA, consider using Linode’s private networking for inter-node communication.

Monitoring is critical. Use Linode’s monitoring tools, Prometheus/Grafana, or other solutions to track MySQL performance, replication lag, and Orchestrator’s health. Set up alerts for replication failures or promotion events.

For persistent storage, consider Linode Block Storage. Ensure your MySQL data directories are mounted correctly and that you have a strategy for backing up this data, independent of the failover mechanism.

Conclusion

By combining Orchestrator for automated MySQL failover with a C++ application designed for dynamic master discovery and leveraging Orchestrator’s hook system, you can build a highly available and resilient database layer. This architecture minimizes downtime and ensures your critical services remain operational even in the face of hardware failures or network issues on Linode.
