Server Monitoring Best Practices: Keeping Your C App and MySQL Clusters Alive on Linode

Proactive C Application Health Checks

For a C application critical to your operations, simply checking if the process is running isn’t enough. We need to ensure it’s not just alive, but healthy and responsive. This involves implementing a robust health check endpoint within the application itself and then using an external monitoring tool to periodically query it.

A common pattern is to expose an HTTP endpoint (e.g., `/healthz`) that performs internal checks. For a C application, this might involve:

Verifying critical internal data structures are valid.
Checking connectivity to essential external services (databases, message queues).
Ensuring internal thread pools are not exhausted.
Confirming recent successful operation cycles.
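The checks listed above could be combined into a single function along these lines. This is a minimal sketch, not a definitive implementation: the `app_state` struct, its fields, the `check_app_state()` name, and the 60-second freshness limit are all assumptions made for illustration. A real application would fill the snapshot from its own connection pool, scheduler, and worker bookkeeping.

```c
#include <stddef.h>
#include <time.h>

// Hypothetical snapshot of application state; every field is an assumption.
struct app_state {
    int    db_connected;   // 1 if the last database ping succeeded
    size_t workers_busy;   // worker threads currently handling work
    size_t workers_total;  // size of the worker pool
    time_t last_cycle_ok;  // when the last work cycle completed successfully
};

// Assumed limit: how old the last successful cycle may be, in seconds.
#define MAX_CYCLE_AGE_SECONDS 60

// Returns 1 when every check passes, 0 as soon as one fails.
int check_app_state(const struct app_state *s, time_t now)
{
    if (s == NULL)
        return 0; // internal state is missing or invalid
    if (!s->db_connected)
        return 0; // essential external service unreachable
    if (s->workers_total == 0 || s->workers_busy >= s->workers_total)
        return 0; // thread pool exhausted
    if (difftime(now, s->last_cycle_ok) > MAX_CYCLE_AGE_SECONDS)
        return 0; // no recent successful operation cycle
    return 1;
}
```

A health endpoint handler could then call `check_app_state(&snapshot, time(NULL))` and translate the result into an HTTP status code.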

Let’s consider a simplified C example using `libmicrohttpd` for the web server and a hypothetical internal health check function `is_application_healthy()`.

C Application Health Check Endpoint Example

#include <stdio.h>
#include <stdlib.h>
#include <string.h> // strcmp, strlen
#include <microhttpd.h>

#define PORT 8080

// Hypothetical function to check application's internal state
int is_application_healthy(void) {
    // In a real app, this would check DB connections, thread pools, etc.
    // For demonstration, we'll just return true.
    return 1; // 1 for healthy, 0 for unhealthy
}

// Queue a static string as the response body, then drop our reference to it.
// On libmicrohttpd older than 0.9.71, use int instead of enum MHD_Result.
static enum MHD_Result
send_static (struct MHD_Connection *connection, unsigned int status,
             const char *body, const char *content_type)
{
    struct MHD_Response *mhd_response;
    enum MHD_Result ret;

    mhd_response = MHD_create_response_from_buffer (strlen (body), (void *) body,
                                                    MHD_RESPMEM_PERSISTENT);
    if (mhd_response == NULL)
        return MHD_NO;
    MHD_add_response_header (mhd_response, MHD_HTTP_HEADER_CONTENT_TYPE, content_type);
    ret = MHD_queue_response (connection, status, mhd_response);
    MHD_destroy_response (mhd_response); // MHD holds its own reference until the reply is sent
    return ret;
}

static enum MHD_Result
handle_request (void *cls, struct MHD_Connection *connection,
                const char *url, const char *method,
                const char *version, const char *upload_data,
                size_t *upload_data_size, void **con_cls)
{
    if (strcmp (url, "/healthz") == 0) {
        if (is_application_healthy ())
            return send_static (connection, MHD_HTTP_OK,
                                "{\"status\": \"ok\"}", "application/json");
        // Any non-2xx status makes `curl --fail` report the check as failed
        return send_static (connection, MHD_HTTP_INTERNAL_SERVER_ERROR,
                            "{\"status\": \"unhealthy\"}", "application/json");
    }

    // Handle other routes or return 404
    return send_static (connection, MHD_HTTP_NOT_FOUND, "Not Found", "text/plain");
}

int main (void)
{
    struct MHD_Daemon *daemon;

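    // MHD_USE_INTERNAL_POLLING_THREAD makes MHD run its own event loop thread
    // (older releases name this flag MHD_USE_SELECT_INTERNALLY).
    daemon = MHD_start_daemon (MHD_USE_INTERNAL_POLLING_THREAD, PORT, NULL, NULL,
                               &handle_request, NULL, MHD_OPTION_END);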
    if (daemon == NULL)
        return 1;

    printf("Server started on port %d\n", PORT);
    getchar (); // Keep server running until Enter is pressed

    MHD_stop_daemon (daemon);
    return 0;
}

Compile this with:

gcc -o health_server health_server.c -lmicrohttpd

External Monitoring with Prometheus and Node Exporter

To monitor this health endpoint externally, we’ll leverage Prometheus. The standard way to expose application metrics to Prometheus is via a metrics endpoint (often `/metrics`). However, for simple health checks, we can use `node_exporter`’s textfile collector instead: a small script polls the application’s health endpoint and writes the result to a `.prom` file, which `node_exporter` then exposes as a Prometheus metric.

First, ensure `node_exporter` is installed and running on your Linode instance. You’ll need to configure it to use the textfile collector. This is typically done by creating a directory (e.g., `/var/lib/node_exporter/textfile_collector`) and ensuring `node_exporter` is started with the `--collector.textfile.directory` flag pointing to this path.

Configuring Node Exporter Textfile Collector

Assuming `node_exporter` is managed by `systemd`, edit its service file (e.g., `/etc/systemd/system/node_exporter.service`):

[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target

[Service]
User=node_exporter
ExecStart=/usr/local/bin/node_exporter \
    --collector.textfile.directory=/var/lib/node_exporter/textfile_collector \
    --web.listen-address=0.0.0.0:9100

[Install]
WantedBy=multi-user.target

After editing, reload `systemd` and restart `node_exporter`:

sudo systemctl daemon-reload
sudo systemctl restart node_exporter

Next, create a script that will periodically check your C application’s `/healthz` endpoint and write a Prometheus-formatted metric to the `textfile_collector` directory. A simple `curl` command and `echo` will suffice.

Health Check Script for Textfile Collector

#!/bin/bash

APP_HEALTH_URL="http://localhost:8080/healthz" # Adjust if app is on a different host/port
METRIC_FILE="/var/lib/node_exporter/textfile_collector/app_health.prom"
TMP_FILE="${METRIC_FILE}.$$" # Does not end in .prom, so node_exporter ignores it

# Check if the application is reachable and healthy
# (curl --fail exits non-zero on HTTP status codes >= 400)
if curl --fail --silent --connect-timeout 5 --max-time 10 "$APP_HEALTH_URL" > /dev/null; then
    STATUS=1
    echo "App is healthy."
else
    STATUS=0
    echo "App is UNHEALTHY or unreachable."
fi

# No timestamp is written: the textfile collector skips files containing client-side timestamps.
# No instance label either: Prometheus attaches its own instance label at scrape time.
# Write to a temporary file, then rename, so node_exporter never reads a partial file.
echo "app_health_status{app=\"my_c_app\"} $STATUS" > "$TMP_FILE"
mv "$TMP_FILE" "$METRIC_FILE"
echo "Metric written to $METRIC_FILE"

exit 0

Make this script executable and place it in a location that `cron` can access (e.g., `/usr/local/bin/check_app_health.sh`). Then, set up a cron job to run it every minute:

sudo chmod +x /usr/local/bin/check_app_health.sh
sudo crontab -e
# Add the following line:
* * * * * /usr/local/bin/check_app_health.sh >> /var/log/app_health_check.log 2>&1

Now, Prometheus, configured to scrape `node_exporter` (typically on port 9100), will automatically pick up the `app_health_status` metric. You can then create Prometheus alerts based on this metric (e.g., alert if `app_health_status == 0` for more than 5 minutes).
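As an alternative to the shell script, the application could write the metric file itself. The sketch below is one hedged way to do that in C. The function name `write_health_metric()` and the `.tmp` suffix are assumptions, and it assumes the `app` label value contains no quotes or backslashes. The key idea is to write to a temporary file and `rename()` it, so `node_exporter` never reads a half-written file. No timestamp is emitted, because the textfile collector skips files that contain client-side timestamps.

```c
#include <stdio.h>

// Writes one gauge sample to `path` via a temporary file plus rename().
// Returns 0 on success, -1 on failure.
int write_health_metric(const char *path, const char *app, int healthy)
{
    char tmp_path[4096];
    FILE *f;
    int n;

    // The temporary name does not end in ".prom", so the collector ignores it.
    n = snprintf(tmp_path, sizeof tmp_path, "%s.tmp", path);
    if (n < 0 || (size_t) n >= sizeof tmp_path)
        return -1; // path too long

    f = fopen(tmp_path, "w");
    if (f == NULL)
        return -1;

    fprintf(f, "# HELP app_health_status 1 if the application is healthy, 0 otherwise.\n");
    fprintf(f, "# TYPE app_health_status gauge\n");
    fprintf(f, "app_health_status{app=\"%s\"} %d\n", app, healthy ? 1 : 0);

    if (fclose(f) != 0) {
        remove(tmp_path);
        return -1;
    }
    // rename() replaces the target atomically on POSIX filesystems.
    if (rename(tmp_path, path) != 0) {
        remove(tmp_path);
        return -1;
    }
    return 0;
}
```

The process needs write access to the textfile directory for this to work. If the writer stops running, the last value stays on disk, so it may be worth also alerting on the file’s age.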

MySQL Cluster Monitoring with Percona Monitoring and Management (PMM)

For MySQL clusters, especially those requiring high availability and performance, a dedicated monitoring solution is essential. Percona Monitoring and Management (PMM) is an excellent open-source choice that provides deep insights into MySQL performance, availability, and query analysis.

PMM consists of a server component (which you’ll deploy, often in a Docker container) and client agents that run on your database nodes. The server aggregates data from the agents and presents it via a Grafana dashboard.

Deploying PMM Server

The easiest way to deploy PMM is using Docker. Ensure you have Docker and Docker Compose installed on a dedicated Linode instance or a VM that can reach your MySQL cluster.

# Create a directory for PMM configuration
mkdir pmm-server
cd pmm-server

# Create a docker-compose.yml file
cat <<EOF > docker-compose.yml
version: '3.7'

services:
  pmm-server:
    image: percona/pmm-server:2
    container_name: pmm-server
    restart: always
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - pmm-data:/srv
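    environment:
      # These variables are only read by an nginx-proxy / ACME companion front end, not by PMM itself
      - VIRTUAL_HOST=pmm.yourdomain.com # Replace with your domain or IP
      - LETSENCRYPT_HOST=pmm.yourdomain.com # Replace with your domain or IP
      - LETSENCRYPT_EMAIL=you@yourdomain.com # Replace with your email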

volumes:
  pmm-data:
EOF

# Start PMM Server
docker-compose up -d

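Access PMM via your browser at `https://pmm.yourdomain.com` (or the IP/port you configured). Unless you have installed your own certificate, expect a self-signed certificate warning. Log in with the default credentials (`admin` / `admin`); PMM will then prompt you to change the administrator password.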

Installing PMM Client Agents on MySQL Nodes

On each Linode instance hosting your MySQL cluster nodes, install the PMM client. The installation process varies slightly by OS. For Debian/Ubuntu:

# Download and install the PMM client package
wget https://repo.percona.com/apt/percona-release_latest.generic_all.deb
sudo dpkg -i percona-release_latest.generic_all.deb
sudo apt-get update

# Install the PMM client
sudo apt-get install pmm2-client -y

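# Register the client with your PMM Server
# Replace pmm.yourdomain.com and the admin password with your own values
# Drop --server-insecure-tls if the server presents a trusted certificate
sudo pmm-admin config --server-insecure-tls --server-url=https://admin:your_admin_password@pmm.yourdomain.com:443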

Once the client is installed and configured, you can add your MySQL instances to PMM. For a MySQL cluster, you’ll add each node individually.

Adding a MySQL Node to PMM

# Add a MySQL instance (replace with your actual credentials and host)
# For a single node:
sudo pmm-admin add mysql --host 127.0.0.1 --port 3306 --username pmm_user --password 'your_mysql_password' --service-name 'mysql-node-1'

# If using MySQL replication, add each replica and the primary
# Example for a primary:
sudo pmm-admin add mysql --host 192.168.1.10 --port 3306 --username pmm_user --password 'your_mysql_password' --service-name 'mysql-primary'

# Example for a replica:
sudo pmm-admin add mysql --host 192.168.1.11 --port 3306 --username pmm_user --password 'your_mysql_password' --service-name 'mysql-replica-1'

You’ll need to create a dedicated `pmm_user` in MySQL before running these commands. Monitoring needs only read-oriented privileges, so avoid granting `SUPER`, and restrict the host part (`'%'` below) to the addresses the PMM client connects from where possible. Example SQL for creating the user:

CREATE USER 'pmm_user'@'%' IDENTIFIED BY 'your_mysql_password' WITH MAX_USER_CONNECTIONS 10;
GRANT SELECT, PROCESS, REPLICATION CLIENT, RELOAD ON *.* TO 'pmm_user'@'%';

After adding your instances, PMM will start collecting metrics. Navigate to the PMM dashboard in your browser to view performance graphs, query analytics, and alerts for your MySQL cluster. PMM automatically sets up dashboards for MySQL, including replication lag, connection usage, query performance, and more.

Advanced MySQL Cluster Health Checks

Beyond basic metrics, PMM offers advanced health checks and alerts. For MySQL clusters, key areas to monitor include:

Replication Status: Ensure `Seconds_Behind_Master` (reported as `Seconds_Behind_Source` in MySQL 8.0.22 and later) is consistently low for all replicas. PMM’s dashboards highlight this.
InnoDB Buffer Pool Usage: Monitor hit rate and usage to ensure efficient caching.
Connection Usage: Track `Threads_connected` and `Max_used_connections` to prevent connection exhaustion.
Disk I/O and Latency: Crucial for database performance. PMM integrates with `node_exporter` to show these metrics.
Query Performance: Identify slow queries that can degrade cluster performance. PMM’s Query Analytics is invaluable here.
Error Logs: Monitor MySQL error logs for critical issues. PMM focuses on metrics and query analytics, so plan to ship these logs with a separate log pipeline.
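Two of the items above reduce to simple ratios over MySQL status counters. The sketch below shows the arithmetic in C. The function names are hypothetical. The inputs are the `Innodb_buffer_pool_read_requests` and `Innodb_buffer_pool_reads` counters, plus `Threads_connected` and the `max_connections` setting. Because the InnoDB counters are cumulative since server start, a monitoring tool would normally apply the formula to the difference between two samples rather than to the raw totals.

```c
// Buffer pool hit rate in the range [0, 1].
// read_requests: logical reads (Innodb_buffer_pool_read_requests)
// disk_reads:    reads that missed the pool and went to disk (Innodb_buffer_pool_reads)
double buffer_pool_hit_rate(unsigned long long read_requests,
                            unsigned long long disk_reads)
{
    if (read_requests == 0)
        return 1.0; // no reads in this interval: nothing missed the cache
    if (disk_reads > read_requests)
        return 0.0; // guard against inconsistent samples
    return 1.0 - (double) disk_reads / (double) read_requests;
}

// Fraction of max_connections currently in use, in the range [0, 1] for sane inputs.
double connection_usage(unsigned long threads_connected,
                        unsigned long max_connections)
{
    if (max_connections == 0)
        return 0.0; // avoid division by zero on a bad sample
    return (double) threads_connected / (double) max_connections;
}
```

For example, 10 disk reads out of 1,000 read requests gives a hit rate of 0.99, and 75 connected threads against a limit of 150 gives a usage of 0.5.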

Setting up PMM Alerts

PMM integrates with Alertmanager for sophisticated alerting. You can define alert rules directly within PMM’s Grafana instance. For example, to alert on replication failure:

1. Navigate to the PMM dashboard, then to Grafana.

2. Go to “Alerting” > “Alert rules”.

3. Click “New alert rule”.

4. Configure the rule. PMM stores its metrics in a Prometheus-compatible time-series database, so the query is written in PromQL rather than SQL. For replication lag, you might use a query like:

mysql_slave_status_seconds_behind_master{service_name="mysql-replica-1"}

5. Set the condition (e.g., “is above 300” for 5 minutes).

6. Define the notification channel (e.g., email, Slack, PagerDuty) configured in Alertmanager.
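The “above 300 for 5 minutes” condition means the alert fires only after the value has stayed over the threshold continuously for the whole period; a single dip resets the clock. The C sketch below illustrates that behaviour. It is not Grafana’s or Alertmanager’s actual implementation, and the `alert_state` struct and `alert_should_fire()` function are hypothetical names.

```c
#include <time.h>

// Tracks whether the condition is currently true and since when.
struct alert_state {
    int    pending;        // 1 while the value is above the threshold
    time_t pending_since;  // when the value first crossed the threshold
};

// Call once per evaluation. Returns 1 when the value has been above
// `threshold` for at least `hold_seconds` without interruption, else 0.
int alert_should_fire(struct alert_state *st, double value, double threshold,
                      double hold_seconds, time_t now)
{
    if (value <= threshold) {
        st->pending = 0; // condition cleared: reset the clock
        return 0;
    }
    if (!st->pending) {
        st->pending = 1; // condition just became true
        st->pending_since = now;
    }
    return difftime(now, st->pending_since) >= hold_seconds;
}
```

With a threshold of 300 and a hold time of 300 seconds, a replica that reports 400 seconds of lag at three evaluations spaced over five minutes would trigger the alert only on the last one.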

By combining application-level health checks for your C application with comprehensive monitoring of your MySQL cluster via PMM, you establish a robust system for maintaining high availability and performance on Linode.