Vengala Vinay

Server Monitoring Best Practices: Keeping Your C App and Redis Clusters Alive on Linode

Proactive C Application Health Checks with Systemd

For critical C applications running on Linode, health monitoring should not depend on external probes alone, because they can be slow to detect internal application failures. Integrating health checks into the system’s service manager, systemd, gives faster and more granular detection. This section configures systemd to verify the application’s health endpoint at startup and to restart the service when it fails.

Assume your C application listens on a specific port (e.g., 8080) and has a health check endpoint (e.g., /health) that returns an HTTP 200 OK status when healthy. If the application crashes or hangs, this endpoint will become unreachable.

Systemd Service Unit Configuration

Create a systemd service file for your application. Let’s assume your application binary is located at /opt/myapp/myapp-server and its configuration is at /etc/myapp/myapp.conf.

Create the service file /etc/systemd/system/myapp.service:

[Unit]
Description=My C Application Server
After=network.target

[Service]
ExecStart=/opt/myapp/myapp-server --config /etc/myapp/myapp.conf
Restart=on-failure
RestartSec=5
User=myappuser
Group=myappgroup
WorkingDirectory=/opt/myapp

# Startup health probe: runs once, right after the main process is started.
# systemd does not run commands through a shell, so no redirections are used;
# --output /dev/null discards the response body instead.
ExecStartPost=/usr/bin/curl --fail --silent --output /dev/null --retry 5 --retry-delay 1 --retry-connrefused --connect-timeout 5 http://localhost:8080/health

[Install]
WantedBy=multi-user.target

Explanation:

  • Description: A human-readable description of the service.
  • After=network.target: Orders the service after basic network setup. This is enough for a localhost check; if the application needs a fully configured network, use After=network-online.target together with Wants=network-online.target.
  • ExecStart: The command to start your C application.
  • Restart=on-failure: Configures systemd to restart the service when it exits with a non-zero status, is killed by a signal, or times out.
  • RestartSec=5: Waits 5 seconds before attempting a restart.
  • User/Group: Runs the application as a non-root user for security.
  • WorkingDirectory: Sets the working directory for the application.
  • ExecStartPost: This is crucial. It runs once after the main ExecStart process has been started. We use curl to hit the /health endpoint. --fail makes curl return a non-zero exit code if the server answers with an HTTP status of 400 or higher. --silent suppresses progress meters and error messages. --connect-timeout 5 limits each connection attempt to 5 seconds. --retry, --retry-delay, and --retry-connrefused give the application a few seconds to bind its port before the probe gives up. If the probe fails, systemd treats the start as failed and Restart=on-failure applies.
  • Ongoing checks: systemd has no built-in periodic HTTP health check. To detect a hang after startup, either use the watchdog (Type=notify with WatchdogSec=, which requires the application to send WATCHDOG=1 through sd_notify at regular intervals) or run an external probe on a schedule, for example from a systemd timer.
  • [Install]: Defines how the service should be enabled.
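
Because ExecStartPost runs only once, a hang that happens later needs a separate periodic probe. The following is a minimal sketch of such a probe, not a definitive implementation. The state file path, the failure threshold, and the function names are assumptions made for illustration; the URL and service name are the ones used in this article.

```shell
#!/bin/bash
# Hypothetical periodic probe: restart the service after several
# consecutive failed health checks. STATE_FILE and MAX_FAILURES are
# illustrative assumptions, not values required by systemd.
HEALTH_URL="${HEALTH_URL:-http://localhost:8080/health}"
SERVICE="${SERVICE:-myapp.service}"
MAX_FAILURES="${MAX_FAILURES:-3}"
STATE_FILE="${STATE_FILE:-/run/myapp-health.failures}"

# Probe the endpoint once; fails on HTTP errors, timeouts, and refused connections.
probe() {
    curl --fail --silent --output /dev/null --max-time 5 "$HEALTH_URL"
}

# Restart the service; kept in its own function so it can be swapped out.
restart_service() {
    systemctl restart "$SERVICE"
}

# One check cycle: reset the counter on success, otherwise increment it
# and restart once MAX_FAILURES consecutive probes have failed.
check_once() {
    local failures=0
    if [ -f "$STATE_FILE" ]; then
        failures=$(cat "$STATE_FILE")
    fi
    failures=${failures:-0}
    if probe; then
        echo 0 > "$STATE_FILE"
        return 0
    fi
    failures=$((failures + 1))
    if [ "$failures" -ge "$MAX_FAILURES" ]; then
        echo "health check failed $failures times, restarting $SERVICE" >&2
        restart_service
        failures=0
    fi
    echo "$failures" > "$STATE_FILE"
    return 1
}
```

Saved as a script that ends with a call to check_once, this could be run as root every few seconds from a systemd timer or cron. Requiring several consecutive failures avoids restarting the application on a single slow response.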

After creating the service file, reload systemd, enable, and start your application:

sudo systemctl daemon-reload
sudo systemctl enable myapp.service
sudo systemctl start myapp.service

You can check the status and logs with:

sudo systemctl status myapp.service
sudo journalctl -u myapp.service -f
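
Automatic restarts can hide a crash loop, since the service looks active most of the time. Reasonably recent systemd versions expose a restart counter through the NRestarts property, so a small helper can surface it. This is a sketch; the function name parse_nrestarts is an assumption for illustration.

```shell
#!/bin/bash
# Extract the automatic restart counter from `systemctl show` output on stdin.
parse_nrestarts() {
    awk -F= '$1 == "NRestarts" { print $2 }'
}

# Example invocation (commented out so the snippet has no side effects):
# systemctl show myapp.service --property=NRestarts | parse_nrestarts
```

A steadily increasing value between checks suggests the application keeps failing and being restarted, which is worth an alert even while the service reports as running.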

Redis Cluster Monitoring with Redis-CLI and Prometheus Exporter

Monitoring a Redis cluster involves tracking node health, memory usage, latency, and command statistics. A common and effective approach is to use the built-in redis-cli for basic checks and integrate with Prometheus using the redis_exporter for comprehensive metrics collection.

Basic Redis Node Health Check Script

We can create a simple shell script to check the status of each node in the Redis cluster. This script can be run periodically by cron or integrated into a more sophisticated monitoring system.

Create a script, e.g., /opt/scripts/check_redis_cluster.sh:

#!/bin/bash
# Checks cluster state, master/replica counts, and per-node reachability.

REDIS_HOSTS=("redis-node-1:6379" "redis-node-2:6379" "redis-node-3:6379")
# Redis Cluster needs at least three masters; adjust both counts to your topology
# (this three-node example has no replicas).
EXPECTED_MASTERS=3
EXPECTED_SLAVES=0

CLUSTER_INFO=$(redis-cli -h redis-node-1 -p 6379 cluster info 2>&1)
CLUSTER_NODES=$(redis-cli -h redis-node-1 -p 6379 cluster nodes 2>&1)

# Check cluster_state
if echo "$CLUSTER_INFO" | grep -q "cluster_state:ok"; then
    echo "Redis cluster state is OK."
else
    echo "ERROR: Redis cluster state is NOT OK. Cluster Info:"
    echo "$CLUSTER_INFO"
    exit 1
fi

# Count masters and replicas by the flags column (field 3), ignoring nodes
# flagged as failing. CLUSTER NODES labels replicas with the "slave" flag.
MASTERS=$(echo "$CLUSTER_NODES" | awk '$3 ~ /master/ && $3 !~ /fail/ {n++} END {print n+0}')
SLAVES=$(echo "$CLUSTER_NODES" | awk '$3 ~ /slave/ && $3 !~ /fail/ {n++} END {print n+0}')

if [ "$MASTERS" -eq "$EXPECTED_MASTERS" ] && [ "$SLAVES" -eq "$EXPECTED_SLAVES" ]; then
    echo "Redis cluster has $MASTERS masters and $SLAVES slaves. (Expected: $EXPECTED_MASTERS masters, $EXPECTED_SLAVES slaves)"
else
    echo "WARNING: Redis cluster has $MASTERS masters and $SLAVES slaves. (Expected: $EXPECTED_MASTERS masters, $EXPECTED_SLAVES slaves)"
    # Depending on criticality, you might want to exit 1 here.
fi

# Check individual node reachability and role
for NODE in "${REDIS_HOSTS[@]}"; do
    HOST=${NODE%%:*}
    PORT=${NODE##*:}
    echo "Checking node: $HOST:$PORT"
    if redis-cli -h "$HOST" -p "$PORT" ping > /dev/null 2>&1; then
        # ROLE prints several lines; the first one is the role name.
        ROLE=$(redis-cli -h "$HOST" -p "$PORT" role | head -n 1)
        echo "  Node $HOST:$PORT is PINGable. Role: $ROLE"
    else
        echo "  ERROR: Node $HOST:$PORT is NOT PINGable."
        exit 1
    fi
done

exit 0
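
Counting masters and replicas shows that something is wrong but not which node. In the cluster nodes output, the third field holds each node's flags, where fail? marks a node suspected of failing and fail marks one confirmed as failed. The sketch below lists such nodes; the function name failed_nodes is an assumption for illustration.

```shell
#!/bin/bash
# Print "address flags" for every node whose flags column contains a failure
# marker ("fail?" or "fail"). Reads `redis-cli cluster nodes` output on stdin.
failed_nodes() {
    awk '$3 ~ /fail/ { split($2, addr, "@"); print addr[1], $3 }'
}

# Example invocation (commented out so the snippet has no side effects):
# redis-cli -h redis-node-1 -p 6379 cluster nodes | failed_nodes
```

An empty result means no node is currently flagged. Any output can be appended to the warning message so the log names the affected node directly.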

Make the script executable and add it to cron:

chmod +x /opt/scripts/check_redis_cluster.sh
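# Add to cron: run every 5 minutes, keeping any existing crontab entries.
# Run this as a user who can write to /var/log (for example root).
(crontab -l 2>/dev/null; echo "*/5 * * * * /opt/scripts/check_redis_cluster.sh >> /var/log/redis_cluster_check.log 2>&1") | crontab -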
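
Piping a line into crontab - replaces the whole crontab with whatever arrives on standard input, so installing an entry twice or without the existing entries can duplicate or lose jobs. A possible way to make the step repeatable is sketched below; the helper name with_cron_line is an assumption for illustration.

```shell
#!/bin/bash
# Print the crontab read on stdin with the given line appended,
# unless an identical line is already present.
with_cron_line() {
    local line="$1" current
    current=$(cat)
    if [ -n "$current" ]; then
        printf '%s\n' "$current"
    fi
    if ! printf '%s\n' "$current" | grep -Fxq -- "$line"; then
        printf '%s\n' "$line"
    fi
}

# Example invocation (commented out so the snippet has no side effects):
# ENTRY='*/5 * * * * /opt/scripts/check_redis_cluster.sh >> /var/log/redis_cluster_check.log 2>&1'
# crontab -l 2>/dev/null | with_cron_line "$ENTRY" | crontab -
```

Running the commented pipeline a second time leaves the crontab unchanged, which makes it safe to include in provisioning scripts.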

Integrating Redis Exporter with Prometheus

For more detailed metrics and integration with a centralized monitoring system like Prometheus, deploying the redis_exporter is highly recommended. This exporter runs as a separate service and exposes Redis metrics in a Prometheus-compatible format.

Installation:

Download a release from the redis_exporter releases page (v1.47.0 is used here; substitute the current version). For example, on a 64-bit Linux system:

wget https://github.com/oliver006/redis_exporter/releases/download/v1.47.0/redis_exporter-v1.47.0.linux-amd64.tar.gz
tar xvfz redis_exporter-v1.47.0.linux-amd64.tar.gz
# The archive unpacks into a directory named after the release.
sudo mv redis_exporter-v1.47.0.linux-amd64/redis_exporter /usr/local/bin/

Systemd Service for Redis Exporter:

Create a systemd service file /etc/systemd/system/redis_exporter.service. For simplicity, this unit points the exporter at a single node. To collect metrics from every node with one exporter, Prometheus can instead call the exporter’s /scrape endpoint with a target parameter for each node, or you can run one exporter per node.

[Unit]
Description=Redis Exporter
After=network.target

[Service]
User=redis_exporter
Group=redis_exporter
ExecStart=/usr/local/bin/redis_exporter \
  --redis.addr=redis://redis-node-1:6379 \
  --web.listen-address=:9121
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target

Explanation:

  • --redis.addr: The address of the Redis instance the exporter connects to by default.
  • --web.listen-address: The address and port on which the exporter serves metrics; :9121 is the default.
  • Metric prefix: The exporter uses the redis namespace by default, so metrics are named redis_up, redis_memory_used_bytes, and so on. Changing it with --namespace renames every metric, and alert expressions must then use the new prefix.
  • Key-level metrics: Per-database key counts are collected from INFO by default. Options for inspecting individual keys vary between exporter versions; run redis_exporter --help to see what your version supports.

Create a system user for the exporter, reload systemd, and start the service:

sudo useradd --system --no-create-home redis_exporter
sudo systemctl daemon-reload
sudo systemctl enable redis_exporter.service
sudo systemctl start redis_exporter.service

The exporter will now be available at http://localhost:9121/metrics. Configure your Prometheus server to scrape this endpoint.
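
Before wiring up Prometheus, it can help to confirm that the exporter actually reaches Redis. The metrics endpoint returns plain text with one sample per line, so a single value can be pulled out with a small filter. This is a sketch; the function name metric_value is an assumption for illustration, and it returns only the first sample of a metric.

```shell
#!/bin/bash
# Print the value of the first sample of a metric from Prometheus
# text-format input on stdin. Prints nothing if the metric is absent.
metric_value() {
    awk -v name="$1" '
        /^#/ { next }                       # skip HELP/TYPE comment lines
        {
            metric = $1
            sub(/[{].*/, "", metric)        # drop the label set, if any
            if (metric == name) { print $NF; exit }
        }'
}

# Example invocation (commented out so the snippet has no side effects):
# curl --silent http://localhost:9121/metrics | metric_value redis_up
```

With the default namespace, a redis_up value of 1 indicates that the exporter connected to Redis, while 0 means the exporter is running but cannot reach the node.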

Prometheus Configuration for Redis Cluster Scraping

In your Prometheus configuration file (e.g., /etc/prometheus/prometheus.yml), add a scrape job for the Redis exporter:

scrape_configs:
  - job_name: 'redis_cluster'
    static_configs:
      - targets: ['localhost:9121'] # Or the IP/hostname of your redis_exporter instance
    metrics_path: '/metrics'

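If you have multiple Redis exporters or want to use service discovery, adjust the static_configs accordingly. After updating the Prometheus configuration, reload it. The HTTP reload endpoint below only works when Prometheus was started with --web.enable-lifecycle; otherwise, send SIGHUP to the Prometheus process instead.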

curl -X POST http://localhost:9090/-/reload
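
Reloading a configuration with a syntax error leaves Prometheus on the old configuration and is easy to miss. One cautious approach is to validate the file with promtool, which ships with Prometheus, before triggering the reload. The sketch below assumes promtool is on the PATH; the function names and the PROM_CONFIG and PROM_URL variables are assumptions for illustration.

```shell
#!/bin/bash
# Hypothetical wrapper: validate the configuration, then ask Prometheus to reload.
PROM_CONFIG="${PROM_CONFIG:-/etc/prometheus/prometheus.yml}"
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Validate the configuration file and any rule files it references.
validate_config() {
    promtool check config "$PROM_CONFIG"
}

# Trigger a reload through the lifecycle API (needs --web.enable-lifecycle).
trigger_reload() {
    curl --fail --silent --show-error -X POST "$PROM_URL/-/reload"
}

# Reload only when validation succeeds.
safe_reload() {
    if ! validate_config; then
        echo "configuration invalid, not reloading" >&2
        return 1
    fi
    trigger_reload
}
```

Calling safe_reload after each edit keeps a broken file from being submitted and prints promtool's diagnostics when validation fails.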

Alerting on Key Redis Metrics

With Prometheus collecting metrics, you can define alerting rules in a rules file (for example rules.yml) listed under rule_files in prometheus.yml. Each rule below belongs under the rules: key of a named group in that file. Here are some essential alerts for a Redis cluster:

- alert: RedisClusterDown
  expr: |
    up{job="redis_cluster"} == 0
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: "Redis cluster exporter is down on {{ $labels.instance }}"
    description: "The Prometheus exporter for Redis cluster is not reachable."

- alert: RedisNodeNotPinging
  expr: |
    redis_up{job="redis_cluster"} == 0
  for: 2m
  labels:
    severity: critical
  annotations:
    summary: "Redis node {{ $labels.instance }} is not responding to PING"
    description: "The Redis node {{ $labels.instance }} is down or unreachable."

- alert: RedisHighMemoryUsage
  expr: |
    redis_memory_used_bytes{job="redis_cluster"} / redis_total_system_memory_bytes{job="redis_cluster"} * 100 > 85
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: 'Redis node {{ $labels.instance }} has high memory usage ({{ $value | printf "%.2f" }}%)'
    description: 'Redis node {{ $labels.instance }} is using {{ $value | printf "%.2f" }}% of total system memory.'

- alert: RedisHighLatency
  # Average PING duration over 5 minutes: increase in total time divided by
  # increase in call count. Metric and label names follow redis_exporter's
  # command statistics; verify them against your exporter's /metrics output.
  expr: |
    rate(redis_commands_duration_seconds_total{job="redis_cluster",cmd="ping"}[5m])
      / rate(redis_commands_total{job="redis_cluster",cmd="ping"}[5m]) > 0.01
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Redis node {{ $labels.instance }} has high PING latency"
    description: "Redis node {{ $labels.instance }} is averaging more than 10 ms per PING command."

- alert: RedisClusterStateNotOk
  # PromQL compares numbers, not strings. This assumes the exporter reports
  # cluster_state as 1 for "ok" and 0 otherwise; confirm on your /metrics output.
  expr: |
    redis_cluster_state{job="redis_cluster"} == 0
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: "Redis cluster state is not OK on {{ $labels.instance }}"
    description: "The Redis cluster state reported by {{ $labels.instance }} is not 'ok'."
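
A latency threshold is easier to tune once the arithmetic is clear. Redis reports two cumulative counters per command, total time spent and total calls, and the average latency over a window is the increase in time divided by the increase in calls. The sketch below works through that calculation with made-up sample values; the function name avg_latency is an assumption for illustration.

```shell
#!/bin/bash
# avg_latency <seconds_start> <calls_start> <seconds_end> <calls_end>
# Prints the average seconds per call between two counter readings,
# or "no-data" when no calls happened (or a counter was reset).
avg_latency() {
    awk -v d0="$1" -v c0="$2" -v d1="$3" -v c1="$4" 'BEGIN {
        calls = c1 - c0
        if (calls <= 0) { print "no-data"; exit }
        printf "%.4f\n", (d1 - d0) / calls
    }'
}

# 3 seconds spent on 200 calls between the two readings:
avg_latency 10.0 1000 13.0 1200   # prints 0.0150
```

In this example the average is 15 ms per call, which would exceed a 10 ms threshold. The no-data case mirrors what happens in PromQL when the call rate is zero: the division yields no usable value and the alert does not fire.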

These alerts, when combined with Prometheus Alertmanager, provide a robust notification system for critical issues affecting your C application and Redis cluster on Linode.

Copyright © 2026 · Vinay Vengala