Server Monitoring Best Practices: Keeping Your Laravel App and Redis Clusters Alive on DigitalOcean
Proactive Redis Cluster Health Checks with `redis-cli`
Maintaining the health of a Redis cluster is paramount for any high-traffic Laravel application. Beyond basic uptime, we need to monitor key performance indicators and cluster state. DigitalOcean’s managed Redis offers some insights, but direct `redis-cli` commands provide granular, real-time diagnostics. We’ll focus on commands that reveal cluster integrity and potential bottlenecks.
First, establish a connection to one of your Redis nodes. If you’re using a private network on DigitalOcean, this will be a private IP address. For external access, use the public IP and ensure your firewall rules are correctly configured.
Cluster State and Node Status
The `CLUSTER INFO` command provides a high-level overview of the cluster’s status. Pay close attention to `cluster_state` (should be `ok`) and compare `cluster_slots_assigned` with `cluster_slots_ok`: in a healthy, fully covered cluster both equal 16384, the total number of hash slots.
redis-cli -h <redis-host> -p <redis-port> CLUSTER INFO
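The check described above can be scripted. The following is a minimal sketch, assuming you have already captured the text output of `CLUSTER INFO` (for example from `redis-cli`); the function names and the sample output are illustrative, not part of any library.

```python
def parse_cluster_info(raw: str) -> dict[str, str]:
    """Turn the `key:value` lines printed by CLUSTER INFO into a dict."""
    fields: dict[str, str] = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def cluster_problems(fields: dict[str, str], total_slots: int = 16384) -> list[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    problems: list[str] = []
    state = fields.get("cluster_state")
    if state != "ok":
        problems.append(f"cluster_state is {state!r}")
    assigned = int(fields.get("cluster_slots_assigned", 0))
    slots_ok = int(fields.get("cluster_slots_ok", 0))
    if assigned != total_slots:
        problems.append(f"only {assigned} of {total_slots} slots assigned")
    if slots_ok != assigned:
        problems.append(f"{assigned - slots_ok} assigned slots are not ok")
    return problems


# Hypothetical sample output, trimmed to the fields used above.
SAMPLE_CLUSTER_INFO = """cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
"""

if __name__ == "__main__":
    print(cluster_problems(parse_cluster_info(SAMPLE_CLUSTER_INFO)))
```

Returning a list of problems rather than a boolean makes it easy to forward the reason for a failure to whatever alerting channel you use.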
A more detailed view of individual nodes and their roles is obtained with `CLUSTER NODES`. This output is crucial for identifying nodes that are disconnected, in a `fail` state, or not participating correctly in the cluster.
redis-cli -h <redis-host> -p <redis-port> CLUSTER NODES
Look for:
- Nodes marked with `myself,master` or `myself,slave`.
- The `connected` status of each node.
- The `master` field for slaves, ensuring they point to a valid master.
- The `slots` field for masters, confirming they are responsible for the expected hash slots.
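The items in this list can be checked automatically. Below is a rough sketch that parses captured `CLUSTER NODES` output, assuming the space-separated layout of node ID, address, flags, master ID, ping-sent, pong-received, config epoch, link state, then slot ranges. The shortened node IDs and IP addresses in the sample are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ClusterNode:
    node_id: str
    address: str
    flags: list[str]
    master_id: str   # "-" for masters
    link_state: str  # "connected" or "disconnected"
    slots: list[str]


def parse_cluster_nodes(raw: str) -> list[ClusterNode]:
    """Parse each non-empty line of CLUSTER NODES output into a ClusterNode."""
    nodes: list[ClusterNode] = []
    for line in raw.splitlines():
        parts = line.split()
        if len(parts) < 8:
            continue  # skip blank or truncated lines
        nodes.append(ClusterNode(
            node_id=parts[0],
            address=parts[1],
            flags=parts[2].split(","),
            master_id=parts[3],
            link_state=parts[7],
            slots=parts[8:],
        ))
    return nodes


def unhealthy_nodes(nodes: list[ClusterNode]) -> list[tuple[str, str]]:
    """Return (address, reason) pairs for nodes that need attention."""
    master_ids = {n.node_id for n in nodes if "master" in n.flags}
    problems: list[tuple[str, str]] = []
    for n in nodes:
        is_replica = "slave" in n.flags or "replica" in n.flags
        if "fail" in n.flags or "fail?" in n.flags:
            problems.append((n.address, "flagged " + ",".join(n.flags)))
        elif n.link_state != "connected":
            problems.append((n.address, "link is " + n.link_state))
        elif is_replica and n.master_id not in master_ids:
            problems.append((n.address, "replica points at unknown master"))
        elif "master" in n.flags and not n.slots:
            problems.append((n.address, "master owns no slots"))
    return problems


# Hypothetical sample with shortened node IDs.
SAMPLE_CLUSTER_NODES = """\
07c3 10.0.0.1:6379@16379 myself,master - 0 1426238317239 1 connected 0-5460
67ed 10.0.0.2:6379@16379 master - 0 1426238316232 2 connected 5461-10922
2928 10.0.0.3:6379@16379 master,fail - 1426238310000 1426238300000 3 disconnected 10923-16383
6ec2 10.0.0.4:6379@16379 slave 07c3 0 1426238317741 1 connected
824f 10.0.0.5:6379@16379 slave 9999 0 1426238317741 2 connected
"""

if __name__ == "__main__":
    for address, reason in unhealthy_nodes(parse_cluster_nodes(SAMPLE_CLUSTER_NODES)):
        print(address, reason)
```

A master without slots is not always an error (a freshly added node has none until resharding), so treat that case as a warning rather than a hard failure.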
Performance Metrics for Bottleneck Detection
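The `INFO` command, restricted to the `CPU`, `MEMORY`, and `STATS` sections, is invaluable for spotting performance degradation, and its `key:value` output is easy to script against. Passing several section names in one call requires Redis 7.0 or later; on older versions, run `INFO` once per section.
redis-cli -h <redis-host> -p <redis-port> INFO CPU MEMORY STATS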
Key metrics to monitor:
- `used_memory` and `used_memory_peak`: Monitor memory usage against your droplet’s limits.
- `instantaneous_ops_per_sec`: High values might indicate a busy node.
- `keyspace_hits` and `keyspace_misses`: A low hit rate can suggest insufficient memory for caching or inefficient key usage.
- `latest_fork_usec`: A high value indicates a long fork operation, which can block the main Redis thread. This is particularly important for persistence operations (RDB/AOF).
- `evicted_keys`: Indicates that Redis is running out of memory and evicting keys.
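As a minimal sketch of scripting against these metrics, the following parses captured `INFO` output and derives a memory ratio and cache hit rate. The function names, the sample values, and the 2 GiB memory limit are assumptions for illustration only.

```python
from typing import Optional, Union


def parse_info(raw: str) -> dict[str, str]:
    """Parse INFO output, skipping blank lines and '# Section' headers."""
    fields: dict[str, str] = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def summarize(fields: dict[str, str], memory_limit_bytes: int) -> dict[str, Union[float, int, None]]:
    """Derive the handful of numbers worth alerting on."""
    hits = int(fields.get("keyspace_hits", 0))
    misses = int(fields.get("keyspace_misses", 0))
    lookups = hits + misses
    hit_rate: Optional[float] = hits / lookups if lookups else None
    return {
        "memory_ratio": int(fields.get("used_memory", 0)) / memory_limit_bytes,
        "hit_rate": hit_rate,  # None when there have been no lookups yet
        # evicted_keys is cumulative since startup: alert on growth, not on the raw value.
        "evicted_keys": int(fields.get("evicted_keys", 0)),
        "latest_fork_ms": int(fields.get("latest_fork_usec", 0)) / 1000,
    }


# Hypothetical sample output, trimmed to the fields used above.
SAMPLE_INFO = """\
# Memory
used_memory:1073741824
used_memory_peak:1181116006

# Stats
instantaneous_ops_per_sec:1500
keyspace_hits:900
keyspace_misses:100
evicted_keys:0
latest_fork_usec:1200
"""

if __name__ == "__main__":
    # Assumed limit: a node allowed to use 2 GiB.
    print(summarize(parse_info(SAMPLE_INFO), memory_limit_bytes=2 * 1024**3))
```

Which thresholds count as "high" depends on your workload, so the sketch only computes the values and leaves the alerting decision to the caller.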
Laravel Application Health Checks with Artisan and Custom Logic
For your Laravel application, health checks should go beyond simply verifying that the web server is responding. We need to ensure critical dependencies like the database and Redis are accessible and functioning correctly from the application’s perspective.
Artisan Command for Dependency Checks
Create a dedicated Artisan command to encapsulate these checks. This command can be triggered by external monitoring tools (e.g., via a cron job or a dedicated monitoring agent).
<?php
// app/Console/Commands/CheckAppHealth.php
namespace App\Console\Commands;
use Exception;
use Illuminate\Console\Command;
use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Redis;
class CheckAppHealth extends Command
{
    protected $signature = 'app:health-check';
    protected $description = 'Performs health checks on application dependencies (DB, Redis)';
    public function handle(): int
    {
        $this->info('Starting application health check...');
        // Run every check, even if an earlier one fails, so the output reports each dependency.
        $databaseOk = $this->checkDatabaseConnection();
        $redisOk = $this->checkRedisConnection();
        if (! $databaseOk || ! $redisOk) {
            $this->error('Application health check failed.');
            return 1; // A non-zero exit code signals failure to monitoring systems
        }
        $this->info('Application health check completed successfully.');
        return 0;
    }
    protected function checkDatabaseConnection(): bool
    {
        try {
            // getPdo() forces the connection to be opened and throws if it cannot be
            DB::connection()->getPdo();
            $this->info('Database connection is healthy.');
            return true;
        } catch (Exception $e) {
            $this->error('Database connection failed: ' . $e->getMessage());
            return false;
        }
    }
    protected function checkRedisConnection(): bool
    {
        try {
            // Attempt a simple SET/GET operation to verify connectivity and functionality
            $key = 'app_health_check_' . uniqid();
            $value = 'ping';
            Redis::set($key, $value, 'EX', 5); // Set with a short expiry
            $retrievedValue = Redis::get($key);
            if ($retrievedValue !== $value) {
                $this->error('Redis SET/GET operation failed. Retrieved value mismatch.');
                return false;
            }
            Redis::del($key); // Clean up
            $this->info('Redis connection is healthy.');
            return true;
        } catch (Exception $e) {
            $this->error('Redis connection failed: ' . $e->getMessage());
            return false;
        }
    }
}
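Register this command in app/Console/Kernel.php. On installs where the kernel’s `commands()` method already loads the app/Console/Commands directory, the command is picked up automatically and this step may be unnecessary:
protected $commands = [
    \App\Console\Commands\CheckAppHealth::class,
];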
Scheduling the Health Check
You can schedule this command to run periodically using Laravel’s scheduler. For external monitoring, you’ll typically run it via cron directly on the server or through a CI/CD pipeline.
// app/Console/Kernel.php
protected function schedule(Schedule $schedule)
{
    // Run every minute for immediate feedback, adjust as needed
    $schedule->command('app:health-check')->everyMinute();
}
Ensure your cron daemon is set up to run Laravel’s scheduler:
* * * * * cd /path-to-your-laravel-project && php artisan schedule:run >> /dev/null 2>&1
Server-Level Monitoring with DigitalOcean and Prometheus/Grafana
While application-level checks are vital, robust monitoring requires observing the underlying infrastructure. DigitalOcean provides basic metrics, but for advanced visualization and alerting, integrating Prometheus and Grafana is a standard practice.
DigitalOcean Droplet Metrics
DigitalOcean’s control panel offers graphs for CPU utilization, bandwidth, load, and memory. These are good for a quick glance but lack the configurability and alerting capabilities of dedicated monitoring stacks.
Setting up Prometheus and Grafana
A common approach is to deploy Prometheus and Grafana on a dedicated monitoring Droplet or within your application’s infrastructure if resources permit. We’ll use Docker for ease of deployment.
# docker-compose.yml
version: '3.7'
services:
  prometheus:
    image: prom/prometheus:v2.37.0
    container_name: prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus:/etc/prometheus/
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
    restart: unless-stopped
  grafana:
    image: grafana/grafana:9.1.0
    container_name: grafana
    ports:
      - "3000:3000"
    volumes:
      - grafana-storage:/var/lib/grafana
    restart: unless-stopped
volumes:
  grafana-storage:
Create a prometheus/prometheus.yml file. This configuration will scrape metrics from your Redis nodes (using `redis_exporter`) and your Laravel application servers (using `node_exporter` and potentially a custom exporter for Artisan command results).
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
scrape_configs:
  # Scrape Prometheus itself
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  # Scrape Node Exporter for server metrics
  - job_name: 'node_exporter'
    static_configs:
      - targets: ['<laravel-server-1-ip>:9100', '<laravel-server-2-ip>:9100'] # Replace with your Laravel server IPs
  # Scrape Redis Exporter for Redis metrics
  - job_name: 'redis_exporter'
    static_configs:
      - targets: ['<redis-node-1-ip>:9121', '<redis-node-2-ip>:9121'] # Replace with your Redis node IPs and exporter ports
  # Example for a custom exporter that exposes Artisan command results (more advanced)
  # - job_name: 'laravel_artisan_exporter'
  #   static_configs:
  #     - targets: ['<laravel-server-ip>:9500'] # Assuming a custom exporter runs on port 9500
You’ll need to deploy `node_exporter` and `redis_exporter` on your respective servers. `redis_exporter` typically runs as a separate process that connects to Redis and exposes metrics via an HTTP endpoint (port 9121 by default).
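# Example for running redis_exporter in Docker on a Redis node
# Note: inside the container, localhost refers to the container itself, so point
# --redis.addr at an address where Redis is reachable from the container.
docker run -d \
  --name redis_exporter \
  -p 9121:9121 \
  oliver006/redis_exporter:latest \
  --redis.addr=redis://localhost:6379 # Adjust if Redis is not on localhost:6379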
`node_exporter` is usually installed directly on the server and run as a systemd service.
Grafana Dashboards and Alerting
Once Prometheus is scraping metrics, configure Grafana to use Prometheus as a data source. Import pre-built dashboards for Redis and Node Exporter (available on Grafana.com) or create custom ones. Key dashboards to look for:
- Redis Cluster Overview
- Redis Memory Usage
- Redis Performance Metrics
- Node Exporter Full Dashboard
Set up alerting rules in Prometheus (or directly in Grafana) for critical conditions:
- Redis cluster state is not `ok`.
- Redis node is unreachable.
- High Redis memory usage (e.g., > 85%).
- High `latest_fork_usec` in Redis.
- High CPU or Memory usage on Laravel servers.
- High disk I/O wait times.
- Laravel health check Artisan command returns non-zero exit code.
Advanced: Custom Laravel Health Check Exporter
To integrate the results of your `php artisan app:health-check` command directly into Prometheus, you can build a small custom exporter. This exporter would periodically run the Artisan command and expose its results as Prometheus metrics.
A simple Python script that calls Artisan through `subprocess` and exposes the result with the `prometheus_client` library can achieve this:
# laravel_artisan_exporter.py
from prometheus_client import start_http_server, Gauge
import subprocess
import time
import os
# Define metrics
laravel_health_status = Gauge(
    'laravel_health_check_status',
    'Health check status of Laravel dependencies (1 for OK, 0 for FAIL)',
    ['dependency']
)
# Get Laravel project path from environment variable
LARAVEL_PATH = os.environ.get('LARAVEL_PATH', '/var/www/html') # Default path, adjust as needed
def run_artisan_health_check():
    try:
        # Execute the Artisan command
        # We capture stdout and stderr to check for specific messages
        result = subprocess.run(
            ['php', 'artisan', 'app:health-check'],
            cwd=LARAVEL_PATH,
            capture_output=True,
            text=True,
            check=False # Don't raise exception on non-zero exit code, we'll check it manually
        )
        # Check exit code
        if result.returncode == 0:
            # Command ran successfully, assume all checks passed if no specific errors are logged
            # More robust: parse output for "Database connection is healthy." and "Redis connection is healthy."
            if "Database connection is healthy." in result.stdout and "Redis connection is healthy." in result.stdout:
                laravel_health_status.labels('database').set(1)
                laravel_health_status.labels('redis').set(1)
                print("Artisan health check OK.")
            else:
                # Partial success or unexpected output
                laravel_health_status.labels('database').set(0)
                laravel_health_status.labels('redis').set(0)
                print(f"Artisan health check: Partial success or unexpected output. STDOUT: {result.stdout}")
        else:
            # Command failed, try to determine which dependency failed based on output
            print(f"Artisan health check FAILED with exit code {result.returncode}. STDERR: {result.stderr}")
            if "Database connection failed" in result.stderr or "Database connection failed" in result.stdout:
                laravel_health_status.labels('database').set(0)
            else:
                laravel_health_status.labels('database').set(1) # Assume OK if not explicitly failed
            if "Redis connection failed" in result.stderr or "Redis connection failed" in result.stdout:
                laravel_health_status.labels('redis').set(0)
            else:
                laravel_health_status.labels('redis').set(1) # Assume OK if not explicitly failed
    except FileNotFoundError:
        print(f"Error: php command not found or Laravel path '{LARAVEL_PATH}' is incorrect.")
        laravel_health_status.labels('database').set(0)
        laravel_health_status.labels('redis').set(0)
    except Exception as e:
        print(f"An unexpected error occurred: {e}")
        laravel_health_status.labels('database').set(0)
        laravel_health_status.labels('redis').set(0)
if __name__ == '__main__':
    # Start up the server to expose the metrics.
    # Expose metrics on port 9500
    exporter_port = 9500
    print(f"Starting Prometheus exporter on port {exporter_port}")
    start_http_server(exporter_port)
    # Run the health check periodically
    check_interval_seconds = 60 # Run every 60 seconds
    while True:
        run_artisan_health_check()
        time.sleep(check_interval_seconds)
Deploy this script on your Laravel server(s) and configure Prometheus to scrape it (as shown in the `prometheus.yml` example). Ensure the `LARAVEL_PATH` environment variable is set correctly for the script.
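One possible variation on the exporter’s output parsing, not the approach used in the script above, is to report a dependency as healthy only when its success message actually appears in the command output. A check that never ran, or one that failed with a message the exporter does not recognise, then reads as 0 rather than being assumed OK. A minimal sketch of that rule as a pure function (the function and constant names are hypothetical; the marker strings are the ones printed by `app:health-check`):

```python
# Success messages printed by the app:health-check Artisan command.
HEALTHY_MARKERS = {
    "database": "Database connection is healthy.",
    "redis": "Redis connection is healthy.",
}


def classify_health_output(output: str) -> dict[str, int]:
    """Map command output to per-dependency gauge values (1 = OK, 0 = FAIL).

    A dependency counts as OK only on positive confirmation, i.e. when its
    success message is present in the output.
    """
    return {dep: int(marker in output) for dep, marker in HEALTHY_MARKERS.items()}


if __name__ == "__main__":
    sample = "Database connection is healthy.\nRedis connection failed: timed out\n"
    print(classify_health_output(sample))
```

Because the function takes plain text and returns plain values, it can be unit tested without PHP or a running Laravel application, and its result can be fed into the gauge with `labels(dep).set(value)`.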
Conclusion: A Multi-Layered Approach
Effective server and application monitoring is a continuous process. By combining direct Redis cluster diagnostics, application-level Artisan checks, and infrastructure-wide metrics via Prometheus and Grafana, you build a resilient system. Proactive identification of issues before they impact users is the ultimate goal, and this multi-layered strategy provides the visibility needed to achieve it.