Server Monitoring Best Practices: Keeping Your PHP App and Redis Clusters Alive on DigitalOcean
Proactive PHP Application Health Checks
A robust monitoring strategy for PHP applications goes beyond basic CPU and memory utilization. We need to ensure the application itself is responsive and capable of handling requests. This involves implementing application-level health checks that can be polled by an external monitoring system.
A common and effective approach is to create a dedicated health check endpoint within your PHP application. This endpoint should perform critical checks, such as database connectivity, cache availability, and essential service dependencies. For a Laravel application, this might look like:
Laravel Health Check Endpoint Example
Create a new route in routes/api.php (or routes/web.php if you prefer):
<?php
use Illuminate\Support\Facades\Cache;
use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Route;

Route::get('/health', function () {
    $status = 'ok';
    $checks = [];

    // Check database connection
    try {
        DB::connection()->getPdo();
        $checks['database'] = 'connected';
    } catch (\Throwable $e) {
        $status = 'error';
        // Exception messages can reveal connection details; log them instead if this route is public
        $checks['database'] = 'disconnected: ' . $e->getMessage();
    }

    // Check cache connection (assuming Redis): write a short-lived key, then read it back
    try {
        Cache::store('redis')->put('health_check_key', 'test', 5);
        if (Cache::store('redis')->get('health_check_key') !== 'test') {
            throw new \RuntimeException('read-back mismatch');
        }
        $checks['cache'] = 'connected';
    } catch (\Throwable $e) {
        $status = 'error';
        $checks['cache'] = 'disconnected: ' . $e->getMessage();
    }

    // Add more checks as needed (e.g., external API availability)

    return response()->json([
        'status' => $status,
        'checks' => $checks,
        'timestamp' => now()->toIso8601String(),
    ], $status === 'ok' ? 200 : 503); // 503 Service Unavailable for errors
});
This endpoint returns a JSON response indicating the overall health and the status of individual components. A 200 OK status code signifies a healthy application, while a 503 Service Unavailable indicates a problem. Because the result is encoded in the status code, external monitoring tools can detect a failure without parsing the response body.
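An external poller can act on the status code alone. The following is a minimal shell sketch under stated assumptions: the function names `health_state_from_code` and `fetch_health_code` are hypothetical, and the 0/2/3 exit codes borrow the common Nagios plugin convention rather than anything the Laravel route requires.

```shell
#!/bin/sh
# Map the HTTP status code returned by /health to a label and an exit code.
# Exit codes (0 = OK, 2 = critical, 3 = unknown) are an assumed convention.
health_state_from_code() {
    case "$1" in
        200)    echo "OK: application healthy"; return 0 ;;
        503)    echo "CRITICAL: a dependency check failed"; return 2 ;;
        000|"") echo "CRITICAL: no HTTP response"; return 2 ;;
        *)      echo "UNKNOWN: unexpected status $1"; return 3 ;;
    esac
}

# Fetch only the status code; curl prints 000 when no response is received.
fetch_health_code() {
    curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1"
}
```

A poller might then run `health_state_from_code "$(fetch_health_code http://your_domain.com/health)"`. If the route lives in routes/api.php, Laravel prefixes it with /api by default, so adjust the URL accordingly.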
Monitoring PHP-FPM and Web Server Performance
Beyond the application logic, the underlying web server (Nginx or Apache) and PHP-FPM processes are critical. Monitoring their performance and resource consumption is paramount.
PHP-FPM Status Page
PHP-FPM provides a built-in status page that offers valuable insights into its worker processes. To enable it, you typically need to configure your PHP-FPM pool.
Edit your PHP-FPM pool configuration file (e.g., /etc/php/8.1/fpm/pool.d/www.conf):
; Add or uncomment these lines
pm.status_path = /fpm-status
ping.path = /fpm-ping
ping.response = pong
Next, configure your web server (Nginx in this example) to pass requests for the status and ping paths to PHP-FPM. Each path needs its own location block:
server {
    listen 80;
    server_name your_domain.com;
    root /var/www/your_app/public;
    index index.php index.html index.htm;

    location / {
        try_files $uri $uri/ /index.php?$query_string;
    }

    # PHP-FPM status page configuration
    location ~ ^/fpm-status$ {
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_pass unix:/var/run/php/php8.1-fpm.sock; # Adjust path to your PHP-FPM socket
        allow 127.0.0.1; # Allow only localhost; "internal" would block every client request
        deny all;        # Alternatively, protect the page with auth_basic
    }

    # PHP-FPM ping page configuration
    location ~ ^/fpm-ping$ {
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_pass unix:/var/run/php/php8.1-fpm.sock; # Adjust path to your PHP-FPM socket
        allow 127.0.0.1; # Allow only localhost for ping
        deny all;
    }

    location ~ \.php$ {
        include snippets/fastcgi-php.conf;
        fastcgi_pass unix:/var/run/php/php8.1-fpm.sock; # Adjust path to your PHP-FPM socket
    }

    location ~ /\.ht {
        deny all;
    }
}
With this setup, requesting /fpm-status from an address permitted by your access rules (for example, from the server itself) returns detailed statistics about your PHP-FPM workers, including active processes, idle processes, and request counts. The /fpm-ping endpoint is useful for automated checks to ensure PHP-FPM is responsive.
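One way to turn the status page into a single number is to compute how many workers are busy. The sketch below assumes the default plain-text layout of the status page, with "active processes" and "total processes" lines; the function name `fpm_saturation_percent` is hypothetical, and any alert threshold you apply to the result is your own choice.

```shell
#!/bin/sh
# Read PHP-FPM status page text on stdin and print the share of busy workers
# as an integer percentage. Returns non-zero if the totals cannot be found.
fpm_saturation_percent() {
    awk -F': *' '
        $1 == "active processes" { active = $2 }
        $1 == "total processes"  { total = $2 }
        END {
            if (total > 0) printf "%d\n", (active * 100) / total
            else exit 1
        }'
}
```

On the server this could be fed with `curl -s http://127.0.0.1/fpm-status | fpm_saturation_percent`. A value that stays close to 100 suggests the pool is running out of workers.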
Nginx/Apache Metrics
Both Nginx and Apache offer ways to expose performance metrics. For Nginx, the ngx_http_stub_status_module is invaluable. Ensure it’s compiled into your Nginx binary (it usually is by default).
# In your Nginx configuration (inside a server block)
server {
    # ... other configurations ...

    location /nginx_status {
        stub_status on;
        access_log off;
        # Optionally add authentication
        # auth_basic "Restricted Area";
        # auth_basic_user_file /etc/nginx/.htpasswd;
    }

    # ... other configurations ...
}
Accessing /nginx_status will provide output like:
Active connections: 123
server accepts handled requests
 1667890 1667890 3456789
Reading: 1 Writing: 3 Waiting: 119
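Key metrics to monitor here are Active connections, accepts, handled, and requests. If handled falls behind accepts, Nginx is dropping connections, usually because a resource limit such as worker_connections has been reached. Waiting counts idle keep-alive connections, so a high value there is normally harmless; a persistently high Writing count is a better hint that the backend application or database is slow to produce responses.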
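These counters are easy to extract in a script. The sketch below assumes the four-line stub_status layout; both function names are hypothetical helpers and not part of Nginx.

```shell
#!/bin/sh
# Print accepts minus handled, i.e. the number of dropped connections.
# Reads stub_status output on stdin.
nginx_dropped_connections() {
    awk '
        want { print $1 - $2; want = 0; next }
        tolower($0) ~ /^server accepts handled requests/ { want = 1 }'
}

# Print the Reading/Writing/Waiting counters as key=value pairs.
nginx_connection_states() {
    awk '$1 == "Reading:" { print "reading=" $2, "writing=" $4, "waiting=" $6 }'
}
```

For example, `curl -s http://127.0.0.1/nginx_status | nginx_dropped_connections` prints 0 on a server that has not dropped any connections since it started.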
Redis Cluster Health and Performance Monitoring
Redis, especially in a cluster configuration, requires diligent monitoring to ensure data availability and low latency. DigitalOcean’s managed Redis service simplifies some aspects, but understanding the underlying metrics is still crucial.
Redis INFO Command
The INFO command is the cornerstone of Redis monitoring. It provides a wealth of information about the server’s state, memory usage, persistence, replication, and more. You can execute this via redis-cli or programmatically.
redis-cli -c -h your_redis_host -p your_redis_port -a your_redis_password INFO memory
redis-cli -c -h your_redis_host -p your_redis_port -a your_redis_password INFO persistence
redis-cli -c -h your_redis_host -p your_redis_port -a your_redis_password INFO replication
redis-cli -c -h your_redis_host -p your_redis_port -a your_redis_password INFO stats
Key metrics to watch:
- Memory Usage: used_memory, used_memory_peak, mem_fragmentation_ratio. High fragmentation or nearing maxmemory can lead to performance issues or evictions.
- Persistence: rdb_last_save_time, rdb_last_bgsave_status, aof_last_bgrewrite_status. Ensure persistence is happening successfully and not causing excessive load.
- Replication: master_repl_offset, slave_repl_offset, master_link_status. Monitor replication lag. In a cluster, this is critical for failover.
- Clients: connected_clients (reported by INFO clients). A sudden spike might indicate an issue with your application’s connection management.
- Keyspace: the keys and expires counts on the db0 line (reported by INFO keyspace). Monitor the number of keys and expiring keys.
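To act on these fields in a script, the INFO output has to be parsed first. The sketch below assumes the usual `name:value` lines terminated by carriage return plus newline; the helper names `redis_info_field` and `memory_used_percent` are hypothetical.

```shell
#!/bin/sh
# Print the value of one INFO field (first match) from text read on stdin.
redis_info_field() {
    tr -d '\r' | awk -F: -v key="$1" '$1 == key && !seen { print $2; seen = 1 }'
}

# Print used_memory as an integer percentage of maxmemory.
# Returns non-zero when maxmemory is 0 (no limit configured) or a field is missing.
memory_used_percent() {
    info=$(cat)
    used=$(printf '%s\n' "$info" | redis_info_field used_memory)
    max=$(printf '%s\n' "$info" | redis_info_field maxmemory)
    if [ -z "$used" ] || [ -z "$max" ] || [ "$max" -eq 0 ]; then
        return 1
    fi
    echo $(( used * 100 / max ))
}
```

Piping the output of `INFO memory` into `memory_used_percent` gives a single number that is easy to compare against a threshold of your choosing.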
Redis Cluster Specifics
For Redis Cluster, the CLUSTER INFO and CLUSTER NODES commands are essential.
redis-cli -c -h your_redis_host -p your_redis_port -a your_redis_password CLUSTER INFO
redis-cli -c -h your_redis_host -p your_redis_port -a your_redis_password CLUSTER NODES
CLUSTER INFO provides cluster-wide status, including cluster_state (should be ok), cluster_slots_assigned, cluster_slots_ok, cluster_slots_pfail, and cluster_slots_fail. The last two count hash slots that map to nodes flagged as possibly failing (PFAIL) or failed (FAIL), so any non-zero value means at least one node is in a problematic state.
CLUSTER NODES lists all nodes in the cluster, their roles (master/slave), status (connected/disconnected), and assigned slots. This is invaluable for diagnosing connectivity issues between nodes or identifying unresponsive masters/replicas.
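A small filter can reduce the CLUSTER NODES output to just the nodes that need attention. The sketch below assumes the documented column order (id, address, flags, master id, ping sent, pong received, config epoch, link state, slots); the function name `unhealthy_cluster_nodes` is hypothetical. It compares flags individually so that unrelated flags such as nofailover are not mistaken for a failure.

```shell
#!/bin/sh
# Read CLUSTER NODES output on stdin and print every node that carries the
# fail or fail? flag, or whose link state is "disconnected".
unhealthy_cluster_nodes() {
    tr -d '\r' | awk '
        NF >= 8 {
            bad = ($8 == "disconnected")
            n = split($3, flags, ",")
            for (i = 1; i <= n; i++)
                if (flags[i] == "fail" || flags[i] == "fail?") bad = 1
            if (bad) print $2 " flags=" $3 " link=" $8
        }'
}
```

An empty result means no node is flagged; any output line names the address, flags, and link state of a node worth investigating.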
Leveraging DigitalOcean Monitoring and Alerting
DigitalOcean’s built-in monitoring provides a good baseline for Droplet resource utilization (CPU, RAM, Disk I/O, Network). However, for application-specific and Redis cluster health, you’ll need to integrate external monitoring tools or use DigitalOcean’s features strategically.
Custom Metrics with Prometheus and Grafana
A powerful combination for advanced monitoring is Prometheus for time-series data collection and Grafana for visualization and alerting. You can deploy these on a separate Droplet or use DigitalOcean’s managed offerings if available.
Prometheus Exporters:
- Node Exporter: For system-level metrics (CPU, RAM, disk, network) on your application and Redis Droplets.
- PHP-FPM Exporter: Several community-developed exporters exist that can scrape the fpm-status page.
- Nginx Exporter: Scrapes the stub_status endpoint.
- Redis Exporter: A dedicated exporter that uses the INFO and CLUSTER commands to expose detailed Redis metrics.
You would configure Prometheus to scrape these exporters running on your application and Redis Droplets. Then, set up Grafana dashboards to visualize these metrics and configure alerting rules based on thresholds (e.g., high latency, low available memory, PHP-FPM worker saturation, Redis cluster node failures).
DigitalOcean Alerts
DigitalOcean Alerts can be configured for Droplet resource utilization. For application-level or Redis-specific alerts, you can leverage:
- External Monitoring Services: Integrate services like UptimeRobot, Pingdom, or Datadog that can poll your application’s health check endpoint (/health) and Redis endpoints.
- Custom Alerting Scripts: Write simple scripts (e.g., in Python or Bash) that run via cron on a separate monitoring Droplet. These scripts can query your application health endpoint, run redis-cli CLUSTER INFO, and send notifications (e.g., via Slack, PagerDuty) if issues are detected.
Example Bash script for basic Redis cluster health check:
#!/bin/bash
REDIS_HOST="your_redis_host"
REDIS_PORT="your_redis_port"
REDIS_PASSWORD="your_redis_password"

# Fetch cluster info; strip the carriage returns that end each line of the reply,
# otherwise the comparison with "ok" below never matches
CLUSTER_INFO=$(redis-cli -h "$REDIS_HOST" -p "$REDIS_PORT" -a "$REDIS_PASSWORD" CLUSTER INFO | tr -d '\r')

# Extract individual "name:value" fields
CLUSTER_STATE=$(echo "$CLUSTER_INFO" | awk -F':' '$1 == "cluster_state" {print $2}')
CLUSTER_PFAIL=$(echo "$CLUSTER_INFO" | awk -F':' '$1 == "cluster_slots_pfail" {print $2}')
CLUSTER_FAIL=$(echo "$CLUSTER_INFO" | awk -F':' '$1 == "cluster_slots_fail" {print $2}')

# An empty state (e.g., connection refused) is also treated as critical
if [ "$CLUSTER_STATE" != "ok" ]; then
    echo "CRITICAL: Redis cluster state is not OK (${CLUSTER_STATE:-no response})"
    # Send alert here (e.g., via curl to a webhook)
    exit 2
fi

# Check FAIL before PFAIL so the more severe condition is reported first
if [ "${CLUSTER_FAIL:-0}" -gt 0 ]; then
    echo "CRITICAL: Redis cluster has slots on nodes in FAIL state ($CLUSTER_FAIL)"
    # Send alert here
    exit 2
fi

if [ "${CLUSTER_PFAIL:-0}" -gt 0 ]; then
    echo "WARNING: Redis cluster has slots on nodes in PFAIL state ($CLUSTER_PFAIL)"
    # Send alert here
    exit 1
fi

echo "OK: Redis cluster is healthy."
exit 0
Schedule this script using cron (e.g., every 5 minutes):
*/5 * * * * /path/to/your/redis_cluster_check.sh >> /var/log/redis_cluster_check.log 2>&1
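The "Send alert here" placeholders in the script can be filled in with a webhook call. The sketch below assumes a Slack incoming webhook, which accepts a JSON body with a `text` field; the function names and the webhook URL argument are hypothetical, and the escaping covers only backslashes and double quotes, so keep alert messages on a single line.

```shell
#!/bin/sh
# Build a minimal JSON payload for a Slack incoming webhook.
# Only backslashes and double quotes are escaped (single-line messages assumed).
slack_payload() {
    escaped=$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')
    printf '{"text":"%s"}\n' "$escaped"
}

# $1 = webhook URL (placeholder supplied by you), $2 = message text
send_slack_alert() {
    slack_payload "$2" |
        curl -s -X POST -H 'Content-Type: application/json' --data @- "$1"
}
```

Inside the check script, a call such as `send_slack_alert "$SLACK_WEBHOOK_URL" "CRITICAL: Redis cluster state is not OK"` would go just before the corresponding `exit`, where `SLACK_WEBHOOK_URL` is a variable you define yourself.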
Log Aggregation and Analysis
Centralized logging is indispensable for debugging and identifying the root cause of issues. Collecting logs from your PHP application, web server, and Redis instances into a single, searchable location significantly speeds up troubleshooting.
Tools and Techniques
- Filebeat/Logstash/Elasticsearch/Kibana (ELK Stack): A powerful, albeit complex, solution for log aggregation, storage, and visualization. Filebeat can be installed on each Droplet to ship logs to Logstash or directly to Elasticsearch.
- Fluentd: Another popular log collector that can forward logs to various destinations, including Elasticsearch or cloud-based logging services.
- DigitalOcean Log Management: DigitalOcean offers managed Kubernetes and potentially other services that integrate with logging solutions. For Droplets, you might need to set up your own aggregation pipeline.
- Application Logging Frameworks: Ensure your PHP application uses a robust logging library (like Monolog) configured to output logs in a structured format (e.g., JSON) which makes parsing easier for log aggregation tools.
Configure your web server (Nginx/Apache) and PHP-FPM to log errors and access details. For Redis, ensure the log level is set appropriately (e.g., loglevel notice or warning) and that logs are being written to a file.
By combining application-level health checks, detailed performance metrics for PHP-FPM and your web server, comprehensive Redis cluster monitoring, and centralized logging, you can build a resilient and observable system on DigitalOcean.