Server Monitoring Best Practices: Keeping Your Laravel App and Elasticsearch Clusters Alive on Linode
Proactive Monitoring for Laravel & Elasticsearch on Linode
Maintaining high availability for a Laravel application and its supporting infrastructure, such as an Elasticsearch cluster, demands a robust and proactive monitoring strategy. This isn’t about reacting to outages; it’s about anticipating them. On Linode, this translates to combining system-level metrics, application-specific insights, and specialized cluster health checks.
System-Level Metrics with Node Exporter and Prometheus
The foundation of any monitoring stack is granular system metrics. For Linode instances running your Laravel application or Elasticsearch nodes, `node_exporter` is indispensable. It exposes a wealth of information about CPU, memory, disk I/O, network traffic, and more, in a Prometheus-compatible format.
Installation and Configuration:
On each Linode instance (both Laravel app servers and Elasticsearch nodes), download and run `node_exporter`:
wget https://github.com/prometheus/node_exporter/releases/download/v1.7.0/node_exporter-1.7.0.linux-amd64.tar.gz
tar xvfz node_exporter-1.7.0.linux-amd64.tar.gz
sudo mv node_exporter-1.7.0.linux-amd64/node_exporter /usr/local/bin/
sudo useradd --no-create-home --shell /bin/false node_exporter
sudo chown node_exporter:node_exporter /usr/local/bin/node_exporter
Create a systemd service file to manage `node_exporter`:
[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target

[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter

[Install]
WantedBy=multi-user.target
Enable and start the service:
sudo systemctl daemon-reload
sudo systemctl enable node_exporter
sudo systemctl start node_exporter
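Once the service is running, it helps to confirm that the exporter actually serves metrics before wiring up Prometheus. The sketch below is one way to do that: a small helper (the name `metric_value` is hypothetical) that reads Prometheus text exposition format on standard input and prints the value of one metric. It assumes samples carry no trailing timestamp, which is the case for `node_exporter` output.

```shell
# metric_value NAME
# Reads Prometheus text exposition format on stdin and prints the value of
# the first sample whose metric name is exactly NAME (labels are ignored).
# Exits non-zero when the metric is not present.
metric_value() {
  awk -v name="$1" '
    /^#/ { next }                      # skip HELP/TYPE comment lines
    {
      metric = $1
      sub(/\{.*/, "", metric)          # drop the {label="..."} part, if any
      if (metric == name) { print $NF; found = 1; exit }
    }
    END { exit (found ? 0 : 1) }
  '
}
```

On an instance, this could be used as `curl -s http://localhost:9100/metrics | metric_value node_load1`; an empty result with a non-zero exit status suggests the exporter is not reachable or the collector is disabled.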
Prometheus Configuration:
In your Prometheus configuration file (e.g., `/etc/prometheus/prometheus.yml`), add scrape jobs for your Linode instances. Prometheus pulls metrics, so this assumes your Prometheus server can reach each instance on port 9100 and that you’re using static configuration:
scrape_configs:
  - job_name: 'node_exporter'
    static_configs:
      # One entry per role so that each target carries the correct 'role' label
      - targets: ['LINODE_APP_IP_1:9100', 'LINODE_APP_IP_2:9100']
        labels:
          environment: 'production'
          role: 'app'
      - targets: ['LINODE_ES_NODE_1_IP:9100', 'LINODE_ES_NODE_2_IP:9100', 'LINODE_ES_NODE_3_IP:9100']
        labels:
          environment: 'production'
          role: 'elasticsearch'
Replace `LINODE_APP_IP_X` and `LINODE_ES_NODE_X_IP` with the actual IP addresses of your Linode instances. Restart Prometheus for the changes to take effect.
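When the list of instances grows, typing targets by hand becomes error-prone. As a rough sketch, the hypothetical helper `static_config` below prints one `static_configs` entry for a given role and list of IPs, so app servers and Elasticsearch nodes each get their own `role` label. The port 9100 and the `environment: 'production'` label are taken from the example above; everything else is an assumption for illustration.

```shell
# static_config ROLE IP...
# Prints one static_configs entry (YAML) for node_exporter targets that
# share the same role label.
static_config() {
  local role=$1 targets="" ip
  shift
  for ip in "$@"; do
    # Append "'IP:9100'", separated by ", " after the first element
    targets="${targets:+$targets, }'${ip}:9100'"
  done
  printf "  - targets: [%s]\n    labels:\n      environment: 'production'\n      role: '%s'\n" \
    "$targets" "$role"
}
```

For instance, `static_config app 10.0.0.1 10.0.0.2` followed by `static_config elasticsearch 10.0.0.3` yields two entries that can be pasted under `static_configs:`.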
Laravel Application Performance Monitoring (APM)
For Laravel applications, system metrics only tell part of the story. We need to monitor request latency, error rates, database query performance, and queue throughput. While commercial APM solutions exist, a cost-effective approach involves integrating Prometheus client libraries and custom exporters.
Instrumenting Laravel with Prometheus Client:
Add the `promphp/prometheus_client_php` library to your Laravel project:
composer require promphp/prometheus_client_php
Register a shared `CollectorRegistry` and a `/metrics` route in a service provider, and record request metrics in two middleware classes:
# app/Providers/AppServiceProvider.php
use Illuminate\Http\Response;
use Illuminate\Support\Facades\Route;
use Prometheus\CollectorRegistry;
use Prometheus\RenderTextFormat;
use Prometheus\Storage\APC;
// ... other imports
public function register()
{
    // Use shared storage: InMemory is reset on every PHP-FPM request, so
    // counters would never accumulate. APC requires the APCu extension;
    // Prometheus\Storage\Redis is an alternative.
    $this->app->singleton(CollectorRegistry::class, function ($app) {
        return new CollectorRegistry(new APC());
    });
}
public function boot()
{
    // Expose metrics endpoint
    Route::get('/metrics', function () {
        $registry = $this->app->make(CollectorRegistry::class);
        $renderer = new RenderTextFormat();
        $result = $renderer->render($registry->getMetricFamilySamples());
        return new Response($result, 200, ['Content-Type' => RenderTextFormat::MIME_TYPE]);
    });
}

# app/Http/Middleware/PrometheusRequestDuration.php
namespace App\Http\Middleware;
use Closure;
use Prometheus\CollectorRegistry;
class PrometheusRequestDuration
{
    private $registry;
    public function __construct(CollectorRegistry $registry)
    {
        $this->registry = $registry;
    }
    // Track request duration
    public function handle($request, Closure $next)
    {
        $start = microtime(true);
        $response = $next($request);
        $duration = microtime(true) - $start;
        $histogram = $this->registry->getOrRegisterHistogram(
            'laravel_app',
            'request_duration_seconds',
            'Duration of HTTP requests in seconds',
            ['method', 'route'], // label names
            [1, 5, 10, 20, 30, 60, 120, 240, 360, 720] // bucket upper bounds in seconds; tune to your latency profile
        );
        // Label values are positional and must match the label names above.
        $routeName = optional($request->route())->getName() ?? 'unknown';
        $histogram->observe($duration, [$request->method(), $routeName]);
        return $response;
    }
}

# app/Http/Middleware/PrometheusErrorCount.php
namespace App\Http\Middleware;
use Closure;
use Prometheus\CollectorRegistry;
class PrometheusErrorCount
{
    private $registry;
    public function __construct(CollectorRegistry $registry)
    {
        $this->registry = $registry;
    }
    // Track error counts (any response with status 400 or above)
    public function handle($request, Closure $next)
    {
        $response = $next($request);
        if ($response->getStatusCode() >= 400) {
            $counter = $this->registry->getOrRegisterCounter(
                'laravel_app',
                'http_errors_total',
                'Total HTTP errors by status code',
                ['status_code']
            );
            $counter->inc([(string) $response->getStatusCode()]);
        }
        return $response;
    }
}
Register the middleware in `app/Http/Kernel.php`:
protected $middlewareGroups = [
    'web' => [
        // ... other middleware
        \App\Http\Middleware\PrometheusRequestDuration::class,
        \App\Http\Middleware\PrometheusErrorCount::class,
    ],
    // ...
];
Now, your Laravel application will expose metrics at `/metrics` on port 80 (or your configured web server port). Add this to your Prometheus configuration:
scrape_configs:
  - job_name: 'laravel_app'
    static_configs:
      - targets: ['LINODE_APP_IP_1:80', 'LINODE_APP_IP_2:80']
        labels:
          environment: 'production'
          role: 'app'
Queue Monitoring:
For Laravel queues, monitor the number of pending jobs. You can create a custom Artisan command that exposes this metric:
# app/Console/Commands/QueueMetrics.php
namespace App\Console\Commands;
use Illuminate\Console\Command;
use Illuminate\Support\Facades\DB;
use Prometheus\CollectorRegistry;
class QueueMetrics extends Command
{
    protected $signature = 'metrics:queue';
    protected $description = 'Expose queue metrics';
    public function handle(CollectorRegistry $registry)
    {
        // Assumes the database queue driver, which stores pending jobs in the `jobs` table.
        // Note that config('queue.default') is the connection name, not the queue name.
        $queueName = config('queue.connections.database.queue', 'default'); // Or specific queue names
        $pendingJobs = DB::table('jobs')->where('queue', $queueName)->count();
        $gauge = $registry->getOrRegisterGauge(
            'laravel_queue',
            'pending_jobs',
            'Number of pending jobs in the queue',
            ['queue']
        );
        $gauge->set($pendingJobs, [$queueName]);
        // This command would typically be run by a cron job and output metrics
        // or be integrated into a dedicated metrics exporter service.
        // For simplicity, we'll just log it here.
        $this->info("Pending jobs for queue '{$queueName}': {$pendingJobs}");
    }
}
Schedule this command to run frequently (e.g., every minute) via cron, and have it output to a file that a separate exporter process reads and exposes via HTTP. Alternatively, integrate it into a long-running PHP process that serves metrics.
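One way to realize the file-based approach is the textfile collector built into `node_exporter`, which reads `*.prom` files from the directory given by its `--collector.textfile.directory` flag. The sketch below assumes that setup; the helper name `write_queue_metric` and the output file name are hypothetical. It writes the gauge to a temporary file first and then renames it, so the collector never reads a half-written file.

```shell
# write_queue_metric DIR QUEUE COUNT
# Writes laravel_queue_pending_jobs for one queue into DIR/laravel_queue.prom
# in Prometheus text exposition format.
write_queue_metric() {
  local dir=$1 queue=$2 count=$3 tmp
  # Temp file lives in the same directory so that mv is an atomic rename;
  # its name does not end in .prom, so the collector ignores it.
  tmp=$(mktemp "$dir/.laravel_queue.XXXXXX") || return 1
  {
    echo '# HELP laravel_queue_pending_jobs Number of pending jobs in the queue'
    echo '# TYPE laravel_queue_pending_jobs gauge'
    printf 'laravel_queue_pending_jobs{queue="%s"} %s\n' "$queue" "$count"
  } > "$tmp"
  chmod 644 "$tmp"                     # mktemp creates the file with mode 600
  mv "$tmp" "$dir/laravel_queue.prom"
}
```

A cron entry could call this every minute with the count obtained from the application, for example `write_queue_metric /var/lib/node_exporter/textfile default 12` (the directory is an assumption and must match the flag passed to `node_exporter`). The metric then arrives through the existing `node_exporter` scrape job, with no extra HTTP endpoint to maintain.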
Elasticsearch Cluster Health and Performance
Elasticsearch clusters require specialized monitoring. Key metrics include cluster health status (green, yellow, red), node status, indexing rate, search latency, JVM heap usage, and disk space. The `elasticsearch_exporter` maintained by the prometheus-community project is a widely used tool for this.
Setting up Elasticsearch Exporter:
Download and run the Elasticsearch Exporter on a separate node or one of your Elasticsearch nodes (if resource permits):
wget https://github.com/prometheus-community/elasticsearch_exporter/releases/download/v1.7.0/elasticsearch_exporter-1.7.0.linux-amd64.tar.gz
tar xvfz elasticsearch_exporter-1.7.0.linux-amd64.tar.gz
sudo mv elasticsearch_exporter-1.7.0.linux-amd64/elasticsearch_exporter /usr/local/bin/
sudo useradd --no-create-home --shell /bin/false elasticsearch_exporter
sudo chown elasticsearch_exporter:elasticsearch_exporter /usr/local/bin/elasticsearch_exporter
Create a systemd service file:
[Unit]
Description=Elasticsearch Exporter
Wants=network-online.target
After=network-online.target

[Service]
User=elasticsearch_exporter
Group=elasticsearch_exporter
Type=simple
# --es.all collects stats for every node in the cluster; --es.indices adds per-index stats
ExecStart=/usr/local/bin/elasticsearch_exporter \
  --es.uri=http://LINODE_ES_NODE_1_IP:9200 \
  --es.all \
  --es.indices \
  --es.timeout=5m \
  --web.listen-address=":9114"

[Install]
WantedBy=multi-user.target
Replace `LINODE_ES_NODE_1_IP` with the IP of one of your Elasticsearch nodes. If authentication is enabled, supply credentials either in the `--es.uri` value (`http://user:password@host:9200`) or through the `ES_USERNAME` and `ES_PASSWORD` environment variables. Enable and start the service:
sudo systemctl daemon-reload
sudo systemctl enable elasticsearch_exporter
sudo systemctl start elasticsearch_exporter
Add this to your Prometheus configuration:
scrape_configs:
  - job_name: 'elasticsearch'
    static_configs:
      # List only hosts that actually run elasticsearch_exporter; a single
      # exporter reports on the whole cluster.
      - targets: ['LINODE_ES_NODE_1_IP:9114']
        labels:
          environment: 'production'
          role: 'elasticsearch'
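The exporter reports cluster health as one `elasticsearch_cluster_health_status` sample per color, where the active color has the value 1. As a quick manual check before relying on Prometheus, the hypothetical helper `cluster_color` below reads the exporter's output on standard input and prints the active color. It is a sketch that assumes a single cluster in the output.

```shell
# cluster_color
# Reads elasticsearch_exporter output on stdin and prints the color
# (green, yellow or red) whose health-status sample has the value 1.
# Exits non-zero when no such sample is found.
cluster_color() {
  awk '
    /^elasticsearch_cluster_health_status\{/ && $NF == 1 {
      if (match($0, /color="[a-z]+"/)) {
        # Skip the 7 characters of color=" and drop the closing quote
        print substr($0, RSTART + 7, RLENGTH - 8)
        found = 1
        exit
      }
    }
    END { exit (found ? 0 : 1) }
  '
}
```

On the exporter host this could be run as `curl -s http://localhost:9114/metrics | cluster_color`.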
Alerting with Alertmanager
Metrics are only useful if they trigger alerts when thresholds are breached. Prometheus integrates with Alertmanager for deduplication, grouping, and routing of alerts to various notification channels (Slack, PagerDuty, email).
Example Alerting Rules (in Prometheus rules file, e.g., `/etc/prometheus/rules.yml`):
groups:
  - name: laravel_alerts
    rules:
      - alert: HighHttpRequestErrorRate
        # Errors per status code as a percentage of all requests; the request
        # histogram has no status_code label, so the total is used as a scalar.
        expr: sum by (status_code) (rate(laravel_app_http_errors_total{environment="production"}[5m])) / scalar(sum(rate(laravel_app_request_duration_seconds_count{environment="production"}[5m]))) * 100 > 5
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: 'High HTTP error rate detected ({{ $value | printf "%.2f" }}%) for status code {{ $labels.status_code }}'
          description: "The error rate for status code {{ $labels.status_code }} has exceeded 5% for the last 10 minutes."
      - alert: HighPendingQueueJobs
        expr: laravel_queue_pending_jobs{environment="production"} > 100
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High number of pending queue jobs"
          description: "There are currently {{ $value }} pending jobs in the '{{ $labels.queue }}' queue."
  - name: elasticsearch_alerts
    rules:
      - alert: ElasticsearchClusterRed
        # The exporter emits one series per color; the active color has value 1
        expr: elasticsearch_cluster_health_status{color="red", environment="production"} == 1
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Elasticsearch cluster is RED"
          description: "The Elasticsearch cluster is in a RED state. At least one primary shard is unassigned, so some data is unavailable and may be lost."
      - alert: HighElasticsearchJVMLoad
        expr: elasticsearch_jvm_memory_used_bytes{area="heap", environment="production"} / elasticsearch_jvm_memory_max_bytes{area="heap", environment="production"} * 100 > 85
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "High JVM heap usage on Elasticsearch node"
          description: 'Node {{ $labels.instance }} has {{ $value | printf "%.2f" }}% JVM heap usage, exceeding 85%.'
      - alert: LowDiskSpaceElasticsearch
        expr: node_filesystem_avail_bytes{mountpoint="/", environment="production"} / node_filesystem_size_bytes{mountpoint="/", environment="production"} * 100 < 10
        for: 30m
        labels:
          severity: critical
        annotations:
          summary: "Low disk space on Elasticsearch node"
          description: "Node {{ $labels.instance }} has less than 10% free disk space on {{ $labels.mountpoint }}."
Configure Alertmanager to receive these alerts and route them appropriately. Ensure your Linode firewall rules allow communication between Prometheus, Alertmanager, and your target instances.
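To make the 5% threshold of the `HighHttpRequestErrorRate` rule concrete, the following sketch reproduces its arithmetic outside Prometheus: errors with a given status code divided by all requests in the same window, times 100. The helper name `error_rate_pct` and the sample numbers are hypothetical.

```shell
# error_rate_pct ERRORS TOTAL
# Prints ERRORS as a percentage of TOTAL with two decimals, or NaN when
# TOTAL is zero (no requests in the window, so no rate can be computed).
error_rate_pct() {
  awk -v e="$1" -v t="$2" 'BEGIN {
    if (t <= 0) { print "NaN"; exit }
    printf "%.2f\n", e / t * 100
  }'
}
```

With 12 responses of one status code out of 200 requests, `error_rate_pct 12 200` gives 6.00, which is above the threshold; the alert still fires only if that condition holds for the full `for: 10m` period.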
Log Aggregation and Analysis
While metrics provide a quantitative view, logs offer qualitative insights into application behavior and errors. Centralizing logs from your Laravel applications and Elasticsearch nodes is crucial for debugging and historical analysis. A common stack is ELK (Elasticsearch, Logstash, Kibana) or the EFK variant (Elasticsearch, Fluentd, Kibana), which swaps Logstash for Fluentd as the log collector.
Fluentd for Log Collection:
Install Fluentd on your Laravel app servers and Elasticsearch nodes to collect logs and forward them to your central Elasticsearch cluster.
# Example for Ubuntu/Debian
curl -L https://td-assets.fluentd.org/assets/scripts/install-deb.sh | sudo bash
sudo apt-get update
sudo apt-get install -y fluentd fluentd-plugins-core
Configure Fluentd to tail log files (e.g., `/var/log/nginx/error.log`, Laravel logs, Elasticsearch logs) and output to Elasticsearch:
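# /etc/td-agent/td-agent.conf
<source>
  @type tail
  path /var/log/nginx/error.log
  pos_file /var/log/td-agent/nginx-error.log.pos
  tag nginx.error
  <parse>
    # The built-in nginx parser targets the access log format, so the
    # error log is parsed with a regular expression instead.
    @type regexp
    expression /^(?<time>\d{4}\/\d{2}\/\d{2} \d{2}:\d{2}:\d{2}) \[(?<level>\w+)\] (?<message>.*)$/
    time_format %Y/%m/%d %H:%M:%S
  </parse>
</source>
<source>
  @type tail
  path /var/www/html/your-laravel-app/storage/logs/laravel.log
  pos_file /var/log/td-agent/laravel.log.pos
  tag laravel.app
  <parse>
    # multi_format is provided by the fluent-plugin-multi-format-parser plugin
    @type multi_format
    <pattern>
      format json
    </pattern>
    <pattern>
      # Default Laravel line format: [2024-05-01 12:34:56] production.ERROR: message
      format regexp
      expression /^\[(?<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]\s+(?<env>\w+)\.(?<level>\w+):\s+(?<message>.*)$/
      time_format %Y-%m-%d %H:%M:%S
    </pattern>
  </parse>
</source>
<match **>
  @type elasticsearch
  host YOUR_ELASTICSEARCH_HOST # e.g., LINODE_ES_NODE_1_IP
  port 9200
  logstash_format true
  logstash_prefix logstash
  include_tag_key true
  tag_key log_tag
  flush_interval 5s
</match>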
Restart Fluentd and ensure your Elasticsearch cluster is accessible from your Linode instances.
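Before deploying a parsing pattern to every server, it can be worth checking it against a sample line locally. The sketch below does that for Laravel's default log line layout, `[timestamp] environment.LEVEL: message`, using `sed` with an extended regular expression. The helper name `parse_laravel_line` and the `key=value` output format are hypothetical and only serve to show which parts of the line are captured.

```shell
# parse_laravel_line
# Reads log lines on stdin and, for each line in Laravel's default format,
# prints the captured fields. Lines that do not match (for example stack
# trace continuation lines) produce no output.
parse_laravel_line() {
  sed -nE 's/^\[([0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2})\] ([A-Za-z0-9_]+)\.([A-Z]+): (.*)$/time=\1 env=\2 level=\3 message=\4/p'
}
```

Running `tail -n 20 storage/logs/laravel.log | parse_laravel_line` shows which lines the pattern recognizes; stack trace lines are dropped here, so a multiline-aware setup would be needed to keep them attached to their log entry.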
Conclusion
A comprehensive monitoring strategy for Laravel and Elasticsearch on Linode involves layering system metrics, application-specific instrumentation, cluster health checks, and centralized logging. By implementing these practices, you move from reactive firefighting to proactive system management, ensuring the stability and performance of your critical infrastructure.