Server Monitoring Best Practices: Keeping Your PHP App and DynamoDB Clusters Alive on Linode
Establishing a Robust Monitoring Foundation with Prometheus and Grafana
For any production PHP application, especially one backed by a distributed NoSQL database such as Amazon DynamoDB (or a DynamoDB-compatible store self-hosted on Linode), a comprehensive monitoring strategy is non-negotiable. We’ll focus on a Prometheus and Grafana stack, a de facto standard for metrics collection and visualization in modern infrastructure. This setup lets us collect granular data from the PHP application, the underlying Linode servers, and the DynamoDB tables.
Instrumenting Your PHP Application with Prometheus Client Libraries
The first step is to expose application-level metrics. This involves integrating a Prometheus client library into your PHP application. We’ll use the popular promphp/prometheus_client_php library. This library allows you to define custom metrics (counters, gauges, histograms, summaries) and expose them via an HTTP endpoint that Prometheus can scrape.
Installation via Composer
Install the library with Composer, which also adds it to your project’s composer.json:
composer require promphp/prometheus_client_php
Defining and Exposing Metrics
Create a dedicated endpoint in your PHP application (e.g., /metrics.php) to expose these metrics. Here’s a simplified example:
<?php
require 'vendor/autoload.php';

use Prometheus\CollectorRegistry;
use Prometheus\RenderTextFormat;
use Prometheus\Storage\APC;

// Initialize the registry. Each PHP request starts with empty memory, so InMemory
// storage would forget every sample between requests. APC storage (requires the
// APCu extension) or Redis storage keeps samples across requests.
$registry = new CollectorRegistry(new APC());

// Define a counter for incoming requests (arguments: namespace, name, help, labels).
// It is exposed as myapp_requests_total.
$counter = $registry->getOrRegisterCounter(
    'myapp',
    'requests_total',
    'Total number of requests received by the application',
    ['method', 'endpoint']
);

// Define a histogram for request durations, exposed as myapp_request_duration_seconds
$histogram = $registry->getOrRegisterHistogram(
    'myapp',
    'request_duration_seconds',
    'Duration of HTTP requests in seconds',
    ['method', 'endpoint']
);

// --- In your application's request handling logic ---
// When a request comes in:
$method = $_SERVER['REQUEST_METHOD'];
// Strip the query string; a route name is better still, to keep label cardinality low
$endpoint = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH) ?: '/';
$startTime = microtime(true);

// ... your application logic ...

$duration = microtime(true) - $startTime;

// Increment the counter (label values in the order they were declared)
$counter->inc([$method, $endpoint]);

// Observe the duration in the histogram
$histogram->observe($duration, [$method, $endpoint]);
// --- End of application logic example ---

// Expose metrics for Prometheus to scrape
header('Content-Type: ' . RenderTextFormat::MIME_TYPE);
$renderer = new RenderTextFormat();
echo $renderer->render($registry->getMetricFamilySamples());
?>
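One caveat with labelling requests by URI: every distinct URL becomes its own time series, so IDs embedded in paths can inflate the number of series quickly. The following is a minimal sketch of one way to bound that, assuming purely numeric path segments are record IDs. The function name normalizeEndpoint and the :id placeholder are hypothetical and not part of the client library.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: reduce a request URI to a low-cardinality label value.
 * Assumption: purely numeric path segments are record IDs.
 */
function normalizeEndpoint(string $requestUri): string
{
    // Drop the query string; parse_url() yields null or false without a usable path.
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path) || $path === '') {
        return '/';
    }

    // Replace numeric segments such as /users/42 with a fixed placeholder.
    $normalized = preg_replace('#/\d+(?=/|$)#', '/:id', $path);

    // preg_replace() returns null on error; fall back to the unmodified path.
    return $normalized ?? $path;
}
```

In the request handler you could then pass normalizeEndpoint($_SERVER['REQUEST_URI']) as the endpoint label value. If your framework exposes a route name, that is usually a better label than any rewritten path.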
Server-Level Metrics with Node Exporter
To monitor the health of your Linode instances (CPU, memory, disk I/O, network), we’ll deploy the Prometheus Node Exporter. This is a standard agent that collects hardware and OS metrics.
Installation and Configuration on Linode
Download the latest release for your server’s architecture (e.g., AMD64) from the Prometheus download page. For example, on a Debian/Ubuntu system:
wget https://github.com/prometheus/node_exporter/releases/download/v1.7.0/node_exporter-1.7.0.linux-amd64.tar.gz
tar xvfz node_exporter-1.7.0.linux-amd64.tar.gz
cd node_exporter-1.7.0.linux-amd64
sudo mv node_exporter /usr/local/bin/
Create a systemd service to manage the Node Exporter, for example in /etc/systemd/system/node_exporter.service:
[Unit]
Description=Prometheus Node Exporter
Wants=network-online.target
After=network-online.target

[Service]
User=nobody
Group=nogroup
Type=simple
ExecStart=/usr/local/bin/node_exporter

[Install]
WantedBy=multi-user.target
Enable and start the service:
sudo systemctl daemon-reload
sudo systemctl enable node_exporter
sudo systemctl start node_exporter
Verify it’s running and accessible on port 9100:
curl http://localhost:9100/metrics
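The response is plain text in the Prometheus exposition format: comment lines starting with #, then one sample per line. As a rough sketch of how that format is laid out, the hypothetical helper below (parseSamples is not part of any library) turns such output into a PHP array. It assumes samples carry no trailing timestamp and does not handle special values such as NaN or +Inf.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: parse Prometheus text exposition output into
 * an array of [series => value].
 * Assumptions: no trailing timestamps, no NaN or +Inf values.
 */
function parseSamples(string $text): array
{
    $samples = [];
    foreach (preg_split('/\R/', $text) as $line) {
        $line = trim($line);

        // Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if ($line === '' || $line[0] === '#') {
            continue;
        }

        // The sample value is everything after the last space on the line.
        $pos = strrpos($line, ' ');
        if ($pos === false) {
            continue;
        }
        $samples[substr($line, 0, $pos)] = (float) substr($line, $pos + 1);
    }

    return $samples;
}
```

A script could fetch http://localhost:9100/metrics, pass the body to parseSamples(), and check that a series such as node_load1 is present.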
Monitoring DynamoDB with AWS SDK and Custom Exporters
Monitoring DynamoDB requires a different approach. Since we’re not running DynamoDB directly on Linode (assuming AWS DynamoDB or a compatible managed service), we’ll use the AWS SDK for PHP to fetch metrics from CloudWatch and expose them to Prometheus. Alternatively, for a self-hosted DynamoDB-compatible database on Linode, you’d rely on that database’s own exporter or metrics endpoint (e.g., the built-in Prometheus metrics of ScyllaDB if you use its DynamoDB-compatible Alternator API). For this example, we’ll focus on AWS DynamoDB.
Fetching DynamoDB Metrics via AWS SDK for PHP
You can write a PHP script that periodically queries CloudWatch for DynamoDB metrics and then exposes these metrics in a Prometheus-compatible format. This script would run on one of your Linode servers.
<?php
require 'vendor/autoload.php'; // Assuming you have AWS SDK and Prometheus client installed

use Aws\CloudWatch\CloudWatchClient;
use Aws\Exception\AwsException;
use Prometheus\CollectorRegistry;
use Prometheus\RenderTextFormat;
use Prometheus\Storage\InMemory;

// --- Configuration ---
$awsRegion = 'us-east-1'; // Your AWS region
$tableName = 'YourDynamoDBTableName'; // Your DynamoDB table name
// ThrottledRequests and SuccessfulRequestLatency are published per operation,
// so list the operations you want to track.
$operations = ['GetItem', 'PutItem', 'Query'];

$registry = new CollectorRegistry(new InMemory());

// --- Initialize AWS SDK Client ---
$cloudWatchClient = new CloudWatchClient([
    'region' => $awsRegion,
    'version' => 'latest',
    // Add credentials here if not using IAM roles or environment variables
    // 'credentials' => [
    //     'key' => 'YOUR_ACCESS_KEY_ID',
    //     'secret' => 'YOUR_SECRET_ACCESS_KEY',
    // ],
]);

// --- Define Prometheus Metrics (arguments: namespace, name, help, labels) ---
// Note: with InMemory storage every scrape starts from zero, so these counters report
// the sum over the last 5 minutes, not a lifetime total. Use persistent storage or
// gauges if you need values that are safe to pass to rate().
$readCapacityUnitsConsumed = $registry->registerCounter(
    'dynamodb',
    'read_capacity_units_consumed_total',
    'Total read capacity units consumed for the table.',
    ['table_name']
);
$writeCapacityUnitsConsumed = $registry->registerCounter(
    'dynamodb',
    'write_capacity_units_consumed_total',
    'Total write capacity units consumed for the table.',
    ['table_name']
);
$throttledRequests = $registry->registerCounter(
    'dynamodb',
    'throttled_requests_total',
    'Number of throttled requests to the table.',
    ['table_name', 'operation']
);
// CloudWatch only provides aggregated latency, so a gauge fits better than a histogram.
$successfulRequestLatency = $registry->registerGauge(
    'dynamodb',
    'successful_request_latency_seconds',
    'Average latency of successful requests to the table.',
    ['table_name', 'operation']
);

// Fetch the last 5 minutes of one DynamoDB metric at 1-minute granularity.
function fetchDatapoints(CloudWatchClient $client, string $metricName, array $dimensions, string $statistic): array
{
    $result = $client->getMetricStatistics([
        'Namespace' => 'AWS/DynamoDB',
        'MetricName' => $metricName,
        'Dimensions' => $dimensions,
        'StartTime' => '-5 minutes',
        'EndTime' => 'now',
        'Period' => 60,
        'Statistics' => [$statistic],
    ]);

    return $result['Datapoints'] ?? [];
}

// --- Fetch Metrics from CloudWatch ---
try {
    $tableDimension = [['Name' => 'TableName', 'Value' => $tableName]];

    foreach (fetchDatapoints($cloudWatchClient, 'ConsumedReadCapacityUnits', $tableDimension, 'Sum') as $datapoint) {
        // incBy(value, labels): the label values go in the second argument
        $readCapacityUnitsConsumed->incBy($datapoint['Sum'], [$tableName]);
    }
    foreach (fetchDatapoints($cloudWatchClient, 'ConsumedWriteCapacityUnits', $tableDimension, 'Sum') as $datapoint) {
        $writeCapacityUnitsConsumed->incBy($datapoint['Sum'], [$tableName]);
    }

    foreach ($operations as $operation) {
        $dimensions = array_merge($tableDimension, [['Name' => 'Operation', 'Value' => $operation]]);

        foreach (fetchDatapoints($cloudWatchClient, 'ThrottledRequests', $dimensions, 'Sum') as $datapoint) {
            $throttledRequests->incBy($datapoint['Sum'], [$tableName, $operation]);
        }

        $latency = fetchDatapoints($cloudWatchClient, 'SuccessfulRequestLatency', $dimensions, 'Average');
        if ($latency !== []) {
            // Datapoints are not returned in chronological order; keep the most recent one.
            usort($latency, fn ($a, $b) => $a['Timestamp'] <=> $b['Timestamp']);
            // CloudWatch reports this metric in milliseconds; convert to seconds.
            $successfulRequestLatency->set(end($latency)['Average'] / 1000, [$tableName, $operation]);
        }
    }
} catch (AwsException $e) {
    // Log the error appropriately
    error_log("Error fetching DynamoDB metrics: " . $e->getMessage());
}

// --- Expose Metrics ---
header('Content-Type: ' . RenderTextFormat::MIME_TYPE);
$renderer = new RenderTextFormat();
echo $renderer->render($registry->getMetricFamilySamples());
?>
You would then configure Prometheus to scrape this endpoint. For this script to work, ensure your Linode server has the AWS SDK for PHP installed and credentials configured for an IAM identity that is allowed to call cloudwatch:GetMetricStatistics (e.g., via IAM roles if running on EC2, or via environment variables or a shared credentials file if on Linode).
Configuring Prometheus for Scraping
Edit your Prometheus configuration file (typically /etc/prometheus/prometheus.yml) to include scrape targets for your PHP application, Node Exporter, and the DynamoDB metrics exporter.
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.

scrape_configs:
  # Scrape Prometheus itself
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  # Scrape Node Exporter on your Linode servers
  - job_name: 'node_exporter'
    static_configs:
      - targets: ['linode-server-1:9100', 'linode-server-2:9100'] # Replace with your server IPs/hostnames

  # Scrape your PHP application's metrics endpoint
  - job_name: 'php_app'
    metrics_path: /metrics.php # The endpoint you created
    static_configs:
      - targets: ['linode-server-1:80', 'linode-server-2:80'] # Assuming your app is on port 80

  # Scrape the custom DynamoDB metrics exporter
  - job_name: 'dynamodb_exporter'
    metrics_path: /dynamodb_metrics.php # The endpoint for your DynamoDB exporter script
    scrape_interval: 60s # CloudWatch data has 1-minute granularity; fewer scrapes also mean fewer API calls
    static_configs:
      - targets: ['linode-server-1:80'] # Assuming the exporter script runs on the same server and port as the app
After updating the configuration, reload Prometheus:
sudo systemctl reload prometheus
Visualizing with Grafana and Setting Up Alerts
Grafana is essential for visualizing the collected metrics. Install Grafana on a separate server or one of your Linode instances. Configure Prometheus as a data source in Grafana.
Creating Dashboards
Import pre-built dashboards for Node Exporter (e.g., Dashboard ID 1860) and create custom dashboards for your PHP application metrics. For DynamoDB, you’ll likely need to build custom panels to visualize the metrics fetched via the AWS SDK.
Key Metrics to Monitor
- PHP Application: Request rate, error rate (HTTP 5xx), request latency (average, p95, p99), memory usage, CPU usage of PHP-FPM processes.
- Linode Servers: CPU utilization, memory usage, disk I/O (read/write operations, latency), network traffic (in/out), disk space.
- DynamoDB: Consumed Read/Write Capacity Units (vs. provisioned), throttled requests, successful request latency, item count, table size.
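To make the "consumed vs. provisioned" comparison for DynamoDB concrete: CloudWatch reports consumed capacity as a Sum over each period, while provisioned capacity is expressed in units per second. The sketch below shows the arithmetic under those assumptions. The function name capacityUtilization is hypothetical, and the calculation only applies to tables in provisioned capacity mode.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: fraction of provisioned capacity used during one period.
 *
 * @param float $consumedSum      CloudWatch "Sum" of consumed capacity units for the period
 * @param int   $periodSeconds    Length of the CloudWatch period in seconds
 * @param float $provisionedUnits Provisioned capacity units (per second) for the table
 */
function capacityUtilization(float $consumedSum, int $periodSeconds, float $provisionedUnits): float
{
    if ($periodSeconds <= 0 || $provisionedUnits <= 0) {
        throw new InvalidArgumentException('Period and provisioned units must be positive.');
    }

    // Sum divided by the period length gives average units consumed per second.
    $averagePerSecond = $consumedSum / $periodSeconds;

    return $averagePerSecond / $provisionedUnits;
}
```

For example, a Sum of 4500 read units over a 60-second period is 75 units per second, which is 75% of a table provisioned with 100 read capacity units. A dashboard panel or alert threshold at around 80% gives some warning before throttling starts.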
Alerting with Alertmanager
Alerting rules are defined in Prometheus; Alertmanager then deduplicates, groups, and routes the resulting notifications. For example, you might want alerts for:
- High CPU/memory usage on Linode servers.
- Sustained high read/write latency for DynamoDB.
- A significant increase in throttled DynamoDB requests.
- A spike in PHP application errors (5xx responses).
- Low disk space on Linode instances.
Example Prometheus alerting rules:
groups:
  - name: php_app_alerts
    rules:
      - alert: HighRequestLatency
        expr: histogram_quantile(0.95, sum(rate(myapp_request_duration_seconds_bucket[5m])) by (le, endpoint)) > 2
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High 95th percentile request latency for {{ $labels.endpoint }}"
          description: "The 95th percentile request latency for endpoint {{ $labels.endpoint }} has been above 2 seconds for 5 minutes."
      # Assumes the app also exports http_requests_total with a "code" label;
      # the instrumentation example earlier in this article does not define it.
      - alert: HighErrorRate
        expr: sum(rate(http_requests_total{code=~"5..", job="php_app"}[5m])) / sum(rate(http_requests_total{job="php_app"}[5m])) * 100 > 5
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High HTTP 5xx error rate for PHP app"
          description: "The error rate for the PHP application is above 5% for 5 minutes."
  - name: dynamodb_alerts
    rules:
      # rate() assumes dynamodb_throttled_requests_total is a monotonically increasing counter.
      - alert: DynamoDBThrottling
        expr: sum(rate(dynamodb_throttled_requests_total[5m])) by (table_name) > 100
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High DynamoDB throttling for table {{ $labels.table_name }}"
          description: "DynamoDB table {{ $labels.table_name }} is experiencing high throttling rates."
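The HighRequestLatency rule relies on histogram_quantile(), which estimates a percentile from cumulative bucket counts rather than from raw observations. The sketch below illustrates the linear interpolation involved, as a simplified model and not Prometheus’s actual implementation. The function name estimateQuantile and the bucket values in the usage note are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch of the interpolation behind histogram_quantile().
 *
 * @param float $q       Quantile between 0 and 1
 * @param array $buckets List of [upperBound, cumulativeCount] pairs in ascending
 *                       order of upper bound, ending with an INF bucket
 */
function estimateQuantile(float $q, array $buckets): float
{
    $total = end($buckets)[1];
    if ($total <= 0) {
        return NAN; // No observations recorded.
    }

    $rank = $q * $total;  // Position of the requested quantile among all observations
    $lowerBound = 0.0;    // The first bucket is assumed to start at 0
    $lowerCount = 0.0;

    foreach ($buckets as [$upperBound, $count]) {
        if ($count >= $rank) {
            if (is_infinite($upperBound)) {
                // The quantile lies beyond the highest finite bound; report that bound.
                return $lowerBound;
            }
            if ($count == $lowerCount) {
                return $upperBound; // Avoid dividing by zero when the rank is 0.
            }
            // Assume observations are spread evenly inside the bucket.
            return $lowerBound
                + ($upperBound - $lowerBound) * ($rank - $lowerCount) / ($count - $lowerCount);
        }
        $lowerBound = (float) $upperBound;
        $lowerCount = (float) $count;
    }

    return $lowerBound;
}
```

With buckets [[0.5, 60], [1.0, 80], [2.5, 95], [5.0, 99], [INF, 100]], the 0.9 quantile falls in the 1.0 to 2.5 second bucket and is estimated at 2.0 seconds. The estimate can never be more precise than the bucket boundaries, so choose histogram buckets that bracket the alert threshold (2 seconds in the rule above).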
This comprehensive setup provides deep visibility into your PHP application’s performance, the health of your Linode infrastructure, and the behavior of your DynamoDB cluster, enabling proactive issue detection and resolution.