Server Monitoring Best Practices: Keeping Your PHP App and DynamoDB Clusters Alive on OVH

Proactive PHP Application Health Checks with OVH and DynamoDB

Maintaining the stability and performance of a PHP application, especially one relying on a distributed NoSQL database like AWS DynamoDB, requires a multi-layered monitoring strategy. This isn’t about basic uptime checks; it’s about deep introspection into application behavior, resource utilization, and database interaction patterns. We’ll focus on actionable metrics and configurations relevant to an OVH hosting environment, integrating with DynamoDB.

Monitoring PHP-FPM and Web Server Performance

PHP-FPM (FastCGI Process Manager) is the backbone of most modern PHP deployments. Its performance directly impacts application responsiveness. We’ll leverage its built-in status page and integrate it with a robust monitoring tool like Prometheus, often deployed on an OVH instance.

Enabling PHP-FPM Status Page

First, ensure your PHP-FPM configuration allows access to the status page. This is typically done in the PHP-FPM pool configuration file (e.g., `/etc/php/8.1/fpm/pool.d/www.conf`).

; /etc/php/8.1/fpm/pool.d/www.conf

; Enable the pool status page
pm.status_path = /fpm-status
; PHP-FPM has no allow/deny list specific to the status page.
; Restrict who can reach it in one of two ways:
;   1. Limit which FastCGI clients may connect at all (TCP listeners only):
; listen.allowed_clients = 127.0.0.1
;   2. Add allow/deny rules to the web server location that forwards /fpm-status.
; In a production setup, permit only your monitoring server's IP.

After modifying the configuration, reload PHP-FPM:

sudo systemctl reload php8.1-fpm
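Once the pool has reloaded, the status page can also be requested as JSON by appending ?json to the path. The sketch below reduces that payload to a few saturation indicators. The helper name summarizeFpmStatus and the URL in the usage comment are assumptions for illustration; the array keys follow the field names PHP-FPM uses in its JSON status output.

```php
<?php
declare(strict_types=1);

/**
 * Reduce a decoded PHP-FPM status payload to a few saturation indicators.
 * Missing keys are treated as zero so a partial payload does not raise notices.
 */
function summarizeFpmStatus(array $status): array
{
    $active = (int) ($status['active processes'] ?? 0);
    $total  = (int) ($status['total processes'] ?? 0);

    return [
        'active'               => $active,
        'idle'                 => (int) ($status['idle processes'] ?? 0),
        'listen_queue'         => (int) ($status['listen queue'] ?? 0),
        'max_children_reached' => (int) ($status['max children reached'] ?? 0),
        // Share of workers currently busy; 1.0 means every worker is occupied.
        'busy_ratio'           => $total > 0 ? $active / $total : 0.0,
    ];
}

// Usage sketch (the URL is an assumption; it requires a web server location
// that forwards /fpm-status to the pool):
// $json = file_get_contents('http://127.0.0.1/fpm-status?json');
// print_r(summarizeFpmStatus(json_decode($json, true)));
```

A listen queue above zero or a growing max_children_reached value usually means pm.max_children is too low for the current load.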

Exposing PHP-FPM Metrics to Prometheus

We need a way to scrape these metrics. The hipages php-fpm_exporter is a common choice. It speaks FastCGI to PHP-FPM directly, so the status page does not have to be routed through the web server. It can be run as a separate service or directly on the web server.

Install the exporter (example using Docker, assuming the pool listens on TCP 127.0.0.1:9000 rather than a Unix socket):

docker run -d \
  --name php-fpm-exporter \
  --network host \
  -e PHP_FPM_SCRAPE_URI="tcp://127.0.0.1:9000/fpm-status" \
  hipages/php-fpm_exporter:latest

Note: Adjust PHP_FPM_SCRAPE_URI to match your pool's listen address and pm.status_path. With --network host the container can reach 127.0.0.1:9000 on the host, and the exporter is available on its default port 9253. If PHP-FPM is running on a different host, point the URI at that host and make sure listen.allowed_clients and your firewall permit the exporter's IP.

Prometheus Configuration for PHP-FPM

Add a scrape job to your Prometheus configuration file (e.g., /etc/prometheus/prometheus.yml) to collect metrics from the exporter.

scrape_configs:
  - job_name: 'php-fpm'
    static_configs:
      - targets: ['localhost:9253'] # Or the IP/port of your exporter if running elsewhere
        labels:
          instance: 'your-php-app-instance-name'

Reload Prometheus to apply the changes.

sudo systemctl reload prometheus

Monitoring Nginx/Apache Web Server

Your web server is the first line of defense. Monitoring its active connections, request rates, error logs, and latency is crucial.

Nginx Stub Status Module

Enable the stub_status module in your Nginx configuration.

# /etc/nginx/sites-available/your-app
server {
    listen 80;
    server_name your-domain.com;

    # ... other configurations ...

    location /nginx_status {
        stub_status on;
        access_log off;
        allow 127.0.0.1; # Restrict access to your monitoring server
        deny all;
    }

    # ... other configurations ...
}

Test Nginx configuration and reload:

sudo nginx -t
sudo systemctl reload nginx

Prometheus Configuration for Nginx

Use the nginx-exporter to scrape these metrics. It can be run as a separate service.

scrape_configs:
  - job_name: 'nginx'
    static_configs:
      - targets: ['localhost:9113'] # Default port for nginx-exporter
        labels:
          instance: 'your-web-server-instance-name'

Ensure your nginx-exporter points at the stub status URL defined above, for example by starting it with --nginx.scrape-uri=http://127.0.0.1/nginx_status.
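To see what the exporter actually reads, it can help to parse the stub status text yourself. The following is a minimal sketch, not part of any exporter; the function name parseStubStatus and the derived dropped field are illustrative assumptions. It expects the four-line plain-text format that stub_status returns.

```php
<?php
declare(strict_types=1);

/**
 * Parse the plain-text body returned by Nginx stub_status.
 * Returns null when the body does not match the expected layout.
 */
function parseStubStatus(string $body): ?array
{
    $pattern = '/Active connections:\s+(\d+)\s+'
        . 'server accepts handled requests\s+(\d+)\s+(\d+)\s+(\d+)\s+'
        . 'Reading:\s+(\d+)\s+Writing:\s+(\d+)\s+Waiting:\s+(\d+)/';

    if (preg_match($pattern, $body, $m) !== 1) {
        return null;
    }

    // $m[0] is the full match; the seven counters follow in order.
    [, $active, $accepts, $handled, $requests, $reading, $writing, $waiting]
        = array_map('intval', $m);

    return [
        'active'   => $active,
        'accepts'  => $accepts,
        'handled'  => $handled,
        'requests' => $requests,
        'reading'  => $reading,
        'writing'  => $writing,
        'waiting'  => $waiting,
        // Accepted but not handled connections point at resource limits
        // such as worker_connections.
        'dropped'  => $accepts - $handled,
    ];
}
```

The accepts, handled, and requests values are counters that only grow, so dashboards should graph their rate of change rather than the raw numbers.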

DynamoDB Performance and Health Monitoring

DynamoDB, while managed, requires careful monitoring of throughput, latency, and errors. AWS CloudWatch is the primary tool here, but we can export key metrics for centralized analysis.

Key DynamoDB Metrics to Track

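ConsumedReadCapacityUnits / ConsumedWriteCapacityUnits: Capacity units consumed during the period. Divide the Sum by the period length in seconds to get an average per-second rate that can be compared with provisioned capacity.
ProvisionedReadCapacityUnits / ProvisionedWriteCapacityUnits: For provisioned tables, the configured capacity, which is the denominator when calculating utilization.
ThrottledRequests: Requests rejected for exceeding the provisioned throughput of a table or index. Reported per TableName and Operation.
SuccessfulRequestLatency: Latency of successful requests in milliseconds, reported per TableName and Operation. Track both Average and Maximum.
SystemErrors: Requests that returned an HTTP 500 status, meaning the error originated in DynamoDB itself.
UserErrors: Requests that returned an HTTP 400 status (e.g., validation errors). Reported for the whole account and Region, not per table.
ItemCount: Number of items in the table. This is not a CloudWatch metric; it is returned by the DescribeTable API and refreshed roughly every six hours.
TableSizeBytes: Size of the table. Like ItemCount, it comes from DescribeTable rather than CloudWatch.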
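Consumed capacity is only meaningful relative to what is provisioned. As a minimal sketch of that arithmetic, the helper below turns a CloudWatch Sum over a period into a utilization ratio. The function name capacityUtilization is an assumption for illustration, and the calculation applies to provisioned-mode tables only.

```php
<?php
declare(strict_types=1);

/**
 * Utilization of provisioned capacity for one CloudWatch period.
 *
 * @param float $consumedSum      Sum statistic of Consumed*CapacityUnits for the period
 * @param int   $periodSeconds    Length of the period in seconds (e.g. 300)
 * @param float $provisionedUnits Provisioned capacity units per second
 * @return float|null Ratio where 1.0 means fully used; null if the inputs are unusable
 */
function capacityUtilization(float $consumedSum, int $periodSeconds, float $provisionedUnits): ?float
{
    if ($periodSeconds <= 0 || $provisionedUnits <= 0.0) {
        return null;
    }

    // Average units consumed per second during the period.
    $perSecond = $consumedSum / $periodSeconds;

    return $perSecond / $provisionedUnits;
}

// 90,000 read units consumed in 5 minutes against 400 provisioned RCU:
// 90000 / 300 = 300 units per second, and 300 / 400 = 0.75.
var_dump(capacityUtilization(90000.0, 300, 400.0)); // float(0.75)
```

Because this is an average over the period, short bursts can still be throttled while the ratio stays well below 1.0, which is why ThrottledRequests should be watched alongside it.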

Exporting CloudWatch Metrics to Prometheus

To integrate DynamoDB metrics into your existing Prometheus/Grafana stack, you can use the cloudwatch-exporter. This requires AWS credentials configured on the host where the exporter runs.

# Example cloudwatch_exporter configuration (config.yml)
# This file defines which metrics to fetch from CloudWatch.
# AWS credentials come from environment variables or an IAM role, not from this file.

region: us-east-1 # Your DynamoDB region

metrics:
  - aws_namespace: AWS/DynamoDB
    aws_metric_name: ConsumedReadCapacityUnits
    aws_dimensions: [TableName]
    aws_dimension_select:
      TableName: [your-dynamodb-table-name] # Replace with your table name
    aws_statistics: [Sum, Average]
    period_seconds: 300 # 5 minutes
  - aws_namespace: AWS/DynamoDB
    aws_metric_name: ConsumedWriteCapacityUnits
    aws_dimensions: [TableName]
    aws_dimension_select:
      TableName: [your-dynamodb-table-name]
    aws_statistics: [Sum, Average]
    period_seconds: 300
  - aws_namespace: AWS/DynamoDB
    aws_metric_name: ThrottledRequests
    aws_dimensions: [TableName, Operation] # Reported per operation
    aws_dimension_select:
      TableName: [your-dynamodb-table-name]
    aws_statistics: [Sum]
    period_seconds: 300
  - aws_namespace: AWS/DynamoDB
    aws_metric_name: SuccessfulRequestLatency
    aws_dimensions: [TableName, Operation] # Reported per operation
    aws_dimension_select:
      TableName: [your-dynamodb-table-name]
    aws_statistics: [Average]
    period_seconds: 300

Run the cloudwatch-exporter, mounting your configuration file at /config/config.yml, where the image expects it. The exporter listens on port 9106 inside the container; the mapping below publishes it on host port 9118.

docker run -d \
  --name cloudwatch-exporter \
  -p 9118:9106 \
  -e AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY \
  -v /path/to/your/config.yml:/config/config.yml \
  prom/cloudwatch-exporter:latest

Prometheus Configuration for CloudWatch Exporter

scrape_configs:
  - job_name: 'dynamodb'
    static_configs:
      - targets: ['localhost:9118'] # Or the IP/port of your exporter
        labels:
          instance: 'your-dynamodb-instance-name'

Application-Level Metrics and Error Tracking

Beyond infrastructure, application-specific metrics and error tracking are vital. This involves instrumenting your PHP code.

Instrumenting PHP with Prometheus Client

Use a PHP client library for Prometheus to expose custom metrics from your application. For example, tracking the number of calls to specific DynamoDB operations or the duration of those calls.

require 'vendor/autoload.php';

use Prometheus\CollectorRegistry;
use Prometheus\RenderTextFormat;
use Prometheus\Storage\APC;

// Initialize registry and storage.
// Each PHP-FPM request starts with empty memory, so InMemory storage would reset
// on every request. APC storage (requires the APCu extension) or Redis storage
// keeps samples between requests.
$registry = new CollectorRegistry(new APC());

// Create a counter for DynamoDB operations
$dynamoDbOpsCounter = $registry->getOrRegisterCounter(
    'myapp', 'dynamodb_operations_total', 'Total number of DynamoDB operations', ['operation']
);

// Create a histogram for DynamoDB operation durations
$dynamoDbDurationHistogram = $registry->getOrRegisterHistogram(
    'myapp', 'dynamodb_operation_duration_seconds', 'Duration of DynamoDB operations in seconds', ['operation'], [0.1, 0.5, 1, 5, 10]
);

// Example usage within your application logic.
// A closure is used so the metric objects can be captured with `use`;
// a plain function would not see variables from the enclosing scope.
$performDynamoDbOperation = function (string $operationName, callable $operation) use ($dynamoDbOpsCounter, $dynamoDbDurationHistogram) {
    $startTime = microtime(true);
    try {
        $result = $operation();
        $duration = microtime(true) - $startTime;

        // Label values are positional, in the order the label names were declared
        $dynamoDbOpsCounter->inc([$operationName]);
        $dynamoDbDurationHistogram->observe($duration, [$operationName]);

        return $result;
    } catch (\Throwable $e) {
        // Log the error, potentially increment an error counter
        error_log("DynamoDB operation failed: " . $e->getMessage());
        throw $e;
    }
};

// --- In your application code ---
// Example: Fetching an item from DynamoDB
$item = $performDynamoDbOperation('getItem', function () use ($dynamoDbClient, $tableName, $key) {
    return $dynamoDbClient->getItem([
        'TableName' => $tableName,
        'Key' => $key,
    ]);
});

// Example: Exposing metrics endpoint
// This endpoint should be scraped by Prometheus.
// Ensure it's protected and only accessible by your monitoring system.
if (parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH) === '/metrics') {
    header('Content-Type: ' . RenderTextFormat::MIME_TYPE);
    $renderer = new RenderTextFormat();
    echo $renderer->render($registry->getMetricFamilySamples());
    exit;
}
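The catch block in the wrapper above only logs the failure. If you add an error counter, a label that separates throttling from other failures makes the metric far more useful for alerting. The sketch below maps DynamoDB error codes to a small set of label values. The function name classifyDynamoDbError and the label values are assumptions for illustration; the error code strings are the ones DynamoDB returns.

```php
<?php
declare(strict_types=1);

/**
 * Map a DynamoDB error code to a coarse class suitable for a metric label.
 * Keeping the set of label values small avoids high-cardinality metrics.
 */
function classifyDynamoDbError(?string $awsErrorCode): string
{
    switch ($awsErrorCode) {
        case 'ProvisionedThroughputExceededException':
        case 'ThrottlingException':
        case 'RequestLimitExceeded':
            return 'throttled';
        case 'ConditionalCheckFailedException':
            return 'conditional_check_failed';
        case 'ValidationException':
        case 'ResourceNotFoundException':
            return 'client';
        case 'InternalServerError':
        case 'ServiceUnavailable':
            return 'server';
        default:
            return 'other';
    }
}

// Usage sketch inside the catch block, assuming the AWS SDK for PHP and a
// hypothetical $dynamoDbErrorsCounter registered with labels ['operation', 'class']:
// $code = $e instanceof \Aws\Exception\AwsException ? $e->getAwsErrorCode() : null;
// $dynamoDbErrorsCounter->inc([$operationName, classifyDynamoDbError($code)]);
```

Note that the AWS SDK retries throttled requests automatically by default, so the exception only surfaces once those retries are exhausted.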

Add a Prometheus scrape job for your application’s metrics endpoint.

scrape_configs:
  - job_name: 'php-app'
    metrics_path: '/metrics' # Job-level setting; it is not valid inside static_configs
    static_configs:
      - targets: ['your-app-host:80'] # Assuming metrics endpoint is served on port 80
        labels:
          instance: 'your-php-app-instance-name'

Error Tracking with Sentry or Similar

For detailed error reporting, integrate an error tracking service like Sentry. The PHP SDK can capture exceptions and provide context.

// Example using Sentry PHP SDK
require 'vendor/autoload.php';

\Sentry\init([
    'dsn' => 'YOUR_SENTRY_DSN',
    'environment' => 'production',
    'release' => 'YOUR_APP@YOUR_VERSION', // Placeholder in Sentry's name@version form
]);

try {
    // Your application code that might throw exceptions
    throw new \Exception("Something went wrong in the application!");
} catch (\Exception $e) {
    \Sentry\captureException($e);
    // Handle the exception
}

Log Aggregation and Analysis

Centralized logging is non-negotiable. Tools like the ELK stack (Elasticsearch, Logstash, Kibana) or Grafana Loki aggregate logs from PHP-FPM, web servers, and your application in one searchable place.

Configuring Log Shipping

Use agents like Filebeat or Promtail to ship logs to your central logging system. For PHP-FPM, you’ll typically configure it to log to a file, and the agent will tail that file.

; PHP-FPM error log configuration
; /etc/php/8.1/fpm/php-fpm.conf
error_log = /var/log/php/php-fpm.log

# Example Filebeat configuration (filebeat.yml)
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/php/php-fpm.log
    - /var/log/nginx/error.log
    - /var/log/your-app/app.log # Your application's log file
  fields_under_root: true
  fields:
    environment: production
    app_name: your-php-app

output.elasticsearch:
  hosts: ["your-elasticsearch-host:9200"]
  index: "filebeat-%{[agent.version]}-%{+yyyy.MM.dd}"
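Shipped logs are much easier to query when the application writes one JSON object per line. The following is a minimal sketch of such a formatter for the application log listed above. The function name formatLogLine and the field names are assumptions for illustration; established libraries such as Monolog provide the same thing with more features.

```php
<?php
declare(strict_types=1);

/**
 * Build one newline-terminated JSON log record.
 * $now is injectable so the output is deterministic in tests.
 */
function formatLogLine(
    string $level,
    string $message,
    array $context = [],
    ?\DateTimeImmutable $now = null
): string {
    $now = $now ?? new \DateTimeImmutable('now', new \DateTimeZone('UTC'));

    $record = [
        'timestamp' => $now->format(\DateTimeInterface::ATOM),
        'level'     => strtoupper($level),
        'message'   => $message,
        // Cast to object so an empty context encodes as {} rather than [].
        'context'   => (object) $context,
    ];

    return json_encode($record, JSON_UNESCAPED_SLASHES | JSON_THROW_ON_ERROR) . "\n";
}

// Usage sketch (the path matches the Filebeat example and is a placeholder):
// file_put_contents(
//     '/var/log/your-app/app.log',
//     formatLogLine('error', 'DynamoDB getItem failed', ['table' => 'orders']),
//     FILE_APPEND | LOCK_EX
// );
```

Filebeat's log input can decode such lines into fields with its json.* options (for example json.keys_under_root), so values like level and context.table become searchable without extra parsing rules.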

Alerting Strategies

Once metrics and logs are collected, define meaningful alerts. Avoid alert fatigue by focusing on actionable events.

Alerting on Key Metrics

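PHP-FPM: Active processes close to pm.max_children, few idle processes, a non-zero listen queue, a rising "max children reached" count, or a high request rate with increasing latency.
Web Server: High 5xx error rate and high request latency (both derived from access logs, since stub_status only reports connection and request counts), or active connections approaching the worker_connections limit.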
DynamoDB: High ThrottledRequests, high SuccessfulRequestLatency, high ConsumedRead/WriteCapacityUnits approaching provisioned limits.
Application: High error rates (from Sentry or custom metrics), critical custom metric thresholds breached.

# Example Prometheus alerting rules (loaded through rule_files; Alertmanager handles routing)
groups:
- name: php_app_alerts
  rules:
  - alert: HighDynamoDBThrottling
    # cloudwatch_exporter exposes each CloudWatch statistic as a gauge named
    # aws_<namespace>_<metric>_<statistic>, so rate() is not needed here.
    expr: sum by (table_name) (aws_dynamodb_throttled_requests_sum{job="dynamodb"}) > 0
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "DynamoDB throttling detected on table {{ $labels.table_name }}"
      description: "DynamoDB table is being throttled. Check throughput provisioning and application access patterns."

  - alert: HighPHPRequestLatency
    # php_network_request_duration_seconds_avg is a placeholder name; substitute the
    # request latency metric your exporter or application actually exposes.
    expr: avg_over_time(php_network_request_duration_seconds_avg{job="php-app"}[5m]) > 2.0
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: "High PHP request latency on {{ $labels.instance }}"
      description: "Average PHP request duration has been above 2 seconds for the last 10 minutes."

Conclusion

A comprehensive monitoring strategy for a PHP application on OVH with DynamoDB involves layers of infrastructure, web server, database, and application-level metrics. By instrumenting your application, configuring exporters, and centralizing logs, you gain the visibility needed to proactively identify and resolve issues before they impact your users.