Server Monitoring Best Practices: Keeping Your WordPress App and DynamoDB Clusters Alive on Linode

Establishing a Robust Monitoring Foundation with Prometheus and Grafana

Maintaining the health and performance of a WordPress application, especially when coupled with a managed NoSQL database like AWS DynamoDB (or a self-hosted equivalent on Linode), necessitates a proactive and comprehensive monitoring strategy. We’ll focus on setting up Prometheus for metrics collection and Grafana for visualization, leveraging their extensibility to cover both the WordPress layer and the underlying infrastructure.

Deploying Prometheus and Grafana on Linode

A common and effective approach is to deploy Prometheus and Grafana as Docker containers on a dedicated Linode instance. This provides isolation and simplifies management.

Docker Compose Setup

Create a docker-compose.yml file to define the Prometheus and Grafana services. We’ll include persistent volumes for data storage.

version: '3.7'

services:
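  prometheus:
    image: prom/prometheus:v2.45.0
    container_name: prometheus
    ports:
      - "9090:9090"
    volumes:
      - prometheus_data:/prometheus
      - ./prometheus:/etc/prometheus
    # On Linux, map host.docker.internal to the Docker host so Prometheus can
    # scrape exporters published on the host (requires Docker 20.10 or later).
    extra_hosts:
      - "host.docker.internal:host-gateway"
    restart: unless-stopped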

  grafana:
    image: grafana/grafana:10.2.1
    container_name: grafana
    ports:
      - "3000:3000"
    volumes:
      - grafana_data:/var/lib/grafana
    restart: unless-stopped

volumes:
  prometheus_data:
  grafana_data:

Prometheus Configuration (`prometheus.yml`)

The Prometheus configuration file (./prometheus/prometheus.yml) is crucial for defining scrape targets. Initially, we’ll configure it to scrape itself and Grafana. We’ll add WordPress and DynamoDB targets later.

global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'grafana'
    static_configs:
      - targets: ['grafana:3000'] # Compose service name; localhost would point at the Prometheus container itself

After creating these files, navigate to the directory containing docker-compose.yml and run:

docker-compose up -d

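You should now be able to access Prometheus at http://your_linode_ip:9090 and Grafana at http://your_linode_ip:3000. The default Grafana login is admin/admin; Grafana prompts you to change the password on first login, and you should do so before exposing the port publicly.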

Monitoring WordPress Application Metrics

To monitor WordPress, we need to expose application-level metrics. WordPress does not expose Prometheus metrics out of the box, so this usually means a plugin that publishes a metrics endpoint or a dedicated Prometheus exporter for PHP applications. For this example, we’ll assume a custom exporter or a plugin that exposes metrics on a specific port.

Node Exporter for System Metrics

First, deploy the node_exporter on your WordPress Linode instance to collect system-level metrics (CPU, memory, disk, network). You can run this directly on the host or as another Docker container.

# On your WordPress Linode
docker run -d \
  --name node_exporter \
  --restart unless-stopped \
  -p 9100:9100 \
  -v "/proc:/host/proc:ro" \
  -v "/sys:/host/sys:ro" \
  -v "/:/rootfs:ro" \
  --label "container_name=node_exporter" \
  prom/node-exporter:v1.7.0 --path.procfs=/host/proc --path.sysfs=/host/sys --path.rootfs=/rootfs
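
If you would rather manage the exporter with Docker Compose, the same container can be described declaratively. The following is a minimal sketch based on the layout suggested in the node_exporter documentation; the file location and the use of host networking are assumptions, not part of the setup above.

```yaml
# docker-compose.yml on the WordPress Linode (hypothetical location)
services:
  node_exporter:
    image: prom/node-exporter:v1.7.0
    container_name: node_exporter
    command:
      - '--path.rootfs=/host'   # read host filesystem stats from the bind mount below
    network_mode: host          # exporter listens directly on the host's port 9100
    pid: host
    volumes:
      - '/:/host:ro,rslave'
    restart: unless-stopped
```

With host networking no ports: mapping is needed, and the exporter reports the host’s network interfaces rather than the container’s.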

Prometheus Configuration for WordPress Node Exporter

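Update your prometheus.yml to include the node_exporter as a target. If WordPress runs on the same Linode as Prometheus/Grafana, the Prometheus container can reach the host through host.docker.internal. On Linux this name only resolves if the Prometheus service maps it with extra_hosts: "host.docker.internal:host-gateway" in docker-compose.yml (Docker 20.10 or later); otherwise, use the Linode’s private IP. For simplicity, assuming WordPress is on the same Linode: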

global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'grafana'
    static_configs:
      - targets: ['grafana:3000'] # Compose service name, not localhost

  - job_name: 'wordpress_node_exporter'
    static_configs:
      - targets: ['host.docker.internal:9100'] # Or the WordPress Linode's private IP if it runs on a separate instance

After updating prometheus.yml, restart the Prometheus container:

docker-compose restart prometheus

You should now see the wordpress_node_exporter job appear in Prometheus’s Targets page (http://your_linode_ip:9090/targets).

Exposing WordPress Application Metrics

For actual WordPress application metrics (e.g., request latency, error rates, active users), you’ll need a PHP Prometheus client library or a plugin that exposes an endpoint. Let’s assume a hypothetical endpoint at http://your_wordpress_ip:9200/metrics.

# Add to your prometheus.yml
  - job_name: 'wordpress_app'
    static_configs:
      - targets: ['your_wordpress_ip:9200'] # Replace with actual IP and port

Restart Prometheus again.

Monitoring DynamoDB Clusters

Monitoring DynamoDB involves leveraging AWS CloudWatch metrics. Since we’re on Linode, we’ll need a way to pull these metrics into Prometheus. The cloudwatch_exporter is a popular choice for this.

Deploying CloudWatch Exporter

The cloudwatch_exporter can be deployed as a Docker container. It requires AWS credentials and a configuration file specifying which metrics to scrape.

AWS Credentials

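Create an IAM user with read-only access to CloudWatch (for example, by attaching the AWS managed policy CloudWatchReadOnlyAccess). The exporter reads metrics from the CloudWatch API only, so it does not need permission to read the DynamoDB tables themselves. Store the credentials securely. A common method is to use an ~/.aws/credentials file or environment variables.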

CloudWatch Exporter Configuration (`config.yml`)

Create a configuration file (e.g., ./cloudwatch_exporter/config.yml) to define the DynamoDB metrics you want to collect. This example focuses on key metrics like ConsumedReadCapacityUnits, ConsumedWriteCapacityUnits, and ThrottledRequests.

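# ./cloudwatch_exporter/config.yml
# Format used by the Prometheus cloudwatch_exporter: one entry per CloudWatch metric.
region: us-east-1 # Replace with your DynamoDB region
metrics:
  - aws_namespace: AWS/DynamoDB
    aws_metric_name: ConsumedReadCapacityUnits
    aws_dimensions: [TableName]
    aws_statistics: [Average, Maximum]
    period_seconds: 300 # 5 minutes
  - aws_namespace: AWS/DynamoDB
    aws_metric_name: ConsumedWriteCapacityUnits
    aws_dimensions: [TableName]
    aws_statistics: [Average, Maximum]
    period_seconds: 300
  - aws_namespace: AWS/DynamoDB
    aws_metric_name: ThrottledRequests
    aws_dimensions: [TableName, Operation]
    aws_statistics: [Sum]
    period_seconds: 300
  - aws_namespace: AWS/DynamoDB
    aws_metric_name: SuccessfulRequestLatency
    aws_dimensions: [TableName, Operation]
    aws_statistics: [Average, Maximum]
    period_seconds: 300
  - aws_namespace: AWS/DynamoDB
    aws_metric_name: ProvisionedReadCapacityUnits
    aws_dimensions: [TableName]
    aws_statistics: [Average]
    period_seconds: 300
  - aws_namespace: AWS/DynamoDB
    aws_metric_name: ProvisionedWriteCapacityUnits
    aws_dimensions: [TableName]
    aws_statistics: [Average]
    period_seconds: 300

  # Add other AWS services as needed, for example:
  # - aws_namespace: AWS/EC2
  #   aws_metric_name: CPUUtilization
  #   aws_dimensions: [InstanceId]
  #   aws_statistics: [Average]
  #   period_seconds: 300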

Running the CloudWatch Exporter Container

Run the exporter as a Docker container, mapping the configuration and providing AWS credentials.

# On your Prometheus Linode
# Host port 9119 is mapped to the exporter's default container port (9106).
# The image reads its configuration from /config/config.yml by default.
docker run -d \
  --name cloudwatch_exporter \
  --restart unless-stopped \
  -p 9119:9106 \
  -v "$(pwd)/cloudwatch_exporter/config.yml:/config/config.yml:ro" \
  -e AWS_ACCESS_KEY_ID="YOUR_ACCESS_KEY_ID" \
  -e AWS_SECRET_ACCESS_KEY="YOUR_SECRET_ACCESS_KEY" \
  -e AWS_REGION="us-east-1" \
  prom/cloudwatch-exporter:latest # Pin a specific release tag in production

Note: For production, it’s highly recommended to use IAM roles if running on EC2, or AWS SDK credential providers that don’t involve hardcoding keys directly in the container environment. For Linode, using a dedicated IAM user with restricted permissions and storing keys securely (e.g., in a Docker secret or a mounted volume with restricted permissions) is a viable approach.
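
One way to keep the keys out of the command line and shell history is to load them from a file with restricted permissions. The sketch below assumes the exporter is added as a service to the same docker-compose.yml as Prometheus, that the prom/cloudwatch-exporter image is used (listening on port 9106 and reading /config/config.yml by default), and that a hypothetical ./cloudwatch_exporter/aws.env file holds the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY lines.

```yaml
# Add under services: in docker-compose.yml
  cloudwatch_exporter:
    image: prom/cloudwatch-exporter:latest   # pin a release tag in production
    container_name: cloudwatch_exporter
    env_file:
      - ./cloudwatch_exporter/aws.env        # hypothetical file; chmod 600 and keep it out of version control
    volumes:
      - ./cloudwatch_exporter/config.yml:/config/config.yml:ro
    ports:
      - "9119:9106"
    restart: unless-stopped
```

When the exporter runs in the same Compose project as Prometheus, the scrape target can be the service name and container port, cloudwatch_exporter:9106, instead of a published host port.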

Prometheus Configuration for CloudWatch Exporter

Add the cloudwatch_exporter to your prometheus.yml.

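# Add to your prometheus.yml
  - job_name: 'dynamodb_cloudwatch'
    static_configs:
      # The exporter publishes port 9119 on the Docker host. From inside the
      # Prometheus container, reach it via host.docker.internal (on Linux this
      # needs the extra_hosts host-gateway mapping), not localhost.
      - targets: ['host.docker.internal:9119']
    # If cloudwatch_exporter is on a different Linode, use its IP:
    # - targets: ['cloudwatch_exporter_linode_ip:9119']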

Restart Prometheus.

Configuring Grafana Dashboards

With Prometheus collecting data, Grafana becomes the visualization layer. We’ll add Prometheus as a data source and then import or create dashboards.

Adding Prometheus Data Source in Grafana

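1. Log in to Grafana (http://your_linode_ip:3000).
2. Navigate to Connections -> Data sources (in Grafana versions before 10, Configuration (gear icon) -> Data Sources).
3. Click “Add data source”.
4. Select “Prometheus”.
5. In the “URL” field, enter http://prometheus:9090, the Compose service name. Grafana queries Prometheus from inside its own container, so http://localhost:9090 would not reach Prometheus. Use the IP of your Prometheus server if it’s on a different machine.
6. Click “Save & Test”. Grafana should confirm that the data source is working.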
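
The same data source can also be defined in a provisioning file so that it survives container rebuilds. This is a minimal sketch, assuming a hypothetical ./grafana/provisioning directory that is mounted into the Grafana container at /etc/grafana/provisioning.

```yaml
# ./grafana/provisioning/datasources/prometheus.yml (hypothetical path)
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy                 # Grafana's backend issues the queries
    url: http://prometheus:9090   # Compose service name of the Prometheus container
    isDefault: true
```

To use it, add ./grafana/provisioning:/etc/grafana/provisioning to the volumes of the grafana service and restart the container.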

Importing Pre-built Dashboards

Grafana has a rich community providing pre-built dashboards. You can import dashboards for Node Exporter and CloudWatch metrics.

Navigate to Dashboards (four squares icon) -> Browse -> Import.
You can import by Grafana.com Dashboard ID. Some useful IDs:
- Node Exporter Full Dashboard: 1860
- AWS DynamoDB Overview: 7248 (This might need customization for cloudwatch_exporter metrics)
Alternatively, you can create custom dashboards.

Key WordPress and DynamoDB Metrics to Monitor

WordPress Application:
- Request Latency (P95, P99)
- HTTP Error Rates (4xx, 5xx)
- PHP-FPM Pool Usage (if applicable)
- Database Query Performance (if exposed)
- Active Users/Sessions

DynamoDB:
- Consumed Read/Write Capacity Units vs. Provisioned
- Throttled Requests (critical indicator of performance bottlenecks)
- Successful Request Latency (Average, Max)
- Item Count and Table Size (reported by the DescribeTable API rather than CloudWatch, so the CloudWatch exporter will not collect them)

Infrastructure (Node Exporter):
- CPU Utilization
- Memory Usage
- Disk I/O and Space
- Network Traffic
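
The infrastructure items above map onto standard node_exporter metrics. As a rough sketch, they could be precomputed with recording rules like the following; the file name and the rule names are illustrative choices, and the root mountpoint filter assumes the WordPress data lives on the root filesystem.

```yaml
# ./prometheus/rules/infrastructure.yml (hypothetical file name)
groups:
  - name: infrastructure_recording_rules
    rules:
      # Percentage of CPU time not spent idle, averaged across cores
      - record: instance:node_cpu_utilization:percent
        expr: 100 * (1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])))

      # Percentage of memory in use, based on MemAvailable
      - record: instance:node_memory_utilization:percent
        expr: 100 * (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)

      # Percentage of the root filesystem in use
      - record: instance:node_filesystem_root_utilization:percent
        expr: 100 * (1 - node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"})

      # Bytes received per second, summed over non-loopback interfaces
      - record: instance:node_network_receive_bytes:rate5m
        expr: sum by (instance) (rate(node_network_receive_bytes_total{device!="lo"}[5m]))
```

A file like this only takes effect once it is listed under rule_files in prometheus.yml.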

Alerting with Prometheus Alertmanager

Proactive monitoring is incomplete without alerting. Prometheus Alertmanager handles alerts sent by Prometheus. We’ll integrate it into our Docker Compose setup.

Adding Alertmanager to Docker Compose

# Add to your docker-compose.yml
  alertmanager:
    image: prom/alertmanager:v0.26.0
    container_name: alertmanager
    ports:
      - "9093:9093"
    volumes:
      - alertmanager_data:/alertmanager # Default --storage.path of the prom/alertmanager image
      - ./alertmanager:/etc/alertmanager
    restart: unless-stopped

volumes:
  prometheus_data:
  grafana_data:
  alertmanager_data:

Alertmanager Configuration (`alertmanager.yml`)

Configure Alertmanager to define receivers (e.g., email, Slack) and routing rules. This example shows a basic setup with email.

# ./alertmanager/alertmanager.yml
global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.example.com:587'
  smtp_from: 'alertmanager@example.com' # Placeholder sender address
  smtp_auth_username: 'alertmanager@example.com' # Placeholder SMTP username
  smtp_auth_password: 'your_smtp_password'

route:
  group_by: ['alertname', 'cluster', 'service']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  receiver: 'default-receiver'

receivers:
  - name: 'default-receiver'
    email_configs:
      - to: 'oncall@example.com' # Placeholder recipient address
        send_resolved: true

# Example of specific routing for critical alerts.
# Nest `routes` under the existing top-level `route:` block, and add the
# new receiver to the existing `receivers:` list:
#
# route:
#   ...
#   routes:
#     - receiver: 'critical-alerts'
#       matchers:
#         - severity =~ "critical"
#       continue: true
#
# receivers:
#   - name: 'critical-alerts'
#     slack_configs:
#       - api_url: 'https://hooks.slack.com/services/...'
#         channel: '#critical-alerts'

Prometheus Configuration for Alertmanager

Tell Prometheus where to find Alertmanager.

# Add to your prometheus.yml
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093'] # Compose service name; use the host's IP if Alertmanager runs elsewhere

Create the new Alertmanager container, then restart Prometheus so it picks up the alerting configuration:

docker-compose up -d alertmanager
docker-compose restart prometheus

Example Prometheus Alert Rule

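Create alert rules in a file (e.g., ./prometheus/rules/alerts.yml) and include it in prometheus.yml. The HighErrorRate rule assumes the WordPress exporter exposes an http_requests_total counter with a code label; adjust the metric and label names to whatever your exporter actually provides.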

# ./prometheus/rules/alerts.yml
groups:
  - name: wordpress_alerts
    rules:
      - alert: HighErrorRate
        expr: |
          sum(rate(http_requests_total{code=~"5..", job="wordpress_app"}[5m]))
          /
          sum(rate(http_requests_total{job="wordpress_app"}[5m]))
          * 100 > 5
        for: 5m
        labels:
          severity: critical
          service: wordpress
        annotations:
          summary: "High HTTP 5xx error rate detected on WordPress"
          description: "More than 5% of requests to WordPress are returning 5xx errors over the last 5 minutes."

      - alert: DynamoDBThrottledRequests
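        # CloudWatch statistics are exported as gauges, so compare the value directly instead of applying rate().
        expr: sum(aws_dynamodb_throttled_requests_sum{job="dynamodb_cloudwatch"}) > 0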
        for: 1m
        labels:
          severity: warning
          service: dynamodb
        annotations:
          summary: "DynamoDB throttled requests detected"
          description: "Throttled requests detected on DynamoDB over the last 5 minutes. Consider increasing provisioned capacity."

      - alert: HighCPULoad
        expr: node_load1 > 1.5 # Adjust threshold based on your Linode specs
        for: 10m
        labels:
          severity: warning
          service: node_exporter
        annotations:
          summary: "High CPU Load on WordPress Linode"
          description: "The 1-minute load average on the WordPress Linode is above 1.5."

# Include this rule file in prometheus.yml
# scrape_configs:
#   ...
# rule_files:
#   - "rules/alerts.yml"

Ensure you add the rule_files directive to your prometheus.yml and restart Prometheus.
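
Because the configuration has been built up in fragments, it may help to see how the pieces fit into one file. The following consolidated sketch assumes that Prometheus, Grafana, and Alertmanager run in the same Compose project (so they reach each other by service name), that the exporters are published on the Docker host, and that the Prometheus service maps host.docker.internal with extra_hosts. The hypothetical WordPress endpoint is kept as a placeholder.

```yaml
# ./prometheus/prometheus.yml (consolidated sketch)
global:
  scrape_interval: 15s
  evaluation_interval: 15s

rule_files:
  - "rules/alerts.yml"   # resolved relative to this file's directory

alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'grafana'
    static_configs:
      - targets: ['grafana:3000']

  - job_name: 'wordpress_node_exporter'
    static_configs:
      - targets: ['host.docker.internal:9100']

  - job_name: 'wordpress_app'
    static_configs:
      - targets: ['your_wordpress_ip:9200'] # Replace with actual IP and port

  - job_name: 'dynamodb_cloudwatch'
    static_configs:
      - targets: ['host.docker.internal:9119']
```

Note that rule_files and alerting are top-level keys, siblings of scrape_configs rather than children of it.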

Advanced Considerations and Best Practices

Service Discovery: For dynamic environments, use Prometheus’s service discovery mechanisms (e.g., file-based, Consul, Kubernetes SD) instead of static configs.
High Availability: Run Prometheus and Alertmanager in HA pairs.
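Data Retention: Set Prometheus’s --storage.tsdb.retention.time flag (the default is 15d), optionally together with --storage.tsdb.retention.size, to bound disk usage. A Docker volume size limit does not make Prometheus delete old data; it only causes writes to fail once the volume is full.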
Security: Secure your Grafana and Prometheus endpoints with authentication and network access controls. Use TLS for all communications.
Custom Metrics: Instrument your WordPress application with custom metrics relevant to your business logic (e.g., order processing time, user sign-ups).
Log Aggregation: Complement metrics monitoring with a robust log aggregation system (e.g., ELK stack, Loki) for deeper diagnostics.
Synthetic Monitoring: Use tools like Prometheus Blackbox Exporter to simulate user interactions and monitor endpoint availability and latency from external locations.
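
Two of these practices, file-based service discovery and synthetic probing, can be sketched as scrape jobs. The target file directory, the blackbox_exporter:9115 address, and the probed URL are assumptions for illustration; the blackbox job follows the relabeling pattern from the Blackbox Exporter documentation and presumes an exporter with an http_2xx module is already running.

```yaml
# Add under scrape_configs: in prometheus.yml
  # File-based discovery: Prometheus re-reads these files when they change,
  # so targets can be added without restarting Prometheus.
  - job_name: 'wordpress_nodes'
    file_sd_configs:
      - files: ['/etc/prometheus/targets/*.yml']   # hypothetical directory
        refresh_interval: 1m

  # Synthetic HTTP probe through the Blackbox Exporter
  - job_name: 'blackbox_http'
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      - targets: ['https://your_wordpress_domain/']   # placeholder URL to probe
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target       # pass the URL as ?target=
      - source_labels: [__param_target]
        target_label: instance             # keep the URL as the instance label
      - target_label: __address__
        replacement: blackbox_exporter:9115  # assumed exporter address
```

Each file matched by file_sd_configs contains a list of entries with a targets array and optional labels, in the same shape as a static_configs entry.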

By implementing this layered monitoring approach, you gain deep visibility into your WordPress application and DynamoDB cluster’s performance and health, enabling you to detect and resolve issues before they impact your users.