Server Monitoring Best Practices: Keeping Your WordPress App and DynamoDB Clusters Alive on Linode
Establishing a Robust Monitoring Foundation with Prometheus and Grafana
Maintaining the health and performance of a WordPress application, especially when coupled with a managed NoSQL database like AWS DynamoDB (or a self-hosted equivalent on Linode), necessitates a proactive and comprehensive monitoring strategy. We’ll focus on setting up Prometheus for metrics collection and Grafana for visualization, leveraging their extensibility to cover both the WordPress layer and the underlying infrastructure.
Deploying Prometheus and Grafana on Linode
A common and effective approach is to deploy Prometheus and Grafana as Docker containers on a dedicated Linode instance. This provides isolation and simplifies management.
Docker Compose Setup
Create a docker-compose.yml file to define the Prometheus and Grafana services. We’ll include persistent volumes for data storage.
version: '3.7'

services:
  prometheus:
    image: prom/prometheus:v2.45.0
    container_name: prometheus
    ports:
      - "9090:9090"
    volumes:
      - prometheus_data:/prometheus
      - ./prometheus:/etc/prometheus
    extra_hosts:
      # Lets the container reach services published on the Docker host (required on Linux)
      - "host.docker.internal:host-gateway"
    restart: unless-stopped

  grafana:
    image: grafana/grafana:10.2.1
    container_name: grafana
    ports:
      - "3000:3000"
    volumes:
      - grafana_data:/var/lib/grafana
    restart: unless-stopped

volumes:
  prometheus_data:
  grafana_data:
Prometheus Configuration (prometheus.yml)
The Prometheus configuration file (./prometheus/prometheus.yml) is crucial for defining scrape targets. Initially, we’ll configure it to scrape itself and Grafana. We’ll add WordPress and DynamoDB targets later.
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'grafana'
    static_configs:
      # Use the Compose service name: inside the Prometheus container, localhost is Prometheus itself
      - targets: ['grafana:3000']
After creating these files, navigate to the directory containing docker-compose.yml and run:
docker-compose up -d
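You should now be able to access Prometheus at http://your_linode_ip:9090 and Grafana at http://your_linode_ip:3000. The default Grafana login is admin/admin; Grafana prompts you to set a new password on first login, so change it before exposing port 3000 publicly.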
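Before moving on, it can help to confirm that both containers actually answer. The following is a minimal Python sketch, not part of the stack itself. It assumes Prometheus’s /-/healthy and Grafana’s /api/health endpoints are reachable from wherever you run it, and your_linode_ip is the same placeholder used above. The function names are illustrative.

```python
import urllib.error
import urllib.request

# Health endpoints of the two services; the host is a placeholder.
ENDPOINTS = {
    "prometheus": "http://your_linode_ip:9090/-/healthy",
    "grafana": "http://your_linode_ip:3000/api/health",
}


def http_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if the request failed."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return exc.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return None


def check_stack(endpoints, fetch=http_status):
    """Map each service name to True when its health endpoint answers 200."""
    return {name: fetch(url) == 200 for name, url in endpoints.items()}


def report(results):
    """Format one 'name: up/DOWN' line per service, sorted by name."""
    return "\n".join(
        f"{name}: {'up' if ok else 'DOWN'}" for name, ok in sorted(results.items())
    )
```

After replacing the placeholder host, `print(report(check_stack(ENDPOINTS)))` prints one line per service. The `fetch` parameter exists only so the check can be exercised without a network.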
Monitoring WordPress Application Metrics
To monitor WordPress, we need to expose application-level metrics. WordPress does not export Prometheus metrics out of the box, so this typically means running a dedicated exporter for PHP applications or a plugin built on a PHP Prometheus client library. For this example, we’ll assume a custom exporter or a plugin that exposes metrics on a specific port.
Node Exporter for System Metrics
First, deploy the node_exporter on your WordPress Linode instance to collect system-level metrics (CPU, memory, disk, network). You can run this directly on the host or as another Docker container.
# On your WordPress Linode
docker run -d \
  --name node_exporter \
  --restart unless-stopped \
  -p 9100:9100 \
  -v "/proc:/host/proc:ro" \
  -v "/sys:/host/sys:ro" \
  -v "/:/rootfs:ro" \
  --label "container_name=node_exporter" \
  prom/node-exporter:v1.7.0 \
  --path.procfs=/host/proc \
  --path.sysfs=/host/sys \
  --path.rootfs=/rootfs
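Once the container is running, port 9100 serves plain-text samples at /metrics. As a rough illustration of what that payload contains, this Python sketch parses a hand-written excerpt (the values are made up) and derives memory usage from node_memory_MemTotal_bytes and node_memory_MemAvailable_bytes. The parser only handles unlabeled samples and is not a full implementation of the exposition format.

```python
# Hand-written excerpt in the style of node_exporter output; values are invented.
SAMPLE = """\
# HELP node_memory_MemTotal_bytes Memory information field MemTotal_bytes.
# TYPE node_memory_MemTotal_bytes gauge
node_memory_MemTotal_bytes 4.0e+09
node_memory_MemAvailable_bytes 1.0e+09
node_load1 0.42
node_cpu_seconds_total{cpu="0",mode="idle"} 12345.6
"""


def parse_samples(text):
    """Collect unlabeled 'name value' samples into a dict of floats."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, HELP/TYPE comments, and labeled series.
        if not line or line.startswith("#") or "{" in line:
            continue
        parts = line.split()
        if len(parts) >= 2:
            try:
                samples[parts[0]] = float(parts[1])
            except ValueError:
                continue
    return samples


def memory_used_percent(samples):
    """Share of memory in use, based on MemAvailable rather than MemFree."""
    total = samples["node_memory_MemTotal_bytes"]
    available = samples["node_memory_MemAvailable_bytes"]
    return 100.0 * (1.0 - available / total)


if __name__ == "__main__":
    print(f"memory used: {memory_used_percent(parse_samples(SAMPLE)):.1f}%")
```

In Grafana the same figure would normally come from a PromQL expression over these two series; the script only shows where the numbers originate.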
Prometheus Configuration for WordPress Node Exporter
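Update your prometheus.yml to include the node_exporter as a target. Because Prometheus runs in a container, localhost refers to the Prometheus container itself, not to the Linode. If WordPress is on the same Linode as Prometheus/Grafana, use host.docker.internal (on Linux this name resolves only when the Prometheus service maps it to host-gateway via extra_hosts) or the Linode’s private IP. For simplicity, assuming WordPress is on the same Linode: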
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'grafana'
    static_configs:
      # Compose service name: inside the Prometheus container, localhost is Prometheus itself
      - targets: ['grafana:3000']
  - job_name: 'wordpress_node_exporter'
    static_configs:
      # On Linux, host.docker.internal requires an extra_hosts mapping to host-gateway
      # on the prometheus service. Otherwise use the Linode's private IP.
      - targets: ['host.docker.internal:9100']
After updating prometheus.yml, restart the Prometheus container:
docker-compose restart prometheus
You should now see the wordpress_node_exporter job appear in Prometheus’s Targets page (http://your_linode_ip:9090/targets).
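The Targets page is backed by Prometheus’s HTTP API at /api/v1/targets, which is convenient for scripted checks. The sketch below is a hedged example: it filters a response for targets whose health is not "up". The sample response is hand-written and trimmed to the fields used here, and the function name is illustrative.

```python
# Trimmed, hand-written example of a /api/v1/targets response.
SAMPLE_RESPONSE = {
    "status": "success",
    "data": {
        "activeTargets": [
            {
                "labels": {"job": "prometheus", "instance": "localhost:9090"},
                "scrapeUrl": "http://localhost:9090/metrics",
                "health": "up",
                "lastError": "",
            },
            {
                "labels": {
                    "job": "wordpress_node_exporter",
                    "instance": "host.docker.internal:9100",
                },
                "scrapeUrl": "http://host.docker.internal:9100/metrics",
                "health": "down",
                "lastError": "connection refused",
            },
        ]
    },
}


def unhealthy_targets(payload):
    """Return (job, scrape_url, last_error) for each active target not 'up'."""
    active = payload.get("data", {}).get("activeTargets", [])
    return [
        (
            target.get("labels", {}).get("job", "?"),
            target.get("scrapeUrl", "?"),
            target.get("lastError", ""),
        )
        for target in active
        if target.get("health") != "up"
    ]


if __name__ == "__main__":
    for job, url, error in unhealthy_targets(SAMPLE_RESPONSE):
        print(f"{job} ({url}): {error}")
```

To run it against the live server, load the JSON from http://your_linode_ip:9090/api/v1/targets and pass the decoded dictionary to `unhealthy_targets`.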
Exposing WordPress Application Metrics
For actual WordPress application metrics (e.g., request latency, error rates, active users), you’ll need a PHP Prometheus client library or a plugin that exposes an endpoint. Let’s assume a hypothetical endpoint at http://your_wordpress_ip:9200/metrics.
# Add under scrape_configs in your prometheus.yml
  - job_name: 'wordpress_app'
    static_configs:
      - targets: ['your_wordpress_ip:9200'] # Replace with actual IP and port
Restart Prometheus again.
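Whatever exposes that hypothetical endpoint has to answer in the Prometheus text exposition format. A real WordPress exporter would be written in PHP; the Python sketch below only illustrates the shape of the payload Prometheus expects. The metric name http_requests_total and its labels are assumptions chosen for illustration, not something WordPress provides.

```python
def escape_label(value):
    """Escape backslashes, double quotes, and newlines in a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_counter(name, help_text, samples):
    """Render one counter family.

    samples is a list of (labels_dict, value) pairs.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples:
        label_str = ",".join(
            f'{key}="{escape_label(str(val))}"' for key, val in sorted(labels.items())
        )
        if label_str:
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(
        render_counter(
            "http_requests_total",
            "Total HTTP requests.",
            [
                ({"code": "200", "method": "GET"}, 1027),
                ({"code": "500", "method": "GET"}, 3),
            ],
        ),
        end="",
    )
```

Counters like this only ever increase; Prometheus derives request and error rates from them at query time.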
Monitoring DynamoDB Clusters
Monitoring DynamoDB involves leveraging AWS CloudWatch metrics. Since we’re on Linode, we’ll need a way to pull these metrics into Prometheus. The cloudwatch_exporter is a popular choice for this.
Deploying CloudWatch Exporter
The cloudwatch_exporter can be deployed as a Docker container. It requires AWS credentials and a configuration file specifying which metrics to scrape.
AWS Credentials
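Create an IAM user whose policy grants read-only access to CloudWatch. The exporter needs cloudwatch:ListMetrics, cloudwatch:GetMetricStatistics, and cloudwatch:GetMetricData; it does not need permission to read the DynamoDB tables themselves. Store the credentials securely. A common method is to use an ~/.aws/credentials file or environment variables.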
CloudWatch Exporter Configuration (config.yml)
Create a configuration file (e.g., ./cloudwatch_exporter/config.yml) to define the DynamoDB metrics you want to collect. This example focuses on key metrics like ConsumedReadCapacityUnits, ConsumedWriteCapacityUnits, and ThrottledRequests.
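# ./cloudwatch_exporter/config.yml
region: us-east-1 # Replace with your DynamoDB region
metrics:
  - aws_namespace: AWS/DynamoDB
    aws_metric_name: ConsumedReadCapacityUnits
    aws_dimensions: [TableName]
    aws_statistics: [Sum, Average, Maximum] # Sum / period = units consumed per second
    period_seconds: 300 # 5 minutes
  - aws_namespace: AWS/DynamoDB
    aws_metric_name: ConsumedWriteCapacityUnits
    aws_dimensions: [TableName]
    aws_statistics: [Sum, Average, Maximum]
    period_seconds: 300
  - aws_namespace: AWS/DynamoDB
    aws_metric_name: ThrottledRequests
    aws_dimensions: [TableName, Operation]
    aws_statistics: [Sum]
    period_seconds: 300
  - aws_namespace: AWS/DynamoDB
    aws_metric_name: SuccessfulRequestLatency
    aws_dimensions: [TableName, Operation]
    aws_statistics: [Average, Maximum]
    period_seconds: 300
  - aws_namespace: AWS/DynamoDB
    aws_metric_name: ProvisionedReadCapacityUnits
    aws_dimensions: [TableName]
    aws_statistics: [Average]
    period_seconds: 300
  - aws_namespace: AWS/DynamoDB
    aws_metric_name: ProvisionedWriteCapacityUnits
    aws_dimensions: [TableName]
    aws_statistics: [Average]
    period_seconds: 300
  # Add other AWS services as needed
  # - aws_namespace: AWS/EC2
  #   aws_metric_name: CPUUtilization
  #   aws_dimensions: [InstanceId]
  #   aws_statistics: [Average]
  #   period_seconds: 300
Running the CloudWatch Exporter Container
Run the exporter as a Docker container, mapping the configuration and providing AWS credentials. The prom/cloudwatch-exporter image listens on port 9106 inside the container and reads /config/config.yml by default, so host port 9119 is mapped to container port 9106.
# On your Prometheus Linode
docker run -d \
  --name cloudwatch_exporter \
  --restart unless-stopped \
  -p 9119:9106 \
  -v "$(pwd)/cloudwatch_exporter/config.yml:/config/config.yml:ro" \
  -e AWS_ACCESS_KEY_ID="YOUR_ACCESS_KEY_ID" \
  -e AWS_SECRET_ACCESS_KEY="YOUR_SECRET_ACCESS_KEY" \
  -e AWS_REGION="us-east-1" \
  prom/cloudwatch-exporter:latest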
Note: For production, it’s highly recommended to use IAM roles if running on EC2, or AWS SDK credential providers that don’t involve hardcoding keys directly in the container environment. For Linode, using a dedicated IAM user with restricted permissions and storing keys securely (e.g., in a Docker secret or a mounted volume with restricted permissions) is a viable approach.
Prometheus Configuration for CloudWatch Exporter
Add the cloudwatch_exporter to your prometheus.yml.
# Add under scrape_configs in your prometheus.yml
  - job_name: 'dynamodb_cloudwatch'
    static_configs:
      # cloudwatch_exporter is published on the Docker host that runs Prometheus.
      # Inside the Prometheus container, localhost would point at Prometheus itself.
      - targets: ['host.docker.internal:9119']
      # If cloudwatch_exporter is on a different Linode, use its IP:
      # - targets: ['cloudwatch_exporter_linode_ip:9119']
Restart Prometheus.
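When writing queries and alert rules, it helps to know how CloudWatch metric names show up in Prometheus. The sketch below approximates the naming convention used by cloudwatch_exporter (namespace, snake-cased metric name, then statistic). Treat it as an assumption about the convention rather than the exporter’s actual code, and confirm the real names on the exporter’s /metrics page.

```python
import re


def to_snake(name):
    """Insert '_' where a lowercase letter or digit meets an uppercase letter."""
    return re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name).lower()


def prometheus_name(namespace, metric, statistic):
    """Approximate the Prometheus series name for a CloudWatch metric."""
    # "AWS/DynamoDB" becomes "aws_dynamodb"
    prefix = re.sub(r"[^a-z0-9]+", "_", namespace.lower()).strip("_")
    return f"{prefix}_{to_snake(metric)}_{statistic.lower()}"


if __name__ == "__main__":
    for metric, stat in [
        ("ThrottledRequests", "Sum"),
        ("ConsumedReadCapacityUnits", "Average"),
        ("SuccessfulRequestLatency", "Maximum"),
    ]:
        print(prometheus_name("AWS/DynamoDB", metric, stat))
```

Note that these series are gauges holding one CloudWatch statistic per period, so they are compared directly rather than wrapped in rate().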
Configuring Grafana Dashboards
With Prometheus collecting data, Grafana becomes the visualization layer. We’ll add Prometheus as a data source and then import or create dashboards.
Adding Prometheus Data Source in Grafana
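1. Log in to Grafana (http://your_linode_ip:3000).
2. Navigate to Connections -> Data sources (Configuration (gear icon) -> Data Sources in older Grafana versions).
3. Click “Add data source”.
4. Select “Prometheus”.
5. In the “URL” field, enter http://prometheus:9090. Grafana queries Prometheus from inside its own container, where localhost would point at Grafana itself, so use the Compose service name (or the IP of your Prometheus server if it’s on a different machine).
6. Click “Save & Test”. You should see a confirmation that the data source is working.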
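The same data source can be created through Grafana’s HTTP API (POST /api/datasources), which is handy when rebuilding the stack. The following is a minimal sketch under a few assumptions: basic authentication with the admin account, Grafana reaching Prometheus by its Compose service name, and illustrative function names. It only builds the request; sending it is left to the caller.

```python
import base64
import json
import urllib.request


def datasource_payload(name="Prometheus", url="http://prometheus:9090"):
    """JSON body describing a Prometheus data source proxied by Grafana."""
    return {
        "name": name,
        "type": "prometheus",
        "url": url,
        "access": "proxy",  # Grafana's backend performs the queries
        "isDefault": True,
    }


def build_request(grafana_url, user, password, payload):
    """Build (but do not send) the POST request that creates the data source."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
```

Passing the result of `build_request("http://your_linode_ip:3000", user, password, datasource_payload())` to `urllib.request.urlopen` would submit it. For anything beyond a test setup, a Grafana service account token is a better fit than the admin password.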
Importing Pre-built Dashboards
Grafana has a rich community providing pre-built dashboards. You can import dashboards for Node Exporter and CloudWatch metrics.
- Navigate to Dashboards (four squares icon), then choose New -> Import (older Grafana versions show Import on the Browse page).
- You can import by Grafana.com Dashboard ID. Some useful IDs:
- Node Exporter Full Dashboard: 1860
- AWS DynamoDB Overview: 7248 (This might need customization for cloudwatch_exporter metrics)
- Alternatively, you can create custom dashboards.
Key WordPress and DynamoDB Metrics to Monitor
- WordPress Application:
- Request Latency (P95, P99)
- HTTP Error Rates (4xx, 5xx)
- PHP-FPM Pool Usage (if applicable)
- Database Query Performance (if exposed)
- Active Users/Sessions
- DynamoDB:
- Consumed Read/Write Capacity Units vs. Provisioned
- Throttled Requests (critical indicator of performance bottlenecks)
- Successful Request Latency (Average, Max)
- Item Count
- Table Size
- Infrastructure (Node Exporter):
- CPU Utilization
- Memory Usage
- Disk I/O and Space
- Network Traffic
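For the consumed-versus-provisioned comparison in the list above, the units need care. CloudWatch reports consumed capacity as a total over the period, while provisioned capacity is expressed per second. Here is a small worked example in Python, assuming the Sum statistic of ConsumedReadCapacityUnits is collected over a 300-second period; the numbers are invented.

```python
def consumed_per_second(consumed_sum, period_seconds):
    """Convert a CloudWatch Sum over one period into capacity units per second."""
    return consumed_sum / period_seconds


def utilization_percent(consumed_sum, period_seconds, provisioned_units):
    """Consumed capacity as a percentage of provisioned capacity."""
    if provisioned_units <= 0:
        # On-demand tables have no provisioned capacity to compare against.
        raise ValueError("provisioned_units must be positive")
    return 100.0 * consumed_per_second(consumed_sum, period_seconds) / provisioned_units


if __name__ == "__main__":
    # 45,000 read units consumed in 5 minutes against 200 provisioned RCUs
    print(f"{utilization_percent(45_000, 300, 200):.1f}% of provisioned reads")
```

Here 45,000 units over 300 seconds is 150 units per second, or 75% of a 200-unit table. Sustained values near 100% usually precede throttling.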
Alerting with Prometheus Alertmanager
Proactive monitoring is incomplete without alerting. Prometheus Alertmanager handles alerts sent by Prometheus. We’ll integrate it into our Docker Compose setup.
Adding Alertmanager to Docker Compose
# Add to your docker-compose.yml, under services:
  alertmanager:
    image: prom/alertmanager:v0.26.0
    container_name: alertmanager
    ports:
      - "9093:9093"
    volumes:
      # The image's default command stores state under /alertmanager
      - alertmanager_data:/alertmanager
      - ./alertmanager:/etc/alertmanager
    restart: unless-stopped

# And extend the top-level volumes section:
volumes:
  prometheus_data:
  grafana_data:
  alertmanager_data:
Alertmanager Configuration (alertmanager.yml)
Configure Alertmanager to define receivers (e.g., email, Slack) and routing rules. This example shows a basic setup with email.
# ./alertmanager/alertmanager.yml
global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.example.com:587'
  smtp_from: 'your_sender_address'
  smtp_auth_username: 'your_smtp_username'
  smtp_auth_password: 'your_smtp_password'

route:
  group_by: ['alertname', 'cluster', 'service']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  receiver: 'default-receiver'

receivers:
  - name: 'default-receiver'
    email_configs:
      - to: 'your_recipient_address'
        send_resolved: true

# Example of specific routing for critical alerts
# (routes nests under route; the extra receiver is appended to receivers)
# routes:
#   - receiver: 'critical-alerts'
#     matchers:
#       - severity =~ "critical"
#     continue: true
#
# receivers:
#   - name: 'critical-alerts'
#     slack_configs:
#       - api_url: 'https://hooks.slack.com/services/...'
#         channel: '#critical-alerts'
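The group_by setting decides which alerts are batched into a single notification. As a simplified illustration (not Alertmanager’s implementation), the Python sketch below buckets alerts by the same three labels; an alert that lacks a label is grouped as if the label were empty. The sample alerts are invented.

```python
from collections import defaultdict

# Same labels as group_by in the route above.
GROUP_BY = ("alertname", "cluster", "service")

# Invented alerts; none of them carries a "cluster" label.
ALERTS = [
    {"alertname": "HighErrorRate", "service": "wordpress", "severity": "critical"},
    {"alertname": "HighErrorRate", "service": "wordpress", "instance": "web-2"},
    {"alertname": "HighCPULoad", "service": "node_exporter", "severity": "warning"},
]


def group_alerts(alerts, group_by=GROUP_BY):
    """Bucket alerts by the values of the group_by labels."""
    groups = defaultdict(list)
    for alert in alerts:
        key = tuple(alert.get(label, "") for label in group_by)
        groups[key].append(alert)
    return dict(groups)


if __name__ == "__main__":
    for key, members in group_alerts(ALERTS).items():
        print(key, "->", len(members), "alert(s)")
```

The two HighErrorRate alerts share one group and would arrive as one email, while HighCPULoad forms its own. The group_wait, group_interval, and repeat_interval settings then control when each group is sent and re-sent.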
Prometheus Configuration for Alertmanager
Tell Prometheus where to find Alertmanager.
# Add to your prometheus.yml (top level, alongside scrape_configs)
alerting:
  alertmanagers:
    - static_configs:
        # Compose service name; both containers share the Compose network
        - targets: ['alertmanager:9093']
Start the new Alertmanager container, then restart Prometheus so it picks up the updated configuration:
docker-compose up -d alertmanager
docker-compose restart prometheus
Example Prometheus Alert Rule
Create alert rules in a file (e.g., ./prometheus/rules/alerts.yml) and include it in prometheus.yml.
# ./prometheus/rules/alerts.yml
groups:
  - name: wordpress_alerts
    rules:
      - alert: HighErrorRate
        expr: |
          sum(rate(http_requests_total{code=~"5..", job="wordpress_app"}[5m]))
          /
          sum(rate(http_requests_total{job="wordpress_app"}[5m]))
          * 100 > 5
        for: 5m
        labels:
          severity: critical
          service: wordpress
        annotations:
          summary: "High HTTP 5xx error rate detected on WordPress"
          description: "More than 5% of requests to WordPress are returning 5xx errors over the last 5 minutes."
      - alert: DynamoDBThrottledRequests
        # CloudWatch statistics are exported as gauges, so rate() is not applied.
        # Confirm the exact metric name on the exporter's /metrics page.
        expr: sum(aws_dynamodb_throttled_requests_sum{job="dynamodb_cloudwatch"}) > 0
        for: 1m
        labels:
          severity: warning
          service: dynamodb
        annotations:
          summary: "DynamoDB throttled requests detected"
          description: "Throttled requests detected on DynamoDB over the last 5 minutes. Consider increasing provisioned capacity."
      - alert: HighCPULoad
        expr: node_load1 > 1.5 # Adjust threshold based on your Linode specs
        for: 10m
        labels:
          severity: warning
          service: node_exporter
        annotations:
          summary: "High CPU Load on WordPress Linode"
          description: "The 1-minute load average on the WordPress Linode is above 1.5."

# Include this rule file in prometheus.yml (rule_files is a top-level key)
# scrape_configs:
#   ...
# rule_files:
#   - "rules/alerts.yml"
Ensure you add the rule_files directive to your prometheus.yml and restart Prometheus.
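To make the HighErrorRate arithmetic concrete, here is a small Python sketch of the ratio that expression computes. It is a simplification: it derives a per-second rate from the first and last counter values in the window and ignores counter resets and the extrapolation that PromQL’s rate() performs. The sample numbers are invented.

```python
def simple_rate(first, last, window_seconds):
    """Per-second increase of a counter across the window (no reset handling)."""
    return (last - first) / window_seconds


def error_percent(errors_first, errors_last, total_first, total_last, window_seconds=300):
    """Share of requests that returned 5xx during the window, in percent."""
    total_rate = simple_rate(total_first, total_last, window_seconds)
    if total_rate == 0:
        # PromQL would yield NaN here, which never satisfies "> 5".
        return 0.0
    error_rate = simple_rate(errors_first, errors_last, window_seconds)
    return 100.0 * error_rate / total_rate


if __name__ == "__main__":
    # 5xx counter went from 10 to 40 while the total went from 1000 to 1500.
    pct = error_percent(10, 40, 1000, 1500)
    print(f"{pct:.1f}% errors, threshold exceeded: {pct > 5}")
```

With 30 errors out of 500 requests in the window, the ratio is 6%, above the 5% threshold. The alert still fires only after the condition has held for the full `for: 5m` duration.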
Advanced Considerations and Best Practices
- Service Discovery: For dynamic environments, use Prometheus’s service discovery mechanisms (e.g., file-based, Consul, Kubernetes SD) instead of static configs.
- High Availability: Run Prometheus and Alertmanager in HA pairs.
- Data Retention: Configure Prometheus’s --storage.tsdb.retention.time flag (or cap disk usage with --storage.tsdb.retention.size) to manage disk space.
- Security: Secure your Grafana and Prometheus endpoints with authentication and network access controls. Use TLS for all communications.
- Custom Metrics: Instrument your WordPress application with custom metrics relevant to your business logic (e.g., order processing time, user sign-ups).
- Log Aggregation: Complement metrics monitoring with a robust log aggregation system (e.g., ELK stack, Loki) for deeper diagnostics.
- Synthetic Monitoring: Use tools like Prometheus Blackbox Exporter to simulate user interactions and monitor endpoint availability and latency from external locations.
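For the data retention point above, a rough disk estimate follows from the sizing rule in the Prometheus storage documentation: retention time in seconds, times ingested samples per second, times bytes per sample (typically 1 to 2 after compression). The Python sketch below applies it; the series count is an invented example, and real usage also depends on churn and the write-ahead log.

```python
SECONDS_PER_DAY = 86_400


def disk_bytes(active_series, scrape_interval_s, retention_days, bytes_per_sample=2.0):
    """Rough TSDB size: retention seconds * samples per second * bytes per sample."""
    samples_per_second = active_series / scrape_interval_s
    return retention_days * SECONDS_PER_DAY * samples_per_second * bytes_per_sample


if __name__ == "__main__":
    # Example: 3,000 active series scraped every 15s, kept for 15 days
    estimate = disk_bytes(3_000, 15, 15)
    print(f"approx. {estimate / 1e9:.2f} GB")
```

At the 15-second scrape interval used in this article, 3,000 series retained for 15 days come to roughly half a gigabyte, which gives a starting point for sizing the prometheus_data volume.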
By implementing this layered monitoring approach, you gain deep visibility into your WordPress application and DynamoDB cluster’s performance and health, enabling you to detect and resolve issues before they impact your users.