Server Monitoring Best Practices: Keeping Your Shopify App and MongoDB Clusters Alive on OVH
Establishing a Robust Monitoring Foundation with OVH and Prometheus
Maintaining high availability for a critical Shopify app and its associated MongoDB clusters on OVH infrastructure demands a proactive and deeply integrated monitoring strategy. We’ll focus on Prometheus as our primary time-series database and monitoring system, leveraging its powerful querying capabilities and extensive ecosystem of exporters. This approach allows us to collect granular metrics from our application, web servers, and database instances, providing the visibility needed to anticipate and resolve issues before they impact end-users.
Deploying Prometheus and Grafana on OVH
A common and effective deployment pattern is to run Prometheus and Grafana on dedicated virtual machines or containers within your OVH environment. For simplicity and rapid deployment, we’ll outline a Docker-based setup. Ensure your OVH instances have appropriate security group rules configured to allow inbound traffic on ports 9090 (Prometheus) and 3000 (Grafana) from your management network or VPN.
First, create a docker-compose.yml file to define your Prometheus and Grafana services:
version: '3.7'
services:
  prometheus:
    image: prom/prometheus:v2.45.0
    container_name: prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus:/etc/prometheus
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
    restart: unless-stopped
  grafana:
    image: grafana/grafana:10.2.0
    container_name: grafana
    ports:
      - "3000:3000"
    volumes:
      - grafana_data:/var/lib/grafana
    restart: unless-stopped
volumes:
  prometheus_data:
  grafana_data:
Next, configure Prometheus by creating a prometheus.yml file in a ./prometheus directory next to docker-compose.yml, since the Compose file mounts that directory at /etc/prometheus. This configuration defines the scrape targets. Initially, we’ll set up targets for Prometheus itself and Grafana. We’ll add MongoDB and application targets later.
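global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'grafana'
    static_configs:
      # Use the Compose service name: inside the Prometheus container,
      # localhost refers to Prometheus itself, not to Grafana.
      - targets: ['grafana:3000']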
Start the containers with:
docker-compose up -d
Access Prometheus at http://your-ovh-ip:9090 and Grafana at http://your-ovh-ip:3000. The default Grafana credentials are admin/admin, which you should change immediately.
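One way to avoid running with the default password at all is to set the admin credentials through Grafana's environment variables in the Compose file. The fragment below is a sketch under assumptions: GRAFANA_ADMIN_PASSWORD is a hypothetical variable name that you would define in a .env file beside docker-compose.yml.

```yaml
# Fragment of docker-compose.yml: only the keys added to the grafana service are shown.
services:
  grafana:
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      # GRAFANA_ADMIN_PASSWORD is an assumed variable name, substituted by
      # Compose from the .env file so the secret stays out of version control.
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_ADMIN_PASSWORD}
```

These variables only take effect when Grafana initializes its database for the first time. If the grafana_data volume already exists, change the password in the UI instead.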
Monitoring MongoDB with the `mongodb_exporter`
To gain deep insights into your MongoDB clusters, we’ll deploy the mongodb_exporter. This exporter runs as a separate service and exposes MongoDB metrics in a Prometheus-compatible format. We’ll run it as another Docker container.
First, ensure your MongoDB instances are accessible from the host running the exporter. You’ll need to create a user with sufficient privileges for the exporter to connect and query metrics. A common approach is to grant the clusterMonitor role.
# Connect to your MongoDB instance
mongosh "mongodb://your_mongo_host:27017"
// Inside mongosh: create a user for the exporter (replace 'exporter_user' and 'your_strong_password')
use admin
db.createUser({
  user: "exporter_user",
  pwd: "your_strong_password",
  roles: [ { role: "clusterMonitor", db: "admin" } ]
})
exit
Now, update your docker-compose.yml to include the mongodb_exporter and modify prometheus.yml to scrape it. We’ll assume your MongoDB instances are accessible via internal IPs or hostnames.
version: '3.7'
services:
  prometheus:
    image: prom/prometheus:v2.45.0
    container_name: prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus:/etc/prometheus
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
    restart: unless-stopped
  grafana:
    image: grafana/grafana:10.2.0
    container_name: grafana
    ports:
      - "3000:3000"
    volumes:
      - grafana_data:/var/lib/grafana
    restart: unless-stopped
  mongodb_exporter:
    image: percona/mongodb_exporter:latest
    container_name: mongodb_exporter
    ports:
      - "9216:9216" # Default port for mongodb_exporter (9100 belongs to node_exporter)
    environment:
      - MONGODB_URI=mongodb://exporter_user:your_strong_password@your_mongo_host:27017/admin?authSource=admin
    restart: unless-stopped
volumes:
  prometheus_data:
  grafana_data:
Update prometheus.yml to include the MongoDB exporter. If you have multiple MongoDB instances or replica sets, you’ll need to add them as separate targets or use service discovery.
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'grafana'
    static_configs:
      - targets: ['grafana:3000'] # Compose service name, not localhost
  - job_name: 'mongodb'
    static_configs:
      - targets: ['mongodb_exporter:9216'] # Assuming mongodb_exporter is in the same Docker network
      # If running mongodb_exporter on a different host or directly on the OVH VM:
      # - targets: ['your_mongodb_exporter_ip:9216']
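For a replica set, one possible layout is to run one exporter per member and list all of them under a single job. The sketch below assumes three exporter containers with hypothetical names (mongodb_exporter_rs0_a and so on), each listening on 9216, the Percona exporter's default port. The replica_set label is likewise an illustrative choice.

```yaml
# Fragment of prometheus.yml: an alternative 'mongodb' job for a three-member replica set.
scrape_configs:
  - job_name: 'mongodb'
    static_configs:
      - targets:
          - 'mongodb_exporter_rs0_a:9216'  # hypothetical exporter for member A
          - 'mongodb_exporter_rs0_b:9216'  # hypothetical exporter for member B
          - 'mongodb_exporter_rs0_c:9216'  # hypothetical exporter for member C
        labels:
          replica_set: 'rs0'               # attached to every series from these targets
```

Attaching a shared label lets dashboards and alert rules group or filter by replica set without parsing instance names.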
After restarting your Docker containers (docker-compose down && docker-compose up -d), you should see the mongodb job appear in Prometheus’s Targets page (http://your-ovh-ip:9090/targets). You can now query MongoDB metrics using PromQL, for example:
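mongodb_up
mongodb_connections_current
mongodb_network_bytes_received_total
mongodb_replication_lag_seconds
Exact metric names vary between exporter versions, so confirm them against the exporter’s /metrics output before building queries or alerts on them.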
Monitoring Your Shopify App with Node Exporter and Application-Specific Metrics
For your Shopify app, which we’ll assume is running on a PHP stack (e.g., using Nginx and PHP-FPM), we need to monitor both the underlying infrastructure (CPU, memory, network) and application-level performance. The node_exporter is essential for system metrics.
Install node_exporter on the OVH instances hosting your Shopify application. You can download the binary or run it via Docker. For direct installation, download the binary and create a dedicated system user:
wget https://github.com/prometheus/node_exporter/releases/download/v1.7.0/node_exporter-1.7.0.linux-amd64.tar.gz
tar xvfz node_exporter-1.7.0.linux-amd64.tar.gz
cd node_exporter-1.7.0.linux-amd64
sudo mv node_exporter /usr/local/bin/
sudo useradd -rs /bin/false node_exporter
Create a systemd service file for node_exporter at /etc/systemd/system/node_exporter.service:
[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target

[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter

[Install]
WantedBy=multi-user.target
With the unit file in place, reload systemd, then start and enable the service:
sudo systemctl daemon-reload
sudo systemctl start node_exporter
sudo systemctl enable node_exporter
Ensure your OVH firewall allows inbound traffic on port 9100 for the node_exporter. Add this target to your prometheus.yml:
scrape_configs:
  # ... other jobs ...
  - job_name: 'node'
    static_configs:
      - targets: ['your_app_server_ip_1:9100', 'your_app_server_ip_2:9100']
For application-level metrics, we’ll use the php-fpm_exporter and potentially custom application metrics. For PHP-FPM, install the exporter on your application servers and configure Prometheus to scrape it. PHP-FPM exporters are community-maintained projects, so verify the download URL and default port against the releases page of the exporter you choose.
# On your PHP-FPM server
wget https://github.com/prometheus/php-fpm_exporter/releases/download/v0.3.0/php-fpm_exporter-0.3.0.linux-amd64.tar.gz
tar xvfz php-fpm_exporter-0.3.0.linux-amd64.tar.gz
sudo mv php-fpm_exporter /usr/local/bin/
# ... create systemd service similar to node_exporter, listening on port 9259 ...
Add to prometheus.yml:
scrape_configs:
  # ... other jobs ...
  - job_name: 'php_fpm'
    static_configs:
      - targets: ['your_app_server_ip_1:9259', 'your_app_server_ip_2:9259']
For custom application metrics, you can use a Prometheus client library for PHP. This involves instrumenting your application code to expose metrics via an HTTP endpoint (e.g., /metrics). A simple example, modeled on the promphp/prometheus_client_php library (check its README for the exact API of the version you install):
<?php
require 'vendor/autoload.php';

use Prometheus\CollectorRegistry;
use Prometheus\RenderTextFormat;
use Prometheus\Storage\InMemory;

// InMemory storage is reset on every request under PHP-FPM; use a shared
// adapter (e.g., Redis or APCu) so values persist between scrapes.
$registry = new CollectorRegistry(new InMemory());

// Example: Counter for Shopify API calls.
// Arguments are namespace, name, help text, label names => shopify_api_calls_total
$counter = $registry->registerCounter('shopify', 'api_calls_total', 'Total number of Shopify API calls', ['endpoint']);
$counter->inc(['/admin/api/2023-10/orders.json']); // Increment for a specific endpoint

// Example: Gauge for current active users => app_active_users
$gauge = $registry->registerGauge('app', 'active_users', 'Current number of active users', ['user_type']);
$gauge->set(150, ['logged_in']);

// ... other metrics ...

// Render every registered metric in the Prometheus text exposition format
$renderer = new RenderTextFormat();
header('Content-Type: ' . RenderTextFormat::MIME_TYPE);
echo $renderer->render($registry->getMetricFamilySamples());
?>
Configure Nginx to serve this endpoint and add it as a scrape target in Prometheus.
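The scrape side of that step might look like the fragment below. It is a sketch under assumptions: the job name shopify_app is illustrative, and port 80 assumes Nginx serves the /metrics endpoint over plain HTTP on the application servers.

```yaml
# Fragment of prometheus.yml: scrape the custom application metrics endpoint.
scrape_configs:
  # ... other jobs ...
  - job_name: 'shopify_app'        # hypothetical job name
    metrics_path: /metrics         # path served by the PHP script behind Nginx
    scheme: http
    static_configs:
      - targets: ['your_app_server_ip_1:80', 'your_app_server_ip_2:80']
```

Because this endpoint sits on the public-facing web server, it is worth restricting access in Nginx to the Prometheus server's address so the metrics are not exposed to the internet.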
Alerting with Alertmanager
Effective monitoring is incomplete without a robust alerting system. Prometheus integrates with Alertmanager, which handles grouping, routing, and delivery of the alerts that Prometheus fires. You’ll need to deploy Alertmanager (it can also run in Docker) and configure Prometheus to send alerts to it.
Add Alertmanager to your docker-compose.yml:
  # ... other services ...
  alertmanager:
    image: prom/alertmanager:v0.26.0
    container_name: alertmanager
    ports:
      - "9093:9093"
    volumes:
      - ./alertmanager:/etc/alertmanager
    command:
      - '--config.file=/etc/alertmanager/alertmanager.yml'
    restart: unless-stopped
Create an alertmanager.yml file. This example configures email notifications:
global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.your-email-provider.com:587'
  smtp_from: 'alertmanager@example.com' # placeholder address
  smtp_auth_username: 'alertmanager@example.com' # placeholder address
  smtp_auth_password: 'your_email_password'
route:
  group_by: ['alertname', 'cluster', 'service']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  receiver: 'email-notifications'
receivers:
  - name: 'email-notifications'
    email_configs:
      - to: 'oncall@example.com' # placeholder address
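If your alerts carry a severity label, the route tree can send critical alerts to a different receiver than everything else. The fragment below is a sketch of that idea, not a required layout: the oncall-email receiver and both example.com addresses are hypothetical, and the global SMTP settings are assumed to be defined as in the file above.

```yaml
# Fragment of alertmanager.yml: route by severity (global SMTP settings omitted).
route:
  receiver: 'email-notifications'          # default receiver for unmatched alerts
  group_by: ['alertname', 'cluster', 'service']
  routes:
    - matchers:
        - severity="critical"              # matches alerts labeled severity: critical
      receiver: 'oncall-email'
      repeat_interval: 1h                  # re-notify sooner than the default route
receivers:
  - name: 'email-notifications'
    email_configs:
      - to: 'team@example.com'             # placeholder address
  - name: 'oncall-email'
    email_configs:
      - to: 'oncall@example.com'           # placeholder address
```

Alerts that match no child route fall through to the top-level receiver, so warnings keep going to the general mailbox.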
Configure Prometheus to use Alertmanager in prometheus.yml:
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093'] # Assuming alertmanager is in the same Docker network
Define alerting rules in a separate file (e.g., alerts.yml) and include it in your Prometheus configuration:
# Metric names depend on your exporter versions; verify them against each exporter's /metrics output.
groups:
  - name: mongodb_alerts
    rules:
      - alert: MongoDBHighConnectionCount
        expr: mongodb_connections_current > 500
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High MongoDB connection count on {{ $labels.instance }}"
          description: "MongoDB instance {{ $labels.instance }} has {{ $value }} current connections, exceeding the threshold."
      - alert: MongoDBReplicationLag
        expr: mongodb_replication_lag_seconds > 60
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "MongoDB replication lag detected on {{ $labels.instance }}"
          description: "MongoDB instance {{ $labels.instance }} has a replication lag of {{ $value }} seconds."
  - name: app_alerts
    rules:
      - alert: HighCPULoad
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 85
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High CPU load on {{ $labels.instance }}"
          # Single-quoted so the double quotes inside the template do not end the YAML string
          description: 'Instance {{ $labels.instance }} is experiencing high CPU load ({{ $value | printf "%.2f" }}%).'
      - alert: PHPFPMHighProcessCount
        expr: php_fpm_process_count > 50
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High PHP-FPM process count on {{ $labels.instance }}"
          description: "PHP-FPM on {{ $labels.instance }} is using {{ $value }} processes."
Save alerts.yml in the ./prometheus directory so that it is mounted at /etc/prometheus, then reference it from your Prometheus configuration:
# In prometheus.yml
rule_files:
  - "/etc/prometheus/alerts.yml"
Restart Prometheus to load the new configuration and alert rules. You can verify the alert rules in Prometheus under the “Alerts” section.
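Threshold rules like those above stay silent when an exporter stops responding, because there is no data left to compare. A minimal sketch of a rule that covers this case uses the up metric, which Prometheus records for every scrape target. The group name, alert name, and the 2m duration are illustrative choices.

```yaml
# Additional group for alerts.yml: fire when any scrape target stops responding.
groups:
  - name: availability_alerts
    rules:
      - alert: ScrapeTargetDown
        expr: up == 0              # up is 1 after a successful scrape, 0 after a failed one
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Target {{ $labels.instance }} ({{ $labels.job }}) is down"
          description: "Prometheus has failed to scrape {{ $labels.instance }} for more than 2 minutes."
```

Because the rule is not tied to a job, it covers the MongoDB exporter, node_exporter, and the PHP-FPM exporter with a single expression.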
Grafana Dashboards for Visualization
While Prometheus provides the data and Alertmanager handles notifications, Grafana is crucial for visualizing this data. You can import pre-built dashboards for MongoDB and Node Exporter from Grafana’s dashboard repository or create custom dashboards tailored to your application’s specific needs.
Key dashboards to import:
- MongoDB Dashboard (e.g., ID 7437 or similar for Percona MongoDB Exporter)
- Node Exporter Full (e.g., ID 1860)
- PHP-FPM Dashboard (if available or custom)
When setting up data sources in Grafana, point it to your Prometheus instance (e.g., http://prometheus:9090 if they are in the same Docker network, or http://your-ovh-ip:9090). For custom application dashboards, focus on visualizing key performance indicators (KPIs) such as request latency, error rates (HTTP 5xx, 4xx), queue lengths, and Shopify API call success/failure rates.
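Instead of adding the data source by hand, Grafana can load it from a provisioning file at startup. The sketch below assumes the same-Docker-network case and a file mounted into the Grafana container under /etc/grafana/provisioning/datasources/. The file name and the data source name are arbitrary.

```yaml
# e.g. /etc/grafana/provisioning/datasources/prometheus.yml inside the Grafana container
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy                 # Grafana's backend proxies the queries
    url: http://prometheus:9090   # Compose service name on the shared network
    isDefault: true
```

To use it with the Compose setup, add a volume that maps a local provisioning directory to /etc/grafana/provisioning in the grafana service.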
Advanced Considerations and Next Steps
Service Discovery: For dynamic environments, replace static configurations with service discovery mechanisms (e.g., Consul, Kubernetes service discovery if applicable) to automatically discover and scrape new instances.
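High Availability for Prometheus/Alertmanager: For critical production setups, consider running multiple Prometheus and Alertmanager instances for redundancy. A typical approach is to run two identically configured Prometheus servers that scrape the same targets and evaluate the same rules, and to run the Alertmanager instances as a cluster so that duplicate alerts are deduplicated before notifications go out.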
Long-Term Storage: For longer retention periods, integrate Prometheus with remote storage solutions like Thanos, Cortex, or VictoriaMetrics.
Application Performance Monitoring (APM): While Prometheus excels at metrics, consider integrating APM tools (e.g., New Relic, Datadog, or open-source alternatives like Jaeger for distributed tracing) for deeper application-level diagnostics, especially for complex Shopify app workflows.
Log Aggregation: Complement metrics monitoring with a centralized logging solution (e.g., ELK stack, Loki) to correlate log events with metric anomalies.
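As a first step toward the service discovery item above, Prometheus's file-based discovery lets you change targets without editing prometheus.yml or restarting. The sketch below assumes a targets directory under /etc/prometheus. The file name node_app.yml and the role label are hypothetical, and the targets file would typically be written by your provisioning tooling.

```yaml
# Fragment of prometheus.yml: read node_exporter targets from files instead of static_configs.
scrape_configs:
  - job_name: 'node'
    file_sd_configs:
      - files:
          - /etc/prometheus/targets/node_*.yml
        refresh_interval: 1m      # periodic re-read in addition to file change detection
---
# /etc/prometheus/targets/node_app.yml (hypothetical file)
- targets: ['your_app_server_ip_1:9100', 'your_app_server_ip_2:9100']
  labels:
    role: 'shopify_app'
```

Prometheus picks up edits to the targets files on its own, so adding an application server becomes a one-line change to node_app.yml.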
By implementing this comprehensive monitoring strategy, you establish a resilient system capable of detecting, alerting on, and diagnosing issues across your Shopify app and MongoDB clusters on OVH, ensuring optimal performance and uptime.