Server Monitoring Best Practices: Keeping Your Shopify App and PostgreSQL Clusters Alive on Linode

Proactive PostgreSQL Monitoring with Prometheus and Grafana on Linode

Maintaining the health and performance of PostgreSQL clusters is paramount for any application, especially those with high transaction volumes like Shopify apps. Relying solely on Linode’s basic instance metrics is insufficient. We need granular, application-aware monitoring. This section details setting up Prometheus and Grafana to collect and visualize PostgreSQL-specific metrics, enabling proactive issue detection and resolution.

1. Deploying the PostgreSQL Exporter

Prometheus needs a way to scrape metrics from PostgreSQL. The standard tool for this is the postgres_exporter. We’ll deploy this as a separate service, ideally on a dedicated monitoring node or alongside Prometheus itself.

1.1. Installation and Configuration

Prebuilt binaries are published with each postgres_exporter release, but we’ll run the exporter as a Docker container instead, which simplifies dependency management and deployment.

1.1.1. Docker Compose Setup

Create a docker-compose.yml file to manage the exporter and its connection to your PostgreSQL instances. Ensure your PostgreSQL instances are accessible from the Docker host running this compose file. You might need to adjust network configurations or use Linode’s private networking.

version: '3.8'

services:
  postgres_exporter:
    image: prometheuscommunity/postgres-exporter:latest
    container_name: postgres_exporter
    restart: unless-stopped
    ports:
      - "9187:9187" # Exporter's default port
    environment:
      # postgres_exporter reads its connection string from DATA_SOURCE_NAME.
      # Format: "postgresql://user:password@host:port/database?sslmode=require"
      # It's highly recommended to use environment variables or a secrets manager for credentials.
      # For demonstration, we'll use direct values, but this is NOT production-ready.
      DATA_SOURCE_NAME: "postgresql://monitor_user:your_secure_password@your_pg_host_1:5432/your_database?sslmode=require"
      PG_EXPORTER_EXTEND_QUERY_PATH: "/etc/postgres_exporter/queries.yaml" # Optional: custom queries (deprecated in recent exporter releases)
    volumes:
      - ./postgres_exporter_queries.yaml:/etc/postgres_exporter/queries.yaml # Optional: for custom queries
    networks:
      - monitoring_net

  # Run one exporter per cluster. This second instance is published on host port 9188.
  postgres_exporter_cluster2:
    image: prometheuscommunity/postgres-exporter:latest
    container_name: postgres_exporter_cluster2
    restart: unless-stopped
    ports:
      - "9188:9187"
    environment:
      DATA_SOURCE_NAME: "postgresql://monitor_user:your_secure_password@your_pg_host_2:5432/your_database?sslmode=require"
    networks:
      - monitoring_net

networks:
  monitoring_net:
    driver: bridge

Important Security Note: Storing credentials directly in docker-compose.yml is insecure. In production, use Docker secrets, environment files loaded by the orchestrator, or a dedicated secrets management system.
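One way to keep the password out of the compose file is to assemble the connection string at deploy time from environment variables. The sketch below is a minimal illustration, not part of the exporter: the PG_MONITOR_* variable names and the build_dsn helper are assumptions made for this example. It percent-encodes the user and password so that characters such as @ or / do not break the URI.

```python
import os
from urllib.parse import quote


def build_dsn(env=os.environ):
    """Assemble a PostgreSQL URI from environment variables.

    The PG_MONITOR_* names are hypothetical; rename them to match your setup.
    """
    user = quote(env["PG_MONITOR_USER"], safe="")
    password = quote(env["PG_MONITOR_PASSWORD"], safe="")
    host = env["PG_MONITOR_HOST"]
    port = env.get("PG_MONITOR_PORT", "5432")
    database = quote(env.get("PG_MONITOR_DB", "postgres"), safe="")
    sslmode = env.get("PG_MONITOR_SSLMODE", "require")
    return f"postgresql://{user}:{password}@{host}:{port}/{database}?sslmode={sslmode}"


if __name__ == "__main__":
    # Demonstration values only; real credentials should come from the environment.
    sample = {
        "PG_MONITOR_USER": "monitor_user",
        "PG_MONITOR_PASSWORD": "p@ss/word",
        "PG_MONITOR_HOST": "10.0.0.5",
    }
    print(build_dsn(sample))
```

The resulting string can be written to an env file that only the deployment user can read and that is excluded from version control.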

1.2. PostgreSQL User Permissions

Create a dedicated PostgreSQL user for monitoring with minimal necessary privileges. This user should have read-only access to relevant system catalogs and statistics views.

-- Connect to your PostgreSQL instance as a superuser
CREATE USER monitor_user WITH PASSWORD 'your_secure_password';

-- Grant necessary privileges
GRANT CONNECT ON DATABASE your_database TO monitor_user;
-- PostgreSQL 10+: the built-in pg_monitor role grants read access to the
-- statistics views and settings (pg_stat_activity, pg_stat_replication,
-- pg_stat_database, pg_settings, ...), including rows owned by other users.
-- Plain GRANT SELECT on those views does not reveal other users' sessions.
GRANT pg_monitor TO monitor_user;
-- pg_monitor also covers pg_stat_statements once that extension is created.

-- For custom queries defined in queries.yaml, you might need to grant SELECT on specific tables/views.
-- Example:
-- GRANT SELECT ON your_application_table TO monitor_user;

If you plan to use custom queries (e.g., for application-specific performance metrics), ensure the monitor_user has SELECT privileges on those tables or views. For pg_stat_statements metrics, add pg_stat_statements to shared_preload_libraries in postgresql.conf, restart PostgreSQL, and create the extension in the database (CREATE EXTENSION pg_stat_statements;).

2. Setting up Prometheus Server

Prometheus will be responsible for scraping metrics from the postgres_exporter and other services. We’ll configure it to discover and poll our PostgreSQL exporters.

2.1. Prometheus Configuration (prometheus.yml)

Here’s a sample prometheus.yml configuration. This assumes Prometheus is running on the same network as the Docker containers for the exporters. If Prometheus is on a separate Linode instance, adjust the targets accordingly (e.g., using Linode’s private IP addresses).

global:
  scrape_interval: 15s # How frequently to scrape targets
  evaluation_interval: 15s # How frequently to evaluate rules

scrape_configs:
  # Scrape Prometheus itself
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  # Scrape Node Exporter for host metrics
  - job_name: 'node_exporter'
    static_configs:
      - targets:
          - 'your_linode_host_1_private_ip:9100'
          - 'your_linode_host_2_private_ip:9100'
          # Add all your Linode instance private IPs

  # Scrape PostgreSQL Exporter instances
  - job_name: 'postgres_exporter'
    static_configs:
      - targets:
          - 'your_docker_host_ip:9187' # Exporter on a separate host (use its private IP)
          # If Prometheus runs in Docker on the same network as the exporter, use the service name:
          # - 'postgres_exporter:9187'
          # If Prometheus runs directly on the host and the exporter port is published:
          # - 'localhost:9187'
          # If a second exporter for cluster2 is published on host port 9188:
          # - 'your_docker_host_ip:9188'
          # Example for multiple exporters on different hosts:
          # - 'pg_exporter_host_1:9187'
          # - 'pg_exporter_host_2:9187'

  # Example using service discovery (if you have a service registry like Consul)
  # - job_name: 'postgres_exporter_sd'
  #   consul_sd_configs:
  #     - server: 'consul.service.consul:8500'
  #   relabel_configs:
  #     - source_labels: [__meta_consul_tags]
  #       regex: '.*,postgresql,.*' # Tags arrive comma-joined and the regex is fully anchored
  #       action: keep
  #     - source_labels: [__address__]
  #       regex: '(.*):9187'
  #       target_label: instance
  #     - source_labels: [__meta_consul_service_id]
  #       target_label: service_id

# Alerting rules (optional but recommended)
rule_files:
  - "rules/*.yml"

Note on Targets: The targets in static_configs should point to the address and port where the postgres_exporter is reachable from Prometheus. If Prometheus runs directly on the host and the exporter’s port is published (e.g., 9187:9187), localhost:9187 works. If Prometheus itself runs in a container, localhost refers to the Prometheus container, so use the exporter’s service name on a shared Docker network (e.g., postgres_exporter:9187) or the host’s IP. If they are on different Linode instances, use the private IP of the instance running the exporter.
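When the list of exporters changes often, hand-editing static_configs gets error-prone. Prometheus can also read targets from JSON files through file_sd_configs and picks up changes without a restart. The following is a minimal sketch of generating such a file; the render_file_sd helper, the cluster label, and the host names are assumptions for illustration.

```python
import json


def render_file_sd(groups):
    """Return file_sd JSON for a mapping of cluster name -> list of 'host:port' targets.

    The 'cluster' label is an illustrative choice, not something Prometheus requires.
    """
    entries = [
        {"targets": sorted(targets), "labels": {"cluster": cluster}}
        for cluster, targets in sorted(groups.items())
    ]
    return json.dumps(entries, indent=2)


if __name__ == "__main__":
    # Hypothetical exporter addresses, one group per PostgreSQL cluster.
    print(render_file_sd({
        "cluster1": ["pg_exporter_host_1:9187"],
        "cluster2": ["pg_exporter_host_2:9187"],
    }))
```

A scrape job would then list the generated file under file_sd_configs (files: [...]) instead of static_configs; the file path is whatever you choose to write it to.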

2.2. Installing and Running Prometheus

You can install Prometheus directly on a Linode instance or run it in Docker. Using Docker Compose is often the easiest way to manage Prometheus and its configuration.

# docker-compose.yml for Prometheus
version: '3.8'

services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus:/etc/prometheus/
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/usr/share/prometheus/console_libraries'
      - '--web.console.templates=/usr/share/prometheus/consoles'
    networks:
      - monitoring_net

volumes:
  prometheus_data:

networks:
  monitoring_net:
    driver: bridge

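Place your prometheus.yml file in a directory named prometheus alongside this docker-compose.yml. Then run: docker-compose up -d. Note that each Compose project creates its own monitoring_net, so services defined in separate Compose projects cannot resolve each other by name. Keep the monitoring services in one Compose file, or declare the network as external, if you rely on service names such as alertmanager:9093.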
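Once Prometheus is up, it is worth confirming that every target is actually being scraped. Prometheus exposes this at /api/v1/targets. The sketch below assumes Prometheus is reachable at http://localhost:9090 from where the script runs; the function names are illustrative.

```python
import json
from urllib.request import urlopen


def unhealthy_targets(payload):
    """Return (scrape_url, last_error) for every active target whose health is not 'up'."""
    active = payload.get("data", {}).get("activeTargets", [])
    return [
        (target.get("scrapeUrl", ""), target.get("lastError", ""))
        for target in active
        if target.get("health") != "up"
    ]


def fetch_targets(base_url="http://localhost:9090"):
    """Fetch the target list from the Prometheus HTTP API (base_url is an assumption)."""
    with urlopen(f"{base_url}/api/v1/targets", timeout=5) as response:
        return json.load(response)


if __name__ == "__main__":
    for url, error in unhealthy_targets(fetch_targets()):
        print(f"DOWN {url}: {error}")
```

The same information is visible in the Prometheus web UI under Status > Targets; the script form is convenient in deploy checks.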

3. Visualizing Metrics with Grafana

Grafana provides a powerful and flexible dashboarding solution. We’ll connect it to our Prometheus data source and import pre-built PostgreSQL dashboards.

3.1. Deploying Grafana

Again, Docker Compose is a convenient way to deploy Grafana.

version: '3.8'

services:
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    restart: unless-stopped
    ports:
      - "3000:3000"
    volumes:
      - grafana_data:/var/lib/grafana
    networks:
      - monitoring_net

volumes:
  grafana_data:

networks:
  monitoring_net:
    driver: bridge

Run: docker-compose up -d. Access Grafana at http://your_linode_ip:3000. Default credentials are admin/admin (you’ll be prompted to change the password on first login).

3.2. Configuring Prometheus Data Source in Grafana

1. Log in to Grafana.
2. Navigate to Configuration (gear icon) > Data Sources (Connections > Data sources in recent Grafana releases).
3. Click Add data source.
4. Select Prometheus.
5. In the URL field, enter the address of your Prometheus server (e.g., http://your_prometheus_host_ip:9090). If both run as containers on the same Docker network, use the service name (http://prometheus:9090); inside the Grafana container, localhost refers to Grafana itself.
6. Click Save & Test. You should see a confirmation that the data source is working.
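The same data source can be created without the UI through Grafana’s HTTP API (POST /api/datasources), which helps when the monitoring stack is rebuilt often. This is a sketch under assumptions: the Grafana URL, the service account token, and the datasource_request helper are placeholders for illustration, and the request is only built here, not sent.

```python
import json
from urllib.request import Request


def datasource_request(grafana_url, api_token, prometheus_url):
    """Build (but do not send) the request that registers Prometheus as a data source."""
    body = {
        "name": "Prometheus",
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",  # Grafana's backend queries Prometheus on the browser's behalf
        "isDefault": True,
    }
    return Request(
        f"{grafana_url}/api/datasources",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; pass the result to urllib.request.urlopen to actually send it.
    request = datasource_request("http://localhost:3000", "YOUR_TOKEN", "http://prometheus:9090")
    print(request.get_method(), request.full_url)
```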

3.3. Importing PostgreSQL Dashboards

Grafana has a rich community dashboard repository. We can import pre-built dashboards for PostgreSQL.

1. Go to Dashboards (four squares icon) > Browse.
2. Click Import.
3. Import by dashboard ID from grafana.com/grafana/dashboards/. Search for “PostgreSQL” and find popular ones like “PostgreSQL Overview” (ID: 1222) or “PostgreSQL by Percona” (ID: 721). Check each dashboard’s page to confirm the ID and that it is built for postgres_exporter metrics before importing.
4. Alternatively, if you have a dashboard JSON file, you can upload it.
5. When prompted, select your Prometheus data source.
6. Click Import.

These dashboards will visualize key PostgreSQL metrics such as:

Connection counts
Replication lag
Query performance (if pg_stat_statements is enabled)
Cache hit ratios
Disk I/O
Transaction rates
Lock contention
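As a worked example of one of these panels, the cache hit ratio is derived from two pg_stat_database counters: blks_hit (blocks served from shared buffers) and blks_read (blocks read from disk or the OS cache). The helper below is only an illustration of the arithmetic; the function name is made up for this sketch.

```python
def cache_hit_ratio(blks_hit, blks_read):
    """Fraction of block requests served from shared buffers, or None when idle."""
    total = blks_hit + blks_read
    if total == 0:
        return None  # No block activity yet, so the ratio is undefined
    return blks_hit / total


if __name__ == "__main__":
    ratio = cache_hit_ratio(blks_hit=990, blks_read=10)
    print(f"cache hit ratio: {ratio:.2%}")
```

In a dashboard the same ratio is normally computed over a time window (the rate of each counter) rather than over lifetime totals, so that recent behaviour is not masked by history.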

4. Setting Up Alerting Rules

Dashboards are great for visualization, but alerts are crucial for proactive intervention. Prometheus evaluates the alerting rules, and Alertmanager handles grouping, routing, and delivery of the resulting notifications.

4.1. Defining Alerting Rules

Create a file (e.g., rules/postgres_alerts.yml) and add rules to your Prometheus configuration.

groups:
- name: postgresql.rules
  rules:
  - alert: PostgreSQLHighReplicationLag
    expr: |
      pg_replication_lag_seconds{job="postgres_exporter"} > 60 # Lag greater than 60 seconds
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "High replication lag detected on {{ $labels.instance }}"
      description: "PostgreSQL instance {{ $labels.instance }} has a replication lag of {{ $value }} seconds, exceeding the 60-second threshold for 5 minutes."

  - alert: PostgreSQLTooManyConnections
    expr: |
      sum by (instance) (pg_stat_activity_count{job="postgres_exporter"}) > 100 # Example threshold, adjust based on your max_connections
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: "High number of PostgreSQL connections on {{ $labels.instance }}"
      description: "PostgreSQL instance {{ $labels.instance }} has {{ $value }} open connections, approaching the configured limit."

  - alert: PostgreSQLDeadlocks
    expr: |
      rate(pg_stat_database_deadlocks{job="postgres_exporter"}[5m]) > 0
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Deadlock detected in PostgreSQL on {{ $labels.instance }}"
      description: "A deadlock has occurred on PostgreSQL instance {{ $labels.instance }} within the last 5 minutes."

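  - alert: PostgreSQLHighBlockReadTime
    expr: |
      # This metric measures block read (I/O) time, not lock waits. It is reported in
      # milliseconds and is only populated when track_io_timing is on.
      # Verify the metric name against your exporter's /metrics output.
      increase(pg_stat_database_blk_read_time{job="postgres_exporter"}[5m]) > 300000 # More than 300 s of cumulative read time in 5 minutes
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "High block read time on {{ $labels.instance }}"
      description: "PostgreSQL instance {{ $labels.instance }} is spending significant time reading data blocks from disk."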

  # Add more rules for cache hit ratio, disk space, etc.

Ensure your prometheus.yml includes the rule_files directive pointing to this file.
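The connection threshold in the rules above is a placeholder. One reasonable way to pick it is as a fraction of the connections that ordinary clients can actually use, i.e. max_connections minus superuser_reserved_connections (which defaults to 3). The helper name and the 80% default below are assumptions for this sketch.

```python
def connection_alert_threshold(max_connections, superuser_reserved=3, fraction=0.8):
    """Connection count at which to warn, as a fraction of the usable connection slots."""
    usable = max_connections - superuser_reserved
    if usable <= 0:
        raise ValueError("max_connections must exceed the reserved superuser slots")
    return int(usable * fraction)


if __name__ == "__main__":
    # With max_connections = 100: 97 usable slots, warn at 80% of them.
    print(connection_alert_threshold(100))
```

You can read both settings with SHOW max_connections; and SHOW superuser_reserved_connections; on each cluster and substitute the result for the 100 in the rule.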

4.2. Configuring Alertmanager

Alertmanager receives alerts from Prometheus and routes them to various receivers (email, Slack, PagerDuty, etc.).

# alertmanager.yml
global:
  resolve_timeout: 5m

route:
  group_by: ['alertname', 'cluster', 'service']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  receiver: 'default-receiver' # Default receiver if no specific match

  routes:
  - receiver: 'slack-notifications'
    match:
      severity: 'critical'
    continue: true # Allows matching other routes if needed

receivers:
- name: 'default-receiver'
  webhook_configs:
  - url: 'http://your-webhook-receiver:5001' # Example: a custom webhook

- name: 'slack-notifications'
  slack_configs:
  - api_url: 'https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX' # Replace with your Slack webhook URL
    channel: '#alerts'
    send_resolved: true
    text: "{{ range .Alerts }}*Alert:* {{ .Annotations.summary }} - `{{ .Labels.severity }}`\n*Description:* {{ .Annotations.description }}\n*Details:* {{ range .Labels.SortedPairs }} `{{ .Name }}={{ .Value }}` {{ end }}\n{{ end }}"
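The interaction between the top-level receiver, child routes, and continue is easy to misread. The sketch below is a simplified model of how Alertmanager walks the routing tree, assuming exact-match label matchers only and a receiver set on every route. It is an illustration of the traversal, not Alertmanager’s own code.

```python
def matches(labels, route):
    """True when every key/value in the route's `match` block equals the alert's label."""
    return all(labels.get(key) == value for key, value in route.get("match", {}).items())


def resolve_receivers(labels, route):
    """Return the receivers an alert reaches, walking child routes in order."""
    selected = []
    for child in route.get("routes", []):
        if not matches(labels, child):
            continue
        selected.extend(resolve_receivers(labels, child))
        if not child.get("continue", False):
            break  # First matching child wins unless it sets continue: true
    # The parent's receiver is used only when no child route matched.
    return selected or [route["receiver"]]


# Mirrors the routing tree from alertmanager.yml above.
ROUTE_TREE = {
    "receiver": "default-receiver",
    "routes": [
        {"receiver": "slack-notifications", "match": {"severity": "critical"}, "continue": True},
    ],
}

if __name__ == "__main__":
    print(resolve_receivers({"severity": "critical"}, ROUTE_TREE))
    print(resolve_receivers({"severity": "warning"}, ROUTE_TREE))
```

Under this model a critical alert goes to slack-notifications only: continue: true lets later sibling routes match as well, but it does not fall back to the top-level receiver. To notify both, add a second child route for the other receiver after the Slack route.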

You’ll need to run Alertmanager, typically via Docker Compose, and configure Prometheus to send alerts to it.

# docker-compose.yml for Alertmanager
version: '3.8'

services:
  alertmanager:
    image: prom/alertmanager:latest
    container_name: alertmanager
    restart: unless-stopped
    ports:
      - "9093:9093"
    volumes:
      - ./alertmanager:/etc/alertmanager/
      - alertmanager_data:/data
    command:
      - '--config.file=/etc/alertmanager/alertmanager.yml'
      - '--storage.path=/data'
    networks:
      - monitoring_net

volumes:
  alertmanager_data:

networks:
  monitoring_net:
    driver: bridge

Update your prometheus.yml to include the Alertmanager configuration:

# ... (previous prometheus.yml content) ...

alerting:
  alertmanagers:
    - static_configs:
        - targets:
           - 'alertmanager:9093' # If Alertmanager is in the same Docker network
           # - 'your_alertmanager_host_ip:9093' # If on a different host

5. Monitoring Your Shopify App (Node.js/Ruby/PHP)

Beyond the database, your application instances themselves need monitoring. This involves application performance monitoring (APM) and infrastructure-level metrics.

5.1. Node Exporter for System Metrics

As shown in the Prometheus config, node_exporter is essential. Install it on every Linode instance running your Shopify app. It exposes hardware and OS metrics like CPU, memory, disk I/O, and network traffic.

# On each application Linode instance:
wget https://github.com/prometheus/node_exporter/releases/download/v1.7.0/node_exporter-1.7.0.linux-amd64.tar.gz
tar xvfz node_exporter-1.7.0.linux-amd64.tar.gz
cd node_exporter-1.7.0.linux-amd64
sudo mv node_exporter /usr/local/bin/
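sudo useradd -rs /bin/false node_exporter
# Directory for the textfile collector referenced in the unit file below
sudo mkdir -p /var/lib/node_exporter/textfile_collector
sudo chown node_exporter:node_exporter /var/lib/node_exporter/textfile_collector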

# Create a systemd service file (/etc/systemd/system/node_exporter.service)
[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target

[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter \
  --collector.filesystem.mount-points-exclude='^/(sys|proc|dev|host|etc)($$|/)' \
  --collector.textfile.directory=/var/lib/node_exporter/textfile_collector

[Install]
WantedBy=multi-user.target

# Enable and start
sudo systemctl daemon-reload
sudo systemctl enable node_exporter
sudo systemctl start node_exporter
sudo systemctl status node_exporter

Ensure the Linode’s firewall allows access to port 9100 from your Prometheus server’s IP address.
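The textfile collector directory configured above lets cron jobs and scripts publish their own metrics through node_exporter: any *.prom file in that directory is exposed on the next scrape. A minimal sketch of writing such a file follows. The metric name and the write_textfile_metric helper are hypothetical, and label values are assumed not to need escaping. The file is written to a temporary name and then renamed so that node_exporter never reads a half-written file.

```python
import os
import tempfile


def write_textfile_metric(directory, name, value, help_text, labels=None):
    """Atomically write a single gauge in Prometheus text format to <directory>/<name>.prom."""
    label_str = ""
    if labels:
        pairs = ",".join(f'{key}="{val}"' for key, val in sorted(labels.items()))
        label_str = "{" + pairs + "}"
    body = (
        f"# HELP {name} {help_text}\n"
        f"# TYPE {name} gauge\n"
        f"{name}{label_str} {value}\n"
    )
    # Write to a temp file in the same directory, then rename over the final path.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(body)
    os.chmod(tmp_path, 0o644)  # mkstemp creates 0600; the node_exporter user must be able to read it
    final_path = os.path.join(directory, f"{name}.prom")
    os.replace(tmp_path, final_path)
    return final_path


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()  # Stand-in for /var/lib/node_exporter/textfile_collector
    path = write_textfile_metric(demo_dir, "shopify_webhook_backlog", 7, "Pending webhook jobs")
    print(open(path).read())
```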

5.2. Application-Specific Metrics (Examples)

This is highly dependent on your app’s stack.

5.2.1. Node.js Apps

Use libraries like prom-client to expose custom metrics via an HTTP endpoint (e.g., /metrics).

// Example using prom-client for Node.js
const express = require('express');
const client = require('prom-client');
const app = express();
const register = new client.Registry();

// Enable default metrics
client.collectDefaultMetrics({ register });

// Custom metric: Number of Shopify API calls
const shopifyApiCallCounter = new client.Counter({
  name: 'shopify_api_calls_total',
  help: 'Total number of Shopify API calls made',
  labelNames: ['endpoint', 'method'],
  registers: [register], // prom-client expects an array of registries under "registers"
});

// Middleware to increment counter for API calls
app.use((req, res, next) => {
  if (req.path.startsWith('/api/shopify')) { // Example path
    shopifyApiCallCounter.labels(req.path, req.method).inc();
  }
  next();
});

// Endpoint to expose metrics
app.get('/metrics', async (req, res) => {
  res.setHeader('Content-Type', register.contentType);
  res.end(await register.metrics());
});

// Your application routes here...
app.get('/', (req, res) => {
  res.send('Hello World!');
});

// Start the metrics server (or integrate with your existing server)
const PORT = 8080; // Or any other port
app.listen(PORT, () => {
  console.log(`Metrics server listening on port ${PORT}`);
});

// In prometheus.yml, add a job for this:
// - job_name: 'my_shopify_app_nodejs'
//   static_configs:
//     - targets: ['your_app_host_private_ip:8080']

5.2.2. Ruby on Rails Apps

Use the prometheus-client gem.

# Gemfile
gem 'prometheus-client'

# Initialize Prometheus client (e.g., in config/initializers/prometheus.rb)
require 'prometheus/client'

# With multi-process servers (Puma workers, Unicorn), switch from the default
# in-memory store to Prometheus::Client::DataStores::DirectFileStore so that
# all worker processes contribute to the same metrics.

# Register custom metrics on the default registry
SHOPIFY_API_CALLS = Prometheus::Client.registry.counter(
  :shopify_api_calls_total,
  docstring: 'Total number of Shopify API calls made',
  labels: [:endpoint, :method]
)

# Add middleware to Rails application (config/application.rb or config/environments/*.rb)
require 'prometheus/middleware/exporter'
config.middleware.use Prometheus::Middleware::Exporter # Serves /metrics by default

# In a controller or service object, increment the counter:
# SHOPIFY_API_CALLS.increment(labels: { endpoint: '/admin/api/2023-10/orders.json', method: 'GET' })

# In prometheus.yml, add a job for this:
# - job_name: 'my_shopify_app_rails'
#   static_configs:
#     - targets: ['your_app_host_private_ip:3000'] # Assuming Rails default port

5.2.3. PHP Apps (e.g., Laravel)

Use libraries like prometheus_client_php.

// composer.json
// "require": { "promphp/prometheus_client_php": "^2.0" }

// In a service provider or bootstrap file (e.g., app/Providers/AppServiceProvider.php)
use Prometheus\Storage\InMemory;
use Prometheus\CollectorRegistry;
use Prometheus\RenderTextFormat;

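// Initialize registry.
// InMemory storage is reset on every request under PHP-FPM; use the library's
// Redis or APC/APCu adapter in production so counters persist between requests.
$adapter = new InMemory();
$registry = new CollectorRegistry($adapter);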

// Custom metric: Number of Shopify API calls
$counter = $registry->registerCounter(
    'my_app', // Namespace
    'shopify_api_calls_total',
    'Total number of Shopify API calls made',
    ['endpoint', 'method'] // Labels
);

// Example usage in a controller or service
// (assumes $this->counter holds the counter registered above)
public function callShopifyApi() {
    // ... API call logic ...
    $endpoint = '/admin/api/2023-10/orders.json';
    $method = 'POST';
    $this->counter->incBy(1, [$endpoint, $method]);
    // ...
}

// Create a route to expose metrics (e.g., routes/web.php)
Route::get('/metrics', function () use ($registry) {
    $renderer = new RenderTextFormat();
    // render() takes the collected samples, not the registry itself
    return response($renderer->render($registry->getMetricFamilySamples()), 200)
        ->header('Content-Type', RenderTextFormat::MIME_TYPE);
});

// In prometheus.yml, add a job for this:
// - job_name: 'my_shopify_app_php'
//   static_configs:
//     - targets: ['your_app_host_private_ip:80'] # Assuming web server port

5.3. APM Tools

For deeper insights into request tracing, error tracking, and performance bottlenecks within your application code, consider dedicated APM tools. Many integrate with Prometheus or offer their own dashboards:

New Relic
Datadog
Sentry (primarily error tracking, but has performance monitoring)
OpenTelemetry (an open-source standard for instrumentation, can send data to various backends including Prometheus)

Instrument your application code with the chosen APM agent. Configure it to send data to its respective backend. You can then correlate APM data with Prometheus metrics in Grafana for a holistic view.

6. Linode Specific Considerations

6.1. Network Configuration

Use Linode’s private networking feature to allow your Prometheus, Grafana, and Alertmanager instances to communicate securely and efficiently with your application and database servers without exposing them to the public internet. Ensure your Linode firewall rules are configured to allow traffic only from necessary sources (e.g., Prometheus scraping your app’s metrics endpoint only from the Prometheus server’s private IP).

6.2. Resource Allocation

Monitoring infrastructure (Prometheus, Grafana, Alertmanager) consumes resources. Allocate adequate CPU, RAM, and disk space for these services, especially if you have a large number of targets or a long retention period for metrics. Consider using dedicated Linode instances for your monitoring stack if resource contention becomes an issue.
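For disk sizing, the Prometheus documentation gives a rough rule: needed disk space is retention time in seconds, times ingested samples per second, times bytes per sample, with samples typically compressing to about 1 to 2 bytes each. The sketch below applies that rule; the series count in the example is an invented figure, and the function name is illustrative.

```python
def estimate_tsdb_bytes(active_series, scrape_interval_s, retention_days, bytes_per_sample=2.0):
    """Rough Prometheus disk estimate: retention_seconds * samples_per_second * bytes_per_sample."""
    samples_per_second = active_series / scrape_interval_s
    retention_seconds = retention_days * 86400
    return samples_per_second * retention_seconds * bytes_per_sample


if __name__ == "__main__":
    # Hypothetical load: 100,000 active series, 15 s scrape interval, 15 days of retention.
    needed = estimate_tsdb_bytes(100_000, 15, 15)
    print(f"{needed / 1e9:.2f} GB")
```

Treat the result as a lower bound and leave headroom for the write-ahead log and compaction; the actual series count is visible in Prometheus itself as prometheus_tsdb_head_series.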

6.3. High Availability for Monitoring

For critical applications, a single point of failure in your monitoring system is unacceptable. Consider:

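Running two identical Prometheus servers that scrape the same targets, with Thanos or Cortex on top if you need a deduplicated, long-term view.
Deploying multiple Grafana instances behind a load balancer, backed by a shared database.
Running Alertmanager as a cluster of peers so that notifications are deduplicated across instances.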

This adds complexity but ensures your monitoring remains operational even if one component fails.

Conclusion

Implementing a robust monitoring strategy using Prometheus and Grafana is essential for maintaining the stability and performance of your Shopify app and its underlying PostgreSQL clusters on Linode. By combining system-level metrics, application-specific instrumentation, and proactive alerting, you can detect and resolve issues before they impact your users, ensuring a seamless e-commerce experience.