Vengala Vinay

Having 9+ Years of Experience in Software Development

Server Monitoring Best Practices: Keeping Your Perl App and MongoDB Clusters Alive on AWS

Establishing a Robust Monitoring Baseline for Perl Applications on AWS EC2

Maintaining the health and performance of Perl applications deployed on AWS EC2 instances requires a multi-layered monitoring strategy. Beyond basic CPU and memory utilization, we need to delve into application-specific metrics and system-level diagnostics that directly impact user experience and service availability. This section outlines essential checks and configurations.

System-Level Metrics with `collectd` and CloudWatch Agent

While CloudWatch provides fundamental EC2 metrics, a more granular view is often necessary. We’ll leverage `collectd` for detailed system statistics and then push these to CloudWatch for centralized visibility and alarming. The CloudWatch Agent can also be configured to collect logs and custom metrics.

First, install `collectd` on your EC2 instances:

sudo apt-get update && sudo apt-get install collectd collectd-utils -y
# Or for RHEL/CentOS:
# sudo yum install epel-release -y
# sudo yum install collectd collectd-plugins -y

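Next, configure `collectd` to collect relevant metrics. A common configuration involves the `cpu`, `memory`, `disk`, and `interface` plugins. Stock `collectd` does not ship a `write_cloudwatch` plugin, so the configuration below uses the `network` plugin to forward values to the CloudWatch Agent running on the same instance, which then publishes them to CloudWatch.

# /etc/collectd/collectd.conf

LoadPlugin cpu
LoadPlugin memory
LoadPlugin disk
LoadPlugin interface
LoadPlugin network

<Plugin cpu>
    ReportByCpu true
    ReportByState true
    ValuesPercentage true
</Plugin>

<Plugin memory>
    ValuesAbsolute true
    ValuesPercentage true
</Plugin>

<Plugin disk>
    # Keep only the devices that exist on your instance type
    Disk "sda"
    Disk "xvda"
    Disk "nvme0n1"
    IgnoreSelected false
</Plugin>

<Plugin interface>
    Interface "eth0"
    Interface "ens5"
    IgnoreSelected false
</Plugin>

<Plugin network>
    # The CloudWatch Agent's collectd listener defaults to udp://127.0.0.1:25826
    <Server "127.0.0.1" "25826">
        SecurityLevel Encrypt
        Username "cwagent"    # Placeholder; must match an entry in the agent's auth file
        Password "change-me"  # Placeholder
    </Server>
</Plugin>

On the agent side, enable the `collectd` section under `metrics_collected` in the `metrics` block of the CloudWatch Agent configuration. By default the agent reads the matching credentials from `/etc/collectd/auth_file` and publishes to the `CWAgent` namespace; set `namespace` in the `metrics` block (for example `EC2/PerlApp`) to use a custom one. Ensure your EC2 instance has an IAM role attached that allows `cloudwatch:PutMetricData`. The agent picks up credentials from the instance metadata service automatically.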

Restart `collectd` to apply the changes:

sudo systemctl restart collectd
# Or for older systems:
# sudo service collectd restart

Application-Specific Perl Metrics

For Perl applications, we need to monitor request latency, error rates, and worker process health. A common approach is to expose these metrics on an HTTP endpoint that `collectd` scrapes with its `curl_json` plugin. Alternatively, the application can expose them in the Prometheus text format for a Prometheus server to scrape, with an exporter forwarding them to CloudWatch if needed.

Here’s a simplified example of a Perl script exposing metrics:

package PerlApp::Server;

use strict;
use warnings;
use base qw(HTTP::Server::Simple::CGI);
use JSON::PP;
use Time::HiRes qw(time);

# In-memory counters. HTTP::Server::Simple handles requests in a single
# process by default; a pre-forking server would need shared storage instead.
my %metrics = (
    requests_total => 0,
    errors_total   => 0,
    latency_sum_ms => 0,
    latency_count  => 0,
);

sub handle_request {
    my ($self, $cgi) = @_;

    # Metrics endpoint: JSON that collectd's curl_json plugin can parse
    if ($cgi->path_info() eq '/metrics') {
        my $latency_avg_ms = $metrics{latency_count} > 0
            ? $metrics{latency_sum_ms} / $metrics{latency_count}
            : 0;

        # HTTP::Server::Simple expects the handler to print the status line itself
        print "HTTP/1.0 200 OK\r\n";
        print $cgi->header(-type => 'application/json');
        print JSON::PP->new->canonical->encode({
            requests_total => $metrics{requests_total},
            errors_total   => $metrics{errors_total},
            latency_avg_ms => 0 + sprintf("%.2f", $latency_avg_ms),
        }), "\n";
        return;
    }

    my $start_time = time;
    $metrics{requests_total}++;    # Count every request, successful or not

    # Simulate application logic
    my ($status, $reason) = (200, 'OK');
    if (rand() < 0.05) {    # 5% chance of error
        $metrics{errors_total}++;
        ($status, $reason) = (500, 'Internal Server Error');
        warn "Simulated error\n";
    }

    my $duration = time - $start_time;
    $metrics{latency_sum_ms} += $duration * 1000;
    $metrics{latency_count}++;

    # Respond to the client
    print "HTTP/1.0 $status $reason\r\n";
    print $cgi->header(-type => 'text/html');
    print $cgi->start_html('Perl App');
    print $cgi->p(sprintf("Request processed in %.4f seconds.", $duration));
    print $cgi->end_html;
}

package main;

# Listen on port 8080
PerlApp::Server->new(8080)->run();

To integrate this with `collectd`, load the `curl_json` plugin in your `collectd.conf` and point it at the `/metrics` endpoint of your Perl application. Each `Key` block maps one JSON field to a collectd type:

LoadPlugin curl_json

<Plugin curl_json>
    # Assuming your Perl app runs on port 8080 and exposes metrics at /metrics
    <URL "http://localhost:8080/metrics">
        Instance "perl_app"
        <Key "requests_total">
            Type "derive"
        </Key>
        <Key "errors_total">
            Type "derive"
        </Key>
        <Key "latency_avg_ms">
            Type "gauge"
        </Key>
    </URL>
</Plugin>
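Raw counters such as `requests_total` and `errors_total` only become useful for alarming once they are turned into rates. The following is a minimal sketch of that step, assuming two snapshots of the counters taken `interval_seconds` apart and assuming `requests_total` counts every request, including failed ones. The function name `counter_rates` is hypothetical and not part of the application.

```python
def counter_rates(previous, current, interval_seconds):
    """Derive per-second rates and an error ratio from two counter snapshots."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")

    requests = current["requests_total"] - previous["requests_total"]
    errors = current["errors_total"] - previous["errors_total"]

    # A negative delta means the process restarted and its counters reset,
    # so the current values are the best available estimate for the interval.
    if requests < 0 or errors < 0:
        requests = current["requests_total"]
        errors = current["errors_total"]

    return {
        "requests_per_sec": requests / interval_seconds,
        "errors_per_sec": errors / interval_seconds,
        "error_ratio": errors / requests if requests else 0.0,
    }


if __name__ == "__main__":
    before = {"requests_total": 100, "errors_total": 5}
    after = {"requests_total": 160, "errors_total": 8}
    print(counter_rates(before, after, 60))
```

Handling the counter reset explicitly avoids reporting a large negative rate right after a deployment or crash.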

Log Aggregation and Analysis

Application logs are critical for debugging. We’ll use the CloudWatch Agent to collect Perl application logs and send them to CloudWatch Logs. This allows for centralized searching, filtering, and alarming on specific error patterns.

First, ensure the CloudWatch Agent is installed and configured. The agent’s configuration file (typically `/opt/aws/amazon-cloudwatch-agent/bin/config.json`) needs to specify log file locations.

{
  "agent": {
    "metrics_collection_interval": 60,
    "run_as_user": "cwagent"
  },
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/var/log/perl_app/app.log",
            "log_group_name": "PerlApp/ApplicationLogs",
            "log_stream_name": "{instance_id}/app",
            "timezone": "UTC"
          },
          {
            "file_path": "/var/log/perl_app/error.log",
            "log_group_name": "PerlApp/ApplicationErrors",
            "log_stream_name": "{instance_id}/errors",
            "timezone": "UTC"
          }
        ]
      }
    }
  }
}

After updating the agent configuration, restart the agent:

sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json -s
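Once the logs are in CloudWatch Logs, a metric filter can turn matching lines into a metric to alarm on. The sketch below only builds the arguments for boto3's `put_metric_filter` call. The filter name, the metric namespace `PerlApp/Logs`, and the assumption that error lines contain the literal term `ERROR` are illustrative choices, not details taken from the application.

```python
def build_error_metric_filter(log_group_name, namespace="PerlApp/Logs"):
    """Build keyword arguments for logs.put_metric_filter (names are hypothetical)."""
    return {
        "logGroupName": log_group_name,
        "filterName": "perl-app-error-count",
        # Matches any log event containing the term ERROR (assumed log format)
        "filterPattern": '"ERROR"',
        "metricTransformations": [
            {
                "metricName": "ErrorCount",
                "metricNamespace": namespace,
                "metricValue": "1",   # Each matching event adds 1
                "defaultValue": 0.0,  # Emit 0 when nothing matches
            }
        ],
    }


if __name__ == "__main__":
    kwargs = build_error_metric_filter("PerlApp/ApplicationErrors")
    print(kwargs)
    # To apply it (requires boto3 and logs:PutMetricFilter permission):
    #   import boto3
    #   boto3.client("logs", region_name="us-east-1").put_metric_filter(**kwargs)
```

Keeping the argument construction separate from the API call makes the filter definition easy to review and test without AWS credentials.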

Monitoring MongoDB Clusters on AWS with CloudWatch and Percona Monitoring and Management (PMM)

Managing MongoDB clusters, especially in a distributed environment like AWS, demands robust monitoring for performance, availability, and resource consumption. We’ll combine AWS native tools with specialized solutions like Percona Monitoring and Management (PMM) for deep insights.

Leveraging CloudWatch for MongoDB Instance Metrics

AWS provides basic EC2 metrics for instances running MongoDB. However, for specific MongoDB metrics, we need to push custom metrics. The CloudWatch Agent can be configured to collect these.

We can use the `mongostat` and `mongotop` utilities to gather real-time statistics. These can be polled periodically and pushed as custom metrics to CloudWatch. A Python script is well-suited for this task.

import json
import subprocess
from datetime import datetime, timezone

import boto3

# Configure your AWS region and MongoDB connection details
AWS_REGION = "us-east-1"
MONGODB_HOST = "localhost"  # Or your MongoDB instance's IP/hostname
MONGODB_PORT = "27017"
NAMESPACE = "MongoDB/Cluster"

cloudwatch = boto3.client("cloudwatch", region_name=AWS_REGION)

# Multipliers for the size suffixes mongostat prints (e.g. "158b", "39.4k", "80.0M")
SIZE_SUFFIXES = {"b": 1, "k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def to_number(cell):
    """Convert a mongostat cell such as "*0", "12", or "0.5%" to a float."""
    return float(str(cell).strip().lstrip("*").rstrip("%") or 0)


def to_megabytes(cell):
    """Convert a mongostat size such as "39.4k" or "80.0M" to megabytes."""
    text = str(cell).strip()
    suffix = text[-1:].lower()
    if suffix in SIZE_SUFFIXES:
        return float(text[:-1]) * SIZE_SUFFIXES[suffix] / 1024**2
    return float(text or 0) / 1024**2


def split_pair(cell):
    """Split a "left|right" cell such as "0|1" into two floats."""
    left, _, right = str(cell).partition("|")
    return to_number(left), to_number(right)


def run_json(command):
    """Run a MongoDB tool and parse the last JSON line it prints."""
    result = subprocess.run(command, capture_output=True, text=True, check=True, timeout=30)
    lines = [line for line in result.stdout.splitlines() if line.strip()]
    return json.loads(lines[-1]) if lines else {}


def get_mongo_stats():
    """Return {metric_name: (value, cloudwatch_unit)}, or None on failure."""
    target = ["--host", MONGODB_HOST, "--port", MONGODB_PORT]
    try:
        # mongostat --json prints one document per sample, keyed by "host:port".
        # Field names follow recent mongostat releases; verify them against your version.
        stat_doc = run_json(["mongostat", *target, "--rowcount", "1", "--json"])
        row = next(iter(stat_doc.values()), {})
        queued_r, queued_w = split_pair(row.get("qrw", "0|0"))
        active_r, active_w = split_pair(row.get("arw", "0|0"))

        stats = {
            f"{op}_per_sec": (to_number(row.get(op, 0)), "Count/Second")
            for op in ("insert", "query", "update", "delete", "getmore", "flushes")
        }
        stats.update({
            # "command" is reported as "local|replicated"; keep the local count
            "command_per_sec": (split_pair(row.get("command", "0|0"))[0], "Count/Second"),
            "queued_readers": (queued_r, "Count"),
            "queued_writers": (queued_w, "Count"),
            "active_readers": (active_r, "Count"),
            "active_writers": (active_w, "Count"),
            "net_in_mb_per_sec": (to_megabytes(row.get("net_in", 0)), "Megabytes/Second"),
            "net_out_mb_per_sec": (to_megabytes(row.get("net_out", 0)), "Megabytes/Second"),
            "res_mb": (to_megabytes(row.get("res", 0)), "Megabytes"),
            "dirty_percent": (to_number(row.get("dirty", 0)), "Percent"),
        })

        # mongotop reports time spent per namespace (it does not list slow queries).
        # Without --rowcount it runs until interrupted, which would hang this script.
        top_doc = run_json(["mongotop", *target, "--json", "--rowcount", "1", "1"])
        totals = top_doc.get("totals", {})
        total_time_ms = sum(ns.get("total", {}).get("time", 0) for ns in totals.values())
        stats["total_op_time_ms"] = (total_time_ms, "Milliseconds")

    except (subprocess.SubprocessError, OSError) as e:
        print(f"Error running mongostat/mongotop: {e}")
        return None
    except ValueError as e:
        # Covers json.JSONDecodeError and unexpected cell formats
        print(f"Error parsing mongostat/mongotop output: {e}")
        return None

    return stats


def put_metrics(stats):
    if not stats:
        return

    timestamp = datetime.now(timezone.utc)
    metric_data = [
        {"MetricName": name, "Value": value, "Unit": unit, "Timestamp": timestamp}
        for name, (value, unit) in stats.items()
    ]

    try:
        cloudwatch.put_metric_data(Namespace=NAMESPACE, MetricData=metric_data)
        print(f"Successfully put {len(metric_data)} metrics to CloudWatch.")
    except Exception as e:
        print(f"Error putting metrics to CloudWatch: {e}")


if __name__ == "__main__":
    mongo_stats = get_mongo_stats()
    put_metrics(mongo_stats)

This script can be scheduled to run periodically (e.g., via cron) and will push key MongoDB operational metrics to a custom CloudWatch namespace. You can then create CloudWatch Alarms based on these metrics.
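As an illustration of the alarm step, the sketch below builds the arguments for boto3's `put_metric_alarm` against the `query_per_sec` metric in the `MongoDB/Cluster` namespace. The alarm name, the threshold, the SNS topic ARN, and the assumption that the script runs once per minute are all hypothetical and should be adapted to your workload.

```python
def build_query_rate_alarm(threshold_per_sec, sns_topic_arn, namespace="MongoDB/Cluster"):
    """Build keyword arguments for cloudwatch.put_metric_alarm (values are illustrative)."""
    return {
        "AlarmName": "mongodb-query-rate-high",
        "AlarmDescription": "Average query rate stayed above the threshold for 5 minutes",
        "Namespace": namespace,
        "MetricName": "query_per_sec",
        "Statistic": "Average",
        "Period": 60,             # Assumes the cron job publishes once per minute
        "EvaluationPeriods": 5,   # Require 5 consecutive breaching periods
        "Threshold": float(threshold_per_sec),
        "ComparisonOperator": "GreaterThanThreshold",
        # Treat missing data as breaching so a dead collector also raises the alarm
        "TreatMissingData": "breaching",
        "AlarmActions": [sns_topic_arn],
    }


if __name__ == "__main__":
    # Placeholder ARN; replace with a real SNS topic
    kwargs = build_query_rate_alarm(500, "arn:aws:sns:us-east-1:123456789012:ops-alerts")
    print(kwargs)
    # To create the alarm (requires boto3 and cloudwatch:PutMetricAlarm permission):
    #   import boto3
    #   boto3.client("cloudwatch", region_name="us-east-1").put_metric_alarm(**kwargs)
```

Treating missing data as breaching is a deliberate choice here: if the cron job stops publishing, the alarm fires instead of silently staying green.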

Implementing Percona Monitoring and Management (PMM)

For a more comprehensive and integrated monitoring solution, Percona Monitoring and Management (PMM) is an excellent choice. It provides deep visibility into MongoDB performance, query analysis, and cluster health.

PMM consists of a server component and client agents. The server can be deployed on an EC2 instance or as a container. The client agents are installed on your MongoDB nodes.

PMM Server Deployment (Docker Example)

Deploying PMM Server using Docker on an EC2 instance is straightforward. Ensure your EC2 instance has sufficient resources (CPU, RAM, disk) and security groups configured to allow access.
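A quick way to confirm that the security group and the container port mapping allow access is a plain TCP connection test from a client machine. This is a minimal sketch using only the standard library; the host name in the usage example is a placeholder, and the check only proves the port is reachable, not that PMM itself is healthy.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures
        return False


if __name__ == "__main__":
    # Placeholder host; use the public or private address of your PMM Server instance
    host = "pmm.example.internal"
    for port in (80, 443):
        state = "reachable" if port_is_open(host, port) else "unreachable"
        print(f"{host}:{port} is {state}")
```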

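# Install Docker Engine and the Compose plugin on your EC2 instance.
# These packages come from Docker's own apt repository, which must be configured first.
sudo apt-get update && sudo apt-get install docker-ce docker-ce-cli containerd.io docker-compose-plugin -y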
# Or for RHEL/CentOS:
# sudo yum install -y yum-utils
# sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
# sudo yum install docker-ce docker-ce-cli containerd.io docker-compose-plugin -y
# sudo systemctl start docker
# sudo systemctl enable docker

# Create a directory for PMM configuration and data
mkdir pmm-server
cd pmm-server

# Create a docker-compose.yml file
cat <<EOF > docker-compose.yml
services:
  pmm-server:
    # Major version pinned as an assumption; PMM 2 keeps all persistent data under /srv
    image: percona/pmm-server:2
    container_name: pmm-server
    restart: always
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - pmm-data:/srv

volumes:
  pmm-data:
EOF

# Start PMM Server in the background
sudo docker compose up -d

Copyright © 2026 · Vinay Vengala