Server Monitoring Best Practices: Keeping Your C App and MongoDB Clusters Alive on AWS
Monitoring C Application Performance with Prometheus and Grafana
For C applications running on AWS, granular performance monitoring is critical. We’ll leverage Prometheus for metrics collection and Grafana for visualization. The core idea is to expose application-specific metrics via an HTTP endpoint that Prometheus can scrape.
First, let’s instrument our C application. We’ll use a simple HTTP server to expose metrics in the Prometheus text format. For this, we can use a lightweight HTTP library like Mongoose or build a basic one ourselves. Here’s a conceptual example that uses Mongoose for the HTTP layer and a small hand-rolled metric struct:
C Application Instrumentation
The example keeps its metrics in global variables, with helper functions to increment counters or set gauges. The exposition endpoint formats each metric in turn as HELP, TYPE, and value lines.
#include <stdio.h>
#include "mongoose.h" // Assuming Mongoose 7.13 or later (three-argument event handler)

// Global metrics: a name, a help string, and an integer value
typedef struct {
    const char *name;
    const char *help;
    long long value;
} PrometheusMetric;

PrometheusMetric request_counter = {"app_requests_total", "Total number of requests processed", 0};
PrometheusMetric active_connections_gauge = {"app_active_connections", "Number of currently active connections", 0};
PrometheusMetric process_cpu_seconds_total = {"process_cpu_seconds_total", "Total user and system CPU time spent in seconds", 0}; // Placeholder, not exposed yet

// Atomically increment a counter (GCC/Clang builtin)
void increment_metric(PrometheusMetric *metric) {
    __sync_add_and_fetch(&metric->value, 1);
}

// Set a gauge. This is a plain store, so only call it from the event-loop thread.
void set_metric(PrometheusMetric *metric, long long value) {
    metric->value = value;
}

// HTTP event handler
static void fn(struct mg_connection *c, int ev, void *ev_data) {
    if (ev == MG_EV_HTTP_MSG) {
        struct mg_http_message *hm = (struct mg_http_message *) ev_data;
        if (mg_match(hm->uri, mg_str("/metrics"), NULL)) {
            // Format metrics in the Prometheus text exposition format
            char buffer[1024];
            snprintf(buffer, sizeof(buffer),
                     "# HELP %s %s\n# TYPE %s counter\n%s %lld\n"
                     "# HELP %s %s\n# TYPE %s gauge\n%s %lld\n",
                     request_counter.name, request_counter.help,
                     request_counter.name, request_counter.name, request_counter.value,
                     active_connections_gauge.name, active_connections_gauge.help,
                     active_connections_gauge.name, active_connections_gauge.name,
                     active_connections_gauge.value);
            mg_http_reply(c, 200, "Content-Type: text/plain; version=0.0.4\r\n", "%s", buffer);
        } else {
            // Count every request that is not a metrics scrape
            increment_metric(&request_counter);
            mg_http_reply(c, 200, "Content-Type: text/plain\r\n", "Hello from C App!\n");
        }
    }
}

int main(void) {
    struct mg_mgr mgr;
    mg_mgr_init(&mgr); // Mongoose 7.x takes only the manager pointer
    mg_http_listen(&mgr, "http://0.0.0.0:8080", fn, NULL); // Listen on port 8080
    printf("Starting server on port 8080...\n");
    for (;;) {
        mg_mgr_poll(&mgr, 1000); // Wait up to 1 second for events
        // In a real app, you'd update gauges like active_connections here
        // and potentially fetch process CPU time.
        // For CPU time, you might parse /proc/self/stat or use platform-specific APIs.
        // Example: set_metric(&active_connections_gauge, get_current_connections());
    }
    // Not reached while the loop above runs forever; add a shutdown flag
    // (e.g., set from a signal handler) if you need a clean exit.
    mg_mgr_free(&mgr);
    return 0;
}
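The main loop above leaves process_cpu_seconds_total as a placeholder. One portable way to fill it on Linux and other POSIX systems is getrusage(), which avoids parsing /proc/self/stat. The following is a minimal sketch; the helper name process_cpu_seconds is an assumption made for illustration, not part of the example server.

```c
#include <sys/resource.h>
#include <sys/time.h>

// Hypothetical helper: returns the user + system CPU time consumed by the
// current process, in seconds, or -1.0 if getrusage() fails.
double process_cpu_seconds(void) {
    struct rusage usage;
    if (getrusage(RUSAGE_SELF, &usage) != 0) {
        return -1.0;
    }
    double user = (double) usage.ru_utime.tv_sec +
                  (double) usage.ru_utime.tv_usec / 1e6;
    double sys = (double) usage.ru_stime.tv_sec +
                 (double) usage.ru_stime.tv_usec / 1e6;
    return user + sys;
}
```

Because the result is fractional, exposing it would mean either storing a double in the metric struct or formatting it separately with %f, rather than reusing the long long value field as-is.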
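To compile the server with Mongoose, which is normally distributed as a single mongoose.c / mongoose.h pair that you build alongside your own source (adjust the path to wherever you placed those files):
gcc -o my_c_app my_c_app.c /path/to/mongoose/mongoose.c -I/path/to/mongoose -pthread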
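The server above formats its two metrics by hand, which gets unwieldy as the list grows. A registry that the endpoint iterates over could look roughly like the sketch below. The RegisteredMetric type, the MetricType enum, and format_metrics are hypothetical names introduced here for illustration; they are not part of Mongoose or any Prometheus client library.

```c
#include <stdio.h>
#include <stddef.h>

// Hypothetical registry entry: like PrometheusMetric, plus an explicit type.
typedef enum { METRIC_COUNTER, METRIC_GAUGE } MetricType;

typedef struct {
    const char *name;
    const char *help;
    MetricType type;
    long long value;
} RegisteredMetric;

// Writes every metric in Prometheus text format into out.
// Returns the number of bytes written (excluding the terminator),
// or -1 if the buffer is too small.
int format_metrics(const RegisteredMetric *metrics, size_t count,
                   char *out, size_t out_size) {
    size_t used = 0;
    if (out_size == 0) {
        return -1;
    }
    out[0] = '\0';
    for (size_t i = 0; i < count; i++) {
        const char *type =
            metrics[i].type == METRIC_COUNTER ? "counter" : "gauge";
        int n = snprintf(out + used, out_size - used,
                         "# HELP %s %s\n# TYPE %s %s\n%s %lld\n",
                         metrics[i].name, metrics[i].help,
                         metrics[i].name, type,
                         metrics[i].name, metrics[i].value);
        if (n < 0 || (size_t) n >= out_size - used) {
            return -1; // Output would have been truncated
        }
        used += (size_t) n;
    }
    return (int) used;
}
```

In the handler, the snprintf call could then be replaced by a single format_metrics call over a static array, replying with a 500 status if it returns -1.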
Prometheus Configuration
Next, configure Prometheus to scrape your C application’s metrics endpoint. This involves adding a scrape job to your prometheus.yml configuration.
scrape_configs:
  - job_name: 'my_c_app'
    static_configs:
      - targets: ['<ec2-instance-ip>:8080'] # Replace with your EC2 instance IP
        labels:
          instance: 'my-c-app-instance-1'
          env: 'production'
Ensure your EC2 security group allows inbound traffic on port 8080 from your Prometheus server’s IP address (or the VPC CIDR if Prometheus is within the same VPC).
Grafana Dashboards
Once Prometheus is scraping your application, you can set up Grafana dashboards. Add Prometheus as a data source in Grafana. Then, create a new dashboard and add panels using PromQL queries. For example, to visualize the request rate:
rate(app_requests_total[5m])
For active connections:
app_active_connections
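To build intuition for what rate() returns, here is a simplified sketch of the arithmetic in C: the total increase of a counter across a window divided by the elapsed time, with a drop in value treated as a counter reset (for example, after the application restarts). The CounterSample type and counter_rate function are illustrative assumptions. Prometheus itself also extrapolates towards the window boundaries, which this sketch omits, so its numbers will not match exactly.

```c
#include <stddef.h>

// Hypothetical sample type: one scraped counter value and its timestamp.
typedef struct {
    double timestamp; // Seconds since the epoch
    double value;     // Counter value at that time
} CounterSample;

// Approximates the per-second rate over time-ordered samples.
// A decrease between consecutive samples is treated as a reset to zero.
// Returns 0.0 with fewer than two samples or a non-positive time span.
double counter_rate(const CounterSample *samples, size_t count) {
    if (count < 2) {
        return 0.0;
    }
    double elapsed = samples[count - 1].timestamp - samples[0].timestamp;
    if (elapsed <= 0.0) {
        return 0.0;
    }
    double increase = 0.0;
    for (size_t i = 1; i < count; i++) {
        double delta = samples[i].value - samples[i - 1].value;
        // After a reset the counter restarted from zero, so the new value
        // itself is the increase since the reset.
        increase += (delta >= 0.0) ? delta : samples[i].value;
    }
    return increase / elapsed;
}
```

For instance, a counter that goes from 100 to 400 over 300 seconds yields a rate of 1 request per second under this approximation.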
Monitoring MongoDB Clusters with AWS CloudWatch and Percona Monitoring and Management (PMM)
Monitoring MongoDB on AWS, especially in a cluster setup (replica sets or sharded clusters), requires a multi-pronged approach. We’ll combine AWS CloudWatch for infrastructure-level metrics and Percona Monitoring and Management (PMM) for deep database-specific insights.
AWS CloudWatch for Infrastructure Metrics
CloudWatch automatically collects basic metrics for EC2 instances hosting your MongoDB nodes. Ensure the CloudWatch agent is installed and configured on your EC2 instances to collect more detailed system-level metrics (disk I/O, CPU utilization, network traffic, memory usage).
Key CloudWatch metrics to monitor for MongoDB nodes:
- EC2 Metrics: CPUUtilization, NetworkIn, NetworkOut, DiskReadOps, DiskWriteOps, DiskReadBytes, DiskWriteBytes. Note that the Disk* metrics in the EC2 namespace cover instance store volumes only; EBS volumes report their I/O under the AWS/EBS namespace.
- Custom Metrics (via CloudWatch Agent):
  - Memory Usage: mem_used_percent
  - Disk Space Usage: disk_used_percent (for partitions hosting MongoDB data and logs)
  - Process Status: Monitor if the mongod process is running.
To configure the CloudWatch agent for custom metrics, you’d typically edit its configuration file (e.g., /opt/aws/amazon-cloudwatch-agent/bin/config.json). The file must be strict JSON, so it cannot contain comments. Here’s a snippet for collecting memory, disk, and process metrics; adjust the two entries under resources to the mount points that hold your MongoDB data and logs:
{
  "agent": {
    "metrics_collection_interval": 60,
    "run_as_user": "cwagent"
  },
  "metrics": {
    "namespace": "MongoDB",
    "append_dimensions": {
      "InstanceId": "${aws:InstanceId}"
    },
    "aggregation_dimensions": [
      [ "InstanceId" ]
    ],
    "metrics_collected": {
      "mem": {
        "measurement": [
          "mem_used_percent"
        ]
      },
      "disk": {
        "measurement": [
          "disk_used_percent"
        ],
        "resources": [
          "/data/db",
          "/var/log/mongodb"
        ]
      },
      "procstat": [
        {
          "exe": "mongod",
          "measurement": [
            "pid_count"
          ],
          "metrics_collection_interval": 60
        }
      ]
    }
  }
}
After updating the configuration, restart the agent: sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json -s.
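To sanity-check the disk_used_percent values on a node, you can compute a comparable figure yourself with the POSIX statvfs() call. The sketch below follows the convention used by df (used space divided by used plus space available to unprivileged users); the agent’s own calculation may differ slightly, and the function name is an assumption made for illustration.

```c
#include <sys/statvfs.h>

// Hypothetical helper: returns the percentage of space used on the
// filesystem that contains path, or -1.0 if it cannot be determined.
double disk_used_percent(const char *path) {
    struct statvfs fs;
    if (statvfs(path, &fs) != 0) {
        return -1.0;
    }
    // Block counts share the same unit, so the unit cancels in the ratio.
    double used = (double) (fs.f_blocks - fs.f_bfree);
    double available = (double) fs.f_bavail;
    if (used + available <= 0.0) {
        return -1.0;
    }
    return 100.0 * used / (used + available);
}
```

Calling disk_used_percent("/data/db") on a MongoDB node would report usage for whichever filesystem holds that directory.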
Percona Monitoring and Management (PMM) for Deep MongoDB Insights
While CloudWatch gives you infrastructure health, PMM provides in-depth database performance metrics. PMM consists of a server (which you can deploy on an EC2 instance) and client agents that run on your MongoDB nodes.
Deployment Steps:
- Deploy PMM Server: The easiest way is to use the official Docker image on an EC2 instance. Ensure this instance has network access to your MongoDB nodes.
docker run -d --name pmm-server \
  --restart always \
  -p 80:80 \
  -p 443:443 \
  -p 3306:3306 \
  -p 9000:9000 \
  percona/pmm-server:latest
Access the PMM UI via http://<pmm-server-ec2-ip>. Follow the on-screen instructions to add your MongoDB instances.
- Install PMM Client Agent on MongoDB Nodes:
On each EC2 instance hosting a MongoDB node, install the PMM client. The exact commands depend on your OS (e.g., Ubuntu, Amazon Linux). Refer to the official PMM documentation for the latest installation instructions.
# Example for Ubuntu/Debian (check PMM docs for current version and exact commands)
wget https://repo.percona.com/apt/percona-release_latest.$(lsb_release -sc)_all.deb
sudo dpkg -i percona-release_latest.$(lsb_release -sc)_all.deb
sudo apt-get update
sudo apt-get install pmm2-client
After installation, connect the client to your PMM server with pmm-admin config (passing the server URL and credentials) so the node is known to the server before you add any services.
- Register MongoDB Instance with PMM Client:
pmm-admin add mongodb --host <mongodb_node_ip> --port 27017 --username <mongo_user> --password <mongo_password> <mongodb_instance_name>
# Example:
# pmm-admin add mongodb --host 10.0.1.10 --port 27017 --username monitor_user --password 'secret_password' mongo-node-1
Replace placeholders with your actual MongoDB connection details. Ensure the MongoDB user has sufficient privileges (e.g., clusterMonitor, readAnyDatabase roles).
Key PMM Dashboards for MongoDB:
- MongoDB Overview: Provides a high-level view of your MongoDB cluster’s health, including query performance, connections, replication lag, and resource utilization.
- MongoDB Query Analytics: Deep dive into slow queries, identifying problematic operations and their execution plans.
- MongoDB Replication: Crucial for replica sets, this dashboard shows replication lag between nodes, helping to identify synchronization issues.
- MongoDB WiredTiger: Monitors the performance of the WiredTiger storage engine, including cache usage, read/write operations, and compression.
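The replication lag shown on these dashboards boils down to a simple comparison: how far each member’s last applied operation trails the primary’s. The sketch below shows that arithmetic in C on values you would take from the optimeDate fields in replSetGetStatus output. The ReplicaMember struct and the function name are hypothetical, and fetching the status from MongoDB is left out.

```c
#include <stddef.h>

// Hypothetical view of one replica set member.
typedef struct {
    const char *name;
    int is_primary;           // Non-zero for the current primary
    long long optime_seconds; // Last applied operation, seconds since epoch
} ReplicaMember;

// Fills lag_out[i] with the lag of member i behind the primary, in seconds.
// The primary itself, and any member that appears ahead of it, get 0.
// Returns 0 on success, or -1 if no primary is present.
int replication_lag_seconds(const ReplicaMember *members, size_t count,
                            long long *lag_out) {
    const ReplicaMember *primary = NULL;
    for (size_t i = 0; i < count; i++) {
        if (members[i].is_primary) {
            primary = &members[i];
            break;
        }
    }
    if (primary == NULL) {
        return -1;
    }
    for (size_t i = 0; i < count; i++) {
        long long lag = primary->optime_seconds - members[i].optime_seconds;
        lag_out[i] = lag > 0 ? lag : 0;
    }
    return 0;
}
```

A return value of -1 is itself worth alerting on, since it means the status document showed no primary at all.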
Alerting Strategies
Combine both CloudWatch and PMM for robust alerting:
- CloudWatch Alarms: Set alarms on critical infrastructure metrics like CPUUtilization > 90%, DiskReadOps > X/sec, or if the mongod process is not running (using the procstat metric). Configure SNS topics for notifications (email, Slack via Lambda).
- PMM Alerts: PMM integrates with Alertmanager. Configure alerts for high replication lag, excessive slow queries, low cache hit rates, or high connection counts.
For example, a CloudWatch alarm on CPUUtilization might trigger an investigation, while a PMM alert on replication lag might indicate a more immediate data consistency issue requiring specific MongoDB troubleshooting.
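When choosing alarm settings, it helps to see how a threshold combines with an evaluation window. CloudWatch alarms can be configured to fire when a given number of datapoints out of the most recent evaluation periods breach the threshold. The sketch below mimics that "M out of N" logic in C for a greater-than comparison. The function name is an assumption, and the handling of missing datapoints that CloudWatch offers is deliberately left out.

```c
#include <stddef.h>

// Hypothetical helper: returns 1 if at least datapoints_to_alarm of the most
// recent evaluation_periods datapoints are above threshold, otherwise 0.
// datapoints is ordered oldest to newest.
int alarm_should_fire(const double *datapoints, size_t count,
                      size_t evaluation_periods, size_t datapoints_to_alarm,
                      double threshold) {
    if (evaluation_periods == 0 || datapoints_to_alarm == 0 ||
        count < evaluation_periods) {
        return 0; // Not enough data to evaluate
    }
    size_t breaching = 0;
    for (size_t i = count - evaluation_periods; i < count; i++) {
        if (datapoints[i] > threshold) {
            breaching++;
        }
    }
    return breaching >= datapoints_to_alarm ? 1 : 0;
}
```

With CPU readings of 50, 95, 92, 88, and 97 percent and a 90% threshold, a "3 out of 5" setting fires while a "4 out of 5" setting does not, which illustrates how the ratio trades sensitivity against noise.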