• Skip to secondary menu
  • Skip to main content
  • Skip to primary sidebar
  • Home
  • Projects
  • Products
  • Themes
  • Tools
  • Request for Quote

Vengala Vinay

Having 12+ Years of Experience in Software Development

  • Home
  • WordPress
  • PHP
    • Codeigniter
  • Django
  • Magento
  • Selenium
  • Server
Home » Server Monitoring Best Practices: Keeping Your Magento 2 App and MySQL Clusters Alive on AWS

Server Monitoring Best Practices: Keeping Your Magento 2 App and MySQL Clusters Alive on AWS

Proactive MySQL Performance Monitoring with Enhanced CloudWatch Metrics

For Magento 2 deployments, especially those leveraging AWS RDS or Aurora, granular visibility into MySQL performance is paramount. Standard CloudWatch metrics offer a baseline, but often fall short of pinpointing the root cause of latency or resource contention. We need to augment these with custom metrics and advanced diagnostics.

A critical area is understanding connection pooling, query execution times, and buffer pool efficiency. We’ll focus on extracting these insights using a combination of AWS CloudWatch Agent and custom SQL queries.

1. Custom CloudWatch Agent Configuration for MySQL Metrics

The CloudWatch Agent can be configured to collect metrics directly from MySQL. This involves setting up a configuration file that specifies the metrics to scrape and how to scrape them. We’ll target key performance indicators like:

  • Threads_connected and Threads_running: To monitor connection load.
  • Slow_queries: To track queries exceeding the long_query_time threshold.
  • Innodb_buffer_pool_wait_free: Indicates if the InnoDB buffer pool needs more memory.
  • Innodb_row_lock_waits: Highlights contention for row locks.
  • Com_select, Com_insert, Com_update, Com_delete: To understand query type distribution.

Create a configuration file, for example, /opt/aws/amazon-cloudwatch-agent/bin/config.json, with the following structure:

Ensure your MySQL user has the necessary `PROCESS` and `REPLICATION CLIENT` privileges for the agent to query performance schema and status variables.

{
  "agent": {
    "metrics_collection_interval": 60,
    "run_as_user": "cwagent"
  },
  "metrics": {
    "namespace": "MySQL/Custom",
    "metrics_collected": {
      "mysql": {
        "measurement": [
          "Threads_connected",
          "Threads_running",
          "Slow_queries",
          "Innodb_buffer_pool_wait_free",
          "Innodb_row_lock_waits",
          "Com_select",
          "Com_insert",
          "Com_update",
          "Com_delete"
        ],
        "resources": [
          "localhost:3306"
        ],
        "user_credentials": {
          "user": "cloudwatch_user",
          "password": "your_secure_password"
        }
      }
    }
  }
}

After creating or updating the configuration file, validate it and restart the agent:

sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json -s
sudo systemctl restart amazon-cloudwatch-agent

2. Real-time Query Analysis with Performance Schema

While CloudWatch metrics provide aggregated views, understanding individual slow queries requires diving into MySQL’s Performance Schema. This powerful tool allows for detailed instrumentation of server events. We’ll set up a query to identify the top N slowest queries within a given timeframe.

First, ensure Performance Schema is enabled and configured. In your my.cnf or my.ini:

[mysqld]
performance_schema = ON
; Consider enabling specific consumers for detailed analysis
; performance_schema_consumer_events_statements_history_long = ON
; performance_schema_consumer_events_statements_digest_text_256 = ON

After restarting MySQL, you can use the following SQL query to find the most time-consuming queries:

SELECT
    digest_text,
    COUNT_STAR,
    SUM_TIMER_WAIT / 1000000000000 AS total_latency_seconds,
    AVG_TIMER_WAIT / 1000000000000 AS avg_latency_seconds,
    MAX_TIMER_WAIT / 1000000000000 AS max_latency_seconds,
    SUM_ROWS_SENT,
    AVG_ROWS_SENT,
    SUM_ROWS_EXAMINED,
    AVG_ROWS_EXAMINED,
    FIRST_SEEN,
    LAST_SEEN
FROM
    performance_schema.events_statements_summary_by_digest
WHERE
    SCHEMA_NAME = 'magento_db_name' -- Replace with your Magento database name
ORDER BY
    total_latency_seconds DESC
LIMIT 10;

To automate this, you can create a cron job that runs this query periodically and sends the results to CloudWatch Logs or a dedicated alerting system. For example, a Python script:

import pymysql
import boto3
import json
import datetime

DB_HOST = 'your_rds_endpoint.rds.amazonaws.com'
DB_USER = 'cloudwatch_user'
DB_PASSWORD = 'your_secure_password'
DB_NAME = 'magento_db_name'
LOG_GROUP_NAME = '/aws/rds/mysql/magento_slow_queries'
REGION_NAME = 'us-east-1'

def get_slow_queries():
    conn = None
    try:
        conn = pymysql.connect(host=DB_HOST, user=DB_USER, password=DB_PASSWORD, database=DB_NAME, cursorclass=pymysql.cursors.DictCursor)
        with conn.cursor() as cursor:
            query = """
            SELECT
                digest_text,
                COUNT_STAR,
                SUM_TIMER_WAIT / 1000000000000 AS total_latency_seconds,
                AVG_TIMER_WAIT / 1000000000000 AS avg_latency_seconds,
                MAX_TIMER_WAIT / 1000000000000 AS max_latency_seconds,
                SUM_ROWS_SENT,
                AVG_ROWS_SENT,
                SUM_ROWS_EXAMINED,
                AVG_ROWS_EXAMINED,
                FIRST_SEEN,
                LAST_SEEN
            FROM
                performance_schema.events_statements_summary_by_digest
            WHERE
                SCHEMA_NAME = %s AND TIMER_WAIT > 0
            ORDER BY
                total_latency_seconds DESC
            LIMIT 10;
            """
            cursor.execute(query, (DB_NAME,))
            results = cursor.fetchall()
            return results
    except Exception as e:
        print(f"Error fetching slow queries: {e}")
        return []
    finally:
        if conn:
            conn.close()

def send_to_cloudwatch_logs(log_events):
    client = boto3.client('logs', region_name=REGION_NAME)
    try:
        response = client.put_log_events(
            logGroupName=LOG_GROUP_NAME,
            logStreamName=f"slow_queries_{datetime.datetime.now().strftime('%Y-%m-%d_%H-%M-%S')}",
            logEvents=log_events
        )
        print(f"Successfully sent logs to CloudWatch. Response: {response}")
    except Exception as e:
        print(f"Error sending logs to CloudWatch: {e}")

if __name__ == "__main__":
    slow_queries = get_slow_queries()
    if slow_queries:
        log_events = []
        timestamp = int(datetime.datetime.now().timestamp() * 1000)
        for query_data in slow_queries:
            message = json.dumps(query_data)
            log_events.append({
                'timestamp': timestamp,
                'message': message
            })
        send_to_cloudwatch_logs(log_events)
    else:
        print("No slow queries found or an error occurred.")

This script should be scheduled via cron to run every 5-15 minutes. Ensure the IAM role attached to your EC2 instance (or the credentials used by your Lambda function) has permissions for logs:CreateLogGroup, logs:CreateLogStream, and logs:PutLogEvents.

Magento 2 Application Performance Monitoring (APM) on AWS

Monitoring the Magento 2 application layer requires a different approach, focusing on request latency, error rates, and resource utilization within the PHP-FPM and web server (Nginx/Apache) processes. AWS provides several tools, but often a dedicated APM solution offers deeper insights.

1. Leveraging AWS X-Ray for Distributed Tracing

AWS X-Ray is invaluable for tracing requests as they traverse different services, including your Magento application, API gateways, and databases. For PHP, you’ll need to install and configure the X-Ray SDK.

Installation (Composer):

composer require aws/aws-sdk-php aws/aws-xray-sdk-php

Configuration in Magento:

You’ll need to integrate X-Ray tracing into your Magento application’s bootstrap process. This typically involves creating a plugin or an event observer that starts and stops X-Ray segments.

<?php
// app/code/Vendor/XrayTrace/etc/di.xml
use Magento\Framework\Event\ObserverInterface;
use Aws\XRay\XRayClient;
use Magento\Framework\App\RequestInterface;
use Magento\Framework\App\ResponseInterface;

class TraceRequest implements ObserverInterface
{
    protected $xrayClient;
    protected $request;
    protected $response;

    public function __construct(
        XRayClient $xrayClient,
        RequestInterface $request,
        ResponseInterface $response
    ) {
        $this->xrayClient = $xrayClient;
        $this->request = $request;
        $this->response = $response;
    }

    public function execute(\Magento\Framework\Event\Observer $observer)
    {
        $this->xrayClient->beginSegment('MagentoRequest');
        $this->xrayClient->getCurrentSegment()->putAnnotation('request_uri', $this->request->getRequestUri());
        $this->xrayClient->getCurrentSegment()->putAnnotation('request_method', $this->request->getMethod());

        // Add an observer to end the segment on response send
        $observer->getEvent()->getFrontController()->getResponse()->setBodyCallback(function($body) {
            $this->xrayClient->endSegment();
            return $body;
        });
    }
}
?>

<?php
// app/code/Vendor/XrayTrace/etc/events.xml
<config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="urn:magento:framework:Event/etc/events.xsd">
    <event name="controller_action_predispatch">
        <observer name="vendor_xraytrace_trace_request" instance="Vendor\XrayTrace\Observer\TraceRequest"/>
    </event>
    <event name="controller_front_send_response_before">
        <observer name="vendor_xraytrace_end_segment" instance="Vendor\XrayTrace\Observer\TraceRequest"/>
    </event>
</config>
?>

<?php
// app/code/Vendor/XrayTrace/etc/di.xml (for XRayClient factory)
<?xml version="1.0"?>
<config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="urn:magento:framework:ObjectManager/etc/config.xsd">
    <type name="Aws\XRay\XRayClient">
        <arguments>
            <argument name="config" xsi:type="array">
                <item name="region" xsi:type="string">us-east-1</item> <!-- Replace with your AWS region -->
                <item name="service" xsi:type="string">XRay</item>
                <item name="version" xsi:type="string">latest</item>
            </argument>
        </arguments>
    </type>
    <virtualtype name="Vendor\XrayTrace\Service\XRayClientFactory" type="Aws\XRay\XRayClient">
        <arguments>
            <argument name="config" xsi:type="array">
                <item name="region" xsi:type="string">us-east-1</item> <!-- Replace with your AWS region -->
            </argument>
        </arguments>
    </virtualtype>
    <preference for="Aws\XRay\XRayClient" type="Vendor\XrayTrace\Service\XRayClientFactory"/>
</config>
?>

You’ll also need to ensure the X-Ray daemon is running on your EC2 instances or that your Lambda functions are configured to send traces to X-Ray. For EC2, this typically involves installing the daemon and running it as a service.

2. PHP-FPM and Nginx/Apache Metrics with CloudWatch Agent

Beyond MySQL, the web server and PHP-FPM are critical bottlenecks. We can extend the CloudWatch Agent configuration to collect metrics from these components.

Nginx Metrics:

Ensure the Nginx `stub_status` module is enabled and configured in your Nginx configuration:

# nginx.conf or a site-specific conf file
http {
    # ... other configurations ...
    server {
        listen 80;
        server_name your_domain.com;

        location /nginx_status {
            stub_status on;
            access_log off;
            allow 127.0.0.1; # Allow access only from localhost for security
            deny all;
        }
        # ... other locations ...
    }
}

Add the following to your CloudWatch Agent config.json under metrics_collected:

    "nginx": {
      "measurement": [
        "active_connections",
        "reading_connections",
        "writing_connections",
        "waiting_connections"
      ],
      "resources": [
        "http://localhost/nginx_status"
      ],
      "metrics_collection_interval": 60
    }

PHP-FPM Metrics:

PHP-FPM exposes its status via a TCP socket or a Unix domain socket. Configure it in your PHP-FPM pool configuration (e.g., /etc/php/7.4/fpm/pool.d/www.conf):

; pm.status_path = /fpm_status
; You might need to configure access for the CloudWatch agent user
; or use a TCP socket for easier access.
; Example using TCP socket:
listen = 127.0.0.1:9000
; listen.mode = 0666
; listen.owner = www-data
; listen.group = www-data
; listen.acl_users =
; listen.acl_groups =
; listen.backlog = 511
; listen.allowed_clients = 127.0.0.1
; listen.tcp_nodelay = 1
; listen.ipv6 = 1

; For status page access via TCP socket:
; You might need to configure a separate web server block (e.g., Nginx)
; to proxy requests to the FPM socket and expose it.
; Or, configure the CloudWatch agent to connect directly if possible.
; A common approach is to use a small web server to expose FPM status.

If using a Unix socket, you might need a small web server (like Nginx) to proxy requests to it. If using TCP, the CloudWatch agent can often connect directly. Add to config.json:

    "phpfpm": {
      "measurement": [
        "accepted_conn",
        "active_processes",
        "idle_processes",
        "max_children",
        "max_requests",
        "pool",
        "process_mpm",
        "slow_requests"
      ],
      "servers": [
        "127.0.0.1:9000"
      ],
      "metrics_collection_interval": 60
    }

After updating the config.json, restart the CloudWatch agent as shown previously.

Alerting Strategies for High Availability

Effective alerting is the final layer of defense. We need to define thresholds for both custom and standard CloudWatch metrics, and integrate with notification services like SNS.

1. Critical MySQL Alerts

Set up alarms for:

  • Threads_running exceeding a high threshold (e.g., > 80% of max_connections).
  • Innodb_buffer_pool_wait_free consistently above 0.
  • Innodb_row_lock_waits showing a significant increase over a short period.
  • Slow_queries count increasing rapidly.
  • RDS/Aurora specific metrics like CPUUtilization, DatabaseConnections, FreeableMemory, ReadIOPS, WriteIOPS.

Example AWS CLI command to create a CloudWatch alarm:

aws cloudwatch put-metric-alarm \
    --alarm-name "Magento-MySQL-High-Connections" \
    --alarm-description "High number of active MySQL connections" \
    --metric-name Threads_running \
    --namespace "MySQL/Custom" \
    --statistic Average \
    --period 300 \
    --threshold 150 \
    --comparison-operator GreaterThanThreshold \
    --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
    --evaluation-periods 2 \
    --datapoints-to-alarm 2 \
    --treat-missing-data notBreaching \
    --alarm-actions arn:aws:sns:us-east-1:123456789012:MagentoAlertsTopic

2. Critical Application Alerts

Configure alarms for:

  • High CPUUtilization on EC2 instances running Magento.
  • High NetworkIn/NetworkOut.
  • High RequestCount and HTTPCode_5XX_Count from Application Load Balancer (ALB) or CloudFront.
  • X-Ray trace errors exceeding a threshold.
  • PHP-FPM slow_requests count.
  • Nginx 5xx error rate (can be derived from ALB logs or custom Nginx logging).

For application-level errors not captured by standard metrics, consider using CloudWatch Logs Insights to query your application logs for specific error patterns (e.g., “Fatal error”, “Exception”) and create metric filters or alarms based on those queries.

By combining granular MySQL performance metrics, application-level tracing with X-Ray, and comprehensive web server/PHP-FPM monitoring, you establish a robust observability stack for your Magento 2 application on AWS, ensuring high availability and rapid issue resolution.

Primary Sidebar

A little about the Author

Having 12+ Years of Experience in Software Development, Vinay is a principal software architect, senior systems engineer, and elite technical consultant. He specializes in bespoke PHP/WordPress development, high-performance Magento 2 & Shopify architectures, custom plugin/theme development from scratch, and legacy code modernization (including VB6, VB.NET, PyQt, and Crystal Reports). Known for solving complex database bottlenecks, speed optimization (Core Web Vitals), and advanced security code auditing, Vinay engineers production-ready systems designed to scale under heavy concurrent load conditions.



Chat on WhatsApp

Recent Posts

  • Troubleshooting namespace class loading collisions in production when using modern Sage Roots modern environments wrappers
  • Troubleshooting WooCommerce hook execution loops in production when using modern Classic Core PHP wrappers
  • Implementing automated compliance reporting for custom internal server status logs ledgers using dompdf library
  • Step-by-Step Guide to building a custom CSV bulk exporter block for Gutenberg using SolidJS high-performance reactive components
  • Troubleshooting Zend memory limit exceed in production when using modern Carbon Fields custom wrappers wrappers

Categories

  • apache (1)
  • Business & Monetization (390)
  • Centos (4)
  • Comparisons & Decision Making (55)
  • Debian (2)
  • Debugging & Troubleshooting (609)
  • Desktop Applications (14)
  • DevOps (7)
  • DevOps & Cloud Scaling (962)
  • Django (1)
  • Laravel (4)
  • Migration & Architecture (192)
  • Mobile Applications (24)
  • MySQL (1)
  • Performance & Optimization (824)
  • PHP (5)
  • PHP Development (30)
  • Plugins & Themes (244)
  • Programming Languages (9)
  • Python (20)
  • Ruby on Rails (1)
  • Security & Compliance (587)
  • SEO & Growth (492)
  • Server (23)
  • Ubuntu (9)
  • VB6 & VB.NET (8)
  • Web Applications & Frontend (19)
  • Web Assembly (Wasm) (2)
  • WordPress (22)
  • WordPress Plugin Development (132)
  • WordPress Theme Development (357)

Recent Posts

  • Troubleshooting namespace class loading collisions in production when using modern Sage Roots modern environments wrappers
  • Troubleshooting WooCommerce hook execution loops in production when using modern Classic Core PHP wrappers
  • Implementing automated compliance reporting for custom internal server status logs ledgers using dompdf library

Top Categories

  • DevOps & Cloud Scaling (962)
  • Performance & Optimization (824)
  • Debugging & Troubleshooting (609)
  • Security & Compliance (587)
  • SEO & Growth (492)
  • Business & Monetization (390)

Our Products

  • ERP & LMS Systems (4)
  • Directories & Marketplaces (4)
  • Healthcare Portals (3)
  • Point of Sale (POS) (2)
  • E-Commerce Engines (2)

Our Services

  • E-Commerce Development (10)
  • WordPress Development (8)
  • Python & Desktop GUI (7)
  • General Consulting (7)
  • Legacy Modernization (5)
  • Mobile App Development (4)

Copyright © 2026 · Vinay Vengala