Vengala Vinay

Server Monitoring Best Practices: Keeping Your Laravel App and Elasticsearch Clusters Alive on AWS

Proactive Health Checks for Laravel Applications on AWS EC2

Maintaining the health of a Laravel application deployed on AWS EC2 instances requires a multi-layered monitoring strategy. Beyond basic CPU and memory utilization, we need to ensure the application itself is responsive, its dependencies are functioning, and potential issues are flagged before they impact end-users. This involves implementing application-level health checks and integrating them with AWS CloudWatch.

Implementing a Laravel Health Check Endpoint

A robust health check endpoint within your Laravel application is the first line of defense. This endpoint should not only verify that the web server is responding but also check critical dependencies like the database connection, Redis cache, and any external APIs your application relies on. We’ll create a dedicated controller and route for this.

Health Check Controller

Create a new controller, for example, app/Http/Controllers/HealthCheckController.php.

<?php
// app/Http/Controllers/HealthCheckController.php
namespace App\Http\Controllers;

use Illuminate\Http\JsonResponse;
use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Log;
use Illuminate\Support\Facades\Redis;
use Throwable;

class HealthCheckController extends Controller
{
    /**
     * Checks the health of the application and its dependencies.
     *
     * @return \Illuminate\Http\JsonResponse
     */
    public function index(): JsonResponse
    {
        $status = 'ok';
        $checks = [];

        // 1. Database Connection Check
        try {
            DB::connection()->getPdo();
            $checks['database'] = 'connected';
        } catch (Throwable $e) {
            $status = 'error';
            // The endpoint is unauthenticated, so exception details go to the log only.
            $checks['database'] = 'failed';
            Log::error('Database connection failed: ' . $e->getMessage());
        }

        // 2. Cache (Redis) Connection Check
        try {
            Redis::connection()->ping();
            $checks['cache'] = 'connected';
        } catch (Throwable $e) {
            $status = 'error';
            $checks['cache'] = 'failed';
            Log::error('Redis connection failed: ' . $e->getMessage());
        }

        // 3. Basic Application Logic Check. config() returns null for a missing
        //    key instead of throwing, so test the value directly.
        if (config('app.name') !== null) {
            $checks['app_config'] = 'accessible';
        } else {
            $status = 'error';
            $checks['app_config'] = 'failed';
            Log::error('Application config check failed: app.name is not set');
        }

        // Add more checks as needed (e.g., external API calls, queue status)

        return response()->json([
            'status' => $status,
            'checks' => $checks,
        ], $status === 'ok' ? 200 : 500);
    }
}

Define the Health Check Route

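Add a route in routes/api.php or routes/web.php. Routes registered in routes/api.php are served under the /api prefix by default, so the route below answers at /api/health. Register it in routes/web.php instead if you want the bare /health path that the monitoring examples later in this article use, or adjust their URLs to match.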

// routes/api.php
use App\Http\Controllers\HealthCheckController;

Route::get('/health', [HealthCheckController::class, 'index']);

Ensure this route is accessible without authentication for monitoring purposes. If your application requires authentication for API routes, you might need to create a separate route group or middleware for health checks.
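Before wiring the endpoint into any monitoring, it can help to pin down how a client should read its response. The following is a minimal Python sketch, not part of the Laravel application. It assumes the JSON shape produced by the controller above (a top-level status plus a checks map whose failing entries start with "failed"). The function name summarize_health is hypothetical.

```python
import json


def summarize_health(status_code: int, body: str) -> tuple[bool, list[str]]:
    """Return (healthy, failing_checks) for a response from the health endpoint."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False, ["<unparseable body>"]
    if not isinstance(payload, dict):
        return False, ["<unexpected payload>"]

    checks = payload.get("checks")
    if not isinstance(checks, dict):
        checks = {}

    # A check counts as failing when its value starts with "failed",
    # which is how the controller reports a broken dependency.
    failing = sorted(
        name for name, result in checks.items()
        if str(result).startswith("failed")
    )
    healthy = status_code == 200 and payload.get("status") == "ok" and not failing
    return healthy, failing


if __name__ == "__main__":
    sample = json.dumps({
        "status": "error",
        "checks": {"database": "connected", "cache": "failed"},
    })
    print(summarize_health(500, sample))  # (False, ['cache'])
```

Returning the names of the failing checks, rather than a bare boolean, lets an alert say which dependency is down.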

Configuring AWS CloudWatch Alarms

AWS CloudWatch is essential for monitoring EC2 instances and application health. We’ll set up alarms based on metrics and the health check endpoint.

CloudWatch Agent for System Metrics

The CloudWatch agent collects system-level metrics (CPU, memory, disk, network) and can also collect custom logs. Install and configure the agent on each EC2 instance running your Laravel application.

Installation (Amazon Linux 2 Example)

sudo yum update -y
sudo rpm -Uvh https://s3.amazonaws.com/amazoncloudwatch-agent/amazon_linux/amd64/latest/amazon-cloudwatch-agent.rpm

Configuration

Create a configuration file (e.g., /opt/aws/cloudwatch/cloudwatch-agent.json). This example collects basic system metrics and logs from Laravel’s storage/logs/laravel.log.

{
  "agent": {
    "metrics_collection_interval": 60,
    "run_as_user": "cwagent"
  },
  "metrics": {
    "namespace": "LaravelApp/EC2",
    "append_dimensions": {
      "InstanceId": "${aws:InstanceId}"
    },
    "metrics_collected": {
      "cpu": {
        "measurement": [
          "cpu_usage_idle",
          "cpu_usage_user",
          "cpu_usage_system",
          "cpu_usage_iowait"
        ],
        "totalcpu": true
      },
      "disk": {
        "measurement": [
          "free",
          "used_percent",
          "inodes_free",
          "inodes_used"
        ],
        "resources": [
          "/",
          "/var/log"
        ],
        "ignore_file_system_types": [
          "sysfs",
          "devtmpfs",
          "tmpfs",
          "devfs",
          "iso9660",
          "overlay",
          "aufs",
          "squashfs"
        ]
      },
      "mem": {
        "measurement": [
          "mem_used_percent",
          "mem_used",
          "mem_total",
          "mem_cached",
          "mem_free"
        ]
      },
      "net": {
        "measurement": [
          "bytes_sent",
          "bytes_recv",
          "packets_sent",
          "packets_recv"
        ]
      }
    }
  },
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/var/www/html/storage/logs/laravel.log",
            "log_group_name": "LaravelApp/EC2/Logs",
            "log_stream_name": "{instance_id}/laravel",
            "timestamp_format": "%Y-%m-%d %H:%M:%S",
            "timezone": "UTC"
          }
        ]
      }
    }
  }
}

Start the Agent

sudo /opt/aws/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:/opt/aws/cloudwatch/cloudwatch-agent.json -s

CloudWatch Alarms for System Metrics

Navigate to the CloudWatch console in AWS. Create alarms for critical system metrics. For example, an alarm for high CPU utilization.

Example: High CPU Utilization Alarm

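Metric: CPUUtilization (under the AWS/EC2 namespace). The agent does not publish a metric with this name: under LaravelApp/EC2 it reports the cpu_usage_* measurements configured above, so an alarm on agent data would use one of those instead (for example, cpu_usage_idle falling below 15%).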

Statistic: Average

Period: 5 minutes

Threshold type: Static

Whenever CPU utilization is: Greater/Equal

than: 85%

Datapoints to alarm: 3 out of 3 (This means the condition must be true for 15 consecutive minutes).

Actions: Configure an SNS topic to send notifications (e.g., email, Slack via Lambda integration) when the alarm state changes.
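The same alarm can be defined in code rather than in the console. The sketch below builds the keyword arguments for boto3's put_metric_alarm from the settings listed above. The alarm name, instance ID, and topic ARN are placeholders, and the helper names are hypothetical. The CloudWatch call sits in its own function so the definition can be inspected without AWS credentials.

```python
def build_cpu_alarm(instance_id: str, sns_topic_arn: str) -> dict:
    """Keyword arguments for put_metric_alarm, mirroring the console settings."""
    return {
        "AlarmName": f"laravel-high-cpu-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,           # 5 minutes, expressed in seconds
        "EvaluationPeriods": 3,  # the "out of 3" part
        "DatapointsToAlarm": 3,  # the "3" part
        "Threshold": 85.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [sns_topic_arn],
    }


def minutes_to_alarm(alarm: dict) -> float:
    """Shortest sustained breach, in minutes, before the alarm can fire."""
    return alarm["Period"] * alarm["DatapointsToAlarm"] / 60


def create_alarm(alarm: dict) -> None:
    """Send the definition to CloudWatch (needs boto3, credentials, and a region)."""
    import boto3

    boto3.client("cloudwatch").put_metric_alarm(**alarm)
```

For example, minutes_to_alarm(build_cpu_alarm("i-0123456789abcdef0", topic_arn)) returns 15.0, which matches the 15 consecutive minutes noted above. Shortening Period or DatapointsToAlarm makes the alarm faster but noisier.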

CloudWatch Alarms for Application Health Endpoint

Monitoring the application’s health endpoint requires a slightly different approach, as it’s not a direct EC2 metric. We can use CloudWatch Synthetics Canaries or a custom Lambda function triggered by EventBridge.

Option 1: CloudWatch Synthetics Canaries

Canaries are ideal for simulating user interactions or API calls. We can create a canary to hit our /health endpoint.

Canary Configuration (Node.js Example)

In the CloudWatch console, create a Synthetics Canary. Choose “API Canary” and provide the URL of your health check endpoint (e.g., http://your-ec2-instance-ip/health or http://your-app-domain.com/health).

// Example Canary script (Node.js), following the structure of the API canary blueprint
const synthetics = require('Synthetics');
const log = require('SyntheticsLogger');

// Replace hostname, port, protocol, and path with your actual endpoint
const requestOptions = {
    hostname: 'YOUR_APP_HOSTNAME_OR_IP',
    method: 'GET',
    path: '/health',
    port: 80,
    protocol: 'http:',
    headers: {
        'Accept': 'application/json'
    }
};

// Validates the HTTP status code and the custom 'status' field in the JSON body.
// Rejecting the promise fails the step, which fails the canary run.
const validateHealth = (res) => {
    return new Promise((resolve, reject) => {
        if (res.statusCode < 200 || res.statusCode > 299) {
            reject(new Error(`Received non-success status code: ${res.statusCode}`));
            return;
        }

        let body = '';
        res.on('data', (chunk) => { body += chunk; });
        res.on('end', () => {
            let healthData;
            try {
                healthData = JSON.parse(body);
            } catch (parseError) {
                reject(new Error('Failed to parse health check response body.'));
                return;
            }

            if (healthData.status === 'ok') {
                log.info('Application health check successful.');
                resolve();
            } else {
                reject(new Error(`Application reported status: ${healthData.status}`));
            }
        });
    });
};

exports.handler = async () => {
    log.info(`Executing health check request to ${requestOptions.hostname}${requestOptions.path}`);
    await synthetics.executeHttpStep('Verify health endpoint', requestOptions, validateHealth);
};

Configure the canary to run on a frequent schedule (e.g., every 1 or 5 minutes). Then create a CloudWatch alarm on the canary’s SuccessPercent metric (or its Failed metric) in the CloudWatchSynthetics namespace. The alarm triggers when a canary run fails, whether from a non-2xx response, a JSON parsing error, or our custom ‘error’ status.

Option 2: Lambda Function with EventBridge

Alternatively, you can use a Lambda function to poll the health endpoint and trigger an alarm.

Lambda Function (Python Example)

import json
import os
import urllib3

# Replace with your application's health check endpoint
HEALTH_CHECK_URL = os.environ.get('HEALTH_CHECK_URL', 'http://YOUR_APP_URL_OR_IP/health')
HTTP_CLIENT = urllib3.PoolManager()

def lambda_handler(event, context):
    try:
        r = HTTP_CLIENT.request('GET', HEALTH_CHECK_URL, timeout=10)
        
        if r.status != 200:
            raise Exception(f"Health check returned status code: {r.status}")
            
        response_body = json.loads(r.data.decode('utf-8'))
        
        if response_body.get('status') != 'ok':
            raise Exception(f"Application reported status: {response_body.get('status')}")
            
        print(f"Health check successful: {response_body}")
        return {
            'statusCode': 200,
            'body': json.dumps('Health check OK')
        }
        
    except Exception as e:
        print(f"Health check failed: {e}")
        # This exception will cause the Lambda invocation to fail,
        # which can be monitored by CloudWatch.
        raise e

Deploy this Lambda function. Then create an EventBridge (formerly CloudWatch Events) rule that triggers it on a schedule (e.g., every minute). Configure a CloudWatch alarm on the function’s Errors metric in the AWS/Lambda namespace, not on Invocations. When the function raises an exception because the health check failed, the error count increases and the alarm triggers.
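The scheduling step can be scripted as well. This is a minimal sketch assuming boto3 clients for EventBridge ("events") and Lambda are passed in. The rule name, statement ID, and target ID are hypothetical. The three calls are put_rule, add_permission, and put_targets.

```python
def schedule_health_check(events_client, lambda_client, function_arn: str,
                          rule_name: str = "laravel-health-check-every-minute") -> str:
    """Create a one-minute schedule rule and point it at the health-check Lambda."""
    rule_arn = events_client.put_rule(
        Name=rule_name,
        ScheduleExpression="rate(1 minute)",
        State="ENABLED",
    )["RuleArn"]

    # EventBridge needs explicit permission to invoke the function.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=f"{rule_name}-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )

    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "health-check-lambda", "Arn": function_arn}],
    )
    return rule_arn
```

A call would look like schedule_health_check(boto3.client("events"), boto3.client("lambda"), function_arn). The function is not idempotent: add_permission raises an error if the statement ID already exists, so rerunning it requires removing the old permission first.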

Monitoring Elasticsearch Clusters on AWS

Elasticsearch clusters, especially when used with Laravel (e.g., for Scout or custom search), require dedicated monitoring. AWS offers a managed option, Amazon OpenSearch Service (formerly Amazon Elasticsearch Service), which publishes built-in metrics to CloudWatch. We’ll focus on key metrics and setting up alarms.

Key Elasticsearch Metrics to Monitor

  • CPU Utilization: High CPU can indicate inefficient queries, indexing bottlenecks, or insufficient resources.
  • JVM Memory Pressure: Crucial for Elasticsearch performance. High pressure leads to garbage collection pauses and instability.
  • Disk Space Used: Running out of disk space will halt indexing and searching.
  • Indexing Rate: Monitor the rate at which documents are being indexed. Sudden drops or spikes can indicate issues.
  • Search Rate: Similar to indexing rate, monitor search request volume.
  • Search Latency: High latency directly impacts user experience.
  • Shards: Monitor the number of unassigned shards, which indicates cluster health problems.
  • Cluster Status: Elasticsearch reports its health as Green, Yellow, or Red. Red is critical.
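The last two items can also be read directly from the cluster. The sketch below classifies a response from the cluster health API (GET /_cluster/health), using its status and unassigned_shards fields. Fetching the response is left out because it depends on how your domain authenticates requests. The function name and the severity labels are hypothetical.

```python
def classify_cluster_health(health: dict) -> tuple[str, list[str]]:
    """Map a _cluster/health response to (severity, reasons)."""
    reasons = []
    status = health.get("status", "unknown")
    unassigned = health.get("unassigned_shards", 0)

    if status == "red":
        reasons.append("status is red: at least one primary shard is unassigned")
    elif status == "yellow":
        reasons.append("status is yellow: at least one replica shard is unassigned")
    elif status != "green":
        reasons.append(f"unexpected cluster status: {status!r}")

    if unassigned > 0:
        reasons.append(f"{unassigned} unassigned shard(s)")

    if status == "green" and not reasons:
        severity = "ok"
    elif status in ("green", "yellow"):
        severity = "warning"
    else:
        # Red, missing, or unrecognized status: treat as critical.
        severity = "critical"
    return severity, reasons
```

Treating a missing or unrecognized status as critical is a deliberate choice here: a monitor that cannot read the cluster state should not report it as healthy.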

Accessing Elasticsearch Metrics in CloudWatch

When you create an OpenSearch Service domain, it automatically publishes metrics to CloudWatch under the AWS/ES namespace, which kept its Elasticsearch-era name after the service was renamed. The service reports these metrics at 60-second intervals, so there is no separate detailed monitoring setting to enable, unlike on EC2.

CloudWatch Alarms for Elasticsearch

Create alarms for the key metrics identified above.

Example: High JVM Memory Pressure Alarm

Metric: JVMMemoryPressure (under the AWS/ES namespace).

Statistic: Average

Period: 5 minutes

Threshold type: Static

Whenever JVM Memory Pressure is: Greater/Equal

than: 80% (Adjust based on your cluster’s baseline performance)

Datapoints to alarm: 3 out of 3

Actions: Configure an SNS topic for notifications.

Example: Unassigned Shards Alarm

Metric: Shards.unassigned (under the AWS/ES namespace).

Statistic: Maximum

Period: 1 minute

Threshold type: Static

Whenever Unassigned Shards is: Greater

than: 0

Datapoints to alarm: 1 out of 1

Actions: Configure an SNS topic for notifications. Unassigned shards are a critical indicator of cluster health issues.

Custom Elasticsearch Monitoring with CloudWatch Logs Insights

For more granular insights, you can configure your Elasticsearch cluster to send slow logs (search and index) to CloudWatch Logs. This allows you to use CloudWatch Logs Insights to query and analyze slow queries.

Enabling Slow Logs

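In the OpenSearch Service console, open the domain’s Logs tab and enable publishing of search slow logs and index slow logs to a CloudWatch Logs log group. Publishing alone produces no entries. You also need to set thresholds on each index through the index settings API, for example index.search.slowlog.threshold.query.warn and index.indexing.slowlog.threshold.index.warn (e.g., 10s for warn, 5s for info).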

Querying Slow Logs with Logs Insights

Once logs are flowing into CloudWatch Logs, use Logs Insights to find slow queries. For example, to find search queries taking longer than 5 seconds:

fields @timestamp, @message
| parse @message "took_millis[*]" as took_millis
| filter took_millis > 5000
| sort @timestamp desc
| limit 50

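CloudWatch alarms evaluate metrics, not Logs Insights query results. To alert on slow queries, create a metric filter on the slow log group that counts matching events, then alarm on that metric (e.g., alarm if more than X slow queries are recorded in Y minutes).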
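One way to turn slow-log volume into something an alarm can watch is a CloudWatch Logs metric filter. The sketch below assumes a boto3 "logs" client and a log group that receives only search slow logs, so every event counts as one slow query. The filter name, metric name, and namespace are hypothetical.

```python
def create_slow_query_metric(logs_client, log_group_name: str) -> dict:
    """Publish a metric that counts each slow-log event as one slow query."""
    transformation = {
        "metricName": "SlowSearchQueries",       # hypothetical metric name
        "metricNamespace": "LaravelApp/Search",  # hypothetical namespace
        "metricValue": "1",                      # add 1 per matching event
        "defaultValue": 0.0,                     # report 0 when nothing matches
    }
    logs_client.put_metric_filter(
        logGroupName=log_group_name,
        filterName="slow-search-queries",
        filterPattern="",  # an empty pattern matches every log event
        metricTransformations=[transformation],
    )
    return transformation
```

An alarm on the Sum of this metric over a chosen period then expresses "more than X slow queries in Y minutes". If the log group also carries other entries, replace the empty pattern with one that matches only slow-log lines.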

Centralized Logging and Alerting Strategy

A robust monitoring strategy is incomplete without a centralized logging and alerting system. AWS services like CloudWatch Logs, SNS, and potentially third-party tools like Datadog or Grafana (with Prometheus) play a crucial role.

Leveraging CloudWatch Logs for Centralization

As shown with the CloudWatch agent and the Elasticsearch slow logs, send application and system logs to CloudWatch Logs so that both can be searched and correlated in one place.

SNS for Alert Fan-out

Amazon Simple Notification Service (SNS) is the backbone of our alerting. When a CloudWatch Alarm state changes (e.g., from OK to ALARM), it publishes a message to an SNS topic. This topic can then fan out notifications to multiple subscribers:

  • Email: For immediate human notification.
  • SMS: For critical alerts requiring urgent attention.
  • AWS Lambda: To trigger automated remediation actions (e.g., restarting a service, scaling up an instance).
  • SQS: To queue alerts for processing by other services.
  • HTTP/S endpoints: To integrate with external incident management tools (e.g., PagerDuty, Opsgenie).

Automated Remediation with Lambda

For common, predictable issues, automate remediation. For instance, if a health check fails repeatedly, a Lambda function subscribed to the SNS topic could attempt to restart the Laravel application process (e.g., via Systems Manager Run Command) or trigger an Auto Scaling event.

Example: Lambda to Restart Laravel Process

This Lambda function would be triggered by an SNS notification from a CloudWatch Alarm. It would use the AWS SDK to interact with Systems Manager.

import boto3
import json
import os

ssm = boto3.client('ssm')
instance_id = os.environ['TARGET_INSTANCE_ID'] # Or get dynamically from alarm event
command_document = "AWS-RunShellScript"
command_content = "systemctl restart php-fpm && systemctl restart httpd" # Amazon Linux 2 service names; Run Command executes as root. Adjust for your web server/PHP-FPM setup

def lambda_handler(event, context):
    print("Received event: " + json.dumps(event, indent=2))

    try:
        response = ssm.send_command(
            InstanceIds=[instance_id],
            DocumentName=command_document,
            Parameters={'commands': [command_content]},
            TimeoutSeconds=600,
            Comment='Automated restart of Laravel application due to health alert'
        )
        command_id = response['Command']['CommandId']
        print(f"Sent command {command_id} to instance {instance_id}")
        return {
            'statusCode': 200,
            'body': json.dumps(f'Command {command_id} sent successfully.')
        }
    except Exception as e:
        print(f"Error sending command: {e}")
        raise e

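Ensure the Lambda function’s IAM role has permissions for ssm:SendCommand and potentially ec2:DescribeInstances if you need to dynamically find instance IDs. The target instance must also be managed by Systems Manager: SSM Agent has to be running, and the instance needs an instance profile that allows it (for example, one with the AmazonSSMManagedInstanceCore policy attached).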
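The comment in the function above mentions reading the instance ID from the alarm event instead of an environment variable. Here is a minimal sketch of that lookup. It assumes the alarm notification arrives via SNS with the alarm JSON in Records[].Sns.Message, and that the alarm's metric carries an InstanceId dimension. The function name is hypothetical.

```python
import json


def instance_ids_from_alarm_event(event: dict) -> list[str]:
    """Extract InstanceId dimensions from CloudWatch alarm messages sent via SNS."""
    instance_ids = []
    for record in event.get("Records", []):
        message = json.loads(record["Sns"]["Message"])
        if message.get("NewStateValue") != "ALARM":
            continue  # ignore OK and INSUFFICIENT_DATA transitions

        for dimension in message.get("Trigger", {}).get("Dimensions", []):
            # Accept either key casing rather than assume one.
            name = dimension.get("name") or dimension.get("Name")
            value = dimension.get("value") or dimension.get("Value")
            if name == "InstanceId" and value:
                instance_ids.append(value)
    return instance_ids
```

Alarms on canary or Lambda metrics carry no InstanceId dimension, so this returns an empty list for them. In that case the remediation function still needs another way to pick its target, such as the environment variable used above.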

Conclusion

A comprehensive monitoring strategy for Laravel applications and Elasticsearch on AWS involves proactive health checks at the application level, robust system metrics collection via the CloudWatch agent, and intelligent alerting. By combining CloudWatch Synthetics, alarms, and automated remediation with Lambda and SNS, you can significantly improve the reliability and uptime of your critical services.

A little about the Author

Having 12+ Years of Experience in Software Development, Vinay is a principal software architect, senior systems engineer, and elite technical consultant. He specializes in bespoke PHP/WordPress development, high-performance Magento 2 & Shopify architectures, custom plugin/theme development from scratch, and legacy code modernization (including VB6, VB.NET, PyQt, and Crystal Reports). Known for solving complex database bottlenecks, speed optimization (Core Web Vitals), and advanced security code auditing, Vinay engineers production-ready systems designed to scale under heavy concurrent load conditions.



Copyright © 2026 · Vinay Vengala