The Ultimate DevOps Playbook: Tuning Nginx, Gunicorn/FPM, and Elasticsearch on AWS for Laravel

Nginx Tuning for Laravel Applications

Optimizing Nginx is paramount for serving Laravel applications efficiently, especially under load. We’ll focus on key directives that impact request handling, caching, and resource utilization on AWS.

Worker Processes and Connections

The worker_processes directive sets how many worker processes Nginx spawns. The usual recommendation is one per CPU core on your EC2 instance; setting it to ‘auto’ makes Nginx detect the core count itself, so the same configuration stays valid when you change instance sizes. The worker_connections directive caps the number of simultaneous connections a single worker process can hold open, and that count includes connections to upstream servers such as PHP-FPM, not only client connections. Set it high enough for peak traffic, but keep it within the worker’s open-file limit (see worker_rlimit_nofile), since every connection consumes a file descriptor.

A good starting point for a t3.medium instance (2 vCPUs) would be:

worker_processes auto;
events {
    worker_connections 4096;
    multi_accept on;
}

The multi_accept on; directive makes a worker accept all pending new connections at once instead of one at a time. This can help during bursts of new connections, although the gain depends on the workload.
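To make the relationship between these two directives concrete, here is a minimal sketch of the arithmetic. The helper name nginx_conn_ceiling is hypothetical (it is not an Nginx tool), and the figures are the t3.medium values used above.

```shell
# Upper bound on simultaneous connections across all workers:
# worker_processes * worker_connections.
# nginx_conn_ceiling is a hypothetical helper, not part of Nginx.
nginx_conn_ceiling() {
    workers="$1"      # what "auto" resolves to, normally the vCPU count
    connections="$2"  # the worker_connections value
    echo $(( workers * connections ))
}

# t3.medium: 2 vCPUs, 4096 connections per worker.
nginx_conn_ceiling 2 4096   # prints 8192
```

Because a request passed to PHP-FPM holds both a client connection and an upstream connection, the number of clients served at once is closer to half of this ceiling.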

Keepalive Connections

Enabling HTTP keep-alive connections reduces the overhead of establishing new TCP connections for each request. This is particularly beneficial for applications with many small requests.

Tune keepalive_timeout and keepalive_requests:

http {
    # ... other http directives
    keepalive_timeout 65;
    keepalive_requests 1000;
    # ...
}

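A keepalive_timeout of 65 seconds is what many distribution-packaged configurations ship (Nginx’s built-in default is 75 seconds), and 1000 requests per keep-alive connection matches the default in recent Nginx releases. If Nginx sits behind an Application Load Balancer, keep keepalive_timeout above the load balancer’s idle timeout (60 seconds by default) so that Nginx does not close connections the balancer still considers open.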

Buffering and Gzip Compression

Nginx buffering lets Nginx read the whole response from the upstream server quickly and then deliver it to the client at the client’s own pace, which frees the PHP-FPM worker sooner. Gzip compression significantly reduces bandwidth usage and speeds up content delivery.

location ~ \.php$ {
    # ... other php directives
    fastcgi_buffers 8 16k;
    fastcgi_buffer_size 32k;
    # ...
}

gzip on;
gzip_vary on;
gzip_proxied any;
gzip_comp_level 6;
gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript image/svg+xml;
gzip_min_length 1000;
gzip_disable "msie6";

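fastcgi_buffer_size sets the buffer used for the first part of the PHP-FPM response (the headers), and fastcgi_buffers sets the number and size of the buffers for the rest; a response that does not fit in them is spilled to a temporary file on disk. For gzip, gzip_comp_level 6 offers a good balance between compression ratio and CPU usage. gzip_min_length prevents compressing very small responses, where the overhead can outweigh the benefit.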

Client Body and Header Limits

Setting appropriate limits for client request bodies and headers prevents denial-of-service attacks and resource exhaustion from malformed or excessively large requests.

client_max_body_size 100M;
client_body_buffer_size 128k;
client_header_buffer_size 128k;
large_client_header_buffers 4 128k;

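client_max_body_size should be set according to your application’s needs (e.g., file uploads); requests above the limit are rejected with a 413 error, and PHP’s own upload_max_filesize and post_max_size must allow the same size. The header buffer sizes shown are far larger than Nginx’s defaults (1k for client_header_buffer_size and 4 8k for large_client_header_buffers). Since the goal here is to limit resource use, lower them unless your application really sends very large cookies or headers.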

Gunicorn/PHP-FPM Tuning for Laravel

The application server depends on the language: Gunicorn serves Python (WSGI) applications, while PHP-FPM serves PHP. Laravel is a PHP framework, so this section covers PHP-FPM.

PHP-FPM Process Management

PHP-FPM’s process manager controls how worker processes are spawned and managed. The pm setting can be ‘static’, ‘dynamic’, or ‘ondemand’. ‘dynamic’ is often a good balance for web applications.

Key directives in /etc/php/[version]/fpm/pool.d/www.conf:

pm = dynamic
pm.max_children = 50
pm.start_servers = 5
pm.min_spare_servers = 2
pm.max_spare_servers = 10
; pm.process_idle_timeout is only used when pm = ondemand
pm.process_idle_timeout = 10s
pm.max_requests = 500

pm.max_children is the most critical setting because it caps how many requests PHP-FPM can process at once. Size it from your server’s RAM and the memory footprint of your Laravel application. A common formula is (Total RAM - Reserved RAM) / Average PHP Process Size, where the reserved portion covers the OS, Nginx, and anything else running on the instance. Monitor your server’s memory usage and adjust accordingly. pm.max_requests limits the damage from memory leaks by respawning each worker after it has served that many requests.
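The formula above can be sketched as a small shell function. The helper name fpm_max_children and the example figures (1024 MB reserved, 60 MB per worker) are assumptions for illustration; measure the real per-process size on your own server, for example by averaging the resident memory of running PHP-FPM workers.

```shell
# (Total RAM - Reserved RAM) / Average PHP Process Size, all in MB.
# fpm_max_children is a hypothetical helper name.
fpm_max_children() {
    total_mb="$1"     # total RAM on the instance
    reserved_mb="$2"  # RAM kept back for the OS, Nginx, and other services
    avg_mb="$3"       # average resident size of one PHP-FPM worker
    echo $(( (total_mb - reserved_mb) / avg_mb ))   # integer division rounds down
}

# Assumed figures: t3.medium (4096 MB), 1024 MB reserved, 60 MB per worker.
fpm_max_children 4096 1024 60   # prints 51
```

Rounding down is deliberate: overshooting pm.max_children risks swapping, which costs far more than one fewer worker.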

PHP-FPM Slowlog and Request Termination

Identifying slow PHP scripts is crucial for optimization. Enabling the slowlog and setting request termination timeouts can help diagnose and prevent runaway scripts.

request_slowlog_timeout = 10s
request_slowlog_trace_depth = 20
request_terminate_timeout = 60s

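request_slowlog_timeout logs a stack trace for any request that exceeds this duration; it only takes effect when the pool’s slowlog directive points to a log file. request_terminate_timeout kills a request that runs longer than this, preventing it from hogging a worker. Set it longer than your longest expected script execution time, and keep Nginx’s fastcgi_read_timeout (60 seconds by default) in line with it so that both sides give up at about the same point.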

Elasticsearch Tuning on AWS

For Laravel applications leveraging Elasticsearch for search, optimizing the Elasticsearch cluster is vital. This section focuses on JVM heap settings and shard allocation.

JVM Heap Size Configuration

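The Java Virtual Machine (JVM) heap size directly impacts Elasticsearch’s performance and stability. It’s configured in jvm.options (typically found in /etc/elasticsearch/jvm.options or similar); on recent Elasticsearch versions, overrides are usually placed in a separate file under the jvm.options.d directory next to it.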

Set Xms (initial heap size) and Xmx (maximum heap size) to the same value to prevent heap resizing. A common recommendation is to give the heap no more than 50% of the system’s RAM, leaving the rest for the filesystem cache that Lucene relies on, and to stay below roughly 30-32GB, the point past which the JVM can no longer use compressed ordinary object pointers (compressed oops).

-Xms8g
-Xmx8g

For an EC2 instance with 16GB RAM, setting the heap to 8GB is a good starting point. Always monitor heap usage and garbage collection activity.
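As a rough sketch of that sizing rule, the following function halves the instance RAM and caps the result. The helper name es_heap_gb and the 31GB cap are assumptions; the cap is simply a conservative pick inside the 30-32GB range mentioned above.

```shell
# Half of RAM, capped to stay under the compressed-oops threshold.
# es_heap_gb is a hypothetical helper; 31 is an assumed conservative cap.
es_heap_gb() {
    ram_gb="$1"
    half=$(( ram_gb / 2 ))
    cap=31
    if [ "$half" -gt "$cap" ]; then
        half="$cap"
    fi
    echo "$half"
}

# Emit the two jvm.options lines for a 16GB instance.
heap="$(es_heap_gb 16)"
printf '%s\n' "-Xms${heap}g" "-Xmx${heap}g"
```

For 16GB this reproduces the -Xms8g and -Xmx8g lines shown above; for a 128GB instance the function returns 31 rather than 64.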

Shard Allocation and Recovery

Efficient shard allocation and recovery are key to cluster health and performance. These settings are managed via the Cluster Settings API.

To view current settings:

curl -X GET "localhost:9200/_cluster/settings?pretty"

To adjust shard allocation throttling (e.g., during rebalancing or node failures):

curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "cluster.routing.allocation.node_concurrent_recoveries": "2",
    "cluster.routing.allocation.cluster_concurrent_rebalance": "2"
  }
}'

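node_concurrent_recoveries limits the number of shards that can be recovered on a single node simultaneously. cluster_concurrent_rebalance limits concurrent shard rebalancing operations across the cluster. A value of 2 is the Elasticsearch default for both, so treat it as the baseline: raise it only if your network bandwidth and disk I/O have headroom during recovery, and lower it if recoveries are starving search or indexing traffic.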

Index Buffer and Merge Settings

Tuning the index buffer and merge settings can significantly impact indexing performance.

curl -X PUT "localhost:9200/_all/_settings" -H 'Content-Type: application/json' -d'
{
  "index": {
    "refresh_interval": "30s",
    "merge.scheduler.max_thread_count": "1",
    "merge.scheduler.auto_throttle": "false"
  }
}'

refresh_interval controls how often new documents become searchable. Increasing it (e.g., to 30s) can improve indexing throughput, at the cost of new documents taking longer to appear in search results. The index buffer is not an index-level setting and cannot be changed through this API; it is a node-level setting, indices.memory.index_buffer_size, configured in elasticsearch.yml (10% of the heap by default). Setting merge.scheduler.max_thread_count to 1 is the usual recommendation for spinning disks; on SSD-backed volumes the default is normally the better choice. Disabling auto_throttle removes merge throttling, which can sometimes improve indexing throughput but requires careful monitoring.
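One common way to use refresh_interval is to relax it only for the duration of a bulk import and restore it afterwards. The sketch below wraps that in two shell helpers. The function names, the ES_HOST variable, and the index name "products" are hypothetical, and the curl call assumes an unauthenticated cluster on localhost as in the examples above.

```shell
# Hypothetical helpers; ES_HOST defaults to the host used in this article.
ES_HOST="${ES_HOST:-localhost:9200}"

# Build the JSON body for a refresh_interval change.
refresh_body() {
    printf '{"index":{"refresh_interval":"%s"}}' "$1"
}

# Apply a refresh interval to a single index via the index settings API.
set_refresh() {
    curl -s -X PUT "$ES_HOST/$1/_settings" \
        -H 'Content-Type: application/json' \
        -d "$(refresh_body "$2")"
}

# Usage around a bulk import (not executed here):
#   set_refresh products 30s   # fewer refreshes while indexing
#   ... run the import ...
#   set_refresh products 1s    # back to the default interval
```

Targeting a named index rather than _all keeps the change away from indices that still need near-real-time search.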

Monitoring and Iterative Tuning

All these tuning parameters are starting points; continuous monitoring is essential. Use AWS CloudWatch for EC2 metrics (CPU and network are available out of the box, while memory requires the CloudWatch agent), Nginx’s stub_status module, the PHP-FPM status page (pm.status_path), and Elasticsearch’s monitoring APIs (or tools like Cerebro/Kibana). Observe the impact of each change under realistic load and iterate.