The Ultimate DevOps Playbook: Tuning Nginx, Gunicorn/FPM, and Elasticsearch on Linode for C

Nginx as a High-Performance Frontend Proxy

For a robust web application stack, Nginx serves as an exceptional frontend proxy, efficiently handling static assets, SSL termination, and load balancing. Its event-driven architecture makes it ideal for high concurrency. We’ll focus on tuning key directives for optimal performance on a Linode instance.

Nginx Configuration Tuning

The primary configuration file is typically located at /etc/nginx/nginx.conf. We’ll adjust the events and http blocks.

Worker Processes and Connections

The worker_processes directive should ideally match the number of CPU cores; run nproc to check the count, or set the directive to auto and let Nginx detect it. The worker_connections directive sets the maximum number of simultaneous connections each worker process can hold open. The theoretical ceiling is therefore worker_processes * worker_connections, but that figure includes connections to upstream servers, so a proxied request consumes two. Every connection also needs a file descriptor, so raise your system’s file descriptor limits to match.
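The connection arithmetic can be sketched in a few lines of Python. This is a rough estimate rather than Nginx's own accounting: the function name is hypothetical, and the halving assumes each proxied client holds one client-side and one upstream connection.

```python
def max_clients(worker_processes: int, worker_connections: int,
                proxied: bool = True) -> int:
    """Estimate the upper bound on simultaneous clients.

    worker_connections counts every connection a worker holds, including
    those to upstream servers, so a proxied client uses roughly two.
    """
    total = worker_processes * worker_connections
    return total // 2 if proxied else total


if __name__ == "__main__":
    # 4 cores with the worker_connections value used in this guide
    print(max_clients(4, 4096, proxied=False))  # 16384
    print(max_clients(4, 4096))                 # 8192
```

Treat the result as a ceiling for capacity planning; real limits are usually reached earlier through RAM, file descriptors, or backend throughput.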

Tuning `nginx.conf`

user www-data;
worker_processes auto; # Or set to the number of CPU cores, e.g., 4
pid /run/nginx.pid;
include /etc/nginx/modules-enabled/*.conf;

events {
    worker_connections 4096; # Adjust based on expected load and system limits
    multi_accept on;
}

http {
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;

    server_tokens off; # Hide Nginx version for security

    # Gzip compression for text-based assets
    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;

    # Include other configuration files
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    # SSL configuration (if applicable)
    # ssl_protocols TLSv1.2 TLSv1.3;
    # ssl_prefer_server_ciphers on;
    # ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384;

    access_log off; # Disable access logs for performance if not strictly needed for debugging

    # Include virtual host configurations
    include /etc/nginx/conf.d/*.conf;
    include /etc/nginx/sites-enabled/*;
}

System File Descriptor Limits

To support the high number of connections, we need to increase the system’s file descriptor limit. Edit /etc/security/limits.conf, which governs PAM login sessions:

* soft nofile 65536
* hard nofile 65536
root soft nofile 65536
root hard nofile 65536

Services started by systemd ignore limits.conf, so if Nginx is managed by systemd, set the limit in a drop-in override. A single LimitNOFILE value sets both the soft and the hard limit:

sudo systemctl edit nginx.service
# Add the following lines to the override file:
[Service]
LimitNOFILE=65536

After making these changes, reload the systemd daemon and restart Nginx:

sudo systemctl daemon-reload
sudo systemctl restart nginx
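To confirm that the new limit actually reached the running process, you can read /proc/<pid>/limits. The sketch below is one way to do that on Linux; the function names are hypothetical, and it assumes the usual "Max open files" row layout of that file.

```python
from pathlib import Path
from typing import Optional, Tuple


def parse_nofile(limits_text: str) -> Optional[Tuple[str, str]]:
    """Return (soft, hard) from the 'Max open files' row, or None."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            fields = line.split()
            # fields: ['Max', 'open', 'files', soft, hard, 'files']
            return fields[3], fields[4]
    return None


def nofile_for_pid(pid: int) -> Optional[Tuple[str, str]]:
    """Read the limit the kernel applied to a running process."""
    return parse_nofile(Path(f"/proc/{pid}/limits").read_text())
```

For example, nofile_for_pid(int(Path("/run/nginx.pid").read_text())) inspects the Nginx master process, using the pid file path from the nginx.conf above. Values are returned as strings because the kernel may report "unlimited".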

Gunicorn Tuning for Python Applications

Gunicorn (Green Unicorn) is a Python WSGI HTTP Server. Its performance is heavily influenced by the number of worker processes and the worker type.

Worker Processes and Type

The recommended number of worker processes is typically (2 * number_of_cores) + 1. For I/O-bound applications, using the gevent or eventlet worker types can significantly improve concurrency by using asynchronous I/O.

Starting Gunicorn with Optimal Settings

# Example for a Django/Flask app located at /srv/my_app
# Assuming 4 CPU cores
# Using 'sync' workers for CPU-bound tasks or simplicity
gunicorn --workers 9 --bind 0.0.0.0:8000 my_app.wsgi:application

# Using 'gevent' workers for I/O-bound tasks
# Ensure gevent is installed: pip install gevent
gunicorn --worker-class gevent --workers 9 --bind 0.0.0.0:8000 my_app.wsgi:application
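Instead of hard-coding the worker count on the command line, the same rule of thumb can live in a Gunicorn configuration file, which is plain Python. The following is a minimal sketch; the file name gunicorn.conf.py and the helper recommended_workers are assumptions for illustration, while workers, worker_class, and bind are standard Gunicorn settings.

```python
# gunicorn.conf.py (hypothetical file name and location)
import multiprocessing


def recommended_workers(cores: int) -> int:
    """Apply the (2 * cores) + 1 rule of thumb."""
    return 2 * cores + 1


# Gunicorn reads these module-level names as settings.
workers = recommended_workers(multiprocessing.cpu_count())
worker_class = "gevent"   # requires: pip install gevent; use "sync" otherwise
bind = "0.0.0.0:8000"
```

Start the server with gunicorn -c gunicorn.conf.py my_app.wsgi:application. Computing the value at startup means the count follows the instance size if you later resize the Linode.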

For production, Gunicorn is usually run via a systemd service. Here’s a sample /etc/systemd/system/gunicorn.service:

[Unit]
Description=Gunicorn instance to serve my_app
After=network.target

[Service]
User=my_app_user
Group=www-data
WorkingDirectory=/srv/my_app
Environment="PATH=/srv/my_app/venv/bin"
# my_app_user needs write access to the directory that holds the socket
ExecStart=/srv/my_app/venv/bin/gunicorn --workers 9 --worker-class gevent --bind unix:/run/gunicorn.sock my_app.wsgi:application

[Install]
WantedBy=multi-user.target

And the corresponding Nginx configuration snippet in /etc/nginx/sites-available/my_app:

server {
    listen 80;
    server_name your_domain.com www.your_domain.com;

    location /static/ {
        alias /srv/my_app/static/;
    }

    location / {
        proxy_set_header Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_pass http://unix:/run/gunicorn.sock;
    }
}

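Reload systemd so it picks up the new unit, start Gunicorn and enable it at boot, then enable the Nginx site, test the configuration, and reload Nginx:

sudo systemctl daemon-reload
sudo systemctl enable --now gunicorn
sudo ln -s /etc/nginx/sites-available/my_app /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx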

PHP-FPM Tuning for PHP Applications

For PHP applications, PHP-FPM (FastCGI Process Manager) is the standard. Tuning its process manager settings is crucial for handling concurrent requests effectively.

Process Manager Settings

The primary configuration file is typically /etc/php/X.Y/fpm/php-fpm.conf, with pool configurations in /etc/php/X.Y/fpm/pool.d/www.conf (replace X.Y with your PHP version, e.g., 7.4 or 8.1).

Tuning `www.conf`

The pm (process manager) directive can be set to static, dynamic, or ondemand. For most Linode instances, dynamic offers a good balance.

[global]
; ... other global settings

[www]
; Choose one of the process management modes:
; static: a fixed number of processes are spawned.
; dynamic: processes are spawned dynamically based on load.
; ondemand: processes are spawned on demand.
pm = dynamic

; If pm is dynamic, these are the values that are used:
; pm.max_children: The maximum number of children that can be alive at the same time.
; pm.start_servers: The number of children created on startup.
; pm.min_spare_servers: The minimum number of idle server processes.
; pm.max_spare_servers: The maximum number of idle server processes.
; pm.max_requests: The number of requests each child process should execute before respawning.
; pm.process_idle_timeout is used only when pm = ondemand:
; it is the number of seconds after which an idle process will be killed.

pm.max_children = 100       ; Adjust based on available RAM and expected load
pm.start_servers = 5        ; Initial number of workers
pm.min_spare_servers = 2    ; Minimum idle workers
pm.max_spare_servers = 10   ; Maximum idle workers
pm.process_idle_timeout = 10s ; Timeout for idle processes (only applies when pm = ondemand)
pm.max_requests = 500       ; Restart worker after X requests to limit the impact of memory leaks

; Listen on a Unix socket for Nginx to connect to
listen = /run/php/phpX.Y-fpm.sock ; Replace X.Y with your PHP version

; Set user and group
user = www-data
group = www-data

; Set permissions for the socket
listen.owner = www-data
listen.group = www-data
listen.mode = 0660

; Other useful settings
request_terminate_timeout = 30s ; Timeout for script execution
; php_admin_value[memory_limit] = 256M ; Example: set memory limit per request
; php_admin_flag[display_errors] = off ; Disable error display in production

After modifying www.conf, restart PHP-FPM:

sudo systemctl restart phpX.Y-fpm
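A common way to pick pm.max_children is to divide the RAM you are willing to give PHP by the average memory footprint of one worker. The sketch below shows that calculation; the function name and the example figures are hypothetical, and you should measure the per-worker footprint on your own application.

```python
def max_children(ram_for_php_mb: int, avg_worker_mb: int) -> int:
    """RAM reserved for PHP-FPM divided by the average worker footprint."""
    if avg_worker_mb <= 0:
        raise ValueError("avg_worker_mb must be positive")
    return max(1, ram_for_php_mb // avg_worker_mb)


if __name__ == "__main__":
    # Hypothetical figures: 2 GB set aside for PHP, 40 MB per worker
    print(max_children(2048, 40))  # 51
```

By this estimate, the value of 100 used above would need about 4 GB of RAM for PHP alone at 40 MB per worker, so scale it down on smaller instances.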

Nginx Configuration for PHP-FPM

Your Nginx site configuration (e.g., /etc/nginx/sites-available/my_php_app) should include a section to pass PHP requests to the FPM socket:

server {
    listen 80;
    server_name your_php_app.com;
    root /var/www/my_php_app;
    index index.php index.html index.htm;

    location / {
        try_files $uri $uri/ /index.php?$query_string;
    }

    location ~ \.php$ {
        include snippets/fastcgi-php.conf;
        # Make sure this matches the 'listen' directive in your php-fpm pool config
        fastcgi_pass unix:/run/php/phpX.Y-fpm.sock; # Replace X.Y
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        include fastcgi_params;
    }

    location ~ /\.ht {
        deny all;
    }
}

Reload Nginx after applying changes:

sudo systemctl reload nginx

Elasticsearch Performance Tuning

Elasticsearch, while powerful, can be resource-intensive. Tuning involves JVM heap size, file system cache, and shard allocation.

JVM Heap Size

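The JVM heap size is critical. Set it to no more than 50% of your system’s RAM, and keep it below roughly 30-32GB: above that threshold the JVM can no longer use compressed ordinary object pointers (compressed oops), and every object reference grows from 4 to 8 bytes. Set -Xms and -Xmx to the same value so the heap is never resized at runtime. This is configured in /etc/elasticsearch/jvm.options (recent Elasticsearch versions prefer a custom file under /etc/elasticsearch/jvm.options.d/).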

-Xms4g
-Xmx4g

For a 16GB Linode instance, 4GB to 8GB is a reasonable starting point. Adjust based on your actual RAM and workload. Restart Elasticsearch after changing:

sudo systemctl restart elasticsearch
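The heap rule can be expressed as a small calculation. This is a sketch under stated assumptions: the function names are hypothetical, and the 30 GB cap is simply the lower end of the 30-32GB range mentioned above, not a value read from the JVM.

```python
from typing import List


def heap_size_gb(total_ram_gb: float, cap_gb: float = 30.0) -> int:
    """Half of RAM, rounded down, capped below the compressed-oops limit."""
    return max(1, int(min(total_ram_gb / 2, cap_gb)))


def jvm_flags(total_ram_gb: float) -> List[str]:
    """Return matching -Xms/-Xmx lines; equal values avoid heap resizing."""
    size = heap_size_gb(total_ram_gb)
    return [f"-Xms{size}g", f"-Xmx{size}g"]


if __name__ == "__main__":
    print(jvm_flags(8))    # ['-Xms4g', '-Xmx4g']
    print(jvm_flags(16))   # ['-Xms8g', '-Xmx8g']
```

The result is the upper bound of the rule; if other services share the instance, starting lower leaves more RAM for the file system cache.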

File System Cache

Elasticsearch relies heavily on the operating system’s file system cache. Ensure you have sufficient free RAM for this. Avoid running other memory-hungry applications on the same server as Elasticsearch.

Shard Allocation and Index Settings

The number of shards per index significantly impacts performance. Aim for fewer, larger shards rather than many small ones. For time-series data, consider index lifecycle management (ILM) to manage older indices.

Example: Setting Shard Count and Replicas

When creating an index, you can specify the number of primary shards and replicas. For a cluster with one node, set replicas to 0. For a multi-node cluster, 1 or 2 replicas are common.

PUT /my_index
{
  "settings": {
    "index": {
      "number_of_shards": 3,
      "number_of_replicas": 1
    }
  }
}

You can also update the replica count dynamically. The primary shard count is fixed at creation; changing it later requires the split or shrink APIs, or a reindex into a new index:

PUT /my_index/_settings
{
  "index": {
    "number_of_replicas": 2
  }
}
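If you create indices from scripts, the replica rule above can be encoded when building the request body. The helper below is a hypothetical sketch that only constructs the JSON; sending it to the cluster is left to whichever HTTP client you already use.

```python
import json
from typing import Any, Dict


def index_settings(shards: int, node_count: int) -> Dict[str, Any]:
    """Build a create-index body: no replicas on a single node, else one."""
    replicas = 0 if node_count <= 1 else 1
    return {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
            }
        }
    }


if __name__ == "__main__":
    # Body for PUT /my_index on a three-node cluster
    print(json.dumps(index_settings(shards=3, node_count=3), indent=2))
```

On a single node a replica can never be assigned, so requesting one leaves the index in yellow health; setting it to 0 avoids that.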

Swappiness

Elasticsearch performs poorly when it swaps. Set vm.swappiness to 1, which tells the kernel to swap only under severe memory pressure while still allowing it in an emergency. Edit /etc/sysctl.conf:

vm.swappiness = 1

Apply the change:

sudo sysctl -p
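To verify that the setting took effect, you can read the live value from /proc/sys/vm/swappiness. The following sketch assumes a Linux host; the function names are hypothetical.

```python
from pathlib import Path


def parse_swappiness(text: str) -> int:
    """Convert the contents of /proc/sys/vm/swappiness to an integer."""
    return int(text.strip())


def swappiness_ok(value: int, limit: int = 1) -> bool:
    """True when the live value is at or below the intended limit."""
    return value <= limit


def current_swappiness(path: str = "/proc/sys/vm/swappiness") -> int:
    """Read the value the kernel is using right now."""
    return parse_swappiness(Path(path).read_text())
```

Calling swappiness_ok(current_swappiness()) after sysctl -p should return True; if it does not, check whether a file under /etc/sysctl.d/ overrides the value.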

Monitoring and Iteration

Performance tuning is an iterative process. Continuously monitor your system’s resource utilization (CPU, RAM, I/O, network) using tools like htop, vmstat, iostat, and application-specific metrics. For Elasticsearch, use the _cat APIs and Kibana’s monitoring tools. Regularly review logs for errors or performance bottlenecks. Adjust configurations based on observed behavior and load patterns.
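As one small example of a scripted check, the sketch below computes the share of RAM still available from the contents of /proc/meminfo, which matters for the file system cache that Elasticsearch depends on. The function name is hypothetical, and it assumes a kernel that reports the MemAvailable field.

```python
from typing import Dict


def memory_available_pct(meminfo_text: str) -> float:
    """Return MemAvailable as a percentage of MemTotal."""
    values: Dict[str, int] = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[key] = int(fields[0])  # first field is the value in kB
    return 100.0 * values["MemAvailable"] / values["MemTotal"]


if __name__ == "__main__":
    sample = "MemTotal:       16000000 kB\nMemAvailable:    4000000 kB\n"
    print(memory_available_pct(sample))  # 25.0
```

On a live host, pass in the text of /proc/meminfo and log the result at intervals to see how tuning changes affect headroom over time.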