The Ultimate DevOps Playbook: Tuning Nginx, Gunicorn/FPM, and MongoDB on Google Cloud for PHP

Nginx Configuration for High-Traffic PHP Applications

Optimizing Nginx for a PHP application on Google Cloud involves three key areas: efficient static file serving, robust proxying to your application server, and effective caching strategies. We’ll focus on the common setup where Nginx passes PHP requests to PHP-FPM over FastCGI. Gunicorn, covered later, is a WSGI server for Python frameworks such as Django and Flask; it does not run PHP, so that section applies only if your stack also includes Python services behind the same Nginx.

Static File Serving and Compression

Nginx excels at serving static assets. Ensure your `nginx.conf` or site-specific configuration file leverages `sendfile` for zero-copy transfers and `gzip` for compression. Set appropriate cache headers to allow browsers and intermediate proxies to cache static content effectively.

# /etc/nginx/nginx.conf or /etc/nginx/sites-available/your-app.conf

user www-data;
worker_processes auto;
pid /run/nginx.pid;
include /etc/nginx/modules-enabled/*.conf;

events {
    worker_connections 768;
    multi_accept on;
}

http {
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;

    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    # Gzip Compression
    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript image/svg+xml;

    # SSL Configuration (if applicable)
    # ssl_certificate /etc/letsencrypt/live/yourdomain.com/fullchain.pem;
    # ssl_certificate_key /etc/letsencrypt/live/yourdomain.com/privkey.pem;
    # include /etc/letsencrypt/options-ssl-nginx.conf;
    # ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem;

    # Logging
    access_log /var/log/nginx/access.log;
    error_log /var/log/nginx/error.log;

    # Buffers and Timeouts
    client_body_buffer_size 10K;
    client_header_buffer_size 1K;
    large_client_header_buffers 2 4K;
    client_max_body_size 100m; # Adjust as needed
    send_timeout 3;
    client_body_timeout 10;
    client_header_timeout 10;
    lingering_close off;
    lingering_time 30;

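    # Example virtual host. "location" blocks are only valid inside a "server"
    # block; in practice this usually lives in /etc/nginx/sites-available/your-app.conf.
    server {
        listen 80;
        server_name yourdomain.com;
        root /path/to/your/app/public; # Placeholder: your application's web root
        index index.php;

        # Static File Caching
        # "immutable" is only safe when asset filenames change on every release.
        location ~* \.(css|js|jpg|jpeg|png|gif|ico|svg|woff|woff2|ttf|eot)$ {
            expires 1y;
            add_header Cache-Control "public, immutable";
            access_log off;
            log_not_found off;
        }

        # Proxy to Application Server (PHP-FPM example)
        location / {
            try_files $uri $uri/ /index.php?$query_string;
        }

        location ~ \.php$ {
            # On Debian/Ubuntu this snippet pulls in fastcgi.conf, which already
            # sets SCRIPT_FILENAME and the standard FastCGI parameters.
            include snippets/fastcgi-php.conf;
            # Adjust socket path based on your PHP-FPM configuration
            fastcgi_pass unix:/run/php/php8.1-fpm.sock;
        }

        # Deny access to hidden files
        location ~ /\. {
            deny all;
        }
    }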

    # Include other configurations
    include /etc/nginx/conf.d/*.conf;
    include /etc/nginx/sites-enabled/*;
}

Tuning PHP-FPM

PHP-FPM (FastCGI Process Manager) is crucial for handling PHP requests. The primary tuning parameters are found in its configuration file, typically `php-fpm.conf` or within pool configuration files in `php-fpm.d/`. The `pm` (process manager) setting is key.

pm = dynamic: Recommended for most scenarios. PHP-FPM will manage the number of child processes based on traffic.
pm = static: All processes are kept alive. Good for predictable, high-load environments where you want minimal latency, but can be resource-intensive.
pm = ondemand: Processes are created only when a request is received and killed after a period of inactivity. Saves resources but can introduce latency on initial requests.

When using `pm = dynamic`, the following parameters are critical:

pm.max_children: The maximum number of child processes that will be created. This is the most important setting. Too high, and you’ll exhaust server memory. Too low, and you’ll queue requests. A good starting point is to monitor your server’s RAM and set this to a value that leaves ample room for the OS and other services.
pm.start_servers: The number of child processes to start when PHP-FPM starts.
pm.min_spare_servers: The minimum number of idle server processes. If fewer are idle, PHP-FPM spawns more.
pm.max_spare_servers: The maximum number of idle server processes. If more are idle, PHP-FPM kills the excess.
pm.process_idle_timeout: The number of seconds after which an idle child process will be killed. This setting applies only when `pm = ondemand`.
pm.max_requests: The number of requests each child process should execute before respawning. This limits the impact of memory leaks in your application or in third-party extensions.

Here’s an example of a tuned pool configuration (e.g., `/etc/php/8.1/fpm/pool.d/www.conf`):

; /etc/php/8.1/fpm/pool.d/www.conf

[www]
user = www-data
group = www-data
listen = /run/php/php8.1-fpm.sock
listen.owner = www-data
listen.group = www-data
listen.mode = 0660

; Process Manager settings
pm = dynamic
pm.max_children = 150       ; Adjust based on your server's RAM (e.g., 150 * ~20MB/process = ~3GB)
pm.start_servers = 20
pm.min_spare_servers = 10
pm.max_spare_servers = 30
pm.process_idle_timeout = 10s ; Only used when pm = ondemand; ignored with pm = dynamic
pm.max_requests = 500

; Request handling
request_terminate_timeout = 60s ; Adjust based on your application's longest-running tasks
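request_slowlog_timeout = 10s   ; Log requests exceeding this time to slowlog file
slowlog = /var/log/php/$pool.slow.log ; Example path; a slowlog file must be set when request_slowlog_timeout is used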

; Error logging
catch_workers_output = yes
; error_log and log_level are global directives: set them in php-fpm.conf, not in a pool file
; error_log = /var/log/php/php-fpm.log
; log_level = notice

; Other settings
; php_admin_value[memory_limit] = 256M
; php_admin_flag[display_errors] = off

Monitoring and Iteration: After applying these settings, monitor your server’s CPU, memory, and request latency. Use tools like htop, vmstat, and Nginx’s access logs (with appropriate logging formats) to identify bottlenecks. Adjust pm.max_children and other parameters iteratively. A common approach is to set pm.max_children such that the total memory usage of all PHP-FPM processes plus the OS and other services stays below 80% of total RAM under peak load.
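The 80% rule of thumb above can be turned into a quick calculation. The following is a minimal sketch, not an official PHP-FPM formula: the function name, the reserved-memory figure, and the 20 MB per-worker size are illustrative assumptions, and the per-worker size should come from measuring your own workers under load.

```python
def estimate_max_children(total_ram_mb: int, reserved_mb: int,
                          avg_worker_mb: float, ram_ceiling: float = 0.80) -> int:
    """Return a starting value for pm.max_children.

    total_ram_mb  -- total RAM on the VM
    reserved_mb   -- memory set aside for the OS, Nginx, and other services (assumption)
    avg_worker_mb -- measured average resident size of one PHP-FPM worker
    ram_ceiling   -- fraction of total RAM allowed at peak load (0.80 per the text)
    """
    if avg_worker_mb <= 0:
        raise ValueError("avg_worker_mb must be positive")
    # Memory left for PHP-FPM after applying the ceiling and the reservation.
    budget_mb = total_ram_mb * ram_ceiling - reserved_mb
    # Never return less than one worker, even on an undersized machine.
    return max(1, int(budget_mb // avg_worker_mb))


# Hypothetical 8 GB VM with 1.5 GB reserved and ~20 MB per worker.
print(estimate_max_children(total_ram_mb=8192, reserved_mb=1536, avg_worker_mb=20))
```

Treat the result as an upper bound to start from, then lower or raise it based on what monitoring shows under real traffic.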

Gunicorn Tuning for Python Applications

When using Gunicorn as your WSGI HTTP Server for Python applications (e.g., Django, Flask), tuning is primarily about worker processes and their types. Gunicorn typically runs behind Nginx, which forwards requests to Gunicorn.

Worker Processes and Types

Gunicorn’s performance is heavily influenced by the number and type of worker processes it spawns. The general recommendation is to use sync workers for CPU-bound tasks and gevent or eventlet workers for I/O-bound tasks.

sync (Synchronous): The default worker type. Each worker can handle one request at a time. This is simple but can be a bottleneck if requests are slow.
eventlet: Uses the eventlet library to provide a coroutine-based asynchronous framework. Can handle many concurrent connections efficiently. Requires the eventlet package to be installed.
gevent: Similar to eventlet, using the gevent library for green threads. Often preferred for its performance and ease of use. Requires the gevent package to be installed.

The number of workers is typically calculated based on the number of CPU cores available. A common formula is (2 * number_of_cores) + 1. However, this is a starting point and should be adjusted based on whether your application is CPU-bound or I/O-bound, and the available memory.
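To make the memory caveat concrete, here is a small sketch that caps the CPU-based formula by available memory. The function name and the per-worker memory figures are hypothetical; measure your own workers before relying on any number.

```python
def suggested_workers(cores: int, available_mb: int, per_worker_mb: int) -> int:
    """Return a starting worker count: (2 * cores) + 1, capped by memory.

    cores         -- number of CPU cores on the VM
    available_mb  -- memory you are willing to give to Gunicorn workers (assumption)
    per_worker_mb -- measured resident size of one worker (assumption)
    """
    if per_worker_mb <= 0:
        raise ValueError("per_worker_mb must be positive")
    cpu_based = 2 * cores + 1
    # At least one worker, even if the memory budget is very small.
    memory_based = max(1, available_mb // per_worker_mb)
    return min(cpu_based, memory_based)


# 4 cores with plenty of memory: the CPU formula wins.
print(suggested_workers(cores=4, available_mb=4096, per_worker_mb=150))
# 4 cores but only 1 GB for workers of ~256 MB each: memory is the limit.
print(suggested_workers(cores=4, available_mb=1024, per_worker_mb=256))
```

With async worker classes such as gevent, each worker handles many connections, so fewer workers than the formula suggests may be enough.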

Gunicorn Configuration Example

You can configure Gunicorn via command-line arguments or a Python configuration file. Using a configuration file is generally cleaner for production deployments.

# gunicorn_config.py

import multiprocessing

# Number of worker processes
# A common starting point is (2 * number_of_cores) + 1
# For a 4-core VM, this would be 9.
workers = multiprocessing.cpu_count() * 2 + 1

# Worker class: 'sync', 'eventlet', or 'gevent'
# 'gevent' is often a good choice for I/O-bound applications
# (requires the gevent package in the same virtualenv).
worker_class = 'gevent'

# Bind to a socket or IP address and port
# For production, binding to a Unix socket is often preferred when Nginx is on the same host.
# If Nginx is on a different host or you need to bind to a specific IP:
# bind = "0.0.0.0:8000"
bind = "unix:/path/to/your/app.sock" # Example Unix socket

# Maximum number of requests a worker will process before restarting
max_requests = 1000
max_requests_jitter = 50 # Randomize max_requests to spread restarts

# Timeout for worker requests
# Adjust based on your application's longest-running tasks.
timeout = 120

# Logging configuration
loglevel = 'info'
accesslog = '/var/log/gunicorn/access.log'
errorlog = '/var/log/gunicorn/error.log'

# Other useful settings
# daemon = True # Run as a daemon (often managed by systemd instead)
# pidfile = '/var/run/gunicorn.pid'
# user = 'your_app_user'
# group = 'your_app_group'

Nginx Proxy Configuration for Gunicorn:

# /etc/nginx/sites-available/your-app.conf

server {
    listen 80;
    server_name yourdomain.com;

    # Static files
    location /static/ {
        alias /path/to/your/app/static/;
        expires 1y;
        add_header Cache-Control "public, immutable";
    }

    # Media files (if applicable)
    location /media/ {
        alias /path/to/your/app/media/;
        expires 1y;
        add_header Cache-Control "public, immutable";
    }

    # Proxy requests to Gunicorn
    location / {
        proxy_set_header Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # If Gunicorn is using a Unix socket:
        proxy_pass http://unix:/path/to/your/app.sock;

        # If Gunicorn is using a TCP socket:
        # proxy_pass http://127.0.0.1:8000;

        proxy_read_timeout 300s; # Increase if your app has long-running requests
        proxy_connect_timeout 75s;
    }
}

Systemd Service for Gunicorn: It’s best practice to manage Gunicorn with systemd.

# /etc/systemd/system/gunicorn.service

[Unit]
Description=Gunicorn instance to serve your-app
After=network.target

[Service]
User=your_app_user
Group=your_app_group
WorkingDirectory=/path/to/your/app
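# Replace your_app.wsgi:application with your app's WSGI entry point.
# systemd does not support trailing comments on ExecStart lines.
# --workers and --bind override the values in gunicorn_config.py; drop them to use the file's settings.
ExecStart=/path/to/your/venv/bin/gunicorn \
    --workers 9 \
    --bind unix:/path/to/your/app.sock \
    --config /path/to/your/app/gunicorn_config.py \
    your_app.wsgi:application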

# Optional: If using a TCP socket instead of Unix socket
# ExecStart=/path/to/your/venv/bin/gunicorn \
#     --workers 9 \
#     --bind 127.0.0.1:8000 \
#     --config /path/to/your/app/gunicorn_config.py \
#     your_app.wsgi:application

Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target

After creating the service file, run:

sudo systemctl daemon-reload
sudo systemctl start gunicorn
sudo systemctl enable gunicorn
sudo systemctl status gunicorn

MongoDB Performance Tuning on Google Cloud

Optimizing MongoDB on Google Cloud (or any cloud provider) involves careful consideration of instance types, disk I/O, network, and MongoDB’s internal configuration. For production, consider a managed service such as MongoDB Atlas (operated by MongoDB and available on Google Cloud) or a highly available replica set that you deploy yourself on Compute Engine.

Instance Type and Storage

Instance Type: Choose an instance type that balances CPU, RAM, and network throughput. For I/O-intensive workloads, instances with local SSDs can offer significant performance gains, but they are ephemeral. For persistent storage, use Persistent Disks (PDs).

SSD Persistent Disks: Essential for production MongoDB. They offer consistent low latency and high IOPS.
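Local SSDs: Very fast, but ephemeral. Use them only for temporary or reproducible data, and only if your deployment can tolerate losing that data on instance termination. Note that the WiredTiger cache lives in RAM, not on disk, so local SSDs do not hold it.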
Network Attached Storage (NAS): Generally not recommended for MongoDB due to latency.

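Disk Size and IOPS: Provision your Persistent Disk with sufficient size and IOPS. On SSD Persistent Disks, available IOPS and throughput scale with the provisioned disk size and are also capped by the VM’s machine type and vCPU count, so a larger disk can be a performance choice as well as a capacity one. For high-throughput workloads that outgrow this, GCP offers disk types with separately provisioned IOPS (Extreme Persistent Disk and Hyperdisk), which serve a similar purpose to AWS’s io1/io2. Monitor disk I/O wait times and throughput.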

MongoDB Configuration (`mongod.conf`)

The primary configuration file for MongoDB is `mongod.conf`. Key parameters for performance tuning include:

# /etc/mongod.conf

storage:
  dbPath: /var/lib/mongodb
  journal:
    # storage.journal.enabled was removed in MongoDB 6.1, where journaling is
    # always on; keep this line only on older versions.
    enabled: true
    # For higher performance, consider increasing the journal commit interval.
    # This trades durability for speed. Use with caution.
    # commitIntervalMs: 100 # Default is 100 ms

  # WiredTiger storage engine settings
  engine: wiredTiger
  wiredTiger:
    engineConfig:
      # Internal cache size. This is crucial.
      # By default WiredTiger uses the larger of 50% of (RAM - 1 GB) or 256 MB.
      # Leave headroom for the filesystem cache and other processes.
      # Example for a 32GB RAM instance:
      # cacheSizeGB: 15.5
      cacheSizeGB: 0.5 # Example for a smaller instance, adjust based on available RAM
    collectionConfig:
      # Block compression for collection data. 'snappy' is a good balance.
      # 'zlib' offers better compression but higher CPU usage.
      # 'zstd' is a modern, fast compressor.
      blockCompressor: snappy

    # Index configuration
    indexConfig:
      prefixCompression: true

# Network interfaces to listen on
net:
  port: 27017
  # bindIp: 127.0.0.1 # For local access only
  # bindIp: 0.0.0.0   # Listen on all interfaces (use firewall rules for security)
  bindIp: 127.0.0.1,10.128.0.2 # Example: localhost and a private GCP IP

# Logging settings
systemLog:
  destination: file
  path: /var/log/mongodb/mongod.log
  logAppend: true
  # verbosity: 0 # Default is 0 (normal)

# Sharding settings (if applicable)
# sharding:
#   clusterRole: shardsvr # Use configsvr on config servers

# Security settings (essential for production)
# security:
#   authorization: enabled
#   keyFile: /path/to/your/keyfile # For replica sets

# Replication settings (essential for high availability)
# replication:
#   replSetName: rs0

Indexing Strategies

Proper indexing is paramount for MongoDB performance. Analyze your query patterns using explain() and ensure appropriate indexes are in place. Avoid over-indexing, as indexes consume disk space and slow down write operations.

// Example: Analyzing a query
db.collection.find({ user_id: 123, status: "active" }).explain("executionStats")

// Example: Creating a compound index
db.collection.createIndex({ user_id: 1, status: 1 })
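A compound index such as `{ user_id: 1, status: 1 }` serves queries that filter on a leading prefix of its fields. The sketch below is a deliberately simplified model of that prefix rule for equality filters only; the function name is hypothetical, it ignores sorts, range predicates, and planner heuristics, and `explain()` remains the authoritative check.

```python
from typing import List, Sequence


def usable_index_prefix(index_fields: Sequence[str],
                        query_fields: Sequence[str]) -> List[str]:
    """Return the leading index fields that an equality query can use.

    index_fields -- fields of the compound index, in index order
    query_fields -- fields the query filters on (order does not matter)
    """
    wanted = set(query_fields)
    prefix = []
    for field in index_fields:
        # The prefix ends at the first index field the query does not filter on.
        if field not in wanted:
            break
        prefix.append(field)
    return prefix


index = ["user_id", "status"]
print(usable_index_prefix(index, ["user_id", "status"]))  # both fields usable
print(usable_index_prefix(index, ["user_id"]))            # leading field only
print(usable_index_prefix(index, ["status"]))             # no usable prefix
```

An empty result suggests that the query needs a different index or a different field order, which is worth confirming with `explain("executionStats")` on real data.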

Monitoring and Diagnostics

Regular monitoring is key to identifying and resolving performance issues. Use MongoDB’s built-in tools and Google Cloud’s monitoring suite.

MongoDB Tools:
- mongostat: Provides real-time server statistics (operations, connections, network, etc.).
- mongotop: Shows the time spent reading and writing by collection.
- db.serverStatus(): Comprehensive server status information.
- db.stats(): Database statistics.
- db.collection.explain(): Analyzes query execution plans.

Google Cloud Operations (formerly Stackdriver):
- Monitor CPU utilization, disk I/O, network traffic, and memory usage of your Compute Engine instances.
- Set up alerts for critical metrics (e.g., high CPU, low disk space, high latency).
- Use Cloud Logging to aggregate MongoDB logs.

Key Metrics to Watch:

Operations per second (reads/writes)
Query latency
Disk I/O wait times
WiredTiger cache hit rate (aim for 90%+)
Network traffic
CPU utilization
Memory usage
Connections
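The cache hit rate in the list above is not reported directly; it is commonly derived from two counters in the `wiredTiger.cache` section of `db.serverStatus()`. The sketch below assumes you have already fetched that section as a dictionary (for example through a driver); the function name is illustrative, and the counter names should be verified against your server version.

```python
def wiredtiger_cache_hit_ratio(cache_stats: dict) -> float:
    """Estimate the WiredTiger cache hit ratio from serverStatus counters.

    cache_stats -- the wiredTiger.cache sub-document of db.serverStatus()
    """
    requested = cache_stats["pages requested from the cache"]
    read_in = cache_stats["pages read into cache"]
    if requested == 0:
        # No page requests yet, so there have been no misses.
        return 1.0
    # Pages that had to be read from disk count as misses.
    return 1.0 - read_in / requested


# Hypothetical counter values for illustration only.
sample = {"pages requested from the cache": 1000, "pages read into cache": 250}
print(f"{wiredtiger_cache_hit_ratio(sample):.0%}")
```

These counters are cumulative since mongod started, so for alerting it is more useful to sample them periodically and compute the ratio over the difference between two samples.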

By systematically tuning Nginx, your application server (PHP-FPM, or Gunicorn for any Python services), and MongoDB, you can build a highly performant and scalable PHP application on Google Cloud. Remember that performance tuning is an iterative process; continuous monitoring and adjustment are essential.