The Ultimate DevOps Playbook: Tuning Nginx, Gunicorn/FPM, and MySQL on Google Cloud for Shopify

Nginx as a High-Performance Frontend for Gunicorn/PHP-FPM

When deploying applications on Google Cloud, particularly those serving dynamic content via Gunicorn (Python) or PHP-FPM, Nginx serves as an indispensable frontend. Its strengths lie in efficient static file serving, SSL termination, request buffering, and load balancing. Properly tuning Nginx is critical for minimizing latency and maximizing throughput.

Nginx Configuration Tuning

Most Nginx performance tuning happens in the main configuration file, typically /etc/nginx/nginx.conf, and in the files it includes from /etc/nginx/conf.d/. We’ll focus on key directives in the main, events, and http contexts.

Worker Processes and Connections

The worker_processes directive sets how many worker processes Nginx spawns. It belongs in the main (top-level) context, and setting it to auto is generally recommended so that Nginx matches the number of available CPU cores. The worker_connections directive sets the maximum number of simultaneous connections each worker process can handle, and it belongs in the events block. Neither directive is valid inside http. Set worker_connections high enough to accommodate peak traffic, keeping in mind that each connection consumes a file descriptor and that the count includes connections to upstream servers, not just clients.

worker_processes auto;

events {
    worker_connections 4096; # Adjust based on system limits and expected load
}

http {
    # ... http directives ...
}
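As a rough capacity check, the theoretical connection ceiling is worker_processes multiplied by worker_connections, and a proxied request usually holds two connections (one to the client, one to the upstream). The Python sketch below illustrates that arithmetic. The function names, the headroom value, and the example numbers are assumptions for illustration, not part of Nginx.

```python
def max_clients(worker_processes: int, worker_connections: int, proxying: bool = True) -> int:
    """Estimate the ceiling on simultaneous clients Nginx can serve."""
    total = worker_processes * worker_connections
    # When proxying, each client typically occupies two connections:
    # one client-facing and one to the upstream (Gunicorn or PHP-FPM).
    return total // 2 if proxying else total


def fits_fd_limit(worker_connections: int, nofile_limit: int, headroom: int = 64) -> bool:
    """Check that one worker's connections fit under the per-process fd limit.

    headroom is an assumed allowance for log files, listen sockets, etc.
    """
    return worker_connections + headroom <= nofile_limit


if __name__ == "__main__":
    print(max_clients(worker_processes=4, worker_connections=4096))
    print(fits_fd_limit(worker_connections=4096, nofile_limit=65536))
```

Treat the result as an upper bound: real capacity is usually limited by CPU, memory, or the backend long before this ceiling is reached.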

On the underlying Google Cloud Compute Engine instance, ensure the system’s file descriptor limit is also increased. This can be done by editing /etc/security/limits.conf:

* soft nofile 65536
* hard nofile 65536
root soft nofile 65536
root hard nofile 65536

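These limits are applied by PAM to new login sessions, so log in again (or reboot) for them to take effect. If Nginx runs as a systemd service, limits.conf does not apply to it; raise the limit with LimitNOFILE in the unit file or with Nginx’s worker_rlimit_nofile directive, then restart Nginx.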

Keepalive Connections

Enabling HTTP keep-alive connections reduces the overhead of establishing new TCP connections for each request. The keepalive_timeout directive controls how long an idle keep-alive connection remains open; the default is 75 seconds, and a value between 60 and 120 seconds is a good starting point. keepalive_requests limits the number of requests that can be served over a single keep-alive connection. Its default was raised from 100 to 1000 in Nginx 1.19.10, so setting 100 on a recent version lowers the limit.

http {
    # ...

    keepalive_timeout 75;
    keepalive_requests 100;

    # ...
}

Buffering and Gzip Compression

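Nginx’s request and response buffering directives protect against memory exhaustion and slow clients. client_body_buffer_size sets the in-memory buffer for a request body (larger bodies spill to a temporary file), and client_max_body_size caps the body size Nginx will accept, which matters for file uploads and large POST requests. proxy_buffer_size and proxy_buffers control buffering of responses from a proxied backend such as Gunicorn; for PHP-FPM, which is reached through fastcgi_pass, the equivalent directives are fastcgi_buffer_size and fastcgi_buffers. Gzip compression significantly reduces bandwidth usage and improves load times for text-based assets.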

http {
    # ...

    client_body_buffer_size 128k;
    client_max_body_size 100m; # Adjust based on expected max upload size
    proxy_buffers 8 16k;
    proxy_buffer_size 32k;

    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript image/svg+xml;

    # ...
}
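To get a feel for how the compression level trades CPU for size, the following Python sketch compresses a repetitive JSON-like payload at several levels using the standard gzip module. The sample payload and the helper name are assumptions for illustration; Nginx’s own results will differ with real content.

```python
import gzip


def compression_ratio(payload: bytes, level: int = 6) -> float:
    """Return compressed size divided by original size (lower is better)."""
    compressed = gzip.compress(payload, compresslevel=level)
    return len(compressed) / len(payload)


if __name__ == "__main__":
    # Hypothetical text payload; real responses compress less uniformly.
    sample = b'{"id": 1, "title": "Example product", "tags": ["a", "b"]}\n' * 500
    for level in (1, 6, 9):
        print(level, round(compression_ratio(sample, level), 4))
```

Higher levels generally yield diminishing size gains for more CPU time, which is why a mid-range gzip_comp_level is a common compromise.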

Gunicorn Tuning for Python Applications

Gunicorn (Green Unicorn) is a Python WSGI HTTP Server. Its performance is heavily influenced by the number of worker processes and the type of worker class used. For I/O-bound applications, the gevent or eventlet worker classes are excellent choices due to their asynchronous capabilities. For CPU-bound tasks, the default sync worker class with an appropriate number of workers is often sufficient.

Worker Processes and Threads

The --workers flag sets the number of worker processes. A common starting point is (2 * number_of_cores) + 1. The --threads flag applies to the threaded gthread worker: setting it above 1 with the default sync class switches Gunicorn to gthread. It has no effect on async worker classes such as gevent, which handle concurrency with greenlets rather than threads.

# Example for gevent workers
gunicorn --workers 3 --worker-class gevent --bind 0.0.0.0:8000 myapp.wsgi:application

# Example for sync workers (less common for high concurrency)
# gunicorn --workers 5 --bind 0.0.0.0:8000 myapp.wsgi:application

The --worker-connections flag is specific to gevent and eventlet workers and defines the maximum number of simultaneous connections each worker can handle. This should be tuned in conjunction with Nginx’s worker_connections.

gunicorn --workers 3 --worker-class gevent --worker-connections 1000 --bind 0.0.0.0:8000 myapp.wsgi:application

Timeouts and Keepalive

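The --timeout setting is a worker liveness limit: a worker that stays silent for longer than this many seconds is killed and restarted. With sync workers this effectively caps request processing time, so set it above your longest expected request but not so high that hung workers hold resources. With async workers such as gevent it only verifies that the worker process is still responding and is not tied to the duration of a single request. The --keep-alive flag sets how many seconds Gunicorn waits for the next request on a keep-alive connection (the default is 2). Because Nginx handles the client-facing keep-alives, this only affects the Nginx-to-Gunicorn hop, and the sync worker does not support persistent connections at all.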

gunicorn --workers 3 --worker-class gevent --worker-connections 1000 --timeout 120 --keep-alive 2 --bind 0.0.0.0:8000 myapp.wsgi:application
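Long command lines become hard to maintain, and Gunicorn can read the same settings from a Python configuration file. The sketch below mirrors the flags above and derives the worker count from the (2 * number_of_cores) + 1 heuristic. The file name and all values are illustrative assumptions to adapt, not recommendations.

```python
# gunicorn.conf.py (assumed file name; values mirror the CLI example above)
import multiprocessing

# Address and port Gunicorn listens on; Nginx proxies to this.
bind = "0.0.0.0:8000"

# (2 * cores) + 1 heuristic, computed on the machine where Gunicorn starts.
workers = multiprocessing.cpu_count() * 2 + 1

# Async worker class; requires the gevent package to be installed.
worker_class = "gevent"

# Maximum simultaneous clients per worker.
worker_connections = 1000

# Seconds of worker silence before it is killed and restarted.
timeout = 120

# Seconds to wait for the next request on a keep-alive connection.
keepalive = 2
```

It would then be loaded with `gunicorn -c gunicorn.conf.py myapp.wsgi:application`.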

PHP-FPM Tuning for PHP Applications

PHP-FPM (FastCGI Process Manager) is the de facto standard for running PHP applications. Its performance hinges on the process management settings, particularly the pm (process manager) type and associated directives.

Process Manager Configuration

The php-fpm.conf file (often found at /etc/php/X.Y/fpm/php-fpm.conf, where X.Y is the PHP version) and pool configuration files (e.g., /etc/php/X.Y/fpm/pool.d/www.conf) are key. The pm directive can be set to static, dynamic, or ondemand.

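static: A fixed number of child processes (pm.max_children) is created at startup. Good for predictable loads.
dynamic: pm.start_servers children are created at startup; the pool then grows and shrinks to keep between pm.min_spare_servers and pm.max_spare_servers idle processes, never exceeding pm.max_children.
ondemand: No children are created at startup. They are spawned as requests arrive and killed after sitting idle for pm.process_idle_timeout.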

For most production environments, dynamic offers a good balance. The following directives are crucial:

; /etc/php/X.Y/fpm/pool.d/www.conf

[www]
user = www-data
group = www-data
listen = /run/php/phpX.Y-fpm.sock ; Or a TCP socket like 127.0.0.1:9000

pm = dynamic
pm.max_children = 50       ; Adjust based on available RAM and expected concurrency
pm.min_spare_servers = 5   ; Minimum idle servers
pm.max_spare_servers = 10  ; Maximum idle servers
pm.start_servers = 5       ; Initial number of servers; must be between min_spare_servers and max_spare_servers
pm.max_requests = 500      ; Restart a child process after this many requests

Tuning pm.max_children is critical. A common heuristic is to calculate based on available RAM. If each PHP-FPM worker consumes ~30MB of RAM, and you have 8GB of RAM, leaving 2GB for the OS and other services, you have ~6GB (6144MB) for PHP-FPM. This would allow for roughly 200 children (6144 / 30). However, always monitor actual memory usage and adjust downwards if necessary.
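The heuristic above reduces to a single integer division. A minimal Python sketch follows, using the figures from the text; the function name is hypothetical, and the per-worker figure should come from measuring your own pool.

```python
def estimate_max_children(total_ram_mb: int, reserved_mb: int, per_worker_mb: int) -> int:
    """Estimate pm.max_children from the RAM left over for PHP-FPM."""
    if per_worker_mb <= 0:
        raise ValueError("per_worker_mb must be positive")
    available_mb = total_ram_mb - reserved_mb
    # Round down so the pool cannot exceed the memory budget.
    return max(available_mb // per_worker_mb, 0)


if __name__ == "__main__":
    # 8 GB total, 2 GB reserved for the OS, ~30 MB per worker.
    print(estimate_max_children(8192, 2048, 30))  # prints 204
```

Because per-worker memory varies by request, it is safer to size against a high percentile of observed usage than against the average.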

Request Handling and Performance

The request_terminate_timeout directive sets the maximum time a script can run before being terminated. This prevents runaway scripts from consuming resources indefinitely. Setting it to 0 disables the timeout, which is generally not recommended for production.

; /etc/php/X.Y/fpm/pool.d/www.conf

request_terminate_timeout = 60s ; Adjust based on longest expected script execution time

Ensure that PHP’s memory_limit in php.ini is also set appropriately for your application’s needs.

MySQL Tuning on Google Cloud

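For database performance, especially with a platform like Shopify, which can generate significant read/write traffic, tuning MySQL (or Cloud SQL for MySQL) is paramount. We’ll focus on key my.cnf (or mysqld.cnf) directives; on Cloud SQL, where the option file is not directly editable, the corresponding settings are applied as database flags.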

InnoDB Buffer Pool

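The innodb_buffer_pool_size setting is arguably the most critical one for InnoDB performance, because the buffer pool caches table data and indexes in memory. A common recommendation is to set it to 50-75% of available RAM on a dedicated database server. On Cloud SQL, Google picks a default based on the instance’s memory, and the value can be adjusted within limits through the innodb_buffer_pool_size database flag, so understanding its impact is still vital.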

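# Note: in MySQL option files only '#' starts an inline comment; ';' is a comment only at the start of a line.
[mysqld]
innodb_buffer_pool_size = 4G      # Adjust based on instance RAM (roughly 50-75% on a dedicated host)
innodb_buffer_pool_instances = 4  # Typically 1 instance per GB of buffer pool, up to 8 or 16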
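MySQL requires the buffer pool size to be a multiple of innodb_buffer_pool_chunk_size multiplied by innodb_buffer_pool_instances and adjusts the value automatically otherwise. The Python sketch below picks an aligned size from a RAM percentage. It assumes the default 128 MB chunk size and deliberately rounds down to stay within the memory budget; the function name and sample numbers are illustrative.

```python
def aligned_buffer_pool_mb(ram_mb: int, percent: int, instances: int, chunk_mb: int = 128) -> int:
    """Pick a buffer pool size (in MB) that MySQL will not need to adjust.

    Assumes chunk_mb matches innodb_buffer_pool_chunk_size (128 MB by default).
    """
    target_mb = ram_mb * percent // 100
    unit_mb = chunk_mb * instances
    # Round down to a whole multiple of chunk size * instances.
    return (target_mb // unit_mb) * unit_mb


if __name__ == "__main__":
    # 30 GB of RAM, 60% target, 8 buffer pool instances.
    print(aligned_buffer_pool_mb(30720, 60, 8))  # prints 18432
```

A result of 0 means the requested percentage is smaller than one aligned unit, in which case fewer instances are the usual fix.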

Connection Handling

max_connections determines the maximum number of simultaneous client connections. This should be set high enough to handle peak load but not so high that it exhausts server memory. Monitor Threads_connected status variable to gauge actual usage.

[mysqld]
max_connections = 500    # Adjust based on application needs and server capacity
thread_cache_size = 100  # Cache threads for reuse
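One way to sanity-check max_connections is to add up the connections the application tier can open at once and compare that with the server limit. The Python sketch below does this under the assumption that each PHP-FPM child or Gunicorn worker holds a fixed number of connections. The function name, the reserved count, and the sample figures are hypothetical.

```python
def connection_budget(app_instances: int, workers_per_instance: int,
                      conns_per_worker: int, max_connections: int,
                      reserved: int = 10) -> dict:
    """Compare worst-case application demand with MySQL's max_connections.

    reserved is an assumed allowance for admin, monitoring, and replication sessions.
    """
    demand = app_instances * workers_per_instance * conns_per_worker
    usable = max_connections - reserved
    return {"demand": demand, "usable": usable, "fits": demand <= usable}


if __name__ == "__main__":
    # 4 app servers, 50 PHP-FPM children each, 1 connection per child.
    print(connection_budget(4, 50, 1, 500))
```

If demand exceeds the usable budget, a connection pooler or a lower worker count is usually a better answer than raising max_connections indefinitely.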

Query Cache (Deprecated/Removed in MySQL 8.0)

For MySQL versions prior to 8.0, the query cache could offer performance benefits for read-heavy workloads with identical queries. However, it’s known to be a source of contention and is disabled by default in newer versions and removed entirely in MySQL 8.0. If using an older version, tune with caution.

; [mysqld]
; query_cache_type = 1
; query_cache_size = 64M ; Adjust based on workload and contention

Logging and I/O

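Keep the general query log disabled in production unless you are actively debugging, since it records every statement and adds significant overhead. The slow query log is much cheaper and is worth keeping enabled with a sensible long_query_time. Be aware that log_queries_not_using_indexes can make the log grow quickly on busy servers; enable it selectively, or rate-limit it with log_throttle_queries_not_using_indexes.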

[mysqld]
slow_query_log = 1
slow_query_log_file = /var/log/mysql/mysql-slow.log
long_query_time = 2                # Log queries taking longer than 2 seconds
log_queries_not_using_indexes = 1  # Useful for identifying optimization opportunities

For I/O-bound workloads, consider tuning innodb_io_capacity and innodb_io_capacity_max to match the IOPS capabilities of your underlying Google Cloud storage (e.g., Persistent Disks). For SSDs, higher values are generally appropriate.

[mysqld]
innodb_io_capacity = 2000       # Adjust based on disk type and IOPS
innodb_io_capacity_max = 4000   # Adjust based on disk type and IOPS
innodb_flush_method = O_DIRECT  # Often beneficial on Linux with hardware RAID/SSDs

Monitoring and Iterative Tuning

Performance tuning is not a one-time event; continuous monitoring is essential. Use Google Cloud’s Cloud Monitoring, Nginx’s stub_status module, Gunicorn’s logging, and MySQL’s Performance Schema and slow query logs. Key metrics to watch include:

Nginx: Active connections, requests per second, error rates (5xx, 4xx), upstream response times.
Gunicorn/PHP-FPM: Worker utilization, request latency, error rates, memory usage per worker.
MySQL: CPU utilization, memory usage, disk I/O, network traffic, slow queries, connection counts, buffer pool hit rate.
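The buffer pool hit rate in the list above can be derived from two status counters reported by SHOW GLOBAL STATUS: Innodb_buffer_pool_read_requests (logical reads) and Innodb_buffer_pool_reads (reads that had to go to disk). A minimal Python sketch follows; the function name and sample counter values are assumptions for illustration.

```python
def buffer_pool_hit_rate(read_requests: int, disk_reads: int) -> float:
    """Fraction of logical reads served from the InnoDB buffer pool.

    read_requests: value of Innodb_buffer_pool_read_requests
    disk_reads:    value of Innodb_buffer_pool_reads
    """
    if read_requests <= 0:
        return 0.0  # No reads recorded yet, so no meaningful rate.
    return 1.0 - disk_reads / read_requests


if __name__ == "__main__":
    # Hypothetical counter values sampled from a running server.
    rate = buffer_pool_hit_rate(read_requests=1_000_000, disk_reads=5_000)
    print(f"{rate:.2%}")
```

Both counters are cumulative since server start, so comparing the deltas between two samples gives a more useful picture of current behavior than the lifetime ratio.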

Start with conservative settings and incrementally increase them while observing the impact on your metrics. Use load testing tools (e.g., k6, JMeter) to simulate traffic and validate tuning changes before deploying to production.