The Ultimate DevOps Playbook: Tuning Nginx, Gunicorn/FPM, and MySQL on Google Cloud for Shopify
Nginx as a High-Performance Frontend for Gunicorn/PHP-FPM
When deploying applications on Google Cloud, particularly those serving dynamic content via Gunicorn (Python) or PHP-FPM, Nginx serves as an indispensable frontend. Its strengths lie in efficient static file serving, SSL termination, request buffering, and load balancing. Properly tuning Nginx is critical for minimizing latency and maximizing throughput.
Nginx Configuration Tuning
The core of Nginx performance tuning resides in its nginx.conf file, typically located at /etc/nginx/nginx.conf, with additional files included from /etc/nginx/conf.d/. The directives covered here live in three contexts: the main (top-level) context, the events block, and the http block.
Worker Processes and Connections
The worker_processes directive dictates how many worker processes Nginx will spawn. Setting this to auto is generally recommended, allowing Nginx to detect the number of CPU cores available. The worker_connections directive sets the maximum number of simultaneous connections that each worker process can handle. This value should be set high enough to accommodate peak traffic, considering that each connection consumes a file descriptor.
# Main (top-level) context: worker_processes is not valid inside http
worker_processes auto;

events {
    # worker_connections belongs in the events block
    worker_connections 4096; # Adjust based on system limits and expected load
}

http {
    # ... http directives ...
}
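As a rough capacity check, the two directives multiply: the ceiling on open connections is worker_processes times worker_connections. That figure counts every connection Nginx holds, including those to upstream servers, so a proxied request uses two. The sketch below illustrates the arithmetic; the function name and the sample figure of 4 workers are assumptions for illustration only.

```python
def max_clients(worker_processes, worker_connections, proxied=True):
    """Rough upper bound on concurrent clients Nginx can serve."""
    total = worker_processes * worker_connections
    # A proxied request holds two connections: one to the client
    # and one to the upstream (Gunicorn or PHP-FPM).
    return total // 2 if proxied else total


# Hypothetical host: 4 worker processes with the 4096 connections used above.
print(max_clients(4, 4096))
print(max_clients(4, 4096, proxied=False))
```

Treat the result as a ceiling, not a target: file descriptor limits and upstream capacity usually bind first.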
On the underlying Google Cloud Compute Engine instance, ensure the system’s file descriptor limit is also increased. This can be done by editing /etc/security/limits.conf:
* soft nofile 65536
* hard nofile 65536
root soft nofile 65536
root hard nofile 65536
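limits.conf is applied by pam_limits to login sessions, so log in again (or reboot) for it to take effect. When Nginx runs as a systemd service, its limit typically comes from the unit's LimitNOFILE setting or from the worker_rlimit_nofile directive in nginx.conf instead, so raise it there as well and then restart Nginx.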
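To confirm which limit a running process actually received, Linux exposes it in /proc/<pid>/limits. The sketch below parses the "Max open files" row from that file's text. The helper name and the sample text are assumptions for illustration; on a live host you would pass in the contents of the limits file for an Nginx worker's PID.

```python
def parse_max_open_files(limits_text):
    """Return (soft, hard) from the text of /proc/<pid>/limits as strings."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            parts = line.split()
            # Columns: "Max", "open", "files", soft, hard, units
            return parts[3], parts[4]
    raise ValueError("no 'Max open files' line found")


# Sample text in the layout the kernel uses (values are illustrative).
sample = (
    "Limit                     Soft Limit           Hard Limit           Units\n"
    "Max open files            65536                65536                files\n"
)
soft, hard = parse_max_open_files(sample)
print(f"soft={soft} hard={hard}")
```

The values are returned as strings because the kernel may report "unlimited" instead of a number.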
Keepalive Connections
Enabling HTTP keep-alive connections reduces the overhead of establishing new TCP connections for each request. The keepalive_timeout directive controls how long an idle keep-alive connection will remain open. A value between 60 and 120 seconds is a good starting point. keepalive_requests limits the number of requests that can be made over a single keep-alive connection.
http {
# ...
keepalive_timeout 75;
keepalive_requests 100;
# ...
}
Buffering and Gzip Compression
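Nginx’s buffering directives control how much of a request or response is held in memory. client_body_buffer_size sets the in-memory buffer for request bodies (larger bodies spill to a temporary file), while client_max_body_size is a hard limit: requests above it are rejected with a 413 error. Both matter for file uploads and large POST requests. proxy_buffers and proxy_buffer_size apply to responses from upstreams reached via proxy_pass, such as Gunicorn; for PHP-FPM reached via fastcgi_pass, the equivalent directives are fastcgi_buffers and fastcgi_buffer_size. Gzip compression significantly reduces bandwidth usage and improves load times for text-based assets.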
http {
# ...
client_body_buffer_size 128k;
client_max_body_size 100m; # Adjust based on expected max upload size
proxy_buffers 8 16k;
proxy_buffer_size 32k;
gzip on;
gzip_vary on;
gzip_proxied any;
gzip_comp_level 6;
gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript image/svg+xml;
# ...
}
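To get a feel for what compression level 6 does to a text payload, the sketch below compresses a repetitive JSON sample with Python's standard gzip module. This is only indicative: the sample payload is made up, and Nginx's own gzip output will differ somewhat in size.

```python
import gzip


def gzip_ratio(payload, level=6):
    """Compressed size divided by original size (lower is better)."""
    compressed = gzip.compress(payload, compresslevel=level)
    return len(compressed) / len(payload)


# Hypothetical JSON response body, repeated to mimic a list endpoint.
sample = b'{"id": 1, "title": "Example product", "tags": ["a", "b"]}\n' * 200

for level in (1, 6, 9):
    print(f"level {level}: ratio {gzip_ratio(sample, level):.3f}")
```

Higher levels cost more CPU for diminishing size gains, which is why a mid-range value such as 6 is a common compromise.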
Gunicorn Tuning for Python Applications
Gunicorn (Green Unicorn) is a Python WSGI HTTP Server. Its performance is heavily influenced by the number of worker processes and the type of worker class used. For I/O-bound applications, the gevent or eventlet worker classes are excellent choices due to their asynchronous capabilities. For CPU-bound tasks, the default sync worker class with an appropriate number of workers is often sufficient.
Worker Processes and Threads
The --workers flag determines the number of worker processes. A common recommendation is (2 * number_of_cores) + 1. The --threads flag sets the number of threads per worker, but it only affects the gthread worker class: setting it above 1 with the default sync class makes Gunicorn use gthread instead, and it has no effect on async workers such as gevent or eventlet.
# Example for gevent workers
gunicorn --workers 3 --worker-class gevent --bind 0.0.0.0:8000 myapp.wsgi:application

# Example for sync workers (less common for high concurrency)
# gunicorn --workers 5 --bind 0.0.0.0:8000 myapp.wsgi:application
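The worker counts in these commands follow the (2 * number_of_cores) + 1 heuristic: 3 workers for a single core, 5 for two. A minimal sketch of that calculation, with a hypothetical helper name:

```python
import multiprocessing


def recommended_workers(cores):
    """Apply the (2 * cores) + 1 heuristic for Gunicorn worker processes."""
    return 2 * cores + 1


# Value for the machine this runs on; pass the result to --workers.
print(recommended_workers(multiprocessing.cpu_count()))
```

This is a starting point only. Memory per worker and the application's I/O profile may justify a lower or higher number.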
The --worker-connections flag is specific to gevent and eventlet workers and defines the maximum number of simultaneous connections each worker can handle. This should be tuned in conjunction with Nginx’s worker_connections.
gunicorn --workers 3 --worker-class gevent --worker-connections 1000 --bind 0.0.0.0:8000 myapp.wsgi:application
Timeouts and Keepalive
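The --timeout setting (default 30 seconds) defines how long a worker may stay silent before Gunicorn kills and restarts it. For sync workers this effectively caps request processing time, so set it higher than your longest expected request but not so high that stuck workers hold resources. For async workers such as gevent, it only checks that the worker process is still responsive and does not bound an individual request. The --keep-alive flag sets how many seconds Gunicorn waits for the next request on a keep-alive connection (default 2); it does not apply to the sync worker class, and Nginx still handles the client-facing keep-alives.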
gunicorn --workers 3 --worker-class gevent --worker-connections 1000 --timeout 120 --keep-alive 2 --bind 0.0.0.0:8000 myapp.wsgi:application
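Long command lines become hard to maintain, and Gunicorn can read the same settings from a Python configuration file. The sketch below mirrors the command above one-to-one; the file name gunicorn.conf.py is a convention, and the values are the article's examples rather than recommendations for any specific workload.

```python
# gunicorn.conf.py: file-based equivalent of the CLI flags shown above.

bind = "0.0.0.0:8000"        # --bind
workers = 3                  # --workers
worker_class = "gevent"      # --worker-class (requires the gevent package)
worker_connections = 1000    # --worker-connections (gevent/eventlet only)
timeout = 120                # --timeout, in seconds
keepalive = 2                # --keep-alive, in seconds
```

Start the server with gunicorn -c gunicorn.conf.py myapp.wsgi:application. Flags given on the command line override values from the file.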
PHP-FPM Tuning for PHP Applications
PHP-FPM (FastCGI Process Manager) is the de facto standard for running PHP applications. Its performance hinges on the process management settings, particularly the pm (process manager) type and associated directives.
Process Manager Configuration
The php-fpm.conf file (often found at /etc/php/X.Y/fpm/php-fpm.conf, where X.Y is the PHP version) and pool configuration files (e.g., /etc/php/X.Y/fpm/pool.d/www.conf) are key. The pm directive can be set to static, dynamic, or ondemand.
- static: All child processes are created immediately. Good for predictable loads.
- dynamic: Processes are created on demand up to pm.max_children, with idle processes being terminated.
- ondemand: Processes are created only when a request is received.
For most production environments, dynamic offers a good balance. The following directives are crucial:
; /etc/php/X.Y/fpm/pool.d/www.conf
[www]
user = www-data
group = www-data
listen = /run/php/phpX.Y-fpm.sock ; Or a TCP socket like 127.0.0.1:9000
pm = dynamic
pm.max_children = 50      ; Adjust based on available RAM and expected concurrency
pm.start_servers = 5      ; Initial number of servers; must lie between min_spare_servers and max_spare_servers
pm.min_spare_servers = 5  ; Minimum idle servers
pm.max_spare_servers = 10 ; Maximum idle servers
pm.max_requests = 500     ; Restart a child process after this many requests
Tuning pm.max_children is critical. A common heuristic is to calculate based on available RAM. If each PHP-FPM worker consumes ~30MB of RAM, and you have 8GB of RAM, leaving 2GB for the OS and other services, you have ~6GB (6144MB) for PHP-FPM. This would allow for roughly 200 children (6144 / 30). However, always monitor actual memory usage and adjust downwards if necessary.
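The RAM-based heuristic is simple integer arithmetic. The sketch below reproduces the example figures (8GB total, 2GB reserved, ~30MB per worker); the function name is hypothetical, and the per-worker figure should come from measuring your own processes.

```python
def max_children(total_ram_mb, reserved_mb, per_worker_mb):
    """Whole number of PHP-FPM workers that fit in the RAM left after the reserve."""
    return (total_ram_mb - reserved_mb) // per_worker_mb


# 8GB host, 2GB kept for the OS and other services, ~30MB per worker.
print(max_children(8 * 1024, 2 * 1024, 30))  # 204
```

The result is an upper bound. Workers that grow beyond the assumed 30MB under load would push the host into swap, so leave a margin below this figure.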
Request Handling and Performance
The request_terminate_timeout directive sets the maximum time a script can run before being terminated. This prevents runaway scripts from consuming resources indefinitely. Setting it to 0 disables the timeout, which is generally not recommended for production.
; /etc/php/X.Y/fpm/pool.d/www.conf
request_terminate_timeout = 60s ; Adjust based on longest expected script execution time
Ensure that PHP’s memory_limit in php.ini is also set appropriately for your application’s needs.
MySQL Tuning on Google Cloud
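For database performance, especially with a platform like Shopify which can generate significant read/write traffic, tuning MySQL (or Cloud SQL for MySQL) is paramount. We’ll focus on key my.cnf (or mysqld.cnf) directives. On Cloud SQL there is no my.cnf to edit; the equivalent settings are applied as database flags on the instance.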
InnoDB Buffer Pool
innodb_buffer_pool_size is arguably the most critical setting for InnoDB performance, because the buffer pool caches table data and indexes in memory. A common recommendation is to set it to 50-75% of available RAM on a dedicated database server. On Cloud SQL, Google chooses a default based on the instance's memory, and the value can be adjusted through database flags, so understanding its impact remains vital.
[mysqld]
innodb_buffer_pool_size = 4G      # Adjust to 50-75% of RAM (4G suits a dedicated host with roughly 6-8 GB)
innodb_buffer_pool_instances = 4  # Typically 1 instance per GB of buffer pool, up to 8 or 16
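The sizing rules above can be expressed as a small calculation. The sketch below returns the 50-75% range for a given amount of RAM and derives an instance count of one per GB with a cap; the helper names and the default cap of 8 are assumptions taken from the guidance in this section.

```python
def buffer_pool_range_gb(ram_gb):
    """Return (low, high) buffer pool sizes: 50% and 75% of RAM."""
    return (0.5 * ram_gb, 0.75 * ram_gb)


def buffer_pool_instances(pool_gb, cap=8):
    """One instance per GB of buffer pool, at least 1, at most `cap`."""
    return max(1, min(int(pool_gb), cap))


low, high = buffer_pool_range_gb(8)
print(f"8 GB host: {low:.0f}-{high:.0f} GB pool, {buffer_pool_instances(low)} instances")
```

The range assumes a dedicated database server. If the host also runs the application tier, the memory those processes need has to come out of the total first.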
Connection Handling
max_connections determines the maximum number of simultaneous client connections. Set it high enough to handle peak load but not so high that it exhausts server memory, since each connection allocates its own buffers. Monitor the Threads_connected and Max_used_connections status variables to gauge actual usage.
[mysqld]
max_connections = 500    # Adjust based on application needs and server capacity
thread_cache_size = 100  # Cache threads for reuse
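One way to sanity-check max_connections is to estimate how many connections the application tier could open at peak. The sketch below assumes each Gunicorn or PHP-FPM worker holds a fixed number of connections and adds a reserve for administrative and monitoring sessions; the function name, the host and worker counts, and the reserve of 10 are all illustrative assumptions.

```python
def required_connections(app_hosts, workers_per_host, conns_per_worker=1, reserve=10):
    """Peak connections the app tier could open, plus a reserve for admin use."""
    return app_hosts * workers_per_host * conns_per_worker + reserve


# Hypothetical fleet: 4 app hosts, each running 50 PHP-FPM children.
needed = required_connections(app_hosts=4, workers_per_host=50)
print(needed, "fits" if needed <= 500 else "exceeds max_connections")
```

If the estimate exceeds the configured limit, reducing the worker count or adding connection pooling is usually safer than raising max_connections without checking memory.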
Query Cache (Deprecated/Removed in MySQL 8.0)
For MySQL versions prior to 8.0, the query cache could offer performance benefits for read-heavy workloads with identical queries. However, it’s known to be a source of contention and is disabled by default in newer versions and removed entirely in MySQL 8.0. If using an older version, tune with caution.
# [mysqld]
# query_cache_type = 1
# query_cache_size = 64M  # Adjust based on workload and contention
Logging and I/O
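Keep the general query log disabled in production unless you are actively debugging, since it records every statement the server receives. The slow query log is far cheaper and is worth leaving enabled with long_query_time set appropriately. Treat log_queries_not_using_indexes as a diagnostic switch, because it can flood the log on a busy server.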
[mysqld]
slow_query_log = 1
slow_query_log_file = /var/log/mysql/mysql-slow.log
long_query_time = 2                # Log queries taking longer than 2 seconds
log_queries_not_using_indexes = 1  # Useful for identifying optimization opportunities
For I/O-bound workloads, consider tuning innodb_io_capacity and innodb_io_capacity_max to match the IOPS capabilities of your underlying Google Cloud storage (e.g., Persistent Disks). For SSDs, higher values are generally appropriate.
[mysqld]
innodb_io_capacity = 2000        # Adjust based on disk type and IOPS
innodb_io_capacity_max = 4000    # Adjust based on disk type and IOPS
innodb_flush_method = O_DIRECT   # Often beneficial on Linux with hardware RAID/SSDs
Monitoring and Iterative Tuning
Performance tuning is not a one-time event. Continuous monitoring is essential. Utilize Google Cloud’s Cloud Monitoring, Nginx’s status module, Gunicorn’s logging, and MySQL’s performance schema and slow query logs. Key metrics to watch include:
- Nginx: Active connections, requests per second, error rates (5xx, 4xx), upstream response times.
- Gunicorn/PHP-FPM: Worker utilization, request latency, error rates, memory usage per worker.
- MySQL: CPU utilization, memory usage, disk I/O, network traffic, slow queries, connection counts, buffer pool hit rate.
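Of these metrics, the buffer pool hit rate is derived and not reported directly. It can be computed from two InnoDB status counters: Innodb_buffer_pool_read_requests (logical reads) and Innodb_buffer_pool_reads (reads that had to go to disk). The sketch below shows the calculation; the function name and the sample counter values are illustrative.

```python
def buffer_pool_hit_rate(read_requests, disk_reads):
    """Fraction of logical reads served from the buffer pool, or None if idle."""
    if read_requests == 0:
        return None  # No reads recorded yet, so the rate is undefined
    return 1.0 - disk_reads / read_requests


# Illustrative counters, as returned by SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%'.
rate = buffer_pool_hit_rate(read_requests=1_000_000, disk_reads=5_000)
print(f"hit rate: {rate:.2%}")
```

Both counters accumulate since server start, so comparing deltas between two samples gives a better picture of current behavior than the lifetime totals.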
Start with conservative settings and incrementally increase them while observing the impact on your metrics. Use load testing tools (e.g., k6, JMeter) to simulate traffic and validate tuning changes before deploying to production.