The Ultimate DevOps Playbook: Tuning Nginx, Gunicorn/FPM, and MongoDB on AWS for Shopify
Nginx as a High-Performance Reverse Proxy and Load Balancer
For a Shopify backend running on AWS, Nginx serves as the critical entry point, handling SSL termination, static file serving, and reverse proxying requests to your application servers (Gunicorn for Python/Django/Flask, or PHP-FPM for PHP applications). Optimizing Nginx is paramount for low latency and high throughput.
Nginx Configuration Tuning
We’ll focus on key directives within nginx.conf or included configuration files. Ensure these are tuned for your specific AWS instance types and expected load.
Worker Processes and Connections
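Set worker_processes to the number of CPU cores available on your EC2 instance, or to auto so that Nginx detects the core count itself. worker_connections defines the maximum number of simultaneous connections that each worker process can hold open. The theoretical maximum is therefore worker_processes * worker_connections, but this figure counts every connection, including those to upstream servers, so a proxied request typically uses two of them.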
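As a rough illustration of this arithmetic, the sketch below computes the connection ceiling for a given worker setup. The helper name is hypothetical, and the assumption that each proxied request holds two connections (one to the client, one to the upstream) is a simplification, so treat the result as an upper bound rather than a measured capacity.

```python
def nginx_connection_ceiling(worker_processes: int, worker_connections: int,
                             proxied: bool = True) -> dict:
    """Estimate Nginx connection limits (hypothetical helper, not an Nginx API)."""
    total = worker_processes * worker_connections
    # Assumption: a proxied request uses one client-side and one upstream-side
    # connection, so roughly half of the slots are available to clients.
    clients = total // 2 if proxied else total
    return {"total_connections": total, "approx_max_clients": clients}


if __name__ == "__main__":
    # 4 workers with 4096 connections each, as in the c5.xlarge example.
    print(nginx_connection_ceiling(4, 4096))
```

In practice the operating system's open-file limit can cap the real figure well below this estimate, so it is worth checking that limit too.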
Example Nginx Configuration Snippet
For a c5.xlarge instance (4 vCPUs):
# /etc/nginx/nginx.conf
user www-data;
worker_processes 4; # Match the number of CPU cores
pid /run/nginx.pid;
include /etc/nginx/modules-enabled/*.conf;
events {
worker_connections 4096; # Adjust based on expected concurrent connections per worker
multi_accept on;
}
http {
sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
types_hash_max_size 2048;
server_tokens off; # Hide Nginx version for security
# Gzip compression for text-based assets
gzip on;
gzip_vary on;
gzip_proxied any;
gzip_comp_level 6;
gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;
# SSL Configuration (essential for Shopify)
ssl_protocols TLSv1.2 TLSv1.3;
ssl_prefer_server_ciphers on;
ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384;
ssl_session_cache shared:SSL:10m; # Adjust size as needed
ssl_session_timeout 10m;
ssl_session_tickets off; # Consider security implications
# Include virtual host configurations
include /etc/nginx/conf.d/*.conf;
include /etc/nginx/sites-enabled/*;
}
Optimizing Static File Serving
Nginx excels at serving static assets. Leverage browser caching and efficient file serving directives.
# Inside your server block for static assets
location ~ ^/(images|javascript|js|css|flash|media|static)/ {
expires 365d; # Cache for a year
add_header Cache-Control "public";
access_log off; # Don't log access for static files to reduce I/O
try_files $uri $uri/ =404;
}
# For specific Shopify static assets (e.g., theme files)
location ~* \.(jpg|jpeg|png|gif|ico|css|js|svg|woff|woff2)$ {
expires 365d;
add_header Cache-Control "public";
access_log off;
}
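To see which request paths the two regex locations above would catch, the following sketch replays the patterns in Python. It assumes only these two regex locations exist and that they are evaluated in file order with the first match winning; exact-match and prefix locations, which Nginx considers separately, are ignored. The names STATIC_LOCATIONS and matching_static_location are illustrative only.

```python
import re

# The two regex locations from the configuration above, in file order.
# "~" is case-sensitive, "~*" is case-insensitive.
STATIC_LOCATIONS = [
    ("prefix-dirs",
     re.compile(r"^/(images|javascript|js|css|flash|media|static)/")),
    ("extensions",
     re.compile(r"\.(jpg|jpeg|png|gif|ico|css|js|svg|woff|woff2)$", re.IGNORECASE)),
]


def matching_static_location(uri: str):
    """Return the name of the first regex location matching the URI path, or None."""
    for name, pattern in STATIC_LOCATIONS:
        if pattern.search(uri):
            return name
    return None


if __name__ == "__main__":
    for path in ("/static/app.js", "/assets/logo.PNG", "/api/orders"):
        print(path, "->", matching_static_location(path))
```

The function expects the URI path without the query string, since that is what Nginx matches locations against.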
Reverse Proxy Configuration
This section defines how Nginx forwards requests to your application servers. Key directives include proxy_pass, proxy_set_header, and timeouts.
# Inside your server block for dynamic requests
location / {
proxy_pass http://your_app_backend; # e.g., http://127.0.0.1:8000 for Gunicorn
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_connect_timeout 60s;
proxy_send_timeout 60s;
proxy_read_timeout 60s;
proxy_buffer_size 128k;
proxy_buffers 4 256k;
proxy_busy_buffers_size 256k;
}
# For WebSocket support (if your app uses it)
location /ws/ {
proxy_pass http://your_app_backend;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
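On the application side, the headers set with proxy_set_header arrive in the WSGI environ. The sketch below shows one way a backend might read them; the function names are illustrative, and it assumes the application is reachable only through Nginx, because a client can otherwise forge these headers (and can prepend entries to X-Forwarded-For even then).

```python
def client_info(environ: dict) -> dict:
    """Recover the original client details from headers set by the proxy."""
    forwarded_for = environ.get("HTTP_X_FORWARDED_FOR", "")
    # X-Forwarded-For is a comma-separated chain; the left-most entry is the
    # address reported for the original client.
    client_ip = forwarded_for.split(",")[0].strip() or environ.get("REMOTE_ADDR", "")
    scheme = environ.get("HTTP_X_FORWARDED_PROTO",
                         environ.get("wsgi.url_scheme", "http"))
    return {"client_ip": client_ip, "scheme": scheme,
            "host": environ.get("HTTP_HOST", "")}


def application(environ, start_response):
    """Minimal WSGI app that echoes what it learned from the proxy headers."""
    info = client_info(environ)
    body = f"{info['client_ip']} via {info['scheme']}://{info['host']}\n".encode()
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]
```

Frameworks such as Django and Flask have their own settings for trusting proxy headers, which are usually preferable to hand-rolled parsing.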
Gunicorn Tuning for Python Applications
Gunicorn (Green Unicorn) is a popular WSGI HTTP Server for Python. Its performance is heavily influenced by the number of worker processes and the worker class used.
Worker Processes and Type
The recommended worker type for I/O-bound applications (typical for web apps) is gevent or eventlet, which use asynchronous I/O. For CPU-bound tasks, sync workers might be simpler but less performant under high concurrency. The number of workers is typically calculated as (2 * number_of_cores) + 1.
Example Gunicorn Command Line
# Assuming your Django/Flask app's WSGI entry point is in 'myproject.wsgi'
# For a 4-core instance:
gunicorn --workers 9 \
--worker-class gevent \
--bind 127.0.0.1:8000 \
--timeout 120 \
--graceful-timeout 120 \
--log-level info \
--access-logfile /var/log/gunicorn/access.log \
--error-logfile /var/log/gunicorn/error.log \
myproject.wsgi:application
Explanation:
- --workers 9: For 4 cores, (2 * 4) + 1 = 9 workers. Adjust based on whether your application is I/O-bound or CPU-bound.
- --worker-class gevent: Uses greenlets for concurrency. Requires installing gevent (`pip install gevent`).
- --bind 127.0.0.1:8000: Listens on localhost, port 8000, so Nginx can proxy to it.
- --timeout 120: Workers silent for more than this many seconds are killed and restarted. For sync workers this effectively caps the time spent on one request, which matters for long-running operations; for async workers such as gevent it only checks that the worker is still responsive.
- --graceful-timeout 120: Time to wait for existing requests to finish during a reload.
- --log-level info: Sets the logging verbosity.
- --access-logfile, --error-logfile: Essential for monitoring and debugging. Ensure the log directory exists and has correct permissions.
Gunicorn Configuration File
For more complex configurations or easier management, use a Gunicorn configuration file (e.g., gunicorn_config.py).
# gunicorn_config.py
import multiprocessing

bind = "127.0.0.1:8000"
workers = (multiprocessing.cpu_count() * 2) + 1
worker_class = "gevent"  # or "eventlet"
timeout = 120
graceful_timeout = 120
loglevel = "info"
accesslog = "/var/log/gunicorn/access.log"
errorlog = "/var/log/gunicorn/error.log"
# Custom log format:
# access_log_format = '%(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s"'
Then run Gunicorn with:
gunicorn -c gunicorn_config.py myproject.wsgi:application
PHP-FPM Tuning for PHP Applications
PHP-FPM (FastCGI Process Manager) is the standard for running PHP applications with Nginx. Its performance hinges on the process manager settings.
Process Manager Settings
The primary configuration file is typically /etc/php/X.Y/fpm/php-fpm.conf and pool configurations in /etc/php/X.Y/fpm/pool.d/www.conf (where X.Y is your PHP version).
Tuning www.conf
; /etc/php/8.1/fpm/pool.d/www.conf
[www]
user = www-data
group = www-data
; Unix socket, or a TCP socket like 127.0.0.1:9000
listen = /run/php/php8.1-fpm.sock

; Process manager: dynamic, static, or ondemand
pm = dynamic

; For dynamic PM:
; pm.max_children = 50       ; Maximum number of children alive at once.
; pm.start_servers = 5       ; Number of children created on startup.
; pm.min_spare_servers = 2   ; Minimum idle children; more are spawned below this.
; pm.max_spare_servers = 10  ; Maximum idle children; extras are killed above this.

; For static PM (recommended for predictable load and dedicated servers):
; pm = static
; pm.max_children = 50       ; Fixed number of children.

; For ondemand PM (saves resources when idle):
; pm = ondemand
; pm.max_children = 50
; pm.process_idle_timeout = 10s  ; Idle children are killed after this timeout.

; Adjust based on your instance size and expected load.
; A common starting point for dynamic PM on a 4-core instance:
pm.max_children = 100
pm.start_servers = 10
pm.min_spare_servers = 5
pm.max_spare_servers = 20
; Note: pm.process_idle_timeout only takes effect with pm = ondemand.
pm.process_idle_timeout = 30s

; Request handling
; Maximum execution time for a single request. Keep it in line with
; Nginx's fastcgi_read_timeout (the FastCGI counterpart of proxy_read_timeout).
request_terminate_timeout = 120s
; request_slowlog_timeout = 10s  ; Log scripts that take too long to execute.

; Environment variables
; env[MY_ENV_VAR] = 'value'

; Other useful settings
; catch_workers_output = yes  ; Capture stdout/stderr of workers
; listen.owner = www-data
; listen.group = www-data
; listen.mode = 0660
Tuning Strategy:
- pm: dynamic is a good balance. static is best for consistent high load if you have enough memory. ondemand is good for minimizing idle resource usage.
- pm.max_children: This is the most critical setting, and it is limited by your server’s RAM. As a rough estimate, each PHP-FPM worker might consume 20-50MB of RAM. For a 16GB RAM instance, the upper bound is about (16GB * 1024MB/GB) / 50MB/child ≈ 327 children, before reserving any memory for the OS and other processes. Start lower and monitor.
- pm.start_servers, pm.min_spare_servers, pm.max_spare_servers: These control how quickly PHP-FPM scales up and down. Tune them to avoid constant process churning.
- request_terminate_timeout: Should align with Nginx’s fastcgi_read_timeout (the FastCGI counterpart of proxy_read_timeout) so that neither side gives up long before the other.
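The pm.max_children estimate can be written as a small calculation. This is a minimal sketch: the function name and the reserved_mb parameter (memory held back for the OS, Nginx, and other processes) are assumptions added for illustration, and the per-child figure should come from measuring your own workers.

```python
def estimate_max_children(total_ram_mb: int, avg_child_mb: int,
                          reserved_mb: int = 0) -> int:
    """Rough upper bound for pm.max_children from available memory."""
    if avg_child_mb <= 0:
        raise ValueError("avg_child_mb must be positive")
    # Memory left for PHP-FPM after holding some back for everything else.
    usable_mb = max(total_ram_mb - reserved_mb, 0)
    return usable_mb // avg_child_mb


if __name__ == "__main__":
    # The 16GB / 50MB example from the text, with no memory reserved.
    print(estimate_max_children(16 * 1024, 50))
    # The same instance with a hypothetical 2GB held back for the OS.
    print(estimate_max_children(16 * 1024, 50, reserved_mb=2048))
```

Reserving memory lowers the ceiling noticeably, which is one reason to start below the raw estimate and adjust after monitoring.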
MongoDB Performance Tuning on AWS
MongoDB performance is sensitive to hardware, configuration, and query patterns. On AWS, consider instance types optimized for I/O (e.g., i3, i4i, or r6id, which include local NVMe SSDs) and appropriate EBS volumes.
MongoDB Configuration (`mongod.conf`)
Key parameters in /etc/mongod.conf (or similar path) include storage engine settings, journaling, and cache size.
# /etc/mongod.conf
storage:
  dbPath: /var/lib/mongodb
  journal:
    enabled: true  # Option removed in MongoDB 6.1+, where journaling is always on
  engine: wiredTiger  # Default and recommended
  # WiredTiger specific options
  wiredTiger:
    engineConfig:
      cacheSizeGB: 24  # Absolute size in GB, not a fraction (here 75% of a 32GB instance)
    collectionConfig:
      blockCompressor: snappy  # Default; collections use the global cache
    indexConfig:
      prefixCompression: true  # Default; indexes use the global cache

# Network interfaces
net:
  port: 27017
  bindIp: 0.0.0.0  # Listens on all interfaces; prefer specific IPs for security

# Logging
systemLog:
  destination: file
  path: /var/log/mongodb/mongod.log
  logAppend: true
  verbosity: 0  # 0 is the default informational level; 1 to 5 add increasing debug detail

# Sharding (if applicable)
# sharding:
#   clusterRole: configsvr

# Security (essential for production)
# security:
#   authorization: enabled

# Replication (essential for production)
# replication:
#   replSetName: rs0
Tuning Strategy:
- storage.wiredTiger.engineConfig.cacheSizeGB: This is the most crucial parameter. The value is an absolute size in GB, not a fraction of RAM. Allocate a significant portion of your instance’s RAM to the WiredTiger cache. For a 32GB RAM instance, you might set this to 24 (leaving 8GB for the OS and other processes). Monitor cache hit rates.
- storage.journal.enabled: true: Essential for durability. Do not disable unless you understand the risks.
- net.bindIp: Restrict access to only necessary IPs. Use security groups on AWS.
- systemLog.verbosity: Keep at 0 for production unless debugging. High verbosity impacts I/O.
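For comparison with a hand-picked cache size, the sketch below computes the value MongoDB falls back to when cacheSizeGB is not set: the larger of 50% of (RAM minus 1GB) or 0.25GB. The function names are illustrative, and the 24GB figure is simply the example from the text.

```python
def wiredtiger_default_cache_gb(total_ram_gb: float) -> float:
    """Default WiredTiger cache: the larger of 50% of (RAM - 1 GB) or 0.25 GB."""
    return max(0.5 * (total_ram_gb - 1.0), 0.25)


def cache_share_of_ram(cache_gb: float, total_ram_gb: float) -> float:
    """Fraction of total RAM taken by a given cache size."""
    return cache_gb / total_ram_gb


if __name__ == "__main__":
    ram_gb = 32
    print("default cache (GB):", wiredtiger_default_cache_gb(ram_gb))
    print("share of RAM at 24 GB:", cache_share_of_ram(24, ram_gb))
```

A setting well above the default leaves less memory for the filesystem cache and for connections, so it is worth validating under realistic load before relying on it.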
AWS Specific Considerations for MongoDB
Instance Type: Use instances with local NVMe SSDs (i3, i4i) for best I/O performance. If using EBS, choose gp3 or io2 volumes and provision IOPS/throughput appropriately.
EBS Volumes: For non-local SSD instances, use gp3 volumes. They offer baseline performance and allow independent scaling of IOPS and throughput. Provision enough IOPS and throughput for your workload. For extremely demanding workloads, consider io2 Block Express.
Network: Ensure your EC2 instances are in the same VPC and Availability Zone (or use multi-AZ replication) for low latency. Use Security Groups to control access.
Monitoring MongoDB Performance
Regularly monitor key metrics using mongostat, mongotop, and the MongoDB Cloud Manager/Ops Manager or Prometheus/Grafana stack.
# Real-time connection and operation stats (print 5 rows, then exit)
mongostat --host your_mongo_host:27017 --username your_user --password your_password --authenticationDatabase admin -n 5
# Example output interpretation:
# insert, query, update, delete, getmore, command: operations per second
# res: resident memory
# dirty, used: percentage of the WiredTiger cache that is dirty / in use
# qrw (qr|qw in older versions): queued reads/writes
# arw (ar|aw in older versions): active reads/writes

# Per-collection read/write time, sampled every 10 seconds
# (the ns column is the namespace, i.e. database.collection)
mongotop --host your_mongo_host:27017 --username your_user --password your_password --authenticationDatabase admin 10

# Key metrics to watch:
# - Cache Hit Rate: Aim for > 95% for WiredTiger.
# - Disk I/O: Monitor read/write operations per second and throughput.
# - Query Latency: Identify slow queries.
# - Network Traffic: Ensure sufficient bandwidth.
# - CPU Utilization: High CPU might indicate inefficient queries or insufficient cores.
# - Memory Usage: Monitor RAM usage and ensure the cache is effectively utilized.
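Neither tool reports a cache hit rate directly. One common approximation derives it from two counters in the wiredTiger.cache section of db.serverStatus(). The sketch below takes that section as a plain dictionary so it runs without a database connection; the function name is hypothetical and the sample numbers are made up for illustration.

```python
def wiredtiger_cache_hit_ratio(cache_stats: dict) -> float:
    """Approximate hit ratio from serverStatus().wiredTiger.cache counters."""
    requested = cache_stats["pages requested from the cache"]
    read_in = cache_stats["pages read into cache"]
    if requested == 0:
        return 1.0  # Nothing requested yet, so there were no misses.
    # Pages that had to be read in from disk count as misses.
    return 1.0 - (read_in / requested)


if __name__ == "__main__":
    # Made-up sample values, not real measurements.
    sample = {"pages requested from the cache": 1000, "pages read into cache": 30}
    print(f"hit ratio: {wiredtiger_cache_hit_ratio(sample):.2%}")
```

These counters are cumulative since the server started, so comparing the difference between two samples taken some minutes apart gives a more current picture than the lifetime totals.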
Use explain() on slow queries to identify missing indexes or inefficient query plans.
db.collection.find({ field: "value" }).explain("executionStats")
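The explain output is a nested document, and two signals in it are usually worth checking first: whether the winning plan contains a COLLSCAN stage, and how many documents were examined per document returned. The sketch below works on the output as a plain dictionary. It assumes the classic layout, where queryPlanner.winningPlan nests its stages under inputStage or inputStages; newer server versions may wrap the plan differently. The helper names and the sample document are illustrative.

```python
def find_stages(plan: dict) -> list:
    """Collect stage names from a winningPlan tree (inputStage / inputStages)."""
    stages = [plan.get("stage", "UNKNOWN")]
    if "inputStage" in plan:
        stages += find_stages(plan["inputStage"])
    for child in plan.get("inputStages", []):
        stages += find_stages(child)
    return stages


def summarize_explain(explain: dict) -> dict:
    """Flag collection scans and compute documents examined per document returned."""
    stats = explain["executionStats"]
    stages = find_stages(explain["queryPlanner"]["winningPlan"])
    returned = stats["nReturned"]
    examined = stats["totalDocsExamined"]
    return {
        "collection_scan": "COLLSCAN" in stages,
        "docs_examined_per_returned": examined / returned if returned else float(examined),
        "stages": stages,
    }


if __name__ == "__main__":
    # Hand-written sample shaped like explain("executionStats") output.
    sample = {
        "queryPlanner": {"winningPlan": {"stage": "COLLSCAN"}},
        "executionStats": {"nReturned": 10, "totalDocsExamined": 50000},
    }
    print(summarize_explain(sample))
```

A collection scan combined with a high examined-to-returned ratio usually points to a missing index on the queried field.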
Putting It All Together: AWS Deployment Strategy
A typical setup on AWS for a Shopify backend might involve:
- Load Balancer: AWS Application Load Balancer (ALB) or Network Load Balancer (NLB) in front of Nginx instances.
- Web Servers: EC2 instances running Nginx, configured as described above. Use Auto Scaling Groups for high availability and scalability.
- Application Servers: EC2 instances running Gunicorn/PHP-FPM, also managed by Auto Scaling Groups. These instances would be targeted by Nginx’s proxy_pass.
- Database: AWS RDS for PostgreSQL/MySQL (if used for other services) or self-managed MongoDB on EC2 instances (preferably with local NVMe SSDs or provisioned IOPS EBS) in an Auto Scaling Group or managed via a separate cluster. Consider Amazon DocumentDB if a fully managed MongoDB-compatible service is preferred, though direct EC2 offers more tuning control.
- Caching: ElastiCache for Redis/Memcached for session storage and object caching.
- CDN: CloudFront for serving static assets globally.
This layered approach allows each component to be scaled and optimized independently, providing a robust and performant infrastructure for your Shopify application on AWS.