The Ultimate DevOps Playbook: Tuning Nginx, Gunicorn/FPM, and Redis on Google Cloud for Ruby
Nginx as a High-Performance Frontend Proxy
When deploying Ruby applications on Google Cloud, Nginx serves as an indispensable frontend proxy. Its role extends beyond simple request routing; it handles SSL termination, static file serving, caching, and load balancing, offloading these critical tasks from your application servers. Proper tuning of Nginx is paramount for achieving optimal performance and stability.
Nginx Configuration for Ruby Applications
A robust Nginx configuration for a Ruby application, typically served by Puma or Unicorn, involves several key directives. We’ll focus on optimizing worker processes, connection handling, and buffering.
Worker Processes and Connections
The worker_processes directive sets how many worker processes Nginx spawns. A common practice is to match the number of CPU cores on your instance, or to use auto so that Nginx detects it. The worker_connections directive sets the maximum number of simultaneous connections each worker process can open. The upper bound on total connections is worker_processes * worker_connections, but that count includes connections to upstream servers, so when Nginx acts as a reverse proxy each client typically consumes two of them.
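As a rough illustration of that arithmetic, the sketch below estimates how many clients a given configuration can serve at once. The helper name and the connections_per_client parameter are assumptions made for this example, and the real ceiling also depends on the operating system's open file limit.

```ruby
# Rough upper bound on simultaneous clients for an Nginx instance.
# connections_per_client is an assumption: about 1 when serving static
# files, about 2 when each client also needs an upstream connection.
def nginx_max_clients(worker_processes:, worker_connections:, connections_per_client: 2)
  total_slots = worker_processes * worker_connections
  total_slots / connections_per_client
end

# 4 workers * 1024 connections = 4096 slots; as a reverse proxy: 4096 / 2
puts nginx_max_clients(worker_processes: 4, worker_connections: 1024) # prints 2048
```

With connections_per_client set to 1, the same settings give the full 4096.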
Example Nginx Configuration Snippet
Place worker_processes in the main context of your nginx.conf file (typically /etc/nginx/nginx.conf), worker_connections inside the events block, and the remaining directives inside the http block.
# Adjust worker_processes based on your VM's CPU cores.
# For a 4-core VM, set to 4.
worker_processes 4;

events {
    # Maximum number of simultaneous connections per worker.
    # A common starting point is 1024, but this can be tuned higher.
    worker_connections 1024;
}

# Merge the settings below into your existing http block.
http {
    # Enable Gzip compression for text-based assets.
    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;

    # Increase the buffer size for request bodies.
    client_body_buffer_size 128k;
    client_max_body_size 10m; # Adjust as needed for file uploads

    # Timeouts bound how long slow or idle clients can hold a connection.
    # The first three default to 60s; lower values release resources sooner.
    client_header_timeout 3m;
    client_body_timeout 3m;
    send_timeout 3m;
    keepalive_timeout 65;

    # tcp_nodelay sets TCP_NODELAY; tcp_nopush sets TCP_CORK on Linux.
    # tcp_nopush only takes effect when sendfile is enabled.
    sendfile on;
    tcp_nodelay on;
    tcp_nopush on;
}
Proxying to Puma/Unicorn
When proxying to a Rack application server such as Puma or Unicorn, configure timeouts and buffer sizes so that slow responses are not cut off and data moves smoothly between Nginx and the upstream. (Gunicorn fills the same role for Python WSGI applications; it does not run Ruby code.)
Example Server Block for Upstream Proxying
This configuration assumes your Ruby application server is listening on 127.0.0.1:8000. Adjust the proxy_pass directive accordingly.
server {
    listen 80;
    server_name your_domain.com www.your_domain.com;

    # Serve static files directly from Nginx for performance.
    location /static/ {
        alias /path/to/your/app/public/static/;
        expires 30d;
        access_log off;
    }

    location / {
        proxy_pass http://127.0.0.1:8000; # Or your Puma/Unicorn address
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Proxy timeouts. 60s is the Nginx default for all three; raise
        # proxy_read_timeout if legitimate requests run longer than that.
        proxy_connect_timeout 60s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;

        # Buffering settings
        proxy_buffering on;
        proxy_buffer_size 128k;
        proxy_buffers 8 128k;
        proxy_busy_buffers_size 256k;
    }

    # Optional: SSL configuration (if not handled by Google Cloud Load Balancer)
    # listen 443 ssl;
    # ssl_certificate /etc/letsencrypt/live/your_domain.com/fullchain.pem;
    # ssl_certificate_key /etc/letsencrypt/live/your_domain.com/privkey.pem;
    # include /etc/letsencrypt/options-ssl-nginx.conf;
    # ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem;
}
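The buffer settings above have a memory cost that grows with concurrency. The sketch below is a back-of-the-envelope estimate, assuming the worst case per proxied connection is roughly proxy_buffer_size plus all of proxy_buffers; the helper name and the figure of 500 concurrent connections are hypothetical.

```ruby
KIB = 1024

# Approximate worst-case buffer memory for one proxied connection:
# the header buffer (proxy_buffer_size) plus every body buffer (proxy_buffers).
def proxy_buffer_bytes_per_connection(buffer_size_kib:, buffers_count:, buffers_size_kib:)
  (buffer_size_kib + buffers_count * buffers_size_kib) * KIB
end

per_connection = proxy_buffer_bytes_per_connection(
  buffer_size_kib: 128, buffers_count: 8, buffers_size_kib: 128
)
puts per_connection / KIB # prints 1152 (KiB per connection)

# Hypothetical load: 500 connections all buffering full responses at once.
total_mib = (per_connection * 500.0 / (KIB * KIB)).round(1)
puts total_mib # prints 562.5 (MiB)
```

Most responses use far less than the worst case, but the estimate shows why large buffers deserve a second look on small instances.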
Puma/Unicorn Tuning for Ruby Applications
Your Ruby application server (e.g., Puma, Unicorn) is the heart of your application’s request processing. Tuning its worker count, threads, and timeout settings is critical. We’ll use Puma as the primary example, as it’s a popular choice for Ruby web applications.
Puma Worker and Thread Configuration
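Puma’s concurrency model is based on workers (forked processes) and threads. A common strategy is to run multiple workers, each with multiple threads. On MRI, the global VM lock allows only one thread per process to execute Ruby code at a time, so workers provide CPU parallelism while threads mainly help overlap I/O waits. The optimal numbers therefore depend on whether your application is I/O-bound or CPU-bound and on the available CPU cores.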
Example Puma Command Line Arguments
When starting Puma, you can specify these parameters. For a typical Google Cloud instance with 4 vCPUs, a good starting point might be 2 workers, each with 4 threads.
# Example for a Rails application
bundle exec puma -C config/puma.rb -w 2 -t 4 --bind tcp://127.0.0.1:8000
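To see what a workers/threads choice implies, the sketch below works out the resulting request concurrency and the database connections it could demand. PumaPlan is a hypothetical helper written for this example, not part of Puma, and it assumes each thread may hold one database connection at a time.

```ruby
# Hypothetical planning helper; not part of Puma's API.
PumaPlan = Struct.new(:workers, :max_threads, keyword_init: true) do
  # Requests that can be in flight at once across the whole cluster.
  def concurrent_requests
    workers * max_threads
  end

  # Assumption: each thread may hold one database connection, so the
  # per-process connection pool should be at least max_threads.
  def min_db_pool_per_worker
    max_threads
  end

  # Upper bound on connections the database must accept from this instance.
  def max_db_connections
    workers * min_db_pool_per_worker
  end
end

plan = PumaPlan.new(workers: 2, max_threads: 4)
puts plan.concurrent_requests # prints 8
puts plan.max_db_connections  # prints 8
```

If several instances share one database, multiply max_db_connections by the instance count and compare it with the database's connection limit.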
Puma Configuration File (config/puma.rb)
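For more granular control, use a puma.rb configuration file. This allows for more sophisticated settings, such as environment-specific configuration and worker lifecycle hooks. Note that Puma 5.0 removed built-in daemonization; run Puma in the foreground under a process manager such as systemd instead.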
# config/puma.rb
# Number of workers. Adjust based on CPU cores.
# For a 4-core VM, 2-4 workers is a good starting point.
workers 2
# Number of threads per worker.
# If your app is I/O bound, you might increase this.
# If CPU bound, keep it closer to 1.
threads 0, 4 # Min threads 0, Max threads 4
# Bind to a TCP socket for Nginx to proxy to.
bind "tcp://127.0.0.1:8000"
# Environment (e.g., "production")
environment ENV.fetch("RAILS_ENV") { "production" }
# Run Puma in the foreground under a process manager such as systemd.
# The daemonize option was removed in Puma 5.0, so it is omitted here.
# PID and state files
pidfile "tmp/pids/puma.pid"
state_path "tmp/pids/puma.state"
# Request logging
log_requests true
# Seconds a worker gets to finish in-flight requests on shutdown
# before the master process kills it.
worker_shutdown_timeout 15
# Preload application code
preload_app!
# Callbacks for worker startup/shutdown
on_worker_boot do
  # Worker specific setup
  ActiveRecord::Base.establish_connection if defined?(ActiveRecord)
end
on_worker_shutdown do
  # Worker specific cleanup
end
# Worker timeout (in seconds): how long the master waits for a worker to check in
# before restarting it. This guards against hung processes; it is not a request timeout.
worker_timeout 60
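Hard-coded worker and thread counts make it awkward to reuse one puma.rb across differently sized instances. A minimal sketch of reading them from the environment follows; the variable names WEB_CONCURRENCY, RAILS_MIN_THREADS, and RAILS_MAX_THREADS are a common convention assumed here, and puma_settings is a hypothetical helper.

```ruby
# Hypothetical helper: derive Puma settings from environment variables,
# falling back to the values used in the configuration above.
def puma_settings(env = ENV)
  workers     = Integer(env.fetch("WEB_CONCURRENCY", "2"))
  max_threads = Integer(env.fetch("RAILS_MAX_THREADS", "4"))
  min_threads = Integer(env.fetch("RAILS_MIN_THREADS", "0"))
  raise ArgumentError, "min threads exceed max threads" if min_threads > max_threads

  { workers: workers, min_threads: min_threads, max_threads: max_threads }
end

# Passing an explicit hash keeps the example independent of the real environment.
p puma_settings({ "WEB_CONCURRENCY" => "4", "RAILS_MAX_THREADS" => "8" })

# Inside config/puma.rb the result could be applied like this:
#   settings = puma_settings
#   workers settings[:workers]
#   threads settings[:min_threads], settings[:max_threads]
```

Integer() raises on malformed input such as "four", which surfaces a bad deployment variable at boot instead of silently falling back.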
Gunicorn (for Python, but principles apply)
Gunicorn is a Python WSGI server and cannot run Ruby code, but its worker and thread concepts are similar. For a Ruby app, use a Rack-compatible server such as Puma or Unicorn. For comparison, the equivalent command for a Python app might look like:
# Example for a Python app with Gunicorn
gunicorn --workers 4 --threads 2 --bind 127.0.0.1:8000 myapp.wsgi:application
Redis Performance Tuning on Google Cloud
Redis is often used as a cache, session store, or message broker. Optimizing its performance on Google Cloud involves both Redis configuration and understanding its interaction with your application and network.
Redis Configuration (`redis.conf`)
Key parameters in redis.conf for performance include memory management, persistence, and network settings.
# redis.conf
# Note: keep comments on their own lines; Redis does not accept trailing comments after a directive.

# Max memory to use. Crucial for preventing Redis from consuming all available RAM.
# Set this to a value less than your instance's total RAM, leaving room for the OS and other processes.
# Example for a 4GB RAM instance:
maxmemory 3gb
# Eviction policy: LRU is common for caching.
maxmemory-policy allkeys-lru

# Persistence settings: RDB snapshots are good for backups, AOF for durability.
# For caching, you might disable persistence or use RDB only.
# For session stores, AOF might be preferred.
# Snapshot after 900 seconds (15 minutes) if at least 1 key changed
save 900 1
# Snapshot after 300 seconds (5 minutes) if at least 10 keys changed
save 300 10
# Snapshot after 60 seconds (1 minute) if at least 10000 keys changed
save 60 10000
# Set to 'yes' for AOF persistence. For pure caching, 'no' is fine.
appendonly no

# TCP settings
# Default is 511; can be increased if you see connection issues under high load.
# The kernel's net.core.somaxconn must be at least as large for this to take effect.
tcp-backlog 511
# Send TCP keepalive probes to idle clients every 300 seconds to detect dead peers.
tcp-keepalive 300

# Performance tuning for I/O
# Enable I/O threading (Redis 6.0+) for better multi-core utilization.
# io-threads 4
# If using io-threads, also use them for reads.
# io-threads-do-reads yes
Google Cloud Network Considerations
Network latency between your application servers and Redis is a significant factor. Ensure your Redis instance and application servers are in the same Google Cloud region and, ideally, the same zone or VPC network subnet to minimize latency.
Monitoring and Diagnostics
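Regularly monitor Redis performance with the INFO command. MONITOR streams every command the server processes, which is useful for short debugging sessions but adds noticeable overhead, so avoid leaving it running in production. In the INFO output, pay attention to: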
used_memory: Track memory consumption against maxmemory.
instantaneous_ops_per_sec: High values indicate heavy load.
keyspace_hits and keyspace_misses: Monitor cache hit ratio.
evicted_keys: Indicates memory pressure and aggressive eviction.
rejected_connections: Rises if Redis is overloaded or maxclients is reached.
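The cache hit ratio is keyspace_hits / (keyspace_hits + keyspace_misses). The sketch below computes it from raw INFO text using only the Ruby standard library; both helper names are hypothetical, and in practice the text would come from your Redis client rather than a heredoc.

```ruby
# Parse the "key:value" lines of Redis INFO output into a Hash of strings.
# Section headers (lines starting with "#") and blank lines are skipped.
def parse_redis_info(text)
  text.each_line.with_object({}) do |line, stats|
    line = line.strip
    next if line.empty? || line.start_with?("#")

    key, value = line.split(":", 2)
    stats[key] = value if value
  end
end

# Returns hits / (hits + misses), or nil when no lookups have been recorded.
def cache_hit_ratio(stats)
  hits   = stats.fetch("keyspace_hits", "0").to_i
  misses = stats.fetch("keyspace_misses", "0").to_i
  total  = hits + misses
  total.zero? ? nil : hits.to_f / total
end

sample = <<~INFO
  # Stats
  keyspace_hits:45000000
  keyspace_misses:5000000
  evicted_keys:5000
INFO

puts cache_hit_ratio(parse_redis_info(sample)) # prints 0.9
```

Both counters are cumulative since the server started, so comparing two samples taken a few minutes apart gives a better picture of the current hit ratio than a single reading.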
Example Redis `INFO` Output Analysis
An abridged, annotated sample follows. The trailing # notes are annotations and do not appear in real INFO output.
# Server
redis_version:6.2.6
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:abcdef123456

# Clients
connected_clients:150
client_recent_max_input_buffer:2
client_recent_max_output_buffer:0
blocked_clients:0

# Memory
used_memory:1073741824 # 1GB used
used_memory_human:1.00G
used_memory_rss:1258291200 # 1.2GB RSS (includes overhead)
used_memory_peak:1500000000 # Peak memory usage
used_memory_peak_human:1.40G
total_system_memory:4000000000 # 4GB total system memory
total_system_memory_human:3.73G
maxmemory:3221225472 # 3GB maxmemory limit
maxmemory_human:3.00G
maxmemory_policy:allkeys-lru

# Persistence
rdb_bgsave_in_progress:0
rdb_last_save_time:1678886400
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:0
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1

# Stats
total_connections_received:1000000
total_commands_processed:50000000
instantaneous_ops_per_sec:15000
total_net_input_bytes:10000000000
total_net_output_bytes:20000000000
instantaneous_input_kbps:1000.00
instantaneous_output_kbps:2000.00
rejected_connections:0
sync_full:1
sync_partial_ok:0
sync_partial_err:0
expired_keys:0
evicted_keys:5000 # Indicates memory pressure, keys are being removed.
keyspace_hits:45000000
keyspace_misses:5000000

# Keyspace
db0:keys=1000000,expires=0,avg_ttl=0
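In this example, used_memory (1 GB) is well below maxmemory (3 GB), yet evicted_keys is nonzero. Because evicted_keys is a cumulative counter since the server started (or since the last CONFIG RESETSTAT), this means Redis reached its memory limit and removed keys at some earlier point; watch how fast the counter grows rather than its absolute value. If evictions are undesirable for your caching strategy, increase maxmemory (if system RAM allows), shrink the data set with shorter TTLs, or choose an eviction policy that better matches your workload, such as volatile-lru, which evicts only keys that have an expiry set.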
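Since evicted_keys only ever counts upward between restarts, the useful signal is its rate of change. The sketch below derives evictions per second from two readings; the helper name and the second reading of 5600 taken 60 seconds later are hypothetical values chosen for illustration.

```ruby
# Hypothetical helper: evictions per second between two evicted_keys readings.
def evictions_per_second(previous_count, current_count, elapsed_seconds)
  raise ArgumentError, "elapsed_seconds must be positive" unless elapsed_seconds.positive?

  delta = current_count - previous_count
  # A negative delta means the counter was reset (restart or CONFIG RESETSTAT),
  # so no meaningful rate can be computed for this interval.
  return nil if delta.negative?

  delta.to_f / elapsed_seconds
end

puts evictions_per_second(5000, 5600, 60) # prints 10.0
```

A rate that stays at zero means the eviction count is historical; a sustained nonzero rate means Redis is currently operating at its memory limit.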