The Ultimate DevOps Playbook: Tuning Nginx, Gunicorn/FPM, and Elasticsearch on Google Cloud for Ruby
Nginx Tuning for High-Traffic Ruby Applications on Google Cloud
Optimizing Nginx is paramount for serving high-traffic Ruby applications, especially when leveraging Google Cloud Platform (GCP). This section details critical Nginx configurations for performance, security, and scalability, focusing on worker processes, connection handling, caching, and SSL/TLS optimization.
Worker Processes and Connections
The `worker_processes` directive dictates how many worker processes Nginx will spawn. A common best practice is to set this to the number of CPU cores available on your instance. Setting it to `auto` lets Nginx detect the core count itself, which suits environments where the instance size may change. The `worker_connections` directive sets the maximum number of simultaneous connections that each worker process can handle, so the theoretical total is worker_processes multiplied by worker_connections. When Nginx proxies to an application server, each client also consumes an upstream connection, so the practical client capacity is roughly half of that total. Every connection needs a file descriptor, so ensure your system’s file descriptor limits are also increased to accommodate these connections.
Systemd Service File for File Descriptor Limits
To increase file descriptor limits for Nginx, modify its systemd service file. This ensures that Nginx can handle a large number of open connections without hitting OS-level limits.
[Unit]
Description=The Nginx HTTP Server
After=syslog.target network.target remote-fs.target nss-lookup.target

[Service]
Type=forking
PIDFile=/run/nginx.pid
ExecStartPre=/usr/sbin/nginx -t
ExecStart=/usr/sbin/nginx
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s QUIT $MAINPID
PrivateTmp=true
# Increased file descriptor limit
LimitNOFILE=65536
# Increased process limit
LimitNPROC=65536

[Install]
WantedBy=multi-user.target
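The preferred way to apply these limits is a drop-in override: run sudo systemctl edit nginx, which creates /etc/systemd/system/nginx.service.d/override.conf, and add only a [Service] section containing the two Limit directives. Alternatively, place a complete unit file such as the one above at /etc/systemd/system/nginx.service. Either way, reload the systemd daemon and restart Nginx afterward: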
sudo systemctl daemon-reload
sudo systemctl restart nginx
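To confirm that the new limit actually reached the running process, you can inspect /proc/<pid>/limits for the Nginx master process on Linux. The sketch below parses that file's text; the helper name `parse_nofile_limits` and the sample text are illustrative assumptions, and values reported as "unlimited" are not handled.

```ruby
# Hypothetical helper: extract the soft and hard "Max open files" limits
# from the text of /proc/<pid>/limits (Linux only).
# Note: a value of "unlimited" would be returned as 0 by this sketch.
def parse_nofile_limits(limits_text)
  line = limits_text.each_line.find { |l| l.start_with?("Max open files") }
  return nil unless line

  soft, hard = line.sub("Max open files", "").split.first(2)
  { soft: soft.to_i, hard: hard.to_i }
end

# Sample text in the layout the kernel prints. On a live host you would
# pass File.read("/proc/#{pid}/limits") with the Nginx master PID instead.
sample = <<~LIMITS
  Limit                     Soft Limit           Hard Limit           Units
  Max processes             65536                65536                processes
  Max open files            65536                65536                files
LIMITS

p parse_nofile_limits(sample)
```

If the reported soft limit is still the distribution default, the override was most likely not picked up, for example because the daemon was not reloaded.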
Nginx Configuration Snippet
In your main Nginx configuration file (e.g., /etc/nginx/nginx.conf), set `worker_processes` at the top level (the main context) and the connection directives within the events block:
worker_processes auto; # Or set to the number of CPU cores
events {
worker_connections 4096; # Adjust based on expected load and system limits
multi_accept on;
}
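As a rough sketch of how these directives combine, the Ruby helpers below estimate client capacity and check it against a file descriptor limit. The helper names and the example figure of 4 worker processes are illustrative assumptions; halving the total for proxied traffic reflects that each proxied client holds one client-side and one upstream connection.

```ruby
# Hypothetical helper: estimate how many clients Nginx can serve at once.
# worker_connections counts every connection a worker holds, including
# connections to upstream servers, so proxied traffic uses two per client.
def nginx_client_capacity(worker_processes:, worker_connections:, proxied: true)
  total = worker_processes * worker_connections
  proxied ? total / 2 : total
end

# Each connection needs a file descriptor, so worker_connections should
# stay below the per-process open-file limit (LimitNOFILE).
def fits_fd_limit?(worker_connections:, nofile_limit:)
  worker_connections < nofile_limit
end

capacity = nginx_client_capacity(worker_processes: 4, worker_connections: 4096)
puts "Approximate proxied client capacity: #{capacity}"
puts "Within fd limit: #{fits_fd_limit?(worker_connections: 4096, nofile_limit: 65_536)}"
```

These numbers are upper bounds only; real capacity also depends on memory, upstream latency, and keepalive behavior, so load testing remains the deciding measurement.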
Gzip Compression and Buffering
Enabling Gzip compression significantly reduces the bandwidth required to transfer assets, leading to faster load times. Buffering directives control how Nginx handles request and response bodies, which can impact memory usage and latency.
http {
# ... other http settings ...
gzip on;
gzip_vary on;
gzip_proxied any;
gzip_comp_level 6;
gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript image/svg+xml;
# Buffering settings
proxy_buffering on;
proxy_buffer_size 16k;
proxy_buffers 8 16k;
proxy_busy_buffers_size 32k;
proxy_temp_file_write_size 32k;
}
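To get a feel for what compression level 6 does to a text payload, the following sketch gzips a synthetic JSON document with Ruby's standard Zlib library. The payload is made up for illustration, and real savings depend on how repetitive your own responses are.

```ruby
require "zlib"
require "json"

# Build a repetitive JSON payload, similar in spirit to an API response.
payload = JSON.generate(
  Array.new(200) { |i| { id: i, name: "item-#{i}", status: "active" } }
)

# Compress at level 6, the same level as gzip_comp_level in the config above.
compressed = Zlib.gzip(payload, level: 6)

ratio = compressed.bytesize.to_f / payload.bytesize
puts "original: #{payload.bytesize} bytes, gzipped: #{compressed.bytesize} bytes"
puts format("compressed size is %.0f%% of the original", ratio * 100)
```

Higher levels trade more CPU time for smaller gains, which is why a mid-range level is a common compromise.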
SSL/TLS Optimization
For secure connections, SSL/TLS optimization is crucial. This includes enabling HTTP/2, optimizing cipher suites, and leveraging session caching.
server {
listen 443 ssl http2; # Enable HTTP/2 (on Nginx 1.25.1+, use "listen 443 ssl;" plus a separate "http2 on;")
server_name your_domain.com;
ssl_certificate /etc/letsencrypt/live/your_domain.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/your_domain.com/privkey.pem;
# Modern TLS configuration
ssl_protocols TLSv1.2 TLSv1.3;
ssl_prefer_server_ciphers on;
ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384;
ssl_session_cache shared:SSL:10m; # Adjust size as needed
ssl_session_timeout 10m;
ssl_session_tickets off; # Consider security implications
# OCSP Stapling
ssl_stapling on;
ssl_stapling_verify on;
resolver 8.8.8.8 8.8.4.4 valid=300s; # Google DNS, adjust if necessary
resolver_timeout 5s;
# ... rest of your server configuration ...
}
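For sizing `ssl_session_cache`, the Nginx documentation states that one megabyte holds about 4000 sessions. The sketch below turns that approximation into two small helpers; the helper names and the 100,000-session example are illustrative assumptions.

```ruby
# Per the Nginx documentation, 1 MB of shared session cache holds
# roughly 4000 sessions; treat this as an approximation.
SESSIONS_PER_MB = 4000

# Megabytes needed to hold the expected number of cached sessions.
def session_cache_mb(expected_sessions)
  (expected_sessions.to_f / SESSIONS_PER_MB).ceil
end

# Approximate number of sessions a cache of the given size can hold.
def session_cache_capacity(cache_mb)
  cache_mb * SESSIONS_PER_MB
end

puts "10m cache holds about #{session_cache_capacity(10)} sessions"
puts "100,000 sessions need about #{session_cache_mb(100_000)}m"
```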
Gunicorn/Puma Tuning for Ruby Applications
The application server (Gunicorn for Python, Puma for Ruby) is the bridge between Nginx and your application code. Proper configuration here is vital for handling concurrent requests efficiently.
Gunicorn Configuration (if applicable for Python-based services)
For Python applications, Gunicorn’s worker count and type significantly impact performance. The number of workers should generally be (2 * number_of_cores) + 1. The worker class also matters; gevent or eventlet are good for I/O-bound applications, while sync is simpler but less efficient for concurrency.
# Example Gunicorn command
gunicorn --workers 5 \
--worker-class gevent \
--bind 0.0.0.0:8000 \
your_app.wsgi:application
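The worker rule of thumb can be computed rather than hard-coded. This sketch uses Ruby's standard `Etc.nprocessors` to read the core count and prints a matching command line; the helper name `gunicorn_workers` is an illustrative assumption. The 5 workers in the example command correspond to a 2-core instance.

```ruby
require "etc"

# Rule of thumb from the text: (2 * cores) + 1 workers.
def gunicorn_workers(cores)
  (2 * cores) + 1
end

cores = Etc.nprocessors
puts "gunicorn --workers #{gunicorn_workers(cores)} --worker-class gevent " \
     "--bind 0.0.0.0:8000 your_app.wsgi:application"
```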
In a GCP environment, consider using a managed service like Cloud Run or App Engine Flexible for easier scaling and management of your Python applications, which abstract away some of these Gunicorn tuning concerns.
Puma Configuration (for Ruby Applications)
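Puma, a popular choice for Ruby, can run in single mode (one process with a thread pool) or in cluster mode (several forked worker processes, each with its own thread pool). For typical web applications, cluster mode with a small thread pool per worker is effective: because MRI’s global VM lock lets only one thread execute Ruby code at a time within a process, workers provide CPU parallelism while threads overlap I/O waits. The -w flag sets the number of worker processes, and -t sets the number of threads per worker (optionally as a min:max range).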
# Example Puma command (often managed by systemd or similar)
# For a 4-core instance:
# 2 worker processes, each with 5 threads
puma -w 2 -t 5 --bind tcp://0.0.0.0:9292 --pidfile /var/run/puma.pid /path/to/your/app/config.ru
The optimal ratio of workers to threads depends heavily on your application’s I/O patterns and CPU-bound versus memory-bound characteristics. A common starting point is to set the total number of threads (workers * threads_per_worker) to roughly 2 * number_of_cores, and then adjust based on performance testing.
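The starting point described above can be sketched as a small calculation. The helper name `puma_sizing` and the default of 5 threads per worker are assumptions for illustration; the result is only a first guess to refine with load tests.

```ruby
# Hypothetical helper: pick a worker count so that
# workers * threads_per_worker is close to 2 * cores.
def puma_sizing(cores:, threads_per_worker: 5)
  target_threads = 2 * cores
  workers = [(target_threads.to_f / threads_per_worker).round, 1].max
  {
    workers: workers,
    threads: threads_per_worker,
    total_threads: workers * threads_per_worker
  }
end

sizing = puma_sizing(cores: 4)
puts "puma -w #{sizing[:workers]} -t #{sizing[:threads]}:#{sizing[:threads]}"
```

For a 4-core instance this yields 2 workers of 5 threads each, matching the example command. CPU-heavy applications may prefer more workers with fewer threads, at the cost of extra memory per worker.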
Elasticsearch Tuning on Google Cloud
Elasticsearch performance is critical for search functionality. Tuning involves JVM heap size, shard allocation, and indexing strategies. For GCP, consider using Elasticsearch Service on Elastic Cloud or self-managing on Compute Engine instances.
JVM Heap Size
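The JVM heap size is arguably the most critical Elasticsearch tuning parameter. Set it to no more than 50% of the total system RAM, leaving the remainder for the operating system’s filesystem cache, and keep it below the 30-32GB threshold above which the JVM can no longer use compressed ordinary object pointers (compressed oops). Set the minimum (-Xms) and maximum (-Xmx) to the same value, either in jvm.options (or a file under jvm.options.d/ on recent versions) or through the ES_JAVA_OPTS environment variable. The older ES_HEAP_SIZE variable is no longer supported as of Elasticsearch 5.0.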
# In /etc/elasticsearch/jvm.options or via environment variables
-Xms4g
-Xmx4g
For a GCP instance with 8GB RAM, setting -Xms4g -Xmx4g is a reasonable starting point. Restart Elasticsearch after changing this.
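The two heap rules (at most half of RAM, and below the compressed-oops threshold) can be combined in a short calculation. In this sketch the 31 GB cap is an assumed conservative value within the 30-32GB range, and the helper name is hypothetical; check the actual cutoff for your JVM.

```ruby
# Assumed cap: 31 GB, chosen to stay under the ~32 GB compressed-oops
# threshold. The exact cutoff varies by JVM, so verify it on your nodes.
COMPRESSED_OOPS_CAP_GB = 31

# Half of system RAM (integer GB), capped to keep compressed oops enabled.
def recommended_heap_gb(total_ram_gb)
  [total_ram_gb / 2, COMPRESSED_OOPS_CAP_GB].min
end

[8, 32, 128].each do |ram|
  heap = recommended_heap_gb(ram)
  puts "#{ram} GB RAM -> -Xms#{heap}g -Xmx#{heap}g"
end
```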
Shard Allocation and Size
The number of primary shards per index impacts performance. Aim for primary shard sizes between 10GB and 50GB. Too many small shards increase overhead; too few large shards can hinder recovery and rebalancing. Elasticsearch’s default shard allocation settings are generally good, but monitor the cluster health API.
# Example: Creating an index with a specific number of primary shards
# number_of_shards: adjust based on expected data volume and query load
# number_of_replicas: adjust based on availability needs
PUT /my-index
{
  "settings": {
    "index": {
      "number_of_shards": 3,
      "number_of_replicas": 1
    }
  }
}
Use the Cluster Allocation Explain API to diagnose shard placement issues.
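The 10GB to 50GB sizing guidance in this subsection can be turned into a quick estimate of `number_of_shards`. The helper name and the 30 GB default target are assumptions chosen as a midpoint of that range, and the calculation ignores growth over time and query load.

```ruby
# Hypothetical helper: number of primary shards needed so each shard
# stays at or below target_shard_gb (30 GB is an assumed midpoint of
# the 10-50 GB range).
def primary_shard_count(expected_index_gb, target_shard_gb: 30)
  raise ArgumentError, "target_shard_gb must be positive" unless target_shard_gb.positive?

  [(expected_index_gb.to_f / target_shard_gb).ceil, 1].max
end

[5, 90, 100].each do |size_gb|
  puts "#{size_gb} GB index -> #{primary_shard_count(size_gb)} primary shard(s)"
end
```

Because the primary shard count is fixed when an index is created, it is worth running an estimate like this against projected data volume rather than current volume.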
Indexing Performance
For high-volume indexing, consider disabling refresh by setting `refresh_interval` to -1 during bulk indexing operations and restoring it afterward. Also, tune bulk request sizes and the number of concurrent indexing clients, and watch the write thread pool queue for rejected requests.
# Temporarily disable refresh for bulk indexing
PUT /my-index/_settings
{
"index": {
"refresh_interval": "-1"
}
}
# Perform bulk indexing...
# Re-enable refresh (e.g., every 5 seconds)
PUT /my-index/_settings
{
"index": {
"refresh_interval": "5s"
}
}
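When this sequence is scripted, the main risk is leaving refresh disabled after a failed load. The sketch below wraps the load in a block and restores the interval in an `ensure` clause. The `client` object and its `put_settings` method are a hypothetical interface invented for this example, not the API of the official Elasticsearch gem; a stand-in client records the calls so the sketch runs without a cluster.

```ruby
require "json"

# Hypothetical wrapper: disable refresh, run the bulk load, and always
# restore the interval, even if the load raises.
# `client` is any object with put_settings(index, body_json).
def with_refresh_disabled(client, index, restore_to: "5s")
  client.put_settings(index, JSON.generate({ index: { refresh_interval: "-1" } }))
  yield
ensure
  client.put_settings(index, JSON.generate({ index: { refresh_interval: restore_to } }))
end

# Stand-in client that records calls instead of sending HTTP requests.
class RecordingClient
  attr_reader :calls

  def initialize
    @calls = []
  end

  def put_settings(index, body_json)
    @calls << [index, JSON.parse(body_json)]
  end
end

client = RecordingClient.new
with_refresh_disabled(client, "my-index") { puts "bulk indexing here..." }
client.calls.each { |index, body| puts "PUT /#{index}/_settings #{body}" }
```

In a real script, the stand-in would be replaced by a thin adapter that issues the two PUT requests shown above.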
Monitor the Elasticsearch `_cat/thread_pool` API to understand thread pool usage and identify potential bottlenecks.
Monitoring and Diagnostics on GCP
Effective monitoring is key to identifying performance issues before they impact users. Leverage GCP’s built-in tools and integrate them with your application stack.
Nginx and Application Server Metrics
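Use Nginx’s `stub_status` module to expose active connections, cumulative counters for accepted connections, handled connections, and requests (from which requests per second can be derived), and the number of connections in the reading, writing, and waiting states. For Ruby applications, integrate a library such as the prometheus-client gem (the Prometheus Ruby client) to expose application-level metrics (e.g., request latency, error rates) that can be scraped by Prometheus.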
# In nginx.conf, within http block
http {
# ...
server {
# ...
location /nginx_status {
stub_status;
allow 127.0.0.1; # Restrict access
deny all;
}
# ...
}
# ...
}
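The status page returns a small plain-text body. A minimal sketch of parsing it in Ruby follows; the helper names and sample figures are illustrative, and the request rate is derived from two samples because the module only exposes cumulative counters.

```ruby
# Parse the plain-text body served by stub_status into a hash.
def parse_stub_status(body)
  counters = body[/^[ \t]*(\d+)[ \t]+(\d+)[ \t]+(\d+)[ \t]*$/]
  accepts, handled, requests = counters.to_s.split.map(&:to_i)
  {
    active: body[/Active connections:\s+(\d+)/, 1].to_i,
    accepts: accepts,
    handled: handled,
    requests: requests,
    reading: body[/Reading:\s+(\d+)/, 1].to_i,
    writing: body[/Writing:\s+(\d+)/, 1].to_i,
    waiting: body[/Waiting:\s+(\d+)/, 1].to_i
  }
end

# Requests per second between two parsed samples taken interval_seconds apart.
def requests_per_second(previous, current, interval_seconds)
  (current[:requests] - previous[:requests]) / interval_seconds.to_f
end

sample = <<~STATUS
  Active connections: 291
  server accepts handled requests
   16630948 16630948 31070465
  Reading: 6 Writing: 179 Waiting: 106
STATUS

status = parse_stub_status(sample)
p status
```

A gap between `accepts` and `handled` indicates dropped connections, often because `worker_connections` or a resource limit was reached.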
GCP’s Operations Suite (formerly Stackdriver) can ingest these metrics, providing dashboards and alerting capabilities.
Elasticsearch Monitoring
Utilize the Elasticsearch Monitoring features, often integrated with Kibana, to track cluster health, node statistics, JVM usage, indexing rates, and search performance. GCP’s Operations Suite can also ingest logs from Elasticsearch nodes for centralized analysis.
# Example: Checking cluster health
curl -X GET "localhost:9200/_cluster/health?pretty"
# Example: Checking node stats
curl -X GET "localhost:9200/_nodes/stats?pretty"
Set up alerts in GCP Operations Suite for critical Elasticsearch conditions such as high CPU utilization, low disk space, or unhealthy cluster status.
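As one way to feed such alerts, a small script can translate the cluster health response into a severity. This sketch parses a hard-coded sample response; the severity names and the helper are illustrative assumptions, while `status` and `unassigned_shards` are fields of the cluster health API response.

```ruby
require "json"

# Map the "status" field of GET /_cluster/health to an alert severity.
# The severity names are illustrative assumptions.
SEVERITY = { "green" => :ok, "yellow" => :warning, "red" => :critical }.freeze

def health_severity(health_json)
  health = JSON.parse(health_json)
  {
    severity: SEVERITY.fetch(health["status"], :unknown),
    unassigned_shards: health["unassigned_shards"].to_i
  }
end

# Abbreviated sample response; a live check would use the curl output instead.
sample = '{"cluster_name":"demo","status":"yellow","number_of_nodes":1,"unassigned_shards":3}'
p health_severity(sample)
```

A yellow status on a single-node cluster is expected whenever replicas are configured, since replica shards cannot be assigned to the node that holds their primary.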
Conclusion
Tuning Nginx, your application server (Gunicorn/Puma), and Elasticsearch is an ongoing process. This playbook provides a solid foundation for optimizing your Ruby stack on Google Cloud. Remember to benchmark changes, monitor performance continuously, and iterate based on real-world usage patterns.