The Ultimate DevOps Playbook: Tuning Nginx, Gunicorn/FPM, and Elasticsearch on Linode for Ruby
Nginx as a High-Performance Frontend for Ruby Applications
When deploying Ruby applications, particularly those built with frameworks like Ruby on Rails or Sinatra, Nginx serves as an indispensable frontend. Its strengths lie in efficiently handling static assets, SSL termination, request buffering, and load balancing. For optimal performance, we’ll focus on tuning Nginx for maximum throughput and minimal latency.
Core Nginx Configuration Tuning
The primary configuration file, typically located at /etc/nginx/nginx.conf, contains global settings. Key directives to consider for performance include:
- worker_processes: Set this to the number of CPU cores available on your Linode instance, or use auto. Too few can lead to underutilization; too many can cause context-switching overhead.
- worker_connections: The maximum number of simultaneous connections a worker process can handle. A common starting point is 1024 or higher, depending on expected traffic.
- keepalive_timeout: Controls how long an idle HTTP connection remains open. A lower value frees resources sooner, while a higher value can help clients on high-latency links. 65 seconds, slightly below the 75-second default, is a common choice.
- sendfile: Set to on to enable zero-copy transfer of file data inside the kernel, which speeds up static file delivery.
- tcp_nopush and tcp_nodelay: Setting these to on can improve network efficiency. tcp_nopush sends the response header and the start of a file in full packets (it only takes effect when sendfile is on), and tcp_nodelay avoids delaying small writes on keepalive connections.
Here’s an example snippet from nginx.conf:
worker_processes auto; # Or set to the number of CPU cores
events {
worker_connections 4096; # Adjust based on expected load
multi_accept on;
}
http {
include /etc/nginx/mime.types;
default_type application/octet-stream;
sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
keepalive_requests 1000; # Limit requests per keepalive connection
# Gzip compression for dynamic content
gzip on;
gzip_vary on;
gzip_proxied any;
gzip_comp_level 6;
gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;
# Buffering settings for upstream connections
proxy_buffering on;
proxy_buffer_size 16k;
proxy_buffers 8 32k;
proxy_busy_buffers_size 64k;
# Include server configurations
include /etc/nginx/conf.d/*.conf;
include /etc/nginx/sites-enabled/*;
}
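To see what the worker settings above mean in practice, here is a rough capacity estimate in Ruby. It is a sketch, not a benchmark: the helper names are made up for illustration, and the real ceiling also depends on file descriptor limits (see worker_rlimit_nofile) and on how fast the upstream responds. The key point is that worker_connections counts every connection a worker holds, including the one to the upstream, so a proxied request uses two.

```ruby
# Rough capacity estimate for Nginx (hypothetical helper names).
# worker_connections covers all connections held by a worker,
# including connections to proxied upstream servers.

# Static files: one connection per client.
def max_static_clients(worker_processes:, worker_connections:)
  worker_processes * worker_connections
end

# Proxied requests: one client connection plus one upstream connection.
def max_proxied_clients(worker_processes:, worker_connections:)
  (worker_processes * worker_connections) / 2
end

# Example: a 4-core instance with worker_connections 4096.
puts max_static_clients(worker_processes: 4, worker_connections: 4096)   # 16384
puts max_proxied_clients(worker_processes: 4, worker_connections: 4096)  # 8192
```

These are upper bounds; Puma's own worker and thread counts will usually limit dynamic request concurrency long before Nginx does.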
Optimizing Static Asset Delivery
Nginx excels at serving static files directly, bypassing the Ruby application entirely. This is crucial for performance. Configure your server block to leverage this:
server {
listen 80;
server_name your_domain.com www.your_domain.com;
root /var/www/your_app/public; # Path to your Rails/Sinatra public directory
# Serve static assets directly
location ~ ^/(assets|images|javascripts|stylesheets|system)/ {
expires 1y;
add_header Cache-Control "public";
try_files $uri $uri/ =404;
}
# Proxy requests to the application server (Puma/Unicorn)
location / {
proxy_pass http://unix:/run/your_app.sock; # Or http://127.0.0.1:8000
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_read_timeout 300s; # Increase timeout for potentially long requests
proxy_connect_timeout 75s;
}
# SSL configuration (if applicable)
# listen 443 ssl;
# ssl_certificate /etc/letsencrypt/live/your_domain.com/fullchain.pem;
# ssl_certificate_key /etc/letsencrypt/live/your_domain.com/privkey.pem;
# ... other SSL settings
}
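One consequence of the static location above is easy to miss: any path that matches the regex is answered by Nginx alone, and try_files ends in =404, so such a request never reaches the application even when no file exists. The Ruby sketch below mirrors that regex so you can check which of your routes would be affected. The constant and method names are illustrative, and it models only these two location blocks, not Nginx's full matching rules.

```ruby
# Mirrors the static-asset location regex from the server block above.
# \A is used instead of ^ because Ruby's ^ anchors at every line start.
STATIC_LOCATION = %r{\A/(assets|images|javascripts|stylesheets|system)/}

# Returns :static when Nginx would serve the path from disk (or answer 404),
# and :proxied when the request would fall through to `location /`.
def route_for(path)
  STATIC_LOCATION.match?(path) ? :static : :proxied
end

%w[/assets/app.css /system/uploads/1.png /users/1 /assets /api/images/1].each do |path|
  puts "#{path} -> #{route_for(path)}"
end
```

If the application defines dynamic routes under one of these prefixes (for example a controller mounted at /system/), narrow the regex or move the route.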
Gunicorn/Puma Tuning for Ruby Applications
Gunicorn is a Python WSGI HTTP server and cannot run Ruby applications; it fills the same role in Python stacks and began as a port of Ruby's Unicorn. Ruby applications typically run on Puma or Unicorn. We’ll focus on Puma, as it’s the default for Rails and widely adopted.
Puma Worker and Thread Configuration
Puma operates with a master process that spawns multiple workers. Each worker can then manage multiple threads. The key is to balance the number of workers and threads to match your server’s CPU and memory resources, and the nature of your application’s workload (I/O-bound vs. CPU-bound).
- Workers: Each worker is a separate Ruby process. More workers increase parallelism but consume more memory. A common rule of thumb is (CPU cores * 2) + 1 workers, a formula that originates in Gunicorn's documentation. Because every Puma worker is a full Ruby process, available memory often caps the count below that, and one worker per core is a safer first guess.
- Threads: Threads within a worker handle concurrent requests. More threads can handle more requests simultaneously without spawning new processes, but excessive threads can lead to contention and context-switching overhead. A typical range is 4-16 threads per worker.
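The trade-off between CPU-based and memory-based limits can be sketched as a small calculation. The helper below is hypothetical, and the 500 MB per worker and 512 MB reserved for the OS are placeholder assumptions; measure your own application's resident memory before relying on any number. It also shows the upper bound on database connections, which matters when sizing the ActiveRecord pool and the database's connection limit.

```ruby
# Hypothetical sizing helper: takes the smaller of the CPU rule of thumb
# and what memory allows, never returning fewer than one worker.
def puma_workers(cores:, ram_mb:, mb_per_worker:, reserved_mb: 512)
  by_cpu = (cores * 2) + 1
  by_memory = (ram_mb - reserved_mb) / mb_per_worker
  [[by_cpu, by_memory].min, 1].max
end

# Upper bound on simultaneous database connections, assuming each
# thread may check out one connection at a time.
def max_db_connections(workers:, threads:)
  workers * threads
end

# Assumed example: 4 cores, 8 GB RAM, about 500 MB per worker.
workers = puma_workers(cores: 4, ram_mb: 8192, mb_per_worker: 500)
puts workers                                            # 9
puts max_db_connections(workers: workers, threads: 5)   # 45
```

On a 2 GB instance the same call returns 3, which illustrates how memory rather than CPU tends to be the binding constraint on smaller plans.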
You can configure Puma via a config/puma.rb file in your Rails application:
# config/puma.rb
# Set the environment
environment ENV.fetch('RAILS_ENV') { 'production' }
# Number of workers to spawn.
# For a Linode with 4 cores, 2*4+1 = 9 workers might be a good starting point.
# Adjust based on memory usage.
workers ENV.fetch('WEB_CONCURRENCY') { 4 }.to_i
# Minimum and maximum number of threads per worker (set to the same value here).
# If your app is I/O bound, you might increase this.
threads_count = ENV.fetch('RAILS_MAX_THREADS') { 5 }.to_i
threads threads_count, threads_count
# Bind to a Unix socket for Nginx to connect to.
# Ensure Nginx has read/write permissions to this socket's directory.
bind "unix:///run/your_app.sock"
# Or bind to a TCP port if Nginx is on a different machine or for development.
# bind "tcp://0.0.0.0:8000"
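# Puma has no separate per-worker connection cap; concurrency per worker
# is bounded by the thread pool configured above.
# Puma also has no per-request timeout setting. To cut off slow requests
# before Nginx's proxy_read_timeout fires, use application-level middleware
# such as the rack-timeout gem.
# worker_timeout restarts a worker that has stopped checking in with the
# master process for this many seconds.
worker_timeout 60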
# Logging
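# __dir__ is the config/ directory, so step up one level to reach log/.
stdout_redirect File.expand_path("../log/puma.stdout.log", __dir__), File.expand_path("../log/puma.stderr.log", __dir__), true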
# Preload the application code before workers are forked.
preload_app!
# Callbacks for worker lifecycle
on_worker_boot do
# Worker specific setup code.
ActiveRecord::Base.establish_connection if defined?(ActiveRecord)
end
on_worker_shutdown do
# Worker specific cleanup code.
end
# Allow Puma to be restarted by `rails restart` command.
plugin :tmp_restart
To run Puma with these settings, you’d typically use a process manager like systemd. Here’s a sample systemd service file for your application (e.g., /etc/systemd/system/your_app.service):
[Unit]
Description=Puma Application Server
After=network.target
[Service]
Type=simple
# Run as your application user and group.
# systemd does not support trailing comments on the same line as a value.
User=deploy
Group=www-data
WorkingDirectory=/var/www/your_app
Environment="RAILS_ENV=production"
# Leave RAILS_LOG_TO_STDOUT unset when logging to files. In Rails versions
# that use this variable, production.rb only checks whether it is present,
# so any value (even "disabled") turns stdout logging on.
# WEB_CONCURRENCY and RAILS_MAX_THREADS are read by puma.rb.
Environment="WEB_CONCURRENCY=4"
Environment="RAILS_MAX_THREADS=5"
ExecStart=/usr/local/bin/bundle exec puma -C config/puma.rb
ExecStop=/bin/kill -s TERM $MAINPID
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
After creating or modifying the service file, reload systemd and start your application:
sudo systemctl daemon-reload
sudo systemctl enable your_app.service
sudo systemctl start your_app.service
sudo systemctl status your_app.service
Elasticsearch Performance Tuning on Linode
Elasticsearch, while not directly serving web requests, is often a critical component for search functionality in Ruby applications. Optimizing its performance involves JVM tuning, shard management, and hardware considerations.
JVM Heap Size Configuration
The Java Virtual Machine (JVM) heap size is arguably the most critical setting for Elasticsearch performance. It dictates how much memory Elasticsearch can use for its data structures, caches, and operations. A common recommendation is to set the heap size to 50% of the system’s RAM, but never exceeding 30-32GB due to compressed ordinary object pointers (compressed oops).
Edit the jvm.options file, typically located at /etc/elasticsearch/jvm.options:
-Xms4g
-Xmx4g
In this example, we’ve allocated 4GB of RAM for the heap. Adjust -Xms (initial heap size) and -Xmx (maximum heap size) based on your Linode instance’s RAM and your cluster’s needs. Ensure both are set to the same value to prevent resizing during operation.
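The sizing rule can be written as a short calculation. This is a sketch under stated assumptions: the 31 GB ceiling is a conservative pick from the 30-32GB range mentioned above (the exact compressed-oops cutoff depends on the JVM), and the helper names are illustrative.

```ruby
# Assumed safe ceiling for the heap, chosen from the 30-32 GB range.
MAX_HEAP_MB = 31 * 1024

# Half of system RAM, capped at the ceiling.
def elasticsearch_heap_mb(total_ram_mb)
  [total_ram_mb / 2, MAX_HEAP_MB].min
end

# Returns matching -Xms/-Xmx flags so the heap never resizes.
def jvm_heap_flags(total_ram_mb)
  heap = elasticsearch_heap_mb(total_ram_mb)
  ["-Xms#{heap}m", "-Xmx#{heap}m"]
end

puts jvm_heap_flags(8192)    # 8 GB instance:  -Xms4096m, -Xmx4096m
puts jvm_heap_flags(98_304)  # 96 GB instance: -Xms31744m, -Xmx31744m
```

The remaining RAM is not wasted: it is what the operating system uses for the filesystem cache that Elasticsearch depends on.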
Shard Allocation and Sizing
The number and size of shards significantly impact search and indexing performance. Too many small shards can overwhelm the cluster with overhead; too few large shards can limit parallelism and recovery speed.
- Shard Count: Aim for shards between 10GB and 50GB each. For a 100GB index, that works out to between 2 and 10 primary shards; 2-5 is a reasonable starting point.
- Replicas: For high availability and read performance, use replicas. A common setup is 1 replica per primary shard.
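The shard-size guideline translates directly into a range of primary shard counts. The Ruby sketch below does that arithmetic; the function name is hypothetical, and the 10GB and 50GB bounds are simply the figures from the list above, not hard limits.

```ruby
# Returns the range of primary shard counts that keeps each shard
# between min_shard_gb and max_shard_gb (at least one shard).
def primary_shard_range(index_gb, min_shard_gb: 10, max_shard_gb: 50)
  fewest = [(index_gb.to_f / max_shard_gb).ceil, 1].max
  most = [(index_gb.to_f / min_shard_gb).floor, fewest].max
  fewest..most
end

p primary_shard_range(100)  # 2..10
p primary_shard_range(5)    # 1..1
```

Picking from the lower end of the range keeps per-shard overhead down; the higher end gives more indexing parallelism. Size for the index's expected growth, since the primary shard count is fixed when the index is created.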
You can manage shard settings via the Elasticsearch API. For example, to set the number of primary shards to 3 and replicas to 1 for an index named my_index:
PUT /my_index
{
"settings": {
"index": {
"number_of_shards": 3,
"number_of_replicas": 1
}
}
}
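To update a dynamic setting such as the replica count on an existing index (the number of primary shards cannot be changed this way, since it is fixed at index creation):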
PUT /my_index/_settings
{
"index": {
"number_of_replicas": 2
}
}
Filesystem Cache and Swappiness
Elasticsearch relies heavily on the operating system’s filesystem cache. Ensure your Linode instance is configured to maximize its use.
- Swappiness: Set the vm.swappiness kernel parameter to a low value (e.g., 1 or 10) to discourage the OS from swapping out Elasticsearch’s memory. Edit /etc/sysctl.conf and add/modify the line: vm.swappiness = 1. Then apply with sudo sysctl -p.
- File Descriptors: Elasticsearch requires a high number of open file descriptors. Ensure the limits are set appropriately in /etc/security/limits.conf and the Elasticsearch systemd service file.
# Example /etc/security/limits.conf entries
* soft nofile 65536
* hard nofile 65536
root soft nofile 65536
root hard nofile 65536
# Example systemd service override for file descriptors
# Create a file like /etc/systemd/system/elasticsearch.service.d/override.conf
[Service]
LimitNOFILE=65536
After making changes to sysctl.conf or limits.conf, you’ll need to restart Elasticsearch for them to take effect. For systemd overrides, run sudo systemctl daemon-reload and then restart the service.
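A quick way to confirm the settings took effect is to read them back. The Ruby sketch below is illustrative only: the helper name and thresholds are assumptions taken from the values used in this section, and Process.getrlimit reports the limit of the Ruby process running the script, not of the Elasticsearch service, so run it as a sanity check on shell-level limits rather than as proof of what Elasticsearch sees.

```ruby
# Returns a list of human-readable warnings for the given settings.
# Thresholds default to the values suggested in this section.
def elasticsearch_os_warnings(swappiness:, nofile_soft:, min_nofile: 65_536, max_swappiness: 10)
  warnings = []
  if swappiness > max_swappiness
    warnings << "vm.swappiness is #{swappiness}; expected #{max_swappiness} or lower"
  end
  if nofile_soft < min_nofile
    warnings << "open file limit is #{nofile_soft}; expected at least #{min_nofile}"
  end
  warnings
end

# Only read live values when run directly on a Linux host.
if __FILE__ == $PROGRAM_NAME && File.readable?("/proc/sys/vm/swappiness")
  swappiness = File.read("/proc/sys/vm/swappiness").to_i
  soft, _hard = Process.getrlimit(Process::RLIMIT_NOFILE)
  puts elasticsearch_os_warnings(swappiness: swappiness, nofile_soft: soft)
end
```

An empty result means both values are within the suggested bounds for the process that ran the check.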
Monitoring and Diagnostics
Continuous monitoring is key to identifying bottlenecks. Use tools like:
- Nginx: nginx -s reload, tail -f /var/log/nginx/access.log, tail -f /var/log/nginx/error.log, netstat -tulnp | grep nginx.
- Puma: systemctl status your_app.service, journalctl -u your_app.service -f, check Puma’s log files.
- Elasticsearch: Elasticsearch’s own monitoring APIs (e.g., _cat/nodes, _cat/indices, _cluster/stats), and external tools like Prometheus with the Elasticsearch exporter, or commercial APM solutions.
Regularly review logs for errors, high latency, and resource exhaustion. For Elasticsearch, pay close attention to garbage collection logs and search/indexing latency metrics.
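As a starting point for log review, the Ruby sketch below tallies response status classes from Nginx access log lines. It assumes the default "combined" log format, where the status code follows the quoted request line; a custom log_format would need a different pattern. The names and sample lines are made up for illustration.

```ruby
# Matches the status code that follows the quoted request line in
# Nginx's default "combined" log format.
STATUS_PATTERN = /" (\d{3}) /

# Counts responses per status class ("2xx", "4xx", "5xx", ...).
# Accepts any enumerable of lines; unparseable lines are skipped.
def status_class_counts(lines)
  counts = Hash.new(0)
  lines.each do |line|
    match = STATUS_PATTERN.match(line)
    next unless match
    counts["#{match[1][0]}xx"] += 1
  end
  counts
end

sample = [
  '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/8.0"',
  '203.0.113.7 - - [10/Oct/2024:13:55:37 +0000] "GET /search HTTP/1.1" 502 157 "-" "curl/8.0"',
  '203.0.113.8 - - [10/Oct/2024:13:55:38 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"'
]
p status_class_counts(sample)
```

To run it against a real log, pass File.foreach("/var/log/nginx/access.log") instead of the sample array. A rising 5xx count, 502 and 504 in particular, usually points at the application server or its timeouts rather than at Nginx itself.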