The Ultimate DevOps Playbook: Tuning Nginx, Gunicorn/FPM, and DynamoDB on Linode for Ruby

Optimizing Nginx for High-Traffic Ruby Applications

When deploying Ruby applications, particularly those built with frameworks like Ruby on Rails, on platforms like Linode, Nginx often serves as the primary web server and reverse proxy. Efficient Nginx configuration is paramount for handling concurrent connections, serving static assets, and passing dynamic requests to your application server (Puma for Ruby; Gunicorn and PHP-FPM play the same role for Python and PHP). This section details critical Nginx tuning parameters for production environments.

Nginx Worker Processes and Connections

The number of worker processes and the maximum number of connections per worker are fundamental to Nginx’s concurrency handling. A common starting point is to set worker_processes to the number of CPU cores available on your Linode instance. For worker_connections, a value of 1024 is often a good baseline, but this can be increased based on your application’s needs and system limits.

Determining Optimal Worker Processes

To find the number of CPU cores, you can use the nproc command or inspect /proc/cpuinfo.

nproc

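Then, configure Nginx in the main nginx.conf (typically located at /etc/nginx/nginx.conf). These directives belong to the main and events contexts, so they cannot be placed in files under /etc/nginx/conf.d/, which are normally included inside the http block.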

worker_processes auto; # or set to the number of CPU cores
# worker_processes 4;

events {
    worker_connections 4096; # Raised from the 1024 baseline (Nginx's built-in default is 512)
    # multi_accept on; # Consider enabling if your OS supports it well
}

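The worker_connections directive defines the maximum number of simultaneous connections that each worker process can open, counting connections to upstream servers as well as to clients. The theoretical ceiling is therefore worker_processes * worker_connections, and when Nginx acts as a reverse proxy each client request typically occupies two of those connections. Every connection also needs a file descriptor, so ensure the limit (ulimit -n) is high enough. For login sessions this limit is set in /etc/security/limits.conf. When Nginx is started by systemd, that file is not consulted, so set worker_rlimit_nofile in nginx.conf or LimitNOFILE in the service unit instead.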

# Add these lines to /etc/security/limits.conf
* soft nofile 65536
* hard nofile 65536
root soft nofile 65536
root hard nofile 65536

# limits.conf takes effect at the next login session.
# To raise the limit for the current shell only:
ulimit -n 65536
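As a rough sanity check on the arithmetic above, the following Ruby sketch compares Nginx's theoretical connection ceiling with the file descriptor limit of the current process. The helper name nginx_connection_budget is hypothetical, and the sketch assumes that each proxied client holds two connections (one to the client, one to the upstream), which is a simplification.

```ruby
require 'etc'

# Rough capacity estimate for an Nginx reverse proxy (hypothetical helper).
# Assumes each proxied client occupies two connections inside a worker:
# one to the client and one to the upstream application server.
def nginx_connection_budget(worker_processes:, worker_connections:, proxied: true)
  total = worker_processes * worker_connections
  {
    total_connections: total,
    max_clients: proxied ? total / 2 : total,
    # Every connection needs a descriptor, so the per-process limit
    # should be at least worker_connections.
    min_nofile_per_worker: worker_connections
  }
end

budget = nginx_connection_budget(
  worker_processes: Etc.nprocessors, # approximates `worker_processes auto`
  worker_connections: 4096
)
soft_limit, _hard_limit = Process.getrlimit(Process::RLIMIT_NOFILE)

puts "Total connections:       #{budget[:total_connections]}"
puts "Approx. proxied clients: #{budget[:max_clients]}"
puts "Current soft nofile:     #{soft_limit}"
if soft_limit < budget[:min_nofile_per_worker]
  puts 'Warning: the descriptor limit is lower than worker_connections'
end
```

Run this as the same user and under the same service manager as Nginx, since the limit reported is that of the calling process, not of the Nginx workers.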

Buffering and Timeouts

Buffering settings control how Nginx handles request and response bodies. With response buffering, Nginx reads the upstream response as fast as the application can produce it and then feeds it to the client at the client's pace, which frees the application worker sooner when clients are slow. Timeout directives bound how long Nginx waits on a slow upstream or an idle client, so stalled connections do not consume resources indefinitely.

Client Body and Proxy Buffers

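The key directives are client_body_buffer_size (how much of a request body is held in memory before spilling to a temporary file), client_max_body_size (the largest request body Nginx accepts; larger requests are rejected with 413), proxy_buffer_size (the buffer for the first part of the upstream response, including its headers), and proxy_buffers (the number and size of buffers for the rest of the response). For applications handling file uploads or large POST requests, set client_max_body_size to match the largest upload you expect. Buffer sizes trade memory per connection against disk I/O.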

http {
    # ... other http settings ...

    client_body_buffer_size 128k;
    client_max_body_size 50m; # Example: Allow up to 50MB uploads
    proxy_buffers 8 128k;      # 8 buffers, each 128k
    proxy_buffer_size 128k;
    proxy_connect_timeout 60s;
    proxy_send_timeout 60s;
    proxy_read_timeout 60s;
    send_timeout 60s;

    # ... server blocks ...
}
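Because these buffers are allocated per connection, it can help to estimate their worst-case memory cost before raising them. The sketch below does that arithmetic in Ruby for the values shown above. The helper names and the figure of 1,000 concurrent proxied responses are assumptions for illustration, and the result is a rough upper bound rather than what Nginx will actually allocate.

```ruby
# Parse an Nginx size such as "128k" or "50m" into bytes (hypothetical helper).
def nginx_size_to_bytes(size)
  match = size.strip.match(/\A(\d+)([kKmM]?)\z/)
  raise ArgumentError, "unrecognized size: #{size.inspect}" unless match

  multipliers = { '' => 1, 'k' => 1024, 'm' => 1024**2 }
  match[1].to_i * multipliers.fetch(match[2].downcase)
end

# Approximate worst-case response-buffer memory for one proxied connection:
# the header buffer (proxy_buffer_size) plus every body buffer (proxy_buffers).
def proxy_buffer_bytes_per_connection(proxy_buffer_size:, buffers_count:, buffers_size:)
  nginx_size_to_bytes(proxy_buffer_size) +
    buffers_count * nginx_size_to_bytes(buffers_size)
end

per_connection = proxy_buffer_bytes_per_connection(
  proxy_buffer_size: '128k', buffers_count: 8, buffers_size: '128k'
)
concurrent_proxied = 1_000 # assumed number of simultaneous proxied responses

puts "Per connection: #{per_connection / 1024} KiB"
puts "Worst case for #{concurrent_proxied} connections: " \
     "#{per_connection * concurrent_proxied / 1024**2} MiB"
```

If the worst case approaches the memory available on the instance, smaller buffers are the safer choice.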

Gzip Compression and Caching

Enabling Gzip compression significantly reduces the bandwidth required to transfer text-based assets (HTML, CSS, JavaScript, JSON). Browser caching, configured via HTTP headers, allows clients to store static assets locally, reducing server load and improving page load times for repeat visitors.

Gzip Configuration

These directives should be placed within the http block.

http {
    # ... other http settings ...

    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6; # Compression level (1-9)
    gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/rss+xml text/javascript image/svg+xml; # text/html is always compressed
    gzip_min_length 1000; # Skip responses shorter than 1000 bytes

    # ... server blocks ...
}
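To get a feel for what gzip_comp_level and gzip_min_length trade off, the following Ruby sketch compresses a sample JSON payload at several levels using the standard zlib library. The payload and the worth_compressing? helper are made up for illustration. Nginx's own results will differ somewhat, so treat the numbers as indicative only.

```ruby
require 'zlib'
require 'json'

# Build a repetitive JSON payload, similar in shape to a typical API response.
payload = JSON.generate(
  Array.new(200) { |i| { id: i, name: "user-#{i}", role: 'member', active: true } }
)

# Compress at levels 1, 6 and 9 to compare output size against effort.
[1, 6, 9].each do |level|
  compressed = Zlib.gzip(payload, level: level)
  percent = (compressed.bytesize.to_f / payload.bytesize * 100).round(1)
  puts "level #{level}: #{payload.bytesize} -> #{compressed.bytesize} bytes (#{percent}% of original)"
end

# Mirrors the idea behind gzip_min_length: tiny bodies are not worth compressing.
def worth_compressing?(body, min_length: 1000)
  body.bytesize >= min_length
end

puts worth_compressing?(payload)
puts worth_compressing?('{"ok":true}')
```

Higher levels usually yield only small additional savings over level 6 while costing more CPU per response.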

Browser Caching Headers

For static assets, set appropriate cache control headers. This is typically done within your server block, often using a location block for static file types.

server {
    # ... other server settings ...

    location ~* \.(css|js|jpg|jpeg|png|gif|ico|svg|woff|woff2|ttf|eot)$ {
        expires 30d; # Cache for 30 days
        add_header Cache-Control "public, no-transform";
        access_log off; # Optionally disable logging for static assets
    }

    # ... other location blocks ...
}
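The expires directive is turned into an Expires header and a Cache-Control max-age value expressed in seconds. A minimal Ruby sketch of that conversion follows, handling only the simple single-unit forms. The helper names are hypothetical and this is not how Nginx itself parses the value.

```ruby
# Convert a simple Nginx-style time value such as "30d" or "12h" into seconds.
# Only single-unit forms with s, m, h, d or w are handled in this sketch.
def expires_to_seconds(value)
  unit_seconds = { 's' => 1, 'm' => 60, 'h' => 3600, 'd' => 86_400, 'w' => 604_800 }
  match = value.match(/\A(\d+)([smhdw])\z/)
  raise ArgumentError, "unsupported time value: #{value.inspect}" unless match

  match[1].to_i * unit_seconds.fetch(match[2])
end

# The max-age value a browser would see for a given expires setting.
def max_age_for(value)
  "max-age=#{expires_to_seconds(value)}"
end

puts max_age_for('30d') # max-age=2592000
puts max_age_for('12h') # max-age=43200
```

Long lifetimes like this are only safe when asset filenames change on every deploy, for example through fingerprinted names.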

Tuning Gunicorn/Puma for Ruby Applications

Nginx hands dynamic requests to an application server. Gunicorn fills this role for Python (WSGI) applications and PHP-FPM for PHP; for Ruby the equivalent is a Rack server, and Puma is the most common choice. In each case, the number of worker processes and threads directly determines concurrency and resource utilization. The rest of this section focuses on Puma.

Puma Worker and Thread Configuration

Puma’s configuration is typically done via a config/puma.rb file. The key directives are workers and threads.

# config/puma.rb

# Set the environment
environment ENV.fetch('RAILS_ENV') { 'production' }

# Number of Puma workers (processes)
# A good starting point is one worker per CPU core, memory permitting.
# With 4 cores, start with 4 workers and adjust based on monitoring.
workers ENV.fetch('WEB_CONCURRENCY') { 4 }.to_i

# Minimum and maximum number of threads per worker.
# Threads handle concurrent requests within a single worker process.
# A common pattern is to set threads to a range, e.g., 5-5 or 5-8.
threads_count = ENV.fetch('RAILS_MAX_THREADS') { 5 }.to_i
threads threads_count, threads_count

# Bind to a TCP socket for Nginx to proxy to.
# Use a Unix socket for better performance if Nginx and Puma are on the same machine.
# bind "unix:///path/to/your/app/shared/tmp/sockets/puma.sock"
bind "tcp://127.0.0.1:9292" # Example port; loopback only, so Puma is reachable solely through Nginx

# Puma has no max_connections directive. Per-worker concurrency is capped
# by the thread count above, so no separate setting is needed here.

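# Application root: this file lives in config/, so go up one level.
app_dir = File.expand_path('..', __dir__)

# Logging (the third argument appends instead of truncating)
stdout_redirect "#{app_dir}/log/puma.stdout.log", "#{app_dir}/log/puma.stderr.log", true

# PID file
pidfile "#{app_dir}/tmp/pids/puma.pid"

# State file (used by pumactl)
state_path "#{app_dir}/tmp/pids/puma.state"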

# Preload the application
preload_app!

on_worker_boot do
  # Worker specific setup, e.g., database connection pooling
  ActiveRecord::Base.establish_connection if defined?(ActiveRecord::Base)
end

# Allow Puma to be restarted by `rails restart` command.
plugin :tmp_restart

Tuning Strategy:

Workers: Start with a number of workers that roughly matches your CPU cores, or slightly more if your application is I/O bound. Monitor CPU and memory usage. If workers are constantly waiting for CPU, reduce workers. If they are waiting for I/O (e.g., database queries), you might benefit from more workers or better database tuning.
Threads: Threads handle concurrent requests within a worker. A common recommendation is to set threads to a range like 5-5 or 5-8. The optimal number depends on the nature of your requests (CPU-bound vs. I/O-bound). Too many threads can lead to excessive context switching and memory overhead.
max_connections (Conceptual): While Puma doesn’t have a direct max_connections directive like some other servers, the combination of workers and threads defines your concurrency. Ensure your upstream server (Nginx) and downstream services (database) can handle the load.
Unix Sockets vs. TCP: For performance, especially on a single Linode instance, using a Unix domain socket for Nginx to communicate with Puma is generally faster than TCP/IP.
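The strategy above can be turned into a quick back-of-the-envelope estimate. The following Ruby sketch uses Little's law (throughput is roughly concurrency divided by average response time) to approximate an upper bound on requests per second. The helper name and the 100 ms average response time are assumptions, and real throughput on MRI will be lower for CPU-bound work because threads in one process do not run Ruby code in parallel.

```ruby
# Estimate the ceiling of a Puma deployment (hypothetical helper).
# avg_request_seconds is an assumed mean response time, e.g. taken from an APM.
def puma_capacity(workers:, max_threads:, avg_request_seconds:)
  concurrent_requests = workers * max_threads
  {
    concurrent_requests: concurrent_requests,
    # Little's law: throughput = concurrency / time spent per request.
    max_requests_per_second: (concurrent_requests / avg_request_seconds).round(1),
    # If ActiveRecord is used, each thread may hold a connection, so the pool
    # in each worker should be at least max_threads.
    min_db_pool_per_worker: max_threads,
    total_db_connections: concurrent_requests
  }
end

capacity = puma_capacity(workers: 4, max_threads: 5, avg_request_seconds: 0.1)
capacity.each { |name, value| puts "#{name}: #{value}" }
```

Comparing total_db_connections with the connection limit of your downstream services is a quick way to catch a configuration that cannot work under full load.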

DynamoDB Performance Tuning for Ruby Applications

DynamoDB, while a powerful managed NoSQL database, requires careful consideration for optimal performance and cost-effectiveness. For Ruby applications, the AWS SDK for Ruby (aws-sdk-dynamodb gem) is the primary interface. Key tuning areas include provisioned throughput, indexing, and efficient query patterns.

Provisioned Throughput (RCU/WCU)

DynamoDB offers two capacity modes: on-demand, where you pay per request, and provisioned, where you specify throughput in Read Capacity Units (RCU) and Write Capacity Units (WCU). In provisioned mode, understanding your application's read/write patterns is crucial, because requests beyond the provisioned capacity are throttled. Use DynamoDB's Auto Scaling feature to adjust provisioned throughput based on actual traffic, which reduces throttling and avoids paying for idle capacity.
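To make capacity units concrete, here is a small Ruby sketch of the arithmetic. It assumes the standard DynamoDB rules: one RCU covers one strongly consistent read per second of an item up to 4 KB (an eventually consistent read costs half), and one WCU covers one write per second of an item up to 1 KB. The helper names and the sample workload are illustrative.

```ruby
# Capacity-unit arithmetic for DynamoDB provisioned mode (hypothetical helpers).
# Item sizes are rounded up to the next 4 KB (reads) or 1 KB (writes).
def required_rcu(item_bytes:, reads_per_second:, strongly_consistent: false)
  units_per_read = [(item_bytes / 4096.0).ceil, 1].max
  total = units_per_read * reads_per_second
  # Eventually consistent reads consume half as much capacity.
  strongly_consistent ? total : (total / 2.0).ceil
end

def required_wcu(item_bytes:, writes_per_second:)
  units_per_write = [(item_bytes / 1024.0).ceil, 1].max
  units_per_write * writes_per_second
end

# Assumed workload: 6,000-byte items read 100 times/s, 2,500-byte items written 50 times/s.
puts required_rcu(item_bytes: 6_000, reads_per_second: 100)                            # 100
puts required_rcu(item_bytes: 6_000, reads_per_second: 100, strongly_consistent: true) # 200
puts required_wcu(item_bytes: 2_500, writes_per_second: 50)                            # 150
```

Because sizes are rounded up per item, keeping items just under a 4 KB or 1 KB boundary can noticeably reduce the capacity you need.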

Efficient Querying and Indexing

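Avoid full table scans. Design your access patterns first, then create Global Secondary Indexes (GSIs) or Local Secondary Indexes (LSIs) to support those patterns. GSIs are updated asynchronously, so reads from them are eventually consistent, and they can be added to or removed from an existing table, which makes them well suited to supporting diverse query needs. LSIs share the table's partition key and must be defined when the table is created.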
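As an illustration of querying an index instead of scanning, the sketch below wraps a Query call against a GSI. The client argument is expected to be an Aws::DynamoDB::Client. The index name email-index, the email attribute, and the helper name are hypothetical and would need to match your own table design.

```ruby
# Look up items through a Global Secondary Index rather than a table scan.
# `client` is expected to respond to #query like Aws::DynamoDB::Client does.
# The index name and attribute name below are hypothetical.
def find_users_by_email(client, email,
                        table_name: 'YourDynamoDBTableName',
                        index_name: 'email-index')
  response = client.query(
    table_name: table_name,
    index_name: index_name,
    key_condition_expression: '#email = :email',
    # Placeholders keep attribute names and values out of the expression string.
    expression_attribute_names: { '#email' => 'email' },
    expression_attribute_values: { ':email' => email },
    limit: 25
  )
  response.items
end

# Usage (requires the aws-sdk-dynamodb gem and configured AWS credentials):
#   require 'aws-sdk-dynamodb'
#   client = Aws::DynamoDB::Client.new(region: 'us-east-1')
#   find_users_by_email(client, 'alice@example.com').each { |item| puts item.inspect }
```

Passing the client in as an argument keeps the lookup easy to exercise in tests with a stand-in object.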

AWS SDK for Ruby Optimization

The AWS SDK for Ruby offers features to improve performance:

Batch Operations: Use batch_get_item (up to 100 items per call) and batch_write_item (up to 25 put or delete requests per call) to perform multiple reads or writes in a single API call. This reduces the number of network round trips, although each item still consumes its own read or write capacity.
Paging: Query and Scan return at most 1 MB of data per call. When a response includes last_evaluated_key, pass it as exclusive_start_key in the next request to fetch the following page.
Conditional Writes: Use condition expressions so that a put, update, or delete only succeeds when the item is in the expected state. This avoids read-then-write cycles and prevents race conditions.
Caching: For frequently accessed, relatively static data, consider a caching layer in front of DynamoDB (e.g., Redis or Memcached) to reduce direct reads.
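The paging item above can be sketched as a small enumerator that keeps requesting pages until last_evaluated_key is absent. The client argument is expected to behave like Aws::DynamoDB::Client, and the helper name each_query_item is hypothetical.

```ruby
# Yield every item of a Query, following last_evaluated_key page by page.
# `client` is expected to respond to #query like Aws::DynamoDB::Client does.
def each_query_item(client, params)
  return enum_for(:each_query_item, client, params) unless block_given?

  start_key = nil
  loop do
    # Only the second and later requests carry exclusive_start_key.
    page_params = start_key ? params.merge(exclusive_start_key: start_key) : params
    response = client.query(page_params)
    response.items.each { |item| yield item }

    start_key = response.last_evaluated_key
    break if start_key.nil? || start_key.empty?
  end
end

# Usage (requires the aws-sdk-dynamodb gem; names are placeholders):
#   params = {
#     table_name: 'YourDynamoDBTableName',
#     key_condition_expression: 'id = :id',
#     expression_attribute_values: { ':id' => 'user-123' }
#   }
#   each_query_item(client, params) { |item| puts item.inspect }
```

Returning an enumerator when no block is given lets callers chain methods such as first(10) without loading every page.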

Example: Batch Get Item in Ruby

This example demonstrates fetching multiple items by their primary keys using batch_get_item.

require 'aws-sdk-dynamodb'

# Configure your AWS credentials and region
# Ensure you have credentials configured (e.g., via environment variables, IAM role, or ~/.aws/credentials)
dynamodb = Aws::DynamoDB::Client.new(region: 'us-east-1')

table_name = 'YourDynamoDBTableName'

# Define the keys of the items you want to retrieve
keys_to_get = [
  { id: 'user-123', sort_key: 'profile' },
  { id: 'user-456', sort_key: 'settings' },
  { id: 'user-789', sort_key: 'preferences' }
]

# Prepare the request for batch_get_item
request_items = {
  table_name => {
    keys: keys_to_get,
    # Optionally restrict the attributes returned
    # projection_expression: 'id, username, email'
  }
}

begin
  result = dynamodb.batch_get_item(request_items: request_items)

  # Process the retrieved items
  items = result.responses[table_name]
  if items
    items.each do |item|
      puts "Retrieved item: #{item.inspect}"
    end
  else
    puts "No items found for the specified keys."
  end

  # Handle unprocessed keys if any
  if result.unprocessed_keys && result.unprocessed_keys[table_name]
    puts "Some keys were unprocessed. Consider retrying: #{result.unprocessed_keys[table_name].keys.inspect}"
    # Implement retry logic here if necessary
  end

rescue Aws::DynamoDB::Errors::ServiceError => e
  puts "Error fetching items from DynamoDB: #{e.message}"
end
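The retry step left open in the example above might look like the following sketch, which resubmits unprocessed keys with exponential backoff. It assumes, as the DynamoDB API describes, that unprocessed_keys has the same shape as request_items and can be passed back unchanged. The helper name, max_attempts, and base_delay are illustrative choices, not SDK settings.

```ruby
# Fetch items with batch_get_item, retrying unprocessed keys with backoff.
# `client` is expected to respond to #batch_get_item like Aws::DynamoDB::Client.
def batch_get_with_retry(client, request_items, max_attempts: 5, base_delay: 0.1)
  items = Hash.new { |hash, table| hash[table] = [] }
  pending = request_items

  max_attempts.times do |attempt|
    # Wait before each retry: base_delay, then 2x, 4x, ...
    sleep(base_delay * (2**(attempt - 1))) if attempt.positive?

    result = client.batch_get_item(request_items: pending)
    result.responses.each { |table, rows| items[table].concat(rows) }

    pending = result.unprocessed_keys
    return items if pending.nil? || pending.empty?
  end

  raise "Gave up after #{max_attempts} attempts with keys still unprocessed"
end

# Usage, reusing the names from the example above:
#   items = batch_get_with_retry(dynamodb, request_items)
#   items[table_name].each { |item| puts item.inspect }
```

Unprocessed keys usually indicate throttling, so backing off before retrying gives the table time to recover instead of adding to the load.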

Example: Batch Write Item (Put and Delete) in Ruby

This example shows how to perform multiple put and delete operations in a single request.

require 'aws-sdk-dynamodb'

dynamodb = Aws::DynamoDB::Client.new(region: 'us-east-1')
table_name = 'YourDynamoDBTableName'

# Prepare items for batch put
items_to_put = [
  { id: 'new-user-1', username: 'alice', email: 'alice@example.com' },
  { id: 'new-user-2', username: 'bob', email: 'bob@example.com' }
]

# Prepare keys for batch delete
keys_to_delete = [
  { id: 'old-user-1' },
  { id: 'old-user-2' }
]

# Construct the request items for batch_write_item
request_items = {
  table_name => []
}

# Add put requests
items_to_put.each do |item|
  request_items[table_name] << { put_request: { item: item } }
end

# Add delete requests
keys_to_delete.each do |key|
  request_items[table_name] << { delete_request: { key: key } }
end

begin
  result = dynamodb.batch_write_item(request_items: request_items)

  puts "Batch write operation completed."

  # Handle unprocessed items if any
  if result.unprocessed_items && result.unprocessed_items[table_name]
    puts "Some items were unprocessed. Consider retrying: #{result.unprocessed_items[table_name].inspect}"
    # Implement retry logic here
  end

rescue Aws::DynamoDB::Errors::ServiceError => e
  puts "Error performing batch write to DynamoDB: #{e.message}"
end
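Since batch_write_item accepts at most 25 put or delete requests per call, larger workloads need to be split up. A minimal sketch follows that slices the requests into chunks and resubmits any unprocessed items with backoff. The helper name and its defaults are assumptions for illustration, and it relies on unprocessed_items having the same shape as request_items.

```ruby
# Write any number of put/delete requests in chunks of 25, retrying leftovers.
# `client` is expected to respond to #batch_write_item like Aws::DynamoDB::Client.
# Returns the number of API calls made.
def batch_write_in_chunks(client, table_name, write_requests,
                          chunk_size: 25, max_attempts: 5, base_delay: 0.1)
  api_calls = 0

  write_requests.each_slice(chunk_size) do |chunk|
    pending = { table_name => chunk }
    attempt = 0

    until pending.nil? || pending.empty?
      raise "Unprocessed items remain after #{max_attempts} attempts" if attempt >= max_attempts

      # Back off before each retry: base_delay, then 2x, 4x, ...
      sleep(base_delay * (2**(attempt - 1))) if attempt.positive?

      result = client.batch_write_item(request_items: pending)
      api_calls += 1
      pending = result.unprocessed_items
      attempt += 1
    end
  end

  api_calls
end

# Usage, reusing the names from the example above:
#   batch_write_in_chunks(dynamodb, table_name, request_items[table_name])
```

Each chunk is retried independently, so a throttled batch does not force the already written chunks to be resent.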

Monitoring and Iterative Tuning

Performance tuning is not a one-time task. Continuous monitoring is essential. Utilize tools like:

Linode Metrics: CPU, memory, disk I/O, and network traffic.
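Nginx stub_status module: Reports active connections, accepted and handled connections, total requests, and how many connections are reading, writing, or waiting. Error rates are not included and come from the access and error logs instead.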
Puma/Gunicorn Metrics: Worker status, thread utilization, request latency.
AWS CloudWatch: For DynamoDB metrics like consumed capacity, throttled requests, latency, and item count.
Application Performance Monitoring (APM) tools: Such as New Relic, Datadog, or Scout APM, to pinpoint bottlenecks within your Ruby code and database interactions.
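As one example of turning these sources into numbers you can track, the following Ruby sketch parses the plain-text output of Nginx's stub_status module. The sample text follows the format that module emits. The helper name and the /nginx_status URL in the usage comment are assumptions that depend on how the endpoint is exposed in your configuration.

```ruby
# Parse the plain-text output of Nginx's stub_status endpoint into a hash.
def parse_stub_status(text)
  active   = text[/Active connections:\s+(\d+)/, 1]
  counters = text.match(/^\s*(\d+)\s+(\d+)\s+(\d+)\s*$/)
  states   = text.match(/Reading:\s*(\d+)\s+Writing:\s*(\d+)\s+Waiting:\s*(\d+)/)
  raise ArgumentError, 'not stub_status output' unless active && counters && states

  accepts = counters[1].to_i
  handled = counters[2].to_i
  {
    active: active.to_i,
    accepts: accepts,
    handled: handled,
    requests: counters[3].to_i,
    # Accepted but not handled usually means a resource limit was reached.
    dropped: accepts - handled,
    reading: states[1].to_i,
    writing: states[2].to_i,
    waiting: states[3].to_i
  }
end

sample = <<~STATUS
  Active connections: 291
  server accepts handled requests
   16630948 16630948 31070465
  Reading: 6 Writing: 179 Waiting: 106
STATUS

parse_stub_status(sample).each { |name, value| puts "#{name}: #{value}" }

# Usage against a live server (the URL is an assumption):
#   require 'net/http'
#   stats = parse_stub_status(Net::HTTP.get(URI('http://127.0.0.1/nginx_status')))
```

A non-zero dropped count is worth alerting on, since it often points back to the worker_connections or file descriptor limits discussed earlier.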

Regularly review these metrics. When you observe high latency, increased error rates, or resource saturation, use the data to identify the specific component (Nginx, Puma, DynamoDB, or your application code) that is the bottleneck and apply targeted tuning adjustments. Remember to test changes in a staging environment before deploying to production.