The Ultimate DevOps Playbook: Tuning Nginx, Gunicorn/FPM, and DynamoDB on OVH for Ruby

Nginx Configuration for Ruby Applications on OVH

Optimizing Nginx as a reverse proxy for Ruby applications served by Puma (or for Python services running behind Gunicorn) on OVH infrastructure comes down to a handful of directives. We’ll focus on those that govern connection handling, request buffering, and static file serving. This setup assumes a typical OVH VPS or dedicated server environment where you have root access.

Worker Processes and Connections

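The `worker_processes` directive sets how many worker processes Nginx spawns. A common starting point is one per available CPU core. `worker_connections` sets the maximum number of simultaneous connections each worker process can hold open. The theoretical ceiling is `worker_processes * worker_connections`, but that figure counts upstream connections too: when Nginx acts as a reverse proxy, each client request occupies one connection to the client and one to the application server, so the practical client limit is roughly half of it. The count is also capped by the per-process open-file limit (`worker_rlimit_nofile`).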

For a typical 4-core OVH VPS, a good starting point would be:

worker_processes 4; # Or auto to let Nginx decide based on CPU cores

events {
    worker_connections 1024; # Adjust based on expected load and server memory
    # multi_accept on; # Can improve performance by accepting multiple connections at once
}
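
As a rough sanity check on these numbers, the arithmetic can be sketched in Python. The function name `max_proxied_clients` is hypothetical, and the halving assumes every request is proxied upstream, so each client holds one client-side and one upstream-side connection.

```python
def max_proxied_clients(worker_processes: int, worker_connections: int) -> int:
    """Estimate concurrent clients for Nginx acting purely as a reverse proxy.

    Assumption: each proxied client uses two connections, one from the
    client and one to the upstream application server.
    """
    total_connections = worker_processes * worker_connections
    return total_connections // 2


# 4 workers x 1024 connections, as in the configuration above
print(max_proxied_clients(4, 1024))  # 2048
```

Static files served directly by Nginx use only one connection each, so real capacity usually sits somewhere between this estimate and the theoretical ceiling.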

Buffering and Timeouts

Buffering directives control how Nginx handles request and response bodies. By default Nginx reads the entire client request body before passing it upstream. Setting `proxy_request_buffering off` streams the body to the application instead, which reduces memory and disk use on the Nginx host but keeps an application worker busy for the whole duration of a slow upload. Timeouts are crucial to prevent hanging connections from consuming resources.

Consider these settings within your `http` block:

http {
    # ... other http settings ...

    client_body_buffer_size 128k;
    client_max_body_size 100m; # Adjust based on your application's needs for file uploads
    client_header_buffer_size 1k;
    large_client_header_buffers 4 8k;

    send_timeout 60s;
    client_body_timeout 60s;
    client_header_timeout 60s;
    keepalive_timeout 65s;
    keepalive_requests 100;

    # Stream client request bodies to the upstream instead of buffering them first.
    # This can be useful if your app server (Gunicorn/Puma) handles large uploads directly,
    # but slow clients will then occupy an application worker for the whole upload.
    # proxy_request_buffering off;

    # Proxy settings
    proxy_connect_timeout 60s;
    proxy_send_timeout 60s;
    proxy_read_timeout 60s;
    proxy_buffer_size 16k;
    proxy_buffers 8 16k;
    proxy_busy_buffers_size 32k;
    proxy_temp_file_write_size 32k;

    # ... rest of http config ...
}
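
To see what the proxy buffer settings can cost in memory, here is a back-of-the-envelope sketch. It assumes, as an approximation rather than an exact account of Nginx internals, that a proxied connection can hold at most `proxy_buffer_size` plus all of its `proxy_buffers` in memory at once. The function names are hypothetical.

```python
KIB = 1024
MIB = 1024 * KIB


def proxy_buffer_bytes_per_connection(
    buffer_size_kib: int, buffers_count: int, buffers_size_kib: int
) -> int:
    """Approximate upper bound on response buffer memory for one connection."""
    return (buffer_size_kib + buffers_count * buffers_size_kib) * KIB


def worst_case_buffer_mib(per_connection_bytes: int, proxied_connections: int) -> float:
    """Memory used if every proxied connection fills all of its buffers."""
    return per_connection_bytes * proxied_connections / MIB


# proxy_buffer_size 16k; proxy_buffers 8 16k; 2048 concurrent proxied clients
per_conn = proxy_buffer_bytes_per_connection(16, 8, 16)
print(per_conn)                               # 147456 bytes (144 KiB)
print(worst_case_buffer_mib(per_conn, 2048))  # 288.0
```

Most responses are far smaller than the buffer budget, so treat the result as a ceiling to compare against available RAM, not as expected usage.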

Gzip Compression

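Enabling Gzip compression can significantly reduce bandwidth usage and improve perceived load times for text-based assets. Nginx compresses responses on their way out to the client, so the upstream application does not need to compress anything itself. If the application already sends a response with a `Content-Encoding` header, Nginx passes it through without compressing it again.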

http {
    # ... other http settings ...

    gzip on;
    gzip_vary on;
    gzip_proxied any; # Compress responses for all proxied requests
    gzip_comp_level 6; # Compression level (1-9)
    gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/rss+xml text/javascript image/svg+xml;
    gzip_min_length 1000; # Minimum response length to compress
    gzip_disable "msie6"; # Disable for older IE versions if necessary

    # ... rest of http config ...
}
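
The reason for `gzip_min_length` can be illustrated with Python’s standard `gzip` module. This is only a sketch of the trade-off, not a model of Nginx’s own compressor; the sample payloads and the `gzip_savings` helper are made up for the demonstration.

```python
import gzip
import json


def gzip_savings(payload: bytes, level: int = 6) -> float:
    """Return the fraction of bytes saved by gzip (negative if the output grew)."""
    if not payload:
        return 0.0
    compressed = gzip.compress(payload, compresslevel=level)
    return 1 - len(compressed) / len(payload)


# A repetitive JSON response, similar in shape to a typical API listing
large = json.dumps(
    [{"id": i, "status": "active", "role": "member"} for i in range(500)]
).encode()

# A tiny response: the gzip header and trailer outweigh any savings
tiny = b'{"ok":true}'

print(f"large payload saved {gzip_savings(large):.0%}")
print(f"tiny payload saved {gzip_savings(tiny):.0%}")
```

Repetitive text compresses very well, while very small bodies come out larger than they went in, which is why responses under the minimum length are best left alone.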

Static File Serving

Offloading static file serving to Nginx is a fundamental optimization. Configure Nginx to serve static assets directly, bypassing your Ruby application entirely. This involves setting `expires` headers for caching and using `try_files` to efficiently locate files.

server {
    listen 80;
    server_name your_domain.com www.your_domain.com;

    # Serve static assets directly
    location ~ ^/(assets|images|javascripts|stylesheets)/ {
        root /path/to/your/rails/public; # Adjust to your Rails public directory
        expires 30d;
        add_header Cache-Control "public";
        access_log off; # Optionally disable access logs for static files
        try_files $uri $uri/ =404;
    }

    # Proxy all other requests to your application server
    location / {
        proxy_pass http://unix:/path/to/your/app.sock; # Or http://127.0.0.1:8000 for TCP
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_read_timeout 120s; # Longer timeout for application processing
        proxy_send_timeout 120s;
    }

    # ... other server configurations ...
}
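
On the application side, the `X-Forwarded-For` header set above has to be read with some care, because clients can send their own value and Nginx appends the real peer address to the end of it. The sketch below shows one common approach, under the assumption that you know the addresses of your own proxies: walk the list from the right and take the first address that is not one of them. The function name and the sample addresses are hypothetical.

```python
from typing import FrozenSet, Optional


def client_ip_from_xff(
    header_value: str, trusted_proxies: FrozenSet[str] = frozenset()
) -> Optional[str]:
    """Pick the client address from an X-Forwarded-For header value.

    Entries are read right to left because the rightmost ones were appended
    by proxies closest to the application; anything further left may have
    been supplied by the client itself.
    """
    hops = [hop.strip() for hop in header_value.split(",") if hop.strip()]
    for hop in reversed(hops):
        if hop not in trusted_proxies:
            return hop
    return None


# A client-supplied (possibly spoofed) entry followed by the address Nginx appended
print(client_ip_from_xff("1.2.3.4, 203.0.113.7"))  # 203.0.113.7
```

Rails performs a similar right-to-left walk in its remote IP middleware, so in a Rails application this is normally handled for you once trusted proxies are configured.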

Gunicorn/Puma Tuning for Ruby on Rails

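Gunicorn and Puma are not interchangeable: Gunicorn serves Python WSGI applications, while Puma serves Ruby Rack applications such as Rails. A Rails application runs on Puma; the Gunicorn settings below apply only to Python services deployed alongside it. Both require careful tuning of worker processes, threads, and timeouts.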

Gunicorn Configuration

Gunicorn is a Python WSGI HTTP server. Its performance is heavily influenced by the number of worker processes and the worker class used. For CPU-bound tasks, a process-per-core model is often effective. For I/O-bound tasks, the threaded `gthread` worker class or an async class such as `gevent` might be beneficial.

A typical Gunicorn configuration file (e.g., gunicorn_config.py):

import multiprocessing

# Number of worker processes. A good starting point is 2 * number_of_cores + 1
workers = multiprocessing.cpu_count() * 2 + 1

# Worker class. 'sync' is the default and most stable.
# 'gevent' or 'eventlet' can be used for I/O bound applications with async libraries.
worker_class = 'sync'

# Bind to a Unix socket or TCP port.
# For Nginx proxying via Unix socket:
bind = "unix:/path/to/your/app.sock"
# For Nginx proxying via TCP:
# bind = "127.0.0.1:8000"

# Timeout for worker requests. Adjust based on your application's longest-running tasks.
timeout = 120

# Maximum number of requests a worker can handle before restarting.
# Helps prevent memory leaks.
max_requests = 5000
max_requests_jitter = 500 # Randomize max_requests to spread restarts

# Logging configuration
loglevel = 'info'
accesslog = '-' # Log to stdout, which can be captured by systemd/Docker
errorlog = '-'

# Threads per worker for the 'gthread' worker class. Setting this above 1 with
# worker_class = 'sync' makes Gunicorn switch to 'gthread'.
# threads = 2

To run Gunicorn with this configuration:

gunicorn --config gunicorn_config.py your_app.wsgi:application
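
The `2 * cores + 1` rule ignores memory, which is often the tighter limit on a small VPS. A minimal sketch of a worker count bounded by both CPU and RAM follows; the `suggested_workers` helper and the per-worker memory figure are assumptions to be replaced with measurements from your own application.

```python
def suggested_workers(cpu_count: int, available_mb: int, per_worker_mb: int) -> int:
    """Return a worker count limited by both the CPU rule and available memory."""
    if per_worker_mb <= 0:
        raise ValueError("per_worker_mb must be positive")
    by_cpu = 2 * cpu_count + 1
    by_memory = available_mb // per_worker_mb
    # Always run at least one worker, even on a very small machine
    return max(1, min(by_cpu, by_memory))


# 4 cores, 1 GB left for the app, roughly 256 MB per worker (assumed figure)
print(suggested_workers(4, 1024, 256))  # 4
```

In `gunicorn_config.py` the result could replace the fixed formula, with `multiprocessing.cpu_count()` supplying the first argument.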

Puma Configuration

Puma is a multi-threaded, multi-process web server for Ruby. It excels at handling concurrent requests due to its threading model. Tuning involves setting the number of workers (processes) and threads per worker.

A common way to configure Puma is via a config/puma.rb file in your Rails application:

# config/puma.rb

# Number of worker processes. Change the default to match your CPU core count.
workers ENV.fetch("WEB_CONCURRENCY") { 4 }.to_i

# Min threads per worker to use.
# A good starting point is 5.
min_threads_by_instance = ENV.fetch("RAILS_MIN_THREADS") { 5 }.to_i

# Max threads per worker to use.
# A good starting point is 5.
max_threads_by_instance = ENV.fetch("RAILS_MAX_THREADS") { 5 }.to_i

threads min_threads_by_instance, max_threads_by_instance

# Bind to a Unix socket or TCP port.
# For Nginx proxying via Unix socket:
bind "unix:///path/to/your/app.sock"
# For Nginx proxying via TCP:
# bind "tcp://127.0.0.1:9292"

# Set the environment
environment ENV.fetch("RAILS_ENV") { "production" }

# Logging
stdout_redirect "/var/log/puma.stdout.log", "/var/log/puma.stderr.log", true

# Allow Puma to be restarted by `rails restart` command.
plugin :tmp_restart

# If using systemd, this is often handled by the service file
# pidfile "/path/to/your/app.pid"

# Seconds the master waits for a worker to check in before restarting it.
# This guards against hung workers; it is not a per-request timeout.
worker_timeout 120

# Preload the application code to speed up worker startup
preload_app!

on_worker_boot do
  # Worker specific setup for Rails.
  # This is called after worker boot and before the worker starts accepting requests.
  ActiveRecord::Base.establish_connection if defined?(ActiveRecord::Base)
end

# Close the master's database connections before forking so that workers do not
# inherit and share the same sockets. Each worker reconnects in on_worker_boot.
before_fork do
  ActiveRecord::Base.connection_pool.disconnect! if defined?(ActiveRecord::Base)
end

To run Puma in clustered mode:

bundle exec puma -C config/puma.rb
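
Because every Puma thread may hold a database connection, the worker and thread settings determine how many connections the database must accept. The arithmetic is sketched below in Python for consistency with the other examples; the function name is hypothetical, and it assumes one connection pool per worker process sized to at least the maximum thread count.

```python
from typing import Optional


def puma_db_connections(
    workers: int, max_threads: int, pool_size: Optional[int] = None
) -> int:
    """Return the maximum database connections a Puma cluster can open.

    Assumption: each worker process owns one pool, and the pool must be at
    least as large as the thread count or threads will wait for a connection.
    """
    pool = max_threads if pool_size is None else pool_size
    if pool < max_threads:
        raise ValueError("pool_size should be >= max_threads")
    return workers * pool


# 4 workers x 5 threads, as in the defaults above
print(puma_db_connections(4, 5))  # 20
```

Compare the result, plus connections used by background jobs and consoles, against the connection limit of your database server.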

DynamoDB Performance Tuning on OVH

While DynamoDB is a managed AWS service, its performance can be indirectly impacted by network latency and application-level interactions, especially when your application is hosted on OVH. Optimizing your DynamoDB usage involves understanding throughput, data modeling, and query patterns.

Throughput Provisioning

DynamoDB offers two capacity modes: provisioned throughput and on-demand. For predictable workloads, provisioned throughput is often more cost-effective. Monitor your consumed read and write capacity units (RCUs and WCUs) and adjust provisioned capacity accordingly. OVH’s network latency to AWS regions will add to the round-trip time for every DynamoDB operation.

Use CloudWatch metrics to track:

ConsumedReadCapacityUnits
ConsumedWriteCapacityUnits
ProvisionedReadCapacityUnits
ProvisionedWriteCapacityUnits
ThrottledRequests (a sustained non-zero value indicates insufficient capacity or a hot partition)

Consider using DynamoDB Auto Scaling to automatically adjust provisioned throughput based on actual usage, which can help manage costs and prevent throttling.
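
When choosing provisioned values, it helps to estimate capacity from item size and request rate. The sketch below uses DynamoDB’s published unit sizes: one RCU covers one strongly consistent read per second of an item up to 4 KB (eventually consistent reads cost half), and one WCU covers one write per second of an item up to 1 KB. The function names are hypothetical, and transactional requests, which cost more, are not covered.

```python
import math


def read_capacity_units(
    item_size_bytes: int, reads_per_second: int, strongly_consistent: bool = True
) -> int:
    """Estimate RCUs: item size is rounded up to the next 4 KB per read."""
    units_per_read = max(1, math.ceil(item_size_bytes / 4096))
    total = units_per_read * reads_per_second
    if not strongly_consistent:
        total = total / 2  # eventually consistent reads cost half
    return math.ceil(total)


def write_capacity_units(item_size_bytes: int, writes_per_second: int) -> int:
    """Estimate WCUs: item size is rounded up to the next 1 KB per write."""
    return max(1, math.ceil(item_size_bytes / 1024)) * writes_per_second


# 6,000-byte items read 100 times per second; 1,500-byte items written 50 times per second
print(read_capacity_units(6000, 100))                             # 200
print(read_capacity_units(6000, 100, strongly_consistent=False))  # 100
print(write_capacity_units(1500, 50))                             # 100
```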

Data Modeling and Query Optimization

The way you model your data in DynamoDB is paramount. Avoid full table scans. Design your primary keys (partition key and sort key) to support your most frequent access patterns. Use Global Secondary Indexes (GSIs) and Local Secondary Indexes (LSIs) judiciously to enable efficient querying on attributes other than the primary key.

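When querying from an OVH-hosted application, minimize the number of requests. Batch operations (BatchGetItem, BatchWriteItem) can reduce network overhead. However, be mindful of their limits: BatchGetItem accepts up to 100 keys and BatchWriteItem up to 25 put or delete requests per call, and either can return part of the batch as unprocessed, which the caller must retry.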

# Example using Boto3 (Python SDK) for batch get
import boto3

# batch_get_item is a method of the service resource (or client), not of Table,
# because a single call can read from several tables.
dynamodb = boto3.resource('dynamodb')

response = dynamodb.batch_get_item(
    RequestItems={
        'YourTableName': {
            'Keys': [
                {'id': 'item1', 'range': 'sub1'},
                {'id': 'item2', 'range': 'sub2'},
                # ... up to 100 items
            ],
            # Optionally specify attributes to retrieve
            # 'ProjectionExpression': 'attribute1, attribute2'
        }
    }
)

items = response['Responses']['YourTableName']
# UnprocessedKeys is always present and is empty when every key was read;
# a non-empty value should be retried with backoff.
if response.get('UnprocessedKeys'):
    print("Unprocessed keys:", response['UnprocessedKeys'])
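
For key lists longer than 100, or when DynamoDB returns unprocessed keys, the calls need to be chunked and retried. A minimal sketch follows, assuming `dynamodb` is a Boto3 DynamoDB service resource (anything exposing a compatible `batch_get_item` method works). The helper names, retry count, and delay values are illustrative choices, not recommendations from AWS.

```python
import time
from typing import Any, Dict, Iterator, List

BATCH_GET_LIMIT = 100  # maximum keys per BatchGetItem call


def chunked(keys: List[Dict[str, Any]], size: int = BATCH_GET_LIMIT) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive slices of at most `size` keys."""
    for start in range(0, len(keys), size):
        yield keys[start:start + size]


def batch_get_all(
    dynamodb: Any,
    table_name: str,
    keys: List[Dict[str, Any]],
    max_retries: int = 5,
    base_delay: float = 0.05,
) -> List[Dict[str, Any]]:
    """Fetch all keys, retrying unprocessed ones with exponential backoff."""
    items: List[Dict[str, Any]] = []
    for chunk in chunked(keys):
        request = {table_name: {"Keys": chunk}}
        attempt = 0
        while request:
            response = dynamodb.batch_get_item(RequestItems=request)
            items.extend(response.get("Responses", {}).get(table_name, []))
            # UnprocessedKeys has the same shape as RequestItems, so it can be resent as-is
            request = response.get("UnprocessedKeys") or {}
            if request:
                attempt += 1
                if attempt > max_retries:
                    raise RuntimeError("unprocessed keys remain after retries")
                time.sleep(base_delay * 2 ** attempt)
    return items
```

With a real table this would be called as `batch_get_all(boto3.resource('dynamodb'), 'YourTableName', keys)`. Note that BatchGetItem does not guarantee the order of returned items.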

Network Latency Considerations

The physical distance between your OVH servers and the AWS region hosting your DynamoDB table will introduce latency. Choose an AWS region geographically closest to your OVH data center to minimize this. Regularly test network performance between your OVH instances and AWS endpoints using tools like ping and traceroute.

# Example: Ping an AWS endpoint (replace with your region's endpoint)
ping dynamodb.us-east-1.amazonaws.com

# Example: Traceroute to an AWS endpoint
traceroute dynamodb.eu-west-2.amazonaws.com

If latency is a significant bottleneck, consider caching frequently accessed DynamoDB data in your application’s memory or using a distributed cache like Redis (which can also be hosted on OVH) to reduce direct DynamoDB calls.
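
The in-memory option can be as small as a dictionary with expiry times. The sketch below is a hypothetical `TTLCache` class, not a library API: it is per-process, not thread-safe, and never evicts entries, so it suits small, read-heavy lookups where slightly stale data is acceptable. The `loader` callback stands in for whatever function performs the actual DynamoDB read.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Cache values for a fixed number of seconds, loading them on a miss."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_load(self, key: Hashable, loader: Callable[[Hashable], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no round trip to DynamoDB
        value = loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value
```

A call such as `cache.get_or_load(user_id, fetch_user_from_dynamodb)` then pays the OVH-to-AWS round trip at most once per key per TTL window, where `fetch_user_from_dynamodb` is a placeholder for your own lookup function.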