The Ultimate DevOps Playbook: Tuning Nginx, Gunicorn/FPM, and DynamoDB on OVH for C

Optimizing Nginx for High Throughput on OVH Instances

When deploying high-traffic applications on OVH infrastructure, Nginx serves as a critical front-end. Fine-tuning its configuration is paramount for maximizing request handling capacity and minimizing latency. We’ll focus on key directives that directly impact performance, particularly in a reverse-proxy setup for Gunicorn or PHP-FPM.

Worker Processes and Connections

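The `worker_processes` directive dictates how many worker processes Nginx will spawn. A common recommendation is to set this to the number of CPU cores available on your instance. On OVH instances, you can let Nginx detect the core count with `auto` or set it statically based on your instance type. The `worker_connections` directive sets the maximum number of simultaneous connections that each worker process can handle. The total theoretical maximum is `worker_processes * worker_connections`, but this figure counts every connection a worker holds, including connections to upstream servers. In a reverse-proxy setup each client request also occupies an upstream connection, so the number of clients served is roughly half of that.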

Consider an OVH instance with 8 vCPUs. A good starting point for `worker_processes` would be 8. If your application is I/O bound, you might even consider slightly more, but for CPU-bound tasks, matching the core count is usually optimal. `worker_connections` should be set high enough to accommodate peak load. A value of 4096 is a reasonable starting point, provided the per-process file descriptor limit (`worker_rlimit_nofile`, and the system limit reported by `ulimit -n`) is comfortably higher.
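The arithmetic above can be sketched in a few lines of Python. This is a rough estimate only: the function name `max_clients` is made up for illustration, and the "two connections per proxied request" rule is a common rule of thumb rather than an exact figure.

```python
import os


def max_clients(worker_processes: int, worker_connections: int,
                reverse_proxy: bool = True) -> int:
    """Estimate how many simultaneous clients Nginx can serve.

    worker_connections counts every connection a worker holds, including
    connections to upstream servers, so a proxied request is assumed to
    use two slots (client side and upstream side).
    """
    total = worker_processes * worker_connections
    return total // 2 if reverse_proxy else total


if __name__ == "__main__":
    # Roughly what `worker_processes auto` would pick on this machine.
    cores = os.cpu_count() or 1
    print(f"{cores} workers x 4096 connections:")
    print("  static files only:", max_clients(cores, 4096, reverse_proxy=False))
    print("  reverse proxy    :", max_clients(cores, 4096))
```

For the 8 vCPU example, `max_clients(8, 4096)` gives 16384 proxied clients out of 32768 total connection slots.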

Nginx Configuration Snippet

worker_processes auto; # Or set to the number of CPU cores, e.g., 8;
# Increase the maximum number of open file descriptors per process
worker_rlimit_nofile 65535;

events {
    worker_connections 4096; # Max connections per worker
    multi_accept on; # Accept multiple connections at once
    use epoll; # Linux-specific, high-performance event notification mechanism
}

http {
    # ... other http configurations ...

    sendfile on; # Efficiently transfer data from one file descriptor to another
    tcp_nopush on; # With sendfile, send response headers and the start of the file in one packet
    tcp_nodelay on; # Disables the Nagle algorithm, reducing latency for small packets
    keepalive_timeout 65; # Time to keep persistent connections open
    keepalive_requests 1000; # Max requests per keepalive connection

    # Enable Gzip compression for text-based assets
    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;

    # ... server blocks ...
}

After modifying nginx.conf, always test the configuration and gracefully reload Nginx:

Test Configuration:

sudo nginx -t

Reload Nginx:

sudo systemctl reload nginx

Tuning Gunicorn for Python Applications

Gunicorn (Green Unicorn) is a popular WSGI HTTP Server for Python. Its performance is heavily influenced by the number of worker processes and the worker class used. For CPU-bound tasks, the default synchronous worker class, `sync`, is often sufficient, while for I/O-bound applications, asynchronous workers like `gevent` or `eventlet` can offer better concurrency.

Worker Processes and Threads

The --workers flag determines the number of worker processes. A common heuristic is (2 * number_of_cpu_cores) + 1. This formula aims to keep CPU cores busy while accounting for I/O waits. For I/O-bound applications using asynchronous workers, the number of workers might be less critical than the number of concurrent connections each worker can handle.

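When using `sync` workers, each worker handles one request at a time. The --threads flag applies to the `gthread` worker class (Gunicorn switches to it automatically when --threads is greater than 1 with the default `sync` class). `gevent` and `eventlet` workers do not use threads; they multiplex requests on greenlets, and their per-worker concurrency is capped by the --worker-connections flag. However, for simplicity and robustness, especially when starting, using `sync` workers with an appropriate number of processes is often preferred.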

Gunicorn Command Line Example

Assuming an OVH instance with 8 vCPUs and a Python application that is moderately CPU and I/O bound, we might start with 17 workers (2*8 + 1). If the application is heavily I/O bound and we decide to use `gevent` workers, we might reduce the worker count and rely on --worker-connections to set how many concurrent connections each worker accepts.

Example using `sync` workers:

gunicorn --workers 17 \
         --bind 0.0.0.0:8000 \
         --timeout 120 \
         --graceful-timeout 120 \
         --log-level info \
         your_project.wsgi:application

Example using `gevent` workers (requires `gevent` installed):

pip install gevent
gunicorn --worker-class gevent \
         --workers 4 \
         --worker-connections 1000 \
         --bind 0.0.0.0:8000 \
         --timeout 120 \
         --graceful-timeout 120 \
         --log-level info \
         your_project.wsgi:application

The --timeout and --graceful-timeout values should be adjusted based on your application’s typical request processing time. Monitoring Gunicorn’s worker utilization and response times is crucial for further tuning.
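The worker heuristic and the flags from the `sync` example can also live in a Gunicorn configuration file, which is plain Python. The following is a minimal sketch, assuming a file named `gunicorn.conf.py`; the helper `suggested_workers` is a made-up name for illustration, while `workers`, `bind`, `timeout`, `graceful_timeout`, and `loglevel` are standard Gunicorn settings.

```python
import multiprocessing


def suggested_workers(cpu_cores: int) -> int:
    """Apply the (2 * cores) + 1 heuristic for sync workers."""
    return 2 * cpu_cores + 1


# Gunicorn reads these module-level names as settings.
# Note: cpu_count() reports the host's CPUs, which may overcount inside
# a container with a CPU quota; set the value by hand in that case.
workers = suggested_workers(multiprocessing.cpu_count())
bind = "0.0.0.0:8000"
timeout = 120           # seconds before a silent worker is killed and restarted
graceful_timeout = 120  # seconds workers get to finish requests on restart
loglevel = "info"
```

It would be loaded with `gunicorn -c gunicorn.conf.py your_project.wsgi:application`. On the 8 vCPU instance this yields the same 17 workers as the command-line example.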

PHP-FPM Configuration for PHP Applications

For PHP applications, PHP-FPM (FastCGI Process Manager) is the standard. Its performance tuning revolves around managing the pool of PHP worker processes. The primary configuration file is typically php-fpm.conf, with pool definitions in files within php-fpm.d/ (on Debian and Ubuntu, /etc/php/<version>/fpm/pool.d/).

Process Manager Settings

PHP-FPM offers three main process management strategies: static, dynamic, and ondemand. For predictable high-traffic scenarios on OVH, static or dynamic are generally preferred over ondemand, which can introduce latency on initial requests.

static: A fixed number of child processes are spawned when FPM starts and remain active. This offers the most predictable performance but can be memory-intensive if not sized correctly.

dynamic: FPM starts pm.start_servers processes and then keeps the number of idle processes between pm.min_spare_servers and pm.max_spare_servers, spawning more under load up to a hard cap of pm.max_children. This is a good balance between resource utilization and responsiveness.

ondemand: Processes are spawned only when a request arrives and are killed after a certain idle period. This conserves memory but can lead to higher latency for the first few requests.

PHP-FPM Pool Configuration Example (Dynamic)

Let’s consider an OVH instance with 8 vCPUs. We’ll configure a pool using the dynamic process manager. The key is to balance pm.max_children with available memory and CPU. A common starting point for pm.max_children is to ensure that the total memory footprint of all PHP processes doesn’t exceed a safe limit (e.g., 70-80% of available RAM). Each PHP-FPM worker can consume a significant amount of memory, especially with complex applications.

[www.example.com]
user = www-data
group = www-data
; Or a TCP socket such as 127.0.0.1:9000
listen = /run/php/php7.4-fpm.sock
listen.owner = www-data
listen.group = www-data
listen.mode = 0660

pm = dynamic
pm.min_spare_servers = 5
pm.max_spare_servers = 20
; Adjust based on memory and CPU. Start lower and increase.
pm.max_children = 100
pm.start_servers = 10
; Only used when pm = ondemand; it has no effect with pm = dynamic.
pm.process_idle_timeout = 10s

request_terminate_timeout = 120s
request_slowlog_timeout = 10s
slowlog = /var/log/php/php7.4-fpm.slow.log

catch_workers_output = yes
; Log verbosity is controlled by log_level in the [global] section, not per pool.

To apply these changes, restart the PHP-FPM service:

sudo systemctl restart php7.4-fpm # Adjust version as needed

The pm.max_children value is critical. If set too high, you’ll experience Out-Of-Memory (OOM) errors and system instability. If set too low, your application will struggle to handle concurrent requests. Monitor system memory usage (e.g., using htop or free -m) and PHP-FPM logs for insights.
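The memory-based sizing rule for pm.max_children can be turned into a quick calculation. This is a back-of-the-envelope sketch, not an FPM tool: the function name and the example figures (a 32 GB instance, 60 MB per worker) are assumptions for illustration, and the average worker size should be measured on your own system, for example from the RSS column in htop or ps.

```python
def max_children(available_ram_mb: float, avg_worker_mb: float,
                 budget_fraction: float = 0.75) -> int:
    """Largest pm.max_children that keeps PHP workers within a RAM budget.

    available_ram_mb: RAM left for PHP after the OS, Nginx, caches, etc.
    avg_worker_mb:    measured average resident size of one FPM worker.
    budget_fraction:  share of that RAM PHP may use (0.70-0.80 is typical).
    """
    if avg_worker_mb <= 0:
        raise ValueError("avg_worker_mb must be positive")
    budget_mb = available_ram_mb * budget_fraction
    # Always allow at least one worker so the pool can start.
    return max(1, int(budget_mb // avg_worker_mb))


if __name__ == "__main__":
    # Hypothetical instance: 32 GB available, 60 MB per worker.
    print(max_children(32768, 60))  # 409
```

With these assumed figures, the pool example's value of 100 leaves ample headroom; with 8 GB available and 80 MB workers, the same formula gives 76, so 100 would be too high.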

DynamoDB Performance Tuning (AWS, Accessed from OVH)

While OVH provides the compute and network infrastructure, your application might interact with AWS services like DynamoDB. Optimizing DynamoDB performance is crucial for applications with heavy read/write loads. The primary levers for tuning DynamoDB are Read Capacity Units (RCUs) and Write Capacity Units (WCUs).

Provisioned Throughput vs. On-Demand

Provisioned Throughput: You explicitly define the RCUs and WCUs your table or global secondary index (GSI) needs. This is cost-effective for predictable workloads. If you exceed provisioned capacity, you’ll experience throttling (ProvisionedThroughputExceededException).

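On-Demand Capacity: DynamoDB scales automatically with your traffic and bills per request, with no capacity to plan. This is ideal for unpredictable workloads but can be more expensive for consistently high traffic, and a spike far above the table's previous peak can still be throttled briefly.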
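To size provisioned throughput, it helps to translate a workload into capacity units. The sketch below uses the standard definitions: one RCU covers one strongly consistent read per second of an item up to 4 KB (an eventually consistent read costs half), and one WCU covers one write per second of an item up to 1 KB. The function names are illustrative, 1 KB is taken as 1024 bytes, and transactional operations (which cost double) are left out.

```python
import math

RCU_BLOCK_BYTES = 4 * 1024  # one RCU reads up to 4 KB
WCU_BLOCK_BYTES = 1024      # one WCU writes up to 1 KB


def read_capacity_units(item_bytes: int, reads_per_second: int,
                        strongly_consistent: bool = True) -> int:
    """RCUs needed for a steady read rate of items of a given size."""
    blocks = max(1, math.ceil(item_bytes / RCU_BLOCK_BYTES))
    units = blocks * reads_per_second
    if not strongly_consistent:
        # Eventually consistent reads cost half as much.
        units = math.ceil(units / 2)
    return units


def write_capacity_units(item_bytes: int, writes_per_second: int) -> int:
    """WCUs needed for a steady write rate of items of a given size."""
    blocks = max(1, math.ceil(item_bytes / WCU_BLOCK_BYTES))
    return blocks * writes_per_second


if __name__ == "__main__":
    print(read_capacity_units(6000, 100))                             # 200
    print(read_capacity_units(6000, 100, strongly_consistent=False))  # 100
    print(write_capacity_units(1500, 50))                             # 100
```

Item sizes are rounded up per item, so trimming a 4.1 KB item to just under 4 KB halves its read cost.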

Monitoring and Auto-Scaling

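For provisioned throughput, DynamoDB auto scaling (built on AWS Application Auto Scaling) is essential. It adjusts RCUs and WCUs based on actual consumption, which reduces throttling while optimizing costs. Because it reacts to CloudWatch metrics with some delay, very abrupt spikes can still be throttled before capacity catches up. You define target utilization percentages (e.g., 70% for reads, 50% for writes).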
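The effect of a target utilization can be approximated with simple arithmetic: the scaler aims for a provisioned capacity at which consumed capacity divided by provisioned capacity equals the target. The following is a simplified model only, with an illustrative function name; the real service works through CloudWatch alarms and cooldown periods rather than an instant recalculation.

```python
import math


def provisioned_for_target(consumed_units: float, target_utilization_pct: float,
                           min_capacity: int, max_capacity: int) -> int:
    """Capacity a target-tracking policy would steer towards.

    Utilization is consumed / provisioned, so the capacity that hits the
    target is consumed / (target / 100), clamped to the registered bounds.
    """
    if not 0 < target_utilization_pct <= 100:
        raise ValueError("target_utilization_pct must be in (0, 100]")
    desired = math.ceil(consumed_units * 100 / target_utilization_pct)
    return max(min_capacity, min(max_capacity, desired))


if __name__ == "__main__":
    print(provisioned_for_target(350, 70, 5, 1000))  # 500
    print(provisioned_for_target(2, 70, 5, 1000))    # 5 (clamped to the minimum)
    print(provisioned_for_target(900, 50, 5, 1000))  # 1000 (clamped to the maximum)
```

A lower target, such as 50% for writes, keeps more headroom for bursts at the price of paying for more idle capacity.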

AWS CLI Example for Auto Scaling Configuration

This example configures auto-scaling for a DynamoDB table itself (the base table, not its global secondary indexes). Ensure you have the AWS CLI configured with appropriate credentials.

# Define the DynamoDB table name and region
TABLE_NAME="your-dynamodb-table-name"
REGION="us-east-1" # Or your preferred AWS region

# Register the table with Application Auto Scaling first;
# put-scaling-policy fails for a target that has not been registered.
aws application-autoscaling register-scalable-target \
    --service-namespace dynamodb \
    --resource-id "table/$TABLE_NAME" \
    --scalable-dimension "dynamodb:table:ReadCapacityUnits" \
    --min-capacity 5 \
    --max-capacity 1000 \
    --region "$REGION"

aws application-autoscaling register-scalable-target \
    --service-namespace dynamodb \
    --resource-id "table/$TABLE_NAME" \
    --scalable-dimension "dynamodb:table:WriteCapacityUnits" \
    --min-capacity 5 \
    --max-capacity 1000 \
    --region "$REGION"

# Define the target-tracking scaling policy for Read Capacity
aws application-autoscaling put-scaling-policy \
    --service-namespace dynamodb \
    --resource-id "table/$TABLE_NAME" \
    --scalable-dimension "dynamodb:table:ReadCapacityUnits" \
    --policy-name "TargetTrackingReadScaling" \
    --policy-type TargetTrackingScaling \
    --target-tracking-scaling-policy-configuration '{
        "TargetValue": 70.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "DynamoDBReadCapacityUtilization"
        },
        "ScaleOutCooldown": 60,
        "ScaleInCooldown": 300
    }' \
    --region "$REGION"

# Define the target-tracking scaling policy for Write Capacity
aws application-autoscaling put-scaling-policy \
    --service-namespace dynamodb \
    --resource-id "table/$TABLE_NAME" \
    --scalable-dimension "dynamodb:table:WriteCapacityUnits" \
    --policy-name "TargetTrackingWriteScaling" \
    --policy-type TargetTrackingScaling \
    --target-tracking-scaling-policy-configuration '{
        "TargetValue": 50.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "DynamoDBWriteCapacityUtilization"
        },
        "ScaleOutCooldown": 60,
        "ScaleInCooldown": 300
    }' \
    --region "$REGION"

Remember to adjust min-capacity and max-capacity based on your expected traffic patterns and cost considerations. Monitoring DynamoDB’s ConsumedReadCapacityUnits and ConsumedWriteCapacityUnits metrics in CloudWatch is essential for validating your scaling configurations.