The Ultimate DevOps Playbook: Tuning Nginx, Gunicorn/FPM, and DynamoDB on OVH for C
Optimizing Nginx for High Throughput on OVH Instances
When deploying high-traffic applications on OVH infrastructure, Nginx serves as a critical front-end. Fine-tuning its configuration is paramount for maximizing request handling capacity and minimizing latency. We’ll focus on key directives that directly impact performance, particularly in a reverse-proxy setup for Gunicorn or PHP-FPM.
Worker Processes and Connections
The `worker_processes` directive dictates how many worker processes Nginx will spawn. A common recommendation is to set this to the number of CPU cores available on your instance. For OVH instances, you can let Nginx detect the core count with `auto` or set it statically based on your instance type. The `worker_connections` directive sets the maximum number of simultaneous connections that each worker process can handle. The total theoretical maximum is `worker_processes * worker_connections`; as a reverse proxy, Nginx counts the upstream connection as well as the client connection, so the practical client limit is roughly half of that.
Consider an OVH instance with 8 vCPUs. A good starting point for `worker_processes` would be 8. If your application is I/O bound, you might even consider slightly more, but for CPU-bound tasks, matching the core count is usually optimal. `worker_connections` should be set high enough to accommodate peak load. A value of 4096 is often a safe starting point, but every connection consumes a file descriptor, so keep `worker_rlimit_nofile` and the system's file descriptor limits comfortably above it.
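The connection arithmetic above can be sketched in a few lines of Python. This is an illustrative estimate only: the function name `nginx_capacity` is hypothetical, and the assumption that every proxied client holds two connections (one from the client, one to the upstream) is a rule of thumb rather than an exact figure.

```python
def nginx_capacity(worker_processes: int, worker_connections: int,
                   proxied: bool = True) -> int:
    """Return a rough upper bound on simultaneous clients."""
    total = worker_processes * worker_connections
    # Assumption: when proxying, each client occupies two connections
    # (client side and upstream side), which halves the usable total.
    return total // 2 if proxied else total


if __name__ == "__main__":
    # 8 workers with 4096 connections each, as in the example instance
    print(nginx_capacity(8, 4096))                 # 16384
    print(nginx_capacity(8, 4096, proxied=False))  # 32768
```

Treat the result as a ceiling for planning; real capacity also depends on file descriptor limits, memory, and upstream speed.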
Nginx Configuration Snippet
worker_processes auto; # Or set to the number of CPU cores, e.g., 8

# Increase the maximum number of open file descriptors per process
worker_rlimit_nofile 65535;

events {
    worker_connections 4096; # Max connections per worker
    multi_accept on; # Accept multiple connections at once
    use epoll; # Linux-specific, high-performance event notification mechanism
}

http {
    # ... other http configurations ...
    sendfile on; # Efficiently transfer data from one file descriptor to another
    tcp_nopush on; # Improves efficiency of sending data over TCP
    tcp_nodelay on; # Disables the Nagle algorithm, reducing latency for small packets
    keepalive_timeout 65; # Time to keep persistent connections open
    keepalive_requests 1000; # Max requests per keepalive connection

    # Enable Gzip compression for text-based assets
    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;

    # ... server blocks ...
}
After modifying nginx.conf, always test the configuration and gracefully reload Nginx:
Test Configuration:
sudo nginx -t
Reload Nginx:
sudo systemctl reload nginx
Tuning Gunicorn for Python Applications
Gunicorn (Green Unicorn) is a popular WSGI HTTP server for Python. Its performance is heavily influenced by the number of worker processes and the worker class used. For CPU-bound tasks, a synchronous worker class like `sync` is often sufficient, while for I/O-bound applications, asynchronous workers like `gevent` or `eventlet` can offer better concurrency.
Worker Processes and Threads
The --workers flag determines the number of worker processes. A common heuristic is (2 * number_of_cpu_cores) + 1. This formula aims to keep CPU cores busy while accounting for I/O waits. For I/O-bound applications using asynchronous workers, the number of workers might be less critical than the number of concurrent connections each worker can handle.
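The heuristic above can also live in a Gunicorn configuration file instead of on the command line. The following is a minimal sketch, assuming a file named `gunicorn.conf.py`; the helper `suggested_workers` is a hypothetical name, and `multiprocessing.cpu_count()` may report the host's cores rather than your instance's quota in some containerized setups.

```python
import multiprocessing


def suggested_workers(cpu_cores: int) -> int:
    """Apply the (2 * cores) + 1 starting-point heuristic."""
    return 2 * cpu_cores + 1


# Gunicorn reads these module-level names as settings when the file
# is passed with -c. Values mirror the command-line examples.
workers = suggested_workers(multiprocessing.cpu_count())
bind = "0.0.0.0:8000"
timeout = 120
graceful_timeout = 120
loglevel = "info"
```

It would be started with something like `gunicorn -c gunicorn.conf.py your_project.wsgi:application`. On an 8 vCPU instance this yields 17 workers, which is a starting point to validate under load rather than a final value.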
When using `sync` workers, each worker handles one request at a time. The --threads flag applies to the threaded `gthread` worker class, where each worker process serves requests from a pool of threads. `gevent` and `eventlet` workers do not use it; their per-worker concurrency is capped by --worker-connections instead. However, for simplicity and robustness, especially when starting, using `sync` workers with an appropriate number of processes is often preferred.
Gunicorn Command Line Example
Assuming an OVH instance with 8 vCPUs and a Python application that is moderately CPU and I/O bound, we might start with 17 workers (2*8 + 1). If the application is heavily I/O bound and we decide to use `gevent` workers, we might reduce the worker count and rely on the many concurrent connections each worker can serve.
Example using `sync` workers:
gunicorn --workers 17 \
--bind 0.0.0.0:8000 \
--timeout 120 \
--graceful-timeout 120 \
--log-level info \
your_project.wsgi:application
Example using `gevent` workers (requires `gevent` installed):
pip install gevent
gunicorn --worker-class gevent \
--workers 4 \
--worker-connections 1000 \
--bind 0.0.0.0:8000 \
--timeout 120 \
--graceful-timeout 120 \
--log-level info \
your_project.wsgi:application
The --timeout and --graceful-timeout values should be adjusted based on your application’s typical request processing time. Monitoring Gunicorn’s worker utilization and response times is crucial for further tuning.
PHP-FPM Configuration for PHP Applications
For PHP applications, PHP-FPM (FastCGI Process Manager) is the standard. Its performance tuning revolves around managing the pool of PHP worker processes. The primary configuration file is typically php-fpm.conf or files within php-fpm.d/.
Process Manager Settings
PHP-FPM offers three main process management strategies: static, dynamic, and ondemand. For predictable high-traffic scenarios on OVH, static or dynamic are generally preferred over ondemand, which can introduce latency on initial requests.
static: A fixed number of child processes are spawned when FPM starts and remain active. This offers the most predictable performance but can be memory-intensive if not sized correctly.
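dynamic: The number of child processes varies with demand. FPM starts pm.start_servers processes, keeps the number of idle processes between pm.min_spare_servers and pm.max_spare_servers, and never runs more than pm.max_children in total. This is a good balance between resource utilization and responsiveness.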
ondemand: Processes are spawned only when a request arrives and are killed after a certain idle period. This conserves memory but can lead to higher latency for the first few requests.
PHP-FPM Pool Configuration Example (Dynamic)
Let’s consider an OVH instance with 8 vCPUs. We’ll configure a pool using the dynamic process manager. The key is to balance pm.max_children with available memory and CPU. A common starting point for pm.max_children is to ensure that the total memory footprint of all PHP processes doesn’t exceed a safe limit (e.g., 70-80% of available RAM). Each PHP-FPM worker can consume a significant amount of memory, especially with complex applications.
[www.example.com]
user = www-data
group = www-data
; Or a TCP socket such as 127.0.0.1:9000
listen = /run/php/php7.4-fpm.sock
listen.owner = www-data
listen.group = www-data
listen.mode = 0660
pm = dynamic
; Adjust based on memory and CPU. Start lower and increase.
pm.max_children = 100
pm.start_servers = 10
pm.min_spare_servers = 5
pm.max_spare_servers = 20
; Only takes effect with pm = ondemand
pm.process_idle_timeout = 10s
request_terminate_timeout = 120s
request_slowlog_timeout = 10s
slowlog = /var/log/php/php7.4-fpm.slow.log
catch_workers_output = yes
To apply these changes, restart the PHP-FPM service:
sudo systemctl restart php7.4-fpm # Adjust version as needed
The pm.max_children value is critical. If set too high, you’ll experience Out-Of-Memory (OOM) errors and system instability. If set too low, your application will struggle to handle concurrent requests. Monitor system memory usage (e.g., using htop or free -m) and PHP-FPM logs for insights.
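The memory-based sizing of pm.max_children can be sketched as a small calculation. This is an estimate under stated assumptions: the function name `max_children`, the 32 GB instance, the 60 MB average worker size, and the 2 GB reserved for Nginx and the OS are all hypothetical figures. Measure your own workers' resident memory before relying on it.

```python
def max_children(total_ram_mb: float, avg_worker_mb: float,
                 ram_fraction: float = 0.75, reserved_mb: float = 0) -> int:
    """Estimate pm.max_children from a memory budget.

    ram_fraction is the share of RAM PHP-FPM may use (0.75 sits in the
    70-80% range mentioned above); reserved_mb is memory set aside for
    other services on the same host.
    """
    if avg_worker_mb <= 0:
        raise ValueError("avg_worker_mb must be positive")
    budget_mb = total_ram_mb * ram_fraction - reserved_mb
    # Round down so the pool never exceeds the budget; keep at least 1.
    return max(1, int(budget_mb // avg_worker_mb))


if __name__ == "__main__":
    # Hypothetical: 32 GB RAM, 60 MB per worker, 2 GB reserved
    print(max_children(32768, 60, reserved_mb=2048))  # 375
```

If the estimate comes out far above what the CPU can serve concurrently, the lower of the two limits should win.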
DynamoDB Performance Tuning on AWS (via OVH)
While OVH provides the compute and network infrastructure, your application might interact with AWS services like DynamoDB. Optimizing DynamoDB performance is crucial for applications with heavy read/write loads. The primary levers for tuning DynamoDB are Read Capacity Units (RCUs) and Write Capacity Units (WCUs).
Provisioned Throughput vs. On-Demand
Provisioned Throughput: You explicitly define the RCUs and WCUs your table or global secondary index (GSI) needs. This is cost-effective for predictable workloads. If you exceed provisioned capacity, you’ll experience throttling (ProvisionedThroughputExceededException).
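On-Demand Capacity: DynamoDB adjusts capacity automatically as traffic changes and bills per request, so no capacity planning is required. This is ideal for unpredictable workloads but can be more expensive for consistently high traffic, and a sudden spike far above previous peaks can still be throttled briefly.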
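To choose provisioned values, it helps to translate an expected request rate into capacity units. The sketch below assumes the standard DynamoDB accounting: one RCU covers one strongly consistent read per second of an item up to 4 KB (eventually consistent reads cost half), and one WCU covers one write per second of an item up to 1 KB. The function names and the sample workload are hypothetical, and transactional requests are not modeled.

```python
import math


def read_capacity_units(reads_per_sec: int, item_kb: float,
                        strongly_consistent: bool = True) -> int:
    """Estimate RCUs; item size is rounded up to the next 4 KB."""
    units_per_read = math.ceil(item_kb / 4)
    rcu = reads_per_sec * units_per_read
    # Eventually consistent reads consume half the capacity.
    return rcu if strongly_consistent else math.ceil(rcu / 2)


def write_capacity_units(writes_per_sec: int, item_kb: float) -> int:
    """Estimate WCUs; item size is rounded up to the next 1 KB."""
    return writes_per_sec * math.ceil(item_kb)


if __name__ == "__main__":
    # Hypothetical workload: 500 reads/s of 6 KB items, 100 writes/s of 2.5 KB items
    print(read_capacity_units(500, 6))                             # 1000
    print(read_capacity_units(500, 6, strongly_consistent=False))  # 500
    print(write_capacity_units(100, 2.5))                          # 300
```

These figures describe steady-state averages; uneven key distribution can throttle individual partitions well before the table-level total is reached.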
Monitoring and Auto-Scaling
For provisioned throughput, AWS Auto Scaling for DynamoDB is essential. It automatically adjusts RCUs and WCUs based on actual consumption, preventing throttling while optimizing costs. You define target utilization percentages (e.g., 70% for reads, 50% for writes).
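The idea behind target utilization can be illustrated with simple arithmetic: the provisioned capacity that puts current consumption exactly at the target is consumption divided by the target fraction, clamped to the configured bounds. This is a simplified model, not the actual Application Auto Scaling algorithm, which reacts to CloudWatch alarms and cooldown periods; the function name `target_capacity` is hypothetical.

```python
import math


def target_capacity(consumed_units: float, target_utilization_pct: float,
                    min_capacity: int, max_capacity: int) -> int:
    """Capacity at which consumed_units equals the target utilization."""
    if not 0 < target_utilization_pct <= 100:
        raise ValueError("target_utilization_pct must be in (0, 100]")
    # Multiply before dividing to avoid rounding artifacts from 0.7-style fractions.
    desired = math.ceil(consumed_units * 100 / target_utilization_pct)
    # Never go below the minimum or above the maximum registered capacity.
    return max(min_capacity, min(max_capacity, desired))


if __name__ == "__main__":
    print(target_capacity(350, 70, 5, 1000))  # 500: reads at a 70% target
    print(target_capacity(40, 50, 5, 1000))   # 80: writes at a 50% target
    print(target_capacity(900, 50, 5, 1000))  # 1000: clamped to max capacity
```

A lower target leaves more headroom for bursts at higher cost, which is why writes are often given a more conservative target than reads.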
AWS CLI Example for Auto Scaling Configuration
This example configures auto scaling for the read and write capacity of a DynamoDB table itself (global secondary indexes are scaled separately). Ensure you have the AWS CLI configured with appropriate credentials.
# Define the DynamoDB table name and region
TABLE_NAME="your-dynamodb-table-name"
REGION="us-east-1" # Or your preferred AWS region
# Register the table's read and write capacity as scalable targets.
# This must happen before a scaling policy can be attached.
aws application-autoscaling register-scalable-target \
--service-namespace dynamodb \
--resource-id "table/$TABLE_NAME" \
--scalable-dimension "dynamodb:table:ReadCapacityUnits" \
--min-capacity 5 \
--max-capacity 1000 \
--region "$REGION"
aws application-autoscaling register-scalable-target \
--service-namespace dynamodb \
--resource-id "table/$TABLE_NAME" \
--scalable-dimension "dynamodb:table:WriteCapacityUnits" \
--min-capacity 5 \
--max-capacity 1000 \
--region "$REGION"
# Define the target-tracking scaling policy for Read Capacity
aws application-autoscaling put-scaling-policy \
--service-namespace dynamodb \
--resource-id "table/$TABLE_NAME" \
--scalable-dimension "dynamodb:table:ReadCapacityUnits" \
--policy-name "TargetTrackingReadScaling" \
--policy-type TargetTrackingScaling \
--target-tracking-scaling-policy-configuration '{
"TargetValue": 70.0,
"PredefinedMetricSpecification": {
"PredefinedMetricType": "DynamoDBReadCapacityUtilization"
},
"ScaleOutCooldown": 60,
"ScaleInCooldown": 300
}' \
--region "$REGION"
# Define the target-tracking scaling policy for Write Capacity
aws application-autoscaling put-scaling-policy \
--service-namespace dynamodb \
--resource-id "table/$TABLE_NAME" \
--scalable-dimension "dynamodb:table:WriteCapacityUnits" \
--policy-name "TargetTrackingWriteScaling" \
--policy-type TargetTrackingScaling \
--target-tracking-scaling-policy-configuration '{
"TargetValue": 50.0,
"PredefinedMetricSpecification": {
"PredefinedMetricType": "DynamoDBWriteCapacityUtilization"
},
"ScaleOutCooldown": 60,
"ScaleInCooldown": 300
}' \
--region "$REGION"
Remember to adjust min-capacity and max-capacity based on your expected traffic patterns and cost considerations. Monitoring DynamoDB’s ConsumedReadCapacityUnits and ConsumedWriteCapacityUnits metrics in CloudWatch is essential for validating your scaling configurations.