The Ultimate DevOps Playbook: Tuning Nginx, Gunicorn/FPM, and MongoDB on OVH for C
OVH Infrastructure: A Performance Baseline
This playbook assumes a typical OVH Public Cloud deployment: a dedicated instance running Ubuntu LTS, serving a web application powered by Python (Gunicorn) or PHP (FPM), with a MongoDB backend. Our goal is to systematically tune each layer for optimal performance under load, focusing on resource utilization and latency reduction. We’ll start by establishing a baseline and then iteratively optimize.
Nginx Tuning for High Throughput
Nginx acts as our reverse proxy, load balancer, and static file server. Its configuration is critical for efficient request handling and resource management.
Worker Processes and Connections
The `worker_processes` directive should ideally be set to the number of CPU cores available; `auto` makes Nginx detect this on its own. `worker_connections` sets the maximum number of simultaneous connections each worker can hold open, and when Nginx proxies, that count includes the connections to the upstream as well as those from clients. A common starting point is a value that, when multiplied by `worker_processes`, exceeds your expected peak concurrent connections while staying within the system's file descriptor limits.
Configuration Snippet (nginx.conf)
# Match the number of CPU cores ("auto" lets Nginx detect this itself)
# Example: If you have 8 cores, set worker_processes to 8
worker_processes 8;
events {
# Enable epoll on Linux for better scalability
use epoll;
# worker_connections is only valid inside the events block.
# Adjust based on expected peak concurrent connections and the system's
# file descriptor limits; 1024 or higher is a common starting point.
worker_connections 4096;
}
# Other Nginx configurations (http block, server blocks, etc.) follow...
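As a rough capacity check, the arithmetic behind these two directives can be sketched in Python. The function names are illustrative and not part of Nginx. The halving for proxied traffic assumes every client connection is paired with one upstream connection, which is a simplification.

```python
import resource


def max_clients(worker_processes: int, worker_connections: int, proxied: bool = True) -> int:
    """Upper bound on simultaneous clients for the given Nginx settings."""
    total = worker_processes * worker_connections
    # Assumption: when proxying, each client connection also holds one upstream connection.
    return total // 2 if proxied else total


def fd_limit_ok(worker_connections: int) -> bool:
    """Compare worker_connections with the soft open-file limit of THIS process.

    Run it as the user that starts Nginx to get a comparable number.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft == resource.RLIM_INFINITY or soft >= worker_connections


if __name__ == "__main__":
    print("max clients (proxied):", max_clients(8, 4096))
    print("fd limit sufficient:", fd_limit_ok(4096))
```

If the limit check fails, raising the open-file limit (for example with Nginx's `worker_rlimit_nofile` directive) is usually the next step to investigate.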
Keepalive Connections
Enabling HTTP keep-alive reduces the overhead of establishing new TCP connections for each request. Tune `keepalive_timeout` and `keepalive_requests` to balance resource usage with connection efficiency.
Configuration Snippet (nginx.conf – http block)
http {
# ... other http settings ...
# Enable keepalive connections
keepalive_timeout 65; # Time to keep a connection open after the last request
keepalive_requests 100; # Max requests per keepalive connection
# ... rest of http block ...
}
Buffering and Caching
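Nginx’s buffering can significantly impact performance, especially for large requests or responses. Adjusting `client_body_buffer_size`, `client_header_buffer_size`, `large_client_header_buffers`, and `proxy_buffers` can optimize memory usage and I/O; the `proxy_*` directives apply to `proxy_pass` upstreams such as Gunicorn, while PHP-FPM behind `fastcgi_pass` uses the matching `fastcgi_*` directives. For static assets, enable `open_file_cache`, which caches open file descriptors and file metadata (not file contents) so repeated lookups skip the filesystem calls.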
Configuration Snippet (nginx.conf – http block)
http {
# ... other http settings ...
# Buffering settings
client_body_buffer_size 10K;
client_header_buffer_size 1K;
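large_client_header_buffers 4 8k; # Number of buffers and size of each (Nginx default); a bare "128" would mean 128 bytes
# Proxy buffering (for proxy_pass upstreams such as Gunicorn; use fastcgi_buffer_size / fastcgi_buffers for PHP-FPM)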
proxy_buffer_size 128k;
proxy_buffers 4 256k;
proxy_busy_buffers_size 256k;
# Open file cache for static assets
open_file_cache max=2000 inactive=20s;
open_file_cache_valid 30s;
open_file_cache_min_uses 2;
open_file_cache_errors on;
# ... rest of http block ...
}
Gzip Compression
Enabling Gzip compression reduces bandwidth usage and speeds up content delivery. Tune `gzip_comp_level` and `gzip_types` for optimal balance between CPU usage and compression ratio.
Configuration Snippet (nginx.conf – http block)
http {
# ... other http settings ...
# Gzip compression
gzip on;
gzip_vary on;
gzip_proxied any;
gzip_comp_level 6; # Compression level (1-9)
gzip_buffers 16 8k;
gzip_http_version 1.1;
gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;
# ... rest of http block ...
}
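To get a feel for the CPU-versus-ratio trade-off behind `gzip_comp_level`, the sketch below compresses a sample payload at several levels with Python's `zlib` module. This is only an approximation of what Nginx does, and the sample payload is made up; measure with your own responses.

```python
import zlib


def compressed_sizes(payload: bytes, levels=(1, 6, 9)) -> dict:
    """Return the compressed size in bytes for each zlib compression level."""
    return {level: len(zlib.compress(payload, level)) for level in levels}


if __name__ == "__main__":
    # Hypothetical JSON-like payload; replace with a real response body.
    sample = b'{"id": 1, "name": "example", "tags": ["a", "b"]}\n' * 500
    for level, size in compressed_sizes(sample).items():
        print(f"level {level}: {len(sample)} -> {size} bytes")
```

Higher levels typically yield diminishing size reductions for increasing CPU time, which is why a mid-range level is a common starting point.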
Gunicorn Tuning for Python Applications
Gunicorn (Green Unicorn) is a Python WSGI HTTP Server. Its performance is heavily influenced by the number of worker processes and their type.
Worker Processes and Type
The `workers` setting is crucial. A common recommendation is `(2 * number_of_cores) + 1`. For I/O-bound applications, consider using `gevent` or `eventlet` workers, which support asynchronous I/O. For CPU-bound tasks, the default `sync` workers are often sufficient, but the `gthread` worker class (combined with the `threads` setting) can be used if your application is thread-safe and benefits from concurrency within a single process.
Command Line Arguments
# Example for 8 CPU cores, using sync workers
gunicorn --workers 17 --worker-class sync --bind 0.0.0.0:8000 myapp.wsgi:application
# Example using gevent workers for I/O bound applications
gunicorn --workers 17 --worker-class gevent --bind 0.0.0.0:8000 myapp.wsgi:application
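The worker formula above can be written as a small helper. The optional memory cap is an addition for illustration, not a Gunicorn feature: `per_worker_mb` and `budget_mb` are hypothetical inputs you would measure on your own instance.

```python
import multiprocessing


def suggested_workers(cores: int, per_worker_mb: int = 0, budget_mb: int = 0) -> int:
    """Apply (2 * cores) + 1, optionally capped by a memory budget."""
    by_cpu = 2 * cores + 1
    if per_worker_mb > 0 and budget_mb > 0:
        # Never suggest fewer than one worker, even with a tiny budget.
        by_memory = max(1, budget_mb // per_worker_mb)
        return min(by_cpu, by_memory)
    return by_cpu


if __name__ == "__main__":
    cores = multiprocessing.cpu_count()
    print("workers for this machine:", suggested_workers(cores))
```

For example, 8 cores with workers of roughly 200 MB each and a 2000 MB budget would be capped at 10 workers rather than 17.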
Worker Timeout and Max Requests
Setting `timeout` prevents workers from hanging indefinitely on slow requests. `max_requests` helps to prevent memory leaks by restarting workers after a certain number of requests.
Configuration Snippet (gunicorn.conf.py)
import multiprocessing

bind = "0.0.0.0:8000"
workers = (2 * multiprocessing.cpu_count()) + 1
worker_class = "sync"  # or "gevent", "eventlet", "gthread"

# Timeout in seconds for worker requests
timeout = 30

# Restart workers after this many requests
max_requests = 1000

# Other settings can be added here, e.g., logging, access logs, etc.
# accesslog = "/var/log/gunicorn/access.log"
# errorlog = "/var/log/gunicorn/error.log"
PHP-FPM Tuning for PHP Applications
PHP-FPM (FastCGI Process Manager) is the de facto standard for running PHP applications. Its performance hinges on the process manager settings.
Process Manager Settings
The `pm` (process manager) can be set to `static`, `dynamic`, or `ondemand`. For predictable high-load scenarios, `static` can offer the lowest latency by keeping a fixed number of workers ready. `dynamic` is a good balance, scaling workers based on demand. `ondemand` is resource-efficient but can introduce latency on initial requests.
Configuration Snippet (php-fpm.conf or pool.d/www.conf)
; Example for dynamic process management
pm = dynamic
pm.max_children = 50 ; Maximum number of child processes that can run at once.
pm.start_servers = 5 ; Number of children created on startup (dynamic mode only). A value near your CPU core count is a reasonable start; it must lie between the two spare-server values.
pm.min_spare_servers = 2 ; Minimum number of idle server processes.
pm.max_spare_servers = 8 ; Maximum number of idle server processes.
pm.process_idle_timeout = 10s ; Seconds after which an idle process is killed. Only used when pm = ondemand.

; Example for static process management (for predictable high load)
; pm = static
; pm.max_children = 50 ; Fixed number of children

; Other important settings:
request_terminate_timeout = 30s ; Timeout for serving a single request
listen.owner = www-data
listen.group = www-data
listen.mode = 0660
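A common way to arrive at a `pm.max_children` value is to divide the memory you can spare for PHP by the average size of one PHP-FPM process. The sketch below shows that arithmetic; the function name and all input figures are assumptions for illustration, so measure the real process size on your own instance.

```python
def estimate_max_children(total_ram_mb: int, reserved_mb: int, avg_process_mb: int) -> int:
    """Estimate pm.max_children from a memory budget.

    reserved_mb covers everything that is not PHP-FPM (OS, Nginx, MongoDB, ...).
    """
    if avg_process_mb <= 0:
        raise ValueError("avg_process_mb must be positive")
    available_mb = total_ram_mb - reserved_mb
    # Always allow at least one child so the pool can start.
    return max(1, available_mb // avg_process_mb)


if __name__ == "__main__":
    # Hypothetical figures: 8 GB instance, half reserved, 80 MB per PHP process.
    print(estimate_max_children(8192, 4096, 80))
```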
Opcode Caching
Opcode caching (e.g., OPcache) is non-negotiable for PHP performance. Ensure it’s enabled and tuned correctly.
Configuration Snippet (php.ini)
[OPcache]
opcache.enable=1
opcache.enable_cli=1
opcache.memory_consumption=128 ; In megabytes. Adjust based on your application's code size
opcache.interned_strings_buffer=16
opcache.max_accelerated_files=10000
opcache.revalidate_freq=2 ; Check for file updates every 2 seconds (adjust for development vs production)
opcache.validate_timestamps=1 ; Set to 0 in production for maximum performance if you have a deployment process that invalidates cache
opcache.save_comments=1
; opcache.load_comments was removed in PHP 7.0; leave it out on current versions
opcache.enable_file_override=0
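`opcache.max_accelerated_files` should comfortably exceed the number of PHP files in the application. A minimal sketch for counting them follows; the helper names and the 1.5x headroom factor are arbitrary choices for illustration, not OPcache recommendations.

```python
import os


def count_php_files(root: str) -> int:
    """Count files ending in .php under root, including subdirectories."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += sum(1 for name in filenames if name.endswith(".php"))
    return total


def suggested_max_files(file_count: int, headroom: float = 1.5) -> int:
    """Add headroom so new deployments do not immediately exceed the limit."""
    return int(file_count * headroom)


if __name__ == "__main__":
    # Hypothetical application path; point this at your own code base.
    count = count_php_files("/var/www/html")
    print(count, "PHP files; consider max_accelerated_files >=", suggested_max_files(count))
```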
MongoDB Tuning for Performance
MongoDB performance is heavily dependent on hardware, schema design, indexing, and configuration. On OVH, disk I/O and RAM are key resources.
WiredTiger Storage Engine
WiredTiger is the default and recommended storage engine. Its performance is influenced by cache size and compression.
Configuration Snippet (mongod.conf)
storage:
  dbPath: /var/lib/mongodb
  journal:
    enabled: true # Journaling is always on from MongoDB 6.1, which removed this option; omit it there
  engine: wiredTiger
  wiredTiger:
    engineConfig:
      # cacheSizeGB is an absolute size in gigabytes, not a fraction of RAM
      # (0.75 would mean 0.75 GB). Leave room for the OS and other processes.
      # Example: If you have 8GB RAM, and OS/other processes use ~2GB, allocate 6GB.
      cacheSizeGB: 6
    collectionConfig:
      blockCompressor: snappy # or zstd for better compression at a slight CPU cost
    indexConfig:
      prefixCompression: true
# Other configurations like replication, sharding, network, security, etc.
# ...
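For comparison with any explicit `cacheSizeGB` value, the sketch below computes the cache size MongoDB picks when the setting is omitted: the larger of 50% of (RAM minus 1 GB) and 256 MB, per the MongoDB documentation. The function name is illustrative.

```python
def default_wiredtiger_cache_gb(ram_gb: float) -> float:
    """WiredTiger's default internal cache size for a host with ram_gb of RAM."""
    # Larger of 50% of (RAM - 1 GB) and 256 MB (0.25 GB).
    return max(0.5 * (ram_gb - 1), 0.25)


if __name__ == "__main__":
    for ram in (2, 8, 32):
        print(f"{ram} GB RAM -> default cache {default_wiredtiger_cache_gb(ram)} GB")
```

On a dedicated database host, going above the default can help; on an instance shared with Gunicorn or PHP-FPM workers, a value at or below the default leaves more memory for them and for the filesystem cache.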
Indexing Strategy
Proper indexing is paramount. Use `explain()` to analyze query performance and identify missing or inefficient indexes. Regularly review slow query logs.
Example: Analyzing a Query
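// Connect to your MongoDB instance, then append explain() to the query.
// Node.js driver: db.collection('your_collection').find({ field1: 'value1', field2: 'value2' }).explain()
// mongosh: db.your_collection.find({ field1: 'value1', field2: 'value2' }).explain()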
// Example output indicating a collection scan (bad)
// {
// "queryPlanner": {
// "namespace": "your_db.your_collection",
// "indexFilterSet": false,
// "winningPlan": {
// "stage": "COLLSCAN",
// "direction": "forward"
// },
// // ... other details
// },
// // ...
// }
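// Example output indicating index usage (good). For queries the index does not fully cover, IXSCAN usually appears as the inputStage of a FETCH stage.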
// {
// "queryPlanner": {
// "namespace": "your_db.your_collection",
// "indexFilterSet": false,
// "winningPlan": {
// "stage": "IXSCAN",
// "keyPattern": { "field1": 1, "field2": 1 },
// "indexName": "idx_field1_field2",
// // ... other details
// },
// // ...
// },
// // ...
// }
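Checking explain output by eye does not scale to many queries. The sketch below walks a plan that has already been loaded as a Python dict (for instance from a driver call or saved JSON) and flags collection scans. It assumes the classic `winningPlan` layout shown above, with nested `inputStage` / `inputStages` keys; newer server versions may wrap the plan differently, so treat it as a starting point.

```python
def plan_stages(plan: dict) -> list:
    """Collect stage names from a winningPlan, outermost stage first."""
    stages = []
    if "stage" in plan:
        stages.append(plan["stage"])
    if "inputStage" in plan:
        stages.extend(plan_stages(plan["inputStage"]))
    for child in plan.get("inputStages", []):
        stages.extend(plan_stages(child))
    return stages


def uses_collection_scan(explain_output: dict) -> bool:
    """Return True if the winning plan contains a COLLSCAN stage."""
    winning = explain_output.get("queryPlanner", {}).get("winningPlan", {})
    return "COLLSCAN" in plan_stages(winning)


if __name__ == "__main__":
    sample = {"queryPlanner": {"winningPlan": {"stage": "COLLSCAN", "direction": "forward"}}}
    print("needs an index:", uses_collection_scan(sample))
```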
Monitoring and Diagnostics
Regular monitoring is key to identifying bottlenecks. Utilize tools like `htop`, `iotop`, `netstat` (or its successor `ss`), Nginx’s `stub_status` module, Gunicorn’s statsD instrumentation, the PHP-FPM status page, and MongoDB’s `mongostat` and `mongotop`.
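As one example of turning these tools into numbers you can graph, the sketch below parses the plain-text page served by Nginx's `stub_status` module. It only parses text you have already fetched; the function name is illustrative and the sample text is made-up data in the module's documented layout.

```python
import re

# Matches the four-line stub_status page.
STUB_STATUS_RE = re.compile(
    r"Active connections:\s*(?P<active>\d+)\s+"
    r"server accepts handled requests\s+"
    r"(?P<accepts>\d+)\s+(?P<handled>\d+)\s+(?P<requests>\d+)\s+"
    r"Reading:\s*(?P<reading>\d+)\s+Writing:\s*(?P<writing>\d+)\s+Waiting:\s*(?P<waiting>\d+)"
)


def parse_stub_status(text: str) -> dict:
    """Convert stub_status output into a dict of integer counters."""
    match = STUB_STATUS_RE.search(text)
    if match is None:
        raise ValueError("unrecognized stub_status output")
    return {name: int(value) for name, value in match.groupdict().items()}


if __name__ == "__main__":
    sample = (
        "Active connections: 291 \n"
        "server accepts handled requests\n"
        " 16630948 16630948 31070465 \n"
        "Reading: 6 Writing: 179 Waiting: 106 \n"
    )
    print(parse_stub_status(sample))
```

A gap between `accepts` and `handled` suggests connections are being dropped, often because a resource limit such as `worker_connections` was reached.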
OVH Specific Considerations
OVH instances often come with high-performance NVMe SSDs. Ensure your MongoDB `dbPath` is on a fast storage volume. For high-traffic sites, consider dedicated database instances or managed MongoDB services if your application architecture allows. Network latency between your application server and database server, even within the same OVH region, can be a factor; monitor this closely.
Benchmarking and Load Testing
Before and after tuning, perform load tests using tools like `k6`, `JMeter`, or `locust`. Simulate realistic user traffic to validate your optimizations and identify new bottlenecks. Focus on metrics like response time, throughput, error rates, CPU utilization, memory usage, and I/O wait times.
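To compare runs before and after a change, it helps to reduce raw results to the same few numbers each time. The sketch below summarizes a list of response times exported from a load test; the function names and input shape are assumptions and not the output format of `k6`, `JMeter`, or `locust`.

```python
def percentile(sorted_values: list, pct: int):
    """Nearest-rank percentile of a pre-sorted, non-empty list (pct is an integer 1-100)."""
    # Integer ceiling division avoids floating-point rounding surprises.
    rank = max(1, (pct * len(sorted_values) + 99) // 100)
    return sorted_values[rank - 1]


def summarize(latencies_ms: list, errors: int, duration_s: float) -> dict:
    """Summarize successful-request latencies plus an error count for one test run."""
    ordered = sorted(latencies_ms)
    total = len(ordered) + errors
    return {
        "p50_ms": percentile(ordered, 50),
        "p95_ms": percentile(ordered, 95),
        "p99_ms": percentile(ordered, 99),
        "throughput_rps": total / duration_s,
        "error_rate": errors / total,
    }


if __name__ == "__main__":
    # Made-up data: 100 successful requests and 5 errors over 10 seconds.
    print(summarize(list(range(1, 101)), errors=5, duration_s=10))
```

Tail percentiles (p95, p99) tend to reveal saturation earlier than averages, so they are worth tracking across tuning iterations.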