The Ultimate DevOps Playbook: Tuning Nginx, Gunicorn/FPM, and MongoDB on OVH for Python
OVH Infrastructure Baseline: Understanding the Landscape
This playbook assumes a standard OVH Public Cloud setup. We’ll focus on tuning Nginx as the reverse proxy, Gunicorn (for Python WSGI applications) or PHP-FPM (for PHP applications), and MongoDB as the primary database. OVH offers both bare-metal and virtualized instances, each with its own network configuration and resource allocation. Knowing your instance type (e.g., dedicated server, VPS, Public Cloud instance family) and its CPU, RAM, and network bandwidth is the first critical step. Before tuning anything, establish baseline performance metrics for each component under typical load: request latency, throughput, CPU utilization, memory consumption, and disk I/O for Nginx, your application server, and MongoDB.
Nginx Tuning for High Throughput and Low Latency
Nginx acts as the front door. Its configuration directly impacts how efficiently your application servers are utilized and how quickly clients receive responses. We’ll focus on worker processes, connections, and caching.
Worker Processes and Connections
The number of worker processes should generally match the number of CPU cores available to the Nginx instance. This allows for true parallel processing. The worker_connections directive defines the maximum number of simultaneous connections that each worker process can handle. A common starting point is 1024, but this can be increased significantly if your application servers can handle the load and you have sufficient RAM.
Nginx Configuration Snippet
Locate your nginx.conf file (typically /etc/nginx/nginx.conf or within /etc/nginx/conf.d/). Adjust the events block:
user www-data;
worker_processes auto; # Or set to the number of CPU cores
pid /run/nginx.pid;
include /etc/nginx/modules-enabled/*.conf;
events {
worker_connections 4096; # Adjust based on RAM and application server capacity
multi_accept on;
}
http {
sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
types_hash_max_size 2048;
server_tokens off; # Important for security
# ... other http configurations ...
}
Explanation:
- worker_processes auto: Nginx will automatically determine the optimal number of worker processes based on the number of CPU cores.
- worker_connections 4096: Significantly increases the connection limit per worker. Monitor system limits (ulimit -n) and adjust accordingly.
- sendfile on: Allows Nginx to send files directly from the kernel’s page cache, reducing overhead.
- tcp_nopush on: Optimizes packet sending by sending headers and data in a single packet.
- tcp_nodelay on: Disables the Nagle algorithm, reducing latency for small packets.
- keepalive_timeout 65: Keeps persistent connections open for a specified duration.
- server_tokens off: Hides the Nginx version number, a minor security hardening step.
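To make the connection limits concrete, here is a minimal sketch of the capacity arithmetic. It relies on the fact that worker_connections counts every connection a worker holds, including those to upstream servers, so a proxied client uses two. The function names are illustrative, and the soft limit you pass in is whatever ulimit -n reports for the Nginx user.

```python
def max_clients(worker_processes: int, worker_connections: int,
                reverse_proxy: bool = True) -> int:
    """Rough upper bound on simultaneous clients Nginx can serve."""
    total = worker_processes * worker_connections
    # When proxying, each client ties up two connections:
    # client <-> Nginx and Nginx <-> upstream (Gunicorn or PHP-FPM).
    return total // 2 if reverse_proxy else total


def fits_fd_limit(worker_connections: int, soft_limit: int) -> bool:
    """Each connection needs a file descriptor, so the per-process
    open-file limit (ulimit -n) must be at least worker_connections."""
    return soft_limit >= worker_connections


# Example: 4 cores with the values from the snippet above.
print(max_clients(4, 4096))        # clients when acting as a reverse proxy
print(fits_fd_limit(4096, 1024))   # a soft limit of 1024 is too low
```

If the second check fails, raise the limit for the Nginx service before raising worker_connections.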
Gzip Compression and Caching
Compressing responses reduces bandwidth usage and speeds up delivery for clients. Browser caching via appropriate headers is also crucial.
Nginx Configuration Snippet (within http block)
gzip on;
gzip_vary on;
gzip_proxied any;
gzip_comp_level 6;
gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript image/svg+xml;
# Browser caching for static assets
# Note: location blocks are only valid inside a server block, not directly in http
location ~* \.(css|js|jpg|jpeg|png|gif|ico|svg|woff|woff2|ttf|eot)$ {
expires 30d;
add_header Cache-Control "public, no-transform";
}
Explanation:
- gzip on: Enables Gzip compression.
- gzip_vary on: Adds the Vary: Accept-Encoding header, important for proxy caches.
- gzip_proxied any: Compresses responses for proxied requests.
- gzip_comp_level 6: Sets the compression level (1-9, 6 is a good balance).
- gzip_types ...: Specifies MIME types to compress.
- location ~* \.(...)$: Defines caching for static assets.
- expires 30d: Instructs browsers to cache these assets for 30 days.
- add_header Cache-Control "public, no-transform": Further cache control directives.
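To get a feel for the compression-level trade-off before changing gzip_comp_level, you can compress a representative payload at a few levels. The sketch below uses Python’s gzip module, which wraps the same zlib algorithm Nginx uses; the JSON payload is a made-up stand-in for one of your own responses.

```python
import gzip
import json

# Hypothetical JSON payload standing in for a typical API response.
payload = json.dumps(
    [{"id": i, "status": "active", "email": f"user{i}@example.com"}
     for i in range(500)]
).encode("utf-8")

# Compress the same payload at a low, medium, and maximum level.
sizes = {
    level: len(gzip.compress(payload, compresslevel=level))
    for level in (1, 6, 9)
}

for level, size in sizes.items():
    ratio = size / len(payload)
    print(f"level {level}: {size} bytes ({ratio:.1%} of original)")
```

On most text payloads the gain from level 6 to 9 is small compared with the extra CPU time, which is why 6 is a common choice.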
Gunicorn Tuning for Python WSGI Applications
Gunicorn is a popular WSGI HTTP Server for Python. Its performance is heavily influenced by the number of worker processes and the worker type.
Worker Processes and Type
The optimal number of worker processes is typically (2 * number_of_cores) + 1. This formula accounts for CPU-bound tasks and potential I/O waits. For I/O-bound applications, the gevent or eventlet worker types (asynchronous) can significantly improve concurrency.
Gunicorn Command Line / Configuration
You can start Gunicorn with specific settings:
gunicorn --workers 5 --worker-class gevent --bind 0.0.0.0:8000 myapp.wsgi:application
Or, use a Gunicorn configuration file (e.g., gunicorn_config.py):
import multiprocessing

bind = "0.0.0.0:8000"
workers = multiprocessing.cpu_count() * 2 + 1
worker_class = "gevent"  # or "sync", "eventlet", "gthread"
threads = 2  # Only used by the gthread worker; ignored by gevent and eventlet
timeout = 120  # seconds
keepalive = 5  # seconds
Explanation:
- workers: The number of worker processes. The formula (2 * cores) + 1 is a good starting point.
- worker_class: gevent or eventlet are excellent for I/O-bound applications. sync is the default and suitable for CPU-bound tasks.
- threads: Lets each worker process handle several requests concurrently. Setting threads above 1 with the sync worker class makes Gunicorn switch to the gthread worker; the setting has no effect on gevent or eventlet workers.
- timeout: The maximum time a worker can spend on a request before being killed. Adjust based on your application’s typical request duration.
- keepalive: The number of seconds to wait for a new request on a persistent connection.
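The worker formula above ignores memory, which is often the tighter constraint on smaller instances. A minimal sketch that caps the CPU-based count by available RAM follows; the 150 MB per-worker figure is an assumption for illustration, so measure your own application’s resident memory before relying on it.

```python
from typing import Optional


def suggested_workers(cores: int,
                      available_ram_mb: Optional[int] = None,
                      worker_rss_mb: int = 150) -> int:
    """Start from (2 * cores) + 1 and cap it by what RAM allows."""
    by_cpu = 2 * cores + 1
    if available_ram_mb is None:
        return by_cpu
    # Never go below one worker, even on very small instances.
    by_ram = max(1, available_ram_mb // worker_rss_mb)
    return min(by_cpu, by_ram)


print(suggested_workers(2))             # CPU-only estimate for 2 cores
print(suggested_workers(8, 1500, 150))  # 8 cores, but RAM is the limit
```

In gunicorn_config.py you could assign the result to workers instead of the plain formula.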
Logging and Error Handling
Effective logging is crucial for debugging and performance monitoring. Gunicorn can log to stdout/stderr (ideal for containerized environments) or to specific files.
Gunicorn Configuration Snippet (in gunicorn_config.py)
loglevel = "info"  # or "debug", "warning", "error", "critical"
accesslog = "-"  # Log to stdout
errorlog = "-"  # Log to stderr

# Alternatively, log to files:
# accesslog = "/var/log/gunicorn/access.log"
# errorlog = "/var/log/gunicorn/error.log"

# For file logging, ensure the directory exists and has correct permissions:
# sudo mkdir -p /var/log/gunicorn
# sudo chown www-data:www-data /var/log/gunicorn
PHP-FPM Tuning for PHP Applications
PHP-FPM (FastCGI Process Manager) is the standard for running PHP applications with Nginx. Tuning its process management and buffer sizes is key.
Process Management
PHP-FPM offers several process management strategies: static, dynamic, and ondemand. dynamic is often a good balance, allowing FPM to scale processes based on demand while maintaining a minimum pool.
PHP-FPM Configuration Snippet (e.g., /etc/php/8.1/fpm/pool.d/www.conf)
; Choose one of the process management modes
; pm = static
; pm = dynamic
pm = ondemand

; If pm is 'dynamic'
; pm.max_children = 100
; pm.start_servers = 10
; pm.min_spare_servers = 5
; pm.max_spare_servers = 15
; pm.max_requests = 500

; If pm is 'ondemand' (spare-server settings do not apply in this mode)
pm.max_children = 100
pm.process_idle_timeout = 10s
pm.max_requests = 500

; If pm is 'static'
; pm.max_children = 50

; Adjust based on your server's RAM and expected load.
; Start with conservative values and increase as needed.
; Monitor CPU and RAM usage closely.

; For Nginx to communicate with PHP-FPM
listen = /run/php/php8.1-fpm.sock
; listen.owner = www-data
; listen.group = www-data
; listen.mode = 0660

; For TCP/IP communication (less common for single-server setups)
; listen = 127.0.0.1:9000
; listen.allowed_clients = 127.0.0.1
Explanation:
- pm: The process manager. ondemand is efficient for low-traffic sites, while dynamic or static are better for high-traffic sites.
- pm.max_children: The maximum number of child processes that will be spawned. This is the most critical setting for preventing OOM errors.
- pm.start_servers: The number of child processes to start when the FPM master process is started.
- pm.min_spare_servers: The minimum number of idle processes that FPM should maintain.
- pm.max_spare_servers: The maximum number of idle processes that FPM should maintain.
- pm.max_requests: The number of requests each child process will execute before respawning. This helps prevent memory leaks.
- listen: The socket PHP-FPM will listen on. A Unix socket (.sock) is generally faster than TCP/IP.
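Because pm.max_children is what protects the host from running out of memory, it helps to derive it from numbers rather than guess. The sketch below shows the usual back-of-the-envelope calculation; the reserved amount and the average process size are assumptions you would replace with measurements from your own server (for example from ps or htop).

```python
def max_children(total_ram_mb: int, reserved_mb: int,
                 avg_process_mb: int) -> int:
    """Estimate pm.max_children from the RAM left over for PHP-FPM.

    reserved_mb covers the OS, Nginx, MongoDB, and anything else
    sharing the host; avg_process_mb is the measured resident size
    of one PHP-FPM child under load.
    """
    if avg_process_mb <= 0:
        raise ValueError("avg_process_mb must be positive")
    usable = total_ram_mb - reserved_mb
    return max(1, usable // avg_process_mb)


# Example: 8 GB host, 2 GB reserved, roughly 60 MB per PHP-FPM child.
print(max_children(8192, 2048, 60))
```

With these example numbers the result lands close to the 100 used in the pool file above; a host with less free RAM would need a lower value.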
Buffer Settings
Large POST requests or complex scripts might require larger buffer sizes.
PHP-FPM Configuration Snippet (within the same www.conf)
; Adjust these values if you encounter upload-size, memory-limit, or timeout errors

; For large file uploads or complex POST requests
; php_admin_value[upload_max_filesize] = 64M
; php_admin_value[post_max_size] = 64M
; php_admin_value[memory_limit] = 256M ; Ensure this is sufficient for your application

; For large output buffering
; php_admin_value[output_buffering] = 4096 ; 4096 (4K) is the value in the stock production php.ini
; php_admin_value[output_handler] = ob_gzhandler ; If you want to gzip output at PHP level (less common with Nginx gzip)

; For FastCGI communication
; These are Nginx directives: set them in the Nginx site config, not in this pool file
; fastcgi_buffers 8 16k;
; fastcgi_buffer_size 32k;
; fastcgi_read_timeout 300s;
; fastcgi_send_timeout 300s;
Explanation:
- upload_max_filesize and post_max_size: Crucial for handling file uploads and large form submissions.
- memory_limit: The maximum amount of memory a script can consume.
- fastcgi_buffers and fastcgi_buffer_size: These are Nginx directives (configured in your Nginx site config, not PHP-FPM pool) that control the buffers for FastCGI communication. They should be tuned in conjunction with PHP-FPM.
- fastcgi_read_timeout and fastcgi_send_timeout: Increase these if your PHP scripts take a long time to execute.
MongoDB Tuning for Performance and Scalability
MongoDB’s performance is heavily dependent on hardware (especially RAM and disk speed), configuration, and query optimization. OVH instances often provide good I/O, but tuning is still essential.
Storage Engine and WiredTiger
The default and recommended storage engine is WiredTiger. It offers excellent compression and concurrency. Ensure you are using it and that its cache is adequately sized.
MongoDB Configuration Snippet (/etc/mongod.conf)
storage:
  dbPath: /var/lib/mongodb
  journal:
    enabled: true # Omit on MongoDB 6.1+, where journaling is always on and this option was removed
  engine: wiredTiger
  wiredTiger:
    engineConfig:
      cacheSizeGB: 0.75 # Absolute size in GB, not a fraction of RAM; size it for your instance
    collectionConfig:
      blockCompressor: snappy # or zstd for better compression
    indexConfig:
      prefixCompression: true
# net:
#   bindIp: 127.0.0.1 # Or your specific IP/interface
operationProfiling:
  slowOpThresholdMs: 100 # Treat operations slower than 100ms as slow
  mode: "slowOp" # or "all" or "off"
# systemLog:
#   destination: file
#   quiet: false
#   logAppend: true
#   path: /var/log/mongodb/mongod.log
#   verbosity: 0
# sharding:
#   clusterRole: configsvr # or shardsvr if part of a sharded cluster
Explanation:
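- cacheSizeGB: This is the most critical setting. The value is an absolute size in gigabytes, not a fraction of RAM, so 0.75 means 0.75 GB. If you leave it unset, WiredTiger uses the larger of 50% of (RAM minus 1 GB) and 256 MB. MongoDB holds frequently accessed data and indexes in this cache, but it also relies on the filesystem cache, so leave headroom for the OS and other processes. Raising the cache toward 75% of RAM is only worth testing on a dedicated database host.
- blockCompressor: snappy is fast, zstd offers better compression ratios but might use more CPU. Choose based on your workload.
- slowOpThresholdMs: Essential for identifying slow queries. Set this to a reasonable value (e.g., 100ms) and regularly check the mongod log or query the profiler collection with db.system.profile.find().
- operationProfiling.mode: Set to slowOp to record only slow operations in the profiler.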
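Since cacheSizeGB takes an absolute number of gigabytes, it is useful to know what WiredTiger would pick by default on a given instance before overriding it. The sketch below reproduces the documented default rule (the larger of 50% of RAM minus 1 GB, or 256 MB); the function name is illustrative.

```python
def default_wiredtiger_cache_gb(total_ram_gb: float) -> float:
    """Default WiredTiger internal cache size for a host with total_ram_gb of RAM."""
    # Larger of 50% of (RAM - 1 GB) and 0.25 GB (256 MB).
    return max(0.5 * (total_ram_gb - 1), 0.25)


for ram in (1, 4, 16, 64):
    print(f"{ram} GB RAM -> {default_wiredtiger_cache_gb(ram)} GB cache")
```

Comparing this default with the value in mongod.conf shows at a glance whether the configured cache is larger or smaller than what MongoDB would choose on its own.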
Indexing Strategy
Proper indexing is paramount for MongoDB performance. Without indexes, MongoDB must perform collection scans, which are extremely slow for large collections.
Example: Creating an Index
// Connect to your MongoDB instance
// mongosh (or the legacy mongo shell on older releases)
// Use your database
use mydatabase;
// Create an index on the 'users' collection for the 'email' field
db.users.createIndex( { email: 1 } );
// Create a compound index for queries filtering by 'status' and sorting by 'createdAt'
db.orders.createIndex( { status: 1, createdAt: -1 } );
// Check existing indexes on a collection
db.users.getIndexes();
Key Considerations:
- Analyze your application’s query patterns using the slow query logs or MongoDB’s profiling tools.
- Create indexes that match your most frequent and performance-critical queries.
- Avoid over-indexing, as indexes consume disk space and add overhead to write operations.
- Compound indexes are powerful but order matters.
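To illustrate why field order matters in a compound index, here is a deliberately simplified model of the prefix rule: a query can only use the leading fields of an index, in order, until it hits a field it does not filter on. This is a sketch for intuition, not a reimplementation of MongoDB’s query planner, which also weighs sorts, range predicates, and other factors.

```python
from typing import Iterable, List, Sequence


def usable_prefix(index_fields: Sequence[str],
                  query_fields: Iterable[str]) -> List[str]:
    """Return the leading index fields a query on query_fields can use."""
    wanted = set(query_fields)
    prefix = []
    for field in index_fields:
        if field not in wanted:
            break  # the prefix ends at the first field the query skips
        prefix.append(field)
    return prefix


index = ["status", "createdAt"]  # the compound index from the example above

print(usable_prefix(index, ["status"]))               # leading field only
print(usable_prefix(index, ["status", "createdAt"]))  # the full index
print(usable_prefix(index, ["createdAt"]))            # empty: no usable prefix
```

A query that filters only on createdAt gets no help from this index, which is the practical meaning of "order matters". Running explain() on the real query remains the authoritative check.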
Connection Pooling
Your application code should leverage connection pooling to avoid the overhead of establishing a new MongoDB connection for every request. Most MongoDB drivers (e.g., PyMongo, Mongoose) handle this by default.
Example: PyMongo Connection Pooling
from pymongo import MongoClient
# Connection pooling is handled automatically by MongoClient
# The default maxPoolSize is 100
client = MongoClient('mongodb://localhost:27017/', maxPoolSize=100)
# Access your database and collection
db = client.mydatabase
users_collection = db.users
# Perform operations
user = users_collection.find_one({"email": "user@example.com"})  # placeholder address
# No need to explicitly close the connection in a long-running application
# The client object manages the pool.
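Ensure your application’s connection pool size is configured appropriately and doesn’t exceed the number of connections your MongoDB server can handle (which is typically very high). Keep in mind that a pool belongs to one process: each Gunicorn worker process creates its own MongoClient and its own pool, so the total number of connections is the pool size multiplied by the number of workers.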
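A quick way to sanity-check pool settings is to compute the worst-case connection count across all application processes. The sketch below assumes one MongoClient per worker process; the function name and the example figures are illustrative.

```python
def peak_connections(hosts: int, workers_per_host: int,
                     max_pool_size: int,
                     clients_per_worker: int = 1) -> int:
    """Worst-case number of connections the application tier can open."""
    return hosts * workers_per_host * clients_per_worker * max_pool_size


# Example: 2 app servers, 5 Gunicorn workers each, maxPoolSize=100.
total = peak_connections(hosts=2, workers_per_host=5, max_pool_size=100)
print(total)
```

Compare the result with the limit on the MongoDB side (net.maxIncomingConnections in mongod.conf, plus the file descriptor limit of the mongod process) and lower maxPoolSize if the total gets close.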
Putting It All Together: OVH Deployment Workflow
When deploying or updating your application on OVH, follow these steps:
- Provision OVH Instance(s): Select instance types that match your CPU, RAM, and storage requirements. For MongoDB, prioritize instances with fast SSDs.
- Install Dependencies: Install Nginx, your Python runtime (or PHP), Gunicorn/PHP-FPM, and MongoDB.
- Configure Nginx: Set up your reverse proxy configuration, pointing to your Gunicorn/PHP-FPM upstream.
- Configure Application Server (Gunicorn/PHP-FPM): Tune worker processes, types, and timeouts based on your application’s characteristics.
- Configure MongoDB: Adjust mongod.conf for WiredTiger cache, logging, and profiling.
- Deploy Application Code: Use your preferred deployment method (e.g., Ansible, Docker, manual SCP).
- Apply Indexes: Ensure all necessary indexes are created on your MongoDB collections.
- Test and Benchmark: Use tools like ab (ApacheBench), wrk, or JMeter to simulate load and measure performance. Monitor CPU, RAM, network, and disk I/O using tools like htop, iotop, and OVH’s monitoring dashboard.
- Iterate: Based on monitoring and benchmarking results, adjust configuration parameters (e.g., worker counts, cache sizes, timeouts) and re-test.
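For quick before-and-after comparisons during the test-and-iterate steps, a small script can complement ab or wrk. The following is a minimal sketch using only the standard library: it fires concurrent GET requests and reports latency percentiles. The helper names are illustrative, any URL you pass in is your own, and a dedicated tool is still the better choice for serious load testing.

```python
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def timed_get(url: str, timeout: float = 10.0) -> float:
    """Fetch url once and return the elapsed time in seconds.
    HTTP errors (4xx/5xx) raise an exception rather than being timed."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()
    return time.perf_counter() - start


def run_load(url: str, total: int = 200, concurrency: int = 20) -> List[float]:
    """Issue `total` requests using `concurrency` threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(timed_get, [url] * total))


def summarize(latencies: List[float]) -> Dict[str, float]:
    """Mean and p50/p95/p99 latency; needs at least two samples."""
    cuts = statistics.quantiles(latencies, n=100)  # 99 cut points
    return {
        "count": len(latencies),
        "mean": statistics.fmean(latencies),
        "p50": cuts[49],
        "p95": cuts[94],
        "p99": cuts[98],
    }


# Demo on synthetic latencies so the script runs without a server.
print(summarize([0.05, 0.06, 0.07, 0.08, 0.09, 0.12, 0.25]))
```

Against a real deployment you would call summarize(run_load(url)) with your own endpoint, record the numbers, change one setting, and run it again.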
Monitoring and Iterative Tuning
Performance tuning is not a one-time event. Continuous monitoring is essential. Key metrics to track include:
- Nginx: Active connections, requests per second, error rates (5xx, 4xx), upstream response times.
- Gunicorn/PHP-FPM: Worker utilization, request queue length, CPU/memory per worker, error logs.
- MongoDB: Query execution times (slow queries), cache hit ratio, disk I/O, network traffic, CPU/memory usage.
- System-wide: CPU load, memory usage (especially swap usage), disk I/O wait times, network bandwidth.
Utilize OVH’s built-in monitoring tools, Prometheus/Grafana, Datadog, or other APM solutions to gain visibility. Regularly review slow query logs in MongoDB and application error logs. Make incremental changes to configurations, test their impact, and document your findings. This iterative process is the core of effective DevOps performance tuning.
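As one small example of collecting the Nginx metrics listed above, the sketch below parses the plain-text output of Nginx’s stub_status module. It assumes you have enabled stub_status on some internal location (not shown in this playbook); the sample text follows the module’s documented output format, and the function name is illustrative.

```python
import re
from typing import Dict

# Sample output in the format produced by the stub_status module.
SAMPLE = """Active connections: 291
server accepts handled requests
 16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106
"""


def parse_stub_status(text: str) -> Dict[str, int]:
    """Turn stub_status text into a dict of integer counters."""
    active = int(re.search(r"Active connections:\s+(\d+)", text).group(1))
    accepts, handled, requests = map(
        int, re.search(r"\n\s*(\d+)\s+(\d+)\s+(\d+)", text).groups()
    )
    reading, writing, waiting = map(
        int,
        re.search(
            r"Reading:\s*(\d+)\s+Writing:\s*(\d+)\s+Waiting:\s*(\d+)", text
        ).groups(),
    )
    return {
        "active": active,
        "accepts": accepts,
        "handled": handled,
        "requests": requests,
        "reading": reading,
        "writing": writing,
        "waiting": waiting,
        # Accepted but not handled means a resource limit was hit.
        "dropped": accepts - handled,
    }


stats = parse_stub_status(SAMPLE)
print(stats["active"], stats["dropped"])
```

A non-zero dropped value is a hint that worker_connections or the file descriptor limit is too low. Exporters for Prometheus read the same endpoint, so this is mainly useful for ad hoc checks.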