Step-by-Step: Diagnosing the Out-of-Memory (OOM) Killer Terminating PHP-FPM Pool Workers on OVH Servers
Identifying the OOM Killer’s Handiwork
When your PHP-FPM pool workers are being terminated unexpectedly and you suspect the Linux Out-Of-Memory (OOM) Killer, the first step is to confirm its involvement. The kernel log is your primary source of truth. On most Linux distributions, including those commonly installed on OVH servers, you can read it with `dmesg` or `journalctl -k`, or by examining `/var/log/syslog` (Debian/Ubuntu) or `/var/log/messages` (RHEL-family).
A tell-tale sign is a message similar to this, indicating that a process was killed to free up memory:
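[timestamp] [triggering_process] invoked oom-killer: gfp_mask=[GFP_MASK], order=[ORDER], oom_score_adj=[OOM_SCORE_ADJ]
[timestamp] Out of memory: Kill process [PID] ([process_name]) score [score] or sacrifice child
[timestamp] Killed process [PID] ([process_name]) total-vm:[VM_SIZE]kB, anon-rss:[RSS_SIZE]kB, file-rss:[FILE_RSS_SIZE]kB
Pay close attention to the `[process_name]` in the "Killed process" line. If you consistently see `php-fpm` (on some distributions a versioned name such as `php-fpm7.4`) being killed, PHP-FPM workers are the victims and usually the largest memory consumers as well. The exact wording varies between kernel versions, and the process that invoked the OOM Killer is simply the one whose allocation failed, not necessarily the one that gets killed.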
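To make the log check repeatable, the search can be wrapped in a small shell helper. This is a minimal sketch, assuming the "Killed process" line carries an `anon-rss:` field as in the example above and that process names contain no spaces; the function names `oom_kills` and `oom_kill_counts` are hypothetical.

```shell
# oom_kills: read kernel log text on stdin and print one line per OOM kill
# in the form "<pid> <process name> <anon-rss in kB>".
oom_kills() {
    sed -n -E 's/.*Killed process ([0-9]+) \(([^)]*)\).*anon-rss:([0-9]+)kB.*/\1 \2 \3/p'
}

# oom_kill_counts: summarise the same input as "<kills> <process name>",
# most frequently killed process first.
oom_kill_counts() {
    oom_kills \
        | awk '{ count[$2]++ } END { for (name in count) print count[name], name }' \
        | sort -rn
}

# Usage (reading the kernel log usually requires root):
#   dmesg | oom_kills
#   journalctl -k | oom_kill_counts
```

If `php-fpm` dominates the counts, the rest of this guide applies; if another service does, start with that service instead.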
Analyzing PHP-FPM Memory Usage
Once the OOM Killer is confirmed, the next logical step is to understand *why* your PHP-FPM workers are consuming so much memory. This often boils down to inefficient code, memory leaks, or simply insufficient resources allocated to the pool.
PHP-FPM’s configuration plays a crucial role here. The `pm.max_children`, `pm.start_servers`, `pm.min_spare_servers`, `pm.max_spare_servers`, and `pm.max_requests` directives control how many worker processes are spawned and how they are managed. For dynamic process management (`pm = dynamic`), `pm.max_children` is the hard limit on the number of child processes. If traffic needs more workers than this, requests queue and PHP-FPM logs a warning; reaching the limit does not by itself cause an OOM kill. The OOM Killer becomes a risk when the limit is so high that `pm.max_children` multiplied by the memory of a typical worker exceeds the RAM actually available to the pool.
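When a pool runs out of workers, PHP-FPM records a "server reached pm.max_children setting" warning in its own log, which helps separate "too few workers" from "too much memory". The helper below is a rough sketch that counts those warnings per pool; the function name `max_children_hits` is hypothetical, and the log path in the usage comment is an assumption based on Debian/Ubuntu defaults.

```shell
# max_children_hits: read a PHP-FPM log on stdin and print "<pool> <count>"
# for every pool that reported reaching its pm.max_children limit.
max_children_hits() {
    sed -n -E 's/.*\[pool ([^]]+)\] server reached pm\.max_children setting.*/\1/p' \
        | sort | uniq -c | awk '{ print $2, $1 }'
}

# Usage (assumed log location; check the error_log setting in php-fpm.conf):
#   max_children_hits < /var/log/php7.4-fpm.log
```

Frequent warnings with plenty of free RAM suggest the limit is too low; OOM kills without any warnings suggest individual workers are too large.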
Let’s examine a typical PHP-FPM pool configuration file, usually located in `/etc/php/[version]/fpm/pool.d/www.conf` or a similar path:
; Pool name
[www]

; Unix user/group of the worker processes
user = www-data
group = www-data

; The address on which to accept FastCGI requests.
; Valid syntaxes include:
;   '/path/to/unix/socket' - a Unix socket
;   'ip.add.re.ss:port'    - a TCP socket on a specific address
;   'port'                 - a TCP socket on all addresses
listen = /run/php/php7.4-fpm.sock

; Optional per-pool PHP overrides (disabled by default)
;php_admin_value[open_basedir] = /var/www/
;php_admin_value[disable_functions] =

; Process manager mode: static, dynamic, or ondemand
pm = dynamic

; Maximum number of child processes. This is the hard limit.
pm.max_children = 50

; Number of child processes created when the pool starts.
pm.start_servers = 5

; Minimum number of idle processes to keep available.
pm.min_spare_servers = 2

; Maximum number of idle processes to keep available.
pm.max_spare_servers = 10

; Number of requests each child serves before it is respawned.
; This limits how much leaked memory a worker can accumulate.
pm.max_requests = 500

; Optional status, ping, and logging settings
;pm.status_path = /status
;ping.path = /ping
;access.log = /var/log/php/www.access.log
;slowlog = /var/log/php/www.slow.log
;php_admin_value[error_log] = /var/log/php/www.error.log
If `pm.max_children` is set too high for the available RAM, or if individual PHP requests are memory-intensive, the workers together can exhaust the system’s memory. Separately, if `pm.max_requests` is too high (or `0`, which disables recycling), long-lived workers can accumulate leaked memory over many requests.
System-Level Memory Monitoring
Beyond PHP-FPM’s configuration, understanding the overall system memory usage is critical. Tools like `htop`, `top`, `free`, and `vmstat` are invaluable for this.
Running `htop` (or `top`) and sorting by memory usage (`M` key in `top`, or by clicking the `MEM%` column in `htop`) will show you which processes are consuming the most RAM. Look for `php-fpm` worker processes, but also consider other potential memory hogs like your web server (Nginx/Apache), database (MySQL/PostgreSQL), caching layers (Redis/Memcached), or any other services running on the server.
# Example using top, sorted by memory
top -o %MEM
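For a non-interactive view of the same information, resident memory can be totalled per command name. The sketch below assumes the procps `ps` found on most Linux systems and command names without spaces; `rss_by_command` is a hypothetical helper name. Summed RSS counts shared pages (such as OPcache) once per process, so treat the totals as an upper bound.

```shell
# rss_by_command: read "<rss_kB> <command>" lines on stdin and print
# "<total_MiB> <process_count> <command>", largest total first.
rss_by_command() {
    awk '{ rss[$2] += $1; n[$2]++ }
         END { for (c in rss) printf "%d %d %s\n", rss[c] / 1024, n[c], c }' \
        | sort -rn
}

# Usage:
#   ps -e -o rss= -o comm= | rss_by_command | head
```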
The `free -h` command provides a quick overview of system memory:
free -h
               total        used        free      shared  buff/cache   available
Mem:           7.7Gi       4.1Gi       1.2Gi       150Mi       2.4Gi       3.2Gi
Swap:          2.0Gi       100Mi       1.9Gi
Pay attention to the `available` memory. When this figure drops close to zero, the system is under severe memory pressure, and the OOM Killer is likely to be invoked.
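The same `available` figure can be read from `/proc/meminfo` for use in scripts or cron checks. This is a minimal sketch; the function name `mem_available_pct` and the 10 percent threshold in the usage comment are illustrative assumptions, not recommended values.

```shell
# mem_available_pct: read /proc/meminfo-style text on stdin and print
# MemAvailable as a whole-number percentage of MemTotal.
mem_available_pct() {
    awk '$1 == "MemTotal:"     { total = $2 }
         $1 == "MemAvailable:" { avail = $2 }
         END { if (total > 0) printf "%d\n", avail * 100 / total }'
}

# Usage:
#   pct=$(mem_available_pct < /proc/meminfo)
#   [ "$pct" -lt 10 ] && echo "low memory: only ${pct}% available"
```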
`vmstat` can provide a more detailed, time-series view of system activity, including memory, swap, I/O, and CPU:
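# Report every 5 seconds
vmstat 5
Look for sustained non-zero values in the `si` (swap in) and `so` (swap out) columns, which indicate that the system is actively swapping, often a precursor to OOM conditions. The `free` column shows idle memory only and excludes reclaimable cache, so a low value there is not alarming by itself; judge memory pressure from `free` falling together with `cache`, or from the `available` figure reported by `free -h`.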
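Spotting swap traffic in a long `vmstat` run is easier with a filter that prints only the samples where swapping occurred. The sketch below locates the columns from the header row rather than assuming fixed positions; `swap_activity` is a hypothetical name.

```shell
# swap_activity: read vmstat output on stdin and print only the samples
# that show swap-in or swap-out traffic.
swap_activity() {
    awk '
        # Header row: remember the position of every named column.
        $1 == "r" {
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        # Data rows: report samples where si + so is greater than zero.
        ("si" in col) && $1 ~ /^[0-9]+$/ && ($(col["si"]) + $(col["so"]) > 0) {
            printf "si=%d so=%d free=%d\n", $(col["si"]), $(col["so"]), $(col["free"])
        }'
}

# Usage: take 12 samples, 5 seconds apart
#   vmstat 5 12 | swap_activity
```

The first sample `vmstat` prints reports averages since boot, so a single hit on that line is less meaningful than repeated hits afterwards.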
Profiling PHP Code for Memory Leaks
If PHP-FPM workers themselves are the primary memory consumers, and the configuration seems reasonable, the next step is to investigate the PHP code for memory leaks or excessive memory usage per request. Tools like Xdebug and Blackfire.io are indispensable here.
Using Xdebug:
Xdebug can generate a call graph that includes memory usage information. You’ll need to configure Xdebug to enable profiling and set the output directory.
; In your php.ini or a conf.d file (setting names are for Xdebug 3)
xdebug.mode = profile
; This directory must exist and be writable by the PHP-FPM user
xdebug.output_dir = /tmp/xdebug_profiling
; Profile only requests that carry a trigger (GET/POST parameter or cookie)
xdebug.start_with_request = trigger
Then, trigger a request that you suspect is causing high memory usage by appending `?XDEBUG_PROFILE=1` to the URL (Xdebug 3 also accepts the generic `XDEBUG_TRIGGER` parameter). After the request completes, examine the generated cachegrind files in `/tmp/xdebug_profiling`. You can then use tools like KCacheGrind (QCacheGrind on macOS and Windows) or Webgrind (web-based) to visualize the results and identify functions that consume significant memory. Disable the profiler when you are done, since it slows requests considerably and the output files grow quickly.
Using Blackfire.io:
Blackfire.io is a powerful commercial profiling tool that offers more advanced features and a user-friendly interface. After installing the Blackfire agent and PHP extension, you can trigger a profile directly from your browser or via the command line.
# Example using the Blackfire CLI: profile a script, aggregating 100 runs
blackfire run --samples=100 php your_script.php
The Blackfire web UI provides detailed insights into function calls, memory allocations, and execution time, making it easier to pinpoint memory-hungry parts of your application.
Tuning PHP-FPM and System Resources
Based on your analysis, you can tune PHP-FPM and system resources. This is an iterative process.
- Adjust `pm.max_children`: If your application consistently needs more processes than `pm.max_children` allows, and you have sufficient RAM, increase this value. However, be cautious; setting it too high can still lead to OOM conditions if individual requests are memory-intensive. A common starting point is to calculate based on available RAM: `(Total RAM - RAM for OS/other services) / Average RAM per PHP-FPM worker`.
- Tune `pm.max_requests`: If you suspect memory leaks within long-running processes, lowering `pm.max_requests` can help by respawning workers more frequently. A value between 100 and 1000 is typical.
- Optimize PHP Configuration: Review `memory_limit` in `php.ini`. It caps the memory a single request may allocate, so the pool’s worst-case footprint is roughly `pm.max_children` multiplied by `memory_limit`. A limit far above what requests really need lets a few runaway requests push the whole server into memory pressure.
- Increase Server RAM: If your server is consistently maxing out its RAM, the most straightforward solution might be to upgrade your OVH server’s RAM.
- Optimize Application Code: Address identified memory leaks or inefficient memory usage in your PHP application code. This is often the most sustainable solution.
- Use `pm = ondemand`: For applications with highly variable traffic, `pm = ondemand` can be more memory-efficient. It starts workers only when requests arrive and stops those that stay idle longer than `pm.process_idle_timeout`. However, it can add slight latency to the first request after a period of inactivity.
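The `pm.max_children` formula from the first bullet can be sketched as two small shell helpers: one averages the resident memory of running workers, the other applies the division. The function names are hypothetical, the `php-fpm7.4` process name and the 2048 MiB reserve in the usage comment are assumptions to adapt, and the result is a starting point rather than a guaranteed-safe value.

```shell
# avg_rss_mb: read RSS values in kB (one per line) on stdin and print
# their average in MiB, rounded up.
avg_rss_mb() {
    awk '{ sum += $1; n++ }
         END { if (n > 0) printf "%d\n", (sum / n + 1023) / 1024 }'
}

# suggest_max_children TOTAL_MB RESERVED_MB AVG_WORKER_MB
# Prints (total - reserved) / average worker size, rounded down.
suggest_max_children() {
    local total_mb=$1 reserved_mb=$2 worker_mb=$3
    if [ "$worker_mb" -le 0 ] || [ "$reserved_mb" -ge "$total_mb" ]; then
        echo "invalid input" >&2
        return 1
    fi
    echo $(( (total_mb - reserved_mb) / worker_mb ))
}

# Usage (process name and reserve are assumptions):
#   worker_mb=$(ps -C php-fpm7.4 -o rss= | avg_rss_mb)
#   suggest_max_children 8192 2048 "$worker_mb"
```

With 8192 MiB of RAM, 2048 MiB reserved for the OS and other services, and 64 MiB workers, the helper prints 96. Measure worker size under realistic load, because idle workers are much smaller than busy ones.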
After making configuration changes, always restart PHP-FPM (`sudo systemctl restart php7.4-fpm` or similar) and monitor system behavior closely.
Advanced: Systemd-OOMD and Kernel Tuning
Some recent Linux distributions enable `systemd-oomd`, a userspace daemon that acts on memory pressure and aims to intervene earlier and more selectively than the traditional kernel OOM Killer. You can check its status and configuration:
systemctl status systemd-oomd
If `systemd-oomd` is active, it might be interfering with or supplementing the kernel’s OOM behavior. Its configuration is typically found in `/etc/systemd/oomd.conf`. You can adjust its aggressiveness or disable it if you prefer to rely solely on the kernel’s OOM Killer (though this is generally not recommended without a strong understanding of the implications).
For kernel-level tuning, you can adjust the `oom_score_adj` of specific processes. This value, which ranges from -1000 to 1000, influences how likely a process is to be chosen by the OOM Killer: a more negative value makes it less likely to be killed (-1000 exempts it entirely), while a more positive value makes it more likely. You can set it at runtime through `/proc`, or persistently with the `OOMScoreAdjust=` directive in a systemd unit file.
# Example: make a specific PHP-FPM worker less likely to be killed
# Find the PIDs of the pool's workers
pgrep -f "php-fpm: pool www"
# Set oom_score_adj for one of them (run as root; -500 makes it much less likely to be chosen)
echo -500 > /proc/[PID]/oom_score_adj
However, directly manipulating `oom_score_adj` for PHP-FPM workers is often a workaround rather than a solution. It’s better to address the root cause of high memory consumption.
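Before changing any adjustment, it can help to see how the kernel currently ranks the workers. The sketch below prints the `oom_score` and `oom_score_adj` of the given PIDs; the function name `oom_report` and the `PROC_ROOT` override (useful only for testing against a fake directory tree) are hypothetical.

```shell
# oom_report PID...: print "<pid> score=<oom_score> adj=<oom_score_adj>"
# for each PID that still exists. PROC_ROOT defaults to /proc.
oom_report() {
    local proc_root=${PROC_ROOT:-/proc} pid
    for pid in "$@"; do
        if [ -r "$proc_root/$pid/oom_score" ]; then
            printf '%s score=%s adj=%s\n' "$pid" \
                "$(cat "$proc_root/$pid/oom_score")" \
                "$(cat "$proc_root/$pid/oom_score_adj")"
        fi
    done
}

# Usage:
#   oom_report $(pgrep -f "php-fpm: pool www")
```

Workers with the highest score are the first candidates when memory runs out, which usually confirms which pool deserves attention.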