Diagnosing and Resolving PHP-FPM Segfaults and Core Dumps on Ubuntu 24.04 LTS using GDB and Systemd coredumpctl

Understanding PHP-FPM Segfaults and Core Dumps

Encountering segmentation faults (segfaults) in PHP-FPM, especially in production environments, is a critical issue that can lead to application instability and downtime. These faults typically indicate a low-level memory access violation within the PHP interpreter or one of its extensions. On modern Ubuntu systems, particularly Ubuntu 24.04 LTS, the systemd journal and its integrated coredump handling provide powerful tools for diagnosing these problems. This guide focuses on leveraging these tools, specifically GDB (GNU Debugger) and coredumpctl, to pinpoint the root cause of PHP-FPM segfaults.

Configuring Systemd for Core Dumps

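Before we can analyze core dumps, we need to ensure the system generates and captures them. On Ubuntu 24.04, coredumpctl is part of the systemd-coredump package, which is not installed by default (install it with sudo apt install systemd-coredump); out of the box, crashes are handed to Apport instead. We’ll then adjust the systemd service unit for PHP-FPM and check the system-wide kernel settings.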

Enabling Core Dumps for PHP-FPM Service

We’ll create an override for the PHP-FPM systemd service to enable core dumps. This ensures that when PHP-FPM crashes, a core dump file is generated.

First, identify the PHP-FPM service name. On Ubuntu the unit name includes the PHP version (phpX.Y-fpm.service), so for PHP 8.3 it is php8.3-fpm.service. You can confirm it with systemctl list-units 'php*-fpm*'.

Create a systemd override directory:

sudo mkdir -p /etc/systemd/system/php8.3-fpm.service.d/

Create an override file, for example, /etc/systemd/system/php8.3-fpm.service.d/coredump.conf, with the following content:

[Service]
# Remove the core file size limit for this service so dumps are not truncated
LimitCORE=infinity
# Where dumps are stored and how much disk space they may use is not a unit setting.
# systemd-coredump controls that in /etc/systemd/coredump.conf
# (Storage=, MaxUse=, KeepFree=). For simplicity, we'll rely on its defaults.

Reload the systemd daemon and restart PHP-FPM so the new limit applies to the running processes:

sudo systemctl daemon-reload
sudo systemctl restart php8.3-fpm.service
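The override steps above can also be scripted. The following is a minimal sketch, assuming a hypothetical helper named write_coredump_dropin; it writes only the LimitCORE setting and takes the target directory as an argument, so it can be tried in a scratch directory before touching /etc.

```shell
#!/bin/sh
# Hypothetical helper: write a drop-in that lifts the core size limit.
# $1 is the drop-in directory, e.g. /etc/systemd/system/php8.3-fpm.service.d
write_coredump_dropin() {
    dropin_dir="$1"
    mkdir -p "$dropin_dir" || return 1
    printf '[Service]\nLimitCORE=infinity\n' > "$dropin_dir/coredump.conf"
}

# Real use (as root), followed by a check that the limit is in effect:
#   write_coredump_dropin /etc/systemd/system/php8.3-fpm.service.d
#   systemctl daemon-reload && systemctl restart php8.3-fpm.service
#   systemctl show -p LimitCORE php8.3-fpm.service
```

The final systemctl show call prints the limit systemd has recorded for the unit, which is a quick way to confirm the drop-in was picked up.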

Configuring Kernel Parameters for Core Dumps

Ensure the kernel is configured to allow core dumps. We’ll use sysctl for this.

Check current core dump settings:

sysctl kernel.core_pattern
sysctl fs.suid_dumpable

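The kernel.core_pattern setting determines where core dumps go and which program processes them. A stock Ubuntu 24.04 install pipes them to Apport (the value starts with |/usr/share/apport/apport). Installing systemd-coredump switches it to a pipe to systemd-coredump (a value starting with |/usr/lib/systemd/systemd-coredump), which is what coredumpctl needs. If the value is the kernel default, core, dumps are written as plain files in the crashing process’s working directory and coredumpctl will not see them.

The fs.suid_dumpable setting controls whether core dumps are generated for processes that have changed credentials, such as setuid/setgid executables. This matters for PHP-FPM because workers are started as root and then switch to the pool’s user and group. The values are 0 (default, no dumps for such processes), 1 (debug, dump them like any other process) and 2 (suidsafe, dump them readable by root only, and only when core_pattern is a pipe handler or an absolute path). With a pipe handler such as systemd-coredump in place, 2 is the safer choice, though dumping the memory of privileged processes still has security implications.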

The systemd-coredump package sets kernel.core_pattern itself through /usr/lib/sysctl.d/50-coredump.conf, so you normally should not hand-write the pattern (its argument list differs between systemd versions). To set fs.suid_dumpable persistently, create a sysctl configuration file, e.g., /etc/sysctl.d/99-coredump.conf:

# kernel.core_pattern is set by the systemd-coredump package
# (/usr/lib/sysctl.d/50-coredump.conf). If something else overrides it,
# copy the exact kernel.core_pattern line from that file here.

# Allow core dumps for processes that changed credentials (use with caution)
fs.suid_dumpable = 2

Apply the changes immediately:

sudo sysctl -p /etc/sysctl.d/99-coredump.conf

Verify the settings:

sysctl kernel.core_pattern
sysctl fs.suid_dumpable
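To make the verification step less error-prone, the pattern can be classified by the handler it points to. This is a small sketch, assuming a hypothetical function name classify_core_pattern; it only inspects the string and does not change any setting.

```shell
#!/bin/sh
# Hypothetical helper: report which handler a core_pattern value points to.
classify_core_pattern() {
    case "$1" in
        *systemd-coredump*) echo "systemd-coredump" ;;  # visible to coredumpctl
        *apport*)           echo "apport" ;;            # Ubuntu's default handler
        '|'*)               echo "other-pipe" ;;        # some other pipe handler
        *)                  echo "file" ;;              # plain core files on disk
    esac
}

# Usage on a live system:
#   classify_core_pattern "$(cat /proc/sys/kernel/core_pattern)"
```

Anything other than systemd-coredump in the result means crashes will not show up in coredumpctl list.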

Triggering and Locating Core Dumps with coredumpctl

Once configured, the next step is to trigger a PHP-FPM segfault and then use coredumpctl to manage and inspect the resulting core dump.

Simulating a Segfault (for testing purposes)

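In a controlled environment, you may want to trigger a crash deliberately to confirm that the dump pipeline works. From PHP code alone this is hard to do reliably: PHP is memory-safe at the language level, so a script cannot read or write beyond allocated bounds, and runaway userland recursion usually ends in a memory_limit or stack-limit error rather than a segfault. A crash generally needs a bug in C code (the engine, an extension, or a linked library).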

Consider a simple PHP script that might cause issues under specific conditions or with certain extensions loaded:

<?php
// This is a simplified example. Real-world segfaults are often more complex.
// A segfault might occur due to bugs in extensions, ZTS (Zend Thread Safety) issues,
// or memory corruption from external libraries.

// Example: dereferencing a null pointer, as C allows, is not possible from PHP code.
// Using null as an object or array raises a warning or throws an Error instead.
// A more realistic scenario involves C extensions.

// For demonstration, let's try something that stresses recursion.
// This is NOT guaranteed to segfault but illustrates the concept.
function recursive_call($depth) {
    if ($depth < 0) {
        return;
    }
    // Plain userland recursion mostly consumes heap memory, so a very large $depth
    // usually ends in a memory_limit error; a segfault would need a bug in C code.
    recursive_call($depth - 1);
}

// Trigger a deep recursion. Adjust depth based on system limits.
// Older PHP versions could overflow the C stack in some recursive paths and segfault.
// recursive_call(100000); // Uncomment to test, but be aware of system impact.

// A more direct way to trigger a segfault would be through a C extension.
// For example, if a C extension has a bug like:
// char *ptr = NULL;
// *ptr = 'A'; // Dereferencing NULL pointer - this would segfault.

// If you have a custom C extension, you could introduce such a bug.
// Otherwise, rely on observing real-world crashes.

echo "Script finished without crashing.\n";
?>

To trigger a segfault in a live PHP-FPM environment, you would typically need to reproduce the conditions that led to the crash. This might involve specific user requests, heavy load, or interaction with particular external services or databases.
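When the goal is only to confirm that crashes are captured, the signal can be delivered from outside instead of provoking a real bug. The sketch below demonstrates the mechanism on a throwaway sleep process with core dumps disabled; the commented command for a real worker assumes a pool named www, which is an assumption about your setup, and should only be run on a test host.

```shell
#!/bin/sh
# Start a throwaway process with core dumps disabled, then kill it with SIGSEGV.
( ulimit -c 0; exec sleep 60 ) &
victim=$!
kill -s SEGV "$victim"

# wait returns 128 + signal number for a process killed by a signal.
status=0
wait "$victim" || status=$?
echo "exit status: $status"   # 128 + 11 (SIGSEGV) = 139

# Against a real pool (assumed pool name "www"; test hosts only):
#   sudo kill -s SEGV "$(pgrep -f 'php-fpm: pool www' | head -n 1)"
```

A worker killed this way dies with the same signal as a genuine segfault, so it should appear in coredumpctl list if the configuration above is working.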

Using coredumpctl to List and Inspect Core Dumps

After a PHP-FPM process crashes and generates a core dump, coredumpctl becomes your primary tool.

List all available core dumps:

sudo coredumpctl list

This command will show a table with information about each core dump, including the PID, the executable name (e.g., php-fpm8.3), the signal that caused the crash, and the time. Look for entries related to your PHP-FPM workers.

To get more details about a specific core dump, pass a match such as the PID or the executable name or path:

sudo coredumpctl info <PID_or_EXECUTABLE>

Replace <PID_or_EXECUTABLE> with the PID of the crashed PHP-FPM worker or with the executable (e.g., php-fpm8.3 or /usr/sbin/php-fpm8.3). Without a match, coredumpctl info shows every recorded dump; add -1 to limit the output to the most recent one.

The info command provides details like the executable path, command line, user, group, signal, and timestamp. Its Storage field shows where the core dump file is kept (usually under /var/lib/systemd/coredump/) and whether it is still present.
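For scripting, the storage location can be pulled out of the info output. This is a rough sketch, assuming the output contains a line of the form "Storage: <path> (...)"; storage_path is a hypothetical helper name, and the exact layout of the info output may differ between systemd versions.

```shell
#!/bin/sh
# Hypothetical helper: read `coredumpctl info` output on stdin and print the
# value from its first "Storage:" line (a file path, or a word such as "none").
storage_path() {
    sed -n 's/^[[:space:]]*Storage:[[:space:]]*\([^[:space:]]*\).*/\1/p' | head -n 1
}

# Usage on a live system:
#   sudo coredumpctl info -1 /usr/sbin/php-fpm8.3 | storage_path
```

If the helper prints nothing, no dump matched or the output format is not what the sketch assumes.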

Debugging with GDB

The most powerful tool for analyzing core dumps is GDB. We’ll use it to examine the state of the PHP-FPM process at the moment of the crash.

Attaching GDB to a Core Dump

First, you need the path to the core dump file. The Storage field of coredumpctl info provides it. It will typically be something like /var/lib/systemd/coredump/core....zst, because systemd-coredump compresses core dumps (with zstd on current releases).

You also need the executable that generated the core dump. This is usually the PHP-FPM binary itself (e.g., /usr/sbin/php-fpm8.3).

GDB cannot read a compressed core file directly, so decompress it first (the file name below is illustrative; use the exact path from the Storage field):

sudo zstdcat /var/lib/systemd/coredump/core.php-fpm8.3.12345.1678886400.zst > /tmp/core.uncompressed

Then, debug the uncompressed file:

sudo gdb /usr/sbin/php-fpm8.3 /tmp/core.uncompressed

Alternatively, sudo coredumpctl debug <PID> locates the dump, decompresses it, and starts GDB on it in one step.
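Stored dumps are not always zstd-compressed, since the compression format depends on how systemd was built. A small sketch that picks the decompression command from the file extension follows; decompressor_for is a hypothetical helper name, and the listed tools are assumed to be installed.

```shell
#!/bin/sh
# Hypothetical helper: choose a decompression command from the core file's extension.
decompressor_for() {
    case "$1" in
        *.zst) echo "zstd -dc" ;;
        *.xz)  echo "xz -dc" ;;
        *.lz4) echo "lz4 -dc" ;;
        *)     echo "cat" ;;      # already uncompressed
    esac
}

# Usage, with $core set to the path reported by coredumpctl info:
#   $(decompressor_for "$core") "$core" > /tmp/core.uncompressed
# Or let coredumpctl do the work:
#   sudo coredumpctl dump <PID> --output=/tmp/core.uncompressed
```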

Essential GDB Commands for Analysis

Once GDB is loaded with the core dump, you’ll be presented with the GDB prompt ((gdb)). Here are the key commands:

bt (backtrace): This is the most crucial command. It shows the call stack at the time of the crash, indicating the sequence of function calls that led to the fault. Look for functions related to PHP core, extensions, or your application code if it’s deeply integrated.
info frame: Provides detailed information about the currently selected stack frame (frame 0 after loading; change it with frame <N>).
frame <N>: Switch to stack frame number <N>.
p <variable>: Print the value of a variable in the current frame.
info locals: Display all local variables in the current frame.
info args: Display all arguments passed to the function in the current frame.
list: Show the source code around the current execution point. This requires having the debug symbols and source code available.
disassemble: Show the assembly code around the current execution point.
quit: Exit GDB.
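The same commands can be run non-interactively, which is convenient for attaching a backtrace to a bug report. The sketch below writes a GDB command file; write_gdb_script and the file paths are illustrative assumptions, while the commands themselves are the ones listed above plus set pagination off and thread apply all bt.

```shell
#!/bin/sh
# Hypothetical helper: write a GDB command file to the path given in $1.
write_gdb_script() {
    cat > "$1" <<'EOF'
set pagination off
bt
info frame
info locals
info args
thread apply all bt
EOF
}

# Usage (paths are examples):
#   write_gdb_script /tmp/php-fpm-bt.gdb
#   sudo gdb -batch -x /tmp/php-fpm-bt.gdb /usr/sbin/php-fpm8.3 /tmp/core.uncompressed > /tmp/backtrace.txt
```

With -batch, GDB runs the commands from the file and exits, so the output can be redirected and reviewed later.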

When analyzing the backtrace (bt), pay close attention to the function names. If you see functions from specific PHP extensions (e.g., imagick_..., redis_...) or third-party libraries, that’s a strong indicator of where the problem lies. If the stack trace is deep within PHP’s internal functions (e.g., zend_execute_scripts, memory management functions), it might point to a core PHP bug or a complex interaction.

Debugging with Debug Symbols

For effective debugging, it’s essential to have debug symbols installed for PHP-FPM and any relevant extensions. Without them, GDB will show addresses and generic function names, making analysis much harder.

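On Ubuntu, debug symbols ship in separate -dbgsym packages, which are published in the ddebs.ubuntu.com archive rather than the main archive. For PHP-FPM 8.3 the package is php8.3-fpm-dbgsym; extensions and system libraries follow the same pattern, with -dbgsym appended to the name of the package that owns the binary. Alternatively, GDB on Ubuntu can download symbols on demand from Ubuntu’s debuginfod service (https://debuginfod.ubuntu.com).

To install debug symbols for PHP-FPM 8.3, once the ddebs archive is enabled:

sudo apt update
sudo apt install php8.3-fpm-dbgsym

After installing debug symbols, reload GDB with the core dump. The output of bt should now be much more informative, showing source file names and line numbers.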
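On Ubuntu, -dbgsym packages are served from the separate ddebs.ubuntu.com archive, which has to be enabled first. The sketch below generates the APT source lines for a given release codename; ddebs_sources is a hypothetical helper name, and the component list follows the commonly documented setup, so check it against the current Ubuntu documentation before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: print APT source lines for the debug-symbol archive.
# $1 is the release codename, e.g. "noble" for Ubuntu 24.04.
ddebs_sources() {
    codename="$1"
    for suite in "$codename" "$codename-updates"; do
        echo "deb http://ddebs.ubuntu.com $suite main restricted universe multiverse"
    done
}

# Usage (as a user with sudo rights):
#   sudo apt install ubuntu-dbgsym-keyring
#   ddebs_sources "$(lsb_release -cs)" | sudo tee /etc/apt/sources.list.d/ddebs.list
#   sudo apt update && sudo apt install php8.3-fpm-dbgsym
```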

Common Causes and Resolution Strategies

Extension Bugs

Many segfaults originate from bugs within PHP extensions, especially third-party ones. If the backtrace points to functions within a specific extension, consider:

Updating the extension to its latest stable version.
Disabling the extension temporarily to see if the segfaults stop.
Reporting the bug to the extension’s developers with the GDB backtrace and reproduction steps.
If it’s a custom-built extension, debugging its C/C++ source code directly.

Memory Corruption

Memory corruption can be subtle and hard to track. It might be caused by:

Bugs in C extensions that write past allocated buffer boundaries.
Race conditions in multi-threaded environments (though PHP-FPM workers are typically single-threaded per process, the underlying libraries might be multi-threaded).
Issues with the Zend Engine itself (less common, but possible).

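GDB reports the signal that killed the process when it loads the core (e.g., “Program terminated with signal SIGSEGV, Segmentation fault”). If the crash occurs during memory allocation/deallocation (e.g., in malloc, free, or PHP’s memory management functions such as _emalloc and _efree), it’s a strong indicator of earlier heap corruption; the faulting frame is then often the victim rather than the culprit.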

Configuration Issues

While less likely to cause direct segfaults, certain PHP or PHP-FPM configuration settings could indirectly lead to memory exhaustion or instability that triggers a crash. For example:

Extremely high memory_limit combined with inefficient code.
Large post_max_size or upload_max_filesize that can lead to excessive memory usage for large uploads.
Incorrect pool configurations in php-fpm.conf (e.g., pm.max_children too high for available RAM).

Review your PHP and PHP-FPM configuration files (e.g., php.ini, /etc/php/8.3/fpm/php-fpm.conf, and pool configurations in /etc/php/8.3/fpm/pool.d/) for any unusual settings.
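A quick sanity check for pm.max_children is to divide the memory you can spare for PHP by the average worker size. The following is a back-of-the-envelope sketch; max_children_estimate is a hypothetical helper, and the figures in the example call are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: (total RAM - reserved for other services) / average worker RSS.
# All three arguments are in MiB; the result is rounded down.
max_children_estimate() {
    total_mb="$1"
    reserved_mb="$2"
    per_worker_mb="$3"
    echo $(( ($total_mb - $reserved_mb) / $per_worker_mb ))
}

# Example: 8 GiB host, 2 GiB kept for the OS and database, 64 MiB per worker.
max_children_estimate 8192 2048 64   # prints 96

# Average worker RSS in MiB on a live system (ps reports RSS in KiB):
#   ps -o rss= -C php-fpm8.3 | awk '{s+=$1} END {if (NR) print int(s/NR/1024)}'
```

If the configured pm.max_children is well above this estimate, memory pressure under load is a plausible contributor to instability.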

External Libraries and Dependencies

PHP often relies on external C libraries (e.g., libpng, libjpeg, OpenSSL, GD). Bugs in these libraries, or incorrect integration with them by PHP extensions, can also lead to segfaults. If the backtrace shows functions from these libraries, investigate their versions and potential known issues.

Advanced Troubleshooting Techniques

Valgrind

For deeper memory error detection, Valgrind is an invaluable tool. It can detect memory leaks, invalid memory accesses, and uninitialized memory usage. However, running PHP-FPM under Valgrind significantly impacts performance and is usually only feasible in development or staging environments.

To run PHP code under Valgrind:

# PHP-FPM pool configuration has no directive for wrapping workers in another program.
# To check FPM itself, one option is to start the master in the foreground under Valgrind
# with --trace-children=yes and a single static worker. This is a complex setup and
# belongs on a development or staging host.

# A simpler approach for testing specific scripts with the CLI.
# USE_ZEND_ALLOC=0 disables PHP's own allocator so Valgrind sees each allocation.
USE_ZEND_ALLOC=0 valgrind --tool=memcheck --leak-check=full --show-leak-kinds=all php /path/to/your/script.php

For PHP-FPM itself, the usual route is to stop the service and launch the php-fpm8.3 binary manually under valgrind (for example with --nodaemonize). This is highly dependent on your setup and requires careful testing.

AddressSanitizer (ASan)

If you are compiling PHP or its extensions from source, compiling with AddressSanitizer enabled can catch memory errors at runtime with less overhead than Valgrind. This requires recompiling PHP and its extensions with specific compiler flags (e.g., -fsanitize=address).

PHP Core Dumps and `php-fpm -R`

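While coredumpctl and GDB are powerful, PHP-FPM also has its own settings that affect core dumps and logging. The rlimit_core directive (global in php-fpm.conf or per pool) sets the core size limit for workers, and process.dumpable = yes (per pool) keeps workers dumpable after they switch from root to the pool user. For more detailed logs, raise log_level in php-fpm.conf (e.g., to debug). The php-fpm -R flag (--allow-to-run-as-root) only permits pools to run as root; it does not add logging, and it’s generally not recommended for production due to its security implications.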

Conclusion

Diagnosing PHP-FPM segfaults on Ubuntu 24.04 LTS involves a systematic approach. By correctly configuring systemd for core dumps, leveraging coredumpctl to locate and inspect them, and using GDB with debug symbols to analyze the crash state, you can effectively pinpoint the root cause. Remember to consider common culprits like extension bugs, memory corruption, and configuration issues. For persistent or complex problems, advanced tools like Valgrind or AddressSanitizer may be necessary.