How We Audited a High-Traffic Magento 2 Enterprise Stack on Linode and Mitigated XML External Entity (XXE) Injection in Old SOAP Integrations
Initial Stack Assessment and Threat Landscape
Our engagement began with a deep dive into a high-traffic Magento 2 Enterprise Edition (now Adobe Commerce) stack hosted on Linode. The primary concern was a recent security audit report flagging potential XML External Entity (XXE) injection vulnerabilities, particularly within legacy SOAP integrations. This stack served a global e-commerce platform with millions of daily requests, making any vulnerability a critical risk. The infrastructure comprised several Linode instances: dedicated web servers (Nginx), application servers (PHP-FPM), a managed MySQL database cluster, and a Redis cache layer. The core issue stemmed from older, third-party SOAP integrations that were still in use for critical business processes, such as inventory management and order fulfillment, and had not been updated to leverage more secure, modern APIs.
The threat model for XXE injection in this context is straightforward: an attacker could craft malicious XML payloads sent to the SOAP endpoints. If the server-side XML parser is not configured securely, it may attempt to resolve external entities referenced in the XML. This can lead to:
- Disclosure of sensitive local files (e.g., configuration files, SSH keys).
- Server-Side Request Forgery (SSRF) by forcing the server to make requests to internal or external resources.
- Denial of Service (DoS) through recursive entity expansion (Billion Laughs attack).
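To make the last item concrete, the sketch below builds a Billion Laughs style document and computes, purely arithmetically, how large the top-level entity would become if a parser expanded it. Nothing is parsed here. The function names and the entity names (`e0`, `e1`, ...) are illustrative assumptions, and the integer arithmetic assumes a 64-bit PHP build.

```php
<?php
declare(strict_types=1);

// Build a nested-entity document: each entity expands to $fanout copies of the previous one.
// Hypothetical helper for illustration only; do not send this to systems you do not own.
function buildNestedEntityPayload(int $levels, int $fanout, string $seed = 'lol'): string
{
    $lines = [
        '<?xml version="1.0"?>',
        '<!DOCTYPE lolz [',
        sprintf(' <!ENTITY e0 "%s">', $seed),
    ];
    for ($i = 1; $i <= $levels; $i++) {
        $references = str_repeat('&e' . ($i - 1) . ';', $fanout);
        $lines[] = sprintf(' <!ENTITY e%d "%s">', $i, $references);
    }
    $lines[] = ']>';
    $lines[] = sprintf('<lolz>&e%d;</lolz>', $levels);

    return implode("\n", $lines);
}

// Size of the fully expanded top-level entity: seed length times fanout^levels.
function expandedSizeInBytes(int $levels, int $fanout, string $seed = 'lol'): int
{
    return strlen($seed) * ($fanout ** $levels);
}

$payload = buildNestedEntityPayload(9, 10);
printf("payload on the wire: %d bytes\n", strlen($payload));
printf("expanded in memory:  %d bytes\n", expandedSizeInBytes(9, 10));
```

With nine levels and a fanout of ten, a request well under one kilobyte describes 3 × 10^9 bytes of expanded text, which is why this class of attack has to be stopped in the parser rather than by request-size limits alone.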
Identifying Vulnerable SOAP Endpoints and XML Parsers
The first actionable step was to pinpoint the exact SOAP endpoints exposed and the underlying XML parsing mechanisms. Magento 2, by default, uses PHP’s built-in SimpleXML or DOMDocument for XML processing. The vulnerability often lies in how these parsers are configured, or more commonly, in how the raw XML input is handled before parsing.
We initiated this by examining the codebase for SOAP client and server implementations. A targeted grep across the Magento installation and its extensions was crucial:
grep -rniE "Soap(Client|Server)" /var/www/html/
grep -rn "DOMDocument" /var/www/html/
grep -rniE "SimpleXMLElement|simplexml_load_" /var/www/html/
This revealed several legacy SOAP clients and servers, primarily within custom modules and older third-party integrations. The critical observation was that many of these integrations were directly passing user-supplied or externally sourced XML strings to the parsers without sanitization or disabling external entity resolution.
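On a large codebase, the raw grep output can be narrowed to the lines that matter most. The following is a minimal sketch, not the tooling used in the audit: a hypothetical `findXmlParsingCallSites()` helper that walks a directory and reports lines that either invoke a libxml-backed parser or pass one of the flags that turn on entity or XInclude processing. The patterns are assumptions and will miss indirect calls.

```php
<?php
declare(strict_types=1);

/**
 * Report PHP source lines that parse XML or pass risky libxml flags.
 *
 * @return array<int, array{file: string, line: int, kind: string, code: string}>
 */
function findXmlParsingCallSites(string $rootDir): array
{
    // Illustrative patterns only; extend them for the wrappers used in a given codebase.
    $patterns = [
        'parser call' => '/->loadXML\s*\(|simplexml_load_(?:string|file)\s*\(|SimpleXMLElement\s*\(/i',
        'risky flag'  => '/\bLIBXML_(?:NOENT|DTDLOAD|XINCLUDE)\b/',
    ];

    $hits = [];
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($rootDir, FilesystemIterator::SKIP_DOTS)
    );

    foreach ($files as $file) {
        if ($file->getExtension() !== 'php') {
            continue; // Only PHP sources are of interest here.
        }
        $lines = file($file->getPathname(), FILE_IGNORE_NEW_LINES) ?: [];
        foreach ($lines as $index => $line) {
            foreach ($patterns as $kind => $regex) {
                if (preg_match($regex, $line) === 1) {
                    $hits[] = [
                        'file' => $file->getPathname(),
                        'line' => $index + 1,
                        'kind' => $kind,
                        'code' => trim($line),
                    ];
                }
            }
        }
    }

    return $hits;
}
```

Calling `findXmlParsingCallSites('/var/www/html/app/code')` (the path is an example) returns one entry per match, so a line that both parses XML and passes a risky flag shows up twice and is easy to prioritize.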
For example, a common pattern for vulnerable XML parsing in PHP looks like this:
<?php
// Potentially vulnerable code
$xml_string = $_POST['xml_data']; // Data from an external source

$dom = new DOMDocument();
// Vulnerable on builds against libxml2 < 2.9, where external entities are resolved
// by default, and on any version once LIBXML_NOENT or LIBXML_DTDLOAD is passed.
$dom->loadXML($xml_string);

// Vulnerable under the same conditions.
$data = simplexml_load_string($xml_string);
?>
Mitigation Strategy: Disabling External Entity Resolution
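The most effective and direct mitigation for XXE is to make sure the XML parser never resolves external entities. For PHP's `DOMDocument` and `SimpleXMLElement`, both of which sit on top of `libxml2`, this comes down to which option flags are passed at parse time and, on PHP versions before 8.0, to calling `libxml_disable_entity_loader(true)`.
The key options to get right are:
- `LIBXML_NOENT`: despite its name, this flag enables entity substitution, including external entities. It must not be passed when parsing untrusted XML.
- `LIBXML_XINCLUDE`: enables XInclude substitution. It should likewise be left out.
- `LIBXML_NONET`: disables network access while loading documents. This is the one flag worth adding.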
We implemented these by modifying the relevant parsing functions within the identified vulnerable modules. It’s crucial to do this at the point of parsing, not just on the input string itself.
Here’s the secure way to parse XML using `DOMDocument`:
<?php
// Secure parsing
$xml_string = $_POST['xml_data']; // Data from an external source

// PHP < 8.0 only: block the external entity loader for the duration of the parse.
// The function is deprecated as of PHP 8.0, which requires libxml2 >= 2.9, where
// external entities are not loaded unless LIBXML_NOENT is passed.
$previous = PHP_VERSION_ID < 80000 ? libxml_disable_entity_loader(true) : null;

$dom = new DOMDocument();
$dom->resolveExternals = false;   // Default value, set explicitly for clarity.
$dom->substituteEntities = false; // Default value, set explicitly for clarity.

// Do NOT pass LIBXML_NOENT (enables entity substitution) or LIBXML_XINCLUDE
// (enables XInclude processing). LIBXML_NONET disables network access.
$dom->loadXML($xml_string, LIBXML_NONET);

if ($previous !== null) {
    libxml_disable_entity_loader($previous); // Restore the previous loader state.
}

// If SimpleXML is needed downstream, convert the safely parsed document:
// $xml = simplexml_import_dom($dom);
?>
`SimpleXMLElement` and `simplexml_load_string()` accept the same libxml option flags as `DOMDocument::loadXML()`, so the same rule applies: never pass `LIBXML_NOENT` for untrusted input. Where the rest of the logic expects a `SimpleXMLElement`, the recommended approach is to parse with `DOMDocument` using the safe options and then convert the result. This keeps all parsing of untrusted XML behind a single hardened code path.
<?php
// Secure parsing with SimpleXML via DOMDocument
$xml_string = $_POST['xml_data'];
$dom = new DOMDocument();
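// Parse without LIBXML_NOENT, which would enable entity substitution;
// LIBXML_NONET additionally blocks network access during loading.
if ($dom->loadXML($xml_string, LIBXML_NONET) === false) {
    // Handle XML parsing errors
    throw new Exception("Failed to parse XML securely.");
}
// Convert the safely parsed DOMDocument to SimpleXMLElement
$xml = simplexml_import_dom($dom);
if (!$xml instanceof SimpleXMLElement) {
    // Handle conversion errors
    throw new Exception("Failed to convert the parsed document to SimpleXML.");
}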
// Proceed with processing $xml object
?>
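The pattern above can be wrapped in one function so that every legacy integration goes through the same code path. The sketch below is one possible shape, not the exact helper deployed on this stack: `parseUntrustedXml()` is a hypothetical name, and the outright rejection of any DOCTYPE is an added assumption, justified by the SOAP 1.1 rule that a message must not contain a document type declaration.

```php
<?php
declare(strict_types=1);

/**
 * Parse untrusted XML without resolving entities and return it as SimpleXML.
 * Hypothetical helper; assumes PHP 8.0+ (libxml2 >= 2.9).
 */
function parseUntrustedXml(string $xml): SimpleXMLElement
{
    if ($xml === '') {
        throw new InvalidArgumentException('Empty XML payload.');
    }

    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    try {
        $dom = new DOMDocument();

        // Only LIBXML_NONET is passed; LIBXML_NOENT and LIBXML_DTDLOAD are deliberately absent.
        if ($dom->loadXML($xml, LIBXML_NONET) === false) {
            throw new RuntimeException('Malformed XML payload.');
        }

        // Assumption: legitimate SOAP messages never carry a DTD, so reject any DOCTYPE.
        if ($dom->doctype !== null) {
            throw new RuntimeException('DOCTYPE declarations are not allowed.');
        }

        $simple = simplexml_import_dom($dom);
        if (!$simple instanceof SimpleXMLElement) {
            throw new RuntimeException('Could not convert document to SimpleXML.');
        }

        return $simple;
    } finally {
        // Always restore the caller's libxml error handling.
        libxml_clear_errors();
        libxml_use_internal_errors($previous);
    }
}
```

A call such as `parseUntrustedXml($rawSoapBody)` then either returns a document that contains no DTD at all or throws, which removes entity expansion from the picture rather than relying on it being handled safely.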
Configuration-Level Hardening (Nginx and PHP-FPM)
While code-level fixes are paramount, infrastructure-level hardening provides an additional layer of defense. For the Nginx and PHP-FPM setup on Linode, we reviewed and adjusted configurations to minimize attack vectors.
Nginx Configuration:
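We ensured that Nginx was configured to reject requests with excessively large bodies via `client_max_body_size`, which caps the size of any XML payload reaching PHP. This limits oversized uploads, but it does not stop entity-expansion attacks such as Billion Laughs on its own, because those payloads are small on the wire and only grow inside the parser. The limit therefore complements the parser-level fixes rather than replacing them. `large_client_header_buffers` governs request headers only and was reviewed for completeness.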
http {
    # ... other http configurations ...
    client_max_body_size 1m; # Adjust size based on legitimate needs, keep it minimal.
    large_client_header_buffers 2 8k; # Default is usually fine, but can be tuned.
    server {
        # ... server configurations ...
        location ~ \.php$ {
            # ... php-fpm configuration ...
            fastcgi_buffers 8 16k;
            fastcgi_buffer_size 32k;
            # ... other fastcgi params ...
        }
    }
}
PHP-FPM Configuration:
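PHP-level settings are a useful backstop, but external entity loading cannot be switched off from `php.ini`: PHP defines no `libxml.load_external_entities` or `libxml.disable_entity_loader` directive. The global control is a runtime call instead. On PHP versions before 8.0, `libxml_disable_entity_loader(true)` blocks external entity loading. From PHP 8.0 onward the function is deprecated because libxml2 2.9.0 or later is required, and that version leaves external entities unloaded unless `LIBXML_NOENT` is passed. What `php.ini` can contribute is resource limits that bound the cost of an entity-expansion payload if a code-level fix is missed or incomplete.
[PHP]
; ... other php settings ...
; There is no libxml.* directive for entity loading; that is controlled in code.
; Memory limit for potentially large XML processing; raise it only with caution.
; memory_limit = 256M
; Set a reasonable execution time limit
; max_execution_time = 60
; ... other php settings ...
After applying any `php.ini` changes, a restart of the PHP-FPM service was necessary: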
sudo systemctl restart php8.1-fpm # Adjust version as per your installation
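For a process-wide backstop that does not depend on every call site being fixed, PHP's `libxml_set_external_entity_loader()` can install a resolver that refuses every external entity. The sketch below assumes it lives in a small script loaded on each request, for example through the `auto_prepend_file` directive. The function name and the wiring are assumptions rather than a record of what was deployed here, and a deny-all loader should be tested against Magento's own XML loading before rollout.

```php
<?php
declare(strict_types=1);

/**
 * Install an external entity loader that resolves nothing.
 * Hypothetical prepend script; the resolver receives the public ID,
 * the system ID (URI) and a context array from libxml.
 */
function installDenyAllEntityLoader(): void
{
    libxml_set_external_entity_loader(
        static function ($publicId, $systemId, $context) {
            // Record the attempt so blocked lookups are visible in the PHP error log.
            error_log('Blocked external entity lookup: ' . (string) $systemId);

            // Returning null tells libxml that the entity could not be loaded.
            return null;
        }
    );
}

installDenyAllEntityLoader();
```

With this in place, even a call site that still passes `LIBXML_NOENT` cannot pull in `file://` or `http://` resources, because every lookup ends in the resolver above.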
Testing and Verification
Post-mitigation, rigorous testing was essential. We employed a multi-pronged approach:
- Automated Scanning: Re-ran the initial security scanner against the SOAP endpoints.
- Manual Penetration Testing: Used tools like Burp Suite to craft malicious XML payloads designed to exploit XXE. This included payloads attempting to read local files (`file:///etc/passwd`), perform SSRF (`http://127.0.0.1/`), and trigger DoS conditions.
- Code Review: A final pass of the modified code to ensure the fixes were correctly implemented and no new vulnerabilities were introduced.
A typical XXE payload attempting to read a local file would look something like this:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ELEMENT foo ANY >
  <!ENTITY xxe SYSTEM "file:///etc/passwd" >
]>
<root>
  <data>&xxe;</data>
</root>
If the system was still vulnerable, the content of `/etc/passwd` would be returned within the SOAP response. With `LIBXML_NOENT` absent from every parse call, `LIBXML_NONET` in place, and external entity loading blocked at runtime, such payloads should either fail to parse or leave `&xxe;` unresolved, so the response carries no file content.
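When the same probe is replayed against many endpoints, it helps to classify responses automatically instead of reading each one. The following is a rough sketch under stated assumptions: `classifyXxeProbeResponse()` is a hypothetical helper, the `/etc/passwd` signature is a heuristic regular expression, and the three labels are illustrative.

```php
<?php
declare(strict_types=1);

/**
 * Classify the body returned for a file-read XXE probe.
 * Returns 'leaked', 'rejected' or 'not-expanded' (labels are illustrative).
 */
function classifyXxeProbeResponse(string $responseBody): string
{
    // Heuristic: passwd entries look like name:password:uid:gid:comment:/home:/shell.
    $passwdEntry = '~\b[a-z_][a-z0-9_-]*:[^:\s]*:\d+:\d+:[^:\n]*:/[^:\s]*:/[^:\s<]*~i';
    if (preg_match($passwdEntry, $responseBody) === 1) {
        return 'leaked';
    }

    // A SOAP Fault element suggests the payload was refused or failed to parse.
    if (preg_match('~<(?:[\w-]+:)?Fault[\s>]~', $responseBody) === 1) {
        return 'rejected';
    }

    // No file content and no fault: the entity was left unresolved.
    return 'not-expanded';
}
```

Anything classified as `leaked` is a failed mitigation and should block the release. The other two outcomes are the ones the hardened parser is expected to produce.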
Conclusion and Ongoing Vigilance
Mitigating XXE injection in legacy SOAP integrations on a high-traffic Magento 2 stack requires a combination of precise code fixes, configuration hardening, and thorough testing. By keeping external entity resolution disabled at the parser level (safe `DOMDocument` options and runtime entity-loader controls) and reinforcing this with Nginx request limits, we closed the identified XXE vulnerabilities. This case study highlights the persistent risk posed by older integration patterns and the importance of proactive security audits and timely patching. For CTOs and VPs of Engineering, it underscores the need for a comprehensive inventory of all integrations, regular security assessments, and a strategy for deprecating or modernizing legacy systems that may harbor such critical vulnerabilities.