Step-by-Step: Diagnosing XML External Entity (XXE) injection in old SOAP integrations on AWS Servers
Identifying Potential XXE Vulnerabilities in SOAP Integrations
XML External Entity (XXE) injection remains a persistent threat, particularly in legacy systems that rely on SOAP integrations. These vulnerabilities arise when an XML parser processes untrusted XML input and is configured to allow external entity references. In a cloud environment like AWS, diagnosing these issues requires a systematic approach, often involving deep dives into application logs, network traffic, and server configurations. This guide focuses on diagnosing XXE in SOAP integrations hosted on AWS EC2 instances, assuming a common LAMP or LEMP stack with PHP handling the SOAP requests.
The core of an XXE attack involves crafting malicious XML that exploits the parser’s ability to fetch external resources. For example, an attacker might try to read sensitive files on the server (e.g., `/etc/passwd`) or perform Server-Side Request Forgery (SSRF) by making the server request an internal or external URL.
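To make the two attack shapes concrete, the sketch below holds a typical payload in a PHP nowdoc and sorts each external entity into "file read" or "SSRF" by its URI scheme. The operation name `getQuote`, the internal address, and the helper `classifyExternalEntities()` are hypothetical names chosen for illustration; the regular expression is a rough triage aid, not a substitute for a real XML parser.

```php
<?php
// Hypothetical SOAP request carrying an XXE payload (illustration only).
$payload = <<<'XML'
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
  <!ENTITY probe SYSTEM "http://10.0.0.5/internal-status">
]>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body><getQuote><symbol>&xxe;</symbol></getQuote></soap:Body>
</soap:Envelope>
XML;

/**
 * Return entity name => ['uri' => ..., 'kind' => ...] for every external
 * (SYSTEM or PUBLIC) entity declared in the raw XML text.
 */
function classifyExternalEntities(string $xml): array
{
    $pattern = '/<!ENTITY\s+(?:%\s+)?([\w.-]+)\s+'
             . '(?:SYSTEM|PUBLIC\s+(?:"[^"]*"|\'[^\']*\'))\s+'
             . '["\']([^"\']+)["\']/i';
    preg_match_all($pattern, $xml, $matches, PREG_SET_ORDER);

    $found = [];
    foreach ($matches as $m) {
        $uri = $m[2];
        // Extract the URI scheme, if any (empty string for plain paths).
        $scheme = '';
        if (preg_match('/^([a-z][a-z0-9+.-]*):/i', $uri, $s) === 1) {
            $scheme = strtolower($s[1]);
        }
        if (in_array($scheme, ['http', 'https', 'ftp'], true)) {
            $kind = 'ssrf';       // the server is asked to make a network request
        } elseif ($scheme === 'file' || $scheme === '') {
            $kind = 'file-read';  // local path, with or without file://
        } else {
            $kind = 'other';      // e.g. php:// or expect:// wrappers
        }
        $found[$m[1]] = ['uri' => $uri, 'kind' => $kind];
    }
    return $found;
}

print_r(classifyExternalEntities($payload));
```

In this payload only `&xxe;` is referenced in the body, but the `probe` declaration shows how the same mechanism can be pointed at an internal URL instead of a file.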
Phase 1: Log Analysis and Anomaly Detection
The first line of defense in diagnosing any server-side issue is thorough log analysis. For XXE, we’re looking for unusual patterns in application logs, web server access logs, and potentially system logs.
Application Logs (PHP/SOAP Client)
If your PHP application is logging SOAP request/response details or errors, this is your primary source. Look for exceptions related to XML parsing, file operations, or network requests originating from the SOAP processing logic. Many PHP SOAP clients, especially older ones or those with custom configurations, might be susceptible if not properly secured.
Consider enabling verbose logging in your SOAP client if possible. For example, if using PHP’s `SoapClient`, you might wrap its calls in a try-catch block and log detailed error messages. A common indicator of XXE is an attempt to access local files or external URLs that don’t align with normal business logic.
Example of logging SOAP errors in PHP:
<?php
try {
    // The constructor can also throw SoapFault (e.g. an unreachable WSDL), so keep it inside the try.
    $client = new SoapClient("http://example.com/service.wsdl");
    $result = $client->someOperation(array("param" => "value"));
    // Process result
} catch (SoapFault $e) {
    // Log the exception details for analysis; the detail property is not always set.
    $detail = isset($e->detail) ? $e->detail : null;
    error_log("SOAP Fault: " . $e->getMessage() . " | Code: " . $e->getCode() . " | Detail: " . print_r($detail, true));
    // Heuristic checks: these messages often accompany XXE attempts but are not proof of one.
    $indicators = array("failed to load external entity", "Could not resolve host", "Permission denied");
    foreach ($indicators as $indicator) {
        if (strpos($e->getMessage(), $indicator) !== false) {
            error_log("POTENTIAL XXE DETECTED: " . $e->getMessage());
            break; // Further investigation needed
        }
    }
}
?>
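Fault messages only show what `SoapClient` chooses to report. Where the application parses XML itself, libxml's own error buffer can be logged as well. The following is a minimal sketch of that idea; the function name `collectXmlParseErrors()` and the message format are assumptions made for this example.

```php
<?php
/**
 * Parse XML while collecting libxml diagnostics instead of emitting PHP
 * warnings. Returns the messages so the caller can log them.
 */
function collectXmlParseErrors(string $xml): array
{
    $previous = libxml_use_internal_errors(true); // buffer errors internally
    libxml_clear_errors();

    $doc = new DOMDocument();
    $doc->loadXML($xml, LIBXML_NONET); // forbid network access while parsing

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        // Each $error is a LibXMLError with level, code, line and message.
        $messages[] = sprintf(
            'line %d, code %d: %s',
            $error->line,
            $error->code,
            trim($error->message)
        );
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the caller's setting
    return $messages;
}

// Usage: a deliberately malformed document produces at least one entry.
foreach (collectXmlParseErrors('<a><b></a>') as $message) {
    error_log('XML parse issue: ' . $message);
}
```

Logging the line and code alongside the message makes it easier to correlate parser complaints with the request bodies captured elsewhere in this guide.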
Web Server Access Logs (Nginx/Apache)
Your web server logs (e.g., Nginx or Apache) can reveal incoming requests that might be part of an XXE attack. Look for requests to your SOAP endpoint that contain unusual XML payloads. While the raw XML might not be logged by default, the URL and request method are. If you suspect XXE, you might need to temporarily enable more detailed logging.
For Nginx, you can modify the `log_format` directive in your `nginx.conf` to include request body snippets (use with extreme caution in production due to performance and log size implications).
# In nginx.conf or conf.d/*.conf
http {
# ... other http settings ...
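# Example: log the request body alongside the standard fields.
# WARNING: This can significantly increase log size and I/O.
# $request_body is only populated in locations handled by proxy_pass,
# fastcgi_pass, uwsgi_pass, or scgi_pass, and only when the body was read
# into a memory buffer (see client_body_buffer_size).
log_format main_with_body '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for" '
'body_length:$content_length body:$request_body';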
server {
listen 80;
server_name your_domain.com;
access_log /var/log/nginx/access.log main_with_body; # Use the new format
# ... other server settings ...
}
}
For Apache, you can use the `mod_dumpio` module to log request bodies. It writes the dumped data to the error log, so on Apache 2.4 the module's log level must be raised to `trace7`.
# In httpd.conf or apache2.conf
LoadModule dumpio_module modules/mod_dumpio.so
# mod_dumpio directives belong in the main server config, not inside <VirtualHost>
LogLevel dumpio:trace7
DumpIOInput On
DumpIOOutput On
<VirtualHost *:80>
# ... other directives ...
CustomLog /var/log/apache2/access.log combined
</VirtualHost>
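When examining these logs, look for XML payloads that contain DTD declarations with `<!DOCTYPE` or `<!ENTITY`, particularly entities declared with a `SYSTEM` identifier that points to a `file://` path or an unexpected URL. The SOAP 1.1 specification forbids document type declarations in SOAP messages, so any `<!DOCTYPE` arriving at a SOAP endpoint deserves a closer look.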
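A small script can do a first pass over captured logs. The sketch below flags lines that contain DTD markers or URI schemes commonly seen in XXE payloads; the function names, the pattern list, and the sample log line are all made up for illustration, and the patterns are heuristics that will produce some false positives.

```php
<?php
/**
 * Return the names of the indicators found in one log line
 * (an empty array when nothing matches).
 */
function xxeIndicators(string $logLine): array
{
    $patterns = [
        'doctype'    => '/<!DOCTYPE/i',
        'entity'     => '/<!ENTITY/i',
        'uri-scheme' => '#(?:file|php|expect|gopher)://#i',
    ];
    $hits = [];
    foreach ($patterns as $name => $regex) {
        if (preg_match($regex, $logLine) === 1) {
            $hits[] = $name;
        }
    }
    return $hits;
}

/** Scan a log file line by line; returns line number => indicator names. */
function scanLogFile(string $path): array
{
    $results = [];
    $file = new SplFileObject($path, 'r'); // streams the file, no full load
    foreach ($file as $index => $line) {
        $hits = xxeIndicators((string) $line);
        if ($hits !== []) {
            $results[$index + 1] = $hits; // report 1-based line numbers
        }
    }
    return $results;
}

// Made-up access log line in the shape of the main_with_body format.
// Nginx writes double quotes inside variables as \x22 by default.
$sample = '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "POST /soap HTTP/1.1" 200 512 '
        . 'body:<!DOCTYPE foo [<!ENTITY xxe SYSTEM \x22file:///etc/passwd\x22>]><soap:Envelope/>';
print_r(xxeIndicators($sample));
```

Calling `scanLogFile('/var/log/nginx/access.log')` would apply the same check to every line of that file, assuming the process has permission to read it.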
Phase 2: Network Traffic Analysis
If logs are insufficient or you need to capture live traffic, network analysis tools are invaluable. On an AWS EC2 instance, you can use tools like `tcpdump` or Wireshark.
Using tcpdump
`tcpdump` is a powerful command-line packet analyzer. To capture traffic directed to your SOAP endpoint (assuming it’s on port 80 or 443), you can use commands like these:
# Capture HTTP traffic on port 80 to a file
sudo tcpdump -i eth0 -s 0 -w /tmp/soap_traffic.pcap 'port 80'
# Capture HTTPS traffic (requires decryption if possible, or focus on metadata)
# For HTTPS, you'll typically see encrypted data. Decryption requires private keys,
# which is complex and often not feasible in production. Focus on patterns if possible.
# If the SOAP endpoint is proxied (e.g., via ALB), you might capture traffic there.
sudo tcpdump -i eth0 -s 0 -w /tmp/soap_traffic_https.pcap 'port 443'
# Filter by specific IP if you know the source of suspicious requests
# sudo tcpdump -i eth0 -s 0 -w /tmp/soap_traffic_filtered.pcap 'src host 1.2.3.4 and port 80'
After capturing traffic, you can analyze the `.pcap` file using Wireshark or by piping `tcpdump` output to other tools. For HTTP traffic, you can often see the XML payload directly if it’s not compressed or encrypted.
To extract HTTP requests from a `tcpdump` capture, you can use `tshark` (the command-line version of Wireshark):
# Extract HTTP requests from the pcap file
tshark -r /tmp/soap_traffic.pcap -Y "http.request" -T fields -e http.request.method -e http.request.uri -e http.file_data
Leveraging AWS VPC Flow Logs
AWS VPC Flow Logs can provide metadata about network traffic, including source/destination IPs, ports, and bytes transferred. While they don’t capture payload content, they can help identify suspicious traffic patterns, such as connections to unusual external IPs or ports from your application servers.
Configure VPC Flow Logs for your relevant subnet or network interface and send them to CloudWatch Logs or an S3 bucket. You can then query these logs using CloudWatch Logs Insights or Athena to find anomalies.
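Once records are exported, they can also be triaged offline. The sketch below parses the default (version 2) flow log format and flags connections that the application server appears to have initiated toward public addresses on unexpected ports. The function names, the allowed and listening port lists, and the sample record are assumptions for this example, and the egress test is a heuristic rather than a definitive rule.

```php
<?php
/** Parse one default-format (version 2) flow log record, or return null. */
function parseFlowLogLine(string $line): ?array
{
    $fields = preg_split('/\s+/', trim($line));
    if ($fields === false || count($fields) !== 14 || $fields[0] !== '2') {
        return null; // header line, custom format, or malformed record
    }
    $keys = [
        'version', 'account_id', 'interface_id', 'srcaddr', 'dstaddr',
        'srcport', 'dstport', 'protocol', 'packets', 'bytes',
        'start', 'end', 'action', 'log_status',
    ];
    return array_combine($keys, $fields);
}

/** True when $ip validates as an address outside private/reserved ranges. */
function isPublicIp(string $ip): bool
{
    $flags = FILTER_FLAG_NO_PRIV_RANGE | FILTER_FLAG_NO_RES_RANGE;
    return filter_var($ip, FILTER_VALIDATE_IP, $flags) !== false;
}

/**
 * Heuristic: the server is the source, the source port is not one of its
 * listening ports (so this is not a reply to a client), the destination is
 * public, and the destination port is not on the allow list.
 */
function isSuspiciousEgress(
    array $record,
    string $serverIp,
    array $allowedPorts = [443],
    array $listeningPorts = [80, 443]
): bool {
    return $record['srcaddr'] === $serverIp
        && $record['action'] === 'ACCEPT'
        && !in_array((int) $record['srcport'], $listeningPorts, true)
        && isPublicIp($record['dstaddr'])
        && !in_array((int) $record['dstport'], $allowedPorts, true);
}

// Made-up record: the server at 10.0.1.25 connects out to port 8080.
$line = '2 123456789012 eni-0a1b2c3d 10.0.1.25 203.0.113.50 49152 8080 6 10 840 1700000000 1700000060 ACCEPT OK';
$record = parseFlowLogLine($line);
if ($record !== null) {
    var_dump(isSuspiciousEgress($record, '10.0.1.25'));
}
```

Keep in mind that, according to the AWS documentation, flow logs do not record traffic to the instance metadata address `169.254.169.254`, so SSRF attempts against instance metadata will not appear in this data.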
Phase 3: Server-Side Configuration and Code Review
Once potential XXE activity is identified, the next step is to understand how your application and its environment are configured to handle XML parsing.
PHP XML Parser Configuration
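PHP’s XML extensions (DOM, SimpleXML, XMLReader, and SOAP) are all built on `libxml2`, so its configuration decides whether XXE is possible. Since libxml2 2.9.0, external entities are not loaded by default; parsing becomes vulnerable again when code passes flags such as `LIBXML_NOENT` or `LIBXML_DTDLOAD`, or when the server runs an older libxml2. Legacy stacks often meet one of those conditions, so explicit configuration is still crucial.
The key is to prevent external entity loading. On PHP versions before 8.0, this is typically done by calling `libxml_disable_entity_loader(true);` before parsing any untrusted XML. The function is deprecated as of PHP 8.0.0, because PHP 8.0 requires libxml2 2.9.0 or later, where external entity loading is already disabled by default. On PHP 8.0 and later, avoid the dangerous flags instead and, if you need finer control, register a resolver with `libxml_set_external_entity_loader()`.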
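Before changing any configuration, it helps to confirm whether a given flag combination actually resolves external entities on the server in question. The sketch below writes a random marker to a temporary file, references that file through an external entity, and reports whether the marker ends up in the parsed document. The function name `parserExpandsExternalEntities()` is hypothetical, the `file://` path construction assumes a Unix-like host, and the result depends on the installed PHP and libxml2 versions.

```php
<?php
/**
 * Return true if parsing with the given libxml flags pulls the contents of
 * a local canary file into the document through an external entity.
 */
function parserExpandsExternalEntities(int $flags = 0): bool
{
    $marker = 'xxe-canary-' . bin2hex(random_bytes(8));
    $canary = tempnam(sys_get_temp_dir(), 'xxe');
    file_put_contents($canary, $marker);

    // Harmless self-test payload: the entity points at our own temp file.
    $xml = '<!DOCTYPE r [<!ENTITY x SYSTEM "file://' . $canary . '">]>'
         . '<r>&x;</r>';

    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $loaded = $doc->loadXML($xml, $flags);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    unlink($canary);

    if ($loaded === false || $doc->documentElement === null) {
        return false; // nothing parsed, so nothing was expanded
    }
    return strpos($doc->documentElement->textContent, $marker) !== false;
}

// Compare the default flags with the flag that enables entity substitution.
var_dump(parserExpandsExternalEntities());
var_dump(parserExpandsExternalEntities(LIBXML_NOENT));
```

Running the probe with the exact flags used by the application code gives a direct answer for that environment, which is more reliable than inferring behaviour from version numbers alone.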
Example of secure XML parsing in PHP:
<?php
// Rule 1: never pass LIBXML_NOENT, LIBXML_DTDLOAD, LIBXML_DTDVALID, or
// LIBXML_XINCLUDE when parsing untrusted XML. These flags ENABLE entity
// substitution, DTD loading, and XInclude; they are off unless you pass them.
// LIBXML_NOENT in particular is what makes XXE exploitable.
// LIBXML_PARSEHUGE relaxes parser limits; it does not prevent DoS and is
// unrelated to XXE, so do not use it on untrusted input either.
//
// SoapClient and SoapServer parse XML internally through libxml2; the
// function below only protects XML that your own code parses.
function parseXmlSecurely(string $xmlString): DOMDocument {
    // SOAP messages must not contain a DTD, so reject any DOCTYPE up front.
    // This is a cheap first check; the parser settings below are the real protection.
    // If your XML genuinely needs DTDs, resolve them only from local, trusted copies.
    if (stripos($xmlString, '<!DOCTYPE') !== false) {
        throw new InvalidArgumentException('DOCTYPE declarations are not allowed');
    }

    // For PHP < 8.0: disable the entity loader for the duration of this parse only.
    // The function is deprecated as of PHP 8.0.0, hence the version guard.
    $legacy = PHP_VERSION_ID < 80000;
    $previousLoader = $legacy ? libxml_disable_entity_loader(true) : false;
    $previousErrors = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    // LIBXML_NONET forbids network access during parsing.
    $loaded = $doc->loadXML($xmlString, LIBXML_NONET);

    // Restore the previous global state so other code is unaffected.
    libxml_clear_errors();
    libxml_use_internal_errors($previousErrors);
    if ($legacy) {
        libxml_disable_entity_loader($previousLoader);
    }

    if ($loaded === false) {
        throw new InvalidArgumentException('Malformed XML');
    }
    return $doc;
}