Vengala Vinay

Having 9+ Years of Experience in Software Development

Step-by-Step: Diagnosing XML External Entity (XXE) injection in old SOAP integrations on OVH Servers

Understanding the XXE Threat in SOAP Integrations

XML External Entity (XXE) injection remains a persistent vulnerability, particularly in legacy systems and SOAP integrations. These integrations, often found in older enterprise architectures, frequently parse XML payloads with permissive parser settings. When an attacker controls part of the XML input, they can craft payloads that exploit the parser’s ability to fetch external resources. This can lead to sensitive data disclosure, Server-Side Request Forgery (SSRF), denial-of-service (DoS) attacks, and, in some setups (for example, when PHP’s expect:// wrapper is installed), remote code execution. On OVH servers, as in any other hosting environment, the underlying XML parsing library and the way the application calls it determine the susceptibility to XXE.

Identifying Potential XXE Vectors in SOAP Requests

The primary indicator of an XXE vulnerability in a SOAP integration is the parser’s behavior when it encounters a specially crafted DOCTYPE declaration. A typical SOAP request is an XML envelope. If the request carries a DOCTYPE declaration that defines an external entity, and the server’s XML parser is set up to resolve such entities, an XXE attack is possible. We’ll focus on diagnosing this in a PHP-based SOAP service hosted on an OVH server, since PHP’s XML extensions (DOM, SimpleXML, SOAP) are built on libxml2.

Diagnostic Step 1: Analyzing Server Logs

The first diagnostic step is to scrutinize your web server and application logs. Look for unusual patterns in incoming SOAP requests: requests with DOCTYPE declarations, or requests that appear to reach for internal network resources or external URLs that are not part of the legitimate integration flow.

On an OVH server, you’ll typically find:

  • Apache/Nginx access logs: /var/log/apache2/access.log or /var/log/nginx/access.log
  • PHP error logs: Often configured via php.ini, e.g., /var/log/php/error.log
  • Application-specific logs: If your SOAP service logs detailed request/response information.

Search for patterns like:

<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>
<soap:Envelope ...>
  <soap:Body>
    <your:request xmlns:your="http://your.namespace.com">
      <your:data>&xxe;</your:data>
    </your:request>
  </soap:Body>
</soap:Envelope>

If you see requests containing such DOCTYPEs, especially when they are followed by application errors or unexpected behavior, it’s a strong indicator. Keep in mind that standard access logs record the request line, not the POST body, so these payloads only show up where the application (or a request-body audit log) records them. Also watch for signs that the server tried to reach internal IP addresses (e.g., 192.168.x.x, 10.x.x.x, 172.16.x.x-172.31.x.x) or cloud metadata services, which are typical SSRF targets.
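The search described above can be sketched as a small shell helper. This is a minimal sketch, assuming request bodies are written to a log file you can read; the function name scan_xxe_markers and the example paths are illustrative and not part of any existing tool. It also matches the URL-encoded form of `<!`, since payloads may appear encoded in some logs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print log lines that carry DTD or entity markers.
# Usage: scan_xxe_markers FILE...
scan_xxe_markers() {
  # -H: show file name, -n: show line number, -i: ignore case, -E: extended regex.
  # Matches "<!DOCTYPE" / "<!ENTITY" (raw or URL-encoded as %3C%21) and
  # SYSTEM identifiers that point at file, network, or PHP stream wrappers.
  grep -HniE '(<!|%3C%21)(DOCTYPE|ENTITY)|SYSTEM[[:space:]]+.(file|https?|ftp|php|expect)://' "$@"
}

# Example (paths are assumptions; use the files where your service logs bodies):
# scan_xxe_markers /var/log/php/error.log /path/to/your/soap_requests.log
```

grep exits with a non-zero status when nothing matches, which makes the helper easy to use in a scheduled check.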

Diagnostic Step 2: Simulating XXE Payloads

To confirm the vulnerability, you need to send crafted requests. This is best done in a controlled staging or development environment. We’ll use `curl` for this, targeting a hypothetical SOAP endpoint https://your-ovh-domain.com/soap_service.php.

Scenario A: File Disclosure (e.g., reading /etc/passwd)

curl -X POST \
  https://your-ovh-domain.com/soap_service.php \
  -H 'Content-Type: text/xml; charset=utf-8' \
  -H 'SOAPAction: "http://your.namespace.com/YourOperation"' \
  --data-binary '<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
               xmlns:your="http://your.namespace.com">
  <soap:Body>
    <your:GetData>
      <your:ItemId>&xxe;</your:ItemId>
    </your:GetData>
  </soap:Body>
</soap:Envelope>'

If the response contains the content of /etc/passwd (or a partial dump, or an error indicating it tried to access it), this confirms file disclosure via XXE. The attacker would typically embed the entity reference within a data field that is then echoed back in the response.

Scenario B: Server-Side Request Forgery (SSRF)

This attempts to make the server perform a request to an internal or external resource. We’ll try to access a hypothetical internal service on port 8080.

curl -X POST \
  https://your-ovh-domain.com/soap_service.php \
  -H 'Content-Type: text/xml; charset=utf-8' \
  -H 'SOAPAction: "http://your.namespace.com/YourOperation"' \
  --data-binary '<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "http://127.0.0.1:8080/internal"> ]>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
               xmlns:your="http://your.namespace.com">
  <soap:Body>
    <your:GetData>
      <your:ItemId>&xxe;</your:ItemId>
    </your:GetData>
  </soap:Body>
</soap:Envelope>'

Observe the response. If the response contains data that would typically come from http://127.0.0.1:8080/internal, or if there’s a timeout/error that suggests the server *attempted* to connect, it indicates SSRF capability. You might also see different error messages depending on whether the port is open or closed.

Diagnostic Step 3: Inspecting PHP’s XML Parsing Configuration

The vulnerability often stems from how PHP’s XML extensions call libxml2. libxml2 versions before 2.9.0 substituted external entities by default. From 2.9.0 onward, entity substitution is off by default and only happens when the code asks for it, most commonly by passing LIBXML_NOENT. We therefore need to check which PHP and libxml2 versions are in use and inspect the code that handles XML parsing.
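A quick way to see which side of the 2.9.0 boundary a server is on is to ask the PHP CLI for the libxml2 version it is linked against. The following is a minimal sketch: version_lt is a hypothetical helper name, sort -V assumes GNU coreutils (as found on typical Linux servers), and the CLI build may differ from the Apache or PHP-FPM build, so treat the result as a hint rather than proof.

```shell
#!/usr/bin/env bash
# Returns success (0) when version $1 is strictly older than version $2.
version_lt() {
  [ "$1" != "$2" ] && [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Report the libxml2 version of the PHP CLI, if a php binary is available.
if command -v php >/dev/null 2>&1; then
  libxml_version=$(php -r 'echo LIBXML_DOTTED_VERSION;' 2>/dev/null || true)
  if [ -z "$libxml_version" ]; then
    echo "could not read LIBXML_DOTTED_VERSION from the php CLI"
  elif version_lt "$libxml_version" "2.9.0"; then
    echo "libxml2 $libxml_version: older than 2.9.0, external entities may load by default"
  else
    echo "libxml2 $libxml_version: entity substitution is off unless the code opts in"
  fi
fi
```

For the web-facing build, the same version appears in the libxml section of phpinfo() output.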

Checking php.ini settings:

On your OVH server, locate your active php.ini file. This can vary based on your hosting plan and PHP version. Common locations include:

  • /etc/php/[php_version]/apache2/php.ini
  • /etc/php/[php_version]/fpm/php.ini
  • /usr/local/etc/php/[php_version]/php.ini

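When you open it, note that there is no libxml_disable_entity_loader directive. libxml_disable_entity_loader() is a runtime function, so a line such as the following in php.ini has no effect and is not evidence either way:

libxml_disable_entity_loader = Off

What matters instead is the PHP version and the libxml2 version it is linked against (both are shown by php -i or phpinfo(), the latter under the libxml section). PHP 8.0 and later require libxml2 2.9.0 or newer, so external entity loading is disabled by default there, and the function itself is deprecated as of PHP 8.0. On older stacks, where PHP is linked against libxml2 before 2.9.0, external entities are loaded unless the code calls libxml_disable_entity_loader(true), which is a critical finding if that call is missing.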

Checking PHP Code for XML Parsing:

Even on a current PHP and libxml2, the application code can re-enable entity processing through parser flags. Examine the PHP code that receives and parses the SOAP XML payload. Look for parse calls and the options passed to them:

<?php
$xmlString = file_get_contents('php://input');

// VULNERABLE: LIBXML_NOENT substitutes entities, including external ones
$xml = simplexml_load_string($xmlString, 'SimpleXMLElement', LIBXML_NOENT);

// VULNERABLE: same flag on DOMDocument; LIBXML_DTDLOAD also fetches external DTDs
$dom = new DOMDocument();
$dom->loadXML($xmlString, LIBXML_NOENT | LIBXML_DTDLOAD);

// VULNERABLE on libxml2 < 2.9.0 only: no flags, but the old defaults
// may still load external entities
$xml = simplexml_load_string($xmlString);

// LEGACY FIX (PHP < 8.0): disable the entity loader before any parsing.
// It applies to SimpleXML and DOMDocument alike; deprecated as of PHP 8.0.
if (PHP_VERSION_ID < 80000) {
    libxml_disable_entity_loader(true);
}

// SAFER: no entity-substitution flags; LIBXML_NONET blocks network access
$dom = new DOMDocument();
$dom->loadXML($xmlString, LIBXML_NONET);
?>

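The key is to ensure that no parse call on untrusted input passes LIBXML_NOENT or LIBXML_DTDLOAD, and that on PHP versions before 8.0 linked against an old libxml2, libxml_disable_entity_loader(true) is called before any XML parsing function. DOMDocument’s resolveExternals property is false by default; leaving it that way is necessary, but setting it to false does not by itself protect a parser that is given LIBXML_NOENT.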
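On a larger codebase, a recursive grep can shortlist the lines worth reviewing. This is a minimal sketch, assuming GNU grep and a source tree of .php files; find_risky_xml_flags is a hypothetical helper name, and a match is only a prompt for manual review, not proof of a vulnerability.

```shell
#!/usr/bin/env bash
# Hypothetical audit helper: list PHP lines that opt in to DTD or entity handling.
# Usage: find_risky_xml_flags DIR
find_risky_xml_flags() {
  # Flags that load DTDs or substitute entities, an explicit re-enable of the
  # entity loader, and DOMDocument properties switched on for DTD processing.
  grep -RniE --include='*.php' \
    'LIBXML_(NOENT|DTDLOAD|DTDATTR|DTDVALID)|libxml_disable_entity_loader[[:space:]]*\([[:space:]]*false|(substituteEntities|resolveExternals|validateOnParse)[[:space:]]*=[[:space:]]*true' \
    "$1"
}

# Example (the path is an assumption; point it at your application root):
# find_risky_xml_flags /var/www/your-app
```

Each hit is printed as file:line:content, so the output can be pasted straight into a review ticket.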

Diagnostic Step 4: Network Traffic Analysis (Advanced)

If logs and code analysis are inconclusive, or if you suspect the server is making outbound connections that aren’t logged by the web server, network traffic analysis can be invaluable. This is more intrusive and requires appropriate permissions.

Using tcpdump or wireshark:

On the OVH server, you can use tcpdump to capture network packets. Filter for traffic that originates from your web server’s IP address, so that ordinary client requests arriving on ports 80 and 443 do not drown out the connections the server opens itself.

# Capture outbound HTTP/HTTPS requests made by the server itself
sudo tcpdump -i any -n -s 0 'src host your_server_ip and (dst port 80 or dst port 443)' -w /tmp/xxe_capture.pcap

# Or, if you suspect internal (RFC 1918) addresses are being targeted
sudo tcpdump -i any -n -s 0 'src host your_server_ip and (dst net 10.0.0.0/8 or dst net 172.16.0.0/12 or dst net 192.168.0.0/16)' -w /tmp/xxe_capture_internal.pcap

After capturing traffic during a simulated XXE attack (using the curl commands from Step 2), analyze the .pcap file with Wireshark. Look for:

  • Outbound HTTP/HTTPS requests to unexpected destinations.
  • DNS lookups for external domains that are not part of your application’s normal operation.
  • Connections to internal IP addresses or ports.

This step is crucial for confirming SSRF attacks that might not leave obvious traces in application logs.
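To cut the capture down to connections the server initiates, regardless of port, one option is to match only the first packet of each new TCP connection plus DNS queries. The following is a sketch under stated assumptions: outbound_filter is a hypothetical helper name, the tcp[tcpflags] test applies to IPv4 only, and requests to 127.0.0.1 use the loopback source address, so they need a separate filter on that address.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a capture filter for connections the server opens.
# $1 is the server's own IPv4 address.
outbound_filter() {
  # SYN set and ACK clear marks the first packet of a new TCP connection, so
  # replies to inbound clients (SYN+ACK) are excluded. The second clause
  # catches DNS lookups triggered by entity URLs.
  printf '(src host %s and tcp[tcpflags] & (tcp-syn|tcp-ack) == tcp-syn) or (src host %s and udp dst port 53)' "$1" "$1"
}

# Example (203.0.113.10 is a documentation address; substitute your own):
# sudo tcpdump -i any -n "$(outbound_filter 203.0.113.10)" -w /tmp/xxe_outbound.pcap
# sudo tcpdump -n -r /tmp/xxe_outbound.pcap
```

A capture made this way stays small, which makes unexpected destinations easy to spot when the file is read back.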

Mitigation Strategies

Once an XXE vulnerability is confirmed, immediate mitigation is necessary. The most effective approach is to disable external entity processing entirely.

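1. PHP and libxml2 Versions:

libxml_disable_entity_loader is a PHP function, not a php.ini directive, so adding it to php.ini protects nothing. The configuration-level fix is to run a PHP build linked against libxml2 2.9.0 or later (PHP 8.0 and newer require it), where external entities are not substituted unless the code requests it. After upgrading or switching PHP versions, restart the web server or PHP-FPM so the new build is actually in use (e.g., sudo systemctl restart php[php_version]-fpm).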

2. Code-Level Mitigation:

If you cannot upgrade the stack or need defense-in-depth, make the parsing code safe explicitly:

<?php
// PHP < 8.0 only: disable the entity loader before parsing untrusted XML.
// The function is deprecated as of PHP 8.0, which requires libxml2 >= 2.9.0.
if (PHP_VERSION_ID < 80000) {
    libxml_disable_entity_loader(true);
}

// Parse without LIBXML_NOENT or LIBXML_DTDLOAD; LIBXML_NONET blocks network access
$dom = new DOMDocument();
$dom->loadXML($xmlString, LIBXML_NONET);

// SimpleXML: same rule, pass only LIBXML_NONET
$xml = simplexml_load_string($xmlString, 'SimpleXMLElement', LIBXML_NONET);
?>

3. Input Validation and Sanitization:

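While not a primary defense against XXE itself (as it exploits the parser), validating the structure and content of incoming XML can help reject malformed or suspicious requests early. The SOAP 1.1 specification states that a SOAP message must not contain a Document Type Declaration, so a request whose body includes a DOCTYPE can be rejected before it reaches the parser. However, rely on safe parser settings as the main protection.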

4. Web Application Firewall (WAF):

A WAF can be configured to detect and block common XXE patterns in requests. While useful, it should be considered a supplementary layer, not a replacement for secure parsing configurations.

Conclusion

Diagnosing XXE in SOAP integrations on OVH servers requires a systematic approach, combining log analysis, targeted payload simulation, and an understanding of PHP’s XML parsing capabilities. By following these steps, DevOps engineers can effectively identify, confirm, and remediate XXE vulnerabilities, safeguarding sensitive data and system integrity.

Copyright © 2026 · Vinay Vengala