Mitigating OWASP Top 10 Risks: Finding and Patching XML External Entity (XXE) injection in old SOAP integrations in C
Understanding XXE in C-based SOAP Integrations
XML External Entity (XXE) injection remains a persistent threat, particularly within legacy systems that rely on XML parsing. When these systems process untrusted XML input, an attacker can exploit vulnerabilities in the XML parser to access sensitive files on the server, perform denial-of-service attacks, or even conduct server-side request forgery (SSRF). This is especially relevant for older SOAP integrations built in C, where direct memory manipulation and less robust default parser configurations can exacerbate the risks.
The core of an XXE vulnerability lies in how an XML parser handles external entities defined within an XML document. An external entity declaration allows an XML document to reference content from external sources, such as local files or URLs. If the parser is configured to resolve these external entities, an attacker can craft malicious XML to point to sensitive system resources.
Identifying XXE Vulnerabilities in C SOAP Clients/Servers
The first step in mitigation is detection. In C, XML parsing is often handled by libraries like `libxml2`. A common indicator of an XXE vulnerability is code that calls `libxml2` entry points such as `xmlReadMemory`, `xmlReadFile`, `xmlCtxtReadMemory`, or the legacy `xmlParseMemory` on untrusted input, especially when the options argument includes `XML_PARSE_NOENT`, `XML_PARSE_DTDLOAD`, or `XML_PARSE_DTDVALID`, or when the code calls `xmlSubstituteEntitiesDefault(1)`. Look for patterns where XML documents are loaded and processed from potentially untrusted sources.
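As a rough illustration of that kind of audit, the sketch below flags a line of C source that mentions one of the risky `libxml2` identifiers. The function name `find_risky_libxml2_token` and the token list are assumptions made for this example, not part of `libxml2`. Plain substring matching like this is only a first pass: it cannot tell whether the flagged call ever sees untrusted input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical audit helper: returns the first risky libxml2 identifier
 * found in one line of C source, or NULL if none is present. */
const char *find_risky_libxml2_token(const char *source_line) {
    static const char *const risky_tokens[] = {
        "XML_PARSE_NOENT",              /* turns entity substitution ON */
        "XML_PARSE_DTDLOAD",            /* loads the external DTD subset */
        "XML_PARSE_DTDVALID",           /* DTD validation, which needs the DTD loaded */
        "xmlSubstituteEntitiesDefault", /* legacy global substitution switch */
        "xmlParseMemory",               /* legacy entry point with no options argument */
    };
    assert(source_line != NULL);
    for (size_t i = 0; i < sizeof risky_tokens / sizeof risky_tokens[0]; i++) {
        if (strstr(source_line, risky_tokens[i]) != NULL) {
            return risky_tokens[i];
        }
    }
    return NULL;
}
```

Note that `XML_PARSE_NONET` (no network) is deliberately absent from the list. It differs from `XML_PARSE_NOENT` by two transposed letters and has the opposite security effect, so a reviewer should read flagged lines carefully.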
Consider a simplified C code snippet that might be part of a SOAP client or server handling an incoming XML request. Without explicit security measures, this code is susceptible to XXE:
#include <stdio.h>
#include <string.h>
#include <libxml/parser.h>
#include <libxml/tree.h>
// ... other includes and functions ...
void process_soap_request(const char *xml_data) {
    // Parse the XML data with no explicit options: nothing here restricts
    // DTD processing or network access
    xmlDocPtr doc = xmlReadMemory(xml_data, (int)strlen(xml_data), "noname.xml", NULL, 0);
    if (doc == NULL) {
        fprintf(stderr, "Failed to parse XML document\n");
        return;
    }
    // ... further processing of the XML document ...
    xmlFreeDoc(doc);
    // xmlCleanupParser() belongs at process exit, not in a per-request handler
}
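Whether `xmlReadMemory` resolves external entities when called with an options value of 0 depends on the `libxml2` version. Releases before 2.9.0 loaded external entities by default. Later releases fetch external content only when flags such as `XML_PARSE_NOENT`, `XML_PARSE_DTDLOAD`, or `XML_PARSE_DTDVALID` are passed, which legacy integrations often do. If `xml_data` originates from an untrusted network source or user input and the parser is in one of those configurations, an attacker could inject an XML payload like this: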
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
<soapenv:Header/>
<soapenv:Body>
<m:processRequest xmlns:m="http://example.com/myapi">
<m:data>&xxe;</m:data>
</m:processRequest>
</soapenv:Body>
</soapenv:Envelope>
If the C code then reads the text content of `<m:data>` and echoes it in a response, a fault message, or a log, it would inadvertently expose the contents of `/etc/passwd`. To find these vulnerabilities systematically, static analysis tools that can parse C code and identify `libxml2` usage are invaluable. Dynamic analysis is also crucial: fuzz the SOAP endpoints with crafted XML payloads covering various XXE attack vectors (e.g., `file://` and `http://` entity URIs).
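For the dynamic side, a test harness needs a way to generate probe messages with different entity targets. The following is a minimal sketch of such a generator, assuming the envelope shape from the example payload above. The function name `build_xxe_probe` is hypothetical, and the target URI is whatever the tester chooses for their own test environment.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical test helper: writes a SOAP envelope whose body references an
 * external entity pointing at system_uri. Returns the payload length, or -1
 * if the URI contains a double quote or the output buffer is too small. */
int build_xxe_probe(char *out, size_t out_size, const char *system_uri) {
    assert(out != NULL && system_uri != NULL);
    /* A double quote would terminate the SYSTEM literal early. */
    if (strchr(system_uri, '"') != NULL) {
        return -1;
    }
    int written = snprintf(out, out_size,
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        "<!DOCTYPE foo [ <!ENTITY xxe SYSTEM \"%s\"> ]>\n"
        "<soapenv:Envelope xmlns:soapenv=\"http://schemas.xmlsoap.org/soap/envelope/\">\n"
        "<soapenv:Body><m:processRequest xmlns:m=\"http://example.com/myapi\">"
        "<m:data>&xxe;</m:data>"
        "</m:processRequest></soapenv:Body>\n"
        "</soapenv:Envelope>\n",
        system_uri);
    /* snprintf reports the length it wanted to write; treat truncation as failure. */
    if (written < 0 || (size_t)written >= out_size) {
        return -1;
    }
    return written;
}
```

A harness could call this once per URI scheme under test, send each payload to the endpoint, and check whether the response, the logs, or an outbound-connection listener shows that the entity was resolved.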
Patching XXE Vulnerabilities in C with libxml2
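The most effective way to mitigate XXE in `libxml2` is to make sure external entities are never loaded or substituted. Parser behavior is controlled by `xmlParserOption` flags, which are passed as the final argument to parsing functions such as `xmlReadMemory` and `xmlReaderForMemory`.
The flag names are easy to misread. `XML_PARSE_NOENT` does not mean "no entities": it turns entity substitution on, so it must not be set for untrusted input. The same goes for `XML_PARSE_DTDLOAD`, `XML_PARSE_DTDATTR`, and `XML_PARSE_DTDVALID`, which cause DTDs (often where entities are defined) to be loaded and applied. `XML_PARSE_NONET`, by contrast, should be set, because it forbids network access during parsing.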
Using `xmlReadMemory` with Security Options
When using `xmlReadMemory`, you can pass specific options. A more secure approach involves explicitly setting these options:
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <libxml/parser.h>
#include <libxml/tree.h>
// ... other includes and functions ...
void process_soap_request_secure(const char *xml_data) {
    size_t len = strlen(xml_data);
    // xmlReadMemory takes an int size, so refuse anything larger
    if (len > INT_MAX) {
        fprintf(stderr, "XML document too large\n");
        return;
    }
    // XML_PARSE_NONET: forbid network access (SSRF prevention)
    // XML_PARSE_NOENT and the XML_PARSE_DTD* flags are deliberately NOT set,
    // so entities are not substituted and external DTDs are not loaded
    int options = XML_PARSE_NONET;
    xmlDocPtr doc = xmlReadMemory(xml_data, (int)len, "noname.xml", NULL, options);
    if (doc == NULL) {
        fprintf(stderr, "Failed to parse XML document securely\n");
        return;
    }
    // SOAP messages must not contain a DOCTYPE, so reject any that do
    if (xmlGetIntSubset(doc) != NULL) {
        fprintf(stderr, "Rejected SOAP message containing a DOCTYPE\n");
        xmlFreeDoc(doc);
        return;
    }
    // ... further processing of the XML document ...
    xmlFreeDoc(doc);
    // Call xmlCleanupParser() once at process exit, not per request
}
In this improved version, the options argument of `xmlReadMemory` carries the whole configuration. `XML_PARSE_NONET` blocks network access during parsing, which closes off SSRF through entity URIs that point at remote hosts. Leaving out `XML_PARSE_NOENT` and the DTD-loading flags means entity references are never replaced with external content. Rejecting any document that carries a DOCTYPE goes one step further, since SOAP messages are not permitted to contain one.
Using `xmlReader` for SAX-like Processing
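For very large XML documents, or when a streaming approach is preferred over building a full tree, `xmlTextReader` (the `xmlReader` API) lets the application pull one node at a time. It is not SAX itself, but it serves the same purpose with a simpler programming model. The security options are applied similarly.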
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <libxml/xmlreader.h>
// ... other includes and functions ...
void process_soap_request_reader_secure(const char *xml_data) {
    int ret;
    size_t len = strlen(xml_data);
    if (len > INT_MAX) {
        fprintf(stderr, "XML document too large\n");
        return;
    }
    // Create a reader; XML_PARSE_NONET forbids network access, and
    // XML_PARSE_NOENT and the XML_PARSE_DTD* flags are deliberately left unset
    xmlTextReaderPtr reader = xmlReaderForMemory(xml_data, (int)len, "noname.xml", NULL, XML_PARSE_NONET);
    if (!reader) {
        fprintf(stderr, "Failed to create XML reader\n");
        return;
    }
    // Make the intent explicit on the reader's internal parser context
    xmlTextReaderSetParserProp(reader, XML_PARSER_LOADDTD, 0);
    xmlTextReaderSetParserProp(reader, XML_PARSER_SUBST_ENTITIES, 0);
    // xmlTextReaderRead returns 1 while nodes remain, 0 at end of input, -1 on error
    while ((ret = xmlTextReaderRead(reader)) == 1) {
        // SOAP messages must not contain a DOCTYPE, so stop if one appears
        if (xmlTextReaderNodeType(reader) == XML_READER_TYPE_DOCUMENT_TYPE) {
            fprintf(stderr, "Rejected SOAP message containing a DOCTYPE\n");
            break;
        }
        // ... process nodes as needed ...
        // For example, to get node name: xmlTextReaderConstName(reader)
        // To get node value: xmlTextReaderConstValue(reader)
    }
    if (ret < 0) {
        fprintf(stderr, "Error reading XML stream\n");
    }
    xmlFreeTextReader(reader);
    // Call xmlCleanupParser() once at process exit, not per request
}
Here the options passed to `xmlReaderForMemory` do the main work, and `xmlTextReaderSetParserProp` with `XML_PARSER_LOADDTD` and `XML_PARSER_SUBST_ENTITIES` set to 0 states the same intent on the reader’s underlying parser context. This ensures that even if the XML document attempts to declare external entities, they will not be resolved. Note that the read loop runs while `xmlTextReaderRead` returns 1 and treats a negative return value as a parse error.
Beyond libxml2: General Best Practices
While securing the XML parser is paramount, a defense-in-depth strategy is essential:
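- Input Validation: Even with a secure parser, validate the structure and content of the XML against a known schema (XSD) once it has been parsed with the hardened options, and reject anything that deviates from the expected format before it reaches business logic. A cheap check that can run before parsing is to refuse any message containing a DOCTYPE declaration, which SOAP does not allow.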
- Least Privilege: Ensure the process running the SOAP integration operates with the minimum necessary file system and network permissions. This limits the blast radius if an XXE vulnerability is somehow exploited.
- Dependency Management: Keep `libxml2` and all other libraries up-to-date. Vulnerabilities are discovered and patched regularly. Regularly scan your dependencies for known security issues.
- Logging and Monitoring: Implement robust logging for parsing errors and suspicious XML patterns. Monitor these logs for signs of attempted attacks.
- Disable Unused Features: If your SOAP integration doesn’t require features like XInclude, keep them off: do not pass the `XML_PARSE_XINCLUDE` option and do not call `xmlXIncludeProcess` on parsed documents.
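As one possible form of input check that runs before any parsing, the sketch below scans the raw request bytes for DTD or entity declarations. The function names are hypothetical, and the scan assumes an ASCII-compatible encoding such as UTF-8. A UTF-16 message would slip past it, so treat it as a complement to the hardened parser options rather than a replacement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if needle occurs anywhere in the first len bytes of buf. */
static bool contains_bytes(const char *buf, size_t len, const char *needle) {
    size_t needle_len = strlen(needle);
    if (needle_len == 0 || needle_len > len) {
        return false;
    }
    for (size_t i = 0; i + needle_len <= len; i++) {
        if (memcmp(buf + i, needle, needle_len) == 0) {
            return true;
        }
    }
    return false;
}

/* Hypothetical pre-parse guard: true if the raw message appears to declare
 * a DTD or an entity. XML keywords are case-sensitive, so an exact match
 * on the uppercase forms is sufficient for well-formed input. */
bool soap_message_declares_dtd(const char *buf, size_t len) {
    assert(buf != NULL);
    return contains_bytes(buf, len, "<!DOCTYPE") ||
           contains_bytes(buf, len, "<!ENTITY");
}
```

The check may also reject a message that merely mentions `<!DOCTYPE` inside a comment or CDATA section. For a SOAP endpoint that is usually an acceptable trade-off, and each rejection is a useful event to log.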
For C-based SOAP integrations, the focus must be on the XML parsing library. By diligently applying the security options provided by `libxml2` and adopting a layered security approach, you can effectively mitigate the risks posed by XXE injection.