What Is an XXE Attack?
A security vulnerability where unsafe XML entity resolution can read files, call internal services, or leak secrets.
An XXE attack, or XML External Entity attack, is a security vulnerability where an XML parser resolves attacker-controlled external entities and exposes local files, internal services, or secrets. In LLM and agent systems, it shows up when tools ingest XML from uploads, APIs, emails, or retrieved documents during eval pipelines and production traces. FutureAGI treats XXE as a detectable code and workflow risk: flag unsafe parser settings, block untrusted XML paths, and regression-test agent tools that parse XML.
Why XXE Attacks Matter in Production LLM and Agent Systems
XXE is easy to miss because the risky line is often outside the model prompt. A research agent may fetch a vendor XML feed, a document-ingestion tool may unpack OOXML metadata, or a support workflow may parse SAML, SVG, SOAP, RSS, or legacy API responses. If the parser allows external entities, attacker-supplied XML can read files such as /etc/passwd, call cloud metadata endpoints, scan internal URLs, or trigger entity expansion that exhausts CPU and memory.
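To make the mechanism concrete, the sketch below shows the general shape of a file-read payload and one defensive option: refusing any document that declares a DOCTYPE before it reaches application code. It uses Python's standard-library `xml.parsers.expat`. The names `ForbiddenDTDError`, `reject_dtd`, and `is_dtd_free` are hypothetical, introduced here for illustration, and are not part of any FutureAGI API.

```python
from xml.parsers import expat


class ForbiddenDTDError(ValueError):
    """Raised when untrusted XML declares a DOCTYPE (hypothetical name)."""


# Typical shape of a file-read XXE payload: the DTD declares an external
# entity, and the document body references it.
XXE_PAYLOAD = b"""<?xml version="1.0"?>
<!DOCTYPE data [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<data>&xxe;</data>"""

SAFE_XML = b"<data>hello</data>"


def reject_dtd(xml_bytes: bytes) -> None:
    """Parse with expat and fail as soon as a DOCTYPE declaration is seen."""
    parser = expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # expat calls this handler when a DOCTYPE declaration starts;
        # raising here aborts the parse.
        raise ForbiddenDTDError(f"DOCTYPE {name!r} is not allowed")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_bytes, True)


def is_dtd_free(xml_bytes: bytes) -> bool:
    """Return False for documents that declare a DOCTYPE, True otherwise."""
    try:
        reject_dtd(xml_bytes)
    except ForbiddenDTDError:
        return False
    return True
```

Rejecting every DOCTYPE is stricter than disabling external entities alone. It assumes the business schemas involved do not need DTDs, which should be confirmed for each integration.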
The first symptom rarely says “XXE.” Developers see unexpected outbound HTTP requests from a parsing service. SREs see egress to link-local addresses, odd DNS lookups, elevated parser latency, 500s after file uploads, or p99 spikes on document conversion. Compliance and security teams see possible secret exposure, PII leakage, or evidence gaps because the agent trace records the final answer but not the parser event that supplied it. End users feel the blast radius when an assistant returns leaked internal content or when uploads are disabled after an incident.
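One of the symptoms above, egress from a parsing service to link-local or other internal addresses, can be checked mechanically. The following is a minimal sketch using Python's standard-library `ipaddress` module. The function name `is_suspicious_egress` is hypothetical, and the sketch only classifies IP literals; it does not resolve hostnames.

```python
import ipaddress


def is_suspicious_egress(destination: str) -> bool:
    """Return True when a destination is a link-local, loopback, or private IP."""
    try:
        ip = ipaddress.ip_address(destination)
    except ValueError:
        # Not an IP literal (for example a hostname); this sketch does not
        # perform DNS resolution, so it cannot classify the destination.
        return False
    return ip.is_link_local or ip.is_loopback or ip.is_private
```

For example, `is_suspicious_egress("169.254.169.254")` is true because that address is link-local. A real deployment would also need to resolve hostnames and account for its own network ranges.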
Agentic systems increase the risk because XML parsing becomes one step in a larger trajectory. The model may choose a file-conversion tool, pass a user-provided URL, retry with another parser, summarize the result, and store it in memory. A 2026 security review has to evaluate the tool path, not just the final natural-language answer.
How FutureAGI Detects XXE Attacks
FutureAGI maps this term to the XXEDetector eval. In a secure-eval workflow, engineers run XXEDetector against code paths that parse XML before an agent can call them: document import, URL fetch, SAML callback, SOAP connector, SVG rendering, and third-party feed ingestion. The detector is mapped to CWE-611, so a finding should be triaged as an XML parser configuration issue, not as a model-quality failure.
A practical pattern is to attach the detector to a release dataset for tool-enabled agents. Each row includes the tool name, XML source, parser library, expected policy, and fixture payload. The eval result becomes a gate: unsafe external entity resolution blocks release, creates a ticket, and adds the payload to regression evals. In production, traceAI captures the path around the parser call through traceAI-langchain or another integration, with agent.trajectory.step identifying which agent step selected the XML tool. If the route is exposed through Agent Command Center, a pre-guardrail can reject untrusted XML uploads or route suspicious requests to human review.
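The release-gate pattern described above can be sketched as follows. This is an illustration under stated assumptions, not FutureAGI's dataset schema or API: `XmlToolCase`, `failing_cases`, `release_allowed`, and `stand_in_detector` are hypothetical names, the rows are made up, and the detector is passed in as a plain callable so a real detector call could be substituted.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class XmlToolCase:
    """One release-dataset row for a tool that parses XML (hypothetical schema)."""
    tool_name: str
    xml_source: str
    parser_library: str
    expected_policy: str
    fixture_payload: str


def failing_cases(
    cases: Iterable[XmlToolCase],
    is_unsafe: Callable[[XmlToolCase], bool],
) -> List[XmlToolCase]:
    """Return every case the detector callable reports as unsafe."""
    return [case for case in cases if is_unsafe(case)]


def release_allowed(
    cases: Iterable[XmlToolCase],
    is_unsafe: Callable[[XmlToolCase], bool],
) -> bool:
    """The gate passes only when no case is flagged."""
    return not failing_cases(cases, is_unsafe)


# Illustrative rows; tool names, sources, and policies are made up.
CASES = [
    XmlToolCase(
        "rss_reader", "third-party feed", "lxml", "no_external_entities",
        '<!DOCTYPE d [<!ENTITY x SYSTEM "file:///etc/passwd">]><d>&x;</d>',
    ),
    XmlToolCase(
        "svg_renderer", "user upload", "expat", "no_external_entities",
        '<!DOCTYPE d [<!ENTITY x SYSTEM "http://169.254.169.254/">]><d>&x;</d>',
    ),
]


def stand_in_detector(case: XmlToolCase) -> bool:
    # Placeholder for a real detector call: pretends only rss_reader is unsafe.
    return case.tool_name == "rss_reader"
```

With these rows, `release_allowed(CASES, stand_in_detector)` is false, and `failing_cases` returns the rows that would be ticketed and added to the regression set.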
FutureAGI’s approach is to connect code-level security findings with agent behavior. Unlike a generic Semgrep rule that stops at source code, the workflow asks whether an LLM or agent can reach the parser, with which input source, and under which policy.
How to Measure or Detect XXE Attacks
Use several signals because no single score captures XXE exposure:
- `XXEDetector` findings - flag XML parser usage that can allow external entity resolution, aligned to CWE-611.
- Parser configuration evidence - verify DTDs, external entities, XInclude, and network fetches are disabled for untrusted XML.
- Trace signals - inspect `agent.trajectory.step`, selected tool name, XML source, and unexpected egress around parser spans.
- Dashboard signals - track eval-fail-rate-by-parser, blocked XML request rate, parser-error rate, p99 parse latency, and upload failure rate.
- User-feedback proxies - monitor reports of leaked internal text, failed document uploads, and escalations tied to XML-heavy workflows.
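from fi.evals import XXEDetector

# Code snippet to scan: a Java XML parser factory created with default settings.
xml_parser_code = "DocumentBuilderFactory.newInstance()"

# Run the detector on the snippet and print its finding.
detector = XXEDetector()
result = detector.evaluate(code=xml_parser_code)
print(result)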
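A dashboard signal such as eval-fail-rate-by-parser can be derived from stored eval results. The sketch below assumes each result has already been reduced to a `(parser_library, passed)` pair. That record shape and the function name `fail_rate_by_parser` are assumptions made for illustration, not a FutureAGI export format.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def fail_rate_by_parser(results: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Compute the fraction of failed evals for each parser library."""
    totals: Dict[str, int] = defaultdict(int)
    failures: Dict[str, int] = defaultdict(int)
    for parser, passed in results:
        totals[parser] += 1
        if not passed:
            failures[parser] += 1
    # Every parser in totals has at least one result, so no division by zero.
    return {parser: failures[parser] / totals[parser] for parser in totals}
```

Grouping by parser library rather than by tool makes it easier to see when one shared library configuration is responsible for failures across several agent tools.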
Common Mistakes
- Allowing DTDs because test XML never uses them. XXE payloads rely on parser features most business schemas do not need.
- Validating XML schema but not parser configuration. XSD validation does not disable external entity resolution, remote fetching, or XInclude.
- Scanning web endpoints only. Agent risk often lives in offline converters, background workers, and document ingestion jobs.
- Logging parsed XML after failure. Error logs can capture secrets or local file content produced by the malicious entity.
- Fixing one parser language. Multi-service agents may parse XML in Java, Python, Node, and vendor SDK adapters.
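For the logging mistake above, one option is to record only metadata about a failed payload rather than its content. The following is a minimal sketch using Python's standard-library `hashlib`; the function name `parse_failure_log_fields` and the field names are hypothetical.

```python
import hashlib
from typing import Dict, Union


def parse_failure_log_fields(
    xml_bytes: bytes, error: Exception
) -> Dict[str, Union[str, int]]:
    """Build log fields for a parse failure without including payload content."""
    return {
        # Log the exception type only; the message may echo document content.
        "error_type": type(error).__name__,
        # A digest lets responders match the payload to a stored sample
        # without writing the payload itself into logs.
        "payload_sha256": hashlib.sha256(xml_bytes).hexdigest(),
        "payload_bytes": len(xml_bytes),
    }
```

The digest and size are usually enough to correlate a log line with a quarantined upload, assuming the original payload is retained somewhere with tighter access controls.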
Frequently Asked Questions
What is an XXE attack?
An XXE attack is a security vulnerability where an XML parser resolves attacker-controlled external entities and exposes local files, internal services, or secrets. In LLM and agent systems, it matters when tools ingest XML from uploads, URLs, APIs, emails, or documents.
How is an XXE attack different from SSRF?
XXE is specifically caused by unsafe XML external entity resolution. SSRF is the broader pattern of making a server call an attacker-chosen URL; XXE can become SSRF when the XML parser fetches internal network resources.
How do you measure an XXE attack?
Use FutureAGI's `XXEDetector` to flag XML External Entity vulnerabilities in parser code and tool paths. Then track detector findings, blocked XML requests, unexpected egress, and regression eval failures.