Critical XXE Vulnerability CVE-2025-66516 (CVSS 10.0) in Apache Tika Enables File Disclosure, SSRF, and Remote Code Execution – Immediate Patch Required

  • Rescana
  • 38 minutes ago

Executive Summary

A critical XML External Entity (XXE) injection vulnerability, CVE-2025-66516 (CVSS 10.0), has been identified in Apache Tika, a widely used content analysis toolkit. This vulnerability enables unauthenticated attackers to exploit the PDF parsing functionality, leading to arbitrary file disclosure, Server-Side Request Forgery (SSRF), and, under certain conditions, remote code execution. The flaw is present in multiple Apache Tika modules, including tika-core, tika-parser-pdf-module, and tika-parsers, and affects all major platforms. Given the prevalence of Apache Tika in document processing pipelines, search engines, and compliance solutions, this vulnerability poses a severe risk to organizations globally. Immediate patching is strongly advised to mitigate the risk of exploitation and potential data breaches.

Technical Information

CVE-2025-66516 is a maximum-severity XXE vulnerability affecting the XML Forms Architecture (XFA) processing within the PDF parser of Apache Tika. The vulnerability arises from improper handling of external XML entities when parsing XFA forms embedded in PDF files. When a malicious PDF containing a specially crafted XFA payload is processed, the vulnerable Apache Tika modules resolve external entities, allowing attackers to read arbitrary files from the server’s filesystem, initiate SSRF attacks, and, in certain configurations, execute arbitrary code.

The vulnerability is rooted in the tika-core module, which is responsible for the core parsing logic. The issue also affects the tika-parser-pdf-module and, in legacy deployments, the tika-parsers module. The attack vector is straightforward: an attacker submits or uploads a malicious PDF to any service that uses a vulnerable version of Apache Tika for content extraction or document analysis. No authentication is required, and the attack can be performed remotely over the network.

The technical mechanism leverages the XML parser’s ability to resolve external entities. By embedding in a PDF an XFA form whose XML carries a DOCTYPE declaration referencing external resources (such as file:///etc/passwd on Unix systems or file:///C:/Windows/System32/config/SAM on Windows), the attacker can cause the parser to read sensitive files and return their contents. If the server has outbound network access, the attacker can also force the server to make HTTP requests to internal or external systems, facilitating SSRF. Remote code execution is a less direct outcome: it generally depends on chaining the XXE with something else in the environment, such as credentials recovered from disclosed files or an internal service reachable through SSRF, and is therefore possible only in certain configurations.
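To make the mechanism concrete, the following Python sketch shows the general shape of an external-entity declaration and a simple textual check for one. The sample XML and the function name declares_external_entity are illustrative assumptions; they are not taken from the Apache advisory or from any published PoC. The check is a triage heuristic only and does not replace disabling DTD and external-entity processing in the parser itself.

```python
import re

# Generic XXE shape: a DOCTYPE whose internal subset declares an entity
# bound to an external resource through SYSTEM (or PUBLIC) and a URI.
SAMPLE_XXE = """<?xml version="1.0"?>
<!DOCTYPE data [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<data>&xxe;</data>"""

SAMPLE_BENIGN = """<?xml version="1.0"?>
<data>hello</data>"""

# Matches general (<!ENTITY name SYSTEM ...>) and parameter
# (<!ENTITY % name SYSTEM ...>) entity declarations with an external ID.
# XML keywords are case-sensitive, so no case folding is applied.
_EXTERNAL_ENTITY = re.compile(
    r"<!ENTITY\s+(?:%\s+)?[^\s>]+\s+(?:SYSTEM|PUBLIC)\s"
)

# Matches a DOCTYPE that pulls in an external DTD subset.
_EXTERNAL_DTD = re.compile(r"<!DOCTYPE\s+[^\s\[>]+\s+(?:SYSTEM|PUBLIC)\s")


def declares_external_entity(xml_text: str) -> bool:
    """Return True if the text declares an external entity or external DTD."""
    return bool(
        _EXTERNAL_ENTITY.search(xml_text) or _EXTERNAL_DTD.search(xml_text)
    )
```

A purely internal entity such as <!ENTITY a "b"> is not flagged, because it has no SYSTEM or PUBLIC identifier and cannot by itself reach the filesystem or the network.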

CVE-2025-66516 expands the scope of CVE-2025-54988: it clarifies that the root cause lies in tika-core and is not limited to the PDF parser module. As a result, users who upgraded only the PDF parser module without also updating tika-core remain vulnerable. In Apache Tika 1.x, the PDFParser resided in the tika-parsers module, so legacy deployments are also at risk.

The affected modules and versions are as follows: org.apache.tika:tika-core versions 1.13 through 3.2.1 (patched in 3.2.2), org.apache.tika:tika-parser-pdf-module versions 2.0.0 through 3.2.1 (patched in 3.2.2), and org.apache.tika:tika-parsers versions 1.13 through 1.28.5 (patched in 2.0.0). All platforms, including Linux, Windows, and macOS, are affected.

The vulnerability is classified under CWE-611 (Improper Restriction of XML External Entity Reference) and is mapped to MITRE ATT&CK techniques T1190 (Exploit Public-Facing Application) and T1048 (Exfiltration Over Alternative Protocol).

Exploitation in the Wild

As of the latest public advisories, there are no confirmed reports of active exploitation of CVE-2025-66516 in the wild. However, the attack surface is significant due to the widespread use of Apache Tika in document ingestion pipelines, search engines such as Apache Solr and Elasticsearch, compliance and e-discovery platforms, and cloud-based content analysis services. The vulnerability is trivial to exploit, requiring only the submission of a malicious PDF file to a vulnerable service.

Proof-of-concept (PoC) exploits have been published and referenced in the official Apache mailing list advisory. These PoCs demonstrate how a crafted PDF with an embedded XFA form can trigger the XXE vulnerability, leading to file disclosure and SSRF. Security researchers and threat intelligence platforms, including The Hacker News, SecurityAffairs, and SOCRadar, have highlighted the criticality and ease of exploitation.

Given the network attack vector, lack of authentication requirements, and the availability of public PoCs, exploitation is considered imminent. Organizations should assume that opportunistic attackers and advanced threat actors will rapidly weaponize this vulnerability.

APT Groups using this vulnerability

As of this advisory, exploitation of CVE-2025-66516 has not been publicly attributed to any specific Advanced Persistent Threat (APT) group. However, XXE vulnerabilities are a well-established attack vector for both state-sponsored and financially motivated threat actors. Historically, APT groups targeting government agencies, financial institutions, and enterprises have exploited XXE flaws in document processing and content management systems to gain initial access, exfiltrate sensitive data, and pivot within internal networks.

Given the criticality and ubiquity of Apache Tika, it is highly likely that APT groups and cybercriminal organizations will incorporate this vulnerability into their toolkits. Sectors at elevated risk include government, legal, financial, healthcare, and any organization processing untrusted documents at scale.

Affected Product Versions

The following Apache Tika modules and versions are affected by CVE-2025-66516: org.apache.tika:tika-core versions 1.13 through 3.2.1, org.apache.tika:tika-parser-pdf-module versions 2.0.0 through 3.2.1, and org.apache.tika:tika-parsers versions 1.13 through 1.28.5. The vulnerability is remediated in tika-core version 3.2.2, tika-parser-pdf-module version 3.2.2, and tika-parsers version 2.0.0. All operating systems and deployment environments are impacted, including on-premises, cloud, and hybrid infrastructures.

Workaround and Mitigation

Immediate patching is the most effective mitigation. Organizations must upgrade all affected Apache Tika modules to the latest secure versions: tika-core 3.2.2 or later, tika-parser-pdf-module 3.2.2 or later, and tika-parsers 2.0.0 or later. It is critical to ensure that all dependencies are updated, as partial upgrades (such as updating only the PDF parser module) do not fully remediate the vulnerability.
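Because a partial upgrade leaves the vulnerability in place, it can help to check every resolved Tika artifact against the affected ranges rather than a single module. The sketch below encodes the ranges stated in this advisory. The function names (parse_version, is_affected, audit) and the groupId:artifactId:version input format are assumptions chosen for illustration, and the version parser only handles plain dotted numeric versions with an optional hyphenated qualifier.

```python
# Affected ranges as stated in this advisory: (first affected, last affected).
AFFECTED = {
    "tika-core": ((1, 13, 0), (3, 2, 1)),
    "tika-parser-pdf-module": ((2, 0, 0), (3, 2, 1)),
    "tika-parsers": ((1, 13, 0), (1, 28, 5)),
}


def parse_version(text: str) -> tuple:
    """Turn '3.2.1' into (3, 2, 1); a qualifier such as '-SNAPSHOT' is dropped."""
    core = text.split("-", 1)[0]
    parts = [int(part) for part in core.split(".")]
    parts += [0] * (3 - len(parts))  # pad '1.13' to (1, 13, 0)
    return tuple(parts)


def is_affected(artifact: str, version: str) -> bool:
    """Return True if the artifact version lies inside its affected range."""
    bounds = AFFECTED.get(artifact)
    if bounds is None:
        return False
    low, high = bounds
    return low <= parse_version(version) <= high


def audit(coordinates):
    """Return 'artifact:version' for each affected org.apache.tika coordinate.

    Each input item is assumed to look like 'groupId:artifactId:version'.
    """
    flagged = []
    for coord in coordinates:
        group, artifact, version = coord.split(":")
        if group == "org.apache.tika" and is_affected(artifact, version):
            flagged.append(f"{artifact}:{version}")
    return flagged


# A partial upgrade: the PDF module is patched but tika-core is not.
partial_upgrade = [
    "org.apache.tika:tika-core:3.2.1",
    "org.apache.tika:tika-parser-pdf-module:3.2.2",
]
still_vulnerable = audit(partial_upgrade)  # flags tika-core:3.2.1
```

In this example the deployment is still flagged because tika-core remains at 3.2.1, which is the situation the advisory warns about.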

For environments where immediate patching is not feasible, temporary workarounds include disabling the PDF parser functionality in Apache Tika and implementing pre-processing filters to reject PDFs containing /AcroForm or XFA references. Application-layer firewalls or web application firewalls (WAFs) can be configured to detect and block XML payloads with external entity declarations. Additionally, organizations should review and restrict the file system and network permissions of document processing services to minimize the impact of potential exploitation.
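The pre-processing filter mentioned above could be sketched as follows. This is a minimal illustration under stated assumptions, not a complete defense: the function name should_reject_pdf is hypothetical, and the scan only sees name tokens in the raw bytes of the file. It will miss /XFA or /AcroForm entries stored inside compressed object streams or encrypted documents, and it will also reject legitimate PDFs that contain ordinary forms.

```python
import re

# A PDF name starts with '/' and runs until whitespace or a delimiter.
_NAME_TOKEN = re.compile(rb"/([^\s\x00()<>\[\]{}/%]+)")

# Inside a name, any byte may be written as '#' plus two hex digits,
# so '/XFA' can also appear as, for example, '/#58FA'.
_HEX_ESCAPE = re.compile(rb"#([0-9A-Fa-f]{2})")

_BLOCKED_NAMES = {b"XFA", b"AcroForm"}


def _decode_name(raw: bytes) -> bytes:
    """Resolve '#xx' hex escapes in a PDF name token."""
    return _HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), raw)


def should_reject_pdf(data: bytes) -> bool:
    """Return True if the raw bytes contain an /XFA or /AcroForm name."""
    return any(
        _decode_name(match.group(1)) in _BLOCKED_NAMES
        for match in _NAME_TOKEN.finditer(data)
    )
```

A filter of this kind would run before the file is handed to Apache Tika, for example as `should_reject_pdf(open(path, "rb").read())`. Given the limitations above, it is best treated as a stopgap until the patched versions are deployed.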

Security teams should audit all applications and services utilizing Apache Tika, especially those exposed to untrusted document uploads. Monitoring for indicators of compromise is essential. Key IOCs include suspicious PDF uploads with XFA/XXE payloads, unexpected outbound network requests from Tika servers (indicative of SSRF), and unusual file access patterns targeting sensitive files such as /etc/passwd or Windows SAM.
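For the outbound-request indicator, one possible triage step is to compare the destinations contacted by a Tika host against an allowlist and to highlight loopback, link-local, and private addresses, which are typical SSRF targets. The sketch below assumes the destinations have already been extracted from egress or proxy logs; the function names and the "internal-ip" and "other" labels are hypothetical. Hostnames are not resolved, so a hostname that points at an internal address is labeled "other" and still needs review.

```python
import ipaddress


def _is_internal_ip(host: str) -> bool:
    """True for literal IPs that are loopback, link-local, or private."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # not a literal IP address (for example, a hostname)
    return addr.is_loopback or addr.is_link_local or addr.is_private


def triage_egress(destinations, allowed=frozenset()):
    """Return (destination, label) for every destination not on the allowlist."""
    findings = []
    for dest in destinations:
        if dest in allowed:
            continue
        label = "internal-ip" if _is_internal_ip(dest) else "other"
        findings.append((dest, label))
    return findings
```

A document-parsing service usually has little reason to initiate connections of its own, so any finding from a Tika host merits a look; link-local addresses such as 169.254.169.254 are commonly used for cloud instance metadata endpoints and deserve priority.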

Network segmentation and isolation of document processing services from sensitive internal resources are recommended to reduce lateral movement opportunities in the event of compromise.

References

NVD Entry: https://nvd.nist.gov/vuln/detail/CVE-2025-66516
Apache Advisory: https://lists.apache.org/thread/s5x3k93nhbkqzztp1olxotoyjpdlps9k
CVE-2025-54988: https://cve.org/CVERecord?id=CVE-2025-54988
The Hacker News Coverage: https://thehackernews.com/2025/12/critical-xxe-bug-cve-2025-66516-cvss.html
SecurityAffairs: https://securityaffairs.com/185363/security/maximum-severity-xxe-vulnerability-discovered-in-apache-tika.html
CCB Belgium Advisory: https://ccb.belgium.be/advisories/warning-critical-vulnerability-apache-tika-modules-can-lead-data-exfiltration-and
SOCRadar CVE Radar: https://socradar.io/labs/app/cve-radar/cve-2025-66516

Rescana is here for you

Rescana is committed to helping organizations manage third-party risk and strengthen their cybersecurity posture. Our TPRM platform provides continuous monitoring, automated risk assessment, and actionable intelligence to help you identify and mitigate vulnerabilities across your supply chain and digital ecosystem. If you have questions about this advisory or require assistance with vulnerability management, incident response, or third-party risk, our team is ready to help. Please contact us at ops@rescana.com.
