Visionaries Have Democratised Remote Network Access - Citrix Virtual Apps and Desktops (CVE Unknown)
====================================================================================================
![Visionaries Have Democratised Remote Network Access - Citrix Virtual Apps and Desktops (CVE Unknown)](https://images.seebug.org/1731577663789-w331s)
Well, we’re back again, with yet another fresh-off-the-press bug chain (and associated Interactive Artifact Generator). This time, it’s in Citrix’s “Virtual Apps and Desktops” offering.
![](https://images.seebug.org/1731577666515-w331s)
This is a tech stack that enables end-users (and likely, your friendly neighbourhood ransomware gang) to access their full desktop environment from just about anywhere, whether they’re using a laptop, tablet, or even a phone.
It’s essentially the ‘thin client’ experience that people were very excited about some 30 years ago - instead of having software and files stored on each individual device, the application (or entire desktop) runs on a big meaty server safely tucked away in a datacenter, and is streamed to end-users via the network. It’s sort of like Remote Desktop, but more enterprise-y.
It is, of course, a bit more polished than Remote Desktop, fully capable of being used by an end user. Here’s an example (courtesy of Citrix) of a user running Chrome via their organization’s setup.
![](https://images.seebug.org/1731577668288-w331s)
Generally speaking, enterprises _love_ this kind of tech. This is for a few different reasons:
* **Full Desktop Experience**: When done right, users will ‘feel’ like they’re using their own desktop, with all their usual apps, files, and settings. Oftentimes, users can’t even tell that there’s an entire network between them and their application, but in reality, it’s all hosted somewhere else. This means that users can pick up right where they left off, no matter what device they are on.
* **Easy Management**: Virtual desktops are a dream for IT teams to manage (again with the caveat of ‘when done right’). Administrators can update, secure, and troubleshoot from one central location rather than on every employee's device. Imagine a world where the only things you have to troubleshoot outside the server room are keyboards, mice, and monitors. That’s the promise.
* **✨✨Enhanced Security✨✨ (…………………………)**: Since data is stored in a central location, it’s easier to keep secure, or so the theory goes. If someone loses their device, there’s no data on it; it’s all safely stored in the cloud. You don’t have to revoke anything or remotely wipe any devices.
* **Remote Work Ready**: Users can log in to their desktop from home, the office, or anywhere else they can get to the network, which is ideal for flexible work arrangements and keeps everything consistent.
Without sounding like a Citrix salesperson, Virtual Desktops gives users a seamless desktop experience wherever they are, and companies get easier management and something about security benefits, we assume.
Attackers, of course, also love this technology, but for a very different reason.
Having your enterprise applications published interactively and ready for your users can also mean having them ready for attackers, just one authentication step away - and we’ve certainly seen that attackers have enjoyed some great successes, with [CVE-2024-6151](https://www.cve.org/CVERecord?id=CVE-2024-6151&ref=labs.watchtowr.com) being the most recent example.
This one is a privesc bug yielding SYSTEM privileges for any VDI user, which is a lot worse than it might initially sound: that’s SYSTEM privileges on the _server_ that _hosts_ all the applications, and access to that server is ‘by design’. An attacker can therefore impersonate any user (including administrators) and monitor their behaviour and connectivity.
Since everything is so seamless and portable, it’s an easy jump from there to impersonating users or ‘shadowing’ them, observing their every action. The centralized administration system can easily become a panopticon.
However, we haven’t seen many true unauthenticated RCE bugs reported.
We came up with 6 potential reasons:
1. Perhaps this is due to the architecture of thin-client solutions, in which privilege escalation attacks are so powerful.
2. Perhaps these solutions were built in a way that reflected the necessary security posture of a remote desktop access tool for enterprises?
3. Perhaps these solutions were built in a way that reflected the necessary security posture of a remote desktop access tool for enterprises?
4. Perhaps these solutions were built in a way that reflected the necessary security posture of a remote desktop access tool for enterprises?
5. Perhaps these solutions were built in a way that reflected the necessary security posture of a remote desktop access tool for enterprises?
6. Perhaps these solutions were built in a way that reflected the necessary security posture of a remote desktop access tool for enterprises?
Regardless, Citrix’s Virtual Apps and Desktops solution is a huge, complex system with many moving parts (we’ll zoom in on just one of them and tear it to pieces shortly).
Everyone knows the more complex things get, the more chance of bugs, and while the vendor puts a lot of effort into securing it \[citation needed\] , there must be _some_ RCE out there? Surely?
Given the historical lack of weaknesses demonstrating an apparent lack of world-ending destruction, we decided to take a closer look at the product.
Did Citrix democratise remote network access? Removing this privilege from the authorised?
Citrix Session Recording - AKA ‘the source of all evil’
-------------------------------------------------------
One thing that a ‘thin client’-esque solution lends itself to very well is monitoring - also known as 'stalking', 'audit functions', or just 'the ability for administrators to see what a logged-in user is doing'.
In the context of a VDI solution, the session data being sent to the client is literally a video stream. Thus, it’s “easy” for an authorized administrator to tee off a feed and watch for themselves.
Citrix have taken this a little bit further with their feature “Session Recording”. This feature captures user activity, recording keyboard and mouse input, along with the video stream of the desktop’s reaction. It’s something akin to a recording of a virtual machine session (or, your APT friend spamming their Cobalt Strike beacon with the ‘screenshot’ command).
Citrix advertise the feature as being really useful for monitoring (somewhat obviously), but also for compliance and troubleshooting. It can even be set up so that certain actions (like identifying sensitive data) will trigger recording, which helps meet regulatory needs and flag suspicious activities.
![](https://images.seebug.org/1731577670360-w331s)
It’s also invaluable for troubleshooting, as end-users can ‘record’ a problem manifesting, showing the technical team exactly what happens (rather than just opening a ticket with the subject of “it doesn’t work”).
Here’s an example of what a user sees when their session is recorded:
![](https://images.seebug.org/1731577672272-w331s)
Overall, it provides a secure record of user activity, helping with audits, detecting unusual behavior, and diagnosing problems. Now, here is what a session recording flow looks like from a user’s perspective:

        User Login
             |
             v
    +-------------------+
    |   Citrix Login    |
    +-------------------+
             |
             v
    +------------------------+
    | Virtual Apps and       |
    | Desktops Environment   |
    +------------------------+
             |
             v
    +------------------------+
    | Start Session          |
    +------------------------+
             |
             v
    +------------------------------------+
    | Session Recording Service          |
    | - Monitors the session             |
    | - Starts recording if conditions   |
    |   are met (e.g., policy triggered) |
    +------------------------------------+
             |
             v
    +------------------------------------+
    | Record User Actions                |
    | - Keystrokes, application access,  |
    |   and screen activity              |
    +------------------------------------+
             |
             v
    +------------------------------------+
    | Store Session Recording            |
    | - Saved in a secure repository     |
    | - Available for review by admins   |
    +------------------------------------+

One of our key motivations was to uncover the architecture behind this feature—how Citrix engineers approached recording a user’s session, handling the data securely, and transmitting it within the environment. This isn't just a matter of "starting a screen recorder."
Questions came to mind:
* Which process captures the session?
* How is the recording stored?
* Are multiple components involved, or does a single process oversee the entire operation?
The answers to these questions map out a critical attack surface, which is exactly what makes this feature so interesting from a security perspective.
Let’s be clear: recording user sessions reliably, and at scale, as Citrix does, is extremely challenging.
This isn’t like opening QuickTime and pressing “record” on a laptop; this is a web application that’s streaming and recording multiple remote desktop sessions simultaneously in a secure, enterprise-grade product.
The sheer technical demand, combined with the complexity of coordinating processes that handle real-time streaming, storage, and secure data transfer, means there are countless intricate moving parts. And as we know, complexity often breeds opportunities for mistakes—mistakes that sometimes lead to serious vulnerabilities.
We wanted to peel back the layers of this sophisticated feature, bringing awareness to the complexities Citrix engineers navigate and why such features deserve rigorous scrutiny. Understanding these details underscores the impressive scope of Citrix's technology, while also illuminating the potential for security gaps within a powerful but complex system.
Eventually, after reviewing processes and examining the documentation, we came up with the following diagram.
Citrix Session Recording Conceptual Architecture

    +-------------+        +---------+        +----------------------------+
    |    User     | -----> | Citrix  | -----> |      Virtual Desktops      |
    +-------------+        +---------+        |        and Servers         |
                                              |                            |
                                              | +------------------------+ |
                                              | |   Windows 10/11 VDA    | |
                                              | +------------------------+ |
                                              +----------------------------+
                                                            |
                                                            v
                            +----------------------------------+
                            |     Session Recording Server     |
                            +----------------------------------+
                                |               |             |
                                v               v             v
               +------------------+ +----------------+ +-----------+
               | Recording Policy | |   Recording    | | Database  |
               |     Console      | |     Player     | +-----------+
               +------------------+ +----------------+

As you can see, there is a “Session Recording Server”, which can be on the same machine where Citrix Virtual Apps and Desktops is installed or on a separate machine.
When the Citrix main component records the session, it passes it to this ‘Session Recording Server’, which then stores the recording in a database, along with the metadata you’d expect—the user who submitted it, the date, and suchlike.
![](https://images.seebug.org/1731577674417-w331s)
At a later date, an authorized administrator can then examine the session footage using the Player component, which queries the database attached to the Session Recording Server and retrieves the relevant information.
![](https://images.seebug.org/1731577678756-w331s)
Finally, the ‘Recording Policy Console’ is the component that exposes more fine-grained control to administrators, allowing specific triggers to be set for when the session should start recording. For example, you could start a recording session every time a user accessed a particularly high-value file share.
![](https://images.seebug.org/1731577681548-w331s)
Okay, so now we understand the roles of these different components, but one question remains: how do these components communicate with each other? Is the communication via network sockets? Maybe named pipes, or some kind of shared memory?
To answer these questions, it’s time to dig around in the filesystem and find the executables involved. Fortunately, this is straightforward, as Citrix helpfully keeps an organized folder structure. We soon found a directory named ‘SessionRecording’ - what could it hold but components relating to session recording?
![](https://images.seebug.org/1731577683703-w331s)
Perfect. We’ve got a bunch of folders here for the various components of session recording - the database component, for example, the player, and (most importantly!) the ‘Server’ component, where all the juicy logic is likely to be tucked away waiting for us.
We started to take a closer look at some of the components involved, looking for any sign of communication with other components. While doing so, we stumbled upon an executable named “SsRecStorageManager.exe” - which, intriguingly, was running by default as a Windows service.
![](https://images.seebug.org/1731577685521-w331s)
With a filename like that, this must be a component that handles the storage of session recordings. And the icon of a video camera is just _too tempting_ to ignore - what hacker doesn’t want to be able to spy on sessions like a video camera?!
Of course, being the smart hackers we are, our first port of call is simply to read the documentation for this component (rather than dive right in with a decompiler). We did a quick search and found the following piece of [documentation](https://www.eginnovations.com/documentation/Citrix-Session-Recording-Server/Citrix-Session-Recording-Storage-Manager-Test.htm?ref=labs.watchtowr.com):
> Citrix Session Recording Storage Manager is a Windows service that manages the recorded session files received from each Session Recording-enabled computer running XenApp and XenDesktop. The Storage Manager receives the session recordings as message bytes via the Microsoft Message Queuing (MSMQ) service. To maintain the integrity of the recordings at all times, the Storage Manager should be able to manage the received messages as quickly as they are sent by the Session Recording agent.
Okay, great - now we have a general understanding of what this service does. It takes the recorded session files, receiving them via MSMQ. This component, MSMQ, or ‘Microsoft Message Queuing’, simply allows two separate processes to communicate via a ‘queue’ - for example, one side might enqueue a message along the lines of ‘list all the recordings in the database’, and then another application might pick up this message from the queue and respond by placing a list of recording data back into the queue.
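To make that request/response pattern concrete, here is a minimal in-process sketch. It uses Python's standard `queue.Queue` as a stand-in for MSMQ (the real thing is a Windows service with durable, cross-machine queues), and the message text and the `RECORDINGS` list are invented purely for illustration.

```python
import queue
import threading

# Stand-ins for two MSMQ queues: one carries requests, one carries replies.
requests = queue.Queue()
responses = queue.Queue()

# Hypothetical data that the consuming side knows about.
RECORDINGS = ["session-001.icl", "session-002.icl"]


def storage_side():
    """Dequeue one request and enqueue a reply, like a consuming service would."""
    message = requests.get()
    if message == "list recordings":
        responses.put(list(RECORDINGS))
    else:
        responses.put([])


def ask_for_recordings():
    """Enqueue a request, let the other side handle it, then read the reply."""
    worker = threading.Thread(target=storage_side)
    worker.start()
    requests.put("list recordings")
    reply = responses.get(timeout=5)
    worker.join()
    return reply


if __name__ == "__main__":
    print(ask_for_recordings())
```

Neither side calls the other directly; they only agree on the queue and on the shape of the messages, which is exactly why the message format matters so much.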
There’s an important detail that this process implies, however.
Because the queue deals with data that travels between processes (and even between entire machines), some kind of conversion is needed for the objects placed in the queue. We can’t simply dump chunks of memory in there, since they might not be understood by the receiving end - we need some kind of _Serialization_ process to convert the data into a form that can then be interpreted by the other end. (.NET enthusiasts will also hear the closely related term ‘marshalling’ for moving objects across such boundaries.)
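As a rough illustration of that round trip, the sketch below flattens an object into bytes and rebuilds it on the other side. The `RecordingEvent` shape is hypothetical, and JSON is used only because it is simple and ships with Python; it is not the format the product uses.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class RecordingEvent:
    # Hypothetical message shape, invented for this sketch.
    user: str
    server: str
    event_time: str


def serialize(event):
    """Flatten the object into bytes that can cross a process boundary."""
    return json.dumps(asdict(event)).encode("utf-8")


def deserialize(payload):
    """Rebuild an equivalent object on the receiving side."""
    return RecordingEvent(**json.loads(payload.decode("utf-8")))


if __name__ == "__main__":
    original = RecordingEvent("alice", "vda01", "2024-11-12T10:00:00Z")
    wire_bytes = serialize(original)
    print(wire_bytes)
    print(deserialize(wire_bytes) == original)
```

Note that in this sketch the receiver decides which type to build. Formats that let the _sender_ name the type to instantiate are where things tend to get interesting.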
Of course, complexity is a great place for bugs to hide, and historically, serialization interfaces have proven a great way for attackers to surreptitiously insert their own data, which the trusting application will then deserialize and process as if it originated from a trusted party. There’s a minor detail here, in that the MSMQ component isn’t actually exposed to the network via TCP, but don’t fret - we will deal with that part later.
Casting our gaze back over the documentation, it is stated that session recording data is transferred simply as ‘message bytes’. Just reading this triggered our ‘spidey sense’ and made us want to learn more about how these ‘message bytes’ are transferred - how are they serialized, and are we able to abuse the deserialization process? So, we broke out our trusty decompiler and started dissecting the service. For those following along, we performed this analysis on version `Citrix_Virtual_Apps_and_Desktops_7_2402_LTSR`.
Reviewing a codebase of this size is always a time-consuming task, and patience is required as we exhaustively audit code. Eventually, though, we came across a class named `SmAudStorageManager.EventMetadataWithTime`, which stood out to us. Take a look:

    using System;
    using SmAudCommon;

    namespace SmAudStorageManager
    {
        // Token: 0x0200000A RID: 10
        [Serializable]
        internal class EventMetadataWithTime
        {
            // Token: 0x04000048 RID: 72
            public EventMetadata m_eventMetadata;

            // Token: 0x04000049 RID: 73
            public DateTime m_eventTime;

            // Token: 0x0400004A RID: 74
            public DateTime m_eventUtcTime;

            // Token: 0x0400004B RID: 75
            public string m_user;

            // Token: 0x0400004C RID: 76
            public string m_domain;

            // Token: 0x0400004D RID: 77
            public string m_server;

            // Token: 0x0400004E RID: 78
            public Guid m_ctxSessionID;
        }
    }

One can immediately notice this class has been marked with the `[Serializable]` attribute, which raises our suspicion that there might be some sort of de/serialization being performed by this executable.
Indeed, the [.NET documentation](https://learn.microsoft.com/en-us/dotnet/api/system.serializableattribute?view=net-8.0&ref=labs.watchtowr.com) states simply that this attribute “Indicates that a class can be serialized using binary or XML serialization”. It seems very likely that this executable is serializing data ready for the MSMQ queue.
From here, we followed the footprints of serialization usage across the codebase, decompiling more libraries to find what serialization API is being used, ultimately intending to check if serialization is being used in an insecure fashion.
After looking at countless different methods, we encountered the `SmAudStorageManager.ProjectInstaller.Install(IDictionary)` method.
Let's examine its implementation closely - pay attention to the code snippet down below while you’re following along.
You can notice that line (41) instantiates a `MessageQueue`, the `System.Messaging.MessageQueue` class that .NET provides for working with MSMQ.
Then, from line (45) to line (48), permissions are set on this queue instance by calling the `SetPermissions` method.

              public override void Install(IDictionary stateSaver)
              {
    /*   1 */     try
    /*   2 */     {
    /*   3 */         Trace.WriteLine(string.Format("\nBegin Session Recording Storage Manager install @ {0} ...", DateTime.Now));
    /*   4 */         Trace.WriteLine("Determining service dependencies...");
    /*   5 */         ArrayList arrayList = new ArrayList();
    /*   6 */         arrayList.Add("Eventlog");
    /*   7 */         arrayList.Add("MSMQ");
    /*   8 */         this.AddServiceNameIfExists(arrayList, "COMSysApp");
    /*   9 */         this.AddServiceNameIfExists(arrayList, "EventSystem");
    /*  10 */         this.serviceInstaller.ServicesDependedOn = (string[])arrayList.ToArray(typeof(string));
    /*  11 */         Trace.WriteLine(" Service depends on:");
    /*  12 */         foreach (string text in this.serviceInstaller.ServicesDependedOn)
    /*  13 */         {
    /*  14 */             Trace.WriteLine(string.Format(" {0}", text));
    /*  15 */         }
    /*  16 */         try
    /*  17 */         {
    /*  18 */             Trace.WriteLine("CitrixSsRecStorageManager: base.Install begin");
    /*  19 */             this.UninstallIfExists("CitrixSsRecStorageManager");
    /*  20 */             base.Install(stateSaver);
    /*  21 */             Trace.WriteLine("CitrixSsRecStorageManager: base.Install end");
    /*  22 */         }
    /*  23 */         catch (Exception ex)
    /*  24 */         {
    /*  25 */             ExceptionHelper.TraceException(ex);
    /*  26 */             throw;
    /*  27 */         }
    /*  28 */         try
    /*  29 */         {
    /*  30 */             ServiceSidController.AddServiceSid("CitrixSsRecStorageManager", ServiceSidController.SERVICE_SID_TYPE.SERVICE_SID_TYPE_UNRESTRICTED);
    /*  31 */         }
    /*  32 */         catch (Exception ex2)
    /*  33 */         {
    /*  34 */             ExceptionHelper.TraceException(ex2);
    /*  35 */             throw;
    /*  36 */         }
    /*  37 */         try
    /*  38 */         {
    /*  39 */             Trace.WriteLine("Begin set MSMQ permissions...");
    /*  40 */             Trace.WriteLine(string.Format(" Begin open queue {0}...", this.messageQueueInstaller.Path));
    /*  41 */             MessageQueue messageQueue = new MessageQueue(this.messageQueueInstaller.Path);
    /*  42 */             Trace.WriteLine(" End open queue");
    /*  43 */             try
    /*  44 */             {
    /*  45 */                 messageQueue.SetPermissions(ProjectInstaller.GetLocalizedTrusteeName(WellKnownSidType.BuiltinAdministratorsSid), MessageQueueAccessRights.FullControl, AccessControlEntryType.Allow);
    /*  46 */                 messageQueue.SetPermissions(ProjectInstaller.GetLocalizedTrusteeName(WellKnownSidType.LocalSystemSid), MessageQueueAccessRights.FullControl, AccessControlEntryType.Allow);
    /*  47 */                 messageQueue.SetPermissions(ProjectInstaller.GetLocalizedTrusteeName(WellKnownSidType.NetworkServiceSid), MessageQueueAccessRights.FullControl, AccessControlEntryType.Allow);
    /*  48 */                 messageQueue.SetPermissions(ProjectInstaller.GetLocalizedTrusteeName(WellKnownSidType.AnonymousSid), MessageQueueAccessRights.GenericWrite, AccessControlEntryType.Allow);
    /*  49 */                 Trace.WriteLine("End set MSMQ permissions");
    /*  50 */             }
    /*  51 */             catch (Exception ex3)
    /*  52 */             {
    /*  53 */                 throw ex3;
    /*  54 */             }
    /*  55 */             finally
    /*  56 */             {
    /*  57 */                 messageQueue.Close();
    /*  58 */             }
    /*  59 */         }
    /*  60 */         catch (Exception ex4)
    /*  61 */         {
    /*  62 */             ExceptionHelper.TraceException(ex4);
    /*  63 */         }
    /*  64 */         try
    /*  65 */         {
    /*  66 */             string signingCertificateThumbprint = RegistryConfiguration.Server.SigningCertificateThumbprint;
    /*  67 */             if (signingCertificateThumbprint != null && signingCertificateThumbprint != string.Empty)
    /*  68 */             {
    /*  69 */                 CertSecurityHelper.AddCertAccessRuleThroughThumbprint(signingCertificateThumbprint, CertSecurityHelper.StorageManagerServiceUser);
    /*  70 */                 Trace.WriteLine(string.Format("Install AddCertAccessRule Finished.", Array.Empty<object>()));
    /*  71 */             }
    /*  72 */         }
    /*  73 */         catch (Exception ex5)
    /*  74 */         {
    /*  75 */             this.SendToMsiLog(string.Format("Install AddCertAccessRule Exception: {0}.", ex5.Message));
    /*  76 */         }
    /*  77 */         try
    /*  78 */         {
    /*  79 */             DirFileSecurityHelper.UpdateRecordingDirAndFilePermission();
    /*  80 */         }
    /*  81 */         catch (Exception ex6)
    /*  82 */         {
    /*  83 */             this.SendToMsiLog(string.Format("Install UpdateRecordingDirAndFilePermission Exception: {0}.", ex6.Message));
    /*  84 */         }
    /*  85 */         try
    /*  86 */         {
    /*  87 */             ProjectInstaller.CheckDatabaseConnection();
    /*  88 */             DirFileSecurityHelper.UpdateLiveRecordingFilePermission(DatabaseProxy.DatabaseConnection);
    /*  89 */         }
    /*  90 */         catch (Exception ex7)
    /*  91 */         {
    /*  92 */             this.SendToMsiLog(string.Format("Install UpdateLiveRecordingFilePermission Exception: {0}.", ex7.Message));
    /*  93 */         }
    /*  94 */         Trace.WriteLine("End Session Recording Storage Manager Install\n");
    /*  95 */         try
    /*  96 */         {
    /*  97 */             string text2 = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "log");
    /*  98 */             if (!Directory.Exists(text2))
    /*  99 */             {
    /* 100 */                 DirectorySecurity directorySecurity = new DirectorySecurity();
    /* 101 */                 directorySecurity.AddAccessRule(new FileSystemAccessRule("NETWORK SERVICE", FileSystemRights.FullControl, InheritanceFlags.ContainerInherit | InheritanceFlags.ObjectInherit, PropagationFlags.None, AccessControlType.Allow));
    /* 102 */                 directorySecurity.AddAccessRule(new FileSystemAccessRule("Administrators", FileSystemRights.FullControl, InheritanceFlags.ContainerInherit | InheritanceFlags.ObjectInherit, PropagationFlags.None, AccessControlType.Allow));
    /* 103 */                 Directory.CreateDirectory(text2, directorySecurity);
    /* 104 */             }
    [..SNIP..]

One can quickly notice how terrible these permissions are: ‘Full Control’ is handed to Administrators, Local System and Network Service, and, worst of all, `GenericWrite` is granted to `AnonymousSid` at line (48). That allows anyone at all to put messages onto the queue, which will then be processed and acted upon.
messageQueue.SetPermissions(ProjectInstaller.GetLocalizedTrusteeName(WellKnownSidType.AnonymousSid), MessageQueueAccessRights.GenericWrite, AccessControlEntryType.Allow);
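To see why that single line matters, here is a toy model of the resulting access control list. This is only a sketch: the `Rights` flags are simplified stand-ins for `MessageQueueAccessRights` with illustrative values, the trustee names are the usual English display names, and we assume (per Microsoft's documentation) that `GenericWrite` includes the right to write messages.

```python
from enum import Flag, auto


class Rights(Flag):
    # Simplified stand-ins for MessageQueueAccessRights; values are illustrative.
    WRITE_MESSAGE = auto()
    RECEIVE_MESSAGE = auto()
    CHANGE_PERMISSIONS = auto()
    GENERIC_WRITE = WRITE_MESSAGE
    FULL_CONTROL = WRITE_MESSAGE | RECEIVE_MESSAGE | CHANGE_PERMISSIONS


# Mirrors the four SetPermissions calls at lines (45) to (48).
QUEUE_ACL = {
    "BUILTIN\\Administrators": Rights.FULL_CONTROL,
    "NT AUTHORITY\\SYSTEM": Rights.FULL_CONTROL,
    "NT AUTHORITY\\NETWORK SERVICE": Rights.FULL_CONTROL,
    "NT AUTHORITY\\ANONYMOUS LOGON": Rights.GENERIC_WRITE,
}


def can_send(trustee):
    """True if the trustee may place messages on the queue."""
    granted = QUEUE_ACL.get(trustee, Rights(0))
    return bool(granted & Rights.WRITE_MESSAGE)


def can_receive(trustee):
    """True if the trustee may read messages off the queue."""
    granted = QUEUE_ACL.get(trustee, Rights(0))
    return bool(granted & Rights.RECEIVE_MESSAGE)


if __name__ == "__main__":
    anonymous = "NT AUTHORITY\\ANONYMOUS LOGON"
    print("anonymous can send:", can_send(anonymous))
    print("anonymous can receive:", can_receive(anonymous))
```

An unauthenticated sender cannot read anything back, but for a deserialization attack, being able to write is all that is needed.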
So now, our picture of how this all works is getting clearer.
This method takes care of setting up the initial access to the MSMQ component, creating a queue, and setting its permissions. However, it sets permissions which are not restrictive enough. Some more analysis reveals that this queue is used, later on, to send and receive events that are of type `EventMetadataWithTime`.
Do you remember that we talked about this class earlier? **Now we know where the `[Serializable]` that we found earlier is being used - exactly here.**
However - we still have unanswered questions:
* Where is this queue being used?
* How is the data received from this queue processed?
We looked at the rest of the methods, and found the `OpenQueue()` method which answered most of our questions. Let’s have a look at it:

              public bool OpenQueue()
              {
    /*   1 */     bool flag = false;
    /*   2 */     try
    /*   3 */     {
    /*   4 */         string receiveQueueName = Globals.ReceiveQueueName;
    /*   5 */         if (!MessageQueue.Exists(receiveQueueName))
    /*   6 */         {
    /*   7 */             throw new Exception(string.Format(Strings.Error_QueueDoesNotExist, receiveQueueName));
    /*   8 */         }
    /*   9 */         MessageQueue.EnableConnectionCache = true;
    /*  10 */         this.m_DataQueue = new MessageQueue(receiveQueueName);
    /*  11 */         if (!this.m_DataQueue.CanRead)
    /*  12 */         {
    /*  13 */             throw new Exception(string.Format(Strings.Error_QueueReadAccessDenied, receiveQueueName));
    /*  14 */         }
    /*  15 */         this.m_DataQueue.Formatter = new BinaryMessageFormatter(FormatterAssemblyStyle.Simple, FormatterTypeStyle.TypesWhenNeeded);
    /*  16 */         Trace.WriteLine(string.Format("Open queue: {0}", receiveQueueName));
    /*  17 */         flag = true;
    /*  18 */     }
    /*  19 */     catch (Exception ex)
    /*  20 */     {
    /*  21 */         string errorMethod_OpeningQueue = Strings.ErrorMethod_OpeningQueue;
    /*  22 */         ExceptionHelper.LogException(ex, 2078, errorMethod_OpeningQueue, ExceptionHelper.LogAction.Error);
    /*  23 */     }
    /*  24 */     return flag;
    /*  25 */ }

At line (4), a global variable named `Globals.ReceiveQueueName` is retrieved. One can quickly guess that this is the message queue name. Then, at line (5), the `MessageQueue.Exists` method checks that a queue with this name actually exists. After this, at line (10), the queue name is used to create an instance of the `MessageQueue` class, which is then assigned to the `m_DataQueue` member.
What’s interesting here, though, is the fact that the `m_DataQueue.Formatter` property for this queue is set at line (15) to a `BinaryMessageFormatter`, which relies on .NET’s `BinaryFormatter` under the hood to serialize and deserialize message bodies.
Ruh-Roh!
Time has told us that using a BinaryFormatter for deserialization is almost always dangerous - it exposes a lot of functionality to whoever can provide it with messages, and while it can be used securely, it provides enough ‘footguns’ that even Microsoft themselves [say it shouldn’t be used](https://learn.microsoft.com/en-us/dotnet/standard/serialization/binaryformatter-security-guide?ref=labs.watchtowr.com):
> The [**BinaryFormatter**](https://learn.microsoft.com/en-us/dotnet/api/system.runtime.serialization.formatters.binary.binaryformatter?ref=labs.watchtowr.com) type is dangerous and is **_not_** recommended for data processing. Applications [**should stop using `BinaryFormatter`**](https://learn.microsoft.com/en-us/dotnet/standard/serialization/binaryformatter-migration-guide/?ref=labs.watchtowr.com) as soon as possible, even if they believe the data they're processing to be trustworthy. `BinaryFormatter` is insecure and can't be made secure.
That’s some pretty strong language coming from the original creators of the library!
Connecting the pieces here, we can see that the `Serializable` type that we saw earlier (`EventMetadataWithTime`) is being deserialized using the BinaryFormatter assigned to the `m_DataQueue` MSMQ instance. Let us not forget the fact that anyone can talk to this queue due to the insecure permissions that were set during the queue initialization routine.
Before we dig too deep into Citrix, let’s take a quick step back and recap some BinaryFormatter theory.
.NET BinaryFormatter Deserialization 101
----------------------------------------
We could talk for _hours_ about exploiting .NET’s BinaryFormatter. There are so many gadgets and such a huge amount of background knowledge, which is always fun to outline and share. Unfortunately, though, time is always short, so we’ll stick with a quick refresher.
If you missed the Microsoft quote above, we reiterate it here:
> `BinaryFormatter` is insecure and can't be made secure
Bear that in mind as we go forward (grepping your own codebase for BinaryFormatter and exploiting the results is left as an exercise for the reader).
So, what can go wrong when using a BinaryFormatter?
A quick note at this point - please bear in mind here that the following is just an example to help the reader to quickly understand how a simple ‘gadget’ looks and that this isn’t the gadget we’ll use for our detection artefact generator exploitation.
Given the ability to persuade the target to deserialize a message we give it, the next step is to locate what’s known as a ‘gadget’. A ‘gadget’ is a class that contains one or more methods that are invoked during the deserialization process. Under controlled circumstances, these methods then do things useful to an attacker.
This is easier to demonstrate than to explain, so take a look at the following code, which is of a dangerous type:

    using System;
    using System.IO;

    [Serializable]
    public class LogFile
    {
        // Restored from the serialized data during deserialization.
        private string filePath;

        public LogFile(string path)
        {
            filePath = path;
        }

        // Finalizer: runs when the garbage collector reclaims the object.
        ~LogFile()
        {
            if (File.Exists(filePath))
            {
                File.Delete(filePath);
                Console.WriteLine($"[Warning] Deleted file: {filePath}");
            }
        }
    }

This class contains a constructor that takes a string and assigns it to a field named “filePath”. It also contains a destructor (a finalizer, in .NET terms) that deletes the file when the object is destroyed. Innocent enough, right?
Ok, but what does this really mean in the context of exploitation? Well, if an attacker can deliver a serialized instance of this `LogFile` class, the deserializer will duly instantiate the class, as one would expect. Once the garbage collector kicks in during the end of the object lifecycle, the destructor of the newly-deserialized class will be invoked, and the file deleted. Since the “filePath” field is set by the deserializer, it is under our control, and we can delete any file we like - pow, we’ve just discovered a ‘gadget’ that permits arbitrary file deletion.
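The same idea can be reproduced end to end in a few lines. The sketch below is a Python analogue, not .NET: `naive_deserialize` is a toy deserializer we made up to stand in for BinaryFormatter, the `KNOWN_TYPES` registry is hypothetical, and the ‘victim’ is a temporary file the sketch creates itself. Like a real formatter, it builds the object without running the constructor and fills the fields straight from the message.

```python
import gc
import json
import os
import tempfile


class LogFile:
    """Python analogue of the LogFile class: removes its file when collected."""

    def __init__(self, path):
        self.file_path = path

    def __del__(self):
        path = getattr(self, "file_path", None)
        if path and os.path.exists(path):
            os.remove(path)


# Types the toy deserializer is willing to rebuild (hypothetical registry).
KNOWN_TYPES = {"LogFile": LogFile}


def naive_deserialize(payload):
    """Rebuild whatever type the message names, taking fields from the message."""
    doc = json.loads(payload)
    cls = KNOWN_TYPES[doc["type"]]
    obj = cls.__new__(cls)              # no constructor call, as formatters do
    obj.__dict__.update(doc["fields"])  # field values chosen by the sender
    return obj


def demo():
    """Return True if the victim file still exists after deserialization."""
    fd, victim = tempfile.mkstemp()
    os.close(fd)
    payload = json.dumps(
        {"type": "LogFile", "fields": {"file_path": victim}}
    ).encode("utf-8")
    obj = naive_deserialize(payload)
    del obj       # drop the only reference...
    gc.collect()  # ...and make sure the finalizer has had a chance to run
    return os.path.exists(victim)


if __name__ == "__main__":
    print("victim file survived:", demo())
```

The receiving code never calls anything on the object; simply creating it and letting it die is enough for the sender-controlled path to be deleted.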
Of course, this is just an example, and somewhat contrived. ‘Real’ gadgets that enable RCE are typically more complex, and the motivated reader is invited to explore the work of great security researchers (such as James Forshaw, Alvaro Muñoz, Oleksandr Mirosh, Soroush Dalili, Piotr Bazydło, to name a few). These individuals have developed ‘universal’ gadgets that can target specific .NET framework versions to achieve full remote code execution.
You can find the real-world gadgets referenced above in the [YSoSerial.NET](https://github.com/pwntester/ysoserial.net?ref=labs.watchtowr.com) project.
Exploiting MSMQ Deserialization
-------------------------------
OK, so that’s interesting background knowledge, but how does it help us exploit MSMQ and thus our target itself?
Well, armed with this knowledge, we know what to look for - some kind of class that exposes dangerous functionality after being deserialized.
Fortunately, we don’t need to find this ourselves - we’re going to use the `TypeConfuseDelegate` gadget, which works against .NET Framework targets. If you’d like to know more about this phenomenal gadget discovered by James Forshaw, refer to [this article](https://googleprojectzero.blogspot.com/2017/04/exploiting-net-managed-dcom.html?ref=labs.watchtowr.com).
We do, however, have one extra issue here, as we mentioned previously. MSMQ is usually reached via TCP port 1801, but this port isn’t open by default in a Citrix environment.
Naturally, at this point, we gave up and considered farming geese - how can we exploit our neat deserialization bug without access to the underlying service?
After staring at the wall, **jump**ing on some straight command injection vulnerabilities, and soul-searching - we remembered [CVE-2023-21554](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-21554&ref=labs.watchtowr.com), which was a bug in MSMQ itself.
Since we’re the kind of people to read exploit code for fun, we recalled that the exploit code (handily found in Metasploit) appeared to contain an HTTP payload.
Could this mean there’s a way to ‘jump’ from HTTP into the MSMQ data?
Is it possible to enqueue a message entirely via HTTP?
Citrix’s solution exposes HTTP/HTTPS to the Internet by default (duh), so perhaps there was some light?
![](https://images.seebug.org/1731577687536-w331s)
[https://github.com/rapid7/metasploit-framework/blob/96f6f66429cf82729c13c5f615ef6046fe3dbc4b/modules/auxiliary/scanner/msmq/cve\_2023\_21554\_queuejumper.rb#L292](https://github.com/rapid7/metasploit-framework/blob/96f6f66429cf82729c13c5f615ef6046fe3dbc4b/modules/auxiliary/scanner/msmq/cve_2023_21554_queuejumper.rb?ref=labs.watchtowr.com#L292)
A quick search yielded a [promising result](https://learn.microsoft.com/en-us/archive/msdn-magazine/2003/december/send-msmq-messages-securely-over-the-internet-with-http-and-soap?ref=labs.watchtowr.com):
![](https://images.seebug.org/1731577690271-w331s)
It seems that since MSMQ v3 (published a long time ago!), support for MSMQ over HTTP has been present. Woe betide us, though, it isn’t enabled by default.
Everyone knows that, for a secure product, the minimum of functionality should be exposed, right? Surely Citrix don’t enable this extra functionality, just because it’s there?
I mean, why would they?
Their code uses MSMQ over TCP, surely they wouldn’t enable this unused feature and just gratuitously expose attack surface.. ?
…?
……?
………..?
……………..?
…………………….?
Exploiting MSMQ over HTTP!
--------------------------
Well, as it turns out, they actually _do_ enable this seemingly-unnecessary feature.
Even though they don’t use this feature within any functionality that we can see, it is nonetheless activated behind the scenes when a user installs the product.
Perhaps they have a further product that uses it, which has to interface over HTTP.
Perhaps some developer accidentally enabled it, committed the code, and forgot about it.
We’ll leave the root-cause-analysis to Citrix themselves.
![](https://images.seebug.org/1731577696025-w331s)
Now, we have all the parts needed to build an exploit.
We know there is a MSMQ instance with misconfigured permissions, and we know that it uses the infamous BinaryFormatter class to perform deserialization.
The ‘cherry on top’ is that it can be reached not only locally, through the MSMQ TCP port, but also from any other host, via HTTP.
This combo allows for a good old unauthenticated RCE. Since we're dealing with a deserialization issue, a bug class that is known for being relatively stable, we can be highly confident that our exploit (once crafted) will work reliably - there's no tricky heap manipulation or other entropy creeping in.
Bob the Packet builder
----------------------
In order to build the exploit packet, we had to dive into the packet’s structure in detail, which turned out to be quite challenging due to limited documentation.
Through plenty of trial and error, we tested different configurations and tweaked various fields. To do this, we set up a local test environment that would process MSMQ messages with HTTP support enabled. The small application we wrote allowed us to experiment with different data types and values until we gradually started to understand how each piece fit together.
Explanation of the Packet Structure
-----------------------------------
Here's a breakdown of the main parts of this MSMQ HTTP message:
1. **HTTP Request Line**: This is the starting line, indicating that it’s a `POST` request directed at a specific queue endpoint on the server (`/msmq/queue_name`). This tells the server we're sending a new message to the queue.
2. **HTTP Headers**: Standard HTTP headers provide basic information about the message:
* **Host** specifies the server’s address.
* **Content-Type** is `multipart/related`, which means the message has multiple parts, like the SOAP envelope and the serialized data payload. It also includes a `boundary` value to separate these parts and specifies that the main type in this multipart message is `text/xml`.
* **SOAPAction** defines the action type as `"MSMQMessage"`, informing the server of the kind of message it’s handling.
3. **First Boundary (SOAP Envelope)**: After the initial headers, we hit the first boundary, which begins the actual message content. The SOAP envelope is an XML structure with three key sections:
* **Header**: Contains routing information (`To`, `Action`, `MessageID`) that tells the server where this message is going and assigns it a unique ID.
* **Properties**: Metadata fields like `ExpiresAt` and `SentAt`, carried inside the header, which help manage the message’s lifecycle and timing.
* **Body**: Inside the body, specific fields define the message's details, such as `Priority` (importance level) and `BodyType` (specifying whether the data is binary or text).
4. **Second Boundary (Serialized Data Payload)**: The next part, separated by another boundary, holds the actual content of the message. It has:
* **Content-Type** as `application/octet-stream`, which indicates binary data.
* **Content-Id** links this data back to the SOAP envelope.
* **Serialized Data**: This is the message’s main data payload that gets pushed to the broker queue, in our case a serialized .NET object.
These elements work together to create a structured message that MSMQ can route, prioritize, and deliver across networks. By understanding this layout, we could assemble a complete packet that the MSMQ server would accept.
+--------------------------------------------------+
| HTTP Request Line |
| POST /msmq/queue_name HTTP/1.1 |
+--------------------------------------------------+
| Host: example.com |
| Content-Type: multipart/related; |
| boundary="MSMQ_SOAP_boundary_12345"; |
| type="text/xml" |
| Content-Length: 2100 |
| SOAPAction: "MSMQMessage" |
| Proxy-Accept: NonInteractiveClient |
+--------------------------------------------------+
--MSMQ_SOAP_boundary_12345
+----------------------------------------------------+
| Content-Type: text/xml; charset=UTF-8 |
| Content-Length: 800 |
+----------------------------------------------------+
| SOAP Envelope |
| <SOAP-ENV:Envelope xmlns:SOAP-ENV="http://... |
| <SOAP-ENV:Header> |
| <m:Routing xmlns:m="http://schemas..."> |
| <Action>SendMessage</Action> |
| <To>http://example.com/msmq/queue_name</To> |
| <MessageID>uuid:123456789</MessageID> |
| </m:Routing> |
| <m:Properties> |
| <ExpiresAt>20250101T123000</ExpiresAt> |
| <SentAt>20241107T100000</SentAt> |
| </m:Properties> |
| </SOAP-ENV:Header> |
| <SOAP-ENV:Body> |
| <m:MessageBody> |
| <Priority>5</Priority> |
| <BodyType>Binary</BodyType> |
| </m:MessageBody> |
| </SOAP-ENV:Body> |
| </SOAP-ENV:Envelope> |
--MSMQ_SOAP_boundary_12345
+--------------------------------------------------+
| Content-Type: application/octet-stream |
| Content-Length: 1300 |
| Content-Id: uuid:123456789 |
+--------------------------------------------------+
| MSMQ Message |
| [Binary data or serialized object] |
+--------------------------------------------------+
--MSMQ_SOAP_boundary_12345--
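The layout above can be sketched in Python as follows. This is a rough illustration that mirrors the simplified diagram rather than the exact on-the-wire format. The element names come straight from the diagram. The two namespace URIs are elided in the diagram, so they are left as caller-supplied parameters. The payload is a harmless placeholder. Names such as `build_envelope`, `build_part` and `build_request` are illustrative, and nothing here sends any traffic.

```python
from xml.sax.saxutils import escape, quoteattr

BOUNDARY = "MSMQ_SOAP_boundary_12345"  # same illustrative boundary as the diagram


def build_envelope(queue_url, message_id, soap_ns, msmq_ns):
    """Return a simplified SOAP envelope with the fields shown in the diagram.

    soap_ns and msmq_ns are elided in the diagram, so the caller supplies them.
    Both prefixes are declared on the root element to keep the XML well-formed.
    """
    return (
        f"<SOAP-ENV:Envelope xmlns:SOAP-ENV={quoteattr(soap_ns)}"
        f" xmlns:m={quoteattr(msmq_ns)}>"
        "<SOAP-ENV:Header>"
        "<m:Routing>"
        "<Action>SendMessage</Action>"
        f"<To>{escape(queue_url)}</To>"
        f"<MessageID>{escape(message_id)}</MessageID>"
        "</m:Routing>"
        "<m:Properties>"
        "<ExpiresAt>20250101T123000</ExpiresAt>"
        "<SentAt>20241107T100000</SentAt>"
        "</m:Properties>"
        "</SOAP-ENV:Header>"
        "<SOAP-ENV:Body>"
        "<m:MessageBody>"
        "<Priority>5</Priority>"
        "<BodyType>Binary</BodyType>"
        "</m:MessageBody>"
        "</SOAP-ENV:Body>"
        "</SOAP-ENV:Envelope>"
    ).encode("utf-8")


def build_part(headers, body):
    """Serialize one MIME part: header lines, a blank line, then the raw body."""
    head = "".join(f"{name}: {value}\r\n" for name, value in headers.items())
    return head.encode("ascii") + b"\r\n" + body


def build_request(host, queue_name, envelope, payload, message_id):
    """Assemble the request line, HTTP headers and the two-part body."""
    delim = b"--" + BOUNDARY.encode("ascii")
    parts = [
        build_part({"Content-Type": "text/xml; charset=UTF-8",
                    "Content-Length": str(len(envelope))}, envelope),
        build_part({"Content-Type": "application/octet-stream",
                    "Content-Length": str(len(payload)),
                    "Content-Id": message_id}, payload),
    ]
    # One delimiter line before each part, then the closing delimiter.
    body = b"".join(delim + b"\r\n" + part + b"\r\n" for part in parts)
    body += delim + b"--\r\n"
    head = (
        f"POST /msmq/{queue_name} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f'Content-Type: multipart/related; boundary="{BOUNDARY}"; type="text/xml"\r\n'
        f"Content-Length: {len(body)}\r\n"
        'SOAPAction: "MSMQMessage"\r\n'
        "Proxy-Accept: NonInteractiveClient\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


if __name__ == "__main__":
    msg_id = "uuid:123456789"
    env = build_envelope("http://example.com/msmq/queue_name", msg_id,
                         "urn:example:soap-placeholder",   # hypothetical namespace
                         "urn:example:msmq-placeholder")   # hypothetical namespace
    request = build_request("example.com", "queue_name", env, b"PLACEHOLDER", msg_id)
    print(request.decode("ascii"))
```

Computing every `Content-Length` from the bytes actually emitted, rather than hard-coding values as the diagram does, keeps the lengths consistent as fields change.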
Conclusion
----------
So, what’ve we seen today?
Well, once again, we’ve lost a little more faith in the Internet.
We’ve seen how a carelessly-exposed MSMQ instance can be exploited, via HTTP, to enable unauthenticated RCE against Citrix Virtual Apps and Desktops. We've walked the reader through crafting an exploit, using an off-the-shelf gadget, resulting in a stable and reliable exploit 'chain' which makes for easy exploitation.
This isn't really a bug in the BinaryFormatter itself, nor a bug in MSMQ, but rather the unfortunate consequence of Citrix relying on the documented-to-be-insecure BinaryFormatter to maintain a security boundary. It's a 'bug' that manifested during the design phase, when Citrix decided which serialization library to use.
What’s interesting about this particular case is that exploitation was possible even though the MSMQ port was not accessible. The reality is that this is a rabbit hole, and that most (we hope ‘all’) attackers would’ve given up at this point, or even ignored the attack surface entirely.
* Editor's note: In fairness, most attackers are probably too busy just sending shell commands to vendor-specific management interfaces between firewalls and management appliances. This is a wild guess. Anyway, let’s jump on.
We reported the deserialization issue to Citrix, along with the HTTP-exposed MSMQ queue, as one issue. It is, at this point, a matter of opinion if this constitutes two individual bugs or one 'real' bug. While it is inarguable that Citrix's use of a BinaryFormatter with untrusted data is a de-facto bug, we don't have enough context to determine if exposing the MSMQ queue via HTTP is really a bug, caused by a careless oversight, or a carefully-calculated effect of some obscure business requirement.
Either way, after some initial back-and-forth, Citrix managed to reproduce the issue, and after some coordination, a disclosure date of November 12th was mutually agreed.
Citrix were friendly in communication and took our report seriously, but at the time of writing, we are not aware of version numbers for patches or CVE identifiers for the aforementioned weaknesses. Not the end of the world, and it is what it is.
As ever, remediation advice is simply 'update to a patched version'. It is difficult to see how this bug could be mitigated otherwise, since the MSMQ interface is such a core part of the way that the application works, and attempting to restrict access to it would likely result in subtle (or not-so-subtle) breakage of the environment.
The only real thing we are certain of is that the bug exists in the version we analyzed, `Citrix_Virtual_Apps_and_Desktops_7_2402_LTSR`. We will, of course, update this post as we learn more about which versions are fixed and which are not. <TBD>
| Date | Event |
| --- | --- |
| July 14th 2024 | Initial disclosure to vendor |
| August 9th 2024 | Vendor reports they are unable to reproduce, requests video of exploitation |
| August 11th 2024 | watchTowr responds with video and proof-of-concept exploit |
| <TBD> | Citrix releases fix in the form of version <TBD> |
| November 12th 2024 | Disclosure deadline |
At [watchTowr](https://www.watchtowr.com/?ref=labs.watchtowr.com), we believe continuous security testing is the future, enabling the rapid identification of holistic high-impact vulnerabilities that affect your organisation.
It's our job to understand how emerging threats, vulnerabilities, and TTPs affect your organisation.
If you'd like to learn more about the [**watchTowr Platform**](https://www.watchtowr.com/?ref=labs.watchtowr.com)**, our Attack Surface Management and Continuous Automated Red Teaming solution**, please get in touch.