MindShaRE: Variant Hunting with IDA Python

June 27, 2018 | Jasiel Spelman

MindShaRE is our periodic look at various reverse engineering tips and tricks. The goal is to keep things small and discuss some everyday aspects of reversing. You can view previous entries in this series here.


Variant hunting is a useful skill, and it helps to find ways of augmenting your searches. I had some very rough notes on something I tried while looking for a variant of a Mobile Pwn2Own vulnerability and realized it’d make for a perfect introduction to variant hunting with IDA Python, as the example is very minimal. The original vulnerability was submitted by both 360 Security Team and Tencent Keen Security Lab and is identified by ZDI-18-147 and ZDI-18-156. Both are identified by CVE-2017-7171 because Apple determined after the contest that they shared the same root cause.

The gist of the vulnerability involves a Mach service where one of the functions expects complex data from the sender but does not check whether the input message has the MACH_MSGH_BITS_COMPLEX (0x80000000) bit set in the msgh_bits field. Properly searching for variants of this vulnerability would involve tracking input arguments to follow the mach_msg argument, looking for a dereference equivalent to the offset of the msgh_bits field, then finally checking for a comparison of it against 0x80000000. That’s doable, but let’s see if there is a simpler way.

Here's an example of a correctly handled function:

Here's another correctly handled function:

This is an ARM binary, but we can focus just on the last instruction in each image: TBZ is Test bit and Branch if Zero, and TBNZ is Test bit and Branch if Not Zero. The first operand is the value to test, the second is the bit position, and the third is the jump target. Since 0x1f is 31, and bit 31 corresponds to the mask 0x80000000, those two instructions are how this binary checks whether the MACH_MSGH_BITS_COMPLEX bit is set in the msgh_bits field.
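To make the bit arithmetic concrete, here is a small sketch in plain Python that runs outside IDA. The helper names `bit_is_set`, `tbz_branches`, and `tbnz_branches` are illustrative only; they model whether the branch would be taken and are not part of any IDA API.

```python
MACH_MSGH_BITS_COMPLEX = 0x80000000  # bit 31 of msgh_bits

def bit_is_set(value, bit):
    """Return True if the given bit position is set in value."""
    return (value >> bit) & 1 == 1

def tbz_branches(value, bit):
    """Model TBZ: the branch is taken when the tested bit is zero."""
    return not bit_is_set(value, bit)

def tbnz_branches(value, bit):
    """Model TBNZ: the branch is taken when the tested bit is not zero."""
    return bit_is_set(value, bit)

# Bit position 0x1f (31) corresponds to the mask 0x80000000.
assert 1 << 0x1f == MACH_MSGH_BITS_COMPLEX

complex_msg = MACH_MSGH_BITS_COMPLEX | 0x13  # complex bit plus arbitrary low bits
simple_msg = 0x13                            # complex bit clear

assert tbnz_branches(complex_msg, 0x1f)      # TBNZ branches for a complex message
assert tbz_branches(simple_msg, 0x1f)        # TBZ branches for a simple message
```

Either instruction can implement the check; which one the compiler emits only changes which path is the fall-through.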

A quick and simple way of looking for this is to look for any TBZ/TBNZ instruction with 0x1f as the second operand. There are a lot of potential issues with this approach, but as a naïve start, it’ll work. We could limit our search space to the first few basic blocks in a function, but that gets more involved, and in a couple of tests it missed some of what I needed, so I skipped that. I generally try to start with a quicker, albeit weaker, approach before going to something more heavy-handed.

Game Plan

Our initial approach will be to look through all of the Mach function handlers for this particular service and check for a TBZ/TBNZ instruction where 0x1f is the second operand. Then we'll manually check the matches.

First, let's break down all of the functions that we’ll need:

      - filter

This is a Python built-in that takes two arguments: a function and an iterable. It returns only the elements for which the function returns a truthy value (as a list in Python 2, as a lazy iterator in Python 3).

Usage of this function is equivalent to doing:
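A minimal sketch of that equivalence, using an illustrative predicate `is_even` and sample data that are not from the original post:

```python
def is_even(n):
    """Example predicate: truthy for even integers."""
    return n % 2 == 0

items = [1, 2, 3, 4, 5, 6]

# filter keeps only the elements for which the predicate is truthy.
# list() is needed on Python 3, where filter returns a lazy iterator.
filtered = list(filter(is_even, items))

# The same selection written as a list comprehension.
comprehension = [x for x in items if is_even(x)]

assert filtered == comprehension
```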

      - idc.GetMnem

This IDA API takes an address (ea_t) and returns the instruction mnemonic for that address. It's considered bad practice to use it, as DecodeInstruction is more resilient in most cases. We’re using it solely because comparing the mnemonic string lets us check for two different instructions at once.

      - idc.GetOperandValue

This IDA API takes an address (ea_t) and a 0-based operand index and returns the value of the associated operand. This is also considered bad practice, as the meaning of the returned value varies with the type of the operand.

      - idc.Name

This IDA API takes an address (ea_t) and returns the name, if any, associated with it. It now returns '' if no name is present; in older versions of IDA, however, it returned None. As such, I often do something like (idc.Name(x) or '') to ensure I'm dealing with a string.
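The `or ''` idiom can be tried outside IDA with a stand-in lookup. In this sketch, `safe_name`, the `names` dictionary, and the function name `mach__0x1234_handler` are all hypothetical; `names.get` merely mimics the older None-returning behaviour.

```python
def safe_name(lookup, ea):
    """Return the name at ea as a string, mapping None to ''."""
    return lookup(ea) or ''

# Stand-in for idc.Name that returns None for unnamed addresses.
names = {0x1000: 'mach__0x1234_handler'}
old_style_lookup = names.get

# String methods are now safe to call whether or not a name exists.
assert safe_name(old_style_lookup, 0x1000).startswith('mach__0x')
assert safe_name(old_style_lookup, 0x2000) == ''
```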

      - idautils.Functions

This IDA API enumerates all known functions within the IDB.

      - idc.NextHead

This IDA API returns the next defined item after the given address. We won’t use it directly, but one of my helper methods depends on it.

      - idautils.Chunks

This IDA API takes a function address and yields the (start, end) address ranges of that function's chunks. This requires that IDA has properly disassembled the function and produced the correct control flow graph. Otherwise, this function will return an error or will omit some of the basic blocks that do indeed belong to the function.

      - glitch_helpers._get_function_instr_iter

This is a personal helper method that takes one argument: a function address. It returns an iterable that yields the address of every instruction in that function. It depends on idautils.Chunks to yield the basic blocks of the function and on idc.NextHead to iterate through the instructions within each basic block.
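As a rough sketch of how such a helper might be built (this is not the author's implementation), the chunk and next-head functions are passed in as parameters so the code runs outside IDA. Inside IDA one would presumably pass `idautils.Chunks` and `idc.NextHead`; the name `function_instr_iter` and the fake data are assumptions.

```python
def function_instr_iter(func_ea, chunks, next_head):
    """Yield every instruction address in the function at func_ea.

    chunks(func_ea) must yield (start, end) pairs; next_head(ea, end)
    must return the next item address, or a value >= end when done.
    """
    for start, end in chunks(func_ea):
        ea = start
        while ea < end:
            yield ea
            ea = next_head(ea, end)

# Fake database: one function made of two chunks of 4-byte instructions.
FAKE_CHUNKS = {0x1000: [(0x1000, 0x100c), (0x2000, 0x2008)]}

def fake_chunks(func_ea):
    """Stand-in for a chunk enumerator."""
    return iter(FAKE_CHUNKS[func_ea])

def fake_next_head(ea, end):
    """Stand-in for a next-item lookup: every fake instruction is 4 bytes."""
    return ea + 4

addrs = list(function_instr_iter(0x1000, fake_chunks, fake_next_head))
```

Walking chunks rather than one contiguous range matters because a function's code is not always laid out in a single block of addresses.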

      - glitch_helpers.set_result_iterator

This is a personal helper method that takes two arguments: a function and an iterable. You iterate forward and backward through the iterable with Ctrl-Shift-N and Ctrl-Shift-P, respectively. Every time you iterate, the function is called on whatever the iterable returns.
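The stepping behaviour can be sketched without any IDA dependency. The class `ResultCursor` below is hypothetical and leaves out hotkey registration entirely; it also materializes the iterable into a list so it can step backward, which may differ from how the real helper works.

```python
class ResultCursor(object):
    """Step forward and backward through results, calling a callback each time."""

    def __init__(self, callback, iterable):
        self.callback = callback
        self.results = list(iterable)  # materialize so we can step backward
        self.index = -1                # nothing visited yet

    def step_next(self):
        """Advance one result, if any remain, and invoke the callback on it."""
        if self.index + 1 < len(self.results):
            self.index += 1
            self.callback(self.results[self.index])

    def step_prev(self):
        """Go back one result, if possible, and invoke the callback on it."""
        if self.index > 0:
            self.index -= 1
            self.callback(self.results[self.index])

visited = []
cursor = ResultCursor(visited.append, [0x1000, 0x2000, 0x3000])
cursor.step_next()  # visits 0x1000
cursor.step_next()  # visits 0x2000
cursor.step_prev()  # back to 0x1000
```

In the workflow described here, the callback would be something like pgo, so each keypress jumps to the next candidate.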

      - pgo

This is a personal helper method that takes one argument: an address, which it prints and then jumps to.

Putting it all together, here's the one-liner I used:

That’s definitely a bit of a monstrosity, so let's take it one piece at a time. We’ll start by rewriting it to not be a one-liner:

The instr_checks_COMPLEX function takes an address and first checks to see if the opcode at the address is either TBNZ or TBZ and then checks if the second operand is 0x1f. This will tell us if a particular instruction is responsible for performing a comparison against 0x80000000.

The func_checks_COMPLEX function takes an address, presumed to be the beginning of a function, and iterates through every instruction within that function. If any of the instructions perform the check, then the function returns False indicating that we are not interested in this function.

The searchspace filter is populated using a quick and dirty way of finding all of the functions I've renamed (with other IDA Python) to be interesting for this particular case. We go through every function IDA knows about and check to see if the function name starts with mach__0x, at which point we include it in the list.
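The three pieces described above can be exercised outside IDA against a fake disassembly. Everything in this sketch is illustrative: `FAKE_INSTRS`, `FAKE_FUNCS`, and the function names are made up, and the dictionary lookups stand in for the idc.GetMnem, idc.GetOperandValue, idc.Name, and idautils.Functions calls. The function-level predicate is named `func_lacks_complex_check` here so that a truthy result plainly means "worth a manual look".

```python
# Fake disassembly: address -> (mnemonic, operand values). Illustrative only.
FAKE_INSTRS = {
    0x1000: ('LDR', (0, 0)),
    0x1004: ('TBZ', (8, 0x1f, 0x1100)),  # tests bit 31: the COMPLEX check
    0x2000: ('LDR', (0, 0)),
    0x2004: ('TBZ', (8, 0x3, 0x2100)),   # tests an unrelated bit
}

# Fake function table: name -> instruction addresses. Illustrative only.
FAKE_FUNCS = {
    'mach__0x10_checked': [0x1000, 0x1004],
    'mach__0x11_unchecked': [0x2000, 0x2004],
    'sub_3000': [],
}

def instr_checks_complex(ea):
    """True if the instruction is TBZ/TBNZ testing bit 0x1f."""
    mnem, operands = FAKE_INSTRS[ea]
    return mnem in ('TBZ', 'TBNZ') and operands[1] == 0x1f

def func_lacks_complex_check(name):
    """True if no instruction in the function performs the bit-31 test."""
    return not any(instr_checks_complex(ea) for ea in FAKE_FUNCS[name])

# Only functions renamed with the mach__0x prefix are of interest.
searchspace = [n for n in sorted(FAKE_FUNCS) if n.startswith('mach__0x')]

# Keep the handlers that never test the COMPLEX bit.
candidates = list(filter(func_lacks_complex_check, searchspace))
```

In a real session, `candidates` would be handed to the result iterator for manual review rather than inspected as a list.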

I should probably share that my attempt here was a failure. I looked for variants on the patched binary and didn’t see anything useful. My suspicion is that if there were any other vulnerable methods, Apple went ahead and proactively patched them. Regardless, this seemed like a decent example of what you can do relatively quickly with the IDA Python API once you have a few helper methods.

I’ve written a lot of IDA Python for a variety of tasks, and I’ve found that the overly simple helpers, such as this result iterator, have been amongst the most helpful out of all of the IDA Python I’ve written, solely because they’ve saved a lot of time when dealing with banal and tedious tasks.

Variant hunting is an important part of auditing for vulnerabilities, and anything that can aid it is valuable. I hope you've enjoyed this blog post. Look out for future blog posts on IDA and Python. Until then, you can find me on Twitter at @WanderingGlitch, and follow the team for the latest in exploit techniques and security patches.