Thank you.
Well, hello and thank you for coming to see my talk on the Deobfuscator.
Like you said, I'm Eric Laspe.
And the Deobfuscator is an IDA Pro plugin that we created at Riverside to essentially
remove obfuscation that we found in malware and in protections that we analyzed for commercial
companies. Mostly malware is where we get our patterns and that sort of thing.
So nothing proprietary.
So I'm going to go over the problem of obfuscation, kind of describe what it is, what it looks
like, and then I'm going to give you an example of some malware that we looked at,
Rustock.B. It has a decryption routine that's just full of obfuscation.
And I'm going to show you our solution and describe how it works and then give you an
in-presentation demonstration of what the deobfuscator does.
And then I'll show you Rustock.B before deobfuscation and then after.
And then I'm going to show you a little bit of sample source code and just give you an
idea of how we deal with a simple peephole pattern and then sum it all up.
So first, obfuscation.
What is it?
Something that malware authors use a lot to hide their malicious code.
They'll do things like anti-disassembly techniques just to try to trick IDA Pro, Olly, whatever
your disassembler is, into thinking that maybe certain regions of code aren't valid or visible
or they just try to fill it with junk so it's difficult to understand at first glance.
It costs reverse engineers lots of time.
Like I said, it disrupts control flow.
And removing it manually is difficult and tedious.
Like I'll describe with Rustock.B, some reverse engineers have spent, you know, days,
weeks looking at just the decryption routine when really it's pretty simple and just full
of garbage.
So here's an example of one that we handle.
We call it push-pop math because you are pushing an immediate, popping it into
EDX here, and then XORing it with another immediate, and then we jump to that result.
So the deobfuscator on its first run would take this, emulate the result, just move that
into EDX, and NOP out all the junk instructions. And then if you ran it through a second run
of the deobfuscator, you'd get a jump straight to that address, 40107B.
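To make that two-run idea concrete, here is a minimal Python sketch. It is not the plugin's code: the tuple-based instruction model, the function names, and the two immediates are all assumptions, with the immediates chosen only so that the XOR works out to 40107B.

```python
# Hypothetical toy model: an instruction is a tuple (mnemonic, operands...).
# The immediates used in the test are invented so that the XOR gives 0x40107B.
MASK32 = 0xFFFFFFFF
NOP = ("nop",)

def fold_push_pop_math(insns):
    """First run: `push imm; pop reg; xor reg, imm2` -> `mov reg, result` + NOPs."""
    out = list(insns)
    for i in range(len(out) - 2):
        a, b, c = out[i], out[i + 1], out[i + 2]
        if (a[0] == "push" and isinstance(a[1], int)
                and b[0] == "pop"
                and c[0] == "xor" and c[1] == b[1] and isinstance(c[2], int)):
            value = (a[1] ^ c[2]) & MASK32       # emulate the math
            out[i:i + 3] = [("mov", b[1], value), NOP, NOP]
    return out

def fold_mov_jmp(insns):
    """Second run: `mov reg, imm`, NOPs, `jmp reg` -> NOPs + `jmp imm`."""
    out = list(insns)
    for i, insn in enumerate(out):
        if insn[0] != "mov" or not isinstance(insn[2], int):
            continue
        j = i + 1
        while j < len(out) and out[j] == NOP:    # skip NOPs left by the first run
            j += 1
        if j < len(out) and out[j] == ("jmp", insn[1]):
            out[i] = NOP
            out[j] = ("jmp", insn[2])
    return out
```

Keeping the NOPs in place between runs, rather than deleting them, is what leaves room for the second rewrite.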
Now to the example.
This is a good example.
Like I said, it implements a lot of obfuscation patterns and so, you know, whenever we come
across a new malware or something that we haven't seen yet, we just take the common
patterns, dump them into our deobfuscator and, you know, it's a one-time thing so it
doesn't take an incredible amount of time and it saves us a ton of time.
We have all kinds of things like where we push registers onto the stack and then further
down we'll do some math on the registers.
It all looks very complicated and valid and then they just pop the registers back off
and so lots of useless stuff but then also a lot of things where they obfuscate a key
by doing math on it and so we just like to get at that core functionality faster without
having to wade through the obfuscation.
They've also obscured the control flow with mangled jumps, some things that IDA Pro doesn't handle
by default.
Here's the control flow we started out with, all kinds of code flattening.
I mean, this is one function, but it obviously isn't going to really have 25 different entry
points and places to exit either.
Let's look at a small piece of it.
This is also more complicated than necessary.
I'm going to show you there are unreferenced instructions that don't seem to come from
anywhere.
We also have an obfuscated pop in the form of, let me get my mouse here.
You're moving the value on top of the stack into EBX and then here's an obfuscated jump
right in the middle of that. In the case of the deobfuscator, our first run
would probably just get rid of this push ret and turn it into a jump, and then our next
run would deal with this obfuscated pop, where we then add four to ESP.
Another type of obfuscated push: they do both pushes and pops this way, by doing a dec or
inc ESP over and over and over again and then moving a value into a register or onto the
stack. And here's an obfuscated jump. These two, the push ret, those are dealt
with by IDA Pro, because I had a discussion with Ilfak the other day and he
said that apparently some compiler actually generates that, so that's why he handles it.
But this one is obfuscated a little more: they obfuscate the push and then do a ret,
so IDA doesn't quite handle that.
So I'm going to show you our solution, which is our deobfuscator plug-in.
We combine emulation techniques, we can emulate math operations, emulate things going on in
the stack and we just do it on a sort of case by case basis.
We find the beginning of a pattern and then we emulate until that pattern no longer matches
and then we move onto the next set of instructions and look for more patterns.
We try to determine the proper control flow by getting rid of anti-disassembly, and
then we transform the instructions to enhance readability for both static and dynamic
analysis.
In dynamic it's easy to see where code goes when the jumps go to the proper place the first
time and you don't have to go around all over the place and for static it's easier to look
at and that sort of thing.
We have six basic modes of operation, one we just recently added.
Anti-disassembly is the first thing you would want to do if you're running this plug-in.
We replace anti-disassembly with simplified code so that IDA can then reanalyze and you
know, straighten out your control flow.
With passive we have simple peephole rules.
I'll kind of go into what a peephole rule is briefly: just something really simple where
there's not a lot of analysis to be done, like an xchg EBX, EAX followed
by the exact same thing.
We would just code that up really easily, and it's passive because removing it is not a
very risky thing to do as far as changing the behavior of the code.
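A passive rule like that double exchange could look roughly like the following Python sketch. The tuple instruction model and the function name are assumptions for illustration, not the plugin's real rule format; it also accepts the two operands in either order, since xchg is symmetric.

```python
# Hypothetical toy model: an instruction is a tuple (mnemonic, operands...).
NOP = ("nop",)

def remove_double_xchg(insns):
    """Replace two back-to-back exchanges of the same register pair with NOPs."""
    out = list(insns)
    i = 0
    while i < len(out) - 1:
        a, b = out[i], out[i + 1]
        # Exchanging the same two registers twice restores both, so the pair is junk.
        if a[0] == "xchg" and b[0] == "xchg" and set(a[1:]) == set(b[1:]):
            out[i] = out[i + 1] = NOP
            i += 2
        else:
            i += 1
    return out
```

Nothing has to be emulated or tracked here, which is why a rule of this kind is low risk.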
And then we have aggressive rules.
We make some assumptions about memory contents.
We also track registers and some stack contents also.
With ultra we make more aggressive assumptions.
We track multiple registers, whereas in aggressive (I'm sorry, I misspoke earlier) we usually
just track one register at a time.
So you can choose any of those four either exclusively or in combination to get the right
deobfuscation level for your particular application.
And then with remove NOPs, that's the thing that we do when it's all said and done and
there are no more patterns being found by our first four modes.
We want to jump over the NOPs that we've created just to make it look nicer and so
that you don't have to step through NOPs when you're debugging.
Collapse is sort of just an evolution of remove NOPs.
Instead of jumping over NOPs, in this case we take the function that you want to deobfuscate
and we just move all the instructions from below NOPs or from in between NOPs.
We just move it all up so that you get hopefully one continuous code block.
The plugin is invoked with Alt-Z and you get this nice GUI.
Like I said, you can select any of these deobfuscation levels up at the top in combination with each
other.
Remove NOPs, that's something to be done by itself.
You give it a start and end address.
Usually you want to do it by function.
Weird things happen if you start going over the end of the function.
But also we have collapse down below that and you want to give it a start and end slack
space and the reason we did that is because sometimes you'll have especially with code
that's intentionally obfuscated you'll have functions that have chunks of other functions
mixed in there and you don't want to just write over them.
So give yourself enough room and find a good slack space that will fit all of the function
in and then it will just slap it in there.
We use the structures created by IDA Pro.
Obviously it's done the analysis for us, found all of the operands and that sort of
thing.
We can follow jumps and calls or not for anti-disassembly.
And it's better to just go straight from start to end address.
We can track registers and stack contents.
Here's a piece of demonstration code that Jason put together.
It's protected to the gills with anti-disassembly and obfuscation.
It really only does one or two different things but it's also a little bit longer than what
it appears to be here.
You can see there's an obfuscated jump at the end.
We will need to do some calculation to figure out what EDX is so we can then jump there.
And when we run the deobfuscator iteratively, it'll get rid of all this stuff eventually.
The first run, we would want to run anti-disassembly like I said.
You'll find things like JZ-jump and call math.
These are things that are preventing analysis.
With a JZ-jump, we've got a couple of useless jumps at the beginning, followed by a junk
byte.
That's another thing.
Since this is kind of an empirical thing that we do, we find obfuscation patterns,
and if it's common that there's a junk byte after that, then we say, hey, maybe there
will be a junk byte.
Let's look for that.
And so it's in the pattern.
So this jump goes to a NOP.
The next jump goes to right after the NOP, so it does nothing.
So we just NOP everything there and it'll go on through.
The next pattern is call math.
Here EDX is getting the return address of a call and then there's some math on EDX.
So we'll just emulate the result and move it right into EDX.
NOP the rest and it's done.
The deobfuscator outputs a text file, which you can see comes in this format: we have
an offset address (a file offset), then an integer count of the bytes we're
going to inject, and then just the bytes to inject, on each line.
And then we take our binary injector, a Perl script.
It's pretty simple and we just give it the arguments of the text file and then the binary
or copy of the binary if you want to keep your original for dynamic analysis or in case
you're afraid that you've messed up something that might be checksummed or whatever.
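As an illustration only, here is a small Python sketch of what such an injector might do. The real tool is a Perl script whose exact file format isn't shown here, so the line layout assumed below (hex file offset, decimal byte count, then hex bytes, separated by whitespace) and the function name are guesses.

```python
def apply_patches(image: bytearray, patch_text: str) -> bytearray:
    """Apply patch lines of the ASSUMED form `<hex offset> <count> <hex bytes...>`."""
    for line in patch_text.splitlines():
        fields = line.split()
        if not fields:
            continue                              # skip blank lines
        offset = int(fields[0], 16)               # file offset, hexadecimal
        count = int(fields[1])                    # number of bytes to inject
        data = bytes(int(b, 16) for b in fields[2:])
        if len(data) != count:
            raise ValueError(f"byte count mismatch in line: {line!r}")
        if offset + count > len(image):
            raise ValueError(f"patch runs past end of file: {line!r}")
        image[offset:offset + count] = data       # overwrite in place
    return image
```

In practice you would read a copy of the binary into a `bytearray`, apply the patches, and write the result to a new file so the original stays untouched.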
The new functionality we've added also is to just modify the IDA Pro database in place.
So it makes it a lot faster to run the deobfuscator iteratively, because
you can just patch there, run it again, patch, run, patch.
Otherwise you've got to run your patching thing and then reload the database.
So now we reload or if we've done just our dynamic patching there then we can see we've
straightened out the control flow a little bit there.
We got rid of the conditional jumps.
And so now I'm going to describe why we want all this slack space.
Let me go back here to the previous one.
You notice we've got these NOPs here, and the reason that we don't want to immediately
just jump over them or collapse this function into something smaller is because the NOPs
are useful in the interim for making more deobfuscations.
So here's an example of why that's the case.
Let's say we've got a bunch of NOPs in between some instructions and that's from
a previous deobfuscation.
Well obviously this can just be changed into a move instruction which is only two bytes
but if our prior analysis has given us a value for EAX then maybe we just want to move that
immediate into EAX and that's going to take a couple more bytes.
So we've given ourselves some space to do that.
The next run, we're going to run passive, aggressive and ultra once we have gotten rid
of our anti-disassembly, and we find a move-math, and move-math-or-pop twice.
So let's see what those are.
With move-math, we move an immediate into EAX and XOR it, so we're just going to emulate that result
and move it into EAX in the first place and then NOP the math instruction.
Move-math-or-pop is kind of similar, except we can handle a couple of cases here: if a register
is overwritten with another move instruction, or if something is popped off the stack into
the register. It's a little bit more of a general pattern.
So we get rid of the pointless instructions there and simplify it.
To finish it all up, once we've gone through and found all the patterns that we can, we're
going to run one of those modes, either NOP remove or collapse, and I've already more or less described
those.
Here's an example showing NOP remove.
In many cases the collapse mode is going to give you a little bit cleaner result, where you won't
have the breaks between blocks and that sort of thing.
You'll just have one nice block of control flow.
But in this case it doesn't make that much of a difference.
So with NOP remove, you can see we've got tons of NOPs left over from this whole thing.
A block entirely made of NOPs, we just yank that out, and then we write a jump right under
the last non-NOP instruction.
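The jump-over step can be sketched in Python on a raw x86 byte buffer. This is an assumption-laden illustration rather than the plugin's logic: it treats every 0x90 byte as a standalone NOP instruction (true only if the buffer is instruction-aligned junk left by earlier runs), and the function name and the `min_run` threshold are made up.

```python
import struct

def jump_over_nops(code: bytearray, min_run: int = 3) -> bytearray:
    """Write a jmp at the start of each long run of 0x90 bytes, skipping the rest."""
    i = 0
    while i < len(code):
        if code[i] != 0x90:
            i += 1
            continue
        start = i
        while i < len(code) and code[i] == 0x90:
            i += 1
        run = i - start
        if run < min_run:
            continue                               # too short to be worth a jump
        if run - 2 <= 127:
            # jmp short: EB rel8, displacement counted from the end of the jmp
            code[start:start + 2] = bytes([0xEB, run - 2])
        else:
            # jmp near: E9 rel32, displacement counted from the end of the jmp
            code[start:start + 5] = b"\xE9" + struct.pack("<i", run - 5)
    return code
```

The bytes after the jump stay as NOPs, so the code layout is unchanged; collapse mode would move the following instructions up instead.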
And you can see this only, this code actually only does a couple of things.
So Rustock.B, obviously a much bigger example.
We're going to show you what it looks like after.
And you can see it's all linear.
I mean there's some, and now you can easily identify the structure of the loops and really
quickly tell what's going on here.
So yeah, let's look at it a little closer.
You can see there's an outside loop, the blue line and then there's a couple of nested
loops going on there.
And then down here we have two more independent loops before we actually at the bottom jump
into the real nasty code.
Here's the decryption pseudocode that we were able to pretty quickly reverse engineer
once we had gone through the deobfuscation process, which took all of maybe an hour
or so.
You can see the structure of the loops.
We have that blue outside loop, the two nested ones and then the two independent loops at
the bottom.
Now some sample source code, to give you an idea of how the deobfuscator works.
I'm not quite sure if we're going to be able to release the source code to the entire thing
but this gives you a feel for how you could do something similar yourself maybe.
The problem here is a nullsub, a call to nothing. IDA Pro identifies it as a nullsub.
It knows that this just jumps to a return, but it's ugly and we could use the slack space
in a number of different ways.
So our simple solution is, we just identify if there's a call made: if the instruction
we're looking at is a call, if its operand is of type o_near, you know, using all of those nice SDK
tools. And then we look at the address where it's going, see if that's a return, and if
it is, we NOP out the whole bunch.
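Outside of IDA, the same check can be approximated on raw bytes. The Python sketch below is not the plugin's source and does not use the IDA SDK; it assumes 32-bit x86, that the caller already knows `off` is the start of an instruction, and that the call target lies inside the same buffer. The function name is hypothetical.

```python
import struct

def nop_null_call(code: bytearray, off: int) -> bool:
    """If code[off] is `call rel32` to a bare `ret`, overwrite the call with NOPs."""
    if off < 0 or off + 5 > len(code) or code[off] != 0xE8:   # E8 = call rel32
        return False
    (rel,) = struct.unpack_from("<i", code, off + 1)
    target = off + 5 + rel                     # relative to the next instruction
    if not (0 <= target < len(code)) or code[target] != 0xC3:  # C3 = plain ret
        return False
    code[off:off + 5] = b"\x90" * 5            # the call has no effect, so NOP it
    return True
```

Only a plain `ret` (C3) is accepted here, because a `ret imm16` would also adjust the stack and so would not be a no-effect call.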
So in summary, I would say the reason this is a useful tool is that lots of malware authors
use obfuscation techniques to hide their IP and keep you from reverse engineering it also
to evade antivirus software and that sort of thing.
And we detect and simplify a lot of those patterns and as we go, as we find more, we
just add those to the database and we don't really have to do anything with them after
that.
As for ideas for future development, one of them we've already done:
code collapsing, check that off.
It would be nice also to be able to add a sort of grammar like a parser so that we could,
so that people could, without needing the source code or needing to write source code
for every pattern, it would be really cool if they could just kind of say, okay, I found
this pattern and it's got a move of two, you know, into a register of an immediate and
you could just write patterns in that way.
We'd also like to black-box control flow, to track successors and predecessors across calls
and jumps just so that we can determine if there's bogus control flow or honey code
that never really gets executed.
It's just there for show.
So that would eliminate a lot of reverse engineering time.
Here's my contact information and, yeah, I guess now it's time for questions.
Yes.
How do you generate the patterns?
Okay.
Well, I guess what you're getting at is, how do we write the code to identify a pattern?
Oh, yeah.
We figured out just by static analysis, so it does take a one-time sort of reverse engineering
of what that pattern does and determining that it's either useless code or that it does
something that can be simplified, put into one or two instructions.
Oh, the size of patterns?
Usually there may be two instructions, one instruction sometimes, not too big.
We try to keep each pattern simple, actually, to make it more flexible, since
it's meant to be run iteratively. We want to find the most basic unit of pattern
possible so that you can eliminate that unit of pattern, and then that way variations in
patterns are caught on the next run.
Sure.
Oh, seconds.
I mean, each run of the deobfuscator, yeah, seconds, maybe five or ten for something the
size of Rustock.B.
Of course, one issue that we do face, and I don't really see a way around it, is that when
you have cross references to code, it's more difficult to determine whether
or not you can really change that code without affecting some other part of the program.
So sometimes bogus cross references will trip us up and when we're running in the more aggressive
modes then sometimes we'll just ignore that stuff.
But other times, you know, if you want to make sure that things are safe, you just have
to punt.
Yes?
How do you know if it's bogus?
Yeah, especially in the case where the entire block is no ops.
Well, that is the case where our iterative-ness comes into play.
We would first decide that that content of that block can be simplified to something
that is no ops and then the next pass would obviously then remove the entire block by
jumping over it or moving the other code into the first block.
No, actually we've started to dabble in tracking the flags register and which instructions
affect the flags in what ways, so that we can hopefully get a handle on where
the program is going to go.
Like for example, we have some patterns that will determine if an instruction is just setting
up a conditional jump to always go a particular way.
So we're getting a handle on that.
Yes?
I guess the short answer is no.
Anyone else?
All right.
Well, thank you.