Occasionally we get projects where it really helps to know a little reverse engineering. Maybe there is some hidden functionality that is triggered by a specific input or by a specific option in a configuration file. Maybe it’s an application that communicates over some obscure protocol, and you want to understand its structure better. It could be anything, really. With that in mind, I was using some of my research time to improve my reversing skills. While doing so, I stumbled upon a small challenge in a cool introductory course on Coursera. That challenge had a few obstacles deliberately placed in it to make reversing a little harder. I found circumventing those a lot of fun and thought I’d briefly talk about my experience here.
Disclaimer: I’m going to assume you know what disassembling and debugging a program means. I’ll try and keep it really simple, but might occasionally use jargon without introducing it.
The binary in question was a small Linux binary with the secret hidden inside it. The challenge was to identify the secret and submit it to the Coursera website. The first thing I did was check the type of the binary using the file command in Linux. Here’s what I found.
bonus_reverse-challenge: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, BuildID[sha1]=0x2fe5f1647532449ffeef36a7fa31ae8319c8818d, stripped
Stripped? So, a program has functions, constants and variables, and they all have names. If all of those were badly or randomly named, you’d have trouble working out exactly what the code does. Now, if you have only a binary and no source, it’s hard enough working out what the program is trying to do. Stripping a binary removes those names from it, which makes it harder still. Common function names like main(), for example, won’t be there in a stripped binary, so you have no obvious starting point. You cannot load up a debugger (gdb on Linux) and run a command like disassemble main, because the debugger will not know that there is anything called main() at all.
So I had to start reading up on how to debug stripped binaries. In short, you have to identify the entry point of the binary, set a breakpoint there and then start reading the assembly. The gdb command for that, after you load the file, is info file. Get the entry point from its output and set a breakpoint on it using the br *ADDRESS command. Run the program, and once it stops at the breakpoint you can view the assembly with commands like x/10i $pc (the next 10 instructions). So just because you can’t use your “usual” commands does not mean you can’t disassemble or debug anything.
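The entry point that info file reports comes straight out of the ELF header, so you can also read it without gdb. Here is a minimal Python sketch of that idea. The function name elf_entry_point is my own invention for illustration, and it only handles the plain 32-bit and 64-bit header layouts, nothing more exotic.

```python
import struct


def elf_entry_point(header: bytes) -> int:
    """Return the e_entry field from the first bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")

    ei_class = header[4]  # 1 = 32-bit, 2 = 64-bit
    ei_data = header[5]   # 1 = little-endian, 2 = big-endian
    endian = "<" if ei_data == 1 else ">"

    # e_entry sits at offset 24 in both layouts:
    # 16 bytes of e_ident, then e_type (2), e_machine (2), e_version (4).
    if ei_class == 1:
        (entry,) = struct.unpack_from(endian + "I", header, 24)
    elif ei_class == 2:
        (entry,) = struct.unpack_from(endian + "Q", header, 24)
    else:
        raise ValueError("unknown ELF class")
    return entry


# Example use (the path is whatever binary you are looking at):
#   with open("bonus_reverse-challenge", "rb") as f:
#       print(hex(elf_entry_point(f.read(64))))
```

The address it prints should be the same one you would hand to br *ADDRESS in gdb.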
As I went through the binary, I noticed a bit where it checked whether it was running in a debugger, using the return value of a ptrace call, and exited if it was. The binary would call ptrace() with its arguments; if ptrace returned -1 (error), that meant the binary was already being traced, i.e. loaded inside a debugger. So under gdb my binary would run, get to ptrace(), make the call and exit().
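To make the trick concrete, here is a rough Python sketch of the same kind of check. I am assuming the request is PTRACE_TRACEME (value 0), which is the usual way this is done but is not something the disassembly snippet in this post shows. The helper names are made up for illustration, and the ctypes call is Linux-only.

```python
import ctypes

PTRACE_TRACEME = 0  # assumed request; the usual choice for this trick


def _libc_ptrace(request: int) -> int:
    """Call ptrace(request, 0, NULL, NULL) through libc (Linux only)."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.ptrace.restype = ctypes.c_long
    libc.ptrace.argtypes = [
        ctypes.c_long, ctypes.c_long, ctypes.c_void_p, ctypes.c_void_p,
    ]
    return libc.ptrace(request, 0, None, None)


def being_traced(ptrace=_libc_ptrace) -> bool:
    """Return True if the ptrace call fails with -1.

    A process can only have one tracer, so PTRACE_TRACEME fails
    when a debugger is already attached.
    """
    return ptrace(PTRACE_TRACEME) == -1


# Example use:
#   if being_traced():
#       raise SystemExit("debugger detected")
```

The ptrace parameter is only there so the logic can be exercised with a stand-in function. Calling the real thing marks the process as traced by its parent, so it is not something to sprinkle around casually.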
Here’s a little snippet of disassembly from the binary.
mov al, 1Ah
int 80h ; LINUX - sys_ptrace
mov [ebp+var_C], eax
mov eax, [ebp+var_C]
cmp eax, 0FFFFFFFFh
jnz short locret_804896C
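… So if you look at the last two lines, they’re effectively saying: if eax is NOT -1 (no error), jump to 804896C, which going by IDA’s locret_ label is just a plain return, and carry on as normal. If eax is -1, the jump is not taken and execution falls through to the code right after it, which turns out to be exit(). Inside a debugger eax is -1, so the fall-through was the branch I kept ending up on. That’s NOT the branch I wanted to take. I didn’t want to exit and wanted to continue debugging.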
So I opened the binary up in a hex editor, searched for that sequence of hex bytes and changed the jnz to jz. This just meant that, IF there was an error, don’t exit. :) The opcode for a short jnz is 75 and that of a short jz is 74, so the whole patch was one byte: change 75 to 74 and save the file. The flip side is that the check is now inverted, so the patched copy only gets past it when it IS being debugged. This then let me debug freely.
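If you’d rather script the patch than click around in a hex editor, something like the following Python sketch would do. The function name flip_jnz_to_jz is hypothetical, and the context bytes 83 F8 FF used in the example are just one possible encoding of cmp eax, 0FFFFFFFFh. I am assuming them for illustration, so check the real bytes in your disassembler first.

```python
JNZ_SHORT = 0x75  # opcode of a short jnz
JZ_SHORT = 0x74   # opcode of a short jz


def flip_jnz_to_jz(image: bytes, context: bytes) -> bytes:
    """Return a copy of image where the jnz right after `context` becomes jz."""
    start = image.find(context)
    if start == -1:
        raise ValueError("context bytes not found")
    if image.find(context, start + 1) != -1:
        raise ValueError("context bytes are not unique; use a longer pattern")

    opcode_at = start + len(context)
    if opcode_at >= len(image) or image[opcode_at] != JNZ_SHORT:
        raise ValueError("expected a short jnz (0x75) after the context bytes")

    patched = bytearray(image)
    patched[opcode_at] = JZ_SHORT  # the jump displacement byte is left alone
    return bytes(patched)


# Example use (assumed encoding of the cmp, see the note above):
#   data = open("bonus_reverse-challenge", "rb").read()
#   open("patched", "wb").write(flip_jnz_to_jz(data, b"\x83\xf8\xff"))
```

Refusing to patch when the pattern appears more than once is deliberate, since blindly flipping the first 75 you find is a good way to break some unrelated jump.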
As I went further into the challenge, I started using IDA more and more to understand what the code was actually doing. I marked dead paths and colored code that I did NOT want to analyze. Eventually I hit a dead end, as I’d taken every single path that IDA had suggested but not gotten anywhere. Ugh.
I then switched over to IDA’s TEXT view (Graph view is the default view and very cool, but it can be thrown off by some clever code, as in this case). The moment I went into text view, I saw a RED colored line in IDA and some code underneath which looked very much like the solution to the problem. Here’s a screenshot of exactly what I saw.
The moment you see RED in IDA, it’s a warning sign that IDA is not sure of what it’s doing and you should manually dig deeper.
A little below I saw what looked like a hardcoded encrypted key and a simple XOR encryption routine. So I decided not to spend any more time debugging the program and to write a little bit of Python code to decrypt the key instead. I gave the encrypted key and the character it was being XOR’d with as inputs to the program, and within a few minutes I had the answer to the challenge. Here is the little snippet of assembly code that I managed to reverse and write Python code for. I’ve commented each line so it’s easier to understand things.
In short, each byte of the clear text string was being XOR’d with the byte 0x2A to encrypt it. XOR, though, is completely reversible: Plaintext XOR key = Ciphertext and Ciphertext XOR key = Plaintext. We had the key ( ‘*’, which is 0x2A in ASCII ) and we had the cipher text, so getting the plaintext back wasn’t too hard at all. Entering the plaintext string on the Coursera website confirmed that I was right. Nice. :)
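For the curious, the decryption boils down to something like this Python sketch. It is a reconstruction of the idea and not my original script, and the sample bytes in the comment are made up, because I’m not going to give away the actual secret here.

```python
KEY = 0x2A  # the single-byte key, i.e. ord("*")


def xor_bytes(data: bytes, key: int = KEY) -> bytes:
    """XOR every byte of data with the same one-byte key.

    Running it twice with the same key gives back the original data,
    so this one function both encrypts and decrypts.
    """
    return bytes(b ^ key for b in data)


# Example use with made-up data:
#   ciphertext = xor_bytes(b"some secret")
#   xor_bytes(ciphertext)  # back to b"some secret"
```

Feed it the encrypted bytes copied out of the binary and the plaintext drops out.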
In a nutshell, I learnt how to debug stripped binaries, bypassed a ptrace-based anti-debug check, picked up a few tricks about IDA and wrote a little bit of Python code to reverse an algorithm. Quite a lot of fun. :) If you have any questions or feedback or just want to discuss something about reversing, please leave a comment or drop me an email.