Obfuscation or Open Design?

In November of last year Engadget ran a story explaining how easy it was to decompile Windows Phone 7 applications. A lot of developers were surprised that their apps could be reverse-engineered and that attackers could easily browse their source. The attack goes something like this: download Red Gate's .NET Reflector (originally written by Lutz Roeder), sync your Windows Phone 7 or download a Windows Phone 7 app to your computer, open the app in Reflector, right-click and select "decompile." Once the application has been decompiled, Reflector displays its source in the main window. There's even an option to export the source as a Visual Studio project. This makes it easy to understand the algorithms used for key management, licensing, or that sweet graphics engine they've developed. You can leverage the same attack on JAR or Silverlight files you download off the Internet, or on any other application written in a language that compiles to some kind of IL or byte-code.
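
To make the stakes concrete, here's a contrived C# sketch (names and values invented for illustration) of the kind of thing Reflector hands back:

    // Contrived example (hypothetical names and values): roughly what an
    // attacker sees after decompiling an assembly. String literals and
    // logic come back essentially verbatim.
    public class LicenseChecker
    {
        // Anyone browsing the decompiled source sees this immediately.
        private const string ApiKey = "sk_live_not_so_secret_after_all";

        public bool IsLicensed(string licenseCode)
        {
            // The whole "secret" validation algorithm is laid bare too.
            return licenseCode != null && licenseCode.EndsWith("-VALID");
        }
    }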

Before you dust off your copy of "The C Programming Language" and decree that everything must be compiled to the bare metal, or decide you'll write your own obfuscated assembly… by hand… consider your options and what jumping through all these hoops will actually buy you.

The common reactions to decompilation attacks are:

  1. Write it in a native language
  2. Obfuscate
  3. Design with openness in mind

I’ll be talking about each of these in depth in this blog, but here’s a preview:

Native languages such as C/C++ are compiled to machine code, which the processor executes directly, without an interpreter. These can be decompiled too; there are some amazing tools out there to help with this. Obfuscation can raise the bar, but if the reward is great enough an attacker will break it, and the obfuscation may simply frustrate your attacker into focusing on your app with laser-like intensity. Understanding the threats early and designing with openness and security in mind can help you move many of the threats off the untrusted mobile device.

<disclaimer>I'm not a lawyer, so I'll stick to the stuff I do know: application security. But if you're really concerned about intellectual property and somebody stealing your ideas, I'd suggest you read up on patent law and some of the laws around "prior art."</disclaimer>

Mobile devices are simply another class of client. Developers have been programming client applications since the first two computers were networked together, and one lesson we keep learning is: don't trust the client. Don't trust it for input validation, don't trust it to create your SQL queries, don't trust it for authentication or authorization, and don't trust it with anything that matters. Clients have come in all kinds of flavors over the years. The web is the client-server paradigm that rules today; ten years ago the client-server model was custom: everything had a client we had to download and a port we had to open to enable the functionality of the software. Neither of these models is inherently more secure than the other.
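
As a minimal sketch of what that looks like in practice (the connection string, schema, and length limit here are hypothetical), the server re-validates everything and builds its own parameterized queries, no matter what the client claims to have checked:

    using System;
    using System.Data.SqlClient;

    public static class AccountService
    {
        // Server-side handler: re-validate the input and build the query
        // ourselves, with parameters, regardless of what the client did.
        public static decimal GetBalance(string accountId, string connectionString)
        {
            // Re-validate on the server even if the client already "validated".
            if (string.IsNullOrEmpty(accountId) || accountId.Length > 20)
                throw new ArgumentException("Invalid account id", "accountId");

            using (var conn = new SqlConnection(connectionString))
            using (var cmd = new SqlCommand(
                "SELECT Balance FROM Accounts WHERE AccountId = @id", conn))
            {
                // Parameterized: client-supplied data can never become SQL.
                cmd.Parameters.AddWithValue("@id", accountId);
                conn.Open();
                return (decimal)cmd.ExecuteScalar();
            }
        }
    }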

Most of the apps I've seen on mobile devices do a great job of letting me access data that is already in the cloud, mash up two or more sources of data for my mobile browsing pleasure, or perform the bulk of the processing on the server due to processing or storage limitations on the handheld. If your app falls into any of these categories, you've got little to worry about. Most of the neat stuff your app does is not on the device! Just keep innovating and you'll stay consistently ahead.

Here are some examples of apps I use often:

  • OneBusAway – Mashup/Server Processing
  • Twitter – Cloud
  • Yelp – Cloud

So what if your app really needs to protect something? Two apps on my device immediately come to mind and fall into that category: Rhapsody and Kindle. Both of these applications protect data using Digital Rights Management (DRM) so protecting keys is … well, key. We'll talk more about encryption and key management options later.

Let’s talk about options.

Write it in a native language

One of the first suggestions to come up in early discussions about securing a client app is to write it in a language that compiles to machine code. Machine code is the lowest-level language for giving instructions to the processor itself. It is highly hardware dependent, which makes portability difficult, and it increases exposure to other security vulnerabilities such as buffer overflows, format string vulnerabilities, and other memory management issues. Most of these issues are far worse than information disclosure or decompilation attacks could ever be; many of them allow arbitrary code execution on the device.

Native code is more difficult to decompile, true, but with tools like IDA Pro it's possible to disassemble native applications, and sometimes possible to reverse the binary back to readable C source code. Writing your application in native code can help obscure the original source and make it harder for an attacker to understand what the application is doing, but the risks native code inherits aren't worth giving up the protections provided by managed languages like C# and Java.

Choosing a native language over a managed one for security purposes is like locking yourself in the lion's cage at the zoo because you're afraid of the mice.

Obfuscate

Microsoft's official stance on releasing .NET applications is to obfuscate them before release. Specifically, Microsoft says the following about PreEmptive Solutions' Dotfuscator on its website: "any .NET program where the source code is not bundled with the application should be protected with Dotfuscator."

Obfuscating an application attempts to make reverse engineering and decompilation more difficult. The results range from producing hard-to-read code after a decompile to frustrating a focused attacker enough that they wage a personal vendetta against your application, vowing to untangle the mess of obfuscated code if it takes dozens of Red Bulls and weeks of late nights.

Basic obfuscation techniques rename methods, parameters, and variables to short or meaningless strings, as in the sketch below. Advanced techniques actively exploit the tricks decompilers use to get back to the original source.
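
Here's a small before-and-after sketch of what basic renaming does to decompiled C#; the "after" is illustrative output, not from any particular tool:

    // Before obfuscation: decompiled output reads almost like documentation.
    public class LicenseValidator
    {
        public bool IsValid(string licenseKey)
        {
            int checksum = 0;
            foreach (char ch in licenseKey)
                checksum += ch;
            return checksum % 97 == 0;
        }
    }

    // After basic renaming: identical logic, but every hint of intent is gone.
    public class a
    {
        public bool b(string c)
        {
            int d = 0;
            foreach (char e in c)
                d += e;
            return d % 97 == 0;
        }
    }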

When asked about obfuscation at conferences or in classes I usually respond the same way: It can’t hurt. Obfuscating your code will raise the bar for who can decompile your code and reduce the likelihood of an attacker being able to quickly and easily Trojan your binaries. However, like most things, as a single line of defense it is far from sufficient.

Design with Openness in Mind

Of course this entire blog has been one big lead-up to what I really wanted to talk about: the security principle of designing with openness in mind. If we assume our attackers have access to our source (and comments), bug tracking system, design documents, and architecture diagrams, and we can still look each other in the eye and say "this is a secure system," then we've gone a long way down the path to a truly secure system.

All modern cryptographic systems are designed with this assumption in mind; it's known as Kerckhoffs's principle. If you want to, you can learn exactly how AES, one of the most popular and secure algorithms in the world, works. Heck, there are even stick-figure cartoons to help you understand. At the end of the day, understanding exactly how AES works will not help you break the encryption; in fact, understanding it will help you make better decisions about how to use it properly.
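
Here's a minimal C# sketch of using the framework's AES support: everything about it is public except the key, and managing that key is the real design problem:

    using System.Security.Cryptography;

    public static class AesSketch
    {
        // Encrypts plaintext with a caller-supplied key. The IV is random
        // per message and can travel with the ciphertext in the clear; the
        // key is the only secret.
        public static byte[] Encrypt(byte[] plaintext, byte[] key, out byte[] iv)
        {
            using (var aes = Aes.Create())
            {
                aes.Key = key;     // 16, 24, or 32 bytes
                aes.GenerateIV();  // fresh random IV from the framework's CSPRNG
                iv = aes.IV;
                using (var encryptor = aes.CreateEncryptor())
                    return encryptor.TransformFinalBlock(plaintext, 0, plaintext.Length);
            }
        }
    }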

We're not all building crypto libraries, of course, but we can apply this same principle to our code. By making the above assumptions we're covering all our bases and making sure there aren't any "keys to the castle" hidden in source code. By understanding how easily an attacker can reverse engineer and decompile our applications, we're less likely to simply hope they don't find our secrets; we'll take steps to make sure they don't.

Here are a few common examples of designing with security and openness in mind up front:

Client-Server Authentication: I've seen more than a few applications that need to authenticate a client to a server without user interaction. The naïve way of accomplishing this is to embed the same set of credentials in every binary and have the client send them up to the server for authentication. Of course, if that one set of credentials is compromised, every client must be updated. The next level is to add some kind of registration phase and give each client unique credentials. This improves things by limiting the damage if one set of credentials is lost, but it can be very challenging to protect credentials on the client or in transit. Finally, a proper Public Key Infrastructure (PKI) lets you build a secure system without sacrificing speed or ease of development. Simply generating X.509 (SSL) certificates on the server and shipping the corresponding public certificate with the client, as sketched below, can go a long way toward a system that is resilient to tampering and sniffing attacks.
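
Here's a minimal C# sketch of that idea using certificate pinning (the thumbprint value is hypothetical): the client will only talk to a server presenting the exact certificate we shipped, and nothing embedded in the binary helps an attacker who decompiles it:

    using System.Net.Security;
    using System.Net.Sockets;
    using System.Security.Cryptography.X509Certificates;

    public static class PinnedClient
    {
        // The thumbprint is public information; decompiling the client
        // reveals nothing an attacker can use to impersonate the server.
        private const string ExpectedThumbprint = "HYPOTHETICALTHUMBPRINTVALUE";

        public static SslStream Connect(string host, int port)
        {
            var client = new TcpClient(host, port);
            var ssl = new SslStream(client.GetStream(), false,
                (sender, cert, chain, errors) =>
                {
                    // Anchor trust to our own certificate, not just any CA.
                    if (cert == null) return false;
                    return new X509Certificate2(cert).Thumbprint == ExpectedThumbprint;
                });
            ssl.AuthenticateAsClient(host);
            return ssl;
        }
    }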

Plugin Extensibility: What if you want to extend your application with new plugins, and you want those plugins to come only from a trusted source? The application needs to validate the origin of the DLL. One basic approach is the same as above: embed a super-secret string in the DLL, then ask the DLL for that secret when it's loaded. If the secret matches, you're good to go, right? Wrong. It's trivial to discover that secret and build it into a rogue library. Another poor solution I've seen is a secret algorithm that processes data in some specific way. For the sake of example, let's say the secret algorithm is "add 5, divide by 2 and round down." To check the validity of the DLL, I generate a number, do the calculation myself, send the number to the DLL, and compare what it returns. If I generate 7, I'm looking for a 6 (floor((7+5)/2) = 6). If they match we know it's the right DLL, right? Wrong again. Just like the hidden secret above, I can discover your secret algorithm by decompiling or reverse engineering your existing DLLs. Worst case, I can just write a pass-through method that asks one of your valid DLLs for the answer! Crypto to the rescue again: we can sign each of our DLLs and binaries before we ship. This binds our DLLs to a trusted source (anybody who holds the private signing key) and allows the application to cryptographically validate each DLL's authenticity, as in the sketch below.
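
Here's a minimal C# sketch of the verification side (the file naming and key handling are hypothetical; in practice .NET's built-in strong naming or Authenticode signing can do this job for you):

    using System.IO;
    using System.Security.Cryptography;

    public static class PluginVerifier
    {
        // Verifies plugin.dll against a detached signature (plugin.dll.sig)
        // using the public key embedded in the host application. Only the
        // holder of the private signing key can produce a valid signature.
        public static bool IsTrusted(string dllPath, RSAParameters publicKey)
        {
            byte[] dllBytes = File.ReadAllBytes(dllPath);
            byte[] signature = File.ReadAllBytes(dllPath + ".sig");

            using (var rsa = RSA.Create())
            {
                rsa.ImportParameters(publicKey); // public half only: safe to embed
                return rsa.VerifyData(dllBytes, signature,
                    HashAlgorithmName.SHA256, RSASignaturePadding.Pkcs1);
            }
        }
    }

Only load the plugin, with Assembly.Load or similar, after this check passes.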

I don't mean to imply cryptography is the source of all security solutions; these are simply two common examples that I've seen cause problems for a lot of our clients. In the long term, thinking about threats under the assumption that the attacker has access to your inner workings makes for a significantly more secure system. Attacks like reverse engineering and decompilation are just the technical side of the greater issue of secret hiding. People love talking about their work, especially the cunning algorithms they design; you'd be amazed at what you can learn at a pub, a popular lunch stop, or just by asking.

As you design and architect your application, go through each component and ask yourself: what data or algorithm needs to be protected? What is the loose thread holding your application together that, if discovered by an attacker, could lead to the ultimate compromise of that application? Once you've identified these pieces, decide whether each threat is something you will mitigate or a risk you are comfortable accepting according to your internal policy.