Some weeks ago I started developing a binary diffing plugin for IDA Pro (in IDA Python) like Zynamics BinDiff, DarunGrim or Turbo Diff. The reasons to create one more (open source) plugin for such task are various, but the following are the main ones:
- We need an Open Source plugin/tool that is updated, maintained and easy to modify or adapt.
- The plugin should do much more than what the current ones do. It must offer much more functionality than previously existing ones.
- The plugin should be as deeply integrated in IDA as possible (because 99% of serious researchers use IDA as the main tool).
- The plugin must not be subject to big corporation’s desires (i.e., Google).
The plugin or tool I have more used and the one I liked the most was Zynamics BinDiff. However, after Google bought the company, updates to it are either too slow or non existent (you can check this issue and, my favourite, this one, where Google people tells to actually patch the binary and that, may be, they can have a real fix for the next week). Also, nobody can be sure Google is not going to finally kill the product making it exclusively a private tool (i.e., only for Google) or simply killing it because they don’t want to support it for a reason (like it killed GoogleCode or other things before). Due to this reason, because I like no current open source plugins for bindiffing and, also, because they lack most of the features that, on my mind, a decent todays binary diffing tool should have, I decided to create one of mine: Diaphora.
Two years ago I started a project, for fun, to try to catch as much malware and URLs related to malware as possible. I have written about this before. In this post I’ll explain the heuristics I use for trying to classify URLs as malicious with “Malware Intelligence” (the name of the system that generates the daily Malware URLs feed). What a normal user sees in any of the 2 text files offered are simply domains or URLs that I classified as “malicious”. But, how does “Malware Intelligence” classifies them as malicious? What heuristics, techniques, etc… does it use?
This is a history of fail. I was analysing a piece of code, in assembly, that I thought would be vulnerable to a zero allocation bug allowing me to overwrite some bytes of heap space (overwriting a structure with many function pointers!). However, after spending like 2 hours analysing statically the “bug”, and documenting it, I finally discovered it wasn’t vulnerable. #Fail.
Auditing a product recently I noticed a curious scenario where I control the following:
- Unix based: The limited vulnerability allows one to create any file as root controlling the contents of that file. I can even overwrite existing files.
- Windows based: The vulnerability allows one to execute an operating system command but doesn’t allow, for some reason, copying files as the Unix vulnerability allows.
In the next paragraphs I will explain how one could exploit such somewhat limited scope vulnerabilities in order to execute remote arbitrary code in the context of the running application (root under Unix and SYSTEM under Windows). In any case, I’ll also explain the opposite case: one can execute an arbitrary operating system command in Unix based systems but can’t create an arbitrary file in the system and one can create an arbitrary file anywhere in the system in Windows operating systems but cannot execute an arbitrary command.
It’s been a while since I started writing a first prototype to try to catch as much malware (URLs and samples) as possible. Today I can say my project is all grown up as it’s generating, daily, a feed with around 9.000 malware URLs and with a low rate of false positives (although there may be some).
The process of finding malware URLs in my tool used to be only a matter of finding suspicious URLs in social networks (Twitter and Identi.ca), checking mail accounts receiving loads of bad stuff and nothing else. At first. Today I’m using crawlers, honeypots, sandboxes, thirdy party public URL feeds, private URL feeds (provided under consent), executable unpackers, heuristic engines for Flash movies, PDFs, OLE2 documents, etc… It changed a lot and became a big project that, I hope, can give useful information for malware researchers.