Pyew! A Python tool to analyze malware

2010, Feb 08    

While working on a disassembler with code analysis, meant to speed up (graph) analysis of malware dumps (malware dumped from memory while running), I decided to write a malware analysis tool built around this core. The result is Pyew!

Pyew is a tool like radare or biew/hiew. It’s a hexadecimal viewer and disassembler for IA32 and AMD64 with support for the PE and ELF formats as well as non-executable formats such as OLE2 or PDF. On the project’s page you can find usage examples (like the superficial analysis of some Mebroot downloaders) as well as the features of the version available for download as a package (however, I recommend downloading the bleeding-edge version from the Mercurial repository available here).

Although Pyew has a command line interface (and a graphical user interface is planned), it was written for batch analysis of malware. Imagine the following situation: you need to analyze a bunch of malware samples, say 1000 new ones. What would you do? Analyze all of them manually, one by one? It’s better to write some sort of batch script that analyzes the samples and produces a simple report about them. In the Pyew wiki you can find an example batch script that checks for specific marks in the file header, gets the API calls made at the entry point, or lists uncommon mnemonics found at the entry point.

To show another example of Pyew in batch mode, I will explain how to write a simple script that gets the mnemonics of instructions commonly used as antidebug tricks. Let’s start writing the script. First, import the libraries we need:

    from pyew_core import CPyew

We need to import the class CPyew from pyew_core (the kernel of Pyew). Next, write the code that loads one file and, once it is loaded, prints the antidebugs found:

    import sys
    from pyew_core import CPyew

    filename = sys.argv[1]
    pyew = CPyew(batch=True)   # Specify that we're in batch mode
    pyew.codeanalysis = True   # Just in case; by default code analysis is always performed
    pyew.loadFile(filename)    # Load the file, read all the structures, perform code analysis, etc.

    # Parenthesized form works as both the Python 2 statement and the Python 3 function
    print(pyew.antidebug)

That’s all! This simple script takes a file as input and analyzes it for mnemonics used as antidebug tricks (like INT 3 or RDTSC). Now it’s time to write a better script that takes a directory and recursively traverses every subdirectory to analyze all files. The final result is here.
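
The linked script is not reproduced in this text, so purely as an illustration, here is a minimal sketch of what such a recursive batch script could look like. It is not the author's original: the names `analyze_with_pyew`, `scan_tree` and `main` are hypothetical, and the only Pyew calls used are the ones from the single-file script (`CPyew(batch=True)`, `codeanalysis`, `loadFile` and `antidebug`). The directory traversal relies on the standard library's `os.walk`.

```python
import os
import sys


def analyze_with_pyew(path):
    """Return the antidebug mnemonics Pyew reports for one file."""
    # Imported lazily so the directory walker below also works without Pyew
    # installed (for example, when a different analyzer is passed in).
    from pyew_core import CPyew

    pyew = CPyew(batch=True)   # batch mode, as in the single-file script
    pyew.codeanalysis = True
    pyew.loadFile(path)
    return pyew.antidebug


def scan_tree(root, analyze=analyze_with_pyew):
    """Walk root recursively and map every file path to its analysis result."""
    results = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                results[path] = analyze(path)
            except Exception as exc:
                # One broken sample should not abort the whole batch.
                results[path] = "error: %s" % exc
    return results


def main(argv):
    if len(argv) != 2:
        print("usage: %s <directory>" % argv[0])
        return
    for path, antidebugs in sorted(scan_tree(argv[1]).items()):
        print("%s: %s" % (path, antidebugs))


if __name__ == "__main__":
    main(sys.argv)
```

Passing the analyzer as a parameter is a design choice of this sketch, not something Pyew requires: it keeps the traversal independent of the per-file analysis, so the same walker could drive any other check, such as the header checks mentioned earlier.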