<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Unintended Results &#187; pyew</title>
	<atom:link href="http://joxeankoret.com/blog/category/pyew/feed/" rel="self" type="application/rss+xml" />
	<link>http://joxeankoret.com/blog</link>
	<description>Or maybe not</description>
	<lastBuildDate>Fri, 14 May 2010 23:41:09 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Analyzing PDF exploits with Pyew</title>
		<link>http://joxeankoret.com/blog/2010/02/21/analyzing-pdf-exploits-with-pyew/</link>
		<comments>http://joxeankoret.com/blog/2010/02/21/analyzing-pdf-exploits-with-pyew/#comments</comments>
		<pubDate>Sun, 21 Feb 2010 14:46:23 +0000</pubDate>
		<dc:creator>joxean</dc:creator>
				<category><![CDATA[Malware]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[pyew]]></category>
		<category><![CDATA[obfuscated]]></category>
		<category><![CDATA[pdf]]></category>

		<guid isPermaLink="false">http://joxeankoret.com/blog/?p=95</guid>
		<description><![CDATA[Something I really hate doing when analyzing malicious PDF exploits is manually extracting the streams and decoding them by hand to see the (typically hidden) JavaScript code, so I decided to extend the PDF plugin for Pyew to decode and display them automatically. Now, with the new version of the plugin (download it from the Mercurial [...]]]></description>
			<content:encoded><![CDATA[<p>Something I really hate doing when analyzing malicious PDF exploits is manually extracting the streams and decoding them by hand to see the (typically hidden) JavaScript code, so I decided to extend the PDF plugin for <a title="Pyew" href="http://code.google.com/p/pyew" target="_blank">Pyew</a> to decode and display them automatically. Now, with the new version of the plugin (download it from the <a href="http://code.google.com/p/pyew/source/checkout" target="_blank">Mercurial repository</a>), we can see which filters the exploit uses and, most importantly, the decoded streams, regardless of how many filters are applied.<br />
<span id="more-95"></span><br />
<strong>Example</strong></p>
<p>For example, I will take an obfuscated PDF exploit (SHA256 6a8204ee7b703f96f811f32f903ac9df4045b05910d633fc34fed89e2e0a7576). To see what is inside, simply open it in Pyew by running the command &#8220;pyew sample.pdf&#8221;:</p>
<blockquote><p>$ pyew sample.pdf<br />
PDF File</p>
<p>PDFiD 0.0.9_PL 6a8204ee7b703f96f811f32f903ac9df4045b05910d633fc34fed89e2e0a7576<br />
PDF Header: %PDF-1.1<br />
obj                    4<br />
endobj                 4<br />
stream                 1<br />
endstream              1<br />
xref                   1<br />
trailer                1<br />
startxref              1<br />
/Page                  1<br />
/Encrypt               0<br />
/ObjStm                0<br />
/JS                    1<br />
/JavaScript            1<br />
/AA                    0<br />
/OpenAction            1<br />
/AcroForm              0<br />
/JBIG2Decode           0<br />
/RichMedia             0<br />
/Colors &gt; 2^24         0<br />
%%EOF                  1<br />
After last %%EOF       0<br />
Total entropy:           4.293999 (      5547 bytes)<br />
Entropy inside streams:  3.669587 (      4773 bytes)<br />
Entropy outside streams: 5.132696 (       774 bytes)</p>
<p>(&#8230;)</p>
<p>[0x00000000]&gt; p<br />
%PDF-1.1<br />
%&#1074;&#1075;&#1055;&#1059;<br />
1 0 obj<br />
&lt;&lt;<br />
/Type /Catalog<br />
/OpenAction &lt;&lt;<br />
/JS 4 0 R<br />
/S /JavaScript<br />
&gt;&gt;<br />
/Pages 2 0 R<br />
&gt;&gt;<br />
endobj<br />
2 0 obj<br />
&lt;&lt;<br />
/Type /Pages<br />
/Kids [ 3 0 R ]<br />
/Count 1<br />
&gt;&gt;<br />
endobj<br />
3 0 obj<br />
&lt;&lt;<br />
/Type /Page<br />
/Parent 2 0 R<br />
/Resources &lt;&lt;<br />
/Font &lt;&lt;<br />
/F1 &lt;&lt;<br />
/Type /Font<br />
/Name /F1<br />
/Subtype /Type1<br />
/BaseFont /Helvetica<br />
&gt;&gt;<br />
&gt;&gt;<br />
&gt;&gt;<br />
/MediaBox [ 0 0 795 842 ]<br />
&gt;&gt;<br />
endobj<br />
4 0 obj<br />
&lt;&lt;<br />
/Length 4769<br />
/Filter [/ASCIIHexDecode /ASCII85Decode /#4c</p></blockquote>
<p>What do we see in Pyew? The output of <a href="http://blog.didierstevens.com/programs/pdf-tools/" target="_blank">PDFiD</a> (a great tool by Didier Stevens) as well as the hexadecimal dump of the first block (512 bytes). A brief look at this first block of data reveals an "OpenAction" that executes JavaScript. Surprise. The entry "/JS 4 0 R" specifies that the JavaScript code to execute is stored in object number 4. Seeking to the offset of object 4 and printing the buffer (in ASCII), we find the following:</p>
<blockquote>
<pre>[0x000001b7]&gt; s 0x1b7
[0x000001b7]&gt; p
4 0 obj
&lt;&lt;
        /Length 4769
        /Filter [/ASCIIHexDecode /ASCII85Decode /#4c#5a#57De#63#6fde /R#75nLen#67t#68#44ecod#65 /FlateDecode ]
&gt;&gt;stream
4A2E3539605651222D714E634326304C5A47725A236A63494B26682C323A4E532&#8230;</pre>
</blockquote>
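<p>The odd-looking entries in the /Filter array are ordinary filter names in which some characters are written as "#" followed by two hexadecimal digits, a notation the PDF name syntax allows. As a minimal Python 3 sketch of how such names can be normalized (this is not Pyew's own code, and the helper name decode_pdf_name is made up for the illustration):</p>
<pre lang="python">import re

def decode_pdf_name(name):
    """Replace every #xx escape in a PDF name with the character it encodes."""
    return re.sub(r"#([0-9A-Fa-f]{2})",
                  lambda match: chr(int(match.group(1), 16)),
                  name)

# The /Filter array exactly as it appears in object 4 of the sample
filters = [
    "/ASCIIHexDecode",
    "/ASCII85Decode",
    "/#4c#5a#57De#63#6fde",
    "/R#75nLen#67t#68#44ecod#65",
    "/FlateDecode",
]

decoded = [decode_pdf_name(name) for name in filters]
for raw, clean in zip(filters, decoded):
    print("%-30s %s" % (raw, clean))</pre>
<p>Run against the array above, the two escaped names come out as /LZWDecode and /RunLengthDecode, while the other three are returned unchanged.</p>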
<p>The object is encoded multiple times and, what is more, the names that specify which filters must be used to decode the stream are encoded too. This is perfectly legal according to the PDF specification, although pretty suspicious. Pyew does a good job of decoding both the encoded names and the multiply encoded stream. To see the decoded streams in the console, just type "pdfvi":</p>
<blockquote>
<pre>eval(unescape("%76%61%72%20%56%68%4C%66%4E%20%3D..."))</pre>
</blockquote>
<p>Wow! It's a <em>small</em> chunk of JavaScript data <img src='http://joxeankoret.com/blog/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' />  Pyew <em>automagically</em> applied all the filters needed (ASCIIHexDecode, ASCII85Decode, LZWDecode, RunLengthDecode and FlateDecode) and printed out the obfuscated code. We can also view it in a graphical user interface: instead of typing "pdfvi", execute the command "pdfview". You will see the following screen:</p>
<div id="attachment_96" class="wp-caption aligncenter" style="width: 310px"><a href="http://joxeankoret.com/blog/wp-content/uploads/2010/02/pdf1.png"><img class="size-medium wp-image-96" title="Obfuscated Stream View" src="http://joxeankoret.com/blog/wp-content/uploads/2010/02/pdf1-300x156.png" alt="Obfuscated Stream View" width="300" height="156" /></a><p class="wp-caption-text">Obfuscated Stream View</p></div>
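<p>To give an idea of what applying the filters one after another involves, here is a rough Python 3 sketch that uses only the standard library. It is an illustration under assumptions, not Pyew's implementation: the function names and the FILTERS table are hypothetical, LZWDecode is left out because the standard library has no decoder for it, and the end-of-data markers that real streams may carry are not handled. The demo stream is built by the script itself and is not taken from the sample.</p>
<pre lang="python">import base64
import binascii
import zlib

def ascii_hex_decode(data):
    # Whitespace between hex digits is allowed and must be ignored
    return binascii.unhexlify(b"".join(data.split()))

def run_length_decode(data):
    out = bytearray()
    pos = 0
    while pos in range(len(data)):
        length = data[pos]
        if length == 128:            # end-of-data marker
            break
        if length in range(128):     # copy the next length + 1 bytes literally
            out += data[pos + 1:pos + 2 + length]
            pos += length + 2
        else:                        # repeat the next byte 257 - length times
            out += data[pos + 1:pos + 2] * (257 - length)
            pos += 2
    return bytes(out)

# Filter name to decoder; LZWDecode is deliberately missing (no stdlib decoder)
FILTERS = {
    "/ASCIIHexDecode": ascii_hex_decode,
    "/ASCII85Decode": base64.a85decode,
    "/RunLengthDecode": run_length_decode,
    "/FlateDecode": zlib.decompress,
}

def decode_stream(data, filter_names):
    # Filters are applied in the order in which the /Filter array lists them
    for name in filter_names:
        data = FILTERS[name](data)
    return data

# Demo: build a stream that is encoded three times, then undo the layers
payload = b'eval(unescape("%76%61%72"))'
encoded = binascii.hexlify(base64.a85encode(zlib.compress(payload)))
decoded = decode_stream(encoded,
                        ["/ASCIIHexDecode", "/ASCII85Decode", "/FlateDecode"])
print(decoded == payload)</pre>
<p>The point of the table-driven loop is that the number of filters does not matter: each decoder takes bytes and returns bytes, so a stream encoded five times is handled exactly like a stream encoded once.</p>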
<p><strong>More Examples</strong></p>
<p>OK, so now we can see the decoded stream but, what if there are many encoded streams and we must check them all, or we want to see just one of them? For this purpose, and also to show Pyew's APIs, I created an example that uses the PDF API. It reads all the streams and shows a list of the encoded ones, as you can see in the following screenshot:</p>
<div id="attachment_97" class="wp-caption aligncenter" style="width: 310px"><a href="http://joxeankoret.com/blog/wp-content/uploads/2010/02/pdf2.png"><img class="size-medium wp-image-97" title="Usage example of the PDF API" src="http://joxeankoret.com/blog/wp-content/uploads/2010/02/pdf2-300x156.png" alt="Usage example of the PDF API" width="300" height="156" /></a><p class="wp-caption-text">Usage example of the PDF API</p></div>
<p>Using this simple screen we can view all the streams or just one specific (encoded) stream. This is the code of the example, which uses Pyew's API for the PDF format:</p>
<pre lang="python">#!/usr/bin/env python

import sys

from pyew_core import CPyew
from easygui import choicebox, fileopenbox, msgbox

def main(filename=None):
    if filename is None:
        filename = fileopenbox(msg="Select PDF file", default="*.pdf", filetypes=["*.pdf"])
        if filename is None:
            return

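    # Create a Pyew instance in batch mode and load the file
    pyew = CPyew(batch=True)
    pyew.loadFile(filename)

    # As used in the loop below, the result is indexed by stream id and
    # each value describes the filters that encode that stream
    streams = pyew.plugins["pdfilter"](pyew, doprint=True)
    if len(streams) == 0:
        msgbox(title="PDF Streams", msg="No encoded streams found")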

    l = []
    l.append("About PDF Streams Viewer")
    l.append("See all streams (both encoded and unencoded)")
    for x in streams:
        l.append("Stream %d encoded with %s" % (x, streams[x]))
    l.append("Quit")

    while True:
        c = choicebox(msg="Select one stream to view it decoded", title="Stream Viewer", choices=l)
        if c is None:
            break
        elif c.lower() == "quit":
            break
        elif c.lower().startswith("about"):
            msgbox(title="About PDF Streams Viewer",
                   msg="Example usage of the Pyew APIs to see PDF streams. Written by Joxean Koret")
        elif c.lower().startswith("see all"):
            pyew.plugins["pdfview"](pyew, doprint=False, stream_id=-1)
        else:
            # The choice reads "Stream N encoded with ..."; N is the second word
            stream_id = int(c.split(" ")[1])
            pyew.plugins["pdfview"](pyew, stream_id=stream_id)

if __name__ == "__main__":
    if len(sys.argv) == 1:
        main()
    else:
        main(sys.argv[1])</pre>
<p>And that's all for the moment. I hope you like Pyew's new features <img src='http://joxeankoret.com/blog/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://joxeankoret.com/blog/2010/02/21/analyzing-pdf-exploits-with-pyew/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Malware Tricks I</title>
		<link>http://joxeankoret.com/blog/2009/12/02/malware-tricks-i/</link>
		<comments>http://joxeankoret.com/blog/2009/12/02/malware-tricks-i/#comments</comments>
		<pubDate>Wed, 02 Dec 2009 21:57:42 +0000</pubDate>
		<dc:creator>joxean</dc:creator>
				<category><![CDATA[Malware]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[pyew]]></category>

		<guid isPermaLink="false">http://joxeankoret.com/blog/?p=76</guid>
		<description><![CDATA[Today, while analyzing a malware family (the one some vendors call &#8220;Krap&#8221;), I noticed a good and, at least for me, new antiemulation technique. What do you think this sample code does? some_func: ; Do stuff... start: push offset some_func jmp edx What is this? We&#8217;re pushing the address of the function [...]]]></description>
			<content:encoded><![CDATA[<p>Today, while analyzing a malware family (the one some vendors call &#8220;Krap&#8221;), I noticed a good and, at least for me, new antiemulation technique. What do you think this sample code does?</p>
<pre lang="asm">some_func:
  ; Do stuff...

start:
   push offset some_func
   jmp edx</pre>
<p><span id="more-76"></span><br />
What is this? We&#8217;re pushing the address of the function some_func onto the stack and then jumping unconditionally to the address contained in EDX. The question here is: what value does the EDX register hold before your first line of assembly code executes? It holds the address of ntdll!KiFastSystemCallRet:</p>
<p style="text-align: center;">
<a href="http://joxeankoret.com/blog/wp-content/uploads/2009/12/anal_edx.png"><img class="size-medium wp-image-77 aligncenter" title="Value of EDX at the very first program's instruction" src="http://joxeankoret.com/blog/wp-content/uploads/2009/12/anal_edx-300x178.png" alt="" width="300" height="178" /></a></p>
<p>So, basically, we&#8217;re jumping to a function that only returns (see a detailed description of <a href="http://www.dumpanalysis.org/blog/index.php/2008/01/10/what-is-kifastsystemcallret/">KiFastSystemCallRet</a>), effectively returning into the &#8220;some_func&#8221; function. The emulators I tested, for example the Bochs debugger module that comes with IDA Pro, initialize all the registers to 0, so in them the jump goes to address 0 instead. A cool trick, and the first time I have seen it!</p>
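<p>The effect can be modelled with a few lines of Python 3. This is only a toy model and not a real emulator: both addresses are placeholders picked for the illustration, the function name run_entry_point is hypothetical, and the machine knows a single mapped instruction.</p>
<pre lang="python"># Placeholder addresses, chosen only for this illustration
SOME_FUNC = 0x401000
KI_FAST_SYSTEM_CALL_RET = 0x7C90E4F4

def run_entry_point(initial_edx):
    """Model 'push offset some_func / jmp edx' on a toy machine."""
    memory = {KI_FAST_SYSTEM_CALL_RET: "ret"}   # the only mapped instruction
    stack = [SOME_FUNC]                         # push offset some_func
    eip = initial_edx                           # jmp edx
    if memory.get(eip) == "ret":
        eip = stack.pop()                       # ret pops the pushed address
        return "reached 0x%x" % eip
    return "fault: jump to unmapped address 0x%x" % eip

# Real process start: EDX holds the address of a ret-only function
print(run_entry_point(KI_FAST_SYSTEM_CALL_RET))
# Emulator that zeroes every register before the first instruction
print(run_entry_point(0))</pre>
<p>In the first call control ends up in some_func; in the second the toy machine faults at address 0 and never reaches it, which is the behaviour the trick relies on.</p>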
<p>The tricks I typically find in malware are undocumented (or atypical) API calls mixed with junk code, as in the following example extracted from a Mebroot downloader:</p>
<pre lang="asm">
000013a7 PUSH 0x74327ebc
000013ac CALL KERNEL32.dll!WriteFile
000013b2 TEST EAX, EAX
000013b4 JZ 0x000013bb      ; 1
000013b6 JMP 0x0000108e     ; 2
000013bb PUSH 0x0
000013bd CALL KERNEL32.dll!DisconnectNamedPipe
</pre>
<p>Junk code using relatively common APIs:</p>
<pre lang="asm">
00001c1f PUSH 0x0
00001c21 PUSH 0x0
00001c23 CALL SHLWAPI.dll!SHDeleteKeyA
00001c29 PUSH 0x100
00001c2e CALL msvcrt.dll!malloc
00001c34 ADD ESP, 0x4
00001c37 PUSH EAX
00001c38 CALL msvcrt.dll!free
00001c3e ADD ESP, 0x4
00001c41 PUSH 0x0
00001c43 CALL WINMM.dll!timeKillEvent
00001c49 PUSH 0x10005129
00001c4e LEA EAX, [EBP-0x20]
00001c51 PUSH EAX
00001c52 CALL USER32.dll!wsprintfA
00001c58 ADD ESP, 0x8
00001c5b PUSH 0x0
00001c5d CALL ADVAPI32.dll!RegCloseKey
00001c63 CALL ole32.dll!OleUninitialize
</pre>
<p>Very simple API calls not commonly emulated (extracted from the dropper of the rootkit TDSS):</p>
<pre lang="asm">
00000813 XOR ESI, ESI
00000815 PUSH ESI
00000816 MOV EAX, [0x40600c]        ; kernel32.dll!GetModuleHandleA
0000081d CALL EAX
0000081f PUSH 0x74
00000821 MOV EAX, [0x406080]        ; msvcrt.dll!iscntrl
00000827 CALL EAX
00000829 POP ECX
0000082a TEST EAX, EAX
0000082c JNZ 0x000008ad     ; 1
00000832 PUSH 0x6d
00000834 PUSH 0x68
00000836 MOV EAX, [0x40607c]        ; msvcrt.dll!is_wctype
0000083d CALL EAX
</pre>
<p>Or strange x86 assembly instructions, such as multibyte NOPs with redundant prefixes (found in some variants of Sality): </p>
<pre lang="asm">
f30f1f90909090. rep nop [eax+0x66909090]
</pre>
<p>I know it&#8217;s just one antiemulation trick among thousands, but this one is new (at least for me), special and cool!</p>
]]></content:encoded>
			<wfw:commentRss>http://joxeankoret.com/blog/2009/12/02/malware-tricks-i/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
