PDF Exploits: A Deep Dive

On Friday, several of our users received phishing emails that contained PDF attachments, and reported these emails through Reporter. The PDF attachment is a slight deviation from the typical zip-with-exe or zip-with-scr; however, it’s still delivering malware to the user.

Here’s what the email looks like:

Figure 1 — Phishing email

The attackers used a few tricks to make static analysis more difficult, such as several layers of zlib compression and hard-to-follow variable names. Only one section of the PDF file is zlib compressed, and that is the section shown in Figure 2.

Figure 2 — Flate/zlib compressed blob

To decode it, we used Python’s built-in zlib library. Take note of the sections marked in Figure 3.

Figure 3 — Zlib headers
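If you want to spot these headers programmatically rather than by eye, the check below is a minimal sketch based on the zlib format's two-byte header rules (RFC 1950). The function name `looks_like_zlib_header` is our own and was not part of the original analysis.

```python
def looks_like_zlib_header(data: bytes) -> bool:
    """Return True if the first two bytes form a valid zlib (RFC 1950) header."""
    if len(data) < 2:
        return False
    cmf, flg = data[0], data[1]
    # The low nibble of the first byte is the compression method; 8 means deflate.
    if cmf & 0x0F != 8:
        return False
    # The high nibble encodes the window size and may not exceed 7.
    if cmf >> 4 > 7:
        return False
    # Read as a big-endian 16-bit number, the two bytes must be a multiple of 31.
    return (cmf * 256 + flg) % 31 == 0


# 0x78 0x9C is the header most zlib streams start with.
print(looks_like_zlib_header(bytes.fromhex("789c")))
print(looks_like_zlib_header(b"%PDF-1.4"))
```

A passing check only means the bytes could start a zlib stream. The real test is whether decompression succeeds.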

The marked sections are the zlib headers, or magic bytes, which mark the beginning of zlib-compressed data. After one pass, more compressed data is returned, giving the impression that zlib.decompress() didn’t work. However, the Python interpreter raised no errors and the values changed, showing that the decompression was actually successful. It takes several passes to get to the underlying code. On the third pass, we can see uncompressed code (Figure 4). Once decoded, we can write this out to the file final_code.txt (Figure 5).

Figure 4 — Beginning of actual code after layers of zlib decompress

Figure 5 — Beginning of zlib decompressed code
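The repeated passes can be wrapped in a small loop. This is a sketch of the idea, not the exact commands we typed into the interpreter: `decompress_layers` and its `max_passes` limit are hypothetical names, and the stand-in payload below is invented for the demonstration.

```python
import zlib


def decompress_layers(blob: bytes, max_passes: int = 10):
    """Apply zlib.decompress() repeatedly until the data stops decompressing.

    Returns a (data, passes) tuple: the innermost bytes and the number of
    layers that were removed.
    """
    data = blob
    passes = 0
    while passes < max_passes:
        try:
            data = zlib.decompress(data)
        except zlib.error:
            # The data is not zlib-compressed (any more), so stop here.
            break
        passes += 1
    return data, passes


# Demonstration with a stand-in payload wrapped in three layers.
stand_in = b"var example = 1;"
layered = zlib.compress(zlib.compress(zlib.compress(stand_in)))
code, layers = decompress_layers(layered)
print(layers, code)

# Write the result out, as we did with final_code.txt.
with open("final_code.txt", "wb") as fh:
    fh.write(code)
```

Catching `zlib.error` is what ends the loop. The `max_passes` cap is only a safety net against an unexpectedly deep nesting.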

By analyzing the code, we can see the variable “ROlowh” being referenced with a long sequence of hex characters. This is the shellcode that will be injected into Adobe Reader if the exploit succeeds.

Figure 6 — Shellcode, pre-decode

By copying and pasting this into a new Notepad++ tab, we can clean up the code (remove quotes, commas, and braces) and convert the data from hex to ASCII (Figure 7).

Figure 7 — Converting blob of data with Notepad++ plugin hex -> ascii
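The same cleanup and conversion can be done in Python instead of Notepad++. The sketch below assumes the copied text contains only bare hex digits separated by quotes, commas, braces, and whitespace. If the real blob uses prefixes such as `0x`, those would need to be stripped first. The helper names and the sample string are ours.

```python
import re


def hex_blob_to_bytes(text: str) -> bytes:
    """Strip non-hex punctuation from a pasted blob and decode it to bytes."""
    # Drop everything that is not a hex digit (quotes, commas, braces, spaces).
    digits = re.sub(r"[^0-9A-Fa-f]", "", text)
    if len(digits) % 2:
        raise ValueError("odd number of hex digits; the blob may be truncated")
    return bytes.fromhex(digits)


def printable(data: bytes) -> str:
    """Render bytes like a hex editor's text column: '.' for non-printables."""
    return "".join(chr(b) if 32 <= b < 127 else "." for b in data)


# Made-up sample input, not the actual shellcode from the PDF.
sample = '{"9090", "6874", "7074"}'
print(printable(hex_blob_to_bytes(sample)))
```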

Once converted, you can look at the end of the shellcode and see what may be a domain. The letters are out of order because the JavaScript code passes the original hex through unescape(), which swaps the order of each pair of bytes. This is what the domain looks like pre-swap:

Figure 8 — Pre-unescape shellcode

And by swapping bytes 1 and 0, 3 and 2, etc., we can see what the domain should be:

Figure 9 — Decoded URL from the shellcode
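The pair swap is easy to script. The function below is a minimal sketch with a name of our choosing, and the scrambled string uses a placeholder domain rather than the one found in the sample.

```python
def swap_byte_pairs(data: bytes) -> bytes:
    """Swap bytes 0 and 1, 2 and 3, and so on."""
    out = bytearray(data)
    # Step through the buffer two bytes at a time; a trailing odd byte stays put.
    for i in range(0, len(out) - 1, 2):
        out[i], out[i + 1] = out[i + 1], out[i]
    return bytes(out)


# Placeholder input: "http://example" with each pair of bytes swapped.
scrambled = b"thpt/:e/axpmel"
print(swap_byte_pairs(scrambled).decode("ascii"))
```

Running the function a second time restores the original order, so the same helper works in both directions.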

By executing the attachment with a vulnerable version of Adobe Reader and capturing the network traffic, we can confirm that this is actually the domain for the malware. The website redirects to www. (Figure 10), and that page returns a 404 (Figure 11).

Figure 10 — 301 redirect for the www. site

Figure 11 — 404 with the removal of the malware

A VirusTotal search shows that the URL has been submitted (Figure 12) and that the malware is Vawtrak, which has few antivirus detections (Figure 13).

Figure 12 — Searching for the domain on VirusTotal

Figure 13 — Vawtrak and low AV hits

As always, keep on the lookout for suspicious files, and if it’s phishy…report it!

VirusTotal Links:

Malicious PDF Analysis: https://www.virustotal.com/en/file/907722c08e013080fa33a681b9ec0d52b2d0f7e8e426c3356c35a8d8f891656d/analysis/1410191848/

Install2.exe: https://www.virustotal.com/en/file/125d0e5a20437dd4256207d24bd770c0c735e67cab04a6b034aff9ad6ad4c013/analysis/

Additional dropped file: https://www.virustotal.com/en/file/411ade26373cbe307e39c523a258c4a9feaf089772bd6c811916dbf3088daa73/analysis/1409950220/
