Analysis of a Malicious PDF File
Author: Amit Malik aka DouBle_Zer0 
Yesterday, I downloaded a malicious PDF file for my regular analysis and found something that set it apart from the other malicious PDF files I had seen. The file used a technique I was not aware of. After some searching, I found that the same technique had been exposed in 2010, so it was not new.
In this article, I will discuss this particular malware technique and show how we can easily unearth it using the excellent tool PDFStreamDumper [Reference 1].
Starting PDF Analysis using PDFStreamDumper
When I opened the file in PDFStreamDumper, I got the following structure:
[Screenshot: object structure of the file in PDFStreamDumper]
We can clearly see that objects 19 and 22 contain JavaScript, because PDFStreamDumper displays any object that contains JavaScript in red.
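The same kind of triage can be approximated outside the tool. The following is a minimal sketch in Python, not PDFStreamDumper's actual logic: it scans raw PDF bytes for objects whose body mentions the /JS or /JavaScript keys. The function name and the sample bytes are hypothetical, and the sketch assumes the objects are stored uncompressed (not inside object streams) and that the key names are not obfuscated.

```python
import re

# Matches "N G obj ... endobj" bodies in an uncompressed PDF body.
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj", re.DOTALL)
# /JS and /JavaScript are the name keys that mark a JavaScript action.
JS_KEY_RE = re.compile(rb"/(?:JS|JavaScript)\b")


def find_js_objects(pdf_bytes):
    """Return the object numbers whose body mentions /JS or /JavaScript."""
    hits = []
    for match in OBJ_RE.finditer(pdf_bytes):
        obj_num, body = int(match.group(1)), match.group(3)
        if JS_KEY_RE.search(body):
            hits.append(obj_num)
    return hits


# Hypothetical, hand-written fragment (not taken from the analysed sample).
sample = (
    b"19 0 obj\n<< /S /JavaScript /JS 20 0 R >>\nendobj\n"
    b"20 0 obj\n<< /Length 5 >>\nstream\nabcde\nendstream\nendobj\n"
    b"21 0 obj\n<< /Type /Page >>\nendobj\n"
)
print(find_js_objects(sample))
```

In this fragment only the object that carries the JavaScript keys is reported; the object holding the referenced stream has to be followed by hand, which is what the next step of the analysis does.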
If we look into object 19 we get this:
[Screenshot: contents of object 19]
Deep PDF Malware Inspection
Inspecting deeper, we can see that object 20 contains the actual JavaScript stream.
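A tool such as PDFStreamDumper decodes stream contents for display. As a rough sketch of what that involves, the Python below inflates a stream by hand. It assumes the stream uses the /FlateDecode filter and that /Length is a correct direct number; neither is confirmed for this sample, and malicious files often break both assumptions. The object built here is hypothetical.

```python
import re
import zlib


def decode_flate_stream(obj_body):
    """Inflate the stream of one PDF object, trusting its /Length entry."""
    length = int(re.search(rb"/Length\s+(\d+)", obj_body).group(1))
    # The stream data starts right after the "stream" keyword and its EOL.
    start = re.search(rb"stream\r?\n", obj_body).end()
    return zlib.decompress(obj_body[start:start + length])


# Hypothetical object built for this example (not the analysed sample).
script = b"var b = ''; /* mangled script would be here */"
compressed = zlib.compress(script)
obj_20 = (
    b"20 0 obj\n<< /Length " + str(len(compressed)).encode()
    + b" /Filter /FlateDecode >>\nstream\n"
    + compressed + b"\nendstream\nendobj\n"
)
print(decode_flate_stream(obj_20).decode("latin-1"))
```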

Let us click on object 20 on the left side to see its real contents, as shown below.
[Screenshot: contents of object 20]
As the figure shows, it contains a big chunk of mangled JavaScript. I copied and formatted it, after which the script looked like this:
[Screenshot: formatted JavaScript from object 20]
At first glance, the above code does not reveal much about the vulnerability or exploit used in this PDF file. But because the PDF was malicious, I was sure it used some exploit; the question was how to retrieve that part. I searched for getPageNthWord and getPageNumWords because these were the only keywords that confused me. Fortunately, I found that getPageNumWords returns the number of words on a page, and getPageNthWord is used to loop through those words and manipulate them based on the parameters passed.
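To make the idea concrete, here is a small Python simulation of hiding a script in the words of a page. It is only a sketch: the two helper functions are stand-ins for the Acrobat calls, and the encoding (one decimal character code per word) is an assumption chosen for illustration, not the scheme used by the analysed file.

```python
# Hypothetical page text: each "word" hides one character code.
PAGE_WORDS = ["97", "112", "112", "46", "97", "108",
              "101", "114", "116", "40", "49", "41"]


def get_page_num_words(page):
    """Stand-in for Acrobat's getPageNumWords(page)."""
    return len(PAGE_WORDS)


def get_page_nth_word(page, n):
    """Stand-in for Acrobat's getPageNthWord(page, n)."""
    return PAGE_WORDS[n]


def rebuild_script(page=0):
    """Walk every word on the page and turn it back into one character."""
    b = ""
    for n in range(get_page_num_words(page)):
        b += chr(int(get_page_nth_word(page, n)))
    return b


print(rebuild_script())  # prints: app.alert(1)
```

The point of the technique is that the visible JavaScript contains only the loop, while the payload itself sits in the page content where a scanner looking at script objects would not see it.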

From the script's computation we can say that variable 'b' stores the result, and that this[edit](b) is equivalent to eval(b): it executes the JavaScript expression held in b. We can display variable 'b' with a console function such as console.println(). So by replacing this[edit](b) with console.println(b) we can get the value of b.
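The substitution itself is a plain text replacement. The Python sketch below shows it on a hypothetical one-line stand-in for the real script; the function name and the sample line are illustrative only. It fails loudly when the execution call is not found, so an unpatched script is never mistaken for a safe one.

```python
def swap_exec_for_print(script,
                        exec_call="this[edit](b)",
                        print_call="console.println(b)"):
    """Replace the call that executes b with one that only prints it."""
    if exec_call not in script:
        raise ValueError("execution call not found; nothing was patched")
    return script.replace(exec_call, print_call)


# Hypothetical stand-in for the formatted script from object 20.
original = "var b = decode(); this[edit](b);"
print(swap_exec_for_print(original))
# prints: var b = decode(); console.println(b);
```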

PDFStreamDumper provides an Update_Stream feature to update a PDF stream, so we can easily make this change and save the result to another PDF file. When opened, that PDF file should display the content of variable 'b'.
[Screenshot: updating the stream in PDFStreamDumper]
Here is the content of variable 'b'
[Screenshot: content of variable 'b']
Here is the full dump of formatted script from this PDF file
[Screenshot: full formatted script]
As the above picture shows, the get_shellcode function first calls the get_url() function. If we look at get_url(), we see that it uses the author name as a key to decode the URL. Variable 's' also seems to contain some shellcode.
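Using a metadata field as a decoding key can be illustrated with a short Python sketch. The actual algorithm inside get_url() is not reproduced here; repeating-key XOR is assumed purely for illustration, and the author value and URL below are made up.

```python
from itertools import cycle


def xor_with_key(data, key):
    """XOR every byte of data with the key, repeating the key as needed."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))


# Hypothetical values; neither comes from the analysed sample.
author = b"ExampleAuthor"                # would be read from the PDF metadata
url = b"http://example.invalid/e.exe"

encoded = xor_with_key(url, author)      # what the file would carry
decoded = xor_with_key(encoded, author)  # what the script would recover
print(decoded.decode("ascii"))
```

Whatever the real scheme is, the consequence for the analyst is the same: the URL cannot be recovered from the script alone, so the document's author field has to be extracted as well.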

During my analysis I found that the shellcode tries to execute a file named "e.exe" from the temp folder. So we can say that the script first exploits the vulnerability, then downloads "e.exe" from the decoded URL, and then executes that program.
[Screenshot: script with the exploited vulnerabilities marked by red lines]
The red lines in the above screenshot show the vulnerabilities exploited by this malicious PDF file.
This article showed how to identify hidden malicious script elements within a dangerous PDF file using PDFStreamDumper. To learn more, try dissecting a sample PDF file with the tool yourself.
  1. PDFStreamDumper - Free tool for the analysis of malicious PDF documents
  2. PDF File Specification