
Good evening, friends. I am very happy, and the cause of my happiness is the Hackercool PDF monthly magazine I recently started. The test edition was received positively, but some security-conscious readers have asked whether the PDF magazine itself might be booby-trapped to hack them. So I thought it would be good to write a howto on PDF forensics. By the end of this article, you will be able to tell whether a PDF you received is genuine or malicious.

For this howto, I will create a malicious PDF with Metasploit using the following exploit.


As is well known, this exploit hides an exe within a PDF file. The PDF can be sent to our target using any social engineering technique. When the target user opens it, we get a reverse_tcp connection. The other file we will be analyzing is the PDF copy of my Hackercool monthly magazine. Both files are shown below.


The first tool we will be using is pdfid. Pdfid scans a file for certain PDF keywords, allowing you to identify PDF documents that contain (for example) JavaScript or that execute an action when opened. It also handles name obfuscation.
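To make the idea concrete, here is a minimal Python sketch of this kind of keyword scan. It is not pdfid's actual code: the keyword list is a short illustrative assumption, the function names are hypothetical, and the scan naively covers the whole file, stream data included. It does show how decoding `#xx` hex escapes defeats simple name obfuscation such as `/J#61vaScript`.

```python
import re

# Keywords worth flagging. This short list is an illustrative assumption,
# not pdfid's full keyword table.
KEYWORDS = [b"/JS", b"/JavaScript", b"/OpenAction", b"/AA", b"/Launch", b"/EmbeddedFile"]

# A PDF name starts with "/" and runs until whitespace or a delimiter.
NAME_RE = re.compile(rb"/[^\s/<>\[\]\(\)\{\}%]+")
HEX_ESCAPE_RE = re.compile(rb"#([0-9A-Fa-f]{2})")


def normalize_name(name: bytes) -> bytes:
    """Decode #xx escapes so that /J#61vaScript becomes /JavaScript."""
    return HEX_ESCAPE_RE.sub(lambda m: bytes([int(m.group(1), 16)]), name)


def count_keywords(data: bytes) -> dict:
    """Count how often each flagged keyword appears as a PDF name."""
    counts = {kw.decode(): 0 for kw in KEYWORDS}
    for match in NAME_RE.finditer(data):
        key = normalize_name(match.group(0)).decode("latin-1")
        if key in counts:
            counts[key] += 1
    return counts


if __name__ == "__main__":
    sample = b"<< /OpenAction 9 0 R >> << /S /J#61vaScript /JS (app.alert(1)) >>"
    for keyword, count in count_keywords(sample).items():
        print(f"{keyword:15} {count}")
```

To try it on a real file, read the file in binary mode and pass the bytes to `count_keywords`.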

Let us first analyze the PDF we created with Metasploit. As we can see below, evil.pdf contains JavaScript, OpenAction and Launch entries. A PDF that runs script and launches something as soon as it is opened is a strong sign of a malicious file.


Now let us analyze my monthly magazine as shown below.


As you have seen above, it’s totally clean. No JavaScript, nothing. That should calm my magazine readers.

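Coming back to the malicious PDF, we can disable its malicious elements using pdfid as shown below. The JavaScript and auto-launch triggers are now neutralized, so the file no longer runs anything when opened, although the embedded payload itself is still inside it.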
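One simple disarming trick is to swap the case of the risky names, so that a PDF reader no longer recognizes them. The sketch below illustrates that idea in Python. It is only an illustration under my own assumptions (the `RISKY` list and the `disarm` function are hypothetical names, and this is not pdfid's implementation). Because each replacement has the same length as the original name, byte offsets elsewhere in the file stay valid.

```python
import re

# Names to neutralize. The list is an assumption made for this illustration.
RISKY = [b"/JavaScript", b"/JS", b"/OpenAction", b"/AA", b"/Launch"]


def disarm(data: bytes) -> bytes:
    """Swap the case of risky names, e.g. /Launch becomes /lAUNCH."""
    out = data
    for name in RISKY:
        # Only match a complete name: it must be followed by whitespace,
        # a delimiter, or the end of the data.
        pattern = re.compile(re.escape(name) + rb"(?=[\s/<>\[\]\(\)]|$)")
        out = pattern.sub(name.swapcase(), out)
    return out


if __name__ == "__main__":
    original = b"<< /OpenAction 9 0 R >> << /S /Launch /F (cmd.exe) >>"
    print(disarm(original))
```

A real tool would write the result to a new file rather than overwrite the sample, so the original stays available for analysis.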


Now if we want to do further analysis on the malicious PDF, we can use another tool called pdf-parser. It will parse a PDF document to identify the fundamental elements used in the analyzed file.

Type the command “pdf-parser /root/Desktop/evil.pdf” without the quotes.

That will parse the entire PDF and all its objects (we saw earlier that our malicious PDF contains 12 objects). On observation, objects 10 and 9 look interesting. We can also parse each object of the PDF file individually. Let us parse object 10 as shown below.

We can see it has a Launch action which launches cmd.exe.


Similarly in object 9 we can see a JavaScript action.
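If you want to see what picking out a single object involves, here is a rough Python sketch. It assumes a simple, uncompressed file where every object is written as `N G obj ... endobj`, and it will be fooled by stream data that happens to contain `endobj`. The function names are hypothetical and this is not how pdf-parser itself is written.

```python
import re

# "N G obj ... endobj", where N is the object number and G the generation.
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj", re.DOTALL)


def split_objects(data: bytes) -> dict:
    """Map each object number to its raw body."""
    return {int(m.group(1)): m.group(3).strip() for m in OBJ_RE.finditer(data)}


def action_types(body: bytes) -> list:
    """Return the action subtypes named by /S in an object body."""
    return [m.decode() for m in re.findall(rb"/S\s*/(\w+)", body)]


if __name__ == "__main__":
    sample = (
        b"%PDF-1.5\n"
        b"9 0 obj\n<< /Type /Action /S /JavaScript /JS (app.alert(1)) >>\nendobj\n"
        b"10 0 obj\n<< /Type /Action /S /Launch /Win << /F (cmd.exe) >> >>\nendobj\n"
    )
    for number, body in sorted(split_objects(sample).items()):
        print(number, action_types(body))
```

The sample bytes here are made up for the demonstration; they only mimic the kind of Launch and JavaScript actions seen in objects 10 and 9.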


Using pdf-parser with the -c option will display the content of objects without streams, or with streams that have no filters.


On observation we can see a stream that looks like shellcode present in object 8.
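A stream without filters is stored as raw bytes between the `stream` and `endstream` keywords, which is why its content can be read directly. The Python sketch below shows that check under simple assumptions: `unfiltered_stream` is a hypothetical helper, it expects the raw body of one object, and it gives up on anything that declares a `/Filter`.

```python
import re

# Raw stream data sits between "stream" and "endstream".
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)


def unfiltered_stream(body: bytes):
    """Return raw stream bytes if the object has a stream and no /Filter, else None."""
    match = STREAM_RE.search(body)
    if match is None:
        return None  # object has no stream
    dictionary = body[:match.start()]
    if b"/Filter" in dictionary:
        return None  # data is encoded (e.g. FlateDecode) and needs decoding first
    return match.group(1)


if __name__ == "__main__":
    # Made-up bytes for the demonstration, not taken from evil.pdf.
    body = b"<< /Length 4 >>\nstream\n\x90\x90\xcc\xc3\nendstream"
    data = unfiltered_stream(body)
    if data is not None:
        print(data.hex())
```

Printing the bytes as hex is safer than printing them raw, since binary data such as shellcode can garble a terminal.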


That’s all for today, my friends. Please have a look at my monthly magazine.