Malicious documents analysis: my own list of tools

It’s important to have the right tools to analyze suspect documents!

The main malware infection vector is still the classic malicious document attached to an email, so anyone who handles suspect files needs reliable tools for examining them.

Let’s see a list of my favorite tools for analyzing Microsoft Office and PDF files.


Microsoft Office

OfficeMalScanner

Locates shellcode and VBA macros in MS Office files, and also extracts shellcode and embeds it in an EXE file for further analysis.

http://www.reconstructer.org/code.html
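As a rough illustration of the triage that comes before a tool like this, the sketch below tells legacy binary Office files from Open XML ones and checks whether an Open XML file carries a VBA project. It is not how OfficeMalScanner works internally; the function name `classify_office_file` and the labels it returns are made up for this example.

```python
import io
import zipfile

# Signature found at the start of every OLE2 / Compound File Binary document.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def classify_office_file(data: bytes) -> str:
    """Return a coarse label for an Office document given its raw bytes.

    The labels are hypothetical names chosen for this sketch.
    """
    if data.startswith(OLE2_MAGIC):
        return "ole2"  # legacy binary format (DOC, XLS, PPT)

    buffer = io.BytesIO(data)
    if zipfile.is_zipfile(buffer):
        with zipfile.ZipFile(buffer) as archive:
            names = archive.namelist()
        # In Open XML files, VBA code is stored in a part named vbaProject.bin.
        if any(name.lower().endswith("vbaproject.bin") for name in names):
            return "ooxml-with-macros"
        return "ooxml-or-zip"

    return "unknown"
```

A file labelled `ole2` or `ooxml-with-macros` is a candidate for a closer look with the dedicated tools; the label alone says nothing about whether the document is malicious.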

Microsoft OffVis

Shows raw contents and structure of an MS Office file, and identifies some common exploits.

Announcing OffVis 1.0 Beta

Hachoir-urwid

Allows navigation through the structure of binary Office files and viewing of stream contents.

https://bitbucket.org/haypo/hachoir/wiki/hachoir-urwid
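To get a feel for what a structure browser shows at the very top of a binary Office file, here is a minimal sketch that reads a few fields from the Compound File Binary header with Python's `struct` module. The function name `parse_cfb_header` and the dictionary keys are hypothetical, the offsets follow the published CFB header layout, and none of this is code from hachoir.

```python
import struct

# Signature found at the start of every OLE2 / Compound File Binary document.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def parse_cfb_header(data: bytes) -> dict:
    """Decode a handful of fields from the first 52 bytes of a CFB header."""
    if len(data) < 52 or data[:8] != OLE2_MAGIC:
        raise ValueError("not a Compound File Binary header")

    # Offset 24: minor version, major version, byte order mark,
    # sector shift, mini sector shift (five little-endian uint16 values).
    minor, major, byte_order, sector_shift, mini_shift = struct.unpack_from(
        "<5H", data, 24
    )
    # Offset 40: number of directory sectors, number of FAT sectors,
    # first directory sector location (three little-endian uint32 values).
    num_dir, num_fat, first_dir = struct.unpack_from("<3I", data, 40)

    return {
        "minor_version": minor,
        "major_version": major,
        "byte_order": byte_order,
        "sector_size": 1 << sector_shift,
        "mini_sector_size": 1 << mini_shift,
        "directory_sectors": num_dir,
        "fat_sectors": num_fat,
        "first_directory_sector": first_dir,
    }
```

Typical version 3 files report a sector size of 512 bytes and a mini sector size of 64 bytes; values far outside what the specification allows are a hint that the file has been tampered with or is not really an Office document.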

Office Binary Translator

Converts DOC, PPT, and XLS files into Open XML files.

http://b2xtranslator.sourceforge.net/

pyOLEScanner

Can examine and decode some aspects of malicious binary Office files.

https://github.com/Evilcry/PythonScripts/blob/master/pyOLEScanner.py


PDF

PDFiD

Identifies PDFs that contain strings associated with scripts and actions.

PDF Tools
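The idea behind this kind of scan can be sketched in a few lines of Python: count how often names tied to scripts and automatic actions appear in the raw bytes. This is only an illustration, not PDFiD's code; the name list is a small selection of my own, `count_pdf_names` is a hypothetical helper, and the sketch does not handle names obfuscated with hex escapes such as `/J#53`.

```python
import re

# A small, illustrative selection of PDF names worth a second look.
SUSPICIOUS_NAMES = [
    b"/JavaScript",
    b"/JS",
    b"/OpenAction",
    b"/AA",
    b"/Launch",
    b"/EmbeddedFile",
]


def count_pdf_names(data: bytes) -> dict:
    """Count occurrences of each suspicious name in the raw PDF bytes."""
    counts = {}
    for name in SUSPICIOUS_NAMES:
        # The lookahead stops /AA from matching a longer name such as /AAbc.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode("ascii")] = len(re.findall(pattern, data))
    return counts
```

A non-zero count for `/OpenAction` together with `/JS` or `/JavaScript` does not prove anything, but it tells you where to start digging with the heavier tools.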

PDF-parser

Examines the structure of PDF files.

PDF Tools
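To show what "structure" means here, the following sketch lists the indirect objects of a PDF by searching for `N G obj` headers in the raw bytes. It is a simplified stand-in, not PDF-parser's implementation: `list_indirect_objects` is a hypothetical name, and a plain regular expression can be fooled by byte sequences inside streams and does not see objects packed into object streams.

```python
import re

# An indirect object starts with "<object number> <generation> obj".
OBJ_HEADER = re.compile(rb"(\d+)\s+(\d+)\s+obj\b")


def list_indirect_objects(data: bytes) -> list:
    """Return (object number, generation, byte offset) for each object header."""
    return [
        (int(match.group(1)), int(match.group(2)), match.start())
        for match in OBJ_HEADER.finditer(data)
    ]
```

The same object number appearing more than once usually means the file has been updated incrementally, which is worth noting because older versions of an object can hide content.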

Origami

Origami is a framework written in Ruby designed to parse, analyze, and forge PDF documents.

It ships with several command-line tools: pdfwalker examines the structure of PDF files, pdfextract extracts JavaScript from PDF files, and pdfsh offers an interactive command-line shell for examining PDF files.

https://github.com/cogent/origami-pdf

Jsunpack-n

Allows the extraction of JavaScript from PDF files.

http://jsunpack.blogspot.it/2009/06/jsunpack-n-updates-for-pdf-decoding.html

PDF Stream Dumper

Combines many PDF analysis tools under a single graphical user interface.

It has specialized tools for dealing with obfuscated JavaScript, low-level PDF headers and objects, and shellcode. For shellcode analysis, it has an integrated interface for libemu sctest, an updated build of iDefense sclog, and a shellcode_2_exe feature.

http://sandsprite.com/blogs/index.php?uid=7&pid=57

Peepdf

Offers a shell for examining PDF files.

With peepdf you can see all the objects in a document, with the suspicious elements pointed out. It supports the most commonly used filters and encodings, and it can parse different versions of a file, object streams, and encrypted files.

http://eternal-todo.com/tools/peepdf-pdf-analysis-tool
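The most common filter you will meet is FlateDecode, which is plain zlib compression. As a minimal sketch of what "supporting a filter" involves, the code below pulls every `stream ... endstream` body out of the raw bytes and tries to inflate it. This is not peepdf's code: `inflate_streams` is a hypothetical helper, it ignores the `/Filter` and `/Length` entries entirely, and it does not deal with filter chains or encryption.

```python
import re
import zlib

# Capture everything between the "stream" keyword and "endstream".
STREAM = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def inflate_streams(data: bytes) -> list:
    """Return the body of each stream, zlib-inflated when that is possible."""
    decoded = []
    for match in STREAM.finditer(data):
        raw = match.group(1)
        try:
            # decompressobj tolerates the end-of-line bytes left after the
            # compressed data; they end up in its unused_data attribute.
            decoded.append(zlib.decompressobj().decompress(raw))
        except zlib.error:
            decoded.append(raw)  # not Flate-compressed, or another filter
    return decoded
```

Streams that come back unchanged are either uncompressed or use a different filter, and that is the point where a full tool earns its keep.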

PDF X-RAY Lite

Creates an HTML report containing decoded PDF file structure and contents.

https://github.com/9b/pdfxray_lite

SWF mastah

Extracts SWF objects from PDF files.

Utilizing functions within Peepdf, I wrote a simple command-line tool called swf_mastah to extract a SWF file from a PDF. The benefits of this tool are that it handles ObjStms, decodes all the samples I have, handles encryption, and accounts for multiple object versions.

https://github.com/9b/pdfxray_public/blob/master/builder/swf_mastah.py
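Once a stream has been decoded, recognizing a Flash object inside it is mostly a matter of checking the SWF header. The sketch below reads the 8-byte header: a three-letter signature, a version byte, and the declared file length. It illustrates only the recognition step and is not taken from swf_mastah; `swf_header_info` and its dictionary keys are hypothetical names.

```python
import struct

# SWF signatures and the compression each one indicates for the file body.
SWF_COMPRESSION = {b"FWS": "none", b"CWS": "zlib", b"ZWS": "lzma"}


def swf_header_info(blob: bytes):
    """Return basic header fields if blob starts like a SWF file, else None."""
    if len(blob) < 8 or blob[:3] not in SWF_COMPRESSION:
        return None
    version = blob[3]
    # The declared length is a little-endian uint32 at offset 4.
    (declared_length,) = struct.unpack_from("<I", blob, 4)
    return {
        "compression": SWF_COMPRESSION[blob[:3]],
        "version": version,
        "declared_length": declared_length,
    }
```

Running a check like this over every decoded stream of a PDF gives a quick list of candidates to save to disk and hand to a Flash analysis tool.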

Pyew

Malware analysis tool that includes commands for examining and decoding structure and content of PDF files.

Pyew is a command-line Python tool for analyzing malware. It supports hexadecimal viewing, disassembly (Intel 16, 32 and 64 bits), and the PE and ELF file formats, for which it performs code analysis and lets you write scripts against an API to perform many types of analysis. It follows direct call/jmp instructions in the interactive command line, displays function names and string data references, and supports the OLE2 format, the PDF format and more. It also supports plugins that add more features to the tool.

https://github.com/joxeankoret/pyew
