2. The Malware
A sample of the malware analyzed in this article can be obtained at http://www.offensivecomputing.net/.
|Figure 1.0 - Malware found on Offensive Computing|
The analysis is performed on a system running Ubuntu 10.04. The PDF document is examined in a file editor to identify any suspicious objects it contains. In Figure 1.1, VIM is used to view the PDF file, and the object shown is object 13. The extremely large value assigned to the variable "s" is a strong indication that this object is malicious: it holds a long string of numbers that most likely encodes some form of shellcode.
|Figure 1.1 - Large string from object 13 from the malicious PDF|
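The manual inspection in VIM can be approximated with a short script. The following is a minimal sketch, not the method used in this article: it assumes the PDF objects are stored uncompressed (as object 13 evidently is here), and the function name `suspicious_objects` and the 200-character threshold are hypothetical choices made for illustration.

```python
import re

# Matches "N G obj ... endobj" bodies in an uncompressed PDF.
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj", re.DOTALL)


def suspicious_objects(pdf_bytes, min_run=200):
    """Return (object_number, reasons) for objects that deserve a closer look.

    min_run is an arbitrary threshold (an assumption): any whitespace-free
    token at least this long is reported as a possible encoded payload.
    """
    findings = []
    for match in OBJ_RE.finditer(pdf_bytes):
        number = int(match.group(1))
        body = match.group(3)
        reasons = []
        # JavaScript actions are referenced with the /JS or /JavaScript keys.
        if b"/JS" in body or b"/JavaScript" in body:
            reasons.append("javascript")
        longest = max((len(tok) for tok in re.findall(rb"\S+", body)), default=0)
        if longest >= min_run:
            reasons.append("long token")
        if reasons:
            findings.append((number, reasons))
    return findings
```

Run against the raw bytes of the file (for example `suspicious_objects(open(path, "rb").read())`), a result such as object 13 flagged for both reasons would point to the same object that stands out in the editor. Objects hidden inside compressed streams would not be found by this sketch.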
|Figure 1.3 - Output from Jsunpack executed with the malicious PDF|
|Figure 1.4 - Output of SpiderMonkey executed with the malicious PDF|
|Figure 1.5 - Folder containing the two log files created by SpiderMonkey|
In Figure 1.5, the files “eval.001.log” and “eval.002.log” are the two log files created by SpiderMonkey. The first file contains the string that is built by the parsing function in Figure 1.2.
|Figure 1.6 - Contents of eval.001.log|
|Figure 1.7 - Snippet from the contents of eval.002.log|
Figure 1.7 shows a snippet of the contents of eval.002.log. The payload, starting with “%uC033” and ending with “%u0070”, is copied and saved in a separate file, “payload.txt”. To analyze the shellcode, the payload must be converted to its hex representation; for this we use a Perl script provided by the “Malware Analyst’s Cookbook and DVD”.
|Figure 1.8 - Payload converted to shellcode with Perl Script|
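The conversion itself is simple enough to sketch. The snippet below is a Python illustration of the idea, not the Perl script from the Cookbook: each “%uXXXX” escape is a 16-bit value whose bytes are stored low byte first on x86, so “%uC033” becomes the bytes 33 C0. The function name `unescape_to_bytes` is hypothetical, and any literal characters between escapes are ignored.

```python
import re

# %uXXXX encodes a 16-bit unit, %XX encodes a single byte.
ESCAPE_RE = re.compile(r"%u([0-9A-Fa-f]{4})|%([0-9A-Fa-f]{2})")


def unescape_to_bytes(payload):
    """Convert a JavaScript unescape()-style string into raw shellcode bytes."""
    out = bytearray()
    for match in ESCAPE_RE.finditer(payload):
        if match.group(1) is not None:
            # 16-bit unit: emit the low byte first (little-endian).
            out += int(match.group(1), 16).to_bytes(2, "little")
        else:
            out.append(int(match.group(2), 16))
    return bytes(out)
```

For example, `unescape_to_bytes("%uC033")` yields `b"\x33\xc0"`, which is the x86 encoding of XOR EAX, EAX.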
Figure 1.8 shows the hex and ASCII representation of the shellcode converted from the payload string. The ASCII representation displays a URL, http://audiodr7..., that is most likely the address the malware will contact to download more malicious code. The shellcode should be saved in a separate file, named “shellcode.txt” in this example. Figure 1.9 shows the command that saves the output to that file.
|Figure 1.9 - Shellcode saved to text file named "shellcode.txt"|
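The URL was spotted by eye in the ASCII column of the hex dump. As a rough sketch of how the same check could be automated, the function below pulls runs of printable ASCII out of raw shellcode bytes, similar in spirit to the Unix `strings` utility. The name `printable_strings`, the minimum length of 6, and the placeholder URL in the usage note are assumptions for illustration only.

```python
import re


def printable_strings(data, min_len=6):
    """Return runs of at least min_len printable ASCII bytes found in data."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [run.decode("ascii") for run in re.findall(pattern, data)]
```

Called on bytes such as `b"\x33\xc0\x64\x8b" + b"http://example.invalid/e.exe" + b"\x00"`, it returns only the embedded URL, because the surrounding opcode bytes do not form a long enough printable run.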
4. Analysis of Shellcode
The next step is to use libemu, a tool that runs shellcode in an emulated environment. Libemu should report any Windows API functions that are called and list the instructions that are executed.
|Figure 1.10 - Output of libemu executed with shellcode|
In Figure 1.10 the step size is 100000 and the verbose option is enabled. Libemu reports that the malware calls the Windows function GetTempPathA, and execution stops there. Execution stops because GetTempPathA is expected to return a temporary path for the program to use; none is supplied in the emulated environment, so the program cannot continue. This is one limitation of libemu. However, we can analyze the binary instructions of the malware manually with a user-level debugger, Immunity Debugger.
The hex code is needed to load the malware into Immunity Debugger. Figure 1.8 displays the hex code, and this code is copied to a separate file named “hexdump.txt”. To obtain the hex code without the offset or ASCII columns, the command in Figure 1.11 is used.
|Figure 1.11 - Hex dump only of the malicious shellcode|
Instead of displaying it on the screen we save it to the file hexdump.txt as shown in Figure 1.12.
|Figure 1.12 - Command to output shellcode to text file in hex code format|
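For readers without the command-line tool at hand, a plain hex dump without offsets or an ASCII column can also be produced in a few lines of Python. This is only a sketch of an equivalent step, not the command shown in the figures; the function names and the 16-bytes-per-line default are hypothetical.

```python
def plain_hex(data, width=16):
    """Format bytes as hex digits only, `width` bytes per line."""
    hex_text = data.hex()
    step = width * 2  # two hex digits per byte
    return "\n".join(hex_text[i:i + step] for i in range(0, len(hex_text), step))


def from_plain_hex(text):
    """Reverse plain_hex: drop all whitespace and decode the hex digits."""
    return bytes.fromhex("".join(text.split()))
```

Writing the result of `plain_hex(shellcode)` to “hexdump.txt” gives text that can be pasted into a converter, and `from_plain_hex` recovers the original bytes for a quick round-trip check.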
Immunity Debugger is installed on a system running Windows XP SP2. From the hex dump file we can easily obtain an executable file by using the online Sandsprite tool “shellcode 2 exe”. The hex dump is pasted into the text box on the web page, and the executable is created and downloaded to the system.
|Figure 1.13 - Shellcode 2 exe web interface|
The file created is named “shellcode.exe_”. This file can be opened with Immunity Debugger.
|Figure 1.14 - Shellcode executable loaded into Immunity Debugger|
The following keys are used in this analysis: “F8” steps through the program, “F7” steps into a function, “F2” sets a software breakpoint, and “F9” runs the program until a breakpoint is reached. For an explanation of how to use Immunity Debugger, refer to Dr. Fu’s Security Blog.
The first interesting instruction is at address 00401002. Here the instruction “MOV EAX, DWORD PTR FS:[EAX+30]” copies an address into the EAX register. Any access through the FS segment should raise a red flag, because in a 32-bit Windows process FS points to the Thread Environment Block, which stores critical information about the thread and its process. The layout of this structure can be verified with WinDBG: attach WinDBG to any process or executable and examine the data structure for the Thread Environment Block.
|Figure 1.15 - Data structure for Thread Environment Block in WinDBG|
As we can see in Figure 1.15, the location read by FS:[EAX+30] is the ProcessEnvironmentBlock field at offset 30, and it is a 32-bit pointer. EAX is loaded again at address 00401008: the instruction reads DS:[EAX+C], and a following instruction reads DS:[EAX+1C]. The first, DS:[EAX+C], loads the address of “Ldr”, which is a pointer to _PEB_LDR_DATA. This can be verified with WinDBG.
|Figure 1.16 - Data structure of _PEB in WinDBG|
The second instruction, DS:[EAX+1C], loads the address of InInitializationOrderModuleList into the EAX register. This address points to the beginning of a list of modules, and the malware will probably try to access one of these modules later. This can also be verified with WinDBG.
|Figure 1.17 - Data structure of _PEB_LDR_DATA|
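The three memory reads described above form a pointer chain: TEB to PEB to _PEB_LDR_DATA to the first list entry. The sketch below replays that chain against a simulated address space so the offsets can be seen together. The offsets 30, C and 1C come from the WinDBG output discussed in the text; the `FakeMemory` class and every address used with it are hypothetical stand-ins, not values from the sample.

```python
import struct

TEB_PEB_OFFSET = 0x30         # _TEB.ProcessEnvironmentBlock
PEB_LDR_OFFSET = 0x0C         # _PEB.Ldr
LDR_INIT_ORDER_OFFSET = 0x1C  # _PEB_LDR_DATA.InInitializationOrderModuleList


class FakeMemory:
    """Tiny stand-in for a 32-bit address space holding little-endian DWORDs."""

    def __init__(self):
        self.cells = {}

    def write_dword(self, address, value):
        self.cells[address] = struct.pack("<I", value)

    def read_dword(self, address):
        return struct.unpack("<I", self.cells[address])[0]


def first_init_order_entry(memory, teb_address):
    """Follow TEB -> PEB -> Ldr -> first InInitializationOrderModuleList link."""
    peb = memory.read_dword(teb_address + TEB_PEB_OFFSET)   # FS:[30]
    ldr = memory.read_dword(peb + PEB_LDR_OFFSET)           # [EAX+C]
    return memory.read_dword(ldr + LDR_INIT_ORDER_OFFSET)   # [EAX+1C]
```

Each line of `first_init_order_entry` corresponds to one of the MOV instructions in the shellcode, which is why the sequence needs no API call to find the loaded modules.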
As we can see in Figure 1.17, InInitializationOrderModuleList is at offset 1C. Next, let us set a breakpoint at 0040105F. As we can see from Figure 1.18, there is a nested loop. After some analysis we can conclude that the malware has its own hash table and uses it to locate a specific function to load from kernel32.dll. At address 0040105B the instruction CMP EDI, EAX compares the hash values; if they are not equal, the search continues. When a match is found, execution falls through the JNZ instruction to the instruction at 0040105F, which pops the top of the stack into the ESI register.
|Figure 1.18 - Section of shellcode loaded in Immunity Debugger|
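The lookup loop can be illustrated with a small model. The hash algorithm used by this sample is not identified in the analysis, so the sketch below substitutes the rotate-right-by-13 additive hash that is commonly seen in Windows shellcode; treat it purely as an assumption. The function names are hypothetical as well.

```python
def ror32(value, count):
    """Rotate a 32-bit value right by count bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def ror13_hash(name):
    """Rotate-and-add hash over the ASCII bytes of an export name."""
    h = 0
    for byte in name.encode("ascii"):
        h = ror32(h, 13)
        h = (h + byte) & 0xFFFFFFFF
    return h


def resolve_by_hash(export_names, wanted_hash):
    """Mimic the CMP/JNZ loop: hash each export until one matches."""
    for name in export_names:
        if ror13_hash(name) == wanted_hash:
            return name
    return None
```

Storing only hashes lets the shellcode avoid embedding readable strings such as “GetTempPathA”, which is why the function names become visible only once the debugger shows the resolved address.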
After the breakpoint has been set at 0040105F, we can run the program to the breakpoint with the key “F9”. Continue to step through the program until the instruction ADD EAX, EBX at address 00401071. Here the EAX register holds the function that the malware was searching for. The function is GetTempPathA, which matches the output of libemu.
|Figure 1.19 - Registers of shellcode.exe at address 00401071|
We continue to step through the program. Inside GetTempPathA, the temp folder for the system is obtained and returned to the malware as an ANSI string (the “A” suffix denotes the ANSI variant of the function). The figure below displays the stack contents at address 7C822220, which is inside the function GetTempPathA. The value stored is “C:\DOCUME~1\Mario\LOCALS~1\Temp\”.
|Figure 1.19 - Stack contents of shellcode.exe at address 7C822220|
We continue to step through the program, and at address 0040109E, at the instruction PUSH EAX, we can see that the ESI register contains the system temp path followed by the file name of an executable, “e.exe”. This is most likely the file the malware wants to download.
|Figure 1.20 - Immunity Debugger instructions and registers at the address 0040109E|
We continue to step through the program and note the functions that the malware calls; they should reveal what the malware is really trying to accomplish. A breakpoint is set at the POP EDI instruction to quickly find each function the malware calls. This location is chosen because it follows the hash lookup routine, so whenever a match is found the resolved function is visible in the EAX register.
|Figure 1.21 - Immunity Debugger showing the function in EAX register|
The second function called by the malware is GetProcAddress, which is exported by kernel32.dll. The function name can be seen in the EAX register in Figure 1.21. We continue to the next function by pressing “F9”.
|Figure 1.22 - Immunity Debugger showing the function in the EAX register|
In Figure 1.22 the third function called is stored in the EAX register. The function is LoadLibraryA, which is also found in kernel32.dll. If we examine the calls to LoadLibraryA further, we find that two additional libraries are loaded into memory: first twain_32.dll and then urlmon.dll.
Again we execute the program to the breakpoint at 00401073 and the fourth function called is URLDownloadToFileA from the library urlmon.dll. The function can be seen in the EAX register in Figure 1.23.
|Figure 1.23 - Immunity Debugger showing the function in the EAX register|
Examining the call to URLDownloadToFileA, we find the web address it connects to in order to download an executable. The address is “http://audiodr7...”, the same one that appeared in the hex dump of the shellcode in Figure 1.8. Figure 1.24 shows the stack contents at address 772BAAD3, inside the URLDownloadToFileA function.
|Figure 1.24 - Stack contents at address 772BAAD3|
Again we run the program to the previously set breakpoint by pressing “F9” and obtain the fifth function called: WinExec, from kernel32.dll, whose address is stored in the EAX register. After WinExec is called, the malware terminates and the system is infected.
|Figure 1.25 - Immunity Debugger showing the function in the EAX register|
Now we have an overview of what the audiodr7 malware is trying to accomplish and which functions it attempts to call. To summarize, five important functions are called:
- GetTempPathA – Obtains the location of the temporary folder for the system
- GetProcAddress – Obtains the address of an exported function from a loaded library
- LoadLibraryA – Calls this function to load two extra libraries, twain_32.dll and urlmon.dll
- URLDownloadToFileA – Connects to the audiodr7 URL and downloads the file “e.exe” to the temp location
- WinExec – The last function called in order to execute the downloaded file “e.exe”
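The order of these calls is itself a useful indicator. As a small sketch of how an analyst might flag it automatically, the function below checks whether a recorded sequence of API names contains a download step followed later by an execution step. The helper name, the pattern tuple, and the idea of feeding it a list of names are assumptions for illustration; the call list simply mirrors the summary above.

```python
def contains_in_order(calls, pattern):
    """Return True if pattern occurs as an ordered subsequence of calls."""
    remaining = iter(calls)
    # Each any() consumes the iterator up to and including the match, so
    # later pattern items can only match calls that come afterwards.
    return all(any(call == wanted for call in remaining) for wanted in pattern)


# Download a file, then execute something: the behavior observed here.
DOWNLOAD_AND_EXECUTE = ("URLDownloadToFileA", "WinExec")

OBSERVED_CALLS = [
    "GetTempPathA",
    "GetProcAddress",
    "LoadLibraryA",
    "URLDownloadToFileA",
    "WinExec",
]
```

With these values, `contains_in_order(OBSERVED_CALLS, DOWNLOAD_AND_EXECUTE)` is true, while the same two names in the opposite order would not match.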
To conclude, many tools exist to aid in the analysis of malware. The approach described above is one way to reverse engineer malware, specifically malware that is embedded in a PDF document.
 Jsunpack, Available at https://code.google.com/p/jsunpack-n/
 SpiderMonkey, Available at https://developer.mozilla.org/en/SpiderMonkey
 Michael Hale Ligh et al., “Malware Analyst’s Cookbook and DVD”, Available at
 Libemu – x86 Shellcode Emulation, Available at http://libemu.carnivore.it/
 Immunity Debugger, Available at http://www.immunitysec.com/products-immdbg.shtml
 Shellcode 2 Exe, Available at http://sandsprite.com/shellcode_2_exe.php
 Dr. Xiang Fu, Malware Analysis Tutorial 4: Int2dh Anti-Debugging, Available at,