Malware Mondays

Episode 1: Procmon

Start by filtering on the Process ID (PID) of the sample. When the initial process drops new files, hash both the original and the dropped file to check whether they are the same.
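
The hash comparison can be sketched in a few lines of Python. This is a minimal sketch using only the standard library; the function names are hypothetical helpers, not part of Procmon.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large samples do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True when both files hash to the same SHA-256 value."""
    return sha256_of(path_a) == sha256_of(path_b)
```

Calling same_content() with the path of the original sample and the path of the dropped file tells you whether the sample simply copied itself.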

In the Process Tree, select the top-level process that launches the other processes and choose ‘Include Subtree’. This creates a filter that also covers the child processes.

Filters

CreateFile does not only create files; it is also logged whenever a handle to a file is opened, which happens a lot. Filter on WriteFile instead: if a file is important, data will probably be written to it. #RegSetValue #DeleteFile
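
The same filtering idea can be applied offline to a trace exported as CSV. The sketch below assumes the export has a column named "Operation" (the sample data and the function name are made up for illustration); adjust the column name if your export differs.

```python
import csv
import io

# Made-up sample rows; a real export has more columns and many more events.
SAMPLE_CSV = r'''"Process Name","PID","Operation","Path"
"sample.exe","4321","CreateFile","C:\Temp\a.tmp"
"sample.exe","4321","WriteFile","C:\Temp\a.tmp"
"sample.exe","4321","RegSetValue","HKCU\Software\Example\Run"
'''


def state_changing_events(csv_text, operations=frozenset({"WriteFile", "RegSetValue"})):
    """Keep only the rows whose Operation column is in the given set."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row["Operation"] in operations]


for event in state_changing_events(SAMPLE_CSV):
    print(event["Operation"], event["Path"])
```

When reading a real export from disk, opening it with encoding="utf-8-sig" is a safe choice in case the file starts with a byte order mark.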

Tip: save your filters, including the process tree filter, and use them as the baseline for analysing new samples.

Hotkeys

Ctrl-J to jump to file location

Done with particular events? Right-click an event and exclude the events before it.

source code

Tip: check out my repositories (source code link) and analyse the executables in Procmon.

Reverse Engineering Malware with Ghidra

Overview notes

Native code: IDA (Pro), Ghidra, Binary Ninja

Interpreted code (.NET, Java): dnSpy, JD

Check out the courses:

  • Identifying and defeating code obfuscation
  • Identifying and defeating packing
  • Identifying and defeating anti-reverse engineering and anti-analysis

Windows:

  • Program trees (provides an overview of the binary structure of the program)
  • Symbol tree (breakdown and overview of all program symbols such as imports, way to identify functions)
  • Data type manager (structures and other data types, typically from header files, or created by you)
  • Listing (disassembly of the executable code from the program)
    • sidebar displays program overview, entropy and a number of different bookmarks
    • default is a linear view, may open a graph view per function
    • can patch/modify the listing
    • options change based on area of program you are exploring
    • create bookmarks and add labels to help with analysis
    • modify register values and convert data types (convert data into code)

Decompiler

Listed to the right of the Listing window. Converts machine code to assembly, then P-code, then C.

Warning: the decompiled code does not necessarily match the original source code.

You can modify the decompiler output:

  • edit local variables, data structures, return types
  • consider changing a signature instead of a local instance
  • changes in signature are also reflected in the listing window
  • leverage header files when possible (usually not available)

The decompiler creates structures which help with program analysis and comprehension. It can:

  • lock analysis through the ‘Decompiler Parameter ID’ (locking signatures) and by committing parameters, return values and local variables
  • trace variable usage and highlight variables that may be impacted going forwards/backwards (navigate through a function by way of a variable)
  • export functions for use with other tools

Demo: Analyzing a trojan

At the entry point there is a call to __security_init_cookie and then an unconditional jump to another location. This is a telltale sign of compiler-generated code. To identify main, scroll on and look for a call preceded by three pushes, corresponding to the three arguments to main:

  • argc (argument count)
  • argv (argument vector (list))
  • envp (array of pointers to environment variables)

First, follow that unconditional jump and then scroll until you see three pushes. Double-click the function that is called after them. To get the graph, use the shortcut icon or the menu bar to open the function graph.

CodeBrowser

Highlighting: hold the ‘Windows’ key and highlight a mnemonic or data in the Listing window.

Annotating in the CodeBrowser allows you to add your insights to the program. You can use labels to mark locations of interest. Ghidra will add labels during auto-analysis. You can have multiple labels per address.

There are five types of comments:

  • EOL
  • plate
  • post
  • pre
  • repeatable

Data types: can be applied at a single address or a selected range. Use the Data Type Manager window to select the type you want to apply, then drag and drop it onto the location where you want it. This improves your understanding of the code you’re analyzing.

Processor specific help: you can right-click on instructions and select ‘processor manual’

Tools and techniques to perform function analysis

Additional function windows:

  • function call graph (shows function calls from current function)
  • function graph
  • function call trees depict a hierarchical relationship of function calls

Important: you need to understand where you are, when you begin analyzing a program. Generally, you have a function, strings or some other indicator that will drive you to begin analysis at a certain point. In the absence of those indicators, you will want to start at the beginning of the program, ignoring unnecessary code, like code generated by the compiler.

Ghidra can manage external programs. You can add additional programs (libraries) that your program depends on. This way you can quickly navigate to functions in those programs.

During flow analysis of a program, perhaps not all functions are identified. You can instead right-click and create or edit a function. This way you can change parameter types, change the calling convention, undefine functions, etcetera.

You can also add symbols, if they are available. Usually with malware you don’t have access to PDB files.

Demo: function analysis

One way to defeat an anti-analysis check that uses IsDebuggerPresent: set a bp on IsDebuggerPresent from kernel32, then set eax to 0 before the test eax, eax that follows the call.

Analyzing shellcode: mov eax, fs:[0x30] (walking the PEB; on 32-bit Windows this loads the address of the PEB)

Headless mode: can be helpful when you have a lot of repetitive tasks, like analyzing multiple samples.
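
A batch run over several samples could be driven from Python. This is a sketch that only builds the analyzeHeadless argument lists; the launcher path, project directory, project name and script name are hypothetical placeholders, and the exact launcher file name depends on your Ghidra installation.

```python
def headless_command(analyze_headless, project_dir, project_name, sample, post_script=None):
    """Build an analyzeHeadless argument list that imports one sample."""
    command = [analyze_headless, project_dir, project_name, "-import", sample]
    if post_script:
        # Run a script after auto-analysis has finished.
        command += ["-postScript", post_script]
    return command


def commands_for_samples(analyze_headless, project_dir, project_name, samples, post_script=None):
    """One command per sample, so a failure in one does not stop the rest."""
    return [
        headless_command(analyze_headless, project_dir, project_name, sample, post_script)
        for sample in samples
    ]
```

Each returned list can be passed to subprocess.run() once the placeholder paths have been replaced with real ones.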

Demo: scripting analysis

You can for example import and run a script that prints stack strings as ASCII strings and highlights their location.

Looking at decompiled code, you can sometimes see a run of local variables that are assigned hex values. These are telltale signs of the use of stack strings. You can convert the hex values to character sequences, which can sometimes show you ASCII strings.

To use a script, you can use the Script Manager, right-click the script, and run it. The console below will show the output.
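
Outside Ghidra, the hex-to-ASCII conversion itself is easy to try by hand. This is a minimal Python sketch, not the script from the demo: it assumes 4-byte little-endian values (as on x86) and the values below are made up for illustration.

```python
def decode_stack_string(values, width=4):
    """Join little-endian immediates and decode them up to the first NUL byte."""
    raw = b"".join(value.to_bytes(width, "little") for value in values)
    raw = raw.split(b"\x00", 1)[0]  # stack strings are usually NUL terminated
    return raw.decode("ascii", errors="replace")


# Values as they might appear in the decompiler, in assignment order.
print(decode_stack_string([0x6E72656B, 0x32336C65, 0x6C6C642E, 0x00000000]))
# prints "kernel32.dll"
```

The byte order matters: each value has to be reversed byte-wise before the characters read in the right order.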

Malware analysis: Identifying and Defeating Packing

Primary signs of packing

  • strings (the presence or absence)
  • imports or a lack of
  • sections (their number and names)
  • entropy (high entropy in the sections is a leading indicator)
  • signatures from known packers
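
The entropy indicator can be computed directly. Below is a minimal Python sketch of Shannon entropy over the bytes of a section; the result ranges from 0.0 (a single repeated byte) to 8.0 (all byte values equally likely). Where to draw the line for "high" is a judgment call that this sketch does not make.

```python
import math
from collections import Counter


def shannon_entropy(data):
    """Shannon entropy of a bytes object, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # how often each byte value occurs
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Compressed or encrypted data scores close to 8, while plain code and text score noticeably lower.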

Detecting packing with signatures

PEstudio looks for signatures, which are series of byte values. Each byte pattern is compared against your sample and, if it matches, the name of the signature is displayed. Bytes displayed as xx are wildcards. The signature is either compared at the entry point only (ep_only) or searched for in the entire binary.
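
How such a wildcard signature is matched can be sketched in Python. This is an illustration of the idea, not PEstudio's implementation: the function names are hypothetical, and accepting "??" as a wildcard in addition to "xx" is an assumption.

```python
WILDCARDS = {"xx", "??"}  # "??" is accepted as an assumption, "xx" is from the notes


def parse_signature(text):
    """Turn '60 BE xx xx' into [0x60, 0xBE, None, None]."""
    return [None if tok.lower() in WILDCARDS else int(tok, 16) for tok in text.split()]


def matches_at(data, offset, pattern):
    """True when the pattern matches the bytes starting at offset."""
    if offset < 0 or offset + len(pattern) > len(data):
        return False
    return all(p is None or data[offset + i] == p for i, p in enumerate(pattern))


def find_signature(data, pattern, entry_point_offset=None):
    """Return the matching offset or -1.

    With entry_point_offset set, only that offset is checked (ep_only style);
    otherwise the whole buffer is scanned.
    """
    if entry_point_offset is not None:
        return entry_point_offset if matches_at(data, entry_point_offset, pattern) else -1
    for offset in range(len(data) - len(pattern) + 1):
        if matches_at(data, offset, pattern):
            return offset
    return -1
```

Note that entry_point_offset here is a file offset; converting the entry point address from the PE header into a file offset is left out of the sketch.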

PEiD established a signature database and while it is no longer maintained, this approach to signature writing is used by PEstudio. Also a warning: PEiD can actually execute your sample through its plugins!

Section names are arbitrary as far as execution is concerned. This might be why packers create meaningless section names. Common section names are .text or CODE, .rdata (read-only data), .bss (uninitialized data) and .reloc (relocation information).

Programs need to import functionality to interact with the operating system. There are three ways of linking:

  • dynamic linking: the usual way
  • runtime linking: imports are resolved while the program runs; malware can use this to create its own import table
  • static linking: the functionality is built (compiled) into the binary
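
Listing section names does not require a dedicated tool. The sketch below reads them from the PE headers with the Python standard library and flags the ones that are not in a small set of common names. The set holds the names from these notes plus .data, .rsrc and .idata, which are my own additions; an unusual name is a hint, not proof of packing.

```python
import struct

COMMON_SECTIONS = {".text", "CODE", ".rdata", ".bss", ".reloc", ".data", ".rsrc", ".idata"}


def section_names(pe_bytes):
    """Return the section names from the section table of a PE file."""
    if pe_bytes[:2] != b"MZ":
        raise ValueError("missing MZ header")
    (e_lfanew,) = struct.unpack_from("<I", pe_bytes, 0x3C)  # offset of the PE header
    if pe_bytes[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    coff = e_lfanew + 4
    (number_of_sections,) = struct.unpack_from("<H", pe_bytes, coff + 2)
    (size_of_optional_header,) = struct.unpack_from("<H", pe_bytes, coff + 16)
    table = coff + 20 + size_of_optional_header  # section table follows the optional header
    names = []
    for index in range(number_of_sections):
        start = table + index * 40  # each section header is 40 bytes
        raw_name = pe_bytes[start:start + 8]  # the name is 8 bytes, NUL padded
        names.append(raw_name.rstrip(b"\x00").decode("ascii", errors="replace"))
    return names


def unusual_sections(pe_bytes):
    """Section names that are not in the set of common names."""
    return [name for name in section_names(pe_bytes) if name not in COMMON_SECTIONS]
```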

You can also look at the call graph of a binary. Most packers begin the program at start; main is something the compiler adds. In a packed binary there can be very few cross-references to start, whereas even simple ‘Hello world!’ programs tend to have more.

Of the strings that you can see, important ones are VirtualAlloc, memset and memcpy. When these do not show up in the Imports table, the program is resolving its imports dynamically. This is an obfuscation technique.

Search for the DOS stub to see where other PE files have been extracted in memory. Remember: malware authors can change or remove the string ‘This program cannot be run in DOS mode’.
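
On a dump saved to disk, this search can be scripted. The Python sketch below looks for the DOS stub string and, because that string can be changed or removed, also for "MZ" headers whose e_lfanew field points at a "PE" signature. The function names are hypothetical, and real dumps will produce false positives and corrupt hits.

```python
DOS_STUB = b"This program cannot be run in DOS mode"


def find_all(data, needle):
    """Return every offset at which needle occurs in data."""
    offsets = []
    start = data.find(needle)
    while start != -1:
        offsets.append(start)
        start = data.find(needle, start + 1)
    return offsets


def embedded_pe_candidates(dump):
    """Offsets of 'MZ' headers whose e_lfanew points at a 'PE' signature."""
    hits = []
    for offset in find_all(dump, b"MZ"):
        if offset + 0x40 > len(dump):
            continue  # not enough room for a DOS header
        e_lfanew = int.from_bytes(dump[offset + 0x3C:offset + 0x40], "little")
        pe = offset + e_lfanew
        if dump[pe:pe + 4] == b"PE\x00\x00":
            hits.append(offset)
    return hits
```

find_all(dump, DOS_STUB) gives the quick string-based answer; embedded_pe_candidates(dump) still works when the string has been altered.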

Remember: the program you’re analyzing is also in the PE file format, and the libraries it has loaded are mapped into the virtual address space of your process as well. You will therefore also find legitimate PE files. There are techniques to reduce this noise.

Software breakpoints to set on memory allocations: VirtualAlloc or VirtualAllocEx. #Windbg: bp kernel32!VirtualAlloc. #debugging: most debuggers allow you to search symbols (functions) of libraries. x /D /f KERNEL32!v* in WinDbg.

debugging-memory-allocations: after setting breakpoints on VirtualAlloc*, watch the return value in the EAX/RAX register. This is the address of the newly allocated memory. Then investigate its permissions with commands such as !vprot in WinDbg, and watch the contents of the memory as the program begins to use it. You can also search memory for evidence of a new PE file. It helps to avoid high-address regions of memory, where imported libraries are usually loaded.
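
When the debugger reports a protection value as a number, it helps to decode it. The sketch below maps the Windows memory protection constants to their names and flags regions that are both writable and executable, which is what an unpacker typically needs. The helper names are hypothetical.

```python
PAGE_PROTECTIONS = {
    0x01: "PAGE_NOACCESS",
    0x02: "PAGE_READONLY",
    0x04: "PAGE_READWRITE",
    0x08: "PAGE_WRITECOPY",
    0x10: "PAGE_EXECUTE",
    0x20: "PAGE_EXECUTE_READ",
    0x40: "PAGE_EXECUTE_READWRITE",
    0x80: "PAGE_EXECUTE_WRITECOPY",
}
PAGE_MODIFIERS = {0x100: "PAGE_GUARD", 0x200: "PAGE_NOCACHE", 0x400: "PAGE_WRITECOMBINE"}


def describe_protection(value):
    """Return the base protection name followed by any modifier names."""
    names = [PAGE_PROTECTIONS.get(value & 0xFF, "UNKNOWN")]
    names += [name for bit, name in PAGE_MODIFIERS.items() if value & bit]
    return names


def is_writable_and_executable(value):
    """True for PAGE_EXECUTE_READWRITE and PAGE_EXECUTE_WRITECOPY."""
    return (value & 0xFF) in (0x40, 0x80)
```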

investigate-memory-allocations: dump the memory you want to investigate. This technique requires trial and error. You may dump corrupt PE files, but you will develop a feel for how a PE file should look in memory or in a hex editor.

Demo: Unpacking a ransomware

  1. set a bp on e.g. VirtualAlloc* (stub)
  2. begin execution
  3. step out from the breakpoint
  4. see the return value in the EAX register (the memory address just allocated by the call to VirtualAlloc)
  5. dd <memory address>: the memory is still empty
  6. !vprot <memory address> gives information about the allocation (size, base, permissions)
  7. if you hit the breakpoint again, step out
  8. then inspect the contents of memory again