Monday, September 8, 2014

New Rekall Plugin: Mac Compressed Memory

At the beginning of August I went to the DFRWS US conference in Denver. There were lots of very interesting forensics talks at the conference, but I found one of them particularly interesting. The title of the presented paper was "In Lieu of Swap: Analyzing Compressed RAM in Mac OS X and Linux" (Golden G. Richard III and Andrew Case) [1]. The conference organizers thought highly of this paper too: it won this year's Best Paper Award.

In the talk, one of the authors presented how they are able to extract Mac compressed memory from an image. Since OS X Mavericks, Macs can compress memory regions that are not currently in use to free up RAM [2]. This obviously poses a problem for memory forensics: simple string matching, for example, no longer works if the memory you are looking at is compressed.
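To make the string matching problem concrete, here is a minimal sketch. It uses zlib from the Python standard library purely as a stand-in compressor (Apple uses WKdm, not zlib), and the page contents and the needle are made up for illustration.

```python
import zlib

# A made-up 4 KiB "page" containing a string an investigator might search for.
needle = b"password=hunter2"
page = (b"\x00" * 100 + needle).ljust(4096, b"\x00")

# zlib is only a stand-in here; the real compressor on OS X is WKdm.
compressed = zlib.compress(page)

print(needle in page)                         # search on the raw page
print(needle in compressed)                   # search on the compressed page
print(needle in zlib.decompress(compressed))  # search after decompression
```

The search only succeeds on the raw and the decompressed data, which is why a forensic tool has to locate and decompress these regions before scanning them.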

In the presentation, one of the authors introduced their approach to find and decompress those memory regions (they implemented a Volatility plugin which does it). I became particularly interested when I heard him complain that their Python implementation of the decompression algorithm is very slow. I really wanted to analyze what they had done and see whether there is a way of improving the speed of this algorithm, but unfortunately I found out that the authors have not released the code for this yet. Even now, one month later, the code is nowhere to be found :( I thought this might be an interesting challenge though, so that same evening, after coming back from conference beers, I sat down and implemented my own version of the decompressor.

To compress memory, Apple uses (with slight modifications) an algorithm called WKdm, which was published in 1997 by Paul Wilson and Scott F. Kaplan. There is a standard C implementation freely available (for example in this repository [3]) but no Python one as far as I know. Apple uses a highly optimized assembly version for Mavericks which is way faster than the C implementation.

I ported the code from [3] basically 1:1 to Python and quickly saw why the Python version is slow. There are lots of bit operations, which Python is particularly bad at, and many of the operations are vectorized in C, which is not so easily done in Python. I spent some time optimizing this using some precomputations and Python iterators in place of pointers, and actually got acceptable performance for this algorithm. My implementation can now do 1000 compress/decompress operations of a 4k page per second on my 2014 15-inch MacBook Pro, which is 2.5 times as much as what was claimed in the talk for the implementation they had. Of course I am fully aware that I'm comparing apples and oranges here, since the hardware they used was probably completely different. However, they still have not released the code they were talking about, so I have no way of staging a fair comparison. Regardless of the performance difference, this was a fun exercise. If you want to take a look at my optimized implementation, it's available in the Rekall repository [4].
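To give a flavor of what "precomputation" can mean for bit-heavy Python code, here is a small sketch. It is not taken from the Rekall implementation, and the bit layout (four 2-bit fields per byte, least significant pair first) is an assumption made only for this example. The idea is to replace per-field shifting and masking with a single table lookup per byte.

```python
# Precompute, once, the four 2-bit fields held in every possible byte value.
# Assumption for this sketch: fields are stored least-significant pair first.
TAG_TABLE = [
    tuple((byte >> shift) & 0x3 for shift in (0, 2, 4, 6))
    for byte in range(256)
]


def unpack_tags_naive(packed):
    """Extract 2-bit fields with explicit shifts and masks (4 per byte)."""
    tags = []
    for byte in packed:
        for shift in (0, 2, 4, 6):
            tags.append((byte >> shift) & 0x3)
    return tags


def unpack_tags_table(packed):
    """Extract the same fields with one table lookup per byte."""
    tags = []
    for byte in packed:
        tags.extend(TAG_TABLE[byte])
    return tags


packed = bytes([0b11100100, 0x00, 0xFF])
print(unpack_tags_table(packed))
```

Both functions return the same list; the table variant simply moves the bit arithmetic out of the hot loop, which tends to matter in CPython where every shift and mask is an interpreted operation.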

As a side effect of having this memory decompressor, I was also able to implement a Rekall plugin that dumps all compressed memory pages in a Darwin image to disk. It's called "dumpcompressedmemory":

rekal -f ~/mem.dump

<...>
mem.dump 14:03:46> mkdir /tmp/compressed_memory
mem.dump 14:03:54> dumpcompressedmemory(dump_dir="/tmp/compressed_memory/")
<...>

mem.dump 14:04:03> ls /tmp/compressed_memory/
segment0/ segment1/ segment10/
<...>

mem.dump 14:04:08> ls /tmp/compressed_memory/segment1
slot1.dmp slot16.dmp slot23.dmp slot29.dmp slot34.dmp slot43.dmp
slot10.dmp slot17.dmp slot24.dmp slot3.dmp slot35.dmp
<...>


[1] http://dfrws.org/2014/proceedings/DFRWS2014-1.pdf

[2] https://www.apple.com/osx/advanced-technologies/

[3] https://github.com/santolucito/CCache/tree/master/WKdm

[4] https://github.com/google/rekall/blob/master/rekall/plugins/darwin/WKdm.py

Friday, March 28, 2014

OSX 10.9 Memory Acquisition

Late in 2012, the only free solution for Mac memory acquisition was MacMemoryReader, which is not open source and depends on a binary distribution of the driver built specifically for the target kernel. This means MacMemoryReader actually carries 5 different versions of its kernel extension in its supportfiles directory, and automatically chooses the correct one as long as it is used on a system that was supported at the time the program was packaged.

These constraints didn't fit our use case very well, which is why I developed an open source memory acquisition program for Mac OSX called OSXPmem. This program was designed specifically with the goal to be as operating system independent as possible, while providing a stable way to acquire memory even on future versions of Mac OSX.

With the 10.9 release of OSX, MacMemoryReader stopped working (it only packages drivers for OSX 10.5 - 10.8). Fortunately, OSXPmem still works and is currently the only free memory acquisition tool which is able to acquire memory on Mac OSX 10.9 or higher.

In this blogpost I want to elaborate a bit on the reasons for that, while also giving an overview on how memory acquisition on Mac OSX actually works under the hood.

How Memory Acquisition is Implemented

Because of memory protection, it is not possible for a normal user-mode application to just go ahead and read all the memory. Each process resides in a virtual address space which is mapped into physical memory using paging. The data structures used for this are managed by the kernel, and can only be accessed from kernel mode. This is the reason why any memory acquisition program must first load a driver into the kernel that provides the actual memory access.

Enumerating Physical Memory

Physical memory is not contiguous, so don't expect the physical address space of a computer with e.g. 4GB of RAM to be laid out as one chunk from 00000000 - FFFFFFFF. The physical address space itself is again a virtual construct, controlled by the northbridge. For faster access to device memory, many hardware devices like graphics or network cards map parts of their integrated memory or registers into the physical address space via memory mapped I/O. The BIOS (or on modern systems the EFI) initializes the hardware on boot and arranges all these memory chunks into something called the physical address space. Here is a redacted example of the layout on my laptop:

 Size of physical address space: 6171918336 bytes (206 segments)  
 Memory Type       Start Addr    End Addr  
 Conventional  0000000000000000 000000000008f000  
 Reserved      000000000008f000 0000000000090000  
 Conventional  0000000000090000 00000000000a0000  
 Reserved      00000000000a0000 0000000000100000   
 ...   
 Conventional  0000000100000000 000000016fe00000  

Regions marked "Conventional" are guaranteed to be backed by physical memory, while regions marked "Reserved" might contain device memory. Reading from the reserved regions should be avoided, as this can trigger interrupts on a device and thus result in a hardware malfunction, system instability or even a kernel crash. If you're interested in more details I can recommend Gustavo Duarte's excellent articles Getting Physical With Memory and Motherboard Chipsets and the Memory Map.
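As a rough sketch of what this means for an acquisition tool, the snippet below takes the rows shown in the example layout above (only the rows that are actually listed, the elided ones are left out) and keeps just the ranges that are safe to read. The function name and the tuple layout are made up for this illustration.

```python
# (type, start, end) rows copied from the example layout above.
# The rows hidden behind "..." in the listing are not reproduced here.
SEGMENTS = [
    ("Conventional", 0x0000000000000000, 0x000000000008F000),
    ("Reserved",     0x000000000008F000, 0x0000000000090000),
    ("Conventional", 0x0000000000090000, 0x00000000000A0000),
    ("Reserved",     0x00000000000A0000, 0x0000000000100000),
    ("Conventional", 0x0000000100000000, 0x000000016FE00000),
]


def readable_ranges(segments):
    """Return (start, end) ranges of Conventional memory, merging neighbors."""
    ranges = []
    for kind, start, end in sorted(segments, key=lambda seg: seg[1]):
        if kind != "Conventional":
            continue  # never touch Reserved regions
        if ranges and ranges[-1][1] == start:
            # Directly adjacent to the previous range: extend it.
            ranges[-1] = (ranges[-1][0], end)
        else:
            ranges.append((start, end))
    return ranges


for start, end in readable_ranges(SEGMENTS):
    print("%016x - %016x (%d bytes)" % (start, end, end - start))
```

Everything outside the returned ranges is either skipped or zero-padded in the image, depending on the output format the tool writes.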

The bottom line is that a memory acquisition tool must find out how physical memory is laid out in order to acquire it in a way that does not cause instability and thus loss of the desired data. This is usually done by querying the kernel, as it manages physical memory and has to know its layout. On Mac OSX the EFI passes this information to the component of the kernel called the platform expert, which stores it in a data structure in PE_state.bootArgs->MemoryMap. If you look at line 13 of PE_state_raw.dtrace in MacMemoryReader's supportfiles directory, you can see how it is obtained:

 self->kgm_boot_args = ((struct boot_args*)(`PE_state).bootArgs);  

OSXPmem uses the same source of information to obtain the memory map. Unfortunately this information is not very reliable, as this data-structure is not used by the kernel at runtime, making it an easy target for rootkit manipulation. In one of our papers last year we showed that it is trivial to overwrite this data-structure, making memory acquisition impossible on the system. We also implemented a more reliable way to obtain an accurate memory map in that publication. However, this has not been implemented for OSX yet.

Mapping Physical Memory

Because even the kernel operates inside virtual memory, the memory acquisition driver needs to map physical memory into the kernel's address space. On OSX, this is ultimately done by the Mach portion of the kernel. However, kernel extensions are not allowed to link directly to kernel functions. Instead, they are linked to Kernel Programming Interfaces (KPI), which are interfaces to a limited set of symbols Apple allows kernel extensions to use. The functions available through these interfaces can change with every kernel release, especially the ones in the "unsupported" KPI (where the Platform Expert's state is linked from). But even supported KPIs like the ones for physical memory mapping can change, as Apple has shown in the past.

Both OSXPmem and MacMemoryReader use a simple BSD kernel extension (kext) to provide memory access. However, physical memory mapping is only supported from Apple's IOKit framework. This is the reason why both kexts use the IOMemoryDescriptor class to map memory into the kernel. With the introduction of memory compression in OSX 10.9, this method has been found to cause problems. The kernel can compress and uncompress memory on the fly, and doesn't want any other driver touching it. It is entirely possible that in future OSX releases some KPIs will change and this method of mapping memory will not be possible at all.

Luckily, we have also developed a different mapping technique. Originally intended as a rootkit-resilient mapping technique, this approach has also proven to be very reliable on every major operating system. It works by directly modifying the page tables and clearing the caches, forcing the memory management unit (MMU) to remap an existing virtual address. As illustrated in Figure 1, a virtual page called "Rogue Page" is remapped by editing the page table entry (PTE) directly and then flushing the translation lookaside buffer (TLB). This enables us to access any arbitrary page in the physical address space, without any kernel dependency. Details on this approach can be found in our paper.

Figure 1: PTE Remapping


By "bypassing" the kernel when accessing memory this technique does not depend on any KPIs and thus isn't affected by compressed RAM or any obstacles Apple might decide to put into the IOMemoryDescriptor in future versions of OSX. This is why we made it the default mapping method in WinPmem and OSXPmem. We expect this method to be more stable and resilient to changes in the kernel and even anti-forensic techniques.
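The arithmetic behind the PTE edit can be sketched in a few lines. This is only a model of the bit manipulation for a 4 KiB page on x86-64, where bits 12 to 51 of a PTE hold the physical frame address and the remaining bits carry flags. It is not the driver code: the actual write to the page table and the TLB flush have to happen in kernel mode. The PTE value and the physical address below are made up for the example.

```python
PAGE_SHIFT = 12
PAGE_OFFSET_MASK = (1 << PAGE_SHIFT) - 1
# Bits 12..51 of an x86-64 PTE hold the physical frame address.
PFN_MASK = 0x000FFFFFFFFFF000
U64_MASK = 0xFFFFFFFFFFFFFFFF


def remap_pte(pte, phys_addr):
    """Return a PTE value that keeps the flags of `pte` but points at the
    4 KiB frame containing `phys_addr`."""
    flags = pte & ~PFN_MASK & U64_MASK
    return flags | (phys_addr & PFN_MASK)


def page_offset(phys_addr):
    """Offset of `phys_addr` inside its 4 KiB page."""
    return phys_addr & PAGE_OFFSET_MASK


# Hypothetical values: a present, writable, no-execute PTE and a target address.
old_pte = 0x8000000012345063
target = 0x16FDFF123

new_pte = remap_pte(old_pte, target)
print(hex(new_pte), hex(page_offset(target)))
```

After the modified entry is written back and the TLB entry for the rogue page is flushed, a read at the rogue page's virtual address plus the page offset would return the byte at the target physical address.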

Kernel Extension Code Signing

With the release of OSX 10.9, Apple has changed its previous policy on driver signature enforcement. While it was previously possible to sign a kernel extension, unsigned kexts could also be loaded without any issues. In 10.9 and later, the kext loader displays a warning message when an unsigned kext is loaded. It is very likely that Apple will adopt the Microsoft approach at some point, preventing unsigned kexts from being loaded. This is why we also provide a signed binary for OSXPmem RC2 on the Rekall download page.

32-bit Kernel Support

OSXPmem only supports Macs using the 64-bit version of OSX. Apple introduced 64-bit builds with OSX 10.6, making them the default with 10.7. Since OSX 10.8, the 32-bit kernel is deprecated and all Macs running 10.8 or higher should be running in 64-bit mode. This means OSXPmem will not work on OSX 10.5 or older, and might not work on some 10.6 systems. I will not add 32-bit support, as these kernels are deprecated and only run on very old machines anyway. If you need to acquire memory from such a system I encourage you to use MacMemoryReader.

Monday, March 10, 2014

How to stop memory acquisition by changing one byte.

In our recent paper, we examined memory acquisition in detail and tested a bunch of tools. Memory acquisition tools have to achieve two tasks to be useful:
  1. They need to be able to map a region of physical memory into the virtual address space, so it can be read by the tool.
  2. They need to know where in the physical address space it is safe to read. Reading a DMA mapped region will typically crash the system (BSOD).
Since PCI devices are able to map DMA buffers into the physical address space, it is not safe to read these buffers. When a read operation occurs on the memory bus for these addresses, the device might become activated and cause a system crash or worse. The memory acquisition tool needs to be able to avoid these DMA mapped regions in order to safely acquire the memory.
Let us see what occurs when one loads the memory acquisition driver. Since our goal is to play around with memory modification, we will enable write support for the winpmem acquisition tool (This example uses a Windows 7 AMD64 VM):
In [2]:
!c:/Users/mic/winpmem_write_1.5.5.exe -l -w
Enabling write mode.
Driver Unloaded.
Loaded Driver C:\Users\mic\AppData\Local\Temp\pmeE820.tmp.
Write mode enabled! Hope you know what you are doing.
CR3: 0x0000187000
 2 memory ranges:
Start 0x00001000 - Length 0x0009E000
Start 0x00100000 - Length 0x7CEF0000
Acquisition mode PTE Remapping


We see that winpmem extracts its driver to a temporary location, and loads it into the kernel. It then reports the value of the Control Register CR3 (This is the kernel's Directory Table Base - or DTB).
Next we see that the driver is reporting the ranges of physical memory available on this system. There are two ranges on this system with a gap in between. To understand why, let's consider the boot process:
  • When the system boots, the BIOS configures the initial physical memory map. The RAM in the system is literally installed at various ranges in the physical address space by the BIOS.
  • The operating system is booted in Real mode, at which point a BIOS service interrupt is issued to query this physical memory configuration. It is only possible to issue this interrupt in Real mode.
  • During the OS boot sequence, the processor is switched to protected mode and the operating system continues booting.
  • The OS configures PCI devices by talking to the PCI controller and mapping each PCI device's DMA buffer (plug and play) into one of the gaps in the physical address space. Note that these gaps may not actually be backed by any RAM chips at all (which means that a write to that location will simply not stick - reading it back will produce 0).
The important thing to take from this is that the physical memory configuration is done by the machine BIOS on its own (independent of the running operating system). The OS kernel needs to live with whatever configuration the hardware boots with. The hardware will typically install some gaps in the physical address range so that PCI devices can be mapped inside them (some PCI devices can only address 4GB, so there must be sufficient space in the lower 4GB of physical address space for these).
Since the operating system can only query the physical memory map when running in real mode, but needs to use it to configure PCI devices while running in protected mode, there must be a data structure somewhere which keeps this information around. When WinPmem queries for this information, it can not be retrieved directly from the BIOS - since the machine is already running in protected mode.
The usual way to get the physical memory ranges is to call MmGetPhysicalMemoryRanges(). This is the function API:
PPHYSICAL_MEMORY_DESCRIPTOR NTAPI MmGetPhysicalMemoryRanges(VOID);
We can get Rekall to disassemble this function for us. First we initialize the notebook, opening the winpmem driver to analyze the live system. Since Rekall uses exact profiles generated from accurate debugging information for the running system, it can resolve all debugging symbols directly. We therefore can simply disassemble the function by name:
In [2]:
from rekall import interactive
interactive.ImportEnvironment(filename=r"\\.\pmem")
Initializing Rekall session.
Done!

In [3]:
dis "nt!MmGetPhysicalMemoryRanges"
   Address      Rel Op Codes             Instruction                    Comment
-------------- ---- -------------------- ------------------------------ -------
------ nt!MmGetPhysicalMemoryRanges ------
0xf80002cd9690    0 488bc4               MOV RAX, RSP                   
0xf80002cd9693    3 48895808             MOV [RAX+0x8], RBX             
0xf80002cd9697    7 48896810             MOV [RAX+0x10], RBP            
0xf80002cd969b    B 48897018             MOV [RAX+0x18], RSI            
0xf80002cd969f    F 48897820             MOV [RAX+0x20], RDI            
0xf80002cd96a3   13 4154                 PUSH R12                       
0xf80002cd96a5   15 4155                 PUSH R13                       
0xf80002cd96a7   17 4157                 PUSH R15                       
0xf80002cd96a9   19 4883ec20             SUB RSP, 0x20                  
0xf80002cd96ad   1D 65488b1c2588010000   MOV RBX, [GS:0x188]            
0xf80002cd96b6   26 41bf11000000         MOV R15D, 0x11                 
0xf80002cd96bc   2C 4533e4               XOR R12D, R12D                 
0xf80002cd96bf   2F f6835904000020       TEST BYTE [RBX+0x459], 0x20    
0xf80002cd96c6   36 458d6ff0             LEA R13D, [R15-0x10]           
0xf80002cd96ca   3A 7405                 JZ 0xf80002cd96d1              nt!MmGetPhysicalMemoryRanges + 0x41
0xf80002cd96cc   3C 418bfc               MOV EDI, R12D                  
0xf80002cd96cf   3F eb2a                 JMP 0xf80002cd96fb             nt!MmGetPhysicalMemoryRanges + 0x6B
0xf80002cd96d1   41 66ff8bc6010000       DEC WORD [RBX+0x1c6]           
0xf80002cd96d8   48 33c0                 XOR EAX, EAX                   
0xf80002cd96da   4A f04c0fb13de5e4dcff   LOCK CMPXCHG [RIP-0x231b1b], R15 0x0 nt!MmDynamicMemoryLock
0xf80002cd96e3   53 740c                 JZ 0xf80002cd96f1              nt!MmGetPhysicalMemoryRanges + 0x61
0xf80002cd96e5   55 488d0ddce4dcff       LEA RCX, [RIP-0x231b24]        0x0 nt!MmDynamicMemoryLock
0xf80002cd96ec   5C e83f00c3ff           CALL 0xf80002909730            nt!ExfAcquirePushLockShared
0xf80002cd96f1   61 808b5904000020       OR BYTE [RBX+0x459], 0x20      
0xf80002cd96f8   68 418bfd               MOV EDI, R13D                  
0xf80002cd96fb   6B 488b053679e3ff       MOV RAX, [RIP-0x1c86ca]        0xFFFFFA8001793FD0 nt!MmPhysicalMemoryBlock
0xf80002cd9702   72 33c9                 XOR ECX, ECX                   
0xf80002cd9704   74 41b84d6d5068         MOV R8D, 0x68506d4d            
0xf80002cd970a   7A 8b10                 MOV EDX, [RAX]                 
0xf80002cd970c   7C 4103d5               ADD EDX, R13D                  
0xf80002cd970f   7F c1e204               SHL EDX, 0x4                   
0xf80002cd9712   82 e8f944d3ff           CALL 0xf80002a0dc10            nt!ExAllocatePoolWithTag
0xf80002cd9717   87 488be8               MOV RBP, RAX                   
0xf80002cd971a   8A 493bc4               CMP RAX, R12                   
0xf80002cd971d   8D 7545                 JNZ 0xf80002cd9764             nt!MmGetPhysicalMemoryRanges + 0xD4
0xf80002cd971f   8F 413bfd               CMP EDI, R13D                  
0xf80002cd9722   92 7539                 JNZ 0xf80002cd975d             nt!MmGetPhysicalMemoryRanges + 0xCD
0xf80002cd9724   94 498bc7               MOV RAX, R15                   
0xf80002cd9727   97 f04c0fb12598e4dcff   LOCK CMPXCHG [RIP-0x231b68], R12 0x0 nt!MmDynamicMemoryLock
0xf80002cd9730   A0 740c                 JZ 0xf80002cd973e              nt!MmGetPhysicalMemoryRanges + 0xAE
0xf80002cd9732   A2 488d0d8fe4dcff       LEA RCX, [RIP-0x231b71]        0x0 nt!MmDynamicMemoryLock
0xf80002cd9739   A9 e80655bfff           CALL 0xf800028cec44            nt!ExfReleasePushLockShared
0xf80002cd973e   AE 80a359040000df       AND BYTE [RBX+0x459], 0xdf     
0xf80002cd9745   B5 664401abc6010000     ADD [RBX+0x1c6], R13W          
0xf80002cd974d   BD 750e                 JNZ 0xf80002cd975d             nt!MmGetPhysicalMemoryRanges + 0xCD
0xf80002cd974f   BF 488d4350             LEA RAX, [RBX+0x50]            

Note that Rekall is able to resolve the addresses back to the symbol names by using debugging information. This makes reading the disassembly much easier. We can see that this function essentially copies the data referred to by the symbol nt!MmPhysicalMemoryBlock into a buffer it allocates with nt!ExAllocatePoolWithTag, which is then handed back to the caller.
Let's dump this memory:
In [8]:
dump "nt!MmPhysicalMemoryBlock", rows=2
    Offset                           Hex                              Data       Comment
-------------- ------------------------------------------------ ---------------- -------
0xf80002b11038 d0 3f 79 01 80 fa ff ff 01 00 01 00 fe 3d 09 a1  .?y..........=.. nt!MmPhysicalMemoryBlock + 0
0xf80002b11048 a0 a8 83 01 80 fa ff ff 70 6a 7b 01 80 fa ff ff  ........pj{..... nt!IoFileObjectType + 0

This appears to be an address, let's dump it:
In [10]:
dump 0xfa8001793fd0, rows=4
    Offset                           Hex                              Data       Comment
-------------- ------------------------------------------------ ---------------- -------
0xfa8001793fd0 02 00 00 00 00 00 00 00 8e cf 07 00 00 00 00 00  ................ 
0xfa8001793fe0 01 00 00 00 00 00 00 00 9e 00 00 00 00 00 00 00  ................ 
0xfa8001793ff0 00 01 00 00 00 00 00 00 f0 ce 07 00 00 00 00 00  ................ 
0xfa8001794000 fe ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................ 

The data at this location contains a struct of type _PHYSICAL_MEMORY_DESCRIPTOR, which is also the return value of the MmGetPhysicalMemoryRanges() call. We can use Rekall to simply construct this struct at this location and print out all its members.
In [12]:
memory_range = session.profile._PHYSICAL_MEMORY_DESCRIPTOR(0xfa8001793fd0)
print memory_range
[_PHYSICAL_MEMORY_DESCRIPTOR _PHYSICAL_MEMORY_DESCRIPTOR] @ 0xFA8001793FD0
  0x00 NumberOfRuns   [unsigned long:NumberOfRuns]: 0x00000002
  0x08 NumberOfPages  [unsigned long long:NumberOfPages]: 0x0007CF8E
  0x10 Run           <Array 2 x _PHYSICAL_MEMORY_RUN @ 0xFA8001793FE0>


In [13]:
for r in memory_range.Run:
    print r
[_PHYSICAL_MEMORY_RUN Run[0] ] @ 0xFA8001793FE0
  0x00 BasePage   [unsigned long long:BasePage]: 0x00000001
  0x08 PageCount  [unsigned long long:PageCount]: 0x0000009E

[_PHYSICAL_MEMORY_RUN Run[1] ] @ 0xFA8001793FF0
  0x00 BasePage   [unsigned long long:BasePage]: 0x00000100
  0x08 PageCount  [unsigned long long:PageCount]: 0x0007CEF0


So what have we found?
  • There is a symbol called nt!MmPhysicalMemoryBlock which is a pointer to a _PHYSICAL_MEMORY_DESCRIPTOR struct.
  • This struct contains the total number of runs, and a list of each run in pages (0x1000 bytes long).
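The layout summarized above can also be checked by hand. The following sketch decodes the first three rows of the hex dump at 0xfa8001793fd0 with Python's struct module, using the field offsets Rekall printed (NumberOfRuns at 0x00, NumberOfPages at 0x08, Run array at 0x10, 0x10 bytes per run). It is a standalone illustration and does not use the Rekall API.

```python
import struct

PAGE_SIZE = 0x1000

# First three rows of the hex dump at 0xfa8001793fd0 shown earlier.
DUMP = bytes.fromhex(
    "02000000000000008ecf070000000000"
    "01000000000000009e00000000000000"
    "0001000000000000f0ce070000000000"
)


def parse_descriptor(data):
    """Decode a _PHYSICAL_MEMORY_DESCRIPTOR from little-endian raw bytes."""
    (number_of_runs,) = struct.unpack_from("<I", data, 0x00)
    (number_of_pages,) = struct.unpack_from("<Q", data, 0x08)
    runs = [
        struct.unpack_from("<QQ", data, 0x10 + 0x10 * index)  # BasePage, PageCount
        for index in range(number_of_runs)
    ]
    return number_of_runs, number_of_pages, runs


number_of_runs, number_of_pages, runs = parse_descriptor(DUMP)
for base_page, page_count in runs:
    start = base_page * PAGE_SIZE
    end = (base_page + page_count) * PAGE_SIZE
    print("0x%012x 0x%012x %d" % (start, end, page_count))
```

The page counts of the two runs add up to NumberOfPages, and the first run starts at 0x1000 with length 0x9e000, which matches the ranges winpmem reported when the driver was loaded.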
Let's write a Rekall plugin for this:
In [15]:
from rekall.plugins.windows import common

class WinPhysicalMap(common.WindowsCommandPlugin):
    """Prints the boot physical memory map."""

    __name = "phys_map"

    def render(self, renderer):
        renderer.table_header([
                ("Physical Start", "phys", "[addrpad]"),
                ("Physical End", "phys", "[addrpad]"),
                ("Number of Pages", "pages", "10"),
                ])

        descriptor = self.profile.get_constant_object(
            "MmPhysicalMemoryBlock",
            target="Pointer",
            target_args=dict(
                target="_PHYSICAL_MEMORY_DESCRIPTOR",
                ))

        for memory_range in descriptor.Run:
            renderer.table_row(
                memory_range.BasePage * 0x1000,
                (memory_range.BasePage + memory_range.PageCount) * 0x1000,
                memory_range.PageCount)
This plugin will be named phys_map and essentially creates a table with three columns. The memory descriptor is created directly from the profile, then we iterate over all the runs and output the start and end range into the table.
In [16]:
phys_map
Physical Start  Physical End  Number of Pages
-------------- -------------- ---------------
0x000000001000 0x00000009f000 158       
0x000000100000 0x00007cff0000 511728    

So far, this is a pretty simple plugin. However, let's put on our black hat for a sec.
In our DFRWS 2013 paper we pointed out that most memory acquisition tools end up calling MmGetPhysicalMemoryRanges() (all the ones we tested at least), so by disabling this function we would be able to sabotage all of them. This turned out to be the case. However, patching the running code in memory triggers Microsoft's Patch Guard. In our tests we disabled Patch Guard to prove the point, but this is less practical in a real rootkit.
In reality, a rootkit would rather modify the underlying data structure behind the API call itself. This is much easier to do and won't modify any kernel code, thereby bypassing Patch Guard protections.
To test this, we can do this directly from Rekall's interactive console.
In [18]:
descriptor = session.profile.get_constant_object(
    "MmPhysicalMemoryBlock",
    target="Pointer",
    target_args=dict(
      target="_PHYSICAL_MEMORY_DESCRIPTOR",
    )).dereference()

print descriptor
[_PHYSICAL_MEMORY_DESCRIPTOR Pointer] @ 0xFA8001793FD0
  0x00 NumberOfRuns   [unsigned long:NumberOfRuns]: 0x00000002
  0x08 NumberOfPages  [unsigned long long:NumberOfPages]: 0x0007CF8E
  0x10 Run           <Array 2 x _PHYSICAL_MEMORY_RUN @ 0xFA8001793FE0>


Since we loaded the memory driver with write support, we are able to directly modify each field in the struct. For this proof of concept we simply set the NumberOfRuns to 0, but a rootkit can get creative by modifying the runs to contain holes located in strategic regions. By specifically crafting a physical memory descriptor with a hole in it, we can cause memory acquisition tools to just skip over some region of the physical memory. The responders can then walk away thinking they have their evidence, but critical information is missing.
In [19]:
descriptor.NumberOfRuns = 0
Now we can repeat our phys_map plugin, but this time, no runs will be found:
In [20]:
phys_map
Physical Start  Physical End  Number of Pages
-------------- -------------- ---------------

To unload the driver, we need to close any handles to it. We then try to acquire a memory image in the regular way.
In [32]:
session.physical_address_space.close()
In [2]:
!c:/Users/mic/winpmem_write_1.5.5.exe test.raw
Driver Unloaded.
Loaded Driver C:\Users\mic\AppData\Local\Temp\pme3879.tmp.
Will generate a RAW image
CR3: 0x0000187000
 0 memory ranges:
Acquitision mode PTE Remapping

Driver Unloaded.

This time, however, Winpmem reports no memory ranges available. The resulting image is also 0 bytes in size:
In [3]:
!dir test.raw
 Volume in drive C has no label.
 Volume Serial Number is 6438-7315

 Directory of C:\Users\mic

03/07/2014  12:02 AM                 0 test.raw
               1 File(s)              0 bytes
               0 Dir(s)   3,416,547,328 bytes free

At this point, running the DumpIt program from MoonSols will cause the system to immediately reboot. (It seems that DumpIt is unable to handle 0 memory ranges gracefully and crashes the kernel.)
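The effect of the modified descriptor on a tool that trusts it can be modeled without touching a live system. The sketch below uses the values observed earlier and two hypothetical helper functions: one computes how many bytes a tool would copy from the runs it is told about (ignoring any padding a tool may add between runs), the other performs a very basic plausibility check. Such a check is easy to defeat, since a rootkit could adjust NumberOfPages as well, so treat it as an illustration rather than a detection method.

```python
PAGE_SIZE = 0x1000

# Values observed earlier in this post: (BasePage, PageCount) per run.
ORIGINAL_RUNS = [(0x1, 0x9E), (0x100, 0x7CEF0)]
NUMBER_OF_PAGES = 0x7CF8E


def image_bytes(number_of_runs, runs):
    """Bytes copied by a tool that reads only the first `number_of_runs` runs."""
    visible = runs[:number_of_runs]
    return sum(page_count for _base, page_count in visible) * PAGE_SIZE


def plausibility_problems(number_of_runs, number_of_pages, runs):
    """Return a list of reasons why the descriptor looks inconsistent."""
    problems = []
    if number_of_runs == 0:
        problems.append("no runs reported")
    visible_pages = sum(count for _base, count in runs[:number_of_runs])
    if visible_pages != number_of_pages:
        problems.append("run page counts do not add up to NumberOfPages")
    return problems


print(image_bytes(2, ORIGINAL_RUNS))   # untouched descriptor
print(image_bytes(0, ORIGINAL_RUNS))   # after NumberOfRuns was set to 0
print(plausibility_problems(0, NUMBER_OF_PAGES, ORIGINAL_RUNS))
```

With NumberOfRuns set to 0 the tool sees nothing to copy, which is consistent with the empty test.raw produced above.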

How stable is this?

We have just disabled a kernel function, but this might de-stabilize the system. What other functions in the kernel are calling MmGetPhysicalMemoryRanges?
Let's find out by disassembling the entire kernel. First we need to find the range of memory addresses the kernel code is in. We use the peinfo plugin to show us the sections which are mapped into memory.

In [2]:
peinfo "nt"

Attribute            Value
---------------------- -----
Machine              IMAGE_FILE_MACHINE_AMD64
TimeDateStamp        2009-07-13 23:40:48+0000
Characteristics      IMAGE_FILE_EXECUTABLE_IMAGE, IMAGE_FILE_LARGE_ADDRESS_AWARE
GUID/Age             F8E2A8B5C9B74BF4A6E4A48F180099942
PDB                  ntkrnlmp.pdb
MajorOperatingSystemVersion 6
MinorOperatingSystemVersion 1
MajorImageVersion    6
MinorImageVersion    1
MajorSubsystemVersion 6
MinorSubsystemVersion 1

Sections (Relative to 0xF8000261F000):
Perm Name          VMA            Size     
---- -------- -------------- --------------
xr-  .text    0x000000001000 0x00000019b800
xr-  INITKDBG 0x00000019d000 0x000000003a00   These are Executable sections.
xr-  POOLMI   0x0000001a1000 0x000000001c00
xr-  POOLCODE 0x0000001a3000 0x000000003000
xrw  RWEXEC   0x0000001a6000 0x000000000000
-r-  .rdata   0x0000001a7000 0x00000003ca00
-rw  .data    0x0000001e4000 0x00000000fc00
-r-  .pdata   0x000000278000 0x00000002fa00
-rw  ALMOSTRO 0x0000002a8000 0x000000000800
-rw  SPINLOCK 0x0000002aa000 0x000000000a00
xr-  PAGELK   0x0000002ac000 0x000000014c00
xr-  PAGE     0x0000002c1000 0x000000232600
xr-  PAGEKD   0x0000004f4000 0x000000004e00
xr-  PAGEVRFY 0x0000004f9000 0x000000021600
xr-  PAGEHDLS 0x00000051b000 0x000000002800
xr-  PAGEBGFX 0x00000051e000 0x000000006800
-rw  PAGEVRFB 0x000000525000 0x000000000000
-r-  .edata   0x000000529000 0x000000010a00
-rw  PAGEDATA 0x00000053a000 0x000000004c00
-r-  PAGEVRFC 0x000000548000 0x000000002a00
-rw  PAGEVRFD 0x00000054b000 0x000000001400
xrw  INIT     0x00000054d000 0x000000056c00
-r-  .rsrc    0x0000005a4000 0x000000035e00
-r-  .reloc   0x0000005da000 0x000000002200

Data Directories:
-                                             VMA            Size     
---------------------------------------- -------------- --------------
IMAGE_DIRECTORY_ENTRY_EXPORT             0xf80002b48000 0x000000010962
IMAGE_DIRECTORY_ENTRY_IMPORT             0xf80002bc1cec 0x000000000078
IMAGE_DIRECTORY_ENTRY_RESOURCE           0xf80002bc3000 0x000000035d34
IMAGE_DIRECTORY_ENTRY_EXCEPTION          0xf80002897000 0x00000002f880
IMAGE_DIRECTORY_ENTRY_SECURITY           0xf80002b5ec00 0x000000001c50
IMAGE_DIRECTORY_ENTRY_BASERELOC          0xf80002bf9000 0x000000002078
IMAGE_DIRECTORY_ENTRY_DEBUG              0xf800027bb5c0 0x000000000038
IMAGE_DIRECTORY_ENTRY_COPYRIGHT          0x000000000000 0x000000000000
IMAGE_DIRECTORY_ENTRY_GLOBALPTR          0x000000000000 0x000000000000
IMAGE_DIRECTORY_ENTRY_TLS                0x000000000000 0x000000000000
IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG        0x000000000000 0x000000000000
IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT       0x000000000000 0x000000000000
IMAGE_DIRECTORY_ENTRY_IAT                0xf800027c6000 0x000000000380
IMAGE_DIRECTORY_ENTRY_DELAY_IMPORT       0x000000000000 0x000000000000
IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR     0x000000000000 0x000000000000
IMAGE_DIRECTORY_ENTRY_RESERVED           0x000000000000 0x000000000000

Import Directory (Original):
Name                                               Ord  
-------------------------------------------------- -----

Export Directory:
    Entry      Stat Ord   Name                                              
-------------- ---- ----- --------------------------------------------------
0xf80002677794 M    0     ntoskrnl.exe!AlpcGetHeaderSize (nt!AlpcGetHeaderSize)
0xf80002677760 M    1     ntoskrnl.exe!AlpcGetMessageAttribute (nt!AlpcGetMessageAttribute)
0xf80002665eb0 M    2     ntoskrnl.exe!AlpcInitializeMessageAttribute (nt!AlpcInitializeMessageAttribute)
0xf800026b5ac0 M    3     ntoskrnl.exe!CcCanIWrite (nt!CcCanIWrite)         
0xf8000262a244 M    4     ntoskrnl.exe!CcCoherencyFlushAndPurgeCache (nt!CcCoherencyFlushAndPurgeCache)
... (Truncated)
0xf80002b4d2ab M    2111  ntoskrnl.exe! (None)                              
Version Information:
key                  value
-------------------- -----

Now instead of disassembling to the interactive notebook, we store the output in a file. This does take a while but will produce a large text file containing the complete disassembly of the Windows kernel (with debugging symbols cross referenced).
In [3]:
dis offset=0xF8000261F000+0x1000, end=0xF8000261F000+0x525000, output="ntkrnl_amd64.dis"
Now we can use our favourite editor (Emacs) to check all references to MmGetPhysicalMemoryRanges. We can see references from:
  • nt!PfpMemoryRangesQuery - Part of ExpQuerySystemInformation.
  • nt!IoFillDumpHeader - Called from crashdump facility.
  • nt!IopGetPhysicalMemoryBlock - Called from crashdump facility.
We can also check references to MmPhysicalMemoryBlock. Many of these functions appear related to the Hot-Add memory functionality:
  • nt!IoSetDumpRange
  • nt!MiFindContiguousPages
  • nt!MmIdentifyPhysicalMemory
  • nt!MmReadProcessPageTables
  • nt!MiAllocateMostlyContiguous
  • nt!IoFillDumpHeader
  • nt!MiReleaseAllMemory
  • nt!MmDuplicateMemory
  • nt!MiRemovePhysicalMemory
  • nt!MmAddPhysicalMemory
  • nt!MmGetNumberOfPhysicalPages - This seems to be called from Hibernation code.
  • nt!MiScanPagefileSpace
  • nt!MmPerfSnapShotValidPhysicalMemory
  • nt!MmGetPhysicalMemoryRanges
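The search done by hand in the editor can also be scripted. The following is a minimal sketch, assuming the disassembly file follows the layout Rekall's dis plugin prints elsewhere on this blog: a "------ module!function ------" header opens each function, followed by one instruction per line with resolved symbol names in the comment column. The helper name find_referencing_functions and the sample lines are hypothetical and are not part of Rekall.

```python
import re

# A function header in the dis output, e.g.
# "------ ntoskrnl.exe!PsSetLoadImageNotifyRoutine ------" (assumed layout).
HEADER = re.compile(r"^-+ (\S+!\S+) -+$")


def find_referencing_functions(lines, symbol):
    """Return the functions whose disassembly mentions symbol, in file order."""
    current = None
    found = []
    for line in lines:
        header = HEADER.match(line.strip())
        if header:
            current = header.group(1)
            continue
        if current is None or symbol not in line:
            continue
        # Skip self references such as jumps inside the symbol's own body.
        if current.endswith("!" + symbol):
            continue
        if current not in found:
            found.append(current)
    return found


# Synthetic sample lines, made up for illustration only.
sample = [
    "------ nt!IoFillDumpHeader ------",
    "0x1000    0 e800000000  CALL 0x2000    nt!MmGetPhysicalMemoryRanges",
    "------ nt!MmGetPhysicalMemoryRanges ------",
    "0x2000    0 7507        JNZ 0x2009     nt!MmGetPhysicalMemoryRanges + 0x9",
]
print(find_referencing_functions(sample, "MmGetPhysicalMemoryRanges"))
```

To run it over the real output, pass an open file object for ntkrnl_amd64.dis instead of the sample list. The match is a plain substring test, so a symbol whose name is a prefix of another symbol will also be reported.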
Some testing remains to see how stable this modification is in practice. Hot-Add memory will probably no longer work, and hibernation may fail (hibernation is an alternate way to capture memory images, as Rekall can also operate on hibernation files). Although the above suggests that crash dumps are affected, I produced a crash dump after this modification and it still worked as expected (which is kind of interesting in itself).

PS

This note was written inside Rekall itself by using the IPython notebook interface.

Friday, February 21, 2014

Do we need the Kernel Debugging Block?

I have previously written a blog article describing the Kernel Debugging Block (KDBG) in detail (http://scudette.blogspot.ch/2012/11/finding-kernel-debugger-block.html), as it is used by Volatility in order to "bootstrap" the analysis process. Many plugins require a list of processes, and Volatility uses the KDBG in order to locate the PsActiveProcessHead symbol (which is the head of the doubly linked list holding the _EPROCESS objects together).
Recently, the Volatility blog reminded us that the KDBG is critical for memory analysis. In that post, the author recognizes that the KDBG block is encoded on Windows 8 and cannot readily be found using the usual kdbgscan plugin. In particular, that blog post states:
An encoded KDBG can have a hugely negative effect on your ability to perform memory forensics. This structure contains a lot of critical details about the system, including the pointers to the start of the lists of active processes and loaded kernel modules, the address of the PspCid handle table, the ranges for the paged and non-paged pools, etc. If all of these fields are encoded, your day becomes that much more difficult.
We have previously demonstrated in our OSDFC training workshop that the KDBG block can be trivially overwritten without affecting system stability. Since the kdbgscan plugin simply scans for the plain text "KDBG" signature, overwriting this signature makes it impossible to locate the KDBG or to bootstrap memory analysis. Indeed, with Volatility you are going to have a really bad day. It is still possible to work around this limitation, and our workshop describes all the workarounds available, but it is definitely not ideal.
This problem was also discussed in the Black Hat talk One-byte Modification for Breaking Memory Forensic Analysis.
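To make the fragility concrete, here is a minimal sketch of a plain signature scan over a synthetic buffer. It is not the kdbgscan code, and the helper scan_signature is a hypothetical name. It only illustrates that a scan keyed on the "KDBG" tag stops finding anything once a single byte of the tag is changed.

```python
def scan_signature(memory, signature=b"KDBG"):
    """Return every offset at which signature occurs in memory."""
    offsets = []
    position = memory.find(signature)
    while position != -1:
        offsets.append(position)
        position = memory.find(signature, position + 1)
    return offsets


# Synthetic stand-in for a memory image: padding, the tag, more padding.
image = bytearray(b"\x00" * 64 + b"KDBG" + b"\x00" * 64)
print(scan_signature(image))   # the tag is found at offset 64

# Overwrite a single byte of the tag, as could be done on a live system.
image[64] = ord("X")
print(scan_signature(image))   # the scan now comes back empty
```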

Does Rekall use the KDBG?

Volatility Windows profiles are typically generated using the pdbparse project, using the pdb_tpi_vtypes.py script. They normally only contain the vtype definitions (embedded into Python files, for example vista_sp0_x64_vtypes.py).
While developing the Rekall profile system (which is described in detail in previous blog posts), new profiles were generated for Windows kernels. Rather than rely on the pdbparse project to parse the PDB files, we have implemented a complete Microsoft PDB parser within the Rekall framework (this will be described in a future blog post).
Microsoft PDB files contain a number of streams. One of the streams describes struct definitions and can be used to generate the vtypes. Interestingly, however, a few more streams contain the global symbols. (The pdbparse project does provide an additional script to extract the constants from the PDB file, but that script is not currently used by Volatility.)
In other words, the PDB file contains the addresses in memory of many symbols. This is akin to the System.map file we use when analyzing a Linux memory image. Let's examine a typical Rekall Windows profile:
{
 "$CONSTANTS": {
.....
  "PromoteNode": 611168,
  "PropertyEval": 451884,
  "PsAcquireProcessExitSynchronization": 1157620,
  "PsActiveProcessHead": 96160,
  "PsAssignImpersonationToken": 1479504,
  "PsBoostThreadIo": 219912,
....
  "KdD3Transition": 805316,
  "KdDebuggerDataBlock": 2003056,
  "KdDebuggerEnabled": 2562992,
  "KdDebuggerInitialize0": 805256,
  "KdDebuggerInitialize1": 805244,
...
We can see that the typical Microsoft kernel PDB file contains a huge number of symbols which are not exported in the PE export table. In particular, we see the symbol PsActiveProcessHead, which is required to list processes. We also see the exact location of the Kernel Debugger block in the KdDebuggerDataBlock symbol (just in case we need it). Each symbol offset is specified relative to the kernel base address (i.e. the MZ header where the kernel is mapped into memory).
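Since the constants are offsets from the kernel base, turning one into a virtual address is a single addition. The sketch below shows this on two entries copied from the excerpt above. The helper symbol_address is a hypothetical name, and the kernel base value is only an illustrative assumption: the excerpt is a generic profile, so pairing it with this base does not describe a real image.

```python
import json

# Two entries copied from the "$CONSTANTS" excerpt above.
profile = json.loads("""
{
 "$CONSTANTS": {
  "PsActiveProcessHead": 96160,
  "KdDebuggerDataBlock": 2003056
 }
}
""")

# Illustrative kernel base (assumption, not tied to this profile).
KERNEL_BASE = 0xF8000261F000


def symbol_address(profile, kernel_base, name):
    """Virtual address of a symbol: kernel base plus the profile offset."""
    return kernel_base + profile["$CONSTANTS"][name]


for name in ("PsActiveProcessHead", "KdDebuggerDataBlock"):
    print(name, hex(symbol_address(profile, KERNEL_BASE, name)))
```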
Let us examine in detail the steps that Rekall goes through in the pslist module by enabling verbose logging:
$ rekall --verbose -f  ~/images/win7.elf pslist
.....
INFO:root:Autodetected physical address space Elf64CoreDump                     1
DEBUG:root:Opened url http://profiles.rekall.googlecode.com/git//pe.gz
INFO:root:Loaded profile pe from URL:http://profiles.rekall.googlecode.com/git/ 2
DEBUG:root:Verifying profile GUID/F8E2A8B5C9B74BF4A6E4A48F180099942             3
DEBUG:root:Opened url http://profiles.rekall.googlecode.com/git//GUID/F8E2A8B5C9B74BF4A6E4A48F180099942.gz
DEBUG:root:Opened url http://profiles.rekall.googlecode.com/git//ntoskrnl.exe/AMD64/6.1.7600.16385/F8E2A8B5C9B74BF4A6E4A48F180099942.gz
INFO:root:Loaded profile ntoskrnl.exe/AMD64/6.1.7600.16385/F8E2A8B5C9B74BF4A6E4A48F180099942 from URL:http://profiles.rekall.googlecode.com/git/
INFO:root:Loaded profile GUID/F8E2A8B5C9B74BF4A6E4A48F180099942 from URL:http://profiles.rekall.googlecode.com/git/
DEBUG:root:Found _EPROCESS @ 0x2818140 (DTB: 0x187000)                          4
INFO:root:Detected ntkrnlmp.pdb with GUID F8E2A8B5C9B74BF4A6E4A48F180099942
  Offset (V)   Name                    PID   PPID   Thds     Hnds   Sess  Wow64 Start                    Exit
-------------- -------------------- ------ ------ ------ -------- ------ ------ ------------------------ ------------------------
INFO:root:Detected kernel base at 0xF8000261F000                                5
0xfa80008959e0 System                    4      0     84      511 ------  False 2012-10-01 21:39:51+0000 -
0xfa8001994310 smss.exe                272      4      2       29 ------  False 2012-10-01 21:39:51+0000 -
0xfa8002259060 csrss.exe               348    340      9      436      0  False 2012-10-01 21:39:57+0000 -
(1) Rekall auto-detects that this image is contained in an ELF core dump file (Elf64CoreDump).
(2) Rekall now contacts the profile repository to retrieve the parser for the PE file format.
(3) The PE profile is used to scan for RSDS signatures. These are verified, so we can be pretty confident that we loaded the exact profile for this image.
(4) The kernel DTB is located by scanning for the Idle process.
(5) We now find the kernel’s base address. Once that is known, the addresses of all symbols in the kernel’s virtual address space are known directly from the profile, i.e. we do not need to scan for anything because we already know where everything is.
Rekall generally does not need to use the KDBG at all. This is much faster since it does not need to scan for it, but more importantly, it is much more robust because malware cannot overwrite the PsActiveProcessHead symbol without crashing the system.
Since Rekall uses a profile repository we are able to locate the exact profile for the kernel we are analyzing. Therefore we do not need to scan for anything - we always prefer to just read the exact addresses from the profile without guessing. This makes analysis far more robust and simple.
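The RSDS scan in step 3 can be sketched as follows. This is not Rekall's implementation, which also verifies each hit. It is a minimal illustration assuming the standard CodeView layout of an RSDS record: the 4-byte "RSDS" tag, a 16-byte GUID, a 4-byte age and a NUL-terminated PDB file name. The helper name scan_rsds is hypothetical, and the sample record is synthetic, built from the GUID and PDB name shown in the log above.

```python
import struct


def scan_rsds(data):
    """Yield (offset, guid_age, pdb_name) for each RSDS record found in data."""
    offset = data.find(b"RSDS")
    while offset != -1:
        header = data[offset + 4:offset + 24]
        name_end = data.find(b"\x00", offset + 24)
        if len(header) == 20 and name_end != -1:
            # GUID: three little-endian fields plus 8 raw bytes, then the age.
            d1, d2, d3, d4, age = struct.unpack("<IHH8sI", header)
            guid_age = "%08X%04X%04X%s%X" % (d1, d2, d3, d4.hex().upper(), age)
            pdb_name = data[offset + 24:name_end].decode("ascii", "replace")
            yield offset, guid_age, pdb_name
        offset = data.find(b"RSDS", offset + 1)


# Synthetic record built from the GUID and PDB name in the log above.
record = (b"RSDS"
          + struct.pack("<IHH", 0xF8E2A8B5, 0xC9B7, 0x4BF4)
          + bytes.fromhex("A6E4A48F18009994")
          + struct.pack("<I", 2)
          + b"ntkrnlmp.pdb\x00")
image = b"\x00" * 32 + record

for hit in scan_rsds(image):
    print(hit)
```

The GUID and age string produced this way has the same form as the profile name requested from the repository in the log (GUID/F8E2A8B5C9B74BF4A6E4A48F180099942).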

Another example, the callbacks plugin.

Another example of this technique is the callbacks plugin. Here, Volatility resorts to disassembling various exported functions to try to locate the offset of a number of non-exported callback pointer tables (e.g. PsSetLoadImageNotifyRoutine is disassembled to get to PspLoadImageNotifyRoutine). This algorithm is fragile and complex. It also only works on 32-bit systems at the moment, since signatures need to be developed for each architecture.
However, this algorithm is entirely unnecessary if one uses the correct profile for the exact kernel version. You can simply look up the exact addresses of the (non-exported) symbols you need. Here is the Rekall code:
        routines = ["_PspLoadImageNotifyRoutine",             # (1)
                    "_PspCreateThreadNotifyRoutine",
                    "_PspCreateProcessNotifyRoutine"]

        for symbol in routines:
            # The list is an array of 8 _EX_FAST_REF objects
            addrs = self.profile.get_constant_object(         # (2)
                symbol,
                target="Array",
                target_args=dict(
                    count=8,
                    target='_EX_FAST_REF')
                )

            for addr in addrs:                                # (3)
                callback = addr.dereference_as("_GENERIC_CALLBACK")
                if callback:
                    yield "GenericKernelCallback", callback.Callback, None
(1) We look up each one of these symbols by name.
(2) We use the profile directly to instantiate an array of 8 _EX_FAST_REF.
(3) We dereference each of the addresses to find the callbacks.
There is no need to scan or disassemble anything to retrieve the symbol addresses, since we know exactly where they are already.
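One detail hidden behind the dereference step is that an _EX_FAST_REF is not a plain pointer: the low bits of the value hold a reference count and must be masked off to recover the object address. The sketch below assumes the usual layout (4 count bits on 64-bit kernels, 3 on 32-bit ones). The helper decode_fast_ref is a hypothetical name, not a Rekall API. The sample value is the one the disassembly listing later in this post shows for PspLoadImageNotifyRoutine.

```python
def decode_fast_ref(value, is_64bit=True):
    """Split an _EX_FAST_REF value into (object pointer, reference count)."""
    # Assumed layout: the low 4 bits (64-bit) or 3 bits (32-bit) are the count.
    mask = 0xF if is_64bit else 0x7
    return value & ~mask, value & mask


pointer, refcount = decode_fast_ref(0xFFFFF8A0001310BF)
print(hex(pointer), refcount)
```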

What else can we do with profile constants?

The amount of information provided in the kernel PDB files is truly extensive. Not only does Microsoft provide non-exported function names, but also global names, string names, import table entries and much more.
This is extremely useful when disassembling code in Rekall. Since Rekall disassembles the code which is resident in memory, all relocations, imports, exports etc. have already been resolved by the kernel. In other words, if we see a memory reference, we can resolve it to know where it is or what it is without considering imports.
Here is an example of disassembling the PsSetLoadImageNotifyRoutine routine on a 64-bit image (this is what Volatility is doing in the callbacks plugin).
$ rekall -f  ~/images/win7.elf dis 'ntoskrnl.exe!PsSetLoadImageNotifyRoutine'
   Address      Rel Op Codes             Instruction                    Comment
-------------- ---- -------------------- ------------------------------ -------
------ ntoskrnl.exe!PsSetLoadImageNotifyRoutine ------
0xf80002aa1050    0 48895c2408           MOV [RSP+0x8], RBX
0xf80002aa1055    5 57                   PUSH RDI
0xf80002aa1056    6 4883ec20             SUB RSP, 0x20
0xf80002aa105a    A 33d2                 XOR EDX, EDX
0xf80002aa105c    C e8bfb1feff           CALL 0xf80002a8c220            ntoskrnl.exe!ExAllocateCallBack
0xf80002aa1061   11 488bf8               MOV RDI, RAX
0xf80002aa1064   14 4885c0               TEST RAX, RAX
0xf80002aa1067   17 7507                 JNZ 0xf80002aa1070             ntoskrnl.exe!PsSetLoadImageNotifyRoutine + 0x20
0xf80002aa1069   19 b89a0000c0           MOV EAX, 0xffffffffc000009a
0xf80002aa106e   1E eb4a                 JMP 0xf80002aa10ba             ntoskrnl.exe!PsSetLoadImageNotifyRoutine + 0x6A
0xf80002aa1070   20 33db                 XOR EBX, EBX
0xf80002aa1072   22 488d0d27d4d9ff       LEA RCX, [RIP-0x262bd9]        0xFFFFF8A0001310BF ntoskrnl.exe!PspLoadImageNotifyRoutine
0xf80002aa1079   29 4533c0               XOR R8D, R8D
0xf80002aa107c   2C 488bd7               MOV RDX, RDI
0xf80002aa107f   2F 488d0cd9             LEA RCX, [RCX+RBX*8]
0xf80002aa1083   33 e8c817f8ff           CALL 0xf80002a22850            ntoskrnl.exe!ExCompareExchangeCallBack
0xf80002aa1088   38 84c0                 TEST AL, AL
0xf80002aa108a   3A 7511                 JNZ 0xf80002aa109d             ntoskrnl.exe!PsSetLoadImageNotifyRoutine + 0x4D
0xf80002aa108c   3C ffc3                 INC EBX
0xf80002aa108e   3E 83fb08               CMP EBX, 0x8
0xf80002aa1091   41 72df                 JB 0xf80002aa1072              ntoskrnl.exe!PsSetLoadImageNotifyRoutine + 0x22
0xf80002aa1093   43 488bcf               MOV RCX, RDI
0xf80002aa1096   46 e805e9f5ff           CALL 0xf800029ff9a0            ntoskrnl.exe!IopDeallocateApc
0xf80002aa109b   4B ebcc                 JMP 0xf80002aa1069             ntoskrnl.exe!PsSetLoadImageNotifyRoutine + 0x19
0xf80002aa109d   4D f083053bd4d9ff01     LOCK ADD DWORD [RIP-0x262bc5], 0x1 0x1 ntoskrnl.exe!PspLoadImageNotifyRoutineCount
0xf80002aa10a5   55 8b05d5d3d9ff         MOV EAX, [RIP-0x262c2b]        0x7 ntoskrnl.exe!PspNotifyEnableMask
0xf80002aa10ab   5B a801                 TEST AL, 0x1
0xf80002aa10ad   5D 7509                 JNZ 0xf80002aa10b8             ntoskrnl.exe!PsSetLoadImageNotifyRoutine + 0x68
0xf80002aa10af   5F f00fba2dc8d3d9ff00   LOCK BTS DWORD [RIP-0x262c38], 0x0 0x7 ntoskrnl.exe!PspNotifyEnableMask
0xf80002aa10b8   68 33c0                 XOR EAX, EAX
0xf80002aa10ba   6A 488b5c2430           MOV RBX, [RSP+0x30]
0xf80002aa10bf   6F 4883c420             ADD RSP, 0x20
0xf80002aa10c3   73 5f                   POP RDI
We can see that addresses are resolved according to the known symbols at that address (In the Volatility code we are actually after the PspLoadImageNotifyRoutine address).
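The resolution shown in the Comment column can be reproduced by hand in two steps: compute the target of a RIP-relative operand, then name that address using the nearest known symbol at or below it. The following is a minimal sketch of both steps, not Rekall's code. The helper names are hypothetical, and the three symbol offsets were derived from the listing above by subtracting the kernel base from the pslist log, not read from an actual profile.

```python
import bisect
import struct

KERNEL_BASE = 0xF8000261F000  # kernel base reported in the pslist log

# Hypothetical symbol table (offsets relative to the kernel base), derived
# from addresses visible in the listing above rather than from a real profile.
SYMBOLS = {
    "PspLoadImageNotifyRoutine": 0x21F4A0,
    "ExAllocateCallBack": 0x46D220,
    "PsSetLoadImageNotifyRoutine": 0x482050,
}


def rip_relative_target(address, opcodes):
    """Target of an instruction whose last four bytes are a signed disp32.

    This holds for CALL rel32 and LEA reg, [RIP+disp32], but not for
    instructions that carry an immediate after the displacement.
    """
    raw = bytes.fromhex(opcodes)
    (displacement,) = struct.unpack("<i", raw[-4:])
    # RIP already points at the next instruction when the operand is evaluated.
    return address + len(raw) + displacement


def resolve(address, symbols=SYMBOLS, base=KERNEL_BASE, module="ntoskrnl.exe"):
    """Name an address as module!symbol + offset using the nearest symbol below."""
    table = sorted((offset, name) for name, offset in symbols.items())
    offsets = [offset for offset, _ in table]
    index = bisect.bisect_right(offsets, address - base) - 1
    if index < 0:
        return hex(address)
    offset, name = table[index]
    delta = address - base - offset
    suffix = " + 0x%X" % delta if delta else ""
    return "%s!%s%s" % (module, name, suffix)


# The CALL at 0xf80002aa105c and the LEA at 0xf80002aa1072 from the listing.
print(resolve(rip_relative_target(0xF80002AA105C, "e8bfb1feff")))
print(resolve(rip_relative_target(0xF80002AA1072, "488d0d27d4d9ff")))
# A jump target inside the function itself.
print(resolve(0xF80002AA1070))
```

A real profile holds many thousands of symbols, so in practice the sorted table would be built once and reused rather than rebuilt on every lookup.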