Friday, February 21, 2014

Do we need the Kernel Debugging Block?

I have written a blog article in the past describing the Kernel Debugging Block (KDBG) in detail as it is used by Volatility in order to "bootstrap" the analysis process. Many plugins require a list of processes, and Volatility uses the KDBG in order to locate the PsActiveProcessHead symbol (which is the head of the doubly linked list holding the _EPROCESS objects together).
Recently, the Volatility blog reminded us that the KDBG is critical for memory analysis. In that post, the author recognizes that the KDBG block is encoded on Window 8 and is not readily scanned for using the usual kdbgscan plugin. In particular that blog post states:
An encoded KDBG can have a hugely negative effect on your ability to perform memory forensics. This structure contains a lot of critical details about the system, including the pointers to the start of the lists of active processes and loaded kernel modules, the address of the PspCid handle table, the ranges for the paged and non-paged pools, etc. If all of these fields are encoded, your day becomes that much more difficult.
We have previously demonstrated in our OSDFC training workshop that the KDBG block can be trivially overwritten without affecting system stability. Since the kdbgscan plugin simply scans for the plain text "KDBG" signature, by overwriting this signature it is impossible to locate the KDBG, nor bootstrap memory analysis. Indeed with Volatility you are going to have a really bad day. It is still possible to workaround this limitation, and our workshop describes all the workarounds available, but it is definitely not ideal.
This problem was also discussed in the Black Hat talk One-byte Modification for Breaking Memory Forensic Analysis.

Does Rekall use the KDBG?

Volatility windows profiles are typically generated using the pdbparse project, using the script. They normally only contain the vtype definitions (embedded into python files, for example
While developing the Rekall profile system (which is described in detail in previous blog posts), new profiles were generated for windows kernels. Rather than rely on the pdbparse project to parse the pdb files, we have implemented a complete Microsoft PDB parser within the Rekall framework (This will be described in a future blog post).
Microsoft PDB files contain a number of streams. One of the streams describes struct definitions and can be used to generate the vtypes. However, interestingly, there are a few more streams which extract global symbols from the PDB file. (The pdbparse project does provide am additional script to extract the constants from the pdb file, but that script is not currently used by Volatility).
In other words, the PDB file contains the addresses in memory of many symbols. This is akin to the file we use when analyzing a Linux memory image. Lets examine a typical Rekall windows profile:
  "PromoteNode": 611168,
  "PropertyEval": 451884,
  "PsAcquireProcessExitSynchronization": 1157620,
  "PsActiveProcessHead": 96160,
  "PsAssignImpersonationToken": 1479504,
  "PsBoostThreadIo": 219912,
  "KdD3Transition": 805316,
  "KdDebuggerDataBlock": 2003056,
  "KdDebuggerEnabled": 2562992,
  "KdDebuggerInitialize0": 805256,
  "KdDebuggerInitialize1": 805244,
We can see that the typical Microsoft kernel PDB file contains a huge number of symbols which are not exported in the PE export table. In particular we see the symbol PsActiveProcessHead which is required to list processes. We also see the exact location of the Kernel Debugger block in KdDebuggerDataBlock symbol (Just in case we need it). The symbol offset is specified relative to the Kernel Base address (i.e. the MZ header where the kernel is mapped into memory).
Let us examine in detail the steps that Rekall goes through in the pslist module by enabling verbose logging:
$ rekall --verbose -f  ~/images/win7.elf pslist
INFO:root:Autodetected physical address space Elf64CoreDump                     1
DEBUG:root:Opened url
INFO:root:Loaded profile pe from URL: 2
DEBUG:root:Verifying profile GUID/F8E2A8B5C9B74BF4A6E4A48F180099942             3
DEBUG:root:Opened url
DEBUG:root:Opened url
INFO:root:Loaded profile ntoskrnl.exe/AMD64/6.1.7600.16385/F8E2A8B5C9B74BF4A6E4A48F180099942 from URL:
INFO:root:Loaded profile GUID/F8E2A8B5C9B74BF4A6E4A48F180099942 from URL:
DEBUG:root:Found _EPROCESS @ 0x2818140 (DTB: 0x187000)                          4
INFO:root:Detected ntkrnlmp.pdb with GUID F8E2A8B5C9B74BF4A6E4A48F180099942
  Offset (V)   Name                    PID   PPID   Thds     Hnds   Sess  Wow64 Start                    Exit
-------------- -------------------- ------ ------ ------ -------- ------ ------ ------------------------ ------------------------
INFO:root:Detected kernel base at 0xF8000261F000                                5
0xfa80008959e0 System                    4      0     84      511 ------  False 2012-10-01 21:39:51+0000 -
0xfa8001994310 smss.exe                272      4      2       29 ------  False 2012-10-01 21:39:51+0000 -
0xfa8002259060 csrss.exe               348    340      9      436      0  False 2012-10-01 21:39:57+0000 -
1Rekall auto-detects this image as contained in an EWF file.
2Rekall now contacts the profile repository to retrieve the parser for the PE file format.
3The PE profile is used to scan for RSDS signatures. These are verified so we can be pretty confident that we loaded the exact profile for this image.
4The Kernel DTB is located by scanning for the Idle process.
5We now find the kernel’s base address. Once that is known, the addresses of all symbols in the kernel’s virtual address space are known directly from the profile. i.e. We do not need to scan for anything, we already know where everything is.
Rekall generally does not need to use the KDBG at all. This is much faster since it does not need to scan for it, but more importantly, is much more robust because malware can not overwrite thePsActiveProcessHead symbol without crashing the system.
Since Rekall uses a profile repository we are able to locate the exact profile for the kernel we are analyzing. Therefore we do not need to scan for anything - we always prefer to just read the exact addresses from the profile without guessing. This makes analysis far more robust and simple.

Another example, the callbacks plugin.

Another example of this technique is the callbacks plugin. Here, Volatility resorts to disassembling various exported functions to try to locate the offset of a number of non-exported callback pointer tables (e.g. PsSetLoadImageNotifyRoutine is disassembled to get to PspLoadImageNotifyRoutine). This algorithm is pretty fragile and complex. It also only works on 32 bit systems at the moment, since signatures need to be developed for different architectures.
However, this algorithm is entirely not needed, if one uses the correct profile for the exact kernel version. You can simply look up the exact addresses of the (non-exported) symbols you need. Here is the Rekall code:
        routines = ["_PspLoadImageNotifyRoutine",             1

        for symbol in routines:
            # The list is an array of 8 _EX_FAST_REF objects
            addrs = self.profile.get_constant_object(         2

            for addr in addrs:                                3
                callback = addr.dereference_as("_GENERIC_CALLBACK")
                if callback:
                    yield "GenericKernelCallback", callback.Callback, None
1We look up each one of these symbols by name.
2We use the profile directly to instanstiate an array of 8 _EX_FAST_REF.
3We dereference each of the addresses to find the callbacks.
There is no need to scan or disassemble anything to retrieve the symbol addresses, since we know exactly where they are already.

What else can we do with profile constants?

The amount of information provided in the kernel PDB files is truly extensive. Not only does Microsoft provide non-exported function names, but also global names, string names, import table entries and much more.
This is extremely useful when disassembling code in Rekall. Since Rekall disassembles the code which is resident in memory, all relocations, imports, exports etc have already been done by the kernel. In other words if we see a memory reference, we can resolve it to know where it is or what it is without considering imports.
Here is an example of disassembling the PsSetLoadImageNotifyRoutine routine on a 64 bit image (This is what Volatility is doing in the callbacks plugin).
$ rekall -f  ~/images/win7.elf dis 'ntoskrnl.exe!PsSetLoadImageNotifyRoutine'
   Address      Rel Op Codes             Instruction                    Comment
-------------- ---- -------------------- ------------------------------ -------
------ ntoskrnl.exe!PsSetLoadImageNotifyRoutine ------
0xf80002aa1050    0 48895c2408           MOV [RSP+0x8], RBX
0xf80002aa1055    5 57                   PUSH RDI
0xf80002aa1056    6 4883ec20             SUB RSP, 0x20
0xf80002aa105a    A 33d2                 XOR EDX, EDX
0xf80002aa105c    C e8bfb1feff           CALL 0xf80002a8c220            ntoskrnl.exe!ExAllocateCallBack
0xf80002aa1061   11 488bf8               MOV RDI, RAX
0xf80002aa1064   14 4885c0               TEST RAX, RAX
0xf80002aa1067   17 7507                 JNZ 0xf80002aa1070             ntoskrnl.exe!PsSetLoadImageNotifyRoutine + 0x20
0xf80002aa1069   19 b89a0000c0           MOV EAX, 0xffffffffc000009a
0xf80002aa106e   1E eb4a                 JMP 0xf80002aa10ba             ntoskrnl.exe!PsSetLoadImageNotifyRoutine + 0x6A
0xf80002aa1070   20 33db                 XOR EBX, EBX
0xf80002aa1072   22 488d0d27d4d9ff       LEA RCX, [RIP-0x262bd9]        0xFFFFF8A0001310BF ntoskrnl.exe!PspLoadImageNotifyRoutine
0xf80002aa1079   29 4533c0               XOR R8D, R8D
0xf80002aa107c   2C 488bd7               MOV RDX, RDI
0xf80002aa107f   2F 488d0cd9             LEA RCX, [RCX+RBX*8]
0xf80002aa1083   33 e8c817f8ff           CALL 0xf80002a22850            ntoskrnl.exe!ExCompareExchangeCallBack
0xf80002aa1088   38 84c0                 TEST AL, AL
0xf80002aa108a   3A 7511                 JNZ 0xf80002aa109d             ntoskrnl.exe!PsSetLoadImageNotifyRoutine + 0x4D
0xf80002aa108c   3C ffc3                 INC EBX
0xf80002aa108e   3E 83fb08               CMP EBX, 0x8
0xf80002aa1091   41 72df                 JB 0xf80002aa1072              ntoskrnl.exe!PsSetLoadImageNotifyRoutine + 0x22
0xf80002aa1093   43 488bcf               MOV RCX, RDI
0xf80002aa1096   46 e805e9f5ff           CALL 0xf800029ff9a0            ntoskrnl.exe!IopDeallocateApc
0xf80002aa109b   4B ebcc                 JMP 0xf80002aa1069             ntoskrnl.exe!PsSetLoadImageNotifyRoutine + 0x19
0xf80002aa109d   4D f083053bd4d9ff01     LOCK ADD DWORD [RIP-0x262bc5], 0x1 0x1 ntoskrnl.exe!PspLoadImageNotifyRoutineCount
0xf80002aa10a5   55 8b05d5d3d9ff         MOV EAX, [RIP-0x262c2b]        0x7 ntoskrnl.exe!PspNotifyEnableMask
0xf80002aa10ab   5B a801                 TEST AL, 0x1
0xf80002aa10ad   5D 7509                 JNZ 0xf80002aa10b8             ntoskrnl.exe!PsSetLoadImageNotifyRoutine + 0x68
0xf80002aa10af   5F f00fba2dc8d3d9ff00   LOCK BTS DWORD [RIP-0x262c38], 0x0 0x7 ntoskrnl.exe!PspNotifyEnableMask
0xf80002aa10b8   68 33c0                 XOR EAX, EAX
0xf80002aa10ba   6A 488b5c2430           MOV RBX, [RSP+0x30]
0xf80002aa10bf   6F 4883c420             ADD RSP, 0x20
0xf80002aa10c3   73 5f                   POP RDI
We can see that addresses are resolved according to the known symbols at that address (In the Volatility code we are actually after the PspLoadImageNotifyRoutine address).


  1. Is the RSDS portion of ntoskrnl.exe part of the data protected by PatchGuard?

  2. Hi Brendan,
    I did not think so, but I thought I better check. I loaded the winpmem with write support and overwrote one byte of the GUID for ntoskrnl. System continued running with no problems. Rekall failed to find the correct profile, though.

    One of the comments that people gave me for this post is that we are essentially substituting the KDBG signature for the RSDS signature - we are still vulnerable to simple anti forensics. This is true and it is something we are working on (I will try to write another blog post next week about it).

    The short description is that we have implemented an index to the profiles in the repository. The index is built from all profiles we know about (so it can not be used to retrieve GUIDs not in the repository), but it essentially does not rely on the RSDS signature, but on known bytes inside the binary itself (code section and data section). These sections are protected by patch guard (by definition) and so this is a much stronger signal. Its also not practical for an attacker to change the index signature since the index can easily be regenerated to use different comparison points.

    The overall effect is that it is much faster than scanning since looking up the index only involves some in memory string comparisons, and it is way more robust. If the profile is not recognized from the index, we fall back to the RSDS scan as before - this helps us identify profiles we do not have yet in the repository.