Friday, April 24, 2015

The Pmem Memory acquisition suite.

The Rekall project has maintained a set of open source memory acquisition tools for a while now. After all, memory acquisition is the first step in memory analysis: before any analysis can be done, we need to acquire the memory in the first place. There are a number of commercial solutions for acquiring memory, but sadly the open source solutions have been abandoned or are no longer maintained (for example, win32dd was a popular solution many years ago but has since been commercialized and is no longer open source).
We believe in open source forensic tools to make testing and transparency easier. We also believe that the availability of open source solutions spurs further development in the field and enables choices.
That is the reason we feel an open source, well tested and capable forensic memory acquisition tool is essential - we call it the Pmem suite of tools. The pmem acquisition tool aims to provide a complete imaging solution for Windows, Linux and OSX (OSXPmem is the only memory acquisition tool we are aware of, commercial or open source, that works on the latest version of OSX, 10.10.x).
As we continue to develop Rekall into the most powerful memory forensic platform, we found that we needed to extend the acquisition tool. For example, when Rekall gained the ability to analyze the Windows pagefile, it became important that the acquisition tool also collect the pagefile during acquisition. Similarly, we require the tool to collect critical system binaries.
We realized that we were in a unique position - not only are we developing the most cutting edge memory analysis tool, but we are also developing the most advanced memory acquisition tool. By being in control of the development process of both tools, we can leverage the acquisition to assist the analysis, and leverage the analysis to improve the acquisition.
For example, one of the first things that a memory analysis framework requires is to derive the location of the page tables (dtb or CR3), the location of the kernel image in memory (kaslr shift) or the exact version of the kernel. All of these facts are immediately available to the acquisition tool at acquisition time - if only there was a way for the acquisition tool to store this metadata in the image, we would be able to analyze the image faster and more accurately.
Similarly, we often analyze memory images we acquired and discover that we left some evidence behind during acquisition time - for example, if we try to dump executables from memory, we might discover that many file mapped pages are not present in the image. If only we could have acquired these files during the acquisition time…
Our goal is to create a synergy between analysis and acquisition - collect as much information as we can during the acquisition stage, driven by preliminary analysis.
In order to do this preliminary triaging, we need to gain access to the live physical memory of the system. Pmem is the only suite of memory acquisition tools that allows for live forensics of the system it is running on. While other acquisition tools are designed to dump memory image files from kernel space, pmem tools generally pass data into user space and allow user space processes direct access to physical memory.
It turns out that as physical memory sizes increase it takes so long to copy a complete image out to disk, that smear is becoming a significant problem (e.g. on very large servers). In this case live forensic analysis is the only practical solution since the physical memory is examined over a very short period of time (think running a pslist plugin which just follows a linked list).
We actually believe live memory analysis is the way forward.

Image file format

Traditionally acquisition tools (like dd) simply wrote out a RAW format image. This is by far the simplest image file format. In this format, the physical address space is written byte for byte directly into the image file.
The nice thing about a raw image is that you don’t need any special tools to read it - every byte in the file corresponds to the same address in physical memory. Some of the earliest memory analysis tools therefore only worked on RAW images.
However there are a number of problems with RAW images:
  • No ability to store sparse regions - all reserved regions must be padded in the image with zeros giving a larger image size. For example if you have 4GB of RAM, there will be about 1GB PCI hole reserved for DMA (e.g. video cards), so the RAW image is actually 5GB in size.
  • No support for compression, encryption etc. This is a problem because sometimes using a fast compressor can actually produce higher throughput by minimizing IO.
  • No support for additional metadata. This is required for the acquisition tool to tell us these critical constants we need for analysis!
  • No support for embedding additional files, such as the pagefile, kernel image etc.
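To make the first point concrete, here is a minimal Python sketch of the arithmetic. The memory layout is hypothetical (3GB of RAM below a 1GB hole, with the remaining 1GB remapped above the 4GB mark), and the function names are ours rather than part of any pmem tool.

```python
GIB = 1 << 30

def raw_image_size(ranges):
    """A RAW image must span from address 0 to the end of the last range."""
    return max(start + length for start, length in ranges)

def populated_bytes(ranges):
    """Bytes of actual RAM, i.e. what a sparse-aware format needs to store."""
    return sum(length for _, length in ranges)

# Hypothetical (start, length) layout: 4GB of RAM with a 1GB PCI hole.
ranges = [(0, 3 * GIB), (4 * GIB, 1 * GIB)]

print(raw_image_size(ranges) // GIB)   # 5
print(populated_bytes(ranges) // GIB)  # 4
```

The difference between the two numbers is the zero padding that a RAW image has to carry.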
There are some other image file formats sometimes used but none of them have all the required features:
The Microsoft Crashdump file, for example, is commonly used with Windows images. However, this is a proprietary, undocumented and non-extensible file format with no support for compression or embedding (although it does support some Windows specific metadata). We do not recommend acquiring with this format directly - if you need to analyze the image with the Windows debugger, we recommend using the Rekall raw2dmp plugin to create a dump file later.
An ELF core file is the standard image format used by GDB and Linux when making a core dump. This format allows the storage of sparse memory regions, but has only limited support for extensible metadata. It is not possible to use this format to collect related files (like the pagefile, kernel image etc). This format is the default produced by versions of Rekall’s pmem acquisition tools prior to version 2.0. Certain virtualization tools like Virtual Box produce memory images in this format so it can still be useful.
EWF is a compression format which is used by Encase. It offers the ability for the image to be compressed but does not support sparse files, nor multiple streams (at least the versions supported by the open source libewf tool).
Various ad-hoc imaging formats are also sometimes used. Rekall can read images you receive in these formats, but they are not suitable for our purposes (they offer no compression and cannot collect multiple files in the same image file):
  • Limes - an ad-hoc imaging format sometimes used on Linux. Does not really offer any advantages over an ELF core dump.
  • HPAK - A proprietary format used in HBGary’s tools.
  • Mach-O - This is the binary format used on OSX. This kind of image used to be produced by the now defunct “Mac Memory Reader”. Does not really offer any advantages over ELF core dumps.
Since version 2.0, Rekall’s pmem suite of acquisition tools has switched to AFF4 as the default image format. AFF4 offers all the required features and more:
  • A peer reviewed open standard for storing digital images.
  • Supports compression using the Zlib and Snappy compression formats (Snappy allows imaging at speeds greater than 300MB/s). This is really important to reduce memory smear.
  • Supports storing arbitrary metadata via RDF information triples.
  • Supports collecting multiple files (streams) in the same file. Thus we can collect binaries and the pagefile as well as the physical memory at the time of acquisition. Rekall can then use all these information sources seamlessly during analysis (i.e. no need to explicitly tell Rekall which is the pagefile).
The image file format is based on the standard Zip file format, with all the advantages that brings, such as readily available tools for recovery of corrupted image files, inspection, verification and manipulation of zip files. Zip files are natively supported in almost every programming language - decompressing an AFF4 stream can be done in 4 lines of python without the use of a special AFF4 library (but the pyaff4 library can also be used).
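As a rough illustration of that claim, the sketch below decompresses one data segment using only the Python standard library. It assumes the segment is simply a concatenation of independently zlib-compressed chunks (no Snappy, no chunks stored uncompressed), so it walks the zlib streams one after another instead of consulting the segment's index. The function name is ours, and this is not necessarily how pyaff4 does it.

```python
import zipfile
import zlib

def decompress_segment(volume_path, member):
    """Return the decompressed bytes of one zlib-compressed data segment.

    Assumption: the segment is a plain concatenation of zlib streams,
    one per chunk.
    """
    with zipfile.ZipFile(volume_path) as volume:
        data = volume.read(member)

    chunks = []
    while data:
        decompressor = zlib.decompressobj()
        chunks.append(decompressor.decompress(data))
        # Whatever follows the end of this zlib stream is the next chunk.
        data = decompressor.unused_data
    return b"".join(chunks)
```

A call might look like decompress_segment("image.aff4", "PhysicalMemory/data/00000000"), where both names are placeholders for a real volume and one of its data segments.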

AFF4 Volume overview.

We said that the AFF4 format is built on top of the standard ZIP format. This means we can actually use the regular zip program to inspect an AFF4 volume.
The following is an image of a Windows Server 2003 system, acquired together with the pagefile. As you can see it is just a zip file:
$ unzip -l images/Windows_Server-2003-R2_SP2-English-32Bit-Base-2015.02.11.aff4

Archive:  images/Windows_Server-2003-R2_SP2-English-32Bit-Base-2015.02.11.aff4
  Length      Date    Time    Name
---------  ---------- -----   ----
      847  2015-03-10 00:50   information.turtle
       56  2015-03-10 00:50   PhysicalMemory/map
       64  2015-03-10 00:50   PhysicalMemory/idx
 12313883  2015-03-10 00:50   PhysicalMemory/data/00000031
     4048  2015-03-10 00:50   PhysicalMemory/data/00000031/index
        8  2015-03-10 00:50   c%3a/pagefile.sys/00000016
     4096  2015-03-10 00:49   PhysicalMemory/data/00000021/index
     4096  2015-03-10 00:49   PhysicalMemory/data/00000024/index
   166912  2015-03-10 00:50   c%3a/pagefile.sys/00000012
   166912  2015-03-10 00:50   c%3a/pagefile.sys/00000013
     4096  2015-03-10 00:50   c%3a/pagefile.sys/00000015/index
      204  2015-03-10 00:50   PhysicalMemory/information.yaml
        4  2015-03-10 00:50   c%3a/pagefile.sys/00000016/index
---------                     -------
278598663                     102 files
We can see that the AFF4 volume is denoted by a globally unique name, aff4://4928ef44-6579-496c-a53e-2ad34d98b7ed. This is called the AFF4 URN and uniquely identifies this volume. The metadata is stored in this volume’s archive member called “information.turtle”. We also see a number of streams - “PhysicalMemory” is the stream of the machine’s physical memory, and “c%3a/pagefile.sys” is the stream corresponding to the machine’s pagefile.
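The same inspection can be done from Python with the standard zipfile module. This is only a sketch: the function names are ours, and it assumes member names are percent-encoded the way “c%3a/pagefile.sys” is in the listing above, and that the metadata member is UTF-8 text.

```python
import zipfile
from urllib.parse import unquote

def list_members(volume_path):
    """Return (decoded name, size in bytes) for every archive member."""
    with zipfile.ZipFile(volume_path) as volume:
        return [(unquote(info.filename), info.file_size)
                for info in volume.infolist()]

def read_metadata(volume_path):
    """Return the Turtle text stored in the information.turtle member."""
    with zipfile.ZipFile(volume_path) as volume:
        return volume.read("information.turtle").decode("utf-8")

# Percent-decoding recovers the original path of the pagefile stream.
print(unquote("c%3a/pagefile.sys"))  # c:/pagefile.sys
```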
Let’s examine the metadata stored in the information.turtle archive member:
$ unzip -p images/Windows_Server-2003-R2_SP2-English-32Bit-Base-2015.02.11.aff4 information.turtle
@base <aff4://4928ef44-6579-496c-a53e-2ad34d98b7ed> .
@prefix rdf: <> .
@prefix aff4: <> .
@prefix xsd: <> .
@prefix memory: <> .

    aff4:category memory:physical ;
    aff4:stored <> ;
    a aff4:map .

    aff4:chunk_size 32768 ;
    aff4:chunks_per_segment 1024 ;
    aff4:compression <> ;
    aff4:size 1073336320 ;
    aff4:stored <> ;
    a aff4:image .

    aff4:category memory:pagefile ;
    aff4:chunk_size 32768 ;
    aff4:chunks_per_segment 1024 ;
    aff4:compression <> ;
    memory:pagefile_number 0 ;
    aff4:size 536870912 ;
    aff4:stored <> ;
    a aff4:image .
This shows us all the streams that exist in this volume encoded using the Turtle RDF serialization. Each stream has a number of attributes (key value pairs). The stream aff4://4928ef44-6579-496c-a53e-2ad34d98b7ed/PhysicalMemory has a category of memory:physical (i.e. it is a physical memory image). It is implemented as an aff4:map stream - i.e. this is a sparse stream which uses aff4://4928ef44-6579-496c-a53e-2ad34d98b7ed/PhysicalMemory/data as backing storage.
We can see the backing stream is an aff4:image typed stream with 32 KiB chunks, 1024 chunks per segment, using zlib compression.
Additionally we can see the pagefile is stored in a separate stream with a category memory:pagefile (Rekall can then use the category to automatically know how to use each stream).
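The chunk and segment attributes lend themselves to a small worked example. The sketch below uses the values from the metadata above to work out how many chunks the physical memory data stream needs and where a given offset lands. The helper names are ours, and counting segments from zero is an assumption based on the member names in the zip listing.

```python
CHUNK_SIZE = 32768          # aff4:chunk_size
CHUNKS_PER_SEGMENT = 1024   # aff4:chunks_per_segment

def locate(offset):
    """Map a stream offset to (segment, chunk in segment, offset in chunk)."""
    chunk, within_chunk = divmod(offset, CHUNK_SIZE)
    segment, chunk_in_segment = divmod(chunk, CHUNKS_PER_SEGMENT)
    return segment, chunk_in_segment, within_chunk

def chunk_count(size):
    """Number of chunks needed to hold size bytes (the last may be partial)."""
    return -(-size // CHUNK_SIZE)  # ceiling division

size = 1073336320  # aff4:size of the physical memory data stream
print(chunk_count(size))  # 32756
print(locate(size - 1))   # (31, 1011, 20479)
```

The last byte falls in segment 31, which therefore holds 1012 chunks. That lines up with the 4048 byte PhysicalMemory/data/00000031/index member in the listing if each index entry is 4 bytes, although that entry size is our inference from the listing and not something stated in the metadata.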

The PMEM suite of acquisition tools.

The Rekall project maintains a set of acquisition tools for the three supported operating systems: Windows, Linux and OSX. Since version 2.0, the three imagers have been merged into a single common framework. This means that you use them in the same way, and they all produce the same type of AFF4 images.
All imagers share the common AFF4 imager architecture. This means you can use all imagers for basic manipulation of all AFF4 volumes. Hence we will discuss these common features here. Below we discuss some of the differences in the implementations between the operating systems.
Let’s consider the output of the --help option:
$ linpmem --help

   linpmem  [--elf] [-m] [-p </path/to/pagefile>] ...  [-V] [-d] [-v] [-t]
            [-i </path/to/file/or/device>] ...  [-e <string>] [-o
            </path/to/file>] [-c <zlib, snappy, none>] [--] [--version]
            [-h] </path/to/aff4/volume> ...


   --elf
     Normally pmem will produce an AFF4 volume but this option will force
     an ELF Core image file to be produced during acquisition. Note that
     this option is not compatible with the --input or --pagefile options
     because we can not write multiple streams into an ELF file.

     This option is mostly useful for compatibility with legacy memory
     analysis tools which do not understand AFF4 images.

     If this option is used together with the --export option we will
     export an ELF file from a stream within the AFF4 image.

   -m,  --acquire-memory
     Normally pmem will only acquire memory if the user has not asked for
     something else (like acquiring files, exporting etc). This option
     forces memory to be acquired. It is only required when the program is
     invoked with the --input, --export or other actionable flags.

   -p </path/to/pagefile>,  --pagefile </path/to/pagefile>  (accepted
      multiple times)
     Also capture the pagefile. Note that you must provide this option
     rather than e.g. '--input c:\pagefile.sys' because we can not normally
     read the pagefile directly. This option will use the sleuthkit to read
     the pagefile.

   -V,  --view
     View AFF4 metadata

   -d,  --debug
     Display debugging logging

   -v,  --verbose
     Display more verbose information

   -t,  --truncate
     Truncate the output file. Normally volumes and images are appended to
     existing files, but this flag forces the output file to be truncated

   -i </path/to/file/or/device>,  --input </path/to/file/or/device>
      (accepted multiple times)
     File to image. If specified we copy this file to the output volume
     located at --output. If there is no AFF4 volume on --output yet, we
     create a new volume on it.

     This can be specified multiple times with shell expansion. e.g.:

     -i /bin/*

   -e <string>,  --export <string>
     Name of the stream to export. If specified we try to open this stream
     and write it to the --output file. Note that you will also need to
     specify an AFF4 volume path to load so we know where to find the
     stream. Specifying a relative URN implies a stream residing in a
     loaded volume. E.g.

     -e /dev/sda -o /tmp/myfile my_volume.aff4

   -o </path/to/file>,  --output </path/to/file>
     Output file to write to. If the file does not exist we create it.

   -c <zlib, snappy, none>,  --compression <zlib, snappy, none>
     Type of compression to use (default zlib).

   --,  --ignore_rest
     Ignores the rest of the labeled arguments following this flag.

   --version
     Displays version information and exits.

   -h,  --help
     Displays usage information and exits.

   </path/to/aff4/volume>  (accepted multiple times)
     These AFF4 Volumes will be loaded and their metadata will be parsed
     before the program runs.

     Note that this is necessary before you can extract streams with the
     --export flag.

   The LinuxPmem memory imager.  Copyright 2014 Google Inc.

Inspecting an AFF4 Volume.

The tool can examine an AFF4 volume as we have seen previously. It actually loads the provided AFF4 volume and outputs a common view of all known objects.
$ linpmem -V images/Windows_Server-2003-R2_SP2-English-32Bit-Base-2015.02.11.aff4
@prefix rdf: <> .
@prefix aff4: <> .
@prefix xsd: <> .
@prefix memory: <> .

    aff4:category memory:physical ;
    aff4:stored <aff4://4928ef44-6579-496c-a53e-2ad34d98b7ed> ;
    a aff4:map .

    aff4:chunk_size 32768 ;
    aff4:chunks_per_segment 1024 ;
    aff4:compression <> ;
    aff4:size 1073336320 ;
    aff4:stored <aff4://4928ef44-6579-496c-a53e-2ad34d98b7ed> ;
    a aff4:image .

    aff4:category memory:pagefile ;
    aff4:chunk_size 32768 ;
    aff4:chunks_per_segment 1024 ;
    aff4:compression <> ;
    memory:pagefile_number 0 ;
    aff4:size 536870912 ;
    aff4:stored <aff4://4928ef44-6579-496c-a53e-2ad34d98b7ed> ;
    a aff4:image .

    aff4:contains <aff4://4928ef44-6579-496c-a53e-2ad34d98b7ed> .

Extracting a stream from an AFF4 volume.

We can extract one of the streams to a file. This is sometimes useful for using tools which do not support AFF4 natively. For example, we can extract the pagefile into the /tmp/ directory:
$ linpmem --export /c:/pagefile.sys --output /tmp/pagefile.sys images/Windows_Server-2003-R2_SP2-English-32Bit-Base-2015.02.11.aff4

Extracting aff4://4928ef44-6579-496c-a53e-2ad34d98b7ed/c:/pagefile.sys into file:///tmp/pagefile.sys
 Reading 0xa00000  10MiB / 512MiB 0MiB/s
 Reading 0x5000000  80MiB / 512MiB 266MiB/s
 Reading 0xc800000  200MiB / 512MiB 474MiB/s
 Reading 0x15e00000  350MiB / 512MiB 586MiB/s
 Reading 0x19a00000  410MiB / 512MiB 236MiB/s
 Reading 0x1d600000  470MiB / 512MiB 216MiB/s

Adding a new stream to an AFF4 volume.

By default the AFF4 imager tools append streams to existing volumes, rather than overwrite the volume. Therefore it is easy to add additional files after the acquisition is complete to the acquired volume. It is also possible to specify shell globs to add multiple files to the volume. In this sense, the AFF4 volume acts more like a zip container - you can just keep on adding new files.
This is handy if initial analysis reveals some suspected files which we can acquire immediately into the AFF4 volume after the memory is captured. The -t flag explicitly allows pmem to truncate the output file (this will delete all current content of the volume).
For example, the following will add files in /bin/* to the AFF4 volume (without overwriting it).
$ linpmem -i /bin/* -o /tmp/test.aff4

Adding /bin/bash as file:///bin/bash
Adding /bin/bsd-csh as file:///bin/bsd-csh
Adding /bin/bunzip2 as file:///bin/bunzip2
Adding /bin/busybox as file:///bin/busybox
Adding /bin/bzcat as file:///bin/bzcat
Adding /bin/bzcmp as file:///bin/bzcmp
Adding /bin/bzdiff as file:///bin/bzdiff

The WinPmem acquisition tool.

On Windows, one must insert a signed driver in order to gain access to physical memory. From version 2.0, WinPmem is built on top of the AFF4 imager technology and is bundled with the appropriate memory drivers. Since AFF4 volumes use zip files as their underlying storage format, it is possible to append an AFF4 volume to the end of any other file type. The WinPmem acquisition tool uses this property to package all needed drivers and tools together with the executable itself, using the AFF4 format.
We typically package the 64 bit and 32 bit Windows kernel drivers with winpmem, as well as a copy of fcat.exe from the sleuthkit. This tool is used to provide access to the locked pagefiles. (Note that if you just want to extract the drivers, e.g. to use in another project, you can just unzip the winpmem executable.)
If no other operation was specified, WinPmem will immediately image memory and also acquire certain files, such as drivers and the kernel image. These are useful to preserve the exact versions of binaries running on the system at the time of the acquisition.
By default WinPmem uses a technique called PTE Remapping to acquire memory. This technique was originally developed in order to bypass potential malware hooking the APIs normally used for acquisition. After much use we found that the technique is in fact more stable than using the APIs, and it is actually the only reliable way to access physical memory on OSX. We therefore decided to make this the default acquisition mode on both Windows and OSX.
To acquire memory all one needs to do is to specify the output volume:
C:\Users\mic>winpmem_2.0.1.exe -o test.aff4
Driver Unloaded.
CR3: 0x0000187000
 2 memory ranges:
Start 0x00001000 - Length 0x0009E000
Start 0x00100000 - Length 0x3FEF0000
Dumping Range 0 (Starts at 1000)
Dumping Range 1 (Starts at 100000)
Adding C:\Windows\SysNative\drivers/1394bus.sys as file:///C:/Windows/SysNative/drivers/1394bus.sys
Adding C:\Windows\SysNative\drivers/1394ohci.sys as file:///C:/Windows/SysNative/drivers/1394ohci.sys
Adding C:\Windows\SysNative\drivers/acpi.sys as file:///C:/Windows/SysNative/drivers/acpi.sys
Adding C:\Windows\SysNative\drivers/acpipmi.sys as file:///C:/Windows/SysNative/drivers/acpipmi.sys
Adding C:\Windows\SysNative\drivers/adp94xx.sys as file:///C:/Windows/SysNative/drivers/adp94xx.sys
Adding C:\Windows\SysNative\drivers/adpahci.sys as file:///C:/Windows/SysNative/drivers/adpahci.sys
Adding C:\Windows\SysNative\drivers/WUDFPf.sys as file:///C:/Windows/SysNative/drivers/WUDFPf.sys
Adding C:\Windows\SysNative\drivers/WUDFRd.sys as file:///C:/Windows/SysNative/drivers/WUDFRd.sys
Driver Unloaded.
Note that by default the imager also captures the kernel and driver binaries. You can also choose the snappy compression (--compression snappy) for a faster compression algorithm.
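The memory ranges printed at the start of the output above are easy to post-process. As a sketch (the regular expression simply mirrors the “Start ... - Length ...” lines shown here, and may not match other WinPmem versions), the following totals how much memory was actually acquired and where the highest physical address lies:

```python
import re

RANGE_RE = re.compile(r"Start (0x[0-9A-Fa-f]+) - Length (0x[0-9A-Fa-f]+)")

def parse_ranges(log_text):
    """Extract (start, length) pairs from the imager's console output."""
    return [(int(start, 16), int(length, 16))
            for start, length in RANGE_RE.findall(log_text)]

log = """\
 2 memory ranges:
Start 0x00001000 - Length 0x0009E000
Start 0x00100000 - Length 0x3FEF0000
"""

ranges = parse_ranges(log)
acquired = sum(length for _, length in ranges)
highest = max(start + length for start, length in ranges)
print(hex(acquired), hex(highest))  # 0x3ff8e000 0x3fff0000
```

The 0x62000 byte difference between the two values is the reserved space that a sparse format does not need to store.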
Now we can use rekall to analyze this image:
C:\Users\mic>"c:\Program Files\Rekall\rekal.exe" -f test.aff4

The Rekall Memory Forensic framework 1.3.2 (Dammastock).

"We can remember it for you wholesale!"

This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License.

See to get started.
[1] test.aff4 15:56:05> pslist
  _EPROCESS            Name          PID   PPID   Thds    Hnds    Sess  Wow64           Start
-------------- -------------------- ----- ------ ------ -------- ------ ------ ------------------------
0xfa8000c9f040 System                   4      0     79      528      - False  2015-04-24 12:12:36+0000
0xfa8000ea3340 SearchProtocol         208   2336      7      284      0 False  2015-04-24 13:37:11+0000
0xfa8001d229f0 smss.exe               228      4      2       29      - False  2015-04-24 12:12:36+0000

The LinPmem acquisition tool.

By default, the linpmem acquisition tool uses the /proc/kcore device to acquire physical memory. This device must be enabled during kernel configuration but we found that in most distributions the device is already enabled.

The OSXPmem acquisition tool

OSXPmem has recently been updated with a new driver written by Adam Sindelar called MacPmem.kext. The new driver is more stable and works on all versions of OSX including the most recent 10.10 series. The new driver presents two devices:
  1. The /dev/pmem device is the raw physical memory device - reading from this device allows userspace applications (running as root) to read physical memory - e.g. Rekall itself can be used for live analysis.
  2. The /dev/pmem_info device presents information collected by the driver about the system - such as the EFI ranges, kernel slide and other critical parameters.
The following example illustrates how we can image memory on OSX. First we must elevate to the root user, then unzip the contents of the distribution. Note that the MacPmem.kext directory and its contents must be owned by root with group wheel, otherwise kextload will refuse to insert the kernel module.
Next we simply load the driver using kextload and run the acquisition tool to create the AFF4 volume.
$ sudo bash
# unzip
# kextload
# ./ -o /tmp/test.aff4
Imaging memory
E0424 16:26:04.297508 2091074320] Range 0 581632
E0424 16:26:04.297526 2091074320] Range 589824 65536
E0424 16:26:04.297534 2091074320] Range 1048576 535822336
E0424 16:26:04.297541 2091074320] Range 538968064 534790144
E0424 16:26:04.297549 2091074320] Range 1073762304 1257820160
E0424 16:26:04.297555 2091074320] Range 2332028928 4096
E0424 16:26:04.297562 2091074320] Range 4294967296 14753464320
 Reading 0x19100000  400MiB / 511MiB 55MiB/s
Adding /mach_kernel as file:///mach_kernel
Adding /dev/pmem_info as file:///dev/pmem_info
Adding /System/Library/Extensions/ALF.kext/Contents/MacOS/ALF as file:///System/Library/Extensions/ALF.kext/Contents/MacOS/ALF
Adding /System/Library/Extensions/ALF.kext/Contents/Resources/Dutch.lproj/ as file:///System/Library/Extensions/ALF.kext/Contents/
Adding /System/Library/Extensions/ALF.kext/Contents/Resources/English.lproj/ as file:///System/Library/Extensions/ALF.kext/Content
# cat /dev/pmem_info | head
%YAML 1.2
  pmem_api_version: 1
  cr3: 14860288073
  dtb_off: 14860288000
  phys_mem_size: 17179869184
  pci_config_space_base: 3758096384
  mmap_poffset: 107778048
  mmap_desc_version: 1
  mmap_size: 13776
  mmap_desc_size: 48
  kaslr_slide: 62914560
  kernel_poffset: 63963136
  kernel_version: "Darwin Kernel Version 13.4.0: Wed Mar 18 16:20:14 PDT 2015; root:xnu-2422.115.14~1/RELEASE_X86_64"
  - purpose: "(PCI) IGPU/0"
    type: "pci_range"
    pci_type: "PCIUnknownMemory"
    start: 4768923648
    length: 4194304
    hw_informant: false
As usual live analysis can be performed by simply specifying the /dev/pmem device for Rekall.
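The values reported by /dev/pmem_info are the kind of constants an analysis tool would otherwise have to scan for. The sketch below works on a few lines copied from the output above. It uses a deliberately naive “key: integer” parser of our own instead of a real YAML library, and the relationships it checks are observations about this particular sample, not documented guarantees of the driver.

```python
PAGE_MASK = 0xFFF  # the low 12 bits of CR3 are not part of the table address

def parse_scalars(text):
    """Parse simple 'key: integer' lines; everything else is skipped."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(": ")
        if sep and value.isdigit():
            values[key] = int(value)
    return values

sample = """\
  cr3: 14860288073
  dtb_off: 14860288000
  kaslr_slide: 62914560
  kernel_poffset: 63963136
"""

info = parse_scalars(sample)
print(info["cr3"] & ~PAGE_MASK == info["dtb_off"])        # True
print(hex(info["kaslr_slide"]))                           # 0x3c00000
print(hex(info["kernel_poffset"] - info["kaslr_slide"]))  # 0x100000
```

In this sample, dtb_off is CR3 with the low 12 bits cleared, and the kernel's physical offset is the KASLR slide plus 1MiB.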

Thursday, April 2, 2015

Announcing Rekall Release 1.3.1 (Dammastock)

Version 1.3.1 Dammastock.

This release was made at the Rekall Memory Forensic Workshop at DFRWS. For the first time, we ran this workshop completely from the interactive Rekall web console. It was an astounding success, and an impressive medium to deliver an interactive workshop (Check it out here ).

Release Highlights

Memory Acquisition
The major thrust for this release was the updating of the Pmem Acquisition tools to AFF4. In addition to the stable WinPmem 1.6.2, we have made available an experimental pre-release of the WinPmem 2.0 series.
The new imagers feature:
  1. A consistent interface. The same command line arguments are used for all operating systems.
  2. The new memory image format we have standardized on is AFF4. This allows us to store multiple streams in the image, such as the page file and additional files.
  3. The pmem imagers are able to embed different files inside the final AFF4 image, such as the kernel image and miscellaneous binaries.
Note that the new imagers are still considered pre-release. Please test but continue using the old imagers for critical work. The best documentation of the new imagers is currently found here under "Memory Acquisition".
GUI Web Console
The GUI was expanded to accommodate multiple sessions. A Rekall session is an object encapsulating all we know about a specific image. With multiple session support in the GUI, we are able to write a single web console document which runs plugins on multiple images simultaneously.
  • The GUI was also adapted to allow for the export of static versions of the document, which can be hosted on a simple web server.
Rekall will now automatically fetch missing profiles from the Microsoft Symbol Server for critical modules.
  • This was a huge pain point in the past - when MS updated kernels through a patch, the kernel was rebuilt, resulting in a new profile. Until the Rekall team pushed the new profile to the profile repository, Rekall was non-functional on those systems, requiring users to know how to generate new profiles manually.
  • This new release adds a setting (you can set it in the configuration file permanently or just use the flag --autodetect_build_local). The following values are allowed:
  • none means that Rekall will not fetch profiles from the symbol server (but will still use the profile repositories specified in repository_path).
  • basic is the default setting. Rekall will fetch profiles for selected modules, such as the kernel, win32k.sys, ntdll, tcpip etc. This is usually good enough for most plugins to function correctly.
  • full means that Rekall will try the symbol server for all profiles it does not know about. This can be very slow but will produce the best outcome (e.g. disassembly output will be fully annotated).
Added support for XEN paravirtualized guests.

Tuesday, December 23, 2014

Announcing Rekall Release 1.2.1 (Col de la Croix)

Version 1.2.1 Col de la Croix

This release just made it in time for Christmas! Enjoy!

Release Highlights

For the first time Rekall includes experimental support for analysis of traditional Disk images. This release includes a full featured parser for NTFS. Some interesting plugins:
  • fls: List files in the filesystem.
  • istat: Displays information about an MFT entry.
  • idump: hexdump an attribute or stream.
  • iexport: Exports a file from the NTFS.
This release includes full support for acquisition and analysis of the windows page file. Some interesting plugins include:
  • pagefiles: Lists the currently active page files and their locations.
  • vadmap: Displays each page in the VAD and resolves its location in physical memory (or the page file).
  • vtop: This plugin was expanded to display where virtual pages are actually backed by the page file.
  • dumpfiles: This plugin was finally implemented in Rekall.
  • inspect_heap: Experimental support for heap enumeration on Win7 x64 allows enumeration of userspace heap allocation (e.g. malloc()).
    • dns_cache: This is also used to enumerate the dns cache by inspecting heap allocations.
This release adds a functional Entity layer, currently confined to OSX analysis. Entities are a kind of query language for memory artifacts. Some useful plugins:
  • find: Search for entities based on a query.
  • analyze: Analyze the internal query optimizer’s collectors that will be run in response to a query.
  • Most other plugins are rewritten in terms of entities (e.g. lsof, netstat etc.)
This release brings a dedicated userspace imager to Linux. The lmap tool was expanded to write ELF core dump files and acquire directly from /proc/kcore, if the target system supports it (in this case no kernel module is needed).
  • MIPS address space added for support on Big Endian Machines.
Rekall can now read and write EWF files natively. There have been many performance and stability improvements too.
  • ewfacquire: Rekall can be used to acquire memory efficiently, writing an EWF compressed file (with an embedded ELF file).
  • The Profile repository is now cached locally to make subsequent runs faster.

Rekall NTFS Support.

Why did we add NTFS support to a memory forensic tool? 
  1. When we added support for using the windows pagefile to supplement memory analysis it became apparent that we needed to read the pagefile directly from the NTFS since the file is normally locked - so normal file APIs are not usable. We considered using tricky kernel hacking to bypass the file lock restrictions but this seems fragile and NTFS parsing is not that complicated. For memory acquisition using WinPmem we included the fcat tool from the Sleuthkit to copy the pagefile out (alternatively we could have linked libtsk directly). But one of the more important uses of Rekall is live analysis, and this does not really solve it.
  2. We knew that Rekall’s binary parsing library was up to the task of handling NTFS. It was a good exercise to learn NTFS and document it in the Rekall implementation. I used Brian Carrier’s excellent book File System Forensic Analysis to learn about the NTFS and implement it in Rekall.
  3. Although we also maintain pytsk as a python binding to the TSK library, it is a bit of a pain to use. The bindings are sometimes fragile and can cause crashes in some situations. They also need to be frequently updated when TSK evolves. Performance is not great - TSK needs to parse the entire MFT each time it is loaded. Pytsk has a lot of trouble parsing a live filesystem since TSK caches its analysis of the MFT and might not see new files created after this initial MFT parsing. Rebuilding the caches is quite slow too due to the IO required in reading the entire MFT each time. I felt that a pure python implementation of NTFS parsing can be useful in some situations and use cases and possibly be more efficient (Despite being written in Python :-).
  4. Probably the most important reason for implementing NTFS parsing in Rekall was that it was a hell of a lot of fun! Rekall’s programming APIs are very easy to work with and the final plugins were really fast and powerful.
Rekall’s implementation is not supposed to be a replacement for TSK. The Sleuthkit actually converts much of the information found in NTFS into a common format to fit all filesystems. So for example, it extends the inode abstraction from other filesystems from a simple integer to a string which may contain type and id (e.g. 50-144-8). Some of the timestamps are also omitted from the tool’s output (but may be found using the low level API). This extra layer of abstraction is good in the general case (e.g. autopsy works with all filesystems in the same way) but may actually hide some important forensic information in some cases. It is useful to have an implementation of NTFS parsing with no abstractions at all, to allow examiners to corroborate some of the low level information available.

1. Rekall NTFS plugins.

In order to analyze an NTFS disk image, simply load it with the familiar -f switch. In the following example I use the vdfuse tool in order to export my Virtual Box VDI disk partitions as raw devices:
$ vdfuse -r -f ~/VirtualBox\ VMs/win7/win7.vdi /tmp/mnt/
$ rekal -f /tmp/mnt/Partition2
The Rekall Memory Forensic framework 1.2.0 (Col de la Croix).

"We can remember it for you wholesale!"

This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License.

See to get started.
[1] Partition2 15:57:14> istat
MFT Entry Header Values:
Entry: 5        Sequence: 5
$LogFile Sequence Number: 13730649171
Links: 1

Flags                          COMPRESSED, HIDDEN, SYSTEM
Owner ID                       0
SID                            265
Created                        2009-07-14 02:38:56+0000
File Modified                  2014-10-31 21:36:56+0000
MFT Modified                   2014-10-31 21:36:56+0000
Accessed                       2014-10-31 21:36:56+0000

     Inode                   Type                 Name     Res     Size    Comment
--------------- ------------------------------ ---------- ----- ---------- -------
         5-16-0 $STANDARD_INFORMATION                     True          72
         5-48-1 $FILE_NAME                                True          68 .
        5-144-6 $INDEX_ROOT                    $I30       True         168
        5-160-8 $INDEX_ALLOCATION              $I30       False       8192
        5-176-7 $BITMAP                        $I30       True           8
        5-256-9 $LOGGED_UTILITY_STREAM         $TXF_DATA  True          56

$I30 Analysis:
   MFT      Seq           Created                  File Mod                   MFT Mod                   Access              Size    Filename
---------- ----- ------------------------- ------------------------- ------------------------- ------------------------- ---------- --------
         4     4 -                         -                         -                         -                                  0 $AttrDef
         8     8 -                         -                         -                         -                                  0 $BadClus
         6     6 -                         -                         -                         -                                  0 $Bitmap
         7     7 -                         -                         -                         -                                  0 $Boot
        11    11 2013-02-20 02:35:15+0000  2013-02-20 02:35:15+0000  2013-02-20 02:35:15+0000  2013-02-20 02:35:15+0000           0 $Extend
         2     2 -                         -                         -                         -                                  0 $LogFile
         0     1 2013-02-20 02:35:15+0000  2013-02-20 02:35:15+0000  2013-02-20 02:35:15+0000  2013-02-20 02:35:15+0000       16384 $MFT
         1     1 -                         -                         -                         -                                  0 $MFTMirr
        57     3 2009-07-14 03:18:56+0000  2013-02-19 17:51:59+0000  2013-02-19 17:51:59+0000  2013-02-19 17:51:59+0000           0 $Recycle.Bin
         9     9 2013-02-20 02:35:15+0000  2013-02-20 02:35:15+0000  2013-02-20 02:35:15+0000  2013-02-20 02:35:15+0000           0 $Secure
        10    10 -                         -                         -                         -                                  0 $UpCase
         3     3 -                         -                         -                         -                                  0 $Volume
         5     5 2009-07-14 02:38:56+0000  2014-10-31 21:36:56+0000  2014-10-31 21:36:56+0000  2014-10-31 21:36:56+0000           0 .
    145393    16 2013-02-24 22:22:28+0000  2014-10-28 09:15:59+0000  2014-10-28 09:15:59+0000  2014-10-28 09:15:59+0000           0 Config.Msi
     63072    11 2013-02-19 18:31:29+0000  2014-10-23 15:24:31+0000  2014-10-23 15:24:31+0000  2014-10-23 15:24:31+0000           0 cygwin
     13692     1 2009-07-14 05:08:56+0000  2009-07-14 05:08:56+0000  2013-02-20 02:46:34+0000  2009-07-14 05:08:56+0000           0 Documents and Settings
    165336    45 2013-12-28 19:17:49+0000  2013-12-28 19:17:55+0000  2013-12-28 19:17:55+0000  2013-12-28 19:17:55+0000           0 MinGW
      1251    15 2014-08-25 13:45:38+0000  2014-10-27 14:08:43+0000  2014-10-27 14:08:43+0000  2014-08-25 13:45:38+0000  1207721984 pagefile.sys
        58     1 2009-07-14 03:20:08+0000  2009-07-14 03:20:08+0000  2013-02-20 02:46:13+0000  2009-07-14 03:20:08+0000           0 PerfLogs
        60     1 2009-07-14 03:20:08+0000  2014-08-27 23:30:42+0000  2014-08-27 23:30:42+0000  2014-08-27 23:30:42+0000           0 Program Files
       247     1 2009-07-14 03:20:08+0000  2014-10-28 09:11:38+0000  2014-10-28 09:11:38+0000  2014-10-28 09:11:38+0000           0 Program Files (x86)
       363     1 2009-07-14 03:20:08+0000  2013-02-19 18:27:21+0000  2013-02-19 18:27:21+0000  2013-02-19 18:27:21+0000           0 ProgramData
        60     1 2009-07-14 03:20:08+0000  2014-08-27 23:30:42+0000  2014-08-27 23:30:42+0000  2014-08-27 23:30:42+0000           0 PROGRA~1
       247     1 2009-07-14 03:20:08+0000  2014-10-28 09:11:38+0000  2014-10-28 09:11:38+0000  2014-10-28 09:11:38+0000           0 PROGRA~2
       363     1 2009-07-14 03:20:08+0000  2013-02-19 18:27:21+0000  2013-02-19 18:27:21+0000  2013-02-19 18:27:21+0000           0 PROGRA~3
     88993     1 2013-02-19 18:58:43+0000  2014-08-27 22:34:33+0000  2014-10-21 16:39:13+0000  2014-08-27 22:34:33+0000           0 Python27
    118195     3 2013-02-19 22:37:36+0000  2013-05-30 13:28:51+0000  2013-05-30 13:28:51+0000  2013-05-30 13:28:51+0000           0 Python27.32
     27376     2 2013-02-19 17:51:27+0000  2013-02-19 17:51:27+0000  2013-02-19 17:51:27+0000  2013-02-19 17:51:27+0000           0 Recovery
    149334    13 2014-08-06 18:56:26+0000  2014-09-11 14:18:27+0000  2014-09-11 14:18:27+0000  2014-09-11 14:18:27+0000           0 rekall-profiles
    149334    13 2014-08-06 18:56:26+0000  2014-09-11 14:18:27+0000  2014-09-11 14:18:27+0000  2014-09-11 14:18:27+0000           0 REKALL~1
     16393     2 2013-02-20 02:47:16+0000  2014-10-31 22:07:56+0000  2014-10-31 22:07:56+0000  2014-10-31 22:07:56+0000           0 System Volume Information
     16393     2 2013-02-20 02:47:16+0000  2014-10-31 22:07:56+0000  2014-10-31 22:07:56+0000  2014-10-31 22:07:56+0000           0 SYSTEM~1
       457     1 2009-07-14 03:20:08+0000  2013-02-19 17:51:39+0000  2014-08-27 22:06:23+0000  2013-02-19 17:51:39+0000           0 Users
    154403     2 2013-02-20 13:12:13+0000  2013-02-20 13:15:48+0000  2013-02-20 13:15:48+0000  2013-02-20 13:15:48+0000           0 websymbols
    154403     2 2013-02-20 13:12:13+0000  2013-02-20 13:15:48+0000  2013-02-20 13:15:48+0000  2013-02-20 13:15:48+0000           0 WEBSYM~1
     58269     7 2013-02-19 18:28:16+0000  2013-02-19 18:28:16+0000  2013-02-19 18:28:16+0000  2013-02-19 18:28:16+0000           0 WinDDK
       619     1 2009-07-14 03:20:08+0000  2014-10-21 23:41:52+0000  2014-10-21 23:41:52+0000  2014-10-21 23:41:52+0000           0 Windows
The istat plugin displays information about a particular MFT entry. By default it shows entry 5 (the root directory). If the entry has an I30 attribute (which represents a directory index), the plugin further parses the entry and displays all files in the directory recovered from the I30 attribute stream. Note that the I30 stream contains its own set of timestamps for each entry (four, as the columns above show), which are separate from the timestamps stored in the $STANDARD_INFORMATION attribute of the file's own MFT entry.
The output of istat also lists the attributes and their types in a similar notation to that found in, e.g. the Sleuthkit. That is as a tuple separated by dashes, MFT-TYPE-ID.
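As a small illustration of the MFT-TYPE-ID notation, here is a minimal sketch (written for Python 3) that splits such a string into its parts. The names Inode, ATTRIBUTE_TYPES and parse_inode are hypothetical helpers for this example and are not part of Rekall's API. The type codes are the ones visible in the istat output above.

```python
from collections import namedtuple

# Hypothetical helper type for illustration, not part of Rekall.
Inode = namedtuple("Inode", ["mft", "attr_type", "attr_id"])

# Attribute type codes as they appear in the istat listings above.
ATTRIBUTE_TYPES = {
    16: "$STANDARD_INFORMATION",
    48: "$FILE_NAME",
    128: "$DATA",
    144: "$INDEX_ROOT",
    160: "$INDEX_ALLOCATION",
    176: "$BITMAP",
}


def parse_inode(text):
    """Split an 'MFT-TYPE-ID' string into integer components.

    Missing trailing components come back as None, so a bare MFT entry
    number such as '5' is accepted as well.
    """
    parts = [int(part) for part in text.split("-")]
    if not 1 <= len(parts) <= 3:
        raise ValueError("expected MFT[-TYPE[-ID]], got %r" % text)
    parts += [None] * (3 - len(parts))
    return Inode(*parts)


inode = parse_inode("5-144-6")
print(inode.mft, ATTRIBUTE_TYPES.get(inode.attr_type), inode.attr_id)
```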
The fls plugin works in a similar way, but lists directories based on a filename, rooted at the root of the filesystem. The filename may use forward or backslash for separators.
[1] Partition2 16:06:52> fls "Python27"
-----------------------> fls("Python27")
   MFT      Seq           Created                  File Mod                   MFT Mod                   Access              Size    Filename
---------- ----- ------------------------- ------------------------- ------------------------- ------------------------- ---------- --------
    213703    19 2013-12-28 19:37:46+0000  2013-12-28 19:37:47+0000  2013-12-28 19:37:47+0000  2013-12-28 19:37:46+0000        1315 distorm3-wininst.log
    213703    19 2013-12-28 19:37:46+0000  2013-12-28 19:37:47+0000  2013-12-28 19:37:47+0000  2013-12-28 19:37:46+0000        1315 DISTOR~1.LOG
     91798     1 2013-02-19 18:59:11+0000  2013-12-28 17:23:18+0000  2013-12-28 17:23:18+0000  2013-12-28 17:23:18+0000           0 DLLs
     94581     1 2013-02-19 18:59:34+0000  2013-02-19 18:59:34+0000  2013-02-19 18:59:34+0000  2013-02-19 18:59:34+0000           0 Doc
     92031     1 2013-02-19 18:59:15+0000  2014-10-03 00:01:56+0000  2014-10-03 00:01:56+0000  2014-10-03 00:01:56+0000           0 include
     89201     1 2013-02-19 18:58:47+0000  2014-10-31 21:17:03+0000  2014-10-31 21:17:03+0000  2014-10-31 21:17:03+0000           0 Lib
     92334     1 2013-02-19 18:59:18+0000  2013-02-19 18:59:18+0000  2013-02-19 18:59:18+0000  2013-02-19 18:59:18+0000           0 libs
     89063     1 2012-04-10 22:31:16+0000  2012-04-10 22:31:16+0000  2013-02-19 18:58:44+0000  2013-02-19 18:58:44+0000       40092 LICENSE.txt
    127389     7 2013-05-30 13:01:27+0000  2013-05-30 13:01:29+0000  2013-05-30 13:01:29+0000  2013-05-30 13:01:27+0000        9973 M2Crypto-wininst.log
    127389     7 2013-05-30 13:01:27+0000  2013-05-30 13:01:29+0000  2013-05-30 13:01:29+0000  2013-05-30 13:01:27+0000        9973 M2CRYP~1.LOG
     89035     1 2012-04-10 22:18:52+0000  2012-04-10 22:18:52+0000  2013-02-19 18:58:44+0000  2013-02-19 18:58:44+0000      310875 NEWS.txt
    129217     7 2013-05-30 13:02:47+0000  2013-05-30 13:02:47+0000  2013-05-30 13:02:47+0000  2013-05-30 13:02:47+0000        2645 psutil-wininst.log
    129217     7 2013-05-30 13:02:47+0000  2013-05-30 13:02:47+0000  2013-05-30 13:02:47+0000  2013-05-30 13:02:47+0000        2645 PSUTIL~1.LOG
      1267     8 2014-08-27 22:34:33+0000  2014-08-27 22:34:36+0000  2014-08-27 22:34:36+0000  2014-08-27 22:34:36+0000           0 PyInstaller-2.1
      1267     8 2014-08-27 22:34:33+0000  2014-08-27 22:34:36+0000  2014-08-27 22:34:36+0000  2014-08-27 22:34:36+0000           0 PYINST~1.1
     89064     1 2012-04-10 22:24:54+0000  2012-04-10 22:24:54+0000  2013-02-19 18:58:44+0000  2013-02-19 18:58:44+0000       27136 python.exe
     89065     1 2012-04-10 22:24:58+0000  2012-04-10 22:24:58+0000  2013-02-19 18:58:44+0000  2013-02-19 18:58:44+0000       27648 pythonw.exe
    116396     2 2013-02-19 19:45:14+0000  2013-02-19 23:42:49+0000  2013-02-19 23:42:49+0000  2013-02-19 19:45:14+0000      238566 pywin32-wininst.log
    116396     2 2013-02-19 19:45:14+0000  2013-02-19 23:42:49+0000  2013-02-19 23:42:49+0000  2013-02-19 19:45:14+0000      238566 PYWIN3~1.LOG
     89009     2 2012-03-18 22:58:32+0000  2013-05-30 13:36:32+0000  2013-05-30 13:36:32+0000  2013-05-30 13:36:32+0000        2797 readme.txt
    213706    13 2013-12-28 19:37:46+0000  2013-12-28 19:37:46+0000  2013-12-28 19:37:46+0000  2013-12-28 19:37:46+0000      223744 Removedistorm3.exe
    127390     5 2013-05-30 13:01:27+0000  2013-05-30 13:01:27+0000  2013-05-30 13:01:27+0000  2013-05-30 13:01:27+0000      223744 RemoveM2Crypto.exe
    129218     5 2013-05-30 13:02:47+0000  2013-05-30 13:02:47+0000  2013-05-30 13:02:47+0000  2013-05-30 13:02:47+0000      223744 Removepsutil.exe
    116397     2 2013-02-19 19:45:14+0000  2013-02-19 23:42:21+0000  2013-02-19 23:42:21+0000  2013-02-19 19:45:14+0000      223744 Removepywin32.exe
    116397     2 2013-02-19 19:45:14+0000  2013-02-19 23:42:21+0000  2013-02-19 23:42:21+0000  2013-02-19 19:45:14+0000      223744 REMOVE~1.EXE
    127390     5 2013-05-30 13:01:27+0000  2013-05-30 13:01:27+0000  2013-05-30 13:01:27+0000  2013-05-30 13:01:27+0000      223744 REMOVE~2.EXE
    129218     5 2013-05-30 13:02:47+0000  2013-05-30 13:02:47+0000  2013-05-30 13:02:47+0000  2013-05-30 13:02:47+0000      223744 REMOVE~3.EXE
    213706    13 2013-12-28 19:37:46+0000  2013-12-28 19:37:46+0000  2013-12-28 19:37:46+0000  2013-12-28 19:37:46+0000      223744 REMOVE~4.EXE
    117061     1 2013-02-19 19:45:15+0000  2014-08-27 23:48:19+0000  2014-08-27 23:48:19+0000  2014-08-27 23:48:19+0000           0 Scripts
    233676    26 2014-02-19 22:43:58+0000  2014-02-19 22:43:58+0000  2014-02-19 22:43:58+0000  2014-02-19 22:43:58+0000           0 share
     92364     1 2013-02-19 18:59:18+0000  2013-02-19 18:59:33+0000  2013-02-19 18:59:33+0000  2013-02-19 18:59:33+0000           0 tcl
     94437     1 2013-02-19 18:59:33+0000  2013-02-19 18:59:34+0000  2013-02-19 18:59:34+0000  2013-02-19 18:59:34+0000           0 Tools
Similarly, fstat is analogous to istat except that it takes a filename as an argument.
Rekall supports NTFS compressed files too. Consider the following file:
[1] Partition2 16:06:54> istat 89063
MFT Entry Header Values:
Entry: 89063        Sequence: 1
$LogFile Sequence Number: 12520239903
Links: 1

Flags                          ARCHIVE, COMPRESSED
Owner ID                       0
SID                            713
Created                        2012-04-10 22:31:16+0000
File Modified                  2012-04-10 22:31:16+0000
MFT Modified                   2013-02-19 18:58:44+0000
Accessed                       2013-02-19 18:58:44+0000

     Inode                   Type                 Name     Res     Size    Comment
--------------- ------------------------------ ---------- ----- ---------- -------
     89063-16-0 $STANDARD_INFORMATION                     True          72
     89063-48-2 $FILE_NAME                                True          88 LICENSE.txt
    89063-128-3 $DATA                                     False      40092 VCN: 0-15

Clusters (128-3):
3456320-3456326(6)        Sparse(10)
NTFS compression works by compressing every 16 clusters together, and inserting a sparse cluster to cover the compressed region. We can see this in the above cluster listing.
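To make the cluster listing easier to read, the following sketch (Python 3) labels each 16 cluster compression unit of a run list. It is only an illustration under simple assumptions: runs are (start_cluster, length) tuples with start_cluster set to None for a sparse run, and classify_units is a hypothetical name, not a Rekall function. It expands the run list to one flag per cluster, which is fine for a demonstration but wasteful for very large files.

```python
COMPRESSION_UNIT = 16  # clusters per compression unit


def classify_units(runs, unit=COMPRESSION_UNIT):
    """Label each compression unit as 'raw', 'compressed' or 'sparse'.

    runs is a list of (start_cluster, length) tuples in file order, where
    start_cluster is None for a sparse run.
    """
    # One flag per cluster: True when the cluster is actually on disk.
    allocated = []
    for start, length in runs:
        allocated.extend([start is not None] * length)

    labels = []
    for i in range(0, len(allocated), unit):
        chunk = allocated[i:i + unit]
        if all(chunk):
            labels.append("raw")         # stored uncompressed
        elif not any(chunk):
            labels.append("sparse")      # a hole in the file
        else:
            labels.append("compressed")  # data followed by sparse padding
    return labels


# The run list of LICENSE.txt shown above: 6 clusters of data, 10 sparse.
print(classify_units([(3456320, 6), (None, 10)]))  # ['compressed']
```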
Rekall provides the idump plugin, which is analogous to the regular dump plugin and displays a hexdump of the MFT entry.
[1] Partition2 16:06:56> idump 89063
    Offset                           Hex                              Data
-------------- ------------------------------------------------ ----------------
           0x0 41 2e 20 48 49 53 54 4f 52 59 20 4f 46 20 54 48  A..HISTORY.OF.TH -
          0x10 45 20 53 4f 46 54 57 41 52 45 0d 0a 3d 3d 3d 3d  E.SOFTWARE..==== -
          0x20 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d 3d  ================ -
          0x30 3d 3d 3d 3d 3d 3d 0d 0a 0d 0a 50 79 74 68 6f 6e  ======....Python -
          0x40 20 77 61 73 20 63 72 65 61 74 65 64 20 69 6e 20  .was.created.in. -
          0x50 74 68 65 20 65 61 72 6c 79 20 31 39 39 30 73 20  the.early.1990s. -
          0x60 62 79 20 47 75 69 64 6f 20 76 61 6e 20 52 6f 73  by.Guido.van.Ros -
          0x70 73 75 6d 20 61 74 20 53 74 69 63 68 74 69 6e 67  sum.at.Stichting -
          0x80 0d 0a 4d 61 74 68 65 6d 61 74 69 73 63 68 20 43  ..Mathematisch.C -
          0x90 65 6e 74 72 75 6d 20 28 43 57 49 2c 20 73 65 65  entrum.(CWI,.see -
If you want to copy a file out of the NTFS filesystem, use the iexport plugin.
[1] Partition2 16:50:46> iexport 89063, dump_dir="/tmp/"
Writing MFT Entry 89063 as Python27/LICENSE.txt
[1] Partition2 16:51:11> !head /tmp/Python27%2fLICENSE.txt

Python was created in the early 1990s by Guido van Rossum at Stichting
Mathematisch Centrum (CWI, see in the Netherlands
as a successor of a language called ABC.  Guido remains Python's
principal author, although it includes many contributions from others.

In 1995, Guido continued his work on Python at the Corporation for
National Research Initiatives (CNRI, see

2. Rekall’s NTFS implementation notes.

This section is intended for Rekall developers who want to learn a bit about how Rekall’s NTFS implementation uses some of the common features in the Rekall API.

2.1. Autodetection of NTFS

To make Rekall as easy to use as possible, we use autodetection as much as we can. Ideally a user should simply provide the image file, and Rekall will detect the image format and the profile required. To support this, Rekall has an autodetection plugin system. A detector class simply registers by extending guess_profile.DetectionMethod:
class NTFSDetector(guess_profile.DetectionMethod):
    name = "ntfs"

    def Offsets(self):
        # Only the start of the image needs to be checked.
        return [0]

    def DetectFromHit(self, hit, _, address_space):
        ntfs_profile = self.session.LoadProfile("ntfs")
        try:
            ntfs = NTFS(address_space=address_space, session=self.session)
            self.session.SetParameter("ntfs", ntfs)

            return ntfs_profile
        except NTFSParseError:
            # Not an NTFS volume: report no profile.
            return
The detector can provide a string on which to fire, or a list of offsets to check in its Offsets() method. The framework will then call it when a hit is found.
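For readers who want to see what a check at offset 0 might look like outside of Rekall, here is a minimal standalone sketch in Python 3. It does not reproduce what Rekall's NTFS class does during detection; it only tests the OEM ID and boot signature of the first sector and reads the cluster size from the common boot sector layout. The function names looks_like_ntfs and cluster_size are hypothetical.

```python
import struct

NTFS_OEM_ID = b"NTFS    "     # bytes 3..10 of an NTFS boot sector
BOOT_SIGNATURE = b"\x55\xaa"  # last two bytes of the 512 byte sector


def looks_like_ntfs(boot_sector):
    """Cheap sanity check on the first sector of a volume."""
    if len(boot_sector) < 512:
        return False
    return (boot_sector[3:11] == NTFS_OEM_ID and
            boot_sector[510:512] == BOOT_SIGNATURE)


def cluster_size(boot_sector):
    """Return the cluster size in bytes, assuming the common layout.

    Bytes per sector is a little-endian short at offset 11 and sectors
    per cluster is a single byte at offset 13.
    """
    bytes_per_sector, sectors_per_cluster = struct.unpack_from(
        "<HB", boot_sector, 11)
    return bytes_per_sector * sectors_per_cluster
```

A detector built this way would still need a full parse (as the NTFS constructor above does) before trusting the volume, since the signature alone is easy to satisfy by accident.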

2.2. Implementing Fixups

One of the more interesting features of NTFS is the use of fixups. When NTFS writes certain data structures to disk, it replaces the last two bytes of each 512 byte sector with a marker value (the fixup magic) and stores the bytes that used to be there in a fixup table. When NTFS reads the structure back from disk, it checks each marker and applies the fixups to recover the original data.
This means that we cannot simply read clusters from the disk: we must apply the relevant fixups. In Rekall we have an Address Space abstraction to read data. Address Spaces typically layer on top of other address spaces. Hence we can implement the FixupAddressSpace so it can be layered on top of another address space:
class FixupAddressSpace(addrspace.BaseAddressSpace):
    """An address space to implement record fixup."""

    def __init__(self, fixup_magic, fixup_table, base_offset, length, **kwargs):
        super(FixupAddressSpace, self).__init__(**kwargs)
        self.as_assert(self.base is not None, "Address space must be stacked.")
        self.base_offset = base_offset
        self.fixup_table = fixup_table
        self.fixup_magic = fixup_magic

        # We read the entire region into a mutable buffer then apply the fixups.
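        self.buffer = array.array("c", self.base.read(base_offset, length))
        for i, fixup_value in enumerate(fixup_table):
            # The marker occupies the last two bytes of each 512 byte sector.
            fixup_offset = (i+1) * 512 - 2
            if (self.buffer[fixup_offset:fixup_offset+2].tostring() !=
                    fixup_magic.v()):
                raise NTFSParseError("Fixup error")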

            self.buffer[fixup_offset:fixup_offset+2] = array.array(
                "c", fixup_value.v())

    def read(self, address, length):
        buffer_offset = address - self.base_offset
        return self.buffer[buffer_offset:buffer_offset+length].tostring()
We can then apply the fixup to arbitrary structures. The code below automatically applies the fixup every time we instantiate an MFT_ENTRY struct, so the fixups become completely transparent:
class MFT_ENTRY(obj.Struct):
    def __init__(self, **kwargs):
        super(MFT_ENTRY, self).__init__(**kwargs)

        # We implement fixup by wrapping the base address space with a fixed
        # one:
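        self.obj_vm = FixupAddressSpace(fixup_magic=self.fixup_magic,
                                        fixup_table=self.fixup_table,
                                        base_offset=self.obj_offset,
                                        length=self.mft_entry_allocated,
                                        base=self.obj_vm)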
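The same idea can be tried outside of Rekall on a raw record. The sketch below is a standalone Python 3 illustration, not Rekall code: apply_fixups is a hypothetical name, and the header offsets (fixup_offset at 0x04, fixup_count at 0x06) are taken from the MFT_ENTRY printout later in this post. It assumes that fixup_count includes the magic itself, which matches the printout (a count of 3 with a table of 2 entries).

```python
import struct

SECTOR_SIZE = 512


def apply_fixups(record):
    """Return a copy of an MFT record with its fixups applied."""
    data = bytearray(record)
    # fixup_offset and fixup_count are little-endian unsigned shorts.
    fixup_offset, fixup_count = struct.unpack_from("<HH", data, 4)

    magic = bytes(data[fixup_offset:fixup_offset + 2])
    # The table of original bytes follows the magic, one entry per sector.
    for i in range(fixup_count - 1):
        entry = fixup_offset + 2 + 2 * i
        end_of_sector = (i + 1) * SECTOR_SIZE - 2
        if bytes(data[end_of_sector:end_of_sector + 2]) != magic:
            raise ValueError("Fixup error in sector %d" % i)
        # Put the original two bytes back in place of the marker.
        data[end_of_sector:end_of_sector + 2] = data[entry:entry + 2]
    return bytes(data)
```

A marker mismatch usually means the record was only partially written (a torn write) or that the data is not an MFT record at all, which is why the sketch raises instead of silently continuing.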

2.3. Runlists

NTFS attributes can be fragmented. The actual blocks they occupy on disk are described using a run list. Rekall already has an address space primitive called a RunBasedAddressSpace. This type of address space is simply initialized with a list of runs specifying tuples of the form (file address, disk address, length), and then layered on top of the Physical Address Space (i.e. the disk image).
Supporting compressed files makes the implementation slightly more complex, but in general all one has to do is derive an address space from the RunBasedAddressSpace and in the constructor populate the self.runs collection. The following shows the simplified implementation ignoring compression.
class RunListAddressSpace(addrspace.RunBasedAddressSpace):
    """An address space which is initialized from a runlist."""

    def __init__(self, run_list, cluster_size=None, size=0, **kwargs):
        super(RunListAddressSpace, self).__init__(**kwargs)
        self.PAGE_SIZE = cluster_size or self.session.cluster_size
        self.compression_unit_size = 16 * self.PAGE_SIZE
        self._end = size

        # In clusters.
        file_offset = 0
        for range_start, range_length in run_list:
            # Without compression every run maps straight onto the disk.
            self._store_run(file_offset, range_start, range_length)
            file_offset += range_length

    def _store_run(self, file_offset, range_start, length):
        """Store a new run with all items given in self.PAGE_SIZE."""
        self.runs.insert(
            [file_offset * self.PAGE_SIZE,
             range_start * self.PAGE_SIZE,
             length * self.PAGE_SIZE])
Once the mapping is defined, the address space takes care of efficiently locating and using the correct run for arbitrary read operations.
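To show what locating the correct run involves, here is a minimal standalone sketch in Python 3. It is not Rekall's RunBasedAddressSpace; RunMap and translate are hypothetical names, and the lookup uses a plain sorted list with bisect. The run list format mirrors the one above: (start cluster, length in clusters) tuples in file order.

```python
import bisect


class RunMap(object):
    """Translate file offsets to disk offsets using a sorted list of runs."""

    def __init__(self, run_list, cluster_size=4096):
        self.runs = []  # (file_offset, disk_offset, length), all in bytes
        file_offset = 0
        for range_start, range_length in run_list:
            self.runs.append((file_offset * cluster_size,
                              range_start * cluster_size,
                              range_length * cluster_size))
            file_offset += range_length
        # Runs are appended in file order, so the start offsets are sorted.
        self._starts = [run[0] for run in self.runs]

    def translate(self, offset):
        """Return (disk_offset, bytes_left_in_run), or None if unmapped."""
        idx = bisect.bisect_right(self._starts, offset) - 1
        if idx < 0:
            return None
        file_start, disk_start, length = self.runs[idx]
        if offset >= file_start + length:
            return None
        delta = offset - file_start
        return disk_start + delta, length - delta


run_map = RunMap([(100, 2), (50, 3)])
print(run_map.translate(8192))  # first byte of the second run
```

A read that crosses a run boundary would call translate() repeatedly, consuming at most bytes_left_in_run bytes each time.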

2.4. Further abstractions

Rekall uses rekall.obj.Struct classes to represent arbitrary structs in memory. There is a mechanism to extend these and provide methods for these structs. The methods can be used to define a kind of API for accessing other data. For example, we can attach convenience methods to an MFT_ENTRY:
class MFT_ENTRY(obj.Struct):

    def attributes(self):
        """Iterate over all attributes, even ones in $ATTRIBUTE_LIST."""
        seen = set()

        for attribute in self._attributes:
            # A type of 0xFFFFFFFF marks the end of the attribute list.
            if attribute.type == 0xFFFFFFFF:
                break

            if attribute in seen:
                continue

            seen.add(attribute)
            yield attribute

            if attribute.type == "$ATTRIBUTE_LIST":
                for sub_attr in attribute.DecodeAttribute():
                    # Attributes stored in this entry were handled above.
                    if sub_attr.mftReference == self.mft_entry:
                        continue

                    result = sub_attr.attribute
                    if result in seen:
                        continue

                    yield result

    def open_file(self):
        """Returns an address space which maps the content of the file's data.

        If this MFT does not contain any $DATA streams, returns a NoneObject().

        The returned address space is formed by joining all $DATA streams' run
        lists in this MFT into a contiguous mapping.
        """

    def list_files(self):
        """List the files contained in this directory.

        Note that any file can contain other files (i.e. be a directory) if it
        has an $I30 stream. That is, directories may also contain data and
        behave as files!

        Returns:
          An iterator over all INDEX_RECORD_ENTRY.
        """
The above is a sample of some of the convenience methods attached to the MFT_ENTRY (the method bodies of open_file() and list_files() are omitted here). The first combines the attributes defined within the MFT entry with those defined inside the $ATTRIBUTE_LIST attribute. Typically an MFT entry will start with some built in attributes until it runs out of room, then it will move some attributes to an $ATTRIBUTE_LIST attribute which is non resident. But this is an implementation detail of the MFT and should really be abstracted.
Similarly we have the list_files() method which simply finds the $INDEX_ROOT and $INDEX_ALLOCATION attributes and enumerates all entries within.
Similarly, file data can be stored in multiple $DATA attributes (with different VCN ranges). It is a bit tedious to combine these $DATA attributes, so we have the open_file() convenience method to return a suitable address space over the file.
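The combining step can be sketched independently of Rekall. The Python 3 example below is only an illustration of the idea, not the body of open_file(): join_data_runs is a hypothetical name, and each $DATA attribute is reduced to a (start_vcn, run_list) pair for a single stream, with run_list holding (start_cluster, length) tuples.

```python
def join_data_runs(data_attributes):
    """Concatenate the run lists of several $DATA attributes in VCN order.

    data_attributes is an iterable of (start_vcn, run_list) pairs that all
    belong to the same stream. Raises ValueError if the VCN ranges leave a
    gap or overlap, since the result would not be a contiguous mapping.
    """
    joined = []
    next_vcn = 0
    for start_vcn, run_list in sorted(data_attributes, key=lambda a: a[0]):
        if start_vcn != next_vcn:
            raise ValueError(
                "expected VCN %d, got %d" % (next_vcn, start_vcn))
        joined.extend(run_list)
        next_vcn += sum(length for _, length in run_list)
    return joined


# Two attributes given out of order: VCNs 16-19 and VCNs 0-15.
print(join_data_runs([(16, [(500, 4)]), (0, [(100, 10), (300, 6)])]))
```

The joined list has the same shape as the run_list argument taken by RunListAddressSpace above, so it could be used to build one address space over the whole file.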

3. Using the NTFS API.

Using these methods, it is easy to use the API to open and read arbitrary MFT entries:
$ rekal -f /tmp/mnt/Partition2
The Rekall Memory Forensic framework 1.2.0 (Col de la Croix).

"We can remember it for you wholesale!"

This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License.

See to get started.
# This gets a reference to the ntfs object which represents the filesystem.
[1] Partition2 19:28:33> ntfs = session.GetParameter("ntfs")

# The NTFS object contains a reference to the MFT
[1] Partition2 19:28:38> mft = ntfs.mft[89035]

# Which is just an array of MFT_ENTRY structs
[1] Partition2 19:28:40> print mft
[MFT_ENTRY Array[89035] ] @ 0x056F2C00
  0x00 magic                    [String:magic]: 'FILE'
  0x04 fixup_offset             [unsigned short:fixup_offset]: 0x00000030
  0x06 fixup_count              [unsigned short:fixup_count]: 0x00000003
  0x08 logfile_sequence_number  [unsigned long long:logfile_sequence_number]: 0x2C601977B
  0x10 sequence_value           [unsigned short:sequence_value]: 0x00000001
  0x12 link_count               [unsigned short:link_count]: 0x00000001
  0x14 attribute_offset         [unsigned short:attribute_offset]: 0x00000038
  0x16 flags                    [Flags:flags]: 0x00000001 (ALLOCATED)
  0x18 mft_entry_size           [unsigned short:mft_entry_size]: 0x00000178
  0x1C mft_entry_allocated      [unsigned short:mft_entry_allocated]: 0x00000400
  0x20 base_record_reference    [unsigned long long:base_record_reference]: 0x00000000
  0x28 next_attribute_id        [unsigned short:next_attribute_id]: 0x00000004
  0x30 fixup_magic              [String:fixup_magic]: '\x0f\x00'
  0x32 fixup_table             <Array 2 x String @ 0x056F2C32>
  0x38 _attributes             <ListArray 0 x NTFS_ATTRIBUTE @ 0x056F2C38>

# We use the convenience method to open the file, returning a suitable address space.
[1] Partition2 19:28:41> fd = mft.open_file()

# We can just read the address space.
[1] Partition2 19:28:45> fd.read(0, 20)
                  Out  > 'Python News\r\n+++++++'
We can also list files in a directory:
[1] Partition2 19:38:00> for record in ntfs.mft[5].list_files():
                    |..>         print record.file.name
Documents and Settings
Program Files
Program Files (x86)
System Volume Information
[1] Partition2 19:38:16> print record
[INDEX_RECORD_ENTRY ListArray[20] ] @ 0x00001890
  0x00 mftReference      [BitField(0-48):mftReference]: 0x0000026B
  0x06 seq_num           [short int:seq_num]: 0x00000001
  0x08 sizeOfIndexEntry  [unsigned short:sizeOfIndexEntry]: 0x00000060
  0x0A filenameOffset    [unsigned short:filenameOffset]: 0x00000050
  0x0C flags             [unsigned int:flags]: 0x00000000
  0x10 file             [FILE_NAME file] @ 0x000018A0

[1] Partition2 19:38:18> print record.file
[FILE_NAME file] @ 0x000018A0
  0x00 mftReference     [BitField(0-48):mftReference]: 0x00000005
  0x06 seq_num          [short int:seq_num]: 0x00000005
  0x08 created          [WinFileTime:created]: 0x4A5BF968 (2009-07-14 03:20:08+0000)
  0x10 file_modified    [WinFileTime:file_modified]: 0x5446EF40 (2014-10-21 23:41:52+0000)
  0x18 mft_modified     [WinFileTime:mft_modified]: 0x5446EF40 (2014-10-21 23:41:52+0000)
  0x20 file_accessed    [WinFileTime:file_accessed]: 0x5446EF40 (2014-10-21 23:41:52+0000)
  0x28 allocated_size   [unsigned long long:allocated_size]: 0x00000000
  0x30 size             [unsigned long long:size]: 0x00000000
  0x38 flags            [Flags:flags]: 0x10000800 ()
  0x3C reparse_value    [unsigned int:reparse_value]: 0x00000000
  0x40 _length_of_name  [byte:_length_of_name]: 0x00000007
  0x41 name_type        [Enumeration:name_type]: 0x00000000 (POSIX)
  0x42 name             [UnicodeString:name]: u'Windows' (Windows)
Note that iterating over the index produces a list of INDEX_RECORD_ENTRY structs, each of which contains a FILE_NAME struct. These FILE_NAME structs come from the directory index and carry the 4 NTFS timestamps quite independently of the timestamps stored in the file's own MFT entry. This can be forensically significant.
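The two printouts above also show how NTFS packs a file reference: a 48 bit mftReference followed by a 16 bit seq_num, 8 bytes in total. As a rough illustration, the Python 3 sketch below splits such a raw value by hand. The name split_file_reference is hypothetical; inside Rekall the struct fields already do this work.

```python
import struct


def split_file_reference(raw):
    """Split an 8 byte NTFS file reference into (mft_entry, sequence)."""
    (value,) = struct.unpack("<Q", raw)
    mft_entry = value & 0xFFFFFFFFFFFF  # low 48 bits
    sequence = value >> 48              # high 16 bits
    return mft_entry, sequence


# The index record printed above: entry 0x26B (619, "Windows"), sequence 1.
raw = struct.pack("<Q", (1 << 48) | 0x26B)
print(split_file_reference(raw))  # (619, 1)
```

Comparing the sequence number in a reference with the sequence_value of the MFT entry it points to is one way to notice that an entry has since been reused for another file.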
The next example shows how to get the $STANDARD_INFORMATION record for a file:
[1] Partition2 19:56:32> x=ntfs.mft[89035].get_attribute("$STANDARD_INFORMATION")
[1] Partition2 19:57:01> print x.DecodeAttribute()
  0x00 create_time         [WinFileTime:create_time]: 0x4F84B1CC (2012-04-10 22:18:52+0000)
  0x08 file_altered_time   [WinFileTime:file_altered_time]: 0x4F84B1CC (2012-04-10 22:18:52+0000)
  0x10 mft_altered_time    [WinFileTime:mft_altered_time]: 0x5123CB64 (2013-02-19 18:58:44+0000)
  0x18 file_accessed_time  [WinFileTime:file_accessed_time]: 0x5123CB64 (2013-02-19 18:58:44+0000)
  0x20 flags               [Flags:flags]: 0x00000820 (ARCHIVE, COMPRESSED)
  0x24 max_versions        [unsigned int:max_versions]: 0x00000000
  0x28 version             [unsigned int:version]: 0x00000000
  0x2C class_id            [unsigned int:class_id]: 0x00000000
  0x30 owner_id            [unsigned int:owner_id]: 0x00000000
  0x34 sid                 [unsigned int:sid]: 0x000002C9
  0x38 quota               [unsigned long long:quota]: 0x00000000
  0x40 usn                 [unsigned int:usn]: 0xC54A40B8
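The timestamp fields above are spaced 8 bytes apart because NTFS stores each one as a 64 bit count of 100 nanosecond intervals since 1601-01-01 UTC. Rekall's WinFileTime type decodes this for us. For reference, a minimal standalone conversion in Python 3 might look like the sketch below, where filetime_to_datetime is a hypothetical helper and not part of Rekall.

```python
import struct
from datetime import datetime, timedelta, timezone

# NTFS timestamps count 100 ns ticks from this epoch.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(raw):
    """Convert an 8 byte little-endian FILETIME into an aware datetime."""
    (ticks,) = struct.unpack("<Q", raw)
    # Ten ticks make one microsecond; sub-microsecond detail is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


# The create_time shown above (0x4F84B1CC seconds since 1970), re-encoded
# as a FILETIME: 11644473600 seconds separate the two epochs.
ticks = (0x4F84B1CC + 11644473600) * 10 ** 7
print(filetime_to_datetime(struct.pack("<Q", ticks)))
```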

4. Conclusions

Although the NTFS support in Rekall is still pretty immature, we want to make it better and more useful. For a relatively complex filesystem such as NTFS, the Rekall implementation is pretty small, coming in at around 1000 lines of code (not including the implementation of lznt1, the NTFS compression algorithm; additional lines are for plugins etc.). It should be possible to support additional filesystems as well. We also want to write more interesting plugins; please let us know any ideas for a good NTFS plugin :-)
Performance is pretty good. One thing you should notice is that Rekall starts up pretty fast since it does not scan the MFT like TSK does. Of course this means that Rekall cannot find orphaned files like TSK does! Rekall also does not keep a cache of the MFT, making it suitable for operating on a changing live filesystem.
Reading compressed files is currently pretty slow since the lznt1 compression algorithm is implemented in pure python. This could be easily accelerated with a C implementation in future.