DFF & Open Source Digital Forensics blog

Welcome to the DFF technical blog. The goal of this blog is to share information about DFF and Open Source digital forensics technologies.


  • News, community and future development

    News

    A week ago, DFF 1.1 was released with new features and bug fixes. Among the new features, DFF now supports mailbox reconstruction, based on libpff, the library developed by Joachim Metz. Furthermore, a connector for AFF images has been included, based on Simson L. Garfinkel's library, afflib. Bug fixes mostly concerned parts of the API and the NTFS module.

     

    Community

    When downloading, you can fill in a form and provide suggestions directly. Amazingly (since this form is optional), many people took the time to fill it in, and most of the suggestions were messages of encouragement. Thank you for supporting us, we really appreciate it!

    We also take the opportunity to thank the people who have provided feedback. Developing digital forensics tools requires testing on a plethora of dumps. We have our own corpora, but it is not enough to cover all the weird behaviours... DFF will become more stable and accurate as you report the issues you encounter. If you wish to contribute by testing or by making comparisons with existing tools, feel free to contact us through our mailing lists, IRC channel or email.

    It is also the occasion to thank Cory Altheide and Harlan Carvey for mentioning DFF in their recent book, Digital Forensics with Open Source Tools, and for supporting the Open Source digital forensics community! As Harlan mentioned in his last post, Open Source is important for the sake of the community.

    Concerning source version control, we have decided to improve it by adopting a new branching model based on an article by Vincent Driessen, A successful Git branching model. He has developed a set of Git extensions implementing his branching model, named gitflow, which is, as ever, Open Source ;)

    If some of you have artistic talent, we are looking for someone to design a nice logo for DFF. There is no money to gain, but your name will be added to the list of contributors!

     

    Future development

    Currently, we are working on the next release of DFF. It is scheduled for the end of June and will provide search and filter features. A Proof of Concept is already available through the project's Git source code repository.

    At the moment, a quick search is available to filter the contents of the current folder, along with a dedicated window which recursively looks for items matching the provided filters.

    As of writing, it is possible to apply filters on different attributes such as filenames (fixed or wildcarded strings and regexps), size and timestamps (even those in EXIF metadata or the registry ;). It is also possible to search file contents, either by providing a keyword or a dictionary file. When the feature is released, we will describe more precisely how it works.
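
    To give an idea of what such attribute filters look like, here is a minimal sketch in plain Python. It is not DFF's API: the Entry class, the matches function and the sample entries are hypothetical names made up for illustration, and only a name, a size and one timestamp are modelled.

```python
import fnmatch
import re
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Entry:
    """Hypothetical stand-in for a node: a name, a size and one timestamp."""
    name: str
    size: int
    modified: datetime


def matches(entry, wildcard=None, regexp=None, min_size=None, after=None, before=None):
    """Return True when the entry satisfies every filter that was provided."""
    if wildcard is not None and not fnmatch.fnmatchcase(entry.name, wildcard):
        return False
    if regexp is not None and re.search(regexp, entry.name) is None:
        return False
    if min_size is not None and entry.size < min_size:
        return False
    if after is not None and entry.modified < after:
        return False
    if before is not None and entry.modified > before:
        return False
    return True


# Made-up sample data, only there to exercise the filters.
entries = [
    Entry("report.doc", 48_000, datetime(2008, 10, 12)),
    Entry("IMG_0042.jpg", 1_200_000, datetime(2008, 10, 29)),
    Entry("notes.txt", 300, datetime(2007, 3, 1)),
]

big_pictures = [e.name for e in entries if matches(e, wildcard="*.jpg", min_size=1_000_000)]
october_2008 = [
    e.name
    for e in entries
    if matches(e, after=datetime(2008, 10, 1), before=datetime(2008, 10, 31))
]
```

    Filters that are left out are simply ignored, so several criteria can be combined in one call.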

    If some of you already want to test these features, here is a one-liner:

    git clone git://digital-forensic.org/dff.git dff-1.2beta; cd dff-1.2beta; git fetch origin filters:filters; git checkout filters; mkdir build; cd build; cmake ..; make; ./dff.py -g

    If there is enough demand, we will package a beta version for some platforms.

    • Category: dff
  • Virtual Machine disk forensics with DFF

    For several years, virtualization technologies have become more accessible and are now used in both professional and personal environments. Whether with free, Open Source tools or commercial software, people can create one or many virtual systems within a physical computer in one click. Like a traditional computer, these virtual machines can contain evidence of suspicious activities. That is why it is important, when conducting an investigation, to be able to analyse virtual machines.

    Virtual Disk Image Format

    Different virtual disk image formats exist, each with its own capabilities and specification. The format is interpreted by a virtual machine monitor to behave like a traditional hard disk. Each format is compatible with one or several monitors and has a specific file extension:

    * .vmdk for VMware VMDK,

    * .vhd for Xen and Microsoft Hyper-V,

    * .vdi for Oracle VM VirtualBox, etc.

    In this article, we will cover the VMDK file format. Indeed, since version 1.0, DFF can manage and reconstruct the VMware Workstation disk format.

    Split Vs monolithic images / Flat Vs Sparse memory allocation

    The VMDK specification allows two kinds of images to be created. Monolithic images are stored in a single file containing all virtual disk structures and file system layers, while split images are spread over multiple files, with the size of each chunk defined by the user.

    The specification also allows different storage allocation policies: flat images, which allocate the entire virtual disk when a new VM is created, or sparse images, which allocate virtual disk structures on the fly.

    The configuration of these options is stored in a file descriptor structure, embedded in the binary file for monolithic images or kept in a separate text file for split images.
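
    As an illustration, the sketch below reads the createType field of a descriptor and maps it to the two choices described above. It is not the code of the DFF module. The four createType values are the ones we understand VMware's published virtual disk format specification to define for hosted products, any other value is reported as unknown, and the sample descriptor text is made up.

```python
import re

# createType values for hosted products, as we understand the VMware virtual
# disk format specification. Other values exist and are reported as unknown.
CREATE_TYPES = {
    "monolithicSparse": ("monolithic", "sparse"),
    "monolithicFlat": ("monolithic", "flat"),
    "twoGbMaxExtentSparse": ("split", "sparse"),
    "twoGbMaxExtentFlat": ("split", "flat"),
}


def classify(descriptor_text):
    """Return (layout, allocation) for a descriptor, or None if createType is absent."""
    found = re.search(r'^\s*createType\s*=\s*"([^"]*)"', descriptor_text, re.MULTILINE)
    if found is None:
        return None
    return CREATE_TYPES.get(found.group(1), ("unknown", "unknown"))


# Made-up descriptor header, used only to exercise the function.
sample = """# Disk DescriptorFile
version=1
CID=fffffffe
parentCID=ffffffff
createType="twoGbMaxExtentSparse"
"""

layout, allocation = classify(sample)
```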

    Here we go ...

    First of all, we need to load into DFF's Virtual File System the directory containing all files created and used by the virtual machine. To do so, go to File -> Open Evidence File(s) and select Directory instead of File in the right combobox, then select the dump location by clicking on the '+' button.

    After this step, navigate to the directory you just added. You will see the list of files used by the Virtual Machine Monitor. In our case this is a 2GB split sparse machine, with several snapshots created. VMware Workstation has a predefined way of naming all files. The disk descriptor configuration is stored as a text file in VM_NAME.vmdk. The files containing data are called extents and are named VM_NAME-S00X.vmdk. The snapshots use the same convention, except that they also carry an identifier, e.g. VM_NAME-000000X.vmdk, and VM_NAME-000000X-s00X.vmdk for their extents.

    You can open the file descriptor of a snapshot and see the configuration:

    Each snapshot has a pointer to its parent snapshot, up to the root (called base) of the virtual machine. The DFF module will recursively parse and reconstruct all snapshots. To do that, select the disk descriptor file of the latest created snapshot (VM_NAME-000000X.vmdk) and apply the vmware module: right click on it, then Open with -> Volume -> vmware.
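
    The parent pointer can be followed with very little code. The sketch below is only an approximation of the idea, not the DFF module itself: it assumes text descriptors in which the parent is given by a parentFileNameHint line and extents by lines such as RW <sectors> SPARSE "<file>", and it works on made-up descriptors held in a dictionary instead of real files.

```python
import re

# Extent lines: access mode, size in sectors, extent type, file name.
EXTENT_RE = re.compile(r'^(RW|RDONLY|NOACCESS)\s+(\d+)\s+(\w+)\s+"([^"]+)"', re.MULTILINE)
PARENT_RE = re.compile(r'^parentFileNameHint\s*=\s*"([^"]+)"', re.MULTILINE)


def parse_descriptor(text):
    """Extract the parent descriptor name (or None) and the list of extents."""
    parent = PARENT_RE.search(text)
    extents = [(name, int(sectors)) for _, sectors, _, name in EXTENT_RE.findall(text)]
    return {"parent": parent.group(1) if parent else None, "extents": extents}


def snapshot_chain(descriptors, start):
    """Follow parent pointers from `start` down to the base disk."""
    chain = []
    name = start
    while name is not None:
        if name in chain:
            raise ValueError("cycle in snapshot chain at " + name)
        chain.append(name)
        name = parse_descriptor(descriptors[name])["parent"]
    return chain


# Hypothetical descriptors for a base disk and two snapshots.
descriptors = {
    "vm.vmdk": 'createType="twoGbMaxExtentSparse"\nRW 4192256 SPARSE "vm-s001.vmdk"\n',
    "vm-000001.vmdk": 'parentFileNameHint="vm.vmdk"\nRW 4192256 SPARSE "vm-000001-s001.vmdk"\n',
    "vm-000002.vmdk": 'parentFileNameHint="vm-000001.vmdk"\nRW 4192256 SPARSE "vm-000002-s001.vmdk"\n',
}

chain = snapshot_chain(descriptors, "vm-000002.vmdk")
```

    Starting from the latest snapshot, the chain lists every descriptor that has to be read to rebuild the disk, the base coming last.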

    Once applied, the module will create a new tree from the selected node with two sub-directories. The Baselink directory represents the root disk of the virtual machine, and the Snapshots directory holds the modified disks created incrementally from the base. Navigate through the directories and choose which disk you want to mount. In our case, it is a snapshot.

    Snapshot directories are named according to their ID stored in the disk descriptor. Inside the directory, we find the virtual disk mapped from all extents. We can now access the VirtualHDD like any other disk image.

     

    Once the VirtualHDD is selected, you can see its auto-detected type on the right side, in this case an 'X86 boot sector', and that the partition module is also detected as relevant to apply on this file. So let's double-click on the VirtualHDD node. A window will pop up to ask if you want to apply the partition module. Click on Yes.
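
    For readers curious about what lies behind an x86 boot sector, here is a short sketch that checks the 0x55 0xAA signature and decodes the four primary partition entries of a classic MBR. It is a generic illustration, not the code of DFF's partition module, and the sample sector is built by hand with made-up values.

```python
import struct


def parse_mbr(sector):
    """Parse the four primary partition entries of a 512-byte MBR sector."""
    if len(sector) < 512 or sector[510:512] != b"\x55\xaa":
        raise ValueError("no x86 boot sector signature")
    partitions = []
    for index in range(4):
        entry = sector[446 + 16 * index: 446 + 16 * (index + 1)]
        # Layout: status(1) CHS start(3) type(1) CHS end(3) LBA start(4) sector count(4)
        status, ptype, start_lba, sectors = struct.unpack("<B3xB3xII", entry)
        if ptype != 0:  # type 0 marks an unused entry
            partitions.append({
                "index": index,
                "bootable": status == 0x80,
                "type": ptype,
                "start_lba": start_lba,
                "sectors": sectors,
            })
    return partitions


# Hand-built sector: one bootable entry of type 0x83 (Linux) starting at LBA 2048.
sector = bytearray(512)
sector[446:462] = struct.pack("<B3xB3xII", 0x80, 0x83, 2048, 4190208)
sector[510:512] = b"\x55\xaa"

partitions = parse_mbr(bytes(sector))
```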

    Then double-click on the partition you want to mount. According to the detected file type, the partition is a Linux EXT4 File system:

    Finally, you can browse the virtual machine disk and get the file system metadata (EXTFS 4 metadata on the screenshot).

    Conclusion

    As you can see, the VMDK module in DFF makes it possible to quickly access and mount virtual machine disks. Once this is done, you can proceed with a complete forensic analysis of their data, which can be of great value.

    • Category: dff
  • Time filtering

    During a digital analysis, time information can be of major importance. Filtering on timestamps can help reduce the scope of an analysis and eliminate elements that are not related to the investigation. These temporal data can come from a lot of different sources, such as:

    * File systems timestamps
    * EXIF metadata on some pictures
    * Windows registry
    * etc.

    DFF integrates a graphical module, called timeline, used to select and isolate nodes whose time information falls within a given range. To show how it works, we will use a dump from digital corpora which can be downloaded here (4GB compressed). Uncompressed, this dump contains a 40GB NTFS file system, so we will need to use the NTFS module to parse it. To add a dump into DFF and apply a module on it, you can refer to the DFF documentation. When parsing is finished, its content can be accessed through the DFF interface. Note that we used metaexif as a post-process to gather EXIF metadata from JPEG pictures, and Windows registry to gather data on registry keys.

    We would like to have temporal information on the file system, or part of the file system. The timeline module can help us on that matter.

    We will, in this article, pretend that we are only interested in data of user domex1, so we go into NTFS/Documents and settings/ directory.  Then we launch the timeline by right-clicking on the node domex1 and going into the submenu Open with -> Statistics -> Timeline, as shown below.

    Launch timeline

    The timeline will recursively analyze all temporal information of all nodes, starting from domex1. The following view will be displayed once it is done.

    timeline

    On the right, in the Global tab, appears a summary of the information the module could find. It displays:

    * The number of nodes that have temporal data (2305)
    * The number of computed timestamps (8079)
    * The dates of the first and last timestamps (between 04-08-2004 and 30-10-2008).

    It already gives us hints, which are confirmed by the graphical view: the ordinate axis shows the number of modified nodes and the abscissa axis gives time values. The different colored bars correspond to temporal information. The bigger the bar, the more data. In this example, we can see that most temporal information concerns dates between 15-02-2008 and 30-10-2008 (represented by the colored bars between those two dates), with a peak around October the 30th.
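
    The bars of such a graph boil down to counting timestamps per time bucket. Here is a minimal sketch of that computation. It is an illustration and not the timeline module's code: the histogram function and the sample timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime


def histogram(timestamps, start, end, buckets):
    """Count timestamps per equal-width bucket over the range [start, end]."""
    span = (end - start).total_seconds()
    if span <= 0 or buckets <= 0:
        raise ValueError("need a non-empty range and at least one bucket")
    counts = Counter()
    for stamp in timestamps:
        if start <= stamp <= end:
            position = (stamp - start).total_seconds() / span
            # The very last instant of the range falls into the last bucket.
            counts[min(int(position * buckets), buckets - 1)] += 1
    return [counts[i] for i in range(buckets)]


# Made-up timestamps: six in October 2008 and one far outside the range.
timestamps = [
    datetime(2008, 10, 2), datetime(2008, 10, 5), datetime(2008, 10, 15),
    datetime(2008, 10, 29), datetime(2008, 10, 30), datetime(2008, 10, 31),
    datetime(2007, 3, 1),
]

bars = histogram(timestamps, datetime(2008, 10, 1), datetime(2008, 10, 31), buckets=3)
```

    Calling the same function with a narrower start and end gives a finer view of a smaller period.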

    As we do not know yet which temporal information all these colored bars represent, we will go to the Display tab, shown on the next screenshot.

    In this menu we can see which color corresponds to which type of timestamp. We can also choose to check or uncheck some information, to display or hide it, depending on our needs. For example, if we only need EXIF metadata timestamps, we can uncheck the Accessed, Changed and Modified check boxes. In our example, we will uncheck Accessed and Created, because this information is redundant and already given by the NTFS standard information attributes (in red, green and yellow).

    We also check the two EXIF check boxes (blue and cyan) to include picture EXIF metadata in our time analysis. Note that if you are using or developing another module that generates or works with timestamps, they would appear in the Display tab and you would be able to display the data on the graph.

      display tab

    OK, we have approximate temporal data, but it is not very accurate. We will then click on the Navigation tab. For now, all options on the top-right of the window are disabled.

    We said that most timestamps are between 15-02 and 30-10 of 2008. As we need more details on the events between those two dates, we will zoom in on this part of the graph. To do so, we just have to click on it and draw a box to select the region we are interested in, as represented on the following screenshot.

    select area

    Once we have made the selection, the different options are enabled. Let's click on the Zoom button. The view will be redrawn and will only display the time range we just selected. In our example, the display starts on 7-10-2008 and ends on 30-10-2008.

    zoom

    We now have a more accurate view. If you feel that it still lacks precision, you can zoom again on a smaller time range, as we already did. You can zoom as much as necessary, until you consider it accurate enough. Note that if you want to go back to the original view, you just have to click on the Original size button of the Navigation tab.

    The last step is to find out exactly which data were modified or accessed at what time. We draw another box around the bars located in the time interval we want to analyze, but instead of clicking on the Zoom button, we choose the Export one. This will add a Timeline node at the root of the VFS, as shown on the following picture. This node contains all data whose temporal information matches the time range we defined.
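
    In spirit, the export keeps every node that has at least one timestamp inside the selected interval. The sketch below shows that selection in plain Python. It does not use DFF's API: the export_range function, the timestamp kinds and the paths are hypothetical and only loosely inspired by this example.

```python
from datetime import datetime


def export_range(nodes, start, end, kinds=None):
    """Return the paths of nodes having at least one timestamp in [start, end].

    `nodes` maps a path to a dict of {timestamp kind: datetime}.
    When `kinds` is given, only those timestamp kinds are considered.
    """
    selected = []
    for path, stamps in nodes.items():
        for kind, stamp in stamps.items():
            if kinds is not None and kind not in kinds:
                continue
            if start <= stamp <= end:
                selected.append(path)
                break  # one matching timestamp is enough for this node
    return selected


# Made-up nodes and timestamps.
nodes = {
    "/domex1/My Documents/report.doc": {
        "modified": datetime(2008, 10, 12),
        "accessed": datetime(2008, 10, 29),
    },
    "/domex1/My Pictures/IMG_0042.jpg": {
        "modified": datetime(2008, 3, 2),
        "exif": datetime(2008, 2, 20),
    },
    "/domex1/ntuser.dat": {"modified": datetime(2004, 8, 4)},
}

exported = export_range(nodes, datetime(2008, 10, 7), datetime(2008, 10, 30))
exif_only = export_range(nodes, datetime(2008, 2, 15), datetime(2008, 10, 30), kinds={"exif"})
```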

    results

    In our example, several nodes match, and the analysis can now be focused on them instead of the entire file system.

    To conclude, we have, in a few clicks, extracted the data corresponding to a time range. This functionality is useful when users want to investigate events which occurred between given dates. All data not included in this time range will be hidden, so it can significantly reduce the amount of information to analyze.

  • Virtually concatenate dd split images

    Having to concatenate split dd images is a recurrent task, and DFF provides a way to achieve it easily. The demonstration of this functionality will be based on dumps provided by digitalcorpora.org [1], and more precisely the nps-2009-canon2 image set [2]. There are six RAW split images named nps-2009-canon2-gen*.raw (with * ranging from 1 to 6).

    First of all, you need to add the relevant files to DFF. You can either click on "File --> Open evidence file(s)" or directly on the blue folder icon beneath the "File" menu.

    Then a window will pop up to let you choose which files or folders to add. For split dd images, select the "RAW" checkbox. It is also possible to work with EWF files [3], but it won't be covered in this article.

    By clicking on the green cross, another window will pop up to display the files stored on your local filesystem. Just add all the raw nps-2009-canon2-gen files by using the "shift + left click" binding and click the OK button. Added files will be located in the "Logical files" folder of DFF.

    At this step, the files are loaded within DFF and you can now play with them. Now, let's concatenate the six files into a single one. The architecture of DFF makes it possible to work virtually with all added files but also with created (still virtual) ones, the only limit being imagination. To perform this action:

    1) Select the six files (click the first one then "shift + left click" on the last one).

    2) Right click on one of the selected nodes (all files and folders in DFF are seen as nodes) then "Open With --> Node --> merge"

    A window will pop up to provide the configuration of the "merge" module [4]. As shown in figure 1, the previously selected nodes are automatically assigned to the "files" argument:

    files to concatenate

     

    Then you can provide the desired name of the file resulting from the concatenation by setting the "output" argument. This argument is optional. If it is not provided, the resulting file will be named after the first and last provided file names. For this scenario, it would be "nps-2009-canon2-gen1.raw...nps-2009-canon2-gen6.raw". But let's provide a shorter name:

    output name argument

     

    Finally, there is another optional argument which can be provided: the parent node to which the resulting file is attached. In the following figure, the parent is the "/Logical files" folder, but it can be set to any other existing node.

    parent node

     

    If the "parent" argument is not provided, the default parent will be the first selected node (here, it would be "nps-2009-canon2-gen1.raw"). Yes, in this case the corresponding parent is a file, but this is not a problem for DFF as everything is seen as a node. The file will become a kind of virtual folder whose children you can browse, while still having access to its data, through the hexadecimal viewer for example.

     

    Once you have configured the "merge" module, just click "OK" and the resulting file nps-2009-canon2.raw will be available in the browser, with its attributes set to the nodes used for the concatenation:

    resulting merged file
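
    To make the idea of a virtual concatenation more concrete, here is a small sketch of a read-only view that maps an offset in the merged file to the right segment, without copying any data. It only illustrates the principle and is not the merge module's implementation: the VirtualConcat class is hypothetical and the segments are tiny in-memory buffers.

```python
import bisect
import io


class VirtualConcat:
    """Read-only view presenting several seekable segments as one stream."""

    def __init__(self, segments):
        self.segments = list(segments)
        self.starts = []  # offset of each segment inside the merged view
        total = 0
        for segment in self.segments:
            self.starts.append(total)
            total += segment.seek(0, io.SEEK_END)  # seek returns the segment size
        self.size = total

    def read_at(self, offset, length):
        """Return up to `length` bytes starting at `offset` of the merged view."""
        if offset < 0:
            raise ValueError("offset must not be negative")
        end = min(offset + length, self.size)
        chunks = []
        while offset < end:
            # Last segment whose starting offset is <= offset.
            index = bisect.bisect_right(self.starts, offset) - 1
            segment = self.segments[index]
            segment.seek(offset - self.starts[index])
            data = segment.read(end - offset)
            if not data:
                break
            chunks.append(data)
            offset += len(data)
        return b"".join(chunks)


# Three in-memory buffers standing in for split image files.
merged = VirtualConcat([io.BytesIO(b"AAAA"), io.BytesIO(b"BBBB"), io.BytesIO(b"CC")])
across_boundary = merged.read_at(2, 4)
```

    Files opened in binary mode could be passed in place of the in-memory buffers, since only seek and read are used.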

     

    From this step, you will be able to extract the resulting file and use it with other tools or continue to work with DFF by applying other modules such as "partition" module as suggested in the "relevant module(s)" attribute.
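
    Outside of DFF, the extracted file is equivalent to appending the parts one after the other. For reference, here is a small sketch of that physical concatenation in Python. The extract_merged function and the file names are made up for the example, which runs on small temporary files rather than on the real dump.

```python
import os
import shutil
import tempfile


def extract_merged(parts, destination, chunk_size=1024 * 1024):
    """Append each part, in order, to `destination`, copying chunk by chunk."""
    with open(destination, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out, chunk_size)


# Demonstration on three tiny temporary files.
with tempfile.TemporaryDirectory() as tmp:
    parts = []
    for number, payload in enumerate([b"first-", b"second-", b"third"], start=1):
        path = os.path.join(tmp, "image-gen%d.raw" % number)
        with open(path, "wb") as handle:
            handle.write(payload)
        parts.append(path)

    merged_path = os.path.join(tmp, "image.raw")
    extract_merged(parts, merged_path)
    with open(merged_path, "rb") as handle:
        merged_bytes = handle.read()
```

    The order of the parts matters, so they should be listed from the first segment to the last.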

    As you can see, it is easy to concatenate several files with DFF, whether you are working on Windows or Linux.

    In further articles, we will continue the analysis of this dump. Stay tuned !

     

    [1] http://digitalcorpora.org/

    [2] http://digitalcorpora.org/corp/images/nps/nps-2009-canon2/

    [3] If the EWF checkbox is disabled, it means the module has not been able to find the corresponding library.

    [4] This window is automatically generated based on the configuration provided by the module.

    • Category: dff