MattockFS; Computer-Forensics File-System : Part Three


This post is the third of an eight-part series regarding the MattockFS Computer-Forensics File-System. This series of posts is based on the MattockFS workshop that I gave at the Digital Forensics Research Workshop in Überlingen, Germany earlier this year.

In Part One we discussed the concepts of asynchronous processing and the tool-chain approach and how these two concepts do and don't work well together within the context of medium to large scale computer forensic data processing. In Part Two we looked at the integrity requirements for computer forensic data processing in the light of anti-forensics and at the role that capability based security could potentially play in achieving desired integrity properties.

In this installment of this series, I want to look at two important ancestor user-space file-systems that MattockFS inherits and combines important features from. These two ancestors are the zero-storage carving facilitating file-system CarvFS and the least authority file-system MinorFS.

Before we zoom into the user-space file-systems, a tiny bit of history showing all four major influences on the MattockFS design. On the forensic framework side of things, we identify two important influences, both related to the Open Computer Forensics Architecture (OCFA), a now deprecated computer forensic framework built by the Dutch National Police. From its inception in 2002, OCFA contained a message-bus solution named the Anycast relay, which we discussed in Part One of this series. While the initial version of OCFA used a Content Addressed Storage (CAS) based repository, issues with large scale carving requirements later led to the development of the CarvFS file-system and its (partial) integration into the base OCFA setup.

On the other side of the MattockFS family tree, we see two other important ancestors with respect to system and forensic process integrity. In 2006, the same year that CarvFS was released, Schatz and Clark published a paper in which they argued for the use of a concept called Sealed Digital Evidence Bags in the acquisition of digital forensic evidence.

A few years later, in 2008, my pet project MinorFS showed how sparse capabilities, in combination with Linux mandatory access controls as provided by SELinux or AppArmor, can be used as essential building blocks for high-integrity system designs on that platform.

In this blog post we shall be looking at the two user-space file-systems and at the properties they provided that are of interest to the MattockFS design as a high-integrity, IO-efficient component for scalable computer forensic data processing solutions.

Before we look at the two individual user-space file-systems, we need to quickly look at what a user-space file-system is and how it is implemented in FUSE on Linux, BSD and OSX. The FUSE subsystem consists of a library and a kernel module. By writing a user-space program against the FUSE library and running that program as a regular user, we create what is called a user-space file-system. At process start, the started process is bound to a newly created mount point. If some process on the system accesses the file-system under that mount point, file-system system calls will go to the kernel just as for regular file-systems, but instead of getting handled by a kernel-space file-system, the FUSE kernel module communicates with the FUSE library linked into the running user-space process, and the appropriate callbacks are invoked within the user-space process.

If we zoom out a bit and look at how user-space file-systems can be used, and are used in both MattockFS and CarvFS, we arrive at the base concept of an overlay user-space file-system. The actual data resides on an actual file-system. When a computer forensic module accesses that data through the user-space file-system, the user-space file-system, running under a different user id, provides important abstractions and, next to those, essential privilege separation possibilities.

Now for the first MattockFS ancestor: CarvFS. CarvFS was conceived to solve major data-repository storage requirement problems arising from file-carving based data explosion. When processing a disk image, many files can often be found using file-system analysis. In a forensic investigation, however, much evidence can often be found in the unallocated regions of the file-system or, in some cases, in disk image partitions containing a file-system that is not yet supported by the forensic file-system processing modules. For such data regions, one way to acquire potential evidence is the use of file-carving. There are different strategies possible for carving files, including header and footer detection, block entropy and serial correlation based block matching, and much more that is the subject of active research. One important factor in file-carving, however, is that it is not an exact science: you either accept a lot of false positives or a lot of false negatives. This goes for file headers, but even more so for block structure and file footers. With carving rules for many potential file-types, there are often many false positives and, lacking a better word, false negatives for file ends. As it is often preferable to have some trailing garbage in a carved file over having a prematurely truncated file, many carved files are significantly larger than the actual files were. Combining these aspects, it is not unusual for 100 GB of unallocated, previously used disk space to result in one or even multiple TB of carved files.

The CarvFS file-system is used to mount a disk image in such a way that carved files do not actually need to be copied out in order to use them in consecutive forensic processing modules that expect files of a given type. The way this is done is by using a file-naming convention that allows for designating the carved files using a combination of numeric offsets and (fragment) sizes.
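The zero-storage idea can be sketched in a few lines of Python. This is not CarvFS code. The function names `read_carved` and `make_demo_image` are hypothetical, and a carved file is simply assumed to be described by a list of `(offset, size)` fragments into the disk image. Reads on the carved file are translated into reads on the image, so nothing gets copied out.

```python
import os
import tempfile


def read_carved(image_path, fragments, offset=0, size=None):
    """Read from a virtual carved file made of (offset, size) image fragments."""
    total = sum(frag_size for _, frag_size in fragments)
    if size is None:
        size = total - offset
    size = max(0, min(size, total - offset))
    out = bytearray()
    pos = 0  # where the current fragment starts inside the virtual file
    with open(image_path, "rb") as image:
        for frag_offset, frag_size in fragments:
            start = max(offset, pos)
            end = min(offset + size, pos + frag_size)
            if start < end:
                # Translate the virtual position to a position in the image.
                image.seek(frag_offset + (start - pos))
                out += image.read(end - start)
            pos += frag_size
    return bytes(out)


def make_demo_image():
    """Create a tiny stand-in 'disk image' holding the byte values 0..255."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as handle:
        handle.write(bytes(range(256)))
    return path
```

With the demo image, `read_carved(path, [(10, 5), (100, 3)])` yields the five bytes starting at offset 10 followed by the three bytes starting at offset 100, read directly from the image.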

CarvPath designations are file-path like designations that can designate data entities in a nested way. For example, a carved ISO image could contain a large mailbox that in turn could contain individual emails that might all be designated with carvpath tokens relative to their parent. This nesting basically maps to the idea of directories in a regular file-system. On a single level of nesting, a carvpath would normally consist of a series of carvpath fragments, with each fragment consisting of a simple offset and a size. As Linux has a maximum size for file-system entity names, it is possible for a highly fragmented file to have a carvpath token name exceeding this maximum size. In that case a database-stored hash-to-path mapping is used. The final token in a carvpath always has a file extension to designate it as a file and not a nesting level.
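The fallback for over-long tokens could look roughly like the sketch below. The details are assumptions made for illustration and not taken from CarvFS itself: a plain `dict` stands in for the database, SHA-256 is picked arbitrarily as the hash, the `D` prefix is just a marker chosen here, and 255 is used as the typical Linux limit for a single file name. File extensions are left out to keep the sketch short.

```python
import hashlib

NAME_MAX = 255  # typical maximum length of one file name on Linux


def shorten_token(token, store, name_max=NAME_MAX):
    """Return the token itself, or a digest stand-in if the token is too long."""
    if len(token) <= name_max:
        return token
    digest = "D" + hashlib.sha256(token.encode("ascii")).hexdigest()
    store[digest] = token  # remember the mapping so it can be reversed later
    return digest


def expand_token(token, store):
    """Map a digest stand-in back to the full token; pass other tokens through."""
    return store.get(token, token)
```

A highly fragmented entity, say `"_".join(f"{i * 10}+5" for i in range(100))`, is far longer than 255 characters, so `shorten_token` returns a 65 character digest name that `expand_token` can map back to the original token.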

Here are some examples of carvpath designations as used by CarvFS: a single fragment, a two-fragment entity, and, as the third one, an entity with a huge sparse section in the middle. Such sparse fragment designations can be useful when extracting carvpath designations using regular file-system entities on a processed file-system. The fourth example shows a nested carvpath. Finally, we see an example of the usage of a digest as a replacement for a carvpath token that would be too long for the Linux file-system APIs to work with.
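To give a feel for how such designations could be resolved, here is a small sketch. The textual notation used in it is an assumption made for this illustration and may differ from what CarvFS actually uses: fragments are written as `offset+size`, joined by `_`, a sparse section is written as `S` followed by its size, and nesting levels are separated by `/`. The function names are hypothetical as well.

```python
def parse_token(token):
    """Parse one nesting level, e.g. '0+10_S5_100+10', into fragments."""
    fragments = []
    for part in token.split("_"):
        if part.startswith("S"):
            fragments.append(("sparse", None, int(part[1:])))
        else:
            offset, size = part.split("+")
            fragments.append(("data", int(offset), int(size)))
    return fragments


def slice_entity(parent, offset, size):
    """Return the parent's fragments that cover [offset, offset + size)."""
    out = []
    pos = 0  # start of the current fragment within the parent entity
    for kind, frag_offset, frag_size in parent:
        start = max(offset, pos)
        end = min(offset + size, pos + frag_size)
        if start < end:
            if kind == "sparse":
                out.append(("sparse", None, end - start))
            else:
                out.append(("data", frag_offset + (start - pos), end - start))
        pos += frag_size
    return out


def resolve(carvpath):
    """Resolve a nested carvpath to fragments relative to the top-level image."""
    levels = carvpath.split("/")
    levels[-1] = levels[-1].rsplit(".", 1)[0]  # drop the file extension
    current = parse_token(levels[0])
    for token in levels[1:]:
        resolved = []
        for kind, offset, size in parse_token(token):
            if kind == "sparse":
                resolved.append(("sparse", None, size))
            else:
                resolved.extend(slice_entity(current, offset, size))
        current = resolved
    return current
```

Under these assumptions, `resolve("1000+500/100+50.eml")` designates 50 bytes at offset 100 inside a 500 byte parent that itself starts at offset 1000 in the image, so it resolves to a single data fragment at image offset 1100.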

While CarvFS is pretty good for basic carving usage, its use in large scale investigations using a framework such as OCFA turned out to have some limitations. CarvFS is a simple file-system providing read-only access to a single underlying disk image. In a real investigation, this might result in hundreds of simultaneously mounted user-space file-systems. To integrate CarvFS in OCFA, a number of hacks were used to make things fit. These hacks were sub-optimal both from an access control and from a performance perspective.

The other user-space file-system that is an important ancestor of MattockFS is MinorFS. MinorFS is a small set of cooperating user-space file-systems. CapFS is the core sparse-capability based file-system that provides a decomposable file-system tree. ViewFS runs on top of CapFS and provides processes with a private $HOME and $TEMP directory.

The core file-system CapFS is based on the idea of a Directed Acyclic Graph (DAG). Unlike a normal file-system tree, a DAG is directed, and in this case it forms a directed, decomposable authority tree. In such a tree there is no place for the normal double dot directory entry. That is, for MinorFS, '..' is considered evil.
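The following sketch illustrates why '..' has no place in such a design. It is not how CapFS is implemented: the `CapStore` class is a hypothetical in-memory model in which random tokens from Python's `secrets` module play the role of sparse capabilities. Holding a token grants access to a node and everything below it, and there is deliberately no way to walk upward.

```python
import secrets


class CapStore:
    """In-memory model of a downward-only authority tree."""

    def __init__(self):
        self._nodes = {}  # capability token -> node

    def _new_node(self, content=None):
        cap = secrets.token_hex(32)  # unguessable token acting as capability
        self._nodes[cap] = {"children": {}, "content": content}
        return cap

    def new_root(self):
        return self._new_node()

    def create(self, parent_cap, name, content=None):
        """Create a child under a node we hold a capability for."""
        child_cap = self._new_node(content)
        self._nodes[parent_cap]["children"][name] = child_cap
        return child_cap

    def lookup(self, cap, path):
        """Resolve a relative path downward, starting from a held capability."""
        for name in path.split("/"):
            if name in ("", "."):
                continue
            if name == "..":
                raise PermissionError("'..' is refused: authority only flows down")
            cap = self._nodes[cap]["children"][name]
        return cap
```

A process that is handed only the capability for a sub-tree can reach everything beneath it with `lookup`, while nothing in the store lets it discover the parent. That is the decomposition property the text refers to.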

MinorFS by itself has limited value. The real value of MinorFS comes out when the sparse capability based system is combined with the mandatory access controls of AppArmor and, if possible, on top of that a capability based programming language, such as the E language, that has support for the concept of persistent processes.

Summarizing, where MinorFS showed the value of sparse capabilities in the implementation of high-integrity systems that run on the Linux platform, CarvFS showed the strength of the concept of carvpath annotations in a user-space file-system. Both aspects are very important within large scale semi-automated processing of digital forensic data.

In the next installment of this series we shall look at how these aspects and others come together in the MattockFS system design.
