4.1.1 DATA ABSTRACTION LAYERS
Computer systems organize raw storage in successive layers of abstraction—each software layer (some may be in firmware) builds incrementally more abstract data representations dependent only on the interface provided by the layer immediately below it. Accordingly, forensic storage analysis can be performed at several levels of abstraction:
Physical media. At the lowest level, every storage device encodes a sequence of bits and it is, in principle, possible to use a custom mechanism to extract the data bit by bit. In practice, this is rarely done, as it is an expensive and time-consuming process. One example is second-generation mobile phones, for which it is feasible to physically remove (desolder) the memory chips and acquire their content [194]. Thus, the lowest level at which most practical examinations are performed is the host bus adapter (HBA) interface. Adapters implement a standard protocol (SATA, SCSI, etc.) through which they can be made to perform low-level operations. For damaged hard drives, it is often possible to perform at least partial forensic repair and data recovery [123]. In all cases, the goal of the process is to obtain a copy of the data in the storage device for further analysis.
Block device. The typical HBA presents a block device abstraction—the medium is presented as a sequence of fixed-size blocks, commonly of 512 or 4,096 bytes, and the contents of each block can be read or written using block read/write commands. The media can be divided into partitions, or multiple media may be presented as a single logical entity (e.g., RAIDs). The typical data acquisition process works at the block device level to obtain a working copy of the forensic target—a process known as imaging—on which all further processing is performed.
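The block device abstraction described above can be illustrated with a minimal Python sketch. It assumes the device (or an image of it) is available as a seekable binary stream and uses a 512-byte block size; the names read_block and block_count are hypothetical helpers introduced here for illustration, not part of any forensic tool.

```python
import io

BLOCK_SIZE = 512  # assumed block size; 4,096 bytes is also common


def read_block(device, index, block_size=BLOCK_SIZE):
    """Return the contents of block `index` from a seekable binary stream."""
    device.seek(index * block_size)
    data = device.read(block_size)
    if len(data) != block_size:
        raise ValueError(f"block {index} is beyond the end of the device")
    return data


def block_count(device, block_size=BLOCK_SIZE):
    """Return the number of whole blocks the stream contains."""
    device.seek(0, io.SEEK_END)
    return device.tell() // block_size


# Stand-in for a real device or image file: four blocks filled with 0, 1, 2, 3.
image = io.BytesIO(b"".join(bytes([i]) * BLOCK_SIZE for i in range(4)))

print(block_count(image))        # number of addressable blocks
print(read_block(image, 2)[:8])  # first bytes of block 2
```

Note that nothing at this level says whether a block belongs to a file; the interface only maps a block number to a fixed-size run of bytes.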
Filesystem. The block device has no notion of files, directories, or—in most cases—which blocks are considered used and which ones are free; it is the filesystem’s task to organize the block storage into file-based storage in which applications can create files and directories with all of their relevant attributes—name, size, owner, timestamps, access permissions, and others. For that purpose, the filesystem maintains metadata, in addition to the contents of user files.
Application artifacts. User applications use the filesystem to store various artifacts that are of value to the end-user—documents, images, messages, etc. The operating system itself also uses the file system to store its own image—executable binaries, libraries, configuration and log files, registry entries—and to install applications. Some application artifacts, such as compound documents, can have a complex internal structure integrating multiple data objects of different types.
Analysis of application artifacts tends to yield the most immediately relevant results as the recorded information most directly relates to actions and communications initiated by humans. As the analysis goes deeper (to a lower level of abstraction), it requires greater effort to independently reconstruct the actions of the system. For example, by understanding the on-disk structures of a specific filesystem, a tool can reconstruct a file out of its constituent blocks. Such knowledge is particularly costly to obtain from a closed system, such as Microsoft Windows, because of the substantial amount of blackbox reverse engineering effort involved.
Despite the cost, independent forensic reconstruction is of critical importance for several reasons:
(a) it enables the recovery of evidentiary data that is not available through the normal data access interface;
(b) it forms the basis for recovering partially overwritten data; and
(c) it allows the discovery and analysis of malware agents that have subverted the normal functioning of the system, making data obtained via the regular interface untrustworthy.
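The idea of reconstructing a file out of its constituent blocks can be sketched as follows. This is a simplified illustration, not the layout of any real filesystem: it assumes the list of block numbers and the file size have already been recovered from filesystem metadata, and the function name reconstruct_file and the demo image are hypothetical.

```python
import io

BLOCK_SIZE = 512  # assumed block size for this sketch


def reconstruct_file(image, block_numbers, file_size, block_size=BLOCK_SIZE):
    """Concatenate the listed blocks in order and trim to the file size."""
    if file_size > len(block_numbers) * block_size:
        raise ValueError("file size exceeds the space in the listed blocks")
    data = bytearray()
    for number in block_numbers:
        image.seek(number * block_size)
        chunk = image.read(block_size)
        if len(chunk) != block_size:
            raise ValueError(f"block {number} lies outside the image")
        data += chunk
    # The last block is usually only partly used; drop the unused tail.
    return bytes(data[:file_size])


# Build an 8-block demo image holding a 900-byte file in blocks 5 and 2.
content = b"evidence " * 100
raw = bytearray(8 * BLOCK_SIZE)
raw[5 * BLOCK_SIZE:6 * BLOCK_SIZE] = content[:BLOCK_SIZE]
tail = content[BLOCK_SIZE:]
raw[2 * BLOCK_SIZE:2 * BLOCK_SIZE + len(tail)] = tail
image = io.BytesIO(bytes(raw))

recovered = reconstruct_file(image, [5, 2], len(content))
print(recovered == content)
```

The blocks are deliberately out of order and non-contiguous: the block device gives no hint of this arrangement, so the ordering has to come from the filesystem's own metadata.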
4.1.2 DATA ACQUISITION
In line with best practices [100], analysis of data at rest is not carried out on a live system. The target machine is powered down, an exact bit-wise copy of the storage media is created, the original is stored in an evidence locker, and all forensic work is performed on the copy. There are exceptions to this workflow in cases where it is not practical to shut down the target system and, therefore, a media image is obtained while the system is live. Evidently, such an approach does not provide the same level of consistency guarantees, but can still yield valuable insight. The issue of consistency does not exist in virtualized environments, where a consistent image of the virtual disk can be trivially obtained by using the built-in snapshot mechanism.
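A bit-wise copy of this kind can be sketched in a few lines of Python. The sketch assumes the source is readable as a binary stream and uses a SHA-256 digest to check that the working copy matches what was read; the text above does not prescribe a particular hash or tool, and the names acquire_image and hash_stream are hypothetical.

```python
import hashlib
import io


def acquire_image(source, target, chunk_size=1024 * 1024):
    """Copy source to target chunk by chunk; return the SHA-256 of the data read."""
    digest = hashlib.sha256()
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        target.write(chunk)
        digest.update(chunk)
    return digest.hexdigest()


def hash_stream(stream, chunk_size=1024 * 1024):
    """Return the SHA-256 of a seekable stream, read from its beginning."""
    stream.seek(0)
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


# In-memory stand-ins for the source device and the image file.
source = io.BytesIO(bytes(range(256)) * 64)
working_copy = io.BytesIO()

acquisition_hash = acquire_image(source, working_copy)
print(hash_stream(working_copy) == acquisition_hash)
```

Recomputing the hash of the working copy at any later point and comparing it with the value recorded at acquisition time shows whether the copy has changed since it was made.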
As already discussed, obtaining the data from the lowest-level system interface available, and independently reconstructing higher-level artifacts, is considered the most reliable approach to forensic analysis. This results in a strong preference for acquiring data at lower levels of abstraction, and gives rise to the concepts of physical and logical acquisition.
Physical data acquisition is the process of obtaining the data directly from the hardware media, without the mediation of any (untrusted) third-party software.
An example of this approach is Willassen’s discussion [194] of cell phone data acquisition, which relies on removing the physical memory chip and reading the data directly from it. More generally, getting physical with the evidence source is usually most practical, and most necessary, for low-end embedded systems with limited hardware capabilities.
For general-purpose systems, tools use an HBA protocol, such as SATA or SCSI, to interrogate the storage device and obtain a copy of the data. The resulting image is a block-level copy of the target, and most investigators refer to the process as physical acquisition; Casey uses the more accurate term pseudo-physical to account for the fact that not all areas of the physical media are acquired.
It is worth noting that modern storage controllers are quickly evolving into autonomous storage devices, which implement complex (proprietary) wear-leveling and load-balancing algorithms. This has two major implications: (a) the numbering of data blocks becomes decoupled from actual physical location; and (b) it is increasingly possible that the storage controller itself becomes compromised [196], rendering the acquisition process untrustworthy. These caveats notwithstanding, we will refer to block-level acquisition as physical, in line with accepted terminology.
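The first implication, the decoupling of block numbers from physical location, can be illustrated with a toy model. This is a deliberately simplified sketch and not how any real controller works: the class ToyFlashController is hypothetical, it always writes to the next free physical block, and it models neither garbage collection nor wear statistics.

```python
class ToyFlashController:
    """Toy controller that remaps logical block numbers on every write."""

    def __init__(self, physical_blocks):
        self.physical = [None] * physical_blocks  # contents of the raw medium
        self.mapping = {}                         # logical number -> physical index
        self.next_free = 0

    def write(self, logical, data):
        if self.next_free >= len(self.physical):
            raise RuntimeError("no free blocks (garbage collection not modeled)")
        # Never overwrite in place: put the data in a fresh physical block.
        self.physical[self.next_free] = data
        self.mapping[logical] = self.next_free
        self.next_free += 1

    def read(self, logical):
        index = self.mapping.get(logical)
        return None if index is None else self.physical[index]


controller = ToyFlashController(physical_blocks=4)
controller.write(0, b"old contents")
controller.write(0, b"new contents")

print(controller.read(0))      # what block-level acquisition sees for block 0
print(controller.mapping[0])   # physical location now backing logical block 0
print(controller.physical[0])  # stale data still present on the medium
```

In this model, the earlier contents of logical block 0 remain on the medium but are unreachable through the block interface, which is one way to picture why block-level acquisition does not cover every area of the physical media.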
Logical data acquisition relies on one or more software layers as intermediaries to acquire the data from the storage device.
In other words, the tool uses an API, or a message protocol, to perform the task. The integrity of this method hinges on the correctness and the integrity of the implementation of the API, or protocol. In addition to the risk, however, there is also a reward—higher-level interfaces present a data view that is closer in abstraction to that of user actions and application data structures. Experienced investigators (equipped with the proper tools) make use of both physical and logical views to obtain and verify the evidence relevant to the case.
HBA firmware compromises. It is important to realize that, although not trivial to execute, attacks on the integrity of the disk controller have been shown to be entirely feasible. For example, early experiments by Goodspeed [77] showed how an iPod can be customized (with relative ease) to detect the read patterns of an acquisition process, and to react by hiding and destroying the data on the fly.
In follow-up work, Zaddach et al. [196] reverse engineered a Seagate disk controller from scratch and installed a backdoor, allowing a remote attacker to “mount” the disk and examine its content. The attack requires no compromise of the software stack above the firmware.