How Is Data Written To, Stored On, and Erased From Hard Disks?

One of my favorite IT Directors, Buzz Eyler of the Orcutt Unified School District, tells me that most people have no clue how data is stored on a hard drive running Windows. A discussion of how it is written and marked for erasing would help a lot of people understand what's happening under the hood of their computer.

First, a little background: Inside your hard disk is a stack of one or more optically perfect platters where data is stored magnetically. When the drive is originally formatted, it is laid out in a pattern of concentric circles (tracks, which line up across the platters as cylinders) and wedges. Try to imagine a hybrid of a record album and a pizza pie...or a dartboard. However, rather than 8 slices of pizza, or about 80 places big enough to land your dart, there may be hundreds of millions of extremely small Sectors. A Sector is 512 bytes in size - or big enough to hold about 256 characters at 2 bytes per character. Windows chunks these out into Clusters, each of which holds, in this example, 64 Sectors (the exact number depends on the file system and the size of the drive). Every time you create a file, Windows sets aside - allocates - at least one Cluster, and then writes your data to it. Whenever a file exceeds one Cluster in size, the computer allocates another entire Cluster. But even if a file consists of one letter, which is 2 bytes in size, the computer allocates approximately 32,000 (actually 32,768) bytes of space. The file may then be written to only the first 2 bytes of the Cluster, leaving the great majority of the Cluster unchanged, as file slack. The Cluster won't be assigned to another file until the original file is deleted - that is, until the original is sent to the Recycle Bin, and the Recycle Bin is emptied.

But this one Cluster isn't the only place to which your data is written. Furthermore, where, and in how many places, data is written can depend somewhat on the application writing it.

When a file is saved, several attributes are saved with it. One is the date the file was created; one is the date the file was last changed, or modified; one is the date the file was last accessed. This information is kept as part of a file listing called a directory. The user sees this directory as the contents of a folder.

Let us take, for example, Microsoft Word, the leading word processing program for office computers. As soon as the user begins a Word document, an invisible, temporary work file is created (call it Work File A), and parts of the new document get written to the virtual memory file (which in Windows XP is called pagefile.sys). We can call it the VM file. When the user saves the document, a file is created on the hard disk with the name the user gives it; call it User Document. We think we have created one document, but the data we're typing is going into three separate files. If we close the document, Work File A is deleted, but it doesn't go away - more on this later.

Now, suppose that at a later date, we open User Document to make some changes. Unbeknownst to us, a new invisible temporary work file is created, and more data gets written to the VM file. When we print the document, a print buffer file is created. So, in the act of making a document, then opening it later, making a change or two, and printing it, we've created the original User Document, two temporary invisible work files, one print buffer file, and entries in the VM file.

Email and other documents behave in much the same way, although the specifics differ somewhat from program to program. Email issues will have their own article.

When a file is deleted, the file does not simply go away. It remains on the hard disk, its name slightly changed, ignored by the operating system, and invisible to the user, as are the previously deleted Work Files already mentioned, and as is the VM file. The Cluster assigned to the file is deallocated, thereby becoming unallocated space even though it has data sitting in it. Unallocated Clusters can then be assigned to a new file when the need arises. The directory entry that holds the file's name is also made available for reuse, although the name itself is changed by only one character. But until another file is saved to that directory or folder, and saved at that spot in the directory, the file name is not overwritten. Furthermore, if the name of the new file that is written to the same location in the directory is shorter than the original name, only part of the original name is overwritten.

Similarly, when a file is overwritten, much of the previous content of the file may remain intact. If, for instance, a file that took up 4 consecutive Clusters is deleted, and another file that takes up 2 consecutive Clusters overwrites the original file, then half of the original remains, albeit in a raw form, and may be recoverable. Recovering such files and file remnants is an important part of the work a computer forensic examiner performs. When a file is simply deleted, and not overwritten, it is fairly trivial for a computer forensic specialist (or data recovery technician) to recover, or recreate, the file. This kind of recovery is often part of what is known as electronic discovery, or e-discovery.

So, until data is actually overwritten, it is likely to be recoverable, in whole or in part. Furthermore, even if the original file is actually overwritten, it may be possible to search the hard disk for text from the original file, and thereby find complete or partial copies of the file from former, deleted versions of the file, from the aforementioned temporary work files, or from snippets that may remain in the virtual memory file. The result may be a rich lode of data useful for computer forensic analysis, or simply recovered data for the end user.

As end users, we see one file being created when we save it, and we see it go away when we trash it. But behind the scenes, there is a lot more going on. More than the one document we think we've saved is created, and very little goes away when we delete it. While data is not necessarily immortal, we now see that there is typically a lot more lying around after we're done with it than we realize.
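
For readers who like to see the arithmetic, here is a minimal Python sketch of the ideas above: how many Clusters a file is given, how much file slack is left over, and why text from a deleted 4-Cluster file can still be found after a 2-Cluster file is written over its start. It is a toy model only, not how Windows or any real file system works. The ToyDisk class, its methods, and the file names are hypothetical, and 64 Sectors per Cluster is simply the figure used in this article.

```python
import math

SECTOR_BYTES = 512            # sector size used in the article
SECTORS_PER_CLUSTER = 64      # the article's example; real values vary
CLUSTER_BYTES = SECTOR_BYTES * SECTORS_PER_CLUSTER  # 32,768 bytes


def clusters_needed(file_bytes: int) -> int:
    """Whole Clusters allocated to a file (the article's model: at least one)."""
    return max(1, math.ceil(file_bytes / CLUSTER_BYTES))


def file_slack(file_bytes: int) -> int:
    """Allocated bytes that the file's own data does not reach."""
    return clusters_needed(file_bytes) * CLUSTER_BYTES - file_bytes


class ToyDisk:
    """Hypothetical, greatly simplified disk: raw bytes plus a file listing."""

    def __init__(self, cluster_count: int) -> None:
        self.data = bytearray(cluster_count * CLUSTER_BYTES)
        self.files = {}  # name -> (first cluster, size in bytes)

    def write(self, name: str, content: bytes, first_cluster: int) -> None:
        start = first_cluster * CLUSTER_BYTES
        if start + len(content) > len(self.data):
            raise ValueError("content does not fit on the toy disk")
        # Only the bytes the new content covers are changed.
        self.data[start:start + len(content)] = content
        self.files[name] = (first_cluster, len(content))

    def delete(self, name: str) -> None:
        # Only the listing is dropped; the bytes in the Clusters stay put.
        del self.files[name]

    def search(self, needle: bytes) -> list:
        """Raw byte offsets where needle occurs, allocated or not."""
        hits, pos = [], self.data.find(needle)
        while pos != -1:
            hits.append(pos)
            pos = self.data.find(needle, pos + 1)
        return hits


# A one-letter, 2-byte file still occupies a whole Cluster.
print(clusters_needed(2), file_slack(2))  # 1 32766

# A 4-Cluster file with some recognizable text inside its fourth Cluster.
old = bytearray(b"." * (4 * CLUSTER_BYTES))
marker_offset = 3 * CLUSTER_BYTES + 100
old[marker_offset:marker_offset + 6] = b"SECRET"

disk = ToyDisk(cluster_count=8)
disk.write("old.doc", bytes(old), first_cluster=0)
disk.delete("old.doc")

# A 2-Cluster file reuses the same starting Cluster.
disk.write("new.doc", b"n" * (2 * CLUSTER_BYTES), first_cluster=0)

hits = disk.search(b"SECRET")
print(hits)  # [98404]
```

The search finds the text even though old.doc no longer appears in the listing, because the new file only covered the first two of the four Clusters. A real examination works on real file system structures with purpose-built tools, but the principle is the same: whatever has not been overwritten is still there to be read.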