One of my favorite IT Directors, Buzz Eyler of the Orcutt Unified School District, tells me that most people have no clue how data is stored on a hard drive running Windows. A discussion of how data is written and marked for erasing would help a lot of people understand what's happening under the hood of their computer.

First, a little background: Inside your hard disk is a stack of one or more optically perfect platters where data is stored magnetically. When the drive is originally formatted, it is laid out in a pattern of concentric circles (tracks, which line up across the platters as cylinders) and wedges. Try to imagine a hybrid of a record album and a pizza pie...or a dartboard. However, rather than 8 slices of pizza, or about 80 places big enough to land your dart, there may be hundreds of millions of extremely small Sectors. A Sector is 512 bytes in size - or big enough to hold about 256 characters at 2 bytes per character. Windows groups these into Clusters, each of which, in this example, holds 64 Sectors (the actual number depends on the file system and the size of the volume). Every time you create a file, Windows sets aside - allocates - at least one Cluster, and then writes your data to it. Whenever a file exceeds one Cluster in size, the computer allocates another entire Cluster. But even if a file consists of one letter, which is 2 bytes in size, the computer allocates approximately 32,000 (actually 32,768) bytes of space. The file may then be written to only the first 2 bytes of the Cluster, leaving the great majority of the Cluster unchanged, as file slack. The Cluster won't be assigned to another file until the original file is deleted - that is, until the original is sent to the Recycle Bin, and the Recycle Bin emptied.

But this one Cluster isn't the only place to which your data is written. Furthermore, where and in how many places data is written depends in part on the application writing it.

When a file is saved, several attributes are saved with it. One is the date the file was created; one is the date the file was last changed, or modified; one is the date the file was last accessed. This information is kept as part of a file listing called a directory. The user sees this directory as the contents of a folder.

Take, for example, Microsoft Word, the leading word processing program for office computers. As soon as the user begins a Word document, an invisible, temporary work file is created (call it Work File A), and parts of the new document get written to the virtual memory file (which in Windows XP is called pagefile.sys). We can call it the VM file. When the user saves the document, a file is created on the hard disk with the name the user gives it; call it User Document. We think we have created one document, but the data we're typing is going into three separate files. If we close the document, Work File A is deleted, but it doesn't go away - more on this later.

Now, suppose that at a later date, we open User Document to make some changes. Unbeknownst to us, a new invisible temporary work file is created, and more data gets written to the VM file. When we print the document, a print buffer file is created. So, in the act of making a document, then opening it later, making a change or two, and printing it, we've created the original User Document, two temporary invisible work files, one print buffer file, and entries in the VM file.

Email and other documents behave in much the same way, although the specifics differ somewhat from program to program. Email issues will have their own article.

When a file is deleted, the file does not simply go away. It remains on the hard disk, its name slightly changed, ignored by the operating system, and invisible to the user, as are the previously deleted Work Files already mentioned, and as is the VM file. The Clusters assigned to the file are deallocated, thereby becoming unallocated space even though they still have data sitting in them. Unallocated Clusters can then be assigned to a new file when the need arises. The directory entry that held the file's name is also made available for reuse, although the file's name is changed by only one character. But until another file is saved to that directory or folder, and saved at that spot in the directory, the file name is not overwritten. Furthermore, if the name of the new file that is written to the same location in the directory is shorter than the original name, only part of the original name is overwritten.

Similarly, when a file is overwritten, much of the previous content of the file may remain intact. If, for instance, a file that took up 4 consecutive Clusters is deleted, and another file that takes up two consecutive Clusters overwrites the original file, then half of that original remains, albeit in a raw form, and may be recoverable. Recovering such files and file remnants is an important part of the work a computer forensic examiner performs. When a file is simply deleted, and not overwritten, it is fairly trivial for a computer forensic specialist (or data recovery technician) to recover, or recreate, the file. Such recovery often plays a part in what is known as electronic discovery, or e-discovery.

So, until data is actually overwritten, it is likely to be recoverable, in all or in part. Furthermore, even if the original file is actually overwritten, it may be possible to search the hard disk for text from the original file, and thereby find complete or partial copies of the file from former, deleted versions of the file, from the aforementioned temporary work files, or from snippets that may remain in the virtual memory file. The result may be a rich lode of data useful to the computer forensics analyst, or simply recovered data for the end user.

As end users, we see one file being created when we save it, and we see it go away when we trash it. But behind the scenes, there is a lot more going on. More than the one document we think we've saved is created, and very little goes away when we delete it. While data is not necessarily immortal, we now see that there is typically a lot more lying around after we're done with it than we realize.
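
For readers who like to see the arithmetic, here is a minimal Python sketch of the Cluster math and the delete-then-partly-overwrite scenario described above. It is a toy model and not how Windows actually manages a disk: the ToyDisk class, its method names, and the "reuse the lowest free Cluster first" policy are all assumptions made up for illustration. The Sector and Cluster sizes are the example figures used in this article.

```python
import math

SECTOR_SIZE = 512          # bytes per Sector, as stated in the article
SECTORS_PER_CLUSTER = 64   # the article's example figure; real values vary
CLUSTER_SIZE = SECTOR_SIZE * SECTORS_PER_CLUSTER  # 32,768 bytes


def clusters_needed(file_size):
    """Whole Clusters allocated for a file of file_size bytes (at least one)."""
    return max(1, math.ceil(file_size / CLUSTER_SIZE))


def slack_bytes(file_size):
    """Allocated bytes that the file's own data does not fill (file slack)."""
    return clusters_needed(file_size) * CLUSTER_SIZE - file_size


class ToyDisk:
    """Hypothetical toy disk: raw bytes plus a table of live files."""

    def __init__(self, cluster_count):
        self.data = bytearray(cluster_count * CLUSTER_SIZE)
        self.directory = {}                   # name -> (cluster list, size)
        self.free = list(range(cluster_count))

    def write(self, name, content):
        n = clusters_needed(len(content))
        if n > len(self.free):
            raise ValueError("disk full")
        # Assumed policy: take the lowest-numbered free Clusters first.
        clusters = [self.free.pop(0) for _ in range(n)]
        for i, c in enumerate(clusters):
            chunk = content[i * CLUSTER_SIZE:(i + 1) * CLUSTER_SIZE]
            start = c * CLUSTER_SIZE
            self.data[start:start + len(chunk)] = chunk
        self.directory[name] = (clusters, len(content))

    def delete(self, name):
        # Deallocate only: the Clusters become free, the bytes stay put.
        clusters, _ = self.directory.pop(name)
        self.free = sorted(clusters + self.free)

    def raw_search(self, needle):
        """Offset of needle anywhere on the disk, or -1 if absent."""
        return self.data.find(needle)


# A 2-byte file still occupies one whole Cluster.
print(clusters_needed(2), slack_bytes(2))     # 1 32766

# A 4-Cluster file is deleted, then a 2-Cluster file reuses its first half.
disk = ToyDisk(cluster_count=8)
original = b"A" * (2 * CLUSTER_SIZE) + b"SECRET" + b"B" * (2 * CLUSTER_SIZE - 6)
disk.write("original.doc", original)
disk.delete("original.doc")
disk.write("new.doc", b"C" * (2 * CLUSTER_SIZE))

print("original.doc" in disk.directory)       # False
print(disk.raw_search(b"SECRET"))             # 65536
```

The deleted file no longer appears in the directory, yet a raw search of the disk still finds text from its second half, which is the same idea a forensic examiner relies on when searching unallocated space.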