InsideTech » June 2008
A time to reap, a time to cull
Gathering electronic data is no easy task, but the technology exists to make the process a little easier, and a little less costly.
Back in the “old days,” business records and communications were kept in the form of paper. Responding to a discovery request at the direction of in-house or outside counsel was straightforward. Paper files were stored in visibly labeled file cabinets and within those cabinets, in labeled folders. The folders that appeared relevant were piled into boxes and taken for photocopying.
Electronically Stored Information (ESI) is harder to find. It requires a team effort of inside counsel, outside counsel, people in the corporate enterprise with knowledge of the information infrastructure, and outside consultants who understand the infrastructure they’re being shown.
Once found, it must be collected properly. If collected incorrectly, ESI may be inadvertently changed, and subject to the objection of spoliation. Improper collection of electronic data is a common mistake made by companies and their counsel in discovery today.
The “gold standard” of electronic data collection is forensic bit-stream imaging. This makes an exact “bit-by-bit” copy of the target hard drives, whether in servers or in the individual desktops and laptops of the persons identified as the relevant custodians of electronic data. It captures everything on that hard drive: system files, deleted files not yet overwritten, program files, ordinary data files and even Internet browsing history. While not every case requires this form of forensic data collection, many do, and other cases require the advice of outside experts when the collection is performed.
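To make the “bit-by-bit” idea concrete, here is a minimal Python sketch that copies a source byte for byte and checks the copy against a hash of the bytes read. It is an illustration only, not a forensic tool: real imaging relies on dedicated software and hardware write blockers. The function names and the choice of SHA-256 are assumptions made for this example.

```python
import hashlib

CHUNK = 1024 * 1024  # read 1 MB at a time so large sources fit in memory


def image_with_hash(source_path, image_path):
    """Copy source to image byte for byte; return the SHA-256 of the bytes read."""
    digest = hashlib.sha256()
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            chunk = src.read(CHUNK)
            if not chunk:
                break
            dst.write(chunk)
            digest.update(chunk)
    return digest.hexdigest()


def file_hash(path):
    """Return the SHA-256 of a file's full contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(CHUNK), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_image(source_hash, image_path):
    """True when the image on disk matches the hash taken during acquisition."""
    return file_hash(image_path) == source_hash
```

If the two hashes match, the copy is identical to what was read from the source; if even one bit differs, the hashes will not match.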
Data collection without full forensic imaging is traditionally done by a technician going on-site and copying the identified target files from the identified machines, in a manner that is documented and defensible. While it is not as comprehensive as taking a complete bit-by-bit image copy, for cases where that level of authentication is not required, it is commonly done. But it still has to be done right. It also requires a lot of leg work to go from machine to machine to copy this data, almost as time-consuming as making a full bit-stream image.
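One way to picture a “documented and defensible” targeted copy is sketched below in Python: each identified file is copied, hashed before and after, and recorded in a collection log. The log fields, the function names and the CSV format are all hypothetical choices for this example, not an industry standard. Note that merely reading a file can update its last-accessed time on some systems, which is one reason this work is usually left to trained technicians.

```python
import csv
import hashlib
import shutil
from datetime import datetime, timezone
from pathlib import Path

LOG_FIELDS = ["source", "copy", "source_sha256", "copy_sha256",
              "verified", "collected_by", "collected_at_utc"]


def sha256_of(path):
    """Return the SHA-256 of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def collect_files(targets, dest_dir, log_path, collector):
    """Copy each target file, verify the copy by hash, and write a log."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    rows = []
    for number, target in enumerate(targets, start=1):
        target = Path(target)
        source_hash = sha256_of(target)
        # Prefix a sequence number so same-named files do not collide.
        copy_path = dest_dir / f"{number:05d}_{target.name}"
        shutil.copy2(target, copy_path)  # copy2 also carries over timestamps
        copy_hash = sha256_of(copy_path)
        rows.append({
            "source": str(target),
            "copy": str(copy_path),
            "source_sha256": source_hash,
            "copy_sha256": copy_hash,
            "verified": source_hash == copy_hash,
            "collected_by": collector,
            "collected_at_utc": datetime.now(timezone.utc).isoformat(),
        })
    with open(log_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=LOG_FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

The resulting log gives counsel a record of what was taken, from where, by whom and when, along with evidence that each copy matches its source.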
Recently, the industry has seen the development of “appliances” (hardware with preloaded software) that “crawl” the corporate network, locating every server, PC or other device attached to the network and “seeing” what is on each one. These devices can then collect the data from whatever custodians, machines, servers, folders, file shares, and file types are designated, copying it to their own hard drives, which can range in size from 750 GB to 3 TB. These appliances are connected to the network “behind the firewall” within the corporate enterprise. They allow collection of data on a desktop in Des Moines by a technician working from Dallas. This is not forensic imaging of a hard drive, but it is an efficient way to copy active files. It is also an early opportunity to make a broad-brush reduction of the data, as you can set the copy parameters by file date, file type, etc. Some of these devices also permit keyword filtering at the time of data collection. A word of caution: the time of collection may be too early in the development of the case to have a good handle on what the keyword searches ought to be.
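The broad-brush reduction by file date and file type can be sketched in a few lines of Python. This is not how any particular appliance works; the function name and parameters are invented for illustration, and the example walks a single folder tree rather than a whole network.

```python
from datetime import datetime
from pathlib import Path


def select_files(root, extensions, modified_after, modified_before):
    """Return files under root whose type and last-modified date are in scope."""
    wanted = {ext.lower() for ext in extensions}
    selected = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        if path.suffix.lower() not in wanted:
            continue  # file type is out of scope
        modified = datetime.fromtimestamp(path.stat().st_mtime)
        if modified_after <= modified <= modified_before:
            selected.append(path)
    return sorted(selected)
```

For example, `select_files("/shares/finance", [".doc", ".xls"], datetime(2006, 1, 1), datetime(2007, 12, 31))` would return only Word and Excel files last modified in 2006 or 2007 (the share path here is made up). Filters like these are far easier to defend than early keyword filters, because they do not depend on knowing the issues in the case.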
After the data is collected, the traditional next step in electronic discovery is called “processing.” Electronic data comes in many different formats and types: word processing documents, spreadsheets, e-mails, presentations, databases. If you tried to review electronic data using the native applications in which it was created, you would have two problems. First, the mere act of opening a file in its native application will change some of its properties; second, it is unlikely you will have all the appropriate applications and versions on your system to permit this. Therefore, you have to “process” that electronic data.
Processing electronic data involves turning each e-mail or document into a common format regardless of its original source application. That common format then permits the document to be loaded into the review software without causing any changes to the document itself so it retains its authenticity.
TIFF conversion was the standard for electronic discovery data processing for several years, and still is appropriate in many cases. However, native file review, without conversion to TIFF, is becoming more commonplace.
Processing native files for review involves three steps. First, a copy is taken of each native file. Second, each e-mail is separated from its attachments, while the parent-attachment relationship between them is maintained. Third, those native files, e-mails and attachments are converted into a format in which, while they remain “native,” they can be viewed and annotated without changing the file itself.
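The parent-attachment relationship is the piece most easily lost, so a small Python sketch may help. It uses the standard library's e-mail parser to split one message into separate records, each attachment pointing back to its parent. The record layout and the document-numbering scheme are hypothetical; commercial processing tools use their own.

```python
from email import message_from_bytes, policy


def split_message(raw_bytes, parent_id):
    """Split one raw e-mail into a parent record plus one record per attachment."""
    msg = message_from_bytes(raw_bytes, policy=policy.default)
    body = msg.get_body(preferencelist=("plain", "html"))
    records = [{
        "doc_id": parent_id,
        "parent_id": None,  # the e-mail itself has no parent
        "kind": "email",
        "name": str(msg["Subject"] or ""),
        "content": body.get_content() if body is not None else "",
    }]
    for number, part in enumerate(msg.iter_attachments(), start=1):
        records.append({
            "doc_id": f"{parent_id}.{number:03d}",
            "parent_id": parent_id,  # link back to the parent e-mail
            "kind": "attachment",
            "name": part.get_filename(),
            "content": part.get_content(),
        })
    return records
```

Because every attachment record carries its parent's identifier, a reviewer who finds a relevant spreadsheet can always get back to the e-mail that sent it, and vice versa.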



