Producing electronically stored information (ESI) has unfortunately become one of the most expensive tasks in litigation. Typically, parties comply with obligations to produce ESI by identifying the witnesses most likely to possess relevant documents and files, creating forensic images of their stored ESI from computer hard drives and shared servers, and then using Boolean search terms to identify the electronic documents that may be relevant. Finally, armies of junior associates and contract attorneys review the culled set of documents for responsiveness and privilege. Depending on the number of witnesses and the volume of their files, this process often involves searching hundreds of gigabytes of data, and may cost the producing party tens or even hundreds of thousands of dollars.
Because the costs imposed by this process often outweigh any benefit to the litigants, courts are actively experimenting with ways to limit the scope and expense of e-discovery. For example, the Federal Circuit and the Eastern District of Texas recently issued model discovery orders that seek to reduce e-discovery costs in patent cases by limiting such e-discovery parameters as the number of witnesses whose ESI must be produced, the scope of email production and the number of Boolean search terms that may be used. These model orders are discussed in prior articles in this series.
After the coding parameters are initially determined, the attorneys responsible for producing the documents upload the sample set and its associated coding to the software, which then identifies similar documents across the entire document set and automatically codes them. The responding attorneys then pull 500 documents at random from those that the software has identified as relevant and manually re-code them for relevance, issues and privilege.
The opposing attorneys are again given a chance to evaluate the results; the re-coded 500 documents are then uploaded, and the software re-codes the entire document set based on the refined information from this second pass through a sample set. The parties repeat this process for a total of seven iterations, on the understanding that the coding parameters will be refined through each cycle, resulting in more accurate identification of the most relevant documents. After the final iteration, assuming a high level of accuracy, the responding attorneys manually review only those documents the software has identified as likely to be relevant.
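The iterative train-code-resample loop described above can be sketched in a few lines of Python. This is a simplified illustration, not the actual software used in the case: the toy word-weight classifier, the document corpus, and the `predictive_coding` function are all hypothetical, and the sketch runs fewer iterations and a far smaller sample than the 500-document, seven-round protocol in the text.

```python
import random

def train(labeled):
    """Learn per-word weights from attorney-coded (words, is_relevant) pairs."""
    weights = {}
    for words, relevant in labeled:
        for w in set(words):
            weights[w] = weights.get(w, 0) + (1 if relevant else -1)
    return weights

def score(words, weights):
    """Score a document by summing the learned weights of its words."""
    return sum(weights.get(w, 0) for w in set(words))

def predictive_coding(corpus, truth, seed_ids, rounds=3, sample_size=5):
    """Iteratively refine the coding: train on the coded sample, code the
    full set, have 'attorneys' re-code a random sample of the documents
    predicted relevant, and repeat for a fixed number of rounds."""
    rng = random.Random(0)
    labeled = {i: truth[i] for i in seed_ids}  # initial attorney-coded sample
    predicted = []
    for _ in range(rounds):
        weights = train([(corpus[i], rel) for i, rel in labeled.items()])
        predicted = [i for i in corpus if score(corpus[i], weights) > 0]
        # attorneys manually re-code a random sample of predicted-relevant docs
        for i in rng.sample(predicted, min(sample_size, len(predicted))):
            labeled[i] = truth[i]
    return predicted

# Hypothetical five-document collection: truth holds the (normally unknown)
# correct coding, consulted here only when a document is manually sampled.
corpus = {0: ["merger", "deal"], 1: ["merger", "price"],
          2: ["lunch", "menu"], 3: ["deal", "price"], 4: ["lunch", "recipe"]}
truth = {0: True, 1: True, 2: False, 3: True, 4: False}
result = predictive_coding(corpus, truth, seed_ids=[0, 2])
print(sorted(result))  # documents 0, 1 and 3 are flagged as relevant
```

Only the flagged documents would then go to manual responsiveness and privilege review, which is where the cost savings over full linear review come from.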
Judge Peck also noted problems inherent in the use of keywords to cull non-responsive documents. For example, the parties usually do not know which keywords will appear in the most relevant documents, and search terms may yield an over-inclusive set, forcing the responding attorneys to review a large volume of non-responsive documents manually. Additionally, at least one study found that keyword searching identifies only approximately 20 percent of responsive documents, undermining its efficacy as a discovery tool.
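The 20 percent figure is a measure of recall: the fraction of all truly responsive documents that a search actually retrieves. A minimal illustration, using hypothetical document counts chosen only to match that percentage:

```python
# Recall = responsive documents retrieved / responsive documents in collection.
# Hypothetical figures: 10,000 responsive documents exist in the collection,
# but the keyword search retrieves only 2,000 of them.
responsive_in_collection = 10_000
responsive_retrieved = 2_000
recall = responsive_retrieved / responsive_in_collection
print(f"recall = {recall:.0%}")  # prints "recall = 20%"
```

Low recall means the keyword search silently misses most of the responsive material, even before the separate problem of over-inclusive hits is considered.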
Judge Peck believed that computer-assisted review was the best method of review for the Publicis case because of: