The task of collecting, identifying and properly using key documents is the foundation of a successful internal investigation. Yet, this task continuously grows more complicated and costly given the swelling volume of data generated in the ordinary course of business. The pressure on counsel to reduce costs is ever-present, but may be particularly acute in the context of internal investigations, where there is likely no financial upside at an investigation’s conclusion. Fortunately, emerging technologies in the field of electronic discovery now enable lawyers to more quickly and accurately identify the most important documents using a comprehensive and defensible process that is substantially less expensive and potentially more effective than commonly used alternatives.
One technology that is growing in acceptance, and that may be particularly well suited to internal investigations, is predictive coding, a tool that falls under the umbrella of “technology-assisted review.” Predictive coding combines human guidance with computer-piloted concept searching in order to “train” document review software to recognize relevant documents within a document universe.
Training begins with a seed set: a sample of documents that reviewers code manually as relevant or irrelevant. The search software then analyzes the documents determined to be relevant and searches for similar documents throughout the document universe. To do so, the software looks at broad patterns of language to determine what the relevant documents have in common conceptually. The software then creates conceptual profiles of both relevant and irrelevant documents, applies these profiles to the rest of the documents in the universe and designates each remaining document as presumptively relevant or presumptively irrelevant.
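For readers who want a concrete picture of the profiling step, the following is a minimal sketch, not a description of how any commercial review platform works. It assumes that a "conceptual profile" can be approximated by simple word counts and that similarity can be measured with cosine similarity; real tools use far richer language models. The function names (build_profile, designate) and the sample documents are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def build_profile(documents):
    """Combine the word counts of a group of documents into one profile."""
    profile = Counter()
    for doc in documents:
        profile.update(tokenize(doc))
    return profile


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def designate(document, relevant_profile, irrelevant_profile):
    """Label a document by whichever profile it resembles more closely."""
    words = Counter(tokenize(document))
    if cosine(words, relevant_profile) > cosine(words, irrelevant_profile):
        return "presumptively relevant"
    return "presumptively irrelevant"


# Hypothetical manually coded documents (illustrative only).
seed_relevant = [
    "payment to the consultant was routed through the offshore account",
    "approve the offshore payment before the audit",
]
seed_irrelevant = [
    "lunch menu for the team meeting on friday",
    "parking garage closed for the holiday",
]

relevant_profile = build_profile(seed_relevant)
irrelevant_profile = build_profile(seed_irrelevant)

# Apply the profiles to a document that no human has reviewed.
label = designate(
    "wire the consultant payment to the offshore account",
    relevant_profile,
    irrelevant_profile,
)
print(label)
```

The point of the sketch is the workflow rather than the arithmetic: human coding decisions produce two profiles, and every unreviewed document is then designated according to the profile it most resembles.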
Once this process is complete, the results are validated by a human reviewer. If necessary, the team can “retrain” the software for more precise results and then re-run it on the same universe of documents. The rate at which the human reviewer overturns the software’s decisions, known as the “overturn rate,” indicates how well the software was trained by the first manually reviewed seed set.
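The overturn rate is simple arithmetic, and a short sketch may make it concrete. The example below assumes the rate is calculated as the number of documents on which the reviewer disagreed with the software, divided by the number of documents the reviewer checked. The 10 percent retraining threshold is purely hypothetical; the appropriate tolerance is a judgment for the review team, and nothing here suggests an accepted standard.

```python
def overturn_rate(software_calls, reviewer_calls):
    """Share of checked documents where the reviewer reversed the software.

    Each list holds one True (relevant) or False (irrelevant) entry per
    document, in the same order.
    """
    if len(software_calls) != len(reviewer_calls):
        raise ValueError("both lists must cover the same documents")
    if not software_calls:
        raise ValueError("at least one checked document is required")
    overturned = sum(
        1 for software, reviewer in zip(software_calls, reviewer_calls)
        if software != reviewer
    )
    return overturned / len(software_calls)


# Hypothetical threshold, assumed for illustration only.
RETRAIN_THRESHOLD = 0.10


def needs_retraining(rate, threshold=RETRAIN_THRESHOLD):
    """Return True when the overturn rate exceeds the chosen tolerance."""
    return rate > threshold


# Five validated documents; the reviewer overturns the second designation.
software = [True, True, False, False, True]
reviewer = [True, False, False, False, True]

rate = overturn_rate(software, reviewer)
print(rate)                    # 0.2
print(needs_retraining(rate))  # True
```

In this illustration one of five designations is overturned, so the rate is 20 percent, which exceeds the assumed tolerance and would prompt another round of training.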