You have survived Step 1, Step 2 and Step 3, and you are now tasked with reviewing all of the data you preserved. Your first job is to determine what information is actually relevant. Then you will need to protect any privileged or confidential material. The volume of electronically stored information grows every day, and that growth directly affects your litigation, particularly the amount of data you will need to review. The more data you have to review, the more expensive and time-consuming the review will be. Because traditional linear review can account for upwards of 73 percent of e-discovery costs, technology and other cost-saving methods must be considered.
Assisted review does more than eliminate the practical need for keyword culling: using keyword culling inside an assisted review workflow can itself be problematic. Predictive coding serves as the best culling tool because it scans the data to determine what is likely responsive and what is not, arriving at a set of data that is at least worthy of review. To train the assisted review tool properly, you need to give the system good examples of both responsive and non-responsive documents. If you cull a large percentage of your non-responsive documents before training, you leave fewer examples of the various types of non-responsive documents that exist in your data collection. Your seed documents are then less likely to include an example of each type of non-responsive document, and the system will not be properly trained to categorize all of the non-responsive documents in your data set.
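The effect of culling on the seed set can be illustrated with a little arithmetic. The sketch below is not how any particular review platform works. It assumes the seed set is a simple random sample, and every document count, category name and the seed size of 200 is a hypothetical number chosen for illustration. It computes how likely the seed set is to contain at least one example of each type of non-responsive document, before and after keyword culling.

```python
from math import comb


def prob_type_in_seed(total_docs, type_count, seed_size):
    """Chance that a simple random seed sample holds at least one document of a type."""
    if type_count == 0:
        return 0.0  # a type that was culled away entirely can never be sampled
    seed_size = min(seed_size, total_docs)
    # Hypergeometric: 1 minus the chance that every sampled document is some other type.
    missed = comb(total_docs - type_count, seed_size) / comb(total_docs, seed_size)
    return 1.0 - missed


def expected_types_covered(counts, seed_size, skip=("responsive",)):
    """Expected number of non-responsive types with at least one seed example."""
    total = sum(counts.values())
    return sum(
        prob_type_in_seed(total, n, seed_size)
        for label, n in counts.items()
        if label not in skip
    )


# Hypothetical collection before and after keyword culling (all counts are made up).
before_culling = {
    "responsive": 500,
    "newsletters": 6000,
    "hr_notices": 3000,
    "vendor_spam": 450,
    "personal": 50,
}
after_culling = {
    "responsive": 480,
    "newsletters": 300,
    "hr_notices": 150,
    "vendor_spam": 20,
    "personal": 0,  # the keywords happened to remove every personal email
}

SEED_SIZE = 200  # hypothetical seed set size

covered_before = expected_types_covered(before_culling, SEED_SIZE)
covered_after = expected_types_covered(after_culling, SEED_SIZE)
print(f"Expected non-responsive types in seed set before culling: {covered_before:.2f}")
print(f"Expected non-responsive types in seed set after culling:  {covered_after:.2f}")
```

Under these made-up numbers, the type that the keywords removed entirely can never appear in the seed set, so the system gets no chance to learn what that kind of non-responsive document looks like. Real collections and real sampling methods will differ, but the direction of the effect is the same one described above.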
To ensure the best results, you must take the time to create a proper and defensible workflow. You will need to work with an expert, but you should supervise the process at each step so that you make the legal decisions and your knowledge of the case informs every search and input.