Originally posted on July 2, 2013
One of the most interesting predictive coding cases going on right now is In Re: Actos (Pioglitazone) Products Liability Litigation (United States District Court for the Western District of Louisiana, MDL No. 6:11-md-2299). The complexion of the case recently changed: the parties who originally agreed to use predictive coding (and, in an innovative twist, even to train the predictive coding software together) are now fighting over what the results mean. According to court documents filed by both parties (all available on the PACER website), the cooperation experiment is over, and the two sides cannot find common ground on how to proceed with the predictive coding results (see Defendants’ Opposition to Plaintiffs Motion to Compel Production of Documents (“the Defendants’ Brief”) and the Plaintiffs’ Steering Committee’s Motion to Compel Production of Documents and Memorandum to Support (the “Plaintiffs Brief”)).
To date, the parties have completed the predictive coding training process outlined in their Case Management Order: Protocol Relating to the Production of Electronically Stored Information (“ESI”) filed with the court on July 27th, 2012. Now the defendants seek to stop the use of predictive coding and instead rely on a prior production from similar litigation in a state court case in Illinois, where 1.7 million documents and 23.6 million pages were produced. The defendants have offered to look at some of the “unreviewed” documents identified by the predictive coding only if a key word filter is applied (Page 5, Defendants’ Brief). The defendants’ argument is that the predictive coding review is unduly burdensome, based on their comparison of the documents identified by the predictive coding that were ranked between 95 and 100 on a 0 to 100 scale for their likelihood of being responsive (the “95+ Band”) with what was already produced in the state court case. The defendants claim that their linear review had already produced 68,426 of the 76,316 documents in the 95+ Band as responsive. The defendants’ follow-up review of the other 7,890 documents in the 95+ Band that were not reviewed in the state court litigation (13,843 when family members are added back) resulted in only an additional 4,543 responsive documents (see paragraphs 32-37, Declaration of David D. Lewis, Ph.D (“Lewis Declaration”)). The defendants claim that this low precision of less than 40% within this tight band of documents makes the additional review of the 505,326 documents the plaintiffs are seeking burdensome and unnecessary.
The plaintiffs, however, want to continue the predictive coding process. They claim that their control set, which the two parties coded together, indicates they will find 81.8% of the responsive documents in the collection by limiting their review to the top 48% of the documents in the collection, which comprise 505,326 documents (the “52+ Band”) (see paragraph 40 of the Lewis Declaration). They assert, and the calculations based on the predictive coding results indicate, that there are many responsive documents in the 505,326 unreviewed documents that were missed by key words in the Illinois litigation. The plaintiffs also argue that the key word filter the defendants are offering to use is not necessary – predictive coding has filtered the documents and they want to use predictive coding precisely because they do not want to use the filters of key words. Although the key word filter will certainly reduce the number of documents that need to be reviewed, the use of key words is likely to also reduce the number of responsive documents that will be found (see Paragraphs 42-43 of the Lewis Declaration).
Several points jump out at me from this debate and current impasse. The first is that the predictive coding appears to have worked. The defendants show that, in the 95+ Band, the predictive coding found 76,316 documents and that 68,426 of these documents were also found in the Illinois state court linear review. This is 89.66% agreement between the linear review and predictive coding among the highest scoring documents. In fact, predictive coding also found an additional 4,543 responsive documents. As a technology assisted review advocate, I feel pretty good that predictive coding found so many of the same documents in the 95+ Band, because predictive coding is often a much less expensive process in ESI-intensive cases. Maybe the burden arguments made by the defendants will disappear if the more efficient search approach of predictive coding is used instead of linear review and the parties focus their efforts on how best to train and validate their predictive coding tools.
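The 95+ Band figures quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the variable names are mine, and I am assuming that the defendants' "less than 40%" precision figure divides the 4,543 newly found responsive documents by the 13,843 documents reviewed once family members are added back, which the briefs as quoted here do not spell out.

```python
# Figures quoted in this post from the Lewis Declaration (paragraphs 32-37).
band_95_total = 76_316         # documents scored 95-100 by the predictive coding tool
band_95_in_illinois = 68_426   # of those, already produced in the Illinois linear review
followup_reviewed = 13_843     # 7,890 unreviewed documents plus family members (assumed denominator)
followup_responsive = 4_543    # additional responsive documents found in the follow-up review

# Share of the 95+ Band that the linear review had already produced.
overlap = band_95_in_illinois / band_95_total

# Precision of the follow-up review under the assumed denominator.
followup_precision = followup_responsive / followup_reviewed

print(f"Overlap with linear review: {overlap:.2%}")
print(f"Follow-up precision:        {followup_precision:.2%}")
```

The overlap comes out at roughly 89.66%, matching the figure above, and the follow-up precision lands under 40% when family members are counted in the denominator.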
The other issue I see in this case is that the math may not support the defendants’ claim that they have already found most of the data, because their analysis focuses only on the top 5% of scores (the 95+ Band) instead of on the remaining bands of documents that the plaintiffs seek to have reviewed. If you analyze the entire 50+ Band, which approximates the 52+ Band that the plaintiffs want the defendants to review, instead of only the 95+ Band, the documents filed by both parties reveal that 1,047,667 documents fall into the 50+ Band (see paragraph 30 of the Lewis Declaration). Even without the precision estimates from the predictive coding software, we can still estimate what has been found and what remains by using algebra. We know that some 345,021 of the documents in the 50+ Band were previously produced in Illinois (see paragraph 39 of the Lewis Declaration), as well as 4,543 from the 95+ Band which were only found by predictive coding and have since been produced. This means 349,564 relevant documents have been produced at this point in time from the 50+ Band (including family members). The defendants contend that 505,326 of the remaining documents the plaintiffs seek to have examined were not reviewed (see paragraph 39 of the Lewis Declaration). Control sets are often used to estimate the richness of a collection, meaning how many relevant documents are estimated to exist in the collection. By knowing the population of relevant documents in the 50+ Band, we can use what has already been produced to estimate how much is remaining. The control set that the parties coded together to measure the results of the predictive coding output showed that 35.881% of the collection was relevant, taking the ratio of the 385 documents coded as relevant to the 1,073 randomly picked documents that were coded in total (see Plaintiffs Brief Pg 3).
This same ratio applied to the entire collection of 2,542,345 documents (see paragraph 52 of the Lewis Declaration) could mean there are 912,219 relevant documents in the collection. Recall was estimated by the predictive coding tool at the 50+ bucket to be 81.8% (see paragraph 40 of the Lewis Declaration), which means that 81.8% of the 912,219 documents should fall in the 50+ Band, or 746,195 responsive documents. Since 349,564 relevant documents have been produced from the 50+ Band, there could be an estimated 396,631 relevant documents from the 50+ bucket still remaining. That means that the defendants have only produced 46.846% of the relevant documents from the 50+ bucket. Since there are only 505,326 documents left to review, and 396,631 are estimated relevant, the precision of the unreviewed documents could actually be 78.49%. This rough calculation does not support a burden argument, especially in a case where both parties had agreed to use predictive coding.
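For readers who want to retrace the algebra, here is a minimal Python sketch of the estimate. The variable names are my own, and the rounding of richness to 35.881% before scaling is an assumption made to reproduce the figures in this post; carrying the unrounded ratio through would shift the totals by a handful of documents.

```python
# Inputs quoted in this post from the briefs and the Lewis Declaration.
control_relevant = 385        # control set documents coded relevant
control_size = 1_073          # total randomly picked control set documents
collection_size = 2_542_345   # documents in the entire collection
recall_at_50 = 0.818          # recall estimated by the tool for the 50+ Band
produced_illinois = 345_021   # 50+ Band documents already produced in Illinois
produced_new = 4_543          # 95+ Band documents found only by predictive coding
unreviewed = 505_326          # documents the plaintiffs want reviewed

# Richness from the control set, rounded to 35.881% (assumed rounding step).
richness = round(control_relevant / control_size, 5)

# Estimated relevant documents in the whole collection and in the 50+ Band.
est_relevant_total = round(richness * collection_size)
est_relevant_in_band = round(recall_at_50 * est_relevant_total)

# What has been produced so far, and what is estimated to remain.
produced = produced_illinois + produced_new
remaining = est_relevant_in_band - produced

share_produced = produced / est_relevant_in_band
implied_precision = remaining / unreviewed

print(f"Estimated relevant in collection: {est_relevant_total:,}")
print(f"Estimated relevant in 50+ Band:   {est_relevant_in_band:,}")
print(f"Produced so far:                  {produced:,}")
print(f"Estimated remaining:              {remaining:,}")
print(f"Share of band produced:           {share_produced:.3%}")
print(f"Implied precision of unreviewed:  {implied_precision:.2%}")
```

Every input is an estimate or a party's own count, so the outputs are only as good as the control set and the recall figure they rest on.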
I am not sure how the defendants can argue against doing this review on proportionality grounds given these compelling numbers. Their argument to stop the process focuses on the top 5% of documents, where presumably key words and predictive coding both have the best shot of finding responsive documents. It is not surprising that there is some agreement between the two approaches at the high end of the predictive coding range. The more interesting analysis for predictive coding is what it finds that key word searching misses. Further review of the 505,326 unreviewed documents will shed much more light on this question.
The defendants could disagree with the analysis above and presumably argue that the control set they created with the plaintiffs is not representative of the collection – that it is a poor estimate of richness because it was based on 4 custodians and some regulatory data instead of the 29 custodians in the collection (see Plaintiff’s Brief pg 3 and pg 5). Furthermore, there are family members contained in some of the calculations above but not in others. However, the parties have gone this far with this experiment and both parties agreed to use these numbers as their control set and collaboratively worked to identify these documents to test the system’s results for recall. If the control set is viewed as imprecise and a poor estimator for richness because it only measures against what was found in a random sample of 4 key custodians and some key documents, an additional random sample could be taken of the entire collection with the family members treated consistently in all of the calculations to estimate richness and to refine the estimate of how many responsive documents are in the collection.
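To give a feel for how much the richness estimate could move on sampling error alone, here is a rough sketch that puts a 95% margin of error around the control set ratio. It uses the standard normal approximation for a proportion and assumes the 1,073 control documents behave like a simple random sample of the whole collection, which is exactly the assumption the defendants could contest. The names and the choice of interval are mine, not anything from the briefs.

```python
import math

control_relevant = 385       # control set documents coded relevant
control_size = 1_073         # total control set documents
collection_size = 2_542_345  # documents in the entire collection
z = 1.96                     # z-value for a 95% confidence level

# Point estimate of richness and its normal-approximation margin of error.
p = control_relevant / control_size
margin = z * math.sqrt(p * (1 - p) / control_size)
low, high = p - margin, p + margin

print(f"Richness estimate: {p:.2%} +/- {margin:.2%}")
print(f"Relevant documents in collection: "
      f"{low * collection_size:,.0f} to {high * collection_size:,.0f}")
```

Under these assumptions the margin is a little under three percentage points either way. A fresh random sample drawn from all 29 custodians would address representativeness, which this interval does not measure.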
My conclusion is that these numbers, albeit rough estimates, make a compelling argument for allowing this Proof of Concept case to run its course. Whatever the correct numbers are, there clearly are a large number of relevant documents in the 505,326 unreviewed documents, as predicted by the predictive coding tool. To me, it seems unlikely that a court would stop an experimental case using predictive coding at this juncture given the fact that the parties had agreed to use predictive coding. Keep your eyes peeled in case there is an opinion forthcoming from the court on whether predictive coding should continue to be used in this case.
Also keep an eye out for more of the Thought Leadership programs on predictive coding being held by eDJ Group and Review Less, where we go through the math behind predictive coding cases like this to teach more lawyers how to become comfortable with predictive coding results in a vendor neutral setting. This case is a prime example of why the Thought Leadership series exists: to educate lawyers on how to use and argue the math behind predictive coding results. The next such program will be in Philadelphia on July 22nd, with additional similar events to be scheduled in Minneapolis, Cincinnati, Dallas and other cities around the country.
eDiscoveryJournal Contributor and eDJ Group Adjunct Analyst Karl Schieneman