In describing predictive coding systems, it’s important to distinguish document-based systems from corpus-based systems. Document-based systems make their predictions based on the similarity of each document to a single, previously categorized document. Corpus-based systems can also use higher-order properties of groups of previously categorized documents to make their predictions. Because of this advantage, corpus-based systems are less affected by errors in coding individual documents.
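To make the distinction concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any particular product works: the document-based predictor copies the code of the single most similar coded document, while the corpus-based predictor compares the new document to an aggregate word profile (a centroid) of every document sharing a code. The function names, the bag-of-words representation, and the sample documents are all hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts (a deliberately crude representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def document_based_predict(doc, coded):
    """Copy the code of the single most similar coded document."""
    vec = vectorize(doc)
    _, label = max(coded, key=lambda pair: cosine(vec, vectorize(pair[0])))
    return label


def corpus_based_predict(doc, coded):
    """Compare the document to the pooled word profile of each code."""
    vec = vectorize(doc)
    centroids = {}
    for text, label in coded:
        centroids.setdefault(label, Counter()).update(vectorize(text))
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Hypothetical coded set; the fourth document was coded incorrectly.
coded = [
    ("merger pricing agreement draft", "responsive"),
    ("pricing agreement for merger terms", "responsive"),
    ("merger agreement pricing schedule", "responsive"),
    ("merger pricing agreement memo", "not_responsive"),  # coding error
    ("lunch menu friday cafeteria", "not_responsive"),
    ("cafeteria holiday party lunch", "not_responsive"),
]
new_doc = "merger pricing agreement memo attached"

print(document_based_predict(new_doc, coded))
print(corpus_based_predict(new_doc, coded))
```

In this toy example the document-based predictor inherits the coding error, because the miscoded memo happens to be the nearest neighbor, while the corpus-based predictor still calls the new document responsive, because one bad code barely shifts the group profile.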
Some say that predictive coding isn’t as useful to plaintiffs as it is to defendants. See, for example, this post on LinkedIn.
In my view, what really matters is whether the litigant is producing or receiving the documents. Predictive coding is more useful to a producing party than to a receiving party. And, in a way, predictive coding is actually the opposite of post-production analysis.
In any case where emails between the parties will be discoverable, the parties should collect and jointly analyze those emails before the initial scheduling conference, in order to agree on an efficient and proportional ediscovery plan and to attempt early resolution.
Even before either side does a first-pass review of its collected documents, the parties can easily identify which potentially discoverable documents both sides already have in common. This process would be fast, inexpensive, and easy, and would allow new kinds of cooperation between parties.
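One way such a comparison could work, sketched here as an assumption rather than as the specific process proposed: each side computes a fingerprint (a cryptographic hash) of every collected email and exchanges only the fingerprints, so neither side reveals content the other does not already hold. The field names and the normalization rule below are hypothetical, and a real workflow might prefer to key on the Message-ID header where it is available.

```python
import hashlib


def fingerprint(email):
    """Hash a normalized copy of the fields both parties should share."""
    fields = [email["from"], email["to"], email["sent"],
              email["subject"], email["body"]]
    # Lowercase and collapse whitespace so trivial formatting
    # differences between the two collections do not block a match.
    canonical = "\n".join(" ".join(f.lower().split()) for f in fields)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def emails_in_common(ours, their_fingerprints):
    """Return our emails whose fingerprints the other side also reported."""
    return [e for e in ours if fingerprint(e) in their_fingerprints]


# Hypothetical collections for illustration only.
ours = [
    {"from": "a@x.com", "to": "b@y.com", "sent": "2014-03-01T09:00",
     "subject": "Re: Contract", "body": "Please see  attached."},
    {"from": "a@x.com", "to": "c@x.com", "sent": "2014-03-02T10:00",
     "subject": "Internal memo", "body": "Do not forward."},
]
theirs = [
    {"from": "A@x.com", "to": "b@y.com", "sent": "2014-03-01T09:00",
     "subject": "RE: Contract", "body": "Please see attached."},
]

# The other side sends fingerprints only, never the emails themselves.
their_fingerprints = {fingerprint(e) for e in theirs}
common = emails_in_common(ours, their_fingerprints)
print([e["subject"] for e in common])
```

Only the email that passed between the parties matches; the internal memo stays private because its fingerprint appears in one collection only.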
Joe Looby’s unique historical documentary, The Decade of Discovery (10th Mountain Films), shows how a few lawyers, judges, and scholars recognized the scope of the looming electronic discovery juggernaut and took the first major systematic steps to rein it in.
I think that the Brainspace Discovery 4 analytical engine (http://brainspace.com) is pointing the way to the future of technology-assisted review. In short, Discovery 4 exposes its inferences about which concepts indicate responsiveness and allows reviewers to adjust them.
My last post showed how you can use pre-reviewed seed sets of relevant and irrelevant documents to help prioritize unreviewed documents, using WordSmith 6 from Lexical Analysis Software Ltd. (USD $88.73 or EUR €67.57 from lexically.net – Please note that I’m not affiliated with Lexical.)
Here’s how you can help your document reviewers hit the ground running by investing a few hours of attorney or paralegal time with WordSmith before you’ve reviewed any documents.