Predictive Coding: Networks and Trees

In describing predictive coding systems, it’s important to distinguish document-based systems from corpus-based systems. Document-based systems make their predictions based on the similarity of each document to a single, previously categorized document. Corpus-based systems can additionally use higher-order properties of groups of previously categorized documents to make their predictions. Because of this advantage, corpus-based systems are less affected by errors in coding individual documents.
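To make the distinction concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any particular product works: the bag-of-words vectors, the cosine similarity measure, the summed per-category term profile, and the function names `predict_document_based` and `predict_corpus_based` are all hypothetical choices made for this example.

```python
from collections import Counter
import math

def vectorize(text):
    # Bag-of-words term counts (an assumed, deliberately simple representation).
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict_document_based(doc, labeled):
    # Copy the label of the single most similar previously coded document.
    vec = vectorize(doc)
    _, label = max(labeled, key=lambda pair: cosine(vec, vectorize(pair[0])))
    return label

def predict_corpus_based(doc, labeled):
    # Compare against an aggregate term profile built from every document
    # in each category, so one miscoded document carries less weight.
    vec = vectorize(doc)
    profiles = {}
    for text, label in labeled:
        profiles.setdefault(label, Counter()).update(vectorize(text))
    return max(profiles, key=lambda label: cosine(vec, profiles[label]))

# Hypothetical seed set; the last document is deliberately miscoded.
labeled = [
    ("merger price negotiation terms", "responsive"),
    ("merger agreement price schedule", "responsive"),
    ("merger negotiation agreement draft", "responsive"),
    ("lunch menu friday cafeteria", "not responsive"),
    ("holiday party parking schedule", "not responsive"),
    ("merger price draft", "not responsive"),
]
doc = "merger price draft attached"

print(predict_document_based(doc, labeled))
print(predict_corpus_based(doc, labeled))
```

In this toy data the document-based prediction follows the miscoded neighbor, while the corpus-based prediction is pulled toward the category whose overall vocabulary matches best.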

Minds Matter: H5, Rules-Based TAR, and Cooperation

This article is about how H5’s rules-based approach to technology-assisted review provides a great framework for illustrating cooperation in ediscovery. But first, some context.

By this time next year, Rule 1 of the Federal Rules of Civil Procedure will have been amended to codify the principles of proportionality and cooperation between opposing counsel.

Predictive Coding: For What, Not For Whom

Some say that predictive coding isn’t as useful to plaintiffs as it is to defendants. See, for example, this post on LinkedIn.

In my view, what really matters is whether the litigant is producing or receiving the documents. Predictive coding is more useful to a producing party than to the receiving party. And, in a way, predictive coding is actually the opposite of post-production analysis.

Deduplication Between Parties

Even before either side does a first-pass review of their collected documents, they can easily identify which potentially discoverable documents both sides already have in common. This process would be fast, inexpensive, and easy, and would allow new kinds of cooperation between parties.
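One way such a comparison could work is sketched below. The excerpt does not say which mechanism the post has in mind, so this is an assumption: each side computes a SHA-256 hash of every collected file and exchanges only the hash list, so matching documents can be found without revealing the contents of anything else. The function names and document IDs are hypothetical.

```python
import hashlib

def fingerprint(content: bytes) -> str:
    # SHA-256 digest of the raw bytes; identical files yield identical digests.
    return hashlib.sha256(content).hexdigest()

def hash_index(documents):
    # Map each digest to the IDs of the documents that produced it.
    index = {}
    for doc_id, content in documents.items():
        index.setdefault(fingerprint(content), []).append(doc_id)
    return index

def shared_documents(our_documents, their_hashes):
    # Keep only our documents whose digest also appears in the other side's list.
    index = hash_index(our_documents)
    return {digest: ids for digest, ids in index.items() if digest in their_hashes}

# Hypothetical collections for each party.
plaintiff = {"P-001": b"Q3 forecast", "P-002": b"Board minutes"}
defendant = {"D-104": b"Board minutes", "D-105": b"Lunch order"}

# The defendant would send only this set of digests, not the files themselves.
their_hashes = set(hash_index(defendant))
common = shared_documents(plaintiff, their_hashes)
print(sorted(common.values()))
```

Exact hashing only catches byte-identical files. Near-duplicates, such as the same email saved in two formats, would need a different technique.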

Cinema: EDiscovery

Joe Looby’s unique historical documentary, The Decade of Discovery (10th Mountain Films), shows how a few lawyers, judges, and scholars recognized the scope of the looming electronic discovery juggernaut and took the first major systematic steps to rein it in.

Technology-Assisted Review: Thinking Inside the Box

I think that the Brainspace Discovery 4 analytical engine (http://brainspace.com) is pointing the way to the future of technology-assisted review. In short, Discovery 4 exposes its inferences about which concepts indicate responsiveness and allows reviewers to adjust them.

Technology-Assisted Review: Quickly Compute Case Dictionaries

My last post showed how you can use pre-reviewed seed sets of relevant and irrelevant documents to help prioritize unreviewed documents, using WordSmith 6 from Lexical Analysis Software Ltd. (USD $88.73 or EUR €67.57 from lexically.net – Please note that I’m not affiliated with Lexical.)

Here’s how you can help your document reviewers hit the ground running by investing a few hours of attorney or paralegal time with WordSmith before you’ve reviewed any documents.

Statistically Improbable Phrases in Technology-Assisted Review

Amazon.com computes and displays “Statistically Improbable Phrases” for its indexed books. It defines a Statistically Improbable Phrase as “a phrase that occurs a large number of times in a particular book relative to all [indexed] books.” You can use similar statistics to help you improve your technology-assisted review.
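A rough sketch of a statistic in this spirit follows. Amazon's actual formula is not given here, so everything in the code is an assumption for illustration: phrases are taken to be two-word sequences, and each phrase is scored by its rate in the target document divided by its add-one-smoothed rate in a background collection. The function name `improbable_phrases` and the sample texts are hypothetical.

```python
from collections import Counter

def bigrams(text):
    # Adjacent word pairs, lowercased (an assumed definition of "phrase").
    words = text.lower().split()
    return list(zip(words, words[1:]))

def improbable_phrases(target, background_docs, top_n=3):
    target_counts = Counter(bigrams(target))
    background_counts = Counter()
    for doc in background_docs:
        background_counts.update(bigrams(doc))

    target_total = sum(target_counts.values())
    background_total = sum(background_counts.values())

    scores = {}
    for phrase, count in target_counts.items():
        target_rate = count / target_total
        # Add-one smoothing keeps phrases unseen in the background from dividing by zero.
        background_rate = (background_counts[phrase] + 1) / (background_total + 1)
        scores[phrase] = target_rate / background_rate

    ranked = sorted(scores, key=scores.get, reverse=True)
    return [" ".join(phrase) for phrase in ranked[:top_n]]

# Hypothetical texts: one document of interest and a small background collection.
target = "the side letter amends the side letter terms"
background = [
    "the meeting is on the calendar",
    "please review the side of the building",
]

print(improbable_phrases(target, background, top_n=1))
```

In a review setting, the target could be a set of documents already coded relevant and the background the rest of the collection, so that the top-scoring phrases suggest terms worth searching for.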
