Predictive Coding: Networks and Trees


In describing predictive coding systems, it’s important to distinguish document-based systems from corpus-based systems. Document-based systems make their predictions based on the similarity of each document to a single, previously categorized document. Corpus-based systems are, in addition, able to use higher-order properties of groups of previously categorized documents to make their predictions. Because of this advantage, corpus-based systems are less affected by errors in coding individual documents.
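
To make the distinction concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions, not how any particular product works: documents are compared by cosine similarity over raw word counts, the "document-based" predictor copies the label of the single most similar coded document, and the "corpus-based" predictor compares against one pooled word profile per category (one simple example of a higher-order property of a group). The function names and the sample documents are hypothetical.

```python
from collections import Counter
import math


def vectorize(text):
    """Bag-of-words term counts for one document."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_document_based(text, coded):
    """Return the label of the single most similar coded document."""
    vec = vectorize(text)
    best = max(coded, key=lambda pair: cosine(vec, vectorize(pair[0])))
    return best[1]


def predict_corpus_based(text, coded):
    """Return the label whose pooled term profile is most similar."""
    vec = vectorize(text)
    profiles = {}
    for doc, label in coded:
        profiles.setdefault(label, Counter()).update(vectorize(doc))
    return max(profiles, key=lambda label: cosine(vec, profiles[label]))


# Hypothetical coded set; the last document was coded incorrectly.
coded = [
    ("merger price negotiation", "responsive"),
    ("merger agreement price terms", "responsive"),
    ("merger price terms draft", "responsive"),
    ("lunch menu friday", "not responsive"),
    ("office party lunch", "not responsive"),
    ("merger price draft terms review", "not responsive"),  # coding error
]
new_doc = "review merger price draft terms"

print(predict_document_based(new_doc, coded))
print(predict_corpus_based(new_doc, coded))
```

In this toy data the new document's nearest neighbor is the miscoded one, so the document-based prediction inherits the error, while the pooled profiles let the three correctly coded merger documents outweigh it.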


Technology-Assisted Review: Thinking Inside the Box


I think that the Brainspace Discovery 4 analytical engine (http://brainspace.com) is pointing the way to the future of technology-assisted review. In short, Discovery 4 exposes its inferences about which concepts indicate responsiveness and allows reviewers to adjust them.


Technology-Assisted Review: Quickly Compute Case Dictionaries


My last post showed how you can use pre-reviewed seed sets of relevant and irrelevant documents to help prioritize unreviewed documents, using WordSmith 6 from Lexical Analysis Software Ltd. (USD $88.73 or EUR €67.57 from lexically.net). Please note that I’m not affiliated with Lexical.

Here’s how you can help your document reviewers hit the ground running by investing a few hours of attorney or paralegal time with WordSmith before you’ve reviewed any documents.


Statistically Improbable Phrases in Technology-Assisted Review



Amazon.com computes and displays “Statistically Improbable Phrases” for its indexed books. It defines a Statistically Improbable Phrase as “a phrase that occurs a large number of times in a particular book relative to all [indexed] books.” You can use similar statistics to help you improve your technology-assisted review.
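
As a rough illustration of that kind of statistic, here is a minimal Python sketch. Amazon does not publish its formula, so everything here is my own assumption: phrases are two-word sequences, the score is the phrase's rate in the target text divided by its add-one smoothed rate in a background collection, and the function name, parameters, and sample texts are hypothetical.

```python
from collections import Counter


def bigrams(text):
    """Two-word phrases in order of appearance."""
    words = text.lower().split()
    return list(zip(words, words[1:]))


def improbable_phrases(target, background_docs, top_n=3, min_count=2):
    """Rank phrases that are frequent in `target` but rare in the background."""
    target_counts = Counter(bigrams(target))
    background_counts = Counter()
    for doc in background_docs:
        background_counts.update(bigrams(doc))

    target_total = sum(target_counts.values())
    background_total = sum(background_counts.values())

    scores = {}
    for phrase, count in target_counts.items():
        if count < min_count:
            continue  # ignore phrases too rare in the target to be meaningful
        target_rate = count / target_total
        # Add-one smoothing so phrases unseen in the background do not divide by zero.
        background_rate = (background_counts[phrase] + 1) / (background_total + 1)
        scores[phrase] = target_rate / background_rate

    ranked = sorted(scores, key=scores.get, reverse=True)
    return [" ".join(phrase) for phrase in ranked[:top_n]]


# Hypothetical target document and background collection.
target = ("the side letter amends the side letter terms "
          "and the board approved the side letter")
background = [
    "the board approved the budget",
    "the board met and the board approved the minutes",
]

print(improbable_phrases(target, background))
```

In a review setting, the target could be a set of documents already coded relevant and the background the rest of the collection, so that high-scoring phrases suggest terms worth searching or prioritizing.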
