Insiders’ Guide to Technology-Assisted Review (TAR)
Cover Design and Image courtesy of EY
Copyright © 2015 by Ernst & Young LLP. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey.
Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600, or on the Web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley publishes in a variety of print and electronic formats and by print-on-demand. Some material included with standard print versions of this book may not be included in e-books or in print-on-demand. If this book refers to media such as a CD or DVD that is not included in the version you purchased, you may download this material at http://booksupport.wiley.com. For more information about Wiley products, visit www.wiley.com.
Library of Congress Cataloging-in-Publication Data:
ISBN 978-1-118-89426-2 (Paperback)
ISBN 978-1-118-89432-3 (ePDF)
ISBN 978-1-118-89438-5 (ePub)
Preface
For many years, courts and lawmakers have been struggling with the challenges that ever-increasing volumes of data pose to the legal process. Exponentially increasing digital information is hardly a phenomenon that took us by surprise. In litigation and in internal and regulatory investigations, as well as in information management more broadly, it is largely corporate parties who are footing the bills.
When finding the truth requires making sense of information, and there is so much of it, we need new tools that will help us rise above the oceans of mostly irrelevant but potentially discoverable information. Technology-Assisted Review, a potentially broad term that has been used to reference any use of technology that facilitates document review as part of the discovery process, has more recently been used (especially in its abbreviated form “TAR”) in a narrower sense related to predictive coding and various forms of review-enhancing analytics. In all of its various forms, TAR holds real promise for alleviating the problems associated with performing accurate searches of large volumes of data from a multitude of sources.
This book explores the linguistic and technical issues associated with the use of TAR in the legal context and summarizes the small body of case law that has developed around predictive coding. The introduction provides the historical background of TAR in terms of its evolution in support of litigation and investigations. In Chapter 1, we describe different structures of document review, which we treat as a form of classification in which individual items are labeled according to criteria provided by a set of requests, such as a discovery document request.
A small but growing body of case law on predictive coding, which often explicitly references “thought leadership” publications as quasi-authoritative, is discussed in Chapter 2. Reflecting the broader trend in electronic discovery, legal principles applied to predictive coding include transparency, proportionality, and defensibility, with a heavy dose of recommended cooperation between opposing parties. Appropriate degrees of transparency into an adversary's discovery process and the appropriate balance between cooperation and advocacy are still very much in dispute.
Chapter 3 explores the economics of TAR in terms of cost and value.
This book is an attempt to give professionals who lack advanced degrees in statistics, linguistics, machine learning, or the related technology of TAR a resource for obtaining a thorough understanding of the theory and practice of TAR. Given the rapidly increasing importance of TAR to the legal process, such an understanding is indispensable to legal professionals and others faced with the problem of making sense of large document collections. While the technology of TAR will undoubtedly continue to advance at a speed that makes it hard to capture in writing, the underlying concepts will hold true. The purpose of this book is to convey an understanding of those concepts to the practitioner.
Introduction – Evolution of TAR
The evolutionary path of modern Technology-Assisted Review (TAR) was paved as much by necessity as it was by innovation. These advanced solutions are now necessary because traditional review workflows simply cannot keep pace with the growing volume of electronically stored information (ESI). Moreover, heightened awareness of the potential flaws associated with linear review¹ has called unwanted attention to the very real potential for inaccurate coding and incomplete productions.
At the same time, the underlying discipline that drives TAR methodology – the art and science of information retrieval (IR) – has been widely accepted in government and industry-specific sectors (accounting, finance, insurance, telecommunications, and health care, to name a few). This active and ongoing use of IR methodology to search, mine, and manage large sets of electronic data has allowed these innovative solutions to go through decades of testing and validation before reaching the mainstream legal marketplace. Nor is the use of technology new in the discovery arena: the road to modern TAR has been paved with numerous iterations of technology-assisted workflows.
In the literal sense, “technology-assisted review” has been ongoing since the first computer was used to help log or categorize a set of documents in response to a request for production. Database capabilities and sophistication grew rapidly during this inaugural wave of technology-assisted solutions, allowing for document collections and productions to be tracked through delimited fields and warehoused indefinitely.
Further advancements came with the ability to scan the physical image of each document and render that image as a static file in a separate database that could be accessed on demand by reviewers through network connections. Combined with what was then considered a cutting-edge technology called optical character recognition (OCR), reviewers could access search results for both the image of a document and its underlying text in one location. Technology-assisted review was in its renaissance and the legal community fully embraced these solutions.
This was the dawn of the digital age in which paper collections were starting to be replaced by word processing tools and the default standard for business correspondence switched from written letters to electronically transmitted e-mails. With this sea change under way, yet