Can ‘Predictive Coding’ Cope with ‘Big Data’?

Discovery has changed, and electronically stored information (ESI) was the facilitator. Though ediscovery matters are no longer the novel issues that they once were, technology is constantly changing. According to Baseline, an estimated 90 percent of the world’s data was created in the last two years. In 2009 there were 988 exabytes of data in existence, an amount that would stretch from the Sun to Pluto and back in paper form. The problem for corporations is storing such huge amounts of data, let alone worrying about the ‘compliance monster’.

Cloud computing may ease the storage burden, yet companies are retaining more information than ever, and lawsuits sometimes require attorneys to review millions of documents. While the judiciary struggles to devise effective proportionality rules, big data is growing even bigger, and so is the litigation industry. Manual review of every document no longer seems to be an option, and technology is racing to meet the growing needs of document review.

The most important point, often overlooked, is that human eyes are still required to review such documents, because human review is what makes the case defensible; after all, isn’t that the real objective?

Definitions of “predictive coding” vary, but a common form of predictive coding includes the following steps. First, the data is uploaded onto a vendor’s servers. Next, representative samples of the electronic documents are identified. These “seed sets” can be created by counsel familiar with the issues, by the predictive coding software, or both. Counsel then review the seed sets and code each document for responsiveness or other attributes, such as privilege or confidentiality. The predictive coding system analyzes this input and creates a new “training set” reflecting the system’s determinations of responsiveness. Counsel then “train” the computer by evaluating where their decisions differ from the computer’s and then making appropriate adjustments regarding how the computer will analyze future documents.
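The steps above can be sketched in a few lines of Python. This is only a toy illustration, not how any vendor's predictive coding system works: the word-count scorer, the function names (`train`, `predict`, `review_loop`), the sample documents, and the `counsel_codes` stand-in for a human reviewer are all assumptions made for the example.

```python
from collections import Counter


def tokenize(text):
    """Split a document into a set of lowercase words."""
    return set(text.lower().split())


def train(coded_docs):
    """Build word weights from (text, is_responsive) pairs coded by counsel."""
    weights = Counter()
    for text, is_responsive in coded_docs:
        for word in tokenize(text):
            weights[word] += 1 if is_responsive else -1
    return weights


def predict(weights, text):
    """Predict responsiveness: a positive total weight means responsive."""
    return sum(weights[word] for word in tokenize(text)) > 0


def review_loop(documents, seed_ids, counsel_codes, batch_size=2, max_rounds=10):
    """Train on coded documents, compare predictions with counsel, retrain."""
    # Counsel codes the seed set first.
    coded = {i: counsel_codes(documents[i]) for i in seed_ids}
    for _ in range(max_rounds):
        weights = train((documents[i], label) for i, label in coded.items())
        batch = [i for i in range(len(documents)) if i not in coded][:batch_size]
        if not batch:
            break
        disagreements = 0
        for i in batch:
            human_label = counsel_codes(documents[i])
            if predict(weights, documents[i]) != human_label:
                disagreements += 1
            coded[i] = human_label  # counsel's decision always overrides the model
        if disagreements == 0:
            break  # crude stand-in for "the output is deemed reliable"
    final_weights = train((documents[i], label) for i, label in coded.items())
    return final_weights, coded


# Hypothetical sample data; counsel_codes simulates a reviewer's decision.
documents = [
    "Merger agreement draft attached",
    "Lunch menu for Friday",
    "Comments on the merger timeline",
    "Parking garage closed Friday",
    "Revised merger agreement terms",
    "Holiday party menu",
]


def counsel_codes(text):
    return "merger" in text.lower()


weights, coded = review_loop(documents, seed_ids=[0, 1], counsel_codes=counsel_codes)
for i, text in enumerate(documents):
    if i not in coded:
        print(text, "->", "responsive" if predict(weights, text) else "not responsive")
```

In a real matter the stopping rule would rest on statistical sampling rather than on a single batch with no disagreements.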

This process is repeated until the system’s output is deemed reliable. Reliability is determined by statistical methods that measure recall (the percentage of responsive documents in the entire data set that the computer has located) and precision (the percentage of documents within the computer’s output set that are actually responsive). That is, “recall” tests the extent to which the predictive coding system misses responsive documents, while “precision” tests the extent to which the system is mixing irrelevant documents in with the production set. The resulting output can be either produced as is or further refined by subsequent human review. As a result, attorneys review a much smaller set of documents. Predictive coding therefore effectively “alleviates the need to review whole masses of records in order to find the relevant few.” Most importantly, predictive coding is estimated to reduce ediscovery costs by as much as 40% to 60% while maintaining search quality.
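The two measures defined above reduce to simple set arithmetic. The sketch below assumes the truly responsive documents are known (in practice they are estimated from a sample); the function name and the document IDs are hypothetical.

```python
def precision_recall(responsive_ids, produced_ids):
    """Return (precision, recall) for a set of produced document IDs.

    responsive_ids: IDs of documents that are actually responsive.
    produced_ids:   IDs the system placed in its output set.
    """
    responsive = set(responsive_ids)
    produced = set(produced_ids)
    correctly_produced = len(responsive & produced)
    # Precision: share of the output set that is actually responsive.
    precision = correctly_produced / len(produced) if produced else 0.0
    # Recall: share of all responsive documents that the system located.
    recall = correctly_produced / len(responsive) if responsive else 0.0
    return precision, recall


# Four responsive documents exist; the system produced five, three of them correct.
precision, recall = precision_recall({1, 2, 3, 4}, {2, 3, 4, 5, 6})
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here one responsive document was missed (lowering recall) and two irrelevant documents were mixed into the output (lowering precision).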

A statistic quoted in an IDC and EMC report says that the digital universe is doubling every two years, and will reach 40,000 exabytes (40 trillion gigabytes) by 2020. The question is: Can predictive coding cope with big data?
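To get a feel for what “doubling every two years” implies, here is a back-of-the-envelope sketch. The function name is hypothetical, and steady exponential growth is an assumption made for illustration; the 2009 and 2020 figures come from different sources, so comparing them is only a rough consistency check.

```python
def projected_exabytes(base_eb, base_year, year, doubling_years=2.0):
    """Project data volume assuming it doubles every `doubling_years` years."""
    return base_eb * 2 ** ((year - base_year) / doubling_years)


# Two years after the 2020 estimate, the volume would double again.
print(projected_exabytes(40_000, 2020, 2022))

# Growing the 2009 figure forward to 2020 at the same rate gives a number
# in the same range as the 40,000 exabyte estimate.
print(round(projected_exabytes(988, 2009, 2020)))
```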


e-Discovery | cloud computing
New Jersey, USA | Lahore, PAK | Dubai, UAE
info@claydesk.com | www.claydesk.com
(855) – 833 – 7775 (703) – 646 – 3043


E-Discovery: Document review value proposition you don’t want to miss!

The biggest cost driver in e-discovery is document review, where millions of documents must be reviewed for potential relevancy and/or responsiveness. KPMG estimates that first-level document review accounts for anywhere between 58% and 90% of total litigation costs. While predictive coding technologies have been somewhat successful in culling most of the documents by utilizing an optimal combination of ‘recall’ and ‘precision’ values, human eyes are still required. Attorneys and senior paralegals staff review centers specifically designed for such projects, only to find repetitive work day in, day out, which results in high turnover.

Typically, teams are assembled and disbanded on an as-needed basis using attorneys, paralegals and law school graduates. This, in turn, has created a ‘day laborer’ mentality, as most projects are short term in nature, lasting anywhere from a few days to a few months. It is therefore not uncommon for a document review team to start with 30 to 50 individuals and lose over half of its original members during the course of the project. High reviewer turnover has not only hurt the consistency of first-level document review projects but has also led to an inefficient cycle of ‘training and re-training’, which in turn escalates costs.

At present, though, an influx of attorneys into the workforce has pushed pay rates down: a simple case of supply and demand.

The overall costs, however, remain high due to the advent of ‘Big Data’. The question remains: How do you further reduce costs? There is good news!

Let me illustrate savings for a law firm or corporation in a simple hypothetical scenario:

For a 100 GB project (approx. 15,000 docs/GB – source edrm.net), we are talking about approx. 1.5 million docs. At $40/hour and 50 docs reviewed per hour (source edrm.net), that is $0.80 per document, for a total billing of $1.2 million. With an offshore attorney rate of $20/hour, or $0.40 per document, the total billing is $600K.

Savings for the law firm or corporation: 50%, that is, $600K – flat!
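The arithmetic behind this scenario can be checked with a short sketch. The function name and parameters are hypothetical; the figures are the ones quoted above.

```python
def review_cost(gigabytes, docs_per_gb, docs_per_hour, hourly_rate):
    """Estimate first-level review cost in dollars for a data set."""
    total_docs = gigabytes * docs_per_gb
    review_hours = total_docs / docs_per_hour
    return review_hours * hourly_rate


onshore = review_cost(gigabytes=100, docs_per_gb=15_000, docs_per_hour=50, hourly_rate=40)
offshore = review_cost(gigabytes=100, docs_per_gb=15_000, docs_per_hour=50, hourly_rate=20)
savings = onshore - offshore

print(f"Onshore:  ${onshore:,.0f}")
print(f"Offshore: ${offshore:,.0f}")
print(f"Savings:  ${savings:,.0f} ({savings / onshore:.0%})")
```

Because the cost scales linearly with data volume in this simple model, the 50% saving holds for engagements of any size; only the dollar amount changes.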

One can only imagine the cost savings in much larger engagements. These savings provide additional value and act as a supplement to predictive coding technology; together they form a true win-win solution for clients.


Outsourcing – Offshoring e-Discovery: Data Security Concerns

Data security concerns remain prevalent today among many law firms and corporate departments. Technological advancements in the field of e-Discovery are frequent, especially in the predictive coding arena. While attorneys may not be tech savvy, there are platforms available that provide the highest level of data security. For instance, a customized version of thin client architecture only allows images to be sent to a terminal client, without transferring any data. Perhaps only a handful of companies are using this type of technology, but I managed to find OrangeLT. According to them, “Orange Legal Technologies’ software architecture approach alleviates the challenges typically associated with a web-based approach by leveraging a custom implementation of a terminal services architecture. Simply stated, this approach does not require entire documents to be transferred over the web, as only images of the documents need to be transferred.”

One of the key concerns in offshore-based e-discovery processing is data security, and specifically whether the data leaves US shores. The answer is an emphatic “No”: data does not leave US shores at any point in time. The offshore team logs into the servers in the US through a 128-bit encrypted VPN channel and processes the information as required (native extraction, review, coding, tiffing, etc.).

Secondly, we leverage state-of-the-art technologies and adhere to the highest data security standards.

Some of the common questions asked of us are:

1. Is my data encrypted?

Yes. We secure your connection to the application using 256-bit AES SSL encryption, which is the same kind of encryption banks use. Your data is also backed up twice to guard against potential loss. This ensures that your data is safe with us and cannot be read by an unauthorized person.

2. Where are my documents stored?

The data you provide are stored in a highly secure enterprise SSAE16 SOC-1 Type II certified data center located in the United States. No data is transferred to any offshore facility or server. Data centers are equipped with a minimum of N+1 power redundancy along with facilities such as Uninterruptible Power Supply (UPS) systems to cater to power spikes and surges. Additionally, data centers are monitored and recorded by CCTV 24×365 and staffed with security officers around the clock to augment physical security features, providing financial-grade protection of your mission-critical data.

3. What about data security?

ClayDesk takes the security and protection of your data very seriously. We follow the ISO/IEC 27001:2005 standard, which is internationally recognized as the definitive best practice on information security management and covers the development, implementation and maintenance of stringent privacy, confidentiality and IT/infrastructure controls. We deploy certified security and forensic specialists as watchdogs over your data. ClayDesk ensures that federally regulated security standards, including Sarbanes-Oxley and HIPAA compliance standards, as well as the Statement on Auditing Standards No. 70: Service Organizations, Type II (SAS 70 Type II), are met.

About Claydesk:

Korporate Solutions, Inc. d/b/a ClayDesk is a leading provider of e-Discovery and cloud computing solutions worldwide. Its registered office is located in Lahore, Pakistan, with branch offices in Piscataway, NJ and Dubai, UAE. Our e-Discovery services cover the entire EDRM cycle, including information governance and compliance. With JDs as part of our core team, our onshore and offshore document review capabilities are unmatched in the industry.


Why Collect Metrics in the Review Stage?


Document review, generally acknowledged as the costliest component of e-discovery, also involves the greatest coordination among a number of participants (in-house counsel, the outside law firm, the review platform vendor, and the staffing vendor). Once the collected ESI has been processed and uploaded into the review system, LitSpecialists will start the review. Documents will be reviewed for their relevancy and coded as to responsiveness or reasons for being withheld entirely or in part from production. The client and Eagan & Escher are eager to assess the number and content of documents to be produced. The review is also under a complex schedule and a closely watched budget. LitSpecialists will provide progress reports that monitor review rates, to manage expectations and keep the review team on target, and document statistics, to project the time and cost of production.

The use of metrics in the Review Stage ensures deadlines are met, tracks the cost of the review and helps prepare for production.  But beyond those short-term goals, consistent capture and use of review metrics can establish baselines for projecting timeframes and budgets, as well as preferred review platforms and review team composition.

What Needs to be Measured in the Review Stage?

The metrics for Review are derived from both the review platform and the reviewers.

  • Pre-Review Metrics: Describe the size and composition of the dataset to be reviewed, including foreign languages, image files, media type, and spreadsheets, the size and composition of the review team, and the timeframe of the review;
  • Ongoing Review Metrics (during a review): Include the hours worked (billed) by the reviewers; the hours logged on the review platform by the reviewers; average hours worked and logged; number, type and average number of documents/pages coded; the number/percentage of documents checked for accuracy (QC); the documents unable to be reviewed; any non-viewable documents/pages; error rates; review exceptions; and any system downtime;
  • Post Review Metrics: Describe the aggregate of documents loaded (the original dataset, plus additional files loaded during the course of the review); total hours necessary for the review; average review rates for the team/reviewers; total documents/pages reviewed; totals of categories (e.g., Responsive, Non-Responsive, Privileged, Confidential, Further Review, Not Viewable); total downtime.

Source: EDRM
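As an illustration of how a few of the ongoing review metrics listed above might be rolled up into a progress report, here is a minimal sketch. The `ReviewerLog` fields, the function name, and the sample numbers are all hypothetical and are not part of the EDRM guidance.

```python
from dataclasses import dataclass


@dataclass
class ReviewerLog:
    name: str
    hours_billed: float   # hours worked (billed) by the reviewer
    hours_logged: float   # hours logged on the review platform
    docs_coded: int       # documents coded so far
    docs_qc_checked: int  # documents checked for accuracy (QC)
    qc_errors: int        # coding errors found during QC


def team_summary(logs, docs_remaining):
    """Aggregate per-reviewer logs into team-level progress metrics."""
    billed = sum(log.hours_billed for log in logs)
    logged = sum(log.hours_logged for log in logs)
    coded = sum(log.docs_coded for log in logs)
    checked = sum(log.docs_qc_checked for log in logs)
    errors = sum(log.qc_errors for log in logs)
    docs_per_hour = coded / billed if billed else 0.0
    return {
        "docs_per_billed_hour": docs_per_hour,
        "platform_utilization_pct": 100 * logged / billed if billed else 0.0,
        "qc_sample_pct": 100 * checked / coded if coded else 0.0,
        "error_rate_pct": 100 * errors / checked if checked else 0.0,
        # Projects the remaining effort from the review rate observed so far.
        "projected_hours_remaining": docs_remaining / docs_per_hour if docs_per_hour else None,
    }


logs = [
    ReviewerLog("Reviewer A", 40, 36, 1800, 180, 9),
    ReviewerLog("Reviewer B", 40, 38, 2200, 220, 11),
]
summary = team_summary(logs, docs_remaining=10_000)
for metric, value in summary.items():
    print(f"{metric}: {value}")
```

Multiplying the projected hours by the hourly rate gives a rough projection of the remaining review cost.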

ClayDesk applies strict quality control procedures while fully adhering to the applicable guidelines, rules, and procedures. Our goal is to bring review costs to a minimum without compromising on quality – making outsourcing simple! Contact us at info@claydesk.com or call us at (855)-833-7775 for your next offshore web-based review by our US-licensed attorneys and LLMs.


ClayDesk

ClayDesk provides e-Discovery and cloud computing services worldwide. As a Microsoft Cloud Partner, our certified team can deliver innovative and cost-effective solutions. We also provide complete document management services.

Visit us at www.claydesk.com
