During my research for the update to my publications, I found an important article that I would ask readers to forward to eDiscovery lawyers, experts, and others. It comes from the prestigious Text Retrieval Conference (TREC) Legal Track, administered by the U.S. National Institute of Standards and Technology (NIST). The article, Some Lessons Learned To Date from the TREC Legal Track (2006-2009) (Feb. 24, 2010), http://trec-legal.umiacs.umd.edu/LessonsLearned.pdf (last visited June 10, 2010), found that Boolean queries negotiated by lawyers without "interactive access" to the ESI population usually retrieved fewer than half of the responsive ESI, which suggests that counsel and others need such access to achieve at least 50% retrieval. The article stated:
For the vast majority of our production requests, fewer than half of all responsive documents were retrieved by a Boolean query negotiated by lawyers without interactive access to the collection. This was despite thoughtful keyword choices and the use of Boolean, truncation, and proximity operators in a formally correct fashion. While perhaps surprising, this result is consistent with every IR research study of which we are aware in which the number of unretrieved responsive documents from a large collection has been reliably estimated. We agree with the common view that the cause of this problem is the difficulty of anticipating the range of ways in which language is used to convey ideas of interest in a large collection.
