
Presentation
Minimal Test Collections for Retrieval Evaluation
29th Annual International ACM SIGIR Conference
(2006)
Abstract
Accurate estimation of information retrieval evaluation metrics such as average precision requires large sets of relevance judgments. Building sets large enough for evaluation of real-world implementations is at best inefficient, at worst infeasible. In this work we link evaluation with test collection construction to gain an understanding of the minimal judging effort needed to have high confidence in the outcome of an evaluation. A new way of looking at average precision leads to a natural algorithm for selecting documents to judge, and it allows us to estimate the degree of confidence by defining a distribution over possible document judgments. A study with annotators shows that this method can be used by a small group of researchers to rank a set of systems in under three hours with 95% confidence.
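The sketch below is only an illustration of the kind of setup the abstract describes: average precision computed from partial relevance judgments, plus a naive rule for choosing the next document to judge. The function names, the toy data, and the "highest unjudged rank" heuristic are assumptions for illustration; the paper's actual algorithm selects documents by their weight in the difference in average precision between systems.

```python
# Illustrative sketch only, not the paper's algorithm.

def average_precision(ranking, judgments):
    """AP computed over documents with known judgments (1 = relevant)."""
    hits, precisions = 0, []
    for i, doc in enumerate(ranking, start=1):
        if judgments.get(doc) == 1:
            hits += 1
            precisions.append(hits / i)
    return sum(precisions) / hits if hits else 0.0

def next_doc_to_judge(run_a, run_b, judgments):
    """Naive selection rule: the unjudged document ranked highest by either system."""
    best_doc, best_rank = None, float("inf")
    for run in (run_a, run_b):
        for rank, doc in enumerate(run, start=1):
            if doc not in judgments and rank < best_rank:
                best_doc, best_rank = doc, rank
    return best_doc

# Toy usage: two system rankings over a small document pool.
run_a = ["d1", "d2", "d3", "d4"]
run_b = ["d2", "d4", "d1", "d3"]
judgments = {"d1": 1}                               # judgments collected so far
print(next_doc_to_judge(run_a, run_b, judgments))   # -> "d2"
print(average_precision(run_a, judgments))          # AP over judged documents only
```

In this spirit, judging stops once the distribution over the remaining possible judgments makes the sign of the difference in average precision sufficiently certain; the sketch omits that confidence computation.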
Keywords
- information retrieval,
- evaluation,
- test collections,
- algorithms,
- theory
Publication Date
2006
Citation Information
Ben Carterette, James Allan, and Ramesh Sitaraman. "Minimal Test Collections for Retrieval Evaluation." 29th Annual International ACM SIGIR Conference (2006). Available at: http://works.bepress.com/ramesh_sitaraman/14/