Unpublished Paper
Toward Interactive Training and Evaluation
(2011)
  • Gregory Druck
  • Andrew McCallum, University of Massachusetts - Amherst
Abstract
Machine learning often relies on costly labeled data, which impedes its application to new classification and information extraction problems. This motivates the development of methods that leverage our abundant prior knowledge about these problems in learning. Several recently proposed methods incorporate prior knowledge with constraints on the expectations of a probabilistic model. Building on this work, we envision an interactive training paradigm in which practitioners perform evaluation, analyze errors, and provide and refine expectation constraints in a closed loop. In this paper, we focus on several key subproblems in this paradigm that can be cast as selecting a representative sample of the unlabeled data for the practitioner to inspect. To address these problems, we propose stratified sampling methods that use model expectations as a proxy for latent output variables. In classification and sequence labeling experiments, these sampling strategies reduce accuracy evaluation effort by as much as 53%, provide more reliable estimates of F1 for rare labels, and aid in the specification and refinement of constraints.
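The abstract describes selecting a representative sample of unlabeled data by stratifying on model expectations, so that rare labels are not lost in a uniform random draw. The sketch below is an illustrative interpretation of that idea, not the paper's exact algorithm: items are grouped into strata by the model's predicted label (a proxy for the unknown true label), and slots are allocated per stratum with a guaranteed minimum of one, so rare predicted labels still appear in the inspection set. All names (`stratified_sample`, the toy `preds` data) are hypothetical.

```python
# Sketch: stratified sampling of unlabeled data using model predictions
# as a proxy for latent true labels. Illustrative only; the paper's
# actual strata and allocation scheme may differ.
import random
from collections import defaultdict

def stratified_sample(predictions, k, seed=0):
    """Pick k items for practitioner inspection, stratified by predicted label.

    predictions: list of (item_id, predicted_label) pairs.
    Small strata are sampled first so rare labels survive truncation to k.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item_id, label in predictions:
        strata[label].append(item_id)
    total = len(predictions)
    sample = []
    # Visit smaller strata first; give every stratum at least one slot.
    for label, items in sorted(strata.items(), key=lambda kv: len(kv[1])):
        n = max(1, round(k * len(items) / total))
        sample.extend(rng.sample(items, min(n, len(items))))
    return sample[:k]

# Toy data: 5% of items carry a rare predicted label.
preds = [(i, "rare" if i % 20 == 0 else "common") for i in range(100)]
chosen = stratified_sample(preds, k=10)
```

A uniform random sample of 10 items would miss the 5 "rare" items about 60% of the time; the per-stratum minimum guarantees at least one is inspected, which is the property the paper's F1 estimates for rare labels rely on.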
Publication Date
2011
Comments
This is the pre-publication version harvested from CIIR.
Citation Information
Gregory Druck and Andrew McCallum. "Toward Interactive Training and Evaluation" (2011)
Available at: http://works.bepress.com/andrew_mccallum/67/