Unpublished Paper
Reducing Weight Undertraining in Structured Discriminative Learning
(2005)
  • Charles Sutton
  • Michael Sindelar
  • Andrew McCallum, University of Massachusetts - Amherst
Abstract
Discriminative probabilistic models are very popular in NLP because of the latitude they afford in designing features. But training involves complex trade-offs among weights, which can be dangerous: a few highly-indicative features can swamp the contribution of many individually weaker features, causing their weights to be undertrained. Such a model is less robust, for the highly-indicative features may be noisy or missing in the test data. To ameliorate this weight undertraining, we introduce several new feature bagging methods, in which separate models are trained on subsets of the original features, and combined using a mixture model or a product of experts. These methods include the logarithmic opinion pools used by Smith et al. (2005). We evaluate feature bagging on linear-chain conditional random fields for two natural-language tasks. On both tasks, the feature-bagged CRF performs better than simply training a single CRF on all the features.
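To illustrate the combination rule the abstract describes, below is a minimal sketch of a logarithmic opinion pool (a product of experts) applied to a single multi-class prediction. This is not the paper's implementation, which applies the combination to linear-chain CRFs; the expert distributions, weights, and function name here are illustrative assumptions only.

```python
import numpy as np

def logarithmic_opinion_pool(log_probs, weights=None):
    """Combine per-expert class log-probabilities with a weighted
    geometric mean (a product of experts), then renormalize.

    log_probs: array of shape (n_experts, n_classes), where each row
               is one expert's log-distribution over classes
    weights:   optional mixing weights over experts (default: uniform)
    """
    log_probs = np.asarray(log_probs, dtype=float)
    n_experts = log_probs.shape[0]
    if weights is None:
        weights = np.full(n_experts, 1.0 / n_experts)
    combined = weights @ log_probs   # weighted sum of log-probabilities
    combined -= combined.max()       # subtract max for numerical stability
    probs = np.exp(combined)
    return probs / probs.sum()

# Toy usage: two hypothetical experts trained on disjoint feature subsets.
expert_a = np.log([0.7, 0.2, 0.1])
expert_b = np.log([0.4, 0.4, 0.2])
print(logarithmic_opinion_pool([expert_a, expert_b]))
```

Because each expert sees only a subset of the features, no single highly-indicative feature can dominate every expert, which is the intuition behind using feature bagging to reduce weight undertraining.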
Publication Date
2005
Comments
This is the pre-publication version of the paper, harvested from CIIR.
Citation Information
Charles Sutton, Michael Sindelar, and Andrew McCallum. "Reducing Weight Undertraining in Structured Discriminative Learning" (2005).
Available at: http://works.bepress.com/andrew_mccallum/135/