Unpublished Paper
Reducing Weight Undertraining in Structured Discriminative Learning
(2005)
  • Charles Sutton
  • Michael Sindelar
  • Andrew McCallum, University of Massachusetts - Amherst
Abstract
Discriminative probabilistic models are very popular in NLP because of the latitude they afford in designing features. But training involves complex trade-offs among weights, which can be dangerous: a few highly-indicative features can swamp the contribution of many individually weaker features, causing their weights to be undertrained. Such a model is less robust, for the highly-indicative features may be noisy or missing in the test data. To ameliorate this weight undertraining, we introduce several new feature bagging methods, in which separate models are trained on subsets of the original features, and combined using a mixture model or a product of experts. These methods include the logarithmic opinion pools used by Smith et al. (2005). We evaluate feature bagging on linear-chain conditional random fields for two natural-language tasks. On both tasks, the feature-bagged CRF performs better than simply training a single CRF on all the features.
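To illustrate the combination rule the abstract describes, below is a minimal sketch of a logarithmic opinion pool (a product of experts) applied to a single multi-class prediction. This is not the paper's implementation, which applies the combination to linear-chain CRFs; the expert distributions, weights, and function name here are illustrative assumptions only.

```python
import numpy as np

def logarithmic_opinion_pool(log_probs, weights=None):
    """Combine per-expert class log-probabilities with a weighted
    geometric mean (a product of experts), then renormalize.

    log_probs: array of shape (n_experts, n_classes), where each row
               is one expert's log-distribution over classes
    weights:   optional mixing weights over experts (default: uniform)
    """
    log_probs = np.asarray(log_probs, dtype=float)
    n_experts = log_probs.shape[0]
    if weights is None:
        weights = np.full(n_experts, 1.0 / n_experts)
    combined = weights @ log_probs   # weighted sum of log-probabilities
    combined -= combined.max()       # subtract max for numerical stability
    probs = np.exp(combined)
    return probs / probs.sum()

# Toy usage: two hypothetical experts trained on disjoint feature subsets.
expert_a = np.log([0.7, 0.2, 0.1])
expert_b = np.log([0.4, 0.4, 0.2])
print(logarithmic_opinion_pool([expert_a, expert_b]))
```

Because each expert sees only a subset of the features, no single highly-indicative feature can dominate every expert, which is the intuition behind using feature bagging to reduce weight undertraining.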
Publication Date
2005
Comments
This is the pre-publication version of the paper, harvested from CIIR.
Citation Information
Charles Sutton, Michael Sindelar, and Andrew McCallum. "Reducing Weight Undertraining in Structured Discriminative Learning" (2005).
Available at: http://works.bepress.com/andrew_mccallum/135/