Unpublished Paper
Sparse Forward-Backward for Fast Training of Conditional Random Fields
(2005)
  • Charles Sutton
  • Chris Pal
  • Andrew McCallum, University of Massachusetts - Amherst
Abstract
Complex tasks in speech and language processing often include random variables with large state spaces, whether in speech tasks that involve predicting words and phonemes, or in joint processing of pipelined systems, in which the state space can be the labeling of an entire sequence. In large state spaces, however, discriminative training can be expensive, because it often requires many calls to forward-backward. Beam search is a standard heuristic for controlling complexity during Viterbi decoding, but during forward-backward, standard beam heuristics can be dangerous, as they can make training unstable. We introduce sparse forward-backward, a variational perspective on beam methods that uses an approximating mixture of Kronecker delta functions. This motivates a novel minimum-divergence beam criterion based on minimizing the KL divergence between the approximate and exact marginal distributions. Our beam selection approach is not only more efficient for Viterbi decoding, but also more stable within sparse forward-backward training. For a standard text-to-speech problem, we reduce CRF training time fourfold, from over a day to six hours, with no loss in accuracy.
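
A minimal sketch of the minimum-divergence beam criterion described in the abstract, assuming a per-position marginal p over the state space. If q is p restricted to a set S of states and renormalized, then KL(q || p) = -log(total mass of S under p), so the smallest beam whose divergence stays below a threshold epsilon consists of the fewest highest-probability states whose cumulative mass reaches exp(-epsilon). The function name and threshold value are illustrative assumptions, not the paper's code.

import numpy as np

def min_divergence_beam(p, epsilon=0.2):
    """Smallest set of states S such that the renormalized
    restriction q of p to S satisfies KL(q || p) <= epsilon.

    For such a q, KL(q || p) = -log(sum of p over S), so it
    suffices to keep the fewest highest-probability states
    whose cumulative mass reaches exp(-epsilon).
    """
    order = np.argsort(p)[::-1]    # states by descending probability
    mass = np.cumsum(p[order])     # cumulative retained mass
    k = int(np.searchsorted(mass, np.exp(-epsilon)) + 1)
    return order[:k]               # indices of the retained states

# Example: a peaked marginal over five states.
p = np.array([0.70, 0.15, 0.10, 0.04, 0.01])
print(min_divergence_beam(p, epsilon=0.2))  # -> [0 1]: the top two states

Within forward-backward, such a selection would be applied to each per-variable marginal, zeroing out the discarded states and renormalizing; the epsilon here is arbitrary.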
Publication Date
2005
Comments
This is the pre-publication version, harvested from CIIR.
Citation Information
Charles Sutton, Chris Pal, and Andrew McCallum. "Sparse Forward-Backward for Fast Training of Conditional Random Fields" (2005).
Available at: http://works.bepress.com/andrew_mccallum/136/