Article
Language Modeling with Limited Domain Data
Proceedings of the Arpa Spoken Language Systems Technology Workshop. San Mateo: Morgan Kaufmann
  • Alexander I Rudnicky, Carnegie Mellon University
Date of Original Version
1-1-1995
Type
Conference Proceeding
Abstract or Description
Generic recognition systems contain language models which are representative of a broad corpus. In actual practice, however, recognition is usually on a coherent text covering a single topic, suggesting that knowledge of the topic at hand can be used to advantage. A base model can be augmented with information from a small sample of domain-specific language data to significantly improve recognition performance. Good performance may be obtained by merging in only those n-grams that include words that are out of vocabulary with respect to the base model.
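The selective merge described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the merge is carried out (counts versus probabilities, weighting, smoothing), so the count-level merge below is an assumption, and every function name (`count_ngrams`, `select_oov_ngrams`, `merge_counts`) is hypothetical rather than taken from the paper.

```python
from collections import Counter


def count_ngrams(sentences, max_n=3):
    """Count all n-grams of order 1..max_n in whitespace-tokenized sentences."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return counts


def select_oov_ngrams(domain_counts, base_vocab):
    """Keep only domain n-grams containing at least one word that is
    out of vocabulary with respect to the base model."""
    return Counter({
        ngram: count
        for ngram, count in domain_counts.items()
        if any(word not in base_vocab for word in ngram)
    })


def merge_counts(base_counts, domain_counts, base_vocab):
    """Assumed merge strategy: add the selected domain n-gram counts to the
    base counts. The paper may merge at the probability level instead."""
    merged = Counter(base_counts)
    merged.update(select_oov_ngrams(domain_counts, base_vocab))
    return merged


# Toy data, invented for illustration only.
base_counts = count_ngrams(["show me flights to boston"])
base_vocab = {ngram[0] for ngram in base_counts if len(ngram) == 1}
domain_counts = count_ngrams(["show me flights to pittsburgh"])
merged = merge_counts(base_counts, domain_counts, base_vocab)
```

In this toy case only the n-grams containing the new word `pittsburgh` are merged in; n-grams made up entirely of words the base model already knows, such as `("show", "me")`, keep their base counts. A real system would still need to renormalize and smooth the merged model, which this sketch leaves out.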
Citation Information
Alexander I Rudnicky. "Language Modeling with Limited Domain Data" Proceedings of the Arpa Spoken Language Systems Technology Workshop. San Mateo: Morgan Kaufmann (1995) p. 66 - 69
Available at: http://works.bepress.com/alexander_rudnicky/40/