Language Modeling with Limited Domain Data
Proceedings of the Arpa Spoken Language Systems Technology Workshop. San Mateo: Morgan Kaufmann
Date of Original Version: 1-1-1995
Abstract or Description: Generic recognition systems contain language models which are representative of a broad corpus. In actual practice, however, recognition is usually on a coherent text covering a single topic, suggesting that knowledge of the topic at hand can be used to advantage. A base model can be augmented with information from a small sample of domain-specific language data to significantly improve recognition performance. Good performance may be obtained by merging in only those n-grams that include words that are out of vocabulary with respect to the base model.
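As a rough illustration of the selection step described in the abstract, the following minimal sketch keeps only those domain n-grams that contain at least one word missing from the base vocabulary and adds them to the base n-gram counts. The abstract does not say how the merged model is estimated or weighted, so merging raw counts is an assumption made here for simplicity; the function names, the whitespace tokenization, and the toy data are likewise hypothetical.

```python
from collections import Counter


def extract_ngrams(tokens, n):
    """Return all contiguous n-grams of length n as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def merge_oov_ngrams(base_counts, base_vocab, domain_sentences, max_n=3):
    """Add to the base counts only the domain n-grams containing an OOV word.

    Assumption: raw counts are merged; the original work may estimate and
    combine probabilities differently.
    """
    merged = Counter(base_counts)  # copy, so the base counts are not mutated
    for sentence in domain_sentences:
        tokens = sentence.lower().split()  # naive whitespace tokenization
        for n in range(1, max_n + 1):
            for ngram in extract_ngrams(tokens, n):
                # Keep the n-gram only if some word is outside the base vocabulary.
                if any(word not in base_vocab for word in ngram):
                    merged[ngram] += 1
    return merged


# Toy data (hypothetical): "pittsburgh" is out of vocabulary for the base model.
base_vocab = {"show", "me", "flights", "to"}
base_counts = Counter({("show",): 5, ("show", "me"): 4})
domain_sentences = ["show me flights to pittsburgh"]

merged = merge_oov_ngrams(base_counts, base_vocab, domain_sentences, max_n=2)
```

In this toy run only the n-grams that include "pittsburgh" are added; in-vocabulary n-grams from the domain sample, such as ("me", "flights"), are left out, and the existing base counts are unchanged.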
Citation Information: Alexander I Rudnicky. "Language Modeling with Limited Domain Data" Proceedings of the Arpa Spoken Language Systems Technology Workshop. San Mateo: Morgan Kaufmann (1995) p. 66 - 69
Available at: http://works.bepress.com/alexander_rudnicky/40/