Unpublished Paper
Evaluation Methods for Topic Models
(2009)
  • Hanna M. Wallach, University of Massachusetts - Amherst
  • Iain Murray
  • Ruslan Salakhutdinov
  • David Mimno
Abstract
A natural evaluation metric for statistical topic models is the probability of held-out documents given a trained model. While exact computation of this probability is intractable due to the large number of discrete latent variables, several estimators for this probability have been used in the topic modeling literature, including the harmonic mean method and the empirical likelihood method. In this paper, we demonstrate experimentally that commonly used methods are unlikely to accurately estimate the probability of unseen documents, and propose two alternative methods that are both accurate and efficient.
Publication Date
2009
Comments
This is the pre-published version harvested from CIIR.
Citation Information
Hanna M. Wallach, Iain Murray, Ruslan Salakhutdinov and David Mimno. "Evaluation Methods for Topic Models" (2009)
Available at: http://works.bepress.com/hanna_wallach/8/