Unpublished Paper
Correlations and Anticorrelations in LDA Inference
(2011)
  • Alexandre Passos
  • Hanna M. Wallach, University of Massachusetts - Amherst
  • Andrew McCallum
Abstract
In this paper, we present preliminary work on identifying equivalent structures required by LDA, and how they affect inference of document-specific topic distributions. This work is based on the observation that when occurrences of a particular word type in a document could be explained by multiple topics, inference will almost always force the model to choose between these topics. Not only is this required by the structure of optimal solutions to the LDA inference problem (see Lemma 4 of Sontag and Roy, 2011) but, intuitively, this is also why learning the topic-specific distributions over words is possible: by explaining all document-specific occurrences of a word type with one topic and searching for sparse document-topic distributions, the latent topics should "move apart" and eventually represent sets of words that exhibit within-topic, but not across-topic, co-occurrences.
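The "choose between topics" behaviour can be seen on a toy example. The sketch below is illustrative only and is not taken from the paper: it assumes the collapsed form of the LDA objective (document-topic proportions integrated out under a symmetric Dirichlet prior), fixed topic-word distributions, and brute-force search over token-topic assignments. The function names and all numbers (`phi`, `words`, `alpha`) are hypothetical. Word type 1 is equally likely under topics 0 and 1, and word type 2 under topics 1 and 2, so each could be explained by more than one topic.

```python
import itertools
import math


def collapsed_log_joint(z, words, phi, alpha):
    """log p(z, w | alpha, phi) up to an additive constant.

    Assumes theta has been integrated out under a symmetric Dirichlet(alpha)
    prior, leaving a sum of log-Gamma terms over per-topic token counts.
    """
    counts = [0] * len(phi)
    score = 0.0
    for topic, word in zip(z, words):
        counts[topic] += 1
        score += math.log(phi[topic][word])
    score += sum(math.lgamma(n + alpha) for n in counts)
    return score


def best_assignment(words, phi, alpha):
    """Exhaustively search all token-topic assignments (tiny documents only)."""
    candidates = itertools.product(range(len(phi)), repeat=len(words))
    return max(candidates, key=lambda z: collapsed_log_joint(z, words, phi, alpha))


def topics_per_word_type(z, words):
    """Map each word type to the set of topics its tokens were assigned to."""
    used = {}
    for topic, word in zip(z, words):
        used.setdefault(word, set()).add(topic)
    return used


# Hypothetical toy setup: 3 topics over a 4-word vocabulary (rows sum to 1).
phi = [
    [0.45, 0.45, 0.05, 0.05],
    [0.05, 0.45, 0.45, 0.05],
    [0.05, 0.05, 0.45, 0.45],
]
words = [0, 1, 1, 1, 2, 2, 3]  # one document, as a list of word-type ids
alpha = 0.1                    # assumed sparse symmetric Dirichlet parameter

z_star = best_assignment(words, phi, alpha)
usage = topics_per_word_type(z_star, words)
print("best assignment:", z_star)
print("topics used per word type:", usage)
```

Under these assumptions the per-topic log-Gamma terms are strictly convex in the counts while the word terms are linear, so splitting the tokens of one word type across topics never beats giving them all to a single topic. The exhaustive search is exponential in document length and is meant only to make that effect visible on a handful of tokens.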
Publication Date
2011
Comments
This is the pre-published version harvested from CIIR.
Citation Information
Alexandre Passos, Hanna M. Wallach and Andrew McCallum. "Correlations and Anticorrelations in LDA Inference" (2011)
Available at: http://works.bepress.com/andrew_mccallum/32/