"Correlations and Anticorrelations in LDA Inference" by Alexandre Passos

Selected Works of Andrew McCallum


Unpublished Paper

Correlations and Anticorrelations in LDA Inference

(2011)

Alexandre Passos
Hanna M. Wallach, University of Massachusetts - Amherst
Andrew McCallum


Abstract

In this paper, we present preliminary work on identifying equivalent structures required by LDA, and on how they affect inference of document-specific topic distributions. This work is based on the observation that when occurrences of a particular word type in a document could be explained by multiple topics, inference will almost always force the model to choose between these topics. Not only is this required by the structure of optimal solutions to the LDA inference problem (see Lemma 4 of Sontag and Roy) but, intuitively, this is also why learning the topic-specific distributions over words is possible: by explaining all document-specific occurrences of a word type with one topic and searching for sparse document-topic distributions, the latent topics should "move apart" and eventually represent sets of words that exhibit within-topic, but not across-topic, co-occurrences.
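The "choose between topics" behavior can be illustrated with a minimal sketch. It is not taken from the paper: it assumes a hypothetical document containing a single word type, two topics, a symmetric Dirichlet prior with parameter alpha, and a collapsed (document-topic distribution integrated out) log-score for a joint topic assignment, with constants dropped. The function names and the numbers are illustrative assumptions. Because the log-gamma terms are convex in the number of occurrences given to a topic, the score is maximized at an endpoint, that is, by giving every occurrence to one topic.

```python
from math import lgamma, log

def collapsed_log_score(m, count, b0, b1, alpha):
    """Unnormalized log-score for assigning m of `count` occurrences of one
    word type to topic 0 and the rest to topic 1 (hypothetical toy setup).

    b0, b1: probability of the word type under topic 0 and topic 1.
    alpha:  symmetric Dirichlet parameter on the document-topic distribution.
    """
    word_term = m * log(b0) + (count - m) * log(b1)          # linear in m
    topic_term = lgamma(m + alpha) + lgamma(count - m + alpha)  # convex in m
    return word_term + topic_term

def best_split(count, b0, b1, alpha):
    """Brute-force the number of occurrences assigned to topic 0."""
    return max(range(count + 1),
               key=lambda m: collapsed_log_score(m, count, b0, b1, alpha))

# Both topics explain the word almost equally well, yet the best assignment
# puts all occurrences in a single topic rather than splitting them.
best = best_split(count=6, b0=0.30, b1=0.25, alpha=0.5)
print(best)
```

In this toy setting the maximizer is always 0 or `count`, whatever the values of `b0`, `b1`, and `alpha`; a document with other word types would add fixed offsets to the topic counts without changing the convexity argument.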

Disciplines

Computer Sciences

Publication Date

2011

Comments

This is the pre-published version harvested from CIIR.

Citation Information

Alexandre Passos, Hanna M. Wallach and Andrew McCallum. "Correlations and Anticorrelations in LDA Inference" (2011)
Available at: http://works.bepress.com/andrew_mccallum/32/