Unpublished Paper
Mixtures of Hierarchical Topics with Pachinko Allocation
(2007)
  • David Mimno
  • Wei Li
  • Andrew McCallum, University of Massachusetts - Amherst
Abstract
The four-level Pachinko Allocation model (PAM) represents correlations among topics using a directed acyclic graph (DAG) structure. It does not, however, represent a nested hierarchy of topics, in which some topical word distributions capture the vocabulary shared among several more specific topics. This paper presents Hierarchical PAM, an enhancement that explicitly represents a topic hierarchy. The model can be seen as combining the advantages of hLDA's topical hierarchy representation with PAM's ability to mix multiple leaves of the topic hierarchy. Experimental results show improvements in the likelihood of held-out documents, as well as in mutual information between automatically discovered topics and human-generated categories such as journals and newsgroups.
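The mutual information measure mentioned in the abstract can be illustrated with a minimal sketch. It assumes each document is reduced to a single discovered topic label and a single human-assigned category label; the abstract does not describe the paper's exact evaluation procedure, so the function name `mutual_information` and the one-label-per-document setup are illustrative assumptions, not the authors' method.

```python
import math
from collections import Counter


def mutual_information(topics, categories):
    """Mutual information (in nats) between two equal-length label sequences.

    Hypothetical helper: assumes one topic label and one category label
    per document, which is an illustrative simplification.
    """
    if len(topics) != len(categories):
        raise ValueError("label sequences must have the same length")
    n = len(topics)
    if n == 0:
        raise ValueError("label sequences must be non-empty")

    joint = Counter(zip(topics, categories))  # counts of (topic, category) pairs
    topic_counts = Counter(topics)            # marginal counts per topic
    category_counts = Counter(categories)     # marginal counts per category

    mi = 0.0
    for (t, c), n_tc in joint.items():
        # p(t, c) * log(p(t, c) / (p(t) * p(c))), written with raw counts
        mi += (n_tc / n) * math.log(n_tc * n / (topic_counts[t] * category_counts[c]))
    return mi
```

Under these assumptions, topic labels that line up exactly with two equally sized categories give log 2 nats, and labels that are independent of the categories give 0. A higher value means the discovered topics carry more information about the human-generated categories.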
Publication Date
2007
Comments
This is the pre-published version harvested from CIIR.
Citation Information
David Mimno, Wei Li and Andrew McCallum. "Mixtures of Hierarchical Topics with Pachinko Allocation" (2007)
Available at: http://works.bepress.com/andrew_mccallum/107/