
Unpublished Paper
Topic Models for Taxonomies
(2012)
Abstract
Concept taxonomies such as MeSH, the ACM Computing Classification System, and the NY Times Subject Headings are frequently used to help organize data. They typically consist of a set of concept names organized in a hierarchy. However, the names and structure alone are often insufficient to capture the intended meaning of a taxonomy node, and non-experts in particular may have difficulty navigating the taxonomy and placing data into it. This paper introduces two semi-supervised topic models that automatically augment a given taxonomy with many additional keywords by leveraging a corpus of multi-labeled documents. Our experiments show that users find the topics beneficial for taxonomy interpretation, substantially increasing their cataloging accuracy. Furthermore, the models achieve a better information rate than Labeled LDA.
Keywords
- Topic modeling
- Taxonomy annotation
- Taxonomy browsing
Publication Date
2012
Comments
This is the pre-published version harvested from CIIR.
Citation Information
Anton Bakalov, Andrew McCallum, Hanna M. Wallach and David Mimno. "Topic Models for Taxonomies" (2012) Available at: http://works.bepress.com/hanna_wallach/18/