Unpublished Paper
Classification with Hybrid Generative/Discriminative Models
(2003)
  • Rajat Raina
  • Yirong Shen
  • Andrew Y. Ng
  • Andrew McCallum, University of Massachusetts - Amherst
Abstract
Although discriminatively trained classifiers are usually more accurate when labeled training data is abundant, previous work has shown that when training data is limited, generative classifiers can outperform them. This paper describes a hybrid model in which a high-dimensional subset of the parameters is trained to maximize the generative likelihood, and another, small subset of the parameters is trained to maximize the conditional likelihood. We also give a sample complexity bound showing that, in order to fit the discriminative parameters well, the number of training examples required depends only logarithmically on the number of feature occurrences and the feature set size. Experimental results show that hybrid models can achieve lower test error than either their purely generative or purely discriminative counterparts, and can produce better accuracy/coverage curves than naive Bayes or logistic regression. We also discuss several advantages of hybrid models and advocate further work in this area.
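The abstract's two-stage scheme can be illustrated with a minimal sketch: a large set of per-word parameters is fit generatively as naive Bayes counts, the vocabulary is partitioned into regions, and a small vector of region weights is then fit discriminatively by logistic regression on the per-region naive Bayes log-odds. The toy corpus, the two-way region split, and the Laplace smoothing below are illustrative assumptions, not the paper's exact construction.

```python
import math

# Toy corpus: documents as word lists with binary labels (assumed data).
docs = [(["cheap", "pills", "buy"], 1),
        (["buy", "now", "cheap"], 1),
        (["meeting", "agenda", "notes"], 0),
        (["project", "notes", "agenda"], 0)]
vocab = sorted({w for words, _ in docs for w in words})

# Generative stage: naive Bayes word counts per class, Laplace-smoothed.
alpha = 1.0
cp, cn = {}, {}
for words, y in docs:
    tgt = cp if y == 1 else cn
    for w in words:
        tgt[w] = tgt.get(w, 0) + 1
tot_p = sum(cp.values()) + alpha * len(vocab)
tot_n = sum(cn.values()) + alpha * len(vocab)
# Per-word log P(w|+) - log P(w|-): the high-dimensional parameters.
log_odds = {w: math.log((cp.get(w, 0) + alpha) / tot_p)
             - math.log((cn.get(w, 0) + alpha) / tot_n) for w in vocab}

# Split the vocabulary into two "regions" (an arbitrary illustrative split).
regions = [set(vocab[:len(vocab) // 2]), set(vocab[len(vocab) // 2:])]

def region_scores(words):
    """Summed naive Bayes log-odds per region: the low-dimensional features."""
    return [sum(log_odds[w] for w in words if w in r) for r in regions]

# Discriminative stage: logistic regression over the few region scores,
# trained by gradient ascent on the conditional log-likelihood.
wt = [0.0] * len(regions)
b = 0.0
lr = 0.1
for _ in range(500):
    for words, y in docs:
        s = region_scores(words)
        z = sum(wi * si for wi, si in zip(wt, s)) + b
        p = 1.0 / (1.0 + math.exp(-z))
        g = y - p  # gradient of the conditional log-likelihood
        wt = [wi + lr * g * si for wi, si in zip(wt, s)]
        b += lr * g

def predict(words):
    s = region_scores(words)
    z = sum(wi * si for wi, si in zip(wt, s)) + b
    return 1 if z > 0 else 0
```

With a single region and a unit weight this reduces to plain naive Bayes; with one region per feature the discriminative stage has as many parameters as logistic regression, so the region count interpolates between the two extremes.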
Publication Date
2003
Comments
This is the pre-publication version harvested from CIIR.
Citation Information
Rajat Raina, Yirong Shen, Andrew Y. Ng and Andrew McCallum. "Classification with Hybrid Generative/Discriminative Models" (2003)
Available at: http://works.bepress.com/andrew_mccallum/38/