Unpublished Paper
Classification with Hybrid Generative/Discriminative Models
(2003)
  • Rajat Raina
  • Yirong Shen
  • Andrew Y. Ng
  • Andrew McCallum, University of Massachusetts - Amherst
Abstract
Although discriminatively trained classifiers are usually more accurate when labeled training data is abundant, previous work has shown that when training data is limited, generative classifiers can outperform them. This paper describes a hybrid model in which a high-dimensional subset of the parameters is trained to maximize the generative likelihood, and another, small subset of the parameters is trained to maximize the conditional likelihood. We also give a sample complexity bound showing that, in order to fit the discriminative parameters well, the number of training examples required depends only logarithmically on the number of feature occurrences and the feature set size. Experimental results show that hybrid models can achieve lower test error than either their purely generative or purely discriminative counterparts, and can produce better accuracy/coverage curves than naive Bayes or logistic regression. We also discuss several advantages of hybrid models and advocate further work in this area.
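The abstract's two-stage scheme can be illustrated with a minimal sketch: a large set of per-word parameters is fit generatively as naive Bayes counts, the vocabulary is partitioned into regions, and a small vector of region weights is then fit discriminatively by logistic regression on the per-region naive Bayes log-odds. The toy corpus, the two-way region split, and the Laplace smoothing below are illustrative assumptions, not the paper's exact construction.

```python
import math

# Toy corpus: documents as word lists with binary labels (assumed data).
docs = [(["cheap", "pills", "buy"], 1),
        (["buy", "now", "cheap"], 1),
        (["meeting", "agenda", "notes"], 0),
        (["project", "notes", "agenda"], 0)]
vocab = sorted({w for words, _ in docs for w in words})

# Generative stage: naive Bayes word counts per class, Laplace-smoothed.
alpha = 1.0
cp, cn = {}, {}
for words, y in docs:
    tgt = cp if y == 1 else cn
    for w in words:
        tgt[w] = tgt.get(w, 0) + 1
tot_p = sum(cp.values()) + alpha * len(vocab)
tot_n = sum(cn.values()) + alpha * len(vocab)
# Per-word log P(w|+) - log P(w|-): the high-dimensional parameters.
log_odds = {w: math.log((cp.get(w, 0) + alpha) / tot_p)
             - math.log((cn.get(w, 0) + alpha) / tot_n) for w in vocab}

# Split the vocabulary into two "regions" (an arbitrary illustrative split).
regions = [set(vocab[:len(vocab) // 2]), set(vocab[len(vocab) // 2:])]

def region_scores(words):
    """Summed naive Bayes log-odds per region: the low-dimensional features."""
    return [sum(log_odds[w] for w in words if w in r) for r in regions]

# Discriminative stage: logistic regression over the few region scores,
# trained by gradient ascent on the conditional log-likelihood.
wt = [0.0] * len(regions)
b = 0.0
lr = 0.1
for _ in range(500):
    for words, y in docs:
        s = region_scores(words)
        z = sum(wi * si for wi, si in zip(wt, s)) + b
        p = 1.0 / (1.0 + math.exp(-z))
        g = y - p  # gradient of the conditional log-likelihood
        wt = [wi + lr * g * si for wi, si in zip(wt, s)]
        b += lr * g

def predict(words):
    s = region_scores(words)
    z = sum(wi * si for wi, si in zip(wt, s)) + b
    return 1 if z > 0 else 0
```

With a single region and a unit weight this reduces to plain naive Bayes; with one region per feature the discriminative stage has as many parameters as logistic regression, so the region count interpolates between the two extremes.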
Publication Date
2003
Comments
This is the pre-publication version harvested from CIIR.
Citation Information
Rajat Raina, Yirong Shen, Andrew Y. Ng and Andrew McCallum. "Classification with Hybrid Generative/Discriminative Models" (2003)
Available at: http://works.bepress.com/andrew_mccallum/38/