This paper presents an efficient training approach for support vector machines (SVMs) that improves their ability to learn from large or imbalanced data sets. Given an original training set, the proposed approach applies unsupervised learning to extract a smaller set of salient training exemplars, each represented by a weighted cluster center and its target output. In the subsequent supervised learning, the objective function is modified by introducing a weight for each new training sample and its corresponding penalty term. We investigate two methods of defining the weight based on cluster vectors. The proposed SVM training is implemented and tested on two problems: (i) gender classification of facial images using the FERET data set, and (ii) income prediction using the UCI Adult Census data set. Experimental results show that, compared to standard SVM training, the proposed approach trains much faster and produces a more compact classifier while maintaining generalization ability.
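The data-reduction step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the use of per-class k-means, the function names `kmeans` and `reduce_training_set`, the parameter `k_per_class`, and the choice of cluster size as the weight are all assumptions made for illustration, and the abstract does not state which two weighting methods the paper investigates.

```python
# Illustrative sketch only (not the paper's code): replace a training set with
# weighted cluster centers. Per-class k-means and size-based weights are assumptions.
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest center (squared Euclidean distance).
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, clusters


def reduce_training_set(X, y, k_per_class=2):
    """Return (centers, labels, weights) with one exemplar per non-empty cluster.

    Assumed weighting: the number of original samples in the cluster.
    """
    centers_out, labels_out, weights_out = [], [], []
    for label in sorted(set(y)):
        pts = [x for x, t in zip(X, y) if t == label]
        centers, clusters = kmeans(pts, min(k_per_class, len(pts)))
        for center, members in zip(centers, clusters):
            if members:
                centers_out.append(center)
                labels_out.append(label)
                weights_out.append(len(members))
    return centers_out, labels_out, weights_out


# Toy data: seven 2-D points with labels -1 and +1.
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (9.0, 0.0), (9.2, 0.1)]
y = [-1, -1, -1, 1, 1, 1, 1]
centers, labels, weights = reduce_training_set(X, y, k_per_class=2)
```

The reduced set `(centers, labels, weights)` would then be passed to any SVM solver that supports per-sample weights in the penalty term; for example, scikit-learn's `SVC.fit` accepts a `sample_weight` argument that rescales the penalty per sample.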
Available at: http://works.bepress.com/son_lam_phung/9/