Other
Using Asymmetric Classification Cost Matrices in Predicting Diabetes
ICDSS 2007 Proceedings
  • Bishwadip Ghosh, University of Colorado, Denver
  • Joseph Hasley, Health Sciences Center, Denver
Publication Date
1-1-2007
Abstract

Classifiers used to predict disease often need to account for classification costs. The appropriate costs are determined by the type of disease, its associated classification cost matrix, and/or the target population on which the classifier will be used. For diabetes, false negatives carry higher costs than false positives, because the disease can progress very rapidly when left untreated. There are two ways to skew a classifier toward a given classification cost matrix: (1) changing the classification probability value, P*, based on the classification cost matrix, or (2) rebalancing the training set to introduce more negative cases. Using a diabetes data set, this paper compares the two methods. The results indicate comparable predictive accuracy and expected classification costs for the two methods. However, the P* method works better when the p-value is less than 0.2. Hence, for diabetes classification cost matrices, the P* method is recommended.
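
The first method, choosing a probability threshold from the cost matrix, can be sketched as follows. The abstract does not state how P* is computed, so this sketch assumes the standard decision-theoretic threshold for a cost matrix in which correct predictions cost nothing: P* = C_FP / (C_FP + C_FN). The function names, the 1:4 cost ratio, and the sample probabilities and labels are all hypothetical and are not taken from the paper.

```python
def cost_threshold(cost_fp, cost_fn):
    """Threshold on P(positive) that minimizes expected cost.

    Assumption: true positives and true negatives cost nothing.
    Predicting positive costs (1 - p) * cost_fp on average and
    predicting negative costs p * cost_fn, so positive is the
    cheaper call whenever p >= cost_fp / (cost_fp + cost_fn).
    """
    if cost_fp <= 0 or cost_fn <= 0:
        raise ValueError("costs must be positive")
    return cost_fp / (cost_fp + cost_fn)


def classify(prob_positive, threshold):
    """Return 1 (positive) if the predicted probability reaches the threshold."""
    return 1 if prob_positive >= threshold else 0


def expected_cost(y_true, y_pred, cost_fp, cost_fn):
    """Average misclassification cost per case."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return (fp * cost_fp + fn * cost_fn) / len(y_true)


# Hypothetical cost matrix: a missed case costs four times a false alarm.
COST_FP, COST_FN = 1.0, 4.0
p_star = cost_threshold(COST_FP, COST_FN)

# Hypothetical predicted probabilities and true labels (1 = diabetic).
probs = [0.05, 0.25, 0.40, 0.70]
y_true = [0, 1, 0, 1]

y_pred_cost = [classify(p, p_star) for p in probs]   # cost-sensitive threshold
y_pred_half = [classify(p, 0.5) for p in probs]      # conventional 0.5 threshold

cost_with_p_star = expected_cost(y_true, y_pred_cost, COST_FP, COST_FN)
cost_with_half = expected_cost(y_true, y_pred_half, COST_FP, COST_FN)
```

Under these assumed costs the threshold drops below 0.5, so borderline cases are flagged as positive: the classifier accepts some extra false positives in exchange for fewer of the more expensive false negatives. The second method, rebalancing the training set, is not shown here because the abstract does not give the resampling ratio.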

Citation Information
Bishwadip Ghosh and Joseph Hasley. "Using Asymmetric Classification Cost Matrices in Predicting Diabetes" (2007)
Available at: http://works.bepress.com/biswadip_ghosh/3/