Many machine learning algorithms have been used to classify pixels in Landsat imagery. The maximum likelihood classifier is a widely accepted parametric classifier. Non-parametric classification methods include neural networks and decision trees. In this research work, we implemented decision trees using the C4.5 algorithm to classify pixels of a scene of the Juneau, Alaska area acquired by the Landsat 8 Operational Land Imager (OLI). One concern with decision trees is that they often overfit the training data, which reduces accuracy when classifying unseen data. To study the effect of overfitting, we considered noisy training data and built decision trees from randomly selected training samples of varying sizes. One way to overcome overfitting is to prune the decision tree. We generated pruned trees from data sets of various sizes and compared their accuracy with that of the full decision trees. Furthermore, we extracted classification rules from the pruned tree. To validate the rules, we built a fuzzy inference system (FIS) and reclassified the dataset. In designing the FIS, we used threshold values from the extracted rules to define the input membership functions and used the extracted rules as the rule base. The classification results from the decision trees and the FIS were evaluated using the overall accuracy computed from the confusion matrix.
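As an illustration of the evaluation measure named above, the overall accuracy of a classifier can be read off a confusion matrix as the sum of the diagonal entries divided by the total number of samples. The following is a minimal sketch only: the function name `overall_accuracy` and the 3-class matrix are hypothetical and are not taken from the study's data or results.

```python
def overall_accuracy(confusion):
    """Return the fraction of samples on the main diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")
    # Diagonal entries count samples whose predicted class equals the reference class.
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


# Hypothetical 3-class matrix (rows: reference class, columns: predicted class).
# These counts are made up for illustration and do not come from the paper.
example = [
    [50, 3, 2],
    [4, 40, 6],
    [1, 5, 39],
]

accuracy = overall_accuracy(example)
print(f"overall accuracy = {accuracy:.3f}")
```

The same function could be applied to the confusion matrices of a full tree, a pruned tree, and an FIS to compare them on a common scale, which is the kind of comparison the abstract describes.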
Available at: http://works.bepress.com/arun-kulkarni/57/