J48

We swept the confidence factor from 0.1 to 0.5. A smaller confidence factor increases the pruning of the tree. Too much pruning decreases the accuracy of the classifier, but too little pruning risks overfitting the training data. We found the highest kappa value at a confidence factor of 0.24. The results appear stable around this point, between 0.2 and 0.3. Interestingly, the small spike at low confidence values might be an indicator of overfitting starting to happen.
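The sweep described above can be sketched roughly as follows. This is a minimal illustration, not the code we ran: the `evaluate` callback is a hypothetical stand-in for training J48 at a given confidence factor and returning a confusion matrix, and the step size of 0.01 is an assumption. Only the kappa computation itself is the standard definition of Cohen's kappa.

```python
from typing import Callable, List, Sequence, Tuple

Matrix = Sequence[Sequence[int]]


def cohen_kappa(confusion: Matrix) -> float:
    """Cohen's kappa from a square confusion matrix (rows: actual, columns: predicted)."""
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    n = len(confusion)
    # Observed agreement: fraction of instances on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(n)
    ) / (total * total)
    if expected == 1.0:
        # Kappa is undefined here; returning 0.0 is a convention chosen for this sketch.
        return 0.0
    return (observed - expected) / (1.0 - expected)


def sweep_confidence(
    evaluate: Callable[[float], Matrix],  # hypothetical: trains and evaluates at one confidence factor
    start: float = 0.10,
    stop: float = 0.50,
    step: float = 0.01,  # assumed step size, not taken from the experiment
) -> Tuple[Tuple[float, float], List[Tuple[float, float]]]:
    """Return ((best_cf, best_kappa), all_results) over the inclusive range."""
    steps = round((stop - start) / step)
    results = []
    for k in range(steps + 1):
        cf = round(start + k * step, 10)  # avoid accumulated floating-point drift
        results.append((cf, cohen_kappa(evaluate(cf))))
    best = max(results, key=lambda pair: pair[1])
    return best, results
```

In practice `evaluate` would wrap a cross-validated J48 run; inspecting `all_results` rather than only the best point makes it easier to judge how stable the region around the maximum is.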