TY - CHAP
T1 - Improving decision tree performance through induction and cluster-based stratified sampling
AU - Gill, Abdul A.
AU - Smith, George D.
AU - Bagnall, Anthony J.
PY - 2004
N2 - It is generally recognised that recursive partitioning, as used in the construction of classification trees, is inherently unstable, particularly for small data sets. Classification accuracy and, by implication, tree structure, are sensitive to changes in the training data. Successful approaches to counteract this effect include multiple classifiers, e.g. boosting, bagging or windowing. The downside of these multiple classification models, however, is the plethora of trees that result, often making it difficult to extract the classifier in a meaningful manner. We show that, by using some very weak knowledge in the sampling stage, when the data set is partitioned into the training and test sets, a more consistent and improved performance is achieved by a single decision tree classifier.
U2 - 10.1007/978-3-540-28651-6_50
M3 - Chapter
SN - 978-3-540-22881-3
VL - 3177
T3 - Lecture Notes in Computer Science
SP - 339
EP - 344
BT - Intelligent Data Engineering and Automated Learning – IDEAL 2004
PB - Springer
ER -