We present a systematic study of how including attributes constructed by genetic programming (GP) affects the performance of a range of classification algorithms. The genetic program uses information gain as the basis of its fitness measure. The classification algorithms used are C5, CART, CHAID and an MLP. The results show that, for the majority of the data sets used, all algorithms benefit from the inclusion of the evolved attributes. However, for one data set, whilst the performance of C5 improves, the performance of the other techniques deteriorates. Although this result is not statistically significant, it indicates that care must be taken when the pre-processing technique (here, attribute construction using GP) and the classification technique (here, C5) rely on the same underlying criterion, in this case information gain.
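To make the fitness criterion concrete, the following is a minimal sketch of scoring a constructed attribute by information gain. It is an illustration under stated assumptions, not the implementation used in the study: the toy data, the function names `entropy` and `information_gain`, and the hand-written constructed attribute (standing in for an expression a GP run might evolve) are all hypothetical, and attributes are assumed to be discrete.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(values, labels):
    """Reduction in class entropy obtained by partitioning on a discrete attribute."""
    total = len(labels)
    partitions = {}
    for value, label in zip(values, labels):
        partitions.setdefault(value, []).append(label)
    remainder = sum(len(part) / total * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder


# Hypothetical toy data: the class depends on x1 and x2 jointly (an XOR pattern),
# so neither original attribute is informative on its own.
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
labels = [0, 1, 1, 0]

# A hand-written stand-in for an evolved expression combining the two attributes.
constructed = [a != b for a, b in zip(x1, x2)]

gain_x1 = information_gain(x1, labels)
gain_constructed = information_gain(constructed, labels)
```

In this toy case the original attribute yields no gain while the constructed one separates the classes completely, which is the kind of improvement an information-gain-based fitness rewards. Because C5 also selects splits using an information-based criterion, an attribute scored this way is tailored to that classifier more directly than to the others.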
|Lecture Notes in Computer Science
|Springer Berlin / Heidelberg
|16th Australian Conference on AI
|3/12/03 → 5/12/03