A hybrid classification method: Discrete canonical variate analysis using a genetic algorithm

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)

Abstract

This paper describes a novel, hybrid multivariate classification method: discrete canonical variate analysis (DCVA), which is integrated in the present implementation with a genetic algorithm (GA). DCVA transforms a multivariate data set into a set of discrete scores of lower dimensionality, intended specifically to act as classifiers of observations into one of several pre-defined groups. The condition for selecting the DCVA loadings is maximization of the ratio of the between-groups to within-groups variance of the scores, but unlike conventional CVA, there is a non-linear, discontinuous relationship between the scores and loadings. The performance of the DCVA method is compared with that of two competing classification methods, artificial neural networks (ANNs) and Mahalanobis distance-based linear discriminant analysis (LDA), using six example problems. In all cases, internal (leave-one-out) cross-validation was used, and classification success rates were retained from both the training and test segments. Of the methods studied, DCVA clearly performed the best in training, producing the highest mean success rates for four out of the six example data sets. For the test segments, DCVA produced the best performance for two of the data sets, and equalled that of LDA and ANN for a third. However, LDA produced the best performance for the remaining three data sets. This suggests a greater tendency of DCVA, like other search-based methods, to overfit.
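The selection criterion described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how scores are discretised, so the rounding step in `discrete_scores` is an assumption made purely for illustration, and both function names are hypothetical. The ratio is computed from sums of squares; this differs from a ratio of variances only by a constant degrees-of-freedom factor, which does not change which loadings maximise it.

```python
from statistics import mean


def discrete_scores(X, loadings):
    """Project each observation onto the loadings, then discretise.

    ASSUMPTION: rounding to the nearest integer stands in for the
    paper's discretisation rule, which the abstract does not specify.
    """
    return [round(sum(x * w for x, w in zip(row, loadings))) for row in X]


def variance_ratio(scores, groups):
    """Between-groups to within-groups sum-of-squares ratio of 1-D scores."""
    grand_mean = mean(scores)

    # Collect the scores belonging to each pre-defined group.
    by_group = {}
    for score, group in zip(scores, groups):
        by_group.setdefault(group, []).append(score)

    between = sum(
        len(members) * (mean(members) - grand_mean) ** 2
        for members in by_group.values()
    )
    within = sum(
        (score - mean(members)) ** 2
        for members in by_group.values()
        for score in members
    )
    # Perfectly separated, zero-spread groups give an unbounded ratio.
    return float("inf") if within == 0 else between / within


# Toy data: four observations of one variable, two groups.
X = [[1.0], [2.0], [3.0], [4.0]]
groups = ["a", "a", "b", "b"]

for w in ([1.0], [0.4]):
    scores = discrete_scores(X, w)
    print(w, scores, variance_ratio(scores, groups))
```

In this toy case the loading 1.0 yields scores 1, 2, 3, 4 and a ratio of 4.0, while 0.4 yields scores 0, 1, 1, 2 and a ratio of 1.0. Because of the discretisation step the objective changes in jumps as the loadings vary, which is one plausible reason a search method such as a GA is paired with the criterion.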

Original language: English
Pages (from-to): 39-51
Number of pages: 13
Journal: Chemometrics and Intelligent Laboratory Systems
Volume: 55
Issue number: 1-2
Publication status: Published - 13 Jan 2001
Externally published: Yes

Keywords

  • Canonical variate analysis-CVA
  • Classification
  • Genetic algorithm-GA
  • Non-linear
