Handling Categorical data in Rule Induction

M. Burgess, G. J. Janacek, V. J. Rayward-Smith

Research output: Contribution to conference (paper)

Abstract

In this paper we address problems arising from the use of categorical-valued data in rule induction. Using categorical values naively in rule induction risks reducing the chances of finding a good rule, in terms both of confidence (accuracy) and of support (coverage). We introduce a technique called arcsin transformation, in which categorical-valued data is replaced with numeric values. Our results show that this technique is highly effective for handling categorical-valued data on relatively large databases containing many unordered categorical attributes, on larger databases incorporating both unordered and numeric data, and especially on small databases containing rare cases.
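The abstract does not define the transformation, so the following is only an illustrative sketch and not the authors' formulation. It assumes the classical variance-stabilising arcsine square-root transform, arcsin(√p), applied to the proportion p of records in each category that belong to a chosen target class. The function name `arcsin_encode`, the parameter `target`, and the toy data are all hypothetical.

```python
import math
from collections import defaultdict


def arcsin_encode(values, labels, target):
    """Map each category to arcsin(sqrt(p)).

    p is the share of records in that category whose label equals `target`.
    ASSUMPTION: this is the textbook arcsine square-root transform; the
    paper may define its transformation differently.
    """
    totals = defaultdict(int)  # records seen per category
    hits = defaultdict(int)    # records per category with the target label
    for value, label in zip(values, labels):
        totals[value] += 1
        if label == target:
            hits[value] += 1
    return {v: math.asin(math.sqrt(hits[v] / totals[v])) for v in totals}


# Hypothetical toy data: one unordered categorical attribute and a class label.
colours = ["red", "red", "blue", "blue", "blue", "green"]
outcome = ["yes", "no", "yes", "yes", "yes", "no"]

encoding = arcsin_encode(colours, outcome, target="yes")

# Replace the categorical column with its numeric encoding.
numeric_column = [encoding[c] for c in colours]
```

Under this assumption the encoded values lie between 0 and π/2 and are ordered by class proportion, so a rule inducer could split on a numeric threshold rather than test each category separately.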
Original language: English
Pages: 249-255
Number of pages: 7
Publication status: Published - 2003
Event: International Conference on Artificial Neural Nets and Genetic Algorithms (ICANNGA) - Roanne, France
Duration: 1 Jan 2003 → …

Conference

Conference: International Conference on Artificial Neural Nets and Genetic Algorithms (ICANNGA)
Country/Territory: France
City: Roanne
Period: 1/01/03 → …
