Handling Categorical data in Rule Induction

M. Burgess, G. J. Janacek, V. J. Rayward-Smith

Research output: Contribution to conference (Paper)


This paper addresses problems arising from the use of categorical-valued data in rule induction. Using categorical values naively in rule induction risks reducing the chance of finding a good rule, in terms both of confidence (accuracy) and of support (coverage). We introduce a technique, the arcsin transformation, in which categorical-valued data are replaced with numeric values. Our results show that the technique is highly effective for handling categorical-valued data on three kinds of database: relatively large databases containing many unordered categorical attributes; larger databases incorporating both unordered and numeric data; and, especially, small databases containing rare cases.
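The abstract does not spell out how the transformation is computed. As a rough illustration only, the sketch below assumes the classical arcsine square-root transform, arcsin(sqrt(p)), applied to p, the proportion of records in each category that belong to a target class. The function name `arcsine_encode`, the toy data, and the choice of p are hypothetical and may differ from the encoding used in the paper.

```python
import math
from collections import defaultdict


def arcsine_encode(categories, targets):
    """Map each category to arcsin(sqrt(p)).

    p is the fraction of records in that category whose target is truthy.
    This is an assumed encoding for illustration, not the paper's definition.
    """
    counts = defaultdict(int)
    positives = defaultdict(int)
    for category, target in zip(categories, targets):
        counts[category] += 1
        if target:
            positives[category] += 1
    return {
        category: math.asin(math.sqrt(positives[category] / counts[category]))
        for category in counts
    }


# Toy data (hypothetical): an unordered categorical attribute and a binary class.
colours = ["red", "red", "blue", "blue", "blue", "green"]
in_class = [1, 0, 1, 1, 1, 0]

encoding = arcsine_encode(colours, in_class)
# Replace the categorical column with its numeric encoding.
numeric_column = [encoding[c] for c in colours]
```

Under this assumption each category becomes a number between 0 and pi/2, so a rule inducer can place thresholds on the attribute as it would for any numeric field.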
Original language: English
Number of pages: 7
Publication status: Published - 2003
Event: International Conference on Artificial Neural Nets and Genetic Algorithms (ICANNGA) - Roanne, France
Duration: 1 Jan 2003 → …


