Classification rules are a convenient way of expressing regularities that exist within databases. They are particularly useful when we wish to find patterns that describe a defined class of interest, i.e. for the task of partial classification or "nugget discovery". In this paper we address the problem of finding classification rules in databases containing nominal and ordinal attributes. The number of rules that can be formulated from a database is usually vast because of combinatorial explosion, so generating all rules in order to find the best ones (according to some stated criteria) is usually impractical and alternative strategies must be used. We present an algorithm that delivers a clearly defined set of rules, the pc'-optimal set. This set describes the interesting associations in a database but excludes many rules that are simply minor variations of other rules. The algorithm addresses the problem of combinatorial explosion and can find rules in databases comprising nominal and ordinal attributes. To find the pc'-optimal set efficiently, the search uses novel pruning functions that take advantage of the properties of the pc'-optimal set. Our main contribution is a method of on-the-fly pruning that exploits the relationship between pc'-optimal sets and ordinal data. We show that these methods yield a very considerable increase in efficiency, allowing useful rules to be discovered from many databases.
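The scale of the combinatorial explosion mentioned above can be illustrated with a short counting sketch. It is not taken from the paper: it assumes a simplified rule language in which an antecedent is a conjunction with at most one condition per attribute, a nominal attribute is tested for equality with a single value, and an ordinal attribute is tested against a contiguous range of its ordered values. The function name `count_antecedents` and its parameters are hypothetical.

```python
from math import prod


def count_antecedents(nominal_sizes, ordinal_sizes):
    """Count non-empty conjunctive antecedents under the assumed rule language.

    nominal_sizes: number of distinct values of each nominal attribute.
    ordinal_sizes: number of distinct ordered values of each ordinal attribute.
    """
    # Nominal attribute with k values: k equality tests, or the attribute is omitted.
    nominal_options = [k + 1 for k in nominal_sizes]
    # Ordinal attribute with k values: k*(k+1)/2 contiguous ranges [lo, hi].
    # The full range is equivalent to omitting the attribute, so it stands in
    # for the "omitted" option and no extra +1 is needed.
    ordinal_options = [k * (k + 1) // 2 for k in ordinal_sizes]
    # Subtract 1 to exclude the empty antecedent (every attribute omitted).
    return prod(nominal_options) * prod(ordinal_options) - 1


if __name__ == "__main__":
    # Hypothetical database: 10 nominal attributes with 5 values each
    # and 5 ordinal attributes with 10 values each.
    print(count_antecedents([5] * 10, [10] * 5))
```

Even under these restrictive assumptions the count grows multiplicatively with every attribute added, which is why exhaustive enumeration is impractical and pruning during the search matters.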
|Number of pages||19|
|Journal||Intelligent Data Analysis|
|Publication status||Published - May 2005|