The use of a supervised k-means algorithm on real-valued data with applications in health

Sami H. Al-Harbi, Vic J. Rayward-Smith

Research output: Chapter in Book/Report/Conference proceeding (Chapter)

10 Citations (Scopus)

Abstract

k-means is traditionally viewed as an unsupervised algorithm for clustering a heterogeneous population into a number of more homogeneous groups of objects. However, it is not guaranteed to group objects of the same type (class) together. In such cases, some supervision is needed to partition objects that have the same class label into one cluster. This paper demonstrates how the popular k-means clustering algorithm can be profitably modified for use as a classification algorithm. The output field itself cannot be used in the clustering, but it is used in developing a suitable metric defined on the other fields. The proposed algorithm combines simulated annealing with the modified k-means algorithm. We also apply the proposed algorithm to real data sets, where it yields improvements in confidence compared with C4.5.
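The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the weighted Euclidean metric, the use of cluster purity as a stand-in for "confidence", the annealing schedule, and all function names (`weighted_dist`, `kmeans`, `confidence`, `anneal_weights`) are assumptions made here for illustration. The class labels never enter the distance computation; they are used only to score candidate feature weights, which simulated annealing searches over.

```python
import math
import random


def weighted_dist(a, b, w):
    # Weighted Euclidean distance (an assumed form of the learned metric).
    return math.sqrt(sum(wi * (x - y) ** 2 for wi, x, y in zip(w, a, b)))


def kmeans(points, k, w, iters=20, seed=0):
    # Plain k-means on the input fields only, using the weighted metric.
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [
            min(range(k), key=lambda c: weighted_dist(p, centers[c], w))
            for p in points
        ]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign


def confidence(assign, labels, k):
    # Cluster purity: fraction of objects matching their cluster's majority
    # class (an assumed stand-in for the paper's confidence measure).
    correct = 0
    for c in range(k):
        member_labels = [l for a, l in zip(assign, labels) if a == c]
        if member_labels:
            correct += max(member_labels.count(l) for l in set(member_labels))
    return correct / len(labels)


def anneal_weights(points, labels, k, steps=200, t0=1.0, cooling=0.97, seed=0):
    # Simulated annealing over feature weights in [0, 1]; labels are used
    # only to evaluate each candidate metric, never inside the clustering.
    rng = random.Random(seed)
    w = [1.0] * len(points[0])
    score = confidence(kmeans(points, k, w), labels, k)
    best_w, best_score = list(w), score
    t = t0
    for _ in range(steps):
        cand = [min(1.0, max(0.0, wi + rng.uniform(-0.2, 0.2))) for wi in w]
        cand_score = confidence(kmeans(points, k, cand), labels, k)
        delta = cand_score - score
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature falls.
        if delta >= 0 or rng.random() < math.exp(delta / t):
            w, score = cand, cand_score
            if score > best_score:
                best_w, best_score = list(w), score
        t *= cooling  # geometric cooling schedule (assumed)
    return best_w, best_score
```

Calling `anneal_weights(points, labels, k)` on a list of real-valued tuples and a parallel list of class labels returns the best weight vector found and its purity score; new objects could then be classified by the majority class of their nearest cluster under that metric.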
Original language: English
Title of host publication: Developments in Applied Artificial Intelligence
Editors: Paul Chung, Chris Hinde, Moonis Ali
Place of Publication: Berlin / Heidelberg
Publisher: Springer
Pages: 373-387
Number of pages: 15
Volume: 2718
Publication status: Published - 2003

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer
