A Novel Bit Level Time Series Representation with Implication of Similarity Search and Clustering

Chotirat Ratanamahatana, Eamonn Keogh, Anthony J. Bagnall, Stefano Lonardi

Research output: Chapter in Book/Report/Conference proceeding > Chapter

74 Citations (Scopus)

Abstract

Because time series are a ubiquitous and increasingly prevalent type of data, much recent research effort has been devoted to time series data mining. As with all data mining problems, the key to effective and scalable algorithms is choosing the right representation of the data. Many high level representations of time series have been proposed for data mining. In this work, we introduce a new technique based on a bit level approximation of the data. The representation has several important advantages over existing techniques. One unique advantage is that it allows raw data to be directly compared to the reduced representation, while still guaranteeing lower bounds to Euclidean distance. This fact can be exploited to produce faster exact algorithms for similarity search. In addition, we demonstrate that our new representation allows time series clustering to scale to much larger datasets.
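
The abstract does not spell out the encoding, so the following is only a minimal sketch of how a raw query might be compared directly against a bit level approximation while still lower-bounding Euclidean distance. It assumes one bit per data point, set to 1 when a zero-mean series is above zero and 0 otherwise. The function names (`center`, `to_bits`, `lower_bound`, `euclidean`) are hypothetical and are not taken from the chapter.

```python
import math


def center(series):
    """Subtract the mean so the series has mean zero (assumed preprocessing)."""
    mean = sum(series) / len(series)
    return [x - mean for x in series]


def to_bits(centered):
    """Assumed encoding: 1 where the zero-mean value is above zero, else 0."""
    return [1 if x > 0 else 0 for x in centered]


def lower_bound(query, bits):
    """Lower bound on the Euclidean distance between a raw query and the
    series encoded by `bits`.

    Bit 1 means the stored value is > 0, bit 0 means it is <= 0. When the
    query value lies on the opposite side of zero, the true difference is
    at least |q|, so q*q can be added safely; otherwise nothing is added.
    """
    total = 0.0
    for q, b in zip(query, bits):
        if (b == 1 and q <= 0) or (b == 0 and q > 0):
            total += q * q
    return math.sqrt(total)


def euclidean(a, b):
    """Exact Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


stored = center([1.0, 3.0, 2.0, 6.0, 4.0])
query = center([5.0, 1.0, 4.0, 2.0, 3.0])
bits = to_bits(stored)

# The bound computed from the bits never exceeds the true distance.
assert lower_bound(query, bits) <= euclidean(query, stored)
```

Under these assumptions, an exact search could skip any stored series whose lower bound already exceeds the best distance found so far, and compute the full Euclidean distance only for the remaining candidates.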
Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining
Editors: TB Ho, D Cheung, H Liu
Publisher: Springer Berlin / Heidelberg
Pages: 51-65
Number of pages: 15
Volume: 3518
ISBN (Print): 978-3-540-26076-9
Publication status: Published - 2005

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer Berlin / Heidelberg
