TY - JOUR
T1 - An ultra-fast time series distance measure to allow data mining in more complex real-world deployments
AU - Gharghabi, Shaghayegh
AU - Imani, Shima
AU - Bagnall, Anthony
AU - Darvishzadeh, Amirali
AU - Keogh, Eamonn
PY - 2020/7
Y1 - 2020/7
N2 - At their core, many time series data mining algorithms reduce to reasoning about the shapes of time series subsequences. This requires an effective distance measure, and for the last two decades most algorithms have used Euclidean Distance or DTW as their core subroutine. We argue that these distance measures are not as robust as the community seems to believe. The undue faith in these measures perhaps derives from an overreliance on the benchmark datasets and self-selection bias. The community is simply reluctant to address more difficult domains, for which current distance measures are ill-suited. In this work, we introduce a novel distance measure, MPdist. We show that our proposed distance measure is much more robust than current distance measures. For example, it can handle data with missing values or spurious regions. Furthermore, it allows us to successfully mine datasets that would defeat any Euclidean or DTW distance-based algorithm. Additionally, we show that our distance measure can be computed so efficiently as to allow analytics on very fast-arriving streams.
AB - At their core, many time series data mining algorithms reduce to reasoning about the shapes of time series subsequences. This requires an effective distance measure, and for the last two decades most algorithms have used Euclidean Distance or DTW as their core subroutine. We argue that these distance measures are not as robust as the community seems to believe. The undue faith in these measures perhaps derives from an overreliance on the benchmark datasets and self-selection bias. The community is simply reluctant to address more difficult domains, for which current distance measures are ill-suited. In this work, we introduce a novel distance measure, MPdist. We show that our proposed distance measure is much more robust than current distance measures. For example, it can handle data with missing values or spurious regions. Furthermore, it allows us to successfully mine datasets that would defeat any Euclidean or DTW distance-based algorithm. Additionally, we show that our distance measure can be computed so efficiently as to allow analytics on very fast-arriving streams.
KW - Time Series
KW - Distance Measure
KW - Matrix Profile
UR - http://www.scopus.com/inward/record.url?scp=85085629527&partnerID=8YFLogxK
U2 - 10.1007/s10618-020-00695-8
DO - 10.1007/s10618-020-00695-8
M3 - Article
SN - 1384-5810
VL - 34
SP - 1104
EP - 1135
JO - Data Mining and Knowledge Discovery
JF - Data Mining and Knowledge Discovery
IS - 4
ER -