Novel methods for imputing missing values in water level monitoring data

Research output: Contribution to journal › Article › peer-review


Abstract

Hydrological data are collected automatically from remote water level monitoring stations and transmitted to the national water management centre via a telemetry system. However, the data received at the centre can be incomplete or anomalous because of instrument problems such as power and sensor failures. Usually, detected anomalies and missing data are simply removed, which can lead to inaccurate analysis or even false alarms. It is therefore helpful to identify missing values and correct them as accurately as possible. In this paper, we introduce a new approach, Full Subsequence Matching (FSM), for imputing missing values in telemetry water level data. FSM first identifies a sequence of missing values and replaces them with constant values to create a dummy complete sequence. It then searches the historical data for the most similar subsequence. Finally, the identified subsequence is adapted to fit the missing part based on their similarity. The imputation accuracy of FSM was evaluated on telemetry water level data and compared with well-established methods (interpolation, k-NN, and MissForest) and with a leading deep learning method, Long Short-Term Memory (LSTM). Experimental results show that FSM can produce more precise imputations, particularly for data with strong periodic patterns.
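The three steps named in the abstract (dummy fill, similarity search over historical data, adaptation of the match) can be illustrated with a minimal Python sketch. This is not the authors' FSM implementation; the abstract does not specify the distance measure, window size, or adaptation rule. The function name `impute_gap_by_subsequence_match`, the `context` and `fill_value` parameters, the root-mean-square distance computed on observed points only, and the mean-offset adaptation are all assumptions made for illustration, and the sketch handles a single contiguous gap.

```python
import numpy as np


def impute_gap_by_subsequence_match(series, context=10, fill_value=0.0):
    """Fill one contiguous NaN gap using the most similar historical window.

    Illustrative sketch only; every design choice here is an assumption.
    """
    x = np.asarray(series, dtype=float)
    gap = np.flatnonzero(np.isnan(x))
    if gap.size == 0:
        return x.copy()
    start, stop = gap[0], gap[-1] + 1
    if not np.isnan(x[start:stop]).all():
        raise ValueError("this sketch handles one contiguous gap only")

    # Query window: the gap plus `context` observed points on each side.
    lo, hi = max(0, start - context), min(len(x), stop + context)
    width = hi - lo

    # Step 1: replace the missing values with a constant so the query
    # window becomes a complete (dummy) sequence of fixed length.
    query = x[lo:hi].copy()
    known = ~np.isnan(query)
    query[~known] = fill_value

    # Step 2: slide over the rest of the series and keep the window that
    # is closest to the query. Assumption: distance uses observed points
    # only, so the dummy constant does not bias the match.
    best_pos, best_dist = None, np.inf
    for pos in range(len(x) - width + 1):
        if pos < hi and pos + width > lo:
            continue  # skip windows that overlap the query itself
        cand = x[pos:pos + width]
        dist = np.sqrt(np.mean((cand[known] - query[known]) ** 2))
        if dist < best_dist:
            best_pos, best_dist = pos, dist
    if best_pos is None:
        raise ValueError("not enough history to search")

    # Step 3: adapt the match to the local level. Assumption: shift it by
    # the mean difference between the two windows on the observed points.
    match = x[best_pos:best_pos + width]
    offset = np.mean(query[known] - match[known])
    result = x.copy()
    result[start:stop] = match[~known] + offset
    return result


# Usage on a synthetic periodic "water level" with a five-point gap.
t = np.arange(200)
truth = 2.0 + np.sin(2 * np.pi * t / 20)
level = truth.copy()
level[150:155] = np.nan
filled = impute_gap_by_subsequence_match(level, context=10)
```

On a strictly periodic synthetic series like this one, a phase-aligned historical window exists, so the gap is recovered almost exactly; real telemetry data would need a more careful similarity measure and adaptation step.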
Original language: English
Journal: Water Resources Management
Early online date: 5 Jan 2023
Publication status: E-pub ahead of print - 5 Jan 2023

Keywords

  • Water level telemetry monitoring
  • Missing Data
  • Imputation
  • Missing data imputation
  • Time series
  • Incomplete subsequence
