Weighted Ensemble Methods for Predicting Train Delays

Mostafa Al Ghamdi, Gerard Parr, Wenjia Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)


Train delays have become a serious and common problem in rail services because of the increasing number of passengers and limited rail network capacity, so being able to predict train delays accurately is essential for train controllers to devise appropriate plans to prevent or reduce delays. This paper presents a machine learning ensemble framework to improve the accuracy and consistency of train delay prediction. The basic idea is to train many different types of machine learning models for each station along a chosen journey of a train service, using historical data and relevant weather data, and then to select some of these models, according to certain criteria, to build an ensemble. The ensemble then combines the outputs of its member models with an aggregation function to produce the final prediction. Two aggregation functions were devised to combine the outputs of the individual models: averaging and weighted averaging. These ensembles were implemented within the framework, and their performance was tested with data from an intercity train service as a case study. Accuracy was measured by two percentages: the percentage of predictions that matched the actual arrival time of a train, and the percentage of predictions that fell within one minute of the actual arrival time. For correct predictions, the mean accuracies and standard deviations are 42.3% (±11.24) for the individual models, 57.8% (±3.56) for the averaging ensembles, and 72.8% (±0.99) for the weighted ensembles. For predictions within one minute of the actual times, they are 86.4% (±14.05), 94.6% (±1.34) and 96.0% (±0.47) respectively. Overall, the ensembles significantly improved not only the prediction accuracy but also the consistency, and the weighted ensembles are clearly the best.
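The two aggregation functions and the two accuracy measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the weights are derived or how a "correct" prediction is defined, so the sketch assumes that weights are supplied by the caller (for example, from each model's validation accuracy) and that a prediction is correct when it matches the actual delay after rounding to whole minutes. All function names are hypothetical.

```python
from typing import Sequence


def average(predictions: Sequence[float]) -> float:
    """Plain averaging: every member model contributes equally."""
    return sum(predictions) / len(predictions)


def weighted_average(predictions: Sequence[float],
                     weights: Sequence[float]) -> float:
    """Weighted averaging: each member's output is scaled by its weight.

    Assumption: weights are non-negative and come from the caller,
    e.g. each model's accuracy on held-out data.
    """
    if len(predictions) != len(weights):
        raise ValueError("predictions and weights must have equal length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(p * w for p, w in zip(predictions, weights)) / total


def accuracy(predicted: Sequence[float],
             actual: Sequence[int],
             tolerance: int = 0) -> float:
    """Percentage of predictions within `tolerance` minutes of the actual value.

    Assumption: predictions are rounded to whole minutes before comparison.
    tolerance=0 gives the exact-match measure, tolerance=1 the
    within-one-minute measure.
    """
    hits = sum(1 for p, a in zip(predicted, actual)
               if abs(round(p) - a) <= tolerance)
    return 100.0 * hits / len(actual)


# Hypothetical outputs (delay in minutes) of three member models
# for one train at one station, with made-up weights.
member_outputs = [2.0, 4.0, 3.0]
member_weights = [0.5, 0.3, 0.2]
simple = average(member_outputs)
weighted = weighted_average(member_outputs, member_weights)

# Hypothetical ensemble predictions versus actual delays for four trains.
exact = accuracy([3.0, 5.4, 0.2, 7.0], [3, 5, 1, 10])
within_one = accuracy([3.0, 5.4, 0.2, 7.0], [3, 5, 1, 10], tolerance=1)
```

With weights that favour the stronger members, the weighted average pulls the final prediction toward the models that have performed best, which is one plausible reading of why the weighted ensembles are reported as more accurate and more consistent.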
Original language: English
Title of host publication: Computational Science and Its Applications – ICCSA 2020 - 20th International Conference, Proceedings
Editors: Osvaldo Gervasi, Beniamino Murgante, Sanjay Misra, Chiara Garau, Ivan Blecic, David Taniar, Bernady O. Apduhan, Ana Maria A.C. Rocha, Eufemia Tarantino, Carmelo Maria Torre, Yeliz Karaca
ISBN (Print): 978-3-030-58798-7
Number of pages: 15
ISBN (Electronic): 978-3-030-58799-4
Publication status: Published - 1 Oct 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12249 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


  • Ensemble
  • Machine learning
  • Train delays
