TY - JOUR
T1 - Capturing expert uncertainty: ICC-informed soft labelling for volcano-seismicity
AU - Mitchinson, Sam
AU - Johnson, Jessica H.
AU - Milner, Ben
AU - Lamb, Oliver
AU - Behr, Yannik
N1 - Data Availability: A sample of the full earthquake catalogue from Illsley-Kemp and Mestel (2025) was used in this study. The full catalogue is freely available at https://doi.org/10.5281/zenodo.13138604. The GeoNet seismic data are freely available through GeoNet.
Code Availability: Available on request.
Funding information: The authors would like to express their gratitude to ARIES and NERC for the financial support provided for this research; grant number NE/S007334/1.
PY - 2025/10
Y1 - 2025/10
AB - Reliable classification of volcano-seismic signals underpins monitoring and eruption forecasting and is an essential tool for advancing understanding of subsurface processes. However, traditional approaches may overlook the inherent uncertainty and variability between expert judgments. We introduce an innovative method that explicitly quantifies inter-expert agreement using the intraclass correlation coefficient (ICC) and incorporates this measure into probabilistic, ICC-informed soft labels, which can be fed into machine learning pipelines. We conducted a global survey involving 89 experts who classified a set of 80 volcano-seismic events from Ruapehu, New Zealand, providing continuous ratings for standard categories: volcano tectonic (VT), hybrid (HYB), long-period (LP), and other (OT). ICC agreement scores revealed that single-rater scores produce poor agreement between experts even for well-established VT and LP classifications. However, reliability significantly improved for these classifications when multiple expert ratings were combined, although, for HYB and OT categories, expert disagreement remained substantial. We developed a soft labelling methodology that weights class probabilities by their respective ICC scores, resulting in a distribution that naturally reflects expert uncertainty. This demonstrates that ICC-informed soft labels could provide a robust alternative to the hard label standard by explicitly capturing classification uncertainty and variability. Our fully probabilistic view has the potential to significantly enhance machine learning model accuracy, robustness, and transferability across volcanic systems and should provide a fundamental shift in how volcano-seismic data are labelled and interpreted within automated monitoring frameworks.
KW - Inter-rater reliability
KW - Intraclass correlation coefficient
KW - Ruapehu
KW - Uncertainty
KW - Volcano-seismicity
UR - http://www.scopus.com/inward/record.url?scp=105016539757&partnerID=8YFLogxK
U2 - 10.1007/s00445-025-01875-4
DO - 10.1007/s00445-025-01875-4
M3 - Article
AN - SCOPUS:105016539757
SN - 0258-8900
VL - 87
JO - Bulletin of Volcanology
JF - Bulletin of Volcanology
IS - 10
M1 - 84
ER -