Finding an Optimal Segmentation for Audio Genre Classification

K. West, S. J. Cox

Research output: Contribution to conference, Paper

44 Citations (Scopus)

Abstract

In the automatic classification of music, many different segmentations of the audio signal have been used to calculate features. These include individual short frames (23 ms), longer frames (200 ms), short sliding textural windows (1 sec) of a stream of 23 ms frames, large fixed windows (10 sec) and whole files. In this work we present an evaluation of these different segmentations, showing that they are sub-optimal for genre classification, and introduce the use of an onset-detection-based segmentation, which appears to outperform all of the fixed and sliding window segmentation schemes in terms of classification accuracy and model size.
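
To make the contrast between the segmentation schemes concrete, the following is a minimal sketch, not the authors' implementation. It assumes a mono signal held in a NumPy array, a sample rate of 22050 Hz, and onset positions supplied by some external onset detector. The function names (`fixed_segments`, `onset_segments`, `segment_features`) and the mean/variance features are hypothetical placeholders chosen for illustration; the features used in the paper itself are not specified in this abstract.

```python
import numpy as np


def fixed_segments(n_samples, sr, seg_seconds, hop_seconds=None):
    """Return (start, end) sample indices for fixed-length segments.

    With hop_seconds=None the segments do not overlap (fixed frames or
    fixed windows); a smaller hop gives overlapping, sliding windows.
    """
    seg = int(round(seg_seconds * sr))
    hop = seg if hop_seconds is None else int(round(hop_seconds * sr))
    return [(s, s + seg) for s in range(0, n_samples - seg + 1, hop)]


def onset_segments(n_samples, onset_samples):
    """Return segments that run from each onset to the next one.

    onset_samples is assumed to come from an onset detector that is
    not shown here; the file start and end are added as boundaries.
    """
    inner = [o for o in onset_samples if 0 < o < n_samples]
    bounds = sorted(set([0] + inner + [n_samples]))
    return list(zip(bounds[:-1], bounds[1:]))


def segment_features(signal, segments):
    """Placeholder features: mean and variance of each segment."""
    return np.array([[signal[a:b].mean(), signal[a:b].var()]
                     for a, b in segments])


# Illustrative usage on 10 seconds of synthetic noise (assumed sr).
sr = 22050
signal = np.random.default_rng(0).standard_normal(10 * sr)

short_frames = fixed_segments(len(signal), sr, 0.023)      # 23 ms frames
long_frames = fixed_segments(len(signal), sr, 0.200)       # 200 ms frames
sliding = fixed_segments(len(signal), sr, 1.0, 0.023)      # 1 sec sliding
whole_file = [(0, len(signal))]                            # whole file

# Hypothetical onset positions, in samples, for illustration only.
onsets = [int(t * sr) for t in (0.5, 1.2, 3.0, 7.5)]
onset_based = onset_segments(len(signal), onsets)
features = segment_features(signal, onset_based)
```

Unlike the fixed schemes, the onset-based segments vary in length, so each feature vector describes one musical event rather than an arbitrary slice of the signal. Note that the sliding window here operates directly on samples, whereas the abstract describes textural windows over a stream of 23 ms frames.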
Original language: English
Pages: 680-685
Number of pages: 6
Publication status: Published - Sep 2005
Event: Proceedings of the 6th International Conference on Music Information Retrieval (ISMIR 2005) - London, United Kingdom
Duration: 11 Sep 2005 to 15 Sep 2005

Conference

Conference: Proceedings of the 6th International Conference on Music Information Retrieval (ISMIR 2005)
Country: United Kingdom
City: London
Period: 11/09/05 to 15/09/05
