In the automatic classification of music, many different segmentations of the audio signal have been used to calculate features. These include individual short frames (23 ms), longer frames (200 ms), short sliding textural windows (1 sec) over a stream of 23 ms frames, large fixed windows (10 sec) and whole files. In this work we present an evaluation of these different segmentations, showing that they are sub-optimal for genre classification, and introduce an onset-detection-based segmentation, which appears to outperform all of the fixed- and sliding-window segmentation schemes in terms of classification accuracy and model size.
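To make the compared schemes concrete, the following is a minimal sketch of how fixed, sliding, and onset-based segment boundaries might be computed. It is not the authors' implementation. The function names, the 22,050 Hz sample rate, and the 512-sample frame (roughly 23 ms at that rate) are assumptions made for illustration, and the onset times are assumed to come from some external onset detector.

```python
def fixed_segments(n_samples, sr, win_s, hop_s=None):
    """Return (start, end) sample indices for fixed-length windows.

    With hop_s=None the windows do not overlap; a smaller hop_s gives
    a sliding window. Trailing samples that do not fill a window are dropped.
    """
    win = int(round(win_s * sr))
    hop = win if hop_s is None else int(round(hop_s * sr))
    return [(s, s + win) for s in range(0, n_samples - win + 1, hop)]


def onset_segments(onsets_s, n_samples, sr):
    """Return (start, end) sample indices for segments bounded by onsets.

    onsets_s holds onset times in seconds (hypothetical detector output).
    Each segment runs from one onset to the next; the last one runs to
    the end of the signal, so segment lengths follow the musical events.
    """
    bounds = sorted({int(round(t * sr)) for t in onsets_s
                     if 0 <= t * sr < n_samples})
    bounds.append(n_samples)
    return [(a, b) for a, b in zip(bounds, bounds[1:]) if b > a]


if __name__ == "__main__":
    sr = 22050                 # assumed sample rate
    n = 10 * sr                # ten seconds of audio
    frame_s = 512 / sr         # assumed frame length, about 23 ms

    short_frames = fixed_segments(n, sr, frame_s)               # short frames
    textural = fixed_segments(n, sr, 1.0, hop_s=frame_s)        # 1 sec sliding window
    by_onset = onset_segments([0.0, 0.5, 1.2, 2.0], n, sr)      # made-up onset times
    print(len(short_frames), len(textural), len(by_onset))
```

The fixed schemes produce segments of identical length regardless of content, whereas the onset-based scheme yields variable-length segments aligned with detected events, which is the distinction the evaluation turns on.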
|Number of pages||6|
|Publication status||Published - Sep 2005|
|Event||Proceedings of the 6th International Conference on Music Information Retrieval (ISMIR 2005) - London, United Kingdom|
|Duration||11 Sep 2005 → 15 Sep 2005|