A Comparison of Estimated and MAP-Predicted Formants and Fundamental Frequencies with Speech Reconstruction Application

J. Darch, B.P. Milner

Research output: Contribution to conference (Paper)

Abstract

This work compares the accuracy of fundamental frequency and formant frequency estimation methods with that of maximum a posteriori (MAP) prediction from MFCC vectors, measuring both against hand-corrected references. Five fundamental frequency estimation methods are compared to fundamental frequency prediction from MFCC vectors in both clean and noisy speech. Similarly, three formant frequency estimation and prediction methods are compared. An analysis of estimation and prediction accuracy shows that prediction from MFCCs provides the most accurate voicing classification across clean and noisy speech. On clean speech, fundamental frequency estimation outperforms prediction from MFCCs, but as noise increases the performance of prediction is significantly more robust than estimation. Formant frequency prediction is found to be more accurate than estimation in both clean and noisy speech. A subjective analysis of the estimation and prediction methods is also made by reconstructing speech from the acoustic features.
Original language: English
Pages: 542-545
Number of pages: 4
Publication status: Published - 2007
Event: 8th Annual Conference of the International Speech Communication Association (Interspeech 2007) - Antwerp, Belgium
Duration: 27 Aug 2007 to 31 Aug 2007

Conference

Conference: 8th Annual Conference of the International Speech Communication Association (Interspeech 2007)
Country/Territory: Belgium
City: Antwerp
Period: 27/08/07 to 31/08/07
