Abstract
This work proposes a novel method for predicting formant frequencies from a stream of mel-frequency cepstral coefficient (MFCC) feature vectors. Prediction is based on modelling the joint density of MFCCs and formant frequencies with a Gaussian mixture model (GMM). Given this GMM and an input MFCC vector, two maximum a posteriori (MAP) prediction methods are developed. The first predicts formants from the single cluster closest, in some sense, to the input MFCC vector, while the second takes a weighted contribution of the formants predicted from all clusters. Experimental results on the ETSI Aurora connected digit database show that predicted formant frequencies are within 3.2% of reference formant frequencies.
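
The two prediction strategies described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions the abstract does not state. It takes the "closest" cluster to be the one with the highest posterior probability given the MFCC vector. It takes the weights in the second method to be those same posterior probabilities. It takes each cluster's formant estimate to be the conditional mean of the formants under that cluster's joint Gaussian. The function names `log_gauss` and `predict_formants`, and all numeric values in the toy model, are hypothetical.

```python
import numpy as np


def log_gauss(x, mean, cov):
    """Log density of a multivariate Gaussian N(x; mean, cov)."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)
    return -0.5 * (len(x) * np.log(2.0 * np.pi) + logdet + maha)


def predict_formants(x, weights, means, covs):
    """Return (hard, soft) formant predictions for one MFCC vector x.

    The GMM models the joint vector [MFCCs, formants]:
      weights: (K,)       mixture weights
      means:   (K, D)     joint means, D = len(x) + number of formants
      covs:    (K, D, D)  full joint covariances
    """
    n = len(x)
    log_post = np.empty(len(weights))
    per_cluster = []
    for k, (w, mu, cov) in enumerate(zip(weights, means, covs)):
        mu_x, mu_f = mu[:n], mu[n:]
        cov_xx, cov_fx = cov[:n, :n], cov[n:, :n]
        # Unnormalised log posterior of cluster k given the MFCC vector.
        log_post[k] = np.log(w) + log_gauss(x, mu_x, cov_xx)
        # Conditional mean of the formants given x under cluster k.
        per_cluster.append(mu_f + cov_fx @ np.linalg.solve(cov_xx, x - mu_x))
    per_cluster = np.array(per_cluster)

    # Normalise the posteriors in a numerically stable way.
    post = np.exp(log_post - log_post.max())
    post /= post.sum()

    hard = per_cluster[np.argmax(post)]  # assumed method 1: best cluster only
    soft = post @ per_cluster            # assumed method 2: posterior-weighted sum
    return hard, soft


# Toy model (hypothetical values): 2 clusters, 2 MFCCs, 1 formant in Hz.
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0, 500.0],
                  [4.0, 4.0, 1500.0]])
cov = np.array([[1.0, 0.0, 10.0],
                [0.0, 1.0, 0.0],
                [10.0, 0.0, 400.0]])
covs = np.array([cov, cov])

hard_a, soft_a = predict_formants(np.array([0.0, 0.0]), weights, means, covs)
hard_b, soft_b = predict_formants(np.array([2.0, 2.0]), weights, means, covs)
print(hard_a, soft_a)
print(hard_b, soft_b)
```

When an input lies close to one cluster, the two estimates nearly coincide. When it lies between clusters, the hard decision commits to a single cluster's estimate, whereas the weighted estimate blends them.
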
Original language | English |
---|---|
Pages | 941-944 |
Number of pages | 4 |
Publication status | Published - 2005 |
Event | IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Philadelphia, United States, 18 Mar 2005 → 23 Mar 2005 |