This paper proposes a set of higher-order modified moments as alternative objective criteria for pitch extraction and explores the impact of the speech window length on pitch estimation error. To obtain the Kth-order modified moment, each speech frame is split into a positive-valued signal and a negative-valued signal. The magnitudes of the Kth-order moments of the positive-valued and negative-valued signals are computed and combined. The proposed objective criteria form a sharper peak around the true pitch value than the correlation function does. For the calculation of errors, pitch reference ('ground truth') values are derived from manually corrected estimates of the periods obtained from laryngograph signals. The results obtained for the third-order modified moment are compared with those for the correlation and magnitude-difference criteria and for the YIN method. The modified moments provide improved pitch accuracy with fewer large errors (e.g. half or double pitch estimation errors).
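The split-and-combine procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact moment formula, so the version below rests on assumptions: the Kth-order moment at a candidate lag is taken to be the mean product of K samples spaced one lag apart, the positive and negative magnitudes are combined by simple addition, and the pitch period is chosen as the lag that maximises the result. The names `modified_moment` and `estimate_pitch` and the search range `fmin`/`fmax` are hypothetical, and this is not necessarily the authors' formulation.

```python
import numpy as np


def modified_moment(frame, lag, order=3):
    """Illustrative Kth-order modified moment at one candidate lag.

    ASSUMPTION: the moment is the mean product of `order` samples spaced
    `lag` apart; the abstract does not state the exact definition.
    """
    x = np.asarray(frame, dtype=float)
    pos = np.maximum(x, 0.0)  # positive-valued signal
    neg = np.minimum(x, 0.0)  # negative-valued signal

    n = len(x) - (order - 1) * lag  # samples available for every factor
    if n <= 0:
        raise ValueError("frame too short for this lag and order")

    def moment(sig):
        prod = np.ones(n)
        for k in range(order):
            prod *= sig[k * lag : k * lag + n]
        return abs(prod.sum()) / n  # magnitude of the moment

    # ASSUMPTION: the two magnitudes are combined by addition.
    return moment(pos) + moment(neg)


def estimate_pitch(frame, fs, fmin=60.0, fmax=400.0, order=3):
    """Return a pitch estimate in Hz by maximising the criterion over lags."""
    min_lag = int(fs / fmax)
    max_lag = int(fs / fmin)
    lags = np.arange(min_lag, max_lag + 1)
    scores = np.array([modified_moment(frame, lag, order) for lag in lags])
    best_lag = lags[np.argmax(scores)]
    return fs / best_lag
```

For example, `estimate_pitch(frame, fs=8000)` on an 80 ms frame searches lags between 20 and 133 samples. The frame must be longer than `(order - 1)` times the largest lag, which is one way the window length interacts with the criterion in this sketch.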
|Number of pages||4|
|Publication status||Published - 2010|
|Event||IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP) - Dallas, TX|
Duration: 14 Mar 2010 → 19 Mar 2010