This paper proposes using a hidden Markov model (HMM) to model a speech signal in terms of its speech class (voiced, unvoiced and nonspeech) and for voiced speech its fundamental frequency. States of the HMM represent unvoiced speech and nonspeech with multiple voiced states that model different fundamental frequencies. The transition matrix of the HMM models temporal changes in speech class and the time-varying fundamental frequency contour. The model is then applied to voicing and fundamental frequency estimation by extracting acoustic features from a speech signal and then applying Viterbi decoding. Experimental results are presented that investigate the estimation accuracy of the proposed system and a comparison is made against conventional methods.
|Number of pages||5|
|Publication status||Published - 2013|