Predictive Speaker Adaptation in Speech Recognition

Research output: Contribution to journal › Article › peer-review

22 Citations (Scopus)
3 Downloads (Pure)

Abstract

A major problem with most speaker adaptation schemes is that they rely on the speaker providing at least one example of each acoustic unit (word, phone, triphone, etc.) in the vocabulary in order to adapt the appropriate model. As a result, rapid adaptation is difficult to achieve, and some sounds may never be adapted because they are never heard. This paper presents a technique for adapting all the speech models to a new speaker's voice when the speaker has provided only an incomplete set of the vocabulary. The technique uses the training set to estimate correlations between sounds. Given some sounds from a new speaker at recognition time, these correlations are used to estimate the unheard sounds, and those estimates are used to adapt the speech models. The technique was applied to a database of 104 speakers speaking the English alphabet. When speakers spoke half of the vocabulary for enrollment prior to recognition, the technique gave a 78% decrease in error.
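
The idea of predicting unheard sounds from heard ones can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not say how the correlations are turned into estimates, so simple pairwise linear regression is swapped in here as an assumption. Each sound is also reduced to a single scalar feature per speaker, and the names `fit_pairwise_regression` and `predict_unheard` are hypothetical.

```python
import numpy as np


def fit_pairwise_regression(train, src, dst):
    """Least-squares line predicting sound `dst` from sound `src`.

    `train` has shape (n_speakers, n_sounds); each entry is one scalar
    feature per sound per training speaker (an assumption of this sketch).
    """
    x = train[:, src]
    y = train[:, dst]
    slope = np.cov(x, y, bias=True)[0, 1] / np.var(x)
    intercept = y.mean() - slope * x.mean()
    return slope, intercept


def predict_unheard(train, heard, new_values):
    """Estimate each unheard sound for a new speaker.

    For every unheard sound, pick the heard sound most strongly correlated
    with it across the training speakers, then apply the fitted line to the
    new speaker's value for that heard sound.
    """
    corr = np.corrcoef(train, rowvar=False)  # (n_sounds, n_sounds)
    observed = dict(zip(heard, new_values))
    estimates = {}
    for dst in range(train.shape[1]):
        if dst in observed:
            continue
        best = max(heard, key=lambda s: abs(corr[s, dst]))
        slope, intercept = fit_pairwise_regression(train, best, dst)
        estimates[dst] = slope * observed[best] + intercept
    return estimates


# Synthetic training set: 50 speakers, 4 sounds; sound 3 tracks sound 0.
rng = np.random.default_rng(0)
train = rng.normal(size=(50, 4))
train[:, 3] = 0.8 * train[:, 0] + 0.1 * rng.normal(size=50)

# The new speaker has provided sounds 0 and 1 only.
print(predict_unheard(train, heard=[0, 1], new_values=[1.2, -0.3]))
```

In this sketch the estimates for the unheard sounds would then stand in for real enrollment data when adapting the corresponding models; how that adaptation step is carried out is outside what the abstract describes.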
Original language: English
Pages (from-to): 1-17
Number of pages: 17
Journal: Computer Speech and Language
Volume: 9
Issue number: 1
Publication status: Published - Jan 1995
