On the Estimation and the Use of Confusion-Matrices for Improving ASR Accuracy

Omar Caballero Morales, Stephen Cox

Research output: Contribution to conference (Paper)

3 Citations (Scopus)

Abstract

In previous work, we described how learning the pattern of recognition errors made by an individual using a certain ASR system leads to increased recognition accuracy compared with a standard MLLR adaptation approach. This was the case for low-intelligibility speakers with dysarthric speech, but no improvement was observed for normal speakers. In this paper, we describe an alternative method for obtaining the training data for confusion-matrix estimation for normal speakers, which is more effective than our previous technique. We also address the issue of data sparsity in the estimation of confusion-matrices by using non-negative matrix factorization (NMF) to discover structure within them. The confusion-matrix estimates made using these techniques are integrated into the ASR process using a technique termed "metamodels", and the results presented here show statistically significant gains in word recognition accuracy when applied to normal speech.
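The idea of using NMF to find structure in a sparsely populated confusion-matrix can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `nmf` and `smoothed_confusion_probs`, the rank, the Frobenius-norm multiplicative updates, and the toy count matrix are all assumptions chosen for illustration. The sketch factorizes a matrix of confusion counts into two low-rank non-negative factors and uses their product as a smoothed estimate, so that confusions never observed in the training data can still receive non-zero probability.

```python
import numpy as np

def nmf(V, rank, n_iter=500, eps=1e-9, seed=0):
    """Factorize non-negative V (n x m) as W @ H using the standard
    multiplicative updates for the Frobenius-norm objective."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Updates keep W and H non-negative because every factor is non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def smoothed_confusion_probs(counts, rank=2):
    """Hypothetical helper: low-rank reconstruction of a count matrix,
    normalized so each row is a distribution over recognized symbols."""
    W, H = nmf(counts.astype(float), rank)
    approx = W @ H + 1e-12  # small floor avoids division by zero for empty rows
    return approx / approx.sum(axis=1, keepdims=True)

# Toy counts (made up): rows are spoken symbols, columns are recognized symbols.
counts = np.array([
    [8, 2, 0, 0],
    [3, 6, 0, 0],
    [0, 0, 7, 1],
    [0, 0, 2, 5],
])
probs = smoothed_confusion_probs(counts, rank=2)
print(np.round(probs, 3))
```

The rank controls how aggressively the estimate is smoothed: a lower rank forces more sharing of structure between rows, which helps when counts are sparse but can blur genuine speaker-specific confusions.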
Original language: English
Pages: 1599-1602
Number of pages: 4
Publication status: Published - Sep 2009
Event: 10th Annual Conference of the International Speech Communication Association (INTERSPEECH) - Brighton, United Kingdom
Duration: 6 Sep 2009 to 10 Sep 2009

Conference

Conference: 10th Annual Conference of the International Speech Communication Association (INTERSPEECH)
Country/Territory: United Kingdom
City: Brighton
Period: 6/09/09 to 10/09/09
