Using Context to Correct Phone Recognition Errors

Research output: Contribution to conference (Paper)

1 Citation (Scopus)

Abstract

There are many circumstances in which it is useful or necessary to recognise phones rather than words, but phone recognition is inherently less accurate than word recognition. We describe here a post-recognition method for "translating" an errorful phone string output by a speech recogniser into a string that more closely matches the transcription. The technique owes something to Kohonen's idea of "dynamically expanding context" in that it learns from the errors made by the recogniser in a particular context, but it uses many contexts rather than a single context to estimate the "translation" of a recognised phone. The weights given to the different contexts in estimating the translation are determined discriminatively. On the WSJCAM0 database, the technique gives a 19.2% relative improvement in phone errors (including insertions) over the baseline, compared with a 6.2% improvement obtained using dynamically expanding context.
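The idea of combining evidence from several contexts can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the recognised and reference phone strings are already aligned one-to-one (so insertions and deletions are ignored), it uses symmetric context windows of width 0 to 2, and it uses fixed, hand-picked weights where the paper trains the weights discriminatively. The names `contexts`, `train`, `translate`, and the weight values are hypothetical.

```python
from collections import Counter, defaultdict


def contexts(phones, i, max_width=2):
    """Yield (width, left, phone, right) keys for position i, widest first."""
    # Pad with a boundary symbol so every position has a full window.
    padded = ["#"] * max_width + list(phones) + ["#"] * max_width
    j = i + max_width
    for w in range(max_width, -1, -1):
        yield (w, tuple(padded[j - w:j]), padded[j], tuple(padded[j + 1:j + 1 + w]))


def train(aligned_pairs, max_width=2):
    """Count which reference phone each recognised phone maps to, per context.

    aligned_pairs: list of (recognised, reference) phone lists of equal
    length (an assumption of this sketch).
    """
    table = defaultdict(Counter)
    for rec, ref in aligned_pairs:
        for i, target in enumerate(ref):
            for key in contexts(rec, i, max_width):
                table[key][target] += 1
    return table


def translate(rec, table, weights, max_width=2):
    """Replace each recognised phone by the best weighted vote over contexts."""
    out = []
    for i, phone in enumerate(rec):
        scores = Counter()
        for key in contexts(rec, i, max_width):
            counts = table.get(key)
            if not counts:
                continue  # this context was never seen in training
            total = sum(counts.values())
            for target, c in counts.items():
                # weights[width] is hand-picked here, not learned.
                scores[target] += weights[key[0]] * c / total
        # Fall back to the recognised phone when no context matches.
        out.append(max(scores, key=scores.get) if scores else phone)
    return out


# Toy data: the recogniser outputs "d" for "t" after "s".
pairs = [
    (["s", "d", "aa"], ["s", "t", "aa"]),
    (["s", "d", "iy"], ["s", "t", "iy"]),
    (["ah", "d", "aa"], ["ah", "d", "aa"]),
]
weights = {0: 0.2, 1: 0.5, 2: 1.0}  # hypothetical: wider contexts count more
table = train(pairs)
print(translate(["s", "d", "aa"], table, weights))
```

By contrast, a single-context scheme in the spirit of dynamically expanding context would consult only one matching context per phone; here every matching width contributes to the vote, which is the distinction the abstract draws.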
Original language: English
Pages: 2061-2064
Number of pages: 4
Publication status: Published - 2004
Event: 8th International Conference on Spoken Language Processing (Interspeech 2004) - Jeju Island, South Korea
Duration: 4 Oct 2004 to 8 Oct 2004

Conference

Conference: 8th International Conference on Spoken Language Processing (Interspeech 2004)
Country/Territory: South Korea
City: Jeju Island
Period: 4/10/04 to 8/10/04
