Robust speech recognition over mobile and IP networks in burst-like packet loss

B. P. Milner, A. B. James

Research output: Contribution to journal › Article › peer-review

17 Citations (Scopus)

Abstract

This paper addresses the problem of achieving robust distributed speech recognition in the presence of burst-like packet loss. To compensate for packet loss, a number of techniques are investigated to provide estimates of lost vectors. Experimental results on both a connected digits task and a large vocabulary continuous speech recognition task show that simple methods, such as repetition, are not as effective as interpolation methods, which are better able to preserve the dynamics of the feature vector stream. Best performance is given by maximum a posteriori (MAP) estimation of lost vectors, which utilizes statistics of the feature vector stream. At longer burst lengths, the performance of these compensation techniques deteriorates as the temporal correlation in the received feature vector stream reduces. To compensate for this, interleaving is proposed, which aims to disperse bursts of loss into a series of unconnected smaller bursts. Results show substantial gains in accuracy, to almost that of the no-loss condition, when interleaving is combined with estimation techniques, although this is at the expense of introducing delay. This leads to the proposal that, for a distributed speech recognition application, it is more beneficial to trade delay for accuracy rather than trading bit-rate for accuracy as in forward error correction schemes.
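The ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a simple block interleaver (written row by row, sent column by column), one scalar per frame standing in for a feature vector, and "repetition" meaning a copy of the last received frame. All function names and the 4 x 4 interleaver size are hypothetical, and MAP estimation is not shown.

```python
def interleave_order(n_rows, n_cols):
    """Transmission order of a block interleaver (an assumed design):
    frames are written row by row and sent column by column."""
    return [r * n_cols + c for c in range(n_cols) for r in range(n_rows)]


def lost_frames(n_rows, n_cols, burst_start, burst_len, interleaved=True):
    """Original frame indices hit by a burst covering transmission slots
    burst_start .. burst_start + burst_len - 1."""
    n = n_rows * n_cols
    order = interleave_order(n_rows, n_cols) if interleaved else list(range(n))
    return sorted(order[burst_start:burst_start + burst_len])


def longest_run(indices):
    """Length of the longest run of consecutive indices (0 if empty)."""
    best = run = 0
    prev = None
    for i in sorted(indices):
        run = run + 1 if prev is not None and i == prev + 1 else 1
        best = max(best, run)
        prev = i
    return best


def estimate_lost(frames, lost, method="interpolate"):
    """Fill lost frames from received neighbours only.

    method="repeat": copy the last received frame (next one at stream start).
    method="interpolate": linear interpolation between the nearest received
    frames on either side, falling back to repetition at the stream edges.
    """
    lost = set(lost)
    out = list(frames)
    n = len(frames)
    for i in sorted(lost):
        prev = i - 1
        while prev >= 0 and prev in lost:
            prev -= 1
        nxt = i + 1
        while nxt < n and nxt in lost:
            nxt += 1
        has_prev, has_next = prev >= 0, nxt < n
        if not has_prev and not has_next:
            raise ValueError("no received frames to estimate from")
        if method == "repeat" or not (has_prev and has_next):
            out[i] = frames[prev] if has_prev else frames[nxt]
        else:
            w = (i - prev) / (nxt - prev)  # position between the neighbours
            out[i] = (1 - w) * frames[prev] + w * frames[nxt]
    return out
```

Under these assumptions, a burst that wipes out four consecutive transmission slots removes four adjacent frames without interleaving, but four isolated frames with the 4 x 4 interleaver. Every lost frame then has received neighbours on both sides to interpolate from. The cost is that a full block of 16 frames must be buffered before sending, which is the delay-for-accuracy trade the abstract describes.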
Original language: English
Pages (from-to): 223-231
Number of pages: 9
Journal: IEEE Transactions on Audio, Speech, and Language Processing
Volume: 14
Issue number: 1
Publication status: Published - 2006
