This paper examines the problems associated with performing speech recognition over mobile and IP networks. The main problems are identified as codec-based distortion and the loss of speech vectors through packet loss in the network. A realistic model of packet loss, based on a three-state Markov model, is developed and shown to be capable of simulating the burst-like nature of packet loss. A two-stage packet loss detection and estimation scheme is proposed and shown to improve recognition performance when feature vectors are lost. Results on the Aurora database show that burst-like packet loss reduces digit accuracy from 99% to 57% at 50% packet loss. Estimating the lost packets recovers performance to 77%.
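To illustrate how a three-state Markov chain can produce burst-like loss, the following is a minimal sketch and not the paper's model. The abstract does not give the state definitions or transition probabilities, so the state names (`GOOD`, `LOSS`, `GAP`), the values in `TRANSITIONS`, and the helper functions `simulate_loss` and `mean_burst_length` are all hypothetical choices made for illustration.

```python
import random

# Hypothetical state labels; the abstract does not define the paper's states.
GOOD, LOSS, GAP = 0, 1, 2

# Illustrative transition probabilities (each row sums to 1).
# These values are assumptions, not figures from the paper.
TRANSITIONS = {
    GOOD: [0.95, 0.05, 0.00],  # mostly stay in the loss-free state
    LOSS: [0.05, 0.80, 0.15],  # losses tend to persist, giving bursts
    GAP:  [0.10, 0.60, 0.30],  # short loss-free gap inside a lossy period
}


def simulate_loss(n_packets, seed=0):
    """Return a list of booleans, True where the packet is marked lost."""
    rng = random.Random(seed)
    state = GOOD
    lost = []
    for _ in range(n_packets):
        lost.append(state == LOSS)
        # Draw the next state from the current state's transition row.
        state = rng.choices([GOOD, LOSS, GAP], weights=TRANSITIONS[state])[0]
    return lost


def mean_burst_length(lost):
    """Average length of runs of consecutive lost packets (0.0 if none)."""
    bursts, run = [], 0
    for flag in lost:
        if flag:
            run += 1
        elif run:
            bursts.append(run)
            run = 0
    if run:
        bursts.append(run)
    return sum(bursts) / len(bursts) if bursts else 0.0


if __name__ == "__main__":
    pattern = simulate_loss(10_000, seed=1)
    print("loss rate:", sum(pattern) / len(pattern))
    print("mean burst length:", mean_burst_length(pattern))
```

Raising the probability of staying in `LOSS` lengthens the bursts, while the transition probabilities out of `GOOD` largely control the overall loss rate. An independent (non-bursty) loss model would instead drop each packet with a fixed probability regardless of its neighbours.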
|Number of pages||4|
|Publication status||Published - May 2001|
|Event||IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP) - Salt Lake City, United States|
Duration: 7 May 2001 → 11 May 2001