This work compares the performance of three compensation methods for speech recognition in the presence of packet loss. Two of the methods, cubic interpolation and a novel maximum a posteriori (MAP) estimation, aim to restore the feature vector stream when packets are lost, while the third applies compensation in the decoding stage of recognition through missing feature theory. To improve performance under burst-like packet loss, interleaving is introduced to disperse bursts of loss. Experiments on the ETSI Aurora connected digit task show that the best performance is obtained by combining missing feature theory with cubic interpolation. This combination raises performance from 50.3% to 69.8% at a packet loss rate of 50% and an average burst length of 20 packets. Including interleaving further increases performance to over 76%.
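The burst-dispersing effect of interleaving can be illustrated with a minimal sketch. The abstract does not specify the interleaver design, so the square block interleaver below, the function names, and the assumption of one feature vector per transmitted unit are all illustrative choices and not the paper's method.

```python
def interleave(frames, depth):
    """Reorder frames with a hypothetical depth x depth block interleaver.

    Each block of depth*depth frames is written row by row and read
    column by column, so neighbouring frames are sent `depth` slots apart.
    """
    block = depth * depth
    if len(frames) % block != 0:
        raise ValueError("number of frames must be a multiple of depth*depth")
    out = []
    for start in range(0, len(frames), block):
        chunk = frames[start:start + block]
        for col in range(depth):
            for row in range(depth):
                out.append(chunk[row * depth + col])
    return out


def deinterleave(frames, depth):
    """Undo interleave(); transposing a square block twice restores it."""
    return interleave(frames, depth)


def longest_loss_run(frames):
    """Length of the longest run of consecutive lost (None) frames."""
    longest = current = 0
    for frame in frames:
        current = current + 1 if frame is None else 0
        longest = max(longest, current)
    return longest


# Illustration: 16 frame indices, one burst of 4 consecutive losses in transit.
original = list(range(16))
sent = interleave(original, depth=4)
received = [None if 4 <= i < 8 else f for i, f in enumerate(sent)]
restored = deinterleave(received, depth=4)
lost_positions = [i for i, f in enumerate(restored) if f is None]
```

In this toy case a burst of four consecutive losses in the transmitted order becomes four isolated single-frame gaps after de-interleaving, which are easier for a method such as interpolation to fill from the surviving neighbours. The cost is added latency, since a full block must be buffered before transmission.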
|Number of pages||4|
|Publication status||Published - Oct 2004|
|Event||8th International Conference on Spoken Language Processing (Interspeech 2004) - Jeju Island, South Korea|
Duration: 4 Oct 2004 → 8 Oct 2004