The aim of this work is to improve distributed speech recognition accuracy under packet loss by considering the effect of loss on the temporal derivatives of the feature vector. Analysis of the temporal derivatives reveals that they suffer severe distortion when static vectors are lost. Missing feature theory and soft-decoding techniques are considered for compensating for packet loss at the decoding stage of recognition. An extension to these methods is developed that treats the static, velocity and acceleration components separately. A series of confidence measures for the temporal derivatives is devised and applied within the soft-decoding framework. Experimental results on both a connected digit task and a large vocabulary task demonstrate significant increases in recognition accuracy under a range of packet loss conditions.
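To illustrate why the loss of static vectors distorts the temporal derivatives, the following minimal sketch tracks which derivative frames have a regression window that touches a lost frame. It assumes the commonly used regression formula for delta coefficients with a window of two frames on each side; the abstract does not state the window length, so the function name `affected_frames`, the `window` parameter, and the edge clamping are all assumptions made for illustration, not the authors' implementation.

```python
def affected_frames(lost, num_frames, window=2):
    """Return indices of derivative frames whose regression window touches a lost frame.

    Assumes the common regression formula
        d[t] = sum_{n=1..N} n * (c[t+n] - c[t-n]) / (2 * sum_{n=1..N} n^2),
    so d[t] depends on frames t-N..t+N, excluding frame t itself.
    Indices beyond the utterance are clamped to the first or last frame
    (an assumption of this sketch).
    """
    lost = set(lost)
    affected = []
    for t in range(num_frames):
        deps = set()
        for n in range(1, window + 1):
            deps.add(min(t + n, num_frames - 1))  # frame ahead of t
            deps.add(max(t - n, 0))               # frame behind t
        if deps & lost:
            affected.append(t)
    return affected


# Hypothetical example: static frames 10 and 11 are lost in a 20-frame utterance.
velocity_hit = affected_frames([10, 11], num_frames=20)
# Acceleration is computed from velocity, so the distortion spreads again.
acceleration_hit = affected_frames(velocity_hit, num_frames=20)

print(velocity_hit)      # [8, 9, 10, 11, 12, 13]
print(acceleration_hit)  # [6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
```

Under these assumptions, a two-frame loss in the static stream touches six velocity frames and ten acceleration frames, which is one way to see why the three components may merit separate treatment at the decoder.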
|Number of pages||4|
|Publication status||Published - Mar 2005|
|Event||IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP) - Philadelphia, United States|
Duration: 18 Mar 2005 → 23 Mar 2005