Abstract
This work examines the issues involved in performing robust speech recognition over mobile and IP networks. The conventional method for sending speech across a mobile or IP network is to encode the speech on the terminal device using a low bit-rate codec and then transmit the stream of codec parameters. It is shown in this work that, for speech recognition applications, an alternative is available whereby the front-end processing part of a network-based speech recogniser is detached and moved onto the terminal device. Recognition features are then sent over the network to the remote recogniser. Simulations demonstrate that sending the speech features in this manner can provide a significant enhancement in recognition performance over the traditional codec-based approach. This technique forms the basis of the ETSI (European Telecommunications Standards Institute) Aurora standard. Problems arising with access over IP networks are also considered, in particular that of packet loss. A novel two-stage identification and estimation strategy is introduced which compensates for this loss of speech packets. Simulation results show that an almost negligible loss in recognition performance is possible at packet losses of up to 50%.
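The two-stage idea of first identifying lost packets and then estimating their contents can be illustrated with a minimal sketch. The abstract does not say how either stage is carried out, so everything below is an assumption made for illustration: packets are assumed to carry one feature vector each and a frame index, loss is identified from gaps in those indices, and missing vectors are estimated by linear interpolation between the nearest received frames. The names `identify_lost` and `estimate_lost` are hypothetical and are not taken from the paper or from the Aurora standard.

```python
import bisect
from typing import Dict, List

Vector = List[float]


def identify_lost(received: Dict[int, Vector], total: int) -> List[int]:
    """Stage 1 (assumed): a frame is lost if its index never arrived."""
    return [i for i in range(total) if i not in received]


def estimate_lost(received: Dict[int, Vector], total: int) -> List[Vector]:
    """Stage 2 (assumed): fill gaps by linear interpolation.

    Gaps at the start or end of the utterance have only one
    neighbour, so the nearest received frame is repeated instead.
    """
    known = sorted(i for i in received if 0 <= i < total)
    if not known:
        raise ValueError("no packets received, nothing to interpolate from")

    frames: List[Vector] = []
    for i in range(total):
        if i in received:
            frames.append(list(received[i]))
            continue
        pos = bisect.bisect_left(known, i)
        if pos == 0:
            # Lost before the first received frame: repeat the first one.
            frames.append(list(received[known[0]]))
        elif pos == len(known):
            # Lost after the last received frame: repeat the last one.
            frames.append(list(received[known[-1]]))
        else:
            lo, hi = known[pos - 1], known[pos]
            w = (i - lo) / (hi - lo)  # fractional position inside the gap
            frames.append(
                [(1 - w) * a + w * b for a, b in zip(received[lo], received[hi])]
            )
    return frames


if __name__ == "__main__":
    # Toy 2-dimensional "feature vectors"; frames 1, 2 and 5 were lost.
    rx = {0: [0.0, 10.0], 3: [3.0, 40.0], 4: [4.0, 50.0]}
    print(identify_lost(rx, total=6))
    print(estimate_lost(rx, total=6))
```

Interpolation is used here only because it is the simplest estimator to show; the strategy reported in the paper may differ substantially.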
Original language | English |
---|---|
Pages | 1197-1201 |
Number of pages | 5 |
Publication status | Published - 2000 |
Event | 11th IEEE Symposium on Personal Indoor Mobile Radio Communication (PIMRC) - London, United Kingdom; Duration: 18 Sep 2000 → 21 Sep 2000 |
Conference
Conference | 11th IEEE Symposium on Personal Indoor Mobile Radio Communication (PIMRC) |
---|---|
Country/Territory | United Kingdom |
City | London |
Period | 18/09/00 → 21/09/00 |