Abstract
We describe experiments in visual-only language identification, in which only lip shape and lip motion are used to determine the language of a spoken utterance. We focus on the task of discriminating between two or three languages spoken by the same speaker, and we have recorded a suitable database for these experiments. We use a standard audio language identification approach in which the feature vectors are tokenized and a language model for each language is then estimated over the resulting stream of tokens. Although speaking rate appeared to affect our results, different languages spoken at rather similar speeds were discriminated as well as a single language spoken at three extreme speeds, indicating that a language effect is present in our results.
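The tokenize-then-model approach named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the visual features, the tokenizer, or the language-model order, so everything here is an assumption made for illustration: nearest-centroid vector quantization stands in for the tokenizer, an add-one-smoothed bigram model stands in for the per-language language model, and the names `tokenize`, `BigramModel`, and `identify` are hypothetical.

```python
import math
from collections import Counter


def tokenize(features, codebook):
    """Map each feature vector to the index of its nearest codebook entry.

    Assumption: nearest-centroid vector quantization; the paper's actual
    tokenizer is not described in the abstract.
    """
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(codebook)), key=lambda k: sqdist(f, codebook[k]))
        for f in features
    ]


class BigramModel:
    """Add-one-smoothed bigram model over token indices (assumed order)."""

    def __init__(self, vocab_size):
        self.vocab_size = vocab_size
        self.pair_counts = Counter()
        self.context_counts = Counter()

    def train(self, tokens):
        # Count each adjacent token pair and the context it follows.
        for prev, cur in zip(tokens, tokens[1:]):
            self.pair_counts[(prev, cur)] += 1
            self.context_counts[prev] += 1

    def log_likelihood(self, tokens):
        # Sum of smoothed log P(cur | prev) over the token stream.
        total = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            num = self.pair_counts[(prev, cur)] + 1
            den = self.context_counts[prev] + self.vocab_size
            total += math.log(num / den)
        return total


def identify(tokens, models):
    """Return the name of the language whose model scores the tokens highest."""
    return max(models, key=lambda name: models[name].log_likelihood(tokens))


# Toy data only: two made-up "languages" with different token dynamics.
codebook = [(0.0, 0.0), (1.0, 1.0)]
models = {"lang_a": BigramModel(2), "lang_b": BigramModel(2)}
models["lang_a"].train([0, 1] * 10)            # alternating tokens
models["lang_b"].train([0] * 10 + [1] * 10)    # long runs of one token

test_features = [(0.1, 0.0), (0.9, 1.1), (0.0, 0.2), (1.0, 0.8)]
predicted = identify(tokenize(test_features, codebook), models)
print(predicted)
```

In this toy setup each language is characterized only by how its tokens follow one another, which is the property a per-language model over a token stream is meant to capture.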
| Original language | English |
|---|---|
| Pages | 4345-4348 |
| Number of pages | 4 |
| Publication status | Published - Apr 2009 |
| Event | IEEE International Conference on Acoustics, Speech and Signal Processing, Taipei, Taiwan (19 Apr 2009 → 24 Apr 2009) |
Conference
| Conference | IEEE International Conference on Acoustics, Speech and Signal Processing |
|---|---|
| Country/Territory | Taiwan |
| City | Taipei |
| Period | 19/04/09 → 24/04/09 |