Abstract
To undertake machine lip-reading, we try to recognise speech from a visual signal. Current work often uses viseme classification supported by language models, with varying degrees of success. A few recent works suggest that phoneme classification, in the right circumstances, can outperform viseme classification. In this work we present a novel two-pass method of training phoneme classifiers which uses previously trained visemes in the first pass. With our new training algorithm, we show classification performance which significantly improves on previous lip-reading results.
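The two-pass idea described above — first train classifiers on the coarser viseme classes, then use them to seed phoneme classifiers — can be illustrated with a minimal sketch. This is an assumption-laden toy, not the paper's actual method: the phoneme-to-viseme map, the nearest-centroid classifier, and the seeding rule are all hypothetical stand-ins chosen to keep the example self-contained.

```python
import numpy as np

# Hypothetical many-to-one phoneme -> viseme mapping (illustrative only;
# real viseme maps such as Fisher's or Jeffers' are larger).
PHONEME_TO_VISEME = {"p": "V1", "b": "V1", "f": "V2", "v": "V2"}

def centroids(X, labels):
    """Per-class mean feature vectors (a nearest-centroid classifier)."""
    return {c: X[labels == c].mean(axis=0) for c in np.unique(labels)}

def classify(X, cents):
    """Assign each row of X to the class with the nearest centroid."""
    names = list(cents)
    C = np.stack([cents[n] for n in names])
    d = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    return np.array(names)[d.argmin(axis=1)]

def two_pass_train(X, phoneme_labels):
    """Toy two-pass training: visemes first, then phonemes seeded from them."""
    phoneme_labels = np.asarray(phoneme_labels)
    # Pass 1: collapse phoneme labels to visemes and train viseme centroids.
    viseme_labels = np.array([PHONEME_TO_VISEME[p] for p in phoneme_labels])
    viseme_cents = centroids(X, viseme_labels)
    # Pass 2: split each viseme class back into its phonemes, using the
    # viseme centroid as a seed, then refine on the phoneme-labelled data.
    phoneme_cents = {}
    for p in np.unique(phoneme_labels):
        seed = viseme_cents[PHONEME_TO_VISEME[p]]
        own_mean = X[phoneme_labels == p].mean(axis=0)
        # Refined estimate: blend of the viseme seed and the phoneme's mean.
        phoneme_cents[p] = 0.5 * (seed + own_mean)
    return phoneme_cents
```

The point of the sketch is the training structure: pass 1 learns the visually distinct viseme classes, and pass 2 reuses those models as starting points when splitting each viseme into its constituent, visually similar phonemes.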
| Original language | English |
|---|---|
| Publication status | Published - 2016 |
| Event | International Conference on Acoustics, Speech, and Signal Processing - Shanghai, China. Duration: 21 Mar 2016 → 25 Mar 2016 |
Conference
| Conference | International Conference on Acoustics, Speech, and Signal Processing |
|---|---|
| Country/Territory | China |
| City | Shanghai |
| Period | 21/03/16 → 25/03/16 |
Keywords
- visemes
- weak learning
- visual speech
- lip-reading
- recognition
- classification
Profiles
- Richard Harvey, Professor, School of Computing Sciences