Decoding visemes: improving machine lip-reading

Helen Bear, Richard Harvey

Research output: Contribution to conference › Poster › peer-review



To undertake machine lip-reading, we try to recognise speech from a visual signal. Current work often uses viseme classification supported by language models with varying degrees of success. A few recent works suggest phoneme classification, in the right circumstances, can outperform viseme classification. In this work we present a novel two-pass method of training phoneme classifiers which uses previously trained visemes in the first pass. With our new training algorithm, we show classification performance which significantly improves on previous lip-reading results.
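The abstract's two-pass idea — train viseme-level classifiers first, then use them to seed phoneme-level classifiers — can be sketched as follows. This is a hypothetical toy illustration, not the authors' implementation: the phoneme-to-viseme mapping, the mean-based "model", and the seeding rule are all illustrative assumptions.

```python
# Toy sketch of two-pass training: pass 1 pools phoneme data by viseme
# and trains viseme models; pass 2 seeds each phoneme model from its
# viseme model, then re-estimates on the phoneme-labelled data alone.
# (Illustrative only; the paper's classifiers and mapping differ.)

from collections import defaultdict

# Hypothetical many-to-one phoneme-to-viseme mapping.
P2V = {"p": "V_bilabial", "b": "V_bilabial", "m": "V_bilabial",
       "f": "V_labiodental", "v": "V_labiodental"}

def train_mean_model(samples):
    """Toy 'model': the mean of 1-D feature values for one class."""
    return sum(samples) / len(samples)

def two_pass_train(data):
    """data: list of (feature, phoneme) pairs."""
    # Pass 1: pool the training data by viseme and train viseme models.
    by_viseme = defaultdict(list)
    for x, ph in data:
        by_viseme[P2V[ph]].append(x)
    viseme_models = {v: train_mean_model(xs) for v, xs in by_viseme.items()}

    # Pass 2: initialise each phoneme model from its viseme model,
    # then re-estimate on the phoneme-labelled data.
    by_phoneme = defaultdict(list)
    for x, ph in data:
        by_phoneme[ph].append(x)
    phoneme_models = {}
    for ph, xs in by_phoneme.items():
        seed = viseme_models[P2V[ph]]  # first-pass initialisation
        phoneme_models[ph] = train_mean_model(xs + [seed])
    return viseme_models, phoneme_models
```

Pass 1 exploits the fact that viseme classes pool more training data per class (several phonemes share one viseme), so the pass-2 phoneme models start from better-estimated parameters than training on sparse phoneme data alone.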
Original language: English
Publication status: Published - 2016
Event: International Conference on Acoustics, Speech, and Signal Processing - Shanghai, China
Duration: 21 Mar 2016 - 25 Mar 2016




Keywords:
  • visemes
  • weak learning
  • visual speech
  • lip-reading
  • recognition
  • classification
