Decoding visemes: improving machine lip-reading

Helen Bear, Richard Harvey

Research output: Contribution to conference › Poster › peer-review


Abstract

To undertake machine lip-reading, we try to recognise speech from a visual signal. Current work often uses viseme classification supported by language models, with varying degrees of success. A few recent works suggest that phoneme classification, in the right circumstances, can outperform viseme classification. In this work we present a novel two-pass method of training phoneme classifiers which uses previously trained viseme classifiers in the first pass. With our new training algorithm, we show classification performance which significantly improves on previous lip-reading results.
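The abstract describes a two-pass scheme in which phoneme classifiers are bootstrapped from previously trained viseme classifiers. The sketch below is a hypothetical, simplified illustration of that idea, not the authors' actual algorithm: it uses a toy phoneme-to-viseme map (`P2V`) and nearest-centroid classifiers in place of the paper's models, showing pass 1 (train viseme models) and pass 2 (initialise each phoneme model from its viseme, then re-estimate from phoneme-labelled data).

```python
# Hypothetical sketch of a two-pass viseme-to-phoneme training scheme.
# P2V, the feature vectors, and the centroid classifiers are all toy
# assumptions for illustration; they are not from the paper.

P2V = {"p": "V1", "b": "V1", "m": "V1", "f": "V2", "v": "V2"}

def centroid(vectors):
    """Mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(vec[i] for vec in vectors) / n for i in range(len(vectors[0]))]

def train_viseme_models(data):
    """Pass 1: one centroid per viseme class, pooling phoneme-labelled
    features through the P2V mapping."""
    by_viseme = {}
    for feats, phone in data:
        by_viseme.setdefault(P2V[phone], []).append(feats)
    return {v: centroid(xs) for v, xs in by_viseme.items()}

def train_phoneme_models(data, viseme_models):
    """Pass 2: start every phoneme model at its viseme's centroid, then
    re-estimate from phoneme-labelled data where available. Phonemes with
    no training data keep the viseme initialisation."""
    models = {p: list(viseme_models[v]) for p, v in P2V.items()}
    by_phone = {}
    for feats, phone in data:
        by_phone.setdefault(phone, []).append(feats)
    for phone, xs in by_phone.items():
        models[phone] = centroid(xs)  # re-estimation step
    return models

def classify(feats, models):
    """Nearest-centroid decision by squared Euclidean distance."""
    return min(models,
               key=lambda k: sum((a - b) ** 2 for a, b in zip(feats, models[k])))

# Toy usage: four labelled feature vectors, two viseme groups.
data = [([0.0, 0.0], "p"), ([0.1, 0.0], "b"),
        ([1.0, 1.0], "f"), ([1.1, 1.0], "v")]
viseme_models = train_viseme_models(data)
phoneme_models = train_phoneme_models(data, viseme_models)
```

Note that "m" receives no phoneme-level data, so it falls back to the V1 viseme centroid: one plausible motivation for this kind of initialisation, since rare phonemes still get a sensible starting model from their (better-populated) viseme class.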
Original language: English
Publication status: Published - 2016
Event: International Conference on Acoustics, Speech, and Signal Processing - Shanghai, China
Duration: 21 Mar 2016 – 25 Mar 2016

Conference

Conference: International Conference on Acoustics, Speech, and Signal Processing
Country/Territory: China
City: Shanghai
Period: 21/03/16 – 25/03/16

Keywords

  • visemes
  • weak learning
  • visual speech
  • lip-reading
  • recognition
  • classification
