Visually-derived Wiener filters for speech enhancement

Ibrahim Almajai, Ben P. Milner, Jonathan Darch, Saeed V. Vaseghi

Research output: Contribution to conference › Paper

15 Citations (Scopus)

Abstract

This work begins by examining the correlation between audio and visual speech features and reveals that correlation is higher within individual phoneme sounds than globally across all speech. Utilising this correlation, a visually-derived Wiener filter is proposed in which clean power spectrum estimates are obtained from visual speech features. Two methods of extracting clean power spectrum estimates are presented: first, a global estimate using a single Gaussian mixture model (GMM); second, phoneme-specific estimates using a hidden Markov model (HMM)-GMM structure. Measurement of estimation accuracy reveals that the phoneme-specific (HMM-GMM) system leads to lower estimation errors than the global (GMM) system. Finally, the effectiveness of visually-derived Wiener filtering is examined.
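Once a clean power spectrum estimate is available (however it is derived), the Wiener filter combines it with a noise power estimate to form a frequency-domain gain. The sketch below illustrates the classical Wiener gain in Python; it uses oracle power spectra from a toy signal in place of the visually-derived estimates described in the abstract, and all function names are hypothetical, not the authors' implementation.

```python
import numpy as np

def wiener_gain(clean_psd, noise_psd, floor=1e-10):
    # Classical Wiener gain: H(f) = Ps(f) / (Ps(f) + Pn(f)),
    # floored to avoid division by zero. Values lie in [0, 1].
    return clean_psd / np.maximum(clean_psd + noise_psd, floor)

def enhance_frame(noisy_frame, clean_psd, noise_psd):
    # Apply the Wiener gain to one frame in the frequency domain
    # and transform back to the time domain.
    spectrum = np.fft.rfft(noisy_frame)
    gain = wiener_gain(clean_psd, noise_psd)
    return np.fft.irfft(spectrum * gain, n=len(noisy_frame))

# Toy example: a sinusoid buried in white noise, with oracle
# (rather than visually-derived) power spectrum estimates.
rng = np.random.default_rng(0)
n = 256
t = np.arange(n)
clean = np.sin(2 * np.pi * 8 * t / n)
noise = 0.5 * rng.standard_normal(n)
noisy = clean + noise

clean_psd = np.abs(np.fft.rfft(clean)) ** 2
noise_psd = np.abs(np.fft.rfft(noise)) ** 2
enhanced = enhance_frame(noisy, clean_psd, noise_psd)

# The filtered frame should sit closer to the clean signal.
err_before = np.mean((noisy - clean) ** 2)
err_after = np.mean((enhanced - clean) ** 2)
```

In the paper's setting the `clean_psd` input would come from the GMM or HMM-GMM mapping of visual features, which is exactly where the phoneme-specific estimates pay off: a more accurate clean spectrum yields a more accurate gain.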
Original language: English
Pages: IV-585–IV-588
Number of pages: 4
DOIs
Publication status: Published - 2007
Event: IEEE International Conference on Acoustics, Speech and Signal Processing - Honolulu, United States
Duration: 15 Apr 2007 – 20 Apr 2007

Conference

Conference: IEEE International Conference on Acoustics, Speech and Signal Processing
Country/Territory: United States
City: Honolulu
Period: 15/04/07 – 20/04/07
