HMM-Based Speech Enhancement Using Sub-Word Models and Noise Adaptation

Akihiro Kato, Ben Milner

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution


Abstract

This work proposes a method of speech enhancement that uses a network of HMMs first to decode noisy speech and then to synthesise a set of features from which a clean speech signal can be reconstructed. Different choices of acoustic model (whole-word, monophone and triphone) and grammar (from highly constrained to unconstrained) are considered, and the effects of introducing or relaxing acoustic and grammar constraints are investigated. For robust operation in noisy conditions the HMMs must model noisy speech; consequently, noise adaptation is investigated along with its effect on the reconstructed speech. Speech quality and intelligibility analyses find that triphone models with no grammar, combined with noise adaptation, give the highest performance and outperform conventional methods of enhancement at low signal-to-noise ratios.
Original language: English
Title of host publication: Proceedings of the Interspeech Conference 2016
Publisher: International Speech Communication Association
Pages: 3748-3752
Number of pages: 5
Publication status: Published - Sep 2016
Event: Interspeech 2016 - San Francisco, United States
Duration: 8 Sep 2016 to 12 Sep 2016

Conference

Conference: Interspeech 2016
Country/Territory: United States
City: San Francisco
Period: 8/09/16 to 12/09/16
