Speaker Normalisation in the MFCC Domain

Research output: Contribution to conference (Paper)

Abstract

It has been shown in several recent publications that application of vocal tract normalization (VTN) is a successful method for improving the accuracy of speaker independent recognisers. We argue that VTN can be implemented in the filterbank domain and propose a model to achieve this. We show how the model can be implemented directly in the MFCC domain, where it may be viewed as a constrained version of maximum likelihood linear regression (MLLR). The parameter estimates produced by the model are in accord with our ideas about how it should operate to perform VTN. Recognition results on a phoneme recognition task are presented which show a small improvement in accuracy.
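
The move from the filterbank domain to the MFCC domain can be illustrated with a small sketch. It is not the paper's model. It assumes that the normalisation acts as a linear map W on the log filterbank energies and that the cepstra come from a full, orthonormal DCT matrix D, so that the equivalent cepstral transform is A = D W Dᵀ. The one-parameter `warp_matrix` is a hypothetical stand-in for a warping function, chosen only to show how a matrix with very few free parameters resembles a constrained MLLR-style transform. In practice the cepstrum is truncated, so the equivalence would hold only approximately.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix (n x n), so its inverse is its transpose."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * k * (2 * j + 1) / (2 * n))
                     for j in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def apply(m, v):
    """Matrix-vector product."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def warp_matrix(n, alpha):
    """Hypothetical warp (an assumption, not taken from the paper):
    output channel i linearly interpolates the input channels around
    position alpha * i, clamped to the last channel."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        pos = min(alpha * i, n - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        w[i][lo] += 1.0 - frac
        w[i][hi] += frac
    return w

def cepstral_transform(n, alpha):
    """Cepstral-domain matrix A = D W D^T equivalent to the warp W."""
    d = dct_matrix(n)
    return matmul(matmul(d, warp_matrix(n, alpha)), transpose(d))

# Illustrative log filterbank energies (made-up values).
n = 8
log_fbank = [1.0, 2.5, 3.0, 2.0, 1.5, 1.0, 0.5, 0.2]
d = dct_matrix(n)

# Path 1: warp in the filterbank domain, then take the DCT.
via_fbank = apply(d, apply(warp_matrix(n, 0.9), log_fbank))
# Path 2: take the DCT first, then apply the cepstral-domain matrix.
via_cepstra = apply(cepstral_transform(n, 0.9), apply(d, log_fbank))

max_diff = max(abs(x - y) for x, y in zip(via_fbank, via_cepstra))
```

Under these assumptions both paths give the same cepstra up to rounding, and a warp factor of 1.0 yields the identity transform.
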
Original language: English
Pages: 853-856
Number of pages: 4
Publication status: Published - Oct 2000
Event: Sixth International Conference on Spoken Language Processing (ICSLP 2000) - Beijing, China
Duration: 16 Oct 2000 to 20 Oct 2000

Conference

Conference: Sixth International Conference on Spoken Language Processing (ICSLP 2000)
Country: China
City: Beijing
Period: 16/10/00 to 20/10/00
