Analysing the importance of different visual feature coefficients

Ben Milner, Danny Websdale

Research output: Contribution to conference › Paper › peer-review

Abstract

A study is presented to determine the relative importance of different visual features for speech recognition, including pixel-based, model-based, contour-based and physical features. The discriminability of the features is analysed through F-ratio and J-measures for both static features and temporal derivatives, and the results were found to correlate highly with speech recognition accuracy (r = 0.97). Principal component analysis is then used to combine all visual features into a single feature vector, and further analysis is performed on the resulting basis functions. An optimal feature vector is obtained which outperforms the best individual feature (AAM) with 93.5% word accuracy.
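As a rough illustration of the discriminability analysis mentioned in the abstract, the sketch below computes an F-ratio for a single feature coefficient. It assumes the commonly used form (variance of the class means divided by the mean within-class variance); this page does not state the exact definition used in the paper, and the function name `f_ratio` and the toy data are hypothetical.

```python
from statistics import fmean, pvariance


def f_ratio(values, labels):
    """F-ratio of one feature coefficient (assumed definition):
    variance of the class means / mean within-class variance."""
    # Group the coefficient values by class label.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)

    class_means = [fmean(group) for group in groups.values()]
    between = pvariance(class_means)  # spread of the class means
    within = fmean([pvariance(group) for group in groups.values()])
    # Note: raises ZeroDivisionError if every class has zero variance.
    return between / within


# Toy data: two classes, three samples each (illustrative values only).
labels = ["a", "a", "a", "b", "b", "b"]
separable = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]  # class means differ
overlapping = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]   # class means coincide

print(f_ratio(separable, labels))    # large value: discriminative
print(f_ratio(overlapping, labels))  # 0.0: not discriminative
```

A larger F-ratio indicates that the coefficient separates the classes better relative to its spread within each class, which is the sense in which such a measure could be compared against recognition accuracy.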
Original language: English
Publication status: Published - 2015
Event: FAAVSP - The 1st Joint Conference on Facial Analysis, Animation and Auditory-Visual Speech Processing - Vienna, Austria
Duration: 11 Sep 2015 to 13 Sep 2015
http://www.isca-speech.org/archive/avsp15/av15_127.html

Conference

Conference: FAAVSP - The 1st Joint Conference on Facial Analysis, Animation and Auditory-Visual Speech Processing
Abbreviated title: FAAVSP 2015
Country/Territory: Austria
City: Vienna
Period: 11/09/15 to 13/09/15