2.5D Visual Speech Synthesis Using Appearance Models

B Theobald, GC Cawley, JRW Glauert, JA Bangham, I Matthews

Research output: Contribution to conference › Paper

Abstract

Two-dimensional (2D) shape and appearance models are applied to the problem of creating a near-videorealistic talking head. A speech corpus of a talker uttering a set of phonetically balanced training sentences is analysed using a generative model of the human face. Segments of original parameter trajectories, corresponding to the synthesis unit (e.g. a triphone), are extracted from a codebook, then normalised, blended, concatenated and smoothed before being applied to the model to give natural, realistic animations of novel utterances. The system provides a 2D image sequence corresponding to the face of a talker. It is also used to animate the face of a 3D avatar by displacing the mesh according to movements of points in the shape model and dynamically texturing the face polygons using the appearance model.
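As a rough illustration of the concatenative step the abstract describes (extract unit trajectories from a codebook, then normalise, blend, concatenate and smooth), the sketch below operates on generic [frames x parameters] trajectories. It is not the authors' implementation: the function names, the linear time-warping used for normalisation, the crossfade overlap length and the moving-average smoothing filter are all assumptions chosen for illustration, as the paper's record here does not specify them.

    import numpy as np

    def resample(segment, length):
        """Linearly time-warp a [T, D] trajectory segment to a new length (assumed normalisation)."""
        t_old = np.linspace(0.0, 1.0, segment.shape[0])
        t_new = np.linspace(0.0, 1.0, length)
        return np.stack(
            [np.interp(t_new, t_old, segment[:, d]) for d in range(segment.shape[1])],
            axis=1,
        )

    def concatenate_units(segments, overlap=4):
        """Crossfade consecutive segments over `overlap` frames, then concatenate."""
        out = segments[0]
        for seg in segments[1:]:
            w = np.linspace(0.0, 1.0, overlap)[:, None]  # blend weights, 0 -> 1
            blended = (1.0 - w) * out[-overlap:] + w * seg[:overlap]
            out = np.concatenate([out[:-overlap], blended, seg[overlap:]], axis=0)
        return out

    def smooth(trajectory, width=3):
        """Moving-average filter along time to suppress join artefacts (assumed smoother)."""
        kernel = np.ones(width) / width
        return np.apply_along_axis(
            lambda c: np.convolve(c, kernel, mode="same"), 0, trajectory
        )

    # Hypothetical codebook: each triphone maps to a [frames, parameters] trajectory
    # of model parameters recovered from the training corpus.
    rng = np.random.default_rng(0)
    triphones = ("sil-h-eh", "h-eh-l", "eh-l-ou")
    codebook = {t: rng.standard_normal((12, 20)) for t in triphones}

    units = [resample(codebook[t], 10) for t in triphones]
    params = smooth(concatenate_units(units))
    print(params.shape)  # (22, 20): three 10-frame units joined with 4-frame overlaps

In a full system, each row of the resulting trajectory would be fed to the shape and appearance model to render one frame of the output sequence; here the trajectories are random placeholders.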
Original language: English
Pages: 43-52
Number of pages: 10
Publication status: Published - 2003
Event: British Machine Vision Conference - Oxford Brookes University, Oxford, United Kingdom
Duration: 5 Sep 2005 – 8 Sep 2005

Conference

Conference: British Machine Vision Conference
Country/Territory: United Kingdom
City: Oxford
Period: 5/09/05 – 8/09/05
