2.5D Visual Speech Synthesis Using Appearance Models

B Theobald, GC Cawley, JRW Glauert, JA Bangham, I Matthews

Research output: Contribution to conference › Paper


Two-dimensional (2D) shape and appearance models are applied to the problem of creating a near-videorealistic talking head. A speech corpus of a talker uttering a set of phonetically balanced training sentences is analysed using a generative model of the human face. Segments of original parameter trajectories, corresponding to the synthesis unit (e.g. triphone), are extracted from a codebook, then normalised, blended, concatenated and smoothed before being applied to the model to give natural, realistic animations of novel utterances. The system provides a 2D image sequence corresponding to the face of a talker. It is also used to animate the face of a 3D avatar by displacing the mesh according to movements of points in the shape model and dynamically texturing the face polygons using the appearance model.
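The concatenative synthesis pipeline described in the abstract (extract unit trajectories, blend at the joins, concatenate, smooth) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the function name, the linear cross-fade at unit boundaries, and the moving-average smoother are all assumptions chosen for clarity, and the parameter trajectories are stand-ins for appearance-model parameters.

```python
import numpy as np

def synthesise_trajectory(units, overlap=4, smooth_window=5):
    """Concatenate parameter trajectories for a sequence of synthesis
    units (e.g. triphones): cross-fade the overlapping frames at each
    join, then smooth the result along time.

    units        : list of (frames x parameters) arrays (illustrative
                   stand-ins for appearance-model parameter segments)
    overlap      : number of frames blended at each join (assumption)
    smooth_window: moving-average window length (assumption)
    """
    out = units[0].astype(float)
    for seg in units[1:]:
        seg = seg.astype(float)
        n = min(overlap, len(out), len(seg))
        # linear cross-fade weights over the join region
        w = np.linspace(0.0, 1.0, n)[:, None]
        blended = (1.0 - w) * out[-n:] + w * seg[:n]
        out = np.vstack([out[:-n], blended, seg[n:]])
    # moving-average smoothing of each parameter track to suppress
    # residual discontinuities at the unit boundaries
    kernel = np.ones(smooth_window) / smooth_window
    return np.apply_along_axis(
        lambda track: np.convolve(track, kernel, mode="same"), 0, out)
```

For two units of 10 and 8 frames with a 4-frame overlap, the synthesised trajectory has 10 + 8 - 4 = 14 frames; each smoothed parameter track would then drive the model to render the 2D image sequence or displace the 3D avatar mesh.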
Original language: English
Number of pages: 10
Publication status: Published - 2003
Event: British Machine Vision Conference - Oxford Brookes University, Oxford, United Kingdom
Duration: 5 Sep 2005 - 8 Sep 2005


Conference: British Machine Vision Conference
Country/Territory: United Kingdom
