Abstract
Two-dimensional (2D) shape and appearance models are applied to the problem of creating a near-videorealistic talking head. A speech corpus of a talker uttering a set of phonetically balanced training sentences is analysed using a generative model of the human face. Segments of original parameter trajectories, corresponding to the synthesis unit (e.g. a triphone), are extracted from a codebook, then normalised, blended, concatenated and smoothed before being applied to the model to give natural, realistic animations of novel utterances. The system provides a 2D image sequence corresponding to the face of a talker. It is also used to animate the face of a 3D avatar by displacing the mesh according to movements of points in the shape model and dynamically texturing the face polygons using the appearance model.
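The normalise, blend, concatenate and smooth steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Every function name and parameter here is hypothetical, and the specific techniques (linear resampling for duration normalisation, a linear cross-fade for blending, a moving average for smoothing) are assumptions chosen for simplicity, since the abstract does not specify them. Each segment is taken to be a `(frames, parameters)` array of model parameters.

```python
import numpy as np

def resample(segment, length):
    """Duration normalisation (assumed): linearly resample a (T, D) trajectory to `length` frames."""
    segment = np.asarray(segment, dtype=float)
    src = np.linspace(0.0, 1.0, len(segment))
    dst = np.linspace(0.0, 1.0, length)
    columns = [np.interp(dst, src, segment[:, d]) for d in range(segment.shape[1])]
    return np.stack(columns, axis=1)

def blend_concatenate(segments, overlap):
    """Join segments, linearly cross-fading `overlap` frames at each boundary (assumed scheme)."""
    assert overlap >= 1, "this sketch needs at least one overlapping frame"
    out = np.asarray(segments[0], dtype=float)
    w = np.linspace(0.0, 1.0, overlap)[:, None]  # fade-in weights for the incoming segment
    for seg in segments[1:]:
        seg = np.asarray(seg, dtype=float)
        mixed = (1.0 - w) * out[-overlap:] + w * seg[:overlap]
        out = np.concatenate([out[:-overlap], mixed, seg[overlap:]], axis=0)
    return out

def smooth(trajectory, window=3):
    """Moving-average smoothing along time; edge padding keeps the frame count unchanged."""
    assert window % 2 == 1, "window must be odd so the output length matches the input"
    pad = window // 2
    padded = np.pad(trajectory, ((pad, pad), (0, 0)), mode="edge")
    kernel = np.ones(window) / window
    columns = [np.convolve(padded[:, d], kernel, mode="valid") for d in range(trajectory.shape[1])]
    return np.stack(columns, axis=1)

def synthesise(segments, durations, overlap=2, window=3):
    """Chain the steps: normalise each unit, blend and concatenate, then smooth."""
    normalised = [resample(s, n) for s, n in zip(segments, durations)]
    return smooth(blend_concatenate(normalised, overlap), window)
```

In this sketch the output has `sum(durations) - overlap * (len(segments) - 1)` frames, because each boundary shares `overlap` frames between neighbouring units. The resulting trajectory would then be applied to the shape and appearance model to render frames, a step not shown here.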
Original language | English |
---|---|
Pages | 43-52 |
Number of pages | 10 |
Publication status | Published - 2003 |
Event | British Machine Vision Conference - Oxford Brookes University, Oxford, United Kingdom. Duration: 5 Sep 2005 → 8 Sep 2005 |
Conference
Conference | British Machine Vision Conference |
---|---|
Country/Territory | United Kingdom |
City | Oxford |
Period | 5/09/05 → 8/09/05 |