The UEA Digital Humans entry to the GENEA Challenge 2023

Jonathan Windle (Lead Author), Iain Matthews, Ben Milner, Sarah Taylor

Research output: Contribution to conference › Paper › peer-review

5 Citations (Scopus)

Abstract

This paper describes our entry to the GENEA (Generation and Evaluation of Non-verbal Behaviour for Embodied Agents) Challenge 2023. This year's challenge focuses on generating gestures in a dyadic setting: predicting a main-agent's motion from the speech of both the main-agent and an interlocutor. We adapt a Transformer-XL architecture for this task by adding a cross-attention module that integrates the interlocutor's speech with that of the main-agent. Our model is conditioned on speech audio (encoded using PASE+), text (encoded using FastText) and a speaker identity label, and is able to generate smooth and speech-appropriate gestures for a given identity. We consider the GENEA Challenge user study results and discuss our model's strengths and where improvements can be made.

Original language: English
Pages: 802-810
Number of pages: 9
Publication status: Published - 9 Oct 2023
Event: INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION
Duration: 9 Oct 2023 to 13 Oct 2023

Keywords

  • 3D pose prediction
  • Cross-Attention
  • Self-Attention
  • Speech-to-gesture
  • Transformer-XL
  • gesture generation
