Real-time Gesture Animation Generation from Speech for Virtual Human Interaction

Manuel Rebol, Christian Gütl, Krzysztof Pietroszek

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › peer-review

Abstract

We propose a real-time system for synthesizing gestures directly from speech. Our data-driven approach models the speech-gesture relationship with generative adversarial networks (GANs). We exploit the large amount of speaker video data available online to train our 3D gesture model. The model generates speaker-specific gestures from consecutive two-second chunks of audio input, and we animate the predicted gestures on a virtual avatar. We achieve a delay of under three seconds between audio input and gesture animation.
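As a purely illustrative aid (not the authors' implementation), the sketch below shows how such a pipeline could be wired up in Python with PyTorch: consecutive two-second audio chunks are fed to a generator network that predicts a short sequence of 3D joint positions, which are then handed to an avatar animation callback. The network architecture, the names GestureGenerator and animate_avatar, and constants such as FRAMES_PER_CHUNK and NUM_JOINTS are assumptions made for this sketch; the paper's actual GAN model and training details are in the publication itself.

# Minimal sketch (not the authors' code): a streaming loop that feeds
# consecutive two-second audio chunks to a speech-to-gesture generator
# and hands the predicted poses to an avatar animation callback.
# GestureGenerator, animate_avatar, FRAMES_PER_CHUNK, and NUM_JOINTS
# are illustrative assumptions, not APIs from the paper.

import numpy as np
import torch
import torch.nn as nn

SAMPLE_RATE = 16_000          # assumed audio sampling rate
CHUNK_SECONDS = 2             # chunk length used in the paper
FRAMES_PER_CHUNK = 30         # assumed pose frames predicted per chunk
NUM_JOINTS = 15               # assumed upper-body joint count

class GestureGenerator(nn.Module):
    """Toy generator: raw audio chunk -> sequence of 3D joint positions."""
    def __init__(self):
        super().__init__()
        # Convolutional audio encoder followed by a recurrent pose decoder.
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=64, stride=16), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=16, stride=8), nn.ReLU(),
            nn.AdaptiveAvgPool1d(FRAMES_PER_CHUNK),
        )
        self.decoder = nn.GRU(64, 128, batch_first=True)
        self.head = nn.Linear(128, NUM_JOINTS * 3)

    def forward(self, audio):                      # audio: (B, samples)
        feats = self.encoder(audio.unsqueeze(1))   # (B, 64, frames)
        seq, _ = self.decoder(feats.transpose(1, 2))
        poses = self.head(seq)                     # (B, frames, joints*3)
        return poses.reshape(-1, FRAMES_PER_CHUNK, NUM_JOINTS, 3)

def animate_avatar(poses):
    """Placeholder for the engine-side avatar animation step."""
    print("animating", tuple(poses.shape), "pose frames")

generator = GestureGenerator().eval()
with torch.no_grad():
    for _ in range(3):                             # stand-in for a live mic feed
        chunk = torch.from_numpy(
            np.random.randn(SAMPLE_RATE * CHUNK_SECONDS).astype(np.float32))
        poses = generator(chunk.unsqueeze(0))      # (1, frames, joints, 3)
        animate_avatar(poses[0])

In a real deployment the random chunks would come from a microphone buffer, and keeping the per-chunk inference time below the two-second chunk duration is what makes the reported sub-three-second end-to-end delay plausible.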

Original language: English
Title of host publication: Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, CHI EA 2021
Publisher: Association for Computing Machinery
Pages: 327-344
ISBN (Electronic): 9781450380959
DOIs
Publication status: Published - 8 May 2021
Event: 2021 CHI Conference on Human Factors in Computing Systems: Making Waves, Combining Strengths, CHI 2021 - Virtual, Online, Japan
Duration: 8 May 2021 – 13 May 2021

Publication series

Name: Conference on Human Factors in Computing Systems - Proceedings

Conference

Conference: 2021 CHI Conference on Human Factors in Computing Systems: Making Waves, Combining Strengths
Country/Territory: Japan
City: Virtual, Online
Period: 8/05/21 – 13/05/21

Keywords

  • Animation
  • Gestures
  • NUI

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Computer Graphics and Computer-Aided Design
  • Software

Fields of Expertise

  • Information, Communication & Computing
