
VOCA: Capture, Learning, and Synthesis of 3D Speaking Styles


VOCA (Voice Operated Character Animation) is a framework that takes a speech signal as input and realistically animates a wide range of adult faces.

Code: We provide Python demo code that outputs a 3D head animation given a speech signal and a static 3D head mesh. The codebase further provides animation controls to alter the speaking style, identity-dependent facial shape, and head pose (i.e., head rotation around the neck) during animation. The code also demonstrates how to sample 3D head meshes from the publicly available FLAME model, which can then be animated with the provided code (see the first sketch below).

Dataset: We captured a unique 4D face dataset (VOCASET) comprising about 29 minutes of 3D scans captured at 60 fps with synchronized audio from 12 speakers. We provide the raw 3D scans, registrations in FLAME topology, and unposed registrations (i.e., registrations in "zero pose"); see the second sketch below.
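At its core, a VOCA-style pipeline regresses per-vertex displacements from speech features and adds them to a subject's static template mesh. The sketch below illustrates that data flow with a zero-offset stand-in for the trained regressor; the function name, array shapes, and feature dimension are illustrative assumptions, not the repository's actual API.

```python
import numpy as np

NUM_VERTICES = 5023  # vertex count of the FLAME mesh topology

def animate(template: np.ndarray, speech_features: np.ndarray) -> np.ndarray:
    """Per-frame meshes = static identity template + predicted offsets.

    template:        (NUM_VERTICES, 3) static head mesh in FLAME topology
    speech_features: (num_frames, feature_dim) audio features, one row per
                     output frame
    """
    num_frames = speech_features.shape[0]
    frames = np.empty((num_frames, NUM_VERTICES, 3), dtype=np.float32)
    for t in range(num_frames):
        # Stand-in for the trained model, which predicts these offsets from
        # a window of speech features plus a speaker-style conditioning code.
        offsets = np.zeros((NUM_VERTICES, 3), dtype=np.float32)
        frames[t] = template + offsets
    return frames

# Illustrative call with dummy inputs; real inputs come from the demo code.
meshes = animate(np.zeros((NUM_VERTICES, 3), np.float32),
                 np.zeros((100, 29), np.float32))
```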
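Because all registrations share the FLAME topology, every frame has the same vertex count and ordering, so a sequence can be stacked into a single array. Below is a minimal loading sketch, assuming per-frame PLY registrations and using trimesh; the directory layout shown is hypothetical, so consult the VOCASET download for the actual structure.

```python
from pathlib import Path
import numpy as np
import trimesh

# Hypothetical layout; the real dataset structure may differ.
sequence_dir = Path("VOCASET/registrations/subject01/sentence01")

frames = []
for ply_path in sorted(sequence_dir.glob("*.ply")):
    # process=False keeps the vertex order that encodes the FLAME topology
    mesh = trimesh.load(ply_path, process=False)
    frames.append(np.asarray(mesh.vertices, dtype=np.float32))  # (5023, 3)

sequence = np.stack(frames)  # (num_frames, 5023, 3), one mesh per frame
```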

Release Date: 01 May 2019
License: The MIT License
Copyright: Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V.
Authors: Daniel Cudeiro, Timo Bolkart, Cassidy Laidlaw, Anurag Ranjan, and Michael Black
Link (URL): https://voca.is.tue.mpg.de
Repository: https://github.com/TimoBolkart/voca