Joker: Conditional 3D Head Synthesis with Extreme Facial Expressions

Joker uses a single reference image to generate a 3D reconstruction with a novel extreme expression. The target expression is defined through 3D morphable model (3DMM) parameters and text prompts. The text prompts resolve ambiguities in the 3DMM input and can control emotion-related expression subtleties and tongue articulation. Our method consists of two components: a 2D diffusion-based prior and a progressive 3D distillation procedure. The 2D prior predicts images of the reference identity, with expression and pose controlled through a 3DMM and text prompts. During progressive 3D distillation, the predictions of the 2D prior are used to generate a 3D reconstruction.
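The data flow described above can be sketched in a few lines of Python. This is only an illustrative outline, not the authors' implementation: the names `Condition`, `synthesize_head`, `dummy_prior`, and `dummy_distill` are hypothetical, the two components are passed in as placeholder callables, and the progressive schedule of the distillation step is not modeled because the description does not specify it.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Sequence, Tuple

Image = List[List[float]]  # placeholder image type (assumption)


@dataclass(frozen=True)
class Condition:
    """Target expression and pose, following the description above."""
    expression: Tuple[float, ...]  # 3DMM expression parameters
    pose: Tuple[float, ...]        # pose parameters
    prompt: str                    # text prompt, e.g. an emotion or tongue cue


def synthesize_head(
    reference: Image,
    target: Condition,
    poses: Sequence[Sequence[float]],
    prior: Callable[[Image, Condition], Image],
    distill: Callable[[List[Tuple[Condition, Image]]], object],
) -> object:
    """Query the 2D prior at several poses, then distill into one 3D result."""
    supervision = []
    for pose in poses:
        # Same expression and prompt, different pose for each query.
        cond = replace(target, pose=tuple(pose))
        supervision.append((cond, prior(reference, cond)))
    return distill(supervision)


# Stand-ins for the real components, used only to exercise the data flow.
def dummy_prior(reference: Image, cond: Condition) -> Image:
    return reference  # a real prior would render the identity under `cond`


def dummy_distill(pairs: List[Tuple[Condition, Image]]) -> dict:
    return {"num_views": len(pairs), "prompt": pairs[0][0].prompt}


target = Condition(expression=(0.0, 0.0, 0.0), pose=(0.0, 0.0, 0.0), prompt="tongue out")
result = synthesize_head(
    [[0.0]], target, [(0.0, 0.0, 0.0), (0.0, 0.5, 0.0)], dummy_prior, dummy_distill
)
```

The point of the sketch is the separation of concerns: the prior only ever sees one reference image plus a condition, while the distillation step only sees the prior's predictions.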

Members

Perceiving Systems
  • Guest Scientist
Perceiving Systems
  • Guest Scientist
Perceiving Systems, Human-centric Vision & Learning
  • Doctoral Researcher
Justus Thies
Neural Capture and Synthesis, Perceiving Systems
  • Max Planck Research Group Leader

Publications

Joker: Conditional 3D Head Synthesis with Extreme Facial Expressions
Prinzler, M., Zakharov, E., Sklyarova, V., Kabadayi, B., Thies, J.
In International Conference on 3D Vision (3DV), 2025 (Published). Conference paper, Perceiving Systems.