Stable Video Portraits is a novel hybrid 2D/3D generation method that outputs photorealistic videos of talking faces. It leverages a large pre-trained text-to-image prior (2D) and is controlled via a 3D morphable model, or 3DMM (3D). The method is based on a personalized image diffusion prior, which allows us to generate new videos of the subject and to edit the appearance by blending the personalized image prior with a general text-conditioned model.
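
The text above does not say how the personalized prior and the general text-conditioned model are blended. As a rough illustration only, the sketch below assumes the blend is a linear interpolation between the two models' outputs at a given denoising step. The function name `blend_predictions`, the weight `alpha`, and the use of flat lists of floats are all hypothetical and chosen for illustration; they are not taken from the method itself.

```python
from typing import List, Sequence


def blend_predictions(
    personalized: Sequence[float],
    general: Sequence[float],
    alpha: float,
) -> List[float]:
    """Linearly interpolate between two model outputs.

    ASSUMPTION: linear interpolation is one plausible way to blend the
    personalized prior with the general model; the source does not
    specify the actual mechanism.

    alpha = 0.0 returns the personalized output unchanged,
    alpha = 1.0 returns the general model's output unchanged.
    """
    if len(personalized) != len(general):
        raise ValueError("both predictions must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # Element-wise convex combination of the two predictions.
    return [(1.0 - alpha) * p + alpha * g for p, g in zip(personalized, general)]


# Hypothetical per-step outputs from the two models (toy values).
personalized_out = [1.0, 2.0]
general_out = [3.0, 4.0]
halfway = blend_predictions(personalized_out, general_out, 0.5)
```

Under this assumption, a small `alpha` would keep the result close to the subject's learned appearance, while a larger `alpha` would let the text-conditioned model contribute more to the edit.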