Events & Talks
Perceiving Systems
Talk
Rao Fu
Upcoming
24-07-2025
Capturing Dexterity: Large-Scale Demonstration Acquisition and Dynamic Contact Modeling for Bimanual Hand–Object Interaction
Achieving dexterous manipulation in AI and robotics depends on learning from high-quality, generalizable demonstrations with detailed annotations. In this talk, I first present GigaHands, a massive and richly annotated dataset capturing diverse bimanual hand–object interactions. Existing datasets often fall short in scale, diversity, and detail, limiting the effective training of generative models and policy learners. GigaHands offers the possibility of overcoming these limitations by providing a robust foundation of diverse demonstrations paired with comprehensive text annotations, which a...
Perceiving Systems
Talk
Robin Courant
10-03-2025
How and what to film in virtual environments?
Content creation for movies and video games has been transformed with the rise of virtual environments, yet filming within these digital worlds remains a complex challenge. This talk explores the question: how and what to film in virtual environments? We examine the role of camera control and human interaction across different virtual settings, including NeRF, 3D engines, and video generation.
Victoria Fernandez Abrevaya
Perceiving Systems
Talk
Ailing Zeng
18-02-2025
The Dawn of Video Generation: Preliminary Explorations with SORA-like Models
High-quality video generation—encompassing text-to-video (T2V), image-to-video (I2V), and video-to-video (V2V) generation—plays a pivotal role in content creation and world simulation. While several DiT-based models have advanced rapidly in the past year, a thorough exploration of their capabilities, limitations, and alignment with human preferences remains incomplete. In this talk, I will present recent advancements in SORA-like T2V, I2V, and V2V models and products, bridging the gap between academic research and industry applications. Through live demonstrations and comparative analyses, ...
Nikos Athanasiou
Michael Black
Perceiving Systems
Talk
Yannis Siglidis
06-02-2025
Computer Vision at the Mirror Stage: Questioning and Refining Visual Categorization
Computer vision advancements in predicting and visualizing labels often motivate us to consider the relationship between labels and images as a given. Yet the prototypical nature of coherent labels, such as the alphabet of handwritten characters, can help us question assumed families of handwritten variation.
Nikos Athanasiou
Perceiving Systems
Talk
Sergi Pujades
28-11-2024
How to predict the inside from the outside? Segment, register, model and infer!
Observing and modeling the human body has attracted scientific effort since very early in history. In recent decades, though, several imaging modalities, such as Computed Tomography (CT) scanners, Magnetic Resonance Imaging (MRI), or X-ray, have provided the means to “see” inside the body. Most interestingly, there is growing evidence that the shape of the surface of the human body is highly correlated with its internal properties, for example, the body composition, the size of the bones, and the amount of muscle and adipose tissue (fat). In this talk I will go over ...
Marilyn Keller
Perceiving Systems
Talk
Guy Tevet
14-10-2024
Diffusion Models for Human Motion Synthesis
Character motion synthesis stands as a central challenge in computer animation and graphics. The successful adaptation of diffusion models to the field boosted synthesis quality and provided intuitive controls such as text and music.
One of the earliest and most popular methods to do so is the Motion Diffusion Model (MDM) [ICLR 2023]. In this talk, I will review how MDM incorporates domain know-how into the diffusion model and enables intuitive editing capabilities.
Then, I will present two recent works, each suggesting a refreshing take on motion diffusion and extending its abilities to new...
Omid Taheri
Perceiving Systems
Talk
Egor Zakharov
10-10-2024
Reconstruction and Animation of Realistic Head Avatars
Digital humans, or realistic avatars, are a centerpiece of future telepresence and special effects systems, and human head modeling is one of their main components. The abovementioned applications, however, are highly demanding in terms of avatar creation speed, realism, and controllability. This talk will focus on approaches that create controllable and detailed 3D head avatars using data from consumer-grade devices, such as smartphones, in an uncalibrated and unconstrained capture setting. We will discuss leveraging in-the-wild internet videos and synthetic data sources...
Vanessa Sklyarova
Perceiving Systems
Talk
Simon Donne
26-09-2024
Collaborative Control for Geometry-Conditioned PBR Image Generation
Current diffusion models only generate RGB images. If we want to make progress towards graphics-ready 3D content generation, we need a PBR foundation model, but there is not enough PBR data available to train such a model from scratch. We introduce Collaborative Control, which tightly links a new PBR diffusion model to a pre-trained RGB model. We show that this dual architecture does not risk catastrophic forgetting, outputting high-quality PBR images and generalizing well beyond the PBR training dataset. Furthermore, the frozen base model remains compatible with techniques such as IP-Adapter.
Soubhik Sanyal
Perceiving Systems
Talk
Slava Elizarov
26-09-2024
Geometry Image Diffusion: Fast and Data-Efficient Text-to-3D with Image-Based Surface Representation
In this talk, I will present Geometry Image Diffusion (GIMDiffusion), a novel method designed to generate 3D objects from text prompts efficiently. GIMDiffusion uses geometry images, a 2D representation of 3D shapes, which allows the use of existing image-based architectures instead of complex 3D-aware models. This approach reduces computational costs and simplifies the model design. By incorporating Collaborative Control, the method exploits rich priors of pretrained Text-to-Image models like Stable Diffusion, enabling strong generalization even with limited 3D training data. GIMDiffusion ...
Soubhik Sanyal
Perceiving Systems
Talk
Panagiotis Filntisis and George Retsinas
23-09-2024
Advancements in 3D Facial Expression Reconstruction
Recent advances in 3D face reconstruction from in-the-wild images and videos have excelled at capturing the overall facial shape associated with a person's identity. However, they often struggle to accurately represent the perceptual realism of facial expressions, especially subtle, extreme, or rarely observed ones. In this talk, we will present two contributions focused on improving 3D facial expression reconstruction. The first part introduces SPECTRE—"Visual Speech-Aware Perceptual 3D Facial Expression Reconstruction from Videos"—which offers a method for precise 3D reconstruction of mou...
Victoria Fernandez Abrevaya
Perceiving Systems
Talk
Wanyue Zhang
12-09-2024
Generalizable Object-aware Human Motion Synthesis
Data-driven virtual 3D character animation has recently witnessed remarkable progress. The realism of virtual characters is a core contributing factor to the quality of computer animations and user experience in immersive applications like games, movies, and VR/AR. However, existing automatic approaches for 3D virtual character motion synthesis supporting scene interactions do not generalize well to new objects outside training distributions, even when trained on extensive motion capture datasets with diverse objects and annotated interactions. In this talk, I will present ROAM, an alternat...
Nikos Athanasiou
Perceiving Systems
Talk
Thor Besier
04-09-2024
Modelling the Musculoskeletal System
Thor Besier leads the musculoskeletal modelling group at the Auckland Bioengineering Institute. He will provide an overview of the institute and some of his team's current research projects, including the Musculoskeletal Atlas Project, harmonising clinical gait analysis data, digital twins for shoulder arthroplasty, and the reproducibility of knee models (the NIH-funded KneeHUB project).
Marilyn Keller
Perceiving Systems
Talk
István Sárándi
22-08-2024
Real Virtual Humans
With the explosive growth of available training data, 3D human pose and shape estimation is on the verge of a transition to a data-centric paradigm. To leverage data scale, we need flexible models trainable from heterogeneous data sources. To this end, our latest work, Neural Localizer Fields, seamlessly unifies different human pose and shape-related tasks and datasets through the ability - both at training and test time - to query any arbitrary point of the human volume and obtain its estimated location in 3D, based on a single RGB image. We achieve this by learning a continuous neural field of b...
Marilyn Keller
Perceiving Systems
Talk
Jiawei Liu
25-07-2024
4D Dynamic Scene Reconstruction, Editing, and Generation
People live in a 4D dynamic moving world. While videos serve as the most convenient medium to capture this dynamic world, they lack the capability to present the 4D nature of our world. Therefore, 4D video reconstruction, free-viewpoint rendering, and high-quality editing and generation offer innovative opportunities for content creation, virtual reality, telepresence, and robotics. Although promising, they also pose significant challenges in terms of efficiency, 4D motion and dynamics, temporal and subject consistency, and text-3D/video alignment. In light of these challenges, this talk wi...
Omid Taheri
Perceiving Systems
Talk
Angelica Lim
23-07-2024
Multimodal Social Signal Processing for Human-Robot Interaction
Science fiction has long promised us interfaces and robots that interact with us as smoothly as humans do - Rosie the Robot from The Jetsons, C-3PO from Star Wars, and Samantha from Her. Today, interactive robots and voice user interfaces are moving us closer to effortless, human-like interactions in the real world. In this talk, I will discuss the opportunities and challenges in finely analyzing, detecting and generating non-verbal communication in context, including gestures, gaze, auditory signals, and facial expressions. Specifically, I will discuss how we might allow robots and virtual...
Yao Feng
Michael Black
Perceiving Systems
Talk
Siheng Chen
18-07-2024
Integrating AI Agents into Human Lives via a Simulation Approach
With the rapid growth of AI techniques, we may witness the emergence of AI agents entering our lives, reminiscent of a new species. Ensuring these AI agents can integrate well into human life will be a profound challenge. We need these agents to be highly performant, safe, and well aligned with human values. However, directly training and testing AI agents in real-world environments to guarantee their performance and safety is costly and can disrupt everyday life. Thus, we are exploring a simulation-based approach to incubate these AI agents. In this talk, we will highlight the role of si...
Yao Feng
Perceiving Systems
Talk
Boxiang Rong
18-07-2024
Recreating Real Garments in Virtual Space with Gaussian Splatting and GNNs
Recent advances in scene reconstruction with 3D Gaussian Splatting and cloth simulation with graph neural networks open the prospect of methods that reconstruct photorealistic virtual garments from visual observations. In this talk we will present our recently submitted paper, Gaussian Garments, in which we reconstruct simulation-ready photorealistic garments from multi-view videos. With the power of 3D Gaussian Splatting we are able to match three key aspects of real garments in virtual space: their geometry, appearance, and behavior. The resulting virtual garments can then be combined int...
Artur Grigorev
Perceiving Systems
Talk
Yafes Sahin
08-07-2024
Creating High-End Visuals with Real-Time Technology
Creating captivating 3D visuals, particularly photorealistic CGI, demands a diverse range of tools, techniques, and expertise, from concept design to the creation of entire 3D worlds. Linear content generation represents the highest standard of visual quality and has long been a source of inspiration for game developers. In this talk, we will explore the advancements in techniques that have contributed to the rise of real-time technologies in movies and game cinematics.
We will delve into projects created with Unreal Engine, such as The Matrix Awakens, Vaulted Halls Entombed (Netflix S...
Yao Feng
Perceiving Systems
Talk
Pranav Manu
04-07-2024
Text-Driven 3D Modeling of Avatars
Generating 3D objects poses notable challenges due to the limited availability of annotated 3D datasets, unlike their 2D counterparts. Current approaches often resort to models trained on 2D data, resulting in prolonged optimization phases. Conversely, models trained on 3D datasets enable inference without optimization but suffer from limited dataset diversity. This talk explores methodologies for generative 3D modelling of human heads and garments, pivotal for human avatar creation. First, we introduce "Clip-Head," a text-to-textured 3D head generation model that generates a textured NPHM ...
Victoria Fernandez Abrevaya
Perceiving Systems
Talk
Shixiang Tang
10-06-2024
Towards Human-Centric Foundation Models: Pretraining Datasets and Unified Architectures
Recent years have witnessed great research interest in Human-Centric Visual Computing, such as person re-identification in social surveillance, mesh recovery in the Metaverse, and pedestrian detection in autonomous driving. The recent development of large models offers the opportunity to unify these human-centric tasks and achieve improved performance by merging public datasets from different tasks. This talk will present our recent work on developing human-centric unified models for 2D vision, 3D vision, skeleton-based, and vision-language tasks. We hope our model will be integrated into the curre...
Yandong Wen
Perceiving Systems
Talk
Shengqu Cai
02-05-2024
Generative Rendering and Beyond
Traditional 3D content creation tools empower users to bring their imagination to life by giving them direct control over a scene's geometry, appearance, motion, and camera path. Creating computer-generated videos, however, is a tedious manual process, which can be automated by emerging text-to-video diffusion models (SORA). Despite great promise, video diffusion models are difficult to control, hindering users from applying their own creativity rather than amplifying it. In this talk, we present a novel approach called Generative Rendering that combines the controllability of dynamic 3D me...
Shrisha Bharadwaj
Michael Black
Perceiving Systems
Talk
Maria Korosteleva
04-04-2024
Modeling and Reconstructing Garments with Sewing Patterns
The problems of creating new garments (modeling) or reproducing the existing ones (reconstruction) appear in various fields: from fashion production to digital human modeling for the metaverse. The talk introduces approaches to a novel garment creation paradigm: programming-based parametric sewing pattern construction and its application to generating rich synthetic datasets of garments with sewing patterns. We will then discuss how the availability of ground truth sewing patterns allows posing the learning-based garment reconstruction problem as a sewing pattern recovery. Such reformulatio...
Yao Feng
Michael Black
Perceiving Systems
Talk
Qixing Huang
13-03-2024
Geometric Regularizations for 3D Shape Generation
Generative models, which map a latent parameter space to instances in an ambient space, enjoy various applications in 3D Vision and related domains. A standard scheme of these models is probabilistic, which aligns the induced ambient distribution of a generative model from a prior distribution of the latent space with the empirical ambient distribution of training instances. While this paradigm has proven to be quite successful on images, its current applications in 3D generation encounter fundamental challenges in the limited training data and generalization behavior. The key difference be...
Yuliang Xiu
Perceiving Systems
Talk
Luming Tang
18-01-2024
Mining Visual Knowledge from Large Pre-trained Models
Computer vision has made huge progress in the past decade with the dominant supervised learning paradigm, that is, training large-scale neural networks on each task with ever larger datasets. However, in many cases, scalable data or annotation collection is intractable. In contrast, humans can easily adapt to new vision tasks with very little data or few labels. To bridge this gap, we found that rich visual knowledge actually exists in large pre-trained models, i.e., models trained on scalable internet images with either self-supervised or generative objectives. And we proposed differ...
Yuliang Xiu
Yandong Wen
Perceiving Systems
Talk
Partha Ghosh
30-11-2023
RAVEN: Rethinking Adversarial Video generation with Efficient tri-plane Networks
We present a novel unconditional video generative model designed to address long-term spatial and temporal dependencies. To capture these dependencies, our approach incorporates a hybrid explicit-implicit tri-plane representation inspired by 3D-aware generative frameworks developed for three-dimensional object representation and employs a singular latent code to model an entire video sequence. Individual video frames are then synthesized from an intermediate tri-plane representation, which itself is derived from the primary latent code. This novel strategy reduces computational complexity b...
Yandong Wen
Perceiving Systems
Talk
Weiyang Liu
19-10-2023
Orthogonal Butterfly: Parameter-Efficient Orthogonal Adaptation of Foundation Models via Butterfly Factorization
Large foundation models are becoming ubiquitous, but training them from scratch is prohibitively expensive. Thus, efficiently adapting these powerful models to downstream tasks is increasingly important. In this paper, we study a principled finetuning paradigm -- Orthogonal Finetuning (OFT) -- for downstream task adaptation. Despite demonstrating good generalizability, OFT still uses a fairly large number of trainable parameters due to the high dimensionality of orthogonal matrices. To address this, we start by examining OFT from an information transmission perspective, and then identify a ...
Yandong Wen
Perceiving Systems
Talk
Zhen Liu
12-10-2023
Ghost on the Shell: An Expressive Representation of General 3D Shapes
The creation of photorealistic virtual worlds requires the accurate modeling of 3D surface geometry for a wide range of objects. For this, meshes are appealing since they 1) enable fast physics-based rendering with realistic material and lighting, 2) support physical simulation, and 3) are memory-efficient for modern graphics pipelines. Recent work on reconstructing and statistically modeling 3D shape, however, has critiqued meshes as being topologically inflexible. To capture a wide range of object shapes, any 3D representation must be able to model solid, watertight shapes as well as thin, open,...
Yandong Wen
Perceiving Systems
Talk
Jun Gao
07-09-2023
Scaling up 3D content generation via 3D grounding for representation, data and algorithm
Creating 3D virtual worlds will require generating diverse and high-quality 3D content that mimics the intricacies of the real 3D world. While machine learning has achieved significant success in image and video generation, its application in 3D content generation encounters fundamental challenges in the scarcity of 3D training data and increased complexities inherent in three dimensions. We approach the problem of 3D content generation by revisiting the 3D grounding for the representation, data and algorithms. First, we introduce a differentiable 3D representation that bridges neural field...
Yao Feng
Perceiving Systems
Talk
Yifan Wang
24-08-2023
SunStage: Portrait Reconstruction and Relighting using the Sun as a Light Stage
A light stage acquires the shape and material properties of a face in high detail using a series of images captured under synchronized cameras and lights. This captured information can be used to synthesize novel images of the subject under arbitrary lighting conditions or from arbitrary viewpoints. This process enables a number of visual effects, such as creating digital replicas of actors that can be used in movies or high-quality postproduction relighting. In many cases, however, it is often infeasible to get access to a light stage for capturing a particular subject, because light stage...
Yao Feng
Perceiving Systems
Talk
Claudia Gallatz
17-08-2023
Face Exploration - Capture all Degrees of Freedom of the Face
High-quality data capture is decisive for scientific work. As a member of the data team, it is a core task of my daily routine to ensure good quality standards in this field. My talk will shed light on the background of this work, starting from the scanner set-up and the corresponding data outcome, with a focus on the Face Scanner. This is work that every scientist can profit from for their own projects. I will take the occasion to present our most recent face capture study, named FACE EXPLORATION, for which Timo Bolkart is the lead scientist. A selection of representative sequences including facial m...
Yandong Wen
Perceiving Systems
Talk
Yangyi Huang
13-07-2023
Full-body avatars from single images and textual guidance
The reconstruction of full body appearance of clothed humans from single-view RGB images is a crucial yet challenging task, primarily due to depth ambiguities and the absence of observations from unseen regions. While existing methods have shown impressive results, they still suffer from limitations such as over-smooth surfaces and blurry textures, particularly lacking details at the backside of the avatar. In this talk, I will delve into how we have addressed these limitations by leveraging text guidance and pretrained text-image models, introducing two novel methods. Firstly, I will prese...
Hongwei Yi
Perceiving Systems
Talk
Bian Siyuan
13-04-2023
Pose, Kinematics, and Dynamics
Recovering accurate 3D human pose and shape from monocular input remains a challenging problem despite the rapid advancements powered by deep neural networks. Existing methods have limitations in achieving both robustness and mesh-image alignment, and the estimated pose suffers from physical artifacts such as foot sliding and body leaning. In this talk, we present two new methods to address these limitations. Firstly, we introduce NIKI, an inverse kinematics algorithm that utilizes an invertible neural network to model both the forward kinematics process and the inverse kinematics process. ...
Michael Black
Perceiving Systems
Talk
Lisa Dunlap
29-03-2023
Language is the key to robust vision systems
The ability to extend a model beyond the domain of the training data is central to building robust computer vision models. Methods for dealing with unseen test distributions or biased training data often require leveraging additional image data, but linguistic knowledge of the task and potential domain shifts is much cheaper and easier to obtain. In this talk, I will present three recent works that focus on different ways one can improve accuracy with language advice and incomplete training data via large-scale vision and language models.
Lea Müller
Perceiving Systems
Talk
Anurag Ranjan
23-02-2023
Neural Graphics in a Generative World
Recent years have seen significant advancements in deep learning, leading to a growing belief that Moore's law, which traditionally pertained to the packing of transistors, now applies to the improvement of photo-realistic 3D graphics. The advancements in this research field can be broadly categorized into two areas: neural fields, which are capable of modeling photo-realistic 3D representations, and diffusion models, which are able to generalize to large-scale data and produce photo-realistic images. To combine these technologies for large-scale 3D generative modeling, ...
Sai Kumar Dwivedi
Perceiving Systems
Talk
Xi Wang
16-02-2023
What do language models tell us about human-object interaction?
Research in artificial intelligence (AI) continues to advance quickly and outperforms humans in many tasks, making its way into our daily lives. However, beneath their superior performance, current technologies remain limited in how they perceive, process, and understand our visual world, and struggle to understand and interact with people. These issues raise the core question of my research: how do we build intelligent systems that can interact with people and offer assistance in a natural and seamless way? In this talk, I will present our recent works on using the CLIP model for object intera...
Muhammed Kocabas
Perceiving Systems
Talk
Mingyuan Zhang
19-01-2023
Human Motion Generation with Diffusion Models
Human motion modeling is important for many modern graphics applications, which typically require professional skills. In order to remove the skill barriers for laymen, recent motion generation methods can directly generate human motions conditioned on natural languages, speech, and music. However, it remains challenging to achieve diverse and fine-grained motion generation with comprehensive condition signals. Inspired by the success in image generation, recent works attempt to apply diffusion models to motion generation tasks (Motion Diffusion Models) and achieve impressive progress in as...
Shashank Tripathi
Perceiving Systems
Talk
Zhongang Cai
12-01-2023
Data Infrastructure for Scaling up Human Understanding and Modelling to the Real World
Human sensing and modelling are fundamental tasks in vision and graphics with numerous applications. However, due to the prohibitive cost, existing datasets are often limited in scale and diversity. This talk shares two of our recent works to tackle data scarcity. First, with the advances of new sensors and algorithms, paired data can be obtained from an inexpensive set-up and an automatic annotation pipeline. Specifically, we demonstrate the data collection solution by introducing HuMMan, a large-scale multimodal 4D human dataset. HuMMan has several appealing properties: 1) multimodal data...
Shashank Tripathi
Perceiving Systems
Talk
Yuge Shi
22-09-2022
Combine and conquer: representation learning from multiple data distributions
It is becoming less and less controversial to say that the days of learning representations through label supervision are over. Recent work has discovered that such regimes are not only expensive but also suffer from various generalisation/robustness issues. This is somewhat unsurprising, as perceptual data (vision, language) are rich and cannot be well represented by a single label --- doing so inevitably results in the model learning spurious features that trivially correlate with the label.
In this talk, I will introduce my work during my PhD at Oxford, which looks at representation learning...
Yao Feng
Perceiving Systems
Talk
Alejandro Pardo
08-09-2022
Computer Vision for Automated Video Editing and Understanding
Video content creation has boomed in recent years. Every day, hundreds of thousands of hours of video are uploaded to the internet. Thus, video editing has become more popular and accessible to amateur users. However, Computer Vision (CV) research has paid little attention to technologies that make video editing a less tedious task. Currently, editors spend hours cutting and stitching videos to deliver final edited videos that convey stories. This cutting process is creative but often repetitive. With the recent advances in CV, one would expect that a system could learn some cutti...
Hongwei Yi
Perceiving Systems
Talk
Xucong Zhang
11-08-2022
Gaze Estimation and Its Application
Human eye gaze is an important non-verbal cue for estimating a user's attention and intention. In this talk, I will present our work over the past few years on appearance-based gaze estimation, which takes input frames from a single webcam. I will start with an introduction to our work proposing new datasets, such as the MPIIFaceGaze, ETH-XGaze, and EVE datasets. Then I will introduce methods such as GazeNet, full-face gaze estimation, and gaze redirection. Finally, I will briefly introduce applications of our method in real-world settings.
Xu Chen
Perceiving Systems
Talk
Zenghao Chai
04-08-2022
REALY: Rethinking the Evaluation of 3D Face Reconstruction
The evaluation of 3D face reconstruction results typically relies on a rigid shape alignment between the estimated 3D model and the ground-truth scan. We observe that aligning two shapes with different reference points can largely affect the evaluation results. This poses difficulties for precisely diagnosing and improving a 3D face reconstruction method. In this paper, we propose a novel evaluation approach with a new benchmark REALY, consisting of 100 globally aligned face scans with accurate facial keypoints, high-quality region masks, and topology-consistent meshes. Our approach perform...
Yandong Wen
Perceiving Systems
Talk
Lingchen Yang
28-07-2022
Implicit Neural Representation for Physics-driven Actuated Soft Bodies
Active soft bodies can affect their shape through an internal actuation mechanism that induces a deformation. Similar to recent work, this paper utilizes a differentiable, quasi-static, and physics-based simulation layer to optimize for actuation signals parameterized by neural networks.
Our key contribution is a general and implicit formulation to control active soft bodies by defining a function that enables a continuous mapping from a spatial point in the material space to the actuation value. This property allows us to capture the signal's dominant frequencies, making the method discre...
Yao Feng
Perceiving Systems
Talk
Supreeth Narasimhaswamy
28-07-2022
Understanding Human Hands in Visual Data
Hands are the central means by which humans interact with their surroundings. Understanding human hands helps human behavior analysis and facilitates other visual analysis tasks such as action and gesture recognition. Recently, there has been a surge of interest in understanding first-person visual data, and hands are the dominant interaction entities in such activities. There is also an explosion of interest in developing computer vision methods for augmented and virtual reality. To deliver an authentic augmented and virtual reality experience, we need to enable humans to interact with the ...
Sai Kumar Dwivedi
Dimitris Tzionas
Perceiving Systems
Talk
Michael Zollhoefer
27-07-2022
Complete Codec Telepresence
Imagine two people, each of them within their own home, being able to communicate and interact virtually with each other as if they are both present in the same shared physical space. Enabling such an experience, i.e., building a telepresence system that is indistinguishable from reality, is one of the goals of Reality Labs Research (RLR) in Pittsburgh. To this end, we develop key technology that combines fundamental computer vision, machine learning, and graphics techniques based on a novel neural reconstruction and rendering paradigm. In this talk, I will cover our advances towards a neur...
Yao Feng
Perceiving Systems
Talk
Rana Hanocka
13-06-2022
Shape editing, generation, and stylization
Manual authoring of 3D content is a laborious and tedious task. In this talk, I present some of 3DL's recent and ongoing efforts toward building tools that provide intuitive control for editing, manipulating, and generating 3D shapes. I will discuss how recent advancements, such as joint vision-language embedding spaces, can be used to stylize 3D objects driven by natural language. Finally, I will conclude with ongoing and future work in this direction, as well as other related areas.
Omid Taheri
Perceiving Systems
Talk
Mohammed Hassan
13-06-2022
Synthesizing Physical Character-Scene Interactions
Movement is how people interact with and affect their environment. For realistic virtual character animation, it is necessary to realistically synthesize such interactions between virtual characters and their surroundings. Despite recent progress in character animation using machine learning, most systems focus on controlling an agent's movements in fairly simple and homogeneous environments, with limited interactions with other objects. Furthermore, many previous approaches that synthesize human-scene interaction require significant manual labeling of the training data. In contrast, we pre...
Nikos Athanasiou
Perceiving Systems
Talk
Youngjoong Kwon
09-06-2022
Learning to create Digital Humans: Generalizable Radiance Fields for Human Performance Rendering
In this work, we aim at synthesizing a free-viewpoint video of an arbitrary human performance using sparse multi-view cameras. Recently, several works have addressed this problem by learning person-specific neural radiance fields (NeRF) to capture the appearance of a particular human. In parallel, some works have proposed using pixel-aligned features to generalize radiance fields to arbitrary new scenes and objects. Adopting such generalization approaches for humans, however, is highly challenging due to the heavy occlusions and dynamic articulations of body parts. To tackle this, we propose a no...
Yuliang Xiu
Perceiving Systems
Talk
Tianye Li
07-06-2022
Reconstruction and Synthesis for Dynamic Humans and Scenes
This thesis focuses on automated systems to capture realistic 4D visual content for general humans and scenes, such that we can animate and replay the captured content. Firstly, we design a system to reconstruct and register a large quantity of high-quality 4D faces across identities, expressions, and poses by utilizing geometric, photometric, and motion cues. Based on well-curated datasets we propose a lightweight yet expressive face model that works on a wide range of populations, by separately modeling the shape (identity), expression, and poses of human faces. Secondly, we design an inf...
Nikos Athanasiou
Timo Bolkart
Perceiving Systems
Talk
Jiashi Feng
02-05-2022
Learning to estimate 3D human poses without labeled data
Estimating 3D human poses from images or videos is a fundamental task in computer vision. However, the limited availability of training data with high-quality 3D pose annotations largely hinders its development and deployment in real applications. In this talk, I will introduce our recent works on training 3D pose estimation models without requiring 3D labeled data. Our first step is to present PoseAug, a new auto-augmentation framework that learns to augment the available training poses towards greater diversity and thus improve the generalization of the trained 2D-to-3D pose estimator. Specifically, P...
Michael Black
Perceiving Systems
Talk
Lixin Yang
25-04-2022
Leverage Kinematic and Contact constraints for understanding hand-object interaction
My work focuses on inferring and understanding the human hand’s interaction with objects from visual inputs, which includes several tasks such as pose estimation, grasping pose generation, and interacting pose transfer. Unlike the single-body pose estimation task, understanding hand-object (multi-body) interactions in 3D space is more challenging, due to the high degree of articulation, projection ambiguity, self- or mutual occlusions, and complicated physical constraints. Designing algorithms to tackle these challenges is my goal. We find that mutual contact can provide rich ...
Yuliang Xiu