Perceiving Systems Conference Paper 2025

PICO: Reconstructing 3D People In Contact with Objects


Recovering 3D Human-Object Interaction (HOI) from single color images is challenging due to depth ambiguities, occlusions, and the huge variation in object shape and appearance. Thus, past work requires controlled settings such as known object shapes and contacts, and tackles only limited object classes. Instead, we need methods that generalize to natural images and novel object classes. We tackle this in two main ways: (1) We collect PICO-db, a new dataset of natural images uniquely paired with dense 3D contact on both body and object meshes. To this end, we use images from the recent DAMON dataset that are paired with contacts, but these contacts are only annotated on a canonical 3D body. In contrast, we seek contact labels on both the body and the object. To infer these given an image, we retrieve an appropriate 3D object mesh from a database by leveraging vision foundation models. Then, we project DAMON's body contact patches onto the object via a novel method needing only 2 clicks per patch. This minimal human input establishes rich contact correspondences between bodies and objects. (2) We exploit our new dataset of contact correspondences in a novel render-and-compare fitting method, called PICO-fit, to recover 3D body and object meshes in interaction. PICO-fit infers contact for the SMPL-X body, retrieves a likely 3D object mesh and contact from PICO-db for that object, and uses the contact to iteratively fit the 3D body and object meshes to image evidence via optimization. Uniquely, PICO-fit works well for many object categories that no existing method can tackle. This is crucial to enable HOI understanding to scale in the wild.
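As a rough illustration of how contact correspondences can drive an iterative fit, the sketch below minimizes the distance between paired body and object contact points by gradient descent. It is not PICO-fit itself: the abstract describes optimization of full 3D body and object meshes against image evidence, whereas this toy example assumes the contact pairs are already given and optimizes only a 3D object translation. The names `contact_loss` and `fit_translation` and the step-size settings are hypothetical and chosen for illustration only.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def contact_loss(body_pts: Sequence[Point], obj_pts: Sequence[Point],
                 t: Sequence[float]) -> float:
    """Mean squared distance between body contacts and translated object contacts."""
    total = 0.0
    for b, o in zip(body_pts, obj_pts):
        total += sum((b[k] - (o[k] + t[k])) ** 2 for k in range(3))
    return total / len(body_pts)


def fit_translation(body_pts: Sequence[Point], obj_pts: Sequence[Point],
                    steps: int = 200, lr: float = 0.1) -> List[float]:
    """Toy contact-driven fit (illustrative assumption, not the paper's method).

    Runs plain gradient descent on an object translation t so that each
    object contact point moves toward its paired body contact point.
    """
    t = [0.0, 0.0, 0.0]
    n = len(body_pts)
    for _ in range(steps):
        # Gradient of the mean squared distance with respect to t.
        grad = [0.0, 0.0, 0.0]
        for b, o in zip(body_pts, obj_pts):
            for k in range(3):
                grad[k] += -2.0 * (b[k] - (o[k] + t[k])) / n
        t = [t[k] - lr * grad[k] for k in range(3)]
    return t


if __name__ == "__main__":
    # Hypothetical paired contacts: every object point is offset by the same vector.
    body = [(1.0, 2.0, 3.0), (2.0, 3.0, 4.0)]
    obj = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
    t = fit_translation(body, obj)
    print("translation:", t)
    print("loss:", contact_loss(body, obj, t))
```

A translation-only objective has a closed-form solution (the mean offset between the paired points), so the loop is shown purely to mirror the iterative character of the fitting described above. A realistic setup would also optimize rotation, scale, and body pose, and would combine the contact term with image-based terms.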

Author(s): Alpár Cseke and Shashank Tripathi and Sai Kumar Dwivedi and Arjun S. Lakshmipathy and Agniv Chatterjee and Michael J. Black and Dimitrios Tzionas
Book Title: IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR)
Year: 2025
Month: June
Day: 13
Bibtex Type: Conference Paper (inproceedings)
Event Place: Nashville, TN
State: Published

BibTeX

@inproceedings{PICO:2025,
  title = {{PICO}: Reconstructing {3D} People In Contact with Objects},
  booktitle = {IEEE/CVF Conf.~on Computer Vision and Pattern Recognition (CVPR)},
  abstract = {Recovering 3D Human-Object Interaction (HOI) from single color images is challenging due to depth ambiguities, occlusions, and the huge variation in object shape and appearance. Thus, past work requires controlled settings such as known object shapes and contacts, and tackles only limited object classes. Instead, we need methods that generalize to natural images and novel object classes. We tackle this in two main ways: (1) We collect PICO-db, a new dataset of natural images uniquely paired with dense 3D contact on both body and object meshes. To this end, we use images from the recent DAMON dataset that are paired with contacts, but these contacts are only annotated on a canonical 3D body. In contrast, we seek contact labels on both the body and the object. To infer these given an image, we retrieve an appropriate 3D object mesh from a database by leveraging vision foundation models. Then, we project DAMON's body contact patches onto the object via a novel method needing only 2 clicks per patch. This minimal human input establishes rich contact correspondences between bodies and objects. (2) We exploit our new dataset of contact correspondences in a novel render-and-compare fitting method, called PICO-fit, to recover 3D body and object meshes in interaction. PICO-fit infers contact for the SMPL-X body, retrieves a likely 3D object mesh and contact from PICO-db for that object, and uses the contact to iteratively fit the 3D body and object meshes to image evidence via optimization. Uniquely, PICO-fit works well for many object categories that no existing method can tackle. This is crucial to enable HOI understanding to scale in the wild. },
  month = jun,
  year = {2025},
  slug = {pico-2025},
  author = {Cseke, Alpár and Tripathi, Shashank and Dwivedi, Sai Kumar and Lakshmipathy, Arjun S. and Chatterjee, Agniv and Black, Michael J. and Tzionas, Dimitrios},
  month_numeric = {6}
}