Perceiving Systems Conference Paper 2021

Learning to Disambiguate Strongly Interacting Hands via Probabilistic Per-pixel Part Segmentation

Zicong (Alex) Fan, Guest Scientist, Perceiving Systems

Muhammed Kocabas, Guest Scientist, Perceiving Systems

Siyu Tang, Guest Scientist, Perceiving Systems

Michael Black, Emeritus / Acting Director, Perceiving Systems

In natural conversation and interaction, our hands often overlap or are in contact with each other. Due to the homogeneous appearance of hands, this makes estimating the 3D pose of interacting hands from images difficult. In this paper we demonstrate that self-similarity, and the resulting ambiguities in assigning pixel observations to the respective hands and their parts, is a major cause of the final 3D pose error. Motivated by this insight, we propose DIGIT, a novel method for estimating the 3D poses of two interacting hands from a single monocular image. The method consists of two interwoven branches that process the input imagery into a per-pixel semantic part segmentation mask and a visual feature volume. In contrast to prior work, we do not decouple the segmentation from the pose estimation stage, but rather leverage the per-pixel probabilities directly in the downstream pose estimation task. To do so, the part probabilities are merged with the visual features and processed via fully-convolutional layers. We experimentally show that the proposed approach achieves new state-of-the-art performance on the InterHand2.6M dataset for both single and interacting hands across all metrics. We provide detailed ablation studies to demonstrate the efficacy of our method and to provide insights into how the modelling of pixel ownership affects single and interacting hand pose estimation. Our code will be released for research purposes.
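
The fusion step described in the abstract, where per-pixel part probabilities are merged with the visual features and passed to convolutional layers, can be illustrated with a minimal sketch. This is not the DIGIT implementation: it assumes that "merged" means channel-wise concatenation, it stands in a single 1x1 convolution for the fully-convolutional layers, and every name and size in it (`fuse_part_probabilities`, `conv1x1`, 4 feature channels, 3 part classes) is hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over one pixel's part logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_part_probabilities(features, seg_logits):
    """Append per-pixel part probabilities to the visual features.

    features:   H x W x C_feat nested lists (visual feature volume)
    seg_logits: H x W x C_part nested lists (segmentation logits)
    returns:    H x W x (C_feat + C_part) nested lists

    Assumption: merging is done by channel-wise concatenation.
    """
    return [
        [feat + softmax(logit) for feat, logit in zip(feat_row, logit_row)]
        for feat_row, logit_row in zip(features, seg_logits)
    ]


def conv1x1(volume, weights, bias):
    """Apply the same linear map (C_out x C_in weights) at every pixel."""
    return [
        [
            [sum(w * x for w, x in zip(w_row, pixel)) + b
             for w_row, b in zip(weights, bias)]
            for pixel in row
        ]
        for row in volume
    ]


# Hypothetical sizes, for illustration only.
H, W, C_FEAT, C_PART, C_OUT = 2, 3, 4, 3, 5
rng = random.Random(0)

features = [[[rng.uniform(-1, 1) for _ in range(C_FEAT)]
             for _ in range(W)] for _ in range(H)]
seg_logits = [[[rng.uniform(-1, 1) for _ in range(C_PART)]
               for _ in range(W)] for _ in range(H)]
weights = [[rng.uniform(-1, 1) for _ in range(C_FEAT + C_PART)]
           for _ in range(C_OUT)]
bias = [0.0] * C_OUT

fused = fuse_part_probabilities(features, seg_logits)
out = conv1x1(fused, weights, bias)
print(len(fused[0][0]), len(out[0][0]))
```

Because the soft probabilities are kept rather than replaced by a hard argmax mask, a pixel that is ambiguous between the two hands still passes that uncertainty on to the layers that follow, which is the coupling the abstract contrasts with prior two-stage approaches.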

Author(s):	Zicong Fan and Adrian Spurr and Muhammed Kocabas and Siyu Tang and Michael J. Black and Otmar Hilliges
Links:	arXiv project code video
Book Title:	2021 International Conference on 3D Vision (3DV 2021)
Pages:	1--10
Year:	2021
Month:	December
Publisher:	IEEE

Project(s):	Hands-Object Interaction
BibTeX Type:	Conference Paper (conference)

Address:	Piscataway, NJ
DOI:	10.1109/3DV53792.2021.00011
Event Name:	International Conference on 3D Vision (3DV 2021)
Event Place:	Virtual
State:	Published

Electronic Archiving:	grant_archive
ISBN:	978-1-6654-2688-6

BibTeX

@conference{Fan:3DV:2021,
  title = {Learning to Disambiguate Strongly Interacting Hands via Probabilistic Per-pixel Part Segmentation},
  booktitle = {2021 International Conference on 3D Vision (3DV 2021)},
  abstract = {In natural conversation and interaction, our hands often overlap or are in contact with each other. Due to the homogeneous appearance of hands, this makes estimating the 3D pose of interacting hands from images difficult. In this paper we demonstrate that self-similarity, and the resulting ambiguities in assigning pixel observations to the respective hands and their parts, is a major cause of the final 3D pose error. Motivated by this insight, we propose DIGIT, a novel method for estimating the 3D poses of two interacting hands from a single monocular image. The method consists of two interwoven branches that process the input imagery into a per-pixel semantic part segmentation mask and a visual feature volume. In contrast to prior work, we do not decouple the segmentation from the pose estimation stage, but rather leverage the per-pixel probabilities directly in the downstream pose estimation task. To do so, the part probabilities are merged with the visual features and processed via fully-convolutional layers. We experimentally show that the proposed approach achieves new state-of-the-art performance on the InterHand2.6M dataset for both single and interacting hands across all metrics. We provide detailed ablation studies to demonstrate the efficacy of our method and to provide insights into how the modelling of pixel ownership affects single and interacting hand pose estimation. Our code will be released for research purposes.},
  pages = {1--10},
  publisher = {IEEE},
  address = {Piscataway, NJ},
  month = dec,
  year = {2021},
  author = {Fan, Zicong and Spurr, Adrian and Kocabas, Muhammed and Tang, Siyu and Black, Michael J. and Hilliges, Otmar},
  doi = {10.1109/3DV53792.2021.00011},
  month_numeric = {12}
}
