Capturing Hands in Action using Discriminative Salient Points and Physics Simulation
Hand motion capture is a popular research field, recently gaining more attention due to the ubiquity of RGB-D sensors. However, even the most recent approaches focus on the case of a single isolated hand. In this work, we focus on hands that interact with other hands or objects and present a framework that successfully captures motion in such interaction scenarios for both rigid and articulated objects. Our framework combines a generative model with discriminatively trained salient points to achieve a low tracking error, and with collision detection and physics simulation to achieve physically plausible estimates even in the case of occlusions and missing visual data. Since all components are unified in a single objective function that is almost everywhere differentiable, it can be optimized with standard optimization techniques. Our approach works for monocular RGB-D sequences as well as setups with multiple synchronized RGB cameras. For a qualitative and quantitative evaluation, we captured 29 sequences with a large variety of interactions and up to 150 degrees of freedom.
| Author(s): | Dimitrios Tzionas and Luca Ballan and Abhilash Srikantha and Pablo Aponte and Marc Pollefeys and Juergen Gall |
| Journal: | International Journal of Computer Vision (IJCV) |
| Volume: | 118 |
| Number (issue): | 2 |
| Pages: | 172--193 |
| Year: | 2016 |
| Month: | June |
| BibTeX Type: | Article (article) |
| DOI: | 10.1007/s11263-016-0895-4 |
| State: | Published |
| URL: | https://doi.org/10.1007/s11263-016-0895-4 |
BibTeX
@article{Tzionas:IJCV:2016,
title = {Capturing Hands in Action using Discriminative Salient Points and Physics Simulation},
journal = {International Journal of Computer Vision (IJCV)},
abstract = {Hand motion capture is a popular research field, recently gaining more attention due to the ubiquity of RGB-D sensors. However, even the most recent approaches focus on the case of a single isolated hand. In this work, we focus on hands that interact with other hands or objects and present a framework that successfully captures motion in such interaction scenarios for both rigid and articulated objects. Our framework combines a generative model with discriminatively trained salient points to achieve a low tracking error, and with collision detection and physics simulation to achieve physically plausible estimates even in the case of occlusions and missing visual data. Since all components are unified in a single objective function that is almost everywhere differentiable, it can be optimized with standard optimization techniques. Our approach works for monocular RGB-D sequences as well as setups with multiple synchronized RGB cameras. For a qualitative and quantitative evaluation, we captured 29 sequences with a large variety of interactions and up to 150 degrees of freedom.},
volume = {118},
number = {2},
pages = {172--193},
month = jun,
year = {2016},
author = {Tzionas, Dimitrios and Ballan, Luca and Srikantha, Abhilash and Aponte, Pablo and Pollefeys, Marc and Gall, Juergen},
doi = {10.1007/s11263-016-0895-4},
url = {https://doi.org/10.1007/s11263-016-0895-4},
}
