Estimating human pose, shape, and motion from images and videos are fundamental challenges with many applications. Recent advances in 2D human pose estimation use large amounts of manually-labeled training data for learning convolutional neural networks (CNNs). Such data is time consuming to acquire and difficult to extend. Moreover, manual labeling of 3D pose, depth and motion is impractical. In this work we present SURREAL (Synthetic hUmans foR REAL tasks): a new large-scale dataset with synthetically-generated but realistic images of people rendered from 3D sequences of human motion capture data. We generate more than 6 million frames together with ground truth pose, depth maps, and segmentation masks. We show that CNNs trained on our synthetic dataset allow for accurate human depth estimation and human part segmentation in real RGB images. Our results and the new dataset open up new possibilities for advancing person analysis using cheap and large-scale synthetic data.

Author(s): Varol, Gül and Romero, Javier and Martin, Xavier and Mahmood, Naureen and Black, Michael J. and Laptev, Ivan and Schmid, Cordelia
Book Title: Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017
Pages: 4627-4635
Year: 2017
Month: July
Day: 21-26
Publisher: IEEE
Bibtex Type: Conference Paper (inproceedings)
Address: Piscataway, NJ, USA
Event Name: IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Event Place: Honolulu, HI, USA
ISBN: 978-1-5386-0457-1
ISSN: 1063-6919

BibTeX

@inproceedings{Varol:CVPR:2017,
  title = {Learning from Synthetic Humans},
  booktitle = {Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017},
  abstract = {Estimating human pose, shape, and motion from images and videos are fundamental challenges with many applications. Recent advances in 2D human pose estimation use large amounts of manually-labeled training data for learning convolutional neural networks (CNNs). Such data is time consuming to acquire and difficult to extend. Moreover, manual labeling of 3D pose, depth and motion is impractical. In this work we present SURREAL (Synthetic hUmans foR REAL tasks): a new large-scale dataset with synthetically-generated but realistic images of people rendered from 3D sequences of human motion capture data. We generate more than 6 million frames together with ground truth pose, depth maps, and segmentation masks.  We show that CNNs trained on our synthetic dataset allow for accurate human depth estimation and human part segmentation in real RGB images. Our results and the new dataset open up new possibilities for advancing person analysis using cheap and large-scale synthetic data.
  },
  pages = {4627--4635},
  publisher = {IEEE},
  address = {Piscataway, NJ, USA},
  month = jul,
  year = {2017},
  slug = {varol-cvpr-2017},
  author = {Varol, G{\"u}l and Romero, Javier and Martin, Xavier and Mahmood, Naureen and Black, Michael J. and Laptev, Ivan and Schmid, Cordelia},
  month_numeric = {7}
}