Occluded Human Mesh Recovery

project arXiv

Perceiving Systems

Shashank Tripathi

Postdoctoral Researcher

Top-down methods for monocular human mesh recovery have two stages: (1) detect human bounding boxes; (2) treat each bounding box as an independent single-human mesh recovery task. Unfortunately, the single-human assumption does not hold in images with multi-human occlusion and crowding. Consequently, top-down methods have difficulties in recovering accurate 3D human meshes under severe person-person occlusion. To address this, we present Occluded Human Mesh Recovery (OCHMR) - a novel top-down mesh recovery approach that incorporates image spatial context to overcome the limitations of the single-human assumption. The approach is conceptually simple and can be applied to any existing top-down architecture. Along with the input image, we condition the top-down model on spatial context from the image in the form of body-center heatmaps. To reason from the predicted body centermaps, we introduce Contextual Normalization (CoNorm) blocks to adaptively modulate intermediate features of the top-down model. The contextual conditioning helps our model disambiguate between two severely overlapping human bounding-boxes, making it robust to multi-person occlusion. Compared with state-of-the-art methods, OCHMR achieves superior performance on challenging multi-person benchmarks like 3DPW, CrowdPose and OCHuman. Specifically, our proposed contextual reasoning architecture applied to the SPIN model with ResNet-50 backbone results in 75.2 PMPJPE on 3DPW-PC, 23.6 AP on CrowdPose and 37.7 AP on OCHuman datasets, a significant improvement of 6.9 mm, 6.4 AP and 20.8 AP respectively over the baseline. Code and models will be released.

Author(s):	Rawal Khirodkar and Shashank Tripathi and Kris Kitani
Links:	project arXiv
Book Title:	2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022)
Pages:	1705--1715
Year:	2022
Month:	June
Publisher:	IEEE

BibTeX Type:	Conference Paper (inproceedings)

Address:	Piscataway, NJ
DOI:	10.1109/CVPR52688.2022.00176
Event Name:	IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022)
Event Place:	New Orleans, LA
State:	Published

Electronic Archiving:	grant_archive
ISBN:	78-1-6654-6947-0

BibTeX

@inproceedings{khirodkar_ochmr_2022,
  title = {Occluded Human Mesh Recovery},
  booktitle = {2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022)},
  abstract = {Top-down methods for monocular human mesh recovery have two stages: (1) detect human bounding boxes; (2) treat each bounding box as an independent single-human mesh recovery task. Unfortunately, the single-human assumption does not hold in images with multi-human occlusion and crowding. Consequently, top-down methods have difficulties in recovering accurate 3D human meshes under severe person-person occlusion. To address this, we present Occluded Human Mesh Recovery (OCHMR) - a novel top-down mesh recovery approach that incorporates image spatial context to overcome the limitations of the single-human assumption. The approach is conceptually simple and can be applied to any existing top-down architecture. Along with the input image, we condition the top-down model on spatial context from the image in the form of body-center heatmaps. To reason from the predicted body centermaps, we introduce Contextual Normalization (CoNorm) blocks to adaptively modulate intermediate features of the top-down model. The contextual conditioning helps our model disambiguate between two severely overlapping human bounding-boxes, making it robust to multi-person occlusion. Compared with state-of-the-art methods, OCHMR achieves superior performance on challenging multi-person benchmarks like 3DPW, CrowdPose and OCHuman. Specifically, our proposed contextual reasoning architecture applied to the SPIN model with ResNet-50 backbone results in 75.2 PMPJPE on 3DPW-PC, 23.6 AP on CrowdPose and 37.7 AP on OCHuman datasets, a significant improvement of 6.9 mm, 6.4 AP and 20.8 AP respectively over the baseline. Code and models will be released.},
  pages = {1705--1715},
  publisher = {IEEE},
  address = {Piscataway, NJ},
  month = jun,
  year = {2022},
  author = {Khirodkar, Rawal and Tripathi, Shashank and Kitani, Kris},
  doi = {10.1109/CVPR52688.2022.00176},
  month_numeric = {6}
}