Perceiving Systems Conference Paper 2015

Human Pose as Context for Object Detection

Detecting small objects in images is a challenging problem, particularly when they are occluded by hands or other body parts. Recently, joint modelling of human pose and objects has been proposed to improve both pose estimation and object detection. These approaches, however, focus on explicit interaction with an object and lack the flexibility to combine both modalities when interaction is not obvious. We therefore propose to use human pose as additional context information for object detection. To this end, we represent an object category by a tree model and train regression forests that localize parts of an object for each modality separately. Predictions of the two modalities are then combined to detect the bounding box of the object. We evaluate our approach on three challenging datasets that vary in the amount of object interaction and the quality of automatically extracted human poses.
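
The combination step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it leaves out the tree model and the regression forests and only shows one plausible way to fuse part-location votes from two modalities (appearance and pose) before deriving a bounding box. The function names, the coarse voting grid, the weighted-sum fusion rule, and the `alpha` weight are all assumptions made for illustration.

```python
from collections import defaultdict


def vote_map(votes, cell=8):
    """Accumulate (x, y, weight) votes into a coarse grid, normalized to sum to 1."""
    grid = defaultdict(float)
    for x, y, w in votes:
        grid[(int(x) // cell, int(y) // cell)] += w
    total = sum(grid.values())
    if total <= 0:
        return {}
    return {key: value / total for key, value in grid.items()}


def combine(appearance, pose, alpha=0.5):
    """Fuse two normalized vote maps by a weighted sum (an assumed fusion rule)."""
    cells = set(appearance) | set(pose)
    return {
        c: alpha * appearance.get(c, 0.0) + (1.0 - alpha) * pose.get(c, 0.0)
        for c in cells
    }


def localize_part(appearance_votes, pose_votes, cell=8, alpha=0.5):
    """Return ((x, y), score) for the best fused cell, or None if there are no votes."""
    fused = combine(vote_map(appearance_votes, cell), vote_map(pose_votes, cell), alpha)
    if not fused:
        return None
    best = max(fused, key=fused.get)
    center = ((best[0] + 0.5) * cell, (best[1] + 0.5) * cell)
    return center, fused[best]


def bounding_box(part_locations):
    """Axis-aligned box (x_min, y_min, x_max, y_max) enclosing the localized parts."""
    xs = [p[0] for p in part_locations]
    ys = [p[1] for p in part_locations]
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical votes: appearance alone is ambiguous, pose resolves the ambiguity.
appearance_votes = [(10, 10, 1.0), (50, 50, 1.0)]
pose_votes = [(52, 49, 1.0)]
location, score = localize_part(appearance_votes, pose_votes)
box = bounding_box([location, (20.0, 60.0)])
```

In this toy input the appearance votes are split evenly between two grid cells, and the pose vote tips the fused score toward the cell that both modalities support. That is the intuition behind using pose as context when the object itself is small or occluded.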

Author(s): Abhilash Srikantha and Juergen Gall
Book Title: British Machine Vision Conference
Year: 2015
Month: September
Bibtex Type: Conference Paper (conference)
Event Name: British Machine Vision Conference
Event Place: Swansea, United Kingdom
Electronic Archiving: grant_archive

BibTeX

@conference{Srik:BMVC:2015,
  title = {Human Pose as Context for Object Detection},
  booktitle = {British Machine Vision Conference},
  abstract = {Detecting small objects in images is a challenging problem, particularly when they are occluded by hands or other body parts.
  Recently, joint modelling of human pose and objects has been proposed to improve both pose estimation and object detection.
  These approaches, however, focus on explicit interaction with an object and lack the flexibility to combine both modalities when interaction is not obvious.
  We therefore propose to use human pose as additional context information for object detection.
  To this end, we represent an object category by a tree model and train regression forests that localize parts of an object for each modality separately.
  Predictions of the two modalities are then combined to detect the bounding box of the object.
  We evaluate our approach on three challenging datasets that vary in the amount of object interaction and the quality of automatically extracted human poses.},
  month = sep,
  year = {2015},
  slug = {srik-bmvc-2015},
  author = {Srikantha, Abhilash and Gall, Juergen},
  month_numeric = {9}
}