Publications

Perceiving Systems Empirical Inference Conference Paper Human Pose Estimation with Fields of Parts Kiefel, M., Gehler, P. In Computer Vision – ECCV 2014, LNCS 8693:331-346, Lecture Notes in Computer Science, (Editors: Fleet, David and Pajdla, Tomas and Schiele, Bernt and Tuytelaars, Tinne), Springer, 13th European Conference on Computer Vision, September 2014
This paper proposes a new formulation of the human pose estimation problem. We present the Fields of Parts model, a binary Conditional Random Field model designed to detect human body parts of articulated people in single images. The Fields of Parts model is inspired by the idea of Pictorial Structures: it models local appearance and joint spatial configuration of the human body. However, the underlying graph structure is entirely different. The idea is simple: we model the presence and absence of a body part at every possible position, orientation, and scale in an image with a binary random variable. This results in a vast number of random variables; however, we show that approximate inference in this model is efficient. Moreover, we can encode the very same appearance and spatial structure as in Pictorial Structures models. This approach allows us to combine ideas from segmentation and pose estimation into a single model. The Fields of Parts model can use evidence from the background, include local color information, and it is connected more densely than a kinematic chain structure. On the challenging Leeds Sports Poses dataset we improve over the Pictorial Structures counterpart by 5.5% in terms of Average Precision of Keypoints (APK).
website pdf DOI BibTeX
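To get a feel for the "vast number of random variables" mentioned in the Fields of Parts abstract, the sketch below counts one binary variable per body part, position, orientation, and scale. The function name, the grid stride, the image size, and the numbers of parts, orientations, and scales are all illustrative assumptions, not values taken from the paper.

```python
def num_binary_variables(height, width, stride, n_orientations, n_scales, n_parts):
    """Count binary variables: one per (part, position, orientation, scale).

    Positions are sampled on a regular grid with the given stride.
    All arguments are hypothetical discretization choices.
    """
    rows = len(range(0, height, stride))  # grid rows
    cols = len(range(0, width, stride))   # grid columns
    return rows * cols * n_orientations * n_scales * n_parts


# Hypothetical setting: 240x320 image, stride 4, 24 orientations, 3 scales, 14 parts.
print(num_binary_variables(240, 320, 4, 24, 3, 14))  # prints 4838400
```

Even this modest discretization yields several million variables, which is why efficient approximate inference matters for a model of this kind.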

Perceiving Systems Conference Paper Human Pose Estimation: New Benchmark and State of the Art Analysis Andriluka, M., Pishchulin, L., Gehler, P., Schiele, B. In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 3686 - 3693, IEEE, IEEE International Conference on Computer Vision and Pattern Recognition, June 2014 pdf DOI BibTeX

Perceiving Systems Conference Paper Preserving Modes and Messages via Diverse Particle Selection Pacheco, J., Zuffi, S., Black, M. J., Sudderth, E. In Proceedings of the 31st International Conference on Machine Learning (ICML-14), 32(1):1152-1160, JMLR Workshop and Conference Proceedings, Beijing, China, International Conference on Machine Learning (ICML), June 2014
In applications of graphical models arising in domains such as computer vision and signal processing, we often seek the most likely configurations of high-dimensional, continuous variables. We develop a particle-based max-product algorithm which maintains a diverse set of posterior mode hypotheses, and is robust to initialization. At each iteration, the set of hypotheses at each node is augmented via stochastic proposals, and then reduced via an efficient selection algorithm. The integer program underlying our optimization-based particle selection minimizes errors in subsequent max-product message updates. This objective automatically encourages diversity in the maintained hypotheses, without requiring tuning of application-specific distances among hypotheses. By avoiding the stochastic resampling steps underlying particle sum-product algorithms, we also avoid common degeneracies where particles collapse onto a single hypothesis. Our approach significantly outperforms previous particle-based algorithms in experiments focusing on the estimation of human pose from single images.
pdf SupMat BibTeX
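The selection step described in this abstract can be illustrated with a simplified sketch. It assumes a table `scores[a][b]` of log-scores pairing particle `a` at one node with particle `b` at a neighbor, and greedily keeps the `k` particles whose max-product message stays closest to the message computed from all particles. The greedy rule, the function name, and the toy numbers are assumptions for illustration; the paper formulates the selection as an integer program, which this sketch does not reproduce.

```python
def greedy_select(scores, k):
    """Greedily pick k rows (particles) of `scores` so that the column-wise
    maximum over the kept rows stays close to the maximum over all rows.

    Returns (chosen_row_indices, worst_case_message_error). Requires k <= len(scores).
    """
    n_cols = len(scores[0])
    # Message computed from every particle: column-wise maximum.
    full = [max(row[b] for row in scores) for b in range(n_cols)]
    current = [float("-inf")] * n_cols  # message from the kept particles so far
    chosen = []
    for _ in range(k):
        best_a, best_err = None, float("inf")
        for a, row in enumerate(scores):
            if a in chosen:
                continue
            cand = [max(c, r) for c, r in zip(current, row)]
            err = max(f - c for f, c in zip(full, cand))  # worst-case message error
            if err < best_err:
                best_a, best_err = a, err
        chosen.append(best_a)
        current = [max(c, r) for c, r in zip(current, scores[best_a])]
    error = max(f - c for f, c in zip(full, current))
    return chosen, error


# Toy example: particles 0 and 1 are near-duplicates; particle 2 explains a different mode.
scores = [[0.0, -5.0], [-0.1, -5.1], [-5.0, -1.0]]
chosen, error = greedy_select(scores, 2)
print(chosen, error)  # prints [0, 2] 0.0
```

Ranking particles by their individual best score would keep the two near-duplicates (0 and 1). Minimizing the message error instead keeps particle 2, which shows how an objective on message errors can encourage diversity without a hand-tuned distance between hypotheses.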

Perceiving Systems Conference Paper Strong Appearance and Expressive Spatial Models for Human Pose Estimation Pishchulin, L., Andriluka, M., Gehler, P., Schiele, B. In International Conference on Computer Vision (ICCV), 3487-3494, IEEE, 2013 IEEE International Conference on Computer Vision (ICCV), December 2013
Typical approaches to articulated pose estimation combine spatial modelling of the human body with appearance modelling of body parts. This paper aims to push the state of the art in articulated pose estimation in two ways. First, we explore various types of appearance representations aiming to substantially improve the body part hypotheses. Second, we draw on and combine several recently proposed powerful ideas such as more flexible spatial models as well as image-conditioned spatial models. In a series of experiments we draw several important conclusions: (1) we show that the proposed appearance representations are complementary; (2) we demonstrate that even a basic tree-structure spatial human body model achieves state-of-the-art performance when augmented with the proper appearance representation; and (3) we show that the combination of the best performing appearance model with a flexible image-conditioned spatial model achieves the best result, significantly improving over the state of the art, on the "Leeds Sports Poses" and "Parse" benchmarks.
pdf DOI BibTeX

Perceiving Systems Conference Paper Human Pose Estimation using Body Parts Dependent Joint Regressors Dantone, M., Gall, J., Leistner, C., van Gool, L. In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 3041-3048, IEEE, Portland, OR, USA, June 2013
In this work, we address the problem of estimating 2D human pose from still images. Recent methods that rely on discriminatively trained deformable parts organized in a tree model have been shown to be very successful in solving this task. Within such a pictorial structure framework, we address the problem of obtaining good part templates by proposing novel, non-linear joint regressors. In particular, we employ two-layered random forests as joint regressors. The first layer acts as a discriminative, independent body part classifier. The second layer takes the estimated class distributions of the first one into account and is thereby able to predict joint locations by modeling the interdependence and co-occurrence of the parts. This results in a pose estimation framework that already takes dependencies between body parts into account during joint localization and is thus able to circumvent typical ambiguities of tree structures, such as for legs and arms. In the experiments, we demonstrate that our body parts dependent joint regressors achieve a higher joint localization accuracy than tree-based state-of-the-art methods.
pdf DOI BibTeX

Perceiving Systems Conference Paper Poselet conditioned pictorial structures Pishchulin, L., Andriluka, M., Gehler, P., Schiele, B. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 588-595, IEEE, Portland, OR, Conference on Computer Vision and Pattern Recognition (CVPR), June 2013 pdf DOI BibTeX

Perceiving Systems Conference Paper From pictorial structures to deformable structures Zuffi, S., Freifeld, O., Black, M. J. In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 3546-3553, IEEE, June 2012
Pictorial Structures (PS) define a probabilistic model of 2D articulated objects in images. Typical PS models assume an object can be represented by a set of rigid parts connected with pairwise constraints that define the prior probability of part configurations. These models are widely used to represent non-rigid articulated objects such as humans and animals despite the fact that such objects have parts that deform non-rigidly. Here we define a new Deformable Structures (DS) model that is a natural extension of previous PS models and that captures the non-rigid shape deformation of the parts. Each part in a DS model is represented by a low-dimensional shape deformation space and pairwise potentials between parts capture how the shape varies with pose and the shape of neighboring parts. A key advantage of such a model is that it more accurately models object boundaries. This enables image likelihood models that are more discriminative than previous PS likelihoods. This likelihood is learned using training imagery annotated using a DS “puppet.” We focus on a human DS model learned from 2D projections of a realistic 3D human body model and use it to infer human poses in images using a form of non-parametric belief propagation.
pdf sup mat code poster BibTeX
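For readers unfamiliar with Pictorial Structures, the probabilistic model referred to in the abstract above is commonly written as a posterior over part configurations that factorizes into per-part image likelihoods and pairwise priors. The notation below (part states l_i, image I, edge set E, potentials phi and psi) is a conventional choice assumed here for illustration and is not taken from the paper itself.

```latex
% L = (l_1, ..., l_n): configuration of the n parts; I: image;
% E: set of connected part pairs (a tree in typical PS models).
p(L \mid I) \;\propto\; p(I \mid L)\, p(L),
\qquad
p(I \mid L) \approx \prod_{i=1}^{n} \phi_i(I;\, l_i),
\qquad
p(L) \;\propto\; \prod_{(i,j) \in E} \psi_{ij}(l_i, l_j)
```

In a Deformable Structures model as described in the abstract, each state l_i would additionally carry low-dimensional shape deformation coefficients, and the pairwise potentials would relate the shape and pose of neighboring parts.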