Header logo is


2014


Thumb xl screen shot 2015 08 22 at 22.50.12
Data-Driven Grasp Synthesis - A Survey

Bohg, J., Morales, A., Asfour, T., Kragic, D.

IEEE Transactions on Robotics, 30, pages: 289 - 309, IEEE, April 2014 (article)

Abstract
We review the work on data-driven grasp synthesis and the methodologies for sampling and ranking candidate grasps. We divide the approaches into three groups based on whether they synthesize grasps for known, familiar or unknown objects. This structure allows us to identify common object representations and perceptual processes that facilitate the employed data-driven grasp synthesis technique. In the case of known objects, we concentrate on the approaches that are based on object recognition and pose estimation. In the case of familiar objects, the techniques use some form of a similarity matching to a set of previously encountered objects. Finally for the approaches dealing with unknown objects, the core part is the extraction of specific features that are indicative of good grasps. Our survey provides an overview of the different methodologies and discusses open problems in the area of robot grasping. We also draw a parallel to the classical approaches that rely on analytic formulations.

am

PDF link (url) DOI Project Page [BibTex]

2014


PDF link (url) DOI Project Page [BibTex]


Thumb xl thumb
Multi-View Priors for Learning Detectors from Sparse Viewpoint Data

Pepik, B., Stark, M., Gehler, P., Schiele, B.

International Conference on Learning Representations, International Conference on Learning Representations (ICLR), April 2014 (conference)

Abstract
While the majority of today's object class models provide only 2D bounding boxes, far richer output hypotheses are desirable including viewpoint, fine-grained category, and 3D geometry estimate. However, models trained to provide richer output require larger amounts of training data, preferably well covering the relevant aspects such as viewpoint and fine-grained categories. In this paper, we address this issue from the perspective of transfer learning, and design an object class model that explicitly leverages correlations between visual features. Specifically, our model represents prior distributions over permissible multi-view detectors in a parametric way -- the priors are learned once from training data of a source object class, and can later be used to facilitate the learning of a detector for a target class. As we show in our experiments, this transfer is not only beneficial for detectors based on basic-level category representations, but also enables the robust learning of detectors that represent classes at finer levels of granularity, where training data is typically even scarcer and more unbalanced. As a result, we report largely improved performance in simultaneous 2D object localization and viewpoint estimation on a recent dataset of challenging street scenes.

ps

reviews pdf Project Page [BibTex]

reviews pdf Project Page [BibTex]


no image
Local Gaussian Regression

Meier, F., Hennig, P., Schaal, S.

arXiv preprint, March 2014, clmc (misc)

Abstract
Abstract: Locally weighted regression was created as a nonparametric learning method that is computationally efficient, can learn from very large amounts of data and add data incrementally. An interesting feature of locally weighted regression is that it can work with ...

am pn

Web link (url) [BibTex]

Web link (url) [BibTex]


Thumb xl homerjournal
Adaptive Offset Correction for Intracortical Brain Computer Interfaces

Homer, M. L., Perge, J. A., Black, M. J., Harrison, M. T., Cash, S. S., Hochberg, L. R.

IEEE Transactions on Neural Systems and Rehabilitation Engineering, 22(2):239-248, March 2014 (article)

Abstract
Intracortical brain computer interfaces (iBCIs) decode intended movement from neural activity for the control of external devices such as a robotic arm. Standard approaches include a calibration phase to estimate decoding parameters. During iBCI operation, the statistical properties of the neural activity can depart from those observed during calibration, sometimes hindering a user’s ability to control the iBCI. To address this problem, we adaptively correct the offset terms within a Kalman filter decoder via penalized maximum likelihood estimation. The approach can handle rapid shifts in neural signal behavior (on the order of seconds) and requires no knowledge of the intended movement. The algorithm, called MOCA, was tested using simulated neural activity and evaluated retrospectively using data collected from two people with tetraplegia operating an iBCI. In 19 clinical research test cases, where a nonadaptive Kalman filter yielded relatively high decoding errors, MOCA significantly reduced these errors (10.6 ± 10.1\%; p < 0.05, pairwise t-test). MOCA did not significantly change the error in the remaining 23 cases where a nonadaptive Kalman filter already performed well. These results suggest that MOCA provides more robust decoding than the standard Kalman filter for iBCIs.

ps

pdf DOI Project Page [BibTex]

pdf DOI Project Page [BibTex]


Thumb xl aggteaser
Model-based Anthropometry: Predicting Measurements from 3D Human Scans in Multiple Poses

Tsoli, A., Loper, M., Black, M. J.

In Proceedings Winter Conference on Applications of Computer Vision, pages: 83-90, IEEE , IEEE Winter Conference on Applications of Computer Vision (WACV) , March 2014 (inproceedings)

Abstract
Extracting anthropometric or tailoring measurements from 3D human body scans is important for applications such as virtual try-on, custom clothing, and online sizing. Existing commercial solutions identify anatomical landmarks on high-resolution 3D scans and then compute distances or circumferences on the scan. Landmark detection is sensitive to acquisition noise (e.g. holes) and these methods require subjects to adopt a specific pose. In contrast, we propose a solution we call model-based anthropometry. We fit a deformable 3D body model to scan data in one or more poses; this model-based fitting is robust to scan noise. This brings the scan into registration with a database of registered body scans. Then, we extract features from the registered model (rather than from the scan); these include, limb lengths, circumferences, and statistical features of global shape. Finally, we learn a mapping from these features to measurements using regularized linear regression. We perform an extensive evaluation using the CAESAR dataset and demonstrate that the accuracy of our method outperforms state-of-the-art methods.

ps

pdf DOI Project Page Project Page [BibTex]

pdf DOI Project Page Project Page [BibTex]


no image
Probabilistic ODE Solvers with Runge-Kutta Means

Schober, M., Duvenaud, D., Hennig, P.

In Advances in Neural Information Processing Systems 27, pages: 739-747, (Editors: Z. Ghahramani, M. Welling, C. Cortes, N.D. Lawrence and K.Q. Weinberger), Curran Associates, Inc., 28th Annual Conference on Neural Information Processing Systems (NIPS), 2014 (inproceedings)

ei pn

Web link (url) [BibTex]

Web link (url) [BibTex]


no image
Active Learning of Linear Embeddings for Gaussian Processes

Garnett, R., Osborne, M., Hennig, P.

In Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence, pages: 230-239, (Editors: NL Zhang and J Tian), AUAI Press , Corvallis, Oregon, UAI2014, 2014, another link: http://arxiv.org/abs/1310.6740 (inproceedings)

ei pn

PDF Web [BibTex]

PDF Web [BibTex]


no image
Probabilistic Shortest Path Tractography in DTI Using Gaussian Process ODE Solvers

Schober, M., Kasenburg, N., Feragen, A., Hennig, P., Hauberg, S.

In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2014, Lecture Notes in Computer Science Vol. 8675, pages: 265-272, (Editors: P. Golland, N. Hata, C. Barillot, J. Hornegger and R. Howe), Springer, Heidelberg, MICCAI, 2014 (inproceedings)

ei pn

DOI [BibTex]

DOI [BibTex]


no image
Sampling for Inference in Probabilistic Models with Fast Bayesian Quadrature

Gunter, T., Osborne, M., Garnett, R., Hennig, P., Roberts, S.

In Advances in Neural Information Processing Systems 27, pages: 2789-2797, (Editors: Z. Ghahramani, M. Welling, C. Cortes, N.D. Lawrence and K.Q. Weinberger), Curran Associates, Inc., 28th Annual Conference on Neural Information Processing Systems (NIPS), 2014 (inproceedings)

ei pn

Web link (url) [BibTex]

Web link (url) [BibTex]


no image
A Self-Tuning LQR Approach Demonstrated on an Inverted Pendulum

Trimpe, S., Millane, A., Doessegger, S., D’Andrea, R.

In Proceedings of the 19th IFAC World Congress, Cape Town, South Africa, 2014 (inproceedings)

am ics

PDF Supplementary material DOI [BibTex]

PDF Supplementary material DOI [BibTex]


no image
Learning objective functions for autonomous motion generation

Kalakrishnan, M.

University of Southern California, University of Southern California, Los Angeles, CA, 2014 (phdthesis)

am

Project Page Project Page [BibTex]

Project Page Project Page [BibTex]


no image
A Limiting Property of the Matrix Exponential

Trimpe, S., D’Andrea, R.

IEEE Transactions on Automatic Control, 59(4):1105-1110, 2014 (article)

am ics

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Learning coupling terms for obstacle avoidance

Rai, A., Meier, F., Ijspeert, A., Schaal, S.

In International Conference on Humanoid Robotics, pages: 512-518, IEEE, 2014, clmc (inproceedings)

Abstract
Autonomous manipulation in dynamic environments is important for robots to perform everyday tasks. For this, a manipulator should be capable of interpreting the environment and planning an appropriate movement. At least, two possible approaches exist for this in literature. Usually, a planning system is used to generate a complex movement plan that satisfies all constraints. Alternatively, a simple plan could be chosen and modified with sensory feedback to accommodate additional constraints by equipping the controller with features that remain dormant most of the time, except when specific situations arise. Dynamic Movement Primitives (DMPs) form a robust and versatile starting point for such a controller that can be modified online using a non-linear term, called the coupling term. This can prove to be a fast and reactive way of obstacle avoidance in a human-like fashion. We propose a method to learn this coupling term from human demonstrations starting with simple features and making it more robust to avoid a larger range of obstacles. We test the ability of our coupling term to model different kinds of obstacle avoidance behaviours in humans and use this learnt coupling term to avoid obstacles in a reactive manner. This line of research aims at pushing the boundary of reactive control strategies to more complex scenarios, such that complex and usually computationally more expensive planning methods can be avoided as much as possible.

am

link (url) Project Page [BibTex]

link (url) Project Page [BibTex]


no image
Event-Based State Estimation With Variance-Based Triggering

Trimpe, S., D’Andrea, R.

IEEE Transactions on Automatic Control, 59(12):3266-3281, 2014 (article)

am ics

PDF Supplementary material DOI Project Page [BibTex]

PDF Supplementary material DOI Project Page [BibTex]


Thumb xl freelymoving2
A freely-moving monkey treadmill model

Foster, J., Nuyujukian, P., Freifeld, O., Gao, H., Walker, R., Ryu, S., Meng, T., Murmann, B., Black, M., Shenoy, K.

J. of Neural Engineering, 11(4):046020, 2014 (article)

Abstract
Objective: Motor neuroscience and brain-machine interface (BMI) design is based on examining how the brain controls voluntary movement, typically by recording neural activity and behavior from animal models. Recording technologies used with these animal models have traditionally limited the range of behaviors that can be studied, and thus the generality of science and engineering research. We aim to design a freely-moving animal model using neural and behavioral recording technologies that do not constrain movement. Approach: We have established a freely-moving rhesus monkey model employing technology that transmits neural activity from an intracortical array using a head-mounted device and records behavior through computer vision using markerless motion capture. We demonstrate the excitability and utility of this new monkey model, including the fi rst recordings from motor cortex while rhesus monkeys walk quadrupedally on a treadmill. Main results: Using this monkey model, we show that multi-unit threshold-crossing neural activity encodes the phase of walking and that the average ring rate of the threshold crossings covaries with the speed of individual steps. On a population level, we find that neural state-space trajectories of walking at diff erent speeds have similar rotational dynamics in some dimensions that evolve at the step rate of walking, yet robustly separate by speed in other state-space dimensions. Significance: Freely-moving animal models may allow neuroscientists to examine a wider range of behaviors and can provide a flexible experimental paradigm for examining the neural mechanisms that underlie movement generation across behaviors and environments. For BMIs, freely-moving animal models have the potential to aid prosthetic design by examining how neural encoding changes with posture, environment, and other real-world context changes. Understanding this new realm of behavior in more naturalistic settings is essential for overall progress of basic motor neuroscience and for the successful translation of BMIs to people with paralysis.

ps

pdf Supplementary DOI Project Page [BibTex]

pdf Supplementary DOI Project Page [BibTex]


no image
Incremental Local Gaussian Regression

Meier, F., Hennig, P., Schaal, S.

In Advances in Neural Information Processing Systems 27, pages: 972-980, (Editors: Z. Ghahramani, M. Welling, C. Cortes, N.D. Lawrence and K.Q. Weinberger), 28th Annual Conference on Neural Information Processing Systems (NIPS), 2014, clmc (inproceedings)

am ei pn

PDF link (url) [BibTex]

PDF link (url) [BibTex]


no image
Efficient Bayesian Local Model Learning for Control

Meier, F., Hennig, P., Schaal, S.

In Proceedings of the IEEE International Conference on Intelligent Robots and Systems, pages: 2244 - 2249, IROS, 2014, clmc (inproceedings)

Abstract
Model-based control is essential for compliant controland force control in many modern complex robots, like humanoidor disaster robots. Due to many unknown and hard tomodel nonlinearities, analytical models of such robots are oftenonly very rough approximations. However, modern optimizationcontrollers frequently depend on reasonably accurate models,and degrade greatly in robustness and performance if modelerrors are too large. For a long time, machine learning hasbeen expected to provide automatic empirical model synthesis,yet so far, research has only generated feasibility studies butno learning algorithms that run reliably on complex robots.In this paper, we combine two promising worlds of regressiontechniques to generate a more powerful regression learningsystem. On the one hand, locally weighted regression techniquesare computationally efficient, but hard to tune due to avariety of data dependent meta-parameters. On the other hand,Bayesian regression has rather automatic and robust methods toset learning parameters, but becomes quickly computationallyinfeasible for big and high-dimensional data sets. By reducingthe complexity of Bayesian regression in the spirit of local modellearning through variational approximations, we arrive at anovel algorithm that is computationally efficient and easy toinitialize for robust learning. Evaluations on several datasetsdemonstrate very good learning performance and the potentialfor a general regression learning tool for robotics.

am ei pn

PDF link (url) DOI [BibTex]

PDF link (url) DOI [BibTex]


no image
Perspective: Intelligent Systems: Bits and Bots

Spatz, J. P., Schaal, S.

Nature, (509), 2014, clmc (article)

Abstract
What is intelligence, and can we create it? Animals can perceive, reason, react and learn, but they are just one example of an intelligent system. Intelligent systems could be robots as large as humans, helping with search-and- rescue operations in dangerous places, or smart devices as tiny as a cell, delivering drugs to a target within the body. Even computing systems can be intelligent, by perceiving the world, crawling the web and processing â??big dataâ?? to extract and learn from complex information.Understanding not only how intelligence can be reproduced, but also how to build systems that put these ideas into practice, will be a challenge. Small intelligent systems will require new materials and fabrication methods, as well as com- pact information processors and power sources. And for nano-sized systems, the rules change altogether. The laws of physics operate very differently at tiny scales: for a nanorobot, swimming through water is like struggling through treacle.Researchers at the Max Planck Institute for Intelligent Systems have begun to solve these problems by developing new computational methods, experiment- ing with unique robotic systems and fabricating tiny, artificial propellers, like bacterial flagella, to propel nanocreations through their environment.

am

PDF link (url) [BibTex]

PDF link (url) [BibTex]


no image
Stability Analysis of Distributed Event-Based State Estimation

Trimpe, S.

In Proceedings of the 53rd IEEE Conference on Decision and Control, Los Angeles, CA, 2014 (inproceedings)

Abstract
An approach for distributed and event-based state estimation that was proposed in previous work [1] is analyzed and extended to practical networked systems in this paper. Multiple sensor-actuator-agents observe a dynamic process, sporadically exchange their measurements over a broadcast network according to an event-based protocol, and estimate the process state from the received data. The event-based approach was shown in [1] to mimic a centralized Luenberger observer up to guaranteed bounds, under the assumption of identical estimates on all agents. This assumption, however, is unrealistic (it is violated by a single packet drop or slight numerical inaccuracy) and removed herein. By means of a simulation example, it is shown that non-identical estimates can actually destabilize the overall system. To achieve stability, the event-based communication scheme is supplemented by periodic (but infrequent) exchange of the agentsâ?? estimates and reset to their joint average. When the local estimates are used for feedback control, the stability guarantee for the estimation problem extends to the event-based control system.

am ics

PDF Supplementary material DOI Project Page [BibTex]

PDF Supplementary material DOI Project Page [BibTex]


no image
Data-driven autonomous manipulation

Pastor, P.

University of Southern California, University of Southern California, Los Angeles, CA, 2014 (phdthesis)

am

Project Page Project Page [BibTex]

Project Page Project Page [BibTex]


no image
Dual Execution of Optimized Contact Interaction Trajectories

Toussaint, M., Ratliff, N., Bohg, J., Righetti, L., Englert, P., Schaal, S.

In 2014 IEEE/RSJ Conference on Intelligent Robots and Systems, pages: 47-54, IEEE, Chicago, USA, 2014 (inproceedings)

Abstract
Efficient manipulation requires contact to reduce uncertainty. The manipulation literature refers to this as funneling: a methodology for increasing reliability and robustness by leveraging haptic feedback and control of environmental interaction. However, there is a fundamental gap between traditional approaches to trajectory optimization and this concept of robustness by funneling: traditional trajectory optimizers do not discover force feedback strategies. From a POMDP perspective, these behaviors could be regarded as explicit observation actions planned to sufficiently reduce uncertainty thereby enabling a task. While we are sympathetic to the full POMDP view, solving full continuous-space POMDPs in high-dimensions is hard. In this paper, we propose an alternative approach in which trajectory optimization objectives are augmented with new terms that reward uncertainty reduction through contacts, explicitly promoting funneling. This augmentation shifts the responsibility of robustness toward the actual execution of the optimized trajectories. Directly tracing trajectories through configuration space would lose all robustness-dual execution achieves robustness by devising force controllers to reproduce the temporal interaction profile encoded in the dual solution of the optimization problem. This work introduces dual execution in depth and analyze its performance through robustness experiments in both simulation and on a real-world robotic platform.

am mg

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
An autonomous manipulation system based on force control and optimization

Righetti, L., Kalakrishnan, M., Pastor, P., Binney, J., Kelly, J., Voorhies, R. C., Sukhatme, G. S., Schaal, S.

Autonomous Robots, 36(1-2):11-30, January 2014 (article)

Abstract
In this paper we present an architecture for autonomous manipulation. Our approach is based on the belief that contact interactions during manipulation should be exploited to improve dexterity and that optimizing motion plans is useful to create more robust and repeatable manipulation behaviors. We therefore propose an architecture where state of the art force/torque control and optimization-based motion planning are the core components of the system. We give a detailed description of the modules that constitute the complete system and discuss the challenges inherent to creating such a system. We present experimental results for several grasping and manipulation tasks to demonstrate the performance and robustness of our approach.

am mg

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Learning and Exploration in a Novel Dimensionality-Reduction Task

Ebert, J, Kim, S, Schweighofer, N., Sternad, D, Schaal, S.

In Abstracts of Neural Control of Movement Conference (NCM 2009), Amsterdam, Netherlands, 2014 (inproceedings)

am

[BibTex]

[BibTex]


no image
Learning of grasp selection based on shape-templates

Herzog, A., Pastor, P., Kalakrishnan, M., Righetti, L., Bohg, J., Asfour, T., Schaal, S.

Autonomous Robots, 36(1-2):51-65, January 2014 (article)

Abstract
The ability to grasp unknown objects still remains an unsolved problem in the robotics community. One of the challenges is to choose an appropriate grasp configuration, i.e., the 6D pose of the hand relative to the object and its finger configuration. In this paper, we introduce an algorithm that is based on the assumption that similarly shaped objects can be grasped in a similar way. It is able to synthesize good grasp poses for unknown objects by finding the best matching object shape templates associated with previously demonstrated grasps. The grasp selection algorithm is able to improve over time by using the information of previous grasp attempts to adapt the ranking of the templates to new situations. We tested our approach on two different platforms, the Willow Garage PR2 and the Barrett WAM robot, which have very different hand kinematics. Furthermore, we compared our algorithm with other grasp planners and demonstrated its superior performance. The results presented in this paper show that the algorithm is able to find good grasp configurations for a large set of unknown objects from a relatively small set of demonstrations, and does improve its performance over time.

am mg

link (url) DOI [BibTex]


Thumb xl simulated annealing
Simulated Annealing

Gall, J.

In Encyclopedia of Computer Vision, pages: 737-741, 0, (Editors: Ikeuchi, K. ), Springer Verlag, 2014, to appear (inbook)

ps

[BibTex]

[BibTex]


Thumb xl ijcvflow2
A Quantitative Analysis of Current Practices in Optical Flow Estimation and the Principles behind Them

Sun, D., Roth, S., Black, M. J.

International Journal of Computer Vision (IJCV), 106(2):115-137, 2014 (article)

Abstract
The accuracy of optical flow estimation algorithms has been improving steadily as evidenced by results on the Middlebury optical flow benchmark. The typical formulation, however, has changed little since the work of Horn and Schunck. We attempt to uncover what has made recent advances possible through a thorough analysis of how the objective function, the optimization method, and modern implementation practices influence accuracy. We discover that "classical'' flow formulations perform surprisingly well when combined with modern optimization and implementation techniques. One key implementation detail is the median filtering of intermediate flow fields during optimization. While this improves the robustness of classical methods it actually leads to higher energy solutions, meaning that these methods are not optimizing the original objective function. To understand the principles behind this phenomenon, we derive a new objective function that formalizes the median filtering heuristic. This objective function includes a non-local smoothness term that robustly integrates flow estimates over large spatial neighborhoods. By modifying this new term to include information about flow and image boundaries we develop a method that can better preserve motion details. To take advantage of the trend towards video in wide-screen format, we further introduce an asymmetric pyramid downsampling scheme that enables the estimation of longer range horizontal motions. The methods are evaluated on Middlebury, MPI Sintel, and KITTI datasets using the same parameter settings.

ps

pdf full text code [BibTex]

pdf full text code [BibTex]


no image
Balancing experiments on a torque-controlled humanoid with hierarchical inverse dynamics

Herzog, A., Righetti, L., Grimminger, F., Pastor, P., Schaal, S.

In 2014 IEEE/RSJ Conference on Intelligent Robots and Systems, pages: 981-988, IEEE, Chicago, USA, 2014 (inproceedings)

Abstract
Recently several hierarchical inverse dynamics controllers based on cascades of quadratic programs have been proposed for application on torque controlled robots. They have important theoretical benefits but have never been implemented on a torque controlled robot where model inaccuracies and real-time computation requirements can be problematic. In this contribution we present an experimental evaluation of these algorithms in the context of balance control for a humanoid robot. The presented experiments demonstrate the applicability of the approach under real robot conditions (i.e. model uncertainty, estimation errors, etc). We propose a simplification of the optimization problem that allows us to decrease computation time enough to implement it in a fast torque control loop. We implement a momentum-based balance controller which shows robust performance in face of unknown disturbances, even when the robot is standing on only one foot. In a second experiment, a tracking task is evaluated to demonstrate the performance of the controller with more complicated hierarchies. Our results show that hierarchical inverse dynamics controllers can be used for feedback control of humanoid robots and that momentum-based balance control can be efficiently implemented on a real robot.

am mg

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Full Dynamics LQR Control of a Humanoid Robot: An Experimental Study on Balancing and Squatting

Mason, S., Righetti, L., Schaal, S.

In 2014 IEEE-RAS International Conference on Humanoid Robots, pages: 374-379, IEEE, Madrid, Spain, 2014 (inproceedings)

Abstract
Humanoid robots operating in human environments require whole-body controllers that can offer precise tracking and well-defined disturbance rejection behavior. In this contribution, we propose an experimental evaluation of a linear quadratic regulator (LQR) using a linearization of the full robot dynamics together with the contact constraints. The advantage of the controller is that it explicitly takes into account the coupling between the different joints to create optimal feedback controllers for whole-body control. We also propose a method to explicitly regulate other tasks of interest, such as the regulation of the center of mass of the robot or its angular momentum. In order to evaluate the performance of linear optimal control designs in a real-world scenario (model uncertainty, sensor noise, imperfect state estimation, etc), we test the controllers in a variety of tracking and balancing experiments on a torque controlled humanoid (e.g. balancing, split plane balancing, squatting, pushes while squatting, and balancing on a wheeled platform). The proposed control framework shows a reliable push recovery behavior competitive with more sophisticated balance controllers, rejecting impulses up to 11.7 Ns with peak forces of 650 N, with the added advantage of great computational simplicity. Furthermore, the controller is able to track squatting trajectories up to 1 Hz without relinearization, suggesting that the linearized dynamics is sufficient for significant ranges of motion.

am mg

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
State Estimation for a Humanoid Robot

Rotella, N., Bloesch, M., Righetti, L., Schaal, S.

In 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 952-958, IEEE, Chicago, USA, 2014 (inproceedings)

Abstract
This paper introduces a framework for state estimation on a humanoid robot platform using only common proprioceptive sensors and knowledge of leg kinematics. The presented approach extends that detailed in prior work on a point-foot quadruped platform by adding the rotational constraints imposed by the humanoid's flat feet. As in previous work, the proposed Extended Kalman Filter accommodates contact switching and makes no assumptions about gait or terrain, making it applicable on any humanoid platform for use in any task. A nonlinear observability analysis is performed on both the point-foot and flat-foot filters and it is concluded that the addition of rotational constraints significantly simplifies singular cases and improves the observability characteristics of the system. Results on a simulated walking dataset demonstrate the performance gain of the flat-foot filter as well as confirm the results of the presented observability analysis.

am mg

link (url) DOI [BibTex]

link (url) DOI [BibTex]

2013


Thumb xl thumb
Branch&Rank for Efficient Object Detection

Lehmann, A., Gehler, P., VanGool, L.

International Journal of Computer Vision, Springer, December 2013 (article)

Abstract
Ranking hypothesis sets is a powerful concept for efficient object detection. In this work, we propose a branch&rank scheme that detects objects with often less than 100 ranking operations. This efficiency enables the use of strong and also costly classifiers like non-linear SVMs with RBF-TeX kernels. We thereby relieve an inherent limitation of branch&bound methods as bounds are often not tight enough to be effective in practice. Our approach features three key components: a ranking function that operates on sets of hypotheses and a grouping of these into different tasks. Detection efficiency results from adaptively sub-dividing the object search space into decreasingly smaller sets. This is inherited from branch&bound, while the ranking function supersedes a tight bound which is often unavailable (except for rather limited function classes). The grouping makes the system effective: it separates image classification from object recognition, yet combines them in a single formulation, phrased as a structured SVM problem. A novel aspect of branch&rank is that a better ranking function is expected to decrease the number of classifier calls during detection. We use the VOC’07 dataset to demonstrate the algorithmic properties of branch&rank.

ps

pdf link (url) [BibTex]

2013


pdf link (url) [BibTex]


Thumb xl thumb
Strong Appearance and Expressive Spatial Models for Human Pose Estimation

Pishchulin, L., Andriluka, M., Gehler, P., Schiele, B.

In International Conference on Computer Vision (ICCV), pages: 3487 - 3494 , IEEE, Computer Vision (ICCV), IEEE International Conference on , December 2013 (inproceedings)

Abstract
Typical approaches to articulated pose estimation combine spatial modelling of the human body with appearance modelling of body parts. This paper aims to push the state-of-the-art in articulated pose estimation in two ways. First we explore various types of appearance representations aiming to substantially improve the body part hypotheses. And second, we draw on and combine several recently proposed powerful ideas such as more flexible spatial models as well as image-conditioned spatial models. In a series of experiments we draw several important conclusions: (1) we show that the proposed appearance representations are complementary; (2) we demonstrate that even a basic tree-structure spatial human body model achieves state-of-the-art performance when augmented with the proper appearance representation; and (3) we show that the combination of the best performing appearance model with a flexible image-conditioned spatial model achieves the best result, significantly improving over the state of the art, on the "Leeds Sports Poses'' and "Parse'' benchmarks.

ps

pdf DOI Project Page [BibTex]

pdf DOI Project Page [BibTex]


Thumb xl tro
Extracting Postural Synergies for Robotic Grasping

Romero, J., Feix, T., Ek, C., Kjellstrom, H., Kragic, D.

Robotics, IEEE Transactions on, 29(6):1342-1352, December 2013 (article)

ps

[BibTex]

[BibTex]


Thumb xl zhang
Understanding High-Level Semantics by Modeling Traffic Patterns

Zhang, H., Geiger, A., Urtasun, R.

In International Conference on Computer Vision, pages: 3056-3063, Sydney, Australia, December 2013 (inproceedings)

Abstract
In this paper, we are interested in understanding the semantics of outdoor scenes in the context of autonomous driving. Towards this goal, we propose a generative model of 3D urban scenes which is able to reason not only about the geometry and objects present in the scene, but also about the high-level semantics in the form of traffic patterns. We found that a small number of patterns is sufficient to model the vast majority of traffic scenes and show how these patterns can be learned. As evidenced by our experiments, this high-level reasoning significantly improves the overall scene estimation as well as the vehicle-to-lane association when compared to state-of-the-art approaches. All data and code will be made available upon publication.

avg ps

pdf [BibTex]

pdf [BibTex]


Thumb xl thumb
A Non-parametric Bayesian Network Prior of Human Pose

Lehrmann, A. M., Gehler, P., Nowozin, S.

In Proceedings IEEE Conf. on Computer Vision (ICCV), pages: 1281-1288, IEEE International Conference on Computer Vision, December 2013 (inproceedings)

Abstract
Having a sensible prior of human pose is a vital ingredient for many computer vision applications, including tracking and pose estimation. While the application of global non-parametric approaches and parametric models has led to some success, finding the right balance in terms of flexibility and tractability, as well as estimating model parameters from data has turned out to be challenging. In this work, we introduce a sparse Bayesian network model of human pose that is non-parametric with respect to the estimation of both its graph structure and its local distributions. We describe an efficient sampling scheme for our model and show its tractability for the computation of exact log-likelihoods. We empirically validate our approach on the Human 3.6M dataset and demonstrate superior performance to global models and parametric networks. We further illustrate our model's ability to represent and compose poses not present in the training set (compositionality) and describe a speed-accuracy trade-off that allows realtime scoring of poses.

ps

Project page pdf DOI Project Page [BibTex]

Project page pdf DOI Project Page [BibTex]


Thumb xl jhuang
Towards understanding action recognition

Jhuang, H., Gall, J., Zuffi, S., Schmid, C., Black, M. J.

In IEEE International Conference on Computer Vision (ICCV), pages: 3192-3199, IEEE, Sydney, Australia, December 2013 (inproceedings)

Abstract
Although action recognition in videos is widely studied, current methods often fail on real-world datasets. Many recent approaches improve accuracy and robustness to cope with challenging video sequences, but it is often unclear what affects the results most. This paper attempts to provide insights based on a systematic performance evaluation using thoroughly-annotated data of human actions. We annotate human Joints for the HMDB dataset (J-HMDB). This annotation can be used to derive ground truth optical flow and segmentation. We evaluate current methods using this dataset and systematically replace the output of various algorithms with ground truth. This enables us to discover what is important – for example, should we work on improving flow algorithms, estimating human bounding boxes, or enabling pose estimation? In summary, we find that highlevel pose features greatly outperform low/mid level features; in particular, pose over time is critical, but current pose estimation algorithms are not yet reliable enough to provide this information. We also find that the accuracy of a top-performing action recognition framework can be greatly increased by refining the underlying low/mid level features; this suggests it is important to improve optical flow and human detection algorithms. Our analysis and JHMDB dataset should facilitate a deeper understanding of action recognition algorithms.

ps

Website Errata Poster Paper Slides DOI Project Page Project Page Project Page [BibTex]

Website Errata Poster Paper Slides DOI Project Page Project Page Project Page [BibTex]


Thumb xl impact battery
Probabilistic Object Tracking Using a Range Camera

Wüthrich, M., Pastor, P., Kalakrishnan, M., Bohg, J., Schaal, S.

In IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 3195-3202, IEEE, November 2013 (inproceedings)

Abstract
We address the problem of tracking the 6-DoF pose of an object while it is being manipulated by a human or a robot. We use a dynamic Bayesian network to perform inference and compute a posterior distribution over the current object pose. Depending on whether a robot or a human manipulates the object, we employ a process model with or without knowledge of control inputs. Observations are obtained from a range camera. As opposed to previous object tracking methods, we explicitly model self-occlusions and occlusions from the environment, e.g, the human or robotic hand. This leads to a strongly non-linear observation model and additional dependencies in the Bayesian network. We employ a Rao-Blackwellised particle filter to compute an estimate of the object pose at every time step. In a set of experiments, we demonstrate the ability of our method to accurately and robustly track the object pose in real-time while it is being manipulated by a human or a robot.

am

arXiv Video Code Video DOI Project Page [BibTex]

arXiv Video Code Video DOI Project Page [BibTex]


Thumb xl embs2013
Mixing Decoded Cursor Velocity and Position from an Offline Kalman Filter Improves Cursor Control in People with Tetraplegia

Homer, M., Harrison, M., Black, M. J., Perge, J., Cash, S., Friehs, G., Hochberg, L.

In 6th International IEEE EMBS Conference on Neural Engineering, pages: 715-718, San Diego, November 2013 (inproceedings)

Abstract
Kalman filtering is a common method to decode neural signals from the motor cortex. In clinical research investigating the use of intracortical brain computer interfaces (iBCIs), the technique enabled people with tetraplegia to control assistive devices such as a computer or robotic arm directly from their neural activity. For reaching movements, the Kalman filter typically estimates the instantaneous endpoint velocity of the control device. Here, we analyzed attempted arm/hand movements by people with tetraplegia to control a cursor on a computer screen to reach several circular targets. A standard velocity Kalman filter is enhanced to additionally decode for the cursor’s position. We then mix decoded velocity and position to generate cursor movement commands. We analyzed data, offline, from two participants across six sessions. Root mean squared error between the actual and estimated cursor trajectory improved by 12.2 ±10.5% (pairwise t-test, p<0.05) as compared to a standard velocity Kalman filter. The findings suggest that simultaneously decoding for intended velocity and position and using them both to generate movement commands can improve the performance of iBCIs.

ps

pdf Project Page [BibTex]

pdf Project Page [BibTex]


Thumb xl pic cviu13
Markov Random Field Modeling, Inference & Learning in Computer Vision & Image Understanding: A Survey

Wang, C., Komodakis, N., Paragios, N.

Computer Vision and Image Understanding (CVIU), 117(11):1610-1627, November 2013 (article)

Abstract
In this paper, we present a comprehensive survey of Markov Random Fields (MRFs) in computer vision and image understanding, with respect to the modeling, the inference and the learning. While MRFs were introduced into the computer vision field about two decades ago, they started to become a ubiquitous tool for solving visual perception problems around the turn of the millennium following the emergence of efficient inference methods. During the past decade, a variety of MRF models as well as inference and learning methods have been developed for addressing numerous low, mid and high-level vision problems. While most of the literature concerns pairwise MRFs, in recent years we have also witnessed significant progress in higher-order MRFs, which substantially enhances the expressiveness of graph-based models and expands the domain of solvable problems. This survey provides a compact and informative summary of the major literature in this research topic.

ps

Publishers site pdf [BibTex]

Publishers site pdf [BibTex]


no image
Camera-specific Image Denoising

Schober, M.

Eberhard Karls Universität Tübingen, Germany, October 2013 (diplomathesis)

ei pn

PDF [BibTex]

PDF [BibTex]


Thumb xl multi modal
3-D Object Reconstruction of Symmetric Objects by Fusing Visual and Tactile Sensing

Illonen, J., Bohg, J., Kyrki, V.

The International Journal of Robotics Research, 33(2):321-341, Sage, October 2013 (article)

Abstract
In this work, we propose to reconstruct a complete 3-D model of an unknown object by fusion of visual and tactile information while the object is grasped. Assuming the object is symmetric, a first hypothesis of its complete 3-D shape is generated. A grasp is executed on the object with a robotic manipulator equipped with tactile sensors. Given the detected contacts between the fingers and the object, the initial full object model including the symmetry parameters can be refined. This refined model will then allow the planning of more complex manipulation tasks. The main contribution of this work is an optimal estimation approach for the fusion of visual and tactile data applying the constraint of object symmetry. The fusion is formulated as a state estimation problem and solved with an iterative extended Kalman filter. The approach is validated experimentally using both artificial and real data from two different robotic platforms.

am

Web DOI Project Page [BibTex]

Web DOI Project Page [BibTex]


Thumb xl implied flow whue
Puppet Flow

Zuffi, S., Black, M. J.

(7), Max Planck Institute for Intelligent Systems, October 2013 (techreport)

Abstract
We introduce Puppet Flow (PF), a layered model describing the optical flow of a person in a video sequence. We consider video frames composed by two layers: a foreground layer corresponding to a person, and background. We model the background as an affine flow field. The foreground layer, being a moving person, requires reasoning about the articulated nature of the human body. We thus represent the foreground layer with the Deformable Structures model (DS), a parametrized 2D part-based human body representation. We call the motion field defined through articulated motion and deformation of the DS model, a Puppet Flow. By exploiting the DS representation, Puppet Flow is a parametrized optical flow field, where parameters are the person's pose, gender and body shape.

ps

pdf Project Page Project Page [BibTex]

pdf Project Page Project Page [BibTex]


Thumb xl ijrr
Vision meets Robotics: The KITTI Dataset

Geiger, A., Lenz, P., Stiller, C., Urtasun, R.

International Journal of Robotics Research, 32(11):1231 - 1237 , Sage Publishing, September 2013 (article)

Abstract
We present a novel dataset captured from a VW station wagon for use in mobile robotics and autonomous driving research. In total, we recorded 6 hours of traffic scenarios at 10-100 Hz using a variety of sensor modalities such as high-resolution color and grayscale stereo cameras, a Velodyne 3D laser scanner and a high-precision GPS/IMU inertial navigation system. The scenarios are diverse, capturing real-world traffic situations and range from freeways over rural areas to inner-city scenes with many static and dynamic objects. Our data is calibrated, synchronized and timestamped, and we provide the rectified and raw image sequences. Our dataset also contains object labels in the form of 3D tracklets and we provide online benchmarks for stereo, optical flow, object detection and other tasks. This paper describes our recording platform, the data format and the utilities that we provide.

avg ps

pdf DOI [BibTex]

pdf DOI [BibTex]


Thumb xl imgf0006
Human Pose Calculation from Optical Flow Data

Black, M., Loper, M., Romero, J., Zuffi, S.

European Patent Application EP 2843621 , August 2013 (patent)

ps

Google Patents [BibTex]

Google Patents [BibTex]


Thumb xl cover3
Statistics on Manifolds with Applications to Modeling Shape Deformations

Freifeld, O.

Brown University, August 2013 (phdthesis)

Abstract
Statistical models of non-rigid deformable shape have wide application in many fi elds, including computer vision, computer graphics, and biometry. We show that shape deformations are well represented through nonlinear manifolds that are also matrix Lie groups. These pattern-theoretic representations lead to several advantages over other alternatives, including a principled measure of shape dissimilarity and a natural way to compose deformations. Moreover, they enable building models using statistics on manifolds. Consequently, such models are superior to those based on Euclidean representations. We demonstrate this by modeling 2D and 3D human body shape. Shape deformations are only one example of manifold-valued data. More generally, in many computer-vision and machine-learning problems, nonlinear manifold representations arise naturally and provide a powerful alternative to Euclidean representations. Statistics is traditionally concerned with data in a Euclidean space, relying on the linear structure and the distances associated with such a space; this renders it inappropriate for nonlinear spaces. Statistics can, however, be generalized to nonlinear manifolds. Moreover, by respecting the underlying geometry, the statistical models result in not only more e ffective analysis but also consistent synthesis. We go beyond previous work on statistics on manifolds by showing how, even on these curved spaces, problems related to modeling a class from scarce data can be dealt with by leveraging information from related classes residing in di fferent regions of the space. We show the usefulness of our approach with 3D shape deformations. To summarize our main contributions: 1) We de fine a new 2D articulated model -- more expressive than traditional ones -- of deformable human shape that factors body-shape, pose, and camera variations. Its high realism is obtained from training data generated from a detailed 3D model. 2) We defi ne a new manifold-based representation of 3D shape deformations that yields statistical deformable-template models that are better than the current state-of-the- art. 3) We generalize a transfer learning idea from Euclidean spaces to Riemannian manifolds. This work demonstrates the value of modeling manifold-valued data and their statistics explicitly on the manifold. Specifi cally, the methods here provide new tools for shape analysis.

ps

pdf Project Page [BibTex]


Thumb xl thumb
Poselet conditioned pictorial structures

Pishchulin, L., Andriluka, M., Gehler, P., Schiele, B.

In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages: 588 - 595, IEEE, Portland, OR, Conference on Computer Vision and Pattern Recognition (CVRP), June 2013 (inproceedings)

ps

pdf DOI Project Page [BibTex]

pdf DOI Project Page [BibTex]


Thumb xl thumb
Occlusion Patterns for Object Class Detection

Pepik, B., Stark, M., Gehler, P., Schiele, B.

In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Portland, OR, June 2013 (inproceedings)

Abstract
Despite the success of recent object class recognition systems, the long-standing problem of partial occlusion re- mains a major challenge, and a principled solution is yet to be found. In this paper we leave the beaten path of meth- ods that treat occlusion as just another source of noise – instead, we include the occluder itself into the modelling, by mining distinctive, reoccurring occlusion patterns from annotated training data. These patterns are then used as training data for dedicated detectors of varying sophistica- tion. In particular, we evaluate and compare models that range from standard object class detectors to hierarchical, part-based representations of occluder/occludee pairs. In an extensive evaluation we derive insights that can aid fur- ther developments in tackling the occlusion challenge.

ps

pdf Project Page [BibTex]

pdf Project Page [BibTex]


Thumb xl lost
Lost! Leveraging the Crowd for Probabilistic Visual Self-Localization

(CVPR13 Best Paper Runner-Up)

Brubaker, M. A., Geiger, A., Urtasun, R.

In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR 2013), pages: 3057-3064, IEEE, Portland, OR, June 2013 (inproceedings)

Abstract
In this paper we propose an affordable solution to self- localization, which utilizes visual odometry and road maps as the only inputs. To this end, we present a probabilis- tic model as well as an efficient approximate inference al- gorithm, which is able to utilize distributed computation to meet the real-time requirements of autonomous systems. Because of the probabilistic nature of the model we are able to cope with uncertainty due to noisy visual odometry and inherent ambiguities in the map ( e.g ., in a Manhattan world). By exploiting freely available, community devel- oped maps and visual odometry measurements, we are able to localize a vehicle up to 3m after only a few seconds of driving on maps which contain more than 2,150km of driv- able roads.

avg ps

pdf supplementary project page [BibTex]

pdf supplementary project page [BibTex]


Thumb xl poseregression
Human Pose Estimation using Body Parts Dependent Joint Regressors

Dantone, M., Gall, J., Leistner, C., van Gool, L.

In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages: 3041-3048, IEEE, Portland, OR, USA, June 2013 (inproceedings)

Abstract
In this work, we address the problem of estimating 2d human pose from still images. Recent methods that rely on discriminatively trained deformable parts organized in a tree model have shown to be very successful in solving this task. Within such a pictorial structure framework, we address the problem of obtaining good part templates by proposing novel, non-linear joint regressors. In particular, we employ two-layered random forests as joint regressors. The first layer acts as a discriminative, independent body part classifier. The second layer takes the estimated class distributions of the first one into account and is thereby able to predict joint locations by modeling the interdependence and co-occurrence of the parts. This results in a pose estimation framework that takes dependencies between body parts already for joint localization into account and is thus able to circumvent typical ambiguities of tree structures, such as for legs and arms. In the experiments, we demonstrate that our body parts dependent joint regressors achieve a higher joint localization accuracy than tree-based state-of-the-art methods.

ps

pdf DOI Project Page [BibTex]

pdf DOI Project Page [BibTex]


Thumb xl deqingcvpr13b
A fully-connected layered model of foreground and background flow

Sun, D., Wulff, J., Sudderth, E., Pfister, H., Black, M.

In IEEE Conf. on Computer Vision and Pattern Recognition, (CVPR 2013), pages: 2451-2458, Portland, OR, June 2013 (inproceedings)

Abstract
Layered models allow scene segmentation and motion estimation to be formulated together and to inform one another. Traditional layered motion methods, however, employ fairly weak models of scene structure, relying on locally connected Ising/Potts models which have limited ability to capture long-range correlations in natural scenes. To address this, we formulate a fully-connected layered model that enables global reasoning about the complicated segmentations of real objects. Optimization with fully-connected graphical models is challenging, and our inference algorithm leverages recent work on efficient mean field updates for fully-connected conditional random fields. These methods can be implemented efficiently using high-dimensional Gaussian filtering. We combine these ideas with a layered flow model, and find that the long-range connections greatly improve segmentation into figure-ground layers when compared with locally connected MRF models. Experiments on several benchmark datasets show that the method can recover fine structures and large occlusion regions, with good flow accuracy and much lower computational cost than previous locally-connected layered models.

ps

pdf Supplemental Material Project Page Project Page [BibTex]

pdf Supplemental Material Project Page Project Page [BibTex]