Header logo is


2018


Thumb xl imgidx 00326
Customized Multi-Person Tracker

Ma, L., Tang, S., Black, M. J., Van Gool, L.

In Computer Vision – ACCV 2018, Springer International Publishing, Asian Conference on Computer Vision, December 2018 (inproceedings)

ps

PDF Project Page [BibTex]

2018


PDF Project Page [BibTex]


Thumb xl dip final
Deep Inertial Poser: Learning to Reconstruct Human Pose from Sparse Inertial Measurements in Real Time

Huang, Y., Kaufmann, M., Aksan, E., Black, M. J., Hilliges, O., Pons-Moll, G.

ACM Transactions on Graphics, (Proc. SIGGRAPH Asia), 37, pages: 185:1-185:15, ACM, November 2018, Two first authors contributed equally (article)

Abstract
We demonstrate a novel deep neural network capable of reconstructing human full body pose in real-time from 6 Inertial Measurement Units (IMUs) worn on the user's body. In doing so, we address several difficult challenges. First, the problem is severely under-constrained as multiple pose parameters produce the same IMU orientations. Second, capturing IMU data in conjunction with ground-truth poses is expensive and difficult to do in many target application scenarios (e.g., outdoors). Third, modeling temporal dependencies through non-linear optimization has proven effective in prior work but makes real-time prediction infeasible. To address this important limitation, we learn the temporal pose priors using deep learning. To learn from sufficient data, we synthesize IMU data from motion capture datasets. A bi-directional RNN architecture leverages past and future information that is available at training time. At test time, we deploy the network in a sliding window fashion, retaining real time capabilities. To evaluate our method, we recorded DIP-IMU, a dataset consisting of 10 subjects wearing 17 IMUs for validation in 64 sequences with 330,000 time instants; this constitutes the largest IMU dataset publicly available. We quantitatively evaluate our approach on multiple datasets and show results from a real-time implementation. DIP-IMU and the code are available for research purposes.

ps

data code pdf preprint video DOI Project Page [BibTex]

data code pdf preprint video DOI Project Page [BibTex]


Thumb xl sevillagcpr
On the Integration of Optical Flow and Action Recognition

Sevilla-Lara, L., Liao, Y., Güney, F., Jampani, V., Geiger, A., Black, M. J.

In German Conference on Pattern Recognition (GCPR), LNCS 11269, pages: 281-297, Springer, Cham, October 2018 (inproceedings)

Abstract
Most of the top performing action recognition methods use optical flow as a "black box" input. Here we take a deeper look at the combination of flow and action recognition, and investigate why optical flow is helpful, what makes a flow method good for action recognition, and how we can make it better. In particular, we investigate the impact of different flow algorithms and input transformations to better understand how these affect a state-of-the-art action recognition method. Furthermore, we fine tune two neural-network flow methods end-to-end on the most widely used action recognition dataset (UCF101). Based on these experiments, we make the following five observations: 1) optical flow is useful for action recognition because it is invariant to appearance, 2) optical flow methods are optimized to minimize end-point-error (EPE), but the EPE of current methods is not well correlated with action recognition performance, 3) for the flow methods tested, accuracy at boundaries and at small displacements is most correlated with action recognition performance, 4) training optical flow to minimize classification error instead of minimizing EPE improves recognition performance, and 5) optical flow learned for the task of action recognition differs from traditional optical flow especially inside the human body and at the boundary of the body. These observations may encourage optical flow researchers to look beyond EPE as a goal and guide action recognition researchers to seek better motion cues, leading to a tighter integration of the optical flow and action recognition communities.

avg ps

arXiv DOI [BibTex]

arXiv DOI [BibTex]


Thumb xl screenshot from 2018 06 15 22 59 30
A Value-Driven Eldercare Robot: Virtual and Physical Instantiations of a Case-Supported Principle-Based Behavior Paradigm

Anderson, M., Anderson, S., Berenz, V.

Proceedings of the IEEE, pages: 1,15, October 2018 (article)

Abstract
In this paper, a case-supported principle-based behavior paradigm is proposed to help ensure ethical behavior of autonomous machines. We argue that ethically significant behavior of autonomous systems should be guided by explicit ethical principles determined through a consensus of ethicists. Such a consensus is likely to emerge in many areas in which autonomous systems are apt to be deployed and for the actions they are liable to undertake. We believe that this is the case since we are more likely to agree on how machines ought to treat us than on how human beings ought to treat one another. Given such a consensus, particular cases of ethical dilemmas where ethicists agree on the ethically relevant features and the right course of action can be used to help discover principles that balance these features when they are in conflict. Such principles not only help ensure ethical behavior of complex and dynamic systems but also can serve as a basis for justification of this behavior. The requirements, methods, implementation, and evaluation components of the paradigm are detailed as well as its instantiation in both a simulated and real robot functioning in the domain of eldercare.

am

link (url) DOI [BibTex]


Thumb xl cover
Deep Neural Network-based Cooperative Visual Tracking through Multiple Micro Aerial Vehicles

Price, E., Lawless, G., Ludwig, R., Martinovic, I., Buelthoff, H. H., Black, M. J., Ahmad, A.

IEEE Robotics and Automation Letters, Robotics and Automation Letters, 3(4):3193-3200, IEEE, October 2018, Also accepted and presented in the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). (article)

Abstract
Multi-camera tracking of humans and animals in outdoor environments is a relevant and challenging problem. Our approach to it involves a team of cooperating micro aerial vehicles (MAVs) with on-board cameras only. DNNs often fail at objects with small scale or far away from the camera, which are typical characteristics of a scenario with aerial robots. Thus, the core problem addressed in this paper is how to achieve on-board, online, continuous and accurate vision-based detections using DNNs for visual person tracking through MAVs. Our solution leverages cooperation among multiple MAVs and active selection of most informative regions of image. We demonstrate the efficiency of our approach through simulations with up to 16 robots and real robot experiments involving two aerial robots tracking a person, while maintaining an active perception-driven formation. ROS-based source code is provided for the benefit of the community.

ps

Published Version link (url) DOI [BibTex]

Published Version link (url) DOI [BibTex]


Thumb xl interpolation
Temporal Interpolation as an Unsupervised Pretraining Task for Optical Flow Estimation

Wulff, J., Black, M. J.

In German Conference on Pattern Recognition (GCPR), LNCS 11269, pages: 567-582, Springer, Cham, October 2018 (inproceedings)

Abstract
The difficulty of annotating training data is a major obstacle to using CNNs for low-level tasks in video. Synthetic data often does not generalize to real videos, while unsupervised methods require heuristic n losses. Proxy tasks can overcome these issues, and start by training a network for a task for which annotation is easier or which can be trained unsupervised. The trained network is then fine-tuned for the original task using small amounts of ground truth data. Here, we investigate frame interpolation as a proxy task for optical flow. Using real movies, we train a CNN unsupervised for temporal interpolation. Such a network implicitly estimates motion, but cannot handle untextured regions. By fi ne-tuning on small amounts of ground truth flow, the network can learn to fill in homogeneous regions and compute full optical flow fi elds. Using this unsupervised pre-training, our network outperforms similar architectures that were trained supervised using synthetic optical flow.

ps

pdf arXiv DOI Project Page [BibTex]

pdf arXiv DOI Project Page [BibTex]


Thumb xl alice
First Impressions of Personality Traits From Body Shapes

Hu, Y., Parde, C. J., Hill, M. Q., Mahmood, N., O’Toole, A. J.

Psychological Science, 29(12):1969-–1983, October 2018 (article)

Abstract
People infer the personalities of others from their facial appearance. Whether they do so from body shapes is less studied. We explored personality inferences made from body shapes. Participants rated personality traits for male and female bodies generated with a three-dimensional body model. Multivariate spaces created from these ratings indicated that people evaluate bodies on valence and agency in ways that directly contrast positive and negative traits from the Big Five domains. Body-trait stereotypes based on the trait ratings revealed a myriad of diverse body shapes that typify individual traits. Personality-trait profiles were predicted reliably from a subset of the body-shape features used to specify the three-dimensional bodies. Body features related to extraversion and conscientiousness were predicted with the highest consensus, followed by openness traits. This study provides the first comprehensive look at the range, diversity, and reliability of personality inferences that people make from body shapes.

ps

publisher site pdf DOI [BibTex]

publisher site pdf DOI [BibTex]


Thumb xl bmvc pic
Human Motion Parsing by Hierarchical Dynamic Clustering

Zhang, Y., Tang, S., Sun, H., Neumann, H.

In Proceedings of the British Machine Vision Conference (BMVC), pages: 269, BMVA Press, 29th British Machine Vision Conference, September 2018 (inproceedings)

Abstract
Parsing continuous human motion into meaningful segments plays an essential role in various applications. In this work, we propose a hierarchical dynamic clustering framework to derive action clusters from a sequence of local features in an unsuper- vised bottom-up manner. We systematically investigate the modules in this framework and particularly propose diverse temporal pooling schemes, in order to realize accurate temporal action localization. We demonstrate our method on two motion parsing tasks: temporal action segmentation and abnormal behavior detection. The experimental results indicate that the proposed framework is significantly more effective than the other related state-of-the-art methods on several datasets.

ps

pdf Project Page [BibTex]

pdf Project Page [BibTex]


Thumb xl coma faces
Generating 3D Faces using Convolutional Mesh Autoencoders

Ranjan, A., Bolkart, T., Sanyal, S., Black, M. J.

In European Conference on Computer Vision (ECCV), Lecture Notes in Computer Science, vol 11207, pages: 725-741, Springer, Cham, September 2018 (inproceedings)

Abstract
Learned 3D representations of human faces are useful for computer vision problems such as 3D face tracking and reconstruction from images, as well as graphics applications such as character generation and animation. Traditional models learn a latent representation of a face using linear subspaces or higher-order tensor generalizations. Due to this linearity, they can not capture extreme deformations and non-linear expressions. To address this, we introduce a versatile model that learns a non-linear representation of a face using spectral convolutions on a mesh surface. We introduce mesh sampling operations that enable a hierarchical mesh representation that captures non-linear variations in shape and expression at multiple scales within the model. In a variational setting, our model samples diverse realistic 3D faces from a multivariate Gaussian distribution. Our training data consists of 20,466 meshes of extreme expressions captured over 12 different subjects. Despite limited training data, our trained model outperforms state-of-the-art face models with 50% lower reconstruction error, while using 75% fewer parameters. We also show that, replacing the expression space of an existing state-of-the-art face model with our autoencoder, achieves a lower reconstruction error. Our data, model and code are available at http://coma.is.tue.mpg.de/.

ps

Code (tensorflow) Code (pytorch) Project Page paper supplementary DOI Project Page Project Page [BibTex]

Code (tensorflow) Code (pytorch) Project Page paper supplementary DOI Project Page Project Page [BibTex]


Thumb xl person reid.001
Part-Aligned Bilinear Representations for Person Re-identification

Suh, Y., Wang, J., Tang, S., Mei, T., Lee, K. M.

In European Conference on Computer Vision (ECCV), 11218, pages: 418-437, Springer, Cham, September 2018 (inproceedings)

Abstract
Comparing the appearance of corresponding body parts is essential for person re-identification. However, body parts are frequently misaligned be- tween detected boxes, due to the detection errors and the pose/viewpoint changes. In this paper, we propose a network that learns a part-aligned representation for person re-identification. Our model consists of a two-stream network, which gen- erates appearance and body part feature maps respectively, and a bilinear-pooling layer that fuses two feature maps to an image descriptor. We show that it results in a compact descriptor, where the inner product between two image descriptors is equivalent to an aggregation of the local appearance similarities of the cor- responding body parts, and thereby significantly reduces the part misalignment problem. Our approach is advantageous over other pose-guided representations by learning part descriptors optimal for person re-identification. Training the net- work does not require any part annotation on the person re-identification dataset. Instead, we simply initialize the part sub-stream using a pre-trained sub-network of an existing pose estimation network and train the whole network to minimize the re-identification loss. We validate the effectiveness of our approach by demon- strating its superiority over the state-of-the-art methods on the standard bench- mark datasets including Market-1501, CUHK03, CUHK01 and DukeMTMC, and standard video dataset MARS.

ps

pdf supplementary DOI Project Page [BibTex]

pdf supplementary DOI Project Page [BibTex]


Thumb xl screenshot from 2017 07 27 17 24 14
Playful: Reactive Programming for Orchestrating Robotic Behavior

Berenz, V., Schaal, S.

IEEE Robotics Automation Magazine, 25(3):49-60, September 2018 (article) In press

Abstract
For many service robots, reactivity to changes in their surroundings is a must. However, developing software suitable for dynamic environments is difficult. Existing robotic middleware allows engineers to design behavior graphs by organizing communication between components. But because these graphs are structurally inflexible, they hardly support the development of complex reactive behavior. To address this limitation, we propose Playful, a software platform that applies reactive programming to the specification of robotic behavior.

am

playful website playful_IEEE_RAM link (url) DOI [BibTex]


Thumb xl persondetect  copy
Learning Human Optical Flow

Ranjan, A., Romero, J., Black, M. J.

In 29th British Machine Vision Conference, September 2018 (inproceedings)

Abstract
The optical flow of humans is well known to be useful for the analysis of human action. Given this, we devise an optical flow algorithm specifically for human motion and show that it is superior to generic flow methods. Designing a method by hand is impractical, so we develop a new training database of image sequences with ground truth optical flow. For this we use a 3D model of the human body and motion capture data to synthesize realistic flow fields. We then train a convolutional neural network to estimate human flow fields from pairs of images. Since many applications in human motion analysis depend on speed, and we anticipate mobile applications, we base our method on SpyNet with several modifications. We demonstrate that our trained network is more accurate than a wide range of top methods on held-out test data and that it generalizes well to real image sequences. When combined with a person detector/tracker, the approach provides a full solution to the problem of 2D human flow estimation. Both the code and the dataset are available for research.

ps

video code pdf link (url) Project Page Project Page [BibTex]

video code pdf link (url) Project Page Project Page [BibTex]


Thumb xl nbf
Neural Body Fitting: Unifying Deep Learning and Model-Based Human Pose and Shape Estimation

(Best Student Paper Award)

Omran, M., Lassner, C., Pons-Moll, G., Gehler, P. V., Schiele, B.

In 3DV, September 2018 (inproceedings)

Abstract
Direct prediction of 3D body pose and shape remains a challenge even for highly parameterized deep learning models. Mapping from the 2D image space to the prediction space is difficult: perspective ambiguities make the loss function noisy and training data is scarce. In this paper, we propose a novel approach (Neural Body Fitting (NBF)). It integrates a statistical body model within a CNN, leveraging reliable bottom-up semantic body part segmentation and robust top-down body model constraints. NBF is fully differentiable and can be trained using 2D and 3D annotations. In detailed experiments, we analyze how the components of our model affect performance, especially the use of part segmentations as an explicit intermediate representation, and present a robust, efficiently trainable framework for 3D human pose estimation from 2D images with competitive results on standard benchmarks. Code is available at https://github.com/mohomran/neural_body_fitting

ps

arXiv code Project Page [BibTex]


Thumb xl screen shot 2018 09 19 at 09.33.59
ClusterNet: Instance Segmentation in RGB-D Images

Shao, L., Tian, Y., Bohg, J.

arXiv, September 2018, Submitted to ICRA'19 (article) Submitted

Abstract
We propose a method for instance-level segmentation that uses RGB-D data as input and provides detailed information about the location, geometry and number of {\em individual\/} objects in the scene. This level of understanding is fundamental for autonomous robots. It enables safe and robust decision-making under the large uncertainty of the real-world. In our model, we propose to use the first and second order moments of the object occupancy function to represent an object instance. We train an hourglass Deep Neural Network (DNN) where each pixel in the output votes for the 3D position of the corresponding object center and for the object's size and pose. The final instance segmentation is achieved through clustering in the space of moments. The object-centric training loss is defined on the output of the clustering. Our method outperforms the state-of-the-art instance segmentation method on our synthesized dataset. We show that our method generalizes well on real-world data achieving visually better segmentation results.

am

link (url) [BibTex]

link (url) [BibTex]


no image
Discovering and Teaching Optimal Planning Strategies

Lieder, F., Callaway, F., Krueger, P. M., Das, P., Griffiths, T. L., Gul, S.

In The 14th biannual conference of the German Society for Cognitive Science, GK, September 2018 (inproceedings)

re

Project Page [BibTex]

Project Page [BibTex]


Thumb xl joeleccv18
Unsupervised Learning of Multi-Frame Optical Flow with Occlusions

Janai, J., Güney, F., Ranjan, A., Black, M. J., Geiger, A.

In European Conference on Computer Vision (ECCV), Lecture Notes in Computer Science, vol 11220, pages: 713-731, Springer, Cham, September 2018 (inproceedings)

avg ps

pdf suppmat Video Project Page DOI Project Page [BibTex]

pdf suppmat Video Project Page DOI Project Page [BibTex]


Thumb xl grasping
Leveraging Contact Forces for Learning to Grasp

Merzic, H., Bogdanovic, M., Kappler, D., Righetti, L., Bohg, J.

arXiv, September 2018, Submitted to ICRA'19 (article) Submitted

Abstract
Grasping objects under uncertainty remains an open problem in robotics research. This uncertainty is often due to noisy or partial observations of the object pose or shape. To enable a robot to react appropriately to unforeseen effects, it is crucial that it continuously takes sensor feedback into account. While visual feedback is important for inferring a grasp pose and reaching for an object, contact feedback offers valuable information during manipulation and grasp acquisition. In this paper, we use model-free deep reinforcement learning to synthesize control policies that exploit contact sensing to generate robust grasping under uncertainty. We demonstrate our approach on a multi-fingered hand that exhibits more complex finger coordination than the commonly used two- fingered grippers. We conduct extensive experiments in order to assess the performance of the learned policies, with and without contact sensing. While it is possible to learn grasping policies without contact sensing, our results suggest that contact feedback allows for a significant improvement of grasping robustness under object pose uncertainty and for objects with a complex shape.

am mg

video arXiv [BibTex]

video arXiv [BibTex]


no image
Discovering Rational Heuristics for Risky Choice

Gul, S., Krueger, P. M., Callaway, F., Griffiths, T. L., Lieder, F.

The 14th biannual conference of the German Society for Cognitive Science, GK, The 14th biannual conference of the German Society for Cognitive Science, GK, September 2018 (conference)

re

Project Page [BibTex]

Project Page [BibTex]


Thumb xl sample3 merge black
Learning an Infant Body Model from RGB-D Data for Accurate Full Body Motion Analysis

Hesse, N., Pujades, S., Romero, J., Black, M. J., Bodensteiner, C., Arens, M., Hofmann, U. G., Tacke, U., Hadders-Algra, M., Weinberger, R., Muller-Felber, W., Schroeder, A. S.

In Int. Conf. on Medical Image Computing and Computer Assisted Intervention (MICCAI), September 2018 (inproceedings)

Abstract
Infant motion analysis enables early detection of neurodevelopmental disorders like cerebral palsy (CP). Diagnosis, however, is challenging, requiring expert human judgement. An automated solution would be beneficial but requires the accurate capture of 3D full-body movements. To that end, we develop a non-intrusive, low-cost, lightweight acquisition system that captures the shape and motion of infants. Going beyond work on modeling adult body shape, we learn a 3D Skinned Multi-Infant Linear body model (SMIL) from noisy, low-quality, and incomplete RGB-D data. We demonstrate the capture of shape and motion with 37 infants in a clinical environment. Quantitative experiments show that SMIL faithfully represents the data and properly factorizes the shape and pose of the infants. With a case study based on general movement assessment (GMA), we demonstrate that SMIL captures enough information to allow medical assessment. SMIL provides a new tool and a step towards a fully automatic system for GMA.

ps

pdf Project page video extended arXiv version DOI Project Page [BibTex]

pdf Project page video extended arXiv version DOI Project Page [BibTex]


Thumb xl eccv pascal results  thumbnail
Deep Directional Statistics: Pose Estimation with Uncertainty Quantification

Prokudin, S., Gehler, P., Nowozin, S.

European Conference on Computer Vision (ECCV), September 2018 (conference)

Abstract
Modern deep learning systems successfully solve many perception tasks such as object pose estimation when the input image is of high quality. However, in challenging imaging conditions such as on low resolution images or when the image is corrupted by imaging artifacts, current systems degrade considerably in accuracy. While a loss in performance is unavoidable we would like our models to quantify their uncertainty in order to achieve robustness against images of varying quality. Probabilistic deep learning models combine the expressive power of deep learning with uncertainty quantification. In this paper, we propose a novel probabilistic deep learning model for the task of angular regression. Our model uses von Mises distributions to predict a distribution over object pose angle. Whereas a single von Mises distribution is making strong assumptions about the shape of the distribution, we extend the basic model to predict a mixture of von Mises distributions. We show how to learn a mixture model using a finite and infinite number of mixture components. Our model allow for likelihood-based training and efficient inference at test time. We demonstrate on a number of challenging pose estimation datasets that our model produces calibrated probability predictions and competitive or superior point estimates compared to the current state-of-the-art.

ps

code pdf [BibTex]

code pdf [BibTex]


Thumb xl vip
Recovering Accurate 3D Human Pose in The Wild Using IMUs and a Moving Camera

Marcard, T. V., Henschel, R., Black, M. J., Rosenhahn, B., Pons-Moll, G.

In European Conference on Computer Vision (ECCV), Lecture Notes in Computer Science, vol 11214, pages: 614-631, Springer, Cham, September 2018 (inproceedings)

Abstract
In this work, we propose a method that combines a single hand-held camera and a set of Inertial Measurement Units (IMUs) attached at the body limbs to estimate accurate 3D poses in the wild. This poses many new challenges: the moving camera, heading drift, cluttered background, occlusions and many people visible in the video. We associate 2D pose detections in each image to the corresponding IMU-equipped persons by solving a novel graph based optimization problem that forces 3D to 2D coherency within a frame and across long range frames. Given associations, we jointly optimize the pose of a statistical body model, the camera pose and heading drift using a continuous optimization framework. We validated our method on the TotalCapture dataset, which provides video and IMU synchronized with ground truth. We obtain an accuracy of 26mm, which makes it accurate enough to serve as a benchmark for image-based 3D pose estimation in the wild. Using our method, we recorded 3D Poses in the Wild (3DPW ), a new dataset consisting of more than 51; 000 frames with accurate 3D pose in challenging sequences, including walking in the city, going up-stairs, having co ffee or taking the bus. We make the reconstructed 3D poses, video, IMU and 3D models available for research purposes at http://virtualhumans.mpi-inf.mpg.de/3DPW.

ps

pdf SupMat data project DOI Project Page [BibTex]

pdf SupMat data project DOI Project Page [BibTex]


Thumb xl fict 05 00018 g003
Visual Perception and Evaluation of Photo-Realistic Self-Avatars From 3D Body Scans in Males and Females

Thaler, A., Piryankova, I., Stefanucci, J. K., Pujades, S., de la Rosa, S., Streuber, S., Romero, J., Black, M. J., Mohler, B. J.

Frontiers in ICT, 5, pages: 1-14, September 2018 (article)

Abstract
The creation or streaming of photo-realistic self-avatars is important for virtual reality applications that aim for perception and action to replicate real world experience. The appearance and recognition of a digital self-avatar may be especially important for applications related to telepresence, embodied virtual reality, or immersive games. We investigated gender differences in the use of visual cues (shape, texture) of a self-avatar for estimating body weight and evaluating avatar appearance. A full-body scanner was used to capture each participant's body geometry and color information and a set of 3D virtual avatars with realistic weight variations was created based on a statistical body model. Additionally, a second set of avatars was created with an average underlying body shape matched to each participant’s height and weight. In four sets of psychophysical experiments, the influence of visual cues on the accuracy of body weight estimation and the sensitivity to weight changes was assessed by manipulating body shape (own, average) and texture (own photo-realistic, checkerboard). The avatars were presented on a large-screen display, and participants responded to whether the avatar's weight corresponded to their own weight. Participants also adjusted the avatar's weight to their desired weight and evaluated the avatar's appearance with regard to similarity to their own body, uncanniness, and their willingness to accept it as a digital representation of the self. The results of the psychophysical experiments revealed no gender difference in the accuracy of estimating body weight in avatars. However, males accepted a larger weight range of the avatars as corresponding to their own. In terms of the ideal body weight, females but not males desired a thinner body. With regard to the evaluation of avatar appearance, the questionnaire responses suggest that own photo-realistic texture was more important to males for higher similarity ratings, while own body shape seemed to be more important to females. These results argue for gender-specific considerations when creating self-avatars.

ps

pdf DOI [BibTex]

pdf DOI [BibTex]


Thumb xl aircap ca 3
Decentralized MPC based Obstacle Avoidance for Multi-Robot Target Tracking Scenarios

Tallamraju, R., Rajappa, S., Black, M. J., Karlapalem, K., Ahmad, A.

2018 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), pages: 1-8, IEEE, August 2018 (conference)

Abstract
In this work, we consider the problem of decentralized multi-robot target tracking and obstacle avoidance in dynamic environments. Each robot executes a local motion planning algorithm which is based on model predictive control (MPC). The planner is designed as a quadratic program, subject to constraints on robot dynamics and obstacle avoidance. Repulsive potential field functions are employed to avoid obstacles. The novelty of our approach lies in embedding these non-linear potential field functions as constraints within a convex optimization framework. Our method convexifies nonconvex constraints and dependencies, by replacing them as pre-computed external input forces in robot dynamics. The proposed algorithm additionally incorporates different methods to avoid field local minima problems associated with using potential field functions in planning. The motion planner does not enforce predefined trajectories or any formation geometry on the robots and is a comprehensive solution for cooperative obstacle avoidance in the context of multi-robot target tracking. We perform simulation studies for different scenarios to showcase the convergence and efficacy of the proposed algorithm.

ps

Published Version link (url) DOI [BibTex]

Published Version link (url) DOI [BibTex]


Thumb xl mazen
Robust Physics-based Motion Retargeting with Realistic Body Shapes

Borno, M. A., Righetti, L., Black, M. J., Delp, S. L., Fiume, E., Romero, J.

Computer Graphics Forum, 37, pages: 6:1-12, July 2018 (article)

Abstract
Motion capture is often retargeted to new, and sometimes drastically different, characters. When the characters take on realistic human shapes, however, we become more sensitive to the motion looking right. This means adapting it to be consistent with the physical constraints imposed by different body shapes. We show how to take realistic 3D human shapes, approximate them using a simplified representation, and animate them so that they move realistically using physically-based retargeting. We develop a novel spacetime optimization approach that learns and robustly adapts physical controllers to new bodies and constraints. The approach automatically adapts the motion of the mocap subject to the body shape of a target subject. This motion respects the physical properties of the new body and every body shape results in a different and appropriate movement. This makes it easy to create a varied set of motions from a single mocap sequence by simply varying the characters. In an interactive environment, successful retargeting requires adapting the motion to unexpected external forces. We achieve robustness to such forces using a novel LQR-tree formulation. We show that the simulated motions look appropriate to each character’s anatomy and their actions are robust to perturbations.

mg ps

pdf video Project Page Project Page [BibTex]

pdf video Project Page Project Page [BibTex]


Thumb xl teaser image
Probabilistic Recurrent State-Space Models

Doerr, A., Daniel, C., Schiegg, M., Nguyen-Tuong, D., Schaal, S., Toussaint, M., Trimpe, S.

In Proceedings of the International Conference on Machine Learning (ICML), International Conference on Machine Learning (ICML), July 2018 (inproceedings)

Abstract
State-space models (SSMs) are a highly expressive model class for learning patterns in time series data and for system identification. Deterministic versions of SSMs (e.g., LSTMs) proved extremely successful in modeling complex time-series data. Fully probabilistic SSMs, however, unfortunately often prove hard to train, even for smaller problems. To overcome this limitation, we propose a scalable initialization and training algorithm based on doubly stochastic variational inference and Gaussian processes. In the variational approximation we propose in contrast to related approaches to fully capture the latent state temporal correlations to allow for robust training.

am ics

arXiv pdf Project Page [BibTex]

arXiv pdf Project Page [BibTex]


Thumb xl octo turned
Real-time Perception meets Reactive Motion Generation

(Best Systems Paper Finalists - Amazon Robotics Best Paper Awards in Manipulation)

Kappler, D., Meier, F., Issac, J., Mainprice, J., Garcia Cifuentes, C., Wüthrich, M., Berenz, V., Schaal, S., Ratliff, N., Bohg, J.

IEEE Robotics and Automation Letters, 3(3):1864-1871, July 2018 (article)

Abstract
We address the challenging problem of robotic grasping and manipulation in the presence of uncertainty. This uncertainty is due to noisy sensing, inaccurate models and hard-to-predict environment dynamics. Our approach emphasizes the importance of continuous, real-time perception and its tight integration with reactive motion generation methods. We present a fully integrated system where real-time object and robot tracking as well as ambient world modeling provides the necessary input to feedback controllers and continuous motion optimizers. Specifically, they provide attractive and repulsive potentials based on which the controllers and motion optimizer can online compute movement policies at different time intervals. We extensively evaluate the proposed system on a real robotic platform in four scenarios that exhibit either challenging workspace geometry or a dynamic environment. We compare the proposed integrated system with a more traditional sense-plan-act approach that is still widely used. In 333 experiments, we show the robustness and accuracy of the proposed system.

am

arxiv video video link (url) DOI Project Page [BibTex]


Thumb xl screen shot 2018 03 22 at 10.40.47 am
Oncilla robot: a versatile open-source quadruped research robot with compliant pantograph legs

Sproewitz, A., Tuleu, A., Ajallooeian, M., Vespignani, M., Moeckel, R., Eckert, P., D’Haene, M., Degrave, J., Nordmann, A., Schrauwen, B., Steil, J., Ijspeert, A. J.

Frontiers in Robotics and AI, 5(67), June 2018, arXiv: 1803.06259 (article)

Abstract
We present Oncilla robot, a novel mobile, quadruped legged locomotion machine. This large-cat sized, 5.1 robot is one of a kind of a recent, bioinspired legged robot class designed with the capability of model-free locomotion control. Animal legged locomotion in rough terrain is clearly shaped by sensor feedback systems. Results with Oncilla robot show that agile and versatile locomotion is possible without sensory signals to some extend, and tracking becomes robust when feedback control is added (Ajaoolleian 2015). By incorporating mechanical and control blueprints inspired from animals, and by observing the resulting robot locomotion characteristics, we aim to understand the contribution of individual components. Legged robots have a wide mechanical and control design parameter space, and a unique potential as research tools to investigate principles of biomechanics and legged locomotion control. But the hardware and controller design can be a steep initial hurdle for academic research. To facilitate the easy start and development of legged robots, Oncilla-robot's blueprints are available through open-source. [...]

dlg

link (url) DOI Project Page [BibTex]

link (url) DOI Project Page [BibTex]


Thumb xl screen shot 2018 04 18 at 11.01.27 am
Learning from Outside the Viability Kernel: Why we Should Build Robots that can Fail with Grace

Heim, S., Sproewitz, A.

Proceedings of SIMPAR 2018, pages: 55-61, IEEE, 2018 IEEE International Conference on Simulation, Modeling, and Programming for Autonomous Robots (SIMPAR), May 2018 (conference)

dlg

link (url) DOI Project Page [BibTex]

link (url) DOI Project Page [BibTex]


Thumb xl tslip
Impact of Trunk Orientation for Dynamic Bipedal Locomotion

Drama, O.

Dynamic Walking Conference, May 2018 (talk)

Abstract
Impact of trunk orientation for dynamic bipedal locomotion My research revolves around investigating the functional demands of bipedal running, with focus on stabilizing trunk orientation. When we think about postural stability, there are two critical questions we need to answer: What are the necessary and sufficient conditions to achieve and maintain trunk stability? I am concentrating on how morphology affects control strategies in achieving trunk stability. In particular, I denote the trunk pitch as the predominant morphology parameter and explore the requirements it imposes on a chosen control strategy. To analyze this, I use a spring loaded inverted pendulum model extended with a rigid trunk, which is actuated by a hip motor. The challenge for the controller design here is to have a single hip actuator to achieve two coupled tasks of moving the legs to generate motion and stabilizing the trunk. I enforce orthograde and pronograde postures and aim to identify the effect of these trunk orientations on the hip torque and ground reaction profiles for different control strategies.

dlg

Impact of trunk orientation for dynamic bipedal locomotion [DW 2018] link (url) Project Page [BibTex]


Thumb xl meta learning overview
Online Learning of a Memory for Learning Rates

(nominated for best paper award)

Meier, F., Kappler, D., Schaal, S.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2018, IEEE, International Conference on Robotics and Automation, May 2018, accepted (inproceedings)

Abstract
The promise of learning to learn for robotics rests on the hope that by extracting some information about the learning process itself we can speed up subsequent similar learning tasks. Here, we introduce a computationally efficient online meta-learning algorithm that builds and optimizes a memory model of the optimal learning rate landscape from previously observed gradient behaviors. While performing task specific optimization, this memory of learning rates predicts how to scale currently observed gradients. After applying the gradient scaling our meta-learner updates its internal memory based on the observed effect its prediction had. Our meta-learner can be combined with any gradient-based optimizer, learns on the fly and can be transferred to new optimization tasks. In our evaluations we show that our meta-learning algorithm speeds up learning of MNIST classification and a variety of learning control tasks, either in batch or online learning settings.

am

pdf video code [BibTex]

pdf video code [BibTex]


Thumb xl screen shot 2018 02 03 at 9.09.06 am
Shaping in Practice: Training Wheels to Learn Fast Hopping Directly in Hardware

Heim, S., Ruppert, F., Sarvestani, A., Sproewitz, A.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2018, pages: 5076-5081, IEEE, International Conference on Robotics and Automation, May 2018 (inproceedings)

Abstract
Learning instead of designing robot controllers can greatly reduce engineering effort required, while also emphasizing robustness. Despite considerable progress in simulation, applying learning directly in hardware is still challenging, in part due to the necessity to explore potentially unstable parameters. We explore the of concept shaping the reward landscape with training wheels; temporary modifications of the physical hardware that facilitate learning. We demonstrate the concept with a robot leg mounted on a boom learning to hop fast. This proof of concept embodies typical challenges such as instability and contact, while being simple enough to empirically map out and visualize the reward landscape. Based on our results we propose three criteria for designing effective training wheels for learning in robotics.

dlg

Video Youtube link (url) Project Page [BibTex]

Video Youtube link (url) Project Page [BibTex]


Thumb xl learning ct w asm block diagram detailed
Learning Sensor Feedback Models from Demonstrations via Phase-Modulated Neural Networks

Sutanto, G., Su, Z., Schaal, S., Meier, F.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2018, IEEE, International Conference on Robotics and Automation, May 2018 (inproceedings)

am

pdf video [BibTex]

pdf video [BibTex]


Thumb xl thesis cover2
Model-based Optical Flow: Layers, Learning, and Geometry

Wulff, J.

Tuebingen University, April 2018 (phdthesis)

Abstract
The estimation of motion in video sequences establishes temporal correspondences between pixels and surfaces and allows reasoning about a scene using multiple frames. Despite being a focus of research for over three decades, computing motion, or optical flow, remains challenging due to a number of difficulties, including the treatment of motion discontinuities and occluded regions, and the integration of information from more than two frames. One reason for these issues is that most optical flow algorithms only reason about the motion of pixels on the image plane, while not taking the image formation pipeline or the 3D structure of the world into account. One approach to address this uses layered models, which represent the occlusion structure of a scene and provide an approximation to the geometry. The goal of this dissertation is to show ways to inject additional knowledge about the scene into layered methods, making them more robust, faster, and more accurate. First, this thesis demonstrates the modeling power of layers using the example of motion blur in videos, which is caused by fast motion relative to the exposure time of the camera. Layers segment the scene into regions that move coherently while preserving their occlusion relationships. The motion of each layer therefore directly determines its motion blur. At the same time, the layered model captures complex blur overlap effects at motion discontinuities. Using layers, we can thus formulate a generative model for blurred video sequences, and use this model to simultaneously deblur a video and compute accurate optical flow for highly dynamic scenes containing motion blur. Next, we consider the representation of the motion within layers. Since, in a layered model, important motion discontinuities are captured by the segmentation into layers, the flow within each layer varies smoothly and can be approximated using a low dimensional subspace. We show how this subspace can be learned from training data using principal component analysis (PCA), and that flow estimation using this subspace is computationally efficient. The combination of the layered model and the low-dimensional subspace gives the best of both worlds, sharp motion discontinuities from the layers and computational efficiency from the subspace. Lastly, we show how layered methods can be dramatically improved using simple semantics. Instead of treating all layers equally, a semantic segmentation divides the scene into its static parts and moving objects. Static parts of the scene constitute a large majority of what is shown in typical video sequences; yet, in such regions optical flow is fully constrained by the depth structure of the scene and the camera motion. After segmenting out moving objects, we consider only static regions, and explicitly reason about the structure of the scene and the camera motion, yielding much better optical flow estimates. Furthermore, computing the structure of the scene allows to better combine information from multiple frames, resulting in high accuracies even in occluded regions. For moving regions, we compute the flow using a generic optical flow method, and combine it with the flow computed for the static regions to obtain a full optical flow field. By combining layered models of the scene with reasoning about the dynamic behavior of the real, three-dimensional world, the methods presented herein push the envelope of optical flow computation in terms of robustness, speed, and accuracy, giving state-of-the-art results on benchmarks and pointing to important future research directions for the estimation of motion in natural scenes.

ps

Official link DOI Project Page [BibTex]


Thumb xl animage2mask3
Assessing body image in anorexia nervosa using biometric self-avatars in virtual reality: Attitudinal components rather than visual body size estimation are distorted

Mölbert, S. C., Thaler, A., Mohler, B. J., Streuber, S., Romero, J., Black, M. J., Zipfel, S., Karnath, H., Giel, K. E.

Psychological Medicine, 48(4):642-653, March 2018 (article)

Abstract
Background: Body image disturbance (BID) is a core symptom of anorexia nervosa (AN), but as yet distinctive features of BID are unknown. The present study aimed at disentangling perceptual and attitudinal components of BID in AN. Methods: We investigated n=24 women with AN and n=24 controls. Based on a 3D body scan, we created realistic virtual 3D bodies (avatars) for each participant that were varied through a range of ±20% of the participants' weights. Avatars were presented in a virtual reality mirror scenario. Using different psychophysical tasks, participants identified and adjusted their actual and their desired body weight. To test for general perceptual biases in estimating body weight, a second experiment investigated perception of weight and shape matched avatars with another identity. Results: Women with AN and controls underestimated their weight, with a trend that women with AN underestimated more. The average desired body of controls had normal weight while the average desired weight of women with AN corresponded to extreme AN (DSM-5). Correlation analyses revealed that desired body weight, but not accuracy of weight estimation, was associated with eating disorder symptoms. In the second experiment, both groups estimated accurately while the most attractive body was similar to Experiment 1. Conclusions: Our results contradict the widespread assumption that patients with AN overestimate their body weight due to visual distortions. Rather, they illustrate that BID might be driven by distorted attitudes with regard to the desired body. Clinical interventions should aim at helping patients with AN to change their desired weight.

ps

doi pdf DOI Project Page [BibTex]


Thumb xl plos1
Body size estimation of self and others in females varying in BMI

Thaler, A., Geuss, M. N., Mölbert, S. C., Giel, K. E., Streuber, S., Romero, J., Black, M. J., Mohler, B. J.

PLoS ONE, 13(2), Febuary 2018 (article)

Abstract
Previous literature suggests that a disturbed ability to accurately identify own body size may contribute to overweight. Here, we investigated the influence of personal body size, indexed by body mass index (BMI), on body size estimation in a non-clinical population of females varying in BMI. We attempted to disentangle general biases in body size estimates and attitudinal influences by manipulating whether participants believed the body stimuli (personalized avatars with realistic weight variations) represented their own body or that of another person. Our results show that the accuracy of own body size estimation is predicted by personal BMI, such that participants with lower BMI underestimated their body size and participants with higher BMI overestimated their body size. Further, participants with higher BMI were less likely to notice the same percentage of weight gain than participants with lower BMI. Importantly, these results were only apparent when participants were judging a virtual body that was their own identity (Experiment 1), but not when they estimated the size of a body with another identity and the same underlying body shape (Experiment 2a). The different influences of BMI on accuracy of body size estimation and sensitivity to weight change for self and other identity suggests that effects of BMI on visual body size estimation are self-specific and not generalizable to other bodies.

ps

pdf DOI Project Page [BibTex]


Thumb xl coregpatentfig
Co-Registration – Simultaneous Alignment and Modeling of Articulated 3D Shapes

Black, M., Hirshberg, D., Loper, M., Rachlin, E., Weiss, A.

Febuary 2018, U.S.~Patent 9,898,848 (misc)

Abstract
Present application refers to a method, a model generation unit and a computer program (product) for generating trained models (M) of moving persons, based on physically measured person scan data (S). The approach is based on a common template (T) for the respective person and on the measured person scan data (S) in different shapes and different poses. Scan data are measured with a 3D laser scanner. A generic personal model is used for co-registering a set of person scan data (S) aligning the template (T) to the set of person scans (S) while simultaneously training the generic personal model to become a trained person model (M) by constraining the generic person model to be scan-specific, person-specific and pose-specific and providing the trained model (M), based on the co registering of the measured object scan data (S).

ps

text [BibTex]


no image
Distributed Event-Based State Estimation for Networked Systems: An LMI Approach

Muehlebach, M., Trimpe, S.

IEEE Transactions on Automatic Control, 63(1):269-276, January 2018 (article)

am ics

arXiv (extended version) DOI Project Page [BibTex]

arXiv (extended version) DOI Project Page [BibTex]


Thumb xl hmrteaser
End-to-end Recovery of Human Shape and Pose

Kanazawa, A., Black, M. J., Jacobs, D. W., Malik, J.

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2018, 2018 (inproceedings)

Abstract
We describe Human Mesh Recovery (HMR), an end-to-end framework for reconstructing a full 3D mesh of a human body from a single RGB image. In contrast to most current methods that compute 2D or 3D joint locations, we produce a richer and more useful mesh representation that is parameterized by shape and 3D joint angles. The main objective is to minimize the reprojection loss of keypoints, which allows our model to be trained using in-the-wild images that only have ground truth 2D annotations. However, the reprojection loss alone is highly underconstrained. In this work we address this problem by introducing an adversary trained to tell whether human body shape and pose parameters are real or not using a large database of 3D human meshes. We show that HMR can be trained with and without using any paired 2D-to-3D supervision. We do not rely on intermediate 2D keypoint detections and infer 3D pose and shape parameters directly from image pixels. Our model runs in real-time given a bounding box containing the person. We demonstrate our approach on various images in-the-wild and out-perform previous optimization-based methods that output 3D meshes and show competitive results on tasks such as 3D joint location estimation and part segmentation.

ps

pdf code project video Project Page [BibTex]

pdf code project video Project Page [BibTex]


no image
Learning to select computations

Callaway, F., Gul, S., Krueger, P., Griffiths, T. L., Lieder, F.

In Uncertainty in Artificial Intelligence: Proceedings of the Thirty-Fourth Conference, 2018 (inproceedings)

re

Project Page [BibTex]

Project Page [BibTex]


no image
On Time Optimization of Centroidal Momentum Dynamics

Ponton, B., Herzog, A., Del Prete, A., Schaal, S., Righetti, L.

In 2018 IEEE International Conference on Robotics and Automation (ICRA), pages: 5776-5782, IEEE, Brisbane, Australia, 2018 (inproceedings)

Abstract
Recently, the centroidal momentum dynamics has received substantial attention to plan dynamically consistent motions for robots with arms and legs in multi-contact scenarios. However, it is also non convex which renders any optimization approach difficult and timing is usually kept fixed in most trajectory optimization techniques to not introduce additional non convexities to the problem. But this can limit the versatility of the algorithms. In our previous work, we proposed a convex relaxation of the problem that allowed to efficiently compute momentum trajectories and contact forces. However, our approach could not minimize a desired angular momentum objective which seriously limited its applicability. Noticing that the non-convexity introduced by the time variables is of similar nature as the centroidal dynamics one, we propose two convex relaxations to the problem based on trust regions and soft constraints. The resulting approaches can compute time-optimized dynamically consistent trajectories sufficiently fast to make the approach realtime capable. The performance of the algorithm is demonstrated in several multi-contact scenarios for a humanoid robot. In particular, we show that the proposed convex relaxation of the original problem finds solutions that are consistent with the original non-convex problem and illustrate how timing optimization allows to find motion plans that would be difficult to plan with fixed timing † †Implementation details and demos can be found in the source code available at https://git-amd.tuebingen.mpg.de/bponton/timeoptimization.

am mg

link (url) DOI [BibTex]

link (url) DOI [BibTex]


Thumb xl img
Combining learned and analytical models for predicting action effects

Kloss, A., Schaal, S., Bohg, J.

arXiv, 2018 (article) Submitted

Abstract
One of the most basic skills a robot should possess is predicting the effect of physical interactions with objects in the environment. This enables optimal action selection to reach a certain goal state. Traditionally, dynamics are approximated by physics-based analytical models. These models rely on specific state representations that may be hard to obtain from raw sensory data, especially if no knowledge of the object shape is assumed. More recently, we have seen learning approaches that can predict the effect of complex physical interactions directly from sensory input. It is however an open question how far these models generalize beyond their training data. In this work, we investigate the advantages and limitations of neural network based learning approaches for predicting the effects of actions based on sensory input and show how analytical and learned models can be combined to leverage the best of both worlds. As physical interaction task, we use planar pushing, for which there exists a well-known analytical model and a large real-world dataset. We propose to use a convolutional neural network to convert raw depth images or organized point clouds into a suitable representation for the analytical model and compare this approach to using neural networks for both, perception and prediction. A systematic evaluation of the proposed approach on a very large real-world dataset shows two main advantages of the hybrid architecture. Compared to a pure neural network, it significantly (i) reduces required training data and (ii) improves generalization to novel physical interaction.

am

arXiv pdf link (url) [BibTex]


Thumb xl yanzhang clustering
Temporal Human Action Segmentation via Dynamic Clustering

Zhang, Y., Sun, H., Tang, S., Neumann, H.

arXiv preprint arXiv:1803.05790, 2018 (article)

Abstract
We present an effective dynamic clustering algorithm for the task of temporal human action segmentation, which has comprehensive applications such as robotics, motion analysis, and patient monitoring. Our proposed algorithm is unsupervised, fast, generic to process various types of features, and applica- ble in both the online and offline settings. We perform extensive experiments of processing data streams, and show that our algorithm achieves the state-of- the-art results for both online and offline settings.

ps

link (url) [BibTex]

link (url) [BibTex]


Thumb xl motion segmentation tracking clustering teaser
Motion Segmentation & Multiple Object Tracking by Correlation Co-Clustering

Keuper, M., Tang, S., Andres, B., Brox, T., Schiele, B.

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018 (article)

ps

pdf DOI Project Page [BibTex]

pdf DOI Project Page [BibTex]


Thumb xl smalrteaser
Lions and Tigers and Bears: Capturing Non-Rigid, 3D, Articulated Shape from Images

Zuffi, S., Kanazawa, A., Black, M. J.

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2018, 2018 (inproceedings)

Abstract
Animals are widespread in nature and the analysis of their shape and motion is important in many fields and industries. Modeling 3D animal shape, however, is difficult because the 3D scanning methods used to capture human shape are not applicable to wild animals or natural settings. Consequently, we propose a method to capture the detailed 3D shape of animals from images alone. The articulated and deformable nature of animals makes this problem extremely challenging, particularly in unconstrained environments with moving and uncalibrated cameras. To make this possible, we use a strong prior model of articulated animal shape that we fit to the image data. We then deform the animal shape in a canonical reference pose such that it matches image evidence when articulated and projected into multiple images. Our method extracts significantly more 3D shape detail than previous methods and is able to model new species, including the shape of an extinct animal, using only a few video frames. Additionally, the projected 3D shapes are accurate enough to facilitate the extraction of a realistic texture map from multiple frames.

ps

pdf code/data 3D models Project Page [BibTex]

pdf code/data 3D models Project Page [BibTex]


no image
Unsupervised Contact Learning for Humanoid Estimation and Control

Rotella, N., Schaal, S., Righetti, L.

In 2018 IEEE International Conference on Robotics and Automation (ICRA), pages: 411-417, IEEE, Brisbane, Australia, 2018 (inproceedings)

Abstract
This work presents a method for contact state estimation using fuzzy clustering to learn contact probability for full, six-dimensional humanoid contacts. The data required for training is solely from proprioceptive sensors - endeffector contact wrench sensors and inertial measurement units (IMUs) - and the method is completely unsupervised. The resulting cluster means are used to efficiently compute the probability of contact in each of the six endeffector degrees of freedom (DoFs) independently. This clustering-based contact probability estimator is validated in a kinematics-based base state estimator in a simulation environment with realistic added sensor noise for locomotion over rough, low-friction terrain on which the robot is subject to foot slip and rotation. The proposed base state estimator which utilizes these six DoF contact probability estimates is shown to perform considerably better than that which determines kinematic contact constraints purely based on measured normal force.

am mg

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Learning Task-Specific Dynamics to Improve Whole-Body Control

Gams, A., Mason, S., Ude, A., Schaal, S., Righetti, L.

In Hua, IEEE, Beijing, China, November 2018 (inproceedings)

Abstract
In task-based inverse dynamics control, reference accelerations used to follow a desired plan can be broken down into feedforward and feedback trajectories. The feedback term accounts for tracking errors that are caused from inaccurate dynamic models or external disturbances. On underactuated, free-floating robots, such as humanoids, high feedback terms can be used to improve tracking accuracy; however, this can lead to very stiff behavior or poor tracking accuracy due to limited control bandwidth. In this paper, we show how to reduce the required contribution of the feedback controller by incorporating learned task-space reference accelerations. Thus, we i) improve the execution of the given specific task, and ii) offer the means to reduce feedback gains, providing for greater compliance of the system. With a systematic approach we also reduce heuristic tuning of the model parameters and feedback gains, often present in real-world experiments. In contrast to learning task-specific joint-torques, which might produce a similar effect but can lead to poor generalization, our approach directly learns the task-space dynamics of the center of mass of a humanoid robot. Simulated and real-world results on the lower part of the Sarcos Hermes humanoid robot demonstrate the applicability of the approach.

am mg

link (url) [BibTex]

link (url) [BibTex]


no image
An MPC Walking Framework With External Contact Forces

Mason, S., Rotella, N., Schaal, S., Righetti, L.

In 2018 IEEE International Conference on Robotics and Automation (ICRA), pages: 1785-1790, IEEE, Brisbane, Australia, May 2018 (inproceedings)

Abstract
In this work, we present an extension to a linear Model Predictive Control (MPC) scheme that plans external contact forces for the robot when given multiple contact locations and their corresponding friction cone. To this end, we set up a two-step optimization problem. In the first optimization, we compute the Center of Mass (CoM) trajectory, foot step locations, and introduce slack variables to account for violating the imposed constraints on the Zero Moment Point (ZMP). We then use the slack variables to trigger the second optimization, in which we calculate the optimal external force that compensates for the ZMP tracking error. This optimization considers multiple contacts positions within the environment by formulating the problem as a Mixed Integer Quadratic Program (MIQP) that can be solved at a speed between 100-300 Hz. Once contact is created, the MIQP reduces to a single Quadratic Program (QP) that can be solved in real-time ({\textless}; 1kHz). Simulations show that the presented walking control scheme can withstand disturbances 2-3× larger with the additional force provided by a hand contact.

am mg

link (url) DOI [BibTex]

link (url) DOI [BibTex]

2009


no image
Path integral-based stochastic optimal control for rigid body dynamics

Theodorou, E. A., Buchli, J., Schaal, S.

In Adaptive Dynamic Programming and Reinforcement Learning, 2009. ADPRL ’09. IEEE Symposium on, pages: 219-225, 2009, clmc (inproceedings)

Abstract
Recent advances on path integral stochastic optimal control [1],[2] provide new insights in the optimal control of nonlinear stochastic systems which are linear in the controls, with state independent and time invariant control transition matrix. Under these assumptions, the Hamilton-Jacobi-Bellman (HJB) equation is formulated and linearized with the use of the logarithmic transformation of the optimal value function. The resulting HJB is a linear second order partial differential equation which is solved by an approximation based on the Feynman-Kac formula [3]. In this work we review the theory of path integral control and derive the linearized HJB equation for systems with state dependent control transition matrix. In addition we derive the path integral formulation for the general class of systems with state dimensionality that is higher than the dimensionality of the controls. Furthermore, by means of a modified inverse dynamics controller, we apply path integral stochastic optimal control over the new control space. Simulations illustrate the theoretical results. Future developments and extensions are discussed.

am

link (url) [BibTex]

2009


link (url) [BibTex]


no image
Learning locomotion over rough terrain using terrain templates

Kalakrishnan, M., Buchli, J., Pastor, P., Schaal, S.

In Intelligent Robots and Systems, 2009. IROS 2009. IEEE/RSJ International Conference on, pages: 167-172, 2009, clmc (inproceedings)

Abstract
We address the problem of foothold selection in robotic legged locomotion over very rough terrain. The difficulty of the problem we address here is comparable to that of human rock-climbing, where foot/hand-hold selection is one of the most critical aspects. Previous work in this domain typically involves defining a reward function over footholds as a weighted linear combination of terrain features. However, a significant amount of effort needs to be spent in designing these features in order to model more complex decision functions, and hand-tuning their weights is not a trivial task. We propose the use of terrain templates, which are discretized height maps of the terrain under a foothold on different length scales, as an alternative to manually designed features. We describe an algorithm that can simultaneously learn a small set of templates and a foothold ranking function using these templates, from expert-demonstrated footholds. Using the LittleDog quadruped robot, we experimentally show that the use of terrain templates can produce complex ranking functions with higher performance than standard terrain features, and improved generalization to unseen terrain.

am

link (url) Project Page [BibTex]

link (url) Project Page [BibTex]