2019


Limitations of the empirical Fisher approximation for natural gradient descent

Kunstner, F., Hennig, P., Balles, L.

Advances in Neural Information Processing Systems 32, pages: 4158-4169, (Editors: H. Wallach and H. Larochelle and A. Beygelzimer and F. d’Alché-Buc and E. Fox and R. Garnett), Curran Associates, Inc., 33rd Annual Conference on Neural Information Processing Systems, December 2019 (conference)

ei pn

link (url) [BibTex]

Convergence Guarantees for Adaptive Bayesian Quadrature Methods

Kanagawa, M., Hennig, P.

Advances in Neural Information Processing Systems 32, pages: 6234-6245, (Editors: H. Wallach and H. Larochelle and A. Beygelzimer and F. d’Alché-Buc and E. Fox and R. Garnett), Curran Associates, Inc., 33rd Annual Conference on Neural Information Processing Systems, December 2019 (conference)

ei pn

link (url) [BibTex]

Learning to Explore in Motion and Interaction Tasks

Bogdanovic, M., Righetti, L.

Proceedings 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages: 2686-2692, IEEE, 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), November 2019, ISSN: 2153-0866 (conference)

Abstract
Model-free reinforcement learning suffers from the high sampling complexity inherent to robotic manipulation or locomotion tasks. Most successful approaches typically use random sampling strategies, which leads to slow policy convergence. In this paper we present a novel approach for efficient exploration that leverages previously learned tasks. We exploit the fact that the same system is used across many tasks and build a generative model for exploration based on data from previously solved tasks to improve learning new tasks. The approach also enables continuous learning of improved exploration strategies as novel tasks are learned. Extensive simulations on a robot manipulator performing a variety of motion and contact interaction tasks demonstrate the capabilities of the approach. In particular, our experiments suggest that the exploration strategy can more than double learning speed, especially when rewards are sparse. Moreover, the algorithm is robust to task variations and parameter tuning, making it beneficial for complex robotic problems.

mg

DOI [BibTex]

Robust Humanoid Locomotion Using Trajectory Optimization and Sample-Efficient Learning

Yeganegi, M. H., Khadiv, M., Moosavian, S. A. A., Zhu, J., Prete, A. D., Righetti, L.

Proceedings International Conference on Humanoid Robots, IEEE, 2019 IEEE-RAS International Conference on Humanoid Robots, October 2019 (conference)

Abstract
Trajectory optimization (TO) is one of the most powerful tools for generating feasible motions for humanoid robots. However, including uncertainties and stochasticity in the TO problem to generate robust motions can easily lead to intractable problems. Furthermore, since the models used in TO always have some level of abstraction, it can be hard to find a realistic set of uncertainties in the model space. In this paper we leverage a sample-efficient learning technique (Bayesian optimization) to robustify TO for humanoid locomotion. The main idea is to use data from full-body simulations to make the TO stage robust by tuning the cost weights. To this end, we split the TO problem into two phases. The first phase solves a convex optimization problem for generating center of mass (CoM) trajectories based on simplified linear dynamics. The second phase employs iterative Linear-Quadratic Gaussian (iLQG) as a whole-body controller to generate full body control inputs. Then we use Bayesian optimization to find the cost weights to use in the first phase that yield robust performance in the simulation/experiment, in the presence of different disturbances/uncertainties. The results show that the proposed approach is able to generate robust motions for different sets of disturbances and uncertainties.

mg

https://arxiv.org/abs/1907.04616 link (url) [BibTex]

DeepOBS: A Deep Learning Optimizer Benchmark Suite

Schneider, F., Balles, L., Hennig, P.

7th International Conference on Learning Representations (ICLR), ICLR, 7th International Conference on Learning Representations (ICLR), May 2019 (conference)

ei pn

link (url) [BibTex]

Efficient Humanoid Contact Planning using Learned Centroidal Dynamics Prediction

Lin, Y., Ponton, B., Righetti, L., Berenson, D.

International Conference on Robotics and Automation (ICRA), pages: 5280-5286, IEEE, May 2019 (conference)

mg

DOI [BibTex]

Leveraging Contact Forces for Learning to Grasp

Merzic, H., Bogdanovic, M., Kappler, D., Righetti, L., Bohg, J.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2019, IEEE, International Conference on Robotics and Automation, May 2019 (inproceedings)

Abstract
Grasping objects under uncertainty remains an open problem in robotics research. This uncertainty is often due to noisy or partial observations of the object pose or shape. To enable a robot to react appropriately to unforeseen effects, it is crucial that it continuously takes sensor feedback into account. While visual feedback is important for inferring a grasp pose and reaching for an object, contact feedback offers valuable information during manipulation and grasp acquisition. In this paper, we use model-free deep reinforcement learning to synthesize control policies that exploit contact sensing to generate robust grasping under uncertainty. We demonstrate our approach on a multi-fingered hand that exhibits more complex finger coordination than the commonly used two-fingered grippers. We conduct extensive experiments in order to assess the performance of the learned policies, with and without contact sensing. While it is possible to learn grasping policies without contact sensing, our results suggest that contact feedback allows for a significant improvement of grasping robustness under object pose uncertainty and for objects with a complex shape.

am mg

video arXiv [BibTex]

Fast and Robust Shortest Paths on Manifolds Learned from Data

Arvanitidis, G., Hauberg, S., Hennig, P., Schober, M.

Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS), 89, pages: 1506-1515, (Editors: Kamalika Chaudhuri and Masashi Sugiyama), PMLR, April 2019 (conference)

ei pn

PDF link (url) [BibTex]

Active Probabilistic Inference on Matrices for Pre-Conditioning in Stochastic Optimization

de Roos, F., Hennig, P.

Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS), 89, pages: 1448-1457, (Editors: Kamalika Chaudhuri and Masashi Sugiyama), PMLR, April 2019 (conference)

Abstract
Pre-conditioning is a well-known concept that can significantly improve the convergence of optimization algorithms. For noise-free problems, where good pre-conditioners are not known a priori, iterative linear algebra methods offer one way to efficiently construct them. For the stochastic optimization problems that dominate contemporary machine learning, however, this approach is not readily available. We propose an iterative algorithm inspired by classic iterative linear solvers that uses a probabilistic model to actively infer a pre-conditioner in situations where Hessian-projections can only be constructed with strong Gaussian noise. The algorithm is empirically demonstrated to efficiently construct effective pre-conditioners for stochastic gradient descent and its variants. Experiments on problems of comparably low dimensionality show improved convergence. In very high-dimensional problems, such as those encountered in deep learning, the pre-conditioner effectively becomes an automatic learning-rate adaptation scheme, which we also empirically show to work well.

ei pn

PDF link (url) [BibTex]

Active Uncertainty Calibration in Bayesian ODE Solvers

Kersting, H., Hennig, P.

Proceedings of the 32nd Conference on Uncertainty in Artificial Intelligence (UAI), pages: 309-318, (Editors: Ihler, A. and Janzing, D.), AUAI Press, June 2016 (conference)

Abstract
There is resurging interest, in statistics and machine learning, in solvers for ordinary differential equations (ODEs) that return probability measures instead of point estimates. Recently, Conrad et al. introduced a sampling-based class of methods that are 'well-calibrated' in a specific sense. But the computational cost of these methods is significantly above that of classic methods. On the other hand, Schober et al. pointed out a precise connection between classic Runge-Kutta ODE solvers and Gaussian filters, which gives only a rough probabilistic calibration, but at negligible cost overhead. By formulating the solution of ODEs as approximate inference in linear Gaussian SDEs, we investigate a range of probabilistic ODE solvers that bridge the trade-off between computational cost and probabilistic calibration, and identify the inaccurate gradient measurement as the crucial source of uncertainty. We propose the novel filtering-based method Bayesian Quadrature filtering (BQF) which uses Bayesian quadrature to actively learn the imprecision in the gradient measurement by collecting multiple gradient evaluations.

ei pn

link (url) Project Page Project Page [BibTex]

Automatic LQR Tuning Based on Gaussian Process Global Optimization

Marco, A., Hennig, P., Bohg, J., Schaal, S., Trimpe, S.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages: 270-277, IEEE, IEEE International Conference on Robotics and Automation, May 2016 (inproceedings)

Abstract
This paper proposes an automatic controller tuning framework based on linear optimal control combined with Bayesian optimization. With this framework, an initial set of controller gains is automatically improved according to a pre-defined performance objective evaluated from experimental data. The underlying Bayesian optimization algorithm is Entropy Search, which represents the latent objective as a Gaussian process and constructs an explicit belief over the location of the objective minimum. This is used to maximize the information gain from each experimental evaluation. Thus, this framework shall yield improved controllers with fewer evaluations compared to alternative approaches. A seven-degree-of-freedom robot arm balancing an inverted pole is used as the experimental demonstrator. Results of two- and four-dimensional tuning problems highlight the method’s potential for automatic controller tuning on robotic platforms.

am ics pn

Video - Automatic LQR Tuning Based on Gaussian Process Global Optimization - ICRA 2016 Video - Automatic Controller Tuning on a Two-legged Robot PDF DOI Project Page [BibTex]

Batch Bayesian Optimization via Local Penalization

González, J., Dai, Z., Hennig, P., Lawrence, N.

Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS), 51, pages: 648-657, JMLR Workshop and Conference Proceedings, (Editors: Gretton, A. and Robert, C. C.), May 2016 (conference)

ei pn

link (url) Project Page [BibTex]

Probabilistic Approximate Least-Squares

Bartels, S., Hennig, P.

Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS), 51, pages: 676-684, JMLR Workshop and Conference Proceedings, (Editors: Gretton, A. and Robert, C. C.), May 2016 (conference)

Abstract
Least-squares and kernel-ridge / Gaussian process regression are among the foundational algorithms of statistics and machine learning. Famously, the worst-case cost of exact nonparametric regression grows cubically with the data-set size; but a growing number of approximations have been developed that estimate good solutions at lower cost. These algorithms typically return point estimators, without measures of uncertainty. Leveraging recent results casting elementary linear algebra operations as probabilistic inference, we propose a new approximate method for nonparametric least-squares that affords a probabilistic uncertainty estimate over the error between the approximate and exact least-squares solution (this is not the same as the posterior variance of the associated Gaussian process regressor). This allows estimating the error of the least-squares solution on a subset of the data relative to the full-data solution. The uncertainty can be used to control the computational effort invested in the approximation. Our algorithm has linear cost in the data-set size, and a simple formal form, so that it can be implemented with a few lines of code in programming languages with linear algebra functionality.

ei pn

link (url) Project Page Project Page [BibTex]

On the Effects of Measurement Uncertainty in Optimal Control of Contact Interactions

Ponton, B., Schaal, S., Righetti, L.

In The 12th International Workshop on the Algorithmic Foundations of Robotics WAFR, Berkeley, USA, 2016 (inproceedings)

Abstract
Stochastic Optimal Control (SOC) typically considers noise only in the process model, i.e. unknown disturbances. However, in many robotic applications involving interaction with the environment, such as locomotion and manipulation, uncertainty also comes from lack of precise knowledge of the world, which is not an actual disturbance. We analyze the effects of also considering noise in the measurement model, by developing a SOC algorithm based on risk-sensitive control, that includes the dynamics of an observer in such a way that the control law explicitly depends on the current measurement uncertainty. In simulation results on a simple 2D manipulator, we have observed that measurement uncertainty leads to low impedance behaviors, a result in contrast with the effects of process noise that creates stiff behaviors. This suggests that taking into account measurement uncertainty could be a potentially very interesting way to approach problems involving uncertain contact interactions.

am mg

link (url) [BibTex]

A Convex Model of Momentum Dynamics for Multi-Contact Motion Generation

Ponton, B., Herzog, A., Schaal, S., Righetti, L.

In 2016 IEEE-RAS 16th International Conference on Humanoid Robots Humanoids, pages: 842-849, IEEE, Cancun, Mexico, 2016 (inproceedings)

Abstract
Linear models for control and motion generation of humanoid robots have received significant attention in the past years, not only due to their well known theoretical guarantees, but also because of practical computational advantages. However, to tackle more challenging tasks and scenarios such as locomotion on uneven terrain, a more expressive model is required. In this paper, we are interested in contact interaction-centered motion optimization based on the momentum dynamics model. This model is non-linear and non-convex; however, we find a relaxation of the problem that allows us to formulate it as a single convex quadratically-constrained quadratic program (QCQP) that can be very efficiently optimized and is useful for multi-contact planning. This convex model is then coupled to the optimization of end-effector contact locations using a mixed integer program, which can also be efficiently solved. This becomes relevant e.g. to recover from external pushes, where a predefined stepping plan is likely to fail and an online adaptation of the contact location is needed. The performance of our algorithm is demonstrated in several multi-contact scenarios for a humanoid robot.

am mg

link (url) DOI [BibTex]

Inertial Sensor-Based Humanoid Joint State Estimation

Rotella, N., Mason, S., Schaal, S., Righetti, L.

In 2016 IEEE International Conference on Robotics and Automation (ICRA), pages: 1825-1831, IEEE, Stockholm, Sweden, 2016 (inproceedings)

Abstract
This work presents methods for the determination of a humanoid robot's joint velocities and accelerations directly from link-mounted Inertial Measurement Units (IMUs) each containing a three-axis gyroscope and a three-axis accelerometer. No information about the global pose of the floating base or its links is required and precise knowledge of the link IMU poses is not necessary due to presented calibration routines. Additionally, a filter is introduced to fuse gyroscope angular velocities with joint position measurements and compensate the computed joint velocities for time-varying gyroscope biases. The resulting joint velocities are subject to less noise and delay than filtered velocities computed from numerical differentiation of joint potentiometer signals, leading to superior performance in joint feedback control as demonstrated in experiments performed on a SARCOS hydraulic humanoid.

am mg

link (url) DOI [BibTex]

Stepping Stabilization Using a Combination of DCM Tracking and Step Adjustment

Khadiv, M., Kleff, S., Herzog, A., Moosavian, S. A. A., Schaal, S., Righetti, L.

In 2016 4th International Conference on Robotics and Mechatronics (ICROM), pages: 130-135, IEEE, Teheran, Iran, 2016 (inproceedings)

Abstract
In this paper, a method for stabilizing biped robots stepping by a combination of Divergent Component of Motion (DCM) tracking and step adjustment is proposed. In this method, the DCM trajectory is generated, consistent with the predefined footprints. Furthermore, a swing foot trajectory modification strategy is proposed to adapt the landing point, using DCM measurement. In order to apply the generated trajectories to the full robot, a Hierarchical Inverse Dynamics (HID) is employed. The HID enables us to use different combinations of the DCM tracking and step adjustment for stabilizing different biped robots. Simulation experiments on two scenarios for two different simulated robots, one with active ankles and the other with passive ankles, are carried out. Simulation results demonstrate the effectiveness of the proposed method for robots with both active and passive ankles.

am mg

link (url) DOI [BibTex]

Structured contact force optimization for kino-dynamic motion generation

Herzog, A., Schaal, S., Righetti, L.

In 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages: 2703-2710, IEEE, Daejeon, South Korea, 2016 (inproceedings)

Abstract
Optimal control approaches in combination with trajectory optimization have recently proven to be a promising control strategy for legged robots. Computationally efficient and robust algorithms were derived using simplified models of the contact interaction between robot and environment such as the linear inverted pendulum model (LIPM). However, as humanoid robots enter more complex environments, less restrictive models become increasingly important. As we leave the regime of linear models, we need to build dedicated solvers that can compute interaction forces together with consistent kinematic plans for the whole-body. In this paper, we address the problem of planning robot motion and interaction forces for legged robots given predefined contact surfaces. The motion generation process is decomposed into two alternating parts computing force and motion plans in coherence. We focus on the properties of the momentum computation leading to sparse optimal control formulations to be exploited by a dedicated solver. In our experiments, we demonstrate that our motion generation algorithm computes consistent contact forces and joint trajectories for our humanoid robot. We also demonstrate the favorable time complexity due to our formulation and composition of the momentum equations.

am mg

link (url) DOI [BibTex]

Balancing and Walking Using Full Dynamics LQR Control With Contact Constraints

Mason, S., Rotella, N., Schaal, S., Righetti, L.

In 2016 IEEE-RAS 16th International Conference on Humanoid Robots (Humanoids), pages: 63-68, IEEE, Cancun, Mexico, 2016 (inproceedings)

Abstract
Torque control algorithms which consider robot dynamics and contact constraints are important for creating dynamic behaviors for humanoids. As computational power increases, algorithms tend to also increase in complexity. However, it is not clear how much complexity is really required to create controllers which exhibit good performance. In this paper, we study the capabilities of a simple approach based on contact-consistent LQR controllers designed around key poses to control various tasks on a humanoid robot. We present extensive experimental results on a hydraulic, torque-controlled humanoid performing balancing and stepping tasks. This feedback control approach captures the necessary synergies between the DoFs of the robot to guarantee good control performance. We show that for the considered tasks, it is only necessary to re-linearize the dynamics of the robot at different contact configurations and that increasing the number of LQR controllers along desired trajectories does not improve performance. Our results suggest that very simple controllers can yield good performance competitive with current state of the art, but more complex, optimization-based whole-body controllers. A video of the experiments can be found at https://youtu.be/5T08CNKV1hw.

am mg

link (url) DOI [BibTex]

Step Timing Adjustment: a Step toward Generating Robust Gaits

Khadiv, M., Herzog, A., Moosavian, S. A. A., Righetti, L.

In 2016 IEEE-RAS 16th International Conference on Humanoid Robots (Humanoids), pages: 35-42, IEEE, Cancun, Mexico, 2016 (inproceedings)

Abstract
Step adjustment for humanoid robots has been shown to improve robustness in gaits. However, step duration adaptation is often neglected in control strategies. In this paper, we propose an approach that combines both step location and timing adjustment for generating robust gaits. In this approach, step location and step timing are decided based on feedback from the current state of the robot. The proposed approach is comprised of two stages. In the first stage, the nominal step location and step duration for the next step or a previewed number of steps are specified. In this stage, which is done at the start of each step, the main goal is to specify the best step length and step duration for a desired walking speed. The second stage deals with finding the best landing point and landing time of the swing foot at each control cycle. In this stage, stability of the gaits is preserved by specifying a desired offset between the swing foot landing point and the Divergent Component of Motion (DCM) at the end of the current step. After specifying the landing point of the swing foot at a desired time, the swing foot trajectory is regenerated at each control cycle to realize desired landing properties. Simulation on different scenarios shows the robustness of the generated gaits from our proposed approach compared to the case where no timing adjustment is employed.

mg

link (url) DOI [BibTex]

2012


Quasi-Newton Methods: A New Direction

Hennig, P., Kiefel, M.

In Proceedings of the 29th International Conference on Machine Learning, pages: 25-32, ICML ’12, (Editors: John Langford and Joelle Pineau), Omnipress, New York, NY, USA, ICML, July 2012 (inproceedings)

Abstract
Four decades after their invention, quasi-Newton methods are still state of the art in unconstrained numerical optimization. Although not usually interpreted thus, these are learning algorithms that fit a local quadratic approximation to the objective function. We show that many, including the most popular, quasi-Newton methods can be interpreted as approximations of Bayesian linear regression under varying prior assumptions. This new notion elucidates some shortcomings of classical algorithms, and lights the way to a novel nonparametric quasi-Newton method, which is able to make more efficient use of available information at computational cost similar to its predecessors.

ei ps pn

website+code pdf link (url) [BibTex]

Learning Tracking Control with Forward Models

Bócsi, B., Hennig, P., Csató, L., Peters, J.

In pages: 259-264, IEEE International Conference on Robotics and Automation (ICRA), May 2012 (inproceedings)

Abstract
Performing task-space tracking control on redundant robot manipulators is a difficult problem. When the physical model of the robot is too complex or not available, standard methods fail and machine learning algorithms can have advantages. We propose an adaptive learning algorithm for tracking control of underactuated or non-rigid robots where the physical model of the robot is unavailable. The control method is based on the fact that forward models are relatively straightforward to learn and local inversions can be obtained via local optimization. We use sparse online Gaussian process inference to obtain a flexible probabilistic forward model and second order optimization to find the inverse mapping. Physical experiments indicate that this approach can outperform state-of-the-art tracking control algorithms in this context.

ei pn

PDF Web DOI [BibTex]

Approximate Gaussian Integration using Expectation Propagation

Cunningham, J., Hennig, P., Lacoste-Julien, S.

In pages: 1-11, -, January 2012 (inproceedings) Submitted

Abstract
While Gaussian probability densities are omnipresent in applied mathematics, Gaussian cumulative probabilities are hard to calculate in any but the univariate case. We offer here an empirical study of the utility of Expectation Propagation (EP) as an approximate integration method for this problem. For rectangular integration regions, the approximation is highly accurate. We also extend the derivations to the more general case of polyhedral integration regions. However, we find that in this polyhedral case, EP's answer, though often accurate, can be almost arbitrarily wrong. These unexpected results elucidate an interesting and non-obvious feature of EP not yet studied in detail, both for the problem of Gaussian probabilities and for EP more generally.

ei pn

Web [BibTex]

Kernel Topic Models

Hennig, P., Stern, D., Herbrich, R., Graepel, T.

In Fifteenth International Conference on Artificial Intelligence and Statistics, 22, pages: 511-519, JMLR Proceedings, (Editors: Lawrence, N. D. and Girolami, M.), JMLR.org, AISTATS, 2012 (inproceedings)

Abstract
Latent Dirichlet Allocation models discrete data as a mixture of discrete distributions, using Dirichlet beliefs over the mixture weights. We study a variation of this concept, in which the documents' mixture weight beliefs are replaced with squashed Gaussian distributions. This allows documents to be associated with elements of a Hilbert space, admitting kernel topic models (KTM), modelling temporal, spatial, hierarchical, social and other structure between documents. The main challenge is efficient approximate inference on the latent Gaussian. We present an approximate algorithm cast around a Laplace approximation in a transformed basis. The KTM can also be interpreted as a type of Gaussian process latent variable model, or as a topic model conditional on document features, uncovering links between earlier work in these areas.

ei pn

PDF Web [BibTex]

Encoding of Periodic and their Transient Motions by a Single Dynamic Movement Primitive

Ernesti, J., Righetti, L., Do, M., Asfour, T., Schaal, S.

In 2012 12th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2012), pages: 57-64, IEEE, Osaka, Japan, November 2012 (inproceedings)

am mg

link (url) DOI [BibTex]

Learning Force Control Policies for Compliant Robotic Manipulation

Kalakrishnan, M., Righetti, L., Pastor, P., Schaal, S.

In ICML’12 Proceedings of the 29th International Conference on Machine Learning, pages: 49-50, Edinburgh, Scotland, 2012 (inproceedings)

am mg

[BibTex]

Quadratic programming for inverse dynamics with optimal distribution of contact forces

Righetti, L., Schaal, S.

In 2012 12th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2012), pages: 538-543, IEEE, Osaka, Japan, November 2012 (inproceedings)

Abstract
In this contribution we propose an inverse dynamics controller for a humanoid robot that exploits torque redundancy to minimize any combination of linear and quadratic costs in the contact forces and the commands. In addition the controller satisfies linear equality and inequality constraints in the contact forces and the commands such as torque limits, unilateral contacts or friction cone limits. The originality of our approach resides in the formulation of the problem as a quadratic program where we only need to solve for the control commands and where the contact forces are optimized implicitly. Furthermore, we do not need a structured representation of the dynamics of the robot (i.e. an explicit computation of the inertia matrix). This is in contrast with existing methods based on quadratic programs. The controller is then robust to uncertainty in the estimation of the dynamics model and the optimization is fast enough to be implemented in high bandwidth torque control loops that are increasingly available on humanoid platforms. We demonstrate properties of our controller with simulations of a human-sized humanoid robot.

am mg

link (url) DOI [BibTex]

Towards Associative Skill Memories

Pastor, P., Kalakrishnan, M., Righetti, L., Schaal, S.

In 2012 12th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2012), pages: 309-315, IEEE, Osaka, Japan, November 2012 (inproceedings)

Abstract
Movement primitives as a basis of movement planning and control have become a popular topic in recent years. The key idea of movement primitives is that a rather small set of stereotypical movements should suffice to create a large set of complex manipulation skills. An interesting side effect of stereotypical movement is that it also creates stereotypical sensory events, e.g., in terms of kinesthetic variables, haptic variables, or, if processed appropriately, visual variables. Thus, a movement primitive executed towards a particular object in the environment will associate a large number of sensory variables that are typical for this manipulation skill. These associations can be used to increase robustness towards perturbations, and they also allow failure detection and switching towards other behaviors. We call such movement primitives augmented with sensory associations Associative Skill Memories (ASM). This paper addresses how ASMs can be acquired by imitation learning and how they can create robust manipulation skills by determining subsequent ASMs online to achieve a particular manipulation goal. Evaluations for grasping and manipulation with a Barrett WAM/Hand illustrate our approach.

am mg

link (url) DOI [BibTex]

Template-based learning of grasp selection

Herzog, A., Pastor, P., Kalakrishnan, M., Righetti, L., Asfour, T., Schaal, S.

In 2012 IEEE International Conference on Robotics and Automation, pages: 2379-2384, IEEE, Saint Paul, USA, 2012 (inproceedings)

Abstract
The ability to grasp unknown objects is an important skill for personal robots, which has been addressed by many present and past research projects, but still remains an open problem. A crucial aspect of grasping is choosing an appropriate grasp configuration, i.e. the 6D pose of the hand relative to the object and its finger configuration. Finding feasible grasp configurations for novel objects, however, is challenging because of the huge variety in shape and size of these objects. Moreover, possible configurations also depend on the specific kinematics of the robotic arm and hand in use. In this paper, we introduce a new grasp selection algorithm able to find object grasp poses based on previously demonstrated grasps. Assuming that objects with similar shapes can be grasped in a similar way, we associate each demonstrated grasp with a grasp template. The template is a local shape descriptor for a possible grasp pose and is constructed using 3D information from depth sensors. For each new object to grasp, the algorithm then finds the best grasp candidate in the library of templates. The grasp selection is also able to improve over time, using the information from previous grasp attempts to adapt the ranking of the templates. We tested the algorithm on two different platforms, the Willow Garage PR2 and the Barrett WAM arm, which have very different hands. Our results show that the algorithm is able to find good grasp configurations for a large set of objects from a relatively small set of demonstrations, and does indeed improve its performance over time.

am mg

link (url) DOI [BibTex]

Probabilistic depth image registration incorporating nonvisual information

Wüthrich, M., Pastor, P., Righetti, L., Billard, A., Schaal, S.

In 2012 IEEE International Conference on Robotics and Automation, pages: 3637-3644, IEEE, Saint Paul, USA, 2012 (inproceedings)

Abstract
In this paper, we derive a probabilistic registration algorithm for object modeling and tracking. In many robotics applications, such as manipulation tasks, nonvisual information about the movement of the object is available, which we combine with the visual information. Furthermore, we consider not only observations of the object but also space that has been observed not to be part of the object. Finally, we compute a posterior distribution over the relative alignment rather than a point estimate, as is typically done in, for example, Iterative Closest Point (ICP). To our knowledge, no existing algorithm meets these three conditions, and we thus derive a novel registration algorithm in a Bayesian framework. Experimental results suggest that the proposed methods perform favorably in comparison to PCL [1] implementations of feature mapping and ICP, especially if nonvisual information is available.

am mg

link (url) DOI [BibTex]


2011


Optimal Reinforcement Learning for Gaussian Systems

Hennig, P.

In Advances in Neural Information Processing Systems 24, pages: 325-333, (Editors: J Shawe-Taylor and RS Zemel and P Bartlett and F Pereira and KQ Weinberger), Twenty-Fifth Annual Conference on Neural Information Processing Systems (NIPS), 2011 (inproceedings)

Abstract
The exploration-exploitation trade-off is among the central challenges of reinforcement learning. The optimal Bayesian solution is intractable in general. This paper studies to what extent analytic statements about optimal learning are possible if all beliefs are Gaussian processes. A first-order approximation of learning of both loss and dynamics, for nonlinear, time-varying systems in continuous time and space, subject to a relatively weak restriction on the dynamics, is described by an infinite-dimensional partial differential equation. An approximate finite-dimensional projection gives an impression of how this result may be helpful.

ei pn

PDF Web [BibTex]

Learning Force Control Policies for Compliant Manipulation

Kalakrishnan, M., Righetti, L., Pastor, P., Schaal, S.

In 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 4639-4644, IEEE, San Francisco, USA, September 2011 (inproceedings)

Abstract
Developing robots capable of fine manipulation skills is of major importance in order to build truly assistive robots. These robots need to be compliant in their actuation and control in order to operate safely in human environments. Manipulation tasks imply complex contact interactions with the external world, and involve reasoning about the forces and torques to be applied. Planning under contact conditions is usually impractical due to computational complexity and a lack of precise dynamics models of the environment. We present an approach to acquiring manipulation skills on compliant robots through reinforcement learning. The initial position control policy for manipulation is initialized through kinesthetic demonstration. We augment this policy with a force/torque profile to be controlled in combination with the position trajectories. We use the Policy Improvement with Path Integrals (PI²) algorithm to learn these force/torque profiles by optimizing a cost function that measures task success. We demonstrate our approach on the Barrett WAM robot arm equipped with a 6-DOF force/torque sensor on two different manipulation tasks: opening a door with a lever door handle, and picking up a pen off the table. We show that the learnt force control policies allow successful, robust execution of the tasks.

am mg

link (url) DOI [BibTex]

Control of legged robots with optimal distribution of contact forces

Righetti, L., Buchli, J., Mistry, M., Schaal, S.

In 2011 11th IEEE-RAS International Conference on Humanoid Robots, pages: 318-324, IEEE, Bled, Slovenia, 2011 (inproceedings)

Abstract
The development of agile and safe humanoid robots requires controllers that guarantee both high tracking performance and compliance with the environment. More specifically, the control of contact interaction is of crucial importance for robots that will actively interact with their environment. Model-based controllers such as inverse dynamics or operational space control are very appealing as they offer both high tracking performance and compliance. However, while widely used for fully actuated systems such as manipulators, they are not yet standard controllers for legged robots such as humanoids. Indeed, such robots are fundamentally different from manipulators as they are underactuated due to their floating base and subject to switching contact constraints. In this paper we present an inverse dynamics controller for legged robots that uses torque redundancy to create an optimal distribution of contact constraints. The resulting controller is able to minimize, given a desired motion, any quadratic cost of the contact constraints at each instant of time. In particular, we show how this can be used to minimize tangential forces during locomotion, therefore significantly improving the locomotion of legged robots on difficult terrain. In addition to the theoretical result, we present simulations of a humanoid and a quadruped robot, as well as experiments on a real quadruped robot that demonstrate the advantages of the controller.

am mg

link (url) DOI [BibTex]

Learning Motion Primitive Goals for Robust Manipulation

Stulp, F., Theodorou, E., Kalakrishnan, M., Pastor, P., Righetti, L., Schaal, S.

In IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 325-331, IEEE, San Francisco, USA, September 2011 (inproceedings)

Abstract
Applying model-free reinforcement learning to manipulation remains challenging for several reasons. First, manipulation involves physical contact, which causes discontinuous cost functions. Second, in manipulation, the end-point of the movement must be chosen carefully, as it represents a grasp which must be adapted to the pose and shape of the object. Finally, there is uncertainty in the object pose, and even the most carefully planned movement may fail if the object is not at the expected position. To address these challenges we 1) present a simplified, computationally more efficient version of our model-free reinforcement learning algorithm PI²; 2) extend PI² so that it simultaneously learns shape parameters and goal parameters of motion primitives; 3) use shape and goal learning to acquire motion primitives that are robust to object pose uncertainty. We evaluate these contributions on a manipulation platform consisting of a 7-DOF arm with a 4-DOF hand.

am mg

link (url) DOI [BibTex]

Inverse Dynamics Control of Floating-Base Robots with External Constraints: a Unified View

Righetti, L., Buchli, J., Mistry, M., Schaal, S.

In 2011 IEEE International Conference on Robotics and Automation, pages: 1085-1090, IEEE, Shanghai, China, 2011 (inproceedings)

Abstract
Inverse dynamics controllers and operational space controllers have proved to be very efficient for compliant control of fully actuated robots such as fixed-base manipulators. However, legged robots such as humanoids are inherently different as they are underactuated and subject to switching external contact constraints. Recently, several methods have been proposed to create inverse dynamics controllers and operational space controllers for these robots. In an attempt to compare these different approaches, we develop a general framework for inverse dynamics control and show that these methods lead to very similar controllers. We are then able to greatly simplify recent whole-body controllers based on operational space approaches using kinematic projections, bringing them closer to efficient practical implementations. We also generalize these controllers such that they can be optimal under an arbitrary quadratic cost in the commands.

am mg

link (url) DOI [BibTex]

Operational Space Control of Constrained and Underactuated Systems

Mistry, M., Righetti, L.

In Proceedings of Robotics: Science and Systems, Los Angeles, CA, USA, June 2011 (inproceedings)

Abstract
The operational space formulation (Khatib, 1987), applied to rigid-body manipulators, describes how to decouple task-space and null-space dynamics, and write control equations that correspond only to forces at the end-effector or, alternatively, only to motion within the null-space. We would like to apply this useful theory to modern humanoids and other legged systems, for manipulation or similar tasks; however, these systems present additional challenges due to their underactuated floating bases and contact states that can dynamically change. In recent work, Sentis et al. derived controllers for such systems by implementing a task Jacobian projected into a space consistent with the supporting constraints and underactuation (the so-called "support consistent reduced Jacobian"). Here, we take a new approach to derive operational space controllers for constrained underactuated systems, by first considering the operational space dynamics within "projected inverse-dynamics" (Aghili, 2005), and subsequently resolving underactuation through the addition of dynamically consistent control torques. Doing so results in a simplified control solution compared with previous results, and importantly yields several new insights into the underlying problem of operational space control in constrained environments: 1) Underactuated systems, such as humanoid robots, cannot in general completely decouple task and null-space dynamics. However, 2) there may exist an infinite number of control solutions to realize desired task-space dynamics, and 3) these solutions involve the addition of dynamically consistent null-space motion or constraint forces (or combinations of both). In light of these findings, we present several possible control solutions, with varying optimization criteria, and highlight some of their practical consequences.

mg

link (url) DOI [BibTex]

Online movement adaptation based on previous sensor experiences

Pastor, P., Righetti, L., Kalakrishnan, M., Schaal, S.

In 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages: 365-371, IEEE, San Francisco, USA, September 2011 (inproceedings)

Abstract
Personal robots can only become widespread if they are capable of safely operating among humans. In uncertain and highly dynamic environments such as human households, robots need to be able to instantly adapt their behavior to unforeseen events. In this paper, we propose a general framework to achieve very contact-reactive motions for robotic grasping and manipulation. Associating stereotypical movements with particular tasks enables our system to use previous sensor experiences as a predictive model for subsequent task executions. We use dynamical systems, named Dynamic Movement Primitives (DMPs), to learn goal-directed behaviors from demonstration. We exploit their dynamic properties by coupling them with the measured and predicted sensor traces. This feedback loop allows for online adaptation of the movement plan. Our system can create a rich set of possible motions that account for external perturbations and perception uncertainty to generate truly robust behaviors. As an example, we present an application to grasping with the WAM robot arm.

am mg

link (url) DOI [BibTex]


2009


Modelling the interplay of central pattern generation and sensory feedback in the neuromuscular control of running

Daley, M., Righetti, L., Ijspeert, A.

In Comparative Biochemistry and Physiology - Part A: Molecular & Integrative Physiology. Annual Main Meeting for the Society for Experimental Biology, 153, Glasgow, Scotland, 2009 (inproceedings)

mg

link (url) DOI [BibTex]


2005


A dynamical systems approach to learning: a frequency-adaptive hopper robot

Buchli, J., Righetti, L., Ijspeert, A.

In Proceedings of the VIIIth European Conference on Artificial Life ECAL 2005, pages: 210-220, Springer Verlag, 2005 (inproceedings)

mg

[BibTex]

From Dynamic Hebbian Learning for Oscillators to Adaptive Central Pattern Generators

Righetti, L., Buchli, J., Ijspeert, A.

In Proceedings of 3rd International Symposium on Adaptive Motion in Animals and Machines – AMAM 2005, Verlag ISLE, Ilmenau, 2005 (inproceedings)

mg

[BibTex]


2004


Operating system support for interface virtualisation of reconfigurable coprocessors

Vuletic, M., Righetti, L., Pozzi, L., Ienne, P.

In Proceedings of the Design, Automation and Test in Europe Conference and Exhibition, pages: 748-749, IEEE, Paris, France, 2004 (inproceedings)

Abstract
Reconfigurable systems-on-chip (SoCs) consist of large field-programmable gate arrays (FPGAs) and standard processors. The reconfigurable logic can be used for application-specific coprocessors to speed up the execution of applications. Their widespread use is limited by the complexity of interfacing software applications with coprocessors. We present a virtualization layer that lowers the interfacing complexity and improves portability. The layer shifts the burden of moving data between processor and coprocessor from the programmer to the operating system (OS). A reconfigurable SoC running Linux is used to prove the concept.

mg

link (url) DOI [BibTex]
