Header logo is


2017


Thumb xl fig toyex lqr1kernel 1
On the Design of LQR Kernels for Efficient Controller Learning

Marco, A., Hennig, P., Schaal, S., Trimpe, S.

Proceedings of the 56th IEEE Annual Conference on Decision and Control (CDC), pages: 5193-5200, IEEE, IEEE Conference on Decision and Control, December 2017 (conference)

Abstract
Finding optimal feedback controllers for nonlinear dynamic systems from data is hard. Recently, Bayesian optimization (BO) has been proposed as a powerful framework for direct controller tuning from experimental trials. For selecting the next query point and finding the global optimum, BO relies on a probabilistic description of the latent objective function, typically a Gaussian process (GP). As is shown herein, GPs with a common kernel choice can, however, lead to poor learning outcomes on standard quadratic control problems. For a first-order system, we construct two kernels that specifically leverage the structure of the well-known Linear Quadratic Regulator (LQR), yet retain the flexibility of Bayesian nonparametric learning. Simulations of uncertain linear and nonlinear systems demonstrate that the LQR kernels yield superior learning performance.

am ics pn

arXiv PDF On the Design of LQR Kernels for Efficient Controller Learning - CDC presentation DOI Project Page [BibTex]

2017


arXiv PDF On the Design of LQR Kernels for Efficient Controller Learning - CDC presentation DOI Project Page [BibTex]


Thumb xl teaser
Coupling Adaptive Batch Sizes with Learning Rates

Balles, L., Romero, J., Hennig, P.

In Proceedings Conference on Uncertainty in Artificial Intelligence (UAI) 2017, pages: 410-419, (Editors: Gal Elidan and Kristian Kersting), Association for Uncertainty in Artificial Intelligence (AUAI), Conference on Uncertainty in Artificial Intelligence (UAI), August 2017 (inproceedings)

Abstract
Mini-batch stochastic gradient descent and variants thereof have become standard for large-scale empirical risk minimization like the training of neural networks. These methods are usually used with a constant batch size chosen by simple empirical inspection. The batch size significantly influences the behavior of the stochastic optimization algorithm, though, since it determines the variance of the gradient estimates. This variance also changes over the optimization process; when using a constant batch size, stability and convergence is thus often enforced by means of a (manually tuned) decreasing learning rate schedule. We propose a practical method for dynamic batch size adaptation. It estimates the variance of the stochastic gradients and adapts the batch size to decrease the variance proportionally to the value of the objective function, removing the need for the aforementioned learning rate decrease. In contrast to recent related work, our algorithm couples the batch size to the learning rate, directly reflecting the known relationship between the two. On three image classification benchmarks, our batch size adaptation yields faster optimization convergence, while simultaneously simplifying learning rate tuning. A TensorFlow implementation is available.

ps pn

Code link (url) Project Page [BibTex]

Code link (url) Project Page [BibTex]


no image
Dynamic Time-of-Flight

Schober, M., Adam, A., Yair, O., Mazor, S., Nowozin, S.

Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, pages: 170-179, IEEE, Piscataway, NJ, USA, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017 (conference)

ei pn

DOI [BibTex]

DOI [BibTex]


Thumb xl this one
Virtual vs. Real: Trading Off Simulations and Physical Experiments in Reinforcement Learning with Bayesian Optimization

Marco, A., Berkenkamp, F., Hennig, P., Schoellig, A. P., Krause, A., Schaal, S., Trimpe, S.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages: 1557-1563, IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017 (inproceedings)

am ics pn

PDF arXiv ICRA 2017 Spotlight presentation Virtual vs. Real - Video explanation DOI Project Page [BibTex]

PDF arXiv ICRA 2017 Spotlight presentation Virtual vs. Real - Video explanation DOI Project Page [BibTex]


Thumb xl screen shot 2017 07 20 at 12.31.00 pm
Fast Bayesian Optimization of Machine Learning Hyperparameters on Large Datasets

Klein, A., Falkner, S., Bartels, S., Hennig, P., Hutter, F.

Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS 2017), 54, pages: 528-536, Proceedings of Machine Learning Research, (Editors: Sign, Aarti and Zhu, Jerry), PMLR, April 2017 (conference)

pn

pdf link (url) Project Page [BibTex]

pdf link (url) Project Page [BibTex]


Thumb xl screen shot 2018 10 09 at 11.42.49
Active Uncertainty Calibration in Bayesian ODE Solvers

Kersting, H., Hennig, P.

Proceedings of the 32nd Conference on Uncertainty in Artificial Intelligence (UAI), pages: 309-318, (Editors: Ihler, A. and Janzing, D.), AUAI Press, June 2016 (conference)

Abstract
There is resurging interest, in statistics and machine learning, in solvers for ordinary differential equations (ODEs) that return probability measures instead of point estimates. Recently, Conrad et al.~introduced a sampling-based class of methods that are `well-calibrated' in a specific sense. But the computational cost of these methods is significantly above that of classic methods. On the other hand, Schober et al.~pointed out a precise connection between classic Runge-Kutta ODE solvers and Gaussian filters, which gives only a rough probabilistic calibration, but at negligible cost overhead. By formulating the solution of ODEs as approximate inference in linear Gaussian SDEs, we investigate a range of probabilistic ODE solvers, that bridge the trade-off between computational cost and probabilistic calibration, and identify the inaccurate gradient measurement as the crucial source of uncertainty. We propose the novel filtering-based method Bayesian Quadrature filtering (BQF) which uses Bayesian quadrature to actively learn the imprecision in the gradient measurement by collecting multiple gradient evaluations.

ei pn

link (url) Project Page Project Page [BibTex]

link (url) Project Page Project Page [BibTex]


Thumb xl screen shot 2016 01 19 at 14.48.37
Automatic LQR Tuning Based on Gaussian Process Global Optimization

Marco, A., Hennig, P., Bohg, J., Schaal, S., Trimpe, S.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages: 270-277, IEEE, IEEE International Conference on Robotics and Automation, May 2016 (inproceedings)

Abstract
This paper proposes an automatic controller tuning framework based on linear optimal control combined with Bayesian optimization. With this framework, an initial set of controller gains is automatically improved according to a pre-defined performance objective evaluated from experimental data. The underlying Bayesian optimization algorithm is Entropy Search, which represents the latent objective as a Gaussian process and constructs an explicit belief over the location of the objective minimum. This is used to maximize the information gain from each experimental evaluation. Thus, this framework shall yield improved controllers with fewer evaluations compared to alternative approaches. A seven-degree- of-freedom robot arm balancing an inverted pole is used as the experimental demonstrator. Results of a two- and four- dimensional tuning problems highlight the method’s potential for automatic controller tuning on robotic platforms.

am ics pn

Video PDF DOI Project Page [BibTex]

Video PDF DOI Project Page [BibTex]


no image
Batch Bayesian Optimization via Local Penalization

González, J., Dai, Z., Hennig, P., Lawrence, N.

Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS), 51, pages: 648-657, JMLR Workshop and Conference Proceedings, (Editors: Gretton, A. and Robert, C. C.), May 2016 (conference)

ei pn

link (url) Project Page [BibTex]

link (url) Project Page [BibTex]


Thumb xl untitled
Probabilistic Approximate Least-Squares

Bartels, S., Hennig, P.

Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS), 51, pages: 676-684, JMLR Workshop and Conference Proceedings, (Editors: Gretton, A. and Robert, C. C. ), May 2016 (conference)

Abstract
Least-squares and kernel-ridge / Gaussian process regression are among the foundational algorithms of statistics and machine learning. Famously, the worst-case cost of exact nonparametric regression grows cubically with the data-set size; but a growing number of approximations have been developed that estimate good solutions at lower cost. These algorithms typically return point estimators, without measures of uncertainty. Leveraging recent results casting elementary linear algebra operations as probabilistic inference, we propose a new approximate method for nonparametric least-squares that affords a probabilistic uncertainty estimate over the error between the approximate and exact least-squares solution (this is not the same as the posterior variance of the associated Gaussian process regressor). This allows estimating the error of the least-squares solution on a subset of the data relative to the full-data solution. The uncertainty can be used to control the computational effort invested in the approximation. Our algorithm has linear cost in the data-set size, and a simple formal form, so that it can be implemented with a few lines of code in programming languages with linear algebra functionality.

ei pn

link (url) Project Page Project Page [BibTex]

link (url) Project Page Project Page [BibTex]

2010


no image
Using an Infinite Von Mises-Fisher Mixture Model to Cluster Treatment Beam Directions in External Radiation Therapy

Bangert, M., Hennig, P., Oelfke, U.

In pages: 746-751 , (Editors: Draghici, S. , T.M. Khoshgoftaar, V. Palade, W. Pedrycz, M.A. Wani, X. Zhu), IEEE, Piscataway, NJ, USA, Ninth International Conference on Machine Learning and Applications (ICMLA), December 2010 (inproceedings)

Abstract
We present a method for fully automated selection of treatment beam ensembles for external radiation therapy. We reformulate the beam angle selection problem as a clustering problem of locally ideal beam orientations distributed on the unit sphere. For this purpose we construct an infinite mixture of von Mises-Fisher distributions, which is suited in general for density estimation from data on the D-dimensional sphere. Using a nonparametric Dirichlet process prior, our model infers probability distributions over both the number of clusters and their parameter values. We describe an efficient Markov chain Monte Carlo inference algorithm for posterior inference from experimental data in this model. The performance of the suggested beam angle selection framework is illustrated for one intra-cranial, pancreas, and prostate case each. The infinite von Mises-Fisher mixture model (iMFMM) creates between 18 and 32 clusters, depending on the patient anatomy. This suggests to use the iMFMM directly for beam ensemble selection in robotic radio surgery, or to generate low-dimensional input for both subsequent optimization of trajectories for arc therapy and beam ensemble selection for conventional radiation therapy.

ei pn

Web DOI [BibTex]

2010


Web DOI [BibTex]


no image
Coherent Inference on Optimal Play in Game Trees

Hennig, P., Stern, D., Graepel, T.

In JMLR Workshop and Conference Proceedings Volume 9: AISTATS 2010, pages: 326-333, (Editors: Teh, Y.W. , M. Titterington ), JMLR, Cambridge, MA, USA, Thirteenth International Conference on Artificial Intelligence and Statistics, May 2010 (inproceedings)

Abstract
Round-based games are an instance of discrete planning problems. Some of the best contemporary game tree search algorithms use random roll-outs as data. Relying on a good policy, they learn on-policy values by propagating information upwards in the tree, but not between sibling nodes. Here, we present a generative model and a corresponding approximate message passing scheme for inference on the optimal, off-policy value of nodes in smooth AND/OR trees, given random roll-outs. The crucial insight is that the distribution of values in game trees is not completely arbitrary. We define a generative model of the on-policy values using a latent score for each state, representing the value under the random roll-out policy. Inference on the values under the optimal policy separates into an inductive, pre-data step and a deductive, post-data part. Both can be solved approximately with Expectation Propagation, allowing off-policy value inference for any node in the (exponentially big) tree in linear time.

ei pn

PDF Web [BibTex]

PDF Web [BibTex]