Header logo is


2019


no image
DeepOBS: A Deep Learning Optimizer Benchmark Suite

Schneider, F., Balles, L., Hennig, P.

7th International Conference on Learning Representations (ICLR), May 2019 (conference)

ei pn

link (url) [BibTex]

2019


link (url) [BibTex]


no image
Fast and Robust Shortest Paths on Manifolds Learned from Data

Arvanitidis, G., Hauberg, S., Hennig, P., Schober, M.

Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS), 89, pages: 1506-1515, (Editors: Kamalika Chaudhuri and Masashi Sugiyama), PMLR, April 2019 (conference)

ei pn

PDF link (url) [BibTex]

PDF link (url) [BibTex]


Thumb xl 543 figure0 1
Active Probabilistic Inference on Matrices for Pre-Conditioning in Stochastic Optimization

de Roos, F., Hennig, P.

Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS), 89, pages: 1448-1457, (Editors: Kamalika Chaudhuri and Masashi Sugiyama), PMLR, April 2019 (conference)

Abstract
Pre-conditioning is a well-known concept that can significantly improve the convergence of optimization algorithms. For noise-free problems, where good pre-conditioners are not known a priori, iterative linear algebra methods offer one way to efficiently construct them. For the stochastic optimization problems that dominate contemporary machine learning, however, this approach is not readily available. We propose an iterative algorithm inspired by classic iterative linear solvers that uses a probabilistic model to actively infer a pre-conditioner in situations where Hessian-projections can only be constructed with strong Gaussian noise. The algorithm is empirically demonstrated to efficiently construct effective pre-conditioners for stochastic gradient descent and its variants. Experiments on problems of comparably low dimensionality show improved convergence. In very high-dimensional problems, such as those encountered in deep learning, the pre-conditioner effectively becomes an automatic learning-rate adaptation scheme, which we also empirically show to work well.

pn ei

PDF link (url) [BibTex]

PDF link (url) [BibTex]


Thumb xl linear solvers stco figure7 1
Probabilistic Linear Solvers: A Unifying View

Bartels, S., Cockayne, J., Ipsen, I. C. F., Hennig, P.

Statistics and Computing, 2019 (article) Accepted

pn

link (url) [BibTex]

link (url) [BibTex]


Thumb xl screen shot 2018 10 09 at 11.42.49
Active Uncertainty Calibration in Bayesian ODE Solvers

Kersting, H., Hennig, P.

Proceedings of the 32nd Conference on Uncertainty in Artificial Intelligence (UAI), pages: 309-318, (Editors: Ihler, A. and Janzing, D.), AUAI Press, June 2016 (conference)

Abstract
There is resurging interest, in statistics and machine learning, in solvers for ordinary differential equations (ODEs) that return probability measures instead of point estimates. Recently, Conrad et al.~introduced a sampling-based class of methods that are `well-calibrated' in a specific sense. But the computational cost of these methods is significantly above that of classic methods. On the other hand, Schober et al.~pointed out a precise connection between classic Runge-Kutta ODE solvers and Gaussian filters, which gives only a rough probabilistic calibration, but at negligible cost overhead. By formulating the solution of ODEs as approximate inference in linear Gaussian SDEs, we investigate a range of probabilistic ODE solvers, that bridge the trade-off between computational cost and probabilistic calibration, and identify the inaccurate gradient measurement as the crucial source of uncertainty. We propose the novel filtering-based method Bayesian Quadrature filtering (BQF) which uses Bayesian quadrature to actively learn the imprecision in the gradient measurement by collecting multiple gradient evaluations.

ei pn

link (url) Project Page Project Page [BibTex]

link (url) Project Page Project Page [BibTex]


Thumb xl screen shot 2016 01 19 at 14.48.37
Automatic LQR Tuning Based on Gaussian Process Global Optimization

Marco, A., Hennig, P., Bohg, J., Schaal, S., Trimpe, S.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages: 270-277, IEEE, IEEE International Conference on Robotics and Automation, May 2016 (inproceedings)

Abstract
This paper proposes an automatic controller tuning framework based on linear optimal control combined with Bayesian optimization. With this framework, an initial set of controller gains is automatically improved according to a pre-defined performance objective evaluated from experimental data. The underlying Bayesian optimization algorithm is Entropy Search, which represents the latent objective as a Gaussian process and constructs an explicit belief over the location of the objective minimum. This is used to maximize the information gain from each experimental evaluation. Thus, this framework shall yield improved controllers with fewer evaluations compared to alternative approaches. A seven-degree- of-freedom robot arm balancing an inverted pole is used as the experimental demonstrator. Results of a two- and four- dimensional tuning problems highlight the method’s potential for automatic controller tuning on robotic platforms.

am ics pn

Video PDF DOI Project Page [BibTex]

Video PDF DOI Project Page [BibTex]


no image
Batch Bayesian Optimization via Local Penalization

González, J., Dai, Z., Hennig, P., Lawrence, N.

Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS), 51, pages: 648-657, JMLR Workshop and Conference Proceedings, (Editors: Gretton, A. and Robert, C. C.), May 2016 (conference)

ei pn

link (url) Project Page [BibTex]

link (url) Project Page [BibTex]


Thumb xl untitled
Probabilistic Approximate Least-Squares

Bartels, S., Hennig, P.

Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS), 51, pages: 676-684, JMLR Workshop and Conference Proceedings, (Editors: Gretton, A. and Robert, C. C. ), May 2016 (conference)

Abstract
Least-squares and kernel-ridge / Gaussian process regression are among the foundational algorithms of statistics and machine learning. Famously, the worst-case cost of exact nonparametric regression grows cubically with the data-set size; but a growing number of approximations have been developed that estimate good solutions at lower cost. These algorithms typically return point estimators, without measures of uncertainty. Leveraging recent results casting elementary linear algebra operations as probabilistic inference, we propose a new approximate method for nonparametric least-squares that affords a probabilistic uncertainty estimate over the error between the approximate and exact least-squares solution (this is not the same as the posterior variance of the associated Gaussian process regressor). This allows estimating the error of the least-squares solution on a subset of the data relative to the full-data solution. The uncertainty can be used to control the computational effort invested in the approximation. Our algorithm has linear cost in the data-set size, and a simple formal form, so that it can be implemented with a few lines of code in programming languages with linear algebra functionality.

ei pn

link (url) Project Page Project Page [BibTex]

link (url) Project Page Project Page [BibTex]


Thumb xl cloud tracking
Gaussian Process-Based Predictive Control for Periodic Error Correction

Klenske, E. D., Zeilinger, M., Schölkopf, B., Hennig, P.

IEEE Transactions on Control Systems Technology , 24(1):110-121, 2016 (article)

ei pn

PDF DOI [BibTex]

PDF DOI [BibTex]


Thumb xl dual control sampled b
Dual Control for Approximate Bayesian Reinforcement Learning

Klenske, E. D., Hennig, P.

Journal of Machine Learning Research, 17(127):1-30, 2016 (article)

ei pn

PDF link (url) [BibTex]

PDF link (url) [BibTex]

2013


no image
Camera-specific Image Denoising

Schober, M.

Eberhard Karls Universität Tübingen, Germany, October 2013 (diplomathesis)

ei pn

PDF [BibTex]

2013


PDF [BibTex]


Thumb xl thumb hennigk2012 2
Quasi-Newton Methods: A New Direction

Hennig, P., Kiefel, M.

Journal of Machine Learning Research, 14(1):843-865, March 2013 (article)

Abstract
Four decades after their invention, quasi-Newton methods are still state of the art in unconstrained numerical optimization. Although not usually interpreted thus, these are learning algorithms that fit a local quadratic approximation to the objective function. We show that many, including the most popular, quasi-Newton methods can be interpreted as approximations of Bayesian linear regression under varying prior assumptions. This new notion elucidates some shortcomings of classical algorithms, and lights the way to a novel nonparametric quasi-Newton method, which is able to make more efficient use of available information at computational cost similar to its predecessors.

ei ps pn

website+code pdf link (url) [BibTex]

website+code pdf link (url) [BibTex]


no image
The Randomized Dependence Coefficient

Lopez-Paz, D., Hennig, P., Schölkopf, B.

In Advances in Neural Information Processing Systems 26, pages: 1-9, (Editors: C.J.C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K.Q. Weinberger), 27th Annual Conference on Neural Information Processing Systems (NIPS), 2013 (inproceedings)

ei pn

PDF [BibTex]

PDF [BibTex]


no image
Fast Probabilistic Optimization from Noisy Gradients

Hennig, P.

In Proceedings of The 30th International Conference on Machine Learning, JMLR W&CP 28(1), pages: 62–70, (Editors: S Dasgupta and D McAllester), ICML, 2013 (inproceedings)

ei pn

PDF [BibTex]

PDF [BibTex]


Thumb xl error vs dt fine
Nonparametric dynamics estimation for time periodic systems

Klenske, E., Zeilinger, M., Schölkopf, B., Hennig, P.

In Proceedings of the 51st Annual Allerton Conference on Communication, Control, and Computing, pages: 486-493 , 2013 (inproceedings)

ei pn

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
The Randomized Dependence Coefficient

Lopez-Paz, D., Hennig, P., Schölkopf, B.

Neural Information Processing Systems (NIPS), 2013 (poster)

ei pn

PDF [BibTex]

PDF [BibTex]


no image
Analytical probabilistic modeling for radiation therapy treatment planning

Bangert, M., Hennig, P., Oelfke, U.

Physics in Medicine and Biology, 58(16):5401-5419, 2013 (article)

ei pn

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Analytical probabilistic proton dose calculation and range uncertainties

Bangert, M., Hennig, P., Oelfke, U.

In 17th International Conference on the Use of Computers in Radiation Therapy, pages: 6-11, (Editors: A. Haworth and T. Kron), ICCR, 2013 (inproceedings)

ei pn

[BibTex]

[BibTex]


no image
Animating Samples from Gaussian Distributions

Hennig, P.

(8), Max Planck Institute for Intelligent Systems, Tübingen, Germany, 2013 (techreport)

ei pn

PDF [BibTex]

PDF [BibTex]

2012


Thumb xl thumb hennigk2012
Quasi-Newton Methods: A New Direction

Hennig, P., Kiefel, M.

In Proceedings of the 29th International Conference on Machine Learning, pages: 25-32, ICML ’12, (Editors: John Langford and Joelle Pineau), Omnipress, New York, NY, USA, ICML, July 2012 (inproceedings)

Abstract
Four decades after their invention, quasi- Newton methods are still state of the art in unconstrained numerical optimization. Although not usually interpreted thus, these are learning algorithms that fit a local quadratic approximation to the objective function. We show that many, including the most popular, quasi-Newton methods can be interpreted as approximations of Bayesian linear regression under varying prior assumptions. This new notion elucidates some shortcomings of classical algorithms, and lights the way to a novel nonparametric quasi-Newton method, which is able to make more efficient use of available information at computational cost similar to its predecessors.

ei ps pn

website+code pdf link (url) [BibTex]

2012


website+code pdf link (url) [BibTex]


Thumb xl screen shot 2017 09 21 at 00.54.33
Entropy Search for Information-Efficient Global Optimization

Hennig, P., Schuler, C.

Journal of Machine Learning Research, 13, pages: 1809-1837, -, June 2012 (article)

Abstract
Contemporary global optimization algorithms are based on local measures of utility, rather than a probability measure over location and value of the optimum. They thus attempt to collect low function values, not to learn about the optimum. The reason for the absence of probabilistic global optimizers is that the corresponding inference problem is intractable in several ways. This paper develops desiderata for probabilistic optimization algorithms, then presents a concrete algorithm which addresses each of the computational intractabilities with a sequence of approximations and explicitly adresses the decision problem of maximizing information gain from each evaluation.

ei pn

PDF Web Project Page [BibTex]

PDF Web Project Page [BibTex]


no image
Learning Tracking Control with Forward Models

Bócsi, B., Hennig, P., Csató, L., Peters, J.

In pages: 259 -264, IEEE International Conference on Robotics and Automation (ICRA), May 2012 (inproceedings)

Abstract
Performing task-space tracking control on redundant robot manipulators is a difficult problem. When the physical model of the robot is too complex or not available, standard methods fail and machine learning algorithms can have advantages. We propose an adaptive learning algorithm for tracking control of underactuated or non-rigid robots where the physical model of the robot is unavailable. The control method is based on the fact that forward models are relatively straightforward to learn and local inversions can be obtained via local optimization. We use sparse online Gaussian process inference to obtain a flexible probabilistic forward model and second order optimization to find the inverse mapping. Physical experiments indicate that this approach can outperform state-of-the-art tracking control algorithms in this context.

ei pn

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Approximate Gaussian Integration using Expectation Propagation

Cunningham, J., Hennig, P., Lacoste-Julien, S.

In pages: 1-11, -, January 2012 (inproceedings) Submitted

Abstract
While Gaussian probability densities are omnipresent in applied mathematics, Gaussian cumulative probabilities are hard to calculate in any but the univariate case. We offer here an empirical study of the utility of Expectation Propagation (EP) as an approximate integration method for this problem. For rectangular integration regions, the approximation is highly accurate. We also extend the derivations to the more general case of polyhedral integration regions. However, we find that in this polyhedral case, EP's answer, though often accurate, can be almost arbitrarily wrong. These unexpected results elucidate an interesting and non-obvious feature of EP not yet studied in detail, both for the problem of Gaussian probabilities and for EP more generally.

ei pn

Web [BibTex]

Web [BibTex]


no image
Kernel Topic Models

Hennig, P., Stern, D., Herbrich, R., Graepel, T.

In Fifteenth International Conference on Artificial Intelligence and Statistics, 22, pages: 511-519, JMLR Proceedings, (Editors: Lawrence, N. D. and Girolami, M.), JMLR.org, AISTATS , 2012 (inproceedings)

Abstract
Latent Dirichlet Allocation models discrete data as a mixture of discrete distributions, using Dirichlet beliefs over the mixture weights. We study a variation of this concept, in which the documents' mixture weight beliefs are replaced with squashed Gaussian distributions. This allows documents to be associated with elements of a Hilbert space, admitting kernel topic models (KTM), modelling temporal, spatial, hierarchical, social and other structure between documents. The main challenge is efficient approximate inference on the latent Gaussian. We present an approximate algorithm cast around a Laplace approximation in a transformed basis. The KTM can also be interpreted as a type of Gaussian process latent variable model, or as a topic model conditional on document features, uncovering links between earlier work in these areas.

ei pn

PDF Web [BibTex]

PDF Web [BibTex]