Header logo is


2019


no image
On the positivity and magnitudes of Bayesian quadrature weights

Karvonen, T., Kanagawa, M., Särkä, S.

Statistics and Computing, 29, pages: 1317-1333, 2019 (article)

pn

DOI [BibTex]

2019


DOI [BibTex]


no image
Probabilistic solutions to ordinary differential equations as nonlinear Bayesian filtering: a new perspective

Tronarp, F., Kersting, H., Särkkä, S. H. P.

Statistics and Computing, 29(6):1297-1315, 2019 (article)

ei pn

DOI [BibTex]

DOI [BibTex]


Autonomous Identification and Goal-Directed Invocation of Event-Predictive Behavioral Primitives
Autonomous Identification and Goal-Directed Invocation of Event-Predictive Behavioral Primitives

Gumbsch, C., Butz, M. V., Martius, G.

IEEE Transactions on Cognitive and Developmental Systems, 2019 (article)

Abstract
Voluntary behavior of humans appears to be composed of small, elementary building blocks or behavioral primitives. While this modular organization seems crucial for the learning of complex motor skills and the flexible adaption of behavior to new circumstances, the problem of learning meaningful, compositional abstractions from sensorimotor experiences remains an open challenge. Here, we introduce a computational learning architecture, termed surprise-based behavioral modularization into event-predictive structures (SUBMODES), that explores behavior and identifies the underlying behavioral units completely from scratch. The SUBMODES architecture bootstraps sensorimotor exploration using a self-organizing neural controller. While exploring the behavioral capabilities of its own body, the system learns modular structures that predict the sensorimotor dynamics and generate the associated behavior. In line with recent theories of event perception, the system uses unexpected prediction error signals, i.e., surprise, to detect transitions between successive behavioral primitives. We show that, when applied to two robotic systems with completely different body kinematics, the system manages to learn a variety of complex behavioral primitives. Moreover, after initial self-exploration the system can use its learned predictive models progressively more effectively for invoking model predictive planning and goal-directed control in different tasks and environments.

al

arXiv PDF video link (url) DOI Project Page [BibTex]


no image
Even Delta-Matroids and the Complexity of Planar Boolean CSPs

Kazda, A., Kolmogorov, V., Rolinek, M.

ACM Transactions on Algorithms, 15(2, Special Issue on Soda'17 and Regular Papers):Article Number 22, 2019 (article)

al

DOI [BibTex]

DOI [BibTex]


no image
Machine Learning for Haptics: Inferring Multi-Contact Stimulation From Sparse Sensor Configuration

Sun, H., Martius, G.

Frontiers in Neurorobotics, 13, pages: 51, 2019 (article)

Abstract
Robust haptic sensation systems are essential for obtaining dexterous robots. Currently, we have solutions for small surface areas such as fingers, but affordable and robust techniques for covering large areas of an arbitrary 3D surface are still missing. Here, we introduce a general machine learning framework to infer multi-contact haptic forces on a 3D robot’s limb surface from internal deformation measured by only a few physical sensors. The general idea of this framework is to predict first the whole surface deformation pattern from the sparsely placed sensors and then to infer number, locations and force magnitudes of unknown contact points. We show how this can be done even if training data can only be obtained for single-contact points using transfer learning at the example of a modified limb of the Poppy robot. With only 10 strain-gauge sensors we obtain a high accuracy also for multiple-contact points. The method can be applied to arbitrarily shaped surfaces and physical sensor types, as long as training data can be obtained.

al

link (url) DOI [BibTex]


no image
Dense connectomic reconstruction in layer 4 of the somatosensory cortex

Motta, A., Berning, M., Boergens, K. M., Staffler, B., Beining, M., Loomba, S., Hennig, P., Wissler, H., Helmstaedter, M.

Science, 366(6469):eaay3134, American Association for the Advancement of Science, 2019 (article)

ei pn

DOI [BibTex]

DOI [BibTex]


Probabilistic Linear Solvers: A Unifying View
Probabilistic Linear Solvers: A Unifying View

Bartels, S., Cockayne, J., Ipsen, I., Hennig, P.

Statistics and Computing, 29(6):1249-1263, 2019 (article)

pn

link (url) DOI [BibTex]

link (url) DOI [BibTex]

2018


Probabilistic Solutions To Ordinary Differential Equations As Non-Linear Bayesian Filtering: A New Perspective
Probabilistic Solutions To Ordinary Differential Equations As Non-Linear Bayesian Filtering: A New Perspective

Tronarp, F., Kersting, H., Särkkä, S., Hennig, P.

ArXiv preprint 2018, arXiv:1810.03440 [stat.ME], October 2018 (article)

Abstract
We formulate probabilistic numerical approximations to solutions of ordinary differential equations (ODEs) as problems in Gaussian process (GP) regression with non-linear measurement functions. This is achieved by defining the measurement sequence to consists of the observations of the difference between the derivative of the GP and the vector field evaluated at the GP---which are all identically zero at the solution of the ODE. When the GP has a state-space representation, the problem can be reduced to a Bayesian state estimation problem and all widely-used approximations to the Bayesian filtering and smoothing problems become applicable. Furthermore, all previous GP-based ODE solvers, which were formulated in terms of generating synthetic measurements of the vector field, come out as specific approximations. We derive novel solvers, both Gaussian and non-Gaussian, from the Bayesian state estimation problem posed in this paper and compare them with other probabilistic solvers in illustrative experiments.

pn

link (url) Project Page [BibTex]

2018



no image
Convergence Rates of Gaussian ODE Filters

Kersting, H., Sullivan, T. J., Hennig, P.

arXiv preprint 2018, arXiv:1807.09737 [math.NA], July 2018 (article)

Abstract
A recently-introduced class of probabilistic (uncertainty-aware) solvers for ordinary differential equations (ODEs) applies Gaussian (Kalman) filtering to initial value problems. These methods model the true solution $x$ and its first $q$ derivatives a priori as a Gauss--Markov process $\boldsymbol{X}$, which is then iteratively conditioned on information about $\dot{x}$. We prove worst-case local convergence rates of order $h^{q+1}$ for a wide range of versions of this Gaussian ODE filter, as well as global convergence rates of order $h^q$ in the case of $q=1$ and an integrated Brownian motion prior, and analyse how inaccurate information on $\dot{x}$ coming from approximate evaluations of $f$ affects these rates. Moreover, we present explicit formulas for the steady states and show that the posterior confidence intervals are well calibrated in all considered cases that exhibit global convergence---in the sense that they globally contract at the same rate as the truncation error.

pn

link (url) Project Page [BibTex]

link (url) Project Page [BibTex]


no image
Nonlinear decoding of a complex movie from the mammalian retina

Botella-Soler, V., Deny, S., Martius, G., Marre, O., Tkačik, G.

PLOS Computational Biology, 14(5):1-27, Public Library of Science, May 2018 (article)

Abstract
Author summary Neurons in the retina transform patterns of incoming light into sequences of neural spikes. We recorded from ∼100 neurons in the rat retina while it was stimulated with a complex movie. Using machine learning regression methods, we fit decoders to reconstruct the movie shown from the retinal output. We demonstrated that retinal code can only be read out with a low error if decoders make use of correlations between successive spikes emitted by individual neurons. These correlations can be used to ignore spontaneous spiking that would, otherwise, cause even the best linear decoders to “hallucinate” nonexistent stimuli. This work represents the first high resolution single-trial full movie reconstruction and suggests a new paradigm for separating spontaneous from stimulus-driven neural activity.

al

DOI [BibTex]

DOI [BibTex]


no image
Numerical Quadrature for Probabilistic Policy Search

Vinogradska, J., Bischoff, B., Achterhold, J., Koller, T., Peters, J.

IEEE Transactions on Pattern Analysis and Machine Intelligence, pages: 1-1, 2018 (article)

ev

DOI [BibTex]

DOI [BibTex]


no image
Gaussian Processes and Kernel Methods: A Review on Connections and Equivalences

Kanagawa, M., Hennig, P., Sejdinovic, D., Sriperumbudur, B. K.

Arxiv e-prints, arXiv:1805.08845v1 [stat.ML], 2018 (article)

Abstract
This paper is an attempt to bridge the conceptual gaps between researchers working on the two widely used approaches based on positive definite kernels: Bayesian learning or inference using Gaussian processes on the one side, and frequentist kernel methods based on reproducing kernel Hilbert spaces on the other. It is widely known in machine learning that these two formalisms are closely related; for instance, the estimator of kernel ridge regression is identical to the posterior mean of Gaussian process regression. However, they have been studied and developed almost independently by two essentially separate communities, and this makes it difficult to seamlessly transfer results between them. Our aim is to overcome this potential difficulty. To this end, we review several old and new results and concepts from either side, and juxtapose algorithmic quantities from each framework to highlight close similarities. We also provide discussions on subtle philosophical and theoretical differences between the two approaches.

pn ei

arXiv [BibTex]

arXiv [BibTex]


no image
Counterfactual Mean Embedding: A Kernel Method for Nonparametric Causal Inference

Muandet, K., Kanagawa, M., Saengkyongam, S., Marukata, S.

Arxiv e-prints, arXiv:1805.08845v1 [stat.ML], 2018 (article)

Abstract
This paper introduces a novel Hilbert space representation of a counterfactual distribution---called counterfactual mean embedding (CME)---with applications in nonparametric causal inference. Counterfactual prediction has become an ubiquitous tool in machine learning applications, such as online advertisement, recommendation systems, and medical diagnosis, whose performance relies on certain interventions. To infer the outcomes of such interventions, we propose to embed the associated counterfactual distribution into a reproducing kernel Hilbert space (RKHS) endowed with a positive definite kernel. Under appropriate assumptions, the CME allows us to perform causal inference over the entire landscape of the counterfactual distribution. The CME can be estimated consistently from observational data without requiring any parametric assumption about the underlying distributions. We also derive a rate of convergence which depends on the smoothness of the conditional mean and the Radon-Nikodym derivative of the underlying marginal distributions. Our framework can deal with not only real-valued outcome, but potentially also more complex and structured outcomes such as images, sequences, and graphs. Lastly, our experimental results on off-policy evaluation tasks demonstrate the advantages of the proposed estimator.

ei pn

arXiv [BibTex]

arXiv [BibTex]


no image
Model-based Kernel Sum Rule: Kernel Bayesian Inference with Probabilistic Models

Nishiyama, Y., Kanagawa, M., Gretton, A., Fukumizu, K.

Arxiv e-prints, arXiv:1409.5178v2 [stat.ML], 2018 (article)

Abstract
Kernel Bayesian inference is a powerful nonparametric approach to performing Bayesian inference in reproducing kernel Hilbert spaces or feature spaces. In this approach, kernel means are estimated instead of probability distributions, and these estimates can be used for subsequent probabilistic operations (as for inference in graphical models) or in computing the expectations of smooth functions, for instance. Various algorithms for kernel Bayesian inference have been obtained by combining basic rules such as the kernel sum rule (KSR), kernel chain rule, kernel product rule and kernel Bayes' rule. However, the current framework only deals with fully nonparametric inference (i.e., all conditional relations are learned nonparametrically), and it does not allow for flexible combinations of nonparametric and parametric inference, which are practically important. Our contribution is in providing a novel technique to realize such combinations. We introduce a new KSR referred to as the model-based KSR (Mb-KSR), which employs the sum rule in feature spaces under a parametric setting. Incorporating the Mb-KSR into existing kernel Bayesian framework provides a richer framework for hybrid (nonparametric and parametric) kernel Bayesian inference. As a practical application, we propose a novel filtering algorithm for state space models based on the Mb-KSR, which combines the nonparametric learning of an observation process using kernel mean embedding and the additive Gaussian noise model for a state transition process. While we focus on additive Gaussian noise models in this study, the idea can be extended to other noise models, such as the Cauchy and alpha-stable noise models.

pn

arXiv [BibTex]

arXiv [BibTex]


A probabilistic model for the numerical solution of initial value problems
A probabilistic model for the numerical solution of initial value problems

Schober, M., Särkkä, S., Philipp Hennig,

Statistics and Computing, Springer US, 2018 (article)

Abstract
We study connections between ordinary differential equation (ODE) solvers and probabilistic regression methods in statistics. We provide a new view of probabilistic ODE solvers as active inference agents operating on stochastic differential equation models that estimate the unknown initial value problem (IVP) solution from approximate observations of the solution derivative, as provided by the ODE dynamics. Adding to this picture, we show that several multistep methods of Nordsieck form can be recast as Kalman filtering on q-times integrated Wiener processes. Doing so provides a family of IVP solvers that return a Gaussian posterior measure, rather than a point estimate. We show that some such methods have low computational overhead, nontrivial convergence order, and that the posterior has a calibrated concentration rate. Additionally, we suggest a step size adaptation algorithm which completes the proposed method to a practically useful implementation, which we experimentally evaluate using a representative set of standard codes in the DETEST benchmark set.

pn

PDF Code DOI Project Page [BibTex]

2012


Entropy Search for Information-Efficient Global Optimization
Entropy Search for Information-Efficient Global Optimization

Hennig, P., Schuler, C.

Journal of Machine Learning Research, 13, pages: 1809-1837, -, June 2012 (article)

Abstract
Contemporary global optimization algorithms are based on local measures of utility, rather than a probability measure over location and value of the optimum. They thus attempt to collect low function values, not to learn about the optimum. The reason for the absence of probabilistic global optimizers is that the corresponding inference problem is intractable in several ways. This paper develops desiderata for probabilistic optimization algorithms, then presents a concrete algorithm which addresses each of the computational intractabilities with a sequence of approximations and explicitly adresses the decision problem of maximizing information gain from each evaluation.

ei pn

PDF Web Project Page [BibTex]

2012


PDF Web Project Page [BibTex]


no image
Variants of guided self-organization for robot control

Martius, G., Herrmann, J.

Theory in Biosci., 131(3):129-137, Springer Berlin / Heidelberg, 2012 (article)

al

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
The Playful Machine - Theoretical Foundation and Practical Realization of Self-Organizing Robots

Der, R., Martius, G.

Springer, Berlin Heidelberg, 2012 (book)

Abstract
Autonomous robots may become our closest companions in the near future. While the technology for physically building such machines is already available today, a problem lies in the generation of the behavior for such complex machines. Nature proposes a solution: young children and higher animals learn to master their complex brain-body systems by playing. Can this be an option for robots? How can a machine be playful? The book provides answers by developing a general principle---homeokinesis, the dynamical symbiosis between brain, body, and environment---that is shown to drive robots to self-determined, individual development in a playful and obviously embodiment-related way: a dog-like robot starts playing with a barrier, eventually jumping or climbing over it; a snakebot develops coiling and jumping modes; humanoids develop climbing behaviors when fallen into a pit, or engage in wrestling-like scenarios when encountering an opponent. The book also develops guided self-organization, a new method that helps to make the playful machines fit for fulfilling tasks in the real world.

al

link (url) [BibTex]


no image
Deep Graph Matching via Blackbox Differentiation of Combinatorial Solvers

Rolinek, M., Swoboda, P., Zietlow, D., Paulus, A., Musil, V., Martius, G.

Arxiv (article)

Abstract
Building on recent progress at the intersection of combinatorial optimization and deep learning, we propose an end-to-end trainable architecture for deep graph matching that contains unmodified combinatorial solvers. Using the presence of heavily optimized combinatorial solvers together with some improvements in architecture design, we advance state-of-the-art on deep graph matching benchmarks for keypoint correspondence. In addition, we highlight the conceptual advantages of incorporating solvers into deep learning architectures, such as the possibility of post-processing with a strong multi-graph matching solver or the indifference to changes in the training setting. Finally, we propose two new challenging experimental setups

al

Arxiv [BibTex]