Publications

DEPARTMENTS

Empirical Inference

Haptic Intelligence

Modern Magnetic Systems

Perceiving Systems

Physical Intelligence

Robotic Materials

Social Foundations of Computation


Research Groups

Autonomous Vision

Autonomous Learning

Bioinspired Autonomous Miniature Robots

Dynamic Locomotion

Embodied Vision

Human Aspects of Machine Learning

Intelligent Control Systems

Learning and Dynamical Systems

Locomotion in Biorobotic and Somatic Systems

Micro, Nano, and Molecular Systems

Movement Generation and Control

Neural Capture and Synthesis

Physics for Inference and Optimization

Organizational Leadership and Diversity

Probabilistic Learning Group



Haptic Intelligence Intelligent Control Systems Conference Paper Diffusion-Based Approximate MPC: Fast and Consistent Imitation of Multi-Modal Action Distributions Marquez Julbe, P., Nubert, J., Hose, H., Trimpe, S., Kuchenbecker, K. J. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 5633-5640, Hangzhou, China, October 2025 (Published)
Approximating model predictive control (MPC) using imitation learning (IL) allows for fast control without solving expensive optimization problems online. However, methods that use neural networks in a simple L2-regression setup fail to approximate multi-modal (set-valued) solution distributions caused by local optima found by the numerical solver or non-convex constraints, such as obstacles, significantly limiting the applicability of approximate MPC (AMPC) in practice. We solve this issue by using diffusion models to accurately represent the complete solution distribution (i.e., all modes) at high control rates (more than 1000 Hz). This work shows that diffusion-based AMPC significantly outperforms L2-regression-based AMPC for multi-modal action distributions. In contrast to most earlier work on IL, we also focus on running the diffusion-based controller at a higher rate and in joint space instead of end-effector space. Additionally, we propose the use of gradient guidance during the denoising process to consistently pick the same mode in closed loop to prevent switching between solutions. We propose using the cost and constraint satisfaction of the original MPC problem during parallel sampling of solutions from the diffusion model to pick a better mode online. We evaluate our method on the fast and accurate control of a 7-DoF robot manipulator both in simulation and on hardware deployed at 250 Hz, achieving a speedup of more than 70 times compared to solving the MPC problem online and also outperforming the numerical optimization (used for training) in success ratio.
DOI BibTeX
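The online mode-selection step described in the abstract can be illustrated with a small sketch: draw several action-sequence candidates in parallel (a random two-mode sampler stands in for the diffusion model here) and keep the one with the lowest cost under the original MPC objective. All names and the toy cost function are hypothetical, not the authors' implementation.

```python
import numpy as np

def mpc_cost(u):
    # Hypothetical MPC cost: control effort plus a sharp penalty near an
    # obstacle at u[0] == 0, which creates two symmetric solution modes.
    return float(np.sum(u**2) + 10.0 * np.exp(-20.0 * u[0]**2))

def pick_best_mode(candidates, cost_fn):
    # Score every sampled action sequence with the original MPC cost
    # and return the cheapest one together with its cost.
    costs = [cost_fn(u) for u in candidates]
    best = int(np.argmin(costs))
    return candidates[best], costs[best]

rng = np.random.default_rng(0)
# Stand-in for parallel diffusion samples: two modes around +1 and -1.
samples = [rng.normal(m, 0.05, size=3) for m in (1.0, -1.0) for _ in range(8)]
best_u, best_cost = pick_best_mode(samples, mpc_cost)
```

Because the cost is evaluated only on a fixed batch of samples, this selection runs at a rate independent of the underlying optimization problem, which is what makes the approach attractive at high control frequencies.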

Haptic Intelligence Intelligent Control Systems Article Multimodal Multi-User Surface Recognition with the Kernel Two-Sample Test Khojasteh, B., Solowjow, F., Trimpe, S., Kuchenbecker, K. J. IEEE Transactions on Automation Science and Engineering, 21(3):4432-4447, July 2024 (Published)
Machine learning and deep learning have been used extensively to classify physical surfaces through images and time-series contact data. However, these methods rely on human expertise and entail the time-consuming processes of data and parameter tuning. To overcome these challenges, we propose an easily implemented framework that can directly handle heterogeneous data sources for classification tasks. Our data-versus-data approach automatically quantifies distinctive differences in distributions in a high-dimensional space via kernel two-sample testing between two sets extracted from multimodal data (e.g., images, sounds, haptic signals). We demonstrate the effectiveness of our technique by benchmarking against expertly engineered classifiers for visual-audio-haptic surface recognition due to the industrial relevance, difficulty, and competitive baselines of this application; ablation studies confirm the utility of key components of our pipeline. As shown in our open-source code, we achieve 97.2% accuracy on a standard multi-user dataset with 108 surface classes, outperforming the state-of-the-art machine-learning algorithm by 6% on a more difficult version of the task. The fact that our classifier obtains this performance with minimal data processing in the standard algorithm setting reinforces the powerful nature of kernel methods for learning to recognize complex patterns. Note to Practitioners—We demonstrate how to apply the kernel two-sample test to a surface-recognition task, discuss opportunities for improvement, and explain how to use this framework for other classification problems with similar properties. Automating surface recognition could benefit both surface inspection and robot manipulation. Our algorithm quantifies class similarity and therefore outputs an ordered list of similar surfaces. This technique is well suited for quality assurance and documentation of newly received materials or newly manufactured parts.
More generally, our automated classification pipeline can handle heterogeneous data sources including images and high-frequency time-series measurements of vibrations, forces and other physical signals. As our approach circumvents the time-consuming process of feature engineering, both experts and non-experts can use it to achieve high-accuracy classification. It is particularly appealing for new problems without existing models and heuristics. In addition to strong theoretical properties, the algorithm is straightforward to use in practice since it requires only kernel evaluations. Its transparent architecture can provide fast insights into the given use case under different sensing combinations without costly optimization. Practitioners can also use our procedure to obtain the minimum data-acquisition time for independent time-series data from new sensor recordings.
DOI BibTeX
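The data-versus-data idea can be made concrete with a minimal maximum-mean-discrepancy (MMD) classifier: a query set of measurements is assigned to the class whose reference set is closest under the kernel two-sample statistic. The following numpy sketch uses synthetic Gaussian "surfaces"; it is an illustration of the principle, not the paper's pipeline, and all names and parameters are made up.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Pairwise RBF kernel matrix k(x, y) = exp(-gamma * ||x - y||^2).
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * d2)

def mmd2(X, Y, gamma=1.0):
    # Biased estimate of the squared maximum mean discrepancy between
    # the samples X and Y in the kernel's feature space.
    return (rbf_kernel(X, X, gamma).mean() + rbf_kernel(Y, Y, gamma).mean()
            - 2 * rbf_kernel(X, Y, gamma).mean())

def classify(query, references, gamma=1.0):
    # Predict the class whose reference set has the smallest MMD to the query.
    return min(references, key=lambda c: mmd2(query, references[c], gamma))

rng = np.random.default_rng(1)
refs = {"smooth": rng.normal(0.0, 1.0, (50, 4)),
        "rough":  rng.normal(3.0, 1.0, (50, 4))}
query = rng.normal(3.0, 1.0, (20, 4))
pred = classify(query, refs)
```

Since only kernel evaluations are needed, heterogeneous modalities can be handled by concatenating per-modality kernels or features, which is the property the abstract highlights.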

Haptic Intelligence Intelligent Control Systems Conference Paper Enhancing Surgical Team Collaboration and Situation Awareness through Multimodal Sensing Allemang–Trivalle, A. In Proceedings of the ACM International Conference on Multimodal Interaction, 716-720, Extended abstract (5 pages) presented at the ACM International Conference on Multimodal Interaction (ICMI) Doctoral Consortium, Paris, France, October 2023 (Published)
Surgery, typically seen as the surgeon's sole responsibility, requires a broader perspective acknowledging the vital roles of other operating room (OR) personnel. The interactions among team members are crucial for delivering quality care and depend on shared situation awareness. I propose a two-phase approach to design and evaluate a multimodal platform that monitors OR members, offering insights into surgical procedures. The first phase focuses on designing a data-collection platform, tailored to surgical constraints, to generate novel collaboration and situation-awareness metrics using synchronous recordings of the participants' voices, positions, orientations, electrocardiograms, and respiration signals. The second phase concerns the creation of intuitive dashboards and visualizations, aiding surgeons in reviewing recorded surgery, identifying adverse events and contributing to proactive measures. This work aims to demonstrate an innovative approach to data collection and analysis, augmenting the surgical team's capabilities. The multimodal platform has the potential to enhance collaboration, foster situation awareness, and ultimately mitigate surgical adverse events. This research sets the stage for a transformative shift in the OR, enabling a more holistic and inclusive perspective that recognizes that surgery is a team effort.
DOI BibTeX

Haptic Intelligence Intelligent Control Systems Miscellaneous Multimodal Multi-User Surface Recognition with the Kernel Two-Sample Test: Code Khojasteh, B., Solowjow, F., Trimpe, S., Kuchenbecker, K. J. Code published as a companion to the journal article "Multimodal Multi-User Surface Recognition with the Kernel Two-Sample Test" in IEEE Transactions on Automation Science and Engineering, July 2023 (Published) DOI BibTeX

Intelligent Control Systems Robotics Article The Wheelbot: A Jumping Reaction Wheel Unicycle Geist, A. R., Fiene, J., Tashiro, N., Jia, Z., Trimpe, S. IEEE Robotics and Automation Letters, 7(4):9683-9690, IEEE, October 2022 (Published)
Combining off-the-shelf components with 3D printing, the Wheelbot is a symmetric reaction wheel unicycle that can jump onto its wheels from any initial position. With non-holonomic and under-actuated dynamics, as well as two coupled unstable degrees of freedom, the Wheelbot provides a challenging platform for nonlinear and data-driven control research. This letter presents the Wheelbot's mechanical and electrical design, its estimation and control algorithms, as well as experiments demonstrating both self-erection and disturbance rejection while balancing.
DOI URL BibTeX

Intelligent Control Systems Conference Paper Learning-enhanced robust controller synthesis with rigorous statistical and control-theoretic guarantees Fiedler, C., Scherer, C. W., Trimpe, S. In 60th IEEE Conference on Decision and Control (CDC), IEEE, December 2021 (Accepted)
The combination of machine learning with control offers many opportunities, in particular for robust control. However, due to strong safety and reliability requirements in many real-world applications, providing rigorous statistical and control-theoretic guarantees is of utmost importance, yet difficult to achieve for learning-based control schemes. We present a general framework for learning-enhanced robust control that allows for systematic integration of prior engineering knowledge, is fully compatible with modern robust control and still comes with rigorous and practically meaningful guarantees. Building on the established Linear Fractional Representation and Integral Quadratic Constraints framework, we integrate Gaussian Process Regression as a learning component and state-of-the-art robust controller synthesis. In a concrete robust control example, our approach is demonstrated to yield improved performance with more data, while guarantees are maintained throughout.
URL BibTeX

Intelligent Control Systems Conference Paper Local policy search with Bayesian optimization Müller, S., von Rohr, A., Trimpe, S. In Advances in Neural Information Processing Systems 34, 25:20708-20720, (Editors: Ranzato, M. and Beygelzimer, A. and Dauphin, Y. and Liang, P. S. and Wortman Vaughan, J.), Curran Associates, Inc., Red Hook, NY, 35th Conference on Neural Information Processing Systems (NeurIPS 2021) , December 2021 (Published)
Reinforcement learning (RL) aims to find an optimal policy by interaction with an environment. Consequently, learning complex behavior requires a vast number of samples, which can be prohibitive in practice. Nevertheless, instead of systematically reasoning and actively choosing informative samples, policy gradients for local search are often obtained from random perturbations. These random samples yield high variance estimates and hence are sub-optimal in terms of sample complexity. Actively selecting informative samples is at the core of Bayesian optimization, which constructs a probabilistic surrogate of the objective from past samples to reason about informative subsequent ones. In this paper, we propose to join both worlds. We develop an algorithm utilizing a probabilistic model of the objective function and its gradient. Based on the model, the algorithm decides where to query a noisy zeroth-order oracle to improve the gradient estimates. The resulting algorithm is a novel type of policy search method, which we compare to existing black-box algorithms. The comparison reveals improved sample complexity and reduced variance in extensive empirical evaluations on synthetic objectives. Further, we highlight the benefits of active sampling on popular RL benchmarks.
arXiv GitHub URL BibTeX

Intelligent Control Systems Conference Paper Using Physics Knowledge for Learning Rigid-Body Forward Dynamics with Gaussian Process Force Priors Rath, L., Geist, A. R., Trimpe, S. In Proceedings of the 5th Conference on Robot Learning, 164:101-111, Proceedings of Machine Learning Research, (Editors: Faust, Aleksandra and Hsu, David and Neumann, Gerhard), PMLR, 5th Conference on Robot Learning (CoRL 2021), November 2021 (Published) URL BibTeX

Intelligent Control Systems Conference Paper GoSafe: Globally Optimal Safe Robot Learning Baumann, D., Marco, A., Turchetta, M., Trimpe, S. In 2021 IEEE International Conference on Robotics and Automation (ICRA 2021), 4452-4458, IEEE, Piscataway, NJ, IEEE International Conference on Robotics and Automation (ICRA 2021), October 2021 (Published) DOI BibTeX

Intelligent Control Systems Conference Paper Probabilistic robust linear quadratic regulators with Gaussian processes von Rohr, A., Neumann-Brosig, M., Trimpe, S. Proceedings of the 3rd Conference on Learning for Dynamics and Control, 324-335, Proceedings of Machine Learning Research (PMLR), Vol. 144, (Editors: Jadbabaie, Ali and Lygeros, John and Pappas, George J. and Parrilo, Pablo A. and Recht, Benjamin and Tomlin, Claire J. and Zeilinger, Melanie N.), PMLR, Brookline, MA 02446 , 3rd Annual Conference on Learning for Dynamics and Control (L4DC), June 2021 (Published)
Probabilistic models such as Gaussian processes (GPs) are powerful tools to learn unknown dynamical systems from data for subsequent use in control design. While learning-based control has the potential to yield superior performance in demanding applications, robustness to uncertainty remains an important challenge. Since Bayesian methods quantify uncertainty of the learning results, it is natural to incorporate these uncertainties in a robust design. In contrast to most state-of-the-art approaches that consider worst-case estimates, we leverage the learning methods’ posterior distribution in the controller synthesis. The result is a more informed and thus efficient trade-off between performance and robustness. We present a novel controller synthesis for linearized GP dynamics that yields robust controllers with respect to a probabilistic stability margin. The formulation is based on a recently proposed algorithm for linear quadratic control synthesis, which we extend by giving probabilistic robustness guarantees in the form of credibility bounds for the system’s stability. Comparisons to existing methods based on worst-case and certainty-equivalence designs reveal superior performance and robustness properties of the proposed method.
DOI URL BibTeX

Intelligent Control Systems Conference Paper On exploration requirements for learning safety constraints Massiani, P., Heim, S., Trimpe, S. In Proceedings of the 3rd Conference on Learning for Dynamics and Control, 905-916, Proceedings of Machine Learning Research (PMLR), Vol. 144, (Editors: Jadbabaie, Ali and Lygeros, John and Pappas, George J. and Parrilo, Pablo A. and Recht, Benjamin and Tomlin, Claire J. and Zeilinger, Melanie), PMLR, 3rd Annual Conference on Learning for Dynamics and Control (L4DC), June 2021 (Published)
Enforcing safety for dynamical systems is challenging, since it requires constraint satisfaction along trajectory predictions. Equivalent control constraints can be computed in the form of sets that enforce positive invariance, and can thus guarantee safety in feedback controllers without predictions. However, these constraints are cumbersome to compute from models, and it is not yet well established how to infer constraints from data. In this paper, we shed light on the key objects involved in learning control constraints from data in a model-free setting. In particular, we discuss the family of constraints that enforce safety in the context of a nominal control policy, and expose that these constraints do not need to be accurate everywhere. They only need to correctly exclude a subset of the state-actions that would cause failure, which we call the critical set.
URL BibTeX

Intelligent Control Systems Article Structured learning of rigid-body dynamics: A survey and unified view from a robotics perspective Geist, A. R., Trimpe, S. GAMM-Mitteilungen, 44(2):e202100009, Special Issue: Scientific Machine Learning, June 2021 (Published)
Accurate models of mechanical system dynamics are often critical for model-based control and reinforcement learning. Fully data-driven dynamics models promise to ease the process of modeling and analysis, but require considerable amounts of data for training and often do not generalize well to unseen parts of the state space. Combining data-driven modeling with prior analytical knowledge is an attractive alternative as the inclusion of structural knowledge into a regression model improves the model's data efficiency and physical integrity. In this article, we survey supervised regression models that combine rigid-body mechanics with data-driven modeling techniques. We analyze the different latent functions (such as kinetic energy or dissipative forces) and operators (such as differential operators and projection matrices) underlying common descriptions of rigid-body mechanics. Based on this analysis, we provide a unified view on the combination of data-driven regression models, such as neural networks and Gaussian processes, with analytical model priors. Furthermore, we review and discuss key techniques for designing structured models such as automatic differentiation.
DOI BibTeX

Intelligent Control Systems Conference Paper Practical and Rigorous Uncertainty Bounds for Gaussian Process Regression Fiedler, C., Scherer, C. W., Trimpe, S. In The Thirty-Fifth AAAI Conference on Artificial Intelligence, the Thirty-Third Conference on Innovative Applications of Artificial Intelligence, the Eleventh Symposium on Educational Advances in Artificial Intelligence, 8:7439-7447, AAAI Press, Palo Alto, CA, Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI 2021), Thirty-Third Conference on Innovative Applications of Artificial Intelligence (IAAI 2021), Eleventh Symposium on Educational Advances in Artificial Intelligence (EAAI 2021), May 2021
Gaussian Process regression is a popular nonparametric regression method based on Bayesian principles that provides uncertainty estimates for its predictions. However, these estimates are of a Bayesian nature, whereas for some important applications, like learning-based control with safety guarantees, frequentist uncertainty bounds are required. Although such rigorous bounds are available for Gaussian Processes, they are too conservative to be useful in applications. This often leads practitioners to replacing these bounds by heuristics, thus breaking all theoretical guarantees. To address this problem, we introduce new uncertainty bounds that are rigorous, yet practically useful at the same time. In particular, the bounds can be explicitly evaluated and are much less conservative than state of the art results. Furthermore, we show that certain model misspecifications lead to only graceful degradation. We demonstrate these advantages and the usefulness of our results for learning-based control with numerical examples.
URL BibTeX

Dynamic Locomotion Intelligent Control Systems Conference Paper A little damping goes a long way Heim, S., Millard, M., Mouel, C. L., Badri-Spröwitz, A. In Integrative and Comparative Biology, 61(Supplement 1):E367-E367, Oxford University Press, Society for Integrative and Comparative Biology Annual Meeting (SICB Annual Meeting 2021) , March 2021 (Published) DOI URL BibTeX

Intelligent Control Systems Movement Generation and Control Probabilistic Numerics Empirical Inference Article Robot Learning with Crash Constraints Marco, A., Baumann, D., Khadiv, M., Hennig, P., Righetti, L., Trimpe, S. IEEE Robotics and Automation Letters, 6(2):1439-1446, IEEE, February 2021 (Published)
In the past decade, numerous machine learning algorithms have been shown to successfully learn optimal policies to control real robotic systems. However, it is common to encounter failing behaviors as the learning loop progresses. Specifically, in robot applications where failing is undesired but not catastrophic, many algorithms struggle with leveraging data obtained from failures. This is usually caused by (i) the failed experiment ending prematurely, or (ii) the acquired data being scarce or corrupted. Both complicate the design of proper reward functions to penalize failures. In this paper, we propose a framework that addresses those issues. We consider failing behaviors as those that violate a constraint and address the problem of learning with crash constraints, where no data is obtained upon constraint violation. The no-data case is addressed by a novel GP model (GPCR) for the constraint that combines discrete events (failure/success) with continuous observations (only obtained upon success). We demonstrate the effectiveness of our framework on simulated benchmarks and on a real jumping quadruped, where the constraint threshold is unknown a priori. Experimental data is collected, by means of constrained Bayesian optimization, directly on the real robot. Our results outperform manual tuning and GPCR proves useful on estimating the constraint threshold.
DOI URL BibTeX

Intelligent Control Systems Article Event-triggered Learning for Linear Quadratic Control Schlüter, H., Solowjow, F., Trimpe, S. IEEE Transactions on Automatic Control, 66(10):4485-4498, 2021 (Published) arXiv DOI BibTeX

Intelligent Control Systems Empirical Inference Article Learning Event-triggered Control from Data through Joint Optimization Funk, N., Baumann, D., Berenz, V., Trimpe, S. IFAC Journal of Systems and Control, 16:100144, 2021
We present a framework for model-free learning of event-triggered control strategies. Event-triggered methods aim to achieve high control performance while only closing the feedback loop when needed. This enables resource savings, e.g., network bandwidth if control commands are sent via communication networks, as in networked control systems. Event-triggered controllers consist of a communication policy, determining when to communicate, and a control policy, deciding what to communicate. It is essential to jointly optimize the two policies since individual optimization does not necessarily yield the overall optimal solution. To address this need for joint optimization, we propose a novel algorithm based on hierarchical reinforcement learning. The resulting algorithm is shown to accomplish high-performance control while realizing resource savings and scales seamlessly to nonlinear and high-dimensional systems. The method’s applicability to real-world scenarios is demonstrated through experiments on a six degrees of freedom real-time controlled manipulator. Further, we propose an approach towards evaluating the stability of the learned neural network policies.
arXiv DOI URL BibTeX
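The split into a communication policy (when to send) and a control policy (what to apply) can be seen in a toy scalar loop. The threshold trigger below is a classical heuristic stand-in, not the learned policies from the paper; the plant, gains, and noise level are made up for illustration.

```python
import numpy as np

def run(threshold, steps=200, a=1.1, seed=0):
    # Unstable scalar plant x+ = a*x + u + w. The controller holds a model
    # copy x_hat and only receives the true state when the trigger fires.
    rng = np.random.default_rng(seed)
    x, x_hat, comms = 1.0, 1.0, 0
    for _ in range(steps):
        if abs(x - x_hat) > threshold:  # communication policy: when to send
            x_hat, comms = x, comms + 1
        u = -a * x_hat                  # control policy: what to apply
        x = a * x + u + 0.05 * rng.normal()
        x_hat = a * x_hat + u           # model prediction between events
    return comms

dense = run(threshold=0.0)   # communicates nearly every step
sparse = run(threshold=0.5)  # communicates only on significant deviations
```

Even this fixed-threshold rule saves most of the communication; the paper's point is that learning both policies jointly finds the performance/communication trade-off instead of hand-picking a threshold.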

Intelligent Control Systems Article Wireless Control for Smart Manufacturing: Recent Approaches and Open Challenges Baumann, D., Mager, F., Wetzker, U., Thiele, L., Zimmerling, M., Trimpe, S. Proceedings of the IEEE, 109(4):441-467, 2021 (Published) arXiv DOI BibTeX

Dynamic Locomotion Intelligent Control Systems Article A Learnable Safety Measure Heim, S., von Rohr, A., Trimpe, S., Badri-Spröwitz, A. Proceedings of the Conference on Robot Learning, 100:627-639, Proceedings of Machine Learning Research, (Editors: Kaelbling, Leslie Pack and Kragic, Danica and Sugiura, Komei), PMLR, Conference on Robot Learning, October 2020 (Published) arXiv BibTeX

Dynamic Locomotion Intelligent Control Systems Article A little damping goes a long way: a simulation study of how damping influences task-level stability in running Heim, S., Millard, M., Le Mouel, C., Badri-Spröwitz, A. Biology Letters, 16(9):20200467, September 2020 (Published)
It is currently unclear if damping plays a functional role in legged locomotion, and simple models often do not include damping terms. We present a new model with a damping term that is isolated from other parameters: that is, the damping term can be adjusted without retuning other model parameters for nominal motion. We systematically compare how increased damping affects stability in the face of unexpected ground-height perturbations. Unlike most studies, we focus on task-level stability: instead of observing whether trajectories converge towards a nominal limit-cycle, we quantify the ability to avoid falls using a recently developed mathematical measure. This measure allows trajectories to be compared quantitatively instead of only being separated into a binary classification of 'stable' or 'unstable'. Our simulation study shows that increased damping contributes significantly to task-level stability; however, this benefit quickly plateaus after only a small amount of damping. These results suggest that the low intrinsic damping values observed experimentally may have stability benefits and are not simply minimized for energetic reasons. All Python code and data needed to generate our results are available open source.
DOI URL BibTeX
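As a loose, toy analogue of the effect studied above (a unit mass on a stiff spring, not the paper's running model or its task-level measure), one can record the peak deviation after a velocity perturbation for increasing damping values:

```python
import numpy as np

def peak_deviation(damping, k=100.0, steps=4000, dt=0.001):
    # Unit mass on a spring, perturbed by an initial unit velocity;
    # integrate x'' = -k*x - c*x' with semi-implicit Euler.
    x, v, peak = 0.0, 1.0, 0.0
    for _ in range(steps):
        v += dt * (-k * x - damping * v)
        x += dt * v
        peak = max(peak, abs(x))
    return peak

# Peak deviation shrinks as the damping coefficient c grows.
peaks = {c: peak_deviation(c) for c in (0.0, 2.0, 5.0, 20.0)}
```

In this linear toy the attenuation is monotone in the damping coefficient; the paper's more interesting finding, that the task-level benefit plateaus at small damping values, requires the full perturbed-running model.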

Intelligent Control Systems Ph.D. Thesis Bayesian Optimization in Robot Learning - Automatic Controller Tuning and Sample-Efficient Methods Marco-Valle, A. Eberhard Karls Universität Tübingen, Tübingen, July 2020
The problem of designing controllers to regulate dynamical systems has been studied by engineers during the past millennia. Ever since, suboptimal performance lingers in many closed loops as an unavoidable side effect of manually tuning the parameters of the controllers. Nowadays, industrial settings remain skeptical about data-driven methods that allow one to automatically learn controller parameters. In the context of robotics, machine learning (ML) keeps growing its influence on increasing autonomy and adaptability, for example to aid automating controller tuning. However, data-hungry ML methods, such as standard reinforcement learning, require a large number of experimental samples, prohibitive in robotics, as hardware can deteriorate and break. This brings about the following question: Can manual controller tuning, in robotics, be automated by using data-efficient machine learning techniques? In this thesis, we tackle the question above by exploring Bayesian optimization (BO), a data-efficient ML framework, to buffer the human effort and side effects of manual controller tuning, while retaining a low number of experimental samples. We focus this work on robotic systems, providing thorough theoretical results that aim to increase data-efficiency, as well as demonstrations on real robots. Specifically, we present four main contributions. We first consider using BO to replace manual tuning in robotic platforms. To this end, we parametrize the design weights of a linear quadratic regulator (LQR) and learn its parameters using an information-efficient BO algorithm. Such algorithm uses Gaussian processes (GPs) to model the unknown performance objective. The GP model is used by BO to suggest controller parameters that are expected to increase the information about the optimal parameters, measured as a gain in entropy.
The resulting “automatic LQR tuning” framework is demonstrated on two robotic platforms: A robot arm balancing an inverted pole and a humanoid robot performing a squatting task. In both cases, an existing controller is automatically improved in a handful of experiments without human intervention. BO compensates for data scarcity by means of the GP, which is a probabilistic model that encodes prior assumptions about the unknown performance objective. Usually, incorrect or non-informed assumptions have negative consequences, such as a higher number of robot experiments, poor tuning performance or reduced sample-efficiency. The second to fourth contributions presented herein attempt to alleviate this issue. The second contribution proposes to include the robot simulator into the learning loop as an additional information source for automatic controller tuning. While doing a real robot experiment generally entails high associated costs (e.g., it requires preparation and takes time), simulations are cheaper to obtain (e.g., they can be computed faster). However, because the simulator is an imperfect model of the robot, its information is biased and could have negative repercussions on the learning performance. To address this problem, we propose “simu-vs-real”, a principled multi-fidelity BO algorithm that trades off cheap, but inaccurate information from simulations with expensive and accurate physical experiments in a cost-effective manner. The resulting algorithm is demonstrated on a cart-pole system, where simulations and real experiments are alternated, thus sparing many real evaluations. The third contribution explores how to adapt the expressiveness of the probabilistic prior to the control problem at hand. To this end, the mathematical structure of LQR controllers is leveraged and embedded into the GP, by means of the kernel function. Specifically, we propose two different “LQR kernel” designs that retain the flexibility of Bayesian nonparametric learning.
Simulated results indicate that the LQR kernel yields superior performance than non-informed kernel choices when used for controller learning with BO. Finally, the fourth contribution specifically addresses the problem of handling controller failures, which are typically unavoidable in practice while learning from data, especially if non-conservative solutions are expected. Although controller failures are generally problematic (e.g., the robot has to be emergency-stopped), they are also a rich information source about what should be avoided. We propose “failures-aware excursion search”, a novel algorithm for Bayesian optimization under black-box constraints, where failures are limited in number. Our results in numerical benchmarks indicate that by allowing a confined number of failures, better optima are revealed as compared with state-of-the-art methods. The first contribution of this thesis, “automatic LQR tuning”, is among the first to apply BO to real robots. While it demonstrated automatic controller learning from few experimental samples, it also revealed several important challenges, such as the need for higher sample-efficiency, which opened relevant research directions that we addressed through several methodological contributions. Summarizing, we proposed “simu-vs-real”, a novel BO algorithm that includes the simulator as an additional information source, an “LQR kernel” design that learns faster than standard choices and “failures-aware excursion search”, a new BO algorithm for constrained black-box optimization problems, where the number of failures is limited.
Repository (Universitätsbibliothek) - University of Tübingen PDF DOI BibTeX

Intelligent Control Systems Article Event-triggered Learning Solowjow, F., Trimpe, S. Automatica, 117:109009, Elsevier, July 2020 (Published) arXiv PDF DOI BibTeX

Physical Intelligence Intelligent Control Systems Conference Paper Learning of sub-optimal gait controllers for magnetic walking soft millirobots Culha, U., Demir, S. O., Trimpe, S., Sitti, M. In Robotics: Science and Systems XVI, P070, (Editors: Toussaint, Marc and Bicchi, Antonio and Hermans, Tucker), RSS Foundation, Robotics: Science and Systems 2020 (RSS 2020), July 2020 (Published)
Untethered small-scale soft robots have promising applications in minimally invasive surgery, targeted drug delivery, and bioengineering, as they can access confined spaces in the human body. However, due to highly nonlinear soft continuum deformation kinematics, inherent stochastic variability during fabrication at the small scale, and lack of accurate models, the conventional control methods cannot be easily applied. Adaptivity of robot control is additionally crucial for medical operations, as operation environments show large variability, and robot materials may degrade or change over time, which would have deteriorating effects on the robot motion and task performance. Therefore, we propose using a probabilistic learning approach for millimeter-scale magnetic walking soft robots using Bayesian optimization (BO) and Gaussian processes (GPs). Our approach provides a data-efficient learning scheme to find controller parameters while optimizing the stride length performance of the walking soft millirobot within a small number of physical experiments. We demonstrate adaptation to fabrication variabilities in three different robots and to walking surfaces with different roughness. We also show an improvement in the learning performance by transferring the learning results of one robot to the others as prior information.
DOI URL BibTeX
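The transfer idea in the abstract above — using one robot's learned cost landscape as prior information for another — can be sketched with a GP whose prior mean is the posterior mean learned on the first robot. This is a toy illustration, not the paper's actual model; the kernel, length scale, and the sine stand-in for stride length are all assumptions.

```python
import numpy as np

def rbf(A, B, ls=0.15, var=1.0):
    """Squared-exponential kernel between 1-D input arrays."""
    d = A[:, None] - B[None, :]
    return var * np.exp(-0.5 * (d / ls) ** 2)

def gp_posterior_mean(x_train, y_train, x_query, noise=1e-3, prior_mean=None):
    """GP posterior mean, optionally centered on a transferred prior mean."""
    m_train = prior_mean(x_train) if prior_mean else np.zeros_like(x_train)
    m_query = prior_mean(x_query) if prior_mean else np.zeros_like(x_query)
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    alpha = np.linalg.solve(K, y_train - m_train)
    return m_query + rbf(x_query, x_train) @ alpha

# Robot A: stride length vs. a normalized gait parameter, from past experiments.
xa = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
ya = np.sin(3 * xa)                     # stand-in for measured stride lengths
prior = lambda x: gp_posterior_mean(xa, ya, x)

# Robot B: only two experiments so far, but it inherits robot A's mean function.
xb = np.array([0.2, 0.8])
yb = np.sin(3 * xb) + 0.1               # robot B behaves similarly, slightly offset
xq = np.linspace(0, 1, 5)
mu_b = gp_posterior_mean(xb, yb, xq, prior_mean=prior)
print(mu_b)
```

With the transferred prior, robot B's predictions already track robot A's landscape away from its own two data points, which is the mechanism behind the reported learning speedup.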

Intelligent Control Systems Conference Paper Actively Learning Gaussian Process Dynamics Buisson-Fenet, M., Solowjow, F., Trimpe, S. Proceedings of the 2nd Conference on Learning for Dynamics and Control, 120:5-15, Proceedings of Machine Learning Research (PMLR), (Editors: Bayen, Alexandre M. and Jadbabaie, Ali and Pappas, George and Parrilo, Pablo A. and Recht, Benjamin and Tomlin, Claire and Zeilinger, Melanie), PMLR, 2nd Annual Conference on Learning for Dynamics and Control (L4DC), June 2020 (Published)
Despite the ever-greater availability of data enabled by modern sensor and computer technology, learning dynamical systems in a sample-efficient way remains an open problem. We propose active learning strategies that leverage information-theoretic properties arising naturally during Gaussian process regression while respecting constraints on the sampling process imposed by the system dynamics. Sample points are selected in regions of high uncertainty, leading to exploratory behavior and data-efficient training of the model. All results are verified in an extensive numerical benchmark.
ArXiv URL BibTeX
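The core loop described above — query where the GP is most uncertain, but only among points the dynamics can actually reach — can be sketched in a few lines. Everything here (the toy dynamics, length scale, reachable-set box) is illustrative, not taken from the paper; note that the GP predictive variance needs no output data, which is what makes variance-based selection cheap.

```python
import numpy as np

def rbf(A, B, ls=0.4):
    d = A[:, None] - B[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def gp_variance(x_train, x_cand, noise=1e-2):
    """Posterior variance of a zero-mean GP at candidate inputs (y-independent)."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    k = rbf(x_cand, x_train)                       # (n_cand, n_train)
    return 1.0 - np.sum(k * np.linalg.solve(K, k.T).T, axis=1)

# Toy dynamics x_{k+1} = f(x_k); queries are constrained to lie near the
# current state, a crude stand-in for the sampling constraints of a real system.
f = lambda x: 0.9 * x + 0.2 * np.sin(2 * x)
x_train = np.array([0.0])
y_train = np.array([f(0.0)])
state = 0.0
for _ in range(5):
    cand = state + np.linspace(-0.5, 0.5, 41)      # reachable candidates
    var = gp_variance(x_train, cand)
    state = cand[np.argmax(var)]                   # most informative reachable point
    x_train = np.append(x_train, state)
    y_train = np.append(y_train, f(state))
print(np.round(x_train, 2))
```

Each iteration drives the system toward the least-explored reachable region, so the training inputs spread out instead of clustering around the start state.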

Intelligent Control Systems Conference Paper Learning Constrained Dynamics with Gauss Principle adhering Gaussian Processes Geist, A. R., Trimpe, S. In Proceedings of the 2nd Conference on Learning for Dynamics and Control, 120:225-234, Proceedings of Machine Learning Research (PMLR), (Editors: Bayen, Alexandre M. and Jadbabaie, Ali and Pappas, George and Parrilo, Pablo A. and Recht, Benjamin and Tomlin, Claire and Zeilinger, Melanie), PMLR, 2nd Annual Conference on Learning for Dynamics and Control (L4DC), June 2020 (Published)
The identification of the constrained dynamics of mechanical systems is often challenging. Learning methods promise to ease the analysis, but require considerable amounts of training data. We propose combining insights from analytical mechanics with Gaussian process regression to improve the model's data efficiency and constraint integrity. The result is a Gaussian process model that incorporates a priori constraint knowledge such that its predictions adhere to Gauss' principle of least constraint. In return, predictions of the system's acceleration naturally respect potentially non-ideal (non-)holonomic equality constraints. As corollary results, our model enables inferring the acceleration of the unconstrained system from data of the constrained system and enables knowledge transfer between differing constraint configurations.
Proceedings of Machine Learning Research URL BibTeX
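Gauss' principle states that the constrained acceleration is the unconstrained one plus the smallest correction (in the mass metric) that satisfies the constraint; in Udwadia–Kalaba form this correction has a closed-form pseudoinverse expression. The sketch below is a minimal numerical check on a pendulum modeled as a point mass on a circle — the specific numbers are illustrative, not from the paper.

```python
import numpy as np

def constrained_accel(M, a_u, A, b):
    """Gauss' principle in Udwadia-Kalaba form: minimally correct (in the
    metric M) the unconstrained acceleration a_u so that A @ a = b holds."""
    Msqrt_inv = np.diag(1.0 / np.sqrt(np.diag(M)))   # M assumed diagonal here
    return a_u + Msqrt_inv @ np.linalg.pinv(A @ Msqrt_inv) @ (b - A @ a_u)

# Pendulum as a unit point mass on a circle of radius 1 (constraint x^2 + y^2 = 1).
# At the bottom of the swing, position (0, -1), horizontal speed v:
g, v = 9.81, 2.0
M = np.eye(2)
a_u = np.array([0.0, -g])           # free-fall acceleration of the unconstrained mass
A = np.array([[0.0, -1.0]])         # twice-differentiated constraint: x*ax + y*ay = -(vx^2 + vy^2)
b = np.array([-(v ** 2)])
a_c = constrained_accel(M, a_u, A, b)
print(a_c)                          # centripetal acceleration [0, v^2]
```

Gravity along the rod is canceled and only the centripetal term v²/r survives, exactly as the analytical solution demands — the kind of constraint integrity the learned model above is built to preserve.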

Intelligent Control Systems Article Data-efficient Autotuning with Bayesian Optimization: An Industrial Control Study Neumann-Brosig, M., Marco, A., Schwarzmann, D., Trimpe, S. IEEE Transactions on Control Systems Technology, 28(3):730-740, May 2020 (Published)
Bayesian optimization is proposed for automatic learning of optimal controller parameters from experimental data. A probabilistic description (a Gaussian process) is used to model the unknown function from controller parameters to a user-defined cost. The probabilistic model is updated with data, which is obtained by testing a set of parameters on the physical system and evaluating the cost. In order to learn fast, the Bayesian optimization algorithm selects the next parameters to evaluate in a systematic way, for example, by maximizing information gain about the optimum. The algorithm thus iteratively finds the globally optimal parameters with only a few experiments. Taking throttle valve control as a representative industrial control example, the proposed auto-tuning method is shown to outperform manual calibration: it consistently achieves better performance with a low number of experiments. The proposed auto-tuning framework is flexible and can handle different control structures and objectives.
arXiv (PDF) DOI BibTeX
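The loop in the abstract — fit a GP to (parameter, cost) pairs, pick the next experiment with an acquisition function, repeat — can be sketched end to end on a toy 1-D cost. This uses expected improvement rather than the paper's entropy-based acquisition, and the quadratic "tracking cost" and all constants are assumptions for illustration.

```python
import numpy as np
from math import erf, sqrt, pi

def rbf(A, B, ls=0.2):
    d = A[:, None] - B[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def gp_predict(x, y, xq, noise=1e-4):
    """GP posterior mean and variance, with the data mean as prior mean."""
    K = rbf(x, x) + noise * np.eye(len(x))
    k = rbf(xq, x)
    mu = y.mean() + k @ np.linalg.solve(K, y - y.mean())
    var = 1.0 - np.sum(k * np.linalg.solve(K, k.T).T, axis=1)
    return mu, np.maximum(var, 1e-12)

def expected_improvement(mu, var, best):
    """EI for minimization: expected amount by which f beats the incumbent."""
    s = np.sqrt(var)
    z = (best - mu) / s
    Phi = np.array([0.5 * (1 + erf(zi / sqrt(2))) for zi in z])
    phi = np.exp(-0.5 * z ** 2) / sqrt(2 * pi)
    return s * (z * Phi + phi)

# Unknown cost of a controller gain (e.g. a tracking error); optimum at 0.65.
cost = lambda g: (g - 0.65) ** 2
gains = np.array([0.1, 0.9])            # two initial experiments
costs = cost(gains)
grid = np.linspace(0, 1, 200)
for _ in range(8):                       # each iteration = one physical experiment
    mu, var = gp_predict(gains, costs, grid)
    g_next = grid[np.argmax(expected_improvement(mu, var, costs.min()))]
    gains = np.append(gains, g_next)
    costs = np.append(costs, cost(g_next))
best_gain = gains[np.argmin(costs)]
print(round(float(best_gain), 2))
```

Ten evaluations in total suffice here to land near the optimum — the data efficiency that makes BO attractive when every evaluation is a hardware experiment.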

Intelligent Control Systems Conference Paper Robust Model-free Reinforcement Learning with Multi-objective Bayesian Optimization Turchetta, M., Krause, A., Trimpe, S. In 2020 IEEE International Conference on Robotics and Automation (ICRA 2020), 10702-10708, IEEE, Piscataway, NJ, IEEE International Conference on Robotics and Automation (ICRA 2020), May 2020 (Published) DOI BibTeX

Intelligent Control Systems Article Sliding Mode Control with Gaussian Process Regression for Underwater Robots Lima, G. S., Trimpe, S., Bessa, W. M. Journal of Intelligent & Robotic Systems, 99(3-4):487-498, January 2020 (Published) DOI BibTeX

Intelligent Control Systems Autonomous Motion Proceedings Excursion Search for Constrained Bayesian Optimization under a Limited Budget of Failures Marco, A., Rohr, A. V., Baumann, D., Hernández-Lobato, J. M., Trimpe, S. 2020 (In revision)
When learning to ride a bike, a child falls down a number of times before achieving the first success. As falling down usually has only mild consequences, it can be seen as a tolerable failure in exchange for a faster learning process, as it provides rich information about an undesired behavior. In the context of Bayesian optimization under unknown constraints (BOC), typical strategies for safe learning explore conservatively and avoid failures at all costs. On the other side of the spectrum, non-conservative BOC algorithms that allow failing may fail an unbounded number of times before reaching the optimum. In this work, we propose a novel decision maker grounded in control theory that controls the amount of risk allowed in the search as a function of a given budget of failures. Empirical validation shows that our algorithm uses the failure budget more efficiently than state-of-the-art methods in a variety of optimization experiments and generally achieves lower regret. In addition, we propose an original algorithm for unconstrained Bayesian optimization, inspired by the notion of excursion sets in stochastic processes, upon which the failures-aware algorithm is built.
arXiv code (python) PDF BibTeX
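The budget-controlled risk idea — be bold while failures remain, turn conservative as the budget depletes — can be caricatured in a few lines. This is only a toy illustration of the scheduling mechanism, not the paper's decision maker: the objective, the failure-probability model, and the linear risk schedule are all made up.

```python
import numpy as np

rng = np.random.default_rng(0)

def objective(x):
    return -(x - 0.7) ** 2                    # the optimum sits in the risky region

def failure_prob(x):
    """Stand-in for the model's estimated probability that parameters x fail."""
    return float(np.clip(x - 0.5, 0.0, 0.5))

budget_total = 3
budget = budget_total
best_x, best = 0.2, objective(0.2)            # start from a known-safe parameter
for _ in range(30):
    x = float(rng.uniform(0, 1))
    # Risk allowance shrinks with the remaining budget: bold early, cautious late.
    if failure_prob(x) > 0.5 * budget / budget_total:
        continue                              # the decision maker vetoes this query
    if rng.uniform() < failure_prob(x):       # the experiment actually failed
        budget -= 1
        if budget == 0:
            break                             # budget exhausted: stop searching
    elif objective(x) > best:
        best, best_x = objective(x), x
print(best_x, budget)
```

Early on, risky queries near the optimum are admissible; after each failure the admissible region contracts, so the search never spends more than its allotted failures.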

Intelligent Control Systems Autonomous Motion Article Safe and Fast Tracking on a Robot Manipulator: Robust MPC and Neural Network Control Nubert, J., Koehler, J., Berenz, V., Allgöwer, F., Trimpe, S. IEEE Robotics and Automation Letters, 5(2):3050-3057, 2020 (Published)
Fast feedback control and safety guarantees are essential in modern robotics. We present an approach that achieves both by combining novel robust model predictive control (MPC) with function approximation via (deep) neural networks (NNs). The result is a new approach for complex tasks with nonlinear, uncertain, and constrained dynamics as are common in robotics. Specifically, we leverage recent results in MPC research to propose a new robust setpoint tracking MPC algorithm, which achieves reliable and safe tracking of a dynamic setpoint while guaranteeing stability and constraint satisfaction. The presented robust MPC scheme constitutes a one-layer approach that unifies the often separated planning and control layers, by directly computing the control command based on a reference and possibly obstacle positions. As a separate contribution, we show how the computation time of the MPC can be drastically reduced by approximating the MPC law with a NN controller. The NN is trained and validated from offline samples of the MPC, yielding statistical guarantees, and used in lieu thereof at run time. Our experiments on a state-of-the-art robot manipulator are the first to show that both the proposed robust and approximate MPC schemes scale to real-world robotic systems.
arXiv PDF DOI BibTeX
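The "approximate the MPC law offline, evaluate the cheap surrogate online" step above can be sketched with plain least squares standing in for neural-network training (the mechanism — sample the expert, fit, deploy — is the same). The stylized saturated linear law, gains, and state region are assumptions, not the paper's controller.

```python
import numpy as np

# "Expert": a stylized stand-in for the robust MPC optimizer, which would be
# far too slow to solve online at high control rates.
def mpc_law(x):
    K = np.array([[2.0, 1.2]])
    return np.clip(-K @ x, -1.0, 1.0)         # saturation mimics input constraints

# Offline: sample states, query the expensive expert, fit a cheap surrogate.
rng = np.random.default_rng(1)
X = rng.uniform(-0.3, 0.3, size=(500, 2))     # region where the clip stays inactive
U = np.array([mpc_law(x) for x in X])
W, *_ = np.linalg.lstsq(X, U, rcond=None)     # least squares in place of NN training

# Online: evaluating x @ W replaces the optimizer; validate on held-out states.
X_val = rng.uniform(-0.3, 0.3, size=(100, 2))
U_val = np.array([mpc_law(x) for x in X_val])
err = np.max(np.abs(U_val - X_val @ W))
print(err)
```

In the paper the validation step is done statistically over offline MPC samples to obtain guarantees; here the held-out error simply confirms the surrogate reproduces the expert on its training region.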

Intelligent Control Systems Conference Paper Controlling Heterogeneous Stochastic Growth Processes on Lattices with Limited Resources Haksar, R., Solowjow, F., Trimpe, S., Schwager, M. In Proceedings of the 58th IEEE International Conference on Decision and Control (CDC) , 1315-1322, 58th IEEE International Conference on Decision and Control (CDC), December 2019 (Published) PDF BibTeX

Intelligent Control Systems Article Fast Feedback Control over Multi-hop Wireless Networks with Mode Changes and Stability Guarantees Baumann, D., Mager, F., Jacob, R., Thiele, L., Zimmerling, M., Trimpe, S. ACM Transactions on Cyber-Physical Systems, 4(2):18, November 2019 (Published) arXiv PDF DOI BibTeX

Intelligent Control Systems Conference Paper Predictive Triggering for Distributed Control of Resource Constrained Multi-agent Systems Mastrangelo, J. M., Baumann, D., Trimpe, S. In Proceedings of the 8th IFAC Workshop on Distributed Estimation and Control in Networked Systems, 79-84, 8th IFAC Workshop on Distributed Estimation and Control in Networked Systems (NecSys), September 2019 (Published) arXiv PDF DOI BibTeX

Intelligent Control Systems Conference Paper Event-triggered Pulse Control with Model Learning (if Necessary) Baumann, D., Solowjow, F., Johansson, K. H., Trimpe, S. In Proceedings of the American Control Conference, 792-797, American Control Conference (ACC), July 2019 (Published) arXiv PDF BibTeX

Intelligent Control Systems Conference Paper Data-driven inference of passivity properties via Gaussian process optimization Romer, A., Trimpe, S., Allgöwer, F. In Proceedings of the European Control Conference, European Control Conference (ECC), June 2019 (Published) PDF BibTeX

Intelligent Control Systems Article Resource-aware IoT Control: Saving Communication through Predictive Triggering Trimpe, S., Baumann, D. IEEE Internet of Things Journal, 6(3):5013-5028, June 2019 (Published)
The Internet of Things (IoT) interconnects multiple physical devices in large-scale networks. When the 'things' coordinate decisions and act collectively on shared information, feedback is introduced between them. Multiple feedback loops are thus closed over a shared, general-purpose network. Traditional feedback control is unsuitable for the design of IoT control because it relies on high-rate periodic communication and is ignorant of the shared network resource. Therefore, recent event-based estimation methods are applied herein for resource-aware IoT control, allowing agents to decide online whether communication with other agents is needed or not. While this can reduce network traffic significantly, a severe limitation of typical event-based approaches is the need for instantaneous triggering decisions that leave no time to reallocate freed resources (e.g., communication slots), which hence remain unused. To address this problem, novel predictive and self-triggering protocols are proposed herein. From a unified Bayesian decision framework, two schemes are developed: self-triggers that predict, at the current triggering instant, the next one; and predictive triggers that check at every time step whether communication will be needed at a given prediction horizon. The suitability of these triggers for feedback control is demonstrated in hardware experiments on a cart-pole, and scalability is discussed with a multi-vehicle simulation.
PDF arXiv DOI BibTeX
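The predictive-trigger mechanism above can be sketched on a scalar system: the remote agent predicts open-loop, and the sender requests a communication slot M steps before the prediction error is expected to violate a bound, giving the network time to allocate the resource. All constants and the noise model are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
a, M, delta = 1.0, 3, 0.3       # dynamics, prediction horizon, error bound
x, x_remote = 0.0, 0.0          # plant state and its copy at the remote agent
scheduled, comms = set(), 0
for k in range(50):
    if k in scheduled:          # a slot was reserved M steps ago: communicate now
        x_remote = x
        comms += 1
    x = a * x + 0.2 * rng.standard_normal()   # plant step with process noise
    x_remote = a * x_remote                   # remote side predicts open-loop
    # Predictive trigger: will the error bound likely be violated M steps ahead?
    pending = scheduled & set(range(k, k + M + 1))
    if abs(a) ** M * abs(x - x_remote) > delta and not pending:
        scheduled.add(k + M)    # request a slot with M steps of lead time
print(comms)
```

Communication happens only when the prediction degrades, so far fewer than 50 slots are used, and every used slot was announced M steps in advance rather than claimed instantaneously.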

Intelligent Control Systems Conference Paper Trajectory-Based Off-Policy Deep Reinforcement Learning Doerr, A., Volpp, M., Toussaint, M., Trimpe, S., Daniel, C. In Proceedings of the International Conference on Machine Learning (ICML), International Conference on Machine Learning (ICML), June 2019 (Published)
Policy gradient methods are powerful reinforcement learning algorithms and have been demonstrated to solve many complex tasks. However, these methods are also data-inefficient, afflicted with high variance gradient estimates, and frequently get stuck in local optima. This work addresses these weaknesses by combining recent improvements in the reuse of off-policy data and exploration in parameter space with deterministic behavioral policies. The resulting objective is amenable to standard neural network optimization strategies like stochastic gradient descent or stochastic gradient Hamiltonian Monte Carlo. Incorporation of previous rollouts via importance sampling greatly improves data-efficiency, whilst stochastic optimization schemes facilitate the escape from local optima. We evaluate the proposed approach on a series of continuous control benchmark tasks. The results show that the proposed algorithm is able to successfully and reliably learn solutions using fewer system interactions than standard policy gradient methods.
arXiv PDF BibTeX
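The off-policy reuse described above rests on importance sampling over the parameter-space exploration distribution: old rollouts are reweighted to estimate how a new search distribution would perform, without collecting fresh data. The sketch below shows only that mechanism on a toy return function; the quadratic return and Gaussian search distribution are assumptions, not the paper's benchmark.

```python
import numpy as np

rng = np.random.default_rng(3)

def rollout_return(theta):
    """Return of one rollout under policy parameters theta (toy stand-in)."""
    return -(theta - 1.5) ** 2

# Old data: parameters drawn from the previous parameter-space search distribution.
mu_old, sigma = 0.0, 1.0
thetas = mu_old + sigma * rng.standard_normal(2000)
returns = rollout_return(thetas)

def log_pdf(th, mu):
    return -0.5 * ((th - mu) / sigma) ** 2    # Gaussian log-density up to a constant

def is_estimate(mu_new):
    """Estimate the expected return under a *new* search distribution by
    importance-weighting the old rollouts instead of collecting fresh ones."""
    w = np.exp(log_pdf(thetas, mu_new) - log_pdf(thetas, mu_old))
    return np.sum(w * returns) / np.sum(w)    # self-normalized IS estimate

# True expected return for theta ~ N(mu, sigma^2) is -((mu - 1.5)^2 + sigma^2).
print(is_estimate(0.5), is_estimate(1.0))
```

The estimates track the analytical expectations, so a gradient step on the new distribution's parameters can be taken from old data — the source of the data efficiency claimed above.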

Intelligent Control Systems Poster Demo Abstract: Fast Feedback Control and Coordination with Mode Changes for Wireless Cyber-Physical Systems Mager, F., Baumann, D., Jacob, R., Thiele, L., Trimpe, S., Zimmerling, M. Proceedings of the 18th ACM/IEEE Conference on Information Processing in Sensor Networks (IPSN), 340-341, 18th ACM/IEEE Conference on Information Processing in Sensor Networks (IPSN), April 2019 (Published) arXiv PDF DOI BibTeX

Intelligent Control Systems Conference Paper Feedback Control Goes Wireless: Guaranteed Stability over Low-power Multi-hop Networks Mager, F., Baumann, D., Jacob, R., Thiele, L., Trimpe, S., Zimmerling, M. In Proceedings of the 10th ACM/IEEE International Conference on Cyber-Physical Systems, 97-108, 10th ACM/IEEE International Conference on Cyber-Physical Systems, April 2019 (Published)
Closing feedback loops fast and over long distances is key to emerging applications; for example, robot motion control and swarm coordination require update intervals below 100 ms. Low-power wireless is preferred for its flexibility, low cost, and small form factor, especially if the devices support multi-hop communication. Thus far, however, closed-loop control over multi-hop low-power wireless has only been demonstrated for update intervals on the order of multiple seconds. This paper presents a wireless embedded system that tames imperfections impairing control performance such as jitter or packet loss, and a control design that exploits the essential properties of this system to provably guarantee closed-loop stability for linear dynamic systems. Using experiments on a testbed with multiple cart-pole systems, we are the first to demonstrate the feasibility and to assess the performance of closed-loop control and coordination over multi-hop low-power wireless for update intervals from 20 ms to 50 ms.
arXiv PDF DOI BibTeX
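The kind of stability guarantee discussed above must hold despite packet loss. A standard way to check this is a mean-square stability test under i.i.d. Bernoulli dropouts, sketched here for a scalar plant with zero-input on loss — a textbook simplification, not the paper's multi-hop analysis (the matrix case replaces the squares with Kronecker products of the closed-loop matrices).

```python
# Scalar sketch: plant x_{k+1} = a*x + b*u over a lossy wireless link. When the
# control packet is lost (probability p), the actuator applies zero input and
# the open-loop dynamics act for that step.
a, b, k_gain = 1.2, 1.0, 1.1
a_ok, a_loss = a - b * k_gain, a        # closed-loop vs. open-loop step

def mean_square_stable(p):
    """Second-moment test for i.i.d. Bernoulli dropouts: E[x_{k+1}^2] =
    ((1 - p) * a_ok**2 + p * a_loss**2) * E[x_k^2] must contract."""
    return (1 - p) * a_ok ** 2 + p * a_loss ** 2 < 1.0

print(mean_square_stable(0.1), mean_square_stable(0.9))  # True False
```

The test exposes a critical loss rate beyond which no amount of averaging saves the loop, which is why bounding the network's worst-case loss and jitter is central to the co-design above.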