Publications

Perceiving Systems Conference Paper HOOD: Hierarchical Graphs for Generalized Modelling of Clothing Dynamics Grigorev, A., Thomaszewski, B., Black, M. J., Hilliges, O. In IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), 16965-16974, CVPR, June 2023 (Published)
We propose a method that leverages graph neural networks, multi-level message passing, and unsupervised training to enable real-time prediction of realistic clothing dynamics. Whereas existing methods based on linear blend skinning must be trained for specific garments, our method is agnostic to body shape and applies to tight-fitting garments as well as loose, free-flowing clothing. Our method furthermore handles changes in topology (e.g., garments with buttons or zippers) and material properties at inference time. As one key contribution, we propose a hierarchical message-passing scheme that efficiently propagates stiff stretching modes while preserving local detail. We empirically show that our method outperforms strong baselines quantitatively and that its results are perceived as more realistic than state-of-the-art methods.
arXiv project pdf supp URL BibTeX

Perceiving Systems Neural Capture and Synthesis Conference Paper MIME: Human-Aware 3D Scene Generation Yi, H., Huang, C. P., Tripathi, S., Hering, L., Thies, J., Black, M. J. In IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), 12965-12976, CVPR, June 2023 (Published)
Generating realistic 3D worlds occupied by moving humans has many applications in games, architecture, and synthetic data creation. But generating such scenes is expensive and labor intensive. Recent work generates human poses and motions given a 3D scene. Here, we take the opposite approach and generate 3D indoor scenes given 3D human motion. Such motions can come from archival motion capture or from IMU sensors worn on the body, effectively turning human movement into a “scanner” of the 3D world. Intuitively, human movement indicates the free-space in a room and human contact indicates surfaces or objects that support activities such as sitting, lying or touching. We propose MIME (Mining Interaction and Movement to infer 3D Environments), which is a generative model of indoor scenes that produces furniture layouts that are consistent with the human movement. MIME uses an auto-regressive transformer architecture that takes the already generated objects in the scene as well as the human motion as input, and outputs the next plausible object. To train MIME, we build a dataset by populating the 3D FRONT scene dataset with 3D humans. Our experiments show that MIME produces more diverse and plausible 3D scenes than a recent generative scene method that does not know about human movement. Code and data will be available for research at https://mime.is.tue.mpg.de.
project arXiv paper URL BibTeX

Perceiving Systems Conference Paper PointAvatar: Deformable Point-Based Head Avatars From Videos Zheng, Y., Yifan, W., Wetzstein, G., Black, M. J., Hilliges, O. In IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), 21057-21067, CVPR, June 2023 (Published)
The ability to create realistic animatable and relightable head avatars from casual video sequences would open up wide-ranging applications in communication and entertainment. Current methods either build on explicit 3D morphable meshes (3DMM) or exploit neural implicit representations. The former are limited by fixed topology, while the latter are non-trivial to deform and inefficient to render. Furthermore, existing approaches entangle lighting and albedo, limiting the ability to re-render the avatar in new environments. In contrast, we propose PointAvatar, a deformable point-based representation that disentangles the source color into intrinsic albedo and normal-dependent shading. We demonstrate that PointAvatar bridges the gap between existing mesh- and implicit representations, combining high-quality geometry and appearance with topological flexibility, ease of deformation and rendering efficiency. We show that our method is able to generate animatable 3D avatars using monocular videos from multiple sources including hand-held smartphones, laptop webcams and internet videos, achieving state-of-the-art quality in challenging cases where previous methods fail, e.g., thin hair strands, while being significantly more efficient in training than competing methods.
pdf project code video DOI URL BibTeX

Perceiving Systems Conference Paper SLOPER4D: A Scene-Aware Dataset for Global 4D Human Pose Estimation in Urban Environments Dai, Y., Lin, Y., Lin, X., Wen, C., Xu, L., Yi, H., Shen, S., Ma, Y., Wang, C. In IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), 682-692, CVF, CVPR, June 2023 (Published)
We present SLOPER4D, a novel scene-aware dataset collected in large urban environments to facilitate the research of global human pose estimation (GHPE) with human-scene interaction in the wild. Employing a head-mounted device integrated with a LiDAR and camera, we record 12 human subjects’ activities over 10 diverse urban scenes from an egocentric view. Frame-wise annotations for 2D key points, 3D pose parameters, and global translations are provided, together with reconstructed scene point clouds. To obtain accurate 3D ground truth in such large dynamic scenes, we propose a joint optimization method to fit local SMPL meshes to the scene and fine-tune the camera calibration during dynamic motions frame by frame, resulting in plausible and scene-natural 3D human poses. Eventually, SLOPER4D consists of 15 sequences of human motions, each of which has a trajectory length of more than 200 meters (up to 1,300 meters) and covers an area of more than 200 square meters (up to 30,000 square meters), including more than 100K LiDAR frames, 300K video frames, and 500K IMU-based motion frames. With SLOPER4D, we provide a detailed and thorough analysis of two critical tasks, including camera-based 3D HPE and LiDAR-based 3D HPE in urban environments, and benchmark a new task, GHPE. The in-depth analysis demonstrates SLOPER4D poses significant challenges to existing methods and produces great research opportunities. The dataset and code are released at https://github.com/climbingdaily/SLOPER4D.
project dataset codebase paper arXiv BibTeX

Perceiving Systems Conference Paper TRACE: 5D Temporal Regression of Avatars With Dynamic Cameras in 3D Environments Sun, Y., Bao, Q., Liu, W., Mei, T., Black, M. J. In IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), 8856-8866, CVPR, June 2023 (Published)
Although the estimation of 3D human pose and shape (HPS) is rapidly progressing, current methods still cannot reliably estimate moving humans in global coordinates, which is critical for many applications. This is particularly challenging when the camera is also moving, entangling human and camera motion. To address these issues, we adopt a novel 5D representation (space, time, and identity) that enables end-to-end reasoning about people in scenes. Our method, called TRACE, introduces several novel architectural components. Most importantly, it uses two new "maps" to reason about the 3D trajectory of people over time in camera and world coordinates. An additional memory unit enables persistent tracking of people even during long occlusions. TRACE is the first one-stage method to jointly recover and track 3D humans in global coordinates from dynamic cameras. By training it end-to-end, and using full image information, TRACE achieves state-of-the-art performance on tracking and HPS benchmarks. The code and dataset are released for research purposes.
pdf supp code video URL BibTeX

Autonomous Learning Conference Paper Backpropagation through Combinatorial Algorithms: Identity with Projection Works Sahoo, S., Paulus, A., Vlastelica, M., Musil, V., Kuleshov, V., Martius, G. In Proceedings of the Eleventh International Conference on Learning Representations, May 2023 (Accepted)
Embedding discrete solvers as differentiable layers has given modern deep learning architectures combinatorial expressivity and discrete reasoning capabilities. The derivative of these solvers is zero or undefined, therefore a meaningful replacement is crucial for effective gradient-based learning. Prior works rely on smoothing the solver with input perturbations, relaxing the solver to continuous problems, or interpolating the loss landscape with techniques that typically require additional solver calls, introduce extra hyper-parameters, or compromise performance. We propose a principled approach to exploit the geometry of the discrete solution space to treat the solver as a negative identity on the backward pass and further provide a theoretical justification. Our experiments demonstrate that such a straightforward hyper-parameter-free approach is able to compete with previous more complex methods on numerous experiments such as backpropagation through discrete samplers, deep graph matching, and image retrieval. Furthermore, we substitute the previously proposed problem-specific and label-dependent margin with a generic regularization procedure that prevents cost collapse and increases robustness.
OpenReview Arxiv Pdf URL BibTeX
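
The backward-pass idea described in the abstract above (treating the solver as a negative identity) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy solver (a one-hot argmin over a cost vector), the squared-error loss, and all function names are assumptions made for illustration, and the projection and regularization steps mentioned in the abstract are omitted.

```python
import numpy as np

def solver(costs):
    """Toy discrete solver (assumption): one-hot indicator of the cheapest item,
    i.e. the minimizer of <costs, y> over one-hot vectors y."""
    y = np.zeros_like(costs)
    y[np.argmin(costs)] = 1.0
    return y

def backward_negative_identity(grad_y):
    """Replace the solver's zero/undefined Jacobian by the negative identity."""
    return -grad_y

def train(costs, target, lr=0.1, steps=100):
    """Gradient descent on the costs so that the solver output matches `target`."""
    costs = costs.astype(float).copy()
    for _ in range(steps):
        y = solver(costs)
        grad_y = y - target                      # gradient of 0.5 * ||y - target||^2 w.r.t. y
        grad_costs = backward_negative_identity(grad_y)
        costs -= lr * grad_costs                 # raises the chosen item's cost, lowers the target's
    return costs
```

With a cost-minimizing solver, increasing the cost of a wrongly selected item and decreasing the cost of the desired one is exactly what the negated gradient does, which is why the sign flip matters in this toy setting.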

Autonomous Learning Empirical Inference Conference Paper Benchmarking Offline Reinforcement Learning on Real-Robot Hardware Gürtler, N., Blaes, S., Kolev, P., Widmaier, F., Wüthrich, M., Bauer, S., Schölkopf, B., Martius, G. In Proceedings of the Eleventh International Conference on Learning Representations (ICLR), May 2023 (Published)
Learning policies from previously recorded data is a promising direction for real-world robotics tasks, as online learning is often infeasible. Dexterous manipulation in particular remains an open problem in its general form. The combination of offline reinforcement learning with large diverse datasets, however, has the potential to lead to a breakthrough in this challenging domain analogously to the rapid progress made in supervised learning in recent years. To coordinate the efforts of the research community toward tackling this problem, we propose a benchmark including: i) a large collection of data for offline learning from a dexterous manipulation platform on two tasks, obtained with capable RL agents trained in simulation; ii) the option to execute learned policies on a real-world robotic system and a simulation for efficient debugging. We evaluate prominent open-sourced offline reinforcement learning algorithms on the datasets and provide a reproducible experimental setup for offline reinforcement learning on real systems.
Website arXiv Code URL BibTeX

Autonomous Learning Empirical Inference Conference Paper DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems Schumacher, P., Haeufle, D. F., Büchler, D., Schmitt, S., Martius, G. In The Eleventh International Conference on Learning Representations (ICLR), May 2023 (Published)
Muscle-actuated organisms are capable of learning an unparalleled diversity of dexterous movements despite their vast number of muscles. Reinforcement learning (RL) on large musculoskeletal models, however, has not been able to show similar performance. We conjecture that ineffective exploration in large overactuated action spaces is a key problem. This is supported by our finding that common exploration noise strategies are inadequate in synthetic examples of overactuated systems. We identify differential extrinsic plasticity (DEP), a method from the domain of self-organization, as being able to induce state-space covering exploration within seconds of interaction. By integrating DEP into RL, we achieve fast learning of reaching and locomotion in musculoskeletal systems, outperforming current approaches in all considered tasks in sample efficiency and robustness.
Arxiv pdf Website URL BibTeX

Empirical Inference Article ResMiCo: Increasing the quality of metagenome-assembled genomes with deep learning Mineeva*, O., Danciu*, D., Schölkopf, B., Ley, R. E., Rätsch, G., Youngblut, N. D. PLOS Computational Biology, 19(5), Public Library of Science, San Francisco, CA, May 2023, *equal contribution (Published) DOI BibTeX

Haptic Intelligence Miscellaneous 3D Reconstruction for Minimally Invasive Surgery: Lidar Versus Learning-Based Stereo Matching Caccianiga, G., Nubert, J., Hutter, M., Kuchenbecker, K. J. Workshop paper (2 pages) presented at the ICRA Workshop on Robot-Assisted Medical Imaging, London, UK, May 2023 (Published)
This work investigates real-time 3D surface reconstruction for minimally invasive surgery. Specifically, we analyze depth sensing through laser-based time-of-flight sensing (lidar) and stereo endoscopy on ex-vivo porcine tissue samples. When compared to modern learning-based stereo matching from endoscopic images, lidar achieves lower processing delay, higher frame rate, and superior robustness against sensor distance and poor illumination. Furthermore, we report on the negative effect of near-infrared light penetration on the accuracy of time-of-flight measurements across different tissue types.
BibTeX

Empirical Inference Article A Kernel Stein Test for Comparing Latent Variable Models Kanagawa, H., Jitkrittum, W., Mackey, L., Fukumizu, K., Gretton, A. Journal of the Royal Statistical Society Series B: Statistical Methodology, 85(3):986-1011, May 2023 (Published) arXiv DOI BibTeX

Social Foundations of Computation Conference Paper A Theory of Dynamic Benchmarks Shirali, A., Abebe, R., Hardt, M. In The Eleventh International Conference on Learning Representations (ICLR 2023), May 2023 (Published)
Dynamic benchmarks interweave model fitting and data collection in an attempt to mitigate the limitations of static benchmarks. In contrast to an extensive theoretical and empirical study of the static setting, the dynamic counterpart lags behind due to limited empirical studies and no apparent theoretical foundation to date. Responding to this deficit, we initiate a theoretical study of dynamic benchmarking. We examine two realizations, one capturing current practice and the other modeling more complex settings. In the first model, where data collection and model fitting alternate sequentially, we prove that model performance improves initially but can stall after only three rounds. Label noise arising from, for instance, annotator disagreement leads to even stronger negative results. Our second model generalizes the first to the case where data collection and model fitting have a hierarchical dependency structure. We show that this design guarantees strictly more progress than the first, albeit at a significant increase in complexity. We support our theoretical analysis by simulating dynamic benchmarks on two popular datasets. These results illuminate the benefits and practical limitations of dynamic benchmarking, providing both a theoretical foundation and a causal explanation for observed bottlenecks in empirical work.
arXiv URL BibTeX

Empirical Inference Conference Paper A law of adversarial risk, interpolation, and label noise Paleka, D., Sanyal, A. The Eleventh International Conference on Learning Representations (ICLR), May 2023 (Published) URL BibTeX

Haptic Intelligence Miscellaneous AiroTouch: Naturalistic Vibrotactile Feedback for Telerobotic Construction-Related Tasks Gong, Y., Tashiro, N., Javot, B., Lauer, A. P. R., Sawodny, O., Kuchenbecker, K. J. Extended abstract (1 page) presented at the ICRA Workshop on Communicating Robot Learning across Human-Robot Interaction, London, UK, May 2023 (Published) BibTeX

Empirical Inference Article Better Together: Data Harmonization and Cross-Study Analysis of Abdominal MRI Data From UK Biobank and the German National Cohort Gatidis, S., Kart, T., Fischer, M., Winzeck, S., Glocker, B., Bai, W., Bülow, R., Emmel, C., Friedrich, L., Kauczor, H., Keil, T., Kröncke, T., Mayer, P., Niendorf, T., Peters, A., Pischon, T., Schaarschmidt, B., Schmidt, B., Schulze, M., Umutlu, L., et al. Investigative Radiology, 58(5):346-354, May 2023 (Published) DOI BibTeX

Autonomous Learning Empirical Inference Conference Paper Bridging the Gap to Real-World Object-Centric Learning Seitzer, M., Horn, M., Zadaianchuk, A., Zietlow, D., Xiao, T., Simon-Gabriel, C., He, T., Zhang, Z., Schölkopf, B., Brox, T., Locatello, F. In Proceedings of the Eleventh International Conference on Learning Representations (ICLR), May 2023 (Published)
Humans naturally decompose their environment into entities at the appropriate level of abstraction to act in the world. Allowing machine learning algorithms to derive this decomposition in an unsupervised way has become an important line of research. However, current methods are restricted to simulated data or require additional information in the form of motion or depth in order to successfully discover objects. In this work, we overcome this limitation by showing that reconstructing features from models trained in a self-supervised manner is a sufficient training signal for object-centric representations to arise in a fully unsupervised way. Our approach, DINOSAUR, significantly outperforms existing object-centric learning models on simulated data and is the first unsupervised object-centric model that scales to real-world datasets such as COCO and PASCAL VOC. DINOSAUR is conceptually simple and shows competitive performance compared to more involved pipelines from the computer vision literature.
Code Website URL BibTeX

Empirical Inference Conference Paper Disentanglement of Correlated Factors via Hausdorff Factorized Support Roth, K., Ibrahim, M., Akata, Z., Vincent, P., Bouchacourt, D. The Eleventh International Conference on Learning Representations (ICLR), May 2023 (Published) URL BibTeX

Autonomous Learning Conference Paper Efficient Learning of High Level Plans from Play Armengol Urpi, N., Bagatella, M., Hilliges, O., Martius, G., Coros, S. In International Conference on Robotics and Automation, May 2023 (Accepted)
Real-world robotic manipulation tasks remain an elusive challenge, since they involve both fine-grained environment interaction and the ability to plan for long-horizon goals. Although deep reinforcement learning (RL) methods have shown encouraging results when planning end-to-end in high-dimensional environments, they remain fundamentally limited by poor sample efficiency due to inefficient exploration, and by the complexity of credit assignment over long horizons. In this work, we present Efficient Learning of High-Level Plans from Play (ELF-P), a framework for robotic learning that bridges motion planning and deep RL to achieve long-horizon complex manipulation tasks. We leverage task-agnostic play data to learn a discrete behavioral prior over object-centric primitives, modeling their feasibility given the current context. We then design a high-level goal-conditioned policy which (1) uses primitives as building blocks to scaffold complex long-horizon tasks and (2) leverages the behavioral prior to accelerate learning. We demonstrate that ELF-P has significantly better sample efficiency than relevant baselines over multiple realistic manipulation tasks and learns policies that can be easily transferred to physical hardware.
Arxiv Website Poster BibTeX

Empirical Inference Conference Paper Flow Annealed Importance Sampling Bootstrap Midgley*, L. I., Stimper*, V., Simm, G. N. C., Schölkopf, B., Hernández-Lobato, J. M. The Eleventh International Conference on Learning Representations (ICLR), May 2023, *equal contribution (Published) arXiv URL BibTeX

Empirical Inference Conference Paper Generalizing and Decoupling Neural Collapse via Hyperspherical Uniformity Gap Liu, W., Yu, L., Weller, A., Schölkopf, B. The Eleventh International Conference on Learning Representations (ICLR), May 2023 (Published) URL BibTeX

Empirical Inference Conference Paper How robust is unsupervised representation learning to distribution shift? Shi, Y., Daunhawer, I., Vogt, J. E., Torr, P., Sanyal, A. The Eleventh International Conference on Learning Representations (ICLR), May 2023 (Published) URL BibTeX

Physics for Inference and Optimization Article Hypergraphx: a library for higher-order network analysis Lotito, Q. F., Contisciani, M., De Bacco, C., Di Gaetano, L., Gallo, L., Montresor, A., Musciotto, F., Ruggeri, N., Battiston, F. Journal of Complex Networks, 11, May 2023 (Published) Preprint Code DOI BibTeX

Empirical Inference Conference Paper Identifiability Results for Multimodal Contrastive Learning Daunhawer, I., Bizeul, A., Palumbo, E., Marx, A., Vogt, J. E. The Eleventh International Conference on Learning Representations (ICLR), May 2023 (Published) URL BibTeX

Empirical Inference Conference Paper Investigating the Impact of Action Representations in Policy Gradient Algorithms Schneider, J., Schumacher, P., Häufle, D., Schölkopf, B., Büchler, D. Workshop on effective Representations, Abstractions, and Priors for Robot Learning (RAP4Robots) @ ICRA 2023, May 2023 (Published) arXiv Poster BibTeX

Empirical Inference Ph.D. Thesis Learning with and for discrete optimization Paulus, M. ETH Zurich, Switzerland, May 2023, CLS PhD Program (Published) BibTeX

Perceiving Systems Empirical Inference Conference Paper MeshDiffusion: Score-based Generative 3D Mesh Modeling Liu, Z., Feng, Y., Black, M. J., Nowrouzezahrai, D., Paull, L., Liu, W. The Eleventh International Conference on Learning Representations (ICLR), ICLR, May 2023 (Published)
We consider the task of generating realistic 3D shapes, which is useful for a variety of applications such as automatic scene generation and physical simulation. Compared to other 3D representations like voxels and point clouds, meshes are more desirable in practice, because (1) they enable easy and arbitrary manipulation of shapes for relighting and simulation, and (2) they can fully leverage the power of modern graphics pipelines which are mostly optimized for meshes. Previous scalable methods for generating meshes typically rely on sub-optimal post-processing, and they tend to produce overly smooth or noisy surfaces without fine-grained geometric details. To overcome these shortcomings, we take advantage of the graph structure of meshes and use a simple yet very effective generative modeling method to generate 3D meshes. Specifically, we represent meshes with deformable tetrahedral grids, and then train a diffusion model on this direct parametrization. We demonstrate the effectiveness of our model on multiple generative tasks.
Home Code URL BibTeX

Empirical Inference Conference Paper Meta-learning Adaptive Deep Kernel Gaussian Processes for Molecular Property Prediction Chen, W., Tripp, A., Hernández-Lobato, J. M. The Eleventh International Conference on Learning Representations (ICLR), May 2023 (Published) URL BibTeX

Empirical Inference Movement Generation and Control Conference Paper On the Use of Torque Measurement in Centroidal State Estimation Khorshidi, S., Gazar, A., Rotella, N., Naveau, M., Righetti, L., Bennewitz, M., Khadiv, M. IEEE International Conference on Robotics and Automation (ICRA), 9931-9937, May 2023 (Published) DOI BibTeX

Autonomous Learning Conference Paper Pink Noise Is All You Need: Colored Noise Exploration in Deep Reinforcement Learning Eberhard, O., Hollenstein, J., Pinneri, C., Martius, G. In Proceedings of the Eleventh International Conference on Learning Representations (ICLR), May 2023
In off-policy deep reinforcement learning with continuous action spaces, exploration is often implemented by injecting action noise into the action selection process. Popular algorithms based on stochastic policies, such as SAC or MPO, inject white noise by sampling actions from uncorrelated Gaussian distributions. In many tasks, however, white noise does not provide sufficient exploration, and temporally correlated noise is used instead. A common choice is Ornstein-Uhlenbeck (OU) noise, which is closely related to Brownian motion (red noise). Both red noise and white noise belong to the broad family of colored noise. In this work, we perform a comprehensive experimental evaluation on MPO and SAC to explore the effectiveness of other colors of noise as action noise. We find that pink noise, which is halfway between white and red noise, significantly outperforms white noise, OU noise, and other alternatives on a wide range of environments. Thus, we recommend it as the default choice for action noise in continuous control.
URL BibTeX
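
To make the notion of colored action noise from the abstract above concrete, the following is a minimal sketch of one common way to sample noise whose power spectrum falls off as 1/f^β (β = 0 white, β = 1 pink, β = 2 red). Spectral shaping via the FFT is an assumption chosen for illustration, not necessarily the procedure used in the paper, and the function name and parameters are hypothetical.

```python
import numpy as np

def colored_noise(beta, n_steps, rng=None):
    """Sample a zero-mean, unit-variance sequence with power spectrum ~ 1 / f**beta."""
    rng = np.random.default_rng() if rng is None else rng
    freqs = np.fft.rfftfreq(n_steps)             # non-negative frequencies, freqs[0] == 0
    scale = np.zeros_like(freqs)                 # DC component stays zero (zero-mean signal)
    scale[1:] = freqs[1:] ** (-beta / 2.0)       # amplitude ~ f^(-beta/2), so power ~ f^(-beta)
    real = rng.standard_normal(freqs.size)
    imag = rng.standard_normal(freqs.size)
    spectrum = scale * (real + 1j * imag)        # random phases and magnitudes per frequency
    signal = np.fft.irfft(spectrum, n=n_steps)   # back to the time domain
    return signal / signal.std()

def lag1_autocorrelation(x):
    """Correlation between consecutive samples; near 0 for white noise."""
    return float(np.corrcoef(x[:-1], x[1:])[0, 1])
```

In a continuous-control setting, one such sequence per action dimension could be sampled for a whole episode and used in place of independent Gaussian samples; how the noise is scaled and combined with the policy is left open here.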

Perceiving Systems Ph.D. Thesis Reining in the Deep Generative Models Ghosh, P. University of Tübingen, May 2023 (Published)
This thesis studies controllability of generative models (specifically VAEs and GANs) applied primarily to images. We improve 1. generation quality, by removing the arbitrary prior assumptions, 2. classification by suitably choosing the latent space distribution, and 3. inference performance by optimizing the generative and inference objective simultaneously. Variational autoencoders (VAEs) are an incredibly useful tool as they can be used as a backbone for a variety of machine learning tasks e.g., semi-supervised learning, representation learning, unsupervised learning, etc. However, the generated samples are overly smooth and this limits their practical usage tremendously. There are two leading hypotheses to explain this: 1. bad likelihood model and 2. overly simplistic prior. We investigate these by designing a deterministic yet samplable autoencoder named Regularized Autoencoders (RAE). This redesign helps us enforce arbitrary priors over the latent distribution of a VAE addressing hypothesis (1) above. This leads us to conclude that a poor likelihood model is the predominant factor that makes VAEs blurry. Furthermore, we show that combining generative (e.g., VAE objective) and discriminative objectives (e.g., classification objective) improves the performance of both. Specifically, we use a special case of an RAE to build a classifier that offers robustness against adversarial attack. Conditional generative models have the potential to revolutionize the animation industry, among others. However, to do so, the two key requirements are: 1. they must be of high quality (i.e., generate high-resolution images) and 2. they must follow their conditioning (i.e., generate images that have the properties specified by the condition). We exploit pixel-localized correlation between the conditioning variable and generated image to ensure strong association between the two and thereby gain precise control over the generated content.
We further show that closing the generation-inference loop (training them together) in latent variable models benefits both the generation and the inference component. This opens up the possibility to train an inference and a generative model simultaneously in one unified framework, in the fully or semi supervised setting. With the proposed approach, one can build a robust classifier by introducing the marginal likelihood of a data point, removing arbitrary assumptions about the prior distribution, mitigating posterior-prior distribution mismatch and completing the generation inference loop. In this thesis, we study real-life implications of each of the themes using various image classification and generation frameworks.
download pdf DOI BibTeX

Empirical Inference Article Staying and Returning dynamics of young children’s attention Kim, J., Singh, S., Vales, C., Keebler, E., Fisher, A. V., Thiessen, E. D. Developmental Science, 26(6), May 2023 (Published) DOI BibTeX

Empirical Inference Conference Paper Structure by Architecture: Structured Representations without Regularization Leeb, F., Lanzillotta, G., Annadani, Y., Besserve, M., Bauer, S., Schölkopf, B. The Eleventh International Conference on Learning Representations (ICLR), May 2023 (Published) URL BibTeX

Haptic Intelligence Miscellaneous Surface Perception through Haptic-Auditory Contact Data Khojasteh, B., Shao, Y., Kuchenbecker, K. J. Workshop paper (4 pages) presented at the ICRA Workshop on Embracing Contacts, London, UK, May 2023 (Published)
Sliding a finger or tool along a surface generates rich haptic and auditory contact signals that encode properties crucial for manipulation, such as friction and hardness. To engage in contact-rich manipulation, future robots would benefit from having surface-characterization capabilities similar to humans, but the optimal sensing configuration is not yet known. Thus, we developed a test bed for capturing high-quality measurements as a human touches surfaces with different tools: it includes optical motion capture, a force/torque sensor under the surface sample, high-bandwidth accelerometers on the tool and the fingertip, and a high-fidelity microphone. After recording data from three tool diameters and nine surfaces, we describe a surface-classification pipeline that uses the maximum mean discrepancy (MMD) to compare newly gathered data to each surface in our known library. The results achieved under several pipeline variations are compared, and future investigations are outlined.
URL BibTeX
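
The surface-classification step described in the abstract above compares new data to each known surface using the maximum mean discrepancy (MMD). A minimal sketch of that idea follows; the Gaussian kernel, the fixed bandwidth, the biased estimator, and the dictionary-based `library` are all assumptions made for illustration rather than details confirmed by the paper.

```python
import numpy as np

def rbf_kernel(a, b, bandwidth):
    """Gaussian kernel matrix between rows of a (n, d) and rows of b (m, d)."""
    sq_dists = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth ** 2))

def mmd_squared(x, y, bandwidth=1.0):
    """Biased estimate of the squared MMD between samples x and y."""
    k_xx = rbf_kernel(x, x, bandwidth).mean()
    k_yy = rbf_kernel(y, y, bandwidth).mean()
    k_xy = rbf_kernel(x, y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy

def classify_surface(sample, library, bandwidth=1.0):
    """Return the label of the library entry with the smallest MMD to the new sample.

    `library` is a hypothetical dict mapping surface names to feature arrays (n_i, d).
    """
    return min(library, key=lambda name: mmd_squared(sample, library[name], bandwidth))
```

Each row here would be a feature vector extracted from a short window of contact data; which features and which kernel bandwidth work well are open choices that this sketch does not settle.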

Dynamic Locomotion Article Virtual pivot point in human walking: always experimentally observed but simulations suggest it may not be necessary for stability Schreff, L., Haeufle, D. F. B., Badri-Spröwitz, A., Vielemeyer, J., Müller, R. Journal of Biomechanics, 153, May 2023 (Published)
The intersection of ground reaction forces near a point above the center of mass has been observed in computer simulation models and human walking experiments. Observed so ubiquitously, the intersection point (IP) is commonly assumed to provide postural stability for bipedal walking. In this study, we challenge this assumption by questioning if walking without an IP is possible. Deriving gaits with a neuromuscular reflex model through multi-stage optimization, we found stable walking patterns that show no signs of the IP-typical intersection of ground reaction forces. The non-IP gaits found are stable and successfully rejected step-down perturbations, which indicates that an IP is not necessary for locomotion robustness or postural stability. A collision-based analysis shows that non-IP gaits feature center of mass (CoM) dynamics with vectors of the CoM velocity and ground reaction force increasingly opposing each other, indicating an increased mechanical cost of transport. Although our computer simulation results have yet to be confirmed through experimental studies, they already indicate that the role of the IP in postural stability should be further investigated. Moreover, our observations on the CoM dynamics and gait efficiency suggest that the IP may have an alternative or additional function that should be considered.
arXiv DOI URL BibTeX

Empirical Inference Conference Paper DCI-ES: An Extended Disentanglement Framework with Connections to Identifiability Eastwood*, C., Nicolicioiu*, A. L., von Kügelgen*, J., Kekić, A., Träuble, F., Dittadi, A., Schölkopf, B. The Eleventh International Conference on Learning Representations (ICLR), May 2023, *equal contribution (Published) URL BibTeX

Haptic Intelligence Miscellaneous OCRA: An Optimization-Based Customizable Retargeting Algorithm for Teleoperation Mohan, M., Kuchenbecker, K. J. Workshop paper (3 pages) presented at the ICRA Workshop Toward Robot Avatars, London, UK, May 2023 (Published)
This paper presents a real-time optimization-based algorithm for mapping motion between two kinematically dissimilar serial linkages, such as a human arm and a robot arm. OCRA can be customized based on the target task to weight end-effector orientation versus the configuration of the central line of the arm, which we call the skeleton. A video-watching study (N=70) demonstrated that when this algorithm considers both the hand orientation and the arm skeleton, it creates robot arm motions that users perceive to be highly similar to those of the human operator, indicating OCRA would be suitable for telerobotics and telepresence through avatars.
URL BibTeX

Perceiving Systems Article Fast-SNARF: A Fast Deformer for Articulated Neural Fields Chen, X., Jiang, T., Song, J., Rietmann, M., Geiger, A., Black, M. J., Hilliges, O. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 1-15, April 2023 (Published)
Neural fields have revolutionized the area of 3D reconstruction and novel view synthesis of rigid scenes. A key challenge in making such methods applicable to articulated objects, such as the human body, is to model the deformation of 3D locations between the rest pose (a canonical space) and the deformed space. We propose a new articulation module for neural fields, Fast-SNARF, which finds accurate correspondences between canonical space and posed space via iterative root finding. Fast-SNARF is a drop-in replacement in functionality to our previous work, SNARF, while significantly improving its computational efficiency. We contribute several algorithmic and implementation improvements over SNARF, yielding a speed-up of 150×. These improvements include voxel-based correspondence search, pre-computing the linear blend skinning function, and an efficient software implementation with CUDA kernels. Fast-SNARF enables efficient and simultaneous optimization of shape and skinning weights given deformed observations without correspondences (e.g., 3D meshes). Because learning of deformation maps is a crucial component in many 3D human avatar methods and since Fast-SNARF provides a computationally efficient solution, we believe that this work represents a significant step towards the practical creation of 3D virtual humans.
pdf publisher site code DOI URL BibTeX

Robotic Materials Patent High Strain Peano Hydraulically Amplified Self-healing Electrostatic (HASEL) Transducers Keplinger, C. M., Wang, X., Mitchell, S. K. (US Patent 11635094), April 2023
High strain hydraulically amplified self-healing electrostatic transducers having increased maximum theoretical and practical strains are disclosed. In particular, the actuators include electrode configurations having a zipping front created by the attraction of the electrodes that is configured orthogonally to a strain axis along which the actuators deform. This configuration produces increased strains. In turn, various form factors for the actuator configuration are presented including an artificial circular muscle and a strain amplifying pulley system. Other actuator configurations are contemplated that include independent and opposed electrode pairs to create cyclic activation, hybrid electrode configurations, and use of strain limiting layers for controlled deflection of the actuator.
URL BibTeX

Empirical Inference Article Uncovering the Organization of Neural Circuits with Generalized Phase Locking Analysis Safavi, S., Panagiotaropoulos, T. I., Kapoor, V., Ramirez-Villegas, J. F., Logothetis, N., Besserve, M. PLOS Computational Biology, 19(4):45, Public Library of Science, April 2023 (Published) bioRxiv DOI BibTeX

Robotic Materials Physical Intelligence Bioinspired Autonomous Miniature Robots Article A Versatile Jellyfish-Like Robotic Platform for Effective Underwater Propulsion and Manipulation Wang, T., Joo, H., Song, S., Hu, W., Keplinger, C., Sitti, M. Science Advances, 9(15), American Association for the Advancement of Science, April 2023, Tianlu Wang and Hyeong-Joon Joo contributed equally to this work. (Published)
Underwater devices are critical for environmental applications. However, existing prototypes typically use bulky, noisy actuators and limited configurations. Consequently, they struggle to ensure noise-free and gentle interactions with underwater species when realizing practical functions. Therefore, we developed a jellyfish-like robotic platform enabled by a synergy of electrohydraulic actuators and a hybrid structure of rigid and soft components. Our 16-cm-diameter noise-free prototype could control the fluid flow to propel while manipulating objects to be kept beneath its body without physical contact, thereby enabling safer interactions. Its against-gravity speed was up to 6.1 cm/s, substantially quicker than other examples in the literature, while only requiring a low input power of around 100 mW. Moreover, using the platform, we demonstrated contact-based object manipulation, fluidic mixing, shape adaptation, steering, wireless swimming, and cooperation of two to three robots. This study introduces a versatile jellyfish-like robotic platform with a wide range of functions for diverse applications.
YouTube video DOI URL BibTeX

Empirical Inference Article Adapting to noise distribution shifts in flow-based gravitational-wave inference Wildberger, J., Dax, M., Green, S. R., Gair, J., Pürrer, M., Macke, J. H., Buonanno, A., Schölkopf, B. Physical Review D, 107(8), April 2023 (Published) DOI BibTeX

Empirical Inference Conference Paper BaCaDI: Bayesian Causal Discovery with Unknown Interventions Hägele, A., Rothfuss, J., Lorch, L., Somnath, V. R., Schölkopf, B., Krause, A. Proceedings of the 26th International Conference on Artificial Intelligence and Statistics (AISTATS), 206:1411-1436, Proceedings of Machine Learning Research, (Editors: Ruiz, Francisco and Dy, Jennifer and van de Meent, Jan-Willem), PMLR, April 2023 (Published) URL BibTeX

Empirical Inference Conference Paper Backtracking Counterfactuals von Kügelgen, J., Mohamed, A., Beckers, S. Proceedings of the Second Conference on Causal Learning and Reasoning (CLeaR), 213:177-196, Proceedings of Machine Learning Research, (Editors: van der Schaar, Mihaela and Zhang, Cheng and Janzing, Dominik), PMLR, April 2023 (Published) URL BibTeX

Empirical Inference Conference Paper Causal Triplet: An Open Challenge for Intervention-centric Causal Representation Learning Liu, Y., Alahi, A., Russell, C., Horn, M., Zietlow, D., Schölkopf, B., Locatello, F. Proceedings of the Second Conference on Causal Learning and Reasoning (CLeaR), 213:553-573, Proceedings of Machine Learning Research, (Editors: van der Schaar, Mihaela and Zhang, Cheng and Janzing, Dominik), PMLR, April 2023 (Published) URL BibTeX

Empirical Inference Conference Paper Dataflow graphs as complete causal graphs Paleyes, A., Guo, S., Schölkopf, B., Lawrence, N. D. 2nd International Conference on AI Engineering - Software Engineering for AI (CAIN), 7-12, IEEE, April 2023 (Published) arXiv DOI BibTeX

Haptic Intelligence Article Effects of Automated Skill Assessment on Robotic Surgery Training Brown, J. D., Kuchenbecker, K. J. The International Journal of Medical Robotics and Computer Assisted Surgery, 19(2):e2492, April 2023 (Published)
Background: Several automated skill-assessment approaches have been proposed for robotic surgery, but their utility is not well understood. This article investigates the effects of one machine-learning-based skill-assessment approach on psychomotor skill development in robotic surgery training. Methods: N=29 trainees (medical students and residents) with no robotic surgery experience performed five trials of inanimate peg transfer with an Intuitive Surgical da Vinci Standard robot. Half of the participants received no post-trial feedback. The other half received automatically calculated scores from five Global Evaluative Assessment of Robotic Skill (GEARS) domains post-trial. Results: There were no significant differences between the groups regarding overall improvement or skill improvement rate. However, participants who received post-trial feedback rated their overall performance improvement significantly lower than participants who did not receive feedback. Conclusions: These findings indicate that automated skill evaluation systems might improve trainee self-awareness but not accelerate early-stage psychomotor skill development in robotic surgery training.
DOI BibTeX

Haptic Intelligence Article Haptify: A Measurement-Based Benchmarking System for Grounded Force-Feedback Devices Fazlollahi, F., Kuchenbecker, K. J. IEEE Transactions on Robotics, 39(2):1622-1636, April 2023 (Published)
Grounded force-feedback (GFF) devices are an established and diverse class of haptic technology based on robotic arms. However, the number of designs and how they are specified make comparing devices difficult. We thus present Haptify, a benchmarking system that can thoroughly, fairly, and noninvasively evaluate GFF haptic devices. The user holds the instrumented device end-effector and moves it through a series of passive and active experiments. Haptify records the interaction between the hand, device, and ground with a seven-camera optical motion-capture system, a 60-cm-square custom force plate, and a customized sensing end-effector. We demonstrate six key ways to assess GFF device performance: workspace shape, global free-space forces, global free-space vibrations, local dynamic forces and torques, frictionless surface rendering, and stiffness rendering. We then use Haptify to benchmark two commercial haptic devices. With a smaller workspace than the 3D Systems Touch, the more expensive Touch X outputs smaller free-space forces and vibrations, smaller and more predictable dynamic forces and torques, and higher-quality renderings of a frictionless surface and high stiffness.
DOI BibTeX

Empirical Inference Article Instrumental variable regression via kernel maximum moment loss Zhang, R., Imaizumi, M., Schölkopf, B., Muandet, K. Journal of Causal Inference, 11(1), April 2023 (Published) DOI BibTeX

Empirical Inference Conference Paper Iterative Teaching by Data Hallucination Qiu, Z., Liu, W., Xiao, T., Liu, Z., Bhatt, U., Luo, Y., Weller, A., Schölkopf, B. Proceedings of the 26th International Conference on Artificial Intelligence and Statistics (AISTATS), 206:9892-9913, Proceedings of Machine Learning Research, (Editors: Ruiz, Francisco and Dy, Jennifer and van de Meent, Jan-Willem), PMLR, April 2023 (Published) URL BibTeX