Publications

DEPARTMENTS

Empirical Inference

Haptic Intelligence

Modern Magnetic Systems

Perceiving Systems

Physical Intelligence

Robotic Materials

Social Foundations of Computation


Research Groups

Autonomous Vision

Autonomous Learning

Bioinspired Autonomous Miniature Robots

Dynamic Locomotion

Embodied Vision

Human Aspects of Machine Learning

Intelligent Control Systems

Learning and Dynamical Systems

Locomotion in Biorobotic and Somatic Systems

Micro, Nano, and Molecular Systems

Movement Generation and Control

Neural Capture and Synthesis

Physics for Inference and Optimization

Organizational Leadership and Diversity

Probabilistic Learning Group


Haptic Intelligence Ph.D. Thesis Gesture-Based Nonverbal Interaction for Exercise Robots Mohan, M. University of Tübingen, Tübingen, Germany, October 2023, Department of Computer Science (Published)
When teaching or coaching, humans augment their words with carefully timed hand gestures, head and body movements, and facial expressions to provide feedback to their students. Robots, however, rarely utilize these nuanced cues. A minimally supervised social robot equipped with these abilities could support people in exercising, physical therapy, and learning new activities. This thesis examines how the intuitive power of human gestures can be harnessed to enhance human-robot interaction. To address this question, this research explores gesture-based interactions to expand the capabilities of a socially assistive robotic exercise coach, investigating the perspectives of both novice users and exercise-therapy experts. This thesis begins by concentrating on the user's engagement with the robot, analyzing the feasibility of minimally supervised gesture-based interactions. This exploration seeks to establish a framework in which robots can interact with users in a more intuitive and responsive manner. The investigation then shifts its focus toward the professionals who are integral to the success of these innovative technologies: the exercise-therapy experts. Roboticists face the challenge of translating the knowledge of these experts into robotic interactions. We address this challenge by developing a teleoperation algorithm that can enable exercise therapists to create customized gesture-based interactions for a robot. Thus, this thesis lays the groundwork for dynamic gesture-based interactions in minimally supervised environments, with implications for not only exercise-coach robots but also broader applications in human-robot interaction.
BibTeX

Social Foundations of Computation Conference Paper Is Your Model Predicting the Past? Hardt, M., Kim, M. P. In Proceedings of the Third ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization (EAAMO), ACM, October 2023 (Published)
When does a machine learning model predict the future of individuals and when does it recite patterns that predate the individuals? In this work, we propose a distinction between these two pathways of prediction, supported by theoretical, empirical, and normative arguments. At the center of our proposal is a family of simple and efficient statistical tests, called backward baselines, that demonstrate if, and to what extent, a model recounts the past. Our statistical theory provides guidance for interpreting backward baselines, establishing equivalences between different baselines and familiar statistical concepts. Concretely, we derive a meaningful backward baseline for auditing a prediction system as a black box, given only background variables and the system’s predictions. Empirically, we evaluate the framework on different prediction tasks derived from longitudinal panel surveys, demonstrating the ease and effectiveness of incorporating backward baselines into the practice of machine learning.
URL BibTeX

Empirical Inference Perceiving Systems Conference Paper One-shot Implicit Animatable Avatars with Model-based Priors Huang, Y., Yi, H., Liu, W., Wang, H., Wu, B., Wang, W., Lin, B., Zhang, D., Cai, D. In Proc. International Conference on Computer Vision (ICCV), 8940-8951, International Conference on Computer Vision, October 2023, *equal contribution (Published)
Existing neural rendering methods for creating human avatars typically either require dense input signals such as video or multi-view images, or leverage a learned prior from large-scale specific 3D human datasets such that reconstruction can be performed with sparse-view inputs. Most of these methods fail to achieve realistic reconstruction when only a single image is available. To enable the data-efficient creation of realistic animatable 3D humans, we propose ELICIT, a novel method for learning human-specific neural radiance fields from a single image. Inspired by the fact that humans can easily reconstruct the body geometry and infer the full-body clothing from a single image, we leverage two priors in ELICIT: a 3D geometry prior and a visual semantic prior. Specifically, ELICIT introduces the 3D body shape geometry prior from a skinned vertex-based template model (i.e., SMPL) and implements the visual clothing semantic prior with CLIP-based pre-trained models. Both priors are used to jointly guide the optimization for creating plausible content in the invisible areas. In order to further improve visual details, we propose a segmentation-based sampling strategy that locally refines different parts of the avatar. Comprehensive evaluations on multiple popular benchmarks, including ZJU-MoCAP, Human3.6M, and DeepFashion, show that ELICIT has outperformed current state-of-the-art avatar creation methods when only a single image is available. Code will be made public for research purposes at https://github.com/huangyangyi/ELICIT
arXiv code project DOI BibTeX

Perceiving Systems Empirical Inference Conference Paper Pairwise Similarity Learning is SimPLE Wen, Y., Liu, W., Feng, Y., Raj, B., Singh, R., Weller, A., Black, M. J., Schölkopf, B. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), International Conference on Computer Vision, October 2023 (Published)
In this paper, we focus on a general yet important learning problem, pairwise similarity learning (PSL). PSL subsumes a wide range of important applications, such as open-set face recognition, speaker verification, image retrieval and person re-identification. The goal of PSL is to learn a pairwise similarity function assigning a higher similarity score to positive pairs (i.e., a pair of samples with the same label) than to negative pairs (i.e., a pair of samples with different labels). We start by identifying a key desideratum for PSL, and then discuss how existing methods can achieve this desideratum. We then propose a surprisingly simple proxy-free method, called SimPLE, which requires neither feature/proxy normalization nor angular margin and yet is able to generalize well in open-set recognition. We apply the proposed method to three challenging PSL tasks: open-set face recognition, image retrieval and speaker verification. Comprehensive experimental results on large-scale benchmarks show that our method performs significantly better than current state-of-the-art methods.
URL BibTeX

Robust Machine Learning Conference Paper Scale Alone Does not Improve Mechanistic Interpretability in Vision Models Zimmermann, R. S., Klein, T., Brendel, W. In Advances in Neural Information Processing Systems 36 (NeurIPS 2023), 57876 - 57907, Curran Associates Inc., NeurIPS, October 2023 (Published) NeurIPS Proceedings DOI URL BibTeX

Haptic Intelligence Miscellaneous Seeking Causal, Invariant, Structures with Kernel Mean Embeddings in Haptic-Auditory Data from Tool-Surface Interaction Khojasteh, B., Shao, Y., Kuchenbecker, K. J. Workshop paper (4 pages) presented at the IROS Workshop on Causality for Robotics: Answering the Question of Why, Detroit, USA, October 2023 (Published)
Causal inference could give future learning robots strong generalization and scalability capabilities, which are crucial for safety, fault diagnosis and error prevention. One application area of interest consists of the haptic recognition of surfaces. We seek to understand cause and effect during physical surface interaction by examining surface and tool identity, their interplay, and other contact-irrelevant factors. To work toward elucidating the mechanism of surface encoding, we attempt to recognize surfaces from haptic-auditory data captured by previously unseen hemispherical steel tools that differ from the recording tool in diameter and mass. In this context, we leverage ideas from kernel methods to quantify surface similarity through descriptive differences in signal distributions. We find that the effect of the tool is significantly present in higher-order statistical moments of contact data: aligning the means of the distributions being compared somewhat improves recognition but does not fully separate tool identity from surface identity. Our findings shed light on salient aspects of haptic-auditory data from tool-surface interaction and highlight the challenges involved in generalizing artificial surface discrimination capabilities.
Manuscript URL BibTeX
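As a rough illustration of comparing signal distributions through kernel mean embeddings, the sketch below computes a biased estimate of the squared maximum mean discrepancy (MMD) between two 1-D samples, before and after aligning their means. It is not the authors' pipeline: the Gaussian kernel, the bandwidth of 1.0, the function names, and the toy values are all assumptions made for illustration.

```python
import math


def rbf(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two scalars (assumed kernel choice)."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mmd2(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared MMD between two 1-D samples."""
    def mean_kernel(a, b):
        return sum(rbf(u, v, bandwidth) for u in a for v in b) / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def center(xs):
    """Subtract the sample mean so that both samples share a zero mean."""
    m = sum(xs) / len(xs)
    return [x - m for x in xs]


# Hypothetical toy "signals" from two tools: different mean and different spread.
tool_a = [0.0, 0.5, 1.0, 1.5, 2.0]
tool_b = [3.0, 4.0, 5.0, 6.0, 7.0]

raw = mmd2(tool_a, tool_b)
aligned = mmd2(center(tool_a), center(tool_b))
print(f"MMD^2 raw: {raw:.3f}, after mean alignment: {aligned:.3f}")
```

In this toy case the discrepancy shrinks after mean alignment but stays above zero, because the two samples still differ in spread, a higher-order moment.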

Perceiving Systems Conference Paper AG3D: Learning to Generate 3D Avatars from 2D Image Collections Dong, Z., Chen, X., Yang, J., Black, M. J., Hilliges, O., Geiger, A. In Proc. International Conference on Computer Vision (ICCV), 14916-14927, International Conference on Computer Vision (ICCV), October 2023 (Published)
While progress in 2D generative models of human appearance has been rapid, many applications require 3D avatars that can be animated and rendered. Unfortunately, most existing methods for learning generative models of 3D humans with diverse shape and appearance require 3D training data, which is limited and expensive to acquire. The key to progress is hence to learn generative models of 3D avatars from abundant unstructured 2D image collections. However, learning realistic and complete 3D appearance and geometry in this under-constrained setting remains challenging, especially in the presence of loose clothing such as dresses. In this paper, we propose a new adversarial generative model of realistic 3D people from 2D images. Our method captures shape and deformation of the body and loose clothing by adopting a holistic 3D generator and integrating an efficient and flexible articulation module. To improve realism, we train our model using multiple discriminators while also integrating geometric cues in the form of predicted 2D normal maps. We experimentally find that our method outperforms previous 3D- and articulation-aware methods in terms of geometry and appearance. We validate the effectiveness of our model and the importance of each component via systematic ablation studies.
project pdf code video DOI URL BibTeX

Empirical Inference Article CROCODILE - Incorporating medium-resolution spectroscopy of close-in directly imaged exoplanets into atmospheric retrievals via cross-correlation Hayoz, J., Cugno, G., Quanz, S. P., Patapis, P., Alei, E., Bonse, M. J., Dannert, F. A., Garvin, E. O., Gebhard, T. D., Konrad, B. S., Sartori, L. F. Astronomy & Astrophysics, 678, October 2023 (Published) DOI BibTeX

Perceiving Systems Conference Paper D-IF: Uncertainty-aware Human Digitization via Implicit Distribution Field Yang, X., Luo, Y., Xiu, Y., Wang, W., Xu, H., Fan, Z. In Proc. International Conference on Computer Vision (ICCV), 9122-9132, International Conference on Computer Vision, October 2023 (Published)
Realistic virtual humans play a crucial role in numerous industries, such as the metaverse, intelligent healthcare, and self-driving simulation. But creating them on a large scale with high levels of realism remains a challenge. The use of deep implicit functions has sparked a new era of image-based 3D clothed human reconstruction, enabling pixel-aligned shape recovery with fine details. Subsequently, the vast majority of works locate the surface by regressing the deterministic implicit value for each point. However, should all points be treated equally regardless of their proximity to the surface? In this paper, we propose replacing the implicit value with an adaptive uncertainty distribution, to differentiate between points based on their distance to the surface. This simple "value to distribution" transition yields significant improvements on nearly all the baselines. Furthermore, qualitative results demonstrate that models trained using our uncertainty distribution loss can capture more intricate wrinkles and more realistic limbs.
Code Homepage URL BibTeX

Perceiving Systems Software Workshop Conference Paper DECO: Dense Estimation of 3D Human-Scene Contact in the Wild Tripathi, S., Chatterjee, A., Passy, J., Yi, H., Tzionas, D., Black, M. J. In Proc. International Conference on Computer Vision (ICCV), 8001-8013, International Conference on Computer Vision, October 2023 (Published)
Understanding how humans use physical contact to interact with the world is key to enabling human-centric artificial intelligence. While inferring 3D contact is crucial for modeling realistic and physically-plausible human-object interactions, existing methods either focus on 2D, consider body joints rather than the surface, use coarse 3D body regions, or do not generalize to in-the-wild images. In contrast, we focus on inferring dense, 3D contact between the full body surface and objects in arbitrary images. To achieve this, we first collect DAMON, a new dataset containing dense vertex-level contact annotations paired with RGB images containing complex human-object and human-scene contact. Second, we train DECO, a novel 3D contact detector that uses both body-part-driven and scene-context-driven attention to estimate vertex-level contact on the SMPL body. DECO builds on the insight that human observers recognize contact by reasoning about the contacting body parts, their proximity to scene objects, and the surrounding scene context. We perform extensive evaluations of our detector on DAMON as well as on the RICH and BEHAVE datasets. We significantly outperform existing SOTA methods across all benchmarks. We also show qualitatively that DECO generalizes well to diverse and challenging real-world human interactions in natural images. The code, data, and models are available at https://deco.is.tue.mpg.de/login.php.
Project Video Poster Code Data DOI URL BibTeX

Perceiving Systems Conference Paper SINC: Spatial Composition of 3D Human Motions for Simultaneous Action Generation Athanasiou, N., Petrovich, M., Black, M. J., Varol, G. In Proc. International Conference on Computer Vision (ICCV), 9984-9995, International Conference on Computer Vision, October 2023 (Published)
Our goal is to synthesize 3D human motions given textual inputs describing multiple simultaneous actions, for example ‘waving hand’ while ‘walking’ at the same time. We refer to generating such simultaneous movements as performing ‘spatial compositions’. In contrast to ‘temporal compositions’ that seek to transition from one action to another in a sequence, spatial compositing requires understanding which body parts are involved with which action. Motivated by the observation that the correspondence between actions and body parts is encoded in powerful language models, we extract this knowledge by prompting GPT-3 with text such as “what parts of the body are moving when someone is doing the action <action name>?”. Given this action-part mapping, we automatically create new training data by artificially combining body parts from multiple text-motion pairs together. We extend previous work on text-to-motion synthesis to train on spatial compositions, and introduce SINC (“SImultaneous actioN Compositions for 3D human motions”). We experimentally validate that our additional GPT-guided data helps to better learn compositionality compared to training only on existing real data of simultaneous actions, which is limited in quantity.
website code paper-arxiv video BibTeX

Perceiving Systems Conference Paper TMR: Text-to-Motion Retrieval Using Contrastive 3D Human Motion Synthesis Petrovich, M., Black, M. J., Varol, G. In Proc. International Conference on Computer Vision (ICCV), 9488-9497, International Conference on Computer Vision, October 2023 (Published)
In this paper, we present TMR, a simple yet effective approach for text to 3D human motion retrieval. While previous work has only treated retrieval as a proxy evaluation metric, we tackle it as a standalone task. Our method extends the state-of-the-art text-to-motion synthesis model TEMOS, and incorporates a contrastive loss to better structure the cross-modal latent space. We show that maintaining the motion generation loss, along with the contrastive training, is crucial to obtain good performance. We introduce a benchmark for evaluation and provide an in-depth analysis by reporting results on several protocols. Our extensive experiments on the KIT-ML and HumanML3D datasets show that TMR outperforms the prior work by a significant margin, for example reducing the median rank from 54 to 19. Finally, we showcase the potential of our approach on moment retrieval. Our code and models are publicly available.
website code paper-arxiv video URL BibTeX
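For readers unfamiliar with the median-rank metric quoted in the abstract above, here is a minimal sketch of how it can be computed from a query-by-candidate similarity matrix. The function name, the convention that query i matches candidate i, and the toy scores are assumptions for illustration; this is not the TMR evaluation code.

```python
from statistics import median


def retrieval_ranks(similarity):
    """Rank (1 = best) of the correct candidate for each query.

    similarity[i][j] is the score of query i against candidate j;
    by assumption, the correct match for query i is candidate i.
    """
    ranks = []
    for i, row in enumerate(similarity):
        target = row[i]
        # Count the candidates that score strictly higher than the correct one.
        better = sum(1 for j, score in enumerate(row) if j != i and score > target)
        ranks.append(1 + better)
    return ranks


# Hypothetical text-to-motion similarity scores (3 queries, 3 candidates).
sim = [
    [0.9, 0.1, 0.3],
    [0.8, 0.5, 0.2],
    [0.1, 0.7, 0.6],
]
ranks = retrieval_ranks(sim)
print(ranks, median(ranks))
```

A lower median rank means the correct motion tends to appear nearer the top of the retrieved list.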

Autonomous Learning Conference Paper Regularity as Intrinsic Reward for Free Play Sancaktar, C., Piater, J., Martius, G. In Advances in Neural Information Processing Systems 36 (NeurIPS 2023), September 2023 (Published)
We propose regularity as a novel reward signal for intrinsically-motivated reinforcement learning. Taking inspiration from child development, we postulate that striving for structure and order helps guide exploration towards a subspace of tasks that are not favored by naive uncertainty-based intrinsic rewards. Our generalized formulation of Regularity as Intrinsic Reward (RaIR) allows us to operationalize it within model-based reinforcement learning. In a synthetic environment, we showcase the plethora of structured patterns that can emerge from pursuing this regularity objective. We also demonstrate the strength of our method in a multi-object robotic manipulation environment. We incorporate RaIR into free play and use it to complement the model’s epistemic uncertainty as an intrinsic reward. Doing so, we witness the autonomous construction of towers and other regular structures during free play, which leads to a substantial improvement in zero-shot downstream task performance on assembly tasks.
URL BibTeX

Organizational Leadership and Diversity Article Hooked on artificial agents: a systems thinking perspective Ðula, I., Berberena, T., Keplinger, K., Wirzberger, M. Frontiers in Behavioral Economics, 2:1223281, September 2023 (Published)
Following recent technological developments in the artificial intelligence space, artificial agents are increasingly taking over organizational tasks typically reserved for humans. Studies have shown that humans respond differently to this, with some being appreciative of their advice (algorithm appreciation), others being averse toward them (algorithm aversion), and others still fully relinquishing control to artificial agents without adequate oversight (automation bias). Using systems thinking, we analyze the existing literature on these phenomena and develop a conceptual model that provides an underlying structural explanation for their emergence. In doing so, we create a powerful visual tool that can be used to ground discussions about the impact artificial agents have on organizations and humans within them.
Hooked on artificial agents DOI URL BibTeX

Empirical Inference Article A historical perspective of biomedical explainable AI research Malinverno, L., Barros, V., Ghisoni, F., Visonà, G., Kern, R., Nickel, P. J., Ventura, B. E., Šimić, I., Stryeck, S., Manni, F., Ferri, C., Jean-Quartier, C., Genga, L., Schweikert, G., Lovrić, M., Rosen-Zvi, M. Patterns, 4(9), September 2023 (Published) DOI BibTeX

Empirical Inference Conference Paper Certified private data release for sparse Lipschitz functions Donhauser, K., Lokna, J., Sanyal, A., Boedihardjo, M., Hönig, R., Yang, F. TPDP 2023 - Theory and Practice of Differential Privacy, September 2023 (Published) arXiv URL BibTeX

Empirical Inference Master Thesis Efficient Sampling from Differentiable Matrix Elements Kofler, A. Technical University of Munich, Germany, September 2023 (Published) BibTeX

Empirical Inference Conference Paper How to make semi-private learning more effective Pinto, F., Hu, Y., Yang, F., Sanyal, A. TPDP 2023 - Theory and Practice of Differential Privacy, September 2023 (Published) arXiv URL BibTeX

Social Foundations of Computation Conference Paper Incentivizing Honesty among Competitors in Collaborative Learning and Optimization Dorner, F. E., Konstantinov, N., Pashaliev, G., Vechev, M. In Advances in Neural Information Processing Systems 36 (NeurIPS 2023), The Thirty-Seventh Annual Conference on Neural Information Processing Systems (NeurIPS), September 2023 (Published)
Collaborative learning techniques have the potential to enable training machine learning models that are superior to models trained on a single entity’s data. However, in many cases, potential participants in such collaborative schemes are competitors on a downstream task, such as firms that each aim to attract customers by providing the best recommendations. This can incentivize dishonest updates that damage other participants' models, potentially undermining the benefits of collaboration. In this work, we formulate a game that models such interactions and study two learning tasks within this framework: single-round mean estimation and multi-round SGD on strongly-convex objectives. For a natural class of player actions, we show that rational clients are incentivized to strongly manipulate their updates, preventing learning. We then propose mechanisms that incentivize honest communication and ensure learning quality comparable to full cooperation. Lastly, we empirically demonstrate the effectiveness of our incentive scheme on a standard non-convex federated learning benchmark. Our work shows that explicitly modeling the incentives and actions of dishonest clients, rather than assuming them malicious, can enable strong robustness guarantees for collaborative learning.
arXiv URL BibTeX

Haptic Intelligence Miscellaneous NearContact: Accurate Human Detection using Tomographic Proximity and Contact Sensing with Cross-Modal Attention Garrofé, G., Schoeffmann, C., Zangl, H., Kuchenbecker, K. J., Lee, H. Extended abstract (4 pages) presented at the International Workshop on Human-Friendly Robotics (HFR), Munich, Germany, September 2023 (Published) BibTeX

Empirical Inference Article Neural Causal Structure Discovery from Interventions Ke*, N. R., Bilaniuk*, O., Goyal, A., Bauer, S., Larochelle, H., Schölkopf, B., Mozer, M. C., Pal, C., Bengio, Y. Transactions on Machine Learning Research, September 2023, *equal contribution (Published) URL BibTeX

Empirical Inference Article Simulation-based inference for efficient identification of generative models in computational connectomics Boelts, J., Harth, P., Gao, R., Udvary, D., Yáñez, F., Baum, D., Hege, H., Oberlaender, M., Macke, J. H. PLOS Computational Biology, 19(9):1-28, September 2023 (Published) DOI BibTeX

Perceiving Systems Conference Paper Synthetic Data-Based Detection of Zebras in Drone Imagery Bonetto, E., Ahmad, A. 2023 European Conference on Mobile Robots (ECMR), 1-8, IEEE, ECMR, September 2023 (Published)
Nowadays, there is a wide availability of datasets that enable the training of common object detectors or human detectors. These come in the form of labelled real-world images and require either a significant amount of human effort, with a high probability of errors such as missing labels, or very constrained scenarios, e.g. VICON systems. On the other hand, data for uncommon scenarios such as aerial views, for animals such as wild zebras, or for difficult-to-obtain information such as human shapes is hardly available. To overcome this, synthetic data generation with realistic rendering technologies has recently gained traction and advanced research areas such as target tracking and human pose estimation. However, subjects such as wild animals are still usually not well represented in such datasets. In this work, we first show that a pre-trained YOLO detector cannot identify zebras in real images recorded from aerial viewpoints. To solve this, we present an approach for training an animal detector using only synthetic data. We start by generating a novel synthetic zebra dataset using GRADE, a state-of-the-art framework for data generation. The dataset includes RGB, depth, skeletal joint locations, pose, shape and instance segmentations for each subject. We use this to train a YOLO detector from scratch. Through extensive evaluations of our model with real-world data from i) limited datasets available on the internet and ii) a new one collected and manually labelled by us, we show that we can detect zebras by using only synthetic data during training. The code, results, trained models, and both the generated and training data are provided as open-source at https://eliabntt.github.io/grade-rr.
Generation code pdf DOI URL BibTeX

Organizational Leadership and Diversity Conference Paper Constructing and deconstructing bias: modeling privilege and mentorship in agent-based simulations Smith, A., Heuschkel, S., Keplinger, K., Wu, C. Conference on Cognitive Computational Neuroscience, 10.32470/CCN.2023.1257-0, Oxford, UK, August 2023 (Published)
Bias exists in how we pick leaders, who we perceive as being influential, and who we interact with, not only in society, but in organizational contexts. Drawing from leadership emergence and social influence theories, we investigate potential interventions that support diverse leaders. Using agent-based simulations, we model a collective search process on a fitness landscape. Agents combine individual and social learning, and are represented as a feature vector blending relevant (e.g., individual learning characteristics) and irrelevant (e.g., race or gender) features. Agents use rational principles of learning to estimate feature weights on the basis of performance predictions, which are used to dynamically define social influence in their network. We show how biases arise based on historic privilege, but can be drastically reduced through the use of an intervention (e.g. mentorship). This work provides important insights into the cognitive mechanisms underlying bias construction and deconstruction, while pointing towards real-world interventions to be tested in future empirical work.
CCN2023 DOI URL BibTeX

Robotic Materials Patent High Strain Peano Hydraulically Amplified Self-Healing Electrostatic (HASEL) Transducers Keplinger, C. M., Wang, X., Mitchell, S. K. (US Patent App. 18/138,621), August 2023
High strain hydraulically amplified self-healing electrostatic transducers having increased maximum theoretical and practical strains are disclosed. In particular, the actuators include electrode configurations having a zipping front created by the attraction of the electrodes that is configured orthogonally to a strain axis along which the actuators. This configuration produces increased strains. In turn, various form factors for the actuator configuration are presented including an artificial circular muscle and a strain amplifying pulley system. Other actuator configurations are contemplated that include independent and opposed electrode pairs to create cyclic activation, hybrid electrode configurations, and use of strain limiting layers for controlled deflection of the actuator.
URL BibTeX

Haptic Intelligence Software Workshop Autonomous Motion Conference Paper Augmenting Human Policies using Riemannian Metrics for Human-Robot Shared Control Oh, Y., Passy, J., Mainprice, J. In Proceedings of the IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), 1612-1618, Busan, South Korea, August 2023 (Published)
We present a shared control framework for teleoperation that combines the human and autonomous robot agents operating in different dimension spaces. The shared control problem is an optimization problem to maximize the human's internal action-value function while guaranteeing that the shared control policy is close to the autonomous robot policy. This results in a state update rule that augments the human controls using the Riemannian metric that emerges from computing the curvature of the robot's value function to account for any cost terms or constraints that the human operator may neglect when operating a redundant manipulator. In our experiments, we apply Linear Quadratic Regulators to locally approximate the robot policy using a single optimized robot trajectory, thereby preventing the need for an optimization step at each time step to determine the optimal policy. We show preliminary results of reach-and-grasp teleoperation tasks with a simulated human policy and a pilot user study using the VR headset and controllers. However, the mixed user preference ratings and quantitative results show that more investigation is required to prove the efficacy of the proposed paradigm.
DOI BibTeX

Learning and Dynamical Systems Empirical Inference Conference Paper Causal Effect Estimation from Observational and Interventional Data Through Matrix Weighted Linear Estimators Kladny, K., von Kügelgen, J., Schölkopf, B., Muehlebach, M. Conference on Uncertainty in Artificial Intelligence, 216:1087-1097, Proceedings of Machine Learning Research, (Editors: Evans, Robin J. and Shpitser, Ilya), PMLR, August 2023 (Published) URL BibTeX

Empirical Inference Article Chasing rainbows and ocean glints: Inner working angle constraints for the Habitable Worlds Observatory Vaughan, S. R., Gebhard, T. D., Bott, K., Casewell, S. L., Cowan, N. B., Doelman, D. S., Kenworthy, M., Mazoyer, J., Millar-Blanchaer, M. A., Trees, V. J. H., Stam, D. M., Absil, O., Altinier, L., Baudoz, P., Belikov, R., Bidot, A., Birkby, J. L., Bonse, M. J., Brandl, B., Carlotti, A., et al. Monthly Notices of the Royal Astronomical Society, 524(4):5477-5485, August 2023 (Published) DOI BibTeX

Haptic Intelligence Perceiving Systems Article Learning to Estimate Palpation Forces in Robotic Surgery From Visual-Inertial Data Lee, Y., Mat Husin, H., Forte, M., Lee, S., Kuchenbecker, K. J. IEEE Transactions on Medical Robotics and Bionics, 5(3):496-506, August 2023, Young-Eun Lee and Haliza Mat Husin contributed equally to this work (Published)
Surgeons cannot directly touch the patient's tissue in robot-assisted minimally invasive procedures. Instead, they must palpate using instruments inserted into the body through trocars. This way of operating largely prevents surgeons from using haptic cues to localize visually undetectable structures such as tumors and blood vessels, motivating research on direct and indirect force sensing. We propose an indirect force-sensing method that combines monocular images of the operating field with measurements from IMUs attached externally to the instrument shafts. Our method is thus suitable for various robotic surgery systems as well as laparoscopic surgery. We collected a new dataset using a da Vinci Si robot, a force sensor, and four different phantom tissue samples. The dataset includes 230 one-minute-long recordings of repeated bimanual palpation tasks performed by four lay operators. We evaluated several network architectures and investigated the role of the network inputs. Using the DenseNet vision model and including inertial data best predicted palpation forces (lowest average root-mean-square error and highest average coefficient of determination). Ablation studies revealed that video frames carry significantly more information than inertial signals. Finally, we demonstrated the model's ability to generalize to unseen tissue and predict shear contact forces.
DOI BibTeX

Haptic Intelligence Autonomous Learning Empirical Inference Article Minsight: A Fingertip-Sized Vision-Based Tactile Sensor for Robotic Manipulation Andrussow, I., Sun, H., Kuchenbecker, K. J., Martius, G. Advanced Intelligent Systems, 5(8):2300042, August 2023, Inside back cover, DOI: 10.1002/aisy.202370035 (Published)
Intelligent interaction with the physical world requires perceptual abilities beyond vision and hearing; vibrant tactile sensing is essential for autonomous robots to dexterously manipulate unfamiliar objects or safely contact humans. Therefore, robotic manipulators need high-resolution touch sensors that are compact, robust, inexpensive, and efficient. The soft vision-based haptic sensor presented herein is a miniaturized and optimized version of the previously published sensor Insight. Minsight has the size and shape of a human fingertip and uses machine learning methods to output high-resolution maps of 3D contact force vectors at 60 Hz. Experiments confirm its excellent sensing performance, with a mean absolute force error of 0.07 N and contact location error of 0.6 mm across its surface area. Minsight's utility is shown in two robotic tasks on a 3-DoF manipulator. First, closed-loop force control enables the robot to track the movements of a human finger based only on tactile data. Second, the informative value of the sensor output is shown by detecting whether a hard lump is embedded within a soft elastomer with an accuracy of 98%. These findings indicate that Minsight can give robots the detailed fingertip touch sensing needed for dexterous manipulation and physical human–robot interaction.
DOI BibTeX

Physical Intelligence Article Reconfigurable Innervation of Modular Soft Machines via Soft, Sticky, and Instant Electronic Adhesive Interlocking Yoon, J., Byun, J., Park, M., Kim, H., Kim, W., Yoon, J., Cho, K., Hong, Y. Advanced Intelligent Systems, 5(8), August 2023 (Published) DOI URL BibTeX

Empirical Inference Conference Paper Socially Responsible Machine Learning: A Causal Perspective Moraffah, R., Karimi, A., Raglin, A., Liu, H. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 5819-5820, Association for Computing Machinery, August 2023 (Published) DOI BibTeX

Perceiving Systems Conference Paper Synthesizing Physical Character-scene Interactions Hassan, M., Guo, Y., Wang, T., Black, M. J., Fidler, S., Bin Peng, X. In ACM SIGGRAPH 2023 Conference Proceedings, SIGGRAPH, August 2023 (Published)
Movement is how people interact with and affect their environment. For realistic virtual character animation, it is necessary to realistically synthesize such interactions between virtual characters and their surroundings. Despite recent progress in character animation using machine learning, most systems focus on controlling an agent's movements in fairly simple and homogeneous environments, with limited interactions with other objects. Furthermore, many previous approaches that synthesize human-scene interaction require significant manual labeling of the training data. In contrast, we present a system that uses adversarial imitation learning and reinforcement learning to train physically-simulated characters that perform scene interaction tasks in a natural and life-like manner. Our method is able to learn natural scene interaction behaviors from large unstructured motion datasets, without manual annotation of the motion data. These scene interactions are learned using an adversarial discriminator that evaluates the realism of a motion within the context of a scene. The key novelty involves conditioning both the discriminator and the policy networks on scene context. We demonstrate the effectiveness of our approach through three challenging scene interaction tasks: carrying, sitting, and lying down, which require coordination of a character's movements in relation to objects in the environment. Our policies learn to seamlessly transition between different behaviors like idling, walking, and sitting. Using an efficient approach to randomize the training objects and their placements during training enables our method to generalize beyond the objects and scenarios in the training dataset, producing natural character-scene interactions despite wide variation in object shape and placement. The approach takes physics-based character motion generation a step closer to broad applicability.
video arXiv ACM paper pdf BibTeX

Haptic Intelligence Miscellaneous The Role of Kinematics Estimation Accuracy in Learning with Wearable Haptics Rokhmanova, N., Pearl, O., Kuchenbecker, K. J., Halilaj, E. Abstract (1 page) presented at the American Society of Biomechanics Annual Meeting (ASB), Knoxville, USA, August 2023 (Published) BibTeX

Empirical Inference Conference Paper USIM-DAL: Uncertainty-aware Statistical Image Modeling-based Dense Active Learning for Super-resolution Rangnekar, V., Upadhyay, U., Akata, Z., Banerjee, B. Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI), 216:1707-1717, Proceedings of Machine Learning Research, (Editors: Evans, Robin J. and Shpitser, Ilya), PMLR, August 2023 (Published) URL BibTeX

Haptic Intelligence Conference Paper Wear Your Heart on Your Sleeve: Users Prefer Robots with Emotional Reactions to Touch and Ambient Moods Burns, R. B., Ojo, F., Kuchenbecker, K. J. In Proceedings of the IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), 1914-1921, Busan, South Korea, August 2023 (Published)
Robots are increasingly being developed as assistants for household, education, therapy, and care settings. Such robots can use adaptive emotional behavior to communicate warmly and effectively with their users and to encourage interest in extended interactions. However, autonomous physical robots often lack a dynamic internal emotional state, instead displaying brief, fixed emotion routines to promote specific user interactions. Furthermore, despite the importance of social touch in human communication, most commercially available robots have limited touch sensing, if any at all. We propose that users' perceptions of a social robotic system will improve when the robot provides emotional responses on both shorter and longer time scales (reactions and moods), based on touch inputs from the user. We evaluated this proposal through an online study in which 51 diverse participants watched nine randomly ordered videos (a three-by-three full-factorial design) of the koala-like robot HERA being touched by a human. Users provided the highest ratings in terms of agency, ambient activity, enjoyability, and touch perceptivity for scenarios in which HERA showed emotional reactions and either neutral or emotional moods in response to social touch gestures. Furthermore, we summarize key qualitative findings about users' preferences for reaction timing, the ability of robot mood to show persisting memory, and perception of neutral behaviors as a curious or self-aware robot.
DOI BibTeX

Perceiving Systems Article BARC: Breed-Augmented Regression Using Classification for 3D Dog Reconstruction from Images Rueegg, N., Zuffi, S., Schindler, K., Black, M. J. Int. J. of Comp. Vis. (IJCV), 131(8):1964–1979, August 2023 (Published)
The goal of this work is to reconstruct 3D dogs from monocular images. We take a model-based approach, where we estimate the shape and pose parameters of a 3D articulated shape model for dogs. We consider dogs as they constitute a challenging problem, given they are highly articulated and come in a variety of shapes and appearances. Recent work has considered a similar task using the multi-animal SMAL model, with additional limb scale parameters, obtaining reconstructions that are limited in terms of realism. Like previous work, we observe that the original SMAL model is not expressive enough to represent dogs of many different breeds. Moreover, we make the hypothesis that the supervision signal used to train the network, that is 2D keypoints and silhouettes, is not sufficient to learn a regressor that can distinguish between the large variety of dog breeds. We therefore go beyond previous work in two important ways. First, we modify the SMAL shape space to be more appropriate for representing dog shape. Second, we formulate novel losses that exploit information about dog breeds. In particular, we exploit the fact that dogs of the same breed have similar body shapes. We formulate a novel breed similarity loss, consisting of two parts: One term is a triplet loss, that encourages the shape of dogs from the same breed to be more similar than dogs of different breeds. The second one is a breed classification loss. With our approach we obtain 3D dogs that, compared to previous work, are quantitatively better in terms of 2D reconstruction, and significantly better according to subjective and quantitative 3D evaluations. Our work shows that a-priori side information about similarity of shape and appearance, as provided by breed labels, can help to compensate for the lack of 3D training data. This concept may be applicable to other animal species or groups of species. We call our method BARC (Breed-Augmented Regression using Classification). 
Our code is publicly available for research purposes at https://barc.is.tue.mpg.de/.
On-line DOI URL BibTeX

Organizational Leadership and Diversity Conference Paper Unlearning the bias: An agent-based simulation for increasing diverse representation through leadership emergence Smith, A., Heuschkel, S., Keplinger, K., Wu, C. In Proceedings of the 45th Annual Conference of the Cognitive Science Society, https://escholarship.org/uc/item/5mq9v0rm, Sydney, Australia, July 2023 (Published)
Despite increased interest in creating more diverse and inclusive organizational environments, bias exists in how we choose leaders, who we interact with, and who we consider influential. Drawing from leadership emergence theory, we investigate potential interventions that support diverse leaders. Using agent-based simulations, we model a collective search process on a fitness landscape. Agents combine individual and social learning, and are represented as a feature vector blending relevant (e.g., individual learning characteristics) and irrelevant (e.g., race or gender) features. Agents use rational principles of learning to estimate feature weights on the basis of performance predictions, which are used to dynamically define social influence in their network. We show how biases arise based on historic privilege, but can be drastically reduced through the use of an intervention (e.g., mentorship). This framework allows us to test interventions best suited for unlearning bias in favor of performance-relevant traits.
DOI URL BibTeX

Autonomous Learning Article Offline Diversity Maximization under Imitation Constraints Marin, V., Jin, C., Martius, G., Kolev, P. Reinforcement Learning Journal, 3:1377-1409, July 2023 (Published)
There has been significant recent progress in the area of unsupervised skill discovery, utilizing various information-theoretic objectives as measures of diversity. Despite these advances, challenges remain: current methods require significant online interaction, fail to leverage vast amounts of available task-agnostic data and typically lack a quantitative measure of skill utility. We address these challenges by proposing a principled offline algorithm for unsupervised skill discovery that, in addition to maximizing diversity, ensures that each learned skill imitates state-only expert demonstrations to a certain degree. Our main analytical contribution is to connect Fenchel duality, reinforcement learning, and unsupervised skill discovery to maximize a mutual information objective subject to KL-divergence state occupancy constraints. Furthermore, we demonstrate the effectiveness of our method on the standard offline benchmark D4RL and on a custom offline dataset collected from a 12-DoF quadruped robot for which the policies trained in simulation transfer well to the real robotic system.
Website DOI URL BibTeX

Empirical Inference Conference Paper A Causal Framework to Quantify the Robustness of Mathematical Reasoning with Language Models Stolfo, A., Jin, Z., Shridhar, K., Schölkopf, B., Sachan, M. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), Volume 1: Long Papers:545-561, (Editors: Rogers, A. and Boyd-Graber, J. L. and Okazaki, N.), Association for Computational Linguistics, July 2023 (Published) DOI BibTeX

Robotic Materials Article A Multifunctional Soft Robotic Shape Display with High-speed Actuation, Sensing, and Control Johnson, B. K., Naris, M., Sundaram, V., Volchko, A., Ly, K., Mitchell, S. K., Acome, E., Kellaris, N., Keplinger, C., Correll, N., Humbert, J. S., Rentschler, M. E. Nature Communications, 14(1), July 2023 (Published)
Shape displays that actively manipulate surface geometry are an expanding robotics domain with applications to haptics, manufacturing, aerodynamics, and more. However, existing displays often lack high-fidelity shape morphing, high-speed deformation, and embedded state sensing, limiting their potential uses. Here, we demonstrate a multifunctional soft shape display driven by a 10 × 10 array of scalable cellular units which combine high-speed electrohydraulic soft actuation, magnetic-based sensing, and control circuitry. We report high-performance reversible shape morphing up to 50 Hz, sensing of surface deformations with 0.1 mm sensitivity and external forces with 50 mN sensitivity in each cell, which we demonstrate across a multitude of applications including user interaction, image display, sensing of object mass, and dynamic manipulation of solids and liquids. This work showcases the rich multifunctionality and high-performance capabilities that arise from tightly integrating large numbers of electrohydraulic actuators, soft sensors, and controllers at a previously undemonstrated scale in soft robotics.
YouTube video DOI URL BibTeX

Empirical Inference Article A network approach to atomic spectra Wellnitz, D., Kekić, A., Heiss, J., Gertz, M., Weidemüller, M., Spitz, A. Journal of Physics: Complexity, 4(3), July 2023 (Published) DOI BibTeX

Social Foundations of Computation Conference Paper AI and the EU Digital Markets Act: Addressing the Risks of Bigness in Generative AI Yasar, A. G., Chong, A., Dong, E., Gilbert, T. K., Hladikova, S., Maio, R., Mougan, C., Shen, X., Singh, S., Stoica, A., Thais, S., Zilka, M. Proceedings of the 40th International Conference on Machine Learning (ICML 2023), PMLR, July 2023 (Accepted)
As AI technology advances rapidly, concerns over the risks of bigness in digital markets are also growing. The EU's Digital Markets Act (DMA) aims to address these risks. Still, the current framework may not adequately cover generative AI systems that could become gateways for AI-based services. This paper argues for integrating certain AI software as core platform services and classifying certain developers as gatekeepers under the DMA. We also propose an assessment of gatekeeper obligations to ensure they cover generative AI services. As the EU considers generative AI-specific rules and possible DMA amendments, this paper provides insights towards diversity and openness in generative AI services.
arXiv URL BibTeX

Empirical Inference Conference Paper Adversarial robustness of amortized Bayesian inference Glöckler, M., Deistler, M., Macke, J. H. Proceedings of the 40th International Conference on Machine Learning (ICML), 202:11493-11524, Proceedings of Machine Learning Research, (Editors: A. Krause, E. Brunskill, K. Cho, B. Engelhardt, S. Sabato and J. Scarlett), PMLR, July 2023 (Published) URL BibTeX