Publications

Autonomous Learning Conference Paper Modelling Microbial Communities with Graph Neural Networks Ruaud, A., Sancaktar, C., Bagatella, M., Ratzke, C., Martius, G. In Proceedings of the 41st International Conference on Machine Learning (ICML), 235:42742-42765, Proceedings of Machine Learning Research, (Editors: Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix), PMLR, July 2024 (Published) URL BibTeX

Haptic Intelligence Intelligent Control Systems Article Multimodal Multi-User Surface Recognition with the Kernel Two-Sample Test Khojasteh, B., Solowjow, F., Trimpe, S., Kuchenbecker, K. J. IEEE Transactions on Automation Science and Engineering, 21(3):4432-4447, July 2024 (Published)
Machine learning and deep learning have been used extensively to classify physical surfaces through images and time-series contact data. However, these methods rely on human expertise and entail the time-consuming processes of data and parameter tuning. To overcome these challenges, we propose an easily implemented framework that can directly handle heterogeneous data sources for classification tasks. Our data-versus-data approach automatically quantifies distinctive differences in distributions in a high-dimensional space via kernel two-sample testing between two sets extracted from multimodal data (e.g., images, sounds, haptic signals). We demonstrate the effectiveness of our technique by benchmarking against expertly engineered classifiers for visual-audio-haptic surface recognition due to the industrial relevance, difficulty, and competitive baselines of this application; ablation studies confirm the utility of key components of our pipeline. As shown in our open-source code, we achieve 97.2% accuracy on a standard multi-user dataset with 108 surface classes, outperforming the state-of-the-art machine-learning algorithm by 6% on a more difficult version of the task. The fact that our classifier obtains this performance with minimal data processing in the standard algorithm setting reinforces the powerful nature of kernel methods for learning to recognize complex patterns. Note to Practitioners: We demonstrate how to apply the kernel two-sample test to a surface-recognition task, discuss opportunities for improvement, and explain how to use this framework for other classification problems with similar properties. Automating surface recognition could benefit both surface inspection and robot manipulation. Our algorithm quantifies class similarity and therefore outputs an ordered list of similar surfaces. This technique is well suited for quality assurance and documentation of newly received materials or newly manufactured parts.
More generally, our automated classification pipeline can handle heterogeneous data sources including images and high-frequency time-series measurements of vibrations, forces and other physical signals. As our approach circumvents the time-consuming process of feature engineering, both experts and non-experts can use it to achieve high-accuracy classification. It is particularly appealing for new problems without existing models and heuristics. In addition to strong theoretical properties, the algorithm is straightforward to use in practice since it requires only kernel evaluations. Its transparent architecture can provide fast insights into the given use case under different sensing combinations without costly optimization. Practitioners can also use our procedure to obtain the minimum data-acquisition time for independent time-series data from new sensor recordings.
DOI BibTeX
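As a rough illustration of the data-versus-data idea in the abstract above, the following minimal sketch compares a query sample set against labeled reference sets using a biased estimate of the squared maximum mean discrepancy (MMD), the statistic commonly used in kernel two-sample testing, and returns the labels ordered by similarity. It is not the authors' pipeline: the Gaussian kernel, the fixed `bandwidth`, and the function names `rbf_kernel`, `mmd2_biased`, and `rank_classes` are assumptions made for this example, and the published method's feature extraction and multimodal handling are not reproduced here.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2_biased(x, y, bandwidth=1.0):
    """Biased estimate of the squared MMD between sample sets x and y."""
    kxx = rbf_kernel(x, x, bandwidth)
    kyy = rbf_kernel(y, y, bandwidth)
    kxy = rbf_kernel(x, y, bandwidth)
    return kxx.mean() + kyy.mean() - 2.0 * kxy.mean()


def rank_classes(query, references, bandwidth=1.0):
    """Return class labels ordered from most to least similar to the query set."""
    scores = {
        label: mmd2_biased(query, ref, bandwidth)
        for label, ref in references.items()
    }
    return sorted(scores, key=scores.get)


# Hypothetical toy data: two "surfaces" with different feature distributions.
rng = np.random.default_rng(0)
references = {
    "surface_a": rng.normal(0.0, 1.0, size=(50, 4)),
    "surface_b": rng.normal(3.0, 1.0, size=(50, 4)),
}
query = rng.normal(0.0, 1.0, size=(50, 4))
ranking = rank_classes(query, references)
print(ranking)
```

In this toy setup the query is drawn from the same distribution as `surface_a`, so that label is ranked first. In practice the kernel and its bandwidth would need to be chosen for the data at hand.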

Empirical Inference Conference Paper On the Growth of Mistakes in Differentially Private Online Learning: A Lower Bound Perspective Dmitriev, D., Szabó, K., Sanyal, A. Proceedings of the 37th Annual Conference on Learning Theory (COLT), 247:1379-1398, Proceedings of Machine Learning Research, (Editors: Agrawal, Shipra and Roth, Aaron), PMLR, July 2024, (talk) (Published) URL BibTeX

Empirical Inference Robust Machine Learning Conference Paper Position: Understanding LLMs Requires More Than Statistical Generalization Reizinger, P., Ujváry, S., Mészáros, A., Kerekes, A., Brendel, W., Huszár, F. Proceedings of the 41st International Conference on Machine Learning (ICML), 235:42365-42390, Proceedings of Machine Learning Research, (Editors: Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix), PMLR, July 2024 (Published) arXiv URL BibTeX

Empirical Inference Article Probabilistic pathway-based multimodal factor analysis Immer, A., Stark, S. G., Jacob, F., Bonilla, X., Thomas, T., Kahles, A., Goetze, S., Milani, E. S., Wollscheid, B., Consortium, T. T. P., Rätsch, G., Lehmann, K. Bioinformatics, 40(Supplement 1):i189-i198, July 2024 (Published) DOI URL BibTeX

Empirical Inference Conference Paper Products, Abstractions and Inclusions of Causal Spaces Buchholz, S., Park, J., Schölkopf, B. 40th Conference on Uncertainty in Artificial Intelligence (UAI), 244:430-449, Proceedings of Machine Learning Research, (Editors: Kiyavash, Negar and Mooij, Joris M.), PMLR, July 2024 (Published) arXiv URL BibTeX

Empirical Inference Conference Paper Provable Privacy with Non-Private Pre-Processing Hu, Y., Sanyal, A., Schölkopf, B. Proceedings of the 41st International Conference on Machine Learning (ICML), 235:19402-19437, Proceedings of Machine Learning Research, (Editors: Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix), PMLR, July 2024 (Published) URL BibTeX

Haptic Intelligence Conference Paper Reflectance Outperforms Force and Position in Model-Free Needle Puncture Detection L’Orsa, R., Bisht, A., Yu, L., Murari, K., Westwick, D. T., Sutherland, G. R., Kuchenbecker, K. J. In Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 1-7, Orlando, USA, July 2024 (Published)
The surgical procedure of needle thoracostomy temporarily corrects accidental over-pressurization of the space between the chest wall and the lungs. However, failure rates of up to 94.1% have been reported, likely because this procedure is done blind: operators estimate by feel when the needle has reached its target. We believe instrumented needles could help operators discern entry into the target space, but limited success has been achieved using force and/or position to try to discriminate needle puncture events during simulated surgical procedures. We thus augmented our needle insertion system with a novel in-bore double-fiber optical setup. Tissue reflectance measurements as well as 3D force, torque, position, and orientation were recorded while two experimenters repeatedly inserted a bevel-tipped percutaneous needle into ex vivo porcine ribs. We applied model-free puncture detection to various filtered time derivatives of each sensor data stream offline. In the held-out test set of insertions, puncture-detection precision improved substantially using reflectance measurements compared to needle insertion force alone (3.3-fold increase) or position alone (11.6-fold increase).
DOI BibTeX

Empirical Inference Conference Paper Robustness of Nonlinear Representation Learning Buchholz, S., Schölkopf, B. Proceedings of the 41st International Conference on Machine Learning (ICML), 235:4785-4821, Proceedings of Machine Learning Research, (Editors: Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix), PMLR, July 2024 (Published) URL BibTeX

Empirical Inference Conference Paper Simultaneous identification of models and parameters of scientific simulators Schröder, C., Macke, J. H. Proceedings of the 41st International Conference on Machine Learning (ICML), 235:43895-43927, Proceedings of Machine Learning Research, (Editors: Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix), PMLR, July 2024 (Published) URL BibTeX

Empirical Inference Conference Paper Stitching Manifolds: Leveraging Interaction to Compose Object Representations into Scenes Keurti, H., Schölkopf, B., Aceituno, P. V., Grewe, B. ICML 2024 Workshop on Geometry-grounded Representation Learning and Generative Modeling (GRaM), July 2024 (Published) URL BibTeX

Empirical Inference Conference Paper Targeted Reduction of Causal Models Kekić, A., Schölkopf, B., Besserve, M. 40th Conference on Uncertainty in Artificial Intelligence (UAI), 244:1953-1980, Proceedings of Machine Learning Research, (Editors: Kiyavash, Negar and Mooij, Joris M.), PMLR, July 2024 (Published) arXiv URL BibTeX

Human Aspects of Machine Learning Empirical Inference Conference Paper The Role of Learning Algorithms in Collective Action Ben-Dov*, O., Fawkes*, J., Samadi, S., Sanyal, A. Proceedings of the 41st International Conference on Machine Learning (ICML), 235:3443-3461, Proceedings of Machine Learning Research, (Editors: Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix), PMLR, July 2024, *equal contribution (Published) URL BibTeX

Empirical Inference Conference Paper Unveiling CLIP Dynamics: Linear Mode Connectivity and Generalization Abdolahpourrostam, A., Sanyal, A., Moosavi-Dezfooli, S. ICML 2024 Workshop on Foundation Models in the Wild, July 2024 (Published) URL BibTeX

Empirical Inference Conference Paper What Makes Safety Fine-tuning Methods Safe? A Mechanistic Study Jain, S., Lubana, E. S., Oksuz, K., Joy, T., Torr, P. H. S., Sanyal, A., Dokania, P. K. ICML 2024 Workshop on Mechanistic Interpretability (Spotlight), July 2024 (Published) URL BibTeX

Perceiving Systems Conference Paper ContourCraft: Learning to Resolve Intersections in Neural Multi-Garment Simulations Grigorev, A., Becherini, G., Black, M., Hilliges, O., Thomaszewski, B. In Proceedings SIGGRAPH 2024 Conference Papers, Association for Computing Machinery, New York, NY, USA, SIGGRAPH '24, July 2024 (Published)
Learning-based approaches to cloth simulation have started to show their potential in recent years. However, handling collisions and intersections in neural simulations remains a largely unsolved problem. In this work, we present ContourCraft, a learning-based solution for handling intersections in neural cloth simulations. Unlike conventional approaches that critically rely on intersection-free inputs, ContourCraft robustly recovers from intersections introduced through missed collisions, self-penetrating bodies, or errors in manually designed multi-layer outfits. The technical core of ContourCraft is a novel intersection contour loss that penalizes interpenetrations and encourages rapid resolution thereof. We integrate our intersection loss with a collision-avoiding repulsion objective into a neural cloth simulation method based on graph neural networks (GNNs). We demonstrate our method’s ability across a challenging set of diverse multi-layer outfits under dynamic human motions. Our extensive analysis indicates that ContourCraft significantly improves collision handling for learned simulation and produces visually compelling results.
paper arXiv project video code DOI URL BibTeX

Perceiving Systems Conference Paper Airship Formations for Animal Motion Capture and Behavior Analysis Price, E., Ahmad, A. Proceedings 2nd International Conference on Design and Engineering of Lighter-Than-Air systems (DELTAS2024), June 2024 (Published)
Using UAVs for wildlife observation and motion capture offers manifold advantages for studying animals in the wild, especially grazing herds in open terrain. The aerial perspective allows observation at a scale and depth that is not possible on the ground, offering new insights into group behavior. However, the very nature of wildlife field-studies puts traditional fixed wing and multi-copter systems to their limits: limited flight time, noise and safety aspects affect their efficacy, where lighter than air systems can remain on station for many hours. Nevertheless, airships are challenging from a ground handling perspective as well as from a control point of view, being voluminous and highly affected by wind. In this work, we showcase a system designed to use airship formations to track, follow, and visually record wild horses from multiple angles, including airship design, simulation, control, on board computer vision, autonomous operation and practical aspects of field experiments.
arXiv URL BibTeX

Autonomous Learning Article PaSTS An Operational Dataset for Domestic Solar Thermal Systems Ebmeier, F., Ludwig, N., Martius, G., Franz, V. H. PaSTS An Operational Dataset for Domestic Solar Thermal Systems, June 2024 (Accepted)
Solar thermal systems play an important role in the decarbonization of the domestic heating sector, yet there exist no publicly available datasets of such systems. Therefore, this paper presents the PaSTS dataset, a unique collection of operational data from domestic Solar Thermal Systems (STS) manufactured by Ritter Energie and marketed under the Paradigma brand. Unlike previous research that primarily relied on simulated or unpublished experimental data, this dataset is derived from the service team at Ritter Energie, offering a realistic reflection of the challenges commonly faced in the field. This paper provides a comprehensive dataset overview, emphasizing its application in anomaly and fault detection tasks within STS and establishes the dataset as the first of its kind. Given the inherent complexities of fault detection in STS, we elaborate on the expert system-based fault detection mechanism currently in …
URL BibTeX

Deep Models and Optimization Conference Paper Universality of Linear Recurrences Followed by Non-linear Projections: Finite-Width Guarantees and Benefits of Complex Eigenvalues Orvieto, A., De, S., Gulcehre, C., Pascanu, R., Smith, S. L. In Proceedings of Machine Learning Research, Proceedings of the Forty-First International Conference on Machine Learning, June 2024 (Published) URL BibTeX

Empirical Inference Ph.D. Thesis Advancing Normalising Flows to Model Boltzmann Distributions Stimper, V. University of Cambridge, UK, Cambridge, June 2024, (Cambridge-Tübingen-Fellowship-Program) (Published) BibTeX

Empirical Inference Conference Paper Analyzing the Role of Semantic Representations in the Era of Large Language Models Jin*, Z., Chen*, Y., Gonzalez*, F., Liu, J., Zhang, J., Michael, J., Schölkopf, B., Diab, M. Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), Volume 1: Long Papers:3781-3798, (Editors: Duh, Kevin and Gomez, Helena and Bethard, Steven), Association for Computational Linguistics, June 2024, *equal contribution (Published) arXiv DOI URL BibTeX

Empirical Inference Conference Paper Automatic Generation of Model and Data Cards: A Step Towards Responsible AI Liu, J., Li, W., Jin, Z., Diab, M. Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL), Volume 1: Long Papers:1975-1997, (Editors: Duh, Kevin and Gomez, Helena and Bethard, Steven), Association for Computational Linguistics, June 2024 (Published) DOI URL BibTeX

Haptic Intelligence Bachelor Thesis Kalman Filter Approach to Sensor Fusion of Ultra-Wideband Positioning and IMU Readings for Enhanced Indoor Tracking of Collaborating Humans Hudhud Mughrabi, M. Kadir Has University, Istanbul, Turkey, June 2024, Bachelor of Science (BSc) in Mechatronics Engineering (Published)
The question of how humans collaborate to perform complex tasks such as surgery has previously been investigated via multimodal sensing and analysis. Ultra-wideband (UWB) localization systems can be deployed to track collaborating team members due to good maneuverability even in cramped environments. However, UWB systems' sampling rate is inversely proportional to the number of people tracked, and their accuracy is hindered by electromagnetic occlusion. This thesis combines UWB positioning with measurements from a wearable inertial measurement unit (IMU) by applying an error-state extended Kalman filter (ES-EKF) to improve position and orientation estimation during team collaborative studies. ES-EKF offers faster and more consistent estimation and can be estimated even without UWB input. Single-human and multi-human sessions were recorded and filtered for evaluation in comparison to ground truth from optical motion capture. By integrating the IMU, the ES-EKF increases the sampling rate from 0.5–20 Hz to 100 Hz. As it is corrected in only 2 degrees of freedom (DOF), the ES-EKF yields improved results over UWB in 4 out of 6 DOF: lateral and longitudinal position and yaw and pitch orientation. Further filter design implications are suggested for future application of ES-EKF in position and orientation estimation of collaborating humans.
BibTeX

Perceiving Systems Conference Paper Multi-Track Timeline Control for Text-Driven 3D Human Motion Generation Petrovich, M., Litany, O., Iqbal, U., Black, M. J., Varol, G., Peng, X. B., Rempe, D. In CVPR Workshop on Human Motion Generation, Seattle, CVPR, June 2024 (Published)
Recent advances in generative modeling have led to promising progress on synthesizing 3D human motion from text, with methods that can generate character animations from short prompts and specified durations. However, using a single text prompt as input lacks the fine-grained control needed by animators, such as composing multiple actions and defining precise durations for parts of the motion. To address this, we introduce the new problem of timeline control for text-driven motion synthesis, which provides an intuitive, yet fine-grained, input interface for users. Instead of a single prompt, users can specify a multi-track timeline of multiple prompts organized in temporal intervals that may overlap. This enables specifying the exact timings of each action and composing multiple actions in sequence or at overlapping intervals. To generate composite animations from a multi-track timeline, we propose a new test-time denoising method. This method can be integrated with any pre-trained motion diffusion model to synthesize realistic motions that accurately reflect the timeline. At every step of denoising, our method processes each timeline interval (text prompt) individually, subsequently aggregating the predictions with consideration for the specific body parts engaged in each action. Experimental comparisons and ablations validate that our method produces realistic motions that respect the semantics and timing of given text prompts.
code website paper-arxiv video URL BibTeX

Neural Capture and Synthesis Perceiving Systems Conference Paper Neuropostors: Neural Geometry-aware 3D Crowd Character Impostors Ostrek, M., Mitra, N. J., O’Sullivan, C. In 27th International Conference on Pattern Recognition (ICPR), Springer, June 2024 (Published)
Crowd rendering and animation was a very active research area over a decade ago, but in recent years this has lessened, mainly due to improvements in graphics acceleration hardware. Nevertheless, there is still a high demand for generating varied crowd appearances and animation for games, movie production, and mixed-reality applications. Current approaches are still limited in terms of both the behavioral and appearance aspects of virtual characters due to (i) high memory and computational demands; and (ii) person-hours needed of skilled artists in the context of short production cycles. A promising previous approach to generating varied crowds was the use of pre-computed impostor representations for crowd characters, which could replace an animation of a 3D mesh with a simplified 2D impostor for every frame of an animation sequence, e.g., Geopostors [1]. However, with their high memory demands at a time when improvements in consumer graphics accelerators were outpacing memory availability, the practicality of such methods was limited. Inspired by this early work and recent advances in the field of Neural Rendering, we present a new character representation: Neuropostors. We train a Convolutional Neural Network as a means of compressing both the geometric properties and animation key-frames for a 3D character, thereby allowing for constant-time rendering of animated characters from arbitrary camera views. Our method also allows for explicit illumination and material control, by utilizing a flexible rendering equation that is connected to the outputs of the neural network.
BibTeX

Robust Machine Learning Article Translational symmetry in convolutions with localized kernels causes an implicit bias toward high frequency adversarial examples Caro, J. O., Ju, Y., Pyle, R., Dey, S., Brendel, W., Anselmi, F., Patel, A. B. Frontiers in Computational Neuroscience, 18:1387077, June 2024 (Published) Frontiers in Computational Neuroscience BibTeX

Perceiving Systems Conference Paper 4D-DRESS: A 4D Dataset of Real-World Human Clothing With Semantic Annotations Wang, W., Ho, H., Guo, C., Rong, B., Grigorev, A., Song, J., Zarate, J. J., Hilliges, O. In IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), CVPR, June 2024 (Published)
The studies of human clothing for digital avatars have predominantly relied on synthetic datasets. While easy to collect, synthetic data often fall short in realism and fail to capture authentic clothing dynamics. Addressing this gap, we introduce 4D-DRESS, the first real-world 4D dataset advancing human clothing research with its high-quality 4D textured scans and garment meshes. 4D-DRESS captures 64 outfits in 520 human motion sequences amounting to a total of 78k textured scans. Creating a real-world clothing dataset is challenging, particularly in annotating and segmenting the extensive and complex 4D human scans. To address this, we develop a semi-automatic 4D human parsing pipeline. We efficiently combine a human-in-the-loop process with automation to accurately label 4D scans in diverse garments and body movements. Leveraging precise annotations and high-quality garment meshes, we establish a number of benchmarks for clothing simulation and reconstruction. 4D-DRESS offers realistic and challenging data that complements synthetic sources, paving the way for advancements in research of lifelike human clothing.
arXiv project code data BibTeX

Perceiving Systems Conference Paper MonoHair: High-Fidelity Hair Modeling from a Monocular Video Wu, K., Yang, L., Kuang, Z., Feng, Y., Han, X., Shen, Y., Fu, H., Zhou, K., Zheng, Y. In IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), 24164-24173, CVPR, June 2024 (Published)
Undoubtedly, high-fidelity 3D hair is crucial for achieving realism, artistic expression, and immersion in computer graphics. While existing 3D hair modeling methods have achieved impressive performance, the challenge of achieving high-quality hair reconstruction persists: they either require strict capture conditions, making practical applications difficult, or heavily rely on learned prior data, obscuring fine-grained details in images. To address these challenges, we propose a generic framework to achieve high-fidelity hair reconstruction from a monocular video, without specific requirements for environments. Our approach bifurcates the hair modeling process into two main stages: precise exterior reconstruction and interior structure inference. The exterior is meticulously crafted using our Patch-based Multi-View Optimization (PMVO). This method strategically collects and integrates hair information from multiple views, independent of prior data, to produce a high-fidelity exterior 3D line map. This map not only captures intricate details but also facilitates the inference of the hair’s inner structure. For the interior, we employ a data-driven, multi-view 3D hair reconstruction method. This method utilizes 2D structural renderings derived from the reconstructed exterior, mirroring the synthetic 2D inputs used during training. This alignment effectively bridges the domain gap between our training data and real-world data, thereby enhancing the accuracy and reliability of our interior structure inference. Lastly, we generate a strand model and resolve the directional ambiguity by our hair growth algorithm. Our experiments demonstrate that our method exhibits robustness across diverse hairstyles and achieves state-of-the-art performance.
Project Arxiv DOI URL BibTeX

Perceiving Systems Conference Paper TokenHMR: Advancing Human Mesh Recovery with a Tokenized Pose Representation Dwivedi, S. K., Sun, Y., Patel, P., Feng, Y., Black, M. J. In IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), 1323-1333, CVPR, June 2024 (Published)
We address the problem of regressing 3D human pose and shape from a single image, with a focus on 3D accuracy. The current best methods leverage large datasets of 3D pseudo-ground-truth (p-GT) and 2D keypoints, leading to robust performance. With such methods, we observe a paradoxical decline in 3D pose accuracy with increasing 2D accuracy. This is caused by biases in the p-GT and the use of an approximate camera projection model. We quantify the error induced by current camera models and show that fitting 2D keypoints and p-GT accurately causes incorrect 3D poses. Our analysis defines the invalid distances within which minimizing 2D and p-GT losses is detrimental. We use this to formulate a new loss Threshold-Adaptive Loss Scaling (TALS) that penalizes gross 2D and p-GT losses but not smaller ones. With such a loss, there are many 3D poses that could equally explain the 2D evidence. To reduce this ambiguity we need a prior over valid human poses but such priors can introduce unwanted bias. To address this, we exploit a tokenized representation of human pose and reformulate the problem as token prediction. This restricts the estimated poses to the space of valid poses, effectively providing a uniform prior. Extensive experiments on the EMDB and 3DPW datasets show that our reformulated keypoint loss and tokenization allows us to train on in-the-wild data while improving 3D accuracy over the state-of-the-art.
Paper Project Code Poster Video DOI URL BibTeX

Haptic Intelligence Article AiroTouch: Enhancing Telerobotic Assembly through Naturalistic Haptic Feedback of Tool Vibrations Gong, Y., Mat Husin, H., Erol, E., Ortenzi, V., Kuchenbecker, K. J. Frontiers in Robotics and AI, 11(1355205):1-15, May 2024 (Published)
Teleoperation allows workers to safely control powerful construction machines; however, its primary reliance on visual feedback limits the operator's efficiency in situations with stiff contact or poor visibility, hindering its use for assembly of pre-fabricated building components. Reliable, economical, and easy-to-implement haptic feedback could fill this perception gap and facilitate the broader use of robots in construction and other application areas. Thus, we adapted widely available commercial audio equipment to create AiroTouch, a naturalistic haptic feedback system that measures the vibration experienced by each robot tool and enables the operator to feel a scaled version of this vibration in real time. Accurate haptic transmission was achieved by optimizing the positions of the system's off-the-shelf accelerometers and voice-coil actuators. A study was conducted to evaluate how adding this naturalistic type of vibrotactile feedback affects the operator during telerobotic assembly. Thirty participants used a bimanual dexterous teleoperation system (Intuitive da Vinci Si) to build a small rigid structure under three randomly ordered haptic feedback conditions: no vibrations, one-axis vibrations, and summed three-axis vibrations. The results show that users took advantage of both tested versions of the naturalistic haptic feedback after gaining some experience with the task, causing significantly lower vibrations and forces in the second trial. Subjective responses indicate that haptic feedback increased the realism of the interaction and reduced the perceived task duration, task difficulty, and fatigue. As hypothesized, higher haptic feedback gains were chosen by users with larger hands and for the smaller sensed vibrations in the one-axis condition. 
These results elucidate important details for effective implementation of naturalistic vibrotactile feedback and demonstrate that our accessible audio-based approach could enhance user performance and experience during telerobotic assembly in construction and other application domains.
DOI BibTeX

Empirical Inference Conference Paper Can Large Language Models Infer Causation from Correlation? Jin, Z., Liu, J., Lyu, Z., Poff, S., Sachan, M., Mihalcea, R., Diab*, M., Schölkopf*, B. The Twelfth International Conference on Learning Representations (ICLR), May 2024, *equal supervision (Published) arXiv URL BibTeX

Empirical Inference Conference Paper Causal Modeling with Stationary Diffusions Lorch, L., Krause*, A., Schölkopf*, B. Proceedings of the 27th International Conference on Artificial Intelligence and Statistics (AISTATS), 238:1927-1935, Proceedings of Machine Learning Research, (Editors: Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen), PMLR, May 2024, *equal supervision (Published) URL BibTeX

Empirical Inference Conference Paper Certified private data release for sparse Lipschitz functions Donhauser, K., Lokna, J., Sanyal, A., Boedihardjo, M., Hönig, R., Yang, F. Proceedings of the 27th International Conference on Artificial Intelligence and Statistics (AISTATS), 238:1396-1404, Proceedings of Machine Learning Research, (Editors: Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen), PMLR, May 2024 (Published) URL BibTeX

Haptic Intelligence Article Closing the Loop in Minimally Supervised Human-Robot Interaction: Formative and Summative Feedback Mohan, M., Nunez, C. M., Kuchenbecker, K. J. Scientific Reports, 14(1):10564, May 2024 (Published)
Human instructors fluidly communicate with hand gestures, head and body movements, and facial expressions, but robots rarely leverage these complementary cues. A minimally supervised social robot with such skills could help people exercise and learn new activities. Thus, we investigated how nonverbal feedback from a humanoid robot affects human behavior. Inspired by the education literature, we evaluated formative feedback (real-time corrections) and summative feedback (post-task scores) for three distinct tasks: positioning in the room, mimicking the robot's arm pose, and contacting the robot's hands. Twenty-eight adults completed seventy-five 30-second-long trials with no explicit instructions or experimenter help. Motion-capture data analysis shows that both formative and summative feedback from the robot significantly aided user performance. Additionally, formative feedback improved task understanding. These results show the power of nonverbal cues based on human movement and the utility of viewing feedback through formative and summative lenses.
DOI BibTeX

Empirical Inference Conference Paper Delphic Offline Reinforcement Learning under Nonidentifiable Hidden Confounding Pace, A., Yèche, H., Schölkopf, B., Rätsch, G., Tennenholtz, G. The Twelfth International Conference on Learning Representations (ICLR), May 2024 (Published) arXiv BibTeX

Autonomous Learning Conference Paper Emergent mechanisms for long timescales depend on training curriculum and affect performance in memory tasks Khajehabdollahi, S., Zeraati, R., Giannakakis, E., Schäfer, T. J., Martius, G., Levina, A. In The Twelfth International Conference on Learning Representations, ICLR 2024, May 2024 (Published) URL BibTeX

Perceiving Systems Article Exploring Weight Bias and Negative Self-Evaluation in Patients with Mood Disorders: Insights from the BodyTalk Project Meneguzzo, P., Behrens, S. C., Pavan, C., Toffanin, T., Quiros-Ramirez, M. A., Black, M. J., Giel, K., Tenconi, E., Favaro, A. Frontiers in Psychiatry, 15, Sec. Psychopathology, May 2024 (Published)
Background: Negative body image and adverse body self-evaluation represent key psychological constructs within the realm of weight bias (WB), potentially intertwined with the negative self-evaluation characteristic of depressive symptomatology. Although WB encapsulates an implicit form of self-critical assessment, its exploration among people with mood disorders (MD) has been under-investigated. Our primary goal is to comprehensively assess both explicit and implicit WB, seeking to reveal specific dimensions that could interconnect with the symptoms of MDs. Methods: A cohort comprising 25 MD patients and 35 demographically matched healthy peers (with 83% female representation) participated in a series of tasks designed to evaluate the congruence between various computer-generated body representations and a spectrum of descriptive adjectives. Our analysis delved into multiple facets of body image evaluation, scrutinizing the associations between different body sizes and emotionally charged adjectives (e.g., active, apple-shaped, attractive). Results: No discernible differences emerged concerning body dissatisfaction or the correspondence of different body sizes with varying adjectives. Interestingly, MD patients exhibited a markedly higher tendency to overestimate their body weight (p = 0.011). Explicit WB did not show significant variance between the two groups, but MD participants demonstrated a notable implicit WB within a specific weight rating task for BMI between 18.5 and 25 kg/m² (p = 0.012). Conclusions: Despite the striking similarities in the assessment of participants’ body weight, our investigation revealed an implicit WB among individuals grappling with MD. This bias potentially assumes a role in fostering self-directed negative evaluations, shedding light on a previously unexplored facet of the interplay between WB and mood disorders.
paper DOI URL BibTeX

Social Foundations of Computation Conference Paper Fairness Rising from the Ranks: HITS and PageRank on Homophilic Networks Stoica, A., Litvak, N., Chaintreau, A. In Proceedings of the Association for Computing Machinery (ACM) Web Conference 2024, ACM, The 2024 ACM Web Conference, May 2024 (Published)
In this paper, we investigate the conditions under which link analysis algorithms prevent minority groups from reaching high-ranking slots. We find that the most common link-based algorithms using centrality metrics, such as PageRank and HITS, can reproduce and even amplify bias against minority groups in networks. Yet, their behavior differs: on the one hand, we empirically show that PageRank mirrors the degree distribution for most of the ranking positions and it can equalize representation of minorities among the top-ranked nodes; on the other hand, we find that HITS amplifies pre-existing bias in homophilic networks through a novel theoretical analysis, supported by empirical results. We find the root cause of bias amplification in HITS to be the level of homophily present in the network, modeled through an evolving network model with two communities. We illustrate our theoretical analysis on both synthetic and real datasets and we present directions for future work.
ArXiv URL BibTeX

Haptic Intelligence Robotics Miscellaneous GaitGuide: A Wearable Device for Vibrotactile Motion Guidance Rokhmanova, N., Martus, J., Faulkner, R., Fiene, J., Kuchenbecker, K. J. Workshop paper (3 pages) presented at the ICRA Workshop on Advancing Wearable Devices and Applications Through Novel Design, Sensing, Actuation, and AI, Yokohama, Japan, May 2024 (Published)
Wearable vibrotactile devices can provide salient sensations that attract the user's attention or guide them to change. The future integration of such feedback into medical or consumer devices would benefit from understanding how vibrotactile cues vary in amplitude and perceived strength across the heterogeneity of human skin. Here, we developed an adhesive vibrotactile device (the GaitGuide) that uses two individually mounted linear resonant actuators to deliver directional motion guidance. By measuring the mechanical vibrations of the actuators via small on-board accelerometers, we compared vibration amplitudes and perceived signal strength across 20 subjects at five signal voltages and four sites around the shank. Vibrations were consistently smallest in amplitude—but perceived to be strongest—at the site located over the tibia. We created a fourth-order linear dynamic model to capture differences in tissue properties across subjects and sites via optimized stiffness and damping parameters. The anterior site had significantly higher skin stiffness and damping; these values also correlate with subject-specific body-fat percentages. Surprisingly, our study shows that the perception of vibrotactile stimuli does not solely depend on the vibration magnitude delivered to the skin. These findings also help to explain the clinical practice of evaluating vibrotactile sensitivity over a bony prominence.
URL BibTeX

Perceiving Systems Empirical Inference Conference Paper Ghost on the Shell: An Expressive Representation of General 3D Shapes Liu, Z., Feng, Y., Xiu, Y., Liu, W., Paull, L., Black, M. J., Schölkopf, B. In Proceedings of the Twelfth International Conference on Learning Representations (ICLR), The Twelfth International Conference on Learning Representations (ICLR), May 2024 (Published)
The creation of photorealistic virtual worlds requires the accurate modeling of 3D surface geometry for a wide range of objects. For this, meshes are appealing since they 1) enable fast physics-based rendering with realistic material and lighting, 2) support physical simulation, and 3) are memory-efficient for modern graphics pipelines. Recent work on reconstructing and statistically modeling 3D shape, however, has critiqued meshes as being topologically inflexible. To capture a wide range of object shapes, any 3D representation must be able to model solid, watertight shapes as well as thin, open surfaces. Recent work has focused on the former, and methods for reconstructing open surfaces do not support fast reconstruction with material and lighting or unconditional generative modelling. Inspired by the observation that open surfaces can be seen as islands floating on watertight surfaces, we parameterize open surfaces by defining a manifold signed distance field on watertight templates. With this parameterization, we further develop a grid-based and differentiable representation that parameterizes both watertight and non-watertight meshes of arbitrary topology. Our new representation, called Ghost-on-the-Shell (G-Shell), enables two important applications: differentiable rasterization-based reconstruction from multiview images and generative modelling of non-watertight meshes. We empirically demonstrate that G-Shell achieves state-of-the-art performance on non-watertight mesh reconstruction and generation tasks, while also performing effectively for watertight meshes.
Home Code Video Project BibTeX

Empirical Inference Conference Paper Identifying Policy Gradient Subspaces Schneider, J., Schumacher, P., Guist, S., Chen, L., Häufle, D., Schölkopf, B., Büchler, D. The Twelfth International Conference on Learning Representations (ICLR), May 2024 (Published) arXiv BibTeX

Autonomous Learning Conference Paper Learning Hierarchical World Models with Adaptive Temporal Abstractions from Discrete Latent Dynamics Gumbsch, C., Sajid, N., Martius, G., Butz, M. V. In The Twelfth International Conference on Learning Representations, ICLR 2024, May 2024 URL BibTeX

Empirical Inference Autonomous Learning Conference Paper Multi-View Causal Representation Learning with Partial Observability Yao, D., Xu, D., Lachapelle, S., Magliacane, S., Taslakian, P., Martius, G., von Kügelgen, J., Locatello, F. The Twelfth International Conference on Learning Representations (ICLR), May 2024 (Published) arXiv BibTeX