Publications

DEPARTMENTS

Empirical Inference

Haptic Intelligence

Modern Magnetic Systems

Perceiving Systems

Physical Intelligence

Robotic Materials

Social Foundations of Computation


Research Groups

Autonomous Vision

Autonomous Learning

Bioinspired Autonomous Miniature Robots

Dynamic Locomotion

Embodied Vision

Human Aspects of Machine Learning

Intelligent Control Systems

Learning and Dynamical Systems

Locomotion in Biorobotic and Somatic Systems

Micro, Nano, and Molecular Systems

Movement Generation and Control

Neural Capture and Synthesis

Physics for Inference and Optimization

Organizational Leadership and Diversity

Probabilistic Learning Group



Physical Intelligence Article Clinical translation of wireless soft robotic medical devices Wang, T., Wu, Y., Yildiz, E., Kanyas, S., Sitti, M. Nature Reviews Bioengineering, 2024 (Published)
Small-scale wireless soft robotics can be designed as implantable, interventional or wearable devices for various biomedical applications. Their flexibility, dexterity, adaptability and safe interactions with biological environments make them promising candidates for enabling precise and remote healthcare and disease diagnosis. However, the clinical translation of wireless soft robotic medical devices remains challenging. In this Review, we provide a comprehensive overview of the robotic technologies, the navigation methods, the dexterous functions and the translational challenges of wireless soft robotic medical devices. We first discuss safety and biocompatibility from a biological and technical perspective and then examine navigation methods for overcoming biological barriers for delivery, mobility and retrieval, highlighting dexterous medical functions at small scales. Finally, we identify key product development challenges, as well as the regulatory and ethical considerations that should be addressed to enable the clinical translation of wireless soft robotic medical devices.
DOI BibTeX

Modern Magnetic Systems Article Coherent magnons with giant nonreciprocity at nanoscale wavelengths Gallardo, R., Weigand, M., Schultheiss, K., Kakay, A., Mattheis, R., Raabe, J., Schütz, G., Deac, A., Lindner, J., Wintz, S. ACS Nano, 18(7):5249-5257, American Chemical Society, Washington, DC, 2024 DOI BibTeX

Learning and Dynamical Systems Conference Paper Conformal Performance Range Prediction for Segmentation Output Quality Control Wundram, A., Fischer, P., Muehlebach, M., Koch, L., Baumgartner, C. International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2024 (Published) BibTeX

Modern Magnetic Systems Article Control of stripe, skyrmion and skyrmionium formation in the 2D magnet Fe3–xGeTe2 by varying composition Birch, M. T., Powalla, L., Litzius, K., Nehruji, V., Hovorka, O., Wintz, S., Schulz, F., Mayoh, D. A., Balakrishnan, G., Weigand, M., Burghard, M., Schütz, G. 2D Materials, 11(2), IOP Publ., Bristol, 2024 DOI BibTeX

Haptic Intelligence Miscellaneous Discrete Fourier Transform Three-to-One (DFT321): Code Landin, N., Romano, J. M., McMahan, W., Kuchenbecker, K. J. MATLAB code of discrete fourier transform three-to-one (DFT321), 2024 (Published) Code BibTeX

Modern Magnetic Systems Article Electrical detection and nucleation of a magnetic skyrmion in a magnetic tunnel junction observed via operando magnetic microscopy Urrestarazu Larrañaga, J., Sisodia, N., Guedas, R., Pham, V. T., Di Manici, I., Masseboeuf, A., Garello, K., Disdier, F., Fernandez, B., Wintz, S., Weigand, M., Belmeguenai, M., Pizzini, S., Sousa, R. C., Buda-Prejbeanu, L. D., Gaudin, G., Boulle, O. Nano Letters, 24(12):3557-3565, American Chemical Society, Washington, DC, 2024 DOI BibTeX

Modern Magnetic Systems Article Establishing ZIF-8 as a reference material for hydrogen cryoadsorption: An interlaboratory study Villajos, J. A., Balderas-Xicohténcatl, R., Al Shakhs, A. N., Berenguer-Murcia, A., Buckley, C. E., Cazorla-Amorós, D., Charalambopoulou, G., Couturas, F., Cuevas, F., Fairen-Jimenez, D., Heinselman, K. N., Humphries, T. D., Kaskel, S., Kim, H., Marco-Lozar, J. P., Oh, H., Parilla, P. A., Paskevicius, M., Senkovska, I., Shulda, S., et al. ChemPhysChem, 25(5), Wiley-VCH, Weinheim, Germany, 2024 DOI BibTeX

Embodied Vision Article Event-based Non-Rigid Reconstruction of Low-Rank Parametrized Deformations from Contours Xue, Y., Li, H., Leutenegger, S., Stueckler, J. International Journal of Computer Vision (IJCV), 2024 (Published)
Visual reconstruction of fast non-rigid object deformations over time is a challenge for conventional frame-based cameras. In recent years, event cameras have gained significant attention due to their bio-inspired properties, such as high temporal resolution and high dynamic range. In this paper, we propose a novel approach for reconstructing such deformations using event measurements. Under the assumption of a static background, where all events are generated by the motion, our approach estimates the deformation of objects from events generated at the object contour in a probabilistic optimization framework. It associates events to mesh faces on the contour and maximizes the alignment of the line of sight through the event pixel with the associated face. In experiments on synthetic and real data of human body motion, we demonstrate the advantages of our method over state-of-the-art optimization and learning-based approaches for reconstructing the motion of human arms and hands. In addition, we propose an efficient event stream simulator to synthesize realistic event data for human motion.
DOI URL BibTeX

Embodied Vision Conference Paper Examining Common Paradigms in Multi-Task Learning Elich, C., Kirchdorfer, L., Köhler, J. M., Schott, L. In Proceedings of the German Conference on Pattern Recognition (GCPR), 2024, to appear (To be published) paper BibTeX

Learning and Dynamical Systems Article Gray-box nonlinear feedback optimization He, Z., Bolognani, S., Muehlebach, M., Dörfler, F. IEEE Transactions on Automatic Control, 2024 (Submitted) URL BibTeX

Modern Magnetic Systems Article Hydrogen-stabilized ScYNdGd medium-entropy alloy for hydrogen storage Balcerzak, M., Ponsoni, J. B., Petersen, H., Menéndez, C., Ternieden, J., Zhang, L., Winkelmann, F., Aguey-Zinsou, K., Hirscher, M., Felderhoff, M. Journal of the American Chemical Society, 146(8):5283-5294, American Chemical Society, Washington, DC, 2024 DOI BibTeX

Embodied Vision Technical Report Incremental Few-Shot Adaptation for Non-Prehensile Object Manipulation using Parallelizable Physics Simulators Baumeister, F., Mack, L., Stueckler, J. CoRR abs/2409.13228, CoRR, 2024, Submitted to IEEE International Conference on Robotics and Automation (ICRA) 2025 (Submitted)
Few-shot adaptation is an important capability for intelligent robots that perform tasks in open-world settings such as everyday environments or flexible production. In this paper, we propose a novel approach for non-prehensile manipulation which iteratively adapts a physics-based dynamics model for model-predictive control. We adapt the parameters of the model incrementally with a few examples of robot-object interactions. This is achieved by sampling-based optimization of the parameters using a parallelizable rigid-body physics simulation as dynamic world model. In turn, the optimized dynamics model can be used for model-predictive control using efficient sampling-based optimization. We evaluate our few-shot adaptation approach in several object pushing experiments in simulation and with a real robot.
URL BibTeX

Physical Intelligence Article Janus microparticles-based targeted and spatially-controlled piezoelectric neural stimulation via low-intensity focused ultrasound Han, M., Yildiz, E., Bozuyuk, U., Aydin, A., Yu, Y., Bhargava, A., Karaz, S., Sitti, M. Nature Communications, 15(1):2013, 2024
Electrical stimulation is a fundamental tool in studying neural circuits, treating neurological diseases, and advancing regenerative medicine. Injectable, free-standing piezoelectric particle systems have emerged as non-genetic and wireless alternatives for electrode-based tethered stimulation systems. However, achieving cell-specific and high-frequency piezoelectric neural stimulation remains challenging due to high-intensity thresholds, non-specific diffusion, and internalization of particles. Here, we develop cell-sized 20 μm-diameter silica-based piezoelectric magnetic Janus microparticles (PEMPs), enabling clinically-relevant high-frequency neural stimulation of primary neurons under low-intensity focused ultrasound. Owing to its functionally anisotropic design, half of the PEMP acts as a piezoelectric electrode via conjugated barium titanate nanoparticles to induce electrical stimulation, while the nickel-gold nanofilm-coated magnetic half provides spatial and orientational control on neural stimulation via external uniform rotating magnetic fields. Furthermore, surface functionalization with targeting antibodies enables cell-specific binding/targeting and stimulation of dopaminergic neurons. Taking advantage of such functionalities, the PEMP design offers unique features towards wireless neural stimulation for minimally invasive treatment of neurological diseases.
DOI BibTeX

Physical Intelligence Article Learning Soft Millirobot Multimodal Locomotion with Sim-to-Real Transfer Demir, S. O., Tiryaki, M. E., Karacakol, A. C., Sitti, M. Advanced Science, 2024 (Published)
With wireless multimodal locomotion capabilities, magnetic soft millirobots have emerged as potential minimally invasive medical robotic platforms. Due to their diverse shape programming capability, they can generate various locomotion modes, and their locomotion can be adapted to different environments by controlling the external magnetic field signal. Existing adaptation methods, however, are based on hand-tuned signals. Here, a learning-based adaptive magnetic soft millirobot multimodal locomotion framework empowered by sim-to-real transfer is presented. Developing a data-driven magnetic soft millirobot simulation environment, the periodic magnetic actuation signal is learned for a given soft millirobot in simulation. Then, the learned locomotion strategy is deployed to the real world using Bayesian optimization and Gaussian processes. Finally, automated domain recognition and locomotion adaptation for unknown environments using a Kullback-Leibler divergence-based probabilistic method are illustrated. This method can enable soft millirobot locomotion to quickly and continuously adapt to environmental changes and explore the actuation space for unanticipated solutions with minimum experimental cost.
DOI BibTeX

Embodied Vision Technical Report Learning a Terrain- and Robot-Aware Dynamics Model for Autonomous Mobile Robot Navigation Achterhold, J., Guttikonda, S., Kreber, J. U., Li, H., Stueckler, J. CoRR abs/2409.11452, 2024, Preprint submitted to Robotics and Autonomous Systems Journal. https://arxiv.org/abs/2409.11452 (Submitted)
Mobile robots should be capable of planning cost-efficient paths for autonomous navigation. Typically, the terrain and robot properties are subject to variations. For instance, properties of the terrain such as friction may vary across different locations. Also, properties of the robot may change such as payloads or wear and tear, e.g., causing changing actuator gains or joint friction. Autonomous navigation approaches should thus be able to adapt to such variations. In this article, we propose a novel approach for learning a probabilistic, terrain- and robot-aware forward dynamics model (TRADYN) which can adapt to such variations and demonstrate its use for navigation. Our learning approach extends recent advances in meta-learning forward dynamics models based on Neural Processes for mobile robot navigation. We evaluate our method in simulation for 2D navigation of a robot with uni-cycle dynamics with varying properties on terrain with spatially varying friction coefficients. In our experiments, we demonstrate that TRADYN has lower prediction error over long time horizons than model ablations which do not adapt to robot or terrain variations. We also evaluate our model for navigation planning in a model-predictive control framework and under various sources of noise. We demonstrate that our approach yields improved performance in planning control-efficient paths by taking robot and terrain properties into account.
BibTeX

Autonomous Learning Article Machine learning of a density functional for anisotropic patchy particles Simon, A., Weimar, J., Martius, G., Oettel, M. Journal of Chemical Theory and Computation, 2024 (Accepted)
Anisotropic patchy particles have become an archetypical statistical model system for associating fluids. Here we formulate an approach to the Kern-Frenkel model via classical density functional theory to describe the positionally and orientationally resolved equilibrium density distributions in flat wall geometries. The density functional is split into a reference part for the orientationally averaged density and an orientational part in mean-field approximation. To bring the orientational part into a kernel form suitable for machine learning techniques, an expansion into orientational invariants and the proper incorporation of single-particle symmetries is formulated. The mean-field kernel is constructed via machine learning on the basis of hard wall simulation data. Results are compared to the well-known random-phase approximation which strongly underestimates the orientational correlations close to the wall. Successes and shortcomings of the mean-field treatment of the orientational part are highlighted and perspectives are given for attaining a full density functional via machine learning.
DOI URL BibTeX

Physical Intelligence Medical Systems Article Nanodiamond-Enhanced Magnetic Resonance Imaging Jelena Lazovic, E. G. A. W. P. S. A. S. J. L. G. W. M. S. Advanced Materials, 36(11):2310109, 2024 DOI BibTeX

Perceiving Systems Ph.D. Thesis Natural Language Control for 3D Human Motion Synthesis Petrovich, M. LIGM, Ecole des Ponts, Univ Gustave Eiffel, CNRS, 2024 (Published)
3D human motions are at the core of many applications in the film industry, healthcare, augmented reality, virtual reality and video games. However, these applications often rely on expensive and time-consuming motion capture data. The goal of this thesis is to explore generative models as an alternative route to obtain 3D human motions. More specifically, our aim is to allow a natural language interface as a means to control the generation process. To this end, we develop a series of models that synthesize realistic and diverse motions following the semantic inputs. In our first contribution, described in Chapter 3, we address the challenge of generating human motion sequences conditioned on specific action categories. We introduce ACTOR, a conditional variational autoencoder (VAE) that learns an action-aware latent representation for human motions. We show significant gains over existing methods thanks to our new Transformer-based VAE formulation, encoding and decoding SMPL pose sequences through a single motion-level embedding. In our second contribution, described in Chapter 4, we go beyond categorical actions, and dive into the task of synthesizing diverse 3D human motions from textual descriptions allowing a larger vocabulary and potentially more fine-grained control. Our work stands out from previous research by not deterministically generating a single motion sequence, but by synthesizing multiple, varied sequences from a given text. We propose TEMOS, building on our VAE-based ACTOR architecture, but this time integrating a pretrained text encoder to handle large-vocabulary natural language inputs. In our third contribution, described in Chapter 5, we address the adjacent task of text-to-3D human motion retrieval, where the goal is to search in a motion collection by querying via text. 
We introduce a simple yet effective approach, named TMR, building on our earlier model TEMOS, by integrating a contrastive loss to enhance the structure of the cross-modal latent space. Our findings emphasize the importance of retaining the motion generation loss in conjunction with contrastive training for improved results. We establish a new evaluation benchmark and conduct analyses on several protocols. In our fourth contribution, described in Chapter 6, we introduce a new problem termed as “multi-track timeline control” for text-driven 3D human motion synthesis. Instead of a single textual prompt, users can organize multiple prompts in temporal intervals that may overlap. We introduce STMC, a test-time denoising method that can be integrated with any pre-trained motion diffusion model. Our evaluations demonstrate that our method generates motions that closely match the semantic and temporal aspects of the input timelines. In summary, our contributions in this thesis are as follows: (i) we develop a generative variational autoencoder, ACTOR, for action-conditioned generation of human motion sequences, (ii) we introduce TEMOS, a text-conditioned generative model that synthesizes diverse human motions from textual descriptions, (iii) we present TMR, a new approach for text-to-3D human motion retrieval, (iv) we propose STMC, a method for timeline control in text-driven motion synthesis, enabling the generation of detailed and complex motions.
pdf YouTube Thesis BibTeX

Learning and Dynamical Systems Conference Paper Online learning under adversarial nonlinear constraints Kolev, P., Martius, G., Muehlebach, M. In Advances in Neural Information Processing Systems 36, Advances in Neural Information Processing Systems, 2024 (Published) URL BibTeX

Empirical Inference Article Optimal Decision Making Under Strategic Behavior Tsirtsis, S., Tabibian, B., Khajehnejad, M., Singla, A., Schölkopf, B., Gomez-Rodriguez, M. Management Science, 2024, Published Online (In press) DOI BibTeX

Empirical Inference Article Parameterizing pressure-temperature profiles of exoplanet atmospheres with neural networks Gebhard, T. D., Angerhausen, D., Konrad, B. S., Alei, E., Quanz, S. P., Schölkopf, B. Astronomy & Astrophysics, 681, 2024 (Published) DOI BibTeX

Embodied Vision Conference Paper Physically Plausible Object Pose Refinement in Cluttered Scenes Strecke, M., Stueckler, J. In Proceedings of the German Conference on Pattern Recognition (GCPR), 2024, to appear (To be published) code preprint (submitted version) BibTeX

Embodied Vision Conference Paper Physics-Based Rigid Body Object Tracking and Friction Filtering From RGB-D Videos Kandukuri, R. K., Strecke, M., Stueckler, J. In Proceedings of the International Conference on 3D Vision (3DV), 2024 (Published)
Physics-based understanding of object interactions from sensory observations is an essential capability in augmented reality and robotics. It enables to capture the properties of a scene for simulation and control. In this paper, we propose a novel approach for real-to-sim which tracks rigid objects in 3D from RGB-D images and infers physical properties of the objects. We use a differentiable physics simulation as state-transition model in an Extended Kalman Filter which can model contact and friction for arbitrary mesh-based shapes and in this way estimate physically plausible trajectories. We demonstrate that our approach can filter position, orientation, velocities, and concurrently can estimate the coefficient of friction of the objects. We analyze our approach on various sliding scenarios in synthetic image sequences of single objects and colliding objects. We also demonstrate and evaluate our approach on a real-world dataset. We make our novel benchmark datasets publicly available to foster future research in this novel problem setting and comparison with our method.
preprint supplemental video dataset DOI URL BibTeX

Physical Intelligence Article Roadmap for Clinical Translation of Mobile Microrobotics Bozuyuk, U., Wrede, P., Yildiz, E., Sitti, M. Advanced Materials, 2311462, 2024
Medical microrobotics is an emerging field to revolutionize clinical applications in diagnostics and therapeutics of various diseases. On the other hand, the mobile microrobotics field has important obstacles to pass before clinical translation. This article focuses on these challenges and provides a roadmap of medical microrobots to enable their clinical use. From the concept of a “magic bullet” to the physicochemical interactions of microrobots in complex biological environments in medical applications, there are several translational steps to consider. Clinical translation of mobile microrobots is only possible with a close collaboration between clinical experts and microrobotics researchers to address the technical challenges in microfabrication, safety, and imaging. The clinical application potential can be materialized by designing microrobots that can solve the current main challenges, such as actuation limitations, material stability, and imaging constraints. The strengths and weaknesses of the current progress in the microrobotics field are discussed and a roadmap for their clinical applications in the near future is outlined.
DOI BibTeX

Physical Intelligence Article Single-step precision programming of decoupled multiresponsive soft millirobots Zheng, Z., Han, J., Shi, Q., Demir, S. O., Jiang, W., Sitti, M. PNAS, 121, 2024 (Published)
Stimuli-responsive soft robots offer new capabilities for the fields of medical and rehabilitation robotics, artificial intelligence, and soft electronics. Precisely programming the shape morphing and decoupling the multiresponsiveness of such robots is crucial to enable them with ample degrees of freedom and multifunctionality, while ensuring high fabrication accuracy. However, current designs featuring coupled multiresponsiveness or intricate assembly processes face limitations in executing complex transformations and suffer from a lack of precision. Therefore, we propose a one-stepped strategy to program multistep shape-morphing soft millirobots (MSSMs) in response to decoupled environmental stimuli. Our approach involves employing a multilayered elastomer and laser scanning technology to selectively process the structure of MSSMs, achieving a minimum machining precision of 30 μm. The resulting MSSMs are capable of imitating the shape morphing of plants and hand gestures and resemble kirigami, pop-up, and bistable structures. The decoupled multistimuli responsiveness of the MSSMs allows them to conduct shape morphing during locomotion, perform logic circuit control, and remotely repair circuits in response to humidity, temperature, and magnetic field. This strategy presents a paradigm for the effective design and fabrication of untethered soft miniature robots with physical intelligence, advancing the decoupled multiresponsive materials through modular tailoring of robotic body structures and properties to suit specific applications.
DOI URL BibTeX

Modern Magnetic Systems Article Size-dependent bistability of magnetic states in soft magnetic cap arrays Sam, S. A., Seyd, J., Ullrich, A., Jung, F., Groß, F., Krupiński, M., Albrecht, M., Thomas, S. Nanotechnology, 35(22), IOP Pub., Bristol, UK, 2024 DOI BibTeX

Modern Magnetic Systems Article Small-pore hydridic frameworks store densely packed hydrogen Oh, H., Tumanov, N., Ban, V., Li, X., Richter, B., Hudson, M. R., Brown, C. M., Iles, G. N., Wallacher, D., Jorgensen, S. W., Daemen, L., Balderas-Xicohténcatl, R., Cheng, Y., Ramirez-Cuesta, A. J., Heere, M., Posada-Pérez, S., Hautier, G., Hirscher, M., Jensen, T. R., Filinchuk, Y. Nature Chemistry, 16(5):809-816, Nature Publishing Group, London, UK, 2024 DOI BibTeX

Materials Article Soft Sub-Structured Multi-Material Biosensor Hydrogels with Enzymes Retained by Plant Viral Scaffolds Grübel, J., Wendlandt, T., Urban, D., Jauch, C. O., Wege, C., Tovar, G. E. M., Southan, A. Macromolecular Bioscience, 24(3), Wiley, 2024 (Published) pdf DOI URL BibTeX

Learning and Dynamical Systems Article Towards a systems theory of algorithms Dörfler, F., He, Z., Belgioioso, G., Bolognani, S., Lygeros, J., Muehlebach, M. IEEE Control Systems Letters, 2024 (Published) URL BibTeX

Empirical Inference Article Use the 4S (Signal-Safe Speckle Subtraction): Explainable Machine Learning reveals the Giant Exoplanet AF Lep b in High-Contrast Imaging Data from 2011 Bonse, M. J., Gebhard, T. D., Dannert, F. A., Absil, O., Cantalloube, F., Christiaens, V., Cugno, G., Garvin, E. O., Hayoz, J., Kasper, M., Matthews, E., Schölkopf, B., Quanz, S. P. The Astronomical Journal, 2024 (Accepted) arXiv BibTeX

Physical Intelligence Article Wireless flow-powered miniature robot capable of traversing tubular structures Hong, C., Wu, Y., Wang, C., Ren, Z., Wang, C., Liu, Z., Hu, W., Sitti, M. Science Robotics, 9(88):eadi5155, 2024 (Published)
Wireless millimeter-scale robots capable of navigating through fluid-flowing tubular structures hold substantial potential for inspection, maintenance, or repair use in nuclear, industrial, and medical applications. However, prevalent reliance on external powering constrains these robots’ operational range and applicable environments. Alternatives with onboard powering must trade off size, functionality, and operation duration. Here, we propose a wireless millimeter-scale wheeled robot capable of using environmental flows to power and actuate its long-distance locomotion through complex pipelines. The flow-powering module can convert flow energy into mechanical energy, achieving an impeller speed of up to 9595 revolutions per minute, accompanied by an output power density of 11.7 watts per cubic meter and an efficiency of 33.7%. A miniature gearbox module can further transmit the converted mechanical energy into the robot’s locomotion system, allowing the robot to move against water flow at an average rate of up to 1.05 meters per second. The robot’s motion status (moving against/with flow or pausing) can be switched using an external magnetic field or an onboard mechanical regulator, contingent on different proposed control designs. In addition, we designed kirigami-based soft wheels for adaptive locomotion. The robot can move against flows of various substances within pipes featuring complex geometries and diverse materials. Solely powered by flow, the robot can transport cylindrical payloads with a diameter of up to 55% of the pipe’s diameter and carry devices such as an endoscopic camera for pipeline inspection, a wireless temperature sensor for environmental temperature monitoring, and a leak-stopper shell for infrastructure maintenance.
DOI URL BibTeX

Perceiving Systems Article InterCap: Joint Markerless 3D Tracking of Humans and Objects in Interaction from Multi-view RGB-D Images Huang, Y., Taheri, O., Black, M. J., Tzionas, D. International Journal of Computer Vision (IJCV), 132(7):2551-2566, 2024 (Published)
Humans constantly interact with objects to accomplish tasks. To understand such interactions, computers need to reconstruct these in 3D from images of whole bodies manipulating objects, e.g., for grasping, moving and using the latter. This involves key challenges, such as occlusion between the body and objects, motion blur, depth ambiguities, and the low image resolution of hands and graspable object parts. To make the problem tractable, the community has followed a divide-and-conquer approach, focusing either only on interacting hands, ignoring the body, or on interacting bodies, ignoring the hands. However, these are only parts of the problem. On the contrary, recent work focuses on the whole problem. The GRAB dataset addresses whole-body interaction with dexterous hands but captures motion via markers and lacks video, while the BEHAVE dataset captures video of body-object interaction but lacks hand detail. We address the limitations of prior work with InterCap, a novel method that reconstructs interacting whole-bodies and objects from multi-view RGB-D data, using the parametric whole-body SMPL-X model and known object meshes. To tackle the above challenges, InterCap uses two key observations: (i) Contact between the body and object can be used to improve the pose estimation of both. (ii) Consumer-level Azure Kinect cameras let us set up a simple and flexible multi-view RGB-D system for reducing occlusions, with spatially calibrated and temporally synchronized cameras. With our InterCap method we capture the InterCap dataset, which contains 10 subjects (5 males and 5 females) interacting with 10 daily objects of various sizes and affordances, including contact with the hands or feet. To this end, we introduce a new data-driven hand motion prior, as well as explore simple ways for automatic contact detection based on 2D and 3D cues. 
In total, InterCap has 223 RGB-D videos, resulting in 67,357 multi-view frames, each containing 6 RGB-D images, paired with pseudo ground-truth 3D body and object meshes. Our InterCap method and dataset fill an important gap in the literature and support many research directions. Data and code are available at https://intercap.is.tue.mpg.de.
Paper DOI URL BibTeX

Perceiving Systems Conference Paper 3D Neural Edge Reconstruction Li, L., Peng, S., Yu, Z., Liu, S., Pautrat, R., Yin, X., Pollefeys, M. In Proceedings 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 21219-21229, 10.1109/CVPR52733.2024.02005, 2024 (Published) DOI URL BibTeX

Physics for Inference and Optimization Conference Paper A causality-inspired adjusted plus-minus model for player evaluation in team sports De Bacco, C., Wang, Y., Blei, D. M. In Proceedings of Machine Learning Research (PMLR), Proceedings Third Conference on Causal Learning and Reasoning, 236:769-792, Third Conference on Causal Learning and Reasoning, 2024 (Published) URL BibTeX

Empirical Inference Article A temperate super-Jupiter imaged with JWST in the mid-infrared Matthews, E. C., Carter, A. L., Pathak, P., Morley, C. V., Phillips, M. W., S. Krishanth, P. M., Feng, F., Bonse, M. J., Boogaard, L. A., Burt, J. A., Crossfield, I. J. M., Douglas, E. S., Henning, T., Hom, J., Ko, C. -., Kasper, M., Lagrange, A., Petit Dit de la Roche, D., Philipot, F. Nature, 633:789–792, 2024 (Published)
Of the approximately 25 directly imaged planets to date, all are younger than 500 Myr, and all but six are younger than 100 Myr (ref. 1). Eps Ind A (HD209100, HIP108870) is a K5V star of roughly solar age (recently derived as 3.7–5.7 Gyr (ref. 2) and  Gyr (ref. 3)). A long-term radial-velocity trend (refs. 4,5) and an astrometric acceleration (refs. 6,7) led to claims of a giant planet (refs. 2,8,9) orbiting the nearby star (3.6384 ± 0.0013 pc; ref. 10). Here we report JWST coronagraphic images which reveal a giant exoplanet that is consistent with these radial and astrometric measurements but inconsistent with the previously claimed planet properties. The new planet has a temperature of approximately 275 K and is remarkably bright at 10.65 and 15.50 µm. Non-detections between 3.5 and 5.0 µm indicate an unknown opacity source in the atmosphere, possibly suggesting a high-metallicity, high carbon-to-oxygen ratio planet. The best-fitting temperature of the planet is consistent with theoretical thermal evolution models, which were previously untested at this temperature range. The data indicate that this is probably the only giant planet in the system, and therefore we refer to it as b, despite it having significantly different orbital properties than the previously claimed planet b.
DOI URL BibTeX

Safety- and Efficiency-aligned Learning Technical Report AI Risk Management Should Incorporate Both Safety and Security Qi, X., Huang, Y., Zeng, Y., Debenedetti, E., Geiping, J., He, L., Huang, K., Madhushani, U., Sehwag, V., Shi, W., Wei, B., Xie, T., Chen, D., Chen, P., Ding, J., Jia, R., Ma, J., Narayanan, A., Su, W. J., Wang, M., et al. 2024 BibTeX

Perceiving Systems Article Accelerated Video Annotation Driven by Deep Detector and Tracker Price, E., Ahmad, A. INTELLIGENT AUTONOMOUS SYSTEMS 18, 2:141–153, IAS, 2024 (Published)
Annotating object ground truth in videos is vital for several downstream tasks in robot perception and machine learning, such as for evaluating the performance of an object tracker or training an image-based object detector. The accuracy of the annotated instances of the moving objects on every image frame in a video is crucially important. Achieving that through manual annotations is not only very time consuming and labor intensive, but is also prone to high error rate. State-of-the-art annotation methods depend on manually initializing the object bounding boxes only in the first frame and then use classical tracking methods, e.g., adaboost, or kernelized correlation filters, to keep track of those bounding boxes. These can quickly drift, thereby requiring tedious manual supervision. In this paper, we propose a new annotation method which leverages a combination of a learning-based detector (SSD) and a learning-based tracker (RE). Through this, we significantly reduce annotation drifts, and, consequently, the required manual supervision. We validate our approach through annotation experiments using our proposed annotation method and existing baselines on a set of drone video frames. Source code and detailed information on how to run the annotation program can be found at https://github.com/robot-perception-group/smarter-labelme
project DOI URL BibTeX

Empirical Inference Miscellaneous Analyzing Human Questioning Behavior and Causal Curiosity through Natural Queries Ceraolo, R., Kharlapenko, D., Khan, A., Reymond, A., Mihalcea, R., Sachan, M., Schölkopf, B., Jin, Z. 2024 (Published) URL BibTeX

Safety- and Efficiency-aligned Learning Conference Paper Be like a Goldfish, Don’t Memorize! Mitigating Memorization in Generative LLMs Hans, A., Wen, Y., Jain, N., Kirchenbauer, J., Kazemi, H., Singhania, P., Singh, S., Somepalli, G., Geiping, J., Bhatele, A., Goldstein, T. In Proceedings of the Thirty-Eighth Annual Conference on Neural Information Processing Systems, Thirty-Eighth Annual Conference on Neural Information Processing Systems, 2024 (Published) URL BibTeX