Publications

DEPARTMENTS

Empirical Inference

Haptic Intelligence

Modern Magnetic Systems

Perceiving Systems

Physical Intelligence

Robotic Materials

Social Foundations of Computation


Research Groups

Autonomous Vision

Autonomous Learning

Bioinspired Autonomous Miniature Robots

Dynamic Locomotion

Embodied Vision

Human Aspects of Machine Learning

Intelligent Control Systems

Learning and Dynamical Systems

Locomotion in Biorobotic and Somatic Systems

Micro, Nano, and Molecular Systems

Movement Generation and Control

Neural Capture and Synthesis

Physics for Inference and Optimization

Organizational Leadership and Diversity

Probabilistic Learning Group



Physical Intelligence Article Janus microparticles-based targeted and spatially-controlled piezoelectric neural stimulation via low-intensity focused ultrasound Han, M., Yildiz, E., Bozuyuk, U., Aydin, A., Yu, Y., Bhargava, A., Karaz, S., Sitti, M. Nature Communications, 15(1):2013, 2024
Electrical stimulation is a fundamental tool in studying neural circuits, treating neurological diseases, and advancing regenerative medicine. Injectable, free-standing piezoelectric particle systems have emerged as non-genetic and wireless alternatives for electrode-based tethered stimulation systems. However, achieving cell-specific and high-frequency piezoelectric neural stimulation remains challenging due to high-intensity thresholds, non-specific diffusion, and internalization of particles. Here, we develop cell-sized 20 μm-diameter silica-based piezoelectric magnetic Janus microparticles (PEMPs), enabling clinically-relevant high-frequency neural stimulation of primary neurons under low-intensity focused ultrasound. Owing to its functionally anisotropic design, half of the PEMP acts as a piezoelectric electrode via conjugated barium titanate nanoparticles to induce electrical stimulation, while the nickel-gold nanofilm-coated magnetic half provides spatial and orientational control on neural stimulation via external uniform rotating magnetic fields. Furthermore, surface functionalization with targeting antibodies enables cell-specific binding/targeting and stimulation of dopaminergic neurons. Taking advantage of such functionalities, the PEMP design offers unique features towards wireless neural stimulation for minimally invasive treatment of neurological diseases.
DOI BibTeX

Physical Intelligence Article Learning Soft Millirobot Multimodal Locomotion with Sim-to-Real Transfer Demir, S. O., Tiryaki, M. E., Karacakol, A. C., Sitti, M. Advanced Science, 2024 (Published)
With wireless multimodal locomotion capabilities, magnetic soft millirobots have emerged as potential minimally invasive medical robotic platforms. Due to their diverse shape programming capability, they can generate various locomotion modes, and their locomotion can be adapted to different environments by controlling the external magnetic field signal. Existing adaptation methods, however, are based on hand-tuned signals. Here, a learning-based adaptive magnetic soft millirobot multimodal locomotion framework empowered by sim-to-real transfer is presented. Developing a data-driven magnetic soft millirobot simulation environment, the periodic magnetic actuation signal is learned for a given soft millirobot in simulation. Then, the learned locomotion strategy is deployed to the real world using Bayesian optimization and Gaussian processes. Finally, automated domain recognition and locomotion adaptation for unknown environments using a Kullback-Leibler divergence-based probabilistic method are illustrated. This method can enable soft millirobot locomotion to quickly and continuously adapt to environmental changes and explore the actuation space for unanticipated solutions with minimum experimental cost.
DOI BibTeX
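The abstract above mentions domain recognition based on the Kullback-Leibler divergence. As a rough illustration only, the sketch below compares an observed Gaussian summary of robot behaviour against Gaussian summaries of known environments and picks the closest one. The function names, the univariate Gaussian assumption, and all numbers are hypothetical and are not taken from the paper.

```python
import math


def kl_gaussian(mu_p, sigma_p, mu_q, sigma_q):
    """Closed-form KL(P || Q) for univariate Gaussians P and Q."""
    return (
        math.log(sigma_q / sigma_p)
        + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2.0 * sigma_q ** 2)
        - 0.5
    )


def recognize_domain(observed, known_domains):
    """Return the name of the known domain with the smallest KL divergence
    from the observed (mean, std) pair."""
    mu_o, sigma_o = observed
    return min(
        known_domains,
        key=lambda name: kl_gaussian(mu_o, sigma_o, *known_domains[name]),
    )


# Hypothetical (mean, std) of stride length per environment, in mm.
domains = {"dry": (2.0, 0.3), "wet": (1.2, 0.4), "viscous": (0.5, 0.2)}
best = recognize_domain((1.1, 0.35), domains)
print(best)
```

If no known domain is close enough under some threshold, a system of this kind could treat the environment as new and trigger re-adaptation; that thresholding step is left out here.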

Embodied Vision Technical Report Learning a Terrain- and Robot-Aware Dynamics Model for Autonomous Mobile Robot Navigation Achterhold, J., Guttikonda, S., Kreber, J. U., Li, H., Stueckler, J. CoRR abs/2409.11452, 2024, Preprint submitted to Robotics and Autonomous Systems Journal. https://arxiv.org/abs/2409.11452 (Submitted)
Mobile robots should be capable of planning cost-efficient paths for autonomous navigation. Typically, the terrain and robot properties are subject to variations. For instance, properties of the terrain such as friction may vary across different locations. Also, properties of the robot may change such as payloads or wear and tear, e.g., causing changing actuator gains or joint friction. Autonomous navigation approaches should thus be able to adapt to such variations. In this article, we propose a novel approach for learning a probabilistic, terrain- and robot-aware forward dynamics model (TRADYN) which can adapt to such variations and demonstrate its use for navigation. Our learning approach extends recent advances in meta-learning forward dynamics models based on Neural Processes for mobile robot navigation. We evaluate our method in simulation for 2D navigation of a robot with uni-cycle dynamics with varying properties on terrain with spatially varying friction coefficients. In our experiments, we demonstrate that TRADYN has lower prediction error over long time horizons than model ablations which do not adapt to robot or terrain variations. We also evaluate our model for navigation planning in a model-predictive control framework and under various sources of noise. We demonstrate that our approach yields improved performance in planning control-efficient paths by taking robot and terrain properties into account.
BibTeX
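For readers unfamiliar with the unicycle dynamics used in the simulated evaluation above, here is a minimal kinematic sketch. The `traction` factor that scales forward speed by location is a hypothetical stand-in for spatially varying terrain friction; it is not the learned TRADYN model, which is probabilistic and built on Neural Processes.

```python
import math


def unicycle_step(state, v, omega, dt, traction=1.0):
    """One Euler step of unicycle kinematics.

    state is (x, y, theta); v is commanded forward speed, omega the turn rate.
    traction is a hypothetical factor in [0, 1] that scales the effective speed.
    """
    x, y, theta = state
    v_eff = traction * v
    return (
        x + v_eff * math.cos(theta) * dt,
        y + v_eff * math.sin(theta) * dt,
        theta + omega * dt,
    )


def rollout(state, controls, dt, traction_at):
    """Apply a sequence of (v, omega) controls; traction_at(x, y) gives the
    terrain factor at the robot's current position."""
    path = [state]
    for v, omega in controls:
        state = unicycle_step(state, v, omega, dt, traction_at(state[0], state[1]))
        path.append(state)
    return path


# Hypothetical terrain: full traction for x < 1, half traction beyond.
path = rollout(
    (0.0, 0.0, 0.0),
    [(1.0, 0.0)] * 4,
    dt=0.5,
    traction_at=lambda x, y: 1.0 if x < 1.0 else 0.5,
)
print(path[-1])
```

A planner that ignores the terrain factor would overestimate how far the robot travels once it enters the low-traction region, which is the kind of mismatch a terrain-aware dynamics model is meant to avoid.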

Autonomous Learning Article Machine learning of a density functional for anisotropic patchy particles Simon, A., Weimar, J., Martius, G., Oettel, M. Journal of Chemical Theory and Computation, 2024 (Accepted)
Anisotropic patchy particles have become an archetypical statistical model system for associating fluids. Here we formulate an approach to the Kern-Frenkel model via classical density functional theory to describe the positionally and orientationally resolved equilibrium density distributions in flat wall geometries. The density functional is split into a reference part for the orientationally averaged density and an orientational part in mean-field approximation. To bring the orientational part into a kernel form suitable for machine learning techniques, an expansion into orientational invariants and the proper incorporation of single-particle symmetries is formulated. The mean-field kernel is constructed via machine learning on the basis of hard wall simulation data. Results are compared to the well-known random-phase approximation which strongly underestimates the orientational correlations close to the wall. Successes and shortcomings of the mean-field treatment of the orientational part are highlighted and perspectives are given for attaining a full density functional via machine learning.
DOI URL BibTeX

Physical Intelligence Medical Systems Article Nanodiamond-Enhanced Magnetic Resonance Imaging Jelena Lazovic, E. G. A. W. P. S. A. S. J. L. G. W. M. S. Advanced Materials, 36(11):2310109, 2024 DOI BibTeX

Perceiving Systems Ph.D. Thesis Natural Language Control for 3D Human Motion Synthesis Petrovich, M. LIGM, Ecole des Ponts, Univ Gustave Eiffel, CNRS, 2024 (Published)
3D human motions are at the core of many applications in the film industry, healthcare, augmented reality, virtual reality and video games. However, these applications often rely on expensive and time-consuming motion capture data. The goal of this thesis is to explore generative models as an alternative route to obtain 3D human motions. More specifically, our aim is to allow a natural language interface as a means to control the generation process. To this end, we develop a series of models that synthesize realistic and diverse motions following the semantic inputs. In our first contribution, described in Chapter 3, we address the challenge of generating human motion sequences conditioned on specific action categories. We introduce ACTOR, a conditional variational autoencoder (VAE) that learns an action-aware latent representation for human motions. We show significant gains over existing methods thanks to our new Transformer-based VAE formulation, encoding and decoding SMPL pose sequences through a single motion-level embedding. In our second contribution, described in Chapter 4, we go beyond categorical actions, and dive into the task of synthesizing diverse 3D human motions from textual descriptions allowing a larger vocabulary and potentially more fine-grained control. Our work stands out from previous research by not deterministically generating a single motion sequence, but by synthesizing multiple, varied sequences from a given text. We propose TEMOS, building on our VAE-based ACTOR architecture, but this time integrating a pretrained text encoder to handle large-vocabulary natural language inputs. In our third contribution, described in Chapter 5, we address the adjacent task of text-to-3D human motion retrieval, where the goal is to search in a motion collection by querying via text. 
We introduce a simple yet effective approach, named TMR, building on our earlier model TEMOS, by integrating a contrastive loss to enhance the structure of the cross-modal latent space. Our findings emphasize the importance of retaining the motion generation loss in conjunction with contrastive training for improved results. We establish a new evaluation benchmark and conduct analyses on several protocols. In our fourth contribution, described in Chapter 6, we introduce a new problem termed as “multi-track timeline control” for text-driven 3D human motion synthesis. Instead of a single textual prompt, users can organize multiple prompts in temporal intervals that may overlap. We introduce STMC, a test-time denoising method that can be integrated with any pre-trained motion diffusion model. Our evaluations demonstrate that our method generates motions that closely match the semantic and temporal aspects of the input timelines. In summary, our contributions in this thesis are as follows: (i) we develop a generative variational autoencoder, ACTOR, for action-conditioned generation of human motion sequences, (ii) we introduce TEMOS, a text-conditioned generative model that synthesizes diverse human motions from textual descriptions, (iii) we present TMR, a new approach for text-to-3D human motion retrieval, (iv) we propose STMC, a method for timeline control in text-driven motion synthesis, enabling the generation of detailed and complex motions.
pdf YouTube Thesis BibTeX
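The thesis abstract above describes adding a contrastive loss to structure a cross-modal text and motion latent space. The sketch below shows a generic symmetric InfoNCE-style contrastive term in plain Python, assuming that row i of the text embeddings is paired with row i of the motion embeddings. The function name, the temperature value, and the toy embeddings are assumptions for illustration; this is not necessarily the exact loss used in TMR, which the abstract says is combined with a motion generation loss.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _normalize(v):
    norm = math.sqrt(_dot(v, v))
    return [x / norm for x in v]


def _cross_entropy(logits, target):
    """Negative log-softmax of logits at index target (numerically stable)."""
    peak = max(logits)
    log_z = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_z - logits[target]


def info_nce(text_emb, motion_emb, temperature=0.1):
    """Symmetric contrastive loss over a batch of paired embeddings."""
    t = [_normalize(v) for v in text_emb]
    m = [_normalize(v) for v in motion_emb]
    n = len(t)
    # Cosine similarities scaled by the temperature.
    sim = [[_dot(t[i], m[j]) / temperature for j in range(n)] for i in range(n)]
    text_to_motion = sum(_cross_entropy(sim[i], i) for i in range(n)) / n
    motion_to_text = sum(
        _cross_entropy([sim[i][j] for i in range(n)], j) for j in range(n)
    ) / n
    return 0.5 * (text_to_motion + motion_to_text)


# Toy 2-D embeddings: matched pairs give a low loss, swapped pairs a high one.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```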

Learning and Dynamical Systems Conference Paper Online learning under adversarial nonlinear constraints Kolev, P., Martius, G., Muehlebach, M. In Advances in Neural Information Processing Systems 36, Advances in Neural Information Processing Systems, 2024 (Published) URL BibTeX

Empirical Inference Article Optimal Decision Making Under Strategic Behavior Tsirtsis, S., Tabibian, B., Khajehnejad, M., Singla, A., Schölkopf, B., Gomez-Rodriguez, M. Management Science, 2024, Published Online (In press) DOI BibTeX

Empirical Inference Article Parameterizing pressure-temperature profiles of exoplanet atmospheres with neural networks Gebhard, T. D., Angerhausen, D., Konrad, B. S., Alei, E., Quanz, S. P., Schölkopf, B. Astronomy & Astrophysics, 681, 2024 (Published) DOI BibTeX

Embodied Vision Conference Paper Physically Plausible Object Pose Refinement in Cluttered Scenes Strecke, M., Stueckler, J. In Proceedings of the German Conference on Pattern Recognition (GCPR), 2024, to appear (To be published) code preprint (submitted version) BibTeX

Embodied Vision Conference Paper Physics-Based Rigid Body Object Tracking and Friction Filtering From RGB-D Videos Kandukuri, R. K., Strecke, M., Stueckler, J. In Proceedings of the International Conference on 3D Vision (3DV), 2024 (Published)
Physics-based understanding of object interactions from sensory observations is an essential capability in augmented reality and robotics. It enables capturing the properties of a scene for simulation and control. In this paper, we propose a novel approach for real-to-sim which tracks rigid objects in 3D from RGB-D images and infers physical properties of the objects. We use a differentiable physics simulation as state-transition model in an Extended Kalman Filter which can model contact and friction for arbitrary mesh-based shapes and in this way estimate physically plausible trajectories. We demonstrate that our approach can filter position, orientation, velocities, and concurrently can estimate the coefficient of friction of the objects. We analyze our approach on various sliding scenarios in synthetic image sequences of single objects and colliding objects. We also demonstrate and evaluate our approach on a real-world dataset. We make our novel benchmark datasets publicly available to foster future research in this novel problem setting and comparison with our method.
preprint supplemental video dataset DOI URL BibTeX

Physical Intelligence Article Roadmap for Clinical Translation of Mobile Microrobotics Bozuyuk, U., Wrede, P., Yildiz, E., Sitti, M. Advanced Materials, 2311462, 2024
Medical microrobotics is an emerging field to revolutionize clinical applications in diagnostics and therapeutics of various diseases. On the other hand, the mobile microrobotics field has important obstacles to pass before clinical translation. This article focuses on these challenges and provides a roadmap of medical microrobots to enable their clinical use. From the concept of a “magic bullet” to the physicochemical interactions of microrobots in complex biological environments in medical applications, there are several translational steps to consider. Clinical translation of mobile microrobots is only possible with a close collaboration between clinical experts and microrobotics researchers to address the technical challenges in microfabrication, safety, and imaging. The clinical application potential can be materialized by designing microrobots that can solve the current main challenges, such as actuation limitations, material stability, and imaging constraints. The strengths and weaknesses of the current progress in the microrobotics field are discussed and a roadmap for their clinical applications in the near future is outlined.
DOI BibTeX

Physical Intelligence Article Single-step precision programming of decoupled multiresponsive soft millirobots Zheng, Z., Han, J., Shi, Q., Demir, S. O., Jiang, W., Sitti, M. PNAS, 121, 2024 (Published)
Stimuli-responsive soft robots offer new capabilities for the fields of medical and rehabilitation robotics, artificial intelligence, and soft electronics. Precisely programming the shape morphing and decoupling the multiresponsiveness of such robots is crucial to enable them with ample degrees of freedom and multifunctionality, while ensuring high fabrication accuracy. However, current designs featuring coupled multiresponsiveness or intricate assembly processes face limitations in executing complex transformations and suffer from a lack of precision. Therefore, we propose a one-stepped strategy to program multistep shape-morphing soft millirobots (MSSMs) in response to decoupled environmental stimuli. Our approach involves employing a multilayered elastomer and laser scanning technology to selectively process the structure of MSSMs, achieving a minimum machining precision of 30 μm. The resulting MSSMs are capable of imitating the shape morphing of plants and hand gestures and resemble kirigami, pop-up, and bistable structures. The decoupled multistimuli responsiveness of the MSSMs allows them to conduct shape morphing during locomotion, perform logic circuit control, and remotely repair circuits in response to humidity, temperature, and magnetic field. This strategy presents a paradigm for the effective design and fabrication of untethered soft miniature robots with physical intelligence, advancing the decoupled multiresponsive materials through modular tailoring of robotic body structures and properties to suit specific applications.
DOI URL BibTeX

Modern Magnetic Systems Article Size-dependent bistability of magnetic states in soft magnetic cap arrays Sam, S. A., Seyd, J., Ullrich, A., Jung, F., Groß, F., Krupiński, M., Albrecht, M., Thomas, S. Nanotechnology, 35(22), IOP Pub., Bristol, UK, 2024 DOI BibTeX

Modern Magnetic Systems Article Small-pore hydridic frameworks store densely packed hydrogen Oh, H., Tumanov, N., Ban, V., Li, X., Richter, B., Hudson, M. R., Brown, C. M., Iles, G. N., Wallacher, D., Jorgensen, S. W., Daemen, L., Balderas-Xicohténcatl, R., Cheng, Y., Ramirez-Cuesta, A. J., Heere, M., Posada-Pérez, S., Hautier, G., Hirscher, M., Jensen, T. R., Filinchuk, Y. Nature Chemistry, 16(5):809-816, Nature Publishing Group, London, UK, 2024 DOI BibTeX

Materials Article Soft Sub-Structured Multi-Material Biosensor Hydrogels with Enzymes Retained by Plant Viral Scaffolds Grübel, J., Wendlandt, T., Urban, D., Jauch, C. O., Wege, C., Tovar, G. E. M., Southan, A. Macromolecular Bioscience, 24(3), Wiley, 2024 (Published) pdf DOI URL BibTeX

Learning and Dynamical Systems Article Towards a systems theory of algorithms Dörfler, F., He, Z., Belgioioso, G., Bolognani, S., Lygeros, J., Muehlebach, M. IEEE Control Systems Letters, 2024 (Published) URL BibTeX

Empirical Inference Article Use the 4S (Signal-Safe Speckle Subtraction): Explainable Machine Learning reveals the Giant Exoplanet AF Lep b in High-Contrast Imaging Data from 2011 Bonse, M. J., Gebhard, T. D., Dannert, F. A., Absil, O., Cantalloube, F., Christiaens, V., Cugno, G., Garvin, E. O., Hayoz, J., Kasper, M., Matthews, E., Schölkopf, B., Quanz, S. P. The Astronomical Journal, 2024 (Accepted) arXiv BibTeX

Physical Intelligence Article Wireless flow-powered miniature robot capable of traversing tubular structures Hong, C., Wu, Y., Wang, C., Ren, Z., Wang, C., Liu, Z., Hu, W., Sitti, M. Science Robotics, 9(88):eadi5155, 2024 (Published)
Wireless millimeter-scale robots capable of navigating through fluid-flowing tubular structures hold substantial potential for inspection, maintenance, or repair use in nuclear, industrial, and medical applications. However, prevalent reliance on external powering constrains these robots’ operational range and applicable environments. Alternatives with onboard powering must trade off size, functionality, and operation duration. Here, we propose a wireless millimeter-scale wheeled robot capable of using environmental flows to power and actuate its long-distance locomotion through complex pipelines. The flow-powering module can convert flow energy into mechanical energy, achieving an impeller speed of up to 9595 revolutions per minute, accompanied by an output power density of 11.7 watts per cubic meter and an efficiency of 33.7%. A miniature gearbox module can further transmit the converted mechanical energy into the robot’s locomotion system, allowing the robot to move against water flow at an average rate of up to 1.05 meters per second. The robot’s motion status (moving against/with flow or pausing) can be switched using an external magnetic field or an onboard mechanical regulator, contingent on different proposed control designs. In addition, we designed kirigami-based soft wheels for adaptive locomotion. The robot can move against flows of various substances within pipes featuring complex geometries and diverse materials. Solely powered by flow, the robot can transport cylindrical payloads with a diameter of up to 55% of the pipe’s diameter and carry devices such as an endoscopic camera for pipeline inspection, a wireless temperature sensor for environmental temperature monitoring, and a leak-stopper shell for infrastructure maintenance.
DOI URL BibTeX

Perceiving Systems Article InterCap: Joint Markerless 3D Tracking of Humans and Objects in Interaction from Multi-view RGB-D Images Huang, Y., Taheri, O., Black, M. J., Tzionas, D. International Journal of Computer Vision (IJCV), 132(7):2551-2566, 2024 (Published)
Humans constantly interact with objects to accomplish tasks. To understand such interactions, computers need to reconstruct these in 3D from images of whole bodies manipulating objects, e.g., for grasping, moving and using the latter. This involves key challenges, such as occlusion between the body and objects, motion blur, depth ambiguities, and the low image resolution of hands and graspable object parts. To make the problem tractable, the community has followed a divide-and-conquer approach, focusing either only on interacting hands, ignoring the body, or on interacting bodies, ignoring the hands. However, these are only parts of the problem. On the contrary, recent work focuses on the whole problem. The GRAB dataset addresses whole-body interaction with dexterous hands but captures motion via markers and lacks video, while the BEHAVE dataset captures video of body-object interaction but lacks hand detail. We address the limitations of prior work with InterCap, a novel method that reconstructs interacting whole-bodies and objects from multi-view RGB-D data, using the parametric whole-body SMPL-X model and known object meshes. To tackle the above challenges, InterCap uses two key observations: (i) Contact between the body and object can be used to improve the pose estimation of both. (ii) Consumer-level Azure Kinect cameras let us set up a simple and flexible multi-view RGB-D system for reducing occlusions, with spatially calibrated and temporally synchronized cameras. With our InterCap method we capture the InterCap dataset, which contains 10 subjects (5 males and 5 females) interacting with 10 daily objects of various sizes and affordances, including contact with the hands or feet. To this end, we introduce a new data-driven hand motion prior, as well as explore simple ways for automatic contact detection based on 2D and 3D cues. 
In total, InterCap has 223 RGB-D videos, resulting in 67,357 multi-view frames, each containing 6 RGB-D images, paired with pseudo ground-truth 3D body and object meshes. Our InterCap method and dataset fill an important gap in the literature and support many research directions. Data and code are available at https://intercap.is.tue.mpg.de.
Paper DOI URL BibTeX

Perceiving Systems Conference Paper 3D Neural Edge Reconstruction Li, L., Peng, S., Yu, Z., Liu, S., Pautrat, R., Yin, X., Pollefeys, M. In Proceedings 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 21219-21229, 10.1109/CVPR52733.2024.02005, 2024 (Published) DOI URL BibTeX

Physics for Inference and Optimization Conference Paper A causality-inspired adjusted plus-minus model for player evaluation in team sports De Bacco, C., Wang, Y., Blei, D. M. In Proceedings of Machine Learning Research (PMLR), Proceedings Third Conference on Causal Learning and Reasoning, 236:769-792, Third Conference on Causal Learning and Reasoning, 2024 (Published) URL BibTeX

Empirical Inference Article A temperate super-Jupiter imaged with JWST in the mid-infrared Matthews, E. C., Carter, A. L., Pathak, P., Morley, C. V., Phillips, M. W., S. Krishanth, P. M., Feng, F., Bonse, M. J., Boogaard, L. A., Burt, J. A., Crossfield, I. J. M., Douglas, E. S., Henning, T., Hom, J., Ko, C. -., Kasper, M., Lagrange, A., Petit Dit de la Roche, D., Philipot, F. Nature, 633:789–792, 2024 (Published)
Of the approximately 25 directly imaged planets to date, all are younger than 500 Myr, and all but six are younger than 100 Myr (ref. 1). Eps Ind A (HD209100, HIP108870) is a K5V star of roughly solar age (recently derived as 3.7–5.7 Gyr (ref. 2) and  Gyr (ref. 3)). A long-term radial-velocity trend (refs. 4,5) and an astrometric acceleration (refs. 6,7) led to claims of a giant planet (refs. 2,8,9) orbiting the nearby star (3.6384 ± 0.0013 pc; ref. 10). Here we report JWST coronagraphic images which reveal a giant exoplanet that is consistent with these radial and astrometric measurements but inconsistent with the previously claimed planet properties. The new planet has a temperature of approximately 275 K and is remarkably bright at 10.65 and 15.50 µm. Non-detections between 3.5 and 5.0 µm indicate an unknown opacity source in the atmosphere, possibly suggesting a high-metallicity, high carbon-to-oxygen ratio planet. The best-fitting temperature of the planet is consistent with theoretical thermal evolution models, which were previously untested at this temperature range. The data indicate that this is probably the only giant planet in the system, and therefore we refer to it as b, despite it having significantly different orbital properties than the previously claimed planet b.
DOI URL BibTeX

Safety- and Efficiency- aligned Learning Technical Report AI Risk Management Should Incorporate Both Safety and Security Qi, X., Huang, Y., Zeng, Y., Debenedetti, E., Geiping, J., He, L., Huang, K., Madhushani, U., Sehwag, V., Shi, W., Wei, B., Xie, T., Chen, D., Chen, P., Ding, J., Jia, R., Ma, J., Narayanan, A., Su, W. J., Wang, M., et al. 2024 BibTeX

Perceiving Systems Article Accelerated Video Annotation Driven by Deep Detector and Tracker Price, E., Ahmad, A. INTELLIGENT AUTONOMOUS SYSTEMS 18, 2:141–153, IAS, 2024 (Published)
Annotating object ground truth in videos is vital for several downstream tasks in robot perception and machine learning, such as for evaluating the performance of an object tracker or training an image-based object detector. The accuracy of the annotated instances of the moving objects on every image frame in a video is crucially important. Achieving that through manual annotations is not only very time consuming and labor intensive, but is also prone to a high error rate. State-of-the-art annotation methods depend on manually initializing the object bounding boxes only in the first frame and then use classical tracking methods, e.g., AdaBoost, or kernelized correlation filters, to keep track of those bounding boxes. These can quickly drift, thereby requiring tedious manual supervision. In this paper, we propose a new annotation method which leverages a combination of a learning-based detector (SSD) and a learning-based tracker (RE). Through this, we significantly reduce annotation drift, and, consequently, the required manual supervision. We validate our approach through annotation experiments using our proposed annotation method and existing baselines on a set of drone video frames. Source code and detailed information on how to run the annotation program can be found at https://github.com/robot-perception-group/smarter-labelme
project DOI URL BibTeX

Empirical Inference Miscellaneous Analyzing Human Questioning Behavior and Causal Curiosity through Natural Queries Ceraolo, R., Kharlapenko, D., Khan, A., Reymond, A., Mihalcea, R., Sachan, M., Schölkopf, B., Jin, Z. 2024 (Published) URL BibTeX

Safety- and Efficiency- aligned Learning Conference Paper Be like a Goldfish, Don’t Memorize! Mitigating Memorization in Generative LLMs Hans, A., Wen, Y., Jain, N., Kirchenbauer, J., Kazemi, H., Singhania, P., Singh, S., Somepalli, G., Geiping, J., Bhatele, A., Goldstein, T. In Proceedings of the Thirty-Eighth Annual Conference on Neural Information Processing Systems, Thirty-Eighth Annual Conference on Neural Information Processing Systems, 2024 (Published) URL BibTeX

Learning and Dynamical Systems Article Bi-level Motion Imitation for Humanoid Robots Zhao, W., Zhao, Y., Pajarinen, J., Muehlebach, M. Conference on Robot Learning, 2024 (Published) BibTeX

Safety- and Efficiency- aligned Learning Conference Paper Bring Your Own Data! Self-Sensitivity Evaluation for Large Language Models Jain, N., Saifullah, K., Wen, Y., Kirchenbauer, J., Shu, M., Saha, A., Goldblum, M., Geiping, J., Goldstein, T. In Proceedings of the First Conference on Language Modeling, First Conference on Language Modeling, 2024 (Published) URL BibTeX

Safety- and Efficiency- aligned Learning Conference Paper CALVIN: Improved Contextual Video Captioning via Instruction Tuning Somepalli, G., Chowdhury, A., Geiping, J., Basri, R., Goldstein, T., Jacobs, D. W. In Proceedings of the Thirty-Eighth Annual Conference on Neural Information Processing Systems, Thirty-Eighth Annual Conference on Neural Information Processing Systems, 2024 (Published) URL BibTeX

Human Aspects of Machine Learning Conference Paper Causal Adversarial Perturbations for Individual Fairness and Robustness in Heterogeneous Data Spaces. Ehyaei, A., Mohammadi, K., Karimi, A., Samadi, S., Farnadi, G. In Proceedings of the AAAI Conference on Artificial Intelligence, 2024 (Published) BibTeX

Safety- and Efficiency- aligned Learning Technical Report Coercing LLMs to do and reveal (almost) anything Geiping, J., Stein, A., Shu, M., Saifullah, K., Wen, Y., Goldstein, T. 2024 (Submitted) URL BibTeX

Empirical Inference Article Connectome-constrained networks predict neural activity across the fly visual system Lappalainen, J. K., Tschopp, F. D., Prakhya, S., McGill, M., Nern, A., Shinomiya, K., Takemura, S., Gruntman, E., Macke, J. H., Turaga, S. C. Nature, 634:1132–1140, 2024 (Published)
We can now measure the connectivity of every neuron in a neural circuit, but we cannot measure other biological details, including the dynamical characteristics of each neuron. The degree to which measurements of connectivity alone can inform the understanding of neural computation is an open question (ref. 10). Here we show that with experimental measurements of only the connectivity of a biological neural network, we can predict the neural activity underlying a specified neural computation. We constructed a model neural network with the experimentally determined connectivity for 64 cell types in the motion pathways of the fruit fly optic lobe but with unknown parameters for the single-neuron and single-synapse properties. We then optimized the values of these unknown parameters using techniques from deep learning, to allow the model network to detect visual motion. Our mechanistic model makes detailed, experimentally testable predictions for each neuron in the connectome. We found that model predictions agreed with experimental measurements of neural activity across 26 studies. Our work demonstrates a strategy for generating detailed hypotheses about the mechanisms of neural circuit function from connectivity measurements. We show that this strategy is more likely to be successful when neurons are sparsely connected—a universally observed feature of biological neural networks across species and brain regions.
DOI URL BibTeX

Empirical Inference Conference Paper DeViL: Decoding Vision features into Language Dani, M., Rio-Torto, I., Alaniz, S., Akata, Z. In Lecture Notes in Computer Science, vol 14264, 363–377, 45th Annual Conference of the German-Association-for-Pattern-Recognition (DAGM GCPR), 2024 (Published) DOI URL BibTeX

Human Aspects of Machine Learning Conference Paper Designing Ambiguity Sets for Distributionally Robust Optimization Using Structural Causal Optimal Transport Ehyaei, A. R., Farnadi, G., Samadi, S. In Proceedings of the AAAI Conference on Artificial Intelligence, 2024 (Published) BibTeX

Learning and Dynamical Systems Article Event-Based Federated Q-Learning Er, D., Muehlebach, M. Workshop on Foundations of RL and Control, International Conference on Machine Learning, 2024 (Published) BibTeX

Neural Capture and Synthesis Conference Paper FaceTalk: Audio-Driven Motion Diffusion for Neural Parametric Head Models Aneja, S., Thies, J., Dai, A., Niessner, M. In Proceedings 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 21263-21273, IEEE, CVPR, 2024 (Published) DOI URL BibTeX

Safety- and Efficiency- aligned Learning Technical Report Generating Potent Poisons and Backdoors from Scratch with Guided Diffusion Souri, H., Bansal, A., Kazemi, H., Fowl, L., Saha, A., Geiping, J., Wilson, A. G., Chellappa, R., Goldstein, T., Goldblum, M. 2024 (Submitted) URL BibTeX

Rationality Enhancement Article Identifying Resource-Rational Heuristics for Risky Choice Krueger, P., Callaway, F., Gul, S., Griffiths, T., Lieder, F. Psychological Review, 2024 (Published) DOI URL BibTeX