Publications

Perceiving Systems Conference Paper Generative Zoo Niewiadomski, T., Yiannakidis, A., Cuevas-Velasquez, H., Sanyal, S., Black, M. J., Zuffi, S., Kulits, P. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Honolulu, HI, October 2025 (Published)
The model-based estimation of 3D animal pose and shape from images enables computational modeling of animal behavior. Training models for this purpose requires large amounts of labeled image data with precise pose and shape annotations. However, capturing such data requires the use of multi-view or marker-based motion-capture systems, which are impractical to adapt to wild animals in situ and impossible to scale across a comprehensive set of animal species. Some have attempted to address the challenge of procuring training data by pseudo-labeling individual real-world images through manual 2D annotation, followed by 3D-parameter optimization to those labels. While this approach may produce silhouette-aligned samples, the obtained pose and shape parameters are often implausible due to the ill-posed nature of the monocular fitting problem. Sidestepping real-world ambiguity, others have designed complex synthetic-data-generation pipelines leveraging video-game engines and collections of artist-designed 3D assets. Such engines yield perfect ground-truth annotations but are often lacking in visual realism and require considerable manual effort to adapt to new species or environments. Motivated by these shortcomings, we propose an alternative approach to synthetic-data generation: rendering with a conditional image-generation model. We introduce a pipeline that samples a diverse set of poses and shapes for a variety of mammalian quadrupeds and generates realistic images with corresponding ground-truth pose and shape parameters. To demonstrate the scalability of our approach, we introduce GenZoo, a synthetic dataset containing one million images of distinct subjects. We train a 3D pose and shape regressor on GenZoo, which achieves state-of-the-art performance on a real-world multi-species 3D animal pose and shape estimation benchmark, despite being trained solely on synthetic data. We will release our dataset and generation pipeline to support future research.
project page code demo pdf BibTeX

Perceiving Systems Conference Paper ETCH: Generalizing Body Fitting to Clothed Humans via Equivariant Tightness Li, B., Feng, H., Cai, Z., Black, M. J., Xiu, Y. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2025 (Published)
Fitting a body to a 3D clothed human point cloud is a common yet challenging task. Traditional optimization-based approaches use multi-stage pipelines that are sensitive to pose initialization, while recent learning-based methods often struggle with generalization across diverse poses and garment types. We propose Equivariant Tightness Fitting for Clothed Humans, or ETCH, a novel pipeline that estimates cloth-to-body surface mapping through locally approximate SE(3) equivariance, encoding tightness as displacement vectors from the cloth surface to the underlying body. Following this mapping, pose-invariant body features regress sparse body markers, simplifying clothed human fitting into an inner-body marker fitting task. Extensive experiments on CAPE and 4D-Dress show that ETCH significantly outperforms state-of-the-art methods, both tightness-agnostic and tightness-aware, in body fitting accuracy on loose clothing (16.7% ~ 69.5%) and shape accuracy (average 49.9%). Our equivariant tightness design can even reduce directional errors by 67.2% ~ 89.8% in one-shot (or out-of-distribution) settings (~ 1% data). Qualitative results demonstrate strong generalization of ETCH, regardless of challenging poses, unseen shapes, loose clothing, and non-rigid dynamics.
project arXiv code video BibTeX

Perceiving Systems Conference Paper Im2Haircut: Single-view Strand-based Hair Reconstruction for Human Avatars Sklyarova, V., Zakharov, E., Prinzler, M., Becherini, G., Black, M. J., Thies, J. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Honolulu, USA, October 2025 (Accepted)
We present a novel approach for 3D hair reconstruction from single photographs based on a global hair prior combined with local optimization. Capturing strand-based hair geometry from single photographs is challenging due to the variety and geometric complexity of hairstyles and the lack of ground truth training data. Classical reconstruction methods like multi-view stereo only reconstruct the visible hair strands, missing the inner structure of hairstyles and hampering realistic hair simulation. To address this, existing methods leverage hairstyle priors trained on synthetic data. Such data, however, is limited in both quantity and quality since it requires manual work from skilled artists to model the 3D hairstyles and create near-photorealistic renderings. To address this, we propose a novel approach that uses both real and synthetic data to learn an effective hairstyle prior. Specifically, we train a transformer-based prior model on synthetic data to obtain knowledge of the internal hairstyle geometry and introduce real data in the learning process to model the outer structure. This training scheme is able to model the visible hair strands depicted in an input image, while preserving the general 3D structure of hairstyles. We exploit this prior to create a Gaussian-splatting-based reconstruction method that creates hairstyles from one or more images. Qualitative and quantitative comparisons with existing reconstruction pipelines demonstrate the effectiveness and superior performance of our method for capturing detailed hair orientation, overall silhouette, and backside consistency. For additional results and code, please refer to https://im2haircut.is.tue.mpg.de.
arXiv project code BibTeX

Perceiving Systems Conference Paper MagicHOI: Leveraging 3D Priors for Accurate Hand-object Reconstruction from Short Monocular Video Clips Wang, S., He, H., Parelli, M., Gebhardt, C., Fan, Z., Song, J. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2025 (Published)
Most RGB-based hand-object reconstruction methods rely on object templates, while template-free methods typically assume full object visibility. This assumption often breaks in real-world settings, where fixed camera viewpoints and static grips leave parts of the object unobserved, resulting in implausible reconstructions. To overcome this, we present MagicHOI, a method for reconstructing hands and objects from short monocular interaction videos, even under limited viewpoint variation. Our key insight is that, despite the scarcity of paired 3D hand-object data, large-scale novel view synthesis diffusion models offer rich object supervision. This supervision serves as a prior to regularize unseen object regions during hand interactions. Leveraging this insight, we integrate a novel view synthesis model into our hand-object reconstruction framework. We further align the hand to the object by incorporating visible contact constraints. Our results demonstrate that MagicHOI significantly outperforms existing state-of-the-art hand-object reconstruction methods. We also show that novel view synthesis diffusion priors effectively regularize unseen object regions, enhancing 3D hand-object reconstruction.
Project Video Code URL BibTeX

Perceiving Systems Conference Paper MoGA: 3D Generative Avatar Prior for Monocular Gaussian Avatar Reconstruction Dong, Z., Duan, L., Song, J., Black, M. J., Geiger, A. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2025 (Published)
We present MoGA, a novel method to reconstruct high-fidelity 3D Gaussian avatars from a single-view image. The main challenge lies in inferring unseen appearance and geometric details while ensuring 3D consistency and realism. Most previous methods rely on 2D diffusion models to synthesize unseen views; however, these generated views are sparse and inconsistent, resulting in unrealistic 3D artifacts and blurred appearance. To address these limitations, we leverage a generative avatar model that can generate diverse 3D avatars by sampling deformed Gaussians from a learned prior distribution. Due to the limited amount of 3D training data, such a 3D model alone cannot capture all image details of unseen identities. Consequently, we integrate it as a prior, ensuring 3D consistency by projecting input images into its latent space and enforcing additional 3D appearance and geometric constraints. Our novel approach formulates Gaussian avatar creation as a model inversion process by fitting the generative avatar to synthetic views from 2D diffusion models. The generative avatar provides a meaningful initialization for model fitting, enforces 3D regularization, and helps in refining pose estimation. Experiments show that our method surpasses state-of-the-art techniques and generalizes well to real-world scenarios. Our Gaussian avatars are also inherently animatable.
pdf project code video BibTeX

Perceiving Systems Conference Paper PRIMAL: Physically Reactive and Interactive Motor Model for Avatar Learning Zhang, Y., Feng, Y., Cseke, A., Saini, N., Bajandas, N., Heron, N., Black, M. J. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2025 (Published)
We formulate the motor system of an interactive avatar as a generative motion model that can drive the body to move through 3D space in a perpetual, realistic, controllable, and responsive manner. Although human motion generation has been extensively studied, many existing methods lack the responsiveness and realism of real human movements. Inspired by recent advances in foundation models, we propose PRIMAL, which is learned with a two-stage paradigm. In the pretraining stage, the model learns body movements from a large number of sub-second motion segments, providing a generative foundation from which more complex motions are built. This training is fully unsupervised without annotations. Given a single-frame initial state during inference, the pretrained model not only generates unbounded, realistic, and controllable motion, but also enables the avatar to be responsive to induced impulses in real time. In the adaptation phase, we employ a novel ControlNet-like adaptor to fine-tune the base model efficiently, adapting it to new tasks such as few-shot personalized action generation and spatial target reaching. Evaluations show that our proposed method outperforms state-of-the-art baselines. We leverage the model to create a real-time character animation system in Unreal Engine that feels highly responsive and natural.
pdf project code video BibTeX

Perceiving Systems Conference Paper SDFit: 3D Object Pose and Shape by Fitting a Morphable SDF to a Single Image Antić, D., Paschalidis, G., Tripathi, S., Gevers, T., Dwivedi, S. K., Tzionas, D. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2025 (Published)
Recovering 3D object pose and shape from a single image is a challenging and ill-posed problem. This is due to strong (self-)occlusions, depth ambiguities, the vast intra- and inter-class shape variance, and the lack of 3D ground truth for natural images. Existing deep-network methods are trained on synthetic datasets to predict 3D shapes, so they often struggle to generalize to real-world images. Moreover, they lack an explicit feedback loop for refining noisy estimates, and primarily focus on geometry without directly considering pixel alignment. To tackle these limitations, we develop a novel render-and-compare optimization framework, called SDFit. This has three key innovations: First, it uses a learned category-specific and morphable signed-distance-function (mSDF) model, and fits this to an image by iteratively refining both 3D pose and shape. The mSDF robustifies inference by constraining the search on the manifold of valid shapes, while allowing for arbitrary shape topologies. Second, SDFit retrieves an initial 3D shape that likely matches the image, by exploiting foundational models for efficient look-up into 3D shape databases. Third, SDFit initializes pose by establishing rich 2D-3D correspondences between the image and the mSDF through foundational features. We evaluate SDFit on three image datasets, i.e., Pix3D, Pascal3D+, and COMIC. SDFit performs on par with SotA feed-forward networks for unoccluded images and common poses, but is uniquely robust to occlusions and uncommon poses. Moreover, it requires no retraining for unseen images. Thus, SDFit contributes new insights for generalizing in the wild.
Project arXiv Code Video Poster BibTeX

Perceiving Systems Conference Paper St4RTrack: Simultaneous 4D Reconstruction and Tracking in the World Feng, H., Zhang, J., Wang, Q., Ye, Y., Yu, P., Black, M., Darrell, T., Kanazawa, A. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2025 (Published)
Dynamic 3D reconstruction and point tracking in videos are typically treated as separate tasks, despite their deep connection. We propose St4RTrack, a feed-forward framework that simultaneously reconstructs and tracks dynamic video content in a world coordinate frame from RGB inputs. This is achieved by predicting two appropriately defined pointmaps for a pair of frames captured at different moments. Specifically, we predict both pointmaps at the same moment, in the same world, capturing both static and dynamic scene geometry while maintaining 3D correspondences. Chaining these predictions through the video sequence with respect to a reference frame naturally computes long-range correspondences, effectively combining 3D reconstruction with 3D tracking. Unlike prior methods that rely heavily on 4D ground truth supervision, we employ a novel adaptation scheme based on a reprojection loss. We establish a new extensive benchmark for world-frame reconstruction and tracking, demonstrating the effectiveness and efficiency of our unified, data-driven framework.
pdf arXiv project code demo video BibTeX

Robotic Composites and Compositions Article Jamming with magnetic composites Aktaş, B., Kim, M., Baeckert, M., Sicilia, G., Franchini, G., Heemeyer, F., Gervasoni, S., Chen, X., Pane, S., Nelson, B. Nature Communications, 16:8711, September 2025 (Published)
The jamming transition—marked by dramatic changes in mechanical properties, such as stiffness and damping—enables programmable and adaptive structures for robotic applications. This phenomenon, driven by changes in the coupling between individual subunits of an aggregate, can be controlled through external actuation sources. Existing jamming actuation methods, such as applying a vacuum with an airtight envelope, pose significant limitations, as they require the structures to be tethered, limiting reconfigurability and scalability. Here, we introduce an untethered jamming mechanism based on magnetic interactions between soft-ferromagnetic composites. We establish composite design principles to program the magnetization of the subunits, demonstrate linear, planar, and volumetric jamming and shape-locking, and model the magneto-mechanical behavior. This approach contributes to the development of jamming-based materials in which the jamming directions and transition points can be tuned on-the-fly by adjusting the external magnetic field orientation and strength, respectively.
DOI URL BibTeX

Organizational Leadership and Diversity Article Inclusive avatars in the Metaverse: learning from the lived experiences of people with disabilities Angerbauer, K., Van Wagoner, H. P., Keplinger, K., Halach, T., Vogelsang, J., Hube, N., Smith, A., Sedlmair, M. The Journal of Strategic Information Systems, 34:101935, September 2025 (Published)
Immersive platforms like the Metaverse have gained attention in information systems (IS) research, yet the diverse needs of people with disabilities (PWD) remain underexplored. This research examines the experiences of PWD using inclusive avatars that represent disabilities. Through an exploratory mixed-methods approach, combining qualitative interviews with an experience sampling study, we develop a framework informed by Affective Events Theory and voices of PWD to better understand how social interactions in the Metaverse impact PWD’s emotions and outcomes. Findings suggest that when PWD use inclusive avatars, inclusive and exclusionary social interactions shape their emotional responses, which in turn influence engagement, avatar connection and satisfaction, and perceptions of inclusion in the Metaverse. Although adopting inclusive avatars can be challenging, especially in the face of exclusionary interactions, the benefits can outweigh the costs. The role of disability identity is critical; PWD who identify strongly with their disability experience less negative emotional impact from exclusion. This research contributes to IS literature by conceptualizing the Metaverse as a relational, emotion-driven environment shaped by social interactions as well as a platform for authentic self-representation. Practical implications include supporting avatar-based disability representation, involving PWD in co-designing virtual reality technologies, and providing training to foster inclusive interactions in the Metaverse. These strategies can help organizations build more inclusive and engaging digital workplaces for an often underrepresented workforce segment.
DOI URL BibTeX

Physical Intelligence Article Mixed-length multivariate covalent organic framework for combined near-infrared photodynamic therapy and drug delivery Rodríguez-Camargo, A., Yildiz, E., Juela, D., Fischer, F. R., Graf, D., Rath, B. B., Ochsenfeld, C., Bauer, M., Sitti, M., Yao, L., Lotsch, B. Journal of the American Chemical Society, 147:33472-33481, September 2025 (Published)
Covalent organic frameworks (COFs) have been emerging as versatile reticular materials due to their tunable structures and functionalities, enabled by precise molecular engineering at the atomic level. While the integration of multiple components into COFs has substantially expanded their structural complexity, the strategic engineering of diverse functionalities within a single framework via the random distribution of linkers with varying lengths remains largely unexplored. Here, we report a series of highly crystalline mixed-length multivariate COFs synthesized using azobenzene and bipyridine as linkers, where tuning the ratio of linkers and incorporating palladium effectively modulates the balance between near-infrared (NIR) light absorption and catalytic sites for NIR-generation of hydrogen peroxide (H2O2). Capitalizing on the deep tissue penetration of NIR light and the generated H2O2 as reactive oxygen species, as a proof of concept, the optimal mixed-length multivariate COF reduces breast cancer cell viability by almost 90% after 1 h of irradiation in a combined in vitro photodynamic therapy and drug delivery.
DOI URL BibTeX

Haptic Intelligence Autonomous Learning Empirical Inference Conference Paper Adding Internal Audio Sensing to Internal Vision Enables Human-Like In-Hand Fabric Recognition with Soft Robotic Fingertips Andrussow, I., Solano, J., Richardson, B. A., Martius, G., Kuchenbecker, K. J. In Proceedings of the IEEE-RAS International Conference on Humanoid Robots (Humanoids), 373-380, Seoul, South Korea, September 2025 (Published)
Distinguishing the feel of smooth silk from coarse cotton is a trivial everyday task for humans. When exploring such fabrics, fingertip skin senses both spatio-temporal force patterns and texture-induced vibrations that are integrated to form a haptic representation of the explored material. It is challenging to reproduce this rich, dynamic perceptual capability in robots because tactile sensors typically cannot achieve both high spatial resolution and high temporal sampling rate. In this work, we present a system that can sense both types of haptic information, and we investigate how each type influences robotic tactile perception of fabrics. Our robotic hand's middle finger and thumb each feature a soft tactile sensor: one is the open-source Minsight sensor that uses an internal camera to measure fingertip deformation and force at 50 Hz, and the other is our new sensor Minsound that captures vibrations through an internal MEMS microphone with a bandwidth from 50 Hz to 15 kHz. Inspired by the movements humans make to evaluate fabrics, our robot actively encloses and rubs folded fabric samples between its two sensitive fingers. Our results test the influence of each sensing modality on overall classification performance, showing high utility for the audio-based sensor. Our transformer-based method achieves a maximum fabric classification accuracy of 97% on a dataset of 20 common fabrics. Incorporating an external microphone away from Minsound increases our method's robustness in loud ambient noise conditions. To show that this audio-visual tactile sensing approach generalizes beyond the training data, we learn general representations of fabric stretchiness, thickness, and roughness.
DOI BibTeX

Haptic Intelligence Robotics Embodied Vision Conference Paper ISyHand: A Dexterous Multi-finger Robot Hand with an Articulated Palm Richardson, B. A., Grüninger, F., Mack, L., Stueckler, J., Kuchenbecker, K. J. In Proceedings of the IEEE-RAS International Conference on Humanoid Robots (Humanoids), 720-727, Seoul, South Korea, September 2025, Benjamin A. Richardson, Felix Grueninger and Lukas Mack contributed equally to this publication (Published) DOI BibTeX

Social Foundations of Computation Conference Paper Strategic Hypothesis Testing Hossain, S., Chen, Y., Chen, Y. The Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS), Spotlight Poster, top 3%, September 2025 (Accepted)
We examine hypothesis testing within a principal-agent framework, where a strategic agent, holding private beliefs about the effectiveness of a product, submits data to a principal who decides on approval. The principal employs a hypothesis testing rule, aiming to pick a p-value threshold that balances false positives and false negatives while anticipating the agent's incentive to maximize expected profitability. Building on prior work, we develop a game-theoretic model that captures how the agent's participation and reporting behavior respond to the principal's statistical decision rule. Despite the complexity of the interaction, we show that the principal's errors exhibit clear monotonic behavior when segmented by an efficiently computable critical p-value threshold, leading to an interpretable characterization of their optimal p-value threshold. We empirically validate our model and these insights using publicly available data on drug approvals. Overall, our work offers a comprehensive perspective on strategic interactions within the hypothesis testing framework, providing technical and regulatory insights.
arXiv BibTeX

Haptic Intelligence Master Thesis Wrist-Worn Pressure Pulses for Phantom Directional Cues in VR Kadmani, A. Technical University of Munich, Munich, Germany, September 2025, M.Sc. in Electrical Engineering and Information Technology (Published)
Haptic feedback in today's VR systems is often limited to vibration delivered through handheld controllers, leaving a gap for compact devices that can convey spatial cues without occupying the hands. This thesis presents the design and evaluation of SuperCUTE, a wrist-worn pressure feedback device that uses four soft electrohydraulic actuators to elicit phantom tactile sensations around the wrist. The device was evaluated with n = 20 participants in a user study comprising two tasks. In Task 1 (circular GUI), single-actuator cues produced tightly clustered responses (median resultant length R = 0.92); about 70% of trials fell within ± 22.5° of the stimulated cardinal. Adjacent-actuator pairs yielded in-between percepts (about 70% of reports), and intensity imbalance shifted perceived location toward the stronger actuator; reported intensity was higher for strong than weak drives (mean 0.76 vs. 0.32). Across cues, Rayleigh tests indicated strong clustering of response angles (median R ≈ 0.82). In Task 2 (VR), hand trajectories during 5 s cues aligned with cue geometry; end-directions showed strong clustering (median R ≈ 0.78), and latency estimated from a 1 cm displacement threshold had a median of 1.25 s (IQR 0.61 s). Questionnaire responses indicated clear, comfortable, and usable cues. Overall, pressure pulses are a feasible approach for directional wrist cues in VR. We provide device documentation, datasets, and analysis code to support pressure-based wearable haptics.
BibTeX

Physical Intelligence Article Real-time in situ magnetization reprogramming for soft robotics Bao, X., Wang, F., Zhang, J., Li, M., Zhang, S., Ren, Z., Liao, J., Yan, Y., Kang, W., Zhang, R., Sitti, M. Nature, 645:375–384, August 2025 (Published)
Magnetic soft robots offer considerable potential across various scenarios, such as biomedical applications and industrial tasks, because of their shape programmability and reconfigurability, safe interaction and biocompatibility [1,2,3,4]. Despite recent advances, magnetic soft robots are still limited by the difficulties in reprogramming their required magnetization profiles in real time on the spot (in situ), which is essential for performing multiple functions or executing diverse tasks [5,6]. Here we introduce a method for real-time in situ magnetization reprogramming that enables the rearrangement and recombination of magnetic units to achieve diverse magnetization profiles. We explore the applications of this method in structures of varying dimensions, from one-dimensional tubes to three-dimensional frameworks, showcasing a diverse and expanded range of configurations and their deformations. This method also demonstrates versatility in diverse scenarios, including navigating around objects without undesired contact, reprogramming cilia arrays, managing multiple instruments cooperatively or independently under the same magnetic field, and manipulating objects of various shapes. These abilities extend the range of applications for magnetic actuation technologies. Furthermore, this method frees magnetic soft robots from the sole reliance on external magnetic fields for shape change, facilitating unprecedented modes and varieties of deformation while simultaneously reducing the need for complex magnetic field generation systems, thereby opening avenues for the development of magnetic actuation technologies.
DOI URL BibTeX

Social Foundations of Computation Algorithms and Society Article Performative Prediction: Past and Future Hardt, M., Mendler-Dünner, C. Statistical Science, Institute of Mathematical Statistics, August 2025 (Published)
Predictions in the social world generally influence the target of prediction, a phenomenon known as performativity. Self-fulfilling and self-negating predictions are examples of performativity. Of fundamental importance to economics, finance, and the social sciences, the notion has been absent from the development of machine learning. In machine learning applications, performativity often surfaces as distribution shift. A predictive model deployed on a digital platform, for example, influences consumption and thereby changes the data-generating distribution. We survey the recently founded area of performative prediction that provides a definition and conceptual framework to study performativity in machine learning. A consequence of performative prediction is a natural equilibrium notion that gives rise to new optimization challenges. Another consequence is a distinction between learning and steering, two mechanisms at play in performative prediction. The notion of steering is in turn intimately related to questions of power in digital markets. We review the notion of performative power that gives an answer to the question of how much a platform can steer participants through its predictions. We end on a discussion of future directions, such as the role that performativity plays in contesting algorithmic systems.
arXiv URL BibTeX

Haptic Intelligence Miscellaneous The Benefits of Gait Retraining with Vibrotactile Feedback Outweigh Higher Perceived Mental Load Sundaram, V. H., Rokhmanova, N., Halilaj, E., Kuchenbecker, K. J. Extended abstract (1 page) presented at the American Society of Biomechanics Annual Meeting (ASB), Pittsburgh, USA, August 2025 (Published)
Knee osteoarthritis (KOA) affects millions worldwide, with excessive joint loading linked to disease progression. Modifying the foot progression angle (FPA) while walking is one strategy to reduce knee adduction moments, a measure associated with medial knee joint loading. This study investigated whether two types of vibrotactile biofeedback during a 20-minute treadmill gait-retraining session helped healthy adults better learn and retain a 10° toe-in gait. Participants who received feedback showed greater improvements in FPA accuracy than those without feedback and also reported significantly higher mental effort. The type of feedback that scaled the duration of the vibration with the magnitude of the error led to better short-term retention than no feedback, and it was also preferred by almost all subjects over constant-duration cues. These findings suggest that despite the added cognitive demand, users value biofeedback, emphasizing the need to design gait-retraining tools that consider both learning effectiveness and user experience.
BibTeX

Materials Article Sensitivity Enhancement of a Micro Ring Resonator-Based Photonic Sensor by Using a Gelatin Methacryloyl Functional Coating for the Detection of Metoprolol Tsianaka, A., Schweikert, C., Southan, A., Hoppe, N., Greul, M., Kaschel, M., Vogel, W., Berroth, M., Rademacher, G., Tovar, G. E. M. ACS Applied Optical Materials, 3(7):1556-1566, July 2025 (Published)
Aquatic environments are often contaminated with biopersistent pharmaceuticals, such as the β-blocker metoprolol. The quantitative determination of such pollutants is crucial for environmental monitoring. Therefore, a highly sensitive integrated photonic biosensor for the detection of minute concentrations of metoprolol is presented here. The sensor is based on a thermally robust ring resonator with a hydrogel coating for metoprolol adsorption. Hydrogels consisting of gelatin methacryloyl enabled an increase in the concentration of metoprolol ions in the vicinity of the photonic chip, resulting in high sensitivity of the sensor setup. Compared to an uncoated chip, an increase in sensitivity of up to a factor of 20 was observed. In combination with software-implemented signal processing, the setup showed a detection limit of less than 1 × 10⁻⁴ μmol mL⁻¹. The combination of functional coating, thermally insensitive design, and applied digital signal postprocessing makes the system introduced here an attractive approach toward sensor-based wastewater analysis and monitoring.
pdf DOI URL BibTeX

Haptic Intelligence Miscellaneous A DNN-Based Metamodel for Simulating Fingertip Deformation Deshmukh, Y., Kuchenbecker, K. J., Serhat, G. Work-in-progress paper (2 pages) presented at the IEEE World Haptics Conference (WHC), Suwon, South Korea, July 2025 (Published) BibTeX

Empirical Inference Conference Paper Active Fine-Tuning of Multi-Task Policies Bagatella, M., Hübotter, J., Martius, G., Krause, A. In Proceedings of the 42nd International Conference on Machine Learning (ICML), 267:2409-2441, Proceedings of Machine Learning Research, (Editors: Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry), PMLR, International Conference on Machine Learning, July 2025 (Published) arXiv URL BibTeX

Social Foundations of Computation Miscellaneous Answer Matching Outperforms Multiple Choice for Language Model Evaluation Chandak, N., Goel, S., Prabhu, A., Hardt, M., Geiping, J. July 2025 (Submitted)
Multiple choice benchmarks have long been the workhorse of language model evaluation because grading multiple choice is objective and easy to automate. However, we show multiple choice questions from popular benchmarks can often be answered without even seeing the question. These shortcuts arise from a fundamental limitation of discriminative evaluation not shared by evaluations of the model's free-form, generative answers. Until recently, there appeared to be no viable, scalable alternative to multiple choice--but we show that this has changed. We consider generative evaluation via what we call answer matching: Give the candidate model the question without the options, have it generate a free-form response, then use a modern language model with the reference answer to determine if the response matches the reference. To compare the validity of different evaluation strategies, we annotate MMLU-Pro and GPQA-Diamond to obtain human grading data, and measure the agreement of each evaluation approach. We find answer matching using recent models--even small ones--achieves near-perfect agreement, in the range of inter-annotator agreement. In contrast, both multiple choice evaluation and using LLM-as-a-judge without reference answers align poorly with human grading. Improving evaluations via answer matching is not merely a conceptual concern: the rankings of several models change significantly when evaluating their free-form responses with answer matching. In light of these findings, we discuss how to move the evaluation ecosystem from multiple choice to answer matching.
arXiv BibTeX
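The answer-matching protocol described in the abstract above is simple to state as a loop. A minimal sketch, assuming placeholder callables for the candidate model and the judge model (any modern LLM API could fill those roles):

```python
# Sketch of answer matching: ask the question without options, collect a
# free-form response, and let a reference-aware judge decide if it matches.
# `generate` and `judge_says_match` are hypothetical placeholders.

def answer_matching_score(items, generate, judge_says_match):
    """Fraction of questions whose free-form answer matches the reference.

    items: list of (question, reference_answer) pairs
    generate: candidate model, question -> free-form answer
    judge_says_match: (question, reference, response) -> bool
    """
    correct = 0
    for question, reference in items:
        response = generate(question)  # note: no answer options are shown
        if judge_says_match(question, reference, response):
            correct += 1
    return correct / len(items)
```

The key contrast with multiple-choice grading is that the candidate never sees the options, so option-only shortcuts cannot inflate the score.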

Empirical Inference Deep Models and Optimization Conference Paper Generalized Interpolating Discrete Diffusion von Rütte, D., Fluri, J., Ding, Y., Orvieto, A., Schölkopf, B., Hofmann, T. Proceedings of the 42nd International Conference on Machine Learning (ICML), 267:61810-61843, Proceedings of Machine Learning Research, (Editors: Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry), PMLR, International Conference on Machine Learning, July 2025 (Published) arXiv URL BibTeX

Empirical Inference Conference Paper Generative Intervention Models for Causal Perturbation Modeling Schneider, N., Lorch, L., Kilbertus, N., Schölkopf, B., Krause, A. Proceedings of the 42nd International Conference on Machine Learning (ICML), 267:53388-53412, Proceedings of Machine Learning Research, (Editors: Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry), PMLR, International Conference on Machine Learning, July 2025 (Published) arXiv URL BibTeX

Empirical Inference Conference Paper Learning Joint Interventional Effects from Single-Variable Interventions in Additive Models Kekić, A., Garrido Mejia, S., Schölkopf, B. Proceedings of the 42nd International Conference on Machine Learning (ICML), 267:29651-29669, Proceedings of Machine Learning Research, (Editors: Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry), PMLR, International Conference on Machine Learning, July 2025 (Published) arXiv URL BibTeX

Haptic Intelligence Robotic Materials Miscellaneous Learning-Based Touch Detection and Force Estimation in Cutaneous Electrohydraulic Devices Sanchez-Tamayo, N., Singer, D., Keplinger, C., Kuchenbecker, K. J. Work-in-progress paper (2 pages) presented at the IEEE World Haptics Conference (WHC), Suwon, South Korea, July 2025 (Published) BibTeX

Haptic Intelligence Miscellaneous Perception of Diverse Asymmetric Vibration Signals Tashiro, N., Ballardini, G., Nunez, C. M., Vardar, Y., Kuchenbecker, K. J. Work-in-progress paper (2 pages) presented at the IEEE World Haptics Conference (WHC), Suwon, South Korea, July 2025 (Published) BibTeX

Empirical Inference Conference Paper Position: Probabilistic Modelling is Sufficient for Causal Inference Mlodozeniec, B. K., Krueger, D., Turner, R. E. Proceedings of the 42nd International Conference on Machine Learning (ICML), 267:81810-81840, Proceedings of Machine Learning Research, (Editors: Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry), PMLR, International Conference on Machine Learning, July 2025 (Published) URL BibTeX

Empirical Inference Ph.D. Thesis Probabilistic Machine Learning for Real-Time Gravitational-Wave Inference Dax, M. Eberhard Karls Universität Tübingen, July 2025, (MPI IS + ELLIS Institute Tübingen) (Published) BibTeX

Empirical Inference Conference Paper Progressive Tempering Sampler with Diffusion Rissanen*, S., OuYang*, R., He*, J., Chen, W., Heinonen, M., Solin, A., Hernández-Lobato, J. M. Proceedings of the 42nd International Conference on Machine Learning (ICML), 267:51724-51746, Proceedings of Machine Learning Research, (Editors: Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry), PMLR, International Conference on Machine Learning, July 2025, *equal contribution (Published) arXiv URL BibTeX

Haptic Intelligence Miscellaneous Quantifying Texture-Rendering Quality Across Haptic Devices Fazlollahi, F., Seifi, H., Ballardini, G., Taghizadeh, Z., Schulz, A. K., MacLean, K. E., Kuchenbecker, K. J. Work-in-progress paper (2 pages) presented at the IEEE World Haptics Conference (WHC), Suwon, South Korea, July 2025 (Published) BibTeX

Empirical Inference Autonomous Learning Conference Paper SENSEI: Semantic Exploration Guided by Foundation Models to Learn Versatile World Models Sancaktar, C., Gumbsch, C., Zadaianchuk, A., Kolev, P., Martius, G. In Proceedings of the 42nd International Conference on Machine Learning (ICML), 267:52745-52777, Proceedings of Machine Learning Research, (Editors: Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry), International Conference on Machine Learning, July 2025 (Published) arXiv Project website URL BibTeX

Empirical Inference Conference Paper Scalable Gaussian Processes with Latent Kronecker Structure Lin, J. A., Ament, A., Balandat, M., Eriksson, D., Hernández-Lobato, J. M., Bakshy, E. Proceedings of the 42nd International Conference on Machine Learning (ICML), 267:37730-37744, Proceedings of Machine Learning Research, (Editors: Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry), PMLR, International Conference on Machine Learning, July 2025 (Published) arXiv URL BibTeX

Haptic Intelligence Robotics Miscellaneous Soft Magnetic Fingertip Devices for Clear Vibrotactile Feedback Gertler, I., Ballardini, G., Grüninger, F., Kuchenbecker, K. J. Hands-on demonstration presented at the IEEE World Haptics Conference (WHC), Suwon, South Korea, July 2025 (Published) BibTeX

Social Foundations of Computation Miscellaneous Train-before-Test Harmonizes Language Model Rankings Zhang, G., Dominguez-Olmedo, R., Hardt, M. July 2025 (Submitted)
Existing language model benchmarks provide contradictory model rankings, even for benchmarks that aim to capture similar skills. This dilemma of conflicting rankings hampers model selection, clouds model comparisons, and adds confusion to a growing ecosystem of competing models. Recent work attributed ranking disagreement to the phenomenon of training on the test task: As released, different models exhibit a different level of preparation for any given test task. A candidate solution to the problem is train-before-test: Give each model the same benchmark-specific finetuning before evaluation. Our primary contribution is a broad empirical evaluation of train-before-test across 24 benchmarks and 61 models. We show that train-before-test significantly improves ranking agreement consistently across all benchmarks. Whereas rankings have little external validity to start with, they enjoy a significant degree of external validity when applying train-before-test: Model rankings transfer gracefully from one benchmark to another. Even within the same model family, train-before-test reduces strong ranking disagreement to near-perfect agreement. In addition, train-before-test reduces the model-score matrix to essentially rank one, revealing new insights into the latent factors of benchmark performance. Our work supports the recommendation to make train-before-test a default component of LLM benchmarking.
arXiv BibTeX
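The train-before-test protocol from the abstract above amounts to a small wrapper around any evaluation harness: apply one identical benchmark-specific finetuning recipe to every model before scoring it. A minimal sketch, with `finetune` and `evaluate` as hypothetical placeholders for a real training and evaluation pipeline:

```python
# Sketch of train-before-test: identical benchmark-specific preparation for
# every model, then evaluation. The callables are illustrative placeholders.

def train_before_test(models, benchmark_train, benchmark_test, finetune, evaluate):
    """Score each model after the same benchmark-specific finetuning.

    models: dict mapping model name -> model object
    finetune: (model, train_split) -> prepared model (same recipe for all)
    evaluate: (model, test_split) -> scalar score
    """
    scores = {}
    for name, model in models.items():
        prepared = finetune(model, benchmark_train)  # equalize preparation
        scores[name] = evaluate(prepared, benchmark_test)
    return scores
```

Because every model receives the same preparation, differences in released-model "readiness" for the test task are removed from the comparison.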

Haptic Intelligence Miscellaneous Whole-Arm Humanoid Robot Teleoperation with Naturalistic Vibrotactile Feedback Gong, Y., Hudhud Mughrabi, M., L’Orsa, R., Mohan, M., Kuchenbecker, K. J. Work-in-progress paper (2 pages) presented at the IEEE World Haptics Conference (WHC), Suwon, South Korea, July 2025 (Published) BibTeX

Autonomous Learning Empirical Inference Conference Paper Zero-Shot Offline Imitation Learning via Optimal Transport Rupf, T., Bagatella, M., Gürtler, N., Frey, J., Martius, G. In Proceedings of the 42nd International Conference on Machine Learning (ICML), 267:52345-52381, Proceedings of Machine Learning Research, (Editors: Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry), PMLR, International Conference on Machine Learning, July 2025 (Published)
Zero-shot imitation learning algorithms hold the promise of reproducing unseen behavior from as little as a single demonstration at test time. Existing practical approaches view the expert demonstration as a sequence of goals, enabling imitation with a high-level goal selector, and a low-level goal-conditioned policy. However, this framework can suffer from myopic behavior: the agent's immediate actions towards achieving individual goals may undermine long-term objectives. We introduce a novel method that mitigates this issue by directly optimizing the occupancy matching objective that is intrinsic to imitation learning. We propose to lift a goal-conditioned value function to a distance between occupancies, which are in turn approximated via a learned world model. The resulting method can learn from offline, suboptimal data, and is capable of non-myopic, zero-shot imitation, as we demonstrate in complex, continuous benchmarks.
arXiv URL BibTeX

Physical Intelligence Article Bacterial Minicell-Based Biohybrid Sub-micron Swimmers for Targeted Cargo Delivery Saadet Fatma Baltaci, M. B. A. I. K. V. S. M. S. Advanced Science, 12:e05538, June 2025 (Published)
Bacterial biohybrid microrobots possess significant potential for targeted cargo delivery and minimally invasive therapy. However, many challenges, such as biocompatibility, stability, and effective cargo loading, remain. Bacterial membrane vesicles, also referred to as minicells, offer a promising alternative for creating sub-micron scale biohybrid swimmers (minicell biohybrids) due to their active metabolism, non-dividing nature, robust structure, and high cargo-carrying capacity. Here, a biohybrid system is reported that utilizes motile minicells, ≈400 nm in diameter, generated by aberrant cell division of engineered Escherichia coli (E. coli), for the first time. Achieving over 99% purification from their parental bacterial cells, minicells are functionalized with magnetic nanoparticles (MNPs) to enable external magnetic control. Minicell biohybrids are capable of swimming at an average speed of up to 13.3 µm s⁻¹ and being steered under a uniform magnetic field of 26 mT. Furthermore, they exhibit a significantly high drug loading capacity (2.8 µg mL⁻¹) while maintaining their motility and show pH-sensitive release of anticancer drug doxorubicin hydrochloride (DOX) under acidic conditions. Additionally, drug-loaded minicell biohybrids notably reduce the viability of SK-BR-3 breast cancer cells in vitro. This study introduces minicell biohybrids and establishes their potential as magnetically guided, drug-loaded biohybrid systems for targeted therapies in future medical applications.
DOI URL BibTeX

Physical Intelligence Article Magnetically Controllable and Degradable Milliscale Swimmers as Intraocular Drug Implants Yildiz, E., Bozuyuk, U., Yildiz, E., Wang, F., Han, M., Karacakol, A. C., Sheehan, D., Yu, Y., Sitti, M. Advanced Science, 12:e07569, June 2025 (Published)
Intraocular drug implants are increasingly used for retinal treatments, such as age-related macular degeneration and diabetic macular edema, due to the rapidly aging global population. Although these therapies show promise in arresting disease progression and improving vision, intraocular implant-based therapies can cause unexpected complications that require further surgery due to implant dislocation or uncontrolled drug release. These frequent complications of intraocular drug implants can be overcome using magnetically controllable degradable milliscale swimmers (MDMS) with a double-helix body morphology. A biodegradable hydrogel, polyethylene glycol diacrylate, is employed as the primary 3D printing material of MDMS, and it is magnetized by decorating it with biocompatible polydopamine-encapsulated iron-platinum nanoparticles. MDMS have dimensions comparable to those of commercial intraocular implants and achieve translational motion in both the aqueous and vitreous bodies. They can be imaged in real-time using optical coherence tomography, ultrasound, and photoacoustic imaging. Thanks to their biodegradable hydrogel-based structure, they can be loaded with anti-inflammatory drug molecules and release the medications without disrupting retinal epithelial viability and barrier function, and decrease proinflammatory cytokine release significantly. These magnetically controllable swimmers, which degrade in a couple of months, can be used for less invasive and more precise intraocular drug delivery compared to commercial intraocular drug implants.
DOI URL BibTeX

Perceiving Systems Conference Paper Reconstructing Animals and the Wild Kulits, P., Black, M. J., Zuffi, S. In IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), June 2025 (Published)
The idea of 3D reconstruction as scene understanding is foundational in computer vision. Reconstructing 3D scenes from 2D visual observations requires strong priors to disambiguate structure. Much work has been focused on the anthropocentric, which, characterized by smooth surfaces, coherent normals, and regular edges, allows for the integration of strong geometric inductive biases. Here, we consider a more challenging problem where such assumptions do not hold: the reconstruction of natural scenes containing trees, bushes, boulders, and animals. While numerous works have attempted to tackle the problem of reconstructing animals in the wild, they have focused solely on the animal, neglecting environmental context. This limits their usefulness for analysis tasks, as animals exist inherently within the 3D world, and information is lost when environmental factors are disregarded. We propose a method to reconstruct natural scenes from single images. We base our approach on recent advances leveraging the strong world priors ingrained in Large Language Models and train an autoregressive model to decode a CLIP embedding into a structured compositional scene representation, encompassing both animals and the wild (RAW). To enable this, we propose a synthetic dataset comprising one million images and thousands of assets. Our approach, having been trained solely on synthetic data, generalizes to the task of reconstructing animals and their environments in real-world images. We will release our dataset and code to encourage future research.
project arXiv code BibTeX

Perceiving Systems Conference Paper DiffLocks: Generating 3D Hair from a Single Image using Diffusion Models Rosu, R. A., Wu, K., Feng, Y., Zheng, Y., Black, M. J. In IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), June 2025 (Published)
We address the task of reconstructing 3D hair geometry from a single image, which is challenging due to the diversity of hairstyles and the lack of paired image-to-3D hair data. Previous methods are primarily trained on synthetic data and cope with the limited amount of such data by using low-dimensional intermediate representations, such as guide strands and scalp-level embeddings, that require post-processing to decode, upsample, and add realism. These approaches fail to reconstruct detailed hair, struggle with curly hair, or are limited to handling only a few hairstyles. To overcome these limitations, we propose DiffLocks, a novel framework that enables detailed reconstruction of a wide variety of hairstyles directly from a single image. First, we address the lack of 3D hair data by automating the creation of the largest synthetic hair dataset to date, containing 40K hairstyles. Second, we leverage the synthetic hair dataset to learn an image-conditioned diffusion-transformer model that reconstructs accurate 3D strands from a single frontal image. By using a pretrained image backbone, our method generalizes to in-the-wild images despite being trained only on synthetic data. Our diffusion model predicts a scalp texture map in which any point in the map contains the latent code for an individual hair strand. These codes are directly decoded to 3D strands without post-processing techniques. Representing individual strands, instead of guide strands, enables the transformer to model the detailed spatial structure of complex hairstyles. With this, DiffLocks can reconstruct highly curled hair, like afro hairstyles, from a single image for the first time. Qualitative and quantitative results demonstrate that DiffLocks outperforms existing state-of-the-art approaches. Data and code are available for research.
project paper code dataset BibTeX

Perceiving Systems Conference Paper InterDyn: Controllable Interactive Dynamics with Video Diffusion Models Akkerman, R., Feng, H., Black, M. J., Tzionas, D., Abrevaya, V. F. In IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), June 2025 (Published)
Predicting the dynamics of interacting objects is essential for both humans and intelligent systems. However, existing approaches are limited to simplified, toy settings and lack generalizability to complex, real-world environments. Recent advances in generative models have enabled the prediction of state transitions based on interventions, but focus on generating a single future state which neglects the continuous dynamics resulting from the interaction. To address this gap, we propose InterDyn, a novel framework that generates videos of interactive dynamics given an initial frame and a control signal encoding the motion of a driving object or actor. Our key insight is that large video generation models can act as both neural renderers and implicit physics "simulators", having learned interactive dynamics from large-scale video data. To effectively harness this capability, we introduce an interactive control mechanism that conditions the video generation process on the motion of the driving entity. Qualitative results demonstrate that InterDyn generates plausible, temporally consistent videos of complex object interactions while generalizing to unseen objects. Quantitative evaluations show that InterDyn outperforms baselines that focus on static state transitions. This work highlights the potential of leveraging video generative models as implicit physics engines.
project arXiv BibTeX

Perceiving Systems Conference Paper PICO: Reconstructing 3D People In Contact with Objects Cseke, A., Tripathi, S., Dwivedi, S. K., Lakshmipathy, A. S., Chatterjee, A., Black, M. J., Tzionas, D. In IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), June 2025 (Published)
Recovering 3D Human-Object Interaction (HOI) from single color images is challenging due to depth ambiguities, occlusions, and the huge variation in object shape and appearance. Thus, past work requires controlled settings such as known object shapes and contacts, and tackles only limited object classes. Instead, we need methods that generalize to natural images and novel object classes. We tackle this in two main ways: (1) We collect PICO-db, a new dataset of natural images uniquely paired with dense 3D contact on both body and object meshes. To this end, we use images from the recent DAMON dataset that are paired with contacts, but these contacts are only annotated on a canonical 3D body. In contrast, we seek contact labels on both the body and the object. To infer these given an image, we retrieve an appropriate 3D object mesh from a database by leveraging vision foundation models. Then, we project DAMON's body contact patches onto the object via a novel method needing only 2 clicks per patch. This minimal human input establishes rich contact correspondences between bodies and objects. (2) We exploit our new dataset of contact correspondences in a novel render-and-compare fitting method, called PICO-fit, to recover 3D body and object meshes in interaction. PICO-fit infers contact for the SMPL-X body, retrieves a likely 3D object mesh and contact from PICO-db for that object, and uses the contact to iteratively fit the 3D body and object meshes to image evidence via optimization. Uniquely, PICO-fit works well for many object categories that no existing method can tackle. This is crucial to enable HOI understanding to scale in the wild.
project arXiv video code dataset BibTeX

Perceiving Systems Conference Paper ChatGarment: Garment Estimation, Generation and Editing via Large Language Models Bian, S., Xu, C., Xiu, Y., Grigorev, A., Liu, Z., Lu, C., Black, M. J., Feng, Y. In IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), June 2025 (Published)
We introduce ChatGarment, a novel approach that leverages large vision-language models (VLMs) to automate the estimation, generation, and editing of 3D garment sewing patterns from images or text descriptions. Unlike previous methods that often lack robustness and interactive editing capabilities, ChatGarment finetunes a VLM to produce GarmentCode, a JSON-based, language-friendly format for 2D sewing patterns, enabling both estimating and editing from images and text instructions. To optimize performance, we refine GarmentCode by expanding its support for more diverse garment types and simplifying its structure, making it more efficient for VLM finetuning. Additionally, we develop an automated data construction pipeline to generate a large-scale dataset of image-to-sewing-pattern and text-to-sewing-pattern pairs, empowering ChatGarment with strong generalization across various garment types. Extensive evaluations demonstrate ChatGarment’s ability to accurately reconstruct, generate, and edit garments from multimodal inputs, highlighting its potential to revolutionize workflows in fashion and gaming applications.
project arXiv video code data BibTeX

Social Foundations of Computation Conference Paper Difficult Lessons on Social Prediction from Wisconsin Public Schools Perdomo, J. C., Britton, T., Hardt, M., Abebe, R. In Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency, June 2025 (Published)
Early warning systems (EWS) are predictive tools at the center of recent efforts to improve graduation rates in public schools across the United States. These systems assist in targeting interventions to individual students by predicting which students are at risk of dropping out. Despite significant investments in their widespread adoption, there remain large gaps in our understanding of the efficacy of EWS, and the role of statistical risk scores in education. In this work, we draw on nearly a decade's worth of data from a system used throughout Wisconsin to provide the first large-scale evaluation of the long-term impact of EWS on graduation outcomes. We present empirical evidence that the prediction system accurately sorts students by their dropout risk. We also find that it may have caused a single-digit percentage increase in graduation rates, though our empirical analyses cannot reliably rule out that there has been no positive treatment effect. Going beyond a retrospective evaluation of Wisconsin's Dropout Early Warning System (DEWS), we draw attention to a central question at the heart of the use of EWS: Are individual risk scores necessary for effectively targeting interventions? We propose a simple mechanism that only uses information about students' environments -- such as their schools and districts -- and argue that this mechanism can target interventions just as efficiently as the individual risk score-based mechanism. Our argument holds even if individual predictions are highly accurate and effective interventions exist. In addition to motivating this simple targeting mechanism, our work provides a novel empirical backbone for the robust qualitative understanding among education researchers that dropout is structurally determined. Combined, our insights call into question the marginal value of individual predictions in settings where outcomes are driven by high levels of inequality.
arXiv URL BibTeX
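The environment-based targeting mechanism proposed in the abstract above can be illustrated with a toy allocation rule: spend the intervention budget on students from the highest-dropout-rate environments, with no individual risk scores at all. The data layout and field names below are hypothetical, chosen only to make the idea concrete.

```python
# Illustrative environment-based targeting: allocate a fixed intervention
# budget by school-level dropout base rates rather than individual scores.
# Field names ("school") and the data shape are assumptions for this sketch.

def target_by_environment(students, school_dropout_rate, budget):
    """Select `budget` students, prioritizing high-dropout-rate schools."""
    ranked = sorted(
        students,
        key=lambda s: school_dropout_rate[s["school"]],
        reverse=True,  # highest-risk environments first
    )
    return ranked[:budget]
```

Under high structural inequality, this rule concentrates resources in the same places an accurate individual-score mechanism would, which is the intuition behind the paper's argument.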