Publications

Rationality Enhancement Conference Paper Promoting metacognitive learning through systematic reflection Becker, F., Lieder, F. Workshop on Metacognition in the Age of AI. Thirty-fifth Conference on Neural Information Processing Systems, 35th Conference on Neural Information Processing Systems (NeurIPS 2021), December 2021 (Published)
People are able to learn clever cognitive strategies through trial and error from small amounts of experience. This is facilitated by people's ability to reflect on their own thinking, which is known as metacognition. To examine the effects of deliberate systematic metacognitive reflection on how people learn how to plan, the experimental group was guided to systematically reflect on their decision-making process after every third decision. We found that participants assisted by reflection prompts learned to plan better, and did so faster. Moreover, we found that reflection led to immediate improvements in the participants' planning strategies. Our preliminary results suggest that deliberate metacognitive reflection can help people discover clever cognitive strategies from very small amounts of experience. Understanding the role of reflection in human learning is a promising approach for making reinforcement learning more sample efficient in both humans and machines.
DOI URL BibTeX

Robotic Materials Article Miniaturized Circuitry for Capacitive Self-sensing and Closed-loop Control of Soft Electrostatic Transducers Ly, K., Kellaris, N., McMorris, D., Johnson, B. K., Acome, E., Sundaram, V., Naris, M., Humbert, J. S., Rentschler, M. E., Keplinger, C., Correll, N. Soft Robotics, 8(6):673-686, Mary Ann Liebert, Inc., publishers, December 2021 (Published)
Soft robotics is a field of robotic system design characterized by materials and structures that exhibit large-scale deformation, high compliance, and rich multifunctionality. The incorporation of soft and deformable structures endows soft robotic systems with the compliance and resiliency that make them well adapted for unstructured and dynamic environments. Although actuation mechanisms for soft robots vary widely, soft electrostatic transducers such as dielectric elastomer actuators (DEAs) and hydraulically amplified self-healing electrostatic (HASEL) actuators have demonstrated promise due to their muscle-like performance and capacitive self-sensing capabilities. Despite previous efforts to implement self-sensing in electrostatic transducers by overlaying sinusoidal low-voltage signals, these designs still require sensing high-voltage signals, requiring bulky components that prevent integration with miniature untethered soft robots. We present a circuit design that eliminates the need for any high-voltage sensing components, thereby facilitating the design of simple, low-cost circuits using off-the-shelf components. Using this circuit, we perform simultaneous sensing and actuation for a range of electrostatic transducers including circular DEAs and HASEL actuators and demonstrate accurate estimated displacements with errors <4%. We further develop this circuit into a compact and portable system that couples high-voltage actuation, sensing, and computation as a prototype toward untethered multifunctional soft robotic systems. Finally, we demonstrate the capabilities of our self-sensing design through feedback control of a robotic arm powered by Peano-HASEL actuators.
DOI URL BibTeX

Autonomous Vision Conference Paper Projected GANs Converge Faster Sauer, A., Chitta, K., Müller, J., Geiger, A. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 34, (Editors: Ranzato, M; Beygelzimer, A; Dauphin, Y; Liang, PS; Vaughan, JW), NeurIPS, 35th Conference on Neural Information Processing Systems (NeurIPS), December 2021 (Published) URL BibTeX

Autonomous Vision Conference Paper CAMPARI: Camera-Aware Decomposed Generative Neural Radiance Fields Niemeyer, M., Geiger, A. In Proceedings 2021 International Conference on 3D Vision (3DV 2021), 951-961, 3DV, December 2021 (Published) DOI BibTeX

Human-centric Vision & Learning Conference Paper Human Performance Capture from Monocular Video in the Wild Guo, C., Chen, X., Song, J., Hilliges, O. 2021 International Conference on 3D Vision (3DV), 889-898, IEEE, International Conference on 3D Vision (3DV 2021), December 2021 (Published) DOI URL BibTeX

Empirical Inference Conference Paper A Probabilistic State Space Model for Joint Inference from Differential Equations and Data Schmidt, J., Kraemer, N., Hennig, P. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 12374-12385, (Editors: M. Ranzato and A. Beygelzimer and Y. Dauphin and P.S. Liang and J. Wortman Vaughan), Curran Associates, Inc., 35th Annual Conference on Neural Information Processing Systems, December 2021 (Published) URL BibTeX

Rationality Enhancement Article A Rational Reinterpretation of Dual Process Theories Milli, S., Lieder, F., Griffiths, T. L. Cognition, 217, December 2021 (Published)
Highly influential "dual-process" accounts of human cognition postulate the coexistence of a slow accurate system with a fast error-prone system. But why would there be just two systems rather than, say, one or 93? Here, we argue that a dual-process architecture might reflect a rational tradeoff between the cognitive flexibility afforded by multiple systems and the time and effort required to choose between them. We investigate what the optimal set and number of cognitive systems would be depending on the structure of the environment. We find that the optimal number of systems depends on the variability of the environment and the difficulty of deciding when which system should be used. Furthermore, we find that there is a plausible range of conditions under which it is optimal to be equipped with a fast system that performs no deliberation ("System 1") and a slow system that achieves a higher expected accuracy through deliberation ("System 2"). Our findings thereby suggest a rational reinterpretation of dual-process theories.
DOI URL BibTeX

Movement Generation and Control Conference Paper A unified framework for walking and running of bipedal robots Boroujeni, M. G., Daneshmand, E., Righetti, L., Khadiv, M. 20th International Conference on Advanced Robotics (ICAR), December 2021 (Accepted)
In this paper, we propose a novel framework capable of generating various walking and running gaits for bipedal robots. The main goal is to relax the fixed center of mass (CoM) height assumption of the linear inverted pendulum model (LIPM) and generate a wider range of walking and running motions, without a considerable increase in complexity. To do so, we use the concept of virtual constraints in the centroidal space, which enables generating motions beyond walking while keeping the complexity at a minimum. By a proper choice of these virtual constraints, we show that we can generate different types of walking and running motions. More importantly, enforcing the virtual constraints through feedback renders the dynamics linear and enables us to design a feedback control mechanism which adapts the next step location and timing in the face of disturbances, through a simple quadratic program (QP). To show the effectiveness of this framework, we showcase different walking and running simulations of the biped robot Bolt in the presence of both environmental uncertainties and external disturbances.
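For context, the fixed-height assumption mentioned in the abstract refers to the standard linear inverted pendulum model. The textbook form below is not taken from the paper. The symbols $x$ (horizontal CoM position), $p$ (center-of-pressure position), $z_0$ (constant CoM height), and $g$ (gravitational acceleration) are notation assumed here for illustration.

```latex
\ddot{x} = \frac{g}{z_0}\,(x - p), \qquad \omega = \sqrt{\frac{g}{z_0}}
```

Holding $z_0$ constant is what makes these dynamics linear in $x$. Letting the height vary generally breaks that linearity, which is the difficulty the virtual constraints described in the abstract are meant to address.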
URL BibTeX

Autonomous Vision Conference Paper ATISS: Autoregressive Transformers for Indoor Scene Synthesis Paschalidou, D., Kar, A., Shugrina, M., Kreis, K., Geiger, A., Fidler, S. In Advances in Neural Information Processing Systems 34, 15:12013-12026, (Editors: M. Ranzato and A. Beygelzimer and Y. Dauphin and P. S. Liang and J. Wortman Vaughan), Curran Associates, Inc., Red Hook, NY, 35th Conference on Neural Information Processing Systems (NeurIPS 2021), December 2021 (Published)
The ability to synthesize realistic and diverse indoor furniture layouts, automatically or based on partial input, unlocks many applications, from better interactive 3D tools to data synthesis for training and simulation. In this paper, we present ATISS, a novel autoregressive transformer architecture for creating diverse and plausible synthetic indoor environments, given only the room type and its floor plan. In contrast to prior work, which poses scene synthesis as sequence generation, our model generates rooms as unordered sets of objects. We argue that this formulation is more natural, as it makes ATISS generally useful beyond fully automatic room layout synthesis. For example, the same trained model can be used in interactive applications for general scene completion, partial room re-arrangement with any objects specified by the user, as well as object suggestions for any partial room. To enable this, our model leverages the permutation equivariance of the transformer when conditioning on the partial scene, and is trained to be permutation-invariant across object orderings. Our model is trained end-to-end as an autoregressive generative model using only labeled 3D bounding boxes as supervision. Evaluations on four room types in the 3D-FRONT dataset demonstrate that our model consistently generates plausible room layouts that are more realistic than existing methods. In addition, it has fewer parameters, is simpler to implement and train, and runs up to 8x faster than existing methods.
URL BibTeX

Empirical Inference Conference Paper An Infinite-Feature Extension for Bayesian ReLU Nets That Fixes Their Asymptotic Overconfidence Kristiadi, A., Hein, M., Hennig, P. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 18789-18800, (Editors: M. Ranzato and A. Beygelzimer and Y. Dauphin and P.S. Liang and J. Wortman Vaughan), Curran Associates, Inc., 35th Annual Conference on Neural Information Processing Systems, December 2021 (Published) URL BibTeX

Empirical Inference Conference Paper Analytic Insights into Structure and Rank of Neural Network Hessian Maps Singh, S. P., Bachmann, G., Hofmann, T. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 23914-23927, (Editors: M. Ranzato and A. Beygelzimer and Y. Dauphin and P.S. Liang and J. Wortman Vaughan), Curran Associates, Inc., 35th Annual Conference on Neural Information Processing Systems, December 2021 (Published) URL BibTeX

Empirical Inference Conference Paper Backward-Compatible Prediction Updates: A Probabilistic Approach Träuble, F., von Kügelgen, J., Kleindessner, M., Locatello, F., Schölkopf, B., Gehler, P. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 116-128, (Editors: M. Ranzato and A. Beygelzimer and Y. Dauphin and P.S. Liang and J. Wortman Vaughan), Curran Associates, Inc., 35th Annual Conference on Neural Information Processing Systems (NeurIPS), December 2021 (Published) arXiv URL BibTeX

Empirical Inference Conference Paper Bootstrap Your Flow Midgley, L. I., Stimper, V., Simm, G. N. C., Hernández-Lobato, J. M. 1st ELLIS Machine Learning for Molecule Discovery Workshop, December 2021 (Published) arXiv URL BibTeX

Autonomous Learning Empirical Inference Conference Paper Causal Influence Detection for Improving Efficiency in Reinforcement Learning Seitzer, M., Schölkopf, B., Martius, G. In Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 34:22905-22918, (Editors: M. Ranzato and A. Beygelzimer and Y. Dauphin and P. S. Liang and J. Wortman Vaughan), Curran Associates, Inc., Red Hook, NY, 35th Conference on Neural Information Processing Systems, December 2021 (Published) arXiv PDF Data Code URL BibTeX

Empirical Inference Conference Paper Cockpit: A Practical Debugging Tool for the Training of Deep Neural Networks Schneider, F., Dangel, F., Hennig, P. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 20825-20837, (Editors: M. Ranzato and A. Beygelzimer and Y. Dauphin and P.S. Liang and J. Wortman Vaughan), Curran Associates, Inc., 35th Annual Conference on Neural Information Processing Systems (NeurIPS), December 2021 (Published) URL BibTeX

Perceiving Systems Conference Paper Collaborative Regression of Expressive Bodies using Moderation Feng, Y., Choutas, V., Bolkart, T., Tzionas, D., Black, M. J. 2021 International Conference on 3D Vision (3DV 2021), 792-804, IEEE, Piscataway, NJ, International Conference on 3D Vision (3DV 2021), December 2021 (Published)
Recovering expressive humans from images is essential for understanding human behavior. Methods that estimate 3D bodies, faces, or hands have progressed significantly, yet separately. Face methods recover accurate 3D shape and geometric details, but need a tight crop and struggle with extreme views and low resolution. Whole-body methods are robust to a wide range of poses and resolutions, but provide only a rough 3D face shape without details like wrinkles. To get the best of both worlds, we introduce PIXIE, which produces animatable, whole-body 3D avatars with realistic facial detail, from a single image. For this, PIXIE uses two key observations. First, existing work combines independent estimates from body, face, and hand experts, by trusting them equally. PIXIE introduces a novel moderator that merges the features of the experts, weighted by their confidence. All part experts can contribute to the whole, using SMPL-X’s shared shape space across all body parts. Second, human shape is highly correlated with gender, but existing work ignores this. We label training images as male, female, or non-binary, and train PIXIE to infer “gendered” 3D body shapes with a novel shape loss. In addition to 3D body pose and shape parameters, PIXIE estimates expression, illumination, albedo and 3D facial surface displacements. Quantitative and qualitative evaluation shows that PIXIE estimates more accurate whole-body shape and detailed face shape than the state of the art. Models and code are available at https://pixie.is.tue.mpg.de.
arXiv project pdf suppl DOI BibTeX

Empirical Inference Conference Paper DiBS: Differentiable Bayesian Structure Learning Lorch, L., Rothfuss, J., Schölkopf, B., Krause, A. In Advances in Neural Information Processing Systems 34, 29:24111-24123, (Editors: M. Ranzato and A. Beygelzimer and Y. Dauphin and P. S. Liang and J. Wortman Vaughan), Curran Associates, Inc., Red Hook, NY, 35th Annual Conference on Neural Information Processing Systems (NeurIPS 2021), December 2021 (Published) URL BibTeX

Embodied Vision Conference Paper DiffSDFSim: Differentiable Rigid-Body Dynamics With Implicit Shapes Strecke, M., Stückler, J. In 2021 International Conference on 3D Vision (3DV 2021), 96-105, International Conference on 3D Vision (3DV 2021), December 2021 (Published) Project website Preprint Code DOI URL BibTeX

Empirical Inference Conference Paper Dynamic Inference with Neural Interpreters Rahaman*, N., Gondal*, M. W., Joshi, S., Gehler, P., Bengio, Y., Locatello, F., Schölkopf, B. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 10985-10998, (Editors: M. Ranzato and A. Beygelzimer and Y. Dauphin and P.S. Liang and J. Wortman Vaughan), Curran Associates, Inc., 35th Annual Conference on Neural Information Processing Systems, December 2021, *equal contribution (Published) URL BibTeX

Empirical Inference Conference Paper Fine-Grained Zero-Shot Learning with DNA as Side Information Badirli, S., Akata, Z., Mohler, G., Picard, C., Dundar, M. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 34:19352-19362, (Editors: M. Ranzato and A. Beygelzimer and Y. Dauphin and P.S. Liang and J. Wortman Vaughan), Curran Associates, Inc., 35th Conference on Neural Information Processing Systems (NeurIPS 2021) , December 2021 (Published) URL BibTeX

Micro, Nano, and Molecular Systems Article Following molecular mobility during chemical reactions: no evidence for active propulsion Fillbrook, L. L., Günther, J., Majer, G., O’Leary, D. J., Price, W. S., Van Ryswyk, H., Fischer, P., Beves, J. E. Journal of the American Chemical Society, 143(49):20884-20890, December 2021
The reported changes in self-diffusion of small molecules during reactions have been attributed to “boosted mobility”. We demonstrate the critical role of changing concentrations of paramagnetic ions on nuclear magnetic resonance (NMR) signal intensities, which lead to erroneous measurements of diffusion coefficients. We present simple methods to overcome this problem. The use of shuffled gradient amplitudes allows accurate diffusion NMR measurements, even with time-dependent relaxation rates caused by changing concentrations of paramagnetic ions. The addition of a paramagnetic relaxation agent allows accurate determination of both diffusion coefficients and reaction kinetics during a single experiment. We analyze a copper-catalyzed azide-alkyne cycloaddition ‘click’ reaction, for which boosted mobility has been claimed. With our methods, we accurately measure the diffusive behavior of the solvent, starting materials and product, and find no global increase in diffusion coefficients during the reaction. We overcome NMR signal overlap using an alternative reducing agent to improve the accuracy of the diffusion measurements. The alkyne reactant diffuses slower as the reaction proceeds, due to binding to the copper catalyst during the catalytic cycle. The formation of this intermediate was confirmed by complementary NMR techniques and density functional theory calculations. Our work calls into question recent claims that molecules actively propel or swim during reactions, and establishes that time-resolved diffusion NMR measurements can provide valuable insight into reaction mechanisms.
DOI URL BibTeX

Rationality Enhancement Conference Paper Have I done enough planning or should I plan more? He, R., Jain, Y. R., Lieder, F. Workshop on Metacognition in the Age of AI. Thirty-fifth Conference on Neural Information Processing Systems, Long Paper, Workshop on Metacognition in the Age of AI. Thirty-fifth Conference on Neural Information Processing Systems, December 2021 (Accepted)
People’s decisions about how to allocate their limited computational resources are essential to human intelligence. An important component of this metacognitive ability is deciding whether to continue thinking about what to do or move on to the next decision. Here, we show that people acquire this ability through learning and reverse-engineer the underlying learning mechanisms. Using a process-tracing paradigm that externalises human planning, we find that people quickly adapt how much planning they perform to the cost and benefit of planning. To discover the underlying metacognitive learning mechanisms, we augmented a set of reinforcement learning models with metacognitive features and performed Bayesian model selection. Our results suggest that the metacognitive ability to adjust the amount of planning might be learned through a policy-gradient mechanism that is guided by metacognitive pseudo-rewards that communicate the value of planning.
BibTeX

Empirical Inference Autonomous Learning Conference Paper Hierarchical Reinforcement Learning with Timed Subgoals Gürtler, N., Büchler, D., Martius, G. In Advances in Neural Information Processing Systems 34, 26:21732-21743, (Editors: M. Ranzato and A. Beygelzimer and Y. Dauphin and P. S. Liang and J. Wortman Vaughan), Curran Associates, Inc., Red Hook, NY, 35th Conference on Neural Information Processing Systems (NeurIPS 2021), December 2021 (Published) video arXiv code URL BibTeX

Empirical Inference Conference Paper Independent mechanisms analysis, a new concept? Gresele*, L., von Kügelgen*, J., Stimper, V., Schölkopf, B., Besserve, M. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 28233-28248, (Editors: M. Ranzato and A. Beygelzimer and Y. Dauphin and P.S. Liang and J. Wortman Vaughan), Curran Associates, Inc., Red Hook, NY, 35th Annual Conference on Neural Information Processing Systems, December 2021, *equal contribution (Published) arXiv URL BibTeX

Empirical Inference Conference Paper Iterative Teaching by Label Synthesis Liu, W., Liu, Z., Wang, H., Paull, L., Schölkopf, B., Weller, A. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 21681-21695, (Editors: M. Ranzato and A. Beygelzimer and Y. Dauphin and P.S. Liang and J. Wortman Vaughan), Curran Associates, Inc., 35th Annual Conference on Neural Information Processing Systems, December 2021 (Published) URL BibTeX

Empirical Inference Conference Paper Laplace Redux — Effortless Bayesian Deep Learning Daxberger*, E., Kristiadi*, A., Immer*, A., Eschenhagen*, R., Bauer, M., Hennig, P. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 20089-20103, (Editors: M. Ranzato and A. Beygelzimer and Y. Dauphin and P.S. Liang and J. Wortman Vaughan), Curran Associates, Inc., 35th Annual Conference on Neural Information Processing Systems, December 2021, *equal contribution (Published) URL BibTeX

Perceiving Systems Conference Paper Learning to Disambiguate Strongly Interacting Hands via Probabilistic Per-pixel Part Segmentation Fan, Z., Spurr, A., Kocabas, M., Tang, S., Black, M. J., Hilliges, O. 2021 International Conference on 3D Vision (3DV 2021), 1-10, IEEE, Piscataway, NJ, International Conference on 3D Vision (3DV 2021), December 2021 (Published)
In natural conversation and interaction, our hands often overlap or are in contact with each other. Due to the homogeneous appearance of hands, this makes estimating the 3D pose of interacting hands from images difficult. In this paper we demonstrate that self-similarity, and the resulting ambiguities in assigning pixel observations to the respective hands and their parts, is a major cause of the final 3D pose error. Motivated by this insight, we propose DIGIT, a novel method for estimating the 3D poses of two interacting hands from a single monocular image. The method consists of two interwoven branches that process the input imagery into a per-pixel semantic part segmentation mask and a visual feature volume. In contrast to prior work, we do not decouple the segmentation from the pose estimation stage, but rather leverage the per-pixel probabilities directly in the downstream pose estimation task. To do so, the part probabilities are merged with the visual features and processed via fully-convolutional layers. We experimentally show that the proposed approach achieves new state-of-the-art performance on the InterHand2.6M dataset for both single and interacting hands across all metrics. We provide detailed ablation studies to demonstrate the efficacy of our method and to provide insights into how the modelling of pixel ownership affects single and interacting hand pose estimation. Our code will be released for research purposes.
arXiv project code video DOI BibTeX

Article Learning to solve sequential physical reasoning problems from a scene image Driess, D., Ha, J., Toussaint, M. The International Journal of Robotics Research, 40(12-14):1435-1466, December 2021 (Published) DOI BibTeX

Intelligent Control Systems Conference Paper Learning-enhanced robust controller synthesis with rigorous statistical and control-theoretic guarantees Fiedler, C., Scherer, C. W., Trimpe, S. In 60th IEEE Conference on Decision and Control (CDC), IEEE, December 2021 (Accepted)
The combination of machine learning with control offers many opportunities, in particular for robust control. However, due to strong safety and reliability requirements in many real-world applications, providing rigorous statistical and control-theoretic guarantees is of utmost importance, yet difficult to achieve for learning-based control schemes. We present a general framework for learning-enhanced robust control that allows for systematic integration of prior engineering knowledge, is fully compatible with modern robust control and still comes with rigorous and practically meaningful guarantees. Building on the established Linear Fractional Representation and Integral Quadratic Constraints framework, we integrate Gaussian Process Regression as a learning component and state-of-the-art robust controller synthesis. In a concrete robust control example, our approach is demonstrated to yield improved performance with more data, while guarantees are maintained throughout.
URL BibTeX

Empirical Inference Conference Paper Linear-Time Probabilistic Solution of Boundary Value Problems Kraemer, N., Hennig, P. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 11160-11171, (Editors: M. Ranzato and A. Beygelzimer and Y. Dauphin and P.S. Liang and J. Wortman Vaughan), Curran Associates, Inc., 35th Annual Conference on Neural Information Processing Systems, December 2021 (Published) URL BibTeX

Intelligent Control Systems Conference Paper Local policy search with Bayesian optimization Müller, S., von Rohr, A., Trimpe, S. In Advances in Neural Information Processing Systems 34, 25:20708-20720, (Editors: Ranzato, M. and Beygelzimer, A. and Dauphin, Y. and Liang, P. S. and Wortman Vaughan, J.), Curran Associates, Inc., Red Hook, NY, 35th Conference on Neural Information Processing Systems (NeurIPS 2021) , December 2021 (Published)
Reinforcement learning (RL) aims to find an optimal policy by interaction with an environment. Consequently, learning complex behavior requires a vast number of samples, which can be prohibitive in practice. Nevertheless, instead of systematically reasoning and actively choosing informative samples, policy gradients for local search are often obtained from random perturbations. These random samples yield high variance estimates and hence are sub-optimal in terms of sample complexity. Actively selecting informative samples is at the core of Bayesian optimization, which constructs a probabilistic surrogate of the objective from past samples to reason about informative subsequent ones. In this paper, we propose to join both worlds. We develop an algorithm utilizing a probabilistic model of the objective function and its gradient. Based on the model, the algorithm decides where to query a noisy zeroth-order oracle to improve the gradient estimates. The resulting algorithm is a novel type of policy search method, which we compare to existing black-box algorithms. The comparison reveals improved sample complexity and reduced variance in extensive empirical evaluations on synthetic objectives. Further, we highlight the benefits of active sampling on popular RL benchmarks.
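As a rough illustration of the random-perturbation baseline that the abstract contrasts with, the sketch below estimates a gradient from a zeroth-order oracle using antithetic Gaussian perturbations. It is not the paper's algorithm, which instead chooses query points with a probabilistic model. The function names, the noiseless toy objective, and all parameter values are assumptions made for this example.

```python
import random


def perturbation_gradient(oracle, theta, sigma=0.1, num_samples=1000, rng=None):
    """Estimate the gradient of `oracle` at `theta` from function values only.

    Each sample draws a Gaussian direction, queries the oracle at
    theta + sigma * eps and theta - sigma * eps, and weights the direction
    by the resulting central finite-difference slope.
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is reproducible
    dim = len(theta)
    grad = [0.0] * dim
    for _ in range(num_samples):
        eps = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        plus = oracle([t + sigma * e for t, e in zip(theta, eps)])
        minus = oracle([t - sigma * e for t, e in zip(theta, eps)])
        slope = (plus - minus) / (2.0 * sigma)
        for i in range(dim):
            grad[i] += slope * eps[i] / num_samples
    return grad


def quadratic(theta):
    """Hypothetical noiseless objective with true gradient 2 * theta."""
    return sum(t * t for t in theta)


# The true gradient at [1.0, -2.0] is [2.0, -4.0]; the estimate is noisy.
estimate = perturbation_gradient(quadratic, [1.0, -2.0], num_samples=20000)
print(estimate)
```

Even on this simple objective the estimate needs many oracle calls to settle, which illustrates the high variance and sample cost that motivate selecting queries actively.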
arXiv GitHub URL BibTeX

Empirical Inference Conference Paper Locality Sensitive Teaching Xu, Z., Chen, B., Li, C., Liu, W., Song, L., Lin, Y., Shrivastava, A. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 18049-18062, (Editors: M. Ranzato and A. Beygelzimer and Y. Dauphin and P.S. Liang and J. Wortman Vaughan), Curran Associates, Inc., 35th Annual Conference on Neural Information Processing Systems, December 2021 (Published) URL BibTeX

Autonomous Vision Perceiving Systems Conference Paper MetaAvatar: Learning Animatable Clothed Human Models from Few Depth Images Wang, S., Mihajlovic, M., Ma, Q., Geiger, A., Tang, S. In Advances in Neural Information Processing Systems 34, 4:2810-2822, (Editors: Ranzato, M. and Beygelzimer, A. and Dauphin, Y. and Liang, P. S. and Wortman Vaughan, J.), Curran Associates, Inc., Red Hook, NY, 35th Conference on Neural Information Processing Systems (NeurIPS 2021), December 2021 (Published)
In this paper, we aim to create generalizable and controllable neural signed distance fields (SDFs) that represent clothed humans from monocular depth observations. Recent advances in deep learning, especially neural implicit representations, have enabled human shape reconstruction and controllable avatar generation from different sensor inputs. However, to generate realistic cloth deformations from novel input poses, watertight meshes or dense full-body scans are usually needed as inputs. Furthermore, due to the difficulty of effectively modeling pose-dependent cloth deformations for diverse body shapes and cloth types, existing approaches resort to per-subject/cloth-type optimization from scratch, which is computationally expensive. In contrast, we propose an approach that can quickly generate realistic clothed human avatars, represented as controllable neural SDFs, given only monocular depth images. We achieve this by using meta-learning to learn an initialization of a hypernetwork that predicts the parameters of neural SDFs. The hypernetwork is conditioned on human poses and represents a clothed neural avatar that deforms non-rigidly according to the input poses. Meanwhile, it is meta-learned to effectively incorporate priors of diverse body shapes and cloth types and thus can be much faster to fine-tune compared to models trained from scratch. We qualitatively and quantitatively show that our approach outperforms state-of-the-art approaches that require complete meshes as inputs while our approach requires only depth frames as inputs and runs orders of magnitude faster. Furthermore, we demonstrate that our meta-learned hypernetwork is very robust, being the first to generate avatars with realistic dynamic cloth deformations given as few as 8 monocular depth frames.
Project page arXiv URL BibTeX

Autonomous Vision Conference Paper On the Frequency Bias of Generative Models Schwarz, K., Liao, Y., Geiger, A. In Advances in Neural Information Processing Systems 34, 22:18126-18136, (Editors: M. Ranzato and A. Beygelzimer and Y. Dauphin and P. S. Liang and J. Wortman Vaughan), Curran Associates, Inc., Red Hook, NY, 35th Conference on Neural Information Processing Systems (NeurIPS 2021), December 2021 (Published)
The key objective of Generative Adversarial Networks (GANs) is to generate new data with the same statistics as the provided training data. However, multiple recent works show that state-of-the-art architectures still struggle to achieve this goal. In particular, they report an elevated amount of high frequencies in the spectral statistics, which makes it straightforward to distinguish real and generated images. Explanations for this phenomenon are controversial: While most works attribute the artifacts to the generator, other works point to the discriminator. We take a sober look at those explanations and provide insights on what makes proposed measures against high-frequency artifacts effective. To achieve this, we first independently assess the architectures of both the generator and discriminator and investigate if they exhibit a frequency bias that makes learning the distribution of high-frequency content particularly problematic. Based on these experiments, we make the following four observations: 1) Different upsampling operations bias the generator towards different spectral properties. 2) Checkerboard artifacts introduced by upsampling cannot explain the spectral discrepancies alone as the generator is able to compensate for these artifacts. 3) The discriminator does not struggle with detecting high frequencies per se but rather struggles with frequencies of low magnitude. 4) The downsampling operations in the discriminator can impair the quality of the training signal it provides. In light of these findings, we analyze proposed measures against high-frequency artifacts in state-of-the-art GAN training but find that none of the existing approaches can fully resolve spectral artifacts yet. Our results suggest that there is great potential in improving the discriminator and that this could be key to matching the distribution of the training data more closely.

Autonomous Learning Conference Paper Planning from Pixels in Environments with Combinatorially Hard Search Spaces Bagatella, M., Olšák, M., Rolínek, M., Martius, G. In Advances in Neural Information Processing Systems 34, 30:24707-24718, Curran Associates, Inc., Red Hook, NY, 35th Conference on Neural Information Processing Systems (NeurIPS 2021), December 2021 (Published)
The ability to form complex plans based on raw visual input is a litmus test for current capabilities of artificial intelligence, as it requires a seamless combination of visual processing and abstract algorithmic execution, two traditionally separate areas of computer science. A recent surge of interest in this field brought advances that yield good performance in tasks ranging from arcade games to continuous control; these methods, however, do not come without significant issues, such as limited generalization capabilities and difficulties when dealing with combinatorially hard planning instances. Our contribution is two-fold: (i) we present a method that learns to represent its environment as a latent graph and leverages state reidentification to reduce the complexity of finding a good policy from exponential to linear; (ii) we introduce a set of lightweight environments with an underlying discrete combinatorial structure in which planning is challenging even for humans. Moreover, we show that our method achieves strong empirical generalization to variations in the environment, even across highly disadvantaged regimes, such as “one-shot” planning, or in an offline RL paradigm which only provides low-quality trajectories.

Empirical Inference Article Real-time gravitational wave science with neural posterior estimation Dax, M., Green, S. R., Gair, J., Macke, J. H., Buonanno, A., Schölkopf, B. Physical Review Letters, 127(24), December 2021 (Published)

Empirical Inference Conference Paper Regret Bounds for Gaussian-Process Optimization in Large Domains Wüthrich, M., Schölkopf, B., Krause, A. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 7385-7396, (Editors: M. Ranzato and A. Beygelzimer and Y. Dauphin and P.S. Liang and J. Wortman Vaughan), Curran Associates, Inc., 35th Annual Conference on Neural Information Processing Systems, December 2021 (Published)

Empirical Inference Conference Paper Robustness via Uncertainty-aware Cycle Consistency Upadhyay, U., Chen, Y., Akata, Z. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 28261-28273, (Editors: M. Ranzato and A. Beygelzimer and Y. Dauphin and P.S. Liang and J. Wortman Vaughan), Curran Associates, Inc., 35th Annual Conference on Neural Information Processing Systems, December 2021 (Published)

Micro, Nano, and Molecular Systems Article Rods in a lyotropic chromonic liquid crystal: emergence of chirality, symmetry-breaking alignment, and caged angular diffusion Ettinger, S., Dietrich, C. F., Mishra, C. K., Miksch, C., Beller, D. A., Collings, P. J., Yodh, A. G. Soft Matter, 18(3):487-495, December 2021 (Published)

Article Section Patterns: Efficiently Solving Narrow Passage Problems in Multilevel Motion Planning Orthey, A., Toussaint, M. IEEE Transactions on Robotics, 37(6):1891-1905, December 2021 (Published)

Empirical Inference Conference Paper Self-supervised learning with data augmentations provably isolates content from style von Kügelgen*, J., Sharma*, Y., Gresele*, L., Brendel, W., Schölkopf, B., Besserve, M., Locatello, F. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 16451-16467, (Editors: M. Ranzato and A. Beygelzimer and Y. Dauphin and P.S. Liang and J. Wortman Vaughan), Curran Associates, Inc., 35th Annual Conference on Neural Information Processing Systems, December 2021, *equal contribution (Published)

Autonomous Vision Conference Paper Shape As Points: A Differentiable Poisson Solver Peng, S., Jiang, C. M., Liao, Y., Niemeyer, M., Pollefeys, M., Geiger, A. In Advances in Neural Information Processing Systems 34, 16:13032-13044, (Editors: M. Ranzato and A. Beygelzimer and Y. Dauphin and P. S. Liang and J. Wortman Vaughan), Curran Associates, Inc., Red Hook, NY, 35th Conference on Neural Information Processing Systems (NeurIPS 2021), December 2021 (Published)
In recent years, neural implicit representations have gained popularity in 3D reconstruction due to their expressiveness and flexibility. However, the implicit nature of neural implicit representations results in slow inference times and requires careful initialization. In this paper, we revisit the classic yet ubiquitous point cloud representation and introduce a differentiable point-to-mesh layer using a differentiable formulation of Poisson Surface Reconstruction (PSR) which allows for a GPU-accelerated fast solution of the indicator function given an oriented point cloud. The differentiable PSR layer allows us to efficiently and differentiably bridge the explicit 3D point representation with the 3D mesh via the implicit indicator field, enabling end-to-end optimization of surface reconstruction metrics such as Chamfer distance. This duality between points and meshes hence allows us to represent shapes as oriented point clouds, which are explicit, lightweight and expressive. Compared to neural implicit representations, our Shape-As-Points (SAP) model is more interpretable, lightweight, and accelerates inference time by one order of magnitude. Compared to other explicit representations such as points, patches, and meshes, SAP produces topology-agnostic, watertight manifold surfaces. We demonstrate the effectiveness of SAP on the task of surface reconstruction from unoriented point clouds and learning-based reconstruction.

Autonomous Learning Conference Paper Sparsely Changing Latent States for Prediction and Planning in Partially Observable Domains Gumbsch, C., Butz, M. V., Martius, G. In Advances in Neural Information Processing Systems 34, 21:17518-17531, (Editors: M. Ranzato and A. Beygelzimer and Y. Dauphin and P. S. Liang and J. Wortman Vaughan), Curran Associates, Inc., Red Hook, NY, 35th Conference on Neural Information Processing Systems (NeurIPS 2021), December 2021 (Published)
A common approach to prediction and planning in partially observable domains is to use recurrent neural networks (RNNs), which ideally develop and maintain a latent memory about hidden, task-relevant factors. We hypothesize that many of these hidden factors in the physical world are constant over time, changing only sparsely. Accordingly, we propose Gated $L_0$ Regularized Dynamics (GateL0RD), a novel recurrent architecture that incorporates the inductive bias to maintain stable, sparsely changing latent states. The bias is implemented by means of a novel internal gating function and a penalty on the $L_0$ norm of latent state changes. We demonstrate that GateL0RD can compete with or outperform state-of-the-art RNNs in a variety of partially observable prediction and control tasks. GateL0RD tends to encode the underlying generative factors of the environment, ignores spurious temporal dependencies, and generalizes better, improving sampling efficiency and prediction accuracy as well as behavior in model-based planning and reinforcement learning tasks. Moreover, we show that the developing latent states can be easily interpreted, which is a step towards better explainability in RNNs.

Empirical Inference Conference Paper The Inductive Bias of Quantum Kernels Kübler*, J. M., Buchholz*, S., Schölkopf, B. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 12661-12673, (Editors: M. Ranzato and A. Beygelzimer and Y. Dauphin and P.S. Liang and J. Wortman Vaughan), Curran Associates, Inc., 35th Annual Conference on Neural Information Processing Systems, December 2021, *equal contribution (Published)

Perceiving Systems Article The neural coding of face and body orientation in occipitotemporal cortex Foster, C., Zhao, M., Bolkart, T., Black, M. J., Bartels, A., Bülthoff, I. NeuroImage, 246:118783, December 2021 (Published)
Face and body orientation convey important information for us to understand other people's actions, intentions and social interactions. It has been shown that several occipitotemporal areas respond differently to faces or bodies of different orientations. However, whether face and body orientation are processed by partially overlapping or completely separate brain networks remains unclear, as the neural coding of face and body orientation is often investigated separately. Here, we recorded participants’ brain activity using fMRI while they viewed faces and bodies shown from three different orientations, while attending to either orientation or identity information. Using multivoxel pattern analysis we investigated which brain regions process face and body orientation respectively, and which regions encode both face and body orientation in a stimulus-independent manner. We found that patterns of neural responses evoked by different stimulus orientations in the occipital face area, extrastriate body area, lateral occipital complex and right early visual cortex could generalise across faces and bodies, suggesting a stimulus-independent encoding of person orientation in occipitotemporal cortex. This finding was consistent across functionally defined regions of interest and a whole-brain searchlight approach. The fusiform face area responded to face but not body orientation, suggesting that orientation responses in this area are face-specific. Moreover, neural responses to orientation were remarkably consistent regardless of whether participants attended to the orientation of faces and bodies or not. Together, these results demonstrate that face and body orientation are processed in a partially overlapping brain network, with a stimulus-independent neural code for face and body orientation in occipitotemporal cortex.

Haptic Intelligence Article Virtual Reality Treatment Displaying the Missing Leg Improves Phantom Limb Pain: A Small Clinical Trial Ambron, E., Buxbaum, L. J., Miller, A., Stoll, H., Kuchenbecker, K. J., Coslett, H. B. Neurorehabilitation and Neural Repair, 35(12):1100-1111, December 2021 (Published)
Background: Phantom limb pain (PLP) is a common and in some cases debilitating consequence of upper- or lower-limb amputation for which current treatments are inadequate. Objective: This small clinical trial tested whether game-like interactions with immersive VR activities can reduce PLP in subjects with transtibial lower-limb amputation. Methods: Seven participants attended 5–7 sessions in which they engaged in a visually immersive virtual reality experience that did not require leg movements (Cool!™), followed by 10–12 sessions of targeted lower-limb VR treatment consisting of custom games requiring leg movement. In the latter condition, they controlled an avatar with 2 intact legs viewed in a head-mounted display (HTC Vive™). A motion-tracking system mounted on the intact and residual limbs controlled the movements of both virtual extremities independently. Results: All participants except one experienced a reduction of pain immediately after VR sessions, and their pre-session pain levels also decreased over the course of the study. At a group level, PLP decreased by 28% after the treatment that did not include leg movements and 39.6% after the games requiring leg motions. Both treatments were successful in reducing PLP. Conclusions: This VR intervention appears to be an efficacious treatment for PLP in subjects with lower-limb amputation.

Empirical Inference Article Coupling of hippocampal theta and ripples with pontogeniculooccipital waves Ramirez-Villegas, J., Besserve, M., Murayama, Y., Evrard, H., Oeltermann, A., Logothetis, N. Nature, 589(7840):96-102, November 2021 (Published)

Intelligent Control Systems Conference Paper Using Physics Knowledge for Learning Rigid-Body Forward Dynamics with Gaussian Process Force Priors Rath, L., Geist, A. R., Trimpe, S. In Proceedings of the 5th Conference on Robot Learning, 164:101-111, Proceedings of Machine Learning Research, (Editors: Faust, Aleksandra and Hsu, David and Neumann, Gerhard), PMLR, 5th Conference on Robot Learning (CoRL 2021), November 2021 (Published)