Reinforcement Learning and Control
Model-based Reinforcement Learning and Planning
Causal Reasoning in RL
Intrinsically Motivated Hierarchical Learner
Regularity as Intrinsic Reward for Free Play
Curious Exploration via Structured World Models Yields Zero-Shot Object Manipulation
Natural and Robust Walking from Generic Rewards
Offline Diversity Under Imitation Constraints
Learning Diverse Skills for Local Navigation
Learning Agile Skills via Adversarial Imitation of Rough Partial Demonstrations
Goal-conditioned Offline Planning
Curiosity has established itself as a powerful exploration strategy in deep reinforcement learning. We consider the challenge of extracting flexible, goal-conditioned behavior from the products of such unsupervised exploration techniques, without any additional environment interaction. Central to this endeavor is a goal-conditioned value function, encoding the “temporal distance” between any given state and goal. By analyzing the geometry of optimal goal-conditioned value functions, we identify a class of estimation artifacts in learned values. To mitigate their occurrence, we propose to combine model-based planning over learned value landscapes with a graph-based value aggregation scheme. We show how this combination can correct both local and global artifacts, yielding significant improvements in zero-shot goal-reaching performance across diverse simulated environments.
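The graph-based aggregation step can be illustrated with a minimal sketch. Assuming a learned goal-conditioned value function value_fn(s, g) that approximates the negative temporal distance between a state and a goal, and assuming short-range value estimates are more reliable than long-range ones, the idea is to treat trusted short-range values as edge costs in a graph over previously visited states and recover long-range values as shortest-path distances. All names and thresholds below (aggregate_values, d_min, d_max, the Floyd-Warshall choice) are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def aggregate_values(nodes, goal, value_fn, d_max=-1.0, d_min=-15.0):
    """Graph-based value aggregation (illustrative sketch).

    Corrects a learned goal-conditioned value V(s, g) ~ -temporal_distance
    by composing it along shortest paths through a graph of visited states.

    nodes:    (N, obs_dim) states sampled from the exploration buffer
    goal:     (obs_dim,) goal state
    value_fn: callable (states, goals) -> (N,) values, the learned V(s, g)
    d_min:    edges with value below this (too "far") are pruned, on the
              assumption that long-range estimates carry most artifacts
    """
    N = len(nodes)
    # Pairwise local values V[i, j] = V(s_i, s_j) between buffer states.
    V = value_fn(np.repeat(nodes, N, axis=0),
                 np.tile(nodes, (N, 1))).reshape(N, N)
    # Keep only short, locally reliable edges; cost = positive distance.
    cost = np.where(V >= d_min, -np.clip(V, d_min, d_max), np.inf)
    np.fill_diagonal(cost, 0.0)
    # Floyd-Warshall: shortest temporal distance between all node pairs.
    for k in range(N):
        cost = np.minimum(cost, cost[:, k:k + 1] + cost[k:k + 1, :])
    # Distance from each node to the goal via its best last hop.
    v_goal = value_fn(nodes, np.tile(goal, (N, 1)))
    hop = np.where(v_goal >= d_min, -v_goal, np.inf)
    dist_to_goal = np.min(cost + hop[None, :], axis=1)
    # Aggregated value: negative shortest-path distance to the goal.
    return -dist_to_goal

# Toy check on a 1-D chain where the true value is -|s - g|:
nodes = np.arange(6, dtype=float).reshape(-1, 1)
v = lambda s, g: -np.abs(s - g).sum(axis=1)
print(aggregate_values(nodes, np.array([5.0]), v))  # ~[-5 ... 0]
```

The model-based planning component mentioned in the abstract (e.g., trajectory optimization over a learned dynamics model, steered toward the highest aggregated value) would sit on top of this scheme and is omitted from the sketch.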