Sample-efficient Cross-Entropy Method for Real-time Planning

Autonomous Learning Embodied Vision Conference Paper 2020

Autonomous Learning

Cristina Pinneri

Robust Machine Learning

Sebastian Blaes

Postdoctoral Researcher

Embodied Vision

Jan Achterhold

Embodied Vision

Jörg Stückler

Max Planck Research Group Leader

Autonomous Learning

Michal Rolinek

Empirical Inference, Autonomous Learning

Georg Martius

Senior Research Scientist

Trajectory optimizers for model-based reinforcement learning, such as the Cross-Entropy Method (CEM), can yield compelling results even in high-dimensional control tasks and sparse-reward environments. However, their sampling inefficiency prevents them from being used for real-time planning and control. We propose an improved version of the CEM algorithm for fast planning, with novel additions including temporally-correlated actions and memory, requiring 2.7-22x fewer samples and yielding a performance increase of 1.2-10x in high-dimensional control problems.
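The two additions named in the abstract, temporally-correlated actions and memory, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`correlated_noise`, `cem_plan`), all parameter values, the scalar action space, and the use of an AR(1) process as the source of temporal correlation are assumptions made only for illustration. "Memory" is interpreted here simply as carrying the best plans from one CEM iteration into the next.

```python
import math
import random


def correlated_noise(horizon, rho, rng):
    """AR(1) noise with unit marginal variance (an assumed stand-in for
    temporally-correlated action noise): e_t = rho*e_{t-1} + sqrt(1-rho^2)*w_t."""
    eps = [rng.gauss(0.0, 1.0)]
    scale = math.sqrt(1.0 - rho * rho)
    for _ in range(horizon - 1):
        eps.append(rho * eps[-1] + scale * rng.gauss(0.0, 1.0))
    return eps


def cem_plan(cost_fn, horizon, n_samples=32, n_elites=4, n_iters=5,
             rho=0.8, keep=2, seed=0):
    """Hypothetical CEM planner over a sequence of scalar actions.

    cost_fn maps a list of `horizon` actions to a float (lower is better).
    Returns (best_plan, best_cost).
    """
    rng = random.Random(seed)
    mean = [0.0] * horizon
    std = [1.0] * horizon
    memory = []  # (cost, plan) pairs kept from the previous iteration

    for _ in range(n_iters):
        # Sample candidate plans around the current mean with correlated noise.
        candidates = []
        for _ in range(n_samples):
            eps = correlated_noise(horizon, rho, rng)
            plan = [m + s * e for m, s, e in zip(mean, std, eps)]
            candidates.append((cost_fn(plan), plan))

        # Memory: previous elites compete with the fresh samples.
        candidates.extend(memory)
        candidates.sort(key=lambda c: c[0])
        elites = candidates[:n_elites]

        # Refit the per-timestep Gaussian to the elite plans.
        for t in range(horizon):
            vals = [p[t] for _, p in elites]
            mean[t] = sum(vals) / len(vals)
            var = sum((v - mean[t]) ** 2 for v in vals) / len(vals)
            std[t] = math.sqrt(var) + 1e-6  # floor to avoid collapse to zero

        memory = elites[:keep]

    best_cost, best_plan = memory[0]
    return best_plan, best_cost


# Usage with a toy cost: drive every action towards 0.5.
toy_cost = lambda plan: sum((a - 0.5) ** 2 for a in plan)
plan, cost = cem_plan(toy_cost, horizon=5)
print(plan, cost)
```

Because the retained elites re-enter the candidate pool, the best cost in this sketch cannot get worse from one iteration to the next. In a real planner, `cost_fn` would roll the action sequence through a dynamics model; the toy quadratic cost here only keeps the example self-contained.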

Author(s):	Cristina Pinneri and Shambhuraj Sawant and Sebastian Blaes and Jan Achterhold and Joerg Stueckler and Michal Rolinek and Georg Martius
Book Title:	Conference on Robot Learning 2020
Year:	2020

Project(s):	Model-based Reinforcement Learning and Planning
Bibtex Type:	Conference Paper (inproceedings)

State:	Published
URL:	https://corlconf.github.io/corl2020/paper_217/

Electronic Archiving:	grant_archive

Links:	Paper Code Spotlight-Video

BibTeX

@inproceedings{PinneriEtAl2020:iCEM,
  title = {Sample-efficient Cross-Entropy Method for Real-time Planning},
  booktitle = {Conference on Robot Learning 2020},
  abstract = {Trajectory optimizers for model-based reinforcement learning, such as the Cross-Entropy Method (CEM), can yield compelling results even in high-dimensional control tasks and sparse-reward environments. However, their sampling inefficiency prevents them from being used for real-time planning and control. We propose an improved version of the CEM algorithm for fast planning, with novel additions including temporally-correlated actions and memory, requiring 2.7-22x fewer samples and yielding a performance increase of 1.2-10x in high-dimensional control problems.},
  year = {2020},
  slug = {pinnerietal2020-icem},
  author = {Pinneri, Cristina and Sawant, Shambhuraj and Blaes, Sebastian and Achterhold, Jan and Stueckler, Joerg and Rolinek, Michal and Martius, Georg},
  url = {https://corlconf.github.io/corl2020/paper_217/}
}