Empirical Inference
Conference Paper
2008
Probabilistic Inference for Fast Learning in Control
We provide a novel framework for very fast model-based reinforcement learning in continuous state and action spaces. The framework requires probabilistic models that explicitly characterize their levels of confidence. Within this framework, we use flexible, non-parametric models to describe the world based on previously collected experience. We demonstrate learning on the cart-pole problem in a setting where we provide very limited prior knowledge about the task. Learning progresses rapidly, and a good policy is found after only a handful of iterations.
| Author(s): | Rasmussen, CE. and Deisenroth, MP. |
| Book Title: | EWRL 2008 |
| Journal: | Recent Advances in Reinforcement Learning: 8th European Workshop (EWRL 2008) |
| Pages: | 229-242 |
| Year: | 2008 |
| Month: | November |
| Editors: | Girgin, S., Loth, M., Munos, R., Preux, P., Ryabko, D. |
| Publisher: | Springer |
| BibTeX Type: | Conference Paper (inproceedings) |
| Address: | Berlin, Germany |
| DOI: | 10.1007/978-3-540-89722-4_18 |
| Event Name: | 8th European Workshop on Reinforcement Learning |
| Event Place: | Villeneuve d'Ascq, France |
| Digital: | 0 |
| Electronic Archiving: | grant_archive |
| Language: | en |
| Organization: | Max-Planck-Gesellschaft |
| School: | Biologische Kybernetik |
BibTeX
@inproceedings{5398,
title = {Probabilistic Inference for Fast Learning in Control},
journal = {Recent Advances in Reinforcement Learning: 8th European Workshop (EWRL 2008)},
booktitle = {EWRL 2008},
abstract = {We provide a novel framework for very fast model-based reinforcement learning in continuous state and action spaces. The framework requires probabilistic models that explicitly characterize their levels of confidence. Within this framework, we use flexible, non-parametric models to describe the world based on previously collected experience. We demonstrate learning on the cart-pole problem in a setting where we provide very limited prior knowledge about the task. Learning progresses rapidly, and a good policy is found after only a handful of iterations.},
pages = {229-242},
editor = {Girgin, S. and Loth, M. and Munos, R. and Preux, P. and Ryabko, D.},
publisher = {Springer},
organization = {Max-Planck-Gesellschaft},
school = {Biologische Kybernetik},
address = {Berlin, Germany},
month = nov,
year = {2008},
author = {Rasmussen, CE. and Deisenroth, MP.},
doi = {10.1007/978-3-540-89722-4_18},
month_numeric = {11}
}