Empirische Inferenz, Conference Paper, 2007

Applying the Episodic Natural Actor-Critic Architecture to Motor Primitive Learning

In this paper, we investigate motor primitive learning with the Natural Actor-Critic approach. The Natural Actor-Critic consists of actor updates, which are achieved using natural stochastic policy gradients, while the critic obtains the natural policy gradient by linear regression. We show that this architecture can be used to learn the “building blocks of movement generation”, called motor primitives. Motor primitives are parameterized control policies such as splines or nonlinear differential equations with desired attractor properties. We show that our most recent algorithm, the Episodic Natural Actor-Critic, outperforms previous algorithms by at least an order of magnitude. We demonstrate the efficiency of this reinforcement learning method in the application of learning to hit a baseball with an anthropomorphic robot arm.
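To make the abstract's description concrete, below is a minimal sketch of an episodic Natural Actor-Critic-style update. It assumes a simplified setting not spelled out on this page: Gaussian exploration directly over the motor-primitive parameters (rather than per-time-step action noise), a hypothetical `rollout` function that executes the primitive and returns the episode's reward, and illustrative values for the learning rate and episode count. The critic step is the linear regression of returns on score features mentioned in the abstract; the regression coefficients give the natural gradient used by the actor.

```python
# Minimal sketch of an episodic Natural Actor-Critic update (assumptions:
# Gaussian exploration over the motor-primitive parameters; `rollout`,
# `alpha`, and `n_episodes` are hypothetical placeholders).
import numpy as np

def enac_update(theta, sigma, rollout, n_episodes=50, alpha=0.1):
    """One sketch of an eNAC-style policy improvement step.

    theta   : current motor-primitive parameters, shape (d,)
    sigma   : exploration standard deviation of the Gaussian policy
    rollout : maps sampled parameters to the return of one episode
    """
    d = len(theta)
    Psi = np.zeros((n_episodes, d + 1))  # score features + constant baseline
    R = np.zeros(n_episodes)
    for i in range(n_episodes):
        eps = sigma * np.random.randn(d)      # parameter-space exploration
        Psi[i, :d] = eps / sigma**2           # score of the Gaussian policy
        Psi[i, d] = 1.0                       # baseline feature
        R[i] = rollout(theta + eps)           # episodic return
    # Critic: linear regression of returns on score features; the first d
    # coefficients estimate the natural policy gradient, the last one a baseline.
    w, *_ = np.linalg.lstsq(Psi, R, rcond=None)
    # Actor: ascend along the estimated natural gradient.
    return theta + alpha * w[:d]
```

In the paper's setting, `theta` would parameterize a motor primitive (e.g., the shape parameters of a dynamical-systems policy with an attractor), and `rollout` would execute the primitive on the robot arm and return the episode's reward.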

Author(s): Peters, J. and Schaal, S.
Journal: Proceedings of the 15th European Symposium on Artificial Neural Networks (ESANN 2007)
Pages: 295-300
Year: 2007
Month: April
Publisher: D-Side
Bibtex Type: Conference Paper (inproceedings)
Address: Evere, Belgium
Event Name: 15th European Symposium on Artificial Neural Networks (ESANN 2007)
Event Place: Brugge, Belgium
Language: en
Organization: Max-Planck-Gesellschaft
School: Biologische Kybernetik

BibTeX

@inproceedings{4725,
  title = {Applying the Episodic Natural Actor-Critic Architecture to Motor Primitive Learning},
  booktitle = {Proceedings of the 15th European Symposium on Artificial Neural Networks (ESANN 2007)},
  abstract = {In this paper, we investigate motor primitive learning with the Natural Actor-Critic approach. The Natural Actor-Critic consists of actor updates, which are achieved using natural stochastic policy gradients, while the critic obtains the natural policy gradient by linear regression. We show that this architecture can be used to learn the “building blocks of movement generation”, called motor primitives. Motor primitives are parameterized control policies such as splines or nonlinear differential equations with desired attractor properties. We show that our most recent algorithm, the Episodic Natural Actor-Critic, outperforms previous algorithms by at least an order of magnitude. We demonstrate the efficiency of this reinforcement learning method in the application of learning to hit a baseball with an anthropomorphic robot arm.},
  pages = {295--300},
  publisher = {D-Side},
  organization = {Max-Planck-Gesellschaft},
  school = {Biologische Kybernetik},
  address = {Evere, Belgium},
  month = apr,
  year = {2007},
  slug = {4725},
  author = {Peters, J. and Schaal, S.},
  month_numeric = {4}
}