PILCO: A Model-Based and Data-Efficient Approach to Policy Search
File(s)
icml2011_final.pdf (766.88 KB)
Accepted version
Author(s)
Deisenroth, Marc P
Rasmussen, Carl E
Type
Conference Paper
Abstract
In this paper, we introduce PILCO, a practical, data-efficient model-based policy search method. PILCO reduces model bias, one of the key problems of model-based reinforcement learning, in a principled way. By learning a probabilistic dynamics model and explicitly incorporating model uncertainty into long-term planning, PILCO can cope with very little data and facilitates learning from scratch in only a few trials. Policy evaluation is performed in closed form using state-of-the-art approximate inference. Furthermore, policy gradients are computed analytically for policy improvement. We report unprecedented learning efficiency on challenging and high-dimensional control tasks.
Date Issued
2011-06
Citation
Proceedings of the International Conference on Machine Learning (ICML 2011), 2011
Publisher
IMLS
Journal / Book Title
Proceedings of the International Conference on Machine Learning (ICML 2011)
Copyright Statement
© 2011 The Authors
Description
04.07.13 KB. Ok to add accepted version to Spiral. Authors retain copyright.
Identifier
http://www.icml-2011.org/papers.php
Source
28th International Conference on Machine Learning (ICML 2011)
Source Place
Washington, USA
Notes
timestamp: 2011.01.20
Publisher URL
Start Date
2011-06-28
Finish Date
2011-07-02
Coverage Spatial
Washington, USA