PILCO: A Model-Based and Data-Efficient Approach to Policy Search

File | Description | Size | Format
icml2011_final.pdf | Accepted version | 766.88 kB | Adobe PDF
Title: PILCO: A Model-Based and Data-Efficient Approach to Policy Search
Author(s): Deisenroth, MP
Rasmussen, CE
Item Type: Conference Paper
Abstract: In this paper, we introduce PILCO, a practical, data-efficient model-based policy search method. PILCO reduces model bias, one of the key problems of model-based reinforcement learning, in a principled way. By learning a probabilistic dynamics model and explicitly incorporating model uncertainty into long-term planning, PILCO can cope with very little data and facilitates learning from scratch in only a few trials. Policy evaluation is performed in closed form using state-of-the-art approximate inference. Furthermore, policy gradients are computed analytically for policy improvement. We report unprecedented learning efficiency on challenging and high-dimensional control tasks. Copyright 2011 by the author(s)/owner(s).
Publication Date: 7-Oct-2011
URI: http://hdl.handle.net/10044/1/11585
Publisher: IMLS
Journal / Book Title: Proceedings of the International Conference on Machine Learning (ICML 2011)
Copyright Statement: © 2011 The Authors
Conference Name: 28th International Conference on Machine Learning (ICML 2011)
Conference Location: Washington, USA
Publisher URL: http://www.icml-2011.org/papers.php
Start Date: 2011-06-28
Finish Date: 2011-07-02
Appears in Collections: Computing


