Model-Based Reinforcement Learning with Continuous States and Actions

Files:
esann2008_errata.pdf: Supporting information, 90.58 kB, Adobe PDF
es2008-8.final.pdf: Published version, 817.43 kB, Adobe PDF
Title: Model-Based Reinforcement Learning with Continuous States and Actions
Author(s): Deisenroth, MP; Rasmussen, CE; Peters, J
Item Type: Conference Paper
Abstract: Finding an optimal policy in a reinforcement learning (RL) framework with continuous state and action spaces is challenging. Approximate solutions are often inevitable. GPDP is an approximate dynamic programming algorithm based on Gaussian process (GP) models for the value functions. In this paper, we extend GPDP to the case of unknown transition dynamics. After building a GP model for the transition dynamics, we apply GPDP to this model and determine a continuous-valued policy in the entire state space. We apply the resulting controller to the underpowered pendulum swing up. Moreover, we compare our results on this RL task to a nearly optimal discrete DP solution in a fully known environment.
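Note: As a rough illustration of the first step the abstract describes, namely learning a GP model of the unknown transition dynamics from observed transitions, the sketch below fits one Gaussian process per state dimension to (state, action) -> state-change data for a simulated pendulum. It uses scikit-learn rather than the authors' implementation, and the pendulum parameters, data-collection scheme, and kernel choice are illustrative assumptions, not details from the paper.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Hypothetical pendulum dynamics, used here only to generate training
# transitions; in the paper the dynamics are treated as unknown and learned.
def pendulum_step(theta, omega, u, dt=0.1, g=9.81, l=1.0, m=1.0):
    domega = (u - m * g * l * np.sin(theta)) / (m * l ** 2)
    omega_next = omega + dt * domega
    theta_next = theta + dt * omega_next
    return theta_next, omega_next

# Collect random transitions: state = [theta, omega], action = torque u.
rng = np.random.default_rng(0)
X, Y = [], []
for _ in range(200):
    theta = rng.uniform(-np.pi, np.pi)
    omega = rng.uniform(-5.0, 5.0)
    u = rng.uniform(-2.0, 2.0)
    theta_n, omega_n = pendulum_step(theta, omega, u)
    X.append([theta, omega, u])
    # Model the change in state, a common choice for GP dynamics models.
    Y.append([theta_n - theta, omega_n - omega])
X, Y = np.array(X), np.array(Y)

# One GP per output dimension, with an anisotropic RBF kernel plus noise.
kernel = RBF(length_scale=[1.0, 1.0, 1.0]) + WhiteKernel(noise_level=0.01)
models = [GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, Y[:, d])
          for d in range(Y.shape[1])]

# Predictive mean and uncertainty of the state change for a query (state, action);
# GPDP would use such a learned model inside dynamic programming.
x_query = np.array([[0.5, 0.0, 1.0]])
for d, gp in enumerate(models):
    mean, std = gp.predict(x_query, return_std=True)
    print(f"state dimension {d}: delta = {mean[0]:.3f} +/- {std[0]:.3f}")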
Publication Date: 1-Dec-2008
URI: http://hdl.handle.net/10044/1/12220
ISBN: 2-930307-08-0
Start Page: 19
End Page: 24
Journal / Book Title: Proceedings of the 16th European Symposium on Artificial Neural Networks (ESANN 2008)
Copyright Statement: © 2008 ESANN
Conference Name: ESANN 2008
Place of Publication: Bruges, Belgium
Start Date: 2008-04-23
Finish Date: 2008-04-25
Conference Place: Bruges, Belgium
Appears in Collections: Computing
