Probability redistribution using time hopping for reinforcement learning
File(s)
Kormushev_ISIS-2009.pdf (387.93 KB)
Accepted version
Author(s)
Kormushev, Petar
Dong, Fangyan
Hirota, Kaoru
Type
Conference Paper
Abstract
A method for using the Time Hopping technique as a tool for probability redistribution is proposed. Applied to reinforcement learning in a simulation, it is able to re-shape the state probability distribution of the underlying Markov decision process as desired. This is achieved by modifying the target selection strategy of Time Hopping appropriately. Experiments with a robot maze reinforcement learning problem show that the method improves the exploration efficiency by re-shaping the state probability distribution to an almost uniform distribution.
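The idea of re-shaping the state distribution through target selection can be illustrated with a toy sketch. This is not the authors' algorithm or their maze experiment: the corridor environment, the "hop to the least-visited state" rule, and the names `simulate`, `hop_every`, and `distance_from_uniform` are all assumptions made for illustration. It also assumes the simulator can be set to any state directly.

```python
import random


def simulate(n_states=20, n_steps=20000, hop_every=None, seed=0):
    """Biased random walk on a corridor, optionally with periodic hops.

    Returns the empirical state visit distribution as a list of frequencies.
    """
    rng = random.Random(seed)
    visits = [0] * n_states
    state = 0
    for t in range(1, n_steps + 1):
        visits[state] += 1
        if hop_every and t % hop_every == 0:
            # Hypothetical target selection: jump to the least-visited state.
            state = min(range(n_states), key=visits.__getitem__)
        else:
            # Drift toward state 0: left with prob. 0.7, right with prob. 0.3.
            step = -1 if rng.random() < 0.7 else 1
            state = min(max(state + step, 0), n_states - 1)
    return [v / n_steps for v in visits]


def distance_from_uniform(dist):
    """Total variation distance between dist and the uniform distribution."""
    u = 1.0 / len(dist)
    return 0.5 * sum(abs(p - u) for p in dist)


plain = simulate()              # no hopping: visits pile up near state 0
hopped = simulate(hop_every=5)  # hop every 5 steps to the least-visited state
print("plain :", round(distance_from_uniform(plain), 3))
print("hopped:", round(distance_from_uniform(hopped), 3))
```

In this sketch the plain walk concentrates almost all visits near the start, while the hopping variant spreads them across the corridor, so its distance from the uniform distribution is smaller. Choosing a different target selection rule would push the visit distribution toward a different shape.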
Date Issued
2009
Date Acceptance
2009-08-17
Citation
Proc. 10th International Symposium on Advanced Intelligent Systems, ISIS 2009, 2009
Journal / Book Title
Proc. 10th International Symposium on Advanced Intelligent Systems, ISIS 2009
Copyright Statement
© 2009 The Authors
Source
10th International Symposium on Advanced Intelligent Systems, ISIS 2009
Publication Status
Published
Start Date
2009-08-17
Finish Date
2009-08-19
Coverage Spatial
Busan, Korea