Action branching architectures for deep reinforcement learning
File(s)
Tavakoli_AAAI-2018[1].pdf (3.54 MB)
Accepted version
OA Location
http://kormushev.com/research/publications/
Author(s)
Tavakoli, Arash
Pardo, Fabio
Kormushev, Petar
Type
Conference Paper
Abstract
Discrete-action algorithms have been central to numerous recent successes of deep reinforcement learning. However, applying these algorithms to high-dimensional action tasks requires tackling the combinatorial increase of the number of possible actions with the number of action dimensions. This problem is further exacerbated for continuous-action tasks that require fine control of actions via discretization. In this paper, we propose a novel neural architecture featuring a shared decision module followed by several network branches, one for each action dimension. This approach achieves a linear increase of the number of network outputs with the number of degrees of freedom by allowing a level of independence for each individual action dimension. To illustrate the approach, we present a novel agent, called Branching Dueling Q-Network (BDQ), as a branching variant of the Dueling Double Deep Q-Network (Dueling DDQN). We evaluate the performance of our agent on a set of challenging continuous control tasks. The empirical results show that the proposed agent scales gracefully to environments with increasing action dimensionality and indicate the significance of the shared decision module in coordination of the distributed action branches. Furthermore, we show that the proposed agent performs competitively against a state-of-the-art continuous control algorithm, Deep Deterministic Policy Gradient (DDPG).
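The branching idea in the abstract can be sketched as a minimal NumPy forward pass: a shared decision module feeds a shared state value and one advantage branch per action dimension, combined dueling-style. This is an illustrative sketch with randomly initialised weights; the dimensions (`state_dim`, `n_branches`, `n_bins`) and helper names are invented for the example and this is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(in_dim, out_dim):
    # A random affine layer; illustrative only (no training here).
    return rng.normal(0.0, 0.1, (in_dim, out_dim)), np.zeros(out_dim)

def forward(x, layer):
    W, b = layer
    return x @ W + b

state_dim, hidden = 8, 32
n_branches, n_bins = 3, 5   # 3 action dimensions, 5 discrete sub-actions each

shared = linear(state_dim, hidden)                   # shared decision module
value_head = linear(hidden, 1)                       # shared state value (dueling)
branch_heads = [linear(hidden, n_bins) for _ in range(n_branches)]

def q_values(state):
    h = np.tanh(forward(state, shared))
    v = forward(h, value_head)                       # scalar state value
    qs = []
    for head in branch_heads:
        adv = forward(h, head)                       # per-branch advantages
        qs.append(v + adv - adv.mean(keepdims=True)) # dueling aggregation per branch
    return qs                                        # one Q-vector per action dimension

state = rng.normal(size=state_dim)
qs = q_values(state)
# Greedy sub-action chosen independently in each branch.
action = [int(np.argmax(q)) for q in qs]
```

Note the scaling claimed in the abstract: the network has `n_branches * n_bins = 15` Q-outputs here, versus `n_bins ** n_branches = 125` for a monolithic discrete-action head over the joint action space.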
Date Issued
2018-02-02
Date Acceptance
2017-11-09
Citation
Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI 2018), 2018
URI
http://hdl.handle.net/10044/1/60671
URL
http://kormushev.com/papers/Tavakoli_AAAI-2018.pdf
Publisher
AAAI
Journal / Book Title
Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI 2018)
Copyright Statement
© 2018, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
Identifier
http://kormushev.com/papers/Tavakoli_AAAI-2018.pdf
Source
AAAI 2018
Subjects
cs.LG
cs.AI
Publication Status
Published
Start Date
2018-02-02
Finish Date
2018-02-07
Coverage Spatial
New Orleans, LA, USA