Argumentation accelerated reinforcement learning

File Description SizeFormat 
Gao-Y-2015-PhD-Thesis.pdfThesis3.65 MBAdobe PDFDownload
Title: Argumentation accelerated reinforcement learning
Author(s): Gao, Yang
Item Type: Thesis or dissertation
Abstract: Reinforcement Learning (RL) is a popular statistical Artificial Intelligence (AI) technique for building autonomous agents, but it suffers from the curse of dimensionality: the computational requirement for obtaining the optimal policies grows exponentially with the size of the state space. Integrating heuristics into RL has proven to be an effective approach to combat this curse, but deriving high-quality heuristics from people’s (typically conflicting) domain knowledge is challenging, yet it received little research attention. Argumentation theory is a logic-based AI technique well-known for its conflict resolution capability and intuitive appeal. In this thesis, we investigate the integration of argumentation frameworks into RL algorithms, so as to improve the convergence speed of RL algorithms. In particular, we propose a variant of Value-based Argumentation Framework (VAF) to represent domain knowledge and to derive heuristics from this knowledge. We prove that the heuristics derived from this framework can effectively instruct individual learning agents as well as multiple cooperative learning agents. In addition,we propose the Argumentation Accelerated RL (AARL) framework to integrate these heuristics into different RL algorithms via Potential Based Reward Shaping (PBRS) techniques: we use classical PBRS techniques for flat RL (e.g. SARSA(λ)) based AARL, and propose a novel PBRS technique for MAXQ-0, a hierarchical RL (HRL) algorithm, so as to implement HRL based AARL. We empirically test two AARL implementations — SARSA(λ)-based AARL and MAXQ-based AARL — in multiple application domains, including single-agent and multi-agent learning problems. Empirical results indicate that AARL can improve the convergence speed of RL, and can also be easily used by people that have little background in Argumentation and RL.
Content Version: Open Access
Publication Date: Dec-2014
Date Awarded: Apr-2015
URI: http://hdl.handle.net/10044/1/26603
Advisor: Toni, Francesca
Sponsor/Funder: China Scholarship Council
Funder's Grant Number: 2011629167
Department: Computing
Publisher: Imperial College London
Qualification Level: Doctoral
Qualification Name: Doctor of Philosophy (PhD)
Appears in Collections:Computing PhD theses



Items in Spiral are protected by copyright, with all rights reserved, unless otherwise indicated.

Creative Commons