Argumentation accelerated reinforcement learning
| File | Description | Size | Format | |
| --- | --- | --- | --- | --- |
| Gao-Y-2015-PhD-Thesis.pdf | Thesis | 3.65 MB | Adobe PDF | View/Open |
Title: Argumentation accelerated reinforcement learning

Authors: Gao, Yang

Item Type: Thesis or dissertation

Abstract: Reinforcement Learning (RL) is a popular statistical Artificial Intelligence (AI) technique for building autonomous agents, but it suffers from the curse of dimensionality: the computational requirement for obtaining the optimal policies grows exponentially with the size of the state space. Integrating heuristics into RL has proven to be an effective approach to combat this curse, but deriving high-quality heuristics from people's (typically conflicting) domain knowledge is challenging, yet has received little research attention. Argumentation theory is a logic-based AI technique well known for its conflict-resolution capability and intuitive appeal. In this thesis, we investigate the integration of argumentation frameworks into RL algorithms, so as to improve the convergence speed of RL algorithms. In particular, we propose a variant of the Value-based Argumentation Framework (VAF) to represent domain knowledge and to derive heuristics from this knowledge. We prove that the heuristics derived from this framework can effectively instruct individual learning agents as well as multiple cooperative learning agents. In addition, we propose the Argumentation Accelerated RL (AARL) framework to integrate these heuristics into different RL algorithms via Potential-Based Reward Shaping (PBRS) techniques: we use classical PBRS techniques for flat RL (e.g. SARSA(λ)) based AARL, and propose a novel PBRS technique for MAXQ-0, a hierarchical RL (HRL) algorithm, so as to implement HRL-based AARL. We empirically test two AARL implementations — SARSA(λ)-based AARL and MAXQ-based AARL — in multiple application domains, including single-agent and multi-agent learning problems. Empirical results indicate that AARL can improve the convergence speed of RL, and can also be easily used by people who have little background in Argumentation and RL.

Content Version: Open Access

Issue Date: Dec-2014

Date Awarded: Apr-2015

URI: http://hdl.handle.net/10044/1/26603

DOI: https://doi.org/10.25560/26603

Supervisor: Toni, Francesca

Sponsor/Funder: China Scholarship Council

Funder's Grant Number: 2011629167

Department: Computing

Publisher: Imperial College London

Qualification Level: Doctoral

Qualification Name: Doctor of Philosophy (PhD)
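The abstract's central mechanism — injecting heuristics into RL via Potential-Based Reward Shaping — can be illustrated with a minimal sketch. This is not the thesis's implementation; it only shows the standard PBRS form F(s, s') = γ·Φ(s') − Φ(s), which is known to preserve optimal policies. The potential function `phi` is a hypothetical stand-in for the heuristic that the thesis would derive from an argumentation framework.

```python
def shaped_reward(r, s, s_next, phi, gamma=0.99):
    """Return the environment reward r augmented with the
    potential-based shaping term F(s, s') = gamma * phi(s') - phi(s).

    phi is a heuristic potential over states; in AARL it would be
    derived from a (value-based) argumentation framework, but here it
    is just an arbitrary user-supplied function.
    """
    return r + gamma * phi(s_next) - phi(s)


# Example: a toy potential that prefers states closer to a goal state.
# (Illustrative only; the state encoding and goal are assumptions.)
goal = 10
phi = lambda s: -abs(goal - s)

# Moving from state 7 to state 8 with base reward 0.0 yields a
# positive shaped reward, nudging the learner toward the goal.
r_shaped = shaped_reward(0.0, 7, 8, phi, gamma=1.0)
```

Because the shaping term is a difference of potentials, it cancels telescopically along any trajectory, which is why convergence can speed up without changing which policy is optimal.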