Systematic generalisation in deep reinforcement learning
File(s)
Author(s)
Gonzalez Leon, Borja
Type
Thesis or dissertation
Abstract
One of the main goals in artificial intelligence (AI) is to build autonomous agents that correctly execute complex instructions given by a human user in a wide variety of settings. A promising line of work towards such objective is known as grounded language learning, in which autonomous agents are trained to not only parse language-based instructions, but to carry out those instructions in an environment, either physical or digital. A dominant paradigm for this line of work is deep reinforcement learning (DRL), an approach that blends the fields of reinforcement learning and artificial neural networks (NNs).
However, despite their promise, DRL solutions remain fragile. A major challenge in creating robust and efficient agents for grounded language learning is achieving systematic generalisation: the ability to comprehend and generate new combinations of familiar elements. Systematic generalisation in a grounded setting is a recurrent research topic, and even the most advanced contemporary AI systems, large language models, struggle to achieve it.
In this thesis, we first present theoretical frameworks for combining temporal logic, a formal language suited for grounded language learning, with DRL. Then, we present two empirical contributions towards systematic generalisation in DRL. The first concerns the impact of visual encoders when learning abstract operators, such as negation, in a systematic fashion. The second contribution is a set of novel neural-network architectures that enable more robust systematic generalisation to unseen formal instructions in out-of-distribution (OOD) scenarios.
Lastly, we introduce the first DRL framework that finds diverse competitive policies in complex settings such as realistic car-racing simulators. This framework is capable of effective driving while fulfilling given constraints, e.g. driving at high speeds while not being aggressive when overtaking. Moreover, the presented solution yields robust agents capable of generalising to OOD requirements and physics.
Version
Open Access
Date Issued
2024-03
Date Awarded
2024-11
Copyright Statement
Creative Commons Attribution NonCommercial Licence
Advisor
Belardinelli, Francesco
Shanahan, Murray
Publisher Department
Computing
Publisher Institution
Imperial College London
Qualification Level
Doctoral
Qualification Name
Doctor of Philosophy (PhD)