Understanding learning from non-stationary data
Author(s)
Berariu, Tudor
Type
Thesis or dissertation
Abstract
Deep neural networks are ubiquitous, powering predictions for numerous AI applications. These models are trained on ever-growing data collections, making the old paradigm of optimising neural networks from scratch on a fixed data set inefficient and obsolete. As proponents of continual learning advocate, these networks need to absorb knowledge efficiently in stages. We must therefore assume that the training data changes during learning, and that models need to adapt quickly to the corresponding shifts in the optimisation objective. In this thesis, I discuss the challenges that come with this expectation.
This thesis brings together three contributions on this topic. First, we study a surprising fact in supervised learning: pre-training negatively affects the generalisation of fine-tuned models compared to those trained from scratch. We examine this "generalisation gap", discuss the conditions under which it manifests, and put forward a hypothesis for why it happens. One antidote to this generalisation gap is partially resetting parameters during training. The second contribution of the thesis therefore focuses on this regularisation technique, asking how it interacts with more commonly used regularisers and why state-of-the-art setups do not use re-initialisations. For the third contribution, we turn our attention to reinforcement learning, a scenario where the training data distribution shifts continuously because the policy interacting with the environment keeps changing. We propose a technique based on spectral normalisation and demonstrate its efficacy for value-based deep reinforcement learning agents on the Atari benchmark.
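To make the two interventions named above concrete, here is a minimal PyTorch sketch, not drawn from the thesis itself: the `QNetwork` architecture, layer sizes, and 20% reset fraction are illustrative assumptions, and the spectral normalisation shown is PyTorch's stock parametrization rather than the exact variant evaluated on Atari.

```python
import torch
import torch.nn as nn
from torch.nn.utils.parametrizations import spectral_norm


class QNetwork(nn.Module):
    """Toy value network; the agents studied in the thesis are Atari-scale."""

    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU())
        # Spectral normalisation caps the layer's largest singular value,
        # limiting how sharply the estimated values can change between updates.
        self.head = spectral_norm(nn.Linear(256, n_actions))

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(obs))


def partial_reset(module: nn.Module, fraction: float = 0.2) -> None:
    """Re-initialise a random subset of weights in each plain Linear layer."""
    for layer in module.modules():
        if isinstance(layer, nn.Linear):
            fresh = torch.empty_like(layer.weight)
            nn.init.kaiming_uniform_(fresh, a=5 ** 0.5)  # PyTorch's default Linear init
            mask = torch.rand_like(layer.weight) < fraction
            with torch.no_grad():
                layer.weight[mask] = fresh[mask]


net = QNetwork(obs_dim=8, n_actions=4)
# Reset only the trunk: the head's weight is managed by the spectral-norm
# parametrization and should not be overwritten in place.
partial_reset(net.features, fraction=0.2)
```

The intuition behind both interventions is similar in spirit: resetting only a fraction of the weights preserves most of the learned solution while restoring plasticity, and bounding the head's spectral norm smooths the value function as the data distribution drifts.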
Hopefully, this thesis offers some useful observations for an era of perpetually learning agents and fine-tuned large foundation models.
Version
Open Access
Date Issued
2023-07
Date Awarded
2024-03
Copyright Statement
Creative Commons Attribution NonCommercial Licence
Advisor
Clopath, Claudia
Pascanu, Razvan
Publisher Department
Bioengineering
Publisher Institution
Imperial College London
Qualification Level
Doctoral
Qualification Name
Doctor of Philosophy (PhD)