Continual machine learning for non-stationary data analysis

File: Li-H-2022-PhD-Thesis.pdf
Description: Thesis
Size: 9.05 MB
Format: Adobe PDF
Title: Continual machine learning for non-stationary data analysis
Authors: Li, Honglin
Item Type: Thesis or dissertation
Abstract: Although deep learning models have achieved significant success in various fields, most of them have limited capacity to learn multiple tasks sequentially. In continual learning, the tendency to forget previously learned tasks is known as catastrophic forgetting or catastrophic interference. When the input data or the goal of learning changes, a conventional machine learning model will learn and adapt to the new state. However, the model will not remember or recognise any revisits to previous states. This causes performance reductions and re-training curves when dealing with periodic or irregularly recurring changes in the data or goals. Without continual learning ability, an adaptive machine learning model cannot be deployed in a changing environment. This thesis investigates continual learning and the mitigation of catastrophic forgetting in neural networks. We assume that non-stationary data contain multiple different tasks that arrive in sequence and are not stored. We propose a regularisation method that identifies the parameters important to previous tasks and penalises changes to them while learning a new task. However, when the number of tasks is sufficiently large, this method cannot preserve all the previously learned knowledge, or it impedes the integration of new knowledge. This is known as the stability-plasticity dilemma. To address this problem, we propose a replay method based on Generative Adversarial Networks (GANs). Unlike other replay methods, the proposed model is not bounded by the fitting capacity of the generator. However, the number of parameters increases rapidly as the number of learned tasks grows. Therefore, we propose a continual learning model based on Bayesian neural networks and a Mixture of Experts (MoE) framework. The proposed model integrates experts responsible for different tasks into a single large model. Previously learned knowledge is preserved, and new tasks can be learned efficiently by assigning new experts. However, the performance obtained with Monte-Carlo sampling is not satisfactory. To address this issue, we propose a Probabilistic Neural Network (PNN) and integrate it with a conventional neural network. The PNN produces a likelihood for a given input and can be used in a variety of fields. To apply continual learning methods to real-world applications, we then propose a semi-supervised learning model to analyse healthcare datasets. The proposed framework extracts general features from unlabelled data. We integrate the PNN into the framework to classify the data, which include a smaller set of labelled samples, and to continually learn new cases. The proposed model has been tested on benchmark datasets and on a real-world clinical dataset. The results show that the proposed model outperforms state-of-the-art models in overall continual learning accuracy, without requiring prior knowledge of the tasks. The experiments on the real-world clinical data were designed to identify the risk of Urinary Tract Infections (UTIs) using in-home monitoring data. The UTI risk analysis model has been deployed in a digital platform and is currently part of the ongoing Minder clinical study at the UK Dementia Research Institute (UK DRI). An earlier version of the model was deployed as part of a Class-I CE marked medical device. The UK DRI Minder platform and the deployed machine learning models, including the UTI risk analysis model developed in this research, are in the process of being accredited as a Class-IIa medical device. Overall, this PhD research tackles theoretical and applied challenges of continual learning models in dealing with real-world data. We evaluate the proposed continual learning methods on a variety of benchmarks with comprehensive analysis and show their effectiveness. Furthermore, we have applied the proposed methods in real-world applications and demonstrated the applicability of the models to real-world settings and clinical problems.
Content Version: Open Access
Issue Date: Apr-2022
Date Awarded: Sep-2022
URI: http://hdl.handle.net/10044/1/100201
DOI: https://doi.org/10.25560/100201
Copyright Statement: Creative Commons Attribution NoDerivatives Licence
Supervisor: Barnaghi, Payam; Sharp, David
Department: Brain Sciences
Publisher: Imperial College London
Qualification Level: Doctoral
Qualification Name: Doctor of Philosophy (PhD)
Appears in Collections: Department of Brain Sciences PhD Theses
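
Illustrative note: the regularisation idea summarised in the abstract (penalising changes to parameters that were important for earlier tasks) can be sketched roughly as a weighted quadratic penalty added to the new-task loss. This is a generic sketch and not the thesis's actual formulation. The function name, the `strength` coefficient, and the assumption that per-parameter importance weights are already available are all hypothetical; the abstract does not say how importance is measured.

```python
from typing import Sequence


def regularised_loss(
    task_loss: float,
    params: Sequence[float],
    old_params: Sequence[float],
    importance: Sequence[float],
    strength: float = 1.0,
) -> float:
    """Return the new-task loss plus a weighted quadratic penalty on drift.

    Hypothetical helper: `importance` holds one non-negative weight per
    parameter and is assumed to be given (how it is estimated is not
    specified in the abstract).
    """
    if not (len(params) == len(old_params) == len(importance)):
        raise ValueError("params, old_params and importance must match in length")
    # Each parameter's squared change is scaled by its importance weight.
    penalty = sum(
        w * (p - q) ** 2 for w, p, q in zip(importance, params, old_params)
    )
    return task_loss + 0.5 * strength * penalty


# Toy comparison: shift each parameter by the same amount in turn.
old = [1.0, 1.0]
importance = [10.0, 0.1]  # first parameter assumed important to an old task
cost_important = regularised_loss(0.0, [1.5, 1.0], old, importance)
cost_unimportant = regularised_loss(0.0, [1.0, 1.5], old, importance)
print(cost_important > cost_unimportant)
```

In this toy setting, moving the heavily weighted parameter is penalised more than moving the lightly weighted one by the same amount, which is the behaviour the abstract describes. A large `strength` favours stability and a small one favours plasticity.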



This item is licensed under a Creative Commons License.