Analysing datafied life

File: Yang-X-2016-PhD-Thesis.pdf
Description: Thesis
Size: 26.72 MB
Format: Adobe PDF
Title: Analysing datafied life
Author(s): Yang, Xian
Item Type: Thesis or dissertation
Abstract: Our lives are increasingly quantified by data. To obtain information from quantitative data, we need to develop various analysis methods, which can be drawn from diverse fields such as computer science, information theory and statistics. This thesis investigates methods for analysing data generated for medical research, with the aim of using diverse data to quantify patients for personalized treatment. From the perspective of data type, this thesis proposes analysis methods for data from the fields of bioinformatics and medical imaging. We discuss the need to use data from the molecular level to the pathway level and to incorporate medical imaging data. Different data types require different preprocessing methods, while some post-processing steps, such as classification and network analysis, can be handled by a generalized approach. From the perspective of research questions, this thesis studies methods for answering five typical questions, ordered from simple to complex: detecting associations, identifying groups, constructing classifiers, deriving connectivity and building dynamic models. Each research question is studied in a specific field. For example, detecting associations is investigated for fMRI signals. However, the proposed methods can be naturally extended to solve questions in other fields. This thesis demonstrates that applying a method traditionally used in one field to a new field can bring many new insights. The thesis makes five main research contributions, addressing different research questions. First, to detect active brain regions associated with tasks using fMRI signals, a new significance index, the CR-value, has been proposed. It originates from the idea of using sparse modelling in gene association studies. Secondly, in quantitative proteomics analysis, a clustering-based method has been developed to extract more information from large-scale datasets than traditional methods. Clustering methods, which are usually used to find subgroups of samples or features, are used here to match similar identities across samples. Thirdly, a pipeline originally proposed in the field of bioinformatics has been adapted to multivariate analysis of fMRI signals. Fourthly, the concept of elastic computing in computer science has been used to develop a new method for generating functional connectivity from fMRI data. Finally, sparse signal recovery methods from the domain of signal processing are suggested to solve the underdetermined problem of network model inference.
Content Version: Open Access
Publication Date: Sep-2015
Date Awarded: Jun-2016
Advisor: Guo, Yike
Department: Computing
Publisher: Imperial College London
Qualification Level: Doctoral
Qualification Name: Doctor of Philosophy (PhD)
Appears in Collections: Computing PhD theses

Items in Spiral are protected by copyright, with all rights reserved, unless otherwise indicated.
