Peak selection in metabolic profiles using functional data analysis
File(s)
Doehring-O-2013-PhD-Thesis.pdf (9.28 MB)
Author(s)
Doehring, Orlando
Type
Thesis
Abstract
In this thesis we describe sparse principal component analysis (PCA) methods and apply
them to the analysis of short multivariate time series in order to perform both dimensionality
reduction and variable selection. We take a functional data analysis (FDA) modelling
approach in which each time series is treated as a continuous smooth function of time or
curve.
These techniques have been applied to analyse time series data arising in the area
of metabonomics. Metabonomics studies chemical processes involving small molecule
metabolites in a cell. We use experimental data obtained from the COnsortium for MEtabonomic
Toxicology (COMET) project which is formed by six pharmaceutical companies and
Imperial College London, UK. In the COMET project, repeated measurements of several
metabolites were collected over time from rats subjected to different drug
treatments. The aim of our study is to detect important metabolites by analysing the multivariate
time series.
Multivariate functional PCA is an exploratory technique to describe the observed time
series. In its standard form, PCA involves linear combinations of all variables (i.e. metabolite
peaks) and does not perform variable selection. In order to select a subset of important
metabolites we introduce sparsity into the model. We develop a novel functional Sparse
Grouped Principal Component Analysis (SGPCA) algorithm using ideas related to Least
Absolute Shrinkage and Selection Operator (LASSO), a regularized regression technique,
with grouped variables. This SGPCA algorithm detects a sparse linear combination of
metabolites which explains a large proportion of the variance. Apart from SGPCA, we also
propose two alternative approaches for metabolite selection. The first one is based on
thresholding the multivariate functional PCA solution, while the second method computes
the variance of each metabolite curve independently and then proceeds to rank these curves
in decreasing order of importance. To the best of our knowledge, this is the first application
of sparse functional PCA methods to the problem of modelling multivariate metabonomic
time series data and selecting a subset of metabolite peaks.
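The two simpler selection approaches mentioned above (ranking metabolites by curve variance, and thresholding a PCA solution) can be illustrated with a minimal NumPy sketch. This is not the thesis's SGPCA algorithm or its code. The function names, the array layout (subjects x metabolites x time points), and the use of raw time points in place of smoothed functional representations are all assumptions made for illustration.

```python
import numpy as np


def rank_by_curve_variance(X):
    """Rank metabolites by the variance of their curves.

    X is assumed to have shape (n_subjects, n_metabolites, n_timepoints).
    Returns metabolite indices in decreasing order of variance, and the scores.
    """
    # Variance across subjects at each time point, summed over time,
    # gives one score per metabolite.
    scores = X.var(axis=0, ddof=1).sum(axis=1)
    order = np.argsort(scores)[::-1]
    return order, scores


def threshold_first_component(X, n_keep):
    """Keep the n_keep metabolites with the largest first-PC loading norms.

    Each metabolite owns a group of loadings (one per time point); the
    Euclidean norm of that group is used as its importance. This is a
    hypothetical stand-in for thresholding a multivariate PCA solution.
    """
    n, p, t = X.shape
    flat = X.reshape(n, p * t)
    flat = flat - flat.mean(axis=0)  # centre each variable
    # Rows of vt are principal axes; vt[0] holds the first component loadings.
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    loadings = vt[0].reshape(p, t)
    group_norms = np.linalg.norm(loadings, axis=1)
    keep = np.argsort(group_norms)[::-1][:n_keep]
    return np.sort(keep), group_norms
```

Both functions return metabolite indices, so their selections can be compared directly. A grouped LASSO-type penalty, as used in SGPCA, would instead shrink whole loading groups to zero during estimation rather than after it.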
We present comprehensive experimental results using simulated data and COMET project
data for different multivariate and functional PCA variants from the literature and for
SGPCA. Simulation results show that the SGPCA algorithm recovers a high proportion
of truly important metabolite variables. Furthermore, in the case of SGPCA applied to the
COMET dataset we identify a small number of important metabolites independently for
two different treatment conditions. A comparison of selected metabolites in both treatment
conditions reveals that there is an overlap of over 75 percent.
Date Issued
2013-02
Date Awarded
2013-03
URI
http://hdl.handle.net/10044/1/11062
DOI
https://doi.org/10.25560/11062
Copyright Statement
Attribution NoDerivatives 4.0 International Licence (CC BY-ND)
License URL
Attribution-NonCommercial-NoDerivatives 4.0 International
Advisor
Montana, Giovanni
Sponsor
Engineering and Physical Sciences Research Council (EPSRC)
Publisher Department
Mathematics
Publisher Institution
Imperial College London
Qualification Level
Doctoral
Qualification Name
Doctor of Philosophy (PhD)