|Abstract: ||All living organisms exhibit complex behaviour, which results from the underlying regulatory mechanisms operating at the cellular and molecular levels. For this reason, such mechanisms are of central importance in the field of systems biology. Throughout this thesis we are concerned with mathematical models that allow us to better understand and represent the biological phenomena behind experimental data, and equally to make predictions about key regulatory processes happening in cells. Specifically, this work explores and demonstrates how modern Bayesian nonparametric techniques, namely Gaussian process regression and Dirichlet process mixture models, can be applied to model complex systems biology data.
Here we have developed a new technique based on Gaussian process regression to model metabolic regulatory processes at the cellular level. Our technique allows us to model noisy metabolite time course data and to predict dynamical metabolic flux behaviour in the associated pathways; we demonstrate that by learning the dependencies between several metabolites we can strengthen our predictions in sparsely sampled regions. We further discuss when Gaussian processes can accurately reconstruct the underlying functions and when they are subject to the Nyquist limit.
Next we proceed to modelling biological processes that occur at the molecular level. Here we are interested in studying large and diverse functional genomics datasets. A variety of computational techniques allow us to analyse such data and model the biological processes underlying them; an important class of these methods comprises techniques that permit the detection of heterogeneity in experimentally observed data. Here we employ Dirichlet processes to estimate the number of clusters within such genomic datasets and further propose a new method to tackle the data fusion problem. Our technique relies primarily on the outcomes of nonparametric Bayesian clustering approaches and is based on graph theory concepts; in parallel, we also discuss and show how this graph-theoretical approach can be extended to integrate results from non-Bayesian clustering algorithms. We show that by integrating several data types we can successfully identify, for example, sets of genes that are regulated by similar transcription factors.|