Approximate inference of gene regulatory network models from RNA-Seq time series data
File(s)
s12859-018-2125-2.pdf (1.18 MB)
Published version
Author(s)
Thorne, Thomas
Type
Journal Article
Abstract
Background
Inference of gene regulatory network structures from RNA-Seq data is challenging due to the nature of the data, as measurements take the form of counts of reads mapped to a given gene. Here we present a model for RNA-Seq time series data that applies a negative binomial distribution for the observations, and uses sparse regression with a horseshoe prior to learn a dynamic Bayesian network of interactions between genes. We use a variational inference scheme to learn approximate posterior distributions for the model parameters.
Results
The methodology is benchmarked on synthetic data designed to replicate the distribution of real-world RNA-Seq data. We compare our method to other sparse regression approaches and find improved performance in learning directed networks. We demonstrate an application of our method to a publicly available human neuronal stem cell differentiation RNA-Seq time series data set to infer the underlying network structure.
Conclusions
Our method improves performance on synthetic data by explicitly modelling the statistical distribution of the data when learning networks from RNA-Seq time series. By applying approximate inference techniques, we can learn network structures quickly with only moderate computing resources.
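The ingredients named in the abstract (negative binomial observations and a horseshoe prior on sparse regression weights linking consecutive time points) can be illustrated with a small generative sketch. This is not the article's implementation: the log link, the lag-one regression on log-transformed counts, the fixed dispersion `r`, the fixed global scale `tau`, the baseline mean of 50 and all function names are assumptions made for illustration, and the variational inference step is not shown.

```python
import math
import numpy as np


def nb_logpmf(y, mu, r):
    """Log pmf of a negative binomial with mean mu and dispersion r
    (variance mu + mu**2 / r)."""
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))


def sample_horseshoe(rng, shape, tau=0.1):
    """Draw weights from a horseshoe prior: w ~ N(0, (tau * lam)**2),
    with local scales lam ~ HalfCauchy(0, 1). tau is fixed here (assumption)."""
    lam = np.abs(rng.standard_cauchy(shape))  # half-Cauchy local scales
    return rng.normal(0.0, 1.0, shape) * lam * tau


def simulate_counts(n_genes=5, n_steps=20, r=2.0, seed=0):
    """Simulate a count time series in which each gene's mean at time t
    depends on log-transformed counts of all genes at time t - 1."""
    rng = np.random.default_rng(seed)
    W = sample_horseshoe(rng, (n_genes, n_genes))  # hypothetical network weights
    base = 50.0                                    # assumed baseline mean count
    b = math.log(base)
    y = np.empty((n_steps, n_genes), dtype=np.int64)
    y[0] = rng.negative_binomial(r, r / (r + base), size=n_genes)
    for t in range(1, n_steps):
        x = np.log1p(y[t - 1]) - b                 # centred lagged predictors
        # Clip the log mean so heavy-tailed weights cannot overflow exp().
        mu = np.exp(np.clip(b + W @ x, -5.0, 10.0))
        # NumPy parameterises by (n, p); p = r / (r + mu) gives mean mu.
        y[t] = rng.negative_binomial(r, r / (r + mu))
    return y, W


if __name__ == "__main__":
    counts, weights = simulate_counts()
    print(counts[:3])
    print(np.round(weights, 2))
```

Most sampled weights are close to zero and a few are large, which is the sparsity behaviour a horseshoe prior is intended to encourage; the non-zero entries of `W` play the role of directed edges between genes.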
Date Issued
2018-04-11
Date Acceptance
2018-03-22
Citation
BMC BIOINFORMATICS, 2018, 19 (1)
ISSN
1471-2105
Publisher
BMC
Journal / Book Title
BMC BIOINFORMATICS
Volume
19
Issue
1
Copyright Statement
© 2018 The Author(s). Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Identifier
http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000431023300002&DestLinkType=FullRecord&DestApp=ALL_WOS&UsrCustomerID=1ba7043ffcc86c417c072aa74d649202
Subjects
Science & Technology
Life Sciences & Biomedicine
Biochemical Research Methods
Biotechnology & Applied Microbiology
Mathematical & Computational Biology
Biochemistry & Molecular Biology
BAYESIAN NETWORKS
DIFFERENTIATION
REGULARIZATION
EXPRESSION
PACKAGE
CDO
Publication Status
Published
Article Number
ARTN 127
Date Publish Online
2018-04-11