PhenoRank: reducing study bias in gene prioritisation through simulation

File: bty028.pdf
Description: Published version
Size: 433.88 kB
Format: Adobe PDF
Title: PhenoRank: reducing study bias in gene prioritisation through simulation
Author(s): Cornish, AJ
David, A
Sternberg, MJE
Item Type: Journal Article
Abstract: Motivation: Genome-wide association studies have identified thousands of loci associated with human disease, but identifying the causal genes at these loci is often difficult. Several methods prioritise genes most likely to be disease causing through the integration of biological data, including protein-protein interaction and phenotypic data. Data availability is not the same for all genes however, potentially influencing the performance of these methods. Results: We demonstrate that whilst disease genes tend to be associated with greater numbers of data, this may be at least partially a result of them being better studied. With this observation we develop PhenoRank, which prioritises disease genes whilst avoiding being biased towards genes with more available data. Bias is avoided by comparing gene scores generated for the query disease against gene scores generated using simulated sets of phenotype terms, which ensures that differences in data availability do not affect the ranking of genes. We demonstrate that whilst existing prioritisation methods are biased by data availability, PhenoRank is not similarly biased. Avoiding this bias allows PhenoRank to effectively prioritise genes with fewer available data and improves its overall performance. PhenoRank outperforms three available prioritisation methods in cross-validation (PhenoRank area under receiver operating characteristic curve [AUC]=0.89, DADA AUC=0.87, EXOMISER AUC=0.71, PRINCE AUC=0.83, P < 2.2 × 10^-16). Availability: PhenoRank is freely available for download at https://github.com/alexjcornish/PhenoRank. Contact: m.sternberg@imperial.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online.
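The idea in the abstract of comparing each gene's score for the query disease against its scores for simulated sets of phenotype terms can be illustrated with a toy empirical p-value calculation. This is only a sketch under stated assumptions, not PhenoRank's implementation. The overlap-count score, the uniform sampling of simulated term sets, and all gene and term names below are hypothetical stand-ins for the method's actual scoring and simulation procedure.

```python
import random


def overlap_score(gene_terms, query_terms):
    """Toy score (an assumption): number of phenotype terms shared with the query."""
    return len(gene_terms & query_terms)


def empirical_p_values(gene_annotations, query_terms, all_terms, n_sim=1000, seed=0):
    """For each gene, estimate how often a simulated term set scores at least
    as high as the real query does for that same gene."""
    rng = random.Random(seed)
    term_pool = sorted(all_terms)
    observed = {g: overlap_score(t, query_terms) for g, t in gene_annotations.items()}
    exceed = {g: 0 for g in gene_annotations}
    for _ in range(n_sim):
        # Assumption: simulated queries are uniform random term sets of the
        # same size as the real query.
        simulated = set(rng.sample(term_pool, len(query_terms)))
        for g, t in gene_annotations.items():
            if overlap_score(t, simulated) >= observed[g]:
                exceed[g] += 1
    # Add-one correction so an estimated p-value is never exactly zero.
    return {g: (exceed[g] + 1) / (n_sim + 1) for g in gene_annotations}


# Hypothetical data: 20 phenotype terms, one heavily annotated gene and one sparse gene.
all_terms = {f"T{i}" for i in range(20)}
gene_annotations = {
    "well_studied": {f"T{i}" for i in range(15)},  # annotated with 15 of 20 terms
    "sparse": {"T0", "T1"},                        # annotated with 2 terms
}
query_terms = {"T0", "T1", "T2"}

raw = {g: overlap_score(t, query_terms) for g, t in gene_annotations.items()}
p_values = empirical_p_values(gene_annotations, query_terms, all_terms)
print(raw)
print(p_values)
```

In this toy setting the heavily annotated gene has the higher raw score, but it also scores highly against most simulated term sets, so its empirical p-value is large. The sparsely annotated gene rarely matches a simulated set as well as it matches the query, so it ranks first once scores are compared against the simulated background.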
Publication Date: 15-Jun-2018
Date of Acceptance: 14-Jan-2018
URI: http://hdl.handle.net/10044/1/56722
DOI: https://dx.doi.org/10.1093/bioinformatics/bty028
ISSN: 1367-4803
Publisher: Oxford University Press (OUP)
Start Page: 2087
End Page: 2095
Journal / Book Title: Bioinformatics
Volume: 34
Issue: 12
Copyright Statement: © The Author(s) 2018. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited
Sponsor/Funder: Wellcome Trust
Funder's Grant Number: WT/104955/Z/14/Z
Keywords: 01 Mathematical Sciences
06 Biological Sciences
08 Information And Computing Sciences
Bioinformatics
Publication Status: Published
Open Access location: https://doi.org/10.1093/bioinformatics/bty028
Online Publication Date: 2018-01-17
Appears in Collections: Faculty of Natural Sciences



