GeDi: applying suffix arrays to increase the repertoire of detectable SNVs in tumour genomes

File: s12859-020-3367-3.pdf
Description: Published version
Size: 1.84 MB
Format: Adobe PDF
Title: GeDi: applying suffix arrays to increase the repertoire of detectable SNVs in tumour genomes
Authors: Coleman, I
Corleone, G
Arram, J
Ng, H-C
Magnani, L
Luk, W
Item Type: Journal Article
Abstract: Background: Current popular variant calling pipelines rely on the mapping coordinates of each input read to a reference genome in order to detect variants. Since reads deriving from variant loci that diverge in sequence substantially from the reference are often assigned incorrect mapping coordinates, variant calling pipelines that rely on mapping coordinates can exhibit reduced sensitivity. Results: In this work we present GeDi, a suffix array-based somatic single nucleotide variant (SNV) calling algorithm that does not rely on read mapping coordinates to detect SNVs and is therefore capable of reference-free and mapping-free SNV detection. GeDi executes with practical runtime and memory resource requirements, is capable of SNV detection at very low allele frequency (<1%), and detects SNVs with high sensitivity at complex variant loci, dramatically outperforming MuTect, a well-established pipeline. Conclusion: By designing novel suffix array-based SNV calling methods, we have developed a practical SNV calling software, GeDi, that can characterise SNVs at complex variant loci and at low allele frequency, thus increasing the repertoire of detectable SNVs in tumour genomes. We expect GeDi to find use cases in targeted deep sequencing analysis, and to serve as a replacement and improvement over previous suffix array-based SNV calling methods.
Issue Date: 5-Feb-2020
Date of Acceptance: 14-Jan-2020
URI: http://hdl.handle.net/10044/1/76947
DOI: 10.1186/s12859-020-3367-3
ISSN: 1471-2105
Publisher: BioMed Central
Journal / Book Title: BMC Bioinformatics
Volume: 21
Copyright Statement: © The Author(s). 2020. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Sponsor/Funder: Engineering and Physical Sciences Research Council (EPSRC)
Funder's Grant Number: EP/P010040/1
Keywords: Cancer
Suffix array
Variant calling
01 Mathematical Sciences
06 Biological Sciences
08 Information and Computing Sciences
Publication Status: Published
Article Number: ARTN 45
Appears in Collections: Computing