FineSplice, enhanced splice junction detection and quantification: a novel pipeline based on the assessment of diverse RNA-Seq alignment solutions
Author(s)
Type
Journal Article
Abstract
Alternative splicing is the main mechanism governing protein diversity. The recent developments in RNA-Seq technology have enabled the study of the global impact and regulation of this biological process. However, the lack of standardized protocols constitutes a major bottleneck in the analysis of alternative splicing. This is particularly important for the identification of exon–exon junctions, which is a critical step in any analysis workflow. Here we performed a systematic benchmarking of alignment tools to dissect the impact of design and method on the mapping, detection and quantification of splice junctions from multi-exon reads. Accordingly, we devised a novel pipeline based on TopHat2 combined with a splice junction detection algorithm, which we have named FineSplice. FineSplice allows effective elimination of spurious junction hits arising from artefactual alignments, achieving up to 99% precision in both real and simulated data sets and yielding superior F1 scores under most tested conditions. The proposed strategy combines an efficient mapping solution with a semi-supervised anomaly detection scheme to filter out false positives and allows reliable estimation of expressed junctions from the alignment output. Ultimately, this provides more accurate information to identify meaningful splicing patterns. FineSplice is freely available at https://sourceforge.net/p/finesplice/.
Date Issued
2014-04-01
Date Acceptance
2014-02-08
Citation
Nucleic Acids Research, 2014, 42 (8), pp.1-11
ISSN
0305-1048
Publisher
Oxford University Press
Start Page
1
End Page
11
Journal / Book Title
Nucleic Acids Research
Volume
42
Issue
8
Copyright Statement
© The Author(s) 2014. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
Sponsor
British Heart Foundation
Identifier
http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000336092300010&DestLinkType=FullRecord&DestApp=ALL_WOS&UsrCustomerID=1ba7043ffcc86c417c072aa74d649202
Grant Number
SP/10/10/28431
Subjects
Science & Technology
Life Sciences & Biomedicine
Biochemistry & Molecular Biology
MESSENGER-RNA
DIFFERENTIAL EXPRESSION
HUMAN GENOME
DISEASE
ALGORITHMS
GENE
CODE
TRANSCRIPTOMES
LANDSCAPE
ULTRAFAST
Publication Status
Published
Article Number
ARTN e71
Date Publish Online
2014-02-25