ClinVar data parsing.
Author(s)
Type
Journal Article
Abstract
This software repository provides a pipeline for converting raw ClinVar data files into analysis-friendly tab-delimited tables, and also provides these tables for the most recent ClinVar release. Separate tables are generated for genome builds GRCh37 and GRCh38, as well as for mono-allelic variants and complex multi-allelic variants. Additionally, the tables are augmented with allele frequencies from the ExAC and gnomAD datasets, as these are often consulted when analyzing ClinVar variants. Overall, this work provides ClinVar data in a format that is easier to work with and can be loaded directly into a variety of popular analysis tools such as R, Python pandas, and SQL databases.
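As a rough illustration of the "directly loaded" claim in the abstract, the sketch below reads a tab-delimited table into an in-memory SQLite database using only the Python standard library. It is a minimal sketch under stated assumptions, not the repository's own code: the function name `load_tsv_into_sqlite`, the table name `variants`, and the sample column names and rows are all hypothetical placeholders and do not reflect the actual schema or contents of the published tables.

```python
import csv
import io
import sqlite3

# Hypothetical sample: the column names and rows are illustrative placeholders,
# NOT the real schema or data of the published ClinVar tables.
SAMPLE_TSV = (
    "chrom\tpos\tref\talt\tclinical_significance\n"
    "1\t1000\tA\tG\tPathogenic\n"
    "2\t2000\tC\tT\tBenign\n"
    "X\t3000\tG\tA\tPathogenic\n"
)


def load_tsv_into_sqlite(handle, conn, table="variants"):
    """Load a tab-delimited table with a header row into a SQLite table.

    Every column is stored as TEXT; a real loader would likely assign types.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)  # first row holds the column names
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    # The remaining rows are streamed straight from the reader.
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    conn.commit()
    return header


conn = sqlite3.connect(":memory:")
header = load_tsv_into_sqlite(io.StringIO(SAMPLE_TSV), conn)

# Example query: count rows whose (hypothetical) significance column is "Pathogenic".
pathogenic_count = conn.execute(
    'SELECT COUNT(*) FROM "variants" WHERE "clinical_significance" = ?',
    ("Pathogenic",),
).fetchone()[0]
print(header)
print(pathogenic_count)
```

To use a real file, the `io.StringIO` handle could be replaced with an opened file object, for example `open(path, newline="")`, where `path` points to one of the downloaded tables.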
Date Issued
2017-05-23
Date of Acceptance
2017-05-22
Citation
Wellcome Open Research, 2017, 2
ISSN
2398-502X
Publisher
F1000Research
Journal / Book Title
Wellcome Open Research
Volume
2
Copyright Statement
Copyright: © 2017 Zhang X et al. This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The author(s) is/are employees of the US Government and therefore domestic copyright protection in USA does not apply to this work. The work may be protected under the copyright laws of other jurisdictions when used in those jurisdictions.
License URL
Sponsor
Wellcome Trust
Grant Number
107469/Z/15/Z
Subjects
ClinVar
Mendelian disease
XML parsing
pathogenic variants
variant interpretation
Publication Status
Published online
Article Number
33