Developing methods to assess evolutionary and functional equivalence of single nucleotide variants for improved clinical interpretation of human genetic variation
File | Description | Size | Format |
---|---|---|---|
Li-N-2023-PhD-Thesis.pdf | Thesis | 30.2 MB | Adobe PDF |
Title: | Developing methods to assess evolutionary and functional equivalence of single nucleotide variants for improved clinical interpretation of human genetic variation |
Authors: | Li, Nicholas |
Item Type: | Thesis or dissertation |
Abstract: | Advances in sequencing technology have produced an unprecedented rise in human single nucleotide variant data in recent years. One of the key challenges in clinical genetics is distinguishing truly pathogenic variants from rare but benign ones. Many in silico tools have been developed with this aim, but they often overpredict pathogenicity, particularly for novel variants. Here, I demonstrate how a framework designed to identify functionally equivalent variants, using information from variants in known related genes, can aid the interpretation of pathogenic variants. Using sequence alignments of human paralogues, the annotations of known pathogenic variants can be transferred to variants at aligned positions. This Paralogue Annotation method is shown to be widely applicable exome-wide, with 71% of disease genes having at least one paralogue. As a classifier it is more precise than other contemporary variant predictors, with a precision of 94% or higher depending on the data. This, however, comes at the cost of limited sensitivity (17% or lower). Sensitivity is recovered when the framework is improved by aligning protein domains instead of whole gene sequences: it increased by 74%, with a marginal 6% decrease in precision. Expanding the framework to use structural protein alignments instead of sequence alignments has the potential to improve sensitivity further, but because structural data are currently limited, predicted protein models must be relied on, which introduces further assumptions. In structural space, pathogenic variants across aligned models are statistically more likely to be closer together than benign and pathogenic variants.
This framework can be used as a precise pathogenic variant classifier in sequence space but, more broadly, it can be used to search for variants that are functionally equivalent to variants of interest, a line of information that is not widely used. |
Content Version: | Open Access |
Issue Date: | Feb-2022 |
Date Awarded: | Jan-2023 |
URI: | http://hdl.handle.net/10044/1/101859 |
DOI: | https://doi.org/10.25560/101859 |
Copyright Statement: | Creative Commons Attribution NonCommercial Licence |
Supervisor: | Ware, James; Whiffin, Nicola |
Sponsor/Funder: | Medical Research Council (Great Britain) |
Department: | Institute of Clinical Sciences |
Publisher: | Imperial College London |
Qualification Level: | Doctoral |
Qualification Name: | Doctor of Philosophy (PhD) |
Appears in Collections: | Department of Clinical Sciences PhD Theses |
This item is licensed under a Creative Commons License
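
The annotation transfer described in the abstract can be illustrated with a minimal sketch. It assumes a pairwise alignment of two paralogues given as gapped strings of equal length, and flags a query residue when it shares an alignment column with a known pathogenic residue in the paralogue. The function names, the toy sequences, and the variant positions are all hypothetical and chosen only for illustration; the thesis itself may apply further criteria that are not stated in the abstract.

```python
from typing import List, Optional, Set, Tuple

GAP = "-"


def column_to_residue(aligned: str) -> List[Optional[int]]:
    """Map each alignment column to a 1-based residue position (None for a gap)."""
    positions: List[Optional[int]] = []
    residue = 0
    for char in aligned:
        if char == GAP:
            positions.append(None)
        else:
            residue += 1
            positions.append(residue)
    return positions


def transfer_annotations(
    aligned_query: str,
    aligned_paralogue: str,
    paralogue_pathogenic: Set[int],
) -> Set[int]:
    """Return query positions aligned to a known pathogenic paralogue position."""
    if len(aligned_query) != len(aligned_paralogue):
        raise ValueError("aligned sequences must have equal length")
    flagged: Set[int] = set()
    columns = zip(column_to_residue(aligned_query), column_to_residue(aligned_paralogue))
    for query_pos, paralogue_pos in columns:
        # Skip columns where either sequence has a gap: nothing to transfer.
        if query_pos is None or paralogue_pos is None:
            continue
        if paralogue_pos in paralogue_pathogenic:
            flagged.add(query_pos)
    return flagged


def precision_and_sensitivity(
    flagged: Set[int], pathogenic: Set[int], benign: Set[int]
) -> Tuple[float, float]:
    """Precision = TP / (TP + FP); sensitivity = TP / (TP + FN)."""
    true_pos = len(flagged & pathogenic)
    false_pos = len(flagged & benign)
    false_neg = len(pathogenic - flagged)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else float("nan")
    sensitivity = true_pos / (true_pos + false_neg) if true_pos + false_neg else float("nan")
    return precision, sensitivity


# Toy data (invented): a gapped alignment and pathogenic positions in the paralogue.
flagged = transfer_annotations("MK-LVR", "MKALV-", {2, 3, 4})
scores = precision_and_sensitivity(flagged, pathogenic={2, 5}, benign={3})
```

In this toy case, paralogue position 3 falls opposite a gap in the query, so it transfers nothing. A position-only lookup like this one errs toward few calls, which is consistent with the high-precision, low-sensitivity trade-off the abstract reports, although the numbers here are not comparable to the thesis results.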