A novel approach to power analysis for differential expression in scRNA-seq data
File | Description | Size | Format |
---|---|---|---|
Fawad-S-2024-MPhil-Thesis.pdf | Thesis | 12.43 MB | Adobe PDF |
Title: | A novel approach to power analysis for differential expression in scRNA-seq data |
Authors: | Fawad, Salman |
Item Type: | Thesis or dissertation |
Abstract: | Single-cell RNA sequencing (scRNA-seq) has helped revolutionise the scientific community’s understanding of cellular heterogeneity in complex tissues and is now commonly employed to find genes that are differentially expressed between two conditions (such as disease versus control). However, its potential is often hampered by low statistical power due to limited sample sizes, a consequence of the high sequencing cost of generating each sample. These small sample sizes undermine the statistical robustness of human scRNA-seq investigations and result in inaccurate detection of differentially expressed genes. Using a comprehensive analysis of 29 large brain scRNA-seq datasets (spanning 23 different cell types), I demonstrate that most published case-control studies are statistically underpowered, resulting in a mean false discovery rate of 80% when using as few as 10 samples. 75% power is attained only when the sample size reaches 80. This result is corroborated by validation against bulk RNA-seq data from the GTEx dataset, and is further demonstrated by showing that many of the detected genes actually come from the sex chromosomes. Importantly, increasing the number of cells per sample was also found to enhance statistical power significantly. My data indicate that, for reliable differential expression analysis in human scRNA-seq, future studies should aim for a minimum of 80 samples and over 1000 cells per cell type of interest (ideally deeply sequenced). These findings serve as a cautionary tale for the bioinformatics community, underscoring the urgency for larger, more robustly designed scRNA-seq studies which better elucidate disease mechanisms. |
Content Version: | Open Access |
Issue Date: | Apr-2024 |
Date Awarded: | Aug-2024 |
URI: | http://hdl.handle.net/10044/1/114471 |
DOI: | https://doi.org/10.25560/114471 |
Copyright Statement: | Creative Commons Attribution NonCommercial Licence |
Supervisor: | Matthews, Paul; Skene, Nathan |
Department: | Department of Brain Sciences |
Publisher: | Imperial College London |
Qualification Level: | Masters |
Qualification Name: | Master of Philosophy (MPhil) |
Appears in Collections: | Department of Brain Sciences PhD Theses |
This item is licensed under a Creative Commons License