A novel approach to power analysis for differential expression in scRNA-seq data

File: Fawad-S-2024-MPhil-Thesis.pdf
Description: Thesis
Size: 12.43 MB
Format: Adobe PDF
Title: A novel approach to power analysis for differential expression in scRNA-seq data
Authors: Fawad, Salman
Item Type: Thesis or dissertation
Abstract: Single-cell RNA sequencing (scRNA-seq) has helped revolutionise the scientific community’s understanding of cellular heterogeneity in complex tissues and is now commonly employed to find genes that are differentially expressed between two conditions (such as disease versus control). However, its potential is often hampered by low statistical power due to limited sample sizes, a consequence of the high sequencing costs associated with generating each sample. These small sample sizes cause a significant shortfall in the statistical robustness of human scRNA-seq investigations, resulting in inaccurate detection of differentially expressed genes. Using a comprehensive analysis of 29 large brain scRNA-seq datasets (spanning 23 different cell types), I demonstrate that most published case-control studies are statistically underpowered, resulting in a mean false discovery rate of 80% when using as few as 10 samples. A power of 75% is attained only when the sample size reaches 80. This finding is corroborated by validation against bulk RNA-seq data from the GTEx dataset and is further supported by showing that many of the detected genes come from the sex chromosomes. Importantly, increasing the number of cells per sample was also found to enhance statistical power significantly. My data indicate that for reliable differential expression analysis in human scRNA-seq, future studies should aim for a minimum of 80 samples and over 1000 cells per cell type of interest (ideally deeply sequenced). These findings serve as a cautionary tale for the bioinformatics community, underscoring the urgency for larger, more robustly designed scRNA-seq studies which better elucidate disease mechanisms.
Content Version: Open Access
Issue Date: Apr-2024
Date Awarded: Aug-2024
URI: http://hdl.handle.net/10044/1/114471
DOI: https://doi.org/10.25560/114471
Copyright Statement: Creative Commons Attribution NonCommercial Licence
Supervisor: Matthews, Paul; Skene, Nathan
Department: Department of Brain Sciences
Publisher: Imperial College London
Qualification Level: Masters
Qualification Name: Master of Philosophy (MPhil)
Appears in Collections: Department of Brain Sciences PhD Theses



This item is licensed under a Creative Commons License.