iDEG: a single-subject method for assessing gene differential expression from two transcriptomes of an individual
Abstract
Single-subject RNA-Sequencing (RNA-Seq) analysis is a powerful precision medicine tool for unveiling individual disease mechanisms. Due to the scarcity of relevant tissue samples and the cost of high-throughput technologies, obtaining replicates for a single subject is often impractical. This constraint precludes the use of most conventional statistical techniques, since replicates are typically needed to estimate data variability and make inferences. We propose the iDEG method to identify individualized Differentially Expressed Genes from two conditions of a subject, each sampled once: a baseline sample (e.g., unaffected tissue) vs. a case sample (e.g., cancer). iDEG borrows information across different genes from the same individual, bypassing the requirement of replicates per condition while still making valid inferences. The main idea of iDEG is to transform RNA-Seq data such that, under the null hypothesis, differences of transformed expression counts follow the same distribution with a constant variance. This transformation enables modeling all genes with a two-group mixture model, from which the probability of differential expression for each gene is then estimated by an empirical Bayes approach with local false discovery rate control. Our extensive numerical studies demonstrate iDEG’s superior performance compared to existing methods under a variety of scenarios. Finally, iDEG is applied to a triple-negative breast cancer single-subject dataset, in which a personal set of differentially expressed genes is identified.
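The pipeline described above (variance-stabilizing transformation, two-group mixture, local false discovery rate) can be illustrated with a minimal sketch. This is not the iDEG implementation: it assumes Poisson counts and substitutes the classical Anscombe transform for the transformation used by iDEG, fixes the null proportion `pi0` rather than estimating it, and estimates the mixture density with a plain Gaussian kernel. The function names, `pi0`, `bandwidth`, and the simulated data are all hypothetical and chosen for illustration only.

```python
import math
import random


def anscombe(count):
    """Anscombe transform: Poisson counts -> approximately unit variance.
    Assumption: used here as a stand-in, not the transform used by iDEG."""
    return 2.0 * math.sqrt(count + 3.0 / 8.0)


def normal_pdf(x, mean=0.0, sd=1.0):
    """Density of a normal distribution evaluated at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def local_fdr(baseline, case, pi0=0.9, bandwidth=0.3):
    """Return (z, fdr) with fdr_g = pi0 * f0(z_g) / f(z_g), capped at 1.

    pi0 (null proportion) and bandwidth are hypothetical tuning values.
    """
    # Under the null, each transformed count has variance close to 1, so the
    # difference has variance close to 2; dividing by sqrt(2) gives z ~ N(0, 1).
    z = [(anscombe(c) - anscombe(b)) / math.sqrt(2.0)
         for b, c in zip(baseline, case)]
    n = len(z)
    fdr = []
    for zi in z:
        # Gaussian kernel estimate of the two-group mixture density f at zi.
        f_mix = sum(normal_pdf(zi, zj, bandwidth) for zj in z) / n
        fdr.append(min(1.0, pi0 * normal_pdf(zi) / f_mix))
    return z, fdr


def poisson_draw(rng, lam):
    """Knuth's Poisson sampler; adequate for lam below roughly 700."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_pair(n_genes=500, n_de=50, fold=4.0, seed=1):
    """Simulate one baseline and one case transcriptome (no replicates)."""
    rng = random.Random(seed)
    means = [rng.uniform(50.0, 150.0) for _ in range(n_genes)]
    baseline = [poisson_draw(rng, m) for m in means]
    # The first n_de genes are up-regulated by `fold` in the case sample.
    case = [poisson_draw(rng, m * fold if g < n_de else m)
            for g, m in enumerate(means)]
    return baseline, case
```

Calling `z, fdr = local_fdr(*simulate_pair())` yields one local false discovery rate per gene; genes with a small value (for example below 0.05) would be reported as differentially expressed. Real RNA-Seq counts are typically overdispersed, so the Poisson assumption here is a simplification.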
Contact: yves@email.arizona.edu
Supplementary information: Supplementary data are available at Bioinformatics online.