Differential expression analysis of RNA-seq using multivariate variance modelling

Project details

RNA sequencing (RNA-seq) is the current standard technology for profiling gene expression. One of the most fundamental analysis aims is to identify genes that are differentially expressed (DE) between cell types or experimental conditions. State-of-the-art DE methods model the variance of the expression values as a function of read abundance.
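
As an illustration of a variance that depends on read abundance, the sketch below uses the negative binomial variance function, var = mu + phi * mu^2, which is one common choice for RNA-seq read counts. The project description does not say which model is intended, so the function names and the dispersion value of 0.1 are assumptions made purely for this example.

```python
def nb_variance(mean, dispersion):
    """Variance of a negative binomial count with the given mean and dispersion.

    The first term is Poisson (sampling) noise; the second term grows with
    the square of the mean and represents variation between samples.
    """
    return mean + dispersion * mean ** 2


def squared_cv(mean, dispersion):
    """Squared coefficient of variation: variance divided by mean squared."""
    return nb_variance(mean, dispersion) / mean ** 2


if __name__ == "__main__":
    phi = 0.1  # assumed dispersion, for illustration only
    for mu in (10.0, 100.0, 1000.0):
        print(mu, nb_variance(mu, phi), squared_cv(mu, phi))
```

In this form the relative variability of a gene is highest at low read abundance and levels off towards the dispersion as abundance increases, which is why the variance has to be modelled rather than assumed constant.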

This project will extend these methods to model the variance as a function of multiple variables. This work will allow powerful statistical analyses to be applied in situations that are currently problematic: by accounting for read mapping ambiguity between overlapping transcripts of the same gene, by accounting for gene length effects, or by recovering sequencing depth from commonly reported TPM and RPKM values when the read counts themselves are not available.
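
To show why sequencing depth has to be recovered when only TPM or RPKM values are reported, the sketch below computes both quantities from read counts using their standard definitions. The function names, counts and gene lengths are made up for illustration and are not taken from the project.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    total_reads = sum(counts)
    return [c * 1e9 / (total_reads * length)
            for c, length in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalised rates rescaled to sum to 1e6."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    rate_sum = sum(rates)
    return [r * 1e6 / rate_sum for r in rates]


if __name__ == "__main__":
    lengths = [1000, 2000, 500]   # hypothetical gene lengths in base pairs
    shallow = [10, 40, 5]         # hypothetical counts from a shallow library
    deep = [c * 100 for c in shallow]  # same proportions, 100 times the depth

    print(tpm(shallow, lengths))
    print(tpm(deep, lengths))
    print(rpkm(shallow, lengths))
    print(rpkm(deep, lengths))
```

The shallow and deep libraries give the same TPM and RPKM values even though their counts differ by a factor of 100, so the sequencing depth, and with it the precision of each value, cannot be read directly from the reported numbers.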

About our research group

Professor Smyth's research lab has a history of developing new statistical techniques for the analysis of genomic data that are widely used or have become accepted international standards. Members of the group typically have backgrounds in mathematics, statistics, computer science, genetics, engineering or physics. The group has developed a number of well-known software packages including limma, edgeR, goseq, Rsubread, csaw and diffHic. The group is particularly well known for developing advanced statistical methods for differential expression analyses (McCarthy Nucleic Acids Research 2012 40:4288; Law Genome Biology 2014 15:R29; Ritchie Nucleic Acids Research 2015 43:e47; Chen F1000Research 2016 5:1438).


Professor Gordon Smyth

Joint Division Head
Dr Yunshun Chen
Bioinformatics division

Project Type: