Linear models for gene expression data

Microarray and other high-throughput gene expression experiments make it possible to observe and understand gene activity on a genome-wide scale. Such experiments are conducted, for example, to identify the molecular pathways affected when a key regulator is perturbed, to characterize molecular responses to a pathogen or stress factor, or to profile gene activity in diseased individuals. Designed expression experiments are typically small but complex, with few biological replicates but multiple experimental factors at varying levels. My lab has developed a powerful framework for analyzing these experiments using linear models, in which empirical Bayes methods are used to borrow information between genes. Much of this work is encapsulated in the popular limma software package, which provides a flexible framework for the analysis of high-throughput expression experiments. Key principles include (i) borrowing information between genes, (ii) analyzing whole experiments together, combining information across arrays, (iii) reducing noise through suitable background correction, and (iv) quantitative quality weighting.
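
To make the idea of borrowing information between genes concrete, the sketch below shows one common form of empirical Bayes variance moderation: each gene's sample variance is shrunk toward a prior variance shared by all genes, weighted by degrees of freedom, and the moderated variance is then used in a t-like statistic. This is an illustration only, not limma's code or interface (limma is an R package). The function names and the example numbers are hypothetical, and the prior variance and prior degrees of freedom are assumed to be given here, whereas in practice they would be estimated from the whole set of genes.

```python
import math


def moderated_variances(sample_vars, df, prior_var, prior_df):
    """Shrink per-gene sample variances toward a common prior variance.

    Each result is a weighted average of the prior variance and the gene's
    own variance, with weights given by the respective degrees of freedom.
    prior_var and prior_df are assumed known in this sketch.
    """
    total_df = prior_df + df
    return [(prior_df * prior_var + df * s2) / total_df for s2 in sample_vars]


def moderated_t(effect, unscaled_sd, moderated_var):
    """t-like statistic that uses the moderated variance for one gene."""
    return effect / (unscaled_sd * math.sqrt(moderated_var))


# Hypothetical sample variances for three genes, each with 2 residual
# degrees of freedom (for example, three replicates in one group).
sample_vars = [0.01, 0.20, 1.50]
shrunk = moderated_variances(sample_vars, df=2, prior_var=0.20, prior_df=4)

# Hypothetical estimated log fold changes and unscaled standard deviations.
effects = [0.5, 0.5, 0.5]
t_stats = [moderated_t(b, 1.0, v) for b, v in zip(effects, shrunk)]

print(shrunk)
print(t_stats)
```

With so few replicates, a gene's own variance estimate is unstable. Shrinkage pulls an implausibly small variance up and a large one down, so genes are not ranked highly merely because their variance happened to be underestimated.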