Mengbo Li – Bioinformatics division

12/06/2024, 1:00 pm - 2:00 pm
Davis Auditorium

WEHI Wednesday Seminar hosted by Professor Gordon Smyth

Mengbo Li
PhD Student – Smyth Laboratory, Bioinformatics division – Computational Biology Theme, WEHI
(this is a PhD Completion seminar)

Addressing non-ignorable missing values in MS-based proteomics


Join via Slido: enter code #WEHIWednesday

Including Q&A session


Mass spectrometry (MS)-based label-free proteomics is a powerful tool in biomedical research, but its usefulness is limited by the frequent occurrence of missing values. We argue that missing values in MS-based proteomics data should always be viewed as missing not at random (MNAR), because the probability of detection depends on the underlying intensity. We propose a statistical model for non-ignorable missing values in proteomics data, termed the detection probability curve (DPC). Importantly, the DPC provides a probabilistic model that determines how much information can be inferred from the missing values and can be used to inform downstream differential expression analyses.
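As a rough illustration of why intensity-dependent detection makes missing values non-ignorable, the following Python sketch simulates log-intensities and removes values according to a detection probability that increases with intensity. The logistic form of the curve, the function names and all parameter values are assumptions made for this example only; they are not taken from the seminar and do not represent the speaker's implementation.

```python
import math
import random


def detection_probability(log_intensity, beta0=-8.0, beta1=0.5):
    """Hypothetical detection probability curve.

    Assumes a logistic curve in log-intensity; beta0 and beta1 are
    illustrative values, not estimates from any real dataset.
    """
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * log_intensity)))


def simulate(n=20000, mean=16.0, sd=2.0, seed=1):
    """Draw true log-intensities, then keep each value with probability
    given by the assumed detection probability curve."""
    rng = random.Random(seed)
    true_values = [rng.gauss(mean, sd) for _ in range(n)]
    observed = [y for y in true_values
                if rng.random() < detection_probability(y)]
    return true_values, observed


true_values, observed = simulate()
true_mean = sum(true_values) / len(true_values)
observed_mean = sum(observed) / len(observed)
missing_fraction = 1 - len(observed) / len(true_values)

print(f"fraction missing:       {missing_fraction:.2f}")
print(f"mean of all values:     {true_mean:.2f}")
print(f"mean of observed only:  {observed_mean:.2f}")
```

Because low intensities are more likely to go undetected, the mean of the observed values in this simulation is higher than the mean of all values, so simply ignoring the missing values would bias the estimate upward.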


To this end, we introduce a DPC-based protein quantification method in which missing values are taken into account when peptides are summarized into proteins. An empirical Bayes approach is adopted to borrow information across the tens of thousands of peptides identified and quantified in the experiment. Quantification uncertainty is subsequently incorporated into differential expression analysis via a novel limma-style pipeline.


We evaluate the proposed methods on a range of real datasets, starting with a mixed-species experiment in which yeast, E. coli and human proteins are mixed in known proportions. We show that our method eliminates missing values from protein-level data and improves the statistical power for differential expression in proteome-wide experiments while maintaining correct control of the false discovery rate. The proposed methods are also demonstrated on a bulk proteomics breast cancer dataset and on public single-cell proteomics data.


Mengbo Li is a PhD student in the Smyth Lab in the Bioinformatics division, supervised by Professor Gordon Smyth, Associate Professor Andrew Webb and Dr Yunshun Chen. Her PhD work focuses on the development of statistical methods and computational software for complex biological data such as mass spectrometry-based proteomics, RNA-seq and spatial transcriptomics data.


All welcome!
