Integrating multi-modal molecular data promises a more comprehensive understanding of cancer. By combining diverse data modalities, such as genomics, transcriptomics, epigenomics, proteomics, and clinical data, researchers can uncover novel insights and identify potential biomarkers and therapeutic targets.
This project will involve the analysis of large public datasets, such as The Cancer Genome Atlas (TCGA), and several multi-omics datasets generated by collaborators from patient cohorts. The project will first explore the effect of removing unwanted variation, such as library size differences, batch effects, and tumor purity (Molania et al., Nature Biotechnology, 2023). Removing this variation is expected to improve the reliability of the data and lead to more accurate integration. The student will then develop and test new methods to effectively integrate genomics, transcriptomics, epigenomics, proteomics, and clinical data.
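As a rough illustration of one source of unwanted variation named above, library size differences, the sketch below scales raw read counts to log2 counts per million. This is a generic, simplified normalization and not the method of Molania et al.; the function name `log_cpm`, the `prior_count` parameter, and the toy counts are assumptions made purely for illustration. Batch effects and tumor purity require more elaborate approaches and are not addressed here.

```python
import math


def log_cpm(counts, prior_count=1.0):
    """Convert raw counts to log2 counts per million for each sample.

    counts: dict mapping a sample name to a list of raw gene counts
            (hypothetical toy structure, one list entry per gene).
    prior_count: small constant added before the log to avoid log2(0).
    """
    normalized = {}
    for sample, gene_counts in counts.items():
        library_size = sum(gene_counts)  # total reads in this sample
        if library_size <= 0:
            raise ValueError(f"sample {sample!r} has no reads")
        normalized[sample] = [
            math.log2(c / library_size * 1e6 + prior_count)
            for c in gene_counts
        ]
    return normalized


# Two toy samples with identical expression proportions but a
# tenfold difference in sequencing depth.
toy_counts = {
    "shallow": [10, 30, 60],
    "deep": [100, 300, 600],
}
normalized = log_cpm(toy_counts)
```

In this toy case the two samples differ only in sequencing depth, so their normalized values coincide once library size is accounted for. Real data would also need the other sources of variation handled before integration.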