Modern biology seeks to understand the genome’s circuitry by studying its functional read-out, the transcriptome. Advances in long-read sequencing can paint a more comprehensive picture of transcriptomes by revealing the full length of individual RNA molecules. Splicing events and variants can be phased, and repetitive sequences can be resolved. Moreover, Nanopore long-read sequencing technology offers the potential for rapid, ultra-portable and real-time targeted sequencing. However, analysing long-read transcriptome data remains challenging: base-calling accuracy is low, and new methods are needed to take full advantage of phased events.
This project will build new computational methods to analyse these complex data, with a focus on resolving novel splicing and mutations in cancer, and on combining long reads with single-cell sequencing data.