Machine learning for drug discovery attracts enormous hype, but how useful is it actually? In this talk I argue that we’re not there yet, and that the bottleneck isn’t model architecture but data. I’ll introduce OpenADMET, an open-science effort built on a key insight: the high-value protein targets that drive absorption, distribution, metabolism, excretion, and toxicity (ADMET), collectively the "Avoid-ome", are few in number, which makes designing against them tractable for modern drug discovery techniques. By combining high-throughput assays, structural biology, and active learning to generate consistent, mechanistically grounded data, alongside blind community challenges to keep models honest, OpenADMET aims to move the field from avoiding ADMET liabilities late to designing against them from the start. OpenADMET also gives us a platform to ask fundamental questions about machine learning for molecules, questions whose answers, we hope, can drive the practical, efficacious application of ML to drug discovery, so we can bring more medicines to patients.
Dr Hugo MacDermott-Opeskin completed his PhD in Chemistry at the Australian National University (ANU), specialising in molecular dynamics simulations of membrane proteins. He then moved into research software engineering, working across a variety of computational chemistry and molecular modelling projects. His journey into drug discovery began at the ASAP Discovery Consortium, an open-science, NIH-funded effort that uses artificial intelligence, structural biology, and computational chemistry to develop oral antivirals against pandemic-threat viruses. He now serves as the technical lead for OpenADMET, a project hosted by the Open Molecular Software Foundation (OMSF). OpenADMET is an open effort to build predictive models of drug ADMET properties (absorption, distribution, metabolism, excretion, and toxicity) by combining high-throughput assays, structural biology, and machine learning to reduce late-stage failures in drug development.