The Future of Machine Learning-Driven Drug Discovery

September 7, 2022

Drug Discovery: New Answers to Old Questions

Our pharmaceutical and biotech clients often ask us how they can get ahead of the most time-consuming and expensive bottleneck in drug discovery: identifying a promising drug target. We have traditional target identification approaches to thank for many of the drugs on the market today, but these approaches, which rely on large-scale mutational screens in cells or mice, are proving slow and arduous as researchers tackle more complex and historically difficult-to-treat diseases. Enter machine learning (ML) and artificial intelligence (AI).

ML & AI Pave the Way for a New Approach

To investigate drug targets in human diseases, some researchers are turning away from mouse models and cell lines, and toward a growing source of “human-first” data: genomic databases with sequence information for millions of people across hundreds of diseases.

“Data at this scale make it much easier to discover rare gene variants with a powerful effect on health and disease,” writes Michael Eisenstein in a recent news feature for Nature Biotechnology. “But as these datasets get larger and larger and incorporate additional omic layers beyond the genome, including transcriptomic, proteomic, or even metabolomic data, they become more challenging to analyze.”

For challenging datasets like these, machine learning is emerging as a powerful tool. Machine learning models can recognize subtle patterns in genomic data after being trained to pick out drug targets that have been effective in the past. 
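
To make the idea of "training on past targets" concrete, here is a minimal sketch, not a description of any company's actual method. It fits a small logistic regression to genes labeled as successful or unsuccessful past targets, then ranks unlabeled candidates by predicted score. The two features (a genetic association score and a disease-tissue expression score), the gene names, and every number are synthetic placeholders invented for illustration.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit logistic regression weights with plain batch gradient descent."""
    n_features = len(features[0])
    n_samples = len(features)
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            # Prediction error for this gene under the current weights.
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_samples
    return weights, bias


def score(x, weights, bias):
    """Predicted probability that a gene resembles past successful targets."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Synthetic training set (assumption): each row is
# [genetic_association_score, disease_tissue_expression_score], scaled 0 to 1.
train_x = [[0.9, 0.8], [0.8, 0.7], [0.7, 0.9], [0.2, 0.3], [0.1, 0.2], [0.3, 0.1]]
train_y = [1, 1, 1, 0, 0, 0]  # 1 = past target that worked, 0 = one that did not

weights, bias = train_logistic(train_x, train_y)

# Hypothetical unlabeled candidate genes to prioritize.
candidates = {"GENE_A": [0.85, 0.75], "GENE_B": [0.15, 0.25], "GENE_C": [0.5, 0.6]}
ranked = sorted(candidates, key=lambda g: score(candidates[g], weights, bias), reverse=True)
print(ranked)
```

Real pipelines would likely use far richer features, many more labeled genes, and an established library rather than a hand-written optimizer; the point here is only the shape of the workflow: learn from labeled past targets, then score new candidates.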

Companies Bet Big on ML

This extremely sophisticated level of bioinformatics has spawned multiple AI-driven drug discovery companies that harness the existing volume of genomic data to gain new biological insights, and they have attracted substantial financial backing. Celsius Therapeutics has garnered $83 million in funding for a clinical program in inflammatory bowel disease (IBD) based on an ML-identified gene target. In the neurodegenerative disease space, Verge Genomics raised $98 million, including investment from Eli Lilly and Merck, for amyotrophic lateral sclerosis (ALS) research.

Immunai has raised $215 million in Series B funding for AI-based target identification from immunological data, and ML is opening up more niche research areas as well: Relation Therapeutics raised $25 million to use an ML platform that combines single-cell data with clinical data for treating bone diseases.

Other research areas include integrating histopathology images with clinical and genomic data, as insitro has done; identifying dysregulated gene signatures in ALS, as Insilico Medicine’s platform does; and applications in cancer, where Alchemab is using ML to identify what makes cancer survivors resilient to disease.

Translation from funding and research to the clinic is backed by big pharma: BenevolentAI is partnered with AstraZeneca, which, along with Regeneron, has invested in building large biobanks in-house, while Amgen expanded its expertise by acquiring deCODE in 2012. Promising results are already on the horizon: in April, BioAge Labs treated its first trial patient with its AI-designed muscle atrophy drug.

What Does an ML Pipeline Look Like?

Machine learning for drug discovery can look different depending on the stage of understanding of a given disease. It can be as broad as identifying genomic patterns in single-cell sequencing data from a disease cohort versus a control cohort, all the way to identifying specific genes and proteins that are the most likely drivers of a given phenotype.
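
As a rough illustration of the broad end of that spectrum, the sketch below compares per-gene expression between a disease cohort and a control cohort using a log2 fold change, then ranks genes by the size of the shift. It is a deliberately crude stand-in for a real differential expression analysis, which would also normalize the data and test for statistical significance. The gene names, counts, and `pseudocount` parameter are all synthetic assumptions chosen for illustration.

```python
import math
from statistics import mean


def log2_fold_change(disease_values, control_values, pseudocount=1.0):
    """Log2 ratio of mean expression in disease vs. control samples.

    The pseudocount avoids division by zero for genes with no reads.
    """
    return math.log2(
        (mean(disease_values) + pseudocount) / (mean(control_values) + pseudocount)
    )


# Synthetic expression counts (assumption): three samples per cohort per gene.
expression = {
    "GENE_A": {"disease": [30, 32, 31], "control": [7, 6, 8]},
    "GENE_B": {"disease": [10, 12, 11], "control": [11, 10, 12]},
    "GENE_C": {"disease": [3, 2, 4], "control": [31, 30, 32]},
}

changes = {
    gene: log2_fold_change(cohorts["disease"], cohorts["control"])
    for gene, cohorts in expression.items()
}

# Genes with the largest shift in either direction come first.
ranked = sorted(changes, key=lambda gene: abs(changes[gene]), reverse=True)

for gene in ranked:
    print(f"{gene}: log2 fold change = {changes[gene]:+.2f}")
```

Genes at the top of such a list are candidates for the narrower follow-up analyses described above, not confirmed disease drivers.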

Though extremely powerful, machine learning is still just one step in this process: once targets are identified, they must still pass through traditional wet lab experiments to determine their efficacy. But running computer-based models in place of traditional experimental methods for drug target identification can dramatically cut the time needed to develop a strong drug candidate and deliver it to patients.

Outsourcing Bioinformatics Analysis: How We Can Help

Whether interpreting high-throughput genomic data or building a machine learning model, large-scale genomic analyses are challenging computational tasks. We have a proven track record of helping clients find answers to their biological research questions through custom-built storage solutions and bioinformatic pipeline analysis. Contact us to discuss how we can help you take advantage of all that machine-learning drug discovery has to offer.



Dan Ryder, MPH, PhD

Dan is the founder and CEO of Bridge Informatics, a professional services firm helping pharmaceutical companies translate genomic data into medicine. Unlike any other data analytics firm, Bridge forges sustainable communication change between its clients’ biological and computational scientists. Dan is particularly passionate about improving communication between people of different scientific backgrounds, enabling bioinformaticians and software engineers to collectively succeed.

Prior to forming Bridge Informatics, Dan served in a variety of roles helping pharmaceutical clients solve early phase drug discovery and development challenges. Dan received both a PhD in Biochemistry and Molecular Biology and an MPH in Disease Control from the University of Texas Health Science Center at Houston (UTHealth Houston). He completed his postdoctoral studies in Molecular Pathways of Energy Metabolism at the University of Florida College of Medicine. Dan received his undergraduate degree in Microbiology from the University of Texas at Austin.

Sources:

https://www.nature.com/articles/s41587-022-01457-1
