Computer Model Predicts Protein Binding to DNA, RNA

DNA illustration (National Heart, Lung, and Blood Institute, NIH)

28 July 2015. Geneticists and computer scientists developed a machine-learning model that predicts how proteins bind to genetic material and helps uncover mutations that cause disease. The team led by Brendan Frey with the Canadian Institute for Advanced Research in Toronto published its findings yesterday in the journal Nature Biotechnology (paid subscription required).

Frey and other authors founded Deep Genomics, a start-up company in Toronto, to commercialize their research applying machine learning to uncover genetic abnormalities.

The researchers, from the University of Toronto and Children’s Hospital Medical Center in Cincinnati, sought an automated method for discovering disruptions in cell function caused by abnormal variations in an individual’s genetic code or in the way that code is expressed. This understanding can help medical scientists develop more precise therapies that address these specific anomalies, a strategy known as precision medicine.

The researchers used an emerging computer modeling technique called deep learning to analyze the complex processes by which proteins in the body bind with DNA in the genetic code and with the RNA expressing that code. Deep learning makes it possible for machines to discern underlying patterns in relationships, and build those relationships into knowledge bases applied to a number of disciplines. Advances in deep learning developed at the Canadian Institute for Advanced Research, for example, are applied to neural network models employed by Google and Facebook.

In this case, the deep-learning algorithms enable the researchers to predict the binding of proteins with specific sequences of DNA and RNA that result from mutations in the genetic code. The model tests the DNA or RNA sequence against the chemistry of proteins and computes the likelihood of proteins binding to that sequence. Mutations that alter the sites where proteins can bind offer a predictor of disrupted cellular functions and a higher probability of disease.
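The idea of scoring a sequence for protein binding, then comparing the score before and after a mutation, can be illustrated with a much simpler stand-in than a deep-learning model. The Python sketch below is not DeepBind: it slides a single hand-written motif filter across a sequence, keeps the best match, and squashes it into a value between 0 and 1. The motif weights, the bias, the two example sequences, and the function names are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical log-odds weights for a 4-base motif favoring "TGCA".
# Illustrative values only; not taken from DeepBind or any real protein.
MOTIF_WEIGHTS = [
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 2.0},
    {"A": -1.0, "C": -1.0, "G": 2.0, "T": -1.0},
    {"A": -1.0, "C": 2.0, "G": -1.0, "T": -1.0},
    {"A": 2.0, "C": -1.0, "G": -1.0, "T": -1.0},
]
BIAS = -4.0  # hypothetical offset applied before the sigmoid


def window_score(window):
    """Sum the per-position weights for one motif-length window."""
    return sum(weights[base] for weights, base in zip(MOTIF_WEIGHTS, window))


def binding_score(sequence):
    """Best motif match over all windows, squashed into (0, 1).

    Assumes an uppercase sequence containing only A, C, G and T.
    """
    width = len(MOTIF_WEIGHTS)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the motif")
    best = max(
        window_score(sequence[i:i + width])
        for i in range(len(sequence) - width + 1)
    )
    return 1.0 / (1.0 + math.exp(-(best + BIAS)))


def mutation_effect(reference, mutant):
    """Change in binding score caused by a mutation (negative = weaker)."""
    return binding_score(mutant) - binding_score(reference)


# Hypothetical sequences: the mutant changes one base inside the motif.
reference = "GGATGCATCC"
mutant = "GGATGTATCC"

print(f"reference score: {binding_score(reference):.3f}")
print(f"mutant score:    {binding_score(mutant):.3f}")
print(f"effect:          {mutation_effect(reference, mutant):+.3f}")
```

In this toy setup the single-base change falls inside the best-matching site, so the mutant scores lower than the reference. A learned model would replace the hand-written weights with many filters fitted to experimental binding data.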

The team developed the model into a stand-alone software tool known as DeepBind that the authors say can automatically analyze millions of sequences at a time. In tests of the model reported in the journal paper, the team uncovered new details about hemophilia and hypercholesterolemia, inherited conditions causing blood clotting problems and high cholesterol respectively, as well as some forms of cancer and mutations related to disorders in the cerebral cortex.

Deep Genomics, a spin-off company from the University of Toronto, where Frey is on the engineering faculty, is commercializing deep learning technologies applied to genomics. Frey is the company’s CEO, with co-authors Andrew DeLong and Babak Alipanahi serving on the company’s scientific team. The first product from Deep Genomics is Spidex, a data set of genetic variations with predicted effects on RNA splicing, the modification of RNA before it becomes the messenger RNA that carries signals to cells. The company is making Spidex available free of charge for non-commercial use.
