AI Program Learns How to Find Dangerous Mutations in the Genome
A new machine-learning system has uncovered dangerous mutations in large regions of the genome that previously could not be explored, revealing unexpected genetic determinants of autism, colon cancer, and spinal muscular atrophy.
The results, obtained by a team led by computational biologist Brendan Frey of the University of Toronto, are reported in a paper published in Science with the title “The human splicing code reveals new insights into the genetic determinants of disease.”
So far, most researchers interested in the genetic roots of disease have analyzed only the 2 percent of the human genome that includes protein-coding DNA sequences. But that, according to Frey, is relatively easy “low-hanging fruit.” Many disease-related mutations occur in the “intronic” regions of the genome – the parts that do not directly make proteins but that still regulate how genes behave. In fact, intronic disease mutations alter splicing nine times more often than common variants. Scientists have long been aware of how valuable it would be to analyze the hidden 98 percent of the genome, but there has not been a practical way to do it.
The Computer Program Trained Itself to Read Instructions Embedded in the DNA
Enter the new machine-learning algorithm developed by Frey and colleagues, which makes it possible to analyze the entire genome, Scientific American reports. The algorithm was “trained” with millions of data points: DNA sequences, genetic variations, and RNA splicing patterns. It was then able to extrapolate how likely it was that each of tens of thousands of mutations could cause a splicing error associated with a particular disease, and to detect tens of thousands of disease-causing mutations, including those involved in cancers, spinal muscular atrophy, and autism.
Frey’s team used an approach called “deep learning,” a machine-learning technique that tries to find hidden relationships between different sets of data, Quanta reports. In this case, the relationships were between the human reference genome and rich data sets cataloging the amounts of different protein components in different tissues. In essence, the computer program trained itself to read instructions embedded in the DNA.
Deep learning is considered a step toward strong artificial intelligence (AI), and many organizations are exploring its applications, including military and civilian research labs worldwide and Internet companies. Facebook’s AI lab will develop deep learning techniques to help Facebook with tasks such as automatically tagging uploaded pictures with the names of the people in them, and Google has hired AI researchers to develop machine learning techniques for search, personalized content, and “Big Data” analysis.