MIT’s CodePhage Fixes Software Bugs With Biologically Inspired Gene Swapping

MIT researchers have developed a system that repairs dangerous software bugs by automatically importing functionality from other, more secure applications.

Dubbed CodePhage, the system uses “horizontal code transfer” to automatically eliminate errors in recipient software applications: it finds correct code in donor applications, then transfers that code from the donor into the recipient. The result is a software hybrid that productively combines beneficial code from multiple applications.

In biology, “horizontal gene transfer” is the transfer of genetic material between cells in different organisms. Because of its ability to directly transfer functionality evolved and refined in one organism into another, horizontal gene transfer is recognized as a significant aspect of biological evolution. Virally mediated gene therapy is an important application of horizontal gene transfer, and the work of the MIT scientists could be considered a form of gene therapy… for software programs.

Automatic Code Swaps Between Programs That Process the Same Input

Many software errors, including potentially very dangerous security faults, are caused by uncommon operational cases that weren’t anticipated by the developers. But in many cases, the developers of another application did anticipate the uncommon case and wrote correct code to handle it.

“We have tons of source code available in open-source repositories, millions of projects, and a lot of these projects implement similar specifications,” says Stelios Sidiroglou-Douskos, a research scientist at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) who led the development of CodePhage. “They frequently have subcomponents that share functionality across a large number of projects. Over time, what you’d be doing is building this hybrid system that takes the best components from all these implementations.”

The researchers presented their work at the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation. Their paper “Automatic error elimination by horizontal code transfer across multiple applications,” published in the conference proceedings, is freely available at the time of writing.

CodePhage can work in tandem with DIODE, a bug-locating program developed by the same research group that can generate crash-inducing inputs automatically. Given a crash-inducing input and a safe input for a “recipient” program that needs fixing, CodePhage feeds both to a “donor” program that processes the same inputs correctly. The hypothesis is that the donor contains a check, missing in the recipient, that enables it to process the error-triggering input correctly. The goal is to identify the check, extract a code patch, and transfer it from the donor into the recipient.
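The idea can be illustrated with a toy sketch in Python. Everything in it is hypothetical and invented for illustration: the function names, the simulated 16-bit overflow in the recipient, and the donor's checks modeled as a small table of named predicates. It does not attempt to reproduce how CodePhage itself locates or translates checks; it only shows the general pattern of picking the donor check that separates the safe input from the crashing one and placing it in front of the recipient.

```python
# Toy illustration only: every name here is hypothetical, not CodePhage's API.

def recipient_alloc(width, height):
    """Recipient: computes a buffer size with a simulated 16-bit overflow."""
    size = (width * height) & 0xFFFF      # size silently wraps around
    buf = bytearray(size)
    buf[width * height - 1] = 1           # out-of-bounds write if size wrapped
    return len(buf)

# Assumption: the donor's input checks are modeled as named predicates.
DONOR_CHECKS = {
    "positive_dims": lambda w, h: w > 0 and h > 0,
    "no_overflow": lambda w, h: w * h <= 0xFFFF,
}

def find_candidate_checks(safe_input, crash_input):
    """Keep the donor checks that accept the safe input and reject the crash input."""
    return [
        name
        for name, check in DONOR_CHECKS.items()
        if check(*safe_input) and not check(*crash_input)
    ]

def transfer_check(recipient, check):
    """Return a patched recipient that rejects inputs failing the donor check."""
    def patched(width, height):
        if not check(width, height):
            raise ValueError("input rejected by transferred check")
        return recipient(width, height)
    return patched

safe_input, crash_input = (10, 10), (300, 300)
candidates = find_candidate_checks(safe_input, crash_input)
patched_alloc = transfer_check(recipient_alloc, DONOR_CHECKS[candidates[0]])

print(candidates)                  # names of checks that separate the two inputs
print(patched_alloc(*safe_input))  # the safe input is still processed normally
```

In this sketch the unpatched recipient fails with an IndexError on the crash-inducing input, while the patched version rejects that input cleanly and still handles the safe one, which mirrors the validation step any such transfer needs.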

The researchers tested CodePhage on seven common open-source programs in which DIODE had found bugs, importing repairs from between two and four donors for each. In all instances, CodePhage was able to patch up the vulnerable code, and it generally took between two and 10 minutes per repair.

In modern commercial software, security checks can take up more than 80 percent of the code. Future versions of CodePhage could significantly speed up the software development process by automating the identification and implementation of the necessary security checks.

“The longer-term vision is that you never have to write a piece of code that somebody else has written before,” says team leader Martin Rinard, an MIT professor of computer science and engineering.

“The system finds that piece of code and automatically puts it together with whatever pieces of code you need to make your program work.”

Images from MIT and Shutterstock.

Giulio Prisco is a freelance writer specializing in science, technology, business and future studies.