DARPA to mine 'big code' to improve software reliability
By combining principles of big data analytics with software analysis, DARPA wants to make significant advances in the way software is built, debugged, verified, maintained and understood.
During the past decade, information technologies have driven productivity gains that are essential to U.S. economic competitiveness, and computing systems now control a significant portion of critical infrastructure.
As a result, tremendous public and commercial resources are devoted to ensuring that programs are correct, especially at scale. Yet, in spite of sizeable efforts by developers, software defects remain at the root of most system errors and security vulnerabilities.
To address the predicament, the Defense Advanced Research Projects Agency wants to advance the way software is built, debugged, verified, maintained and understood by combining principles of big data analytics with software analysis.
According to DARPA's announcement, the Mining and Understanding Software Enclaves (MUSE) program would facilitate new ways to dramatically improve software correctness and help develop radically different approaches for automatically constructing and repairing complex software.
“Our goal is to apply the principles of big data analytics to identify and understand deep commonalities among the constantly evolving corpus of software drawn from the hundreds of billions of lines of open source code available today,” said Suresh Jagannathan, DARPA program manager, in the announcement.
“We’re aiming to treat programs—more precisely, facts about programs—as data, discovering new relationships (enclaves) among this ‘big code’ to build better, more robust software.”
Central to MUSE’s approach is the creation of a community infrastructure that would incorporate a continuously operating specification-mining engine, the agency said. This engine would use “deep program analyses and big data analytics to create a public database containing … inferences about salient properties, behaviors and vulnerabilities of software drawn from the hundreds of billions of lines of open source code available today.”
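To give a rough sense of what specification mining means in practice, the sketch below infers simple "a call to A is later followed by a call to B" rules from a handful of call sequences, then flags sequences that break a rule most of the corpus follows. It is a toy illustration only, not DARPA's or MUSE's actual method: the function names, the thresholds and the sample traces are all invented for this example.

```python
from collections import Counter

def mine_pair_rules(traces, min_support=3, min_confidence=0.8):
    """Infer rules of the form 'a call to A is later followed by a call to B'.

    Hypothetical helper: names and thresholds are assumptions for illustration.
    """
    has_a = Counter()     # number of traces that contain A
    a_then_b = Counter()  # number of traces where B occurs after the first A
    for trace in traces:
        for a in set(trace):
            has_a[a] += 1
            later_calls = set(trace[trace.index(a) + 1:])
            for b in later_calls - {a}:
                a_then_b[(a, b)] += 1

    rules = {}
    for (a, b), count in a_then_b.items():
        confidence = count / has_a[a]
        if count >= min_support and confidence >= min_confidence:
            rules[(a, b)] = confidence
    return rules

def find_violations(traces, rules):
    """Return (trace index, A, B) for each trace that calls A but never B afterward."""
    violations = []
    for i, trace in enumerate(traces):
        for (a, b) in rules:
            if a in trace and b not in trace[trace.index(a) + 1:]:
                violations.append((i, a, b))
    return violations

# Invented toy corpus: each list stands for the calls one program makes on a file.
traces = [
    ["open", "read", "close"],
    ["open", "write", "close"],
    ["open", "read", "close"],
    ["open", "write", "close"],
    ["open", "seek", "read", "close"],
    ["open", "read"],  # never closes the file
]

rules = mine_pair_rules(traces)
violations = find_violations(traces, rules)
print(rules)
print(violations)
```

On this toy corpus the only rule that clears both thresholds is that "open" is eventually followed by "close", and the last trace is reported as breaking it. A real engine of the kind DARPA describes would presumably work from much deeper program analyses than flat call lists.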
“The collective knowledge gleaned from this effort would facilitate new mechanisms for dramatically improving software reliability and help develop radically different approaches for automatically constructing and repairing complex software,” DARPA said in describing the project.