DHS Bug Hunt Returns Mixed Reaction
The results of a Homeland Security Department-funded bug hunt spanning 40 popular open-source programs have thus far met with ambivalence from the open-source community. While many projects are using the results to improve their software, others are bemoaning the high number of false positives.
In January, DHS' Science and Technology Directorate awarded a team comprising Coverity Inc. of San Francisco, Stanford University and Symantec Corp. of Cupertino, Calif., a three-year, $1.2 million contract to find heretofore undiscovered vulnerabilities in widely used open-source programs, such as the Linux kernel and the Apache Web server.
Improving quality
Through its Vulnerability Discovery and Remediation Open Source Hardening Project, DHS wants to improve the quality of open-source programs supporting critical U.S. infrastructure (dams, power grids, highways). The concern is that defects pose security vulnerabilities because malicious programs could use them to disrupt or gain control of a system.
'DHS realizes that much of the critical infrastructure runs on open source,' said David Park, co-founder and vice president of marketing and business development for software testing company Coverity.
But Ben Laurie, a chief developer of Apache, said he finds the project puzzling. He wants to know why the agency would pay a third party to search for bugs, yet 'make no contribution to fixing them.'
Stanford University, which gets the majority of the funding, will investigate new techniques for analyzing complex sets of software code for critical defects. Coverity was tasked with testing 40 open-source applications, using its existing test software (which was largely developed at Stanford).
In March, Coverity released the first set of results. Overall, the average defect density of all the programs was fairly low: about 0.43 bugs per thousand lines of code.
The most widely used programs scored well under this average.
The 3 million lines of code that make up the Linux kernel, for instance, had an average of 0.33 bugs per thousand lines of code. Apache had 0.25 bugs per thousand lines of code.
The open-source LAMP stack (consisting of Linux, Apache, MySQL and a scripting language of either Perl, PHP or Python) had a defect density of 0.29 bugs per thousand lines of code.
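To make the defect-density figures concrete, the short Python sketch below shows the arithmetic behind a 'bugs per thousand lines of code' number. The function names are illustrative only. The kernel figures (3 million lines, 0.33 bugs per thousand lines) come from the results quoted above; the implied count of reported defects is derived here by multiplication and is not a number Coverity stated in this article.

```python
def defect_density(defects: float, lines_of_code: int) -> float:
    """Return reported defects per thousand lines of code (KLOC)."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return defects / (lines_of_code / 1000)


def implied_defects(density_per_kloc: float, lines_of_code: int) -> float:
    """Invert the density formula to estimate the reported defect count."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return density_per_kloc * lines_of_code / 1000


# Figures quoted in the article for the Linux kernel.
LINUX_LINES = 3_000_000
LINUX_DENSITY = 0.33  # bugs per thousand lines of code

# Derived estimate (an assumption of this sketch, not a reported count).
linux_reported = implied_defects(LINUX_DENSITY, LINUX_LINES)
print(f"Implied reported defects in the kernel: about {round(linux_reported)}")
```

By this arithmetic, a density of 0.33 across 3 million lines corresponds to roughly 990 reported defects. Because the density counts reports rather than confirmed bugs, any false positives among them would lower the true figure.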
Several open-source project teams have registered with Coverity to see the full set of results. The team overseeing the PostgreSQL database system is reviewing the new list, said Bruce Momjian, the project's coordinator.
He has used Coverity reports before, he said, and found them to be useful, if not absolutely essential. The results of a previous study pointed to 'a few unusual cases that weren't exploitable bugs, but were something we wanted to clean up,' he said. With this set of results, however, he has found a number of reported bugs that were actually already fixed.
The initial results for Apache contain an abundance of 'false positives' as well, Laurie said.
False positives are pointers to potential problems that turn out not to be problems at all. 'Unverified results from [Coverity's] software are unlikely to continue to be seen to be valuable by developers because of the rather high false-positive rate,' he warned.
False positives
Members of the Linux kernel mailing list bemoaned the false positives in the kernel's results, which, to their chagrin, inflated the overall number of bugs Coverity claimed to have found in the Linux kernel.
The group quickly descended on Coverity's list, with volunteers picking out reported bugs, listing them by their Coverity identification numbers, and posting code that would fix the problems. They soon found themselves wasting time going down blind alleys.
'About half of the [approximately] 50 reports I've looked at so far in their database have been false positives,' one developer quipped. Greg Kroah-Hartman, who maintains a number of driver subsystems for the Linux kernel, pointed to one instance where a developer investigated a purported memory initialization error only to find, after much discussion with his peers, that it did not exist after all.
The software had difficulty parsing basic C code, he said. 'There was nothing wrong with the kernel code at all,' he said in an e-mail.
'All source-code analysis solutions must deal with the false-positive problem,' admitted Ben Chelf, CTO of Coverity. He said that because the company's software is configurable, users can lower the false-positive rate by tweaking the settings.
Members of the Linux kernel group also question why the results have not been made public. Simple bug repair is the type of job outside developers do well, they claim.
The project Web site, scan.coverity.com, offers a tally of how many bugs the project found with each program, but only those involved with the programs themselves can see the full results.
Originally, Coverity did not want to disclose the bugs, fearing that malicious hackers would use the results to nefarious ends. At press time, Chelf said the company was reconsidering publicly posting the full results.
Despite criticisms, Chelf remains confident that DHS is helping the open-source community. 'I'm personally very excited at the response thus far,' he said.