Open Source Software — Who Actually Reviews the Code?

This post is co-authored by Robby Simpson and Sven Krasser. So, you can find it on both Robby’s and Sven’s blogs — you should check them both out!

Last year saw a large number of critical bugs in open source software (OSS). These bugs received a lot of media attention and re-opened the discussion of bugs and security in OSS. This has led many to question whether Eric S. Raymond’s famous statement that “Given enough eyeballs, all bugs are shallow” holds true.

There are two aspects to consider here: first, does bug discovery parallelize well? Particularly for subtle security-related bugs, a large number of users does not necessarily aid discovery, but rather a dedicated review effort may be required. We’ll leave this aspect for a separate discussion…

Second, are there actually more eyeballs looking at open source software? For that question, we have some data to contribute to the discussion. In 2003, Robby released NETI@home, a project to gather network performance metrics from end hosts, as part of his PhD dissertation work at Georgia Tech. You can find the source code on SourceForge. The NETI@home agent runs on end user machines and gathers various network performance data (e.g., the number of flows per protocol, the number of packets per flow, and TCP window sizes). Such data has many uses, including improving models in network simulations and observing suspicious traffic patterns.
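
To make that concrete, here is a minimal sketch of the kind of per-flow record such an agent might keep. This is our illustration, not the actual NETI@home code; all type and field names are ours.

#include <cstdint>
#include <map>
#include <string>
#include <tuple>

// Identifies a flow by its endpoints and protocol.
struct FlowKey {
    std::string   src_addr;   // source IP address
    std::string   dst_addr;   // destination IP address
    std::uint16_t src_port = 0;
    std::uint16_t dst_port = 0;
    std::uint8_t  protocol = 0;  // e.g., 6 for TCP, 17 for UDP

    bool operator<(const FlowKey& o) const {
        return std::tie(src_addr, dst_addr, src_port, dst_port, protocol)
             < std::tie(o.src_addr, o.dst_addr, o.src_port, o.dst_port, o.protocol);
    }
};

// Counters accumulated per flow before they are reported.
struct FlowStats {
    std::uint64_t packets = 0;         // packets seen for this flow
    std::uint64_t bytes = 0;           // total bytes for this flow
    std::uint32_t max_tcp_window = 0;  // largest advertised TCP window observed
};

// Aggregate table: one entry per observed flow.
std::map<FlowKey, FlowStats> flow_table;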

A driving factor for releasing NETI@home as OSS was that it gathers a lot of information that could raise privacy concerns among users, and such concerns could hinder adoption. The most forthcoming way to address them is to let users actually read the code. As researchers, that piqued our curiosity: how many users have these concerns and will go on to review the source code?

How could we measure this user code review? Download stats for the source code are an option, but they don’t tell us much about users actually looking at the code. Instead, we placed a comment in the section of code where privacy preferences are honored, which reads:

/*
* You have found the "hid-
* den message!" Please visit
* http://www.neti.gatech.edu/sec/sec.html
* and log in as user 'neti'
* and pw 'hobbit'
*/
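
To give a sense of placement: the idea was simply that the message sits inline at the spot a privacy-conscious reviewer would have to read. The following is a hypothetical sketch of that placement, not the actual NETI@home source; the types and names are ours.

/*
 * Hypothetical sketch: the comment sat inline where the user's
 * privacy preferences are checked, so anyone auditing what the
 * agent reports would run straight into it.
 */
enum class PrivacyLevel { None, Anonymized, Full };

struct UserPrefs {
    PrivacyLevel privacy_level = PrivacyLevel::Anonymized;
};

bool should_report(const UserPrefs& prefs) {
    /* The hidden-message comment quoted above would appear right here,
     * next to the check that honors the user's choice. */
    return prefs.privacy_level != PrivacyLevel::None;
}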

The web page mentioned in this comment contained an explanation along with an email address (the page was taken down around 2009). Visiting a link is a lower hurdle than sending an email, so we looked for both pageviews of that page and emails to the address it listed. The former would have told us that someone found the comment, while the latter would have confirmed that someone acted on it. However, we didn’t receive a single pageview (and therefore no emails either).
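
The pageview side of that measurement is mundane; scanning the web server’s access log is all it takes. Here is a minimal sketch, assuming an Apache/NCSA-style log named access.log (an assumption on our part, not a description of the original setup):

#include <fstream>
#include <iostream>
#include <string>

int main() {
    std::ifstream log("access.log");  // assumed log file name
    std::string line;
    long hits = 0;
    while (std::getline(log, line)) {
        // Any request for the page referenced in the comment counts.
        if (line.find("/sec/sec.html") != std::string::npos) {
            ++hits;
        }
    }
    std::cout << "pageviews of /sec/sec.html: " << hits << "\n";
    return 0;
}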

To put this into perspective with NETI@home’s user base: there were about 13,000 downloads of the software and about 4,500 active users who ran the agent. We can safely say that the typical user running the software falls into the geek category, so there is some expected selection bias toward taking an interest in the source code.

Granted, this is slightly different from contributors to an open source project reviewing code. Nonetheless, the result came as a surprise to us, and it certainly went against the conventional wisdom at the time.

As fans of OSS, we were both disappointed in the results. However, we hope that sharing this data point will add to the larger discussion, help strengthen the open source community, and show the need for dedicated code review.