The Maximal Usage Doctrine for Open Source

I've worked on a number of open source projects over the past several years (the most prominent being Merb, Ruby on Rails and jQuery) and have begun to form some thoughts about the usage (aka adoption) of open source projects and the practical effects of license styles.

The Playing Field

There are essentially two kinds of licenses popularly used in open source[1].

The type I've worked with most extensively in the BSD or MIT-style license. This license allows unlimited usage, modification, distribution, and commercialization of the source code, with two caveats. First, the copyright notice must be distributed with the source code. Second, to the extent legally enforceable, if you use my MIT-licensed source code, you may not sue me for the effects of using the source.

The other major license type is called copyleft, and essentially leverages copyright law to insist that any modifications to the original source must themselves be made public. The practical effect of this license could not be effected without government-enforced copyright laws.

The most popular copyleft license is GPL, which is used by Linux, and which also has viral characteristics. In short, if you link code covered by the GPL into your own code, you are required to open your code as well (if you distribute it). The GPL is not very clear about what this means for more dynamic languages (is requiring a Ruby file "linking"?).

A less viral version of the GPL, the LGPL, requires that any modifications you make to the software itself be released, but drops the "linking" requirement.

Because of the uncertainty surrounding the viral characteristics of the GPL, legal departments in big corporations are very hostile to the GPL license. Software covered by the LGPL license is more likely to get approval by corporate legal, and MIT/BSD licenses are the most well-liked (probably because they don't confer any obligations on the corporation).

What I Want When I Write Open Source

When I work on a serious open source project, like Ruby on Rails or jQuery, I have relatively simple desires.

One. I want as many people as possible to use the software I am working on. This is probably a selfish, egoist desire, but it may also be a desire to see useful software actually used.

Two. I want some subset of the many users to be willing to report bugs, and for the varied situations the code is used in to help improve the quality of the software as a result. This is essentially "given enough eyeballs, all bugs are shallow".

Three. I want people who are interested in improving the software to submit patches. Specifically, I want to receive patches from people who have thought through the problem the bug is trying to fix, and want to receive fewer patches from people who have hacked together a solution just to get it to work. In essence, a high signal-to-noise ratio of patches is more important than a high volume of patches.

Four. I want people who are interested in improving the software long term to become long-term contributors and eventually committers.

Of these concerns, numbers two and three are the highest priorities. I want to expose the code to as much real-world pressure as possible, and I want as many high-quality patches to fix those bugs as possible. If I had to choose between those two, I would pick number two: exposing my code to real-world stress is the best way to rapidly ferret out bugs and incorrect design assumptions that I know of.

Meeting Those Requirements

Starting out with the easiest, my first desire, to have my software used as much as possible, is most easily satisfied by an extremely liberal usage policy. Adding restrictions on the use of software I write reduces its adoption almost by definition.

Much more importantly, the same can be said about exposing code to real world stresses. By far the most important way to achieve this goal is to make it as easy as possible for as many people as possible to use the code.

If only 1% of all proprietary users of the source ever report bugs, that's 1% of potentially thousands of users, as opposed to 100% of the zero proprietary users who were able to use the software under a more restrictive usage scheme. In practice, this number is much more than 1%, as proprietary users of software experience and report bugs just like open source users do.

The only real counter-argument to this is that by forcing users to contribute, some number of proprietary users will be forced to become open source users, and their contributions will outweigh the smaller contributions of proprietary users. In practice, proprietary users choose proprietary solutions instead when they are forced to choose between restrictive open source usage schemes and other proprietary software.

There is also much to be said for exposing open source tools into proprietary environments.

There is also much to be said for introducing proprietary developers to the open source ecosystem. Proprietary developers have access to things like paid time, access to obscure usage scenarios, and access to markets with low open source penetration that the open source community lacks.

The next major desire I have while working on open source is a steady stream of high-quality patches. This should be advantage copyleft, because all users of the software are forced to contribute back. However, since copyleft licenses are not used in proprietary environments anyway, the patches to open source projects from those environments under more permissive licenses are much more numerous. Again, even if only a few percent of proprietary users contribute back to the project, that is significantly more contributions than the 100% of zero proprietary users.

Also importantly, the patches are contributed by much more dedicated users of the software, instead of being force-contributions. I have never heard a team member on an open source project say that inadequate patches are received by permissive-license software, and that this problem would be solved by going to a more restrictive model.

Finally, I can only speak for jQuery and Rails (and other smaller open source projects I work on), but a large number of new long-term contributors became involved while working on corporate projects, where a permissive license made the decision to use the software in the first place feasible.

Lower Barrier to Entry Helps Meet Open Source Goals

Regardless of how much of the above argument you agree with, it is clear that copyleft licenses intentionally impose a higher barrier to entry for usage than more permissive licenses.

For projects that use more permissive licenses, the fact that many proprietary users of their software do not contribute back changes is a feature, not a bug.

That's because we don't focus on all of the people who use our software without contributing back. Instead, we focus on all the users, bug reports, patches, and long term contributors we have gained by keeping the barrier to entry as low as possible.

In the end, the world of ubiquitous open source is here. And we made it without having to resort to coercion or the formation of an entirely self-contained set of software. We made it by building great software and convincing real users that it was better than the alternative.

Postscript: Linux

Linux is a peculiar example because its license has not impeded its usage much. In part, that is because most users of Linux do not make changes to it, and the linking characteristics of the GPL license rarely come into play.

In cases like this, I would argue that copyleft licenses are close enough to usage maximization to get most of the benefits of more permissive licenses. However, it's not clear to me what benefits the copyleft licenses provide in those (relatively rare) cases.

[1] There are other kinds of licenses, like the Affero license, which is even more restrictive than the GPL license. I would classify Affero as copyleft.