Wednesday, February 22, 2012

Why Organizations Should Open-Source Projects

I can understand why traditional businesses often struggle with the concept of contributing to open source projects or maintaining OSS projects of their own. If a business manager looks at software as a physical, inherently valuable object, it is often hard to make that same object freely available. After all, if the business spent $5,000 on development of the software, why should it give the result away?

I believe the more appropriate view is that the value of an application isn't realized upon publication, the way it is for a book or a movie. While all of them may be products of creative labor, software is never really "finished." A well-kept application is always in a state of flux, adapting to new use cases and fixing defects. The only time you are actually done with an application's codebase is when you abandon it.

When you realize software is never completely done, the big question is how a business effectively maintains the codebase. How are bugs found and then resolved? How are new features prioritized and implemented? How do you keep things going without draining your existing engineering staff? Once those questions come to light, releasing software to an open source community makes a lot more sense.

I'm not saying that all applications a business writes should be publicly released as open source. Apps laden with business logic, code that epitomizes your core business, will likely not be re-usable for others in the community and may disclose sensitive business practices. However, "glue" libraries such as utilities, messaging or scalability frameworks can be highly re-usable and can be isolated so as not to disclose any core business use cases.

If an application is re-usable and adopted by others in the open-source community, they begin to rely on your app and apply their own critical thinking to its codebase. The larger community may conceive of use cases you haven't yet encountered, or find esoteric bugs that you haven't run into yet. Even better, OSS developers will often contribute code or bug patches to resolve issues or add much-desired features. At this point the maintenance efforts for your codebase are distributed among a large and very knowledgeable public, significantly reducing the expense of maintaining the code. By adding multiple points of view, new ideas and a pool of developers, an app may become more reliable than if you attempted to maintain it on your own. Both Netflix and Twitter have released such projects as open source with great success and community support.

Recently I helped foster a similar initiative to release a Spring AMQP component for Apache Camel. There was absolutely no business logic within the codebase; it was a glue component used for message transport, and so it needed to be as rock-solid and dependable as possible. Not only that, engineering resources were scarce and AMQP best practices were still new to the team. The more eyes that could review the component and the codebase, the better it would be.

After some evaluation we chose GitHub to host the source and Sonatype's OSS repository to host the resulting Maven artifacts. After the initial import into GitHub was performed, we signed up for Sonatype's OSS repository access and began publishing snapshots. Jenkins, our continuous integration server, would check out the codebase from GitHub and then publish snapshots to Sonatype on demand. Internally we started using the snapshot builds, ensuring we pushed our local changes into GitHub whenever we were ready to distribute another snapshot build.

One question that arose early was what the company's "sponsorship" of the project should be. The company managers and directors wanted to ensure the project didn't carry any organization-specific artifacts with it - for example, packages or classes that carry the company name. However, it does make sense to have a single steward of the project, one that monitors submissions and maintains the pipeline. To that end we used GitHub's "team" concept to create a team repository where software engineers were added as owners of the team code. The core team would merge pull requests, monitor issues being submitted and ultimately be responsible for pushing artifacts to the Sonatype repository. If an employee left the company, they could be removed from the team and lose the right to publish to the Maven repository, but they could still create forks and submit pull requests. This was an added benefit - engineers could continue contributing to the project long after they left the company itself. By open sourcing the project we could both open maintenance to a sea of new developers and avoid losing the historical knowledge of old ones.

While the code itself was freely available and the binaries were actively published, the project couldn't necessarily be considered "released" to the community until we began promoting it. The project touched several other popular projects, including Spring, Apache Camel and RabbitMQ. Once the component was in a stable-ish state that could be tested by other developers, posts were submitted to mailing lists for each project. There were varying levels of response, but a few individuals began to express interest and even volunteered to write How-To documents and include it in peer presentations. At the same time I also started sharing lessons learned with the StackOverflow user base, and offered snippets from the component's codebase when I thought they could be useful. As a result the GitHub project began to rank higher in search results, which also helped greatly with visibility.

Once the component started to be used in production we released the initial one-dot-oh release. The Maven release was promoted to the Sonatype stable repository, tweets were tweeted, posts were submitted to mailing lists and we tried to invite as many people as possible to kick the tires. Once it became easy to include the component in Ivy and Maven dependencies, adoption greatly increased and more people tried the component. As a result we started to see an increase in pull requests, suggestions and bug fixes. There were a few deviations from the AMQP specification that we wouldn't have noticed had the community not taken a critical look at the component and provided patches. Use cases for asynchronous production made for very helpful unit tests and helped prioritize new features. The 1.1.0 release of the component was markedly more robust than the 1.0 release but required less engineering effort from the team.

To my mind everyone won through the open-sourcing of the camel-spring-amqp project. The company was able to deliver high-quality software hardened through peer review, and a global pool of developers was able to re-use a collaborative codebase. Overall cost went down, business value went up and high-fives were copiously distributed to all involved.
