Code First and the Rise of the DVCS and GitHub

January 10, 2015 at 12:35 PM | categories: Git, Mozilla

The ascendancy of GitHub has very little to do with its namesake tool, Git.

What GitHub did that was so radical for its time and the strategy that GitHub continues to execute so well on today is the approach of putting code first and enabling change to be a frictionless process.

In case you weren't around for the pre-GitHub days or don't remember, they were not pleasant. Tools around code management were a far cry from where they are today (I still argue the tools are pretty bad, but that's for another post). Centralized version control systems were prevalent (CVS and Subversion in open source, Perforce, ClearCase, Team Foundation Server, and others in the corporate world). Tools for looking at and querying code had horrible, ugly interfaces and came out of a previous era of web design and browser capabilities. It felt like a chore to do anything, including committing code. Yes, the world had awesome services like SourceForge, but they weren't the same as GitHub is today.

Before I get to my central thesis, I want to highlight some supporting reasons for GitHub's success. There were two developments in the second half of the 2000s the contributed to the success of GitHub: the rises of the distributed version control system (DVCS) and the modern web.

While distributed version control systems like Sun WorkShop TeamWare and BitKeeper existed earlier, it wasn't until the second half of the 2000s that DVCS systems took off. You can argue part of the reason for this was open source: my recollection is there wasn't a well-known DVCS available as free software before 2005. Speaking of 2005, it was a big year for DVCS projects: Git, Mercurial, and Bazaar all had initial releases. Suddenly, there were old-but-new ideas on how to do source control being exposed to new and willing-to-experiment audiences. DVCS were a critical leap from traditional version control because they (theoretically) impose less process and workflow limitations on users. With traditional version control, you needed to be online to commit, meaning you were managing patches, not commits, in your local development workflow. There were some forms of branching and merging, but they were a far cry from what is available today and were often too complex for mere mortals to use. As more and more people were exposed to distributed version control, they welcomed its less-restrictive and more powerful workflows. They realized that source control tools don't have to be so limiting. Distributed version control also promised all kinds of revamped workflows that could be harnessed. There were potential wins all around.

Around the same time that open source DVCS systems were emerging, web browsers were evolving from an application to render static pages to a platform for running web applications. Web sites using JavaScript to dynamically manipulate web page content (DHTML as it was known back then) were starting to hit their stride. I believe it was GMail that turned the most heads as to the full power of the modern web experience, with its novel-for-its-time extreme reliance on XMLHttpRequest for dynamically changing page content. People were realizing that powerful, desktop-like applications could be built for the web and could run everywhere.

GitHub launched in April 2008 standing on the shoulders of both the emerging interest in the Git content tracking tool and the capabilities of modern browsers.

I wasn't an early user of GitHub. My recollection is that GitHub was mostly a Rubyist's playground back then. I wasn't a Ruby programmer, so I had little reason to use GitHub in the early stages. But people did start using GitHub. And in the spirit of Ruby (on Rails), GitHub moved fast, or at least was projecting the notion that they were. While other services built on top of DVCS tools - like Bitbucket - did exist back then, GitHub seemed to have momentum associated with it. (Look at the archives for GitHub's and Bitbucket's respective blogs. GitHub has hundreds of blog entries; Bitbucket numbers in the dozens.) Developers everywhere up until this point had all been dealing with sub-optimal tools and workflows. Some of us realized it. Others hadn't. Many of those who did saw GitHub as a beacon of hope: we have all these new ideas and new potentials with distributed version control and here is a service under active development trying to figure out how to exploit that. Oh, and it's free for open source. Sign me up!

GitHub did capitalize on a market opportunity. They also capitalized on the value of marketing and the perception that they were moving fast and providing features that people - especially in open source - wanted. This captured the early adopters market. But I think what really set GitHub apart and led to the success they are enjoying today is their code first approach and their desire to make contribution easy, and even fun and sociable.

As developers, our job is to solve problems. We often do that by writing and changing code. And this often involves working as part of a team, or collaborating. To collaborate, we need tools. You eventually need some processes. And as I recently blogged, this can lead to process debt and inefficiencies associated with them.

Before GitHub, the process debt for contributing to other projects was high. You often had to subscribe to mailing lists in order to submit patches as emails. Or, you had to create an account on someone's bug tracker or code review tool before you could send patches. Then you had to figure out how to use these tools and any organization or project-specific extensions and workflows attached to them. It was quite involved and a lot could go wrong. Many projects and organizations (like Mozilla) still practice this traditional methology. Furthermore (and as I've written before), these traditional, single patch/commit-based tools often aren't effective at ensuring the desired output of high quality software.

Before GitHub solved process debt via commoditization of knowledge via market dominance, they took another approach: emphasizing code first development.

GitHub is all about the code. You load a project page and you see code. You may think a README with basic project information would be the first thing on a project page. But it isn't. Code, like data, is king.

Collaboration and contribution on GitHub revolve around the pull request. It's a way of saying, hey, I made a change, will you take it? There's nothing too novel in the concept of the pull request: it's fundamentally no different than sending emails with patches to a mailing list. But what is so special is GitHub's execution. Gone are the days of configuring and using one-off tools and processes. Instead, we have the friendly confines of a clean, friendly, and modern web experience. While GitHub is built upon the Git tool, you don't even need to use Git (a tool lampooned for its horrible usability and approachability) to contribute on GitHub! Instead, you can do everything from your browser. That warrants repeating: you don't need to leave your browser to contribute on GitHub. GitHub has essentially reduced process debt to edit a text document territory, and pretty much anybody who has used a computer can do that. This has enabled GitHub to dabble into non-code territory, such as its GitHub and Government initiative to foster community involvement in government. (GitHub is really a platform for easily seeing and changing any content or data. But, please, let me continue using code as a stand-in, since I want to focus on the developer audience.)

GitHub took an overly-complicated and fragmented world of varying contribution processes and made the new world revolve around code and a unified and simple process for change - the pull request.

Yes, there are other reasons for GitHub's success. You can make strong arguments that GitHub has capitalized on the social and psychological aspects of coding and human desire for success and happiness. I agree.

You can also argue GitHub succeeded because of Git. That statement is more or less technically accurate, but I don't think it is a sound argument. Git may have been the most feature complete open source DVCS at the time GitHub came into existence. But that doesn't mean there is something special about Git that no other DVCS has that makes GitHub popular. Had another tool been more feature complete or had the backing of a project as large as Linux at the time of GitHub's launch, we could very well be looking at a successful service built on something that isn't Git. Git had early market advantage and I argue its popularity today - a lot of it via GitHub - is largely a result of its early advantages over competing tools. And, I would go so far to say that when you consider the poor usability of Git and the pain that its users go through when first learning it, more accurate statements would be that GitHub succeeded in spite of Git and Git owes much of its success to GitHub.

When I look back at the rise of GitHub, I see a service that has succeeded by putting people first by allowing them to capitalize on more productive workflows and processes. They've done this by emphasizing code, not process, as the means for change. Organizations and projects should take note.