All serious software development companies use version control tools to help with their work. These tools have evolved substantially in the last few years, and we needed to update our arrangements to make them work well for us. In this blog post I describe what has changed and how we met our new requirements.

Some background and history

The value of “version control” has long been recognised for software development, for instance allowing developers to:

  • See exactly what changed when trying to track down bugs between versions of the software
  • Work as a team, bringing together changes made by other members in a manageable way
  • Experiment with new ideas without the risk of losing prior work
  • Avoid ending up having files with names like newfunction_version2_final_new.txt
  • … and lots more

We have been using version control software for many years, starting with Microsoft Visual SourceSafe/SourceOffSite, then moving through CVS to our current choice of Mercurial. Each move was prompted by changes in our requirements: many years ago, SourceSafe served us well enough when we were fully office-based; SourceOffSite sidestepped SourceSafe’s atrocious performance when accessing the repository over a dial-up connection; and CVS supported a better workflow by removing the requirement to lock a file before working on it. Mercurial now provides decent tools for managing different versions of software simultaneously (known as branching), and is also designed to support a “decentralised” approach. Through all these stages, though, we have used version control in a centralised way, with a “master” repository on an office server accessed by our consultants when building software.

Changing requirements

The strong capabilities of today’s version control systems mean that these tools are increasingly being used for new roles in software production, for instance in software deployment. If the tools allow me to manage the files on my computer when building software, why can’t I use the same facilities to update my remote web server with the latest version? And, indeed, why can’t I configure it to automatically update the web server every time a change lands on my “production” branch? These concepts are covered by terms such as Continuous Integration and Build Automation, and touch on technologies such as “platform as a service” (PaaS), since tools are available to deploy to platforms such as Google App Engine, Heroku, Amazon Web Services etc.
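As a sketch of the “update on every push” idea: Mercurial lets you attach hooks to repository events, so a server hosting a clone of the production branch could refresh its working copy after each push. The paths and setup here are illustrative, not a description of our actual configuration:

```ini
; Hypothetical .hg/hgrc on the web server's clone of the repository.
; The changegroup hook fires after each incoming push, so the checked-out
; files are updated to the latest revision on the default branch.
[hooks]
changegroup = hg update --clean default
```

In practice you would want to update to a named “production” branch rather than default, and add error handling, but the principle is just this simple.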

Our immediate problem

But let’s not get ahead of ourselves! At this point all we are looking to do is to streamline our processes by being able to use our internal version control system to update software we build on our clients’ servers. I saw a few options available to us to allow this:

  1. We could open our network so it can be accessed remotely from our clients’ networks, giving access to our internal repository from their site. Most of their networks, however, are sensibly configured not to allow the kinds of connections we would need for this.
  2. We could copy our repository manually to somewhere on their network, and refresh it periodically. This is possible, but creates additional work in terms of having to periodically synchronise the repositories. So workable, but not great.
  3. We could host our repository outside of our office, somewhere accessible from our clients’ infrastructure.

With a particular client requirement in mind and some tight deadlines, I explored this third option. To get a sense of how it might work, I copied our central repository for this client’s software over to a server we operate using Amazon Web Services (which will no doubt be the subject of another blog post soon). The copying was very easy, as was setting it up to allow secure, password-protected HTTPS access. The result was fantastic: despite the limited access we had to our client’s infrastructure, we were able to access our repository directly from their site and use it to pull through the latest code, make changes and push them back. It quickly became clear that this broad approach was going to be extremely valuable to us in terms of productivity and simplicity.

Thinking about the solution over the weekend, though, I realised that I was missing a trick: there are a number of companies out there who specialise in hosting version control repositories in a central location for remote access. One of their main concerns is, of course, staying on top of security. So I investigated two leading providers in this space: GitHub and BitBucket.

These companies are extremely well known and widely used in the “open source” world, allowing teams of contributors to collaborate in building software for use by everyone. They are both highly respected and would be a good choice for that kind of software. The main differences between them that leapt out to me for our purposes were:

  • BitBucket supports both Mercurial and git, while GitHub supports only git (both tools are very good, but we currently use Mercurial, and this decision isn’t one that should drive a change of tool)
  • Both platforms support private repositories (which of course we need), but BitBucket allows you to set these up and try them out in a free account for smaller teams whereas on GitHub they require a paid account

This article on InfoWorld explores the differences between the two platforms in more depth, if you are weighing up the same choice.

Our final solution

So I went ahead and set up a company account on BitBucket, and copied the repository in question over to it. Again, this was quick, easy and extremely successful: we still have the benefit of accessing the repository from our office, on the road and at the client’s site, and now also benefit from having a company responsible for maintaining security. The BitBucket platform provides other facilities that might be useful to us over time, such as a wiki and an issue tracker for each repository. Overall I think this is the solution we will stick with for now, as it appears to meet our needs.

Just to summarise where this now stands from the point of view of security and client confidentiality:

  • The “decentralised” nature of Mercurial (and git, in fact) means that even if the provider disappeared tomorrow, we would still have multiple copies of our repositories available, so would lose nothing
  • From a privacy point of view, our repositories don’t typically contain client-confidential information: that is held either in databases whose data is sourced elsewhere, or local configuration files. BitBucket are obviously resting their reputation on maintaining privacy, too, and have significant experience in this area
  • Having this facility available doesn’t stop us from still using our internally hosted repository (or indeed any other external approach) when required for confidentiality or other reasons: in fact, many of our repositories are still only held internally, and it is easy to move them one way or the other


In short, by setting up an externally hosted account with BitBucket, we now have a facility to use private version control on projects efficiently wherever we are, including the extremely useful ability to access it directly from our clients’ infrastructure.