Faster Cloning from hg.mozilla.org With Server Provided Bundles

May 29, 2015 at 11:30 AM | categories: Mercurial, Mozilla

When you type hg clone, the Mercurial server will create a bundle from repository content at the time of the request and stream it to the client. (Git works essentially the same way.)

This approach usually just works. But there are some downsides, particularly with large repositories.

Creating bundles for large repositories is not cheap. For mozilla-central, Firefox's main repository, it takes ~280s of CPU time on my 2014 MacBook Pro to generate a bundle. Every time a client runs hg clone https://hg.mozilla.org/mozilla-central, a server somewhere is spinning a CPU core generating ~1.1 GB of data. What's more, if another clone arrives at the same time, another process will perform the exact same work! When we are talking about multiple minutes of CPU time per request, this duplicated work starts to add up.
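To put rough numbers on "adds up", here is a back-of-the-envelope sketch. The per-clone figures are the ones quoted above (measured on a laptop, so the real servers will differ), and the count of 24 simultaneous clients is purely an assumption for illustration.

```python
def clone_flood_cost(clients, cpu_seconds_per_bundle=280, gigabytes_per_bundle=1.1):
    """Estimate the server cost when every client triggers its own bundle generation."""
    # Each clone regenerates the same bundle, so the cost scales linearly with clients.
    return clients * cpu_seconds_per_bundle, clients * gigabytes_per_bundle


# Hypothetical burst: 24 automation machines cloning at about the same time.
cpu_seconds, gigabytes = clone_flood_cost(24)
print(f"{cpu_seconds / 60:.0f} CPU-minutes, {gigabytes:.1f} GB")
# prints "112 CPU-minutes, 26.4 GB"
```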

Another problem with large repositories is interrupted downloads. If you suffer a connectivity blip during your clone command, you'll have to start from scratch. This potentially means re-transferring hundreds of megabytes from the server. It also means the server has to generate a new bundle, consuming even more CPU time. This is not good for the user or the server.

There have been multiple outages of hg.mozilla.org as a result of the service being flooded with clone requests to large repositories. Dozens of clients (most of them in Firefox or Firefox OS release automation) have cloned the same repository around the same time and overwhelmed network bandwidth in the data center or CPU cores on the Mercurial servers.

A common solution to this problem is to not use the clone command to receive initial repository data from the server. Instead, a static bundle file will be generated and made available to clients. Clients will call hg init to create an empty repository then will perform an hg unbundle to apply the contents of a pre-generated bundle file. They will then run hg pull to fetch new data that was created after the bundle was generated. (It's worth noting that Git's clone --reference option is similar.)
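The manual workflow described above can be sketched in a few lines of Python. This is only an illustration: it assumes the bundle file has already been downloaded to a local path, and the function names, file name, and destination directory are hypothetical. The hg commands themselves (init, unbundle, pull, update) are standard Mercurial.

```python
import subprocess


def bundle_clone_commands(repo_url, bundle_path, dest):
    """Return the hg invocations that emulate a clone using a pre-generated bundle."""
    return [
        ["hg", "init", dest],                         # create an empty repository
        ["hg", "-R", dest, "unbundle", bundle_path],  # apply the static bundle
        ["hg", "-R", dest, "pull", repo_url],         # fetch changesets newer than the bundle
        ["hg", "-R", dest, "update"],                 # check out a working directory
    ]


def bundle_clone(repo_url, bundle_path, dest):
    """Run each step in order, stopping at the first failure."""
    for argv in bundle_clone_commands(repo_url, bundle_path, dest):
        subprocess.run(argv, check=True)
```

Calling bundle_clone("https://hg.mozilla.org/mozilla-central", "mozilla-central.hg", "mozilla-central") would run the four commands in sequence. Only the final pull talks to the Mercurial server, and it transfers just the data created since the bundle was generated.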

This is a good technical solution. Firefox and Firefox OS release automation have effectively implemented this. However, it is a lot of work: you have to build your own bundle generation and hosting infrastructure and you have to remember that every hg clone should probably be using bundles instead. It is extra complexity, and it must be taken on by every client. If a client forgets, the consequences can be disastrous (clone flooding leading to service outage). Client-side opt-in is prone to lapses and doesn't scale.

As of today, we've deployed a more scalable, server-based solution to hg.mozilla.org.

hg.mozilla.org is now itself generating bundles for a handful of repositories, including mozilla-central, inbound, fx-team, and mozharness. These bundles are being uploaded to Amazon S3. And those bundles are being advertised by the server over Mercurial's wire protocol.

When you install the bundleclone Mercurial extension, hg clone is taught to look for bundles being advertised on the server. If a bundle is available, the bundle is downloaded and applied, and then the client does the equivalent of an hg pull to fetch all data added since the bundle was generated. If a bundle exists, it is used transparently: no client-side cooperation is needed beyond installing the bundleclone extension. If a bundle doesn't exist, hg clone simply falls back to Mercurial's default behavior. This effectively shifts responsibility for doing efficient clones from clients to server operators, which means server operators don't need cooperation from clients to enact important service changes. Before, if clients weren't using bundles, we'd have to wait for clients to update their code. Now, we can see a repository is being cloned heavily and start generating bundles for it without having to wait for the client to deploy new code.
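The decision the client makes can be summarized with a small sketch. This is not the extension's actual code: the function and the three callables passed into it are hypothetical stand-ins for "download and apply a bundle", "pull from the server", and "do a regular clone".

```python
def clone_with_bundles(advertised_bundles, apply_bundle, pull, full_clone):
    """Use an advertised bundle when there is one, else fall back to a normal clone."""
    if not advertised_bundles:
        # Nothing advertised for this repository: behave like stock Mercurial.
        full_clone()
        return "full-clone"
    # Seed the repository from the static bundle, then top up from the server.
    apply_bundle(advertised_bundles[0])
    pull()
    return "bundle"
```

The point of the design is that the branch is taken based on what the server advertises, so operators can start or stop offering bundles for a repository without any client changing its configuration.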

Furthermore, we've built primitive content negotiation into the process. The server doesn't simply advertise one bundle file: it advertises several bundle files. We offer gzip, bzip2, and stream bundles. gzip is what Mercurial uses by default. It works OK. bzip2 bundles are smaller, but they take longer to process. stream bundles are essentially tar archives of the .hg/store directory and are larger than gzip bundles, but insanely fast because there is very little CPU required to apply them. In addition, we advertise URLs for multiple S3 regions, currently us-west-2 (Oregon) and us-east-1 (Virginia). This enables clients to prefer the bundle most appropriate for them.
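To make the negotiation concrete, here is a minimal sketch of how a client might choose among advertised bundles. The entry layout (a dict with url, compression, and region keys), the example.invalid URLs, and the pick_bundle function are assumptions made for illustration; they are not the extension's real manifest format or API.

```python
def pick_bundle(entries, prefers):
    """Return the advertised entry that best matches an ordered preference list."""
    def rank(entry):
        # One element per preference: 0 when it matches, 1 when it does not.
        # Tuples compare left to right, so earlier preferences dominate.
        return tuple(0 if entry.get(key) == value else 1 for key, value in prefers)

    # min() keeps the first of equally ranked entries, so ties follow server order.
    return min(entries, key=rank) if entries else None


# Hypothetical advertisement covering two regions and three bundle types.
advertised = [
    {"url": "https://example.invalid/us-west-2/bundle.gzip.hg", "compression": "gzip", "region": "us-west-2"},
    {"url": "https://example.invalid/us-west-2/bundle.bzip2.hg", "compression": "bzip2", "region": "us-west-2"},
    {"url": "https://example.invalid/us-east-1/bundle.gzip.hg", "compression": "gzip", "region": "us-east-1"},
    {"url": "https://example.invalid/us-east-1/bundle.stream.hg", "compression": "stream", "region": "us-east-1"},
]

# A machine in us-east-1 with a fast network: prefer the local region, then stream bundles.
choice = pick_bundle(advertised, [("region", "us-east-1"), ("compression", "stream")])
print(choice["url"])
# prints "https://example.invalid/us-east-1/bundle.stream.hg"
```

A client on a slow home connection might instead list bzip2 first, trading CPU time for a smaller download.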

A benefit of serving bundles from S3 is that Firefox and Firefox OS release automation (the biggest consumers of hg.mozilla.org) live in Amazon EC2. They are able to fetch from S3 over a gigabit network. And, since we're transferring data within the same AWS region, there are no data transfer costs. Previously, we were transferring ~1.1 GB from a Mozilla data center to EC2 for each clone. This took up bandwidth in Mozilla's network and cost Mozilla money to send data thousands of miles away. And, we never came close to saturating a gigabit network (we do with stream bundles). Wins everywhere!

The full instructions detail how to use bundleclone. I recommend everyone at Mozilla install the extension because there should be no downside to doing it.

Once bundleclone is deployed to Firefox and Firefox OS release automation, we should hopefully never again see those machines bring down hg.mozilla.org with a flood of clone requests. We should also see a drastic reduction in load on hg.mozilla.org. I'm optimistic bandwidth will decrease by over 50%!

It's worth noting that the functionality from the bundleclone extension is coming to vanilla Mercurial. The functionality (which was initially added by Mozilla's Mike Hommey) is part of Mercurial's bundle2 protocol, which is available but isn't enabled by default yet. bundleclone is thus a temporary solution to bring us server stability and client improvements until modern Mercurial versions are deployed everywhere in a few months' time.

Finally, I would like to credit Augie Fackler for the original idea for server-assisted bundle-based clones.