Cloning From S3

July 08, 2015 at 11:40 AM | categories: Mercurial, Mozilla

A month ago, I blogged about faster cloning from hg.mozilla.org using bundle files. But deploying the feature on the servers was only the tip of the iceberg.

At the end of last week, Firefox release automation rolled out the bundleclone extension to their Linux and OS X machines. Essentially, clones are bootstrapped from Amazon S3 automatically once this extension is installed.

After just a few days of deployment, we've already seen a drastic shift in traffic. On UTC day 2015-07-07, S3 served 1,563,014,396,236 bytes (about 1.56 TB) of repository data! The hg.mozilla.org servers themselves served a total of 1,976,057,201,583 bytes (about 1.98 TB).
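To put those two byte counts in proportion, here is a minimal Python sketch that computes S3's share of the combined traffic for that day. The constant and function names are made up for illustration, and it assumes the two figures are directly comparable (that is, both count bytes of repository data served over the same UTC day).

```python
# Byte counts quoted above for UTC day 2015-07-07.
S3_BYTES = 1_563_014_396_236   # served from Amazon S3
HG_BYTES = 1_976_057_201_583   # served by the hg.mozilla.org servers


def offload_share(offloaded: int, origin: int) -> float:
    """Return the fraction of combined traffic served by the offload target."""
    return offloaded / (offloaded + origin)


total_bytes = S3_BYTES + HG_BYTES
s3_share = offload_share(S3_BYTES, HG_BYTES)

# Report in decimal terabytes (1 TB = 10**12 bytes).
print(f"combined: {total_bytes / 1e12:.2f} TB")
print(f"S3 share: {s3_share:.1%}")
```

Under those assumptions, S3 comes out at roughly 44% of the combined bytes served that day.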

But we're not done. The bundleclone extension isn't yet deployed on Windows. Nor is it always used on TaskCluster. In addition, there are still some high-use repositories that don't have bundles being generated. Yesterday, I crunched the data and enabled bundles on more repositories. It's too early to have conclusive data, but this should move several hundred additional gigabytes of traffic to S3.

At Whistler, I said that reducing traffic on hg.mozilla.org by 90% is within the realm of possibility. Between a partial rollout of S3 clone offload that is already serving 44% of the combined traffic and a feature I'm working on in core Mercurial to enable auto sharing of repository data, I'd say we're well on track.