My Contributions to Mercurial 3.5

July 31, 2015 at 10:55 AM | categories: Mercurial, Mozilla

Mercurial 3.5 was released today. I contributed some small improvements to this version that I thought I'd share with the world.

The feature I'm most proud of adding to Mercurial 3.5 is what I'm referring to as auto share. The existing hg share extension/command enables multiple checkouts of a repository to share the same backing repository store. Essentially, the .hg/store directory is a symlink to a shared directory. This feature has existed in Mercurial for years and is nearly identical to the git worktree feature just recently added in Git 2.5.

My addition to the share extension is the ability for Mercurial to automatically perform an hg clone + hg share in the same operation. If the share.pool config option is defined, hg clone will automatically clone or pull the repository data somewhere inside the directory pointed to by share.pool, then create a new working copy from that shared location. But here's the magic: Mercurial can automatically deduce that different remotes are the same logical repository (by looking at the root changeset) and automatically have them share storage. So if you first hg clone the canonical repository and later do an hg clone of a fork, Mercurial will pull down only the changesets unique to the fork into the previously created shared directory and perform a checkout from that. Contrast this with performing a full clone of the fork. If you are cloning multiple repositories that are logically derived from the same original one, this can result in a significant reduction in disk space and network usage. I wrote this feature with automated consumers in mind, particularly continuous integration systems. However, there is also a mode more suitable for humans, where repositories are pooled not by their root changeset but by their URL. For more info, see hg help -e share.
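To make the two pooling modes concrete, here is a rough Python sketch of how a pool directory could be chosen for a clone. It is a simplified model, not Mercurial's actual code: the function name pool_path, its parameters, the mode labels "identity" and "remote" and the choice of a SHA-1 of the URL as the directory name are all assumptions made for illustration.

```python
import hashlib
import os


def pool_path(pool_dir, mode, root_changeset, remote_url):
    """Return the directory under pool_dir that would hold the shared store.

    Hypothetical helper modelling the idea only; it does not call Mercurial.
    """
    if mode == "identity":
        # Pool by root changeset: a canonical repository and its forks
        # share one root, so they all land in the same store.
        name = root_changeset
    elif mode == "remote":
        # Pool by URL: every distinct remote gets its own store.
        # Hashing the URL is an assumption to get a filesystem-safe name.
        name = hashlib.sha1(remote_url.encode("utf-8")).hexdigest()
    else:
        raise ValueError("unknown pooling mode: %r" % (mode,))
    return os.path.join(pool_dir, name)
```

Given the same root changeset, a canonical repository and a fork resolve to one pool directory in the first mode and to two separate directories in the second, which is why the first mode is the one that saves disk space and network traffic across forks.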

For Mercurial 3.4, I contributed changes that refactored how Mercurial's tags cache works. This cache was a source of performance problems at Mozilla's scale for many years. Since upgrading to Mercurial 3.4, Mozilla has not encountered any significant performance problems with the cache on either client or server as far as I know.

Building on this work, Mercurial 3.5 supports transferring tags cache entries from server to client when clients clone/pull. Before, clients would have to recompute tags cache entries for pulled changesets. On repositories that are very large in terms of number of files (over 50,000) or heads (hundreds or more), this could take several dozen seconds or even minutes. This would manifest as a delay either during or after initial clone. In Mercurial 3.5 - assuming both client and server support the new bundle2 wire protocol - the cache entries are transferred from server to client and no extra computation needs to occur. The client does pay a very small price for transferring this additional data over the wire, but the payoff is almost always worth it. For large repositories, this feature means clones are usable sooner.

A few weeks ago, a coworker told me that connections to a Mercurial server were timing out mid-clone. We investigated and discovered a potential for a long CPU-intensive pause during clones where Mercurial would not touch the network. On this person's underpowered EC2 instance, the pause was so long that the server's inactivity timeout was triggered and it dropped the client's TCP connection. I refactored Mercurial's cloning code so there is no longer a pause. There should be no overall change in clone time, but there is no longer a perceptible delay between applying changesets and manifests where the network could remain idle. This investigation also revealed some potential follow-up work for Mercurial to be a bit smarter about how it interacts with networks.

Finally, I contributed hg help scripting to Mercurial's help database. This help topic covers how to use Mercurial from scripts and other automated environments. It reflects what I've learned from seeing Mercurial used in automation at Mozilla.
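As a small taste of the kind of advice that applies in automation, here is a minimal Python sketch of calling hg from a script with the HGPLAIN environment variable set, which asks Mercurial for stable, script-friendly behavior. The helper names scripting_env and run_hg are hypothetical, and the sketch assumes an hg executable is on PATH; it is one possible wrapper, not a recommended or official API.

```python
import os
import subprocess


def scripting_env(base_env=None):
    """Return a copy of the environment with HGPLAIN set for scripted use."""
    env = dict(os.environ if base_env is None else base_env)
    # HGPLAIN tells Mercurial to avoid user-specific output tweaks
    # (for example aliases and localized messages) that break parsers.
    env["HGPLAIN"] = "1"
    return env


def run_hg(args, cwd=None):
    """Run `hg` with the given arguments and return its raw stdout bytes.

    Raises subprocess.CalledProcessError if hg exits with a non-zero status.
    """
    return subprocess.check_output(
        ["hg"] + list(args), cwd=cwd, env=scripting_env()
    )
```

A caller might then run something like run_hg(["log", "-r", ".", "-T", "{node}\n"], cwd=repo_path), where repo_path is a placeholder for a local checkout. Using an explicit template keeps the output format under the script's control.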

Of course, there are plenty of other changes in Mercurial 3.5. Stay tuned for another blog post.