Mercurial 3.5 was released today. I contributed some small improvements to this version that I thought I'd share with the world.
The feature I'm most proud of adding to Mercurial 3.5 is what I'm referring to as auto share. The existing hg share extension/command enables multiple checkouts of a repository to share the same backing repository store. Essentially the .hg/store directory is a symlink to a shared directory. This feature has existed in Mercurial for years and is essentially identical to the git worktree feature just recently added in Git 2.5.
My addition to the share extension is the ability for Mercurial to automatically perform an hg clone + hg share in the same operation. If the share.pool config option is defined, hg clone will automatically clone or pull the repository data somewhere inside the directory pointed to by share.pool, then create a new working copy from that shared location. But here's the magic: Mercurial can automatically deduce that different remotes are the same logical repository (by looking at the root changeset) and automatically have them share storage. So if you first hg clone the canonical repository and later hg clone a fork, Mercurial will pull down the changesets unique to the fork into the previously created shared directory and perform a checkout from that. Contrast this with performing a full clone of the fork. If you are cloning multiple repositories that are logically derived from the same original one, this can result in a significant reduction of disk space and network usage. I wrote this feature with automated consumers in mind, particularly continuous integration systems. However, there is also a mode more suitable for humans where repositories are pooled not by their root changeset but by their URL. For more info, see hg help -e share.
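To make the two pooling modes concrete, here is a minimal Python sketch of how a pooled store location could be chosen. This is not Mercurial's code: the function name pool_store_path is made up for illustration, the mode names "identity" and "remote" are my recollection of the share.poolnaming option (check hg help -e share before relying on them), and the root changeset hash below is a placeholder.

```python
import hashlib
import os


def pool_store_path(pool_dir, naming, root_node_hex=None, remote_url=None):
    """Pick a directory inside the pool for a clone (illustrative only).

    naming == "identity": key on the root changeset, so forks of the same
    repository map to the same store.
    naming == "remote": key on a hash of the URL, so each URL gets its own store.
    """
    if naming == "identity":
        if root_node_hex is None:
            raise ValueError("identity naming needs the root changeset hash")
        name = root_node_hex
    elif naming == "remote":
        if remote_url is None:
            raise ValueError("remote naming needs the remote URL")
        name = hashlib.sha1(remote_url.encode("utf-8")).hexdigest()
    else:
        raise ValueError("unknown naming mode: %r" % naming)
    return os.path.join(pool_dir, name)


# Placeholder hash standing in for a real root changeset.
ROOT = "a" * 40
canonical = pool_store_path("/pool", "identity", root_node_hex=ROOT)
fork = pool_store_path("/pool", "identity", root_node_hex=ROOT)
by_url_1 = pool_store_path("/pool", "remote", remote_url="https://example.com/repo")
by_url_2 = pool_store_path("/pool", "remote", remote_url="https://example.com/fork")
```

With root-changeset pooling the canonical repository and its fork land in the same directory, which is what lets the second clone pull only the changesets it is missing. With URL pooling they stay separate.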
For Mercurial 3.4, I contributed changes that refactored how Mercurial's tags cache works. This cache was a source of performance problems at Mozilla's scale for many years. Since upgrading to Mercurial 3.4, Mozilla has not encountered any significant performance problems with the cache on either client or server as far as I know.
Building on this work, Mercurial 3.5 supports transferring tags cache entries from server to client when clients clone/pull. Before, clients would have to recompute tags cache entries for pulled changesets. On repositories that are very large in terms of number of files (over 50,000) or heads (hundreds or more), this could take several dozen seconds or even minutes. This would manifest as a delay either during or after initial clone. In Mercurial 3.5 - assuming both client and server support the new bundle2 wire protocol - the cache entries are transferred from server to client and no extra computation needs to occur. The client does pay a very small price for transferring this additional data over the wire, but the payoff is almost always worth it. For large repositories, this feature means clones are usable sooner.
A few weeks ago, a coworker told me that connections to a Mercurial server were timing out mid clone. We investigated and discovered a potential for a long CPU-intensive pause during clones where Mercurial would not touch the network. On this person's under-powered EC2 instance, the pause was so long that the server's inactivity timeout was triggered and it dropped the client's TCP connection. I refactored Mercurial's cloning code so there is no longer a pause. There should be no overall change in clone time, but there is no longer a perceivable delay between applying changesets and manifests where the network could remain idle. This investigation also revealed some potential follow-up work for Mercurial to be a bit smarter about how it interacts with networks.
Finally, I contributed hg help scripting to Mercurial's help database. This help topic covers how to use Mercurial from scripting and other automated environments. It reflects knowledge I've learned from seeing Mercurial used in automation at Mozilla.
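As a small illustration of the kind of advice that topic gives, here is a Python sketch of invoking hg from a script. It assumes hg is on PATH. The helper names hg_env, hg_command, and run_hg are hypothetical. HGPLAIN and --noninteractive are real Mercurial features, but consult hg help scripting for the full, authoritative guidance.

```python
import os
import subprocess


def hg_env(base_env=None):
    """Return an environment for running hg from automation.

    HGPLAIN=1 asks Mercurial to ignore user-level output customizations
    (aliases, localization, and so on) so output is stable to parse.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["HGPLAIN"] = "1"
    return env


def hg_command(*args):
    """Build an argument list; --noninteractive avoids waiting on prompts."""
    return ["hg", "--noninteractive", *args]


def run_hg(*args, cwd=None):
    """Run hg and return its stdout as text, raising if hg exits non-zero."""
    result = subprocess.run(
        hg_command(*args),
        cwd=cwd,
        env=hg_env(),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

For example, run_hg("log", "-r", ".", "-T", "{node}\n", cwd="/path/to/repo") would return the full hash of the working directory's parent, using an explicit template rather than parsing the default log output.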
Of course, there are plenty of other changes in Mercurial 3.5. Stay tuned for another blog post.
Earlier this week, I landed some changes to the Firefox development environment that aggressively make mach prompt to run mach mercurial-setup. Full details in bug 1182677.
As expected, the change resulted in a fair amount of whining and bemoaning among various Firefox developers. I wanted to take some time to explain why we moved forward, even though we knew not everyone would like the feature.
My official job title at Mozilla is Developer Productivity Engineer. My job is to make you do your job better.
I've been an employee at Mozilla for four years and in that time I've witnessed a surprising lack of understanding around version control tools. (I don't think Mozilla is significantly different from most other companies here.) I find that a significant number of people are practicing sub-optimal version control workflows because they don't know any better (common) or because they are unwilling to change from learned habits.
Furthermore, Mercurial is a highly customizable tool. A lot of Mozillians have spent a lot of time developing useful extensions to Mercurial that enable Mozillians to use Mercurial more effectively and to thus become more productive. The latest epic time-saving hack is Nick Alexander's work to make Fennec build 80% faster by having deep integration with version control.
mach mercurial-setup is almost two years old. Yet, when assisting my fellow Mozillians with Mercurial issues, my "have you run mach mercurial-setup?" question is still often met with blank stares followed by "wait, there's a mach mercurial-setup?!" What's even more frustrating is people wrongly believing that Mercurial can't do things like rebasing and then spreading misinformation about the lackings of Mercurial. (Mercurial has many advanced features disabled out of the box so new users don't footgun themselves.)
Just like Firefox would be irrelevant if it didn't have millions of users, your awesome tool is mostly irrelevant if you are its only user. That's why when I hear someone say they created an amazing tool for themselves or modified a third party tool without sending the improvements upstream, my blood pressure rises a little. It rises because here this person did something awesome and only they or some limited subset of people who happened to be following the person on Twitter or reading their blog at that point in time managed to a) know about the tool and b) take the effort to install it. The uptake rate is insanely low and the return on investment for that tool is low. It results in duplication of effort. I find this painfully frustrating because I want everyone to have easy access to the best tools available. This requires that tools are well advertised and easy to install and use.
The primary goal of mach mercurial-setup is to make it super easy for anyone to have an optimal Mercurial experience. It was apparent to me that despite mach mercurial-setup existing, numerous people didn't know it existed or weren't using it. Your awesome tool isn't very awesome unless people are using it. And a lot of the awesome tools people have built around Mercurial at Mozilla weren't being utilized and lots of productivity wins were thus being unrealized. Forcefully pushing mach mercurial-setup onto people is thus an attempt to unlock unrealized productivity wins and to make people happier about the state of their tools.
I'm not thrilled that mach's prompting to run mach mercurial-setup is as disruptive as it is. It's bad user experience. I know better. But, (and this is explained somewhat in the bug), other solutions are more complicated and have other gotchas. The current, invasive implementation was the easiest to implement and has the biggest bang for the buck in terms of adoption. We knew people would complain about it. But from my perspective, it was do this or do nothing. And nothing hadn't been very effective. So we did something.
There has been lots of feedback about the change this week. Most surprising to me is the general sentiment of "I don't want something automatically changing my hgrc file." I find this surprising because mach mercurial-setup puts the user firmly in control by prompting before doing anything, thus respecting user choice and avoiding gotchas and unwanted changes. It's clear this property needs to be advertised a bit more so people aren't scared to run mach mercurial-setup and don't spread fear, uncertainty, and doubt about the tool to others. (I also find it somewhat surprising people would think this in the first place: I'd like to think we'd implicitly trust most Mozillians to implement tools that respect user choice and don't do malicious things.)
Like all software, things can and will change. The user experience of this new feature isn't terrific. We'll iterate on it. If you want to help enact change, please file a bug in Core :: mach (for now) and we'll go from there.
Thank you for your patience and your understanding.
As of today, ~15.6% of commits landing in Firefox in July have gone through MozReview or have been produced on machines that have used MozReview. This is still a small percentage of overall commits. But, signs are that the percentage is going up. Last month, about half as many commits exhibited the same signature. It's only July 16 and we've already passed the total from June.
What I find interesting is the differences between commits that have gone through MozReview versus the rest. When you look at the diff statistics (a quick proxy of change size), we find that MozReview commits tend to be smaller. The median adds as reported by diff stat (basically lines that were changed) is 12 for MozReview versus 17 elsewhere. The average is 58 for MozReview versus 100 elsewhere. For number of files modified, MozReview averages 2.59 versus elsewhere's 2.71. (These numbers exclude some specific large commits that appeared to be bulk imports of external projects and drove up the non-MozReview figures.)
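For anyone who wants to compute similar numbers, here is a rough Python sketch of deriving median and average adds from diffstat summary lines. The three sample lines are invented purely to show the mechanics and are not data from Firefox. The function name parse_diffstat_summary is hypothetical, and this is not the script used to produce the figures above.

```python
import re
import statistics

# Matches summary lines such as " 2 files changed, 12 insertions(+), 4 deletions(-)".
SUMMARY_RE = re.compile(
    r"(\d+) files? changed"
    r"(?:, (\d+) insertions?\(\+\))?"
    r"(?:, (\d+) deletions?\(-\))?"
)


def parse_diffstat_summary(line):
    """Return (files, insertions, deletions) from one diffstat summary line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError("not a diffstat summary line: %r" % line)
    files, adds, dels = match.groups()
    return int(files), int(adds or 0), int(dels or 0)


# Made-up sample: two small commits and one large one.
SAMPLE = [
    " 2 files changed, 12 insertions(+), 4 deletions(-)",
    " 1 files changed, 3 insertions(+), 0 deletions(-)",
    " 5 files changed, 250 insertions(+), 40 deletions(-)",
]

adds = [parse_diffstat_summary(line)[1] for line in SAMPLE]
median_adds = statistics.median(adds)
mean_adds = statistics.mean(adds)
```

A single large commit pulls the average well above the median, which is why both numbers are worth reporting and why bulk imports were excluded.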
It's entirely possible the root cause behind the discrepancy is a side-effect of the population of MozReview users: perhaps MozReview users just write smaller commits. However, I'd like to think it's because MozReview makes it easier to manage multiple commits and people are taking advantage of that (this is an explicit design goal of MozReview). Whatever the root cause, I'm glad diffs are smaller. As I've written about before, smaller commits are easier to review and land, thus enabling projects to move faster.
I have a quarterly goal to remove the requirement for a Mozilla LDAP account to push to MozReview. That will allow first time contributors to use MozReview. This will be a huge win, as we can do much more magic in the MozReview world than we can from vanilla Bugzilla (automatic bug filing, automatic reviewer assignment, etc). Unofficially, I'd like to have more than 50% of Firefox commits go through MozReview by the end of the year.
A month ago, I blogged about faster cloning from hg.mozilla.org using bundle files. But deploying the feature on the servers was only the tip of the iceberg.
At the end of last week, Firefox release automation rolled out the bundleclone extension to their Linux and OS X machines. Essentially, clones are bootstrapped from Amazon S3 automatically once this extension is installed.
In just a few days of deployment, we've already seen a drastic shift in traffic. On UTC day 2015-07-07, S3 served 1,563,014,396,236 bytes of repository data! The hg.mozilla.org servers themselves served a total of 1,976,057,201,583 bytes.
But we're not done. The bundleclone extension isn't yet deployed on Windows. Nor is it always used on TaskCluster. In addition, there are still some high-use repositories that don't have bundles being generated. Yesterday, I crunched the data and enabled bundles on more repositories. It's too early to have conclusive data, but this should move an additional several hundred gigabytes of traffic to S3.
At Whistler, I said that reducing traffic on hg.mozilla.org by 90% is within the realm of possibility. Between a partial rollout of S3 clone offload that is already serving 44% of traffic and a feature I'm working on in core Mercurial to enable auto sharing of repository data, I'd say we're well on track.
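The 44% figure can be checked from the byte counts reported for UTC day 2015-07-07, assuming the hg.mozilla.org total does not already include the S3 traffic. A quick Python check (the variable names are mine):

```python
# Bytes served on UTC day 2015-07-07, as reported above.
s3_bytes = 1_563_014_396_236
hg_bytes = 1_976_057_201_583

# Share of all repository data that was served from S3.
total_bytes = s3_bytes + hg_bytes
s3_share_pct = 100 * s3_bytes / total_bytes

print("S3 share: %.1f%%" % s3_share_pct)
```

This comes out to roughly 44% of the combined traffic.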
A lot of people contributed some really great feedback about MozReview at Whistler. One of the most frequent requests was for the ability to publish submitted review requests without having to open a browser. I'm pleased to report that as of yesterday, this feature is implemented! If reviewers have been assigned to all your review requests, Mercurial will now prompt you to publish the review requests during hg push. It should just work.
As part of this change, we also introduced more advanced feature negotiation into the handshake between client and server. This means we now have a mechanism for detecting out-of-date client installations. This will enable us to more aggressively drop backwards compatibility (making server-side development easier) while simultaneously ensuring that more people are running modern and hopefully better versions of the client code. This should translate to moving faster and a better experience for everyone.