Gregory Szorc's Digital Home

MozReview Git Support and Improved Commit Mapping

February 08, 2016 at 11:05 AM | categories: MozReview, Mozilla

MozReview - Mozilla's Review Board based code review tool - now supports ingestion from Git. Previously, it only supported Mercurial.

Instructions for configuring Git with MozReview are available. Because blog posts are not an appropriate medium for documenting systems and processes, I will not say anything more here on how to use Git with MozReview.

Somewhat related to the introduction of Git support is an improved mechanism for mapping commits to existing review requests.

When you submit commits to MozReview, MozReview has to decide how to map those commits to review requests in Review Board. It has to choose whether to recycle an existing review request or create a new one. When recycling, is has to pick an appropriate one. If it chooses incorrectly, wonky things can happen. For example, a review request could switch to tracking a new and completely unrelated commit. That's bad.

Up until today, our commit mapping algorithm was extremely simple. Yet it seemed to work 90% of the time. However, a number of people found the cracks and complained. With Git support coming online, I had a feeling that Git users would find these cracks with higher frequency than Mercurial users due to what I perceive to be variations in the commit workflows of Git versus Mercurial. So, I decided to proactively improve the commit mapping before the Git users had time to complain.

Both the Git and Mercurial MozReview client-side extensions now insert a MozReview-Commit-ID metadata line in commit messages. This line effectively defines a (likely) unique ID that identifies the commit across rewrites. When MozReview maps commits to review requests, it uses this identifier to find matches. What this means is that history rewriting (such as reordering commits) should be handled well by MozReview and should not confuse the commit mapping mechanism.

I'm not claiming the commit mapping mechanism is perfect. In fact, I know of areas where it can still fall apart. But it is much better than it was before. If you think you found a bug in the commit mapping, don't hesitate to file a bug. Please have it block bug 1243483.

A side-effect of introducing this improved commit mapping is that commit messages will have a MozReview-Commit-ID line in them. This may startle some. Some may complain about the spam. Unfortunately, there's no better alternative. Both Mercurial and Git do support a hidden key-value dictionary for each commit object. In fact, the MozReview Mercurial extension has been storing the very commit IDs that now appear in the commit message in this dictionary for months! Unfortunately, actually using this hidden dictionary for metadata storage is riddled with problems. For example, some Mercurial commands don't preserve all the metadata. And accessing or setting this data from Git is painful. While I wish this metadata (which provides little value to humans) were not located in the commit message where humans could be bothered by it, it's really the only practical place to put it. If people find it super annoying, we could modify Autoland to strip it before landing. Although, I think I like having it preserved because it will enable some useful scenarios down the road, such as better workflows for uplift requests. It's also worth noting that there is precedent for storing unique IDs in commit messages for purposes of commit mapping in the code review tool: Gerrit uses Change-ID lines.

I hope you enjoy the Git support and the more robust commit to review request mapping mechanism!

Making MozReview Faster by Disregarding RESTful Design

January 13, 2016 at 03:25 PM | categories: MozReview, Mozilla

When I first started writing web services, I was a huge RESTful fan boy. The architectural properties - especially the parts related to caching and scalability - really jived with me. But as I've grown older and gained experienced, I've realized that RESTful design, like many aspects of software engineering, is more of a guideline or ideal than a panacea. This post is about one of those experiences.

Review Board's Web API is RESTful. It's actually one of the better examples of a RESTful API I've seen. There is a very clear separation between resources. And everything - and I mean everything - is a resource. Hyperlinks are used for the purposes described in Roy T. Fielding's dissertation. I can tell the people who authored this web API understood RESTful design and they succeeded in transferring that knowledge to a web API.

Mozilla's MozReview code review tool is built on top of Review Board. We've made a number of customizations. The most significant is the ability to submit a series of commits as one logical review series. This occurs as a side-effect of a hg push to the code review repository. Once your changesets are pushed to the remote repository, that server issues a number of Review Board Web API HTTP requests to reviewboard.mozilla.org to create the review requests, assign reviewers, etc. This is mostly all built on the built-in web API endpoints offered by Review Board.

Because Review Board's Web API adheres to RESTful design principles so well, turning a series of commits into a series of review requests takes a lot of HTTP requests. For each commit, we have to perform something like 5 HTTP requests to define the server state. For series of say 10 commits (which aren't uncommon since we try to encourage the use of microcommits), this can add up to dozens of HTTP requests! And that's just counting the HTTP requests to Review Board: because we've integrated Review Board with Bugzilla, events like publishing result in additional RESTful HTTP requests from Review Board to bugzilla.mozilla.org.

At the end of the day, submitting and publishing a series of 10 commits consumes somewhere between 75 and 100 HTTP requests! While the servers are all in close physical proximity (read: low network latencies), we are reusing TCP connections, and each HTTP request completes fairly quickly, the overhead adds up. It's not uncommon for publishing commit series to take over 30s. This is unacceptable to developers. We want them to publish commits for review as quickly as possible so they can get on with their next task. Humans should not have to wait on machines.

Over in bug 1220468, I implemented a new batch submit web API for Review Board and converted the Mercurial server to call it instead of the classic, RESTful Review Board web APIs. In other words, I threw away the RESTful properties of the web API and implemented a monolith API doing exactly what we need. The result is a drastic reduction in net HTTP requests. In our tests, submitting a series of 20 commits for review reduced the HTTP request count by 104! Furthermore, the new API endpoint performs all modifications in a single database transaction. Before, each HTTP request was independent and we had bugs where failures in the middle of a HTTP request series left the server in inconsistent and unexpected state. The new API is significantly faster and more atomic as a bonus. The main reason the new implementation isn't yet nearly instantaneous is because we're still performing several RESTful HTTP requests to Bugzilla from Review Board. But there are plans for Bugzilla to implement the batch APIs we need as well, so stay tuned.

(I don't want to blame the Review Board or Bugzilla maintainers for their RESTful web APIs that are giving MozReview a bit of scaling pain. MozReview is definitely abusing them almost certainly in ways that weren't imagined when they were conceived. To their credit, the maintainers of both products have recognized the limitations in their APIs and are working to address them.)

As much as I still love the properties of RESTful design, there are practical limitations and consequences such as what I described above. The older and more experienced I get, the less patience I have for tolerating architecturally pure implementations that sacrifice important properties, such as ease of use and performance.

It's worth noting that many of the properties of RESTful design are applicable to microservices as well. When you create a new service in a microservices architecture, you are creating more overhead for clients that need to speak to multiple services, making changes less transactional and atomic, and making it difficult to consolidate multiple related requests into a higher-level, simpler, and performant API. I recommend Microservice Trade-Offs for more on this subject.

There is a place in the world for RESTful and microservice architectures. And as someone who does a lot of server-side engineering, I sympathize with wanting scalable, fault-tolerant architectures. But like most complex problems, you need to be cognizant of trade-offs. It is also important to not get too caught up with architectural purity if it is getting in the way of delivering a simple, intuitive, and fast product for your users. So, please, follow me down from the ivory tower. The air was cleaner up there - but that was only because it was so distant from the swamp at the base of the tower that surrounds every software project.

Lowering the Barrier to Pushing to MozReview

October 14, 2015 at 12:30 PM | categories: MozReview, Mozilla

Starting today, a Mozilla LDAP account with Mercurial SSH access is no longer required to hg push into MozReview to initiate code review with Mozilla projects.

The instructions for configuring your client to use MozReview have been updated to reflect how you can now push to MozReview over HTTP using a Bugzilla API Key for authentication.

This change effectively enables first-time contributors to use MozReview for code review. Before, you had to obtain an LDAP account and configure your SSH client, both of which could be time consuming processes and therefore discourage people from contributing. (Or you could just use Bugzilla/Splinter and not get the benefits of MozReview, which many did.)

I encourage others to update contribution docs to start nudging people towards MozReview over Bugzilla/patch-based workflows (such as bzexport).

Bug 1195856 tracked this feature.

MozReview Statistics July 2015

July 16, 2015 at 02:00 PM | categories: MozReview, Mozilla

As of today, ~15.6% of commits landing in Firefox in July have gone through MozReview or have been produced on machines that have used MozReview. This is still a small percentage of overall commits. But, signs are that the percentage is going up. Last month, about half as many commits exhibited the same signature. It's only July 16 and we've already passed the total from June.

What I find interesting is the differences between commits that have gone through MozReview versus the rest. When you look at the diff statistics (a quick proxy of change size), we find that MozReview commits tend to be smaller. The median adds as reported by diff stat (basically lines that were changed) is 12 for MozReview versus 17 elsewhere. The average is 58 for MozReview versus 100 elsewhere. For number of files modified, MozReview averages 2.59 versus elsewhere's 2.71. (These numbers exclude some specific large commits that appeared to be bulk imports of external projects and drove up the non-MozReview figures.)

It's entirely possible the root cause behind the discrepancy is a side-effect of the population of MozReview users: perhaps MozReview users just write smaller commits. However, I'd like to think it's because MozReview makes it easier to manage multiple commits and people are taking advantage of that (this is an explicit design goal of MozReview). Whatever the root cause, I'm glad diffs are smaller. As I've written about before, smaller commits are easier to review and land, thus enabling projects to move faster.

I have a quarterly goal to remove the requirement for a Mozilla LDAP account to push to MozReview. That will allow first time contributors to use MozReview. This will be a huge win, as we can do much more magic in the MozReview world than we can from vanilla Bugzilla (automatic bug filing, automatic reviewer assignment, etc). Unofficially, I'd like to have more than 50% of Firefox commits go through MozReview by the end of the year.

Publish When Pushing to MozReview

July 07, 2015 at 02:55 PM | categories: MozReview, Mozilla

A lot of people contributed some really great feedback about MozReview at Whistler. One of the most frequent requests was for the ability to publish submitted review requests without having to open a browser. I'm pleased to report that as of yesterday, this feature is implemented! If reviewers have been assigned to all your review requests, Mercurial will now prompt you to publish the review requests during hg push. It should just work.

As part of this change, we also introduced more advanced feature negotiation into the handshake between client and server. This means we now have a mechanism for detecting out-of-date client installations. This will enable us to more aggressively drop backwards compatibility (making server-side development easier) while simultaneously ensuring that more people are running modern and hopefully better versions of the client code. This should translate to moving faster and a better experience for everyone.

MozReview Git Support and Improved Commit Mapping

Making MozReview Faster by Disregarding RESTful Design

Lowering the Barrier to Pushing to MozReview

MozReview Statistics July 2015

Publish When Pushing to MozReview

Categories