Making MozReview Faster by Disregarding RESTful Design

January 13, 2016 at 03:25 PM | categories: MozReview, Mozilla

When I first started writing web services, I was a huge RESTful fan boy. The architectural properties - especially the parts related to caching and scalability - really jived with me. But as I've grown older and gained experienced, I've realized that RESTful design, like many aspects of software engineering, is more of a guideline or ideal than a panacea. This post is about one of those experiences.

Review Board's Web API is RESTful. It's actually one of the better examples of a RESTful API I've seen. There is a very clear separation between resources. And everything - and I mean everything - is a resource. Hyperlinks are used for the purposes described in Roy T. Fielding's dissertation. I can tell the people who authored this web API understood RESTful design and they succeeded in transferring that knowledge to a web API.

Mozilla's MozReview code review tool is built on top of Review Board. We've made a number of customizations. The most significant is the ability to submit a series of commits as one logical review series. This occurs as a side-effect of a hg push to the code review repository. Once your changesets are pushed to the remote repository, that server issues a number of Review Board Web API HTTP requests to reviewboard.mozilla.org to create the review requests, assign reviewers, etc. This is mostly all built on the built-in web API endpoints offered by Review Board.

Because Review Board's Web API adheres to RESTful design principles so well, turning a series of commits into a series of review requests takes a lot of HTTP requests. For each commit, we have to perform something like 5 HTTP requests to define the server state. For series of say 10 commits (which aren't uncommon since we try to encourage the use of microcommits), this can add up to dozens of HTTP requests! And that's just counting the HTTP requests to Review Board: because we've integrated Review Board with Bugzilla, events like publishing result in additional RESTful HTTP requests from Review Board to bugzilla.mozilla.org.

At the end of the day, submitting and publishing a series of 10 commits consumes somewhere between 75 and 100 HTTP requests! While the servers are all in close physical proximity (read: low network latencies), we are reusing TCP connections, and each HTTP request completes fairly quickly, the overhead adds up. It's not uncommon for publishing commit series to take over 30s. This is unacceptable to developers. We want them to publish commits for review as quickly as possible so they can get on with their next task. Humans should not have to wait on machines.

Over in bug 1220468, I implemented a new batch submit web API for Review Board and converted the Mercurial server to call it instead of the classic, RESTful Review Board web APIs. In other words, I threw away the RESTful properties of the web API and implemented a monolith API doing exactly what we need. The result is a drastic reduction in net HTTP requests. In our tests, submitting a series of 20 commits for review reduced the HTTP request count by 104! Furthermore, the new API endpoint performs all modifications in a single database transaction. Before, each HTTP request was independent and we had bugs where failures in the middle of a HTTP request series left the server in inconsistent and unexpected state. The new API is significantly faster and more atomic as a bonus. The main reason the new implementation isn't yet nearly instantaneous is because we're still performing several RESTful HTTP requests to Bugzilla from Review Board. But there are plans for Bugzilla to implement the batch APIs we need as well, so stay tuned.

(I don't want to blame the Review Board or Bugzilla maintainers for their RESTful web APIs that are giving MozReview a bit of scaling pain. MozReview is definitely abusing them almost certainly in ways that weren't imagined when they were conceived. To their credit, the maintainers of both products have recognized the limitations in their APIs and are working to address them.)

As much as I still love the properties of RESTful design, there are practical limitations and consequences such as what I described above. The older and more experienced I get, the less patience I have for tolerating architecturally pure implementations that sacrifice important properties, such as ease of use and performance.

It's worth noting that many of the properties of RESTful design are applicable to microservices as well. When you create a new service in a microservices architecture, you are creating more overhead for clients that need to speak to multiple services, making changes less transactional and atomic, and making it difficult to consolidate multiple related requests into a higher-level, simpler, and performant API. I recommend Microservice Trade-Offs for more on this subject.

There is a place in the world for RESTful and microservice architectures. And as someone who does a lot of server-side engineering, I sympathize with wanting scalable, fault-tolerant architectures. But like most complex problems, you need to be cognizant of trade-offs. It is also important to not get too caught up with architectural purity if it is getting in the way of delivering a simple, intuitive, and fast product for your users. So, please, follow me down from the ivory tower. The air was cleaner up there - but that was only because it was so distant from the swamp at the base of the tower that surrounds every software project.


Investing in the Firefox Build System in 2016

January 11, 2016 at 02:20 PM | categories: Mozilla, build system

Most of Mozilla gathered in Orlando in December for an all hands meeting. If you attended any of the plenary sessions, you probably heard people like David Bryant and Lawrence Mandel make references to improving the Firefox build system and related tools. Well, the cat is out of the bag: Mozilla will be investing heavily in the Firefox build system and related tooling in 2016!

In the past 4+ years, the Firefox build system has mostly been held together and incrementally improved by a loose coalition of people who cared. We had a period in 2013 where a number of people were making significant updates (this is when moz.build files happened). But for the past 1.5+ years, there hasn't really been a coordinated effort to improve the build system - just a lot of one-off tasks and (much-appreciated) individual heroics. This is changing.

Improving the build system is a high priority for Mozilla in 2016. And investment has already begun. In Orlando, we had a marathon 3 hour meeting planning work for Q1. At least 8 people have committed to projects in Q1.

The focus of work is split between immediate short-term wins and longer-term investments. We also decided to prioritize the Firefox and Fennec developer workflows (over Gecko/Platform) as well as the development experience on Windows. This is because these areas have been under-loved and therefore have more potential for impact.

Here are the projects we're focusing on in Q1:

  • Turnkey artifact based builds for Firefox and Fennec (download pre-built binaries so you don't have to spend 20 minutes compiling C++)
  • Running tests from the source directory (so you don't have to copy tens of thousands of files to the object directory)
  • Speed up configure / prototype a replacement
  • Telemetry for mach and the build system
  • NSPR, NSS, and (maybe) ICU build system rewrites
  • mach build faster improvements
  • Improvements to build rules used for building binaries (enables non-make build backends)
  • mach command for analyzing C++ dependencies
  • Deploy automated testing for mach bootstrap on TaskCluster

Work has already started on a number of these projects. I'm optimistic 2016 will be a watershed year for the Firefox build system and the improvements will have a drastic positive impact on developer productivity.


hg.mozilla.org replication updates

January 05, 2016 at 03:00 PM | categories: Mercurial, Mozilla

A few minutes ago, I formally enabled a new replication system for hg.mozilla.org. For the curious, technical details are available.

This impacts you because pushes to hg.mozilla.org should now be significantly faster. For example, pushes to mozilla-inbound that used to take 15s now take 2s. Pushes to Try that used to take 45s now take 10s. (Yes, the old replication system really added a lot of overhead.) Pushes to hg.mozilla.org are still not as fast as they could be due to us running the service on 5 year old hardware (we plan to buy new servers this year) and due to the use of NFS on the server. However, I believe push latency is now reasonable for every repo except Try.

The new replication system opens the door to a number of future improvements. We'd like to stand up mirrors in multiple data centers - perhaps even offices - so clients have the fastest connectivity and so we have a better disaster recovery story. The new replication system facilitates this.

The new replication log is effectively a unified pushlog - something people have wanted for years. While there is not yet a public API for it, one could potentially be exposed, perhaps indirectly via Pulse.

It is now trivial for us to stand up new consumers of the replication log that react to repository events merely milliseconds after they occur. This should eventually result in downstream systems like build automation and conversion to Git repos starting and thus completing faster.

Finally, the new replication system has been running unofficially for a few weeks, so you likely won't notice anything different today (other than removal of some printed messages when you push). What changed today is the new system is enabled by default and we have no plans to continue supporting or operating the legacy system. Good riddance.


Mozilla Mercurial Extension Updates

December 16, 2015 at 05:40 PM | categories: Mercurial, Mozilla

There have been a handful of updates to Mozilla's client-side Mercurial tools in the past week and all users are encouraged to pull down the latest commit of mozilla-central and run mach mercurial-setup to ensure their configuration is up to date.

Improvements to mach mercurial-setup include:

  • ~ are now used in paths when appropriate
  • Mercurial 3.5.2 is now the oldest version you can run without the wizard complaining
  • The clone bundles feature is enabled when running Mercurial 3.6
  • hg wip is available for configuration
  • hgwatchman (make hg status significantly faster via background filesystem watching) is now available on OS X
  • x509 host key fingerprints are no longer pinned if your Python and Mercurial version is modern (this pinning exists in Mercurial because old versions of Python don't verify x509 certificates properly)
  • 3rd party Mercurial extensions are pulled with extensions disabled (to prevent issues with old, incompatible extensions crashing the hg pull invocation)

The firefoxtree extension has also been updated. It now uses the new namespaces feature of Mercurial so all Firefox labels/names/refs are exposed in a firefoxtree namespace instead of as tags. As part of this change, you will lose old tags created by this extension and will need to re-pull repos to recreate them as namespaced labels. hg log output will now look like the following:

changeset:   332223:0babaa3edcf9
fxtree:      central
parent:      332188:40038a66525f
parent:      332222:c6fc9d77e86f
user:        Carsten "Tomcat" Book <cbook@mozilla.com>
date:        Wed Dec 16 12:01:46 2015 +0100
summary:     merge mozilla-inbound to mozilla-central a=merge

(Note the fxtree line.)

The firefoxtree extension also now works with hg share. i.e. if you hg share from a Firefox repository and hg pull from the source repo or any shared repo, the labels will be updated in every repo. This only works on newly-created shares. If you want to enable this on an existing share, see this comment.


hg.mozilla.org Updates

November 05, 2015 at 09:20 AM | categories: Mozilla

A number of changes have been made to hg.mozilla.org over the past few days:

  • Bookmarks are now replicated from master to mirrors properly (before, you could have seen foo@default bookmarks appearing). This means bookmarks now properly work on hg.mozilla.org! (bug 1139056).

  • Universally upgraded to Mercurial 3.5.2. Previously we were running 3.5.1 on the SSH master server and 3.4.1 on the HTTP endpoints. Some HTML templates received minor changes. (bug 1200769).

  • Pushes from clients running Mercurial older than 3.5 will now see an advisory message encouraging them to upgrade. (bug 1221827).

  • Author/user fields are now validated to be a RFC 822-like value (e.g. "Joe Smith <someone@example.com>"). (bug 787620).

  • We can now mark individual repositories as read-only. (bug 1183282).

  • We can now mark all repositories read-only (useful during maintenance events). (bug 1183282).

  • Pushlog commit list view only shows first line of commit message. (bug 666750).

Please report any issues in their respective bugs. Or if it is an emergency, #vcs on irc.mozilla.org.


« Previous Page -- Next Page »