Gregory Szorc's Digital Home

Review Board at Mozilla

January 27, 2014 at 04:30 PM | categories: Mozilla, mach

Some Mozillians recently stood up an instance of Review Board - a web-based code review tool for evaluation purposes. Having used Review Board at a previous company, I can say with high confidence that when properly configured, it beats the pants off Splinter (the code review interface baked into Bugzilla). Here are some advantages:

The HTML interface is more pleasant on the eyes (subjective).
Interdiffs actually work.
Intra-line diffs are rendered.
You can open issues for individual review comments and these issues can be tracked during subsequent reviews (Bugzilla doesn't really have anything similar and review comments tend to get lost unless the reviewer is sharp).
It integrates with the VCS, so you can expand code context from the review interface.
There are buttons to toggle whitespace differences.
Syntax hightlighting! It even recognizes things like TODO in comments.

You can read more from the official user guide.

If you have any interest in evaluating Review Board, the easiest way to upload patches to Mozilla's instance is to run mach rbt.

mach rbt will launch the Review Board tools command-line client (called RBTools). From there, you can do a number of things. Run mach rbt help to see the full list.

Here are some examples:

# See a diff that would be uploaded to Review Board:
$ mach rbt diff

# Create a review request based on the current Mercurial changeset:
$ mach rbt post

# That should print out a URL to the not-yet-published review
# request. If you go to that URL, you'll notice that the fields
# in that request are all empty.

# Next time, you can have some fields auto-populate by running:
$ mach rbt post --guess-summary --guess-description

# This grabs info from the commit message.

# To update an existing review request (e.g. to submit a new patch):
$ mach rbt post -r <review id>

# (where <review ID> is the ID of the review).

# You can also have it generate a "squashed" patch from multiple
# commits:
$ mach rbt post 164123::164125

Run mach rbt help post for more options. Also see the RBTools documentation for more.

It's worth noting that mach rbt will download an unreleased version of RBTools. This is because the released version doesn't work well with Mercurial. I contributed a handful of patches to RBTools to make Mercurial work better.

Before you dive in and start using Review Board for actual code review, there are some things you need to know:

Mozilla's Review Board instance does not yet send emails on changes. Bug 958236 tracks this. When it works, you'll see nice emails, just like you do for Bugzilla reviews.
Review Board doesn't currently interact with Bugzilla that well. In theory, we could have Review Board update corresponding Bugzilla bugs when actions are performed. Someone just needs to write this code and get it deployed.
If you create a Bugzilla attachment that contains the URL of a Review Board review (e.g. https://reviewboard.allizom.org/r/23/), Bugzilla will automatically set the MIME type as a Review Board review and set up an HTML redirect when the attachment is downloaded via the browser. You can even set r? on this attachment to have Bugzilla nag about reviews. See bug 875562 for an example.
There is currently no way to upload a patch to Review Board and update Bugzilla is one go. I have proof-of-concept code for this. Although, there is pushback on getting that checked in.
Review Board 2 is in development. It has a number of new and exciting features. And it looks better.

Finally and most importantly, Review Board at Mozilla is still in evaluation mode. It's usage has not been officially blessed as far as I know. I don't believe the SLA is as high as other services (like Bugzilla). Nobody is really using it yet. It still needs a lot of polish and integration for it to realize its potential. And, there is some talk about the future of code review at Mozilla that may or may not involve Review Board. In short, the future of Review Board at Mozilla is uncertain. I wouldn't rely on it to archive review comments from super important reviews/decisions.

Despite the shortcomings, I encourage people to play around with Review Board. If nothing else, at least gaze upon it's patch rendering beauty and witness what the future could hold.

mach now lives in mozilla-central

January 09, 2014 at 10:55 AM | categories: Mozilla, mach

mach -- the generic command line interface framework that is behind the mach tool used to build Firefox -- now has its canonical home in mozilla-central, the canonical repository for Firefox. The previous home has been updated to reflect the change.

mach will continue to be released on PyPI and installable via pip install mach.

I made the change because keeping multiple repositories in sync wasn't something I wanted to spend time doing. Furthermore, Mozillians have been contributing a steady stream of improvements to the mach core recently and it makes sense to leverage Mozilla's familiar infrastructure for patch contribution.

This decision may be revisited in the future. Time will tell.

Mozilla Build System Brain Dump

May 13, 2013 at 05:25 PM | categories: build system, Mozilla, Firefox, mach

I hold a lot of context in my head when it comes to the future of Mozilla's build system and the interaction with it. I wanted to perform a brain dump of sorts so people have an idea of where I'm coming from when I inevitably propose radical changes.

The sad state of build system interaction and the history of mach

I believe that Mozilla's build system has had a poor developer experience for as long as there has been a Mozilla build system. Getting started with Firefox development was a rite of passage. It required following (often out-of-date) directions on MDN. It required finding pages through MDN search or asking other people for info over IRC. It was the kind of process that turned away potential contributors because it was just too damn hard.

mach - while born out of my initial efforts to radically change the build system proper - morphed into a generic command dispatching framework by the time it landed in mozilla-central. It has one overarching purpose: provide a single gateway point for performing common developer tasks (such as building the tree and running tests). The concept was nothing new - individual developers had long coded up scripts and tools to streamline workflows. Some even published these for others to use. What set mach apart was a unified interface for these commands (the mach script in the top directory of a checkout) and that these productivity gains were in the tree and thus easily discoverable and usable by everybody without significant effort (just run mach help).

While mach doesn't yet satisfy everyone's needs, it's slowly growing new features and making developers' lives easier with every one. All of this is happening despite that there is not a single person tasked with working on mach full time. Until a few months ago, mach was largely my work. Recently, Matt Brubeck has been contributing a flurry of enhancements - thanks Matt! Ehsan Akhgari and Nicholas Alexander have contributed a few commands as well! There are also a few people with a single command to their name. This is fulfilling my original vision of facilitating developers to scratch their own itches by contributing mach commands.

I've noticed more people referencing mach in IRC channels. And, more people get angry when a mach command breaks or changes behavior. So, I consider the mach experiment a success. Is it perfect, no. If it's not good enough for you, please file a bug and/or code up a patch. If nothing else, please tell me: I love to know about everyone's subtle requirements so I can keep them in mind when refactoring the build system and hacking on mach.

The object directory is a black box

One of the ideas I'm trying to advance is that the object directory should be considered a black box for the majority of developers. In my ideal world, developers don't need to look inside the object directory. Instead, they interact with it through condoned and supported tools (like mach).

I say this for a few reasons. First, as the build config module owner I would like the ability to massively refactor the internals of the object directory without disrupting workflows. If people are interacting directly with the object directory, I get significant push back if things change. This inevitably holds back much-needed improvements and triggers resentment towards me, build peers, and the build system. Not a good situation. Whereas if people are indirectly interacting with the object directory, we simply need to maintain a consistent interface (like mach) and nobody should care if things change.

Second, I believe that the methods used when directly interacting with the object directory are often sub-par compared with going through a more intelligent tool and that productivity suffers as a result. For example, when you type make in inside the object directory you need to know to pass -j8, use make vs pymake, and that you also need to build toolkit/library, etc. Also, by invoking make directly, you bypass other handy features, such as automatic compiler warning aggregation (which only happens if you invoke the build system through mach). If you go through a tool like mach, you should automatically get the most ideal experience possible.

In order for this vision to be realized, we need massive improvements to tools like mach to cover the missing workflows that still require direct object directory interaction. We also need people to start using mach. I think increased mach usage comes after mach has established itself as obviously superior to the alternatives (I already believe it offers this for tasks like running tests).

I don't want to force mach upon people but...

Nobody likes when they are forced to change a process that has been familiar for years. Developers especially. I get it. That's why I've always attempted to position mach as an alternative to existing workflows. If you don't like mach, you can always fall back to the previous workflow. Or, you can improve mach (patches more than welcome!). Having gone down the please-use-this-tool-it's-better road before at other organizations, I strongly believe that the best method to incur adoption of a new tool is to gradually sway people through obvious superiority and praise (as opposed to a mandate to switch). I've been trying this approach with mach.

Lately, more and more people have been saying things like we should have the build infrastructure build through mach instead of client.mk and why do we need testsuite-targets.mk when we have mach commands. While I personally feel that client.mk and testsuite-targets.mk are antiquated as a developer-facing interface compared to mach, I'm reluctant to eliminate them because I don't like forcing change on others. That being said, there are compelling reasons to eliminate or at least refactor how they work.

Let's take testsuite-targets.mk as an example. This is the make file that provides the targets to run tests (like make xpcshell-test and make mochitest-browser-chrome). What's interesting about this file is that it's only used in local builds: our automation infrastructure does not use testsuite-targets.mk! Instead, mozharness and the old buildbot configs manually build up the command used to invoke the test harnesses. Initially, the mach commands for running tests simply invoked make targets defined in testsuite-targets.mk. Lately, we've been converting the mach commands to invoke the Python test runners directly. I'd argue that the logic for invoke the test runner only needs to live in one place in the tree. Furthermore as a build module peer, I have little desire to support multiple implementations. Especially considering how fragile they can be.

I think we're trending towards an outcome where mach (or the code behind mach commands) transitions into the authoratitive invocation method and legacy interfaces like client.mk and testsuite-targets.mk are reimplemented to either call mach commands or the same routine that powers them. Hopefully this will be completely transparent to developers.

The future of mozconfigs and environment configuration

mozconfig files are shell scripts used to define variables consumed by the build system. They are the only officially supported mechanism for configuring how the build system works.

I'd argue mozconfig files are a mediocre solution at best. First, there's the issue of mozconfig statements that don't actually do anything. I've seen no-op mozconfig content cargo culted into the in-tree mozconfigs (used for the builder configurations)! Oops. Second, doing things in mozconfig files is just awkward. Defining the object directory requires mk_add_options MOZ_OBJDIR=some-path. What's mk_add_options? If some-path is relative, what is it relative to? While certainly addressable, the documentation on how mozconfig files work is not terrific and fails to explain many pitfalls. Even with proper documentation, there's still the issue of the file format allowing no-op variable assignments to persist.

I'm very tempted to reinvent build configuration as something not mozconfigs. What exactly, I don't know. mach has support for ini-like configuration files. We could certainly have mach and the build system pull configs from the same file.

I'm not sure what's going to happen here. But deprecating mozconfig files as they are today is part of many of the options.

Handling multiple mozconfig files

A lot of developers only have a single mozconfig file (per source tree at least). For these developers, life is easy. You simply install your mozconfig in one of the default locations and it's automagically used when you use mach or client.mk. Easy peasy.

I'm not sure what the relative numbers are, but many developers maintain multiple mozconfig files per source tree. e.g. they'll have one mozconfig to build desktop Firefox and another one for Android. They may have debug variations of each.

Some developers even have a single mozconfig file but leverage the fact that mozconfig files are shell scripts and have their mozconfig dynamically do things depending on the current working directory, value of an environment variable, etc.

I've also seen wrapper scripts that glorify setting environment variables, changing directory, etc and invoke a command.

I've been thinking a lot about providing a common and well-supported solution for switching between active build configurations. Installing mach on $PATH goes a long way to facilitate this. If you are in an object directory, the mozconfig used when that object directory was created is automatically applied. Simple enough. However, I want people to start treating object directories as black boxes. So, I'd rather not see people have their shell inside the object directory.

Whenever I think about solutions, I keep arriving at a virtualenv-like solution. Developers would potentially need to activate a Mozilla build environment (similar to how Windows developers need to launch MozillaBuild). Inside this environment, the shell prompt would contain the name of the current build configuration. Users could switch between configurations using mach switch or some other magic command on the $PATH.

Truth be told, I'm skeptical if people would find this useful. I'm not sure it's that much better than exporting the MOZCONFIG environment variable to define the active config. This one requires more thought.

The integration between the build environment and Python

We use Python extensively in the build system and for common developer tasks. mach is written in Python. moz.build processing is implemented in Python. Most of the test harnesses are written in Python.

Doing practically anything in the tree requires a Python interpreter that knows about all the Python code in the tree and how to load it.

Currently, we have two very similar Python environments. One is a virtualenv created while running configure at the beginning of a build. The other is essentially a cheap knock-off that mach creates when it is launched.

At some point I'd like to consolidate these Python environments. From any Python process we should have a way to automatically bootstrap/activate into a well-defined Python environment. This certainly sounds like establishing a unified Python virtualenv used by both the build system and mach.

Unfortunately, things aren't straightforward. The virtualenv today is constructed in the object directory. How do we determine the current object directory? By loading the mozconfig file. How do we do that? Well, if you are mach, we use Python. And, how does mach know where to find the code to load the mozconfig file? You can see the dilemma here.

A related issue is that of portable build environments. Currently, a lot of our automation recreates the build system's virtualenv from its own configuration (not that from the source tree). This has and will continue to bite us. We'd really like to package up the virtualenv (or at least its config) with tests so there is no potential for discrepancy.

The inner workings of how we integrate with Python should be invisible to most developers. But, I figured I'd capture it here because it's an annoying problem. And, it's also related to an activated build environment. What if we required all developers to activate their shell with a Mozilla build environment (like we do on Windows)? Not only would this solve Python issues, but it would also facilitate simpler config switching (outlined above). Hmmm...

Direct interaction with the build system considered harmful

Ever since there was a build system developers have been typing make (or make.py) to build the tree. One of the goals of the transition to moz.build files is to facilitate building the tree with Tup. make will do nothing when you're not using Makefiles! Another goal of the moz.build transition is to start derecursifying the make build system such that we build things in parallel. It's likely we'll produce monolithic make files and then process all targets for a related class IDLs, C++ compilation, etc in one invocation of make. So, uh, what happens during a partial tree build? If a .cpp file from /dom/src/storage is being handled by a monolithic make file invoked by the Makefile at the top of the tree, how does a partial tree build pick that up? Does it build just that target or every target in the monolithic/non-recursive make file?

Unless the build peers go out of our way to install redundant targets in leaf Makefiles, directly invoking make from a subdirectory of the tree won't do what it's done for years.

As I said above, I'm sympathetic to forced changes in procedure, so it's likely we'll provide backwards-compatibile behavior. But, I'd prefer to not do it. I'd first prefer partial-tree builds are not necessary and a full tree build finishes quickly. But, we're not going to get there for a bit. As an alternative, I'll take people building through mach build. That way, we have an easily extensible interface on which to build partial tree logic. We saw this recently when dumbmake/smartmake landed. And, going through mach also reinforces my ideal that the object directory is a black box.

Semi-persistent state

Currently, most state as it pertains to a checkout or build is in the object directory. This is fine for artifacts from the build system. However, there is a whole class of state that arguably shouldn't be in the object directory. Specifically, it shouldn't be clobbered when you rebuild. This includes logs from previous builds, the warnings database, previously failing tests, etc. The list is only going to grow over time.

I'd like to establish a location for semi-persistant state related to the tree and builds. Perhaps we change the clobber logic to ignore a specific directory. Perhaps we start storing things in the user's home directory. Perhaps we could establish a second object directory named the state directory? How would this interact with build environments?

This will probably sit on the backburner until there is a compelling use case for it.

The battle against C++

Compiling C++ consumes the bulk of our build time. Anything we can do to speed up C++ compilation will work wonders for our build times.

I'm optimistic things like precompiled headers and compiling multiple .cpp files with a single process invocation will drastically decrease build times. However, no matter how much work we put in to make C++ compilation faster, we still have a giant issue: dependency hell.

As shown in my build system presentation a few months back, we have dozens of header files included by hundreds if not thousands of C++ files. If you change one file: you invalidate build dependencies and trigger a rebuild. This is why whenever files like mozilla-config.h change you are essentially confronted with a full rebuild. ccache may help if you are lucky. But, I fear that as long as headers proliferate the way they do, there is little the build system by itself can do.

My attitude towards this is to wait and see what we can get out of precompiled headers and the like. Maybe that makes it good enough. If not, I'll likely be making a lot of noise at Platform meetings requesting that C++ gurus brainstorm on a solution for reducing header proliferation.

Conclusion

Belive it or not, these are only some of the topics floating around in my head! But I've probably managed to bore everyone enough so I'll call it a day.

I'm always interested in opinions and ideas, especially if they are different from mine. I encourage you to leave a comment if you have something to say.

Omnipresent mach

March 03, 2013 at 12:30 PM | categories: Mozilla, Firefox, mach

Matt Brubeck recently landed an awesome patch for mach in bug 840588: it allows mach to be used by any directory. I'm calling it omnipresent mach.

Essentially, Matt changed the mach driver (the script in the root directory of mozilla-central) so instead of having it look in hard-coded relative paths for all its code, it walks up the directory tree and looks for signs of the source tree or the object directory.

What this all means is that if you have the mach script installed in your $PATH and you just type mach in your shell from within any source directory or object directory, mach should just work. So, no more typing ./mach: just copy mach to ~/bin, /usr/local/bin or some other directory on your $PATH and you should just be able to type mach.

Unfortunately, there are bound to be bugs here. Since mach traditionally was only executed with the current working directory as the top source directory, some commands are not prepared to handle a variable current working directory. Some commands will likely get confused when it comes resolving relative paths, etc. If you find an issue, please report it! A temporary workaround is to just invoke mach from the top source directory like you've always been doing.

If you enjoy the feature, thank Matt: this was completely his idea and he saw it through from conception to implementation.