SQLite.jsm - SQLite Done Betterer

April 14, 2013 at 11:55 PM | categories: Mozilla

Did you know there is now a better way to interact with SQLite from JavaScript in Firefox? It's called SQLite.jsm and it's available in Firefox 20 and later.

SQLite.jsm is an abstraction around the low-level Storage APIs that you would have used before. However, it eliminates most of the footguns and makes it easier to write code that doesn't jank the browser and is less prone to memory leaks. It even has an API to free as much memory as possible from the current connection!

If you currently use SQLite via Storage, I highly recommend taking a look at SQLite.jsm - especially if you are using synchronous APIs.

If you are investigating SQLite for the storage needs of your add-on or browser feature, please keep in mind that SQLite can incur lots of filesystem I/O and may run slowly on old machines (especially with magnetic hard drives) and especially with its default configuration. You may be interested in low-level file I/O using OS.File instead.

If you insist on using SQLite, please educate yourself on and then seriously consider using Write-Ahead Logging mode on your database. Some detailed discussion on SQLite behavior as it pertains to Firefox can be found in bug 830492. I hope to eventually incorporate saner default connection options into SQLite.jsm to make it easier for add-ons and browser features to have the least-impactful behavior by default (e.g. enabling WAL by default). Until then, please, please, please research PRAGMA statements to optimize how your SQLite database runs so it has as little performance overhead as possible. Also consider dropping into an IRC channel on irc.mozilla.org and asking for advice from one of the many people who have fallen into SQLite's performance pitfalls (including me).
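
For readers who have not tuned SQLite before, here is a minimal sketch of the kind of connection configuration I'm talking about, using Python's sqlite3 module purely for illustration. The same PRAGMA statements apply no matter which binding you use, and the specific values below are examples, not one-size-fits-all recommendations.

import sqlite3

conn = sqlite3.connect("example.db")

# Write-Ahead Logging: readers no longer block writers and fsync
# traffic drops considerably for write-heavy workloads.
conn.execute("PRAGMA journal_mode=WAL")

# Fewer fsyncs in exchange for slightly weaker durability guarantees
# (committed transactions can be lost on OS/power failure, not on app crash).
conn.execute("PRAGMA synchronous=NORMAL")

# Keep more pages in memory to cut read I/O (negative value = size in KiB).
conn.execute("PRAGMA cache_size=-2048")

conn.execute("CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT)")
conn.commit()
conn.close()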


Making hg-git Faster

April 14, 2013 at 09:45 PM | categories: Mozilla, Mercurial, Git

When enterprising individuals at Mozilla started maintaining a Git mirror of Firefox's main source repository (hosted in Mercurial), they ran into a significant problem: conversion was slow. The initial conversion apparently took over 6 days and used a lot of memory. Furthermore, each subsequent commit took many seconds to convert, even on modern hardware. This meant that they could only maintain a Git mirror of a few project branches and that updates would be slow. Worse, the slowness of the conversion significantly discouraged people from using the tool locally as part of regular development.

I thought this was unacceptable. I wanted to enable people to use their tool of choice (Git) to develop Firefox. So, I did what annoyed engineers do when confronted with an itch: I scratched it.

Diagnosing the Problem

When I started tackling this problem, I had little knowledge of the problem space other than the problem statement: converting from Mercurial to Git with hg-git was prohibitively slow. My challenge was thus to make hg-git faster.

When confronted with a performance problem, one of the first things you do is identify the source of the bad performance. Then, you need to ascertain whether that is something you have the ability to change.

This often starts by answering some high-level questions and then drilling down into more detail as necessary. For a long-running system tool like hg-git, I start at the top: how much CPU, memory, and I/O is the process utilizing?

In the case of hg-git, we were CPU bound. The Python process was consistently pegging a single CPU core while periodically incurring I/O (but not nearly enough to saturate a magnetic disk). This told me a few things. First, I should look for bottlenecks inside Python. Second, I should investigate whether parallel execution would be possible. The latter is especially important these days because the trend in processors is towards more cores rather than higher clock speeds. It's no longer acceptable to let increases in clock speed or cycle efficiency bail you out: if you want a CPU bound process to run as fast as possible, it's often necessary to involve all available CPU cores.

Once I diagnosed CPU as the limiting factor, I pulled out the next tool in the arsenal: a code profiler. I quickly discovered exactly where the conversion was taking the most CPU time. As feared, it was in the export Mercurial changeset to Git commit function. Specifically, profiling had flagged the conversion of Mercurial manifests to Git trees and blobs. Furthermore, most of the time was spent in functions in Mercurial itself (Mercurial is implemented in Python and hg-git calls into it natively) and Dulwich (a pure Python implementation of Git). So, I was looking at either deficiencies in Mercurial and/or Dulwich, a bad conversion algorithm in hg-git, or both. To know which, I would need a better grasp on the internal storage models of Mercurial and Git.
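
For those who have not profiled Python before, this step is straightforward with the standard library. A quick sketch (convert_all here is a hypothetical stand-in for hg-git's export loop, not a real function):

import cProfile
import pstats

def convert_all():
    pass  # hypothetical stand-in for the export loop being profiled

# Run the workload under the profiler and dump raw stats to a file.
cProfile.run("convert_all()", "conversion.prof")

# Show the 20 functions with the most cumulative time. This is how
# hotspots like manifest iteration and tree construction stand out.
stats = pstats.Stats("conversion.prof")
stats.sort_stats("cumulative").print_stats(20)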

Learning about Mercurial's and Git's internal storage models

To understand why conversion from Mercurial to Git was slow, I needed to understand how each stored data internally. My hope was that if I attained a better understanding I could apply the knowledge to assess the algorithm hg-git was using and optimize it, hopefully introducing parallel execution along the way.

Git's internals

I already had a fairly good understanding of how Git works internally. And, it's quite simple really. The Git Internals chapter of the Pro Git book is extremely useful. While I encourage readers to read all of the Git Objects section, the gist is:

  • Git's core storage is a key-value data store. Keys are SHA-1 checksums of content. Each entity is stored in a Git object.
  • A blob is an object holding the raw content of a file.
  • A tree is an object holding a list of tree entries. Each tree entry defines a blob, another tree object, etc. A tree is essentially a directory listing.
  • A commit object holds metadata about an individual Git commit. Each commit object refers to a specific tree object.

When you introduce a new file that hasn't been seen before, a new blob is added to storage. That blob is referenced by a tree. When you update a file, a new tree is created referring to the new blob that was created.
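
To make the key-value model concrete, here is how Git derives the key for a blob: a SHA-1 over a short header plus the raw file contents. (A sketch in Python; trees and commits are hashed the same way over their own serialized forms.)

import hashlib

def git_blob_id(content):
    """Compute the object ID Git assigns to a blob with this content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()

# Identical content always hashes to the same ID, which is why an
# unchanged file never produces a new blob object.
print(git_blob_id(b"hello world\n"))  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad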

Things get a little complicated when you consider directories. If you update the file foo/bar/baz.c, the tree for foo/bar changes (because the SHA-1 of baz.c changed). And, the SHA-1 for the foo/bar tree changes, so the bar entry in foo's tree changes, changing the SHA-1 for the root tree.

That's essentially how Git addresses commits, directories, and files. If you don't grok this, please, please read the aforementioned page on it - it may even help you better grok Git!

Mercurial's internals

Unlike Git, I didn't really have a clue how Mercurial worked internally. So, I needed to do some self-education here.

The best resource for Mercurial's storage model I've found is the Behind the Scenes chapter from Mercurial: The Definitive Guide. The gist is:

  • History for an individual file is stored in a filelog. Each filelog contains the history of a single file. Each file revision has a hash based on the file contents.
  • The manifest lists every file, its permissions, and its file revision for each changeset in the repository.
  • The changelog contains information about each changeset, including the revision of the manifest to use.
  • Each of these logs contains revisions and you can address an individual revision within the log.

From a high level, Mercurial's storage model is very similar to Git's. They both address files by hashing their content. Where Git uses multiple tree objects to define every file in a commit, Mercurial has a single manifest containing a flat list. Aside from that, the differences are mostly in implementation details. These are important, as we'll soon see.
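
For the curious, here is roughly how those pieces are reachable from Mercurial's internal Python API (a sketch against the API as it looked around this time; module layout and signatures have shifted in later Mercurial versions):

from mercurial import hg, ui

repo = hg.repository(ui.ui(), "/path/to/repo")

ctx = repo["tip"]            # a changectx, backed by the changelog
manifest = ctx.manifest()    # dict-like mapping: path -> file revision node

for path in sorted(manifest):
    fctx = ctx[path]         # a filectx, backed by that file's filelog
    data = fctx.data()       # the file's contents at this changeset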

Analyzing hg-git's conversion algorithm

Armed with knowledge of how Git and Mercurial internally store data, I was ready to analyze how hg-git was performing conversion from Mercurial to Git. Since profiling revealed it was the convert a single changeset into Git commit function that was taking all the time, I started there.

In Python-like pseudocode (not the actual code), the algorithm was essentially:

def export_changeset_to_git(changeset, git, already_converted):
    """Receives the Mercurial changeset and a handle on a Git object storre."""
    # This is an entity that helps us build Git tree objects from
    # paths and blobs. The logic is at
    # https://github.com/jelmer/dulwich/blob/2a8548be3b1fd4a1ae7d0436dce91611112c47c2/dulwich/index.py#L298
    tree_builder = TreeBuilder()

    for file in changeset.manifest:
        blob_id = already_converted.get(file.id, None)

        if blob_id is None:
            blob = Blob(file.data())
            git.store(blob.id, blob.content)
            already_converted[file.id] = blob.id
            blob_id = blob.id

        tree_builder.add_file(file.path, blob_id, file.mode)

    for tree in tree_builder.all_trees():
        git.store(tree.id, tree.content)

    root_tree = tree_builder.root_tree

    # And proceed to build the Git commit and insert it.

On the face of it, this code doesn't seem too bad. If I were writing the functionality from scratch, I'd likely do something very similar. So, why is it so slow?

As I mentioned earlier, profiling results had identified Mercurial and Dulwich as the hot spots. The Mercurial hotspot was in iteration over the files in the manifest. And the Dulwich offender was Git tree object construction. But why?

First, it turns out that iterating a manifest the way hg-git was isn't exactly performant. I never traced all the gory details, but I'm pretty sure every time it accessed the file context through the change context there was I/O involved. Not good, especially when the information either isn't needed or was already available from a previous access!

Second, it turns out that creating Git tree objects in Dulwich is rather slow. And, the problem is magnified when converting large repositories - like mozilla-central (Firefox's canonical repository).

So, I was faced with a decision: make Mercurial and/or Dulwich faster or change hg-git. Since improving these would have benefits outside of hg-git, I initially went down those roads. However, I eventually abandoned the effort because of the effort involved. And, in the case of Dulwich, improving things would likely require rewriting some pieces in C - not something I cared to do nor something that the Dulwich people would likely accept since Dulwich is all about being a pure Python implementation of Git! In hindsight, this was the right call. Mercurial and Dulwich are fast enough - it's hg-git that was being suboptimal.

I was faced with two problems: don't mass iterate over manifests and don't mass generate Git trees. Both were seemingly impossible to avoid because both are critical to converting a Mercurial changeset to Git.

I thought about this problem for a while. I experimented with numerous micro benchmarks. I engaged the very helpful Mercurial developers on IRC (thanks everyone!). And, I eventually arrived at what I think is an elegant solution.

When I took a step back and looked at the larger problem of exporting Mercurial changesets to Git, I realized the conversion would be much more efficient if it were aware of what had come before. Before I came along, hg-git was asking Mercurial for the full state of each changeset for every changeset conversion. When you think about it in terms of low-level operations, this is extremely inefficient. Let's take Git trees as an example.

When you perform a commit, only the trees containing modified files - and their parent trees - change. All the other trees are identical across commits. For a repository with as many files and directories as mozilla-central, the unchanged trees across a small commit vastly outnumber the changed ones. The overhead of recomputing all these trees is not insignificant!

Instead of throwing away all the trees and file context information between changeset exports, what if I preserved it and reused it for the next changeset? I think I'm on to something...

Implementing incremental changeset export

To minimize the work performed when exporting Mercurial changesets to Git, I implemented a standalone class that can emit Mercurial changeset deltas in terms of Git objects. Essentially, it caches a Git tree representation of a Mercurial manifest. When you feed a new Mercurial changeset into it, it asks Mercurial to compare the new changeset against the cached one using the same API used by hg status. This API is efficient and returns the information I care about: the paths that changed. Once we have the changed files, we simply reflect those changes by updating the affected Git trees.

If a file changes or is added, we emit a blob. If a tree changes, we emit the new tree object. When the consumer has finished writing the set of new objects to Git, it asks for the SHA-1 of the root tree. (Up until this point the consumer is not aware of what any of the emitted objects actually are - just that they likely need to be added to storage.) It then uses the SHA-1 of the root tree to construct the commit. Then it moves on to the next changeset.
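
To make the "only reflect the delta" idea concrete, here is a small self-contained sketch (not the actual hg-git code) of the bookkeeping involved: given the previous and current manifests, it reports which files changed and which directories need new Git tree objects. Everything outside those directories can be reused verbatim from the previous export.

import posixpath

def parent_dirs(path):
    """Yield every ancestor directory of a path, ending with the root ('')."""
    d = posixpath.dirname(path)
    while d:
        yield d
        d = posixpath.dirname(d)
    yield ""

def dirty_trees(prev_manifest, curr_manifest):
    """Given two {path: file_revision} manifests, return the changed and
    removed paths plus the directories whose Git trees must be rebuilt."""
    changed = [p for p, node in curr_manifest.items()
               if prev_manifest.get(p) != node]
    removed = [p for p in prev_manifest if p not in curr_manifest]

    dirty = set()
    for path in changed + removed:
        dirty.update(parent_dirs(path))
    return changed, removed, dirty

# A one-file change in a large tree dirties only the trees along its path:
prev = {"browser/app/main.cpp": "n1", "dom/base/node.cpp": "n2"}
curr = {"browser/app/main.cpp": "n3", "dom/base/node.cpp": "n2"}
print(dirty_trees(prev, curr))
# changed: ['browser/app/main.cpp'], removed: [], dirty: {'', 'browser', 'browser/app'}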

The impact of this change is significant. Converting Mercurial's own repository to Git went from 21:07 to 8:14 on my i7-2600K. mozilla-central is even more drastic. The first 200 commits (the first commit was a large dump from CVS) took 8:17 before and now take 2:32. I don't have exact numbers for newer commits, but I do know they were at least twice as slow as the initial commits and showed an even more drastic speedup.

But I was just getting started.

The initial implementation wasn't very efficient in terms of reducing tree object calculations. I changed that earlier today when I submitted a patch for consideration that only calculates tree changes for trees that actually changed. I also removed some needless sorting of export operations. This second patch reduced conversion of Mercurial's repository down to 5:33. Even more impressive is that mozilla-central's changesets are now exporting almost 4x faster with this patch alone. The first 200 changesets now export in 42s (down from 2:32, which is down from 8:17). The speedup comes mostly from eliminating the overhead of reprocessing non-dirty trees on every export.

And I'm not through.

As part of building the standalone incremental changeset exporter, one of the goals in the back of my mind was to eventually have things execute in parallel.

In my personal development branch I have a patch to perform Mercurial changeset export on multiple cores. Essentially, hg-git fires up a bunch of worker processes and asks each to export a consecutive range of changesets. The workers write new Git objects into Git and then tell the coordinator process the root tree SHA-1 corresponding to each Mercurial changeset. The coordinator process then uses these root tree SHA-1's to derive Git commit objects (you can't create a commit object until you know the SHA-1 of the commit's parents).

Exporting blobs and trees on separate processes makes Mercurial-to-Git export scale out to however many cores you feel like throwing at it. When 32-core machines come around, you can convert using all available cores and the speedup should be roughly linear with the number of cores.
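
The structure of that patch is roughly the following (a toy sketch, not the actual code; export_range stands in for the real per-worker export logic):

from multiprocessing import Pool

def export_range(revs):
    """Worker: export blobs and trees for a consecutive range of changesets
    straight into Git object storage, reporting each changeset's root tree
    SHA-1 back to the coordinator. (Placeholder for the real export work.)"""
    return [(rev, "root-tree-sha-for-%d" % rev) for rev in revs]

def chunk(revs, n):
    """Split revisions into n consecutive ranges, one per worker."""
    size = max(1, (len(revs) + n - 1) // n)
    return [revs[i:i + size] for i in range(0, len(revs), size)]

if __name__ == "__main__":
    revs = list(range(200))

    with Pool(processes=4) as pool:
        root_trees = dict()
        for batch in pool.map(export_range, chunk(revs, 4)):
            root_trees.update(batch)

    # Commit objects are created serially, in order, because each commit
    # needs the SHA-1 of its parent commit(s). This part is cheap relative
    # to blob/tree export, so single-threaded is fine.
    for rev in revs:
        root_tree = root_trees[rev]
        # create_commit(rev, root_tree) would go here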

I'm still working out some kinks in the multiple processes patch (the multiprocessing module is very difficult to get working on all platforms and I don't want to break hg-git when it lands). But, Ehsan Akhgari has been using it to power the GitHub mirror of mozilla-central for months without issue. (His use of these patches freed up the CPU required to support conversion of more project branches on the Git mirror. And, he's still not using the 4x improvement patch I wrote today - he will shortly - so who knows what improvements will stem from that.)

With all the patches applied, hg-git now feels like a Ferrari when exporting Mercurial changesets to Git. Conversion of Mercurial's repository now takes 1:25 (down from 21:07). Conversion of mozilla-central has gone from 6+ days to about 3 hours! More importantly, ongoing conversions feel somewhat snappy now.

Making Git export even faster

With the patch today, I'd say optimization of exporting Mercurial changesets is nearing its limits. There are a few things I could try that may net another 2 or 3x improvement. But, I think the ~50x improvement I've already attained (at least for mozilla-central) is pretty damn good and good enough for most users. (Part of performance optimization is knowing when the result is good enough and stopping before you invest excessive time in the long tail.)

There is one giant refactor that could likely net a significant win for Git export. However, it requires optimizing for initial export over recurring incremental export (which is why I have little interest in it). Incremental export incurs a lot of random I/O accessing Mercurial filelogs and extracting specific file revisions as they are needed. An optimal export would iterate over the filelogs and export Git blobs from each filelog in the sequence they occur in within the filelogs. It would cache the mapping from file node to blob SHA-1. After all blobs are exported, the mappings would be combined and distributed to all workers. Then, tree export would occur in parallel largely under the existing model modulo blob writing. This would minimize overall I/O and work in Mercurial and would likely be significantly faster. However, it's mostly useful for initial export and IMO not worth implementing. (It's possible to employ a variation for incremental export that iterates over filelogs and exports not-yet-seen revisions. Perhaps I will investigate this some day.)

What about converting Git to Mercurial?

Now that I've tackled Mercurial to Git conversion, it's very tempting to work magic on the inverse: converting Git commits to Mercurial changesets. While I haven't looked at this problem in detail, I already know it will be at least slightly more challenging.

The reason is parallelization. With Mercurial export, I have each child process reading directly from Mercurial and writing directly to Git. There are no locks involved. There is just a coordinator that ensures minimum redundant work among workers. There is some redundant work, sure. But, the alternative would be lots of locking and/or exchange of state across processes - not cheap operations! Furthermore, the writes into Git can occur in any order (since Git is just a key-value store). The only hard requirement is a child commit must come after its parent (because you need the parent commit's SHA-1). And, single-threaded insert of commit objects isn't a big deal because you can crank through hundreds of them per second (it might even be over 1000/s on my machine).

Mercurial's storage implementation does not afford me the same carelessness with regards to writing into storage. Since Mercurial uses shared files for individual file and manifest history, we have a contention problem. We could lock files when writing to them. However, these files (revlogs in Mercurial speak) also use transparent delta compression. You get the best performance/compression when changes are written in the order they actually occurred (at least in the typical case).

To optimally write to Mercurial you need to order inserts. This means parallel reads from Git (in separate worker processes) would be very difficult to implement. Doable, sure, but you're looking at a lot of transferred state and ordering. This likely involves a lot more memory and CPU usage.

The best idea I've come up with so far is a single process that reads off Git commits and iterates trees. It hashes the paths of seen files to a consistent worker process which then pulls the blob from Git's storage and inserts it into the filelog. You don't need to lock filelogs because only one worker owns a specific path. Workers report the blob's corresponding file node to another process which then assembles manifests, writes manifests in order, and finally creates and writes changesets. Unfortunately, the worker processes are just doing blob I/O. There is no parallel processing of Git tree calculation or Mercurial manifests. Given this was a significant source of slowness exporting to Git, I worry the inverse will be true. Although, the problem with Git was tree creation and it was due to the volume. Since there is only 1 manifest per changeset, perhaps it won't be as bad.

While I've brainstormed a solution, I have no concrete plans to work on Git to Mercurial conversion. The impetus for me working on Mercurial to Git speedups was that I and a number of other Mozilla people were personally impacted. If the same is true for Git to Mercurial slowness, I could invest a few hours the next time I'm sick and bored over the weekend.

Conclusion

Converting Mercurial repositories to Git with hg-git is now significantly faster. If you thought it was too slow before, grab the latest code (from either the official repository or my personal branch) and enjoy.


Bulk Analysis of Mozilla's Build and Test Data

April 01, 2013 at 01:12 PM | categories: Mozilla, Firefox

When you push code changes to Firefox and other similar Mozilla projects, a flood of automated jobs is triggered on Mozilla's infrastructure. It works like any other continuous integration system. First you build, then you run tests, etc. What sets it apart from other continuous integration systems is the size: Mozilla runs thousands of jobs per week and the combined output sums into the tens of gigabytes.

Most of the data from Mozilla's continuous integration is available on public servers, notably ftp.mozilla.org. This includes compiled binaries, logs, etc.

While there are tools that can sift through this mountain of data (like TBPL), they don't allow ad-hoc queries over the raw data. Furthermore, these tools are very function-specific and there are many data views they don't expose. This missing data has always bothered me because, well, there are cool and useful things I'd like to do with this data.

This itch has been bothering me for well over a year. The persistent burning sensation coupled with rain over the weekend caused me to scratch it.

The product of my weekend labor is a system facilitating bulk storage and analysis of Mozilla's build data. While it's currently very alpha, it's already showing promise for more thorough data analysis.

Essentially, the tool works by collecting the dumps of all the jobs executed on Mozilla's infrastructure. It can optionally supplement this with the raw logs from those jobs. Then, it combs through this data, extracts useful bits, and stores them. Once the initial fetching has completed, you simply re-"parse" the local data set into useful data. And, since all data is stored locally, the performance of this is not bound by Internet bandwidth. In practice, this means you can obtain a new metric much faster than before. The downside is you will likely be storing gigabytes of raw data locally. But, disks are cheap. And, you have control over what gets pulled in, so you can limit it to what you need.

Please note the project is very alpha and is only currently serving my personal interests. However, I know there is talk about TBPL2 and what I have built could evolve into the data store for the next generation TBPL tool. Also, most of the work so far has centered on data import. There is tons of analysis code waiting to be written.

If you are interested in improving the tool, please file a GitHub pull request.

I hope to soon blog about useful information I've obtained through this tool.


Omnipresent mach

March 03, 2013 at 12:30 PM | categories: Mozilla, Firefox, mach

Matt Brubeck recently landed an awesome patch for mach in bug 840588: it allows mach to be used from any directory. I'm calling it omnipresent mach.

Essentially, Matt changed the mach driver (the script in the root directory of mozilla-central) so instead of having it look in hard-coded relative paths for all its code, it walks up the directory tree and looks for signs of the source tree or the object directory.
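
Conceptually, the upward search looks something like this (a sketch; the marker file names below are illustrative, the real driver checks for its own well-known files):

import os

def find_tree_root(start_dir):
    """Walk up from start_dir until we find something that looks like a
    source tree or object directory, or hit the filesystem root."""
    path = os.path.abspath(start_dir)
    while True:
        # Illustrative markers: a checked-out mach script for a source tree,
        # a config.status for an object directory.
        if os.path.exists(os.path.join(path, "mach")) or \
                os.path.exists(os.path.join(path, "config.status")):
            return path
        parent = os.path.dirname(path)
        if parent == path:
            return None
        path = parent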

What this all means is that if you have the mach script installed in your $PATH and you just type mach in your shell from within any source directory or object directory, mach should just work. So, no more typing ./mach: just copy mach to ~/bin, /usr/local/bin or some other directory on your $PATH and you should just be able to type mach.

Unfortunately, there are bound to be bugs here. Since mach traditionally was only executed with the current working directory as the top source directory, some commands are not prepared to handle a variable current working directory. Some commands will likely get confused when it comes to resolving relative paths, etc. If you find an issue, please report it! A temporary workaround is to just invoke mach from the top source directory like you've always been doing.

If you enjoy the feature, thank Matt: this was completely his idea and he saw it through from conception to implementation.


moz.build Files and the Firefox Build System

February 28, 2013 at 07:45 PM | categories: Mozilla, Firefox, build system

The next time you update mozilla-central you may notice some significant changes with the build system. That's because this morning we finally landed the start of a massive overhaul of the build system! There are many end goals to this effort. The important ones for most will be faster build times and a build system that is friendlier to make changes to.

Introducing moz.build Files

If you look in the tree, you'll notice that nearly every directory now has a moz.build file.

moz.build files are what we are using to define the build system. Think of them each as a descriptor that describes how to build its own part of the tree. An individual moz.build file will contain the C++ sources to compile, the headers to export, the tests to run, etc. Eventually. Currently, they are limited to directory traversal information.
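
For example, a traversal-only moz.build looks something like this (the directory names are made up; DIRS and PARALLEL_DIRS are the variables actually read at this stage):

# moz.build - currently just directory traversal information.

# Subdirectories processed sequentially, in this order.
DIRS += ['base', 'public']

# Subdirectories whose processing can happen in parallel.
PARALLEL_DIRS += ['components', 'locales']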

moz.build files essentially add a level of indirection between the build system definition and how the tree is actually built. Before moz.build files, the same metadata we are now capturing in moz.build files (or plan to capture) was captured in Makefile.in files. We performed simple variable substitution on these Makefile.in files to produce Makefile files in the object directory. These Makefile files were used by GNU Make (or Pymake on Windows) to build the tree.

As I outlined in Improving Mozilla's Build System, Makefile.in files are suboptimal for a number of reasons. The important bit is they essentially tie us to the use of make (recursive or otherwise). We are very interested in supporting modern build systems such as Tup (the theory being they will build the tree faster).

Enter moz.build files. Storing our build configuration in moz.build files allows us to decouple the definition of the build system from the tool used to build it.

How moz.build Files Work

At the tail end of configure, the build system invokes the config.status script in the object directory. The role of config.status is to combine the information collected during configure with the build configuration obtained from moz.build files and take the necessary actions to ensure the build backend (make) can build the tree.

Before today, config.status essentially iterated over the source tree and converted Makefile.in files to Makefile in the object directory. Things are now slightly more complicated with moz.build files.

When config.status runs, it starts with the root moz.build from the source tree. It feeds this file into a Python interpreter. It then looks for special variables like DIRS and PARALLEL_DIRS to determine which directories contain additional moz.build files. It then descends into all the referenced directories, reading their moz.build files. While this is happening, we are converting the results of each moz.build file execution into backend.mk files that make knows how to build. It also performs the Makefile.in to Makefile conversion like it always has. When the whole process has finished, the object directory contains a bunch of Makefile and backend.mk files. make runs like it always has. The only real difference is some variables are coming from the moz.build-derived backend.mk files instead of Makefile.

This is just a brief overview, of course. If you want to know more, see the code in /python/mozbuild/mozbuild/frontend and /python/mozbuild/mozbuild/backend.

Build System Future

With the introduction of moz.build files, the intent is to eventually completely eliminate Makefile.in and have all the build definition live in moz.build files.

Doing this all at once would have been next to impossible. So, we decided to eliminate Makefile.in gradually. The first step is what landed today: essentially moving DIRS variables out of Makefile.in and into moz.build files. Next, we will be eliminating empty Makefile.in (bug 844635) and will be moving more parts of the build definition from Makefile.in to moz.build files. The next part to move over will likely be IDLs (bug 818246). After that, it may be exported files (EXPORTS in Makefile.in parlance). And repeat until we have no more Makefile.in in the tree.

Each migration of build definition data to moz.build files will likely occur in two phases:

  1. A largely mechanical move of the variables from Makefile.in to moz.build.
  2. Better build backend integration resulting from the move.

In phase #1, we will essentially cut and paste variable assignments to moz.build files. make will read the same variables it does today and perform the same actions. The only difference is the values in these variables will be defined in moz.build files.

In phase #2, we will leverage the fact that our build definition now has an API. We will change our existing make backend to be more efficient. For example, we should soon be able to compile IDLs and copy exported headers without make traversing through the directory tree at build time. We will be able to do this because the moz.build traversal at pre-build time sees data from all moz.build files and with this complete world view is able to produce more optimal make files than what you would get if you recursed into multiple directories. In short: it will make the build faster.

Once we have a sufficient portion of the build definition moved to moz.build files we will be able to experiment with new build backends (like Tup), look into automatic Visual Studio project generation, and more easily experiment with different ways of building (such as precompiled headers, fewer compiler process invocations, etc). These should all contribute to faster build times.

Frequently Asked Questions

What impact will I see from this change?

If you never touched Makefile.in files in the tree, you should not notice anything different about how the tree builds or how the build system works. You should have nothing to fear.

The most obvious changes to the source tree are:

  1. There is a moz.build file in almost every directory now.
  2. The variables related to directory traversal (those containing DIRS in their name) are now defined in moz.build files instead of Makefile.in.
  3. If your Makefile.in contains a variable that has been moved to moz.build files, make will spew an error when processing that file and the build will fail.

Will this change how I build?

It shouldn't. You should build the tree just like you always have. Most of you won't notice any differences.

If you see something weird, speak up in #build or file a bug if you are really confident it is wrong.

What are the risks to this change?

The migration of variables from Makefile.in to moz.build files is largely mechanical and significant portions are done manually. This can be a mind-numbing and tedious process. Not helping things is the fact that Splinter's review interface for these kinds of patches is hard to read.

This all means that there is a non-trivial risk for transcription errors. All it takes is an inverted conditional block and all of a sudden large parts of the tree are no longer built, for example.

We have established bug 846825 to investigate any oddities from the initial transfer. Developers are encouraged to help with the effort. Please spot check that your directories are still being built, tests run, etc. Pay special attention to changes made in the last 4 months as these parts of Makefile.in would have been bit rotted and more prone to data loss.

Parts of the tree not enabled in standard configurations are more prone to breakage due to less testing. i.e. build configurations not captured by TBPL have a higher chance of breaking.

Will this make the tree build faster?

Not yet. But eventually it will. This initial landing paves the groundwork to making the tree build faster (see the Future section above).

I see a lot of empty moz.build files!

Yes. Sorry about that. The good news is they shouldn't be empty for long. As things move from Makefile.in to moz.build we'll see fewer and fewer empty moz.build files. We'll also see fewer and fewer Makefile.in files once we start deleting empty Makefile.in.

If you want to know why we must have empty files, it's mainly for validation. If we allowed moz.build files to be optional, how would you detect a typo in a directory name? Directory exists? What if that directory exists but isn't supposed to have a moz.build file?

You bitrotted my patches!

Yes. I'm sorry. The transition period to moz.build files could be a little messy. There will be lots of changes to Makefile.in and moz.build files and lots of chances for bit rot. Uplifts could be especially nasty. (Although I don't believe many uplifts involve significant changes to the build configuration.)

This all means there is a strong incentive for us to complete the transition as quickly as possible.

Can I help with the transition to moz.build files?

Yes!

The transition is largely mechanical (at least phase #1). If you are interested in moving a variable or set of variables, hop in #build on IRC and speak up!

You said moz.build files are actually Python files?!

Yes they are! However, they are executed in a very tightly controlled sandbox. You can't import modules, open files, etc. UPPERCASE variable names are reserved and only a few functions are exposed. If you attempt to assign to an unknown UPPERCASE variable or assign an invalid value, an error will occur. This is already much better than Makefiles because we can now detect errors earlier in the build process (rather than 15 minutes into a build).
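
Conceptually, the sandbox boils down to exec() with a tightly controlled globals mapping. A toy version (not the real mozbuild implementation, which lives in /python/mozbuild) looks roughly like this:

ALLOWED_VARIABLES = {"DIRS", "PARALLEL_DIRS"}  # the real list is much longer

class MozbuildSandbox(dict):
    """Toy sandbox: builtins are stripped and UPPERCASE names must be known."""

    def __setitem__(self, name, value):
        if name.isupper() and name not in ALLOWED_VARIABLES:
            raise NameError("%s is not a recognized moz.build variable" % name)
        dict.__setitem__(self, name, value)

def exec_mozbuild(source):
    sandbox = MozbuildSandbox({"__builtins__": {}})
    for name in ALLOWED_VARIABLES:
        sandbox[name] = []          # pre-populate with empty defaults
    exec(source, sandbox)           # no imports, no open(), no builtins
    return {name: sandbox[name] for name in ALLOWED_VARIABLES}

print(exec_mozbuild("DIRS += ['base', 'content']"))
# exec_mozbuild("FOO = 1") would raise NameError: unknown variable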

What variables and functions are available in moz.build files?

If you run |./mach mozbuild-reference| you will see a print-out of all the variables, functions, and symbols that are exposed to the Python sandbox that moz.build files execute in. There are even tests that will fail the build if the sandbox contains symbols not in this output!

The output should be valid reStructuredText (in case you want to convert it to HTML for reading in your browser).

What if a moz.build file contains an error?

The build will break.

A lot of work has gone into making the output of moz.build errors human friendly and actionable. If you do something wrong, it won't just complain: it will tell you how to fix it!

Besides build times, how else will this improve the build system?

There are several ways!

As mentioned above, moz.build are more strict about what data is allowed to be defined. If you assign to an UPPERCASE variable, that variable must be known to the sandbox or else the assignment will error. This means that if you assign to an UPPERCASE variable, you know it has a side-effect. No more cargo culting of old, meaningless variables!

Changing the behavior of moz.build files (adding new variables or functions, changing how makefile generation works, etc) requires changes to the code in /python/mozbuild. This code belongs squarely to the build module and requires appropriate review. A problem with Makefiles is that they have lots of foot guns by default and it's easy for self-inflicted wounds to land in the tree without explicit build peer review. This problem largely goes away with moz.build files because the sandbox takes away all of make's foot guns.

The output of a moz.build execution is essentially a static data structure. It's easy to validate them for conformance. If we discover bad practices in our build definition, we can centrally add tests for them and enforce best practices.

We will also see user experience wins by moving data to moz.build files. Take mochitests for an example. We currently have several flavors (plain, browser, chrome, etc). Sometimes you cannot distinguish the flavor by the filename alone. With moz.build files, it will be easier to answer questions like "what mochitest flavor is this file?" mach could hook into this so you can type |mach mochitest path/to/file.html| instead of |mach mochitest-plain path/to/file.html|. Even better, you should just be able to type |mach path/to/test.html| and mach knows from the build definition that path/to/test.html is a plain mochitest file and assumes you want to run it. There are dozens of small development workflow wins to be gained here!

If I change a moz.build file, what happens?

If you change a moz.build file, then make should detect that it has changed and it will update the dynamically generated backend.mk file and reinvoke the requested build action. This should all happen automatically (just like Makefile.in to Makefile conversion works automatically).

My build seems to pause for a few seconds before starting!

A change to any moz.build file will cause a full traversal of the entire moz.build tree. On modern machines, this should only take 1-3 seconds. If your source tree is not in the page cache (and you need to load moz.build files from disk) or if you are on older hardware, this could be a little slower.

This is an unfortunate side-effect of having a whole world view of the build definition. The build delay incurred by these full scans should eventually be cancelled out by build backend optimizations resulting from having this whole world view, however.

The good news is this full scan should only occur if a moz.build file changes. And, if you are performing make recursion, it should only happen once (not in every directory). If you notice multiple moz.build scanning-related pauses during builds, please file a bug in Core :: Build Config!

Finally, we are performing the reads on a single thread currently. We can throw more cores at the task if someone codes up a patch.

What happened to allmakefiles.sh?

It has been sacked. allmakefiles.sh was an optimization to perform all the Makefile.in to Makefile conversion in one go. The directory traversal performed by moz.build reading effectively replaces the role of allmakefiles.sh. Not only that, but the moz.build build definition is always up to date! allmakefiles.sh was typically out of sync with reality and was a burden to maintain.

Did we just invent our own build system?

Kinda. We invented a generic Python sandboxing infrastructure. Then we hooked up code to populate it with variables from our build system and told it how to perform file traversal by reading specific variables set during file execution. Then we hooked up code for taking the evaluated results of all those sandboxes and converting them into make files.

Conceptually, what we invented is like GYP but with a different config file format. We have dabbled with the idea of converting the parsed build definition into GYP classes and then leveraging GYP to produce Makefiles, Ninja files, Visual Studio projects, etc. This would be an interesting experiment!

If you want to call it a build system, call it a build system. However, it is currently tightly coupled to Mozilla's needs, so you can't just use it anywhere. The concept might be worth exploring, however.

Is there anything else you'd like to share?

I think we set the record for most parts in a bug: 61. Although, they are numbered 1-17, 19-20. Part 18 has 30+ sub-parts using letters from the English and Greek alphabet for identifiers. Part 61 uses the infinity symbol as its number. See the pushlog.

Finally, I'd like to thank everyone who helped with this effort. The bug itself was only 6 months old and had active development off and on for a lot of it. Ted Mielczarek and Mike Hommey provided invaluable feedback on the core build system patches. A number of module owners stepped in to lend eyes to the mechanical conversion of their files. Last but not least, Ms2ger provided invaluable review aid on many of the patches. The work was so good that we all agreed that an Ms2ger f+ was good enough for a build peer rs! If reviewing the patches wasn't enough, Ms2ger also oversaw the tree closure and merging of the landing. I don't know how I will repay this debt.

Any more questions?

If you have more questions, drop in #build on irc.mozilla.org and ask away.

