Makefile Execution Times

July 28, 2012 at 12:45 AM | categories: Mozilla, pymake, build system

In my course of hacking about with Mozilla's build system, I've been using pymake (a Python implementation of GNU make) to parse, examine, and manipulate make files. In doing so, I've learned some interesting things, dispelling myths in the process.

People often say that parsing make files is slow and that the sheer number of Makefile.in's in mozilla-central (Firefox's source tree) is leading to lots of overhead in make execution. This statement is only partially correct.

Parsing make files is actually pretty fast. Using pymake's parser API, I'm able to parse every Makefile.in in mozilla-central in under 5 seconds on my 2011 generation MacBook Pro using a single core. Not too shabby, especially considering that there are about 82,500 lines in all the Makefile.in's.

Evaluation of make files, however, is a completely different story. You see, parsing a string containing make file directives is only part of what needs to be done. Once you've parsed a make file into a statement list (essentially an AST), you need to load that into a data structure fit for evaluation. Because of the way make files are evaluated, you need to iterate through every parsed statement and evaluate it for side-effects. This occurs before you actually evaluate specific targets in the make file itself. As I found out, this process can be time-consuming.

For mozilla-central, the cost of loading the statement list into a data structure ready for target evaluation takes about 1 minute in aggregate. And, considering we effectively iterate through every Makefile in mozilla-central 3 times when building (once for every tier state of export, libs, and tools), you can multiply this figure by 3.

Put another way, parsing Makefile's is fast: loading them for target evaluation is slow.

Digging deeper, I uncovered the main source of the additional overhead: rules.mk.

Nearly every Makefile in mozilla-central has a pattern that looks like:

DEPTH = ../..
topsrcdir = @top_srcdir@
srcdir = @srcdir@
VPATH = @srcdir@

include $(DEPTH)/config/autoconf.mk

<LOCAL MAKE FILE DECLARATIONS>

include $(topsrcdir)/config/rules.mk

We have a header boilerplate, followed by a bunch of Makefile-specific variables definitions and rules. Finally, we include the rules.mk file. This is the make file that takes specially-named variables and converts them to rules (actions) for make to perform.

A typical Makefile.in is a few dozen lines or so. This often reduces to maybe a dozen parsed statements. By contrast, rules.mk is massive. It is currently 1770 lines and may include other make files, bringing the total to ~3000 lines.

Pymake has an LRU cache that caches the results of parsing make files. This means it only has to parse a single make file into a statement list once (assuming no cache eviction). rules.mk is frequently used, so it should have no eviction. Even if it were evicted, I've measured that parsing is pretty fast.

Unfortunately, the cache doesn't help with evaluation. For every Makefile in mozilla-central, pymake will need to evaluate rules.mk within the context of that specific Makefile. It's impossible to cache the results of a previous evaluation because the side-effects of rules.mk are determined by what is defined in the Makefile that includes it.

I performed an experiment where I stripped the include rules.mk statement from all parsed Makefile.in's. This essentially isolates the overhead of loading rules.mk. It turns out that all but ~2 seconds of evaluation time is spent in rules.mk. In other words, without rules.mk, the Makefile.in's are loaded and ready for evaluation in just a few seconds (over parsing time), not ~1 minute!

What does this all mean?

Is parsing make files slow? Technically no. Parsing itself is not slow. It is actually quite fast! Pymake even surprised me at how fast it can parse all the Makefile.in's in mozilla-central.

Loading parsed make file statements to be ready for evaluation is actually the bit that is slow - at least in the case of mozilla-central. Specifically, the loading of rules.mk is what constitutes the overwhelming majority of the time spent loading Makefile's.

That being said, parsing and loading go hand in hand. You almost never parse a make file without loading and evaluating it. So, if you consider parsing to include parsing and readying the make file for execution, there is some truth to the statement that parsing make files is slow. Someone splitting hairs may say differently.

Is there anything we can do? Good question.

I believe that build times of mozilla-central can be reduced by reducing the size of rules.mk. Obviously, the content of rules.mk is important, so we can't just delete content. But, we can be more intelligent about how it is loaded. For example, we can move pieces of rules.mk into separate .mk files and conditionally include these files based on the presence of specific variables. We already do this today, but only partially: there are still a number of bits of rules.mk that could be factored out into separate files. By conditionally loading make file content from rules.mk, we would be reducing the number of statements that need to be loaded before evaluating each Makefile. And, this should, in turn, make build times faster. Keep in mind that any savings will be multiplied by roughly 3 since we do 3 passes over Makefile's during a build.

To my knowledge, there aren't any bugs yet on file to do this. Given the measurements I've obtained, I encourage somebody to do this work. Even if it doesn't reduce build times, I think it will be a win since it will make the make rules easier to understand since they will be contained in function-specific files rather than one monolithic file. At worse, we have better readability. At best, we have better readability and faster build times. Win!

Finally, I don't know what the impact on GNU make is. Presumably, GNU make evaluates make files faster than pymake (C is generally faster than python). Therefore, reducing the size of rules.mk should make GNU make faster. By how much, I have no clue.