Most of Mozilla gathered in Orlando in December for an all hands meeting. If you attended any of the plenary sessions, you probably heard people like David Bryant and Lawrence Mandel make references to improving the Firefox build system and related tools. Well, the cat is out of the bag: Mozilla will be investing heavily in the Firefox build system and related tooling in 2016!
In the past 4+ years, the Firefox build system has mostly been held together and incrementally improved by a loose coalition of people who cared. We had a period in 2013 where a number of people were making significant updates (this is when moz.build files happened). But for the past 1.5+ years, there hasn't really been a coordinated effort to improve the build system - just a lot of one-off tasks and (much-appreciated) individual heroics. This is changing.
Improving the build system is a high priority for Mozilla in 2016. And investment has already begun. In Orlando, we had a marathon 3 hour meeting planning work for Q1. At least 8 people have committed to projects in Q1.
The focus of work is split between immediate short-term wins and longer-term investments. We also decided to prioritize the Firefox and Fennec developer workflows (over Gecko/Platform) as well as the development experience on Windows. This is because these areas have been under-loved and therefore have more potential for impact.
Here are the projects we're focusing on in Q1:
- Turnkey artifact based builds for Firefox and Fennec (download pre-built binaries so you don't have to spend 20 minutes compiling C++)
- Running tests from the source directory (so you don't have to copy tens of thousands of files to the object directory)
- Speed up configure / prototype a replacement
- Telemetry for mach and the build system
- NSPR, NSS, and (maybe) ICU build system rewrites
- mach build faster improvements
- Improvements to build rules used for building binaries (enables non-make build backends)
- mach command for analyzing C++ dependencies
- Deploy automated testing for mach bootstrap on TaskCluster
Work has already started on a number of these projects. I'm optimistic 2016 will be a watershed year for the Firefox build system and the improvements will have a drastic positive impact on developer productivity.
Many developers use MacBook Pros for day-to-day Firefox development. So, I thought it would be worthwhile to perform a comparison of Firefox build times for various models of MacBook Pros.
The numbers in this post are obtained from 3 generations of MacBook Pros:
A 2011 Sandy Bridge 4 core x 2.3 GHz with 8 GB RAM and an aftermarket SSD.
A 2012 Ivy Bridge retina with 4 core x 2.6 GHz, 16 GB RAM, and a factory SSD (or possibly flash storage).
A 2013 Haswell retina with 4 core x 2.6 GHz, 16 GB RAM, and flash storage.
All machines were running OS X 10.9 Mavericks and were using the Xcode 5.0.1 toolchain (Xcode 5 clang: Apple LLVM version 5.0 (clang-500.2.79) (based on LLVM 3.3svn)) to build.
The power settings prevented machine sleep and machines were plugged into A/C power during measuring. I did not use the machines while obtaining measurements.
The 2012 and 2013 machines were very vanilla OS installs. However, the 2011 machine was my primary work computer and may have had a few background services running and may have been slower due to normal wear and tear. The 2012 machine was a loaner machine from IT and has an unknown history.
All data was obtained from mozilla-central revision d4a27d8eda28.
The mozconfig used contained:
export MOZ_PSEUDO_DERECURSE=1
mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/obj-firefox.noindex
Please note that the objdir name ends with .noindex to prevent Finder from indexing build files.
I performed all tests multiple times and used the fastest time. I used the time command to obtain measurements of wall, user, and system time.
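The measurement procedure above (run several times, keep the fastest) can be approximated in a few lines of Python. This is only a sketch, not the script used for this post: the function names are made up for illustration, and it relies on os.times(), which reports child CPU time on Unix-like systems only.

```python
import os
import subprocess
import time


def time_command(argv):
    """Run argv once and return (wall, user, system) seconds for the child."""
    before = os.times()
    start = time.time()
    subprocess.check_call(argv)
    wall = time.time() - start
    after = os.times()
    # CPU time consumed by child processes that have been waited for.
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return wall, user, system


def fastest_of(argv, runs=3):
    """Repeat the measurement and keep the run with the lowest wall time."""
    return min(time_command(argv) for _ in range(runs))
```

Something like fastest_of(["./mach", "build"]) would then return the best of three runs, assuming the tree is reset to the same state between runs.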
The result of mach configure is as follows:
|Machine||Wall time||User time||System time|
Clobber build no ccache
mach build was performed after running mach configure. ccache was not enabled.
|Machine||Wall time||User time||System time||Total CPU time|
|2011||22:29 (1349)||145:35 (8735)||12:03 (723)||157:38 (9458)|
|2012||15:00 (900)||94:18 (5658)||8:14 (494)||102:32 (6152)|
|2013||11:13 (673)||69:55 (4195)||6:04 (364)||75:59 (4559)|
Clobber build with empty ccache
mach build was performed after running mach configure. ccache was enabled. The ccache cache was cleared before running mach configure.
|Machine||Wall time||User time||System time||Total CPU time|
|2011||25:57 (1557)||161:30 (9690)||18:21 (1101)||179:51 (10791)|
|2012||16:58 (1018)||104:50 (6290)||12:32 (752)||117:22 (7042)|
|2013||12:59 (779)||79:51 (4791)||9:24 (564)||89:15 (5355)|
Clobber build with populated ccache
mach build was performed after running mach configure. ccache was enabled and the ccache was populated with the results of a prior build. In theory, all compiler invocations should be serviced by ccache entries.
This measure is a very crude way to measure how fast clobber builds would be if compiler invocations were nearly instantaneous.
|Machine||Wall time||User time||System time|
|2011||3:59 (239)||8:04 (484)||3:21 (201)|
|2012||3:11 (191)||6:45 (405)||2:53 (173)|
|2013||2:31 (151)||5:22 (322)||2:12 (132)|
mach build was performed on a tree that was already built.
|Machine||Wall time||User time||System time|
|2011||1:58 (118)||2:25 (145)||0:41 (41)|
|2012||1:42 (102)||2:02 (122)||0:37 (37)|
|2013||1:20 (80)||1:39 (99)||0:28 (28)|
mach build binaries was performed on a fully built tree. This results in nothing being executed. It's a way to test the overhead of the binaries make target.
|Machine||Wall time||User time||System time|
binaries touch single .cpp
mach build binaries was performed on a fully built tree after touching the file netwerk/dns/nsHostResolver.cpp. ccache was enabled but cleared before running this test. This test simulates common C++ developer workflow of changing C++ and recompiling.
|Machine||Wall time||User time||System time|
The times of each build system tier were measured on the 2013 Haswell MacBook Pro. These timings were obtained out of curiosity to help isolate the impact of different parts of the build. ccache was not enabled for these tests.
|Action||Wall time||User time||System time||Total CPU time|
|compile clobber||9:01 (541)||64:58 (3898)||5:08 (308)||70:06 (4206)|
|libs clobber||1:34 (94)||2:15 (135)||0:39 (39)||2:54 (174)|
Observations and conclusions
The data speaks for itself: the 2013 Haswell MacBook Pro is significantly faster than its predecessors. It clocks in at 2x faster than the benchmarked 2011 Sandy Bridge model (keep in mind the 300 MHz base clock difference) and is ~34% faster than the 2012 Ivy Bridge (at similar clock speed). Personally, I was surprised by this. I was expecting speed improvements over Ivy Bridge, but not 34%.
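For anyone who wants to check the ratios, here is the arithmetic behind the 2x and ~34% figures, using the wall times from the clobber build without ccache. The helper name is mine.

```python
# Wall times in seconds from the "Clobber build no ccache" table.
wall = {"2011": 1349, "2012": 900, "2013": 673}


def speedup(old, new):
    """How many times faster the newer machine finished the same build."""
    return wall[old] / wall[new]


print("2013 vs 2011: %.2fx" % speedup("2011", "2013"))
print("2013 vs 2012: %.2fx" % speedup("2012", "2013"))
```

The second ratio, about 1.34, is where the ~34% comes from.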
It should go without saying: if you have the opportunity to upgrade to a new, Haswell-based machine: do it. If possible, purchase the upgrade to a 2.6 GHz CPU, as it contains ~13% more MHz than the base 2.3 GHz model: this will make a measurable difference in build times.
It's worth noting the increased efficiency of Haswell over its predecessors. The total CPU time required to build decreased from ~158 minutes to ~103 minutes to 76 minutes! That 76 minute number is worth highlighting because it means if we get 100% CPU saturation during builds, we'll be able to build the tree in under 10 wall time minutes!
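The "under 10 minutes" figure is a back-of-the-envelope bound. The sketch below assumes 8 hardware threads (4 cores with 2 threads each), which is my assumption about these CPUs rather than something measured here.

```python
total_cpu_seconds = 4559  # 2013 machine, clobber build without ccache
measured_wall_seconds = 673
hardware_threads = 8      # assumption: 4 cores x 2 threads per core

# Wall time if every hardware thread were busy for the whole build.
ideal_wall = total_cpu_seconds / hardware_threads

# Fraction of the available CPU time the measured build actually used.
utilization = total_cpu_seconds / (measured_wall_seconds * hardware_threads)

print("ideal wall time: %.1f minutes" % (ideal_wall / 60))
print("utilization: %.0f%%" % (utilization * 100))
```

Under that assumption the ideal is roughly 9.5 minutes, and the measured build kept the CPU about 85% busy.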
I hadn't performed crude benchmarks of high-level build system actions since the MOZ_PSEUDO_DERECURSE work landed and I wanted to use the opportunity of this hardware comparison to grab some numbers.
The overhead of ccache continues to surprise me. On the 2013 machine, enabling ccache increased the wall time of a clobber build by 1:46 and added 13:16 of CPU time. This is an increase of 16% and 17%, respectively.
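The ccache overhead numbers come straight from the two clobber tables. A small sketch of the calculation, with helper names of my own choosing:

```python
def seconds(mmss):
    """Convert a 'minutes:seconds' string to seconds."""
    minutes, secs = mmss.split(":")
    return int(minutes) * 60 + int(secs)


def overhead(without, with_ccache):
    """Return (extra seconds, percent increase) caused by enabling ccache."""
    base = seconds(without)
    extra = seconds(with_ccache) - base
    return extra, 100.0 * extra / base


# 2013 machine: no ccache vs. empty ccache.
wall_extra, wall_pct = overhead("11:13", "12:59")
cpu_extra, cpu_pct = overhead("75:59", "89:15")
print(wall_extra, round(wall_pct), cpu_extra, round(cpu_pct))
```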
It's worth highlighting just how much time is spent compiling C/C++. In our artificial tier measuring results, our clobber build time was ~660 wall time seconds (11 minutes) and used ~4473s CPU time (74:33). Of this, 9:01 wall time and 70:06 CPU time was spent compiling C/C++. This represents ~82% wall time and ~94% CPU time! Please note this does not include linking. Anything we can do to decrease the CPU time used by the compiler will make the build faster.
I also found it interesting to note variances in obtained times. Even on my brand new 2013 Haswell MacBook Pro where I know there aren't many background processes running, wall times could vary significantly. I think I isolated it to CPU bursting and heat issues. If I wait a few minutes between CPU intensive tests, results are pretty consistent. But if I perform CPU intensive tests back-to-back, the run times often vary. The only other thing coming into play could be page caching or filesystem indexing. I accounted for the latter by disabling Finder indexing of the object directory. And, I'd like to think that flash storage is fast enough to remove I/O latency from the equation. Who knows. At the end of the day, laptops aren't servers and OS X is a consumer OS, so I don't expect ultra consistency.
Finally, I want to restate just how fast Haswell is. If you have the opportunity to upgrade, do it.
If you had infinite CPU cores available and the Firefox build system could distribute them all for concurrent compilation, Firefox clobber build times would likely be 3-5 minutes instead of ~15 minutes on modern machines. This is a massive win. It therefore should come as no surprise that distributed compiling is very interesting to us.
Up until recently, the benefits of distributed compiling in the Firefox build system couldn't be fully realized. This was because the build system was performing recursive make traversal and make only knew about a tiny subset of the tree's total C++ files at one time. For example, when visiting /layout/base it only knew about 35 of the close to 6000 files that get compiled as part of building Firefox. This meant there was a hard ceiling to the max concurrency the build system could achieve. This ceiling was often higher than the number of cores in an individual machine, so it wasn't a huge issue for single machine builds. But it did significantly limit the benefits of distributed compiling. This all changed recently.
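To see why the per-directory ceiling matters so much more for distributed builds, consider a toy model. It assumes every file takes one unit of time to compile and that recursive make finishes one directory before starting the next. The directory sizes are hypothetical, chosen only to add up to roughly 6000 files.

```python
import math


def recursive_rounds(files_per_dir, jobs):
    """Time units needed when make visits one directory at a time."""
    return sum(math.ceil(n / jobs) for n in files_per_dir)


def flat_rounds(files_per_dir, jobs):
    """Time units needed when make knows about every file at once."""
    return math.ceil(sum(files_per_dir) / jobs)


# Hypothetical tree: 200 directories of 30 files each (6000 files total).
tree = [30] * 200

for jobs in (8, 128):
    print(jobs, recursive_rounds(tree, jobs), flat_rounds(tree, jobs))
```

In this model, -j8 barely notices the difference (800 vs. 750 units), while -j128 is more than four times slower with recursive traversal (200 vs. 47 units), because most of the 128 slots sit idle in every directory.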
As of a few weeks ago, the build system no longer encounters a low ceiling preventing distributed compilation from reaping massive benefits. If you build with make -j128, make will spawn 128 compiler processes when processing the compile tier (which is where most compilation occurs). If your compiler is set to a distributed compiler, you will win.
So, what should you do about it?
I encourage people to set up distributed compilation networks to reap the benefits of distributed compilation. Here are some tools you should know about and some things to keep in mind.
distcc is the tried and proven tool for performing distributed compilation. It's heavily used and gets the job done. It even works on Windows and can perform remote preprocessing, which is a huge win for our tree, where preprocessing can be computationally expensive because of excessive includes. But, it has a few significant drawbacks. Read the next paragraph.
I'm personally more excited about icecream. It has some very compelling advantages over distcc. It has a scheduler that can intelligently distribute load between machines. It uses network broadcast to discover the scheduler. So, you just start the client daemon and if there is a scheduler on the local network, it's all set up. Icecream transfers the compiler toolchain between nodes so you are guaranteed to have consistent output. (With distcc, output may not be reproducible if the nodes aren't homogeneous since distcc relies on the system-local toolchain. If different versions are installed on different nodes, you are out of luck). Icecream also supports cross-compiling. In theory, you can have Linux machines building for OS X, 32-bit machines building for 64-bit, etc. This is all very difficult (if not impossible) to do with distcc. Unfortunately, icecream doesn't work on Windows and doesn't appear to support server-side preprocessing. Although, I imagine both could be made to work if someone put in the effort.
Distributed compilation is very network intensive. I haven't measured, but I suspect Wi-Fi bandwidth and latency constraints might make it prohibitive there. It certainly won't be good for Wi-Fi saturation! If you are in a Mozilla office, please do not attempt to perform distributed compilation over Wi-Fi! For the same reasons, distributed compilation will likely not benefit you if you are attempting to compile on network-distant nodes.
I have set up an icecream server in the Mozilla San Francisco office. If you install the icecream client daemon (iceccd) on your machine, it should just work. I'm not sure what broadcast nets are configured as, but I've successfully had machines on the 7th floor discover it automatically. I guarantee no SLA for this server. Ping me privately if you have difficulty connecting.
I've started very preliminary talks with Mozilla IT about setting up dedicated compiler farms in Mozilla offices. I'm not saying this is coming any time soon. I feel this will have a major impact on developer productivity and I wanted to get the ball rolling months in advance so nobody can claim this is a fire drill.
For distributed compilation to work well, the build system really needs to be aware of distributed compilation. For example, to yield the benefits of distributed compilation with make, you need to pass -j64 or some other large value for concurrency. However, this value would be universal for every task in the build. There are still thousands of processes that must run locally. Using -j64 on these local tasks could cause memory exhaustion, I/O saturation, excessive context switching, etc. But if you decrease the concurrency ceiling, you lose the benefits of distributed compilation! The build system thus needs to be taught when distributed compilation is available and what tasks can be made concurrent so it can intelligently adjust the -j concurrency limit at run-time. This is why we have a higher-level build wrapper tool: mach build. (This is another reason why people should be building through mach instead of invoking make directly.)
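As a rough illustration of what "intelligently adjust the -j concurrency limit" could mean, here is a sketch of per-tier job selection. Nothing like this exists in mach today as far as this post is concerned: the function, its parameters, and the idea of counting remote slots are all hypothetical.

```python
def jobs_for_tier(tier, local_cores, remote_slots=0):
    """Pick a -j value for one build tier.

    Only the compile tier can be farmed out to other machines, so it is
    the only tier that gets the large concurrency value. Everything else
    runs locally and stays capped at the local core count.
    """
    if tier == "compile" and remote_slots > 0:
        return local_cores + remote_slots
    return local_cores


# Hypothetical 8 core laptop with 56 remote compiler slots available.
for tier in ("export", "compile", "libs"):
    print(tier, jobs_for_tier(tier, local_cores=8, remote_slots=56))
```

The point of the design is that the large value never leaks into tiers whose work must run on the local machine.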
No matter what technical solution we employ, I would like the build system to automatically discover and use distributed compilation if it is available. If we need to hardcode Mozilla IP addresses or hostnames into the build system, I'm fine with that. I just don't want developers not achieving much-faster build times because they are ignorant. If you are in a physical location with distributed compilation support, you should get that automatically: fast builds should not be hard.
We can and should investigate distributed compilation as part of release automation. Icecream should mitigate the concerns about build reproducibility since the toolchain is transferred at build time.
I have had success getting Icecream to work with Linux builds. However, OS X is problematic. Specifically, Icecream is unable to create the build environment for distribution (likely modern OS X/Xcode compatibility issue). Details are in bug 927952.
Build peers have a lot on our plate this quarter and making distributed compilation work well is not in our official goals. I would love, love, love if someone could step up and be a hero to make distributed compilation work better with the build system. If you are interested, pop into #build on irc.mozilla.org.
In summary, there are massive developer productivity wins waiting to be realized through distributed compiling. There is nobody tasked to work on this officially. Although, I'd love it if there were. If you find yourself setting up ad-hoc networks in offices, I'd really like to see some kind of discovery in mach. If not, there will be people left behind and that really stinks for those individuals. If you do any work around distributed compiling, please have it tracked under bug 485559.
As we look ahead to Q4 planning for the Firefox build system, I wanted to take the time to reflect on what was accomplished in Q3 and to simultaneously look forward to Q4 and beyond.
2013 Q3 Build System Improvements
There were notable improvements in the build system during the last quarter.
The issue our customers care most about is speed. Here is a list of accomplishments in that area:
MOZ_PSEUDO_DERECURSE work to change how make directory traversal works. This enabled the binaries make target, which can do no-op libxul-only builds in just a few seconds. Of all the changes that landed this quarter, this is the most impactful to local build times. This change also enables C++ compilation to scale out to as many cores as you have. Previously, the build system was starved in many parts of the tree when compiling C++. Mike Hommey is responsible for this work. I reviewed most of it.
WebIDL and IPDL bindings are now compiled in unified mode, reducing compile times and linker memory usage. Nathan Froyd wrote the code. I reviewed the patches.
XPIDL files are generated much more efficiently. This removed a few minutes of CPU core time from builds. I wrote these patches and Mike Hommey reviewed.
Increased reliance on install manifests to process file installs. They have drastically reduced the number of processes required to build by performing all actions inside Python processes as system calls and removing the clownshoes of having to delete parts of the object directory at the beginning of builds. When many mochitests were converted to manifests, no-op build times dropped by ~15% on my machine. Many people are responsible for this work. Mike Hommey wrote the original install code for packaging a few months ago. I built in manifest file support, support for symlinks, and made the code a bit more robust and faster. Mike Hommey reviewed these patches.
Many bugs and issues around dependency files on Windows have been discovered and fixed. These were a common source of clobbers. Mike Hommey found most of these, many during his work to make MOZ_PSEUDO_DERECURSE work.
The effort to reduce C++ include hell is resulting in significantly shorter incremental builds. While this effort is largely outside the build config module, it is worth mentioning. Ehsan Akhgari is leading this effort. He's been assisted by too many people to mention.
The build system now has different build modes favoring faster building vs release build options depending on the environment. Mike Hommey wrote most (all?) of the patches.
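To make the install manifest item above more concrete, here is a minimal sketch of the idea: apply every file install inside one Python process, then prune whatever is no longer listed instead of deleting the destination up front. The tuple-based manifest format and the function name are invented for this example and are not the in-tree implementation.

```python
import os
import shutil


def process_manifest(entries, dest_root):
    """Apply (action, source, relative dest) entries, then prune stale files."""
    wanted = set()
    for action, source, rel in entries:
        dest = os.path.join(dest_root, rel)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        if os.path.lexists(dest):
            os.remove(dest)
        if action == "symlink":
            os.symlink(os.path.abspath(source), dest)
        elif action == "copy":
            shutil.copyfile(source, dest)
        else:
            raise ValueError("unknown action: %s" % action)
        wanted.add(os.path.abspath(dest))

    # Remove files that are no longer listed, rather than clobbering the
    # whole destination directory at the start of the build.
    for dirpath, _dirs, files in os.walk(dest_root):
        for name in files:
            path = os.path.abspath(os.path.join(dirpath, name))
            if path not in wanted:
                os.remove(path)
```

Each install is a direct system call from Python, so no extra processes are spawned per file.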
A number of other non-speed related improvements have been made:
The build system now monitors resource usage during builds and can graph the results. I wrote the code. Ted Mielczarek, Mike Hommey, and Mike Shal had reviews.
Support for test manifests has been integrated with the build system. This enabled some build speed wins and is paving the road for better testing UX, such as the automagical mach test command, which will run the appropriate test suite automatically. Multiple people were involved in the work to integrate test manifests with the build system. I wrote the patches. But Ted Mielczarek got primary review. Joel Maher, Jeff Hammel, and Ms2ger provided excellent assistance during the design and implementation phase. The work around mochitest manifests likely wouldn't have happened this quarter if all of us weren't attending an A*Team work week in August.
There are now in-tree build system docs. They are published automatically. Efforts have been made to purge MDN of cruft. I am responsible for writing the code and most of the docs. Benjamin Smedberg and Mike Shal performed code reviews.
Improvements have been made to object directory detection in mach. This was commonly a barrier to some users using mach. I am responsible for the code. Nearly every peer has reviewed patches.
We now require Python 2.7.3 to build, making our future Python 3 compatibility story much easier while eliminating a large class of Python 2.7.2 and below bugs that we constantly found ourselves working around.
mach bootstrap has grown many new features and should be more robust than ever. There are numerous contributors here, including many community members that have found and fixed bugs and have added support for additional distributions.
The boilerplate from Makefile.in has disappeared. Mike Hommey is to thank.
dumbmake integrated with mach. Resulted in friendlier build interface for a nice UX win. Code by Nick Alexander. I reviewed.
Many variables have been ported from Makefile.in to moz.build. We started Q3 with support for 47 variables and now support 73. We started with 1226 Makefile.in and 1517 moz.build and currently have 941 Makefile.in and 1568 moz.build. Many people contributed to this work. Worth mentioning are Joey Armstrong, Mike Shal, Joshua Cranmer, and Ms2ger.
Many build actions are moving to Python packages. This enabled pymake inlining (faster builds) and is paving the road towards no .pyc files in the source directory. (pyc files commonly are the source of clobber headaches and make it difficult to efficiently perform builds on read-only filesystems.) I wrote most of the patches and Mike Shal and Mike Hommey reviewed.
moz.build is now more strict about what it accepts. We check for missing files at config parse time rather than build time, causing errors to surface faster. Many people are responsible for this work. Mike Shal deserves kudos for work around C/C++ file validation.
mach has been added to the B2G repo. Jonathan Griffin and Andrew Halberstadt drove this.
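The stricter moz.build validation mentioned in the list above boils down to checking that referenced files exist while the config is being read. A minimal sketch of that idea, with hypothetical function names that are not the real moz.build reader API:

```python
import os


def find_missing_sources(srcdir, sources):
    """Return the listed source files that do not exist under srcdir."""
    return [s for s in sources
            if not os.path.isfile(os.path.join(srcdir, s))]


def validate_sources(srcdir, sources):
    """Fail at config parse time rather than partway through the build."""
    missing = find_missing_sources(srcdir, sources)
    if missing:
        raise ValueError("files listed but not found in %s: %s"
                         % (srcdir, ", ".join(missing)))
```

Failing here costs a fraction of a second, whereas the same typo found by the compiler could surface many minutes into a build.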
Current status of the build system
Q3 was a very significant quarter for the build system. For the first time in years, we made fundamental changes to how the build system goes about building. The moz.build work to free our build config from the shackles of make files has enabled us to consume that data and do new and novel things with it. This has enabled improvements in build robustness and - most importantly - speed.
This is most evident with the MOZ_PSEUDO_DERECURSE work, which effectively replaces how make traverses directories. The work there has allowed Gecko developers focused on libxul to go from e.g. 50s no-op build times to less than 5s. Combine that with optimized building of XPIDL, IPDL, and WebIDL files, processing of file installs via manifests, C++ header dependency reduction, and a host of other changes, and we are finally turning a corner on build times! Much of this work wouldn't have been possible without moz.build files providing a whole world view of our build config.
The quarter wasn't all roses. Unfortunately, we also broke things. A lot. The total number of required clobbers this quarter grew slightly from 38 in Q2 to 43 in Q3. Many of these clobbers were regressions from supposed improvements to the build system. Too many of these regressions were Windows/pymake only and surely would have been found prior to landing if more build peers were actively building on Windows. There are various reasons we aren't. We should strive to fix them so more build development occurs on Windows and Windows users aren't unfairly punished.
The other class of avoidable clobbers mostly revolves around the theme that the build system is complicated, particularly when it comes to integration with release automation. Build automation has its build logic currently coded in Buildbot config files. This means it's all but impossible for build peers to test and reproduce that build environment and flow without time-intensive, stop-energy abundant excessive try pushes or loading out build slaves. The RelEng effort to extract this code from buildbot to mozharness can't come soon enough. See my overview on how automation works for more.
This quarter, the sheriffs have been filing bugs whenever a clobber is needed. This has surfaced clobber issues to build peers better and I have no doubt their constant pestering caused clobber issues to be resolved sooner. It's a terrific incentive for us to fix the build system.
I have mixed feelings on the personnel/contribution front in Q3. Kyle Huey no longer participates in active build system development or patch review. Ted Mielczarek is also starting to drift away from active coding and review. Although, he does constantly provide knowledge and historical context, so not all is lost. It is disappointing to see fantastic people and contributors no longer actively participating on the coding front. But, I understand the reasons behind it. Mozilla doesn't have a build team with a common manager and decree (a mistake if you ask me). Ted and Kyle are both insanely smart and talented and they work for teams that have other important goals. They've put in their time (and suffering). So I see why they've moved on.
On the plus side, Mike Hommey has been spending a lot more time on build work. He was involved in many of the improvements listed above. Due to review load and Mike's technical brilliance, I don't think many of our accomplishments would have happened without him. If there is one Mozillian who should be commended for build system work in Q3, it should be Mike Hommey.
Q3 also saw the addition of new build peers. Mike Shal is now a full build config module peer. Nick Alexander is now a peer of a submodule covering just the Fennec build system. Aside from his regular patch work, Mike Shal has been developing his review skills and responsibilities. Without him, we would likely be drowning in review requests and bug investigations due to the departures of Kyle and Ted. Nick is already doing what I'd hope he'd do when put in charge of the Fennec build system: looking at a proper build backend for Java (not make) and Eclipse project generation. (I still can't believe many of our Fennec developers code Java in vanilla text editors, not powerful IDEs. If there is one language that would miss IDEs the most, I'd think it would be Java. Anyway.)
There was a steady stream of contributions from people not in the build config module. Joshua Cranmer has been keeping up with moz.build conversions for comm-central. Nathan Froyd and Boris Zbarsky have helped with all kinds of IDL work. Trevor Saunders has helped keep things clean. Ms2ger has been eager to provide assistance through code and reviews. Various community contributors have helped with moz.build conversion patches and improvements to mach and the bootstrapper. Thank you to everyone who contributed last quarter!
Looking to the future
At the beginning of the quarter, I didn't think it would be possible to attain no-op build speeds with make as quickly as make binaries now does. But, Mike Hommey worked some magic and this is now possible. This was a game changer. The code he wrote can be applied to other build actions. And our other solutions, which turn moz.build files into autogenerated make files, seem to be working pretty well too. This raises some interesting questions with regards to prioritization.
Long term, we know we want to move away from make. It is old and clumsy. It's easy to do things wrong. It doesn't scale to handle a single DAG as large as our build system. The latter is particularly important if we are to ever have a build system that doesn't require clobbers periodically.
Up to this point we've prioritized work on moz.build conversion, with the rationale being that it would sooner enable a clean break from make and thus we'd arrive at drastically faster builds sooner. The assumption in that argument was that drastically faster builds weren't attainable with make. Between the directory traversal overhaul and the release of GNU make 4.0 last week (which actually seems to work on Windows, making the pymake slowness a non-issue), the importance of breaking away from make now seems much less pressing.
While we would like to actively move off make, developments in the past few weeks seem to say that we can reassess priorities. I believe that we can drive down no-op builds with make to a time that satisfies many - let's say under 10s to be conservative. Using clever tricks optimizing for common developer workflows, we can probably get that under 5s everywhere, including Windows (people only caring about libxul can get 2.5s on mozilla-central today). This isn't the 250ms we could get with Tup. But it's much better than 45s. If we got there, I don't think many people would be complaining.
So, the big question for goals setting this quarter will be whether we want to focus on a new build backend (likely Tup) or whether we should continue with an emphasis on make. Now, a lot of the work involved applies to both make and any other build backend. But, I have little doubt it would be less overall work to support one build backend (make) than two. On the other hand, we know we want to support multiple build backends eventually. Why wait? In the balance are numerous other projects that have varying impact for developers and release automation. While important in their own right, it is difficult to balance them against build speed. While we could strive towards instantaneous builds, at some point we'll hit good enough and the diminishing returns that accompany it. There is already a small vocal faction advocating for Ninja support, even though it would only decrease no-op libxul build times from ~2.5s to 250ms. While a factor of 10x improvement, I think this is dangerously close to diminishing returns territory and our time investment would be better spent elsewhere. (Of course, once we can support building libxul with Ninja, we could easily get it for Tup. And, I believe Tup wins that tie.) Anyway, I'm sure it will be an interesting discussion!
Whatever the future holds, it was a good quarter for the build system and the future is looking brighter than ever. We have transitioned from a maintain-and-react mode (which I understand has largely been the norm since the dawn of Firefox) to a proactive and future-looking approach that will satisfy the needs of Firefox and its developers for the next ten years. All of this progress is even more impressive when you consider that we still react to an awful lot of fire drills and unwanted maintenance!
The Firefox build system is improving. I'm as anxious as you are to see various milestones in terms of build speed and other features. But it's hard work. Wish us luck. Please help out where you can.
I'd like to make an attempt at delivering regular status updates on the Gecko/Firefox build system and related topics. Here we go with the first instance. I'm sure I missed awesomeness. Ping me and I'll add it to the next update.
MozillaBuild Windows build environment updated
Kyle Huey released version 1.7 of our Windows build environment. It contains a newer version of Python and a modern version of Mercurial among other features.
I highly recommend every Windows developer update ASAP. Please note that you will likely encounter Python errors unless you clobber your build.
New submodule and peers
I used my power as module owner to create a submodule of the build config module whose scope is the (largely mechanical) transition of content from Makefile.in to moz.build files. I granted Joey Armstrong and Mike Shal peer status for this module. I would like to eventually see both elevated to build peers of the main build module.
The following progress has been made:
- Mike Shal has converted variables related to defining XPIDL files in bug 818246.
- Mike Shal converted MODULE in bug 844654.
- Mike Shal converted EXPORTS in bug 846634.
- Joey Armstrong converted xpcshell test manifests in bug 844655.
- Brian O'Keefe converted PROGRAM in bug 862986.
- Mike Shal is about to land conversion of CPPSRCS in bug 864774.
Non-recursive XPIDL generation
In bug 850380 I'm trying to land non-recursive building of XPIDL files. As part of this I'm trying to combine the generation of .xpt and .h for each input .idl file into a single process call because profiling revealed that parsing the IDL consumes most of the CPU time. This shaves a few dozen seconds off of build times.
I have encountered multiple pymake bugs while developing this patch, which is the primary reason it hasn't landed yet.
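The parse-once idea can be illustrated with a minimal sketch. Everything here is hypothetical: `parse_idl`, `emit_header`, `emit_typelib`, and `process_idl` are stand-ins invented for illustration, not the real XPIDL tooling, and the toy "parser" only collects interface names. The point is only that the expensive parse runs once and both outputs are derived from the same result.

```python
# Hypothetical sketch: none of these names come from the real XPIDL tooling.

def parse_idl(text):
    """Stand-in for the expensive parse step: collect interface names."""
    names = []
    for line in text.splitlines():
        words = line.split()
        if len(words) >= 2 and words[0] == "interface":
            names.append(words[1].rstrip("{;:"))
    return names


def emit_header(interfaces):
    """Stand-in for .h generation."""
    return "".join("class %s;\n" % name for name in interfaces)


def emit_typelib(interfaces):
    """Stand-in for .xpt generation (the real format is binary)."""
    return "\n".join(interfaces).encode("ascii")


def process_idl(text):
    """Parse the input once and derive both outputs from the same result."""
    interfaces = parse_idl(text)
    return emit_header(interfaces), emit_typelib(interfaces)
```

Invoking two separate processes per .idl file would instead run the equivalent of `parse_idl` twice, which is the cost the combined call avoids.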
I was looking at my build logs and noticed WebIDL generation was taking longer than I thought it should. I filed bug 861587 to investigate making it faster. While my initial profiling turned out to be wrong, Boris Zbarsky looked into things and discovered that the serialization and deserialization of the parser output was extremely slow. He is currently trying to land a refactor of how WebIDL bindings are handled. The early results look very promising.
I think the bug is a good example of the challenges we face improving the build system, as Boris can surely attest.
Test directory reorganization
Joel Maher is injecting sanity into the naming scheme of test directories in bug 852065.
Manifests for mochitests
Jeff Hammel, Joel Maher, Ted Mielczarek, and I are working out using manifests for mochitests (like xpcshell tests) in bug 852416.
Mach core is now a standalone package
Mach now categorizes commands in its help output.
Requiring Python 2.7.3
Now that the Windows build environment ships with Python 2.7.4, I've filed bug 870420 to require Python 2.7.3+ to build the tree. We already require Python 2.7.0+. I want to bump the point release because there are many small bug fixes in 2.7.3, especially around Python 3 compatibility.
This is currently blocked on RelEng rolling out 2.7.3 to all the builders.
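A point-release check of this kind is small. The following is a minimal sketch, assuming the check is a plain tuple comparison against `sys.version_info`; the names `MINIMUM_PYTHON` and `python_is_new_enough` are made up for illustration and are not taken from the tree.

```python
import sys

# The proposed floor from the text above.
MINIMUM_PYTHON = (2, 7, 3)


def python_is_new_enough(version_info=None, minimum=MINIMUM_PYTHON):
    """Return True if the interpreter meets the minimum point release."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor, micro); tuples compare element by element.
    return tuple(version_info[:3]) >= minimum


if __name__ == "__main__":
    if not python_is_new_enough():
        sys.stderr.write("Python %d.%d.%d or newer is required.\n" % MINIMUM_PYTHON)
        sys.exit(1)
```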
Eliminating master xpcshell manifest
Now that xpcshell test manifests are defined in moz.build files, we theoretically don't need the master manifest. Joshua Cranmer is working on removing it in bug 869635.
Enabling GTests and dual linking libxul
Benoit Gerard and Mike Hommey are working in bug 844288 to dual link libxul so GTests can eventually be enabled and executed as part of our automation.
This will regress build times since we need to link libxul twice. But, giving C++ developers the ability to write unit tests with a real testing framework is worth it, in my opinion.
ICU was briefly enabled in bug 853301 but then backed out because it broke cross-compiling. It should be on track for enabling in Firefox 24.
Resource monitoring in mozbase
I gave mozbase a class to record system resource usage. I plan to eventually hook this up to the build system so the build system records how long it took to perform key events. This will give us better insight into slow and inefficient parts of the build and will help us track build system speed improvements over time.
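To give a feel for recording how long key events take, here is a simplified sketch. It is not the mozbase class: `PhaseRecorder` is a hypothetical name, and it records only wall-clock time per named phase rather than full system resource usage.

```python
import time
from contextlib import contextmanager


class PhaseRecorder(object):
    """Hypothetical recorder; not the mozbase class described above."""

    def __init__(self, clock=time.time):
        # The clock is injectable so the recorder can be tested deterministically.
        self._clock = clock
        self.phases = []  # (name, seconds) pairs in completion order

    @contextmanager
    def phase(self, name):
        """Time the body of a with-block and store it under the given name."""
        start = self._clock()
        try:
            yield
        finally:
            self.phases.append((name, self._clock() - start))

    def slowest(self):
        """Return the (name, seconds) pair with the longest duration."""
        return max(self.phases, key=lambda item: item[1])
```

A build driver could wrap each step in `with recorder.phase("configure"):` and report `recorder.slowest()` at the end.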
Sorted lists in moz.build files
I'm working on requiring lists in moz.build be sorted. Work is happening in bug 863069.
This idea started as a suggestion on the dev-platform list. If anyone has more great ideas, don't hold them back!
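One way the sorted-list requirement could be enforced is with a list type that rejects unsorted input. This is a sketch under assumptions, not the code in bug 863069: `SortedOnlyList` is a hypothetical name, it uses Python's default (case-sensitive) ordering, and it only checks that each chunk being added is itself sorted.

```python
class SortedOnlyList(list):
    """Hypothetical list that rejects out-of-order additions."""

    def __init__(self, items=()):
        items = list(items)
        self._check(items)
        list.__init__(self, items)

    @staticmethod
    def _check(items):
        # Each chunk must already be in sorted order.
        if items != sorted(items):
            raise ValueError("list is not sorted: %r" % (items,))

    def extend(self, items):
        items = list(items)
        self._check(items)
        list.extend(self, items)

    def __iadd__(self, items):
        # Route `+=` through the same check as extend().
        self.extend(items)
        return self
```

With this in place, `files += ["z.cpp", "y.cpp"]` raises `ValueError` instead of silently accepting the unsorted entries.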
Smartmake added to mach
Nicholas Alexander taught mach how to build intelligently by importing some of Josh Matthews' smartmake tool's functionality into the tree.
Source server fixed
Kyle Huey and Ted Mielczarek collaborated to fix the source server.
Auto clobber functionality
Auto clobber functionality was added to the tree. After flirting briefly with on-by-default, we changed it to opt-in. When you encounter it, it will tell you how to enable it.
Faster clobbers on automation
I was looking at build logs and noticed that we were performing clobbers inefficiently.
Massimo Gervasini and Chris AtLee deployed changes to automation to make it more efficient. My measurements showed a Windows try build that took 15 fewer minutes to start - a huge improvement.
Upgrading to Mercurial 2.5.4
RelEng is tracking the global deployment of Mercurial 2.5.4. hg.mozilla.org is currently running 2.0.2 and automation is all over the map. The upgrade should make Mercurial operations faster and more robust across the board.
I'm considering adding code to mach or the build system that prompts the user when her Mercurial is out of date (since an out of date Mercurial can result in a sub-par user experience).
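Such a check might look roughly like the sketch below. The function names and the `RECOMMENDED_HG` constant are assumptions made for illustration; the only external behavior relied on is that `hg --version` prints a line containing `version X.Y.Z`.

```python
import re
import subprocess

# Assumption: treat the version RelEng is deploying as the recommended floor.
RECOMMENDED_HG = (2, 5, 4)


def parse_hg_version(output):
    """Extract a version tuple from the text printed by `hg --version`."""
    match = re.search(r"version (\d+(?:\.\d+)*)", output)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def hg_is_outdated(output, recommended=RECOMMENDED_HG):
    """Return True only when a version was found and it is older."""
    version = parse_hg_version(output)
    return version is not None and version < recommended


def installed_hg_output():
    """Run the local hg binary; raises if hg is not on PATH."""
    raw = subprocess.check_output(["hg", "--version"])
    return raw.decode("utf-8", "replace")
```

A caller would pass `installed_hg_output()` to `hg_is_outdated()` and print an upgrade hint when it returns True. Unparseable output is deliberately treated as "not outdated" so the prompt never fires on a guess.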
Nathan Froyd is leading an effort to parallelize reftest execution. If he pulls this off, it could shave hours off of the total automation load per checkin. Go Nathan!
Overhaul of MozillaBuild in the works
I am mentoring a pair of interns this summer. I'm still working out the final set of goals, but I'm keen to have one of them overhaul the MozillaBuild Windows development environment. Cross your fingers.