Using Mercurial to query Mozilla metadata
November 08, 2013 at 09:42 AM | categories: Mercurial, Mozilla
I have updated my Mercurial extension tailored for Gecko/Firefox development with features that support rich querying of Mozilla/Gecko-development-specific metadata!
The extension now comes with a bag full of revision set selectors and template keywords. You can use them to query and format mozilla-central metadata from the repository.
Revision set selectors
You can now select changesets referencing a specific bug number:
hg log -r 'bug(931383)'
Or that were reviewed by a specific person:
hg log -r 'reviewer(gps)'
Or changesets that were reviewed at all (or weren't):
hg log -r 'reviewed()'
hg log -r 'not reviewed()'
You can now select changesets that are present in a specific tree:
hg log -r 'tree(central)'
I've also introduced support to query changesets you influenced:
hg log -r 'me()'
(This finds changesets you authored or reviewed.)
You can select changesets that initially landed on a specific tree:
hg log -r 'firstpushtree(central)'
You can select changesets marked as DONTBUILD:
hg log -r 'dontbuild()'
You can select changesets that don't reference a bug:
hg log -r 'nobug()'
You can select changesets that were push heads for a tree:
hg log -r 'pushhead(central)'
(This would form the basis of a push-aware bisection tool - an excellent idea for a future feature in this extension.)
You can combine these revset selector functions with other revset selectors to do some pretty powerful things.
To select all changesets on inbound but not central:
hg log -r 'tree(inbound) - tree(central)'
To find all your contributions on beta but not release:
hg log -r 'me() & (tree(beta) - tree(release))'
To find all changesets referencing a specific bug that have landed in Aurora:
hg log -r 'bug(931383) and tree(aurora)'
To find all changesets marked DONTBUILD that landed directly on central:
hg log -r 'dontbuild() and firstpushtree(central)'
To find all non-merge changesets that don't reference a bug:
hg log -r 'not merge() and nobug()'
Neato!
Template keywords
You can also now print some Mozilla information when using templates.
To print the main bug of a changeset, use:
{bug}
To retrieve all referenced bugs:
{bugs}
{join(bugs, ', ')}
To print the reviewers:
{reviewer}
{join(reviewers, ', ')}
To print the first version in which a changeset appeared on each channel:
{firstrelease}
{firstbeta}
{firstaurora}
{firstnightly}
To print the estimated first Aurora and Nightly date for a changeset, use:
{auroradate}
{nightlydate}
(Getting the exact first Aurora and Nightly dates requires consulting 3rd party services, which we don't currently do. I'd like to eventually integrate these into the extension. For now, it just estimates dates from the pushlog data.)
You can also print who pushed a changeset and where it was first pushed:
{firstpushuser}
{firstpushtree}
You can also print the TBPL URL with the results of the first push:
{firstpushtbpl}
Here is an example that prints channel versions and dates for each changeset:
hg log --template '{rev} Nightly: {firstnightly} {nightlydate}; Aurora {firstaurora} {auroradate}; Beta: {firstbeta}; Release: {firstrelease}\n'
Putting it all together
Of course, you can combine selectors and templates to create some mighty powerful queries.
To look at your impact on Mozilla, do something like:
hg log --template '{rev} Bug {bug}; Release {firstrelease}\n' -r 'me()'
You can easily formulate a status report for your activity in the past week:
hg log --template '{firstline(desc)}\n' -r 'firstpushdate(-7) and me()'
You can also query Mercurial to see where changesets have been landing in the past 30 days:
hg log --template '{firstpushtree}\n' -r 'firstpushdate(-30)' | sort | uniq -c
You can see who has been reviewing lots of patches lately:
hg log --template '{join(reviewers, "\n")}\n' -r 'firstpushdate(-30)' | sort | uniq -c | sort -n
(smaug currently has the top score, edging out my 116 reviews with 137.)
If you want to reuse templates (instead of having to type them on the command line), you can save them as style files. Search the Internets to learn how to use them. You can even change your default style so the default output from hg log contains everything you'd ever want to know about a changeset!
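For example, a style file is just a text file mapping entry names to templates. It might look something like this (the file name and location are arbitrary; {bug} and {firstrelease} are keywords provided by this extension):
changeset = "{rev} bug {bug} (first release: {firstrelease}) {firstline(desc)}\n"
You could then run hg log --style /path/to/that/file, or make it the default by pointing ui.style at the file in your hgrc.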
Keeping it running
Many of the queries rely on data derived from multiple repositories and pushlog data that is external to the repository.
To get best results, you'll need to be running a monolithic/unified Mercurial repository. You can either assemble one locally with this extension by periodically pulling from the separate repos:
hg pull releases
hg pull integration
Or, you can pull from my personal unified repo.
You will also need to ensure the pushlog data is current. If you pull directly from the official repos, this will happen automatically. To be sure, run:
hg pushlogsync
Finally, you can force a repopulation of cached bug data by running:
hg buginfo --reset
Over time, I want all this to automagically work. Stay tuned.
Comments and future improvements
I implemented this feature to save myself from having to go trawling through Bugzilla and repository history to answer questions and to obtain metrics. I can now answer many questions via simple Mercurial one-liners.
Custom revision set selectors and template keywords are a pretty nifty feature of Mercurial. They demonstrate how you can extend Mercurial to be aware of more than just tracking commits and files. As I've said before and will continue to say, the extensibility of Mercurial is really its killer feature, especially for organizations with well-defined processes (like Mozilla). The kind of extensibility I achieved with this extension with custom queries and formatting functions is just not possible with Git (at least not with the reference C implementation that the overwhelming majority of Git users use).
There are numerous improvements that can be made to the extension. Obviously, more revision set selectors and template keywords can be added. The parsing routine to extract bugs and reviewers isn't the most robust in the world. I copied some existing Mozilla code. It does well at detecting string patterns but doesn't cope well with extracting lists.
I'd also love to better integrate Mercurial with automation results so you could expose a greenpush() selector and do things like hg up -r 'last(tree(inbound)) and greenpush()' (which could of course be exposed as lastgreen(inbound)). Wouldn't that be cool? (This would be possible if we had better APIs for querying individual push results.) It would also be possible to have the Mercurial server expose this data as repository data so clients pull it automatically. That would prevent clients from all needing to query the same 3rd party services. Just a crazy thought.
Speed can be an issue. Calculating the release information ({firstnightly} etc) is currently slower than I'd like. This is mostly due to me using inefficient algorithms and not caching things where I should. Speed issues should be fixed in due time.
Please let me know if you run into any problems or have suggestions for improvements. If you want to implement your own revision set selectors or template keywords, it's easier than you think! I will happily accept patches. Keep in mind that Mercurial can integrate with 3rd party services. So if you want to supplement repository data with data from an HTTP+JSON web service, that's very doable. The sky is the limit.
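To give a flavor of what implementing one looks like, here is a minimal, hypothetical sketch of a custom revision set predicate. It assumes a modern Mercurial with the registrar API (the extension described here targets older Mercurial and registers things differently), and the bug-matching logic is deliberately naive:

# myrevsets.py - enable via [extensions] myrevsets = /path/to/myrevsets.py
from mercurial import registrar

revsetpredicate = registrar.revsetpredicate()

@revsetpredicate(b'nobug()')
def nobug(repo, subset, x):
    """Changesets whose commit message doesn't mention a bug (illustrative)."""
    # Crude check for illustration only; real bug extraction is more involved.
    return subset.filter(lambda rev: b'bug ' not in repo[rev].description().lower())

# Template keywords hook in similarly via registrar.templatekeyword().

Once the extension is enabled, hg log -r 'nobug()' just works.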
MacBook Pro Firefox Build Times Comparison
November 05, 2013 at 10:00 AM | categories: Mozilla, build system
Many developers use MacBook Pros for day-to-day Firefox development. So, I thought it would be worthwhile to perform a comparison of Firefox build times for various models of MacBook Pros.
Test setup
The numbers in this post are obtained from 3 generations of MacBook Pros:
- A 2011 Sandy Bridge 4 core x 2.3 GHz with 8 GB RAM and an aftermarket SSD.
- A 2012 Ivy Bridge retina with 4 core x 2.6 GHz, 16 GB RAM, and a factory SSD (or possibly flash storage).
- A 2013 Haswell retina with 4 core x 2.6 GHz, 16 GB RAM, and flash storage.
All machines were running OS X 10.9 Mavericks and were using the Xcode 5.0.1 toolchain (Xcode 5 clang: Apple LLVM version 5.0 (clang-500.2.79) (based on LLVM 3.3svn)) to build.
The power settings prevented machine sleep and machines were plugged into A/C power during measuring. I did not use the machines while obtaining measurements.
The 2012 and 2013 machines were very vanilla OS installs. However, the 2011 machine was my primary work computer and may have had a few background services running and may have been slower due to normal wear and tear. The 2012 machine was a loaner machine from IT and has an unknown history.
All data was obtained from mozilla-central revision d4a27d8eda28.
The mozconfig used contained:
export MOZ_PSEUDO_DERECURSE=1
mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/obj-firefox.noindex
Please note that the objdir name ends with .noindex to prevent Finder from indexing build files.
I performed all tests multiple times and used the fastest time. I used the time command to obtain measurements of wall, user, and system time.
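For the curious, the measurements amounted to little more than the following (a rough sketch of the procedure, not the exact script used):
time ./mach configure
time ./mach build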
Results
Configure Times
The result of mach configure is as follows:
Machine | Wall time (s) | User time (s) | System time (s) |
---|---|---|---|
2011 | 29.748 | 17.921 | 11.644 |
2012 | 26.765 | 15.942 | 10.501 |
2013 | 21.581 | 12.597 | 8.595 |
Clobber build no ccache
mach build was performed after running mach configure. ccache was not enabled.
Machine | Wall time | User time | System time | Total CPU time |
---|---|---|---|---|
2011 | 22:29 (1349) | 145:35 (8735) | 12:03 (723) | 157:38 (9458) |
2012 | 15:00 (900) | 94:18 (5658) | 8:14 (494) | 102:32 (6152) |
2013 | 11:13 (673) | 69:55 (4195) | 6:04 (364) | 75:59 (4559) |
Clobber build with empty ccache
mach build was performed after running mach configure. ccache was enabled. The ccache cache was cleared before running mach configure.
Machine | Wall time | User time | System time | Total CPU time |
---|---|---|---|---|
2011 | 25:57 (1557) | 161:30 (9690) | 18:21 (1101) | 179:51 (10791) |
2012 | 16:58 (1018) | 104:50 (6290) | 12:32 (752) | 117:22 (7042) |
2013 | 12:59 (779) | 79:51 (4791) | 9:24 (564) | 89:15 (5355) |
Clobber build with populated ccache
mach build was performed after running mach configure. ccache was enabled and the ccache was populated with the results of a prior build. In theory, all compiler invocations should be serviced by ccache entries.
This is a very crude way to approximate how fast clobber builds would be if compiler invocations were nearly instantaneous.
Machine | Wall time | User time | System time |
---|---|---|---|
2011 | 3:59 (239) | 8:04 (484) | 3:21 (201) |
2012 | 3:11 (191) | 6:45 (405) | 2:53 (173) |
2013 | 2:31 (151) | 5:22 (322) | 2:12 (132) |
No-op builds
mach build was performed on a tree that was already built.
Machine | Wall time | User time | System time |
---|---|---|---|
2011 | 1:58 (118) | 2:25 (145) | 0:41 (41) |
2012 | 1:42 (102) | 2:02 (122) | 0:37 (37) |
2013 | 1:20 (80) | 1:39 (99) | 0:28 (28) |
binaries no-op
mach build binaries was performed on a fully built tree. This results in nothing being executed. It's a way to test the overhead of the binaries make target.
Machine | Wall time (s) | User time (s) | System time (s) |
---|---|---|---|
2011 | 4.21 | 4.38 | 0.92 |
2012 | 3.17 | 3.37 | 0.71 |
2013 | 2.67 | 2.75 | 0.56 |
binaries touch single .cpp
mach build binaries was performed on a fully built tree after touching the file netwerk/dns/nsHostResolver.cpp. ccache was enabled but cleared before running this test. This test simulates the common C++ developer workflow of changing C++ and recompiling (a rough command sketch follows the table).
Machine | Wall time (s) | User time (s) | System time (s) |
---|---|---|---|
2011 | 12.89 | 13.88 | 1.96 |
2012 | 10.82 | 11.63 | 1.78 |
2013 | 8.57 | 9.29 | 1.23 |
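For reference, the scenario above boils down to roughly these commands (a sketch of the workflow, not the exact measurement script):
touch netwerk/dns/nsHostResolver.cpp
time ./mach build binaries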
Tier times
The times of each build system tier were measured on the 2013 Haswell MacBook Pro. These timings were obtained out of curiosity to help isolate the impact of different parts of the build. ccache was not enabled for these tests.
Action | Wall time | User time | System time | Total CPU time |
---|---|---|---|---|
export clobber | 15.75 | 66.11 | 11.33 | 77.44 |
compile clobber | 9:01 (541) | 64:58 (3898) | 5:08 (308) | 70:06 (4206) |
libs clobber | 1:34 (94) | 2:15 (135) | 0:39 (39) | 2:54 (174) |
tools clobber | 9.33 | 13.41 | 2.48 | 15.89 |
export no-op | 3.01 | 9.72 | 3.47 | 13.19 |
compile no-op | 3.18 | 18.02 | 2.64 | 20.66 |
libs no-op | 58.2 | 46.9 | 13.4 | 60.3 |
tools no-op | 8.82 | 12.68 | 1.72 | 14.40 |
Observations and conclusions
The data speaks for itself: the 2013 Haswell MacBook Pro is significantly faster than its predecessors. It clocks in at 2x faster than the benchmarked 2011 Sandy Bridge model (keep in mind the 300 MHz base clock difference) and is ~34% faster than the 2012 Ivy Bridge (at similar clock speed). Personally, I was surprised by this. I was expecting speed improvements over Ivy Bridge, but not 34%.
It should go without saying: if you have the opportunity to upgrade to a new, Haswell-based machine, do it. If possible, purchase the upgrade to the 2.6 GHz CPU, as its clock is ~13% faster than the base 2.3 GHz model: this will make a measurable difference in build times.
It's worth noting the increased efficiency of Haswell over its predecessors. The total CPU time required to build decreased from ~158 minutes to ~103 minutes to 76 minutes! That 76 minute number is worth highlighting because it means if we get 100% CPU saturation during builds, we'll be able to build the tree in under 10 wall time minutes!
I hadn't performed crude benchmarks of high-level build system actions since the MOZ_PSEUDO_DERECURSE work landed and I wanted to use the opportunity of this hardware comparison to grab some numbers.
The overhead of ccache continues to surprise me. On the 2013 machine, enabling ccache increased the wall time of a clobber build by 1:46 and added 13:16 of CPU time. This is an increase of 16% and 17%, respectively.
It's worth highlighting just how much time is spent compiling C/C++. In our artificial tier measuring results, our clobber build time was ~660 wall time seconds (11 minutes) and used ~4473s CPU time (74:33). Of this, 9:01 wall time and 70:06 CPU time were spent compiling C/C++. This represents ~82% wall time and ~94% CPU time! Please note this does not include linking. Anything we can do to decrease the CPU time used by the compiler will make the build faster.
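For the record, those percentages come straight from the tier numbers above:
compile wall time: 541 s / ~660 s ≈ 82%
compile CPU time: 4206 s / ~4473 s ≈ 94%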
I also found it interesting to note variances in obtained times. Even on my brand new 2013 Haswell MacBook Pro where I know there aren't many background processes running, wall times could vary significantly. I think I isolated it to CPU bursting and heat issues. If I wait a few minutes between CPU intensive tests, results are pretty consistent. But if I perform CPU intensive tests back-to-back, the run times often vary. The only other thing coming into play could be page caching or filesystem indexing. I accounted for the latter by disabling Finder on the object directory. And, I'd like to think that flash storage is fast enough to remove I/O latency from the equation. Who knows. At the end of the day, laptops aren't servers and OS X is a consumer OS, so I don't expect ultra consistency.
Finally, I want to restate just how fast Haswell is. If you have the opportunity to upgrade, do it.
Distributed Compiling and Firefox
October 31, 2013 at 11:35 AM | categories: Mozilla, build system
If you had infinite CPU cores available and the Firefox build system could distribute compilation across all of them, Firefox clobber build times would likely be 3-5 minutes instead of ~15 minutes on modern machines. This is a massive win. It therefore should come as no surprise that distributed compiling is very interesting to us.
Up until recently, the benefits of distributed compiling in the Firefox build system couldn't be fully realized. This was because the build system was performing recursive make traversal and make only knew about a tiny subset of the tree's total C++ files at one time. For example, when visiting /layout/base it only knew about 35 of the close to 6000 files that get compiled as part of building Firefox. This meant there was a hard ceiling to the max concurrency the build system could achieve. This ceiling was often higher than the number of cores in an individual machine, so it wasn't a huge issue for single machine builds. But it did significantly limit the benefits of distributed compiling. This all changed recently.
As of a few weeks ago, the build system no longer encounters a low ceiling preventing distributed compilation from reaping massive benefits. If you build with make -j128, make will spawn 128 compiler processes when processing the compile tier (which is where most compilation occurs). If your compiler is set to a distributed compiler, you will win.
So, what should you do about it?
I encourage people to set up distributed compilation networks to reap the benefits of distributed compilation. Here are some tools you should know about and some things to keep in mind.
distcc is the tried and proven tool for performing distributed compilation. It's heavily used and gets the job done. It even works on Windows and can perform remote preprocessing, which is a huge win for our tree, where preprocessing can be computationally expensive because of excessive includes. But, it has a few significant drawbacks. Read the next paragraph.
I'm personally more excited about icecream. It has some very compelling advantages over distcc. It has a scheduler that can intelligently distribute load between machines. It uses network broadcast to discover the scheduler, so you just start the client daemon and, if there is a scheduler on the local network, it's all set up. Icecream transfers the compiler toolchain between nodes, so you are guaranteed consistent output. (With distcc, output may not be consistent if the nodes aren't homogeneous, since distcc relies on the system-local toolchain; if different versions are installed on different nodes, you are out of luck.) Icecream also supports cross-compiling. In theory, you can have Linux machines building for OS X, 32-bit machines building for 64-bit, etc. This is all very difficult (if not impossible) to do with distcc. Unfortunately, icecream doesn't work on Windows and doesn't appear to support server-side preprocessing. Although, I imagine both could be made to work if someone put in the effort.
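To give a rough idea of what using icecream for a local Firefox build might look like, here is a sketch (assumptions: /usr/lib/icecc/bin is where icecream's compiler shims live on a typical Linux install, and -j64 is an arbitrary concurrency value). In your mozconfig, raise the concurrency well beyond the local core count:
mk_add_options MOZ_MAKE_FLAGS="-j64"
Then build with the shims ahead of the real compilers on your PATH:
export PATH=/usr/lib/icecc/bin:$PATH
./mach build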
Distributed compilation is very network intensive. I haven't measured, but I suspect Wi-Fi bandwidth and latency constraints might make it prohibitive there. It certainly won't be good for Wi-Fi saturation! If you are in a Mozilla office, please do not attempt to perform distributed compilation over Wi-Fi! For the same reasons, distributed compilation will likely not benefit you if you are attempting to compile on network-distant nodes.
I have set up an icecream server in the Mozilla San Francisco office. If you install the icecream client daemon (iceccd) on your machine, it should just work. I'm not sure how the broadcast networks are configured, but I've successfully had machines on the 7th floor discover it automatically. I guarantee no SLA for this server. Ping me privately if you have difficulty connecting.
I've started very preliminary talks with Mozilla IT about setting up dedicated compiler farms in Mozilla offices. I'm not saying this is coming any time soon. I feel this will have a major impact on developer productivity and I wanted to get the ball rolling months in advance so nobody can claim this is a fire drill.
For distributed compilation to work well, the build system really needs to be aware of distributed compilation. For example, to yield the benefits of distributed compilation with make, you need to pass -j64 or some other large value for concurrency. However, this value would be universal for every task in the build. There are still thousands of processes that must run locally. Using -j64 on these local tasks could cause memory exhaustion, I/O saturation, excessive context switching, etc. But if you decrease the concurrency ceiling, you lose the benefits of distributed compilation! The build system thus needs to be taught when distributed compilation is available and what tasks can be made concurrent so it can intelligently adjust the -j concurrency limit at run-time. This is why we have a higher-level build wrapper tool: mach build. (This is another reason why people should be building through mach instead of invoking make directly.)
No matter what technical solution we employ, I would like the build system to automatically discover and use distributed compilation if it is available. If we need to hardcode Mozilla IP addresses or hostnames into the build system, I'm fine with that. I just don't want developers missing out on much-faster build times simply because they don't know the option exists. If you are in a physical location with distributed compilation support, you should get that automatically: fast builds should not be hard.
We can and should investigate distributed compilation as part of release automation. Icecream should mitigate the concerns about build reproducibility since the toolchain is transferred at build time.
I have had success getting Icecream to work with Linux builds. However, OS X is problematic. Specifically, Icecream is unable to create the build environment for distribution (likely a modern OS X/Xcode compatibility issue). Details are in bug 927952.
Build peers have a lot on our plate this quarter and making distributed compilation work well is not in our official goals. I would love, love, love if someone could step up and be a hero to make distributed compilation work better with the build system. If you are interested, pop into #build on irc.mozilla.org.
In summary, there are massive developer productivity wins waiting to be realized through distributed compiling. There is nobody tasked to work on this officially. Although, I'd love it if there were. If you find yourself setting up ad-hoc networks in offices, I'd really like to see some kind of discovery in mach. If not, there will be people left behind and that really stinks for those individuals. If you do any work around distributed compiling, please have it tracked under bug 485559.
OS X Mavericks and the Firefox Build System
October 22, 2013 at 01:30 PM | categories: Mozilla
OS X Mavericks is available today as a free upgrade. People at Mozilla are probably asking if it is safe to upgrade: will it affect my ability to build Firefox?
A few people (myself included) have been running OS X Mavericks developer previews for a few months. I believe all the Firefox build system issues have been worked out for at least a few weeks now. So, upgrading to OS X Mavericks should not impact your ability to build a stock Firefox configuration from mozilla-central.
There might still be some non-default features and code paths that need to be fixed to work with Mavericks. Support for the OS X 10.9 SDK might also be problematic. In addition, if you build older trees such as Aurora and Beta, you may run into issues building on Mavericks because those trees may not have all required fixes uplifted.
While I won't encourage you to upgrade, I will say that Mavericks should build Firefox without any issue. And since the number of Mavericks users will only increase in the days ahead, it should be safe to assume that regressions will be promptly fixed.
If you run into any issues with Firefox and Mavericks, bug 883824 is the master tracking bug. Bug 894090 tracks build system issues.
Alternate Mercurial Server for Firefox Development
October 17, 2013 at 07:30 AM | categories: Mercurial, Mozilla
I have long opined about the sad state of Mercurial at Mozilla. The short version is Mozilla has failed to use Mercurial optimally, at least for Firefox development. It's easy to see why so many Mozillians are quick to discredit Mercurial when compared to Git!
I have a history of attempting to address these deficiencies. Up to this point, I've been able to make things better through local tooling. But, for my next set of tricks, I reached an impasse with the Mercurial server at hg.mozilla.org. So, I stood up my own Mercurial server at hg.gregoryszorc.com!
This server is running Mercurial 2.7 and has a few nice features the official Mercurial server at hg.mozilla.org does not.
The repositories
http://hg.gregoryszorc.com/gecko is a read-only unified Mercurial repository containing the commits for the major Firefox/Gecko repositories. If you look at its bookmarks, you'll see something special: the heads of all the separate Mercurial repos it is aggregating are being stored as bookmarks! (Bookmarks are effectively Git branches.) The tip of mozilla-central is at the bookmark central/default. The tip of Beta is at beta/default. You get the idea. Once you clone this repo, you can easily switch between project branches by running e.g. hg up central/default. When you pull the repo, you get changesets for all repos by connecting to one server, not several (this reduces load on Mozilla's servers and is faster for clients).
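In practice that's just regular Mercurial bookmark usage; nothing here is specific to the extension. hg bookmarks lists the per-repository heads (central/default, beta/default, and so on) and hg up switches between them:
hg pull
hg bookmarks
hg up central/default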
This repository shares the same changesets/SHA-1's as the official repositories. It just has everything under one roof. You can work out of this repository and push to the official repositories. Although, you may want to use the pushtree command from my custom extension to make your life easier (hg push with no arguments will attempt to push all changesets, which you definitely don't want when pushing to e.g. mozilla-central).
http://hg.gregoryszorc.com/gecko-collab is an offshoot of the gecko repo that you can push to. Changesets from the gecko repo are pulled into it automatically.
What makes the gecko-collab repository special is that it has obsolescence enabled. That is the core Mercurial feature enabling changeset evolution. More on that feature and why it is amazing in a future blog post. Stay tuned.
Cloning
If you would like to clone one of these unified repos, please do my paltry EC2 server a favor and bootstrap your clone from an existing clone. e.g. if you have a copy of mozilla-central sitting around but don't want my repo's changesets to pollute it, do the following:
hg clone mozilla-central gecko
cd gecko
hg pull http://hg.gregoryszorc.com/gecko
Or, if you are OK with your clone accumulating the extra changesets from all the project branches, just run:
hg pull http://hg.gregoryszorc.com/gecko
Don't forget to update the [paths] section in your .hg/hgrc file to point to hg.gregoryszorc.com! e.g.
[paths]
gecko = http://hg.gregoryszorc.com/gecko
collab = http://hg.gregoryszorc.com/gecko-collab
Setting up push support and SSH keys
If you would like to push to the gecko-collab repository, you'll need to give me your SSH public key. But don't give me your key - give an automated process your key!
Head on over to http://phabricator.gregoryszorc.com/ and log in (look for the Persona button). Once you've logged in, go to your settings by clicking the wrench icon in the top right. Then look for SSH Public Keys to add your key(s). If you can't find it, just go to http://phabricator.gregoryszorc.com/settings/panel/ssh/.
Once your SSH public key is added, it will take up to a minute for it to be added to my system. It's all automatic. You don't need to wait for any manual action.
To connect to my server over SSH, you'll need to log in as the hgssh user. e.g. in your hgrc file, add:
[paths]
gecko = ssh://hgssh@hg.gregoryszorc.com/gecko
collab = ssh://hgssh@hg.gregoryszorc.com/gecko-collab
Then, you should be able to pull and push over SSH!
Other Notes
This server is running on an EC2 instance that isn't as powerful as I'd like. Expect some operations to be slower than desired.
I don't guarantee an SLA for this service. It could go down at any moment. However, Mercurial being a distributed version control system, there should be little to no data loss assuming people pull frequently. I know I have a backup on all my machines now.
I'm running this server for two main reasons.
First, I want to demonstrate the utility of a unified Mercurial server for Firefox development in hopes we can run one officially. I've been running a unified repo locally for a few months and I have little doubt I'm more productive because of it. I want others to realize the awesomeness.
Second, I needed a server that supported changeset evolution so I could play around with it. I asked the powers that be to enable it on hg.mozilla.org and didn't get a response that met my timeline. So, I figured setting up my own server was easier.
Please let me know if you have any questions or issues with this server. I'd also love to hear whether people like the unified repo approach!