Absorbing Commit Changes in Mercurial 4.8

November 05, 2018 at 09:25 AM | categories: Mercurial, Mozilla | View Comments

Every so often a tool you use introduces a feature that is so useful that you can't imagine how things were before that feature existed. The recent 4.8 release of the Mercurial version control tool introduces such a feature: the hg absorb command.

hg absorb is a mechanism to automatically and intelligently incorporate uncommitted changes into prior commits. Think of it as hg histedit or git rebase -i with auto squashing.

Imagine you have a set of changes to prior commits in your working directory. hg absorb figures out which changes map to which commits and absorbs each of those changes into the appropriate commit. Using hg absorb, you can replace cumbersome and often merge conflict ridden history editing workflows with a single command that often just works. Read on for more details and examples.

Modern version control workflows often entail having multiple unlanded commits in flight. What this looks like varies heavily by the version control tool, standards and review workflows employed by the specific project/repository, and personal preferences.

A workflow practiced by a lot of projects is to author your commits into a sequence of standalone commits, with each commit representing a discrete, logical unit of work. Each commit is then reviewed/evaluated/tested on its own as part of a larger series. (This workflow is practiced by Firefox, the Git and Mercurial projects, and the Linux Kernel to name a few.)

A common task that arises when working with such a workflow is the need to incorporate changes into an old commit. For example, let's say we have a stack of the following commits:

$ hg show stack
  @  1c114a ansible/hg-web: serve static files as immutable content
  o  d2cf48 ansible/hg-web: synchronize templates earlier
  o  c29f28 ansible/hg-web: convert hgrc to a template
  o  166549 ansible/hg-web: tell hgweb that static files are in /static/
  o  d46d6a ansible/hg-web: serve static template files from httpd
  o  37fdad testing: only print when in verbose mode
 /   (stack base)
o  e44c2e (@) testing: install Mercurial 4.8 final

Contained within this stack are 5 commits changing the way that static files are served by hg.mozilla.org (but that's not important).

Let's say I submit this stack of commits for review. The reviewer spots a problem with the second commit (serve static template files from httpd) and wants me to make a change.

How do you go about making that change?

Again, this depends on the exact tool and workflow you are using.

A common workflow is to not rewrite the existing commits at all: you simply create a new fixup commit on top of the stack, leaving the existing commits as-is. e.g.:

$ hg show stack
  o  deadad fix typo in httpd config
  o  1c114a ansible/hg-web: serve static files as immutable content
  o  d2cf48 ansible/hg-web: synchronize templates earlier
  o  c29f28 ansible/hg-web: convert hgrc to a template
  o  166549 ansible/hg-web: tell hgweb that static files are in /static/
  o  d46d6a ansible/hg-web: serve static template files from httpd
  o  37fdad testing: only print when in verbose mode
 /   (stack base)
o  e44c2e (@) testing: install Mercurial 4.8 final

When the entire series of commits is incorporated into the repository, the end state of the files is the same, so all is well. But this strategy of using fixup commits (while popular - especially with Git-based tooling like GitHub that puts a larger emphasis on the end state of changes rather than the individual commits) isn't practiced by all projects. hg absorb will not help you if this is your workflow.

A popular variation of this fixup commit workflow is to author a new commit then incorporate this commit into a prior commit. This typically involves the following actions:

<save changes to a file>

$ hg commit
<type commit message>

$ hg histedit
<manually choose what actions to perform to what commits>

OR

<save changes to a file>

$ git add <file>
$ git commit
<type commit message>

$ git rebase --interactive
<manually choose what actions to perform to what commits>

Essentially, you produce a new commit. Then you run a history editing command. You then tell that history editing command what to do (e.g. to squash or fold one commit into another), that command performs work and produces a set of rewritten commits.

In simple cases, you may make a simple change to a single file. Things are pretty straightforward. You need to know which two commits to squash together. This is often trivial. Although it can be cumbersome if there are several commits and it isn't clear which one should be receiving the new changes.

In more complex cases, you may make multiple modifications to multiple files. You may even want to squash your fixups into separate commits. And for some code reviews, this complex case can be quite common. It isn't uncommon for me to be incorporating dozens of reviewer-suggested changes across several commits!

These complex use cases are where things can get really complicated for version control tool interactions. Let's say we want to make multiple changes to a file and then incorporate those changes into multiple commits. To keep it simple, let's assume 2 modifications in a single file squashing into 2 commits:

<save changes to file>

$ hg commit --interactive
<select changes to commit>
<type commit message>

$ hg commit
<type commit message>

$ hg histedit
<manually choose what actions to perform to what commits>

OR

<save changes to file>

$ git add <file>
$ git add --interactive
<select changes to stage>

$ git commit
<type commit message>

$ git add <file>
$ git commit
<type commit message>

$ git rebase --interactive
<manually choose which actions to perform to what commits>

We can see that the number of actions required by users has already increased substantially. Not captured by the number of lines is the effort that must go into the interactive commands like hg commit --interactive, git add --interactive, hg histedit, and git rebase --interactive. For these commands, users must tell the VCS tool exactly what actions to take. This takes time and requires some cognitive load. This ultimately distracts the user from the task at hand, which is bad for concentration and productivity. The user just wants to amend old commits: telling the VCS tool what actions to take is an obstacle in their way. (A compelling argument can be made that the work required with these workflows to produce a clean history is too much effort and it is easier to make the trade-off favoring simpler workflows versus cleaner history.)

These kinds of squash fixup workflows are what hg absorb is designed to make easier. When using hg absorb, the above workflow can be reduced to:

<save changes to file>

$ hg absorb
<hit y to accept changes>

OR

<save changes to file>

$ hg absorb --apply-changes

Let's assume the following changes are made in the working directory:

$ hg diff
diff --git a/ansible/roles/hg-web/templates/vhost.conf.j2 b/ansible/roles/hg-web/templates/vhost.conf.j2
--- a/ansible/roles/hg-web/templates/vhost.conf.j2
+++ b/ansible/roles/hg-web/templates/vhost.conf.j2
@@ -76,7 +76,7 @@ LimitRequestFields 1000
      # Serve static files straight from disk.
      <Directory /repo/hg/htdocs/static/>
          Options FollowSymLinks
 -        AllowOverride NoneTypo
 +        AllowOverride None
          Require all granted
      </Directory>

@@ -86,7 +86,7 @@ LimitRequestFields 1000
      # and URLs are versioned by the v-c-t revision, they are immutable
      # and can be served with aggressive caching settings.
      <Location /static/>
 -        Header set Cache-Control "max-age=31536000, immutable, bad"
 +        Header set Cache-Control "max-age=31536000, immutable"
      </Location>

      #LogLevel debug

That is, we have 2 separate uncommitted changes to ansible/roles/hg-web/templates/vhost.conf.j2.

Here is what happens when we run hg absorb:

$ hg absorb
showing changes for ansible/roles/hg-web/templates/vhost.conf.j2
        @@ -78,1 +78,1 @@
d46d6a7 -        AllowOverride NoneTypo
d46d6a7 +        AllowOverride None
        @@ -88,1 +88,1 @@
1c114a3 -        Header set Cache-Control "max-age=31536000, immutable, bad"
1c114a3 +        Header set Cache-Control "max-age=31536000, immutable"

2 changesets affected
1c114a3 ansible/hg-web: serve static files as immutable content
d46d6a7 ansible/hg-web: serve static template files from httpd
apply changes (yn)?
<press "y">
2 of 2 chunk(s) applied

hg absorb automatically figured out that the 2 separate uncommitted changes mapped to 2 different changesets (Mercurial's term for commit). It print a summary of what lines would be changed in what changesets and prompted me to accept its plan for how to proceed. The human effort involved is a quick review of the proposed changes and answering a prompt.

At a technical level, hg absorb finds all uncommitted changes and attempts to map each changed line to an unambiguous prior commit. For every change that can be mapped cleanly, the uncommitted changes are absorbed into the appropriate prior commit. Commits impacted by the operation are rebased automatically. If a change cannot be mapped to an unambiguous prior commit, it is left uncommitted and users can fall back to an existing workflow (e.g. using hg histedit).

But wait - there's more!

The automatic rewriting logic of hg absorb is implemented by following the history of lines. This is fundamentally different from the approach taken by hg histedit or git rebase, which tend to rely on merge strategies based on the 3-way merge to derive a new version of a file given multiple input versions. This approach combined with the fact that hg absorb skips over changes with an ambiguous application commit means that hg absorb will never encounter merge conflicts! Now, you may be thinking if you ignore lines with ambiguous application targets, the patch would always apply cleanly using a classical 3-way merge. This statement logically sounds correct. But it isn't: hg absorb can avoid merge conflicts when the merging performed by hg histedit or git rebase -i would fail.

The above example attempts to exercise such a use case. Focusing on the initial change:

diff --git a/ansible/roles/hg-web/templates/vhost.conf.j2 b/ansible/roles/hg-web/templates/vhost.conf.j2
--- a/ansible/roles/hg-web/templates/vhost.conf.j2
+++ b/ansible/roles/hg-web/templates/vhost.conf.j2
@@ -76,7 +76,7 @@ LimitRequestFields 1000
     # Serve static files straight from disk.
     <Directory /repo/hg/htdocs/static/>
         Options FollowSymLinks
-        AllowOverride NoneTypo
+        AllowOverride None
         Require all granted
     </Directory>

This patch needs to be applied against the commit which introduced it. That commit had the following diff:

diff --git a/ansible/roles/hg-web/templates/vhost.conf.j2 b/ansible/roles/hg-web/templates/vhost.conf.j2
--- a/ansible/roles/hg-web/templates/vhost.conf.j2
+++ b/ansible/roles/hg-web/templates/vhost.conf.j2
@@ -73,6 +73,15 @@ LimitRequestFields 1000
         {% endfor %}
     </Location>

+    # Serve static files from templates directory straight from disk.
+    <Directory /repo/hg/hg_templates/static/>
+        Options None
+        AllowOverride NoneTypo
+        Require all granted
+    </Directory>
+
+    Alias /static/ /repo/hg/hg_templates/static/
+
     #LogLevel debug
     LogFormat "%h %v %u %t \"%r\" %>s %b %D \"%{Referer}i\" \"%{User-Agent}i\" \"%{Cookie}i\""
     ErrorLog "/var/log/httpd/hg.mozilla.org/error_log"

But after that commit was another commit with the following change:

diff --git a/ansible/roles/hg-web/templates/vhost.conf.j2 b/ansible/roles/hg-web/templates/vhost.conf.j2
--- a/ansible/roles/hg-web/templates/vhost.conf.j2
+++ b/ansible/roles/hg-web/templates/vhost.conf.j2
@@ -73,14 +73,21 @@ LimitRequestFields 1000
         {% endfor %}
     </Location>

-    # Serve static files from templates directory straight from disk.
-    <Directory /repo/hg/hg_templates/static/>
-        Options None
+    # Serve static files straight from disk.
+    <Directory /repo/hg/htdocs/static/>
+        Options FollowSymLinks
         AllowOverride NoneTypo
         Require all granted
     </Directory>

...

When we use hg histedit or git rebase -i to rewrite this history, the VCS would first attempt to re-order commits before squashing 2 commits together. When we attempt to reorder the fixup diff immediately after the commit that introduces it, there is a good chance your VCS tool would encounter a merge conflict. Essentially your VCS is thinking you changed this line but the lines around the change in the final version are different from the lines in the initial version: I don't know if those other lines matter and therefore I don't know what the end state should be, so I'm giving up and letting the user choose for me.

But since hg absorb operates at the line history level, it knows that this individual line wasn't actually changed (even though the lines around it did), assumes there is no conflict, and offers to absorb the change. So not only is hg absorb significantly simpler than today's hg histedit or git rebase -i workflows in terms of VCS command interactions, but it can also avoid time-consuming merge conflict resolution as well!

Another feature of hg absorb is that all the rewriting occurs in memory and the working directory is not touched when running the command. This means that the operation is fast (working directory updates often account for a lot of the execution time of hg histedit or git rebase commands). It also means that tools looking at the last modified time of files (e.g. build systems like GNU Make) won't rebuild extra (unrelated) files that were touched as part of updating the working directory to an old commit in order to apply changes. This makes hg absorb more friendly to edit-compile-test-commit loops and allows developers to be more productive.

And that's hg absorb in a nutshell.

When I first saw a demo of hg absorb at a Mercurial developer meetup, my jaw - along with those all over the room - hit the figurative floor. I thought it was magical and too good to be true. I thought Facebook (the original authors of the feature) were trolling us with an impossible demo. But it was all real. And now hg absorb is available in the core Mercurial distribution for anyone to use.

From my experience, hg absorb just works almost all of the time: I run the command and it maps all of my uncommitted changes to the appropriate commit and there's nothing more for me to do! In a word, it is magical.

To use hg absorb, you'll need to activate the absorb extension. Simply put the following in your hgrc config file:

[extensions]
absorb =

hg absorb is currently an experimental feature. That means there is no commitment to backwards compatibility and some rough edges are expected. I also anticipate new features (such as hg absorb --interactive) will be added before the experimental label is removed. If you encounter problems or want to leave comments, file a bug, make noise in #mercurial on Freenode, or submit a patch. But don't let the experimental label scare you away from using it: hg absorb is being used by some large install bases and also by many of the Mercurial core developers. The experimental label is mainly there because it is a brand new feature in core Mercurial and the experimental label is usually affixed to new features.

If you practice workflows that frequently require amending old commits, I think you'll be shocked at how much easier hg absorb makes these workflows. I think you'll find it to be a game changer: once you use hg abosrb, you'll soon wonder how you managed to get work done without it.

Read and Post Comments

Benefits of Clone Offload on Version Control Hosting

July 27, 2018 at 03:48 PM | categories: Mercurial, Mozilla | View Comments

Back in 2015, I implemented a feature in Mercurial 3.6 that allows servers to advertise URLs of pre-generated bundle files. When a compatible client performs a hg clone against a repository leveraging this feature, it downloads and applies the bundle from a URL then goes back to the server and performs the equivalent of an hg pull to obtain the changes to the repository made after the bundle was generated.

On hg.mozilla.org, we've been using this feature since 2015. We host bundles in Amazon S3 and make them available via the CloudFront CDN. We perform IP filtering on the server so clients connecting from AWS IPs are served S3 URLs corresponding to the closest region / S3 bucket where bundles are hosted. Most Firefox build and test automation is run out of EC2 and automatically clones high-volume repositories from an S3 bucket hosted in the same AWS region. (Doing an intra-region transfer is very fast and clones can run at >50 MB/s.) Everyone else clones from a CDN. See our official docs for more.

I last reported on this feature in October 2015. Since then, Bitbucket also deployed this feature in early 2017.

I was reminded of this clone bundles feature this week when kernel.org posted Best way to do linux clones for your CI and that post was making the rounds in my version control circles. tl;dr git.kernel.org apparently suffers high load due to high clone volume against the Linux Git repository and since Git doesn't have an equivalent feature to clone bundles built in to Git itself, they are asking people to perform equivalent functionality to mitigate server load.

(A clone bundles feature has been discussed on the Git mailing list before. I remember finding old discussions when I was doing research for Mercurial's feature in 2015. I'm sure the topic has come up since.)

Anyway, I thought I'd provide an update on just how valuable the clone bundles feature is to Mozilla. In doing so, I hope maintainers of other version control tools see the obvious benefits and consider adopting the feature sooner.

In a typical week, hg.mozilla.org is currently serving ~135 TB of data. The overwhelming majority of this data is related to the Mercurial wire protocol (i.e. not HTML / JSON served from the web interface). Of that ~135 TB, ~5 TB is served from the CDN, ~126 TB is served from S3, and ~4 TB is served from the Mercurial servers themselves. In other words, we're offloading ~97% of bytes served from the Mercurial servers to S3 and the CDN.

If we assume this offloaded ~131 TB is equally distributed throughout the week, this comes out to ~1,732 Mbps on average. In reality, we do most of our load from California's Sunday evenings to early Friday evenings. And load is typically concentrated in the 12 hours when the sun is over Europe and North America (where most of Mozilla's employees are based). So the typical throughput we are offloading is more than 2 Gbps. And at a lower level, automation tends to perform clones soon after a push is made. So load fluctuates significantly throughout the day, corresponding to when pushes are made.

By volume, most of the data being offloaded is for the mozilla-unified Firefox repository. Without clone bundles and without the special stream clone Mercurial feature (which we also leverage via clone bundles), the servers would be generating and sending ~1,588 MB of zstandard level 3 compressed data for each clone of that repository. Each clone would consume ~280s of CPU time on the server. And at ~195,000 clones per month, that would come out to ~309 TB/mo or ~72 TB/week. In CPU time, that would be ~54.6 million CPU-seconds, or ~21 CPU-months. I will leave it as an exercise to the reader to attach a dollar cost to how much it would take to operate this service without clone bundles. But I will say the total AWS bill for our S3 and CDN hosting for this service is under $50 per month. (It is worth noting that intra-region data transfer from S3 to other AWS services is free. And we are definitely taking advantage of that.)

Despite a significant increase in the size of the Firefox repository and clone volume of it since 2015, our servers are still performing less work (in terms of bytes transferred and CPU seconds consumed) than they were in 2015. The ~97% of bytes and millions of CPU seconds offloaded in any given week have given us a lot of breathing room and have saved Mozilla several thousand dollars in hosting costs. The feature has likely helped us avoid many operational incidents due to high server load. It has made Firefox automation faster and more reliable.

Succinctly, Mercurial's clone bundles feature has successfully and largely effortlessly offloaded a ton of load from the hg.mozilla.org Mercurial servers. Other version control tools should implement this feature because it is a game changer for server operators and results in a better client-side experience (eliminates server-side CPU bottleneck and may eliminate network bottleneck due to a geo-local CDN typically being as fast as your Internet pipe). It's a win-win. And a massive win if you are operating at scale.

Read and Post Comments

Deterministic Firefox Builds

June 20, 2018 at 11:10 AM | categories: Mozilla | View Comments

As of Firefox 60, the build environment for official Firefox Linux builds switched from CentOS to Debian.

As part of the transition, we overhauled how the build environment for Firefox is constructed. We now populate the environment from deterministic package snapshots and are much more stringent about dependencies and operations being deterministic and reproducible. The end result is that the build environment for Firefox is deterministic enough to enable Firefox itself to be built deterministically.

Changing the underlying operating system environment used for builds was a risky change. Differences in the resulting build could result in new bugs or some users not being able to run the official builds. We figured a good way to mitigate that risk was to make the old and new builds as bit-identical as possible. After all, if the environments produce the same bits, then nothing has effectively changed and there should be no new risk for end-users.

Employing the diffoscope tool, we identified areas where Firefox builds weren't deterministic in the same environment and where there was variance across build environments. We iterated on differences and changed systems so variance would no longer occur. By the end of the process, we had bit-identical Firefox builds across environments.

So, as of Firefox 60, Firefox builds on Linux are deterministic in our official build environment!

That being said, the builds we ship to users are using PGO. And an end-to-end build involving PGO is intrinsically not deterministic because it relies on timing data that varies from one run to the next. And we don't yet have continuous automated end-to-end testing that determinism holds. But the underlying infrastructure to support deterministic and reproducible Firefox builds is there and is not going away. I think that's a milestone worth celebrating.

This milestone required the effort of many people, often working indirectly toward it. Debian's reproducible builds effort gave us an operating system that provided deterministic and reproducible guarantees. Switching Firefox CI to Taskcluster enabled us to switch to Debian relatively easily. Many were involved with non-determinism fixes in Firefox over the years. But Mike Hommey drove the transition of the build environment to Debian and he deserves recognition for his individual contribution. Thanks to all these efforts - and especially Mike Hommey's - we can now say Firefox builds deterministically!

The fx-reproducible-build bug tracks ongoing efforts to further improve the reproducibility story of Firefox. (~300 bugs in its dependency tree have already been resolved!)

Read and Post Comments

Scaling Firefox Development Workflows

May 16, 2018 at 04:10 PM | categories: Mozilla | View Comments

One of the central themes of my time at Mozilla has been my pursuit of making it easier to contribute to and hack on Firefox.

I vividly remember my first day at Mozilla in 2011 when I went to build Firefox for the first time. I thought the entire experience - from obtaining the source code, installing build dependencies, building, running tests, submitting patches for review, etc was quite... lacking. When I asked others if they thought this was an issue, many rightfully identified problems (like the build system being slow). But there was a significant population who seemed to be naive and/or apathetic to the breadth of the user experience shortcomings. This is totally understandable: the scope of the problem is immense and various people don't have the perspective, are blinded/biased by personal experience, and/or don't have the product design or UX experience necessary to comprehend the problem.

When it comes to contributing to Firefox, I think the problems have as much to do with user experience (UX) as they do with technical matters. As I wrote in 2012, user experience matters and developers are people too. You can have a technically superior product, but if the UX is bad, you will have a hard time attracting and retaining new users. And existing users won't be as happy. These are the kinds of problems that a product manager or designer deals with. A difference is that in the case of Firefox development, the target audience is a very narrow and highly technically-minded subset of the larger population - much smaller than what your typical product targets. The total addressable population is (realistically) in the thousands instead of millions. But this doesn't mean you ignore the principles of good product design when designing developer tooling. When it comes to developer tooling and workflows, I think it is important to have a product manager mindset and treat it not as a collection of tools for technically-minded individuals, but as a product having an overall experience. You only have to look as far as the Firefox Developer Tools to see this approach applied and the positive results it has achieved.

Historically, Mozilla has lacked a formal team with the domain expertise and mandate to treat Firefox contribution as a product. We didn't have anything close to this until a few years ago. Before we had such a team, I took on some of these problems individually. In 2012, I wrote mach - a generic CLI command dispatch tool - to provide a central, convenient, and easy-to-use command to discover development actions and to run them. (Read the announcement blog post for some historical context.) I also introduced a one-line bootstrap tool (now mach bootstrap) to make it easier to configure your machine for building Firefox. A few months later, I was responsible for introducing moz.build files, which paved the way for countless optimizations and for rearchitecting the Firefox build system to use modern tools - a project that is still ongoing (digging out from ~two decades of technical debt is a massive effort). And a few months after that, I started going down the version control rabbit hole and improving matters there. And I was also heavily involved with MozReview for improving the code review experience.

Looking back, I was responsible for and participated in a ton of foundational changes to how Firefox is developed. Of course, dozens of others have contributed to getting us to where we are today and I can't and won't take credit for the hard work of others. Nor will I claim I was the only person coming up with good ideas or transforming them into reality. I can name several projects (like Taskcluster and Treeherder) that have been just as or more transformational than the changes I can take credit for. It would be vain and naive of me to elevate my contributions on a taller pedestal and I hope nobody reads this and thinks I'm doing that.

(On a personal note, numerous people have told me that things like mach and the bootstrap tool have transformed the Firefox contribution experience for the better. I've also had very senior people tell me that they don't understand why these tools are important and/or are skeptical of the need for investments in this space. I've found this dichotomy perplexing and troubling. Because some of the detractors (for lack of a better word) are highly influential and respected, their apparent skepticism sews seeds of doubt and causes me to second guess my contributions and world view. This feels like a form or variation of imposter syndrome and it is something I have struggled with during my time at Mozilla.)

From my perspective, the previous five or so years in Firefox development workflows has been about initiating foundational changes and executing on them. When it was introduced, mach was radical. It didn't do much and its use was optional. Now almost everyone uses it. Similar stories have unfolded for Taskcluster, MozReview, and various other tools and platforms. In other words, we laid a foundation and have been steadily building upon it for the past several years. That's not to say other foundational changes haven't occurred since (they have - the imminent switch to Phabricator is a terrific example). But the volume of foundational changes has slowed since 2012-2014. (I think this is due to Mozilla deciding to invest more in tools as a result of growing pains from significant company expansion that began in 2010. With that investment, we invested in the bigger ticket long-standing workflow pain points, such as CI (Taskcluster), the Firefox build system, Treeherder, and code review.)

Workflows Today and in the Future

Over the past several years, the size, scope, and complexity of Firefox development activities has increased.

One way to see this is at the source code level. The following chart shows the size of the mozilla-central version control repository over time.

mozilla-central size over time

The size increases are obvious. The increases cumulatively represent new features, technologies, and workflows. For example, the repository contains thousands of Web Platform Tests (WPT) files, a shared test suite for web platform implementations, like Gecko and Blink. WPT didn't exist a few years ago. Now we have files under source control, tools for running those tests, and workflows revolving around changing those tests. The incorporation of Rust and components of Servo into Firefox is also responsible for significant changes. Firefox features such as Developer Tools have been introduced or ballooned in size in recent years. The Go Faster project and the move to system add-ons has introduced various new workflows and challenges for testing Firefox.

Many of these changes are building upon the user-facing foundational workflow infrastructure that was last significantly changed in 2012-2014. This has definitely contributed to some growing pains. For example, there are now 92 mach commands instead of like 5. mach help - intended to answer what can I do and how should I do it - is overwhelming, especially to new users. The repository is around 2 gigabytes of data to clone instead of around 500 megabytes. We have 240,000 files in a full checkout instead of 70,000 files. There's a ton of new pieces floating around. Any product manager tasked with user acquisition and retention will tell you that increasing the barrier to entry and use will jeopardize these outcomes. But with the growth of Firefox's technical underbelly in the previous years, we've made it harder to contribute by requiring users to download and see a lot more files (version control) and be overwhelmed by all the options for actions to take (mach having 92 commands). And as the sheer number of components constituting Firefox increases, it becomes harder and harder for everyone - not just new contributors - to reason about how everything fits together.

I've been framing this general problem as scaling Firefox development workflows and every time I think about the high-level challenges facing Firefox contribution today and in the years ahead, this problem floats to the top of my list of concerns. Yes, we have pressing issues like improving the code review experience and making the Firefox build system and Taskcluster-based CI fast, efficient, and reliable. But even if you make these individual pieces great, there is still a cross-domain problem of how all these components weave together. This is why I think it is important to take a wholistic view and treat developer workflow as a product.

When I look at this the way a product manager or designer would, I see a few fundamental problems that need addressing.

First, we're not optimizing for comprehensive end-to-end workflows. By and large, we're designing our tools in isolation. We focus more on maximizing the individual components instead of maximizing the interaction between them. For example, Taskcluster and Treeherder are pretty good in isolation. But we're missing features like Treeherder being able to tell me the command to run locally to reproduce a failure: I want to see a failure on Treeherder and be able to copy and paste commands into my terminal to debug the failure. In the case of code review, we've designed two good code review tools (MozReview and Phabricator) but we haven't invested in making submitting code reviews turn key (the initial system configuration is difficult and we still don't have things like automatic bug filing or reviewer selection). We are leaving many workflow optimizations on the table by not implementing thoughtful tie-ins and transitions between various tools.

Second, by-and-large we're still optimizing for a single, monolithic user segment instead of recognizing and optimizing for different users and their workflow requirements. For example, mach help lists 92 commands. I don't think any single person cares about all 92 of those commands. The average person may only care about 10 or even 20. In terms of user interface design, the features and workflow requirements of small user segments are polluting the interface for all users and making the entire experience complicated and difficult to reason about. As a concrete example, why should a system add-on developer or a Firefox Developer Tools developer (these people tend to care about testing a standalone Firefox add-on) care about Gecko's build system or tests? If you aren't touching Gecko or Firefox's chrome code, why should you be exposed to workflows and requirements that don't have a major impact on you? Or something more extreme, if you are developing a standalone Rust module or Python package in mozilla-central, why do you need to care about Firefox at all? (Yes, Firefox or another downstream consumer may care about changes to that standalone component and you can't ignore those dependencies. But it should at least be possible to hide those dependencies.)

Waving my hands, the solution to these problems is to treat Firefox development workflow as a product and to apply the same rigor that we use for actual Firefox product development. Give people with a vision for the entire workflow the ability to prioritize investment across tools and platforms. Give them a process for defining features that work across tools. Perform formal user studies. See how people are actually using the tools you build. Bring in design and user experience experts to help formulate better workflows. Perform user typing so different, segmentable workflows can be optimized for. Treat developers as you treat users of real products: listen to them. Give developers a voice to express frustrations. Let them tell you what they are trying to do and what they wish they could do. Integrate this feedback into a feature roadmap. Turn common feedback into action items for new features.

If you think these ideas are silly and it doesn't make sense to apply a product mindset to developer workflows and tooling, then you should be asking whether product management and all that it entails is also a silly idea. If you believe that aspects of product management have beneficial outcomes (which most companies do because otherwise there wouldn't be product managers), then why wouldn't you want to apply the methods of that discipline to developers and development workflows? Developers are users too and the fact that they work for the same company that is creating the product shouldn't make them immune from the benefits of product management.

If we want to make contributing to Firefox an even better experience for Mozilla employees and community contributors, I think we need to take a step back and assess the situation as a product manager would. The improvements that have been made to the individual pieces constituting Firefox's development workflow during my nearly seven years at Mozilla have been incredible. But I think in order to achieve the next round of major advancements in workflow productivity, we'll need to focus on how all of the pieces fit together. And that requires treating the entire workflow as a cohesive product.

Read and Post Comments

Revisiting Using Docker

May 16, 2018 at 01:45 PM | categories: Docker, Mozilla | View Comments

When Docker was taking off like wildfire in 2013, I was caught up in the excitement like everyone else. I remember knowing of the existence of LXC and container technologies in Linux at the time. But Docker seemed to be the first open source tool to actually make that technology usable (a terrific example of how user experience matters).

At Mozilla, Docker was adopted all around me and by me for various utilities. Taskcluster - Mozilla's task execution framework geared for running complex CI systems - adopted Docker as a mechanism to run processes in self-contained images. Various groups in Mozilla adopted Docker for running services in production. I adopted Docker for integration testing of complex systems.

Having seen various groups use Docker and having spent a lot of time in the trenches battling technical problems, my conclusion is Docker is unsuitable as a general purpose container runtime. Instead, Docker has its niche for hosting complex network services. Other uses of Docker should be highly scrutinized and potentially discouraged.

When Docker hit first the market, it was arguably the only game in town. Using Docker to achieve containerization was defensible because there weren't exactly many (any?) practical alternatives. So if you wanted to use containers, you used Docker.

Fast forward a few years. We now have the Open Container Initiative (OCI). There are specifications describing common container formats. So you can produce a container once and take it to any number OCI compatible container runtimes for execution. And in 2018, there are a ton of players in this space. runc, rkt, and gVisor are just some. So Docker is no longer the only viable tool for executing a container. If you are just getting started with the container space, you would be wise to research the available options and their pros and cons.

When you look at all the options for running containers in 2018, I think it is obvious that Docker - usable though it may be - is not ideal for a significant number of container use cases. If you divide use cases into a spectrum where one end is run a process in a sandbox and the other is run a complex system of orchestrated services in production, Docker appears to be focusing on the latter. Take it from Docker themselves:

Docker is the company driving the container movement and the only container platform provider to address every application across the hybrid cloud. Today's businesses are under pressure to digitally transform but are constrained by existing applications and infrastructure while rationalizing an increasingly diverse portfolio of clouds, datacenters and application architectures. Docker enables true independence between applications and infrastructure and developers and IT ops to unlock their potential and creates a model for better collaboration and innovation.

That description of Docker (the company) does a pretty good job of describing what Docker (the technology) has become: a constellation of software components providing the underbelly for managing complex applications in complex infrastructures. That's pretty far detached on the spectrum from run a process in a sandbox.

Just because Docker (the company) is focused on a complex space doesn't mean they are incapable of exceeding at solving simple problems. However, I believe that in this particular case, the complexity of what Docker (the company) is focusing on has inhibited its Docker products to adequately address simple problems.

Let's dive into some technical specifics.

At its most primitive, Docker is a glorified tool to run a process in a sandbox. On Linux, this is accomplished by using the clone(2) function with specific flags and combined with various other techniques (filesystem remounting, capabilities, cgroups, chroot, seccomp, etc) to sandbox the process from the main operating system environment and kernel. There are a host of tools living at this not-quite-containers level that make it easy to run a sandboxed process. The bubblewrap tool is one of them.

Strictly speaking, you don't need anything fancy to create a process sandbox: just an executable you want to invoke and an executable that makes a set of system calls (like bubblewrap) to run that executable.

When you install Docker on a machine, it starts a daemon running as root. That daemon listens for HTTP requests on a network port and/or UNIX socket. When you run docker run from the command line, that command establishes a connection to the Docker daemon and sends any number of HTTP requests to instruct the daemon to take actions.

A daemon with a remote control protocol is useful. But it shouldn't be the only way to spawn containers with Docker. If all I want to do is spawn a temporary container that is destroyed afterwards, I should be able to do that from a local command without touching a network service. Something like bubblewrap. The daemon adds all kinds of complexity and overhead. Especially if I just want to run a simple, short-lived command.

Docker at this point is already pretty far detached from a tool like bubblewrap. And the disparity gets worse.

Docker adds another abstraction on top of basic process sandboxing in the form of storage / filesystem management. Docker insists that processes execute in self-contained, chroot()'d filesystem environment and that these environments (Docker images) be managed by Docker itself. When Docker images are imported into Docker, Docker manages them using one of a handful of storage drivers. You can choose from devicemapper, overlayfs, zfs, btrfs, and aufs and employ various configurations of all these. Docker images are composed of layers, with one layer stacked on top of the prior. This allows you to have an immutable base layer (that can be shared across containers) where run-time file changes can be isolated to a specific container instance.

Docker's ability to manage storage is pretty cool. And I dare say Docker's killer feature in the very beginning of Docker was the ability to easily produce and exchange self-contained Docker images across machines.

But this utility comes at a very steep price. Again, if our use case is run a process in a sandbox, do we really care about all this advanced storage functionality? Yes, if you are running hundreds of containers on a single system, a storage model built on top of copy-on-write is perhaps necessary for scaling. But for simple cases where you just want to run a single or small number of processes, it is extremely overkill and adds many more problems than it solves.

I cannot stress this enough, but I have spent hours debugging and working around problems due to how filesystems/storage works in Docker.

When Docker was initially released, aufs was widely used. As I previously wrote, aufs has abysmal performance as you scale up the number of concurrent I/O operations. We shaved minutes off tasks in Firefox CI by ditching aufs for overlayfs.

But overlayfs is far from a panacea. File metadata-only updates are apparently very slow in overlayfs. We're talking ~100ms to call fchownat() or utimensat(). If you perform an rsync -a or chown -R on a directory with only just a hundred files that were defined in a base image layer, you can have delays of seconds.

The Docker storage drivers backed by real filesystems like zfs and btrfs are a bit better. But they have their quirks too. For example, creating layers in images is comparatively very slow compared to overlayfs (which are practically instantaneous). This matters when you are iterating on a Dockerfile for example and want to quickly test changes. Your edit-compile cycle grows frustratingly long very quickly.

And I could opine on a handful of other problems I've encountered over the years.

Having spent hours of my life debugging and working around issues with Docker's storage, my current attitude is enough of this complexity, just let me use a directory backed by the local filesystem, dammit.

For many use cases, you don't need the storage complexity that Docker forces upon you. Pointing Docker at a directory on a local filesystem to chroot into is good enough. I know the behavior and performance properties of common Linux filesystems. ext4 isn't going to start making fchownat() or utimensat() calls take ~100ms. It isn't going to complain when a hard link spans multiple layers in an image. Or slow down to a crawl when multiple threads are performing concurrent read I/O. There's not going to be intrinsically complicated algorithms and caching to walk N image layers to find the most recent version of a file (or if there is, it will be so far down the stack in kernel land that I likely won't ever have to deal with it as a normal user). Docker images with their multiple layers add complexity and overhead. For many use uses, the pain it inflicts offsets the initial convenience it saves.

Docker's optimized-for-complex-use-cases architecture demonstrates its inefficiency in simple benchmarks.

On my machine, docker run -i --rm debian:stretch /bin/ls / takes ~850ms. Almost a second to perform a directory listing (sometimes it does take over 1 second - I was being generous and quoting a quicker time). This command takes ~1ms when run in a local shell. So we're at 2.5-3 magnitudes of overhead. The time here does include time to initially create the container and destroy it afterwards. We can isolate that overhead by starting a persistent container and running docker exec -i <cid> /bin/ls / to spawn a new process in an existing container. This takes ~85ms. So, ~2 magnitudes of overhead to spawn a process in a Docker container versus spawning it natively. What's adding so much overhead, I'm not sure. Yes, there are HTTP requests under the hood. But HTTP to a local daemon shouldn't be that slow. I'm not sure what's going on.

If we docker export that image to the local filesystem and use runc state to configure so we can run it with runc, runc run takes ~85ms to run /bin/ls /. If we runc exec <cid> /bin/ls / to start a process in an existing container, that completes in ~10ms. runc appears to be executing these simple tasks ~10x faster than Docker.

But to even get to that point, we had to make a filesystem available to spawn the container in. With Docker, you need to load an image into Docker. Using docker save to produce a 105,523,712 tar file, docker load -i image.tar takes ~1200ms to complete. tar xf image.tar takes ~65ms to extract that image to the local filesystem. Granted, Docker is computing the SHA-256 of the image as part of import. But SHA-256 runs at ~250MB/s on my machine and on that ~105MB input takes ~400ms. Where is that extra ~750ms of overhead in Docker coming from?

The Docker image loading overhead is still present on large images. With a 4,336,605,184 image, docker load was taking ~32s and tar x was taking ~2s. Obviously the filesystem was buffering writes in the tar case. And the ~2s is ignoring the ~17s to obtain the SHA-256 of the entire input. But there's still a substantial disparity here. (I suspect a lot of it is overlayfs not being as optimal as ext4.)

Several years ago there weren't many good choices for tools to execute containers. But today, there are good tools readily available. And thanks to OCI standards, you can often swap in alternate container runtimes. Docker (the tool) has an architecture that is optimized for solving complex use cases (coincidentally use cases that Docker the company makes money from). Because of this, my conclusion - drawn from using Docker for several years - is that Docker is unsuitable for many common use cases. If you care about low container startup/teardown overhead, low latency when interacting with containers (including spawning processes from outside of them), and for workloads where Docker's storage model interferes with understanding or performance, I think Docker should be avoided. A simpler tool (such as runc or even bubblewrap) should be used instead.

Call me a curmudgeon, but having seen all the problems that Docker's complexity causes, I'd rather see my containers resemble a tarball that can easily be chroot()'d into. I will likely be revisiting projects that use Docker and replacing Docker with something lighter weight and architecturally simpler. As for the use of Docker in the more complex environments it seems to be designed for, I don't have a meaningful opinion as I've never really used it in that capacity. But given my negative experiences with Docker over the years, I am definitely biased against Docker and will lean towards simpler products, especially if their storage/filesystem management model is simpler. Docker introduced containers to the masses and they should be commended for that. But for my day-to-day use cases for containers, Docker is simply not the right tool for the job.

I'm not sure exactly what I'll replace Docker with for my simpler use cases. If you have experiences you'd like to share, sharing them in the comments will be greatly appreciated.

Read and Post Comments

Next Page ยป