Gregory Szorc's Digital Home | Implications of Using Bugzilla for Firefox Patch Development

Implications of Using Bugzilla for Firefox Patch Development

October 27, 2014 at 03:27 PM | categories: MozReview, Bugzilla, Mozilla, code review

Mozilla is very close to rolling out a new code review tool based on Review Board. When I became involved in the project, I viewed it as an opportunity to start from a clean slate and design the ideal code development workflow for the average Firefox developer. When the design of the code review experience was discussed, I would push for decisions that were compatible with my utopian end state.

As part of formulating the ideal workflows and design of the new tool, I needed to investigate why we do things the way we do, whether they are optimal, and whether they are necessary. As part of that, I spent a lot of time thinking about Bugzilla's role in shaping the code that goes into Firefox. This post is a summary of my findings.

The primary goal of this post is to dissect the practices that Bugzilla influences and to prepare the reader for the potential to reassemble the pieces - to change the workflows - in the future, primarily around Mozilla's new code review tool. By showing that Bugzilla has influenced the popularization of what I consider non-optimal practices, it is my hope that readers start to question the existing processes and open up their mind to change.

Since the impetus for this post in the near deployment of Mozilla's new code review tool, many of my points will focus on code review.

Before I go into my findings, I'd like to explicitly state that while many of the things I'm about to say may come across as negativity towards Bugzilla, my intentions are not to put down Bugzilla or the people who maintain it. Yes, there are limitations in Bugzilla. But I don't think it is correct to point fingers and blame Bugzilla or its maintainers for these limitations. I think we got where we are following years of very gradual shifts. I don't think you can blame Bugzilla for the circumstances that led us here. Furthermore, Bugzilla maintainers are quick to admit the faults and limitations of Bugzilla. And, they are adamant about and instrumental in rolling out the new code review tool, which shifts code review out of Bugzilla. Again, my intent is not to put down Bugzilla. So please don't direct ire that way yourself.

So, let's drill down into some of the implications of using Bugzilla.

Difficult to Separate Contexts

The stream of changes on a bug in Bugzilla (including review comments) is a flat, linear list of plain text comments. This works great when the activity of a bug follows a nice, linear, singular topic flow. However, real bug activity does not happen this way. All but the most trivial bugs usually involve multiple points of discussion. You typically have discussion about what the bug is. When a patch comes along, reviewer feedback comes in both high-level and low-level forms. Each item in each group is its own logical discussion thread. When patches land, you typically have points of discussion tracking the state of this patch. Has it been tested, does it need uplift, etc.

Bugzilla has things like keywords, flags, comment tags, and the whiteboard to enable some isolation of these various contexts. However, you still have a flat, linear list of plain text comments that contain the meat of the activity. It can be extremely difficult to follow these many interleaved logical threads.

In the context of code review, lumping all review comments into the same linear list adds overhead and undermines the process of landing the highest-quality patch possible.

Review feedback consists of both high-level and low-level comments. High-level would be things like architecture discussions. Low-level would be comments on the code itself. When these two classes of comments are lumped together in the same text field, I believe it is easy to lose track of the high-level comments and focus on the low-level. After all, you may have a short paragraph of high-level feedback right next to a mountain of low-level comments. Your eyes and brain tend to gravitate towards the larger set of more concrete low-level comments because you sub-consciously want to fix your problems and that large mass of text represents more problems, easier problems to solve than the shorter and often more abstract high-level summary. You want instant gratification and the pile of low-level comments is just too tempting to pass up. We have to train ourselves to temporarily ignore the low-level comments and focus on the high-level feedback. This is very difficult for some people. It is not an ideal disposition. Benjamin Smedberg's recent post on code review indirectly talks about some of this by describing his rational approach of tackling high-level first.

As review iterations occur, the bug devolves into a mix of comments related to high and low-level comments. It thus becomes harder and harder to track the current high-level state of the feedback, as they must be picked out from the mountain of low-level comments. If you've ever inherited someone else's half-finished bug, you know what I'm talking about.

I believe that Bugzilla's threadless and contextless comment flow disposes us towards focusing on low-level details instead of the high-level. I believe that important high-level discussions aren't occurring at the rate they need and that technical debt increases as a result.

Difficulty Tracking Individual Items of Feedback

Code review feedback consists of multiple items of feedback. Each one is related to the review at hand. But oftentimes each item can be considered independent from others, relevant only to a single line or section of code. Style feedback is one such example.

I find it helps to model code review as a tree. You start with one thing you want to do. That's the root node. You split that thing into multiple commits. That's a new layer on your tree. Finally, each comment on those commits and the comments on those comments represent new layers to the tree. Code review thus consists of many related, but independent branches, all flowing back to the same central concept or goal. There is a one to many relationship at nearly every level of the tree.

Again, Bugzilla lumps all these individual items of feedback into a linear series of flat text blobs. When you are commenting on code, you do get some code context printed out. But everything is plain text.

The result of this is that tracking the progress on individual items of feedback - individual branches in our conceptual tree - is difficult. Code authors must pore through text comments and manually keep an inventory of their progress towards addressing the comments. Some people copy the review comment into another text box or text editor and delete items once they've fixed them locally! And, when it comes time to review the new patch version, reviewers must go through the same exercise in order to verify that all their original points of feedback have been adequately addressed! You've now redundantly duplicated the feedback tracking mechanism among at least two people. That's wasteful in of itself.

Another consequence of this unstructured feedback tracking mechanism is that points of feedback tend to get lost. On complex reviews, you may be sorting through dozens of individual points of feedback. It is extremely easy to lose track of something. This could have disastrous consequences, such as the accidental creation of a 0day bug in Firefox. OK, that's a worst case scenario. But I know from experience that review comments can and do get lost. This results in new bugs being filed, author and reviewer double checking to see if other comments were not acted upon, and possibly severe bugs with user impacting behavior. In other words, this unstructured tracking of review feedback tends to lessen code quality and is thus a contributor to technical debt.

Fewer, Larger Patches

Bugzilla's user interface encourages the writing of fewer, larger patches. (The opposite would be many, smaller patches - sometimes referred to as micro commits.)

This result is achieved by a user interface that handles multiple patches so poorly that it effectively discourages that approach, driving people to create larger patches.

The stream of changes on a bug (including review comments) is a flat, linear list of plain text comments. This works great when the activity of a bug follows a nice, linear flow. However, reviewing multiple patches doesn't work in a linear model. If you attach multiple patches to a bug, the review comments and their replies for all the patches will be interleaved in the same linear comment list. This flies in the face of the reality that each patch/review is logically its own thread that deserves to be followed on its own. The end result is that it is extremely difficult to track what's going on in each patch's review. Again, we have different contexts - different branches of a tree - all living in the same flat list.

Because conducting review on separate patches is so painful, people are effectively left with two choices: 1) write a single, monolithic patch 2) create a new bug. Both options suck.

Larger, monolithic patches are harder and slower to review. Larger patches require much more cognitive load to review, as the reviewer needs to capture the entire context in order to make a review determination. This takes more time. The increased surface area of the patch also increases the liklihood that the reviewer will find something wrong and will require a re-review. The added complexity of a larger patch also means the chances of a bug creeping in are higher, leading to more bugs being filed and more reviews later. The more review cycles the patch goes through, the greater the chances it will suffer from bit rot and will need updating before it lands, possibly incurring yet more rounds of review. And, since we measure progress in terms of code landing, the delay to get a large patch through many rounds of review makes us feel lethargic and demotivates us. Large patches have intrinsic properties that lead to compounding problems and increased development cost.

As bad as large patches are, they are roughly in the same badness range as the alternative: creating more bugs.

When you create a new bug to hold the context for the review of an individual commit, you are doing a lot of things, very few of them helpful. First, you must create a new bug. There's overhead to do that. You need to type in a summary, set up the bug dependencies, CC the proper people, update the commit message in your patch, upload your patch/attachment to the new bug, mark the attachment on the old bug obsolete, etc. This is arguably tolerable, especially with tools that can automate the steps (although I don't believe there is a single tool that does all of what I mentioned automatically). But the badness of multiple bugs doesn't stop there.

Creating multiple bugs fragments the knowledge and history of your change and diminishes the purpose of a bug. You got in the situation of creating multiple bugs because you were working on a single logical change. It just so happened that you needed/wanted multiple commits/patches/reviews to represent that singular change. That initial change was likely tracked by a single bug. And now, because of Bugzilla's poor user interface around mutliple patch reviews, you now find yourself creating yet another bug. Now you have two bug numbers - two identifiers that look identical, only varying by their numeric value - referring to the same logical thing. We've started with a single bug number referring to your logical change and created what are effectively sub-issues, but allocated them in the same namespace as normal bugs. We've diminished the importance of the average bug. We've introduced confusion as to where one should go to learn about this single, logical change. Should I go to bug X or bug Y? Sure, you can likely go to one and ultimately find what you were looking for. But that takes more effort.

Creating separate bugs for separate reviews also makes refactoring harder. If you are going the micro commit route, chances are you do a lot of history rewriting. Commits are combined. Commits are split. Commits are reordered. And if those commits are all mapping to individual bugs, you potentially find yourself in a huge mess. Combining commits might mean resolving bugs as duplicates of each other. Splitting commits means creating yet another bug. And let's not forget about managing bug dependencies. Do you set up your dependencies so you have a linear, waterfall dependency corresponding to commit order? That logically makes sense, but it is hard to keep in sync. Or, do you just make all the review bugs depend on a single parent bug? If you do that, how do you communicate the order of the patches to the reviewer? Manually? That's yet more overhead. History rewriting - an operation that modern version control tools like Git and Mercurial have enabled to be a lightweight operation and users love because it doesn't constrain them to pre-defined workflows - thus become much more costly. The cost may even be so high that some people forego rewriting completely, trading their effort for some poor reviewer who has to inherit a series of patches that isn't organized as logically as it could be. Like larger patches, this increases cognitive load required to perform reviews and increases development costs.

As you can see, reviewing multiple, smaller patches with Bugzilla often leads to a horrible user experience. So, we find ourselves writing larger, monolithic patches and living with their numerous deficiencies. At least with monolithic patches we have a predictable outcome for how interaction with Bugzilla will play out!

I have little doubt that large patches (whose existence is influenced by the UI of Bugzilla) slows down the development velocity of Firefox.

Commit Message Formatting

The heavy involvement of Bugzilla in our code development lifecycle has influenced how we write commit messages. Let's start with the obvious example. Here is our standard commit message format for Firefox:

Bug 1234 - Fix some feature foo; r=gps

The bug is right there at the front of the commit message. That prominent placement is effectively saying the bug number is the most important detail about this commit - everything else is ancillary.

Now, I'm sure some of you are saying, but Greg, the short description of the change is obviously more important than the bug number. You are right. But we've allowed ourselves to make the bug and the content therein more important than the commit.

Supporting my theory is the commit message content following the first/summary line. That data is almost always - wait for it - nothing: we generally don't write commit messages that contain more than a single summary line. My repository forensics show that that less than 20% of commit messages to Firefox in 2014 contain multiple lines (this excludes merge and backout commits). (We are doing better than 2013 - the rate was less than 15% then).

Our commit messages are basically saying, here's a highly-abbreviated summary of the change and a pointer (a bug number) to where you can find out more. And of course loading the bug typically reveals a mass of interleaved comments on various topics, hardly the high-level summary you were hoping was captured in the commit message.

Before I go on, in case you are on the fence as to the benefit of detailed commit messages, please read Phabricator's recommendations on revision control and writing reviewable code. I think both write-ups are terrific and are excellent templates that apply to nearly everyone, especially a project as large and complex as Firefox.

Anyway, there are many reasons why we don't capture a detailed, multi-line commit message. For starters, you aren't immediately rewarded for doing it: writing a good commit message doesn't really improve much in the short term (unless someone yells at you for not doing it). This is a generic problem applicable to all organizations and tools. This is a problem that culture must ultimately rectify. But our tools shouldn't reinforce the disposition towards laziness: they should reward best practices.

I don't believe Bugzilla and our interactions with it do an adequate job rewarding good commit message writing. Chances are your mechanism for posting reviews to Bugzilla or posting the publishing of a commit to Bugzilla (pasting the URL in the simple case) brings up a text box for you to type review notes, a patch description, or extra context for the landing. These should be going in the commit message, as they are the type of high-level context and summarizations of choices or actions that people crave when discerning the history of a repository. But because that text box is there, taunting you with its presence, we write content there instead of in the commit message. Even where tools like bzexport exist to upload patches to Bugzilla, potentially nipping this practice in the bug, it still engages in frustrating behavior like reposting the same long commit message on every patch upload, producing unwanted bug spam. Even a tool that is pretty sensibly designed has an implementation detail that undermines a good practice.

Machine Processing of Patches is Difficult

I have a challenge for you: identify all patches currently under consideration for incorporation in the Firefox source tree, run static analysis on them, and tell me if they meet our code style policies.

This should be a solved problem and deployed system at Mozilla. It isn't. Part of the problem is because we're using Bugzilla for conducting review and doing patch management. That may sound counter-intuitive at first: Bugzilla is a centralized service - surely we can poll it to discover patches and then do stuff with those patches. We can. In theory. Things break down very quickly if you try this.

We are uploading patch files to Bugzilla. Patch files are representations of commits that live outside a repository. In order to get the full context - the result of the patch file - you need all the content leading up to that patch file - the repository data. When a naked patch file is uploaded to Bugzilla, you don't always have this context.

For starters, you don't know with certainly which repository the patch belongs to because that isn't part of the standard patch format produced by Mercurial or Git. There are patches for various repositories floating around in Bugzilla. So now you need a way to identify which repository a patch belongs to. It is a solvable problem (aggregate data for all repositories and match patches based on file paths, referenced commits, etc), albeit one Mozilla has not yet solved (but should).

Assuming you can identify the repository a patch belongs to, you need to know the parent commit so you can apply this patch. Some patches list their parent commits. Others do not. Even those that do may lie about it. Patches in MQ don't update their parent field when they are pushed, only after they are refreshed. You could be testing and uploading a patch with a different parent commit than what's listed in the patch file! Even if you do identify the parent commit, this commit could belong to another patch under consideration that's also on Bugzilla! So now you need to assemble a directed graph with all the patches known from Bugzilla applied. Hopefully they all fit in nicely.

Of course, some patches don't have any metadata at all: they are just naked diffs or are malformed commits produced by tools that e.g. attempt to convert Git commits to Mercurial commits (Git users: you should be using hg-git to produce proper Mercurial commits for Firefox patches).

Because Bugzilla is talking in terms of patch files, we often lose much of the context needed to build nice tools, preventing numerous potential workflow optimizations through automation. There are many things machines could be doing for us (such as looking for coding style violations). Instead, humans are doing this work and costing Mozilla a lot of time and lost developer productivity in the process. (A human costs ~$100/hr. A machine on EC2 is pennies per hour and should do the job with lower latency. In other words, you can operate over 300 machines 24 hours a day for what you may an engineer to work an 8 hour shift.)

Conclusion

I have outlined a few of the side-effects of using Bugzilla as part of our day-to-day development, review, and landing of changes to Firefox.

There are several takeways.

First, one cannot argue the fact that Firefox development is bug(zilla) centric. Nearly every important milestone in the lifecycle of a patch involves Bugzilla in some way. This has its benefits and drawbacks. This article has identified many of the drawbacks. But before you start crying to expunge Bugzilla from the loop completely, consider the benefits, such as a place anyone can go to to add metadata or comments on something. That's huge. There is a larger discussion to be had here. But I don't want to be inviting it quite yet.

A common thread between many of the points above is Bugzilla's unstructured and generic handling of code and metadata attached to it (patches, review comments, and landing information). Patches are attachments, which can be anything under the sun. Review comments are plain text comments with simple author, date, and tag metadata. Landings are also communicated by plain text review comments (at least initially - keywords and flags are used in some scenarios).

By being a generic tool, Bugzilla throws away a lot of the rich metadata that we produce. That data is still technically there in many scenarios. But it becomes extremely difficult if not practically impossible for both humans and machines to access efficiently. We lose important context and feedback by normalizing all this data to Bugzilla. This data loss creates overhead and technical debt. It slows Mozilla down.

Fortunately, the solutions to these problems and shortcomings are conceptually simple (and generally applicable): preserve rich context. In the context of patch distribution, push commits to a repository and tell someone to pull those commits. In the context of code review, create sub-reviews for different commits and allow tracking and easy-to-follow (likely threaded) discussions on found issues. Design workflow to be code first, not tool or bug first. Optimize workflows to minimize people time. Lean heavily on machines to do grunt work. Integrate issue tracking and code review, but not too tightly (loosely coupled, highly cohesive). Let different tools specialize in the handling of different forms of data: let code review handle code review. Let Bugzilla handle issue tracking. Let a landing tool handle tracking the state of landings. Use middleware to make them appear as one logical service if they aren't designed to be one from the start (such as is Mozilla's case with Bugzilla).

Another solution that's generally applicable is to refine and optimize the whole process to land a finished commit. Your product is based on software. So anything that adds overhead or loss of quality in the process of developing that software is fundamentally a product problem and should be treated as such. Any time and brain cycles lost to development friction or bugs that arise from things like inadequate code reviews tools degrade the quality of your product and take away from the user experience. This should be plain to see. Attaching a cost to this to convince the business-minded folks that it is worth addressing is a harder matter. I find management with empathy and shared understanding of what amazing tools can do helps a lot.

If I had to sum up the solution in one sentence, it would be: invest in tools and developer happiness.

I hope to soon publish a post on how Mozilla's new code review tool addresses many of the workflow deficiencies present today. Stay tuned.