Problems with Pull Requests and How to Fix Them

January 07, 2020 at 12:10 PM | categories: Mercurial, Git

You've probably used or at least heard of pull requests: the pull request is the contribution workflow practiced on and made popular by [code] collaboration sites like GitHub, GitLab, Bitbucket, and others. Someone (optionally) creates a fork, authors some commits, pushes them to a branch, then creates a pull request to track integrating those commits into a target repository and branch. The pull request is then used as a vehicle for code review, tracking automated checks, and discussion until it is ready to be integrated. Integration is usually performed by a project maintainer, often with the click of a merge button on the pull request's web page.

It's worth noting that the term pull request is not universally used: GitLab calls them merge requests for example. Furthermore I regard the terms pull request and merge request to be poorly named, as the terms can be conflated with terminology used by your version control tool (e.g. git pull or git merge. And the implementations of a pull or merge request may not even perform a pull or a merge (you can also rebase a pull/merge request, but nobody is calling them rebase requests). A modern day pull request is so much more than a version control tool operation or even a simple request to pull or merge a branch: it is a nexus to track the integration of a proposed change before during and after that change is integrated. But alas. Because GitHub coined the term and is the most popular collaboration platform implementing this functionality, I'll refer to this general workflow as implemented on GitHub, GitLab, Bitbucket, and others as pull requests for the remainder of this post.

Pull requests have existed in essentially their current form for over a decade. The core workflow has remained mostly unchanged. What is different are the addition of value-add features, such as integrating status checks like CI results, the ability to rebase or squash commits instead of merging, code review tooling improvements, and lots of UI polish. GitLab deserves a call out here, as their implementation of merge requests tracks so much more than other tools do. (This is a side-effect of GitLab having more built-in features than comparable tools.) I will also give kudos to GitLab for adding new features to pull requests when GitHub was asleep at the wheel as a company a few years ago. (Not having a CEO for clear product/company leadership really showed.) Fortunately, both companies (and others) are now churning out new, useful features at a terrific clip, greatly benefiting the industry!

While I don't have evidence of this, I suspect pull requests (and the forking model used by services that implement them) came into existence when someone thought how do I design a collaboration web site built on top of Git's new and novel distributed nature and branching features. They then proceeded to invent forking and pull requests. After all, the pull request as implemented by GitHub was initially a veneer over a common Git workflow of create a clone, create a branch, and send it somewhere. Without GitHub, you would run git clone, git branch, then some other command like git request-pull (where have I seen those words before) to generate/send your branch somewhere. On GitHub, the comparable steps are roughly create a fork, create a branch to your fork, and submit a pull request. Today, you can even do all of this straight from the web interface without having to run git directly! This means that GitHub can conceptually be thought of as a purely server-side abstraction/implementation of the Git feature branch workflow.

At its core, the pull request is fundamentally a nice UI and feature layer built around the common Git feature branch workflow. It was likely initially conceived as polish and value-add features over this historically client-side workflow. And this core property of pull requests from its very first days has been copied by vendors like Bitbucket and GitLab (and in Bitbucket's case it was implemented for Mercurial - not Git - as Bitbucket was initially Mercurial only).

A decade is an eternity in the computer industry. As they say, if you aren't moving forward, you are moving backward. I think it is time for industry to scrutinize the pull request model and to evolve it into something better.

I know what you are thinking: you are thinking that pull requests work great and that they are popular because they are a superior model compared to what came before. These statements - aside from some nuance - are true. But if you live in the version control space (like I do) or are paid to deliver tools and workflows to developers to improve productivity and code/product quality (which I am), the deficiencies in the pull request workflow and implementation of that workflow among vendors like GitHub, GitLab, Bitbucket, etc are obvious and begging to be overhauled if not replaced wholesale.

So buckle in: you've started a ten thousand word adventure about everything you didn't think you wanted to know about pull requests!

Problems with Pull Requests

To build a better workflow, we first have to understand what is wrong/sub-optimal with pull requests.

I posit that the foremost goal of an pull request is to foster the incorporation of a high quality and desired change into a target repository with minimal overhead and complexity for submitter, integrator, and everyone in between. Pull requests achieve this goal by fostering collaboration to discuss the change (including code review), tracking automated checks against the change, linking to related issues, etc. In other words, the way I see the world is that a specific vendor's pull request implementation is just that: an implementation detail. And like all implementation details, they should be frequently scrutinized and changed, if necessary.

Let's start dissecting the problems with pull requests by focusing on the size of review units. Research by Google, Microsoft here, and here, and others has shown an inverse correlation with review unit size and defect rate. In Google's words (emphasis mine):

The size distribution of changes is an important factor in the quality of the code review process. Previous studies have found that the number of useful comments decreases and the review latency increases as the size of the change increases. Size also influences developers' perception of the code review process; a survey of Mozilla contributors found that developers feel that size-related factors have the greatest effect on review latency. A correlation between change size and review quality is acknowledged by Google and developers are strongly encouraged to make small, incremental changes (with the exception of large deletions and automated refactoring). These findings and our study support the value of reviewing small changes and the need for research and tools to help developers create such small, self-contained code changes for review.

Succinctly, larger changes result in fewer useful comments during review (meaning quality is undermined) and make reviews take longer (meaning productivity is undermined). Takeaway: if you care about defect rate / quality and/or velocity, you should be authoring and reviewing more, smaller changes as opposed to fewer, larger changes.

I stronger agree with Google's opinion on this matter and wholeheartedly endorse writing more, smaller changes. Having practiced both forms of change authorship, I can say without a doubt that more, smaller changes is superior: superior for authors, superior for code reviewers, and superior for people looking at repository history later. The main downside with this model is that it requires a bit more knowledge of your version control tool to execute. And, it requires corresponding tooling to play well with this change authorship model and to introduce as little friction as possible along the way since the number of interactions with tooling will increase as change size decreases, velocity increases, and there are more distinct units of change being considered for integration.

That last point is important and is germane to this post because the common implementation of pull requests today is not very compatible with the many small changes workflow. As I'll argue, the current implementation of pull requests actively discourages the many smaller changes workflow. And since smaller changes result in higher quality and faster reviews, today's implementations of pull requests are undermining quality and velocity.

I don't mean to pick on them, but since they are the most popular and the people who made pull requests popular, let's use GitHub's implementation of pull requests to demonstrate my point.

I posit that in order for us to author more, smaller changes, we must either a) create more, smaller pull requests or b) have pull request reviews put emphasis on the individual commits (as opposed to the overall merge diff). Let's examine these individually.

If we were to author more, smaller pull requests, this would seemingly necessitate the need for dependencies between pull requests in order to maintain velocity. And dependencies between pull requests adds a potentially prohibitive amount of overhead. Let me explain. We don't want to sacrifice the overall rate at which authors and maintainers are able to integrate proposed changes. If we were to split existing proposed changes into more, smaller pull requests, we would have a lot more pull requests. Without dependencies between them, authors could wait for each pull request to be integrated before sending the next one. But this would incur more round trips between author and integrator and would almost certainly slow down the overall process. That's not desirable. The obvious mitigation to that is to allow multiple, related pull requests in flight simultaneously. But this would necessitate the invention of dependencies between pull requests in order to track relationships so one pull request doesn't integrate before another it logically depends on. This is certainly technically doable. But it imposes considerable overhead of its own. How do you define dependencies? Are dependencies automatically detected or updated based on commits in a DAG? If yes, what happens when you force push and it is ambiguous whether a new commit is a logically new commit or a successor of a previous one? If no, do you really want to impose additional hurdles on submitters to define dependencies between every pull request? In the extreme case of one pull request per commit, do you make someone submitting a series of say twenty commits and pull requests really annotate nineteen dependencies? That's crazy!

There's another, more practical issue at play: the interplay between Git branches and pull requests. As implemented on GitHub, a pull request is tracked by a Git branch. If we have N inter-dependent pull requests, that means N Git branches. In the worst case, we have one Git branch for every Git commit. Managing N in-flight Git branches would be absurd. It would impose considerable overhead on pull request submitters. It would perfectly highlight the inefficiency in Git's game of refs branch management that I blogged about two years ago. (Succinctly, once you are accustomed to workflows - like Mercurial's - which don't require you to name commits or branches, Git's forced naming of branches and all the commands requiring those branch names feels grossly inefficient and a mountain of overhead.) Some tooling could certainly be implemented to enable efficient submission of pull requests. (See ghstack for an example.) But I think the interplay between Git branches and GitHub pull requests is sufficiently complex that the tooling and workflow would be intractable for anything but the most trivial and best-case scenarios. Keep in mind that any sufficiently user-friendly solution to this problem would also entail improving git rebase so it moves branches on rewritten ancestor commits instead of leaving them on the old versions of commits. (Seriously, someone should implement this feature: it arguably makes sense as the default behavior for local branches.) In other words, I don't think you can implement the multiple pull request model reliably and without causing excessive burden on people without fundamentally changing the requirement that a pull request be a Git branch. (I'd love to be proven wrong.)

Therefore, I don't think the more, smaller changes workflow can be easily practiced with multiple pull requests using the common GitHub model without effectively moving the definition of a pull request away from equivalence with a Git branch (more on this later). And I also don't mean to imply that dependencies between pull requests can't be implemented: they can and GitLab is evidence. But GitLab's implementation is somewhat simple and crude (possibly because doing anything more complicated is really hard as I speculate).

So without fundamentally changing the relationship between a pull request and a branch, that leaves us with our alternative of pull requests putting more emphasis on the individual changes rather than the merge diff. Let's talk about that now.

Pull requests have historically placed emphasis on the merge diff. That is, GitHub (or another provider) takes the Git branch you have submitted, runs a git merge against the target branch behind the scenes, and displays that diff front and center for review as the main proposed unit of change: if you click the Files changed tab to commence review, you are seeing this overall merge diff. You can click on the Commits tab then select an individual commit to review just that commit. Or you can use the dropdown on the Files changed tab to select an individual commit to review it. These (relatively new) features are a very welcome improvement and do facilitate performing a commit-by-commit review, which is a requirement to realize the benefits of a more, smaller changes workflow. Unfortunately, they are far from sufficient to fully realize the benefits of that workflow.

Defaults matter and GitHub's default is to show the merge diff when conducting review. (I bet a large percentage of users don't even know it is possible to review individual commits.) Since larger changes result in a higher defect rate and slower review, GitHub's default of showing the merge diff effectively means GitHub is defaulting to lower quality, longer-lasting reviews. (I suppose this is good for engagement numbers, as it inflates service usage both immediately and in the long-term due to subsequent bugs driving further usage. But I sincerely hope no product manager is thinking let's design a product that undermines quality to drive engagement.)

Unfortunately, a trivial change of the default to show individual commits instead of the merge diff is not so simple, as many authors and projects don't practice clean commit authorship practices, where individual commits are authored such that they can be reviewed in isolation.

(One way of classifying commit authorship styles is by whether a series of commits is authored such that each commit is good in isolation or whether the effect of applying the overall series is what matters. A handful of mature projects - like the Linux kernel, Firefox, Chrome, Git, and Mercurial - practice the series of individually-good commits model, which I'll call a commit-centric workflow. I would wager the majority of projects on GitHub and similar services practice the we only care about the final result of the series of commits model. A litmus test for practicing the latter model is whether pull requests contain commits like fixup foo or if subsequent revisions to pull requests create new commits instead of amending existing ones. I'm a strong proponent of a clean commit history where each commit in the final repository history stands as good in isolation. But I tend to favor more grown-up software development practices and am a version control guru. That being said, the subject/debate is fodder for another post.)

If GitHub (or someone else) switched the pull request default to a per-commit review without otherwise changing the relationship between a pull request and a Git branch, that would force a lot of less experienced users to familiarize themselves with history rewriting in Git. This would impose considerable pain and suffering on pull request authors, which would in turn upset users, hurt engagement, etc. Therefore, I don't think this is a feasible global default that can be changed. Maybe if Git's user experience for history rewriting were better or we didn't have a decade of behavior to undo we'd be in a better position... But pull request implementations don't need to make a global change: they could right the ship by offering projects that practice clean commit practices an option to change the review default so it emphasizes individual commits instead of the merge diff. This would go a long way towards encouraging authoring and reviewing individual commits, which should have positive benefits on review velocity and code quality outcomes.

But even if these services did emphasize individual commits by default in pull request reviews, there's still a handful of significant deficiencies that would undermine the more, smaller changes workflow that we desire.

While it is possible to review individual commits, all the review comments are still funneled into a single per pull request timeline view of activity. If submitter and reviewer make the effort to craft and subsequently review individual commits, your reward is that all the feedback for the discrete units of change gets lumped together into one massive pile of feedback for the pull request as a whole. This unified pile of feedback (currently) does a poor job of identifying which commit it applies to and gives the author little assistance in knowing which commits need amending to address the feedback. This undermines the value of commit-centric workflows and effectively pushes commit authors towards the fixup style of commit authorship. In order to execute per-commit review effectively, review comments and discussion need to be bucketed by commit and not combined into a unified pull request timeline. This would be a massive change to the pull request user interface and would be a daunting undertaking, so it is understandable why it hasn't happened yet. And such an undertaking would also require addressing subtly complex issues like how to preserve reviews in the face of force pushes. Today, GitHub's review comments can lose context when force pushes occur. Things are better than they used to be, when review comments left on individual commits would flat out be deleted (yes: GitHub really did effectively lose code review comments for several years.) But even with tooling improvements, problems still remain and should adoption of commit-level review tracking occur, these technical problems would likely need resolution to appease users of this workflow.

Even if GitHub (or someone else) implements robust per-commit review for pull requests, there's still a problem with velocity. And that problem is that if the pull request is your unit of integration (read: merging), then you have to wait until every commit is reviewed before integration can occur. This may sound tolerable (it's what we practice today after all). But I argue this is less optimal than a world where a change integrates as soon as it is ready to, without having to wait for the changes after it. As an author and maintainer, if I see a change that is ready to integrate, I prefer to integrate it as soon as possible, without delay. The longer a ready-to-integrate change lingers, the longer it is susceptible to bit rot (when the change is no longer valid/good due to other changes in the system). Integrating a judged-good change sooner also reduces the time to meaningful feedback: if there is a fundamental problem early in a series of changes that isn't caught before integration, integrating earlier changes sooner without waiting for the ones following will expose problems sooner. This minimizes deltas in changed systems (often making regression hunting easier), often minimizes the blast radius if something goes wrong, and gives the author more time and less pressure to amend subsequent commits that haven't been integrated yet. And in addition to all of this, integrating more often just feels better. The Progress Principle states that people feel better and perform better work when they are making continuous progress. But setbacks more than offset the power of small wins. While I'm not aware of any explicit research in this area, my interpretation of the Progress Principle to change authorship and project maintenance(which is supported by anecdotal observation) is that a steady stream of integrated changes feels a hell of a lot better than a single monolithic change lingering in review purgatory for what can often seem like an eternity. While you need to be cognizant to not confuse movement with meaningful progress, I think there is real power to the Progress Principle and that we should aim to incorporate changes as soon as they are ready and not any later. Applied to version control and code review, this means integrating a commit as soon as author, reviewer, and our machine overlords reporting status checks all agree it is ready, without having to wait for a larger unit of work, like the pull request. Succinctly, move forward as soon as you are able to!

This desire to integrate faster has significant implications for pull requests. Again, looking at GitHub's implementation of pull requests, I don't see how today's pull requests could adapt to this desired end state without significant structural changes. For starters, review must grow the ability to track per-commit state otherwise integrating individual commits without the entirety of the parts makes little sense. But this entails all the complexity I described above. Then there's the problem of Git branches effectively defining a pull request. What happens when some commits in a pull request are integrated and the author rebases or merges their local branch against their new changes? This may or may not just work. And when it doesn't just work, the author can easily find themselves in merge conflict hell, where one commit after the other fails to apply cleanly and their carefully curated stack of commits quickly becomes a liability and impediment to forward progress. (As an aside, the Mercurial version control tool has a concept called changeset evolution where it tracks which commits - changesets in Mercurial parlance - have been rewritten as other commits and gracefully reacts in situations like a rebase. For example, if you have commits X and Y and X is integrated via a rebase as X', an hg rebase of Y onto X' will see that X was rewritten as X' and skip attempting to rebase X because it is already applied! This cleanly sidesteps a lot of the problems with history rewriting - like merge conflicts - and can make the end-user experience much more pleasant as a result.) While it is certainly possible to integrate changes as soon as they are ready with a pull request workflow, I think that it is awkward and that by the time you've made enough changes to accommodate the workflow, very little is left of the pull request workflow as we know it and it is effectively a different workflow altogether.

The above arguments overly hinge on the assumption that more smaller changes is superior for quality and/or velocity and that we should design workflows around this assertion. While I strongly believe in the merits of smaller units of change, others may disagree. (If you do disagree, you should ask yourself whether you believe the converse: that larger units of change are better for quality and velocity. I suspect most people can't justify this. But I do believe there is merit to the argument that smaller units of change impose additional per-unit costs or have second order effects that undermine their touted quality or velocity benefits.)

But even if you don't buy into the change size arguments, there's still a very valid reason why we should think beyond pull requests as they are implemented today: tool scalability.

The implementation of pull requests today is strongly coupled with how Git works out of the box. A pull request is initiated from a Git branch pushed to a remote Git repository. When the pull request is created, the server creates a Git branch/ref referring to that pull request's head commits. On GitHub, these refs are named pull/ID/head (you can fetch these from the remote Git repository but they are not fetched by default). Also when a pull request is created or updated, a git merge is performed to produce a diff for review. On GitHub, the resulting merge commit is saved and pointed to on the open pull request via a pull/ID/merge ref, which can also be fetched locally. (The merge ref is deleted when the pull request is closed.)

Herein resides our scalability problem: unbound growth of Git refs and ever-increasing rate of a change for a growing project. Each Git ref adds overhead to graph walking operations and data exchange. While involved operations are continuously getting optimized (often through the use of more advanced data structures or algorithms), there are intrinsic scaling challenges with this unbound growth that - speaking as a version control tool maintainer - I want no part of. Are technical solutions enabling things to scale to millions of Git refs viable? Yes'ish. But it requires high-effort solutions like JGit's Reftable, which required ~90 review rounds spanned across ~4 months to land. And that's after design of the feature was first proposed at least as far back as July 2017. Don't get me wrong: am I glad Reftable exists: yes. It is a fantastic solution to a real problem and reading how it works will probably make you a better engineer. But simultaneously, it is a solution to a problem that does not need to exist. There is a space for scaling graph data structures and algorithms to millions or even billions of nodes, edges, and paths: your version control tool should not be it. Millions or billions of commits and files: that's fine. But scaling the number of distinct paths through that graph by introducing millions of DAG heads is insane given how much complexity it introduces in random areas of the tool. In my opinion it requires unjustifiably large amounts of investment to make work at scale. As an engineer, my inclination when posed with problems like these is to avoid them in the first place. The easiest problems to solve are those you don't have.

Unfortunately, the tight coupling of pull requests to Git branches/refs introduces unbound growth and a myriad of problems associated with it. Most projects may not grow to a size that experiences these problems. But as someone who has experience with this problem space at multiple companies, I can tell you the problem is very real and the performance and scalability issues it creates undermines the viability of using today's implementation of pull requests once you've reached a certain scale. Since we can likely fix the underlying scaling issues with Git, I don't think the explosion of Git refs is a long-term deal breaker for scaling pull requests. But it is today and will remain so until Git and the tools built on top of it improve.

In summary, some high-level problems with pull requests are as follows:

  • Review of merge diff by default encourages larger units of review, which undermines quality and velocity outcomes.
  • Inability to incrementally integrate commits within a pull request, which slows down velocity, time to meaningful feedback, and can lower morale.
  • Tight coupling of pull requests with Git branches adds rigidity to workflows and shoehorns into less flexible and less desired workflows.
  • Deficiencies in the Git user experience - particular around what happens when rewrites (including rebase) occur - significantly curtail what workflows can be safely practiced with pull requests.
  • Tight coupling of pull requests with Git branches can lead to performance issues at scale.

We can invert language to arrive at a set of more ideal outcomes:

  • Review experience is optimized for individual commits - not the merge diff - so review units are smaller and quality and velocity outcomes are improved.
  • Ability to incrementally integrate individual commits from a larger set so ready-to-go changes are incorporated sooner, improving velocity, time to meaningful feedback, and morale.
  • How you use Git branches does not impose significant restrictions on handling of pull requests.
  • You can use your version control tool how you want, without having to worry about your workflow being shoehorned by how pull requests work.
  • Pull request server can scale to the most demanding use cases with relative ease.

Let's talk about how we could achieve these more desirable outcomes.

Exploring Alternative Models

A pull request is merely an implementation pattern for the general problem space of integrating a proposed change. There are other patterns used by other tools. Before I describe them, I want to coin the term integration request to refer to the generic concept of requesting some change being integrated elsewhere. GitHub pull requests and GitLab merge requests are implementations of integration requests, for example.

Rather than describe alternative tools in detail, I will outline the key areas where different tools differ from pull requests and assess the benefits and drawbacks to the different approaches.

Use of the VCS for Data Exchange

One can classify implementations of integration requests by how they utilize the underlying version control tools.

Before Git and GitHub came along, you were probably running a centralized version control tool which didn't support offline commits or feature branches (e.g. CVS or Subversion). In this world, the common mechanism for integration requests was exchanging diffs or patches through various media - email, post to a web service of a code review tool, etc. Your version control tool didn't speak directly to a VCS server to initiate an integration request. Instead, you would run a command which would export a text-based representation of the change and then send it somewhere.

Today, we can classify integration requests by whether or not they speak the version control tool's native protocol for exchanging data or whether they exchange patches through some other mechanism. Pull requests speak the VCS native protocol. Tools like Review Board and Phabricator exchange patches via custom HTTP web services. Typically, tools using non-native exchange will require additional client-side configuration, including potentially the installation of a custom tool (e.g. RBTools for Review Board or Arcanist for Phabricator). Although modern version control tools sometimes have this functionality built-in. (e.g. Git and Mercurial fulfill Zawinski's law and Mercurial has a Phabricator extension in its official distribution).

An interesting outlier is Gerrit, which ingests its integration requests via git push. (See the docs.) But the way Gerrit's ingestion via git push works is fundamentally different from how pull requests work! With pull requests, you are pushing your local branch to a remote branch and a pull request is built around that remote branch. With Gerrit, your push command is like git push gerrit HEAD:refs/for/master. For the non-gurus, that HEAD:refs/for/master syntax means, push the HEAD commit (effectively the commit corresponding to the working directory) to the refs/for/master ref on the gerrit remote (the SOURCE:DEST syntax specifies a mapping of local revision identifier to remote ref). The wizard behind the curtain here is that Gerrit runs a special Git server that implements non-standard behavior for the refs/for/* refs. When you push to refs/for/master, Gerrit receives your Git push like a normal Git server would. But instead of writing a ref named refs/for/master, it takes the incoming commits and ingests them into a code review request! Gerrit will create Git refs for the pushed commits. But it mainly does that for its internal tracking (Gerrit stores all its data in Git - from Git data to review comments). And if that functionality isn't too magical for you, you can also pass parameters to Gerrit via the ref name! e.g. git push gerrit HEAD refs/for/master%private will create a private review request that requires special permissions to see. (It is debatable whether overloading the ref name for additional functionality is a good user experience for average users. But you can't argue that this isn't a cool hack!)

On the surface, it may seem like using the version control tool's native data exchange is a superior workflow because it is more native and more modern. (Emailing patches is so old school.) Gone are the days of having to configure client-side tools to export and submit patches. Instead, you run git push and your changes can be turned into an integration request automatically or with a few mouse clicks. And from a technical level, this exchange methodology is likely safer, as round-tripping a text-based representation of a change without data loss is surprisingly finicky. (e.g. JSON's lack of lossless binary data exchange without encoding to e.g. base64 first often means that many services exchanging text-based patches are lossy, especially in the presence of content which doesn't conform to UTF-8, which can be commonplace in tests. You would be surprised how many tools experience data loss when converting version control commits/diffs to text. But I digress). Having Git's wire protocol exchange binary data is safer than exchanging text patches and probably easier to use since it doesn't require any additional client-side configuration.

But despite being more native, modern, and arguably robust, exchange via the version control tool may not be better.

For starters, use of the version control tool's native wire protocol inhibits use of arbitrary version control tools on the client. When your integration request requires the use of a version control tool's wire protocol, the client likely needs to be running that version control tool. With other approaches like exchange of text based patches, the client could be running any software it wanted: as long as it could spit out a patch or API request in the format the server needed, an integration request could be created! This meant there was less potential for lock-in, as people could use their own tools on their machines if they wanted and they (hopefully) wouldn't be inflicting their choice on others. Case in point, a majority of Firefox developers use Mercurial - the VCS of the canonical repository - but a large number use Git on the client. Because Firefox is using Phabricator (Review Board and Bugzilla before that) for code review and because Phabricator ingests text-based patches, the choice of the VCS on the client doesn't matter that much and the choice of the server VCS can be made without inciting a holy war among developers who would be forced to use a tool they don't prefer. Yes, there are good reasons for using a consistent tool (including organizational overhead) and sometimes mandates for tool use are justified. But in many cases (such as random open source contributions), it probably doesn't or shouldn't matter. And in cases like Git and Mercurial, where tools like the fantastic git-cinnabar make it possible to easily convert between the repositories without data loss and acceptable overhead, adoption of the version control tool's native wire protocol can exclude or inhibit the productivity of contributors since it can mandate use of specific, undesired tooling.

Another issue with using the version control tool's wire protocol is that it often forces or strongly encourages you to work a certain way. Take GitHub pull requests for example. The pull request is defined around the remote Git branch that you git push. If you want to update that branch, you need to know its name. So that requires some overhead to either create and track that branch or find its name when you want to update it. Contrast with Gerrit, where you don't have an explicit remote branch you push to: you simply git push gerrit HEAD:refs/for/master and it figures things out automatically (more on this later). With Gerrit, I don't have to create a local Git branch to initiate an integration request. With pull requests, I'm compelled to. And this can undermine my productivity by compelling me to practice less-efficient workflows!

Our final point of comparison involves scalability. When you use the version control tool wire protocol as part of integration requests, you have introduced the problem of scaling your version control server. Take it from someone who has had multiple jobs involving scaling version control servers and who is intimately aware of the low-level details of both the Git and Mercurial wire protocols: you don't want to be in the business of scaling a version control server. The wire protocols for both Git and Mercurial were designed in a now-ancient era of computing and weren't designed by network protocol experts. They are fundamentally difficult to scale at just the wire protocol level. I've heard stories that at one time, the most expensive single server at Google was their Perforce or Perforce-derived server (this was several years ago - Google has since moved on to a better architecture).

The poor network protocols of version control tools have many side-effects, including the inability or sheer difficulty of using distributed storage on the server. So in order to scale compute horizontally, you need to invest in expensive network storage solutions or devise a replication and synchronization strategy. And take it from someone who worked on data synchronization products (outside of the source control space) at three companies: this is a problem you don't want to solve yourself. Data synchronization is intrinsically difficult and rife with difficult trade-offs. It's almost always a problem best avoided if you have a choice in the matter.

If creating Git refs is part of creating an integration request, you've introduced a scaling challenge with the number of Git refs. Do these Git refs live forever? What happens when you have thousands of developers - possibly all working in the same repository - and the number of refs or ref mutations grows to the hundreds of thousands or millions per year?

Can your version control server handle ingesting a push every second or two with reasonable performance? Unless you are Google, Facebook, or a handful of other companies I'm aware of, it can't. And before you cry that I'm talking about problems that only plague the 0.01% of companies out there, I can name a handful of companies under 10% the size of these behemoths where this is a problem for them. And I also guarantee that many people don't have client-side metrics for their git push P99 times or reliability and don't even realize there is a problem! Scaling version control is probably not a core part of your company's business. Unfortunately, it all too often becomes something companies have to allocate resources for because of poorly designed or utilized tools.

Contrast the challenges of scaling integration requests with a native version control server versus just exchanging patches. With the more primitive approach, you are probably sending the patch over HTTP to a web service. And with tools like Phabricator and Review Board, that patch gets turned into rows in a relational database. I guarantee it will be easier to scale an HTTP web service fronting a relational database than it will be your version control server. If nothing else, it should be easier to manage and debug, as there are tons more experts in these domains than in the version control server domain!

Yes, it is true that many will not hit the scaling limits of the version control server. And some nifty solutions for scaling do exist. But large segments of this problem space - including the version control tool maintainers having to support crazy scaling vectors in their tools - could be avoided completely if integration requests didn't lean so heavily on the version control tools's default mode of operation. Unfortunately, solutions like GitHub pull requests and Gerrit's use of Git refs for storing everything exert a lot of pressure on scaling the version control server and make this a very real problem once you reach a certain scale.

Hopefully the above paragraphs enlightened you to some of the implications that the choice of a data exchange mechanism has on integration requests! Let's move on to another point of comparison.

Commit Tracking

One can classify implementations of integration requests by how they track commits through their integration lifecycle. What I mean by this is how the integration request follows the same logical change as it evolves. For example, if you submit a commit then amend it, how does the system know that the commit evolved from commit X to X'.

Pull requests don't track commits directly. Instead, a commit is part of a Git branch and that branch is tracked as the entity the pull request is built around. The review interface presents the merge diff front and center. It is possible to view individual commits. But as far as I know, none of these tools have smarts to explicitly track or map commits across new submissions. Instead, they simply assume that the commit order will be the same. If commits are reordered or added or removed in the middle of an existing series, the tool can get confused quite easily. (With GitHub, it was once possible for a review comment left on a commit to disappear entirely. The behavior has since been corrected and if GitHub doesn't know where to print a comment from a previous commit, it renders it as part of the pull request's timeline view.)

If all you are familiar with is pull requests, you may not realize there are alternatives to commit tracking! In fact, the most common alternative (which isn't do nothing) predates pull requests entirely and is still practiced by various tools today.

The way that Gerrit, Phabricator, and Review Board work is the commit message contains a unique token identifying the integration request for that commit. e.g. a commit message for a Phabricator review will contain the line Differential Revision: https://phab.mercurial-scm.org/D7543. Gerrit will have something like Change-Id: Id9bfca21f7697ebf85d6a6fa7bac7de4358d7a43.

The way this annotation appears in the commit message differs by tool. Gerrit's web UI advertises a shell one-liner to clone repositories which not only performs a git clone but also uses curl to download a shell script from the Gerrit server and install it as Git's commit-msg hook in the newly-cloned repositories. This Git hook will ensure that any newly-created commit has a Change-ID: XXX line containing a randomly generated, hopefully unique identifier. Phabricator and Review Board leverage client-side tooling to rewrite commit messages after submission to their respective tool so the commit message contains the URL of the code review. One can debate which approach is better - they each have advantages and drawbacks. Fortunately, this debate is not germane to this post, so we won't cover it here.

What is important is how this metadata in commit messages is used.

The commit message metadata comes into play when a commit is being ingested into an integration request. If a commit message lacks metadata or references an entity that doesn't exist, the receiving system assumes it is new. If the metadata matches an entity on file, the incoming commit is often automatically matched up to an existing commit, even if its Git SHA is different!

This approach of inserting a tracking identifier into commit messages works surprisingly well for tracking the evolution of commits! Even if you amend, reorder, insert, or remove commits, the tool can often figure out what matches up to previous submissions and reconcile state accordingly. Although support for this varies by tool. Mercurial's extension for submitting to Phabricator is smart enough to take the local commit DAG into account and change dependencies of review units in Phabricator to reflect the new DAG shape, for example.

The tracking of commits is another one of those areas where the simpler and more modern features of pull requests often don't work as well as the solutions that came before. Yes, inserting an identifier into commit messages feels hacky and can be brittle at times (some tools don't implement commit rewriting very well and this can lead to a poor user experience). But you can't argue with the results: using explicit, stable identifiers to track commits is far more robust than the heuristics that pull requests rely on. The false negative/positive rate is so much lower. (I know this from first hand experience because we attempted to implement commit tracking heuristics for a code review tool at Mozilla before Phabricator was deployed and there were a surprising number of corner cases we couldn't handle properly. And this was using Mercurial's obsolescence markers, which gave us commit evolution data generated directly by the version control tool! If that didn't work well enough, it's hard to imagine an heuristic that would. We eventually gave up and used stable identifiers in commit messages, which fixed most of the annoying corner cases.)

The use of explicit commit tracking identifiers may not seem like it makes a meaningful difference. But it's impact is profound.

The obvious benefit of tracking identifiers is that they allow rewriting commits without confusing the integration request tool. This means that people can perform advanced history rewriting with near impunity as to how it would affect the integration request. I am a heavy history rewriter. I like curating a series of individually high-quality commits that can each stand in isolation. When I submit a series like this to a GitHub pull request and receive feedback on something I need to change, when I enact those changes I have to think will my rewriting history here make re-review harder? (I try to be empathetic with the reviewer and make their life easier whenever possible. I ask what I would appreciate someone doing if I were reviewing their change and tend to do that.) With GitHub pull requests, if I reorder commits or add or remove a commit in the middle of a series, I realize that this may make review comments left on those commits hard to find since GitHub won't be able to sort out the history rewriting. And this may mean those review comments get lost and are ultimately not acted upon, leading to bugs or otherwise deficient changes. This is a textbook example of tooling deficiencies dictating a sub-optimal workflow and outcome: because pull requests don't track commits explicitly, I'm forced to adopt a non-ideal workflow or sacrifice something like commit quality in order to minimize risks that the review tool won't get confused. In general, tools should not externalize these kinds of costs or trade-offs onto users: they should just work and optimize for generally agreed-upon ideal outcomes.

Another benefit to tracking identifiers is that they enable per-commit review to be viable. Once you can track the logical evolution of a single commit, you can start to associate things like review comments with individual commits with a high degree of confidence. With pull requests (as they are implemented today), you can attempt to associate comments with commits. But because you are unable to track commits across rewrites with an acceptably high degree of success, rewritten commits often fall through the cracks, orphaning data like review comments with them. Data loss is bad, so you need a place to collect this orphaned data. The main pull request activity timeline facilitates this function.

But once you can track commits reliably (and tools like Gerrit and Phabricator prove this is possible), you don't have this severe problem of data loss and therefore don't need to worry about finding a place to collect orphaned data! You are then able to create per-commit review units, each as loosely coupled with other commits and an overall series as you want to make it!

It is interesting to note the different approaches in different tools here. it is doubly interesting to note behavior that is possible with the review tool itself and what it does by default!

Let's examine Phabricator. Phabricator's review unit is the Differential revision. (Differential is the name of the code review tool in Phabricator, which is actually a suite of functionality - like GitLab, but not nearly as feature complete.) A Differential revision represents a single diff. Differential revisions can have parent-child relationships with others. Multiple revisions associated like this form a conceptual stack in Phabricator's terminology. Go to https://phab.mercurial-scm.org/D4414 and search for stack to see it in action. (Stack is a bit misleading name because the parent-child relationships actually form a DAG and Phabricator is capable of rendering things like multiple children in its graphical view.) Phabricator's official client side tool for submitting to Phabricator - Arcanist or arc - has default behavior of collapsing all Git commits into a single Differential revision.

Phabricator can preserve metadata from the individual commits (it can render at least the commit messages in the web UI so you can see where the Differential revision came from). In other words, by default Arcanist does not construct multiple Differential revisions for each commit and therefore does not construct parent-child relationships for them. So there is no stack to render here. To be honest, I'm not sure if modern versions of Arcanist even support doing this. I do know both Mercurial and Mozilla authored custom client side tools for submitting to Phabricator to work around deficiencies like this in Arcanist. Mozilla's may or may not be generally suitable for users outside of Mozilla - I'm not sure.

Another interesting aspect of Phabricator is that there is no concept of an over-arching series. Instead, each Differential revision stands in isolation. They can form parent-child relationships and constitute a stack. But there is no primary UI or APIs for stacks (the last I looked anyway). This may seem radical. You may be asking questions like how do I track the overall state of a series or how do I convey information pertinent to the series as a whole. These are good questions. But without diving into them, the answer is that as radical as it sounds to not have an overall tracking entity for a series of Differential revisions, it does work. And having used this workflow with the Mercurial Project for a few years, I can say I'm not missing the functionality that much.

Gerrit is also worth examining. Like Phabricator, Gerrit uses an identifier in commit messages to track the commit. But whereas Phabricator rewrites commit messages at initially submission time to contain the URL that was created as part of that submission, Gerrit peppers the commit message with a unique identifier at commit creation time. The server then maintains a mapping of commit identifier to review unit. Implementation details aside, the end result is similar: individual commits can be tracked more easily.

What distinguishes Gerrit from Phabricator is that Gerrit does have a stronger grouping around multiple commits. Gerrit will track when commits are submitted together and will render both a relation chain and submitted together list automatically. While it lacks the visual beauty of Phabricator's implementation, it is effective and is shown in the UI by default, unlike Phabricator.

Another difference from Phabricator is that Gerrit uses per-commit review by default. Whereas you need a non-official client for Phabricator to submit a series of commits to constitute a linked chain, Gerrit does this by default. And as far as I can tell, there's no way to tell Gerrit to squash your local commits down to a single diff for review: if you want a single review to appear, you must first squash commits locally then push the squashed commit. (More on this topic later in the post.)

A secondary benefit of per-commit review is that this model enables incremental integration workflows, where some commits in a series or set can integrate before others, without having to wait for the entire batch. Incremental integration of commits can drastically speed up certain workflows, as commits can integrate as soon as they are ready and not any longer. The benefits of this model can be incredible. But actually deploying this workflow can be tricky. One problem is that your version control tool may get confused when you rebase or merge partially landed state. Another problem is it can increase the overall change rate of the repository, which may strain systems from version control to CI to deployment mechanisms. Another potential problem involves communicating review sign-off from integration sign-off. Many tools/workflows conflate I sign off on this change and I sign off on landing this change. While they are effectively identical in many cases, there are some valid cases where you want to track these distinctly. And adopting a workflow where commits can integrate incrementally will expose these corner cases. So before you go down this path, you want to be thinking about who integrates commits and when they are integrated. (You should probably be thinking about this anyway because it is important.)

Designing a Better Integration Request

Having described some problems with pull requests and alternate ways of going about solving the general problem of integration requests, it is time to answer the million dollar problem: designing a better integration request. (When you factor in the time people spend in pull requests and the cost of bugs / low quality changes that slip through due to design of existing tooling, improving integration requests industry wide would be a lot more valuable than $1M.)

As a reminder, the pull request is fundamentally a nice UI and set of features built around the common Git feature branch workflow. This property is preserved from the earliest days of pull requests in 2007-2008 and has been copied by vendors like Bitbucket and GitLab in the years since. In my mind, pull requests should be ripe for overhaul.

Replace Forks

The first change I would make to pull requests is to move away from forks being a required part of the workflow. This may seem radical. But it isn't!

A fork on services like GitHub is a fully fledged project - just like the canonical project it was forked from. It has its own issues, wiki, releases, pull requests, etc. Now show of hands: how often do you use these features on a fork? Me neither. In the overwhelming majority of cases, a fork exists solely as a vehicle to initiate a pull request against the repository it was forked from. It serves little to no additional meaningful functionality. Now, I'm not saying forks don't serve a purpose - they certainly do! But in the case of someone wanting to propose a change to a repository, a fork is not strictly required and its existence is imposed on us by the current implementation of pull requests.

I said impose in the previous sentence because forks introduce overhead and confusion. The existence of a fork may confuse someone as to where a canonical project lives. Forks also add overhead in the version control tool. Their existence forces the user to manage an additional Git remote and branches. It forces people to remember to keep their branches in sync on their fork. As if remembering to keep your local repository in sync wasn't hard enough! And if pushing to a fork, you need to re-push data that was already pushed to the canonical repository, even though that data already exists on the server (just in a different view of the Git repository). (I believe Git is working on wire protocol improvements to mitigate this.)

When merely used as a vehicle to initiate integration requests, I do not believe forks offer enough value to justify their existence. Should forks exist: yes. Should people be forced to use them in order to contribute changes, no. (Valid use cases for a fork would be to perform a community splinter of a project and to create an independent entity for reasons such as better guarantees of data availability and integrity.)

Forks are essentially a veneer on top of a server-side git clone. And the reason why a separate Git repository is used at all is probably because the earliest versions of GitHub were just a pile of abstractions over git commands. The service took off in popularity, people copied its features almost verbatim, and nobody ever looked back and thought why are we doing things like this in the first place.

To answer what we would replace forks with, we have to go back to first principles and ask what are we trying to do. And that is propose a unit a change against an existing project. And for version control tools, all you need to propose a change is a patch/commit. So to replace forks, we just need an alternate mechanism to submit patches/commits to an existing project.

My preferred alternative to forks is to use git push directly to the canonical repository. This could be implemented like Gerrit where you push to a special ref. e.g. git push origin HEAD:refs/for/master. Or - and this is my preferred solution - version control servers could grow more smarts about how pushes work - possibly even changing what commands like git push do if the server is operating in special modes.

One idea would be for the Git server to expose different refs namespaces depending on the authenticated user. For example, I'm indygreg on GitHub. If I wanted to propose a change to a project - let's say python/cpython - I would git clone git@github.com:python/cpython. I would create a branch - say indygreg/proposed-change. I would then git push origin indygreg/proposed-change and because the branch prefix matches my authenticated username, the server lets it through. I can then open a pull request without a fork! (Using branch prefixes is less than ideal, but it should be relatively easy to implement on the server. A better approach would rely on remapping Git ref names. But this may require a bit more configuration with current versions of Git than users are willing to stomach. An even better solution would be for Git to grow some functionality to make this easier. e.g. git push --workspace origin proposed-change would push proposed-change to a workspace on the origin remote, which Git would know how to translate to a proper remote ref update.)

Another idea would be for the version control server to invent a new concept for exchanging commits - one based on sets of commits instead of DAG synchronization. Essentially, instead of doing a complicated discovery dance to synchronize commits with the underlying Git repository, the server would ingest and expose representations of sets of commits stored next to - but not within - the repository itself. This way you are not scaling the repository DAG to infinite heads - which is a hard problem! A concrete implementation of this might have the client run a git push --workspace origin proposed-change to tell the remote server to store your proposed-change branch in your personal workspace (sorry for reusing the term from the previous paragraph). The Git server would receive your commits, generate a standalone blob to hold them, save that blob to a key-value store like S3, then update a mapping of which commits/branches are in which blobs in a data store such as a relational database somewhere. This would effectively segment the core project data from the more transient branch data, keeping the core repository clean and pure. It allows the server to lean on easier-to-scale data stores such as key-value blob stores and relational databases instead of the version control tool. I know this idea is feasible because Facebook implemented it for Mercurial. The infinitepush extension essentially siphons Mercurial bundles (standalone files holding commit data) off to a blob store when pushes come in over the wire. At hg pull time, if a requested revision is not present in the repository, the server asks the database-backed blob index if the revision exists anywhere. If it does, the blob/bundle is fetched, dynamically overlayed onto the repository in memory, and served to the client. While the infinitepush extension in the official Mercurial project is somewhat lacking (through no fault of Facebook's), the core idea is solid and I wish someone would spend the time to flush out the design a bit more because it really could lead to logically scaling repositories to infinite DAG heads without the complexities of actually scaling scaling DAG algorithms, repository storage, and version control tool algorithms to infinite heads. Getting back to the subject of integration requests, one could imagine having a target for workspace pushes. For example, git push --workspace=review origin would push to the review workspace, which would automatically initiate a code review.

Astute readers of this blog may find these ideas familiar. I proposed user namespaces in my /blog/2017/12/11/high-level-problems-with-git-and-how-to-fix-them/ post a few years ago. So read there for more on implications of doing away with forks.

Could forks be done away with as a requirement to submit pull requests? Yes! Gerrit's git push origin HEAD:refs/for/master mechanism proves it. Is Gerrit's approach too much magic or confusing for normal users? I'm not sure. Could Git grow features to make the user experience much better so users don't need to be burdened with complexity or magic and could simply run commands like git submit --for review? Definitely!

Shift Focus From Branches to Individual Commits

My ideal integration request revolves around individual commits, not branches. While the client may submit a branch to initiate or update an integration request, the integration request is composed of a set of loosely coupled commits, where parent-child relationships can exist to express a dependency between commits. Each commit is evaluated individually. Although someone may need to inspect multiple commits to gain a full understanding of the proposed change. And some UI enabling operations against a group of related commits (such as mass deleting abandoned commits) may be warranted.

In this world, the branch would not matter. Instead, commits are king. Because we would be abandoning the branch name as a tracker for the integration request, we would need something to replace it, otherwise we have no way of knowing how to update an existing integration request! We should do what tools like Phabricator, Gerrit, and Review Board do and add a persistent identifier to commits which survive history rewriting. (Branch-based pull requests should do this anyway so history rewrites don't confuse the review tool and e.g. cause comments to get orphaned - see above.)

It's worth noting that a commit-centric integration request model does not imply that everyone is writing or reviewing series of smaller commits! While titans of industry and I strongly encourage the authorship of smaller commits, commit-centric integration requests don't intrinsically force you to do so. This is because commit-centric integration requests aren't forcing you to change your local workflow! If you are the type of person who doesn't want to curate a ton of small, good-in-isolation commits (it does take a bit more work after all), nobody would be forcing you to do so. Instead, if this is your commit authorship pattern, the submission of the proposed change could squash these commits together as part of the submission, optionally rewriting your local history in the process. If you want to keep dozens of fixup commits around in your local history, that's fine: just have the tooling collapse them all together on submission. While I don't think those fixup commits are that valuable and shouldn't be seen by reviewers, if we wanted, we could have tools continue to submit them and make them visible (like they are in e.g. GitHub pull requests today). But they wouldn't be the focus of review (again like GitHub pull requests today). Making integration requests commit-centric doesn't force people to adopt a different commit authorship workflow. But it does enable projects that wish to adopt more mature commit hygiene to do so. That being said, hows tools are implemented can impose restrictions. But that's nothing about commit-centric review that fundamentally prohibits the use of fixup commits in local workflows.

While I should create a dedicated post espousing the virtues of commit-centric workflows, I'll state my case through proxy by noting that some projects aren't using modern pull requests precisely because commit-centric workflows are not viable. When I was at Mozilla, one of the blockers to moving to GitHub was the pull request review tooling wasn't compatible with our world view that review units should be small. (This view is generally shared by Google, Facebook, and some prominent open source projects, among others.) And for reasons outlined earlier in this post, I think that as long as pull requests revolve around branches / merge diffs and aren't robust in the face of history rewriting (due to the lack of robust commit tracking), projects that insist on more refined practices will continue to eschew pull requests. Again, a link between review size and quality has been established. And better quality - along with its long-term effect of lowering development costs due to fewer bugs - can tip the scales in its favor, even against all the benefits you receive when using a product like GitHub, GitLab, or Bitbucket.

The Best of What's Around

Aspects of a better integration request exist in tools today. Unfortunately, many of these features are not present on pull requests as implemented by GitHub, GitLab, Bitbucket, etc. So to improve the pull request, these products will need to borrow ideas from other tools.

Integration requests not built around Git branches (Gerrit, Phabricattor, Review Board, etc) use identifiers in commit messages to track commits. This helps tracking commits across changes. There are compelling advantages to this model. Robust commit tracking is a requirement for commit-centric workflows. And it would even improve the functionality of branch-based pull requests. A well-designed integration request would have a robust commit tracking mechanism.

Gerrit has the best-in-class experience for commit-centric workflows. It is the only popular implementation of integration requests I'm aware of that supports and caters to this workflow by default. In fact, I don't think you can change this! (This behavior is user hostile in some cases since it forces users to know how to rewrite commits, which is often perilous in Git land. It would be nice if you could have Gerrit squash commits into the same review unit automatically on the server. But I understand the unwillingness to implement this feature because this has its own set of challenges around commit tracking, which I won't bore you with.) Gerrit also shows groups of related commits front and center when viewing a proposed change.

Phabricator is the only other tool I know of where one can achieve a reasonable commit-centric workflow without the pitfalls of orphaned comments, context overload, etc mentioned earlier in this post. But this requires non-standard submission tooling and commit series aren't featured prominently in the web UI. So Phabricator's implementation is not as solid as Gerrit's.

Another Gerrit feature worth lauding is the submission mechanism. You simply git push to a special ref. That's it. There's no fork to create. No need to create a Git branch. No need to create a separate pull request after the push. Gerrit just takes the commits you pushed and turns them into a request for review. And it doesn't require any additional client-side tooling!

Using a single common git command to submit and update an integration request is simpler and arguably more intuitive than other tools. Is Gerrit's submission perfect? No. The git push origin HEAD:refs/for/master syntax is not intuitive. And overloading submission options by effectively encoding URL parameters on the ref name is a gross - albeit effective - hack. But users will likely quickly learn the one-liner's or create more intuitive aliases.

The elegance of using just a git push to initiate an integration request puts Gerrit in a league of its own. I would be ecstatic if the GitHubs of the world reduced the complexity of submitting pull requests to simply clone the canonical repository, create some commits, and run a git command. The future of submitting integration requests* hopefully looks more like Gerrit than other alternatives.

What Needs Built

Some aspects of the better integration request don't yet exist or need considerable work before I consider them viable.

For tools which leverage the native version control tool for submission (e.g. via git push), there needs to be some work to support submission via a more generic, HTTP endpoint. I'm fine with leveraging git push as a submission mechanism because it makes the end-user experience so turnkey. But making it the only submission mechanism is a bit unfortunate. There is some support for this: I believe you can cobble together a pull request from scratch via GitHub's APIs, for example. But it isn't as simple as submit a patch to an endpoint, which it arguably should be. Even Gerrit's robust HTTP API, does not seem to allow creating new commits/diffs via that API. Anyway, this limitation not only excludes non-Git tools from using these tools, but also limits other tooling from submitting without using Git. For example, you may want to write a bot that proposes automated changes and it is much easier to produce a diff than to use git since the former does not require a filesystem (this matters in serverless environments for example).

A larger issue with many implementations is the over-reliance on Git for server storage. This is most pronounced in Gerrit, where not only are your git pushes stored in a Git repository on the Gerrit server, but every code review comment and reply is stored in Git as well! Git is a generic key-value store and you can store any data you want in it if you shoehorn it properly. And it is cool that all your Gerrit data can be replicated via git clone - this pretty much eliminates the we took a decentralized tool and centralized it via GitHub series of arguments. But if you apply this store everything in Git approach at scale, it means you will be running a Git server at scale. And not just any Git server - a write load heavy Git server! And if you have thousands of developers possibly all working out of the same repository, then you are looking at potentially millions of new Git refs per year. While the Git, Gerrit, and JGit people have done some fantastic work making these tools scale, I'd feel much better if we eschewed the make Git scale to infinite pushes and refs problem and used a more scalable approach, like an HTTP ingestion endpoint which writes data to key-value stores or relational databases. In order words, use of a version control tool for servicing integration requests at scale is a self-imposed footgun and could be avoided.

Conclusion

Congratulations on making it through my brain dump! As massive as the wall of text is, there are still plenty of topics I could have covered but didn't. This includes the more specific topic of code review and the various features that entails. I also largely ignored some general topics like the value that an integration request can serve on the overall development lifecycle: integration requests are more than just code review - they serve as a nexus to track the evolution of a change throughout time.

Hopefully this post gave you some idea at some of the structural issues at play with the integration of pull requests and integration requests. And if you are someone in a position to design or implement a better integration request or tooling around them (including in version control tools themselves), hopefully it gave you some good ideas or where to go next.