“Merge pull request” Considered Harmful

I love GitHub – I think it’s made contributing to open source 1000% more approachable and enjoyable. But I’ve found that the maintainer workflow GitHub puts front and center, in the form of the web Pull Request UI, actively harms project quality and slows down the intake of contributions. So before you hit that big old “Merge pull request” button on the next open source GitHub Pull Request you think deserves inclusion, let me tell you a quick story.

Meet the Maintainer

Jane’s the maintainer on a modestly successful open source project. She gets a couple of new Issues opened on her project’s GitHub repo every week, and she’s quick to dive in and provide feedback on the requests coming in. While she doesn’t have time to implement all the good ideas, she does try to give folks a quick thumbs-up or thumbs-down so that they can code them up and open a Pull Request for inclusion.

As all good maintainers do, she has written up a CONTRIBUTING document, and of course her repo has a README and a CHANGELOG that get updated as the project moves forward. There are a couple levels of automated tests that contributors can run as they develop and are expected to extend as they add features. Jane has even adopted a comprehensive public style guide so that the code base will stay tidy.

As conscientious as she tries to be, though, Jane has a problem: there are over a dozen unmerged Pull Requests on her repo! And she feels like she’s always behind on getting them merged, even though a lot of them are pretty straightforward. Looking over them, a couple are awaiting large revisions based on feedback, but the vast majority are stuck on small stuff: an extra test that needs to be added, a few whitespace changes to match the style guide, a missing CHANGELOG entry, a bunch of commits that need to be squashed, etc.

What’s even more frustrating is that even though Jane provides feedback quickly, contributors often lose interest in, or simply forget about, taking their Pull Requests the final step after the initial submission. The apparent triviality of the changes Jane’s asking for (somewhat perversely) contributes to that loss of interest, since it just feels like nit-picking when she’s asking for the fifth overlooked stylistic change.

What’s the end result? All kinds of unmerged “code inventory” sitting around taking up mental space. Jane’s frustrated and losing motivation, and the current crop of contributors is less likely to show up again in the future.

Git as Linus Wrote It

Hello, my name is Nathaniel, and once I was in the same spot as Jane. I’d recently jumped onto the ActiveMerchant project, and it had pages of Pull Request inventory stuck on trivial stuff. I knew exactly what needed to get done for most of them, and I was stuck with two options:

  1. Hope that original contributors show back up to finish cleaning up their contributions so they can be merged cleanly.
  2. Merge contributions that aren’t quite ready, then do another commit to clean them up.

Unfortunately, neither of these paths is very appealing. I tried the former for a while, giving contributors detailed feedback and waiting for them to incorporate it. Most of the Pull Requests just kept sitting there, staring at me. And I ended up trying to teach advanced git to a bunch of people: “Can you squash out your five WIP commits?” “Isn’t zucchini a squash?”

The latter option results in weird historical states and the possibility of non-working intermediate code breaking git bisect. And if the cleanup takes a while to get to, future Pull Requests could be based off the not-quite-ready commit or commits. Or the cleanup might just get forgotten. I tried it once or twice, but it was super painful and my OCD couldn’t deal with the results.

So, what’s the solution? Turns out git was built to make this situation easy to deal with, and the only thing that was really holding me back was the workflow that GitHub puts front and center. So I stopped using the “Merge pull request” button, and instead I now do this:

  1. Install hub.
  2. Grab the Pull Request URL. Looks like “https://github.com/Shopify/active_merchant/pull/1259”.
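  3. In the repo, on the master branch, I run git am -3 <url>, where <url> is the Pull Request URL. (This works because I have hub aliased as git; without the alias, the command is hub am -3 <url>.) Conflicts? I fix ’em and then git am --continue.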
  4. Now I have the changes applied right in my local master branch. I make fixes, clean up whitespace, add a line to the CHANGELOG, tack on tests, etc. The world is my oyster.

I’m not done yet – I still need to prep the changes to be pushed – but before I go on I want to highlight what I’ve done here. Git has a built-in tool called am (stands for “apply mail”) that is designed to slurp up incoming patches out of an email inbox. The handy hub tool from GitHub adapts git am to work with web-hosted patches, specifically the raw patch you can get for any Pull Request. So by using this workflow, I’m hewing much closer to the workflow the kernel maintainers (for instance) use when reviewing and merging patches.
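To make the mechanics concrete, here is a minimal local sketch of what git am -3 does with a patch series. It needs no network access: git format-patch stands in for the raw patch of a Pull Request, and the names, email addresses, and file names are all hypothetical. With hub installed, the real-world equivalent of the apply step is the hub am -3 <url> command from the list above.

```shell
#!/bin/sh
# Local stand-in for the hub workflow: a "contributor" branch is exported as a
# mailbox-style patch, then applied to master with `git am -3`.
set -e

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master     # use the branch name from the article
git config user.name  "Jane Maintainer"     # hypothetical maintainer identity
git config user.email "jane@example.com"
git config commit.gpgsign false

echo "hello" > README
git add README
git commit -q -m "Initial commit"

# The contribution, authored by someone else on a topic branch.
git checkout -q -b contribution
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature" --author="Connie Contributor <connie@example.com>"

# Export the branch as a patch series (assumed here to play the role of the
# raw patch of a Pull Request).
git format-patch --stdout master..contribution > "$work/pr.patch"

# Maintainer side: apply the patch directly onto master, with 3-way fallback.
git checkout -q master
git am -3 "$work/pr.patch"
# On conflicts: fix the files, `git add` them, then run `git am --continue`.

applied_author=$(git log -1 --format='%an')
applied_subject=$(git log -1 --format='%s')
echo "$applied_subject (by $applied_author)"
```

The commit lands on master with the contributor as its author and the maintainer as its committer, which is why the cleanup steps that follow can keep the credit where it belongs.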

History Is Written By the Victors

So now I have perfect code (hello hyperbole) sitting in my local master branch, with all the contributed commits showing up in git log, and my changes uncommitted. What comes next is a bit of a “Choose Your Own Adventure” situation, since it depends on what I’m working with:

  • If the contribution is a single commit, I’ll do a git commit --amend and just smoosh my changes right into the original commit.
  • If the contribution is multiple “work in progress” commits, I’ll commit my changes and then git rebase -i origin/master to squash all the commits including my own into a single logical commit.
  • It’s rare in my experience, but if the contribution is multiple logical commits that are “history worthy” in terms of being incremental steps that make sense in isolation, I’ll add my commit at the end.
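The “work in progress” case can be sketched end to end in a throwaway repo. This is an illustration under assumptions, not my exact routine: the base commit stands in for origin/master, the names and files are hypothetical, and the interactive rebase is scripted through GIT_SEQUENCE_EDITOR and GIT_EDITOR so it runs unattended (day to day, you would just edit the todo list by hand).

```shell
#!/bin/sh
# Sketch: squash contributed WIP commits plus a maintainer cleanup commit
# into one commit, without opening an interactive editor.
set -e

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name  "Jane Maintainer"     # hypothetical identities throughout
git config user.email "jane@example.com"
git config commit.gpgsign false

echo "base" > README
git add README
git commit -q -m "Initial commit"
base=$(git rev-parse HEAD)                  # stand-in for origin/master

# Two WIP commits by the contributor, as they would look after `git am`.
for n in 1 2; do
  echo "step $n" >> feature.txt
  git add feature.txt
  git commit -q -m "WIP $n" --author="Connie Contributor <connie@example.com>"
done

# The maintainer's cleanup, committed on top.
echo "* Add feature" > CHANGELOG
git add CHANGELOG
git commit -q -m "Cleanup"

# `git rebase -i`, scripted: turn every `pick` after the first into `squash`,
# and accept the combined commit message unchanged.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2,\$s/^pick/squash/'" GIT_EDITOR=true \
  git rebase -q -i "$base"

# Give the squashed commit one logical message; --amend keeps the author.
git commit -q --amend -m "Add feature"

commit_count=$(git rev-list --count "$base"..HEAD)
squashed_author=$(git log -1 --format='%an')
echo "$commit_count commit(s) on top of base, authored by $squashed_author"
```

The squashed commit keeps the author of the first commit in the series, so the contributor still gets the credit even though the maintainer's cleanup is folded in.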

Now I have the contribution ready to go, and there’s just one more little trick that makes it even more awesome: I add Closes #XXX. as the last line of the commit message, and when I git push the Pull Request auto-closes. Even cooler? If the Pull Request is open in my browser, the browser automatically updates to reflect the Pull Request being closed. GitHub, you so smart.
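If you would rather not open an editor for that last line, one possible way to tack it on is sketched below. It assumes a throwaway repo with a hypothetical commit, and #XXX remains a placeholder for the real Pull Request number.

```shell
#!/bin/sh
# Sketch: append a closing line to the most recent commit message.
set -e

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name  "Jane Maintainer"     # hypothetical identity
git config user.email "jane@example.com"
git config commit.gpgsign false

echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# Re-use the existing message and append the closing line as a final paragraph.
# "#XXX" is a placeholder for the Pull Request number.
git commit -q --amend -m "$(git log -1 --format=%B)" -m "Closes #XXX."

message=$(git log -1 --format=%B)
last_line=$(printf '%s\n' "$message" | tail -n 1)
echo "$last_line"
```

Each -m becomes its own paragraph, so the closing line ends up separated from the rest of the message by a blank line.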

Here’s what the commit history looks like on the ActiveMerchant project:

[Image: recent ActiveMerchant commit history]

Compare that to the Rails commit history:

[Image: recent Rails commit history]

I know which I prefer. And what’s even better is that I’m now spending way less time going around and around with contributors and way more time merging code. Final bonus: if you look at the ActiveMerchant history, you’ll notice the contributors still get all the credit for the commits, regardless of how much editing and/or rebasing of them I did. That’s just how I roll.

That Pesky Button

When might you still want to use the “Merge pull request” button?

  • Super small changes – think one-line, doc-only tweaks. That said, I’ve stopped even doing this since it’s just plain ugly, and it doesn’t take much longer to do it “right”.
  • Internal/private projects – do what you want. I have no idea what your situation is, or what makes sense. Personally I still prefer a clean commit history.

But what are the benefits of not using the “Merge pull request” button?

  • Contributors don’t have to get every detail exactly right.
  • It’s easy as a maintainer to clean up commits.
  • Less posting/emailing and more coding.
  • Commit history becomes a useful story that’s a joy to read.

I’m a big fan of the benefits, and my experience is that it means less time and less frustration for me as a maintainer. “Just push a button to take contributions” sounds great in theory, but in reality it adds friction that makes my job as a maintainer harder.

And as a hypothetical story-telling construct I created in my mind, Jane agrees 100%!