Maintaining a Fork With Git
Gunicorn is a fabulous little WSGI server for Python 2.5+. For reasons outside my control, I have to deploy it in a Python 2.4 environment. Therefore, I've been maintaining a fork on GitHub.
The problem is keeping the history readable and tidy while maintaining a fork. A standard git merge won't suffice. Since my fork's history contains changes that were never included upstream, git merge will generate nasty merge commits even when all the most recent changes from upstream apply cleanly. git cherry-pick is great for grabbing one or two commits from upstream but is cumbersome when you want to grab a dozen. I'm sure there's some way to pipe git rev-list or git log into git cherry-pick, but even if I figured it out, the command would be neither simple nor easily remembered.
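For comparison, newer versions of git (1.7.2 and later, as far as I can tell from the release notes) let git cherry-pick take a revision range directly. Here is a minimal sketch on a throwaway repository. The branch names mirror mine, but the commit helper, the file names and the commit messages are made up for illustration.

```shell
set -eu

# Build a toy repository in a temporary directory.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # force the branch name used in this post
git config user.name "Demo"
git config user.email "demo@example.com"
git config commit.gpgsign false

# Hypothetical helper: commit FILE with MESSAGE.
commit() {
    echo "$2" > "$1"
    git add "$1"
    git commit -q -m "$2"
}

commit base.txt "shared base"             # plays the role of dbd66b6
base=$(git rev-parse HEAD)
git branch py24

# Three upstream commits the fork does not have yet.
commit a.txt "upstream change 1"
commit b.txt "upstream change 2"
commit c.txt "upstream change 3"

# One commit that exists only on the fork.
git checkout -q py24
commit fork.txt "fork-only change"

# Pick every commit after $base up to the tip of master, oldest first.
git cherry-pick "$base"..master > /dev/null

git log --oneline
count=$(git rev-list --count HEAD)
tip=$(git log -1 --format=%s)
```

This ends up in the same place as the rebase below for a simple linear range. I still find the rebase form easier to remember, and it works on older versions of git.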
The solution is clever use of git rebase --onto. Observe.
I have two branches named master and py24 with the following histories:
[master] tilgovi@vissarion:~/src/gunicorn$ git log --oneline -8
19ab06c Update Python 2.4 installation note.
93ad20e Minor white space and ordering fixes for my CDO
eca6fad Output config if debug loglevel is set
7acfe5c Update documentation to reflect changes in aca70fb
e99a384 Initialize logging before setup since setup can emit warnings to log
0d67447 Add pre/post request hooks
aca70fb allows worker_class uri shortcut. It's now possible to do :
dbd66b6 work around evdns not playing well with fork
[py24] tilgovi@vissarion:~/src/gunicorn$ git log --oneline -3
0a5f7b3 Merge branch 'master' into py24
c52efd5 Clean up minor differences from upstream
dbd66b6 work around evdns not playing well with fork
The last commit (dbd66b6) I've shown from the history of master has already been merged into py24 (I hadn't yet figured this trick out), and so it will serve as our starting point. Now that I've read the documentation and figured out how to do this cleanly, the commands actually seem quite natural.
What we would like is to use git rebase to replay the differences between dbd66b6 and master on the end of py24. Because git rebase rewrites the branch it runs on, we check out a new branch at master rather than mangle master itself.
[py24] tilgovi@vissarion:~/src/gunicorn$ git checkout --no-track -b rebasing master
Switched to a new branch 'rebasing'
The git rebase <upstream> command rewinds the current branch to its common ancestor with <upstream> and replays the current branch's changes on top of the tip of <upstream>. It is useful for picking up upstream changes when <upstream> is a moving target like a remote branch. In our case, dbd66b6 is already an ancestor of our rebasing branch, so git rebase dbd66b6 by itself does nothing.
[rebasing] tilgovi@vissarion:~/src/gunicorn$ git rebase dbd66b6
Current branch rebasing is up to date.
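The no-op above can be checked mechanically. This is a small sketch on a toy repository, where the commit helper and the file names are hypothetical. It uses git merge-base --is-ancestor (available in reasonably recent versions of git) to confirm the ancestry, then compares the branch tip before and after a plain rebase.

```shell
set -eu

# Build a toy repository in a temporary directory.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # force the branch name used in this post
git config user.name "Demo"
git config user.email "demo@example.com"
git config commit.gpgsign false

# Hypothetical helper: commit FILE with MESSAGE.
commit() {
    echo "$2" > "$1"
    git add "$1"
    git commit -q -m "$2"
}

commit one.txt "first"
old=$(git rev-parse HEAD)                 # plays the role of dbd66b6
commit two.txt "second"
commit three.txt "third"

git checkout -q -b rebasing
before=$(git rev-parse HEAD)

# The old commit is already part of this branch's history...
if git merge-base --is-ancestor "$old" HEAD; then
    echo "$old is an ancestor of rebasing"
fi

# ...so a plain rebase onto it has nothing to move.
git rebase "$old"
after=$(git rev-parse HEAD)

if [ "$before" = "$after" ]; then
    echo "branch tip unchanged"
fi
```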
However, in this case we want to replay the changes since dbd66b6 on top of py24. The --onto option of git rebase does just this: dbd66b6 still marks where the replayed commits begin, while py24 names the new base they land on.
[rebasing] tilgovi@vissarion:~/src/gunicorn$ git rebase --onto py24 dbd66b6
First, rewinding head to replay your work on top of it...
Applying: allows worker_class uri shortcut. It's now possible to do :
Applying: Add pre/post request hooks
Applying: Initialize logging before setup since setup can emit warnings to log
Applying: Update documentation to reflect changes in aca70fb
Applying: Output config if debug loglevel is set
Applying: Minor white space and ordering fixes for my CDO
Applying: Update Python 2.4 installation note.
We've effectively cherry-picked all the changes between dbd66b6 and master, but we did so as though we were on the py24 branch! The replayed commits are new copies, so their hashes differ from the ones on master. Look!
[rebasing] tilgovi@vissarion:~/src/gunicorn$ git log --oneline -10
3836ac0 Update Python 2.4 installation note.
0eeb6d0 Minor white space and ordering fixes for my CDO
c1de2be Output config if debug loglevel is set
66040b8 Update documentation to reflect changes in aca70fb
e4b468b Initialize logging before setup since setup can emit warnings to log
fb56182 Add pre/post request hooks
698dee8 allows worker_class uri shortcut. It's now possible to do :
0a5f7b3 Merge branch 'master' into py24
c52efd5 Clean up minor differences from upstream
dbd66b6 work around evdns not playing well with fork
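To try the same move without touching a real project, here is a minimal sketch on a throwaway repository. The branch names match mine, while the commit helper, the file names and the commit messages are invented for the example. It replays two upstream commits onto the fork and checks that master itself is left alone.

```shell
set -eu

# Build a toy repository in a temporary directory.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # force the branch name used in this post
git config user.name "Demo"
git config user.email "demo@example.com"
git config commit.gpgsign false

# Hypothetical helper: commit FILE with MESSAGE.
commit() {
    echo "$2" > "$1"
    git add "$1"
    git commit -q -m "$2"
}

commit base.txt "shared base"             # plays the role of dbd66b6
base=$(git rev-parse HEAD)

# The fork carries one change upstream never took.
git checkout -q -b py24
commit fork.txt "fork-only change"

# Upstream moves on without the fork.
git checkout -q master
commit a.txt "upstream change 1"
commit b.txt "upstream change 2"
master_before=$(git rev-parse master)

# Work on a copy of master so master is never rewritten.
git checkout -q -b rebasing master

# Replay everything after $base onto the tip of py24.
git rebase -q --onto py24 "$base"

git log --oneline rebasing
replayed=$(git rev-list --count py24..rebasing)
echo "replayed $replayed commits on top of py24"
```

The count comes from py24..rebasing, which lists the commits reachable from rebasing but not from py24, in other words exactly the copies the rebase created.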
Now all we need to do is fast-forward py24 through these new commits from rebasing. Then we can drop our rebasing branch and call it a day.
[rebasing] tilgovi@vissarion:~/src/gunicorn$ git checkout py24
Switched to branch 'py24'
[py24] tilgovi@vissarion:~/src/gunicorn$ git merge rebasing
Updating 0a5f7b3..3836ac0
Fast-forward
doc/htdocs/install.html | 2 +-
doc/site/install.rst | 2 +-
gunicorn/arbiter.py | 17 ++++++---
gunicorn/config.py | 85 ++++++++++++++++++++++++++++++--------------
gunicorn/util.py | 8 ++++-
gunicorn/workers/async.py | 2 +
gunicorn/workers/sync.py | 2 +
7 files changed, 83 insertions(+), 35 deletions(-)
[py24] tilgovi@vissarion:~/src/gunicorn$ git branch -d rebasing
Deleted branch rebasing (was 3836ac0).
The result is that we cleanly placed all the new upstream changes onto our fork. Our revision history is clean and readable and life is beautiful.
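If you do this often, the four steps fold nicely into a small shell function. What follows is only a sketch: sync_fork, its argument order, the temporary branch name and the last-synced tag are names I made up for illustration, and a conflict during the rebase will stop the script partway and leave you to resolve it by hand.

```shell
set -eu

# sync_fork UPSTREAM FORK LAST_SYNCED (hypothetical helper)
# Replays the commits after LAST_SYNCED up to UPSTREAM onto FORK,
# then fast-forwards FORK and removes the temporary branch.
sync_fork() {
    upstream=$1
    fork=$2
    last=$3
    tmp="rebasing-$$"
    git checkout -q --no-track -b "$tmp" "$upstream"
    git rebase -q --onto "$fork" "$last"
    git checkout -q "$fork"
    git merge -q --ff-only "$tmp"
    git branch -q -d "$tmp"
}

# Toy repository to exercise the function.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # force the branch name used in this post
git config user.name "Demo"
git config user.email "demo@example.com"
git config commit.gpgsign false

# Hypothetical helper: commit FILE with MESSAGE.
commit() {
    echo "$2" > "$1"
    git add "$1"
    git commit -q -m "$2"
}

commit base.txt "shared base"
git tag last-synced                       # assumed marker for the last upstream commit in the fork
git checkout -q -b py24
commit fork.txt "fork-only change"
git checkout -q master
commit a.txt "upstream change 1"
commit b.txt "upstream change 2"

sync_fork master py24 last-synced

# Move the marker so the next sync starts from here.
git tag -f last-synced master > /dev/null

git log --oneline py24
total=$(git rev-list --count py24)
current=$(git rev-parse --abbrev-ref HEAD)
leftover=$(git branch --list 'rebasing-*')
```

The tag is there because the commits that land on py24 are copies with new hashes, so git can't work out by itself which upstream commits the fork already contains. Recording the upstream tip after each sync gives the next run its starting point.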