Gitology #1 - git-flip-history

Posted October 16, 2020 ‐ 5 min read

This is the first post in a series to expand on various utilities I wrote to assist my work with Git. Some of these utilities are located in a repository on GitHub called misc-gitology.

Today I'll introduce the history flipper - git-flip-history.

The problem with splitting commits

When working with Git and browsing other developers' commit history, it becomes clear that separating logical changes into their own commits provides value to reviewers. It also makes it easier to revert changes if needed, and to bisect for bugs.

However, it is often hard to abide by this rule. When we set out to make a change, we sometimes find ourselves refactoring or making more changes along the way, and we end up with many preceding and/or succeeding changes to support the change we wanted to make in the first place.

Managing several closely related changes together while they are still "hot on the stove" really depends on the development style and organization of the person doing it. Some manage to take every change to its own branch and commit it right away, nice and clean. Others have the discipline to split their work into separate commits as they go and rebase-squash them with fixups. Still others cannot afford the many context switches that this would involve, and prefer to delay "feature splitting" to the very end, once everything feels more mature and well-formed. The utility discussed in this post is most helpful to the latter group of developers.

Splitting a large commit that is currently at HEAD involves some Git maneuvering: starting with git reset, proceeding to git add and git add -p, and an occasional git commit. However, this has a limitation: some logical changes may depend on one another, and in that case git add -p is of little help. For example, suppose two features each add an entry to a global list, modifying the same location in a file.
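As a rough illustration of the conventional approach, here is a minimal sketch in a throwaway repository. The file names, commit messages, and the demo identity are all made up for the example, and the two changes live in separate files so that plain git add is enough.

```shell
# Throwaway demo; every file name and commit message here is made up.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q .
echo base > base.txt && git add . && git commit -q -m "Base"

# One big commit containing two independent changes.
echo a > feature_a.txt; echo b > feature_b.txt
git add . && git commit -q -m "WIP - two changes packed into one"

# Undo the commit but keep its changes in the working tree.
git reset -q HEAD~1

# Stage and commit each change separately. For changes that share a file,
# this is where the interactive `git add -p` would come in.
git add feature_a.txt && git commit -q -m "Add feature A"
git add feature_b.txt && git commit -q -m "Add feature B"

git log --format=%s
```

This works because the two changes never touch the same lines. Once they do, picking hunks apart becomes the hard part.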

History flipping to the rescue

Another way to tackle the problem of untangling features from one another is this - if I can do the work to remove the features one by one, it would be the mirror image of the work of rewriting these features from scratch, and in that case I just need an elaborate Git trick to flip the history on itself!

In further detail, I have a large commit implementing A+B+C. If I write a new commit that reverts A, and another one that reverts B, and flip the history of the three versions, the result would be a history that implements all three features separately.

This is where git-flip-history comes into play. It also makes sure that the commit log looks sane afterward.

For example, we have the following commit, with a tentative commit message:

219bd740a8f1 WIP - three changes packed into one

Our purpose is to reach a state where we have three commits, each describing a separate feature (it can also be a bugfix or any other kind of change).

Here, we'll manually revert the first two features in two commits, giving each one a specially crafted commit message with a Revert: prefix. This can be done in the editor, and git checkout can also assist. Once we're done, we'll reword the original commit so that it carries a First: prefix and describes the remaining feature, the one we have not reverted (i.e. whatever change the diff from HEAD~3 to HEAD shows). We are free to write full commit messages, as long as the Revert: and First: prefixes are there.

Following our work to revert the features and reword the commits accordingly, our history looks like this:

509c3964befd Revert: Ignore type aliases
8941699bfb69 Revert: Treat DefKind::Mod as being in the value namespace
372d10a7ffcc First: Have trait names on their own namespace
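A history of this shape can be reproduced in a throwaway repository. The sketch below is only an illustration: the files a.txt, b.txt, and c.txt stand in for features A, B, and C, and the big commit is reworded with git commit --amend while it is still at HEAD, which is a convenience of this non-interactive demo rather than the order described above.

```shell
# Throwaway demo; file names, messages, and identity are made up.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)" && git init -q .
echo base > base.txt && git add . && git commit -q -m "Base"

# The big commit: features A, B, and C all at once.
echo a > a.txt; echo b > b.txt; echo c > c.txt
git add . && git commit -q -m "WIP - three changes packed into one"

# Reword it to name the feature that will remain after the reverts.
git commit -q --amend -m "First: Implement C"

# Remove one feature per commit, by whatever means is convenient.
git rm -q b.txt && git commit -q -m "Revert: Implement B"
git rm -q a.txt && git commit -q -m "Revert: Implement A"

git log --format=%s
```

The log now lists the two Revert: commits on top of the First: commit, which is the input that the flipping step expects.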

Now that we are ready, we run git-flip-history. This requires no user input, and afterward the history at HEAD looks like this:

a086ec0322a4 Treat DefKind::Mod as being in the value namespace
9d78613ed775 Ignore type aliases
86d20b3e77dc Have trait names on their own namespace

Observing the changes with git log -p, we should now see a sane history, where each commit adds the feature that its commit message talks about. It is worth stressing that flipping the history like this does not require us to resolve any conflicts, unlike git rebase -i in the case where changes overlap in the same regions of a file. Untangling the changes from one another is done only in the process of creating the Revert: commits, and you are free to use whatever method you like to create them, e.g. git checkout -p or manual editing.

Further discussion

To explain how this is possible, let's look at it from another angle.

We have the following history:

HEAD     Revert A          This tree has C
HEAD~1   Revert B          This tree has A+C
HEAD~2   Implement A+B+C   This tree has A+B+C

By flipping, we now have this (start by comparing the third column):

HEAD     Implement B       This tree has A+B+C
HEAD~1   Implement A       This tree has A+C
HEAD~2   Implement C       This tree has C

How it works

The git-flip-history program is a bash script that walks back through the history, collecting the commits with a Revert: prefix and stopping at the commit with the First: prefix. The history recreation process does not touch the working tree, because it only invokes the git commit-tree command. The tree of the resulting topmost commit is identical to that of the original First: commit, the one that contains all the features, and the branch is moved to the new history using git reset --hard. The script currently requires a clean Git status for good measure (this may change in the future).
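The description above can be sketched in a few lines of shell. This is not the actual git-flip-history script, only a minimal approximation under some assumptions: the function name flip_history is hypothetical, the First: commit is assumed to have a parent, and the new commits take their author and committer from the current environment instead of preserving the originals.

```shell
# Hypothetical sketch of history flipping; NOT the real git-flip-history.
flip_history() {
    local rev subject first="" chain="" parent msg_src msg

    # We finish with a hard reset, so refuse to run on a dirty tree.
    [ -z "$(git status --porcelain)" ] || { echo "tree not clean" >&2; return 1; }

    # Walk back from HEAD, newest first, until the "First:" commit.
    for rev in $(git rev-list --first-parent HEAD); do
        subject=$(git log -1 --format=%s "$rev")
        chain="$chain $rev"
        case "$subject" in
            "Revert: "*) ;;
            "First: "*)  first=$rev; break ;;
            *) echo "unexpected commit: $subject" >&2; return 1 ;;
        esac
    done
    [ -n "$first" ] || { echo "no First: commit found" >&2; return 1; }

    # Assumption: the First: commit is not a root commit.
    parent=$(git rev-parse --verify -q "$first^") || return 1

    # Reuse the old trees in reverse order. The smallest tree (old HEAD)
    # gets the First: message; every later tree gets the message of the
    # revert that had removed the feature it brings back.
    msg_src=$first
    for rev in $chain; do
        msg=$(git log -1 --format=%B "$msg_src" | sed -E '1s/^(Revert|First): //')
        parent=$(printf '%s\n' "$msg" | git commit-tree "$rev^{tree}" -p "$parent") || return 1
        msg_src=$rev
    done

    # Point the current branch (and the working tree) at the new history.
    git reset -q --hard "$parent"
}
```

Calling flip_history inside a repository whose tip consists of Revert: commits on top of a First: commit should leave the same trees in the opposite order, with the prefixes stripped from the messages. Because no tree is ever merged or re-applied, there is nothing that could conflict.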

