Gitology #2 - git-retext

Posted October 23, 2020 ‐ 5 min read

This is the second post in a series that expands on various utilities I wrote to assist my work with Git. Some of these utilities are located in a repository on GitHub called misc-gitology.

Previous post: #1.

Today I'll introduce the commit rewriter - git-retext.

Familiar ways of Git history editing

In the previous post I mentioned the desire to present a clear Git history to reviewers. Sometimes, we are reviewing our own unpublished Git history and would like to do some polishing here and there before pushing it to a remote repository.

There are a multitude of ways in which the history can be edited:

  • git commit --amend, with or without -a (in the latter case, git add is used to stage the changes first). This takes care of the HEAD commit, but not of the commits that precede it.
  • git rebase, where you work on a fixup commit or a set of them, with or without --interactive, with or without --autosquash, so that the fixup changes amend commits further down the history. Tools such as git-absorb can automate the creation of fixup commits.
  • git-filter-branch, a big hammer that lets you run a command per commit, where the result of the command is the rewritten tree. Obviously not quite suitable for easy 'final touches'.
  • Editor environment features. Emacs has Magit, Vim has Vimagit, and other editors may have their own Git integrations, some of them elaborate enough to ease editing the history. However, to each his own, and dwelling on a particular editor environment would get us off-topic.

A new way - git-retext

In the process of reviewing, we are most likely looking at diffs. What if it were possible to just edit the diff, in an environment-independent way?

Before Git was invented, developers used emails (and some still do) to pass changes along to one another. The git format-patch command knows how to turn a list of commits into such emails. Since plain-text emails can easily be edited in a text editor, it stands to reason that we can edit changes in place if we momentarily turn commits into emails and back.

This is what git-retext essentially lets you do.

For example, suppose we want to edit the most recent commit in the history. Let's issue the command:

git-retext HEAD~1

This git-retext command turns the HEAD~1..HEAD range into emails, and we immediately find ourselves thrown into the default text editor, the one configured either in Git's configuration or in the EDITOR environment variable. The "thrown into an editor" situation should not be foreign to Git users, as it is part of the default git commit and git rebase -i workflows.
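For the curious, here is a minimal sketch of how that editor choice could be resolved, following the order of preference documented for git var GIT_EDITOR. The function name pick_editor and its parameters are my own illustration, not something taken from git-retext, and the sketch ignores Git's special handling of dumb terminals.

```python
import os
import shlex


def pick_editor(core_editor=None, env=None):
    """Return the editor command as an argv list.

    Hypothetical helper: `core_editor` stands for the value of Git's
    core.editor setting (None if unset), and `env` defaults to the
    process environment.
    """
    env = os.environ if env is None else env
    # Documented preference order: $GIT_EDITOR, core.editor, $VISUAL,
    # $EDITOR, then a built-in default (usually vi).
    candidates = [
        env.get("GIT_EDITOR"),
        core_editor,
        env.get("VISUAL"),
        env.get("EDITOR"),
        "vi",
    ]
    command = next(c for c in candidates if c)
    # Git hands the command to a shell; shlex.split is an approximation
    # that is good enough for values such as "code --wait".
    return shlex.split(command)
```

In a real script it is probably simpler to ask Git directly by running git var GIT_EDITOR, which prints the editor Git itself would launch.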

Here's an example of such a git-retext HEAD~1 invocation in one of my repositories:

From bbb92d7cc4ac33bd0e368164d555e4e93a66e658 Mon Sep 17 00:00:00 2001
From: Dan Aloni <alonid@gmail.com>
Date: Thu, 21 May 2020 09:56:01 +0300
Subject: [PATCH] Apply random rotation for new piece

By applying a random set of rotation, the new piece is effectively
rotated to all possible rotations.

---
 src/main.rs | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/main.rs b/src/main.rs
index de5aba608ec3..d2741fa61fe2 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -161,6 +161,9 @@ impl Game {

         self.falling = self.possible_pieces[idx].clone();
         self.shift = (0, 0);
+        for _ in 0 .. rng.gen_range(0, 4usize) {
+            self.rotate(false)
+        }
     }

     fn render(&self, window: &mut PistonWindow, event: &Event) {
--
2.26.2

This is what a commit looks like when it is sent by email. It is possible to edit the first line of the commit message in the Subject: header, the rest of the commit message before the ---, and the diff itself. As for the diff, we don't have to fix the unified diff's metadata (such as the line counts in the hunk headers), because git-retext's processing takes care of that.
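To make the metadata point concrete, here is a simplified sketch of what recounting a hunk header involves. git-retext delegates this to recountdiff; the code below is not recountdiff's algorithm, only an illustration of the idea, and the names recount_hunks and ends_hunk are mine. It recomputes only the line counts and leaves the start offsets alone.

```python
import re

HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,\d+)? \+(\d+)(?:,\d+)? @@(.*)$")


def ends_hunk(line):
    """True if `line` cannot belong to a hunk body."""
    if line == "-- ":
        # Signature separator emitted by format-patch. Assumption: no
        # removed line in the diff looks exactly like this.
        return True
    # Body lines start with ' ', '+', '-' or '\'; an empty line is taken
    # as a context line whose leading space was stripped.
    return line != "" and line[0] not in " +-\\"


def recount_hunks(lines):
    """Return `lines` (no trailing newlines) with hunk counts recomputed."""
    out = []
    i = 0
    while i < len(lines):
        match = HUNK_HEADER.match(lines[i])
        if match is None:
            out.append(lines[i])
            i += 1
            continue
        old_start, new_start, tail = match.groups()
        body = []
        i += 1
        while i < len(lines) and not ends_hunk(lines[i]):
            body.append(lines[i])
            i += 1
        # Context lines count on both sides; '-' only on the old side,
        # '+' only on the new side; '\' marker lines count on neither.
        old_count = sum(1 for l in body if l[:1] in ("", " ", "-"))
        new_count = sum(1 for l in body if l[:1] in ("", " ", "+"))
        out.append(f"@@ -{old_start},{old_count} +{new_start},{new_count} @@{tail}")
        out.extend(body)
    return out
```

If we added one more line to the hunk in the example above, the stale +161,9 in its header would come out as +161,10, with the rest of the email untouched.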

After we save the temporary file and exit the editor, git-retext tries to apply the edited commits using git am. If we edited them carefully, they should apply cleanly. The resulting HEAD is the edited history.

Of course, this new method is not perfect, but it's a time saver for a certain set of editing activities. After getting accustomed to it, you may start reviewing changes in git-retext even when you don't end up modifying them, instead of running git-diff or viewing a diff produced by the editor environment.

Advantages

  • Easy to amend changes right when they are being reviewed, as long as the editing is not too complicated. Adding new diff lines inside hunks is possible.
  • Easy to do search and replace on the changes themselves, which also covers the commit message and, if relevant, added filenames.
  • It is possible to remove unwanted diff hunks.
  • Splitting a commit into smaller new commits is possible by adding a new email header in between hunks (no need to edit the diffstat).

Disadvantages

  • If there is more than one commit, and the commits depend on one another, we must take extra care that the edits are consistent and that the diff hunk context stays correct.
  • It's easy to mess up the editing and discover the mess only when git-retext fails to apply the changes.

Future enhancements of git-retext should assist in avoiding mistakes during editing, perhaps with better editor integration.

How it is implemented

git-retext is a Python 3 script that relies on recountdiff to fix the edited diffs. The underlying git commands it uses are git-format-patch, git-am, and git-reset.
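As a rough illustration of how these three commands fit together, here is a minimal sketch of the round trip. It is not the actual git-retext source: the names retext, run_git and the edit callback are hypothetical, the recountdiff step is left out, and a clean working tree is assumed, since git reset --hard discards uncommitted changes.

```python
import subprocess


def run_git(args, stdin=None):
    """Run a git command and return its stdout; raises on failure."""
    result = subprocess.run(
        ["git", *args], input=stdin, capture_output=True, text=True, check=True
    )
    return result.stdout


def retext(base, edit, git=run_git):
    """Rewrite base..HEAD through an editable mailbox and return the new HEAD.

    `edit` is a hypothetical callback that receives the patch text and
    returns the edited text (for example by launching an editor on a
    temporary file). `git` is injectable so the flow can be tested.
    """
    original_head = git(["rev-parse", "HEAD"]).strip()
    # Turn the commit range into a single mailbox on stdout.
    patches = git(["format-patch", "--stdout", f"{base}..HEAD"])
    edited = edit(patches)
    if edited == patches:
        return original_head  # nothing changed, leave the history alone
    # Rewind to the base, then replay the edited emails on top of it.
    git(["reset", "--hard", base])
    try:
        git(["am"], stdin=edited)  # git am reads the mailbox from stdin
    except subprocess.CalledProcessError:
        try:
            git(["am", "--abort"])
        except subprocess.CalledProcessError:
            pass  # no am session was left in progress
        git(["reset", "--hard", original_head])  # restore the old history
        raise
    return git(["rev-parse", "HEAD"]).strip()
```

Restoring the original HEAD on failure is a design choice of this sketch: a botched edit then costs only the editing time, not the commits themselves.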

