I've fiddled with my blog template because I decided I wanted more horizontal viewing space, given that it was using less than a third of my 1920 horizontal pixels. If it feels too spread out for you, I added a drag-and-drop handle over to the left to let you resize the main content column. The javascript is pretty primitive. If it breaks, drop me a comment.

Saturday, July 18, 2009

The Journey to Git, Part VI--Rewriting History

Welcome to my sixth (!!!) post on Git. I totally didn't expect this to go so long. In this post, we'll look at some methods Git gives you to change the history of your files. With centralized version control, there's a strong tendency to consider things irrevocable once committed. I'll show you that Git has no such constraints.

This post assumes you've read the previous ones in the series or are at least familiar enough with Git to use some of its basic commands. If you've been following along with the commands I've given in previous posts, this continues to build on the repository you've created as a result. If not, just create a repo and commit a file named "foo" to it with a line of text in it. Then add another line and commit again. This should give you enough to work with to see how the commands in this post work.


Articles in this series:

Changing History

Git isn't nearly so picky as SVN about changing or undoing things that you've done in version control. It supplies some very handy options for doing just that quickly and easily.

Note: While the commands are available, consider carefully what you're doing when you use them. Have the commits you're changing already been pushed or pulled out somewhere else? If so, will you or Git be able to resolve the differences if you start messing with the history? The easiest thing is to only change the history if the commit hasn't been propagated to another repository!

Amending Commits

The first command is a relatively simple one. Run:

git commit --amend -m "I decided to change this message"

Now "git log" will show that your most recent commit on the current branch has a different message. That's just the tip of the iceberg, though. You can amend a commit in any way. Modify the file foo again, stage the change, and do another amend commit:

echo "Another change to this file" >> foo
git add foo
git commit --amend -m "Foo needed to change again in this commit"

You can change your most recent commit in any way by simply staging some changes and using this command.

Undoing Commits with Reset

A more powerful history-altering command is "git reset". There are several options you can give this that do subtly different things. Let's look at the default form first. Run:

git reset HEAD^

The "HEAD^" means one commit before HEAD. This is a quick way of referring to commits rather than using "git log" every time to look up the hash. The carets stack, so "HEAD^^^^" would be four commits before HEAD, which can also be expressed as "HEAD~4".

Back to the reset: when you reset to some commit, you're making that commit the latest on the current branch. Any commits after it go away, but the changes they made aren't lost. All the changes from those dropped commits end up in your working tree, so "git status" will show you that foo is modified, and "git diff" will show that the change is the string that you most recently added to the file and committed.

Alternate forms of this, which we won't try here, are with a --hard and a --soft option (the default option is --mixed). With --soft, you get the same behavior as the default, but the changes end up in the index instead of your working tree. With --hard, you drop back to some commit, but all of your changes are lost. Use with caution. This is the Git equivalent of "rm -rf *". Note that the default target of reset is HEAD, meaning that "git reset --hard" will just erase all the work you've done since your last commit. This can be useful at times.

Totally Thrashing Commits with Rebase

Note: I believe you have to have your core.editor configuration option defined in order to use this since it relies on being able to edit text. See post two from this series for configuration stuff.

If you're only familiar with SVN, this one will blow your mind. Use "git log" to get the hash of your first commit, and run:

git rebase -i <hash>

In your text editor, you'll be presented with a list of commits and some basic instructions. The commits are all of the ones from the one after the commit you specified to the current commit--that's <selected commit> + 1 through HEAD, inclusive. By moving the commit lines around, you'll change the order of the commits in this branch. If you decide a commit is trash, delete the line, and when the rebase completes, the commit will be gone. Git does this by removing all of the commits you selected from the branch and then reapplying them in the order you specify.

In addition, you can specify a couple of other things. First, you can choose to amend any of the commits as they're reapplied by simply changing the "pick" before that commit to "e" or "edit". Remember when I said that "git commit --amend" only applies to the most recent commit? (I did say that, right? Back in post three, I think?) Well, that's technically true, but it doesn't mean you couldn't use rebase to go back ten commits to do an amend.

Finally, there's the squash option. "Squash" means compress two or more commits into one. Try this:

echo 1 >> foo
git commit -am "squashme"
echo 2 >> foo
git commit -am "squashme"
echo 3 >> foo
git commit -am "squashme"
git rebase -i HEAD^^^

Note: You'll notice that I used a new flag on commit that let me skip the "git add". Using -a with commit tells it to automatically add all modifications before committing. It only picks up changes to tracked files. The -a flag won't add any new files.

Now, let's squash the commits we made. Select "squash" or "s" for all three of the commits, save, and exit. You'll see the rebase happening, and at the end, you'll be prompted for a new commit message.

This interactive rebase (i=interactive) really points to how Git differs from centralized version control with respect to its attitude toward history. In Git, history is completely mutable until it gets too complicated to control by being replicated to other repositories. As long as it only exists in your own repo, you can change practically anything about it. Git views commits as part of the development process, and therefore recognizes them as things that might need to be cleaned up a bit after the fact. In, for instance, SVN, on the other hand, the general attitude is that once it's in the repository, it's set in adamantium.

Note: I believe that in newer versions of Git, the pop command has been removed in favor of apply, which also existed in older versions, but had different behavior. Check "git help stash" to see the commands available.

In the next--and hopefully last--post that deals with using Git locally, I'll show you how to navigate around your repository, and find out things you want to know about your files and history.

Vote for this article on DZone.


Tim Harper said...

I believe 'git stash pop' was added after 'git stash apply', in version 1.5.5. (commit bd56ff54 introduced it)

Ryan said...

Thanks for the input Tim. I couldn't find anything related to this with a quick google search. It might be that what I thought was a new version of Git was actually an old one.