Saturday, July 18, 2009

The Journey to Git, Part III—The Basics

This post: the most basic commands for interacting with a Git repository. By the end of this post, you should be able to use Git to track basic history for a project. Once again, I strongly encourage you to follow along with the commands in your own environment to help you learn.


In Git, if you don't have a repository, you've got nothing, so let's make a directory and put it under version control using Git. Then we can run some commands against it.

Creating a Repository

Make a directory named firstgitproject and change to it. Run:

git init

You'll see a message like "Initialized empty Git repository in <path to project>/firstgitproject/.git", and if you look in your project directory, you'll see that there is, indeed, a .git folder. This folder contains the repository you just created along with its configuration.
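If you'd rather try this somewhere disposable first, here is a minimal sketch of the same step. It assumes git and mktemp are installed; the temporary directory and the workdir variable are my own additions for illustration, not part of the walkthrough.

```shell
#!/bin/sh
set -e

# Assumption: work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
mkdir "$workdir/firstgitproject"
cd "$workdir/firstgitproject"

# -q only suppresses the "Initialized empty Git repository" message.
git init -q

# The repository lives in the .git folder inside the project directory...
ls -a

# ...along with its configuration.
cat .git/config
```

Deleting the .git folder would throw away the repository and leave you with a plain directory again.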

Status of a New Repo

Let's see what Git has to say about your brand new project with:

git status

You should see three things: that you're "On branch master", that this is the "Initial commit", and that you have "nothing to commit". The "master" branch is always the branch you start off in. In some ways, it's similar to SVN's "trunk", but unlike in SVN, there's nothing special about it. You can rename or delete master with no problem. It's just a branch like any other you might create. The "Initial commit" is a little mysterious in Git. It signifies that the project has no history yet. Many Git commands will exit with errors when run against this "Initial commit"--i.e. until you make your first commit to the repo--so let's work on that.

Adding Files/The First Commit

Create File--Working Tree

The directory that you're in, where the .git directory is and where all your files for this project would ultimately go, is what Git refers to as your working tree. It's where you do all your work. Let's add a file to your working tree:

echo "Hello World" > foo

Run "git status" again. Now you have an "untracked" file named foo. Untracked means what it sounds like: Git isn't tracking this file. It exists only in your working tree. Let's change that.

Stage Change--Index

Tell Git to track your newly created file with:

git add foo

Another "git status" now shows foo under "Changes to be committed". What you just did, in Gitspeak, is called "staging a change". You took something that has changed in your working tree (a "change") and told Git to include it in the next commit ("staged it") using the add command. Git has a name for this area where you stage changes: the index. I don't know the rationale behind the naming, but when you see the documentation refer to the "index", this is what it's talking about.
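The move from "untracked" to "staged" is easy to see in Git's script-friendly status format. This is only a sketch in a throwaway repository (the repo, before and after variables are mine); it uses "git status --porcelain", which prints a two-character code in front of each path.

```shell
#!/bin/sh
set -e

# Assumption: a scratch repository, separate from firstgitproject.
repo=$(mktemp -d)
cd "$repo"
git init -q

echo "Hello World" > foo

# "??" marks a file that exists only in the working tree.
before=$(git status --porcelain)

git add foo

# "A " marks a new file that is staged in the index.
after=$(git status --porcelain)

printf 'before add: %s\n' "$before"
printf 'after add:  %s\n' "$after"
```

The first column of the code describes the index and the second describes the working tree, which is why a staged new file shows an "A" followed by a space.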

Commit Change--Repository

The final step is to commit, which takes all the changes in your index and moves them into the repository. You should be familiar with the concept of a repository. It's where the full history of your project is kept. The general workflow, which we just simulated, is this: you have a working tree and index that are in sync with the repository--you've changed no files and staged no changes. You make changes, and now you have a dirty working tree. You stage some or all of the changes to the index, and now you have a dirty index (and, if you staged everything, no unstaged changes left in the working tree). Then you commit the staged changes to the repository, and once nothing is left unstaged, you're back to a clean state. Repeat.

So let's finish the cycle:

git commit -m "My first commit"

Note: If you omit the "-m <message>", Git opens an editor showing you the complete status message. Set the core.editor configuration option to choose which editor that is. Just type your commit message, save, and exit to complete the commit.

The response of this command should look something like:

[master (root-commit)]: created b1468b4: "My first commit"
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 foo

This indicates several things that I'm going to explain only briefly here. First, you made this commit on branch "master". This is the root-commit, meaning the very first in the repository--the one that went on top of the "Initial commit". A commit identifiable by "b1468b4" was created with the given message. In this commit, one file was changed, a single line was inserted, and no lines were deleted. Finally, this commit included the creation of a new file named "foo" with mode 100644 (that's the Linux file mode that indicates file permissions for you Windows guys).

At this point, some people may be surprised at the fact that we just made a commit without having any kind of Git server set up. Recall that Git is a Distributed VCS. You have an entire copy of the repository, and it lives in that .git directory, remember? So all that's involved in a commit is some local file operations. There's no need for any kind of client/server setup. Now back to our regularly scheduled programming.
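To convince yourself that nothing remote is involved, here is a rough sketch that commits in a scratch repository and then asks Git which remotes it knows about. The identity ("Example User", user@example.com) and the variable names are placeholders I made up so the commit succeeds on a machine with no Git configuration.

```shell
#!/bin/sh
set -e

# Assumption: a scratch repository, separate from firstgitproject.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical identity, stored only in this repository's .git/config.
git config user.name "Example User"
git config user.email "user@example.com"

echo "Hello World" > foo
git add foo
git commit -q -m "My first commit"

# No server has been configured, so the list of remotes is empty...
remotes=$(git remote)

# ...yet the commit exists in the local history.
commits=$(git rev-list --count HEAD)

echo "remotes: '$remotes'"
echo "commits: $commits"
```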

Modifying Files Under Version Control

Make Change--Working Tree

The file "foo" is now under version control. Let's make a change to it:

echo "Line 2" >> foo

Now a "git status" will indicate that foo is "Changed but not updated". This message is a little ambiguous. What it means is that the file is under version control, but the copy of the file in your working tree differs from that in the repository, and you haven't staged this change yet. It's time to look at a massively useful command, which you're surely familiar with from other VCSs:

git diff

This is probably the most common use of diff, though there are a host of other options available for it. I'll cover some of those later on.

You've now seen all three parts of the "git status" output. To recap, the output contains 1) the untracked files which aren't under version control, 2) files under version control that you've made changes to, and 3) changes--either new files or modifications to version controlled files--that you've staged and will be included in the next commit.
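As a sketch of that recap, the script below builds a scratch repository that has one file in each state at the same time. The extra files bar and baz, the placeholder identity, and the variable names are my own inventions for the example.

```shell
#!/bin/sh
set -e

# Assumption: a scratch repository, separate from firstgitproject.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"   # hypothetical identity

echo "Hello World" > foo
git add foo
git commit -q -m "My first commit"

echo "scratch" > baz      # 1) untracked: never added
echo "Line 2" >> foo      # 2) tracked and modified, but not staged
echo "staged" > bar
git add bar               # 3) staged for the next commit

# One line per path, prefixed by a two-character state code:
# "??" untracked, " M" modified but unstaged, "A " staged new file.
state=$(git status --porcelain)
printf '%s\n' "$state"
```

Running plain "git status" in the same directory shows the same three groups in the long format described above.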

You'll also see in this latest status output that it prompts you with a couple of ways to deal with this file. You can either "add" it--we'll do this in a moment--or "checkout" it. In SVN, you're used to "checkout" being a very rare command used only for the initial retrieval of a project from a repository. In Git, it's a much more common command which means, roughly, "get X from the repository and put it in my working tree". If you were to "git checkout foo", then foo would be replaced by the last committed version (strictly speaking, by the copy in the index, which at this point matches the last commit), removing your most recent changes. In this way, it acts like SVN's revert command. Let's not do that now.
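If you're curious what the "checkout" route looks like without risking your real project, here is a small sketch in a scratch repository. The placeholder identity and variable names are assumptions of mine; the "--" simply tells Git that what follows is a file name.

```shell
#!/bin/sh
set -e

# Assumption: a scratch repository, separate from firstgitproject.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"   # hypothetical identity

echo "Hello World" > foo
git add foo
git commit -q -m "My first commit"

# Make a change, then throw it away.
echo "Line 2" >> foo
git checkout -- foo

# foo is back to its committed content and the working tree is clean.
content=$(cat foo)
status=$(git status --porcelain)
echo "foo now contains: $content"
```

Be careful with this one in real work: the discarded change was never committed, so Git has no record of it.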

Stage Change--Index

Instead, stage your change with:

git add foo

This is something that regularly trips up SVN users trying out Git. In SVN, you only add newly created files. In Git, you "add" every change. It's actually a more consistent approach and very logical when you get used to the idea. Remember that "add" always stages a change--moves it into the index--and it's the index that gets committed. Another "git status" at this point will show you a similar output to before, when you had staged the newly created file, showing that a new file and a modified file are both just considered "changes" that have to be incorporated.
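One way to see that it's the index that gets committed is to change a file again after adding it. The sketch below does that in a scratch repository; the extra "Line 3" edit, the placeholder identity, and the variable names are assumptions for the demonstration.

```shell
#!/bin/sh
set -e

# Assumption: a scratch repository, separate from firstgitproject.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"   # hypothetical identity

echo "Hello World" > foo
git add foo
git commit -q -m "My first commit"

echo "Line 2" >> foo
git add foo               # stages the file as it is right now
echo "Line 3" >> foo      # a later edit that is NOT staged

# "MM": modified in the index and modified again in the working tree.
staged_state=$(git status --porcelain)

git commit -q -m "My second commit"

# The commit holds the staged content only, so "Line 3" is missing.
committed=$(git show HEAD:foo)
printf '%s\n' "$committed"

# "Line 3" is still waiting in the working tree as an unstaged change.
after_commit=$(git status --porcelain)
```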

Commit Change--Repository

Go ahead and commit now with:

git commit -m "My second commit"

Take note of the difference between the output from this commit and the previous one. It's somewhat briefer since this isn't the root-commit and there wasn't a new file created.

Seeing Your History

Now look at your commit history with:

git log

This brings up a really handy feature of Git. When the output of a command like "git log" or "git diff" exceeds what will fit on one screen in your terminal, Git automatically pipes it through a pager (less, by default), making it easily browsable. Once your history grows beyond a few commits, you'll see this behavior regularly with "git log".

In the commit log, you'll see for each commit 1) the unique identifier of the commit--a 40-hex-digit SHA-1 hash, 2) the author of the commit, 3) the timestamp of the commit, and 4) the commit message entered for the commit.
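Those four fields can also be pulled out one at a time with format placeholders, which is handy in scripts. A sketch, assuming a scratch repository with a made-up identity; the labels in the format string are mine.

```shell
#!/bin/sh
set -e

# Assumption: a scratch repository, separate from firstgitproject.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"   # hypothetical identity

echo "Hello World" > foo
git add foo
git commit -q -m "My first commit"
echo "Line 2" >> foo
git add foo
git commit -q -m "My second commit"

# %H = full hash, %an = author name, %ad = author date, %s = message subject
git log --format='commit:  %H%nauthor:  %an%ndate:    %ad%nmessage: %s%n'

# The same placeholders work for grabbing a single value.
hash=$(git log -1 --format=%H)
subject=$(git log -1 --format=%s)
author=$(git log -1 --format=%an)
```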

Removing Files from Version Control

Now for one more, trivial command:

git rm foo
git commit -m "I removed foo"

That's pretty self-explanatory, I think. It's what you do to delete files. Go ahead and add foo back, so that we can use it some more:

echo "new foo" >> foo
git add foo
git commit -m "Added foo back"
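For a closer look at what "git rm" does, the sketch below checks the working tree and the index after each step. It runs in a scratch repository with a placeholder identity, and the variable names are my own.

```shell
#!/bin/sh
set -e

# Assumption: a scratch repository, separate from firstgitproject.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"   # hypothetical identity

echo "Hello World" > foo
git add foo
git commit -q -m "My first commit"

# git rm deletes the file from the working tree AND stages the deletion.
git rm -q foo
removed_state=$(git status --porcelain)    # "D " = deletion staged
if [ -e foo ]; then file_exists=yes; else file_exists=no; fi

git commit -q -m "I removed foo"
tracked_after_rm=$(git ls-files)           # nothing is tracked now

# Adding it back is just a new file, staged and committed as before.
echo "new foo" >> foo
git add foo
git commit -q -m "Added foo back"
tracked_after_add=$(git ls-files)

echo "tracked files now: $tracked_after_add"
```

The earlier versions of foo are still in the history, so removing a file never loses what was already committed.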

View Previous Versions

Establishing a history of things isn't very useful if you can't look back through the history. Here's the basic form of two commands to let you get a feel for a simple history. Later on, I'll do a whole post on navigating around a repository and seeing what's in it.

Show a Previous Version

Use "git log" to get the hash of a previous commit, and run:

git show <hash>:foo

That will show you the content of the file foo at that commit. Now use "git log" to pick two different commits, and run:

git diff <older hash> <newer hash>

You'll see the differences between the two versions in patch format. Added lines have a '+' in front, and deleted lines have a '-'.
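If copying hashes out of "git log" by hand gets tedious, the same two commands can be scripted. This sketch assumes a scratch repository with two commits and a placeholder identity; it uses "git rev-parse" to look up the hashes, with HEAD meaning the latest commit and HEAD~1 the one before it.

```shell
#!/bin/sh
set -e

# Assumption: a scratch repository, separate from firstgitproject.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"   # hypothetical identity

echo "Hello World" > foo
git add foo
git commit -q -m "My first commit"
echo "Line 2" >> foo
git add foo
git commit -q -m "My second commit"

# Resolve the two commits to their full hashes.
older=$(git rev-parse HEAD~1)
newer=$(git rev-parse HEAD)

# Content of foo as it was in the older commit.
old_content=$(git show "$older:foo")
echo "foo at $older:"
printf '%s\n' "$old_content"

# Differences between the two commits, in patch format.
patch=$(git diff "$older" "$newer")
printf '%s\n' "$patch"
```

In the patch, the line added by the second commit appears with a '+' in front of it.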

I'm going to break this post here, though there's lots more to cover. By now, however, you know all you need in order to keep a simple, forward-moving history of any personal project you happen to be running. It's also handy for versioning configuration files if you run, for instance, a Squid proxy server or other local service with complex, text-based configuration. The next couple of posts will primarily cover branching and merging, where Git really shines in comparison to centralized version control.

First, branching: Part IV--Branching
