I've fiddled with my blog template because I decided I wanted more horizontal viewing space, given that it was using less than a third of my 1920 horizontal pixels. If it feels too spread out for you, I added a drag-and-drop handle over to the left to let you resize the main content column. The javascript is pretty primitive. If it breaks, drop me a comment.
>
>
>
>

Friday, August 7, 2009

The Journey to Git, Part VIII--Connecting Git to Subversion

In this post, we'll start seeing how to use Git as a client to a Subversion repository. This is an excellent way to get your feet wet with Git without forcing the learning curve on others working on the same project. It might also be a useful intermediate step in moving from SVN to Git by getting all the members of a team accustomed to Git while still having their old SVN client as backup in case they get lost. As has happened previously, when I got to the end of what I thought was one post, I decided it was way too long, so I'm breaking it up into two pieces. This piece discusses cloning an existing Subversion repo and what you'll have after you do that. The next one explains the commands you use to trade commits with Subversion: the equivalents of "svn commit" and "svn update".

Before you dig in here, you should be able to use basic Git commands like commit, checkout, and branch. If you're not comfortable with that, have a look at my earlier posts:

TOC

Articles in this series:

Setting Up for Subversion Interaction

All Subversion interaction is done through a set of special sub-commands that start with "git svn". If you're running Git through Cygwin, there's two things to note. (Non-Cygwin-users can skip to the next paragraph.) First, you need to install the "subversion-perl" package in Cygwin to be able to use the "git svn" set of commands. Second, a change introduced to Cygwin a few months ago slightly broke "git svn" under Cygwin. See this message for a description of the problem and a workaround for it. When I refer to the "git svn rebase" command in the next post, you'll need to use the mentioned workaround in its place. It may also affect the "git clone" command, but I've neither checked it for myself nor seen any reports on it.

Only Cygwin Git users need to do anything special. On Linux and in mysysgit, everything is already in place. I expect the primary way that Java developers will start using Git is by cloning an existing SVN repo, and that's what I'm going to go through here.

Cloning an Existing Subversion Repo

To use Git to work against an existing SVN repository, your first step is to clone it. Remember that Git is a Distributed VCS, meaning you have your own copy of the entire repository. Cloning SVN is one way to get one.

Note: This can take several hours on a large project with a moderate number of branches because of the way SVN stores branches and the sheer number of files that have to be transferred!

If you're ready to start the clone, get the URL of your SVN repo, and switch to the directory in which you want your project directory to live. The "git clone" command creates a subdirectory and checks out the project in it. For a Subversion repo using the standard directory layout--that is, directories named trunk, branches, and tags--run:

git svn clone --stdlayout --username=<your username> <svn url> foo

If your repository structure differs from the standard layout, use this form instead:

git svn clone --trunk=<trunk dir> --tags=<tags dir> --branches=<branches dir> --username=<your username> <svn url> foo

Username is only required if using authentication, obviously, and "foo" is the name of the directory to create to hold the project. If you don't provide this, then the last bit of the URL--after the final '/'--will be used as the directory name.

Working with a Git Clone of a Subversion Repository

After a successful clone, the target directory will have a typical Git repository and working tree in it. Your standard master branch will be there, and master's HEAD is what's checked out. In my experience, the content of master will be the Subversion branch with the most recent commit on it, but I haven't seen this behavior documented anywhere. You can see all of your Subversion branches and tags by running:

git branch -r

Note: I haven't covered remote Git interaction yet, so this may stray a bit into unfamiliar territory. Just understand that master is your local branch, where you do your work. If you were to "git commit" something, this is where that commit would go. All the Subversion branches you just saw are called "remote-tracking branches". For the most part, you can pretend they're not there. They just act kind of like a mirror of the Subversion repo so that you always have a copy of it around. The usefulness of this will become apparent later. Finally, master "tracks" one of the remote Subversion branches, meaning initially it contains exactly the same commits as that branch, and when you commit to or update from Subversion, that's the branch you'll be interacting with.

So now you have a Git copy of your SVN repo. What next? Well, now you can develop away using Git just like you always would: make commits, branch, merge, rebase, etc. There's just one caveat: don't fool with the history that came from Subversion. Don't try to rebase and change SVN commits around, for example. Just treat the commits from SVN as read-only. Immutable. Untouchable. Get the idea? If you screw with SVN's tiny brain in that way, don't come back to me unless it's only to describe how your SVN or Git repo melted down! I'd be interested to hear about that. You'll know the SVN commits because when you "git log", you'll see a special "git-svn-id" line in each SVN commit. Other than that, it's open season for making changes.

Handling Different Subversion Branches from Git

Of course, there's just the one local branch--master--and it's tracking just the one Subversion branch. That means any changes you make on master will always be sent back to that same Subversion branch. How do we send changes to a different branch? Just create a new local branch that tracks the remote-tracking branch you want to interact with. To create and switch to it with one command, run:

git checkout -b <new branch name> --track <remote branch name>

Now all commits made in your Git branch <new branch name> will eventually be sent to the Subversion branch <remote branch name>, and when you update from Subversion on that local branch, you'll get changes from that remote branch. Remember that you can use "git branch -r" to see all the remote branches that Git knows about. If you don't see the branch you want, then you'll need to use this command to refresh your remote-tracking branches with the latest Subversion changes, including new branches:

git svn fetch

Note: If there's a new branch to fetch, it can take a while, though not as long as the initial clone.

The one thing to keep in mind when branching in Git is to make sure to keep the history linear from Subversion's perspective. Don't branch from master and try to commit both the branch and master back to Subversion. Either merge them together, or commit one back, update the other to get it current, and then commit it, too. Again, I'm only interested in hearing about the details of the meltdown.

Those are the high points of cloning a Subversion repo and working in the clone. The next step is to be able to send commits back to Subversion and update the clone that you've made when someone else commits.