Git: Difference between revisions

Revision as of 00:37, 22 April 2005

In computing, git is an open source source management tool. Linus Torvalds started the project in April 2005 to manage development of the Linux kernel after he dropped the proprietary BitKeeper tool. That decision followed a controversy over developer Andrew Tridgell's attempt to reverse engineer the BitKeeper protocols.

Like BitKeeper, git does not use a centralized server. According to Torvalds, however, git is being developed solely to solve the problems he has in maintaining the code base and applying patches to the Linux kernel. As such, it lacks a number of features present in other source management tools. For example, git cannot handle file renames; a user must delete the file and recreate it elsewhere. Also, according to Larry McVoy, the creator of BitKeeper, git does not use space efficiently: "a tiny one-character change to a 1MB file in Git will result in a 2MB file, whereas BitKeeper's file will grow only by one byte." [1]
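The space trade-off McVoy describes can be illustrated with a toy model. The sketch below is not git's actual storage format: `ToyObjectStore` and `one_byte_edit` are hypothetical names, and the store simply keeps every file version whole, keyed by a SHA-1 hash of its contents, with no compression or delta encoding. Under those assumptions, a one-byte change to a 1 MB file adds a second full-size object.

```python
import hashlib

MEGABYTE = 1024 * 1024


class ToyObjectStore:
    """Hypothetical whole-file, content-addressed store (not git's real format)."""

    def __init__(self):
        self.objects = {}  # hex digest -> full file contents

    def put(self, data: bytes) -> str:
        # The key is derived from the contents, so identical data is stored once.
        key = hashlib.sha1(data).hexdigest()
        self.objects[key] = data
        return key

    def total_bytes(self) -> int:
        # Raw size of everything stored; compression is deliberately ignored.
        return sum(len(blob) for blob in self.objects.values())


def one_byte_edit(data: bytes, index: int) -> bytes:
    """Return a copy of data with the single byte at index changed."""
    changed = bytearray(data)
    changed[index] ^= 0x01
    return bytes(changed)


store = ToyObjectStore()
original = b"a" * MEGABYTE
edited = one_byte_edit(original, 0)

key_original = store.put(original)
key_edited = store.put(edited)

# Two distinct objects, each holding a complete copy of the file.
print(len(store.objects), store.total_bytes())
```

A store that recorded only the difference between the two versions would grow by roughly the size of the change instead, which is the behaviour McVoy attributes to BitKeeper.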

External links

PC World: "Torvalds seemed aware that his decision to drop BitKeeper would also be controversial. When asked why he called the new software, "git," British slang meaning "a rotten person," he said. 'I'm an egotistical bastard, so I name all my projects after myself. First Linux, now git.'"

Linux: Importing The Kernel Into git, Merging, from KernelTrap:

"From: Linus Torvalds [email blocked]
Subject: Re: full kernel history, in patchset format
Date: Sat, 16 Apr 2005 10:04:31 -0700 (PDT)

On Sat, 16 Apr 2005, Ingo Molnar wrote:
>
> i've converted the Linux kernel CVS tree into 'flat patchset' format,
> which gave a series of 28237 separate patches. (Each patch represents a
> changeset, in the order they were applied. I've used the cvsps utility.)
>
> the history data starts at 2.4.0 and ends at 2.6.12-rc2. I've included a
> script that will apply all the patches in order and will create a
> pristine 2.6.12-rc2 tree.

Hey, that's great. I got the CVS repo too, and I was looking at it, but the more I looked at it, the more I felt that the main reason I want to import it into git ends up being to validate that my size estimates are at all realistic.

I see that Thomas Gleixner seems to have done that already, and come to a figure of 3.2GB for the last three years, which I'm very happy with, mainly because it seems to match my estimates to a tee. Which means that I just feel that much more confident about git actually being able to handle the kernel long-term, and not just as a stop-gap measure.

But I wonder if we actually want to actually populate the whole history.. Now that my size estimates have been verified, I have little actual real reason to put the history into git. There are no visualization tools done for git yet, and no helpers to actually find problems, and by the time there will be, we'll have new history.

So I'd _almost_ suggest just starting from a clean slate after all. Keeping the old history around, of course, but not necessarily putting it into git now. It would just force everybody who is getting used to git in the first place to work with a 3GB archive from day one, rather than getting into it a bit more gradually.

What do people think? I'm not so much worried about the data itself: the git architecture is _so_ damn simple that now that the size estimate has been confirmed, that I don't think it would be a problem per se to put 3.2GB into the archive. But it will bog down "rsync" horribly, so it will actually hurt synchronization untill somebody writes the rev-tree-like stuff to communicate changes more efficiently..

IOW, it smells to me like we don't have the infrastructure to really work with 3GB archives, and that if we start from scratch (2.6.12-rc2), we can build up the infrastructure in parallell with starting to really need it.

But it's _great_ to have the history in this format, especially since looking at CVS just reminded me how much I hated it.

Comments?

Linus"
