Git
In computing, Git is an open source project started by Linus Torvalds to manage the source code of the Linux kernel. It was first released on April 7, 2005, shortly after Torvalds dropped the proprietary BitKeeper software. That decision followed a controversy in which the developer Andrew Tridgell attempted to reverse engineer the BitKeeper protocols.
Like BitKeeper, Git does not use a centralized server. However, Torvalds did not design Git as a traditional source code management (SCM) system. He explains, "in many ways you can just see git as a filesystem - it's content-addressable, and it has a notion of versioning, but I really really designed it coming at the problem from the viewpoint of a filesystem person (hey, kernels is what I do), and I actually have absolutely zero interest in creating a traditional SCM system."
Git has two data structures: an object database and a directory cache. There are three types of objects:
- A blob object is the content of a file.
- A tree object is a list of blob objects and any information associated with each blob object, such as the file name and file permission. This object describes the source tree at a certain time.
- A commit object provides the history of the source tree. It contains a log message, a reference to a tree object, and pointers to its parent commit objects (none for the first commit, more than one for a merge).
The object database can hold any kind of object. The directory cache holds a tree object; it may or may not reflect the current state of the source tree.
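The relationships between the three object types can be sketched in a few lines of Python. This is only an illustrative in-memory model, not Git's storage format: the class names `Blob`, `Tree` and `Commit` and the helper `history` are hypothetical, and the objects point at each other directly instead of by hash. The first commit in the sketch has no parent.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Blob:
    data: bytes  # the content of one file


@dataclass(frozen=True)
class Tree:
    # One (file name, permission bits, blob) entry per file in the source tree.
    entries: Tuple[Tuple[str, int, Blob], ...]


@dataclass(frozen=True)
class Commit:
    message: str
    tree: Tree
    parents: Tuple["Commit", ...] = ()


def history(commit):
    """Yield log messages from `commit` backwards, following the first parent."""
    while commit is not None:
        yield commit.message
        commit = commit.parents[0] if commit.parents else None


v1 = Commit("initial import", Tree((("README", 0o644, Blob(b"hello\n")),)))
v2 = Commit(
    "expand README",
    Tree((("README", 0o644, Blob(b"hello, world\n")),)),
    parents=(v1,),
)
messages = list(history(v2))
```

Walking from `v2` to `v1` recovers the history, and the older blob remains reachable through the older commit's tree.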
Each object is identified by the SHA-1 hash of its contents. Git computes this hash and derives the object's location from it: the object is placed in a directory named after the first two characters of the hash, and the remaining characters are used as its file name.
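The naming scheme can be illustrated with a short Python sketch. It makes two assumptions that the text does not settle: the function names `object_id` and `object_path` are hypothetical, and so is the root directory name `objects`. The sketch also hashes the bytes exactly as given, without deciding which bytes Git itself feeds to the hash function.

```python
import hashlib


def object_id(contents: bytes) -> str:
    """Return the SHA-1 hash of the given bytes as 40 hexadecimal characters."""
    return hashlib.sha1(contents).hexdigest()


def object_path(hex_id: str, root: str = "objects") -> str:
    """Split the hash into a two-character directory and a file name."""
    return f"{root}/{hex_id[:2]}/{hex_id[2:]}"


example_id = object_id(b"hello\n")
example_path = object_path(example_id)
```

Because the name is derived from the content, two objects with identical content map to the same path and are stored only once.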
Git stores each revision of a file as a unique blob object. The relationships between the blobs can be found by examining the commit objects. Normally a blob is stored in its entirety using zlib compression, but this can quickly consume a large amount of disk space. Therefore, as an optional extension to the repository format, delta compression can be used to store some of the blobs as their changes relative to other blobs.
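The difference between the two storage strategies can be sketched in Python. The zlib calls are the standard library's. The delta encoding is a hypothetical copy/insert scheme built on `difflib`, chosen only for illustration, and it is not Git's actual delta format. The helper names `make_delta` and `apply_delta` are likewise assumptions.

```python
import difflib
import zlib


def make_delta(base: bytes, target: bytes):
    """Describe `target` as copies from `base` plus literal inserted bytes."""
    ops = []
    matcher = difflib.SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))
        elif j2 > j1:  # "replace" or "insert": keep the new bytes literally
            ops.append(("insert", target[j1:j2]))
        # "delete" needs no operation: those base bytes are simply not copied
    return ops


def apply_delta(base: bytes, ops) -> bytes:
    """Rebuild the target bytes from `base` and a list of delta operations."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += base[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


rev1 = b"".join(b"line %d\n" % i for i in range(50))
rev2 = rev1.replace(b"line 25\n", b"line twenty-five\n")

whole = zlib.compress(rev2)       # full copy of the second revision
delta = make_delta(rev1, rev2)    # second revision relative to the first
literal_bytes = sum(len(op[1]) for op in delta if op[0] == "insert")
```

Storing `whole` repeats almost everything already present in the first revision, while the delta only carries the changed bytes plus a few copy instructions. The cost is that reading `rev2` back requires `rev1` as well.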
Git has the following commands:
- git-init-db creates a new repository.
- git-update-cache --add adds new blob objects to the object database for any file that has been changed since the last commit. The old blob objects are left intact.
- git-write-tree creates a new tree object from the directory cache and adds it into the object database. It outputs the hash of the tree object.
- git-commit-tree takes in the hash of a tree object and the hashes of the parent commit objects. It will create a new commit object.
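How these four commands fit together can be shown with a toy in-memory model in Python. Everything here is a hypothetical sketch, not Git's implementation: the two dictionaries stand in for the freshly initialised repository that git-init-db would create, the function names only mirror the commands, and the text layout used for tree and commit objects is invented for the example.

```python
import hashlib

objects = {}  # object database: hash -> object bytes
cache = {}    # directory cache: file name -> blob hash


def put(data: bytes) -> str:
    """Store bytes in the object database under their SHA-1 hash."""
    key = hashlib.sha1(data).hexdigest()
    objects[key] = data
    return key


def update_cache_add(name: str, contents: bytes) -> None:
    """Like git-update-cache --add: store a new blob and record it in the cache."""
    cache[name] = put(contents)


def write_tree() -> str:
    """Like git-write-tree: store a tree built from the cache, return its hash."""
    listing = "".join(f"{name} {blob}\n" for name, blob in sorted(cache.items()))
    return put(listing.encode())


def commit_tree(tree: str, parents, message: str) -> str:
    """Like git-commit-tree: store a commit naming a tree and its parents."""
    lines = [f"tree {tree}"] + [f"parent {p}" for p in parents] + ["", message]
    return put("\n".join(lines).encode())


update_cache_add("README", b"hello\n")
first = commit_tree(write_tree(), [], "initial import")

update_cache_add("README", b"hello, world\n")
second = commit_tree(write_tree(), [first], "expand README")
```

After the second commit the database holds two blobs, two trees and two commits. The blob for the first revision of README is still there, reachable through the first commit.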
Petr Baudis maintains a set of scripts called Cogito (formerly git-pasky), a revision control system that uses Git as its backend.
As of April 2005, there are two web interfaces for Git:
- gitweb – a Perl implementation maintained by Kay Sievers.
- wit – a Python implementation maintained by Christian Meder.
External links
- Git Traffic – Newsletter that summarises events on the git mailing list
- The guts of git
- From KernelTrap:
- Re: (PATCH) add the ability to create and retrieve delta objects
- PC World – "Torvalds seemed aware that his decision to drop BitKeeper would also be controversial. When asked why he called the new software, "git," British slang meaning "a rotten person," he said. 'I'm an egotistical bastard, so I name all my projects after myself. First Linux, now git.'"