
Git

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by David Gerard (talk | contribs) at 10:48, 9 June 2005 (license in second sentence). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In computing, Git is a revision control filesystem project started by Linus Torvalds to manage the Linux kernel. It is open source and free software released under the GNU General Public License version 2.

Git was first released on April 7, 2005, shortly after Torvalds stopped using the proprietary BitKeeper software. The switch followed a controversy in which BitKeeper's proprietor, Larry McVoy, withdrew permission to use the software as freeware after Andrew Tridgell attempted to reverse engineer the BitKeeper protocols.

Like BitKeeper, Git does not use a centralized server. However, Git is not an SCM system. Torvalds explains, "in many ways you can just see git as a filesystem - it's content-addressable, and it has a notion of versioning, but I really really designed it coming at the problem from the viewpoint of a filesystem person (hey, kernels is what I do), and I actually have absolutely zero interest in creating a traditional SCM system."

Git has two data structures: an object database and a directory cache. There are three types of objects:

  • A blob object is the content of a file.
  • A tree object is a list of blob objects and any information associated with each blob object, such as the file name and file permission. This object describes the source tree at a certain time.
  • A commit object provides the history of the source tree. It contains a log message, a tree object, and pointers to zero or more parent commit objects (the first commit in a history has no parent, and a merge has more than one).

The object database can hold any kind of object. The directory cache holds a tree object; it may or may not reflect the current state of the source tree.
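The three object types can be pictured as plain records. The following is a minimal sketch in Python, not Git's actual on-disk format: the class and field names (Blob, TreeEntry, Tree, Commit, blob_id, tree_id, parent_ids) are assumptions chosen for illustration, and objects refer to one another by hash string.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Blob:
    """The content of a single file; the name and permissions live in the tree."""
    data: bytes


@dataclass(frozen=True)
class TreeEntry:
    """One row of a tree: a file name, its permission bits, and the blob it names."""
    name: str
    mode: int
    blob_id: str  # hash that identifies the blob in the object database


@dataclass(frozen=True)
class Tree:
    """A description of the source tree at one point in time."""
    entries: Tuple[TreeEntry, ...]


@dataclass(frozen=True)
class Commit:
    """A node of history: a log message, one tree, and the parent commits."""
    message: str
    tree_id: str
    parent_ids: Tuple[str, ...] = ()  # empty for the first commit in a history


# Hypothetical values, for illustration only.
snapshot = Tree(entries=(TreeEntry(name="README", mode=0o644, blob_id="<blob hash>"),))
root = Commit(message="Initial import", tree_id="<tree hash>")
```

Keeping the file name in the tree rather than in the blob means two files with identical content can share one blob.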

Each object is identified by the SHA-1 hash of its contents. Git computes the hash and uses it to decide where the object is stored: the first two characters of the hash name a directory, and the remaining characters are the file name of the object within that directory.
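The naming scheme can be sketched in a few lines of Python. This is a simplified illustration that hashes the raw bytes exactly as described above; the function name object_path is hypothetical. Later versions of Git hash a short type-and-size header together with the contents, so the values produced here will not match those of a real repository.

```python
import hashlib
from typing import Tuple


def object_path(contents: bytes) -> Tuple[str, str]:
    """Return (directory, file name) for an object with the given contents.

    Simplification: hashes the raw contents only, with no header.
    """
    digest = hashlib.sha1(contents).hexdigest()  # 40 hexadecimal characters
    return digest[:2], digest[2:]                # 2-character directory, 38-character file name


directory, filename = object_path(b"hello world\n")
print(f"{directory}/{filename}")
```

Because the name is derived from the contents, storing the same contents twice yields the same path, so identical objects are stored only once.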

Git stores each revision of a file as a unique blob object. The relationships between the blobs can be found by examining the commit objects. Normally a blob is stored in its entirety using zlib compression, but this can quickly consume a large amount of disk space. Therefore, as an optional extension to the repository format, delta compression can be used to store some of the blobs as their changes relative to other blobs.
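To illustrate why whole-blob storage grows quickly, the sketch below compresses two nearly identical revisions with Python's standard zlib module. The helper names store_blob and load_blob and the sample file contents are assumptions for illustration; the sketch covers only the compression step, not Git's object format or its delta extension.

```python
import zlib


def store_blob(contents: bytes) -> bytes:
    """Compress a complete revision, as in whole-blob storage."""
    return zlib.compress(contents)


def load_blob(stored: bytes) -> bytes:
    """Recover the original revision from its compressed form."""
    return zlib.decompress(stored)


# Two revisions that differ by a single appended line.
revision_1 = b"int main(void) { return 0; }\n" * 200
revision_2 = revision_1 + b"/* one added line */\n"

stored = [store_blob(revision_1), store_blob(revision_2)]
assert load_blob(stored[1]) == revision_2

raw_size = len(revision_1) + len(revision_2)
stored_size = sum(len(s) for s in stored)
print(raw_size, "bytes raw,", stored_size, "bytes stored")
```

Each revision is compressed independently, so the unchanged part of the file is paid for again in every revision. Delta compression avoids this by recording only the difference from another blob.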

Git has the following commands:

  • git-init-db creates a new repository.
  • git-update-cache --add adds new blob objects to the object database for any file that has been changed since the last commit. The old blob objects are left intact.
  • git-write-tree creates a new tree object from the directory cache and adds it into the object database. It outputs the hash of the tree object.
  • git-commit-tree takes in the hash of a tree object and the hashes of the parent commit objects. It will create a new commit object.
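How these commands fit together can be shown with a toy in-memory model. This is a sketch under stated assumptions, not the real implementation: the Python function names merely mirror the commands, the serialization format is invented for illustration, and the real tools read the working directory and write files on disk rather than using dictionaries.

```python
import hashlib
from typing import Dict, List

object_db: Dict[str, bytes] = {}      # hash -> serialized object (stands in for the object database)
directory_cache: Dict[str, str] = {}  # file name -> blob hash (stands in for the directory cache)


def put_object(kind: str, payload: bytes) -> str:
    """Store an object under the SHA-1 of its serialized form and return that hash."""
    serialized = kind.encode() + b"\n" + payload  # invented format, for illustration only
    object_id = hashlib.sha1(serialized).hexdigest()
    object_db[object_id] = serialized
    return object_id


def update_cache_add(name: str, contents: bytes) -> None:
    """Like git-update-cache --add: store a new blob and record it in the cache."""
    directory_cache[name] = put_object("blob", contents)


def write_tree() -> str:
    """Like git-write-tree: build a tree object from the cache and return its hash."""
    lines = [f"{blob_id} {name}" for name, blob_id in sorted(directory_cache.items())]
    return put_object("tree", "\n".join(lines).encode())


def commit_tree(tree_id: str, parent_ids: List[str], message: str) -> str:
    """Like git-commit-tree: create a commit object for a tree and its parent commits."""
    header = [f"tree {tree_id}"] + [f"parent {p}" for p in parent_ids]
    return put_object("commit", ("\n".join(header) + "\n\n" + message).encode())


# First revision: one file, no parent commit.
update_cache_add("README", b"first draft\n")
first = commit_tree(write_tree(), [], "Initial commit")

# Second revision: the old blob stays in the database next to the new one.
update_cache_add("README", b"second draft\n")
second = commit_tree(write_tree(), [first], "Revise README")

print(len(object_db), "objects stored")
```

Each revision here adds one blob, one tree, and one commit, and nothing is ever overwritten, which matches the statement that old blob objects are left intact.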

Petr Baudis maintains a set of scripts called cogito (formerly git-pasky), a revision control system that uses Git as its backend.

As of April 2005, there are two web interfaces for Git:

  • gitweb – a Perl implementation maintained by Kay Sievers.
  • wit – a Python implementation maintained by Christian Meder.