
Globally unique identifier: Difference between revisions

From Wikipedia, the free encyclopedia
Revision as of 16:32, 7 February 2008

A Globally Unique Identifier or GUID (pronounced /ˈgwɪd/) is a special type of identifier used in software applications to provide a reference number that is unique in any context (hence "globally"), for example to define the internal reference for a type of access point in a software application, or to create unique keys in a database. Although a generated GUID is not guaranteed to be unique, the total number of unique keys (2¹²⁸ or 3.4 × 10³⁸)[1] is so large that the probability of the same number being generated twice is very small. For example, the observable universe contains about 5 × 10²² stars, so every star could have its own GUID; if those GUIDs were generated at random, however, some collisions would still be expected because of the birthday paradox.
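The birthday-paradox remark can be checked with a short calculation. The following is a minimal sketch that assumes every GUID is drawn uniformly at random from all 2¹²⁸ values (real generators reserve a few bits for version and variant, so the usable space is slightly smaller); the function names are illustrative, not part of any standard API.

```python
import math

GUID_SPACE = 2 ** 128  # number of distinct 128-bit values


def expected_colliding_pairs(n: int, space: int = GUID_SPACE) -> float:
    """Expected number of pairs that share a value among n random draws."""
    return n * (n - 1) / (2 * space)


def collision_probability(n: int, space: int = GUID_SPACE) -> float:
    """Birthday approximation: p is roughly 1 - exp(-n(n-1) / (2 * space))."""
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-expected_colliding_pairs(n, space))


stars = 5 * 10 ** 22
print(collision_probability(10 ** 9))   # a billion GUIDs: negligible
print(collision_probability(stars))     # one GUID per star: effectively certain
print(expected_colliding_pairs(stars))  # a few million colliding pairs
```

Under this assumption a collision becomes more likely than not only at around 2⁶⁴ generated values, which is far beyond what a single application produces.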

The term GUID usually refers to Microsoft's implementation of the Universally Unique Identifier (UUID) standard; however, many other pieces of software also use the term GUID, including Oracle Database, dBase, OpenView Operations, and Novell eDirectory. The GUID is also the basis of the GUID Partition Table, Intel's replacement for Master Boot Records under EFI.

Basic structure

The GUID is a 16-byte (128-bit) number. The most commonly used structure of the data type is:

Bits Description
32 Data1
16 Data2
16 Data3
8 × 8 Data4

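Data1, Data2 and Data3 are each stored in little-endian order, that is, with the most significant byte last. The eight bytes of Data4 are stored consecutively, in the order in which they are written.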
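The storage layout can be illustrated with Python's struct module. This is a sketch only: pack_guid is a hypothetical helper name, and the sample values are those of the example GUID 3F2504E0-4F89-11D3-9A0C-0305E82C3301 used in this article.

```python
import struct


def pack_guid(data1: int, data2: int, data3: int, data4: bytes) -> bytes:
    """Pack the four fields into the 16-byte layout described above."""
    if len(data4) != 8:
        raise ValueError("Data4 must be exactly 8 bytes")
    # "<" = little-endian, I = 32-bit unsigned, H = 16-bit unsigned,
    # 8s = eight raw bytes copied through unchanged.
    return struct.pack("<IHH8s", data1, data2, data3, data4)


raw = pack_guid(0x3F2504E0, 0x4F89, 0x11D3, bytes.fromhex("9A0C0305E82C3301"))
print(raw.hex(" "))
# prints: e0 04 25 3f 89 4f d3 11 9a 0c 03 05 e8 2c 33 01
```

The first three fields appear byte-reversed in the dump, while the Data4 bytes keep their written order.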

One to three of the most significant bits of the first byte of Data4 define the type variant of the GUID:

Pattern Description
0 Network Computing System backward compatibility
10 Standard
110 Microsoft Component Object Model backward compatibility; this includes the GUIDs for important interfaces like IUnknown and IDispatch.
111 Reserved for future use.

The most significant four bits of Data3 define the version number, and hence the algorithm used.[1]
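The version and variant bits can be read with a few masks. The sketch below uses illustrative function names and takes the variant from the first byte of Data4, which is where RFC 4122 and Python's uuid module place it; the returned labels simply echo the table above.

```python
def guid_version(data3: int) -> int:
    """Return the version number held in the top four bits of Data3."""
    return (data3 >> 12) & 0xF


def guid_variant(data4: bytes) -> str:
    """Classify the variant from the leading bits of the first Data4 byte."""
    top = data4[0]
    if top & 0x80 == 0x00:  # pattern 0
        return "NCS backward compatibility"
    if top & 0xC0 == 0x80:  # pattern 10
        return "standard"
    if top & 0xE0 == 0xC0:  # pattern 110
        return "Microsoft COM backward compatibility"
    return "reserved"       # pattern 111


# 3F2504E0-4F89-11D3-9A0C-0305E82C3301
print(guid_version(0x11D3), guid_variant(bytes.fromhex("9A0C0305E82C3301")))
# IUnknown, {00000000-0000-0000-C000-000000000046}
print(guid_variant(bytes.fromhex("C000000000000046")))
```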

Text encoding

GUIDs are most commonly written in text as a sequence of hexadecimal digits such as:

3F2504E0-4F89-11D3-9A0C-0305E82C3301

This text notation contains the following fields, separated by hyphens:

Hex digits Description
8 Data1
4 Data2
4 Data3
4 Initial two bytes from Data4
12 Remaining six bytes from Data4

For the first three fields, the most significant digit is on the left. The last two fields are treated as eight separate bytes, each with its most significant digit on the left, and they follow each other from left to right. Note that the digit order of the fourth field may be unexpected, since it is treated differently from the way it is in the structure.

Braces are often added to enclose the above format:

{3F2504E0-4F89-11D3-9A0C-0305E82C3301}
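As an illustration of the text notation, the following sketch splits a GUID string, with or without braces, into the four structure fields and formats it back. The names parse_guid and format_guid are hypothetical helpers, not part of any library.

```python
import re

_GUID_RE = re.compile(
    r"([0-9A-Fa-f]{8})-([0-9A-Fa-f]{4})-([0-9A-Fa-f]{4})"
    r"-([0-9A-Fa-f]{4})-([0-9A-Fa-f]{12})"
)


def parse_guid(text: str):
    """Return (Data1, Data2, Data3, Data4) from the hyphenated text form."""
    text = text.strip()
    if text.startswith("{") and text.endswith("}"):
        text = text[1:-1]  # braces are optional decoration
    match = _GUID_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a GUID: {text!r}")
    g1, g2, g3, g4, g5 = match.groups()
    # The fourth and fifth groups together hold the eight Data4 bytes.
    return int(g1, 16), int(g2, 16), int(g3, 16), bytes.fromhex(g4 + g5)


def format_guid(data1: int, data2: int, data3: int, data4: bytes,
                braces: bool = False) -> str:
    """Format the four fields back into the hyphenated text form."""
    d4 = data4.hex().upper()
    text = f"{data1:08X}-{data2:04X}-{data3:04X}-{d4[:4]}-{d4[4:]}"
    return "{" + text + "}" if braces else text


fields = parse_guid("{3F2504E0-4F89-11D3-9A0C-0305E82C3301}")
print(format_guid(*fields, braces=True))
```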

When printing fewer characters is desired, GUIDs are sometimes encoded into a base64 string of 22 to 24 characters (depending on padding). For instance:

7QDBkvCA1+B9K/U0vrQx1A
7QDBkvCA1+B9K/U0vrQx1A==
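The 22 and 24 character lengths follow from Base64 encoding 16 bytes. A minimal sketch using Python's standard base64 module is shown below; the helper names are illustrative, and the sketch makes no assumption about which byte order was used to produce the example string above.

```python
import base64


def guid_to_base64(raw: bytes, padding: bool = True) -> str:
    """Encode 16 raw GUID bytes as Base64, with or without '=' padding."""
    if len(raw) != 16:
        raise ValueError("a GUID is exactly 16 bytes")
    encoded = base64.b64encode(raw).decode("ascii")
    return encoded if padding else encoded.rstrip("=")


def base64_to_guid(text: str) -> bytes:
    """Decode a 22- or 24-character Base64 string back to 16 bytes."""
    padded = text + "=" * (-len(text) % 4)  # restore any stripped padding
    raw = base64.b64decode(padded)
    if len(raw) != 16:
        raise ValueError("decoded value is not 16 bytes long")
    return raw


raw = base64_to_guid("7QDBkvCA1+B9K/U0vrQx1A")
print(len(raw), guid_to_base64(raw), guid_to_base64(raw, padding=False))
```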

Algorithm

The OSF-specified algorithm for generating new GUIDs has been widely criticized. In these (V1) GUIDs, the MAC address of the user's network card is used as the basis for the last group of GUID digits, which means, for example, that a document can be traced back to the computer that created it. This privacy hole was used to locate the creator of the Melissa worm. Most of the other digits are based on the time at which the GUID was generated.

V1 GUIDs, which contain a MAC address and a time, can be identified by the digit "1" in the first position of the third group of digits, for example {2f1e4fc0-81fd-11da-9156-00036a0f876a}. GUIDs using the later algorithm, which is mostly a random number, have a "4" in the same position, for example {38a52be4-9352-453e-af97-5c3b448652f0}. More specifically, the Data3 bit pattern is 0001xxxxxxxxxxxx in the first case and 0100xxxxxxxxxxxx in the second.
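To show how much a V1 GUID reveals, the sketch below uses Python's standard uuid module to pull the node field and timestamp out of the two examples above. It assumes the node field really is a MAC address (generators may substitute a random value) and that the timestamp counts 100-nanosecond intervals since 15 October 1582, as the UUID standard specifies; describe_guid is a hypothetical helper name.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Start of the Gregorian calendar, the epoch of version 1 timestamps.
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def describe_guid(text: str) -> dict:
    """Report the version and, for V1 GUIDs, the embedded node and time."""
    guid = uuid.UUID(text)  # accepts the braced form as well
    info = {"version": guid.version}
    if guid.version == 1:
        node = guid.node  # 48-bit value, normally the MAC address
        info["mac"] = ":".join(
            f"{(node >> shift) & 0xFF:02x}" for shift in range(40, -8, -8)
        )
        # guid.time is in 100 ns units; divide by 10 to get microseconds.
        info["created"] = GREGORIAN_EPOCH + timedelta(microseconds=guid.time // 10)
    return info


print(describe_guid("{2f1e4fc0-81fd-11da-9156-00036a0f876a}"))
print(describe_guid("{38a52be4-9352-453e-af97-5c3b448652f0}"))
```

For the V4 example only the version is reported, since the remaining bits are random and carry no recoverable information.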

Uses

Depending on the context, groups of GUIDs may be used to represent similar but not quite identical things. For example, in the Windows registry, under the key "My Computer\HKEY_CLASSES_ROOT\CLSID", the DAO database management system identifies the particular version and type of DAO access module to be used with a group of about a dozen GUIDs. Each begins with five zeros followed by a three-digit identifier for that version and type, and ends with the same value in every case, 0000-0010-8000-00AA006D2EA4. The set of GUIDs used by this database system therefore runs from {00000010-0000-0010-8000-00AA006D2EA4} through {00000109-0000-0010-8000-00AA006D2EA4}, although not all GUIDs in that range are used.

In the Microsoft Component Object Model (COM), GUIDs are used to uniquely distinguish different software component interfaces. This means that two (possibly incompatible) versions of a component can have exactly the same name but still be distinguishable by their GUIDs.

The use of GUIDs permits certain types of object orientation to be used in a consistent manner. For example, when components are created for Microsoft Windows using COM, every component must implement the IUnknown interface, through which all other interfaces and features of that component can be found. The IUnknown interface is identified by the GUID {00000000-0000-0000-C000-000000000046}. Rather than exposing a named entry point called "IUnknown", a component is asked for the interface by this GUID. Every component that provides IUnknown therefore answers to the same GUID, and every program that looks for IUnknown in a component uses that GUID to find the entry point, knowing that any component responding to it must implement IUnknown in the same way.
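The idea of locating an interface by GUID rather than by name can be sketched with a toy registry. This is not how COM is implemented (real COM uses binary interface tables and result codes); Component, Greeter, expose, query_interface and the IID_IGREETER value are all invented for this illustration, and only the IUnknown GUID comes from the text.

```python
import uuid

# GUID of IUnknown, as given in the text above.
IID_IUNKNOWN = uuid.UUID("{00000000-0000-0000-C000-000000000046}")
# Hypothetical interface identifier, made up for this example only.
IID_IGREETER = uuid.UUID("{11111111-2222-4333-8444-555555555555}")


class Greeter:
    def greet(self) -> str:
        return "hello"


class Component:
    """Toy component that hands out interfaces by GUID, never by name."""

    def __init__(self) -> None:
        # Every component answers to the IUnknown GUID.
        self._interfaces = {IID_IUNKNOWN: self}

    def expose(self, iid: uuid.UUID, implementation: object) -> None:
        self._interfaces[iid] = implementation

    def query_interface(self, iid: uuid.UUID) -> object:
        if iid not in self._interfaces:
            raise LookupError(f"interface {{{iid}}} not supported")
        return self._interfaces[iid]


component = Component()
component.expose(IID_IGREETER, Greeter())
print(component.query_interface(IID_IGREETER).greet())
```

Two components could both name an interface "Greeter" without conflict here, because callers only ever match on the GUID.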

GUIDs are also inserted into documents from Microsoft Office programs, as these are regarded as objects as well. Even audio or video streams in the Advanced Systems Format (ASF) are identified by their GUIDs.

GUIDs are frequently stored in files as text, but sometimes they are stored in binary. For example, in some files, including Advanced Systems Format (ASF) files, a GUID is stored as a series of values in little-endian byte order: a 32-bit unsigned integer, followed by two 16-bit unsigned integers, followed by eight unsigned bytes. On IA-32 hardware, as found in most desktop PCs, this means that the order of the bytes in memory and in the file is the same. Note that this does not match the text display format: if the first field is displayed as {12345678-…, it is stored as the bytes 78 56 34 12 (hexadecimal). Software running on big-endian hardware that reads such GUIDs, writes new ones, or compares them with GUIDs created locally may need to reverse the byte order of the first three values if it stores the fields of the data structure in its native byte order.
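The byte reversal described above can be sketched as a single function that converts between the little-endian file layout and the order in which the digits are displayed. The function name is illustrative, and the sample GUID 12345678-9ABC-DEF0-0123-456789ABCDEF is a made-up value chosen so that the reordering is easy to see.

```python
def swap_guid_byte_order(raw: bytes) -> bytes:
    """Reverse the bytes of Data1, Data2 and Data3; leave Data4 untouched."""
    if len(raw) != 16:
        raise ValueError("a GUID is exactly 16 bytes")
    return (
        raw[3::-1]     # Data1: bytes 3, 2, 1, 0
        + raw[5:3:-1]  # Data2: bytes 5, 4
        + raw[7:5:-1]  # Data3: bytes 7, 6
        + raw[8:]      # Data4: eight bytes in unchanged order
    )


# Little-endian file layout of the hypothetical sample GUID.
on_disk = bytes.fromhex("78563412" "bc9a" "f0de" "0123456789abcdef")
print(swap_guid_byte_order(on_disk).hex())
# prints: 123456789abcdef00123456789abcdef
```

Applying the function twice returns the original bytes, so the same routine serves for both reading and writing.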

Subtypes

There are several flavors of GUIDs used in COM:

  • IID – interface identifier (the ones that are registered on a system are stored in the Windows Registry at the key HKEY_CLASSES_ROOT\Interface)
    • REFIID – a pointer to an IID
  • CLSID – class identifier (stored in the registry at HKEY_CLASSES_ROOT\CLSID)
  • LIBID – type library identifier
  • CATID – category identifier (its presence on a class identifies it as belonging to certain class categories)

DCOM introduces many additional GUID subtypes:

  • AppID – application identifier
  • MID – machine identifier
  • IPID – interface pointer identifier (applicable to an interface engaged in RPC)
  • CID – causality identifier (applicable to an RPC session)
  • OID – object identifier (applicable to an object instance)
  • OXID – object exporter identifier (applicable to an instance of the system object that performs RPC)
  • SETID – ping set identifier (applicable to a group of objects)

These GUID subspaces may overlap, as the context in which a GUID is used defines its subtype. For example, one class might use the same GUID for its CLSID that another class uses for its IID, without any problem. On the other hand, two classes using the same CLSID could not coexist.

XML syndication formats

There is also a guid element in some versions of the RSS specification, and a mandatory id element in Atom, each of which should contain a unique identifier for an individual article or weblog post. In RSS the contents of the guid can be any text, and in practice are typically a copy of the article URL. Atom IDs need to be valid URIs (usually URLs pointing to the entry, or URNs containing any other unique identifier).

References

  1. ^ a b "Generating GUIDs on the Pocket PC". The Code Project. Microsoft. 2004-01-20. Retrieved 2007-06-27.