Hexadecimal

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 209.183.34.48 (talk) at 01:25, 28 August 2008 (Uses). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In mathematics and computer science, hexadecimal (also base-16, hexa, or hex) is a numeral system with a radix, or base, of 16. It uses sixteen distinct symbols, most often the symbols 0–9 to represent values zero to nine, and A, B, C, D, E, F (or a through f) to represent values ten to fifteen.

Its primary use is as a human-friendly representation of binary-coded values, so it is often used in digital electronics and computer engineering. Since each hexadecimal digit represents four binary digits (bits), a group also called a nibble, it is a compact and easily translated shorthand for expressing values in base two.

Uses

Hex   Dec   Oct   Binary
0     0     0     0 0 0 0
1     1     1     0 0 0 1
2     2     2     0 0 1 0
3     3     3     0 0 1 1
4     4     4     0 1 0 0
5     5     5     0 1 0 1
6     6     6     0 1 1 0
7     7     7     0 1 1 1
8     8     10    1 0 0 0
9     9     11    1 0 0 1
A     10    12    1 0 1 0
B     11    13    1 0 1 1
C     12    14    1 1 0 0
D     13    15    1 1 0 1
E     14    16    1 1 1 0
F     15    17    1 1 1 1

In digital computing, hexadecimal is primarily used to represent bytes. Attempts to represent the 256 possible byte values by other means have led to problems. Directly representing each possible byte value with a single character representation runs into unprintable control characters in the ASCII character set. Even if a standard set of printable characters were devised for every byte value, neither users nor input hardware are equipped to handle 256 unique characters. Most hex editing software displays each byte as a single character, but unprintable characters are usually substituted with period or blank.

In URLs, all characters can be coded using hexadecimal.[1] Each 2-digit (1 byte) hexadecimal sequence is preceded by a percent sign. For example, the URL http://en.wikipedia.org/wiki/Main%20Page substitutes a space (which is not allowed in URLs) with the hex code for a space (%20).
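As a minimal sketch of this encoding in JavaScript: the built-in encodeURIComponent and decodeURIComponent functions perform percent-encoding on whole strings, while the helper percentEncodeByte below (a hypothetical name, introduced only for illustration) shows how a single byte value becomes a percent sign followed by two hex digits.

```javascript
// Percent-encode one byte value (0-255) as "%" plus two uppercase hex digits.
// percentEncodeByte is an illustrative helper, not a standard function.
function percentEncodeByte(byte) {
    return "%" + byte.toString(16).toUpperCase().padStart(2, "0");
}

// The space character has code 32 decimal, which is 20 in hexadecimal.
console.log(percentEncodeByte(" ".charCodeAt(0)));   // %20

// The built-in functions apply the same encoding to whole strings.
console.log(encodeURIComponent("Main Page"));        // Main%20Page
console.log(decodeURIComponent("Main%20Page"));      // Main Page
```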

Representing hexadecimal

In situations where there is no context, a hexadecimal number might be ambiguous and confused with numbers expressed in other bases. There are several conventions for unambiguously expressing values. In mathematics, a subscript is often used on each number explicitly giving the base: 159₁₀ is decimal 159; 159₁₆ is hexadecimal 159, which is equal to 345₁₀. Some authors prefer a text subscript, such as 159_decimal and 159_hex.

In linear text systems, such as those used in most computer programming environments, a variety of methods have arisen:

  • In URLs, character codes are written as hexadecimal pairs prefixed with %: http://www.example.com/name%20with%20spaces where %20 is the space (blank) character, code 20 hex, or 32 decimal.
  • In XML and XHTML, characters can be expressed as hexadecimal numeric character references using the notation &#xcode;, where code is the character's hexadecimal code: &#x20AC; is the Euro sign. Color references are expressed in hex prefixed with #: #FFFFFF which gives white.[2]
  • The C programming language (and its syntactical descendants[3]) uses the prefix 0x: 0x5A3. Character and string constants may express character codes in hexadecimal with the prefix \x followed by two hex digits: '\x1B' specifies the Esc control character, and "\x1B[0m\x1B[25;1H" is a string containing 11 characters (not including an implied trailing NUL).[4] To output a value as hexadecimal with the printf function family, the format conversion code %X or %x is used.
  • In the Unicode standard, a character value is represented with U+ followed by the hex value: U+20AC is the Euro sign (€).
  • MIME (e-mail extensions) quoted-printable encoding, used inside a text/plain MIME-part body, represents each non-printable or non-ASCII byte as an equals sign = followed by two hex digits, as in Espa=F1a to send "España" (Spain) in ISO 8859-1.
  • In Intel-derived assembly languages, hexadecimal is indicated with a suffixed H or h: FFh or 0A3CH. Some implementations require a leading zero when the first character is not a digit: 0FFh.
  • Other assembly languages (6502, AT&T, Motorola), Pascal, and some versions of BASIC (Commodore) and Forth use $ as a prefix: $5A3.
  • Some assembly languages (Microchip) use the notation H'ABCD' (for ABCD₁₆).
  • *nix (UNIX and related) shells use an escape character form \x0FF in expressions and 0xFF for constants.
  • Ada and VHDL enclose hexadecimal numerals in based "numeric quotes": 16#5A3#
  • Verilog represents hexadecimal constants in the form 8'hFF, where 8 is the number of bits in the value and FF is the hexadecimal constant.
  • Modula 2 and some other languages use # as a prefix: #01AF
  • The Smalltalk programming language uses the prefix 16r: 16r6EF7
  • Postscript indicates hex with prefix 16#: 16#ABCD. Binary data (such as image pixels) can be expressed as unprefixed consecutive hexadecimal pairs: AA213FD51B3801043FBC...
  • Common Lisp uses the prefixes #x and #16r.
  • QBasic and Visual Basic prefix hexadecimal numerals with &H: &H5A3
  • BBC BASIC and Locomotive BASIC use & for hex.[5]
  • The TI-89 and 92 series use 0h: 0hA3
  • Notations such as X'5A3' are sometimes seen, such as in PL/I. This is the most common format for hexadecimal on IBM mainframes (zSeries) and midrange computers (iSeries) running traditional operating systems (z/OS, z/VSE, z/VM, TPF, OS/400), and is used in Assembler, PL/I, COBOL, JCL, scripts, commands and other places. This format was also common on other, now obsolete, IBM systems.
  • Donald Knuth introduced the use of a particular typeface to represent a particular radix in his book The TeXbook.[6] There, hexadecimal representations are written in a typewriter typeface: 5A3

There is no universal convention to use lowercase or uppercase for the letter digits, and each is prevalent or preferred by particular environments by community standards or convention.
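To make a few of the notations listed above concrete, here is a short sketch in JavaScript, one of C's syntactical descendants. It uses only built-in language features; the variable name n is arbitrary.

```javascript
// The 0x prefix marks a hexadecimal literal, as in C.
const n = 0x5A3;
console.log(n);                                   // 1443

// Parsing and formatting with an explicit radix of 16.
console.log(parseInt("5A3", 16));                 // 1443
console.log(n.toString(16));                      // 5a3 (lowercase by default)
console.log(n.toString(16).toUpperCase());        // 5A3

// \x followed by two hex digits gives a character code inside a string.
console.log("\x1B[0m\x1B[25;1H".length);          // 11

// \u followed by four hex digits names a Unicode value such as U+20AC.
console.log("\u20AC".codePointAt(0).toString(16)); // 20ac
```

Note that toString(16) produces lowercase letter digits, so code that needs the uppercase form has to convert explicitly.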

Bruce A. Martin's hexadecimal notation proposal

The choice of the letters A through F to represent the digits above nine was not universal in the early history of computers. During the 1950s, some installations favored using the digits 0 through 5 with a macron character ("¯") to indicate the values 10–15. Users of Bendix G-15 computers used the letters U through Z. Bruce A. Martin of Brookhaven National Laboratory considered the choice of A–F "ridiculous" and in 1968 proposed, in a letter to the editor of Communications of the ACM, an entirely new set of symbols based on the bit locations, which did not gain much acceptance.[7]

A hexadecimal multiplication table
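A table of this kind can be generated programmatically. The following is a minimal sketch in JavaScript; the function name hexMultiplicationTable and the plain-text layout are assumptions made for illustration.

```javascript
// Build a 15 x 15 multiplication table for the hex digits 1 through F.
// Each product is written in hexadecimal and right-aligned to two columns.
function hexMultiplicationTable() {
    const rows = [];
    for (let a = 1; a <= 0xF; a++) {
        const cells = [];
        for (let b = 1; b <= 0xF; b++) {
            cells.push((a * b).toString(16).toUpperCase().padStart(2, " "));
        }
        rows.push(cells.join(" "));
    }
    return rows.join("\n");
}

console.log(hexMultiplicationTable());
```

The largest entry is F × F = E1 (225 in decimal), so two hex digits per cell are always enough.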

Verbal representations

Not only are there no digits to represent the quantities from ten to fifteen, so that letters are used as a substitute, but most Western European languages also lack a nomenclature to name hexadecimal numbers. "Thirteen" and "fourteen" are decimal-based. English does have names for several non-decimal powers: pair for the first binary power; score for the first vigesimal power; dozen, gross, and great gross for the first three duodecimal powers. However, no English name describes the hexadecimal powers (corresponding to the decimal values 16, 256, 4096, 65536, ...). Some people read hexadecimal numbers digit by digit like a phone number: 4DA is "four-dee-aye". However, the letter 'A' sounds similar to eight, 'C' sounds similar to three, and 'D' can easily be mistaken for the 'ty' suffix: is it 4D or forty? Other people avoid confusion by using the NATO phonetic alphabet: 4DA is "four-delta-alpha". Similarly, some use the Joint Army/Navy Phonetic Alphabet ("four-dog-able"), or a similar ad hoc system.

Signs

The hexadecimal system can express negative numbers the same way as in decimal: –2A to represent –42 and so on.

However, some prefer instead to express the exact bit patterns used in the processor and consider hexadecimal values best handled as unsigned values. This way, the negative number –42 can be written as FFFF FFD6 in a 32-bit CPU register, as C228 0000 in a 32-bit FPU register or C045 0000 0000 0000 in a 64-bit FPU register.
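The bit patterns quoted above can be reproduced with a short JavaScript sketch. The helper names int32Hex, float32Hex and float64Hex are hypothetical; the sketch assumes two's-complement integers and IEEE 754 floating point, which is what JavaScript's >>> operator and DataView provide.

```javascript
// Hex digits of the 32-bit two's-complement pattern of an integer.
function int32Hex(n) {
    return (n >>> 0).toString(16).toUpperCase().padStart(8, "0");
}

// Hex digits of the IEEE 754 single-precision pattern of a number.
function float32Hex(x) {
    const view = new DataView(new ArrayBuffer(4));
    view.setFloat32(0, x);                       // big-endian by default
    return view.getUint32(0).toString(16).toUpperCase().padStart(8, "0");
}

// Hex digits of the IEEE 754 double-precision pattern of a number.
function float64Hex(x) {
    const view = new DataView(new ArrayBuffer(8));
    view.setFloat64(0, x);
    const high = view.getUint32(0).toString(16).padStart(8, "0");
    const low = view.getUint32(4).toString(16).padStart(8, "0");
    return (high + low).toUpperCase();
}

console.log((-42).toString(16));   // -2a (signed notation)
console.log(int32Hex(-42));        // FFFFFFD6
console.log(float32Hex(-42));      // C2280000
console.log(float64Hex(-42));      // C045000000000000
```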

Fractions

As with other numeral systems, the hexadecimal system can be used to represent rational numbers, although recurring digits are common since sixteen (10h) has only a single prime factor (two):

1/2 = 0.8               1/6 = 0.2AAAAAAAA...    1/A = 0.1999999999...   1/E = 0.1249249249...
1/3 = 0.5555555555...   1/7 = 0.2492492492...   1/B = 0.1745D1745D...   1/F = 0.1111111111...
1/4 = 0.4               1/8 = 0.2               1/C = 0.1555555555...   1/10 = 0.1
1/5 = 0.3333333333...   1/9 = 0.1C71C71C71...   1/D = 0.13B13B13B1...   1/11 = 0.0F0F0F0F0F...

For any base, 0.1 (or "1/10") is always equivalent to one divided by the representation of that base value in its own number system: Counting in base 3 is 0, 1, 2, 10 (three). Thus, whether dividing one by two for binary or dividing one by sixteen for hexadecimal, both of these fractions are written as 0.1. Because the radix 16 is a perfect square (4²), fractions expressed in hexadecimal have an odd period much more often than decimal ones, and there are no cyclic numbers (other than trivial single digits). Recurring digits are exhibited when the denominator in lowest terms has a prime factor not found in the radix; thus, when using hexadecimal notation, all fractions with denominators that are not a power of two result in an infinite string of recurring digits (such as thirds and fifths). This makes hexadecimal (and binary) less convenient than decimal for representing rational numbers since a larger proportion lie outside its range of finite representation.
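The expansions in the table above can be checked by ordinary long division carried out in base 16. The following JavaScript sketch does this for a fraction num/den given as decimal integers; the function name hexFraction and its digits parameter (the maximum number of fractional digits to produce) are assumptions made for illustration.

```javascript
// Expand num/den as a hexadecimal fraction by long division:
// multiply the remainder by 16, emit the quotient digit, keep the remainder.
function hexFraction(num, den, digits) {
    const alpha = "0123456789ABCDEF";
    const whole = Math.floor(num / den).toString(16).toUpperCase();
    let rem = num % den;
    if (rem === 0) {
        return whole;                 // no fractional part at all
    }
    let result = whole + ".";
    for (let i = 0; i < digits && rem !== 0; i++) {
        rem *= 16;
        result += alpha.charAt(Math.floor(rem / den));
        rem %= den;
    }
    return result;
}

console.log(hexFraction(1, 2, 10));    // 0.8
console.log(hexFraction(1, 3, 10));    // 0.5555555555
console.log(hexFraction(1, 10, 10));   // 0.1999999999
console.log(hexFraction(1, 11, 10));   // 0.1745D1745D
console.log(hexFraction(1, 16, 10));   // 0.1
```

The loop stops early when the remainder reaches zero, which happens only when the denominator in lowest terms is a power of two.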

All rational numbers finitely representable in hexadecimal are also finitely representable in decimal, duodecimal and sexagesimal: that is, any hexadecimal number with a finite number of digits has a finite number of digits when expressed in those other bases. Conversely, only a fraction of those finitely representable in the latter bases are finitely representable in hexadecimal: for example, decimal 0.1 corresponds to the infinite recurring representation 0.199999999999... in hexadecimal. However, hexadecimal is more efficient than bases 12 and 60 for representing fractions with powers of two in the denominator (e.g., decimal one sixteenth is 0.1 in hexadecimal, 0.09 in duodecimal, 0;3,45 in sexagesimal and 0.0625 in decimal).
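The rule behind these statements, that a fraction has a finite representation exactly when every prime factor of its reduced denominator also divides the base, can be sketched in JavaScript as follows. The names gcd and terminatesInBase are hypothetical helpers, and the inputs are assumed to be positive integers.

```javascript
// Greatest common divisor by Euclid's algorithm.
function gcd(a, b) {
    while (b !== 0) {
        [a, b] = [b, a % b];
    }
    return a;
}

// True if num/den has a finite digit expansion in the given base.
// The reduced denominator is stripped of every factor it shares with
// the base; the expansion is finite only if nothing else remains.
function terminatesInBase(num, den, base) {
    let d = den / gcd(num, den);
    let g = gcd(d, base);
    while (g > 1) {
        d /= g;
        g = gcd(d, base);
    }
    return d === 1;
}

console.log(terminatesInBase(1, 10, 10));   // true  (0.1 in decimal)
console.log(terminatesInBase(1, 10, 16));   // false (0.1999... in hexadecimal)
console.log(terminatesInBase(1, 16, 12));   // true  (0.09 in duodecimal)
console.log(terminatesInBase(1, 16, 60));   // true  (0;3,45 in sexagesimal)
```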

Binary translation

Most computers manipulate binary data, but it is difficult for humans to work with the large number of digits for even a relatively small binary number. Although most humans are familiar with the base 10 system, it is much easier to map binary to hexadecimal than to decimal because each hexadecimal digit maps to a whole number of bits (4₁₀). This example converts 1111₂ to base ten. Since each position in a binary numeral can contain either a 1 or 0, its value may be easily determined by its position from the right:

  • 0001₂ = 1₁₀
  • 0010₂ = 2₁₀
  • 0100₂ = 4₁₀
  • 1000₂ = 8₁₀

Therefore:

1111₂ = 8₁₀ + 4₁₀ + 2₁₀ + 1₁₀
  = 15₁₀

With surprisingly little practice, mapping 1111₂ to F₁₆ in one step becomes easy: see table in Uses. The advantage of using hexadecimal rather than decimal increases rapidly with the size of the number. When the number becomes large, conversion to decimal is very tedious. However, when mapping to hexadecimal, it is trivial to regard the binary string as 4-digit groups and map each to a single hexadecimal digit.

This example shows the conversion of a binary number to decimal, mapping each digit to the decimal value, and adding the results.

01011110101101010010₂ = 262144₁₀ + 65536₁₀ + 32768₁₀ + 16384₁₀ + 8192₁₀ + 2048₁₀ + 512₁₀ + 256₁₀ + 64₁₀ + 16₁₀ + 2₁₀
  = 387922₁₀

Compare this to the conversion to hexadecimal, where each group of four digits can be considered independently, and converted directly:

01011110101101010010₂ = 0101 1110 1011 0101 0010₂
  = 5 E B 5 2₁₆
  = 5EB52₁₆

The conversion from hexadecimal to binary is equally direct.
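The group-by-four mapping in both directions can be sketched in JavaScript as follows. The function names binToHex and hexToBin are assumptions made for illustration, and the inputs are assumed to be strings containing only valid digits.

```javascript
// Convert a string of binary digits to hexadecimal, four bits at a time.
function binToHex(bits) {
    // Pad on the left so the length is a multiple of four.
    const width = Math.ceil(bits.length / 4) * 4;
    const padded = bits.padStart(width, "0");
    let hex = "";
    for (let i = 0; i < padded.length; i += 4) {
        const group = padded.slice(i, i + 4);
        hex += parseInt(group, 2).toString(16).toUpperCase();
    }
    return hex;
}

// Convert a string of hexadecimal digits to binary, one digit at a time.
function hexToBin(hex) {
    let bits = "";
    for (const digit of hex) {
        bits += parseInt(digit, 16).toString(2).padStart(4, "0");
    }
    return bits;
}

console.log(binToHex("01011110101101010010"));   // 5EB52
console.log(hexToBin("5EB52"));                  // 01011110101101010010
```

Each group is converted independently of the others, which is why no carrying or running total is needed.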

The octal system can also be useful as a tool for people who need to deal directly with binary computer data. Octal represents data as three bits per character, rather than four.

Converting from other bases

Division-remainder in source base

As with all bases there is a simple algorithm for converting a representation of a number to hexadecimal by doing integer division and remainder operations in the source base. Theoretically this is possible from any base but for most humans only decimal and for most computers only binary (which can be converted by far more efficient methods) can be easily handled with this method.

Let d be the number to represent in hexadecimal, and the series hᵢhᵢ₋₁...h₂h₁ be the hexadecimal digits representing the number.

  1. i := 1
  2. hᵢ := d mod 16
  3. d := (d − hᵢ) / 16
  4. If d = 0, return the series hᵢ...h₁; otherwise increment i and go to step 2

"16" may be replaced with any other base that may be desired.

The following is a JavaScript implementation of the above algorithm for converting any non-negative integer to its hexadecimal string representation. Its purpose is to illustrate the above algorithm. For serious work with data, however, it is much more advisable to use bitwise operators.

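// Convert a non-negative integer d to its hexadecimal string by
// repeated division by 16; the recursion emits the high digits first.
function toHex(d) {
    if (!Number.isInteger(d) || d < 0) {
        throw new RangeError("toHex expects a non-negative integer");
    }
    var r = d % 16;                // least significant hex digit
    if (d - r === 0) {
        return toChar(r);          // only one digit left: recursion ends
    }
    return toHex((d - r) / 16) + toChar(r);
}

// Map a value from 0 to 15 to its hexadecimal digit character.
function toChar(n) {
    var alpha = "0123456789ABCDEF";
    return alpha.charAt(n);
}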
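For comparison, the bitwise approach mentioned above might look like the following sketch. It is one possible formulation rather than a canonical one: the name toHexBitwise is hypothetical, and the input is treated as an unsigned 32-bit value because JavaScript's bitwise operators work on 32 bits.

```javascript
// Convert a value to hexadecimal by masking off four bits at a time.
function toHexBitwise(n) {
    const alpha = "0123456789ABCDEF";
    let value = n >>> 0;           // interpret as unsigned 32-bit
    let result = "";
    do {
        result = alpha.charAt(value & 0xF) + result;   // lowest nibble
        value >>>= 4;                                  // drop that nibble
    } while (value !== 0);
    return result;
}

console.log(toHexBitwise(255));       // FF
console.log(toHexBitwise(387922));    // 5EB52
console.log(toHexBitwise(-42));       // FFFFFFD6
```

Because the value is coerced to 32 bits, a negative input yields its two's-complement bit pattern, and values of 2³² or more wrap around.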

Addition and multiplication

It is also possible to make the conversion by assigning each place in the source base the hexadecimal representation of its place value and then performing multiplication and addition to get the final representation. For example, to convert the number B3AD to decimal, one can split it into the digits D (13₁₀), A (10₁₀), 3 (3₁₀) and B (11₁₀), then multiply each decimal value by 16ᵖ, where p is the digit's position counted from right to left, beginning with 0. In this case we have 13×16⁰ + 10×16¹ + 3×16² + 11×16³, which is equal to 45997 in the decimal system.
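The B3AD example can be written out as a short JavaScript sketch that sums digit × 16ᵖ over the positions. The function name hexToDecimal is an assumption made for illustration; in practice the built-in parseInt(text, 16) gives the same result.

```javascript
// Evaluate a hexadecimal string by summing digit * 16^p,
// where p counts positions from the right, starting at 0.
function hexToDecimal(hex) {
    const digits = "0123456789ABCDEF";
    const upper = hex.toUpperCase();
    let total = 0;
    for (let p = 0; p < upper.length; p++) {
        const ch = upper.charAt(upper.length - 1 - p);
        const value = digits.indexOf(ch);
        if (value < 0) {
            throw new RangeError("not a hexadecimal digit: " + ch);
        }
        total += value * 16 ** p;
    }
    return total;
}

// 13*16^0 + 10*16^1 + 3*16^2 + 11*16^3
console.log(hexToDecimal("B3AD"));   // 45997
```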

Conversion via binary

As most computers work in binary, the normal way for a computer to make such a conversion would be to convert to binary first (by doing multiplication and addition in binary) and then make use of the direct mapping from binary to hexadecimal.

Tools for conversion

Most modern computer systems with graphical user interfaces provide a built-in calculator utility, capable of performing conversions between various radixes, generally including hexadecimal.

In Microsoft Windows, the Calculator utility can be set to scientific calculator mode, which allows conversions between radix 16 (hexadecimal), 10 (decimal), 8 (octal) and 2 (binary), the bases most commonly used by programmers. In scientific mode, the on-screen numeric keypad includes the hexadecimal digits A through F, which are active when "Hex" is selected. The Windows Calculator, however, only supports integers.

Cultural

Etymology

The word "hexadecimal" is strange in that hexa is derived from the Greek έξ (hex) for "six" and decimal is derived from the Latin for "tenth". It may have been derived from the Latin root, but Greek deka is so similar to the Latin decem that some would not consider this nomenclature inconsistent. However, the word "sexagesimal" (base 60) retains the Latin prefix. The earlier Bendix documentation used the term "sexadecimal". Donald Knuth has pointed out that the etymologically correct term is "senidenary", from the Latin term for "grouped by 16". (The terms "binary", "ternary" and "quaternary" are from the same Latin construction, and the etymologically correct term for "decimal" arithmetic is "denary".)[8] Schwartzman notes that the pure expectation from the form of usual Latin-type phrasing would be "sexadecimal", but then computer hackers would be tempted to shorten the word to "sex".[9] Incidentally, the etymologically proper Greek term would be hexadecadic (although in Modern Greek deca-hexadic (δεκαεξαδικός) is more commonly used).

Common patterns and humor

Hexadecimal is sometimes used in programmer jokes because certain words can be formed using only hexadecimal digits. Some of these words are "dead", "beef", "babe", and, with appropriate substitutions, "c0ffee". Since these are quickly recognizable by programmers, debugging setups sometimes initialize memory to them to help programmers see when something has not been initialized. Some people add an H after a number if they want to show that it is written in hexadecimal. In older Intel assembly syntax, this is sometimes the case. "Hexspeak" may be the forerunner of the modern web parlance of "1337speak".

An example is the magic number in fat Mach-O files and in the Java class file structure, which is "CAFEBABE". Single-architecture Mach-O files have the magic number "FEEDFACE" at their beginning.

A Knuth reward check is one hexadecimal dollar, or $2.56.

The following table shows a joke in hexadecimal:

3x12=36
2x12=24
1x12=12
0x12=18

The first three are interpreted as multiplication, but in the last, "0x" signals a hexadecimal interpretation of 12, which is 18 in decimal.

0xdeadbeef is sometimes put into uninitialized memory.

Another joke based on the use of a word containing only letters from the first six in the alphabet (and thus those used in hexadecimal) is...

If only DEAD people understand hexadecimal, how many people understand hexadecimal?

In this case, DEAD refers to a hexadecimal number (57005 base 10), not the state of being no longer alive. Obviously, DEAD normally should not be written in all-caps (as in the preceding) as it makes it stand out, thus ruining the riddle.

Microsoft Windows XP clears its locked index.dat files with the hex codes: "0BADF00D".

Two common bit patterns often employed to test hardware are 01010101 and 10101010 (their corresponding hex values are 55h and AAh, respectively). The reason for their use is to alternate between off ('0') to on ('1') or vice versa when switching between these two patterns. These two values are often used together as signatures in critical PC system sectors (e.g., the hex word, 0xAA55 which on little-endian systems is 55h followed by AAh, must be at the end of a valid Master Boot Record).
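The relationship between these two patterns, and the byte order of the 0xAA55 signature, can be illustrated with a small JavaScript sketch. The helper names toBits8 and littleEndianBytes16 are hypothetical and exist only for this example.

```javascript
// Format a byte value as eight binary digits.
function toBits8(byte) {
    return byte.toString(2).padStart(8, "0");
}

// Return the two bytes of a 16-bit word in little-endian storage order.
function littleEndianBytes16(word) {
    const view = new DataView(new ArrayBuffer(2));
    view.setUint16(0, word, true);     // true selects little-endian
    return [view.getUint8(0), view.getUint8(1)];
}

console.log(toBits8(0x55));            // 01010101
console.log(toBits8(0xAA));            // 10101010

// Every bit differs between the two patterns, so their XOR is all ones.
console.log(toBits8(0x55 ^ 0xAA));     // 11111111

// Stored little-endian, the word 0xAA55 is the byte 55h followed by AAh.
console.log(littleEndianBytes16(0xAA55).map(b => b.toString(16)));   // [ '55', 'aa' ]
```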

Primary numeral system

There have been occasional attempts to promote hexadecimal as the preferred numeral system. These attempts usually propose pronunciation and/or symbology. Sometimes the proposal unifies standard measures so that they are multiples of 16.[10][11][12]

An example of unifying standard measures is Hexadecimal time which subdivides a day by 16 so that there are 16 "hexhours" in a day.[12]

References

  1. ^ RFC 3986.
  2. ^ "Hexadecimal web colors explained".
  3. ^ Some of C's syntactical descendants are C++, C#, Java, JavaScript, and Windows PowerShell
  4. ^ The string "\x1B[0m\x1B[25;1H" specifies the characters: Esc [ 0 m Esc [ 2 5 ; 1 H. This expresses the escape sequences used to reset the character set and color then move the cursor to line 25 in an ANSI terminal.
  5. ^ BBC BASIC is not portable to Microsoft BASIC since the latter takes & to prefix octal values.
  6. ^ Donald E. Knuth. The TeXbook (Computers and Typesetting, Volume A). Reading, Massachusetts: Addison-Wesley, 1984. ISBN 0-201-13448-9. The source code of the book in TeX (and a needed set of macros [1]) is available online on CTAN.
  7. ^ Letters to the editor: On binary notation, Bruce A. Martin, Associated Universities Inc., Communications of the ACM, Volume 11, Issue 10 (October 1968) Page: 658 doi:10.1145/364096.364107
  8. ^ Knuth, Donald (1969). The Art of Computer Programming, Volume 2. ISBN 0-201-03802-1. (Chapter 17.)
  9. ^ Schwartzman, S. (1994). The Words of Mathematics: an etymological dictionary of mathematical terms used in English. ISBN 0-88385-511-9.
  10. ^ "Intuitor Hex Headquarters".
  11. ^ "A proposal for addition of the six Hexadecimal digits (A-F) to Unicode".
  12. ^ a b Nystrom, John William (1862). Project of a New System of Arithmetic, Weight, Measure and Coins: Proposed to be called the Tonal System, with Sixteen to the Base. Philadelphia.
