
Uuencode

From Wikipedia, the free encyclopedia

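This is a historical revision of this page, as edited by Greener (talk | contribs) on 29 July 2006 (Sat) at 12:28. It may differ significantly from the current revision.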

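The name uuencode derives from "Unix-to-Unix encoding". It originated as a Unix program that encoded binary data for transmission over the uucp mail system, and it is a form of binary-to-text encoding. uudecode is the companion decoding program. uuencode and uudecode were commonly used for sending files by email and for posting to Usenet newsgroups. They have since been largely replaced by MIME.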

Encoding process

Uuencoded data starts with a line of the form:

begin <mode> <file>

Where <mode> is the file's read/write/execute permissions as three octal digits, and <file> is the name to be used when recreating the binary data.
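The header line described above can be read with a few lines of Python. This is a minimal sketch, not the reference uudecode behaviour: the function name `parse_begin_line` is hypothetical, and it assumes the three fields are separated by single spaces and that the file name may itself contain spaces.

```python
def parse_begin_line(line: str) -> tuple[int, str]:
    """Split a 'begin <mode> <file>' header into (permission bits, file name)."""
    # Split at most twice so that a file name containing spaces stays intact.
    keyword, mode, name = line.rstrip("\r\n").split(" ", 2)
    if keyword != "begin":
        raise ValueError("not a uuencode header line")
    # The mode is written as octal digits, e.g. "644" means rw-r--r--.
    return int(mode, 8), name


mode, name = parse_begin_line("begin 644 cat.txt")
print(oct(mode), name)  # 0o644 cat.txt
```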

Uuencode repeatedly takes in a group of three bytes, adding trailing zeros if there are fewer than three bytes left. These 24 bits are split into four groups of six which are treated as numbers between 0 and 63. Decimal 32 is added to each number and they are output as ASCII characters which will lie in the range 32 (space) to 32+63 = 95 (underscore). ASCII characters greater than 95 may also be used; however, only the six right-most bits are relevant.
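The three-bytes-to-four-characters step can be sketched as follows. The helper name `encode_group` is an assumption made for illustration; the sketch always emits characters in the range 32 to 95 and does not use the grave accent alternative.

```python
def encode_group(chunk: bytes) -> str:
    """Encode one group of up to three bytes as four uuencode characters."""
    if not 1 <= len(chunk) <= 3:
        raise ValueError("expected 1 to 3 bytes")
    # Pad a short final group with trailing zero bytes.
    b0, b1, b2 = chunk.ljust(3, b"\x00")
    # Join the three bytes into one 24-bit number.
    bits = (b0 << 16) | (b1 << 8) | b2
    # Cut the 24 bits into four 6-bit values, most significant first.
    sextets = [(bits >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    # Add 32 to each value to land in the printable range 32..95.
    return "".join(chr(32 + s) for s in sextets)


print(encode_group(b"Cat"))  # 0V%T
```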

Each group of sixty output characters (corresponding to 45 input bytes) is output as a separate line preceded by an encoded character giving the number of encoded bytes on that line. For all lines except the last, this will be the character 'M' (ASCII code 77 = 32+45). If the input is not evenly divisible by 45, the last line will contain the remaining N output characters, preceded by the character whose code is 32 + the number of remaining input bytes. Finally, a line containing just a single space (or grave character) is output, followed by one line containing the string "end".
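Putting the header, the length-prefixed data lines and the trailer together, a complete encoder might look like the sketch below. The function name `uuencode`, its parameters and the default mode are assumptions for illustration. The sketch writes the terminating line as a grave accent and encodes zero values inside data lines as spaces; real implementations differ on both choices.

```python
def uuencode(data: bytes, name: str, mode: int = 0o644) -> str:
    """Return the uuencoded text for data, including begin and end lines."""
    lines = [f"begin {mode:o} {name}"]
    # Each data line carries at most 45 input bytes (60 output characters).
    for start in range(0, len(data), 45):
        chunk = data[start:start + 45]
        body = []
        for i in range(0, len(chunk), 3):
            # Pad the final group with zero bytes, then split 24 bits into 4 x 6.
            b0, b1, b2 = chunk[i:i + 3].ljust(3, b"\x00")
            bits = (b0 << 16) | (b1 << 8) | b2
            body.append("".join(chr(32 + ((bits >> s) & 0x3F))
                                for s in (18, 12, 6, 0)))
        # The first character encodes the number of input bytes on this line.
        lines.append(chr(32 + len(chunk)) + "".join(body))
    lines.append("`")   # zero-length line marking the end of the data
    lines.append("end")
    return "\n".join(lines) + "\n"


print(uuencode(b"Cat", "cat.txt"), end="")
```

For a 100-byte input this produces two full lines starting with 'M' (45 bytes each) and a final data line starting with '*' (ASCII 42 = 32 + 10).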

Sometimes each data line has extra dummy characters (often the grave accent) added to avoid problems with mailers that strip trailing spaces. These characters are ignored by uudecode. The grave accent (ASCII 96) can also be used in place of a space character: both share the low six bits 100000, so once 32 is subtracted and only the six right-most bits are kept, both decode to the value 0.
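The tolerance for grave accents, dummy characters and stripped trailing spaces can be illustrated with a line decoder. This is a sketch under stated assumptions rather than the behaviour of any particular uudecode: the name `decode_line` is hypothetical, and a line that is too short is assumed to have lost trailing spaces, which are restored before decoding.

```python
def decode_line(line: str) -> bytes:
    """Decode one uuencoded data line (length character plus encoded body)."""
    line = line.rstrip("\r\n")
    if not line:
        return b""
    # Subtract 32 and keep six bits, so both ' ' and '`' give a count of 0.
    count = (ord(line[0]) - 32) & 0x3F
    needed = 4 * ((count + 2) // 3)  # encoded characters actually required
    # Ignore extra dummy characters; restore trailing spaces a mailer removed.
    body = line[1:1 + needed].ljust(needed, " ")
    out = bytearray()
    for i in range(0, needed, 4):
        s0, s1, s2, s3 = ((ord(c) - 32) & 0x3F for c in body[i:i + 4])
        out += bytes([(s0 << 2) | (s1 >> 4),
                      ((s1 & 0x0F) << 4) | (s2 >> 2),
                      ((s2 & 0x03) << 6) | s3])
    # Drop the zero padding added to the final group during encoding.
    return bytes(out[:count])


print(decode_line("#0V%T"))    # b'Cat'
print(decode_line("#0V%T``"))  # b'Cat'
print(decode_line("`"))        # b''
```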

Despite using this limited range of characters, there are still some problems encountered when uuencoded data passes through certain old computers. The worst offenders are computers using non-ASCII character sets such as EBCDIC. To solve this problem, the Xxencode format was created as a more robust version of the encoding, which used only alphanumeric characters and the plus and minus symbols.

Uuencode encoding example

The table shows the uuencoding of the three ASCII encoded characters Cat into its uuencoded representation 0V%T:

Original characters: C a t
Original ASCII codes (decimal): 67 97 116
ASCII codes (binary): 01000011 01100001 01110100
New decimal values (6-bit groups): 16 54 5 52
+32: 48 86 37 84
Uuencoded characters: 0 V % T

The complete uuencoded output for a file containing the three ASCII characters Cat might appear as follows:

begin 644 cat.txt
#0V%T
`
end
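The data line in this example can be cross-checked with Python's standard `binascii` module, which encodes and decodes single uuencode lines. The module does not produce the `begin` and `end` lines, so only the data line is compared here.

```python
import binascii

# b2a_uu encodes up to 45 bytes into one data line, including the length
# character and a trailing newline.
line = binascii.b2a_uu(b"Cat")
print(line)                    # b'#0V%T\n'

# a2b_uu reverses the operation for a single data line.
print(binascii.a2b_uu(line))   # b'Cat'
```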

Uuencode character table

The following table represents the subset of ASCII characters used by UUEncode and the 6-bit binary string they represent.

Character  ASCII decimal  Binary      Character  ASCII decimal  Binary
(space) 32 000 000   @ 64 100 000
! 33 000 001   A 65 100 001
" 34 000 010   B 66 100 010
# 35 000 011   C 67 100 011
$ 36 000 100   D 68 100 100
% 37 000 101   E 69 100 101
& 38 000 110   F 70 100 110
' 39 000 111   G 71 100 111
( 40 001 000   H 72 101 000
) 41 001 001   I 73 101 001
* 42 001 010   J 74 101 010
+ 43 001 011   K 75 101 011
, 44 001 100   L 76 101 100
- 45 001 101   M 77 101 101
. 46 001 110   N 78 101 110
/ 47 001 111   O 79 101 111
0 48 010 000   P 80 110 000
1 49 010 001   Q 81 110 001
2 50 010 010   R 82 110 010
3 51 010 011   S 83 110 011
4 52 010 100   T 84 110 100
5 53 010 101   U 85 110 101
6 54 010 110   V 86 110 110
7 55 010 111   W 87 110 111
8 56 011 000   X 88 111 000
9 57 011 001   Y 89 111 001
: 58 011 010   Z 90 111 010
; 59 011 011   [ 91 111 011
< 60 011 100   \ 92 111 100
= 61 011 101   ] 93 111 101
> 62 011 110   ^ 94 111 110
? 63 011 111   _ 95 111 111
  ` 96 (1) 000 000
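The table follows directly from the rule "character code = 6-bit value + 32", so it can be regenerated rather than memorised. A minimal sketch, where the helper name `uu_table` is hypothetical and the grave accent row is left out because it is an alternative spelling of the value 0:

```python
def uu_table() -> list[tuple[str, int, str]]:
    """Return (character, ASCII code, 6-bit binary string) for values 0..63."""
    return [(chr(32 + value), 32 + value, f"{value:06b}") for value in range(64)]


rows = uu_table()
# Print two columns side by side, values 0..31 on the left and 32..63 on the right.
for left, right in zip(rows[:32], rows[32:]):
    print(f"{left[0]!r:>5} {left[1]:>3} {left[2]}    "
          f"{right[0]!r:>5} {right[1]:>3} {right[2]}")
```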

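References

This article incorporates material, in whole or in part, from the Free On-line Dictionary of Computing (FOLDOC), which is licensed under the GFDL.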

External links

  • UUDeview - open-source program to encode/decode Base64, BinHex, uuencode, xxencode, etc. for Unix/Windows/DOS