Uuencode

From Wikipedia, the free encyclopedia

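This is a historical revision of this page, edited by Greener at 03:16 on 30 July 2006 (Sunday), with the edit summary "uuencode encoding example". It may differ significantly from the current revision.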

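The name uuencode derives from "Unix-to-Unix encoding". It was originally a Unix program that encoded binary data for transmission over the uucp mail system, and it is a form of binary-to-text encoding. uudecode is the companion decoding program. uuencode and uudecode were commonly used for sending files by e-mail and in Usenet newsgroup posts. They have since been largely replaced by MIME encodings.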

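Encoding process

The output of uuencode has the following file format:

begin <input file mode> <input file name>
<encoded data>
`
end

<input file mode>

The mode follows the Unix file permission model and consists of three octal digits, laid out as follows:

Owner                                Group                                Others
read (r)  write (w)  execute (x)     read (r)  write (w)  execute (x)     read (r)  write (w)  execute (x)

For example, when <input file mode> is 666, its binary form is 110110110, which means that the owner, the group, and others all have read and write permission on the file.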
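The mapping from octal digits to permission bits can be sketched in a few lines of Python. This is only an illustration; the function name `describe_mode` is hypothetical and not part of any uuencode tool.

```python
def describe_mode(mode: str) -> str:
    """Turn a three-digit octal mode such as "666" into an rwx string."""
    if len(mode) != 3 or any(d not in "01234567" for d in mode):
        raise ValueError("expected three octal digits")
    parts = []
    for digit in mode:  # order: owner, group, others
        bits = int(digit, 8)
        parts.append(
            ("r" if bits & 4 else "-")
            + ("w" if bits & 2 else "-")
            + ("x" if bits & 1 else "-")
        )
    return "".join(parts)


print(format(int("666", 8), "09b"))  # 110110110
print(describe_mode("666"))          # rw-rw-rw-
print(describe_mode("644"))          # rw-r--r--
```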

<encoded data>

Uuencode repeatedly takes in a group of three bytes, adding trailing zeros if there are fewer than three bytes left. These 24 bits are split into four groups of six which are treated as numbers between 0 and 63. Decimal 32 is added to each number and they are output as ASCII characters which will lie in the range 32 (space) to 32+63 = 95 (underscore). ASCII characters greater than 95 may also be used; however, only the six right-most bits are relevant.
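As a minimal sketch of this step, the following Python function encodes one group of up to three bytes into four characters. The name `encode_group` is hypothetical, and the sketch always emits a space for the value 0 rather than a grave accent.

```python
def encode_group(chunk: bytes) -> str:
    """Encode one to three bytes as four uuencode characters."""
    if not 1 <= len(chunk) <= 3:
        raise ValueError("chunk must hold 1 to 3 bytes")
    padded = chunk.ljust(3, b"\x00")        # add trailing zero bytes
    bits = int.from_bytes(padded, "big")    # the 24 bits as one integer
    # Split into four 6-bit values, most significant first.
    values = [(bits >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return "".join(chr(32 + v) for v in values)


print(encode_group(b"Cat"))  # 0V%T
```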

Each group of sixty output characters (corresponding to 45 input bytes) is output as a separate line preceded by an encoded character giving the number of encoded bytes on that line. For every full line, this is the character 'M' (ASCII code 77 = 32+45). If the input length is not a multiple of 45, the last data line holds the remaining output characters, preceded by the character whose code is 32 plus the number of remaining input bytes. Finally, a line containing just a single space (or grave accent) is output, followed by one line containing the string "end".
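The line framing can be illustrated with a short Python sketch. The function name `encode_lines` is hypothetical, and the choice of a grave accent (rather than a single space) for the terminating line is an assumption of this example.

```python
def encode_lines(data: bytes) -> list[str]:
    """Return the data lines, the zero-length line, and the "end" line."""
    lines = []
    for start in range(0, len(data), 45):
        chunk = data[start:start + 45]
        body = []
        for i in range(0, len(chunk), 3):
            group = chunk[i:i + 3].ljust(3, b"\x00")
            bits = int.from_bytes(group, "big")
            body.extend(chr(32 + ((bits >> s) & 0x3F)) for s in (18, 12, 6, 0))
        # The length character is 32 plus the number of input bytes on the line.
        lines.append(chr(32 + len(chunk)) + "".join(body))
    lines.append("`")    # zero-length line; a single space is the other form
    lines.append("end")
    return lines


# 100 input bytes give two full lines (45 bytes each) and one line of 10 bytes.
print([line[0] for line in encode_lines(bytes(100))[:3]])  # ['M', 'M', '*']
```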

Sometimes each data line has extra dummy characters (often the grave accent) added to avoid problems with mailers that strip trailing spaces. These characters are ignored by uudecode. The grave accent (ASCII 96) can also be used in place of a space character: the two share the same six low-order bits (100000), so after 32 is subtracted both decode to the value 0 (000000).
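A decoder can treat the space and the grave accent alike by subtracting 32 and keeping only the low six bits. The sketch below assumes that approach; `decode_char` and `decode_line` are hypothetical names, and the handling of stripped trailing spaces is one possible choice rather than the behaviour of any particular uudecode.

```python
def decode_char(ch: str) -> int:
    """Map one uuencode character to its 6-bit value."""
    return (ord(ch) - 32) & 0x3F


def decode_line(line: str) -> bytes:
    """Decode one data line, ignoring characters past the declared length."""
    if not line:
        return b""
    count = decode_char(line[0])            # number of bytes on this line
    groups = (count + 2) // 3               # four characters per three bytes
    # Pad with spaces in case a mailer stripped trailing ones.
    body = line[1:1 + 4 * groups].ljust(4 * groups, " ")
    out = bytearray()
    for i in range(0, len(body), 4):
        a, b, c, d = (decode_char(ch) for ch in body[i:i + 4])
        out += ((a << 18) | (b << 12) | (c << 6) | d).to_bytes(3, "big")
    return bytes(out[:count])


print(decode_char(" "), decode_char("`"))  # 0 0
print(decode_line("#0V%T"))                # b'Cat'
print(decode_line("#0V%T```"))             # b'Cat'
```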

Despite using this limited range of characters, there are still some problems encountered when uuencoded data passes through certain old computers. The worst offenders are computers using non-ASCII character sets such as EBCDIC. To solve this problem, the Xxencode format was created as a more robust version of the encoding, which used only alphanumeric characters and the plus and minus symbols.

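uuencode encoding examples

Short example

The table below shows how the three ASCII characters Cat are encoded as the uuencode string 0V%T.

Original characters             C          a          t
Original ASCII code (decimal)   67         97         116
ASCII code (binary)             01000011   01100001   01110100
Regrouped into 6-bit values     010000   110110   000101   110100
New decimal values              16       54       5        52
+32                             48       86       37       84
Uuencoded characters            0        V        %        T

The three ASCII characters Cat are therefore represented in a uuencoded file as: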

begin 644 cat.txt
#0V%T
`
end
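The file form above can be reproduced and read back with Python's standard `binascii` module, whose `b2a_uu` and `a2b_uu` functions handle one data line at a time. The wrapper names `write_uu` and `read_uu` are hypothetical, and the sketch assumes a grave accent on the terminating line.

```python
import binascii


def write_uu(name: str, mode: str, data: bytes) -> str:
    """Wrap data in a begin/end envelope, 45 input bytes per line."""
    lines = [f"begin {mode} {name}"]
    for start in range(0, len(data), 45):
        encoded = binascii.b2a_uu(data[start:start + 45])
        lines.append(encoded.decode("ascii").rstrip("\n"))
    lines += ["`", "end"]
    return "\n".join(lines) + "\n"


def read_uu(text: str) -> tuple[str, str, bytes]:
    """Return (mode, name, data) from uuencoded text."""
    lines = text.splitlines()
    header = lines[0].split(" ", 2)
    if len(header) != 3 or header[0] != "begin":
        raise ValueError("missing 'begin <mode> <name>' header")
    data = bytearray()
    for line in lines[1:]:
        if line == "end":
            break
        if line:  # skip a terminator line whose single space was stripped
            data += binascii.a2b_uu(line)
    else:
        raise ValueError("missing 'end' line")
    return header[1], header[2], bytes(data)


text = write_uu("cat.txt", "644", b"Cat")
print(text, end="")
print(read_uu(text))  # ('644', 'cat.txt', b'Cat')
```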

Long example

Uuencode table

The following table represents the subset of ASCII characters used by UUEncode and the 6-bit binary string they represent.

Printable representation  ASCII decimal  Binary representation   Printable representation  ASCII decimal  Binary representation
(space) 32 000 000   @ 64 100 000
! 33 000 001   A 65 100 001
" 34 000 010   B 66 100 010
# 35 000 011   C 67 100 011
$ 36 000 100   D 68 100 100
% 37 000 101   E 69 100 101
& 38 000 110   F 70 100 110
' 39 000 111   G 71 100 111
( 40 001 000   H 72 101 000
) 41 001 001   I 73 101 001
* 42 001 010   J 74 101 010
+ 43 001 011   K 75 101 011
, 44 001 100   L 76 101 100
- 45 001 101   M 77 101 101
. 46 001 110   N 78 101 110
/ 47 001 111   O 79 101 111
0 48 010 000   P 80 110 000
1 49 010 001   Q 81 110 001
2 50 010 010   R 82 110 010
3 51 010 011   S 83 110 011
4 52 010 100   T 84 110 100
5 53 010 101   U 85 110 101
6 54 010 110   V 86 110 110
7 55 010 111   W 87 110 111
8 56 011 000   X 88 111 000
9 57 011 001   Y 89 111 001
: 58 011 010   Z 90 111 010
; 59 011 011   [ 91 111 011
< 60 011 100   \ 92 111 100
= 61 011 101   ] 93 111 101
> 62 011 110   ^ 94 111 110
? 63 011 111   _ 95 111 111
  ` 96 (1) 000 000
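The rows of the table follow directly from the rule "value plus 32", so they can be regenerated with a short Python sketch. The function name `table_rows` is hypothetical.

```python
def table_rows() -> list[tuple[str, int, str]]:
    """Return (character, ASCII code, 6-bit binary string) for values 0 to 63."""
    rows = []
    for value in range(64):
        code = 32 + value
        bits = format(value, "06b")
        rows.append((chr(code), code, f"{bits[:3]} {bits[3:]}"))
    return rows


rows = table_rows()
print(rows[0])   # (' ', 32, '000 000')
print(rows[45])  # ('M', 77, '101 101')
print(rows[63])  # ('_', 95, '111 111')
```

The grave accent (code 96) is not produced by this loop: its value 64 does not fit in six bits, and masking it to six bits gives 0, the same value as the space.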

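See also

References

Some or all of the content of this article comes from the Free On-line Dictionary of Computing (FOLDOC), which is released under the GFDL.

External links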

  • UUDeview - open-source program to encode/decode Base64, BinHex, uuencode, xxencode, etc. for Unix/Windows/DOS