Shannon–Fano–Elias coding

In information theory, Shannon–Fano–Elias coding is a precursor to arithmetic coding, in which probabilities are used to determine codewords.^[1]

Algorithm Description

Given a discrete random variable X of ordered values to be encoded, let $p(x)$ be the probability for any x in X. Define a function

{\bar {F}}(x)=\sum _{x_{i}<x}p(x_{i})+{\frac {1}{2}}p(x)

Algorithm:

For each x in X:

1. Let Z be the binary expansion of ${\bar {F}}(x)$.
2. Choose the length of the encoding of x, L(x), to be the integer $\lceil \log _{2}{\frac {1}{p(x)}}\rceil +1$.
3. Choose the encoding of x, code(x), to be the first L(x) most significant bits after the binary point of Z.
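The steps above can be sketched in Python. This is a minimal illustration rather than a reference implementation: the names `sfe_code` and `ceil_log2_inverse` are hypothetical, the input is assumed to be a dictionary whose insertion order is the symbol ordering, and exact fractions are used so that the binary expansion is not affected by floating-point rounding.

```python
from fractions import Fraction


def ceil_log2_inverse(p):
    """Return ceil(log2(1/p)) exactly: the smallest k with 2**-k <= p."""
    k = 0
    while Fraction(1, 2 ** k) > p:
        k += 1
    return k


def sfe_code(probs):
    """Return {symbol: codeword} for an ordered mapping symbol -> probability."""
    codes = {}
    cumulative = Fraction(0)  # sum of p(x_i) over the symbols preceding x
    for symbol, p in probs.items():
        p = Fraction(p)
        f_bar = cumulative + p / 2           # modified CDF value F-bar(x)
        length = ceil_log2_inverse(p) + 1    # L(x)
        # Emit the first L(x) bits of the binary expansion of F-bar(x).
        bits = []
        frac = f_bar
        for _ in range(length):
            frac *= 2
            bit = int(frac >= 1)
            bits.append(str(bit))
            frac -= bit
        codes[symbol] = "".join(bits)
        cumulative += p
    return codes


probs = {
    "A": Fraction(1, 3),
    "B": Fraction(1, 4),
    "C": Fraction(1, 6),
    "D": Fraction(1, 4),
}
print(sfe_code(probs))
```

For the four-symbol distribution shown, the sketch yields the codewords 001, 011, 1010 and 111.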

Example

Let X = {A, B, C, D}, with probabilities p = {1/3, 1/4, 1/6, 1/4}.

For A

{\bar {F}}(A)={\frac {1}{2}}p(A)={\frac {1}{2}}\cdot {\frac {1}{3}}=0.1666...

In binary, Z(A) = 0.0010101010...

L(A)=\lceil \log _{2}{\frac {1}{\frac {1}{3}}}\rceil +1=3

code(A) is 001

For B

{\bar {F}}(B)=p(A)+{\frac {1}{2}}p(B)={\frac {1}{3}}+{\frac {1}{2}}\cdot {\frac {1}{4}}=0.4583333...

In binary, Z(B) = 0.01110101010101...

L(B)=\lceil \log _{2}{\frac {1}{\frac {1}{4}}}\rceil +1=3

code(B) is 011

For C

{\bar {F}}(C)=p(A)+p(B)+{\frac {1}{2}}p(C)={\frac {1}{3}}+{\frac {1}{4}}+{\frac {1}{2}}\cdot {\frac {1}{6}}=0.66666...

In binary, Z(C) = 0.101010101010...

L(C)=\lceil \log _{2}{\frac {1}{\frac {1}{6}}}\rceil +1=4

code(C) is 1010

For D

{\bar {F}}(D)=p(A)+p(B)+p(C)+{\frac {1}{2}}p(D)={\frac {1}{3}}+{\frac {1}{4}}+{\frac {1}{6}}+{\frac {1}{2}}\cdot {\frac {1}{4}}=0.875

In binary, Z(D) = 0.111

L(D)=\lceil \log _{2}{\frac {1}{\frac {1}{4}}}\rceil +1=3

code(D) is 111

Algorithm Analysis

Prefix Code

Shannon-Fano-Elias coding produces a binary prefix code, allowing for direct decoding.
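Direct decoding can be illustrated with a short Python sketch. The function name `decode` and the dictionary-based codebook are assumptions made for illustration; the codewords are those of the example above. Because no codeword is a prefix of another, the first codeword that matches the accumulated bits is the only possible match.

```python
def decode(bits, codebook):
    """Decode a bit string using a prefix-free codebook {symbol: codeword}."""
    inverse = {word: symbol for symbol, word in codebook.items()}
    symbols = []
    current = ""
    for bit in bits:
        current += bit
        if current in inverse:  # a complete codeword has been read
            symbols.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete codeword")
    return symbols


codebook = {"A": "001", "B": "011", "C": "1010", "D": "111"}
message = "".join(codebook[s] for s in "CABD")  # "1010001011111"
print(decode(message, codebook))  # ['C', 'A', 'B', 'D']
```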

Let bcode(x) be the rational number formed by placing a binary point before a binary code. For example, if code(C) = 1010 then bcode(C) = 0.1010 in binary. For all x, if no other y exists such that

bcode(x)\leq bcode(y)<bcode(x)+2^{-L(x)}

then all the codes form a prefix code.

By comparing ${\bar {F}}$ to the CDF of X, this property may be demonstrated graphically for Shannon–Fano–Elias coding.

By definition of L it follows that

2^{-L(x)}\leq {\frac {1}{2}}p(x)

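And because the bits after the first L(x) are truncated from ${\bar {F}}(x)$ to form code(x), it follows that

0\leq {\bar {F}}(x)-bcode(x)<2^{-L(x)}

Together, the two inequalities place the whole range from bcode(x) to bcode(x) + 2^{-L(x)} inside the step of the CDF that belongs to x, because ${\bar {F}}(x)$ sits at the midpoint of that step and the step has height p(x). The steps of different values do not overlap, which demonstrates the disjoint nature of the ranges for the codes.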
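The disjointness of the ranges can also be checked numerically. The following Python sketch is an illustration only, with hypothetical helper names `code_interval` and `intervals_disjoint`; it maps each codeword to the range from bcode(x) up to, but not including, bcode(x) + 2^{-L(x)} and tests that no two ranges overlap, which for binary codewords is equivalent to the prefix property.

```python
from fractions import Fraction


def code_interval(codeword):
    """Return (low, high) with low = bcode and high = bcode + 2**-L, exactly."""
    length = len(codeword)
    low = Fraction(int(codeword, 2), 2 ** length)
    return low, low + Fraction(1, 2 ** length)


def intervals_disjoint(codebook):
    """True if the half-open ranges [low, high) of all codewords are disjoint."""
    spans = sorted(code_interval(word) for word in codebook.values())
    return all(
        prev_high <= next_low
        for (_, prev_high), (next_low, _) in zip(spans, spans[1:])
    )


codebook = {"A": "001", "B": "011", "C": "1010", "D": "111"}
print(intervals_disjoint(codebook))  # True
```

For the example code the ranges are [1/8, 1/4), [3/8, 1/2), [5/8, 11/16) and [7/8, 1), which do not overlap.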

Code Length

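The average code length is $LC(X)=\sum _{x\in X}p(x)L(x)=\sum _{x\in X}p(x)(\lceil \log _{2}{\frac {1}{p(x)}}\rceil +1)$.
Since $\log _{2}{\frac {1}{p(x)}}\leq \lceil \log _{2}{\frac {1}{p(x)}}\rceil <\log _{2}{\frac {1}{p(x)}}+1$, it follows that for H(X), the entropy of the random variable X,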

H(X)+1\leq LC(X)<H(X)+2

Thus Shannon–Fano–Elias codes use between 1 and 2 more bits per symbol of X than the entropy, so the code is not used in practice.
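The bound can be checked on the example distribution with a short Python sketch. The function name `average_length_and_entropy` is hypothetical, and floating-point arithmetic is assumed to be accurate enough for this small case.

```python
from math import ceil, log2


def average_length_and_entropy(probs):
    """Return (LC(X), H(X)) in bits for a list of symbol probabilities."""
    avg_len = 0.0
    entropy = 0.0
    for p in probs:
        length = ceil(-log2(p)) + 1   # L(x) = ceil(log2(1/p(x))) + 1
        avg_len += p * length
        entropy += -p * log2(p)
    return avg_len, entropy


avg_len, entropy = average_length_and_entropy([1 / 3, 1 / 4, 1 / 6, 1 / 4])
print(f"LC(X) = {avg_len:.4f}, H(X) = {entropy:.4f}")
print(entropy + 1 <= avg_len < entropy + 2)
```

For this distribution the average length is 19/6, about 3.17 bits per symbol, against an entropy of about 1.96 bits, so the overhead is about 1.21 bits per symbol.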

References

[1] T. M. Cover and Joy A. Thomas (2006). Elements of Information Theory (2nd ed.). John Wiley and Sons. pp. 127–128. ISBN 978-0-471-24195-9.
