Jump to content

Information bottleneck method

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Therustyone (talk | contribs) at 14:05, 21 November 2007. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

The information bottleneck method is a technique for finding the best trade-off between accuracy and compression when summarizing (e.g. clustering) a random variable X when given a joint probability distribution between X and an observed variable Y.


The compressed variable is and the algorithm minimises the following



where are the mutual information between and respectively

Gaussian Information Bottleneck [1]

A relatively simple application of the information bottleneck is to Gaussian variates and this has some semblance to a least squares reduced rank or canonical approximation. Assume are jointly multivariate zero mean normal vectors and is a compressed version of which must maintain a given value of mutual information with . It can be shown that the optimum is a normal vector consisting of orthogonal linear combinations of the elements of .

The projection matrix contains rows selected from the weighted left eigenvectors of the singular value decomposition of the following matrix (generally asymmetric)



Define the singular value decomposition

 with 

and the critical values

.

then the number of active eigenvectors in the projection, or order of approximation, is given by

 

And we finally get



In which the weights are given by



where .


[1] G. Chechik, A Globerson, N. Tishby and Y. Weiss: “ Information Bottleneck for Gaussian Variables”. Journal of Machine Learning Research 6, Jan 2005, pp. 165-188

See also