
Talk:Information theory/Archive 2

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by MiszaBot I (talk | contribs) at 03:25, 14 July 2009 (Archiving 9 thread(s) from Talk:Information theory.). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.


Pending tasks (what do we want?)

Here is how I interpret the pending tasks:

  • Make sure intro and overview are understandable with daily life examples. Making sure the reader can subjectively tell what a "bit" is, for example, is a good start. Discussion of elephant trumpets, whale songs, and black holes does not contribute in this manner, although their mere mention (with links, say, to the no hair theorem) may be useful to the curious.
  • Move applications up before mathematics (just after Overview) and make it independent. I'm not sure this is for the best, as Overview states some applications and, for the rest, it's useful to know some theory. Also, we need to be careful not to conflate security and information theory, since information knowledge is necessary for, yet not sufficient for, breaking ciphers. (For example, a Diffie-Hellman key exchange assumes that an adversary will have full knowledge, but the form of that knowledge assures that the adversary will not be able to use it within a reasonable amount of time. Thus, although secure in practice, Diffie-Hellman key exchange is information-theoretically insecure).
  • Move history down to end (just before References). Again, I'm not sure why this is best, and some "good" articles, like Thermodynamics, list history first. If moved, we need to check that this is just as readable, but the reader doesn't need to know scientific history to understand scientific phenomena. Clearly more post-1948 history is needed. Most applications of information theory, even elementary ones, are post-1948 (Huffman codes, arithmetic codes, LDPC codes, Turbo codes), and much of the theory is too, e.g., information divergence, information inequalities, Fisher sufficient statistics, Kolmogorov complexity, network information theory, etc. However, it might be best to move the history to its own article. The current section is far too in-depth.

There's still a lot to do and a lot to add, and we might reasonably ask if certain topics should be shortened or omitted, e.g., Kullback–Leibler divergence (which, although important, can generally be omitted from elementary explanation), differential entropy (which is most important for transinformation, which can be interpreted as a limit of discrete transinformation), and gambling (which most information theorists I know love, but which is a somewhat fringe topic). Again, the thermodynamics article is shorter and far less mathematical. Do we want this article to be long with full mathematical explanations or medium-sized with general explanations and links to the math? I don't know the right answer, but it might be nice to get some ideas of what people would like out of this.

A few final things to keep in mind:

  • Diversions or examples without explanation confuse rather than clarify.
  • Sentences should be as concise as possible, but no more so, for easy reading.
  • Proofread before (or at least immediately after) changes.
  • Others might not read the same way as you.

Due to these factors, it's good to discuss significant changes on the talk page. I find my "peer edited" sections are a lot better than my first attempts. And too much unilateral change can result in an unstable, unreadable, and flat-out wrong article. Calbaer 01:44, 15 June 2006 (UTC)

I say keep the K–L divergence. Not only is it important, but it is really useful in understanding the mutual information. The treatment of other stuff like gambling, intelligence, and measure theory could be cut. The measure theory section shows a useful mnemonic device, but the analogy is incomplete due to the failure of the transinformation to remain non-negative in the case of three or more random variables. It could probably be cut out entirely. History of information theory could indeed have its own article, but keep a short summary of history in this article here. Also some mention of Gibbs' inequality would be nice as it shows the K–L divergence (and thus the bivariate mutual information) to be non-negative. The AEP could be mentioned in the source theory section, since it is a property of certain sources. Don't hesitate to add what you think should be added. --130.94.162.64 17:17, 20 June 2006 (UTC)
K-L divergence is certainly fundamental, so it should stay. Measure theory is also fundamental, but it is not necessary for a good understanding of information theory, so it could go. History could be summarized and moved. I'll get to this when I can, though if someone else wants to do so first, go ahead. Calbaer 18:36, 20 June 2006 (UTC)
the section that mentions measure theory should be kept, or moved somewhere else, rather than deleted. also, the following sentence in that section is misleading/incorrect: "it justifies, in a certain formal sense, the practice of calling Shannon's entropy a "measure" of information." entropy is not a measure; given a random variable f, one integrates f ln f against a suitable measure (the Lebesgue measure, the counting measure, etc.) to get the entropy. Mct mht 18:52, 20 June 2006 (UTC)
I think that the section confuses more than it helps; most readers will not have a working knowledge of measure theory, and, as you say, the section needs at least a little reworking. I'll delete it and add a link to Information theory and measure theory. Calbaer 19:47, 20 June 2006 (UTC)
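As an aside, the non-negativity that Gibbs' inequality gives the K–L divergence (and hence the bivariate mutual information) is easy to check numerically. A minimal Python sketch; the helper name kl_divergence is my own, not from any library:

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits.

    Assumes p and q are finite distributions over the same support,
    with q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Gibbs' inequality: D(p || q) >= 0, with equality iff p == q.
p = [0.5, 0.25, 0.25]
q = [1/3, 1/3, 1/3]
assert kl_divergence(p, q) >= 0
assert abs(kl_divergence(p, p)) < 1e-12
```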

Seems to me a big priority should be to break out all the details into other articles. Information Theory is very much a science. If you look at the "Mathematics" Wikipedia article, you'll see that there are few actual mathematics concepts discussed. It's more a history and categorization of the sub-sciences of mathematics, with links. I picture "Information Theory" as a fairly small article (only because the science is young) giving only a bit more than a dictionary definition of "information theory", and then a ton of links with brief descriptions to categories, related theories, practical applications, algorithms, coding methods, etc. I think this article should be more of a starting point. Details should be elsewhere. Also, until this can be done, at least a set of links might be nice. I guess I'm saying here I'm surprised that a search of the "Information Theory" article in Wikipedia doesn't find the phrase "forward error correction"! qz27, 22 June 2006

Good point, and one that argues for the elimination of the following as all but links: Self-information, Kullback–Leibler divergence, differential entropy. We shouldn't be scared to have a little math in the article, but regular, joint, and conditional entropy can be defined together, and mutual information in terms of them. That's enough for the idealizations of source and channel coding. If people want to know why the math is as it is, there could be a "definitions in information theory" article or something like that. Two things though: 1) "error correction" is mentioned three times in the article (and I don't think it's that important to specifically state FEC in a "starting point" article) and 2) are you volunteering to do this? Calbaer 06:13, 23 July 2006 (UTC)
I went through much of the article and tried to make it more approachable with links to unnecessary math. Even now it may have too much math, and someone should probably reorganize it accordingly. Calbaer 22:45, 24 July 2006 (UTC)

The "Intelligence and Secrecy" section looks like someone's weird pet theory disguised as encyclopedic text. What does the Shannon-Hartley theorem have to do with keeping secrets? Even if this connection exists, it is far from clear here. kraemer 15:01, 11 June 2007 (UTC)

Information theory is everything in crypto -- entropy is its fundamental concept. The real questions are whether cryptography merits the mention (probably) and whether the subsection is written well enough to keep (probably not). CRGreathouse (t | c) 15:09, 11 June 2007 (UTC)
The text has been there since December 2005 and was likely written by User:130.94.162.64, who wrote the bulk of the first draft of this article. I've criticized him (or her) for a mistake or two, but the truth is that, without that user, this article would be far less rich. Unfortunately, there's been no sign of that IP since June 21, 2006, but still, I'd say that the text should be modified to something a bit better rather than simply removed, especially since it has, in some sense, "stood the test of time." But I agree it should be improved. It's important to express the fundamental idea of information theory in cryptography: That methods with keys shorter than their plaintext are theoretically breakable, and that modern crypto pretty much counts on the hypothesis — neither proven nor disproven — that such methods require enough steps to break as to make them secure for any reasonable amount of time (e.g., the timespan of the universe). Calbaer 16:40, 11 June 2007 (UTC)
The text in question is the material surrounding the assertion, "It is extremely hard to contain the flow of information that has high redundancy." This has nothing to do with the Shannon-Hartley Theorem, which is about the relationship between channel noise and channel capacity. The difficulty in keeping a secret that is known to multiple people is not a function of the signal-to-noise ratio of any informational vector in any obvious way. If this is clear to someone else then the connection should be made explicit. If these are, as I suspect, unrelated ideas, then this section should be rewritten. The relationship between cryptography and information theory is indeed important -- far too important to be represented by some vanished author's unsupported ranting. kraemer 18:48, 18 June 2007 (UTC)
To be fair, the text has degraded, so we shouldn't call it a "vanished author's unsupported ranting." Anyway, here's a replacement that I'll do if people like it:
Information theoretic concepts apply to making and breaking cryptographic systems. Such concepts were used in breaking the German Enigma machine and hastening the end of WWII in Europe. Shannon himself defined an important concept now called the unicity distance. Based on the redundancy of the plaintext, it attempts to give a minimum amount of ciphertext necessary to ensure unique decipherability.
Information theory leads us to believe it is much more difficult to keep secrets than it might first appear. A brute force attack can break systems based on public-key cryptography or on most commonly used methods of private-key cryptography, such as block ciphers. The security of such methods comes from the assumption that no known attack can break them in a practical amount of time, e.g., before the universe meets its ultimate fate. Information theoretic security refers to methods such as the one-time pad which are not vulnerable to such brute force attacks. However, as in any other cryptographic system, one must be careful to correctly apply even information-theoretically secure methods; the Venona project was able to crack the one-time pads of the Soviet Union due to their improper reuse.
Calbaer 21:00, 18 June 2007 (UTC)
This is all true, though the connection between cryptographic attack and information theory could be drawn more explicitly. Big improvement though! kraemer 21:22, 18 June 2007 (UTC)
Well, if you have any modifications, you can make them and add them. Also, the PRNG subsection should be modified; for such generators, e.g., extractors, min-entropy, not the more common and fundamental information theoretic quantity of entropy, is the correct measurement. Calbaer 16:14, 19 June 2007 (UTC)
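As a footnote to the proposed text, the unicity distance is simple to illustrate numerically. A Python sketch with textbook-style numbers for a simple substitution cipher on English; the ~1.5 bits/character estimate of English entropy is an assumption, and the helper name is my own:

```python
import math

def unicity_distance(key_entropy_bits, redundancy_bits_per_char):
    """Shannon's unicity distance: the least ciphertext length (in
    characters) expected to admit only one meaningful decryption.
    Both arguments are assumptions supplied by the caller."""
    return key_entropy_bits / redundancy_bits_per_char

# Illustrative numbers: a simple substitution cipher on English text.
key_entropy = math.log2(math.factorial(26))  # 26! keys, ~88.4 bits
redundancy = math.log2(26) - 1.5             # ~3.2 bits/char, assuming
                                             # ~1.5 bits/char English entropy
print(round(unicity_distance(key_entropy, redundancy)))  # 28 characters
```

This reproduces the classic result that roughly 28 characters of ciphertext suffice, in principle, to pin down the key of a substitution cipher.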

Game theory as a template?

Game theory was recently a featured article. Perhaps information theory should look more like that? Calbaer 03:52, 2 August 2006 (UTC)

Removed tag, added poached images, reworked header

The article has been largely dormant and those who have tagged it — not to mention have made many contributions — are no longer in the Wikipedia community with the same aliases. (One was kicked off in September; the other stopped contributing and/or changed IP address in June.) It's not a perfect article, but it's better, so I removed the tag. Any advice for how to improve it — by novice or by expert — would be appreciated, especially on this talk page. I feel some of the article is a bit too redundant, but it's better to be redundant than to omit or underplay basic facts. I added some images from other articles to visually illustrate the concepts, which will hopefully get people more interested in the subject and make it more quickly clear what it's all about. Finally, I made the disambiguation header shorter, updated it (to reflect the library/information science split), and added informatics. By the way, my apologies for misstating that I thought "K-L divergence should stay." I meant that mutual information should stay. K-L divergence may be a bit too confusing for those who don't want to explore it as a separate topic. Calbaer 21:59, 25 October 2006 (UTC)

The illustration for Information theory#Channel capacity is missing labels for X and Y, which are referred to in the text. 198.145.196.71 23:28, 13 September 2007 (UTC)
I changed to one what has x and y marked (these are elements of the spaces X and Y). Dicklyon 00:33, 14 September 2007 (UTC)
Perhaps the "Channel capacity" and "Source theory" sections could be swapped, since source theory talks about rate, which is a prerequisite for channel capacity. 198.145.196.71 23:31, 14 September 2007 (UTC)
That's fine, though "coding theory" shouldn't be after the two and hierarchically siblings to them (a change made here). It should come first and/or be under something different, preferably both. The idea for this was to explain source and channel coding, then mention their applications under "coding theory." (In fact, calling these "applications" is rather loose; maybe I'll change it around.) Calbaer 00:22, 15 September 2007 (UTC)
Coding theory is really important. It is the main application of information theory, its raison d'être, so to speak. It really shouldn't look like just another "application" among several others. Just go ahead and make your changes, and let's look at it from there. Let's make sure all the section and subsection titles are accurate, too. 198.145.196.71 15:28, 15 September 2007 (UTC)
You could make similar arguments for channel capacity and source theory. Maybe that section needs to be split into two. Think up a good heading and go for it. Dicklyon 16:11, 15 September 2007 (UTC)
Coding theory, source theory, and channel capacity have been rearranged now. A little blurb on source coding could go into the source theory section, and something about channel coding could be said in the channel capacity section, possibly with another diagram somewhat along this line:
Source --> Encoder --> Noisy Channel --> Decoder --> Receiver.
198.145.196.71 01:32, 21 September 2007 (UTC)

Modern information definitions

In 2006, Deng Yu et al. used the standard logical definition mode of "genus plus species difference" (also called a connotative definition). This definition mode is expressed by the following formula:

Defined term = nearest genus + species difference.

They changed the original negative Shannon and Wiener definitions of information into positive definitions.

Reversal of Wiener's information definition

Wiener said "information is information, not matter or energy." The reversed definition: information is matter, energy, and the indication of their attributes — the opposite of Wiener's definition.[1]


Reversal of Shannon's information definition (2006)

Reversed Shannon definition: information is an increase in certainty. (Shannon: information is a measure of one's freedom of choice when one selects a message.)

The corresponding formula is

Ir = −log Pi + 1

or

Ir′ = log((N − ni)/N) = log(nq/N) = log Pq

Shannon's information, originally in the form of negative entropy (uncertainty), is formally transformed into positive entropy (a degree of certainty). Compare the original negative form of Shannon's definition of information:

I = −log Pi = −log(ni/N) = −(log ni − log N) = log N − log ni = −log((N − nq)/N) = 1 − 1 − log Pi = 1 − (1 + log Pi) = (1 − log Pi) − 1

Deng's information definition

In 2002 and 2004, Deng Yu et al. defined information as the set of markings (indications) of things, phenomena, and their attributes.[2]


— Preceding unsigned comment added by Jingluolaodao (talkcontribs) 06:47, 21 October 2008 (UTC)

Laws of Information

Law of conservation of information

Deng's "law of information conservation" (Deng Yu et al., "Standardization of Information Definition", Medical Information, 2006, (7), in Chinese; and Deng Yu et al., Journal of Mathematical Medicine, 2000, (1)[5]). Definition 1 (information conservation and transformation law, the basic information equation): the total information flowing into a system equals the total information flowing out of the system plus the change in the system's internal information; information can be transformed from one state to another; and information can be created or lost. As a formula: NQ = NW + ΔNU

Definition 2 (law definition): the information conservation relation states that the increase of the information stored in the system equals the information entering the system minus the information leaving the system: ΔNU = NQ − NW

change in stored information = information entering the system − information leaving the system = newly created information − lost (vanished) information

That is, the change of information within the system equals the newly created information minus the information the system loses (vanishes, is eliminated):

ΔNU = Ncre − Nlos

Nonsense. — Arthur Rubin (talk) 15:25, 2 November 2008 (UTC)

Entropy

In the entropy section the article says:

If 𝕏 is the set of all messages x that X could be, and p(x) is the probability of message x, then X has

H(X) = E_X[I(x)] = −∑_{x∈𝕏} p(x) log p(x)

bits of entropy. An important property of entropy is that it is maximized when all the messages in the message space are equiprobable — i.e., most unpredictable — in which case H(X) = log n.

but if all states are equiprobable then p(x) = 1/|𝕏| for every x, so

H(X) = −∑_{x∈𝕏} (1/|𝕏|) log(1/|𝕏|) = log |𝕏|,

not log n. Thesm 08:52, 6 December 2006 (UTC)

Changed; it was a slight abuse of notation. Though for something like this, you can Wikipedia:Be bold. Worst thing that can happen is that you'd be reverted. Calbaer 17:26, 6 December 2006 (UTC)
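For anyone who wants to check the point numerically, here is a small Python sketch (the entropy helper is my own, not from any library) showing that the uniform distribution maximizes the entropy at log2 n bits:

```python
import math

def entropy(probs):
    """Shannon entropy H(X) = -sum p(x) log2 p(x), in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Entropy is maximized by the uniform distribution: H = log2(n).
n = 8
uniform = [1 / n] * n
skewed = [0.5, 0.2, 0.1, 0.05, 0.05, 0.04, 0.03, 0.03]  # sums to 1
assert abs(entropy(uniform) - math.log2(n)) < 1e-12   # 3 bits
assert entropy(skewed) < entropy(uniform)
```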

Boltzmann's entropy and von Neumann anecdote

No history of information theory would be complete without this famous anecdote: Claude Shannon asked John von Neumann which name he should give to this cool new concept he had discovered, H = −∑ p_i log p_i. Von Neumann replied: "Call it H." Shannon: "H? Why H?" Von Neumann: "Because that's what Boltzmann called it."

Ludwig Boltzmann introduced the concept in 1870. Compare: Boltzmann, Ludwig (1896, 1898). Vorlesungen über Gastheorie : 2 Volumes - Leipzig 1895/98 UB: O 5262-6. English version: Lectures on gas theory. Translated by Stephen G. Brush (1964) Berkeley: University of California Press; (1995) New York: Dover ISBN 0-486-68455-5

Algorithms 19:51, 7 June 2007 (UTC)

If no history would be complete without it, add it to the History of information theory article, not a section of an article focusing on the idea, not the history. You might also want to find a source for it, since I've been in information theory for ten years and have never heard the "famous anecdote." Googling, I only encounter one webpage with the story, told secondhand without attribution. (The reference you give is for "H", not your "H story," I'm assuming.) For both these reasons, I don't think it should be in this article. Calbaer 00:13, 8 June 2007 (UTC)
Some of these books support the idea that von Neumann suggested Shannon use entropy after Boltzmann. This one has it as a quote from Shannon, not quite in that form. It's already quoted in History of information theory. Dicklyon 00:29, 8 June 2007 (UTC)
Calbaer reversed and wrote: "Anecdote doesn't belong; order is by relevance, not strictly chronological." But what could possibly be more relevant than the original entropy formulation of Boltzmann, providing the very foundations of information theory? Surprisingly, Boltzmann's work is even omitted from the references. This history seems a bit one-sided. Algorithms 19:47, 9 June 2007 (UTC)
That's because Boltzmann wasn't an information theorist. You would no more expect to find him on this page than Newton on the aerospace engineering page or Euler on the cryptography page. Yes, they did the math and statistics that made those fields possible — and Boltzmann merits a mention on the history page, but to give him a good portion of the short history section overstates his role in information theory. And random anecdotes also don't belong. I wouldn't want the history to get too dry, but overstating the relations between information theory and thermodynamics — which, although many, are largely unimportant — would not be beneficial to the article, especially in trying to make it accessible to those who may already be daunted by information theory being a mix of engineering, CS, stat, and math. Does anyone else feel the Boltzmann connection merits this much attention? Calbaer 00:39, 10 June 2007 (UTC)
I'm not so sure, either, how much of the history section should be devoted to Boltzmann, but I disagree on the importance of the relations between information theory and thermodynamics. The connection between the two is fundamental and deep, so much so, in fact, that I can say that one bit of (information-theoretic) entropy is equal to Boltzmann's constant times the natural logarithm of 2. I'm not sure, though, how much of that belongs in this article, which is already quite long. (My old IP address was 130.94.162.64, by the way.) -- 198.145.196.71 18:27, 25 August 2007 (UTC)
Welcome back. I guess I meant "unimportant" in the context of the article or your average textbook or paper on information theory. I didn't meant to trivialize the connection, which is indeed deep, but is usually not important for someone looking for either a reference or tutorial on information theory. Calbaer 20:09, 25 August 2007 (UTC)
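The conversion mentioned above — one bit of information-theoretic entropy equals Boltzmann's constant times ln 2 of thermodynamic entropy — is a one-liner to verify. A Python sketch; the value of k_B used is the exact SI value fixed in 2019:

```python
import math

k_B = 1.380649e-23  # Boltzmann constant, J/K (exact in SI since 2019)

# One bit of entropy corresponds to k_B * ln(2) of thermodynamic
# entropy; at temperature T, erasing one bit costs at least
# k_B * T * ln(2) joules (the Landauer limit).
joules_per_kelvin_per_bit = k_B * math.log(2)
print(joules_per_kelvin_per_bit)  # about 9.57e-24 J/K
```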

Source theory and rate

I still don't like the expression

r = lim_{n→∞} E[H(X_n | X_{n−1}, X_{n−2}, …, X_1)]

because it isn't clear what probability distribution the expectation is taken over. There is already a limit in this expression on n, and an expectation is implicit in the conditional entropy. 198.145.196.71 22:08, 15 September 2007 (UTC)

I don't care for it much either, though I wrote it. The formula previously didn't have the limit, which made sense only if the conditional entropy was independent of n (previously called t here); but it did have the expectation, which seems to me is needed if the conditional entropy varies with the identity of the previous symbols. I think the distribution is over those previous values. Maybe we can find a source about this. Dicklyon 22:44, 15 September 2007 (UTC)
OK, you're right, according to this book, eq. 2.34. Dicklyon 22:47, 15 September 2007 (UTC)
Actually the expression (without the expectation) is correct for a strictly stationary process, as you noted in the article. For a non-stationary process, the more general expression I previously added to the article (which I stole from Entropy rate) is needed (as you also apparently noted). 198.145.196.71 23:28, 15 September 2007 (UTC)
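To make the stationary case concrete: for a stationary Markov chain the rate reduces to r = H(X_n | X_{n−1}), which can be computed directly from the transition matrix and stationary distribution. A Python sketch (markov_entropy_rate is my own helper name):

```python
import math

def markov_entropy_rate(pi, P):
    """Entropy rate of a stationary Markov chain, in bits per symbol:
    r = H(X_n | X_{n-1}) = -sum_i pi_i sum_j P_ij log2 P_ij,
    where pi is the stationary distribution and P the transition matrix."""
    return -sum(pi[i] * sum(p * math.log2(p) for p in row if p > 0)
                for i, row in enumerate(P))

# Two-state chain; the stationary distribution of this symmetric P
# is (1/2, 1/2).
P = [[0.9, 0.1],
     [0.1, 0.9]]
pi = [0.5, 0.5]
rate = markov_entropy_rate(pi, P)
# rate equals the binary entropy H(0.1), about 0.47 bits/symbol --
# well below the 1 bit/symbol of an i.i.d. fair-coin source.
```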

Help request

How about some help over on the article Information theory and measure theory? I wrote most of that article over a year ago, and it's still a complete mess. 198.145.196.71 20:32, 25 August 2007 (UTC)

See these books for some good ideas. The online ref mentioned in that article doesn't even mention measure theory, so no wonder it's a mess. Dicklyon 05:51, 28 August 2007 (UTC)
I meant to be bold and help edit the article, and maybe take the expert tag off if that scares people from editing. I don't mean to claim ownership of it. More at Talk:Information theory and measure theory. 198.145.196.71 17:06, 7 September 2007 (UTC)