Talk:Information theory/Archive 2
This is an archive of past discussions about Information theory. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Pending tasks (what do we want?)
Here is how I interpret the pending tasks:
- Make sure intro and overview are understandable with daily life examples. Making sure the reader can subjectively tell what a "bit" is, for example, is a good start. Discussion of elephant trumpets, whale songs, and black holes does not contribute in this manner, although their mere mention (with links, say, to the no hair theorem) may be useful to the curious.
- Move applications up before mathematics (just after Overview) and make it independent. I'm not sure this is for the best, as Overview states some applications and, for the rest, it's useful to know some theory. Also, we need to be careful not to conflate security and information theory, since information knowledge is necessary for, yet not sufficient for, breaking ciphers. (For example, a Diffie-Hellman key exchange assumes that an adversary will have full knowledge, but the form of that knowledge assures that the adversary will not be able to use it within a reasonable amount of time. Thus, although secure in practice, Diffie-Hellman key exchange is information-theoretically insecure).
- A separate section on the theory already exists (although it could use a mention of a few more concepts, e.g., the asymptotic equipartition property).
- Move history down to end (just before References). Again, I'm not sure why this is best, and some "good" articles, like Thermodynamics, list history first. If moved, we need to check that this is just as readable, but the reader doesn't need to know scientific history to understand scientific phenomena. Clearly more post-1948 history is needed. Most applications of information theory, even elementary ones, are post-1948 (Huffman codes, arithmetic codes, LDPC codes, Turbo codes), and much of the theory is too, e.g., information divergence, information inequalities, Fisher sufficient statistics, Kolmogorov complexity, network information theory, etc. However, it might be best to move the history to its own article. The current section is far too in-depth.
There's still a lot to do and a lot to add, and we might reasonably ask if certain topics should be shortened or omitted, e.g., Kullback–Leibler divergence (which, although important, can generally be omitted from an elementary explanation), differential entropy (which is most important for transinformation, which can be interpreted as a limit of discrete transinformation), gambling (which most information theorists I know love, but which is a somewhat fringe topic). Again, the thermodynamics article is shorter and far less mathematical. Do we want this article to be long with full mathematical explanations or medium-sized with general explanations and links to the math? I don't know the right answer, but it might be nice to get some ideas of what people would like out of this.
A few final things to keep in mind:
- Diversions or examples without explanation confuse rather than clarify.
- Sentences should be as concise as possible, but no more so, for easy reading.
- Proofread before (or at least immediately after) changes.
- Others might not read the same way as you.
Due to these factors, it's good to discuss significant changes on the talk page. I find my "peer edited" sections are a lot better than my first attempts. And too much unilateral change can result in an unstable, unreadable, and flat-out wrong article. Calbaer 01:44, 15 June 2006 (UTC)
- I say keep the K–L divergence. Not only is it important, but it is really useful in understanding the mutual information. The treatment of other stuff like gambling, intelligence, and measure theory could be cut. The measure theory section shows a useful mnemonic device, but the analogy is incomplete due to the failure of the transinformation to remain non-negative in the case of three or more random variables. It could probably be cut out entirely. History of information theory could indeed have its own article, but keep a short summary of history in this article here. Also some mention of Gibbs' inequality would be nice as it shows the K–L divergence (and thus the bivariate mutual information) to be non-negative. The AEP could be mentioned in the source theory section, since it is a property of certain sources. Don't hesitate to add what you think should be added. --130.94.162.64 17:17, 20 June 2006 (UTC)
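As a side note to the Gibbs' inequality point above, here is a small numerical sketch (my own illustration, with made-up distributions p and q, not text from the article) of the fact that the K–L divergence is non-negative and vanishes only when the two distributions coincide:

# Illustration: Gibbs' inequality implies D(p||q) >= 0, with equality exactly
# when p == q. The distributions below are arbitrary examples.
import math

def kl_divergence_bits(p, q):
    # Kullback-Leibler divergence D(p||q), measured in bits.
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
print(kl_divergence_bits(p, q))  # about 0.037 bits, strictly positive
print(kl_divergence_bits(p, p))  # 0.0, since the distributions are identical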
- K-L divergence is certainly fundamental, so it should stay. Measure theory is also fundamental, but it is not necessary for a good understanding of information theory, so it could go. History could be summarized and moved. I'll get to this when I can, though if someone else wants to do so first, go ahead. Calbaer 18:36, 20 June 2006 (UTC)
- the section that mentions measure theory should be kept, or moved somewhere else, rather than deleted. also, the following sentence in that section is misleading/incorrect: "it justifies, in a certain formal sense, the practice of calling Shannon's entropy a "measure" of information." entropy is not a measure; given a random variable f, one integrates f ln f against a suitable measure (the Lebesgue measure, the counting measure, etc.) to get the entropy. Mct mht 18:52, 20 June 2006 (UTC)
- I think that the section confuses more than it helps; most readers will not have a working knowledge of measure theory, and, as you say, the section needs at least a little reworking. I'll delete it and add a link to Information theory and measure theory. Calbaer 19:47, 20 June 2006 (UTC)
Seems to me a big priority should be to break out all the details into other articles. Information Theory is very much a science. If you look at the "Mathematics" Wikipedia article, you'll see that there are few actual mathematics concepts discussed. It's more a history and categorization of the sub-sciences of mathematics, with links. I picture "Information Theory" as a fairly small article (only because the science is young) giving only a bit more than a dictionary definition of "information theory", and then a ton of links with brief descriptions to categories, related theories, practical applications, algorithms, coding methods, etc. I think this article should be more of a starting point. Details should be elsewhere. Also, until this can be done, at least a set of links might be nice. I guess I'm saying here I'm surprised that a search of the "Information Theory" article in Wikipedia doesn't find the phrase "forward error correction"! qz27, 22 June 2006
- Good point, and one that argues for the elimination of the following as all but links: Self-information, Kullback–Leibler divergence, differential entropy. We shouldn't be scared to have a little math in the article, but regular, joint, and conditional entropy can be defined together and mutual entropy in terms of them. That's enough for the idealizations of source and channel coding. If people want to know why the math is as it is, there could be a definitions in information theory article or something like that. Two things though: 1) "error correction" is mentioned three times in the article (and I don't think it's that important to specifically state FEC in a "starting point" article) and 2) are you volunteering to do this? Calbaer 06:13, 23 July 2006 (UTC)
- I went through much of the article and tried to make it more approachable with links to unnecessary math. Even now it may have too much math, and someone should probably reorganize it accordingly. Calbaer 22:45, 24 July 2006 (UTC)
The "Intelligence and Secrecy" section looks like someone's weird pet theory disguised as encyclopedic text. What does the Shannon-Hartley theorem have to do with keeping secrets? Even if this connection exists, it is far from clear here. kraemer 15:01, 11 June 2007 (UTC)
- Information theory is everything in crypto -- entropy is its fundamental concept. The real questions are whether cryptography merits the mention (probably) and whether the subsection is written well enough to keep (probably not). CRGreathouse (t | c) 15:09, 11 June 2007 (UTC)
- The text has been there since December 2005 and was likely written by User:130.94.162.64, who wrote the bulk of the first draft of this article. I've criticized him (or her) for a mistake or two, but the truth is that, without that user, this article would be far less rich. Unfortunately, there's been no sign of that IP since June 21, 2006, but still, I'd say that the text should be modified to something a bit better rather than simply removed, especially since it has, in some sense, "stood the test of time." But I agree it should be improved. It's important to express the fundamental idea of information theory in cryptography: That methods with keys shorter than their plaintext are theoretically breakable, and that modern crypto pretty much counts on the hypothesis — neither proven nor disproven — that such methods require enough steps to break as to make them secure for any reasonable amount of time (e.g., the timespan of the universe). Calbaer 16:40, 11 June 2007 (UTC)
- The text in question is the material surrounding the assertion, "It is extremely hard to contain the flow of information that has high redundancy." This has nothing to do with the Shannon-Hartley Theorem, which is about the relationship between channel noise and channel capacity. The difficulty in keeping a secret that is known to multiple people is not a function of the signal-to-noise ratio of any informational vector in any obvious way. If this is clear to someone else then the connection should be made explicit. If these are, as I suspect, unrelated ideas, then this section should be rewritten. The relationship between cryptography and information theory is indeed important -- far too important to be represented by some vanished author's unsupported ranting. kraemer 18:48, 18 June 2007 (UTC)
- To be fair, the text has degraded, so we shouldn't call it a "vanished author's unsupported ranting." Anyway, here's a replacement that I'll do if people like it:
- Information theoretic concepts apply to making and breaking cryptographic systems. Such concepts were used in breaking the German Enigma machine and hastening the end of WWII in Europe. Shannon himself defined an important concept now called the unicity distance. Based on the redundancy of the plaintext, it attempts to give the minimum amount of ciphertext necessary to ensure unique decipherability.
- Information theory leads us to believe it is much more difficult to keep secrets than it might first appear. A brute force attack can break systems based on public-key cryptography or on most commonly used methods of private-key cryptography, such as block ciphers. The security of such methods comes from the assumption that no known attack can break them in a practical amount of time, e.g., before the universe meets its ultimate fate. Information theoretic security refers to methods such as the one-time pad which are not vulnerable to such brute force attacks. However, as in any other cryptographic system, one must be careful to correctly apply even information-theoretically secure methods; the Venona project was able to crack the one-time pads of the Soviet Union due to their improper reuse.
- This is all true, though the connection between cryptographic attack and information theory could be drawn more explicitly. Big improvement though! kraemer 21:22, 18 June 2007 (UTC)
- Well, if you have any modifications, you can make them and add them. Also, the PRNG subsection should be modified; for such generators, e.g., extractors, min-entropy, not the more common and fundamental information theoretic quantity of entropy, is the correct measurement. Calbaer 16:14, 19 June 2007 (UTC)
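Since the replacement text above mentions Shannon's unicity distance, here is a rough sketch of how it is usually computed, using the textbook formula U = H(K)/D with assumed figures for a monoalphabetic substitution cipher and for the redundancy of English (common estimates, not values taken from the article):

# Unicity distance sketch: U = H(K) / D, where H(K) is the key entropy and
# D is the per-letter redundancy of the plaintext language. The numbers are
# standard textbook estimates, used here only for illustration.
import math

key_entropy = math.log2(math.factorial(26))   # substitution-cipher keyspace: ~88.4 bits
redundancy = math.log2(26) - 1.5              # ~4.7 bits/letter alphabet capacity
                                              # minus ~1.5 bits/letter of actual English entropy
unicity_distance = key_entropy / redundancy
print(round(unicity_distance))                # roughly 28 ciphertext letters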
Game theory as a template?
Game theory was recently a featured article. Perhaps information theory should look more like that? Calbaer 03:52, 2 August 2006 (UTC)
Removed tag, added poached images, reworked header
The article has been largely dormant and those who have tagged it — not to mention made many contributions — are no longer in the Wikipedia community under the same aliases. (One was kicked off in September; the other stopped contributing and/or changed IP address in June.) It's not a perfect article, but it's better, so I removed the tag. Any advice for how to improve it — by novice or by expert — would be appreciated, especially on this talk page. I feel some of the article is a bit too redundant, but it's better to be redundant than to omit or underplay basic facts. I added some images from other articles to visually illustrate the concepts, which will hopefully get people more interested in the subject and more quickly make clear what it's all about. Finally, I made the disambiguation header shorter, updated it (to reflect the library/information science split), and added informatics. By the way, my apologies for misstating that I thought "K-L divergence should stay." I meant that mutual information should stay. K-L divergence may be a bit too confusing for those who don't want to explore it as a separate topic. Calbaer 21:59, 25 October 2006 (UTC)
- The illustration for Information theory#Channel capacity is missing labels for X and Y, which are referred to in the text. 198.145.196.71 23:28, 13 September 2007 (UTC)
- I changed to one what has x and y marked (these are elements of the spaces X and Y). Dicklyon 00:33, 14 September 2007 (UTC)
- Perhaps the "Channel capacity" and "Source theory" sections could be swapped, since source theory talks about rate, which is a prerequisite for channel capacity. 198.145.196.71 23:31, 14 September 2007 (UTC)
- That's fine, though "coding theory" shouldn't come after the two as a hierarchical sibling of them (a change made here). It should come first and/or be under something different, preferably both. The idea for this was to explain source and channel coding, then mention their applications under "coding theory." (In fact, calling these "applications" is rather loose; maybe I'll change it around.) Calbaer 00:22, 15 September 2007 (UTC)
- Coding theory is really important. It is the main application of information theory, its raison d'être, so to speak. It really shouldn't look like just another "application" among several others. Just go ahead and make your changes, and let's look at it from there. Let's make sure all the section and subsection titles are accurate, too. 198.145.196.71 15:28, 15 September 2007 (UTC)
- You could make similar arguments for channel capacity and source theory. Maybe that section needs to be split into two. Think up a good heading and go for it. Dicklyon 16:11, 15 September 2007 (UTC)
- Coding theory, source theory, and channel capacity have been rearranged now. A little blurb on source coding could go into the source theory section, and something about channel coding could be said in the channel capacity section, possibly with another diagram somewhat along this line:
- Source --> Encoder --> Noisy Channel --> Decoder --> Receiver.
- 198.145.196.71 01:32, 21 September 2007 (UTC)
Modern information definitions
In 2006, Deng Yu et al. used the standard logical definition mode of "genus plus specific difference" (also called a connotative definition). This definition mode is expressed by the following formula:
Defined term = nearest genus + specific (inter-species) difference.
They changed the original negative Shannon and Wiener definitions of information into positive definitions.
Advance on Wiener's information definition: its contrary
Information is information; information is matter, energy, and attribute indication (the opposite of Wiener's definition of information).[1]
Affirmative reversal of the Shannon information definition, 2006
Reversed Shannon information definition: information is an increase in certainty. (Compare Shannon: "information is a measure of one's freedom of choice when one selects a message.")
Corresponding formula
Ir = -log Pi + 1
or
Ir' = log((N - ni)/N) = log(nq/N) = log Pq
Shannon information, in the form of negative entropy (uncertainty), is formally transformed into positive entropy (a degree of certainty). See the original negative form of the Shannon definition of information:
I = -log Pi = -log(ni/N) = -(log ni - log N) = log N - log ni = -log((N - nq)/N) = 1 - 1 - log Pi = 1 - (1 + log Pi) = (1 - log Pi) - 1
Deng's information definition
In 2002 and 2004, Deng Yu et al.: information is the set of markings (indications) of things, phenomena, and their attributes.[2]
- Deng Yu et al., Journal of Mathematical Medicine, 2004 (5). In Chinese.
- Journal of Mathematical Medicine, 2004 (6). In Chinese.
- Standardization of Information Definition (J), Medical Information, 2006 (7).
—Preceding unsigned comment added by Jingluolaodao (talk • contribs) 06:47, 21 October 2008 (UTC)
References
Laws of Information
Law of conservation of information
Deng's "law of information conservation" (Deng Yu et al., Standardization of Information Definition (J), Medical Information, 2006 (7), in Chinese; and Deng Yu et al., Journal of Mathematical Medicine, 2000 (1)). Information conservation and transformation law (basic information equation), definition 1: the total information flowing into a system equals the total information flowing out of the system plus the change in the information stored inside the system; information can be transformed from one state into another; information can be created and can be lost. Expressed as a formula: NQ = NW + ΔNU.
Definition 2 (law definition): the information conservation relation states that the increase in the information stored in the system equals the information entering the system minus the information leaving the system: ΔNU = NQ - NW.
Change in the system's stored information = information entering the system - information leaving the system = newly created information - lost ("vanished") information.
Equivalently, the change of information in the system equals the information newly created in the system minus the information the system loses (through leaving, vanishing, or elimination). Change in the system's stored information = newly created information - lost information:
ΔNU = Ncre - Nlos
In other words, the overall flow of information into the system must equal the total outflow from the system plus the change in the information inside the system; information can be converted from one state into another; and information can be created or lost. The formula used is NQ = NW + ΔNU.
- Nonsense. — Arthur Rubin (talk) 15:25, 2 November 2008 (UTC)
Entropy
In the entropy section the article says:
If X is the set of all messages x that X could be, and p(x) = Pr(X = x), then X has
H(X) = -Σ p(x) log p(x), summed over x in X,
bits of entropy. An important property of entropy is that it is maximized when all the messages in the message space are equiprobable — i.e., most unpredictable — in which case [formula as given in the article],
but if all states are equiprobable then p(x) = 1/|X|, so
H(X) = -Σ (1/|X|) log(1/|X|) = log |X|,
not [the formula given]. Thesm 08:52, 6 December 2006 (UTC)
- Changed; it was a slight abuse of notation. Though for something like this, you can Wikipedia:Be bold. Worst thing that can happen is that you'd be reverted. Calbaer 17:26, 6 December 2006 (UTC)
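For readers following the exchange above, here is a small numerical check of the property under discussion (my own sketch, not text from the article): the entropy of a distribution over N messages never exceeds log2(N), and reaches it exactly when the messages are equiprobable.

# Sketch: H(X) <= log2(N), with equality when all N messages are equiprobable.
import math

def entropy_bits(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

N = 4
uniform = [1.0 / N] * N
skewed = [0.7, 0.1, 0.1, 0.1]
print(entropy_bits(uniform), math.log2(N))  # 2.0 and 2.0: the maximum is reached
print(entropy_bits(skewed))                 # about 1.36, below the maximum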
Boltzmann's entropy and von Neumann anecdote
No history of information theory would be complete without this famous anecdote: Claude Shannon asked John von Neumann which name he should give to the cool new quantity he had discovered. Von Neumann replied: "Call it H." Shannon: "H? Why H?" Von Neumann: "Because that's what Boltzmann called it."
Ludwig Boltzmann introduced the concept in 1870. Compare: Boltzmann, Ludwig (1896, 1898). Vorlesungen über Gastheorie, 2 volumes, Leipzig, 1895/98. English version: Lectures on Gas Theory, translated by Stephen G. Brush (1964), Berkeley: University of California Press; (1995) New York: Dover, ISBN 0-486-68455-5.
Algorithms 19:51, 7 June 2007 (UTC)
- If no history would be complete without it, add it to the History of information theory article, not a section of an article focusing on the idea, not the history. You might also want to find a source for it, since I've been in information theory for ten years and have never heard the "famous anecdote." Googling, I only encounter one webpage with the story, told secondhand without attribution. (The reference you give is for "H", not your "H story," I'm assuming.) For both these reasons, I don't think it should be in this article. Calbaer 00:13, 8 June 2007 (UTC)
- Some of these books support the idea that von Neumann suggested Shannon use entropy after Boltzmann. This one has it as a quote from Shannon, not quite in that form. It's already quoted in History of information theory. Dicklyon 00:29, 8 June 2007 (UTC)
- Calbaer reversed and wrote: "Anecdote doesn't belong; order is by relevance, not strictly chronological." But what could possibly be more relevant than the original entropy formulation of Boltzmann, providing the very foundations of information theory? Surprisingly, Boltzmann's work is even omitted from the references. This history seems a bit one-sided. Algorithms 19:47, 9 June 2007 (UTC)
- That's because Boltzmann wasn't an information theorist. You would no more expect to find him on this page than Newton on the aerospace engineering page or Euler on the cryptography page. Yes, they did the math and statistics that made those fields possible — and Boltzmann merits a mention on the history page — but to give him a good portion of the short history section overstates his role in information theory. And random anecdotes also don't belong. I wouldn't want the history to get too dry, but overstating the relations between information theory and thermodynamics — which, although many, are largely unimportant — would not be beneficial to the article, especially in trying to make it accessible to those who may already be daunted by information theory being a mix of engineering, CS, stat, and math. Does anyone else feel the Boltzmann connection merits this much attention? Calbaer 00:39, 10 June 2007 (UTC)
- I'm not so sure, either, how much of the history section should be devoted to Boltzmann, but I disagree on the importance of the relations between information theory and thermodynamics. The connection between the two is fundamental and deep, so much so, in fact, that I can say that one bit of (information-theoretic) entropy is equal to Boltzmann's constant times the natural logarithm of 2. I'm not sure, though, how much of that belongs in this article, which is already quite long. (My old IP address was 130.94.162.64, by the way.) -- 198.145.196.71 18:27, 25 August 2007 (UTC)
- Welcome back. I guess I meant "unimportant" in the context of the article or your average textbook or paper on information theory. I didn't meant to trivialize the connection, which is indeed deep, but is usually not important for someone looking for either a reference or tutorial on information theory. Calbaer 20:09, 25 August 2007 (UTC)
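To spell out the conversion mentioned a couple of comments up (a quick sketch of my own; k_B is the standard value of Boltzmann's constant, not a number from the article):

# One bit of information-theoretic entropy in thermodynamic units: S = k_B * ln(2).
import math

k_B = 1.380649e-23          # Boltzmann constant in J/K
print(k_B * math.log(2))    # about 9.57e-24 J/K per bit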
Source theory and rate
I still don't like the expression for the entropy rate currently given in the article (a limit over n of an expected conditional entropy), because it isn't clear what probability distribution the expectation is taken over. There is already a limit in this expression on n, and an expectation is implicit in the conditional entropy. 198.145.196.71 22:08, 15 September 2007 (UTC)
- I don't care for it much either, though I wrote it. The formula previously didn't have the limit, which made sense only if the conditional entropy was independent of n (previously called t here); but it did have the expectation, which seems to me is needed if the conditional entropy varies with the identity of the previous symbols. I think the distribution is over those previous values. Maybe we can find a source about this. Dicklyon 22:44, 15 September 2007 (UTC)
- OK, you're right, according to this book, eq. 2.34. Dicklyon 22:47, 15 September 2007 (UTC)
- Actually the expression (without the expectation) is correct for a strictly stationary process, as you noted in the article. For a non-stationary process, the more general expression I previously added to the article (which I stole from Entropy rate) is needed (as you also noted, apparently). 198.145.196.71 23:28, 15 September 2007 (UTC)
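As a concrete companion to the stationary case discussed above, here is a sketch (with an invented two-state Markov source, not an example from the article or the cited book) of how the entropy rate of a stationary Markov chain reduces to a weighted sum of per-state conditional entropies, with no extra limit or outer expectation needed:

# Entropy rate of a stationary Markov source: r = sum_i pi_i * H(next symbol | state i),
# where pi is the stationary distribution. The transition matrix is made up.
import numpy as np

P = np.array([[0.9, 0.1],
              [0.4, 0.6]])      # row-stochastic transition matrix

# Stationary distribution: left eigenvector of P with eigenvalue 1, normalized.
evals, evecs = np.linalg.eig(P.T)
pi = np.real(evecs[:, np.isclose(evals, 1.0)][:, 0])
pi = pi / pi.sum()

def row_entropy_bits(row):
    return -sum(p * np.log2(p) for p in row if p > 0)

rate = sum(pi[i] * row_entropy_bits(P[i]) for i in range(len(pi)))
print(rate)                     # about 0.57 bits per symbol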
Help request
How about some help over on the article Information theory and measure theory? I wrote most of that article over a year ago, and it's still complete mess. 198.145.196.71 20:32, 25 August 2007 (UTC)
- See these books for some good ideas. The online ref mentioned in that article doesn't even mention measure theory, so no wonder it's a mess. Dicklyon 05:51, 28 August 2007 (UTC)
- I meant to be bold and help edit the article, and maybe take the expert tag off if that scares people from editing. I don't mean to claim ownership of it. More at Talk:Information theory and measure theory. 198.145.196.71 17:06, 7 September 2007 (UTC)
Variety
I had never heard of Ashby's concept of variety (cybernetics) until it was added to, and removed from, the see-also list in this article. So I looked in Google Books, and found all kinds of stuff about it. Some books on information theory, such as Kullback's Information Theory and Statistics, refer to Ashby's 1956 book Introduction to Cybernetics as part of the literature on information theory, and a big chunk of this book is about "variety". And his "requisite variety" paper is referenced in Uncertainty-Based Information: Elements of Generalized Information Theory. I don't see any reason to exclude this concept from the see-also list here on the basis of it being not information theory. Someone is being too picky, methinks. Dicklyon 03:41, 17 September 2007 (UTC)
- I think it's helpful and just fine to mention something like this in the see-also list. 198.145.196.71 21:18, 22 September 2007 (UTC)
My main problem is that the article, as it stands, doesn't seem to have much to do with information theory. While variety itself may be relevant, the article reads more like one on social sciences. ⇌Elektron 12:27, 24 September 2007 (UTC)
- These guys are doing a lot of "cybernetics" articles, which are topics that sort of bridge information theory's very mathematical style to the style of the social sciences. It's not what you're used to, but shouldn't be excluded from a see-also list on that basis. It may be more like what some readers are looking for. Dicklyon 14:34, 24 September 2007 (UTC)
References
The "References" section is starting to get disorganized. A footnote was added in conflict with the style of the other references, and something about Boltzmann that is not very relevant was added to "The Classic Work" subsection, which is supposed to contain just that: Shannon's classic work, "A Mathematical Theory of Communication." Perhaps there is a better way to organize the references, but it should be discussed on this talk page first, and in any case, the references should not be left in such disarray. This being the main information theory article, it contains a lot of material that is general knowledge in the field. Therefore several textbooks are listed, any of which will do for a reference for some of the "common knowledge" material (which should perhaps be more explicitly referenced in the body of the article). I would like some input and help here to better organize the references in this article. 198.145.196.71 20:45, 22 September 2007 (UTC)
Dicklyon just tagged the article as needing improved references, which I agree with. The footnote that was added is probably a good start. Should all the references be footnotes? I have several of the textbooks that are mentioned in the article, so I could go through and reference some of the material, but I am still somewhat unsure about some of the details, such as how best to deal with multiple references to the same source (in the footnote style). Perhaps I can add some references in the article, and then they can be changed to a more consistent style. 198.145.196.71 19:25, 28 September 2007 (UTC)
- You can easily add multiple ref tags after each other; it's pointless to list more than 2 or 3 in support of a point, though, and usually 1 will do. Thanks for working on this. I'll try to help. Dicklyon 19:29, 28 September 2007 (UTC)
- What I meant was referring to the same source at multiple points in the article without being redundant, as in "ibid." You can help me, too, by pointing out what specific statements in the article need sources to back them up. A lot of good sources are already listed at the end of the article; they just need to be referenced in the body. Some of them might be irrelevant because the material that referenced them has been moved to other articles, in which case we should make sure those articles have the necessary sources before deleting them entirely here. Very little content has actually been deleted from this article without moving it somewhere else. 198.145.196.71 22:11, 28 September 2007 (UTC)
- Instead of ibid, use a named ref, like <ref name=foo>{{cite book | ...}}</ref> the first time, and just <ref name=foo/> for subsequent uses. Dicklyon 23:20, 28 September 2007 (UTC)
Page is too long
I suggest removing all technical and specialized material and instead placing it on the appropriate separate pages. I do not see why coding theory needs more than a paragraph or two on this page. Coding theory is a rich field in its own right...it is connected to information theory but I don't think we need to have a full exposition of it duplicated on this page. The same goes for other sections. I think this page should remain general-audience. As such, I think more discussion of information theory's applications and usefulness would be a good addition to the page. Cazort (talk) 20:37, 26 November 2007 (UTC)
- I am the one originally responsible for much of this article's length that you consider excessive. It is admittedly a long article, but on the other hand I like having a collection of the basic formulas on this page. This page has by no means a full exposition of coding theory, only its very basic rudiments. I do agree that more discussion of applications and usefulness would be helpful, although that might make the article even longer. I would rather have an article that is a little bit too long than one that is too short and scanty on details. 198.145.196.71 (talk) 02:38, 20 December 2007 (UTC)
- I think Cazort's objection is apposite. This is a field with multitudes of connections elsewhere, and any article which addresses these even in passing will not be very small. In addition, the suggestion that all technical content be removed to other articles is ill-advised. This is a tricky and technical field. The proper remedy for unclarity caused by technical material is better writing or a better introduction to technical sections.
- Our Gentle Readers should not be written down to by hiding the complexity of complex subjects. An article on a technical subject can never be made sufficiently innocuous that every Reader will not be put off. We should, instead, write very clearly, distinguishing between necessary technical material and supplemental technical material which could indeed be more fully explained in some other article(s). And banishing technical content has the very considerable disadvantage of requiring our Gentle Readers to chase links and assemble an adequate picture of a subject from them. This is a skill held by few, and by very few of our presumed Gentle Readers.
- There are numerous reasons why Cazort's suggestion should not be followed. Concur, mostly, with 198.145. ww (talk) 21:41, 2 June 2008 (UTC)
Minor suggestion
The mathematical definitions in "Ways of measuring information" are introduced by useful, brief, intuitive definitions. Perhaps these should be italicized to draw attention to them for those who might be scared off by the mathematics. Alternatively, they could be indented in their own paragraphs. On a related note, I don't like the title of this section; it implies that these are different ways to measure the same thing, rather than different methods for measuring different things. It would be as though a description of wattage, amperage, voltage with respect to ground, and potential difference were titled "ways of measuring electricity." Calbaer (talk) 00:33, 19 July 2008 (UTC)
- I've changed my mind on this issue. As far as I am concerned, please feel free to cut almost all the math out of "Ways of measuring information" and leave nothing but practical, intuitive, "English-language" definitions. Maybe leave a formula or two, but nothing much more complicated than the basic entropy formula. Most of the mathematical details from this article have already been copied over to the article Quantities of information. (It's nice to have those formulas listed in one place, but it doesn't have to be this article.) The math is/can be explained in even more detail in self-information, information entropy, conditional entropy, mutual information, and so forth. According to stats.grok.se, this article gets several hundred to a thousand hits every day. Having all the math here on the front page is just going to scare most of them away, and the article has been tagged as too technical for several years now. Deepmath (talk) 04:11, 19 July 2008 (UTC)
- I've thought along these lines for some time, and even made a few edits to the intro, and some comments here, a while ago. The underlying ideas of information theory are important in many fields, and intrinsically. There is no reason, save the tradition in mathematical circles, to entangle those ideas with the machinery of their manipulation and proofs of their exact form. Shannon and Weaver managed to do so in their book 6 decades ago. We here should not allow ourselves to settle for something less now.
- I'm with Deepmath here in favor of including plain English accounts in this article, which is one of first resort for those inquiring into the field. Certainly in the introduction. I'm less enthusiastic about removing all the technical material to another article (here Quantities of information).
- There is a small but respected tradition of such high-quality writing on mathematical topics. Consider that Bertrand Russell, world class in several highly technical and state-of-the-art fields, won the Nobel Prize for --- Literature. Not research into the foundations of mathematical logic, nor philosophy... Even we technical sorts should hie thee and do likewise. The tradition of obscurity and passive-voice fog in science should be avoided here in the Wikipedia. ww (talk) 21:48, 19 July 2008 (UTC)
log probability
Why does log probability redirect here? —Preceding unsigned comment added by 199.46.198.230 (talk) 14:12, 18 March 2009 (UTC)
- because -ln(p) = I (I representing amount of information) Kevin Baastalk 16:10, 13 January 2011 (UTC)
thermodynamic entropy
I'd like to add a section on the relationship of Shannon's information to thermodynamic entropy in the section on applications to other fields. It would draw heavily from Ben-Naim's book A Farewell to Entropy: Statistical Thermodynamics Based on Information, but would also mention the impact of information theory on the resolution of Maxwell's Demon. It would probably just be 1-2 paragraphs, but I might develop it into a separate article. Thoughts? Maniacmagee (talk) 22:01, 21 August 2009 (UTC)
technical note about log
I am following through the math, and there is a seeming discrepancy that should be corrected (if it is an error) or explained. The entropy of the random variable X involves the log function. For the general definition, just "log" is written, which usually means natural log in most fields. Later, log2 is written. I presume log2 is meant in both cases and that two notational conventions are represented here. I think the article should be fixed to use "log2" consistently. I don't want to change it before getting a response because everything I know about this entropy function I know from this article. Thoughts? Tbonepower07 (talk) 04:21, 16 February 2010 (UTC)
- It is said in "Quantities of information" that "The choice of logarithmic base in the following formulae determines the unit of information entropy that is used." In computer science, log2 is the most natural logarithm. Similarly, in the case of the binary entropy function, you are most commonly interested in log2 since the function is the entropy of a value that can take only two values. However, the article could and probably should use the natural log all over. Nageh (talk) 14:40, 23 March 2010 (UTC)
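To illustrate the base-of-the-logarithm point in this thread (my own snippet, with an arbitrary example distribution): changing the base only rescales the result, from bits with log2 to nats with the natural log, by a factor of ln(2).

# The logarithm base only sets the unit: log2 gives bits, ln gives nats,
# and the two differ by a factor of ln(2).
import math

p = [0.5, 0.25, 0.25]
H_bits = -sum(x * math.log2(x) for x in p)
H_nats = -sum(x * math.log(x) for x in p)
print(H_bits)                  # 1.5 bits
print(H_nats)                  # about 1.04 nats
print(H_bits * math.log(2))    # same value as H_nats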
Overview
Having practically no experience as a contributor I hesitate to rewrite a section. On the other hand, having had a course in "Information Theory" at MIT in the 1950's, I believe that the Overview section has some serious errors. Should I undertake to rewrite large portions of that section?
"The main concepts of information theory can be grasped by considering the most widespread means of human communication: language." True enough, but the rest of the paragraph describes the main concepts of coding theory, not of information theory. The main concept of information theory is that a platitude such as "Thank you; come again" conveys less information the urgent plea, "Call an ambulance!" not because it is less important, but because it is less unexpected. In context, however, either of these messages might convey very little information, the former because it is not unexpected at the end of a transaction, the latter because if someone is obviously injured, "it goes without saying" that one should call an ambulance. The next main concept of information theory is that the speech channel has a capacity limit. If you talk faster, you convey information at a higher rate, unless you talk so fast that your hearer can't understand you. Symbols are being transmitted at a faster rate, but they are being received with errors; the channel is noisy.
Again "The central paradigm of classical information theory" is not "the engineering problem of the transmission of information over a noisy channel." That is the central paradigm of coding theory. The central paradigm of classical information theory is the quantification of information and of the capacity of an information carrying channel (which may be noisy or noiseless).
It might also be appropriate in the Overview section to introduce some simple quantitative concepts, for example, to define bits in terms of binary digits, to point out that capacity is defined for noiseless as well as noisy channels, or even to mention, by way of illustration, that when a fair and balanced coin is flipped, a message revealing which side came up conveys exactly one bit of information. Marty39 (talk) 21:06, 30 September 2011 (UTC)
- Welcome! It is quite some time since I looked at this article, but I believe my thoughts were much the same as what you have noted. Please edit as you see fit—don't worry about mistakes and formatting issues as they can be fixed. I see that you know citations are needed; if you like, just put the info after the text in brackets and someone will tweak it. Johnuniq (talk) 00:10, 1 October 2011 (UTC)
- As already mentioned, you're encouraged to contribute. Please also feel free to make contributions to the coding theory article, which presently needs to be rewritten completely. Isheden (talk) 11:17, 1 October 2011 (UTC)
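To make the "unexpectedness" and fair-coin points from this thread concrete, here is a small sketch (the message probabilities are invented purely for illustration):

# Self-information -log2(p): the less expected a message, the more bits it carries;
# a fair coin flip carries exactly one bit. The probabilities below are made up.
import math

def self_information_bits(p):
    return -math.log2(p)

print(self_information_bits(0.5))    # fair coin outcome: exactly 1 bit
print(self_information_bits(0.9))    # an expected platitude: about 0.15 bits
print(self_information_bits(0.001))  # a very unexpected plea: about 10 bits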
The latest Nobel prize in economics went to economists who used information theory to build a model of how people respond to government policies. Perhaps this means economics should be listed in the opening section as a field in which information theory has been applied. — Preceding unsigned comment added by 68.16.142.186 (talk) 23:00, 2 January 2012 (UTC)
Black holes vs conservation of information
In this astrophysics series on the Discovery Channel they said something about black holes "...breaking the fundamental law of conservation of information" and some quarrelling between theoretical physicists over black holes. I had never before heard of such a fundamental law and I thought it sounded very strange. So I went looking for it and what I found on Wikipedia had little to do with physics as far as I could see. Can someone point me to an article better describing what the TV series means with conservation of information, or is the series just misinformed? Thanks! Eddi (Talk) 00:27, 9 May 2012 (UTC)
- I think I've found the answer here and here. The TV show either overstates a "common assumption" as a "fundamental law" to make it more sensational, or it uses "information" as a synonym for "entropy" to make it easier to understand. This may not be completely unreasonable, but I found it a bit confusing... Eddi (Talk) 09:47, 14 May 2012 (UTC)
- See also quite a lot on the talk page of the first of those articles, for discussion on this. Jheald (talk) 21:29, 14 May 2012 (UTC)
Conformity Measurements
Information Theory (at least in the modern research arena) also includes the concept of Conformity, a measure that describes whether any unit bit of information conforms to the wider sea of information within which the bit of information rests.
What constitutes conformity is rather like the old PBS television shows and other educational efforts designed for children, which show a set of objects that all conform to some set or sets of grouping except for one object that is mildly or wildly different from the rest; applied to Information Theory, that is one bit of information that is wildly out of Conformity with the rest of the information within the arena of bits of information.
Information Theory shows vectors of Conformity that can be graphed. When there are 2 information units being considered, Conformity is zero, and when there are an infinite number of information bits, conformity is also zero. For information bits > 3 yet less than infinity, the measure of Conformity behaves much the way that photons behave when they interact with each other.
Anyway, the article here does not touch upon Conformity as a measurement within the wider realm of Information theory -- which is a shame since it's currently being used as a way of detecting corrupted data within streams of data. If I were better informed about the concepts involved I would add a section. Maybe someone who has formal training could be sought to add some. Damotclese (talk) 20:19, 15 May 2012 (UTC)