Talk:Binomial proportion confidence interval

This article is within the scope of WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has not yet received a rating on Wikipedia's content assessment scale.
This article has not yet received a rating on the importance scale.

suggestions for improvement

I wrote some of the material a few years ago. Maybe it is a bit too technical. I'll add an extra paragraph in the introduction that explains why there is more than one formula. Steve Simon (talk) 15:00, 9 September 2008 (UTC)

The article has been labelled too technical, but I don't see it as being that much more technical than a lot of other mathematical articles. One could leave out detail to make things more succinct, but that might make it more difficult to follow. Some suggestions:

remove the bit on inverting hypothesis tests, and just mention the normal-derived interval is called a Wald interval, with a link

add a section on continuity corrections for the normal interval (and score intervals?) 146.232.75.208 15:17, 22 September 2006 (UTC)

27 Nov 2006: I am not a statistician but I believe there may be an important error in the Wilson score interval. According to http://www.ppsw.rug.nl/~boomsma/confbin.pdf, the final term in the numerator under the square root sign should be (z squared)/(4n squared), not (z squared)/(4n) as is written. I don't have the mathematical capacity to determine which is correct, but for my data the former calculation makes a lot more sense than the latter, so I suspect that Wikipedia's entry is wrong. I hope a statistician reviews this at some point!

Actually I think the Wilson score interval was right the first time. The formula in the cited article only looks different because the expression inside the square root was multiplied out. 131.111.8.104 15:34, 29 May 2007 (UTC)
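To make the point above concrete, here is a sketch of the algebra. It assumes the article's version grouped the square-root term over a single factor of n; the exact layout of either formula at the time is an assumption, not something quoted in this thread. Here \hat{p} is the observed proportion, n the number of trials and z the normal quantile:

```latex
\[
\sqrt{\frac{\hat{p}(1-\hat{p})}{n} + \frac{z^{2}}{4n^{2}}}
\;=\;
\sqrt{\frac{1}{n}\left(\hat{p}(1-\hat{p}) + \frac{z^{2}}{4n}\right)}
\]
```

So a term written as z^2/(4n) inside a bracket that is itself divided by n is the same quantity as z^2/(4n^2) written as a separate summand.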

There is a comment within the article that does not belong there. Look in the section "Wilson score interval" for the sentence "(The following formula may be wrong. It's identical to the way the Normal approximation is derived)". This statement has to be moved to the discussion. Can someone please check whether the formula is correct and then remove that comment? -- Preceding unsigned comment added by 82.212.0.230 (talk) 13:52, 17 May 2008 (UTC)

I don't understand the comment that the Clopper-Pearson intervals are conservative due to the discreteness of the Binomial distribution; they are based on the beta distribution which IS continuous and well behaved in the interval. So, in fact, I think the comment is wrong (Fredrik x nilsson).

Fredrik, check out Brown, Cai, DasGupta 2001 in the references for a great illustration of the conservative performance of the Clopper-Pearson interval. I updated this section to help clarify. In short, by ensuring that the coverage is never below 95%, it is often much above 95%. MrYdobon (talk) 08:26, 29 September 2009 (UTC)
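The conservativeness described above can be checked numerically. The following is a minimal sketch, not taken from the article or from Brown, Cai and DasGupta: it finds the Clopper-Pearson limits by bisection on the binomial tail probabilities (equivalent to the beta-quantile form) and then sums the probability of every outcome whose interval contains the true p. The function names, n = 20 and the chosen values of p are all hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


def bisect_root(f, lo=0.0, hi=1.0, iters=60):
    """Root of an increasing function f with f(lo) < 0 < f(hi)."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact interval: each tail test is inverted at level alpha / 2."""
    # Lower limit solves P(X >= k | p) = alpha / 2 (defined as 0 when k == 0).
    lower = 0.0 if k == 0 else bisect_root(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2
    )
    # Upper limit solves P(X <= k | p) = alpha / 2 (defined as 1 when k == n).
    upper = 1.0 if k == n else bisect_root(
        lambda p: alpha / 2 - binom_cdf(k, n, p)
    )
    return lower, upper


def coverage(p, n, alpha=0.05):
    """Probability that the interval contains p when p is the true value."""
    total = 0.0
    for k in range(n + 1):
        lo, hi = clopper_pearson(k, n, alpha)
        if lo <= p <= hi:
            total += binom_pmf(k, n, p)
    return total


for p in (0.1, 0.3, 0.5):  # hypothetical true proportions
    print(p, coverage(p, 20))
```

Because only n + 1 outcomes are possible, the coverage moves in jumps as p changes and cannot sit at exactly 95% everywhere. Requiring at least 95% for every p therefore pushes it above 95% for most p.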

I don't agree with the last section in its summary of those papers. First, it makes no mention of the Wald interval, which the papers flatly state generally does not produce coverage probabilities at the desired level. Moreover, I think the description of "better" is ambiguous and misleading at best. The exact intervals are inherently good, since they always guarantee at least the desired coverage. However, they may produce an interval that is wider than desired. The approximations are "better" in the sense that they don't overestimate as much. In either case, however, both the exact and the non-Wald approximate intervals are generally better CI estimates for small n. -- Preceding unsigned comment added by 134.174.140.216 (talk) 21:01, 11 June 2010 (UTC)

The entry is technical, not too technical, in my opinion. However, for readers unfamiliar with all the formulas, -- myself included -- it would be helpful to see the calculation and result for an example case using each formula. -- Preceding unsigned comment added by Tjrm (talk • contribs) 21:36, 9 January 2011 (UTC)

Please have a look at the German WP article on this topic. It includes an example and tries to illustrate the intervals and the coverage probabilities. -- KurtSchwitters (talk) 09:17, 4 February 2011 (UTC)
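As a rough illustration of the kind of worked example requested above, here is a minimal sketch that computes the normal-approximation (Wald) interval and the Wilson score interval for one case. The data (7 successes in 10 trials), the 95% level and the function names are hypothetical choices, not taken from the article or from the German WP example.

```python
from math import sqrt
from statistics import NormalDist


def normal_quantile(conf):
    """Two-sided standard normal quantile, about 1.96 for conf = 0.95."""
    return NormalDist().inv_cdf(1 - (1 - conf) / 2)


def wald_interval(k, n, conf=0.95):
    """Normal-approximation interval: p_hat +/- z * sqrt(p_hat (1 - p_hat) / n)."""
    z = normal_quantile(conf)
    p_hat = k / n
    half = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def wilson_interval(k, n, conf=0.95):
    """Wilson score interval, written as centre +/- half-width."""
    z = normal_quantile(conf)
    p_hat = k / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


k, n = 7, 10  # hypothetical data: 7 successes in 10 trials
print("Wald:  ", wald_interval(k, n))
print("Wilson:", wilson_interval(k, n))
```

The Wald interval is centred on the observed proportion, while the Wilson interval is centred on a value pulled toward 1/2, so the two differ most for small n or proportions near 0 or 1.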

Why reverted?

@Qwfp: Can you please elaborate on why you reverted my edit? The old version was certainly not clear enough for a layperson. I can see your objecting to my use of the word "likelihood" instead of "probability", though I was using it in what I thought was a layperson-friendly way; so I agree to changing it to "probability". But you also said "Added text contained incorrect interpretation of confidence interval" -- can you explain what's incorrect about it? The added text said, with "likelihood" replaced by "probability",

A simple example of a binomial distribution is the set of various possible outcomes, and their probabilities, for the number of heads observed when a (not necessarily fair) coin is flipped ten times. The observed binomial proportion is the fraction of the flips which turn out to be heads. Given this observed proportion, the confidence interval for the true proportion innate in that coin is the range of possible proportions which, with some specified probability such as 95%, contains the true proportion. Thus there is a 95% chance that the true probability of heads coming up on any throw is somewhere in that range.

What's wrong with that? Duoduoduo (talk) 14:40, 14 February 2011 (UTC)

See the lead of confidence interval: "A confidence interval does not predict that the true value of the parameter has a particular probability of being in the confidence interval given the data actually obtained." This is a widespread misconception. Repeating it in Wikipedia could make it even more widespread, which would not be desirable. --Qwfp (talk) 19:48, 14 February 2011 (UTC)

That passage was put in in November, and it is true only if one uses a Bayesian definition of "probability". The article also says "How frequently the interval contains the parameter is determined by the confidence level or confidence coefficient." In other words, the "probability" in the frequentist sense that you've got the true parameter in the confidence interval is 95% or whatever. So it's not a "misconception"; rather it is a matter of one's preferred definition of "probability". But I think it is valid to use wording that avoids semantic debates.

The problem I was trying to address with my edit is that the article as it is written does not define the term in the title of the article. All it says is "a binomial proportion confidence interval is a confidence interval for a proportion in a statistical population." That's circular and about as uninformative as it could get! I'll rewrite it to put in the same definition that the other article uses, applied to this context. In doing so, the example also needs to be expanded as I did; as the example stands, it is uninformative since it says nothing about a proportion. Duoduoduo (talk) 21:17, 14 February 2011 (UTC)

I'm happy with the text of your contribution to the article now; many thanks for revising it. I apologise for reverting your edits rather than taking the time to revise the text myself. I don't entirely agree that it's a mere semantic debate but that's not relevant to improving the article, and it's a debate that's already been held at Talk:Confidence interval#Meaning of the term "confidence" and Talk:Confidence interval#second paragraph in lead. I didn't participate myself as I find such debates somewhat stressful, so I certainly don't wish to start another here. Best wishes, Qwfp (talk) 09:07, 15 February 2011 (UTC)