{{short description|Statistical confidence interval for success counts}}
In [[statistics]], a '''binomial proportion confidence interval''' is a [[confidence interval]] for the probability of success calculated from the outcome of a series of success–failure experiments ([[Bernoulli trial]]s). In other words, a binomial proportion confidence interval is an interval estimate of a success probability <math>\ p\ </math> when only the number of experiments <math>\ n\ </math> and the number of successes <math>\ n_\mathsf{s}\ </math> are known.


There are several formulas for a binomial confidence interval, but all of them rely on the assumption of a [[binomial distribution]]. In general, a binomial distribution applies when an experiment is repeated a fixed number of times, each trial of the experiment has two possible outcomes (success and failure), the probability of success is the same for each trial, and the trials are [[statistically independent]]. Because the binomial distribution is a [[discrete probability distribution]] (i.e., not continuous) and difficult to calculate for large numbers of trials, a variety of approximations are used to calculate this confidence interval, all with their own tradeoffs in accuracy and computational intensity.


A simple example of a binomial distribution is the set of various possible outcomes, and their probabilities, for the number of heads observed when a [[Coin flipping|coin is flipped]] ten times. The observed binomial proportion is the fraction of the flips that turn out to be heads. Given this observed proportion, the confidence interval for the true probability of the coin landing on heads is a range of possible proportions, which may or may not contain the true proportion. A 95% confidence interval for the proportion, for instance, will contain the true proportion 95% of the times that the procedure for constructing the confidence interval is employed.<ref name=Sullivan>
{{cite web
|last = Sullivan |first = Lisa
|date = 2017-10-27
|title = Confidence Intervals
|id = BS704
|type = course notes
|website = sphweb.bumc.bu.edu
|place = Boston, MA
|publisher = [[Boston University]] School of Public Health
|url = http://sphweb.bumc.bu.edu/otlt/MPH-Modules/BS/BS704_Confidence_Intervals/
}}
</ref>


== Problems with using a normal approximation or "Wald interval" {{anchor|Normal approximation interval|Wald interval}} ==
[[File:Normal_approx_interval_and_logistic_example.png|thumb|300px|Plotting the normal approximation interval on an arbitrary [[logistic curve]] reveals problems of ''overshoot'' and ''zero-width intervals''.<ref name=Newcombe-1998/>]]

A commonly used formula for a binomial confidence interval relies on approximating the distribution of error about a binomially-distributed observation, <math>\hat p</math>, with a [[normal distribution]].<ref name=Wallis2013>
{{cite journal
|last = Wallis |first = Sean A.
|year = 2013
|title = Binomial confidence intervals and contingency tests: Mathematical fundamentals and the evaluation of alternative methods
|journal = [[Journal of Quantitative Linguistics]]
|volume = 20 |issue = 3 |pages = 178–208
|doi = 10.1080/09296174.2013.799918 |s2cid = 16741749
|url = http://www.ucl.ac.uk/english-usage/staff/sean/resources/binomialpoisson.pdf
}}
</ref>
The normal approximation depends on the [[de Moivre–Laplace theorem]] (the original, [[binomial distribution|binomial]]-only version of the [[central limit theorem]]) and becomes unreliable when it violates the theorem's premises, as the sample size becomes small or the success probability grows close to either {{math|0}} or {{math|1}}&nbsp;.<ref name=Brown2001>
{{cite journal
| last1 = Brown | first1 = Lawrence D. | author1-link = Lawrence D. Brown
| last2 = Cai | first2 = T. Tony | author-link2 = T. Tony Cai
| last3 = DasGupta | first3 = Anirban
| year = 2001
| title = Interval estimation for a binomial proportion
| journal = [[Statistical Science]]
| volume = 16 | issue = 2 | pages = 101–133
| doi = 10.1214/ss/1009213286 | mr = 1861069
| zbl = 1059.62533 | citeseerx = 10.1.1.50.3025
}}</ref>


Using the normal approximation, the success probability <math>\ p\ </math> is estimated by


: <math>\ p ~~ \approx ~~ \hat p \pm \frac{\; z_\alpha\ }{\ \sqrt{n\; }\ }\ \sqrt{ \hat p\ \left(1 - \hat p \right)\ }\ ,</math>


where <math>\ \hat p \equiv \frac{\!\ n_\mathsf{s}\!\ }{ n }\ </math> is the proportion of successes in a [[Bernoulli trial]] process and an estimator for <math>\ p\ </math> in the underlying [[Bernoulli distribution]]. The equivalent formula in terms of observation counts is


: <math>\ p ~~ \approx ~~ \frac{\!\ n_\mathsf{s}\!\ }{ n } \pm \frac{\; z_\alpha\ }{\ \sqrt{n\; }\ } \sqrt{ \frac{\!\ n_\mathsf{s}\!\ }{ n }\ \frac{\!\ n_\mathsf{f}\!\ }{ n }\ }\ ,</math>


where the data are the results of <math>\ n\ </math> trials that yielded <math>\ n_\mathsf{s}\ </math> successes and <math>\ n_\mathsf{f} = n - n_\mathsf{s}\ </math> failures. The distribution function argument <math>\ z_\alpha\ </math> is the <math>\ 1 - \tfrac{\!\ \alpha\!\ }{2}\ </math> [[quantile]] of a [[standard normal distribution]] (i.e., the [[probit]]) corresponding to the target error rate <math>\ \alpha ~.</math> For a 95% confidence level, the error <math>\ \alpha = 1 - 0.95 = 0.05\ ,</math> so <math>\ 1 - \tfrac{\!\ \alpha\!\ }{ 2 } = 0.975\ </math> and <math>\ z_{.05} = 1.96 ~.</math>
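
For illustration, the calculation can be sketched in a few lines of Python (a minimal sketch; the helper name {{mono|wald_interval}} is illustrative only, not a standard library function):

<syntaxhighlight lang="python">
from math import sqrt
from scipy.stats import norm

def wald_interval(n_s: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Normal-approximation ("Wald") interval for a binomial proportion."""
    p_hat = n_s / n                # observed proportion of successes
    z = norm.ppf(1 - alpha / 2)    # probit, e.g. 1.96 for alpha = 0.05
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# 20 successes in 400 trials at 95% confidence: roughly (0.029, 0.071)
print(wald_interval(20, 400))
</syntaxhighlight>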


When using the Wald formula to estimate <math>\ p\ </math>, or just considering the possible outcomes of this calculation, two problems immediately become apparent:
* First, for <math>\ \hat p\ </math> approaching either {{math|1}} or {{math|0}}, the interval narrows to zero width (falsely implying certainty).
* Second, for values of <math>~ \hat p ~ < ~ \frac{ 1 }{\ 1 + n / z_\alpha^2\ } ~</math> (probability too low / too close to {{math|0}}), the interval boundaries exceed <math>\ [0 ,\ 1]\ </math> (''overshoot'').
(Another version of the second, overshoot problem arises when instead <math>\ 1 - \hat p\ </math> falls below the same upper bound: probability too high / too close to {{math|1}}&nbsp;.)

An important theoretical derivation of this confidence interval involves the inversion of a hypothesis test. Under this formulation, the confidence interval contains exactly those hypothesized values of the [[population proportion]] that, if taken as the null hypothesis, would be assigned a large [[p-value|''p''-value]] by the test of the observed estimate. The collection of values, <math>\ \theta\ ,</math> for which the normal approximation is valid can be represented as

:<math>\left\{~~ \theta \quad \Bigg| \quad y_{\alpha/2} ~~ \le ~~ \frac{\ \hat p - \theta\ }{\sqrt{\frac{\!\ 1\!\ }{ n }\ \hat p\ \left(1 - \hat p\right)\ }\ } ~~ \le ~~ z_{\alpha/2} ~~\right\}\ ,</math>

where <math>\ y_{\alpha/2}\ </math> is the lower <math>\ \tfrac{ \alpha }{\!\ 2\!\ }\ </math> [[quantile]] of a [[standard normal distribution]], vs. <math>\ z_{\alpha/2}\ ,</math> which is the ''upper'' quantile.

Since the test in the middle of the inequality is a [[Wald test]], the normal approximation interval is sometimes called the '''Wald interval''' or '''Wald method''', after [[Abraham Wald]], but it was first described by [[Pierre-Simon Laplace|Laplace]] (1812).<ref>
{{cite book
|last = Laplace |first=P.S. |author-link = Pierre-Simon Laplace
|year = 1812
|title = Théorie analytique des probabilités |lang=fr
|trans-title = Analyitic Probability Theory
|publisher = Ve. Courcier
|page = 283
|url = https://archive.org/details/thorieanalytiqu00laplgoog
}}
</ref>

=== Bracketing the confidence interval ===
Extending the normal approximation and Wald-Laplace interval concepts, [[Michael Short (engineer)|Michael Short]] has shown that inequalities on the approximation error between the binomial distribution and the normal distribution can be used to accurately bracket the estimate of the confidence interval around <math>\ p\ :</math><ref name=Short-2021>
{{cite journal
|last = Short |first = Michael
|date = 2021-11-08
|title = On binomial quantile and proportion bounds: With applications in engineering and informatics
|journal = Communications in Statistics - Theory and Methods
|volume = 52 |issue = 12 |pages = 4183–4199
|doi = 10.1080/03610926.2021.1986540 |doi-access = free
|s2cid = 243974180 |issn = 0361-0926
}}
</ref>

:<math>\ \frac{\ k + C_\mathsf{L1} - z_\alpha\ \widehat{W}_\mathsf{L}\ }{\ n + z_\alpha^2\ } ~~ \le ~~ p ~~ \le ~~ \frac{\ k + C_\mathsf{U1} + z_\alpha\ \widehat{W}_\mathsf{U}\ }{ n + z_\alpha^2 }\ </math>

with
:<math>\ \widehat{W}_\mathsf{L} \equiv \sqrt{ \frac{\ n\ k - k^2 + C_\mathsf{L2}\ n - C_\mathsf{L3}\ k + C_\mathsf{L4}\ }{ n } ~}\ , \qquad \widehat{W}_\mathsf{U} \equiv \sqrt{ \frac{\ n\ k - k^2 + C_\mathsf{U2}\ n - C_\mathsf{U3}\ k + C_\mathsf{U4}\ }{ n } ~}\ ,</math>


and where <math>\ p\ </math> is again the (unknown) proportion of successes in a Bernoulli trial process (as opposed to <math>\ \hat p \equiv \frac{\ n_\mathsf{s}\ }{ n } \ </math> that estimates it) measured with <math>\ n\ </math> trials yielding <math>\ k\ </math> successes, <math>\ z_\alpha\ </math> is the <math>\ 1 - \tfrac{\alpha}{2}\ </math> quantile of a standard normal distribution (i.e., the probit) corresponding to the target error rate <math>\ \alpha\ ,</math> and the constants <math>\ C_\mathsf{L1}, C_\mathsf{L2}, C_\mathsf{L3}, C_\mathsf{L4}, C_\mathsf{U1}, C_\mathsf{U2}, C_\mathsf{U3}\ </math> and <math>\ C_\mathsf{U4}\ </math> are simple algebraic functions of <math>\ z_\alpha ~.</math><ref name=Short-2021/> For a fixed <math>\ \alpha\ </math> (and hence <math>\ z_\alpha\ </math>), the above inequalities give easily computed one- or two-sided intervals which bracket the exact binomial upper and lower confidence limits corresponding to the error rate <math>\ \alpha ~.</math>


===Standard error of a proportion estimation when using weighted data===


Let there be a simple random sample <math>\ X_1,\ \ldots,\ X_n\ </math> where each <math>\ X_i\ </math> is [[Independent and identically distributed random variables|i.i.d.]] from a [[Bernoulli distribution|Bernoulli]](p) distribution and weight <math>\ w_i\ </math> is the weight for each observation, with the (positive) weights <math>\ w_i\ </math> normalized so they sum to {{math|1}}&nbsp;. The [[Weighted arithmetic mean|weighted sample proportion]] is: <math display="inline">\ \hat p = \sum_{i=1}^n\ w_i\ X_i ~.</math> Since each of the <math>\ X_i\ </math> is independent of all the others, and each one has variance <math>\ \operatorname{var}\{\ X_i\ \}\ =\ p\ (1 - p)\ </math> for every <math>\ i\ =\ 1 ,\ \ldots\ , n\ ,</math> the '''sampling variance of the proportion''' therefore is:<ref>{{cite web |title=How to calculate the standard error of a proportion using weighted data? |website=stats.stackexchange.com |url=https://stats.stackexchange.com/a/159220/253 }}</ref>


:<math>\ \operatorname{var}\{\ \hat p\ \} = \sum_{i=1}^n \operatorname{var}\{\ w_i\ X_i\ \} = p\ ( 1 - p )\ \sum_{i=1}^n w_i^2 ~.</math>


The '''standard error''' of <math>\ \hat p\ </math> is the square root of this quantity. Because we do not know <math>\ p\ (1-p)\ ,</math> we have to estimate it. Although there are many possible estimators, a conventional one is to use <math>\ \hat p\ ,</math> the sample mean, and plug this into the formula. That gives:


:<math>\ \operatorname{SE}\{\ \hat p\ \} \approx \sqrt{~\hat p\ (1 - \hat p)\ \sum_{i=1}^n w_i^2 ~~}\ </math>


For otherwise unweighted data, the effective weights are uniform <math display="inline">\ w_i = \tfrac{\!\ 1\!\ }{ n }\ ,</math> giving <math display="inline">\ \sum_{i=1}^n w_i^2 = \tfrac{\!\ 1\!\ }{ n } ~.</math> The <math>\ \operatorname{SE}\ </math> becomes <math display="inline">\sqrt{ \tfrac{\!\ 1\!\ }{ n }\ \hat p\ (1-\hat p)~}\ ,</math> leading to the familiar formulas, showing that the calculation for weighted data is a direct generalization of them.
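
A minimal Python sketch of this calculation (the helper name {{mono|weighted_proportion_se}} is illustrative only):

<syntaxhighlight lang="python">
import numpy as np

def weighted_proportion_se(x: np.ndarray, w: np.ndarray) -> tuple[float, float]:
    """Weighted sample proportion and its estimated standard error.

    x holds 0/1 outcomes; w holds positive weights (normalized here to sum to 1).
    """
    w = w / w.sum()                                   # normalize the weights
    p_hat = float(np.sum(w * x))                      # weighted sample proportion
    se = float(np.sqrt(p_hat * (1 - p_hat) * np.sum(w ** 2)))
    return p_hat, se

# With uniform weights, sum(w_i^2) = 1/n, recovering the familiar sqrt(p(1-p)/n):
x = np.array([1, 0, 0, 1, 0, 0, 0, 0])
print(weighted_proportion_se(x, np.ones(len(x))))
</syntaxhighlight>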


==Wilson score interval==
[[File:Wilson_score_interval_and_logistic_example.png|thumb|300px|Wilson score intervals plotted on a logistic curve, revealing asymmetry and good performance for small {{mvar|n}} and where {{mvar|p}} is at or near 0 or 1.]]
The '''Wilson score interval''' was developed by [[Edwin Bidwell Wilson|E.B. Wilson]] (1927).<ref name=Wilson1927>
{{cite journal
| last = Wilson | first = E.B. | author-link = Edwin Bidwell Wilson
| year = 1927
| title = Probable inference, the law of succession, and statistical inference
| journal = Journal of the American Statistical Association
| volume = 22 | issue = 158 | pages = 209–212
| jstor = 2276774 | doi = 10.1080/01621459.1927.10502953
}}
</ref>
It is an improvement over the normal approximation interval in multiple respects: Unlike the symmetric normal approximation interval (above), the Wilson score interval is ''asymmetric'', and it does not suffer from the problems of ''overshoot'' and ''zero-width intervals'' that afflict the normal interval. It can be safely employed with small samples and skewed observations.<ref name=Wallis2013/> The observed [[coverage probability]] is consistently closer to the nominal value, <math>\ 1 - \alpha ~.</math><ref name=Newcombe-1998/>


Like the normal interval, the interval can be computed directly from a formula.


Wilson started with the normal approximation to the binomial:
:<math>\ z_\alpha \approx \frac{~ \left(\ p - \hat{p}\ \right) ~}{ \sigma_n }\ </math>
where <math>\ z_\alpha\ </math> is the standard normal interval half-width corresponding to the desired confidence <math>\ 1 - \alpha ~.</math> The analytic formula for a binomial sample standard deviation is
<math display="block">\ \sigma_n = \sqrt{\frac{\ p\ \left( 1 - p \right)\ }{n} ~} ~.</math>
Combining the two, and squaring out the radical, gives an equation that is quadratic in <math>\ p\ :</math>
<!-- \hat{p} = p \pm z_\alpha\ \sqrt{\frac{\ p\left(1 - p\right)\ }{n}\ } \qquad\Longrightarrow\qquad -->
:<math>
\left(\ p - \hat{p}\ \right)^2 = \frac{\; z_\alpha^2\ }{ n }\ p\ \left(\ 1 - p\ \right) \qquad
</math> or <math> \qquad
p^2 - 2\ p\ \hat{p} + {\hat{p}}^2 = p\ \frac{\; z_\alpha^2\ }{ n } - p^2\ \frac{\; z_\alpha^2\ }{ n } ~.
</math>
Transforming the relation into a standard-form quadratic equation for <math>\ p\ ,</math> treating <math>\ \hat p\ </math> and <math>\ n\ </math> as known values from the sample (see prior section), and using the value of <math>\ z_\alpha\ </math> that corresponds to the desired confidence <math>\ 1 - \alpha\ </math> for the estimate of <math>\ p\ </math> gives this:
<math display="block">
\left(\ 1 + \frac{\; z_\alpha^2\ }{ n }\ \right)\ p^2 -
\left(\ 2\ {\hat p} + \frac{\; z_\alpha^2\ }{ n }\ \right)\ p +
\biggl(\ {\hat p}^2\ \biggr) = 0 ~,
</math>
where all of the values bracketed by parentheses are known quantities.
The solution for <math>\ p\ </math> estimates the upper and lower limits of the confidence interval for <math>\ p ~.</math> Hence the probability of success <math>\ p\ </math> is estimated by <math>\ \hat p\ </math> and with <math>\ 1 - \alpha\ </math> confidence bracketed in the interval
:<math>
p \quad \underset{\approx}{\in}_\alpha \quad \bigl(\ w^- ,\ w^+\ \bigr) ~~ = ~~ \frac{ 1 }{~ 1 + z_\alpha^2\ / n ~} \Biggl(\ \hat p + \frac{\; z_\alpha^2\ }{ 2\ n }
~~ \pm ~~
\frac{\!\ z_\alpha\!\ }{\!\ 2\ n\!\ }\ \sqrt{\ 4\ n\ \hat p\ (1 - \hat p) + z_\alpha^2 ~} ~\Biggr)\ ,
</math>


where <math>\ \underset{\approx}{\in}_\alpha\ </math> is an abbreviation for

:<math>\ \operatorname{\mathbb P} \Bigl\{~~ p \in \left(\ w^- ,\ w^+\ \right) ~~\Bigr\} = 1 - \alpha ~.</math>

An equivalent expression using the observation counts <math>\ n_\mathsf{s}\ </math> and <math>\ n_\mathsf{f}\ </math> is
:<math>
p \quad \underset{\approx}{\in}_\alpha \quad \frac{\ n_\mathsf{s} + \tfrac{\!\ 1\!\ }{ 2 }\ z_\alpha^2\ }{ n + z_\alpha^2 }
~ \pm ~ \frac{ z_\alpha }{\ n + z_\alpha^2\ }
\sqrt{
\frac{\ n_\mathsf{s}\ n_\mathsf{f}\ }{ n } + \frac{\; z_\alpha^2\ }{ 4 } ~}\ ,
</math>


with the counts as above: <math>\ n_\mathsf{s} \equiv\ </math> the count of observed "successes", <math>\ n_\mathsf{f} \equiv\ </math> the count of observed "failures", and their sum is the total number of observations <math>\ n = n_\mathsf{s} + n_\mathsf{f} ~.</math>

In practical tests of the formula's results, users find that this interval has good properties even for a small number of trials and / or the extremes of the probability estimate, <math>\ \hat p \equiv \frac{\!\ n_\mathsf{s}\ }{ n } ~.</math><ref name=Newcombe-1998/><ref name=Wallis2013/><ref name=Wallis2021/>
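
A minimal Python sketch of the count form of the interval (the helper name {{mono|wilson_interval}} is illustrative only):

<syntaxhighlight lang="python">
from math import sqrt
from scipy.stats import norm

def wilson_interval(n_s: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Wilson score interval in the observation-count form given above."""
    n_f = n - n_s                  # observed failures
    z = norm.ppf(1 - alpha / 2)    # probit for the target error rate
    center = (n_s + z ** 2 / 2) / (n + z ** 2)
    half_width = z / (n + z ** 2) * sqrt(n_s * n_f / n + z ** 2 / 4)
    return center - half_width, center + half_width

# 20 successes in 400 trials at 95% confidence: roughly (0.033, 0.076),
# asymmetric about the observed proportion 0.05, and never outside [0, 1].
print(wilson_interval(20, 400))
</syntaxhighlight>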


Intuitively, the center value of this interval is the weighted average of <math>\ \hat{p}\ </math> and <math>\ \tfrac{\!\ 1\!\ }{ 2 }\ ,</math> with <math>\ \hat{p}\ </math> receiving greater weight as the sample size increases. Formally, the center value corresponds to using a [[pseudocount]] of <math>\ \tfrac{\!\ 1\!\ }{ 2 } z_\alpha^2\ ,</math> the number of standard deviations of the confidence interval: Add this number to both the count of successes and of failures to yield the estimate of the ratio. For the common two standard deviations in each direction interval (approximately 95% coverage, which itself is approximately 1.96&nbsp;standard deviations), this yields the estimate <math>\ \frac{\ n_\mathsf{s} + 2 \ }{ n + 4 }\ ,</math> which is known as the "plus four rule".


Although the quadratic can be solved explicitly, Wilson's equations can, in most cases, also be solved numerically using the fixed-point iteration
:<math>
p_{\!\ k+1} = \hat{p} \pm z_\alpha\ \sqrt{ \tfrac{\!\ 1\!\ }{ n }\ p_k\ \left(\ 1 - p_k\ \right) ~}
</math>
with <math>\ p_0 = \hat{p} ~.</math>
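
A short sketch of this iteration (the tolerance and iteration cap are arbitrary choices here, and the square root requires the iterates to remain inside <math>(0, 1)</math>):

<syntaxhighlight lang="python">
from math import sqrt

def wilson_bound_fixed_point(p_hat: float, n: int, z: float = 1.96,
                             upper: bool = True, tol: float = 1e-12) -> float:
    """Iterate p <- p_hat +/- z*sqrt(p*(1 - p)/n), starting from p0 = p_hat."""
    sign = 1.0 if upper else -1.0
    p = p_hat
    for _ in range(1000):          # iteration cap; usually converges quickly
        p_next = p_hat + sign * z * sqrt(p * (1 - p) / n)
        if abs(p_next - p) < tol:
            break
        p = p_next
    return p

# Reproduces the two Wilson bounds for 20 successes in 400 trials:
print(wilson_bound_fixed_point(0.05, 400, upper=False),
      wilson_bound_fixed_point(0.05, 400, upper=True))
</syntaxhighlight>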


The Wilson interval can also be derived from the [[z-test|single sample z-test]] or [[Pearson's chi-squared test]] with two categories. The resulting interval,


:<math>
\left\{ ~~ \theta \quad \Bigg| \quad y_\alpha ~~ \le ~~
\frac{\ \hat{p} - \theta\ }{\ \sqrt{ \tfrac{\!\ 1\!\ }{ n }\ \theta\ \left( 1 - \theta \right) ~}\ }
~~ \le ~~ z_\alpha ~~ \right\}\ ,
</math>

(with <math>\ y_\alpha\ </math> the lower <math>\ \alpha\ </math> quantile) can then be solved for <math>\ \theta\ </math> to produce the Wilson score interval. The test in the middle of the inequality is a [[score test]].


===The interval equality principle===
[[File:Wilson score pdf and interval equality.png|thumb|300px|The [[probability density function]] ({{sc|pdf}}) for the Wilson score interval, plus {{sc|pdf}}s at interval bounds. Tail areas are equal.]]
Since the interval is derived by solving from the normal approximation to the binomial, the Wilson score interval <math>~ \bigl(\ w^-\ ,\ w^+\ \bigr) ~</math> has the property of being guaranteed to obtain the same result as the equivalent [[z-test]] or [[Pearson's chi-squared test|chi-squared test]].

This property can be visualised by plotting the [[probability density function]] for the Wilson score interval (''see'' Wallis).<ref name=Wallis2021>
{{cite book
| last = Wallis | first = Sean A.
| year = 2021
| title = Statistics in Corpus Linguistics: A new approach
| place = New York, NY
| publisher = Routledge
| isbn = 9781138589384
| url = https://www.routledge.com/Statistics-in-Corpus-Linguistics-Research-A-New-Approach/Wallis/p/book/9781138589384
}}
</ref>{{rp|style=ama|pp=297–313}} After that, a normal {{sc|pdf}} is plotted across each bound. The tail areas of the resulting Wilson and normal distributions, representing the chance of a significant result in that direction, must be equal.


The continuity-corrected Wilson score interval and the [[Clopper-Pearson interval]] are also compliant with this property. The practical import is that these intervals may be employed as [[Statistical hypothesis testing|significance tests]], with identical results to the source test, and new tests may be derived by geometry.<ref name=Wallis2021/>


===Wilson score interval with continuity correction===
The Wilson interval may be modified by employing a [[continuity correction]], in order to align the minimum [[coverage probability]], rather than the average coverage probability, with the nominal value, <math>\ 1 - \alpha ~.</math>



Just as the Wilson interval mirrors [[Pearson's chi-squared test]], the Wilson interval with continuity correction mirrors the equivalent [[Yates's correction for continuity|Yates' chi-squared test]].


The following formulae for the lower and upper bounds of the Wilson score interval with continuity correction <math>\ \left( w_\mathsf{cc}^- , w_\mathsf{cc}^+ \right)\ </math> are derived from Newcombe:<ref name=Newcombe-1998>
{{cite journal
| last = Newcombe | first = R.G.
| year = 1998
| title = Two-sided confidence intervals for the single proportion: Comparison of seven methods
| journal = [[Statistics in Medicine (journal)|Statistics in Medicine]]
| volume = 17 | issue = 8 | pages = 857–872
| pmid = 9595616
| doi = 10.1002/(SICI)1097-0258(19980430)17:8<857::AID-SIM777>3.0.CO;2-E
}}
</ref>


<math display="block">\begin{align}
w_\mathsf{cc}^- &= \max \left\{ ~~ 0\ , ~~
\frac{\ 2\ n\ \hat{p} + z_\alpha^2 - \left[\ z_\alpha\ \sqrt{ z_\alpha^2 - \frac{\ 1\ }{ n } + 4\ n\ \hat{p}\ \left(1 - \hat{p}\right) + \left( 4\ \hat{p} - 2 \right) ~}\ +\ 1\ \right]\ }{\ 2 \left( n + z_\alpha^2 \right) ~~}
\right\}\ , \\
w_\mathsf{cc}^+ &= \min \left\{ ~~ 1\ , ~~
\frac{\ 2\ n\ \hat{p} + z_\alpha^2 + \left[\ z_\alpha\ \sqrt{ z_\alpha^2 - \frac{\ 1\ }{ n } + 4\ n\ \hat{p}\ \left( 1 - \hat{p} \right) - \left( 4\ \hat{p} - 2 \right) ~}\ +\ 1\ \right]\ }{\ 2 \left( n + z_\alpha^2 \right) ~~}
\right\}\ ,
\end{align}</math>
for <math>\ \hat p \ne 0\ </math> and <math>\ \hat p \ne 1 ~.</math>


If <math>\ \hat p = 0\ ,</math> then <math>\ w_\mathsf{cc}^-\ </math> must instead be set to <math>\ 0\ ;</math> if <math>\ \hat p = 1\ ,</math> then <math>\ w_\mathsf{cc}^+\ </math> must instead be set to <math>\ 1 ~.</math>


Wallis (2021)<ref name=Wallis2021/> identifies a simpler method for computing continuity-corrected Wilson intervals that employs a special function based on Wilson's lower-bound formula: In Wallis' notation, for the lower bound, let
:<math>~ \mathsf{Wilson_{lower}}\left(\ \hat{p},\ n,\ \tfrac{\!\ \alpha\!\ }{ 2 }\ \right)\ \equiv\ w^-\ =\ \frac{ 1 }{~ 1 + z_\alpha^2\ / n ~} \Biggl(\ \hat p + \frac{\; z_\alpha^2\ }{\!\ 2\ n\!\ }
~~ - ~~
\frac{\!\ z_\alpha\!\ }{\!\ 2\ n\!\ }\ \sqrt{\ 4\ n\ \hat p\ (1 - \hat p) + z_\alpha^2 ~} ~\Biggr)\ ,
</math>

where <math>\ \alpha\ </math> is the selected tolerable error level for <math>\ z_\alpha ~.</math> Then
:<math>\ w_\mathsf{cc}^- = \mathsf{Wilson_{lower}}\left(\ \max \left\{\ \hat{p} - \tfrac{ 1 }{\!\ 2\ n\!\ },\ 0\ \right\},\ n,\ \tfrac{\!\ \alpha\!\ }{ 2 }\ \right) ~.</math>

This method has the advantage of being further decomposable.
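
A Python sketch of this decomposition (a loose reading of the method: unlike Wallis' notation, the helpers here take the full error level <math>\ \alpha\ </math> and compute <math>\ z_\alpha\ </math> internally; the function names are illustrative only):

<syntaxhighlight lang="python">
from math import sqrt
from scipy.stats import norm

def wilson_lower(p_hat: float, n: int, alpha: float = 0.05) -> float:
    """Lower bound w^- of the (uncorrected) Wilson score interval."""
    z = norm.ppf(1 - alpha / 2)
    return (1 / (1 + z ** 2 / n)) * (p_hat + z ** 2 / (2 * n)
            - (z / (2 * n)) * sqrt(4 * n * p_hat * (1 - p_hat) + z ** 2))

def wilson_cc_lower(p_hat: float, n: int, alpha: float = 0.05) -> float:
    """Continuity-corrected lower bound: shift p_hat down by 1/(2n), floored at 0."""
    return wilson_lower(max(p_hat - 1 / (2 * n), 0.0), n, alpha)

print(wilson_cc_lower(0.05, 400))
</syntaxhighlight>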


==Jeffreys interval==
The ''Jeffreys interval'' has a Bayesian derivation, but good frequentist properties (outperforming most frequentist constructions). In particular, it has coverage properties that are similar to those of the Wilson interval, but it is one of the few intervals with the advantage of being ''equal-tailed'' (e.g., for a 95% confidence interval, the probabilities of the interval lying above or below the true value are both close to 2.5%). In contrast, the Wilson interval has a systematic bias such that it is centred too close to <math>\ p = 0.5 ~.</math><ref>
{{cite journal
| last = Cai | first = T.T. | author-link = T. Tony Cai
| year = 2005
| title = One-sided confidence intervals in discrete distributions
| journal = [[Journal of Statistical Planning and Inference]]
| volume = 131 | issue = 1 | pages = 63–88
| doi=10.1016/j.jspi.2004.01.005
}}
</ref>


The Jeffreys interval is the Bayesian [[credible interval]] obtained when using the [[non-informative prior|non-informative]] [[Jeffreys prior]] for the binomial proportion <math>\ p ~.</math> The [[Jeffreys prior#Bernoulli trial|Jeffreys prior for this problem]] is a [[Beta distribution]] with parameters <math>\ \left( \tfrac{\!\ 1\!\ }{ 2 }, \tfrac{\!\ 1\!\ }{ 2 } \right)\ ,</math> a [[conjugate prior]]. After observing <math>\ x\ </math> successes in <math>\ n\ </math> trials, the [[posterior distribution]] for <math>\ p\ </math> is a Beta distribution with parameters <math>\ \left( x + \tfrac{\!\ 1\!\ }{ 2 }, n - x + \tfrac{\!\ 1\!\ }{ 2 } \right) ~.</math>


When <math>\ x \ne 0\ </math> and <math>\ x \ne n\ ,</math> the Jeffreys interval is taken to be the <math>\ 100\ \left( 1 - \alpha \right)\ \mathrm{%}\ </math> equal-tailed posterior probability interval, i.e., the <math>\ \tfrac{\!\ 1\!\ }{ 2 }\ \alpha\ </math> and <math>\ 1 - \tfrac{\!\ 1\!\ }{ 2 }\ \alpha \ </math> quantiles of a Beta distribution with parameters <math>\ \left(\ x + \tfrac{\!\ 1\!\ }{ 2 },\ n - x + \tfrac{\!\ 1\!\ }{ 2 }\ \right) ~.</math>


In order to avoid the coverage probability tending to zero when <math>\ p \to 0\ </math> or {{math|1}}&nbsp;, when <math>\ x = 0\ </math> the upper limit is calculated as before but the lower limit is set to {{math|0}}&nbsp;, and when <math>\ x = n\ </math> the lower limit is calculated as before but the upper limit is set to {{math|1}}&nbsp;.<ref name=Brown2001/>

Jeffreys' interval can also be thought of as a frequentist interval based on inverting the [[p-value|''p''-value]] from the [[G-test]] after applying the [[Yates's correction for continuity|Yates correction]] to avoid a potentially-infinite value for the test statistic.
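
A minimal Python sketch using the Beta posterior quantiles (the helper name {{mono|jeffreys_interval}} is illustrative only):

<syntaxhighlight lang="python">
from scipy.stats import beta

def jeffreys_interval(x: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Equal-tailed interval from the Beta(x + 1/2, n - x + 1/2) posterior,
    with the boundary adjustments for x = 0 and x = n described above."""
    lo = 0.0 if x == 0 else beta.ppf(alpha / 2, x + 0.5, n - x + 0.5)
    hi = 1.0 if x == n else beta.ppf(1 - alpha / 2, x + 0.5, n - x + 0.5)
    return lo, hi

# 20 successes in 400 trials at 95% confidence: roughly (0.032, 0.074)
print(jeffreys_interval(20, 400))
</syntaxhighlight>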


==Clopper–Pearson interval==
The Clopper–Pearson interval is an early and very common method for calculating binomial confidence intervals.<ref>
{{cite journal
| last1 = Clopper | first1 = C.
| last2 = Pearson | first2 = E.S. | author-link2 = Egon Pearson
| year = 1934
| title = The use of confidence or fiducial limits illustrated in the case of the binomial
| journal = [[Biometrika]]
| volume = 26 | issue = 4 | pages = 404–413
| doi = 10.1093/biomet/26.4.404
}}
</ref>
This is often called an 'exact' method, as it attains the nominal coverage level in an exact sense, meaning that the coverage level is never less than the nominal <math>\ 1 - \alpha ~.</math><ref name=Newcombe-1998/>


The Clopper–Pearson interval can be written as


:<math>\ S_{\le} \cap S_{\ge}\ </math>


or equivalently,


:<math>\ \left( \inf S_{\ge}\ ,\ \sup S_{\le} \right)\ </math>


with


:<math> S_{\le} ~ \equiv ~ \left\{ ~ p ~~ \Big| ~~ \operatorname{\mathbb P} \left\{~ \operatorname{Bin}\left( n; p \right) \le x ~\right\} > \tfrac{\!\ \alpha\!\ }{ 2 } ~~ \right\} ~</math>
and
:<math>~ S_{\ge} ~ \equiv ~ \left\{ ~ p ~~ \Big| ~~ \operatorname{\mathbb P} \left\{~ \operatorname{Bin}\left( n; p \right) \ge x ~\right\} > \tfrac{\!\ \alpha\!\ }{ 2 } ~~ \right\}\ ,</math>


where <math>\ 0 \le x \le n\ </math> is the number of successes observed in the sample and <math>\ \mathsf{Bin}\left( n; p \right)\ </math> is a binomial random variable with <math>\ n\ </math> trials and probability of success <math>\ p ~.</math>


Equivalently we can say that the Clopper–Pearson interval is <math display="inline">\ \left(\ \frac{\!\ x\!\ }{ n } - \varepsilon_1,\ \frac{\!\ x\!\ }{ n } + \varepsilon_2\ \right)\ </math> with confidence level <math>\ 1 - \alpha\ </math> if <math>\ \varepsilon_i\ </math> is the infimum of those values such that the following tests of hypothesis succeed with significance <math display="inline">\ \frac{\!\ \alpha\!\ }{2}\ :</math>


# H<sub>0</sub>: <math>\ p = \frac{\!\ x\!\ }{ n } - \varepsilon_1\ </math> with H<sub>A</sub>: <math>\ p > \frac{\!\ x\!\ }{ n } - \varepsilon_1</math>
# H<sub>0</sub>: <math>\ p = \frac{\!\ x\!\ }{ n } + \varepsilon_2\ </math> with H<sub>A</sub>: <math>\ p < \frac{\!\ x\!\ }{ n } + \varepsilon_2 ~.</math>


Because of a relationship between the binomial distribution and the [[beta distribution]], the Clopper–Pearson interval is sometimes presented in an alternate format that uses quantiles from the beta distribution.<ref name=Thulin2014>
{{cite journal
|last = Thulin |first = Måns
|date = 2014-01-01
|title = The cost of using exact confidence intervals for a binomial proportion
|journal = [[Electronic Journal of Statistics]]
|volume = 8 |issue = 1 |pages = 817–840
|doi = 10.1214/14-EJS909 |issn = 1935-7524
|arxiv = 1303.1288 |s2cid = 88519382 |lang = en
}}
</ref>


:<math>\ B\!\left(\tfrac{\!\ \alpha\!\ }{ 2 }\ ;\ x\ ,\ n - x + 1 \right) ~ < ~ p ~ < ~ B\!\left(\ 1 - \tfrac{\!\ \alpha\!\ }{ 2 }\ ;\ x + 1\ ,\ n - x\ \right)\ </math>


where <math>\ x\ </math> is the number of successes, <math>\ n\ </math> is the number of trials, and <math>\ B\!\left(\ p\ ;\ v\ ,\ w\ \right)\ </math> is the {{mvar|p}}th [[Cumulative distribution function#Inverse distribution function (quantile function)|quantile]] from a beta distribution with shape parameters <math>\ v\ </math> and <math>\ w ~.</math>


Thus, <math>\ p_{\min} ~ < ~ p ~ < ~ p_{\max}\ ,</math> where:
:<math>
\tfrac{\ \Gamma(n+1)\ }{\ \Gamma\!( x )\ \Gamma\!( n-x+1 )\ }\ \int_0^{ p_{\min}}\ t^{x-1}\ (1-t)^{n-x}\ \mathrm{d}\!\ t ~~ = ~~ \tfrac{\!\ \alpha\!\ }{ 2 }\ ,
</math>
:<math>
\tfrac{\ \Gamma(n+1)\ }{\ \Gamma\!( x+1 )\ \Gamma\!( n-x )\ }\ \int_0^{ p_{\max}}\ t^{x}\ (1-t)^{n-x-1}\ \mathrm{d}\!\ t ~~ = ~~ 1 - \tfrac{\!\ \alpha\!\ }{ 2 } ~.
</math>
The binomial proportion confidence interval is then <math>\ \left(\ p_{\min}\ ,\ p_{\max}\ \right)\ ,</math> as follows from the relation between the [[Binomial distribution#Cumulative distribution function|Binomial distribution cumulative distribution function]] and the [[Beta function#Incomplete beta function|regularized incomplete beta function]].


When <math>\ x\ </math> is either {{math|0}} or <math>\ n\ ,</math> closed-form expressions for the interval bounds are available: when <math>\ x = 0\ </math> the interval is
:<math display="inline">\ \left(\ 0\ ,\ 1 - \left(\ \tfrac{\!\ \alpha\!\ }{ 2 }\ \right)^{ 1/n }\ \right)\ </math>
and when <math>\ x = n\ </math> it is
:<math display="inline">\ \left(\ \left(\ \tfrac{\!\ \alpha\!\ }{ 2 }\ \right)^{ 1/n }\ ,\ 1\ \right) ~.</math><ref name=Thulin2014/>


The beta distribution is, in turn, related to the [[F-distribution]] so a third formulation of the Clopper–Pearson interval can be written using [[F-distribution|F]] quantiles:


:<math>
\left(\ 1 + \frac{\ n - x + 1\ }{\ x\ F\!\left[\ \tfrac{\!\ \alpha\!\ }{ 2 }\ ;\ 2\ x\ ,\ 2\ (\ n - x + 1\ )\ \right]\ }\ \right)^{-1}
~~ < ~~ p ~~ < ~~
\left(\ 1 + \frac{\ n - x\ }{ (x + 1)\ \ F\!\left[\ 1 - \tfrac{\!\ \alpha\!\ }{ 2 }\ ;\ 2\ (x + 1)\ ,\ 2\ (n - x)\ \right]\ }\ \right)^{-1}
</math>


where <math>\ x\ </math> is the number of successes, <math>\ n\ </math> is the number of trials, and <math>\ F\!\left(\ c\ ;\ d_1\ , d_2\ \right)\ </math> is the <math>\ c\ </math> quantile from an [[F-distribution]] with <math>\ d_1\ </math> and <math>\ d_2\ </math> degrees of freedom.<ref name=AgrestiCoull1998/>


The Clopper–Pearson interval is an 'exact' interval, since it is based directly on the binomial distribution rather than any approximation to the binomial distribution. This interval never has less than the nominal coverage for any population proportion, but that means that it is usually conservative. For example, the true coverage rate of a 95% Clopper–Pearson interval may be well above 95%, depending on <math>\ n\ </math> and <math>\ p ~.</math><ref name=Brown2001/> Thus the interval may be wider than it needs to be to achieve 95% confidence, and wider than other intervals. In contrast, it is worth noting that other confidence intervals may have coverage levels that are lower than the nominal <math>\ 1 - \alpha\ ,</math> i.e., the normal approximation (or "standard") interval, Wilson interval,<ref name=Wilson1927/> Agresti–Coull interval,<ref name=AgrestiCoull1998>
{{cite journal
| last1 = Agresti | first1 = Alan | author-link1 = Alan Agresti
| last2 = Coull | first2 = Brent A.
| year = 1998
| title = Approximate is better than 'exact' for interval estimation of binomial proportions
| journal = [[The American Statistician]]
| volume = 52 | issue = 2 | pages = 119–126
| mr = 1628435 | jstor = 2685469
| doi = 10.2307/2685469
}}
</ref>
etc., with a nominal coverage of 95% may in fact cover less than 95%,<ref name=Brown2001/> even for large sample sizes.<ref name=Thulin2014/>


The definition of the Clopper–Pearson interval can also be modified to obtain exact confidence intervals for different distributions. For instance, it can also be applied to the case where the samples are drawn without replacement from a population of a known size, instead of repeated draws of a binomial distribution. In this case, the underlying distribution would be the [[hypergeometric distribution]].

The boundaries of the Clopper–Pearson interval itself can be computed with the numerical functions {{mono|qbeta}}<ref>
{{cite web
|title = The Beta distribution
|series = R Manual
|type = software doc
|website = stat.ethz.ch
|url=https://stat.ethz.ch/R-manual/R-devel/library/stats/html/Beta.html
|access-date=2023-12-02
}}
</ref>
in R and {{mono|scipy.stats.beta.ppf}}<ref>
{{cite report
|section= scipy.stats.beta
|title= SciPy Manual |edition = 1.11.4
|website = docs.scipy.org
|type = software doc
|section-url = https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.beta.html
|access-date = 2023-12-02
}}
</ref>
in Python.

<syntaxhighlight lang="python">
from scipy.stats import beta
k = 20
n = 400
alpha = 0.05
p_u, p_o = beta.ppf([alpha/2, 1 - alpha/2], [k, k + 1], [n - k + 1, n - k])
</syntaxhighlight>


==Agresti–Coull interval==
The Agresti–Coull interval is another approximate binomial confidence interval.<ref name=AgrestiCoull1998/>

Given <math>\ n_\mathsf{s}\ </math> successes in <math>\ n\ </math> trials, define
:<math>\ \tilde n \equiv n + z^2_\alpha\ </math>

and
:<math>\ \tilde p = \frac{ 1 }{\!\ \tilde n\!\ }\left(\!\ n_\mathsf{s} + \tfrac{\ z^2_\alpha }{ 2 }\!\ \right)\ </math>

Then, a confidence interval for <math>\ p\ </math> is given by

:<math>
\ p ~~ \approx ~~ \tilde p ~ \pm ~
z_\alpha\ \sqrt{ \frac{\!\ \tilde p\!\ }{ \tilde n }\ \left(\!\ 1 - \tilde p \!\ \right) ~}
</math>

where <math>\ z_\alpha = \operatorname{\Phi^{-1}}\!\!\left(\ 1 - \tfrac{\!\ \alpha\!\ }{ 2 }\ \right)\ </math> is the quantile of a standard normal distribution, as before (for example, a 95% confidence interval requires <math>\ \alpha = 0.05\ ,</math> thereby producing <math>\ z_{.05} = 1.96\ </math>). According to [[Lawrence D. Brown|Brown]], [[T. Tony Cai|Cai]], & DasGupta (2001),<ref name=Brown2001/> taking <math>\ z = 2\ </math> instead of 1.96 produces the "add 2 successes and 2 failures" interval previously described by [[Alan Agresti|Agresti]] & [[Brent Coull|Coull]].<ref name=AgrestiCoull1998/>

This interval can be summarised as employing the centre-point adjustment, <math>\ \tilde p\ ,</math> of the Wilson score interval, and then applying the normal approximation to this point:<ref name=Wallis2013/><ref name=Brown2001/>

:<math>\ \tilde p = \frac{\quad \hat p + \frac{\ z^2_\alpha\!\ }{\ 2\ n\ } \quad}{\quad 1 + \frac{\ z^2_\alpha }{ n } \quad}\ </math>
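A minimal sketch of this computation in Python, reusing the hypothetical counts <math>\ n_\mathsf{s} = 20\ </math> and <math>\ n = 400\ </math> from the Clopper–Pearson example above:

<syntaxhighlight lang="python">
from scipy.stats import norm

ns, n, alpha = 20, 400, 0.05     # hypothetical counts

z = norm.ppf(1 - alpha / 2)      # ~1.96 for a 95% interval
n_adj = n + z**2                 # adjusted number of trials, n-tilde
p_adj = (ns + z**2 / 2) / n_adj  # adjusted proportion, p-tilde

half_width = z * (p_adj * (1 - p_adj) / n_adj) ** 0.5
lower, upper = p_adj - half_width, p_adj + half_width
</syntaxhighlight>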


==Arcsine transformation==
{{main|Arcsine transformation}}
{{further|Cohen's h}}


The arcsine transformation has the effect of pulling out the ends of the distribution.<ref>
{{cite web
|last=Holland|first=Steven
|title=Transformations of proportions and percentages
|url=http://strata.uga.edu/8370/rtips/proportions.html
|access-date=2020-09-08
|website=strata.uga.edu
}}
</ref>
While it can stabilize the variance (and thus confidence intervals) of proportion data, its use has been criticized in several contexts.<ref>
{{cite journal
|last1 = Warton |first1 = David I.
|last2 = Hui |first2 = Francis K.C.
|date = January 2011
|title = The arcsine is asinine: The analysis of proportions in ecology
|journal = [[Ecology (journal)|Ecology]]
|volume = 92 |issue = 1 |pages = 3–10
|doi = 10.1890/10-0340.1 |pmid = 21560670
|bibcode = 2011Ecol...92....3W |issn = 0012-9658
|hdl = 1885/152287 |hdl-access = free
|url = http://doi.wiley.com/10.1890/10-0340.1
|lang = en
}}
</ref>


Let <math>\ X\ </math> be the number of successes in <math>\ n\ </math> trials and let <math>\ p = \tfrac{ 1 }{\!\ n\!\ }\!\ X ~.</math> The variance of <math>\ p\ </math> is

:<math> \operatorname{var}\{~~ p ~~\} = \tfrac{ 1 }{\!\ n\!\ }\ p\ (1 - p) ~.</math>

Using the [[arcsine|arc sine]] transform, the variance of the arcsine of <math>\ \sqrt{\ p ~}\ </math> is<ref name=Shao1998>
{{cite book
|last = Shao |first = J.
|year = 1998
|title = Mathematical Statistics
|publisher = Springer
|place = New York, NY
}}
</ref>


: <math>\ \operatorname{var} \left\{~~ \arcsin \sqrt{ p ~} ~~\right\} ~ \approx ~ \frac{\ \operatorname{var}\{~~ p ~~\}\ }{\ 4\ p\ (1 - p)\ } = \frac{\ p\ (1 - p)\ }{\ 4\ n\ p\ (1 - p)\ } = \frac{\ 1\ }{\ 4\ n\ } ~.</math>

So, the confidence interval itself has the form

: <math>\ \sin^2 \left(\ -\ \frac{ z_\alpha }{\ 2\ \sqrt{n\ }\ } + \arcsin\sqrt{p\ } ~\right) ~ < ~ \theta ~ < ~ \sin^2 \left(\ +\ \frac{\ z_\alpha\ }{\ 2\ \sqrt{n\ }\ }\ + \arcsin\sqrt{p\ } ~\right)\ ,</math>

where <math>\ z_\alpha\ </math> is the <math>\ 1 - \tfrac{\!\ \alpha\!\ }{2}\ </math> quantile of a standard normal distribution.

This method may be used to estimate the variance of <math>\ p\ </math> but its use is problematic when <math>\ p\ </math> is close to {{math|0}} or {{math|1}}.
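A minimal sketch of this interval in Python, again with hypothetical counts <math>\ x = 20\ </math> and <math>\ n = 400\ :</math>

<syntaxhighlight lang="python">
import math
from scipy.stats import norm

x, n, alpha = 20, 400, 0.05       # hypothetical counts
z = norm.ppf(1 - alpha / 2)

p = x / n
centre = math.asin(math.sqrt(p))  # proportion on the arcsine scale
half = z / (2 * math.sqrt(n))     # half-width on the transformed scale

lower = math.sin(centre - half) ** 2
upper = math.sin(centre + half) ** 2
# For p near 0 or 1 the argument can leave [0, pi/2], which is the
# boundary problem noted above.
</syntaxhighlight>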


==''t''<sub>''a''</sub> transform==
{{unreferenced section|date=July 2017}}
Let <math>\ p\ </math> be the proportion of successes. For <math>\ 0 \le a \le 2\ ,</math>

: <math>\ t_a\ =\ \log\left(\ \frac{\ p^a\ }{\ (1 - p)^{2-a}\ }\ \right)\ =\ a\ \log p - (2-a)\ \log(\ 1 - p\ ) ~.</math>

This family is a generalisation of the logit transform, which is the special case with ''a'' = 1, and can be used to transform a proportional data distribution to an approximately [[normal distribution]]. The parameter ''a'' has to be estimated for the data set.
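A minimal sketch of the transform as a Python function; the choice <math>\ a = 1\ </math> reproduces the logit:

<syntaxhighlight lang="python">
import math

def t_a(p: float, a: float) -> float:
    # t_a = a log(p) - (2 - a) log(1 - p); requires 0 < p < 1
    return a * math.log(p) - (2 - a) * math.log(1 - p)

print(t_a(0.5, 1.0))   # logit(0.5) = 0
print(t_a(0.05, 1.0))  # logit(0.05) ~ -2.944
</syntaxhighlight>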


==Rule of three for when no successes are observed==
The [[Rule of three (statistics)|rule of three]] is used to provide a simple way of stating an approximate 95% confidence interval for <math>\ p\ ,</math> in the special case that no successes (<math>\ \hat p = 0\ </math>) have been observed.<ref>
{{cite web
|first = Steve |last = Simon
|year = 2010
|title = Confidence interval with zero events
|series= Ask Professor Mean
|place = Kansas City, MO
|publisher = The Children's Mercy Hospital
|url = http://www.pmean.com/01/zeroevents.html
|archive-url = https://web.archive.org/web/20111015182854/http://www.childrensmercy.org/stats/
|archive-date=15 October 2011
}} [http://www.childrensmercy.org/stats/ Stats topics on Medical Research]
</ref>
The interval is <math>\ \left(\ 0,\ \tfrac{\!\ 3\!\ }{ n } \right) ~.</math>


By symmetry, in the case of only successes (<math>\ \hat p = 1\ </math>), the interval is <math>\ \left(\ 1 - \tfrac{\!\ 3\!\ }{ n },\ 1\ \right) ~.</math>
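The bound follows from solving <math>\ (1 - p)^n = 0.05\ </math> for <math>\ p\ ,</math> together with <math>\ -\ln(0.05) \approx 3 ~.</math> A quick numerical check with a hypothetical <math>\ n = 400\ :</math>

<syntaxhighlight lang="python">
n = 400  # hypothetical number of trials with zero successes observed

upper_exact = 1 - 0.05 ** (1 / n)  # exact one-sided 95% bound from (1 - p)^n = 0.05
upper_rule3 = 3 / n                # rule-of-three approximation

print(upper_exact, upper_rule3)    # ~0.00746 vs 0.0075
</syntaxhighlight>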


==Comparison and discussion==

There are several research papers that compare these and other confidence intervals for the binomial proportion.<ref name=Wallis2013/><ref name=Newcombe-1998/><ref>
{{cite conference
|last1 = Sauro |first1 = J.
|last2 = Lewis |first2 = J.R.
|year = 2005
|title = Comparison of Wald, Adj-Wald, exact, and Wilson intervals calculator
|conference = Human Factors and Ergonomics Society, 49th Annual Meeting (HFES 2005)
|place = Orlando, FL
|pages = 2100–2104
|url = http://www.measuringusability.com/papers/sauro-lewisHFES.pdf
|archive-url = https://web.archive.org/web/20120618053914/http://www.measuringusability.com/papers/sauro-lewisHFES.pdf
|archive-date = 2012-06-18 |df = dmy-all
}}
</ref><ref>
{{cite journal
| last = Reiczigel | first = J.
| year = 2003
| title = Confidence intervals for the binomial parameter: Some new considerations
| url = http://www.zoologia.hu/qp/Reiczigel_conf_int.pdf
| journal = [[Statistics in Medicine]]
| volume = 22 | issue = 4 | pages = 611–621
| doi = 10.1002/sim.1320
| pmid = 12590417 | s2cid = 7715293
}}
</ref>

Both Ross (2003)<ref>
{{Cite journal
| last = Ross | first = T.D.
| year = 2003
| title = Accurate confidence intervals for binomial proportion and Poisson rate estimation
| journal = [[Computers in Biology and Medicine]]
| volume = 33 | issue = 6 | pages = 509–531
| doi = 10.1016/S0010-4825(03)00019-2 | pmid = 12878234
| url = https://zenodo.org/record/1259565
}}
</ref>
and Agresti & Coull (1998)<ref name=AgrestiCoull1998/> point out that exact methods such as the Clopper–Pearson interval may not work as well as some approximations. The normal approximation interval and its presentation in textbooks have been heavily criticised, with many statisticians advocating that it not be used.<ref name=Brown2001/> The principal problems are ''overshoot'' (bounds exceed <math>\ \left[\ 0,\ 1\ \right]\ </math>), ''zero-width intervals'' at <math>\ \hat p\ = 0\ </math> or {{math|1}} (falsely implying certainty),<ref name=Newcombe-1998/> and overall inconsistency with significance testing.<ref name=Wallis2013/>

Of the approximations listed above, Wilson score interval methods (with or without continuity correction) have been shown to be the most accurate and the most robust,<ref name=Wallis2013/><ref name=Brown2001/><ref name=Newcombe-1998/> though some prefer the Agresti–Coull approach for larger sample sizes.<ref name=Brown2001/> Wilson and Clopper–Pearson methods obtain consistent results with source significance tests,<ref name=Wallis2021/> and this property is decisive for many researchers.

Many of these intervals can be calculated in [[R (programming language)|R]] using packages like {{mono|binom}}.<ref>
{{cite report
|last = Dorai-Raj |first = Sundar
|date = 2022-05-02
|title = binom: Binomial confidence intervals for several parameterizations
|type = software doc.
|url = https://cran.r-project.org/web/packages/binom/index.html
|access-date = 2023-12-02 |df=dmy-all
}}
</ref>

==See also==
* [[Binomial distribution#Confidence intervals]]
* [[Estimation theory]]
* [[Pseudocount]]
* [[CDF-based nonparametric confidence interval#Pointwise band]]
* [[Z-test#Comparing the Proportions of Two Binomials]]

==References==
{{reflist|25em}}


{{DEFAULTSORT:Binomial Proportion Confidence Interval}}
