
Bisection method: Difference between revisions

From Wikipedia, the free encyclopedia
Revision as of 19:23, 14 December 2020

A few steps of the bisection method applied over the starting range [a₁; b₁]. The bigger red dot is the root of the function.

In mathematics, the bisection method is a root-finding method that applies to any continuous function for which one knows two values with opposite signs. The method consists of repeatedly bisecting the interval defined by these values and then selecting the subinterval in which the function changes sign, and therefore must contain a root. It is a very simple and robust method, but it is also relatively slow. Because of this, it is often used to obtain a rough approximation to a solution which is then used as a starting point for more rapidly converging methods.[1] The method is also called the interval halving method,[2] the binary search method,[3] or the dichotomy method.[4]

For polynomials, more elaborate methods exist for testing the existence of a root in an interval (Descartes' rule of signs, Sturm's theorem, Budan's theorem). They allow extending the bisection method into efficient algorithms for finding all real roots of a polynomial; see Real-root isolation.

The method

The method is applicable for numerically solving the equation f(x) = 0 for the real variable x, where f is a continuous function defined on an interval [a, b] and where f(a) and f(b) have opposite signs. In this case a and b are said to bracket a root since, by the intermediate value theorem, the continuous function f must have at least one root in the interval (a, b).

At each step the method divides the interval in two by computing the midpoint c = (a+b) / 2 of the interval and the value of the function f(c) at that point. Unless c is itself a root (which is very unlikely, but possible) there are now only two possibilities: either f(a) and f(c) have opposite signs and bracket a root, or f(c) and f(b) have opposite signs and bracket a root.[5] The method selects the subinterval that is guaranteed to be a bracket as the new interval to be used in the next step. In this way an interval that contains a zero of f is reduced in width by 50% at each step. The process is continued until the interval is sufficiently small.

Explicitly, if f(a) and f(c) have opposite signs, then the method sets c as the new value for b, and if f(b) and f(c) have opposite signs then the method sets c as the new a. (If f(c)=0 then c may be taken as the solution and the process stops.) In both cases, the new f(a) and f(b) have opposite signs, so the method is applicable to this smaller interval.[6]

Iteration tasks

The input for the method is a continuous function f, an interval [a, b], and the function values f(a) and f(b). The function values are of opposite sign (there is at least one zero crossing within the interval). Each iteration performs these steps:

  1. Calculate c, the midpoint of the interval, c = (a + b)/2.
  2. Calculate the function value at the midpoint, f(c).
  3. If convergence is satisfactory (that is, c - a is sufficiently small, or |f(c)| is sufficiently small), return c and stop iterating.
  4. Examine the sign of f(c) and replace either (a, f(a)) or (b, f(b)) with (c, f(c)) so that there is a zero crossing within the new interval.

When implementing the method on a computer, there can be problems with finite precision, so there are often additional convergence tests or limits to the number of iterations. Although f is continuous, finite precision may preclude a function value ever being zero. For example, consider f(x) = x − π; there will never be a finite representation of x that gives zero. Additionally, the difference between a and b is limited by the floating point precision; i.e., as the difference between a and b decreases, at some point the midpoint of [a, b] will be numerically identical to (within floating point precision of) either a or b.
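As an illustration, the following short Python sketch shows that once a and b are adjacent floating-point numbers the computed midpoint falls back onto one of the endpoints, which is why an iteration limit or an explicit tolerance test is needed:

import math

a = 1.0
b = math.nextafter(a, math.inf)  # the next representable double above a (Python 3.9+)
c = (a + b) / 2                  # the midpoint rounds back onto an endpoint
print(c == a or c == b)          # True: the bracket cannot shrink any further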

Algorithm

The method may be written in pseudocode as follows:[7]

INPUT: Function f, 
       endpoint values a, b, 
       tolerance TOL, 
       maximum iterations NMAX
CONDITIONS: a < b, 
            either f(a) < 0 and f(b) > 0 or f(a) > 0 and f(b) < 0
OUTPUT: value which differs from a root of f(x) = 0 by less than TOL
 
N ← 1
while N ≤ NMAX do // limit iterations to prevent infinite loop
    c ← (a + b)/2 // new midpoint
    if f(c) = 0 or (b − a)/2 < TOL then // solution found
        Output(c)
        Stop
    end if
    N ← N + 1 // increment step counter
    if sign(f(c)) = sign(f(a)) then a ← c else b ← c // new interval
end while
Output("Method failed.") // max number of steps exceeded

Example: Finding the root of a polynomial

Suppose that the bisection method is used to find a root of the polynomial

f(x) = x³ − x − 2 .

First, two numbers a and b have to be found such that f(a) and f(b) have opposite signs. For the above function, a = 1 and b = 2 satisfy this criterion, as

f(1) = (1)³ − (1) − 2 = −2

and

f(2) = (2)³ − (2) − 2 = +4.

Because the function is continuous, there must be a root within the interval [1, 2].

In the first iteration, the end points of the interval which brackets the root are a₁ = 1 and b₁ = 2, so the midpoint is

c₁ = (2 + 1)/2 = 1.5 .

The function value at the midpoint is f(c₁) = (1.5)³ − (1.5) − 2 = −0.125. Because f(c₁) is negative, a = 1 is replaced with a = 1.5 for the next iteration to ensure that f(a) and f(b) have opposite signs. As this continues, the interval between a and b will become increasingly smaller, converging on the root of the function. See this happen in the table below.

Iteration  aₙ  bₙ  cₙ  f(cₙ)
1 1 2 1.5 −0.125
2 1.5 2 1.75 1.6093750
3 1.5 1.75 1.625 0.6660156
4 1.5 1.625 1.5625 0.2521973
5 1.5 1.5625 1.5312500 0.0591125
6 1.5 1.5312500 1.5156250 −0.0340538
7 1.5156250 1.5312500 1.5234375 0.0122504
8 1.5156250 1.5234375 1.5195313 −0.0109712
9 1.5195313 1.5234375 1.5214844 0.0006222
10 1.5195313 1.5214844 1.5205078 −0.0051789
11 1.5205078 1.5214844 1.5209961 −0.0022794
12 1.5209961 1.5214844 1.5212402 −0.0008289
13 1.5212402 1.5214844 1.5213623 −0.0001034
14 1.5213623 1.5214844 1.5214233 0.0002594
15 1.5213623 1.5214233 1.5213928 0.0000780

After 13 iterations, it becomes apparent that there is a convergence to about 1.521: a root for the polynomial.
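The table can be reproduced with a few lines of Python (a sketch using the same function and the same starting bracket):

f = lambda x: x**3 - x - 2

a, b = 1.0, 2.0
for n in range(1, 16):
    c = (a + b) / 2
    print(f"{n:2d}  {a:.7f}  {b:.7f}  {c:.7f}  {f(c):+.7f}")
    if f(c) * f(a) > 0:
        a = c  # f(a) and f(c) have the same sign, so the root lies in [c, b]
    else:
        b = c  # otherwise the root lies in [a, c]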

Analysis

The method is guaranteed to converge to a root of f if f is a continuous function on the interval [a, b] and f(a) and f(b) have opposite signs. The absolute error is halved at each step so the method converges linearly. Specifically, if c₁ = (a + b)/2 is the midpoint of the initial interval, and cₙ is the midpoint of the interval in the nth step, then the difference between cₙ and a solution c is bounded by[8]

|cₙ − c| ≤ |b − a| / 2ⁿ.

This formula can be used to determine, in advance, an upper bound on the number of iterations that the bisection method needs to converge to a root to within a certain tolerance. The number n of iterations needed to achieve a required tolerance ε (that is, an error guaranteed to be at most ε) is bounded by

n ≤ n1/2 ≡ ⌈log₂(ε₀/ε)⌉,

where ε₀ = |b − a| is the initial bracket size and ε ≤ ε₀ is the required bracket size. The main motivation to use the bisection method is that, over the set of continuous functions, no other method can guarantee to produce an estimate cₙ of the solution c whose worst-case absolute error is at most ε in fewer than n1/2 iterations.[9] This is also true under several common assumptions on the function f and on its behaviour in the neighbourhood of the root.[9][10]
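For instance, with an initial bracket of size ε₀ = |2 − 1| = 1, as in the example above, and a required tolerance of ε = 10⁻⁴, the bound evaluates to ⌈log₂(10⁴)⌉ = 14 iterations. A short Python check of this calculation (a sketch):

import math

eps0 = 1.0  # initial bracket size |b - a|
eps = 1e-4  # required tolerance
print(math.ceil(math.log2(eps0 / eps)))  # 14 iterations suffice in the worst case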

However, despite the bisection method being optimal with respect to worst-case performance under absolute error criteria, it is sub-optimal with respect to average performance under standard assumptions[11][12] as well as asymptotic performance.[13] Popular alternatives to the bisection method, such as the secant method, Ridders' method or Brent's method (amongst others), typically perform better since they trade off worst-case performance to achieve higher orders of convergence to the root. A strict improvement on the bisection method can be achieved with a higher order of convergence without trading off worst-case performance with the ITP method.[13][14]


See also

  • Lehmer–Schur algorithm, a generalization of the bisection method in the complex plane
  • Nested intervals
  • Brent's method
  • Ridders' method
  • Secant method

References

  1. ^ Burden & Faires 1985, p. 31
  2. ^ "Archived copy". Archived from the original on 2013-05-19. Retrieved 2013-11-07.{{cite web}}: CS1 maint: archived copy as title (link)
  3. ^ Burden & Faires 1985, p. 28
  4. ^ "Dichotomy method - Encyclopedia of Mathematics". www.encyclopediaofmath.org. Retrieved 2015-12-21.
  5. ^ If the function has the same sign at the endpoints of an interval, the endpoints may or may not bracket roots of the function.
  6. ^ Burden & Faires 1985, p. 28 for section
  7. ^ Burden & Faires 1985, p. 29. This version recomputes the function values at each iteration rather than carrying them to the next iterations.
  8. ^ Burden & Faires 1985, p. 31, Theorem 2.1
  9. ^ a b Sikorski, K. (1982-02-01). "Bisection is optimal". Numerische Mathematik. 40 (1): 111–117. doi:10.1007/BF01459080. ISSN 0945-3245.
  10. ^ Sikorski, K (1985-12-01). "Optimal solution of nonlinear equations". Journal of Complexity. 1 (2): 197–209. doi:10.1016/0885-064X(85)90011-1. ISSN 0885-064X.
  11. ^ Graf, Siegfried; Novak, Erich; Papageorgiou, Anargyros (1989-07-01). "Bisection is not optimal on the average". Numerische Mathematik. 55 (4): 481–491. doi:10.1007/BF01396051. ISSN 0945-3245.
  12. ^ Novak, Erich (1989-12-01). "Average-case results for zero finding". Journal of Complexity. 5 (4): 489–501. doi:10.1016/0885-064X(89)90022-8. ISSN 0885-064X.
  13. ^ a b Oliveira, I. F. D.; Takahashi, R. H. C. (2020-12-06). "An Enhancement of the Bisection Method Average Performance Preserving Minmax Optimality". ACM Transactions on Mathematical Software. 47 (1): 5:1–5:24. doi:10.1145/3423597. ISSN 0098-3500.
  14. ^ Oliveira, Ivo (2020-12-14). "An Improved Bisection Method".

Further reading