{{about|iterative refinement in mathematics|iterative refinement in software development|Iterative and incremental development}}
{{use dmy dates}}

'''Iterative refinement''' is an [[iterative method]] proposed by James H. Wilkinson to improve the accuracy of numerical solutions to [[systems of linear equations]].<ref name=Wilkinson1963>{{cite book |first=James H. |last=Wilkinson |year=1963 |title=Rounding Errors in Algebraic Processes |publisher=[[Prentice Hall]] |place=Englewood Cliffs, NJ |url=https://archive.org/details/roundingerrorsin0000wilk |url-access=registration}}</ref><ref name=Moler1967>{{cite journal |first=Cleve B. |last=Moler |date=April 1967 |title=Iterative refinement in floating point |journal=[[Journal of the ACM]] |volume=14 |issue=2 |pages=316–321 |url=http://portal.acm.org/citation.cfm?id=321394 |doi=10.1145/321386.321394 |publisher=[[Association for Computing Machinery]] |place=New York, NY}}</ref>

When solving a linear system <math>\,\mathrm{A} \, \mathbf{x} = \mathbf{b} \,,</math> the computed solution <math>\hat{\mathbf{x}}</math> may, due to the compounded accumulation of [[Round-off error|rounding error]]s, deviate from the exact solution <math>\mathbf{x}_\star\,.</math> Starting with <math>\mathbf{x}_1 = \hat{\mathbf{x}}\,,</math> iterative refinement computes a sequence <math>\{\,\mathbf{x}_1,\,\mathbf{x}_2,\,\mathbf{x}_3,\,\dots\}</math> which converges to <math>\mathbf{x}_\star</math> when certain assumptions are met.

==Description==

For <math>m = 1, 2, 3, \dots\,,</math> the {{mvar|m}}th iteration of iterative refinement consists of three steps:

:{|
|-
| style="text-align:center;"| (i)
| Compute the residual error {{math|'''r'''{{sub|''m''}}}}
| <math>\mathbf{r}_m = \mathbf{b} - \mathrm{A} \mathbf{x}_m\,.</math>
|-
| style="text-align:center;"| (ii)
| Solve the system for the correction, {{math|'''c'''{{sub|''m''}}}}, that removes the residual error
| <math>\mathrm{A}\,\mathbf{c}_m = \mathbf{r}_m\,.</math>
|-
| style="text-align:center;"| (iii)
| Add the correction to get the revised next solution {{math|'''x'''{{sub|''m''+1}}}}
| <math>\mathbf{x}_{m+1} = \mathbf{x}_m + \mathbf{c}_m\,.</math>
|}

The crucial reasoning for the refinement algorithm is that although the solution for {{math|'''c'''{{sub|''m''}}}} in step (ii) may indeed be troubled by similar errors as the first solution, <math>\hat{\mathbf{x}}</math>, the calculation of the residual {{math|'''r'''{{sub|''m''}}}} in step (i), in comparison, is numerically nearly exact: you may not know the right answer very well, but you know quite accurately just how far the solution you have in hand is from producing the correct outcome ({{math|'''b'''}}). If the residual is small in some sense, then the correction must also be small, and should at the very least steer the current estimate of the answer, {{math|'''x'''{{sub|''m''}}}}, closer to the desired one, <math>\mathbf{x}_\star\,.</math>

The iterations stop on their own when the residual {{math|'''r'''{{sub|''m''}}}} is zero, or close enough to zero that the correction {{math|'''c'''{{sub|''m''}}}} is too small to change the solution {{math|'''x'''{{sub|''m''}}}} which produced it; alternatively, the algorithm stops when {{math|'''c'''{{sub|''m''}}}} is too small to convince the linear algebraist monitoring the refinement that it is worth continuing.
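The three steps above can be sketched in code. The following is an illustrative implementation, not taken from the cited sources; the function name, tolerance, and iteration cap are arbitrary choices for the sketch. It reuses a single LU factorization of {{math|A}} for both the initial solve and every correction solve:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def iterative_refinement(A, b, tol=1e-14, max_iter=10):
    """Sketch of iterative refinement for A x = b, reusing one LU factorization."""
    lu_piv = lu_factor(A)        # factor A once; every correction solve reuses it
    x = lu_solve(lu_piv, b)      # initial computed solution x_1
    for _ in range(max_iter):
        r = b - A @ x            # (i)  residual r_m = b - A x_m
        c = lu_solve(lu_piv, r)  # (ii) correction from A c_m = r_m
        x = x + c                # (iii) x_{m+1} = x_m + c_m
        # stop once the correction is too small to change the solution
        if np.linalg.norm(c, np.inf) <= tol * np.linalg.norm(x, np.inf):
            break
    return x
```

Because each iteration only needs a residual evaluation and one triangular solve pair, the refinement steps are much cheaper than the initial factorization.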
==Error analysis==

As a rule of thumb, iterative refinement for [[Gaussian elimination]] produces a solution correct to working precision if double the working precision is used in the computation of {{math|'''r'''}}, e.g. by using [[Quadruple-precision floating-point format|quad]] or [[extended precision|double extended]] precision [[IEEE 754]] [[floating point]], and if {{math|A}} is not too ill-conditioned (the number of iterations and the rate of convergence are determined by the condition number of {{math|A}}).<ref>{{cite book |first=Nicholas |last=Higham |title=Accuracy and Stability of Numerical Algorithms |edition=2 |publisher=SIAM |year=2002 |pages=232}}</ref>
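As an illustration of this rule of thumb (a sketch, not taken from the cited references; the test matrix and iteration count are arbitrary): solve a system stored in IEEE 754 single precision, but accumulate the residual in double precision before rounding it back to single for the correction solve.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
# A well-conditioned test system, stored in single (working) precision
A = (rng.standard_normal((n, n)) + n * np.eye(n)).astype(np.float32)
b = rng.standard_normal(n).astype(np.float32)

x = np.linalg.solve(A, b)  # initial solve, entirely in single precision
for _ in range(5):
    # residual computed in double the working precision ...
    r = b.astype(np.float64) - A.astype(np.float64) @ x.astype(np.float64)
    # ... then rounded back to working precision for the correction solve
    c = np.linalg.solve(A, r.astype(np.float32))
    x = x + c

# after refinement, the residual sits at the level of single-precision round-off
rel_residual = np.linalg.norm(r, np.inf) / np.linalg.norm(b, np.inf)
```

The refined solution still lives in single precision; only the residual evaluation uses the wider format.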
More formally, assuming that each step (ii) can be solved reasonably accurately, i.e., in mathematical terms, that for every {{mvar|m}} we have
:<math>\mathrm{A}\,\left( \mathrm{I} + \mathrm{F}_m \right)\mathbf{c}_m = \mathbf{r}_m</math>
where {{math|‖F{{sub|''m''}}‖{{sub|∞}} < 1}}, the [[Approximation error|relative error]] in the {{mvar|m}}th iterate of iterative refinement satisfies

:<math>\frac{ \lVert \mathbf{x}_m - \mathbf{x}_\star \rVert_\infty }{ \lVert \mathbf{x}_\star \rVert_\infty } \leq \bigl( \sigma\,\kappa(\mathrm{A})\,\varepsilon_1 \bigr)^m + \mu_1\,\varepsilon_1 + n\,\kappa(\mathrm{A})\,\mu_2\,\varepsilon_2</math>

where
* {{math|‖·‖<sub>∞</sub>}} denotes the [[Uniform norm|{{math|∞}}-norm]] of a vector,
* {{math|''κ''(A)}} is the {{math|∞}}-[[condition number]] of {{math|A}},
* {{mvar|n}} is the order of {{math|A}},
* {{mvar|ε}}{{sub|1}} and {{mvar|ε}}{{sub|2}} are [[Machine epsilon|unit round-offs]] of [[Floating point|floating-point]] arithmetic operations,
* {{mvar|σ}}, {{mvar|μ}}{{sub|1}} and {{mvar|μ}}{{sub|2}} are constants that depend on {{math|A}}, {{mvar|ε}}{{sub|1}} and {{mvar|ε}}{{sub|2}},

if {{math|A}} is "not too badly conditioned", which in this context means

:<math>0 < \sigma\,\kappa(\mathrm{A})\,\varepsilon_1 \ll 1</math>

and implies that {{mvar|μ}}{{sub|1}} and {{mvar|μ}}{{sub|2}} are of order unity.

The distinction of {{mvar|ε}}{{sub|1}} and {{mvar|ε}}{{sub|2}} is intended to allow mixed-precision evaluation of {{math|'''r'''{{sub|''m''}}}}, where intermediate results are computed with unit round-off {{mvar|ε}}{{sub|2}} before the final result is rounded (or truncated) with unit round-off {{mvar|ε}}{{sub|1}}. All other computations are assumed to be carried out with unit round-off {{mvar|ε}}{{sub|1}}.
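To make the symbols concrete (a hypothetical numerical check, not from the cited sources; the matrix and the choice {{math|''σ'' ≈ 1}} are arbitrary): with single working precision and double-precision residual accumulation, {{mvar|ε}}{{sub|1}} and {{mvar|ε}}{{sub|2}} are the respective unit round-offs, and the "not too badly conditioned" requirement can be checked against {{math|''κ''(A)}}:

```python
import numpy as np

# Unit round-offs for a mixed-precision setup: single working precision (eps1),
# double precision inside the residual computation (eps2)
eps1 = np.finfo(np.float32).eps / 2
eps2 = np.finfo(np.float64).eps / 2

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
kappa = np.linalg.cond(A, np.inf)  # infinity-norm condition number kappa(A)

# Heuristic form of the convergence condition, taking sigma on the order of 1:
assert 0 < kappa * eps1 < 1
```

For this small, well-conditioned matrix the product {{math|''κ''(A)''ε''{{sub|1}}}} is far below 1, so refinement is expected to converge; a matrix with {{math|''κ''(A)}} near {{math|1/''ε''{{sub|1}}}} would fail the check.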
==References==
{{reflist|25em}}

{{Numerical linear algebra}}
Revision as of 04:30, 3 July 2020