Iteratively reweighted least squares
The method of '''iteratively reweighted least squares''' ('''IRLS''') is used to solve certain optimization problems. It solves optimization problems with [[objective function]]s of the form:

:<math>\underset{\boldsymbol\beta}{\operatorname{arg\,min}} \sum_{i=1}^n w_i(\boldsymbol\beta) \big[ y_i - \mathbf{x}_i^\top \boldsymbol\beta \big]^2, </math>

by an iterative method in which each step involves solving a standard weighted linear least squares problem of the form:

:<math>\boldsymbol\beta^{(t+1)} = \underset{\boldsymbol\beta}{\operatorname{arg\,min}} \sum_{i=1}^n w_i\big(\boldsymbol\beta^{(t)}\big) \big[ y_i - \mathbf{x}_i^\top \boldsymbol\beta \big]^2. </math>

IRLS is used to find the [[maximum likelihood]] estimates of a [[generalized linear model]], and in [[robust regression]] to find an [[M-estimator]] as a way of mitigating the influence of outliers in an otherwise normally distributed data set, for example by minimizing the least absolute errors rather than the least squared errors.
Although not a linear regression problem, [[Weiszfeld's algorithm]] for approximating the [[geometric median]] can also be viewed as a special case of iteratively reweighted least squares, in which the objective function is the sum of distances of the estimator from the samples. |
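A minimal sketch of that special case (assuming [[NumPy]]; the function name, the centroid starting point, and the tolerances below are illustrative choices, not part of the algorithm's definition):

<syntaxhighlight lang="python">
import numpy as np

def geometric_median(points, n_iter=100, eps=1e-10):
    """Weiszfeld's algorithm: IRLS with weights 1 / distance to the current estimate."""
    y = points.mean(axis=0)  # start at the centroid (a common but arbitrary choice)
    for _ in range(n_iter):
        d = np.maximum(np.linalg.norm(points - y, axis=1), eps)  # guard against zero distances
        w = 1.0 / d                                              # the reweighting step
        y_new = (w[:, None] * points).sum(axis=0) / w.sum()      # weighted mean of the samples
        if np.linalg.norm(y_new - y) < eps:                      # estimate has stopped moving
            return y_new
        y = y_new
    return y
</syntaxhighlight>

Each update is the weighted mean of the samples, i.e. the solution of a weighted least squares problem with weights 1/''d''<sub>''i''</sub>.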
== Examples ==

=== ''L<sup>p</sup>'' norm linear regression ===

To find the parameters '''''β''''' = (''β''<sub>1</sub>, …, ''β''<sub>''k''</sub>)<sup>T</sup> which minimise the [[Lp space|''L<sup>p</sup>'' norm]] for the [[linear regression]] problem

:<math> \underset{\boldsymbol\beta}{\operatorname{arg\,min}} \big\| \mathbf y - X \boldsymbol\beta \big\|_p = \underset{\boldsymbol\beta}{\operatorname{arg\,min}} \sum_{i=1}^n \big| y_i - \mathbf{x}_i^\top \boldsymbol\beta \big|^p, </math>

the IRLS algorithm at step ''t'' + 1 involves solving the weighted linear least squares problem

:<math>\boldsymbol\beta^{(t+1)} = \underset{\boldsymbol\beta}{\operatorname{arg\,min}} \sum_{i=1}^n \big| y_i - \mathbf{x}_i^\top \boldsymbol\beta^{(t)} \big|^{p-2} \big| y_i - \mathbf{x}_i^\top \boldsymbol\beta \big|^2, </math>

that is, a weighted least squares problem with weights <math>w_i(\boldsymbol\beta^{(t)}) = \big| y_i - \mathbf{x}_i^\top \boldsymbol\beta^{(t)} \big|^{p-2}</math>. In the case ''p'' = 1, this corresponds to [[least absolute deviation]] regression.
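A short numerical sketch of this iteration (assuming [[NumPy]]; the function name <code>irls_lp</code>, the fixed iteration count, and the small floor <code>delta</code> that keeps the weights finite when ''p'' < 2 are illustrative, not canonical):

<syntaxhighlight lang="python">
import numpy as np

def irls_lp(X, y, p=1.0, n_iter=50, delta=1e-6):
    """Approximately minimise ||y - X beta||_p by iteratively reweighted least squares."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]        # initialise with ordinary least squares
    for _ in range(n_iter):
        r = y - X @ beta                               # residuals at the current estimate
        w = np.maximum(np.abs(r), delta) ** (p - 2)    # weights |r_i|^(p-2) from the update above
        XtW = X.T * w                                  # X^T W, with W = diag(w)
        beta = np.linalg.solve(XtW @ X, XtW @ y)       # weighted normal equations
    return beta
</syntaxhighlight>

Each pass solves the weighted normal equations <math>X^\top W X \boldsymbol\beta = X^\top W \mathbf y</math>; for ''p'' = 1 the weights are <math>1/|r_i|</math>, targeting least absolute deviations.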
=== General case ===

More generally, starting with a [[diagonal matrix|diagonal]] weighting matrix equal to the [[identity matrix]], <math>W = I</math>, and a linear problem <math>A x = b</math>, the (weighted) linear equation

:<math> W A x = W b </math>

is formed. The least squares solution of this equation is then found using standard linear algebra methods. The [[errors and residuals in statistics|residuals]]

:<math> r = A x - b </math>

are calculated, and the weighting matrix is updated to some non-negative function <math>f(r)</math> of the residuals, e.g., <math>f(r) = 1/|r|</math>:

:<math> W = \operatorname{diag}( f(r) ). </math>

With these new weights, the weighted least squares equation is re-solved and the residuals are re-calculated. The process can be iterated many times.

The solution to which this iterative process converges is the minimizer of an objective function related to the function <math>f(r)</math>. With <math>f(r) = 1/|r|</math>, the objective is the [[least absolute deviation]] <math>\textstyle\sum_i |r_i|</math>.
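A compact sketch of this general loop (assuming [[NumPy]]; the helper name <code>irls</code>, the fixed iteration count, and the example data are illustrative):

<syntaxhighlight lang="python">
import numpy as np

def irls(A, b, f, n_iter=50):
    """Repeatedly solve W A x = W b with W = diag(f(r))."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]           # first pass uses W = I
    for _ in range(n_iter):
        r = A @ x - b                                  # residuals of the current solution
        sw = np.sqrt(f(r))                             # square-root weights turn WLS into plain LS
        x = np.linalg.lstsq(sw[:, None] * A, sw * b, rcond=None)[0]
    return x

# With f(r) = 1/|r| (floored to stay finite), the iteration targets least absolute deviations.
A = np.column_stack([np.ones(5), np.arange(5.0)])      # illustrative design matrix
b = np.array([0.1, 1.1, 1.9, 3.2, 40.0])               # last observation is an outlier
x_lad = irls(A, b, lambda r: 1.0 / np.maximum(np.abs(r), 1e-8))
</syntaxhighlight>

Because the outlying observation receives a progressively smaller weight, the fit is pulled toward the remaining points rather than the outlier.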
== Convergence ==
Convergence of the method is not guaranteed.
== References ==
* ''Stanford Lecture Notes on the IRLS algorithm'' by Antoine Guitton
* ''Numerical Methods for Least Squares Problems'' by Åke Björck (Chapter 4: Generalized Least Squares Problems)
* ''Robust Estimation'' in ''Numerical Recipes in C'' by Press et al. (requires the FileOpen plugin to view)