Talk:Gauss–Newton algorithm

I always thought that Newton's method, when applied to systems of equations, is still called Newton's method (or Newton–Raphson), and that Gauss–Newton is a modified Newton's method for solving least squares problems. To wit, let f(x) = sum_i r_i(x)^2 be the least squares objective (x is a vector, and I hope you don't mind my being too lazy to properly type-set the maths). Then we need to solve g(x) = 2 A(x) r(x) = 0, where A(x) is the Jacobian matrix, with A_ij = derivative of r_j w.r.t. x_i. In my understanding, Newton's method is as described in the article: g'(x_k) (x_{k+1} - x_k) = - g(x_k). The derivative g'(x) can be calculated as g'(x) = 2 A(x) A(x)^\top + 2 sum_i r_i(x) nabla^2 r_i(x). On the other hand, the Gauss–Newton method neglects the second term, so we get the iteration A(x_k) A(x_k)^\top (x_{k+1} - x_k) = - A(x_k) r(x_k).
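
For concreteness, here is a rough sketch in Python of the two iterations I have in mind, applied to a made-up residual function (the function, starting point and iteration count are purely for illustration):

```python
import numpy as np

# Made-up residual function r: R^2 -> R^3; the objective is f(x) = sum_i r_i(x)^2.
def r(x):
    return np.array([x[0] - 1.0, x[1] - 2.0, x[0] * x[1] - 3.0])

def A(x):
    # A_ij = d r_j / d x_i, as in the notation above.
    return np.array([[1.0, 0.0, x[1]],
                     [0.0, 1.0, x[0]]])

def hessians_r(x):
    # Hessian matrices of the individual residuals r_i (needed for the full Newton step).
    return [np.zeros((2, 2)),
            np.zeros((2, 2)),
            np.array([[0.0, 1.0], [1.0, 0.0]])]  # Hessian of x0*x1

x_newton = np.array([1.1, 2.1])
x_gn = np.array([1.1, 2.1])
for _ in range(10):
    # Newton's method applied to the gradient g(x) = 2 A(x) r(x):
    #   (2 A A^T + 2 sum_i r_i Hess(r_i)) (x_{k+1} - x_k) = -2 A r
    An, rn = A(x_newton), r(x_newton)
    H = 2 * An @ An.T + 2 * sum(ri * Hi for ri, Hi in zip(rn, hessians_r(x_newton)))
    x_newton = x_newton + np.linalg.solve(H, -2 * An @ rn)

    # Gauss-Newton: drop the second-derivative term:
    #   A A^T (x_{k+1} - x_k) = -A r
    Ag, rg = A(x_gn), r(x_gn)
    x_gn = x_gn + np.linalg.solve(Ag @ Ag.T, -Ag @ rg)

print("Newton:      ", x_newton)
print("Gauss-Newton:", x_gn)
```

If all goes well, both iterations end up near the same minimiser; the point is only that the Gauss–Newton system omits the sum_i r_i(x) nabla^2 r_i(x) term.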

Could you please tell me whether I am mistaken, preferably giving a reference if I am? Thanks. -- Jitse Niesen 18:33, 18 Nov 2004 (UTC)

It is quite possible that you are right. Please feel free to improve both this article and the corresponding section in Newton's method. I do not have an appropriate reference at hand, therefore I am unable to contribute to a clarification. - By the way, thanks for the personal message on my talk page. -- Frau Holle 22:23, 20 Nov 2004 (UTC)
I moved the description of the general Newton's method in R^n back to Newton's method and wrote here a bit about the modified method for least squares problems. -- Jitse Niesen 00:45, 5 Dec 2004 (UTC)

Transpose of the Jacobians in the wrong place?

Should the formula not be p(t+1) = p(t) - (J'J)^(-1) J' f ?

I thought the RHS was supposed to look like an OLS regression estimate... —The preceding unsigned comment was added by 131.111.8.96 (talk · contribs) 15:48, 10 January 2006 (UTC)
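
(For reference, the ordinary least squares estimate the comment above alludes to is usually written as below; this is just the standard textbook form, shown here to make the intended analogy with the Gauss–Newton step explicit, not a quotation from the article.)

```latex
\hat{\beta} = (X^{\top} X)^{-1} X^{\top} y
\qquad\text{compare}\qquad
p^{(t+1)} = p^{(t)} - (J^{\top} J)^{-1} J^{\top} f .
```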

Please be more precise. Do you want to change the formula in the article to the one you wrote above? If so, can you give some more justification than "I thought the RHS was supposed to look like an OLS regression estimate", like a reference? If not, what do you want? -- Jitse Niesen (talk) 18:12, 10 January 2006 (UTC)

The term that is neglected in the Newton step to obtain the Gauss-Newton step is not the Laplacian. It also contains mixed second-order derivatives, which is not the case for the Laplacian! Georg

You're right. Many thanks for your comment. It should be fixed now. -- Jitse Niesen (talk) 03:10, 28 October 2006 (UTC)
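
(For the record, and in the notation of the first section above: the term in question is the full second-derivative part of the Hessian, sketched below; each nabla^2 r_i is the Hessian matrix of r_i, which in general contains mixed partial derivatives, whereas the Laplacian of r_i would only be its trace.)

```latex
\nabla^2 f(x) \;=\; 2\,A(x) A(x)^{\top} \;+\; 2 \sum_i r_i(x)\, \nabla^2 r_i(x),
\qquad
\bigl[\nabla^2 r_i(x)\bigr]_{jk} = \frac{\partial^2 r_i}{\partial x_j\, \partial x_k}(x).
```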

no inversion?

What is meant by "The matrix inverse is never computed explicitly in practice"? Right after that, it is said that a certain linear system should be solved. How does that system get solved, if not by inverting that matrix?

It seems to me also that there is a pseudoinversion going on there, and this should be made explicit in the article. -- NIC1138 18:17, 21 June 2007 (UTC)
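
As a small illustration of what "solve the linear system instead of forming the inverse" can mean in practice, here is one possible way to compute a Gauss–Newton step with NumPy; the Jacobian and residual values are made up, and this is only a sketch of common practice, not code from the article:

```python
import numpy as np

# Made-up Jacobian J (m residuals x n parameters) and residual vector r.
rng = np.random.default_rng(0)
J = rng.standard_normal((100, 3))
r = rng.standard_normal(100)

# Conceptually the Gauss-Newton step is  dp = -(J^T J)^{-1} J^T r,
# but in practice one solves the normal equations directly ...
dp_solve = np.linalg.solve(J.T @ J, -J.T @ r)

# ... or, better for conditioning, solves the linear least-squares problem
#     min_dp || J dp + r ||
# with a QR/SVD-based routine, which applies the pseudoinverse of J
# without ever forming it explicitly.
dp_lstsq, *_ = np.linalg.lstsq(J, -r, rcond=None)

# Forming the explicit inverse gives the same step, but it is slower and
# numerically less stable; shown only for comparison.
dp_inv = -np.linalg.inv(J.T @ J) @ J.T @ r

print(np.allclose(dp_solve, dp_lstsq), np.allclose(dp_solve, dp_inv))
```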

Missing Assumptions - Difference between Gauss-Newton and Newton

I would like to see an explicit statement about the different assumptions behind the Gauss–Newton and Newton methods. The Newton's method article is quite clear about its assumptions.

Now, in the Gauss–Newton method, why can the Hessian (implicitly) be approximated by J^T J? What is the geometric interpretation? Is the reason a special assumption, e.g. that the function value is very close to zero near the true solution and therefore the term sum_i r_i(x) nabla^2 r_i(x) is negligible? That would mean that Gauss–Newton (and also Levenberg–Marquardt!) is not suitable for finding minima where the "residual" function value is quite large.

Or from another viewpoint: given the function value and the Jacobian, there are infinitely many parabolas which are tangent to the function, so what is the justification for guessing the Hessian to be (close to) J^T J?

Without this explanation, the update rule given in the article seems unjustified to me; where does it come from? I do believe Gauss had good reasons for it :-)
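
To spell out what I mean (my own sketch, not taken from the article, writing J for the usual Jacobian with J_ij = dr_i/dx_j and using f(x) = (1/2) sum_i r_i(x)^2, the factor 1/2 just to keep constants tidy): the exact Hessian is

```latex
\nabla^2 f(x) \;=\; J(x)^{\top} J(x) \;+\; \sum_i r_i(x)\,\nabla^2 r_i(x),
```

so the Gauss–Newton approximation amounts to dropping the sum over r_i(x) nabla^2 r_i(x), which looks justified only when the residuals are small (or nearly linear) near the solution; it is exactly this assumption that I would like to see stated in the article.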

--89.53.92.248 17:18, 6 September 2007 (UTC)

Derivation

Much like the comment above, I am looking for more of a derivation of the Gauss–Newton algorithm. This article is useful in telling us what the algorithm is and what it uses as an update equation, but not why it works.
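
My guess is that the derivation goes roughly like this (linearise the residuals and minimise the resulting linear least-squares problem; this is only my sketch of the usual textbook route, not something taken from the article), and I would like to see it spelled out properly:

```latex
r(p + \delta) \;\approx\; r(p) + J\,\delta
\;\;\Longrightarrow\;\;
\min_{\delta} \bigl\| r(p) + J\,\delta \bigr\|^{2}
\;\;\Longrightarrow\;\;
J^{\top} J\,\delta = -\,J^{\top} r(p),
\qquad
p^{(t+1)} = p^{(t)} + \delta .
```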