Talk:Gauss–Newton algorithm
I always thought that Newton's method, when applied to systems of equations, is still called Newton's method (or Newton-Raphson) and that Gauss-Newton is a modified Newton's method for solving least squares problems. To wit, let f(x) = sum_i r_i(x)^2 be the least squares problem (x is a vector, and I hope you don't mind my being too lazy to properly type-set the maths). Then we need to solve g(x) = 2 A(x) r(x) = 0, where A(x) is the Jacobian matrix, so A_ij = derivative of r_j w.r.t. x_i. In my understanding, Newton's method is as described in the article: g'(x_k) (x_{k+1} - x_k) = - g(x_k). The derivative g'(x) can be calculated as g'(x) = 2 A(x) A(x)^\top + 2 sum_i r_i(x) nabla^2 r_i(x). On the other hand, the Gauss-Newton method neglects the second term, so we get the iteration A(x_k) A(x_k)^\top (x_{k+1} - x_k) = - A(x_k) r(x_k).
Could you please tell me whether I am mistaken, preferably giving a reference if I am? Thanks. -- Jitse Niesen 18:33, 18 Nov 2004 (UTC)
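To make the comparison concrete, here is a minimal numerical sketch of the Gauss-Newton iteration described above, written with the more usual Jacobian convention J_ij = dr_i/dx_j (so J = A^\top); the residual function below is invented purely for illustration and is not from the article.

```python
# Minimal sketch of Gauss-Newton for f(x) = sum_i r_i(x)^2, with the usual
# convention J_ij = d r_i / d x_j (i.e. J = A^T in the notation above).
import numpy as np

def residuals(x):
    # hypothetical residual vector r(x); replace with the actual model
    return np.array([x[0]**2 + x[1] - 3.0,
                     x[0] + x[1]**2 - 5.0,
                     x[0] * x[1] - 2.0])

def jacobian(x):
    # J_ij = d r_i / d x_j for the residuals above
    return np.array([[2.0 * x[0], 1.0],
                     [1.0,        2.0 * x[1]],
                     [x[1],       x[0]]])

x = np.array([1.0, 1.0])
for _ in range(10):
    r = residuals(x)
    J = jacobian(x)
    # Gauss-Newton step: solve (J^T J) dx = -J^T r  (same as A A^T dx = -A r above)
    dx = np.linalg.solve(J.T @ J, -J.T @ r)
    x = x + dx
```

Adding the term 2 sum_i r_i(x) nabla^2 r_i(x) to the matrix on the left would give the full Newton step discussed above.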
- It is quite possible that you are right. Please feel free to improve both this article and the corresponding section in Newton's method. I do not have an appropriate reference at hand, so I am unable to contribute to a clarification. - By the way, thanks for the personal message on my talk page. -- Frau Holle 22:23, 20 Nov 2004 (UTC)
- I moved the description of the general Newton's method in R^n back to Newton's method and wrote here a bit about the modified method for least squares problems. -- Jitse Niesen 00:45, 5 Dec 2004 (UTC)
Transpose of the Jacobians in the wrong place?
Should the formula not be p(t)=p(t+1)-(J'J)^(-1)J'f?
I thought the RHS was supposed to look like an OLS regression estimate... —The preceding unsigned comment was added by 131.111.8.96 (talk • contribs) 15:48, 10 January 2006 (UTC)
- Please be more precise. Do you want to change the formula
- in the article to read
- If so, can you give some more justification than "I thought the RHS was supposed to look like an OLS regression estimate", like a reference? If not, what do you want? -- Jitse Niesen (talk) 18:12, 10 January 2006 (UTC)
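For later readers, the OLS connection the original poster alludes to can be checked numerically: the Gauss-Newton increment is exactly the ordinary least-squares estimate for the linearised model with the Jacobian as design matrix and -r as response. A small sketch with made-up values (J and r here are hypothetical, not from the article):

```python
# The Gauss-Newton increment delta solves the OLS problem  min || J delta + r ||^2.
import numpy as np

J = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 7.0]])   # hypothetical Jacobian
r = np.array([0.5, -1.0, 2.0])                        # hypothetical residuals

delta_normal = np.linalg.solve(J.T @ J, -J.T @ r)     # (J^T J)^{-1} J^T (-r)
delta_ols, *_ = np.linalg.lstsq(J, -r, rcond=None)    # same step via a least-squares solver
assert np.allclose(delta_normal, delta_ols)
```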
The term that is neglected in the Newton step to obtain the Gauss-Newton step is not the Laplacian. It also contains mixed second-order derivatives, which is not the case for the Laplacian! Georg
- You're right. Many thanks for your comment. It should be fixed now. -- Jitse Niesen (talk) 03:10, 28 October 2006 (UTC)
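For reference, the term being discussed, written out in the notation of the first thread (A_ij = dr_j/dx_i):

```latex
% The term dropped from the Newton step to obtain the Gauss-Newton step:
\[
  2 \sum_i r_i(x)\, \nabla^2 r_i(x),
  \qquad
  \bigl(\nabla^2 r_i\bigr)_{jk} = \frac{\partial^2 r_i}{\partial x_j\, \partial x_k},
\]
% a sum of full Hessian matrices of the r_i (mixed partials included); the
% Laplacian \Delta r_i = \sum_j \partial^2 r_i / \partial x_j^2 would only be its trace.
```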
no inversion?
What is meant by "The matrix inverse is never computed explicitly in practice"? Right after that, it is said that a certain linear system should be solved. How does that system get solved, if not by inverting that matrix?
It also seems to me that there is a pseudoinversion going on there, and this should be made explicit in the article. -- NIC1138 18:17, 21 June 2007 (UTC)
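What the sentence presumably means is that the normal equations are handed to a linear solver (or a QR/SVD-based least-squares routine, which is where the pseudoinverse viewpoint comes in) rather than forming the inverse matrix explicitly; all routes give the same step. A sketch with hypothetical J and r:

```python
# Three equivalent ways to get the Gauss-Newton increment dx from the normal
# equations J^T J dx = -J^T r; only the first forms an explicit inverse.
import numpy as np

J = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])   # hypothetical Jacobian
r = np.array([0.1, -0.2, 0.3])                        # hypothetical residuals

dx_inv = -np.linalg.inv(J.T @ J) @ (J.T @ r)          # explicit inverse (avoided in practice)
dx_solve = np.linalg.solve(J.T @ J, -J.T @ r)         # direct solve of the linear system
dx_lstsq, *_ = np.linalg.lstsq(J, -r, rcond=None)     # least-squares (pseudoinverse) route
assert np.allclose(dx_inv, dx_solve) and np.allclose(dx_solve, dx_lstsq)
```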
Missing Assumptions - Difference between Gauss-Newton and Newton
I would like to see an explicit statement about the different assumptions behind the Gauss-Newton and Newton methods. The Newton's method article is quite clear about its assumptions.
Now, in the Gauss-Newton method, why can the Hessian (implicitly) be approximated by J^T J? What is the geometric interpretation? Is the reason a special assumption, e.g. that the residuals r_i are very close to zero near the true solution and therefore the second-derivative term sum_i r_i nabla^2 r_i is negligible? That would mean that Gauss-Newton (and also Levenberg-Marquardt!) is not suitable for finding minima where the "residual" function value is quite large.
Or from another viewpoint: given the function value and the Jacobian, there are infinitely many parabolas tangent to the function; what's the justification for guessing the Hessian to be (close to) J^T J?
Without this explanation, the update rule given in the article seems unjustified to me; where does it come from? I do believe Gauss had good reasons for it :-)
--89.53.92.248 17:18, 6 September 2007 (UTC)
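A sketch of the decomposition behind these questions (standard notation with J_ij = dr_i/dx_j; this is the same split as in the first thread above, nothing beyond that):

```latex
% Exact Hessian of f(x) = \sum_i r_i(x)^2 versus the Gauss-Newton approximation:
\[
  \nabla^2 f(x) \;=\; 2\, J(x)^{\top} J(x) \;+\; 2 \sum_i r_i(x)\, \nabla^2 r_i(x)
  \;\approx\; 2\, J(x)^{\top} J(x).
\]
% Geometrically, 2 J^T J is the Hessian of the parabola obtained by squaring the
% tangent (linear) approximation of r at x. The approximation is good exactly when
% the second sum is small: residuals r_i near zero at the solution, or r_i nearly
% linear there -- the "small residual" assumption guessed at above.
```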
Derivation
Much like the comment above, I am looking for more of a derivation of the Gauss-Newton algorithm. This article is useful in telling us what the algorithm is and what it uses as an update equation, but not why it works.
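For what it's worth, the usual textbook derivation fits in a few lines; here is a sketch, using J_ij = dr_i/dx_j evaluated at the current iterate x_k:

```latex
% Linearise the residual vector at the current iterate x_k:
\[
  r(x) \approx r(x_k) + J_k\,(x - x_k),
  \qquad (J_k)_{ij} = \frac{\partial r_i}{\partial x_j}(x_k).
\]
% Minimise the squared norm of this linear model over the step \Delta = x - x_k:
\[
  \min_{\Delta} \bigl\| r(x_k) + J_k \Delta \bigr\|^2
  \;\Longrightarrow\;
  J_k^{\top} J_k\, \Delta = -\, J_k^{\top} r(x_k),
  \qquad x_{k+1} = x_k + \Delta.
\]
% These are the normal equations of a linear least-squares problem, which is why
% each step looks like an OLS estimate; the full Newton step would add
% \sum_i r_i(x_k)\, \nabla^2 r_i(x_k) to J_k^\top J_k.
```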