Talk:Gauss–Newton algorithm
I always thought that Newton's method, when applied to systems of equations, is still called Newton's method (or Newton-Raphson) and that Gauss-Newton is a modified Newton's method for solving least squares problems. To wit, let f(x) = sum_i r_i(x)^2 be the least squares problem (x is a vector, and I hope you don't mind my being too lazy to properly type-set the maths). Then we need to solve g(x) = 2 A(x) r(x) = 0, where A(x) is the Jacobian matrix, so A_ij = derivative of r_j w.r.t. x_i. In my understanding, Newton's method is as described in the article: g'(x_k) (x_{k+1} - x_k) = - g(x_k). The derivative g'(x) can be calculated as g'(x) = 2 A(x) A(x)^\top + 2 sum_i r_i(x) nabla^2 r_i(x). On the other hand, the Gauss-Newton method neglects the second term, so we get the iteration A(x_k) A(x_k)^\top (x_{k+1} - x_k) = - A(x_k) r(x_k).
Could you please tell me whether I am mistaken, preferably giving a reference if I am? Thanks. -- Jitse Niesen 18:33, 18 Nov 2004 (UTC)
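To make the comparison concrete, here is a minimal numerical sketch of the Gauss-Newton iteration described above, written with the more usual Jacobian convention J_ij = dr_i/dx_j (so J = A^\top); the residual function below is invented purely for illustration and is not from the article.

```python
# Minimal sketch of Gauss-Newton for f(x) = sum_i r_i(x)^2, with the usual
# convention J_ij = d r_i / d x_j (i.e. J = A^T in the notation above).
import numpy as np

def residuals(x):
    # hypothetical residual vector r(x); replace with the actual model
    return np.array([x[0]**2 + x[1] - 3.0,
                     x[0] + x[1]**2 - 5.0,
                     x[0] * x[1] - 2.0])

def jacobian(x):
    # J_ij = d r_i / d x_j for the residuals above
    return np.array([[2.0 * x[0], 1.0],
                     [1.0,        2.0 * x[1]],
                     [x[1],       x[0]]])

x = np.array([1.0, 1.0])
for _ in range(10):
    r = residuals(x)
    J = jacobian(x)
    # Gauss-Newton step: solve (J^T J) dx = -J^T r  (same as A A^T dx = -A r above)
    dx = np.linalg.solve(J.T @ J, -J.T @ r)
    x = x + dx
```

Adding the term 2 sum_i r_i(x) nabla^2 r_i(x) to the matrix on the left would give the full Newton step discussed above.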
- It is quite possible that you are right. Please feel free to improve both this article and the corresponding section in Newton's method. I do not have an appropriate reference at hand, so I am unable to contribute to a clarification. - By the way, thanks for the personal message on my talk page. -- Frau Holle 22:23, 20 Nov 2004 (UTC)
- I moved the description of the general Newton's method in R^n back to Newton's method and wrote here a bit about the modified method for least squares problems. -- Jitse Niesen 00:45, 5 Dec 2004 (UTC)
Transpose of the Jacobians in the wrong place?
Should the formula not be p(t)=p(t+1)-(J'J)^(-1)J'f?
I thought the RHS was supposed to look like an OLS regression estimate... —The preceding unsigned comment was added by 131.111.8.96 (talk • contribs) 15:48, 10 January 2006 (UTC)
- Please be more precise. Do you want to change the formula
- in the article to read
- If so, can you give some more justification than "I thought the RHS was supposed to look like an OLS regression estimate", like a reference? If not, what do you want? -- Jitse Niesen (talk) 18:12, 10 January 2006 (UTC)
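For later readers, the OLS connection the original poster alludes to can be checked numerically: the Gauss-Newton increment is exactly the ordinary least-squares estimate for the linearised model with the Jacobian as design matrix and -r as response. A small sketch with made-up values (J and r here are hypothetical, not from the article):

```python
# The Gauss-Newton increment delta solves the OLS problem  min || J delta + r ||^2.
import numpy as np

J = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 7.0]])   # hypothetical Jacobian
r = np.array([0.5, -1.0, 2.0])                        # hypothetical residuals

delta_normal = np.linalg.solve(J.T @ J, -J.T @ r)     # (J^T J)^{-1} J^T (-r)
delta_ols, *_ = np.linalg.lstsq(J, -r, rcond=None)    # same step via a least-squares solver
assert np.allclose(delta_normal, delta_ols)
```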
The term that is neglected in the Newton step to obtain the Gauss-Newton step is not the Laplacian. It also contains mixed second-order derivatives, which is not the case for the Laplacian! Georg
- You're right. Many thanks for your comment. It should be fixed now. -- Jitse Niesen (talk) 03:10, 28 October 2006 (UTC)
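For reference, the term being discussed, written out in the notation of the first thread (A_ij = dr_j/dx_i):

```latex
% The term dropped from the Newton step to obtain the Gauss-Newton step:
\[
  2 \sum_i r_i(x)\, \nabla^2 r_i(x),
  \qquad
  \bigl(\nabla^2 r_i\bigr)_{jk} = \frac{\partial^2 r_i}{\partial x_j\, \partial x_k},
\]
% a sum of full Hessian matrices of the r_i (mixed partials included); the
% Laplacian \Delta r_i = \sum_j \partial^2 r_i / \partial x_j^2 would only be its trace.
```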
no inversion?
What is meant by "The matrix inverse is never computed explicitly in practice"? Right after that, it is said that a certain linear system should be solved. How does that system get solved, if not by inverting that matrix?
It also seems to me that there is a pseudoinversion going on there, and this should be made explicit in the article. -- NIC1138 18:17, 21 June 2007 (UTC)
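What the sentence presumably means is that the normal equations are handed to a linear solver (or a QR/SVD-based least-squares routine, which is where the pseudoinverse viewpoint comes in) rather than forming the inverse matrix explicitly; all routes give the same step. A sketch with hypothetical J and r:

```python
# Three equivalent ways to get the Gauss-Newton increment dx from the normal
# equations J^T J dx = -J^T r; only the first forms an explicit inverse.
import numpy as np

J = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])   # hypothetical Jacobian
r = np.array([0.1, -0.2, 0.3])                        # hypothetical residuals

dx_inv = -np.linalg.inv(J.T @ J) @ (J.T @ r)          # explicit inverse (avoided in practice)
dx_solve = np.linalg.solve(J.T @ J, -J.T @ r)         # direct solve of the linear system
dx_lstsq, *_ = np.linalg.lstsq(J, -r, rcond=None)     # least-squares (pseudoinverse) route
assert np.allclose(dx_inv, dx_solve) and np.allclose(dx_solve, dx_lstsq)
```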
Missing Assumptions - Difference between Gauss-Newton and Newton
I would like to see an explicit statement about the different assumptions behind the Gauss-Newton and Newton methods. The Newton's method article is quite clear about its assumptions.
Now, in the Gauss-Newton method, why can the Hessian (implicitly) be approximated by J^T J? What is the geometric interpretation? Is the reason a special assumption, e.g. that the residuals r_i are very close to zero near the true solution and therefore the second-derivative term sum_i r_i nabla^2 r_i is negligible? That would mean that Gauss-Newton (and also Levenberg-Marquardt!) is not suitable for finding minima where the "residual" function value is quite large.
Or from another viewpoint: given the function value and the Jacobian, there are infinitely many parabolas tangent to the function; what's the justification for guessing the Hessian to be (close to) J^T J?
Without this explanation, the update rule given in the article seems unjustified to me; where does it come from? I do believe Gauss had good reasons for it :-)
--89.53.92.248 17:18, 6 September 2007 (UTC)
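A sketch of the decomposition behind these questions (standard notation with J_ij = dr_i/dx_j; this is the same split as in the first thread above, nothing beyond that):

```latex
% Exact Hessian of f(x) = \sum_i r_i(x)^2 versus the Gauss-Newton approximation:
\[
  \nabla^2 f(x) \;=\; 2\, J(x)^{\top} J(x) \;+\; 2 \sum_i r_i(x)\, \nabla^2 r_i(x)
  \;\approx\; 2\, J(x)^{\top} J(x).
\]
% Geometrically, 2 J^T J is the Hessian of the parabola obtained by squaring the
% tangent (linear) approximation of r at x. The approximation is good exactly when
% the second sum is small: residuals r_i near zero at the solution, or r_i nearly
% linear there -- the "small residual" assumption guessed at above.
```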
Derivation
Much like the comment above, I am looking for more of a derivation of the Gauss-Newton algorithm. This article is useful in telling us what the algorithm is and what it uses as an update equation, but not why it works.
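For what it's worth, the usual textbook derivation fits in a few lines; here is a sketch, using J_ij = dr_i/dx_j evaluated at the current iterate x_k:

```latex
% Linearise the residual vector at the current iterate x_k:
\[
  r(x) \approx r(x_k) + J_k\,(x - x_k),
  \qquad (J_k)_{ij} = \frac{\partial r_i}{\partial x_j}(x_k).
\]
% Minimise the squared norm of this linear model over the step \Delta = x - x_k:
\[
  \min_{\Delta} \bigl\| r(x_k) + J_k \Delta \bigr\|^2
  \;\Longrightarrow\;
  J_k^{\top} J_k\, \Delta = -\, J_k^{\top} r(x_k),
  \qquad x_{k+1} = x_k + \Delta.
\]
% These are the normal equations of a linear least-squares problem, which is why
% each step looks like an OLS estimate; the full Newton step would add
% \sum_i r_i(x_k)\, \nabla^2 r_i(x_k) to J_k^\top J_k.
```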