
Total least squares

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Charles Matthews (talk | contribs) at 21:18, 19 April 2007 (Robust linear regression: lk). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Errors-in-variables (EIV) is a robust modeling technique in statistics that assumes every variable can be subject to error or noise. In the literature of computational mathematics and engineering, errors-in-variables is also referred to, in a broad sense, as total least squares (TLS). In a strict sense, however, TLS refers to the application of EIV, or orthogonal regression, to a linear model.

Robust linear regression

In linear regression, the least squares (LS) method attributes all error to the dependent variable. Variants exist for other error configurations, including total least squares (i.e. orthogonal error), data least squares (DLS), and constrained or structured TLS.

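Given an observation vector $b \in \mathbb{R}^{m}$ and a data matrix $A \in \mathbb{R}^{m \times n}$ with $m > n$, consider the solution $x \in \mathbb{R}^{n}$ of the overdetermined system of equations

$$A x \approx b.$$

The ordinary least squares method (OLS) yields the solution that minimizes the Euclidean norm $\|r\|_2$ of the residual $r = b - A x$, where $\|\cdot\|_2$ is also known as the two-norm. The residual is an estimate of the error. Equivalently, the OLS problem can be paraphrased as

$$\min_{x,\,\Delta b} \|\Delta b\|_2 \quad \text{subject to} \quad A x = b + \Delta b.$$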
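The OLS step described above can be illustrated with a minimal NumPy sketch. The names `A` (data matrix), `b` (observation vector), `x_true`, and the helper `ols_solve` are assumptions made for this example, and the data are synthetic with noise added to the observation vector only.

```python
import numpy as np


def ols_solve(A, b):
    """Return the least-squares solution of A x ~ b and the residual b - A x.

    Hypothetical helper for illustration; it simply wraps numpy.linalg.lstsq.
    """
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x, b - A @ x


# Synthetic data (illustrative values): only b is perturbed.
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 2))
x_true = np.array([2.0, -1.0])
b = A @ x_true + 0.1 * rng.normal(size=50)

x_ols, r = ols_solve(A, b)

# The same solution obtained through the pseudo-inverse of the data matrix.
x_pinv = np.linalg.pinv(A) @ b

print("OLS estimate:      ", x_ols)
print("pseudo-inverse:    ", x_pinv)
print("two-norm of resid.:", np.linalg.norm(r))
```

At the minimizer the residual is orthogonal to the columns of the data matrix, which is one way to check such a solution numerically.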

If the data matrix $A$ is also noisy (i.e. there is error in both the dependent and the explanatory variables), the OLS solution is no longer optimal. In cases where orthogonal optimization is acceptable, TLS offers a proper formulation:

$$\min_{x,\,\Delta A,\,\Delta b} \left\| [\,\Delta A \;\; \Delta b\,] \right\|_F \quad \text{subject to} \quad (A + \Delta A)\,x = b + \Delta b,$$

where $\|\cdot\|_F$ is the Frobenius norm (the square root of the sum of the squared entries of a matrix), and the perturbations $\Delta A$ and $\Delta b$ compensate for the noise in the data matrix $A$ and the observation vector $b$, respectively. This formulation of TLS also implies that the noise is assumed to be independent and identically distributed (i.i.d.) in both $A$ and $b$. If the distribution of the errors is known or well estimated, the objective can include a corresponding weighting matrix; this is called constrained or structured TLS.
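One way to see why this formulation is described as orthogonal is to eliminate the perturbations for a fixed candidate solution. The sketch below assumes the following notation, which may differ from the article's: $A$ for the data matrix, $b$ for the observation vector, $x$ for the unknown, and $\Delta A$, $\Delta b$ for the perturbations. Writing $z = [\,x^{T} \;\; {-1}\,]^{T}$ and $r = b - A x$, the constraint $(A + \Delta A)\,x = b + \Delta b$ becomes $[\,\Delta A \;\; \Delta b\,]\,z = r$, whose solution of minimum Frobenius norm is the rank-one matrix

$$[\,\Delta A \;\; \Delta b\,] = \frac{r\,z^{T}}{z^{T} z}, \qquad \left\| [\,\Delta A \;\; \Delta b\,] \right\|_F = \frac{\|b - A x\|_2}{\sqrt{1 + \|x\|_2^{2}}}.$$

Under these assumptions the TLS problem reduces to

$$\min_{x} \; \frac{\|A x - b\|_2^{2}}{1 + \|x\|_2^{2}},$$

that is, the OLS objective divided by $1 + \|x\|_2^{2}$. For a single explanatory variable and a line through the origin, each term $|b_i - a_i x| / \sqrt{1 + x^{2}}$ is the perpendicular distance from the point $(a_i, b_i)$ to the fitted line.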

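In the other case, where the noise is only in the data matrix $A$, DLS can be used instead:

$$\min_{x,\,\Delta A} \|\Delta A\|_F \quad \text{subject to} \quad (A + \Delta A)\,x = b.$$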

The solution of the OLS problem can be obtained by using the (pseudo-)inverse of the data matrix. Solutions to the TLS and DLS problems have been shown to be closely connected to the singular vectors of the (augmented) system-related matrix that correspond to its minimum singular value. For TLS, this is the augmented matrix formed by appending the observation vector to the data matrix as an extra column.
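The connection to singular vectors can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the algorithm of any of the cited references: `A` denotes the data matrix and `b` the observation vector, the helper names `tls_solve` and `dls_solve` are hypothetical, the system is assumed to have more rows than unknowns plus one, and the minimum singular value is assumed to be simple (the generic case). The TLS solution is read off the right singular vector of the augmented matrix for its smallest singular value. The DLS variant shown here uses the matrix obtained by projecting the columns of `A` onto the orthogonal complement of `b`, which follows from minimizing the squared residual norm divided by the squared norm of the solution.

```python
import numpy as np


def tls_solve(A, b):
    """Total least squares sketch via the SVD of the augmented matrix [A b]."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float).reshape(-1)
    n = A.shape[1]
    C = np.column_stack([A, b])
    # Rows of Vt are right singular vectors, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(C, full_matrices=False)
    v = Vt[-1]
    if np.isclose(v[n], 0.0):
        raise np.linalg.LinAlgError("non-generic case: last component is zero")
    # [A b] v ~ 0 with v scaled to [x; -1] gives A x ~ b.
    return -v[:n] / v[n]


def dls_solve(A, b):
    """Data least squares sketch (noise assumed only in A)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float).reshape(-1)
    # Remove from each column of A its component along b.
    PA = A - np.outer(b, b @ A) / (b @ b)
    _, _, Vt = np.linalg.svd(PA, full_matrices=False)
    v = Vt[-1]
    denom = b @ (A @ v)
    if np.isclose(denom, 0.0):
        raise np.linalg.LinAlgError("non-generic case: b is orthogonal to A v")
    # Rescale the direction v so that A x best matches b along b.
    return v * (b @ b) / denom


# Synthetic example (illustrative values): noise in both A and b.
rng = np.random.default_rng(1)
A_clean = rng.normal(size=(40, 3))
x_true = np.array([1.0, -2.0, 0.5])
A_noisy = A_clean + 0.05 * rng.normal(size=A_clean.shape)
b_noisy = A_clean @ x_true + 0.05 * rng.normal(size=40)

print("TLS estimate:", tls_solve(A_noisy, b_noisy))
print("DLS estimate:", dls_solve(A_noisy, b_noisy))
```

On noise-free, consistent data both helpers should recover the exact solution, since the relevant minimum singular value is then zero.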

References

  • S. Van Huffel and P. Lemmerling, Total Least Squares and Errors-in-Variables Modeling: Analysis, Algorithms and Applications. Dordrecht, The Netherlands: Kluwer Academic Publishers, 2002.
  • S. Jo and S. W. Kim, "Consistent normalized least mean square filtering with noisy data matrix," IEEE Trans. Signal Processing, vol. 53, no. 6, pp. 2112–2123, Jun. 2005.
  • R. D. DeGroat and E. M. Dowling, "The data least squares problem and channel equalization," IEEE Trans. Signal Processing, vol. 41, no. 1, pp. 407–411, Jan. 1993.
  • T. Abatzoglou and J. Mendel, "Constrained total least squares," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP’87), Apr. 1987, vol. 12, pp. 1485–1488.