Principal component regression

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Marion.cuny (talk | contribs) at 14:48, 4 March 2011 (Principal Component Principle). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In statistics, principal component regression (PCR) is a regression analysis technique that uses principal component analysis when estimating regression coefficients. It is used to overcome problems that arise when the explanatory variables are close to being collinear.[1]

In PCR, instead of regressing the dependent variable on the independent variables (the regressors) directly, the principal components of the independent variables are used as regressors. Typically only a subset of the principal components is used in the regression, which makes PCR a kind of regularized estimation.
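In matrix notation, the idea can be sketched as follows. The symbols are not part of the text above and are introduced here only for illustration: X is the column-centered n × p matrix of independent variables, y is the centered dependent variable, and V_k is the p × k matrix whose columns are the k retained principal directions.

```latex
\begin{aligned}
X^{\mathsf T} X &= V \Lambda V^{\mathsf T}
  && \text{(eigendecomposition; the columns of } V \text{ are the principal directions)} \\
Z_k &= X V_k
  && \text{(scores on the } k \text{ retained components)} \\
\hat{\gamma}_k &= (Z_k^{\mathsf T} Z_k)^{-1} Z_k^{\mathsf T} y
  && \text{(least squares on the scores)} \\
\hat{\beta}_{\mathrm{PCR}} &= V_k \hat{\gamma}_k
  && \text{(coefficients for the original variables)}
\end{aligned}
```

If all p components are kept and X has full column rank, this reproduces the ordinary least squares estimate. With k < p the estimate is restricted to the span of the retained directions, which is the sense in which the method regularizes.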

Often the principal components with the highest variance are selected. However, the low-variance principal components may also be important, in some cases even more important.[2]
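The following sketch illustrates this point on synthetic data. The data are constructed for illustration and are not taken from the cited reference: two nearly collinear predictors are generated, and the response is made to depend on their difference, which lies almost entirely along the low-variance component. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Two nearly collinear predictors: x2 is x1 plus a small perturbation.
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
X = np.column_stack([x1, x2])

# The response depends on the difference x1 - x2 (an assumption made
# for this illustration), plus a little noise.
y = (x1 - x2) + 0.01 * rng.normal(size=n)

# Principal components of the centered predictors via the SVD.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt.T

# Sample variance of each component, in decreasing order.
explained_variance = s**2 / (n - 1)

# Correlation of each component's scores with the response.
corr = np.array([np.corrcoef(scores[:, j], y)[0, 1] for j in range(2)])

print("component variances:", explained_variance)
print("correlation with y: ", corr)
```

In this construction the first component carries almost all of the variance of the predictors, yet the second component is the one that is strongly correlated with the response, so selecting components by variance alone would discard the useful one.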

Principal Component Regression Principle

PCR can be divided into three steps:

  1. The first step is to run a principal components analysis on the table of the explanatory variables.
  2. The second step is to run an ordinary least squares regression (linear regression) of the dependent variable on the selected components. The components most strongly correlated with the dependent variable are selected.
  3. Finally, the parameters of the model are computed for the original explanatory variables by transforming the component coefficients back.
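The three steps above can be sketched with NumPy as follows. This is a minimal illustration, not a reference implementation: the function name `pcr_fit` and the parameter `n_components` are hypothetical, the number of components to keep is left to the caller, and the predictors are assumed to have full column rank so that every component has nonzero variance.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Minimal PCR sketch (hypothetical helper, for illustration only)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    yc = y - y_mean

    # Step 1: principal components analysis of the explanatory variables.
    # The columns of V are the principal directions.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt.T
    scores = Xc @ V

    # Step 2: keep the components most correlated with the dependent
    # variable, then run ordinary least squares on their scores.
    corr = np.array([abs(np.corrcoef(scores[:, j], yc)[0, 1])
                     for j in range(scores.shape[1])])
    selected = np.sort(np.argsort(corr)[::-1][:n_components])
    Z = scores[:, selected]
    gamma, *_ = np.linalg.lstsq(Z, yc, rcond=None)

    # Step 3: transform the component coefficients back to coefficients
    # for the original explanatory variables.
    beta = V[:, selected] @ gamma
    intercept = y_mean - x_mean @ beta
    return beta, intercept, selected

# Usage on synthetic data (values chosen arbitrarily for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 3.0 + 0.1 * rng.normal(size=100)

beta_full, intercept_full, _ = pcr_fit(X, y, n_components=3)
beta_2, intercept_2, kept = pcr_fit(X, y, n_components=2)
```

Keeping all components gives the same coefficients as ordinary least squares on the original variables. Keeping fewer components gives the regularized estimate.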

Software implementation

  • XLSTAT is a modular statistical and multivariate analysis software package that includes principal component regression among other multivariate tools.

References

  1. ^ Dodge, Y. (2003) The Oxford Dictionary of Statistical Terms, OUP. ISBN 0-19-920613-9
  2. ^ Jolliffe, Ian T. (1982). "A Note on the Use of Principal Components in Regression". Journal of the Royal Statistical Society, Series C (Applied Statistics). 31 (3): 300–303. doi:10.2307/2348005.
  • R. Kramer, Chemometric Techniques for Quantitative Analysis, (1998) Marcel-Dekker, ISBN 0-8247-0198-4.