12.07.2015 Views

Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)



8 Principal Components in Regression Analysis

As illustrated elsewhere in this book, principal components are used in conjunction with a variety of other statistical techniques. One area in which this activity has been extensive is regression analysis.

In multiple regression, one of the major difficulties with the usual least squares estimators is the problem of multicollinearity, which occurs when there are near-constant linear functions of two or more of the predictor, or regressor, variables. A readable review of the multicollinearity problem is given by Gunst (1983). Multicollinearities are often, but not always, indicated by large correlations between subsets of the variables and, if multicollinearities exist, then the variances of some of the estimated regression coefficients can become very large, leading to unstable and potentially misleading estimates of the regression equation. To overcome this problem, various approaches have been proposed. One possibility is to use only a subset of the predictor variables, where the subset is chosen so that it does not contain multicollinearities. Numerous subset selection methods are available (see, for example, Draper and Smith, 1998, Chapter 15; Hocking, 1976; Miller, 1984, 1990), and among the methods are some based on PCs. These methods will be dealt with later in the chapter (Section 8.5), but first some more widely known uses of PCA in regression are described.

These uses of PCA follow from a second class of approaches to overcoming the problem of multicollinearity, namely the use of biased regression estimators. This class includes ridge regression, shrinkage estimators, partial least squares, the so-called LASSO, and also approaches based on PCA. The best-known such approach, generally known as PC regression, simply starts by using the PCs of the predictor variables in place of the predic-
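
The effect of multicollinearity on coefficient variances described above can be illustrated numerically. The following is a minimal sketch, not taken from the book: it assumes NumPy, simulated data, and a hypothetical helper `coefficient_variance_factors`. It relies on the standard result that the variance of a least squares coefficient is the error variance times the corresponding diagonal element of the inverse of X'X (for centred X). The smallest eigenvalue of the predictor covariance matrix is also printed, since a near-constant linear function of the predictors corresponds to a direction with very small variance.

```python
import numpy as np

def coefficient_variance_factors(X):
    """Diagonal of (X'X)^{-1} for column-centred X (hypothetical helper).

    Under the usual least squares assumptions, var(b_j) equals the error
    variance multiplied by the j-th factor returned here.
    """
    Xc = X - X.mean(axis=0)
    return np.diag(np.linalg.inv(Xc.T @ Xc))

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2_indep = rng.normal(size=n)                    # unrelated to x1
x2_collinear = x1 + 0.01 * rng.normal(size=n)    # x2 - x1 is nearly constant

X_indep = np.column_stack([x1, x2_indep])
X_collinear = np.column_stack([x1, x2_collinear])

factors_indep = coefficient_variance_factors(X_indep)
factors_collinear = coefficient_variance_factors(X_collinear)

# Eigenvalues of the sample covariance matrices, in ascending order.
eig_indep = np.linalg.eigvalsh(np.cov(X_indep, rowvar=False))
eig_collinear = np.linalg.eigvalsh(np.cov(X_collinear, rowvar=False))

print("variance factors, unrelated predictors:     ", factors_indep)
print("variance factors, near-collinear predictors:", factors_collinear)
print("smallest eigenvalue, unrelated predictors:     ", eig_indep[0])
print("smallest eigenvalue, near-collinear predictors:", eig_collinear[0])
```

With the near-collinear pair, both variance factors come out orders of magnitude larger than with unrelated predictors, and the smallest eigenvalue is close to zero. The noise scale 0.01 and the sample size are arbitrary choices for illustration.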
