Jolliffe, I. Principal Component Analysis (2nd ed., Springer, 2002)

8. Principal Components in Regression Analysis

...shrinkage is towards the first few PCs. This tends to downweight the contribution of the less stable, low-variance PCs, but does not ignore them. Oman (1991) demonstrates considerable improvements over least squares with these estimators.

A rather different type of approach, which nevertheless still uses PCs in a regression problem, is provided by latent root regression. The main difference between this technique and straightforward PC regression is that the PCs are not calculated for the set of $p$ predictor variables alone. Instead, they are calculated for a set of $(p+1)$ variables consisting of the $p$ predictor variables and the dependent variable. This idea was suggested independently by Hawkins (1973) and by Webster et al. (1974), and termed 'latent root regression' by the latter authors. Subsequent papers (Gunst et al., 1976; Gunst and Mason, 1977a) investigated the properties of latent root regression and compared it with other biased regression estimators. As with the biased estimators discussed in the previous section, the latent root regression estimator can be derived by optimizing a quadratic function of $\boldsymbol{\beta}$, subject to constraints (Hocking, 1976). Latent root regression, as defined in Gunst and Mason (1980, Section 10.2), will now be described; the technique introduced by Hawkins (1973) has slight differences and is discussed later in this section.

In latent root regression, a PCA is done on the set of $(p+1)$ variables described above, and the PCs corresponding to the smallest eigenvalues are examined. Those for which the coefficient of the dependent variable $y$ is also small are called non-predictive multicollinearities, and are deemed to be of no use in predicting $y$. However, any PC with a small eigenvalue will be of predictive value if its coefficient for $y$ is large. Thus, latent root regression deletes those PCs which indicate multicollinearities, but only if the multicollinearities appear to be useless for predicting $y$.

Let $\boldsymbol{\delta}_k$ be the vector of the $p$ coefficients on the $p$ predictor variables in the $k$th PC for the enlarged set of $(p+1)$ variables; let $\delta_{0k}$ be the corresponding coefficient of $y$, and let $\tilde{l}_k$ be the corresponding eigenvalue. Then the latent root estimator for $\boldsymbol{\beta}$ is defined as
$$
\hat{\boldsymbol{\beta}}_{LR} = \sum_{k \in M_{LR}} f_k \boldsymbol{\delta}_k, \qquad (8.4.1)
$$
where $M_{LR}$ is the subset of the integers $1, 2, \ldots, p+1$ from which the integers corresponding to the non-predictive multicollinearities defined above, and no others, have been deleted; the $f_k$ are coefficients chosen to minimize residual sums of squares among estimators of the form (8.4.1).

The $f_k$ can be determined by first using the $k$th PC to express $y$ as a linear function of $\mathbf{X}$, providing an estimator $\hat{\mathbf{y}}_k$. A weighted average, $\hat{\mathbf{y}}_{LR}$, of the $\hat{\mathbf{y}}_k$ for $k \in M_{LR}$ is then constructed, where the weights are chosen so as to minimize the residual sum of squares $(\hat{\mathbf{y}}_{LR} - \mathbf{y})'(\hat{\mathbf{y}}_{LR} - \mathbf{y})$. The vector $\hat{\mathbf{y}}_{LR}$ is then the latent root regression predictor $\mathbf{X}\hat{\boldsymbol{\beta}}_{LR}$, and the $f_k$
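The procedure just described translates directly into a short computation. Below is a minimal sketch in Python/NumPy, not the book's own code. The cut-offs `eig_tol` and `coef_tol` used to flag non-predictive multicollinearities are illustrative assumptions (the excerpt gives no numerical thresholds), each $\hat{\mathbf{y}}_k$ is obtained by the usual device of setting the $k$th PC score to its mean of zero and solving for $y$, and the weights are fitted by unconstrained least squares rather than as a formal weighted average.

```python
import numpy as np

def latent_root_regression(X, y, eig_tol=0.05, coef_tol=0.10):
    """Sketch of latent root regression in the spirit of Gunst and Mason (1980).

    X : (n, p) matrix of predictor variables; y : (n,) response vector.
    eig_tol, coef_tol : illustrative cut-offs (assumptions, not from the text)
        for declaring a PC a non-predictive multicollinearity, i.e. a small
        eigenvalue combined with a small coefficient on y.
    Returns the estimated coefficients on the standardized scale.
    """
    n, p = X.shape

    # Standardize the enlarged set of (p + 1) variables: predictors plus y.
    Z = np.column_stack([X, y])
    Z = (Z - Z.mean(axis=0)) / Z.std(axis=0, ddof=1)
    Xs, ys = Z[:, :p], Z[:, p]

    # PCA of the (p + 1) x (p + 1) correlation matrix.
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))

    kept, yhat_cols = [], []
    for k in range(p + 1):
        delta_k = eigvecs[:p, k]    # coefficients on the p predictors
        delta_0k = eigvecs[p, k]    # coefficient on y
        # Delete non-predictive multicollinearities: small eigenvalue
        # together with a small coefficient on y.
        if eigvals[k] < eig_tol and abs(delta_0k) < coef_tol:
            continue
        # Use the kth PC to express y as a linear function of X: setting the
        # PC score delta_0k * y + delta_k' x to its mean (zero) gives the
        # single-component predictor yhat_k = -X delta_k / delta_0k.
        yhat_cols.append(-Xs @ delta_k / delta_0k)
        kept.append(k)

    # Combine the yhat_k for k in M_LR with weights chosen to minimize the
    # residual sum of squares (yhat_LR - y)'(yhat_LR - y).
    W = np.column_stack(yhat_cols)
    w, *_ = np.linalg.lstsq(W, ys, rcond=None)

    # Equivalently, beta_LR = sum_k f_k delta_k with f_k = -w_k / delta_0k,
    # matching the form of equation (8.4.1).
    beta_lr = np.zeros(p)
    for w_k, k in zip(w, kept):
        beta_lr += (-w_k / eigvecs[p, k]) * eigvecs[:p, k]
    return beta_lr
```

The returned coefficients are on the standardized scale of the enlarged variable set; mapping them back to the original units follows the usual standardization algebra.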
