Jolliffe, I. Principal Component Analysis (2nd ed., Springer, 2002), 518 pp.

8. Principal Components in Regression Analysis

Table 8.1. Percentage variation accounted for by PCs of predictor variables in monsoon data for (a) predictor variables, (b) dependent variable.

Component number            1    2    3    4    5    6    7    8    9   10
(a) Predictor variables    26   22   17   11   10    7    4    3    1   <1
(b) Dependent variable      3   22   <1    1    3    3    6   24    5   20

below some desired level. The original VIF for a variable is related to the squared multiple correlation R² between that variable and the other (p − 1) predictor variables by the formula VIF = (1 − R²)⁻¹. Values of VIF > 10 correspond to R² > 0.90, and VIF > 4 is equivalent to R² > 0.75, so that values of R² can be considered when choosing how small a level of VIF is desirable. However, the choice of this desirable level is almost as arbitrary as the choice of l* above.

Deletion based solely on variance is an attractive and simple strategy, and Property A7 of Section 3.1 gives it, at first sight, an added respectability. However, low variance for a component does not necessarily imply that the corresponding component is unimportant in the regression model. For example, Kung and Sharif (1980) give an example from meteorology where, in a regression of monsoon onset dates on all of the (ten) PCs, the most important PCs for prediction are, in decreasing order of importance, the eighth, second and tenth (see Table 8.1). The tenth component accounts for less than 1% of the total variation in the predictor variables, but is an important predictor of the dependent variable, and the most important PC in the regression accounts for 24% of the variation in y but only 3% of the variation in x. Further examples of this type are presented in Jolliffe (1982).
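The relation VIF = (1 − R²)⁻¹ quoted above is easy to check numerically. The following minimal sketch uses synthetic data (the variables and the function name `vif` are illustrative, not from the text): each predictor is regressed on the others with an intercept, and the resulting R² is converted to a VIF.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic predictors: the first two are nearly collinear,
# the third is independent of both (illustrative data only).
n = 200
z = rng.normal(size=n)
X = np.column_stack([
    z + 0.1 * rng.normal(size=n),
    z + 0.1 * rng.normal(size=n),
    rng.normal(size=n),
])

def vif(X, j):
    """VIF of column j: regress x_j on the remaining columns
    (with an intercept) and return (1 - R^2)^(-1)."""
    y = X[:, j]
    A = np.column_stack([np.ones(len(y)), np.delete(X, j, axis=1)])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    r2 = 1.0 - ((y - A @ beta) ** 2).sum() / ((y - y.mean()) ** 2).sum()
    return 1.0 / (1.0 - r2)

for j in range(X.shape[1]):
    print(f"VIF of x{j + 1}: {vif(X, j):.1f}")
```

The two collinear columns give VIF well above 10 (R² > 0.90), while the independent column gives a VIF near 1, matching the cut-offs in the text.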
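The Kung and Sharif pattern, in which a low-variance PC is nevertheless a strong predictor, can be reproduced on synthetic data. The sketch below (hypothetical data, not the monsoon series) constructs predictors whose smallest-variance PC drives the dependent variable, then reports each PC's variance alongside its R² with y.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical predictors: five independent columns with
# standard deviations 3, 2, 1.5, 1 and 0.2, so the last PC
# has a very small variance (illustrative data only).
n = 500
X = rng.normal(size=(n, 5)) @ np.diag([3.0, 2.0, 1.5, 1.0, 0.2])
Xc = X - X.mean(axis=0)

# PCs from the SVD of the centred data matrix; score columns
# are ordered by decreasing variance.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt.T

# Let y depend only on the lowest-variance PC, plus noise.
y = scores[:, -1] + 0.1 * rng.normal(size=n)

for k in range(scores.shape[1]):
    r = np.corrcoef(scores[:, k], y)[0, 1]
    print(f"PC {k + 1}: variance {scores[:, k].var():6.2f}, "
          f"R^2 with y {r ** 2:5.2f}")
```

Deleting PCs by variance alone would discard PC 5 first, even though it is the only component with any predictive value for y, which is exactly the danger the text describes.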
Thus, the two objectives of deleting PCs with small variances and of retaining PCs that are good predictors of the dependent variable may not be simultaneously achievable.

Some authors (for example, Hocking, 1976; Mosteller and Tukey, 1977, pp. 397–398; Gunst and Mason, 1980, pp. 327–328) argue that the choice of PCs in the regression should be made entirely, or mainly, on the basis of variance reduction but, as can be seen from the examples cited by Jolliffe (1982), such a procedure can be dangerous if low-variance components have predictive value. Jolliffe (1982) notes that examples where this occurs seem to be not uncommon in practice. Berk's (1984) experience with six data sets indicates the opposite conclusion, but several of his data sets are of a special type, in which strong positive correlations exist between all the regressor variables and between the dependent variable and the regressor variables. In such cases the first PC is a (weighted) average of the regressor variables, with all weights positive (see Section 3.8), and as y is also
