
Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)


8.7. Examples of Principal Components in Regression

be acceptable; R² is 0.874 for the full model including all 28 variables, and it is reduced to 0.865 and 0.851, respectively, when five and eight components are deleted.

It is interesting to examine the ordering of size of correlations between y and the PCs, or equivalently the ordering of the individual t-values, which is also given in Table 8.6. It is seen that those PCs with small variances do not necessarily have small correlations with y. The 18th, 19th and 22nd in size of variance are in the first ten in order of importance for predicting y; in particular, the 22nd PC, with variance 0.07, has a highly significant t-value, and should almost certainly be retained.

An approach using stepwise deletion based solely on the size of correlation between y and each PC produces, because of the zero correlations between PCs, the subset whose value of R² is maximized for any given subset size. Far fewer PCs need to be retained using this approach than the 20 to 23 indicated when only small-variance components are rejected. In particular, if the 10 PCs are retained that best predict y, then R² is 0.848, compared with 0.874 for the full model and 0.851 using the first 20 PCs. It would appear that a strategy based solely on size of variance is unsatisfactory.

The two ‘weak MSE’ criteria described in Section 8.2 were also tested, in a limited way, on these data. Because of computational constraints it was not possible to find the overall ‘best’ subset M, so a stepwise approach was adopted, deleting PCs according to either size of variance, or correlation with y. The first criterion selected 22 PCs when selection was based on size of variance, but only 6 PCs when correlation with y was the basis for stepwise selection. The corresponding results for the second (predictive) criterion were 24 and 12 PCs, respectively. It is clear, once again, that selection based solely on order of size of variance retains more components than necessary but may still miss predictive components.

The alternative approach of Lott (1973) was also investigated for these data in a stepwise manner using correlation with y to determine order of selection, with the result that R̄² was maximized for 19 PCs. This is a substantially larger number than indicated by those other methods that use correlation with y to define order of selection and, given the consensus from the other methods, suggests that Lott’s (1973) method is not ideal.

When PCs are found for the augmented set of variables, including y and all the regressor variables, as required for latent root regression, there is remarkably little change in the PCs, apart from the addition of an extra one. All of the coefficients on the regressor variables are virtually unchanged, and the PCs that have largest correlation with y are in very nearly the same order as in the PC regression.

It may be of more interest to select a subset of variables, rather than a subset of PCs, to be included in the regression, and this was also attempted, using various methods, for the household formation data. Variable selection based on PC regression, deleting just one variable at a time before
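
The point made earlier on this page, that ranking PCs by their correlation with y gives the largest R² for every subset size, rests on the PCs being uncorrelated: the R² of a regression on any subset of PCs is then just the sum of the squared correlations between y and the PCs in that subset. The following is a minimal sketch in Python with NumPy, run on synthetic data and not on the household formation data. The function names are hypothetical, and standardizing the regressors (so that the PCs come from the correlation matrix) is an assumption made for illustration.

```python
import numpy as np


def pc_squared_correlations(X, y):
    """Return PC variances (largest first) and the squared correlation of y with each PC.

    Assumption: regressors are standardized, so PCs are those of the correlation matrix.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Xs = Xc / Xc.std(axis=0, ddof=1)
    yc = y - y.mean()
    # Columns of U are the PC scores scaled to unit length; they are
    # mutually orthogonal and have zero mean because Xs is centered.
    U, s, _ = np.linalg.svd(Xs, full_matrices=False)
    variances = s**2 / (n - 1)
    # Correlation of y with each PC reduces to a normalized inner product.
    r = (U.T @ yc) / np.linalg.norm(yc)
    return variances, r**2


def r2_first_m_by_variance(r2, m):
    """R^2 when the m largest-variance PCs are retained."""
    return r2[:m].sum()


def r2_best_m_by_correlation(r2, m):
    """R^2 when the m PCs most correlated with y are retained."""
    return np.sort(r2)[::-1][:m].sum()


# Synthetic illustration only: correlated regressors and a noisy linear response.
rng = np.random.default_rng(0)
n, p = 200, 8
X = rng.standard_normal((n, p)) @ rng.standard_normal((p, p))
y = X @ rng.standard_normal(p) + rng.standard_normal(n)

variances, r2 = pc_squared_correlations(X, y)
for m in range(1, p + 1):
    print(m,
          round(r2_first_m_by_variance(r2, m), 3),
          round(r2_best_m_by_correlation(r2, m), 3))
```

For each subset size m, the second printed column (selection by variance) can never exceed the third (selection by correlation with y), and both reach the full-model R² when all PCs are retained.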
