Jolliffe, I. Principal Component Analysis (2nd ed., Springer, 2002)

188 8. Principal Components in Regression Analysis

collinearities, respectively; those variables in the second group can usually be excluded from the regression analysis, whereas those in the third group certainly cannot. The fourth group simply consists of variables that do not fall into any of the other three groups. These variables may or may not be important in the regression, depending on the purpose of the analysis (for example, prediction or identification of structure), and each must be examined individually (see Baskerville and Toogood (1982) for an example).

A further possibility for variable selection is based on the idea of associating a variable with each of the first few (last few) components and then retaining (deleting) those variables associated with the first few (last few) PCs. This procedure was described in a different context in Section 6.3, and it is clearly essential to modify it in some way for use in a regression context. In particular, when there is not a single clear-cut choice of which variable to associate with a particular PC, the choice should be determined by looking at the strength of the relationships between the candidate variables and the dependent variable. Great care is also necessary to avoid deletion of variables that occur in a predictive multicollinearity.

Daling and Tamura (1970) adopt a modified version of this type of approach. They first delete the last few PCs, then rotate the remaining PCs using varimax, and finally select one variable associated with each of those rotated PCs which has a 'significant' correlation with the dependent variable. The method therefore takes into account the regression context of the problem at the final stage, and the varimax rotation increases the chances of an unambiguous choice of which variable to associate with each rotated PC.
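A selection procedure of this Daling-and-Tamura type can be sketched as follows. The function names are my own, and the simple absolute-correlation threshold stands in for whatever formal 'significance' test one prefers; this is an illustration of the three stages (truncate, rotate, select), not the original authors' implementation.

```python
import numpy as np

def varimax(loadings, n_iter=100, tol=1e-8):
    """Varimax rotation of a p x k loading matrix (standard SVD-based iteration)."""
    p, k = loadings.shape
    R = np.eye(k)
    var_old = 0.0
    for _ in range(n_iter):
        L = loadings @ R
        u, s, vt = np.linalg.svd(
            loadings.T @ (L ** 3 - L * np.mean(L ** 2, axis=0)))
        R = u @ vt
        var_new = s.sum()
        if var_new - var_old < tol:
            break
        var_old = var_new
    return loadings @ R

def select_variables(X, y, n_keep, corr_threshold=0.2):
    """Daling-and-Tamura-style selection sketch:
    1. keep only the first n_keep PCs (deleting the last few),
    2. varimax-rotate the retained loadings,
    3. for each rotated component take its dominant variable and retain it
       only if that variable's correlation with y exceeds a threshold
       (a stand-in for a formal significance test)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    rotated = varimax(vt[:n_keep].T)          # p x n_keep rotated loadings
    selected = set()
    for j in range(rotated.shape[1]):
        var = int(np.argmax(np.abs(rotated[:, j])))   # dominant variable
        r = np.corrcoef(X[:, var], y)[0, 1]
        if abs(r) >= corr_threshold:
            selected.add(var)
    return sorted(selected)
```

The rotation in step 2 is what makes step 3 workable: varimax drives each retained component towards a few large loadings, so the choice of 'dominant variable' per component is less ambiguous than it would be for the unrotated PCs.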
The main drawback of the approach is in its first stage, where deletion of the low-variance PCs may discard substantial information regarding the relationship between y and the predictor variables, as was discussed in Section 8.2.

8.6 Functional and Structural Relationships

In the standard regression framework, the predictor variables are implicitly assumed to be measured without error, whereas any measurement error in the dependent variable y can be included in the error term ε. If all the variables are subject to measurement error the problem is more complicated, even when there is only one predictor variable, and much has been written on how to estimate the so-called functional or structural relationships between the variables in such cases (see, for example, Kendall and Stuart (1979, Chapter 29); Anderson (1984); Cheng and van Ness (1999)). The term 'functional and structural relationships' seems to have gone out of fashion, but there are close connections to the 'errors-in-variables' models from econometrics (Darnell, 1994) and to some of the approaches of Section 9.3.
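The connection to PCA is easiest to see in the single-predictor case: when the measurement errors in x and y have equal variances, the classical estimator of the linear functional relationship is orthogonal (total) least squares, and the fitted line is exactly the first principal component of the centred (x, y) pairs. A minimal sketch under those assumptions (the function name is my own):

```python
import numpy as np

def orthogonal_regression(x, y):
    """Fit y = a + b*x by minimizing perpendicular distances (total least
    squares), the classical estimator of a linear functional relationship
    when x and y carry measurement errors of equal variance.  The fitted
    line is the first principal component of the centred (x, y) data."""
    data = np.column_stack([x, y])
    centred = data - data.mean(axis=0)
    # First right singular vector = direction of maximum variance,
    # i.e. the first PC of the two-dimensional point cloud.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    direction = vt[0]
    b = direction[1] / direction[0]
    a = y.mean() - b * x.mean()
    return a, b
```

Unlike ordinary least squares, whose slope is attenuated towards zero by error in x, this estimator treats the two variables symmetrically; it is consistent only when the error variances are equal (or their ratio is known and the data are rescaled accordingly), which is the standard caveat in the errors-in-variables literature cited above.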
