12.07.2015 Views

Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)

6 Choosing a Subset of Principal Components or Variables

In this chapter two separate, but related, topics are considered, both of which are concerned with choosing a subset of variables. In the first section, the choice to be examined is how many PCs adequately account for the total variation in x. The major objective in many applications of PCA is to replace the p elements of x by a much smaller number m of PCs, which nevertheless discard very little information. It is crucial to know how small m can be taken without serious information loss. Various rules, many ad hoc, have been proposed for determining a suitable value of m, and these are discussed in Section 6.1. Examples of their use are given in Section 6.2.

Using m PCs instead of p variables considerably reduces the dimensionality of the problem when m ≪ p, but usually the values of all p variables are still needed in order to calculate the PCs, as each PC is likely to be a function of all p variables. It might be preferable if, instead of using m PCs, we could use m, or perhaps slightly more, of the original variables, to account for most of the variation in x. The question arises of how to compare the information contained in a subset of variables with that in the full data set. Different answers to this question lead to different criteria and different algorithms for choosing the subset. In Section 6.3 we concentrate on methods that either use PCA to choose the variables or aim to reproduce the PCs in the full data set with a subset of variables, though other variable selection techniques are also mentioned briefly. Section 6.4 gives two examples of the use of variable selection methods.

All of the variable selection methods described in the present chapter are appropriate when the objective is to describe variation within x as well as possible. Variable selection when x is a set of regressor variables
