Jolliffe, I. T. Principal Component Analysis (2nd ed., Springer, 2002)

2. Properties of Population Principal Components

A major argument for using correlation rather than covariance matrices to define PCs is that the results of analyses for different sets of random variables are more directly comparable than for analyses based on covariance matrices. The big drawback of PCA based on covariance matrices is the sensitivity of the PCs to the units of measurement used for each element of x. If there are large differences between the variances of the elements of x, then those variables whose variances are largest will tend to dominate the first few PCs (see, for example, Section 3.3). This may be entirely appropriate if all the elements of x are measured in the same units, for example, if all elements of x are anatomical measurements on a particular species of animal, all recorded in centimetres, say. Even in such examples, arguments can be presented for the use of correlation matrices (see Section 4.1). In practice, it often occurs that different elements of x are completely different types of measurement. Some might be lengths, some weights, some temperatures, some arbitrary scores on a five-point scale, and so on. In such a case, the structure of the PCs will depend on the choice of units of measurement, as is illustrated by the following artificial example.

Suppose that we have just two variables, $x_1$ and $x_2$, and that $x_1$ is a length variable which can equally well be measured in centimetres or in millimetres. The variable $x_2$ is not a length measurement; it might be a weight, in grams, for example. The covariance matrices in the two cases are, respectively,
$$
\Sigma_1 = \begin{pmatrix} 80 & 44 \\ 44 & 80 \end{pmatrix}
\quad\text{and}\quad
\Sigma_2 = \begin{pmatrix} 8000 & 440 \\ 440 & 80 \end{pmatrix}.
$$
The first PC is $0.707x_1 + 0.707x_2$ for $\Sigma_1$ and $0.998x_1 + 0.055x_2$ for $\Sigma_2$, so a relatively minor change in one variable has the effect of changing a PC that gives equal weight to $x_1$ and $x_2$ to a PC that is almost entirely dominated by $x_1$. Furthermore, the first PC accounts for 77.5 percent of the total variation for $\Sigma_1$, but 99.3 percent for $\Sigma_2$.

Figures 2.1 and 2.2 provide another way of looking at the differences between PCs for the two scales of measurement in $x_1$. The plots give contours of constant probability, assuming multivariate normality for x, for $\Sigma_1$ and $\Sigma_2$, respectively. It is clear from these figures that, whereas with $\Sigma_1$ both variables have the same degree of variation, for $\Sigma_2$ most of the variation is in the direction of $x_1$. This is reflected in the first PC, which, from Property G1, is defined by the major axis of the ellipses of constant probability.

This example demonstrates the general behaviour of PCs for a covariance matrix when the variances of the individual variables are widely different; the same type of behaviour is illustrated again for samples in Section 3.3. The first PC is dominated by the variable with the largest variance, the second PC is dominated by the variable with the second largest variance, and so on, with a substantial proportion of the total variation accounted for by just two or three PCs. In other words, the PCs differ little from the original variables rearranged in decreasing order of the size of their variances.
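The figures quoted above can be checked numerically. The following NumPy sketch is not part of the book; the names Sigma1, Sigma2 and the helper first_pc are illustrative. It computes the leading eigenvector and its share of total variance for each covariance matrix, and then verifies that the two matrices share the same correlation matrix, so correlation-based PCs are unaffected by the change of units in $x_1$.

```python
import numpy as np

# Covariance matrices of the artificial example: x1 measured in centimetres
# (Sigma1) versus millimetres (Sigma2); x2 is the same in both cases.
Sigma1 = np.array([[80.0, 44.0],
                   [44.0, 80.0]])
Sigma2 = np.array([[8000.0, 440.0],
                   [440.0, 80.0]])

def first_pc(S):
    """Return the leading eigenvector of S and its share of total variance."""
    eigvals, eigvecs = np.linalg.eigh(S)   # eigenvalues in ascending order
    v = eigvecs[:, -1]                     # eigenvector of the largest eigenvalue
    v = v * np.sign(v[0])                  # fix the sign for readability
    return v, eigvals[-1] / eigvals.sum()

for name, S in [("Sigma1", Sigma1), ("Sigma2", Sigma2)]:
    v, share = first_pc(S)
    print(f"{name}: first PC loadings = {v.round(3)}, "
          f"share of total variance = {share:.1%}")
# Sigma1: first PC loadings = [0.707 0.707], share of total variance = 77.5%
# Sigma2: first PC loadings = [0.998 0.055], share of total variance = 99.3%

# Both covariance matrices imply the same correlation matrix
# (off-diagonal 0.55), so correlation-based PCs do not depend on the units.
d1 = np.sqrt(np.diag(Sigma1)); R1 = Sigma1 / np.outer(d1, d1)
d2 = np.sqrt(np.diag(Sigma2)); R2 = Sigma2 / np.outer(d2, d2)
print(np.allclose(R1, R2))  # True
```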
