12.07.2015 Views

Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)


2.4 Principal Components with Equal and/or Zero Variances

The final, short, section of this chapter discusses two problems that may arise in theory, but are relatively uncommon in practice. In most of this chapter it has been assumed, implicitly or explicitly, that the eigenvalues of the covariance or correlation matrix are all different, and that none of them is zero.

Equality of eigenvalues, and hence equality of variances of PCs, will occur for certain patterned matrices. The effect of this occurrence is that for a group of q equal eigenvalues, the corresponding q eigenvectors span a certain unique q-dimensional space, but, within this space, they are, apart from being orthogonal to one another, arbitrary. Geometrically (see Property G1), what happens for q = 2 or 3 is that the principal axes of a circle or sphere cannot be uniquely defined; a similar problem arises for hyperspheres when q > 3. Thus individual PCs corresponding to eigenvalues in a group of equal eigenvalues are not uniquely defined. A further problem with equal-variance PCs is that statistical inference becomes more complicated (see Section 3.7).

The other complication, variances equal to zero, occurs rather more frequently, but is still fairly unusual. If q eigenvalues are zero, then the rank of Σ is (p − q) rather than p, and this outcome necessitates modifications to the proofs of some properties given in Section 2.1 above. Any PC with zero variance defines an exactly constant linear relationship between the elements of x. If such relationships exist, then they imply that one variable is redundant for each relationship, as its value can be determined exactly from the values of the other variables appearing in the relationship. We could therefore reduce the number of variables from p to (p − q) without losing any information.
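
The zero-variance case can be illustrated numerically. The following is a minimal sketch, not taken from the book: it assumes simulated data with p = 4 variables in which the fourth is an exact linear function of the first two, so that q = 1. The variable names, the random seed, and the tolerance used to decide that an eigenvalue is "zero" are all choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

# Hypothetical data: p = 4 variables with x4 = x1 + 2*x2 exactly,
# so there is q = 1 exact linear relationship among the variables.
n = 200
x123 = rng.normal(size=(n, 3))
x4 = x123[:, 0] + 2.0 * x123[:, 1]
X = np.column_stack([x123, x4])

S = np.cov(X, rowvar=False)            # 4 x 4 sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(S)   # eigenvalues in ascending order

# Floating-point arithmetic never gives an exact zero, so "zero" is
# judged relative to the largest eigenvalue (an assumed tolerance).
tol = 1e-10 * eigvals.max()
q = int(np.sum(eigvals < tol))         # number of zero-variance PCs
rank = int(np.linalg.matrix_rank(S, tol=tol))

# The eigenvector belonging to the zero eigenvalue holds the coefficients
# of the constant relationship a'x = const. Rescale so that the
# coefficient of x4 is 1; a is then close to [-1, -2, 0, 1].
a = eigvecs[:, 0] / eigvecs[3, 0]
relationship = X @ a                   # constant (here zero) for every row

print("zero eigenvalues q:", q)
print("rank of S:", rank)
print("coefficients of the relationship:", np.round(a, 6))
```

Under these assumptions the rank comes out as p − q = 3, and the rescaled coefficient vector recovers x4 − x1 − 2·x2 = 0, which is the sense in which one variable is redundant and could be dropped without losing information.
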
Ideally, exact linear relationships should be spotted before doing a PCA, and the number of variables reduced accordingly. Alternatively, any exact or near-exact linear relationships uncovered by the last few PCs can be used to select a subset of variables that contain most of the information available in all of the original variables. This and related ideas are more relevant to samples than to populations and are discussed further in Sections 3.4 and 6.3.

There will always be the same number of zero eigenvalues for a correlation matrix as for the corresponding covariance matrix, since an exact linear relationship between the elements of x clearly implies an exact linear relationship between the standardized variables, and vice versa. There is not the same equivalence, however, when it comes to considering equal-variance PCs. Equality of some of the eigenvalues in a covariance (correlation) matrix need not imply that any of the eigenvalues of the corresponding correlation (covariance) matrix are equal. A simple example is when the p variables all have equal correlations but unequal variances. If p > 2, then
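
The equal-correlation example can be checked numerically. The sketch below is an illustration under assumed values (p = 4, common correlation 0.5, and standard deviations 1, 2, 3, 4), none of which come from the text. It relies on the standard result that a matrix with unit diagonal and common off-diagonal value rho has eigenvalue 1 − rho repeated p − 1 times, and it also shows that any vector orthogonal to the vector of ones is an eigenvector for that repeated eigenvalue, which is the arbitrariness described earlier in this section.

```python
import numpy as np

p, rho = 4, 0.5   # assumed dimension and common correlation

# Correlation matrix with all off-diagonal elements equal to rho
R = (1 - rho) * np.eye(p) + rho * np.ones((p, p))

# Assumed unequal standard deviations; Sigma = D R D with D = diag(sd)
sd = np.array([1.0, 2.0, 3.0, 4.0])
Sigma = np.outer(sd, sd) * R

eig_R = np.linalg.eigvalsh(R)          # ascending order
eig_Sigma = np.linalg.eigvalsh(Sigma)  # ascending order

print("correlation eigenvalues:", np.round(eig_R, 6))
print("covariance eigenvalues: ", np.round(eig_Sigma, 6))

# Smallest gap between neighbouring covariance eigenvalues:
# a clearly positive value means no two of them coincide.
min_gap_Sigma = float(np.min(np.diff(eig_Sigma)))

# Arbitrariness within the equal-eigenvalue space of R: take any vector,
# remove its component along the vector of ones, and the result is
# still an eigenvector of R with eigenvalue 1 - rho.
rng = np.random.default_rng(1)         # arbitrary seed
v = rng.normal(size=p)
v -= v.mean()                          # now orthogonal to the ones vector
is_eigenvector = bool(np.allclose(R @ v, (1 - rho) * v))
print("random vector in the space is an eigenvector:", is_eigenvector)
```

With these values the correlation matrix has p − 1 = 3 equal eigenvalues, while the eigenvalues of the corresponding covariance matrix are all distinct, so equality of eigenvalues in one matrix does not carry over to the other.
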
