12.07.2015 Views

Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)


well separated. A similar analysis can be carried out if $\lambda_k$ is increased by an amount $\varepsilon$, in which case the stability of $z_k$ depends on the separation between $\lambda_k$ and $\lambda_{k-1}$. Thus, the stability of a PC depends on the separation of its variance from the variances of adjacent PCs, an unsurprising result, especially considering the discussion of ‘influence’ in Section 10.2.

The ideas in Section 10.2 are, as here, concerned with how perturbations affect $\alpha_k$, but they differ in that the perturbations are deletions of individual observations, rather than hypothetical changes in $\lambda_k$. Nevertheless, we find in both cases that the changes in $\alpha_k$ are largest if $\lambda_k$ is close to $\lambda_{k+1}$ or $\lambda_{k-1}$ (see equations (10.2.4) and (10.2.5)).

As an example of the use of the sample analogue of the expression (10.3.1), consider the PCs presented in Table 3.2. Rounding the coefficients in the PCs to the nearest 0.2 gives a change of 9% in $l_1$ and changes the direction of $a_1$ through an angle of about 8° (see Section 11.3). We can use (10.3.1) to find the maximum angular change in $a_1$ that can occur if $l_1$ is decreased by 9%. The maximum angle is nearly 25°, so that rounding the coefficients has certainly not moved $a_1$ in the direction of maximum sensitivity.

The eigenvalues $l_1$, $l_2$ and $l_3$ in this example are 2.792, 1.532 and 1.250, respectively, so that the separation between $l_1$ and $l_2$ is much greater than that between $l_2$ and $l_3$.
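The arithmetic behind the angles quoted in this example can be sketched in a few lines of Python. Expression (10.3.1) is not reproduced in this excerpt, so the sketch assumes it has the form cos θ = (1 + ε/(λ_k − λ_{k+1}))^(−1/2), where ε is the decrease in the eigenvalue and the denominator is the gap to the next eigenvalue. That form, and the function name `max_angle_deg`, are assumptions made for illustration; only the eigenvalues and the 9% figure come from the text.

```python
import math


def max_angle_deg(eps, gap):
    """Maximum angular change in degrees.

    ASSUMPTION: (10.3.1) is taken to be
        cos(theta) = (1 + eps / gap) ** -0.5,
    with eps the decrease in the eigenvalue and gap the separation
    from the next smaller eigenvalue. This form is not in the excerpt.
    """
    return math.degrees(math.acos((1.0 + eps / gap) ** -0.5))


# Eigenvalues quoted in the text for the Table 3.2 example.
l1, l2, l3 = 2.792, 1.532, 1.250
pct = 0.09  # the 9% decrease discussed in the text

# a_1: l1 decreased by 9%, gap is l1 - l2.
angle_a1 = max_angle_deg(pct * l1, l1 - l2)
# a_2: same percentage decrease applied to l2, gap is l2 - l3.
angle_a2_pct = max_angle_deg(pct * l2, l2 - l3)
# a_2: same absolute decrease as was applied to l1.
angle_a2_abs = max_angle_deg(pct * l1, l2 - l3)

print(f"gap l1-l2 = {l1 - l2:.3f}, gap l2-l3 = {l2 - l3:.3f}")
print(f"a_1, 9% decrease in l1:        {angle_a1:.1f} degrees")
print(f"a_2, 9% decrease in l2:        {angle_a2_pct:.1f} degrees")
print(f"a_2, same absolute decrease:   {angle_a2_abs:.1f} degrees")
```

Under the assumed form the three angles come out at roughly 24°, 35° and 43°, close to the rounded figures of 25°, 35° and 44° given in the text; the small differences are plausibly due to the 9% itself being a rounded value.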
The potential change in $a_2$ for a given decrease in $l_2$ is therefore larger than that for $a_1$, given a corresponding decrease in $l_1$. In fact, the same percentage decrease in $l_2$ as that investigated above for $l_1$ leads to a maximum angular change of 35°; if the change is made the same in absolute (rather than percentage) terms, then the maximum angle becomes 44°.

10.4 Robust Estimation of Principal Components

It has been noted several times in this book that for PCA’s main (descriptive) rôle, the form of the distribution of x is usually not very important. The main exception to this statement is seen in the case where outliers may occur. If the outliers are, in fact, influential observations, they will have a disproportionate effect on the PCs, and if PCA is used blindly in this case, without considering whether any observations are influential, then the results may be largely determined by such observations. For example, suppose that all but one of the n observations lie very close to a plane within p-dimensional space, so that there are two dominant PCs for these (n − 1) observations. If the distance of the remaining observation from the plane is greater than the variation of the (n − 1) observations within the plane, then the first component for the n observations will be determined solely by the direction of the single outlying observation from the plane. This, incidentally, is a case where the first PC, rather than last few, will
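The single-outlier scenario described above can be illustrated with a small simulation. This is a minimal sketch, not an analysis from the book: the dimensions, the noise levels, the outlier distance of 20 and the helper name `first_pc` are all illustrative assumptions. All but one of the points lie close to the plane spanned by the first two coordinate axes, and the remaining point is placed far from the plane along the third axis.

```python
import numpy as np


def first_pc(X):
    """Unit eigenvector of the sample covariance matrix of X
    (rows are observations) belonging to the largest eigenvalue."""
    S = np.cov(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(S)  # eigenvalues in ascending order
    return eigvecs[:, -1]


rng = np.random.default_rng(0)
n, p = 50, 5  # illustrative sizes, not taken from the text

# Points close to the plane spanned by axes 0 and 1:
# unit-scale variation within the plane, tiny noise elsewhere.
X = rng.normal(scale=0.01, size=(n, p))
X[:, :2] = rng.normal(scale=1.0, size=(n, 2))

# Replace the last observation by a single outlier lying
# far from the plane along axis 2.
X[-1] = 0.0
X[-1, 2] = 20.0

a1_all = first_pc(X)         # all n observations
a1_clean = first_pc(X[:-1])  # the (n - 1) observations near the plane

print("first PC with outlier:   ", np.round(a1_all, 3))
print("first PC without outlier:", np.round(a1_clean, 3))
```

With the outlier included, the first PC points almost exactly along axis 2, the direction of the outlier from the plane; with it removed, the first PC lies within the plane and has a negligible coefficient on axis 2. The sign of each eigenvector is arbitrary, so only the absolute values of the coefficients are meaningful here.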
