12.07.2015 Views

Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)



10.1. Detection of Outliers Using Principal Components

possible test statistic, $d_{1i}^2$, suggested by Rao (1964) and discussed further by Gnanadesikan and Kettenring (1972), is the sum of squares of the values of the last $q$ ($< p$) PCs, that is

$$d_{1i}^2 = \sum_{k=p-q+1}^{p} z_{ik}^2, \tag{10.1.1}$$

where $z_{ik}$ is the value of the $k$th PC for the $i$th observation. The statistics $d_{1i}^2$, $i = 1, 2, \ldots, n$ should, approximately, be independent observations from a gamma distribution if there are no outliers, so that a gamma probability plot with suitably estimated shape parameter may expose outliers (Gnanadesikan and Kettenring, 1972).

A possible criticism of the statistic $d_{1i}^2$ is that it still gives insufficient weight to the last few PCs, especially if $q$, the number of PCs contributing to $d_{1i}^2$, is close to $p$. Because the PCs have decreasing variance with increasing index, the values of $z_{ik}^2$ will typically become smaller as $k$ increases, and $d_{1i}^2$ therefore implicitly gives the PCs decreasing weight as $k$ increases. This effect can be severe if some of the PCs have very small variances, and this is unsatisfactory as it is precisely the low-variance PCs which may be most effective in determining the presence of certain types of outlier.

An alternative is to give the components equal weight, and this can be achieved by replacing $z_{ik}$ by $z_{ik}^* = z_{ik}/l_k^{1/2}$, where $l_k$ is the variance of the $k$th sample PC. In this case the sample variances of the $z_{ik}^*$ will all be equal to unity. Hawkins (1980, Section 8.2) justifies this particular renormalization of the PCs by noting that the renormalized PCs, in reverse order, are the uncorrelated linear functions $\tilde{\mathbf{a}}_p'\mathbf{x}, \tilde{\mathbf{a}}_{p-1}'\mathbf{x}, \ldots, \tilde{\mathbf{a}}_1'\mathbf{x}$ of $\mathbf{x}$ which, when constrained to have unit variances, have coefficients $\tilde{a}_{jk}$ that successively maximize the criterion $\sum_{j=1}^{p} \tilde{a}_{jk}^2$, for $k = p, (p-1), \ldots, 1$. Maximization of this criterion is desirable because, given the fixed-variance property, linear functions that have large absolute values for their coefficients are likely to be more sensitive to outliers than those with small coefficients (Hawkins, 1980, Section 8.2).

It should be noted that when $q = p$, the statistic

$$d_{2i}^2 = \sum_{k=p-q+1}^{p} \frac{z_{ik}^2}{l_k} \tag{10.1.2}$$

becomes $\sum_{k=1}^{p} z_{ik}^2/l_k$, which is simply the (squared) Mahalanobis distance $D_i^2$ between the $i$th observation and the sample mean, defined as $D_i^2 = (\mathbf{x}_i - \bar{\mathbf{x}})'\mathbf{S}^{-1}(\mathbf{x}_i - \bar{\mathbf{x}})$. This follows because $\mathbf{S} = \mathbf{A}\mathbf{L}^2\mathbf{A}'$ where, as usual, $\mathbf{L}^2$ is the diagonal matrix whose $k$th diagonal element is $l_k$, and $\mathbf{A}$ is the matrix whose $(j, k)$th element is $a_{jk}$. Furthermore,

$$\mathbf{S}^{-1} = \mathbf{A}\mathbf{L}^{-2}\mathbf{A}', \qquad \mathbf{x}_i' = \mathbf{z}_i'\mathbf{A}', \qquad \bar{\mathbf{x}}' = \bar{\mathbf{z}}'\mathbf{A}',$$
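As a rough illustration of the statistics in (10.1.1) and (10.1.2), the following NumPy sketch computes both from the last q sample PCs and compares the q = p case with the squared Mahalanobis distance. It is not code from the book: the function name `pc_outlier_statistics`, the simulated data and the planted outlier are all assumptions made for the example. It also assumes that the PC scores are measured about the sample mean and that S uses the divisor n − 1.

```python
import numpy as np


def pc_outlier_statistics(X, q):
    """Return (d1_sq, d2_sq, D_sq) for each row of X.

    d1_sq: sum of squares of the last q PC scores, as in (10.1.1)
    d2_sq: the same sum with each term divided by l_k, as in (10.1.2)
    D_sq:  squared Mahalanobis distance from the sample mean
    Hypothetical helper written for this illustration only.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    if not 1 <= q <= p:
        raise ValueError("q must satisfy 1 <= q <= p")

    Xc = X - X.mean(axis=0)            # centre the data
    S = np.cov(Xc, rowvar=False)       # sample covariance (divisor n - 1)

    # eigh returns eigenvalues in ascending order; reverse so that
    # l[0] >= l[1] >= ... matches the usual PC ordering.
    l, A = np.linalg.eigh(S)
    l, A = l[::-1], A[:, ::-1]

    Z = Xc @ A                         # centred PC scores z_ik
    last = slice(p - q, p)             # indices of the last q PCs

    d1_sq = np.sum(Z[:, last] ** 2, axis=1)
    d2_sq = np.sum(Z[:, last] ** 2 / l[last], axis=1)
    D_sq = np.einsum("ij,jk,ik->i", Xc, np.linalg.inv(S), Xc)
    return d1_sq, d2_sq, D_sq


# Simulated data (an assumption): x1 and x2 are almost identical, x3 is noise.
rng = np.random.default_rng(0)
n = 200
t = rng.normal(size=n)
X = np.column_stack([t, t + 0.05 * rng.normal(size=n), rng.normal(size=n)])

# Planted outlier: unremarkable on each variable, but it breaks x1 ~ x2.
X[0] = [2.0, -2.0, 0.0]

d1_sq, d2_sq, _ = pc_outlier_statistics(X, q=1)
print("largest d1^2 at row", int(np.argmax(d1_sq)))
print("largest d2^2 at row", int(np.argmax(d2_sq)))

# With q = p the rescaled statistic should match the Mahalanobis distance.
_, d2_all, D_sq = pc_outlier_statistics(X, q=X.shape[1])
print("d2^2 equals D^2 when q = p:", bool(np.allclose(d2_all, D_sq)))
```

The planted observation is placed against the strong correlation between the first two variables, which is the kind of outlier the low-variance PCs are expected to pick up. With q = 1 the two statistics differ only by the constant factor l_p, so they rank the observations identically; the difference in weighting only matters once q > 1.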
