12.07.2015 Views

Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)



10. Outlier Detection, Influential Observations and Robust Estimation (p. 238)

so $D_i^2$ is

$$
\begin{aligned}
(x_i - \bar{x})' S^{-1} (x_i - \bar{x}) &= (z_i - \bar{z})' A' A L^{-2} A' A (z_i - \bar{z}) \\
&= (z_i - \bar{z})' L^{-2} (z_i - \bar{z}) \\
&= \sum_{k=1}^{p} \frac{z_{ik}^2}{l_k},
\end{aligned}
$$

where $z_{ik}$ is the $k$th PC score for the $i$th observation, measured about the mean of the scores for all observations. Flury (1997, pp. 609-610) suggests that a plot of $(D_i^2 - d_{2i}^2)$ versus $d_{2i}^2$ will reveal observations that are not well represented by the first $(p - q)$ PCs. Such observations are potential outliers.

Gnanadesikan and Kettenring (1972) consider also the statistic

$$
d_{3i}^2 = \sum_{k=1}^{p} l_k z_{ik}^2, \qquad (10.1.3)
$$

which emphasizes observations that have a large effect on the first few PCs, and is equivalent to $(x_i - \bar{x})' S (x_i - \bar{x})$. As stated earlier, the first few PCs are useful in detecting some types of outlier, and $d_{3i}^2$ emphasizes such outliers. However, we repeat that such outliers are often detectable from plots of the original variables, unlike the outliers exposed by the last few PCs. Various types of outlier, including some that are extreme with respect to both the first few and the last few PCs, are illustrated in the examples given later in this section.

Hawkins (1974) prefers to use $d_{2i}^2$ with $q$
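
The identities in this passage can be checked numerically. The following is a minimal NumPy sketch, not taken from the book. The function name `pc_outlier_statistics` is hypothetical. It assumes that $S$ is the sample covariance matrix with divisor $n - 1$, that the PCs are ordered by decreasing variance $l_k$, and that $d_{2i}^2$ is the sum of $z_{ik}^2 / l_k$ over the last $q$ PCs. That definition is not part of this excerpt and is assumed here.

```python
import numpy as np


def pc_outlier_statistics(X, q):
    """Return (D2, d2, d3), one value per observation (row of X).

    D2 : Mahalanobis-type distance, sum over all p PCs of z_ik^2 / l_k
    d2 : same sum restricted to the last q PCs (assumed definition)
    d3 : sum over all p PCs of l_k * z_ik^2, as in (10.1.3)
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    Xc = X - X.mean(axis=0)              # centre each variable
    S = np.cov(X, rowvar=False)          # sample covariance, divisor n - 1

    # eigh returns eigenvalues in ascending order; reorder so that
    # the first PC has the largest variance.
    l, A = np.linalg.eigh(S)
    order = np.argsort(l)[::-1]
    l, A = l[order], A[:, order]

    Z = Xc @ A                           # PC scores about their mean
    D2 = np.sum(Z**2 / l, axis=1)
    d2 = np.sum(Z[:, p - q:]**2 / l[p - q:], axis=1)
    d3 = np.sum(l * Z**2, axis=1)
    return D2, d2, d3


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic correlated data, purely for illustration.
    X = rng.normal(size=(50, 4)) @ rng.normal(size=(4, 4))
    D2, d2, d3 = pc_outlier_statistics(X, q=2)

    Xc = X - X.mean(axis=0)
    S = np.cov(X, rowvar=False)
    # D_i^2 should equal (x_i - xbar)' S^{-1} (x_i - xbar).
    print(np.allclose(D2, np.einsum("ij,jk,ik->i", Xc, np.linalg.inv(S), Xc)))
    # d_{3i}^2 should equal (x_i - xbar)' S (x_i - xbar).
    print(np.allclose(d3, np.einsum("ij,jk,ik->i", Xc, S, Xc)))
```

Plotting `D2 - d2` against `d2` would give the kind of display attributed to Flury (1997) above, under the same assumed definition of $d_{2i}^2$.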
