12.07.2015 Views

Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)



10. Outlier Detection, Influential Observations and Robust Estimation

Figure 10.2. The data set of Figure 10.1, plotted with respect to its PCs.

As well as simple plots of observations with respect to PCs, it is possible to set up more formal tests for outliers based on PCs, assuming that the PCs are normally distributed. Strictly, this assumes that x has a multivariate normal distribution but, because the PCs are linear functions of p random variables, an appeal to the Central Limit Theorem may justify approximate normality for the PCs even when the original variables are not normal. A battery of tests is then available for each individual PC, namely those for testing for the presence of outliers in a sample of univariate normal data (see Hawkins (1980, Chapter 3) and Barnett and Lewis (1994, Chapter 6)). The latter reference describes 47 tests for univariate normal data, plus 23 for univariate gamma distributions and 17 for other distributions. Other tests, which combine information from several PCs rather than examining one at a time, are described by Gnanadesikan and Kettenring (1972) and Hawkins (1974), and some of these will now be discussed. In particular, we define four statistics, which are denoted $d_{1i}^2$, $d_{2i}^2$, $d_{3i}^2$ and $d_{4i}$.

The last few PCs are likely to be more useful than the first few in detecting outliers that are not apparent from the original variables, so one
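
The idea of examining individual PCs for outliers, with attention to the last few, can be illustrated with a minimal sketch. It is not one of the formal tests cited above, nor any of the four statistics: it simply standardizes each PC score by the sample standard deviation of that PC and flags observations whose standardized score on any of the last few PCs exceeds a cut-off. The function names `pc_scores` and `flag_outliers`, the parameters `n_last` and `threshold`, the cut-off of 3, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def pc_scores(X):
    """Return PC scores and PC sample variances for an (n, p) data matrix."""
    Xc = X - X.mean(axis=0)  # centre each variable
    # Rows of Vt are the PC coefficient vectors, ordered by decreasing variance.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt.T                      # observations in PC coordinates
    variances = s**2 / (X.shape[0] - 1)     # sample variance of each PC
    return scores, variances

def flag_outliers(X, n_last=1, threshold=3.0):
    """Indices of observations that are extreme on any of the last n_last PCs.

    The threshold is an ad hoc screening cut-off (an assumption of this
    sketch), not a critical value from a formal outlier test.
    """
    scores, variances = pc_scores(X)
    z = scores / np.sqrt(variances)         # standardized PC scores
    extreme = np.abs(z[:, -n_last:]) > threshold
    return np.where(extreme.any(axis=1))[0]

# Simulated data (illustrative only): two strongly correlated variables,
# plus one observation that breaks the correlation without being extreme
# on either variable taken separately.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.1 * rng.normal(size=200)
X = np.vstack([np.column_stack([x1, x2]), [1.5, -1.5]])

flagged = flag_outliers(X, n_last=1)
print(flagged)
```

In this simulated example the added observation lies well within the range of each original variable, so it would not stand out in univariate checks; it is its score on the last PC, which has small variance, that is unusually large.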
