Jolliffe, I. Principal Component Analysis (2nd ed., Springer, 2002)
10. Outlier Detection, Influential Observations and Robust Estimation

detect the outlier (see Section 10.1), but if the distance from the plane is larger than distances within the plane, then the observation is likely to 'stick out' on at least one of the original variables as well.

To avoid such problems, it is possible to use robust estimation of the covariance or correlation matrix, and hence of the PCs. Such estimation downweights or discards the effect of any outlying observations. Five robust estimators using three different approaches are investigated by Devlin et al. (1981). The first approach robustly estimates each element of the covariance or correlation matrix separately and then 'shrinks' the elements to achieve positive-definiteness if this is not achieved with the initial estimates. The second type of approach involves robust regression of x_j on (x_1, x_2, ..., x_{j-1}) for j = 2, 3, ..., p. An illustration of the robust regression approach is presented for a two-variable example by Cleveland and Guarino (1976). Finally, the third approach has three variants; in one (multivariate trimming) the observations with the largest Mahalanobis distances D_i from a robust estimate of the mean of x are discarded, and in the other two they are downweighted. One of these variants is used by Coleman (1985) in the context of quality control. Both of the downweighting schemes are examples of so-called M-estimators proposed by Maronna (1976). One is the maximum likelihood estimator for a p-variate elliptical t distribution, which has longer tails than the multivariate normal distribution for which the usual non-robust estimate is optimal. The second downweighting scheme uses Huber weights (Huber, 1964, 1981), which are constant for values of D_i up to a threshold D_i*, and equal to D_i*/D_i thereafter.

All but the first of these five estimates involve iteration; full details are given by Devlin et al.
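The Huber weighting rule just described — full weight up to a threshold, weight inversely proportional to distance beyond it — can be sketched in a few lines. This is a minimal illustration; the threshold value used below is an arbitrary choice for the example, not one prescribed by the text:

```python
import numpy as np

def huber_weights(D, D_star):
    """Huber-type weights: 1 for distances D_i up to the threshold
    D_star, and D_star / D_i beyond it, so extreme observations are
    downweighted in proportion to their distance."""
    D = np.asarray(D, dtype=float)
    return np.where(D <= D_star, 1.0, D_star / D)

# Distances 1 and 2 fall below the (illustrative) threshold 2.5 and
# keep full weight; 5 and 10 are downweighted to 0.5 and 0.25.
print(huber_weights([1.0, 2.0, 5.0, 10.0], 2.5))
```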
(1981), who also show that the usual estimator of the covariance or correlation matrix can lead to misleading PCs if outlying observations are included in the analysis. Of the five possible robust alternatives which they investigate, only one, that based on robust regression, is clearly dominated by other methods, and each of the remaining four may be chosen in some circumstances.

Robust estimation of covariance matrices, using an iterative procedure based on downweighting observations with large Mahalanobis distance from the mean, also based on M-estimation, was independently described by Campbell (1980). He uses Huber weights and, as an alternative, a so-called redescending estimate in which the weights decrease more rapidly than Huber weights for values of D_i larger than the threshold D_i*. Campbell (1980) notes that, as a by-product of robust estimation, the weights given to each data point give an indication of potential outliers. As the weights are non-increasing functions of Mahalanobis distance, this procedure is essentially using the statistic d^2_{2i}, defined in Section 10.1, to identify outliers, except that the mean and covariance matrix of x are estimated robustly in the present case.

Other methods for robustly estimating covariance or correlation matrices have been suggested since Devlin et al.'s (1981) work. For example, Mehro-
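The iterative downweighting procedure of the kind described by Devlin et al. (1981) and Campbell (1980) can be sketched as follows. This is an illustrative simplification assuming Huber weights, a fixed threshold, and a fixed iteration count; actual implementations differ in the weight function, the threshold choice, and the convergence criterion:

```python
import numpy as np

def robust_mean_cov(X, D_star=3.0, n_iter=20):
    """Iteratively reweighted estimate of the mean and covariance.
    Observations with large Mahalanobis distance from the current
    robust mean receive Huber weights below 1; the final weights
    double as an indicator of potential outliers (cf. Campbell, 1980)."""
    X = np.asarray(X, dtype=float)
    w = np.ones(X.shape[0])
    for _ in range(n_iter):
        mu = np.average(X, axis=0, weights=w)
        diff = X - mu
        cov = (w[:, None] * diff).T @ diff / w.sum()
        # Mahalanobis distances D_i under the current estimates.
        D = np.sqrt(np.einsum('ij,jk,ik->i', diff, np.linalg.inv(cov), diff))
        # Huber weights: constant up to D_star, D_star / D_i beyond it.
        w = np.where(D <= D_star, 1.0, D_star / D)
    return mu, cov, w

# A cloud of well-behaved points plus one gross outlier: the
# outlier's final weight is far smaller than everyone else's.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
X[0] = [10.0, 10.0]
mu, cov, w = robust_mean_cov(X)
print(w[0] < 0.5, w[1:].min() > w[0])
```

The by-product Campbell (1980) points to is visible directly: sorting the returned weights flags the downweighted (potentially outlying) observations without any separate test.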
