Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)

10.1. Detection of Outliers Using Principal Components

outliers is proposed by Gabriel and Zamir (1979). This proposal uses the idea of weighted PCs, and will be discussed further in Section 14.2.1.

Projection pursuit was introduced in Section 9.2.2 as a family of techniques for finding clusters, but it can equally well be used to look for outliers. PCA is not specifically designed to find dimensions which best display either clusters or outliers. As with clusters, optimizing a criterion other than variance can give better low-dimensional displays in which to identify outliers. As noted in Section 9.2.2, projection pursuit techniques find directions in p-dimensional space that optimize some index of 'interestingness,' where 'uninteresting' corresponds to multivariate normality and 'interesting' implies some sort of 'structure,' such as clusters or outliers. Some indices are good at finding clusters, whereas others are better at detecting outliers (see Friedman (1987); Huber (1985); Jones and Sibson (1987)). Sometimes the superiority in finding outliers has been observed empirically; in other cases the criterion to be optimized has been chosen with outlier detection specifically in mind. For example, if outliers rather than clusters are of interest, Caussinus and Ruiz (1990) suggest replacing the quantity in equation (9.2.1) by

\[
\hat{\Gamma} = \frac{\sum_{i=1}^{n} K\left[\|x_i - x^*\|^2_{S^{-1}}\right](x_i - x^*)(x_i - x^*)'}{\sum_{i=1}^{n} K\left[\|x_i - x^*\|^2_{S^{-1}}\right]}, \qquad (10.1.5)
\]

where $x^*$ is a robust estimate of the centre of the $x_i$ such as a multivariate median, and $K[\cdot]$, $S$ are defined as in (9.2.1). Directions given by the first few eigenvectors of $S\hat{\Gamma}^{-1}$ are used to identify outliers. Further theoretical details and examples of the technique are given by Caussinus and Ruiz-Gazen (1993, 1995).
A mixture model is assumed (see Section 9.2.3) in which one element in the mixture corresponds to the bulk of the data, and the other elements have small probabilities of occurrence and correspond to different types of outliers. In Caussinus et al. (2001) it is assumed that if there are q types of outlier, then q directions are likely needed to detect them. The bulk of the data is assumed to have a spherical distribution, so there is no single (q + 1)th direction corresponding to these data. The question of an appropriate choice for q needs to be considered. Using asymptotic results for the null (one-component mixture) distribution of a matrix which is closely related to $S\hat{\Gamma}^{-1}$, Caussinus et al. (2001) use simulation to derive tables of critical values for its eigenvalues. These tables can then be used to assess how many eigenvalues are 'significant,' and hence decide on an appropriate value for q. The use of the tables is illustrated by examples. The choice of the value of β is discussed by Caussinus and Ruiz-Gazen (1995) and values in the range 0.1 to 0.5 are recommended. Caussinus et al. (2001) use somewhat smaller values in constructing their tables, which are valid for values of β in the range 0.01 to 0.1. Penny and Jolliffe (2001) include Caussinus and Ruiz-Gazen's technique in a comparative
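The computation behind equation (10.1.5) and the subsequent eigen-analysis of $S\hat{\Gamma}^{-1}$ can be sketched as follows. This is only an illustrative sketch: the exponential form of the kernel $K$ (with the recommended β from the 0.1 to 0.5 range) and the coordinatewise median standing in for a multivariate median are assumptions, not the definitive choices of Caussinus and Ruiz-Gazen.

```python
import numpy as np

def outlier_directions(X, beta=0.2, n_dirs=2):
    """Sketch of the Caussinus-Ruiz-Gazen style outlier-detection
    directions: eigenvectors of S * Gamma_hat^{-1}, with Gamma_hat
    the weighted scatter matrix of equation (10.1.5)."""
    n, p = X.shape
    # Robust centre x*: coordinatewise median used as a proxy
    # for a multivariate median (an assumption of this sketch).
    x_star = np.median(X, axis=0)
    Xc = X - x_star
    S = np.cov(X, rowvar=False)            # ordinary covariance matrix S
    S_inv = np.linalg.inv(S)
    # Squared distances ||x_i - x*||^2 in the S^{-1} metric.
    d2 = np.einsum('ij,jk,ik->i', Xc, S_inv, Xc)
    # Decreasing kernel K[.]; the exponential form is an assumption.
    w = np.exp(-beta * d2 / 2.0)
    # Gamma_hat: weighted scatter matrix, equation (10.1.5).
    Gamma = (Xc * w[:, None]).T @ Xc / w.sum()
    # Eigen-decomposition of S Gamma^{-1}; the product of two positive
    # definite matrices has real positive eigenvalues, so take real parts
    # and sort in decreasing order. Outliers show up on the leading
    # directions.
    vals, vecs = np.linalg.eig(S @ np.linalg.inv(Gamma))
    order = np.argsort(vals.real)[::-1]
    return vals.real[order][:n_dirs], vecs.real[:, order[:n_dirs]]
```

Projecting the centred data onto the returned directions then gives low-dimensional displays in which the outlying observations should separate from the bulk of the data; downweighting distant points makes $\hat{\Gamma}$ reflect the bulk, so directions along which outliers lie inflate $S$ relative to $\hat{\Gamma}$.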
