
Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)

10.4. Robust Estimation of Principal Components

(1996). When the unweighted version of the fixed effects model of Section 3.9 assumes a multivariate normal distribution for its error term $e_i$, maximum likelihood estimation of the model leads to the usual PCs. However, if the elements of $e_i$ are instead assumed to be independent Laplace random variables with probability density functions $f(e_{ij}) = \frac{1}{2\sigma} \exp\left(-\frac{1}{\sigma}|e_{ij}|\right)$, maximum likelihood estimation requires the minimization of $\sum_{i=1}^{n} \sum_{j=1}^{p} |x_{ij} - z_{ij}|$, leading to an $L_1$-norm variant of PCA. Baccini et al. (1996) show that the $L_1$-norm PCs can be estimated using a canonical correlation analysis (see Section 9.3) of the original variables and a ranked version of the variables.

Although by no means a robust method, it seems natural to note here the minimax approach to component analysis proposed by Bargmann and Baker (1977). Whereas PCA minimizes the sum of squared discrepancies between $x_{ij}$ and a rank-$m$ approximation, and Baccini et al. (1996) minimize the sum of absolute discrepancies, Bargmann and Baker (1977) suggest minimizing the maximum discrepancy. They provide arguments to justify the procedure, but it is clearly sensitive to extreme observations and could even be thought of as ‘anti-robust.’

A different, but related, topic is robust estimation of the distribution of the PCs, their coefficients and their variances, rather than robust estimation of the PCs themselves. It was noted in Section 3.6 that this can be done using bootstrap estimation (Diaconis and Efron, 1983).
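As a rough illustration of the bootstrap idea just mentioned, the following minimal sketch resamples observations with replacement and recomputes the PC variances (eigenvalues of the sample covariance matrix) for each resample. The data, the function names `pc_variances` and `bootstrap_pc_variances`, the number of resamples and the percentile interval are all assumptions made for illustration; they are not taken from Diaconis and Efron (1983).

```python
import numpy as np

def pc_variances(X):
    """Eigenvalues of the sample covariance matrix, largest first."""
    S = np.cov(X, rowvar=False)
    return np.linalg.eigvalsh(S)[::-1]

def bootstrap_pc_variances(X, n_boot=500, seed=0):
    """Recompute the PC variances on n_boot resamples of the rows of X."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    out = np.empty((n_boot, p))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # sample n rows with replacement
        out[b] = pc_variances(X[idx])
    return out

# Hypothetical data: 100 observations on 3 correlated variables.
rng = np.random.default_rng(1)
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.5]])
X = rng.normal(size=(100, 3)) @ mixing

boot = bootstrap_pc_variances(X)
# Simple percentile intervals for each PC variance (one of several possible choices).
lower, upper = np.percentile(boot, [2.5, 97.5], axis=0)
```

Each column of `boot` is an estimated sampling distribution for one PC variance, so a histogram of, say, `boot[:, 0]` is one way to inspect its shape.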
The ‘shape’ of the estimated distributions should also give some indication of whether any highly influential observations are present in a data set (the distributions may be multimodal, corresponding to the presence or absence of the influential observations in each sample from the data set), although the method will not directly identify such observations.

The ideas of robust estimation and influence are brought together in Jaupi and Saporta (1993), Croux and Haesbroeck (2000) and Croux and Ruiz-Gazen (2001). Given a robust PCA, it is of interest to examine influence functions for the results of such an analysis. Jaupi and Saporta (1993) investigate influence for M-estimators, and Croux and Haesbroeck (2000) extend these results to a much wider range of robust PCAs for both covariance and correlation matrices. Croux and Ruiz-Gazen (2001) derive influence functions for Li and Chen’s (1985) projection pursuit-based robust PCA. Croux and Haesbroeck (2000) also conduct a simulation study using the same structure as Devlin et al. (1981), but with a greater range of robust procedures included. They recommend the S-estimator, described in Rousseeuw and Leroy (1987, p. 263), for practical use.

Naga and Antille (1990) explore the stability of PCs derived from robust estimators of covariance matrices, using the measure of stability defined by Daudin et al. (1988) (see Section 10.3). PCs derived from covariance estimators based on minimum variance ellipsoids perform poorly with respect to this type of stability on the data sets considered by Naga and Antille (1990), but those associated with M-estimated covariance matrices are much better.
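To make the idea of PCs derived from an M-estimated covariance matrix concrete, here is a minimal sketch using a Huber-type weight function in an iteratively reweighted scheme. The weight function, the tuning constant `c = 2.5`, the omission of a consistency factor (which rescales the eigenvalues but leaves the PC directions unchanged), the function names and the contaminated data set are all assumptions chosen for illustration. The sketch is not claimed to reproduce the estimators studied in any of the papers cited above.

```python
import numpy as np

def huber_m_covariance(X, c=2.5, n_iter=200, tol=1e-10):
    """Huber-type M-estimate of location and scatter (illustrative choices).

    Assumes c**2 > p so that the reweighting iteration has a fixed point.
    """
    n, p = X.shape
    mu = np.median(X, axis=0)          # robust starting location
    sigma = np.cov(X, rowvar=False)    # classical starting scatter
    for _ in range(n_iter):
        diff = X - mu
        # Squared Mahalanobis distances under the current estimates.
        d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(sigma), diff)
        d = np.sqrt(np.maximum(d2, 1e-24))
        w = np.minimum(1.0, c / d)     # weight 1 inside radius c, c/d outside
        mu_new = (w[:, None] * X).sum(axis=0) / w.sum()
        diff = X - mu_new
        sigma_new = ((w ** 2)[:, None] * diff).T @ diff / n
        change = np.abs(sigma_new - sigma).max()
        mu, sigma = mu_new, sigma_new
        if change < tol:
            break
    return mu, sigma

def first_pc(cov):
    """Unit eigenvector belonging to the largest eigenvalue of cov."""
    vals, vecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    return vecs[:, -1]

# Hypothetical data: 180 clean points spread mainly along u,
# plus 20 outliers placed far away in the orthogonal direction v.
rng = np.random.default_rng(0)
u = np.array([1.0, 1.0]) / np.sqrt(2.0)
v = np.array([1.0, -1.0]) / np.sqrt(2.0)
clean = np.outer(rng.normal(0, 3.0, 180), u) + np.outer(rng.normal(0, 0.5, 180), v)
outliers = 15.0 * v + rng.normal(0, 0.5, size=(20, 2))
X = np.vstack([clean, outliers])

_, robust_cov = huber_m_covariance(X)
# Absolute cosine between each first PC and the direction u of the clean data.
classical_alignment = abs(first_pc(np.cov(X, rowvar=False)) @ u)
robust_alignment = abs(first_pc(robust_cov) @ u)
```

In this constructed example the outliers dominate the classical covariance matrix, so the classical first PC turns towards them, whereas the first PC of the M-estimated matrix stays close to the main axis of the clean points.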
