Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)
10.2. Influential Observations in a Principal Component Analysis

changes occur. The ‘estimated’ changes in eigenvalues given in Tables 10.2 and 10.3 are derived from multiples of $I(\cdot)$ in (10.2.2), (10.2.3), respectively, with the value of the $k$th PC for each individual observation substituted for $z_k$, and with $l_k$, $a_{kj}$, $r_{ij}$ replacing $\lambda_k$, $\alpha_{kj}$, $\rho_{ij}$. The multiples are required because Change = Influence × $\varepsilon$, and we need a replacement for $\varepsilon$; here we have used $1/(n-1)$, where $n = 54$ is the sample size. Thus, apart from a multiplying factor $(n-1)^{-1}$, ‘actual’ and ‘estimated’ changes are sample and empirical influence functions, respectively.

In considering changes to an eigenvector, there are changes to each of the $p$ (= 4) coefficients in the vector. Comparing vectors is more difficult than comparing scalars, but Tables 10.2 and 10.3 give the sum of squares of changes in the individual coefficients of each vector, which is a plausible measure of the difference between two vectors. This quantity is a monotonically increasing function of the angle in $p$-dimensional space between the original and perturbed versions of $a_k$, which further increases its plausibility. The idea of using angles between eigenvectors to compare PCs is discussed in a different context in Section 13.5.

The ‘actual’ changes for eigenvectors again come from leaving out one observation at a time, recomputing and then comparing the eigenvectors, while the estimated changes are computed from multiples of sample versions of the expressions (10.2.4) and (10.2.5) for $I(x; \alpha_k)$. The changes in eigenvectors derived in this way are much smaller in absolute terms than the changes in eigenvalues, so the eigenvector changes have been multiplied by $10^3$ in Tables 10.2 and 10.3 in order that all the numbers are of comparable size.
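The comparison of ‘actual’ and ‘estimated’ changes described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the computation behind Tables 10.2 and 10.3: it uses synthetic data rather than the painters data, it covers the covariance matrix case only, and it assumes that the eigenvalue influence function (10.2.2), which is not reproduced in this excerpt, has the familiar form $z_k^2 - \lambda_k$. The function names `pca_cov`, `eigenvalue_changes` and `eigenvector_changes` are hypothetical.

```python
import numpy as np

def pca_cov(X):
    """Eigenvalues (descending) and eigenvectors (as columns) of the sample covariance matrix."""
    S = np.cov(X, rowvar=False)            # divisor n - 1
    vals, vecs = np.linalg.eigh(S)         # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]

def eigenvalue_changes(X, k=0):
    """'Actual' (leave-one-out) and 'estimated' (influence / (n - 1)) changes in the kth eigenvalue."""
    n = X.shape[0]
    l, A = pca_cov(X)
    z = (X - X.mean(axis=0)) @ A[:, k]     # scores of each observation on the kth PC
    # ASSUMPTION: covariance-matrix influence function taken as z_k^2 - l_k.
    estimated = (z**2 - l[k]) / (n - 1)
    actual = np.empty(n)
    for i in range(n):
        l_i, _ = pca_cov(np.delete(X, i, axis=0))
        actual[i] = l[k] - l_i[k]          # full-sample value minus leave-one-out value
    return actual, estimated

def eigenvector_changes(X, k=0):
    """Leave-one-out sum of squared coefficient changes, and angle, for the kth eigenvector."""
    n = X.shape[0]
    _, A = pca_cov(X)
    a = A[:, k]
    ss = np.empty(n)
    theta = np.empty(n)
    for i in range(n):
        _, A_i = pca_cov(np.delete(X, i, axis=0))
        b = A_i[:, k]
        if a @ b < 0:                      # eigenvectors are defined only up to sign
            b = -b
        ss[i] = np.sum((a - b) ** 2)
        theta[i] = np.arccos(np.clip(a @ b, -1.0, 1.0))
    return ss, theta

# Synthetic stand-in data with n = 54 and p = 4 (not the data analysed in the text).
rng = np.random.default_rng(0)
X = rng.normal(size=(54, 4)) * np.array([3.0, 2.0, 1.0, 0.5])

actual, estimated = eigenvalue_changes(X, k=0)
ss, theta = eigenvector_changes(X, k=0)
```

For unit-length vectors with signs aligned, the sum of squared coefficient changes equals $2 - 2\cos\theta$, where $\theta$ is the angle between the two vectors. This is why it increases monotonically with the angle, and `ss` and `theta` above can be used to check the identity numerically.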
As with eigenvalues, apart from a common multiplier we are comparing empirical and sample influences.

The first comment to make regarding the results given in Tables 10.2 and 10.3 is that the estimated values are extremely good in terms of obtaining the correct ordering of the observations with respect to their influence. There are some moderately large discrepancies in absolute terms for the observations with the largest influences, but experience with this and several other data sets suggests that the most influential observations are correctly identified unless sample sizes become very small. The discrepancies in absolute values can also be reduced by taking multiples other than $(n-1)^{-1}$ and by including second order ($\varepsilon^2$) terms.

A second point is that the observations which are most influential for a particular eigenvalue need not be so for the corresponding eigenvector, and vice versa. For example, there is no overlap between the four most influential observations for the first eigenvalue and its eigenvector in either the correlation or covariance matrix. Conversely, observations can sometimes have a large influence on both an eigenvalue and its eigenvector (see Table 10.3, component 2, observation 34).

Next, note that observations may be influential for one PC only, or affect two or more. An observation is least likely to affect more than one PC in the case of eigenvalues for a covariance matrix; indeed there is no overlap between the four most influential observations for the first and second eigenvalues in Table 10.2. However, for eigenvalues in a correlation matrix, more than one value is likely to be affected by a very influential observation, because the sum of eigenvalues remains fixed. Also, large changes in an eigenvector for either correlation or covariance matrices result in at least one other eigenvector being similarly changed, because of the orthogonality constraints. These results are again reflected in Tables 10.2 and 10.3, with observations appearing as influential for both of the first two eigenvectors, and for both eigenvalues in the case of the correlation matrix.

Comparing the results for covariance and correlation matrices in Tables 10.2 and 10.3, we see that several observations are influential for both matrices. This agreement occurs because, in the present example, the original variables all have similar variances, so that the PCs for correlation and covariance matrices are similar. In examples where the PCs based on correlation and covariance matrices are very different, the sets of influential observations for the two analyses often show little overlap.

Turning now to the observations that have been identified as influential in Table 10.3, we can examine their positions with respect to the first two PCs on Figures 5.2 and 5.3. Observation 34, which is the most influential observation on eigenvalues 1 and 2 and on eigenvector 2, is the painter indicated in the top left of Figure 5.2, Fr. Penni. His position is not particularly extreme with respect to the first PC, and he does not have an unduly large influence on its direction. However, he does have a strong influence on both the direction and variance (eigenvalue) of the second PC, and to balance the increase which he causes in the second eigenvalue there is a compensatory decrease in the first eigenvalue. Hence, he is influential on that eigenvalue too.
Observation 43, Rembrandt, is at the bottom of Figure 5.2 and, like Fr. Penni, has a direct influence on PC2 with an indirect but substantial influence on the first eigenvalue. The other two observations, 28 and 31, Caravaggio and Palma Vecchio, which are listed in Table 10.3 as being influential for the first eigenvalue, have a more direct effect. They are the two observations with the most extreme values on the first PC and appear at the extreme left of Figure 5.2.

Finally, the observations in Table 10.3 that are most influential on the first eigenvector, two of which also have large values of influence for the second eigenvector, appear on Figure 5.2 in the second and fourth quadrants in moderately extreme positions.

Student Anatomical Measurements

In the discussion of the data on student anatomical measurements in Section 10.1 it was suggested that observation 16 is so extreme on the second PC that it could be largely responsible for the direction of that component. Looking at influence functions for these data enables us to investigate this conjecture. Not surprisingly, observation 16 is the most influential observa-