Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)
tion for the second eigenvector and, in fact, has an influence nearly six times as large as that of the second most influential observation. It is also the most influential on the first, third and fourth eigenvectors, showing that the perturbation caused by observation 16 to the second PC has a ‘knock-on’ effect to other PCs in order to preserve orthogonality. Although observation 16 is very influential on the eigenvectors, its effect is less marked on the eigenvalues. It has only the fifth highest influence on the second eigenvalue, though it is highest for the fourth eigenvalue, second highest for the first, and fourth highest for the third. It is clear that values of influence on eigenvalues need not mirror major changes in the structure of the eigenvectors, at least when dealing with correlation matrices.

Having said that observation 16 is clearly the most influential, for eigenvectors, of the 28 observations in the data set, it should be noted that its influence in absolute terms is not outstandingly large. In particular, the coefficients rounded to one decimal place for the second PC when observation 16 is omitted are

0.2  0.1  −0.4  0.8  −0.1  −0.3  −0.2.

The corresponding coefficients when all 28 observations are included are

−0.0  −0.2  −0.2  0.9  −0.1  −0.0  −0.0.

Thus, when observation 16 is removed, the basic character of PC2 as mainly a measure of head size is unchanged, although the dominance of head size in this component is reduced. The angle between the two vectors defining the second PCs, with and without observation 16, is about 24°, which is perhaps larger than would be deduced from a quick inspection of the simplified coefficients above. Pack et al. (1988) give a more thorough discussion of influence in the context of this data set, together with similar data sets measured on different groups of students (see also Brooks (1994)).
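The comparison of an eigenvector computed with and without a given observation can be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic (not the student measurements discussed above), and the helper names `pc_vectors`, `angle_deg` and `deletion_angles` are hypothetical. Because the sign of an eigenvector is arbitrary, the absolute value of the inner product is used, so angles lie between 0° and 90°.

```python
import numpy as np

def pc_vectors(X):
    """Eigenvalues and eigenvectors of the correlation matrix, largest eigenvalue first."""
    R = np.corrcoef(X, rowvar=False)
    vals, vecs = np.linalg.eigh(R)          # eigh returns eigenvalues in ascending order
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]

def angle_deg(a, b):
    """Angle in degrees between two coefficient vectors, ignoring their sign."""
    cos = abs(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.degrees(np.arccos(np.clip(cos, 0.0, 1.0))))

def deletion_angles(X, k):
    """Angle between the k-th PC (0-based) from all data and from the data with each row omitted."""
    _, full = pc_vectors(X)
    angles = []
    for i in range(X.shape[0]):
        _, reduced = pc_vectors(np.delete(X, i, axis=0))
        angles.append(angle_deg(full[:, k], reduced[:, k]))
    return np.array(angles)

# Synthetic stand-in: 28 observations on 7 variables (an assumption for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(28, 7))
angles = deletion_angles(X, k=1)            # k=1 is the second PC
print("most influential row for PC2:", int(np.argmax(angles)))
```

A large angle for one row relative to the others suggests that the row is influential for that eigenvector, although the caveat about eigenvector switching when consecutive eigenvalues are close still applies to this naive component-by-component comparison.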
A problem that does not arise in the examples discussed here, but which does in Pack et al.’s (1988) larger example, is the possibility that omission of an observation causes switching or rotation of eigenvectors when consecutive eigenvalues have similar magnitudes. What appear to be large changes in eigenvectors may be much smaller when a possible reordering of PCs in the modified PCA is taken into account. Alternatively, a subspace of two or more PCs may be virtually unchanged when an observation is deleted, but individual eigenvectors spanning that subspace can look quite different. Further discussion of these subtleties in the context of an example is given by Pack et al. (1988).

10.3 Sensitivity and Stability

Removing a single observation and estimating its influence explores one type of perturbation to a data set, but other perturbations are possible. In
one the weights of one or more observations are reduced, without removing them entirely. This type of ‘sensitivity’ is discussed in general for multivariate techniques involving eigenanalyses by Tanaka and Tarumi (1985, 1987), with PCA as a special case. Benasseni (1986a) also examines the effect of differing weights for observations on the eigenvalues in a PCA. He gives bounds on the perturbed eigenvalues for any pattern of perturbations of the weights for both covariance and correlation-based analyses. The work is extended in Benasseni (1987a) to include bounds for eigenvectors as well as eigenvalues. A less structured perturbation is investigated empirically by Tanaka and Tarumi (1986). Here each element of a (4 × 4) ‘data’ matrix has an independent random perturbation added to it.

In Tanaka and Mori (1997), where the objective is to select a subset of variables reproducing all the p variables as well as possible and hence has connections with Section 6.3, the influence of variables is discussed. Fujikoshi et al. (1985) examine changes in the eigenvalues of a covariance matrix when additional variables are introduced. Krzanowski (1987a) indicates how to compare data configurations given by sets of retained PCs, including all the variables and with each variable omitted in turn. The calculations are done using an algorithm for computing the singular value decomposition (SVD) with a variable missing, due to Eastment and Krzanowski (1982), and the configurations are compared by means of Procrustes rotation (see Krzanowski and Marriott 1994, Chapter 5). Holmes-Junca (1985) gives an extensive discussion of the effect of omitting observations or variables from a PCA.
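The down-weighting perturbation mentioned at the start of this section can be illustrated by direct recomputation. The sketch below is not the perturbation theory of Tanaka and Tarumi or the bounds of Benasseni; it simply recomputes a weighted covariance PCA as one observation's weight is shrunk towards zero. The function name `weighted_pca`, the synthetic data and the choice to normalize the weights to sum to one are all assumptions made for illustration.

```python
import numpy as np

def weighted_pca(X, w):
    """PCA of the weighted covariance matrix; weights are normalized to sum to one."""
    w = np.asarray(w, dtype=float)
    w = w / w.sum()
    mean = w @ X                              # weighted mean of the observations
    Xc = X - mean
    S = (Xc * w[:, None]).T @ Xc              # weighted covariance matrix
    vals, vecs = np.linalg.eigh(S)            # ascending eigenvalues
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]

# Synthetic data with one deliberately unusual observation (row 0).
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 4))
X[0] += 6.0

n = X.shape[0]
for eps in (1.0, 0.5, 0.1, 0.0):
    w = np.ones(n)
    w[0] = eps                                # reduce the weight without deleting the row
    vals, _ = weighted_pca(X, w)
    print(f"weight {eps:.1f}: eigenvalues {np.round(vals, 3)}")
```

With this normalization a weight of zero reproduces the analysis with the observation deleted, so intermediate weights trace a path between the full and the reduced analyses.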
As in Krzanowski (1987a), the SVD plays a prominent rôle, but the framework in Holmes-Junca (1985) is a more general one in which unequal weights may be associated with the observations, and a general metric may be associated with the variables (see Section 14.2.2).

A different type of stability is investigated by Benasseni (1986b). He considers replacing each of the n p-dimensional observations in a data set by a p-dimensional random variable whose probability distribution is centred on the observed value. He relates the covariance matrix in the perturbed case to the original covariance matrix and to the covariance matrices of the n random variables. From this relationship, he deduces bounds on the eigenvalues of the perturbed matrix. In a later paper, Benasseni (1987b) looks at fixed, rather than random, perturbations to one or more of the observations. Expressions are given for consequent changes to eigenvalues and eigenvectors of the covariance matrix, together with approximations to those changes. A number of special forms for the perturbation, for example where it affects only one of the p variables, are examined in detail. Corresponding results for the correlation matrix are discussed briefly.

Dudziński et al. (1975) discuss what they call ‘repeatability’ of principal components in samples, which is another way of looking at the stability of the components’ coefficients. For each component of interest the angle is calculated between the vector of coefficients in the population and the corresponding vector in a sample. Dudziński et al. (1975) define a repeata-