Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)

Appendix A. Computation of Principal Components

... of T in decreasing absolute size down the diagonal. Eigenvectors can be found by accumulating the transformations in the QL algorithm (Smith et al., 1976, p. 468).

As with the power method, the speed of convergence of the QL algorithm depends on the ratios of consecutive eigenvalues. The idea of incorporating shifts can again be implemented to improve the algorithm and, unlike the power method, efficient strategies exist for finding appropriate shifts that do not rely on prior information about the eigenvalues (see, for example, Lawson and Hanson (1974, p. 109)). The QL algorithm can also cope with equality between eigenvalues.

It is probably fair to say that the algorithms described in detail by Wilkinson (1965) and Wilkinson and Reinsch (1971), and implemented in various IMSL and NAG routines, have stood the test of time. They still provide efficient ways of computing PCs in many circumstances. However, we conclude the Appendix by discussing two alternatives. The first is implementation via the singular value decomposition (SVD) of the data matrix, and the second consists of the various algorithms for PCA that have been suggested in the neural networks literature. The latter is a large topic and will be summarized only briefly.

One other type of algorithm that has been used recently to find PCs is the EM algorithm (Dempster et al., 1977). This is advocated by Tipping and Bishop (1999a,b) and Roweis (1997), and has its greatest value in cases where some of the data are missing (see Section 13.6).

Singular Value Decomposition

The suggestion that PCs may best be computed using the SVD of the data matrix (see Section 3.5) is not new. For example, Chambers (1977, p. 111) talks about the SVD providing the best approach to computation of principal components, and Gnanadesikan (1977, p. 10) states that '... the recommended algorithm for ... obtaining the principal components is either the ... QR method ... or the singular value decomposition.' In constructing the SVD, it turns out that similar algorithms to those given above can be used. Lawson and Hanson (1974, p. 110) describe an algorithm (see also Wilkinson and Reinsch (1971)) for finding the SVD, which has two stages; the first uses Householder transformations to transform to an upper bidiagonal matrix, and the second applies an adapted QR algorithm. The method is therefore not radically different from that described earlier.

As noted at the end of Section 8.1, the SVD can also be useful in computations for regression (Mandel, 1982; Nelder, 1985), so the SVD has further advantages if PCA is used in conjunction with regression. Nash and Lefkovitch (1976) describe an algorithm that uses the SVD to provide a variety of results for regression, as well as PCs.

Another point concerning the SVD is that it provides simultaneously not only the coefficients and variances for the PCs, but also the scores of each ...
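The dependence of convergence speed on eigenvalue ratios, and the way a shift changes that ratio, can be seen directly with the power method. The sketch below is illustrative only: the 3 × 3 matrix and the shift value are invented for the example, and the fixed shift is not one of the efficient shift strategies (e.g. for the QL algorithm) referred to above.

```python
import numpy as np

# Hypothetical covariance matrix with close leading eigenvalues (10 and 9),
# so plain power iteration converges slowly: ratio 9/10 per step.
S = np.diag([10.0, 9.0, 1.0])

def power_method(A, iters=500, tol=1e-12):
    """Plain power iteration for the dominant eigenvector of A."""
    x = np.ones(A.shape[0])
    x /= np.linalg.norm(x)
    for _ in range(iters):
        y = A @ x
        x_new = y / np.linalg.norm(y)
        if np.linalg.norm(x_new - x) < tol:
            break
        x = x_new
    return x

# A shift rho replaces the convergence ratio |lambda_2/lambda_1| by
# |lambda_2 - rho| / |lambda_1 - rho|: here rho = 5 improves 9/10 to 4/5.
rho = 5.0
v_shift = power_method(S - rho * np.eye(3))

# The Rayleigh quotient with the ORIGINAL matrix recovers its eigenvalue.
lam = v_shift @ S @ v_shift
```

Because the shifted matrix has the same eigenvectors as S, the recovered vector is the leading PC direction of S itself; only the convergence rate changes.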
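The SVD route described above, and the fact that it yields coefficients, variances, and scores simultaneously, can be sketched as follows. This is a minimal numpy illustration, not the algorithm of Lawson and Hanson; the data matrix is randomly generated and its dimensions are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))      # hypothetical n x p data matrix
Xc = X - X.mean(axis=0)            # column-centre before taking the SVD

# SVD of the centred data: Xc = U diag(s) Vt
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

coefficients = Vt.T                # columns are the PC coefficients (loadings)
variances = s**2 / (len(X) - 1)    # PC variances, in decreasing order
scores = U * s                     # PC scores of every observation, at once

# For comparison: the eigenvalues of the sample covariance matrix,
# sorted into decreasing order, should match the variances above.
eig = np.linalg.eigvalsh(np.cov(Xc, rowvar=False))[::-1]
```

Note that the eigendecomposition of the covariance matrix is never formed explicitly; working directly on the data matrix is one reason the SVD approach is numerically attractive.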
