Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)
13.4 Principal Component Analysis in Designed Experiments

In Chapters 8 and 9 we discussed ways in which PCA could be used as a preliminary to, or in conjunction with, other standard statistical techniques. The present section gives another example of the same type of application; here we consider the situation where p variables are measured in the course of a designed experiment. The standard analysis would be either a set of separate analyses of variance (ANOVAs) for each variable or, if the variables are correlated, a multivariate analysis of variance (MANOVA; Rencher, 1995, Chapter 6) could be done.

As an illustration, consider a two-way model of the form

x_{ijk} = µ + τ_j + β_k + ε_{ijk},   i = 1, 2, ..., n_{jk}; j = 1, 2, ..., t; k = 1, 2, ..., b,

where x_{ijk} is the ith observation for treatment j in block k of a p-variate vector x. The vector x_{ijk} is therefore the sum of an overall mean µ, a treatment effect τ_j, a block effect β_k and an error term ε_{ijk}.

The most obvious way in which PCA can be used in such analyses is simply to replace the original p variables by their PCs. Then either separate ANOVAs can be done on each PC, or the PCs can be analysed using MANOVA. Jackson (1991, Sections 13.5–13.7) discusses the use of separate ANOVAs for each PC in some detail. In the context of analysing growth curves (see Section 12.4.2) Rao (1958) suggests that 'methods of multivariate analysis for testing the differences between treatments' can be implemented on the first few PCs, and Rencher (1995, Section 12.2) advocates PCA as a first step in MANOVA when p is large. However, as noted by Rao (1964), for most types of designed experiment this simple analysis is often not particularly useful. This is because the overall covariance matrix represents a mixture of contributions from within treatments and blocks, between treatments, between blocks, and so on, whereas we usually wish to separate these various types of covariance.
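The 'separate ANOVAs on each PC' approach described above can be sketched in a few lines of numpy/scipy. This is a minimal illustration, not the analysis of any data set discussed in the text: the dimensions, the simulated data and the injected treatment effect are all hypothetical.

```python
import numpy as np
from scipy.stats import f_oneway

# Hypothetical one-way layout: t treatments, n_per observations each,
# p variables. Sizes and the injected effect are illustrative only.
rng = np.random.default_rng(0)
t, n_per, p = 3, 10, 5
treatments = np.repeat(np.arange(t), n_per)
X = rng.normal(size=(t * n_per, p))
X[treatments == 1] += 0.8  # inject a treatment effect on all variables

# PCA on the overall covariance matrix: centre, then eigendecompose.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
scores = Xc @ eigvecs[:, order]   # PC scores, highest variance first

# Separate one-way ANOVA on the scores of each PC in turn.
for h in range(p):
    groups = [scores[treatments == j, h] for j in range(t)]
    F, pval = f_oneway(*groups)
```

Note that the PCA here uses the overall covariance matrix, so the sketch also illustrates the drawback raised by Rao (1964): the components mix within- and between-treatment variation.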
Although the PCs are uncorrelated overall, they are not necessarily so, even approximately, with respect to between-group or within-group variation. This is a more complicated manifestation of what occurs in discriminant analysis (Section 9.1), where a PCA based on the covariance matrix of the raw data may prove confusing, as it inextricably mixes up variation between and within populations. Instead of a PCA of all the x_{ijk}, a number of other PCAs have been suggested and found to be useful in some circumstances.

Jeffers (1962) looks at a PCA of the (treatment × block) means x̄_{jk}, j = 1, 2, ..., t; k = 1, 2, ..., b, where

x̄_{jk} = (1/n_{jk}) Σ_{i=1}^{n_{jk}} x_{ijk},

that is, a PCA of a data set with tb observations on a p-variate random vector. In an example on tree seedlings, he finds that ANOVAs carried out on the first five PCs, which account for over 97% of the variation in the original eight variables, give significant differences between treatment means (averaged over blocks) for the first and fifth PCs. This result contrasts with ANOVAs for the original variables, where there were no significant differences. The first and fifth PCs can be readily interpreted in Jeffers' (1962) example, so that transforming to PCs produces a clear advantage in terms of detecting interpretable treatment differences. However, PCs will not always be interpretable and, as in regression (Section 8.2), there is no reason to expect that treatment differences will necessarily manifest themselves in high variance, rather than low variance, PCs. For example, while Jeffers' first component accounts for over 50% of total variation, his fifth component accounts for less than 5%.

Jeffers (1962) looked at 'between' treatments and blocks PCs, but the PCs of the 'within' treatments or blocks covariance matrices can also provide useful information. Pearce and Holland (1960) give an example having four variables, in which different treatments correspond to different rootstocks, but which has no block structure. They carry out separate PCAs for within- and between-rootstock variation. The first PC is similar in the two cases, measuring general size. Later PCs are, however, different for the two analyses, but they are readily interpretable in both cases so that the two analyses each provide useful but separate information.

Another use of 'within-treatments' PCs occurs in the case where there are several populations, as in discriminant analysis (see Section 9.1), and 'treatments' are defined to correspond to different populations.
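The two complementary analyses just described, a 'between' PCA of the cell means in the style of Jeffers (1962) and a 'within' PCA of the pooled within-cell covariance matrix in the style of Pearce and Holland (1960), can be sketched as follows. All sizes and data are hypothetical, not the tree-seedling or rootstock examples.

```python
import numpy as np

# Hypothetical layout: t treatments, b blocks, n replicates per cell,
# p variables (illustrative sizes, simulated data).
rng = np.random.default_rng(1)
t, b, n, p = 4, 3, 5, 8
X = rng.normal(size=(t, b, n, p))

# 'Between' analysis: PCA of the tb cell means x̄_{jk}.
cell_means = X.mean(axis=2)                  # shape (t, b, p)
M = cell_means.reshape(t * b, p)             # tb "observations" on p variables
Mc = M - M.mean(axis=0)
_, s_between, V_between = np.linalg.svd(Mc, full_matrices=False)
between_explained = s_between**2 / np.sum(s_between**2)

# 'Within' analysis: PCA of the pooled within-cell covariance matrix,
# built from deviations of each observation about its own cell mean.
W = (X - cell_means[:, :, None, :]).reshape(t * b * n, p)
cov_within = W.T @ W / (t * b * (n - 1))
eigvals, eigvecs = np.linalg.eigh(cov_within)
order = np.argsort(eigvals)[::-1]
within_pcs = eigvecs[:, order]               # within-group PC loadings
```

Comparing `V_between` with `within_pcs` is the kind of side-by-side inspection Pearce and Holland make: similar leading components, possibly different later ones.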
If each population has the same covariance matrix Σ, and within-population PCs based on Σ are of interest, then the 'within-treatments' covariance matrix provides an estimate of Σ. Yet another way in which 'error covariance matrix PCs' can contribute is if the analysis looks for potential outliers, as suggested for multivariate regression in Section 10.1.

A different way of using PCA in a designed experiment is described by Mandel (1971, 1972). He considers a situation where there is only one variable, which follows the two-way model

x_{jk} = µ + τ_j + β_k + ε_{jk},   j = 1, 2, ..., t; k = 1, 2, ..., b,   (13.4.1)

that is, there is only a single observation on the variable x at each combination of treatments and blocks. In Mandel's analysis, estimates µ̂, τ̂_j, β̂_k are found for µ, τ_j, β_k, respectively, and residuals are calculated as e_{jk} = x_{jk} − µ̂ − τ̂_j − β̂_k. The main interest is then in using e_{jk} to estimate the non-additive part ε_{jk} of the model (13.4.1). This non-additive part is assumed to take the form

ε_{jk} = Σ_{h=1}^{m} u_{jh} l_h a_{kh},   (13.4.2)
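Mandel's decomposition can be sketched with a singular value decomposition of the doubly centred residual table: the least-squares additive fit is removed, and the SVD of the residuals supplies the multiplicative terms u_{jh} l_h a_{kh} of (13.4.2). The data and the choice m = 1 below are illustrative assumptions, not Mandel's examples.

```python
import numpy as np

# Hypothetical t × b table of single observations x_{jk}, as in (13.4.1).
rng = np.random.default_rng(2)
t, b = 6, 4
X = rng.normal(size=(t, b))

# Least-squares estimates of the additive part: µ̂, τ̂_j, β̂_k.
mu_hat = X.mean()
tau_hat = X.mean(axis=1) - mu_hat       # treatment effects (row means)
beta_hat = X.mean(axis=0) - mu_hat      # block effects (column means)

# Residual table e_{jk} = x_{jk} − µ̂ − τ̂_j − β̂_k.
E = X - mu_hat - tau_hat[:, None] - beta_hat[None, :]

# SVD of the residuals gives the multiplicative form of (13.4.2):
# e_{jk} ≈ Σ_h u_{jh} l_h a_{kh}, with l_h the singular values.
U, l, At = np.linalg.svd(E, full_matrices=False)
m = 1                                    # keep the first multiplicative term
E_approx = (U[:, :m] * l[:m]) @ At[:m, :]
```

Because the residuals are centred across both rows and columns, every row and column of E sums to zero, which is what makes the SVD terms interpretable as interaction components.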