Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)


cda.psych.uiuc.edu

13.4 Principal Component Analysis in Designed Experiments

In Chapters 8 and 9 we discussed ways in which PCA could be used as a preliminary to, or in conjunction with, other standard statistical techniques. The present section gives another example of the same type of application; here we consider the situation where $p$ variables are measured in the course of a designed experiment. The standard analysis would be either a set of separate analyses of variance (ANOVAs) for each variable or, if the variables are correlated, a multivariate analysis of variance (MANOVA; Rencher, 1995, Chapter 6).

As an illustration, consider a two-way model of the form

\[
\mathbf{x}_{ijk} = \boldsymbol{\mu} + \boldsymbol{\tau}_j + \boldsymbol{\beta}_k + \boldsymbol{\epsilon}_{ijk},
\qquad i = 1, 2, \ldots, n_{jk};\quad j = 1, 2, \ldots, t;\quad k = 1, 2, \ldots, b,
\]

where $\mathbf{x}_{ijk}$ is the $i$th observation for treatment $j$ in block $k$ of a $p$-variate vector $\mathbf{x}$. The vector $\mathbf{x}_{ijk}$ is therefore the sum of an overall mean $\boldsymbol{\mu}$, a treatment effect $\boldsymbol{\tau}_j$, a block effect $\boldsymbol{\beta}_k$ and an error term $\boldsymbol{\epsilon}_{ijk}$.

The most obvious way in which PCA can be used in such analyses is simply to replace the original $p$ variables by their PCs. Then either separate ANOVAs can be done on each PC, or the PCs can be analysed using MANOVA. Jackson (1991, Sections 13.5–13.7) discusses the use of separate ANOVAs for each PC in some detail. In the context of analysing growth curves (see Section 12.4.2) Rao (1958) suggests that ‘methods of multivariate analysis for testing the differences between treatments’ can be implemented on the first few PCs, and Rencher (1995, Section 12.2) advocates PCA as a first step in MANOVA when $p$ is large. However, as noted by Rao (1964), for most types of designed experiment this simple analysis is often not particularly useful. This is because the overall covariance matrix represents a mixture of contributions from within treatments and blocks, between treatments, between blocks, and so on, whereas we usually wish to separate these various types of covariance.
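The ‘mixture’ in the overall covariance matrix can be made concrete with a small numerical sketch. The code below is illustrative only and is not taken from the text: it assumes a balanced design (the same number `n` of observations in every cell), simulates data from the two-way model with made-up effect sizes, and splits the total sum-of-squares-and-cross-products (SSCP) matrix into treatment, block and residual parts. All names (`sscp`, `H_treat`, `H_block`, `E`, `max_offdiag`) and the design sizes are hypothetical choices for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
t, b, n, p = 3, 4, 5, 4  # treatments, blocks, replicates per cell, variables (illustrative)

# Simulate x_ijk = mu + tau_j + beta_k + eps_ijk for a balanced design.
mu = rng.normal(size=p)
tau = rng.normal(size=(t, p))
tau -= tau.mean(axis=0)          # treatment effects sum to zero
beta = rng.normal(size=(b, p))
beta -= beta.mean(axis=0)        # block effects sum to zero
eps = rng.normal(scale=0.5, size=(n, t, b, p))
x = mu + tau[None, :, None, :] + beta[None, None, :, :] + eps  # shape (n, t, b, p)

def sscp(dev):
    """Sum of squares and cross-products of deviation vectors (last axis = variables)."""
    d = dev.reshape(-1, dev.shape[-1])
    return d.T @ d

# Least-squares estimates for the additive model in the balanced case.
grand = x.mean(axis=(0, 1, 2))
treat_eff = x.mean(axis=(0, 2)) - grand   # shape (t, p)
block_eff = x.mean(axis=(0, 1)) - grand   # shape (b, p)
fitted = grand + treat_eff[None, :, None, :] + block_eff[None, None, :, :]
resid = x - fitted

T = sscp(x - grand)               # total SSCP
H_treat = n * b * sscp(treat_eff) # between-treatments SSCP
H_block = n * t * sscp(block_eff) # between-blocks SSCP
E = sscp(resid)                   # residual ('within') SSCP

# PCA of the overall covariance matrix: columns of A are the PC coefficient vectors.
N = n * t * b
eigvals, A = np.linalg.eigh(T / (N - 1))

def max_offdiag(M):
    """Largest absolute off-diagonal element of a square matrix."""
    return np.abs(M - np.diag(np.diag(M))).max()

print("T = H_treat + H_block + E:", np.allclose(T, H_treat + H_block + E))
print("off-diagonal, total   :", max_offdiag(A.T @ T @ A))
print("off-diagonal, residual:", max_offdiag(A.T @ E @ A))
print("off-diagonal, treatments:", max_offdiag(A.T @ H_treat @ A))
```

In the basis of the overall PCs the total SSCP matrix is diagonal by construction, but nothing forces the residual or between-treatment parts to be diagonal in that basis, so their off-diagonal entries will generally be non-zero. With unequal cell sizes the three parts are no longer orthogonal and this simple additive split does not hold exactly.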
Although the PCs are uncorrelated overall, they are not necessarily so, even approximately, with respect to between-group or within-group variation. This is a more complicated manifestation of what occurs in discriminant analysis (Section 9.1), where a PCA based on the covariance matrix of the raw data may prove confusing, as it inextricably mixes up variation between and within populations. Instead of a PCA of all the $\mathbf{x}_{ijk}$, a number of other PCAs have been suggested and found to be useful in some circumstances.

Jeffers (1962) looks at a PCA of the (treatment × block) means $\bar{\mathbf{x}}_{jk}$, $j = 1, 2, \ldots, t$; $k = 1, 2, \ldots, b$, where

\[
\bar{\mathbf{x}}_{jk} = \frac{1}{n_{jk}} \sum_{i=1}^{n_{jk}} \mathbf{x}_{ijk},
\]

that is, a PCA of a data set with $tb$ observations on a $p$-variate random vector. In an example on tree seedlings, he finds that ANOVAs carried out on the first five PCs, which account for over 97% of the variation in the original eight variables, give significant differences between treatment means (averaged over blocks) for the first and fifth PCs. This result contrasts with ANOVAs for the original variables, where there were no significant differences. The first and fifth PCs can be readily interpreted in Jeffers’ (1962) example, so that transforming to PCs produces a clear advantage in terms of detecting interpretable treatment differences. However, PCs will not always be interpretable and, as in regression (Section 8.2), there is no reason to expect that treatment differences will necessarily manifest themselves in high variance, rather than low variance, PCs. For example, while Jeffers’ first component accounts for over 50% of total variation, his fifth component accounts for less than 5%.

Jeffers (1962) looked at ‘between’ treatments and blocks PCs, but the PCs of the ‘within’ treatments or blocks covariance matrices can also provide useful information. Pearce and Holland (1960) give an example having four variables, in which different treatments correspond to different rootstocks, but which has no block structure. They carry out separate PCAs for within- and between-rootstock variation. The first PC is similar in the two cases, measuring general size. Later PCs are, however, different for the two analyses, but they are readily interpretable in both cases so that the two analyses each provide useful but separate information.

Another use of ‘within-treatments’ PCs occurs in the case where there are several populations, as in discriminant analysis (see Section 9.1), and ‘treatments’ are defined to correspond to different populations. If each population has the same covariance matrix $\boldsymbol{\Sigma}$, and within-population PCs based on $\boldsymbol{\Sigma}$ are of interest, then the ‘within-treatments’ covariance matrix provides an estimate of $\boldsymbol{\Sigma}$. Yet another way in which ‘error covariance matrix PCs’ can contribute is if the analysis looks for potential outliers as suggested for multivariate regression in Section 10.1.

A different way of using PCA in a designed experiment is described by Mandel (1971, 1972). He considers a situation where there is only one variable, which follows the two-way model

\[
x_{jk} = \mu + \tau_j + \beta_k + \varepsilon_{jk},
\qquad j = 1, 2, \ldots, t;\quad k = 1, 2, \ldots, b,
\tag{13.4.1}
\]

that is, there is only a single observation on the variable $x$ at each combination of treatments and blocks. In Mandel’s analysis, estimates $\hat{\mu}$, $\hat{\tau}_j$, $\hat{\beta}_k$ are found for $\mu$, $\tau_j$, $\beta_k$, respectively, and residuals are calculated as $e_{jk} = x_{jk} - \hat{\mu} - \hat{\tau}_j - \hat{\beta}_k$. The main interest is then in using $e_{jk}$ to estimate the non-additive part $\varepsilon_{jk}$ of the model (13.4.1). This non-additive part is assumed to take the form

\[
\varepsilon_{jk} = \sum_{h=1}^{m} u_{jh}\, l_h\, a_{kh},
\tag{13.4.2}
\]
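A decomposition of the form (13.4.2) can be illustrated numerically. The sketch below is not taken from the text above: it assumes the usual least-squares estimates for the additive model (row and column means of a complete $t \times b$ table) and uses the singular value decomposition of the residual matrix as one natural way of writing the residuals as a sum of products $u_{jh}\, l_h\, a_{kh}$. The table `X` is simulated and the helper `rank_m_part` is a hypothetical name introduced for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
t, b = 5, 4                              # treatments (rows) and blocks (columns), illustrative
X = rng.normal(loc=10.0, size=(t, b))    # hypothetical table, one observation per cell

# Least-squares estimates for the additive model x_jk = mu + tau_j + beta_k + error.
mu_hat = X.mean()
tau_hat = X.mean(axis=1) - mu_hat        # one value per treatment
beta_hat = X.mean(axis=0) - mu_hat       # one value per block

# Residuals e_jk = x_jk - mu_hat - tau_hat_j - beta_hat_k.
resid = X - mu_hat - tau_hat[:, None] - beta_hat[None, :]

# SVD of the residual matrix: resid = sum_h U[:, h] * l[h] * At[h, :],
# so u_jh = U[j, h], l_h = l[h] and a_kh = At[h, k] in the notation of (13.4.2).
U, l, At = np.linalg.svd(resid, full_matrices=False)

def rank_m_part(m):
    """Sum of the first m multiplicative terms u_jh * l_h * a_kh."""
    return (U[:, :m] * l[:m]) @ At[:m, :]

# Rows and columns of the residual matrix sum to zero, so its rank is at most min(t, b) - 1.
m_max = min(t, b) - 1
print("singular values:", l)
print("exact with m_max terms:", np.allclose(rank_m_part(m_max), resid))
print("share of residual sum of squares per term:", l**2 / np.sum(l**2))
```

Keeping only the first one or two terms gives a low-rank approximation to the residuals; how many terms $m$ to retain is a modelling decision that the sketch does not address.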

