12.07.2015 Views

Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)



13.4. Principal Component Analysis in Designed Experiments

where m, l_h, u_jh, a_kh are suitably chosen constants. Apart from slight changes in notation, the right-hand side of (13.4.2) is the same as that of the singular value decomposition (SVD) in (3.5.3). Thus, the model (13.4.2) is fitted by finding the SVD of the matrix E whose (j, k)th element is e_jk, or equivalently finding the PCs of the covariance matrix based on the data matrix E. This analysis also has links with correspondence analysis (see Section 13.1). In both cases we find an SVD of a two-way table of residuals, the difference being that in the present case the elements of the table are residuals from an additive model for a quantitative variable, rather than residuals from an independence (multiplicative) model for counts. As noted in Section 1.2, R.A. Fisher used the SVD in a two-way analysis of an agricultural trial, leading to an eigenanalysis of a multiple of a covariance matrix as long ago as 1925.

A substantial amount of work has been done on the model defined by (13.4.1) and (13.4.2). Freeman (1975) showed that Mandel’s approach can be used for incomplete as well as complete two-way tables, and a number of authors have constructed tests for the rank m of the interaction term. For example, Boik (1986) develops likelihood ratio and union-intersection tests, and Milliken and Johnson (1989) provide tables of critical points for likelihood ratio statistics. Boik (1986) also points out that the model is a reduced-rank regression model (see Section 9.3.4).

Shafii and Price (1998) give an example of a more complex design in which seed yields of 6 rapeseed cultivars are measured in 27 environments spread over three separate years, with 4 replicates at each of the (6 × 27) combinations. There are additional terms in the model compared to (13.4.1), but the non-additive part is still represented as in (13.4.2). The first two terms in (13.4.2) are deemed significant using Milliken and Johnson’s (1989) tables; they account for 80% of the variability that would be explained by taking m of full rank 5. The results of the analysis of the non-additive term are interpreted using biplots (see Section 5.3).

Gower and Krzanowski (1999) consider the situation in which the data have a MANOVA structure but where the assumptions that underlie formal MANOVA procedures are clearly invalid. They suggest a number of graphical displays to represent and interpret such data; one is based on weighted PCA. Goldstein (1995, Section 4.5) notes the possibility of using PCA to explore the structure of various residual matrices in a multilevel model.

Planned surveys are another type of designed experiment, and one particular type of survey design is based on stratified sampling. Pla (1991) suggests that when the data from a survey are multivariate, the first PC can be used to define the strata for a stratified sampling scheme. She shows that stratification in this manner leads to reduced sampling variability compared to stratification based on only one or two variables. Skinner et al. (1986) and Tortora (1980) demonstrate the effect of the non-independence induced by other methods of stratification on subsequently calculated PCs (see Section 12.4.5).
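
The fitting of the non-additive term by an SVD of the residual table, described earlier in this section, can be illustrated with a minimal sketch. It assumes that (13.4.1) is the usual additive two-way model (grand mean plus row and column effects), which is not reproduced in this excerpt, and it uses simulated data with the same 6 × 27 shape as the Shafii and Price (1998) example rather than their actual yields. The function names `additive_residuals` and `fit_interaction` are hypothetical, chosen only for illustration.

```python
import numpy as np


def additive_residuals(Y):
    """Residuals e_jk after removing grand mean, row and column effects.

    Assumption: the additive part of the model is
    y_jk = mu + alpha_j + beta_k + e_jk, fitted by least squares.
    """
    Y = np.asarray(Y, dtype=float)
    grand = Y.mean()
    row_means = Y.mean(axis=1, keepdims=True)
    col_means = Y.mean(axis=0, keepdims=True)
    return Y - row_means - col_means + grand


def fit_interaction(E, m):
    """Rank-m fit of the residual table E via its SVD.

    Returns the rank-m approximation, all singular values, and the share of
    the residual sum of squares accounted for by the first m terms.
    """
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    approx = (U[:, :m] * s[:m]) @ Vt[:m, :]
    share = (s[:m] ** 2).sum() / (s ** 2).sum()
    return approx, s, share


# Simulated 6 x 27 table (illustrative data only, not from the book).
rng = np.random.default_rng(0)
Y = rng.normal(size=(6, 27))

E = additive_residuals(Y)
approx, s, share = fit_interaction(E, m=2)
print("singular values:", np.round(s, 3))
print("share of residual variability from first 2 terms:", round(share, 3))
```

Because the rows and columns of E each sum to zero, E has rank at most 5 for a 6 × 27 table, so the sixth singular value is numerically zero. This matches the "full rank 5" mentioned in the text. The printed share is the analogue of the 80% figure quoted for the first two terms, although its value here reflects only the simulated data.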
