Jolliffe, I.T. Principal Component Analysis (2nd ed., Springer, 2002)
14.5. Three-Mode, Multiway and Multiple Group PCA

procedures can be extended to more than two groups. For example, Casin (2001) reviews a number of techniques for dealing with $K$ sets of variables, most of which involve a PCA of the data arranged in one way or another. He briefly compares these various methods with his own 'generalization' of PCA, which is now described.

Suppose that $X_k$ is an $(n \times p_k)$ data matrix consisting of measurements of $p_k$ variables on $n$ individuals, $k = 1, 2, \ldots, K$. The same individuals are observed for each $k$. The first step in Casin's (2001) procedure is a PCA based on the correlation matrix obtained from the $(n \times p)$ supermatrix
$$X = (X_1 \; X_2 \; \cdots \; X_K),$$
where $p = \sum_{k=1}^{K} p_k$. The first PC, $z^{(1)}$, thus derived is then projected onto the subspaces spanned by the columns of $X_k$, $k = 1, 2, \ldots, K$, to give a 'first component' $z_k^{(1)}$ for each $X_k$. To obtain a second component, residual matrices $X_k^{(2)}$ are calculated. The $j$th column of $X_k^{(2)}$ consists of residuals from a regression of the $j$th column of $X_k$ on $z_k^{(1)}$. A covariance matrix PCA is then performed for the supermatrix
$$X^{(2)} = (X_1^{(2)} \; X_2^{(2)} \; \cdots \; X_K^{(2)}).$$
The first PC from this analysis is next projected onto the subspaces spanned by the columns of $X_k^{(2)}$, $k = 1, 2, \ldots, K$, to give a second component $z_k^{(2)}$ for $X_k$. This is called a 'second auxiliary' by Casin (2001). Residuals from regressions of the columns of $X_k^{(2)}$ on $z_k^{(2)}$ give matrices $X_k^{(3)}$, and a covariance matrix PCA is carried out on the supermatrix formed from these matrices. From this, third auxiliaries $z_k^{(3)}$ are calculated, and so on. Unlike an ordinary PCA of $X$, which produces $p$ PCs, the number of auxiliaries for the $k$th group of variables is only $p_k$. Casin (2001) claims that this procedure is a sensible compromise between separate PCAs for each $X_k$, which concentrate on within-group relationships, and extensions of canonical correlation analysis, which emphasize relationships between groups.

Van de Geer (1984) reviews the possible ways in which linear relationships between two groups of variables can be quantified, and then discusses how each might be generalized to more than two groups (see also van de Geer (1986)). One of the properties considered by van de Geer (1984) in his review is the extent to which within-group, as well as between-group, structure is considered. When within-group variability is taken into account there are links to PCA, and one of van de Geer's (1984) generalizations is equivalent to a PCA of all the variables in the $K$ groups, as in extended EOF analysis. Lafosse and Hanafi (1987) extend Tucker's inter-battery model, which was discussed in Section 9.3.3, to more than two groups.
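Since Casin's (2001) procedure is described only verbally above, a compact sketch may make its steps easier to follow. The Python code below is a minimal sketch under stated assumptions: the data matrices are column-centred, only numpy is used, and the function name `casin_components` and its structure are illustrative rather than taken from Casin (2001).

```python
import numpy as np

def casin_components(blocks, n_steps):
    """Illustrative sketch of Casin's (2001) iterative procedure.

    blocks  : list of K column-centred (n x p_k) data matrices X_k
    n_steps : number of components/auxiliaries to extract
              (no more than p_k auxiliaries exist for block k; see text)
    Returns a list of length n_steps; element r - 1 holds the K vectors z_k^(r).
    """
    K = len(blocks)
    Xcur = [Xk.astype(float).copy() for Xk in blocks]
    auxiliaries = []
    for step in range(n_steps):
        X = np.hstack(Xcur)                      # supermatrix (X_1 X_2 ... X_K)
        if step == 0:
            X = X / X.std(axis=0, ddof=1)        # step 1 uses the correlation matrix
        S = np.cov(X, rowvar=False)              # covariance-matrix PCA thereafter
        w = np.linalg.eigh(S)[1][:, -1]          # leading eigenvector
        z = X @ w                                # first PC scores, z^(step+1)
        zk = []
        for k in range(K):
            # Project z onto the subspace spanned by the columns of X_k^(step+1).
            b = np.linalg.lstsq(Xcur[k], z, rcond=None)[0]
            zk.append(Xcur[k] @ b)               # component/auxiliary z_k^(step+1)
        auxiliaries.append(zk)
        for k in range(K):
            # Residual matrix X_k^(step+2): regress each column of X_k on z_k.
            c = np.linalg.lstsq(zk[k][:, None], Xcur[k], rcond=None)[0]
            Xcur[k] = Xcur[k] - zk[k][:, None] @ c
    return auxiliaries
```

For example, with three column-centred blocks, `casin_components([X1, X2, X3], 2)` returns the first components $z_k^{(1)}$ and second auxiliaries $z_k^{(2)}$ for each block; as noted above, at most $p_k$ auxiliaries can be extracted for the $k$th block.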
14.6 Miscellanea

This penultimate section discusses briefly some topics involving PCA that do not fit very naturally into any of the other sections of the book.

14.6.1 Principal Components and Neural Networks

This subject is sufficiently large to have a book devoted to it (Diamantaras and Kung, 1996). The use of neural networks to provide non-linear extensions of PCA is discussed in Section 14.1.3, and computational aspects are revisited in Appendix A1. A few other related topics are noted here, drawing mainly on Diamantaras and Kung (1996), to which the interested reader is referred for further details. Much of the work in this area is concerned with constructing efficient algorithms, based on neural networks, for deriving PCs. There are variations depending on whether a single PC or several PCs are required, whether the first or last PCs are of interest, and whether the chosen PCs are found simultaneously or sequentially. The advantage of neural network algorithms is greatest when data arrive sequentially, so that the PCs need to be continually updated.

In some algorithms the transformation to PCs is treated as deterministic; in others noise is introduced (Diamantaras and Kung, 1996, Chapter 5). In this latter case, the components are written as
$$y = B'x + e,$$
and the original variables are approximated by
$$\hat{x} = Cy = CB'x + Ce,$$
where $B$, $C$ are $(p \times q)$ matrices and $e$ is a noise term. When $e = 0$, minimizing $E[(\hat{x} - x)'(\hat{x} - x)]$ with respect to $B$ and $C$ leads to PCA (this follows from Property A5 of Section 2.1), but the problem is complicated by the presence of the term $Ce$ in the expression for $\hat{x}$. Diamantaras and Kung (1996, Chapter 5) describe solutions to a number of formulations of the problem of finding optimal $B$ and $C$. Some constraints on $B$ and/or $C$ are necessary to make the problem well-defined, and the different formulations correspond to different constraints. All solutions have the common feature that they involve combinations of the eigenvectors of the covariance matrix of $x$ with the eigenvectors of the covariance matrix of $e$. As with other signal/noise problems noted in Sections 12.4.3 and 14.2.2, it is necessary either to know the covariance matrix of $e$ or to be able to estimate it separately from that of $x$.

Networks that implement extensions of PCA are described in Diamantaras and Kung (1996, Chapters 6 and 7). Most have links to techniques developed independently in other disciplines. As well as non-linear extensions, a number of other analysis methods are discussed there.
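The $e = 0$ case above is easy to check numerically. The following minimal sketch uses only simulated data and numpy (none of the names come from Diamantaras and Kung (1996)); it confirms that taking the columns of $B = C$ to be the first $q$ eigenvectors of the sample covariance matrix of $x$ minimizes the average squared reconstruction error, the minimum being (up to the $1/(n-1)$ convention in the sample covariance) the sum of the $p - q$ smallest eigenvalues, as Property A5 asserts.

```python
import numpy as np

rng = np.random.default_rng(0)
p, q, n = 10, 3, 5000

# Simulated, column-centred observations of x with an arbitrary covariance.
A = rng.normal(size=(p, p))
X = rng.normal(size=(n, p)) @ A
X -= X.mean(axis=0)

S = np.cov(X, rowvar=False)                # sample covariance matrix of x
eigvals, eigvecs = np.linalg.eigh(S)       # eigenvalues in ascending order
B = eigvecs[:, ::-1][:, :q]                # first q eigenvectors, used as B = C

def mean_sq_error(B, C):
    """Sample analogue of E[(xhat - x)'(xhat - x)] with xhat = C B' x."""
    Xhat = X @ B @ C.T
    return np.mean(np.sum((Xhat - X) ** 2, axis=1))

pca_err = mean_sq_error(B, B)
Q, _ = np.linalg.qr(rng.normal(size=(p, q)))   # a random rank-q subspace
rand_err = mean_sq_error(Q, Q)

print(f"error with PCA subspace:           {pca_err:.3f}")
print(f"error with random subspace:        {rand_err:.3f}")   # larger
print(f"sum of p - q smallest eigenvalues: {eigvals[:p - q].sum():.3f}")
```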