Jolliffe, I., Principal Component Analysis (2nd ed., Springer, 2002)

9.3. Canonical Correlation Analysis and Related Techniques

by Stewart and Love (1968), and is an index of the average proportion of the variance of the variables in one set that is reproducible from the variables in the other set. One immediate difference from both CCA and maximum covariance analysis is that it does not view the two sets of variables symmetrically. One set is treated as response variables and the other as predictor variables, and the results of the analysis are different depending on the choice of which set contains responses. For convenience, in what follows $\mathbf{x}_{p_1}$ and $\mathbf{x}_{p_2}$ consist of responses and predictors, respectively.

Stewart and Love's (1968) redundancy index, given a pair of canonical variates, can be expressed as the product of two terms. These terms are the squared canonical correlation and the variance of the canonical variate for the response set. It is clear that a different value results if the rôles of predictor and response variables are reversed. The redundancy coefficient can be obtained by regressing each response variable on all the predictor variables and then averaging the $p_1$ squared multiple correlations from these regressions. This has a link to the interpretation of PCA given in the discussion of Property A6 in Chapter 2, and was used by van den Wollenberg (1977) and Thacker (1999) to introduce two slightly different techniques.

In van den Wollenberg's (1977) redundancy analysis, linear functions $\mathbf{a}'_{k2}\mathbf{x}_{p_2}$ of $\mathbf{x}_{p_2}$ are found that successively maximize their average squared correlation with the elements of the response set $\mathbf{x}_{p_1}$, subject to the vectors of loadings $\mathbf{a}_{12}, \mathbf{a}_{22}, \ldots$ being orthogonal. It turns out (van den Wollenberg, 1977) that finding the required linear functions is achieved by solving the equation
$$\mathbf{R}_{xy}\mathbf{R}_{yx}\mathbf{a}_{k2} = l_k\mathbf{R}_{xx}\mathbf{a}_{k2}, \qquad (9.3.2)$$
where $\mathbf{R}_{xx}$ is the correlation matrix for the predictor variables, $\mathbf{R}_{xy}$ is the matrix of correlations between the predictor and response variables, and $\mathbf{R}_{yx}$ is the transpose of $\mathbf{R}_{xy}$. A linear function of $\mathbf{x}_{p_1}$ can be found by reversing the rôles of predictor and response variables, and hence replacing $x$ by $y$ and vice versa, in equation (9.3.2).

Thacker (1999) also considers a linear function $z_1 = \mathbf{a}'_{12}\mathbf{x}_{p_2}$ of the predictors $\mathbf{x}_{p_2}$. Again $\mathbf{a}_{12}$ is chosen to maximize $\sum_{j=1}^{p_1} r^2_{1j}$, where $r_{1j}$ is the correlation between $z_1$ and the $j$th response variable. The variable $z_1$ is called the first principal predictor by Thacker (1999). Second, third, ... principal predictors are defined by maximizing the same quantity, subject to the constraint that each principal predictor must be uncorrelated with all previous principal predictors. Thacker (1999) shows that the vectors of loadings $\mathbf{a}_{12}, \mathbf{a}_{22}, \ldots$ are solutions of the equation
$$\mathbf{S}_{xy}[\operatorname{diag}(\mathbf{S}_{yy})]^{-1}\mathbf{S}_{yx}\mathbf{a}_{k2} = l_k\mathbf{S}_{xx}\mathbf{a}_{k2}, \qquad (9.3.3)$$
where $\mathbf{S}_{xx}$, $\mathbf{S}_{yy}$, $\mathbf{S}_{xy}$ and $\mathbf{S}_{yx}$ are covariance matrices defined analogously to the correlation matrices $\mathbf{R}_{xx}$, $\mathbf{R}_{yy}$, $\mathbf{R}_{xy}$ and $\mathbf{R}_{yx}$ above. The eigenvalue $l_k$ corresponding to $\mathbf{a}_{k2}$ is equal to the sum of squared correlations $\sum_{j=1}^{p_1} r^2_{kj}$ between $z_k$ and the response variables.
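The regression route to the redundancy coefficient described above is easy to verify numerically. Below is a minimal sketch in Python (NumPy assumed; the simulated data and all variable names are illustrative, not from the text): each response variable is regressed on all the predictors, and the $p_1$ squared multiple correlations are averaged.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p1, p2 = 500, 3, 4                    # illustrative sizes, not from the text
X = rng.standard_normal((n, p2))         # predictor set x_{p_2}
Y = X @ rng.standard_normal((p2, p1)) + rng.standard_normal((n, p1))  # responses x_{p_1}

# Regress each response variable on all predictors; the squared multiple
# correlation for each regression is 1 - (residual variance / response variance).
Xc, Yc = X - X.mean(0), Y - Y.mean(0)
beta, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)
resid = Yc - Xc @ beta
r_squared = 1.0 - resid.var(0) / Yc.var(0)

# The redundancy coefficient is the average of the p1 squared multiple correlations.
print(r_squared, r_squared.mean())
```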
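Equations (9.3.2) and (9.3.3) are both symmetric-definite generalized eigenproblems, so standard solvers apply. The sketch below (SciPy assumed; same simulated-data conventions as above) solves both and checks the closing statement that the eigenvalue $l_k$ equals the sum of squared correlations between the variate and the responses.

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
n, p1, p2 = 500, 3, 4
X = rng.standard_normal((n, p2))
Y = X @ rng.standard_normal((p2, p1)) + rng.standard_normal((n, p1))

# Correlation blocks, with responses first and predictors second.
R = np.corrcoef(np.hstack([Y, X]), rowvar=False)
Ryx, Rxy, Rxx = R[:p1, p1:], R[p1:, :p1], R[p1:, p1:]

# Van den Wollenberg (9.3.2): R_xy R_yx a = l R_xx a.  eigh handles the
# symmetric-definite pair and normalizes each a so that a' R_xx a = 1,
# i.e. the variate a'x has unit variance.
l, A = eigh(Rxy @ Ryx, Rxx)
a1, l1 = A[:, -1], l[-1]                 # leading solution (eigh sorts ascending)

# Check: l_1 is the sum over responses of squared correlations with
# z_1 = a_1'x, so l_1 / p1 is the maximized average squared correlation.
z1 = ((X - X.mean(0)) / X.std(0)) @ a1
r1 = np.array([np.corrcoef(z1, Y[:, j])[0, 1] for j in range(p1)])
print(l1, (r1 ** 2).sum())

# Thacker (9.3.3): S_xy [diag(S_yy)]^{-1} S_yx a = l S_xx a, with covariance
# blocks in place of correlation blocks.  eigh returns S_xx-orthogonal
# vectors, so successive variates are uncorrelated, matching Thacker's constraint.
S = np.cov(np.hstack([Y, X]), rowvar=False)
Syy, Syx, Sxy, Sxx = S[:p1, :p1], S[:p1, p1:], S[p1:, :p1], S[p1:, p1:]
l2, A2 = eigh(Sxy @ np.diag(1.0 / np.diag(Syy)) @ Syx, Sxx)
print(l2[-1])   # again the sum of squared correlations for the leading variate
```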
