Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)

9. Principal Components Used with Other Multivariate Techniques

spanned by the first m terms in the SVD. This is equivalent to projecting the rows of Ŷ onto the subspace spanned by the first m PCs of Ŷ.

Two further equivalences are noted by ter Braak and Looman (1994), namely that the reduced rank regression model estimated in this way is equivalent to redundancy analysis, and also to PCA of instrumental variables, as introduced by Rao (1964) (see Section 14.3). Van den Brink and ter Braak (1999) also refer to redundancy analysis as 'PCA in which sample scores are constrained to be linear combinations of the explanatory [predictor] variables.' They extend redundancy analysis to the case where the variables in X and Y are observed over several time periods and the model changes with time. This extension is discussed further in Section 12.4.2. Because of the link with PCA, it is possible to construct biplots (see Section 5.3) of the regression coefficients in the reduced rank regression model (ter Braak and Looman, 1994).

Aldrin (2000) proposes a modification of reduced rank regression, called softly shrunk reduced-rank regression (SSRRR), in which the terms in the SVD of Ŷ are given varying non-zero weights, rather than the all-or-nothing inclusion/exclusion of terms in reduced rank regression. Aldrin (2000) also suggests that a subset of PCs of the predictor variables may be used as input for a reduced rank regression or SSRRR instead of the predictor variables themselves. In a simulation study comparing least squares with a number of biased multivariate regression procedures, SSRRR with PCs as input seems to be the best method overall.

Reduced rank regression models essentially assume a latent structure underlying the predictor variables, so that their dimensionality can be reduced below p_2. Burnham et al. (1999) describe so-called latent variable multivariate regression models, which take the idea of reduced rank regression further by postulating overlapping latent structures underlying both the response and predictor variables. The model can be written

    X = Z_X Γ_X + E_X,
    Y = Z_Y Γ_Y + E_Y,

where Z_X, Z_Y are of dimension (n × m) and contain values of m latent variables for the n observations; Γ_X, Γ_Y are (m × p_1), (m × p_2) matrices of unknown parameters, and E_X, E_Y are matrices of errors.

To fit this model, Burnham et al. (1999) suggest carrying out PCAs on the data in X, on that in Y, and on the combined (n × (p_1 + p_2)) matrix containing both response and predictor variables. In each PCA, a judgment is made of how many PCs seem to represent common underlying structure and how many represent error or noise. Suppose that the numbers of non-noisy PCs in the three analyses are m_X, m_Y and m_C, with obvious notation. The implication is then that the overlapping part of the latent structures has dimension m_X + m_Y − m_C. If m_X = m_Y = m_C there is complete overlap, whereas if m_C = m_X + m_Y there is none. This model
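A minimal numpy sketch of the reduced rank regression fit described at the top of this page may make it concrete: ordinary least squares followed by truncation of the SVD of Ŷ, which is the same as projecting the rows of Ŷ onto its first m PCs. It assumes column-centred X and Y (so that the right singular vectors of Ŷ coincide with its PC directions); the function name and interface are illustrative, not from the book.

```python
import numpy as np

def reduced_rank_regression(X, Y, m):
    """Rank-m regression of Y on X: least squares fit, then keep only
    the first m terms in the SVD of the fitted values Yhat."""
    # Full-rank least squares coefficients and fitted values
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    Yhat = X @ B_ols
    # SVD of Yhat; the rows of Vt are the PC directions of Yhat
    U, s, Vt = np.linalg.svd(Yhat, full_matrices=False)
    # Project onto the subspace spanned by the first m terms
    P = Vt[:m].T @ Vt[:m]
    return B_ols @ P, Yhat @ P   # rank-m coefficients, rank-m fitted values
```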
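SSRRR replaces that hard keep/drop of SVD terms with soft weights. A sketch under the same assumptions follows; Aldrin's (2000) rule for choosing the weights is not reproduced here, so they are treated as given. Using PCs of the predictors as input, as Aldrin also suggests, simply means replacing X by its first few PC scores before calling the function.

```python
import numpy as np

def softly_shrunk_rrr(X, Y, weights):
    """SSRRR-style fitted values: every term in the SVD of Yhat is
    retained but shrunk by a non-zero weight, rather than the
    all-or-nothing inclusion/exclusion of reduced rank regression."""
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    U, s, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
    # Multiply each SVD term u_k s_k v_k' by its weight w_k
    return (U * (s * np.asarray(weights))) @ Vt
```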
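Finally, the overlap calculation for the latent variable multivariate regression model can be sketched in the same style. Burnham et al. (1999) leave the count of non-noisy PCs to the analyst's judgment; the proportion-of-variance threshold below is only a hypothetical stand-in for that judgment.

```python
import numpy as np

def n_structural_pcs(Z, threshold=0.05):
    """Count PCs judged to reflect structure rather than noise, using
    a crude proportion-of-variance rule (a stand-in for the judgment
    described by Burnham et al., 1999)."""
    Zc = Z - Z.mean(axis=0)                         # column-centre
    pc_var = np.linalg.svd(Zc, compute_uv=False) ** 2
    return int(np.sum(pc_var / pc_var.sum() > threshold))

# m_X and m_Y come from separate PCAs of X and Y; m_C from the combined matrix:
#   m_X = n_structural_pcs(X); m_Y = n_structural_pcs(Y)
#   m_C = n_structural_pcs(np.hstack([X, Y]))
#   overlap = m_X + m_Y - m_C   # dimension of the shared latent structure
```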
