Jolliffe, I.T., Principal Component Analysis, 2nd edition (Springer, 2002)
9.1. Discriminant Analysis

…between- to within-group variance of $\gamma'x$, subject to being uncorrelated with $\gamma_1'x, \gamma_2'x, \ldots, \gamma_{k-1}'x$. For more details see, for example, McLachlan (1992, Section 9.2) or Mardia et al. (1979, Section 12.5).

A variation on prewhitening using $S_w$ is to 'standardize' $S_b$ by dividing each variable by its within-group standard deviation and then calculate a between-group covariance matrix $S_b^*$ from these rescaled variables. Finding PCs based on $S_b^*$ is called discriminant principal components analysis by Yendle and MacFie (1989). They compare the results of this analysis with those from a PCA based on $S_b$ and also with canonical discriminant analysis in which the variables $x$ are replaced by their PCs. In two examples, Yendle and MacFie (1989) find, with respect to misclassification probabilities, the performance of PCA based on $S_b^*$ to be superior to that based on $S_b$, and comparable to that of canonical discriminant analysis using PCs. However, in the latter analysis only the restrictive special case is examined, in which the first $q$ PCs are retained. It is also not made explicit whether the PCs used are obtained from the overall covariance matrix $S$ or from $S_w$, though it seems likely that $S$ is used.

To conclude this section, we note some relationships between PCA and canonical discriminant analysis, via principal coordinate analysis (see Section 5.2), which are described by Gower (1966). Suppose that a principal coordinate analysis is done on a distance matrix whose elements are Mahalanobis distances between the samples from the $G$ populations. These distances are defined as the square roots of
$$\delta_{hi}^2 = (\bar{x}_h - \bar{x}_i)'\, S_w^{-1}\, (\bar{x}_h - \bar{x}_i); \qquad h, i = 1, 2, \ldots, G. \tag{9.1.3}$$
Gower (1966) then shows that the configuration found in $m\,(<G)$ dimensions is the same as that provided by the first $m$ canonical variates from canonical discriminant analysis. Furthermore, the same results may be found from a PCA with $X'X$ replaced by $(\bar{X}W)'(\bar{X}W)$, where $WW' = S_w^{-1}$ and $\bar{X}$ is the $(G \times p)$ matrix whose $h$th row gives the sample means of the $p$ variables for the $h$th population, $h = 1, 2, \ldots, G$. Yet another, related, way of finding canonical variates is via a two-stage PCA, as described by Campbell and Atchley (1981). At the first stage PCs are found, based on the within-group covariance matrix $S_w$, and standardized to have unit variance. The values of the means of these standardized PCs for each of the $G$ groups are then subjected to a weighted PCA (see Section 14.2.1), with weights proportional to the sample sizes $n_i$ in each group. The PC scores at this second stage are the values of the group means with respect to the canonical variates. Krzanowski (1990) generalizes canonical discriminant analysis, based on the common PCA model due to Flury (1988), using this two-stage derivation. Bensmail and Celeux (1996) also describe an approach to discriminant analysis based on the common PCA framework; this will be discussed further in Section 13.5. Campbell and Atchley (1981) note the possibility of an alternative analysis, different from canonical discriminant analysis, in which the PCA at the second of their two stages is unweighted.
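As a rough numerical illustration of the $S_b^*$ construction described above, the following Python sketch forms the pooled within-group covariance matrix $S_w$, rescales each variable by its within-group standard deviation, and extracts PCs of the resulting between-group matrix $S_b^*$. It is not Yendle and MacFie's (1989) implementation: the synthetic data, the size-weighted form of $S_b$, and the divisors $n - G$ and $n - 1$ are assumptions made only for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data (an assumption, not from the text): G = 3 groups of
# 30 observations on p = 4 variables, with different group means.
groups = [rng.normal(loc=mu, scale=1.0, size=(30, 4))
          for mu in ([0, 0, 0, 0], [2, 1, 0, 0], [0, 2, 1, 0])]

X = np.vstack(groups)
n, G = len(X), len(groups)
sizes = np.array([len(g) for g in groups])
means = np.array([g.mean(axis=0) for g in groups])      # (G x p) group means
grand_mean = X.mean(axis=0)

# Pooled within-group covariance matrix S_w (divisor n - G, a common convention).
S_w = sum((len(g) - 1) * np.cov(g, rowvar=False) for g in groups) / (n - G)

# Between-group covariance matrix S_b, weighting each group by its size.
dev = means - grand_mean
S_b = (sizes[:, None] * dev).T @ dev / (n - 1)

# 'Standardize' each variable by its within-group standard deviation and
# recompute the between-group covariance matrix from the rescaled variables.
dev_std = dev / np.sqrt(np.diag(S_w))
S_b_star = (sizes[:, None] * dev_std).T @ dev_std / (n - 1)

# Discriminant principal components: eigenvectors of S_b*, largest eigenvalues first.
vals, vecs = np.linalg.eigh(S_b_star)
order = np.argsort(vals)[::-1]
disc_pcs = vecs[:, order]

print("leading eigenvalue of S_b :", np.linalg.eigvalsh(S_b)[-1])
print("leading eigenvalue of S_b*:", vals[order][0])
```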

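Gower's (1966) route from Mahalanobis distances to canonical variates can be sketched in the same spirit. The helpers below compute the squared distances $\delta_{hi}^2$ of (9.1.3) between the $G$ group means and then apply classical principal coordinate analysis (double-centring of the squared-distance matrix, as in Section 5.2). The function names are invented for the illustration, and the commented usage line assumes `means` and `S_w` from the previous sketch.

```python
import numpy as np

def mahalanobis_distance_matrix(means, S_w):
    """Squared Mahalanobis distances (9.1.3) between all pairs of group means."""
    S_w_inv = np.linalg.inv(S_w)
    diff = means[:, None, :] - means[None, :, :]          # (G, G, p) pairwise differences
    return np.einsum('hip,pq,hiq->hi', diff, S_w_inv, diff)

def principal_coordinates(delta_sq, m):
    """Classical principal coordinate analysis of a squared-distance matrix."""
    G = delta_sq.shape[0]
    J = np.eye(G) - np.ones((G, G)) / G                   # centring matrix
    B = -0.5 * J @ delta_sq @ J                           # double-centred inner products
    vals, vecs = np.linalg.eigh(B)
    keep = np.argsort(vals)[::-1][:m]
    return vecs[:, keep] * np.sqrt(np.maximum(vals[keep], 0.0))

# With `means` and `S_w` as in the previous sketch, the m = 2 configuration
# below reproduces the first two canonical variates of the group means
# (up to reflection), which is Gower's (1966) result.
# coords = principal_coordinates(mahalanobis_distance_matrix(means, S_w), m=2)
```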

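Finally, the two-stage PCA of Campbell and Atchley (1981) can be sketched as a single function. The whitening step uses the spectral decomposition of $S_w$, so the implied transformation satisfies $WW' = S_w^{-1}$, the same condition as in Gower's construction. The function name, divisor conventions and synthetic `groups` input are again illustrative assumptions rather than the authors' code.

```python
import numpy as np

def two_stage_canonical_variates(groups):
    """Sketch of Campbell and Atchley's (1981) two-stage PCA.

    Stage 1: PCs of the pooled within-group covariance matrix S_w, scaled
             so that each PC has unit within-group variance.
    Stage 2: weighted PCA of the group means of those standardized PCs,
             with weights proportional to the group sizes n_i.
    Returns the group means expressed in the (sketched) canonical variates.
    """
    n, G = sum(len(g) for g in groups), len(groups)
    sizes = np.array([len(g) for g in groups], dtype=float)
    S_w = sum((len(g) - 1) * np.cov(g, rowvar=False) for g in groups) / (n - G)

    # Stage 1: whitening transformation W with W W' = S_w^{-1}.
    lam, A = np.linalg.eigh(S_w)
    W = A / np.sqrt(lam)                                   # column j = a_j / sqrt(lambda_j)

    means = np.array([g.mean(axis=0) for g in groups])
    Z = means @ W                                          # group means of the standardized PCs

    # Stage 2: weighted PCA of the G group means, weights proportional to n_i.
    w = sizes / sizes.sum()
    Z_c = Z - w @ Z                                        # centre at the weighted mean
    C = (w[:, None] * Z_c).T @ Z_c                         # weighted covariance of the means
    _, V = np.linalg.eigh(C)
    return Z_c @ V[:, ::-1]                                # scores, largest variance first

# Example (using the `groups` list from the first sketch):
# group_means_on_cvs = two_stage_canonical_variates(groups)
```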
