Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)

9.1. Discriminant Analysis

Corbitt and Ganesalingam (2001) also examine an extension to more than two groups based on the ideas of Dillon et al. (1989), but most of their paper consists of case studies in which only two groups are present. Corbitt and Ganesalingam (2001) show that a two-group version of their interpretation of Dillon et al.'s methodology is inferior to Jolliffe et al.'s (1996) t-tests with respect to correct classification. However, both are beaten in several of the examples studied by selecting a subset of the original variables.

Friedman (1989) demonstrates that a quantity similar to θ̂_k is also relevant in the case where the within-group covariance matrices are not necessarily equal. In these circumstances a discriminant score is formed for each group, and an important part of that score is a term corresponding to θ̂_k, with l_k, a_k replaced by the eigenvalues and eigenvectors of the covariance matrix for that group. Friedman (1989) notes that sample estimates of large eigenvalues are biased upwards, whereas estimates of small eigenvalues are biased downwards and, because the reciprocals of the eigenvalues appear in the discriminant scores, this can lead to an exaggerated influence of the low-variance PCs in the discrimination. To overcome this, he proposes a form of 'regularized' discriminant analysis in which sample covariance matrices for each group are shrunk towards the pooled estimate S_w. This has the effect of decreasing large eigenvalues and increasing small ones.

We return to regularized discriminant analysis later in this section, but we first note that Takemura (1985) also describes a bias in estimating eigenvalues in the context of one- and two-sample tests for multivariate normal means, based on Hotelling's T². For two groups, the question of whether or not it is worth calculating a discriminant function reduces to testing the null hypothesis H₀ : µ₁ = µ₂.
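The shrinkage step described by Friedman (1989) above can be sketched as follows. This is a minimal illustration, not Friedman's full procedure: the shrinkage weight `lam` is a hypothetical fixed value (in practice it would be chosen, e.g. by cross-validation), and the convex combination with the pooled estimate is the essential mechanism that compresses each group's eigenvalue spread.

```python
import numpy as np

def shrink_group_covariances(groups, lam=0.5):
    """Shrink each group's sample covariance towards the pooled
    within-group estimate S_w, in the spirit of Friedman (1989).

    groups : list of (n_k, p) data arrays, one per group
    lam    : shrinkage weight in [0, 1] (hypothetical fixed value here)
    """
    ns = [g.shape[0] for g in groups]
    covs = [np.cov(g, rowvar=False) for g in groups]
    # Pooled within-group covariance S_w (weighted by degrees of freedom)
    total_df = sum(n - 1 for n in ns)
    S_w = sum((n - 1) * S for n, S in zip(ns, covs)) / total_df
    # Pulling each S_k towards S_w compresses its eigenvalue spread:
    # large eigenvalues decrease and small ones increase, damping the
    # exaggerated influence of low-variance PCs in the discriminant score
    return [(1 - lam) * S + lam * S_w for S in covs]
```

With `lam = 0` each group keeps its own covariance (quadratic discrimination); with `lam = 1` all groups share S_w (linear discrimination); intermediate values interpolate between the two.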
This is often done using Hotelling's T². Läuter (1996) suggests a statistic based on a subset of the PCs of the overall covariance matrix. He concentrates on the case where only the first PC is used, for which one-sided, as well as global, alternatives to H₀ may be considered.

Takemura (1985) proposes a decomposition of T² into contributions due to individual PCs. In the two-sample case this is equivalent to calculating t-statistics to decide which PCs discriminate between the groups corresponding to the samples, although in Takemura's case the PCs are calculated from the within-group, rather than overall, covariance matrix. Takemura (1985) suggests that later PCs might be deemed significant, and hence selected, too often. However, Jolliffe et al. (1996) dispel these worries for their tests by conducting a simulation study which shows no tendency for over-selection of the low-variance PCs in the null case, and which also gives indications of the power of the t-test when the null hypothesis of equal means in the two populations is false. Interestingly, Mason and Gunst (1985) noted bias in the opposite direction in PC regression, namely that low-variance PCs are selected less, rather than more, often than the high-variance components. Given the links between regression and discrim-
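The per-PC decomposition discussed above can be sketched as a set of ordinary two-sample t-statistics computed on PC scores, with the PCs taken from the pooled within-group covariance as in Takemura's variant. This is an illustrative sketch under standard assumptions (equal group covariances for the pooled t), not a reproduction of any one author's exact procedure.

```python
import numpy as np

def pc_t_statistics(x, y):
    """One two-sample t-statistic per PC, where the PCs come from the
    pooled within-group covariance (as in Takemura's variant).

    x, y : (n1, p) and (n2, p) samples from the two groups.
    Returns an array of t-statistics, high-variance PCs first; a large
    |t| for component k suggests that PC discriminates between groups.
    """
    n1, n2 = x.shape[0], y.shape[0]
    S_w = ((n1 - 1) * np.cov(x, rowvar=False)
           + (n2 - 1) * np.cov(y, rowvar=False)) / (n1 + n2 - 2)
    # Eigenvectors of S_w define the within-group PCs
    evals, evecs = np.linalg.eigh(S_w)
    evecs = evecs[:, np.argsort(evals)[::-1]]   # sort: largest variance first
    sx, sy = x @ evecs, y @ evecs               # PC scores for each sample
    # Pooled two-sample t-statistic, computed one PC at a time
    pooled_var = ((n1 - 1) * sx.var(axis=0, ddof=1)
                  + (n2 - 1) * sy.var(axis=0, ddof=1)) / (n1 + n2 - 2)
    se = np.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (sx.mean(axis=0) - sy.mean(axis=0)) / se
```

Summing the squared t-statistics over all p components recovers a statistic closely related to the full two-sample T²; restricting the sum to a subset of PCs corresponds to the subset-based tests discussed in the text.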
