Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)


6.2. Choosing m, the Number of Components: Examples

Table 6.3. First six eigenvalues for the covariance matrix, gas chromatography data.

Component number          1         2        3        4        5        6
Eigenvalue, $l_k$         312187    2100     768      336      190      149
$l_k / \bar{l}$           9.88      0.067    0.024    0.011    0.006    0.005
$t_m$                     98.8      99.5     99.7     99.8     99.9     99.94
$l_{k-1} - l_k$                     310087   1332     432      146      51
R                         0.02      0.43     0.60     0.70     0.83     0.99
W                         494.98    4.95     1.90     0.92     0.41     0.54

where $t_m = 100 \sum_{k=1}^{m} l_k \big/ \sum_{k=1}^{p} l_k$.

the inclusion of five PCs in this example but, in fact, he slightly modifies his criterion for retaining PCs. His nominal cut-off for including the kth PC is R

6. Choosing a Subset of Principal Components or Variables

Figure 6.2. LEV diagram for the covariance matrix: gas chromatography data.

logue of the scree diagram. However, although this interpretation may be valid for the correlation matrices in his simulations, it does not seem to hold for the dominant variance structures exhibited in Tables 6.2 and 6.3.

For correlation matrices, and presumably for covariance matrices with less extreme variation among eigenvalues, the ad hoc methods and the cross-validatory criteria are likely to give more similar results. This is illustrated by a simulation study in Krzanowski (1983), where W is compared with the first two ad hoc rules with cut-offs at $t^* = 75\%$ and $l^* = 1$, respectively. Bartlett’s test, described in Section 6.1.4, is also included in the comparison but, as expected from the earlier discussion, it retains too many PCs in most circumstances. The behaviour of W compared with the two ad hoc rules is the reverse of that observed in the example above. W retains fewer PCs than the $t_m > 75\%$ criterion, despite the fairly low cut-off of 75%. Similar numbers of PCs are retained for W and for the rule based on $l_k > 1$. The latter rule retains more PCs if the cut-off is lowered to 0.7 rather than 1.0, as suggested in Section 6.1.2. It can also be argued that the cut-off for W should be reduced below unity (see Section 6.1.5), in which case all three rules will give similar results.
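The two ad hoc rules compared above can be sketched in a few lines of Python. This is a minimal illustration, not code from the book: the function names and the eigenvalues are hypothetical, chosen so that they sum to p = 6 as the eigenvalues of a 6 x 6 correlation matrix would. The first rule keeps the smallest m for which the cumulative percentage $t_m$ exceeds $t^*$; the second keeps every PC whose eigenvalue exceeds $l^*$.

```python
import numpy as np


def cumulative_percentage(eigenvalues):
    """Return t_m = 100 * sum_{k<=m} l_k / sum_{k<=p} l_k for m = 1, ..., p."""
    # Sort in decreasing order so that l_1 >= l_2 >= ... >= l_p.
    l = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    return 100.0 * np.cumsum(l) / l.sum()


def m_by_cumulative_percentage(eigenvalues, t_star=75.0):
    """Smallest m such that t_m exceeds t_star (in percent)."""
    t = cumulative_percentage(eigenvalues)
    exceeds = t > t_star
    if not exceeds.any():
        # No m reaches the threshold (e.g. t_star >= 100): keep all PCs.
        return len(t)
    return int(np.argmax(exceeds)) + 1


def m_by_eigenvalue_cutoff(eigenvalues, l_star=1.0):
    """Number of PCs whose eigenvalue l_k exceeds l_star."""
    l = np.asarray(eigenvalues, dtype=float)
    return int(np.sum(l > l_star))


# Hypothetical eigenvalues, for illustration only (not from Table 6.3).
hypothetical_eigenvalues = [2.4, 1.5, 0.9, 0.6, 0.4, 0.2]

m_t75 = m_by_cumulative_percentage(hypothetical_eigenvalues, t_star=75.0)
m_l1 = m_by_eigenvalue_cutoff(hypothetical_eigenvalues, l_star=1.0)
m_l07 = m_by_eigenvalue_cutoff(hypothetical_eigenvalues, l_star=0.7)
print(m_t75, m_l1, m_l07)
```

For these made-up eigenvalues the 75% rule retains three PCs, the $l_k > 1$ rule retains two, and lowering the cut-off to 0.7 brings the second rule up to three, which is the direction of the effect described in the text.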
