Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)

to retain. In reading the concluding paragraph that follows, this message should be kept firmly in mind.

Some procedures, such as those introduced in Sections 6.1.4 and 6.1.6, are usually inappropriate because they retain, respectively, too many or too few PCs in most circumstances. Some rules have been derived in particular fields of application, such as atmospheric science (Sections 6.1.3, 6.1.7) or psychology (Sections 6.1.3, 6.1.6) and may be less relevant outside these fields than within them. The simple rules of Sections 6.1.1 and 6.1.2 seem to work well in many examples, although the recommended cut-offs must be treated flexibly. Ideally the threshold should not fall between two PCs with very similar variances, and it may also change depending on the values of n and p, and on the presence of variables with dominant variances (see the examples in the next section). A large amount of research has been done on rules for choosing m since the first edition of this book appeared. However it still remains true that attempts to construct rules having more sound statistical foundations seem, at present, to offer little advantage over the simpler rules in most circumstances.

6.2 Choosing m, the Number of Components: Examples

Two examples are given here to illustrate several of the techniques described in Section 6.1; in addition, the examples of Section 6.4 include some relevant discussion, and Section 6.1.8 noted a number of comparative studies.

6.2.1 Clinical Trials Blood Chemistry

These data were introduced in Section 3.3 and consist of measurements of eight blood chemistry variables on 72 patients.

Table 6.1. First six eigenvalues for the correlation matrix, blood chemistry data.

Component number                   1      2      3      4      5      6
Eigenvalue, l_k                 2.79   1.53   1.25   0.78   0.62   0.49
t_m = 100 Σ_{k=1}^{m} l_k / p   34.9   54.1   69.7   79.4   87.2   93.3
l_{k−1} − l_k                          1.26   0.28   0.47   0.16   0.13
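The derived rows of Table 6.1 can be recomputed from the eigenvalue row. The following is a minimal sketch, assuming only the six printed eigenvalues and p = 8 (the text describes eight blood chemistry variables). The function names are hypothetical, chosen for illustration, and any cut-off passed to `smallest_m_reaching` is an assumption of the caller, not a value taken from this section. Because the printed eigenvalues are rounded to two decimals, the recomputed cumulative percentages can differ from the tabulated ones by about 0.1.

```python
# Eigenvalues l_1..l_6 as printed in Table 6.1 (rounded to two decimals).
eigenvalues = [2.79, 1.53, 1.25, 0.78, 0.62, 0.49]

# Number of variables. For a correlation matrix the eigenvalues sum to p,
# so p is the total variance. Assumed to be 8 from the text's description.
p = 8


def cumulative_percentage(l, p):
    """Return t_m = 100 * sum_{k=1}^{m} l_k / p for m = 1, ..., len(l)."""
    out = []
    running = 0.0
    for lk in l:
        running += lk
        out.append(100.0 * running / p)
    return out


def successive_differences(l):
    """Return l_{k-1} - l_k for k = 2, ..., len(l)."""
    return [a - b for a, b in zip(l, l[1:])]


def smallest_m_reaching(t, cutoff):
    """Return the smallest m with t_m >= cutoff, or None if none qualifies.

    The cutoff (a percentage) is supplied by the caller; it is an
    illustrative input, not a recommendation.
    """
    for m, tm in enumerate(t, start=1):
        if tm >= cutoff:
            return m
    return None


t = cumulative_percentage(eigenvalues, p)
d = successive_differences(eigenvalues)

if __name__ == "__main__":
    print("t_m:        ", [round(x, 1) for x in t])
    print("differences:", [round(x, 2) for x in d])
```

Calling `smallest_m_reaching(t, cutoff)` with several cut-offs shows how sensitive the retained number of PCs is to the threshold chosen, which is the point the surrounding discussion makes about treating cut-offs flexibly.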
The eigenvalues for the correlation matrix are given in Table 6.1, together with the related information that is required to implement the ad hoc methods described in Sections 6.1.1–6.1.3.

Looking at Table 6.1 and Figure 6.1, the three methods of Sections 6.1.1–6.1.3 suggest that between three and six PCs should be retained, but the decision on a single best number is not clear-cut. Four PCs account for
