12.07.2015 Views

Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)


4. Interpreting Principal Components: Examples

and forearm measurements; for men these two measurements are also important, but, in addition, hand and wrist measurements appear with the same signs as forearm and head, respectively. This component contributes 9%–14% of total variation.

Overall, the first three PCs account for a substantial proportion of total variation, 86.5% and 87.0% for women and men respectively. Although discussion of rules for deciding how many PCs to retain is deferred until Chapter 6, intuition strongly suggests that these percentages are large enough for three PCs to give an adequate representation of the data.

A similar but much larger study, using seven measurements on 3000 criminals, was reported by Macdonell (1902) and is quoted by Maxwell (1977). The first PC again measures overall size, the second contrasts head and limb measurements, and the third can be readily interpreted as measuring the shape (roundness versus thinness) of the head. The percentages of total variation accounted for by each of the first three PCs are 54.3%, 21.4% and 9.3%, respectively, very similar to the proportions given in Table 4.1.

The sample size (28) is rather small in our example compared with Macdonell's (1902), especially when the sexes are analysed separately, so caution is needed in making any inference about the PCs in the population of students from which the sample is drawn. However, the same variables have been measured for other classes of students, and similar PCs have been found (see Sections 5.1 and 13.5).
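
As a small worked check of the figures quoted above for Macdonell's (1902) study, the per-component percentages can be accumulated to give the share of total variation covered by the first one, two and three PCs. This is only an illustrative sketch: the names `percentages` and `cumulative_percentages` are assumptions introduced here, and the only data used are the three percentages given in the text.

```python
# Percentages of total variation for the first three PCs in Macdonell's (1902)
# study, as quoted in the text. No other data from the study are assumed.
percentages = [54.3, 21.4, 9.3]


def cumulative_percentages(values):
    """Return the running totals of `values`, rounded to one decimal place."""
    totals = []
    running = 0.0
    for value in values:
        running += value
        totals.append(round(running, 1))
    return totals


cumulative = cumulative_percentages(percentages)
print(cumulative)  # → [54.3, 75.7, 85.0]
```

The three-component total of 85.0% is close to the 86.5% and 87.0% reported above for women and men, which is the sense in which the two sets of results are similar.
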
In any case, a description of the sample, rather than inference about the underlying population, is often what is required, and the PCs describe the major directions of variation within a sample, regardless of the sample size.

4.2 The Elderly at Home

Hunt (1978) described a survey of the ‘Elderly at Home’ in which values of a large number of variables were collected for a sample of 2622 elderly individuals living in private households in the UK in 1976. The variables collected included standard demographic information of the type found in the decennial censuses, as well as information on dependency, social contact, mobility and income. As part of a project carried out for the Departments of the Environment and Health and Social Security, a PCA was done on a subset of 20 variables from Hunt’s (1978) data. These variables are listed briefly in Table 4.3. Full details of the variables, and also of the project as a whole, are given by Jolliffe et al. (1982a), while shorter accounts of the main aspects of the project are available in Jolliffe et al. (1980, 1982b). It should be noted that many of the variables listed in Table 4.3 are discrete, or even dichotomous.

Some authors suggest that PCA should only be done on continuous variables, preferably with normal distributions. However, provided that
