Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)
9.2. Cluster Analysis

Figure 9.3. Aphids: plot with respect to the first two PCs showing four groups corresponding to species.

to verify that a given dissection ‘looks’ reasonable, rather than to attempt to identify clusters. An early example of this type of use was given by Moser and Scott (1961), in their Figure 9.2. The PCA in their study, which has already been mentioned in Section 4.2, was a stepping stone on the way to a cluster analysis of 157 British towns based on 57 variables. The PCs were used both in the construction of a distance measure, and as a means of displaying the clusters in two dimensions.

Principal components are used in cluster analysis in a similar manner in other examples discussed in Section 4.2, details of which can be found in Jolliffe et al. (1980, 1982a, 1986), Imber (1977) and Webber and Craig (1978). Each of these studies is concerned with demographic data, as is the example described next in detail.

Demographic Characteristics of English Counties

In an unpublished undergraduate dissertation, Stone (1984) considered a cluster analysis of 46 English counties. For each county there were 12
9. Principal Components Used with Other Multivariate Techniques

Table 9.1. Demographic variables used in the analysis of 46 English counties.

 1. Population density: numbers per hectare
 2. Percentage of population aged under 16
 3. Percentage of population above retirement age
 4. Percentage of men aged 16–65 who are employed
 5. Percentage of men aged 16–65 who are unemployed
 6. Percentage of population owning their own home
 7. Percentage of households which are ‘overcrowded’
 8. Percentage of employed men working in industry
 9. Percentage of employed men working in agriculture
10. (Length of public roads)/(area of county)
11. (Industrial floor space)/(area of county)
12. (Shops and restaurant floor space)/(area of county)

Table 9.2. Coefficients and variances for the first four PCs: English counties data.

Component number                            1       2       3       4

Variable  1                              0.35   −0.19    0.29    0.06
Variable  2                              0.02    0.60   −0.03    0.22
Variable  3                             −0.11   −0.52   −0.27   −0.36
Variable  4                             −0.30    0.07    0.59   −0.03
Variable  5                              0.31    0.05   −0.57    0.07
Variable  6                             −0.29    0.09   −0.07   −0.59
Variable  7                              0.38    0.04    0.09    0.08
Variable  8                              0.13    0.50   −0.14   −0.34
Variable  9                             −0.25   −0.17   −0.28    0.51
Variable 10                              0.37   −0.09    0.09   −0.18
Variable 11                              0.34    0.02   −0.00   −0.24
Variable 12                              0.35   −0.20    0.24    0.07

Eigenvalue                               6.27    2.53    1.16    0.96
Cumulative percentage
of total variation                       52.3    73.3    83.0    90.9
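
The last two rows of Table 9.2 are linked by simple arithmetic, which the following Python sketch illustrates. It assumes, as the printed figures suggest but this excerpt does not state, that the PCA was carried out on the correlation matrix, so that the total variation equals the number of variables, 12. The function name `cumulative_percentages` is hypothetical and used only for illustration. Because the eigenvalues are taken from the table rounded to two decimals, the recomputed percentages can differ from the printed ones in the last digit.

```python
# Eigenvalues of the first four PCs, copied from Table 9.2 (rounded to 2 d.p.).
eigenvalues = [6.27, 2.53, 1.16, 0.96]

# Assumption: PCA of a correlation matrix, so total variation = number of
# variables (12 demographic variables in Table 9.1).
total_variation = 12.0

# Cumulative percentages as printed in Table 9.2, for comparison.
printed = [52.3, 73.3, 83.0, 90.9]


def cumulative_percentages(values, total):
    """Return the cumulative percentage of `total` accounted for by `values`."""
    result = []
    running = 0.0
    for value in values:
        running += value
        result.append(100.0 * running / total)
    return result


computed = cumulative_percentages(eigenvalues, total_variation)

for k, (c, p) in enumerate(zip(computed, printed), start=1):
    print(f"first {k} PC(s): computed {c:.2f}%, printed {p:.1f}%")
```

The computed values agree with the printed row to within about 0.1 percentage points, which is consistent with the table having been produced from unrounded eigenvalues.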