Jolliffe, I. T. (2002). Principal Component Analysis (2nd ed.). Springer.
9 Principal Components Used with Other Multivariate Techniques

Principal component analysis is often used as a dimension-reducing technique within some other type of analysis. For example, Chapter 8 described the use of PCs as regressor variables in a multiple regression analysis. The present chapter discusses three classes of multivariate techniques, namely discriminant analysis, cluster analysis and canonical correlation analysis; for each of these three there are examples in the literature that use PCA as a dimension-reducing technique.

Discriminant analysis is concerned with data in which each observation comes from one of several well-defined groups or populations. Assumptions are made about the structure of the populations, and the main objective is to construct rules for assigning future observations to one of the populations so as to minimize the probability of misclassification or some similar criterion. As with regression, there can be advantages in replacing the variables in a discriminant analysis by their principal components. The use of PCA in this way in linear discriminant analysis is discussed in Section 9.1. In addition, the section includes brief descriptions of other discriminant techniques that use PCs, and discussion of links between PCA and canonical discriminant analysis.

Cluster analysis is one of the most frequent contexts in which PCs are derived in order to reduce dimensionality prior to the use of a different multivariate technique. Like discriminant analysis, cluster analysis deals with data sets in which the observations are to be divided into groups. However, in cluster analysis little or nothing is known a priori about the groups, and the objective is to divide the given observations into groups or clusters in a 'sensible' way. There are two main ways in which PCs are employed within cluster analysis: to construct distance measures or to provide a graphical representation of the data; the latter is often called ordination or scaling (see also Section 5.1) and is useful in detecting or verifying a cluster structure. Both rôles are described and illustrated with examples in Section 9.2. The idea of clustering variables rather than observations is sometimes useful, and a connection between PCA and this idea is described. Also discussed in Section 9.2 are projection pursuit, which searches for clusters using techniques bearing some resemblance to PCA, and the construction of models for clusters which are mixtures of the PC model introduced in Section 3.9.

'Discriminant analysis' and 'cluster analysis' are standard statistical terms, but the techniques may be encountered under a variety of other names. For example, the word 'classification' is sometimes used in a broad sense, including both discrimination and clustering, but it also has more than one specialized meaning. Discriminant analysis and cluster analysis are prominent in both the pattern recognition and neural network literatures, where they fall within the areas of supervised and unsupervised learning, respectively (see, for example, Bishop (1995)). The relatively new, but large, field of data mining (Hand et al., 2001; Witten and Frank, 2000) also includes 'clustering methods ... [and] supervised classification methods in general ...' (Hand, 1998).

The third, and final, multivariate technique discussed in this chapter, in Section 9.3, is canonical correlation analysis. This technique is appropriate when the vector of random variables x is divided into two parts, x_{p1}, x_{p2}, and the objective is to find pairs of linear functions of x_{p1} and x_{p2}, respectively, such that the correlation between the linear functions within each pair is maximized.
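The maximization just described can be computed directly: whiten each block of variables and take the SVD of the whitened cross-covariance matrix, whose singular values are the canonical correlations. Below is a minimal numpy sketch of this standard construction; the simulated two-block data set and all variable names are illustrative assumptions, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: two blocks of variables measured on the same n observations,
# sharing a two-dimensional latent signal so that real correlation exists.
n, p1, p2 = 500, 4, 3
z = rng.standard_normal((n, 2))
x1 = z @ rng.standard_normal((2, p1)) + 0.5 * rng.standard_normal((n, p1))
x2 = z @ rng.standard_normal((2, p2)) + 0.5 * rng.standard_normal((n, p2))

def cca(x1, x2):
    """Canonical correlations and weight vectors via the SVD of the
    whitened cross-covariance matrix."""
    x1 = x1 - x1.mean(axis=0)
    x2 = x2 - x2.mean(axis=0)
    s11 = x1.T @ x1 / (len(x1) - 1)   # within-block covariance of block 1
    s22 = x2.T @ x2 / (len(x2) - 1)   # within-block covariance of block 2
    s12 = x1.T @ x2 / (len(x1) - 1)   # cross-covariance between blocks
    # Whiten each block with the inverse Cholesky factor; the singular
    # values of the whitened cross-covariance are the canonical correlations.
    w1 = np.linalg.inv(np.linalg.cholesky(s11))
    w2 = np.linalg.inv(np.linalg.cholesky(s22))
    u, rho, vt = np.linalg.svd(w1 @ s12 @ w2.T)
    a = w1.T @ u       # weights for the linear functions of block 1
    b = w2.T @ vt.T    # weights for the linear functions of block 2
    return rho, a, b

rho, a, b = cca(x1, x2)
# The first pair of linear functions attains the largest correlation:
u1 = (x1 - x1.mean(axis=0)) @ a[:, 0]
v1 = (x2 - x2.mean(axis=0)) @ b[:, 0]
print(rho[0], np.corrcoef(u1, v1)[0, 1])  # these two numbers agree
```

With a = w1.T u, the variance of each linear function is 1 by construction, so the singular value rho[0] is exactly the correlation between the first pair, which the final line confirms numerically.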
In this case the replacement of x_{p1}, x_{p2} by some or all of the PCs of x_{p1}, x_{p2}, respectively, has been suggested in the literature. A number of other techniques linked to PCA that are used to investigate relationships between two groups of variables are also discussed in Section 9.3. Situations where more than two groups of variables are to be analysed are left to Section 14.5.

9.1 Discriminant Analysis

In discriminant analysis, observations may be taken from any of G ≥ 2 populations or groups. Assumptions are made regarding the structure of these groups, namely that the random vector x associated with each observation is assumed to have a particular (partly or fully specified) distribution depending on its group membership. Information may also be available about the overall relative frequencies of occurrence of each group. In addition, there is usually available a set of data x_1, x_2, ..., x_n (the training set) for which the group membership of each observation is known. Based
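The two-step use of PCA within discriminant analysis described above — reduce the training set to a few PC scores, then build a discriminant rule on those scores — can be sketched as follows. This is a minimal numpy illustration, not the book's method: the simulated two-group training set, the choice of q = 2 components, and the equal-covariance (Mahalanobis-distance-to-centroid) linear rule are all assumptions made here for the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative training set: n observations from G = 2 groups of p variables,
# with the groups differing in the first two variables only.
n_per, p = 100, 6
means = np.zeros((2, p))
means[1, :2] = 2.0
x = np.vstack([rng.standard_normal((n_per, p)) + m for m in means])
g = np.repeat([0, 1], n_per)          # known group memberships (training labels)

# Step 1 (dimension reduction): replace the p variables by their first q PC scores.
q = 2
xc = x - x.mean(axis=0)
_, _, vt = np.linalg.svd(xc, full_matrices=False)   # rows of vt are PC loadings
scores = xc @ vt[:q].T                              # PC scores: the new variables

# Step 2: a linear discriminant rule on the PC scores -- assign each observation
# to the group whose mean score is nearest in the metric of the pooled
# within-group covariance (the classical equal-covariance, equal-priors rule).
centroids = np.array([scores[g == k].mean(axis=0) for k in (0, 1)])
w = sum((scores[g == k] - centroids[k]).T @ (scores[g == k] - centroids[k])
        for k in (0, 1)) / (len(x) - 2)
w_inv = np.linalg.inv(w)

def classify(s):
    d = [(s - c) @ w_inv @ (s - c) for c in centroids]
    return int(np.argmin(d))

train_acc = np.mean([classify(s) == gi for s, gi in zip(scores, g)])
print(train_acc)
```

Because the group-mean separation inflates variance along the between-group direction, that direction is well represented in the leading PCs here, so the rule built on only q = 2 scores classifies the training set accurately; as the chapter goes on to discuss, low-variance PCs can also carry discriminatory information, so retaining only the leading PCs is not always safe.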