Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)
are actually derived from correlation matrices corresponding exactly to the underlying model. In practice, the model itself is unknown and must be estimated from a data set. This allows more scope for divergence between the results from PCA and from factor analysis. There have been a number of studies in which PCA and factor analysis are compared empirically on data sets, with comparisons usually based on a subjective assessment of how well and simply the results can be interpreted. A typical study of this sort from atmospheric science is Bärring (1987). There have also been a number of comparative simulation studies, such as Snook and Gorsuch (1989), in which, unsurprisingly, PCA is inferior to factor analysis in finding underlying structure in data simulated from a factor model.

There has been much discussion in the behavioural science literature of the similarities and differences between PCA and factor analysis. For example, 114 pages of the first issue in 1990 of Multivariate Behavioral Research were devoted to a lead article by Velicer and Jackson (1990) on 'Component analysis versus common factor analysis ...,' together with 10 shorter discussion papers by different authors and a rejoinder by Velicer and Jackson. Widaman (1993) continued this debate, and concluded that '... principal component analysis should not be used if a researcher wishes to obtain parameters reflecting latent constructs or factors.' This conclusion reflects the fact that underlying much of the 1990 discussion is the assumption that unobservable factors are being sought from which the observed behavioural variables can be derived. Factor analysis is clearly designed with this objective in mind, whereas PCA does not directly address it.
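The simulation finding reported above can be sketched in a few lines. The following is an illustration, not code from the book or from Snook and Gorsuch (1989): data are generated from a one-factor model x = λf + e, and the loadings implied by the first PC of the correlation matrix are compared with the true λ. The names and values used here (`lam`, `psi`, `n`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical one-factor model with p = 4 variables and unit variances:
# x = lam * f + e, where lam holds the true loadings (illustrative values)
# and psi holds the uniquenesses.
lam = np.array([0.9, 0.8, 0.7, 0.6])            # true factor loadings
psi = 1.0 - lam**2                              # specific (unique) variances

n = 5000
f = rng.standard_normal(n)                      # common factor scores
e = rng.standard_normal((n, 4)) * np.sqrt(psi)  # unique errors
x = np.outer(f, lam) + e                        # data simulated from the model

# PCA on the correlation matrix: the first eigenvector, scaled by the square
# root of its eigenvalue, plays the role of a loading vector, but it absorbs
# part of the unique variances psi rather than isolating lam.
r = np.corrcoef(x, rowvar=False)
evals, evecs = np.linalg.eigh(r)                # ascending order
pc1 = evecs[:, -1] * np.sqrt(evals[-1])         # first PC 'loadings'
pc1 *= np.sign(pc1[0])                          # fix the arbitrary sign

print(np.round(pc1, 2))  # close to lam in pattern, but systematically inflated
```

With a large sample the PC loadings track the pattern of λ, yet each entry is pulled upward by the corresponding uniqueness; the variable with the largest uniqueness is distorted most, which is the sense in which PCA recovers the factor structure only approximately.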
Thus, at best, PCA provides an approximation to what is truly required.

PCA and factor analysis give similar numerical results for many examples. However, PCA should only be used as a surrogate for factor analysis with full awareness of the differences between the two techniques, and even then caution is necessary. Sato (1990), who, like Schneeweiss and Mathes (1995) and Schneeweiss (1997), gives a number of theoretical comparisons, showed that for m = 1 and small p the loadings given by factor analysis and by PCA can sometimes be quite different.

7.4 An Example of Factor Analysis

The example that follows is fairly typical of the sort of data that are often subjected to a factor analysis. The data were originally discussed by Yule et al. (1969) and consist of scores for 150 children on ten subtests of the Wechsler Pre-School and Primary Scale of Intelligence (WPPSI); there are thus 150 observations on ten variables. The WPPSI tests were designed to measure 'intelligence' of children aged 4½–6 years, and the 150 children tested in the Yule et al. (1969) study were a sample of children who entered school in the Isle of Wight in the autumn of 1967, and who were tested
during their second term in school. Their average age at the time of testing was 5 years, 5 months. Similar data sets are analysed in Lawley and Maxwell (1971).

Table 7.1 gives the variances and the coefficients of the first four PCs, when the analysis is done on the correlation matrix. It is seen that the first four components explain nearly 76% of the total variation, and that the variance of the fourth PC is 0.71. The fifth PC, with a variance of 0.51, would be discarded by most of the rules described in Section 6.1 and, indeed, in factor analysis it would be more usual to keep only two, or perhaps three, factors in the present example. Figures 7.1 and 7.2 earlier in the chapter showed the effect of rotation in this example when only two PCs are considered; here, where four PCs are retained, the effect of rotation cannot easily be represented in the same diagrammatic way.

All of the correlations between the ten variables are positive, so the first PC has the familiar pattern of being an almost equally weighted 'average' of all ten variables. The second PC contrasts the first five variables with the final five. This is not unexpected, as these two sets of variables are of different types, namely 'verbal' tests and 'performance' tests, respectively. The third PC is mainly a contrast between variables 6 and 9, which interestingly were at the time the only two 'new' tests in the WPPSI battery, and the fourth does not have a very straightforward interpretation.

Table 7.2 gives the factor loadings when the first four PCs are rotated using an orthogonal rotation method (varimax), and an oblique method (direct quartimin). It would be counterproductive to give more varieties of factor analysis for this single example, as the differences in detail tend to obscure the general conclusions that are drawn below. Often, results are far less sensitive to the choice of rotation criterion than to the choice of how many factors to rotate.
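The arithmetic behind Table 7.1 can be sketched as follows. The WPPSI scores themselves are not reproduced here, so this hedged example simulates stand-in data of the same shape (150 observations, ten positively correlated variables); the point is that a correlation-matrix PCA yields eigenvalues summing to p, so each eigenvalue divided by p is that component's proportion of variance, and a cut-off such as a variance of about 0.7 is one of the retention rules discussed in Section 6.1.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in data (the WPPSI scores are not reproduced here): n = 150
# observations on p = 10 variables sharing one common source of variation,
# so that all correlations tend to be positive, as in the example.
n, p = 150, 10
base = rng.standard_normal((n, 1))
x = 0.7 * base + rng.standard_normal((n, p))

# PCA on the correlation matrix: the eigenvalues are the PC variances and
# sum to p (the trace of the correlation matrix), so evals[k] / p is the
# proportion of total variance accounted for by the (k+1)-th PC.
r = np.corrcoef(x, rowvar=False)
evals = np.linalg.eigvalsh(r)[::-1]   # sorted, largest first

prop = evals / p
cum = np.cumsum(prop)

# One rule of thumb from Section 6.1: retain PCs whose variance exceeds
# roughly 0.7 (the exact cut-off is a judgement call).
retained = int(np.sum(evals > 0.7))

print(np.round(cum, 2), retained)
```

In the book's example the first four eigenvalues give a cumulative proportion near 0.76, and the fourth PC's variance of 0.71 sits just above such a cut-off while the fifth, at 0.51, falls well below it.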
Many further examples can be found in texts on factor analysis such as Cattell (1978), Lawley and Maxwell (1971), Lewis-Beck (1994) and Rummel (1970).

In order to make comparisons between Table 7.1 and Table 7.2 straightforward, the sums of squares of the PC coefficients and factor loadings are normalized to be equal to unity for each factor. Typically, the output from computer packages that implement factor analysis uses the normalization in which the sum of squares of coefficients in each PC before rotation is equal to the variance (eigenvalue) associated with that PC (see Section 2.3). The latter normalization is used in Figures 7.1 and 7.2. The choice of normalization constraints is important in rotation, as it determines the properties of the rotated factors. Detailed discussion of these properties in the context of rotated PCs is given in Section 11.1.

The correlations between the oblique factors in Table 7.2 are given in Table 7.3, and it can be seen that there is a non-trivial degree of correlation between the factors given by the oblique method. Despite this, the structure of the factor loadings is very similar for the two factor rotation methods.
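A varimax rotation of a loading matrix, together with the unit-sum-of-squares normalization used in Table 7.2, can be sketched as below. This is a minimal implementation of the standard varimax algorithm, not code associated with the book, and the loading matrix `a` is made up for illustration (the WPPSI tables are not reproduced here).

```python
import numpy as np

def varimax(loadings, tol=1e-8, max_iter=500):
    """Rotate a p x k loading matrix by an orthogonal matrix chosen to
    maximize the varimax criterion (standard SVD-based iteration)."""
    p, k = loadings.shape
    rot = np.eye(k)
    crit = 0.0
    for _ in range(max_iter):
        lr = loadings @ rot
        u, s, vt = np.linalg.svd(
            loadings.T @ (lr**3 - lr @ np.diag((lr**2).sum(axis=0)) / p)
        )
        rot = u @ vt                      # always an orthogonal matrix
        if s.sum() < crit * (1 + tol):    # criterion no longer improving
            break
        crit = s.sum()
    return loadings @ rot

# Made-up 10 x 4 loading matrix standing in for the retained PCs.
rng = np.random.default_rng(2)
a = rng.standard_normal((10, 4))

rotated = varimax(a)

# Normalization as in Table 7.2: sum of squares equal to unity in each
# column.  Package output instead often scales each unrotated column so its
# sum of squares equals the corresponding eigenvalue (see Section 2.3).
rotated_unit = rotated / np.sqrt((rotated**2).sum(axis=0))
```

Because the rotation matrix is orthogonal, the fitted matrix of products `rotated @ rotated.T` is unchanged by the rotation; what changes, and what the choice of normalization controls, is how variance is apportioned among the rotated columns, which is why the normalization constraint matters for the properties of the rotated factors.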