12.07.2015 Views

Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)



6.1. How Many Principal Components?

bootstrap rules, the results are viewed from a factor analysis rather than a PCA perspective.

Franklin et al. (1995) compare 39 published analyses from ecology. They seem to start from the unproven premise that ‘parallel analysis’ (see Section 6.1.3) selects the ‘correct’ number of components or factors to retain, and then investigate in how many of the 39 analyses ‘too many’ or ‘too few’ dimensions are chosen. Franklin et al. (1995) claim that 2/3 of the 39 analyses retain too many dimensions. However, as with a number of other references cited in this chapter, they fail to distinguish between what is needed for PCA and factor analysis. They also stipulate that PCAs require normally distributed random variables, which is untrue for most purposes. It seems difficult to instil the message that PCA and factor analysis require different rules. Turner (1998) reports a large study of the properties of parallel analysis, using simulation, and notes early on that ‘there are important differences between principal component analysis and factor analysis.’ He then proceeds to ignore the differences, stating that the ‘term factors will be used throughout [the] article [to refer] to either factors or components.’

Ferré (1995b) presents a comparative study which is extensive in its coverage of selection rules, but very limited in the range of data for which the techniques are compared.
A total of 18 rules are included in the study, as follows:

  • From Section 6.1.1 the cumulative percentage of variance with four cut-offs: 60%, 70%, 80%, 90%.
  • From Section 6.1.2 Kaiser’s rule with cut-off 1, together with modifications whose cut-offs are 0.7 and 2; the broken stick rule.
  • From Section 6.1.3 the scree graph.
  • From Section 6.1.4 Bartlett’s test and an approximation due to Mardia.
  • From Section 6.1.5 four versions of Eastment and Krzanowski’s cross-validation methods, where two cut-offs, 1 and 0.9, are used and, for each threshold, the stopping rule can be based on either the first or last occasion that the criterion dips below the threshold; Ferré’s $\hat{f}_q$; Besse and de Falguerolles’s approximate jackknife criterion.
  • From Section 6.1.6 Velicer’s test.

The simulations are based on the fixed effects model described in Section 6.1.5. The sample size is 20, the number of variables is 10, and each simulated data matrix is the sum of a fixed (20 × 10) matrix Z of rank 8 and a matrix of independent Gaussian noise with two levels of the noise variance. This is a fixed effects model with q = 8, so that at first sight we might aim to choose m = 8. For the smaller value of noise, Ferré (1995b) considers this to be appropriate, but the higher noise level lies between the
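
The simulation design and the simplest of the listed rules can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions, not Ferré's actual procedure: the text does not give the entries of Z or the two noise variances, so the rank-8 matrix is built here as a product of random factors, `noise_sd` is a placeholder, and the PCA is assumed to be based on the correlation matrix. The function names are hypothetical.

```python
import numpy as np


def simulate_fixed_effects(n=20, p=10, q=8, noise_sd=0.1, seed=0):
    """Return an (n x p) data matrix: rank-q signal Z plus Gaussian noise.

    Assumption: Z is a product of random (n x q) and (q x p) factors;
    the source does not specify the matrix actually used.
    """
    rng = np.random.default_rng(seed)
    Z = rng.standard_normal((n, q)) @ rng.standard_normal((q, p))
    noise = noise_sd * rng.standard_normal((n, p))
    return Z + noise


def correlation_eigenvalues(X):
    """Eigenvalues of the sample correlation matrix, largest first."""
    R = np.corrcoef(X, rowvar=False)
    return np.linalg.eigvalsh(R)[::-1]


def cumulative_percentage_rule(eig, cutoff=0.8):
    """Smallest m whose leading eigenvalues reach the cut-off proportion."""
    cum = np.cumsum(eig) / np.sum(eig)
    return int(np.argmax(cum >= cutoff)) + 1


def kaiser_rule(eig, cutoff=1.0):
    """Number of correlation-matrix eigenvalues exceeding the cut-off."""
    return int(np.sum(eig > cutoff))


def broken_stick_rule(eig):
    """Number of leading components whose variance proportion exceeds
    the broken stick value b_k = (1/p) * sum_{j=k}^{p} 1/j."""
    p = len(eig)
    b = np.cumsum(1.0 / np.arange(p, 0, -1))[::-1] / p
    exceeds = eig / np.sum(eig) > b
    # argmin on booleans gives the index of the first False entry
    return p if exceeds.all() else int(np.argmin(exceeds))


if __name__ == "__main__":
    eig = correlation_eigenvalues(simulate_fixed_effects())
    for c in (0.6, 0.7, 0.8, 0.9):
        print("cumulative", c, "->", cumulative_percentage_rule(eig, c))
    for c in (0.7, 1.0, 2.0):
        print("Kaiser", c, "->", kaiser_rule(eig, c))
    print("broken stick ->", broken_stick_rule(eig))
```

Running the script prints the value of m chosen by each of these eight rules for one simulated data set; repeating it over many seeds and two values of `noise_sd` would mimic the comparison described above, though only for the rules that need nothing beyond the eigenvalues.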
