12.07.2015 Views

Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)

6.1. How Many Principal Components?

As well as these intuitive justifications, Kaiser (1960) put forward a number of other reasons for a cut-off at $l_k = 1$. It must be noted, however, that most of the reasons are pertinent to factor analysis (see Chapter 7), rather than PCA, although Kaiser refers to PCs in discussing one of them.

It can be argued that a cut-off at $l_k = 1$ retains too few variables. Consider a variable which, in the population, is more-or-less independent of all other variables. In a sample, such a variable will have small coefficients in $(p - 1)$ of the PCs but will dominate one of the PCs, whose variance $l_k$ will be close to 1 when using the correlation matrix. As the variable provides independent information from the other variables it would be unwise to delete it. However, deletion will occur if Kaiser's rule is used, and if, due to sampling variation, $l_k < 1$. It is therefore advisable to choose a cut-off $l^*$ lower than 1, to allow for sampling variation. Jolliffe (1972) suggested, based on simulation studies, that $l^* = 0.7$ is roughly the correct level. Further discussion of this cut-off level will be given with respect to examples in Sections 6.2 and 6.4.

The rule just described is specifically designed for correlation matrices, but it can be easily adapted for covariance matrices by taking as a cut-off $l^*$ the average value $\bar{l}$ of the eigenvalues or, better, a somewhat lower cut-off such as $l^* = 0.7\bar{l}$. For covariance matrices with widely differing variances, however, this rule and the one based on $t_k$ from Section 6.1.1 retain very few (arguably, too few) PCs, as will be seen in the examples of Section 6.2.

An alternative way of looking at the sizes of individual variances is to use the so-called broken stick model.
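The eigenvalue cut-off rule described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the book: the function name `retain_by_eigenvalue_cutoff`, its `factor` parameter, and the example eigenvalues are all assumptions made for the sketch. It relies on the fact that the eigenvalues of a correlation matrix average to 1, so a cut-off of `factor` times the mean eigenvalue covers both the correlation-matrix case ($l^* = 1$ or $l^* = 0.7$) and the covariance-matrix case ($l^* = \bar{l}$ or $l^* = 0.7\bar{l}$).

```python
import numpy as np


def retain_by_eigenvalue_cutoff(eigenvalues, factor=0.7):
    """Return (indices of retained PCs, cut-off used).

    Hypothetical helper: a PC is retained when its variance l_k exceeds
    factor * (mean eigenvalue). For a correlation matrix the mean
    eigenvalue is 1, so factor=1.0 is Kaiser's rule and factor=0.7 is
    the lower cut-off suggested by Jolliffe (1972).
    """
    l = np.asarray(eigenvalues, dtype=float)
    cutoff = factor * l.mean()
    return np.flatnonzero(l > cutoff), cutoff


# Made-up eigenvalues of a 5-variable correlation matrix (they sum to p = 5).
example_eigenvalues = [2.1, 1.2, 0.9, 0.5, 0.3]

kaiser_kept, kaiser_cutoff = retain_by_eigenvalue_cutoff(example_eigenvalues, factor=1.0)
lower_kept, lower_cutoff = retain_by_eigenvalue_cutoff(example_eigenvalues, factor=0.7)

print("Kaiser's rule keeps PCs:", (kaiser_kept + 1).tolist())
print("Cut-off 0.7 keeps PCs:  ", (lower_kept + 1).tolist())
```

In this made-up example the third PC, with variance 0.9, is dropped by Kaiser's rule but kept by the lower cut-off, which is the situation the text warns about. In practice the eigenvalues could be obtained with, for instance, `np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))` for a data matrix `X` whose columns are the variables.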
If we have a stick of unit length, broken at random into $p$ segments, then it can be shown that the expected length of the $k$th longest segment is

$$l_k^* = \frac{1}{p} \sum_{j=k}^{p} \frac{1}{j}.$$

One way of deciding whether the proportion of variance accounted for by the $k$th PC is large enough for that component to be retained is to compare the proportion with $l_k^*$. Principal components for which the proportion exceeds $l_k^*$ are then retained, and all other PCs deleted. Tables of $l_k^*$ are available for various values of $p$ and $k$ (see, for example, Legendre and Legendre (1983, p. 406)).

6.1.3 The Scree Graph and the Log-Eigenvalue Diagram

The first two rules described above usually involve a degree of subjectivity in the choice of cut-off levels, $t^*$ and $l^*$ respectively. The scree graph, which was discussed and named by Cattell (1966) but which was already in common use, is even more subjective in its usual form, as it involves looking at a plot of $l_k$ against $k$ (see Figure 6.1, which is discussed in detail in Section 6.2) and deciding at which value of $k$ the slopes of lines joining
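As a supplement to the broken stick comparison described before Section 6.1.3, the following Python sketch computes the expected segment lengths directly instead of reading them from tables. It is an illustration under stated assumptions: the function names `broken_stick` and `retain_by_broken_stick` and the example eigenvalues are hypothetical, and the eigenvalues are assumed to be sorted in decreasing order so that the $k$th proportion is matched with the $k$th longest segment.

```python
import numpy as np


def broken_stick(p):
    """Expected lengths l*_k = (1/p) * sum_{j=k}^{p} 1/j for k = 1, ..., p."""
    inverse = 1.0 / np.arange(1, p + 1)
    # A reversed cumulative sum gives sum_{j=k}^{p} 1/j for every k at once.
    return inverse[::-1].cumsum()[::-1] / p


def retain_by_broken_stick(eigenvalues):
    """Return indices of PCs whose proportion of variance exceeds l*_k.

    Hypothetical helper; assumes eigenvalues are sorted in decreasing order.
    """
    l = np.asarray(eigenvalues, dtype=float)
    proportions = l / l.sum()
    return np.flatnonzero(proportions > broken_stick(l.size))


# Made-up eigenvalues for p = 4 variables.
example_eigenvalues = [2.4, 0.9, 0.5, 0.2]
stick = broken_stick(4)
kept = retain_by_broken_stick(example_eigenvalues)

print("Broken stick lengths:", np.round(stick, 4).tolist())
print("Retained PCs:", (kept + 1).tolist())
```

For $p = 4$ the expected lengths are $25/48$, $13/48$, $7/48$ and $3/48$, which sum to 1 as the lengths of the pieces of a unit stick must. In the made-up example only the first PC, with proportion 0.6, exceeds its broken stick value.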
