
Jolliffe, I. Principal Component Analysis (2nd ed., Springer, 2002)

6. Choosing a Subset of Principal Components or Variables

estimates depend on the reciprocal of the difference between $l_m$ and $l_{m+1}$, where, as before, $m$ is the number of PCs retained. The usual implementations of the rules of Sections 6.1.1 and 6.1.2 ignore the size of gaps between eigenvalues and hence do not take stability into account. However, when using Kaiser's rule or one of its modifications, or a rule based on cumulative variance, it is advisable to treat the threshold with flexibility, and to be prepared to move it if it does not correspond to a good-sized gap between eigenvalues (a small numerical sketch of such a gap-aware adjustment is given at the end of this section).

Besse and de Falguerolles (1993) also examine a real data set with $p = 16$ and $n = 60$. Kaiser's rule chooses $m = 5$, and the scree graph suggests either $m = 3$ or $m = 5$. The bootstrap and jackknife criteria behave similarly to each other. Ignoring the uninteresting minimum at $m = 1$, all four methods choose $m = 3$, although there are strong secondary minima at $m = 8$ and $m = 5$.

Another model-based rule is introduced by Bishop (1999) and, even though one of its merits is said to be that it avoids cross-validation, it seems appropriate to mention it here. Bishop (1999) proposes a Bayesian framework for Tipping and Bishop's (1999a) model, which was described in Section 3.9. Recall that under this model the covariance matrix underlying the data can be written as $\mathbf{B}\mathbf{B}' + \sigma^2 \mathbf{I}_p$, where $\mathbf{B}$ is a $(p \times q)$ matrix. The prior distribution of $\mathbf{B}$ in Bishop's (1999) framework allows $q$ to take its maximum possible value, $p - 1$, under the model. However, if the posterior distribution assigns small values to all elements of a column $\mathbf{b}_k$ of $\mathbf{B}$, then that dimension is removed. The mode of the posterior distribution can be found using the EM algorithm.

Jackson (1993) discusses two bootstrap versions of 'parallel analysis,' which was described in general terms in Section 6.1.3. The first, a modification of Kaiser's rule as defined in Section 6.1.2, uses bootstrap samples from a data set to construct confidence limits for the population eigenvalues (see Section 3.7.2). Only those components for which the corresponding 95% confidence interval lies entirely above 1 are retained. Unfortunately, although this criterion is reasonable as a means of deciding the number of factors in a factor analysis (see Chapter 7), it is inappropriate in PCA. This is because it will not retain PCs dominated by a single variable whose correlations with all the other variables are close to zero. Such variables are generally omitted from a factor model, but they provide information not available from other variables and so should be retained if most of the information in $\mathbf{X}$ is to be kept. Jolliffe's (1972) suggestion of reducing Kaiser's threshold from 1 to around 0.7 reflects the fact that we are dealing with PCA and not factor analysis. A bootstrap rule designed with PCA in mind would retain all those components for which the 95% confidence interval for the corresponding eigenvalue does not lie entirely below 1 (the second sketch below illustrates both variants).

A second bootstrap approach suggested by Jackson (1993) finds 95% confidence intervals for both eigenvalues and eigenvector coefficients.
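The advice to treat a retention threshold as movable, preferring a cut that coincides with a clear gap in the eigenvalue sequence, is easy to mechanize. The following is a minimal sketch, not taken from this book: it assumes eigenvalues of a correlation matrix, starts from a Kaiser-style threshold (here Jolliffe's 0.7), and shifts the cut towards the position with the widest gap $l_m - l_{m+1}$, since the stability of the retained subspace degrades as the reciprocal of that gap. The function name and the `window` parameter are illustrative choices, not standard terminology.

```python
import numpy as np

def gap_adjusted_kaiser(eigvals, threshold=0.7, window=2):
    """Choose the number of PCs to retain (illustrative, not from the book).

    Start from a Kaiser-style rule: retain eigenvalues above `threshold`
    (Jolliffe (1972) suggests ~0.7 for correlation matrices). Then, within
    `window` positions of that cut, move to the position with the largest
    gap l_m - l_{m+1}, since estimates of the retained subspace become
    unstable as 1 / (l_m - l_{m+1}) grows.
    """
    l = np.sort(np.asarray(eigvals))[::-1]        # eigenvalues, descending
    m0 = int(np.sum(l > threshold))               # initial cut from the threshold rule
    # Candidate cuts near m0, keeping at least one PC and at most p - 1.
    cands = list(range(max(1, m0 - window), min(len(l) - 1, m0 + window) + 1))
    # Pick the candidate with the widest eigenvalue gap.
    gaps = [l[m - 1] - l[m] for m in cands]
    return cands[int(np.argmax(gaps))]

# Example: the threshold rule alone would give m = 4, but the clear gap
# after the third eigenvalue pulls the cut back to m = 3.
eigvals = [3.1, 2.4, 1.9, 0.75, 0.68, 0.55, 0.35, 0.27]
print(gap_adjusted_kaiser(eigvals))   # -> 3
```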

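Jackson's (1993) first bootstrap rule, and the PCA-oriented variant just suggested, can both be sketched briefly. This is an illustrative reconstruction rather than Jackson's procedure: it uses simple percentile intervals computed from resampled rows, where the intervals of Section 3.7.2 would be the more careful choice. Note that, by construction, the PCA-flavoured rule can never retain fewer components than the factor-analysis rule, since an interval lying entirely above 1 cannot also lie entirely below it.

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_eigen_ci(X, n_boot=1000, level=0.95):
    """Percentile bootstrap confidence intervals for the ordered
    eigenvalues of the sample correlation matrix (a simple stand-in
    for the intervals of Section 3.7.2)."""
    n, p = X.shape
    boot = np.empty((n_boot, p))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)          # resample rows with replacement
        R = np.corrcoef(X[idx], rowvar=False)     # correlation matrix of resample
        boot[b] = np.linalg.eigvalsh(R)[::-1]     # eigenvalues, descending
    alpha = 1.0 - level
    lo = np.quantile(boot, alpha / 2, axis=0)
    hi = np.quantile(boot, 1 - alpha / 2, axis=0)
    return lo, hi

X = rng.standard_normal((60, 8))                  # placeholder data, n = 60, p = 8
lo, hi = bootstrap_eigen_ci(X)

# Factor-analysis flavour: retain only components whose whole CI lies above 1.
m_fa = int(np.sum(lo > 1))
# PCA flavour: retain unless the whole CI lies below 1.
m_pca = int(np.sum(hi >= 1))
print(m_fa, m_pca)
```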