Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)

10.3. Sensitivity and Stability

... repeatability index as the proportion of times in repeated samples that this angle has a cosine whose value is greater than 0.95. They conduct a simulation study to examine the dependence of this index on sample size, on the ratio of consecutive population eigenvalues, and on whether or not the data are normally distributed. To generate non-normal samples, Dudziński et al. (1975) use the bootstrap idea of sampling with replacement from a data set that is clearly non-normal. This usage predates the first appearance of the term ‘bootstrap.’ Their simulation study is relatively small, but it demonstrates that repeatability is often greater for normal than for non-normal data with the same covariance structure, although the differences are usually small for the cases studied and become very small as the repeatability increases with sample size. Repeatability, as with other forms of stability, decreases as consecutive eigenvalues become closer.

Dudziński et al. (1975) implemented a bootstrap-like method for assessing stability. Daudin et al. (1988) use a fully-fledged bootstrap and, in fact, note that more than one bootstrapping procedure may be relevant in resampling to examine the properties of PCA, depending in part on whether a correlation-based or covariance-based analysis is done. They consider a number of measures of stability for both eigenvalues and eigenvectors, but the stability of subspaces spanned by subsets of PCs is deemed to be of particular importance. This idea is used by Besse (1992) and Besse and de Falguerolles (1993) to choose how many PCs to retain (see Section 6.1.5). Stability indices based on the jackknife are also used by the latter authors, and Daudin et al. (1989) discuss one such index in detail for both correlation- and covariance-based PCA. Besse and de Falguerolles (1993) describe the equally weighted version of the criterion (6.1.6) as a natural criterion for stability, but prefer $L_q = \frac{1}{2}\lVert P_q - \hat{P}_q \rVert^2$ in equation (6.1.9).
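As a rough illustration of the two quantities just described, the sketch below resamples a data matrix with replacement and records (i) the proportion of resamples in which the first PC has a cosine above 0.95 with a reference vector, and (ii) the average of $L_q$ over resamples. Several things here are assumptions rather than details taken from the text: the angle is taken between the first PC of the full sample and the first PC of each resample (the definition of the angle is cut off above), $P_q$ and $\hat{P}_q$ are treated as orthogonal projectors onto the spaces spanned by the first $q$ PC directions, the norm is taken to be the Frobenius norm, and the full-sample PCs stand in for the population ones. The function names `leading_pcs`, `subspace_loss` and `bootstrap_stability` are hypothetical.

```python
import numpy as np


def leading_pcs(X, q):
    """Return a (p, q) matrix whose columns are the first q PC directions
    of the covariance matrix of X (rows = observations)."""
    S = np.cov(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(S)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # reorder to descending
    return eigvecs[:, order[:q]]


def subspace_loss(A, A_hat):
    """L_q = 0.5 * ||P_q - P_hat_q||_F^2, where P = A A^T and P_hat = A_hat A_hat^T
    are projectors built from orthonormal bases A and A_hat (assumed definition)."""
    P = A @ A.T
    P_hat = A_hat @ A_hat.T
    return 0.5 * np.linalg.norm(P - P_hat, ord="fro") ** 2


def bootstrap_stability(X, q=1, n_boot=200, cos_threshold=0.95, seed=0):
    """Return (repeatability-style index for PC 1, bootstrap mean of L_q).
    The full-sample PCs are used as the reference (an assumption)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    A = leading_pcs(X, q)
    hits = 0
    losses = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)      # sample rows with replacement
        A_b = leading_pcs(X[idx], q)
        # PC signs are arbitrary, so compare the absolute cosine.
        cos = abs(float(A[:, 0] @ A_b[:, 0]))
        hits += int(cos > cos_threshold)
        losses.append(subspace_loss(A, A_b))
    return hits / n_boot, float(np.mean(losses))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic data with well-separated variances along the coordinate axes.
    X = rng.normal(size=(200, 3)) * np.array([5.0, 1.0, 0.5])
    index, mean_loss = bootstrap_stability(X, q=1)
    print("repeatability-style index:", index)
    print("bootstrap mean of L_q:", mean_loss)
```

With this definition $L_q$ is 0 when the two subspaces coincide and equals $q$ when they are orthogonal, so smaller values indicate a more stable subspace.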
Either (6.1.6) or the expectation of $L_q$ can be used as a measure of the stability of the first $q$ PCs or the subspace spanned by them, and $q$ is then chosen to optimize stability. Besse and de Falguerolles (1993) discuss a variety of ways of estimating their stability criterion, including some based on the bootstrap and jackknife. The bootstrap has also been used to estimate standard errors of elements of the eigenvectors $a_k$ (see Section 3.7.2), and these standard errors can be viewed as further measures of stability of PCs, specifically the stability of their coefficients.

Stauffer et al. (1985) conduct a study that has some similarities to that of Dudziński et al. (1975). They take bootstrap samples from three ecological data sets and use them to construct standard errors for the eigenvalues of the correlation matrix. The stability of the eigenvalues for each data set is investigated when the full data set is replaced by subsets of variables or subsets of observations. Each eigenvalue is examined as a percentage of variation remaining after removing earlier PCs, as well as in absolute terms.
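The kind of calculation attributed to Stauffer et al. (1985) can be sketched as follows. This is a minimal illustration, not a reconstruction of their procedure. The function names are hypothetical, the standard error is taken to be the standard deviation of each ordered eigenvalue across bootstrap resamples, and "percentage of variation remaining" is interpreted as $100\, l_k / \sum_{j \ge k} l_j$, which is an assumption about the intended definition.

```python
import numpy as np


def corr_eigenvalues(X):
    """Eigenvalues of the correlation matrix of X, in descending order."""
    R = np.corrcoef(X, rowvar=False)
    return np.sort(np.linalg.eigvalsh(R))[::-1]


def pct_of_remaining(eigvals):
    """Each eigenvalue as a percentage of the variation left once earlier
    PCs are removed: 100 * l_k / sum_{j >= k} l_j (assumed interpretation)."""
    eigvals = np.asarray(eigvals, dtype=float)
    remaining = np.cumsum(eigvals[::-1])[::-1]   # remaining[k] = sum of l_j for j >= k
    return 100.0 * eigvals / remaining


def bootstrap_eigenvalue_se(X, n_boot=500, seed=0):
    """Bootstrap standard errors of the ordered correlation-matrix eigenvalues."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    boot = np.empty((n_boot, p))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)         # sample rows with replacement
        boot[b] = corr_eigenvalues(X[idx])
    return boot.std(axis=0, ddof=1)


if __name__ == "__main__":
    rng = np.random.default_rng(2)
    # Synthetic data: two correlated variables plus one independent variable.
    z = rng.normal(size=(150, 1))
    X = np.hstack([z + 0.5 * rng.normal(size=(150, 1)),
                   z + 0.5 * rng.normal(size=(150, 1)),
                   rng.normal(size=(150, 1))])
    print("eigenvalues:", corr_eigenvalues(X))
    print("percent of remaining variation:", pct_of_remaining(corr_eigenvalues(X)))
    print("bootstrap standard errors:", bootstrap_eigenvalue_se(X))
```

Repeating the same calculation on a subset of columns or rows of `X` gives a simple way to mimic the comparison between the full data set and subsets of variables or observations.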
