Jolliffe, I. (2002). Principal Component Analysis, 2nd edition. Springer.

13. Principal Component Analysis for Special Types of Data

Cohn (1999) considers four test statistics for deciding the equivalence or otherwise of subspaces defined by sets of q PCs derived from each of two covariance matrices corresponding to two groups of observations. One of the statistics is the likelihood ratio test used by Flury (1988), and two others are functions of the eigenvalues, or corresponding cosines, derived by Krzanowski (1979b). The fourth statistic is based on a sequence of two-dimensional rotations from one subspace towards the other, but simulations show it to be less reliable than the other three. There are a number of novel aspects to Cohn's (1999) study. The first is that the observations within the two groups are not independent; in his motivating example the data are serially correlated time series. To derive critical values for the test statistics, a bootstrap procedure is used, with resampling in blocks because of the serial correlation. The test statistics are compared in a simulation study and on the motivating example.

Keramidas et al. (1987) suggest a graphical procedure for comparing eigenvectors of several covariance matrices $S_1, S_2, \ldots, S_G$. Much of the paper is concerned with the comparison of a single eigenvector from each matrix, either with a common predetermined vector or with a 'typical' vector that maximizes the sum of squared cosines between itself and the $G$ eigenvectors to be compared. If $a_{gk}$ is the $k$th eigenvector for the $g$th sample covariance matrix, $g = 1, 2, \ldots, G$, and $a_{0k}$ is the predetermined or typical vector, then the distances

$$ {}_k\delta_g^2 = \min\left[ (a_{gk} - a_{0k})'(a_{gk} - a_{0k}),\; (a_{gk} + a_{0k})'(a_{gk} + a_{0k}) \right] $$

are calculated. If the sample covariance matrices are drawn from the same population, then ${}_k\delta_g^2$ has an approximate gamma distribution, so Keramidas et al. (1987) suggest constructing gamma Q-Q plots to detect differences from this null situation. Simulations are given for both the null and non-null cases. Such plots are likely to be more useful when $G$ is large than when there is only a handful of covariance matrices to be compared.

Keramidas et al. (1987) extend their idea to compare subspaces spanned by two or more eigenvectors. For two subspaces, their overall measure of similarity, which reduces to ${}_k\delta_g^2$ when single eigenvectors are compared, is the sum of the square roots $\nu_k^{1/2}$ of the eigenvalues of $A'_{1q}A_{2q}A'_{2q}A_{1q}$ (see the numerical sketch below). Recall that Krzanowski (1979b) uses the sum of these eigenvalues, not their square roots, as his measure of overall similarity. Keramidas et al. (1987) stress that individual eigenvectors or subspaces should only be compared when their eigenvalues are well separated from adjacent eigenvalues, so that the eigenvectors or subspaces are well defined.

Ten Berge and Kiers (1996) take a different and more complex view of common principal components than Flury (1988) or Krzanowski (1979b). They refer to generalizations of PCA to G (≥ 2) groups of individuals, with different generalizations being appropriate depending on what is taken as the defining property of PCA. They give three different defining criteria and
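The two similarity measures above are straightforward to compute. What follows is a minimal numpy sketch, not taken from the book: the function names, the `use_sqrt` switch, and the simulated data are illustrative assumptions. It implements the sign-invariant distance ${}_k\delta_g^2$ for single eigenvectors and the subspace similarity based on the eigenvalues $\nu_k$ of $A'_{1q}A_{2q}A'_{2q}A_{1q}$, in both the square-root form of Keramidas et al. (1987) and the plain-sum form of Krzanowski (1979b).

```python
import numpy as np

def eigenvector_distance(a_g, a_0):
    """Sign-invariant squared distance between two eigenvectors.

    An eigenvector is defined only up to sign, so take the smaller of
    the distances to a_0 and to -a_0 (the k-delta-g-squared of
    Keramidas et al., 1987)."""
    d_minus = float(np.sum((a_g - a_0) ** 2))
    d_plus = float(np.sum((a_g + a_0) ** 2))
    return min(d_minus, d_plus)

def subspace_similarity(A_1q, A_2q, use_sqrt=True):
    """Similarity between the subspaces spanned by the columns of A_1q
    and A_2q (each p x q with orthonormal columns).

    The eigenvalues nu_k of A_1q' A_2q A_2q' A_1q are the squared
    cosines of the principal angles between the two subspaces.
    use_sqrt=True sums their square roots (Keramidas et al., 1987);
    use_sqrt=False sums the eigenvalues themselves (Krzanowski, 1979b)."""
    M = A_1q.T @ A_2q @ A_2q.T @ A_1q
    nu = np.clip(np.linalg.eigvalsh(M), 0.0, None)  # guard tiny round-off
    return float(np.sum(np.sqrt(nu))) if use_sqrt else float(np.sum(nu))

# Illustrative use on two simulated groups: compare the q = 2 leading
# principal component subspaces of the two sample covariance matrices.
rng = np.random.default_rng(0)
X1 = rng.standard_normal((200, 5))
X2 = rng.standard_normal((200, 5))
q = 2
A1 = np.linalg.eigh(np.cov(X1, rowvar=False))[1][:, -q:]  # leading q eigenvectors
A2 = np.linalg.eigh(np.cov(X2, rowvar=False))[1][:, -q:]
print(subspace_similarity(A1, A2))                  # Keramidas et al. measure
print(subspace_similarity(A1, A2, use_sqrt=False))  # Krzanowski's measure
```

Because the $\nu_k$ are squared cosines of the principal angles between the subspaces, both measures lie between 0 and $q$, attaining $q$ exactly when the two subspaces coincide.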
