Jolliffe I. Principal Component Analysis (2nd ed., Springer, 2002)

6. Choosing a Subset of Principal Components or Variables

the cut-off point. In other words, in order to retain q PCs the last (p − q) eigenvalues should have a linear trend. Bentler and Yuan (1996, 1998) develop procedures for testing, in the case of covariance and correlation matrices respectively, the null hypothesis

$$H^*_q : \lambda_{q+k} = \alpha + \beta x_k, \qquad k = 1, 2, \ldots, (p - q),$$

where $\alpha, \beta$ are non-negative constants and $x_k = (p - q) - k$. (A toy numerical illustration of the structure of this hypothesis appears at the end of this section.)

For covariance matrices a maximum likelihood ratio test (MLRT) can be used straightforwardly, with the null distribution of the test statistic approximated by a $\chi^2$ distribution. In the correlation case Bentler and Yuan (1998) use simulations to compare the MLRT, treating the correlation matrix as a covariance matrix, with a minimum $\chi^2$ test. They show that the MLRT has a seriously inflated Type I error, even for very large sample sizes. The properties of the minimum $\chi^2$ test are not ideal, but the test gives plausible results in the examples examined by Bentler and Yuan. They conclude that it is reliable for sample sizes of 100 or larger. The discussion section of Bentler and Yuan (1998) speculates on improvements for smaller sample sizes, on potential problems caused by possible different orderings of eigenvalues in populations and samples, and on the possibility of testing hypotheses for specific non-linear relationships among the last (p − q) eigenvalues.

Ali et al. (1985) propose a method for choosing m based on testing hypotheses for correlations between the variables and the components. Recall from Section 2.3 that for a correlation matrix PCA and the normalization $\tilde{\alpha}_k' \tilde{\alpha}_k = \lambda_k$, the coefficients $\tilde{\alpha}_{kj}$ are precisely these correlations. Similarly, the sample coefficients $\tilde{a}_{kj}$ are correlations between the kth PC and the jth variable in the sample. The normalization constraint means that the coefficients will decrease on average as k increases. Ali et al. (1985) suggest defining m as one fewer than the index of the first PC for which none of these correlation coefficients is significantly different from zero at the 5% significance level. (A code sketch of this rule is also given at the end of this section.) However, there is one immediate difficulty with this suggestion. For a fixed level of significance, the critical values for correlation coefficients decrease in absolute value as the sample size n increases. Hence for a given sample correlation matrix, the number of PCs retained depends on n. More components will be kept as n increases.

6.1.5 Choice of m Using Cross-Validatory or Computationally Intensive Methods

The rule described in Section 6.1.1 is equivalent to looking at how well the data matrix X is fitted by the rank-m approximation based on the SVD. The idea behind the first two methods discussed in the present section is similar, except that each element $x_{ij}$ of X is now predicted from an equation like the SVD, but based on a submatrix of X that does not include $x_{ij}$. In both methods, suggested by Wold (1978) and Eastment and Krzanowski
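To make the form of $H^*_q$ concrete, the following toy sketch fits the line $\alpha + \beta x_k$ to the last (p − q) sample eigenvalues by ordinary least squares. It only illustrates the structure of the null hypothesis; it is not Bentler and Yuan's MLRT or their minimum $\chi^2$ test, and for simplicity it ignores the constraint that $\alpha, \beta$ be non-negative. The function name and the simulated data are invented for illustration.

```python
# Illustrative only: least-squares fit of the linear trend asserted by H*_q
# to the trailing eigenvalues. Not Bentler and Yuan's test; the constraint
# alpha, beta >= 0 is ignored here for simplicity.
import numpy as np

def linear_trend_fit(eigenvalues, q):
    """Fit lambda_{q+k} ~ alpha + beta * x_k, with x_k = (p - q) - k,
    to the last (p - q) eigenvalues; return (alpha, beta, residual SS)."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]  # descending order
    p = lam.size
    k = np.arange(1, p - q + 1)             # k = 1, ..., p - q
    x = (p - q) - k                         # x_k = (p - q) - k
    y = lam[q:]                             # lambda_{q+1}, ..., lambda_p
    A = np.column_stack([np.ones_like(x, dtype=float), x])
    (alpha, beta), rss, *_ = np.linalg.lstsq(A, y, rcond=None)
    return alpha, beta, (float(rss[0]) if rss.size else 0.0)

# Eigenvalues of a sample correlation matrix from simulated data
rng = np.random.default_rng(0)
data = rng.standard_normal((100, 6))
eigvals = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))
print(linear_trend_fit(eigvals, q=2))       # small residual SS is consistent with H*_2
```

A small residual sum of squares merely indicates that a linear trend is plausible for the trailing eigenvalues; the formal tests of Bentler and Yuan are built on likelihoods and have approximate $\chi^2$ null distributions.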
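The rule of Ali et al. (1985) can likewise be sketched in code. The sketch below assumes a correlation-matrix PCA, computes loadings normalized so that $\tilde{a}_{kj}$ is the correlation between the kth PC and the jth variable, and uses the standard t-based critical value for a sample correlation coefficient; the text does not specify which significance test Ali et al. use, so that choice, like the function name, is an assumption.

```python
# A sketch of the Ali et al. (1985) rule: m is one fewer than the index of
# the first PC whose correlations with all p variables are non-significant.
import numpy as np
from scipy import stats

def ali_et_al_m(X, alpha=0.05):
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]            # descending eigenvalues
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs * np.sqrt(eigvals)        # a~_kj: PC-variable correlations

    # Two-sided t-based critical value for a sample correlation coefficient:
    # |r| > t_crit / sqrt(t_crit^2 + n - 2) is significant at level alpha.
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
    r_crit = t_crit / np.sqrt(t_crit**2 + n - 2)

    for k in range(p):                           # k is the 0-based PC index
        if np.all(np.abs(loadings[:, k]) <= r_crit):
            return k                             # (k+1) - 1 retained PCs
    return p                                     # every PC has a significant loading

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 8))
X[:, 1] += X[:, 0]                               # induce some correlation
print(ali_et_al_m(X))
```

Note how r_crit shrinks as n grows, which is exactly the difficulty raised above: with more observations, more PCs clear the significance hurdle, so m depends on n.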
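Finally, the element-wise cross-validation idea just described can be sketched as follows. This is a minimal stand-in, reproducing neither Wold's (1978) NIPALS-based scheme nor Eastment and Krzanowski's procedure: each entry $x_{ij}$ is held out and imputed by iterating a rank-m SVD fit, and the squared prediction errors are accumulated into a PRESS(m) statistic. The function name and simulated data are invented.

```python
# Hedged sketch: predict each x_ij from a rank-m fit computed without x_ij,
# imputing the held-out entry by iterating a rank-m SVD (an EM-style stand-in).
import numpy as np

def press(X, m, n_iter=50):
    """PRESS(m): sum of squared errors predicting each x_ij with x_ij held out."""
    X = np.asarray(X, dtype=float)
    total = 0.0
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            Z = X.copy()
            Z[i, j] = np.delete(X[:, j], i).mean()   # start value; does not use x_ij
            for _ in range(n_iter):                  # iterate rank-m SVD imputation
                U, s, Vt = np.linalg.svd(Z, full_matrices=False)
                approx = (U[:, :m] * s[:m]) @ Vt[:m]
                Z[i, j] = approx[i, j]
            total += (X[i, j] - Z[i, j]) ** 2
    return total

rng = np.random.default_rng(2)
scores = rng.standard_normal((30, 2))
X = scores @ rng.standard_normal((2, 5)) + 0.1 * rng.standard_normal((30, 5))
print([round(press(X, m), 3) for m in (1, 2, 3)])    # PRESS should flatten near m = 2
```

In this two-component example, PRESS(m) typically drops sharply from m = 1 to m = 2 and then levels off, suggesting m = 2; the published methods differ in how the submatrices excluding each element are chosen and fitted.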
