12.07.2015 Views

Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)


3. Properties of Sample Principal Components (p. 52)

where

$$T_k = \frac{\lambda_k}{(n-1)} \sum_{\substack{l=1 \\ l \neq k}}^{p} \frac{\lambda_l}{(\lambda_l - \lambda_k)^2}\, \alpha_l \alpha_l'.$$

The matrix $T_k$ has rank $(p-1)$ as it has a single zero eigenvalue corresponding to the eigenvector $\alpha_k$. This causes further complications, but it can be shown (Mardia et al., 1979, p. 233) that, approximately,

$$(n-1)(a_k - \alpha_k)'(l_k S^{-1} + l_k^{-1} S - 2I_p)(a_k - \alpha_k) \sim \chi^2_{(p-1)}. \tag{3.7.4}$$

Because $a_k$ is an eigenvector of $S$ with eigenvalue $l_k$, it follows that

$$l_k^{-1} S a_k = l_k^{-1} l_k a_k = a_k, \qquad l_k S^{-1} a_k = l_k l_k^{-1} a_k = a_k,$$

and

$$(l_k S^{-1} + l_k^{-1} S - 2I_p)\, a_k = a_k + a_k - 2a_k = 0,$$

so that the result (3.7.4) reduces to

$$(n-1)\, \alpha_k'(l_k S^{-1} + l_k^{-1} S - 2I_p)\, \alpha_k \sim \chi^2_{(p-1)}. \tag{3.7.5}$$

From (3.7.5) an approximate confidence region for $\alpha_k$, with confidence coefficient $(1-\alpha)$, has the form

$$(n-1)\, \alpha_k'(l_k S^{-1} + l_k^{-1} S - 2I_p)\, \alpha_k \leq \chi^2_{(p-1);\alpha}$$

with fairly obvious notation.

Moving away from assumptions of multivariate normality, the nonparametric bootstrap of Efron and Tibshirani (1993), noted in Section 3.6, can be used to find confidence intervals for various parameters. In their Section 7.2, Efron and Tibshirani (1993) use bootstrap samples to estimate standard errors of estimates for $\alpha_{kj}$, and for the proportion of total variance accounted for by an individual PC. Assuming approximate normality and unbiasedness of the estimates, the standard errors can then be used to find confidence intervals for the parameters of interest. Alternatively, the ideas of Chapter 13 of Efron and Tibshirani (1993) can be used to construct an interval for $\lambda_k$ with confidence coefficient $(1-\alpha)$, for example, consisting of a proportion $(1-\alpha)$ of the values of $l_k$ arising from the replicated bootstrap samples. Intervals for elements of $\alpha_k$ can be found in a similar manner. Milan and Whittaker (1995) describe a related but different idea, the parametric bootstrap. Here, residuals from a model based on the SVD, rather than the observations themselves, are bootstrapped.
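As an illustration of the nonparametric bootstrap ideas described above, the following is a minimal sketch, not the procedure of any of the cited authors. It resamples observations with replacement, recomputes the eigenvalues and eigenvectors of the sample covariance matrix, and reports bootstrap standard errors for the elements of $a_k$ and for the proportion of total variance, together with a percentile interval for $\lambda_k$. The function names `pca_eig` and `bootstrap_pca`, the default of 1000 replicates, the choice of the central $(1-\alpha)$ proportion of bootstrap values, and the sign alignment of each bootstrap eigenvector with the original $a_k$ are all assumptions made here for illustration.

```python
import numpy as np


def pca_eig(X):
    """Eigenvalues (descending) and eigenvectors (as columns) of the
    sample covariance matrix S of the rows of X."""
    S = np.cov(X, rowvar=False)        # divisor (n - 1)
    vals, vecs = np.linalg.eigh(S)     # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1]     # reorder so the largest comes first
    return vals[order], vecs[:, order]


def bootstrap_pca(X, k=0, n_boot=1000, alpha=0.05, seed=0):
    """Nonparametric bootstrap for the k-th PC (k is 0-based here).

    Hypothetical helper: returns the sample eigenvalue l_k, a percentile
    interval for lambda_k, and bootstrap standard errors for the elements
    of a_k and for the proportion of total variance.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    rng = np.random.default_rng(seed)

    l_hat, A_hat = pca_eig(X)
    a_hat = A_hat[:, k]

    l_boot = np.empty(n_boot)
    prop_boot = np.empty(n_boot)
    a_boot = np.empty((n_boot, p))

    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample rows with replacement
        l_b, A_b = pca_eig(X[idx])
        a_b = A_b[:, k]
        # Assumption: eigenvectors are defined only up to sign, so each
        # replicate is flipped to point the same way as the original a_k.
        if a_b @ a_hat < 0:
            a_b = -a_b
        l_boot[b] = l_b[k]
        prop_boot[b] = l_b[k] / l_b.sum()
        a_boot[b] = a_b

    # Assumption: take the central (1 - alpha) proportion of the l_k values.
    lo, hi = np.quantile(l_boot, [alpha / 2, 1 - alpha / 2])

    return {
        "l_k": l_hat[k],
        "interval": (lo, hi),
        "se_a_k": a_boot.std(axis=0, ddof=1),
        "se_prop": prop_boot.std(ddof=1),
    }
```

Calling `bootstrap_pca(X, k=0)` on an $(n \times p)$ data array returns a dictionary whose `"interval"` entry is the percentile interval for the largest eigenvalue. When eigenvalues are close together, the ordering of PCs can switch between bootstrap samples, so matching replicates by rank alone, as this sketch does, may be unreliable in that case.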
An example of bivariate confidence intervals for $(\alpha_{1j}, \alpha_{2j})$ is given by Milan and Whittaker.

Some theory underlying non-parametric bootstrap confidence intervals for eigenvalues and eigenvectors of covariance matrices is given by Beran and Srivastava (1985), while Romanazzi (1993) discusses estimation and confidence intervals for eigenvalues of both covariance and correlation matrices using another computationally intensive distribution-free procedure, the jackknife. Romanazzi (1993) shows that standard errors of eigenvalue estimators based on the jackknife can have substantial bias and are sensitive to outlying observations. Bootstrapping and the jackknife have also
