Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)


cda.psych.uiuc.edu

3.7. Inference Based on Sample Principal Components 51

3.7.2 Interval Estimation

The asymptotic marginal distributions of $l_k$ and $a_{kj}$ given in the previous section can be used to construct approximate confidence intervals for $\lambda_k$ and $\alpha_{kj}$, respectively. For $l_k$, the marginal distribution is, from (3.6.1) and (3.6.2), approximately

$$l_k \sim N\left(\lambda_k, \frac{2\lambda_k^2}{n-1}\right) \qquad (3.7.1)$$

so

$$\frac{l_k - \lambda_k}{\lambda_k [2/(n-1)]^{1/2}} \sim N(0, 1),$$

which leads to a confidence interval, with confidence coefficient $(1 - \alpha)$ for $\lambda_k$, of the form

$$\frac{l_k}{[1 + \tau z_{\alpha/2}]}$$
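The normal approximation above can be turned into numbers with a few lines of code. The following is a minimal sketch, assuming the interval is obtained by solving the standardized pivot for $\lambda_k$, which gives $l_k/(1 + \tau z_{\alpha/2}) < \lambda_k < l_k/(1 - \tau z_{\alpha/2})$ with $\tau = [2/(n-1)]^{1/2}$ and $z_{\alpha/2}$ the upper $\alpha/2$ point of $N(0, 1)$. The function name `eigenvalue_interval` and the example values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def eigenvalue_interval(l_k, n, alpha=0.05):
    """Approximate (1 - alpha) interval for lambda_k.

    Assumption: the interval comes from inverting the pivot
    (l_k - lambda_k) / (lambda_k * tau) ~ N(0, 1), tau = sqrt(2 / (n - 1)).
    """
    tau = sqrt(2.0 / (n - 1))
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # upper alpha/2 point of N(0, 1)
    if tau * z >= 1.0:
        # The upper limit l_k / (1 - tau * z) is only finite when tau * z < 1.
        raise ValueError("n is too small for a finite upper limit")
    return l_k / (1.0 + tau * z), l_k / (1.0 - tau * z)


# Hypothetical example: sample eigenvalue 4.0 from n = 201 observations.
lower, upper = eigenvalue_interval(l_k=4.0, n=201)
print(f"approximate 95% interval for lambda_k: ({lower:.3f}, {upper:.3f})")
```

The interval is not symmetric about $l_k$, because the approximate standard deviation of $l_k$ is proportional to the unknown $\lambda_k$ itself.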

52 3. Properties of Sample Principal Components

where

$$\mathbf{T}_k = \frac{\lambda_k}{(n-1)} \sum_{\substack{l=1 \\ l \neq k}}^{p} \frac{\lambda_l}{(\lambda_l - \lambda_k)^2} \boldsymbol{\alpha}_l \boldsymbol{\alpha}_l'.$$

The matrix $\mathbf{T}_k$ has rank $(p - 1)$ as it has a single zero eigenvalue corresponding to the eigenvector $\boldsymbol{\alpha}_k$. This causes further complications, but it can be shown (Mardia et al., 1979, p. 233) that, approximately,

$$(n-1)(\mathbf{a}_k - \boldsymbol{\alpha}_k)'(l_k \mathbf{S}^{-1} + l_k^{-1} \mathbf{S} - 2\mathbf{I}_p)(\mathbf{a}_k - \boldsymbol{\alpha}_k) \sim \chi^2_{(p-1)}. \qquad (3.7.4)$$

Because $\mathbf{a}_k$ is an eigenvector of $\mathbf{S}$ with eigenvalue $l_k$, it follows that

$$l_k^{-1} \mathbf{S} \mathbf{a}_k = l_k^{-1} l_k \mathbf{a}_k = \mathbf{a}_k, \qquad l_k \mathbf{S}^{-1} \mathbf{a}_k = l_k l_k^{-1} \mathbf{a}_k = \mathbf{a}_k,$$

and

$$(l_k \mathbf{S}^{-1} + l_k^{-1} \mathbf{S} - 2\mathbf{I}_p)\mathbf{a}_k = \mathbf{a}_k + \mathbf{a}_k - 2\mathbf{a}_k = \mathbf{0},$$

so that the result (3.7.4) reduces to

$$(n-1)\boldsymbol{\alpha}_k'(l_k \mathbf{S}^{-1} + l_k^{-1} \mathbf{S} - 2\mathbf{I}_p)\boldsymbol{\alpha}_k \sim \chi^2_{(p-1)}. \qquad (3.7.5)$$

From (3.7.5) an approximate confidence region for $\boldsymbol{\alpha}_k$, with confidence coefficient $(1 - \alpha)$, has the form

$$(n-1)\boldsymbol{\alpha}_k'(l_k \mathbf{S}^{-1} + l_k^{-1} \mathbf{S} - 2\mathbf{I}_p)\boldsymbol{\alpha}_k \leq \chi^2_{(p-1);\alpha}$$

with fairly obvious notation.

Moving away from assumptions of multivariate normality, the nonparametric bootstrap of Efron and Tibshirani (1993), noted in Section 3.6, can be used to find confidence intervals for various parameters. In their Section 7.2, Efron and Tibshirani (1993) use bootstrap samples to estimate standard errors of estimates for $\alpha_{kj}$, and for the proportion of total variance accounted for by an individual PC. Assuming approximate normality and unbiasedness of the estimates, the standard errors can then be used to find confidence intervals for the parameters of interest. Alternatively, the ideas of Chapter 13 of Efron and Tibshirani (1993) can be used to construct an interval for $\lambda_k$ with confidence coefficient $(1 - \alpha)$, for example, consisting of a proportion $(1 - \alpha)$ of the values of $l_k$ arising from the replicated bootstrap samples. Intervals for elements of $\boldsymbol{\alpha}_k$ can be found in a similar manner. Milan and Whittaker (1995) describe a related but different idea, the parametric bootstrap. Here, residuals from a model based on the SVD, rather than the observations themselves, are bootstrapped.
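The normal-theory region (3.7.5) and the nonparametric percentile interval for $\lambda_k$ described above can be sketched side by side. This is an illustrative sketch only, not the procedure of any of the cited authors: it assumes NumPy, a zero-based component index `k`, simulated data, and hypothetical function names (`region_statistic`, `bootstrap_eigenvalue_interval`). The chi-squared critical value is computed as $-2\ln\alpha$, which is exact only for 2 degrees of freedom, that is, for $p = 3$ as in this example.

```python
import numpy as np


def region_statistic(X, k, alpha_k):
    """Left-hand side of (3.7.5) for a candidate vector alpha_k (k is zero-based)."""
    n, p = X.shape
    S = np.cov(X, rowvar=False)                  # sample covariance, divisor n - 1
    l = np.linalg.eigvalsh(S)[::-1]              # sample eigenvalues, largest first
    M = l[k] * np.linalg.inv(S) + S / l[k] - 2.0 * np.eye(p)
    return (n - 1) * float(alpha_k @ M @ alpha_k)


def bootstrap_eigenvalue_interval(X, k, alpha=0.05, n_boot=1000, seed=0):
    """Percentile interval: the central (1 - alpha) share of bootstrapped l_k."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    reps = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)         # resample rows with replacement
        S_b = np.cov(X[idx], rowvar=False)
        reps[b] = np.linalg.eigvalsh(S_b)[::-1][k]
    lo, hi = np.quantile(reps, [alpha / 2.0, 1.0 - alpha / 2.0])
    return float(lo), float(hi)


# Simulated data (an assumption for illustration): n = 200, p = 3,
# independent normal variables with variances 4, 2 and 1.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3)) * np.sqrt([4.0, 2.0, 1.0])

# The statistic vanishes at the sample eigenvector a_1, as the identity
# (l_k S^{-1} + l_k^{-1} S - 2 I_p) a_k = 0 implies.
a1 = np.linalg.eigh(np.cov(X, rowvar=False))[1][:, -1]
stat_at_sample_vector = region_statistic(X, 0, a1)

# Check whether the true first eigenvector (1, 0, 0)' lies in the 95% region.
stat_at_true_vector = region_statistic(X, 0, np.array([1.0, 0.0, 0.0]))
critical_value = -2.0 * np.log(0.05)             # chi-squared upper 5% point, 2 df only
in_region = stat_at_true_vector <= critical_value

lo, hi = bootstrap_eigenvalue_interval(X, 0)
print(stat_at_sample_vector, stat_at_true_vector, in_region, (lo, hi))
```

For other values of $p$ the critical value would have to come from a chi-squared table or a statistics library. A percentile interval of this kind makes no correction for bias in the bootstrapped eigenvalues.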
An example of bivariate confidence intervals for $(\alpha_{1j}, \alpha_{2j})$ is given by Milan and Whittaker.

Some theory underlying non-parametric bootstrap confidence intervals for eigenvalues and eigenvectors of covariance matrices is given by Beran and Srivastava (1985), while Romanazzi (1993) discusses estimation and confidence intervals for eigenvalues of both covariance and correlation matrices using another computationally intensive distribution-free procedure, the jackknife. Romanazzi (1993) shows that standard errors of eigenvalue estimators based on the jackknife can have substantial bias and are sensitive to outlying observations. Bootstrapping and the jackknife have also

