Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)
3. Properties of Sample Principal Components

In the case of sample correlation matrices, one further reason can be put forward for interest in the last few PCs, as found by Property A2. Raveh (1985) argues that the inverse $R^{-1}$ of a correlation matrix is of greater interest in some situations than $R$. It may then be more important to approximate $R^{-1}$ than $R$ in a few dimensions. If this is done using the spectral decomposition (Property A3) of $R^{-1}$, then the first few terms will correspond to the last few PCs, since the eigenvectors of $R$ and $R^{-1}$ are the same, except that their order is reversed. The rôle of the last few PCs will be discussed further in Sections 3.4 and 3.7, and again in Sections 6.3, 8.4, 8.6 and 10.1.

One further property, which is concerned with the use of principal components in regression, will now be discussed. Standard terminology from regression is used and will not be explained in detail (see, for example, Draper and Smith (1998)). An extensive discussion of the use of principal components in regression is given in Chapter 8.

Property A7. Suppose now that $X$, defined as above, consists of $n$ observations on $p$ predictor variables $x$ measured about their sample means, and that the corresponding regression equation is

$$y = X\beta + \varepsilon, \qquad (3.1.5)$$

where $y$ is the vector of $n$ observations on the dependent variable, again measured about the sample mean. (The notation $y$ for the dependent variable has no connection with the usage of $y$ elsewhere in the chapter, but is standard in regression.) Suppose that $X$ is transformed by the equation $Z = XB$, where $B$ is a $(p \times p)$ orthogonal matrix. The regression equation can then be rewritten as

$$y = Z\gamma + \varepsilon,$$

where $\gamma = B^{-1}\beta$. The usual least squares estimator for $\gamma$ is $\hat{\gamma} = (Z'Z)^{-1}Z'y$. Then the elements of $\hat{\gamma}$ have, successively, the smallest possible variances if $B = A$, the matrix whose $k$th column is the $k$th eigenvector of $X'X$, and hence the $k$th eigenvector of $S$. Thus $Z$ consists of values of the sample principal components for $x$.

Proof. From standard results in regression (Draper and Smith, 1998, Section 5.2) the covariance matrix of the least squares estimator $\hat{\gamma}$ is proportional to

$$(Z'Z)^{-1} = (B'X'XB)^{-1} = B^{-1}(X'X)^{-1}(B')^{-1} = B'(X'X)^{-1}B,$$

as $B$ is orthogonal. We require that $\operatorname{tr}\bigl(B_q'(X'X)^{-1}B_q\bigr)$, $q = 1, 2, \ldots, p$, be minimized, where $B_q$ consists of the first $q$ columns of $B$. But, replacing $\Sigma_y$ by $(X'X)^{-1}$ in Property A2 of Section 2.1 shows that $B_q$ must consist of
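Property A7 lends itself to a quick numerical check. The sketch below is illustrative only and is not part of the book's text; the simulated data and the helper `coef_variances` are assumptions introduced here. It verifies that the cumulative variances of the elements of $\hat{\gamma}$, which are proportional to the diagonal entries of $(Z'Z)^{-1}$, are never larger when $B = A$, the eigenvector matrix of $X'X$, than when $B$ is an arbitrary orthogonal matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 100, 5
X = rng.standard_normal((n, p))
X -= X.mean(axis=0)                      # predictors measured about their sample means

# B = A: columns are eigenvectors of X'X, reordered so eigenvalues decrease.
eigvals, A = np.linalg.eigh(X.T @ X)     # eigh returns ascending eigenvalues
A = A[:, ::-1]

# An arbitrary alternative orthogonal B, from a QR decomposition of random noise.
B, _ = np.linalg.qr(rng.standard_normal((p, p)))

def coef_variances(X, B):
    """Diagonal of (Z'Z)^{-1} with Z = XB, i.e. var(gamma_hat) up to sigma^2."""
    Z = X @ B
    return np.diag(np.linalg.inv(Z.T @ Z))

v_pc = coef_variances(X, A)              # variances when Z holds the sample PCs
v_other = coef_variances(X, B)           # variances under an arbitrary rotation

# With B = A, Z'Z is diagonal, so each variance is the reciprocal of an eigenvalue.
print(np.allclose(v_pc, 1.0 / eigvals[::-1]))                   # expected: True

# "Successively smallest": the first q variances, summed, are minimal for every q.
print(np.all(np.cumsum(v_pc) <= np.cumsum(v_other) + 1e-12))    # expected: True
```

The cumulative-sum comparison mirrors the proof: the sum of the first $q$ coefficient variances is $\operatorname{tr}\bigl(B_q'(X'X)^{-1}B_q\bigr)$, which Property A2 shows is minimized by the principal component choice of $B$ for every $q$.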
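Returning to the earlier remark attributed to Raveh (1985), the following sketch (again illustrative, with simulated data assumed rather than taken from the book) checks that the eigenvectors of $R$ and $R^{-1}$ coincide with their order reversed, so the leading terms of the spectral decomposition of $R^{-1}$ are built from the last few PCs of $R$.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.standard_normal((200, 4))
data[:, 3] = data[:, 0] + 0.3 * rng.standard_normal(200)   # induce some correlation
R = np.corrcoef(data, rowvar=False)

evals_R, evecs_R = np.linalg.eigh(R)                  # ascending eigenvalues
evals_Ri, evecs_Ri = np.linalg.eigh(np.linalg.inv(R))

# Eigenvalues of R^{-1} are reciprocals of those of R, with the order reversed.
print(np.allclose(evals_Ri, 1.0 / evals_R[::-1]))                 # expected: True

# Eigenvectors agree (up to sign) once the order is reversed, so the leading
# spectral terms of R^{-1} correspond to the last few PCs of R.
p = R.shape[0]
for k in range(p):
    v, w = evecs_R[:, k], evecs_Ri[:, p - 1 - k]
    print(np.isclose(abs(v @ w), 1.0))                            # expected: True
```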
