12.07.2015 Views

Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)



2.1. Optimal Algebraic Properties of Population Principal Components

Now $\sum_{k=1}^{q} c_{jk}^2$ is the coefficient of $\lambda_j$ in (2.1.6), the sum of these coefficients is $q$ from (2.1.7), and none of the coefficients can exceed 1, from (2.1.8). Because $\lambda_1 > \lambda_2 > \cdots > \lambda_p$, it is fairly clear that $\sum_{j=1}^{p} \bigl(\sum_{k=1}^{q} c_{jk}^2\bigr) \lambda_j$ will be maximized if we can find a set of $c_{jk}$ for which

$$
\sum_{k=1}^{q} c_{jk}^2 =
\begin{cases}
1, & j = 1, \ldots, q, \\
0, & j = q+1, \ldots, p.
\end{cases}
\tag{2.1.9}
$$

But if $B' = A_q'$, then

$$
c_{jk} =
\begin{cases}
1, & 1 \le j = k \le q, \\
0, & \text{elsewhere},
\end{cases}
$$

which satisfies (2.1.9). Thus $\operatorname{tr}(\Sigma_y)$ achieves its maximum value when $B' = A_q'$. □

Property A2. Consider again the orthonormal transformation

$$
y = B'x,
$$

with $x$, $B$, $A$ and $\Sigma_y$ defined as before. Then $\operatorname{tr}(\Sigma_y)$ is minimized by taking $B = A_q^*$, where $A_q^*$ consists of the last $q$ columns of $A$.

Proof. The derivation of PCs given in Chapter 1 can easily be turned around for the purpose of looking for, successively, linear functions of $x$ whose variances are as small as possible, subject to being uncorrelated with previous linear functions. The solution is again obtained by finding eigenvectors of $\Sigma$, but this time in reverse order, starting with the smallest. The argument that proved Property A1 can be similarly adapted to prove Property A2. □

The statistical implication of Property A2 is that the last few PCs are not simply unstructured left-overs after removing the important PCs. Because these last PCs have variances as small as possible they are useful in their own right. They can help to detect unsuspected near-constant linear relationships between the elements of $x$ (see Section 3.4), and they may also be useful in regression (Chapter 8), in selecting a subset of variables from $x$ (Section 6.3), and in outlier detection (Section 10.1).

Property A3. (the Spectral Decomposition of $\Sigma$)

$$
\Sigma = \lambda_1 \alpha_1 \alpha_1' + \lambda_2 \alpha_2 \alpha_2' + \cdots + \lambda_p \alpha_p \alpha_p'.
\tag{2.1.10}
$$

Proof. $\Sigma = A \Lambda A'$ from (2.1.4), and expanding the right-hand side matrix product shows that $\Sigma$ equals

$$
\sum_{k=1}^{p} \lambda_k \alpha_k \alpha_k',
$$

as required (see the derivation of (2.1.6)). □
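Properties A1 to A3 can be checked numerically on a small example. The following is a minimal sketch in Python with NumPy, not part of the original text. The covariance matrix `sigma` is an arbitrary randomly generated symmetric positive semi-definite matrix, and the helper names `sorted_eigen`, `projected_trace` and `spectral_sum` are hypothetical names chosen for illustration. It compares tr(Σ_y) for B equal to the first q eigenvectors, the last q eigenvectors, and a random matrix with orthonormal columns, and then rebuilds Σ from the sum in (2.1.10).

```python
import numpy as np


def sorted_eigen(sigma):
    """Return eigenvalues in decreasing order and the matching eigenvectors as columns."""
    eigvals, eigvecs = np.linalg.eigh(sigma)  # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]         # reorder so that lambda_1 >= ... >= lambda_p
    return eigvals[order], eigvecs[:, order]


def projected_trace(sigma, B):
    """tr(Sigma_y) for y = B'x, where Sigma_y = B' Sigma B."""
    return np.trace(B.T @ sigma @ B)


def spectral_sum(lam, A):
    """Sum over k of lambda_k * alpha_k alpha_k', the right-hand side of (2.1.10)."""
    return sum(l * np.outer(a, a) for l, a in zip(lam, A.T))


rng = np.random.default_rng(0)
p, q = 5, 2

# Illustrative covariance matrix (an assumption, not data from the text).
M = rng.standard_normal((p, p))
sigma = M @ M.T

lam, A = sorted_eigen(sigma)
A_q = A[:, :q]        # first q columns of A (Property A1)
A_star_q = A[:, -q:]  # last q columns of A (Property A2)

# A random p x q matrix with orthonormal columns, for comparison.
B, _ = np.linalg.qr(rng.standard_normal((p, q)))

t_max = projected_trace(sigma, A_q)
t_min = projected_trace(sigma, A_star_q)
t_rand = projected_trace(sigma, B)

print("tr(Sigma_y) with B = A_q   :", t_max)
print("tr(Sigma_y) with random B  :", t_rand)
print("tr(Sigma_y) with B = A*_q  :", t_min)
print("Spectral sum equals Sigma  :", np.allclose(spectral_sum(lam, A), sigma))
```

In this sketch the trace for any orthonormal B should fall between the sum of the q smallest and the sum of the q largest eigenvalues, which is what Properties A1 and A2 assert. A single random B illustrates the bounds but does not prove them.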
