Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)
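3.2. Geometric Properties of Sample Principal Components

[…] transformation for which $\mathbf{B} = \mathbf{A}_q$ minimizes the distortion in the configuration as measured by $\|\mathbf{Y}\mathbf{Y}' - \mathbf{X}\mathbf{X}'\|$, where $\|\cdot\|$ denotes Euclidean norm and $\mathbf{Y}$ is a matrix with $(i, j)$th element $\tilde{y}_{ij} - \bar{y}_j$.

Proof. $\mathbf{Y} = \mathbf{X}\mathbf{B}$, so $\mathbf{Y}\mathbf{Y}' = \mathbf{X}\mathbf{B}\mathbf{B}'\mathbf{X}'$ and $\|\mathbf{Y}\mathbf{Y}' - \mathbf{X}\mathbf{X}'\| = \|\mathbf{X}\mathbf{B}\mathbf{B}'\mathbf{X}' - \mathbf{X}\mathbf{X}'\|$.

A matrix result given by Rao (1973, p. 63) states that if $\mathbf{F}$ is a symmetric matrix of rank $p$ with spectral decomposition

\[
\mathbf{F} = f_1 \boldsymbol{\phi}_1 \boldsymbol{\phi}_1' + f_2 \boldsymbol{\phi}_2 \boldsymbol{\phi}_2' + \cdots + f_p \boldsymbol{\phi}_p \boldsymbol{\phi}_p',
\]

and $\mathbf{G}$ is a matrix of rank $q$ …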
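\[
\begin{aligned}
\cdots &= \sum_{k=q+1}^{p} l_k \,\|\mathbf{a}_k \mathbf{a}_k'\| \\
&= \sum_{k=q+1}^{p} l_k \left[ \sum_{i=1}^{p} \sum_{j=1}^{p} (a_{ki} a_{kj})^2 \right]^{1/2} \\
&= \sum_{k=q+1}^{p} l_k \left[ \sum_{i=1}^{p} a_{ki}^2 \sum_{j=1}^{p} a_{kj}^2 \right]^{1/2} \\
&= \sum_{k=q+1}^{p} l_k ,
\end{aligned}
\]

as $\mathbf{a}_k' \mathbf{a}_k = 1$, $k = 1, 2, \ldots, p$.

Property G4 is very similar to another optimality property of PCs, discussed in terms of the so-called RV-coefficient by Robert and Escoufier (1976). The RV-coefficient was introduced as a measure of the similarity between two configurations of $n$ data points, as described by $\mathbf{X}\mathbf{X}'$ and $\mathbf{Y}\mathbf{Y}'$. The distance between the two configurations is defined by Robert and Escoufier (1976) as

\[
\left\| \frac{\mathbf{X}\mathbf{X}'}{\{\operatorname{tr}(\mathbf{X}\mathbf{X}')^2\}^{1/2}} - \frac{\mathbf{Y}\mathbf{Y}'}{\{\operatorname{tr}(\mathbf{Y}\mathbf{Y}')^2\}^{1/2}} \right\|, \tag{3.2.1}
\]

where the divisors of $\mathbf{X}\mathbf{X}'$, $\mathbf{Y}\mathbf{Y}'$ are introduced simply to standardize the representation of each configuration in the sense that

\[
\left\| \frac{\mathbf{X}\mathbf{X}'}{\{\operatorname{tr}(\mathbf{X}\mathbf{X}')^2\}^{1/2}} \right\| = \left\| \frac{\mathbf{Y}\mathbf{Y}'}{\{\operatorname{tr}(\mathbf{Y}\mathbf{Y}')^2\}^{1/2}} \right\| = 1.
\]

It can then be shown that (3.2.1) equals $[2(1 - \mathrm{RV}(\mathbf{X}, \mathbf{Y}))]^{1/2}$, where the RV-coefficient is defined as

\[
\mathrm{RV}(\mathbf{X}, \mathbf{Y}) = \frac{\operatorname{tr}(\mathbf{X}\mathbf{Y}'\mathbf{Y}\mathbf{X}')}{\{\operatorname{tr}(\mathbf{X}\mathbf{X}')^2 \operatorname{tr}(\mathbf{Y}\mathbf{Y}')^2\}^{1/2}}. \tag{3.2.2}
\]

Thus, minimizing the distance measure (3.2.1) which, apart from standardizations, is the same as the criterion of Property G4, is equivalent to maximization of $\mathrm{RV}(\mathbf{X}, \mathbf{Y})$. Robert and Escoufier (1976) show that several multivariate techniques can be expressed in terms of maximizing $\mathrm{RV}(\mathbf{X}, \mathbf{Y})$ for some definition of $\mathbf{X}$ and $\mathbf{Y}$. In particular, if $\mathbf{Y}$ is restricted to be of the form $\mathbf{Y} = \mathbf{X}\mathbf{B}$, where $\mathbf{B}$ is a $(p \times q)$ matrix such that the columns of $\mathbf{Y}$ are uncorrelated, then maximization of $\mathrm{RV}(\mathbf{X}, \mathbf{Y})$ leads to $\mathbf{B} = \mathbf{A}_q$, that is $\mathbf{Y}$ consists of scores on the first $q$ PCs. We will meet the RV-coefficient again in Chapter 6 in the context of variable selection.

Property G5. The algebraic derivation of sample PCs reduces to finding, successively, vectors $\mathbf{a}_k$, $k = 1, 2, \ldots, p$, that maximize $\mathbf{a}_k' \mathbf{S} \mathbf{a}_k$ subject to $\mathbf{a}_k' \mathbf{a}_k = 1$, and subject to $\mathbf{a}_k' \mathbf{a}_l = 0$ for $l$ …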
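The relationship between the distance (3.2.1) and the RV-coefficient (3.2.2) can be checked numerically. The following is a minimal sketch, not taken from the book: the function names, the simulated data, and the random comparison matrix `B_rand` are all illustrative assumptions. The sketch also assumes that X is an (n × p) column-centred matrix and Y is (n × q), and it computes the numerator of RV as tr(XX′YY′), which is the form obtained by expanding the squared norm in (3.2.1) and is well defined when p and q differ.

```python
import numpy as np

def rv_coefficient(X, Y):
    """RV-coefficient between two configurations of the same n points.

    Assumes X is (n x p) and Y is (n x q), both column-centred.
    The numerator is computed as tr(XX'YY') (an assumption, see lead-in).
    """
    Sx = X @ X.T  # n x n configuration matrix XX'
    Sy = Y @ Y.T  # n x n configuration matrix YY'
    num = np.trace(Sx @ Sy)
    den = np.sqrt(np.trace(Sx @ Sx) * np.trace(Sy @ Sy))
    return num / den

def standardized_distance(X, Y):
    """Distance (3.2.1): Euclidean (Frobenius) norm of the difference
    between the two standardized configuration matrices."""
    Sx = X @ X.T
    Sy = Y @ Y.T
    # For a symmetric matrix M, the Frobenius norm equals {tr(M^2)}^{1/2}.
    Sx = Sx / np.linalg.norm(Sx)
    Sy = Sy / np.linalg.norm(Sy)
    return np.linalg.norm(Sx - Sy)

# Simulated, column-centred data (illustrative only).
rng = np.random.default_rng(0)
n, p, q = 50, 5, 2
X = rng.normal(size=(n, p)) @ rng.normal(size=(p, p))
X = X - X.mean(axis=0)

# A_q: eigenvectors of the sample covariance matrix S for the q largest eigenvalues.
S = X.T @ X / (n - 1)
eigvals, eigvecs = np.linalg.eigh(S)  # eigenvalues in ascending order
A_q = eigvecs[:, ::-1][:, :q]
Y_pc = X @ A_q                        # scores on the first q PCs

# Comparison: an arbitrary (p x q) matrix with orthonormal columns.
B_rand, _ = np.linalg.qr(rng.normal(size=(p, q)))
Y_rand = X @ B_rand

rv_pc = rv_coefficient(X, Y_pc)
rv_rand = rv_coefficient(X, Y_rand)
dist_pc = standardized_distance(X, Y_pc)
dist_rand = standardized_distance(X, Y_rand)

print("RV with first q PCs:", rv_pc, " distance:", dist_pc)
print("RV with random B:   ", rv_rand, " distance:", dist_rand)
print("sqrt(2(1 - RV)) for PCs:", np.sqrt(2.0 * (1.0 - rv_pc)))
```

In this sketch the PC-based projection should give an RV-coefficient at least as large as the random projection, and the computed distance should agree with [2(1 − RV)]^{1/2} up to rounding error.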