Notes on Principal Component Analysis
Notes on Principal Component Analysis
Notes on Principal Component Analysis
Create successful ePaper yourself
Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.
The covariance matrix S (or Σ) can be represented by<br />
⎡ √ ⎤ ⎡ √ ⎤ ⎡<br />
λ1 · · · 0<br />
λ1 · · · 0<br />
a ′ 1<br />
S = [a 1 , . . . , a p ]<br />
⎢<br />
. .<br />
⎣<br />
√<br />
⎥ ⎢<br />
. .<br />
⎦ ⎣<br />
√<br />
⎥ ⎢<br />
.<br />
⎦ ⎣<br />
0 · · · λp 0 · · · λp a ′ p<br />
or as the sum of p, p × p matrices,<br />
⎤<br />
⎥<br />
⎦<br />
≡ LL ′<br />
S = λ 1 a 1 a ′ 1 + · · · + λ p a p a ′ p .<br />
Given the ordering of the eigenvalues as λ 1 ≥ λ 2 ≥ · · · ≥ λ p ≥ 0,<br />
the least-squares approximati<strong>on</strong> to S of rank r is λ 1 a 1 a ′ 1 + · · · +<br />
λ 1 a r a ′ r, and the residual matrix, S − λ 1 a 1 a ′ 1 − · · · − λ 1 a r a ′ r, is<br />
λ r+1 a r+1 a ′ r+1 + · · · + λ p a p a ′ p.<br />
Note that for an arbitrary matrix, B p×q , the Tr(BB ′ ) = sum of<br />
squares of the entries in B. Also, for two matrices, B and C, if both<br />
of the products BC and CB can be taken, then Tr(BC) is equal<br />
to Tr(CB). Using these two results, the least-squares criteri<strong>on</strong> value<br />
can be given as<br />
Tr([λ r+1 a r+1 a ′ r+1 + · · · + λ p a p a ′ p][λ r+1 a r+1 a ′ r+1 + · · · + λ p a p a ′ p] ′ ) =<br />
∑<br />
k≥r+1<br />
λ 2 k .<br />
This measure is <strong>on</strong>e of how bad the rank r approximati<strong>on</strong> might be<br />
(i.e., the proporti<strong>on</strong> of unexplained sum-of-squares when put over<br />
∑p<br />
k=1 λ 2 k).<br />
For a geometric interpretati<strong>on</strong> of principal comp<strong>on</strong>ents, suppose<br />
we have two variables, X 1 and X 2 , that are centered at their respective<br />
means (i.e., the means of the scores <strong>on</strong> X 1 and X 2 are zero). In<br />
5