Notes on Principal Component Analysis
Notes on Principal Component Analysis
Notes on Principal Component Analysis
Create successful ePaper yourself
Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.
the diagram below, the ellipse represents the scatter diagram of the<br />
sample points. The first principal comp<strong>on</strong>ent is a line through the<br />
widest part; the sec<strong>on</strong>d comp<strong>on</strong>ent is the line at right angles to the<br />
first principal comp<strong>on</strong>ent. In other words, the first principal comp<strong>on</strong>ent<br />
goes through the fattest part of the “football”, and the sec<strong>on</strong>d<br />
principal comp<strong>on</strong>ent through the next fattest part of the “football”<br />
and orthog<strong>on</strong>al to the first; and so <strong>on</strong>. Or, we take our original frame<br />
of reference and do a rigid transformati<strong>on</strong> around the origin to get a<br />
new set of axes; the origin is given by the sample means (of zero) <strong>on</strong><br />
the two X 1 and X 2 variables. (To make these same geometric points,<br />
we could have used a c<strong>on</strong>stant density c<strong>on</strong>tour for a bivariate normal<br />
pair of random variables, X 1 and X 2 , with zero mean vector.)<br />
sec<strong>on</strong>d comp<strong>on</strong>ent<br />
X 2<br />
first comp<strong>on</strong>ent<br />
X 1<br />
As an example of how to find the placement of the comp<strong>on</strong>ents in<br />
the picture given above, suppose we have the two variables, X 1 and<br />
X 2 , with variance-covariance matrix<br />
Σ =<br />
⎛<br />
⎜<br />
⎝<br />
σ1 2 σ 12<br />
σ 12 σ2<br />
2<br />
⎞<br />
⎟<br />
⎠ .<br />
6