Jolliffe, I. (2002). Principal Component Analysis, 2nd edition. Springer.

A.1. Numerical Calculation of Principal Components

… observation on each PC, and hence all the information that is required to construct a biplot (see Section 5.3). The PC scores would otherwise need to be derived as an extra step after calculating the eigenvalues and eigenvectors of the covariance or correlation matrix S = (1/(n − 1))X′X.

The values of the PC scores are related to the eigenvectors of XX′, which can be derived from the eigenvectors of X′X (see the proof of Property G4 in Section 3.2); conversely, the eigenvectors of X′X can be found from those of XX′. In circumstances where the sample size n is smaller than the number of variables p, XX′ has smaller dimensions than X′X, so that it can be advantageous to use the algorithms described above, based on the power method or QL method, on a multiple of XX′ rather than X′X in such cases. Large computational savings are possible when n ≪ p, as in chemical spectroscopy or in the genetic example of Hastie et al. (2000), which is described in Section 9.2 and which has n = 48, p = 4673; a small numerical sketch of this route is given at the end of this section. Algorithms also exist for updating the SVD if data arrive sequentially (see, for example, Berry et al. (1995)).

Neural Network Algorithms

Neural networks provide ways of extending PCA, including some non-linear generalizations (see Sections 14.1.3, 14.6.1). They also give alternative algorithms for estimating 'ordinary' PCs. The main difference between these algorithms and the techniques described earlier in the Appendix is that most are 'adaptive' rather than 'batch' methods. If the whole of a data set is collected before PCA is done and parallel processing is not possible, then batch methods such as the QR algorithm are hard to beat (see Diamantaras and Kung, 1996, hereafter DK96, Sections 3.5.3, 4.4.1). On the other hand, if data arrive sequentially and PCs are re-estimated when new data become available, then adaptive neural network algorithms come into their own; a sketch of one such single-unit rule also appears at the end of this section. DK96, Section 4.2.7, note that 'there is a plethora of alternative [neural network] techniques that perform PCA.' They describe a selection of single-layer techniques in their Section 4.2, with an overview of these in their Table 4.1. Different algorithms arise depending on

• whether the first or last few PCs are of interest;
• whether one or more than one PC is required;
• whether individual PCs are wanted or whether subspaces spanned by several PCs will suffice;
• whether the network is required to be biologically plausible.

DK96, Section 4.2.7, treat finding the last few PCs as a different technique, calling it minor component analysis.

In their Section 4.4, DK96 compare the properties, including speed, of seven algorithms using simulated data. In Section 4.5 they discuss multilayer networks.
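The reduction from X′X to XX′ when n ≪ p can be made concrete with a short sketch. The code below is illustrative only: the function name pca_small_n and the use of NumPy's eigendecomposition are choices made here, not the book's, and the data are simulated with the dimensions n = 48, p = 4673 quoted above. It exploits Property G4 (the nonzero eigenvalues of X′X and XX′ coincide, and their eigenvectors are related through X).

import numpy as np

def pca_small_n(X):
    # Hypothetical helper: PCA for an already column-centred n x p matrix X
    # with n << p, working with the n x n matrix XX' instead of the p x p
    # matrix X'X (their nonzero eigenvalues coincide).
    n, p = X.shape
    G = X @ X.T                               # n x n matrix
    evals, U = np.linalg.eigh(G)              # eigenvalues in ascending order
    order = np.argsort(evals)[::-1]
    evals, U = evals[order], U[:, order]
    keep = evals > evals.max() * 1e-10        # discard numerically zero eigenvalues
    evals, U = evals[keep], U[:, keep]
    V = X.T @ U / np.sqrt(evals)              # eigenvectors of X'X (PC coefficients)
    scores = U * np.sqrt(evals)               # PC scores, equal to X @ V
    return evals / (n - 1), V, scores         # eigenvalues of S = X'X/(n-1)

# Check against the SVD on simulated data of the size quoted above
rng = np.random.default_rng(0)
X = rng.standard_normal((48, 4673))
X -= X.mean(axis=0)
lam, V, scores = pca_small_n(X)
s = np.linalg.svd(X, compute_uv=False)
print(np.allclose(lam[:5], s[:5] ** 2 / (48 - 1)))   # True if the two routes agree

The only large object ever formed is the 48 × 48 matrix XX′, which is where the computational saving comes from.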
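As an illustration of the adaptive setting, the sketch below implements Oja's single-unit learning rule, a well-known neural network algorithm that estimates the first PC from observations presented one at a time. It is offered as a minimal example of the 'adaptive' style described above, not as a reproduction of any particular algorithm from DK96; the function name, learning rate and number of passes are arbitrary illustrative choices.

import numpy as np

def oja_first_pc(observations, p, eta=0.005, n_passes=20):
    # Illustrative sketch of Oja's single-unit rule: the weight vector w is
    # updated after each (already centred) observation and converges towards
    # the first PC without the covariance matrix ever being formed.
    rng = np.random.default_rng(1)
    w = rng.standard_normal(p)
    w /= np.linalg.norm(w)
    for _ in range(n_passes):
        for x in observations:
            y = w @ x                      # output of the single linear unit
            w += eta * y * (x - y * w)     # Hebbian term with a normalizing decay
    return w / np.linalg.norm(w)

# Compare with the batch (eigenvector) answer on simulated data
rng = np.random.default_rng(2)
X = rng.standard_normal((2000, 5)) * np.array([3.0, 1.5, 1.0, 0.5, 0.2])
X -= X.mean(axis=0)
w = oja_first_pc(X, p=5)
v1 = np.linalg.eigh(np.cov(X, rowvar=False))[1][:, -1]
print(abs(w @ v1))   # close to 1 when the adaptive estimate aligns with the first PC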
