Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)

11. Rotation and Interpretation of Principal Components

variables in the functions. The constraints are designed to make the resulting components simpler to interpret than PCs, but without sacrificing too much of the variance accounted for by the PCs. The first idea, discussed in Section 11.2.1, is a simple one, namely, restricting coefficients to a set of integers, though it is less simple to put into practice. The second type of technique, described in Section 11.2.2, borrows an idea from regression, that of the LASSO (Least Absolute Shrinkage and Selection Operator). By imposing an additional constraint in the PCA optimization problem, namely, that the sum of the absolute values of the coefficients in a component is bounded, some of the coefficients can be forced to zero. A technique from atmospheric science, empirical orthogonal teleconnections, is described in Section 11.2.3, and Section 11.2.4 makes comparisons between some of the techniques introduced so far in the chapter.

11.2.1 Components with Discrete-Valued Coefficients

A fairly obvious way of constructing simpler versions of PCs is to successively find linear functions of the p variables that maximize variance, as in PCA, but with a restriction on the values of coefficients in those functions to a small number of values. An extreme version of this was suggested by Hausmann (1982), in which the loadings are restricted to the values +1, −1 and 0. To implement the technique, Hausmann (1982) suggests the use of a branch-and-bound algorithm. The basic algorithm does not include an orthogonality constraint on the vectors of loadings of successive ‘components,’ but Hausmann (1982) adapts it to impose this constraint. This improves interpretability and speeds up the algorithm, but has the implication that it may not be possible to find as many as p components. In the 6-variable example given by Hausmann (1982), after 4 orthogonal components have been found with coefficients restricted to {−1, 0, +1}, the null vector is the only vector with the same restriction that is orthogonal to all four already found. In an unpublished M.Sc. project report, Brooks (1992) discusses some other problems associated with Hausmann’s algorithm.

Further information on Hausmann’s example is given in Table 11.2. Here the following can be seen:

• The first component is a straightforward average or ‘size’ component in both analyses.

• Despite a considerable simplification, and a moderately different interpretation, for the second constrained component, there is very little loss in the variance accounted for by the first two constrained components compared to the first two PCs.

A less restrictive method is proposed by Vines (2000), in which the coefficients are also restricted to integers. The algorithm for finding so-called simple components starts with a set of p particularly simple vectors of
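A concrete, if naive, illustration of the {−1, 0, +1} restriction may help. The sketch below is not Hausmann's branch-and-bound algorithm; it simply enumerates every nonzero vector with entries in {−1, 0, +1} (feasible only for small p), keeps those exactly orthogonal to the vectors already accepted, and takes the one maximizing the variance of the corresponding normalized linear combination. The function name and the random example data are illustrative, not from the book.

```python
import itertools
import numpy as np

def discrete_components(S, n_components=None):
    """Successive variance-maximizing vectors with entries in {-1, 0, +1},
    each exactly orthogonal to those already found.

    A brute-force stand-in for Hausmann's branch-and-bound search: all
    3**p - 1 nonzero candidate vectors are enumerated, so this is only
    practical for small p.
    """
    p = S.shape[0]
    if n_components is None:
        n_components = p
    candidates = [np.array(v) for v in itertools.product((-1, 0, 1), repeat=p)
                  if any(v)]
    found = []
    for _ in range(n_components):
        best, best_var = None, -np.inf
        for a in candidates:
            # exact (integer) orthogonality to all previously accepted vectors
            if any(a @ b != 0 for b in found):
                continue
            var = (a @ S @ a) / (a @ a)   # variance of the normalized combination
            if var > best_var:
                best, best_var = a, var
        if best is None:   # only the null vector remains, as in Hausmann's
            break          # 6-variable example after 4 components
        found.append(best)
    return found

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 5))
    S = np.cov(X, rowvar=False)
    for i, a in enumerate(discrete_components(S, 3), start=1):
        print(f"component {i}: {a}, variance {(a @ S @ a) / (a @ a):.3f}")
```

As the early termination in the loop shows, requiring exact orthogonality over such a restricted set can exhaust the admissible vectors before p components are found, which is the limitation noted above for Hausmann's 6-variable example.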
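The LASSO-type bound mentioned in the introduction to this section (and developed in Section 11.2.2) amounts to maximizing a'Sa subject to the usual PCA normalization a'a = 1 and a bound t on the sum of the absolute values of the coefficients. The sketch below is not the algorithm discussed in the book; it simply hands that constrained problem to a general-purpose solver so the effect of tightening t can be seen. The function name and example data are assumptions for illustration, and the non-smooth L1 constraint is handled only approximately, so small coefficients are shrunk towards zero rather than set exactly to zero.

```python
import numpy as np
from scipy.optimize import minimize

def l1_bounded_component(S, t):
    """One 'component' maximizing a'Sa subject to a'a = 1 and sum(|a_j|) <= t.

    Illustrative only: a generic SLSQP solve of the L1-bounded PCA problem,
    not the dedicated algorithm of Section 11.2.2.
    """
    p = S.shape[0]
    a0 = np.zeros(p)
    a0[0] = 1.0                                   # feasible start whenever t >= 1
    constraints = [
        {"type": "eq",   "fun": lambda a: a @ a - 1.0},           # unit length
        {"type": "ineq", "fun": lambda a: t - np.abs(a).sum()},   # L1 bound
    ]
    res = minimize(lambda a: -(a @ S @ a), a0, method="SLSQP",
                   constraints=constraints)
    return res.x

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((200, 6))
    S = np.cov(X, rowvar=False)
    # t >= sqrt(p) leaves the bound inactive (ordinary first PC);
    # smaller t drives some coefficients towards zero
    for t in (np.sqrt(6), 1.5):
        a = l1_bounded_component(S, t)
        print(f"t = {t:.2f}: {np.round(a, 2)}")
```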
