12.07.2015 Views

Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)



344 13. Principal Component Analysis for Special Types of Data

... between variables or measures of ‘shape.’ When the variables are physical measurements on animals or plants the terms ‘size’ and ‘shape’ take on real meaning. The student anatomical measurements that were analysed in Sections 1.1, 4.1, 5.1 and 10.1 are of this type, and there is a large literature on the study of size and shape for non-human animals. Various approaches have been adopted for quantifying size and shape, only some of which involve PCA. We shall concentrate on the latter, though other ideas will be noted briefly.

The study of relationships between size and shape during the growth of organisms is sometimes known as allometry (Hills, 1982). The idea of using the first PC as a measure of size, with subsequent PCs defining various aspects of shape, dates back at least to Jolicoeur (1963). Sprent (1972) gives a good review of early work in the area from a mathematical/statistical point of view, and Blackith and Reyment (1971, Chapter 12) provide references to a range of early examples. It is fairly conventional, for reasons explained in Jolicoeur (1963), to take logarithms of the data, with PCA then conducted on the covariance matrix of the log-transformed data.

In circumstances where all the measured variables are thought to be of equal importance, it seems plausible that size should be a weighted average of the (log-transformed) variables with all weights equal. This is known as isometric size. While the first PC may have roughly equal weights (coefficients), sampling variability ensures that they are never exactly equal. Somers (1989) argues that the first PC contains a mixture of size and shape information, and that in order to examine ‘shape,’ an isometric component rather than the first PC should be removed.
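As a rough illustration of the log-covariance PCA just described, the following minimal Python sketch (NumPy assumed) compares the coefficients of the first PC with an equal-weights vector. The data are synthetic and the loadings are hypothetical, chosen only for illustration; nothing here is taken from the book's examples.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: n individuals, p positive measurements driven by a
# common "size" factor plus small independent noise (illustrative only).
n, p = 200, 4
size = rng.normal(0.0, 0.3, size=(n, 1))
loadings = np.array([1.0, 1.1, 0.9, 1.05])      # assumed, nearly equal
log_x = 2.0 + size * loadings + rng.normal(0.0, 0.05, size=(n, p))
x = np.exp(log_x)                               # raw measurements

# PCA on the covariance matrix of the log-transformed data.
S = np.cov(np.log(x), rowvar=False)
eigvals, eigvecs = np.linalg.eigh(S)            # eigh returns ascending order
order = np.argsort(eigvals)[::-1]
l = eigvals[order]                              # variances of the PCs
A = eigvecs[:, order]                           # columns are coefficient vectors

# First PC, with its sign fixed so that the coefficients sum to a positive value.
a1 = A[:, 0]
if a1.sum() < 0:
    a1 = -a1

# Equal-weights (isometric) unit vector for comparison.
a0 = np.ones(p) / np.sqrt(p)
cos_angle = float(a1 @ a0)

print("first PC coefficients:", np.round(a1, 3))
print("equal-weights vector: ", np.round(a0, 3))
print("cosine of angle between them:", round(cos_angle, 4))
```

With data of this kind the first PC typically lies close to the equal-weights direction without coinciding with it, which is the situation the text describes.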
A number of ways of ‘removing’ isometric size and then quantifying different aspects of shape have been suggested and are now discussed.

Recall that the covariance matrix (of the log-transformed data) can be written using the spectral decomposition in equation (3.1.4) as

$$S = l_1 a_1 a_1' + l_2 a_2 a_2' + \cdots + l_p a_p a_p'.$$

Removal of the first PC is achieved by removing the first term in this decomposition. The first, second, ..., PCs of the reduced matrix are then the second, third, ... PCs of $S$. Somers (1986) suggests removing $l_0 a_0 a_0'$ from $S$, where $a_0 = \frac{1}{\sqrt{p}}(1, 1, \ldots, 1)$ is the isometric vector and $l_0$ is the sample variance of $a_0' x$, and then carrying out a ‘PCA’ on the reduced matrix. This procedure has a number of drawbacks (Somers, 1989; Sundberg, 1989), including the fact that, unlike PCs, the shape components found in this way are correlated and have vectors of coefficients that are not orthogonal.

One alternative suggested by Somers (1989) is to find ‘shape’ components by doing a ‘PCA’ on a doubly-centred version of the log-transformed data. The double-centering is considered to remove size (see also Section 14.2.3) because the isometric vector is one of the eigenvectors of its ‘covariance’ matrix, with zero eigenvalue. Hence the vectors of coefficients of the shape
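The three operations discussed above (removing the first PC, removing an isometric term, and double-centering) can be checked numerically. The sketch below is a minimal illustration in Python with NumPy on synthetic log-measurements; the data, the loadings and all variable names are assumptions made for the example, and it is not a reproduction of the procedures in Somers (1986, 1989).

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 150, 4

# Hypothetical log-transformed measurements (synthetic, for illustration).
size = rng.normal(0.0, 0.3, size=(n, 1))
Y = size * np.array([1.0, 1.2, 0.8, 1.1]) + rng.normal(0.0, 0.1, size=(n, p))

# Spectral decomposition S = sum_k l_k a_k a_k'.
S = np.cov(Y, rowvar=False)
l, A = np.linalg.eigh(S)
l, A = l[::-1], A[:, ::-1]                      # descending order
S_rebuilt = sum(l[k] * np.outer(A[:, k], A[:, k]) for k in range(p))

# Removing the first PC: drop the first term of the decomposition.
S_minus_pc1 = S - l[0] * np.outer(A[:, 0], A[:, 0])
l_red = np.linalg.eigvalsh(S_minus_pc1)[::-1]   # equals l[1:], then ~0

# Removing an isometric term l0 * a0 a0' instead.
a0 = np.ones(p) / np.sqrt(p)
l0 = float(a0 @ S @ a0)                         # sample variance of a0'x
S_minus_iso = S - l0 * np.outer(a0, a0)
_, B = np.linalg.eigh(S_minus_iso)
# Covariance matrix of the derived components B'x under the original S;
# nonzero off-diagonal entries mean the components are correlated.
C = B.T @ S @ B
off_diag = C - np.diag(np.diag(C))

# Double-centering: subtract row and column means, add back the grand mean.
Y_dc = (Y - Y.mean(axis=1, keepdims=True)
          - Y.mean(axis=0, keepdims=True) + Y.mean())
S_dc = np.cov(Y_dc, rowvar=False)
iso_image = S_dc @ a0                           # ~0: a0 has zero eigenvalue

print("eigenvalues after removing PC1:", np.round(l_red, 4))
print("largest |off-diagonal| after isometric removal:",
      round(float(np.abs(off_diag).max()), 4))
print("norm of S_dc @ a0:", float(np.linalg.norm(iso_image)))
```

Each row of the doubly-centred data sums to zero, so the product of its covariance matrix with the isometric vector vanishes up to rounding error. That is the sense in which the isometric vector is an eigenvector with zero eigenvalue.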
