12.07.2015 Views

Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)

4. Interpreting Principal Components: Examples

It is not always the case that interpretation is straightforward. In atmospheric science the PCs or EOFs are often rotated in an attempt to find more clearly interpretable patterns. We return to this topic in Chapter 11.

Not only are the first few PCs readily interpreted in many meteorological and climatological examples, possibly after rotation, but they also frequently enable a considerable reduction to be made in the dimensions of the data set. In Maryon’s (1979) study, for example, there are initially 221 variables, but 16 PCs account for over 90% of the total variation. Nor is this due to any disparity between variances causing a few dominant PCs; size of variance is fairly similar for all 221 variables.

Maryon’s (1979) analysis was for a covariance matrix, which is reasonable since all variables are measured in the same units (see Sections 2.3 and 3.3). However, some atmospheric scientists advocate using correlation, rather than covariance, matrices so that patterns of spatial correlation can be detected without possible domination by the stations and gridpoints with the largest variances (see Wigley et al. (1984)).

It should be clear from this section that meteorologists and climatologists have played a leading role in applying PCA. In addition, they have developed many related methods to deal with the peculiarities of their data, which often have correlation structure in both time and space. A substantial part of Chapter 12 is devoted to these developments.

4.4 Properties of Chemical Compounds

The main example given in this section is based on a subset of data given by Hansch et al. (1973); the PCA was described by Morgan (1981). Seven properties (variables) were measured for each of 15 chemical substituents; the properties and substituents are listed in Table 4.5. Some of the results of a PCA based on the correlation matrix for these data are given in Table 4.6.

The aim of the work of Hansch et al. (1973), and of much subsequent research in quantitative structure–activity relationships (QSAR), is to relate aspects of the structure of chemicals to their physical properties or activities so that ‘new’ chemicals can be manufactured whose activities may be predicted in advance. Although PCA is less evident recently in the extensive QSAR literature, it appeared in a number of early papers on QSAR. For example, it was used in conjunction with regression (see Chapter 8 and Mager (1980a)), and as a discriminant technique (see Section 9.1 and Mager (1980b)). Here we look only at the reduction of dimensionality and interpretations obtained by Morgan (1981) in this analysis of Hansch et al.’s (1973) data. The first two PCs in Table 4.6 account for 79% of the total variation; the coefficients for each have a moderately simple structure. The first PC is essentially an average of all properties except π and MR, whereas the most important contribution to the second PC is an average of
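
The covariance-versus-correlation choice and the "k PCs account for x% of the total variation" summaries discussed above can be illustrated with a minimal NumPy sketch. The data here are synthetic and purely hypothetical (they are not Maryon's or Hansch et al.'s data), and the helper names `pca_variance_proportions` and `n_components_for` are introduced only for this illustration. One variable is deliberately given a much larger variance than the rest, to show how it can dominate a covariance-based PCA but not a correlation-based one.

```python
import numpy as np


def pca_variance_proportions(X, use_correlation=False):
    """Return PCA eigenvalues (descending) and cumulative proportions of variance.

    X has one row per observation and one column per variable.
    """
    X = np.asarray(X, dtype=float)
    # rowvar=False tells NumPy that columns (not rows) are the variables.
    if use_correlation:
        S = np.corrcoef(X, rowvar=False)
    else:
        S = np.cov(X, rowvar=False)
    # eigvalsh handles symmetric matrices and returns ascending eigenvalues.
    eigvals = np.linalg.eigvalsh(S)[::-1]
    # Guard against tiny negative values caused by rounding error.
    eigvals = np.clip(eigvals, 0.0, None)
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    return eigvals, cumulative


def n_components_for(cumulative, threshold):
    """Smallest number of PCs whose cumulative proportion reaches `threshold`."""
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(cumulative))


# Hypothetical data: 200 observations on 5 independent variables,
# with the first variable rescaled to have a far larger variance.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 0] *= 100.0

_, cum_cov = pca_variance_proportions(X)
_, cum_corr = pca_variance_proportions(X, use_correlation=True)

print("covariance PCA, PCs needed for 90%: ", n_components_for(cum_cov, 0.90))
print("correlation PCA, PCs needed for 90%:", n_components_for(cum_corr, 0.90))
```

In this artificial setting the covariance-based analysis is dominated by the single high-variance variable, whereas the correlation-based analysis treats all variables on an equal footing. When variances are similar and units are common, as in the meteorological example above, that kind of domination does not arise and a covariance matrix is a reasonable choice.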
