Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)



11.4. Physical Interpretation of Principal Components

In practice, there may be a circularity of argument when PCA is used to search for physically meaningful modes in atmospheric data. The form of these modes is often assumed known and PCA is used in an attempt to confirm them. When it fails to do so, it is ‘accused’ of being inadequate. However, it has very clear objectives, namely finding uncorrelated derived variables that in succession maximize variance. If the physical modes are not expected to maximize variance and/or to be uncorrelated, PCA should not be used to look for them in the first place.

One of the reasons why PCA ‘fails’ to find the expected physical modes in some cases in atmospheric science is because of its dependence on the size and shape of the spatial domain over which observations are taken. Buell (1975) considers various spatial correlation functions, both circular (isotropic) and directional (anisotropic), together with square, triangular and rectangular spatial domains. The resulting EOFs depend on the size of the domain in the sense that in a small domain with positive correlations between all points within it, the first EOF is certain to have all its elements of the same sign, with the largest absolute values near the centre of the domain, a sort of ‘overall size’ component (see Section 13.2). For larger domains, there may be negative as well as positive correlations, so that the first EOF represents a more complex pattern. This gives the impression of instability, because patterns that are present in the analysis of a large domain are not necessarily reproduced when the PCA is restricted to a subregion of this domain.

The shape of the domain also influences the form of the PCs. For example, if the first EOF has all its elements of the same sign, subsequent ones must represent ‘contrasts,’ with a mixture of positive and negative values, in order to satisfy orthogonality constraints.
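The behaviour on a small domain can be checked numerically. The following is a minimal sketch, not a reproduction of Buell's calculations: the 5 x 5 grid, the Gaussian form of the isotropic correlation function and the length scale are all assumptions chosen for illustration, and the helper names are hypothetical.

```python
import numpy as np

def grid_points(nx, ny):
    """Return an (nx*ny, 2) array of grid-point coordinates with unit spacing."""
    xs, ys = np.meshgrid(np.arange(nx), np.arange(ny), indexing="ij")
    return np.column_stack([xs.ravel(), ys.ravel()]).astype(float)

def isotropic_correlation(points, length_scale):
    """Assumed correlation function: exp(-(d / length_scale)**2), d = distance."""
    diff = points[:, None, :] - points[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)
    return np.exp(-dist2 / length_scale ** 2)

def eofs(corr):
    """Eigenvalues and eigenvectors (columns), ordered by decreasing eigenvalue."""
    vals, vecs = np.linalg.eigh(corr)
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]

# Small square domain; every pair of points is positively correlated.
nx, ny = 5, 5
points = grid_points(nx, ny)
corr = isotropic_correlation(points, length_scale=4.0)
vals, vecs = eofs(corr)

first = vecs[:, 0]
first = first * np.sign(first.sum())  # the sign of an eigenvector is arbitrary
second = vecs[:, 1]

# First EOF: all elements of one sign, largest value at the central grid point.
first_same_sign = bool(np.all(first > 0))
centre_index = (nx // 2) * ny + ny // 2
centre_is_largest = int(np.argmax(np.abs(first))) == centre_index

# Second EOF: orthogonal to an all-positive vector, so it must mix signs.
second_is_contrast = bool((second > 0).any() and (second < 0).any())

print(first.reshape(nx, ny).round(3))
print(first_same_sign, centre_is_largest, second_is_contrast)
```

On a square grid the second and third eigenvalues coincide by symmetry, so the particular second EOF returned is one arbitrary member of a two-dimensional family; the mixture of signs holds for every member.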
If the spatial correlation is isotropic, the contrast represented by the second PC will be between regions with the greatest geographical separation, and hence will be determined by the shape of the domain. Third and subsequent EOFs can also be predicted for isotropic correlations, given the shape of the domain (see Buell (1975) for diagrams illustrating this). However, if the correlation is anisotropic and/or non-stationary within the domain, things are less simple. In any case, the form of the correlation function is important in determining the PCs, and it is only when it takes particularly simple forms that the nature of the PCs can be easily predicted from the size and shape of the domain. PCA will often give useful information about the sources of maximum variance in a spatial data set, over and above that available from knowledge of the size and shape of the spatial domain of the data. However, in interpreting PCs derived from spatial data, it should not be forgotten that the size and shape of the domain can have a strong influence on the results. The degree of dependence of EOFs on domain shape has been, like the use of rotation and the possibility of physical interpretation, a source of controversy in atmospheric science. For an entertaining and enlightening exchange of strong views on the importance of domain shape, which also brings in rotation and interpretation, see Legates (1991, 1993) and Richman (1993).

Similar behaviour occurs for PCA of time series when the autocorrelation function (see Section 12.1) takes a simple form. Buell (1979) discusses this case, and it is well-illustrated by the road-running data which are analysed in Sections 5.3 and 12.3. The first PC has all its loadings of the same sign, with the greatest values in the middle of the race, while the second PC is a contrast between the race segments with the greatest separation in time. If there are predictable patterns in time or in spatial data, it may be of interest to examine the major sources of variation orthogonal to these predictable directions. This is related to what is done in looking for ‘shape’ components orthogonal to the isometric size vector for size and shape data (see Section 13.2). Rao’s ‘principal components uncorrelated with instrumental variables’ (Section 14.3) also have similar objectives.

A final comment is that even when there is no underlying structure in a data set, sampling variation ensures that some linear combinations of the variables have larger variances than others. Any ‘first PC’ which has actually arisen by chance can, with some ingenuity, be ‘interpreted.’ Of course, completely unstructured data is a rarity but we should always try to avoid ‘overinterpreting’ PCs, in the same way that in other branches of statistics we should be wary of spurious regression relationships or clusters, for example.
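The effect of sampling variation on unstructured data is easy to see in a small simulation. The sketch below assumes independent standard normal variables, with an arbitrary sample size, number of variables and random seed; none of these values comes from the text. In the population every standardized linear combination has variance 1, yet the sample correlation matrix still yields a ‘first PC’ with variance above 1.

```python
import numpy as np

rng = np.random.default_rng(0)    # arbitrary fixed seed, for repeatability only
n, p = 50, 10                     # assumed sample size and number of variables
X = rng.standard_normal((n, p))   # independent columns: no underlying structure

# PCA on the sample correlation matrix; its eigenvalues are the PC variances.
R = np.corrcoef(X, rowvar=False)
eigenvalues = np.linalg.eigvalsh(R)[::-1]   # sorted into decreasing order

# With no structure each PC would account for 1/p of the total variance.
share_first = eigenvalues[0] / p
print("PC variances:", eigenvalues.round(2))
print("share of first PC:", round(float(share_first), 3), "versus", 1 / p)
```

Because the eigenvalues of a correlation matrix sum to p, the largest sample eigenvalue exceeds 1 whenever the sample correlations are not all exactly zero, so an apparently dominant first PC is guaranteed even here.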

