Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)

There are a number of connections between PCA and the other techniques (links with principal coordinate analysis and biplots have already been discussed, while those with correspondence analysis are deferred until Section 13.1), but for most data sets one method is more appropriate than the others. Contingency table data imply correspondence analysis, and similarity or dissimilarity matrices suggest principal coordinate analysis, whereas PCA is defined for ‘standard’ data matrices of n observations on p variables. Notwithstanding these distinctions, different techniques have been used on the same data sets and a number of empirical comparisons have been reported in the ecological literature. Digby and Kempton (1987, Section 4.3) compare twelve ordination methods, including principal coordinate analysis, with five different similarity measures and correspondence analysis, on both species abundances and presence/absence data. The comparison is by means of a second-level ordination based on similarities between the results of the twelve methods. Gauch (1982, Chapter 4) discusses criteria for choosing an appropriate ordination technique for ecological data, and in Gauch (1982, Chapter 3) a number of studies are described which compare PCA with other techniques, including correspondence analysis, on simulated data. The data are generated to have a similar structure to that expected in some types of ecological data, with added noise, and investigations are conducted to see which techniques are ‘best’ at recovering the structure. However, as with comparisons between PCA and correspondence analysis given by Greenacre (1994, Section 9.6), the relevance to the data analysed of all the techniques compared is open to debate.
Different techniques implicitly assume that different types of structure or model are of interest for the data (see Section 14.2.3 for some further possibilities) and which technique is most appropriate will depend on which type of structure or model is relevant.

5.6 Methods for Graphical Display of Intrinsically High-Dimensional Data

Sometimes it will not be possible to reduce a data set’s dimensionality to two or three without a substantial loss of information; in such cases, methods for displaying many variables simultaneously in two dimensions may be useful. Plots of trigonometric functions due to Andrews (1972), illustrated below, and the display in terms of faces suggested by Chernoff (1973), for which several examples are given in Wang (1978), became popular in the 1970s and 1980s. There are many other possibilities (see, for example, Tukey and Tukey (1981) and Carr (1998)) which will not be discussed here. Recent developments in the visualization of high-dimensional data using the ever-increasing power of computers have created displays which are dynamic, colourful and potentially highly informative, but there
