12.07.2015 Views

Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)

5. Graphical Representation of Data Using Principal Components

by Chambers et al. (1983)). A more recent thorough review of graphics for multivariate data is given by Carr (1998). A major advance has been the development of dynamic multivariate graphics, which Carr (1998) describes as part of ‘the visual revolution in computer science.’ The techniques discussed in the present chapter are almost exclusively static, although some could be adapted to be viewed dynamically. Only those graphics that have links with, or can be used in conjunction with, PCA are included.

Section 5.2 discusses principal coordinate analysis, which constructs low-dimensional plots of a set of data from information about similarities or dissimilarities between pairs of observations. It turns out that the plots given by this analysis are equivalent to plots with respect to PCs in certain special cases.

The biplot, described in Section 5.3, is also closely related to PCA. There are a number of variants of the biplot idea, but all give a simultaneous display of n observations and p variables on the same two-dimensional diagram. In one of the variants, the plot of observations is identical to a plot with respect to the first two PCs, but the biplot simultaneously gives graphical information about the relationships between variables. The relative positions of variables and observations, which are plotted on the same diagram, can also be interpreted.

Correspondence analysis, which is discussed in Section 5.4, again gives two-dimensional plots, but only for data of a special form. Whereas PCA and the biplot operate on a matrix of n observations on p variables, and principal coordinate analysis and other types of scaling or ordination techniques use data in the form of a similarity or dissimilarity matrix, correspondence analysis is used on contingency tables, that is, data classified according to two categorical variables. The link with PCA is less straightforward than for principal coordinate analysis or the biplot, but the ideas of PCA and correspondence analysis have some definite connections. There are many other ordination and scaling methods that give graphical displays of multivariate data, and which have increasingly tenuous links to PCA. Some of these techniques are noted in Sections 5.2 and 5.4, and in Section 5.5 some comparisons are made, briefly, between PCA and the other techniques introduced in this chapter.

Another family of techniques, projection pursuit, is designed to find low-dimensional representations of a multivariate data set that best display certain types of structure such as clusters or outliers. Discussion of projection pursuit will be deferred until Chapters 9 and 10, which include sections on cluster analysis and outliers, respectively.

The final section of this chapter describes some methods which have been used for representing multivariate data in two dimensions when more than two or three PCs are needed to give an adequate representation of the data. The first q PCs can still be helpful in reducing the dimensionality in such cases, even when q is much larger than 2 or 3.
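
The remark that principal coordinate plots coincide with PC plots in certain special cases can be checked numerically. The sketch below is illustrative only and is not taken from the text: it assumes the special case in question is the familiar one in which the dissimilarities are Euclidean distances between the rows of the data matrix, and it uses classical scaling (double centring of squared distances followed by an eigendecomposition) as the principal coordinate method. The function names `pca_scores`, `principal_coordinates` and `align_signs`, and the simulated data, are hypothetical choices made for this example.

```python
import numpy as np


def pca_scores(X, k=2):
    """Scores on the first k PCs of column-centred data, computed via the SVD."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * s[:k]


def principal_coordinates(D, k=2):
    """Classical scaling: k-dimensional coordinates from pairwise distances D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centring matrix
    B = -0.5 * J @ (D ** 2) @ J                # doubly centred squared distances
    evals, evecs = np.linalg.eigh(B)           # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:k]        # indices of the k largest
    return evecs[:, order] * np.sqrt(np.clip(evals[order], 0.0, None))


def align_signs(A, B):
    """Flip columns of B so each has the same orientation as the column of A."""
    signs = np.sign(np.sum(A * B, axis=0))
    signs[signs == 0] = 1.0
    return B * signs


# Simulated data (an assumption for illustration): n = 30 observations, p = 5 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5)) * np.array([3.0, 2.0, 1.0, 0.5, 0.25])

# Euclidean distances between every pair of observations.
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

pc = pca_scores(X)
pco = align_signs(pc, principal_coordinates(D))

# Largest absolute discrepancy between the two sets of plotting coordinates.
max_abs_diff = float(np.max(np.abs(pc - pco)))
print(max_abs_diff)
```

Each axis is determined only up to a change of sign, which is why the columns are aligned before comparison. With other dissimilarity measures the two sets of coordinates need not agree.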
