12.07.2015 Views

Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)

5. Graphical Representation of Data Using Principal Components

sites in the south west of Ireland, and the group {43, 168, 169, 171, 172} in the bottom right of the diagram are all coastal sites in the south and east.

If we look at species, rather than sites, we find that similar species tend to be located in the same part of Figure 5.6. For example, three of the four species of goose which were recorded are in the bottom-right of the diagram (BG, WG, GG).

Turning to the simultaneous positions of species and sites, the Greylag Goose (GG) and Barnacle Goose (BG) were only recorded at site 171, among those sites which are numbered on Figure 5.6. On the plot, site 171 is closest in position of any site to the positions of these two species. The Whitefronted Goose (WG) is recorded at sites 171 and 172 only, the Gadwall (GA) at sites 43, 103, 168, 169, 172 among those labelled on the diagram, and the Common Sandpiper (CS) at all sites in the coastal group {43, 168, 169, 171, 172}, but at only one of the inland group {50, 53, 103, 155, 156, 235}. Again, these occurrences might be predicted from the relative positions of the sites and species on the plot. However, simple predictions are not always valid, as the Coot (CO), whose position on the plot is in the middle of the inland sites, is recorded at all 11 sites numbered on the figure.

5.5 Comparisons Between Principal Coordinates, Biplots, Correspondence Analysis and Plots Based on Principal Components

For most purposes there is little point in asking which of the graphical techniques discussed so far in this chapter is ‘best.’ This is because they are either equivalent, as is the case of PCs and principal coordinates for some types of similarity matrix, so any comparison is trivial, or the data set is of a type such that one or more of the techniques are not really appropriate, and so should not be compared with the others. For example, if the data are in the form of a contingency table, then correspondence analysis is clearly relevant, but the use of the other techniques is more questionable. As demonstrated by Gower and Hand (1996) and Gabriel (1995a,b), the biplot is not restricted to ‘standard’ (n × p) data matrices, and could be used on any two-way array of data. The simultaneous positions of the $g_i^*$ and $h_j^*$ still have a similar interpretation to that discussed in Section 5.3, even though some of the separate properties of the $g_i^*$ and $h_j^*$, for instance, those relating to variances and covariances, are clearly no longer valid. A contingency table could also be analysed by PCA, but this is not really appropriate, as it is not at all clear what interpretation could be given to the results. Principal coordinate analysis needs a similarity or distance matrix, so it is hard to see how it could be used directly on a contingency table.
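
The equivalence between PCs and principal coordinates noted at the start of this section can be checked numerically in one well-known case: principal coordinate analysis applied to squared Euclidean distances between the rows of a column-centred data matrix. The following is only a minimal sketch under that assumption, not a procedure taken from the text. The data are randomly generated, and all names (`X`, `pc_scores`, `coords`, and so on) are hypothetical.

```python
import numpy as np

# Hypothetical (n x p) data matrix; random values used purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
n, p = X.shape

# Column-centre the data, then obtain PC scores from the SVD Xc = U diag(s) V'.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
pc_scores = U * s  # n x p matrix of PC scores

# Principal coordinate analysis (classical scaling), assuming Euclidean distances.
diff = X[:, None, :] - X[None, :, :]
D2 = (diff ** 2).sum(axis=2)             # squared Euclidean distances between rows
J = np.eye(n) - np.ones((n, n)) / n      # centring matrix
B = -0.5 * J @ D2 @ J                    # doubly centred matrix, equal to Xc Xc'
evals, evecs = np.linalg.eigh(B)         # eigenvalues in ascending order
order = np.argsort(evals)[::-1][:p]      # keep the p largest
coords = evecs[:, order] * np.sqrt(evals[order])

# Eigenvectors are defined only up to sign, so align signs before comparing.
signs = np.sign((coords * pc_scores).sum(axis=0))
coords_aligned = coords * signs
max_abs_diff = np.abs(coords_aligned - pc_scores).max()
print(f"largest absolute difference: {max_abs_diff:.2e}")
```

With other similarity or distance measures the two sets of coordinates will generally differ, which is why the equivalence holds only for some types of similarity matrix.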
