12.07.2015 Views

Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)


5. Graphical Representation of Data Using Principal Components

much worse than optimal are the approximations to whichever of (a), (b), (c) are suboptimally approximated. He defines a coefficient of goodness of proportional fit equal to the squared matrix correlation between the matrix being approximated and its approximation. For example, if $X$ is approximated by $\hat{X}$, the matrix correlation, sometimes known as Yanai's generalized coefficient of determination (see also Section 6.3), is defined as

$$\frac{\operatorname{tr}(X'\hat{X})}{\sqrt{\operatorname{tr}(X'X)\,\operatorname{tr}(\hat{X}'\hat{X})}}.$$

By comparing this coefficient for a suboptimal choice of $\alpha$ with that for an optimal choice, Gabriel (2001) measures how much the approximation is degraded by the suboptimal choice. His conclusion is that the approximations are often very close to optimal, except when there is a large separation between the first two eigenvalues. Even then, the symmetric ($\alpha = \tfrac{1}{2}$) and correspondence analysis plots are never much inferior to the $\alpha = 0$, $\alpha = 1$ plots when one of the latter is optimal.

Another aspect of fit is explored by Heo and Gabriel (2001). They note that biplots often appear to give a better representation of patterns in the data than might be expected from simplistic interpretations of a low value for goodness-of-fit. To explain this, Heo and Gabriel (2001) invoke the special case of the unweighted version of the fixed effects model, with $\Gamma = I_p$ (see Section 3.9) and the corresponding view that we are plotting different means for different observations, rather than points from a single distribution. By simulating from the model with $q = 2$ and varying levels of $\sigma^2$ they show that the match between the biplot representation and the underlying model is often much better than that between the biplot and the data in the sample. Hence, the underlying pattern is apparent in the biplot even though the sample measure of fit is low.

5.3.1 Examples

Two examples are now presented illustrating the use of biplots.
Many other examples have been given by Gabriel; in particular, see Gabriel (1981) and Gabriel and Odoroff (1990). Another interesting example, which emphasizes the usefulness of the simultaneous display of both rows and columns of the data matrix, is presented by Osmond (1985).

In the examples that follow, the observations are plotted as points whose coordinates are the elements of the $g_i^*$, whereas variables are plotted as lines corresponding to the vectors $h_j^*$, $j = 1, 2, \ldots, p$, with arrowheads at the ends of the vectors. Plots consisting of points and vectors are fairly conventional, but an alternative representation for the variables, strongly preferred by Gower and Hand (1996), is to extend the vectors $h_j^*$ right across the diagram in both directions to become lines or axes. The disadvantage of this type of plot is that information about the relative sizes of the variances is lost.
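
The goodness-of-fit comparison and the plotted coordinates $g_i^*$, $h_j^*$ discussed above can be illustrated numerically. The following is a minimal sketch, not Gabriel's own procedure. It assumes the usual biplot factorization from the singular value decomposition of the column-centred data, $G = UL^{\alpha}$ and $H = AL^{1-\alpha}$, with the first two columns retained, and it assumes that $G^*G^{*\prime}$ and $H^*H^{*\prime}$ are the quantities used to approximate $XX'$ and $X'X$. The function names `matrix_correlation` and `biplot_coordinates` and the simulated data are hypothetical, introduced only for illustration.

```python
import numpy as np


def matrix_correlation(X, X_hat):
    """Matrix correlation tr(X'X_hat) / sqrt(tr(X'X) tr(X_hat'X_hat))."""
    num = np.trace(X.T @ X_hat)
    den = np.sqrt(np.trace(X.T @ X) * np.trace(X_hat.T @ X_hat))
    return num / den


def biplot_coordinates(X, alpha, m=2):
    """Return centred data and rank-m biplot coordinates (G*, H*).

    Assumed factorization: Xc = U L A', G = U L^alpha, H = A L^(1 - alpha).
    Rows of G* are the points for observations; rows of H* are the
    vectors for variables.
    """
    Xc = X - X.mean(axis=0)                      # centre each column
    U, s, At = np.linalg.svd(Xc, full_matrices=False)
    G = U[:, :m] * s[:m] ** alpha                # n x m observation markers
    H = At[:m].T * s[:m] ** (1.0 - alpha)        # p x m variable markers
    return Xc, G, H


# Simulated data, purely illustrative (30 observations, 5 correlated variables).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5)) @ rng.normal(size=(5, 5))

fits = {}
for alpha in (0.0, 0.5, 1.0):
    Xc, G, H = biplot_coordinates(X, alpha)
    # Squared matrix correlation = coefficient of goodness of proportional fit.
    fit_data = matrix_correlation(Xc, G @ H.T) ** 2        # X   by G* H*'
    fit_vars = matrix_correlation(Xc.T @ Xc, H @ H.T) ** 2  # X'X by H* H*'
    fit_obs = matrix_correlation(Xc @ Xc.T, G @ G.T) ** 2   # XX' by G* G*'
    fits[alpha] = (fit_data, fit_vars, fit_obs)
    print(f"alpha={alpha:.1f}  data={fit_data:.4f}  "
          f"variables={fit_vars:.4f}  observations={fit_obs:.4f}")
```

Under these assumptions the product $G^*H^{*\prime}$ does not depend on $\alpha$, so the fit to the data matrix is the same for every choice. Only the fits to $X'X$ and $XX'$ change, and comparing each with its value at the optimal $\alpha$ shows how much is lost by a suboptimal choice.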
