12.07.2015 Views

Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)


5. Graphical Representation of Data Using Principal Components

displayed in Figure 5.2, but their relative positions are similar. For example, the group of three ‘Seventeenth Century’ painters at the bottom of the plot is still visible. Because of the compression of the horizontal relative to the vertical scale, the group of four painters at the left of the plot now seems to have been joined by a fifth, Murillo, who is from the same school as three of the others in this group. There is also an outlying painter, Fr. Penni, observation number 34, whose isolated position in the top left of the plot is perhaps more obvious on Figure 5.3 than Figure 5.2. The main distinguishing feature of this painter is that de Piles gave him a 0 score for composition, compared to a minimum of 4 and a maximum of 18 for all other painters.

Now consider the positions on the biplot of the vectors corresponding to the four variables. It is seen that composition and expression (V1 and V4) are close together, reflecting their relatively large positive correlation, and that drawing and colour (V2 and V3) are in opposite quadrants, confirming their fairly large negative correlation. Other correlations, and hence positions of vectors, are intermediate.

Finally, consider the simultaneous positions of painters and variables. The two painters, numbered 9 and 15, that are slightly below the positive horizontal axis are Le Brun and Domenichino. These are close to the direction defined by V4, and not far from the directions of V1 and V2, which implies that they should have higher than average scores on these three variables. This is indeed the case: Le Brun scores 16 on a scale from 0 to 20 on all three variables, and Domenichino scores 17 on V2 and V4 and 15 on V1.
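The way the biplot is read above (angles between variable vectors reflecting correlations, and a point's position relative to a vector indicating an above- or below-average score) can be illustrated with a minimal sketch. It assumes the usual singular value decomposition construction of a correlation-based biplot, with a scaling chosen so that inner products of variable markers approximate correlations; the scaling used for Figure 5.3 may differ. The helper names `biplot_markers` and `cosine` are hypothetical, and the data are synthetic stand-ins, not de Piles' scores.

```python
import numpy as np


def biplot_markers(X, rank=2):
    """Row markers G and column markers H for a correlation-based biplot.

    Assumed scaling: G = sqrt(n-1) * U and H = V * S / sqrt(n-1), so that
    G @ H.T approximates the standardised data Z and H @ H.T approximates
    the correlation matrix.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Standardise each column (mean 0, sample standard deviation 1).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    G = np.sqrt(n - 1) * U[:, :rank]             # one row per observation
    H = Vt[:rank].T * s[:rank] / np.sqrt(n - 1)  # one row per variable
    return Z, G, H


def cosine(a, b):
    """Cosine of the angle between two marker vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Synthetic data: V1 and V4 positively correlated, V2 and V3 negatively
# correlated, loosely mimicking the pattern described in the text.
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 2))
noise = 0.3 * rng.normal(size=(50, 4))
X = np.column_stack(
    [latent[:, 0], latent[:, 1], -latent[:, 1], latent[:, 0]]
) + noise

Z, G, H = biplot_markers(X)
cos_v1_v4 = cosine(H[0], H[3])  # near +1: vectors close together
cos_v2_v3 = cosine(H[1], H[2])  # near -1: vectors in opposite directions
fitted = G @ H.T                # rank-2 approximation to standardised scores
```

Under these assumptions, the inner product of an observation's marker with a variable's marker (an entry of `fitted`) approximates that observation's standardised score, which is why a point lying in the direction of a vector suggests an above-average value on that variable.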
Their position relative to V3 suggests an average or lower score on this variable; the actual scores are 8 and 9, which confirms this suggestion. As another example consider the two painters 16 and 19 (Giorgione and Da Udine), whose positions are virtually identical, in the bottom left-hand quadrant of Figure 5.3. These two painters have high scores on V3 (18 and 16) and below average scores on V1, V2 and V4. This behaviour, but with lower scores on V2 than on V1, V4, would be predicted from the points’ positions on the biplot.

100 km Running Data

The second example consists of data on times taken for each of ten 10 km sections by the 80 competitors who completed the Lincolnshire 100 km race in June 1984. There are thus 80 observations on ten variables. (I am grateful to Ron Hindley, the race organizer, for distributing the results of the race in such a detailed form.)

The variances and coefficients for the first two PCs, based on the correlation matrix for these data, are given in Table 5.2. Results for the covariance matrix are similar, though with higher coefficients in the first PC for the later sections of the race, as (means and) variances of the times taken for each section tend to increase later in the race. The first component
