
Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)




6. Choosing a Subset of Principal Components or Variables

Table 6.4. Subsets of selected variables, Alate adelges. (Each row corresponds to a selected subset with × denoting a selected variable.)

Variables: 5 8 9 11 13 14 17 18 19

McCabe, using criterion (a)
  Three variables, best: × × ×
  Three variables, second best: × × ×
  Four variables, best: × × × ×
  Four variables, second best: × × × ×
Jolliffe, using criteria B2, B4
  Three variables, B2: × × ×
  Three variables, B4: × × ×
  Four variables, B2: × × × ×
  Four variables, B4: × × × ×
Criterion (6.3.4)
  Three variables: × × ×
  Four variables: × × × ×
Criterion (6.3.5)
  Three variables: × × ×
  Four variables: × × × ×

largest coefficients on five of the seven discrete variables, and the third PC (3.9%) is almost completely dominated by one variable, number of antennal spines. This variable, which is one of the two variables negatively correlated with size, has a coefficient in the third PC that is five times as large as any other variable.

Table 6.4 gives various subsets of variables selected by Jolliffe (1973) and by McCabe (1982) in an earlier version of his 1984 paper that included additional examples. The subsets given by McCabe (1982) are the best two according to his criterion (a), whereas those from Jolliffe (1973) are selected by the criteria B2 and B4 discussed above. Only the results for m = 3 are given in Jolliffe (1973), but Table 6.4 also gives results for m = 4 using his methods. In addition, the table includes the ‘best’ 3- and 4-variable subsets according to the criteria (6.3.4) and (6.3.5).

There is considerable overlap between the various subsets selected. In particular, variable 11 is an almost universal choice and variables 5, 13 and 17 also appear in subsets selected by at least three of the four methods. Conversely, variables {1–4, 6, 7, 10, 12, 15, 16} appear in none of the subsets of Table 6.4. It should be noted that variable 11 is ‘number of antennal spines,’ which, as discussed above, dominates the third PC.
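The criteria B2 and B4 are only referred to here as ‘discussed above’, so the following Python sketch may help to fix ideas. It follows the usual description of these loading-based rules: B4 retains, for each of the first m PCs in turn, the not-yet-chosen variable with the largest absolute coefficient, while B2 discards one variable in the same way for each of the last p - m PCs, starting from the last. The function names, the use of the correlation matrix and the synthetic data are assumptions made for illustration only. The data are not the Alate adelges measurements, so the output will not reproduce Table 6.4.

```python
import numpy as np


def pca_loadings(X):
    """Eigenvalues and eigenvectors of the correlation matrix of X,
    ordered by decreasing eigenvalue (an assumed convention)."""
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]


def _pick(loadings, components):
    """For each component index in turn, pick the still-available variable
    with the largest absolute coefficient (hypothetical helper)."""
    p = loadings.shape[0]
    available = np.ones(p, dtype=bool)
    picked = []
    for k in components:
        coeffs = np.where(available, np.abs(loadings[:, k]), -np.inf)
        j = int(np.argmax(coeffs))
        picked.append(j)
        available[j] = False
    return picked


def select_b4(loadings, m):
    """B4-style rule: retain one variable for each of the first m PCs."""
    return sorted(_pick(loadings, range(m)))


def select_b2(loadings, m):
    """B2-style rule: discard one variable for each of the last p - m PCs,
    working backwards from the last PC, and retain the remainder."""
    p = loadings.shape[0]
    discarded = _pick(loadings, range(p - 1, m - 1, -1))
    return sorted(set(range(p)) - set(discarded))


# Synthetic illustration: six correlated variables driven by two latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 2))
X = latent @ rng.normal(size=(2, 6)) + 0.5 * rng.normal(size=(50, 6))

_, loadings = pca_loadings(X)
# Report variables with 1-based numbering, as in the table.
print("B4, m = 3:", [j + 1 for j in select_b4(loadings, 3)])
print("B2, m = 3:", [j + 1 for j in select_b2(loadings, 3)])
```

Both rules use a single PCA and differ only in whether they work forwards from the first PC or backwards from the last, so on real data they need not select the same subset.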
Variables 5 and 17, measuring number of spiracles and number of ovipositor spines, respectively, are
