Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)

6.3. Selecting a Subset of Variables

... of m variables, but rather than treating m as fixed they also consider how to choose m. They use methods of variable selection due to Jolliffe (1972, 1973), adding a new variant that was computationally infeasible in 1972. To choose m, King and Jackson (1999) consider the rules described in Sections 6.1.1 and 6.1.2, including the broken stick method, together with a rule that selects the largest value of m for which n/m > 3. To assess the quality of a chosen subset of size m, King and Jackson (1999) compare plots of scores on the first two PCs for the full data set and for the data set containing only the m selected variables. They also compute a Procrustes measure of fit (Krzanowski, 1987a) between the m-dimensional configurations given by PC scores in the full and reduced data sets, and a weighted average of correlations between PCs in the full and reduced data sets.

The data set analyzed by King and Jackson (1999) has n = 37 and p = 36. The results of applying the various selection procedures to these data confirm, as Jolliffe (1972, 1973) found, that methods B2 and B4 do reasonably well. The results also confirm that the broken stick method generally chooses smaller values of m than the other methods, though its subsets do better with respect to the Procrustes measure of fit than some much larger subsets. The small number of variables retained by the broken stick implies a correspondingly small proportion of total variance accounted for by the subsets it selects. King and Jackson's (1999) recommendation of method B4 with the broken stick could therefore be challenged.

We conclude this section by briefly describing a number of other possible methods for variable selection. None uses PCs directly to select variables, but all are related to topics discussed more fully in other sections or chapters. Bartkowiak (1991) uses a method described earlier in Bartkowiak (1982) to select a set of 'representative' variables in an example that also illustrates the choice of the number of PCs (see Section 6.1.8). Variables are added sequentially to a 'representative set' by considering each variable currently outside the set as a candidate for inclusion. The maximum residual sum of squares is calculated from multiple linear regressions of each of the other excluded variables on all the variables in the set plus the candidate variable. The candidate for which this maximum sum of squares is minimized is then added to the set. One of Jolliffe's (1970, 1972, 1973) rules uses a similar idea, but in a non-sequential way. A set of m variables is chosen if it maximizes the minimum multiple correlation between each of the (p − m) non-selected variables and the set of m selected variables.

The RV-coefficient, due to Robert and Escoufier (1976), was defined in Section 3.2. To use the coefficient to select a subset of variables, Robert and Escoufier suggest finding X1 which maximizes RV(X, M′X1), where RV(X, Y) is defined by equation (3.2.2) of Section 3.2. The matrix X1 is the (n × m) submatrix of X consisting of n observations on a subset of m variables, and M is a specific (m × m) orthogonal matrix, whose construction is described in Robert and Escoufier's paper. It is interesting ...
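The broken stick method and the n/m > 3 rule used by King and Jackson (1999) are simple enough to state concretely. The sketch below, in Python with simulated data of the same size as their example (n = 37, p = 36), applies the standard broken stick proportions g_k = (1/p) Σ_{i=k}^{p} 1/i; the simulated data, the function name broken_stick_m and the stopping convention are illustrative assumptions, not code from any of the papers cited.

```python
import numpy as np

def broken_stick_m(eigenvalues):
    """Number of leading PCs whose share of total variance exceeds the
    broken stick proportion g_k = (1/p) * sum_{i=k}^{p} 1/i."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    p = lam.size
    shares = lam / lam.sum()
    g = np.array([np.sum(1.0 / np.arange(k, p + 1)) / p for k in range(1, p + 1)])
    m = 0
    for share, gk in zip(shares, g):
        if share > gk:
            m += 1
        else:
            break
    return m

# Simulated stand-in for a data matrix of the same size as King and
# Jackson's (37 observations on 36 variables); these are not the real data.
rng = np.random.default_rng(0)
X = rng.normal(size=(37, 36))
n, p = X.shape

eigvals = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]
m_broken_stick = broken_stick_m(eigvals)

# Largest m for which n/m > 3.
m_ratio_rule = max(m for m in range(1, p + 1) if n / m > 3)

print(m_broken_stick, m_ratio_rule)
```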
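The Procrustes comparison of PC configurations can likewise be sketched. The function below gives the residual sum of squares after matching one m-dimensional configuration of PC scores to another by translation and an optimal orthogonal rotation; this is believed to correspond to Krzanowski's (1987a) measure of fit, but the exact centring and scaling conventions, the simulated data and the chosen subset are assumptions.

```python
import numpy as np

def procrustes_ss(A, B):
    """Residual sum of squares after matching configuration B to A by
    translation and an optimal orthogonal rotation (no scaling):
    tr(A'A) + tr(B'B) - 2 * (sum of singular values of A'B)."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    sv = np.linalg.svd(A.T @ B, compute_uv=False)
    return float(np.trace(A.T @ A) + np.trace(B.T @ B) - 2.0 * sv.sum())

def pc_scores(X, m):
    """Scores on the first m PCs of the column-centred matrix X."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:m].T

# Compare m-dimensional PC configurations from the full data and from a
# hypothetical subset of the variables (simulated data, arbitrary subset).
rng = np.random.default_rng(1)
X = rng.normal(size=(37, 36))
m = 4
subset = [0, 5, 11, 20]
print(procrustes_ss(pc_scores(X, m), pc_scores(X[:, subset], m)))
```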
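Bartkowiak's sequential construction of a 'representative set' is an explicit algorithm, so a rough sketch may help. The version below is a plain greedy loop following the verbal description above; the starting step (an empty set), the inclusion of an intercept in each regression, and the simulated data are assumptions rather than details taken from Bartkowiak (1982, 1991).

```python
import numpy as np

def residual_ss(y, Z):
    """Residual sum of squares from ordinary least squares of y on the
    columns of Z, with an intercept term included."""
    Z1 = np.column_stack([np.ones(len(y)), Z])
    beta, *_ = np.linalg.lstsq(Z1, y, rcond=None)
    r = y - Z1 @ beta
    return float(r @ r)

def sequential_representative_set(X, m):
    """Greedy selection: at each step add the candidate variable that
    minimizes the maximum residual sum of squares obtained when each of the
    remaining excluded variables is regressed on the current set plus the
    candidate."""
    n, p = X.shape
    selected, remaining = [], list(range(p))
    while len(selected) < m and remaining:
        best, best_score = None, np.inf
        for cand in remaining:
            others = [j for j in remaining if j != cand]
            if others:
                Z = X[:, selected + [cand]]
                score = max(residual_ss(X[:, j], Z) for j in others)
            else:
                score = 0.0
            if score < best_score:
                best, best_score = cand, score
        selected.append(best)
        remaining.remove(best)
    return selected

rng = np.random.default_rng(2)
X = rng.normal(size=(37, 36))
print(sequential_representative_set(X, 5))
```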
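Finally, the RV-coefficient itself is straightforward to compute. The sketch below uses the usual form tr(XX'YY') / √(tr((XX')²) tr((YY')²)) for column-centred matrices, which should be checked against equation (3.2.2); the orthogonal matrix M of Robert and Escoufier (1976), and hence the quantity RV(X, M′X1) they actually maximize, is not constructed here, so the subset is simply scored against the full matrix for illustration.

```python
import numpy as np

def rv_coefficient(X, Y):
    """RV-coefficient between two column-centred data matrices sharing the
    same rows: tr(XX'YY') / sqrt(tr((XX')^2) * tr((YY')^2))."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Sx, Sy = Xc @ Xc.T, Yc @ Yc.T
    return float(np.trace(Sx @ Sy) / np.sqrt(np.trace(Sx @ Sx) * np.trace(Sy @ Sy)))

# Score a hypothetical candidate subset of variables against the full matrix.
rng = np.random.default_rng(3)
X = rng.normal(size=(37, 36))
subset = [0, 5, 11, 20]
print(rv_coefficient(X, X[:, subset]))
```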
