Jolliffe I. Principal Component Analysis (2nd ed., Springer, 2002)
9.1. Discriminant Analysis

The method, called SIMCA (Soft Independent Modelling of Class Analogy), adopts this approach: it does a separate PCA for each group, and retains sufficient PCs in each to account for most of the variation within that group. The number of PCs retained will typically be different for different populations.

To classify a new observation, a 'distance' of the observation from the hyperplane defined by the retained PCs is calculated for each population. The square of this distance for a particular population is simply the sum of squares of the values of the omitted PCs for that population, evaluated for the observation in question (Mertens et al., 1994). The same type of quantity is also used for detecting outliers (see Section 10.1, equation (10.1.1)).

If future observations are to be assigned to one and only one population, then assignment is to the population for which the distance is minimized. Alternatively, a firm decision may not be required and, if all the distances are large enough, the observation can be left unassigned. As it is not close to any of the existing groups, it may be an outlier or come from a new group about which there is currently no information. Conversely, if the groups are not all well separated, some future observations may have small distances from more than one population. In such cases, it may again be undesirable to decide on a single possible class; instead, two or more groups may be listed as possible 'homes' for the observation.

According to Wold et al. (1983), SIMCA works with as few as five objects from each population, although ten or more is preferable, and there is no restriction on the number of variables. This is important in many chemical problems where the number of variables can greatly exceed the number of observations. SIMCA can also cope with situations where one class is very diffuse, simply consisting of all observations that do not belong in one of a number of well-defined classes.
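The SIMCA classification rule described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the published algorithm: the helper names (`fit_simca_group`, `simca_distance2`, `classify`) and the fixed `threshold` parameter are assumptions for this sketch; the distance used is, as in the text, the sum of squares of the omitted PC scores, i.e. the squared residual after projecting onto each group's retained PCs.

```python
import numpy as np

def fit_simca_group(X, q):
    """Fit a separate PCA to one group's data, retaining q PCs.

    Hypothetical helper for this sketch; X is (n_observations, p_variables).
    Returns the group mean and the q retained PC directions (rows of Vt).
    """
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:q]

def simca_distance2(x, mean, retained):
    """Squared 'distance' of x from the hyperplane of the retained PCs:
    the sum of squares of the omitted PC scores, equivalently the squared
    norm of the residual after projecting the centred x onto the retained PCs."""
    centred = x - mean
    proj = retained.T @ (retained @ centred)   # projection onto retained PCs
    return float(np.sum((centred - proj) ** 2))

def classify(x, models, threshold=None):
    """Assign x to the group whose PC hyperplane is nearest; if every
    distance exceeds the (assumed) threshold, leave it unassigned (None),
    flagging a possible outlier or a new group."""
    d2 = [simca_distance2(x, mean, retained) for mean, retained in models]
    if threshold is not None and min(d2) > threshold:
        return None
    return int(np.argmin(d2))
```

Note that each group's model is fitted entirely from that group's own observations, which is why the text's remark applies: the retained dimension `q` can differ from group to group, and the number of variables may exceed the number of observations per group.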
Frank and Friedman (1989) paint a less favourable picture of SIMCA. They use a number of data sets and a simulation study to compare its performance with that of linear and quadratic discriminant analyses, with regularized discriminant analysis, and with a technique called DASCO (discriminant analysis with shrunken covariances).

As already explained, Friedman's (1989) regularized discriminant analysis shrinks the individual within-group covariance matrices $S_g$ towards the pooled estimate $S_w$ in an attempt to reduce the bias in estimating eigenvalues. DASCO has a similar objective, and Frank and Friedman (1989) show that SIMCA also has a similar effect. They note that, in terms of expression (9.1.2), SIMCA ignores the log-determinant term and replaces $S_g^{-1}$ by a weighted and truncated version of its spectral decomposition, which in its full form is

$$S_g^{-1} = \sum_{k=1}^{p} l_{gk}^{-1} \mathbf{a}_{gk} \mathbf{a}_{gk}',$$

with fairly obvious notation. If $q_g$ PCs are retained in the $g$th group, then SIMCA's replacement for $S_g^{-1}$ is

$$\sum_{k=q_g+1}^{p} \bar{l}_g^{-1} \mathbf{a}_{gk} \mathbf{a}_{gk}', \quad \text{where} \quad \bar{l}_g = \frac{\sum_{k=q_g+1}^{p} l_{gk}}{p - q_g}.$$

DASCO treats the last $(p - q_g)$ PCs in the same way as SIMCA, but adds the terms from the
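The truncated spectral decomposition above is easy to verify numerically. The following sketch (simulated data; the variable names are assumptions, not book notation in code form) builds a sample covariance matrix standing in for $S_g$, confirms that summing $l_{gk}^{-1} \mathbf{a}_{gk} \mathbf{a}_{gk}'$ over all $p$ eigenpairs reproduces its exact inverse, and then forms SIMCA's replacement, which keeps only the $p - q_g$ omitted PCs, each weighted by the reciprocal of their average eigenvalue $\bar{l}_g$.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(50, 4))       # simulated group data: 50 observations, p = 4
S = np.cov(A, rowvar=False)        # stands in for the within-group matrix S_g
p = 4

l, a = np.linalg.eigh(S)           # eigenvalues ascending; columns of a are PCs
l, a = l[::-1], a[:, ::-1]         # sort descending so l[0] >= ... >= l[p-1]

# Full spectral decomposition of the inverse: sum_k l_gk^{-1} a_gk a_gk'
S_inv_full = sum(l[k] ** -1 * np.outer(a[:, k], a[:, k]) for k in range(p))

q = 2                              # q_g retained PCs
l_bar = l[q:].sum() / (p - q)      # average of the p - q_g omitted eigenvalues
# SIMCA's replacement: only the omitted PCs, each weighted by 1 / l_bar
S_inv_simca = sum(l_bar ** -1 * np.outer(a[:, k], a[:, k]) for k in range(q, p))
```

Because the retained PCs are dropped entirely, `S_inv_simca` has rank $p - q_g$ rather than $p$, which is one concrete sense in which SIMCA's rule is a truncated, shrunken version of the quadratic discriminant score.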
