12.07.2015 Views

Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)


Consider the first PC; this maximizes a′Sa subject to a′a = 1. But a′Sa = const defines a family of ellipsoids and a′a = 1 defines a hypersphere in p-dimensional space, both centred at the origin. The hypersphere a′a = 1 will intersect more than one of the ellipsoids in the family a′Sa (unless S is the identity matrix), and the points at which the hypersphere intersects the ‘biggest’ such ellipsoid (so that a′Sa is maximized) lie on the shortest principal axis of the ellipsoid. A simple diagram, as given by Stuart (1982), readily verifies this result when p = 2. The argument can be extended to show that the first q sample PCs are defined by the q shortest principal axes of the family of ellipsoids a′Sa = const. Although Stuart (1982) introduced this interpretation in terms of sample PCs, it is equally valid for population PCs.

The earlier geometric property G1 was also concerned with ellipsoids but in the context of multivariate normality, where the ellipsoids x′Σ⁻¹x = const define contours of constant probability and where the first (longest) q principal axes of such ellipsoids define the first q population PCs. In light of Property G5, it is clear that the validity of the Property G1 does not really depend on the assumption of multivariate normality. Maximization of a′Sa is equivalent to minimization of a′S⁻¹a, and looking for the ‘smallest’ ellipsoids in the family a′S⁻¹a = const that intersect the hypersphere a′a = 1 will lead to the largest principal axis of the family a′S⁻¹a. Thus the PCs define, successively, the principal axes of the ellipsoids a′S⁻¹a = const. Similar considerations hold for the population ellipsoids a′Σ⁻¹a = const, regardless of any assumption of multivariate normality.
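
The p = 2 case of this argument can be checked numerically. The sketch below is only an illustration, not part of the original text: the matrix S holds made-up values, and the helper names (quad_form, inverse_2x2, unit_vectors, radius) are hypothetical. It sweeps unit vectors a around the circle a′a = 1 and confirms that the direction maximizing a′Sa is also the direction minimizing a′S⁻¹a, the shortest axis of the ellipse x′Sx = const, and the longest axis of the ellipse x′S⁻¹x = const.

```python
import math

# Hypothetical 2x2 sample covariance matrix (illustrative values only).
S = [[4.0, 1.5],
     [1.5, 1.0]]

def quad_form(M, a):
    """Return a'Ma for a 2x2 matrix M and a 2-vector a."""
    return (a[0] * (M[0][0] * a[0] + M[0][1] * a[1])
            + a[1] * (M[1][0] * a[0] + M[1][1] * a[1]))

def inverse_2x2(M):
    """Invert a 2x2 matrix using the adjugate formula."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

def unit_vectors(n=3600):
    """Unit vectors on half the circle a'a = 1 (a and -a give equal values)."""
    return [(math.cos(math.pi * k / n), math.sin(math.pi * k / n))
            for k in range(n)]

def radius(M, a, const=1.0):
    """Distance from the origin to the ellipse x'Mx = const along direction a."""
    return math.sqrt(const / quad_form(M, a))

S_inv = inverse_2x2(S)
grid = unit_vectors()

# First PC direction: maximize a'Sa on the unit circle.
a_max = max(grid, key=lambda a: quad_form(S, a))
# Equivalent problem: minimize a'S^{-1}a on the unit circle.
a_min_inv = min(grid, key=lambda a: quad_form(S_inv, a))

# Axis directions of the two families of ellipses.
shortest_axis_S = min(grid, key=lambda a: radius(S, a))
longest_axis_S_inv = max(grid, key=lambda a: radius(S_inv, a))

# Closed-form angle of the leading eigenvector of a symmetric 2x2 matrix.
theta = 0.5 * math.atan2(2.0 * S[0][1], S[0][0] - S[1][1])

print("argmax a'Sa (degrees):", math.degrees(math.atan2(a_max[1], a_max[0])))
print("closed-form angle (degrees):", math.degrees(theta))
print("same direction for all four searches:",
      a_max == a_min_inv == shortest_axis_S == longest_axis_S_inv)
```

The grid search is deliberately crude; for general p one would use an eigendecomposition of S instead, since the half-lengths of the axes of a′Sa = const are proportional to the inverse square roots of the eigenvalues of S.
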
However, without multivariate normality the ellipsoids lose their interpretation as contours of equal probability, or as estimates of such contours in the sample case.

Further discussion of the geometry of sample PCs, together with connections with other techniques such as principal coordinate analysis (see Section 5.2) and special cases such as compositional data (Section 13.3), is given by Gower (1967).

As with population properties, our discussion of sample properties of PCA is not exhaustive. For example, Qian et al. (1994) consider the concept of stochastic complexity or minimum description length, as described by Rissanen and Yu (2000). They minimize the expected difference in complexity between a p-dimensional data set and the projection of the data onto a q-dimensional subspace. Qian et al. show that, if multivariate normality is assumed, the subset spanned by the first q PCs is obtained.

3.3 Covariance and Correlation Matrices: An Example

The arguments for and against using sample correlation matrices as opposed to covariance matrices are virtually identical to those given for populations in Section 2.3. Furthermore, it is still the case that there is no
