Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)
12.4. PCA and Non-Independent Data—Some Additional Topics 329

series, rather than being restricted to real series. It turns out (Brillinger, 1981, p. 344) that

\[
\mathbf{B}'_u = \frac{1}{2\pi}\int_0^{2\pi} \tilde{\mathbf{B}}(\lambda)\, e^{iu\lambda}\, d\lambda,
\qquad
\mathbf{C}_u = \frac{1}{2\pi}\int_0^{2\pi} \tilde{\mathbf{C}}(\lambda)\, e^{iu\lambda}\, d\lambda,
\]

where $\tilde{\mathbf{C}}(\lambda)$ is a $(p \times q)$ matrix whose columns are the first $q$ eigenvectors of the matrix $\mathbf{F}(\lambda)$ given in (12.1.4), and $\tilde{\mathbf{B}}(\lambda)$ is the conjugate transpose of $\tilde{\mathbf{C}}(\lambda)$.

The $q$ series that form the elements of $\mathbf{z}_t$ are called the first $q$ PC series of $\mathbf{x}_t$. Brillinger (1981, Sections 9.3, 9.4) discusses various properties and estimates of these PC series, and gives an example in Section 9.6 on monthly temperature measurements at 14 meteorological stations. Principal component analysis in the frequency domain has also been used on economic time series, for example on Dutch provincial unemployment data (Bartels, 1977, Section 7.7).

There is a connection between frequency domain PCs and PCs defined in the time domain (Brillinger, 1981, Section 9.5). The connection involves Hilbert transforms and hence, as noted in Section 12.2.3, frequency domain PCA has links to HEOF analysis. Define the vector of variables $\mathbf{y}_t^H(\lambda) = (\mathbf{x}_t'(\lambda), \mathbf{x}_t^{H\prime}(\lambda))'$, where $\mathbf{x}_t(\lambda)$ is the contribution to $\mathbf{x}_t$ at frequency $\lambda$ (Brillinger, 1981, Section 4.6), and $\mathbf{x}_t^H(\lambda)$ is its Hilbert transform. Then the covariance matrix of $\mathbf{y}_t^H(\lambda)$ is proportional to

\[
\begin{bmatrix}
\operatorname{Re}(\mathbf{F}(\lambda)) & \operatorname{Im}(\mathbf{F}(\lambda)) \\
-\operatorname{Im}(\mathbf{F}(\lambda)) & \operatorname{Re}(\mathbf{F}(\lambda))
\end{bmatrix},
\]

where the functions Re(.), Im(.) denote the real and imaginary parts, respectively, of their argument. A PCA of $\mathbf{y}_t^H$ gives eigenvalues that are the eigenvalues of $\mathbf{F}(\lambda)$ with a corresponding pair of eigenvectors

\[
\begin{bmatrix}
\operatorname{Re}(\tilde{\mathbf{C}}_j(\lambda)) \\
\operatorname{Im}(\tilde{\mathbf{C}}_j(\lambda))
\end{bmatrix},
\qquad
\begin{bmatrix}
-\operatorname{Im}(\tilde{\mathbf{C}}_j(\lambda)) \\
\operatorname{Re}(\tilde{\mathbf{C}}_j(\lambda))
\end{bmatrix},
\]

where $\tilde{\mathbf{C}}_j(\lambda)$ is the $j$th column of $\tilde{\mathbf{C}}(\lambda)$.

Horel (1984) interprets HEOF analysis as frequency domain PCA averaged over all frequency bands. When a single frequency of oscillation dominates the variation in a time series, the two techniques become the same.
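The eigenvalue statement for the real block matrix built from Re(F(λ)) and Im(F(λ)) can be checked numerically. The following is a minimal sketch, assuming NumPy and a randomly generated Hermitian matrix standing in for F(λ) at one frequency; the names `real_block_form`, `Z` and `F` are hypothetical and do not come from the text. It checks only the eigenvalues, since the sign attached to the imaginary part in the eigenvectors depends on the convention used to define F(λ).

```python
import numpy as np

def real_block_form(F):
    """Real symmetric 2p x 2p matrix [[Re F, Im F], [-Im F, Re F]]."""
    return np.block([[F.real, F.imag], [-F.imag, F.real]])

rng = np.random.default_rng(0)
p = 4

# Hypothetical stand-in for a spectral matrix F(lambda) at one frequency:
# Hermitian and non-negative definite by construction.
Z = rng.standard_normal((p, 10)) + 1j * rng.standard_normal((p, 10))
F = Z @ Z.conj().T / 10

eig_F = np.linalg.eigvalsh(F)                       # p real eigenvalues, ascending
eig_block = np.linalg.eigvalsh(real_block_form(F))  # 2p real eigenvalues, ascending

# Each eigenvalue of F should appear exactly twice in the block spectrum.
pairs_match = np.allclose(eig_block, np.repeat(eig_F, 2))
print(pairs_match)
```

Because Re(F) is symmetric and Im(F) is antisymmetric for a Hermitian F, the block matrix is real symmetric, which is why a symmetric eigenvalue routine is used for both computations.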
The averaging over frequencies of HEOF analysis is presumably the reason behind Plaut and Vautard's (1994) claim that it is less good than MSSA at distinguishing propagating patterns with different frequencies.

Preisendorfer and Mobley (1988) describe a number of ways in which PCA is combined with a frequency domain approach. Their Section 4e discusses the use of PCA after a vector field has been transformed into the frequency domain using Fourier analysis, and for scalar-valued fields
their Chapter 12 examines various combinations of real and complex-valued harmonic analysis with PCA.

Stoffer (1999) describes a different type of frequency domain PCA, which he calls the spectral envelope. Here a PCA is done on the spectral matrix $\mathbf{F}(\lambda)$ relative to the time domain covariance matrix $\boldsymbol{\Gamma}_0$. This is a form of generalized PCA for $\mathbf{F}(\lambda)$ with $\boldsymbol{\Gamma}_0$ as a metric (see Section 14.2.2), and leads to solving the eigenequation $[\mathbf{F}(\lambda) - l(\lambda)\boldsymbol{\Gamma}_0]\,\mathbf{a}(\lambda) = \mathbf{0}$ for varying angular frequency $\lambda$. Stoffer (1999) advocates the method as a way of discovering whether the $p$ series $x_1(t), x_2(t), \ldots, x_p(t)$ share common signals and illustrates its use on two data sets involving pain perception and blood pressure.

The idea of cointegration is important in econometrics. It has a technical definition, but can essentially be described as follows. Suppose that the elements of the $p$-variate time series $\mathbf{x}_t$ are stationary after, but not before, differencing. If there are one or more vectors $\boldsymbol{\alpha}$ such that $\boldsymbol{\alpha}'\mathbf{x}_t$ is stationary without differencing, the $p$ series are cointegrated. Tests for cointegration based on the variances of frequency domain PCs have been put forward by a number of authors. For example, Cubadda (1995) points out problems with previously defined tests and suggests a new one.

12.4.2 Growth Curves and Longitudinal Data

A common type of data that take the form of curves, even if they are not necessarily recorded as such, consists of measurements of growth for animals or children. Some curves such as heights are monotonically increasing, but others such as weights need not be. The idea of using principal components to summarize the major sources of variation in a set of growth curves dates back to Rao (1958), and several of the examples in Ramsay and Silverman (1997) are of this type. Analyses of growth curves are often concerned with predicting future growth, and one way of doing this is to use principal components as predictors.
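The idea of using principal components of growth curves as predictors of later growth can be illustrated with a small sketch. It assumes NumPy and entirely synthetic data: the 60 individuals, the ages 1 to 10, and the level and rate parameters are invented for illustration. It uses ordinary least-squares regression on the first two PC scores, which is plain PC regression and not a generalized method from any of the cited references.

```python
import numpy as np

rng = np.random.default_rng(1)
n, ages = 60, np.arange(1, 11)   # hypothetical: 60 individuals, ages 1..10

# Synthetic growth curves: individual-specific level and growth rate plus noise.
level = rng.normal(75.0, 4.0, size=(n, 1))
rate = rng.normal(6.0, 0.8, size=(n, 1))
curves = level + rate * ages + rng.normal(0.0, 0.5, size=(n, ages.size))

# Treat ages 1..6 as observed and the age-10 value as the quantity to predict.
early, future = curves[:, :6], curves[:, -1]

# PCA of the early measurements via SVD of the column-centred data matrix.
mean_early = early.mean(axis=0)
U, s, Vt = np.linalg.svd(early - mean_early, full_matrices=False)
q = 2
scores = (early - mean_early) @ Vt[:q].T   # scores on the first q PCs

# Least-squares regression of the later measurement on the PC scores.
X = np.column_stack([np.ones(n), scores])
coef, *_ = np.linalg.lstsq(X, future, rcond=None)
fitted = X @ coef
r2 = 1.0 - np.sum((future - fitted) ** 2) / np.sum((future - future.mean()) ** 2)

print(f"variance share of first {q} PCs: {np.sum(s[:q] ** 2) / np.sum(s ** 2):.3f}")
print(f"in-sample R^2 for the age-10 value: {r2:.3f}")
```

The reported R^2 is an in-sample figure on simulated curves with a simple linear structure, so it says nothing about how well the approach would work on real growth data.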
A form of generalized PC regression developed for this purpose is described by Rao (1987).

Caussinus and Ferré (1992) use PCA in a different type of analysis of growth curves. They consider a 7-parameter model for a set of curves, and estimate the parameters of the model separately for each curve. These 7-parameter estimates are then taken as values of 7 variables to be analyzed by PCA. A two-dimensional plot in the space of the first two PCs gives a representation of the relative similarities between members of the set of curves. Because the parameters are not estimated with equal precision, a weighted version of PCA is used, based on the fixed effects model described in Section 3.9.

Growth curves constitute a special case of longitudinal data, also known as 'repeated measures,' where measurements are taken on a number of individuals at several different points of time. Berkey et al. (1991) use PCA to model such data, calling their model a 'longitudinal principal compo-