Jolliffe, I. Principal Component Analysis (2nd ed., Springer, 2002)
12.2. PCA and Atmospheric Time Series

and a typical row of the matrix is

$$\mathbf{x}'_i = (x_{i1},\, x_{(i+1)1},\, \ldots,\, x_{(i+m-1)1},\, x_{i2},\, \ldots,\, x_{(i+m-1)2},\, \ldots,\, x_{(i+m-1)p}), \qquad i = 1, 2, \ldots, n',$$

where $x_{ij}$ is the value of the measured variable at the $i$th time point and the $j$th spatial location, and $m$ plays the same rôle in MSSA as $p$ does in SSA. The covariance matrix for this data matrix has the form

$$\begin{bmatrix} S_{11} & S_{12} & \cdots & S_{1p} \\ S_{21} & S_{22} & \cdots & S_{2p} \\ \vdots & \vdots & & \vdots \\ S_{p1} & S_{p2} & \cdots & S_{pp} \end{bmatrix},$$

where $S_{kk}$ is an $(m \times m)$ covariance matrix at various lags for the $k$th variable (location), with the same structure as the covariance matrix in an SSA of that variable. The off-diagonal matrices $S_{kl}$, $k \neq l$, have $(i,j)$th element equal to the covariance between locations $k$ and $l$ at time lag $|i - j|$.

Plaut and Vautard (1994) claim that the 'fundamental property' of MSSA is its ability to detect oscillatory behaviour in the same manner as SSA, but rather than an oscillation of a single series the technique finds oscillatory spatial patterns. Furthermore, it is capable of finding oscillations with the same period but different spatially orthogonal patterns, and oscillations with the same spatial pattern but different periods.

The same problem of ascertaining 'significance' arises for MSSA as in SSA. Allen and Robertson (1996) tackle this problem in a similar manner to that adopted by Allen and Smith (1996) for SSA. The null hypothesis here extends one-dimensional 'red noise' to a set of $p$ independent AR(1) processes. A general multivariate AR(1) process is not appropriate as it can itself exhibit oscillatory behaviour, as exemplified in POP analysis (Section 12.2.2).

MSSA extends SSA from one time series to several, but if the number of time series $p$ is large, it can become unmanageable. A solution, used by Benzi et al. (1997), is to carry out PCA on the $(n \times p)$ data matrix, and then implement SSA separately on the first few PCs.
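As a sketch of how the extended data matrix and its block covariance structure arise, the following Python fragment builds the $(n' \times mp)$ matrix row by row and extracts the blocks $S_{11}$ and $S_{12}$. The function name, the toy dimensions ($n = 200$, $p = 3$) and the window length $m = 4$ are illustrative assumptions, not from the text.

```python
import numpy as np

def mssa_extended_matrix(X, m):
    """Build the MSSA extended data matrix from an (n x p) series X.

    Each of the n' = n - m + 1 rows stacks, for every location j in turn,
    the window (x_{ij}, x_{(i+1)j}, ..., x_{(i+m-1)j}),
    giving an (n' x mp) matrix, as in the typical row x'_i above.
    """
    n, p = X.shape
    n_prime = n - m + 1
    rows = []
    for i in range(n_prime):
        # concatenate the m-point window of each location in turn
        rows.append(np.concatenate([X[i:i + m, j] for j in range(p)]))
    return np.array(rows)

# toy example: n = 200 time points, p = 3 locations, window length m = 4
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
Xe = mssa_extended_matrix(X, m=4)

S = np.cov(Xe, rowvar=False)   # (mp x mp) covariance matrix in p x p blocks
S_11 = S[:4, :4]               # lag-covariance block for location 1
S_12 = S[:4, 4:8]              # cross-block between locations 1 and 2
print(Xe.shape, S.shape)       # (197, 12) (12, 12)
```

MSSA itself then proceeds as a PCA of this extended matrix, that is, an eigenanalysis of the block matrix `S`.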
Alternatively, for large $p$, MSSA is often performed on the first few PCs instead of the variables themselves, as in Plaut and Vautard (1994).

Although MSSA is a natural extension of SSA, it is also equivalent to extended empirical orthogonal function (EEOF) analysis, which was introduced independently of SSA by Weare and Nasstrom (1982). Barnett and Hasselmann (1979) give an even more general analysis, in which different meteorological variables, as well as or instead of different time lags, may be included at the various locations. When different variables replace different time lags, the temporal correlation in the data is no longer taken into account, so further discussion is deferred to Section 14.5.

The general technique, including both time lags and several variables, is referred to as multivariate EEOF (MEEOF) analysis by Mote et al.
(2000), who give an example of the technique for five variables, and compare the results to those of separate EEOFs (MSSAs) for each variable. Mote and coworkers note that it is possible that some of the dominant MEEOF patterns may not be dominant in any of the individual EEOF analyses, and this may be viewed as a disadvantage of the method. On the other hand, MEEOF analysis has the advantage of showing directly the connections between patterns for the different variables. Discussion of the properties of MSSA and MEEOF analysis is ongoing (in addition to Mote et al. (2000), see Monahan et al. (1999), for example). Compagnucci et al. (2001) propose yet another variation on the same theme. In their analysis, the PCA is done on the transpose of the matrix used in MSSA, a so-called T-mode instead of S-mode analysis (see Section 14.5). Compagnucci et al. call their technique principal sequence pattern analysis.

12.2.2 Principal Oscillation Pattern (POP) Analysis

SSA, MSSA, and other techniques described in this chapter can be viewed as special cases of PCA, once the variables have been defined in a suitable way. With the chosen definition of the variables, the procedures perform an eigenanalysis of a covariance matrix. POP analysis is different, but it is described briefly here because its results are used for similar purposes to those of some of the PCA-based techniques for time series included elsewhere in the chapter. Furthermore, its core is an eigenanalysis, albeit not on a covariance matrix.

POP analysis was introduced by Hasselmann (1988). Suppose that we have the usual $(n \times p)$ matrix of measurements on a meteorological variable, taken at $n$ time points and $p$ spatial locations. POP analysis has an underlying assumption that the $p$ time series can be modelled as a multivariate first-order autoregressive process.
If $\mathbf{x}'_t$ is the $t$th row of the data matrix, we have

$$(\mathbf{x}_{t+1} - \boldsymbol{\mu}) = \boldsymbol{\Upsilon}(\mathbf{x}_t - \boldsymbol{\mu}) + \boldsymbol{\varepsilon}_t, \qquad t = 1, 2, \ldots, (n-1), \qquad (12.2.1)$$

where $\boldsymbol{\Upsilon}$ is a $(p \times p)$ matrix of constants, $\boldsymbol{\mu}$ is a vector of means for the $p$ variables, and $\boldsymbol{\varepsilon}_t$ is a multivariate white noise term. Standard results from multivariate regression analysis (Mardia et al., 1979, Chapter 6) lead to estimation of $\boldsymbol{\Upsilon}$ by $\hat{\boldsymbol{\Upsilon}} = S_1 S_0^{-1}$, where $S_0$ is the usual sample covariance matrix for the $p$ variables, and $S_1$ has $(i,j)$th element equal to the sample covariance between the $i$th and $j$th variables at lag 1. POP analysis then finds the eigenvalues and eigenvectors of $\hat{\boldsymbol{\Upsilon}}$. The eigenvectors are known as principal oscillation patterns (POPs) and denoted $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_p$. The quantities $z_{t1}, z_{t2}, \ldots, z_{tp}$, which can be used to reconstitute $\mathbf{x}_t$ as $\sum_{k=1}^{p} z_{tk}\mathbf{p}_k$, are called the POP coefficients. They play a similar rôle in POP analysis to that of PC scores in PCA.

One obvious question is why this technique is called principal oscillation pattern analysis. Because $\hat{\boldsymbol{\Upsilon}}$ is not symmetric it typically has a
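The estimation and eigenanalysis steps of POP analysis can be sketched numerically as follows. The simulated series, the chosen coefficient matrix, and all variable names are illustrative assumptions; the sample mean is used to estimate $\boldsymbol{\mu}$, and $S_1$ is taken with $(i,j)$th element equal to the sample covariance between variable $i$ at time $t+1$ and variable $j$ at time $t$, as the estimator $\hat{\boldsymbol{\Upsilon}} = S_1 S_0^{-1}$ requires.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 500, 4

# simulate a multivariate AR(1) process x_{t+1} = Upsilon x_t + eps_t;
# the leading 2x2 block is a rotation-like pair, so Upsilon has complex
# eigenvalues and the process genuinely oscillates
Upsilon_true = np.array([[ 0.6, 0.2, 0.0, 0.0],
                         [-0.2, 0.6, 0.0, 0.0],
                         [ 0.0, 0.0, 0.5, 0.1],
                         [ 0.0, 0.0, 0.0, 0.3]])
X = np.zeros((n, p))
for t in range(n - 1):
    X[t + 1] = Upsilon_true @ X[t] + rng.standard_normal(p)

Xc = X - X.mean(axis=0)               # centre: estimate mu by the sample mean

S0 = Xc.T @ Xc / (n - 1)              # lag-0 sample covariance matrix
S1 = Xc[1:].T @ Xc[:-1] / (n - 1)     # lag-1 sample cross-covariance matrix

Upsilon_hat = S1 @ np.linalg.inv(S0)  # estimate of Upsilon

# POPs: eigenvectors of Upsilon_hat (generally complex, as it is not symmetric)
eigvals, pops = np.linalg.eig(Upsilon_hat)

# POP coefficients z_t reconstitute x_t as sum_k z_{tk} p_k
Z = np.linalg.solve(pops, Xc.T).T
```

Since $\hat{\boldsymbol{\Upsilon}}$ is not symmetric, `eigvals` and `pops` are in general complex and occur in conjugate pairs, which is what gives the technique its oscillatory interpretation.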