Jolliffe, I. T. Principal Component Analysis (2nd ed., Springer, 2002)
12 Principal Component Analysis for Time Series and Other Non-Independent Data

12.1 Introduction

In much of statistics it is assumed that the $n$ observations $x_1, x_2, \ldots, x_n$ are independent. This chapter discusses the implications for PCA of non-independence among $x_1, x_2, \ldots, x_n$. Much of the chapter is concerned with PCA for time series data, the most common type of non-independent data, although data where $x_1, x_2, \ldots, x_n$ are measured at $n$ points in space are also discussed. Such data often have dependence which is more complicated than for time series. Time series data are sufficiently different from ordinary independent data for there to be aspects of PCA that arise only for such data, for example, PCs in the frequency domain.

The results of Section 3.7, which allow formal inference procedures to be performed for PCs, rely on independence of $x_1, x_2, \ldots, x_n$, as well as (usually) on multivariate normality. They cannot therefore be used if more than very weak dependence is present between $x_1, x_2, \ldots, x_n$. However, when the main objective of PCA is descriptive, not inferential, complications such as non-independence do not seriously affect this objective. The effective sample size is reduced below $n$, but this reduction need not be too important. In fact, in some circumstances we are actively looking for dependence among $x_1, x_2, \ldots, x_n$. For example, grouping of observations in a few small areas of the two-dimensional space defined by the first two PCs implies dependence between those observations that are grouped together. Such behaviour is actively sought in cluster analysis (see Section 9.2) and is often welcomed as a useful insight into the structure of the data, rather than decried as an undesirable feature.
We have already seen a number of examples where the data are time series, but where no special account is taken of the dependence between observations. Section 4.3 gave an example of a type that is common in atmospheric science, where the variables are measurements of the same meteorological variable made at $p$ different geographical locations, and the $n$ observations on each variable correspond to different times. Section 12.2 largely deals with techniques that have been developed for data of this type. The examples given in Sections 4.5 and 6.4.2 are also illustrations of PCA applied to data for which the variables (stock prices and crime rates, respectively) are measured at various points of time. Furthermore, one of the earliest published applications of PCA (Stone, 1947) was on (economic) time series data.

In time series data, dependence between the $x$ vectors is induced by their relative closeness in time, so that $x_h$ and $x_i$ will often be highly dependent if $|h - i|$ is small, with decreasing dependence as $|h - i|$ increases. This basic pattern may in addition be perturbed by, for example, seasonal dependence in monthly data, where decreasing dependence for increasing $|h - i|$ is interrupted by a higher degree of association for observations separated by exactly one year, two years, and so on.

Because of the emphasis on time series in this chapter, we need to introduce some of its basic ideas and definitions, although limited space permits only a rudimentary introduction to this vast subject (for more information see, for example, Brillinger (1981), Brockwell and Davis (1996), or Hamilton (1994)). Suppose, for the moment, that only a single variable is measured at equally spaced points in time.
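The lag pattern just described, decay with increasing $|h - i|$ interrupted by a rebound at seasonal lags, is easy to see numerically. The sketch below is illustrative, not from the text: it builds a synthetic monthly series (a period-12 sinusoid plus white noise, an assumed model) and computes sample autocorrelations at a few lags, where the lag-12 value is elevated relative to intermediate lags.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic monthly series (illustrative assumption): an annual cycle of
# period 12 plus white noise, so observations exactly one year apart
# remain strongly associated.
n = 600
t = np.arange(n)
x = np.sin(2 * np.pi * t / 12) + 0.5 * rng.standard_normal(n)

def sample_autocorr(x, k):
    """Sample autocorrelation at lag k (sample gamma_k / gamma_0)."""
    xc = x - x.mean()
    gamma_0 = np.dot(xc, xc) / len(x)
    gamma_k = np.dot(xc[: len(x) - k], xc[k:]) / len(x)
    return gamma_k / gamma_0

# Dependence decays as the lag grows, but rebounds at lag 12 (one year),
# interrupting the basic decreasing pattern.
for k in (1, 3, 6, 12):
    print(k, round(sample_autocorr(x, k), 2))
```

For this model the lag-6 autocorrelation is strongly negative (observations half a year apart sit on opposite sides of the cycle), while the lag-12 autocorrelation is large and positive, the "exactly one year" association noted above.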
Our time series is then $\ldots, x_{-1}, x_0, x_1, x_2, \ldots$. Much of time series analysis is concerned with series that are stationary, and which can be described entirely by their first- and second-order moments; these moments are
$$
\begin{aligned}
\mu &= E(x_i), & i &= \ldots, -1, 0, 1, 2, \ldots \\
\gamma_k &= E[(x_i - \mu)(x_{i+k} - \mu)], & i &= \ldots, -1, 0, 1, 2, \ldots \\
& & k &= \ldots, -1, 0, 1, 2, \ldots,
\end{aligned}
\tag{12.1.1}
$$
where $\mu$ is the mean of the series and is the same for all $x_i$ in stationary series, and $\gamma_k$, the $k$th autocovariance, is the covariance between $x_i$ and $x_{i+k}$, which depends on $k$ but not $i$ for stationary series. The information contained in the autocovariances can be expressed equivalently in terms of the power spectrum of the series
$$
f(\lambda) = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} \gamma_k e^{-ik\lambda},
\tag{12.1.2}
$$
where $i = \sqrt{-1}$ and $\lambda$ denotes angular frequency. Roughly speaking, the function $f(\lambda)$ decomposes the series into oscillatory portions with different frequencies of oscillation, and $f(\lambda)$ measures the relative importance of these portions as a function of their angular frequency $\lambda$. For example, if a
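The two definitions above translate directly into sample estimates. The sketch below is a minimal illustration under stated assumptions: it simulates an AR(1) series (an assumed example model, not from the text), estimates the autocovariances $\gamma_k$ of (12.1.1) from the data, and plugs them into a truncated version of the sum (12.1.2) to approximate $f(\lambda)$. All function names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative stationary series (assumed model): an AR(1) process
# x_t = phi * x_{t-1} + e_t. With phi > 0, neighbouring values are similar,
# so the spectrum concentrates power at low angular frequencies lambda.
phi, n = 0.7, 5000
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.standard_normal()

def autocov(x, k):
    """Sample version of gamma_k = E[(x_i - mu)(x_{i+k} - mu)] in (12.1.1)."""
    xc = x - x.mean()
    return np.dot(xc[: len(x) - k], xc[k:]) / len(x)

def spectrum(x, lam, K=50):
    """Truncated sample version of (12.1.2):
    f(lambda) ~ (1/2pi) * sum_{k=-K}^{K} gamma_k * e^{-i k lambda},
    using gamma_{-k} = gamma_k for a stationary series."""
    ks = np.arange(-K, K + 1)
    gammas = np.array([autocov(x, abs(k)) for k in ks])
    return float(np.real(np.sum(gammas * np.exp(-1j * ks * lam)))) / (2 * np.pi)

# Slow oscillations dominate for this positively autocorrelated series:
print(spectrum(x, 0.1) > spectrum(x, 3.0))
```

The comparison at the end is exactly the "relative importance" reading of $f(\lambda)$: for this series, low-frequency oscillatory portions carry far more power than high-frequency ones.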