Jolliffe I. Principal Component Analysis (2ed., Springer, 2002)(518s)


12 Principal Component Analysis for Time Series and Other Non-Independent Data

12.1 Introduction

In much of statistics it is assumed that the n observations x_1, x_2, ..., x_n are independent. This chapter discusses the implications for PCA of non-independence among x_1, x_2, ..., x_n. Much of the chapter is concerned with PCA for time series data, the most common type of non-independent data, although data where x_1, x_2, ..., x_n are measured at n points in space are also discussed. Such data often have dependence which is more complicated than for time series. Time series data are sufficiently different from ordinary independent data for there to be aspects of PCA that arise only for such data, for example, PCs in the frequency domain.

The results of Section 3.7, which allow formal inference procedures to be performed for PCs, rely on independence of x_1, x_2, ..., x_n, as well as (usually) on multivariate normality. They cannot therefore be used if more than very weak dependence is present between x_1, x_2, ..., x_n. However, when the main objective of PCA is descriptive, not inferential, complications such as non-independence do not seriously affect this objective. The effective sample size is reduced below n, but this reduction need not be too important. In fact, in some circumstances we are actively looking for dependence among x_1, x_2, ..., x_n. For example, grouping of observations in a few small areas of the two-dimensional space defined by the first two PCs implies dependence between those observations that are grouped together. Such behaviour is actively sought in cluster analysis (see Section 9.2) and is often welcomed as a useful insight into the structure of the data, rather than decried as an undesirable feature.

We have already seen a number of examples where the data are time series, but where no special account is taken of the dependence between observations. Section 4.3 gave an example of a type that is common in atmospheric science, where the variables are measurements of the same meteorological variable made at p different geographical locations, and the n observations on each variable correspond to different times. Section 12.2 largely deals with techniques that have been developed for data of this type. The examples given in Section 4.5 and Section 6.4.2 are also illustrations of PCA applied to data for which the variables (stock prices and crime rates, respectively) are measured at various points of time. Furthermore, one of the earliest published applications of PCA (Stone, 1947) was on (economic) time series data.

In time series data, dependence between the x vectors is induced by their relative closeness in time, so that x_h and x_i will often be highly dependent if |h − i| is small, with decreasing dependence as |h − i| increases. This basic pattern may in addition be perturbed by, for example, seasonal dependence in monthly data, where decreasing dependence for increasing |h − i| is interrupted by a higher degree of association for observations separated by exactly one year, two years, and so on.

Because of the emphasis on time series in this chapter, we need to introduce some of its basic ideas and definitions, although limited space permits only a rudimentary introduction to this vast subject (for more information see, for example, Brillinger (1981); Brockwell and Davis (1996); or Hamilton (1994)). Suppose, for the moment, that only a single variable is measured at equally spaced points in time.
Our time series is then ..., x_{-1}, x_0, x_1, x_2, .... Much of time series analysis is concerned with series that are stationary, and which can be described entirely by their first- and second-order moments; these moments are

\[
\mu = E(x_i), \qquad i = \dots, -1, 0, 1, 2, \dots
\]
\[
\gamma_k = E[(x_i - \mu)(x_{i+k} - \mu)], \qquad i = \dots, -1, 0, 1, 2, \dots, \quad k = \dots, -1, 0, 1, 2, \dots,
\tag{12.1.1}
\]

where \(\mu\) is the mean of the series and is the same for all x_i in stationary series, and \(\gamma_k\), the kth autocovariance, is the covariance between x_i and x_{i+k}, which depends on k but not i for stationary series. The information contained in the autocovariances can be expressed equivalently in terms of the power spectrum of the series

\[
f(\lambda) = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} \gamma_k e^{-ik\lambda},
\tag{12.1.2}
\]

where \(i = \sqrt{-1}\) and \(\lambda\) denotes angular frequency. Roughly speaking, the function \(f(\lambda)\) decomposes the series into oscillatory portions with different frequencies of oscillation, and \(f(\lambda)\) measures the relative importance of these portions as a function of their angular frequency \(\lambda\). For example, if a
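The quantities in (12.1.1) and (12.1.2) can be estimated from a finite stretch of a stationary series by replacing expectations with sample averages and truncating the infinite sum at a maximum lag, using the symmetry \(\gamma_{-k} = \gamma_k\) of a real-valued series. The following sketch, in Python with NumPy (the text itself gives no code; the AR(1) example series and the truncation lag are illustrative assumptions, not from the book):

```python
import numpy as np

def autocovariance(x, k):
    """Sample version of gamma_k = E[(x_i - mu)(x_{i+k} - mu)] in (12.1.1)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    mu = x.mean()
    return np.sum((x[:n - k] - mu) * (x[k:] - mu)) / n

def spectrum(x, lams, max_lag):
    """Truncated estimate of f(lambda) in (12.1.2).

    Exploits gamma_{-k} = gamma_k for a real series, so the two-sided sum
    collapses to gamma_0 + 2 * sum_{k=1}^{max_lag} gamma_k cos(k lambda).
    """
    g = np.array([autocovariance(x, k) for k in range(max_lag + 1)])
    lams = np.asarray(lams, dtype=float)
    ks = np.arange(1, max_lag + 1)
    f = g[0] + 2.0 * np.sum(g[1:, None] * np.cos(np.outer(ks, lams)), axis=0)
    return f / (2.0 * np.pi)

# Illustrative series: AR(1), x_t = 0.8 x_{t-1} + eps_t, whose true spectrum
# is concentrated at low frequencies.
rng = np.random.default_rng(0)
eps = rng.standard_normal(2000)
x = np.empty_like(eps)
x[0] = eps[0]
for t in range(1, len(x)):
    x[t] = 0.8 * x[t - 1] + eps[t]

lams = np.linspace(0.05, np.pi, 50)
f = spectrum(x, lams, max_lag=40)
# For this positive AR(1), low-frequency power dominates: f[0] > f[-1]
```

Note that the lag-0 autocovariance is simply the sample variance, and that a truncated estimator of this kind is crude; smoothed (windowed) estimators are standard in practice.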

