26.08.2013 Views

Relational Takagi–Sugeno Models for Rainfall ... - EUSFLAT


Relational Takagi–Sugeno Models for Rainfall-Discharge Modeling

Hilde Vernieuwe
Department of Applied Mathematics, Biometrics and Process Control
Ghent University
Coupure links 653, 9000 Gent, Belgium
Hilde.Vernieuwe@rug.ac.be

Bernard De Baets
Department of Applied Mathematics, Biometrics and Process Control
Ghent University
Coupure links 653, 9000 Gent, Belgium
Bernard.DeBaets@rug.ac.be

Abstract

In this paper, the use of fuzzy models relating rainfall to catchment discharge is investigated for the Zwalm catchment in Belgium. The models are built along the lines of Gaweda's method [4]. Since acceptable models were not obtained for this data set, the method was further adapted. The newly obtained models perform comparably to Takagi–Sugeno models based on the Gustafson–Kessel clustering algorithm.

Keywords: Gustafson–Kessel clustering, rainfall-discharge, relational rules, Takagi–Sugeno models.

1 Introduction

Several models using fuzzy rules for rainfall-discharge prediction have been reported over the past years. See and Openshaw [8] used a combination of a hybrid neural network, an autoregressive moving average model, and a simple fuzzy rule-based model for discharge forecasting. Hundecha et al. [5] developed fuzzy rule-based routines simulating different processes involved in the generation of discharge from precipitation inputs, and incorporated them in the modular conceptual physical model of Bergström [2]. Finally, Xiong et al. [11] used a Takagi–Sugeno model in a flood forecasting study, combining the forecasts of five different rainfall-discharge models.

In [10], we developed different Takagi–Sugeno models using three different methods for identifying the antecedent parts: grid partitioning, subtractive clustering and Gustafson–Kessel clustering. The results in that paper show that Takagi–Sugeno models with antecedent parts determined by the Gustafson–Kessel clustering method and with linear consequent parts (GKL) give the best results.

In this paper, we investigate whether comparable results can be obtained using a Takagi–Sugeno model with relational rules. The relational rules are identified using the method presented in [4].

2 Study Area and Data Used

The Takagi–Sugeno models developed in this paper have been applied to predict the discharge of the river Zwalm in Belgium. Troch et al. [9] give a general overview of the soil, vegetative, and topographic conditions of the catchment.

The data set consists of hourly precipitation values (obtained through disaggregation of daily observations) and hourly measured discharge values from 1994 through 1998. Pauwels et al. [7] describe this precipitation disaggregation algorithm in detail.

The identification data set used to build the models consists of the data for 1994 only (Fig. 1). The entire data set was then used for validation. The discharge records show a high temporal variability, and include extremely high and low values. Since the hourly precipitation records were derived from daily observations, the model performance was evaluated using both hourly and daily averages of simulated and observed discharge values.
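The daily evaluation mentioned above requires aggregating hourly series into daily means. The following is a minimal sketch of that step, assuming complete days of 24 hourly values; the function name `daily_means` and the sample values are hypothetical and do not come from the paper.

```python
def daily_means(hourly, hours_per_day=24):
    """Average consecutive blocks of hourly values into daily means.

    Assumes the series covers whole days only (an assumption of this sketch).
    """
    if len(hourly) % hours_per_day != 0:
        raise ValueError("series length must be a multiple of hours_per_day")
    return [
        sum(hourly[start:start + hours_per_day]) / hours_per_day
        for start in range(0, len(hourly), hours_per_day)
    ]


# Illustrative values only: two days of constant discharge.
example_daily = daily_means([1.0] * 24 + [3.0] * 24)
```

Applying the same aggregation to both the simulated and the observed series keeps the two directly comparable at the daily scale.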


Figure 1: Identification data (three-dimensional plot with axes P(t), Q(t) and Q(t+1))

3 Relational Rules for the Zwalm Catchment

The Takagi–Sugeno models used predict the discharge value Q at time step t + 1 using the precipitation and discharge values P and Q at the previous time step t. The Takagi–Sugeno model uses relational rules of the form:

$$\text{IF } (P(t), Q(t)) \text{ is } R_i \text{ THEN } Q(t+1) = a_i P(t) + b_i Q(t) + c_i$$

where $R_i$ is a fuzzy relation on the Cartesian product of the domains of P and Q [4]. In this particular case, the fuzzy relation $R_i$ is considered as a two-dimensional membership function. Q(t + 1) is computed as:

$$Q(t+1) = \frac{\sum_{i=1}^{n} R_i(P(t),Q(t))\,\bigl(a_i P(t) + b_i Q(t) + c_i\bigr)}{\sum_{i=1}^{n} R_i(P(t),Q(t))} \tag{1}$$
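The weighted average in Eq. 1 can be sketched in a few lines of Python. This is only an illustration: the names `ts_predict` and `make_membership`, the isotropic membership shape and all parameter values are assumptions made for the example and are not taken from the paper.

```python
import math


def ts_predict(p, q, rules):
    """Evaluate Eq. 1: membership-weighted average of the linear consequents.

    `rules` is a list of (membership, (a, b, c)) pairs, where `membership`
    is any callable returning the degree R_i(p, q).
    """
    numerator = 0.0
    denominator = 0.0
    for membership, (a, b, c) in rules:
        weight = membership(p, q)
        numerator += weight * (a * p + b * q + c)
        denominator += weight
    if denominator == 0.0:
        raise ValueError("no rule fires for this input")
    return numerator / denominator


def make_membership(centre_p, centre_q, width):
    """Hypothetical isotropic two-dimensional membership function."""
    def membership(p, q):
        return math.exp(-((p - centre_p) ** 2 + (q - centre_q) ** 2) / width ** 2)
    return membership


# Illustrative rule base with two rules (made-up parameters).
example_rules = [
    (make_membership(0.0, 1.0, 1.0), (0.1, 0.9, 0.0)),
    (make_membership(2.0, 5.0, 2.0), (0.5, 0.8, 0.2)),
]
example_prediction = ts_predict(0.0, 1.0, example_rules)
```

Because the output is a convex combination of the rule consequents, the prediction always lies between the smallest and largest consequent value at the given input.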

In order to identify the parameters of the rules, the method presented in [4] is used. First, the Gustafson–Kessel clustering method is applied to the data set in the input-output space. From this clustering algorithm, a fuzzy partition matrix is obtained. This matrix contains the membership degrees of the data points z = (x, y) to the different fuzzy clusters. From this partition matrix, a subset $Z_i$, corresponding to the i-th cluster, is determined. This subset contains the data points with a membership value to the i-th cluster bigger than an arbitrarily chosen threshold α. For each subset, the parameters of the two-dimensional membership functions are determined. The center coordinates of the i-th membership function are:

$$c^i = \frac{1}{N_i}\sum_{k=1}^{N_i} x^k \tag{2}$$

with $N_i$ the number of data points in the subset $Z_i$. The standard deviation of the i-th membership function is:

$$s_p^i = \sqrt{\frac{\sum_{k=1}^{N_i}\bigl(c_p^i - x_p^k\bigr)^2}{N_i - 1}} \tag{3}$$

with $c_p^i$ and $x_p^k$ the values of $c^i$ and $x^k$ for the p-th dimension. The correlation coefficient of the i-th membership function between the p-th and the q-th dimensions is:

$$r_{pq}^i = \frac{\dfrac{\sum_{k=1}^{N_i}\bigl(c_p^i - x_p^k\bigr)\bigl(c_q^i - x_q^k\bigr)}{N_i - 1}}{s_p^i\, s_q^i} \tag{4}$$

with $x_p^k$, $x_q^k$, $c_p^i$ and $c_q^i$ the values of $x^k$ and $c^i$ for the p-th and the q-th dimension. The membership functions Gaweda uses are then:

$$R_i(x_p,x_q) = \exp\!\left(\frac{-1}{1-\bigl(r_{pq}^i\bigr)^2}\left[\left(\frac{x_p-c_p^i}{s_p^i}\right)^2 + \left(\frac{x_q-c_q^i}{s_q^i}\right)^2 - 2\,r_{pq}^i\,\frac{\bigl(x_p-c_p^i\bigr)\bigl(x_q-c_q^i\bigr)}{s_p^i\, s_q^i}\right]\right) \tag{5}$$
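The steps from the threshold selection to Eq. 5 can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the input is restricted to two dimensions, and the sample points are made up rather than taken from the Zwalm data.

```python
import math


def select_subset(points, memberships, alpha):
    """Keep the points whose membership to a cluster exceeds the threshold alpha."""
    return [x for x, mu in zip(points, memberships) if mu > alpha]


def relation_parameters(subset):
    """Centre (Eq. 2), standard deviations (Eq. 3) and correlation (Eq. 4)."""
    n = len(subset)
    if n < 2:
        raise ValueError("at least two points are needed")
    centre = [sum(x[d] for x in subset) / n for d in range(2)]
    spread = [
        math.sqrt(sum((centre[d] - x[d]) ** 2 for x in subset) / (n - 1))
        for d in range(2)
    ]
    covariance = sum(
        (centre[0] - x[0]) * (centre[1] - x[1]) for x in subset
    ) / (n - 1)
    correlation = covariance / (spread[0] * spread[1])
    return centre, spread, correlation


def relation(x, centre, spread, correlation):
    """Two-dimensional membership degree in the form of Eq. 5."""
    u = (x[0] - centre[0]) / spread[0]
    v = (x[1] - centre[1]) / spread[1]
    exponent = -(u * u + v * v - 2.0 * correlation * u * v) / (1.0 - correlation ** 2)
    return math.exp(exponent)


# Illustrative data only.
example_points = [(0.0, 0.0), (1.0, 2.0), (2.0, 1.0), (3.0, 3.0)]
example_centre, example_spread, example_r = relation_parameters(example_points)
```

Note that Eq. 5 is undefined when the correlation equals ±1, so a practical implementation would need to guard against perfectly collinear subsets.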

The vector $\mathbf{a}_i = (a_i, b_i, c_i)$ containing the consequent parameters of rule i is then computed using:

$$\mathbf{a}_i = \bigl(X_i^T X_i\bigr)^{-1} X_i^T Y_i \tag{6}$$

with $X_i$ the matrix containing $N_i$ rows of type $(x^k, 1)$ and $Y_i$ the vector containing the corresponding output parts of the vectors of the subset $Z_i$.
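The local least-squares estimate of Eq. 6 can be sketched without external libraries by solving the normal equations directly. Solving the system with Gauss-Jordan elimination, instead of forming the inverse explicitly, is a choice made for this sketch; the function name `fit_consequent` and the sample data are hypothetical.

```python
def fit_consequent(inputs, outputs):
    """Solve (X^T X) a = X^T Y for a = (a_i, b_i, c_i).

    Each row of X is (P, Q, 1); `inputs` holds the (P, Q) pairs of one subset
    and `outputs` the corresponding Q(t+1) values.
    """
    rows = [(p, q, 1.0) for p, q in inputs]
    # Augmented matrix [X^T X | X^T Y] of the normal equations.
    m = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * y for r, y in zip(rows, outputs))]
        for i in range(3)
    ]
    for col in range(3):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, 3), key=lambda k: abs(m[k][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(3):
            if k != col:
                factor = m[k][col] / m[col][col]
                m[k] = [mk - factor * mc for mk, mc in zip(m[k], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


# Illustrative data generated from known parameters (0.2, 0.7, 0.5).
example_inputs = [(0.0, 1.0), (1.0, 2.0), (2.0, 1.0), (3.0, 4.0)]
example_outputs = [0.2 * p + 0.7 * q + 0.5 for p, q in example_inputs]
example_params = fit_consequent(example_inputs, example_outputs)
```

On noise-free data generated from a linear model, the fit recovers the generating parameters up to rounding error.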

4 Modeling Results and Improvements

In order to examine the performance of the Takagi–Sugeno models, the following two indices are used:

(i) The criterion of Nash and Sutcliffe [6] (NS), commonly used in hydrological studies and comparable to the Variance Accounted For (VAF), compares the sum of squares of model errors with the sum of squares of errors when "no model" is present:

$$NS = 1 - \frac{\sum_{k=1}^{N}\bigl(Q_m(k) - Q_{obs}(k)\bigr)^2}{\sum_{k=1}^{N}\bigl(Q_{obs}(k) - \overline{Q}_{obs}\bigr)^2} \tag{7}$$

where $Q_m$ is the simulated discharge, $Q_{obs}$ is the observed discharge and $\overline{Q}_{obs}$ denotes the mean of the observed data. The optimal value of NS is 1, meaning a perfect match of the model. A value of zero indicates that the model predictions are as good as those of a "no-knowledge" model continuously simulating the mean of the observed signal [3]. Negative values indicate that the model is performing worse than this "no-knowledge" model [3].

(ii) The Root Mean Square Error (RMSE) given by:

$$RMSE = \sqrt{\frac{\sum_{k=1}^{N}\bigl(Q_{obs}(k) - Q_m(k)\bigr)^2}{N}} \tag{8}$$

Table 1: Coordinates of the centra, with the corresponding performance indices

| Model | Centrum P(t) | Centrum Q(t) | NS [-] | RMSE [m^3 s^-1] |
|-------|--------------|--------------|--------|-----------------|
| I     | 1.11         | 2.69         | -77.33 | 16.49           |
|       | 0.01         | 1.70         |        |                 |
| II    | 0.05         | 3.44         | -0.72  | 2.45            |
|       | 0.12         | 1.00         |        |                 |
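The two performance indices of Eqs. 7 and 8 translate directly into code. The sketch below uses hypothetical function names and made-up series, and only illustrates the definitions.

```python
import math


def nash_sutcliffe(simulated, observed):
    """Nash-Sutcliffe criterion (Eq. 7)."""
    mean_observed = sum(observed) / len(observed)
    model_error = sum((m - o) ** 2 for m, o in zip(simulated, observed))
    variance = sum((o - mean_observed) ** 2 for o in observed)
    if variance == 0.0:
        raise ValueError("observed series is constant; NS is undefined")
    return 1.0 - model_error / variance


def rmse(simulated, observed):
    """Root Mean Square Error (Eq. 8)."""
    squared_error = sum((o - m) ** 2 for m, o in zip(simulated, observed))
    return math.sqrt(squared_error / len(observed))


# Illustrative series only.
example_observed = [1.0, 2.0, 3.0, 4.0]
example_mean_model = [2.5, 2.5, 2.5, 2.5]
```

Simulating the observed mean at every time step gives NS = 0, which is the "no-knowledge" reference described above, while a perfect simulation gives NS = 1 and RMSE = 0.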

Using the method described above, the parameters of the rules were identified. Since the Gustafson–Kessel fuzzy clustering algorithm is an iterative clustering algorithm with a random initialisation of the partition matrix, the method was repeated 30 times. For the Gustafson–Kessel clustering, the tolerance value was set to $10^{-3}$ and the fuzziness exponent was set to 2. Since the optimal number of clusters found in [10] was 2, the relational models were built using 2 clusters. For each repetition, a relational rule base was built and a baseline run was performed on the training data. Within these 30 repetitions, essentially two different values for the performance indices were found. The corresponding models also have the same parameters, i.e. centra, spreads, correlation coefficients and consequent parameters, within a certain accuracy. Apparently, for this data set, the Gustafson–Kessel clustering algorithm can only converge to two different models. The coordinates of the centra, together with the corresponding values of the performance indices, are given in Table 1. The values of the performance indices show that these models do not perform well. These poor values may be caused by the fact that data points with higher rainfall and/or discharge values are not well covered by the membership functions. This can be seen by comparing Figs. 1 and 2.

Figure 2: Membership functions for one of the relational models

In order to improve the relational models for this data set, the covariance matrices $\Sigma_i$ and the cluster centra $\mathbf{m}_i$ resulting from the Gustafson–Kessel clustering algorithm are used to construct the two-dimensional membership functions. The two-dimensional membership functions can then be scaled by introducing a multiplicative parameter $\beta_i$ into each of the different covariance matrices. In order to use a Gaussian-like expression, the exponent in Eq. 5 is multiplied by 0.5. Eq. 5 can then be rewritten as:

$$R_i(x_p,x_q) = \exp\!\left(-0.5\,(\mathbf{x}-\mathbf{m}_i)\,\bigl(\beta_i^2\,\Sigma_i\bigr)^{-1}(\mathbf{x}-\mathbf{m}_i)^T\right) \tag{9}$$

with $\mathbf{x} = (x_p, x_q)$. The consequent parameters were then estimated using a global least squares method [1].
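The effect of the scaling parameter in Eq. 9 can be illustrated with a small two-dimensional sketch. The function name `scaled_relation`, the closed-form 2x2 inverse and the identity covariance used in the example are assumptions of this illustration, not values from the paper.

```python
import math


def scaled_relation(x, centre, covariance, beta):
    """Membership degree of Eq. 9 for a 2x2 covariance matrix.

    Uses (beta^2 * Sigma)^-1 = Sigma^-1 / beta^2 and the closed-form
    inverse of a 2x2 matrix.
    """
    (s11, s12), (s21, s22) = covariance
    determinant = s11 * s22 - s12 * s21
    if determinant <= 0.0:
        raise ValueError("covariance matrix must be positive definite")
    d0 = x[0] - centre[0]
    d1 = x[1] - centre[1]
    # Quadratic form (x - m) Sigma^-1 (x - m)^T.
    quadratic = (s22 * d0 * d0 - (s12 + s21) * d0 * d1 + s11 * d1 * d1) / determinant
    return math.exp(-0.5 * quadratic / beta ** 2)


# Illustrative comparison with an identity covariance matrix.
identity = ((1.0, 0.0), (0.0, 1.0))
narrow = scaled_relation((1.0, 0.0), (0.0, 0.0), identity, beta=1.0)
broad = scaled_relation((1.0, 0.0), (0.0, 0.0), identity, beta=2.0)
```

A larger $\beta_i$ leaves the membership at the centre equal to 1 but raises it everywhere else, which is how the scaling broadens the coverage of points far from the cluster centre.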

$\beta_i$ was varied over the values 1, 2, 3 and 4 for the two clusters separately. For each combination of the two $\beta_i$'s, the covariance matrices and the cluster centra of the previous 30 models were used to build new models. A baseline run was performed on the training data. The values of the performance indices for the best models found with the combinations of $\beta_i$'s described above ranged from 0.11 to 0.45 for NS and from 1.75 to 1.38 for RMSE. The models with the best values for both indices, NS = 0.45 and RMSE = 1.38, were found with $\beta_1 = 2$ and $\beta_2 = 4$. These best models are built using model type I of Table 1, with the covariance matrix of the first cluster multiplied by 4 and the covariance matrix of the second cluster multiplied by 16. Fig. 3 shows the membership functions for one of these models.

Figure 3: Membership functions for one of the relational models

These best models were then used to perform a baseline run on the entire data set. The performance indices are calculated using the mean output of these models (Table 2). The values of the performance indices for the GKL models are also listed (Table 2). From this table, one can see that both methods yield comparable results. The modeling results for the relational models for 1998 are shown in Fig. 4.

Table 2: Values of the performance indices for the baseline run on the entire data set using the relational models and the GKL models

| Index           | Averaging | Relational | GKL  |
|-----------------|-----------|------------|------|
| NS [-]          | hourly    | 0.43       | 0.43 |
| NS [-]          | daily     | 0.48       | 0.47 |
| RMSE [m^3 s^-1] | hourly    | 1.37       | 1.37 |
| RMSE [m^3 s^-1] | daily     | 1.18       | 1.19 |

Figure 4: Simulation results for the relational models for 1998. The observations are in solid lines and the simulations are in dashed lines. (Axes: day of year; Q in m^3/s; P in mm/day.)

5 Conclusion

Relational rules were used to develop data-driven Takagi–Sugeno models. Applying the method as described in [4] did not result in acceptable models. This may be due to the fact that the membership functions are too narrow compared to the spread of the training data. Multiplying the covariance matrices by a factor $\beta_i^2$ resulted in broader membership functions and model results that are comparable to those obtained by the GKL models.

Acknowledgements

The authors would like to thank A. Gaweda for the use of his software.

References

[1] R. Babuška, Fuzzy Modeling for Control, Kluwer Academic Publishers, 1998.

[2] S. Bergström, The HBV model, Computer Models of Watershed Hydrology (V. P. Singh, ed.), Water Resources Publications, 1995, pp. 443–476.

[3] K.J. Beven, Rainfall-Runoff Modelling, The Primer, John Wiley and Sons, 2000.

[4] A.E. Gaweda, Optimal data-driven rule extraction using adaptive fuzzy-neural models, Ph.D. thesis, University of Louisville, Louisville, Kentucky, August 2002.

[5] Y. Hundecha, A. Bárdossy, and H.W. Theisen, Development of a fuzzy logic-based rainfall-runoff model, Hydrol. Sci. Journal 46 (2001), no. 3, 363–376.

[6] J.E. Nash and J.V. Sutcliffe, River flow forecasting through conceptual models part I - a discussion of principles, J. Hydrol. 10 (1970), 282–290.

[7] V.R.N. Pauwels, N.E.C. Verhoest, and F.P. De Troch, A meta-hillslope model based on an analytical solution to the linearized Boussinesq-equation for temporally variable recharge rates, Water Resour. Res. 38 (2002), no. 12, 1297, doi:10.1029/2001WR000714.

[8] L. See and S. Openshaw, A hybrid multi-model approach to river level forecasting, Hydrol. Sci. Journal 45 (2000), no. 4, 523–536.

[9] P.A. Troch, F.P. De Troch, and W. Brutsaert, Effective water table depth to describe initial conditions prior to storm rainfall in humid regions, Water Resour. Res. 29 (1993), no. 2, 427–434.

[10] H. Vernieuwe, O. Georgieva, B. De Baets, V.R.N. Pauwels, and N.E.C. Verhoest, Fuzzy models of rainfall-discharge dynamics, Lecture Notes in Computer Science, to appear.

[11] L. Xiong, A.Y. Shamseldin, and K.M. O'Connor, A non-linear combination of the forecasts of rainfall-runoff models by the first-order Takagi-Sugeno fuzzy system, J. Hydrol. 245 (2001), 196–217.
