Relational Takagi–Sugeno Models for Rainfall ... - EUSFLAT
Relational Takagi–Sugeno Models for Rainfall-Discharge Modeling<br />
Hilde Vernieuwe<br />
Department of Applied Mathematics,<br />
Biometrics and Process Control<br />
Ghent University<br />
Coupure links 653, 9000 Gent, Belgium<br />
Hilde.Vernieuwe@rug.ac.be<br />
Bernard De Baets<br />
Department of Applied Mathematics,<br />
Biometrics and Process Control<br />
Ghent University<br />
Coupure links 653, 9000 Gent, Belgium<br />
Bernard.DeBaets@rug.ac.be<br />
Abstract<br />
In this paper, the use of fuzzy models relating<br />
rainfall to catchment discharge is investigated<br />
for the Zwalm catchment in Belgium.<br />
The models are built along the lines<br />
of Gaweda’s method [4]. Since this method did not<br />
yield acceptable models for this data set,<br />
it was further adapted. The newly<br />
obtained models perform comparably to<br />
Takagi–Sugeno models based on<br />
the Gustafson–Kessel clustering algorithm.<br />
Keywords: Gustafson–Kessel clustering,<br />
rainfall-discharge, relational rules,<br />
Takagi–Sugeno models.<br />
1 Introduction<br />
Several rainfall-discharge prediction<br />
models using fuzzy rules have been reported in<br />
recent years. See and Openshaw [8] used a combination<br />
of a hybrid neural network, an autoregressive<br />
moving average model, and a simple fuzzy rule-based<br />
model for discharge forecasting. Hundecha<br />
et al. [5] developed fuzzy rule-based routines simulating<br />
different processes involved in the generation<br />
of discharge from precipitation inputs, and incorporated<br />
them in the modular conceptual physical<br />
model of Bergström [2]. Finally, Xiong et al. [11]<br />
used a Takagi–Sugeno model in a flood forecasting<br />
study, combining the forecasts of five different<br />
rainfall-discharge models.<br />
In [10], we developed different Takagi–Sugeno models<br />
using three different methods for<br />
identifying the antecedent parts: grid partitioning,<br />
subtractive clustering and Gustafson–Kessel clustering.<br />
The results in that paper show that Takagi–Sugeno<br />
models with antecedent parts determined by<br />
the Gustafson–Kessel clustering method and with linear<br />
consequent parts (GKL) give the best results.<br />
In this paper, we investigate whether comparable results<br />
can be obtained using a Takagi–Sugeno model<br />
with relational rules. The relational rules are identified<br />
using the method presented in [4].<br />
2 Study Area and Data Used<br />
The Takagi–Sugeno models developed in this paper<br />
have been applied to predict the discharge of the river<br />
Zwalm in Belgium. Troch et al. [9] give a general<br />
overview of the soil, vegetative, and topographic conditions<br />
of the catchment.<br />
The data set consists of hourly precipitation values<br />
(obtained through disaggregation of daily observations)<br />
and hourly measured discharge values from<br />
1994 through 1998. Pauwels et al. [7] describe the<br />
precipitation disaggregation algorithm in detail.<br />
The identification data set used to build the models<br />
consists of the data for 1994 only (Fig. 1). The<br />
entire data set was then used for validation. The discharge<br />
records show a high temporal variability, and<br />
include extremely high and low values. Since the<br />
hourly precipitation records were derived from daily<br />
observations, the model performance was evaluated<br />
using both hourly and daily averages of simulated and<br />
observed discharge values.
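<br />
The daily averaging used for evaluation could be sketched as follows. This is only an illustration: the function name `daily_means` is hypothetical, and the sketch assumes an hourly series that starts at the beginning of a day and has no gaps.<br />

```python
def daily_means(hourly, hours_per_day=24):
    """Average an hourly series over consecutive blocks of one day.

    Assumption: the series starts at the beginning of a day and has no
    gaps; a trailing incomplete day is dropped.
    """
    n_days = len(hourly) // hours_per_day
    return [
        sum(hourly[d * hours_per_day:(d + 1) * hours_per_day]) / hours_per_day
        for d in range(n_days)
    ]


# Two days of made-up hourly discharge values (not data from the paper)
example = [float(h) for h in range(48)]
print(daily_means(example))  # [11.5, 35.5]
```

The same helper would be applied to both the observed and the simulated series before computing the performance indices.<br />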
Figure 1: Identification data (plot axes include Q(t) and Q(t+1))<br />
3 Relational Rules for the Zwalm Catchment<br />
The Takagi–Sugeno models used predict the discharge<br />
value Q at time step t + 1 from the precipitation<br />
and discharge values P and Q at the previous time<br />
step t. The Takagi–Sugeno model uses relational rules<br />
of the form:<br />
\[ \text{IF } (P(t), Q(t)) \text{ is } R_i \text{ THEN } Q(t+1) = a_i P(t) + b_i Q(t) + c_i \]<br />
where $R_i$ is a fuzzy relation on the Cartesian product<br />
of the domains of P and Q [4]. In this particular<br />
case, the fuzzy relation $R_i$ is considered as a two-dimensional<br />
membership function. Q(t + 1) is computed<br />
as:<br />
\[ Q(t+1) = \frac{\sum_{i=1}^{n} R_i(P(t),Q(t))\,\bigl(a_i P(t) + b_i Q(t) + c_i\bigr)}{\sum_{i=1}^{n} R_i(P(t),Q(t))} \tag{1} \]<br />
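Eq. (1) is a membership-weighted average of the local linear models. A minimal sketch follows, in which the rule base, the helper `make_gaussian` and all parameter values are made up for illustration and are not taken from the paper:<br />

```python
import math


def ts_predict(p, q, rules):
    """Eq. (1): membership-weighted average of the local linear consequents.

    `rules` is a list of (membership_function, (a, b, c)) pairs.
    """
    num = 0.0
    den = 0.0
    for membership, (a, b, c) in rules:
        w = membership(p, q)            # R_i(P(t), Q(t))
        num += w * (a * p + b * q + c)  # weighted local prediction
        den += w
    if den == 0.0:
        raise ValueError("no rule fires for this input")
    return num / den


def make_gaussian(cp, cq, width):
    """Hypothetical stand-in for a two-dimensional membership function."""
    return lambda p, q: math.exp(-((p - cp) ** 2 + (q - cq) ** 2) / width ** 2)


# Made-up two-rule base
rules = [
    (make_gaussian(0.0, 1.0, 1.0), (0.1, 0.9, 0.0)),
    (make_gaussian(1.0, 3.0, 2.0), (0.5, 0.8, 0.2)),
]
q_next = ts_predict(0.2, 1.5, rules)
```

Because the weights are normalised, the prediction always lies between the smallest and largest local prediction of the rules that fire.<br />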
In order to identify the parameters of the rules, the<br />
method presented in [4] is used. First, the Gustafson–Kessel<br />
clustering method is applied to the data set<br />
in the input-output space. This clustering algorithm<br />
yields a fuzzy partition matrix, which<br />
contains the membership degrees of the data<br />
points $\mathbf{z} = (\mathbf{x}, y)$ to the different fuzzy clusters. From<br />
this partition matrix, a subset $Z_i$, corresponding to the<br />
i-th cluster, is determined. This subset contains the<br />
data points whose membership degree to the i-th cluster<br />
exceeds an arbitrarily chosen threshold α. For<br />
each subset, the parameters of the two-dimensional<br />
membership functions are determined. The center coordinates<br />
of the i-th membership function are:<br />
\[ \mathbf{c}^i = \frac{1}{N_i} \sum_{k=1}^{N_i} \mathbf{x}^k \tag{2} \]<br />
with $N_i$ the number of data points in the subset $Z_i$. The<br />
standard deviation of the i-th membership function is:<br />
\[ s^i_p = \sqrt{\frac{\sum_{k=1}^{N_i} \bigl(c^i_p - x^k_p\bigr)^2}{N_i - 1}} \tag{3} \]<br />
with $c^i_p$ and $x^k_p$ the values of $\mathbf{c}^i$ and $\mathbf{x}^k$ for the p-th dimension.<br />
The correlation coefficient of the i-th membership<br />
function between the p-th and the q-th dimensions is:<br />
\[ r^i_{pq} = \frac{\dfrac{1}{N_i - 1}\sum_{k=1}^{N_i} \bigl(c^i_p - x^k_p\bigr)\bigl(c^i_q - x^k_q\bigr)}{s^i_p\, s^i_q} \tag{4} \]<br />
with $x^k_p$, $x^k_q$, $c^i_p$ and $c^i_q$ the values of $\mathbf{x}^k$ and $\mathbf{c}^i$ for the p-th<br />
and the q-th dimension. The membership functions<br />
Gaweda uses are then:<br />
\[ R_i(x_p,x_q) = \exp\left( \frac{-1}{1-\bigl(r^i_{pq}\bigr)^2} \left[ \left(\frac{x_p - c^i_p}{s^i_p}\right)^2 + \left(\frac{x_q - c^i_q}{s^i_q}\right)^2 - 2\, r^i_{pq}\, \frac{(x_p - c^i_p)(x_q - c^i_q)}{s^i_p\, s^i_q} \right] \right) \tag{5} \]<br />
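The steps from the thresholded subset to the membership function of Eq. (5) could be sketched as follows. The function names, the threshold value and the example data are assumptions made for illustration, and the Gustafson–Kessel clustering itself is not reproduced: the membership degrees are simply given.<br />

```python
import math


def cluster_subset(inputs, memberships, alpha):
    """Keep the input points whose membership degree exceeds alpha."""
    return [x for x, u in zip(inputs, memberships) if u > alpha]


def membership_params(subset):
    """Center (Eq. 2), standard deviations (Eq. 3) and correlation (Eq. 4).

    Needs at least two points with non-zero spread in both dimensions.
    """
    n = len(subset)
    c = [sum(x[d] for x in subset) / n for d in range(2)]
    s = [
        math.sqrt(sum((c[d] - x[d]) ** 2 for x in subset) / (n - 1))
        for d in range(2)
    ]
    cov = sum((c[0] - x[0]) * (c[1] - x[1]) for x in subset) / (n - 1)
    r = cov / (s[0] * s[1])
    return c, s, r


def relation(x, c, s, r):
    """Two-dimensional membership function of Eq. (5); requires |r| < 1."""
    zp = (x[0] - c[0]) / s[0]
    zq = (x[1] - c[1]) / s[1]
    return math.exp(-(zp ** 2 + zq ** 2 - 2.0 * r * zp * zq) / (1.0 - r ** 2))


# Made-up (P(t), Q(t)) points and membership degrees to one cluster
inputs = [(0.0, 1.0), (1.0, 2.0), (2.0, 2.0), (3.0, 4.0), (9.0, 9.0)]
degrees = [0.9, 0.8, 0.7, 0.95, 0.1]
subset = cluster_subset(inputs, degrees, alpha=0.5)
c, s, r = membership_params(subset)
```

In this toy example the last point falls below the threshold, so the center is the mean of the first four points and the membership function equals 1 at that center.<br />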
The vector $\mathbf{a}_i = (a_i, b_i, c_i)$ containing the consequent<br />
parameters of rule i is then computed using:<br />
\[ \mathbf{a}_i = \bigl(X_i^T X_i\bigr)^{-1} X_i^T Y_i \tag{6} \]<br />
with $X_i$ the matrix containing $N_i$ rows of type $(\mathbf{x}^k, 1)$<br />
and $Y_i$ the vector containing the corresponding output<br />
parts of the vectors of the subset $Z_i$.<br />
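Eq. (6) is an ordinary least squares fit on the points of one subset. A minimal sketch in pure Python follows. It solves the normal equations by Gaussian elimination instead of forming the inverse explicitly, which is a choice made here, and the helper names and the data are hypothetical:<br />

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def local_consequent(inputs, outputs):
    """Eq. (6): solve (X^T X) a = X^T Y with rows (P(t), Q(t), 1)."""
    rows = [(p, q, 1.0) for p, q in inputs]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, outputs)) for i in range(3)]
    return solve_linear(xtx, xty)


# Made-up subset generated from Q(t+1) = 0.5 P(t) + 0.8 Q(t) + 0.2
inputs = [(0.0, 1.0), (1.0, 2.0), (2.0, 2.0), (3.0, 4.0)]
outputs = [0.5 * p + 0.8 * q + 0.2 for p, q in inputs]
a_i, b_i, c_i = local_consequent(inputs, outputs)
```

With noise-free data the fit recovers the generating coefficients up to rounding. The subset needs at least three points that are not collinear in (P, Q), otherwise the system is singular.<br />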
4 Modeling Results and Improvements<br />
In order to examine the performance of the Takagi–Sugeno<br />
models, the following two indices are used:<br />
(i) The criterion of Nash and Sutcliffe [6] (NS),<br />
commonly used in hydrological studies and comparable<br />
to the Variance Accounted For (VAF),<br />
compares the sum of squares of model errors<br />
with the sum of squares of errors when “no<br />
model” is present:<br />
\[ NS = 1 - \frac{\sum_{k=1}^{N} \bigl(Q_m(k) - Q_{obs}(k)\bigr)^2}{\sum_{k=1}^{N} \bigl(Q_{obs}(k) - \overline{Q}_{obs}\bigr)^2} \tag{7} \]<br />
where $Q_m$ is the simulated discharge, $Q_{obs}$ is the<br />
observed discharge and $\overline{Q}_{obs}$ denotes the mean of<br />
the observed data. The optimal value of NS is 1,<br />
meaning a perfect match of the model. A value<br />
of zero indicates that the model predictions are as<br />
good as those of a “no-knowledge” model continuously<br />
simulating the mean of the observed signal<br />
[3]. Negative values indicate that the model<br />
is performing worse than this “no-knowledge”<br />
model [3].<br />
(ii) The Root Mean Square Error (RMSE) given by:<br />
\[ RMSE = \sqrt{\frac{\sum_{k=1}^{N} \bigl(Q_{obs}(k) - Q_m(k)\bigr)^2}{N}} \tag{8} \]<br />
Table 1: Coordinates of the centra, with the corresponding<br />
performance indices<br />
Model | Centra P(t) | Centra Q(t) | NS [-] | RMSE [m^3 s^-1]<br />
I | 1.11 | 2.69 | -77.33 | 16.49<br />
  | 0.01 | 1.70 | |<br />
II | 0.05 | 3.44 | -0.72 | 2.45<br />
  | 0.12 | 1.00 | |<br />
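The two performance indices of Eqs. (7) and (8) could be computed as in the following sketch. The function names are illustrative and the series are made up:<br />

```python
import math


def nash_sutcliffe(q_obs, q_sim):
    """Eq. (7): 1 minus the ratio of model error to variance about the mean."""
    mean_obs = sum(q_obs) / len(q_obs)
    sse = sum((m - o) ** 2 for m, o in zip(q_sim, q_obs))
    sst = sum((o - mean_obs) ** 2 for o in q_obs)
    return 1.0 - sse / sst


def rmse(q_obs, q_sim):
    """Eq. (8): root mean square error."""
    n = len(q_obs)
    return math.sqrt(sum((o - m) ** 2 for o, m in zip(q_obs, q_sim)) / n)


# Made-up observed discharge and two reference "models"
q_obs = [1.0, 2.0, 3.0, 4.0]
perfect = list(q_obs)   # reproduces the observations exactly
mean_only = [2.5] * 4   # always simulates the observed mean

ns_perfect = nash_sutcliffe(q_obs, perfect)   # 1.0
ns_mean = nash_sutcliffe(q_obs, mean_only)    # 0.0
```

The two reference cases reproduce the interpretation given above: a perfect match gives NS = 1, and a model that always returns the observed mean gives NS = 0.<br />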
Using the method described above, the parameters of<br />
the rules were identified. Since the Gustafson–Kessel<br />
fuzzy clustering algorithm is an iterative clustering algorithm<br />
with a random initialisation of the partition<br />
matrix, the method was repeated 30 times. For the<br />
Gustafson–Kessel clustering, the tolerance value was<br />
set to $10^{-3}$ and the fuzziness exponent was set to 2.<br />
Since the optimal number of clusters found in [10]<br />
was 2, the relational models were built using 2 clusters.<br />
For each repetition, a relational rule base was<br />
built and a baseline run was performed on the training<br />
data. Within these 30 repetitions, essentially two different<br />
sets of values for the performance indices were found.<br />
The corresponding models also have the same parameters,<br />
i.e. centra, spreads, correlation coefficients and<br />
consequent parameters, within a certain accuracy. Apparently,<br />
for this data set, the Gustafson–Kessel clustering<br />
algorithm can only converge to two different<br />
models. The coordinates of the centra, together with<br />
the corresponding values of the performance indices,<br />
are given in Table 1. The values of the performance<br />
indices show that these models do not perform well.<br />
These poor values may be caused by the fact that data<br />
points with higher rainfall and/or discharge values<br />
are not well covered by the membership functions.<br />
This can be seen by comparing Figs. 1 and 2.<br />
Figure 2: Membership functions for one of the relational<br />
models<br />
In order to improve the relational models for this data<br />
set, the covariance matrices $\Sigma_i$ and the cluster centra<br />
$\mathbf{m}_i$ resulting from the Gustafson–Kessel clustering<br />
algorithm are used to construct the two-dimensional<br />
membership functions. The two-dimensional membership<br />
functions can then be scaled by introducing a<br />
multiplicative parameter $\beta_i$ into each of the different<br />
covariance matrices. In order to use a Gaussian-like<br />
expression, the exponent in Eq. 5 is multiplied by 0.5.<br />
Eq. 5 can then be rewritten as:<br />
\[ R_i(x_p,x_q) = \exp\left( -0.5\, (\mathbf{x}-\mathbf{m}_i)\, \bigl(\beta_i^2 \Sigma_i\bigr)^{-1} (\mathbf{x}-\mathbf{m}_i)^T \right) \tag{9} \]<br />
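For a 2×2 covariance matrix, Eq. (9) can be evaluated with the closed-form inverse. The sketch below is illustrative only: the function name and the example values are assumptions, and it uses the identity $(\beta_i^2 \Sigma_i)^{-1} = \Sigma_i^{-1} / \beta_i^2$.<br />

```python
import math


def scaled_relation(x, m, cov, beta):
    """Eq. (9) for a 2x2 covariance matrix cov = [[a, b], [c, d]]."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dp, dq = x[0] - m[0], x[1] - m[1]
    # Quadratic form (x - m) Sigma^{-1} (x - m)^T via the 2x2 inverse
    quad = (d * dp * dp - (b + c) * dp * dq + a * dq * dq) / det
    # Scaling Sigma by beta^2 divides the quadratic form by beta^2
    return math.exp(-0.5 * quad / beta ** 2)


# Made-up cluster: unit covariance, center at the origin
identity = [[1.0, 0.0], [0.0, 1.0]]
narrow = scaled_relation((1.0, 0.0), (0.0, 0.0), identity, beta=1.0)
wide = scaled_relation((1.0, 0.0), (0.0, 0.0), identity, beta=2.0)
```

Here `wide` exceeds `narrow`: a larger $\beta_i$ broadens the membership function, so points far from the center receive a higher membership degree.<br />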
The consequent parameters were then estimated using<br />
a global least squares method [1].<br />
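The paper does not spell out the global least squares step. One common formulation, assumed here, estimates the consequent parameters of all rules simultaneously by using the normalised membership degrees as weights in a single regression, which makes Eq. (1) linear in the parameters. All names and data in this sketch are hypothetical:<br />

```python
import math


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def global_least_squares(inputs, outputs, memberships):
    """Fit (a_i, b_i, c_i) of all rules at once.

    memberships[k][i] is R_i evaluated at inputs[k]. Each regressor row
    holds gamma_i * (P, Q, 1) for every rule i, with gamma_i the
    normalised membership degree.
    """
    rows = []
    for (p, q), mu in zip(inputs, memberships):
        total = sum(mu)
        row = []
        for m in mu:
            g = m / total
            row.extend((g * p, g * q, g))
        rows.append(row)
    n = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    xty = [sum(r[i] * y for r, y in zip(rows, outputs)) for i in range(n)]
    theta = solve_linear(xtx, xty)
    return [tuple(theta[3 * i:3 * i + 3]) for i in range(n // 3)]


# Made-up two-rule system used to generate noise-free data on a grid
true_params = [(0.1, 0.9, 0.0), (0.5, 0.8, 0.2)]
centres = [(0.5, 0.5), (2.5, 2.5)]
inputs = [(float(p), float(q)) for p in range(4) for q in range(4)]
memberships = [
    [math.exp(-0.5 * ((p - cp) ** 2 + (q - cq) ** 2)) for cp, cq in centres]
    for p, q in inputs
]
outputs = [
    sum(m * (a * p + b * q + c) for m, (a, b, c) in zip(mu, true_params)) / sum(mu)
    for (p, q), mu in zip(inputs, memberships)
]
estimated = global_least_squares(inputs, outputs, memberships)
```

Unlike the per-cluster fit of Eq. (6), this formulation minimises the error of the aggregated model output, so the overlap between membership functions is taken into account.<br />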
$\beta_i$ was varied over the values 1, 2, 3 and 4 for the two clusters<br />
separately. For each combination of the two $\beta_i$’s, the<br />
covariance matrices and the cluster centra of the previous<br />
30 models were used to build new models. A<br />
baseline run was performed on the training data. The<br />
values of the performance indices for the best models<br />
found with the above described combinations of<br />
$\beta_i$’s varied between 0.11 and 0.45 for NS and between 1.75 and<br />
1.38 for RMSE. The best models, with<br />
NS = 0.45 and RMSE = 1.38, were found with 2<br />
and 4 as $\beta_i$’s. These best models are built using model<br />
type I of Table 1, with the covariance matrix of the<br />
first cluster multiplied by 4 and the covariance matrix<br />
of the second cluster multiplied by 16. Fig. 3 shows<br />
the membership functions for one of these models.<br />
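The search over the scaling factors amounts to an exhaustive grid search. A minimal sketch follows, in which `score` stands for a hypothetical user-supplied function that builds the model for a given pair of scaling factors and returns a goodness-of-fit value such as NS. The toy score is arbitrary and unrelated to the results reported here:<br />

```python
from itertools import product


def best_beta_pair(candidates, score):
    """Return the (beta_1, beta_2) pair with the highest score."""
    return max(product(candidates, repeat=2), key=lambda pair: score(*pair))


def toy_score(beta_1, beta_2):
    """Arbitrary stand-in for a model evaluation; peaks at (3, 2)."""
    return -((beta_1 - 3) ** 2 + (beta_2 - 2) ** 2)


best = best_beta_pair([1, 2, 3, 4], toy_score)
```

With four candidate values per cluster, this evaluates 16 combinations.<br />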
These best models were then used to perform a baseline<br />
run on the entire data set. The performance indices<br />
are calculated using the mean output of these<br />
models (Table 2). The values of the performance indices<br />
for the GKL models are also listed (Table 2).<br />
From this table, one can see that both methods yield<br />
comparable results. The modeling results for the relational<br />
models for 1998 are shown in Fig. 4.<br />
Figure 3: Membership functions for one of the relational<br />
models<br />
Table 2: Values of the performance indices for the<br />
baseline run on the entire data set using the relational<br />
models and the GKL models<br />
Index | Averaging | Relational | GKL<br />
NS [-] | hourly | 0.43 | 0.43<br />
NS [-] | daily | 0.48 | 0.47<br />
RMSE [m^3 s^-1] | hourly | 1.37 | 1.37<br />
RMSE [m^3 s^-1] | daily | 1.18 | 1.19<br />
5 Conclusion<br />
Relational rules were used to develop data-driven<br />
Takagi–Sugeno models. Applying the method as described<br />
in [4] did not result in acceptable models. This<br />
may be due to the fact that the membership functions<br />
are too narrow compared to the spread of the training<br />
data. Multiplying the covariance matrices by a factor<br />
$\beta_i^2$ resulted in broader membership functions and<br />
model results that are comparable to those obtained<br />
by the GKL models.<br />
Figure 4: Simulation results for the relational models<br />
for 1998. The observations are in solid lines and the<br />
simulations are in dashed lines. (Axes: Q (m3/s) and<br />
P (mm/day) against day of year.)<br />
Acknowledgements<br />
The authors would like to thank A. Gaweda for the<br />
use of his software.<br />
References<br />
[1] R. Babuška, Fuzzy Modeling for Control,<br />
Kluwer Academic Publishers, 1998.<br />
[2] S. Bergström, The HBV model, Computer Models<br />
of Watershed Hydrology (V. P. Singh, ed.),<br />
Water Resources Publications, 1995, pp. 443–476.<br />
[3] K.J. Beven, Rainfall-Runoff Modelling, The<br />
Primer, John Wiley and Sons, 2000.<br />
[4] A.E. Gaweda, Optimal data-driven rule extraction<br />
using adaptive fuzzy-neural models, Ph.D.<br />
thesis, University of Louisville, Louisville, Kentucky,<br />
August 2002.<br />
[5] Y. Hundecha, A. Bárdossy, and H.W. Theisen,<br />
Development of a fuzzy logic-based rainfall-runoff<br />
model, Hydrol. Sci. Journal 46 (2001),<br />
no. 3, 363–376.<br />
[6] J.E. Nash and J.V. Sutcliffe, River flow forecasting<br />
through conceptual models part I - a discussion<br />
of principles, J. Hydrol. 10 (1970), 282–290.<br />
[7] V.R.N. Pauwels, N.E.C. Verhoest, and F.P.<br />
De Troch, A meta-hillslope model based on an<br />
analytical solution to the linearized Boussinesq<br />
equation for temporally variable recharge rates,<br />
Water Resour. Res. 38 (2002), no. 12, 1297,<br />
doi:10.1029/2001WR000714.<br />
[8] L. See and S. Openshaw, A hybrid multi-model<br />
approach to river level forecasting, Hydrol. Sci.<br />
Journal 45 (2000), no. 4, 523–536.<br />
[9] P.A. Troch, F.P. De Troch, and W. Brutsaert, Effective<br />
water table depth to describe initial conditions<br />
prior to storm rainfall in humid regions,<br />
Water Resour. Res. 29 (1993), no. 2, 427–434.<br />
[10] H. Vernieuwe, O. Georgieva, B. De Baets,<br />
V.R.N. Pauwels, and N.E.C. Verhoest, Fuzzy<br />
models of rainfall-discharge dynamics, Lecture<br />
Notes in Computer Science, to appear.<br />
[11] L. Xiong, A.Y. Shamseldin, and K.M.<br />
O’Connor, A non-linear combination of the<br />
forecasts of rainfall-runoff models by the first-order<br />
Takagi-Sugeno fuzzy system, J. Hydrol.<br />
245 (2001), 196–217.