

OPTIMIZING WATER QUALITY MONITORING STATIONS
USING GENETIC ALGORITHMS

Muhammad A. Al-Zahrani* and Khurram Moied

King Fahd University of Petroleum & Minerals
Department of Civil Engineering
Dhahran 31261, Saudi Arabia

Abstract (translated from the Arabic): International drinking water laws and regulations require that samples be taken from the water flowing through the distribution network, in order to verify that the water is fit for drinking, of good quality, and compliant with local and international standards. Accordingly, the optimal locations of water quality monitoring stations must be determined by scientific methods, taking economic factors into account. In this study, a mathematical model was developed using the Genetic Algorithm method to assign the optimal locations of water quality monitoring stations, which helps give a complete picture of the quality of the water conveyed in the network. The developed model is explained, and it was also applied to some hypothetical water networks to verify its capability and efficiency. The developed model proved highly efficient in identifying the optimal locations proposed for sampling to verify the quality of the water flowing in the network.

* Address for correspondence:
KFUPM Box 686
King Fahd University of Petroleum & Minerals
Dhahran 31261
Saudi Arabia
E-mail address: mzahrani@kfupm.edu.sa

The Arabian Journal for Science and Engineering, Volume 28, Number 1B, April 2003



ABSTRACT

Monitoring of drinking water transported by a water distribution network is an essential step in safeguarding human health and ensuring that drinking water quality complies with local and international standards. The Safe Drinking Water Act requires that water quality in a water distribution network be sampled at locations that are representative of the whole network. Different tools based on optimization techniques can be employed to identify water quality monitoring stations in a water distribution network. In this paper, a Genetic Algorithm (GA) is applied for this purpose. The steps of the developed methodology are presented through an application to hypothetical networks. Its validity was then tested against two cases presented in the literature, for which it gave similar results.

Keywords: Water quality, Water distribution network, Optimization, Genetic Algorithm (GA).


LIST OF SYMBOLS

C1    Scaled fitness constant
C2    Scaled fitness constant
d     Nodal demand
fn    Water fraction
Fr    Raw fitness
Fs    Scaled fitness
n     Number of nodes in the water distribution network
NFS   Number of feasible solutions
P     Number of generations
Q     The demand coverage that a possible solution can achieve in the distribution network
R     Rate of mutation
S     Number of monitoring stations
W     Quantity of water
X     Size of population
yi    Yes/No signal, suggesting if the node “i” is covered or not
Z     Best fitness value

1. INTRODUCTION

Drinking water quality can deteriorate during distribution to the consumer. Many factors, external or internal, cause water quality to deteriorate between treatment and consumption. Some of the major ones are source water, treatment processes, operation of systems, transport and transformations, the condition of the water distribution network, and storage.

To obtain a general picture of the water quality situation in a water distribution network, sampling locations need to be identified at which water quality parameters are monitored. Convenience and spatial representativeness are the two major factors in selecting the sampling locations [1,2].

Once appropriate sampling locations are identified, water quality must be monitored regularly at these locations, in addition to monitoring the source quality. Monitoring should include sufficient parameters to indicate all quality concerns and should be conducted at appropriate locations throughout the source of supply. The monitoring program should include protocols for the frequency of sampling and the methodology of analysis, and should be designed to establish baseline data that indicate both short-term and long-term trends. Such monitoring can serve as a trigger mechanism to detect water contamination problems at their earliest stages [3].

The current practice of water sampling is based on taking water samples from locations that are easy to reach. The types of locations used for collecting water samples include fire hydrants, storage tanks, pumping stations, commercial buildings, public buildings, and private residences. In other words, sampling sites are chosen for accessibility, and no guidelines exist on how to locate sampling stations.

Methodologies have recently been developed to locate sampling stations (monitoring stations) on a scientific basis. Lee and Deininger [2] developed such an approach based on the concept of demand coverage (DC), a term used to represent the percentage of network demand monitored by a particular monitoring station. The objective of this methodology was to allocate monitoring stations that provide maximum information about the water quality condition within a distribution network. The solution suggested by Lee and Deininger [2] was based on the general feature that water quality parameters decrease with time and distance from the source. That is, if the water quality at a sampled node is good, then it must also be good at the immediately upstream node. The term “covered node” was used to denote a node whose water quality can be inferred from the water sampled at some downstream node. Lee and Deininger [2] used the information obtained from hydraulic analysis of the network to identify the pathways, so that the water quality of a large portion of the network could be assessed by installing a few sampling stations. The information obtained from the pathways, in terms of a water fraction matrix, was then converted into an integer-programming problem under a chosen coverage criterion. In this method, the lowest level of knowledge, called “any fraction”, occurred when only a very small fraction of the water passed through the node. For a large network, however, the method became highly cumbersome and difficult to handle because of the large dimensionality of the problem.

Kumar et al. [4] enhanced the work of Lee and Deininger [2] to resolve the problem of dimensionality and proposed a few changes in the methodology. They used the same coverage matrix as developed by Lee and Deininger. After the flow direction in each pipe of the network is calculated, the nodes are renumbered in ascending order of flow, and the monitoring station with the maximum coverage of upstream nodes is selected. Next, the row corresponding to the selected station is deleted from the coverage matrix. Subsequent monitoring stations are selected by repeating the same process on the preceding coverage matrix until the required number of stations is reached. The methodology proposed by Kumar et al. [4] was simpler than that of Lee and Deininger [2] as far as construction of the coverage matrix was concerned; however, for a large distribution network, extensive computer programming was required to optimize the monitoring locations, even though the mathematical calculations themselves were simple.

Based on the concept of Lee and Deininger [2], Kessler et al. [5] developed a methodology capable of locating optimal water quality monitoring stations in a distribution network under accidental intrusion of contaminants. Kessler et al. [5] defined a “level of service” as the maximum allowable quantity of water that flows through a certain node before the contaminant is detected, in relation to the time of detection. After the network is hydraulically simulated using extended period simulation, an auxiliary network is developed in the form of a graph consisting of nodes and directed arcs, such that the length of each arc represents the travel time between its nodes. All shortest paths were calculated using the auxiliary network, and a pollution matrix was constructed with 0–1 coefficients.

The current study extends the model developed by Lee and Deininger [2] to help identify water quality monitoring stations in a water distribution network using a Genetic Algorithm.

2. MONITORING STATIONS AND OPTIMIZATION

The main objective of the optimization is to determine locations for water quality monitoring stations that are representative of the whole network, so that the water quality examined at these stations represents the quality of the whole network. For a water distribution network, the size of the solution space, as well as the number of monitoring stations required, grows rapidly with the number of nodes in the network. Assuming that any node within the distribution network is a candidate monitoring station, the number of feasible solutions is defined as:

$$NFS = N^{S} \tag{1}$$

where NFS is the number of feasible solutions, N is the number of nodes in the network, and S is the number of monitoring stations required.

For example, to locate 4 monitoring stations in a network of 100 nodes, the number of feasible solutions is on the order of $100^4$. Out of these 100 million feasible solutions, however, only one is optimal. Locating this optimal solution among such a large number of feasible solutions requires a powerful algorithm. A Genetic Algorithm (GA) is one such algorithm. Described briefly in the next section, GAs have been used successfully in other areas involving water networks. Meier et al. [7] used GAs to optimize the flow test locations needed for network model calibration. Savic and Walters [11] illustrated the use of GAs for least-cost selection of pipe sizes.
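As a quick check of Equation (1), the short sketch below evaluates NFS for the 100-node, 4-station example. The variable names are illustrative. The second figure is an aside that is not taken from the paper: Equation (1) counts ordered selections in which a node may repeat, so the number of distinct station sets is smaller.

```python
import math

n_nodes = 100     # N, number of nodes in the network
n_stations = 4    # S, number of monitoring stations to place

# Equation (1): each of the S station slots may take any of the N nodes
nfs = n_nodes ** n_stations

# Aside (assumption, not from the paper): unordered sets of distinct nodes
distinct_sets = math.comb(n_nodes, n_stations)

print(nfs)            # 100000000
print(distinct_sets)  # 3921225
```

Either way, exhaustive enumeration becomes impractical as N and S grow, which is the motivation the text gives for a heuristic search.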

3. OPTIMIZATION BY GENETIC ALGORITHM

3.1 Outline of Genetic Algorithm (GA)

A GA starts with a randomly generated set of coded strings (chromosomes), each representing a potential solution, that is, a set of variable values that points to one location in the solution domain. The variables encoded in these chromosomes are called “genes” or “alleles”. The terminology follows natural evolution, which takes place in chromosomes, the microscopic threadlike parts of the cell nucleus that carry hereditary information in the form of genes [6]. Concatenation of the decision variable (gene) values usually forms the strings [7]. This is most often done using a binary representation of those values, but in this study integer values representing the node numbers in the distribution network have been used, which shortens the length of the possible solution strings (chromosomes). From the initial population, the fittest strings (as measured by their objective function values) are selected to pass their “genetic information” to the next generation. This operation is called “selection” and resembles the survival of the fittest in natural systems. There are many different schemes for selecting survivors; all of them share the common goal that more fit members replace less fit ones in the population to advance the search. After selection, the population is, on average, more fit than it was before selection.

Selection is followed by an operation called “crossover”, which creates, from the survivors, new strings that contain distinguishing properties of the survivors from which they are created. In some cases the new strings will have lower fitness, in other cases they will have higher fitness, and in a certain percentage of cases the children will resemble their parents and thus have the same fitness values as their parents.

Since crossover simply recombines existing strings into new combinations, successive generations will carry the characteristics contained in the previous populations. It is possible that some desirable strings were not included in the initial (randomly generated) population, or have been lost because the individuals possessing those desirable qualities became unfit and disappeared from the population. An operation called “mutation” is therefore used to occasionally alter a string (chromosome) in order to recover desirable qualities or to create new qualities in the strings (chromosomes).

Figure 1 shows the steps involved in applying a Genetic Algorithm to locate optimum water quality monitoring stations.

Figure 1. Genetic algorithm flowchart
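The selection, crossover, and mutation loop outlined above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: roulette-wheel selection, one-point crossover, per-gene mutation, and the names `run_ga` and `toy_fitness` are all choices made here for the example, and the fitness function is assumed to return non-negative values.

```python
import random

def run_ga(fitness, n_nodes, n_stations, pop_size=20, generations=50,
           mutation_rate=0.05, seed=0):
    """Minimal GA over integer chromosomes (lists of node numbers).

    Assumes pop_size is even, n_stations >= 2, and fitness(c) >= 0.
    """
    rng = random.Random(seed)

    def random_gene():
        return rng.randint(1, n_nodes)

    # Initial population: random strings of node numbers
    population = [[random_gene() for _ in range(n_stations)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    for _ in range(generations):
        # Selection: fitter strings are more likely to become parents
        weights = [fitness(c) + 1e-9 for c in population]
        parents = rng.choices(population, weights=weights, k=pop_size)

        # Crossover: paired parents swap tails at a random cut point
        children = []
        for a, b in zip(parents[::2], parents[1::2]):
            cut = rng.randint(1, n_stations - 1)
            children.append(a[:cut] + b[cut:])
            children.append(b[:cut] + a[cut:])

        # Mutation: occasionally replace a gene with a random node number
        population = [[random_gene() if rng.random() < mutation_rate else g
                       for g in child] for child in children]

        # Keep track of the best string seen so far
        best = max(population + [best], key=fitness)
    return best


# Toy fitness (hypothetical): number of distinct nodes in the chromosome
def toy_fitness(chromosome):
    return len(set(chromosome))

solution = run_ga(toy_fitness, n_nodes=15, n_stations=4)
print(solution, toy_fitness(solution))
```

In the application described in the paper, the fitness function would be based on the demand coverage of the candidate station set rather than on this toy measure.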


3.2 Hydraulic Simulation & Demand Coverage

Before the GA is applied, the water distribution network must be hydraulically simulated. Hydraulic simulation is complete once all the directions of flow in the links and the demands at each node are known, such that the head losses around each closed loop sum to zero and the mass balance equation is satisfied. The formulation is carried out for single or multiple demand patterns of the water distribution network, which makes it possible to optimize water quality monitoring stations under multiple flow scenarios.

Once the hydraulic features of the network are known, the logic that water quality at an upstream node is better than at the downstream nodes is applied. It follows from the fact that water quality deteriorates with the passage of time as water flows away from the source of supply. A matrix, called the “Demand Coverage Matrix”, is developed to analyze the maximum coverage of the distribution network. The demand covered at each downstream node is computed by considering the demands of all the upstream nodes en route to that node. Since the direction of flow in the links changes with the demand pattern, coverage matrices must be constructed separately for each demand scenario. This concept is presented later, when the developed methodology is applied to a hypothetical water distribution network.

4. APPLICATION OF GA

To illustrate the application of the above concept, a hypothetical water distribution network consisting of 15 nodes, 23 links, and 3 sources of supply (A, B, and C) is proposed [8]. Water distribution networks use interconnected elements such as pipes, pumps, and reservoirs to convey treated water from one or more sources to consumers spread over a wide area. For this hypothetical network, the hydraulic simulation for a specific scenario is carried out using EPANET [9], which determines the quantities and directions of flow and the nodal demands. The quantities and directions of flow in the links, along with the demands at each node, are shown in Figure 2 for Scenario 1. The total supply to the network from the sources is set to 415 units.

Figure 2. Hypothetical water distribution network (scenario 1) (after Boulos and Altman [8])

Nodes have to be identified in order to determine the routing of water in the distribution network. Starting from a downstream node, the network is traced in the upstream direction to identify the nodes that contribute water to that node. For the hypothetical network, node 9 receives water from node 1, node 5 receives water from node 2, and node 4 receives water from nodes 1 and 2. Similarly, node 10 receives water from nodes 4 and 9. In this way, all downstream nodes are evaluated to determine the contributing upstream nodes of the network.
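The upstream tracing described above can be sketched as a simple graph search. The link list below contains only the connections mentioned in the text, not the full 23-link network of Figure 2, and the function name `upstream_nodes` is hypothetical.

```python
def upstream_nodes(links, node):
    """Return every node that contributes water to `node`.

    links: iterable of (upstream, downstream) pairs giving flow direction.
    """
    predecessors = {}
    for up, down in links:
        predecessors.setdefault(down, set()).add(up)

    found, stack = set(), [node]
    while stack:
        current = stack.pop()
        for up in predecessors.get(current, ()):
            if up not in found:
                found.add(up)
                stack.append(up)
    return found


# Partial link list: only the links named in the text (assumption)
links = [(1, 9), (2, 5), (1, 4), (2, 4), (4, 10), (9, 10)]

print(sorted(upstream_nodes(links, 10)))  # [1, 2, 4, 9]
print(sorted(upstream_nodes(links, 5)))   # [2]
```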

Once the nodes are identified, water fraction matrices are constructed. For this purpose, the fractions of the total water received at a particular downstream node from the contributing upstream nodes are determined and inserted in the water fraction matrix. For example, if $(w_{nk} + w_{nl} + w_{nm})$ is the total water supplied to a downstream node n from the upstream nodes k, l, and m, then the fractions $f_n$ are calculated by [4]:


$$f_{nk} = \frac{w_{nk}}{w_{nk} + w_{nl} + w_{nm}} \tag{2}$$

$$f_{nl} = \frac{w_{nl}}{w_{nk} + w_{nl} + w_{nm}} \tag{3}$$

$$f_{nm} = \frac{w_{nm}}{w_{nk} + w_{nl} + w_{nm}} \tag{4}$$

where

$f_{nk}$ is the fraction of water received at node n from node k,
$f_{nl}$ is the fraction of water received at node n from node l, and
$f_{nm}$ is the fraction of water received at node n from node m.

Since nodes k, l, and m also receive supplies from further upstream nodes 1, 2, 3, 4, ..., the water fraction of all these upstream nodes to the monitoring node n is the vector $f_n$, which will be:

$$f_n = \left(f_{n1}, f_{n2}, f_{n3}, \ldots, f_{nk}, f_{nl}, f_{nm}, \ldots, f_{n(n-1)}, f_{nn}\right). \tag{5}$$

Elements of the vector $f_n$ can be computed as

$$f_n = \left[(f_{nk} \cdot f_k) + (f_{nl} \cdot f_l) + (f_{nm} \cdot f_m)\right], \tag{6}$$

and the water fraction matrix is represented by

$$[f] = \begin{bmatrix} f_{11} & f_{12} & \cdots & f_{1n} \\ & f_{22} & \cdots & f_{2n} \\ & & \ddots & \vdots \\ & & & f_{nn} \end{bmatrix}. \tag{7}$$

Since a node always covers itself, all the $f_{nn}$ entries are set equal to ‘1’ in the water fraction matrix. Table 1 represents the computed water fraction matrix for the hypothetical network of Scenario 1.
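Equations (2) to (6) amount to a recursion over the network in flow order: the fraction vector of a node is the flow-weighted sum of the fraction vectors of its immediate upstream nodes, with the node's own entry set to 1. The sketch below illustrates this on a small four-node network. The flow values and the function name `water_fractions` are hypothetical and are not taken from Figure 2; the standard-library `graphlib` module (Python 3.9+) is used only to process nodes after their upstream neighbours.

```python
from graphlib import TopologicalSorter

def water_fractions(flows):
    """flows: {(upstream, downstream): quantity of water in the link}.

    Returns {n: {j: fraction of the water at node n that passed through j}}.
    """
    inflow = {}
    for (up, down), quantity in flows.items():
        inflow.setdefault(down, {})[up] = quantity
        inflow.setdefault(up, {})

    # Process each node only after all of its upstream nodes
    graph = {n: set(sources) for n, sources in inflow.items()}
    order = TopologicalSorter(graph).static_order()

    fractions = {}
    for n in order:
        total = sum(inflow[n].values())
        fractions[n] = {n: 1.0}                 # a node always covers itself
        for k, quantity in inflow[n].items():
            f_nk = quantity / total             # Equations (2)-(4)
            for j, f_kj in fractions[k].items():
                # Equation (6): accumulate f_nk * f_k
                fractions[n][j] = fractions[n].get(j, 0.0) + f_nk * f_kj
    return fractions


# Hypothetical flows: nodes 1 and 2 feed node 3; nodes 3 and 2 feed node 4
flows = {(1, 3): 20.0, (2, 3): 10.0, (3, 4): 15.0, (2, 4): 5.0}
f = water_fractions(flows)
print({j: round(v, 2) for j, v in sorted(f[4].items())})
# {1: 0.5, 2: 0.5, 3: 0.75, 4: 1.0}
```

Node 2 reaches node 4 both directly and through node 3, so its entry combines the two paths (0.25 + 0.75 × 1/3 = 0.5).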

After the water fraction matrix is constructed, a coverage criterion is established. If d is the total demand of the entire network and $d_i$ is the demand of a particular node i, then monitoring node i covers the fraction $d_i/d$ of the network. Covering the entire network would therefore logically require every node to be monitored, which might not be possible for economic reasons. To make the selection process easier, a coverage criterion has to be set. In the current study, a coverage criterion of 50% is used. Under this criterion, upstream nodes that deliver 50% or more of the water reaching a downstream node are considered covered and are marked ‘1’ in the water coverage matrix; otherwise, they are marked ‘0’. The coverage criterion establishes a tradeoff between the number of monitoring stations and the demand coverage of the network. A small value of the coverage criterion may suggest fewer monitoring stations to achieve a desired level of demand coverage. Conversely, a large value may suggest a larger number of monitoring stations to achieve the same level of demand coverage.

Under the 50% coverage criterion, the water fraction matrix (Table 1) is converted into a water coverage matrix (Table 2). An entry of ‘1’ indicates that the node is covered by a given monitoring station, while an entry of ‘0’ indicates that it is not.
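The conversion from water fractions to 0/1 coverage entries is a simple threshold. The sketch below applies it to a small excerpt of Table 1 (rows for nodes 1, 2, and 4; columns for nodes 4 and 10). The function name `coverage_matrix` is illustrative.

```python
def coverage_matrix(fractions, criterion=0.5):
    """Mark entries at or above the coverage criterion as 1, others as 0."""
    return [[1 if value >= criterion else 0 for value in row]
            for row in fractions]


# Excerpt of Table 1: rows = nodes 1, 2, 4; columns = nodes 4, 10
excerpt = [
    [0.67, 0.71],   # node 1
    [0.33, 0.21],   # node 2
    [1.00, 0.64],   # node 4
]

print(coverage_matrix(excerpt))        # [[1, 1], [0, 0], [1, 1]]
print(coverage_matrix(excerpt, 0.7))   # [[0, 1], [0, 0], [1, 0]]
```

The second call shows how a stricter criterion leaves fewer entries covered, so more stations would be needed for the same demand coverage.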

Table 2 shows that node 1 in the hypothetical network covers only itself, node 5 covers nodes 2 and 5, and node 12 covers nodes 2, 5, and 12. Similar analyses are made for all nodes of the network. The demand vector of the network is constructed as $d = (d_i)$, where $d_i$ is the demand of node i and i runs over the nodes of the distribution network. The demand vector is the vector of known nodal demands. Hypothetical values were assumed for the hypothetical network. For real networks, however, the values of the demand vector can be determined from design sheets or water bills.

Table 1. Water Fraction Matrix (Scenario 1).

Sample                           Water Fraction through Nodes
at node     1     2     3     4     5     6     7     8     9    10    11    12    13    14    15
      1     1     0     0  0.67     0     0     0  0.61  0.79  0.71  0.08  0.19  0.19     0  0.19
      2     0     1     0  0.33     1     0     0  0.39     0  0.21  0.03  0.68  0.68     0  0.68
      3     0     0     1     0     0     1     1     0  0.22  0.09  0.89  0.12  0.12     0  0.12
      4     0     0     0     1     0     0     0   0.9     0  0.64  0.08  0.27  0.27     0  0.27
      5     0     0     0     0     1     0     0  0.09     0     0     0  0.61  0.61     0  0.61
      6     0     0     0     0     0     1     0     0  0.22     0  0.40  0.05  0.05     0  0.05
      7     0     0     0     0     0     0     1     0     0     0  0.48  0.06  0.06     0  0.06
      8     0     0     0     0     0     0     0     1     0     0     0  0.29  0.29     0  0.29
      9     0     0     0     0     0     0     0     0     1  0.36  0.04  0.01  0.01     0  0.01
     10     0     0     0     0     0     0     0     0     0     1  0.12  0.02  0.02     0  0.02
     11     0     0     0     0     0     0     0     0     0     0     1  0.13  0.13     0  0.13
     12     0     0     0     0     0     0     0     0     0     0     0     1     1     0     1
     13     0     0     0     0     0     0     0     0     0     0     0     0     1     0     1
     14     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
     15     0     0     0     0     0     0     0     0     0     0     0     0     0     0     1

For the hypothetical network, which has 15 nodes, the demand vector is constructed as follows:

$$d = (30,\ 25,\ 30,\ 30,\ 30,\ 30,\ 30,\ 45,\ 30,\ 30,\ 35,\ 50,\ 0,\ 0,\ 20).$$

4.1 Multiple Flow Scenarios

The flow demands in a water distribution network vary several times during the day. To incorporate this effect, multiple flow scenarios are considered. For this purpose, the hypothetical model is modified to represent a flow scenario other than Scenario 1. Figure 3 shows the modified quantities and directions of flow, as well as the demands at each node, for Scenario 2. The corresponding water coverage matrix and nodal demand vector for Scenario 2 can be calculated in the same way as for Scenario 1.

Figure 3. Hypothetical water distribution network (scenario 2) (after Boulos and Altman [8])


Once the coverage matrix is developed, the Genetic Algorithm model can be applied to identify the optimum water quality stations in a water distribution network for both single flow (Scenario 1) and multiple flows (the combination of Scenario 1 and Scenario 2). Sections 4.2–4.9 describe the steps involved in applying the GA.

4.2 Raw Fitness Evaluation

Unlike in other optimization techniques, the objective function in a Genetic Algorithm is formulated as a “fitness function”. This fitness function is derived based on either maximization or minimization of the objective(s). The first fitness value of a possible solution is called the “raw fitness value”, which is determined by Equation (8):

$$F_r = 100 - \left[\frac{Z - Q}{Z} \times 100\right] \tag{8}$$

where $F_r$ is the raw fitness, Z is the best fitness, and Q is the demand coverage that a particular solution can achieve.

An initial step prior to optimization using the GA is to define an ideal or “best” solution. All possible solutions generated during optimization are then compared with this “best” value to determine their relative goodness. The closer the fitness value of a possible solution falls to the “best”, the greater is its chance of selection. In this study, the “best” value is tied to the total supply of water in the distribution network: it may be set equal to or greater than the total input supply of water to the network, but not less. For the hypothetical water distribution network, Scenario 1, the total input supply is 415 units, and the “best” is set equal to 500. Therefore:

$$Z \geq W, \tag{9}$$

where W is the total input supply.

Based on the maximization function $\left(\max \sum_{i=1}^{n} d_i y_i\right)$, Q is evaluated from Table 2 (Water Coverage Matrix). A resultant coverage vector is constructed for this purpose by “ORING” those columns of the coverage matrix that are indicated by the possible solution. This process of “ORING” is similar to taking the union of the 1’s in these column vectors. Q is then evaluated by summing the products of the entries of the resultant coverage vector and the corresponding entries of the nodal demand vector. For example, if (3 8 11 15) is the first possible solution for the hypothetical network (Scenario 1), then the resultant coverage vector is constructed by “ORING” columns 3, 8, 11, and 15 of the coverage matrix shown in Table 2. Table 3 shows the evaluation of the raw fitness of a possible solution for Scenario 1.

Table 2. Water Coverage Matrix (Scenario 1).

Sample              Water fraction through nodes
at node   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
      1   1   0   0   1   0   0   0   1   1   1   0   0   0   0   0
      2   0   1   0   0   1   0   0   0   0   0   0   1   1   0   1
      3   0   0   1   0   0   1   1   0   0   0   1   0   0   0   0
      4   0   0   0   1   0   0   0   0   0   1   0   0   0   0   0
      5   0   0   0   0   1   0   0   0   0   0   0   1   1   0   1
      6   0   0   0   0   0   1   0   0   0   0   0   0   0   0   0
      7   0   0   0   0   0   0   1   0   0   0   0   0   0   0   0
      8   0   0   0   0   0   0   0   1   0   0   0   0   0   0   0
      9   0   0   0   0   0   0   0   0   1   0   0   0   0   0   0
     10   0   0   0   0   0   0   0   0   0   1   0   0   0   0   0
     11   0   0   0   0   0   0   0   0   0   0   1   0   0   0   0
     12   0   0   0   0   0   0   0   0   0   0   0   1   1   0   1
     13   0   0   0   0   0   0   0   0   0   0   0   0   1   0   1
     14   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
     15   0   0   0   0   0   0   0   0   0   0   0   0   0   0   1

Similarly, based on the maximization function $\left(\sum_{k=1}^{m} \sum_{i=1}^{n} d_{ik}\, y_{ik}\right)$ defined for multiple scenarios, Q is first evaluated for the first scenario and then for the 2nd, 3rd, and so on. The total Q is then evaluated by adding all the Q values such that:

$$Q = \sum_{k=1}^{m} Q_k, \tag{10}$$

where m is the number of scenarios.

Table 3. Evaluation of “Q”.

       Columns obtained from Table 2          Resultant coverage      Nodal demand   Q
Row    Node 3   Node 8   Node 11   Node 15    vector after “ORING”    vector         (5)*(6)
       (1)      (2)      (3)       (4)        (5)                     (6)
 1     0        1        0         0          1                       30             30
 2     0        0        0         1          1                       25             25
 3     1        0        1         0          1                       30             30
 4     0        0        0         1          1                       30             30
 5     0        0        0         1          1                       30             30
 6     0        0        0         0          0                       30              0
 7     0        0        0         0          0                       30              0
 8     0        1        0         0          1                       45             45
 9     0        0        0         0          0                       30              0
10     0        0        0         0          0                       30              0
11     0        0        1         0          1                       35             35
12     0        0        0         1          1                       50             50
13     0        0        0         1          1                        0              0
14     0        0        0         0          0                        0              0
15     0        0        0         1          1                       20             20

∑ Q = 295
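The evaluation in Table 3 can be reproduced in a few lines of code. The following is a minimal Python sketch, not the authors' MATLAB program: the function name `raw_fitness` and the list layout are assumptions made for illustration, while the four coverage columns and the demand vector are transcribed from Table 3.

```python
# Columns (1)-(4) of Table 3: coverage columns for candidate stations 3, 8, 11, 15.
COL_3 = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
COL_8 = [1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
COL_11 = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
COL_15 = [0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1]

# Column (6) of Table 3: nodal demand vector.
DEMAND = [30, 25, 30, 30, 30, 30, 30, 45, 30, 30, 35, 50, 0, 0, 20]


def raw_fitness(columns, demand):
    """OR the coverage columns row by row, then sum the demand of covered rows."""
    # A row is covered (1) if at least one selected column has a 1 in that row.
    covered = [int(any(bits)) for bits in zip(*columns)]
    # Q is the sum of element-wise products of coverage and demand.
    q = sum(c * d for c, d in zip(covered, demand))
    return covered, q


covered, q = raw_fitness([COL_3, COL_8, COL_11, COL_15], DEMAND)
print(covered)  # column (5) of Table 3
print(q)        # 295, the total reported in Table 3
```

Storing each station's coverage as a 0/1 column keeps the “ORing” step a plain row-wise `any`, so the same function works for any number of monitoring stations.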

For multiple scenarios, the total Q is 510, and the best value can be defined as equal to or greater than 955 units (i.e., the sum of the total water supply of Scenario 1, which is 415 units, and the total water supply of Scenario 2, which is 540 units).
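Equation (10) and the bound quoted above amount to two sums. The sketch below is illustrative only: the function name `total_fitness` is an assumption, and the Scenario 2 value of 215 is not reported in the text; it is inferred here as 510 − 295 purely so that the stated total is reproduced.

```python
def total_fitness(per_scenario_q):
    """Equation (10): the total Q is the sum of the per-scenario values Q_k."""
    return sum(per_scenario_q)


# Q_1 = 295 is taken from Table 3. Q_2 = 215 is an inferred placeholder
# (510 - 295); the paper only states the total for the two scenarios.
q_total = total_fitness([295, 215])

# Total water supply of Scenario 1 (415 units) plus Scenario 2 (540 units):
# no solution can cover more demand than this.
q_upper_bound = 415 + 540

# Share of the total supply covered by the example solution.
coverage_fraction = q_total / q_upper_bound
print(q_total, q_upper_bound, round(coverage_fraction, 3))
```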

4.3 Scaled Fitness Evaluation

Although the goal of a GA is for fit members to dominate the population pool, the search space must remain wide enough for the truly fit members to be identified. This can be achieved by scaling the fitness values, which adds a safeguard against premature dominance of the population by a single member. Such a prematurely dominant member may in fact be an unhealthy or weak member that could not otherwise be identified as such. Scaling is extremely important in higher-order populations, where the average and maximum fitness values fall very close to each other and the search space needs to be widened. It may slightly delay the convergence of the algorithm, since more members are kept under examination.

A simple linear scaling proposed by Goldberg [10] is adopted in this study, such that the average scaled fitness is kept equal to the average raw fitness. Thus, the scaled fitness, Fs, of a possible solution can be defined as:

$$F_s = F_r \, C_1 + C_2 , \qquad (11)$$

where Fs is the scaled fitness, Fr is the raw fitness, and C1 and C2 are the constants of the linear scaling. Appendix I shows how the constants C1 and C2 are evaluated [10].
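A short sketch may help make Equation (11) concrete. It assumes the linear scaling procedure described by Goldberg [10] with a scaling multiple of 1.5, which is what the constants in Appendix I appear to use; the function names and the raw fitness values in the example are hypothetical and are not taken from the paper.

```python
def linear_scaling_constants(raw, multiple=1.5):
    """Return (C1, C2) so that the scaled average equals the raw average."""
    fr_min, fr_max = min(raw), max(raw)
    fr_avg = sum(raw) / len(raw)
    if fr_max == fr_min:
        # Degenerate population (all fitness values equal): leave unchanged.
        return 1.0, 0.0
    if fr_min > (multiple * fr_avg - fr_max) / (multiple - 1.0):
        # Normal case: the scaled maximum becomes `multiple` times the average.
        delta = fr_max - fr_avg
        c1 = (multiple - 1.0) * fr_avg / delta
        c2 = fr_avg * (fr_max - multiple * fr_avg) / delta
    else:
        # Fallback: scale as far as possible while mapping the minimum to zero.
        delta = fr_avg - fr_min
        c1 = fr_avg / delta
        c2 = -fr_min * fr_avg / delta
    return c1, c2


def scale_fitness(raw, multiple=1.5):
    """Apply Equation (11), Fs = Fr * C1 + C2, to every raw fitness value."""
    c1, c2 = linear_scaling_constants(raw, multiple)
    return [fr * c1 + c2 for fr in raw]


# Hypothetical raw fitness values, used only to exercise both branches.
raw_a = [200, 250, 295, 300, 325]
raw_b = [10, 290, 300, 300, 300]
scaled_a = scale_fitness(raw_a)
scaled_b = scale_fitness(raw_b)
print(scaled_a)
print(scaled_b)
```

In the first example the scaled maximum is 1.5 times the average; in the second, the minimum is so low that this would produce a negative scaled fitness, so the minimum is mapped to zero instead. In both cases the average is preserved.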

4.4 Random Generation of Initial Population

After writing the fitness function, the next step is to generate an initial population of possible solutions. Here, an analogy with nature is established by creating within a computer a set of solutions called the “population” [11]. Each solution string, called a “chromosome”, consists of decision variables called genes or alleles. The number of chromosomes to be generated depends on the chosen population size. As there is no set rule for defining the exact size of the population, the judgment of the GA user is employed; this judgment comes from experience and knowledge of GA implementation.

In this study, X chromosomes (possible solutions) are randomly generated in the first population. These randomly generated chromosomes serve as parents for selection. The length of a chromosome (i.e., the number of genes) is kept equal to the number of monitoring stations required. A “gene” takes an integer value between 1 and n, where n is the total number of nodes in the distribution network. During the random generation of chromosomes, care is taken to ensure that all the genes in a chromosome have different values in the range of 1 to n. Table 4 shows a randomly generated initial population of 10 chromosomes for locating four water quality monitoring stations in the hypothetical network.

Table 4. Initial Population.

Index Number   Initial Population
(1)            (2)
 1             12 15  5  3
 2              2 15 13  3
 3             11 15  5  1
 4              3 15  4 10
 5              8  4  1  7
 6              9  3  1  7
 7             10  6  1 13
 8             15 12  9  2
 9              4 14  1  7
10              9 11  7  2
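A population like the one in Table 4 can be generated as sketched below. This is an illustrative Python sketch rather than the authors' code; the function names and the `seed` parameter are assumptions. Sampling without replacement guarantees that all genes in a chromosome are different.

```python
import random


def random_chromosome(n_nodes, n_stations, rng):
    """Draw n_stations distinct node numbers from 1..n_nodes."""
    return rng.sample(range(1, n_nodes + 1), n_stations)


def initial_population(pop_size, n_nodes, n_stations, seed=None):
    """Generate pop_size random chromosomes with no repeated gene in any of them."""
    rng = random.Random(seed)
    return [random_chromosome(n_nodes, n_stations, rng) for _ in range(pop_size)]


# Same dimensions as Table 4: 10 chromosomes, 15 nodes, 4 monitoring stations.
population = initial_population(pop_size=10, n_nodes=15, n_stations=4, seed=1)
for index, chromosome in enumerate(population, start=1):
    print(index, chromosome)
```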

4.5 Selection of Parents

The randomly generated chromosomes must then be selected such that only the fitter ones mate to produce children for the subsequent populations. In this study, tournament selection is used as the method of selection.

In this method, the whole population of chromosomes is divided into subgroups, each containing two or more chromosomes, and the fittest chromosome is selected from each subgroup. To select the remaining members of the new population, the source population is shuffled and the same process is repeated until all the members of the new population are selected, so that the size of the population remains constant throughout the simulation. Table 5 shows the tournament selection of the parents for the hypothetical network.

With this method, a highly fit parent can be selected twice, whereas a less-fit parent still has a good probability of being selected for mating.
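The selection procedure described above can be sketched as follows. This is a hedged illustration, not the authors' implementation: the function names are assumptions, the sketch shuffles the population before every pass, and `toy_fitness` (the sum of the node numbers) is a meaningless placeholder standing in for the scaled fitness of Section 4.3. The population is the one listed in Table 4.

```python
import random

# Initial population from Table 4.
POPULATION = [
    [12, 15, 5, 3], [2, 15, 13, 3], [11, 15, 5, 1], [3, 15, 4, 10],
    [8, 4, 1, 7], [9, 3, 1, 7], [10, 6, 1, 13], [15, 12, 9, 2],
    [4, 14, 1, 7], [9, 11, 7, 2],
]


def toy_fitness(chromosome):
    """Placeholder fitness for the demo only; NOT the paper's fitness."""
    return sum(chromosome)


def tournament_selection(population, fitness, rng, group_size=2):
    """Repeat shuffled tournaments until the new population has the same size."""
    selected = []
    while len(selected) < len(population):
        shuffled = population[:]
        rng.shuffle(shuffled)
        # Split the shuffled population into subgroups and keep each winner.
        for start in range(0, len(shuffled), group_size):
            group = shuffled[start:start + group_size]
            selected.append(max(group, key=fitness))
            if len(selected) == len(population):
                break
    return selected


parents = tournament_selection(POPULATION, toy_fitness, random.Random(0))
for parent in parents:
    print(parent)
```

With subgroups of two and a population of ten, two passes are needed, so the fittest chromosome wins one tournament in each pass and enters the parent pool twice.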

4.6 Crossover

In natural evolution, children arise from their parents when the parents mate and their chromosomes cross. The chromosomes are concatenated at the point of crossover, and new chromosomes (children) are produced. These chromosomes differ in structure from their parent chromosomes but inherit characteristic behavior from them. A similar concept is used in GA, where it is called the “crossover operation”.

In this operation, parents are made to produce children. Several methods are available to accomplish this task; the basic idea is to generate children that bear characteristics inherited from their parents. Every GA user can develop their own technique for producing children from parents, depending on the specific constraints, so long as the objective of creating new solutions from old ones is achieved.

Once the parents are selected by tournament selection, they are mated to produce children. In the current study, two parents produce two children, and single-point crossover is adopted for this purpose. The crossover point is selected at random using the built-in Matlab function “randint”: a random integer is generated in the range of 1 to (S–1) for each crossover operation, where S is the number of monitoring stations.

A child is termed “perfect” if all of its genes are different. If two or more genes are the same after crossover, the child is termed “faulty”; in terms of this problem, a faulty child is a solution that contains a particular node more than once, which would imply a duplicated monitoring point. At the end of the crossover operation, only perfect children are allowed to enter the mutation pool; in the case of a faulty child, the corresponding parent enters the mutation pool instead. The mutation pool therefore contains no faulty solutions. Table 6 shows the application of crossover to the parents obtained in Table 5.
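The crossover rule, including the replacement of a faulty child by its corresponding parent, can be sketched as below. The function names are assumptions made for illustration, and Python's `random.randint` is used only as an analogue of the Matlab function named in the text. The example pairs are the first and last pairs of Table 6.

```python
import random


def is_perfect(child):
    """A child is 'perfect' when no node appears more than once."""
    return len(set(child)) == len(child)


def single_point_crossover(parent_a, parent_b, point):
    """Swap the tails of two parents after `point`; repair faulty children."""
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    # A faulty child is replaced by its corresponding parent.
    out_a = child_a if is_perfect(child_a) else list(parent_a)
    out_b = child_b if is_perfect(child_b) else list(parent_b)
    return out_a, out_b


def random_crossover_point(n_stations, rng):
    """Random integer in the range 1 to S-1, both ends included."""
    return rng.randint(1, n_stations - 1)


# First pair of Table 6, crossover after the second gene.
first = single_point_crossover([2, 15, 13, 3], [3, 15, 4, 10], 2)
# Last pair of Table 6, crossover after the second gene.
last = single_point_crossover([15, 12, 9, 2], [9, 11, 7, 2], 2)
print(first)
print(last)
```

In the first pair the second child would be ( 3 15 13 3 ), which repeats node 3, so the parent ( 3 15 4 10 ) enters the mutation pool in its place, as in Table 6.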

Table 5. Tournament Selection of Parents for Scenario 1.

Index   Source Population   Rejected   Selected Parent
 1      12 15  5  3         X
         2 15 13  3                     2 15 13  3
 2      11 15  5  1         X
         3 15  4 10                     3 15  4 10
 3       8  4  1  7                     8  4  1  7
         9  3  1  7         X
 4      10  6  1 13                    10  6  1 13
        15 12  9  2         X
 5       4 14  1  7         X
         9 11  7  2                     9 11  7  2
 6      11 15  5  1         X
         3 15  4 10                     3 15  4 10
 7       4 14  1  7         X
        10  6  1 13                    10  6  1 13
 8       9  3  1  7         X
         2 15 13  3                     2 15 13  3
 9      12 15  5  3         X
        15 12  9  2                    15 12  9  2
10       9 11  7  2                     9 11  7  2
         8  4  1  7         X

X marks the chromosome that loses its tournament.

4.7 Mutation

After crossover, an operation called “mutation” is applied. Because a GA is fundamentally random, a solution that appears in the initial populations may not appear again for many successive populations, since fitter parents force it to die out. Such a lost solution might nevertheless help in building the optimal solution in later GA runs. The mutation operator therefore randomly alters the structure of a chromosome, giving lost feasible solutions the chance to reappear and contribute to the generation of the optimal solution in forthcoming populations. The rate at which mutation is applied needs to be very low. A rate within the range of 1–3% is suggested; however, studies conducted by Tate and Smith [12] suggested a higher rate of mutation for non-binary encoding of strings. Thus, in this study, mutation is carried out at a rate of 5%.

In the mutation pool, a random real number in the range of 0 to 1 is generated for every gene of each chromosome. If the random number is less than 0.05, the value of the corresponding gene is changed randomly, such that it differs from the values of the remaining genes in the same chromosome. Table 7 summarizes the application of mutation to the crossed population obtained in Table 6.
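The mutation step can be sketched as follows. This is an illustrative Python sketch with assumed function and parameter names, not the authors' code. A gene is replaced only by a node that does not already appear in the chromosome, so the mutated solution never contains a duplicated monitoring point.

```python
import random


def mutate(chromosome, n_nodes, rate, rng):
    """Mutate each gene with probability `rate`, keeping all genes distinct."""
    mutated = list(chromosome)
    for i in range(len(mutated)):
        # One random real number in [0, 1) is drawn for every gene.
        if rng.random() < rate:
            # Candidate nodes are those not currently used by the chromosome.
            unused = [node for node in range(1, n_nodes + 1) if node not in mutated]
            if unused:
                mutated[i] = rng.choice(unused)
    return mutated


rng = random.Random(0)
# Second chromosome of the mutation pool in Table 7, with the 5% rate of the study.
print(mutate([3, 15, 4, 10], n_nodes=15, rate=0.05, rng=rng))
```

With a rate of 0.05 and four genes, most chromosomes pass through unchanged, which matches Table 7, where only five of the forty genes are altered.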

Table 6. Crossover for Scenario 1.

Index    Parents from Table 5    Children produced    Selection    Mutation
Number   with crossover point    after crossover      Status       Pool
(1)      (2)                     (3)                  (4)          (5)
 1       2 15↓ 13 3              2 15 4 10            Perfect      2 15 4 10
 2       3 15↓ 4 10              3 15 13 3            Faulty       3 15 4 10
 3       8 4 1↓ 7                8 4 1 13             Perfect      8 4 1 13
 4       10 6 1↓ 13              10 6 1 7             Perfect      10 6 1 7
 5       9 11↓ 7 2               9 11 4 10            Perfect      9 11 4 10
 6       3 15↓ 4 10              3 15 7 2             Perfect      3 15 7 2
 7       10↓ 6 1 13              10 15 13 3           Perfect      10 15 13 3
 8       2↓ 15 13 3              2 6 1 13             Perfect      2 6 1 13
 9       15 12↓ 9 2              15 12 7 2            Perfect      15 12 7 2
10       9 11↓ 7 2               9 11 9 2             Faulty       9 11 7 2

The ( ↓ ) sign represents the point where the crossover operation is performed.

Table 7. Mutation Pool for Scenario 1.

Index    Mutation Pool from Table 6                          Mutated
Number   (genes, with the random number drawn for each)      Population
(1)      (2)                                                 (3)
 1       Genes             2       15      4       10        2 15 4 10
         Random Numbers    0.752   0.255   0.625   0.859
 2       Genes             3       15      4       10        3 15 6 10
         Random Numbers    0.545   0.458   0.015*  0.075
 3       Genes             8       4       1       13        8 4 1 13
         Random Numbers    0.458   0.214   0.958   0.245
 4       Genes             10      6       1       7         10 2 1 7
         Random Numbers    0.329   0.010*  0.385   0.495
 5       Genes             9       11      4       10        9 11 4 10
         Random Numbers    0.215   0.859   0.438   0.215
 6       Genes             3       15      7       2         12 15 7 2
         Random Numbers    0.033*  0.782   0.529   0.125
 7       Genes             10      15      13      3         10 15 13 3
         Random Numbers    0.354   0.857   0.081   0.958
 8       Genes             2       6       1       13        2 6 1 13
         Random Numbers    0.214   0.354   0.852   0.756
 9       Genes             15      12      7       2         15 14 7 2
         Random Numbers    0.325   0.024*  0.665   0.682
10       Genes             9       11      7       2         9 11 7 5
         Random Numbers    0.215   0.710   0.257   0.045*

* Marked random numbers are less than the adopted mutation rate of 5%.

4.8 Construction of Array of Best Solutions

Once the mutation operation is completed, the best member of the population is selected and stored in an “array of best solutions”. This array therefore contains only the best solution from each population, that is, the member with the highest fitness among the members of that population.


Table 8. Array of Best Solutions for Scenario 1.

Population Number   Array of Best Solution   Solution Fitness
(1)                 (2)                      (3)
 1                   3 15  4 10              59.55
 2                  15  4 10  6              65.55
 3                   8 10  3 15              68.55
 4                   8 10  3 15              69.30
 5                  10  6 15  8              71.85
 6                   6 15  8 10              72.90
 7                  15  8 10  6              77.85
 8                   6 15  8 10              81.30
 9                   6 15  8 10              81.45
10                  11 15  8 10              86.40
11                   8 10 11 15              89.10
12                   8 10 11 15              84.30
13                   8 10 11 15              93.75
14                  15  8 10 11              96.90*
15                  11 15  8 10              96.45
16                  11 15  8 10              95.55
17                  15  8 10 11              95.40
18                  15  8 10 11              87.90
19                  15  8 10 11              87.45
20                  15  8 10 11              93.30

* Maximum scaled fitness

For the hypothetical network, a tentative array of best solutions is shown in Table 8, obtained after running the program for 20 iterations (populations) with four monitoring stations and a population size of 10.

4.9 Selection of Optimal Solution

Once the “array of best solutions” is obtained, the optimal solution is identified as the member of the array with the highest fitness. From Table 8, the best or optimal solution after 20 populations is ( 15 8 10 11 ), selected for its fitness value of 96.90, the highest among all members of the array.
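Picking the optimal solution from the array of best solutions is a single maximum over the stored fitness values. The sketch below uses the twenty entries of Table 8; the variable and function names are assumptions made for illustration.

```python
# (solution, scaled fitness) for populations 1..20, transcribed from Table 8.
ARRAY_OF_BEST = [
    ((3, 15, 4, 10), 59.55), ((15, 4, 10, 6), 65.55), ((8, 10, 3, 15), 68.55),
    ((8, 10, 3, 15), 69.30), ((10, 6, 15, 8), 71.85), ((6, 15, 8, 10), 72.90),
    ((15, 8, 10, 6), 77.85), ((6, 15, 8, 10), 81.30), ((6, 15, 8, 10), 81.45),
    ((11, 15, 8, 10), 86.40), ((8, 10, 11, 15), 89.10), ((8, 10, 11, 15), 84.30),
    ((8, 10, 11, 15), 93.75), ((15, 8, 10, 11), 96.90), ((11, 15, 8, 10), 96.45),
    ((11, 15, 8, 10), 95.55), ((15, 8, 10, 11), 95.40), ((15, 8, 10, 11), 87.90),
    ((15, 8, 10, 11), 87.45), ((15, 8, 10, 11), 93.30),
]


def select_optimal(array_of_best):
    """Return (population number, solution, fitness) of the fittest entry."""
    index = max(range(len(array_of_best)), key=lambda i: array_of_best[i][1])
    solution, fitness = array_of_best[index]
    return index + 1, solution, fitness


population_number, solution, fitness = select_optimal(ARRAY_OF_BEST)
print(population_number, solution, fitness)  # 14 (15, 8, 10, 11) 96.9
```

From population 10 onward every entry contains the same four nodes (8, 10, 11, and 15) in different orders, so the stations themselves are settled well before the fitness peaks.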

Figure 4. Generations vs. maximum scaled fitness of single flow scenario<br />


The hypothetical network was run using a program named QUDIS, developed in MATLAB, to identify 4 monitoring stations (S). For this purpose, 500 populations (P) were set for both the single and multiple scenarios, and the simulation was carried out for the two cases (single flow and multiple flow) independently. The best values for fitness (Z) were set to 500 for the single scenario and 1000 for the multiple scenarios. The population size (X) was kept constant at 100, with a mutation rate (R) of 5% in both cases.

Figure 5. Generations vs. maximum scaled fitness of multiple flow<br />

It took about 1 minute and 56 seconds to run 500 iterations on a 1400 MHz PC, and ( 11 15 8 10 ) was selected as the optimal set of water quality monitoring stations for single flow (Scenario 1). Figure 4 shows the maximum scaled fitness plotted against generations (P). As can be seen from the figure, the maximum scaled fitness of the population increases until it stabilizes after about 40 generations.

For multiple flows (Scenario 1 and Scenario 2), it took about 2 minutes and 3 seconds to run the 500 iterations on a 1400 MHz PC. Stations ( 8 6 15 10 ) were identified as the optimal water quality monitoring stations. Figure 5 shows the maximum scaled fitness plotted against generations. As the figure shows, the maximum scaled fitness increases until it stabilizes after about 45 generations.

5. MODEL VERIFICATION

To check the validity of the developed model, it was applied to two cases presented in the literature. The application of the developed methodology and the comparison of results are presented in the following paragraphs.

Case I

The first case is taken from Lee and Deininger [2]. The water distribution network consists of 7 nodes with only one source of supply, as shown in Figure 6(a). The model was run based on a 50% coverage demand criterion.

Simulation was run for X = 100, P = 500, S = 2, Z = 200, N = 7, and R = 0.05. The simulation time was about 52 seconds for 500 generations. Stations ( 5 6 ) were identified as the optimal water quality monitoring stations. This result is similar to the findings of Lee and Deininger [2]. Simulated results are shown in Figure 6(b).

Case II

The second case is taken from Kumar et al. [4]. The water network consists of 19 nodes with two sources of supply, as shown in Figure 7(a).


Figure 6. (a) Water distribution network of Case I (after Lee and Deininger, [2]), (b) generations vs. maximum scaled fitness for Case I.<br />


Figure 7. (a) Water distribution network of Case II (after Kumar et al. [4]), (b) generations vs. maximum scaled fitness for Case II.

Simulation was run for X = 100, P = 500, Z = 25000, N = 19, and R = 0.05. It took about 2 minutes and 14 seconds to run 500 generations. Stations ( 5 17 18 19 ) were identified as the optimal water quality monitoring stations, which were similar to those identified by Kumar et al. [4]. Simulated results are shown in Figure 7(b).

Model verification thus consisted of applying the developed GA-based methodology to the networks studied by Lee and Deininger [2] and Kumar et al. [4]. Two monitoring stations were identified for the first example network, as proposed by Lee and Deininger [2]. Similarly, four monitoring stations were identified for the second example network, as proposed by Kumar et al. [4].

This finding gives confidence in applying the developed model to identify water quality monitoring stations in any water distribution network.

6. CONCLUSION AND RECOMMENDATIONS

A methodology based on GA was developed to identify water quality monitoring stations in a water distribution network. It was illustrated with a hypothetical case and verified with two examples from the literature.

By identifying proper locations for monitoring stations over the entire water distribution network, the results of this research can contribute significantly to assuring that safe, better-quality water is delivered to consumers.

In this paper, water quality monitoring stations were located based on the quantity of flow, assuming that water quality at a downstream node is lower than water quality at an upstream node. This research can be extended in the future to consider multiple causes of water quality variation when identifying monitoring stations, such as the condition of the water distribution network, constituent concentration, and age of water.

ACKNOWLEDGEMENTS

The authors express their thanks to King Abdulaziz City for Science and Technology (KACST) for the financial support and to King Fahd University of Petroleum and Minerals (KFUPM) for providing the necessary help and research facilities to conduct the current study.

REFERENCES

[1] B. H. Lee, “Locating Monitoring Stations in Water Distribution Networks”, Ph.D. Dissertation, Environmental Health Sciences, The University of Michigan, 1990.

[2] B. H. Lee and R. A. Deininger, “Optimal Locations of Monitoring Stations in Water Distribution System”, J. Envir. Engrg., ASCE, 118(1) (1992), pp. 4–16.

[3] F. W. Pontius, Water Quality and Treatment (4th Edn.). New York: AWWA, McGraw-Hill, 1990.

[4] A. Kumar, M. L. Kansal, and G. Arora, “Identification of Monitoring Stations in Water Distribution System”, J. Envir. Engrg., ASCE, 123(8) (1997), pp. 746–752.

[5] A. Kessler, A. Ostfeld, and G. Sinai, “Detecting Accidental Contaminations in Municipal Water Networks”, J. Water Resour. Plang. and Mgmt., ASCE, 124(4) (1998), pp. 192–198.

[6] L. F. R. Reis, R. M. Porto, and F. H. Chaudhry, “Optimal Location of Control Valves in Pipe Networks by Genetic Algorithm”, J. Water Resour. Plang. and Mgmt., ASCE, 123(6) (1997), pp. 314–326.

[7] R. W. Meier and B. D. Barkdoll, “Sampling Design for Network Model Calibration Using Genetic Algorithm”, J. Water Resour. Plang. and Mgmt., ASCE, 126(4) (2000), pp. 245–250.

[8] P. F. Boulos and T. Altman, “Explicit Calculation of Water Quality Parameters in Pipe Distribution Network”, Journal of Civil Engineering Systems, 10(1) (1993), pp. 187–206.

[9] L. A. Rossman, EPANET – User’s Manual. Cincinnati, Ohio: United States Environmental Protection Agency (USEPA), 2000.

[10] D. E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning. Reading, Mass.: Addison-Wesley, 1989.

[11] D. A. Savic and G. A. Walters, “Genetic Algorithms for Least-Cost Design of Water Distribution Networks”, J. Water Resour. Plang. and Mgmt., ASCE, 123(2) (1997), pp. 67–77.

[12] D. M. Tate and A. E. Smith, “Expected Allele Coverage and the Role of Mutation in Genetic Algorithms”, Proceedings of the Fifth International Conference on Genetic Algorithms, University of Illinois at Urbana-Champaign, July 17–21, 1993, pp. 31–37.

Paper Received 4 February 2002, Revised 2 June 2002; Accepted 23 October 2002.<br />


APPENDIX I. Evaluation of the Linear Constants C1 & C2

If $fr_{min} > \dfrac{1.5\,fr_{avg} - fr_{max}}{0.5}$, then

$$C_1 = \frac{0.5\,fr_{avg}}{fr_{max} - fr_{avg}}, \quad \text{and} \quad C_2 = \frac{(fr_{max} - 1.5\,fr_{avg})\,fr_{avg}}{fr_{max} - fr_{avg}} .$$

If $fr_{min} < \dfrac{1.5\,fr_{avg} - fr_{max}}{0.5}$, then

$$C_1 = \frac{fr_{avg}}{fr_{avg} - fr_{min}}, \quad \text{and} \quad C_2 = \frac{-fr_{min}\,fr_{avg}}{fr_{avg} - fr_{min}} .$$

Where

frmin = Minimum raw fitness of population
frmax = Maximum raw fitness of population
fravg = Average raw fitness of population
