HYDROLOGICAL MODELLING AND RIVER BASIN MANAGEMENT
Geological Survey of Denmark and Greenland (Danmarks og Grønlands Geologiske Undersøgelse), Special Issue (Særudgivelse) 2007
Hydrological Modelling and River Basin Management
Doctoral Thesis
Jens Christian Refsgaard
Geological Survey of Denmark and Greenland
Danish Ministry of the Environment
This thesis has been accepted by the Faculty of Natural Science at the University of Copenhagen for public defence in fulfilment of the degree of Doctor of Science.
Copenhagen, 5 January 2007
Nils O. Andersen
Dean
The defence will take place on Friday 1 June 2007 at 14:00 in Anneksauditorium A, Studiestræde 6, University of Copenhagen.
Special Issue
Author: Jens Christian Refsgaard
Illustrations: Kristian A. Rasmussen and reproductions from existing publications
Cover: Kristian A. Rasmussen
Date: January 2007
The report is available on the internet at http://www.geus.dk/
ISBN 978-87-7871-185-4
Geological Survey of Denmark and Greenland (GEUS)
Øster Voldgade 10
DK-1350 København K
Tel: +45 38142000
Fax: +45 38142050
Email: geus@geus.dk
http://www.geus.dk/
Refsgaard JC – Doctoral Thesis January 2007
Hydrological Modelling and River Basin Management
Table of Contents
Dansk Resume
Abstract
1. Introduction
1.1 Water Resources Management and Hydrological Modelling
1.2 Objective and Content
2 Water Resources Management and the Modelling Process
2.1 Modelling as Part of the Planning and Management Process
2.2 Terminology and Scientific Philosophical Basis for the Modelling Process
2.2.1 Background
2.2.2 Terminology and guiding principles
2.2.3 Scientific philosophical aspects
2.3 Modelling Protocol
2.4 Classification of Models
3 Simulation of Hydrological Processes at Catchment Scale
3.1 Flow modelling
3.1.1 Groundwater/surface water model for the Suså catchment ([1], [2])
3.1.2 Application of SHE to catchments in India ([4], [5])
3.1.3 Intercomparison of different types of hydrological models ([6])
3.2 Reactive Transport
3.2.1 Oxygen transport and consumption in the unsaturated zone ([3])
3.2.2 An integrated model for the Danubian Lowland ([9])
3.2.3 Large scale modelling of groundwater contamination ([10])
3.3 Real-time Flood Forecasting
3.3.1 Intercomparison of updating procedures for real-time forecasting ([8])
4. Key Issues in Catchment Scale Hydrological Modelling
4.1 Scaling
4.1.1 Catchment heterogeneity
4.1.2 A scaling framework
4.1.3 Scaling - an example
4.1.4 Discussion – post evaluation
4.2 Confirmation, Verification, Calibration and Validation
4.2.1 Confirmation of conceptual model
4.2.2 Code verification
4.2.3 Model calibration
4.2.4 Model validation
4.2.5 Discussion – post evaluation
4.3 Uncertainty Assessment
4.3.1 Modelling uncertainty in a water resources management context
4.3.2 Data uncertainty
4.3.3 Parameter uncertainty
4.3.4 Model structure uncertainty
4.3.5 Discussion – post evaluation
4.4 Quality Assurance in Model based Water Management
4.4.1 Background
4.4.2 The HarmoniQuA approach
4.4.3 Organisational requirements for QA guidelines to be effective
4.4.4 Performance criteria and uncertainty – when is a model good enough
4.4.5 Discussion – post evaluation
5 Conclusions and Perspectives for Future Work
5.1 Summary of Main Scientific Contributions
5.2 Modelling Issues for Future Research
6 References
Appendices: Publications [1] – [15]
[1] Refsgaard JC, Hansen E (1982) A Distributed Groundwater/Surface Water Model for the Suså Catchment. Part 1: Model Description. Nordic Hydrology, 13, 299-310.
[2] Refsgaard JC, Hansen E (1982) A Distributed Groundwater/Surface Water Model for the Suså Catchment. Part 2: Simulations of Streamflow Depletions Due to Groundwater Abstraction. Nordic Hydrology, 13, 311-322.
[3] Refsgaard JC, Christensen TH, Ammentorp HC (1991) A model for oxygen transport and consumption in the unsaturated zone. Journal of Hydrology, 129, 349-369.
[4] Refsgaard JC, Seth SM, Bathurst JC, Erlich M, Storm B, Jørgensen GH, Chandra S (1992) Application of the SHE to catchments in India - Part 1: General results. Journal of Hydrology, 140, 1-23.
[5] Jain SK, Storm B, Bathurst JC, Refsgaard JC, Singh RD (1992) Application of the SHE to catchments in India - Part 2: Field experiments and simulation studies with the SHE on the Kolar subcatchment of the Narmada River. Journal of Hydrology, 140, 25-47.
[6] Refsgaard JC, Knudsen J (1996) Operational validation and intercomparison of different types of hydrological models. Water Resources Research, 32(7), 2189-2202.
[7] Refsgaard JC (1997) Parametrisation, calibration and validation of distributed hydrological models. Journal of Hydrology, 198, 69-97.
[8] Refsgaard JC (1997) Validation and Intercomparison of Different Updating Procedures for Real-Time Forecasting. Nordic Hydrology, 28, 65-84.
[9] Refsgaard JC, Sørensen HR, Mucha I, Rodak D, Hlavaty Z, Bansky L, Klucovska J, Topolska J, Takac J, Kosc V, Enggrob HG, Engesgaard P, Jensen JK, Fiselier J, Griffioen J, Hansen S (1998) An Integrated Model for the Danubian Lowland – Methodology and Applications. Water Resources Management, 12, 433-465.
[10] Refsgaard JC, Thorsen M, Jensen JB, Kleeschulte S, Hansen S (1999) Large scale modelling of groundwater contamination from nitrogen leaching. Journal of Hydrology, 221(3-4), 117-140.
[11] Thorsen M, Refsgaard JC, Hansen S, Pebesma E, Jensen JB, Kleeschulte S (2001) Assessment of uncertainty in simulation of nitrate leaching to aquifers at catchment scale. Journal of Hydrology, 242, 210-227.
[12] Refsgaard JC, Henriksen HJ (2004) Modelling guidelines – terminology and guiding principles. Advances in Water Resources, 27(1), 71-82.
[13] Refsgaard JC, Henriksen HJ, Harrar WG, Scholten H, Kassahun A (2005) Quality assurance in model based water management – review of existing practice and outline of new approaches. Environmental Modelling & Software, 20, 1201-1215.
[14] Refsgaard JC, Nilsson B, Brown J, Klauer B, Moore R, Bech T, Vurro M, Blind M, Castilla G, Tsanis I, Biza P (2005) Harmonised techniques and representative river basin data for assessment and use of uncertainty information in integrated water management (HarmoniRiB). Environmental Science and Policy, 8, 267-277.
[15] Refsgaard JC, van der Sluijs JP, Brown J, van der Keur P (2006) A framework for dealing with uncertainty due to model structure error. Advances in Water Resources, 29, 1586-1597.
Preface
The work presented in this thesis, together with the 15 publications published between 1982 and 2006, forms the material for evaluation for the degree of doctor scientiarum (dr. scient.) at the University of Copenhagen. The papers have all been published in peer reviewed international scientific journals. They are referred to by the numbers [1] to [15].
In the present report I have assembled and summarised my most important scientific contributions to catchment modelling, which has been my research interest during the past three decades. In this connection I wish to thank all my co-authors for a very inspiring co-operation during the years. Research does not take place in a vacuum, and without the interactions with them my work would not have been possible.
I wish to acknowledge former and present colleagues and managements at the three organisations where I have been employed. At the Institute of Hydrodynamics and Hydraulic Engineering, Technical University of Denmark (now Environment and Resources, DTU) I was given the opportunity to explore and develop new integrated groundwater/surface water catchment models at a time when hydrological modelling was still in its infancy. This showed me the enormous potential of this new field. At Danish Hydraulic Institute (now DHI Water & Environment) I was then entrusted with further development of modelling tools and with testing them in real life applications. This taught me the limitations and difficulties we encounter and the need to be humble when applying models in water resources management. Finally, the Geological Survey of Denmark and Greenland (GEUS) has provided a very inspiring scientific environment and given me the opportunity to get involved in broader international research projects that have matured many of my previous views and allowed me to assemble this work.
Special thanks go to Kristian A. Rasmussen, GEUS, for using his magic touch to polish some of the old dusty figures from the last century to make them easier to read in this thesis.
Last, but not least, I wish to thank my family for their patience and support and for accepting that I have always been too busy with this topic.
Copenhagen, January 2007
Jens Christian Refsgaard
"Life can only be understood backwards; but it must be lived forwards"
Søren Kierkegaard (1813-1855)
Dansk Resume (Danish Summary)
The publications and material in this doctoral thesis describe a series of scientific investigations of hydrological modelling at catchment scale in relation to water resources management. Each of the 15 publications focuses on parts of the overall topic, ranging from development of new concepts and model codes to model applications; from point scale to catchment scale; from modelling of water flows to transport of dissolved and reactive substances; from a focus on planning to real-time flood forecasting and further on to crosscutting issues and protocols for the modelling process itself.
Chapter 2 of the thesis presents protocols for hydrological modelling and a discussion of the interaction between hydrological modelling and water resources management. It also explains the terminology, the underlying scientific philosophy and the classification of model types used in the rest of the thesis. Chapter 3 contains summaries of model studies based on nine of the publications. The assessments of these publications' contributions to new knowledge at the time they were published, and of issues that were not addressed in them, show a considerable development over the last 25 years. For example, the first publications on development of new model codes contain nothing on verification of model code, validation of models against independent data or uncertainty assessments, issues that are considered very important today. The examples also illustrate how general issues such as scaling problems and model validation gradually evolved from experience and recognised problems in model studies that actually had other purposes. Chapter 4 then presents and discusses four general issues: (a) heterogeneity and scaling; (b) confirmation, verification, calibration and validation of models; (c) uncertainty assessments; and (d) quality assurance of the modelling process.
My main contributions to new scientific knowledge have been within the following five areas:
• New conceptual understanding and associated code development. The Suså model was based on a new understanding of the interaction between surface water and groundwater in moraine areas and brought new knowledge on how groundwater abstraction affects streams in such catchments.
• Validation of models. The work on rigorous principles for validation of models, and examples of applications for both ‘lumped conceptual’ and ‘distributed physically-based’ models, has been a cornerstone of the last 15 years of my research. In particular, the introduction of the term ‘conditional validation’ is new.
• Scaling. My work has not ‘solved’ the scaling problems, but it helps to clarify the fundamentally different methods, with a focus on their respective assumptions and limitations.
• Uncertainty assessments. A considerable part of my research activity over the last 10 years has focused on uncertainty aspects. My main contribution in this context has been the introduction of broader uncertainty aspects throughout the modelling process, together with the work on model structure uncertainty.
• Protocols for hydrological modelling and quality assurance of the modelling process. The comprehensive and detailed modelling protocol developed in the HarmoniQuA project is a formalisation and implementation of experience from the preceding 25 years of work with hydrological modelling. The new elements are the emphasis on (a) the interactive dialogue between modeller, water resources manager, reviewer, stakeholders and the public; (b) uncertainty assessments as a continuous element throughout the modelling process; (c) model validation; and (d) introduction of experience and subjective knowledge via external reviews.
Abstract
The publications and material presented in this thesis describe a series of scientific investigations on catchment modelling in relation to water resources management. Each of the 15 publications represents parts of the overall topic, ranging from development of new concepts and model codes to model applications; from point scale to catchment scale; from flow modelling to transport and reactive modelling; from planning type applications to real-time forecasting and further on to crosscutting issues and protocols for the modelling process.
The thesis starts with a presentation of protocols for the hydrological modelling process together with a discussion of the interaction between the water resources planning and management process and the hydrological modelling process. This includes a definition of terminology, a discussion of the underlying scientific philosophy and a classification of hydrological models. The following chapter comprises summaries of cases of simulation models based on nine of the publications. The post evaluations of the contributions to scientific knowledge in the publications and the issues not taken into account in the earlier publications reveal significant developments over the years. For example, the first publications, which focussed on development of new model codes, did not put any emphasis on rigorous verification or validation tests nor on uncertainty assessments, which are key issues today. The cases furthermore illustrate how general issues such as scaling and model validation gradually emerged from experiences and problems encountered in catchment studies that had other primary objectives. The next chapter then provides a presentation and discussion of four general issues: (a) catchment heterogeneity and scaling; (b) confirmation, verification, calibration and model validation; (c) uncertainty assessment; and (d) quality assurance in model based water management.
My main contributions to scientific knowledge have been in the following five areas:
• New conceptual understanding and code development. The Suså model was based on a new conceptual understanding of the surface water/groundwater interaction in moraine catchments and brought new insight into the effect of groundwater abstraction on streamflow in catchments with such hydrogeological characteristics.
• Model validation. The work on rather rigorous principles for model validation and the examples of their application for both lumped conceptual and distributed physically based models is a cornerstone of my research. In particular, the introduction of the term ‘conditional validation’ is novel.
• Scaling. The framework on scaling does not ‘solve’ the scaling problem but helps to clarify the applicable methodologies, with focus on their respective assumptions and limitations.
• Uncertainty assessment. During the past decade a considerable part of my research work has focussed on uncertainty aspects. I consider my main contributions in this respect to be the introduction of the broader uncertainty aspects integrated into the modelling framework and the work with model structure uncertainty.
• Modelling protocols and guidelines for quality assurance in the modelling process. The comprehensive modelling protocol developed within the HarmoniQuA project is a formalisation of experience and practices that have gradually emerged over the years. The novel elements are the emphasis on (a) the interactive dialogue between modeller, water manager, reviewer, stakeholders and the public; (b) uncertainty assessments throughout the modelling process; (c) model validation; and (d) experience and subjective knowledge introduced through external model reviews.
1. Introduction
1.1 Water Resources Management and Hydrological Modelling
“Scarcity and misuse of fresh water pose a serious and growing threat to sustainable development and protection of the environment. Human health and welfare, food security, industrial development and the ecosystems on which they depend, are all at risk, unless water and land resources are managed more effectively in the present decade and beyond than they have been in the past”. (ICWE, 1992)
“The fact that the world faces a water crisis has become increasingly clear in recent years. Challenges remain widespread and reflect severe problems in the management of water resources in many parts of the world. These problems will intensify unless effective and concerted actions are taken”. (WWAP, 2003)
The first of the above quotes presents the status and the future challenges facing hydrologists and water resources managers as summarised in the introductory paragraph of the Dublin Statement on Water and Sustainable Development (ICWE, 1992). The second quote is from the first chapter of the UN World Water Development Report “Water for People, Water for Life”, which is a collaborative effort of 23 UN agencies and convention secretariats co-ordinated by the World Water Assessment Programme.
Thus the challenges in water resources management are enormous, both at the global scale as illustrated above and at smaller scales as for instance outlined in the vision for the European water sector recently formulated by the European Water Supply and Sanitation Technology Platform (WSSTP, 2005).
The present thesis deals with hydrological modelling. It must be emphasised that modelling in itself is not sufficient to address these challenges. Modelling constitutes only one of several sets of tools that can be used to support water resources management. Computer based hydrological models have been developed and applied at an ever increasing rate during the past four decades. The key reasons for that are twofold: (a) improved models and methodologies are continuously emerging from the research community, and (b) the demand for improved tools increases with the increasing pressure on water resources. Overviews of the status and development trends in catchment scale hydrological modelling during this period can be found in Fleming (1975) and Singh (1995).
1.2 Objective and Content
The objective of this thesis is to present the contributions to scientific knowledge that have emerged from the research described in the 15 appended publications. I have structured the thesis with the aim of presenting my research contributions within a framework of catchment modelling and its application to support water resources management.
The next chapter (Chapter 2) therefore presents an overall framework of the water resources management and planning process, the modelling process and the interaction between these two processes. Here the terminology and modelling protocol are introduced and discussed. This chapter is based on publications [7], [12] and [13], i.e. mainly some of my most recent work.
Chapter 3 comprises a number of examples of simulation models ranging from point scale to catchment scale, from flow modelling to transport and reactive modelling and from planning type applications to real-time forecasting. This chapter is based on publications [1], [2], [3], [4], [5], [6], [8], [9] and [10], i.e. mainly some of my earlier work.
Chapter 4 then provides a presentation and discussion of key and cross-cutting issues in hydrological modelling such as scaling, model validation, uncertainty assessment and quality assurance. These issues, which were introduced as part of the overall framework in Chapter 2, are here discussed with reference to the experience and findings made in the publications. This chapter includes ideas, views and material from all 15 publications, but with more emphasis on some of the more general purpose publications [6], [7], [10], [11], [12], [13], [14] and [15].
Finally, Chapter 5 contains some conclusions and perspectives for future work.
Thus I have not structured the content of this report according to the chronology of my publications [1] – [15]. The reason for this is that my most recent work provides a broader and better overview of the topic and is thus better suited for providing a framework for my earlier work.
2 Water Resources Management and the Modelling Process
2.1 Modelling as Part of the Planning and Management Process
Integrated Water Resources Management (IWRM) is “a process, which promotes the co-ordinated development and management of water, land and related resources, in order to maximise the resultant economic and social welfare in an equitable manner without compromising the sustainability of vital ecosystems” (GWP, 2000). In the EU Water Framework Directive (WFD) Guidance Document on Planning Processes, planning is defined as “a systematic, integrative and iterative process that is comprised of a number of steps executed over a specified time schedule” (EC, 2003b). In all new guidelines on water resources management the importance of integrated approaches, cross-sectoral planning and public participation in the planning process is emphasised (GWP, 2000; EC, 2003b; Jønch-Clausen, 2004).
Models describing water flows, water quality, ecology and economy are being developed and used in increasing number and variety to support water management decisions. The interactions between the modelling process and the water management process are illustrated in Figs. 1 and 2. Fig. 1 shows the key actors in the water management process and the five steps into which the modelling process may typically be decomposed. The organisation that commissions a modelling study is denoted the water manager. This is often the competent authority, but can also be a stakeholder such as a water supply company. The role of the government is most often limited to providing the enabling environment such as legislation, research and information infrastructure. The typical cyclic and iterative character of the water management process, such as the WFD process, is illustrated in Fig. 2, where the interaction with the modelling process is illustrated by the large circle (water management) and the four smaller supporting circles (modelling). The WFD planning process, like most other planning processes, contains four main elements:
• Identification, including assessment of present status, analysis of impacts and pressures and establishment of environmental objectives. Here modelling may be useful, for example, for supporting assessments of the reference conditions and of the impacts of the various pressures (EC, 2004).
• Designing, including the set-up and analysis of a programme of measures designed to reach the environmental objectives in a cost-effective way. Here modelling will typically be used for supporting assessments of the effects and costs of various measures under consideration.
• Implementing the measures. Here on-line modelling may in some cases support the operational decisions to be made.
• Evaluation of the effects of the measures on the environment. Here modelling may support the monitoring in order to extract maximum information from the monitoring data, e.g. by indicating errors and inadequacies in the data and by filtering out the effects of climate variability.
[Figure 1: diagram linking the water management process to the modelling process. Elements shown on the water management side: The Environment, Problem Identification, Public Opinion, Stakeholders, Competent Authority, Government, Water Management Decision and Implementation. The modelling process is shown as five steps:]
1. Model Study Plan
• Identify problem
• Define requirements
• Assess uncertainties
• Prepare model study plan
2. Data and Conceptualisation
• Collect and process data
• Develop conceptual model
• Select model code
• Review and dialogue
3. Model Set-up
• Construct model
• Reassess performance criteria
• Review and dialogue
4. Calibration and Validation
• Model calibration
• Model validation
• Uncertainty assessment
• Review and dialogue
5. Simulation and Evaluation
• Model predictions
• Uncertainty assessment
• Review and dialogue
Fig. 1 The role of the modelling process and the water management decision process (inspired by Pascual et al., 2003).
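As an illustrative aside, the five steps of the modelling process in Fig. 1 can be written down as a small data structure, which makes it easy to see where review and dialogue with the water manager and stakeholders recur. The sketch below is only a hypothetical encoding: the step names and tasks are copied from Fig. 1, whereas the names `Step`, `PROTOCOL` and `steps_with_task` are assumptions introduced here and are not part of any modelling tool described in this thesis.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Step:
    """One step of the modelling process as drawn in Fig. 1 (hypothetical encoding)."""
    number: int
    name: str
    tasks: Tuple[str, ...]


# Step names and tasks are copied from Fig. 1; the structure itself is illustrative.
PROTOCOL: Tuple[Step, ...] = (
    Step(1, "Model Study Plan",
         ("Identify problem", "Define requirements",
          "Assess uncertainties", "Prepare model study plan")),
    Step(2, "Data and Conceptualisation",
         ("Collect and process data", "Develop conceptual model",
          "Select model code", "Review and dialogue")),
    Step(3, "Model Set-up",
         ("Construct model", "Reassess performance criteria",
          "Review and dialogue")),
    Step(4, "Calibration and Validation",
         ("Model calibration", "Model validation",
          "Uncertainty assessment", "Review and dialogue")),
    Step(5, "Simulation and Evaluation",
         ("Model predictions", "Uncertainty assessment",
          "Review and dialogue")),
)


def steps_with_task(task: str) -> List[int]:
    """Return the numbers of the steps whose task list contains `task`."""
    return [step.number for step in PROTOCOL if task in step.tasks]


if __name__ == "__main__":
    # Print the protocol as a simple checklist.
    for step in PROTOCOL:
        print(f"{step.number}. {step.name}: {', '.join(step.tasks)}")
    # Steps that end with a review and dialogue task.
    print(steps_with_task("Review and dialogue"))
```

Querying this structure for "Review and dialogue" returns steps 2 to 5, in line with Fig. 1, where every step after the model study plan closes with a review and dialogue task.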
It is important to note that the modelling studies typically do not address the entire planning and management process, but rather support certain elements of the process. Modelling is applied as a response (but usually not the only response) to an identified problem and can provide support for water management decisions. The types of interactions between the modelling process and the planning and management process are:
• The modelling process starts with a thorough framing of the problem to be addressed and definition of modelling objectives and requirements for the modelling study (Step 1 in Fig. 1). Water managers and stakeholders dominate this step, which basically is identical to part of the broader planning process. A participatory based assessment of the most important sources of uncertainty for the decision process should be used as a basis for prioritising the elements of the modelling study. The uncertainty assessments made at this stage will typically be qualitative.
• The main modelling itself is composed of Steps 2, 3 and 4 of Fig. 1. Here the link with the main planning process consists of dialogue, reviews and discussions of preliminary results. The amount and type of interaction here depend on the level of public participation, which may vary from case to case, from providing information through consultation to active involvement (Henriksen et al., submitted).
• The finalisation of the modelling study (the last step in Fig. 1) typically includes scenario simulations. Here the water managers and the stakeholders again have a dominant role. The decisions made at the outcome of this step on the basis of modelling results are made in the context of the main planning process. Uncertainty assessment of model predictions is a crucial aspect of the modelling results and should be communicated in a way that is accessible for the stakeholders in the further water management process.
[Figure 2: a large circle representing the WFD process with its four elements (Identification, Designing, Implementation, Evaluation), each supported by a smaller circle labelled Modelling.]
Fig. 2 The role of modelling in the water management process within the context of the EU Water Framework Directive (WFD)
2.2 Terminology and Scientific Philosophical Basis for the Modelling<br />
Process<br />
2.2.1 Background<br />
As pointed out in [12] a key problem in relation to establishment of a theoretical modelling framework is<br />
confusion on terminology. For example the terms validation and verification are used with different, and<br />
some times interchangeable, meanings by different authors. The confusion arises from both semantic<br />
and philosophical considerations (Rykiel, 1996). Another important problem is the lack of consensus<br />
related to the so far non-conclusive debate on the fundamental question concerning whether a water<br />
resources model can be validated or verified, and whether it as such can be claimed to be suitable or<br />
valid for particular applications (Konikow and Bredehoeft, 1992; De Marsily et al., 1992; Oreskes et al.,<br />
1994).<br />
An important issue in relation to validation/verification is the distinction between open and closed systems.<br />
A system is a closed system if its true conditions can be predicted or computed exactly. This applies<br />
to mathematics and mostly to physics and chemistry. Systems where the true behaviour cannot be<br />
computed due to uncertainties and lack of knowledge on e.g. input data and parameter values are<br />
called open systems. The systems we are dealing with in water resources management, based on geosciences,<br />
biology and socio-economy, are open systems. According to Konikow and Bredehoeft (1992)<br />
and Oreskes et al. (1994) it is not possible to verify or validate models of open systems.<br />
Finally, the principles have to reflect and be in line with the underlying philosophy of environmental<br />
modelling, which has changed significantly during the past decades. In the early days many of us were<br />
focussing on the huge potential of sophisticated models in a way that in retrospect may be characterised<br />
as rather naive enthusiasm (e.g. Freeze and Harlan, 1969; Abbott, 1992). The dominant view<br />
today appears to be much more balanced and mature (e.g. Beven, 2002a; Beven, 2002b).<br />
2.2.2 Terminology and guiding principles<br />
According to the terminology presented in [12] the simulation environment is divided into four basic<br />
elements as shown in Fig. 3. The inner arrows describe the processes that relate the elements to each<br />
other, and the outer circle refers to the procedures that evaluate the credibility of these processes.<br />
In general terms a model is understood as a simplified representation of the natural system it attempts to<br />
describe. However, a distinction is made between three different meanings of the general term model,<br />
namely the conceptual model, the model code and the model that here is defined as a site-specific model.<br />
The most important elements in the terminology and their interrelationships are defined as follows:<br />
Reality: The natural system, understood here as the study area.<br />
Conceptual model: A description of reality in terms of verbal descriptions, equations, governing<br />
relationships or ‘natural laws’ that purport to describe reality. This is the user's perception of the key<br />
hydrological and ecological processes in the study area (perceptual model) and the corresponding<br />
simplifications and numerical accuracy limits that are assumed acceptable in order to achieve the purpose<br />
of the modelling. A conceptual model thus includes both a mathematical description (equations) and a<br />
description of flow processes, river system elements, ecological structures, geological features, etc. that<br />
are required for the particular purpose of modelling. By drawing an analogy to scientific philosophical<br />
discussion the conceptual model in other words constitutes the scientific hypothesis or theory that we<br />
assume for our particular modelling study.<br />
Fig. 3 Elements of a modelling terminology [12].<br />
Model code: A mathematical formulation in the form of a computer program that is so generic that it,<br />
without program changes, can be used to establish a model with the same basic type of equations (but<br />
allowing different input variables and parameter values) for different study areas.<br />
Model: A site-specific model established for a particular study area, including input data and parameter<br />
values.<br />
Model confirmation: Determination of adequacy of the conceptual model to provide an acceptable level of<br />
agreement for the domain of intended application. This is in other words the scientific confirmation of the<br />
theories/hypotheses included in the conceptual model.<br />
Code verification: Substantiation that a model code is in some sense a true representation of a conceptual<br />
model within certain specified limits or ranges of application and corresponding ranges of accuracy.<br />
Model calibration: The procedure of adjustment of parameter values of a model to reproduce the response<br />
of reality within the range of accuracy specified in the performance criteria.<br />
Model validation: Substantiation that a model within its domain of applicability possesses a satisfactory<br />
range of accuracy consistent with the intended application of the model.<br />
Model set-up: Establishment of a site-specific model using a model code. This requires, among other<br />
things, the definition of boundary and initial conditions and parameter assessment from field and laboratory<br />
data.<br />
Simulation: Use of a validated model to gain insight into reality and obtain predictions that can be used by<br />
water managers. This includes insight into how reality can be expected to respond to human interventions.<br />
In this connection uncertainty assessments of the model predictions are very important.<br />
Performance criteria: Level of acceptable agreement between model and reality. The performance criteria<br />
apply both for model calibration and model validation.<br />
Domain of applicability (of conceptual model): Prescribed conditions for which the conceptual model<br />
has been tested, i.e. compared with reality to the extent possible and judged suitable for use (by model<br />
confirmation).<br />
Domain of applicability (of model code): Prescribed conditions for which the model code has been<br />
tested, i.e. compared with analytical solutions, other model codes or similar to the extent possible and<br />
judged suitable for use (by code verification).<br />
Domain of applicability (of model): Prescribed conditions for which the site-specific model has been<br />
tested, i.e. compared with reality to the extent possible and judged suitable for use (by model validation).<br />
2.2.3 Scientific philosophical aspects<br />
The credibility of the descriptions or the agreements between reality, conceptual model, model code and<br />
model are evaluated through the terms confirmation, verification, calibration and validation. Thus, the relation<br />
between reality and the scientific description of reality which is constituted by the conceptual model<br />
with its theories and equations on flow and transport processes, its interpretation of the geological system<br />
and ecosystem at hand, etc., is evaluated through the confirmation of the conceptual model. By using the<br />
term confirmation in connection with conceptual model, it is implied that it is never considered possible<br />
to prove the truth of a theory/hypothesis and as such of a conceptual model. And even if a site-specific<br />
model is eventually accepted as valid for specific conditions, this is not a proof that the conceptual<br />
model is true, because, due to non-uniqueness, the site-specific model may turn out to perform right for<br />
the wrong reasons.<br />
The fundamental view expressed by scientific philosophers is that verification and validation of numerical<br />
models of natural systems are impossible, because natural systems are never closed and because<br />
the mapping of model results is always non-unique (Popper, 1959; Oreskes et al., 1994). I agree that<br />
it is not possible to carry out model verification or model validation, if these terms are used universally,<br />
without restriction to domains of applicability and levels of accuracy.<br />
[12] note, however, that Popper (1959) distinguished between two kinds of universal statements: the<br />
'strictly universal' and the 'numerical universal'. The strictly universal statements are those usually dealt<br />
with when speaking about theories or natural laws. They are a kind of 'all-statement' claiming to be true<br />
for any place and any time. In contrast, numerical universal statements refer only to a finite class of<br />
specific elements within a finite individual spatio-temporal region. A numerical universal statement is<br />
thus in fact equivalent to conjunctions of singular statements.<br />
The restrictions in use of the terms confirmation, verification and validation imposed by the respective<br />
domains of applicability imply, according to Popper's views, that the conceptual model, model code and<br />
site-specific models can only be classified as numerical universal statements as opposed to strictly universal<br />
statements. This distinction is fundamental for the terminology described in [12] and its link to<br />
scientific philosophical theories. Consequently the terms verification and validation should never be<br />
used without qualifiers.<br />
An important aspect of the framework outlined in [12] lies in the separation between the three different<br />
‘versions’ of the word model, namely the conceptual model, the model code and the site-specific model.<br />
Due to this distinction it is possible, at a general level, to talk about confirmation of a theory or a hypothesis<br />
about how nature can be described using the relevant scientific method for that purpose, and,<br />
at a site-specific level, to talk about validity of a given model within certain domains of applicability and<br />
associated with specified accuracy limits.<br />
2.3 Modelling Protocol<br />
The procedure for applying a hydrological model is often denoted a modelling protocol. It comprises a<br />
series of actions to be followed in a sequential or iterative form. The modelling protocol presented in [7]<br />
for distributed catchment modelling was inspired by the groundwater community (Anderson and<br />
Woessner, 1992). It was subsequently used in the Danish Handbook for Groundwater Modelling (Henriksen<br />
et al., 2001) that has been used extensively in practice since its emergence. A more recent modelling<br />
protocol, developed within the context of the EU research project HarmoniQuA, is reported in [13]<br />
and Scholten et al. (2007). The two protocols are illustrated in Figs. 4 and 5.<br />
Fig. 4 The modelling protocol from [7].<br />
A modelling study will involve several phases and several actors. A typical modelling study will involve<br />
the following four different types of actors:<br />
• The water manager, i.e. the person or organisation responsible for the management or protection of<br />
the water resources, and thus responsible for the modelling study and the outcome (the problem<br />
owner).<br />
• The modeller, i.e. a person or an organisation that works with the model conducting the modelling<br />
study. If the modeller and the water manager belong to different organisations, their roles will typically<br />
be denoted consultant and client, respectively.<br />
• The reviewer, i.e. a person that is conducting some kind of external review of a modelling study.<br />
The review may be more or less comprehensive depending on the requirements of the particular<br />
case. The reviewer is typically appointed by the water manager to help the water manager<br />
match the modelling capability of the modeller.<br />
• The stakeholders/public. A stakeholder is an interested party with a stake in the water management<br />
issue, either in exploiting or protecting the resource. Stakeholders include the following different<br />
groups: (i) competent water resource authority (typically the water manager, cf. above); (ii) interest<br />
groups; and (iii) general public.<br />
The modelling process may, according to [13], be decomposed into five major steps which again are<br />
decomposed into 48 tasks (Fig. 5). The contents of the five steps are:<br />
• STEP 1 (Model Study Plan). This step aims to agree on a Model Study Plan comprising answers to<br />
the questions: Why is modelling required for this particular model study? What is the overall modelling<br />
approach and which work should be carried out? Who will do the modelling work? Who should<br />
do the technical reviews? Which stakeholders/public should be involved and to what degree? What<br />
are the resources available for the project? The water manager needs to describe the problem and<br />
its context as well as the available data. A very important task is then to analyse and determine the<br />
various requirements of the modelling study in terms of the expected accuracy of modelling results.<br />
The acceptable level of accuracy will vary from case to case and must be seen in a socio-economic<br />
context. It should, therefore, be defined through a dialogue between the modeller, water manager<br />
and stakeholders/public. In this respect an analysis of the key sources of uncertainty is crucial in<br />
order to focus the study on the elements that produce most information of relevance to the problem<br />
at hand.<br />
• STEP 2 (Data and Conceptualisation). In this step the modeller should gather all the relevant<br />
knowledge about the study basin and develop an overview of the processes and their interactions in<br />
order to conceptualise how the system should be modelled in sufficient detail to meet the requirements<br />
specified in the Model Study Plan. Consideration must be given to the spatial and temporal<br />
detail required of a model, to the system dynamics, to the boundary conditions and to how the<br />
model parameters can be determined from the available data. The need to model certain processes<br />
in alternative ways or to differing levels of detail in order to enable assessments of model structure<br />
uncertainty should be evaluated. The availability of existing computer codes that can address the<br />
model requirements should also be addressed.<br />
• STEP 3 (Model Set-up). Model Set-up implies transforming the conceptual model into a site-specific<br />
model that can be run in the selected model code. A major task in Model Set-up is the processing of<br />
data in order to prepare the input files necessary for executing the model. Usually, the model is run<br />
within a Graphical User Interface (GUI) where many tasks have been automated. The GUI speeds<br />
up the generation of input files, but it does not guarantee that the input files are error free. The<br />
modeller performs this work.<br />
• STEP 4 (Calibration and Validation). This step is concerned with the process of analysing the model<br />
that was constructed during the previous step, first by calibrating the model, and then by validating<br />
its performance against independent field data. Finally, the reliability of model simulations for the intended<br />
domain of applicability is assessed through uncertainty analyses. The results are described<br />
so that the scope of model use and its associated limitations are documented and made explicit.<br />
The modeller performs this work.<br />
• STEP 5 (Simulation and Evaluation). In this step the modeller uses the calibrated and validated<br />
model to make simulations to meet the objectives and requirements of the model study. Depending<br />
on the objectives of the study, these simulations may result in specific results that can be used in<br />
subsequent decision making (e.g. for planning or design purposes) or to improve understanding<br />
(e.g. of the hydrological/ecological regime of the study area). It is important to carry out suitable uncertainty<br />
assessments of the model predictions in order to arrive at a robust decision. As with the<br />
other steps, the quality of the results needs to be assessed through internal and external reviews.<br />
Each of the last four steps is concluded with a reporting task followed by a review task. The review<br />
tasks include dialogues between water manager, modeller, reviewer and, often, stakeholders/public.<br />
The protocol includes many feedback possibilities (Fig. 5).<br />
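The calibration and validation logic of Step 4 can be illustrated with a minimal split-sample sketch. Everything in it is an assumption made for illustration and is not taken from [13]: the one-parameter linear reservoir, the Nash-Sutcliffe efficiency as performance criterion, the threshold of 0.8 and the synthetic data are all hypothetical.<br />

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency (assumed criterion): 1.0 is a perfect fit."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - squared_error / variance


def linear_reservoir(rain, k):
    """Hypothetical one-parameter model: each step releases a fraction k of storage."""
    storage, flows = 0.0, []
    for p in rain:
        storage += p
        outflow = k * storage
        storage -= outflow
        flows.append(outflow)
    return flows


def split_sample_test(rain, observed, split, candidates, criterion):
    """Calibrate k on the first `split` steps, then validate on the remaining steps."""
    # Calibration: choose the candidate with the best fit in the calibration period only.
    best_k = max(
        candidates,
        key=lambda k: nash_sutcliffe(observed[:split], linear_reservoir(rain[:split], k)),
    )
    # Validation: run the whole period (so storage carries over) and score the unseen part.
    simulated = linear_reservoir(rain, best_k)
    nse_cal = nash_sutcliffe(observed[:split], simulated[:split])
    nse_val = nash_sutcliffe(observed[split:], simulated[split:])
    return {
        "k": best_k,
        "nse_calibration": nse_cal,
        "nse_validation": nse_val,
        # The same performance criterion is applied to calibration and validation.
        "accepted": nse_cal >= criterion and nse_val >= criterion,
    }


# Synthetic rainfall and "observations", made up purely for illustration.
rain = [5.0, 0.0, 2.0, 8.0, 0.0, 0.0, 3.0, 1.0, 6.0, 0.0, 4.0, 0.0]
observed = linear_reservoir(rain, 0.3)
result = split_sample_test(rain, observed, split=6,
                           candidates=[0.1, 0.2, 0.3, 0.4, 0.5], criterion=0.8)
print(result)
```

The sketch keeps the validation data out of the parameter search, which is the point of testing against independent field data; a real study would add the uncertainty analysis that Step 4 also calls for.<br />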
A comparison of the old protocol (Fig. 4) and the one decade younger HarmoniQuA protocol (Fig. 5)<br />
shows some interesting developments:<br />
• The basic sequence of the prescribed activities in the protocols is the same. The HarmoniQuA protocol<br />
is much more detailed than the old one, but there are no fundamental disagreements between<br />
the two.<br />
• The HarmoniQuA protocol puts much more emphasis on the framing of the modelling study. This<br />
is only considered in one box in Fig. 4 and not given much weight in [7], while it is one full Step<br />
comprising seven tasks in Fig. 5. This implies for instance that requirements on performance criteria<br />
and uncertainty assessments are introduced rather late in the old protocol, while they are an important<br />
part of Step 1 in the HarmoniQuA protocol.<br />
• There is much emphasis on uncertainty assessments throughout the modelling process in the<br />
HarmoniQuA protocol, while uncertainty assessments are only considered as part of model calibration<br />
and simulation in the old protocol.<br />
• The HarmoniQuA protocol is part of a quality assurance framework with much emphasis on the<br />
role play between the various actors in the modelling process. This results in stakeholder involvement,<br />
peer reviews, focus on reporting and dialogue between water manager and modeller. In contrast<br />
to this, the old protocol only focuses on the modeller.<br />
These developments reflect a process from guidance to the modeller only (old protocol) towards guidance<br />
to all actors involved in the modelling process (HarmoniQuA). This process has been inspired by<br />
feedback from introducing the old protocol to real-world applications, where it was realised that a<br />
broader concept was required.<br />
Fig. 5 The five modelling steps and the 48 tasks in the HarmoniQuA modelling protocol. The diagram is an updated version of Fig. 5 in [13]<br />
(Refsgaard et al., 2006). [Flowchart of the five steps Model Study Plan, Data and Conceptualisation, Model Set-up,<br />
Calibration and Validation, and Simulation and Evaluation, with their tasks, decision points and feedback loops.]<br />
2.4 Classification of Models<br />
Many attempts have been made to classify hydrological models (or model codes). Refsgaard (1996)<br />
presented the classification shown in Fig. 6 that I have used in all papers of the present thesis. Deterministic<br />
models can be classified according to whether the model gives a lumped or a distributed description<br />
of the considered area, and whether the description of the hydrological processes is empirical,<br />
conceptual, or more physically-based. A lumped model implies that the catchment is considered as one<br />
computational unit. A distributed model, on the other hand, provides a description of catchment processes<br />
at geo-referenced computational grid points within the catchment. An intermediate approach is a<br />
semi-distributed model, which uses some kind of distribution, either in sub-catchments or in hydrological<br />
response units, where areas with the same key characteristics are aggregated to sub-units without<br />
considering their actual locations within the catchment. Examples of hydrological response units considered<br />
in semi-distributed models are elevation zones, which are relevant for snow modelling, and<br />
combinations of soil and vegetation type, which may be relevant for simulation of root zone processes<br />
such as evapotranspiration and nitrate leaching.<br />
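The aggregation behind hydrological response units can be sketched as follows. This is a minimal illustration, assuming a hypothetical cell record with the keys `soil`, `vegetation`, `area_km2` and `location`; it is not the data layout of any particular model code.<br />

```python
from collections import defaultdict


def aggregate_to_hrus(cells):
    """Sum cell areas per (soil, vegetation) class, ignoring where the cells lie."""
    areas = defaultdict(float)
    for cell in cells:
        # The location is deliberately not part of the key: that is what makes
        # the approach semi-distributed rather than fully distributed.
        areas[(cell["soil"], cell["vegetation"])] += cell["area_km2"]
    return dict(areas)


# Made-up cells for illustration only.
cells = [
    {"soil": "sand", "vegetation": "wheat", "area_km2": 1.0, "location": (0, 0)},
    {"soil": "clay", "vegetation": "grass", "area_km2": 2.0, "location": (0, 1)},
    {"soil": "sand", "vegetation": "wheat", "area_km2": 0.5, "location": (5, 5)},
]
hrus = aggregate_to_hrus(cells)
print(hrus)
```

The two sand/wheat cells end up in the same unit although they are far apart, so one root zone calculation per unit serves both.<br />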
As most conceptual models are also lumped, and as most physically-based models are also distributed,<br />
the three main classes emerge:<br />
• Empirical (black box)<br />
• Lumped conceptual models (grey box)<br />
• Distributed physically-based (white box)<br />
The classification is discussed in some detail in Refsgaard (1996). Here, the focus is on the two traditional<br />
approaches in deterministic hydrological catchment modelling, namely the lumped conceptual<br />
and the distributed physically-based ones. The fundamental difference between these two types of<br />
models lies in their process descriptions and the way spatial variability is treated. The distributed physically-based<br />
models contain equations which have originally been developed for point scales and which<br />
provide detailed descriptions of flows of water and solutes. The variability of catchment characteristics<br />
is accounted for explicitly through the variations of hydrological parameter values among the different<br />
computational grid points. This approach leaves the variability within a grid cell unaccounted for, which<br />
in some cases is of minor importance but in other cases may pose a serious constraint. The lumped<br />
conceptual models use empirical process descriptions, which have built-in accounting for the spatial<br />
variability of catchment characteristics.<br />
Fig. 6 Classification of hydrological models according to process description (Refsgaard, 1996).<br />
Typical examples of lumped conceptual model codes are the Stanford Watershed Model (Crawford and<br />
Linsley, 1966), the Sacramento (Burnash, 1995), the HBV (Bergström, 1995) and the NAM (Nielsen<br />
and Hansen, 1973). Typical examples of distributed physically-based model codes are the MIKE SHE<br />
(Abbott et al., 1986a, b; Refsgaard and Storm, 1995) and the Thales (Grayson et al., 1992a, b).<br />
Groundwater model codes like MODFLOW belong to the distributed physically-based class.<br />
The classification has some shortcomings that should be noted. First of all, the use of the term ‘conceptual<br />
model’ is unfortunate, because this is a different meaning of the term as compared to the definition<br />
given in Section 2.2 and used in the modelling protocols (Section 2.3). This can cause some confusion,<br />
but to introduce a new term completely different from what is used by almost all other scientists in the<br />
community of catchment modelling may cause even more confusion. Secondly, and more fundamentally,<br />
the names of the classes should be considered as relative rather than absolute. For example Beven<br />
(1989) argued that in most applications physically-based models are used as lumped conceptual models<br />
at the grid scale. As discussed in [4], I agree that some degree of lumping and conceptualisation will<br />
always need to take place, but in spite of this there is a fundamental difference in the functioning<br />
and, as will be discussed later, in the applicability of the two model types.<br />
3 Simulation of Hydrological Processes at Catchment<br />
Scale<br />
In this chapter some modelling examples from the publications are briefly summarised and discussed<br />
within the framework outlined in Chapter 2.<br />
3.1 Flow modelling<br />
3.1.1 Groundwater/surface water model for the Suså catchment ([1], [2])<br />
Summary<br />
The publications [1] and [2] describe a new model code and the set-up, calibration and validation of a<br />
model for a 1,000 km² area. Further details can be found in Stang (1981), Refsgaard (1981) and<br />
Refsgaard and Stang (1981). The objectives of the study were to develop a spatially distributed<br />
groundwater/surface water model code, to apply it to the Suså catchment with a particular focus on<br />
the stream-aquifer interaction in a hydrogeological system consisting of confined aquifer-aquitard-phreatic<br />
aquifer, and to test the model for prediction of the hydrological consequences of groundwater abstraction<br />
for streamflows and hydraulic heads.<br />
The new model code was rather complex and computationally demanding at the time of development.<br />
Thus, standard 30-year model simulations could only be carried out as night runs on the mainframe<br />
computer at DTU’s computer centre.<br />
The model area comprising the Suså and the neighbouring Køge Å catchments is located in the central<br />
and southern part of Zealand. The model area, the topographic divides and the groundwater model<br />
polygonal mesh are shown in Fig. 7. The overall structure of the model is outlined in Fig. 8. It consists<br />
of four separate components for the confined regional aquifer, the aquitard, the phreatic aquifer and the<br />
root zone. The spatial distribution and the degree of physical basis differ between the four components.<br />
The time steps in the calculations are one day in all parts of the model.<br />
The confined aquifer is described by a two-dimensional integrated finite difference model with 112 polygons.<br />
For the phreatic aquifer, consisting of till with very small transmissivities, and for the aquitard, each<br />
of the polygons is subdivided further into four sub-polygons based on hypsographic curves (Fig. 9).<br />
Due to small-scale topographic variations the flows in the aquitard in most polygons are upwards in<br />
some parts and downwards in other parts of the polygon. A correct representation of these flows between<br />
the regional aquifer and the phreatic aquifer that discharges to the rivers is crucial for achieving a<br />
good description of the stream-aquifer interaction. Without such an approach, allowing a description of both<br />
upwards and downwards flows in the aquitard within the same polygon, a much finer spatial resolution<br />
with 10-100 times as many polygons would have been required. This would have been impossible 25<br />
years ago due to computational constraints.<br />
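Why one polygon can hold both upward and downward aquitard flow is easiest to see in a small sketch. It assumes a simple Darcy-type leakage proportional to the head difference across the aquitard; the function names, the leakage coefficient and all head values are hypothetical and are not the formulation or data of the Suså model.<br />

```python
def aquitard_leakage(water_table_m, regional_head_m, leakage_coeff_per_day):
    """Darcy-type vertical flux through the aquitard (m/day); positive is downwards."""
    return leakage_coeff_per_day * (water_table_m - regional_head_m)


def leakage_by_sub_polygon(water_tables_m, regional_head_m, leakage_coeff_per_day):
    """Evaluate flux and flow direction for each sub-polygon of one polygon."""
    results = []
    for water_table in water_tables_m:
        flux = aquitard_leakage(water_table, regional_head_m, leakage_coeff_per_day)
        direction = "down" if flux > 0 else "up" if flux < 0 else "none"
        results.append((flux, direction))
    return results


# Made-up water tables for four elevation bands sharing one regional aquifer head.
sub_polygons = leakage_by_sub_polygon([23.0, 26.0, 31.0, 35.0], 27.0, 0.001)
directions = [direction for _, direction in sub_polygons]
print(directions)  # ['up', 'up', 'down', 'down']
```

With a single water table per polygon only one of the two directions could be represented, which is the limitation the sub-polygon approach avoids.<br />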
The root zone component calculated the net precipitation that recharged the phreatic aquifer. The modelling<br />
area was divided into seven sub-areas with separate precipitation input and soil parameters. Further,<br />
the spatial variation in vegetation was accounted for by dividing each of these seven areas into five<br />
vegetation areas based on agricultural statistics and one meadow (wetland) area. This gives a total<br />
of 42 sub-areas (7 × 6), where each sub-area is a kind of ‘hydrological response unit’, i.e. a semi-distributed<br />
approach. The root zone calculations were based on a box approach with four layers in the<br />
root zone.<br />
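The count of 42 sub-areas follows from combining seven precipitation and soil sub-areas with six vegetation classes. A minimal sketch of that bookkeeping is given here; the labels `zone_1` to `zone_7`, `crop_1` to `crop_5` and `meadow` are placeholders, since the actual class names are not given in the text.<br />

```python
from itertools import product

# Seven sub-areas with separate precipitation input and soil parameters.
precipitation_zones = [f"zone_{i}" for i in range(1, 8)]
# Five vegetation areas from agricultural statistics plus one meadow (wetland) area.
vegetation_classes = [f"crop_{i}" for i in range(1, 6)] + ["meadow"]

# Each (zone, vegetation) pair is one response unit with its own root zone calculation.
response_units = list(product(precipitation_zones, vegetation_classes))
print(len(response_units))  # 42
```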
Fig. 7 Topographic divides, groundwater polygonal mesh, precipitation gauging stations and precipitation<br />
zones of the Suså model.<br />
Fig. 8 The structure of the Suså model<br />
Fig. 9 Hypsographic curve for polygon 21 and areas represented by the four sub-polygons. [Figure: map of polygon 21<br />
with elevation bands < 24, 24–28, 28–34 and > 34 m above MSL, and a cross-section showing ground surface, water table,<br />
head in the regional aquifer, aquitard, regional aquifer and pre-Quaternary surface.]<br />
Fig. 10 Examples of simulation results for soil moisture in the root zone, hydraulic head of the regional confined<br />
aquifer and river discharge.<br />
The model was calibrated against soil moisture data from four experimental plots, time series of hydraulic<br />
heads from 40 observation wells in the regional aquifer and streamflow from six gauging stations.<br />
Examples of simulation results from the calibration period are shown in Fig. 10 which shows excellent<br />
curve fits. The groundwater and aquitard models were calibrated, along with the code development<br />
itself, using all available hydraulic head data from the period 1950-80. Between 1964 and 1970 the<br />
groundwater abstraction to Copenhagen Water Supply from the Regnemark Waterworks in the Køge Å<br />
catchment was increased from zero to about 15 million m³/year. The remaining model components<br />
were calibrated against only some of the available streamflow data, namely some of the data from the<br />
Suså catchment, while amongst others Køge Å data were not used for calibration.<br />
While the simulation of streamflows in the Køge Å catchment in [1] was characterised as a “half-way<br />
test of the model’s ability to simulate streamflow from ungauged catchments” no systematic validation<br />
tests against independent data were carried out as part of the study. Some years later the model simulations<br />
were extended with new data from the period 1981-87, where the groundwater abstractions had<br />
changed slightly. In this post-audit validation study the model simulations were found to match the observations<br />
to the same degree of accuracy as during the calibration period (Jensen and Jørgensen,<br />
1988).<br />
The model’s ability to simulate the streamflow depletion caused by a groundwater abstraction from the<br />
regional confined aquifer was tested on historical data from the Køge Å catchment. Fig. 11 shows simulated<br />
streamflow assuming the actual groundwater abstraction from the Regnemark Waterworks starting in<br />
1964, $Q_{sim}$, and assuming no abstraction from Regnemark, $Q^{1}_{sim}$. The recorded streamflow fits reasonably<br />
well with $Q_{sim}$. The difference $Q^{1}_{sim} - Q_{sim}$, which is the simulated streamflow depletion caused<br />
by the increased groundwater abstraction, is seen to have a clear seasonal variation with smaller depletion<br />
during the dry summer periods and larger depletion during the wet winter season.<br />
Fig. 11 Comparison of 15-day moving average streamflows for Køge Å (lower) and the relative streamflow<br />
depletion caused by the groundwater abstraction (upper)<br />
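The depletion series plotted in Fig. 11 can be illustrated with a minimal sketch. It assumes two simulated daily streamflow series, one with and one without the abstraction, a trailing 15-day moving average, and a relative depletion defined as the difference divided by the no-abstraction flow. The function names, the trailing window and the sample values are assumptions made for illustration and are not taken from [1] or [2].<br />

```python
from collections import deque


def moving_average(values, window=15):
    """Trailing moving average; early entries average the values available so far."""
    out, buf, total = [], deque(), 0.0
    for v in values:
        buf.append(v)
        total += v
        if len(buf) > window:
            total -= buf.popleft()
        out.append(total / len(buf))
    return out


def streamflow_depletion(q_no_abstraction, q_with_abstraction, window=15):
    """Return (absolute, relative) depletion series from two simulated runs."""
    q1 = moving_average(q_no_abstraction, window)
    q = moving_average(q_with_abstraction, window)
    absolute = [a - b for a, b in zip(q1, q)]
    # Assumed definition: depletion relative to the no-abstraction flow
    relative = [d / a if a > 0 else 0.0 for d, a in zip(absolute, q1)]
    return absolute, relative


# Hypothetical daily flows in m3/s (not thesis data)
q1_sim = [2.0] * 20   # run without abstraction
q_sim = [1.5] * 20    # run with abstraction
absolute, relative = streamflow_depletion(q1_sim, q_sim)
```

With real model output the two series would vary from day to day, so both the absolute and the relative depletion would show the seasonal pattern described above.<br />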
Discussion - post evaluation<br />
Most other catchment models existing when the Suså model code was developed were either purely<br />
rainfall runoff models of the lumped conceptual type, such as the classical Stanford Watershed Model<br />
(Crawford and Linsley, 1966), the HBV (Bergström and Forsman, 1973; Bergström, 1976) and the NAM<br />
(Nielsen and Hansen, 1973) or purely groundwater models (Prickett and Lonnquist, 1971; Thomas,<br />
1973). A few authors had concluded that coupled groundwater/surface water modelling was essential<br />
(e.g. Luckner, 1978; Lloyd, 1980) and some had outlined specific, but not yet operational, concepts<br />
(e.g. Freeze and Harlan, 1969; Wardlaw, 1978; Jønch-Clausen, 1979). In some studies groundwater<br />
models and rainfall-runoff models were used for the same catchment, but without coupling (e.g. Weeks<br />
et al., 1974). Thus, apparently no other model had previously been used to dynamically simulate coupled<br />
groundwater/surface water conditions at catchment scale (rainfall, evapotranspiration, near-surface<br />
runoff, groundwater recharge, groundwater heads, baseflow discharge from aquifers to streams).<br />
During the decade following [1] and [2] a few model codes with integrated groundwater/surface water<br />
descriptions emerged. The most prominent of these codes was the SHE (Abbott et al., 1986a, b) and its<br />
operational daughter codes, MIKE SHE from DHI (Refsgaard and Storm, 1995) and SHETRAN from<br />
University of Newcastle (Bathurst and O’Connell, 1992), both of which are still in use today, although in later<br />
versions. Other operational models from that period were described by Miles and Rushton (1983),<br />
Christensen (1994) and Wardlaw et al. (1994). Miles and Rushton (1983) used a simpler root zone and surface<br />
water component than [1] together with a two-dimensional finite difference groundwater model and<br />
monthly time steps. Christensen (1994) developed a model for the Tude Å catchment (a neighbour to<br />
Suså) that was conceptually similar to, and slightly simpler than, [1]. Wardlaw et al. (1994) used the concepts<br />
outlined in Wardlaw (1978), coupling the Stanford Watershed Model with a finite-difference<br />
groundwater model and a channel routing model for simulation of discharge and groundwater levels in<br />
the Allen catchment in England.<br />
During the past decade the number of integrated modelling codes has exploded. The existing codes<br />
today can be considered to fall into three classes: (a) fully integrated codes such as MIKE SHE (Graham<br />
and Butts, 2005); (b) couplings of existing groundwater codes and surface water codes such as MODFLOW<br />
and SWAT (Perkins and Sophocleous, 1999); and (c) codes based on the fully 3-dimensional<br />
Richards’ equation (Panday and Huyakorn, 2004). Independent reviews of the scientific basis and practical<br />
applicability of a number of recent integrated model codes are provided by e.g. Kaiser-Hill (2001)<br />
and Tampa Bay Water (2001).<br />
A major novelty of [1] and [2] was that the Suså model code was one of the first codes to integrate<br />
surface water and groundwater descriptions, and the first of its kind applied operationally to moraine<br />
landscapes. The model results were unique with respect to simulation of the dynamics of the groundwater/surface<br />
water interaction, as for instance reflected by the annual hydraulic head fluctuations and the<br />
streamflow depletion due to the groundwater abstraction. Furthermore, the study provided new insights<br />
into, and understanding of, the mechanisms that governed streamflow depletion due to groundwater abstraction<br />
from confined aquifers in moraine catchments. In contrast to the traditional type curve analyses,<br />
which were used extensively in hydrogeology to analyse pumping tests and to predict the effects of<br />
abstractions, [1] and [2] were based on non-stationary analysis which, as is evident from the annual variations<br />
of streamflow depletion shown in Fig. 11, turns out to be crucial. The only modelling study from<br />
the following decade that considered the dynamics of the stream-aquifer interaction in moraine catchments<br />
in connection with groundwater abstraction was Christensen (1994), who basically confirmed the<br />
results of [2].<br />
The spatial distribution and the degree of physical basis differ between the four components of the<br />
Suså model. The groundwater model can be characterised as distributed physically-based, the aquitard<br />
model as semi-distributed physically-based and the phreatic aquifer and root zone models as semi-distributed<br />
conceptual. In contrast to, for instance, the later SHE code (Abbott et al., 1986a, b), the Suså<br />
model code was not generic, because it could not be applied to other catchments without changes in<br />
the code. Furthermore, it was tailored to the specific hydrological conditions prevailing in the Suså<br />
catchment and could not, for instance, be applied to an alluvial unconfined aquifer.<br />
In retrospect, it is interesting to observe that issues related to the credibility of model simulations were<br />
not critically analysed or discussed in [1] and [2]. First of all, aspects of code verification were not dealt<br />
with in the publications, although a major novelty of the work was the development of a completely new<br />
code. Secondly, and maybe more surprisingly, model validation and uncertainty assessments of model<br />
simulations were hardly addressed. Because all the available groundwater head data were used for calibration,<br />
the opportunity to carry out split-sample validation tests against parts of the data, or even the unique opportunity<br />
to calibrate on data before the groundwater abstraction and validate on data after the abstraction<br />
(differential split-sample test according to Klemes (1986)), was not utilised. Because uncertainty was not addressed<br />
and rigorous validation tests were not conducted, the reader may be left with the undocumented<br />
impression that the curve fitting in Fig. 10 is supposed to reflect the predictive capability of the model.<br />
That the model proved to perform well in a subsequent post-audit validation study could not be known<br />
at the time of [1] and [2].<br />
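The validation option that was left unused can be sketched as follows, assuming a list of (year, observed head, simulated head) records. Splitting at 1964, the year the Regnemark abstraction started, gives the differential split-sample case described above. The record values, the function names and the use of root mean square error as the fit measure are hypothetical and serve only as an illustration.<br />

```python
def split_sample(records, split_year):
    """Split (year, observed, simulated) records into calibration and validation sets."""
    calibration = [r for r in records if r[0] < split_year]
    validation = [r for r in records if r[0] >= split_year]
    return calibration, validation


def rmse(records):
    """Root mean square error between observed and simulated values."""
    squared = [(obs - sim) ** 2 for _, obs, sim in records]
    return (sum(squared) / len(squared)) ** 0.5


# Hypothetical annual mean hydraulic heads in m (illustrative values only)
records = [
    (1960, 30.0, 30.1),
    (1962, 30.2, 30.0),
    (1966, 29.0, 29.4),
    (1970, 28.0, 28.6),
]

# Calibrate on the pre-abstraction years, validate on the years after 1964
calibration, validation = split_sample(records, split_year=1964)
calibration_error = rmse(calibration)
validation_error = rmse(validation)
```

Only the validation error, computed on data the calibration never saw, says something about predictive capability under the changed abstraction regime.<br />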
The other integrated groundwater/surface water modelling studies from the following decade (Miles and<br />
Rushton, 1983; Christensen, 1994; Wardlaw et al., 1994) had the same characteristics, i.e. a focus only on<br />
calibration and model prediction, with no mention of verification of the new model codes, no model<br />
validation tests against independent data and no uncertainty assessments. The SHE study reported by<br />
Bathurst (1986a, b), focussing on surface water hydrology, did include split-sample validation testing and<br />
sensitivity analysis. For surface water (rainfall-runoff) modelling studies focussing more on model applications<br />
than code developments, split-sample testing was more common (e.g. Bergström, 1976; WMO,<br />
1975; WMO, 1988), but uncertainty assessment was not systematically carried out and usually not even<br />
considered until Beven called for it (Beven, 1989; Beven and Binley, 1992). Altogether, this illustrates a<br />
very significant development in modelling practice during these three decades.<br />
3.1.2 Application of SHE to catchments in India ([4], [5])<br />
Summary<br />
The publications [4] and [5] describe the set-up, calibration and validation of the ‘Système Hydrologique<br />
Européen’ (SHE) code for six sub-catchments totalling about 15,000 km² of the Narmada basin in India<br />
(Fig. 12). The objective of the papers was to describe experiences from applying a distributed physically-based<br />
code like SHE to large basins with rather limited data coverage compared to previous SHE applications<br />
to research catchments. In contrast to the Suså study in [1] and [2], the India study did not<br />
include any code development, except for data processing utility software. Instead it comprised application<br />
of an existing code (Abbott et al., 1986a,b) to conditions that were far beyond those for<br />
which the SHE had previously been tested in terms of catchment size, data coverage and hydrological<br />
regime (Bathurst, 1986a).<br />
Fig. 12 Location map for the Narmada and the six sub-catchments.<br />
In terms of application, the study focused on simulation of catchment runoff, i.e. surface water aspects only.<br />
The model structure was as illustrated in Fig. 17. The groundwater zone was, however, represented<br />
by only one layer, i.e. a 2-dimensional groundwater model, and there were no data from observation wells<br />
to allow a calibration of the groundwater part of the model. The six models were set up with a 2 km x 2<br />
km computational grid. A split-sample approach was used with typically three years for model calibration<br />
and another three years for the subsequent model validation.<br />
The data requirements for a SHE-based model are substantial and much larger than for a rainfall-runoff<br />
model of the lumped conceptual type that had previously been applied to such types of catchments. A major<br />
challenge of the study was therefore to identify, collect and process data and to check their quality.<br />
Data were collected from more than 15 different agencies belonging to many different ministries and the<br />
data quality varied substantially.<br />
Another challenge was how to assess parameter values in a distributed model when data, in contrast to<br />
the previous tests on small experimental catchments like in Bathurst (1986a), are scarce. Each of the<br />
grid points in a distributed model is characterised by one or more parameters. Although the parameter<br />
values in principle (as in nature) vary from grid point to grid point, it is neither feasible nor desirable to allow<br />
the parameter values to vary so freely. Instead, a given parameter should only reflect the significant and<br />
systematic variation described in the available field data. Therefore a parameterisation procedure was<br />
developed, where representative parameter values were associated with individual soil types, vegetation<br />
types, geological layers, etc. This process of defining the spatial pattern of parameter values effectively<br />
reduced the number of free parameter coefficients that need to be adjusted in the subsequent<br />
calibration procedure. For example, the 820 km² Kolar catchment was parameterised into three soil classes<br />
and 10 land use/soil depth classes. For the soil type classes calibration was allowed for the hydraulic<br />
conductivity in the unsaturated zone (for each soil type class the conductivity could vary among three<br />
different land uses, giving nine parameter values). For the land use/soil depth classes the calibration<br />
parameters comprised soil depths (10 parameters in total) and the Strickler overland flow coefficients for<br />
four land use types (four parameters in total). A further three parameters were subject to calibration<br />
(hydraulic conductivity in the saturated zone, an empirical by-pass coefficient and a surface retention<br />
parameter, all kept constant throughout the catchment). Although the 26 calibration parameters could not<br />
be assessed from field data alone, but had to be modified through calibration, the physical realism of the<br />
parameter values resulting from the subsequent calibration procedure could be evaluated from available<br />
field data.<br />
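The parameter count for the Kolar catchment can be reproduced with a short sketch. It adds up the class counts quoted above and shows how a class-based lookup gives every grid cell a value without introducing one free parameter per cell. The class names, the soil depth values and the helper functions are hypothetical; only the counts, the catchment area and the grid size come from the text.<br />

```python
# Class counts quoted in the text for the Kolar catchment
SOIL_CLASSES = 3
LAND_USES_PER_SOIL_CLASS = 3
LAND_USE_SOIL_DEPTH_CLASSES = 10
LAND_USES_WITH_STRICKLER = 4
CATCHMENT_WIDE_PARAMETERS = 3


def count_calibration_parameters():
    """Sum the free parameters implied by the class-based parameterisation."""
    conductivity = SOIL_CLASSES * LAND_USES_PER_SOIL_CLASS  # unsaturated zone
    soil_depth = LAND_USE_SOIL_DEPTH_CLASSES
    strickler = LAND_USES_WITH_STRICKLER
    return conductivity + soil_depth + strickler + CATCHMENT_WIDE_PARAMETERS


def assign_by_class(cell_classes, class_values):
    """Give every grid cell the calibrated value of the class it belongs to."""
    return [class_values[c] for c in cell_classes]


n_parameters = count_calibration_parameters()
n_cells = round(820 / (2 * 2))  # approximate number of 2 km x 2 km grid squares

# Hypothetical class map and soil depths (m) for five grid cells
cell_classes = ["forest_deep", "crop_shallow", "crop_shallow", "forest_deep", "waste_shallow"]
soil_depth = {"forest_deep": 1.5, "crop_shallow": 0.4, "waste_shallow": 0.2}
cell_depths = assign_by_class(cell_classes, soil_depth)
```

The point of the design is that the number of free parameters follows the number of classes (26 here) rather than the number of grid squares (roughly 205 for Kolar).<br />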
The simulation results are illustrated in Fig. 13 as hydrographs for the largest sub-catchment and in Fig.<br />
14 as annual runoff and annual peaks for all six sub-catchments. In both figures the results are for the<br />
validation periods, where results are slightly poorer than for the calibration periods. In [4] the<br />
rainfall-runoff simulation results were characterised as having the same degree of accuracy as would<br />
have been expected with simpler hydrological models of the lumped conceptual type. The results therefore<br />
suggested that application of complex, data-demanding models like the present SHE approach is<br />
not justified in cases where the modelling objective is limited to simulation of catchment runoff and<br />
where observed runoff records exist for calibration purposes. No attempts were made in the study to<br />
test the capability of a model without calibration.<br />
After the first calibration and validation tests had been made, field investigations were carried out in the<br />
Kolar catchment during a 2½ week period to improve the parameter estimates, mainly for soil and vegetation<br />
parameters, and to evaluate the importance of additional field data. Subsequently, the Kolar<br />
model was recalibrated in such a way that rather narrow constraints were put on the range of values<br />
allowed for the key parameters. The final model, based on the additional data, produced simulation<br />
results of the same quality as the preliminary model with respect to the simulated hydrograph. Although it is<br />
argued in [5] that the final model is believed to give an improved physical representation of the hydrological<br />
regime, it is concluded that a good match between observed and simulated outlet hydrographs<br />
does not provide a sufficient guarantee of a hydrologically realistic process description.<br />
Fig. 13 Observed and simulated hydrographs for the Narmada at Manot during the validation period<br />
1985 and 1987.<br />
Fig. 14 Simulated monthly runoff during monsoon season (left) and simulated annual peak discharge<br />
compared with measured values during validation periods for all six sub-catchments.<br />
Discussion - post evaluation<br />
At the time of [4] and [5] lumped conceptual catchment model codes such as HBV (Bergström, 1992)<br />
and NAM (Jønch-Clausen and Refsgaard, 1984) had been used operationally for two decades, typically<br />
for catchments ranging from a few km² to more than 10,000 km².<br />
At the same time distributed physically-based models had mainly been tested on flood events on small<br />
catchments that typically had very good data due to experimental instrumentation (Loague and Freeze,<br />
1985; Bathurst, 1986a; Grayson et al., 1992a,b; Troch et al., 1993). Loague and Freeze (1985) compared<br />
a quasi-physically based model with a regression model and a unit hydrograph model on three<br />
experimental catchments, the 0.1 km² R-5, Chickasha, Oklahoma, the 7.2 km² WE-38, Klingerstown,<br />
Pennsylvania and the 0.1 km² HB-6, West Thornton, New Hampshire. Bathurst (1986a) applied the SHE<br />
to the simulation of flood events for the 10.6 km² experimental Wye catchment in Wales. Grayson et al.<br />
(1992a,b) applied THALES to the simulation of flood events for the 7.0 ha Wagga catchment in Australia<br />
and the 4.4 ha Lucky Hills catchment at the Walnut Gulch Experimental Area in Arizona. Troch et<br />
al. (1993) applied a model based on a 3-dimensional numerical solution to Richards’ equation to the 7.2<br />
km² WE-38 catchment and a 0.64 km² subcatchment.<br />
To my knowledge the only examples until then of distributed physically-based model studies including<br />
applications to catchments of several hundred km² and continuous simulation for periods of several years<br />
were the coupled groundwater/surface water models discussed in the previous section ([1]; [2]; Miles<br />
and Rushton, 1983; Christensen, 1994; Wardlaw et al., 1994), which all had distributed physically-based<br />
groundwater components and lumped (or semi-distributed) conceptual surface water components, and<br />
some models such as WATBAL (Knudsen et al., 1986) that had semi-distributed surface water components<br />
and lumped conceptual groundwater components.<br />
During the following few years a few additional catchment scale studies with continuous simulations of<br />
distributed physically-based models emerged. One example is Querner (1997), who applied<br />
MOGROW to the 6.5 km² Hupselse Beek catchment, simulating both discharge and groundwater head<br />
dynamics. Another example is Kuchment et al. (1996), who simulated surface water processes for the<br />
3315 km² Ouse catchment. The study of Kuchment et al. (1996) had many similarities with [4] and [5]<br />
with respect to model conceptualisation and conclusions.<br />
The main scientific contribution of [4] and [5] was therefore that it was the first study to demonstrate that distributed<br />
physically-based models could be established for catchments of this size and with ordinary data<br />
availability. Previous studies reported in the literature had either been tests on small research catchments<br />
or been models with major components of the lumped conceptual type. As outlined above, it is worth<br />
noting the different traditions in the communities that had dealt with (large scale) lumped conceptual<br />
models, (small scale) physically-based models and groundwater models, respectively. I believe that an<br />
important characteristic of the team who performed the present study ([4] and [5]) was that it comprised<br />
scientists who together had comprehensive experience from all these communities.<br />
Another key contribution was the parameterisation approach introduced. The point of departure for this<br />
approach, e.g. [1] and Bathurst (1986a), was an approach allowing parameter values to vary as required<br />
to fit the observed data during the calibration phase. This earlier approach had been criticised by Beven<br />
(1989) for resulting in overparameterisation. The new procedure resulted in 26 parameters to be calibrated for<br />
the Kolar catchment. Although this number is significantly less than e.g. the number of free parameters<br />
in [1], it is still very high and it is very likely that a sensitivity analysis would have shown that this number<br />
could easily be reduced without loss of model performance. It is interesting to note that similar parameterisation<br />
approaches reported for other catchments in 1997 ([7]) and 2001 (Andersen et al., 2001)<br />
resulted in 11 and 4 free parameters, respectively, implying that the parameterisation approach adopted<br />
in [4] and [5] was not yet fully developed.<br />
Beven (1989) had provided a fundamental critique of the way physically-based models such as the<br />
SHE had been promoted by e.g. Abbott et al. (1986a) and Bathurst (1986a). His main critique was that<br />
the attitudes in these early SHE papers were not realistic with respect to the abilities and achievements<br />
of physically-based models. Beven pointed, among other things, to the following key problems:<br />
• The process equations are simplifications leading to model structure uncertainty.<br />
• Spatial heterogeneity at subgrid scale is not included in the physically-based models. The current<br />
generation of distributed physically-based models are in reality lumped conceptual models.<br />
• There is a great danger of overparameterisation if it is attempted to simulate all hydrological processes<br />
thought to be relevant and to fit the related parameters against observed discharge data only.<br />
As a conclusion Beven argued that for future applications attempts must be made to obtain realistic<br />
estimates of the uncertainty associated with model predictions, particularly in the case of evaluating future<br />
scenarios of the effects of management strategies.<br />
[4] noted some of Beven’s critique, acknowledging that the process representation at the 2 km x 2 km<br />
grid squares causes significant violations of some of the process descriptions, that “some degree of<br />
lumping and conceptualisation has taken place at the grid scale” and that “scale problems are important”.<br />
[4] stressed, however, that in spite of these acknowledged limitations “the present basin model is<br />
much more physically based and distributed than the traditional lumped conceptual model, where the<br />
entire catchment is represented in effect by one grid square, and where the process representations<br />
due to averaging over characteristics of topography, soil type and vegetation type are fundamentally<br />
different from the basic physical laws”.<br />
[4] and [5] concluded that the SHE is a suitable tool to support water management for conditions in India.<br />
In contrast, Beven (1989) had stated that the physically-based models “are not well suited to<br />
applications to real catchments”. In retrospect, it is remarkable that [4] and [5] did not engage more substantially<br />
with the very fundamental critique raised by Beven (1989). For instance [4] and [5]<br />
did not comment at all on Beven’s main conclusion on the need for uncertainty assessment, although<br />
[5] actually used the model to study the impact of soil and land use by performing sensitivity analyses.<br />
A more comprehensive response and dialogue took place a few years later (Beven, 1996a; Refsgaard<br />
et al., 1996; Beven, 1996b).<br />
Seen in the perspective of present protocols for good modelling practice ([12] and [13]) the approach<br />
and conclusions in [4] and [5] are especially deficient in their lack of focus on uncertainty assessment. A<br />
main reason for the lack of dialogue with Beven’s critique and the lack of focus on uncertainty in [4] and<br />
[5] may be that we were too preoccupied with the real achievement of being the first to set up and run<br />
this type of model for such large catchments. Another reason may be that some of us had a background<br />
in groundwater modelling, where large scale distributed physically-based models had been successfully<br />
used to support practical water resources management for more than a decade, so we considered<br />
Beven’s statement that the physically-based models “are not well suited to applications to real<br />
catchments” to be a large exaggeration.<br />
3.1.3 Intercomparison of different types of hydrological models ([6])<br />
Summary<br />
The research study reported in publication [6] had two objectives. The first objective was to identify a<br />
rigorous framework for the testing of model capabilities for different types of tasks. The second objective<br />
was to use this theoretical framework and conduct an intercomparison study involving application of<br />
three model codes of different complexity to a number of tasks ranging from traditional simulation of<br />
stationary, gauged catchments to simulation of ungauged catchments and of catchments with non-stationary<br />
climate conditions. Data from three catchments in Zimbabwe were used for the tests.<br />
The three codes used in the study were (a) NAM (Nielsen and Hansen, 1973; Havnø et al., 1995) – Fig.<br />
15; (b) WATBAL (Knudsen et al., 1986) – Fig. 16; and (c) MIKE SHE (Abbott et al., 1986a,b; Refsgaard<br />
and Storm, 1995) – Fig. 17. The NAM and MIKE SHE can be characterised as very typical of their<br />
lumped conceptual and distributed physically-based types, respectively, while the WATBAL with its<br />
semi-distributed approach falls in between these two standard classes.<br />
Fig. 15 Structure of the NAM rainfall-runoff model code<br />
Fig. 16 Structure of the WATBAL code.<br />
Fig. 17 Schematic representation of the model structure of the ‘Système Hydrologique Européen’ (SHE)<br />
code.<br />
The three catchments in Zimbabwe that were selected for the tests were Ngezi-South (1090 km²), Lundi<br />
(254 km²) and Ngezi-North (1040 km²). For two of the catchments the model simulations started with a<br />
blind simulation, i.e. a simulation where no calibration was conducted, but where model parameters<br />
were assessed directly from field data and indirectly by considering parameter values in the first catchment<br />
(proxy basin test). Then one year was made available for calibration and finally the full calibration<br />
period of 4-5 years was used. In all cases an independent period was used for validation tests (split-sample<br />
test). The hydrological regime in Zimbabwe is semi-arid and characterised by very large interannual<br />
variations. It was therefore possible to construct a test scheme in such a way that a model’s<br />
ability to predict differences in climate input could be tested by calibrating on a dry period and validating<br />
on a wet period or vice versa (differential split-sample test).<br />
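One way to picture the construction of such a test scheme is to rank the years of record by annual rainfall and assign the driest to calibration and the wettest to validation, or the reverse. The sketch below does this for hypothetical annual rainfall totals. The function name, the year-by-year selection and all values are assumptions for illustration; the periods actually used in [6] are not reproduced here.<br />

```python
def differential_split(annual_rainfall, n_calibration, calibrate_on="dry"):
    """Pick calibration and validation years from opposite ends of the rainfall ranking.

    annual_rainfall maps year -> annual rainfall (mm).
    """
    wettest_first = calibrate_on == "wet"
    ranked = sorted(annual_rainfall, key=annual_rainfall.get, reverse=wettest_first)
    calibration = sorted(ranked[:n_calibration])
    validation = sorted(ranked[n_calibration:])
    return calibration, validation


# Hypothetical annual rainfall (mm) for a semi-arid catchment
rainfall = {1971: 450, 1972: 900, 1973: 380, 1974: 1020, 1975: 520, 1976: 870}

dry_cal, wet_val = differential_split(rainfall, n_calibration=3, calibrate_on="dry")
wet_cal, dry_val = differential_split(rainfall, n_calibration=3, calibrate_on="wet")
```

In practice contiguous multi-year periods would normally be chosen rather than individual years, so that the model states carry over realistically from one year to the next.<br />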
The model performance was evaluated for annual runoff and criteria focussing on the shape of the discharge<br />
hydrograph, i.e. rainfall-runoff modelling. The modelling work was carried out by three different<br />
persons/teams that were very experienced in applying their respective model codes. A general conclusion<br />
from the study was that the performances of the three codes were surprisingly similar. Thus, the<br />
ability of WATBAL and SHE to explicitly utilise data such as topography, soil and vegetation data that<br />
the NAM could not use turned out to make no significant difference in most cases. In summary the conclusions<br />
were:<br />
• Given a few (1–3) years of runoff measurements, a lumped model of the NAM type would be a<br />
suitable tool from the point of view of technical and economic feasibility. This applies to catchments<br />
with homogeneous climatic input as well as cases where significant variations in the exogenous<br />
input are encountered.<br />
• For ungauged catchments, however, where accurate simulations are critical for water resources<br />
decisions, a distributed model is expected to give better results than a lumped model if appropriate<br />
information on catchment characteristics can be obtained.<br />
Discussion - post evaluation<br />
A scientific contribution of [6] was the adoption and demonstration of Klemes’s model validation testing<br />
scheme, which had not been much used since the basic idea was published by Klemes (1986). This is<br />
discussed further in Section 4.2.4.<br />
Furthermore, the results from the intercomparison contributed to the ongoing scientific discussion on<br />
which types of model codes should be recommended for which application purpose. Only a few intercomparison<br />
studies involving different model types had been reported in the literature and only two studies<br />
included physically-based models (Loague and Freeze, 1985; Michaud and Sorooshian, 1994). Most of<br />
these previous studies had been conducted on small research catchments and none of them had included<br />
tests for non-stationary climate conditions as in [6].<br />
From the emergence of the distributed physically-based models it was widely stated and believed that<br />
these new model types generally would be able to provide more accurate simulation of the hydrological<br />
cycle (Abbott et al., 1986a). In the absence of hard facts from suitable tests the scientific debate had to<br />
a very large extent been based on expectations and qualitative arguments, such as that the models with<br />
more physical basis in their model structure were assumed to be able to provide more accurate simulation<br />
results, or the opposite view, as e.g. advocated by Beven (1989), that such expectations of superior<br />
performance of the physically-based models were unrealistic. In [4] we basically agreed with<br />
Beven (1989) with respect to the SHE’s capability to simulate discharge for large scale catchments with<br />
ordinary data, i.e. that the rainfall-runoff simulation results were of the same degree of accuracy “as<br />
would have been expected” with simpler hydrological models of the lumped conceptual type.<br />
With the results from [6] it was now possible to more firmly conclude that if the purpose of modelling is<br />
limited to simulation of runoff under stationary catchment conditions and if data exist for calibration purposes,<br />
there is no scientifically documented reason to go beyond lumped conceptual models. This issue<br />
has been subject to several studies since then, where the conclusions from [6] basically have been<br />
confirmed (e.g. Perrin et al., 2001; Reed et al., 2004). I believe that the only thing that may change that<br />
conclusion is the introduction of new spatial data from new airborne or satellite sensors. Whereas these<br />
new data types have proven to have great value for many hydrological purposes and for special conditions<br />
(e.g. snow cover), it has in general not yet been documented that they can provide distributed<br />
models with comparative advantages in simulation of catchment runoff.<br />
3.2 Reactive Transport<br />
3.2.1 Oxygen transport and consumption in the unsaturated zone ([3])<br />
Summary<br />
Publication [3] describes the development of a new code for simulation of oxygen transport and consumption<br />
in the unsaturated zone. The code was linked as a sub-component to the SHE modelling system<br />
(Abbott et al., 1986a,b). The objective of the paper was to describe the new process formulation,<br />
document its applicability through two case studies and outline the perspectives in relation to its use as<br />
part of the comprehensive SHE code.<br />
The unsaturated zone water flow calculations in SHE were based on a finite difference solution to the<br />
full Richards’ equation for unsteady soil water flow. The solute transport calculations were based on the<br />
traditional convection-dispersion equation. The new code for oxygen transport and consumption was an<br />
add-on to these first two steps and used information on soil moisture content, water flows and solute<br />
concentrations and fluxes as input. Thus the spatial representation is given by the underlying flow and<br />
solute transport discretisation, implying a one-dimensional description with spatial resolution ranging<br />
from a few cm close to the terrain to 20-40 cm further down in the soil column.<br />
The process description in [3] is based on a three-phase system (soil, water, air) and accounts for<br />
spatial heterogeneity at this small scale. Fig. 18 shows a microscale illustration of the soil. Air tends to<br />
fill the larger pores in the soil matrix whereas water is drawn into the narrow necks and finer pore<br />
spaces in aggregates, forming capillary films and wedges. The air and water coexist in the soil by occupying<br />
different geometric configurations. Oxygen movement within these different portions of the pore<br />
space can occur by: convective transport in the water, diffusion in water, convective transport in soil air,<br />
diffusion in soil air, diffusion into water-saturated soil crumbs, and consumption in free and fixed water.<br />
Microorganisms and plant roots are generally found in the finer pores of the soil because they require<br />
close contact with the soil particles for uptake of substrate and nutrients. Transport of oxygen to these<br />
respiring sites usually occurs in the water phase of soil crumbs. It is the rate of oxygen diffusion through<br />
this fixed water in micropores that will determine the availability of oxygen for respiration and the anaerobic<br />
fraction of the soil. A soil crumb is considered to be any fully water-saturated subvolume of soil,<br />
the physical size of which is determined by the nearness of air-filled soil pores. The crumb is thus defined<br />
by the fact that oxygen transport within the crumb is primarily due to diffusion in water-filled pores.<br />
The size of the soil crumbs is dependent on the water content of the soil and the corresponding number<br />
of air-filled pores.<br />
The relation between soil water content and size of the water crumbs is derived from the soil water retention<br />
curve that is already used in Richards’ equation. The idea behind this is illustrated in Fig. 19 and<br />
described in more detail in [3]. The number of air-filled pores at a given soil moisture content can be<br />
calculated from the retention curve (Fig. 19b). It is furthermore assumed that the distance between two<br />
air-filled pores, d_i, corresponds to the average diameter of a water-saturated crumb (Fig. 19a).<br />
[Figure 18 labels: air; “free” water; solids/aggregates; “fixed” water; anaerobic zone; aerobic zone]<br />
Fig. 18 Microscale representation of the three-phase soil system with respect to oxygen transport.<br />
[Figure 19 labels: (a) air-filled pore, water-saturated crumb, crumb diameter d_i, unit length L; (b) axes: tension (ψ) and pore radius (p) versus water content (θ), with θ_i and θ_(i+1) marked]<br />
Fig. 19 (a) The assumed pore distribution within the unit L x L. (b) Retention curve showing the relation<br />
between tension, water content and pore radius of a soil.<br />
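The link between tension and pore radius shown in Fig. 19b can be illustrated with the standard capillary relation. The sketch below is not the formulation used in [3]: the physical constants, the zero contact angle, the function names and the regular square-grid arrangement of air-filled pores are all assumptions made for illustration only.

```python
import math

# Assumed physical constants (not taken from [3]).
SURFACE_TENSION = 0.0728  # N/m, water against air at about 20 degrees C
WATER_DENSITY = 998.0     # kg/m^3
GRAVITY = 9.81            # m/s^2


def largest_water_filled_radius(tension_m, contact_angle_deg=0.0):
    """Capillary relation: radius (m) of the largest pore that stays
    water-filled at a tension given in metres of water column."""
    if tension_m <= 0:
        raise ValueError("tension must be positive")
    cos_angle = math.cos(math.radians(contact_angle_deg))
    return 2.0 * SURFACE_TENSION * cos_angle / (WATER_DENSITY * GRAVITY * tension_m)


def crumb_diameter(n_air_filled_pores, unit_length):
    """Hypothetical geometry: air-filled pores placed on a regular square
    grid within a unit L x L, so that their spacing is L / sqrt(N)."""
    if n_air_filled_pores <= 0:
        raise ValueError("need at least one air-filled pore")
    return unit_length / math.sqrt(n_air_filled_pores)
```

Under these assumptions a drier soil (higher tension, more air-filled pores) gives a smaller crumb diameter, which is the qualitative behaviour described in the text.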
The two case studies where the model code was tested and demonstrated dealt with operation of a<br />
waste water infiltration plant and assessment of anaerobic zones of importance for denitrification in<br />
agricultural soils.<br />
Discussion - post evaluation<br />
Previous research in oxygen transport processes in heterogeneous soils (e.g. Currie, 1961; Smith,<br />
1980; Troeh et al., 1982) was based on the assumption of steady-state conditions with regard to<br />
crumb/aggregate size and aerobic-anaerobic fractions. The novel scientific contribution of this paper<br />
was the new concept of calculating the size of the water crumbs as a function of the water retention<br />
curve and the time varying soil moisture content originating from SHE calculations and the linking of this<br />
concept to the previous research in this field. In this way it became possible to calculate aerobic-anaerobic<br />
fractions dynamically.<br />
Although the scale of consideration in this study is the smallest possible in a catchment modelling perspective,<br />
namely point or column scale, it illustrates that smaller scale phenomena (here diffusion into<br />
soil crumbs that are of mm or less in size and temporally varying) often dominate the oxygen conditions<br />
at grid (cm - dm) scale. The approach in [3] is an upscaling from grain size to computational model grid<br />
point, where the within grid heterogeneity is accounted for by developing a set of process equations<br />
that includes the effect of the smaller scale heterogeneity at the larger grid scale.<br />
In retrospect, it is interesting to consider the issues that were not discussed in [3]. In this respect it<br />
should be noted that code verification aspects were not mentioned in [3], although a completely new<br />
code was developed. Furthermore, [3] did not discuss the issue of upscaling the present grid scale<br />
processes to application at catchment scale. Interesting issues in this regard would be evaluations of<br />
how data and parameter values could be assessed for catchment scale applications and discussions of<br />
whether it would still be the mm-scale (crumbs) processes that would be dominating when simulating at<br />
large scale, or whether larger scale heterogeneities, such as differences in crops, soil types or topography,<br />
would become more important and thus reduce the importance of the present process description.<br />
The model code presented in [3] was developed in a ‘research version’ of the SHE code. After the<br />
completion of the study it was not upgraded to become part of the ‘commercial version’ of MIKE SHE<br />
that emerged a few years later. The oxygen model has not been used for practical purposes.<br />
To my knowledge, a process description of the same detail as in [3] has not been included in any catchment<br />
model, and not even in the most comprehensive physically-based root zone models such as<br />
DAISY (Hansen et al., 1991; Abrahamsen and Hansen, 2000). In DAISY, which provides state-of-the-art<br />
descriptions of root zone processes with focus on water, plant growth and nitrogen, a much simpler and<br />
more empirical process formulation is used for calculating denitrification as a function of anaerobic subsoil<br />
conditions.<br />
3.2.2 An integrated model for the Danubian Lowland ([9])<br />
Summary<br />
Publication [9] is concerned with environmental assessment studies in connection with the Gabcikovo<br />
hydropower scheme along the Danube. The objective of the underlying study was to develop and apply<br />
a comprehensive integrated modelling system to support management decisions in this respect.<br />
The Danubian Lowland (Fig. 20) in Slovakia and Hungary downstream of Bratislava is an inland delta<br />
formed in the past by river sediments from the Danube. The entire area forms an alluvial aquifer, which<br />
throughout the year receives around 30 m³/s of infiltration water from the Danube in the upper parts of the<br />
area and returns it to the Danube and the drainage canals in the downstream part. The aquifer is an<br />
important water resource for municipal and agricultural water supply, and the floodplain area with its<br />
alluvial forests and associated ecosystems represents a unique landscape of outstanding ecological<br />
importance.<br />
Fig. 20 The Danubian Lowland with the new reservoir and the Gabcikovo hydropower scheme.<br />
The Gabcikovo hydropower scheme was put into operation in 1992. A large number of hydraulic structures<br />
was established as part of the hydropower scheme. The key structures are a system of weirs<br />
across the Danube at Cunovo 15 km downstream of Bratislava, a reservoir created by the damming at<br />
Cunovo, a 30 km long lined navigation canal, outside the floodplain area, parallel to the Danube River<br />
with intake to the hydropower plant, a hydropower plant and two ship-locks at Gabcikovo, and an intake<br />
structure at Dobrohost, 10 km downstream of Cunovo, diverting water from the new canal to the river<br />
branch system. The entire scheme has significantly affected the hydrological regime and the ecosystem<br />
of the region. The scheme was originally planned as a joint effort between former Czecho-Slovakia and<br />
Hungary, and the major parts of the construction were carried out as such on the basis of an international<br />
treaty from 1977. However, since 1989 Gabcikovo has been a major matter of controversy between<br />
Slovakia and Hungary, who have referred some disputed questions to the International Court of<br />
Justice in The Hague (ICJ, 1997).<br />
The hydrological regime in the area is very dynamic with so many crucial links and feedback mechanisms<br />
between the various parts of the surface- and subsurface water regimes that no single existing model code<br />
was able to describe the entire regime. Therefore, the modelling system illustrated in Fig. 21 was established.<br />
It integrates four model codes: (a) MIKE 21 (DHI, 1995) for describing the reservoir (2D flow, eutrophication,<br />
sediment transport); (b) MIKE 11 (Havnø et al., 1995) describing the river and river<br />
branches (1D flow including effects of hydraulic control structures, water quality, sediment transport);<br />
(c) MIKE SHE (Refsgaard and Storm, 1995) describing the ground water (3D flow, solute transport,<br />
geochemistry) and flood plain conditions (dynamics of inundation pattern, ground water and soil moisture<br />
conditions); and (d) DAISY (Hansen et al., 1991) describing agricultural aspects (crop yield, irrigation,<br />
nitrogen leaching).<br />
Fig. 21 Structure of the integrated modelling system with indication of the interactions between the individual<br />
models<br />
The interfaces between the various models were:<br />
A) MIKE SHE forms the core of the integrated modelling system having interfaces to all the individual<br />
modelling systems. The coupling of MIKE SHE and MIKE 11 is a fully dynamic coupling<br />
where data is exchanged within each computational time step.<br />
B) Results of eutrophication simulations with MIKE 21 in the reservoir are used to estimate the concentration<br />
of various water quality parameters in the water that enters the Danube downstream of<br />
the reservoir. This information serves as boundary conditions for water quality simulations for the<br />
Danube using MIKE 11.<br />
C) Sediment transport simulations in the reservoir with MIKE 21 provide information on the amount<br />
of fine sediment on the bottom of the reservoir. The simulated grain size distribution and sediment<br />
layer thickness are used to calculate leakage coefficients, which are used in ground water modelling<br />
with MIKE SHE to calculate the exchange of water between the reservoir and the aquifer.<br />
D) DAISY simulates vegetation parameters that are used in MIKE SHE to simulate the actual<br />
evapotranspiration. Ground water levels simulated with MIKE SHE act as lower boundary conditions<br />
for DAISY unsaturated zone simulations. Consequently, this process is iterative and requires<br />
several model simulations.<br />
E) Results from water quality simulations with MIKE 11 and MIKE 21 provide estimates of the concentration<br />
of various components/parameters in the water that infiltrates to the aquifer from the<br />
Danube and the reservoir. This can be used in the ground water quality simulations (geochemistry)<br />
with MIKE SHE.<br />
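The iterative exchange described under D) can be sketched as a simple fixed-point loop. This is only an illustration of the idea: the two toy functions are hypothetical stand-ins with invented coefficients and have nothing to do with the actual DAISY or MIKE SHE codes, and the convergence tolerance and iteration limit are assumptions.

```python
def iterate_coupling(run_root_zone, run_groundwater, initial_level,
                     tol=1e-3, max_iter=50):
    """Alternate two models until the exchanged groundwater level
    changes by less than tol between successive iterations."""
    level = initial_level
    for iteration in range(1, max_iter + 1):
        vegetation = run_root_zone(level)        # root zone model uses the level
        new_level = run_groundwater(vegetation)  # groundwater model uses vegetation
        if abs(new_level - level) < tol:
            return new_level, vegetation, iteration
        level = new_level
    raise RuntimeError("coupling did not converge")


def toy_root_zone(groundwater_depth_m):
    """Hypothetical stand-in: leaf area index falls as the water table deepens."""
    return max(0.5, 4.0 - 0.5 * groundwater_depth_m)


def toy_groundwater(leaf_area_index):
    """Hypothetical stand-in: more vegetation means more evapotranspiration
    and hence a deeper water table (depth in m)."""
    return 1.0 + 0.3 * leaf_area_index


depth, lai, n_runs = iterate_coupling(toy_root_zone, toy_groundwater, 2.0)
```

Each pass through the loop corresponds to one pair of model simulations, which is why such a coupling requires several runs before the exchanged variables are mutually consistent.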
The integrated model was established for the 3,000 km 2 area on the basis of a large amount of good<br />
quality data. Most of the model parameters were assessed directly from field data, and some were estimated<br />
through calibration. For most of the individual model components, traditional split-sample validation<br />
tests were carried out.<br />
The modelling system was used in a scenario approach to assess the environmental impacts of alternative<br />
water management options. The uncertainties of the model predictions were assessed through<br />
sensitivity analyses. As an example, Figs. 22 and 23 show a characterisation of the floodplain area<br />
between the (old) main Danube river channel (western model boundary) and the power canal for pre-dam<br />
(Fig. 22) and a hypothetical post-dam condition (Fig. 23) where the major part of the water is diverted<br />
from the main Danube channel to the power canal. The classes with different ground water depths<br />
and flooding have been determined from ecological considerations according to requirements of<br />
(semi)terrestrial (floodplain) ecotopes. For the pre-dam condition (Fig. 22) the contact between the main<br />
Danube river and the river branch system is clearly seen. Similar results for a hypothetical post-dam water<br />
management regime (Fig. 23) show significant differences in hydrological regime, e.g. many areas are<br />
characterised by high groundwater tables and little or infrequent flooding, while the pre-dam situation (Fig. 22)<br />
generally has deeper ground water tables and more frequent flooding. From such changes in hydrological<br />
conditions inferences can be made on possible changes in the floodplain ecosystem.<br />
Fig. 22 Hydrological regime in the river branch area for 1988 pre-dam conditions characterised in ecological<br />
classes<br />
Fig. 23 Hydrological regime in the river branch area for a post-dam water management regime characterised<br />
in ecological classes. The scenario has been simulated using 1988 observed upstream discharge<br />
data and a given hypothetical operation of the hydraulic structures.<br />
Discussion - post evaluation<br />
The uniqueness of the established modelling system is the integration between the individual model<br />
codes, each of which provides complex distributed physically-based descriptions of the various processes.<br />
The validation tests have generally been carried out for the individual models, whereas only few<br />
tests on the integrated model were possible. Altogether, the integrated modelling system and the applications<br />
were more comprehensive and complex in terms of interactive dynamics between different<br />
components of an ecosystem than had previously been reported in the scientific literature.<br />
In the years following [9] a few comprehensive large scale studies with coupled models emerged. The<br />
most comprehensive of those was probably that of Wolf et al. (2003), who developed the STONE model for calculating<br />
nutrient emissions from agriculture in The Netherlands. Although based on different codes,<br />
STONE resembles the integrated modelling system in [9] in terms of number of codes and complexity<br />
of process descriptions. One main difference, however, was that STONE consists of a chain of models<br />
without the feedback couplings that characterise [9]. Simpler, although still comprehensive, modelling<br />
systems were presented by Birkinshaw and Ewen (2000) as the SHETRAN code with a built-in nitrate<br />
transformation component and Conan et al. (2003) with a coupling of SWAT, MODFLOW and MT3DMS<br />
also focusing on nitrate fate at catchment scale.<br />
The complexity of the modelling studies in [9] may be compared to coupled modelling studies in<br />
neighbouring fields. The hydrology related field with the strongest modelling traditions is no doubt<br />
atmospheric science. Here very comprehensive coupled models have been used in connection with<br />
hydrology oriented climate change studies. An example of a sequentially coupled atmospheric-hydrological<br />
model from that period is Graham (1999) who used the ECHAM4 regional atmospheric<br />
model coupled with the HBV hydrological model to simulate discharge for the entire 1.6 × 10⁶ km² Baltic<br />
Sea basin. The atmospheric modelling component is in itself more demanding in terms of computer<br />
power than comprehensive hydrological modelling such as [9], and the complexity of the atmospheric<br />
modelling may be greater than the complexity of the individual process model codes in [9]. Otherwise<br />
the complexity of the coupled atmospheric-hydrological studies with respect to feedback couplings between<br />
process descriptions, data requirements, different scales for different processes, etc., may be<br />
considered comparable to the complexity of [9].<br />
In retrospect it is interesting to evaluate how much this comprehensive modelling system actually was used<br />
as part of the political decision process. Was the full potential of the models utilised by the decision<br />
makers? In the following, my personal perception of these aspects is presented. The application of the<br />
integrated modelling and information system in practice may be categorised into three principally different<br />
functions: (a) to assist in design of structures and details of water management regimes, (b) to assist in<br />
policy analysis by assessing the environmental impacts of alternative water management regimes, and (c)<br />
to assist in resolving different views between interest groups on environmental assessments.<br />
The use of models to assist in designs is the classical "engineering" way of using such models. There were<br />
a number of such applications. The best example of this is the final design in 1993 of the guiding structures<br />
of the Cunovo reservoir that was based on model simulations. Such model use was possible, because the<br />
objectives of the decision-makers were clear and there was an urgent need for the results before the<br />
construction works actually started.<br />
Use of models to assess the environmental impacts of alternative water management regimes was one of<br />
the primary reasons for establishing the modelling systems. There were several examples of such model<br />
applications. A key example was a combined field and modelling study of the geochemical conditions in the<br />
aquifer to assess whether the changed boundary conditions with the new reservoir would affect the redox<br />
conditions and hence the groundwater quality in the aquifer that forms the basis for the water supply of<br />
Bratislava. Another example is a combined field and modelling study of the eutrophication conditions in the<br />
reservoir. Such studies were conducted in close dialogue with the decision-makers in order to assist in their<br />
policy formulation.<br />
Finally, the modelling system was an invaluable tool in connection with the international attempts made to<br />
assist in resolving some of the issues that were disputed between Slovakia and Hungary. Many of the<br />
arguments brought forward on these highly controversial issues were mixtures of scientifically based facts<br />
and politically based views, but they were often claimed as purely scientifically based. It is very natural and<br />
fully legitimate that all parties have political interests and do their best to pursue them. However, the mixing<br />
of scientific facts and political interest makes the whole scene less transparent and may be an obstacle for<br />
arriving at rational decisions. The role the modelling system had in this context was that it made it possible<br />
on some occasions to help distinguish between facts and fiction with respect to the scientific arguments. In<br />
this way the modelling tools assisted in separating scientific and political problems. Thus, the modelling<br />
system was often used as an important tool in resolving technical disagreements between the Slovakian<br />
and Hungarian delegations in the international expert groups (EC, 1992, 1993a, 1993b). Similarly, it is my<br />
impression that the modelling results played a significant role for the International Court of Justice when<br />
dealing with the question of whether the ecological situation could be characterised as a catastrophe<br />
justifying the use of the legal principle of “the ecological state of necessity” as done when Hungary stopped<br />
the construction works on the Gabcikovo scheme in 1989 (ICJ, 1997).<br />
However, there were also clear limitations to the application of the modelling tools. These limitations<br />
occurred when the political objectives were not clearly defined. It was for instance imagined that the<br />
modelling tools should be used to identify the optimal solution for the water management regime in the river<br />
branch system. This unique area is, however, subject to considerable interest from different sectors such<br />
as commercial forestry, fishery, tourism and natural conservation. The requirements of these different<br />
sectoral interests are not common and in some cases even contradictory with respect to how the water<br />
regime should be. Thus, until the balance of interests between these different stakeholders has been<br />
decided in terms of clear political goals from the government, an optimal solution does not exist. Another<br />
example of lack of clear political goals was related to the overall sharing of water between hydropower and<br />
the environment.<br />
3.2.3 Large scale modelling of groundwater contamination ([10])<br />
Summary<br />
Publication [10] describes results from an EU research project on groundwater pollution from non-point<br />
sources. The rationale outlined in [10] is that physically based models for describing nitrate may, owing to their better<br />
process descriptions, be expected to have better predictive capabilities than simpler empirical<br />
models for certain applications related to assessing the impacts of changes in agricultural management<br />
practice. Such models were well proven for simulation of nitrate contamination at small scale with good<br />
data availability. Two of the main constraints for using such models operationally were that (a) the databases<br />
existing at national or European scale had not previously been tested as input for such models;<br />
and (b) almost no tests had been conducted for such models at large scale. The objectives of the paper<br />
were therefore to study the data availability at the large scale and develop methodologies for model<br />
upscaling/aggregation to represent conditions at larger scale. The theoretical aspects on scaling included<br />
in [10] are dealt with in Section 4.1. Here some key results from one of the two catchments (Karup)<br />
are discussed.<br />
The modelling system used was MIKE SHE (Refsgaard and Storm, 1995) coupled with the DAISY root<br />
zone model (Hansen et al., 1991). Two Danish catchments of about 500 km² each, Karup and Odense,<br />
were used for the tests.<br />
The principles used for collecting input data and assessing values of model parameters were:<br />
• The data must be easily accessible. This implied that most of the data were aggregated data from<br />
national or European databases.<br />
• No model calibration is carried out. Instead parameter values are estimated from generic transfer<br />
functions.<br />
Data were collected from the following sources:<br />
• Topography: 1 km grid data downloadable from USGS and GISCO (Geographical Information System<br />
of the European Commission)<br />
• Catchment boundaries and river network: generated from the topographical data using standard<br />
GIS functionality.<br />
• River cross-sections: derived from a special GIS application where the cross-section was estimated<br />
based on upstream catchment area, slope and a characteristic discharge.<br />
• Soil type: GISCO soil map.<br />
• Soil organic matter: experience values.<br />
• Vegetation: EEA CORINE land cover map.<br />
• Agricultural management practice: agricultural statistics and government prescribed norms<br />
• Geology and groundwater abstraction: EC report<br />
• Climatic variables and discharge data: national data<br />
The MIKE SHE models were run with 1, 2 and 4 km grids. For describing the nitrate leaching from the<br />
root zone, 17 crop rotation schemes were established by use of DAISY. The crop rotations were based<br />
on the statistical information on crop type and livestock densities. The 17 schemes were distributed<br />
randomly over the catchment in such a way that the statistical distribution was in accordance with the<br />
agricultural statistics. As an alternative, all the agricultural area was described by one representative<br />
crop instead of 17 cropping patterns. These two approaches are denoted ‘Distributed’ and ‘Uniform’ in<br />
Figs. 24 and 25 below.<br />
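The random distribution of cropping schemes over the catchment can be sketched as follows. This is a minimal illustration and not the procedure actually implemented in [10]: the function name, the largest-remainder rounding and the fixed random seed are assumptions, and the scheme names and shares used in any call are placeholders.

```python
import random


def allocate_schemes(n_cells, area_shares, seed=0):
    """Assign one scheme to each of n_cells grid cells so that the number of
    cells per scheme matches the given shares as closely as possible, with
    the spatial placement chosen at random."""
    names = list(area_shares)
    total = sum(area_shares.values())
    exact = [n_cells * area_shares[name] / total for name in names]
    counts = [int(x) for x in exact]  # round down first
    leftover = n_cells - sum(counts)
    # Largest-remainder rounding: hand the leftover cells to the schemes
    # with the largest fractional parts.
    by_fraction = sorted(range(len(names)),
                         key=lambda i: exact[i] - counts[i], reverse=True)
    for i in by_fraction[:leftover]:
        counts[i] += 1
    cells = [name for name, count in zip(names, counts) for _ in range(count)]
    random.Random(seed).shuffle(cells)  # random location, fixed proportions
    return cells
```

With a single entry in `area_shares` the same function reproduces the 'Uniform' case, in which every agricultural cell carries the same representative crop.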
The Karup model was validated by comparison of model simulations and field data on annual water<br />
balances, discharge hydrographs (Fig. 24) and nitrate concentrations in the upper groundwater layer<br />
from 35 observation wells (Fig. 25). The results of the validation tests were characterised as follows:<br />
• The annual water balance was simulated remarkably well with only 2% difference as average value<br />
over the five-year validation period. The variation over the year (Fig. 24) is less well described.<br />
• The simulated nitrate concentrations (Fig. 25) match the observed data remarkably well both with<br />
respect to average concentrations and statistical distribution of concentrations within the catchment.<br />
• The simulations are clearly affected by various scale effects (1, 2, 4 km grid and Distributed/Uniform).<br />
This is addressed further in Section 4.1 below.<br />
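The two kinds of comparison used in these validation tests, a water balance difference in percent and a cumulative frequency distribution of concentrations, can be computed along the following lines. The function names are hypothetical and the sketch makes no claim about how the statistics were actually calculated in [10].

```python
def water_balance_error_percent(simulated, observed):
    """Difference between total simulated and total observed runoff,
    expressed in percent of the observed total."""
    total_observed = sum(observed)
    if total_observed == 0:
        raise ValueError("observed total must be non-zero")
    return 100.0 * (sum(simulated) - total_observed) / total_observed


def cumulative_frequency(values, thresholds):
    """Fraction of values (e.g. nitrate concentrations in wells or grid
    cells) that are less than or equal to each threshold."""
    n = len(values)
    if n == 0:
        raise ValueError("no values given")
    return [sum(1 for v in values if v <= t) / n for t in thresholds]
```

Evaluating `cumulative_frequency` on observed and simulated concentrations with the same thresholds gives curves that can be compared in the manner of Fig. 25.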
Fig. 24 Comparison of the recorded discharge hydrograph for the Karup catchment with simulations<br />
based on 1, 2 and 4 km grids. The two simulated curves correspond to the combined upscaling/aggregation<br />
procedure (Distributed) and the simpler upscaling procedure (Uniform).<br />
[Figure 25 panels: upper, distribution of groundwater concentrations (ultimo 1993), uniform agricultural representation; lower, distribution of groundwater concentrations (ultimo 1993), distributed agricultural representation. Axes: cumulative frequency (0 to 1) versus concentration (0 to 180 mg/l). Curves: measured, det1000, det2000 and det4000.]<br />
Fig. 25 Comparison of statistical distribution of nitrate concentrations in groundwater for the Karup<br />
catchment simulated by the model with 1, 2 and 4 km grids and observed in 35 wells. The lower figure corresponds<br />
to the upscaling procedure resulting in a distributed representation of agricultural crops, while<br />
the upper figure is from the run with the upscaling procedure, where all agricultural area is represented<br />
by one uniform crop.<br />
Discussion - post evaluation<br />
The model codes used in [10] were well known and previously used in one of the catchments (Styczen<br />
and Storm, 1993a, b). The scientific contributions of [10] relate partly to scaling issues, which are dealt<br />
with in Section 4.1 below, and partly to testing the performance of nitrate catchment models when<br />
scarce data are used and when no model calibration is carried out. The most important finding with<br />
respect to data availability is probably that aggregated data in many cases can provide sufficient input<br />
to perform useful model simulations. This message is similar to the output from the first large scale application<br />
of SHE to catchments in India with scarce data ([4] and [5]), namely that an apparent lack of<br />
primary data should not always prevent you from using a model.<br />
With regard to data availability at large scale it was concluded that the most critical data that may cause<br />
problems for large scale applications are the geological data for which no suitable global or European<br />
digital database exists. In this respect the development of a national hydrological model in Denmark<br />
(Henriksen et al., 2003) that is based on comprehensive geological data from the very large national<br />
geological database is an important development.<br />
The study showed that one of the strengths of physically-based models is the possibility to assess<br />
many parameter values from standard values, achieved from experience through a number of other<br />
applications. It also showed some of the limitations in this respect. While the key results in terms of<br />
annual runoff and nitrogen concentration distributions are encouraging, the discharge hydrographs<br />
clearly illustrate that it would be very easy to obtain a better hydrograph fit through calibration of a couple<br />
of parameter values. When parameters are assessed in this way they are subject to considerable<br />
uncertainty, which will generate significant uncertainty in model predictions. This aspect is addressed in<br />
[11], which is discussed in Section 4.3 below.<br />
The attempt to assess parameter values directly from data without any model calibration can be seen<br />
as the extreme end of the development starting with hundreds of free parameters in the Suså model<br />
([1]), over 26 parameters in the Kolar basin in India ([5]), to 11 free parameters in a previous Karup<br />
study ([7]). The results from the present study showed some obvious shortcomings of this approach,<br />
and in a later study of the Senegal basin (Andersen et al., 2001) we used 4 free parameters for calibration.<br />
3.3 Real-time Flood Forecasting<br />
3.3.1 Intercomparison of updating procedures for real-time forecasting ([8])<br />
Summary<br />
Publication [8] presents a classification of updating procedures used in real-time flood forecasting modelling<br />
and a review of the results from the WMO project ‘Simulated Real-Time Intercomparison of Hydrological<br />
Models’ (WMO, 1992) comprising more than 10 commonly used hydrological model codes<br />
and a variety of different updating procedures. The objective of the paper was to analyse the performance<br />
of different types of updating procedures and to assess what is more important, the simulation<br />
model or the updating procedure.<br />
In the context of real-time forecasting a hydrological catchment model, such as those in the remaining part of<br />
this thesis, may be denoted a process model (Fig. 26). A process model consists of a model structure<br />
including process equations, model parameters that are constant throughout a model run and state<br />
variables. The transformation from input to output by the process model is called simulation, in accordance<br />
with the terminology defined in Section 2.2 above. Process models that operate in real-time may<br />
take into consideration the measured discharge/water level at the time of preparing the forecast. This<br />
feedback process of assimilating the measured data into the forecasting procedure is referred to as<br />
updating, or data assimilation. Updating procedures can be classified according to four different methodologies<br />
(Fig. 26):<br />
1. Updating of input variables, typically by adjusting precipitation.<br />
2. Updating of state variables, e.g. the soil moisture content.<br />
3. Updating of model parameters.<br />
4. Updating of output variables (error prediction).<br />
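The fourth methodology (error prediction) can be illustrated with a minimal sketch. It assumes that the simulation<br />
error persists as a first-order autoregressive process; this AR(1) form, the function name and the coefficient<br />
`phi` are illustrative assumptions and not the procedure of any of the codes reviewed in [8].<br />

```python
def forecast_with_error_prediction(q_sim_future, q_obs_now, q_sim_now, phi=0.8):
    """Correct simulated discharges with an AR(1) forecast of the model error.

    phi is a hypothetical error persistence coefficient (0 <= phi < 1).
    """
    error_now = q_obs_now - q_sim_now  # error known at the time of forecast
    updated = []
    for lead, q_sim in enumerate(q_sim_future, start=1):
        # the predicted error decays geometrically with lead time
        updated.append(q_sim + (phi ** lead) * error_now)
    return updated


# The simulation is 10 units too low at the time of forecast.
updated = forecast_with_error_prediction(
    [52.0, 55.0, 50.0], q_obs_now=60.0, q_sim_now=50.0, phi=0.5
)
print(updated)  # [57.0, 57.5, 51.25]
```

Because the predicted error decays with lead time, the updated forecast falls back on the plain process model<br />
simulation at long lead times.<br />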
The core of the WMO project was a workshop held in Vancouver during the period July 30 – August 8,<br />
1987, where 15 models from 14 different organisations were run in a simulated real-time environment.<br />
Data from three catchments with significantly different hydrological characteristics were used for the<br />
tests. Before the workshop the modellers had received historical data for several years for calibration<br />
and validation and two ‘warm up’ flood events. During the workshop four additional flood events were<br />
forecasted as blind tests, each with seven forecasts at consecutive times. Each event was forecasted<br />
within one workshop day, often under considerable time pressure.<br />
I participated in the workshop with two models that differed both with respect to process model and<br />
updating procedure:<br />
• NAMS11 comprising the NAM as catchment model, St. Venant river routing and an error prediction<br />
model as updating procedure. This is basically identical to what later became known as the flood<br />
forecasting module of MIKE 11 (Havnø et al., 1995).<br />
• NAMKAL comprising the NAM formulated in a state-space form and built into an extended Kalman<br />
filter for updating. This version had no separate river routing but relied on the linear reservoirs in<br />
NAM.<br />
The two models were tested on the 104 km² Orgeval catchment (France) and the 2,344 km² Bird Creek<br />
catchment (United States). The models were not tested on the third, snow-dominated catchment.<br />
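The idea of updating state variables with a Kalman filter can be sketched for a single linear reservoir. This is a<br />
deliberately simplified, plain (linear) Kalman filter and not the extended filter built around NAM in NAMKAL; the<br />
reservoir constant, the noise variances and all names below are hypothetical values chosen for illustration.<br />

```python
def kalman_step(storage, variance, rain, q_obs, k=10.0, dt=1.0,
                model_noise=1.0, obs_noise=0.01):
    """One predict/update cycle for a linear reservoir with Q = storage / k.

    Uses an explicit time step, so dt must be smaller than k.
    """
    a = 1.0 - dt / k   # state transition of the explicit reservoir step
    h = 1.0 / k        # observation operator: discharge from storage
    # Predict: propagate the storage and its error variance one time step.
    storage_pred = a * storage + dt * rain
    variance_pred = a * a * variance + model_noise
    # Update: weigh the discharge innovation by the Kalman gain.
    gain = variance_pred * h / (h * h * variance_pred + obs_noise)
    storage_new = storage_pred + gain * (q_obs - h * storage_pred)
    variance_new = (1.0 - gain * h) * variance_pred
    return storage_new, variance_new


storage, variance = 50.0, 25.0
storage_without_update = (1.0 - 1.0 / 10.0) * storage + 2.0  # plain simulation
storage, variance = kalman_step(storage, variance, rain=2.0, q_obs=6.0)
print(storage_without_update / 10.0, storage / 10.0)  # simulated vs updated discharge
```

The updated discharge lies between the simulated and the observed value; how close it moves to the observation<br />
depends on the assumed ratio between model noise and observation noise.<br />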
Fig. 26 Schematic diagram of simulation and forecasting with illustration of four different updating<br />
methodologies [8].<br />
Summary results from the two catchments are shown in Fig. 27 as root mean square errors (RMSE) as<br />
a function of forecast lead time (lag). As can be seen from the figure the intercomparison test turned out<br />
to be a very close ‘race’ with at least one third of the models performing almost equally well. Depending<br />
on the selected criteria for comparison (which catchment, priority to short, medium or long lead times,<br />
etc.) several of these could claim to be the ‘best model’. Perhaps more interesting are some of the<br />
general findings:<br />
• The process models belonged to two of the classes shown in Fig. 6, namely empirical (black box)<br />
models and lumped conceptual models. From the results it was not possible to clearly distinguish<br />
which model type performed better.<br />
• All four types of updating procedures were represented, both among the models with the best performance<br />
and among the models with the poorest performance. This indicates that the selection of<br />
a specific updating methodology is only one out of several important factors.<br />
• The forecast error (RMSE) generally increases with forecast lead time. This shows that updating<br />
procedures most often significantly improve the performance of hydrological models for short-range<br />
forecasting.<br />
• In most cases the models with the best performance for short lead times were also those with the<br />
best results for the long lead times. This indicates that the goodness of the basic simulation (by the<br />
process model) is crucial to forecast accuracy, or in other words that a good updating procedure<br />
can not compensate for a poor process model.<br />
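The performance measure behind these findings, RMSE as a function of forecast lead time, can be computed as in<br />
the sketch below. The data layout (one row per forecast, one column per lead time) and the numbers are assumptions<br />
made for illustration and are not data from the WMO project.<br />

```python
import math


def rmse_by_lead_time(forecasts, observations):
    """Return one RMSE value per lead time.

    forecasts[i][j] is the forecast issued at time i for lead time j + 1;
    observations[i][j] is the discharge later observed for that same time.
    """
    n_leads = len(forecasts[0])
    rmse = []
    for j in range(n_leads):
        squared = [(f[j] - o[j]) ** 2 for f, o in zip(forecasts, observations)]
        rmse.append(math.sqrt(sum(squared) / len(squared)))
    return rmse


forecasts = [[10.0, 12.0], [20.0, 18.0]]      # hypothetical forecasts
observations = [[11.0, 15.0], [19.0, 22.0]]   # hypothetical observations
rmse = rmse_by_lead_time(forecasts, observations)
print(rmse)
```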
Discussion - post evaluation<br />
Real-time forecasting is the toughest field I have experienced in hydrological modelling with respect to<br />
model validation, because the results of the model forecasts are continuously confronted with observations.<br />
In many studies involving model simulations for planning purposes it is often not possible to conduct<br />
a validation test that exactly fits the conditions for which model simulations of future conditions are<br />
needed. Therefore, the validation test results will often have many qualifiers and be considered together<br />
with other arguments. In real-time flood forecasting there is no need for such qualifiers and arguments<br />
(‘no nonsense’) and therefore only the hard facts are considered.<br />
Fig. 27 Root Mean Square Errors (RMSE) as a function of forecast lead time for all models participating<br />
in the Orgeval and Bird Creek catchments. The RMSE values are averaged over the four forecasted<br />
flood events with blind tests (events 3-6), [8].<br />
The main scientific contribution of [8] was the analysis of the performance of different types of process<br />
models and updating procedures and combinations thereof. Our motivations to participate in this unique<br />
WMO intercomparison project were (a) to test DHI’s code NAMS11 (now MIKE 11), which was used<br />
operationally in India at that time, in an intercomparison with some of the internationally leading codes<br />
and modellers; and (b) to test whether an extended Kalman filter could provide a better updating routine<br />
than the more commonly used and simpler error prediction routine. In addition to noting that the<br />
NAMS11 performed very well and that the extended Kalman filter under ideal conditions could perform<br />
marginally better than the standard updating procedure, the analysis led to the following interesting<br />
findings:<br />
• It was not possible to conclude which model type, black box or lumped conceptual, is better suited<br />
for simulation of runoff. This is in good agreement with [6] and later studies such as Reed et al.<br />
(2004), which concluded that lumped conceptual and distributed physically-based models performed<br />
equally well for split-sample tests. Thus it may be argued that all three model types described<br />
in Section 2.4 in many cases can be expected to be able to perform equally well in rainfall-runoff<br />
modelling.<br />
• It turned out that the personal factor is maybe the most important aspect of hydrological modelling.<br />
It was clear after the workshop that the difference in model performances between the participating<br />
codes could often not be explained by differences in model codes. Personal factors such as the<br />
modeller’s ability to make a good model calibration, experience from working in hydrological regimes<br />
different from the regime you see in your home office, ability to work under extreme stress,<br />
level of preparation beforehand and random luck also played important roles. The personal factor is<br />
most often overlooked in natural science, maybe because it is subjective in nature and therefore<br />
does not fit well into the methods usually adopted in natural science. The ultimate consequence of<br />
this finding is that good quality of modelling results requires both use of good scientifically based<br />
methodologies and adoption of sound practices by competent professionals. This consequence was<br />
not derived in [6] but is central for recent work on quality assurance guidelines in the modelling<br />
process ([13]).<br />
Most of the model codes that participated in the intercomparison study were state-of-the-art hydrological<br />
model codes such as Sacramento (Burnash, 1995), HBV (Bergström, 1995) and MIKE 11<br />
(NAMS11) with comprehensive experience in operational flood forecasting. These codes are still<br />
among the most commonly used today. The updating techniques tested in [8] are also still the basic<br />
techniques used operationally today, although more sophisticated developments and improvements<br />
have taken place, e.g. a combination of the Kalman filtering and the error prediction procedure (Madsen<br />
and Skotner, 2005).<br />
4. Key Issues in Catchment Scale Hydrological Modelling<br />
4.1 Scaling<br />
This section provides a discussion of catchment heterogeneity and upscaling in relation to catchment<br />
modelling based partly on the publications in the present thesis (most importantly [7] and [10]) and<br />
partly on other previous work such as Refsgaard (1981), the foundation of [1] and [2], and Refsgaard<br />
and Butts (1999) that was heavily inspired by the EU research project behind [10] and [11].<br />
Hydrological modelling is being carried out at spatial scales ranging from pore scale to global scale and<br />
a variety of scaling theories has been developed, see e.g. Blöschl and Sivapalan (1995) and Beven<br />
(1995). Many of the scaling theories consider different spatial scales for single processes. For catchment<br />
modelling it is necessary to include several processes and their linkages.<br />
4.1.1 Catchment heterogeneity<br />
Catchment properties exhibit spatial variability. For almost all properties this heterogeneity is very large<br />
and dominates the behaviour of the catchment. Scaling is basically a question of how to handle heterogeneity<br />
at different spatial scales. Different model types do this in fundamentally different ways. Let us illustrate<br />
this by two examples.<br />
As the first example, let us consider an idealised description of flow through the root zone (Fig. 28). If a<br />
soil column, initially dry, is supplied with a certain amount of water it will retain water, until it is filled to a<br />
certain level, the field capacity θ’_F, whereupon all the supplied water will pass through. This is illustrated<br />
in Fig. 28 A,B,C, where also the frequency and the distribution of θ_F are shown. If we then consider a<br />
catchment with a spatial variability in soil physical properties, the frequency and the distribution of the<br />
field capacity are illustrated in Fig. 28 D and E respectively. If the root zone of this catchment, initially<br />
dry, is being supplied with water, not all of the area will contribute to throughflow at the same time, as θ_F<br />
varies in the catchment. When, for instance, the rainfall has supplied the water amount θ’_F,m, it is seen<br />
from Fig. 28 E that field capacity has been reached in one half of the catchment, thus contributing to<br />
throughflow, while the other half of the catchment still retains the rain in its root zone.<br />
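The idealised catchment described above can be mimicked by a set of independent soil columns with different field<br />
capacities. The sketch below is only an illustration of the principle; the four field capacity values and the<br />
function name are hypothetical.<br />

```python
import numpy as np


def contributing_fraction(supplied_water, field_capacities):
    """Fraction of the catchment where field capacity has been reached."""
    fc = np.asarray(field_capacities, dtype=float)
    return float(np.mean(supplied_water >= fc))


# Hypothetical field capacities (mm) of four equally sized parts of a catchment.
field_capacities = [70.0, 90.0, 110.0, 130.0]
median_capacity = float(np.median(field_capacities))  # 100.0 mm

fractions = [
    contributing_fraction(w, field_capacities)
    for w in (60.0, median_capacity, 140.0)
]
print(fractions)  # [0.0, 0.5, 1.0]
```

When the supplied water equals the median field capacity, half of the columns contribute to throughflow, whereas a<br />
single homogeneous column would switch abruptly from no contribution to full contribution.<br />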
In a lumped model, such as NAM, such spatial variability is taken into account by using semi-empirical<br />
relations such as the dashed line in Fig. 28 F, where θ’_1 and θ’_2 typically have to be estimated from calibration.<br />
The difference between θ’_1 and θ’_2 can be seen as a measure of the heterogeneity of the catchment,<br />
or of the catchment input that is also assumed homogeneously distributed in a lumped approach.<br />
This way of accounting for the spatial variability in the process equations can be considered the heart of<br />
lumped models and also explains why the process equations in lumped models are fundamentally different<br />
from point scale physical process equations.<br />
In a distributed model the spatial variability is taken into account by dividing the catchment into several<br />
smaller elements, which are then usually treated as homogeneous units, i.e. as a column in Fig. 28.<br />
However, the spatial variability of soil physical properties comprises both variability between different soil<br />
types and variability within the same soil type as illustrated in Fig. 29. It has been demonstrated in several<br />
studies (Nielsen et al., 1973; Jensen and Refsgaard, 1991a,b,c; Djurhus et al. 1999) that the spatial<br />
variability of e.g. soil properties within one standard soil type at field scale is very high and can significantly<br />
influence the water balance and solute transport at this scale.<br />
[Fig. 28: panels A, B and C show frequency, distribution and throughflow versus supplied water for a soil column; panels D, E and F show the same for a catchment, with θ’_F,m, θ’_1 and θ’_2 marked on the axes.]<br />
Fig. 28 Idealised description of the variation of field capacity, θ_F, and its effect on flow through the root<br />
zone in a soil column and in a catchment (Refsgaard, 1981).<br />
[Fig. 29: frequency distributions of field capacity, θ_F, within one of the soil types and in the entire catchment.]<br />
Fig. 29 The principle of spatial variability of a soil physical property within a single soil type and within a<br />
catchment containing more than one soil type (Refsgaard, 1981).<br />
Let us then turn to another example focusing on the limitation of a distributed model to resolve key features<br />
of a catchment. Fig. 30 shows the topography and river network for two models that are identical<br />
except for differences in spatial discretisation. It is clearly seen that the 500 m grid provides a much<br />
better resolution of the topography and the river network, and also of other catchment characteristics as<br />
explained in [7]. In the 2000 m grid the river valley cannot be described well and many of the smaller<br />
streams have to be omitted where the distance between neighbouring streams is smaller than the<br />
model grid size. This significantly affects the stream-aquifer interaction and in this way the simulation of<br />
both river discharge and groundwater heads. As discussed in [7] a change in scale (grid size) in this<br />
way changes the model simulations. This can in some cases be compensated by adjusting parameter<br />
values. But it implies that parameter values are scale dependent and that the physical basis is reduced<br />
if the grid size is increased.<br />
Fig. 30 Topography, river network and model grid for two models with discretisations of 500 m and<br />
2000 m [7].<br />
This example focussed on river discharges and hydraulic heads at some given observational locations<br />
for which [7] argues that a 500 m resolution provides an adequate description. If we instead had focussed<br />
on other processes such as reactive transport in aquifers or in river valleys, we would have needed<br />
to account for geological and geomorphological heterogeneity of much smaller scale than 500 m. This<br />
line of argument can continue down to pore scale processes such as those described in [3]. The point is<br />
that, no matter which resolution a model has, it is always possible to find processes that require a<br />
smaller scale in order to provide a physically based description. Consequently, the ultimate distributed<br />
physically based model where everything is described can never be achieved. This implies that any<br />
distributed model needs to provide a kind of lumped conceptual representation at its scale of operation.<br />
An excellent example of this is the traditional advection dispersion equation with its associated dispersivities,<br />
where the dispersivities show the well known scale dependence (Gelhar, 1986). The process<br />
description of oxygen transport and consumption given in [3] is another example. Although meant for<br />
inclusion as a submodel in a distributed physically based model, [3] incorporates spatial heterogeneity<br />
of processes at pore scale (mm) to a process equation assumed valid at its scale of operation (grid<br />
points with 10-40 cm distance). This process equation can therefore be considered a lumped conceptual<br />
description at this scale.<br />
4.1.2 A scaling framework<br />
In this section we only consider the case of moving from the smaller to the larger scale, which is often<br />
denoted upscaling. When moving to larger scales the spatial variability of physical parameters and variables<br />
has to be taken into account. This can in principle be done in two ways, either by aggregation or<br />
upscaling (Heuvelink and Pebesma, 1999):<br />
• Upscaling means that the process equations and the associated parameters that basically constitute<br />
the model in principle are modified or substituted when moving from the smaller scale to the<br />
larger scale.<br />
• Aggregation means that the process equations are applied at the smaller scale (where they were<br />
derived) and the large-scale results are obtained by aggregating the small-scale results at the larger<br />
scale.<br />
Hence, in order not to confuse the terminology with two different meanings of the term upscaling the<br />
term scaling will in the following be used for the case of moving from modelling at the smaller scale to<br />
modelling at the larger scale. Thus, the term upscaling is reserved for the specific approach of scaling<br />
defined above.<br />
The differences between upscaling and aggregation are illustrated in Fig. 31 and some key characteristics<br />
are summarised in Table 1. At the smaller scale, the hydrological processes can be described by<br />
smaller scale equations and associated smaller scale parameters. If the aggregation approach is<br />
adopted for large-scale modelling, then the model is operated at the smaller scale units with smaller<br />
scale equations and parameters and the model output valid for the larger scale emerges after aggregation<br />
of the results. The aggregation consists of estimating the spatial mean and in some cases also the<br />
statistical distribution of the model outputs. If the model is linear or the parameters and variables are<br />
spatially constant, computational time may be saved by averaging of model parameters and input before<br />
running the model; otherwise the model runs must be made before the aggregation step.<br />
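Why parameter averaging is only admissible in the linear case can be shown with two toy models. Both models and<br />
all parameter values in the sketch below are hypothetical; the only point is that averaging before and after the<br />
model run gives the same result for the linear model but not for the non-linear one.<br />

```python
import numpy as np


def throughflow(supplied_water, field_capacity):
    """Non-linear small-scale model: no flow until field capacity is filled."""
    return np.maximum(supplied_water - field_capacity, 0.0)


def linear_runoff(rain, runoff_coefficient):
    """Linear small-scale model: output proportional to the parameter."""
    return runoff_coefficient * rain


field_capacity = np.array([70.0, 90.0, 110.0, 130.0])  # hypothetical SS values (mm)
coefficient = np.array([0.1, 0.2, 0.3, 0.4])           # hypothetical SS values

# Aggregation: run the model at the smaller scale, then average the results.
aggregated_nonlinear = float(throughflow(100.0, field_capacity).mean())
aggregated_linear = float(linear_runoff(100.0, coefficient).mean())

# Shortcut: average the parameters first, then run the model once.
shortcut_nonlinear = float(throughflow(100.0, field_capacity.mean()))
shortcut_linear = float(linear_runoff(100.0, coefficient.mean()))

print(aggregated_linear, shortcut_linear)        # agree for the linear model
print(aggregated_nonlinear, shortcut_nonlinear)  # differ for the non-linear model
```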
Table 1. Characteristics of different scaling procedures when moving from a smaller scale (SS) to a<br />
larger scale (LS).<br />
| | Aggregation | Upscaling: SS equations used at LS | Upscaling: large-scale PDE | Upscaling: LS equations developed |<br />
| Basis of process descriptions | Smaller scale | Smaller scale | Smaller scale | Larger scale |<br />
| Computational unit | Smaller scale | Larger scale | Larger scale | Larger scale |<br />
| Parameter estimation possible from field data | Yes | No, some values need calibration | Yes | No, some values need calibration |<br />
Fig. 31 Upscaling and aggregation methods for extending hydrological processes from small-scale (SS)<br />
to large-scale (LS) models (Refsgaard and Butts, 1999).<br />
If the upscaling approach is adopted for the large-scale modelling, the smaller scale equations and parameters<br />
are in principle substituted by larger scale ones. The upscaling approach can be carried out in<br />
three different ways:<br />
• The smaller scale equations are assumed valid also at the larger scale. In this case the parameter<br />
values have to be estimated as effective parameters corresponding to the larger scale computational<br />
unit. Effective parameters are single values, similar to point scale parameters, but somehow<br />
reproduce the bulk behaviour of a heterogeneous medium. The estimation of parameter values is in<br />
such case often done by calibration, at least for a handful of the key parameters. An example of this<br />
approach is given in [5] describing an application of the SHE to a large catchment in India using<br />
spatial grid sizes of 2 km x 2 km.<br />
• The equations at the larger scale are derived in a theoretical framework from a set of deterministic<br />
partial differential equations (PDE) assumed valid at the smaller scale and assumptions on the spatial<br />
variability of key parameters and/or input data. This is often carried out in a stochastic framework<br />
where quantities such as the average value and higher order statistical moments of the desired<br />
model output variables can be assessed. An example of this approach is Jensen and Mantouglou<br />
(1992) who consider the spatial variability of soil hydraulic parameters in field scale modelling.<br />
In this case the parameter values may be assessed directly on the basis of smaller scale information.<br />
• The equations at the larger scale are developed at the larger scale using a concept, which does not<br />
explicitly consider the smaller scale equations, i.e. the formulation of laws that apply at the large<br />
scale. Examples of this approach are the conceptual rainfall-runoff models such as the NAM (Nielsen<br />
and Hansen, 1973; [6]; [8]), cf. Fig. 28 and the discussion above. The oxygen model described<br />
in [3] is also an example of this approach, although smaller scale and larger scale here refer to mm<br />
and dm scales and not to catchment scale. Because they are built on larger scale concepts, such codes are<br />
often not adequate for smaller scale applications, and their parameters can most often not be assessed directly<br />
from small scale information.<br />
4.1.3 Scaling - an example<br />
The above four scaling approaches each have their advantages and limitations and the specific approach<br />
to use in particular applications will depend on many factors such as the purpose of a given<br />
study, the dominating processes in the particular hydrological regime and the data availability. Thus, no<br />
unique approach can be claimed superior in all cases. As illustrated below, scaling procedures are in<br />
practice often based on combinations of the above approaches.<br />
The example outlines the scaling methodologies adopted under an EU research project dealing with<br />
uncertainties of assessing non-point pollution to aquifers at the European scale (Refsgaard et al., 1998;<br />
[10]). During this project two model codes were used:<br />
• SMART2 for studying leaching to groundwater of nitrate and aluminium from natural areas due to<br />
atmospheric deposition. SMART2 is a relatively simple dynamic model operating in vertical columns<br />
with annual time steps (Kros et al., 1995).<br />
• MIKE SHE/DAISY for studying groundwater contamination from agricultural areas. Both MIKE SHE<br />
(Refsgaard and Storm, 1995) and DAISY (Hansen et al., 1991) are physically-based model codes<br />
with detailed process descriptions and typically hourly time steps.<br />
The objective of the project was to assess the uncertainty in model predictions when applied at the<br />
European scale. As both codes had been developed for and previously mainly been applied at much<br />
smaller scales a scaling procedure had to be adopted. The two scaling procedures, illustrated in Fig.<br />
32, show significant differences:<br />
SMART2 operates at a 1 km grid scale. It was developed on the basis of experience with the NUCSAM<br />
code (Groenenberg et al., 1995), which is a detailed physically-based code operating at point<br />
scale. Thus, SMART2 can be considered as an upscaling of NUCSAM with new equations and parameters<br />
applicable at the 1 km scale, equivalent to the upscaling procedure of the conceptual hydrological<br />
models described above. For application to the Netherlands the SMART2 model results were aggregated to a 5<br />
km x 5 km grid by selecting the median value among the 25 grid cells of 1 km x 1 km size. The parameters<br />
were assessed by pedotransfer functions from field data without prior model calibration. The scaling<br />
procedure from point scale to national or European scales thus consists of a combination of an upscaling<br />
and an aggregation step.<br />
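The aggregation step described for SMART2, taking the median of the 25 cells of 1 km x 1 km within each 5 km x 5 km<br />
block, can be sketched as follows. The array layout, the function name and the input values are assumptions made<br />
for illustration; the sketch is not taken from the SMART2 code.<br />

```python
import numpy as np


def aggregate_median(grid_1km, block=5):
    """Median of each block x block group of cells.

    The grid dimensions must be multiples of the block size.
    """
    rows, cols = grid_1km.shape
    blocks = grid_1km.reshape(rows // block, block, cols // block, block)
    # Bring the two within-block axes together: (rows/block, cols/block, block*block).
    cells = blocks.transpose(0, 2, 1, 3).reshape(
        rows // block, cols // block, block * block
    )
    return np.median(cells, axis=2)


# Hypothetical model output on a 5 km x 10 km area with 1 km cells.
grid_1km = np.arange(50, dtype=float).reshape(5, 10)
grid_5km = aggregate_median(grid_1km)
print(grid_5km)  # [[22. 27.]]
```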
MIKE SHE/DAISY, on the other hand, is in this case run with equations and parameter values in each<br />
model grid point representing field scale conditions. The field scale is characterised by ‘effective’ soil<br />
and vegetation parameters, but assuming only one soil type and one cropping pattern. The smallest<br />
horizontal discretisation in the model is the grid scale (1-5 km) that is larger than the field scale. This<br />
implies that all the variations between categories of soil type and crop type within the area of each grid<br />
cannot be resolved and described at the grid level. Input data, whose variations are not included in the<br />
grid scale representation, are distributed randomly at the catchment scale so that their statistical distributions<br />
are preserved at that scale. The results from the grid scale modelling are then aggregated to<br />
catchment scale (10-50 km) and the statistical properties of model output and field data are then compared<br />
at catchment scale (Hansen et al., 1999; [10]). Thus the scaling procedure from point scale to<br />
catchment scale is again a combination of an upscaling step and an aggregation step. In contrast to the<br />
NUCSAM-SMART2 case the upscaling step here is simply the (important) assumption that the point<br />
scale equations are valid at field scale. The aggregation step highlights a key issue from the concept of<br />
Representative Elementary Area, REA (Wood et al., 1988), namely that variability can be explicitly represented<br />
only at scales larger than the model grid size.<br />
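The random distribution of input data that preserves catchment scale statistics can be sketched as below. The soil<br />
categories, their area fractions, the number of cells and the function name are hypothetical, and the sketch does<br />
not reproduce the actual procedure of Hansen et al. (1999).<br />

```python
import numpy as np


def distribute_randomly(categories, fractions, n_cells, seed=0):
    """Assign one category per grid cell with random locations.

    The catchment-wide share of each category matches `fractions`
    (up to rounding), while the location of each category is random.
    """
    counts = np.rint(np.asarray(fractions) * n_cells).astype(int)
    # Give any rounding remainder to the most frequent category.
    counts[np.argmax(counts)] += n_cells - counts.sum()
    pool = np.repeat(categories, counts)
    rng = np.random.default_rng(seed)
    return rng.permutation(pool)


soil = distribute_randomly(["sand", "loam", "clay"], [0.5, 0.3, 0.2], n_cells=20)
shares = {name: int(np.sum(soil == name)) for name in ("sand", "loam", "clay")}
print(shares)  # {'sand': 10, 'loam': 6, 'clay': 4}
```

Only the counts per category are controlled; which cell receives which soil type changes with the random seed, which<br />
is why results can only be interpreted as statistical distributions at catchment scale.<br />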
Validation tests against field data suggested that the two different scaling procedures basically could be<br />
assumed valid for their respective cases, although important limitations were also identified. An important<br />
question regarding the differences between the two upscaling methods is why it apparently was<br />
possible to make the large upscaling step from the smaller scale NUCSAM to the larger scale SMART2<br />
code, while a similar step was not judged possible for the MIKE SHE/DAISY code. The answer may be<br />
that the nitrogen leaching in agricultural fields is a highly non-linear and dynamic process that depends<br />
on cropping pattern and agricultural management practice, which cannot be lumped to a larger scale<br />
description, while the geochemical processes below natural lands, where no management practice is<br />
interfering, can more easily be represented by long term average simulations focussing on the gradual<br />
reduction of the chemical buffer capacities due to the acids in the atmospheric deposition.<br />
An inherent limitation of the scaling methodologies illustrated in this example is that they do not preserve<br />
the georeferenced location of simulated concentrations, but only their statistical distribution over<br />
the catchment area (e.g. Fig. 25). Therefore, comparisons with field data make no sense on a well by<br />
well or subcatchment by subcatchment basis, and no information on the actual location of the simulated<br />
‘hot spots’ within the catchment is provided. If a more detailed spatial resolution of the model predictions<br />
is required from a management point of view, then the same scaling method has to be carried<br />
out at a finer scale with all the statistical input data being supplied on a subcatchment basis. This is<br />
in principle straightforward, but in reality it may often be limited by data availability.<br />
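Comparing distributions rather than locations can be illustrated with a simple distance between two empirical<br />
cumulative distributions. The Kolmogorov-Smirnov type distance used below is a stand-in chosen for this sketch and<br />
not the comparison made in [10]; all concentration values are hypothetical.<br />

```python
import numpy as np


def ks_distance(simulated, observed):
    """Maximum vertical distance between two empirical cumulative distributions."""
    sim = np.sort(np.asarray(simulated, dtype=float))
    obs = np.sort(np.asarray(observed, dtype=float))
    values = np.concatenate([sim, obs])
    cdf_sim = np.searchsorted(sim, values, side="right") / sim.size
    cdf_obs = np.searchsorted(obs, values, side="right") / obs.size
    return float(np.max(np.abs(cdf_sim - cdf_obs)))


simulated = [10.0, 20.0, 30.0, 40.0]         # hypothetical concentrations per cell
observed_shuffled = [40.0, 10.0, 30.0, 20.0]  # same values at other locations
observed_shifted = [50.0, 60.0, 70.0, 80.0]   # different distribution

same = ks_distance(simulated, observed_shuffled)
different = ks_distance(simulated, observed_shifted)
print(same, different)  # 0.0 1.0
```

The distance is zero when the same values merely occur at different locations, which mirrors the limitation noted<br />
above: agreement in distribution says nothing about where the simulated ‘hot spots’ are located.<br />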
4.1.4 Discussion – post evaluation<br />
The issue of scaling represents both a major scientific challenge and a practical problem in water resources<br />
management. Scaling is dealt with as a key issue in two of the publications in this thesis ([7],<br />
[10]). As the studies behind the other publications operate on scales ranging from point scale ([3]) to<br />
thousands of km² ([4], [5], [9]), catchment heterogeneity and scaling are dealt with and discussed in<br />
many of the publications.<br />
Fig. 32 Scaling methodology adopted by the SMART2 and MIKE SHE/DAISY models in the UNCERSDSS project (Refsgaard and Butts, 1999).
In the beginning of my career I had the rather naive view that it might be possible to develop a universal<br />
model code and a methodology that could be used to address most problems in hydrological management.<br />
This is reflected in the dualism of statements of the MIKE SHE description in Refsgaard and<br />
Storm (1995), where it on the one hand is stated that “MIKE SHE is applicable on spatial scales ranging<br />
from a single soil profile to a large regions”, while it on the other hand is acknowledged that “there are a<br />
number of fundamental scale problems which need to be carefully considered in the model applications”.<br />
I do not believe any longer that a universally applicable code and modelling methodology is theoretically<br />
realistic, and it is certainly not feasible in practice. The main reason for this lies in the scaling problems.<br />
Because scaling is interlinked with modelling concepts, I therefore do not believe it will ever be<br />
possible to derive a universal scaling theory of practical applicability.<br />
Scaling implies taking spatial heterogeneity into account. In catchment modelling it is furthermore<br />
complicated by the need to include and link several processes, such as subsurface processes (Dagan,<br />
1986; Gelhar, 1986; Wen and Gómez-Hernández, 1996), root zone processes including land surface-atmosphere<br />
interaction (Michaud and Shuttelworth, 1997), and surface water processes including<br />
stream-aquifer interaction (Saulnier et al., 1997; [7]).<br />
Many researchers have expressed doubts whether it is feasible to use the same model process descriptions<br />
at different scales. For instance Beven (1995) states that “… the aggregation approach towards<br />
macroscale hydrological modelling, in which it is assumed that a model applicable at small<br />
scales can be applied at larger scales using ‘effective’ parameter values, is an inadequate approach to<br />
the scale problem. It is also unlikely in the future that any general scaling theory can be developed due<br />
to the dependence of hydrological systems on historical and geological perturbations.”<br />
Beven’s view can be considered a universal and fundamental statement with which it is difficult to disagree.<br />
A more pragmatic, but not necessarily conflicting, view is expressed by Grayson and Blöschl<br />
(2000): “As modellers, we are often left with little choice but to use the effective parameter approach,<br />
but we must recognise that effective parameters may have a narrow range of application and an effective<br />
parameter value that “works” for one process may not be valid for another process.” The scaling<br />
framework presented above should be seen in this context. It is not a fundamental theory but rather a<br />
collection of different methods and an emphasis on their respective assumptions and associated costs<br />
in terms of lost information. These methods or building blocks can then be used in composing specific<br />
scaling methodologies depending on the purposes of the particular modelling studies. In this respect it<br />
is crucial that the modeller is aware of the limitations of the scaling methodology chosen in a particular<br />
study.<br />
4.2 Confirmation, Verification, Calibration and Validation<br />
As illustrated in Fig. 3 the credibility of the descriptions or the agreements between reality, conceptual<br />
model, model code and model are evaluated through confirmation of the conceptual model, verification<br />
of the code, model calibration and model validation. These four terms are addressed in this section.<br />
4.2.1 Confirmation of conceptual model<br />
The conceptual model, with its selection of process descriptions, equations, etc., is the foundation for the<br />
model structure. Therefore a good conceptual model is most often a prerequisite for obtaining trustworthy<br />
model results. In groundwater modelling, establishment of the conceptual model is often considered the<br />
most important part of the entire modelling process (Middlemis, 2000). Evaluation of conceptual models is<br />
an important part in assessing uncertainty due to model structure error (Section 4.3 below and [15]).<br />
Methods for conceptual model confirmation should follow the standard procedures for confirmation of<br />
scientific theories. This implies that conceptual models should be confronted with actual field data and be<br />
subject to critical peer reviews. Furthermore, the feedback from the calibration and validation process may<br />
also serve as a means by which one or a number of alternative conceptual models may be either<br />
confirmed or falsified.<br />
As Beven (2002b) argues, we need to distinguish between our qualitative understanding (perceptual model)<br />
and the practical implementation of that understanding in our conceptual model. Because a conceptual model is<br />
defined in [12] as a combination of a perceptual model and the simplifications acceptable for a particular<br />
model study, a conceptual model becomes site-specific and even case-specific. For example, a conceptual<br />
model of a groundwater aquifer may be described as two-dimensional for a study focussing on regional<br />
groundwater heads, while it may need to include more complex three-dimensional geological structures for<br />
a study requiring detailed solute transport simulations.<br />
4.2.2 Code verification<br />
The ability of a given model code to adequately describe the theory and equations defined in the<br />
conceptual model by use of numerical algorithms is evaluated through the verification of the model code.<br />
Use of the term verification in this respect is in accordance with Oreskes et al. (1994), because<br />
mathematical equations are closed systems. The methodologies used for code verification include<br />
comparing a numerical solution with an analytical solution or with a numerical solution from other verified<br />
codes. However, some programming errors appear only under circumstances that do not routinely occur<br />
and that may not have been anticipated. Furthermore, for complex codes it is virtually impossible to verify that<br />
the code is universally accurate and error-free. Therefore, the term code verification must be qualified in<br />
terms of specified ranges of application and corresponding ranges of accuracy.<br />
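The comparison of a numerical solution with an analytical one can be illustrated by a minimal sketch. It assumes a hypothetical test problem, a linear reservoir dS/dt = -kS solved with an explicit Euler scheme; neither the problem nor the function names are taken from any particular model code. The result is reported per time step, in line with the point that verification has to be qualified by a range of application and a corresponding accuracy.<br />

```python
import math

def euler_linear_reservoir(s0, k, dt, n_steps):
    """Explicit Euler solution of dS/dt = -k*S (hypothetical test problem)."""
    s = s0
    for _ in range(n_steps):
        s += dt * (-k * s)
    return s

def analytical_linear_reservoir(s0, k, t):
    """Closed-form solution S(t) = S0 * exp(-k*t)."""
    return s0 * math.exp(-k * t)

def relative_error_at_end(k, t_end, dt):
    """Relative deviation of the numerical from the analytical solution at t_end."""
    n_steps = round(t_end / dt)
    numerical = euler_linear_reservoir(1.0, k, dt, n_steps)
    exact = analytical_linear_reservoir(1.0, k, n_steps * dt)
    return abs(numerical - exact) / exact

# Accuracy is stated for each time step in the tested range of application.
errors = {dt: relative_error_at_end(k=0.5, t_end=10.0, dt=dt)
          for dt in (1.0, 0.1, 0.01)}
for dt, err in errors.items():
    print(f"dt = {dt:5.2f}  relative error = {err:.4f}")
```

In this sketch the scheme would only be regarded as verified for the time steps where the error falls below a tolerance chosen for the study at hand; outside that range it remains unverified.<br />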
Code verification is not an activity that is carried out from scratch in every modelling study. In a particular<br />
study it has to be ascertained that the domain of applicability for which the selected model code has been<br />
verified covers the conditions specified in the actual conceptual model. If that is not the case, additional<br />
verification tests have to be conducted. Otherwise, the code must explicitly be classified as not verified for<br />
this particular study, and the subsequent simulation results have to be considered with extra caution.<br />
4.2.3 Model calibration<br />
The application of a model code to be used for setting up a site-specific model is usually associated with<br />
model calibration. The model performance during calibration depends on the quantity and quality of the<br />
available input and observation data as well as on the conceptual model. If sufficient accuracy cannot be<br />
achieved, the conceptual model, the data, or both have to be re-evaluated.<br />
Many of the publications ([1], [4], [5], [6], [7], [8], [9]) have involved model calibration. This was in all<br />
cases done manually. Today automatic calibration (inverse modelling) is state-of-the-art (Duan et al.,<br />
1994; Hill, 1998; Doherty, 2003), also as part of the calibration process for rather complex distributed<br />
physically-based models (Sonnenborg et al., 2003; Henriksen et al., 2003).<br />
A key issue related to calibration of distributed models with potentially hundreds or thousands of parameter<br />
values is a rigorous parameterisation procedure, in which the spatial pattern of the parameter<br />
values is defined and the number of free parameters adjustable through calibration is reduced as<br />
much as possible. A methodology for this is presented in [7], and this issue is further discussed in [4],<br />
[5], [10] and Andersen et al. (2001).<br />
4.2.4 Model validation<br />
Often the model performance during calibration is used as a measure of the predictive capability of a<br />
model. This is a fundamental error. Many studies (e.g. [4]; [6]; Andersen et al., 2001) have<br />
demonstrated that the model performance against independent data not used for calibration is generally<br />
poorer than the performance achieved in the calibration situation. Therefore, the credibility of a site-specific<br />
model’s capability to make predictions about reality must be evaluated against independent<br />
data. This process is denoted model validation.<br />
In designing suitable model validation tests a guiding principle should be that a model should be tested<br />
to show how well it can perform the kind of task for which it is specifically intended (Klemes, 1986).<br />
Klemes proposed the following scheme comprising four types of test corresponding to different situations<br />
with regard to whether data are available for calibration and whether the catchment conditions are<br />
stationary or the impact of some kind of intervention has to be simulated:<br />
• The split-sample test is the classical test, being applicable to cases where there are sufficient data for<br />
calibration and where the catchment conditions are stationary. The available data record is divided into<br />
two parts. A calibration is carried out on one part and then a validation on the other part. Both the<br />
calibration and validation exercises should give acceptable results.<br />
• The proxy-basin test should be applied when there are not sufficient data for a calibration of the<br />
catchment in question. If, for example, streamflow has to be predicted in an ungauged catchment Z,<br />
two gauged catchments X and Y within the region should be selected. The model should be calibrated<br />
on catchment X and validated on catchment Y and vice versa. Only if the two validation results are<br />
acceptable and similar can the model command a basic level of credibility with regard to its ability to<br />
simulate the streamflow in catchment Z adequately.<br />
• The differential split-sample test should be applied whenever a model is to be used to simulate flows,<br />
soil moisture patterns and other variables in a given gauged catchment under conditions different from<br />
those corresponding to the available data. The test may have several variants depending on the<br />
specific nature of the modelling study. If for example a simulation of the effects of a change in climate is<br />
intended, the test should have the following form. Two periods with different values of the climate<br />
variables of interest should be identified in the historical record, such as one with a high average<br />
precipitation and the other with a low average precipitation. If the model is intended to simulate<br />
streamflow for a wet climate scenario, then it should be calibrated on a dry segment of the historical<br />
record and validated on a wet segment. Similar test variants can be defined for the prediction of<br />
changes in land use, effects of groundwater abstraction and other such changes. In general, the model<br />
should demonstrate an ability to perform through the required transition regime.<br />
• The proxy-basin differential split-sample test is the most difficult test for a hydrological model, because<br />
it deals with cases where no data are available for calibration and where the model is directed to<br />
predicting non-stationary conditions. An example of a case that requires such a test is simulation of<br />
hydrological conditions for a future period with a change in climate and for a catchment, where no<br />
calibration data presently exist. The test is a combination of the two previous tests.<br />
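The first and third test types can be sketched in a few lines. The sketch below is a minimal illustration under stated assumptions: the "model" is a hypothetical one-parameter runoff coefficient model Q = c·P, the record is invented, and the Nash-Sutcliffe efficiency is used as the performance measure although Klemes' scheme does not prescribe any particular criterion.<br />

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than the mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread

def calibrate(pairs):
    """Least-squares runoff coefficient c of the toy model Q = c * P."""
    return sum(p * q for p, q in pairs) / sum(p * p for p, _ in pairs)

def score(pairs, c):
    """Performance of the toy model with coefficient c on (P, Q) pairs."""
    return nash_sutcliffe([q for _, q in pairs], [c * p for p, _ in pairs])

def calibrate_and_validate(calibration, validation):
    """Calibrate on one data set and evaluate on both."""
    c = calibrate(calibration)
    return c, score(calibration, c), score(validation, c)

# Invented (precipitation, flow) record, for illustration only.
record = [(10, 4.2), (30, 11.5), (20, 8.3), (50, 19.6),
          (40, 16.4), (15, 5.8), (35, 14.5), (25, 9.7)]

# Split-sample test: calibrate on the first half, validate on the second.
c_ss, cal_ss, val_ss = calibrate_and_validate(record[:4], record[4:])

# Differential split-sample test for a wet scenario:
# calibrate on the dry periods, validate on the wet ones.
dry = [pq for pq in record if pq[0] <= 25]
wet = [pq for pq in record if pq[0] > 25]
c_ds, cal_ds, val_ds = calibrate_and_validate(dry, wet)

print(f"split-sample:              calibration {cal_ss:.3f}, validation {val_ss:.3f}")
print(f"differential split-sample: calibration {cal_ds:.3f}, validation {val_ds:.3f}")
```

With these invented numbers the validation score is lower than the calibration score in both tests. A proxy-basin test would reuse `calibrate_and_validate` with records from two different catchments, swapping their roles.<br />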
The above test types are very general and need to be translated into specific tests in each case, depending<br />
on data availability, hydrological regime and purpose of the modelling study. Except for situations<br />
where the split-sample test is sufficient, rather limited work has been carried out so far on validation<br />
test schemes.<br />
From a theoretical point of view the procedures outlined by Klemes (1986) for the proxy-basin and the<br />
differential split-sample tests, where tests have to be carried out using data from similar catchments,<br />
are weaker than the usual split-sample test, where data from the specific catchment are available.<br />
However, no obviously better testing schemes exist.<br />
It must be realised that the validation test schemes proposed above are so demanding that many applications<br />
today would fail to meet them. Thus, for many cases where either proxy-basin or differential<br />
split-sample tests are required, suitable test data simply do not exist. This is for example the case for<br />
prediction of regional scale transport of potential contamination from underground radionuclide deposits<br />
over the next thousands of years. In such cases model validation is not possible. This does not imply<br />
that these modelling studies are not useful, only that their output should be recognised to be somewhat<br />
more uncertain than is often stated and that the term ‘validated model’ should not be used. Thus, a<br />
model’s validity will always be confined in terms of space, time, boundary conditions, types of application,<br />
etc.<br />
4.2.5 Discussion – post evaluation<br />
Relative to confirmation, verification and calibration, the main scientific contributions in my publications<br />
[1] – [15] are on the model validation issue. The motivation for this research was twofold: First of all,<br />
there were too many undocumented claims (over-selling) in the modelling community on model capabilities<br />
during the years following the development of many comprehensive model codes such as MIKE<br />
SHE. This over-selling was most obvious in practical studies conducted by consultants, but it was also<br />
common in large parts of the scientific community, e.g. Abbott et al. (1986a,b) and many others. Secondly,<br />
dominant parts of the hydrological scientific community advocated that model validation was not<br />
possible (Konikow and Bredehoeft, 1992; Beven, 1996a). This left the practising world in a vacuum<br />
without scientifically based methodologies to test and document the degree of credibility of particular<br />
model predictions. The methodologies described in [6] and [7] should be seen as pragmatic approaches<br />
to help fill this vacuum, and the discussions in [12] should be seen as an attempt to provide a scientific<br />
basis for adopting rigorous model validation schemes as part of good modelling practice.<br />
The principles and schemes proposed by Klemes have been extensively used in the last 12 of the publications<br />
([4] – [15]). Thus, the intercomparison study in [6] was based on a rigorous use of all four types<br />
of tests. Furthermore, [7] ‘translated’ Klemes’ principles, which were developed with lumped conceptual<br />
models in mind, for use in distributed modelling. After demonstrating that a distributed model that was<br />
validated for simulating catchment response often performs much more poorly for internal sites, [7] emphasised<br />
that a model should only be assumed valid with respect to the outputs that have been directly<br />
validated. This implies e.g. that multi-site validation is needed if predictions of spatial patterns are required.<br />
Furthermore, a model which is validated against catchment runoff cannot automatically be assumed<br />
valid also for simulation of erosion on a hillslope within the catchment, because smaller scale processes<br />
may dominate here; it will need specific validation against hillslope soil erosion data. In addition,<br />
systematic split-sample tests were made in [4], [5] and [9], and proxy-basin tests were conducted in [10].<br />
Finally, the validation requirements are emphasised in the publications related to quality assurance [12]<br />
and [13].<br />
[6] and [7] were not the first studies to use Klemes’ principles for validation. For example, Quinn and<br />
Beven (1993) used split-sample tests, proxy-basin tests and differential split-sample tests (wet/dry periods)<br />
to analyse TOPMODEL’s predictive capabilities for the Plynlimon catchment in Wales. The key<br />
contribution of [7] and [12] in this respect was the integration of Klemes’ principles as core elements of<br />
a protocol for good modelling practice.<br />
The principles outlined in [7] and consolidated in [12], namely that a model should never be considered universally<br />
validated but can only be conditionally validated, restricted by the availability of data and by the specific<br />
validation tests performed, are well in line with Lane and Richards (2001), who argue that “evidence<br />
of a successful prediction in observed spaces and times (conventional validation) cannot provide a sufficient<br />
basis for use of a model beyond the set of situations for which the model has been empirically<br />
tested”. The principles are also in accordance with the new coherent philosophy for modelling of the<br />
environment proposed by Beven (2002b) where he argues that it is required to be able to “define those<br />
areas of the model space where behavioural models occur”.<br />
4.3 Uncertainty Assessment<br />
This section presents a broad framework originating from Refsgaard et al. (2005) and [14] followed by a<br />
discussion on data uncertainty (including [14]), parameter uncertainty (including [11]) and model structure<br />
uncertainty (including [15]) and how they affect model output uncertainty.<br />
4.3.1 Modelling uncertainty in a water resources management context<br />
Definitions and Taxonomy<br />
Uncertainty and associated terms such as error, risk and ignorance are defined and interpreted differently<br />
by different authors (see Walker et al. (2003) for a review). The different definitions reflect, among<br />
other factors, the different scientific disciplines and philosophies of the authors involved, as well as the<br />
intended audience. In addition they vary depending on their purpose. Here I will use the terminology<br />
used in Refsgaard et al. (2005) and [14] that has emerged after discussions between social scientists<br />
and natural scientists specifically aiming at applications in model based water management (Klauer and<br />
Brown, 2003). It is based on a subjective interpretation of uncertainty in which the degree of confidence<br />
that a decision maker has about possible outcomes and/or probabilities of these outcomes is the central<br />
focus. Thus, according to this definition a person is uncertain if s/he lacks confidence about the specific<br />
outcomes of an event. Reasons for this lack of confidence might include a judgement that the information<br />
is incomplete, blurred, inaccurate, imprecise or potentially false. Similarly, a person is certain if s/he<br />
is confident about the outcome of an event. It is possible that a person feels certain but has misjudged<br />
the situation (i.e. s/he is wrong).<br />
There are many different (decision) situations, with different possibilities for characterising what we<br />
know or do not know and of what we are certain or uncertain. A first distinction is between ignorance as<br />
a lack of awareness about imperfect knowledge and uncertainty as a state of confidence about knowledge<br />
(which includes the act of ignoring). Our state of confidence may range from being certain to admitting<br />
that we know nothing (of use), and uncertainty may be expressed at a number of levels in between.<br />
Regardless of our confidence in what we know, ignorance implies that we can still be wrong (‘in<br />
error’). In this respect Brown (2004) has defined a taxonomy of imperfect knowledge illustrated in Fig.<br />
33.<br />
Ignorance: unaware of imperfect knowledge.<br />
Spectrum of confidence (a state of awareness), ranging from certainty through ‘bounded’ and ‘unbounded’ uncertainty to indeterminacy (‘cannot know’):<br />
• ‘Bounded’ uncertainty: all possible outcomes and all probabilities known; all possible outcomes and some probabilities known; all possible outcomes but no probabilities known.<br />
• ‘Unbounded’ uncertainty: some possible outcomes and probabilities known; some possible outcomes, but no probabilities known; no possible outcomes known (‘do not know’).<br />
Fig. 33 Taxonomy of imperfect knowledge resulting in different uncertainty situations (Brown, 2004)<br />
In evaluating uncertainty, it is useful to distinguish between uncertainty that can be quantified e.g. by<br />
probabilities and uncertainty that can only be qualitatively described e.g. by scenarios. If one throws a<br />
balanced die, the precise outcome is uncertain, but the ‘attractor’ of a perfect die is certain: we know<br />
precisely the probability for each of the 6 outcomes, each being 1/6. This is what we mean by ‘uncertainty<br />
in terms of probability’. However, the estimates for the probability of each outcome can also be<br />
uncertain. If a model study says: “there is a 30% probability that this area will flood two times in the next<br />
year”, there is not only ‘uncertainty in terms of probability’ but also uncertainty regarding whether the<br />
estimate of 30% is a reliable estimate.<br />
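The uncertainty of a probability estimate itself can be made concrete with a minimal sketch. It assumes, purely for illustration, that a 30% estimate came from counting events in a number of independent years, so that the estimate carries a binomial standard error; the function name and the counts are hypothetical and not taken from any of the cited studies.<br />

```python
import math

def probability_with_standard_error(n_events, n_trials):
    """Relative frequency and its approximate binomial standard error."""
    p = n_events / n_trials
    return p, math.sqrt(p * (1.0 - p) / n_trials)

# The same 30% estimate, once from 10 and once from 100 hypothetical years.
p_short, se_short = probability_with_standard_error(3, 10)
p_long, se_long = probability_with_standard_error(30, 100)

print(f"10 years:  p = {p_short:.2f} +/- {se_short:.3f}")
print(f"100 years: p = {p_long:.2f} +/- {se_long:.3f}")
```

Both cases report the same probability, but the estimate based on the short record is roughly three times less reliable under these assumptions.<br />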
Secondly, it is useful to distinguish between bounded uncertainty, where all possible outcomes have<br />
been identified and unbounded uncertainty, where the known outcomes are considered incomplete.<br />
Since quantitative probabilities require ‘all possible outcomes’ of an uncertain event and each of their<br />
individual probabilities to be known, they can only be defined for ‘bounded uncertainties’. If probabilities<br />
cannot be quantified in any undisputed way, we can often still qualify the available body of evidence for<br />
the possibility of various outcomes.<br />
The bounded uncertainty where all probabilities are deemed known (Fig. 33) is often denoted ‘statistical<br />
uncertainty’ (e.g. Walker et al., 2003). This is the case traditionally addressed in model based uncertainty<br />
assessment. It is important to note that this case constitutes one of many decision situations outlined<br />
in Fig. 33, and in other situations the main uncertainty in a decision situation cannot be characterised<br />
statistically.<br />
Sources of uncertainty<br />
Walker et al. (2003) describe uncertainty as manifesting itself at different locations in the model<br />
based water management process. These locations, or sources, may be characterised as follows:<br />
• Context, i.e. at the boundaries of the system to be modelled. The model context is typically determined<br />
at the initial stage of the study where the problem is identified and the focus of the model<br />
study selected as a confined part of the overall problem. This includes, for example, the external<br />
economic, environmental, political, social and technological circumstances that form the context of<br />
the problem.<br />
• Input uncertainty in terms of external driving forces (within or outside the control of the water manager)<br />
and system data that drive the model such as land use maps, pollution sources and climate<br />
data.<br />
• Model structure uncertainty is the conceptual uncertainty due to incomplete understanding and simplified<br />
descriptions of processes as compared to nature.<br />
• Parameter uncertainty, i.e. the uncertainties related to parameter values.<br />
• Model technical uncertainty is the uncertainty arising from computer implementation of the model,<br />
e.g. due to numerical approximations and bugs in the software.<br />
• Model output uncertainty, i.e. the total uncertainty on the model simulations taking all the above<br />
sources into account, e.g. by uncertainty propagation.<br />
Nature of uncertainty<br />
Many authors (e.g. Walker et al., 2003) categorise the nature of uncertainty into:<br />
• Epistemic uncertainty, i.e. the uncertainty due to imperfect knowledge.<br />
• Stochastic uncertainty, i.e. uncertainty due to inherent variability, e.g. climate variability.<br />
Epistemic uncertainty is reducible by more studies, e.g. research or data collection. Stochastic uncertainty<br />
is non-reducible.<br />
Often the uncertainty on a certain event includes both epistemic and stochastic uncertainty. An example<br />
is the uncertainty of the 100 year flood at a given site. This flood event can be estimated, e.g. by use of<br />
standard flood frequency analysis on the basis of existing flow data. The (epistemic) uncertainty may be<br />
reduced by improving the data analysis, by additional monitoring (longer time series) or by<br />
deepening our understanding of how the modelled system works. However, no matter how much we<br />
improve our knowledge, there will always be some (stochastic) uncertainty inherent to the natural system,<br />
related to the stochastic and chaotic nature of several natural phenomena, such as weather. Perfect<br />
knowledge of these phenomena cannot give us a deterministic prediction, but would have the form<br />
of a perfect characterisation of the natural variability; for example, a probability density function for rainfall<br />
in a month of the year.<br />
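The 100 year flood example can be sketched as follows. This is an illustration under assumptions that are not part of the thesis: annual maxima are taken to follow a Gumbel distribution fitted by the method of moments, the records are synthetic, and the epistemic (sampling) uncertainty of the estimate is approximated by a bootstrap. The Gumbel distribution itself stands for the stochastic part that remains however long the record becomes.<br />

```python
import math
import random

EULER_GAMMA = 0.5772156649

def fit_gumbel(sample):
    """Method-of-moments estimates of Gumbel location (mu) and scale (beta)."""
    n = len(sample)
    mean = sum(sample) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    beta = std * math.sqrt(6.0) / math.pi
    return mean - EULER_GAMMA * beta, beta

def gumbel_quantile(mu, beta, return_period):
    """Flood magnitude exceeded on average once per return_period years."""
    p = 1.0 - 1.0 / return_period
    return mu - beta * math.log(-math.log(p))

def bootstrap_interval(sample, return_period, rng, n_boot=500):
    """Approximate 90% interval of the estimated quantile by resampling."""
    estimates = []
    for _ in range(n_boot):
        resample = rng.choices(sample, k=len(sample))
        estimates.append(gumbel_quantile(*fit_gumbel(resample), return_period))
    estimates.sort()
    return estimates[int(0.05 * n_boot)], estimates[int(0.95 * n_boot) - 1]

def synthetic_annual_maxima(n_years, rng, mu=100.0, beta=30.0):
    """Synthetic record drawn from a Gumbel distribution (invented parameters)."""
    return [mu - beta * math.log(rng.expovariate(1.0)) for _ in range(n_years)]

rng = random.Random(1)
short_record = synthetic_annual_maxima(20, rng)
long_record = synthetic_annual_maxima(500, rng)

low_s, high_s = bootstrap_interval(short_record, 100, rng)
low_l, high_l = bootstrap_interval(long_record, 100, rng)
width_short = high_s - low_s
width_long = high_l - low_l
print(f"20 years:  100 year flood between {low_s:.0f} and {high_s:.0f}")
print(f"500 years: 100 year flood between {low_l:.0f} and {high_l:.0f}")
```

The interval narrows as the record grows, which is the reducible, epistemic part. The year-to-year variability described by the fitted distribution does not shrink with more data.<br />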
The uncertainty matrix<br />
The uncertainty matrix in Table 2 can be used as a tool to get an overview of the various sources of<br />
uncertainty in a modelling study. The matrix is modified after Walker et al. (2003) in such a way that it<br />
matches Fig. 33 and so that the taxonomy now gives ‘uncertainty type’ in descriptions that indicate in<br />
what terms uncertainty can best be described. The vertical axis identifies the source of uncertainty<br />
while the horizontal axis covers the level and nature of uncertainty. Note that the matrix is in reality<br />
three-dimensional (source, type, nature), because the categories Type and Nature are not mutually<br />
exclusive.<br />
Table 2 The uncertainty matrix (modified after Walker et al., 2003).<br />
Columns, taxonomy (types of uncertainty): Statistical uncertainty; Scenario uncertainty; Qualitative uncertainty; Recognised ignorance.<br />
Columns, nature: Epistemic uncertainty; Stochastic uncertainty.<br />
Rows, source of uncertainty:<br />
• Context (natural, technological, economic, social, political)<br />
• Inputs: system data<br />
• Inputs: driving forces<br />
• Model structure<br />
• Model technical<br />
• Parameters<br />
• Model outputs<br />
Methodologies for assessing uncertainty<br />
A list of the most common methodologies applicable for addressing different types of uncertainty has<br />
been compiled and briefly described in Refsgaard et al. (2005). Table 3 provides an overview.<br />
Table 3 Applicability of different methodologies to address different types and sources of uncertainty<br />
(modified after Refsgaard et al., 2005).<br />
Source of uncertainty | Statistical uncertainty | Scenario uncertainty | Qualitative uncertainty | Recognised ignorance<br />
Context (natural, technological, economic, social, political) | EE | EE, SC, SI | EE, EPR, NUSAP, SI, UM | EE, EPR, NUSAP, SI, UM<br />
Inputs, system data | DA, EPE, EE, MCA, SA | DA, EE, SC | DA, EE | DA, EE<br />
Inputs, driving forces | DA, EPE, EE, MCA, SA | DA, EE, SC | DA, EE, EPR | DA, EE, EPR<br />
Model structure | EE, MMS, QA | EE, MMS, SC, QA | EE, NUSAP, QA | EA, NUSAP, QA<br />
Model technical | QA | QA | QA | QA<br />
Parameters | EE, IN-PA, SA | EE, IN-PA, SA | EE | EE<br />
Model outputs | EPE, EE, IN-UN, MCA, MMS, SA | EE, IN-UN, MMS, SA | EE, NUSAP | EE, NUSAP<br />
Abbreviations of methodologies:<br />
DA Data Uncertainty<br />
EPE Error Propagation Equations<br />
EE Expert Elicitation<br />
EPR Extended Peer Review (review by stakeholders)<br />
IN-PA Inverse modelling (parameter estimation)<br />
IN-UN Inverse modelling (predictive uncertainty)<br />
MCA Monte Carlo Analysis<br />
MMS Multiple Model Simulation<br />
NUSAP NUSAP<br />
QA Quality Assurance<br />
SC Scenario Analysis<br />
SA Sensitivity Analysis<br />
SI Stakeholder Involvement<br />
UM Uncertainty Matrix<br />
4.3.2 Data uncertainty<br />
Uncertainty in data is a major source of uncertainty when assessing uncertainty of model outputs. It is<br />
also an uncertainty source that is very visible for people outside the modelling community. One of the<br />
scientific contributions of the HarmoniRiB project ([14]) is to address data uncertainty. This has been<br />
done in three steps:<br />
• A methodology has been developed for characterising uncertainty in different types of data (Brown<br />
et al., 2005).<br />
• A software tool (Data Uncertainty Engine – DUE) for supporting the assessment of data uncertainty<br />
has been developed (Brown and Heuvelink, 2005).<br />
• Reviews with results on data uncertainty reported in the literature have been compiled into a guideline<br />
report for assessing uncertainty in various types of data originating from meteorology, soil physics<br />
and geochemistry, hydrogeology, land cover, topography, discharge, surface water quality,<br />
ecology and socio-economics (Van Loon and Refsgaard, 2005).<br />
The categorisation of data types distinguishes 13 categories (Table 4) for each of which a conceptual<br />
data uncertainty model is developed. By considering measurement scale, it becomes possible to<br />
quickly limit the relevant uncertainty models for a certain variable. On a discrete measurement scale, for<br />
example, it is only relevant to consider discrete probability distribution functions, whereas continuous<br />
density functions are required for continuous numerical data. In addition, the use of space and time<br />
variability determines the need for autocorrelation functions alongside a probability density function<br />
(pdf). Each data category is associated with a range of uncertainty models, for which more specific pdfs<br />
may be developed with different simplifying assumptions (e.g. Gaussian; second-order stationarity; degree<br />
of temporal and spatial autocorrelation).<br />
Table 4 The subdivision of uncertainty categories, along the ‘axes’ of space-time variability and measurement<br />
scale (Brown et al., 2005).<br />
Space-time variability | Continuous numerical | Discrete numerical | Categorical | Narrative<br />
Constant in space and time | A1 | A2 | A3 | 4<br />
Varies in time, not in space | B1 | B2 | B3 | 4<br />
Varies in space, not in time | C1 | C2 | C3 | 4<br />
Varies in time and space | D1 | D2 | D3 | 4<br />
4.3.3 Parameter uncertainty<br />
In addition to data uncertainty, uncertainty of parameter values is the most commonly considered<br />
source of uncertainty in hydrological modelling. The scientifically soundest way of assessing parameter<br />
uncertainty is through inverse modelling (Duan et al., 1994; Hill, 1998; Doherty, 2003). These techniques<br />
have the benefit that, in addition to optimal parameter values, they also produce calibration statistics<br />
in terms of parameter and observation sensitivities, parameter correlation and parameter uncertainties.<br />
When parameter uncertainties have been assessed, they can be propagated through the model to infer<br />
model output uncertainty. A serious constraint in this respect is the interdependence between model<br />
parameters and model structure as discussed under model structure uncertainty below.<br />
[11] describes an example of how (input) data uncertainty and parameter uncertainty are propagated<br />
through a model to assess uncertainty in model simulation of nitrate concentrations in groundwater. The<br />
uncertainties of data and parameter values were assessed by expert judgement, and a Monte Carlo technique<br />
with Latin hypercube sampling was used for the uncertainty propagation. The simulated uncertainty<br />
band around the deterministic model simulation in Fig. 25 is shown in Fig. 34 based on 25 Monte<br />
Carlo realisations. The uncertainty is seen to be considerable, e.g. with the estimate of the areal fraction<br />
of the aquifer having concentrations less than 50 mg NO3/l ranging between 30% and 80%.<br />
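The propagation technique can be sketched in a few lines. The sketch is not the model or the data of [11]: the nitrate model is a hypothetical stand-in that dilutes a leaching load in the recharge, the three uniform parameter ranges are invented, and only the sampling scheme (Latin hypercube, 25 realisations) follows the description above.<br />

```python
import random

def latin_hypercube(n_samples, bounds, rng):
    """Return n_samples parameter sets. Each parameter range is split into
    n_samples equal strata and every stratum is sampled exactly once."""
    columns = []
    for low, high in bounds:
        points = [low + (high - low) * (i + rng.random()) / n_samples
                  for i in range(n_samples)]
        rng.shuffle(points)  # random pairing of strata across parameters
        columns.append(points)
    return list(zip(*columns))

def toy_nitrate_model(leaching_kg_n_per_ha, recharge_mm, reduction_fraction):
    """Hypothetical stand-in model: concentration (mg NO3/l) in recharge water."""
    # 1 kg N/ha dissolved in 1 mm of water equals 100 mg N/l;
    # 1 mg N corresponds to about 4.43 mg NO3.
    mg_n_per_l = 100.0 * leaching_kg_n_per_ha / recharge_mm
    return 4.43 * mg_n_per_l * (1.0 - reduction_fraction)

rng = random.Random(42)
# Invented uncertainty ranges: leaching (kg N/ha), recharge (mm), nitrate reduction (-).
bounds = [(30.0, 80.0), (200.0, 400.0), (0.0, 0.5)]
samples = latin_hypercube(25, bounds, rng)

concentrations = sorted(toy_nitrate_model(*s) for s in samples)
band = (concentrations[0], concentrations[-1])
print(f"simulated concentration between {band[0]:.0f} and {band[1]:.0f} mg NO3/l")
```

Compared with plain random sampling, the stratification ensures that even 25 realisations cover the full range of every parameter, which is why the technique suits models that are expensive to run.<br />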
[Figure 34: cumulative frequency (0 to 1) plotted against NO3 concentration (0 to 180 mg/l), ultimo 1993.]<br />
Fig. 34 Measured (•) and simulated (×) areal distribution of NO3 concentrations in groundwater at a<br />
point in time. Measured values are based on 35 groundwater observations. [11].<br />
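The Monte Carlo propagation with Latin hypercube sampling described above can be sketched as follows. This is not the set-up used in [11]: the two-parameter `toy_model`, the parameter names and the uniform ranges are hypothetical placeholders, chosen only to show how 25 stratified realisations are generated and pushed through a model.<br />

```python
import math
import random


def latin_hypercube(n_samples, n_params, rng):
    """Return n_samples points in [0, 1)^n_params with one point per stratum."""
    columns = []
    for _ in range(n_params):
        # One random value inside each of the n_samples equal-probability strata.
        strata = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(strata)  # randomise how strata are paired across parameters
        columns.append(strata)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


def toy_model(leaching, decay_rate):
    """Hypothetical stand-in for the catchment model: concentration in mg/l."""
    return leaching * math.exp(-decay_rate * 10.0)


def propagate(n_realisations=25, seed=1):
    """Run the toy model once per Latin hypercube realisation; return sorted outputs."""
    rng = random.Random(seed)
    outputs = []
    for u_leach, u_decay in latin_hypercube(n_realisations, 2, rng):
        leaching = 40.0 + u_leach * 80.0  # assumed uniform range, 40 to 120 mg/l
        decay = 0.01 + u_decay * 0.09     # assumed uniform range, 0.01 to 0.10 per year
        outputs.append(toy_model(leaching, decay))
    return sorted(outputs)


realisations = propagate()
# The lowest and highest realisations give a crude uncertainty band.
band = (realisations[0], realisations[-1])
```

Stratifying each parameter range guarantees that even a small number of realisations covers the whole range of every parameter, which is why the technique suits computationally expensive models.<br />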
As noted in [11], a fundamental limitation of the approach adopted there is that the errors due to incorrect<br />
model structure are neglected. As also discussed below, one approach to assessing such model structure<br />
error is through comparison of predicted and observed values. In the present case (Figs 25 and 34)<br />
the deviation between observed and simulated values is so small that this term may be neglected. This<br />
is, however, by no means proof of a correct model structure. It only shows that the particular model<br />
performs without apparent model errors for this particular application.<br />
4.3.4 Model structure uncertainty<br />
Existing approaches and new framework<br />
Any model is an abstraction, simplification and interpretation of reality. The incompleteness of a model<br />
structure and the mismatch between the real causal structure of a system and the assumed causal<br />
structure as represented in a model will therefore always result in uncertainty about model predictions.<br />
The importance of the model structure for predictions is well recognised, even for situations where predictions<br />
are made on output variables, such as discharge, for which field data are available (Franchini<br />
and Pacciani, 1992; Butts et al., 2004). The considerable challenge faced in many applications of environmental<br />
models is that predictions are required beyond the range of available observations, either in<br />
time or in space, e.g. to make extrapolations towards unobservable futures (Babendreier, 2003) or to<br />
make predictions for natural systems, such as ecosystems, that are likely to undergo structural changes<br />
(Beck, 2005). In such cases, uncertainty in model structure is recognised by many authors to be the<br />
main source of uncertainty in model predictions (Dubus et al., 2003; Neuman and Wierenga, 2003;<br />
Linkov and Burmistrov, 2003).<br />
The existing strategies for assessing uncertainty due to incomplete or inadequate model structure may<br />
be grouped into the categories shown in Fig. 35. The most important distinction is whether data exist<br />
that make it possible to draw direct inferences about the model structure uncertainty. This requires that data are<br />
available for the output variable of predictive interest and for conditions similar to those in the predictive<br />
situation. In other words, the distinction is whether the model predictions can be considered as<br />
interpolations or extrapolations relative to the calibration situation.<br />
[Figure 35 flowchart: "Availability of data for model validation test" branches into "Target data exist (interpolation)" and "No direct data (extrapolation)"; the latter is subdivided into "Intermediate data (differential split-sample case)" and "No data at all (proxy basin case)". Strategy boxes: Increase parameter uncertainty; Estimate structural term; Multiple conceptual models; Expert elicitation; Pedigree analysis.]<br />
Fig. 35 Classification of existing strategies for assessing conceptual model uncertainty [15].<br />
The two main categories are thus equivalent to different situations with respect to model validation<br />
tests. According to Klemes’ classical hierarchical test scheme (Klemes, 1986; see Section 4.2 above),<br />
the interpolation case corresponds to situations where the traditional split-sample test is suitable, while<br />
the extrapolation case corresponds to situations where no data exist for the concerned output variable<br />
(proxy-basin test) or where the basin characteristics are considered non-stationary, e.g. for predictions<br />
of effects of climate change or effects of land use change (differential split-sample test).<br />
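The traditional split-sample test referred to above can be sketched as follows: the record is divided into two periods, the model is calibrated on one and validated on the other, and the roles are then swapped. The one-parameter runoff-coefficient model, the function names and the data are assumptions for illustration; the Nash-Sutcliffe efficiency is used here merely as one common performance measure.<br />

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 equals the observed mean."""
    mean_obs = sum(observed) / len(observed)
    error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - error / variance


def calibrate_runoff_coefficient(rain, flow):
    """Least-squares estimate of c in the hypothetical model flow = c * rain."""
    return sum(r * q for r, q in zip(rain, flow)) / sum(r * r for r in rain)


def split_sample_test(rain, flow):
    """Calibrate on one half of the record, validate on the other, then swap."""
    half = len(rain) // 2
    first, second = slice(0, half), slice(half, None)
    results = []
    for cal, val in ((first, second), (second, first)):
        c = calibrate_runoff_coefficient(rain[cal], flow[cal])
        simulated = [c * r for r in rain[val]]
        results.append((c, nash_sutcliffe(flow[val], simulated)))
    return results


# Hypothetical record in which flow is exactly half of the rainfall.
rain = [10.0, 0.0, 5.0, 20.0, 15.0, 2.0, 8.0, 30.0]
flow = [0.5 * r for r in rain]
results = split_sample_test(rain, flow)
```

In the differential split-sample and proxy-basin tests the structure is the same, but the validation data are chosen to differ from the calibration data in conditions or in location.<br />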
The strategies used in ‘interpolation’, i.e. for situations that are similar to the calibration situation with<br />
respect to variables of interest and conditions of the natural system, have the advantage that they can<br />
be based directly on field data (e.g. Radwan et al., 2004; van Griensven and Meixner, 2004; and Vrugt<br />
et al., 2005). A fundamental weakness is that field data are themselves uncertain. Nevertheless, in<br />
many cases, they can be expected to provide relatively accurate estimates of, at least, the total predictive<br />
uncertainty for the specific measured variable and for the same conditions as those in the calibration<br />
and validation situation. A more serious limitation of the strategies depending on observed data is<br />
that they are only applicable for situations where the output variables of interest are measured. While<br />
relevant field data are often available for variables such as water levels and water flows, this is usually<br />
not the case for concentrations, or when predictions are desired for scenarios involving catchment<br />
change, such as land use change or climate change. Another serious limitation stems from an assumption<br />
that the underlying system does not undergo structural changes, such as changes in ecosystem<br />
processes due to climate change.<br />
The strategy that uses multiple conceptual models benefits from an explicit analysis of the effects of<br />
alternative model structures, e.g. IPCC (2001), Harrar et al. (2003), Troldborg (2004), Poeter and<br />
Anderson (2005) and Højberg and Refsgaard (2005). The multiple conceptual model strategy makes it<br />
possible to include expert knowledge on plausible model structures. This strategy is strongly advocated<br />
by Neuman and Wierenga (2003) and Poeter and Anderson (2005). They characterise the traditional<br />
approach of relying on a single conceptual model as one in which plausible conceptual models are rejected<br />
(in this case by omission). They conclude that the bias and uncertainty that results from reliance<br />
on an inadequate conceptual model are typically much larger than those introduced through an inadequate<br />
choice of model parameter values. This view is consistent with Beven (2002b) who outlines a<br />
new philosophy for modelling of environmental systems. The basic aim of his approach is to extend<br />
traditional schemes with a more realistic account of uncertainty, rejecting the idea that a single optimal<br />
model exists for any given case. Instead, environmental models may be non-unique in their accuracy of<br />
both reproduction of observations and prediction (i.e. unidentifiable or equifinal), and subject to only a<br />
conditional confirmation, due to e.g. errors in model structure, calibration of parameters and period of<br />
data used for evaluation.<br />
A weakness of the multiple modelling strategy is the absence of quantitative information about the extent<br />
to which each model is plausible. Furthermore, it may be difficult to sample from the full range of<br />
plausible conceptual models. In this respect, the expert knowledge on which the formulations of multiple<br />
conceptual models are based is an important and unavoidable subjective element.<br />
The framework presented in [15] for assessing the predictive uncertainties of environmental models<br />
used for extrapolation includes a combination of use of multiple conceptual models and assessment by<br />
use of the pedigree approach of their credibility as well as a reflection on the extent to which the sampled<br />
models adequately represent the space of plausible models.<br />
The role of model calibration<br />
Some of the existing strategies used in ‘interpolation’ cannot differentiate how the total predictive uncertainty<br />
originates from model input, model parameter and model structure uncertainty. Other methods<br />
attempt to do so, but as discussed in [15] this is problematic. In the case of uncalibrated models, the<br />
parameter uncertainty is very difficult to assess quantitatively, and wrong estimates of model parameter<br />
uncertainty will influence the estimates of model structure uncertainty. In the case of calibrated models,<br />
estimates of model parameter uncertainty can often be derived from autocalibration routines. An inadequate<br />
model structure will, however, be compensated by biased parameter values to optimise the<br />
model fit with field data during calibration. Hence, the uncertainty due to model structure will be underestimated<br />
in this case.<br />
The importance of model calibration can be illustrated by the example described in Højberg and<br />
Refsgaard (2005). They use three different conceptual models, based on three alternative geological<br />
interpretations, for a multi-aquifer system in Denmark. Each of the models was calibrated against piezometric<br />
head data using an inverse technique. The three models provided equally good and very similar<br />
predictions of groundwater heads, including well field capture zones. However, when using the models<br />
to extrapolate beyond the calibration data to predictions of flow pathways and travel times the three<br />
models differed dramatically. When assessing the uncertainty contributed by the model parameter values,<br />
the overlap of uncertainty ranges between the three models significantly decreased when moving<br />
from groundwater heads to capture zones and travel times. They conclude that the larger the degree of<br />
extrapolation, the more the underlying conceptual model dominates over the parameter uncertainty and<br />
the effect of calibration.<br />
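One simple way to express the shrinking overlap of uncertainty ranges between conceptual models is to compare the interval common to all models with the envelope spanned by them. The sketch below is an assumption for illustration, not the measure used by Højberg and Refsgaard (2005), and the numerical ranges are invented.<br />

```python
def relative_overlap(ranges):
    """Length of the interval shared by all (low, high) ranges, divided by the
    length of the envelope spanned by them; 0.0 when the ranges do not all overlap."""
    common_low = max(low for low, _ in ranges)
    common_high = min(high for _, high in ranges)
    envelope_low = min(low for low, _ in ranges)
    envelope_high = max(high for _, high in ranges)
    common = max(0.0, common_high - common_low)
    return common / (envelope_high - envelope_low)


# Hypothetical parameter-uncertainty ranges from three conceptual models.
head_ranges = [(10.0, 12.0), (10.5, 12.5), (11.0, 13.0)]           # heads (m)
travel_time_ranges = [(5.0, 10.0), (20.0, 30.0), (40.0, 60.0)]     # travel times (years)

overlap_heads = relative_overlap(head_ranges)
overlap_travel_times = relative_overlap(travel_time_ranges)
```

A value near 1 means the conceptual models are practically indistinguishable for that prediction, while a value of 0 means the choice of conceptual model dominates over parameter uncertainty.<br />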
This diminishing effect of calibration as the prediction situation is extrapolated further and further away<br />
from the calibration base resembles the conclusion on the effects of updating relative to the underlying<br />
process model, when forecast lead times are increased in real-time forecasting (Fig. 27, Section 3.3).<br />
Here the effect of updating is reduced and the forecast error therefore increases as the forecast lead<br />
time (= degree of extrapolation) increases.<br />
4.3.5 Discussion – post evaluation<br />
Uncertainty is a key, crosscutting issue that I consider a useful platform or catalyst for establishing<br />
a common understanding in hydrological modelling and water resources management. By this I mean<br />
a common understanding of natural science based modelling issues such as scaling and<br />
validation, a common understanding between people from the modelling and the monitoring communities, and a broader<br />
dialogue between modellers and stakeholders on issues such as when a model is accurate and credible<br />
enough for its purpose of application (see Subsection 4.4.4 below).<br />
In the publications on developing the Suså model ([1], [2]) and the oxygen module ([3]) no explicit consideration<br />
is given to the goodness of the model structure and uncertainty assessment was not an issue<br />
at all. In the later work on catchment modelling in India ([4], [5]), where some twisting was done of the<br />
physical realism of the model due to scaling problems, it was noted that the model results might be<br />
‘right for the wrong reasons’, and the limitations of model applicability were emphasised in this respect,<br />
but no uncertainty assessments were made. In the paper describing a methodology for parameterisation,<br />
calibration and validation of distributed hydrological models ([7]) uncertainty is also neglected. In<br />
the publications [6], [8], [9] and [10] uncertainty is discussed, but as a secondary issue only.<br />
Although examples of model prediction uncertainty assessments had been reported previously from<br />
different modelling disciplines (e.g. Refsgaard et al., 1983; Beck, 1987), the first to emphasise the need<br />
to systematically perform uncertainty assessments related to catchment model predictions was probably<br />
Beven (1989). This was followed by Binley et al. (1991) who used Monte Carlo analysis to assess<br />
the predictive uncertainty for the Institute of Hydrology Distributed Model and by the introduction of the<br />
Generalised Likelihood Uncertainty Estimation (GLUE) methodology (Beven and Binley, 1992) after<br />
which uncertainty in catchment modelling was high on the agenda in the scientific community.<br />
My main scientific contributions on uncertainty are the publications [11], [14] and [15] and the link of<br />
uncertainty to principles and protocols for good modelling practice in [12] and [13]. Although reported 10<br />
years later than Binley et al. (1991), [11] was one of the first studies with uncertainty propagation<br />
through a complex, coupled distributed physically based catchment model with a focus on water quality.<br />
A key contribution of [14] and Refsgaard et al. (2005) is the broad framework for characterising uncertainty.<br />
This framework provides the link to uncertainty in the quality assurance work ([12], [13]). This<br />
broad framework is inspired by research in social science (Pahl-Wostl, 2002; van Asselt and Rotmans,<br />
2002; Dewulf et al., 2005). The main difference between the traditions in social science and natural<br />
science is that social scientists emphasise participatory processes including consultation and involvement<br />
of users, also on uncertainty aspects, right from the beginning of a study, while natural scientists<br />
often talk about users as someone to whom uncertainty results should be communicated, e.g. Pappenberger<br />
and Beven (2006).<br />
The most difficult uncertainty problem (in natural science) to handle today is model structure uncertainty,<br />
and my most important and novel contribution is probably the work done in this respect, primarily<br />
the new framework outlined in [15] but also the inclusion of options for evaluating multiple conceptual<br />
models in the HarmoniQuA modelling protocol ([13] and Fig. 5). The approach suggested in [15]<br />
of using multiple conceptual models (model structures) is not new (IPCC, 2001; Beven, 2002a; Neuman<br />
and Wierenga, 2003) and the use of pedigree analysis to qualitatively assess the credibility of something<br />
is not new either (van der Sluijs et al., 2005). The novelty lies in the combination of the two approaches<br />
that originate from different disciplines.<br />
4.4 Quality Assurance in Model based Water Management<br />
4.4.1 Background<br />
During the last decade many problems have emerged in river basin modelling projects, including poor<br />
quality of modelling, unrealistic expectations, and lack of credibility of modelling results. Some of the<br />
reasons for this lack of quality can be attributed ([13]; Scholten et al., 2007) to:<br />
• Ambiguous terminology and a lack of understanding between key-players (modellers, clients, reviewers,<br />
stakeholders and concerned members of the public)<br />
• Bad practice (careless handling of input data, inadequate model set-up, insufficient calibration/validation<br />
and model use outside of its scope)<br />
• Lack of data or poor quality of available data<br />
• Insufficient knowledge on the processes<br />
• Poor communication between modellers and end-users on the possibilities and limitations of the<br />
modelling project and overselling of model capabilities<br />
• Confusion on how to use model results in decision making<br />
• Lack of documentation and clarity on the modelling process, leading to results that are difficult to<br />
audit or reproduce<br />
• Insufficient consideration of economic, institutional and political issues and a lack of integrated<br />
modelling.<br />
In the water resources management community many different guidelines on good modelling practice<br />
have been developed, see [13] for a review. One of the most comprehensive modelling guidelines, if not the<br />
most comprehensive, has been developed in The Netherlands (Van Waveren et al., 2000) as a result of a process<br />
involving all the main players in the Dutch water management field. The background for this was a<br />
perceived need to improve the quality of modelling (Scholten et al., 2000). Similarly, modelling guidelines<br />
for the Murray-Darling Basin in Australia were developed due to the perception among end-users<br />
that model capabilities may have been ‘over-sold’, and that there was a lack of consistency in approaches,<br />
communication and understanding among and between the modellers and the water managers,<br />
which often resulted in considerable uncertainty for decision making (Middlemis, 2000).<br />
4.4.2 The HarmoniQuA approach<br />
A software tool, MoST, with its associated knowledge base (KB), has been developed by the HarmoniQuA<br />
project ([13]; Scholten et al., 2007) to provide QA in modelling through guidance, monitoring<br />
and reporting. As defined in HarmoniQuA: “Quality Assurance (QA) is the procedural and operational<br />
framework used by an organisation managing the modelling study to build consensus among the organisations<br />
concerned in its implementation, to assure technically and scientifically adequate execution<br />
of all tasks included in the study, and to assure that all modelling-based analysis is reproducible and<br />
justifiable”. This modification of the older NRC (1990) definition includes the organisational, technical<br />
and scientific aspects, but also the need to build consensus among the organisations concerned in accordance<br />
with the discussion in Section 2.1 above.<br />
Guidelines for good modelling practice are included in the Knowledge Base (KB) of MoST. The modelling<br />
process has been decomposed into five steps, see the flowchart in Fig. 5. Each step includes several<br />
tasks. Each task has an internal structure i.e. name, definition, explanation, interrelations with other<br />
tasks, activities, activity related methods, references, sensitivity/pitfalls, task inputs and outputs.<br />
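The internal structure of a task listed above can be pictured as a simple record type. The sketch below is only an illustration: the class and field names are hypothetical and do not reproduce the actual schema of the MoST Knowledge Base, and the example task content is invented.<br />

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    """Hypothetical record mirroring the task attributes listed in the text."""
    name: str
    definition: str
    explanation: str = ""
    interrelations: List[str] = field(default_factory=list)  # names of related tasks
    activities: List[str] = field(default_factory=list)
    methods: List[str] = field(default_factory=list)          # activity related methods
    references: List[str] = field(default_factory=list)
    sensitivity_pitfalls: List[str] = field(default_factory=list)
    inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)


@dataclass
class ModellingStep:
    """One of the steps of the modelling process, holding several tasks."""
    title: str
    tasks: List[Task] = field(default_factory=list)


# Invented example content, for illustration only.
step = ModellingStep(title="Example step")
step.tasks.append(
    Task(
        name="Example task",
        definition="Agree on what the model must be able to reproduce.",
        outputs=["Agreed performance criteria"],
    )
)
```

Using `default_factory=list` gives every task its own empty lists, so attributes of one task are never shared accidentally with another.<br />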
The KB contains knowledge specific to seven domains (groundwater, precipitation-runoff, river hydrodynamics,<br />
flood forecasting, water quality, ecology and socio-economics), and forms the heart of the<br />
tool. A computer based journal is produced within MoST where the water manager and modelling team<br />
record the progress and decisions made during a model study according to the tasks in the flowchart.<br />
This record can be used when reviewing the model study to judge its quality.<br />
The most important QA principles incorporated in the KB are:<br />
• The five modelling steps conclude with a formal dialogue between the modeller and manager,<br />
where activities and results from the present step are reported, and details of plans for the next step<br />
(a revised work plan) are discussed.<br />
• External reviews are prescribed as the key mechanism of ensuring that the knowledge and experience<br />
of other independent modellers are used.<br />
• The KB provides public interactive guidelines to facilitate dialogue between modellers and the water<br />
manager, with options to include auditors (reviewers), stakeholders and the public.<br />
• There are many feedback loops, some technical involving only the modeller, and others that may<br />
require a decision before doing costly additional work.<br />
• The KB allows performance and accuracy criteria to be updated during the modelling process. In<br />
the first step the water manager’s objectives and requirements are translated into performance criteria<br />
that may include qualitative and quantitative measures. These criteria may be modified during<br />
the formal reviews of subsequent steps.<br />
• Emphasis is put on validation schemes, i.e. tests of model performance against data that have not<br />
been used for model calibration.<br />
• Uncertainties must be explicitly recognised and assessed (qualitatively and/or quantitatively)<br />
throughout the modelling process.<br />
MoST supports multi-domain studies and working in teams of different user types (water managers,<br />
modellers, auditors, stakeholders and members of the public). It contains an interactive glossary that is<br />
accessible via hyperlinked text. The key functionality of MoST is to:<br />
• Guide, to ensure a model has been properly applied. This is based on the Knowledge Base.<br />
• Monitor, to record decisions, methods and data used in the modelling work and in this way enable<br />
transparency and reproducibility of the modelling process.<br />
• Report, to provide suitable reports of what has been done for managers/clients, modellers, auditors,<br />
stakeholders and the general public.<br />
4.4.3 Organisational requirements for QA guidelines to be effective<br />
Modelling studies involve several parties with different responsibilities. The key players are modellers<br />
and water managers, but often reviewers, stakeholders and the general public are also involved. To a<br />
large extent the quality of the modelling study is determined by the expertise, attitudes and motivation<br />
of the teams involved in the modelling and QA process.<br />
QA will only be successful if all parties actively support its use. The attitude of the modellers is important.<br />
NRC (1990) characterises this as follows: “most modellers enjoy the modelling process but find<br />
less satisfaction in the process of documentation and quality assurance”. Scholten and Groot (2002)<br />
describe the main problem with the Dutch Handbook on Good Modelling Practice as “they all like it, but<br />
only a few use it”. The water manager, however, has a particular responsibility, because he/she has the<br />
power to request and pay for adequate QA in modelling studies. Therefore, QA guidelines can only be<br />
expected to be used in practice if the water manager prescribes their use. It is therefore very important<br />
that the water manager has the technical capacity to organise the QA process. Often, water managers<br />
do not have individuals available with the appropriate training to understand and use models. An external<br />
modelling expert should then be sought to help with the QA process. However, this requires that the<br />
manager is aware of the problem and the need.<br />
4.4.4 Performance criteria and uncertainty – when is a model good enough?<br />
A critical issue is how to define the performance criteria. We agree with Beven (2002b) that any conceptual<br />
model is known to be wrong, and hence any model will be falsified if we investigate it in sufficient<br />
detail and specify very high performance criteria. Clearly, if one attempts to establish a model that<br />
should simulate the truth, it will always be falsified. However, this is not very useful information.<br />
Therefore, we use conditional validation, i.e. validation restricted to the domain of applicability<br />
(numerically universal as opposed to strictly universal in Popperian terms). The question is then:<br />
what is good enough? In other words, what are the criteria, and how do we select them?<br />
A good reference for model performance is the uncertainty of the available field observations.<br />
If the model performance is within this uncertainty range we often characterise the model as<br />
good enough. However, it is usually not so simple. How wide should the confidence bands on observational<br />
uncertainties be: ranges corresponding to 65%, 95% or 99%? Do we always reject a model<br />
if it cannot perform within the observational uncertainty range? In many cases even results from less<br />
accurate models may be useful.<br />
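The comparison of model performance with observational uncertainty can be made concrete with a small check of how many simulated values fall inside a band around the observations. The sketch assumes normally distributed observation errors with a known standard deviation, so that a multiplier of 1.96 corresponds to a 95% band and 2.576 to a 99% band; the function name and the data are hypothetical.<br />

```python
def fraction_within_band(observed, simulated, sigma_obs, z=1.96):
    """Fraction of simulated values lying within observed +/- z * sigma_obs."""
    half_width = z * sigma_obs
    inside = sum(1 for o, s in zip(observed, simulated) if abs(s - o) <= half_width)
    return inside / len(observed)


# Hypothetical groundwater heads (m) with an assumed observation error of 0.1 m.
observed = [12.0, 12.4, 11.8, 12.9, 13.1]
simulated = [12.1, 12.3, 12.2, 12.8, 13.6]

share_95 = fraction_within_band(observed, simulated, sigma_obs=0.1, z=1.96)
share_99 = fraction_within_band(observed, simulated, sigma_obs=0.1, z=2.576)
```

Such a number does not by itself answer whether the model is good enough; it only makes explicit which band width was chosen, which is the decision discussed in the surrounding text.<br />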
Therefore, the decision on what is good enough must generally be taken in a socio-economic context.<br />
For instance, the accuracy requirements for a model to be used for an initial screening of alternative<br />
options for the location of a new small well field for a small water supply will be much lower than the requirements<br />
for a model that is intended to be used for the final design of a large well field for a major<br />
water supply in an area with potentially damaging effects on precious nature and other significant conflicts<br />
of interest. Thus, the accuracy criteria cannot be decided universally by modellers or researchers,<br />
but must differ from case to case depending on how much is at stake in the decision that the model<br />
predictions are to support. This implies that the performance criteria must be discussed<br />
and agreed between the manager and the modeller beforehand.<br />
Accuracy requirements and uncertainty assessments of model simulations are two sides of the same<br />
coin, seen from two different perspectives, namely those of the water manager and the modeller. As not all uncertainty<br />
can be characterised as statistical uncertainty (see Fig. 33 and Tables 2 and 3 in Subsection<br />
4.3.1), accuracy requirements must also be characterised in qualitative terms. Furthermore, the<br />
risk perception of the water manager and the stakeholders/public has to be considered. Therefore, involvement<br />
of stakeholders and the public is most often required as an integral part of this process (see<br />
also Section 2.1 and Figs. 1-2). According to the HarmoniQuA methodology, stakeholder/public involvement<br />
is crucial at the beginning of a modelling project to frame the problem, define the requirements<br />
and assess the uncertainties (Henriksen et al., submitted).<br />
This way of thinking is well in line with the principles behind some of the Water Framework Directive<br />
Guidance Documents. For example the Guidance Document on Monitoring (EC, 2003a) does not specify<br />
the levels of precision and confidence required from the monitoring programmes, but rather states<br />
that the precision and confidence level should be sufficient to enable a meaningful assessment of for<br />
instance the status of the environment and should be sufficient to achieve an acceptable risk of making<br />
the wrong decision. This obviously calls for uncertainty assessments and public participation to have a<br />
central role in the entire process, which paves the way towards making adaptive management an important<br />
part of the river basin management process (Pahl-Wostl, 2002).<br />
4.4.5 Discussion – post evaluation<br />
The ideas and concepts behind the HarmoniQuA guidelines ([12], [13]) summarised above have been<br />
inspired by previous QA guidelines. The novel contributions have been inspired both by previous<br />
research activities (including [4], [5], [6], [7], [9], [11]) and by participation in a large range of national<br />
and international consultancy projects. Without having been at this crossroads between the research<br />
world and the practical world for more than two decades this would not have been possible. I consider<br />
my most important contributions in this respect to be:<br />
• The terminology and guiding principles behind the guidelines [12] are novel in their attempt to formulate<br />
a coherent approach that on the one hand has a solid scientific and philosophical foundation<br />
and on the other hand can be useful for practitioners. In the very controversial issue of model validation,<br />
where there has been almost a deadlock between different schools with respect to whether<br />
validation is possible at all, the philosophy of conditional validation is novel.<br />
• The major novelty of the HarmoniQuA approach does not lie in its guidance on model technical<br />
issues, but in its emphasis and more elaborate focus on the dialogue between modeller, water<br />
manager, reviewer, stakeholders and the public. In addition, there are novel elements in the strong<br />
emphasis on uncertainty assessments throughout the modelling process and on model validation. Finally,<br />
the emphasis on model reviews allows subjective knowledge and experience to be brought into the<br />
QA process.<br />
Both the HarmoniQuA guidelines and other recent good modelling practice guidelines have been<br />
deeply rooted both in the scientific community and among practitioners ([13]). By comparison, ideas<br />
originating solely from the natural science community, such as the suggested Code of Practice on performing<br />
uncertainty analysis by Pappenberger and Beven (2006), are typically limited to valuable contributions<br />
on model technical issues, while they often do not consider the broader aspects of the modelling<br />
process such as the involvement of water managers and stakeholders.<br />
5 Conclusions and Perspectives for Future Work<br />
5.1 Summary of Main Scientific Contributions<br />
The contributions to scientific knowledge in the papers of the present thesis are discussed in the previous<br />
chapters. The main contributions have been in the following five areas:<br />
• New conceptual understanding and code development. The Suså model ([1], [2]) was based on a<br />
new conceptual understanding of the surface water/groundwater interaction in moraine catchments.<br />
The code and its application brought new insight regarding the effect of groundwater abstraction on<br />
streamflow in catchments with such hydrogeological characteristics.<br />
• Model validation. The adoption and adaptation of rather rigorous principles for model validation and<br />
the examples of their application both for lumped conceptual and distributed physically based models<br />
are a cornerstone of my research. This work was first published in [6] and [7] and later brought<br />
into a broader modelling framework in [12] and [13]. In particular the introduction of the term ‘conditional<br />
validation’ in [7] and the outline of its scientific philosophical basis in [12] are novel.<br />
• Scaling. The publications focussing on scaling ([7], [10]) present ideas crystallised from work with<br />
scaling problems in many modelling studies ranging from point scale to thousands of km². The later<br />
framework, outlined in Section 4.1 above, does not in any way ‘solve’ the scaling problem but contributes<br />
to clarifications on applicable methodologies with focus on their respective assumptions and<br />
limitations.<br />
• Uncertainty assessment. During the past decade a considerable part of my research work has focussed<br />
on uncertainty aspects. I consider my main contributions in this respect to be the introduction<br />
of the broader uncertainty framework integrated into the modelling framework ([13], [14]) and<br />
the work with model structure uncertainty ([15]).<br />
• Modelling protocols and guidelines for quality assurance in the modelling process. The modelling<br />
protocol in [7] and the later and more comprehensive one presented as part of the guidelines for<br />
quality assurance in the modelling framework in [13] are a formalisation of experience and practices<br />
that have gradually emerged over the years. The novel elements in [13] are the emphasis on (a) the<br />
interactive dialogue between modeller, water manager, reviewer, stakeholders and the public; (b)<br />
uncertainty assessments throughout the modelling process; (c) model validation; and (d) experience<br />
and subjective knowledge introduced through external model reviews.<br />
These main contributions to scientific knowledge would, however, not have been possible without the<br />
experience and insight gained in modelling studies ranging from point scale ([3]) to large catchments<br />
([4], [5], [8], [9], [11]).<br />
5.2 Modelling Issues for Future Research<br />
Hydrological modelling has developed significantly during the three decades I have worked in this field.<br />
I started with editing punch cards and could only run one simulation per day (overnight) using model<br />
codes that today are considered small and simple. Since then, comprehensive new knowledge has<br />
been built into model codes and into the methodologies used in the modelling process.<br />
During the process of writing this thesis, where I had to review my older publications, it was interesting<br />
to note the gradual change in research focus. In the first decade my research focused on the development of<br />
new codes. During the second decade more general methodological problem areas such as scaling<br />
and model validation were addressed. Towards the end of the third decade the emphasis is now on the<br />
broader issues such as uncertainty assessment and quality assurance frameworks for the entire modelling<br />
process, and the interaction between the modelling and the water management processes. While<br />
this no doubt is affected by personal and career developments, it also reflects a general trend. We are<br />
no longer satisfied with being able to produce beautiful simulations with sophisticated new model<br />
codes; we also want to evaluate the credibility of such simulations and to apply them in real-world water<br />
management decisions.<br />
Certainly I did not foresee this development three decades ago. Against this background it is not<br />
wise to make long-range forecasts of the key issues for future modelling research.<br />
Hence, the following list does not pretend to cover all the important research issues for<br />
modelling in the decades to come. Rather, it presents the issues which I, from the perspective<br />
of the present thesis, currently consider the most important and fundamental problems<br />
requiring more research during the coming years.<br />
• Improved representation of heterogeneity in reactive transport modelling. There will always be a<br />
need to improve our conceptual understanding of hydrological processes. It appears that, whereas<br />
we have had some success with prediction of flows and hydraulic heads, the existing paradigms in<br />
hydrological modelling are not good enough to simulate concentrations of conservative and reactive<br />
contaminants. Flows and hydraulic heads are much less dependent on heterogeneity than concentrations are,<br />
and it will be necessary to include heterogeneity much more explicitly in the modelling than<br />
has been done until now. Examples of areas where this is important include simulation of the transport and fate<br />
of contaminants in aquifers and simulation of the stream-aquifer interaction governed by processes<br />
in river valleys.<br />
• Utilisation of new data types. Whenever possible we should try to make use of new data types. New<br />
techniques for collecting satellite data on surface conditions and geophysical data on subsurface<br />
features are promising and have not been fully exploited yet. We can hope and expect that better<br />
techniques will be developed during the coming years. Thus, it is not unrealistic to expect that within some years<br />
we will have improved data providing both a much better spatial resolution of catchment/aquifer properties<br />
and on-line information on state variables. The improved spatial resolution can help us give a better<br />
representation of heterogeneities in models (see above), while on-line information provides interesting<br />
potential for improved management. In order to utilise on-line data optimally, new and improved<br />
data assimilation (updating) techniques will be required.<br />
• Model structure error. Probably the most important single issue related to uncertainty of model predictions<br />
is how to assess uncertainty caused by model structure error. It is important, because the<br />
most interesting fields of model applications deal with assessments of the effects on the ecosystem<br />
of human activities. At the same time it is fundamentally difficult, because in such situations we<br />
are using models beyond the conditions under which we can test the model performance against field<br />
data. I consider the framework based on multiple conceptual models ([15]) to be only a first<br />
step in this respect.<br />
• Uncertainty and credibility of modelling in relation to water resources management. Uncertainty<br />
assessments of model predictions are crucial for a sound use of models in water resources management<br />
in practice. Model predictions without uncertainty assessments correspond to presenting only<br />
a (minor) part of the available information. Uncertainty in relation to water resources management<br />
in practice is not confined to statistical uncertainty. It is also necessary to include aspects of<br />
qualitative uncertainty and ignorance. Furthermore, uncertainty must be seen in a broad socioeconomic<br />
context where stakeholder and policy views are taken into account. There are many future<br />
challenges on this multi-disciplinary road. How do we ensure that models incorporate the best<br />
available information and adequately address the issues and the priorities set by water managers<br />
and stakeholders? How should we translate objectives and requirements formulated in qualitative<br />
language by water managers and stakeholders to accuracy criteria for a modelling study? And how<br />
should we compile and present uncertainties from a modelling study in a way that is understandable<br />
by non-modellers? Some of these questions are likely to be answered within the context of new water<br />
management paradigms such as adaptive management.<br />
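The idea of using multiple conceptual models to expose model structure uncertainty can be illustrated with a minimal sketch. It is not the framework of [15]; it only assumes that several alternative conceptual models have each produced a prediction of the same variable, and it reports the spread between them. The function name `structural_spread`, the model names and the numbers are hypothetical and serve illustration only.

```python
from statistics import mean


def structural_spread(predictions):
    """Summarise predictions of one variable from alternative conceptual models.

    predictions: dict mapping a (hypothetical) model name to its predicted value.
    Returns the ensemble mean, the min-max range and the range relative to the mean.
    """
    values = list(predictions.values())
    if len(values) < 2:
        raise ValueError("need at least two alternative conceptual models")
    centre = mean(values)
    spread = max(values) - min(values)
    # Relative range is undefined for a zero mean; report infinity in that case.
    relative = spread / abs(centre) if centre != 0 else float("inf")
    return {"mean": centre, "range": spread, "relative_range": relative}


# Hypothetical predicted streamflow reductions (l/s) caused by a new abstraction,
# one value per alternative conceptual model. The numbers are invented.
example = {"model_A": 12.0, "model_B": 18.0, "model_C": 15.0}
summary = structural_spread(example)
print(summary)
```

The range across models should be read as an indication of structural uncertainty rather than a full estimate of it, since all candidate models may share the same conceptual errors, particularly when they are used beyond the conditions for which field data exist.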
6 References<br />
Abbott MB (1992) The theory of the hydrological model, or: the struggle for the soul of hydrology. In: O’Kane<br />
JP (Ed.) Advances in theoretical hydrology, Elsevier, 237-254.<br />
Abbott MB, Bathurst JC, Cunge JA, O'Connell PE, Rasmussen J (1986a) An introduction to the European<br />
Hydrological System - Systeme Hydrologique Européen "SHE", 1: History and philosophy of a physically-based<br />
distributed modelling system. Journal of Hydrology, 87, 45-59.<br />
Abbott MB, Bathurst JC, Cunge JA, O'Connell PE, Rasmussen J (1986b) An introduction to the European<br />
Hydrological System - Systeme Hydrologique Européen "SHE", 2: Structure of a physically-based distributed<br />
modelling system. Journal of Hydrology, 87, 61-77.<br />
Abrahamsen P, Hansen S (2000) Daisy: an open soil-crop-atmosphere system model. Environmental Modelling<br />
& Software, 15, 313-330.<br />
Andersen J, Refsgaard JC, Jensen KH (2001) Distributed hydrological modelling of the Senegal River Basin<br />
– model construction and validation. Journal of Hydrology, 247, 200-214.<br />
Anderson MP, Woessner WW (1992) The role of postaudit in model validation. Advances in Water Resources,<br />
15, 167-173.<br />
Babendreier JE (2003) National-scale multimedia risk assessment for hazardous waste disposal. International<br />
Workshop on Uncertainty, Sensitivity and Parameter Estimation for Multimedia Environmental<br />
Modelling held at U.S Nuclear Regulatory Commission, Rockville, Maryland, August 19-21, 2003. Proceedings,<br />
103-109.<br />
Bathurst JC (1986a) Physically-based distributed modelling of an upland catchment using the Systeme Hydrologique<br />
Européen. Journal of Hydrology, 87, 79-102.<br />
Bathurst JC (1986b) Sensitivity analysis of the Systeme Hydrologique Européen for an upland catchment.<br />
Journal of Hydrology, 87, 103-123.<br />
Beck MB (1987) Water quality modelling: a review of the analysis of uncertainty. Water Resources Research,<br />
23(8), 1393-1442.<br />
Beck MB (2005) Environmental foresight and structural change. Environmental Modelling & Software, 20,<br />
651-670.<br />
Bergström S (1976) Development and application of a conceptual runoff model for Scandinavian catchments.<br />
PhD Thesis, University of Lund, Bulletin Series A No 52.<br />
Bergström S (1992) The HBV model – its structure and applications. SMHI RH No 4. Norrköping.<br />
Bergström S (1995) The HBV model. In: Singh VP (Ed) Computer Models of Watershed Hydrology. Water<br />
Resources Publications, Highlands Ranch, Colorado, 443-476.<br />
Bergström S, Forsman A (1973) Development of a conceptual deterministic rainfall-runoff model. Nordic<br />
Hydrology, 4, 147-170.<br />
Beven K (1989) Changing ideas in hydrology – the case of physically based models. Journal of Hydrology,<br />
105, 157-172.<br />
Beven K (1995) Linking parameters across scales: Subgrid parameterization and scale dependent hydrological<br />
models. Hydrological Processes, 9, 507-525.<br />
Beven K (1996a) A discussion of distributed hydrological modelling. In: Abbott MB, Refsgaard JC (Eds):<br />
Distributed Hydrological Modelling, Kluwer Academic Publishers, 255-278.<br />
Beven K (1996b) Response to comments on ‘A discussion of distributed hydrological modelling’. In: Abbott<br />
MB, Refsgaard JC (Eds): Distributed Hydrological Modelling, Kluwer Academic Publishers, 289-295.<br />
Beven K (2001) How far can we go in distributed hydrological modelling? Hydrology and Earth System Sciences,<br />
5(1), 1-12.<br />
Beven K (2002a) Towards an alternative blueprint for a physically based digitally simulated hydrologic response<br />
modelling system. Hydrological Processes, 16(2), 189-206.<br />
Beven K (2002b) Towards a coherent philosophy for modelling the environment. Proceedings of the Royal<br />
Society of London, A, 458 (2026), 2465-2484.<br />
Beven K, Binley AM (1992) The future of distributed models: model calibration and uncertainty prediction.<br />
Hydrological Processes, 6, 279-298.<br />
Binley AM, Beven KJ, Calver A, Watts LG (1991) Changing Responses in Hydrology: Assessing the Uncertainty<br />
in Physically Based Model Predictions. Water Resources Research, 27(6), 1253-1261.<br />
Birkinshaw SJ, Ewen J (2000) Nitrogen transformation component for SHETRAN catchment nitrate transport<br />
modelling. Journal of Hydrology, 230, 1-17.<br />
Blöschl G, Sivapalan M (1995) Scale issues in hydrological modelling: A review. Hydrological Processes, 9,<br />
251-290.<br />
Brown JD (2004) Knowledge, uncertainty and physical geography: towards the development of methodologies<br />
for questioning belief. Transactions of the Institute of British Geographers 29(3), 367-381.<br />
Brown JD, Heuvelink GBM, Refsgaard JC (2005) An integrated framework for assessing and recording uncertainties<br />
about environmental data. Water Science and Technology, 52(6), 153-160.<br />
Brown JD, Heuvelink GBM (2005) Data Uncertainty Engine (DUE) User’s Manual. University of Amsterdam.<br />
http://www.harmonirib.com.<br />
Butts MB, Payne JT, Kristensen M, Madsen H (2004) An evaluation of the impact of model structure on hydrological<br />
modelling uncertainty for streamflow prediction. Journal of Hydrology, 298, 242-266.<br />
Burnash RJC (1995) The NWS river forecast system - catchment modelling. In: Singh VP (Ed): Computer<br />
Models of Watershed Hydrology, Water Resources Publications, 311-366.<br />
Christensen S (1994) Hydrological Model for the Tude Å Catchment. Nordic Hydrology, 25, 145-166.<br />
Conan C, Bouraoui F, Turpin N, de Marsily G, Bidoglio G (2003) Modelling Flow and Nitrate Fate at Catchment<br />
Scale in Brittany (France). Journal of Environmental Quality, 32, 2026-2032.<br />
Crawford NH, Linsley RK (1966) Digital simulation in hydrology, Stanford Watershed Model IV, Department<br />
of Civil Engineering, Stanford University, Technical Report 39.<br />
Currie JA (1961) Gaseous diffusion in the aeration of aggregated soils. Soil Science, 92, 40-45.<br />
Dagan G (1986) Statistical theory of groundwater flow and transport: pore to laboratory, laboratory to formation<br />
and formation to regional scale. Water Resources Research, 22(9), 120-134.<br />
De Marsily G, Combes P, Goblet P (1992) Comments on 'Ground-water models cannot be validated', by<br />
Konikow LF, Bredehoeft JD. Advances in Water Resources, 15, 367-369.<br />
Dewulf A, Craps M, Bouwen R, Pahl-Wostl C (2005) Integrated management of natural resources dealing<br />
with ambiguous issues, multiple actors and diverging frames. Water Science and Technology, 52(6),<br />
115-124.<br />
DHI (1995) MIKE 21 Short Description. Danish Hydraulic Institute, Hørsholm, Denmark.<br />
Djurhuus J, Hansen S, Schelde K, Jacobsen OH (1999) Modelling mean nitrate leaching from spatially variable<br />
fields using effective parameters. Geoderma, 87, 261-279.<br />
Doherty J (2003) Ground water model calibration using pilot points and regularization. Ground Water, 41(2),<br />
170-177.<br />
Duan Q, Sorooshian S, Gupta VK (1994) Optimal use of the SCE-UA global optimization method for calibrating<br />
watershed models. Journal of Hydrology 158, 265–284.<br />
Dubus IG, Brown CD, Beulke S (2003) Sources of uncertainty in pesticide fate modelling. The Science of<br />
the Total Environment, 317, 53-72.<br />
EC (1992) Working Group of Independent Experts on Variant C of the Gabcikovo-Nagymaros Project, Working<br />
Group Report, Commission of the European Communities, Czech and Slovak Federative Republic,<br />
Republic of Hungary, Budapest November 23, 1992.<br />
EC (1993a) Working Group of Monitoring and Water Management Experts for the Gabcikovo System of<br />
Locks - Data Report, Commission of the European Communities, Republic of Hungary, Slovak Republic,<br />
Budapest November 2, 1993.<br />
EC (1993b) Working Group of Monitoring and Water Management Experts for the Gabcikovo System of<br />
Locks - Report on Temporary Water Management Regime, Commission of the European Communities,<br />
Republic of Hungary, Slovak Republic, Bratislava, December 1, 1993.<br />
EC (2003a) Common Implementation Strategy for the Water Framework Directive (2000/60/EC). Guidance<br />
Document No. 7. Monitoring under the Water Framework Directive. Working Group 2.7. Office for the<br />
Official Publications of the European Communities, Luxembourg.<br />
EC (2003b) Common Implementation Strategy for the Water Framework Directive (2000/60/EC). Guidance<br />
Document No. 11. Planning Processes. Working Group 2.9. Office for the Official Publications of the<br />
European Communities, Luxembourg.<br />
EC (2004) Common Implementation Strategy for the Water Framework Directive (2000/60/EC) Guidance<br />
Document No 3, pressures and impacts, IMPRESS. Working Group 2.3. Office for the Official Publications<br />
of the European Communities, Luxembourg.<br />
Fleming G (1975) Computer simulation techniques in hydrology. Elsevier, New York.<br />
Franchini M, Pacciani M (1992) Comparative analysis of several conceptual rainfall-runoff models. Journal of<br />
Hydrology, 122, 161-219.<br />
Freeze RA, Harlan RL (1969) Blueprint for a physically-based digitally-simulated hydrologic response model.<br />
Journal of Hydrology, 9, 237-258.<br />
Gelhar LW (1986) Stochastic subsurface hydrology. From theory to application. Water Resources Research,<br />
22(9), 135-145.<br />
Graham DN, Butts MB (2005) Flexible integrated watershed modelling with MIKE SHE. In: Singh VP, Frevert<br />
DK (Eds) Watershed Models. CRC Press, Chapter 10.<br />
Graham LP (1999) Modelling runoff to the Baltic Sea, Ambio, 28, 328-334.<br />
Grayson RB, Moore ID, McMahon TA (1992a) Physically based hydrologic modelling, 1. A terrain-based<br />
model for investigative purposes. Water Resources Research, 28(10), 2639-2658.<br />
Grayson RB, Moore ID, McMahon TA (1992b) Physically based hydrologic modelling, 2. Is the concept realistic?<br />
Water Resources Research, 28(10), 2659-2666.<br />
Grayson R, Blöschl G (2000) Spatial Modelling of Catchment Dynamics. In: Grayson R, Blöschl G (Eds.)<br />
Spatial Patterns in Catchment Hydrology: Observations and Modelling. Cambridge University Press,<br />
UK.<br />
Groenenberg JE, Kros J, van der Salm C, de Vries W (1995) Application of the model NUCSAM to the<br />
Solling spruce site. Ecological Modelling, 83, 97-107.<br />
GWP (2000) Integrated Water Resources Management. TAC Background Papers No. 4. Global Water Partnership,<br />
Stockholm.<br />
Hansen S, Jensen HE, Nielsen NE, Svendsen H (1991) Simulation of nitrogen dynamics and biomass production<br />
in winter wheat using the Danish simulation model DAISY. Fertilizer Research, 27, 245-259.<br />
Hansen S, Thorsen M, Pebesma E, Kleeschulte S, Svendsen H (1999) Uncertainty in simulated leaching<br />
due to uncertainty in input data. A case study. Soil Use and Management, 15, 167-175.<br />
Harrar WG, Sonnenborg TO, Henriksen HJ (2003) Capture zone, travel time and solute transport predictions<br />
using inverse modelling and different geological models. Hydrogeology Journal, 11(5), 536-548.<br />
Havnø K, Madsen MN, Dørge J (1995) MIKE 11 - A Generalized River Modelling Package. In: Singh VP (Ed)<br />
Computer Models of Watershed Hydrology, Water Resources Publications, Highlands Ranch, Colorado,<br />
733-782.<br />
Henriksen HJ, Refsgaard JC, Sonnenborg TO, Gravesen P, Brun A, Refsgaard A, Jensen KH (2001) STÅBI i<br />
grundvandsmodellering (Handbook in groundwater modelling). Danmarks og Grønlands Geologiske<br />
Undersøgelse, Rapport 2001/56. (In Danish)<br />
Henriksen HJ, Troldborg L, Nyegaard P, Sonnenborg TO, Refsgaard JC, Madsen B (2003) Methodology for<br />
construction, calibration and validation of a national hydrological model for Denmark. Journal of Hydrology<br />
280, 52-71.<br />
Henriksen HJ, Refsgaard JC, Højberg AL, Ferrand N, Gijsbers P, Scholten H (submitted) Public participation<br />
in relation to quality assurance of water resources modelling (HarmoniQuA).<br />
Heuvelink GBM, Pebesma EJ (1999) Spatial aggregation and soil process modelling. Geoderma, 89, 47-65.<br />
Hill MC (1998) Methods and guidelines for effective model calibration. U.S. Geological Survey, Water-<br />
Resources Investigations Report 98-4005. Denver CO.<br />
Højberg AL, Refsgaard JC (2005) Model Uncertainty - Parameter uncertainty versus conceptual models.<br />
Water Science and Technology, 52(6), 177-186.<br />
ICJ (1997) Case Concerning Gabcikovo-Nagymaros project (Hungary/Slovakia). Summary of the Judgement of<br />
25 September 1997. International Court of Justice, The Hague.<br />
ICWE (1992) The Dublin Statement and report of the conference. International Conference on Water and the<br />
Environment: Development issues for the 21st century. 26-31 January 1992, Dublin, Ireland.<br />
IPCC (2001) Climate Change 2001: The Scientific Basis. Contribution of Working Group I to the Third Assessment<br />
Report of the Intergovernmental Panel on Climate Change [Houghton JT, Ding Y, Griggs DJ,<br />
Noguer M, van der Linden PJ, Dai X, Maskell K and Johnson CA (eds)]. Cambridge University Press,<br />
Cambridge, UK and New York, NY, USA, 881 pp.<br />
Jensen KH, Mantoglou A (1992) Application of stochastic unsaturated flow theory, numerical simulations,<br />
and comparisons to field observations. Water Resources Research, 28, 269-284.<br />
Jensen RA, Jørgensen GH (1988) Hydrologisk overfladevands/grundvands model (Hydrological surface<br />
water/groundwater model). Technical report prepared by Danish Hydraulic Institute for the County of<br />
Storstrøm and the County of Vestsjælland. (in Danish)<br />
Jensen KH, Refsgaard JC (1991a) Spatial variability of physical parameters and processes in two field soils.<br />
Part I: Water Flow and Solute Transport at Local Scale. Nordic Hydrology, 22, 275-302.<br />
Jensen KH, Refsgaard JC (1991b) Spatial variability of physical parameters and processes in two field soils.<br />
Part II: Water flow at field scale. Nordic Hydrology, 22, 303-326.<br />
Jensen KH, Refsgaard JC (1991c) Spatial variability of physical parameters and processes in two field soils.<br />
Part III: Solute Transport at Field Scale. Nordic Hydrology, 22, 327-340.<br />
Jønch-Clausen T (1979) SHE. Système Hydrologique Européen. A short description. Danish Hydraulic Institute,<br />
Hørsholm, Denmark.<br />
Jønch-Clausen T (2004) Integrated Water Resources Management (IWRM) and Water Efficiency Plans by<br />
2005. Why, What and How? Global Water Partnership, TEC Background Papers No. 10, Stockholm.<br />
Jønch-Clausen T, Refsgaard JC (1984) A Mathematical Modelling System for Flood Forecasting. Nordic<br />
Hydrology, 15, 307-318.<br />
Kaiser-Hill (2001) Model Code and Scenario Selection Report Site-Wide Water Balance Rocky Flats Environmental<br />
Technology Site. Report 01-RF-00337. Kaiser-Hill Company LLC.<br />
Klauer B, Brown JD (2003) Conceptualising imperfect knowledge in public decision making: ignorance, uncertainty,<br />
error and ‘risk situations’. Environmental Research, Engineering and Management.<br />
Klemes V (1986) Operational testing of hydrological simulation models. Hydrological Sciences Journal, 31,<br />
13-24.<br />
Knudsen J, Thomsen A, Refsgaard JC (1986) WATBAL: A semi-distributed, physically based hydrological<br />
modelling system. Nordic Hydrology, 17, 347-362.<br />
Konikow LF, Bredehoeft JD (1992) Ground-water models cannot be validated. Advances in Water Resources,<br />
15, 75-83.<br />
Kros J, Reinds GJ, de Vries W, Latour JB, Bollen M (1995) Modelling of soil acidity and nitrogen availability<br />
in natural ecosystems in response to changes in acid deposition and hydrology. Report 95, DLO Winand<br />
Staring Centre, Wageningen.<br />
Kuchment LS, Demidov VN, Naden PS, Cooper DM, Broadhurst P (1996) Rainfall-runoff modelling of the<br />
Ouse basin, North Yorkshire: an application of a physically based distributed model. Journal of Hydrology,<br />
181, 323-342.<br />
Lane SA, Richards KS (2001) The ‘Validation’ of Hydrodynamic Models: Some Critical Perspectives. In:<br />
Anderson MG, Bates PD (Eds) Model Validation: Perspectives in Hydrological Science, 413-438. John<br />
Wiley & Sons, Ltd.<br />
Linkov I, Burmistrov D (2003) Model Uncertainty and Choices Made by Modelers: Lessons Learned from the<br />
International Atomic Energy Model Intercomparisons. Risk Analysis, 23(6), 1297-1308.<br />
Lloyd JW (1980) The importance of drift deposit influences on the hydrogeology of major British aquifers.<br />
Institution of Water Engineers and Scientists, Journal, 34, 346-356.<br />
Loague KM, Freeze RA (1985) A Comparison of Rainfall-Runoff Modelling Techniques on Small Upland<br />
Catchments. Water Resources Research, 21(2), 1985.<br />
Luckner L (1978) Gekoppelte Grundwasser-Oberflächenwassermodelle (A coupled groundwater-surface<br />
water model). Wasserwirtschaft-Wassertechnik, 1978, 276-278 (In German).<br />
Madsen H, Skotner C (2005) Adaptive state updating in real-time river flow forecasting – a combined filtering<br />
and error forecasting procedure. Journal of Hydrology, 308, 302-312.<br />
Michaud J, Sorooshian S (1994) Comparison of simple versus complex distributed runoff models on a midsized<br />
semiarid watershed. Water Resources Research, 30(3), 593-605.<br />
Michaud JD, Shuttleworth WJ (1997) Executive summary of the Tucson aggregation workshop. Journal of<br />
Hydrology, 190, 176-181.<br />
Middlemis H (2000) Murray-Darling Basin Commission. Groundwater flow modelling guideline. Aquaterra<br />
Consulting Pty Ltd., South Perth. Western Australia. Project no. 125.<br />
Miles JC, Rushton KR (1983) A coupled surface water and groundwater catchment model. Journal of Hydrology,<br />
62, 159-177.<br />
Neuman SP, Wierenga PJ (2003) A comprehensive strategy of hydrogeologic modeling and uncertainty<br />
analysis for nuclear facilities and sites. University of Arizona, Report NUREG/CR-6805.<br />
Nielsen DR, Biggar JW, Erh KT (1973) Spatial variability of field measured soil water properties. Hilgardia,<br />
42, 215-259.<br />
Nielsen SA, Hansen E (1973) Numerical simulation of the rainfall-runoff process on a daily basis. Nordic<br />
Hydrology, 4, 171-190.<br />
NRC (1990) Ground Water Models: Scientific and Regulatory Applications. National Research Council, National<br />
Academy Press, Washington, D.C.<br />
Oreskes N, Shrader-Frechette K, Belitz K (1994) Verification, validation and confirmation of numerical models<br />
in the earth sciences. Science, 264, 641-646.<br />
Pahl-Wostl C (2002) Towards sustainability in the water sector – The importance of human actors and processes<br />
of social learning. Aquatic Sciences, 64, 394-411.<br />
Panday S, Huyakorn PS (2004) A fully coupled physically-based spatially-distributed model for evaluating<br />
surface/subsurface flow. Advances in Water Resources, 27, 361-382.<br />
Pappenberger F, Beven KJ (2006) Ignorance is bliss: Or seven reasons not to use uncertainty analysis. Water<br />
Resources Research 42, W05302, doi:10.1029/2005WR004820.<br />
Pascual P, Steiber N, Sunderland E (2003) Draft guidance on development, evaluation and application of<br />
regulatory environmental models. The Council for Regulatory Environmental Modeling. Office of Science<br />
Policy, Office of Research and Development. US Environmental Protection Agency, Washington<br />
D.C. 60 pp.<br />
Perkins SP, Sophocleous M (1999) Development of a Comprehensive Watershed Model Applied to Study<br />
Stream Yield under Drought Conditions. Ground Water, 37(3), 418-426.<br />
Perrin C, Michel C, Andréassian V (2001) Does a large number of parameters enhance model performance?<br />
Comparative assessment of common catchment model structures on 429 catchments. Journal of Hydrology,<br />
242, 275-301.<br />
Poeter E, Anderson D (2005) Multiple Ranking and Inference in Ground Water Modeling. Ground Water,<br />
43(4), 597-605.<br />
Popper KR (1959) The logic of scientific discovery. Hutchinson & Co, London.<br />
Prickett TA, Lonnquist CG (1971) Selected digital computer techniques for groundwater resource evaluation.<br />
Illinois State Water Survey, Bulletin 55.<br />
Querner EP (1997) Description and application of the combined surface and groundwater flow model<br />
MOGROW. Journal of Hydrology, 192, 158-188.<br />
Quinn PF, Beven KJ (1993) Spatial and temporal predictions of soil moisture dynamics, runoff, variable<br />
source areas and evapotranspiration for Plynlimon, Mid-Wales. Hydrological Processes, 7, 425-448.<br />
Radwan M, Willems P, Berlamont J (2004) Sensitivity and uncertainty analysis for river quality modelling.<br />
Journal of Hydroinformatics, 6, 83-99.<br />
Reed S, Koren V, Smith M, Zhang Z, Moreda F, Seo D-J (2004) Overall distributed model intercomparison<br />
project results. Journal of Hydrology, 298, 27-60.<br />
Refsgaard JC (1981) The surface water component of an integrated hydrological model. Danish Committee for<br />
Hydrology. Suså Report No. H12.<br />
Refsgaard JC (1996) Terminology, modelling protocol and classification of hydrological model codes. In:<br />
Abbott MB, Refsgaard JC (Eds): Distributed Hydrological Modelling, Kluwer Academic Publishers, 17-39.<br />
Refsgaard JC, Stang O (1981) An integrated groundwater/surface water hydrological model. Danish Committee<br />
for Hydrology. Suså Report No. H13.<br />
Refsgaard JC, Rosbjerg D, Markussen LM (1983) Application of Kalman filter to real-time operation and to<br />
uncertainty analyses in hydrological modelling. IAHS Publication No 147, 273-282.<br />
Refsgaard JC, Storm B (1995) MIKE SHE. In: Singh VP (Ed) Computer Models of Watershed Hydrology.<br />
Water Resources Publications, Highlands Ranch, Colorado, 809-846.<br />
Refsgaard JC, Storm B, Abbott MB (1996) Comments on ‘A discussion of distributed hydrological modelling’.<br />
In: Abbott MB, Refsgaard JC (Eds): Distributed Hydrological Modelling, Kluwer Academic Publishers,<br />
279-287.<br />
Refsgaard JC, Ramaekers D, Heuvelink GBM, Schreurs V, Kros H, Rosén L, Hansen S (1998) Assessment<br />
of ‘cumulative’ uncertainty in spatial decision support systems: Application to examine the contamination<br />
of groundwater from diffuse sources (UNCERSDSS). Presented at the European Climate Science<br />
Conference, Vienna, 19-23 October 1998.<br />
Refsgaard JC, Butts MB (1999) Determination of grid scale parameters in catchment modelling by upscaling<br />
local scale parameters. Key note presentation. Proceedings of the EurAgEng International Workshop<br />
on Modelling of transport processes in soils at various scales in time and space, 24-26 November<br />
1999, Leuven, Belgium.<br />
Refsgaard JC, van der Sluijs JP, Højberg AL, Vanrolleghem P (2005) Harmoni-CA Guidance Uncertainty<br />
Analysis. Guidance 1. 46 pp. www.harmoni-ca.info.<br />
Rykiel ER (1996) Testing ecological models: the meaning of validation. Ecological Modelling, 90, 229-244.<br />
Saulnier GM, Beven K, Obled C (1997) Digital elevation analysis for distributed hydrological modelling: Reducing<br />
scale dependence in effective hydraulic conductivity values. Water Resources Research,<br />
33(9), 2097-2101.<br />
Scholten H, Van Waveren RH, Groot S, Van Geer FC, Wösten JHM, Koeze RD, Noort JJ (2000) Good Modelling<br />
Practice in water management. Paper presented on Hydroinformatics 2000, Cedar Rapids, IA,<br />
USA.<br />
Scholten H, Groot S (2002) Dutch guidelines. In: Refsgaard, JC (Ed) State-of-the-Art Report on Quality Assurance<br />
in modelling related to river basin management. Chapter 12, Geological Survey of Denmark<br />
and Greenland, Copenhagen. www.harmoniqua.org.<br />
Scholten H, Kassahun A, Refsgaard JC, Kargas T, Gavardinas C, Beulens AJM (2007) A methodology to<br />
support multidisciplinary model-based water management. Environmental Modelling & Software, 22,<br />
743-759.<br />
Singh VP (Ed) (1995) Computer Models of Watershed Hydrology. Water Resources Publications, Highlands<br />
Ranch, Colorado.<br />
Smith KA (1980) A model of the extent of anaerobic zones in aggregated soils and its potential application to<br />
estimates of denitrification. Journal of Soil Science, 31, 263-277.<br />
Sonnenborg TO, Christensen BSB, Nyegaard P, Henriksen HJ, Refsgaard JC (2003) Transient modelling of<br />
regional groundwater flow using parameter estimates from steady-state automatic calibration. Journal<br />
of Hydrology, 273, 188-204.<br />
Stang O (1981) A regional groundwater model for the Suså area. Danish Committee for Hydrology. Suså Report<br />
No. H9.<br />
Styczen M, Storm B (1993a) Modelling of N-movements on catchment scale – a tool for analysis and decision-making.<br />
1. Model description. Fertilizer Research, 36, 1-6.<br />
Styczen M, Storm B (1993b) Modelling of N-movements on catchment scale – a tool for analysis and decision-making.<br />
2. A case study. Fertilizer Research, 36, 7-17.<br />
Tampa Bay Water (2001) Scientific review of integrated hydrologic model ISGW/CNTB121. Prepared by West<br />
Consultants, Gartner Lee Ltd and AQUA TERRA Consultants for Tampa Bay Water, Florida.<br />
Thomas RG (1973) Groundwater models. FAO, Irrigation and Drainage Paper 21, Rome.<br />
Troch PA, Mancini M, Paniconi C, Wood EF (1993) Evaluation of a Distributed Catchment Scale Water<br />
Balance Model. Water Resources Research, 29(6), 1805-1817.<br />
Troeh FR, Jabro JD, Kirkham D (1982) Gaseous diffusion equations for porous materials. Geoderma, 27,<br />
239-253.<br />
Troldborg L (2004) The influence of conceptual geological models on the simulation of flow and transport in<br />
Quaternary aquifer systems. PhD Thesis. Geological Survey of Denmark and Greenland, Report<br />
2004/107.<br />
Van Asselt MBA, Rotmans J (2002) Uncertainty in Integrated Assessment Modelling. From Positivism to<br />
Pluralism. Climatic Change, 54: 75-105.<br />
Van der Sluijs JP, Craye M, Funtowicz SO, Kloprogge P, Ravetz J, Risbey JS (2005) Combining Quantitative<br />
and Qualitative Measures of Uncertainty in Model based Foresight Studies: the NUSAP System. Risk<br />
Analysis, 25(2), 481-492.<br />
Van Griensven A, Meixner T (2004) Dealing with unidentifiable sources of uncertainty within environmental<br />
models. In: Pahl C, Schmidt S, Jakeman T. (Eds.), iEMSs 2004 International Congress: "Complexity<br />
and Integrated Resources Management". International Environmental Modelling and Software Society,<br />
Osnabrück, Germany, June 2004.<br />
Van Loon E, Refsgaard JC (eds.) (2005) Guidelines for assessing data uncertainty in hydrological studies.<br />
HarmoniRiB Report. Geological Survey of Denmark and Greenland. http://www.harmonirib.com.<br />
Van Waveren RH, Groot S, Scholten H, Van Geer FC, Wösten JHM, Koeze RD, Noort JJ (2000) Good Modelling<br />
Practice Handbook, STOWA Report 99-05, Utrecht, RWS-RIZA, Lelystad, The Netherlands,<br />
http://waterland.net/riza/aquest/<br />
Vrugt J, Diks CGH, Gupta HV (2005) Improved treatment of uncertainty in hydrologic modelling: Combining<br />
the strengths of global optimization and data assimilation. Water Resources Research, 41, W01017,<br />
doi:10.1029/2004WR003059.<br />
Walker WE, Harremoës P, Rotmans J, Van der Sluijs JP, Van Asselt MBA, Janssen P, Krayer von Krauss<br />
MP (2003) Defining Uncertainty: A Conceptual Basis for Uncertainty Management in Model-Based Decision<br />
Support, Integrated Assessment, 4(1), 5-17.<br />
Wardlaw RB (1978) The development of a deterministic integrated surface/subsurface hydrological response<br />
model. PhD Thesis, University of Strathclyde, Glasgow.<br />
Wardlaw RB, Wyness A, Rippon P (1994) Integrated catchment modelling. Surveys in Geophysics, 15, 311-330.<br />
Weeks JB (1974) Simulated effects of oil-shale development on the hydrology of the Piceance basin, Colorado.<br />
US Geological Survey, Professional Paper 908.<br />
Wen X-H, Gómez-Hernández JJ (1996) Upscaling hydraulic conductivities in heterogeneous media: An overview.<br />
Journal of Hydrology, 183, ix-xxxii.<br />
WMO (1975) Intercomparison of conceptual models used in operational hydrological forecasting. WMO Operational<br />
Hydrology Report No 7, WMO No 429, World Meteorological Organisation, Geneva.<br />
WMO (1988) Intercomparison of models for snowmelt runoff. WMO Operational Hydrology Report No 23,<br />
WMO No 646, World Meteorological Organisation, Geneva.<br />
WMO (1992) Simulated real-time intercomparison of hydrological models. WMO Operational Hydrology Report<br />
No 38, WMO No 779, World Meteorological Organisation, Geneva.<br />
Wolf J, Beusen AHW, Groenendijk P, Kroon T, Rötter R, van Zeijts H (2003) The integrated modelling system<br />
STONE for calculating nutrient emissions from agriculture in the Netherlands. Environmental Modelling &<br />
Software, 18, 597-617.<br />
Wood EF, Sivapalan M, Beven KJ, Band L (1988) Effects of spatial variability and scale with implications to<br />
hydrologic modelling. Journal of Hydrology, 102, 29-47.<br />
WSSTP (2005) Water safe strong and sustainable. A European vision for water supply and sanitation in<br />
2030. Water Supply and Sanitation Technology Platform. October 2005. http://www.wsstp.org<br />
WWAP (2003) Water for People, Water for Life. UN World Water Development Report. Prepared as a collaborative<br />
effort of 23 UN agencies and convention secretariats co-ordinated by the World Water Assessment<br />
Programme. UNESCO, Paris. http://www.unesco.org/water/wwap/index.shtml<br />
[1]<br />
Refsgaard JC, Hansen E (1982) A Distributed Groundwater/Surface Water<br />
Model for the Suså Catchment. Part 1: Model Description.<br />
Nordic Hydrology, 13, 299-310.<br />
Reprinted with permission from Nordic Hydrology
[2]<br />
Refsgaard JC, Hansen E (1982) A Distributed Groundwater/Surface Water<br />
Model for the Suså Catchment. Part 2: Simulations of Streamflow Depletions<br />
Due to Groundwater Abstraction.<br />
Nordic Hydrology, 13, 311-322.<br />
Reprinted with permission from Nordic Hydrology
[3]<br />
Refsgaard JC, Christensen TH, Ammentorp HC (1991) A model for oxygen<br />
transport and consumption in the unsaturated zone.<br />
Journal of Hydrology, 129, 349-369.<br />
Reprinted from Journal of Hydrology with permission from Elsevier
[4]<br />
Refsgaard JC, Seth SM, Bathurst JC, Erlich M, Storm B, Jørgensen GH,<br />
Chandra S (1992) Application of the SHE to catchments in India - Part 1:<br />
General results.<br />
Journal of Hydrology, 140, 1-23.<br />
Reprinted from Journal of Hydrology with permission from Elsevier
[5]<br />
Jain SK, Storm B, Bathurst JC, Refsgaard JC, Singh RD (1992) Application of<br />
the SHE to catchments in India - Part 2: Field experiments and simulation<br />
studies with the SHE on the Kolar subcatchment of the Narmada River.<br />
Journal of Hydrology, 140, 25-47.<br />
Reprinted from Journal of Hydrology with permission from Elsevier
[6]<br />
Refsgaard JC, Knudsen J (1996) Operational validation and intercomparison<br />
of different types of hydrological models.<br />
Water Resources Research, 32 (7), 2189-2202.<br />
Reproduced by permission of American Geophysical Union
WATER RESOURCES RESEARCH, VOL. 32, NO. 7, PAGES 2189–2202, JULY 1996<br />
Operational validation and intercomparison of different types<br />
of hydrological models<br />
Jens Christian Refsgaard and Jesper Knudsen<br />
Danish Hydraulic Institute, Hørsholm, Denmark<br />
Abstract. A theoretical framework for model validation, based on the methodology<br />
originally proposed by Klemes [1985, 1986], is presented. It includes a hierarchical<br />
validation testing scheme for model application to runoff prediction in gauged and<br />
ungauged catchments subject to stationary and nonstationary climate conditions. A case<br />
study on validation and intercomparison of three different models on three catchments in<br />
Zimbabwe is described. The three models represent a lumped conceptual modeling system<br />
(NAM), a distributed physically based system (MIKE SHE), and an intermediate<br />
approach (WATBAL). It is concluded that all models performed equally well when at<br />
least 1 year’s data were available for calibration, while the distributed models performed<br />
marginally better for cases where no calibration was allowed.<br />
Copyright 1996 by the American Geophysical Union.<br />
Paper number 96WR00896.<br />
0043-1397/96/96WR-00896$09.00<br />
Introduction<br />
In recent years water resources studies have become increasingly<br />
concerned with aspects of water resources for which data<br />
are not directly available. Examples include studies of the<br />
development potential of ungauged areas, environmental impacts<br />
of land use changes related to agricultural and forestry<br />
practices, conjunctive use of groundwater and surface water,<br />
and climate impact studies concerned with the effects on water<br />
resources of an anticipated climate change.<br />
In these and other types of studies, hydrological simulation<br />
models are often used to provide the missing information as a<br />
basis for decisions regarding the development and management<br />
of water and land resources.<br />
Traditionally, hydrological simulation modeling systems are<br />
classified in three main groups, namely, (1) empirical black<br />
box, (2) lumped conceptual, and (3) distributed physically<br />
based systems. The great majority of the modeling systems<br />
used in practice today belong to the simple types (1) or (2)<br />
and require a modest number of parameters (approximately<br />
5–10) to be calibrated for their operation. Despite their simplicity,<br />
many models have proven quite successful in representing<br />
an already measured hydrograph.<br />
A severe drawback of these traditional modeling systems,<br />
however, is that their parameters are not directly related to the<br />
physical conditions of the catchment. Accordingly, it may be<br />
expected that their applicability is limited to areas where runoff<br />
has been measured for some years and where no significant<br />
change in catchment conditions has occurred.<br />
To provide a more appropriate tool for the type of studies<br />
mentioned above, considerable efforts within hydrological research<br />
have been directed toward development of distributed<br />
physically based catchment models. Such models use parameters<br />
which are related directly to the physical characteristics of<br />
the catchment (topography, soil, vegetation, and geology) and<br />
operate within a distributed framework to account for the<br />
spatial variability of both physical characteristics and meteorological<br />
conditions. These models aim at describing the hydrological<br />
processes and their interaction as and where they<br />
occur in the catchment and therefore offer the prospect of<br />
remedying the shortcomings of the traditional rainfall runoff<br />
models.<br />
Although there appears to be a certain degree of consensus<br />
at the theoretical level regarding the potential of the distributed<br />
physically based types of models, there are widely divergent<br />
points of view as to whether they offer a significant improvement<br />
in actual performance when compared to the well-proven<br />
lumped conceptual model type. Beven [1989, p. 161]<br />
argues from theoretical considerations of scale problems that<br />
“the current generation of distributed physically based models<br />
are lumped conceptual models,” and, further, that all current<br />
physically based models “are not well suited to applications to<br />
real catchments.” Grayson et al. [1992] support this view and<br />
claim that physically based models have been oversold by their<br />
developers. Other authors, for example, Smith et al. [1994],<br />
argue that this criticism is “overly pessimistic.”<br />
An evaluation of the capabilities of hydrological models<br />
when applied in the absence of site calibration data and limited<br />
validation data to predict the effects of major land use changes<br />
was made by the Task Committee on Quantifying Land-Use<br />
Change Effects [U.S. Committee, 1985], which reported a great<br />
belief among committee members in the capabilities of 28<br />
surface water hydrological modeling systems, most of which<br />
can be classified as lumped conceptual models. In view of the<br />
limited number of model comparison studies conducted and<br />
the less-than-encouraging results often obtained, this confidence<br />
is remarkable. According to the U.S. Committee [1985, p.<br />
1], “the reasons for this confidence were explored and appear<br />
to be based upon personal experience, possibly tempered by<br />
belief in the model originators.”<br />
Owing to the complexity of the problems involved, further<br />
theoretical evaluation is not likely to provide a definite conclusion<br />
regarding the capability and limitation of distributed,<br />
physically based modeling systems. For establishing a basis to<br />
better advance the discussion, relevant model validations appear<br />
to be a more fruitful approach, where the models concerned<br />
simply are subjected to a range of practical modeling<br />
tests to validate their capability for undertaking particular<br />
tasks.<br />
REFSGAARD AND KNUDSEN: INTERCOMPARISON OF HYDROLOGICAL MODELS<br />
In this respect, Klemes [1986, p. 17] has developed a hierarchical<br />
scheme for model testing, which is based on the philosophy<br />
that “a hydrological simulation model must demonstrate,<br />
before it is used operationally, how well it can perform<br />
the kind of task for which it is intended.” It may appear needless<br />
to advocate such a basic and evident requirement. Unfortunately,<br />
it is well justified in view of the current practice in<br />
hydrological model testing.<br />
The present paper is based on results from a research<br />
project conducted at the Danish Hydraulic Institute (DHI)<br />
[1993a]. The project had two major objectives. The first objective<br />
was to identify a rigorous framework for the testing of<br />
model capabilities for different types of tasks. The second<br />
objective was to use this theoretical framework and conduct an<br />
intercomparison study involving application of three modeling<br />
systems of different complexity to a number of tasks ranging<br />
from traditional simulation of stationary, gauged catchments to<br />
simulation of ungauged catchments and of catchments with<br />
nonstationary climate conditions. Data from three catchments<br />
in Zimbabwe were used for the tests. The research project was<br />
a contribution to project D.5, “Testing the transferability of<br />
hydrological simulation models,” forming part of the World<br />
Climate Programme—Water [World Meteorological Organization<br />
(WMO), 1985].<br />
Some of the results of DHI [1993a] were presented by<br />
Refsgaard [1996] with a focus on modeling the land surface<br />
processes and the coupling between hydrological and atmospheric<br />
models within the global change context. Thus Refsgaard<br />
[1996] presents some of the results from two of the<br />
Zimbabwean catchments to illustrate data requirements and<br />
form the basis for conclusions regarding which type of hydrological<br />
model is required for climate change modeling. The<br />
present paper, on the other hand, emphasizes the modeling<br />
methodology and contains a summary of all the test results<br />
from all three Zimbabwean catchments. It furthermore<br />
provides a general discussion of these results with references to<br />
similar studies reported in literature.<br />
Theoretical Framework for Model Validation<br />
Terminology<br />
No unique and generally accepted terminology is presently<br />
used in the hydrological community with regard to issues related<br />
to model validation. The framework used in the present<br />
paper is basically in line with the terminology defined by<br />
Schlesinger et al. [1979], Tsang [1991], and Flavelle [1992] and<br />
comprises the following key definitions.<br />
A modeling system (i.e., code) is a generalized software<br />
package, which can be used for different catchments without<br />
modifying the source code. Examples of modeling systems are<br />
MIKE SHE, SACRAMENTO, and MODFLOW.<br />
A model is a site-specific application of a modeling system,<br />
including given input data and specific parameter values. An<br />
example of a model is a MIKE SHE–based model for the<br />
Ngezi catchment (cf. the case study below).<br />
A modeling system or a code can be “verified.” A code<br />
verification involves comparison of the numerical solution generated<br />
by the code with one or more analytical solutions or<br />
with other numerical solutions. Verification ensures that the<br />
computer program accurately solves the equations that constitute<br />
the mathematical model.<br />
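The verification step described above can be illustrated with a minimal sketch. It is not taken from any of the codes named in this paper: the single linear reservoir equation dS/dt = -k S, the explicit Euler scheme, and all function names are assumptions chosen only because that equation has a known analytical solution to compare against.<br />

```python
import math


def simulate_linear_reservoir(s0, k, dt, n_steps):
    """Explicit Euler solution of dS/dt = -k * S (hypothetical test equation)."""
    storage = s0
    series = [storage]
    for _ in range(n_steps):
        storage = storage + dt * (-k * storage)
        series.append(storage)
    return series


def analytical_linear_reservoir(s0, k, t):
    """Exact solution S(t) = S0 * exp(-k * t) of the same equation."""
    return s0 * math.exp(-k * t)


def max_abs_error(s0, k, dt, n_steps):
    """Largest absolute difference between numerical and analytical solutions."""
    numerical = simulate_linear_reservoir(s0, k, dt, n_steps)
    return max(
        abs(value - analytical_linear_reservoir(s0, k, i * dt))
        for i, value in enumerate(numerical)
    )


if __name__ == "__main__":
    # Same simulated time span (50 time units), two different step sizes.
    coarse = max_abs_error(s0=100.0, k=0.1, dt=1.0, n_steps=50)
    fine = max_abs_error(s0=100.0, k=0.1, dt=0.1, n_steps=500)
    print(f"max error, dt=1.0: {coarse:.3f}; dt=0.1: {fine:.3f}")
```

In a check of this kind, the error is expected to shrink as the time step is refined; an error that does not shrink would point to a fault in the code rather than in the site-specific model.<br />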
Model validation is here defined as the process of demonstrating<br />
that a given site-specific model is capable of making<br />
accurate predictions for periods outside a calibration period. A<br />
model is said to be validated if its accuracy and predictive<br />
capability in the validation period have been proven to lie<br />
within acceptable limits or errors. It is important to note that<br />
the term model validation refers to a site-specific validation of<br />
a model. This must not be confused with a more general<br />
validation of a generalized modeling system which, in principle,<br />
will never be possible.<br />
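As a hedged illustration of what "within acceptable limits" could mean in practice, the sketch below checks simulated against observed discharge for a validation period. The two criteria (a Nash-Sutcliffe type efficiency and a relative volume error) and the threshold values are assumptions made for this example only; the definition above does not prescribe any particular criterion or limit.<br />

```python
def nash_sutcliffe(observed, simulated):
    """Efficiency 1 - sum((o - s)^2) / sum((o - mean(o))^2); 1.0 is a perfect fit."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def relative_volume_error(observed, simulated):
    """Relative difference between simulated and observed total volume."""
    return (sum(simulated) - sum(observed)) / sum(observed)


def is_validated(observed, simulated, min_efficiency=0.7, max_volume_error=0.1):
    """True if both hypothetical acceptance limits are met in the validation period."""
    efficiency_ok = nash_sutcliffe(observed, simulated) >= min_efficiency
    volume_ok = abs(relative_volume_error(observed, simulated)) <= max_volume_error
    return efficiency_ok and volume_ok
```

The limits must be fixed before the validation run and apply only to the site-specific model being tested, in line with the definition above.<br />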
Testing Scheme for Validation of Hydrological Models<br />
The hierarchical testing scheme proposed by Klemes [1985,<br />
1986] appears suitable for testing the capability of a model to<br />
predict the hydrological effect of climate change, land use<br />
change, and other nonstationary conditions. Klemes distinguished<br />
between simulations conducted for the same station<br />
(catchment) used for calibration and simulations conducted<br />
for ungauged catchments. He also distinguished between cases<br />
where climate, land use, and other catchment characteristics<br />
remain unchanged (are stationary) and cases where they are<br />
not. This leads to the definitions of four basic categories of<br />
typical modeling tests.<br />
1. The split-sample test (SS) involves calibration of a<br />
model based on 3–5 years of data and validation on another<br />
period of a similar length.<br />
2. The differential split-sample test (DSS) involves calibration<br />
of a model based on data before catchment change occurs,<br />
adjustment of model parameters to characterize the change,<br />
and validation on the subsequent period.<br />
3. In the proxy-basin test (PB) no direct calibration is allowed,<br />
but advantage may be taken of information from other<br />
gauged catchments. Hence validation will comprise identification<br />
of a gauged catchment deemed to be of a nature similar to<br />
that of the validation catchment; initial calibration; transfer of<br />
model, including adjustment of parameters to reflect actual<br />
conditions within validation catchment; and validation.<br />
4. With the proxy-basin differential split-sample test (PB-<br />
DSS), again no direct calibration is allowed, but information<br />
from other catchments may be used. Hence validation will<br />
comprise initial calibration on the other relevant catchment,<br />
transfer of model to validation catchment, selection of two<br />
parameter sets to represent the periods before and after the<br />
change, and subsequent validations on both periods.<br />
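The four categories follow from two yes/no questions: whether the catchment was available for calibration, and whether conditions are stationary. A minimal sketch of that lookup is given below; the function and argument names are hypothetical and serve only to make the two-by-two structure explicit.<br />

```python
# Keys are (gauged, stationary); values are the test abbreviations used above.
VALIDATION_TESTS = {
    (True, True): "SS",       # split-sample test
    (True, False): "DSS",     # differential split-sample test
    (False, True): "PB",      # proxy-basin test
    (False, False): "PB-DSS"  # proxy-basin differential split-sample test
}


def select_validation_test(gauged: bool, stationary: bool) -> str:
    """Return the test category for a given modeling task.

    gauged: True if runoff data from the catchment itself may be used for calibration.
    stationary: True if climate, land use and other characteristics are unchanged.
    """
    return VALIDATION_TESTS[(gauged, stationary)]


if __name__ == "__main__":
    # Example: predicting the effect of a land use change in an ungauged catchment.
    print(select_validation_test(gauged=False, stationary=False))
```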
Relevant Literature on Model Intercomparison<br />
Studies<br />
The testing of hydrological models through validation on<br />
independent data has for a long time been emphasized by the<br />
World Meteorological Organization (WMO). In their pioneering<br />
studies [WMO, 1975, 1986, 1992] several hydrological modeling<br />
systems of the empirical black box and the lumped conceptual<br />
types were tested on the same data from different<br />
catchments. The actual testing, however, only included the<br />
standard SS test comprising an initial calibration of a model<br />
and subsequent validation based on data from an independent<br />
period. No firm conclusions were derived regarding significant<br />
differences in performance among different model types.<br />
Franchini and Pacciani [1991] made a comparative analysis<br />
of seven different lumped conceptual models. They used an SS<br />
testing approach calibrating on a 1-month period and validating<br />
on a subsequent 3-month period. They concluded that in<br />
spite of a wide range of structural complexity all the models<br />
produced similar and equally valid results. With regard to the
question of whether the simpler or the more complex variants<br />
within this group of models are better, they concluded that<br />
significantly different models produced basically equivalent results,<br />
with calibration times being generally proportional to the<br />
complexity of their structure. On the other hand, they concluded<br />
that the model structure should not be made too simple,<br />
because it will then cause a loss of the link with the physics<br />
of the problem and of the possibility of taking advantage of<br />
prior knowledge of the geomorphological nature of the catchment.<br />
Other researchers have conducted similar intercomparison<br />
studies involving empirical black box models and lumped conceptual<br />
models [Naef, 1981; Wilcox et al., 1990] with similar<br />
conclusions.<br />
Only a few studies have included comparisons of distributed<br />
physically based models with simpler models. Loague and<br />
Freeze [1985] in a classical study compared two empirical black<br />
box modeling systems (a regression model and a unit hydrograph<br />
model) and a quasi physically based system on three<br />
small experimental catchments ranging from 10 ha to 7.2 km².<br />
The models were used on an event basis to simulate runoff<br />
peaks. The two empirical models were calibrated against runoff<br />
data and subsequently validated on independent data in an SS<br />
approach. The parameter values for the quasi physically based<br />
model were assessed directly from field data and not subject to<br />
any calibration before being validated against the same data as<br />
the two other models. Loague and Freeze [1985] found that all<br />
models performed poorly. For one catchment the quasi physically<br />
based model was subsequently applied with and without<br />
calibration of one key model parameter. Such calibration had<br />
little impact on the model performance during the validation<br />
period.<br />
In a study in the semiarid 150 km² Walnut Gulch experimental<br />
watershed Michaud and Sorooshian [1994] compared a<br />
lumped conceptual model (SCS), a distributed conceptual<br />
model (SCS with eight subcatchments, one per raingauge) and<br />
a distributed physically based model (KINEROS) for simulation<br />
of storm events. They found that with calibration, the<br />
accuracies of the two distributed models were similar. Without<br />
calibration the distributed physically based model performed<br />
better than the distributed conceptual model, and in both cases<br />
the lumped conceptual model performed poorly.<br />
Thus, as far as the test experience for distributed physically<br />
based models is concerned, both Loague and Freeze [1985] and<br />
Michaud and Sorooshian [1994] have performed tests on relatively<br />
small experimental catchments with very good data coverage.<br />
Both studies have used the models on ungauged conditions<br />
(without calibration) but in all cases under stationary<br />
climate conditions. The present paper presents results from<br />
larger catchments in Zimbabwe with ordinary data coverage<br />
and performs a sequence of rigorous tests of increasing complexity<br />
according to the hierarchical scheme outlined by Klemes<br />
[1986], involving intercomparisons between lumped conceptual<br />
and distributed physically based models.<br />
Hydrological Modeling Systems<br />
The following three modeling systems (codes) are used in<br />
the present study: a lumped conceptual rainfall-runoff modeling<br />
system (NAM), a semidistributed hydrological modeling<br />
system (WATBAL), and a distributed physically based hydrological<br />
modeling system (MIKE SHE). The NAM and MIKE<br />
SHE can be characterized as very typical of their respective<br />
classes, while the WATBAL falls in between these two standard<br />
classes. All three modeling systems are being used on a<br />
routine basis at the Danish Hydraulic Institute (DHI) in connection<br />
with consultancy and research projects.<br />
NAM<br />
NAM is a traditional hydrological modeling system of the<br />
lumped conceptual type operating by continuously accounting<br />
for the moisture contents in four mutually interrelated storages.<br />
The NAM was originally developed at the Technical<br />
University of Denmark [Nielsen and Hansen, 1973] and has<br />
been modified and extensively applied by DHI in a large number<br />
of engineering projects covering all climatic regimes of the<br />
world. Furthermore, the NAM has been transferred to more<br />
than 100 other organizations worldwide as part of DHI’s<br />
MIKE 11 generalized river modeling package. The structure of<br />
NAM is illustrated in Figure 1. The NAM has in its present<br />
version a total of 17 parameters; however, in most cases only<br />
about 10 of these are adjusted during calibration.<br />
WATBAL<br />
WATBAL was developed in the early 1980s by DHI in an<br />
attempt to enable full utilization of readily available, distributed<br />
data on land surface properties (topography, vegetation,<br />
and soil) in a physically based model that is nevertheless simple<br />
enough to allow large-scale applications within reasonable<br />
computational requirements. Here the WATBAL is briefly<br />
introduced; more detailed information has been given by<br />
Knudsen et al. [1986].<br />
WATBAL has been designed to account for the spatial and<br />
temporal variations of soil moisture. On the basis of distributed<br />
information on meteorological conditions, topography,<br />
vegetation, and soil types, the catchment area is divided into a<br />
number of hydrological response units, as illustrated in Figure<br />
2, with each unit being characterized by a different composition<br />
of the above features. These units are used to provide the<br />
spatial representation of soil moisture, while temporal variations<br />
within each unit are accounted for by means of empirical<br />
relations for the processes affecting soil moisture, using physical<br />
parameters particular to each unit.<br />
For the representation of subsurface flows a simple lumped,<br />
conceptual approach is applied, using a cascade of linear reservoirs<br />
to account for the interflow and baseflow components<br />
(Figure 3). In summary, WATBAL provides a distributed physically<br />
based description of the surface processes affecting soil<br />
moisture (interception, infiltration, evapotranspiration, and<br />
percolation), while a lumped conceptual approach is used to<br />
represent subsurface flows. WATBAL has previously been<br />
used successfully for prediction of runoff from ungauged catchments<br />
[Nielsen and Bari, 1988].<br />
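The idea of a cascade of linear reservoirs can be sketched as follows. This is a generic illustration and not the WATBAL implementation: the number of reservoirs, the time constants, the explicit time stepping, and the function name are all assumptions. Each reservoir releases water in proportion to its storage (Q = S / K), and the outflow of one reservoir is the inflow of the next.<br />

```python
def route_through_cascade(inflow, time_constants, dt=1.0):
    """Route an inflow series through linear reservoirs in series.

    inflow: inflow rates per time step (e.g. mm/day), hypothetical percolation input.
    time_constants: one time constant K per reservoir, same time unit as dt.
    Returns the outflow series of the last reservoir and the final storages.
    """
    if any(k < dt for k in time_constants):
        raise ValueError("explicit scheme needs every time constant >= dt")
    storages = [0.0] * len(time_constants)
    outflows = []
    for q_in in inflow:
        q = q_in
        for i, k in enumerate(time_constants):
            q_out = storages[i] / k           # linear reservoir: Q = S / K
            storages[i] += (q - q_out) * dt   # storage change over the time step
            q = q_out                         # outflow feeds the next reservoir
        outflows.append(q)
    return outflows, storages


if __name__ == "__main__":
    # A single pulse of input followed by dry days, routed through two reservoirs.
    pulse = [10.0] + [0.0] * 59
    flow, remaining = route_through_cascade(pulse, time_constants=[5.0, 20.0])
    print(f"peak outflow {max(flow):.3f} on day {flow.index(max(flow))}")
```

A short time constant gives a quick, interflow-like response and a long one a slow, baseflow-like recession, which is why such cascades need only a few calibrated parameters.<br />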
MIKE SHE<br />
MIKE SHE is a further development of the European Hydrological<br />
System—SHE [Abbott et al., 1986a, b]. It is a deterministic,<br />
fully distributed and physically based modeling system<br />
for describing the major flow processes of the entire land phase<br />
of the hydrological cycle. MIKE SHE solves the partial differential<br />
equations for the processes of overland and channel flow<br />
and unsaturated and saturated subsurface flow. The system is<br />
completed by a description of the processes of snow melt,<br />
interception, and evapotranspiration. The flow equations are<br />
solved numerically using finite difference methods.<br />
In the horizontal plane the catchment is discretized in a<br />
network of grid squares. The river system is assumed to run<br />
along the boundaries of these. Within each square the soil<br />
profile is represented by a number of computational nodes in<br />
the vertical direction, which above the groundwater table may<br />
become partly saturated. Lateral subsurface flow is only considered<br />
in the saturated part of the profile. Figure 4 illustrates<br />
the structure of the MIKE SHE. A description of the methodology<br />
and some experiences of model application to ordinary<br />
catchments have been given by Refsgaard et al. [1992] and<br />
Jain et al. [1992]. A more detailed description has been given<br />
by Refsgaard and Storm [1995].<br />
Figure 1. Structure of the NAM rainfall runoff modeling system [DHI, 1994].<br />
Figure 2. WATBAL representation of catchment characteristics and definition of hydrological response<br />
units [Knudsen et al., 1986].<br />
Figure 3. Principal structure of WATBAL [Knudsen et al., 1986].<br />
Figure 4. Schematic presentation of the MIKE SHE [DHI, 1993b].<br />
MIKE SHE is usually categorized as a physically based system.<br />
The characterization is, strictly speaking, correct only if it<br />
is applied on an appropriate scale. A number of scale problems<br />
arise when the MIKE SHE is used on a regional scale [Refsgaard<br />
and Storm, 1995]. In addition, if there is a considerable<br />
uncertainty attached to the basic information, and if the spatial<br />
and temporal variables (such as groundwater table elevations)<br />
cannot be validated against observations, a MIKE SHE model<br />
of that particular site cannot be considered fully physically<br />
based but will degenerate towards a detailed conceptual<br />
model. In this case the calibration procedure is usually to<br />
adjust the parameters with the largest uncertainties attached,<br />
within a reasonable range.<br />
Figure 5. Location of the three catchments in Zimbabwe.<br />
Case Study: Methodology<br />
Selected Catchments in Zimbabwe<br />
The three catchments in Zimbabwe that were selected for<br />
the model tests are Ngezi-South (1090 km²), Lundi (254 km²),<br />
and Ngezi-North (1040 km²). The locations of the catchments<br />
are shown in Figure 5.<br />
A brief data collection/field reconnaissance to Zimbabwe<br />
was arranged to obtain relevant information. Daily series of<br />
rainfall and monthly series of pan evaporation were obtained<br />
from the Department of Meteorological Services. Records of<br />
mean daily discharges as well as information on water rights<br />
were obtained from the Hydrological Branch, Ministry of Energy<br />
Water Resources and Development. Detailed information<br />
on land use was obtained through subcontracting R. Whitlow,<br />
University of Zimbabwe, to prepare land-use maps based<br />
upon 1:25,000 aerial photographs. Furthermore, 1:50,000 topographical<br />
maps were collected and digitized. Information on<br />
vegetation characteristics was obtained from Timberlake [1989]<br />
as well as from J. Timberlake and N. Nobanda, National Herbarium<br />
(personal communication, 1989); B. Campell, Department<br />
of Biological Sciences (personal communication, 1989);<br />
and G. MacLaureen, Department of Crop Science, University<br />
of Zimbabwe (personal communication, 1989). Information on<br />
soil characteristics and hydrogeology was obtained from Anderson<br />
[1989]. Finally, valuable information of various kinds was<br />
provided by R. Whitlow, Department of Geography, University<br />
of Zimbabwe (personal communication, 1989); H. Elwell,<br />
Agritex (personal communication, 1989); J. Anderson, Chemistry<br />
and Soil Research Institute, Ministry of Agriculture (personal<br />
communication, 1989); and others. A more detailed description<br />
is given in DHI [1993a].<br />
The annual catchment rainfall and runoff for the periods<br />
selected for modeling are shown in Table 1, while some of the<br />
key features for the three catchments are presented in Table 2.<br />
The rainfall and runoff figures in Table 1 show very large
interannual variations, and Table 2 shows significant differences
in the vegetation and soil characteristics from catchment to
catchment.
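The interannual variability noted above can be made concrete with a small calculation. The following is a minimal sketch that uses the Ngezi-South rows of Table 1; the runoff coefficient (runoff divided by rainfall) is an illustrative measure chosen here, not one of the criteria used in the study, and the function name is hypothetical.

```python
# Annual rainfall and runoff (mm/yr) for Ngezi-South, copied from Table 1.
NGEZI_SOUTH = {
    "1971/1972": (890, 131),
    "1972/1973": (317, 2),
    "1973/1974": (1290, 349),
    "1974/1975": (1087, 236),
    "1975/1976": (879, 90),
    "1976/1977": (872, 116),
    "1977/1978": (1131, 245),
    "1978/1979": (609, 59),
}


def runoff_coefficients(records):
    """Return {year: runoff / rainfall} for a dict of (rainfall, runoff) pairs."""
    return {year: runoff / rainfall for year, (rainfall, runoff) in records.items()}


coefficients = runoff_coefficients(NGEZI_SOUTH)
lowest = min(coefficients, key=coefficients.get)
highest = max(coefficients, key=coefficients.get)
print(f"lowest runoff coefficient:  {lowest} -> {coefficients[lowest]:.3f}")
print(f"highest runoff coefficient: {highest} -> {coefficients[highest]:.3f}")
```

The share of rainfall that becomes runoff differs by more than an order of magnitude between the driest and the wettest year, which is why the tests that follow put so much weight on nonstationary periods.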
Model Testing Scheme<br />
The model testing scheme is illustrated in Figure 6. The<br />
testing of the involved models has been undertaken in parallel<br />
and in the following sequence.<br />
1. The SS test was based on data from Ngezi-South comprising<br />
an initial calibration of the models and a subsequent<br />
validation using data for an independent period.<br />
2. The PB test involved transfer of models to the Lundi<br />
catchment and adjustment of parameters to reflect the prevailing<br />
catchment characteristics and validation without any calibration.<br />
3. The modified proxy-basin (M-PB) test was as above, but
Table 1. Annual Rainfall and Runoff Values for the Three Zimbabwean Test Catchments

Catchment     Hydrological Year   Rainfall, mm/yr   Runoff, mm/yr
Ngezi-South   1971/1972                 890              131
              1972/1973                 317                2
              1973/1974                1290              349
              1974/1975                1087              236
              1975/1976                 879               90
              1976/1977                 872              116
              1977/1978                1131              245
              1978/1979                 609               59
Lundi         1971/1972                 920               89
              1972/1973                 371                2
              1973/1974                1384              460
              1974/1975                1046              217
              1975/1976                 857               89
              1981/1982                 416               10
              1982/1983                 528                7
              1983/1984                 547                8
Ngezi-North   1977/1978                1047              156
              1978/1979                 730               64
              1981/1982                 430               12
              1982/1983                 395                1
              1983/1984                 436                4
was adjusted by allowing model calibration based on 1 year of<br />
runoff data.<br />
4. For the DSS test, model calibration was based on data<br />
from an initial calibration period, and validation was based on<br />
data from a subsequent period. The differential nature of this
test is justified by the fact that the later independent period
includes three successive years (1981/1982–1983/1984) with
markedly lower rainfall than the calibration period and hence
represents a nonstationary climate scenario.
5. The PB-DSS test involved transferring the models to the<br />
Ngezi-North catchment, adjusting the parameters to represent<br />
the catchment characteristics, and validating them by runoff<br />
simulation over a nonstationary climate period.<br />
6. The modified proxy-basin differential split-sample (M-<br />
PB-DSS) test was as above, though it allowed models to be<br />
calibrated using a short-term (1 year) record.<br />
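The six tests listed above can be summarized as a small data structure. This is only a sketch for orientation: the class and field names are hypothetical, and the catchment assigned to the DSS test (Lundi) is taken from the DSS results section rather than from the list itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidationTest:
    name: str         # abbreviation used in the text
    catchment: str    # catchment supplying the validation data
    calibration: str  # runoff data released for calibration before validation


TESTS = [
    ValidationTest("SS", "Ngezi-South", "initial calibration period"),
    ValidationTest("PB", "Lundi", "none"),
    ValidationTest("M-PB", "Lundi", "1 year"),
    ValidationTest("DSS", "Lundi", "initial calibration period"),
    ValidationTest("PB-DSS", "Ngezi-North", "none"),
    ValidationTest("M-PB-DSS", "Ngezi-North", "1 year"),
]

# Tests that treat the catchment as ungauged (no runoff data for calibration).
ungauged = [t.name for t in TESTS if t.calibration == "none"]
print(ungauged)
```

Seen this way, the scheme varies two things across the tests: the catchment and the amount of runoff data released before validation.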
Evaluation Criteria<br />
For measuring the performance of the models for each test,<br />
a standard set of criteria has been defined. The criteria have<br />
been designed with the sole purpose of measuring how closely<br />
the simulated series of daily flows agree with the measured<br />
series. Owing to the generalized nature of the defined model<br />
validations, it has been necessary to introduce several criteria<br />
for measuring the performance with regard to water balance,<br />
low flows, and peak flows.<br />
The standard set of performance criteria comprises a combination<br />
of the following four graphical plots and three numerical<br />
measures: (1) joint plots of the simulated and observed<br />
hydrographs; (2) scatter diagram of monthly runoffs; (3) flow<br />
duration curves; (4) scatter diagram of annual maximum discharges;<br />
(5) overall water balance; (6) the Nash-Sutcliffe coefficient<br />
(R2); and (7) an index (EI) measuring the agreement<br />
between the simulated and observed flow duration curves.<br />
The coefficient R2, introduced by Nash and Sutcliffe [1970],<br />
is computed on the basis of the sequence of observed and<br />
simulated monthly flows over the whole testing period (perfect<br />
agreement for R2 is 1):<br />
$$R2 = 1 - \frac{\sum_{m=1}^{M} \left( Q_m^{o} - Q_m^{s} \right)^2}{\sum_{m=1}^{M} \left( Q_m^{o} - \bar{Q}^{o} \right)^2}$$
where
$M$: total number of months;
$Q_m^{s}$: simulated monthly flows;
$Q_m^{o}$: observed monthly flows;
$\bar{Q}^{o}$: average observed monthly flows over whole period.
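As an illustration of the Nash-Sutcliffe coefficient defined above, the following minimal sketch computes R2 from two paired flow series. The function name and the sample flows are hypothetical and are not taken from the study.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe coefficient R2 for paired flow series (1 = perfect fit)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Sum of squared simulation errors.
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Sum of squared deviations of the observations from their own mean.
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("observed flows are constant; R2 is undefined")
    return 1.0 - residual / variance


# Hypothetical monthly flows (mm/month), for illustration only.
obs = [2.0, 10.0, 40.0, 25.0, 8.0, 1.0]
sim = [3.0, 12.0, 35.0, 27.0, 6.0, 1.0]
print(round(nash_sutcliffe(obs, sim), 3))
```

A simulation that always returns the observed mean scores 0, so positive values indicate that a model does better than that trivial predictor.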
The flow duration curve error index, EI, provides a numerical<br />
measure of the difference between the flow duration curves<br />
of simulated and observed daily flows (perfect agreement for<br />
EI is 1):<br />
$$EI = 1 - \frac{\int \left| f_o(q) - f_s(q) \right| \, dq}{\int f_o(q) \, dq}$$
where f o (q) is the flow duration curve based on observed daily<br />
flows, and f s (q) is the flow duration curve based on simulated<br />
daily flows.<br />
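The EI index can be approximated from two daily flow series of equal length. The sketch below rests on two assumptions that the text does not spell out: the difference between the curves is taken as an absolute value, and the integrals are replaced by sums over rank-ordered flows. The function name and sample values are hypothetical.

```python
def flow_duration_index(observed, simulated):
    """EI from rank-ordered daily flows (1 = identical flow duration curves)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    # Sorting in descending order gives a discrete flow duration curve.
    fdc_obs = sorted(observed, reverse=True)
    fdc_sim = sorted(simulated, reverse=True)
    total_obs = sum(fdc_obs)
    if total_obs == 0:
        raise ValueError("observed flows sum to zero; EI is undefined")
    difference = sum(abs(o - s) for o, s in zip(fdc_obs, fdc_sim))
    return 1.0 - difference / total_obs


# Hypothetical daily flows, for illustration only.
obs = [0.0, 1.0, 5.0, 2.0]
same_values_shifted = [2.0, 5.0, 1.0, 0.0]  # same flows, different timing
lower_peak = [0.0, 1.0, 4.0, 2.0]           # peak underestimated by 1 unit
print(flow_duration_index(obs, same_values_shifted))
print(flow_duration_index(obs, lower_peak))
```

Because the flows are sorted before comparison, EI is insensitive to timing errors. It therefore complements R2 and the hydrograph plots rather than replacing them.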
Table 2. Land-Use Vegetation and Soil Characteristics Estimated From Available Information and a Brief Field Visit

                                                 Ngezi-South   Lundi    Ngezi-North
Land use/vegetation (area %)
  Dense/closed woody vegetation                       7          13         10
  Open woody vegetation                              36          25         35
  Sparse woody vegetation                            14          19         14
  Grassland                                          11          39         16
  Cropland                                           29           3         19
  Abandoned cropland                                  2           0          6
  Rock outcrops                                       1           0          0
Soil depth range, m                                 0–2.5        0–1       0.5–6
Saturated hydraulic conductivity in root zone soil, mm/hr
  range                                             1–250        1–70      2–100
  average                                            80          60         50
Available water content in root zone soil, vol %
  range                                             10–14       10–12      9–29
  average                                            12          11         17
Figure 6. Model validation test schemes.
Model Construction, Calibration, and Application<br />
All models have had access to the same hydrometeorological<br />
data and catchment information at any time. Due to the nature<br />
of the different models, however, the WATBAL and SHE have<br />
been able to make more direct use of the available information<br />
than the NAM.<br />
In this respect, the NAM has disregarded the spatial variation<br />
of rainfall and used the catchment average series as input,<br />
and for the simulation of ungauged catchments, a subjective<br />
evaluation of catchment characteristics has been undertaken<br />
for estimation of the appropriate model parameters. On the<br />
other hand, the WATBAL and SHE have attempted to account<br />
for the spatial variability of rainfalls as well as information<br />
on typical storm durations to convert daily rainfall series<br />
to realistic hourly rainfalls. Furthermore, these models have<br />
directly used the available information on the spatial variation<br />
of topography and soil and vegetation types and their characteristics<br />
for model setup and estimation of appropriate model<br />
parameters.<br />
As an illustration of the differences in model complexity and<br />
the different abilities of the three modeling systems to utilize<br />
the available distributed catchment data, some key facts for the<br />
three model applications to the 1090 km 2 Ngezi-South catchment<br />
are given in the following three paragraphs.<br />
The NAM model considered the entire catchment as one<br />
unit, utilized only catchment areal rainfall, and initially disregarded<br />
information on soil, vegetation, and geology. Such information<br />
was subsequently used on a subjective basis for assessing<br />
likely parameter values in the PB tests on the other two<br />
catchments. During the model calibrations (when allowed) the<br />
values of the 10 parameters were assessed.<br />
The WATBAL model was established on the basis of six<br />
meteorological zones, eight soil types, and 11 vegetation types.<br />
The spatial occurrences of these three features resulted in 129<br />
hydrological response units. During the model calibrations<br />
(when allowed) parameter values reflecting root depths, soil<br />
water retention capacity, soil hydraulic conductivities, and time<br />
constants in subsurface flow routing were adjusted.<br />
The MIKE SHE also distributed the rainfall information to<br />
different inputs in six meteorological zones. Information on<br />
topography, soil, vegetation, and geology was distributed to a
1-km grid. Thus MIKE SHE carried out calculations at 1090<br />
horizontal grid points. During the model calibrations (when<br />
allowed) parameter values reflecting soil depth and maximum<br />
root depths, as well as an empirical drainage time constant,<br />
were adjusted. In order to minimize the calibration work the<br />
parameter values were not varied within all 1090 grid points,<br />
but kept identical within each of the 13 land-use classes. In<br />
general, the parameters for which field data were available,<br />
such as soil water retention curves and leaf area index, were<br />
not modified during the calibration process.<br />
The present study has aimed at testing various types of<br />
general modeling systems. However, it should be emphasized<br />
that validation results are not solely dependent on the modeling<br />
system but, indeed, also depend on the hydrologist operating<br />
the model, including his or her personal interpretation of<br />
available information and subjective assessments. In the<br />
present study this element of uncertainty has been minimized
to the extent possible by assigning three hydrologists, each
with comprehensive experience in the application of one
of the three modeling systems, and by providing each of them
with the same catchment data.
The calibration procedure adopted was that of “trial and<br />
error,” implying that the hydrologists made subjective adjustments<br />
of parameter values in between the calibration runs. The<br />
numerical and graphical performance criteria described above<br />
were used as important guidance for the hydrologists when<br />
deciding upon the set of parameter values which they assessed<br />
to be the optimal ones. As these decisions inevitably depend on<br />
the personal experiences and judgments of the hydrologists, it<br />
may be argued that this procedure adds an undesirable degree<br />
of subjectivity to the results. However, given the large number<br />
of performance criteria and the large number of adjustable<br />
parameters, especially in the WATBAL and MIKE SHE models,<br />
suitable and well-proven automatic parameter optimization<br />
techniques did not exist. Instead, by applying the standard
calibration procedure with which the three hydrologists had comprehensive
experience, the results may be seen as typical results
from three different modeling systems, when using standard<br />
engineering procedures for data collection, model<br />
construction, and calibration.<br />
Results of Model Validation Test Scheme<br />
The results of the six tests outlined in Figure 6 are summarized<br />
in Figure 7, which shows the overall water balances and
Figure 7. Summary of key validation results for all tests.
the R2 and EI numerical criteria. Simulated and observed<br />
hydrographs are shown in Figure 8 for two of the tests from the<br />
Lundi and Ngezi-North catchments. Annual water balances<br />
are shown for all the tests in Figures 9–15. Assessments of<br />
uncertainties in the PB predictions are shown in Figures 16 and<br />
17. Note that the different performance criteria presented in<br />
the figures focus on different aspects, such as overall annual<br />
water balances (Figures 9–17), monthly flows (R2 in Figure 7),<br />
flow pattern on a daily basis (EI in Figure 7) and hydrograph<br />
shapes (Figure 8). The results are discussed test by test in the<br />
following sections.<br />
SS Test<br />
This test is based on data from Ngezi-South and comprises<br />
an initial calibration of the models and a subsequent validation<br />
using data for an independent period. As indicated in Figures<br />
7, 9, and 10 the performances of the three models are very<br />
similar. All models are able to provide a close fit to the recorded<br />
flows for the calibration period, while for the independent<br />
validation period the performance is somewhat reduced,<br />
as expected. The reduction is, however, limited, and all models<br />
are able to maintain a very good representation of the overall<br />
water balance and the interannual and seasonal variations, as<br />
well as the general flow pattern.<br />
PB Test<br />
This test comprises a transfer of models to the Lundi catchment,<br />
adjustment of parameters to reflect the prevailing catchment<br />
characteristics, and validation without any calibration.<br />
The PB test was arranged to test the capability of the different<br />
models to represent runoff from an ungauged catchment area,<br />
and hence no calibration was allowed prior to the simulation.<br />
All models have used the experience from the Ngezi-South<br />
calibrations in combination with the available information on<br />
the particular catchment characteristics for Lundi. While the<br />
NAM model has used this information in a purely subjective<br />
manner to revise model parameters, both the WATBAL and<br />
MIKE SHE models have directly used this information for the<br />
model setup. The estimates prepared by the latter two models<br />
have, however, also been influenced by the individual modelers’<br />
subjective interpretation of the available information on<br />
soil and vegetation characteristics.<br />
In order to assess the effects of the uncertainty in parameter<br />
estimation as perceived by the individual modelers, three alternative<br />
runoff simulations were prepared, reflecting expected<br />
low, central, and high (runoff) estimates, respectively. The results<br />
of the central estimates are included in Figures 7, 8a, and<br />
11, while annual runoff figures for the assessed uncertainty<br />
intervals are shown in Figure 16.<br />
In general, all models provide an excellent representation of<br />
the general flow pattern and the overall water balance, while<br />
maintaining the significant interannual variability to a satisfactory<br />
degree. The predicted hydrographs for the rainy season of<br />
1973/1974, shown in Figure 8a, confirm that the overall hydrograph<br />
pattern is predicted quite well by all three models.<br />
The overall performance of the central estimates by the
NAM and MIKE SHE models is somewhat reduced compared
to the validation runs for the Ngezi-South catchment, as expected
when no calibration is possible. The estimates would, however,
still be very valuable for all practical purposes. For the
WATBAL model, the central estimate is even better than that
obtained for the validation period for Ngezi-South, providing
a very accurate representation of the observed runoff record.
From Figure 16 it appears that the assessed uncertainty<br />
interval for the NAM predictions of annual runoff is about<br />
twice as wide as for the WATBAL and MIKE SHE predictions.<br />
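The low, central, and high estimates discussed above can be compared with an observation in a few lines. The following is a minimal sketch: the function name, the returned keys, and the sample numbers are all hypothetical, and the relative width is defined here as (high - low) / central, which the study does not state explicitly.

```python
def interval_summary(low, central, high, observed):
    """Summarize a low/central/high annual runoff estimate against an observation."""
    if not low <= central <= high:
        raise ValueError("expected low <= central <= high")
    if central <= 0 or observed <= 0:
        raise ValueError("central estimate and observation must be positive")
    return {
        # Width of the assessed uncertainty interval relative to the central estimate.
        "relative_width": (high - low) / central,
        # True when the observation falls inside the assessed interval.
        "brackets_observed": low <= observed <= high,
        # Signed relative error of the central estimate.
        "central_error": (central - observed) / observed,
    }


# Hypothetical annual runoff values in mm/yr, for illustration only.
summary = interval_summary(low=80.0, central=100.0, high=130.0, observed=110.0)
print(summary)
```

An interval that brackets the observation while remaining narrow is the desirable outcome. A wide interval that still brackets the observation signals larger uncertainty in the parameter estimates.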
M-PB Test<br />
This test is based on the same data from Lundi as the above<br />
PB test. The M-PB test was undertaken to evaluate whether<br />
better model performance could be obtained should short-term
measurements be available for calibration. Hence, before
the results of the previous test were revealed, 1 year (1975/<br />
1976) of runoff record was released for calibration, and the PB<br />
test repeated. The main results of this test are summarized in<br />
Figure 7, and annual water balances are shown in Figure 12.<br />
For the NAM model the short-term calibration leads to an<br />
improved performance, decreasing the deviation of the overall<br />
water balance to some 15%. At the same time, the statistics of<br />
R2 and EI confirm the good representation of monthly flows<br />
and the overall flow pattern in general.<br />
For the WATBAL model the short-term calibration introduces<br />
only a slight improvement in the overall performance.<br />
This is thought to be due to the originally very
good performance, which in any case would be difficult to
improve. The main benefit of the short runoff record is in this
case primarily to confirm the validity of the central estimate
Figure 8. (a) Lundi (central estimates) proxy-basin (PB) test hydrographs for 1973/1974. (b) Ngezi-North
(central estimates) proxy-basin differential split-sample (PB-DSS) test hydrographs for 1977/1978.
and hence to reduce the uncertainty related to the final runoff<br />
estimate. In this sense the calibration has proven quite valuable<br />
and would indeed be so in any practical case.<br />
For the MIKE SHE model the calibration has not introduced<br />
any improvement in the overall performance. As compared<br />
to the best of the original estimates (i.e., the low case)<br />
the calibration has in fact caused a deterioration of the performance.<br />
This rather unfortunate incident may occur for all<br />
Figure 9. Annual water balances for the calibration part of<br />
the SS test on Ngezi-South catchment.<br />
Figure 10. Annual water balances for the validation part of<br />
the SS test on Ngezi-South catchment.
Figure 11. Annual water balances for PB test on Lundi catchment.
Figure 13. Annual water balances for differential split sample<br />
(DSS) test on Lundi catchment.<br />
types of models when calibration data are not fully consistent,<br />
but it appears that the SHE type of model requires a greater<br />
reliability of input data than other, more simple types of models<br />
to avoid the pitfall of miscalibration.<br />
DSS Test<br />
This test consists of model calibrations based on data from<br />
Lundi for 4 wet years (1971/1972–1975/1976 with mean annual<br />
runoff of 171 mm) and validation on data from 3 very dry years<br />
(1981/1982–1983/1984 with mean annual runoff of 8 mm). The<br />
purpose of this test is to assess the capability of the models to<br />
do simulations under nonstationary climate conditions. A summary<br />
of the main results of the differential SS tests is given in<br />
Figure 7, and the annual water balances are shown in Figure<br />
13.<br />
As is evident from the results, both NAM and MIKE SHE<br />
predict the water balance well. The WATBAL model, however,<br />
grossly overestimates the peaks in the relative sense,<br />
causing the simulated average runoff to be about twice that<br />
measured (15 mm compared to 8 mm). The related statistics<br />
are poorer than those in the other testing schemes, but it<br />
should be noted that even small deviations cause poor statistics<br />
when mean flows are as low as those in this case.<br />
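The mean annual runoff figures quoted for the two periods can be checked against the Lundi rows of Table 1. This is a sketch under two stated assumptions: the fourth Lundi row is read as hydrological year 1974/1975, and all five listed years from 1971/1972 to 1975/1976 are averaged, since that is the selection that reproduces the quoted 171 mm. The variable and function names are hypothetical.

```python
# Annual runoff at Lundi in mm/yr, copied from Table 1.
LUNDI_RUNOFF = {
    "1971/1972": 89,
    "1972/1973": 2,
    "1973/1974": 460,
    "1974/1975": 217,  # assumption: fourth Lundi row read as 1974/1975
    "1975/1976": 89,
    "1981/1982": 10,
    "1982/1983": 7,
    "1983/1984": 8,
}
CALIBRATION = ["1971/1972", "1972/1973", "1973/1974", "1974/1975", "1975/1976"]
VALIDATION = ["1981/1982", "1982/1983", "1983/1984"]


def mean_runoff(years):
    """Mean annual runoff (mm/yr) over the given hydrological years."""
    return sum(LUNDI_RUNOFF[y] for y in years) / len(years)


calibration_mean = mean_runoff(CALIBRATION)
validation_mean = mean_runoff(VALIDATION)
watbal_simulated = 15  # mm/yr, simulated average quoted in the text
ratio = watbal_simulated / validation_mean

print(f"calibration mean: {calibration_mean:.1f} mm/yr")
print(f"validation mean:  {validation_mean:.1f} mm/yr")
print(f"WATBAL simulated / observed: {ratio:.2f}")
```

The absolute error of about 7 mm/yr is small, yet it amounts to nearly a doubling of the observed mean, which illustrates why the statistics deteriorate when mean flows are this low.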
PB-DSS Test<br />
This test is based on data from the third catchment, Ngezi-<br />
North. Without allowing for any prior calibration, all modelers<br />
were requested to prepare low, central, and high estimates of<br />
the expected series of flows for the 1977/1978–1983/1984 period.<br />
This period contained a sequence of mainly wet years<br />
(1977/1978–1980/1981) followed by 3 consecutive dry years,<br />
with rainfalls being less than half of that experienced in the<br />
former period.<br />
At the stage when the measured flow record was revealed, it<br />
was unfortunately discovered that the record for the 1979/<br />
1980–1980/1981 years was erroneous and hence had to be<br />
disregarded when computing the test statistics. The results of<br />
this test are summarized in Figure 7, while the annual water<br />
Figure 12. Annual water balances for modified proxy-basin<br />
(M-PB) test on Lundi catchment.<br />
Figure 14. Annual water balances for proxy-basin differential<br />
split-sample (PB-DSS) test on Ngezi-North catchment.
Figure 17. Assessments of uncertainty interval for prediction<br />
of annual water balances in the PB-DSS test on Ngezi-North<br />
catchment.<br />
Figure 15. Annual water balances for modified proxy-basin<br />
differential split-sample (M-PB-DSS) test on Ngezi-North<br />
catchment.<br />
balances are shown in Figure 14. The assessed uncertainty<br />
intervals of the model predicted annual runoff are shown in<br />
Figure 17.<br />
From Figure 17 it appears that all models have managed to<br />
provide for a nonbiased range of estimates of the overall water<br />
balance, which for some models is quite narrow: NAM, 50%;<br />
WATBAL, 30%; and MIKE SHE, 10%. In terms of the<br />
overall water balance, the central estimates of the models<br />
agree within 25% (NAM), 5% (WATBAL), and 2% (MIKE<br />
SHE). The agreement between the recorded and simulated<br />
monthly flows and the flow duration curves, however, is less<br />
accurate for NAM and MIKE SHE than for the WATBAL<br />
model, which provides for an excellent fit in terms of these<br />
measures. The reason for the somewhat lower R2 and EI<br />
figures for the NAM model is related to its generally less<br />
accurate prediction of flows, while for the MIKE SHE model<br />
this is directly linked to the erroneous assessment of a key<br />
drainage parameter, causing the model to produce much more<br />
base flow than actually exists.
Hydrographs showing measured discharge and predictions<br />
by the three models for the rainy season of 1977/1978 are<br />
presented in Figure 8b. These graphs confirm the conclusions
derived from the numerical criteria R2 and EI, namely, that
Figure 16. Assessments of uncertainty interval for prediction<br />
of annual water balances in the PB test on Lundi catchment.<br />
the WATBAL reproduces the observed hydrograph very well,<br />
while the daily hydrograph for MIKE SHE reveals major errors<br />
in overall flow pattern. Note that the model which produces<br />
the best overall water balance (MIKE SHE) has at the<br />
same time the poorest fit when compared on daily values.<br />
M-PB-DSS Test<br />
This test is based on the same data from Ngezi-North as the<br />
previous PB-DSS test. All models were calibrated on only 1 year
of data (1977/1978) before the results for the other years were
revealed, and the above test was then repeated. The
main results of the modified test are shown in Figures 7 and 15.<br />
These results clearly demonstrate that access to only 1 year of<br />
runoff data has enabled all models to provide an excellent<br />
representation of the runoff within the entire testing period.<br />
The overall water balance agrees within 7% for all models,
and despite the fact that the calibration was based on a wet
year, annual flows for the dry period come within the right<br />
order of magnitude, although the relative deviation in some<br />
cases is quite significant. The high R2 and EI scores achieved<br />
by all models confirm that the representation of the monthly<br />
flow sequence and the overall flow pattern has become very<br />
good after the calibration.<br />
Discussion and Conclusions<br />
The three generalized modeling systems, NAM, WATBAL,<br />
and MIKE SHE, have been subject to a rigorous testing<br />
scheme on data from three Zimbabwean catchments. NAM is<br />
a typical representative for the lumped conceptual class of<br />
models, while MIKE SHE similarly belongs to the distributed<br />
physically based class. WATBAL falls between the two classes.<br />
However, for the specific applications in Zimbabwe, where
surface water hydrological aspects have been dominant, it can
be argued that WATBAL can be considered as another representative
of the distributed physically based class.
Although establishing an objective framework for the model<br />
tests and intercomparisons has been attempted, it should be<br />
recognized that the results of a certain validation will be influenced<br />
by the specific test conditions, including the particular<br />
climate, catchment characteristics, data availability and quality,
as well as subjective assessments made by the user (e.g., interpretation<br />
of available information for determining model parameters).<br />
Hence the obtained results are not only a function
of the modeling system itself, but also of the user and numerous<br />
other factors. To arrive at a firm conclusion many validations
would usually be required, and given the limited number of
tests undertaken, conclusions from individual results should
be drawn only with caution.
With this caution regarding generality in mind, a number of<br />
specific conclusions may be derived from the case study. First,<br />
in view of the difficult tasks given to the models involving<br />
simulation for ungauged catchments and nonstationary time<br />
periods, the overall performance of the models is considered<br />
quite impressive. The overall water balance agrees within<br />
25% in all cases but one, and good results are achieved<br />
without balancing out excessive positive and negative deviations<br />
within individual years. In most cases the models score an<br />
R2 value at about 0.8 or greater and an EI index generally<br />
above 0.7.<br />
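The remark about positive and negative deviations balancing out can be illustrated with a short calculation. The sketch below uses hypothetical annual runoff values and hypothetical function names. It shows how an overall water balance can agree perfectly even though individual years deviate strongly, which is why the study inspects the annual balances as well.

```python
def overall_balance_deviation(observed, simulated):
    """Relative deviation of total simulated runoff from total observed runoff."""
    return (sum(simulated) - sum(observed)) / sum(observed)


def annual_deviations(observed, simulated):
    """Relative deviation of simulated from observed runoff, year by year."""
    return [(s - o) / o for o, s in zip(observed, simulated)]


# Hypothetical annual runoff in mm/yr, for illustration only.
observed = [100.0, 200.0, 50.0]
compensating = [150.0, 160.0, 40.0]  # same total, but every year is off

overall = overall_balance_deviation(observed, compensating)
per_year = annual_deviations(observed, compensating)
print(f"overall deviation: {overall:+.0%}")
print("annual deviations:", [f"{d:+.0%}" for d in per_year])
```

Here the overall deviation is zero while the first year is overestimated by half, so a good overall balance alone does not demonstrate a good simulation.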
Secondly, the following is noted with regard to the specific
types of validation tests:
1. For the SS test the NAM, WATBAL, and MIKE SHE<br />
systems generally exhibit similar performance. All models are<br />
able to provide a close fit to the recorded flows for the calibration<br />
period, without severely reducing the performance<br />
during the independent validation period. Hence this test suggests<br />
that if an adequate runoff record covering a few (3–5) years
exists, any of the modeling systems could be used as a reliable<br />
tool for filling in gaps in such records or used to extend runoff<br />
series based on long-term rainfall series. Considering the data<br />
requirements and efforts involved in the setup of the different<br />
models, however, a simple model of the NAM type should<br />
generally be selected for such tasks.<br />
2. For the PB tests, designed for validating the capability of<br />
the models to represent flow series of ungauged catchments, it<br />
had been expected that the physically based models would<br />
produce better results than the simple type of models. The<br />
results, however, do not provide unambiguous support for this<br />
hypothesis. All three modeling systems generated good results,<br />
with the WATBAL providing slightly more accurate results<br />
than the others. Hence for the Zimbabwean conditions the<br />
additional capabilities of the MIKE SHE, as compared to the<br />
WATBAL, namely, the distributed physically based features<br />
relating to subsurface flow, proved to be of little value in<br />
simulating the water balance. For the PB tests it is noticed that<br />
the uncertainty range represented by the low and high estimates<br />
is significantly larger for the NAM than for the WATBAL
and MIKE SHE cases. This probably reflects the fact that
parameter estimation for ungauged catchments is generally<br />
more uncertain for the NAM, whose parameters are semiempirical<br />
coefficients without direct links to catchment characteristics.<br />
3. A general experience of the M-PB tests is that allowing<br />
for model calibration based on only 1 year of runoff data<br />
improves the overall performance of all models. The improvement<br />
appears to be particularly significant for the NAM model,<br />
which also showed the largest uncertainties in the cases where<br />
no calibration was possible.<br />
4. For the DSS tests all models have been able to simulate<br />
flows of the right order of magnitude and correct pattern.<br />
Hence all models have proven their ability to simulate the<br />
runoff pattern in periods with much reduced rainfall and runoff<br />
as compared to the calibration period. On the basis of these<br />
results there appears to be no immediate justification for using an
advanced type of model to represent flows following a significant
change of rainfall, provided that a number of years of data are available
for calibration purposes. It is tempting to extend this
finding to suggest that the simple type of model could be used
to assess the impact of climate change on water resources. It
should be recognized, however, that the above results cannot fully
justify such a hypothesis, since a long-term climate change
would probably bring about changes in vegetation and the associated
evaporation. This type of nonstationarity has not been adequately
tested.<br />
As far as the SS tests are concerned the above conclusion is<br />
in full agreement with results of other studies [e.g., Michaud<br />
and Sorooshian, 1994]. With regard to the PB tests the present<br />
conclusion in favor of the distributed physically based modeling<br />
systems is in agreement with, albeit more vague than, that<br />
of Michaud and Sorooshian [1994].<br />
In summary, the present study, as well as similar studies<br />
reported in the literature, suggests the following conclusions with
regard to rainfall-runoff modeling.
1. Given a few (1–3) years of runoff measurements, a<br />
lumped model of the NAM type would be a suitable tool from<br />
the point of view of technical and economical feasibility. This<br />
applies for catchments with homogeneous climatic input as<br />
well as cases where significant variations in the exogenous<br />
input are encountered.
2. For ungauged catchments, however, where accurate<br />
simulations are critical for water resources decisions, a distributed<br />
model is expected to give better results than a lumped<br />
model if appropriate information on catchment characteristics<br />
can be obtained.<br />
Acknowledgments. The modeling work on the Zimbabwe catchments
was carried out by our colleagues Børge Storm and Merete
Styczen (MIKE SHE) and Roar Jensen (NAM), while the second
author was responsible for the WATBAL work. During the data collection
and field reconnaissance in Zimbabwe, kind help and assistance
were provided by the University of Zimbabwe; the National Herbarium; and
the Department of Meteorological Services and the Hydrological Branch,
Ministry of Energy, Water Resources and Development. The study was
carried out with financial support from the Danish Council of Technology,<br />
and the paper preparation was supported by the Danish Technical<br />
Research Council.<br />
References<br />
Abbott, M. B., J. C. Bathurst, J. A. Cunge, P. E. O’Connell, and J.<br />
Rasmussen, An introduction to the European Hydrological System—Système<br />
Hydrologique Européen, “SHE,” 1, History and philosophy<br />
of a physically based distributed modelling system, J. Hydrol.,<br />
87, 45–59, 1986a.<br />
Abbott, M. B., J. C. Bathurst, J. A. Cunge, P. E. O’Connell, and J.<br />
Rasmussen, An introduction to the European Hydrological System—Système<br />
Hydrologique Européen “SHE,” 2, Structure of a<br />
physically based distributed modelling system, J. Hydrol., 87, 61–77,<br />
1986b.<br />
Anderson, J., Communal land physical resource inventory, Mhondoro<br />
and Ngezi, Draft Rep. A 551, Chem. and Soil Res. Inst., Minist. of<br />
Agric., Harare, Zimbabwe, 1989.<br />
Beven, K. J., Changing ideas in hydrology—The case of physically<br />
based models, J. Hydrol., 105, 157–172, 1989.<br />
Danish Hydraulic Institute (DHI), Validation of hydrological models,<br />
Phase II, Hørsholm, 1993a.<br />
Danish Hydraulic Institute (DHI), MIKE SHE WM, short description,<br />
1993b.<br />
Danish Hydraulic Institute (DHI), MIKE11 short description, 1994.<br />
Flavelle, P., A quantitative measure of model validation and its potential<br />
use for regulatory purposes, Adv. Water Resour., 15, 5–13, 1992.<br />
Franchini, M., and M. Pacciani, Comparative analysis of several conceptual<br />
rainfall-runoff models, J. Hydrol., 122, 161–219, 1991.<br />
Grayson, R. B., I. D. Moore, and T. A. McMahon, Physically based<br />
hydrologic modeling, 2, Is the concept realistic, Water Resour. Res.,<br />
28(10), 2659–2666, 1992.<br />
Jain, S. K., B. Storm, J. C. Bathurst, J. C. Refsgaard, and R. D. Singh,<br />
Application of the SHE to catchments in India, 2, Field experiments<br />
and simulation studies with the SHE on the Kolar subbasin to the<br />
Narmada River, J. Hydrol., 140, 25–47, 1992.<br />
Klemes, V., Sensitivity of water resources systems to climate variations,<br />
WCP Rep. 98, World Meteorological Organisation, Geneva, 1985.<br />
Klemes, V., Operational testing of hydrological simulation models,<br />
Hydrol. Sci. J., 31(1), 13–24, 1986.<br />
Knudsen, J., A. Thomsen, and J. C. Refsgaard, WATBAL: A semidistributed,<br />
physically based hydrological modelling system, Nordic<br />
Hydrol., 17, 347–362, 1986.<br />
Loague, K. M., and R. A. Freeze, A comparison of rainfall-runoff<br />
modeling techniques on small upland catchments, Water Resour.<br />
Res., 21(2), 229–248, 1985.<br />
Michaud, J., and S. Sorooshian, Comparison of simple versus complex<br />
distributed runoff models on a midsized semiarid watershed, Water<br />
Resour. Res., 30(3), 593–605, 1994.<br />
Naef, F., Can we model the rainfall-runoff process today, Hydrol. Sci.<br />
Bull., 26(3), 281–289, 1981.<br />
Nash, J. E., and J. V. Sutcliffe, River flow forecasting through conceptual<br />
models, I, J. Hydrol., 10, 282–290, 1970.<br />
Nielsen, S. A., and Bari, Simulation of runoff from ungauged catchments<br />
by a semi-distributed hydrological modelling system, Proceedings,<br />
6th IAHR Congress, Int. Assoc. for Hydraul. Res., Delft, Netherlands,<br />
1988.<br />
Nielsen, S. A., and E. Hansen, Numerical simulation of the rainfallrunoff<br />
process on a daily basis, Nordic Hydrol., 4, 171–190, 1973.<br />
Refsgaard, J. C., Model and data requirements for simulation of runoff<br />
and land surface processes, in Proceedings from NATO Advanced<br />
Research Workshop “Global Environmental Change and Land Surface<br />
Processes in Hydrology: The Trials and Tribulations of Modelling and<br />
Measuring,” Tucson, May 17–21, 1993, edited by S. Sorooshian and<br />
V. K. Gupta, Springer-Verlag, New York, 1996.<br />
Refsgaard, J. C., and B. Storm, MIKE SHE, in Computer Models of<br />
Watershed Hydrology, edited by V. P. Singh, pp. 809–846, Water<br />
Resour. Publ., Littleton, Colo., 1995.<br />
Refsgaard, J. C., S. M. Seth, J. C. Bathurst, M. Erlich, B. Storm, G. H.<br />
Jørgensen, and S. Chandra, Application of the SHE to catchments in<br />
India, 1, General results, J. Hydrol., 140, 1–23, 1992.<br />
Schlesinger, S., R. E. Crosbie, R. E. Gagné, G. S. Innis, C. S. Lalwani,<br />
J. Loch, J. Sylvester, R. D. Wright, N. Kheir, and D. Bartos, Terminology<br />
for model credibility, Simulation, 32(3), 103–104, 1979.<br />
Smith, R. E., D. R. Goodrich, D. A. Woolhiser, and J. R. Simanton,<br />
Comment on “Physically based modeling, 2, Is the concept realistic”<br />
by R. B. Grayson, I. D. Moore, and T. A. McMahon, Water<br />
Resour. Res., 30(3), 851–854, 1994.<br />
Timberlake, J., Brief description of the vegetation of Mhondoro and<br />
Ngezi communal lands, Mashonaland West, Natl. Herbarium,<br />
Harare, Zimbabwe, 1989.<br />
Tsang, C.-F., The modelling process and model validation, Ground<br />
Water, 29(6), 825–831, 1991.<br />
U.S. Committee, Task Committee on Quantifying Land-Use Change<br />
Effects, Evaluation of hydrological models used to quantify major<br />
land-use change effects, J. Irrig. Drain. Eng., 111(1), 1–17, 1985.<br />
Wilcox, B. P., W. J. Rawls, D. L. Brakensiek, and J. R. Wright, Predicting<br />
runoff from rangeland catchments: A comparison of two<br />
models, Water Resour. Res., 26(10), 2401–2410, 1990.<br />
World Meteorological Organization (WMO), Intercomparison of<br />
conceptual models used in operational hydrological forecasting,<br />
WMO Oper. Hydrol. Rep. 7, WMO 429, Geneva, 1975.<br />
World Meteorological Organization (WMO), Third planning meeting<br />
on World Climate Programme Water, WCP 114, WMO/TD 106,<br />
Geneva, 1985.<br />
World Meteorological Organization (WMO), Intercomparison of<br />
models for snowmelt runoff, WMO Oper. Hydrol. Rep. 23, WMO 646,<br />
Geneva, 1986.<br />
World Meteorological Organization (WMO), Simulated real-time intercomparison<br />
of hydrological models, WMO Oper. Hydrol. Rep. 38,<br />
WMO 779, Geneva, 1992.<br />
J. Knudsen and J. C. Refsgaard, Danish Hydraulic Institute, Agern<br />
Alle 5, DK-2970 Hørsholm, Denmark.<br />
(Received September 25, 1995; revised March 15, 1996;<br />
accepted March 20, 1996.)
[7]<br />
Refsgaard JC (1997) Parametrisation, calibration and validation of distributed<br />
hydrological models.<br />
Journal of Hydrology, 198, 69-97.<br />
Reprinted from Journal of Hydrology with permission from Elsevier
[8]<br />
Refsgaard JC (1997) Validation and Intercomparison of Different Updating<br />
Procedures for Real-Time Forecasting.<br />
Nordic Hydrology, 28, 65-84.<br />
Reprinted with permission from Nordic Hydrology
[9]<br />
Refsgaard JC, Sørensen HR, Mucha I, Rodak D, Hlavaty Z, Bansky L,<br />
Klucovska J, Topolska J, Takac J, Kosc V, Enggrob HG, Engesgaard P,<br />
Jensen JK, Fiselier J, Griffioen J, Hansen S (1998) An Integrated Model for<br />
the Danubian Lowland – Methodology and Applications.<br />
Water Resources Management, 12, 433-465.<br />
Reprinted from Water Resources Management with permission from Springer<br />
(www.springerlink.com)
Water Resources Management 12: 433–465, 1998.<br />
© 1998 Kluwer Academic Publishers. Printed in the Netherlands.<br />
An Integrated Model for the Danubian Lowland –<br />
Methodology and Applications<br />
J. C. REFSGAARD 1 , H. R. SØRENSEN 1 , I. MUCHA 2 , D. RODAK 2 ,<br />
Z. HLAVATY 2 , L. BANSKY 2 , J. KLUCOVSKA 2 , J. TOPOLSKA 4 , J. TAKAC 3 ,<br />
V. KOSC 3 , H. G. ENGGROB 1 , P. ENGESGAARD 5 , J. K. JENSEN 5 ,<br />
J. FISELIER 6 , J. GRIFFIOEN 7 and S. HANSEN 8<br />
1 Danish Hydraulic Institute, Denmark<br />
2 Ground Water Consulting Ltd., Bratislava, Slovakia<br />
3 Irrigation Research Institute (VUZH), Bratislava, Slovakia<br />
4 Water Research Institute (VUVH), Bratislava, Slovakia<br />
5 Water Quality Institute (VKI), Denmark<br />
6 DHV Consultants BV, The Netherlands<br />
7 Netherlands Institute of Applied Geosciences TNO, The Netherlands<br />
8 Royal Veterinary and Agricultural University, Denmark<br />
(Received: 30 December 1997; in final form: 10 November 1998)<br />
Abstract. A unique integrated modelling system has been developed and applied for environmental<br />
assessment studies in connection with the Gabcikovo hydropower scheme along the Danube.<br />
The modelling system integrates model codes for describing the reservoir (2D flow, eutrophication,<br />
sediment transport), the river and river branches (1D flow including effects of hydraulic control structures,<br />
water quality, sediment transport), the ground water (3D flow, solute transport, geochemistry),<br />
agricultural aspects (crop yield, irrigation, nitrogen leaching) and flood plain conditions (dynamics<br />
of inundation pattern, ground water and soil moisture conditions, and water quality). The uniqueness<br />
of the established modelling system is the integration between the individual model codes, each of<br />
which provides complex descriptions of the various processes. The validation tests have generally<br />
been carried out for the individual models, whereas only a few tests on the integrated model were<br />
possible. Based on discussion and examples, it is concluded that the results from the integrated model<br />
can be assumed less uncertain than outputs from the individual model components. In an example,<br />
the impacts of the Gabcikovo scheme on the ecologically unique wetlands created by the river branch<br />
system downstream of the new reservoir have been simulated. In this case, the impacts of alternative<br />
water management scenarios on ecologically important factors such as flood frequency and duration,<br />
depth of flooding, depth to ground water table, capillary rise, flow velocities, sedimentation and water<br />
quality in the river system have been explicitly calculated.<br />
Key words: Danube, environmental impacts, floodplain, Gabcikovo, groundwater, hydropower, integrated<br />
modelling, river branch.<br />
Figure 1. The Danubian Lowland with the new reservoir and the Gabcikovo scheme.<br />
1. Introduction<br />
1.1. THE DANUBIAN LOWLAND AND THE GABCIKOVO HYDROPOWER SCHEME<br />
The Danubian Lowland (Figure 1) in Slovakia and Hungary between Bratislava and<br />
Komárno is an inland delta (an alluvial fan) formed in the past by river sediments<br />
from the Danube. The entire area forms an alluvial aquifer, which receives around<br />
30 m³ s⁻¹ of infiltration water from the Danube throughout the year in the upper parts<br />
of the area and returns it to the Danube and the drainage canals in the downstream<br />
part. The aquifer is an important water resource for municipal and agricultural<br />
water supply.<br />
Human influence has gradually changed the hydrological regime in the area.<br />
Construction of dams upstream of Bratislava together with straightening and embanking<br />
of the river for navigational and flood protection purposes as well as<br />
exploitation of river sediments have significantly deepened the river bed and lowered<br />
the water level in the river and surrounding ground water level. These changes<br />
have had a significant influence on the ground water regime as well as the sensitive<br />
riverine forests downstream of Bratislava. Despite this basically negative trend the<br />
floodplain area with its alluvial forests and associated ecosystems still represents a<br />
unique landscape of outstanding ecological importance.<br />
The Gabcikovo hydropower scheme was put into operation in 1992. A large<br />
number of hydraulic structures have been established as part of the hydropower<br />
scheme. The key structures are a system of weirs across the Danube at Cunovo<br />
15 km downstream of Bratislava, a reservoir created by the damming at Cunovo, a<br />
30 km long lined power and navigation canal, outside the floodplain area, parallel to<br />
the Danube River with intake to the hydropower plant, a hydropower plant and two
ship-locks at Gabcikovo, and an intake structure at Dobrohost, 10 km downstream<br />
of Cunovo, diverting water from the new canal to the river branch system. The<br />
entire scheme has significantly affected the hydrological regime and the ecosystem<br />
of the region, see, e.g., Mucha et al. (1997). The scheme was originally planned as<br />
a joint effort between the former Czechoslovakia and Hungary, and the major parts of<br />
the construction were carried out as such on the basis of a 1977 international treaty.<br />
However, since 1989 Gabcikovo has been a major matter of controversy between<br />
Slovakia and Hungary, who have referred some disputed questions to international<br />
expert groups (EC, 1992, 1993a, b) and others to the International Court of Justice<br />
in The Hague (ICJ, 1997).<br />
Comprehensive monitoring and assessments of environmental impacts have been<br />
made, see Mucha (1995) for an overview. Since 1995 a joint Slovak-Hungarian<br />
monitoring program has been carried out (JAR, 1995, 1996, 1997).<br />
1.2. NEED FOR INTEGRATED MODELLING<br />
The hydrological regime in the area is very dynamic with so many crucial links<br />
and feedback mechanisms between the various parts of the surface- and subsurface<br />
water regimes that integrated modelling is required to thoroughly assess environmental<br />
impacts of the hydropower scheme. This is illustrated by the following three<br />
examples:<br />
• Ground water quality. Based on qualitative arguments it was hypothesised<br />
that the damming and creation of the reservoir might lead to changes in the<br />
oxidation-reduction state of the ground water. The reason for this is that the<br />
reservoir might increase infiltration from the Danube to the aquifer because of<br />
increased head gradients. On the other hand, fine sediment matter might accumulate<br />
on the reservoir bottom, thereby creating a reactive sediment layer. The<br />
river water infiltrating to the aquifer has to pass this layer, which might induce<br />
a change in the oxidation status of the infiltrating water. This could shift the<br />
ground water from oxic or suboxic towards anoxic conditions,<br />
which is undesirable for Bratislava’s water works, most of which are located<br />
near the reservoir. Thus, the oxidation-reduction state of the groundwater is<br />
intimately linked to a balance between the rate at which reducing water infiltrates<br />
and the oxidizing capacity of the aquifer. The infiltration in turn depends on the hydraulic<br />
behaviour of the reservoir: how large the infiltration area is and at which<br />
rates infiltration takes place at different locations. However, without<br />
an integrated model it is not possible to quantify whether and under which<br />
conditions these mechanisms play a significant role in practice, whether they<br />
are correct in principle but without practical importance, and what measures<br />
should be implemented.<br />
• Agricultural production. Changes in discharges in the Danube caused by diversion<br />
of some of the water through the power canal and creation of a reservoir<br />
would lead to changes in the ground water levels. As the agricultural crops<br />
depend on capillary rise from the shallow ground water table and on irrigation, the<br />
new hydrological situation created by the damming of the Danube might influence<br />
the crop yield, the irrigation requirements and the nitrogen leaching.<br />
Traditional crop models describing the root zone are not sufficient in this case,<br />
because the lower boundary conditions (ground water levels) are changed in a<br />
way that can only be quantified if the reservoir, the river and canal system<br />
and the aquifer are also explicitly included in the modelling.<br />
• Floodplain ecosystem. The flora and fauna, which in the floodplain area are<br />
dominated by the river side branches, depend on many factors such as flooding<br />
dynamics, flow velocities, depth of ground water table, soil moisture, water<br />
quality and sediments. Also in this case the important factors depend on the<br />
interaction between the groundwater and the surface water systems (illustrated<br />
in Figure 2), and even on water quality and sediments in the surface water<br />
system, so that quantitative impact assessments require an integrated modelling<br />
approach.<br />
Figure 2. Important processes and their interactions with regard to floodplain hydrology.<br />
2. Integrated Modelling System<br />
2.1. INDIVIDUAL MODEL COMPONENTS<br />
An integrated modelling system (Figure 3) has been established by combining the<br />
following existing and well proven model codes:<br />
• MIKE SHE (Refsgaard and Storm, 1995) which, on a catchment scale, can<br />
simulate the major flow and transport processes in the hydrological cycle:<br />
– 1-D flow and transport in the unsaturated zone<br />
– 3-D flow and transport in the ground water zone<br />
– 2-D flow and transport on the ground surface<br />
– 1-D flow and transport in the river.<br />
All of the above processes are fully coupled, allowing for feedbacks and interactions<br />
between components. In addition, MIKE SHE includes modules for<br />
multi-component geochemical and biodegradation reactions in the saturated<br />
zone (Engesgaard, 1996).<br />
Figure 3. Structure of the integrated modelling system with indication of the interactions<br />
between the individual models.<br />
• MIKE 11 (Havnø et al., 1995) is a one-dimensional river modelling system.<br />
MIKE 11 is used for simulating hydraulics, sediment transport and morphology,<br />
and water quality. MIKE 11 is based on the complete dynamic wave<br />
formulation of the Saint Venant equations. The modules for sediment transport<br />
and morphology are able to deal with cohesive and noncohesive sediment<br />
transport, as well as the accompanying morphological changes of the river bed.<br />
The noncohesive model operates on a number of different grain sizes.<br />
• MIKE 21 (DHI, 1995), which has the same basic characteristics as MIKE 11,<br />
extended to two horizontal dimensions, and is used for reservoir modelling.<br />
• MIKE 11 and MIKE 21 include River/Reservoir Water Quality (WQ) and<br />
Eutrophication (EU) (Havnø et al., 1995; VKI, 1995) modules to describe oxygen,<br />
ammonium, nitrate and phosphorus concentrations and oxygen demands<br />
as well as eutrophication issues such as bio-mass production and degradation.<br />
• DAISY (Hansen et al., 1991) is a one-dimensional root zone model for simulation<br />
of soil water dynamics, crop growth and nitrogen dynamics for various<br />
agricultural management practices and strategies.<br />
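For orientation, the Saint Venant equations referred to for MIKE 11 above are commonly written in the following one-dimensional form. This is the standard textbook formulation, not a transcription of the MIKE 11 code; the symbols (A for flow area, Q for discharge, h for water level, q for lateral inflow per unit length, S_f for friction slope) are assumptions introduced here for illustration.

```latex
% Continuity (mass conservation) with lateral source/sink term q
\frac{\partial A}{\partial t} + \frac{\partial Q}{\partial x} = q

% Momentum (dynamic wave form)
\frac{\partial Q}{\partial t}
  + \frac{\partial}{\partial x}\!\left(\frac{Q^{2}}{A}\right)
  + g A \frac{\partial h}{\partial x}
  + g A S_f = 0
```

In this notation, water exchanged with the surrounding catchment would enter the river description through the lateral term q in the continuity equation.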
2.2. INTEGRATION OF MODEL COMPONENTS<br />
The integrated modelling system is formed by the exchange of data and feedbacks<br />
between the individual modelling systems. The structure of the integrated<br />
modelling system and the exchange of data between the various modelling systems<br />
are illustrated in general in Figure 3, and the steps in the integrated modelling are<br />
described further in Section 6.2 and illustrated in Figure 10 for the case of flood<br />
plain modelling. The interfaces between the various models indicated in Figure 3<br />
are:<br />
A) MIKE SHE forms the core of the integrated modelling system having interfaces<br />
to all the individual modelling systems. The coupling of MIKE SHE and<br />
MIKE 11 is a fully dynamic coupling where data is exchanged within each<br />
computational time step, see Section 2.3 below.<br />
B) Results of eutrophication simulations with MIKE 21 in the reservoir are used<br />
to estimate the concentration of various water quality parameters in the water<br />
that enters the Danube downstream of the reservoir. This information serves as<br />
boundary conditions for water quality simulations for the Danube using MIKE<br />
11.<br />
C) Sediment transport simulations in the reservoir with MIKE 21 provide information<br />
on the amount of fine sediment on the bottom of the reservoir. The<br />
simulated grain size distribution and sediment layer thickness are used to calculate<br />
leakage coefficients, which are used in ground water modelling with MIKE<br />
SHE to calculate the exchange of water between the reservoir and the aquifer.<br />
D) The DAISY model simulates vegetation parameters which are used in MIKE<br />
SHE to simulate the actual evapotranspiration. Ground water levels simulated<br />
with MIKE SHE act as lower boundary conditions for DAISY unsaturated zone<br />
simulations. Consequently, this process is iterative and requires several model<br />
simulations.<br />
E) Results from water quality simulations with MIKE 11 and MIKE 21 provide<br />
estimates of the concentration of various components/parameters in the water<br />
that infiltrates to the aquifer from the Danube and the reservoir. This can be<br />
used in the ground water quality simulations (geochemistry) with MIKE SHE.<br />
A general discussion on the limitations in the above couplings is given in Section 7<br />
below.<br />
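The iterative exchange described under item D can be pictured as a fixed-point iteration between two models. The sketch below is illustrative only: `daisy_step` and `mike_she_step` are hypothetical stand-ins with made-up linear responses, not the actual DAISY or MIKE SHE interfaces, and the convergence tolerance is an assumption.

```python
def daisy_step(groundwater_depth_m):
    """Hypothetical stand-in for a DAISY run: returns a vegetation
    parameter (a leaf area index) for a given depth to ground water."""
    # Toy response: shallower ground water -> more capillary rise -> denser canopy.
    return max(0.5, 4.0 - 0.8 * groundwater_depth_m)


def mike_she_step(leaf_area_index):
    """Hypothetical stand-in for a MIKE SHE run: returns the depth to the
    ground water table (m) for a given vegetation parameter."""
    # Toy response: denser canopy -> more evapotranspiration -> deeper table.
    return 1.0 + 0.25 * leaf_area_index


def iterate_coupling(initial_depth_m=2.0, tol=1e-6, max_iter=50):
    """Alternate the two models until the exchanged depth stops changing."""
    depth = initial_depth_m
    for iteration in range(1, max_iter + 1):
        lai = daisy_step(depth)            # vegetation from current water table
        new_depth = mike_she_step(lai)     # water table from updated vegetation
        if abs(new_depth - depth) < tol:
            return new_depth, lai, iteration
        depth = new_depth
    raise RuntimeError("coupling did not converge")


final_depth, final_lai, n_iterations = iterate_coupling()
print(final_depth, final_lai, n_iterations)
```

With real models each "step" is a full simulation run, so the number of iterations needed matters far more than in this toy case.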
2.3. A COUPLING OF MIKE SHE AND MIKE 11<br />
The focus in MIKE SHE lies on catchment processes with a comparatively less<br />
advanced description of river processes. In contrast, MIKE 11 has a more advanced<br />
description of river processes and a simpler catchment description than MIKE<br />
SHE. Hence, for cases where full emphasis is needed on both river and catchment<br />
processes, a coupling of the two modelling systems is required.<br />
Figure 4. Principles of the coupling between the MIKE SHE catchment code and the MIKE<br />
11 river code.<br />
A full coupling between MIKE SHE and MIKE 11 has been developed (Figure<br />
4). In the combined modelling system, the simulation takes place simultaneously<br />
in MIKE 11 and MIKE SHE, and data transfer between the two models<br />
takes place through shared memory. MIKE 11 calculates water levels in rivers<br />
and floodplains. The calculated water levels are transferred to MIKE SHE, where<br />
flood depth and areal extent are mapped by comparing the calculated water levels<br />
with surface topographic information stored in MIKE SHE. Subsequently, MIKE<br />
SHE calculates water fluxes in the remaining part of the hydrological cycle. Exchange<br />
of water between MIKE 11 and MIKE SHE may occur due to evaporation<br />
from surface water, infiltration, overland flow or river-aquifer exchange. Finally,<br />
water fluxes calculated with MIKE SHE are exchanged with MIKE 11 through<br />
source/sink terms in the continuity part of the Saint Venant equations in MIKE 11.<br />
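The flood-mapping step described above, in which calculated water levels are compared with surface topography, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the 3 × 3 elevation grid, the water level and the 100 m cell size are all hypothetical, and a single water level is applied to every cell, whereas the coupled system works with levels that vary along the river.

```python
def map_flood_depth(water_level_m, ground_elevation_m):
    """Return flood depth per grid cell: positive where the water level
    exceeds the ground surface, zero elsewhere."""
    return [[max(water_level_m - z, 0.0) for z in row]
            for row in ground_elevation_m]


# Hypothetical ground elevations (m above datum) on a small grid.
ground = [
    [130.0, 130.5, 131.0],
    [129.5, 130.0, 130.8],
    [129.0, 129.8, 130.6],
]
water_level = 130.2        # hypothetical river water level (m above datum)
cell_area_m2 = 100.0 * 100.0  # hypothetical 100 m x 100 m cells

depth = map_flood_depth(water_level, ground)
flooded_cells = sum(1 for row in depth for d in row if d > 0.0)
flooded_area_m2 = flooded_cells * cell_area_m2
print(flooded_cells, flooded_area_m2)
```

The resulting depth grid gives both the areal extent (cells with positive depth) and the depth of flooding, the two quantities the text says are mapped in MIKE SHE.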
The MIKE SHE–MIKE 11 coupling is crucial for a correct description of the<br />
dynamics of the river-aquifer interaction. Firstly, the river width is larger than<br />
one MIKE SHE grid cell, in which case the MIKE SHE river-aquifer description is<br />
no longer valid. Secondly, the river/reservoir system comprises a large number of<br />
hydraulic structures, the operation of which is accurately modelled in MIKE 11,<br />
but cannot be accounted for in MIKE SHE. Thirdly, the very complex river branch<br />
system with loops and flood cells needs a very efficient hydrodynamic formulation<br />
such as in MIKE 11.<br />
2.4. COMPARISON TO OTHER MODELLING SYSTEMS REPORTED IN<br />
LITERATURE<br />
Yan and Smith (1994) described the demand and outlined a concept for a fully<br />
integrated ground water–surface water modelling system including descriptions of<br />
hydraulic structures and agricultural irrigation as a decision support tool for water<br />
resources management in South Florida. Typical examples of integrated codes<br />
described in the literature are Menetti (1995) and Koncsos et al. (1995).<br />
In a review of recent advances in understanding the interaction of groundwater<br />
and surface water Winter (1995) mainly describes groundwater codes, such as<br />
MODFLOW, which have been expanded with some, but very limited, surface water<br />
simulation capabilities. The research activities are characterized as ‘... although<br />
studies of these systems have increased in recent years, this effort is minimal compared<br />
to what is needed’. Winter (1995) sees the prospects for the future as follows:<br />
‘Future studies of the interaction of groundwater and surface water would benefit<br />
from, and indeed should emphasise, interdisciplinary approaches. Physical hydrologists,<br />
geochemists, and biologists have a great deal to learn from each other, and<br />
contribute to each other, from joint studies of the interface between groundwater<br />
and surface water.’<br />
Integrated three-dimensional descriptions of flow, transport and geochemical<br />
processes are still rarely seen for groundwater modelling of large basins. Thus,<br />
according to a recent review of basin-scale hydrogeological modelling (Person<br />
et al., 1996) most of the existing reactive transport model codes are based on<br />
one-dimensional descriptions.<br />
While many model codes contain a distributed physically-based representation<br />
of one of the three main components: ground water, unsaturated zone, and surface<br />
water systems, only few codes provide a fully integrated description of all<br />
these three main components. For example in an up-to-date book (Singh, 1995)<br />
presenting descriptions of 25 hydrological codes only three codes, SHE/SHESED<br />
(Bathurst et al., 1995), IHDM (Calver and Wood, 1995) and MIKE SHE (Refsgaard<br />
and Storm, 1995) provide such integrated descriptions. Among these three<br />
codes only MIKE SHE has capabilities for modelling advection-dispersion and<br />
water quality. None of the three codes contained options for computations of hydraulic<br />
structures in river systems, nor agricultural modelling such as crop yield<br />
and nitrogen leaching.<br />
The individual components of the integrated modelling system presented in this<br />
paper, we believe, represent state-of-the-art within their respective disciplines. The<br />
uniqueness is the full integration.<br />
3. Methodology for Model Construction, Calibration, Validation and<br />
Application<br />
The terminology and methodology used in the following is based on the concepts<br />
outlined in Refsgaard (1997).
3.1. MODEL CONSTRUCTION<br />
All of the applied models are based on distributed physically-based model codes.<br />
This implies that most of the required input data and model parameters can ideally<br />
be measured directly in nature.<br />
3.2. MODEL CALIBRATION<br />
The calibration of a physically-based model implies that simulation runs are carried<br />
out and model results are compared with measured data. The adopted calibration<br />
procedure was based on ‘trial and error’ implying that the model user in between<br />
calibration runs made subjective adjustments of parameter values within physically<br />
realistic limits. The most important guidance for the model user in this process was<br />
graphical display of model results against measured values. It may be argued that<br />
such a manual procedure adds a degree of subjectivity to the results. However, given<br />
the very complex and integrated modelling focusing on a variety of output results<br />
and containing a large number of adjustable parameters, automatic parameter optimisation<br />
is not yet possible and ‘trial and error’ remains the only feasible<br />
method in practice.<br />
3.3. MODEL VALIDATION<br />
Good model results during a calibration process cannot automatically ensure that<br />
the model will perform equally well for other time periods, because the<br />
calibration process involves some manipulation of parameter values. Therefore,<br />
model validations based on independent data sets are required. To the extent possible,<br />
limited by data availability, the models have been validated by demonstrating<br />
the ability to reproduce measured data for a period outside the calibration period,<br />
using a so-called split-sample test (Klemes, 1986). For some of the models, the<br />
model was even calibrated on pre-dam conditions and validated on post-dam conditions,<br />
where the flow regime at some locations was significantly altered due to<br />
the construction of the reservoir and related hydraulic structures and canals.<br />
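A split-sample test of the kind described above can be sketched in a few lines. Everything in this example is an assumption made for illustration: the one-parameter runoff model, the rainfall and runoff series, the candidate parameter values (standing in for manual trial-and-error runs) and the use of the Nash-Sutcliffe efficiency as the performance measure. The paper does not state which numerical criteria were applied.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better
    than using the mean of the observations."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def runoff_model(rainfall, runoff_coefficient):
    """Hypothetical one-parameter model: runoff is a fixed share of rainfall."""
    return [runoff_coefficient * r for r in rainfall]


# Made-up data: the first half is the calibration period, the second half
# is held back for validation.
rainfall = [10.0, 0.0, 5.0, 20.0, 0.0, 8.0, 12.0, 3.0]
observed = [4.1, 0.0, 1.9, 8.2, 0.1, 3.0, 5.0, 1.1]
split = 4

# Calibration: try a few parameter values on the calibration period only.
candidates = [0.2, 0.3, 0.4, 0.5]
best_k = max(
    candidates,
    key=lambda k: nash_sutcliffe(observed[:split],
                                 runoff_model(rainfall[:split], k)),
)
calibration_score = nash_sutcliffe(observed[:split],
                                   runoff_model(rainfall[:split], best_k))

# Validation: apply the calibrated value, unchanged, to the independent period.
validation_score = nash_sutcliffe(observed[split:],
                                  runoff_model(rainfall[split:], best_k))
print(best_k, calibration_score, validation_score)
```

The essential point is that the parameter value is frozen after calibration; the validation score is computed on data that played no part in choosing it.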
3.4. MODEL APPLICATION<br />
The validated models have finally been used, as an integrated system, in a scenario<br />
approach to assess the environmental impacts of alternative water management<br />
options. The uncertainties of the model predictions have been assessed through<br />
sensitivity analyses.<br />
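One simple form of sensitivity analysis is to perturb one parameter at a time and record the relative change in the output. The sketch below assumes this one-at-a-time approach, which the paper does not specify; the toy infiltration model and all parameter values are hypothetical and chosen only so that the base case gives a flow of about 30 m³ s⁻¹.

```python
def one_at_a_time_sensitivity(model, base_params, rel_change=0.1):
    """Perturb each parameter in turn by rel_change and return the
    relative change in model output for each parameter."""
    base_output = model(**base_params)
    result = {}
    for name, value in base_params.items():
        perturbed = dict(base_params)          # copy so only one value changes
        perturbed[name] = value * (1.0 + rel_change)
        result[name] = (model(**perturbed) - base_output) / base_output
    return result


def toy_infiltration(leakage_coefficient, head_difference_m, area_m2):
    """Hypothetical model: infiltration (m3/s) as the product of a leakage
    coefficient (1/s), a head difference (m) and an infiltration area (m2)."""
    return leakage_coefficient * head_difference_m * area_m2


base = {
    "leakage_coefficient": 1.0e-6,  # 1/s, hypothetical
    "head_difference_m": 2.0,       # m, hypothetical
    "area_m2": 1.5e7,               # m2, hypothetical
}
sensitivity = one_at_a_time_sensitivity(toy_infiltration, base)
for name, change in sensitivity.items():
    print(name, round(change, 3))
```

Because the toy model is a plain product, every parameter shows the same 10 % response; in a coupled system the responses would differ and would indicate which inputs dominate the prediction uncertainty.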
4. Selected Results from Model Construction, Calibration and Validation of<br />
Individual Components<br />
Comprehensive data collection and processing as well as model calibration and<br />
validation were carried out (DHI et al., 1995). In the following sections a few<br />
selected results are presented for the individual components. Further aspects of<br />
model validation focusing on integrated aspects are discussed in Section 5.<br />
4.1. RIVER AND RESERVOIR FLOW MODELLING<br />
The following models have been constructed, calibrated and validated:<br />
• one-dimensional MIKE 11 model for the Danube from Bratislava to Komarno,<br />
• one-dimensional MIKE 11 model for the river branch system at the Slovak<br />
floodplain, and<br />
• two-dimensional MIKE 21 model for the reservoir.<br />
The MIKE 11 models have been established in two versions reflecting post- and<br />
pre-dam conditions, respectively.<br />
4.1.1. MIKE 11 River Model for the Danube<br />
The MIKE 11 model for the Danube is based on river cross-sections measured in<br />
1989 and 1991. The applied boundary conditions were measured daily discharges<br />
at Bratislava (upstream) and a discharge rating curve at Komarno (downstream).<br />
The model was initially calibrated for two steady state situations reflecting a low<br />
flow situation (905 m³ s⁻¹) and a flow situation close to the long term average<br />
(2390 m³ s⁻¹), respectively. Subsequently, the model was calibrated in a nonsteady<br />
state against daily water level and discharge measurements from 1991. The model<br />
was finally validated by demonstrating the ability to reproduce measured daily<br />
water level data from 1990. Calibration and validation results are presented in<br />
Topolska and Klucovska (1995). For the post-dam model some river reaches were<br />
updated with cross-sections measured in 1993. In addition, the reservoir and related<br />
hydraulic structures and canals were included. As the conditions after damming<br />
of the Danube have changed significantly, re-calibration of the post-dam model<br />
was carried out for the period April 1993–July 1993. Subsequently, the model was<br />
validated against measured data from the period November 1992–March 1993.<br />
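The downstream boundary condition mentioned above is a discharge rating curve, that is, a relation giving discharge as a function of water level. A common way to express such a curve is a power law, sketched below. The power-law form and the coefficient values are assumptions for illustration and are not the rating curve actually used at Komarno.

```python
def rating_curve_discharge(stage_m, a, h0_m, b):
    """Power-law rating curve Q = a * (h - h0)**b.

    stage_m : water level (m above datum)
    a, b    : empirical coefficients (hypothetical here)
    h0_m    : stage of zero flow (m above datum, hypothetical here)
    """
    if stage_m <= h0_m:
        return 0.0  # no flow at or below the zero-flow stage
    return a * (stage_m - h0_m) ** b


# Hypothetical coefficients, chosen only to give round numbers.
a, h0, b = 100.0, 103.0, 2.0
for stage in (103.0, 106.0, 108.0):
    print(stage, rating_curve_discharge(stage, a, h0, b))
```

Used as a boundary condition, such a relation lets the model compute the downstream discharge from the simulated water level at every time step instead of requiring a measured series.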
4.1.2. MIKE 11 Model for the River Branch System<br />
The Danubian floodplain is a forest area of major ecological interest characterised<br />
by a complex system of river branches. A layout of the river branch system is shown<br />
in Figure 5. The cross-sections in the river branch system were measured during the<br />
1960s and 1970s. The pre-dam model was calibrated against water level and flow<br />
data from the 1965 flood. In the post-dam situation, the branch system is fed by an
Figure 5. Layout of the river branch system on the Slovakian side of the Danube.<br />
inlet structure with water from the power canal. The system consists of a number<br />
of compartments (cascades) separated by small dikes. On each of these dikes combined<br />
structures of culverts and spillways are located enabling some control of the<br />
water levels and flows in the system. Results of the model calibration against data<br />
measured during the summer 1994 are shown in Klucovska and Topolska (1995).<br />
Finally, the model was validated by demonstrating the ability to reproduce water<br />
levels measured during the summer of 1993. Some of these results are presented in<br />
Sørensen et al. (1996).<br />
4.1.3. MIKE 21 Reservoir Model<br />
A MIKE 21 hydrodynamic model for the reservoir was established based on a<br />
reservoir bathymetry measured in 1994. The spatial resolution of the finite difference<br />
model is 100 × 50 m. The model was calibrated against flow velocities<br />
measured in the reservoir in the autumn of 1994.<br />
4.2. GROUND WATER FLOW MODELLING<br />
Ground water modelling has been carried out at three different spatial scales:<br />
• A regional ground water model for pre-dam conditions (3000 km², 500 m<br />
horizontal grid, 5 vertical layers).<br />
• A regional ground water model for post-dam conditions (3000 km², 500 m<br />
horizontal grid, 5 vertical layers).<br />
• A local ground water model for an area surrounding the reservoir for both pre- and<br />
post-dam conditions (200 km², 250 m horizontal grid, 7 vertical layers).<br />
• A local ground water model for the river branch system for both pre- and post-dam<br />
conditions (50 km², 100 m horizontal grid, 2 vertical layers).<br />
• A cross-sectional (vertical profile) model near Kalinkovo at the left side of the<br />
reservoir (2 km long, 10 m horizontal grid, 24 vertical layers).<br />
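The model dimensions listed above give a feel for the relative computational size of each model. The following is a rough sketch only: it assumes every grid square inside the stated area is active in every layer, which the text does not state, and the function names are illustrative.

```python
# Model configurations as listed above: (area in km2, horizontal grid in m, layers).
AREAL_MODELS = {
    "regional (pre-dam or post-dam)": (3000, 500, 5),
    "local, reservoir area": (200, 250, 7),
    "local, river branch system": (50, 100, 2),
}

def cell_count(area_km2, grid_m, layers):
    """Approximate cell count, assuming the whole area is active in all layers."""
    cells_per_layer = area_km2 * 1_000_000 / grid_m**2
    return round(cells_per_layer) * layers

def profile_cell_count(length_m, grid_m, layers):
    """Approximate cell count for a vertical cross-sectional (profile) model."""
    return round(length_m / grid_m) * layers

for name, spec in AREAL_MODELS.items():
    print(f"{name}: about {cell_count(*spec):,} cells")
print(f"Kalinkovo cross-section: about {profile_cell_count(2000, 10, 24):,} cells")
```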
The regional and local ground water models all use the coupled version of<br />
MIKE SHE and MIKE 11 and hence include modelling of evapotranspiration and<br />
snowmelt processes, river flow, unsaturated flow and ground water flow. The cross-sectional<br />
model only includes ground water processes.<br />
4.2.1. Model Construction<br />
Comprehensive input data were available and used in the construction of the models.<br />
In general, the regional and the local models are based on the same data with<br />
the main difference being that the local models provide finer resolutions and less<br />
averaging of measured input data. The two regional models, reflecting pre- and<br />
post-dam conditions, are basically the same. The only difference is that the post-dam<br />
model includes the reservoir and related hydraulic structures and seepage<br />
canals.<br />
The models are based on information on location of river systems and cross-sectional<br />
river geometry, surface topography, land use and cropping pattern, soil<br />
physical properties and hydrogeology. In addition, time series of daily precipitation,<br />
potential evapotranspiration and temperature as well as discharge inflow at<br />
Bratislava have been used. Comprehensive geological data exist from this area, see<br />
e.g., Mucha (1992) and Mucha (1993). The aquifer, ranging in thickness from about<br />
10 m at Bratislava to about 450 m at Gabcikovo, consists of Danube river sediments<br />
(sand and gravel) of late Tertiary and mainly Quaternary age. The present model is<br />
based on the work of Mucha et al. (1992a, b).<br />
4.2.2. Model Calibration<br />
The ground water model was calibrated against selected measured time series of<br />
ground water levels. The following parameters were subject to calibration: specific<br />
yield in the upper aquifer layer, leakage coefficients for the river bed and hydraulic<br />
conductivities for the aquifer layers. The soil physical characteristics for the unsaturated<br />
zone have been adopted directly from the unsaturated zone/agricultural<br />
modelling.<br />
The river model that has been used in the ground water modelling is identical<br />
to the MIKE 11 river model of the Danube, which was successfully validated independently<br />
as a ‘stand alone model’ (Subsection 4.1, above). When coupling MIKE<br />
SHE and MIKE 11, water is exchanged between the two models. The amount of<br />
water that recharges the aquifer in the upstream part and re-enters the river further<br />
downstream is of the order of 10–60 m³ s⁻¹, depending on the Danube discharge<br />
and on the actual ground water level. The recharge is typically two orders of magnitude<br />
less than the Danube discharge, and hence, a re-calibration of the MIKE<br />
11 river model is not required. As the major part of the ground water recharge<br />
originates from infiltration through the river bed, the leakage coefficient for the<br />
river bed becomes very important. Limited field information was available on this<br />
parameter, and hence, it was assumed spatially constant and through calibration<br />
assessed to be 5 × 10⁻⁵ s⁻¹ for the Danube and Vah rivers and 5 × 10⁻⁶ s⁻¹ for
the Little Danube. These values are in good agreement with previous modelling<br />
experiences (Mucha et al., 1992b).<br />
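To illustrate what a leakage coefficient with unit s⁻¹ implies, the sketch below uses the common head-difference formulation, in which the exchange flux per unit river bed area equals the leakage coefficient times the difference between river stage and ground water head. This formulation is an assumption made for illustration; the text does not spell out the exchange equation used in the coupled MIKE SHE/MIKE 11 code. The heads in the example are hypothetical.

```python
# Calibrated leakage coefficients from the text, in s^-1.
LEAKAGE = {"Danube": 5e-5, "Vah": 5e-5, "Little Danube": 5e-6}

def exchange_flux(river, river_stage_m, groundwater_head_m):
    """Flux per unit river bed area in m/s (assumed head-difference form).

    Positive values mean the river recharges the aquifer,
    negative values mean ground water re-enters the river.
    """
    return LEAKAGE[river] * (river_stage_m - groundwater_head_m)

# Hypothetical heads, for illustration only.
print(exchange_flux("Danube", 130.0, 129.0))         # river stage above ground water: recharge
print(exchange_flux("Little Danube", 110.0, 110.5))  # ground water above river stage: drainage
```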
When keeping the specific yield and the leakage coefficients for the river bed<br />
fixed, the main calibration parameters were the hydraulic conductivities of the saturated<br />
zone. About 300 time series of ground water level observations were available<br />
for the model area, typically in terms of 30–40 yr of weekly observations. The<br />
calibration was carried out on the basis of about 80 of these series for the period<br />
1986–1990. In the parameter adjustments the overall spatial pattern described in the<br />
geological model was maintained. Some of the calibration results are illustrated<br />
in Figure 6 showing observed Danube discharge data together with simulated and<br />
measured ground water levels for three wells located at different distances from the<br />
Danube. Wells 694 and 740 are seen to react relatively quickly to fluctuations in<br />
river discharge as compared to well 7221, which is located further away from the<br />
river. This illustrates how the dynamics of the Danube propagates and is dampened<br />
in the aquifer.<br />
4.2.3. Model Validation<br />
The calibrated ground water model was validated by demonstrating the ability to<br />
reproduce measured ground water tables after damming of the Danube. In this<br />
regard the only model modification is the inclusion of the reservoir and related<br />
structures and canals. Due to the nonstationarity of the hydrological regime such<br />
a validation test, which according to Klemes (1986) is denoted a differential split-sample<br />
test, is a demanding test. Figure 7 shows the simulated and observed ground<br />
water levels for the same three observation wells as shown for the calibration period<br />
in Figure 6. The effects of the damming of the Danube in October 1992, when the<br />
new reservoir was established, are clearly seen in terms of increased ground water<br />
levels and reduced ground water dynamics when comparing the two figures. These<br />
features are well captured by the model.<br />
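The logic of a differential split-sample test can be sketched as follows: the record is split at the regime change, the model is calibrated on one side only, and its performance is then reported separately for the other side. The error measure (RMSE), the exact split day and all numbers below are assumptions for illustration; they are not taken from the study.

```python
from datetime import date
from math import sqrt

REGIME_CHANGE = date(1992, 10, 1)  # damming in October 1992; the day is assumed

def rmse(pairs):
    """Root mean square error over (observed, simulated) pairs."""
    return sqrt(sum((obs - sim) ** 2 for obs, sim in pairs) / len(pairs))

def split_by_regime(records, change=REGIME_CHANGE):
    """Split (date, observed, simulated) records into pre- and post-change pairs."""
    pre = [(o, s) for d, o, s in records if d < change]
    post = [(o, s) for d, o, s in records if d >= change]
    return pre, post

# Hypothetical ground water levels (m) for one well, for illustration only.
records = [
    (date(1988, 5, 1), 128.4, 128.5),
    (date(1990, 5, 1), 128.1, 128.0),
    (date(1993, 5, 1), 129.6, 129.4),
    (date(1994, 5, 1), 129.7, 129.8),
]
calibration, validation = split_by_regime(records)
print(f"calibration (pre-dam) RMSE: {rmse(calibration):.2f} m")
print(f"validation (post-dam) RMSE: {rmse(validation):.2f} m")
```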
4.3. GROUND WATER QUALITY<br />
A geochemical field investigation was carried out in a cross-section north of the<br />
reservoir near Kalinkovo as a basis for identifying the key geochemical processes<br />
and estimating parameter values (see Mucha, 1995). Eleven multi-screen wells<br />
were installed close to the water supply wells at Kalinkovo forming a 7.5 km long<br />
cross-section parallel to the regional ground water flow direction. The multi-screen<br />
wells have been sampled frequently to investigate the ongoing bio-geochemical<br />
processes during infiltration of the Danube river water into the aquifer.<br />
A ground water quality model was established for the Kalinkovo cross-sectional<br />
profile based on all the measured field data. This model includes a comprehensive<br />
description of the bio-geochemical processes such as kinetically controlled<br />
denitrification and equilibrium controlled inorganic chemistry based on the well<br />
known PHREEQE code. More details are given in Griffioen et al. (1995) and<br />
Figure 6. Danube discharge at Bratislava together with simulated and observed ground water<br />
levels for three wells before the damming of the Danube (calibration period).
Figure 7. Simulated and observed ground water levels for three wells after damming of the<br />
Danube (validation period).<br />
Engesgaard (1996). The transport part of the Kalinkovo cross-section has been<br />
calibrated against ¹⁸O isotope data. The parameters describing reactive processes<br />
have been assessed and adjusted on the basis of the detailed field measurements<br />
in the Kalinkovo cross-sectional profile. It was shown that the geochemical model<br />
behaves qualitatively correctly (Engesgaard, 1996).<br />
4.4. UNSATURATED ZONE AND AGRICULTURAL MODELLING<br />
Modelling of the pre-dam and post-dam conditions of agricultural potential and<br />
nitrate leaching risk was carried out using a representative selection of soil units,<br />
cropping pattern and meteorological data covering the area between the Danube and<br />
the Maly Danube (Figure 1). The DAISY model uses time-varying ground water levels<br />
(simulated with the regional MIKE SHE ground water model) as the lower boundary<br />
condition for the unsaturated flow simulations. Cropping pattern and fertiliser<br />
application are included in the model based on measurements and statistical data.<br />
The model was calibrated on the basis of data from field experiments carried<br />
out during the years 1981–1987 at the experimental station in Most near Bratislava.<br />
During this process the crop parameters used in the model were adjusted to Slovak<br />
conditions. After the initial model construction and calibration, the model performance<br />
was evaluated through preliminary simulations using data from a number of<br />
plots located on an experimental field site at Lehnice in the middle of the project<br />
area. On the basis of comparisons between measured and simulated values of<br />
nitrogen uptake, dry matter yield and nitrate concentrations in soil moisture, the<br />
model performance under Slovak conditions was considered satisfactory (DHI et<br />
al., 1995).<br />
4.5. RIVER AND RESERVOIR SEDIMENT TRANSPORT MODELLING<br />
4.5.1. Danube River Sediment Transport<br />
A one-dimensional morphological model was established for the Danube. Because the<br />
model operates with cross-sectional averaged parameters representing the river<br />
reach between every computational point (i.e. approximately 500 m), a special<br />
technique for comparing ‘real’ and simulated state variables was required. Therefore,<br />
the changes in mean water level over a decade rather than changes in bed<br />
elevations were compared between observations and simulations. For this purpose<br />
the changes in the so-called ‘Low Regulation and Navigable Water Level’<br />
(LR-NWL) were used. LR-NWL is specified by the Danube Commission as the water<br />
level corresponding to Q94%, which is approximately 980 m³ s⁻¹. By using such<br />
an approach, perturbations in bed levels from one cross-section to another did not<br />
destroy the picture of the overall trends in aggradation and degradation of the river<br />
bed. The results of the calibration (1974–84) and validation runs (1984–90) are<br />
described in Topolska and Klucovska (1995).<br />
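As a minimal sketch of how a low-flow reference discharge such as Q94% could be derived from a discharge record, the function below reads Q94% as the discharge equalled or exceeded in 94% of the observations. That reading is an assumption here, since the text only gives the resulting value. The function name and the input series are hypothetical.

```python
import math

def exceedance_discharge(discharges, fraction):
    """Return the discharge equalled or exceeded in `fraction` of the record."""
    ranked = sorted(discharges, reverse=True)
    # Rank of the value that the requested share of observations reaches or exceeds;
    # rounding guards against floating-point noise in fraction * n.
    rank = math.ceil(round(fraction * len(ranked), 9))
    return ranked[max(rank, 1) - 1]

# Hypothetical daily discharges in m3/s, for illustration only.
daily_q = [float(q) for q in range(800, 3300, 25)]
q94 = exceedance_discharge(daily_q, 0.94)
share = sum(q >= q94 for q in daily_q) / len(daily_q)
print(f"Q94% = {q94:.0f} m3/s, reached or exceeded in {share:.0%} of the record")
```

The water level corresponding to this discharge would then be read from the simulated or observed stage at each cross-section, and its change over the decade compared.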
4.5.2. Sediment Transport in the River Branch System<br />
A one-dimensional fine sediment model was constructed for the river branch system<br />
in order to have a tool for quantitative evaluation of the possible sedimentation<br />
in the river branch system for alternative water management options. The upstream<br />
boundary condition for the model was provided in terms of concentration of suspended<br />
sediments simulated by the reservoir model. As virtually no field data on<br />
sedimentation in the river branch system were available, neither calibration nor validation<br />
was possible. Instead, model parameter values reported in the literature for<br />
similar studies were used.<br />
4.5.3. Reservoir Sediment Model<br />
A two-dimensional fine graded sediment model was constructed for the reservoir.<br />
The suspended sediment input was imposed as a boundary condition in Bratislava<br />
with time series of sediment concentrations of six suspended sediment fractions<br />
with their own grain sizes and fall velocities. The fall velocity for each of the six<br />
fractions was assessed according to field measurements. No further model calibration<br />
was carried out. The only field data available for validation were a few bed
sediment samples from summer 1994 with data on sedimentation thickness and<br />
grain size analyses (Holobrada et al., 1994). A comparison of model results and<br />
field data indicated that a reservoir sedimentation of the right order of magnitude<br />
was simulated. The simulated reservoir sedimentation corresponded to 42% of the<br />
total suspended load at Bratislava.<br />
4.6. SURFACE WATER QUALITY MODELLING<br />
4.6.1. Danube River Model<br />
A BOD-DO model (MIKE 11 WQ) has been used to describe the water quality<br />
in the main stream of the Danube between Bratislava and Komarno. This model<br />
describes oxygen concentration (DO) as a function of the decay of organic matter<br />
(BOD), transformation of nitrogen components, re-aeration, oxygen consumption<br />
by the bottom and oxygen production and respiration by living organisms.<br />
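The core BOD-DO interaction can be illustrated with the classical Streeter-Phelps solution, which keeps only BOD decay and re-aeration. This is a deliberately simplified stand-in, not the MIKE 11 WQ formulation: nitrogen transformation, bottom oxygen consumption, production and respiration are all left out, and every parameter value below is hypothetical.

```python
from math import exp

def oxygen_deficit(t_days, bod0, deficit0, k_decay, k_reaer):
    """Dissolved oxygen deficit (mg/l) after travel time t_days.

    Solves dD/dt = k_decay * L - k_reaer * D with L = bod0 * exp(-k_decay * t),
    where k_decay and k_reaer are first-order rates in 1/day.
    """
    if k_decay == k_reaer:
        raise ValueError("the closed form used here requires k_decay != k_reaer")
    growth = k_decay * bod0 / (k_reaer - k_decay)
    return (growth * (exp(-k_decay * t_days) - exp(-k_reaer * t_days))
            + deficit0 * exp(-k_reaer * t_days))

# Hypothetical values: BOD 10 mg/l, initial deficit 2 mg/l, rates in 1/day.
for t in (0.0, 1.0, 2.0, 5.0):
    print(f"t = {t:.0f} d: deficit = {oxygen_deficit(t, 10.0, 2.0, 0.3, 0.6):.2f} mg/l")
```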
As the conditions from pre-dam to post-dam have changed significantly, separate<br />
calibrations and validations were carried out. The pre-dam model was calibrated<br />
against data from October 1991 and validated against data from April and August/September<br />
1991. The post-dam model was calibrated against data from May<br />
1993 and validated against data from June 1993.<br />
4.6.2. Model for the River Branch System<br />
The water quality in the river branches was simulated with a eutrophication model<br />
(MIKE 11 EU), in which the algae production is the driving force. The algae<br />
growth in this model is described as a function of incoming light, transparency<br />
of the water, temperature, sedimentation and growth rate of the algae and of the<br />
available inorganic nutrients. The calibration was carried out on the basis of the few<br />
data available during the period June–August 1993. Due to lack of further data no<br />
independent model validation was possible and hence, the uncertainties related to<br />
applying the model for making quantitative predictions of the effects of alternative<br />
water management schemes may be considerable.<br />
4.6.3. Reservoir Model<br />
In the reservoir the driving force is also the algae growth and hence, a eutrophication<br />
model (MIKE 21 EU) was applied. The reservoir model was calibrated<br />
against measured data from August 1994. This field programme was substantial<br />
and resulted in much more data than available for the river branch system. Good<br />
correspondence between simulated and observed values was achieved during the<br />
calibration period. However, no further data have been available for independent<br />
validation tests.<br />
5. Validation of Integrated Model<br />
The model calibration and validation have basically been carried out for the individual<br />
models using separate domain data for river system, aquifer system, etc.<br />
Rigorous validation tests of the integrated model were generally not possible due<br />
to lack of specific and simultaneous data on the processes describing the various<br />
couplings. Furthermore, although reasonably good assessments of uncertainties<br />
of the individual model predictions could be made, it was not obvious how such<br />
uncertainty would propagate in the integrated model.<br />
It can be argued that uncertainties in output from one model would in principle<br />
influence the uncertainties in other components of the integrated modelling system,<br />
thus adding to the total uncertainty of the integrated model. Following this line of<br />
argument would lead to the conclusion that the uncertainty of predictions by the<br />
integrated model would be larger than the corresponding uncertainty of predictions<br />
made by traditional individual models. On the other hand it can also be argued<br />
that in the integrated modelling approach the uncertainties in the crucial boundary<br />
conditions are reduced, because assumptions needed for executing individual<br />
models are substituted by model simulations based on data from neighbouring<br />
domains, which, if properly calibrated and validated, better represent the boundary<br />
effects. This would lead to the conclusion that the uncertainties in predictions by<br />
the integrated model would be smaller than those of the individual models.<br />
In the present study, no theoretical analyses have been made of this problem.<br />
Instead, a few validation tests have been made for cases where the couplings could<br />
indirectly be checked by testing the performance of the integrated model against<br />
independent data. In the following, results from one of these validation tests for the<br />
integrated model are shown.<br />
The river-aquifer interaction changed significantly when the reservoir was established.<br />
An important model parameter describing this interaction is the leakage<br />
coefficient, which was calibrated on the basis of ground water level data for the pre-dam<br />
situation (Subsection 4.2). For the post-dam situation the MIKE 21 reservoir<br />
model calculates the thickness and grain sizes of the sedimentation at all points<br />
in the reservoir. By use of the Carman-Kozeny formula, the leakage factors were<br />
recalculated for the area now covered by the reservoir. The model results<br />
were then checked against ground water level observations from wells near the<br />
reservoir, and it was found that a calibration factor of 10 had to be applied to the<br />
Carman-Kozeny formula. This can theoretically be justified by the fact that the<br />
sediments are stratified or layered due to variations in flow velocities during the<br />
sedimentation process. The same formula and the same calibration factor were also<br />
used for converting all texture data from aquifer sediment samples to hydraulic<br />
conductivity values in the model.<br />
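The conversion from simulated sediment thickness and grain size to a leakage coefficient can be sketched as follows. The sketch uses one common textbook form of the Carman-Kozeny relation and takes the leakage coefficient as conductivity divided by layer thickness. The exact form used in the study, the handling of several grain size fractions, and whether the calibration factor of 10 raises or lowers the computed value are not given in the text, so the factor is left as a plain multiplier and all inputs are hypothetical.

```python
RHO = 1000.0   # water density, kg/m3
G = 9.81       # gravitational acceleration, m/s2
MU = 1.0e-3    # dynamic viscosity of water, Pa s (assumed, roughly 20 degrees C)

def kozeny_carman_conductivity(grain_diameter_m, porosity):
    """Hydraulic conductivity in m/s from a representative grain diameter."""
    return (RHO * G / MU) * grain_diameter_m**2 / 180.0 * porosity**3 / (1.0 - porosity)**2

def leakage_coefficient(grain_diameter_m, porosity, thickness_m, calibration_factor=1.0):
    """Leakage coefficient in 1/s for a sediment layer of given thickness.

    calibration_factor multiplies the conductivity; pass a value above or
    below 1 depending on the direction of the correction.
    """
    k = calibration_factor * kozeny_carman_conductivity(grain_diameter_m, porosity)
    return k / thickness_m

# Hypothetical fine sediment layer: 0.05 mm grains, porosity 0.4, 0.1 m thick.
print(f"{leakage_coefficient(5e-5, 0.4, 0.1):.2e} 1/s")
```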
Now, how can the validity of the integrated model be tested? The ground water<br />
level observations from a few wells have been used to assess the leakage calibration<br />
factor, so although the model output was subsequently checked against data from
Figure 8. Measured and simulated discharges in seepage canals. The data are from a particular<br />
day in May 1995 and are given in m³ s⁻¹.<br />
many more wells, it may be argued that this in itself is not sufficient for a true model<br />
validation. Consider instead a comparison of simulated and measured discharges<br />
in the so-called seepage canals, which are small canals constructed a few hundred<br />
meters away from the reservoir with the aim of intercepting part of the infiltration<br />
through the bottom of the reservoir. In Figure 8 it can be seen that the model<br />
simulations match the measured data remarkably well at different locations along<br />
the seepage canals. Thus, at the two stations most downstream on both seepage<br />
canals (stations 2809 and 3214) the agreements between model predictions and<br />
field data are within 5%. This is a powerful test, because the discharge data have<br />
not been used at all in the calibration process, and because it integrates the effects of<br />
reservoir sedimentation, calculation of leakage factors and geological parameters.<br />
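The agreement criterion quoted above amounts to a relative deviation check between simulated and measured discharge. A small sketch, using hypothetical discharges and station labels because the actual values are only shown in Figure 8:

```python
def relative_deviation(simulated, measured):
    """Absolute deviation of the simulated value as a fraction of the measured value."""
    return abs(simulated - measured) / abs(measured)

# Hypothetical (simulated, measured) discharges in m3/s, for illustration only.
checks = {"station A": (1.92, 2.00), "station B": (3.40, 3.00)}
for name, (sim, obs) in checks.items():
    dev = relative_deviation(sim, obs)
    verdict = "within" if dev <= 0.05 else "outside"
    print(f"{name}: {dev:.1%} deviation, {verdict} the 5% band")
```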
6. Model Application – Case Study of River Branch System<br />
6.1. HYDROLOGY OF RIVER BRANCH SYSTEM<br />
The hydrology of the river branch system is highly complex with many processes<br />
influencing the water characteristics of importance for flora and fauna (Figure 2).<br />
These processes are highly interrelated and dynamic with large variations in time<br />
and space. The complexity of the floodplain, with its river branch system, is indicated<br />
in Figures 5 and 9 for the 20 km reach downstream of the reservoir on the<br />
Slovakian side, where alluvial forest occurs. Before the damming of the Danube<br />
Figure 9. Plan and perspective view of the surface topography, of the river branches and the<br />
related flood plains as represented in a model network of 100 m grid squares.<br />
in 1992 the river branches were connected with the Danube during periods with<br />
discharge above average. However, some of the branches were only active during<br />
flood situations a few days per year. It was anticipated that after the damming,<br />
the water level in the Danube would decrease significantly. Therefore, in order to<br />
prevent water from draining from the river branches to the Danube, resulting in totally<br />
dry river branches, the outflows from the branches into the Danube have been<br />
blocked except for the downstream one at chainage 1820 rkm (Figure 5). Now, the<br />
river branch system receives water from an inlet structure in the hydropower canal<br />
at Dobrohost (Figure 5). This weir has a design capacity of 234 m³ s⁻¹. Together<br />
with the various hydraulic structures in the river branches, it controls the hydraulic,<br />
hydrological and ecological regime in the river branches and on the flood plains.
Figure 10. Steps in integrated model for floodplain hydrology.<br />
6.2. MODELLING APPROACH<br />
Comprehensive field studies and modelling analyses are often carried out in connection<br />
with assessing environmental impacts of hydropower schemes. Recent examples<br />
from the Danube include the studies of the Austrian schemes Altenwörth<br />
(Nachtnebel, 1989) and Freudenau (Perspektiven, 1989). However, as in the Austrian<br />
cases, the modelling studies have most often been limited to independent<br />
modelling of river systems, groundwater systems or other subsystems, without<br />
providing an integrated approach such as the one presented in this paper.<br />
The models in this study were applied in a scenario approach simulating the<br />
hydrological conditions resulting from alternative possible operations of the entire<br />
system of hydraulic structures (alternative water management regimes). Thus, one<br />
historical (pre-dam) regime and three hypothetical (post-dam) water regimes<br />
corresponding to alternative operation schemes for the structures of the Gabcikovo<br />
system were simulated (DHI et al., 1995). Due to the integration of the overall<br />
modelling system, each scenario simulation involves a sequence, sometimes in an<br />
iterative mode, of model calculations. For the case of river branch modelling, a<br />
hierarchical scheme of simulation runs (Figure 10) included the following major<br />
steps:<br />
Step 1. Hydraulic river modelling (MIKE 11)<br />
Model simulation: The MIKE 11 model simulates the river flows and water<br />
levels in the entire river system and river branches.<br />
Coupling: The model outputs, in terms of flows into the reservoir at the upstream<br />
end and downstream outflows through the reservoir structures are used<br />
as boundary conditions for the reservoir modelling (Step 2). Furthermore, the<br />
flow velocities and water levels are used in the river water quality simulations<br />
(Step 4a).<br />
Step 2. Reservoir modelling (MIKE 21)<br />
Model simulation: The MIKE 21 reservoir model simulates velocities, sedimentation<br />
and eutrophication/water quality in the reservoir.<br />
Coupling: The flow boundary conditions are generated by the river model<br />
(Step 1). Results on sedimentation are used to calculate leakage coefficients.<br />
Results on oxygen, nitrogen and carbon can be used as boundary conditions for<br />
river water quality and for the water quality of infiltrating water (Step 3a).<br />
Step 3a. Regional ground water flow (MIKE SHE/MIKE 11)<br />
Model simulation: The coupled MIKE SHE/MIKE 11 model simulates the<br />
ground water flow and levels including the interaction with the river system<br />
and the reservoir.<br />
Coupling: In the reservoir, the infiltration is simulated on the basis of leakage<br />
coefficients, which have been calculated from the amount and composition<br />
(grain sizes) of the sedimentation on the reservoir bottom (Step 2). This link<br />
between reservoir sedimentation and ground water was shown to be crucial<br />
for the model results. Furthermore, an iterative link to the DAISY agricultural<br />
model exists (Step 3b). Hence, spatially and temporally varying ground<br />
water levels from MIKE SHE/MIKE 11 are used as lower boundary conditions<br />
in DAISY, which in turn simulates the leaf area index and the root zone<br />
depth which are used as input time series data in MIKE SHE/MIKE 11. The<br />
model outputs, in terms of ground water flow velocities, are used as input<br />
to the ground water quality simulation. The model results, in terms of river<br />
flow velocities and water levels, ground water flow velocities and water levels,<br />
are used as time varying boundary conditions for the local flood plain model<br />
(Step 4b).
Step 3b. Root zone (DAISY)<br />
Model simulation: The DAISY model simulates the unsaturated zone flows,<br />
the vegetation development, including crop yield.<br />
Coupling: The DAISY has an iterative link to the MIKE SHE/MIKE 11 model<br />
(as described above under Step 3a).<br />
Step 4a. River branches water quality (MIKE 11)<br />
Model simulation: The MIKE 11 model simulates the river water quality (BOD,<br />
DO, COD, NO3, etc).<br />
Coupling: The model uses data from Step 2 and Step 4b and produces output<br />
on concentrations of COD and DO, which are used as input to the ecological<br />
assessments (Step 5).<br />
Step 4b. Flood plain model (MIKE SHE/MIKE 11)<br />
Model simulation: The coupled MIKE SHE/MIKE 11 model simulates all the<br />
flow processes in the flood plain area including water flows and storages on<br />
the ground surface, river flows and water levels, ground water flows and water<br />
levels, evapotranspiration, soil moisture content in the unsaturated zone and<br />
capillary rise.<br />
Coupling: The model uses data from Step 3a as boundary conditions and provides<br />
river flow velocities as the basis for the water quality and sediment<br />
simulations (Steps 4a and c). The model provides data on flood frequency and<br />
duration, depth of flooding, depth to ground water table, moisture content in the<br />
unsaturated zone and flow velocities in river branches, which are key figures in<br />
the subsequent ecological assessments (Step 5).<br />
Step 4c. River branches sedimentation (MIKE 11)<br />
Model simulation: The MIKE 11 model simulates the transport of fine sediments<br />
through the river branch system. As a result the sedimentation/erosion<br />
and the suspended sediment concentrations are simulated.<br />
Coupling: The model uses sediment concentrations simulated by the reservoir<br />
model (Step 2) as input. Furthermore, the flow velocities simulated by the local<br />
flood plain model (Step 4b) are used as the basis for the sediment calculations.<br />
The results, in terms of grain size of the river bed and concentrations of<br />
suspended material, are used as input to the ecological assessments (Step 5).<br />
Step 5. Ecology<br />
A correlation matrix between the physical/chemical parameters provided by<br />
the model simulations (Steps 4a, b and c) and the aquatic and terrestrial ecotopes<br />
has been established for the project area. Alternative water management<br />
regimes can be described in terms of specific operation of certain hydraulic<br />
structures and corresponding distribution of water discharges primarily between<br />
the Danube, the Gabcikovo hydropower scheme and the river branch<br />
system. The hydrological effects of such alternative operations can be simulated<br />
by the integrated model and subsequently, the ecological impacts can be<br />
assessed in terms of likely changes of ecotopes.<br />
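The hierarchy of simulation runs described in Steps 1 to 5 can be written down as a dependency graph and sorted into a valid execution order. The sketch below encodes only the couplings named in the step descriptions; Steps 3a and 3b are merged into a single node because they are linked iteratively and would otherwise form a cycle. The labels and the use of Python's standard `graphlib` module are illustrative choices, not part of the original modelling system.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps whose output it needs.
DEPENDS_ON = {
    "1 river hydraulics": set(),
    "2 reservoir": {"1 river hydraulics"},
    "3a+3b ground water and root zone (iterated)": {"2 reservoir"},
    "4b flood plain": {"3a+3b ground water and root zone (iterated)"},
    "4a branch water quality": {"1 river hydraulics", "2 reservoir", "4b flood plain"},
    "4c branch sedimentation": {"2 reservoir", "4b flood plain"},
    "5 ecology": {"4a branch water quality", "4b flood plain", "4c branch sedimentation"},
}

# static_order() yields every step after all of its prerequisites.
run_order = list(TopologicalSorter(DEPENDS_ON).static_order())
for step in run_order:
    print(step)
```

Steps 4a and 4c do not depend on each other, so their relative order is free; any order printed by the sorter respects the couplings listed above.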
6.3. THE FLOODPLAIN MODEL<br />
The extent of the floodplain model area is indicated in Figure 5 and a perspective<br />
view of the area with the river branch system and floodplains is shown in Figure 9.<br />
The horizontal discretization of the finite difference model is 100 m, and the ground<br />
water zone is represented by two layers. Several hundreds of cross-sections and<br />
more than 50 hydraulic structures in the river branch system were included in the<br />
MIKE 11 model for the river system.<br />
For the pre-dam model, the surface water boundary conditions comprise a discharge<br />
time series at Bratislava and a discharge rating curve at the downstream end<br />
(Komarno). For the post-dam model, the Bratislava discharge time series has been<br />
divided into three discharge boundary conditions, namely at Dobrohost (intake<br />
from hydropower canal to river branch system), at the inlet to the hydropower<br />
canal and at the inlet to Danube from the reservoir. For the groundwater system,<br />
time varying ground water levels simulated with the regional ground water models<br />
act as boundary conditions. The Danube river forms an important natural boundary<br />
for the area. The Danube is included in the model, located on the model boundary,<br />
and symmetric ground water flow is assumed below the river. Hence, a zero-flux<br />
boundary condition is used for ground water flow below the river.<br />
To illustrate the complex hydrology and in particular the interaction between<br />
the surface and subsurface processes model results from a model simulation for a<br />
period in June–July 1993 are shown in Figures 11 and 12.<br />
Figure 11 presents the inlet discharges at the upstream point of the river branch<br />
system (Dobrohost), while the discharges and water levels at the confluence between<br />
the Danube and the hydropower outlet canal downstream of Gabcikovo<br />
during the same period are shown in Figure 12. Figure 11 further shows the soil<br />
moisture conditions for the upper two m below terrain and the water depth on the<br />
surface at location 2. Similar information is shown for location 1 in Figure 12. A<br />
soil water content above 0.40 (40 vol.%) corresponds to saturation. Location 2 is<br />
situated in the upstream part of the river branch system, while location 1 is located<br />
in the downstream part (see Figure 9).<br />
At location 2 (Figure 11) flooding is seen to occur as a result of river spilling<br />
(surface inundation occurs before the ground water table rises to the surface) whenever<br />
the inlet discharge exceeds approximately 60 m³ s⁻¹. The soil moisture content<br />
is seen to react relatively fast to the flooding and the soil column becomes<br />
saturated. In contrast, full saturation and inundation do not occur in connection<br />
with the flood in the Danube in July, but the event is recognised through increasing<br />
ground water levels following the temporal pattern of the Danube flood.<br />
Figure 11. Observed inlet discharge to the river branch system at Dobrohost; simulated moisture<br />
contents at the upper two m of the soil profile at location 2 and simulated depths of<br />
inundation at location 2 during June–July 1993.<br />
At location 1 (Figure 12) the conditions are somewhat different. During the<br />
simulation period location 1 is never inundated as a result of high inlet flows at<br />
Dobrohost. However, during the July flood in the Danube, inundation at location 1<br />
occurs as a result of a rise in the ground water table caused by higher water levels in<br />
the river branches due to backwater effects from the Danube. The surface elevation at<br />
location 1 is 116.4 m, which is 0.4 m below the flood water level shown in Figure 12<br />
at the confluence (5 km downstream of location 1). The inundation<br />
at this location thus occurs as a result of ground water table rise and not due to spilling<br />
of the river (surface inundation occurs after the ground water table has reached<br />
the ground surface).<br />
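The distinction between river spilling and ground water rise rests on the order of two events:<br />
the first surface inundation and the first arrival of the water table at the ground surface.<br />
A minimal Python sketch of that test is given here. The function names, the layout of the<br />
time series and the treatment of ties are illustrative assumptions, not part of the model codes.<br />

```python
from typing import Callable, Optional, Sequence


def first_index(values: Sequence[float],
                predicate: Callable[[float], bool]) -> Optional[int]:
    """Return the index of the first value satisfying predicate, or None."""
    for i, value in enumerate(values):
        if predicate(value):
            return i
    return None


def inundation_mechanism(water_depth: Sequence[float],
                         depth_to_water_table: Sequence[float]) -> str:
    """Classify the cause of inundation at one location (hypothetical helper).

    water_depth: simulated depth of water on the surface (m) per time step.
    depth_to_water_table: depth from ground surface to the water table (m);
        zero or negative means the water table has reached the surface.
    """
    t_flood = first_index(water_depth, lambda d: d > 0.0)
    if t_flood is None:
        return "no inundation"
    t_table = first_index(depth_to_water_table, lambda d: d <= 0.0)
    # Water on the surface before the water table arrives: it came overland.
    if t_table is None or t_flood < t_table:
        return "river spilling"
    # Assumption: a tie within one time step is attributed to ground water rise.
    return "ground water rise"
```

Applied to series such as those plotted for locations 1 and 2, the function would return one label<br />
per location; the time step must be short enough to resolve the order of the two events.<br />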
Figure 12. Simulated discharge and water levels in the Danube at the confluence between<br />
the Danube and the outlet canal from the hydropower plant; simulated moisture contents in the<br />
upper two metres of the soil profile at location 1 and simulated depths of inundation at location<br />
1 in the river branch system during June–July 1993.<br />
6.4. EXAMPLE OF MODEL RESULTS<br />
As an example of the results which can be obtained by the floodplain model, Figure<br />
13 shows a characterisation of the area according to flooding and depths to<br />
groundwater. The map has been processed on the basis of simulations for 1988 for<br />
pre-dam conditions. The classes with different ground water depths and flooding<br />
have been determined from ecological considerations according to requirements<br />
of (semi)terrestrial (floodplain) ecotopes. From the figure the contacts between the<br />
main Danube river and the river branch system are clearly seen. Similar computations<br />
have been made for alternative water management schemes after damming of<br />
the Danube. The results of one of the hypothetical post-dam water management<br />
regimes, characterized by average water flows in the power canal, Danube and<br />
river branch system intake of 1470 m³ s⁻¹, 400 m³ s⁻¹ and 45 m³ s⁻¹, respectively,<br />
are shown in Figure 14. By comparing Figure 13 and Figure 14 the differences<br />
in hydrological conditions can clearly be seen. For instance the pre-dam conditions<br />
(Figure 13) are in many places characterised by high groundwater tables<br />
and small or infrequent flooding, while the post-dam situation (Figure 14) generally has<br />
deeper ground water tables and more frequent flooding. From such changes in hydrological<br />
conditions inferences can be made on possible changes in the floodplain<br />
ecosystem.<br />
Figure 13. Hydrological regime in the river branch area for 1988 pre-dam conditions<br />
characterized in ecological classes.<br />
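The map processing behind such a characterisation can be sketched as a cell-by-cell classification<br />
of simulated depth to ground water and flooding. The class limits and labels in the Python sketch<br />
here are hypothetical and chosen only for illustration; the ecological class definitions actually<br />
used are not given in the text.<br />

```python
from collections import Counter
from typing import List, Sequence

# Hypothetical class limits, chosen only for illustration.
SHALLOW_LIMIT_M = 1.0      # mean depth to ground water below which the table is "shallow"
DEEP_LIMIT_M = 3.0         # mean depth at or above which the table is "deep"
FREQUENT_FLOOD_DAYS = 30   # flooded days per year at or above which flooding is "frequent"


def regime_class(depth_to_groundwater_m: float, flooded_days: int) -> str:
    """Combine a ground water depth class and a flooding class for one grid cell."""
    if depth_to_groundwater_m < SHALLOW_LIMIT_M:
        groundwater = "shallow"
    elif depth_to_groundwater_m < DEEP_LIMIT_M:
        groundwater = "intermediate"
    else:
        groundwater = "deep"
    if flooded_days >= FREQUENT_FLOOD_DAYS:
        flooding = "frequent flooding"
    elif flooded_days > 0:
        flooding = "occasional flooding"
    else:
        flooding = "no flooding"
    return f"{groundwater} ground water, {flooding}"


def classify_grid(depths: Sequence[Sequence[float]],
                  flooded_days: Sequence[Sequence[int]]) -> List[List[str]]:
    """Classify every cell of two equally shaped grids."""
    return [[regime_class(d, f) for d, f in zip(depth_row, flood_row)]
            for depth_row, flood_row in zip(depths, flooded_days)]


def class_counts(grid: Sequence[Sequence[str]]) -> Counter:
    """Count cells per class, e.g. to compare two management scenarios."""
    return Counter(cell for row in grid for cell in row)
```

Comparing the class counts of a pre-dam grid with those of a post-dam scenario gives a simple<br />
summary of the shift in hydrological regime that the two maps show spatially.<br />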
Further scenarios (not shown here) have investigated, among other things, the<br />
effects of establishing underwater weirs in the Danube and thereby improving<br />
the connectivity between the Danube and the river branch system.<br />
7. Limitations in the Couplings made in the Integrated Model<br />
The integrated modelling system and the way it was applied include different<br />
degrees of integration ranging from sequential runs, where results from one model<br />
are used as input to the next model, to a full integration, such as the coupling<br />
between MIKE SHE and MIKE 11. Hence, the system is not truly integrated in<br />
all respects. The justification for these different levels lies in assessments of where<br />
it was required in the present project area to account for feedback mechanisms<br />
and where such feedbacks could be considered to be of minor importance for all<br />
practical purposes. For other areas with different hydrological characteristics, the<br />
required levels of integration are not necessarily the same. Therefore, a discussion<br />
is given below on the universality and limitations of the various couplings made in<br />
the present case.<br />
Figure 14. Hydrological regime in the river branch area for a post-dam water management<br />
regime characterized in ecological classes. The scenario has been simulated using<br />
1988 observed upstream discharge data and a given hypothetical operation of the hydraulic<br />
structures.<br />
A. Hydrological catchment/river hydraulics (MIKE SHE/MIKE 11)<br />
This coupling between the hydrological code and the river hydraulic code is fully<br />
dynamic and fully integrated with feedback mechanisms between the two codes<br />
within the same computational time step. This coupling cannot be treated sequentially<br />
in this area, since the feedback between river and aquifer works in both<br />
directions, with the river functioning as a source in part of the area and as a drain<br />
in other parts, and since the direction of the stream-aquifer interaction changes<br />
dynamically in time and space as a consequence of discharge fluctuations in the<br />
Danube. This coupling was shown to be crucial during the course of the project,<br />
and, due to the full integration, it is fully generic.<br />
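The two-way nature of this exchange can be illustrated with a head-difference (leakage) formulation,<br />
in which the sign of the flux follows the difference between river stage and aquifer head. The Python<br />
sketch here is a minimal illustration under that assumption; the formula, the leakage coefficient and<br />
the function names are not taken from the MIKE SHE or MIKE 11 documentation.<br />

```python
def exchange_flux(h_river: float, h_aquifer: float, leakage: float) -> float:
    """Flux from river to aquifer per unit bed area (m/s).

    Assumed formulation: flux = leakage * (river stage - aquifer head).
    A negative value means that the river drains the aquifer.
    """
    return leakage * (h_river - h_aquifer)


def river_role(h_river: float, h_aquifer: float, leakage: float = 1e-6) -> str:
    """Label the river as a source or a drain for the aquifer at one point in time."""
    q = exchange_flux(h_river, h_aquifer, leakage)
    if q > 0.0:
        return "source"
    if q < 0.0:
        return "drain"
    return "no exchange"
```

Because the aquifer head itself responds to the exchanged water, both heads have to be updated within<br />
the same time step, which is why a sequential run of the two codes is not sufficient in this area.<br />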
B. Reservoir/river (MIKE 21/MIKE 11)<br />
This coupling is a simple one-way coupling with the reservoir model providing<br />
input data to the downstream river model, both in terms of sediment and water
quality parameters. This coupling is sufficient in the present case, because there is<br />
no feedback from the downstream river to the reservoir. Even though this coupling<br />
is not fully generic, it may be sufficient in most cases, even in cases with a network<br />
of reservoirs and connecting river reaches.<br />
C. Reservoir/groundwater water exchange (MIKE 21/MIKE SHE)<br />
This coupling is a simple one-way coupling with the reservoir model providing<br />
data on sedimentation to the groundwater module of MIKE SHE, where they are<br />
used to calculate leakage coefficients in the surface water/ground water flow calculations.<br />
This coupling is sufficient in the present case, where the reservoir water<br />
table always is higher than the ground water table, and where the flow always is<br />
from the reservoir to the aquifer. However, for cases where water flows in both<br />
directions, or where there are significant temporal variations in the sedimentation,<br />
the present coupling is not necessarily sufficient.<br />
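How deposited sediment may enter a leakage coefficient can be sketched by treating the original bed<br />
and the new sediment layer as hydraulic resistances in series. This is an assumed textbook formulation<br />
for illustration only; the text does not state the formula used in the project, and all parameter<br />
names and values are hypothetical.<br />

```python
def leakage_coefficient(bed_thickness: float,
                        bed_conductivity: float,
                        sediment_thickness: float,
                        sediment_conductivity: float) -> float:
    """Leakage coefficient (1/s) of a reservoir bed with a deposited sediment layer.

    Assumption: the two layers act as resistances in series, each equal to
    thickness (m) divided by hydraulic conductivity (m/s).
    """
    resistance = bed_thickness / bed_conductivity
    if sediment_thickness > 0.0:
        resistance += sediment_thickness / sediment_conductivity
    return 1.0 / resistance
```

With these assumptions a thin layer of fine sediment dominates the total resistance, so the leakage<br />
coefficient is sensitive to the simulated sedimentation pattern.<br />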
D. Hydrology catchment/crop growth (MIKE SHE/DAISY)<br />
This coupling is an iterative coupling with data flowing in both directions. However,<br />
it is not a full integration with the two model codes running simultaneously.<br />
Therefore, a number of iterations are required until the input data used in MIKE<br />
SHE (vegetation data simulated by DAISY) generate the input data used in DAISY<br />
(ground water levels) and vice versa. For example, changes in river water levels<br />
affect the ground water levels, implying that the crop growth conditions change and<br />
hence that the DAISY simulated vegetation data used by MIKE SHE to simulate the<br />
ground water levels are no longer correct. In such a case, the MIKE SHE simulation has to<br />
be repeated with the new crop growth data and subsequently, the DAISY simulation<br />
has to be repeated with the new ground water levels, etc., until the differences<br />
become negligible. This coupling has been used successfully in previous studies<br />
(Styczen and Storm, 1993), but may, due to the iterative mode, be troublesome in<br />
practice.<br />
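The iterative mode can be sketched as a fixed-point loop in which the two models are run alternately<br />
until the exchanged variables stop changing. In the Python sketch here the two model runs are replaced<br />
by toy functions with hypothetical linear responses; the tolerance, the iteration limit and all names<br />
are assumptions made for illustration.<br />

```python
from typing import Callable, Tuple


def iterate_coupling(run_hydrology: Callable[[float], float],
                     run_crop: Callable[[float], float],
                     vegetation_guess: float,
                     tolerance: float = 1e-3,
                     max_iterations: int = 50) -> Tuple[float, float, int]:
    """Alternate two model runs until the exchanged variables stop changing.

    Returns the final vegetation value, the final ground water value and the
    number of iterations used.
    """
    vegetation = vegetation_guess
    groundwater = run_hydrology(vegetation)
    for iteration in range(1, max_iterations + 1):
        new_vegetation = run_crop(groundwater)
        new_groundwater = run_hydrology(new_vegetation)
        change = max(abs(new_vegetation - vegetation),
                     abs(new_groundwater - groundwater))
        vegetation, groundwater = new_vegetation, new_groundwater
        if change < tolerance:
            return vegetation, groundwater, iteration
    raise RuntimeError("coupling did not converge")


def toy_hydrology(leaf_area_index: float) -> float:
    """Toy stand-in: depth to ground water (m) increases with vegetation."""
    return 1.0 + 0.2 * leaf_area_index


def toy_crop(depth_to_groundwater: float) -> float:
    """Toy stand-in: leaf area index decreases with a deeper water table."""
    return 4.0 - 0.5 * depth_to_groundwater
```

With real model codes each pass through the loop is a full simulation, which is the practical burden<br />
referred to above; the loop converges quickly only when the feedback between the two models is weak.<br />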
E. Surface water/ground water quality (MIKE 11 – MIKE 21/MIKE SHE)<br />
In contrast to the full coupling of flows (coupling A), the corresponding water<br />
quality coupling is a simple one-way coupling in which the river and reservoir models<br />
provide the water quality parameters of the infiltrating water, and these are used as<br />
boundary conditions for the ground water quality simulations. This coupling is<br />
sufficient in the present case with respect to the reservoir, where the flow always<br />
is from the reservoir to the aquifer. The river-aquifer interaction involves flows in<br />
both directions, but the return flow from the aquifer to the Danube is very small<br />
(about 1%) as compared to the Danube flow, and hence, the feedback from the<br />
ground water quality to Danube water quality is assumed negligible. However, for<br />
other cases where the mass flux from the aquifer to the river system is important<br />
for the river water quality, the present one-way coupling will not be sufficient.<br />
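The argument that a return flow of about 1% can be neglected may be checked with a simple mixing<br />
calculation. The Python sketch here assumes complete mixing of a conservative solute; the function<br />
name is hypothetical, and any discharges and concentrations passed to it are illustrative values<br />
rather than project data.<br />

```python
def mixed_concentration(q_river: float, c_river: float,
                        q_return: float, c_return: float) -> float:
    """Flow-weighted concentration after complete mixing of river water and return flow."""
    return (q_river * c_river + q_return * c_return) / (q_river + q_return)


def relative_change(q_river: float, c_river: float,
                    q_return: float, c_return: float) -> float:
    """Relative change in river concentration caused by the return flow."""
    mixed = mixed_concentration(q_river, c_river, q_return, c_return)
    return (mixed - c_river) / c_river
```

For example, a return flow equal to 1% of the river discharge with three times the river concentration<br />
changes the mixed concentration by roughly 2%, which supports treating the feedback as negligible.<br />
The same calculation shows when the one-way coupling breaks down: the change grows in proportion<br />
to the return flow fraction and to the concentration contrast.<br />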
8. Discussion and Conclusions<br />
The hydrological and ecological system of the Danubian Lowland is so complex<br />
with so many interactions between the surface and the subsurface water regimes<br />
and between physical, chemical and biological changes, that an integrated numerical<br />
modelling system of the distributed physically-based type is required in order<br />
to provide quantitative assessments of environmental impacts on the ground water,<br />
the surface water and the floodplain ecosystem of alternative management options<br />
for the Gabcikovo hydropower scheme.<br />
Such an integrated modelling system has been developed, and an integrated<br />
model has been constructed, calibrated and, to the extent possible, validated for<br />
the 3000 km² area. The individual components of the modelling system represent<br />
state-of-the-art techniques within their respective disciplines. The uniqueness is the<br />
full integration. The integrated system enables a quite detailed level of modelling,<br />
including quantitative predictions of the surface and ground water regimes in the<br />
floodplain area, ground water levels and dynamics, ground water quality, crop<br />
yield and nitrogen leaching from agricultural land, sedimentation and erosion in<br />
rivers and reservoirs, surface water quality as well as frequency, magnitude and<br />
duration of inundations in floodplain areas. The computations were carried out on<br />
Hewlett Packard Apollo 9000/735 UNIX workstations with 132 MB RAM. With<br />
a 300 MHz Pentium II NT computer a typical computational time for one of<br />
the steps described in Section 6.2 (Figure 10) would be 2–10 h. Thus, although<br />
the integrated system is rather computationally demanding, the computational requirements<br />
are not a serious constraint in practice as compared to the demand for<br />
comprehensive field data.<br />
For most of the individual model components, traditional split-sample validation<br />
tests have been carried out, thus documenting the predictive capabilities of<br />
these models. However, this was not possible for some aspects of the integrated<br />
model. Hence, according to rigorous scientific modelling protocols, the integrated<br />
model can be argued to have a rather limited predictive capability associated with<br />
large uncertainties. A theoretical analysis of error propagation in such an integrated<br />
model would be quite interesting, but was outside the scope of the present study<br />
which was limited to the comprehensive task of developing the integrated modelling<br />
system and establishing the integrated model on the basis of all available<br />
data. However, on the basis of the few possible tests (e.g. Figure 7) of the integrated<br />
model against independent data not used in the calibration-validation process for<br />
the individual models, it is our opinion that the uncertainties of the integrated model<br />
are significantly smaller than those of the individual models. The two key reasons<br />
for this are: (1) in the integrated model the internal boundaries are simulated by<br />
neighbouring model components and not just assessed through qualified but subjective<br />
estimates by the modeller; and (2) the integrated model makes it possible<br />
to explicitly include more sources of data in validation tests that cannot all be<br />
utilised in the individual models. Thus, by adding independent validation tests for
the integrated model, such as the one shown in Figure 7 on discharges in seepage<br />
canals, to the validation tests for the individual models, the outputs of the integrated<br />
model have been subject to a more comprehensive test based on more data and<br />
hence, must be considered less uncertain than outputs from the individual models.<br />
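A split-sample test of the kind referred to above divides the record into a calibration period and an<br />
independent validation period and evaluates the model separately on each. The Python sketch here uses<br />
the Nash-Sutcliffe efficiency as the performance criterion; that choice and all names are assumptions<br />
for illustration, since the criteria used in the project are not stated in this section.<br />

```python
from typing import Sequence, Tuple


def nash_sutcliffe(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def split_sample_scores(observed: Sequence[float],
                        simulated: Sequence[float],
                        split_index: int) -> Tuple[float, float]:
    """Return the efficiency for the calibration part and for the validation part.

    The model is assumed to have been calibrated only on data before split_index.
    """
    calibration = nash_sutcliffe(observed[:split_index], simulated[:split_index])
    validation = nash_sutcliffe(observed[split_index:], simulated[split_index:])
    return calibration, validation
```

A validation score close to the calibration score is the evidence of predictive capability that such a<br />
test provides; a marked drop indicates that the calibrated parameters do not transfer to new data.<br />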
The environmental impacts of the new reservoir and the diversion of water from<br />
the Danube through the Gabcikovo power plant can be simulated in rather fine<br />
detail by the integrated model established for the area. The integrated nature of<br />
the model has been illustrated by a case study focusing on hydrology and ecology<br />
in the wetland comprising the river branch system. The integrated model is not<br />
claimed to be capable of predicting detailed ecological changes at the species level.<br />
However, it is believed to be capable of simulating changes in the hydrological<br />
regime resulting from alternative water management decisions to such a degree of<br />
detail that it becomes a valuable tool for broader assessments of possible ecological<br />
changes in the area.<br />
Acknowledgements<br />
The present paper is based on results from the project ‘Danubian Lowland – Ground<br />
Water Model’ supported by the European Commission under the PHARE program.<br />
The project was executed by the Slovak Ministry of the Environment. The work<br />
was carried out by an international group of research and consulting organisations<br />
as reflected by the team of authors. The constructive criticisms of two anonymous<br />
reviewers are acknowledged.<br />
References<br />
Bathurst, J. C., Wicks, J. M. and O’Connell, P. E.: 1995, The SHE/SHESED basin scale water flow<br />
and sediment transport modelling system, In V. P. Singh (ed.), Computer Models of Watershed<br />
Hydrology, Water Resources Publications, pp. 563–594.<br />
Calver, A. and Wood, W. L.: 1995, The institute of hydrology distributed model, In V. P. Singh (ed.),<br />
Computer Models of Watershed Hydrology, Water Resources Publications, pp. 595–626.<br />
CEC: 1991, Commission of European Communities, Czech and Slovak Federative Republic,<br />
Danubian Lowland-Ground Water Model, No. PHARE/90/062/030/001/EC/WAT/1<br />
DHI: 1995, MIKE 21 Short Description. Danish Hydraulic Institute, Hørsholm, Denmark.<br />
DHI, DHV, TNO, VKI, Krüger and KVL: 1995, PHARE project Danubian Lowland – Ground Water<br />
Model (EC/WAT/1), Final Report. Prepared by a consultant group for the Ministry of the Environment,<br />
Slovak Republic and for the Commission of the European Communities, Vol. 1, 65 pp.;<br />
Vol. 2, 439 pp.; Vol. 3, 297 pp., Bratislava.<br />
EC: 1992, Working group of independent experts on variant C of the Gabcikovo-Nagymaros project,<br />
working Group Report, Commission of the European Communities, Czech and Slovak Federative<br />
Republic, Republic of Hungary, Budapest, 23 November, 1992.<br />
EC: 1993a, Working group of monitoring and water management experts for the Gabcikovo system<br />
of locks – Data Report, Commission of the European Communities, Republic of Hungary, Slovak<br />
Republic, Budapest, 2 November, 1993.<br />
EC: 1993b, Working group of monitoring and water management experts for the Gabcikovo system<br />
of locks – Report on temporary water management regime, Commission of the European<br />
Communities, Republic of Hungary, Slovak Republic, Bratislava, 1 December, 1993.<br />
Engesgaard, P.: 1996, Multi-Species Reactive Transport, In M. B. Abbott and J. C. Refsgaard (eds),<br />
Distributed Hydrological Modelling, Kluwer Academic Publishers, pp. 71–91.<br />
Griffioen, J., Engesgaard, P., Brun, A., Rodak, R., Mucha, I. and Refsgaard, J. C.: 1995, Nitrate<br />
and Mn-chemistry in the alluvial Danubian Lowland aquifer, Slovakia. Ground Water Quality:<br />
Remediation and Protection (GQ95), Proceedings of the Prague Conference, May 1995, IAHS<br />
Publ. No. 225, pp. 87–96.<br />
Hansen, S., Jensen, H. E., Nielsen, N. E. and Svendsen, H.: 1991, Simulation of nitrogen dynamics<br />
and biomass production in winter wheat using the Danish simulation model DAISY. Fertilizer<br />
Research 27, 245–259.<br />
Havnø, K., Madsen, M. N. and Dørge, J.: 1995, ‘MIKE 11 – A Generalized River Modelling<br />
Package’, In V. P. Singh (ed), Computer Models of Watershed Hydrology, Water Resources<br />
Publications, pp. 733–782.<br />
Holobrada, M., Capekova, Z., Lukac, M. and Misik, M.: 1994, Prognoses of the Hrusov reservoir<br />
eutrophication and siltation under various discharge distribution to the Old Danube (in Slovak),<br />
Water Research Institute (VUVH), Bratislava.<br />
ICJ: 1997, Case Concerning Gabcikovo-Nagymaros project (Hungary/Slovakia). Summary of the<br />
Judgement of 25 September 1997. International Court of Justice, The Hague, (available on<br />
www.icj-cij.org).<br />
JAR: 1995, 1996, 1997, Joint Annual Report of the environment monitoring in 1995, 1996, 1996<br />
according to the ‘Agreement between the Government of the Slovak Republic and the Government<br />
of Hungary about Certain Temporary Measures and Discharges to the Danube and Mosoni<br />
Danube’, signed 19 April, 1995.<br />
Klemes, V.: 1986, Operational testing of hydrological simulation models, Hydrological Sciences<br />
Journal 31(1), 13–24.<br />
Klucovska, J. and Topolska, J.: 1995, Water regime in the Danube river and its river branches, In I.<br />
Mucha (ed.), Gabcikovo Part of the Hydroelectric Power Project. Environmental Impact Review,<br />
Faculty of Natural Sciences, Comenius University, Bratislava, pp. 33–42.<br />
Kocinger, D.: 1995, Gabcikovo Part of the Hydroelectric Power Project, Basic Characteristics, In I.<br />
Mucha (ed.), Gabcikovo Part of the Hydroelectric Power Project – Environmental Impact Review,<br />
Faculty of Natural Sciences, Comenius University, Bratislava, pp. 5–14.<br />
Koncsos, L., Schütz, E. and Windau, U.: 1995, Application of a comprehensive decision support<br />
system for the water quality management of the river Ruhr, Germany, In S. P. Simonovic, Z.<br />
Kunzewicz, D. Rosbjerg and K. Takeuchi (eds), Modelling and Management of Sustainable<br />
Basin-Scale Water Resources Systems, IAHS Publ. No. 231, pp. 49–59.<br />
Menetti, M.: 1995, Analysis of regional water resources and their management by means of numerical<br />
simulation models and satellites in Mendoza, Argentina, In S. P. Simonovic, Z. Kunzewicz, D.<br />
Rosbjerg and K. Takeuchi (eds), Modelling and Management of Sustainable Basin-Scale Water<br />
Resources Systems, IAHS Publ. No. 231, pp. 49–59.<br />
Mucha, I.: 1992, Database processing of the hydropedological parameters for the ground water flow<br />
model of the Danubian Lowland (in Slovak), Ground Water Division, Faculty of Natural Science,<br />
Comenius University, Bratislava.<br />
Mucha, I., Paulikova, E., Hlavaty, Z., Rodak, D. and Pokorna, L.: 1992a, Danubian Lowland Ground<br />
Water Model, Working Manual to consortium of invited specialists for workshop in Bratislava,<br />
Ground Water Division, Faculty of Natural Sciences, Comenius University, Bratislava.<br />
Mucha, I., Paulikova, E., Hlavaty, Z. and Rodak, D.: 1992b, Elaboration of basis data for preparation<br />
of hydrogeological parameters for the model of the ground water flow of the Danubian Lowland<br />
area (in Slovak), Ground Water Division, Faculty of Natural Science, Comenius University,<br />
Bratislava.
Mucha, I., Paulikova, E., Hlavaty, Z., Rodak, D. and Pokorna, L.: 1993, Surface and ground water<br />
regime in the Slovak part of the Danube alluvium, Ground Water Division, Faculty of Natural<br />
Science, Comenius University.<br />
Mucha, I. (ed): 1995, Gabcikovo part of the hydroelectric power project environmental impact<br />
review. Evaluation based on two years monitoring, Faculty of Natural Sciences, Comenius<br />
University, Bratislava.<br />
Mucha, I., Rodak, D., Hlavaty, Z. and Bansky, L.: 1997, Environmental aspects of the design<br />
and construction of the Gabcikovo Hydroelectric Power Project on the river Danube, Proceedings<br />
International Symposium on Engineering Geology and the Environment, organized by the<br />
Greek National Group of IAEG, Athens, June 1997, Engineering Geology and the Environment,<br />
pp. 2809–2817.<br />
Nachtnebel, H.-P. (ed): 1989, Ökosystemstudie Donaustau Altenwörth, Veränderungen durch das<br />
Donaukraftwerk Altenwörth, Österreichische Akademie der Wissenschaften, Veröffentlichungen<br />
des Österreichischen MaB-Programms, Band 14, Universitätsverlag Wagner, Innsbruck.<br />
Person, M., Raffensperger, J. P., Ge, S. and Garven, G.: 1996, Basin-scale hydrogeologic modelling,<br />
Rev. Geophys. 34(1), 61–87.<br />
Perspektiven: 1989, Staustufe Freudenau, Perspektiven, Magazin für Stadtgestaltung und Lebensqualität,<br />
Dezember 1989.<br />
Refsgaard, J. C.: 1997, Parameterisation, calibration and validation of distributed hydrological<br />
models, J. Hydrology 198, 69–97.<br />
Refsgaard, J. C. and Storm, B.: 1995, MIKE SHE, In V. P. Singh (ed), Computer Models of Watershed<br />
Hydrology, Water Resources Publications, pp. 809–846.<br />
Singh, V. P. (ed): 1995, Computer Models of Watershed Hydrology, Water Resources Publications.<br />
Sørensen, H. R., Klucovska, J., Topolska, T., Clausen, T. and Refsgaard, J. C.: 1996, An engineering<br />
case study – Modelling the influences of the Gabcikovo hydropower plant in the hydrology and<br />
ecology in the Slovak part of the river branch system, In M. B. Abbott and J. C. Refsgaard (eds),<br />
Distributed Hydrological Modelling, Kluwer Academic Publishers, pp. 233–253.<br />
Styczen, M. and Storm, B.: 1993, Modelling of N-movements on catchment scale – a tool for analysis<br />
and decision making. 1. Model description. 2. A case study, Fertilizer Research 36, 1–17.<br />
Topolska, J. and Klucovska, J.: 1995, River morphology, In I. Mucha (ed.), Gabcikovo Part of the Hydroelectric<br />
Power Project. Environmental Impact Review, Faculty of Natural Sciences, Comenius<br />
University, Bratislava, pp. 23–32.<br />
VKI: 1995, Short Description of water quality and eutrophication modules. Water Quality Institute,<br />
Hørsholm, Denmark.<br />
Winter, T. C.: 1995, Recent advances in understanding the interaction of groundwater and surface<br />
water, Rev. Geophys., Supplement, U.S. National Report 1991–94 to IUGG, pp. 985–994.<br />
Yan, J. and Smith, K. R.: 1994, Simulation of integrated surface water and ground water systems –<br />
model formulation, Water Resources Bulletin 30(5), 879–890.
[10]<br />
Refsgaard JC, Thorsen M, Jensen JB, Kleeschulte S, Hansen S (1999) Large<br />
scale modelling of groundwater contamination from nitrate leaching.<br />
Journal of Hydrology, 221(3-4), 117-140.<br />
Reprinted from Journal of Hydrology with permission from Elsevier
Journal of Hydrology 221 (1999) 117–140<br />
Large scale modelling of groundwater contamination from<br />
nitrate leaching<br />
J.C. Refsgaard (a,*), M. Thorsen (a), J.B. Jensen (a), S. Kleeschulte (b), S. Hansen (c)<br />
(a) Danish Hydraulic Institute, Hørsholm, Denmark<br />
(b) GIM, Luxembourg<br />
(c) Royal Veterinary and Agricultural University, Copenhagen, Denmark<br />
Received 20 July 1998; received in revised form 3 May 1999; accepted 31 May 1999<br />
Abstract<br />
Groundwater pollution from non-point sources, such as nitrate from agricultural activities, is a problem of increasing<br />
concern. Comprehensive modelling tools of the physically based type are well proven for small-scale applications with<br />
good data availability, such as plots or small experimental catchments. The two key problems related to large-scale simulation<br />
are data availability at the large scale and model upscaling/aggregation to represent conditions at larger scale. This paper<br />
presents a methodology and two case studies for large-scale simulation of aquifer contamination due to nitrate leaching. Readily<br />
available data from standard European level databases such as GISCO, EUROSTAT and the European Environment Agency<br />
(EEA) have been used as the basis of modelling. These data were supplemented by selected readily available data from national<br />
sources. The model parameters were all assessed from these data by use of various transfer functions, and no model calibration<br />
was carried out. The adopted upscaling procedure combines upscaling from point to field scale using effective parameters with a<br />
statistically based aggregation procedure from field to catchment scale, preserving the areal distribution of soil types, vegetation<br />
types and agricultural practices on a catchment basis. The methodology was tested on two Danish catchments with good<br />
simulation results on water balance and nitrate concentration distributions in groundwater. The upscaling/aggregation procedure<br />
appears to be applicable in many areas with regard to root zone processes such as runoff generation and nitrate leaching,<br />
while it has important limitations with regard to hydrograph shape due to its lack of accounting for scale effects in relation to<br />
stream–aquifer interaction. © 1999 Elsevier Science B.V. All rights reserved.<br />
Keywords: Upscaling; Databases; Non-point pollution; Nitrate leaching; Distributed model; Water balance<br />
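The aggregation step from field to catchment scale summarised in the abstract can be read as an<br />
area-weighted combination of field-scale results, with one result per combination of soil type and<br />
crop and with weights equal to the areal fractions in the catchment. The Python sketch here illustrates<br />
that reading only; the function name and the data layout are hypothetical and it does not<br />
reproduce the procedure as implemented by the authors.<br />

```python
from typing import Dict, Hashable


def catchment_average(field_results: Dict[Hashable, float],
                      area_fractions: Dict[Hashable, float]) -> float:
    """Area-weighted mean of field-scale results, e.g. nitrate leaching in kg N/ha/year.

    field_results: one simulated value per (soil type, crop) combination.
    area_fractions: fraction of the catchment area covered by each combination;
        the fractions must sum to one so that the areal distribution is preserved.
    """
    total = sum(area_fractions.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("area fractions must sum to 1")
    return sum(fraction * field_results[key]
               for key, fraction in area_fractions.items())
```

The weights carry the statistical information on land use, so the result is representative of the<br />
catchment as a whole even though the location of each individual field is not reproduced.<br />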
1. Introduction<br />
Groundwater is a significant source of freshwater<br />
used by industry, agriculture and domestic users.<br />
However, increasing demand for water, increasing<br />
use of pesticides and fertilisers as well as atmospheric<br />
deposition constitute a threat to the quality of groundwater.<br />
The use of fertilisers and manure leads to the<br />
leaching of nitrates into the groundwater and atmospheric<br />
deposition contributes to the acidification of<br />
soils that may have an indirect effect on the contamination<br />
of water.<br />
* Corresponding author.<br />
E-mail address: jcr@dhi.dk (J.C. Refsgaard)<br />
In Europe, for instance, the present situation is<br />
summarised in EEA (1995), where it is assessed that<br />
the major part of the aquifers in Northern and Central<br />
Europe are subject to risk of nitrate contamination<br />
due to, among other things, agricultural activities. Therefore,<br />
policy makers and legislators in the EU are<br />
concerned about the issue and a number of preventive<br />
legislative steps are being taken in these years (EU<br />
Council of Ministers, 1991; EC, 1996).<br />
0022-1694/99/$ - see front matter © 1999 Elsevier Science B.V. All rights reserved.<br />
PII: S0022-1694(99)00081-5<br />
In the scientific community, concerns on groundwater<br />
contamination have motivated the development<br />
of numerous simulation models for groundwater quality<br />
management. Groundwater models describing the<br />
flow and transport mechanisms of aquifers have been<br />
developed since the 1970s and applied in numerous<br />
pollution studies. They have mainly described the<br />
advection and dispersion of conservative solutes.<br />
More recently, geochemical and biochemical reactions<br />
have been included to simulate the transport<br />
and fate of pollutants from point sources such as industrial<br />
and municipal waste disposal sites, see e.g. Mangold<br />
and Tsang (1991); Engesgaard et al. (1996) for overviews.<br />
Fewer attempts have been made to simulate<br />
non-point pollution at catchment scale resulting<br />
from agricultural activities, see e.g. Thorsen et al.<br />
(1996); Person et al. (1996) for overviews. The<br />
approaches range from relatively simple models<br />
with semi-empirical process descriptions of the<br />
lumped conceptual type such as ANSWERS (Beasley<br />
et al., 1980), CREAMS (Knisel, 1980; Knisel and<br />
Williams, 1995), GLEAMS (Leonard et al., 1987),<br />
SWRRB (Arnold and Williams, 1990; Arnold et al.,<br />
1995) and AGNPS (Young et al., 1995) to more<br />
complex models with a physically based process<br />
description. The physically based models are most<br />
commonly one-dimensional leaching models, such<br />
as RZWQM (DeCoursey et al., 1989, 1992), Daisy<br />
(Hansen et al., 1991) and WAVE (Vereecken et al.,<br />
1991; Vanclooster et al., 1994, 1995), which basically<br />
describe root zone processes only, while true,<br />
spatially distributed, catchment models based on<br />
comprehensive process descriptions, such as the<br />
coupled MIKE SHE/Daisy (Styczen and Storm,<br />
1993), are seldom reported. The simple conceptual<br />
models are attractive because they require relatively<br />
little data, which are usually easily accessible, while<br />
the predictive capability of these models with regard<br />
to assessing the impacts of alternative agricultural<br />
practices is questionable due to the semi-empirical<br />
nature of the process descriptions. In contrast, a<br />
key problem in using the more complex catchment<br />
models operationally lies in the generally large data<br />
requirements prescribed by the developers of such<br />
model codes. However, due to the better process<br />
descriptions these models may for some types of<br />
application be expected to have better predictive<br />
capabilities than the simpler models (Heng and Nikolaidis,<br />
1998).<br />
Input data for the complex catchment models have<br />
traditionally been available in practice only for small<br />
areas such as experimental research catchments.<br />
However, as more and more data have been gathered<br />
in computerised databases and, in particular, in<br />
Geographical Information Systems (GIS), the data<br />
availability has improved significantly. Further,<br />
experience from case studies indicates that a considerable<br />
part of the input data may be derived from<br />
statistical data and more general databases (Styczen<br />
and Storm, 1995).<br />
The database of EUROSTAT, the statistical office<br />
of the European Commission, holds statistical information<br />
about different topics from all Member States<br />
of the European Union. Agricultural statistics provide<br />
information on main crops, on the structure of agricultural<br />
holdings and on crop and animal production.<br />
Environment statistics provide figures on the impacts of<br />
other sectors' work on the environment, such as fertiliser<br />
and pesticide input, groundwater withdrawal,<br />
water quality or manure production on animal<br />
farms. These figures are mostly aggregated and<br />
published at national level.<br />
In order to use these statistics in a spatially distributed<br />
simulation model, the information needs to be<br />
spatially referenced to represent a unit on the ground.<br />
Therefore the statistical information needs to be<br />
linked to a GIS data set. Such GIS data are stored in<br />
the GISCO (Geographic Information System of the<br />
European Commission) database. The GISCO database<br />
holds spatial data on administrative boundaries<br />
down to commune level, as well as thematic data sets<br />
such as the soil database, CORINE land cover (managed<br />
by the EEA) and climatic time series for about 2000<br />
measuring stations in the European Union.<br />
Thus, on the one hand, there is a clearly expressed need<br />
from decision makers at national and international<br />
level for tools which, on the basis of readily available<br />
data, can predict the risks of groundwater pollution<br />
from non-point sources and the impacts of<br />
alternative agricultural management practices; and<br />
on the other hand, the scientific community has<br />
achieved new knowledge and developed new tools<br />
aiming at this. However, there are some important<br />
gaps to be filled before the scientifically based tools
J.C. Refsgaard et al. / Journal of Hydrology 221 (1999) 117–140 119<br />
Fig. 1. Schematic structure of the MIKE SHE.<br />
can be applied operationally for supporting the decision<br />
makers:<br />
• The physically based models are very promising<br />
tools for assessing the impacts of alternative agricultural<br />
practices, but have so far been tested only at<br />
plot scale and in very small experimental catchments,<br />
whereas the need from a policy making point of<br />
view mainly relates to application on a much larger<br />
scale. Hence, there is a need to derive and test<br />
methodologies for upscaling of such models to<br />
run with model grid sizes one to two orders of<br />
magnitude larger than those normally used.<br />
• Readily available data on large (national and international)<br />
scales do exist, although in a somewhat<br />
aggregated form. However, such data have not yet<br />
been used as the basis for comprehensive modelling,<br />
which so far has always been based on more<br />
detailed data, often from experimental catchments.<br />
Hence, there is a need to test to what extent these<br />
readily available data are suitable for modelling.<br />
• There is a need to assess the predictive<br />
uncertainties, before it can be evaluated whether<br />
the approach of combining complex predictive<br />
models with existing data bases is of any practical<br />
use in the decision making process or whether the<br />
uncertainties are too large.<br />
This paper presents results from a joint EU research<br />
project on prediction of non-point nitrate contamination<br />
at catchment scale due to agricultural activities.<br />
Other results from the same study focussing on uncertainty<br />
aspects are presented in UNCERSDSS (1998),<br />
Refsgaard et al. (1998a, 1999) and Hansen et al.<br />
(1999).<br />
2. Methodology<br />
2.1. Materials and methods<br />
2.1.1. MIKE SHE<br />
MIKE SHE is a modelling system describing the<br />
flow of water and solutes in a catchment in a distributed<br />
physically based way. This implies numerical
solutions of the coupled partial differential equations<br />
for overland (2D) and channel flow (1D), unsaturated<br />
flow (1D) and saturated flow (3D) together with a<br />
description of evapotranspiration and snowmelt<br />
processes. The model structure is illustrated in Fig.<br />
1. For further details reference is made to the literature<br />
(Abbott et al., 1986; Refsgaard and Storm, 1995).<br />
2.1.2. Daisy<br />
Daisy (Hansen et al., 1991) is a one-dimensional<br />
physically based modelling tool for the simulation<br />
of crop production and water and nitrogen balance<br />
in the root zone. Daisy includes modules for description<br />
of evapotranspiration, soil water dynamics based<br />
on Richards’ equation, water uptake by plants, soil<br />
temperature, soil mineral nitrogen dynamics based<br />
on the advection–dispersion equation, nitrate uptake<br />
by plants and nitrogen transformations in the soil. The<br />
nitrogen transformations simulated by Daisy are<br />
mineralization–immobilization turnover, nitrification<br />
and denitrification. In addition, Daisy includes a<br />
module for description of agricultural management<br />
practices. Details on the Daisy application in the<br />
present study are given by Hansen et al. (1999).<br />
2.1.3. MIKE SHE/Daisy coupling<br />
By combining MIKE SHE and Daisy, a complete<br />
modelling system is available for the simulation of<br />
water and nitrate transport in an entire catchment. In<br />
the present case the coupling is a sequential one. Thus<br />
for all agricultural areas, Daisy first produces calculations<br />
of water and nitrogen behaviour from the soil<br />
surface and through the root zone. The percolation of<br />
water and nitrate at the bottom of the root zone, simulated<br />
by Daisy, is then used as input to the MIKE SHE<br />
calculations for the remaining part of the catchment.<br />
For natural areas, MIKE SHE also calculates the root<br />
zone processes, assuming no nitrate contribution from<br />
these areas. Owing to the sequential execution of the<br />
two codes, it has to be assumed that there is no feedback<br />
from the groundwater zone (MIKE SHE) to the<br />
root zone (Daisy). Further, overland flow generated by<br />
high intensity rainfall (Hortonian) cannot be simulated<br />
by this coupling, while overland flow due to<br />
saturation from below (Dunne) can be accounted for<br />
by MIKE SHE.<br />
Thus, MIKE SHE does not in the present case<br />
handle evapotranspiration and other root zone<br />
processes in the agricultural areas. As Daisy is one-dimensional,<br />
one Daisy run should in principle be<br />
carried out for each of MIKE SHE’s horizontal<br />
grids. However, several MIKE SHE grids are assumed<br />
to have identical root zone properties (soil, crop, agricultural<br />
management practices, etc.), so that in practice<br />
the outputs from each Daisy run can be used as<br />
input to several MIKE SHE grids.<br />
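The reuse of Daisy runs across grids with identical root zone properties can be sketched as a simple lookup. This is only an illustration of the bookkeeping, not the coupling code used in the study; the function names, the property tuples and the percolation values are hypothetical.<br />

```python
from collections import defaultdict


def group_grids_by_root_zone(grid_properties):
    """Group grid ids by their (soil, crop, management) tuple.

    One Daisy run is then needed per group rather than per grid.
    """
    groups = defaultdict(list)
    for grid_id, props in grid_properties.items():
        groups[props].append(grid_id)
    return dict(groups)


def distribute_percolation(groups, daisy_output):
    """Give every grid the percolation series of the Daisy run for its group."""
    grid_input = {}
    for props, grid_ids in groups.items():
        series = daisy_output[props]
        for grid_id in grid_ids:
            grid_input[grid_id] = series
    return grid_input


# Hypothetical agricultural grids: (row, column) -> root zone properties.
grids = {
    (0, 0): ("coarse sand", "winter wheat", "mineral fertiliser"),
    (0, 1): ("coarse sand", "winter wheat", "mineral fertiliser"),
    (1, 0): ("coarse sand", "spring barley", "pig manure"),
}
groups = group_grids_by_root_zone(grids)

# Hypothetical Daisy results: daily (water percolation, nitrate flux) pairs.
daisy_output = {props: [(1.0, 0.2), (0.5, 0.1)] for props in groups}
grid_input = distribute_percolation(groups, daisy_output)
```

Here three grids require only two Daisy runs, and the two winter wheat grids share the same percolation series as input.<br />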
2.2. Data availability at European databases<br />
Input data for modelling at the European scale need<br />
to satisfy certain requirements to make them useful for<br />
large-scale applications:<br />
• The data must be available for the whole of<br />
Europe.<br />
• The data must be harmonised according to a<br />
common nomenclature in order to avoid regional<br />
or national inconsistencies.<br />
• The data should be available in a seamless database.<br />
• The data should be available from one single<br />
source to avoid regional or national inconsistencies.<br />
• The data should be available in a format which can<br />
be directly integrated into a Geographical Information<br />
System (GIS).<br />
Certain problems are also attached to the use of “European” data sets.<br />
The data are generalised in<br />
geometric as well as in thematic detail, so local particularities<br />
that are especially important for hydrological<br />
simulations are not always accounted for. Often<br />
information that is required for specific modelling<br />
objectives is not directly available at European<br />
level, demanding the establishment and use of transfer<br />
functions instead. Conversely, information is<br />
sometimes too specific when it has been collected in<br />
the framework of a particular research project, e.g.<br />
information on a particular soil property may have been<br />
collected in natural soils but not in agricultural soils.<br />
Given these formal requirements, a first task of the<br />
project was to study the availability of data sets suited<br />
for large-scale hydrological modelling of groundwater<br />
contamination from diffuse sources. After<br />
intensive searches of on-line data catalogues, paper<br />
publications and direct contacts with organisations<br />
holding relevant information, it was possible to
Table 1<br />
Data sources for European scale hydrological modelling<br />
Data | Potential data source identified in European data base | Source actually used for modelling | Scale of available data used<br />
Topography | USGS [a]/GISCO | USGS/GISCO | 1 km grid<br />
Soil type | GISCO soil map | GISCO soil map | 1 km grid<br />
Soil organic matter | RIVM [b] report | Experience value for Danish arable soils [c] | Denmark<br />
Vegetation | EEA: CORINE land cover | EEA: CORINE land cover | 1 km grid<br />
River network and river cross sections | DCW [d] | Provided by an application developed within the project | 1 km grid<br />
Geology | Report on groundwater resources in Denmark (EC, 1982); RIVM: digital map data of report | Report on groundwater resources in Denmark (EC, 1982) | County, i.e. approximately 3000 km²<br />
Groundwater abstraction | Report on groundwater resources in Denmark (EC, 1982); RIVM: digital map data of report | Report on groundwater resources in Denmark (EC, 1982) | Commune, i.e. approximately 200 km²<br />
Management practices | SC-DLO [e] report | Plantedirektoratet (1996) | Denmark<br />
Crop type | Eurostat: Regional Statistics | Agricultural Statistics (1995) | County, i.e. approximately 3000 km²<br />
Livestock density | Eurostat: Regional Statistics; Eurostat: Eurofarm | Agricultural Statistics (1995) | County, i.e. approximately 3000 km²<br />
Fertilizer consumption | Eurostat: Environmental Statistics | Agricultural Statistics (1995) | County, i.e. approximately 3000 km²<br />
Manure production | Eurostat: Environmental Statistics | Agricultural Statistics (1995) | County, i.e. approximately 3000 km²<br />
Atmospheric deposition | MARS project | National data | Denmark<br />
Climatic variables | MARS project [f] | National data | Denmark<br />
River runoff | GRDC [g] | National data | Catchment<br />
[a] USGS: United States Geological Survey.<br />
[b] RIVM: National Institute of Public Health and the Environment of The Netherlands.<br />
[c] RIVM data only include natural areas, not arable land. Instead the figure was assessed on the basis of previous experience with Danish agricultural soils.<br />
[d] DCW: Digital Chart of the World.<br />
[e] SC-DLO: Winand Staring Centre, The Netherlands.<br />
[f] MARS: Monitoring Agriculture by Remote Sensing database.<br />
[g] GRDC: Global Runoff Data Centre, database mainly for large river basins.<br />
identify sources for all the information requirements.<br />
However, after evaluation of all the potential sources<br />
the following deficiencies became apparent:<br />
• Not all information was available in spatially referenced<br />
GIS format, therefore other sources such as<br />
tables and statistics had to be considered.<br />
• Not all information was available from<br />
“European” databases, so national sources ultimately<br />
had to be considered. For these national sources<br />
strict requirements in terms of ease of availability,<br />
data quality and data comparability were<br />
imposed.<br />
• The scale of the available data was often too coarse<br />
for the application. Global data sets with 1° × 1°<br />
longitude/latitude resolution are often not detailed<br />
enough.<br />
The potential “European scale” data sources and the<br />
data sources which ultimately were used for the model<br />
are shown in Table 1.<br />
Data about climatic variables were obtained from
the national meteorological institutes and river runoff<br />
from the national hydrological institutes. These data<br />
were only available from national sources but, on the<br />
other hand, they are probably the most easily available<br />
(if the issue of price charges is disregarded) and<br />
the most easily comparable owing to internationally<br />
harmonised measuring techniques at these organisations.<br />
Regional statistics on Denmark obtained from<br />
EUROSTAT proved to be not detailed enough<br />
(country level only). The required statistical information<br />
could easily be recovered from Danish national<br />
statistics.<br />
Cost estimates for the compilation of the database<br />
have only been undertaken to a limited extent. The<br />
project data themselves have mostly been obtained in<br />
exchange for the anticipated project results, i.e. at<br />
no cost. The main data that would cost a substantial<br />
amount of money in a fully commercial environment are<br />
the meteorological data, which are available from the<br />
national meteorological institutes (Kleeschulte,<br />
1998).<br />
2.3. Change of scale<br />
Large scale hydrological models are required for a<br />
variety of applications in hydrological, environmental<br />
and land surface-atmosphere studies, both for research<br />
and for day to day water resources management<br />
purposes. The physically based models have so far<br />
mainly been tested and applied at small scale and<br />
therefore require upscaling. The complex interaction<br />
between spatial scale and spatial variability is widely<br />
perceived as a substantial obstacle to progress in this<br />
respect (Blöschl and Sivapalan, 1995; and many<br />
others).<br />
The research results on the scaling issue reported<br />
during the past decade have, depending on the particular<br />
applications, focussed on different aspects,<br />
which may be categorised as follows:<br />
• Subsurface processes focussing on the effect of<br />
geological heterogeneity.<br />
• Root zone processes including interactions<br />
between land surface and atmospheric processes.<br />
• Surface water processes focussing on topographic<br />
effects and stream–aquifer interactions.<br />
The effect of spatial heterogeneity on the description<br />
of subsurface processes has been the subject of<br />
comprehensive research for two decades, see e.g.<br />
Dagan (1986) and Gelhar (1986) for some of the<br />
first consolidated results and Wen and Gómez-<br />
Hernández (1996) for a more recent review, mainly<br />
related to aquifer systems. Research in this area is<br />
largely concerned with the upscaling of hydraulic<br />
conductivity and its implications for solute transport<br />
and dispersion processes in the unsaturated zone and<br />
aquifer system, typically at length scales less than<br />
1 km.<br />
Research on land surface processes has<br />
mainly been driven by climate change research,<br />
where meteorologists typically focus on length<br />
scales up to 100 km. Michaud and Shuttleworth<br />
(1997), in a recent overview, conclude that substantial<br />
progress has been made for the description of surface<br />
energy fluxes by using simple aggregation rules. Sellers<br />
et al. (1997) conclude that “it appears that simple<br />
averages of topographic slope and vegetation parameters<br />
can be used to calculate surface energy and<br />
heat fluxes over a wide range of spatial scales, from<br />
a few meters up to many kilometers at least for grassland<br />
and sites with moderate topography”. An interesting<br />
finding is the apparent existence of a threshold<br />
scale, or representative elementary area (REA) for<br />
evapotranspiration and runoff generation processes<br />
(Wood et al., 1988, 1990, 1995). Famiglietti and<br />
Wood (1995), in a study of catchment evapotranspiration,<br />
conclude on the implications of such an REA<br />
that “the existence of an REA for evapotranspiration<br />
modelling suggests that in catchment areas smaller<br />
than this threshold scale, actual patterns of model<br />
parameters and inputs may be important factors<br />
governing catchment-scale evapotranspiration rates<br />
in hydrological models. In models applied at scales<br />
greater than the REA scale, spatial patterns of dominant<br />
process controls can be represented by their<br />
statistical distribution functions”. The REA scales<br />
reported in the literature are of the order of 1–5 km².<br />
The research on scale effects related to topography<br />
and stream–aquifer interactions has been rather<br />
limited as compared to the above two areas. Saulnier<br />
et al. (1997) have examined the effect of the grid sizes<br />
in digital terrain maps (DTM) on the model simulations<br />
using the topography-based TOPMODEL. They<br />
concluded that in particular for channel pixels the<br />
spatial resolution of the underlying DTM is important.<br />
Refsgaard (1997), applying the distributed MIKE SHE
Fig. 2. Schematic representation of upscaling/aggregation procedure.<br />
model to the Danish Karup catchment with grid sizes<br />
of 0.5, 1, 2 and 4 km, found that the discharge hydrograph<br />
shape was significantly affected for the 2 and<br />
4 km grids as compared to the almost identical model<br />
results with 0.5 and 1 km grids. He concluded that the<br />
main reason for this change was that the density of<br />
smaller tributaries within the catchment was lower<br />
for the models with the larger grids.<br />
Many researchers doubt whether it is feasible to use<br />
the same model process descriptions at different<br />
scales. For instance Beven (1995) states that “… the<br />
aggregation approach towards macroscale hydrological<br />
modelling, in which it is assumed that a model<br />
applicable at small scales can be applied at larger<br />
scales using ‘effective’ parameter values, is an inadequate<br />
approach to the scale problem. It is also unlikely<br />
in the future that any general scaling theory can be<br />
developed due to the dependence of hydrological<br />
systems on historical and geological perturbations”.<br />
We have experienced some of the same problems<br />
and agree that it is generally not possible to apply<br />
the same model without recalibration at small and<br />
large scales. Therefore, we have used another<br />
approach based on a combination of aggregation and<br />
upscaling in accordance with the principles recommended<br />
by Heuvelink and Pebesma (1998). The<br />
scale terminology and the upscaling procedure<br />
adopted here are as follows (Fig. 2):<br />
• The basic modelling system is of the distributed<br />
physically based type. For application at point<br />
scale (where it is not used spatially distributed)<br />
the process descriptions of this model type can be<br />
tested directly against field data.<br />
• The model is in this case run with (equations and)<br />
parameter values in each horizontal grid point<br />
representing field scale (50–200 m) conditions.<br />
The field scale is characterised by ‘effective’ soil<br />
and vegetation parameters, but assuming only one<br />
soil type and one cropping pattern. Thus the spatial<br />
variability within a typical field is aggregated and<br />
accounted for in the ‘effective’ parameter values.<br />
• The smallest horizontal discretization in the model<br />
is the grid scale or grid size (1–5 km) that is larger<br />
than the field scale. This implies that all the variations<br />
between categories of soil type and crop type
Fig. 3. Locations of the Karup and Odense catchments in Denmark.<br />
within the area of each grid cannot be resolved and<br />
described at the grid level. Input data whose<br />
variations are not included in the grid scale model<br />
representation are distributed randomly at the<br />
catchment scale so that their statistical distributions<br />
are preserved at that scale.<br />
• The results from the grid scale modelling are then<br />
aggregated to catchment scale (10–50 km) and the<br />
statistical properties of model output and field data<br />
are then compared at catchment scale.<br />
• For applications to larger scales than catchment<br />
scale, such as continental scale, the catchment<br />
scale concept is used, just with more grid points.<br />
This implies that the continental scale can be<br />
considered to consist of several catchments, within<br />
each of which the field scale statistical variations<br />
are preserved and at which scale the predictive<br />
capability of the model thus lies.<br />
In the upscaling procedure a distinction is made<br />
between the terms upscaling and aggregation. Thus,<br />
spatial attributes are aggregated and model parameters<br />
are scaled up. A principal difference between<br />
aggregation and upscaling is that whereas aggregation<br />
can be defined irrespective of a model operating on<br />
the aggregated values, upscaling must always be<br />
defined in the context of a model that uses the parameters<br />
that have been scaled up (Heuvelink and<br />
Pebesma, 1998). In this respect the main principle<br />
of the upscaling procedure can be summarised as<br />
follows:<br />
• Upscale model from point scale to field scale.<br />
• Run model at grid scale using field scale parameters<br />
in such a way that their statistical properties<br />
are preserved at catchment scale.<br />
• Aggregate grid scale model output to catchment<br />
scale.<br />
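The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure as implemented in the study: the crop fractions, the per-category leaching values and the function names are hypothetical, and a real application would run the simulation model for each grid instead of looking up a fixed value.<br />

```python
import random


def assign_categories(n_cells, fractions, seed=0):
    """Assign one field scale category to each grid cell at random.

    Cell counts follow the catchment scale fractions as closely as whole
    cells allow (largest remainder rule); the spatial location of each
    category is random, i.e. not correctly georeferenced.
    """
    raw = {cat: frac * n_cells for cat, frac in fractions.items()}
    counts = {cat: int(value) for cat, value in raw.items()}
    leftover = n_cells - sum(counts.values())
    by_remainder = sorted(raw, key=lambda cat: raw[cat] - counts[cat], reverse=True)
    for cat in by_remainder[:leftover]:
        counts[cat] += 1
    cells = [cat for cat, count in counts.items() for _ in range(count)]
    random.Random(seed).shuffle(cells)
    return cells


def aggregate_to_catchment(cell_values):
    """Aggregate grid scale output to catchment scale by a simple mean."""
    return sum(cell_values) / len(cell_values)


# Hypothetical catchment scale statistics (fractions of the catchment area).
fractions = {"winter wheat": 0.4, "spring barley": 0.3, "natural": 0.3}
cells = assign_categories(500, fractions)

# Hypothetical grid scale output per category (illustrative numbers only).
leaching = {"winter wheat": 60.0, "spring barley": 80.0, "natural": 5.0}
catchment_mean = aggregate_to_catchment([leaching[cat] for cat in cells])
```

Only the aggregated value is meaningful in this sketch; the value in any individual cell depends on the random assignment.<br />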
This methodology mainly attempts to address scaling<br />
within the second of the above fields, namely root
zone processes, while scaling in relation to subsurface<br />
processes and stream–aquifer interaction has not been<br />
considered when designing the present upscaling<br />
procedure. The methodology has some complications<br />
and critical assumptions:<br />
• The assumption of upscaling from point scale to<br />
field scale is crucial. This assumption has been documented<br />
to hold in many cases (Jensen and<br />
Refsgaard, 1991a–c; Djurhuus et al., 1999), but<br />
may fail in other cases (Bresler and Dagan,<br />
1983), for instance in areas where overland flow<br />
is a dominant flow mechanism.<br />
• Running the model at grid scale but using model<br />
parameters valid at a field scale, which is typically<br />
2 to 3 orders of magnitude smaller, is necessary to<br />
make the computational demand acceptable for<br />
catchment and continental scale applications. The<br />
solution adopted is to assign soil and vegetation<br />
types to the grids without correct georeferencing, but such<br />
that their statistical distribution at catchment scale<br />
is preserved. This implies that results at grid scale<br />
are dubious and should not be used. The aggregation<br />
step up to catchment scale is therefore essential.<br />
• While the statistical properties of the critical<br />
root zone parameters have, owing to the aggregation<br />
step, been preserved at catchment scale,<br />
this is not the case for the geological, topographical<br />
and stream data, which are used directly<br />
at the grid scale. A critical question is therefore<br />
how the catchment scale model output<br />
is influenced, through these other data, by the selection<br />
of grid scale. Here, investigations with 1, 2<br />
and 4 km grids are made.<br />
3. Application<br />
3.1. Modelling approach for the Karup and Odense<br />
catchments<br />
The modelling studies have focussed on two<br />
aspects, namely the feasibility of using coarse aggregated<br />
data available at European level databases, and<br />
the effect of the upscaling procedure. The modelling<br />
aims at describing the integrated runoff at the catchment<br />
outlet and the distribution function of the nitrate<br />
concentrations sampled from available wells over the<br />
catchment (aquifer). On this basis the following<br />
approach has been adopted:<br />
1. Simulation models have been established for<br />
two catchments in Denmark, Karup Å and<br />
Odense Å (Fig. 3), in the following denoted<br />
the Karup and Odense models, respectively.<br />
The topographical area of the Karup catchment<br />
at gauging station 20.05 Hagebro is<br />
518 km². Correspondingly, the catchment area<br />
at the gauging station used for the model validation<br />
tests in the Odense catchment, 45.26<br />
Ejby Mølle, is 536 km². The most detailed<br />
studies were carried out for the Karup catchment,<br />
while the results for the Odense catchment<br />
were included mainly to check the<br />
generality of the conclusions derived from the<br />
Karup catchment.<br />
2. The models are established directly from the<br />
European level databases and all input parameter<br />
values are assessed from these data or in a predefined<br />
objective way from experience values<br />
obtained from previous model studies. Thus, the<br />
models are not calibrated at all.<br />
3. The results of the models are compared with field<br />
data, on which basis the model performance is<br />
assessed.<br />
4. The effects of upscaling have been examined in<br />
two ways:<br />
• The models are run with different grid sizes (1, 2<br />
and 4 km) and the results compared.<br />
• For the Karup catchment two different procedures<br />
have been compared, namely:<br />
(a) the upscaling/aggregation procedure described<br />
above (Fig. 2), which according to its representation<br />
of agricultural crops is denoted ‘distributed’;<br />
(b) a simpler procedure where the agricultural crops<br />
are upscaled all the way from field scale to<br />
catchment scale. This implies that one crop<br />
type represents all the agricultural areas. The<br />
dominant crop in the area, namely winter<br />
wheat, has been selected as the crop for the<br />
70% agricultural area, while the 30% natural/
Fig. 4. Surface topography, catchment delineation and river network for the Karup-EU model.<br />
urban areas remain as the only other vegetation<br />
type. This procedure is denoted ‘uniform’.<br />
3.2. Karup model<br />
3.2.1. Catchment and river system<br />
The catchment area and locations of the river<br />
branches (Fig. 4) were generated from the DEM by<br />
use of standard ARC/Info functionalities. The generated<br />
catchment areas for 1, 2 and 4 km grids were<br />
within 4% of the correct one at station 20.05 Hagebro.<br />
The river cross-sections were subsequently automatically<br />
derived on the basis of the following assumptions:<br />
• The bankfull discharge (i.e. water flow up to the top of the<br />
cross-section) corresponds to a typical annual<br />
maximum discharge. This characteristic discharge<br />
is further assumed uniform in terms of specific<br />
runoff ($\mathrm{l\,s^{-1}\,km^{-2}}$), so that the actual discharge<br />
at any cross section is estimated as the specific<br />
runoff multiplied by the upstream catchment area,<br />
which can be estimated from the DEM.<br />
• The river slope corresponds to the slope of the<br />
surrounding surface, which can be derived from<br />
the DEM.<br />
• The cross-section has a trapezium shape with a<br />
fixed given angle and relation between depth and<br />
width.<br />
• The relation between discharge, slope and river<br />
cross-section can be determined by the Manning<br />
formula with a given Manning number.<br />
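The four assumptions above can be combined into a closed-form estimate of the cross-section. The sketch below assumes that the Manning number M is the reciprocal of Manning's n, that the fixed relation between depth and width is a constant ratio of bottom width to depth, and that the fixed angle is given as a side slope (horizontal to vertical). The numerical values of specific runoff, slope, Manning number and shape are hypothetical and are not those used in the study.<br />

```python
import math


def bankfull_discharge(specific_runoff_l_s_km2, upstream_area_km2):
    """Bankfull discharge (m3/s) from specific runoff (l/s/km2) and area (km2)."""
    return specific_runoff_l_s_km2 * upstream_area_km2 / 1000.0


def trapezoid_depth(discharge, slope, manning_m, width_depth_ratio, side_slope):
    """Depth (m) of a trapezoidal section carrying `discharge` at bankfull.

    With bottom width b = ratio * d, the area is A = (ratio + z) * d**2 and the
    wetted perimeter is P = (ratio + 2 * sqrt(1 + z**2)) * d, so Manning's
    formula Q = M * A * (A / P)**(2/3) * sqrt(S) scales as d**(8/3).
    """
    area_factor = width_depth_ratio + side_slope
    perimeter_factor = width_depth_ratio + 2.0 * math.sqrt(1.0 + side_slope ** 2)
    radius_factor = area_factor / perimeter_factor
    conveyance = manning_m * area_factor * radius_factor ** (2.0 / 3.0) * math.sqrt(slope)
    return (discharge / conveyance) ** (3.0 / 8.0)


def manning_discharge(depth, slope, manning_m, width_depth_ratio, side_slope):
    """Discharge (m3/s) through the trapezoidal section at a given depth."""
    bottom_width = width_depth_ratio * depth
    area = (bottom_width + side_slope * depth) * depth
    perimeter = bottom_width + 2.0 * depth * math.sqrt(1.0 + side_slope ** 2)
    return manning_m * area * (area / perimeter) ** (2.0 / 3.0) * math.sqrt(slope)


# Hypothetical inputs; only the 518 km2 catchment area is taken from the text.
q_bankfull = bankfull_discharge(50.0, 518.0)
depth = trapezoid_depth(q_bankfull, slope=0.001, manning_m=20.0,
                        width_depth_ratio=5.0, side_slope=1.0)
bottom_width = 5.0 * depth
```

Because the shape is fixed, no iteration is needed; `manning_discharge` can be used to check that the derived section carries the bankfull discharge.<br />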
Most areas in Denmark are drained in order to make<br />
the land suitable for agriculture. Agricultural areas are<br />
typically artificially drained with tile drains in combination<br />
with small ditches. Other areas may be naturally<br />
drained by creeks and rivers. It is not possible to<br />
include a detailed and fully correct drainage description<br />
in a coarse model like the Karup model. Moreover,<br />
detailed information on the drainage network is not<br />
available. Therefore, when establishing a coarse scale
model, a lumped description must be used. In the<br />
present case it is simply assumed that the entire catchment<br />
area is drained and that the drains are located<br />
1 m below ground surface. Drainage water is<br />
produced whenever the groundwater table is located<br />
above this drainage level. Drainage water is routed to<br />
the nearest river node where it contributes as a source<br />
to the river flow. Routing of groundwater to the drains<br />
and further to the ultimate recipient is described in MIKE SHE<br />
using a linear routing technique, where a<br />
time constant is specified by the user. In this case a<br />
time constant of $2.3 \times 10^{-7}\ \mathrm{s^{-1}}$ was used, corresponding<br />
to an average retention time (in the linear reservoir)<br />
of 50 days. This time constant represents a<br />
typical value for Danish catchments.<br />
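The relation between the retention time and the time constant can be checked with a few lines of arithmetic. The sketch assumes the standard linear reservoir, in which outflow is proportional to storage so that undisturbed storage decays exponentially; the function names are hypothetical and the code is not part of MIKE SHE.<br />

```python
import math

SECONDS_PER_DAY = 86400.0


def time_constant_from_retention(retention_days):
    """Time constant (1/s) of a linear reservoir with the given mean retention time."""
    return 1.0 / (retention_days * SECONDS_PER_DAY)


def remaining_fraction(time_constant, elapsed_days):
    """Fraction of the initial storage left after `elapsed_days` without inflow."""
    return math.exp(-time_constant * elapsed_days * SECONDS_PER_DAY)


k = time_constant_from_retention(50.0)   # about 2.3e-7 1/s
left_after_50_days = remaining_fraction(k, 50.0)   # equals exp(-1)
```

A retention time of 50 days thus means that, without further recharge, about 37% of the drainable storage remains after 50 days.<br />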
3.2.2. Soil properties<br />
The soil texture classes in a 1 × 1 km resolution<br />
were provided by the GISCO soil data base. The<br />
texture classes were translated into soil parameters<br />
in terms of hydraulic conductivity functions and soil<br />
water retention curves using pedo-transfer functions<br />
(Cosby et al., 1984). According to GISCO, the<br />
Karup catchment is covered by coarse sandy soil, for<br />
which the following key parameter values were estimated:<br />
(a) saturated hydraulic conductivity<br />
$K_s = 1.7 \times 10^{-5}\ \mathrm{m/s}$; (b) moisture content at saturation<br />
$\theta_s = 40$ vol%; (c) moisture content at field capacity<br />
$\theta_{FC} = 20$ vol%; and (d) moisture content at<br />
wilting point $\theta_{wp} = 6$ vol%.<br />
A specific problem was related to assessment of soil<br />
organic matter, which is an important parameter for<br />
nitrogen turnover processes. As indicated in Table 1<br />
such information was not identified in any of the<br />
European data bases. Instead a value based on<br />
previous experience (Lamm, 1971) with Danish agricultural<br />
soils was estimated. In the plough layer (0–<br />
20 cm) a value of 1.5%C was used, and this value<br />
decreased rapidly with depth to a minimum of<br />
0.01%C below 1 m depth.<br />
3.2.3. Hydrogeology<br />
The geological perception of the area and the basis<br />
for estimation of the hydrogeological parameters used<br />
in the model are all based on EC (1982), where the<br />
aquifer is described as composed of two main geological<br />
layers.<br />
The upper layer is Quaternary sediments consisting<br />
of sands and gravel. The transmissivity of these sediments<br />
is assessed to be of the order of $2 \times 10^{-3}\ \mathrm{m^2/s}$<br />
and the thickness about 15 m (EC, 1982). This leads to<br />
a horizontal hydraulic conductivity of $1.3 \times 10^{-4}\ \mathrm{m/s}$<br />
that was used in the model calculations. An anisotropy<br />
factor of 10 between horizontal and vertical hydraulic<br />
conductivities was assumed, leading to a vertical<br />
hydraulic conductivity of $1.3 \times 10^{-5}\ \mathrm{m/s}$. Moreover,<br />
a specific yield of 0.2 and a storage coefficient of<br />
$10^{-4}\ \mathrm{m^{-1}}$ were assumed.<br />
Below the Quaternary sediments there are Miocene<br />
quartz-sand sediments with a relatively high transmissivity<br />
of $3 \times 10^{-3}$ m²/s and a thickness of typically<br />
10–20 m (EC, 1982). Hence, in the model a thickness<br />
of 15 m has been used. This leads to a horizontal<br />
hydraulic conductivity of $2.0 \times 10^{-4}$ m/s. The same<br />
assumptions on anisotropy, specific yield and storage<br />
coefficients as for the Quaternary sediments were<br />
applied for the Miocene sediments.<br />
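The conductivities quoted above follow from dividing each transmissivity by the layer thickness and then applying the anisotropy factor. A minimal sketch of that arithmetic is given below; the function and variable names are illustrative only and are not taken from the model set-up.

```python
# Sketch: layer hydraulic conductivities from transmissivity and thickness.
# Input values are those quoted from EC (1982); all names are illustrative.

def layer_conductivities(transmissivity_m2_s, thickness_m, anisotropy=10.0):
    """Return (horizontal, vertical) hydraulic conductivity in m/s."""
    k_h = transmissivity_m2_s / thickness_m  # K_h = T / b
    k_v = k_h / anisotropy                   # K_v = K_h / anisotropy factor
    return k_h, k_v

quaternary = layer_conductivities(2e-3, 15.0)
miocene = layer_conductivities(3e-3, 15.0)

print("Quaternary: K_h = %.1e m/s, K_v = %.1e m/s" % quaternary)
print("Miocene:    K_h = %.1e m/s, K_v = %.1e m/s" % miocene)
```

With a thickness of 15 m for both layers, this gives horizontal conductivities of about 1.3 × 10⁻⁴ and 2.0 × 10⁻⁴ m/s, in line with the values stated in the text.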
EC (1982) provides information on groundwater<br />
abstraction on a commune (local administrative unit)<br />
basis. The Miocene sediments are described as suitable<br />
for drinking water supply, so it is assumed that<br />
all groundwater abstractions are made from these<br />
sediments, which form the lower layer in the model. The<br />
total abstraction is given as $13 \times 10^{6}$ m³/year. The<br />
exact locations of the individual water supply wells<br />
are not given in EC (1982), so the abstraction has been evenly distributed<br />
among 10–20 model grids located along the river<br />
system.<br />
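To give a feel for the magnitudes involved, the sketch below splits the total abstraction evenly over 10 or 20 grid cells and also expresses it as an equivalent depth over the catchment. The catchment area of 518 km² is taken from the caption of Table 2; the function names and the use of a 365.25-day year are assumptions made for illustration.

```python
# Sketch: even distribution of the total abstraction over model grid cells.
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed year length

def abstraction_per_cell(total_m3_per_year, n_cells):
    """Evenly split a total annual abstraction; returns m3/s per cell."""
    return total_m3_per_year / n_cells / SECONDS_PER_YEAR

def equivalent_depth_mm(total_m3_per_year, area_km2):
    """Express an annual volume as a depth in mm/year over a given area."""
    return total_m3_per_year / (area_km2 * 1e6) * 1000.0

total = 13e6  # m3/year, from EC (1982)
for n_cells in (10, 20):
    q = abstraction_per_cell(total, n_cells)
    print(f"{n_cells} cells: {q:.4f} m3/s per cell")

print(f"Equivalent depth: {equivalent_depth_mm(total, 518.0):.0f} mm/year")
```

Under these assumptions the abstraction corresponds to roughly 25 mm/year, which is small compared with the annual precipitation listed in Table 2.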
The location of the reduction front in the aquifer is<br />
an important parameter for nitrate conditions. As<br />
percolation water containing nitrate moves into<br />
areas with reduced geochemical conditions the nitrate<br />
will disappear. No information on this important parameter<br />
was provided in EC (1982). It was assumed that<br />
the front separating oxic and reduced aquifer conditions<br />
all over the aquifer is located in the Miocene<br />
sediments, 3 m below the interface to the Quaternary<br />
sediments. This corresponds to a location 18 m below<br />
the terrain surface.<br />
3.2.4. Hydrometeorology<br />
Time series of daily precipitation and temperature<br />
based on standard meteorological stations within the<br />
catchment were used. In addition, monthly values of<br />
potential evapotranspiration were calculated by the<br />
Makkink equation on the basis of climate data from
J.C. Refsgaard et al. / Journal of Hydrology 221 (1999) 117–140<br />
the synoptic station at Karup airport. The data from<br />
synoptic stations are generally easily available internationally.<br />
3.2.5. Crop growth, evapotranspiration and nitrate<br />
leaching model<br />
Distributions of crop types and livestock densities<br />
were obtained from Agricultural Statistics (1995) and<br />
converted to slurry production using standard values<br />
for nitrogen content. Based on typical crop rotations<br />
proposed by The Danish Agricultural Advisory<br />
Centre and the constraints offered by crop distribution<br />
and livestock density two cattle farm rotations, one<br />
pig farm rotation and one arable farm rotation were<br />
constructed. In order to capture the effect of the interaction<br />
between weather conditions and crops, simulations<br />
were performed in such a way that each crop at<br />
its particular position in the considered rotation<br />
occurred exactly once in each of the years, which<br />
resulted in a total of 17 crop rotation schemes.<br />
These 17 schemes were distributed randomly over<br />
the area in such a way that the statistical distribution<br />
was in accordance with the agricultural statistics.<br />
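The construction of the rotation schemes can be sketched as follows. Each rotation is cyclically shifted so that every crop position occurs once in every year, and the resulting schemes are then assigned at random to grid cells with weights given by the area share of each farm type. The crop sequences, the rotation lengths (chosen here only so that they sum to 17) and the area shares are hypothetical and are not taken from the study.

```python
import random

# Hypothetical rotations: crop names and lengths are NOT from the source; they
# are chosen only so that the lengths (5 + 4 + 4 + 4) sum to 17 schemes.
ROTATIONS = {
    "cattle_1": ["grass", "grass", "spring barley", "fodder beet", "winter wheat"],
    "cattle_2": ["grass", "spring barley", "maize", "winter wheat"],
    "pig": ["winter wheat", "winter barley", "winter rape", "spring barley"],
    "arable": ["winter wheat", "spring barley", "peas", "winter rape"],
}

def shifted_schemes(rotations):
    """One scheme per cyclic shift, so every crop position occurs in every year."""
    schemes = []
    for name, crops in rotations.items():
        for shift in range(len(crops)):
            schemes.append((name, crops[shift:] + crops[:shift]))
    return schemes

def assign_schemes(schemes, rotation_share, n_cells, seed=0):
    """Randomly assign scheme indices to cells, weighted by farm-type area share."""
    rng = random.Random(seed)
    counts = {}
    for name, _ in schemes:
        counts[name] = counts.get(name, 0) + 1
    # Each shift of a rotation receives an equal part of that rotation's share.
    weights = [rotation_share[name] / counts[name] for name, _ in schemes]
    return rng.choices(range(len(schemes)), weights=weights, k=n_cells)

schemes = shifted_schemes(ROTATIONS)
share = {"cattle_1": 0.3, "cattle_2": 0.2, "pig": 0.3, "arable": 0.2}  # hypothetical
cells = assign_schemes(schemes, share, n_cells=500)
print(len(schemes), "schemes; first cell uses", schemes[cells[0]])
```

Weighted random sampling reproduces the target shares only approximately for a finite number of cells; a set-up that must match the statistics exactly would need a constrained allocation instead.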
To simulate the trend in the nitrate concentrations<br />
in the groundwater and in the streams, it is<br />
necessary to have information on the history of<br />
the fertiliser application in space and time. In<br />
Denmark, norms and regulations for fertilisation<br />
practice are defined (Plantedirektoratet, 1996)<br />
which regulate the maximum amount of nutrients<br />
allowed for a particular crop depending on the preceding crop<br />
and soil type, and in addition, provide norms<br />
for the lower limit of nitrogen utilisation for<br />
organic fertilisers. It was assumed that the farmers<br />
follow the statutory norms, and that the proportion<br />
of organic fertiliser to the individual crop in a<br />
rotation is proportional to the production of<br />
organic fertiliser in the rotation and to the relative<br />
nitrogen demand of the crop (the fertiliser norm of<br />
the particular crop in relation to the fertiliser norm<br />
of the rotation). Based on estimated application<br />
rates of organic and mineral fertilisers to the individual<br />
crops each year, the Daisy model simulated<br />
time series of nitrate leaching from the root zone<br />
for each agricultural grid. The MIKE SHE model<br />
then routed these fluxes further through the<br />
unsaturated zone and in the groundwater layers<br />
accounting for dispersion and dilution processes<br />
and finally into the Karup stream where the integrated<br />
load from the entire catchment was estimated.<br />
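One possible reading of the allocation rule described above is sketched below: the organic nitrogen produced in a rotation is shared among its crops in proportion to each crop's fertiliser norm relative to the rotation total, and mineral fertiliser then tops each crop up to its norm. The norm values, the utilisation fraction of 0.45 and the top-up formula are assumptions for illustration, not values or rules taken from Plantedirektoratet (1996).

```python
# Sketch of the fertiliser allocation under the stated assumptions.

def allocate_fertiliser(norms_kg_n_ha, organic_total_kg_n_ha, utilisation=0.45):
    """Split organic N over crops by relative demand; top up with mineral N.

    norms_kg_n_ha: dict crop -> fertiliser norm (kg N/ha), hypothetical values.
    organic_total_kg_n_ha: organic N produced in the rotation, summed over crops.
    utilisation: assumed fraction of organic N counted against the norm.
    """
    rotation_norm = sum(norms_kg_n_ha.values())
    plan = {}
    for crop, norm in norms_kg_n_ha.items():
        # Share of organic N proportional to the crop's relative nitrogen demand.
        organic = organic_total_kg_n_ha * norm / rotation_norm
        # Mineral N covers the part of the norm not met by utilised organic N.
        mineral = max(0.0, norm - utilisation * organic)
        plan[crop] = {"organic": organic, "mineral": mineral}
    return plan

norms = {"winter wheat": 160.0, "spring barley": 120.0, "grass": 240.0}  # hypothetical
plan = allocate_fertiliser(norms, organic_total_kg_n_ha=260.0)
for crop, rates in plan.items():
    print(crop, round(rates["organic"], 1), round(rates["mineral"], 1))
```

The resulting application rates per crop and year are the kind of input from which Daisy would then simulate leaching from the root zone.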
The parameterisation of the Daisy model is<br />
adopted from previous studies. The basic<br />
processes and standard parameter values were<br />
originally assessed from results of Danish agricultural<br />
field experiments (Hansen et al., 1990). Since<br />
then the process description and standard parameters<br />
have only been subject to minor modifications<br />
in connection with model tests against data<br />
from The Netherlands, Germany, Denmark and<br />
Slovakia (Hansen et al., 1991; Jensen et al., 1994,<br />
1996, 1997; Svendsen et al., 1995). Hence, the<br />
parameters related both to evapotranspiration/<br />
water balance processes and to the nitrogen transformation<br />
processes have, except for the soil parameters<br />
described in Section 3.2.3, been taken as<br />
the standard values. More details on the parameter<br />
values, their assessed uncertainties and results<br />
from the Daisy simulations are provided in<br />
Hansen et al. (1999).<br />
3.2.6. Boundary and initial conditions<br />
In addition to precipitation and groundwater<br />
abstraction rates the following boundary conditions<br />
are used:<br />
• The area included in the catchment is by definition<br />
a hydrological catchment delineated on the basis of topography.<br />
Thus a zero-flux boundary is used along the catchment<br />
boundaries, also for the aquifer layers. The<br />
bottom of the model is considered impermeable.<br />
• For all upstream river ends a zero-flux boundary<br />
condition is applied. For the downstream end, a<br />
constant water level was applied.<br />
The most important initial conditions are the moisture<br />
content in the unsaturated zone and the elevation<br />
of the groundwater table. The initial soil moisture<br />
content was assumed equal to field capacity, while<br />
the initial groundwater tables were assumed equal to<br />
the groundwater tables after a seven-year simulation<br />
period with guessed initial conditions. The model was<br />
run for seven years (1987–1993). In order to reduce<br />
the importance of uncertain initial conditions, the two<br />
first years were considered as a ‘warming-up period’<br />
and the last five years were considered the simulation<br />
period.
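The split between warming-up period and simulation period amounts to discarding the first two calendar years of output before any statistics are computed. A minimal sketch, using a hypothetical placeholder series with one value per year, is:

```python
from datetime import date

def simulation_period(series, warmup_years):
    """Drop the first `warmup_years` calendar years from a {date: value} series."""
    first_year = min(d.year for d in series)
    start_year = first_year + warmup_years
    return {d: v for d, v in series.items() if d.year >= start_year}

# Hypothetical placeholder output: one value per year for the 1987-1993 run.
run = {date(year, 1, 1): float(year) for year in range(1987, 1994)}
kept = simulation_period(run, warmup_years=2)
print(sorted(d.year for d in kept))
```

For the run described here this leaves the five years 1989 to 1993, which are the years reported in the water balance tables.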
Table 2<br />
Water balance in mm/year for the Karup catchment at station 20.05 Hagebro (518 km²)<br />
Year | Precipitation | River flow, model 1 km grid | River flow, model 2 km grid | River flow, model 4 km grid | River flow, observed<br />
1989 | 812 | 428 | 392 | 353 | 460<br />
1990 | 1020 | 496 | 518 | 512 | 476<br />
1991 | 863 | 446 | 441 | 424 | 449<br />
1992 | 892 | 499 | 531 | 527 | 437<br />
1993 | 835 | 434 | 425 | 405 | 432<br />
Average | 884 | 460 | 461 | 444 | 451<br />
3.3. Odense model<br />
The same procedure as outlined above for the<br />
Karup model was followed. The two main differences<br />
as compared to the Karup catchment are<br />
that the topsoil belongs to more fine-textured<br />
classes with lower hydraulic conductivities and<br />
that the aquifer having groundwater abstraction<br />
is confined in the Odense catchment. This results<br />
in an assumption that the covering sediments are<br />
less permeable than the aquifer material. As no<br />
direct information on these confining sediments<br />
is given in EC (1982) the hydraulic properties of<br />
the soil in the root zone are assumed valid. This<br />
implies in practice that recharge rates to the<br />
aquifer are lower than in the Karup catchment<br />
and that the horizontal flow towards the drains<br />
and the river system is correspondingly larger. A<br />
similar geological geometry as in the Karup<br />
catchment is assumed, i.e. the upper less<br />
permeable, confining layer is assumed to have a<br />
thickness of 15 m and the reduction front is<br />
assumed to be located in the lower aquifer, 3 m<br />
below this confining layer.<br />
4. Results<br />
To test the model performance a number of validation<br />
tests were carried out for both catchments. Validation<br />
is here defined as substantiation that a site<br />
specific model performs simulations at a satisfactory<br />
level of accuracy. Hence, no universal validity of the<br />
general model code is tested nor claimed. In Tables 2<br />
and 3 and Figs. 5–8 results are shown for model grid<br />
sizes 1, 2 and 4 km and for the Karup catchment additionally<br />
for both the distributed and uniform upscaling<br />
procedures. The validation tests described below<br />
consider only the 1 km grid model runs, while the remaining<br />
results are discussed further below in the section<br />
dealing with scaling effects.<br />
4.1. Karup catchment<br />
The Karup model (1 km grid) was validated by<br />
comparison of model simulations and field data on<br />
the following aspects:<br />
• Annual water balances. Table 2 shows the annual<br />
water balances for the five-year simulation period<br />
together with the observed annual discharge. The<br />
Table 3<br />
Water balance in mm/year for the Odense catchment at station 45.21 Ejby Mølle (536 km²)<br />
Year | Precipitation | River flow, model 1 km grid | River flow, model 2 km grid | River flow, model 4 km grid | River flow, observed<br />
1989 | 649 | 220 | 177 | 187 | 181<br />
1990 | 943 | 349 | 351 | 394 | 299<br />
1991 | 760 | 312 | 291 | 308 | 265<br />
1992 | 770 | 308 | 306 | 332 | 243<br />
1993 | 906 | 334 | 329 | 353 | 306<br />
Average | 805 | 305 | 291 | 315 | 259
Fig. 5. Comparison of the recorded discharge hydrograph for the Karup catchment with simulations based on 1, 2 and 4 km grids. The two<br />
simulated curves correspond to the combined upscaling/aggregation procedure (Distributed) and the simpler upscaling procedure (Uniform).<br />
simulated and observed hydrographs are shown in<br />
Fig. 5.<br />
• Nitrate concentrations in the upper groundwater<br />
layer. Simulated values are compared to observed<br />
values from 35 wells in terms of statistical distributions<br />
over the aquifer (Fig. 6).<br />
The main findings from these validation tests can be<br />
summarised as follows:<br />
• The annual water balance is simulated remarkably<br />
well. Thus the simulated and recorded flows, which<br />
also reflect the annual groundwater recharges in<br />
this area, differ by only 2% as average values over<br />
the five-year simulation period (Table 2).<br />
• The variation of the river runoff over the year is<br />
relatively well described, although not nearly as<br />
well as the long-term average water balance<br />
(Fig. 5). The model generally underestimates the<br />
runoff in the summer periods (low flows) and overestimates<br />
the winter flow. There may be many<br />
reasons for this. The most important is probably<br />
that the observed groundwater levels and dynamics<br />
are poorly reproduced by the model. The runoff<br />
from the Karup catchment is dominated by drainage<br />
flow and baseflow components. Thus a good<br />
simulation of groundwater levels and dynamics is<br />
required in order to produce a good runoff simulation.<br />
An improved simulation of groundwater<br />
levels and dynamics requires that the model<br />
includes, in particular, spatial variations of the<br />
transmissivity of the aquifer, which is not possible<br />
based on the available input data.<br />
• The nitrate concentrations simulated by the model<br />
are seen to match the observed data remarkably<br />
well, both with respect to average concentrations<br />
and statistical distribution of concentrations within<br />
the catchment. It may be noticed that the critical<br />
NO 3 concentration level of 50 mg/l (maximum<br />
admissible concentration according to drinking<br />
water standards) is exceeded in about 60% of the<br />
area.<br />
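The comparison in terms of statistical distributions can be illustrated with a small sketch that computes a cumulative area distribution of concentrations and the fraction of the area above the 50 mg/l limit. The concentration values below are hypothetical and equal grid cell areas are assumed; they are not model output from the study.

```python
# Sketch: area fraction exceeding a concentration limit, assuming equal cell areas.

def exceedance_fraction(concentrations_mg_l, threshold_mg_l=50.0):
    """Fraction of grid cells whose concentration exceeds the threshold."""
    values = list(concentrations_mg_l)
    return sum(c > threshold_mg_l for c in values) / len(values)

def empirical_cdf(concentrations_mg_l):
    """Sorted values paired with the cumulative area fraction."""
    values = sorted(concentrations_mg_l)
    n = len(values)
    return [(c, (i + 1) / n) for i, c in enumerate(values)]

simulated = [0, 5, 12, 40, 55, 60, 72, 80, 85, 95]  # hypothetical values, mg/l
print("Fraction above 50 mg/l:", exceedance_fraction(simulated))
for concentration, fraction in empirical_cdf(simulated):
    print(concentration, fraction)
```

The same two functions can be applied to the observed well concentrations, so that simulated and observed distribution curves are compared on an equal footing.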
4.2. Odense catchment<br />
The Odense model (1 km grid) was validated by
Fig. 6. Comparison of the statistical distribution of nitrate concentrations in groundwater for the Karup catchment predicted by the model with<br />
1, 2 and 4 km grids and observed in 35 wells. The upper figure corresponds to the upscaling/aggregation procedure resulting in a distributed<br />
representation of agricultural crops, while the lower figure is from the run with the upscaling procedure, where all the agricultural area is<br />
represented by one uniform crop.<br />
comparison of model simulations and field data on the<br />
following aspects:<br />
• Annual water balances. Table 3 shows the annual<br />
water balances for the five-year simulation period<br />
together with the observed annual discharge. The<br />
simulated and observed hydrographs are shown in<br />
Fig. 7.<br />
• Nitrate concentrations in the upper groundwater<br />
layer. Simulated values are compared<br />
to observed values from 42 wells in terms<br />
of statistical distributions over the aquifer<br />
(Fig. 8).
Fig. 7. Discharge hydrographs for Odense catchment simulated with 1, 2 and 4 km grids.<br />
The main findings from these validation tests are:<br />
• The annual water balance is simulated reasonably<br />
well, although not with the same accuracy as for<br />
the Karup catchment. Thus the simulated and<br />
recorded flows differ by 18% for the 1 km grid<br />
model as average values over the five-year simulation<br />
period (Table 3). A comparison with another<br />
model study for this area reveals that one of the<br />
reasons for this deviation is uncertainty (error) in<br />
the catchment delineation in the flat downstream<br />
part of the catchment. Another reason may be that<br />
Fig. 8. Comparison of the statistical distribution of nitrate concentrations in groundwater for the Odense catchment predicted by the model with<br />
1, 2 and 4 km grids and observed in 42 wells.
the soil hydraulic conductivity functions and the<br />
soil water retention curves that significantly affect<br />
the evapotranspiration are not very accurately<br />
determined. These inaccuracies may originate<br />
either from non-representative soil texture data in<br />
the 1 km × 1 km GISCO database or from errors<br />
introduced by use of the pedo-transfer functions.<br />
• The variation of the river runoff over the year is<br />
relatively well described, although the simulated winter<br />
peaks are too small and the simulated summer<br />
low flows too high, reflecting that some of the<br />
internal hydrological processes may not be simulated<br />
correctly.<br />
• The distribution of groundwater concentrations by<br />
the end of the simulation period is seen not to<br />
compare very well to the observations from 42<br />
wells. Thus, in 80% of the observation wells no<br />
nitrate was found, whereas the model simulates<br />
zero concentration in only 25% of the area. With<br />
respect to the critical concentration value of 50 mg/<br />
l, the observations indicate that such high concentrations<br />
are not found in the area, while the model<br />
simulates such concentrations to exist in about 5%<br />
of the catchment area. The main reason for this<br />
disagreement is most likely that in reality the<br />
nitrate is in most of the area reduced (disappears)<br />
in the confining sediments overlying the aquifer.<br />
This is not simulated by the model, because the<br />
reduction front was assumed to be located within<br />
the aquifer, while analysis of local geological data<br />
reveals that in reality it is located in the upper<br />
confining layer over most of the aquifer.<br />
• It is noticed that both the observed and the simulated<br />
nitrate concentrations are significantly lower in the Odense<br />
catchment than in the Karup catchment. The main reason for this is that<br />
the different soil properties and the smaller number of<br />
animals result in lower nitrate leaching from the<br />
root zone in the Odense catchment.<br />
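The average deviations quoted for the two catchments (2% for Karup and 18% for Odense with the 1 km grid) can be checked against the five-year averages in Tables 2 and 3. The sketch below assumes that the flow columns in both tables are ordered as 1 km, 2 km and 4 km model followed by the observed flow, which is the reading consistent with the percentages given in the text.

```python
# Five-year average river flows in mm/year from Tables 2 and 3.
# Column order (model 1 km, 2 km, 4 km, then observed) is an assumption.
AVERAGES = {
    "Karup": {"1 km": 460, "2 km": 461, "4 km": 444, "observed": 451},
    "Odense": {"1 km": 305, "2 km": 291, "4 km": 315, "observed": 259},
}

def relative_deviation(simulated, observed):
    """Relative deviation of simulated from observed flow."""
    return (simulated - observed) / observed

for catchment, flows in AVERAGES.items():
    for grid in ("1 km", "2 km", "4 km"):
        deviation = relative_deviation(flows[grid], flows["observed"])
        print(f"{catchment} {grid}: {100 * deviation:+.1f}%")
```

Under this reading, the simulated flow exceeds the observed flow for the 1 km grid in both catchments.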
4.3. Scaling effects<br />
The results of running the Karup and Odense<br />
models with different computational grid sizes (1, 2<br />
and 4 km) are shown in Tables 2 and 3 for annual water<br />
balances and in Figs. 5 and 7 for discharge hydrographs.<br />
Further, the results in terms of groundwater<br />
concentrations are shown in Figs. 6 and 8. From<br />
these results the following findings emerge:<br />
• The simulated annual runoff is almost identical and<br />
thus independent of grid sizes. A reason for some<br />
of the small differences is that the catchment areas<br />
in the 1, 2 and 4 km models are not quite identical.<br />
Thus, the root zone processes responsible for<br />
generating the evapotranspiration and consequently<br />
the runoff do not appear to be scale<br />
dependent as long as the statistical properties of<br />
the soil and vegetation types are preserved, which<br />
is the case with the upscaling/aggregation procedure<br />
used in this case.<br />
• The hydrograph shape differs significantly for the<br />
three grid sizes. For the Karup model, the simulation<br />
with 1 km grid reproduces the low flow conditions<br />
reasonably well, whereas the 2 and 4 km grids<br />
have a rather poor description of the baseflow<br />
recession in general and the low flow conditions<br />
in particular. For the Odense model, the simulation<br />
with the 1 km grid shows too large baseflows<br />
during the low flow season, while the 2 km grid<br />
model has the right level and the 4 km grid<br />
model simulates less low flow than observed.<br />
This indicates that there are significant scale effects<br />
on the stream–aquifer interaction that are not properly<br />
described in the present upscaling/aggregation<br />
procedure.<br />
• The nitrate concentrations in the groundwater are<br />
not clearly influenced by the grid size for the<br />
Karup catchment, while there appears to be some<br />
effect for the Odense catchment. The reason for<br />
this difference is related to the different hydrogeological<br />
situations in the two catchments. In the<br />
Karup catchment the groundwater table is generally<br />
located a couple of metres below terrain<br />
surface and the horizontal flows take place in<br />
both the Quaternary and the Miocene sediments.<br />
Hence for all of the 1, 2 and 4 km grid models, the<br />
main part of the horizontal groundwater flow takes<br />
place in the approximately 15 m of the aquifer located<br />
above the reduction front, and only a relatively<br />
small proportion of the flow lines cross the reduction<br />
front, below which the nitrate disappears. In<br />
the Odense catchment, the horizontal groundwater<br />
flows take place almost exclusively in the lower<br />
aquifer, of which only the upper 3 m is located
above the reduction front. This implies that a large<br />
part of the groundwater flow crosses the reduction<br />
front on its route from the infiltration zones in<br />
the hilly areas towards the discharge zones near the<br />
river. As the size of the grid influences the smoothness<br />
of the aquifer geometry, the grid size will<br />
significantly influence the number of flow lines<br />
crossing the reduction front and hence the nitrate<br />
concentrations. Such a scaling effect on geological<br />
conditions is not accounted for in the present<br />
upscaling/aggregation procedure.<br />
Further, for evaluating the importance of the<br />
combined upscaling/aggregation method (‘distributed’)<br />
a model run has been carried out for the Karup<br />
catchment with another upscaling method. This alternative<br />
method is based on upscaling of soil/crop types<br />
all the way from point scale to catchment scale. This<br />
implies that all the agricultural area is described by<br />
one representative (‘uniform’) crop instead of the 17<br />
cropping patterns used in the ‘distributed’ method.<br />
This representative crop has been assumed to have<br />
the same characteristics as the dominant crop, namely<br />
winter wheat, and further to be fertilised by the same<br />
total amount of the organic manure as in the other<br />
simulations, supplemented by some mineral fertiliser<br />
up to the nitrate amount prescribed in the norms<br />
defined by Plantedirektoratet (1996).<br />
The results are illustrated in Figs. 5 and 6 by the<br />
legend denoted ‘uniform’. The effects on the<br />
discharge hydrographs (Fig. 5) are seen to be negligible,<br />
indicating that the dominant crop (by chance) has<br />
evapotranspiration characteristics similar to the sum<br />
of the different crops weighted according to their<br />
actual occurrence. The nitrate concentrations in<br />
groundwater (Fig. 6) show some differences in terms<br />
of a lower average concentration and a less smooth<br />
areal distribution as compared to the distributed agricultural<br />
representation. Thus, in case of the ‘uniform’<br />
representation the nitrate concentrations fall in two<br />
main groups. Around 30% of the area, corresponding<br />
to the natural areas with no nitrate leaching, has<br />
concentrations between 0 and 20 mg/l, while the<br />
remaining 70%, corresponding to the agricultural<br />
area with the ‘uniform’ crop, has concentrations<br />
between 70 and 90 mg/l. In the ‘distributed’ agricultural<br />
representation the areal distribution curve is<br />
much smoother in accordance with the measured data.<br />
5. Discussion and conclusions<br />
Two prerequisites are required for performing large<br />
scale simulations of nitrate leaching on an operational<br />
basis: firstly access to readily available global (or in<br />
the present case European) databases, and secondly an<br />
adequate scaling enabling suitable models to be<br />
applied at a larger scale than the field scales for<br />
which they usually have been proven valid. A key<br />
challenge as compared to the experiences reported<br />
in the literature is then how to make use of a physically<br />
based model at large scale without the possibility of<br />
detailed calibration at that scale, when we know that<br />
its physically based equations are developed for small<br />
scales. Such a model can only be stated as well proven<br />
for small scales, and the few attempts made so far to<br />
use it on scales above 1000 km 2 have applied calibration<br />
at that scale (Refsgaard et al. 1998b, 1992; Jain et<br />
al., 1992).<br />
5.1. Data availability<br />
From the experiences gathered and the lessons<br />
learnt with regard to availability of European data<br />
bases the following conclusions can be drawn:<br />
• Not all of the existing “European” databases are<br />
generally applicable due to various restrictions<br />
(e.g. copyright, not open to other projects, pointers<br />
only).<br />
• Not all databases maintained by international institutions<br />
contain harmonised and integrated data<br />
sets. Many databases in fact only contain a collection<br />
of national data sets that are neither integrated<br />
in one seamless data set, nor harmonised in their<br />
contents or nomenclatures.<br />
• Not all input data requirements could be satisfied<br />
from GIS (spatial) data sets, so tables and paper<br />
maps are needed to supplement the information.<br />
However, the available data are often too coarse<br />
in scale (e.g. EU statistics at a higher administrative<br />
unit than needed) or too specific (e.g. transfer<br />
functions for natural soils only but not for agricultural<br />
soils).<br />
• Use of national data sets is to some extent necessary,<br />
with restrictions to data quality and origin.<br />
• The search for data sets could have been largely<br />
improved by the existence of a European spatial
data clearinghouse and the association of the<br />
available data sets with meta information.<br />
It is noted that in spite of comprehensive efforts<br />
made during recent years for assessing spatial data<br />
by use of advanced remote sensing technology the<br />
only data in the “European” databases which<br />
originate from remote sensing data are the<br />
CORINE land cover data, which were useful for<br />
distinguishing between natural, urban and agricultural<br />
areas, but which did not contain any further<br />
information about agricultural crops of importance in<br />
the present context.<br />
In spite of the above limitations, the attempts in<br />
the present study to identify suitable data sources<br />
at the European scale have shown that useful data<br />
are available at that scale for most of the required<br />
model input data. Although these data require<br />
some kind of transformation, as e.g. pedo-transfer<br />
functions, the data appear adequate for overall<br />
model simulations at this scale. However, some<br />
gaps exist in the European level databases. Thus,<br />
for the following data it was necessary to use<br />
national data sources:<br />
• Meteorological data on a daily basis.<br />
• Soil organic matter from arable land.<br />
• Agricultural statistics.<br />
• Agricultural practices.<br />
These data were all easily available at a national<br />
scale, and hence their availability is not expected to<br />
pose significant constraints for large scale modelling<br />
in other parts of Europe.<br />
The most critical data that may cause problems in<br />
terms of availability at larger scale are the geological<br />
data, for which no global (or European) digital database<br />
apparently exists. The present case study relied<br />
heavily on an EC report produced by the Danish<br />
Geological Survey. The information in this report<br />
proved adequate for the present purpose, although<br />
the lack of geochemical information turned out to<br />
have some importance for one of the two catchments.<br />
Similar readily available EC reports exist for other<br />
countries, but they appear to be non-standardised<br />
and comprise information at a variable level of details.<br />
Hence, the positive conclusions from using the geological<br />
data in EC (1982) for Denmark cannot<br />
necessarily be generalised.<br />
5.2. Parameter assessment—no calibration<br />
An important element of the present methodology<br />
is the principle not to carry out any calibration. The<br />
parameter values were assessed in three different<br />
ways:<br />
• Directly from the available data, e.g. topography<br />
and geology.<br />
• Indirectly from the available data through application<br />
of predefined transfer functions, e.g. the soil<br />
hydraulic parameters.<br />
• Use of standard parameter values that have been<br />
assessed in previous studies on other locations.<br />
While the first two methods can be characterised as<br />
fully objective and transparent, it may be argued that<br />
there always will be some elements of subjective<br />
assessment hidden in the use of standard parameter<br />
values and that the possible calibration exercises in<br />
previous studies may question the “no calibration”<br />
statement.<br />
In the present case the standard parameters originate<br />
from two model codes and associated accumulated<br />
experiences:<br />
• Parameters in the MIKE SHE part. The standard<br />
parameter used here is the time constant for routing<br />
of groundwater to drains (50 days). From comprehensive<br />
hydrological modelling experience on<br />
dozens of Danish catchments starting with<br />
Refsgaard and Hansen (1982) this value can be<br />
characterised as a typical value. It is not the optimal<br />
value that would be estimated in a calibration<br />
for either of the two catchments. Thus, for<br />
instance, the calibrated value for Karup was estimated<br />
in Refsgaard (1997) to be 33 days.<br />
• Parameters in the Daisy part. The standard parameters<br />
used here are the ones controlling the vegetation<br />
part of the evapotranspiration and the<br />
nitrogen turnover processes in the root zone.<br />
These parameters are essential both for the water<br />
balance and the nitrogen concentrations. Daisy<br />
has standard parameter values that can be used if no calibration<br />
is possible (or desirable). These standard<br />
parameter values have originally been assessed<br />
from agricultural field experiments on plot scales<br />
(Hansen et al., 1990). Since then the process descriptions<br />
and associated standard parameter values
have only been subject to minor adjustments<br />
through a number of additional tests on new data<br />
sets from different countries. It should be emphasised<br />
that Daisy has not previously been calibrated<br />
on the Karup and Odense catchments. These two<br />
catchments, and in particular the Karup catchment,<br />
have been subject to modelling studies which have<br />
included calibration of the water balance (evapotranspiration)<br />
parameters. However, in the<br />
previous studies of the Karup catchment (Styczen<br />
and Storm, 1993) and (Refsgaard, 1997) the water<br />
balance in the root zone was simulated by MIKE<br />
SHE, which is not the case in the present study. As<br />
the process descriptions for evapotranspiration in<br />
MIKE SHE and Daisy are fundamentally different,<br />
the Daisy standard parameters used in the present<br />
study, have not been affected at all by the previous<br />
MIKE SHE studies in the same catchment.<br />
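To illustrate what the difference between the standard time constant (50 days) and the calibrated Karup value (33 days) means for recession behaviour, the sketch below treats the drain routing as a single linear reservoir without recharge. This is a simplifying assumption made here for illustration; it is not a description of how MIKE SHE implements drainage routing.

```python
import math

def recession(q0, tau_days, t_days):
    """Outflow of a linear reservoir with no recharge: Q(t) = Q0 * exp(-t / tau)."""
    return q0 * math.exp(-t_days / tau_days)

def half_life(tau_days):
    """Time in days for the outflow to fall to half its initial value."""
    return tau_days * math.log(2.0)

for tau in (50.0, 33.0):
    print(f"tau = {tau:.0f} d: half-life {half_life(tau):.1f} d, "
          f"Q after 90 d = {recession(1.0, tau, 90.0):.2f} x Q0")
```

Under this assumption a shorter time constant gives a faster recession, which is one way a single parameter can alter the hydrograph shape while leaving the annual water balance largely unchanged.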
Thus although it may correctly be argued that the<br />
standard model parameters are results of previous<br />
studies where calibration was carried out, the specific<br />
parameters used in the present study have not been<br />
subject to, and are not results of, calibration in either<br />
the Karup or the Odense catchment.<br />
In our opinion, one of the strengths of physically<br />
based models is the possibility to assess many parameter<br />
values from standard values, achieved from<br />
experience through a number of other applications.<br />
We think that the results of the present study show<br />
both this strength and some of the limitations in this<br />
respect. Thus on the one hand, the key results in terms<br />
of annual runoff and nitrogen concentration distributions<br />
are encouraging, while on the other hand Figs. 5<br />
and 7 clearly illustrate that it would be very easy to<br />
obtain a better hydrograph fit through calibration of a<br />
couple of parameter values.<br />
When parameter values are assessed in this way<br />
they inevitably are subject to considerable uncertainty,<br />
which again will generate significant uncertainty<br />
in model results. It is therefore highly relevant<br />
to conduct uncertainty analyses in order to assess<br />
whether the resulting uncertainty becomes so large<br />
that the model results are not of any use for water<br />
management in practice. A methodology and some<br />
results of such uncertainty analyses are provided in<br />
Hansen et al. (1999) for the root zone processes and in<br />
Refsgaard et al. (1998a) for the catchment processes.<br />
5.3. Upscaling<br />
The adopted upscaling methodology is a combination<br />
of upscaling and aggregation. Hence, upscaling in<br />
its traditional definition (Beven, 1995) is used only<br />
from point scale to field scale, where the same equations<br />
are assumed valid and where ‘effective’ parameter<br />
values are used. The parameter values<br />
estimated through pedo-transfer functions (soil data)<br />
and the vegetation parameters representing the different<br />
crops are assumed valid at field scale. Subsequently,<br />
an aggregation procedure is used to<br />
represent catchment scale conditions with regard to<br />
soil and vegetation types. This aggregation procedure<br />
is in full agreement with the findings made regarding<br />
the apparent existence of a threshold area (REA)<br />
above which “… spatial patterns of dominant process<br />
controls can be represented by their statistical distribution<br />
functions” (Famiglietti and Wood, 1995).<br />
This theoretical consideration is supported empirically<br />
by the model results, which show that the annual<br />
catchment runoff can be simulated well, even when<br />
using different model grid sizes. For the Karup catchment,<br />
where the nitrate reduction in the aquifer does<br />
not appear to have influenced the results adversely,<br />
even the statistical distribution of nitrate concentrations<br />
is simulated well.<br />
For simulation of annual runoff and nitrate concentration<br />
distributions, both of which are affected<br />
primarily by root zone processes, the impact of<br />
changes of scale is thus relatively small. In contrast<br />
to this, the impact on hydrograph shape is consistently<br />
rather large. This finding, which was also documented<br />
earlier in Refsgaard (1997), indicates that the applied<br />
upscaling/aggregation procedure has important<br />
limitations with regard to describing the stream–aquifer<br />
interactions. Thus in summary, upscaling of<br />
processes described by vertical, non-correlated, but<br />
patchy, columns is successful, while the upscaling<br />
fails in case of processes where horizontal flows<br />
between grids dominate. The differences in hydrograph<br />
shapes caused by the differences in grid sizes<br />
illustrate how careful a model user has to be when<br />
changing grid size. In our opinion it is not relevant<br />
to talk about an ‘optimal’ scale for hydrograph simulation.<br />
The important point is rather that the present<br />
methodology is scale dependent with regard to hydrograph<br />
simulation; hence a change of scale (grid size)
J.C. Refsgaard et al. / Journal of Hydrology 221 (1999) 117–140 137<br />
generates a need for recalibration of parameters<br />
responsible for baseflow recession and low flow simulation.<br />
An alternative, and commonly used, upscaling<br />
procedure, where upscaling is used all the way from<br />
point scale to catchment scale by selecting the dominant<br />
crop type in each grid, resulted in one uniform<br />
crop representing all the agricultural area. Results<br />
indicate that whereas this uniform upscaling procedure<br />
may be sufficient for simulating annual water<br />
balance and discharge hydrographs, it is not satisfactory<br />
for simulation of nitrate leaching and groundwater<br />
concentrations. This is in agreement with<br />
Beven (1995) who states that upscaling from small<br />
scales to larger scales using effective parameter values<br />
cannot be assumed to be generally adequate.<br />
An inherent limitation of the applied upscaling/<br />
aggregation method is that it does not preserve the<br />
georeferenced location of simulated concentrations,<br />
but only their statistical distribution over the catchment<br />
area. Therefore, comparisons with field data<br />
make no sense on a well by well or subcatchment<br />
by subcatchment basis, and no information on the<br />
actual location of the simulated “hot spots” within<br />
the catchment is possible. If, from a management<br />
point of view, a more detailed spatial resolution of the<br />
model predictions is required, then the same<br />
upscaling method has to be carried out at a finer<br />
scale with all the statistical input data being supplied<br />
on a subcatchment basis. This is in principle straightforward,<br />
but in reality it may often be limited by data<br />
availability.<br />
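Because only the statistical distribution of concentrations is preserved, a comparison with field data has to be made between distributions rather than location by location. The following is a minimal sketch of such a comparison, assuming (this is not specified in the paper) that the largest gap between the two empirical cumulative distribution functions is used as the measure; the function names and the concentration values are hypothetical.<br />

```python
import bisect


def empirical_cdf(values):
    """Return a function giving the fraction of values <= x."""
    ordered = sorted(values)
    n = len(ordered)

    def cdf(x):
        return bisect.bisect_right(ordered, x) / n

    return cdf


def max_cdf_distance(simulated, observed):
    """Largest vertical gap between the two empirical CDFs.

    Assumption: this Kolmogorov-Smirnov-type distance is an illustrative
    choice, not the measure used in the study.
    """
    cdf_sim = empirical_cdf(simulated)
    cdf_obs = empirical_cdf(observed)
    points = sorted(set(simulated) | set(observed))
    return max(abs(cdf_sim(x) - cdf_obs(x)) for x in points)


# Hypothetical nitrate concentrations (mg/l) in wells and in model grids.
observed_wells = [5.0, 12.0, 40.0, 75.0, 110.0]
simulated_grids = [110.0, 5.0, 75.0, 12.0, 40.0]  # same values, other locations

# The distance ignores where each value occurs and looks only at how often.
distance = max_cdf_distance(simulated_grids, observed_wells)
```

In this sketch two data sets with the same values at different locations are indistinguishable, which mirrors the limitation described above: the distribution can be checked, the position of individual "hot spots" cannot.<br />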
A critical assumption in the upscaling procedure is<br />
the application of the point scale equations at the field<br />
scale with effective parameters. This corresponds to<br />
interpreting the field as a single equivalent soil<br />
column using effective hydraulic parameters. This<br />
approach was evaluated on two Danish experimental<br />
0.25 ha plots, a coarse sandy soil and sandy loam,<br />
using the Daisy model (Djurhuus et al., 1999). The<br />
two plots were monitored with respect to soil water<br />
content and nitrate in soil water at several depths at 57<br />
points, where also texture, soil water retention and<br />
hydraulic conductivity functions had been measured.<br />
The conclusion from comparing the field measured<br />
data with the model simulations over the experimental<br />
plot, represented by the 57 points, was that the<br />
observed mean nitrate concentrations were matched<br />
well by a simulation using the geometric means as<br />
effective parameters. This conclusion is in agreement<br />
with previous studies for the Danish hydrological regime<br />
(Jensen and Refsgaard 1991a–c; Jensen and Mantoglou,<br />
1992). Other studies from other regimes (Bresler<br />
and Dagan, 1983) conclude that effective soil hydraulic<br />
parameters are not adequate for modelling water<br />
flow in spatially variable fields. The critical issue<br />
determining whether such an approach is feasible or<br />
not may depend on whether Hortonian overland<br />
flow is created in the hydrological regime in question.<br />
Thus, although the upscaling methodology from point<br />
to field scale is far from universally valid, there are<br />
good reasons to believe that this assumption was satisfactorily<br />
fulfilled in the present case studies.<br />
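The idea of replacing the measured points by a single equivalent soil column can be illustrated with a small sketch. It assumes, as stated above, that the geometric mean of the point values serves as the effective parameter; the conductivity values and variable names are hypothetical and are not data from the experimental plots.<br />

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical saturated hydraulic conductivities (mm/h) at a few points.
k_sat_points = [2.0, 8.0, 32.0, 128.0]

# Effective value assigned to the single equivalent soil column.
k_effective = geometric_mean(k_sat_points)

# Arithmetic mean shown only for comparison; it is pulled up by large values.
k_arithmetic = sum(k_sat_points) / len(k_sat_points)
```

For strongly skewed parameters such as hydraulic conductivity, the geometric mean is less dominated by a few very permeable points than the arithmetic mean.<br />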
The spatial patterns, which in subsurface hydrology<br />
are considered to be of significant importance (Wen and<br />
Gómez-Hernández, 1996), have been treated in different<br />
ways with regard to continuous data (parameter<br />
values) and categorical data (soil and vegetation<br />
classes). The effects of spatial autocorrelation of soil<br />
and vegetation parameters within a field have been<br />
assumed incorporated into the ‘effective parameters’,<br />
which in the present case are assessed in a rather crude<br />
way through pedo-transfer functions and use of standard<br />
values. The categorical data have been treated<br />
differently in the aggregation procedure for soil and<br />
vegetation classes. The soil data (one soil type for<br />
Karup and two soil types for Odense) were assessed<br />
from the soil map and assigned at a grid basis so that<br />
the percentage of each soil type within a catchment<br />
was preserved and the individual grids to the largest<br />
possible extent were characterised by the dominant<br />
soil type within the respective grid. For the vegetation<br />
types, the same procedure was applied to initially<br />
distinguish between agricultural and non-agricultural<br />
areas by use of the land cover map. Subsequently, it<br />
was assumed that the spatial distribution of cropping<br />
patterns is random and without spatial autocorrelation.<br />
This is justified by the agricultural management<br />
practice of rotating the crops within the individual<br />
farms.<br />
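The random, uncorrelated distribution of crops over the agricultural grids can be sketched as follows. This is only an illustration under stated assumptions: the function name, the crop names, the area fractions and the largest-remainder rounding are hypothetical choices and not taken from the study.<br />

```python
import random


def assign_crops(n_cells, crop_fractions, seed=0):
    """Give each agricultural grid cell a crop at random while preserving
    the catchment-wide area fraction of every crop."""
    if abs(sum(crop_fractions.values()) - 1.0) > 1e-9:
        raise ValueError("crop fractions must sum to 1")

    # Integer number of cells per crop (largest-remainder rounding).
    raw = {crop: frac * n_cells for crop, frac in crop_fractions.items()}
    counts = {crop: int(value) for crop, value in raw.items()}
    leftover = n_cells - sum(counts.values())
    by_remainder = sorted(raw, key=lambda c: raw[c] - counts[c], reverse=True)
    for crop in by_remainder[:leftover]:
        counts[crop] += 1

    # Shuffle so that neighbouring cells are not spatially correlated.
    cells = [crop for crop, k in counts.items() for _ in range(k)]
    random.Random(seed).shuffle(cells)
    return cells


# Hypothetical crop statistics for the agricultural part of a catchment.
fractions = {"winter wheat": 0.5, "spring barley": 0.3, "grass": 0.2}
grid_crops = assign_crops(100, fractions)
```

The shuffle carries the assumption of no spatial autocorrelation, while the fixed counts preserve the statistical distribution at catchment scale.<br />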
5.4. General applicability of methodology<br />
From the results of the present study it appears that<br />
it is possible to use distributed physically based<br />
models of the same type as the MIKE SHE/Daisy
for catchment scale assessment of nitrate contamination<br />
from agricultural land. It appears obvious that<br />
such model application is straightforward and the<br />
above conclusion is valid for other areas in Denmark.<br />
The interesting question is therefore how general this<br />
conclusion is to other areas in Europe (and on other<br />
continents) and what the scientific and practical<br />
limitations are. In this respect the following considerations<br />
may be noted:<br />
• Except for the geological data, the general availability<br />
of which is somewhat uncertain, there is<br />
no reason to expect that the application of similar<br />
data for other catchments in other European countries<br />
should not be as relatively easy as the application<br />
for the two Danish catchments. Likewise, the<br />
encouraging simulation results of using European<br />
level databases, in spite of their often coarse resolution<br />
and high level of aggregation, may also be<br />
expected for other areas. With regard to geological<br />
data it may be noted that considerable efforts are<br />
being made at most (if not all) national geological<br />
institutes to provide geological data to users in<br />
digital form; hence the limited data availability<br />
that has existed so far is likely to be overcome,<br />
at least nationally, during the coming years.<br />
• The combined aggregation/upscaling procedure<br />
appears valid in many areas. The catchments for<br />
which it was used in the present study were limited<br />
to a maximum of about 500 km². However, the<br />
further upscaling to larger areas poses no fundamental<br />
problems, as it consists of just a larger<br />
number of computational grids. Computationally,<br />
running a model like MIKE SHE/Daisy for an area<br />
of, for instance, 100 000 km² with e.g. 250<br />
subcatchments of 100 grids each is maybe close<br />
to the limit of what is practically feasible today<br />
(a five-year run would require 100 h CPU time on<br />
a Pentium 300 MHz), but this problem will soon<br />
disappear as computers become faster.<br />
• The MIKE SHE/Daisy modelling methodology is<br />
general and applicable to many other areas. Some<br />
limitations, however, are related to special geological<br />
conditions such as karstic flow and fissured<br />
aquifers, which cannot be described explicitly.<br />
Another important limitation is related to the<br />
upscaling procedure from point to field scale,<br />
which may fail in areas where Hortonian overland<br />
flow is a dominant mechanism. In this respect it<br />
should be noted that many areas with dominant<br />
overland flow regimes are mountainous regions<br />
characterised by thin soil layers and steep slopes,<br />
which generally are not regions with important<br />
aquifers.<br />
Hence, it may be concluded that the methodology<br />
can relatively easily be applied to larger areas and<br />
used as a decision support tool for evaluation of legislative<br />
and management measures aiming at reducing<br />
nitrate contamination risks.<br />
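The computational example given in the considerations above can be worked through with simple arithmetic. The sketch below only rearranges the figures quoted in the text; the variable names are hypothetical, and the derived grid size assumes, for illustration only, that all grids are square and of equal area.<br />

```python
import math

# Figures quoted in the text.
area_km2 = 100_000
n_subcatchments = 250
grids_per_subcatchment = 100
cpu_hours = 100
simulated_years = 5

# Total number of computational grids.
n_grids = n_subcatchments * grids_per_subcatchment

# Assumption: equal-sized square grids covering the whole area.
area_per_grid_km2 = area_km2 / n_grids
grid_side_km = math.sqrt(area_per_grid_km2)

# Average CPU time needed to simulate one grid for one year.
cpu_seconds_per_grid_year = cpu_hours * 3600 / (n_grids * simulated_years)
```

Expressing the cost per grid and per simulated year makes it easy to rescale the estimate for another area, another number of grids or a faster processor.<br />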
Acknowledgements<br />
The present work was partly funded by the EC<br />
Environment and Climate Research Programme<br />
(contract number ENV4-CT95-0070). Good ideas<br />
and constructive comments on the manuscript by<br />
Gerard Heuvelink, University of Amsterdam, are<br />
gratefully acknowledged. Further, the constructive criticism<br />
of Marnik Vanclooster, Université Catholique de<br />
Louvain, and an anonymous reviewer is<br />
acknowledged.<br />
References<br />
Abbott, M.B., Bathurst, J.C., Cunge, J.A., O’Connell, P.E., Rasmussen,<br />
J., 1986. An introduction to the European Hydrological<br />
System – Système Hydrologique Européen, SHE, 2: structure of<br />
a physically based distributed modelling system. Journal of<br />
Hydrology 87, 61–77.<br />
Agricultural Statistics, 1995. Danmarks Statistik, 294 pp. (In<br />
Danish).<br />
Arnold, J.G., Williams, J.R., 1995. SWRRB—a watershed scale<br />
model for soil and water resources management. In: Singh,<br />
V.P. (Ed.). Computer Models of Watershed Hydrology, Water<br />
Resources Publication, pp. 847–908.<br />
Arnold, J.G., Williams, J.R., Nicks, A.D., Sammons, N.B., 1990.<br />
SWRRB—A basin scale simulation model for soil and water<br />
resources management, Texas A & M University Press, College<br />
Station 241 pp.<br />
Beasley, D.B., Huggins, L.F., Monke, E.J., 1980. ANSWERS: a<br />
model for watershed planning. Transactions of ASAE 23 (4),<br />
938–944.<br />
Beven, K., 1995. Linking parameters across scales: subgrid parameterizations<br />
and scale dependent hydrological models. Hydrological<br />
Processes 9, 507–525.<br />
Blöschl, G., Sivapalan, M., 1995. Scale issues in hydrological<br />
modelling: a review. Hydrological Processes 9, 251–290.<br />
Bresler, E., Dagan, G., 1983. Unsaturated flow in spatially variable
fields: application of water flow models to various fields II.<br />
Water Resources Research 19, 421–428.<br />
Cosby, B.J., Hornberger, G.M., Clapp, R.B., Ginn, T.R., 1984. A statistical<br />
exploration of relationships of soil moisture characteristics to<br />
the physical properties of soils. Water Resources Research 20,<br />
682–690.<br />
Dagan, G., 1986. Statistical theory of groundwater flow and transport:<br />
pore to laboratory, laboratory to formation, and formation<br />
to regional scale. Water Resources Research 22 (9), 120–134.<br />
DeCoursey, D.G., Rojas, K.W., Ahuja, L.R., 1989. Potentials for<br />
non-point source groundwater contamination analyzed using<br />
RZWQM. Paper No. SW892562, presented at the International<br />
American Society of Agricultural Engineers’ Winter Meeting,<br />
New Orleans, Louisiana.<br />
DeCoursey, D.G., Ahuja, L.R., Hanson, J., Shaffer, M., Nash, R.,<br />
Rojas, K.W., Hebson, C., Hodges, T., Ma, Q., Johnsen, K.E.,<br />
Ghidey, F., 1992. Root zone water quality model, Version 1.0,<br />
Technical Documentation. United States Department of Agriculture,<br />
Agricultural Research Service, Great Plains Systems<br />
Research Unit, Fort Collins, Colorado, USA.<br />
Djurhuus, J., Hansen, S., Schelde, K., Jacobsen, O.H., 1999. Modelling<br />
mean nitrate leaching from spatially variable fields using<br />
effective parameters. Geoderma 87, 261–279.<br />
EC, 1982. Groundwater resources in Denmark. Commission of the<br />
European Communities. EUR 7941 (In Danish).<br />
EC, 1996. Commission proposal for an Action Programme for Integrated<br />
Groundwater Protection and Management, Brussels.<br />
EEA, 1995. Europe’s Environment. The Dobris Assessment. The<br />
European Agency, Copenhagen.<br />
Engesgaard, P., 1996. Multi-species reactive transport modelling.<br />
In: Abbott, M.B., Refsgaard, J.C. (Eds.). Distributed Hydrological<br />
Modelling, Kluwer Academic Publishers, Dordrecht, pp.<br />
71–91.<br />
EU, 1991. Resolution from Ministerial seminar held in The Hague<br />
in November 1991.<br />
Famiglietti, J.S., Wood, E.F., 1995. Effects of spatial variability and<br />
scale on areally averaged evapotranspiration. Water Resources<br />
Research 31 (3), 699–712.<br />
Gelhar, L.W., 1986. Stochastic subsurface hydrology. From theory<br />
to applications. Water Resources Research 22 (9), 135–145.<br />
Hansen, S., Jensen, H.E., Nielsen, N.E., Svendsen, H., 1990. Daisy,<br />
a soil plant system model. NPO-forskning fra Miljøstyrelsen,<br />
Report no. A10. Danish Environmental Protection Agency,<br />
Copenhagen.<br />
Hansen, S., Jensen, H.E., Nielsen, N.E., Svendsen, H., 1991. Simulation<br />
of nitrogen dynamics and biomass production in winter<br />
wheat using the Danish simulation model Daisy. Fertilizer<br />
Research 27, 245–259.<br />
Hansen, S., Thorsen, M., Pebesma, E., Kleeschulte, S., Svendsen,<br />
H., 1999. Uncertainty in simulated leaching due to uncertainty<br />
in input data. A case study. Soil Use and Management.<br />
Heng, H.H., Nikolaidis, N.P., 1998. Modelling of non-point source<br />
pollution of nitrogen at the watershed scale. Journal of the<br />
American Water Resources Association 34 (2), 359–374.<br />
Heuvelink, G.B.M., Pebesma, E.J., 1998. Spatial aggregation and<br />
soil process modelling. Geoderma.<br />
Jain, S.K., Storm, B., Bathurst, J.C., Refsgaard, J.C., Singh, R.D.,<br />
1992. Application of the SHE to catchment in India. Part 2.<br />
Field experiments and simulation studies with the SHE on the<br />
Kolar subcatchment of the Narmada River. Journal of<br />
Hydrology 140, 25–47.<br />
Jensen, C., Stougaard, B., Østergaard, H.S., 1996. The performance<br />
of the Danish simulation model Daisy in prediction of Nmin at<br />
spring. Fertilizer Research 44, 79–85.<br />
Jensen, C., Stougaard, B., Østergaard, H.S., 1994. Simulation of the<br />
nitrogen dynamics in farm land areas in Denmark 1989–1993.<br />
Soil Use and Management 10, 111–118.<br />
Jensen, K.H., Refsgaard, J.C., 1991. Spatial variability of physical<br />
parameters in two fields. Part II: Water flow at field scale.<br />
Nordic Hydrology 22, 303–326.<br />
Jensen, K.H., Refsgaard, J.C., 1991. Spatial variability of physical<br />
parameters in two fields. Part III. Solute transport at field scale.<br />
Nordic Hydrology 22, 327–340.<br />
Jensen, K.H., Refsgaard, J.C., 1991. Spatial variability of physical<br />
parameters in two fields. Part I. Water flow and solute transport<br />
at local scale. Nordic Hydrology 22, 275–302.<br />
Jensen, K.H., Mantoglou, A., 1992. Application of stochastic unsaturated<br />
flow theory, numerical simulations, and comparisons to<br />
field observations. Water Resources Research 28, 269–284.<br />
Jensen, L.S., Mueller, T., Nielsen, N.E., Hansen, S., Crocker, G.J.,<br />
Grace, P.R., Klir, J., Körschens, M., Poulton, P.R., 1997. Simulating<br />
trends in soil organic carbon in long-term experiments<br />
using the soil–plant–atmosphere model DAISY. Geoderma 81<br />
(1–2), 5–28.<br />
Kleeschulte, S., 1998. Assessment of data availability for direct<br />
modelling use at the European scale. In: Refsgaard, J.C.,<br />
Ramaekers, D.A. (Eds.), Assessment of ‘cumulative’ uncertainty<br />
in spatial decision support systems: Application to examine<br />
the contamination of groundwater from diffuse sources.<br />
Final Report, vol. 1, EU contract ENV-CT95-070.<br />
http://projects.gim.lu/uncersdss.<br />
Knisel, W.G. (Ed.), 1980. CREAMS: a field-scale model for<br />
chemicals, runoff, and erosion from agricultural management<br />
systems. US Department of Agriculture, Science,<br />
and Education Administration. Conservation Research Report<br />
no. 26, 643 pp.<br />
Knisel, W.G., Williams, J.R., 1995. Hydrology component of<br />
CREAMS and GLEAMS models. In: Singh, V.P. (Ed.). Computer<br />
Models of Watershed Hydrology, Water Resources Publication,<br />
pp. 1069–1114.<br />
Lamm, C.G., 1971. Det danske jordarkiv (The Danish soil<br />
archive), Tidsskrift for Planteavl, pp. 703–720 (in Danish).<br />
Leonard, R.A., Knisel, W.G., Still, D.A., 1987. GLEAMS: groundwater<br />
loading effects of agricultural management systems.<br />
Transactions of ASAE 30, 1403–1418.<br />
Mangold, D.C., Tsang, C.F., 1991. A summary of subsurface hydrological<br />
and hydrochemical models. Reviews of Geophysics 29<br />
(1), 51–79.<br />
Michaud, J.D., Shuttleworth, W.J., 1997. Executive summary of the<br />
Tucson aggregation workshop. Journal of Hydrology 190, 176–<br />
181.<br />
Person, M., Raffensperger, J.P., Ge, S., Garven, G., 1996. Basin-scale<br />
hydrogeologic modelling. Reviews of Geophysics 34 (1),<br />
61–97.
Plantedirektoratet, 1996. Guidelines and forms 1996/1997. Ministry<br />
for Food, Agriculture and Fishery, 38 pp. (In Danish).<br />
Refsgaard, J.C., 1997. Parameterisation, calibration and validation<br />
of distributed hydrological models. Journal of Hydrology 198,<br />
69–97.<br />
Refsgaard, J.C., Hansen, E., 1982. A distributed groundwater/<br />
surface water model for the Suså catchment. Part 1. Model<br />
description. Nordic Hydrology 13, 299–310.<br />
Refsgaard, J.C., Storm, B., 1995. MIKE SHE. In: Singh, V.P. (Ed.).<br />
Computer Models of Watershed Hydrology, Water Resources<br />
Publication, pp. 809–846.<br />
Refsgaard, J.C., Seth, S.M., Bathurst, J.C., Erlich, M., Storm, B.,<br />
Jørgensen, G.H., Chandra, S., 1992. Application of the SHE to<br />
catchment in India. Part 1. General results. Journal of Hydrology<br />
140, 1–23.<br />
Refsgaard, J.C., Thorsen, M., Jensen, J.B., Hansen, S., Heuvelink,<br />
G., Pebesma, E., Kleeschulte, S., Ramaekers, D., 1998.<br />
Uncertainty in spatial decision support systems—Methodology<br />
related to prediction of groundwater pollution. In: Babovic, V.,<br />
Larsen, L.C. (Eds.), Hydroinformatics ‘98. Proceedings of the<br />
Third International Conference on Hydroinformatics, Copenhagen,<br />
Balkema, 24–26 August 1998, pp. 1153–1159.<br />
Refsgaard, J.C., Sørensen, H.R., Mucha, I., Rodak, D., Hlavaty, Z.,<br />
Bansky, L., Klucovska, J., Topolska, J., Takac, J., Kosc, V.,<br />
Enggrob, H.G., Engesgaard, P., Jensen, J.K., Fiselier, J., Griffioen,<br />
J., Hansen, S., 1998. An integrated model for the Danubian<br />
Lowland—methodology and applications. Water<br />
Resources Management 12, 433–465.<br />
Refsgaard, J.C., Ramaekers, D., Heuvelink, G.B.M., Schreurs, V.,<br />
Kros, H., Rosén, L., Hansen, S., 1998. Assessment of ‘cumulative’<br />
uncertainty in spatial decision support systems: Application<br />
to examine the contamination of groundwater from diffuse<br />
sources (UNCERSDSS). Presented at the European Climate<br />
Science Conference, Vienna, 19–23 October.<br />
Saulnier, G.M., Beven, K., Obled, C., 1997. Digital elevation analysis<br />
for distributed hydrological modelling: Reducing scale<br />
dependence in effective hydraulic conductivity values. Water<br />
Resources Research 33 (9), 2097–2101.<br />
Sellers, P.J., Heiser, M.D., Hall, F.G., Verma, S.B., Desjardins,<br />
R.L., Schuepp, P.M., MacPherson, J.I., 1997. The impact of<br />
using area-averaged land surface properties—topography,<br />
vegetation conditions, soil wetness—in calculations of intermediate<br />
scale (approximately 10 km²) surface-atmosphere<br />
heat and moisture fluxes. Journal of Hydrology 190, 269–301.<br />
Styczen, M., Storm, B., 1993. Modelling of N-movements on catchment<br />
scale—a tool for analysis and decision making. 1. Model<br />
description. 2. A case study. Fertilizer Research 36, 1–17.<br />
Styczen, M., Storm, B., 1995. Modelling of the effects of management<br />
practices on nitrogen in soils and groundwater. In: Bacon,<br />
P.E. (Ed.). Nitrogen Fertilization in the Environment, Marcel<br />
Dekker, New York, pp. 537–564.<br />
Svendsen, H., Hansen, S., Jensen, H.E., 1995. Simulation of crop<br />
production, water and nitrogen balances in two German agroecosystems<br />
using the Daisy model. Ecological Modelling 81,<br />
197–212.<br />
Thorsen, M., Feyen, J., Styczen, M., 1996. Agrochemical modelling.<br />
In: Abbott, M.B., Refsgaard, J.C. (Eds.). Distributed<br />
Hydrological Modelling, Kluwer Academic Publishers,<br />
Dordrecht, pp. 121–141.<br />
UNCERSDSS, 1998. Assessment of cumulative uncertainty in<br />
Spatial Decision Support Systems: Application to examine the<br />
contamination of groundwater from diffuse sources<br />
(UNCERSDSS). EU contract ENV4-CT95-070. Final Report,<br />
available on http://projects.gim.lu/uncersdss.<br />
Vanclooster, M., Viaene, P., Christians, K., 1994. WAVE—a mathematical<br />
model for simulating agrochemicals in the soil and<br />
vadose environment. Reference and user’s manual (release<br />
2.0). Institute for Land and Water Management, Katholieke<br />
Universiteit Leuven, Belgium.<br />
Vanclooster, M., Viaene, P., Diels, J., Feyen, J., 1995. A deterministic<br />
validation procedure applied to the integrated soil crop<br />
model. Ecological Modelling 81, 183–195.<br />
Vereecken, H., Vanclooster, M., Swerts, M., Diels, J., 1991. Simulating<br />
nitrogen behaviour in soil cropped with winter wheat.<br />
Fertilizer Research 27, 233–243.<br />
Wen, X.-H., Gómez-Hernández, J.J., 1996. Upscaling hydraulic<br />
conductivities in heterogeneous media: An overview. Journal<br />
of Hydrology 183, ix–xxxii.<br />
Wood, E.F., Sivapalan, M., Beven, K.J., Band, L., 1988. Effects of<br />
spatial variability and scale with implications to hydrologic<br />
modelling. Journal of Hydrology 102, 29–47.<br />
Wood, E.F., Sivapalan, M., Beven, K., 1990. Similarity and scale in<br />
catchment storm response. Reviews of Geophysics 28, 1–18.<br />
Woods, R., Sivapalan, M., Duncan, M., 1995. Investigating the<br />
representative elementary area concept: an approach based on<br />
field data. Hydrological Processes 9, 291–312.<br />
Young, R.A., Onstad, C.A., Bosch, D.D., 1995. AGNPS: an agricultural<br />
nonpoint source model. In: Singh, V.P. (Ed.). Computer<br />
Models of Watershed Hydrology, Water Resources Publication,<br />
pp. 1001–1020.
[11]<br />
Thorsen M, Refsgaard JC, Hansen S, Pebesma E, Jensen JB, Kleeschulte S<br />
(2001) Assessment of uncertainty in simulation of nitrate leaching to aquifers<br />
at catchment scale.<br />
Journal of Hydrology, 242, 210-227.<br />
Reprinted from Journal of Hydrology with permission from Elsevier
Journal of Hydrology 242 (2001) 210–227<br />
www.elsevier.com/locate/jhydrol<br />
Assessment of uncertainty in simulation of nitrate leaching to<br />
aquifers at catchment scale<br />
M. Thorsen a , J.C. Refsgaard a, *, S. Hansen b , E. Pebesma c , J.B. Jensen a , S. Kleeschulte d<br />
a DHI Water and Environment, Hørsholm, Denmark<br />
b Royal Veterinary and Agricultural University, Copenhagen, Denmark<br />
c University of Amsterdam, Amsterdam, The Netherlands<br />
d GIM, Luxembourg, Luxembourg<br />
Received 21 February 2000; revised 21 July 2000; accepted 23 October 2000<br />
Abstract<br />
Deterministic models are used to predict the risk of groundwater contamination from non-point sources and to evaluate the<br />
effect of alleviation measures. Such model predictions are associated with considerable uncertainty due to uncertainty in the<br />
input data used, especially when applied at large scales. The present paper presents a case study related to prediction of nitrate<br />
concentrations in groundwater aquifers using a spatially distributed catchment model. Input data were primarily obtained from<br />
databases at a European level. The model parameters were all assessed from these data by use of transfer functions, and no<br />
model calibration was carried out. The Monte Carlo simulation technique was used to analyse how uncertainty in input data<br />
propagates to model output. It appeared that the magnitude of the uncertainty depends significantly on the considered temporal<br />
and spatial scale. Thus simulations of flux concentrations leaving the root zone at grid level were associated with large<br />
uncertainties, whereas uncertainties in simulated concentrations at aquifer level on catchment scale were much smaller.<br />
© 2001 Elsevier Science B.V. All rights reserved.<br />
Keywords: Nitrate; Non-point pollution; Distributed model; Catchment scale; Uncertainty; Monte Carlo method<br />
1. Introduction<br />
1.1. Background<br />
Deterministic models are important tools for assessing<br />
nitrate leaching, transport and transformation in<br />
connection with groundwater resources management.<br />
Such models may be classified according to the<br />
description of the physical processes as black box,<br />
* Corresponding author. Present address. Department of Hydrology,<br />
Geological Survey of Denmark and Greenland, Thoravej 8,<br />
DK-2400 Copenhagen, Denmark.<br />
E-mail address: jcr@geus.dk (J.C. Refsgaard).<br />
conceptual and physically-based and according to<br />
the spatial description as lumped and distributed<br />
(Wood and O'Connell, 1985; Nemec, 1994;<br />
Refsgaard, 1996; and others). In this respect three<br />
typical model types are the lumped black box<br />
model, the lumped conceptual and the distributed<br />
physically-based. Most nitrogen leaching models<br />
such as RZWQM (DeCoursey et al., 1989) and<br />
DAISY (Hansen et al., 1991) are of the physically-based<br />
type, but cover only the root zone at plot or field<br />
scale. Within the field of nitrogen modelling at a<br />
catchment scale, typical examples of a black box, a<br />
conceptual and a distributed physically-based model<br />
are statistical regression models (Simmelsgaard,<br />
0022-1694/01/$ - see front matter © 2001 Elsevier Science B.V. All rights reserved.<br />
PII: S0022-1694(00)00396-6
M. Thorsen et al. / Journal of Hydrology 242 (2001) 210–227 211<br />
1991), the SWRRB (Arnold et al., 1990; Arnold and<br />
Williams, 1995) and the MIKE SHE/DAISY (Styczen<br />
and Storm, 1993), respectively.<br />
The black box and conceptual models are<br />
attractive because they require relatively less<br />
data, which are usually easily accessible, while<br />
the predictive capability of these models with<br />
regard to assessing the impacts of alternative agricultural<br />
practices is questionable due to the semi-empirical<br />
nature of the process descriptions. A key<br />
problem in using the more complex physically-based<br />
catchment models operationally lies in the<br />
generally large data requirements prescribed by<br />
the developers of such model codes. However,<br />
due to the better process descriptions these models<br />
may for some types of application be expected to<br />
have better predictive capabilities than the simpler<br />
models (Heng and Nikolaidis, 1998). Traditionally,<br />
complex leaching models are only used on plot or<br />
field scales in areas with extraordinarily good data<br />
availability, and even for such cases the relevance<br />
of such an approach is often questioned because<br />
of the perceived uncertainty related to the model<br />
simulations (Skop, 1993). Hence, there is an<br />
evident need to assess the uncertainty related to<br />
large scale simulation of aquifer pollution from<br />
diffuse sources.<br />
When analysing for uncertainties in model<br />
simulations the two fundamentally different<br />
sources of uncertainty are: (1) uncertainty on<br />
input data in terms of input variables (time varying<br />
input such as climate data) and model parameters<br />
(e.g. soil physical characteristics); and (2)<br />
inadequate model structure (process descriptions,<br />
equations). When comparing the model outputs<br />
to measured field data a third source of uncertainty<br />
has to be added, namely the error in the<br />
measurement of output from nature.<br />
Stochastic approaches are useful tools in uncertainty<br />
analyses. Assessment of uncertainties of<br />
model simulations requires a joint stochastic–deterministic<br />
approach, where the input data and/or the<br />
structure of the deterministic model somehow are<br />
considered stochastic. By considering input data as<br />
realisations of stochastic variables with given statistical<br />
properties, the governing equations become so-called<br />
stochastic partial differential equations<br />
(PDEs). The three traditional approaches to solving<br />
the stochastic PDEs are (1) state space formulations<br />
– Kalman filtering (Gelb, 1974; Ahsan and O'Connor,<br />
1994), (2) Monte Carlo techniques (Smith and<br />
Freeze, 1979a,b; Freeze, 1980; Zhang et al., 1993),<br />
and (3) analytical solutions to the stochastic PDEs<br />
(Gelhar, 1986; Dagan, 1986; Jensen and Mantoglou,<br />
1992). A severe limitation of the above three methods<br />
is that they only consider uncertainties on input data,<br />
while all of them assume the model structure to be<br />
correct. A more comprehensive approach also allowing<br />
consideration of the uncertainty in the model<br />
structure and process equations is the generalised likelihood<br />
uncertainty estimation (GLUE) methodology<br />
outlined in Beven and Binley (1992). Although no<br />
such studies have been reported yet, the GLUE in<br />
principle allows the uncertainty on model structure<br />
to be considered by introducing several alternative<br />
models, so that the Monte Carlo procedure includes<br />
both uncertainties on input data and on model structure.<br />
The objective of the present paper is, by use of<br />
Monte Carlo simulations, to assess whether a distributed<br />
physically-based model can provide fairly accurate<br />
predictions of nitrate concentrations in aquifers<br />
when applied at a catchment scale with input data only<br />
from readily available, aggregated data sources such<br />
as European databases. A limitation of the present<br />
paper is that only uncertainties in input data are<br />
considered, while errors in model structures are not<br />
taken into account.<br />
The studies reported in literature dealing with<br />
assessment of uncertainty of physically-based<br />
models consider only individual components of<br />
the hydrological cycle, typically groundwater,<br />
while the studies dealing with conceptual models,<br />
including both surface water, root zone and groundwater<br />
processes, have not considered uncertainties<br />
on nitrogen or other water quality aspects. Thus, to<br />
our knowledge, no similar attempts have been<br />
reported so far. The present paper focussing on<br />
uncertainty assessment at catchment scale is an<br />
extension of Refsgaard et al. (1999) and Hansen<br />
et al. (1999), where details on the deterministic<br />
modelling at catchment scale and the uncertainty<br />
aspects of the nitrogen leaching from the root<br />
zone, respectively, have been described. All three<br />
papers present results from the UNCERSDSS<br />
project (Refsgaard et al., 1998).
M. Thorsen et al. / Journal of Hydrology 242 (2001) 210–227<br />
2. Methodology<br />
2.1. Modelling approach<br />
The deterministic simulation is carried out by the<br />
coupled MIKE SHE/DAISY system. This is a<br />
coupling of a 1D root zone model (DAISY) and a<br />
3D distributed catchment model (MIKE SHE).<br />
MIKE SHE is a modelling system describing the<br />
flow of water and solutes in a catchment in a distributed<br />
physically-based way. This implies numerical<br />
solutions of the coupled PDEs for overland (2D) and<br />
channel flow (1D), unsaturated flow (1D) and saturated<br />
flow (3D) together with a description of evapotranspiration<br />
and snowmelt processes. For further<br />
details reference is made to the literature (Abbott et<br />
al., 1986; Refsgaard and Storm, 1995).<br />
DAISY (Hansen et al., 1991) is a 1D physically-based<br />
modelling tool for the simulation of crop<br />
production and water and nitrogen balance in the<br />
root zone. DAISY includes modules for description<br />
of evapotranspiration, soil water dynamics based on<br />
Richards' equation, water uptake by plants, soil<br />
temperature, soil mineral nitrogen dynamics based<br />
on the advection–dispersion equation, nitrate uptake<br />
by plants and nitrogen transformations in the soil. The<br />
nitrogen transformations simulated by DAISY are<br />
mineralisation-immobilisation turnover (MIT), nitrification<br />
and denitrification. In addition, DAISY<br />
includes a module for description of agricultural<br />
management practices.<br />
By combining MIKE SHE and DAISY, a complete<br />
modelling system is available for the simulation of<br />
water and nitrate transport in an entire catchment. In<br />
the present case the coupling is a sequential one. Thus<br />
for all agricultural areas, DAISY first performs calculations<br />
of water and nitrogen behaviour from the soil<br />
surface and through the root zone. The percolation of<br />
water and nitrate at the bottom of the root zone, simulated<br />
by DAISY, is then used as input to MIKE SHE<br />
calculations for the remaining part of the catchment.<br />
For natural areas, MIKE SHE also calculates the root<br />
zone processes assuming no nitrate contribution from<br />
these areas. Due to the sequential execution of the two<br />
codes, it has to be assumed that there is no feedback<br />
from the groundwater zone (MIKE SHE) to the root<br />
zone (DAISY). As the riparian buffer zone, where<br />
such a feedback mechanism is effective, often mainly<br />
(like in our case study) constitutes a part of the natural<br />
areas, this limitation is of minor practical importance.<br />
Furthermore, overland flow generated by high intensity<br />
rainfall (Hortonian) cannot be simulated by this<br />
coupling, while saturation-excess overland flow<br />
(Dunne) can be accounted for by MIKE SHE.<br />
Thus, MIKE SHE does not in the present case<br />
handle evapotranspiration and other root zone<br />
processes in the agricultural areas. As DAISY is 1D,<br />
one DAISY run in principle should be carried out for<br />
each of MIKE SHE's horizontal grids. However,<br />
several MIKE SHE grids are assumed to have identical<br />
root zone properties (soil, crop, agricultural<br />
management practices, etc.), so that in practice the<br />
outputs from each DAISY run can be used as input<br />
to several MIKE SHE grids.<br />
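The reuse of root zone results across grids with identical properties can be sketched as a simple lookup, as shown below. This is a minimal illustration only: `percolation_per_grid`, the class labels and the stand-in function `fake_root_zone_run` are hypothetical names introduced here and do not correspond to the MIKE SHE or DAISY interfaces.

```python
from typing import Callable, Dict, Hashable, List


def percolation_per_grid(
    grid_classes: List[Hashable],
    run_root_zone_model: Callable[[Hashable], float],
) -> List[float]:
    """Run the 1D root zone model once per distinct class and reuse the result."""
    cache: Dict[Hashable, float] = {}
    result = []
    for cls in grid_classes:
        if cls not in cache:
            # One 1D run per distinct combination of soil, crop and management.
            cache[cls] = run_root_zone_model(cls)
        result.append(cache[cls])
    return result


# Hypothetical stand-in for a root zone run; it only counts how often it is called.
calls = []


def fake_root_zone_run(cls):
    calls.append(cls)
    return 10.0 * len(calls)


grids = ["rotation_A", "rotation_B", "rotation_A", "rotation_A", "rotation_B"]
fluxes = percolation_per_grid(grids, fake_root_zone_run)
print(len(calls), fluxes)
```

In this toy case five grids require only two root zone runs, which is the saving the sequential coupling relies on.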
To fulfil one of the overall objectives of the project,<br />
which was to assess the quality of European data sets<br />
for direct use for modelling at the European scale, two<br />
key constraints were applied to the modelling<br />
approach. One constraint was that, if possible, input<br />
data such as model parameters and driving variables<br />
should be based on publicly available information,<br />
which preferably could be accessed from the standard<br />
European databases such as GISCO or EUROSTAT,<br />
or from very easily available national sources.<br />
Another constraint was that all model parameters<br />
obtained from standard databases were to be used<br />
directly or by way of transfer functions without any<br />
model calibration.<br />
2.2. Scaling<br />
As the equations in both the MIKE SHE and the<br />
DAISY codes basically are point scale equations, a<br />
scaling procedure had to be adopted in order to<br />
apply the codes at a catchment scale. MIKE SHE/<br />
DAISY is in this case run with equations and parameter<br />
values in each model grid point representing<br />
field scale conditions. The field scale is characterised<br />
by `effective' soil and vegetation parameters, but<br />
assuming only one soil type and one cropping pattern.<br />
The smallest horizontal discretisation in the model is<br />
the grid scale (2 × 2 km²), which is larger than the field<br />
scale. This implies that all the variations between<br />
categories of soil type and crop type within the area<br />
of each grid cannot be resolved and described at the<br />
grid level. Input data, whose variations are not
Fig. 1. Location of the Karup catchment in Jutland, Denmark.<br />
included in the grid scale representation, are distributed<br />
randomly at the catchment scale so that their<br />
statistical distributions are preserved at that scale.<br />
The results from the grid scale modelling are then<br />
aggregated to catchment scale (130 grids) and the<br />
statistical properties of model output and field data<br />
are then compared at catchment scale. Thus the scaling<br />
procedure from point scale to catchment scale<br />
may be characterised as a combination of an upscaling<br />
step and an aggregation step. The upscaling step is<br />
simply the important assumption that the point scale<br />
equations are valid at field scale. The aggregation step<br />
highlights a key issue from the concept of representative<br />
elementary area (REA) (Wood et al., 1988),<br />
namely that variability can be explicitly represented<br />
only at scales larger than the model grid size.<br />
More details on the adopted scaling approach are<br />
provided in Refsgaard et al. (1999), where it is also<br />
documented that the approach can be assumed valid<br />
for the case study in question.<br />
2.3. Input error assessment<br />
The MIKE SHE/DAISY model contains a very<br />
large number of input parameters. Ideally, all these<br />
parameters should be treated stochastically and<br />
included in the uncertainty analyses. However, this<br />
would result in an unrealistically high number of<br />
Monte Carlo simulations and CPU-time. Therefore,<br />
the input uncertainty was limited to five key parameters<br />
(see Section 3.2 below), which were selected<br />
so that they, by experience, are known to be the dominant<br />
parameters in the processes governing the water<br />
balance and nitrate leaching and transformation.
The actual input error assessment, i.e. the choice<br />
and parameterisation of the joint probability distribution<br />
of the stochastic variables was partly based on the<br />
analysis of available data and partly on expert judgement.<br />
Available data comprised data from national<br />
surveys or previous studies. The expert judgement<br />
refers for instance to the choice of the distribution<br />
type if no data were present, and the assessment of<br />
`realistic' ranges between which the true parameter<br />
values were expected to vary. Although this assessment<br />
seems rather subjective, it was hard to find a<br />
better way of doing this in the case of lacking data.<br />
Since the basic unit of calculation is a field, the variation<br />
of field-effective values was used for determining<br />
the range of the parameter probability distributions. A<br />
single realisation of such a parameter was then used in<br />
the model for each grid cell. All stochastic parameters<br />
were treated as being mutually independent. The<br />
reasons for this are that no significant correlation was<br />
suspected a priori, and that no data were available to<br />
actually estimate possible correlation.<br />
2.4. Error propagation<br />
The propagation of errors in the input data to the<br />
model output was assessed using Monte Carlo analysis.<br />
This means that a number of realisations were drawn at<br />
random from the stochastic input parameter distributions<br />
and that the model was run for each realisation.<br />
The ensemble of model outputs then is an estimate of the<br />
model output probability distribution, as only influenced<br />
by uncertainty in model input parameters. In order to<br />
reduce the number of Monte Carlo runs, Latin hypercube<br />
sampling was used to draw realisations from the<br />
input variables (McKay et al., 1979). This essentially<br />
means that the distribution of each stochastic input variable<br />
was stratified into N strata with equal probability mass,<br />
where N equals the number of Monte Carlo runs. The<br />
theoretical background for the adopted Latin hypercube<br />
sampling method is described in Pebesma and Heuvelink<br />
(1999).<br />
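The stratification described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of Pebesma and Heuvelink (1999): the function names are hypothetical, one value is drawn at random within each of the N equal-probability strata, and the strata are shuffled independently for each parameter, which is consistent with treating the parameters as mutually independent. The two uniform ranges are taken from Table 1.

```python
import random


def latin_hypercube(n_runs, n_params, rng):
    """Return n_runs rows of probabilities, one value per stratum per parameter."""
    columns = []
    for _ in range(n_params):
        # One random point inside each stratum [k/N, (k+1)/N).
        probs = [(k + rng.random()) / n_runs for k in range(n_runs)]
        # Shuffle so that strata are paired at random across parameters.
        rng.shuffle(probs)
        columns.append(probs)
    return [[columns[j][i] for j in range(n_params)] for i in range(n_runs)]


def uniform_quantile(p, low, high):
    """Inverse cumulative distribution function of a uniform distribution."""
    return low + p * (high - low)


rng = random.Random(0)
design = latin_hypercube(25, 2, rng)

# Map the probabilities to two of the uniform inputs listed in Table 1.
clay_percent = [uniform_quantile(row[0], 0.0, 17.0) for row in design]
front_depth_m = [uniform_quantile(row[1], 18.0, 27.0) for row in design]
print(min(clay_percent), max(clay_percent))
```

With N = 25 every stratum of each input is visited exactly once, so the full range of each input is covered even with few runs. Non-uniform inputs would be handled the same way by replacing `uniform_quantile` with the quantile function of the relevant distribution.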
3. Application<br />
3.1. Study area<br />
The area used in the study is the Karup river basin,<br />
located in the middle part of Jutland, Denmark<br />
(Fig. 1). The topographic catchment covers approximately<br />
500 km² of which 70% are used for agricultural<br />
purposes and 30% are natural areas. The<br />
catchment characteristics are described in Styczen<br />
and Storm (1993). The data used for the present<br />
study and the model construction are described in<br />
detail in Refsgaard et al. (1999) and Hansen et al.<br />
(1999). In the following a brief summary is provided.<br />
The catchment was in the model represented in a<br />
3D network. The discretisation used for the uncertainty<br />
analysis was 2 km in the horizontal direction<br />
and varied in the vertical from 5 to 40 cm in the unsaturated<br />
zone, and from 10 to 15 m in the saturated<br />
zone. The catchment area and the location of the<br />
river branches as well as the stream geometry were<br />
generated on the basis of a digital elevation map from<br />
USGS/GISCO using Arc/Info facilities. Spatial distributions<br />
of land use and soil types were derived from<br />
the GISCO database and hydrogeological data were<br />
obtained from EC (1982). Distributions of crop types<br />
and livestock densities were obtained from Agricultural<br />
Statistics (1995) and converted to slurry production<br />
using standard values for nitrogen content. Based<br />
on typical crop rotations proposed by The Danish<br />
Agricultural Advisory Centre and the constraints<br />
offered by crop distribution and livestock density<br />
two cattle farm rotations, one pig farm rotation and<br />
one arable farm rotation were constructed. In order to<br />
capture the effect of the interaction between weather<br />
conditions and crop, simulations were performed in<br />
such a way that each crop at its particular position in<br />
the considered rotation occurred once in each of the<br />
years in the rotation. This resulted in a total of 17<br />
agricultural crop rotation schemes and one scheme<br />
representing natural areas with no assumed nitrate<br />
leaching. These 18 schemes were distributed<br />
randomly over the area in such a way that the statistical<br />
distribution was in accordance with the agricultural<br />
statistics.<br />
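A random spatial allocation that preserves prescribed shares can be sketched as below. The 130 grids and the 70%/30% split between agricultural and natural areas follow the text; the number of grids assigned to each of the 17 agricultural schemes is purely hypothetical, since the actual shares follow the agricultural statistics and are not reproduced here.

```python
import random
from collections import Counter


def distribute_schemes(counts, rng):
    """Assign scheme labels to grids at random while preserving the count per scheme."""
    cells = [scheme for scheme, n in counts.items() for _ in range(n)]
    rng.shuffle(cells)
    return cells


# 130 grids in total: 39 natural (30%) and 91 agricultural (70%).
counts = {"natural": 39}
for i in range(17):
    # Hypothetical split of the 91 agricultural grids over 17 rotation schemes.
    counts[f"agri_{i + 1:02d}"] = 6 if i < 6 else 5

layout = distribute_schemes(counts, random.Random(3))
print(len(layout), Counter(layout)["natural"])
```

Shuffling a list with fixed counts, rather than drawing each grid independently, guarantees that the catchment scale distribution is reproduced exactly in every realisation.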
To simulate the trend in the nitrate concentrations<br />
in the groundwater and in the streams, it is necessary<br />
to have information on the history of the fertiliser<br />
application in space and time. In Denmark, norms<br />
and regulations for fertilisation practice are defined<br />
(Plantedirektoratet, 1996). These regulate the maximum<br />
amount of nutrients allowed for a particular<br />
crop depending on forefruit and soil type, and in addition,<br />
provide norms for the lower limit of nitrogen
Table 1<br />
Statistical properties of the input error considered in the Monte Carlo analysis<br />
Parameter | Unit | Distribution | Mean | Std. | Range<br />
Daily rainfall, standard error | % | | | 50 (a) | <br />
Clay content | % | Uniform | 8.5 | | 0.0–17.0<br />
SOM2 | % | Truncated normal | 0.5 | 0.22 | 0.06–0.94<br />
Cattle slurry, dry matter content | % | Truncated normal | 7.5 | 2.5 | 1.89–14.35<br />
Cattle slurry, total N content | % | Truncated normal | 0.5 | 0.12 | 0.24–1.02<br />
Pig slurry, dry matter content | % | Truncated normal | 4.9 | 2.5 | 0.82–13.79<br />
Pig slurry, total N content | % | Truncated normal | 0.61 | 0.18 | 0.24–1.02<br />
Depth of reduction front | m | Uniform | 22.5 | | 18–27<br />
(a) The series was normalised so that the mean value was preserved.<br />
utilisation for organic fertilisers. It was assumed that<br />
the farmers follow these statutory norms. Based on<br />
estimated application rates of organic and mineral<br />
fertiliser to the individual crops each year, the<br />
DAISY model simulated time series of nitrate leaching<br />
from the root zone for each agricultural grid. The<br />
MIKE SHE model then routed these fluxes further<br />
through the unsaturated zone and in the groundwater<br />
layers accounting for dispersion and dilution<br />
processes and finally into the Karup stream where<br />
the integrated load from the entire catchment was<br />
estimated.<br />
The model was run for seven years, from 1987 to<br />
1993. The large storage possibilities in the unsaturated<br />
zone and the aquifer imply that the initial conditions<br />
influence the simulation results for several years. The<br />
initial conditions were established by running the<br />
deterministic model twice for the period 1987–<br />
1993. In the first run the initial conditions were<br />
guessed and in the second run they were taken as<br />
the simulated conditions by the end of the period.<br />
The simulated 1993 conditions in the second run<br />
were then used as initial conditions for the Monte<br />
Carlo runs. This procedure ensures that the initial<br />
conditions are consistent with the assumptions made<br />
in the deterministic simulation, but not necessarily<br />
with the parameter values drawn in the Monte Carlo<br />
runs, where e.g. a run with a parameter value resulting<br />
in higher nitrate leaching, in principle, should have<br />
been associated with higher initial nitrate concentrations<br />
in the aquifer. In order to reduce the effect of<br />
this, the first two years were considered as a<br />
`warming-up period' and the last five years were<br />
considered the simulation period.<br />
3.2. Assessment of input errors<br />
Uncertainty on the following five parameters was<br />
introduced in the analysis: precipitation, soil hydraulic<br />
properties, soil organic matter (SOM) content,<br />
slurry composition, and depth of the nitrate reduction<br />
front in the aquifer. The rationale for selecting these<br />
five parameters and details on their assessment are<br />
provided in Sections 3.2.2–3.2.6 below. The statistical<br />
characteristics of the data included in the Monte<br />
Carlo analysis are shown in Table 1.<br />
3.2.1. Length scale and spatial correlation<br />
A fundamental question in the assessment of uncertainty<br />
of input data for a spatially distributed model<br />
like MIKE SHE/DAISY is whether the input data are<br />
spatially correlated or not. It is possible to take spatial<br />
correlation into account; however, it will complicate<br />
the Monte Carlo sampling considerably (Kros et al.,<br />
1999). The critical question in this relation is whether<br />
the spatial autocorrelation length scale of the input<br />
data is larger than the computational scale, or whether<br />
the dominating spatial variability takes place within a<br />
computational length scale, in which case it should be<br />
incorporated into the effective model parameters and<br />
their inherent uncertainties.<br />
As discussed above, the basic unit of calculation is<br />
the model grid (2 × 2 km²) with some of the parameters,<br />
however, representing field-effective values
(typically 1–10 ha in size). Hence the soil hydraulic<br />
parameters, the SOM content and slurry composition<br />
represent field length scales in the order of<br />
100–300 m, while the precipitation and reduction<br />
front are represented at a 2 km length scale.<br />
For the field related parameters the correlation<br />
length scales can be assumed smaller than 100 m.<br />
For soil hydraulic properties this is documented in<br />
previous studies (Hansen and Jensen, 1988), while<br />
no data exist on length scales for SOM. With respect<br />
to slurry composition this parameter is the result of<br />
farm management and storage conditions, and it is<br />
known that the temporal variability of the produced<br />
slurry on the individual farm is considerable. Hence, it<br />
is assumed that the variability within the individual<br />
fields is much larger than the variability among the<br />
fields.<br />
Daily rainfall data are known to have correlation<br />
length scales that are usually larger than the<br />
2 km grid scale used in the present case. Geostatistical<br />
analysis (Storm et al., 1988) suggests that<br />
the length scale for Danish conditions is in the<br />
order of 10 km. Similarly, the location of the<br />
reduction/oxidation front, which is mainly dependent<br />
on geological conditions, may be assumed to have a<br />
correlation length scale significantly larger than the 2 km grid.<br />
This implies that the three field related parameters<br />
in principle should be treated as spatially independent<br />
in the Monte Carlo analysis, while the two other input<br />
data could be treated as almost spatially constant.<br />
As a consequence of the adopted scaling approach<br />
the relevant scale for which the uncertainty on the<br />
input data should be generated in the Monte Carlo<br />
analysis is the catchment scale and not the grid<br />
scale. The uncertainty at catchment scale can be<br />
generated either by allowing spatial variation among<br />
grids and using a variance applicable for grid scale in<br />
the Monte Carlo sampling or by assuming a spatially<br />
constant value and using the (smaller) catchment scale<br />
variance. In the present study we have adopted the<br />
latter approach. This has two important limitations.<br />
Firstly, the nitrate reduction processes in the aquifer,<br />
where the horizontal dimension with flows between<br />
neighbouring grids is important, are not fully correctly<br />
described because the autocorrelation length scale is<br />
not preserved. Secondly, the output uncertainties are<br />
only simulated correctly at the catchment scale, while<br />
they are underestimated at grid scales.<br />
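Why the catchment scale variance is the smaller of the two can be illustrated for the idealised case of values that are mutually independent between grids: the standard deviation of an average over N grids is then the grid scale standard deviation divided by the square root of N. The sketch below checks this numerically for N = 130 grids. It is an illustration only, with a hypothetical unit grid scale standard deviation; the variances actually used in the study (Table 1) were based on available data and expert judgement, not on this relation, and the relation does not hold for spatially correlated inputs such as rainfall.

```python
import random
import statistics


def catchment_mean_std(grid_std, n_grids, n_trials, rng):
    """Empirical standard deviation of the catchment average of independent grid values."""
    means = []
    for _ in range(n_trials):
        values = [rng.gauss(0.0, grid_std) for _ in range(n_grids)]
        means.append(statistics.fmean(values))
    return statistics.stdev(means)


rng = random.Random(1)
grid_std = 1.0   # hypothetical grid scale standard deviation
n_grids = 130    # number of grids in the catchment model

empirical = catchment_mean_std(grid_std, n_grids, 2000, rng)
theoretical = grid_std / n_grids ** 0.5
print(empirical, theoretical)
```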
3.2.2. Precipitation<br />
In general the required daily climate data are available<br />
throughout Europe from the national meteorological<br />
institutes. Among the required meteorological<br />
variables, precipitation is the one subject to most<br />
local variations. Therefore uncertainty on the daily<br />
amount of precipitation was included in the present<br />
analysis. The uncertainty was described by adding a<br />
random error to the measured series. This error was<br />
assumed to follow a normal distribution with zero<br />
mean and a standard deviation equivalent to 50% of<br />
the measured daily value. Thus, dry days were kept<br />
dry. The error was assumed to contain no temporal<br />
autocorrelation. Finally, the series was normalised so<br />
that the mean value, taken over the 25 Monte Carlo<br />
runs, was preserved. The adopted variance is in agreement<br />
with Allerup et al. (1982) as standard error of<br />
daily rainfall for a catchment of this size.<br />
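The perturbation of the rainfall series can be sketched as follows. The sketch assumes that negative perturbed values are set to zero and that the normalisation is done with a single scaling factor applied to all runs; neither detail is specified above, and the function name and the short example series are hypothetical.

```python
import random


def perturb_rainfall(measured, n_runs, rel_std, rng):
    """Generate n_runs perturbed daily series whose overall mean equals the measured mean."""
    runs = []
    for _ in range(n_runs):
        series = []
        for p in measured:
            # The error standard deviation is proportional to the daily amount,
            # so a dry day (p == 0) receives no error and stays dry.
            value = p + rng.gauss(0.0, rel_std * p)
            # Assumption: negative rainfall is clipped to zero.
            series.append(max(value, 0.0))
        runs.append(series)
    # Assumption: one common factor restores the mean taken over all runs.
    measured_mean = sum(measured) / len(measured)
    ensemble_mean = sum(sum(s) for s in runs) / (n_runs * len(measured))
    factor = measured_mean / ensemble_mean
    return [[v * factor for v in s] for s in runs]


measured = [0.0, 4.0, 0.0, 12.5, 1.2, 0.0, 7.3]  # hypothetical daily values in mm
ensemble = perturb_rainfall(measured, 25, 0.5, random.Random(42))
ensemble_mean = sum(sum(s) for s in ensemble) / (25 * len(measured))
print(ensemble_mean, sum(measured) / len(measured))
```

Because each day is perturbed independently, the errors carry no temporal autocorrelation, in line with the assumption stated above.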
3.2.3. Soil hydraulic properties<br />
The modelling system requires soil hydraulic parameters<br />
in terms of retention curves and hydraulic<br />
conductivity functions. Such data were not directly<br />
available through European databases. Instead, these<br />
properties were estimated using pedo-transfer functions<br />
based on soil information in terms of texture<br />
composition obtained from the GISCO soil database<br />
in which soils are divided into five texture classes<br />
according to FAO classification. All soil types of the<br />
Karup catchment fall within one texture class (coarse<br />
texture) which covers soils with less than 18% clay<br />
and more than 65% sand. As the texture class covers a<br />
wide range of different texture compositions, soil<br />
hydraulic properties derived from this information<br />
will be associated with considerable uncertainty.<br />
Based on a review by Tietje and Tapakenhinrichs<br />
(1993) evaluating available pedo-transfer functions<br />
and based on the constraints imposed by the available<br />
information on texture (clay, silt and sand content),<br />
the pedo-transfer functions proposed by Cosby et al.<br />
(1984) were selected. These functions estimate the<br />
saturated hydraulic conductivity and the parameters<br />
in the soil water retention function proposed by<br />
Campbell (1974). The hydraulic conductivity function<br />
was calculated according to Burdine (1952) using the<br />
same parameters. In order to facilitate a smooth retention<br />
function the Campbell functions were modified<br />
according to the modifications of the Brooks–Corey
function (Brooks and Corey, 1966) proposed by Smith<br />
(1992). In Danish soils the clay and the silt content are<br />
correlated. Based on information in the Danish Soil<br />
Library (Lamm, 1971) a relation between clay and silt<br />
has been established:<br />
$\text{Silt content} = 0.035 + 0.82 \times \text{Clay content} \qquad (r^2 = 0.68)$<br />
Adopting this relation and assuming that clay, silt<br />
and sand constitute all soil solids, the soil hydraulic<br />
properties can be calculated once the clay content is<br />
known. In the uncertainty analysis, the clay content<br />
was drawn by stratified random sampling from a uniform distribution<br />
ranging from 0 to 17% (Table 1). In reality, the<br />
uncertainty on the soil hydraulic parameters originates<br />
from two sources, namely the uncertainty on soil<br />
texture and the uncertainty related to use of the<br />
adopted pedotransfer function. In the present<br />
approach uncertainty is only associated with soil texture.<br />
Data from the Danish Soil Textural Database show<br />
that a uniform distribution, as adopted in the present<br />
study, clearly overestimates the uncertainty on soil<br />
texture (Børresen, 2000). The assumed large uncertainty<br />
range on soil texture may therefore compensate<br />
for the lack of uncertainty on the pedotransfer function,<br />
so that the integrated uncertainty on the soil<br />
hydraulic parameters is of the right order of magnitude.<br />
Considering that the autocorrelation length scale<br />
for soil texture is in the order of 100 m, this adopted<br />
uncertainty range may at first glance appear as a<br />
rather high uncertainty for soil texture at the catchment<br />
scale. However, as the FAO texture class is so<br />
broad that it actually covers different soil types with<br />
large differences in hydraulic properties the adopted<br />
catchment scale variance should be seen to cover<br />
uncertainty on which soil type actually is present in<br />
the catchment rather than uncertainty on hydraulic<br />
properties due to small scale variations.<br />
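The step from a sampled clay content to a full texture composition can be sketched as below, using the clay–silt relation given above and the assumption that clay, silt and sand sum to one. The sketch assumes that the relation is expressed in mass fractions rather than percent, which is an interpretation made here and not stated in the text; under that reading the upper clay limit of 17% gives a sand content just above the 65% bound of the coarse texture class. The subsequent pedo-transfer step (Cosby et al., 1984) is not reproduced.

```python
def texture_from_clay(clay):
    """Return (clay, silt, sand) mass fractions for a given clay fraction."""
    # Clay-silt relation from the Danish Soil Library data (fractions assumed).
    silt = 0.035 + 0.82 * clay
    # Clay, silt and sand are assumed to constitute all soil solids.
    sand = 1.0 - clay - silt
    return clay, silt, sand


for clay in (0.0, 0.085, 0.17):
    c, si, sa = texture_from_clay(clay)
    print(f"clay={c:.3f} silt={si:.4f} sand={sa:.4f}")
```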
3.2.4. Soil organic matter<br />
In DAISY, the MIT model considers three types of<br />
organic matter: newly added relatively fresh organic<br />
matter (AOM) with a relatively short turnover time,<br />
the living soil microbial biomass (SMB) and old<br />
native SOM with slow turnover, respectively. The<br />
former two can be initialised with default values<br />
when the model is run with a `warm-up' period of a<br />
couple of years prior to the actual simulation period.<br />
The latter comprises by far most of the organic matter<br />
found in the soil. However, SOM is divided into two<br />
sub-pools, SOM1 and SOM2. The turnover of SOM1 is<br />
so slow that its contribution to the annual nitrogen<br />
mineralisation in agricultural soils is negligible.<br />
Hence, when initialising the MIT model the important<br />
factor is the quantity of SOM2. As the European databases<br />
did not provide this information we had to rely<br />
on estimates of both the amount of the organic matter<br />
present in the soil and the amount of this organic<br />
matter that is allocated to SOM2. The assumed<br />
statistical properties of this uncertainty are shown in<br />
Table 1.<br />
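Drawing from a truncated normal distribution such as the one assumed for SOM2 can be sketched with simple rejection sampling, as below. The mean, standard deviation and range are those listed for SOM2 in Table 1. The sketch assumes that the listed standard deviation refers to the normal distribution before truncation, and it uses plain random sampling rather than the Latin hypercube scheme adopted in the study; the function name is hypothetical.

```python
import random
import statistics


def truncated_normal(mean, std, low, high, rng):
    """Draw from a normal distribution, rejecting values outside [low, high]."""
    while True:
        x = rng.gauss(mean, std)
        if low <= x <= high:
            return x


rng = random.Random(11)
# SOM2 row of Table 1: mean 0.5, std 0.22, range 0.06-0.94 (all in %).
som2 = [truncated_normal(0.5, 0.22, 0.06, 0.94, rng) for _ in range(1000)]
print(min(som2), statistics.fmean(som2), max(som2))
```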
3.2.5. Slurry composition<br />
Due to the high livestock density, slurry is a<br />
substantial source of nitrogen in the Karup region.<br />
Hence the management of slurry is of prime importance<br />
for the leaching losses. A main problem in<br />
management of slurry is the large variability found<br />
in the composition of the slurry. This variability<br />
makes the actual fertiliser application in slurry differ<br />
from the planned application and introduces therefore<br />
a considerable source of uncertainty. In the uncertainty<br />
analysis this has been accounted for by introducing<br />
uncertainty on the dry matter content and the<br />
nitrogen content of the slurry. The assumed error<br />
statistics are shown in Table 1. Further details on<br />
the agricultural management and the rationale behind<br />
the error statistics are provided in Hansen et al.<br />
(1999).<br />
3.2.6. Depth of reduction front<br />
In the uncertainty analysis the depth of the reduction<br />
front in the saturated zone was drawn from a<br />
uniform distribution in the interval 18±27 m below<br />
soil surface.<br />
3.3. Uncertainty analyses<br />
The initial part of the uncertainty analysis<br />
comprised an evaluation of the selected number of<br />
Monte Carlo runs. As the CPU-time required to run<br />
the model for the seven year period is substantial it<br />
was necessary to keep the number of Monte Carlo<br />
runs to a minimum. Therefore an initial choice of 25
Table 2<br />
Evaluation of the representativeness of 25 Monte Carlo runs<br />
Variable | 1–25 Mean | 1–25 Std. | 26–50 Mean | 26–50 Std. | 51–75 Mean | 51–75 Std. | 1–75 Mean (a) | 1–75 Std. | CV (%)<br />
Leaching from root zone (kg N/ha/year) | 64.7 | 19.2 | 68.2 | 18.9 | 67.2 | 16.7 | 66.7 | 18.1 | 27.1<br />
Groundwater concentration (mg NO3/l) | 47.7 | 8.0 | 48.3 | 7.2 | 47.6 | 6.0 | 47.8 | 7.0 | 14.6<br />
River flow (mm/year) | 464.0 | 22.0 | 464.0 | 23.0 | 464.0 | 17.0 | 464.0 | 21.0 | 4.5<br />
River concentration (mg NO3/l) | 45.1 | 7.8 | 46.2 | 7.3 | 45.7 | 6.6 | 45.7 | 7.1 | 15.5<br />
(a) Homogeneity of means accepted by F-test.<br />
runs was made. In order to investigate whether 25<br />
Monte Carlo runs are sufficient to capture the variability,<br />
75 Monte Carlo runs were performed and the<br />
results were split into 3 groups of 25 runs each and the<br />
statistical distributions of the three groups were<br />
compared. The output variables analysed were river<br />
flow, average NO3 concentration in groundwater, and<br />
average NO3 concentration in the stream. The three<br />
sets of Monte Carlo runs were evaluated by comparing<br />
the statistical distribution of simulation results, i.e.<br />
testing whether the simulation results can be<br />
described by a normal distribution and whether homogeneity<br />
of mean and variance can be assumed.<br />
In the second part of the uncertainty analysis the<br />
contributions to the output uncertainty from<br />
each of the selected Monte Carlo parameters<br />
were evaluated by performing five sets of<br />
Monte Carlo simulations in each of which one of<br />
the initially stochastic parameters was kept deterministic.<br />
The uncertainty contributions of the different<br />
parameters were then evaluated. As annual leaching<br />
depends on weather, crop and crop position in the<br />
rotation, groundwater concentrations in single years<br />
were not considered, instead data averaged over the<br />
five year simulation period, 1989–1993, were used for<br />
the uncertainty analysis.<br />
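The procedure of keeping one parameter deterministic at a time can be sketched as below. The two-parameter `toy_model`, its samplers and the fixed values are hypothetical stand-ins chosen only to make the mechanics visible; they bear no relation to MIKE SHE/DAISY or to the five parameters of the study.

```python
import random
import statistics


def output_std(model, samplers, fixed, n_runs, rng):
    """Standard deviation of the model output when parameters in `fixed` are held constant."""
    outputs = []
    for _ in range(n_runs):
        params = {}
        for name, draw in samplers.items():
            params[name] = fixed[name] if name in fixed else draw(rng)
        outputs.append(model(params))
    return statistics.stdev(outputs)


def toy_model(p):
    # Hypothetical model in which parameter "a" dominates the output.
    return 2.0 * p["a"] + 0.1 * p["b"]


samplers = {
    "a": lambda rng: rng.uniform(0.0, 1.0),
    "b": lambda rng: rng.uniform(0.0, 1.0),
}
deterministic_values = {"a": 0.5, "b": 0.5}

std_all = output_std(toy_model, samplers, {}, 25, random.Random(7))
std_fixed = {
    name: output_std(
        toy_model, samplers, {name: deterministic_values[name]}, 25, random.Random(7)
    )
    for name in samplers
}
print(std_all, std_fixed)
```

A large drop in the output standard deviation when a parameter is fixed indicates that this parameter contributes strongly to the total uncertainty; in the toy case that is parameter `a`.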
4. Results – uncertainties of model results<br />
4.1. Evaluation of the number of Monte Carlo runs<br />
The main results of the comparison between three<br />
individual sets of 25 Monte Carlo runs are given in<br />
Table 2. Statistical tests showed that the hypothesis of<br />
homogeneity of means and variances cannot be<br />
Fig. 2. Statistical distribution from 25 Monte Carlo runs of simulated average annual river flow at the catchment outlet. The corresponding<br />
measured value based on daily river flow data was 451 mm/year.
Fig. 3. Statistical distribution over 25 Monte Carlo runs of simulated areal average NO3 concentrations in upper aquifer layer by the end of<br />
1993. The corresponding measured value based on data from 35 wells was 58 mg/l.<br />
rejected. As the three sub-sets appear statistically<br />
similar it was concluded that 25 Monte Carlo runs<br />
were suf®cient to assess the uncertainty on the simulation<br />
results. It should be emphasised that the small<br />
number of Monte Carlo runs only is possible because<br />
we focus on mean values and standard deviations. If<br />
the aim were to assess uncertainties on extreme<br />
values, such as the 1% fractile, 25 runs would<br />
obviously not have been suf®cient.<br />
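A comparison of three sub-sets of 25 runs, as described in Section 4.1, could be sketched as below. The paper does not name the statistical tests it used; the one-way ANOVA F statistic (for means) and Bartlett's statistic (for variances) are assumptions chosen for illustration, and the 75 run results are synthetic values drawn around the 463 mm/year mean and 5% coefficient of variation quoted for river flow.<br />

```python
import math
import random
import statistics

def anova_f(groups):
    """One-way ANOVA F statistic for equality of group means."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

def bartlett_statistic(groups):
    """Bartlett's statistic for equality of variances (approx. chi-square, k-1 d.o.f.)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    variances = [statistics.variance(g) for g in groups]
    pooled = sum((len(g) - 1) * v for g, v in zip(groups, variances)) / (n_total - k)
    numerator = (n_total - k) * math.log(pooled) - sum(
        (len(g) - 1) * math.log(v) for g, v in zip(groups, variances)
    )
    correction = 1 + (sum(1 / (len(g) - 1) for g in groups) - 1 / (n_total - k)) / (3 * (k - 1))
    return numerator / correction

# Synthetic stand-in for 75 Monte Carlo results, split into 3 sub-sets of 25.
rng = random.Random(7)
runs = [rng.gauss(463.0, 23.0) for _ in range(75)]
subsets = [runs[0:25], runs[25:50], runs[50:75]]
print(f"ANOVA F statistic: {anova_f(subsets):.2f}")
print(f"Bartlett statistic: {bartlett_statistic(subsets):.2f}")
```

Homogeneity would not be rejected at the 5% level if the F statistic stays below the tabulated critical value for (2, 72) degrees of freedom (roughly 3.1) and the Bartlett statistic stays below the chi-square critical value for 2 degrees of freedom (5.99). A normality check is not covered by this sketch.<br />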
4.2. Comparisons with field data<br />
The simulated uncertainty intervals on selected<br />
model results were, if possible, compared to corresponding<br />
measured data available from monitoring<br />
programmes conducted in the area. In this context it<br />
is noted that due to the adopted scaling approach, the<br />
simulation results are only supposed to reflect the field<br />
observations at a catchment scale and not at a point<br />
scale.<br />
The simulated water balance represented by average<br />
annual river discharge at the catchment outlet<br />
varies from 428 to 502 mm/year (Fig. 2). The corresponding<br />
measured value is 451 mm/year, which<br />
falls within the simulated interval and within 5% of<br />
both the median (462 mm) and the average (463 mm)<br />
of the simulated values. Fig. 3 presents the simulated<br />
distribution of average nitrate concentrations in the<br />
upper groundwater layer averaged over the entire<br />
catchment and over the five-year simulation period.<br />
The corresponding value obtained from observations<br />
in 35 wells is 58 mg/l, which falls within the simulated<br />
interval (35.4–61.4 mg/l) and within 25% of<br />
both the median (46.7 mg/l) and the average<br />
(47.4 mg/l) of the Monte Carlo runs. In Fig. 4 the<br />
fraction of the catchment area with groundwater<br />
concentrations above the drinking water limit of<br />
50 mg/l is shown in terms of statistical distribution<br />
for the 25 Monte Carlo runs. Also in this case the<br />
observed value from the 35 observation wells (57%)<br />
falls within the simulated interval (27–65%) and<br />
within 10% of the median (53%) of the Monte Carlo<br />
runs.<br />
Fig. 4. Statistical distribution over 25 Monte Carlo runs of percentage of catchment area with NO3 concentrations above the drinking water limit<br />
of 50 mg/l. The corresponding measured value based on data from 35 wells was 57%.<br />
Fig. 5. Measured (B) and simulated (×) areal distribution of NO3 concentrations in groundwater at eight points in time. Measured values are based on 35 groundwater observations.<br />
Fig. 6. (a) Simulated time series of six-monthly flux concentrations from the root zone obtained in three different crop rotations (B = mean,<br />
u = ± 1 × std). The range of seasonal variation in standard errors is shown inside the figures. (b) Simulated time series of average areal aquifer<br />
concentrations (B = mean, u = ± 1 × std). The range of seasonal variation in standard errors is shown inside the figures.<br />
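The three comparisons with field data quoted in this section can be checked with a few lines of Python. The numbers are those given in the text; the function name is illustrative, and expressing the deviation relative to the simulated median is an assumption, since the text does not state the reference value.<br />

```python
def compare_with_observation(observed, sim_min, sim_max, sim_median):
    """Return (inside_interval, relative deviation from the simulated median in %)."""
    inside = sim_min <= observed <= sim_max
    deviation_pct = abs(observed - sim_median) / sim_median * 100.0
    return inside, deviation_pct

# (observed, simulated minimum, simulated maximum, simulated median), from the text.
cases = {
    "river flow (mm/year)": (451.0, 428.0, 502.0, 462.0),
    "groundwater NO3 (mg/l)": (58.0, 35.4, 61.4, 46.7),
    "area above 50 mg/l (%)": (57.0, 27.0, 65.0, 53.0),
}
results = {name: compare_with_observation(*values) for name, values in cases.items()}
for name, (inside, deviation) in results.items():
    print(f"{name}: inside interval = {inside}, deviation from median = {deviation:.1f}%")
```

Under this assumption the deviations come out at about 2%, 24% and 8%, consistent with the 5%, 25% and 10% bounds stated above.<br />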
A visual comparison is shown in Fig. 5, where<br />
observed areal distributions of nitrate concentrations<br />
from existing wells are compared to similar results<br />
from the Monte Carlo runs on a six-monthly basis.<br />
From this figure it is seen that the measured concentration<br />
distribution in general is within the uncertainty<br />
band generated from the Monte Carlo simulations,<br />
though not always centred. It appears that, in general,<br />
the simulated fraction of the area with nitrate concentrations<br />
exceeding 50 mg/l is slightly overestimated in<br />
the summer period and slightly underestimated in the<br />
winter period, indicating that the overall trend in the<br />
concentration level is simulated adequately whereas<br />
the seasonal variation in observed concentrations is<br />
not fully represented in the simulations.<br />
4.3. Nitrate concentrations in aquifer – at different<br />
temporal and spatial scales<br />
The results regarding the uncertainty on simulated<br />
nitrogen leaching from different cropping patterns and<br />
the importance of the contribution from different error<br />
sources are described in detail in Hansen et al. (1999).<br />
The present paper focuses on the catchment scale and<br />
on how uncertainties at a point scale propagate and are<br />
transformed (reduced) at larger spatial and temporal<br />
scales.<br />
The transformation process is illustrated in Fig. 6<br />
which shows the uncertainty, characterised by time<br />
series of the means and standard deviations among<br />
the 25 Monte Carlo runs for (a) six-monthly flux<br />
concentrations from the root zone (DAISY output)<br />
for three different crop rotations, and (b) mean six-monthly<br />
concentrations in the upper aquifer layer<br />
averaged over the entire aquifer. The figures show<br />
very clearly how the uncertainties are<br />
reduced when moving from root zone leakage to aquifer<br />
concentrations at catchment scale. Thus it is<br />
remarkable that, for instance, the average standard<br />
errors (standard deviation divided by mean) of six-monthly<br />
root zone flux concentrations, in the order<br />
of 33–44%, are reduced to a standard error of 18%<br />
on the assessed mean six-monthly values for groundwater<br />
concentrations at the catchment scale.<br />
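The measure used here, standard deviation divided by mean, and the way it shrinks under areal averaging can be illustrated with a small sketch. The ensemble values are synthetic, the three rotations are given equal area shares, and their errors are drawn independently; all of these are simplifying assumptions rather than properties of the study.<br />

```python
import random
import statistics

def cv(values):
    """Standard deviation divided by mean (the 'standard error' of the text)."""
    return statistics.stdev(values) / statistics.mean(values)

# Synthetic ensemble: 25 runs for each of three hypothetical crop rotations.
rng = random.Random(42)
n_runs = 25
rotations = [[rng.uniform(30.0, 90.0) for _ in range(n_runs)] for _ in range(3)]

# Areal average per run, assuming equal area shares for the three rotations.
areal_average = [statistics.mean(run_values) for run_values in zip(*rotations)]

point_cvs = [cv(r) for r in rotations]
areal_cv = cv(areal_average)
print("relative uncertainty per rotation:", [f"{c:.0%}" for c in point_cvs])
print(f"relative uncertainty of areal average: {areal_cv:.0%}")
```

For positive concentrations the relative uncertainty of the average can never exceed the largest relative uncertainty of the individual rotations, and it falls well below it when the errors are only weakly correlated.<br />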
The large seasonal variation in concentration levels<br />
observed in the percolation water (Fig. 6a) is levelled<br />
out in the simulated groundwater concentrations at<br />
both grid level and catchment level. This is mainly a<br />
result of dilution and averaging in the entire groundwater<br />
volume of the upper layer which accounts for<br />
8–13 m of the saturated zone. The differences in<br />
concentration levels between crop rotations are, on<br />
the other hand, still reflected in the groundwater<br />
concentrations of corresponding grids (Fig. 6b) with<br />
the lowest concentrations arising from the plant production<br />
rotations and the highest concentrations from the pig rotations.<br />
Table 3<br />
Simulations used for evaluation of uncertainty contributions. All six<br />
sets are based on the input uncertainties drawn for the first set of<br />
Monte Carlo simulations (1–25)<br />
Monte Carlo run series | Status of parameters<br />
O | All five parameters are treated as stochastic<br />
A | Precipitation is treated as deterministic<br />
B | Texture is treated as deterministic<br />
C | Soil organic matter is treated as deterministic<br />
D | Slurry composition is treated as deterministic<br />
E | Depth of reduction front is treated as deterministic<br />
4.4. Analyses of different sources of input error<br />
In addition to the basic set of Monte Carlo simulations<br />
(1–25), where all five selected parameters were<br />
treated stochastically, five series were simulated in<br />
each of which one of the Monte Carlo parameters<br />
was kept deterministic (Table 3). The results of<br />
these five extra series were compared to the result of<br />
the basic set in order to evaluate the uncertainty associated<br />
with each of the selected parameters. Table<br />
4 shows the uncertainty contribution of each series, given as<br />
variances. The variance contribution of a<br />
single parameter was obtained by subtracting the<br />
total simulated variance obtained with only four<br />
stochastic parameters (e.g. series A) from the total<br />
variance obtained with five stochastic parameters<br />
(series O). Ideally, the sum of the variances corresponding<br />
to the simulation series A–E should equal<br />
the variance associated with Monte Carlo run series<br />
O, if no covariance components were generated. It is,<br />
however, noted that discrepancies occur, indicating<br />
that not all variance and covariance components are<br />
accounted for. In spite of this, the results can give a<br />
rough estimate of the relative importance of the<br />
selected sources of uncertainty.<br />
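The consistency check described above can be reproduced from the variance values listed in Table 4. The sketch below only sums the published contributions and compares them with the variance of series O; the dictionary layout and function name are illustrative assumptions.<br />

```python
# Variance contributions per output variable, as listed in Table 4:
# (contributions from series A-E, variance of series O with all five stochastic).
TABLE_4 = {
    "leaching from root zone": ([0, 192, 100, 114, 0], 370),
    "groundwater concentration": ([2, 30, 29, 28, 0], 64),
    "river flow": ([284, 345, 6, 6, 0], 499),
    "river concentration": ([0, 27, 21, 19, 0], 61),
}
SOURCES = ["precipitation", "soil texture", "soil organic matter",
           "slurry composition", "depth of reduction front"]

def summarise(contributions, variance_all):
    """Return the summed contributions, their ratio to series O, and the dominant source."""
    total = sum(contributions)
    ratio = total / variance_all
    dominant = SOURCES[contributions.index(max(contributions))]
    return total, ratio, dominant

for variable, (contributions, variance_all) in TABLE_4.items():
    total, ratio, dominant = summarise(contributions, variance_all)
    print(f"{variable}: sum A-E = {total}, sum/O = {ratio:.2f}, largest = {dominant}")
```

For all four variables the summed contributions exceed the variance of series O, by roughly 10% to 40%, which is the discrepancy the text attributes to variance and covariance components that are not accounted for.<br />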
As can be seen from Table 2 (runs 1–25) the uncertainty<br />
on the simulated annual river flows (CV = std./<br />
mean = 5%) was significantly less than the uncertainty<br />
related to the components of the nitrogen<br />
balance, i.e. nitrogen leaching (CV = 30%) and nitrate<br />
concentrations in groundwater and stream water<br />
(CV = 17%). According to Table 4 the uncertainty<br />
on simulated river flow was dominated by contributions<br />
from uncertainty on soil texture and on precipitation,<br />
whereas the uncertainties associated with<br />
components of the nitrogen balance were dominated<br />
by the uncertainty contributions from soil<br />
texture, SOM and slurry composition. Uncertainty<br />
on precipitation contributed only little to the simulated<br />
uncertainties on the nitrogen components despite<br />
the influence it had on the water balance. The depth of<br />
the reduction front appeared to have only minor influence<br />
on the uncertainty of stream water concentrations<br />
in the present simulations.<br />
Table 4<br />
Estimation of uncertainty on selected simulation results distributed on calculated variance contribution (s²) from precipitation (A), soil texture<br />
(B), soil organic matter (C), slurry composition (D), and depth of the reduction front (E), respectively<br />
Columns A to E: variance contribution from single parameters<br />
Variable | A | B | C | D | E | SUM (A:E) | All parameters O (a)<br />
Leaching from root zone (kg/ha year) | 0 | 192 | 100 | 114 | 0 | 406 | 370<br />
Groundwater concentration (mg/l) | 2 | 30 | 29 | 28 | 0 | 89 | 64<br />
River flow (mm/year) | 284 | 345 | 6 | 6 | 0 | 641 | 499<br />
River concentration (mg/l) | 0 | 27 | 21 | 19 | 0 | 67 | 61<br />
(a) Variance from simulations with all five Monte Carlo parameters included.<br />
5. Discussion and conclusions<br />
From the analysis of input error contributions it was<br />
observed that only three of the five input parameters<br />
included in the uncertainty analysis contributed<br />
significantly to the simulated variation in the model<br />
output related to the nitrogen balance, i.e. areal leaching<br />
from the root zone and average nitrate concentrations<br />
in groundwater and stream water. Of these three<br />
only one, soil texture, is related to the transport<br />
processes. The two others, SOM and slurry composition,<br />
are related to the nitrogen turnover processes.<br />
The uncertainty introduced to the driving variable<br />
precipitation influenced the simulated water balance<br />
but not the simulated nitrogen balance. This indicates<br />
that the timing of the percolating water governed by<br />
the hydraulic parameters is more important for the<br />
simulated nitrogen loads than the total annual<br />
amounts of percolation. This result is supported by<br />
other studies showing that one of the major factors<br />
influencing nitrogen losses from the root zone under<br />
northern temperate climate is the amount of readily<br />
available organic nitrogen present in the soil at the end<br />
of the growing season where groundwater recharge is<br />
initiated (Landbrugets Rådgivningscenter, 1996). The<br />
predicted uncertainty on the simulated river flow is in<br />
good agreement with results from Storm et al. (1988).<br />
The uncertainty introduced to the depth of the<br />
reduction front in the saturated zone had no influence<br />
on the simulation results. The main reason for this is<br />
that the simulated groundwater levels were shallower<br />
than normally observed in the area. This prevented the<br />
percolating water from passing through the reduced<br />
zone before entering the stream. If the hydrogeological<br />
parameters had been included in the Monte Carlo<br />
analysis, the depth of the reduction front might have<br />
contributed to the simulated variation in the nitrogen<br />
balance component, in particular stream flow concentrations,<br />
as well.<br />
A fundamental limitation of the adopted approach<br />
is that the errors due to incorrect model structure are<br />
neglected. One approach to assess such model error is<br />
through comparison of predicted and observed values.<br />
In the present case it was, however, not possible<br />
during the validation tests to identify a significant<br />
model error. This must not be taken as a general<br />
proof for a correct model structure. It only shows<br />
that the model performs without apparent model<br />
error for the particular case study.<br />
Another limitation of the adopted approach lies in<br />
the choice of associating input uncertainty to only five<br />
parameters. Although these five parameters according<br />
to our experience are the most important ones in the<br />
different processes governing the nitrate leaching and<br />
transformation, this has not been documented by<br />
systematic sensitivity analyses, either by us or by<br />
other authors. It can be argued that the uncertainties<br />
have been underestimated by neglecting the uncertainty<br />
on the other input parameters. Hence, the absolute<br />
uncertainty figures should be considered with<br />
some reservation.<br />
A third limitation is the mostly subjective method<br />
of assessing errors in input data. If suitable data had<br />
been available for assessing such errors in a statistically<br />
more rigorous way this should have been done.<br />
Cases where such data are available are typically<br />
studies on small experimental areas, while our case<br />
is more comparable to practical studies, where such<br />
data most often are not available. In spite of the weak<br />
data basis for the input error assessment, the adopted<br />
Monte Carlo analysis is still valuable as a rigorous<br />
method of analysing uncertainty propagation,<br />
although the predicted uncertainties should be treated<br />
with some caution.<br />
When considering uncertainties at different scales it<br />
must be noticed that due to the adopted approaches<br />
with respect to upscaling and Monte Carlo sampling<br />
the uncertainties can only be assumed to be correctly<br />
assessed at the catchment scale, while the uncertainties<br />
at smaller scale are underestimated. This amplifies<br />
the finding reflected in Fig. 6, namely that the<br />
uncertainties in flux concentrations leaving the root<br />
zone are much larger than the uncertainty at the catchment/aquifer<br />
scale. Taking this into account one could<br />
argue that the uncertainty in simulated flux concentrations<br />
leaving the root zone at point/grid scale is so<br />
large that this in itself may lead to the conclusion<br />
that modelling with this type of model, this grid<br />
size, and this data basis is of minor practical use.<br />
However, the uncertainty at the catchment (or aquifer)<br />
scale, which is an interesting scale seen from a water<br />
supply and policy point of view, is reduced so much<br />
that the results may be useful in practice. This duality<br />
illustrates that discussions of model uncertainty are<br />
useless unless the type of simulation result is defined<br />
precisely in terms of spatial and temporal scale, which<br />
is probably one of the reasons why 'field/process<br />
study oriented scientists' and 'modellers/large scale<br />
oriented scientists' often misunderstand each other.<br />
One way of reducing the simulated uncertainty<br />
would be to increase the quality of the input data<br />
support either by using national databases instead of<br />
the European data sets or by actually gathering site-specific<br />
data through field monitoring. The uncertainty<br />
related to the texture composition could be<br />
reduced by using national soil databases, which<br />
often include more detailed classification systems<br />
than the FAO approach provided in the GISCO database.<br />
Keeping the procedure of using pedo-transfer<br />
functions for obtaining hydraulic parameters, this<br />
would decrease the uncertainty within each defined<br />
soil class. Based on the effect of keeping soil texture<br />
deterministic (Table 4) it could for example be<br />
expected that a 50% reduction in the input error<br />
related to soil texture obtained by collecting better<br />
data in this way would reduce the uncertainty on<br />
simulated groundwater concentration by approximately<br />
25%. Gathering of better precipitation data<br />
would, on the other hand, only improve simulation<br />
of the water balance and not influence the simulated<br />
uncertainty in groundwater concentrations significantly.<br />
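The text does not spell out the arithmetic behind the figure of approximately 25%. The sketch below shows one possible first-order reading, using the Table 4 values for groundwater concentration (texture contribution 30, series O variance 64). It assumes that the texture contribution scales in proportion to the input variance and that all other contributions stay unchanged; both assumptions, and the function name, are illustrative only.<br />

```python
import math

def reduced_uncertainty(total_variance, contribution, input_variance_factor):
    """Scale one variance contribution and report the change in total uncertainty.

    input_variance_factor is the assumed ratio of new to old input variance.
    Returns (new variance, variance reduction in %, standard deviation reduction in %).
    """
    new_variance = total_variance - contribution * (1.0 - input_variance_factor)
    variance_reduction = (1.0 - new_variance / total_variance) * 100.0
    std_reduction = (1.0 - math.sqrt(new_variance / total_variance)) * 100.0
    return new_variance, variance_reduction, std_reduction

# Reading 1: the input variance of soil texture is halved.
halved_variance = reduced_uncertainty(64.0, 30.0, 0.5)
# Reading 2: the input standard deviation is halved (variance factor 0.25).
halved_std = reduced_uncertainty(64.0, 30.0, 0.25)

for label, (new_var, var_red, std_red) in [("input variance halved", halved_variance),
                                           ("input std halved", halved_std)]:
    print(f"{label}: variance {new_var:.1f}, "
          f"variance reduced by {var_red:.0f}%, std reduced by {std_red:.0f}%")
```

Both readings give reductions of the same order as the figure quoted in the text; which one the authors had in mind cannot be determined from the paper.<br />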
Another way of decreasing the uncertainty would<br />
be to carry out model calibration, as this in principle<br />
would decrease the uncertainty related to the input<br />
parameters. In practice it is, however, difficult to<br />
quantify how much the input error of a single parameter<br />
should be reduced if calibration involving this<br />
parameter is conducted. In the present study, calibration<br />
of the hydrogeological parameters by use of<br />
measured groundwater levels and observed stream<br />
flow might have influenced both the simulated<br />
groundwater concentrations by introducing a more<br />
diverse hydrology and in particular the simulated<br />
stream concentrations as the reduction front may<br />
have come into function. Calibration of the root<br />
zone processes would have required field data in<br />
terms of e.g. soil moisture contents, nitrogen concentrations<br />
in the root zone, crop yields, etc., data which<br />
are not often available. In order to get some idea of the<br />
quality of the simulated mass balances, one possibility<br />
could be to calibrate the simulated crop yields using<br />
regional agricultural statistics, though these can only<br />
provide rather rough estimates.<br />
From the results of the present study it can be<br />
concluded that the present modelling approach appears<br />
feasible for estimating uncertainties in predicted<br />
nitrate concentrations at larger scales, and hereby<br />
also for evaluating the reliability of the simulation<br />
results. The results also indicate that the use of distributed<br />
physically-based models is feasible at the catchment<br />
scale, even if data have to be obtained from<br />
readily available aggregated data sources such as<br />
European databases. Given the constraints for obtaining<br />
data and given that no model calibration was<br />
performed in the present case study, the validation<br />
tests came out surprisingly well as measured groundwater<br />
concentrations were within the uncertainty<br />
intervals of the simulated groundwater concentration.<br />
The uncertainty of the model simulations at catchment<br />
scale is at a relatively low level, and thus the predictive<br />
capability of the model appears very interesting<br />
from a practical water resources management point<br />
of view.<br />
Acknowledgements<br />
The present work was partly funded by the EC<br />
Environment and Climate Research Programme<br />
(contract number ENV4-CT95-0070). We thank the<br />
two reviewers, Tim Burt and Bernd Huwe, for valuable<br />
comments on an earlier version of this manuscript.<br />
References<br />
Abbott, M.B., Bathurst, J.C., Cunge, J.A., O'Connell, P.E., Rasmussen,<br />
J., 1986. An introduction to the European hydrological<br />
system – Système Hydrologique Européen 'SHE'. 1. History<br />
and philosophy of a physically based distributed modelling<br />
system. 2. Structure of a physically based distributed modelling<br />
system. Journal of Hydrology 87, 45–77.<br />
Agricultural Statistics, 1995. Danmarks Statistik, 294pp.<br />
Ahsan, M., O'Connor, K.M., 1994. A reappraisal of the Kalman<br />
filtering technique as applied in river flow forecasting. Journal<br />
of Hydrology 161, 197–226.<br />
Allerup, P., Madsen, H., Riis, J., 1982. Methods for calculating areal<br />
precipitation – applied to the Suså-catchment. Nordic Hydrology<br />
13, 263–278.<br />
Arnold, J.G., Williams, J.R., Nicks, A.D., Sammons, N.B., 1990.<br />
SWRRB – A Basin Scale Simulation Model for Soil and Water<br />
Resources Management. Texas A & M University Press,<br />
College Station (241 pp).<br />
Arnold, J.G., Williams, J.R., 1995. SWRRB – a watershed scale<br />
model for soil and water resources management. In: Singh, V.P.<br />
(Ed.). Computer Models of Watershed Hydrology. Water<br />
Resources Publication, pp. 847–908.<br />
Beven, K., Binley, A.M., 1992. The future role of distributed<br />
models: model calibration and predictive uncertainty. Hydrological<br />
Processes 6, 279–298.<br />
Brooks, R.H., Corey, A.T., 1966. Properties of porous media affecting<br />
fluid flow. Journal of the Irrigation and Drainage Division of<br />
the American Society of Civil Engineering 92, 61–88.<br />
Burdine, N.T., 1952. Relative permeability calculations from pore-size<br />
distribution data. Transactions of the AIME 198, 35–42.<br />
Børgesen, C.D., 2000. Personal communication. Danish Institute of<br />
Agricultural Science.<br />
Campbell, G.S., 1974. A simple method for determining unsaturated<br />
conductivity from moisture retention data. Soil Science<br />
117, 311–314.<br />
Cosby, B.J., Hornberger, M., Clapp, Ginn, T.R., 1984. A statistical<br />
exploration of relationships of soil moisture characteristics to<br />
the physical properties of soils. Water Resources Research 20,<br />
682–690.<br />
Dagan, G., 1986. Statistical theory of groundwater flow and transport:<br />
pore to laboratory, laboratory to formation, and formation<br />
to regional scale. Water Resources Research 22 (9), 120–134.<br />
DeCoursey, D.G., Rojas, K.W., Ahuja, L.R., 1989. Potentials for<br />
non-point source groundwater contamination analyzed using<br />
RZWQM. Paper no. SW892562. Presented at the International<br />
American Society of Agricultural Engineers' Winter Meeting,<br />
New Orleans, Louisiana.<br />
EC, 1982. Groundwater resources in Denmark. Commission of the<br />
European Communities. EUR 7941 (in Danish).<br />
Freeze, R.A., 1980. A stochastic-conceptual analysis of the rainfall-runoff<br />
process on a hillslope. Water Resources Research 16 (2),<br />
391–408.<br />
Gelb, A. (Ed.), 1974. Applied Optimal Estimation. MIT Press,<br />
Cambridge, MA.<br />
Gelhar, L.W., 1986. Stochastic subsurface hydrology. From theory<br />
to applications. Water Resources Research 22 (9), 135–145.<br />
Hansen, S., Jensen, H.E., 1988. Spatial variability of soil physical<br />
properties. Theoretical and experimental analysis. II. Soil water<br />
variables-data acquisition, processing and basic statistics.<br />
Research report no. 1210. Department of Soil and Water and<br />
Plant Nutrition. The Royal Veterinary and Agricultural University,<br />
Copenhagen, 54pp.<br />
Hansen, S., Jensen, H.E., Nielsen, N.E., Svendsen, H., 1991. Simulation<br />
of nitrogen dynamics and biomass production in winter<br />
wheat using the Danish simulation model DAISY. Fertiliser<br />
Research 27, 245–259.<br />
Hansen, S., Thorsen, M., Pebesma, E., Kleeschulte, S., Svendsen, H.,<br />
1999. Uncertainty in simulated leaching due to uncertainty in input<br />
data. A case study. Soil Use and Management 15, 167–175.<br />
Heng, H.H., Nikolaidis, N.P., 1998. Modelling of nonpoint source<br />
pollution of nitrogen at the watershed scale. Journal of the<br />
American Water Resources Association 34 (2), 359–374.<br />
Jensen, K.H., Mantoglou, A., 1992. Application of stochastic<br />
unsaturated flow theory, numerical simulations and comparison<br />
to field observations. Water Resources Research 28 (1),<br />
269–284.<br />
Kros, J., Pebesma, E.J., Reinds, G.J., Finke, P.A., 1999. Uncertainty<br />
assessment in modelling soil acidification at the European scale:<br />
a case study. Journal of Environmental Quality 28 (2), 366–377.<br />
Lamm, C.G., 1971. The Danish soil database. Tidskrift for Planteavl<br />
75, 703–720 (in Danish).<br />
Landbrugets Rådgivningscenter, 1996. Square grid for nitrate investigations<br />
in Danmark 1990–1993. Landskontoret for Planteavl,<br />
Skejby, Denmark (in Danish).<br />
McKay, M.D., Conover, W.J., Beckman, R.J., 1979. A comparison<br />
of three methods for selection values of input variables in the<br />
analysis of output from a computer code. Technometrics 2,<br />
239–245.<br />
Nemec, J., 1994. Distributed hydrological models in the perspective<br />
of forecasting operational real time hydrological systems<br />
(FORTHS). In: Rosso, P., Peano, A., Becchi, I., Bemporad,<br />
G.A. (Eds.). Advances in Distributed Hydrology. Water<br />
Resources Publications, pp. 69–84.<br />
Pebesma, E.J., Heuvelink, G.B.M., 1999. Latin hypercube sampling<br />
of Gaussian random fields. Technometrics 41 (4), 303–312.<br />
Plantedirektoratet, 1996. Vejledninger og skemaer 1996/1997.<br />
Ministry for Food, Agriculture and Fishery, 38pp.<br />
Refsgaard, J.C., 1996. Terminology, modelling protocol and classification<br />
of hydrological model codes. In: Abbott, M.B.,<br />
Refsgaard, J.C. (Eds.). Distributed Hydrological Modelling.<br />
Kluwer Academic, pp. 17–39.<br />
Refsgaard, J.C., Storm, B., 1995. MIKE SHE. In: Singh, V.P. (Ed.).<br />
Computer Models of Watershed Hydrology. Water Resources<br />
Publication, pp. 809–846.<br />
Refsgaard, J.C., Ramaekers, D., Heuvelink, G.B.M., Schreurs, V.,<br />
Kros, H., Rosén, L., Hansen, S., 1998. Assessment of cumulative<br />
uncertainty in spatial decision support systems: application<br />
to examine the contamination of groundwater from diffuse<br />
sources (UNCERSDSS). Presented at the European Climate<br />
Science Conference, Vienna, 19–23 October, 1998. To appear<br />
in conference proceedings.<br />
Refsgaard, J.C., Thorsen, M., Birk Jensen, J., Kleeschulte, S.,<br />
Hansen, S., 1999. Large scale modelling of groundwater<br />
contamination from nitrogen leaching. Journal of Hydrology<br />
221, 117–140.<br />
Simmelsgaard, S.E., 1991. Estimating functions for nitrogen leaching:<br />
nitrogen fertilizers in agriculture – requirement and leaching<br />
now and in the future. National Institute of Agricultural<br />
Economics, Copenhagen, Denmark (in Danish).<br />
Skop, E., 1993. Calculation of nitrogen leaching on a regional scale.<br />
Technical report no. 65. National Environmental Research Institute,<br />
Silkeborg, Denmark, 54 pp (in Danish).<br />
Smith, L., Freeze, R.A., 1979a. Stochastic analysis of steady state<br />
flow in a bounded domain. 1. One-dimensional simulations.<br />
Water Resources Research 15 (3), 521–528.<br />
Smith, L., Freeze, R.A., 1979b. Stochastic analysis of steady state<br />
flow in a bounded domain. 2. Two-dimensional simulations.<br />
Water Resources Research 15 (6), 1543–1559.<br />
Smith, R.E., 1992. An integrated simulation model of<br />
nonpoint-source pollutants at the field scale. Department of<br />
Agriculture, Agricultural Research Service, 120pp.<br />
Storm, B., Jensen, K.H., Refsgaard, J.C., 1988. Estimation of catchment<br />
rainfall uncertainty and its influence on runoff prediction.<br />
Nordic Hydrology 19, 77–88.<br />
Styczen, M., Storm, B., 1993. Modelling of N-movements on catchment<br />
scale – a tool for analysis and decision making. 1. Model<br />
description. & 2. A case study. Fertiliser Research 36, 1–17.<br />
Tietje, O., Tapkenhinrichs, M., 1993. Evaluation of pedo-transfer<br />
functions. Soil Science Society of America Journal 57, 1088–1095.<br />
Wood, E., O'Connell, P.E., 1985. Real-time forecasting. In: Anderson,<br />
M.G., Burt, T.P. (Eds.). Hydrological Forecasting. Wiley,<br />
New York, pp. 505–558.<br />
Wood, E.F., Sivapalan, M., Beven, K.J., Band, L., 1988. Effects of<br />
spatial variability and scale with implications to hydrologic<br />
modelling. Journal of Hydrology 102, 29–47.<br />
Zhang, H., Haan, C.T., Nofziger, D.L., 1993. An approach to estimating<br />
uncertainties in modelling transport of solutes through<br />
soils. Journal of Contaminant Hydrology 12, 35–50.
[12]<br />
Refsgaard JC, Henriksen HJ (2004) Modelling guidelines – terminology and<br />
guiding principles.<br />
Advances in Water Resources, 27(1), 71-82.<br />
Reprinted from Advances in Water Resources with permission from Elsevier
Advances in Water Resources 27 (2004) 71–82<br />
www.elsevier.com/locate/advwatres<br />
Modelling guidelines – terminology and guiding principles<br />
Jens Christian Refsgaard * , Hans Jørgen Henriksen<br />
Department of Hydrology, Geological Survey of Denmark and Greenland (GEUS), Øster Voldgade 10, Copenhagen DK-1350, Denmark<br />
Received 29 October 2002; received in revised form 7 August 2003; accepted 18 August 2003<br />
Abstract<br />
Some scientists argue, with reference to Popper’s scientific philosophical school, that models cannot be verified or validated.<br />
Other scientists and many practitioners nevertheless use these terms, but with very different meanings. As a result of an increasing<br />
number of examples of model malpractice and mistrust in the credibility of models, several modelling guidelines have been elaborated<br />
in recent years with the aim of improving the quality of modelling studies. This gap between the views and the lack of<br />
consensus experienced in the scientific community and the strongly perceived need for commonly agreed modelling guidelines is<br />
constraining the optimal use and benefits of models. This paper proposes a framework for quality assurance guidelines, including a<br />
consistent terminology and a foundation for a methodology bridging the gap between scientific philosophy and pragmatic modelling.<br />
A distinction is made between the conceptual model, the model code and the site-specific model. A conceptual model is<br />
subject to confirmation or falsification like scientific theories. A model code may be verified within given ranges of applicability and<br />
ranges of accuracy, but it can never be universally verified. Similarly, a model may be validated, but only with reference to site-specific<br />
applications and to pre-specified performance (accuracy) criteria. Thus, a model’s validity will always be limited in terms<br />
of space, time, boundary conditions and types of application. This implies a continuous interaction between manager and modeller<br />
in order to establish suitable accuracy criteria and predictions associated with uncertainty analysis.<br />
© 2003 Elsevier Ltd. All rights reserved.<br />
Keywords: Model guidelines; Scientific philosophy; Validation; Verification; Confirmation; Domain of applicability; Uncertainty<br />
1. Introduction<br />
Models describing water flows, water quality and<br />
ecology are being developed and applied in increasing<br />
number and variety. With the requirements imposed by<br />
the EU Water Framework Directive the trend in recent<br />
years to base water management decisions to a larger<br />
extent on model studies and to use more sophisticated<br />
models is likely to be reinforced. At the same time<br />
insufficient attention is generally given to documenting<br />
the predictive capability of the models. Therefore, contradictions<br />
emerge regarding the various claims of<br />
model applicability on the one hand and the lack of<br />
documentation of these claims on the other hand.<br />
Hence, the credibility of the models is often questioned,<br />
and sometimes with good reason.<br />
* Corresponding author. Tel.: +45-38-14-27-76; fax: +45-38-14-20-50.<br />
E-mail address: jcr@geus.dk (J.C. Refsgaard).<br />
As emphasised by e.g. Forkel [12], modelling studies<br />
involve several partners with different responsibilities.<br />
The ‘key players’ are code developers, model users and<br />
water resources managers. However, due to the complexity<br />
of the modelling process and the different backgrounds<br />
of these groups, gaps in terms of lack of mutual<br />
understanding easily develop. For example, the strengths<br />
and limitations of modelling applications are most often<br />
difficult, if not impossible, to assess by the water resources<br />
managers. Similarly, the transformation of water<br />
managers’ objectives to specific performance criteria can<br />
be very difficult to assess for the model users. Due to lack<br />
of documentation and transparency, modelling projects<br />
can be difficult to audit, and without a considerable effort<br />
it is hardly possible to reconstruct, repeat and reproduce<br />
the modelling process and its results.<br />
In the water resources management community a<br />
number of different guidelines on good modelling practice<br />
have been prepared. One of the most, if not the most,<br />
comprehensive examples of modelling guidelines has<br />
been developed in The Netherlands [37] as a result of a<br />
process involving all the main players in the Dutch water<br />
management field. The background for this process was<br />
a perceived need for improving the quality in modelling<br />
0309-1708/$ - see front matter © 2003 Elsevier Ltd. All rights reserved.<br />
doi:10.1016/j.advwatres.2003.08.006
72 J.C. Refsgaard, H.J. Henriksen / Advances in Water Resources 27 (2004) 71–82<br />
by addressing malpractice such as careless handling of<br />
input data, insufficient calibration and validation and<br />
model use outside its scope [34]. Similarly, the background<br />
for modelling guidelines for the Murray–Darling<br />
Basin in Australia was a perception among the end-users<br />
that model capabilities may have been ‘over-sold’, and<br />
that there is a lack of consistency in approaches, communication<br />
and understanding among and between<br />
modellers and water resources managers, often resulting<br />
in considerable uncertainty for decision making [25].<br />
A key problem in relation to establishment of generally<br />
acceptable modelling guidelines is confusion on<br />
terminology. For example, the terms validation and<br />
verification are used with different, and sometimes<br />
interchangeable, meanings by different authors. The<br />
confusion arises from both semantic and philosophical<br />
considerations [32]. Another important problem is the<br />
lack of consensus related to the so far non-conclusive<br />
debate on the fundamental question concerning whether<br />
a water resources model can be validated or verified, and<br />
whether it as such can be claimed to be suitable or valid<br />
for particular applications [3,11,16,20,26].<br />
Finally, modelling guidelines have to reflect and be in<br />
line with the underlying philosophy of environmental<br />
modelling, which has changed significantly during the<br />
past decades from what in retrospect may be called<br />
rather naive enthusiasm (see for example Freeze and<br />
Harlan [13] and Abbott [1]; many of us focussed on the<br />
huge potentials of sophisticated models outlined in these<br />
early days without reflecting too much on the associated<br />
limitations) to what now appears to be a much more<br />
balanced and mature view (e.g. Beven [7,9]).<br />
Thus, there is a gap between theory and practice,<br />
i.e. between the various, contradictory views and the<br />
lack of a common terminology and methodology in the<br />
scientific community on the one side, and the need for<br />
quality assurance guidelines for practical model<br />
applications on the other side. The objective of the<br />
present paper is to establish guiding principles for<br />
quality assurance guidelines, including establishing a<br />
consistent terminology and a foundation for a methodology<br />
bridging the gap between scientific philosophy and<br />
pragmatic modelling.<br />
2. Key opinions in the scientific community<br />
The present paper does not attempt to provide a full<br />
review of all relevant papers on this subject. Rather, it<br />
provides a review of a few selected characteristic<br />
examples.<br />
2.1. Terminology<br />
No unique and generally accepted terminology and<br />
methodology exist at present in the scientific community<br />
with respect to modelling protocol and guidelines for<br />
good modelling practice. Examples of general methodologies<br />
exist [4,32,33], but they use different terminology<br />
and have significant differences with respect to the<br />
underlying scientific philosophy.<br />
A rigorous and comprehensive terminology for model<br />
credibility was presented by Schlesinger et al. [33]. This<br />
terminology was developed by a committee composed of<br />
members from diverse disciplines and backgrounds with<br />
the intent that it could be employed in all types of simulation<br />
applications. In regard to terminology, distinctions<br />
are made between model qualification (adequacy<br />
of conceptual model), model verification (adequacy of<br />
computer programme) and model validation (adequacy<br />
of site-specific model). With the exception of a few<br />
important terms, such as generic model code and model<br />
calibration, which are not considered by Schlesinger<br />
et al. [33], their proposed terminology includes all the<br />
important elements of the modelling process.<br />
Konikow and Bredehoeft [20], in their thought-provoking<br />
paper, express the view that ‘‘the terms validation<br />
and verification have little or no place in<br />
groundwater science; these terms lead to a false impression<br />
of model capability’’. Their main argument relates<br />
to the anti-positivistic view that a theory (in this case a<br />
model) can never be proved to be generally valid, but<br />
may on the contrary be falsified by just one example. They<br />
argue and recommend that the term history matching,<br />
which does not indicate a claim of predictive capability,<br />
should be used instead.<br />
Oreskes et al. [26], in their classic and philosophically<br />
based paper, distinguish between verification, validation<br />
and confirmation:<br />
• Verify is ‘‘an assertion or establishment of truth’’. To<br />
verify a model therefore means to demonstrate its<br />
truth. According to the authors ‘‘verification is only<br />
possible in closed systems in which all the components<br />
of the system are established independently and<br />
are known to be correct. In its application to models<br />
of natural systems, the term verification is highly misleading.<br />
It suggests a demonstration of proof that is<br />
simply not accessible’’. They argue that mathematical<br />
components are subject to verification, because they<br />
are part of closed systems, but numerical models in<br />
application cannot be verified because of uncertainty<br />
of input parameters, scaling problems and uncertainty<br />
in observations.<br />
• The term validation is weaker than the term verification.<br />
Thus validation does not necessarily denote an<br />
establishment of truth, but rather ‘‘the establishment<br />
of legitimacy, typically given in terms of contracts,<br />
arguments and methods’’. They argue that ‘‘the term<br />
valid may be useful for assertions about a generic<br />
model code but is clearly misleading if used to refer<br />
to actual model results in any particular realisation’’.<br />
• The term confirmation is weaker than the terms verification<br />
and validation. It is used with regard to a theory,<br />
when it is found that the theory is in agreement<br />
with empirical observations. As discussed below such<br />
agreement does not prove that the theory is true, it<br />
only confirms it.<br />
Oreskes et al. [26] do not define how the terms verification<br />
and validation should be used, but rather define<br />
their meaning and set limitations to the contexts in<br />
which they meaningfully can be used.<br />
An important distinction is made between open and<br />
closed systems. A system is a closed system if its true<br />
conditions can be predicted or computed exactly. This<br />
applies to mathematics and mostly to physics and<br />
chemistry. Systems where the true behaviour cannot be<br />
computed due to uncertainties and lack of knowledge on<br />
e.g. input data and parameter values are called open<br />
systems. The systems we are dealing with in water resources<br />
management, based on geosciences, biology and<br />
socio-economy, are open systems.<br />
It may be argued that e.g. the behaviour of a<br />
groundwater flow system can be predicted correctly if all<br />
the details of the subsurface (soil system and geological<br />
system) media were known, because the fundamental<br />
physical laws governing the flow are known. However,<br />
in practice it will never be possible to know all the details<br />
of the media down to molecular scale, and hence<br />
uncertainties will always exist. For instance, several<br />
alternative representations of the subsurface system at<br />
microscopic scale will be able to provide the same<br />
flow field at a macroscopic scale. Therefore, the results<br />
from a groundwater flow model are said to be non-unique.<br />
In addition, as the system is a so-called open<br />
system, the boundary conditions generate further<br />
uncertainty.<br />
Matalas et al. [24] draw a distinction between the<br />
terms ‘model’ and ‘theory’. They state that ‘‘a theory<br />
represents a synthesis of understanding, which provides<br />
not only a description of what constitutes the states of<br />
the system and their connectedness (i.e. postulated<br />
concepts), but also deducted consequences from these<br />
postulates. A model is an analogy or an abstraction,<br />
which ...may be derived intuitively and without formal<br />
deductive capability’’.<br />
Rykiel [32] argues that models can be validated as<br />
acceptable for pragmatic purposes, whereas theoretical<br />
validity is always provisional. In this respect he, like<br />
Matalas et al. [24], distinguishes between scientific<br />
models and predictive (engineering) models. Scientific<br />
models can be corroborated (confirmed) or refuted<br />
(falsified) in the sense of hypothesis testing, while predictive<br />
models can be validated or invalidated in the<br />
sense of engineering performance testing. Thus according<br />
to Rykiel [32], validation is not a procedure for<br />
testing scientific theory or for certifying the ‘truth’ of<br />
current scientific understanding, but rather a testing<br />
of whether a model is acceptable for its intended use.<br />
Within the hydraulic engineering community attempts<br />
have been made to establish a common quality<br />
assurance methodology (IAHR [18]). The IAHR methodology<br />
comprises guidelines for standard validation<br />
documents, where validation of a software package is<br />
considered in four steps [10,23]: conceptual validation,<br />
algorithmic validation, software validation and functional<br />
validation. It is noted that the term validation in<br />
the IAHR methodology corresponds to what other authors<br />
call code verification, while schemes for validation<br />
of site-specific models are not included.<br />
2.2. Scientific philosophical aspects of verification and<br />
validation<br />
Different principal schools of philosophical thought<br />
exist on the issue of verification and validation. During<br />
the second half of the 19th century and the first half of<br />
the 20th century positivism was the dominant philosophical<br />
school. Matalas et al. [24] characterise the<br />
positivistic school in the following way: ‘‘...theories are<br />
proposed through inductive logic, and the proposed<br />
theories are confirmed or refuted on the basis of critical<br />
experiments designed to verify the consequences of the<br />
theories. And through theory reduction or adoption of<br />
new or modified theories, science is able to approach<br />
truth’’. The logical rationale behind positivism is the<br />
inductive method, i.e. the inference from singular<br />
statements, such as accounts of results of observations<br />
or experiments, to universal statements, such as hypotheses<br />
or theories.<br />
Popper [29] opposed the positivistic school arguing<br />
that science is deductive rather than inductive, and that<br />
theories cannot be verified, only falsified. The deductive<br />
method implies inferences from a universal statement to<br />
a singular statement, where conclusions are logically<br />
derived from given premises. Science is considered as a<br />
hypothetico-deductive activity, implying that empirical<br />
observations must be framed as deductive consequences<br />
of a general theory or scientific law. If the observations<br />
can be shown to be true then the theory or law is said to<br />
be corroborated. Popper used the term corroboration instead<br />
of confirmation, because he ‘‘wanted a neutral<br />
term to describe the degree to which a theory has stood<br />
up to severe tests and proved its mettle’’.<br />
The greater the number and diversity of confirming<br />
observations the more credible the theory or law becomes.<br />
But no matter how much data and how many<br />
confirmations we have, there will always be the possibility<br />
that more than one theory can explain the observations.<br />
Over time the false theories are likely to be<br />
confronted with observations that falsify them. Thus,<br />
scientific theories are never certain or proved but only<br />
hypotheses subject to corroboration or falsification.<br />
Popper [29] distinguished between two kinds of universal<br />
statements: the ‘strictly universal’ and the ‘numerical<br />
universal’. The strictly universal statements are<br />
those usually dealt with when speaking about theories<br />
or natural laws. They are a kind of ‘all-statement’<br />
claiming to be true for any place and any time. In contrast,<br />
numerical universal statements refer only to a<br />
finite class of specific elements within a finite individual<br />
spatio-temporal region. A numerical universal statement<br />
is thus in fact equivalent to conjunctions of singular<br />
statements.<br />
Kuhn [21] also strongly criticised positivism, and in<br />
a discussion of selection of correct scientific theories<br />
(paradigms) states ‘‘... few philosophers of science still<br />
seek absolute criteria for the verification of scientific<br />
theories. Noting that no theory can ever be exposed to<br />
all possible relevant tests, they ask not whether a theory<br />
has been verified but rather about its probability in the<br />
light of the evidence that actually exists. And to answer<br />
that question one important school is driven to compare<br />
the ability of different theories to explain the evidence at<br />
hand.’’<br />
According to the deductive approach a given system<br />
is reduced into elements or sub-systems that are closed,<br />
i.e. without uncertainties from the boundary or initial<br />
conditions, and a given hypothesis is then confirmed by<br />
use of causal relationships and rigorous logic. The<br />
deductive method is the traditional scientific philosophy<br />
and methodology for ‘exact sciences’ such as physics and<br />
chemistry. Hansen [15] and Baker [5] argue that this<br />
deductive or ‘theory-directed’ scientific method is not<br />
suitable for earth sciences, such as geology and biology,<br />
which are characterised by open systems, and where<br />
many of the signs in the historical development process<br />
are not preserved. Instead, they argue for another scientific<br />
method, which they, respectively, denote ‘holistic’<br />
or ‘earth-directed’. The earth-directed scientific method<br />
does not focus on idealised theories verified in experimental<br />
laboratories. Instead, it is oriented towards<br />
observations in nature, uncontrolled by artificial constraints.<br />
The earth-directed method, being more ‘soft’<br />
and accepting conclusions on the complex state of nature<br />
from an integration of many observations, but<br />
without the logically rigorous proof required by the<br />
deductive method, can be argued to be well in line with<br />
Popper’s philosophy where the scientific knowledge<br />
comprises a variety of falsifiable theories that are subject<br />
to tests against observations [15].<br />
2.3. Philosophy of environmental modelling<br />
Following several papers (ranging from Beven [6] to<br />
Beven [7]) with comprehensive critique of the predominant<br />
philosophy underlying most environmental modelling,<br />
Beven [9] outlines a new philosophy for modelling<br />
of environmental systems. The basic aim of this new<br />
approach is to extend the most common, past approach<br />
with a more realistic account of uncertainty, rejecting the<br />
idea of being able to identify only one optimal model as<br />
being the most reliable for a given case. His basic idea is<br />
in line with Oreskes et al. [26] that verification and<br />
validation of environmental models are impossible, because<br />
natural systems are open. Instead, environmental<br />
models may be non-unique and subject only to conditional<br />
confirmation, due to e.g. errors in model structure, calibration<br />
of parameters and period of data used for<br />
evaluation. Consequently, there will always be the possibility<br />
of equifinality in that many different model<br />
structures and parameter sets may give simulations that<br />
cannot be falsified from the available observational<br />
data. Beven therefore argues that the range of behavioural<br />
models (structures and parameter sets) is best<br />
represented in terms of mapping of the ‘landscape space’<br />
into the ‘model space’, and that uncertainty predictions<br />
should consider all the behavioural models.<br />
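The idea of retaining every behavioural model, rather than a single optimum, can be illustrated with a minimal Python sketch.<br />
Everything in it is an assumption made for illustration and not part of the approach described above: the toy linear-reservoir<br />
model with hypothetical parameters k and c, the synthetic observations, the Nash-Sutcliffe efficiency as the measure of fit,<br />
and the acceptance threshold of 0.7.<br />

```python
import random


def simulate(rain, k, c, storage=10.0):
    """Toy linear reservoir (hypothetical): outflow = k * storage, inflow = c * rain."""
    flows = []
    for p in rain:
        q = k * storage
        storage = storage + c * p - q
        flows.append(q)
    return flows


def nse(obs, sim):
    """Nash-Sutcliffe efficiency; 1.0 is a perfect fit."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


rng = random.Random(1)
rain = [rng.choice([0.0, 0.0, 2.0, 5.0, 12.0]) for _ in range(120)]

# Synthetic "observations": a known parameter set plus 10 % multiplicative noise.
observed = [q * (1 + rng.gauss(0, 0.1)) for q in simulate(rain, 0.2, 0.8)]

THRESHOLD = 0.7  # hypothetical performance criterion

# Sample candidate parameter sets and keep every set that cannot be rejected.
candidates = [(rng.uniform(0.05, 0.5), rng.uniform(0.4, 1.2)) for _ in range(2000)]
behavioural = [
    (k, c) for k, c in candidates
    if nse(observed, simulate(rain, k, c)) >= THRESHOLD
]

# Prediction bounds taken over all behavioural sets, time step by time step.
runs = [simulate(rain, k, c) for k, c in behavioural]
lower = [min(step) for step in zip(*runs)]
upper = [max(step) for step in zip(*runs)]

print(f"{len(behavioural)} of {len(candidates)} parameter sets are behavioural")
```

In this sketch, more than one parameter pair passes the threshold, which is the equifinality referred to above, and the<br />
lower and upper series bound the simulations of all accepted sets instead of reporting those of one calibrated set.<br />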
3. Proposed terminology and methodological framework<br />
The following terminology is inspired by the generalised<br />
terminology for model credibility proposed by<br />
Schlesinger et al. [33], but modified and extended to<br />
accommodate some of the scientific philosophical issues<br />
raised above. The simulation environment is divided<br />
into four basic elements as shown in Fig. 1. The inner<br />
arrows describe the processes that relate the elements to<br />
each other, and the outer circle refers to the procedures<br />
that evaluate the credibility of these processes.<br />
In general terms a model is understood as a simplified<br />
representation of the natural system it attempts to describe.<br />
However, in the terminology proposed below a<br />
distinction is made between three different meanings of<br />
the general term model, namely the conceptual model,<br />
the model code and the model that here is defined as a<br />
site-specific model. The most important elements in the<br />
terminology and their interrelationships are defined as<br />
follows:<br />
Reality: The natural system, understood here as the<br />
study area.<br />
Conceptual model: A description of reality in terms of<br />
verbal descriptions, equations, governing relationships<br />
or ‘natural laws’ that purport to describe reality. This is<br />
the user’s perception of the key hydrological and ecological<br />
processes in the study area (perceptual model)<br />
and the corresponding simplifications and numerical<br />
accuracy limits that are assumed acceptable in order to<br />
achieve the purpose of the modelling. A conceptual<br />
model thus includes both a mathematical description<br />
(equations) and a description of flow processes, river<br />
system elements, ecological structures, geological features,<br />
etc. that are required for the particular purpose of<br />
modelling. By drawing an analogy to the scientific<br />
philosophical discussion above, the conceptual model in<br />
other words constitutes the scientific hypothesis or theory<br />
that we assume for our particular modelling study.<br />
Fig. 1. Elements of a modelling terminology. Modified after Schlesinger et al. [33].<br />
Model code: A mathematical formulation in the form<br />
of a computer program that is so generic that it, without<br />
program changes, can be used to establish a model with<br />
the same basic type of equations (but allowing different<br />
input variables and parameter values) for different study<br />
areas.<br />
Model: A site-specific model established for a particular<br />
study area, including input data and parameter<br />
values.<br />
Model confirmation: Determination of adequacy of<br />
the conceptual model to provide an acceptable level of<br />
agreement for the domain of intended application. This<br />
is in other words the scientific confirmation of the theories/hypotheses<br />
included in the conceptual model.<br />
Code verification: Substantiation that a model code is<br />
in some sense a true representation of a conceptual<br />
model within certain specified limits or ranges of application<br />
and corresponding ranges of accuracy.<br />
Model calibration: The procedure of adjustment of<br />
parameter values of a model to reproduce the response<br />
of reality within the range of accuracy specified in the<br />
performance criteria.<br />
Model validation: Substantiation that a model within<br />
its domain of applicability possesses a satisfactory range<br />
of accuracy consistent with the intended application of<br />
the model.<br />
Model set-up: Establishment of a site-specific model<br />
using a model code. This requires, among other things,<br />
the definition of boundary and initial conditions and<br />
parameter assessment from field and laboratory data.<br />
Simulation: Use of a validated model to gain insight<br />
into reality and obtain predictions that can be used by<br />
water managers. This includes insight into how reality<br />
can be expected to respond to human interventions. In<br />
this connection uncertainty assessments of the model<br />
predictions are very important.<br />
Performance criteria: Level of acceptable agreement<br />
between model and reality. The performance criteria<br />
apply both for model calibration and model validation.<br />
Domain of applicability (of conceptual model): Prescribed<br />
conditions for which the conceptual model has<br />
been tested, i.e. compared with reality to the extent<br />
possible and judged suitable for use (by model confirmation).<br />
Domain of applicability (of model code): Prescribed<br />
conditions for which the model code has been tested, i.e.<br />
compared with analytical solutions, other model codes<br />
or similar to the extent possible and judged suitable for<br />
use (by code verification).<br />
Domain of applicability (of model): Prescribed conditions<br />
for which the site-specific model has been tested,<br />
i.e. compared with reality to the extent possible and<br />
judged suitable for use (by model validation).<br />
The credibility of the descriptions or the agreements<br />
between reality, conceptual model, model code and<br />
model are evaluated through the terms confirmation,<br />
verification, calibration and validation. Thus, the relation<br />
between reality and the scientific description of reality<br />
which is constituted by the conceptual model with its<br />
theories and equations on flow and transport processes,<br />
its interpretation of the geological system and ecosystem<br />
at hand, etc., is evaluated through the confirmation of<br />
the conceptual model. As a logical consequence of our<br />
position on scientific methodology, we use the term<br />
confirmation in connection with conceptual model. This<br />
implies that we agree that it is never possible to prove<br />
the truth of a theory/hypothesis and as such of a conceptual<br />
model. And even if a site-specific model is<br />
eventually accepted as valid for specific conditions, this<br />
is not a proof that the conceptual model is true, because,<br />
due to non-uniqueness, the site-specific model may turn<br />
out to perform right for the wrong reasons.<br />
Methods for conceptual model confirmation should<br />
follow the standard procedures for confirmation of scientific<br />
theories. This implies that conceptual models<br />
should be confronted with actual field data and be<br />
subject to critical peer reviews. Furthermore, the feedback<br />
from the calibration and validation process may<br />
also serve as a means by which one or a number of<br />
alternative conceptual model(s) may be either confirmed<br />
or falsified.<br />
The ability of a given model code to adequately describe<br />
the theory and equations defined in the conceptual<br />
model by use of numerical algorithms is evaluated<br />
through the verification of the model code. Use of the<br />
term verification in this respect is in accordance with<br />
Oreskes et al. [26], because mathematical equations are<br />
closed systems. The methodologies used for code verification<br />
include comparing a numerical solution with an<br />
analytical solution or with a numerical solution from<br />
other verified codes. However, some programme errors<br />
only appear under circumstances that do not routinely<br />
occur, and may not have been anticipated. Furthermore,<br />
for complex codes it is virtually impossible to verify that<br />
the code is universally accurate and error-free. Therefore,<br />
the term code verification must be qualified in<br />
terms of specified ranges of application and corresponding<br />
ranges of accuracy. A code may be applied<br />
outside its documented ranges of application, but in<br />
such cases it must not carry the label ‘verified’ and<br />
caution should be expressed with respect to its results.<br />
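A minimal Python sketch of code verification against an analytical solution, qualified by a range of application, is given<br />
here for illustration. It rests on assumptions that are not taken from the text: the "model code" is a hypothetical explicit<br />
Euler solver for the recession equation dS/dt = -kS, the analytical solution is S0 exp(-kt), and the accuracy range is a<br />
relative error of at most 5 % over a fixed simulation period.<br />

```python
import math


def euler_recession(s0, k, dt, n_steps):
    """Hypothetical 'model code' under test: explicit Euler for dS/dt = -k * S."""
    s = s0
    series = [s]
    for _ in range(n_steps):
        s = s - k * s * dt
        series.append(s)
    return series


def max_relative_error(k, dt, t_end=10.0, s0=100.0):
    """Largest relative deviation from the analytical solution S0 * exp(-k * t)."""
    n_steps = int(round(t_end / dt))
    numerical = euler_recession(s0, k, dt, n_steps)
    worst = 0.0
    for i, s_num in enumerate(numerical):
        s_exact = s0 * math.exp(-k * i * dt)
        worst = max(worst, abs(s_num - s_exact) / s_exact)
    return worst


TOLERANCE = 0.05                     # assumed accuracy range: 5 % relative error
K = 0.5                              # assumed recession constant
TIME_STEPS = (0.01, 0.02, 0.5, 1.0)  # assumed candidate range of application

# The code earns the label 'verified' only for the time steps that pass.
verified_range = {dt: max_relative_error(K, dt) <= TOLERANCE for dt in TIME_STEPS}

for dt, ok in verified_range.items():
    print(f"dt = {dt}: {'verified' if ok else 'outside verified range'}")
```

The outcome is a table of conditions rather than a single verdict: under these assumptions the small time steps pass and<br />
the large ones do not, so any statement of verification has to carry the tested range with it.<br />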
The application of a model code to be used for setting<br />
up a site-specific model is usually associated with model<br />
calibration. The model performance during calibration<br />
depends on the quantity and quality of the available<br />
input and observation data as well as on the conceptual<br />
model. If sufficient accuracy cannot be achieved, the<br />
conceptual model and/or the data have to be re-evaluated.<br />
A discussion of the problems and methodologies<br />
in model calibration is provided by Gupta et al.<br />
[14].<br />
Often the model performance during calibration is<br />
used as a measure of the predictive capability of a<br />
model. This is a fundamental error. Many studies (e.g.<br />
Refsgaard and Knudsen [31]; Liden [22]) have demonstrated<br />
that the model performance against independent<br />
data not used for calibration is generally poorer than the<br />
performance achieved in the calibration situation.<br />
Therefore, the credibility of a site-specific model’s<br />
capability to make predictions about reality must be<br />
evaluated against independent data. This process is denoted<br />
model validation. In designing suitable model<br />
validation tests a guiding principle should be that a<br />
model should be tested to show how well it can perform<br />
the kind of task for which it is specifically intended [19].<br />
This implies for instance that for the case where a model<br />
is intended to be used for conditions similar to conditions<br />
where test data exist, such as extension of<br />
streamflow records, a standard split-sample test may be<br />
applied. However, models are often intended to be used<br />
as management tools to help answer questions such as:<br />
What happens to the water resources if land use is<br />
changed? In such cases no site-specific test data exist and<br />
the question of defining a validation test scheme becomes<br />
non-trivial.<br />
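For the simpler case where test data do exist, a standard split-sample test can be sketched as follows. The sketch is only<br />
illustrative and all specifics are assumptions: a hypothetical one-parameter linear reservoir, synthetic observations, a grid<br />
search as the calibration method, the Nash-Sutcliffe efficiency as the measure of fit and a pre-specified criterion of 0.6.<br />

```python
import random


def simulate(rain, k, storage=10.0):
    """Hypothetical one-parameter linear reservoir: outflow = k * storage."""
    flows = []
    for p in rain:
        q = k * storage
        storage += p - q
        flows.append(q)
    return flows


def nse(obs, sim):
    """Nash-Sutcliffe efficiency; 1.0 is a perfect fit."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


rng = random.Random(7)
rain = [rng.choice([0.0, 0.0, 1.0, 4.0, 10.0]) for _ in range(200)]
# Synthetic "observations": true k = 0.3 plus 15 % multiplicative noise.
observed = [q * (1 + rng.gauss(0, 0.15)) for q in simulate(rain, 0.3)]

CRITERION = 0.6        # performance criterion, fixed before the test is run
half = len(rain) // 2  # first half: calibration, second half: validation

# Calibration: adjust k using the first half of the record only.
grid = [i / 100 for i in range(5, 96)]
best_k = max(grid, key=lambda k: nse(observed[:half], simulate(rain[:half], k)))
nse_cal = nse(observed[:half], simulate(rain[:half], best_k))

# Validation: run the calibrated model over the full record so that the storage
# state carries over, then score only the data that were not used for calibration.
sim_full = simulate(rain, best_k)
nse_val = nse(observed[half:], sim_full[half:])
validated = nse_val >= CRITERION

print(f"k = {best_k:.2f}, calibration NSE = {nse_cal:.2f}, validation NSE = {nse_val:.2f}")
print("validated for this period and criterion" if validated else "not validated")
```

Only the validation score is taken as evidence of predictive capability, and the verdict applies to this period, these<br />
conditions and this criterion. A question such as a land use change would need a different test scheme.<br />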
4. Discussion<br />
4.1. Scientific philosophical aspects<br />
The fundamental view expressed by scientific philosophers<br />
is that verification and validation of numerical<br />
models of natural systems is impossible, because natural<br />
systems are never closed and because the mapping of<br />
model results is always non-unique [26]. Thus, seen<br />
from a theoretical point of view it is tempting to conclude that<br />
the establishment of modelling guidelines comprising<br />
these terms simply is not possible.<br />
On the other hand, there is a large and increasing<br />
need to establish guidelines to improve the quality of<br />
modelling, and such guidelines need to address the issues<br />
of verification and validation in order to be operational<br />
in practice. Irrespective of what the scientific community<br />
decides regarding terminology and validation methodology,<br />
including the associated philosophical aspects,<br />
models are being used more and more to support water<br />
resources management in practice. As long as the present<br />
situation continues, characterised by a large degree<br />
of confusion on terminology and methodology, the potential<br />
benefits of using models are severely constrained.<br />
Models are often subject to either ‘overselling’ or ‘mistrust’,<br />
and misunderstandings between model users and<br />
water resources managers may easily occur in the absence<br />
of a commonly accepted and understood ‘language’.<br />
Thus, establishment of a terminology and<br />
methodology that bridge the gap between scientific<br />
philosophy and pragmatic modelling is a key<br />
challenge.<br />
This gap between a scientific philosophical and a<br />
pragmatic modelling position is also clearly reflected in<br />
the dialogue between Konikow and Bredehoeft [20] and<br />
De Marsily et al. [11]. Following the Popperian school,<br />
Konikow and Bredehoeft [20] express the view that ‘‘the<br />
terms validation and verification have little or no place<br />
in ground-water science; these terms lead to a false<br />
impression of model capability’’. De Marsily et al. [11],<br />
in a response, argue for a more pragmatic view: ‘‘...<br />
using the model in a predictive mode and comparing it<br />
with new data is not a futile exercise; it makes a lot of<br />
sense to us. It does not prove that the model will be<br />
correct for all circumstances, it only increases our confidence<br />
in its value. We do not want certainty; we will be<br />
satisfied with engineering confidence.’’<br />
With regard to scientific methodology we fundamentally<br />
agree with the views of Popper [29] and the<br />
earth-directed theoretical method described by Baker<br />
[5]. Consequently, we agree with the view of Oreskes<br />
et al. [26], Konikow and Bredehoeft [20] and many<br />
others that it is not possible to carry out model verification<br />
or model validation, if these terms are used<br />
without restriction to domains of applicability and levels<br />
of accuracy.<br />
The restrictions in use of the terms confirmation,<br />
verification and validation imposed by the respective<br />
domains of applicability imply, according to Popper’s<br />
views, that the conceptual model, model code and<br />
site-specific models can only be classified as numerical<br />
universal statements as opposed to strictly universal<br />
statements. This distinction is fundamental for our<br />
proposed methodology and its link to scientific philosophical<br />
theories.<br />
4.2. Model confirmation, verification and validation<br />
An important aspect of our proposed methodology<br />
lies in the separation between the three different ‘versions’<br />
of the word model, namely the conceptual model,<br />
the model code and the site-specific model. This separation<br />
is in line with Matalas et al. [24] and Rykiel [32],<br />
who distinguish between the theory (conceptual model)<br />
and the engineering model (the site-specific model).<br />
Similarly, Schlesinger et al. [33] distinguish between<br />
conceptual model and computerised model. Schlesinger<br />
et al. [33], Matalas et al. [24] and Rykiel [32] do not<br />
separate the model code from the site-specific model.<br />
Due to this distinction it is possible, at a general level,<br />
to talk about confirmation of a theory or a hypothesis<br />
about how nature can be described using the relevant<br />
scientific method for that purpose, and, at a site-specific<br />
level, to talk about validity of a given model within<br />
certain domains of applicability and associated with<br />
specified accuracy limits.<br />
As Beven [9] argues, we need to distinguish between<br />
our qualitative understanding (perceptual model) and<br />
the practical implementation of that understanding in<br />
our conceptual model. As we have defined a conceptual<br />
model as a combination of a perceptual model and the<br />
simplifications acceptable for a particular model study, a<br />
conceptual model becomes site-specific and even<br />
case-specific. For example, a conceptual model of a groundwater<br />
aquifer may be described as two-dimensional for a<br />
study focussing on regional groundwater heads, while it<br />
may need to include more complex three-dimensional<br />
geological structures for detailed simulation of solute<br />
transport.<br />
Confirmation of a conceptual model is a non-trivial<br />
issue. It is hardly possible to prescribe general test<br />
procedures, in particular not exact tests. Conceptual<br />
models are more difficult to confirm in some domains than in<br />
others. For example, the process descriptions/equations<br />
and the actual system are relatively easily identifiable in<br />
a hydrodynamic river flow system as compared to a<br />
groundwater system or an ecosystem, because the geology<br />
will never be completely known in a groundwater<br />
system and the biological processes may not be well<br />
known in an ecosystem. The more complex and difficult<br />
the conceptual model becomes, the more ‘soft’ the confirmation<br />
tests may turn out to be. Thus, expert<br />
knowledge in terms of peer reviews may be an important<br />
element of such tests.<br />
In cases where considerable uncertainty exists in the<br />
conceptual model, the possibility of testing alternative<br />
conceptual models should be promoted. An example of<br />
this is given by Troldborg [35], who reports a study<br />
where three scientists developed alternative geological<br />
interpretations for the same area, and three numerical<br />
groundwater models were set up and calibrated on this<br />
basis. During this process, or in the subsequent validation<br />
phase, one or more of these models may turn out to<br />
perform so poorly that the underlying conceptual model<br />
has to be rejected. This approach of building the<br />
uncertainty of our knowledge of reality into alternative<br />
conceptual models, which are subsequently subject to a<br />
confirmation test, is fully in line with Popper’s scientific<br />
philosophical school. Unfortunately, this is very seldom<br />
pursued in practice.<br />
Code verification is not an activity that is carried out<br />
from scratch in every modelling study. In a particular<br />
study it has to be ascertained that the domain of<br />
applicability for which the selected model code has been<br />
verified covers the conditions specified in the actual<br />
conceptual model. If that is not the case, additional<br />
verification tests have to be conducted. Otherwise, the<br />
code must explicitly be classified as not verified for this<br />
particular study, and the subsequent simulation results<br />
therefore have to be considered with extra caution.<br />
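The check described above, whether the verified domain of applicability covers the conditions of the conceptual model, could be sketched as follows. This is only an illustration under a strong simplifying assumption: that the domain of applicability can be expressed as numeric ranges per condition. The condition names and all function names are hypothetical.<br />

```python
from typing import Dict, List, Tuple

Range = Tuple[float, float]  # (lower bound, upper bound)


def uncovered_conditions(verified: Dict[str, Range],
                         required: Dict[str, Range]) -> List[str]:
    """Return the conditions required by the conceptual model that fall
    outside the domain for which the code has been verified."""
    missing = []
    for name, (req_low, req_high) in required.items():
        if name not in verified:
            # The code has never been verified for this kind of condition.
            missing.append(name)
            continue
        ver_low, ver_high = verified[name]
        if req_low < ver_low or req_high > ver_high:
            # The required range extends beyond the verified range.
            missing.append(name)
    return missing


def code_status(verified: Dict[str, Range],
                required: Dict[str, Range]) -> str:
    """Classify the code for this particular study."""
    if uncovered_conditions(verified, required):
        return "not verified for this study"
    return "verified"


# Hypothetical example: ranges are invented for illustration only.
verified_domain = {"grid_size_m": (10.0, 1000.0), "time_step_s": (1.0, 3600.0)}
study_needs = {"grid_size_m": (50.0, 500.0), "time_step_s": (0.5, 60.0)}
print(uncovered_conditions(verified_domain, study_needs))
print(code_status(verified_domain, study_needs))
```

Any condition returned by the first function would call for additional verification tests or, failing that, for the explicit "not verified" classification.<br />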
Establishment of validation test schemes for situations<br />
where the split-sample test is not sufficient is an<br />
area where limited work has been carried out so far.<br />
The only rigorous and comprehensive methodology reported<br />
in the literature is that of Klemes [19]. He proposed a<br />
systematic scheme of validation tests, where a distinction<br />
is made between simulations conducted for the<br />
same catchment as was used for calibration (split-sample<br />
test) and simulations conducted for ungauged catchments<br />
(proxy-basin tests). He also distinguished between
cases where catchment conditions such as climate, land<br />
use and groundwater abstraction are stationary (split-sample<br />
test) and cases where they are not (differential<br />
split-sample test). A further discussion, including examples,<br />
of Klemes’s test scheme is given in Refsgaard<br />
[30]. The two key principles are: (a) the validation tests<br />
must be carried out against independent data, i.e. data<br />
that have not been used during calibration, and (b) the<br />
model should be tested to show how well it can perform<br />
the kind of task for which it is specifically intended to be<br />
applied subsequently. This implies e.g. that multi-site<br />
validation is needed if predictions of spatial patterns are<br />
required, and multi-variable checks are required if predictions<br />
of the behaviour of individual subsystems<br />
within a catchment are needed. Thus, a model should only<br />
be assumed valid with respect to outputs that have been<br />
explicitly validated. This means for instance that a<br />
model which is validated against catchment runoff cannot<br />
automatically be assumed valid also for simulation<br />
of erosion on a hillslope within the catchment, because<br />
smaller scale processes may dominate here; it will need<br />
validation against hillslope soil erosion data.<br />
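The selection logic of the test scheme and the two key principles can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and the label for the combined case (ungauged catchment with non-stationary conditions) is not given in the text above and is an assumption here.<br />

```python
from typing import Iterable, Set


def select_validation_test(gauged: bool, stationary: bool) -> str:
    """Pick a validation test from the two distinctions in the scheme:
    gauged vs. ungauged catchment, stationary vs. non-stationary conditions."""
    if gauged and stationary:
        return "split-sample test"
    if gauged and not stationary:
        return "differential split-sample test"
    if not gauged and stationary:
        return "proxy-basin test"
    # Combined case: label assumed for illustration.
    return "proxy-basin differential split-sample test"


def is_independent(calibration_ids: Iterable, validation_ids: Iterable) -> bool:
    """Principle (a): validation data must not have been used in calibration."""
    return set(calibration_ids).isdisjoint(validation_ids)


def unvalidated_outputs(required: Set[str], validated: Set[str]) -> Set[str]:
    """Principle (b): outputs the model is intended to predict but for which
    no explicit validation has been carried out."""
    return set(required) - set(validated)


# A model validated against catchment runoff only cannot be assumed
# valid for hillslope erosion.
print(select_validation_test(gauged=True, stationary=False))
print(unvalidated_outputs({"runoff", "hillslope erosion"}, {"runoff"}))
```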
From a theoretical point of view the procedures<br />
outlined by Klemes [19] for the proxy-basin and the<br />
differential split-sample tests, where tests have to be<br />
carried out using data from similar catchments, are<br />
weaker than the usual split-sample test, where data from<br />
the specific catchment are available. However, no<br />
obviously better testing schemes exist. Therefore, this<br />
will have to be reflected in the performance criteria in<br />
terms of larger expected uncertainties in the predictions.<br />
It must be realised that the validation test schemes<br />
proposed above are so demanding that many applications<br />
today would fail to meet them. Thus, for many<br />
cases where either proxy-basin or differential split-sample<br />
tests are required, suitable test data simply do<br />
not exist. This is, for example, the case for prediction of<br />
regional-scale transport of potential contamination from<br />
underground radionuclide deposits over the next thousands<br />
of years. In such cases model validation is not<br />
possible. This does not imply that these modelling<br />
studies are not useful, only that their output should be<br />
recognised to be somewhat more uncertain than is often<br />
stated and that the term ‘validated model’ should not<br />
be used. Thus, a model’s validity will always be confined<br />
in terms of space, time, boundary conditions, types of<br />
application, etc.<br />
According to the methodology, model validation<br />
implies substantiating that a site-specific model can<br />
produce simulation results within the range of accuracy<br />
specified in the performance criteria for the particular<br />
study. Hence, before carrying out the model calibration<br />
and the subsequent validation tests, quantitative performance<br />
criteria must be established. In determining<br />
the acceptable level of accuracy a trade-off will, either<br />
explicitly or implicitly, have to be made between costs,<br />
in terms of data collection and modelling work, and<br />
associated benefits that can be obtained due to more<br />
accurate model results. Consequently, the acceptable<br />
level of accuracy will vary from case to case and must be<br />
seen in a socio-economic context. It should therefore<br />
usually not be defined by the modeller, but in a dialogue<br />
between the modeller and the manager.<br />
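A minimal sketch of how criteria agreed beforehand might be applied in a split-sample test is given below. The choice of the Nash-Sutcliffe efficiency and the relative volume bias as statistics, the threshold values and all names are assumptions made for illustration; the methodology itself does not prescribe particular statistics.<br />

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class PerformanceCriteria:
    """Acceptance limits fixed before calibration (values are hypothetical)."""
    min_nse: float       # lowest acceptable Nash-Sutcliffe efficiency
    max_abs_bias: float  # largest acceptable |relative volume bias|


def nash_sutcliffe(obs: Sequence[float], sim: Sequence[float]) -> float:
    """1 minus the ratio of squared model error to the variance of observations."""
    mean_obs = sum(obs) / len(obs)
    error = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - error / spread


def relative_bias(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Relative difference between simulated and observed totals."""
    return (sum(sim) - sum(obs)) / sum(obs)


def split_sample(series: Sequence[float], n_calibration: int) -> Tuple[Sequence[float], Sequence[float]]:
    """Split a record into a calibration part and an independent validation part."""
    return series[:n_calibration], series[n_calibration:]


def passes_validation(obs: Sequence[float], sim: Sequence[float],
                      criteria: PerformanceCriteria) -> bool:
    """Check validation-period results against the pre-agreed criteria."""
    return (nash_sutcliffe(obs, sim) >= criteria.min_nse
            and abs(relative_bias(obs, sim)) <= criteria.max_abs_bias)


# Criteria are agreed between manager and modeller first ...
criteria = PerformanceCriteria(min_nse=0.7, max_abs_bias=0.1)
# ... and only then applied to the validation period.
observed = [2.0, 3.0, 5.0, 4.0, 3.0, 6.0, 8.0, 5.0]
simulated = [2.2, 2.9, 4.6, 4.1, 3.3, 5.5, 7.6, 5.4]
_, obs_val = split_sample(observed, n_calibration=4)
_, sim_val = split_sample(simulated, n_calibration=4)
print(passes_validation(obs_val, sim_val, criteria))
```

Fixing the criteria object before any simulation results are inspected mirrors the requirement that the acceptance limits must not be defined in accordance with the achieved model results.<br />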
4.3. Need for interaction between manager, code developer<br />
and modeller<br />
As discussed above, the validation methodologies<br />
presently used, even in research projects, are generally<br />
not rigorous and far from satisfactory. At the same time<br />
models are being used in practice, and claims are made<br />
daily about the validity of models on the basis of test<br />
schemes that are, at best, not very strict or rigorous. An<br />
important question, then, is how the situation can be<br />
improved in the future. As emphasised by Forkel [12],<br />
improvements cannot be achieved by the research<br />
community alone, but require interaction between<br />
the three main ‘players’, namely water resources managers,<br />
code developers and model users (modellers).<br />
The key responsibilities of the water resources manager<br />
are to specify the objectives and define the acceptance<br />
limits of accuracy (performance criteria) for the<br />
model application. Furthermore, it is the manager’s<br />
responsibility to define requirements for code verification<br />
and model validation. In many consultancy jobs<br />
accuracy criteria and validation requirements are not<br />
specified at all, with the result being that the model user<br />
implicitly defines them in accordance with the achieved<br />
model results. In this respect it is important that the terms<br />
of reference for a given model application ensure<br />
consistency between the objectives, the specified accuracy<br />
criteria, the data availability and the financial<br />
resources. In order for the manager to make such evaluations,<br />
some knowledge of the modelling process is<br />
required.<br />
The model user has the responsibility for selection of<br />
a suitable code as well as for construction, calibration<br />
and validation of the site-specific model. In particular,<br />
the model user is responsible for preparing validation<br />
documents in such a way that the domain of applicability<br />
and the range of accuracy of the model are<br />
explicitly specified. Furthermore, the documentation of<br />
the modelling process should ideally be done in enough<br />
detail that it can be repeated several years later, if required.<br />
The model user has to interact with the water<br />
resources manager on assessments of realistic model<br />
accuracies. Furthermore, the model user must be aware<br />
of the capabilities and limitations of the selected code<br />
and interact with the code developer with regard to<br />
reporting of user experience such as shortcomings in<br />
documentation, errors in code, market demands for<br />
extensions, etc.
The key responsibilities of the developer of the model<br />
code are to develop and verify a model code. In this<br />
connection it is important that the capabilities and<br />
limitations of the code appear in the documentation. As<br />
code development is a continuous process, code maintenance<br />
and regular updating with new versions, improved<br />
in response to user feedback, become important. Although<br />
a model code should be comprehensively documented,<br />
doubts about its functioning will in practice arise once<br />
in a while, even for experienced users.<br />
Hence, active support to and dialogue with model users<br />
are crucial for ensuring operational model applications<br />
at a high professional level.<br />
4.4. Performance criteria – when is a model good enough?<br />
A critical issue in relation to the methodological<br />
framework is how to define the performance criteria. We<br />
agree with Beven [9] that any conceptual model is<br />
known to be wrong and hence any model will be falsified<br />
if we investigate it in sufficient detail and specify very<br />
high performance criteria.<br />
Clearly, if one attempts to establish a model that<br />
should simulate the truth, it will always be falsified.<br />
However, this is not very useful information. Therefore,<br />
we use conditional validation, i.e.<br />
validation restricted to a domain of applicability (or<br />
numerical universal as opposed to strictly universal in<br />
Popperian terms). The question is then: what is<br />
good enough? In other words, what are the criteria,<br />
and how do we select them?<br />
A good reference for model performance is to compare<br />
it with uncertainties of the available field observations.<br />
If the model performance is within this uncertainty<br />
range we often characterise the model as good enough.<br />
However, usually it is not so simple. How wide should the confidence<br />
bands on observational uncertainties be: ranges<br />
corresponding to 65%, 95% or 99%? Do<br />
we always then reject a model if it cannot perform within<br />
the observational uncertainty range? In many cases even<br />
results from less accurate models may be very useful.<br />
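The effect of the chosen confidence level can be illustrated with a small sketch. It assumes, purely for illustration, that observation errors are Gaussian with a known standard deviation, so that a two-sided band at a given confidence level has a half-width of z times that standard deviation; the function names and the numbers are hypothetical.<br />

```python
from statistics import NormalDist
from typing import Sequence


def band_half_width(sigma_obs: float, confidence: float) -> float:
    """Half-width of a two-sided band around an observation, assuming
    Gaussian observation errors with standard deviation sigma_obs."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return z * sigma_obs


def fraction_within_band(obs: Sequence[float], sim: Sequence[float],
                         sigma_obs: float, confidence: float) -> float:
    """Fraction of simulated values that fall inside the observational band."""
    half_width = band_half_width(sigma_obs, confidence)
    inside = sum(1 for o, s in zip(obs, sim) if abs(s - o) <= half_width)
    return inside / len(obs)


# The same simulation judged against bands of different width.
observed = [10.0, 10.0, 10.0, 10.0]
simulated = [10.5, 11.5, 12.5, 9.0]
for level in (0.65, 0.95, 0.99):
    print(level, fraction_within_band(observed, simulated, 1.0, level))
```

The same set of residuals can thus look acceptable or unacceptable depending on the band chosen, which is why the band itself needs to be part of the agreed performance criteria.<br />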
Therefore, our answer is that the decision on what is<br />
good enough generally must be taken in a socio-economic<br />
context. For instance, the accuracy requirements<br />
for a model to be used for an initial screening of alternative<br />
options for location of a new small well field for a<br />
small water supply will be much lower than the<br />
requirements for a model that is intended to be used for<br />
the final design of a large well field for a major water<br />
supply in an area with potentially damaging effects on<br />
precious nature and other significant conflicts of interest.<br />
Thus, we believe that the accuracy criteria cannot<br />
be decided universally by modellers or researchers, but<br />
must be different from case to case depending on how<br />
much is at stake in the decision to be supported<br />
by model predictions. This implies that the performance<br />
criteria must be discussed and agreed between the<br />
manager and the modeller beforehand. However, as the<br />
modelling process and the underlying study progress,<br />
with improved knowledge of the data and model<br />
uncertainties as well as of the risk perception of the<br />
stakeholders concerned, it may well be necessary to adjust<br />
the performance criteria in a sort of adaptive project<br />
management context [27].<br />
4.5. The role of uncertainty assessments<br />
Should we then trust a model if it happens to pass a<br />
validation test? Are we sure that this model is the best<br />
one and that the underlying conceptual basis and input<br />
data are basically correct?<br />
On the one hand, yes: in such a case we may trust a<br />
model as a suitable tool to make predictions through<br />
model simulations. But on the other hand, we can never<br />
be sure that a model that passes a validation test will<br />
have a sound conceptual basis. It could be right for the<br />
wrong reasons, e.g. by compensating errors in the conceptual<br />
model (model structure) with errors in parameter values.<br />
And we know that it would be possible to find many<br />
other models that can pass the validation test, and that it<br />
would not be possible beforehand to identify one of these<br />
models as the best one in all respects. Having realised this<br />
equifinality problem, the relevant question is what we<br />
should do to address it in practical cases. In this respect<br />
our framework prescribes that model predictions (see<br />
definition of ‘simulation’ in Section 3) made subsequent<br />
to passing a validation test should include uncertainty<br />
assessments. Hence, we basically agree with Beven [9]<br />
that uncertainty assessments are necessary, and that such<br />
uncertainty analyses should include uncertainty in<br />
model structure, parameter values etc. Different methodologies<br />
exist for conducting uncertainty assessments,<br />
e.g. Beven and Binley [8] and Van Asselt and Rotmans [36].<br />
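One simple way to carry the equifinality problem into the predictions is to keep every model that passed the validation test and report the spread of their predictions. The sketch below does only that; it is not one of the cited methodologies, the names are hypothetical, and the resulting envelope reflects differences between the accepted models only, so it should be read as one contribution to the uncertainty assessment rather than as the full uncertainty.<br />

```python
from typing import Dict, List, Sequence, Tuple


def prediction_envelope(
    predictions_by_model: Dict[str, Sequence[float]],
    passed_validation: Dict[str, bool],
) -> Tuple[List[float], List[float], List[str]]:
    """Lower and upper prediction bounds over all models that passed
    the validation test, plus the names of the models retained."""
    accepted = {
        name: values
        for name, values in predictions_by_model.items()
        if passed_validation.get(name, False)
    }
    if not accepted:
        raise ValueError("no model passed the validation test")
    n_steps = len(next(iter(accepted.values())))
    lower = [min(values[t] for values in accepted.values()) for t in range(n_steps)]
    upper = [max(values[t] for values in accepted.values()) for t in range(n_steps)]
    return lower, upper, sorted(accepted)


# Three alternative models; model "C" was rejected in validation.
predictions = {"A": [1.0, 2.0], "B": [1.5, 1.0], "C": [9.0, 9.0]}
status = {"A": True, "B": True, "C": False}
low, high, kept = prediction_envelope(predictions, status)
print(kept, low, high)
```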
5. Guiding principles and future perspectives for modelling<br />
guidelines<br />
5.1. Guiding principles<br />
In our opinion the two key factors causing the poor<br />
quality of the modelling work in practice are: (a) shortcomings<br />
in the modelling work done by practitioners<br />
(inadequate use of guidelines and quality assurance<br />
procedures and inadequate interplay between manager<br />
(client) and modeller (consultant)) and (b) lack of data<br />
and methodology in hydrological science. Modelling<br />
guidelines like [25,37] almost exclusively address the<br />
former issue while scientific literature like [7,9] focuses on<br />
the latter issue. In our opinion it is crucial that the two<br />
lines of action are combined. This implies that we need<br />
to define modelling guidelines that are both operational
in practice and scientifically founded. The framework we<br />
have described here attempts to establish one such<br />
bridge between the two fields, i.e. pragmatic modelling<br />
and natural science. An important aspect of this<br />
framework is to enable, in a scientifically consistent way,<br />
the manager and the modeller to make the compromises<br />
that are required in practice.<br />
Against this background the following five key principles<br />
for pragmatic modelling have emerged:<br />
• A terminology that is internally consistent. We<br />
acknowledge that many authors in the scientific literature<br />
use different terminology and that, in particular,<br />
some authors do not use the terms verification<br />
and validation. However, these terms are also widely<br />
used, and we need in practice to have understandable<br />
terms for these operations. Thus, with the clear distinction<br />
between conceptual model, model code and<br />
site-specific model and the restrictions to domains<br />
of applicability (numerical universal in Popperian<br />
sense) we believe that our terminology is in accordance<br />
with the mainstream of scientific philosophy.<br />
• We never talk about universal code verification or universal<br />
model validation, but always restrict these<br />
terms to clearly defined domains of applicability. This<br />
is a necessary assumption for the consistency of the<br />
terminology and methodology and must be emphasised<br />
explicitly in any guidelines.<br />
• Validation tests against independent data that have<br />
not also been used for calibration are necessary in order<br />
to be able to document the predictive capability<br />
of a model.<br />
• Model predictions achieved through simulation<br />
should be associated with uncertainty assessments<br />
where amongst others the uncertainty in model structure<br />
and parameter values should be accounted for.<br />
• A continuous interaction between manager and modeller<br />
is crucial for the success of the modelling process.<br />
One of the key aspects in this regard is to establish suitable<br />
performance criteria for the model calibration<br />
and validation tests. This dialogue is also very important<br />
in connection with uncertainty assessments.<br />
5.2. Future challenges<br />
Some of the issues dealt with in the present manuscript<br />
are still not fully explored. The four most<br />
important future challenges are:<br />
• Establishment of accuracy criteria for a modelling<br />
study is a very important issue and one where we<br />
perhaps differ from most scientific literature. Modellers<br />
often establish numerical accuracy criteria in order to<br />
classify the goodness of a given model [2,17,28].<br />
These attempts are very useful in making the performance<br />
more transparent and quantitative, but do not<br />
provide an objective means to decide what the optimal<br />
accuracy criteria really should be in a given case.<br />
According to our framework no universal accuracy<br />
criteria can be established, i.e. it is generally not possible<br />
from a natural scientific point of view to tell<br />
when a model performance is good enough. Such<br />
acceptance criteria will vary from case to case<br />
depending on the socio-economic context, i.e. what<br />
is at stake in the decisions to be supported by the<br />
model predictions. The question now is: how<br />
do we translate the ‘soft’ socio-economic objectives<br />
into ‘hard-core’ model performance criteria? This is<br />
obviously a challenge that cannot be solved by natural<br />
science alone, but needs to be addressed in a much<br />
broader context including aspects of economy, stakeholder<br />
interests and risk perception. Until we become<br />
better at overcoming this challenge we will, however,<br />
not be able to arrive at the optimal balance between<br />
the costs of modelling and the derived societal benefits.<br />
Although this work has hardly begun yet, and<br />
we know that it is a very difficult road, we see no real<br />
alternative.<br />
• Although all experience shows that models generally<br />
perform more poorly in validation tests against independent<br />
data than they do in calibration tests, model validation<br />
is in our opinion a much neglected issue, both<br />
in many modelling guidelines and in the scientific<br />
literature. Maybe many scientists have not wanted<br />
to use the term validation due to the scientific philosophically<br />
related controversies, but in any case<br />
many scientists are not advocating the need for model<br />
validation. One of the unfortunate consequences of<br />
this ‘lack of interest’ is that not much work has<br />
been devoted to developing suitable validation test<br />
schemes since Klemes [19]. In our opinion further<br />
development of suitable testing schemes and imposing<br />
them on all modelling projects is a major future<br />
challenge.<br />
• A third issue that requires considerable attention is<br />
how we decide among alternative model structures<br />
and parameter sets (the equifinality problem). If we<br />
use multiple criteria one model may be better on<br />
one criterion and another on another criterion. In our<br />
opinion we need not necessarily choose. We know that<br />
all conceptual models are wrong and we know that<br />
wrong conceptual models are compensated by biased<br />
model parameter values through calibration. But, unless<br />
we can falsify a conceptual model directly, which<br />
is very difficult, or unless the resulting model is falsified<br />
through the validation test, this model is a possible<br />
candidate for predictions. And if several models<br />
pass the validation tests we may not be able to tell<br />
which one is the best. In such cases they should all<br />
be considered suitable, and the fact that they provide<br />
different predictive results should be used as part of<br />
the uncertainty assessments. Work on this relatively
new paradigm has just begun [9] and a lot of work is<br />
still required to further develop and operationalise it.<br />
• Finally, there are many more challenges related to<br />
uncertainty in water resources management. Quality<br />
assurance and uncertainty assessments are two<br />
aspects that are very closely linked. Initially, the manager<br />
has to define accuracy criteria from a perception<br />
of which uncertainty level he believes is suitable in a<br />
particular case (see above). Subsequently, as the modelling<br />
study proceeds, the dialogue between modeller<br />
and manager has to continue with the necessary<br />
trade-off between modelling accuracy and the cost of the<br />
modelling study. In the uncertainty assessments it is<br />
very important to go beyond the traditional statistical<br />
uncertainty analysis. Thus, aspects such as scenario<br />
uncertainty and ignorance should generally be included,<br />
and in addition the uncertainties originating<br />
from data and models often need to be integrated<br />
with socio-economic aspects in order to form a suitable<br />
basis for the further decision process [36]. Thus,<br />
as with the accuracy criteria (above), the use of<br />
uncertainty assessments in water resources management<br />
goes beyond natural science.<br />
Acknowledgements<br />
The present work was carried out within the Project<br />
‘Harmonising Quality Assurance in model based catchments<br />
and river basin management (HarmoniQuA)’,<br />
which is partly funded by the EC Energy, Environment<br />
and Sustainable Development programme (Contract<br />
EVK2-CT2001-00097). The constructive comments and<br />
suggestions to the manuscript by the HarmoniQuA<br />
project team and by our colleague William (Bill) G.<br />
Harrar are acknowledged. Finally, the constructive<br />
criticisms by Keith Beven, University of Lancaster;<br />
Rodger Grayson, University of Melbourne and a third,<br />
anonymous referee helped to improve the manuscript<br />
significantly.<br />
References<br />
[1] Abbott MB. The theory of the hydrological model, or: the<br />
struggle for the soul of hydrology. In: O’Kane JP, editor.<br />
Advances in theoretical hydrology. Elsevier; 1992. p. 237–54.<br />
[2] Andersen J, Refsgaard JC, Jensen KH. Distributed hydrological<br />
modelling of the Senegal River Basin – model construction and<br />
validation. J Hydrol 2001;247:200–14.<br />
[3] Anderson MG, Bates PD, editors. Model validation: perspectives<br />
in hydrological science. John Wiley and Sons; 2001.<br />
[4] Anderson MP, Woessner WW. The role of postaudit in model<br />
validation. Adv Water Resour 1992;15:167–73.<br />
[5] Baker VR. Conversing with the Earth: the geological approach to<br />
understanding. In: Frodeman R, editor. Earth matters: the earth sciences,<br />
philosophy and the claims of community. Prentice Hall; 2000.<br />
[6] Beven K. Changing ideas in hydrology – the case of physically<br />
based models. J Hydrol 1989;105:157–72.<br />
[7] Beven K. Towards an alternative blueprint for a physically based<br />
digitally simulated hydrologic response modelling system. Hydrol<br />
Process 2002;16(2):189–206.<br />
[8] Beven K, Binley AM. The future of distributed models: model<br />
calibration and uncertainty prediction. Hydrol Process 1992;6:279–98.<br />
[9] Beven K. Towards a coherent philosophy for modelling the<br />
environment. Proc Roy Soc Lond A 2002;458(2026):2465–84.<br />
[10] Dee DP. A pragmatic approach to model validation. In: Lynch<br />
DR, Davies AM, editors. Quantitative skill assessment of coastal<br />
ocean models. Washington: AGU; 1995. p. 1–13.<br />
[11] De Marsily G, Combes P, Goblet P. Comments on ‘Ground-water<br />
models cannot be validated’, by Konikow LF, Bredehoeft JD.<br />
Adv Water Resour 1992;15:367–9.<br />
[12] Forkel C. Das numerische Modell – ein schmaler Grat zwischen<br />
vertrauenswürdigem Werkzeug und gefährlichem Spielzeug. Presented<br />
at the 26. IWASA, RWTH Aachen, 4–5 January 1996.<br />
[13] Freeze RA, Harlan RL. Blueprint for a physically-based digitally-simulated<br />
hydrologic response model. J Hydrol 1969;9:237–58.<br />
[14] Gupta HV, Sorooshian S, Yapo PO. Toward improved calibration<br />
of hydrologic models: multiple and noncommensurable<br />
measures of information. Water Resour Res 1998;34(4):751–63.<br />
[15] Hansen JM. The line in the sand the wave on the water – Steno’s<br />
theory on the language of nature and the limits of the knowledge.<br />
Copenhagen: Fremad; 2000. 440 pp (in Danish).<br />
[16] Hassanizadeh SM, Carrera J. Editorial, special issue on validation<br />
of geo-hydrological models. Adv Water Resour 1992;15:1–3.<br />
[17] Henriksen HJ, Troldborg L, Nyegaard P, Sonnenborg TO,<br />
Refsgaard JC, Madsen B. Methodology for construction, calibration<br />
and validation of a national hydrological model for<br />
Denmark. J Hydrol 2003;280(1–4):52–71.<br />
[18] IAHR. Publication of guidelines for validation documents and<br />
call for discussion. Int Assoc Hydraul Res Bull 1994;11:41.<br />
[19] Klemes V. Operational testing of hydrological simulation models.<br />
Hydrol Sci J 1986;31:13–24.<br />
[20] Konikow LF, Bredehoeft JD. Ground-water models cannot be<br />
validated. Adv Water Resour 1992;15:75–83.<br />
[21] Kuhn TS. The structure of scientific revolutions. Chicago:<br />
University of Chicago Press; 1962.<br />
[22] Liden R. Conceptual runoff models for material transport<br />
estimations. PhD dissertation, Report No. 1028, Lund Institute<br />
of Technology, Lund University, Sweden, 2000.<br />
[23] Los H, Gerritsen H. Validation of water quality and ecological<br />
models. Presented at the 26th IAHR Conference, London, Delft<br />
Hydraulics, 11–15 September 1995, 8 pp.<br />
[24] Matalas NC, Landwehr JM, Wolman MG. Prediction in water<br />
management. In: Scientific basis of water resource management.<br />
Washington, DC: National Research Council, National Academy<br />
Press; 1982. p. 118–27.<br />
[25] Middlemis H. Murray–Darling Basin Commission. Groundwater<br />
flow modelling guideline. Aquaterra Consulting Pty Ltd, South<br />
Perth, Western Australia. Project no. 125, 2000.<br />
[26] Oreskes N, Shrader-Frechette K, Belitz K. Verification, validation<br />
and confirmation of numerical models in the earth sciences.<br />
Science 1994;264:641–6.<br />
[27] Pahl-Wostl C. Towards sustainability in the water sector – the<br />
importance of human actors and processes of social learning.<br />
Aquat Sci 2002;64:394–411.<br />
[28] Parkin G, O’Donnell GO, Ewen J, Bathurst JC, O’Connel PE,<br />
Lavabre J. Validation of catchment models for predicting land-use<br />
and climate change impacts. 2. Case study for a Mediterranean<br />
catchment. J Hydrol 1996;175:595–613.<br />
[29] Popper KR. The logic of scientific discovery. London: Hutchinson<br />
& Co; 1959.<br />
[30] Refsgaard JC. Towards a formal approach to calibration and<br />
validation of models using spatial data. In: Grayson R, Blöschl G,
editors. Spatial patterns in catchment hydrology: Observations<br />
and modelling. Cambridge University Press; 2001. p. 329–54.<br />
[31] Refsgaard JC, Knudsen J. Operational validation and intercomparison<br />
of different types of hydrological models. Water Resour<br />
Res 1996;32(7):2189–202.<br />
[32] Rykiel ER. Testing ecological models: The meaning of validation.<br />
Ecol Modell 1996;90:229–44.<br />
[33] Schlesinger S, Crosbie RE, Gagne RE, Innis GS, Lalwani CS,<br />
Loch J, et al. Terminology for model credibility. SCS Tech Comm<br />
Model Credibil Simul 1979;32(3):103–4.<br />
[34] Scholten H, Van Waveren RH, Groot S, Van Geer FC, Wösten<br />
JHM, Koeze RD, et al. Good modelling practice in water<br />
management. Paper presented on Hydroinformatics 2000, Cedar<br />
Rapids, IA, USA, 2000.<br />
[35] Troldborg L. Effects of geological complexity on groundwater age<br />
prediction. Poster Session 62C, AGU December 2000. EOS<br />
Transactions, 81(48), F435.<br />
[36] Van Asselt MBA, Rotmans J. Uncertainty in integrated assessment<br />
modelling – from positivism to pluralism. Climat Change<br />
2002;54(1–2):75–105.<br />
[37] Van Waveren RH, Groot S, Scholten H, Van Geer FC, Wösten<br />
JHM, Koeze RD, et al. Good modelling practice handbook.<br />
STOWA Report 99-05, Utrecht, RWS-RIZA, Lelystad, The<br />
Netherlands. Available from: http://waterland.net/riza/aquest/.
[13]<br />
Refsgaard JC, Henriksen HJ, Harrar WG, Scholten H, Kassahun A (2005)<br />
Quality assurance in model based water management – review of existing<br />
practice and outline of new approaches.<br />
Environmental Modelling & Software, 20, 1201-1215.<br />
Reprinted from Environmental Modelling & Software with permission from Elsevier
Environmental Modelling & Software 20 (2005) 1201–1215<br />
www.elsevier.com/locate/envsoft<br />
Quality assurance in model based water management – review of<br />
existing practice and outline of new approaches<br />
Jens Christian Refsgaard a,*, Hans Jørgen Henriksen a, William G. Harrar a,<br />
Huub Scholten b , Ayalew Kassahun b<br />
a Geological Survey of Denmark and Greenland (GEUS), Øster Voldgade 10, DK-1350 Copenhagen K, Denmark<br />
b Wageningen University (WU), Dreijenplein 2, 6703 HB, Wageningen, The Netherlands<br />
Received 11 December 2003; received in revised form 30 March 2004; accepted 30 July 2004<br />
Abstract<br />
Quality assurance (QA) is defined as protocols and guidelines to support the proper application of models. In the water<br />
management context we classify QA guidelines according to how much focus is put on the dialogue between the modeller and the<br />
water manager as: (Type 1) Internal technical guidelines developed and used internally by the modeller’s organisation; (Type 2)<br />
Public technical guidelines developed in a public consensus building process; and (Type 3) Public interactive guidelines developed as<br />
public guidelines to promote and regulate the interaction between the modeller and the water manager throughout the modelling<br />
process. State-of-the-art QA practices vary considerably between different modelling domains and countries. It is suggested that<br />
these differences can be explained by the scientific maturity of the underlying discipline and differences in modelling markets in terms<br />
of volume of jobs outsourced and level of competition. The structure and key aspects of new generic guidelines and a set of<br />
electronically based supporting tools that are under development within the HarmoniQuA project are presented. Model credibility<br />
can be enhanced by a proper modeller-manager dialogue, rigorous validation tests against independent data, uncertainty<br />
assessments, and peer reviews of a model at various stages throughout its development.<br />
© 2004 Elsevier Ltd. All rights reserved.<br />
Keywords: Modelling guidelines; Quality assurance; Water resources management; Uncertainty; Support tools<br />
1. Introduction<br />
Models describing water flows, water quality and<br />
ecology are being developed and applied in increasing<br />
number and variety. The trend in recent years has been<br />
to base water management decisions to a larger extent<br />
on modelling studies, and to use more sophisticated<br />
models. In Europe this trend is likely to be reinforced by<br />
the EU Water Framework Directive due to its demand<br />
for integrating groundwater, surface water, ecological<br />
* Corresponding author. Tel.: +45 38 142 776; fax: +45 38 142<br />
050.<br />
E-mail address: jcr@geus.dk (J.C. Refsgaard).<br />
and economic aspects of water management at the river<br />
basin scale and due to the explicit requirement to study<br />
impacts of alternative measures (human interventions)<br />
intended to improve the ecological status in the river<br />
basin. Insufficient attention is often given to documenting<br />
the predictive capability of models. Therefore,<br />
contradictions may emerge regarding the various claims<br />
of model applicability on the one hand and the lack of<br />
documentation of these claims on the other hand.<br />
Hence, the credibility of the model is often questioned,<br />
and sometimes with good reason.<br />
Another important trend is the demand to involve<br />
different stakeholders in the water resources management<br />
process, and therefore also indirectly in the<br />
modelling process (Pahl-Wostl, 2002). This stakeholder<br />
1364-8152/$ - see front matter © 2004 Elsevier Ltd. All rights reserved.<br />
doi:10.1016/j.envsoft.2004.07.006
involvement does not imply active participation in<br />
the technical modelling itself, but rather appears as<br />
a demand to be able to understand and review the<br />
various assumptions and their implications for the<br />
modelling results. This trend is seen at the global scale<br />
in connection with the generally accepted principles<br />
behind integrated water resources management, where<br />
public participation is a key element (GWP-TAC, 2000).<br />
In Europe, this is reflected in the EU Water Framework<br />
Directive, where it is explicitly prescribed that stakeholders<br />
and the general public should be involved in the<br />
water resources management process.<br />
The need for improving the quality of the modelling<br />
process has been emphasised by the research community,<br />
e.g. Klemes (1986), NRC (1990), Anderson and<br />
Woessner (1992), Forkel (1996), and Rykiel (1996). The<br />
recommendations made in this respect primarily focus<br />
on scientific/technical guidance on how the modeller<br />
should carry out various steps during the modelling<br />
process in order to achieve the best and most reliable<br />
results.<br />
Anderson and Bates (2001) in a discussion of model<br />
credibility and scientific integrity state that ‘‘over the last<br />
decade we have begun to have an appreciation of the<br />
need to be much more rigorous in establishing<br />
procedures for defining model credibility’’. They argue<br />
further that this demand has not evolved from the<br />
hydrological science itself due to immaturity and data<br />
limitations, but instead comes from policy makers and<br />
regulators who wish to have some kind of certification<br />
of model results.<br />
As emphasised by, e.g., Forkel (1996), modelling<br />
studies involve several partners with different responsibilities.<br />
The ‘key players’ are code developers, model<br />
users and water managers. However, a lack of mutual<br />
understanding may develop due to the complexity of the<br />
modelling process and the different backgrounds of the<br />
‘key players’. For example, the strengths and limitations<br />
of modelling applications are often difficult, if not<br />
impossible, for the water managers to assess. Similarly,<br />
the transformation of objectives defined by the water<br />
manager to specific performance criteria can be very<br />
difficult for the model users to assess. It can be difficult<br />
to audit modelling projects due to the lack of proper<br />
documentation and transparency. Furthermore, it is<br />
often difficult to reconstruct and reproduce the modelling<br />
process and its results.<br />
In the water resources management community many<br />
different guidelines on good modelling practice have<br />
been developed. One of the most comprehensive examples,<br />
if not the most comprehensive, of a modelling guideline has been developed in<br />
The Netherlands (Van Waveren et al., 2000; Scholten<br />
and Groot, 2002) as a result of a process involving all<br />
the main players in the Dutch water management field.<br />
The background for this process was a perceived need<br />
for improving the quality in modelling by addressing<br />
malpractice issues such as careless handling of input<br />
data, insufficient calibration and validation, and model<br />
use outside its intended scope (Scholten et al., 2000).<br />
Similarly, modelling guidelines for the Murray-Darling<br />
Basin in Australia were developed due to the perception<br />
among end-users that model capabilities may have been<br />
‘over-sold’, and that there was a lack of consistency in<br />
approaches, communication and understanding among<br />
and between the modellers and the water managers,<br />
which often resulted in considerable uncertainty for<br />
decision making (Middlemis, 2000).<br />
As pointed out by Merrick et al. (2002) good<br />
modelling practice cannot be decomposed into a set of<br />
rigid rules that can be followed without communication<br />
between modellers and water managers. Furthermore,<br />
there is a risk that modellers will not embrace guidelines<br />
aiming to inject too much consistency in the review<br />
procedure. Experiences from Australia have shown that<br />
review reports are commonly interpreted by water<br />
managers (non-modellers) as quite negative. Non-modellers<br />
may tend to focus mainly on the negative<br />
review comments rather than balance those against the<br />
positive comments. This may mostly be the case for<br />
projects where there has not been a proper specification<br />
of the purpose and conditions at the initiation of the<br />
model study or where previous reviews during earlier<br />
project stages have been inadequate. External reviews<br />
performed at the end of a project when things may have<br />
already gone wrong may often result in defensive<br />
responses both from the modellers and the water<br />
managers (Henriksen, 2002a).<br />
All the existing modelling guidelines that we are<br />
aware of exist as reports. Electronically based support is<br />
only available as text forms to record modelling<br />
activities. No electronically based tool that is coupled<br />
to a knowledge base defining how to carry out the<br />
modelling (electronic version of guidelines with comprehensive<br />
guidance to different types of users) exists at<br />
present. This is a paradox, considering the significant<br />
resources that are invested in improving modelling<br />
software packages with respect to new sophisticated<br />
information technology.<br />
Poor modelling results may be caused by the lack of<br />
adequate model codes, or data of insufficient quantity or<br />
quality. However, according to our experience the most<br />
prevalent reason for poor modelling results is the<br />
inadequate use of guidelines and quality assurance<br />
procedures, and improper interaction between the<br />
manager (client) and the modeller (consultant). Our<br />
work has been carried out within the context of an EU<br />
supported research project (http://www.harmoniqua.org)<br />
aimed at developing a common set of quality<br />
assurance guidelines and supporting software tools. The<br />
scientific philosophical basis for the adopted terminology<br />
and guiding principles are described by Refsgaard<br />
and Henriksen (2004). The objective of the present
paper is to establish new approaches and outline the<br />
requirements of supporting tools for quality assurance<br />
procedures in the modelling process.<br />
2. Theoretical framework<br />
2.1. Terminology and scientific basis<br />
The terminology and methodology used in the<br />
following are based on Refsgaard and Henriksen (2004).<br />
The key elements in the terminology are illustrated in<br />
Fig. 1 and the most important definitions are:<br />
A model code is a generic software program, which<br />
can be used for different study areas without<br />
modifying the source code.<br />
A model is a site application of a code to a particular<br />
study area, including input data and parameter<br />
values.<br />
A model code can be verified. A code verification<br />
involves comparison of the numerical solution<br />
generated by the code with one or more analytical<br />
solutions or with other numerical solutions. Verification<br />
ensures that the computer programme accurately<br />
solves the equations that constitute the<br />
mathematical model.<br />
Model validation is here defined as the process of<br />
demonstrating that a given site-specific model is<br />
capable of making accurate predictions for periods<br />
outside a calibration period. A model is said to be<br />
validated if its accuracy and predictive capability in<br />
the validation period have been proven to lie within<br />
acceptable limits or errors.<br />
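As an illustration of code verification in the sense defined above, the following minimal sketch compares<br />
a numerical solution with an analytical one. The linear reservoir equation dS/dt = -k*S, the explicit<br />
Euler scheme, the function names and the tolerance are all assumptions chosen for the example; they are<br />
not taken from the paper or from any particular model code.<br />

```python
import math


def simulate_linear_reservoir(s0, k, dt, n_steps):
    """Explicit Euler solution of dS/dt = -k*S (hypothetical 'model code')."""
    storage = [s0]
    for _ in range(n_steps):
        storage.append(storage[-1] * (1.0 - k * dt))
    return storage


def analytical_linear_reservoir(s0, k, t):
    """Exact solution S(t) = S0 * exp(-k*t) used as the reference."""
    return s0 * math.exp(-k * t)


def max_abs_error(s0, k, dt, t_end):
    """Largest deviation between numerical and analytical storage."""
    n_steps = round(t_end / dt)
    numerical = simulate_linear_reservoir(s0, k, dt, n_steps)
    return max(
        abs(numerical[i] - analytical_linear_reservoir(s0, k, i * dt))
        for i in range(n_steps + 1)
    )


def verify_code(tolerance=0.5):
    """Assumed acceptance rule: the error shrinks when the time step is
    halved, and the fine-step error stays within the tolerance."""
    coarse = max_abs_error(100.0, 0.1, 0.1, 10.0)
    fine = max_abs_error(100.0, 0.1, 0.05, 10.0)
    return fine < coarse and fine <= tolerance
```

Such a check concerns the model code only; it says nothing about whether a site-specific model built<br />
with the code is valid for a given study area.<br />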
Fig. 1. Elements of a modelling terminology (Refsgaard and<br />
Henriksen, 2004).<br />
These terms are commonly used, although with<br />
differences in meaning between authors. Our views on<br />
these terms, on the ongoing discussion of validation-falsification-confirmation,<br />
and on the distinction between the terms<br />
perceptual model, conceptual model and site-specific<br />
model are given in Refsgaard and Henriksen (2004).<br />
Here we just note that, from a quality assurance<br />
guideline point of view, it is fundamental for us to<br />
make a clear distinction between the terms conceptual<br />
model, model code and (site-specific) model. Furthermore,<br />
we never use the terms verification and validation<br />
in a universal sense, but always restricted to clearly<br />
defined domains of applicability (a ‘numerical universal’ in<br />
the Popperian sense).<br />
In addition to ensuring a proper quality of work, the<br />
three most important underlying principles that have<br />
been identified from an analysis of the modelling process<br />
are (Refsgaard and Henriksen, 2004):<br />
Validation tests against independent data that have<br />
not also been used for calibration are necessary in<br />
order to be able to document the predictive<br />
capability of a model.<br />
Model predictions achieved through simulation<br />
should be associated with uncertainty assessments<br />
where amongst others the uncertainty in model<br />
structure and parameter values should be accounted<br />
for.<br />
A continuous interaction between water manager and<br />
modeller is crucial for the success of the modelling<br />
process. One of the key aspects in this regard is to<br />
establish suitable performance criteria for the model<br />
calibration and validation tests. This dialogue is also<br />
very important in connection with uncertainty<br />
assessments.<br />
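The first and third principles can be sketched as a split-sample test in which a performance criterion<br />
agreed between water manager and modeller is checked on data not used for calibration. The choice of<br />
the Nash-Sutcliffe efficiency as the measure, the criterion value of 0.7 and the discharge series below<br />
are illustrative assumptions, not values from the paper.<br />

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better
    than predicting the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def split_sample_test(observed, simulated, split_index, criterion):
    """Evaluate the model only on the validation period, i.e. the data
    after split_index that were not used for calibration."""
    score = nash_sutcliffe(observed[split_index:], simulated[split_index:])
    return score, score >= criterion


# Hypothetical discharge series; the first four values form the
# calibration period, the last four the validation period.
observed = [2.0, 3.0, 5.0, 4.0, 3.0, 6.0, 8.0, 5.0]
simulated = [2.1, 2.8, 5.2, 4.1, 3.5, 5.5, 7.0, 5.5]
score, passed = split_sample_test(observed, simulated, 4, criterion=0.7)
```

The second principle, uncertainty assessment, is not covered by this sketch; a single efficiency score<br />
does not account for uncertainty in model structure or parameter values.<br />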
2.2. Types of QA guidelines<br />
2.2.1. Definition and classification of quality assurance (QA)<br />
Quality assurance (QA) is defined by NRC (1990) as<br />
the procedural and operational framework used by an<br />
organisation managing the modelling study to assure<br />
technically and scientifically adequate execution of all<br />
tasks included in the study, and to assure that all<br />
modelling-based analysis is reproducible and defensible.<br />
In line with this we define QA guidelines as protocols<br />
and guidelines to support good application of models in<br />
water management.<br />
QA in the modelling process has two main components:<br />
(a) QA in development of model codes; and (b)<br />
QA in relation to application studies. Our paper focuses<br />
on the second component only.<br />
QA in model application studies includes data<br />
analyses, methodologies of good modelling practice,<br />
reviews and administrative procedures. Such QA guidelines<br />
can be classified according to how much focus is
put on the consensus building process between the<br />
modeller and the water manager in the following three<br />
classes:<br />
Internal technical guidelines (Type 1) established and<br />
used internally by the modeller’s organisation.<br />
Public technical guidelines (Type 2) established as<br />
public guidelines and used internally by the modeller’s<br />
organisation.<br />
Public interactive guidelines (Type 3) established as<br />
public guidelines and based on regulation of the<br />
interaction between the modeller and the water<br />
manager throughout the modelling process.<br />
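The three classes can be read as the outcome of two questions: is the guideline public, and does it<br />
regulate the modeller-manager interaction? The sketch below encodes that reading. The names are<br />
hypothetical, and treating every non-public guideline as Type 1 is an assumption of the example.<br />

```python
from enum import Enum


class GuidelineType(Enum):
    INTERNAL_TECHNICAL = 1   # Type 1
    PUBLIC_TECHNICAL = 2     # Type 2
    PUBLIC_INTERACTIVE = 3   # Type 3


def classify_guideline(is_public, regulates_interaction):
    """Map the two distinguishing properties onto the three classes."""
    if not is_public:
        # Assumption: internal guidelines are Type 1 regardless of content.
        return GuidelineType.INTERNAL_TECHNICAL
    if regulates_interaction:
        return GuidelineType.PUBLIC_INTERACTIVE
    return GuidelineType.PUBLIC_TECHNICAL
```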
2.2.2. Type 1: Internal technical guidelines<br />
Most organisations involved in modelling studies<br />
have some kind of internal QA procedures. They usually<br />
focus on the technical aspects, i.e. to ensure that the<br />
modelling work itself is done without making unqualified<br />
judgements or errors. The better of these are<br />
based on the modelling protocols and similar scientifically<br />
based procedures originating from the research<br />
community. These procedures are internal in nature<br />
because they have been established or adopted unilaterally<br />
by the modeller’s organisation, and because they<br />
seldom deal with the interaction between modeller and<br />
end-user. Examples of Type 1 guidelines include:<br />
Internal QA procedures, common in many companies.<br />
Textbooks. Many textbooks contain chapters with<br />
recommended modelling protocols (e.g. Anderson<br />
et al., 1993).<br />
Manuals to software packages with hints on the best<br />
way to use a model (e.g. Rumbaugh and Rumbaugh,<br />
2001; DHI, 2002).<br />
2.2.3. Type 2: Public technical guidelines<br />
These guidelines often contain the same substance as<br />
the internal technical guides mentioned above. However,<br />
they differ in the sense that they have been<br />
prepared through a consultative and consensus building<br />
process involving many persons and organisations. They<br />
focus on the technical aspects and give no or little<br />
emphasis to the interaction between the modeller and<br />
the end-user. Examples of Type 2 guidelines include:<br />
The CAMASE guidelines for modelling that were<br />
developed after substantial consultation within the<br />
scientific modelling community (CAMASE, 1996).<br />
Standards from American Society for Testing and<br />
Materials (e.g. ASTM, 1994).<br />
Many of the UK standards, especially the older ones<br />
(Packman, 2002).<br />
2.2.4. Type 3: Public interactive guidelines<br />
These guidelines have, like the public technical<br />
guidelines (Type 2), been established through a public<br />
consultative and consensus building process. However,<br />
they differ from the Type 2 guidelines by an additional<br />
focus on regulating the interaction between the modeller<br />
and the water manager, who often have the roles of<br />
consultant and client, respectively.<br />
Important elements in public interactive guidelines<br />
are reviews that, in addition to QA in the sense of technical<br />
guidance, can facilitate the consensus-building process<br />
between the parties. Experience shows that such a<br />
process is crucial for the overall credibility of the modelling<br />
process. Examples of such QA guidelines include<br />
(more details on these guidelines are provided in the next<br />
section):<br />
The Dutch guidelines (Van Waveren et al., 2000;<br />
Scholten and Groot, 2002).<br />
The Australian groundwater flow modelling guidelines<br />
established by the Murray-Darling Basin<br />
Commission (Middlemis, 2000; Merrick et al.,<br />
2002; Henriksen, 2002a).<br />
The Danish groundwater modelling guidelines<br />
(Henriksen, 2002b).<br />
Some of the recent UK standards (Packman, 2002).<br />
Californian guidelines prepared by Bay-Delta Modelling<br />
Forum (BDMF, 2000).<br />
2.3. Development stage and prevalence of QA guidelines<br />
Reviews of a number of existing QA guidelines (see<br />
details in the next section) revealed significant differences in<br />
current practice, both between domains and between<br />
different countries. In some domains and some countries<br />
there has been a clear trend over the past couple of<br />
decades to move from Type 1 to Type 2 or Type 3<br />
guidelines. In order to understand the development of<br />
QA guidelines and be able to provide recommendations<br />
based on anticipated future needs, it is important to try<br />
to understand why the present differences in the<br />
developmental stage of QA guidelines exist. The<br />
hypothesis that we will test is that the development<br />
stage depends on two main factors:<br />
The scientific maturity of the underlying discipline,<br />
i.e. how well understood are the underlying processes<br />
and how easily available are the data<br />
necessary for practical applications. In this respect,<br />
a mature scientific discipline is one where there is<br />
a general acceptance in the scientific community on<br />
how the processes are described, there are no<br />
significant controversies on key issues, and it is<br />
feasible to acquire the necessary data for practical
studies. Similarly, an immature scientific discipline is<br />
one where some processes are not well understood,<br />
where there are several alternative ‘schools’ on how<br />
to describe things, and where it is often not possible<br />
to obtain sufficient field data necessary to perform<br />
scientifically sound modelling. Immature scientific<br />
disciplines are often considered as being complex,<br />
and are characterised by unresolved problems such<br />
as scale problems. For example, whereas biology is<br />
a relatively old science in comparison with hydrogeology,<br />
biota (ecological) modelling is considered<br />
to be immature in contrast to groundwater flow<br />
modelling which is considered to be mature. Biota<br />
modelling is rather uncertain due to the inherent<br />
complexity of ecological systems and the general<br />
limited availability of relevant field data, whereas the<br />
mathematical principles describing groundwater<br />
flow are well established and flow systems are<br />
readily characterised in the field.<br />
The modelling market maturity, i.e. how well developed<br />
is the market for modelling studies. In this<br />
respect, a mature market is characterised by (a) the<br />
modelling market is relatively old with numerous<br />
examples of good and poor quality modelling<br />
studies, and the motivation for establishing QA<br />
guidelines is largely due to water managers having<br />
experience with studies of poor quality; (b) most jobs<br />
are outsourced to private consultants; (c) the volume<br />
of modelling work is large, so that a number of<br />
consultants can be sustained and standard routines<br />
can evolve; and (d) there is a considerable competition<br />
among modellers in getting the jobs. Similarly,<br />
an immature market is characterised by (a) it is relatively<br />
new (typically less than 10 years); (b) most modelling<br />
studies are carried out by government agencies themselves;<br />
(c) the volume of work for the consultants is<br />
small; and (d) there is virtually no competition<br />
among modellers, instead the work is carried out by<br />
a few specialised groups which are often located in or<br />
have close ties to the research community.<br />
If these hypotheses were true one would a priori<br />
expect that a considerable degree of scientific maturity is<br />
required for QA guidelines of Type 2 to develop, and<br />
that further a mature modelling market is a necessary<br />
prerequisite for the development of Type 3 guidelines.<br />
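Stated as a rule, the expectation above gives the most advanced guideline type one would anticipate<br />
for a given combination of the two factors. The function below is only a restatement of the hypothesis<br />
with an assumed name and boolean inputs; Section 3 shows that real cases do not always follow it.<br />

```python
def highest_expected_guideline_type(scientifically_mature, market_mature):
    """Return 1, 2 or 3: the most advanced QA guideline type expected
    under the two-factor hypothesis."""
    if not scientifically_mature:
        return 1  # Type 2 is expected to require scientific maturity
    if not market_mature:
        return 2  # Type 3 is expected to require a mature market as well
    return 3
```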
3. Existing guidelines<br />
Reviews of existing QA guidelines were conducted<br />
(Refsgaard, 2002). The reviews attempted to cover two<br />
aspects: (a) variation of practices between seven different<br />
modelling domains (groundwater, precipitation-runoff,<br />
hydrodynamics, flood forecasting, surface water quality,<br />
biota (ecology) and socio-economy); and (b) differences<br />
between geographical regions. The reviews of the state-of-the-art<br />
in the seven domains were carried out by<br />
seven different organisations with special expertise in the<br />
respective domains. During these reviews a broad search<br />
of relevant QA guidelines was made with primary focus<br />
on existing guidelines in Europe and secondarily<br />
on guidelines from North America and Australia.<br />
Subsequently, a few cases with guidelines from different<br />
geographical areas were selected for a more detailed<br />
review. The reviews did not intend to be exhaustive by<br />
including all important QA guidelines, but aimed at<br />
selecting guidelines representative for conditions in<br />
Europe, North America and Australia.<br />
In order to test the above hypotheses the conclusions<br />
of the state-of-the-art of QA guidelines for the different<br />
domains summarised in Section 3.1 are plotted in Fig. 2<br />
as a function of scientific maturity. Furthermore,<br />
examples of guidelines from different countries are<br />
presented in Section 3.2 and Fig. 3 with focus on market<br />
maturity.<br />
[Figure 2: modelling domains positioned by scientific maturity (Mature to Immature) against type of QA guidelines (Type 1 Internal, Type 2 Public, Type 3 Interactive).]<br />
Modelling domains: GW-HD: Groundwater flow; GW-AD: Groundwater solute transport; GW-WQ: Groundwater geochemistry; PR: Precipitation runoff; HD: Hydrodynamic – surface water flow; HD-Sed: Sediment transport/morphology; FF: Flood forecasting; SWQ: Surface water quality; Biota: Biota (ecology); SE: Socio-economy.<br />
Fig. 2. State-of-the-art for QA guidelines in different modelling domains plotted against maturity of the underlying scientific disciplines.<br />
[Figure 3: guideline cases positioned by modelling market maturity (Mature: old, big, competitive; Immature: new, small, specialised) against type of QA guidelines (Type 1 Internal, Type 2 Public, Type 3 Interactive).]<br />
Cases-guidelines: BDMF: Bay Delta Modelling Forum (California); AUS-GW: Australia, groundwater; NL-GMP: Dutch Good Modelling Practice; DK-GW: Denmark, groundwater; UK: United Kingdom, several domains; ASTM: American Society for Testing and Materials; CEE: Central and Eastern Europe; FR-FF: France, flood forecasting.<br />
Fig. 3. Different types of guidelines as a function of maturity in the modelling market.<br />
3.1. State-of-the-art in different modelling domains<br />
Groundwater modelling (Refsgaard and Henriksen,<br />
2002): In this field, QA guidelines are well developed<br />
and used in many countries, but mostly in groundwater<br />
flow modelling, where the state-of-the-art corresponds<br />
to Type 3 guidelines. For solute transport, and in<br />
particular for geochemical modelling, relatively few<br />
guidelines exist and they are not commonly used. The<br />
need for QA guidelines differs from country to country,<br />
amongst others due to different stages of development of<br />
the groundwater modelling market. For instance, the<br />
guides from the American Society for Testing and<br />
Materials (ASTM) were among the first of their kind to<br />
be developed, in the early 1990s, because the practical<br />
application of groundwater models at that time had<br />
progressed further in the USA than in most other<br />
countries.<br />
Precipitation-runoff modelling (Perrin et al., 2002a):<br />
Relatively few guidelines exist for this domain as standalone<br />
guidelines. The guidelines that do exist are generally<br />
confined to relatively simple (lumped) approaches,<br />
while no generic guidelines exist for the more complex<br />
models of the distributed physically-based type. Thus,<br />
the state-of-the-art for precipitation-runoff as a standalone<br />
domain may be characterised as Type 1/Type 2.<br />
However, it is also noted that precipitation-runoff<br />
modelling is often used as an integral part of other<br />
domains, e.g. groundwater models, hydrodynamic<br />
models, flood forecasting models and surface water<br />
quality models. For some of these integrated applications<br />
some guidelines have been developed which<br />
include the precipitation-runoff domain. This is, for<br />
instance, the case for the Danish groundwater guidelines<br />
(Henriksen, 2002b) which include aspects of precipitation-runoff<br />
modelling.<br />
Hydrodynamic modelling (Metelka and Krejcik,<br />
2002a): This domain includes environmental applications<br />
such as modelling of urban drainage and sewer<br />
systems, rivers, floodplains, estuaries and coastal waters<br />
both with respect to flows, sediment and morphological<br />
issues. QA guidelines are well developed in some fields<br />
(e.g. in urban drainage and river modelling), but not in<br />
other fields (e.g. sediment and morphological modelling).<br />
For hydrodynamic modelling in coastal areas and<br />
estuaries few QA guidelines have been identified. The<br />
state-of-the-art may be characterised as Type 2 for most<br />
parts of the domain and Type 1 for other parts. It is<br />
noted that hydrodynamic modelling is often an integral<br />
part of flood forecasting and surface water quality<br />
modelling. Although very similar in theoretical scientific<br />
background, this domain is different from the field of<br />
Computational Fluid Dynamics that typically is used for<br />
industrial purposes.<br />
Flood forecasting modelling (Balint, 2002): This<br />
domain differs fundamentally from the other domains<br />
by being based on real-time operation. This implies that<br />
the models, once established, are applied on a routine<br />
(daily) basis although often under extreme boundary<br />
conditions. The focus on QA in this domain is often<br />
concentrated on data quality for the on-line data<br />
acquisition. Due to this fundamental difference in nature,<br />
the status of QA guidelines for this domain does not fit<br />
well into the above classification, and it is not easily<br />
comparable to the status of the other domains.<br />
Surface water quality modelling (Da Silva et al.,<br />
2002): Surface water quality modelling is based on<br />
a description of physical, chemical and biological<br />
processes. Often the data availability to assess model<br />
processes and parameters is sparse and often the<br />
key processes are not well understood. QA guidelines
are generally not well developed. The state-of-the-art<br />
may be characterised as Type 1.<br />
Biota (ecological) modelling (Old et al., 2002):<br />
Ecology is a diverse branch of biology that focuses on<br />
the relations of flora and fauna to one another and to<br />
their physical environment. Ecological models are<br />
widely used today, but perceived as being rather<br />
uncertain due to the inherent complexity of ecological<br />
systems and the general limited availability of relevant<br />
field data. QA guidelines are generally not well<br />
developed. The state-of-the-art may be characterised as<br />
Type 1.<br />
Socio-economic modelling (Heinz and Eberle, 2002):<br />
No general QA guidelines exist for socio-economic<br />
modelling. The few existing guidelines, such as the<br />
CAMS, CFMPS and RBMPs in the UK, are specific for<br />
particular types of application, and they are so far only<br />
used in practice in a few countries. The state-of-the-art<br />
may be characterised as Type 1/Type 2.<br />
In Fig. 2 the state-of-the-art for QA guidelines in the<br />
respective modelling domains has been plotted against<br />
the scientific maturity of the underlying disciplines. The<br />
scientific maturity of the respective domains has been<br />
assessed subjectively on the basis of the criteria outlined<br />
in Section 2.3 above. There is a tendency that the least<br />
developed guidelines (Type 1) appear in domains where<br />
the underlying scientific basis is characterised as<br />
immature, i.e. in surface water quality, biota (ecology)<br />
and groundwater quality, reflecting that many fundamental<br />
scientific issues remain to be solved. Similarly,<br />
the Type 2 and Type 3 guidelines are dominant in<br />
domains characterised by scientific maturity. However,<br />
there are clear exceptions such as precipitation-runoff<br />
and flood forecasting, where other factors than scientific<br />
maturity must play a role for the development stage of<br />
QA guidelines.<br />
3.2. Current practice in different countries<br />
The current practice of using QA guidelines in<br />
different countries has been illustrated through some<br />
selected cases that have been reviewed in Refsgaard<br />
(2002). In Fig. 3 the type of QA guidelines used in the<br />
case studies is plotted against the maturity of the<br />
modelling market that has been assessed subjectively on<br />
the basis of the criteria given in Section 2.3 above. The<br />
practice as reflected by the case studies and shown on<br />
the figure is summarised as follows:<br />
Dutch guidelines (Scholten and Groot, 2002): The<br />
Dutch guidelines are the most generic of the existing<br />
guidelines in the sense that they cover all the domains<br />
relevant for river basin management. Technical<br />
guidance for the different modelling domains exists, but is<br />
not as detailed as some of the guidelines that only cover<br />
one domain (e.g. ASTM guides or Australian guidelines<br />
on groundwater flow modelling). The Dutch guidelines<br />
emphasise the dialogue process between modeller and<br />
water manager, including the review procedures. The<br />
Dutch guidelines belong to Type 3. The Dutch<br />
modelling market may be characterised as mature.<br />
Australian groundwater flow modelling guidelines<br />
(Henriksen, 2002a): The Australian guidelines are<br />
technically comprehensive. They focus on the dialogue<br />
between the modeller and the water manager in general<br />
and on review procedures in particular. The guidelines<br />
were developed over several years with involvement of<br />
all of the key stakeholders. The Australian guidelines<br />
belong to Type 3. The Australian groundwater modelling<br />
market may be characterised as mature.<br />
Danish groundwater modelling guidelines (Henriksen,<br />
2002b): The Danish Handbook of Good Modelling<br />
Practice and draft guidelines are similar to the Australian<br />
ones, although some important details differ. The Danish<br />
guidelines were initiated by the water managers, who also<br />
ensure that they are presently being used in most studies.<br />
The Danish guidelines belong to Type 3. The<br />
Danish groundwater modelling market may be characterised<br />
as mature.<br />
Central and Eastern Europe (Metelka and Krejcik,<br />
2002b; Van Gils and Groot, 2002): Public QA guidelines<br />
are neither well developed nor used. Many modellers<br />
therefore rely only on internal QA procedures (Type 1)<br />
adopted by their respective organisations. This situation<br />
reflects a new and unregulated market for modelling<br />
services, and a market where the managers and their<br />
organisations often are technically too weak to adopt<br />
and enforce QA guidelines.<br />
French guidelines in flood forecasting (Perrin et al.,<br />
2002b): Public or interactive guidelines do not exist in<br />
this area, and the case study describes a set of internal<br />
technical guidelines (Type 1). Although flood forecasting<br />
is an old modelling discipline, the modelling<br />
market is virtually non-existent, because flood forecasting<br />
modelling in France (as well as in most other<br />
countries) is carried out either by a government agency<br />
or by a specialised research institute.<br />
UK guidelines (Packman, 2002): QA guidelines are<br />
generally very well developed in the UK. Application of<br />
guidelines is prescribed as a routine in most areas of<br />
model application. Thus, in general the UK market for<br />
modelling services is well regulated and characterised as<br />
being mature. Most of the guidelines are of Type 2 and<br />
some recent ones of Type 3. The exceptions to this are<br />
the surface water quality and biota (ecological) domains<br />
where no general guidelines exist. The guidelines in these<br />
domains are therefore confined to internal procedures<br />
inspired by textbooks and manuals (Type 1).<br />
Bay Delta Modelling Forum, California (BDMF,<br />
2000): The Californian guidelines provide a framework,<br />
but very few technical details. The main emphasis of<br />
these guidelines is on the interaction between modellers,<br />
managers and the public (Type 3). In this respect various
kinds of reviews are prescribed at various stages of the<br />
modelling process. The American market in general and<br />
the Californian in particular are well established<br />
(mature).<br />
American Society for Testing and Materials (ASTM,<br />
1992, 1994): The American guidelines are especially<br />
comprehensive in the groundwater domain, where they<br />
have served as inspiration for all the other groundwater<br />
guidelines, including the Australian and the Danish<br />
guidelines. There are a number of guidelines on various<br />
elements of the modelling process. These guides are 5–10<br />
years old and are mainly technical in nature, while<br />
limited focus is put on the interaction and review<br />
process.<br />
In addition to the above QA guidelines, ISO (the<br />
International Organization for Standardization) regularly<br />
publishes quality management and quality assurance<br />
standards. ISO standards provide guidance on<br />
fundamental principles and procedures, but on a rather<br />
general level. We have found ISO standards addressing<br />
development, supply and maintenance of computer<br />
software (ISO 9000-3:1997) and other standards providing<br />
guidance for a general process based quality<br />
management system in an organisation (ISO<br />
9004:2000(E)). However, none of the ISO standards<br />
include any particular guidance on matters related to<br />
water resources modelling or management, and they are<br />
therefore of limited practical use compared to the<br />
other QA guidelines above, which are dedicated to water resources<br />
modelling.<br />
3.3. Content of existing guidelines<br />
3.3.1. Key elements<br />
The existing guidelines all comprise modelling protocols<br />
with recommended steps and technical guidance<br />
on how to perform these steps in the modelling process.<br />
The key elements may be divided into two groups,<br />
namely: (1) technical guides on how to use models; and<br />
(2) guides for regulating the interaction between<br />
modeller and end-user/water manager. The key elements<br />
in the technical guides include:<br />
Definition of the purpose of the modelling study.<br />
Collection and processing of data.<br />
Establishment of a conceptual model.<br />
Selection of code or alternatively programming and<br />
verification of code.<br />
Model set-up.<br />
Establishment of performance criteria.<br />
Model calibration.<br />
Model validation.<br />
Uncertainty assessments.<br />
Simulation with model application for a specific<br />
purpose.<br />
Reporting.<br />
The key elements in the interaction between the<br />
modeller and the end-user include some of the above<br />
elements as well as other aspects:<br />
Definition of the purpose of the modelling study,<br />
including translation of the end-user's needs into<br />
preliminary performance criteria.<br />
Establishment of performance criteria. The accuracy<br />
of the model predictions has to be established via<br />
a trade-off between the benefits of improving the<br />
accuracy in terms of less uncertainty on the<br />
management decisions and the costs of improving<br />
the accuracy through additional model studies and/<br />
or collection of additional field data.<br />
Reviews with subsequent consultation between the<br />
modeller and the end-user at different phases of the<br />
modelling project.<br />
The content of the technical guides is to a large<br />
extent domain specific, while the elements of the<br />
interaction between the modeller and the end-user are<br />
more general in nature and differ only slightly from one<br />
domain to another.<br />
3.3.2. Integration across modelling domains<br />
Almost all the existing guidelines were developed for<br />
a specific domain e.g. groundwater modelling. As<br />
integrated modelling may be expected to play an<br />
important role in connection with implementation of<br />
the EU Water Framework Directive and adoption of<br />
Integrated Water Resources Management principles,<br />
guidelines not including integrated modelling aspects are<br />
inadequate. Even the Dutch guidelines (Scholten and<br />
Groot, 2002) which cover a large number of domains are<br />
essentially single domain guidelines, because they do not<br />
provide guidance on how to integrate across domains<br />
(interdependencies etc.). However, the Dutch guidelines<br />
do have the clear advantage over other existing guidelines<br />
in that they are based on a common methodology<br />
and a common glossary.<br />
It should be noted though that some guidelines cover<br />
more than one modelling domain, as they are defined<br />
here. For instance, hydrodynamic modelling or groundwater<br />
modelling is often combined with precipitation-runoff<br />
modelling, and guidelines combining these domains exist.<br />
3.3.3. Differences in terminology<br />
As illustrated in Refsgaard (2002) the terminology<br />
used in the modelling community varies significantly<br />
between domains and even to some extent from one<br />
country to another. This clearly demonstrates the need<br />
for establishing one common terminology and glossary<br />
for modelling applications as addressed by Refsgaard<br />
and Henriksen (2004).
4. Outline of new guidelines – HarmoniQuA<br />
4.1. Overall aim and structure<br />
On the basis of the knowledge achieved through the<br />
review of existing guidelines, the HarmoniQuA project<br />
aims to develop a new comprehensive set of guidelines<br />
and supporting software tools to facilitate an improved<br />
quality of the modelling process and hence enhance the<br />
confidence of all stakeholders.<br />
HarmoniQuA forms part of the CATCHMOD<br />
cluster of EU research projects (Blind, 2004). It aims<br />
to be a methodological component of a future infrastructure<br />
for model based decision support for water<br />
management at catchment and river basin scale. This<br />
main goal will be reached by providing the elements of<br />
a methodological layer in this infrastructure, embodied<br />
in a knowledge base (KB) and software tools. HarmoniQuA<br />
will collect methodological expertise, structure<br />
this knowledge and identify and fill in gaps. It will<br />
consist of generic and domain specific knowledge,<br />
modelling software specific aspects, and a transparent<br />
and consistent glossary of terms and concepts. This<br />
body of knowledge will be structured in a knowledge<br />
base. The following set of software tools will provide<br />
functionality for the HarmoniQuA system:<br />
guideline tool: will generate guidelines from the KB;<br />
monitoring tool: will monitor all activities within<br />
a modelling job and store these activities as a single<br />
model journal in a model archive;<br />
report tool: generates reports from a model journal;<br />
advisor tool: advises modellers in new modelling jobs<br />
based on decisions and choices of previous jobs and<br />
associated model journals in the model archive.<br />
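The division of labour between the monitoring, report and advisor tools can be illustrated with a minimal sketch. It is not the HarmoniQuA software itself: the class `ModelJournal`, the function `advise` and the recorded decision text are hypothetical names and values chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ModelJournal:
    """Hypothetical record of all activities within one modelling job."""
    project: str
    entries: list = field(default_factory=list)

    def record(self, task: str, decision: str) -> None:
        # Monitoring: append one activity to the journal.
        self.entries.append((task, decision))

    def report(self) -> str:
        # Reporting: render the journal as plain text.
        lines = [f"Model journal, {self.project}"]
        lines += [f"- {task}: {decision}" for task, decision in self.entries]
        return "\n".join(lines)


def advise(archive: list, task: str) -> list:
    # Advice: collect the decisions taken for the same task in earlier jobs.
    return [d for journal in archive for t, d in journal.entries if t == task]


journal_a = ModelJournal("Project A")
journal_a.record("Code Selection", "chose an existing groundwater code")
archive = [journal_a]  # the model archive holds one journal per job

print(journal_a.report())
print(advise(archive, "Code Selection"))
```

In this sketch the model archive is simply a list of journals, so advice for a new job amounts to looking up what was decided for the same task in earlier jobs.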
An overview of the HarmoniQuA products (KB and<br />
tools) and how these interact with the activities of the<br />
users is presented in Fig. 4. The lower part of Fig. 4<br />
depicts the five major steps of the modelling process.<br />
These five major steps are decomposed into 45 tasks,<br />
with interrelations (order and feedback) as shown in<br />
Fig. 5. Each task has an internal structure, i.e. name,<br />
definition, explanation, interrelations with other tasks,<br />
activities, activity related methods, references, task<br />
inputs and outputs. This knowledge structure (steps,<br />
tasks, within-task-knowledge) is stored in the KB. The<br />
five steps and the tasks have been selected on the basis of<br />
existing modelling protocols and QA guidelines and<br />
include the key elements outlined in Section 3.3 above.<br />
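The knowledge structure described above (steps, tasks, within-task knowledge) can be pictured as a simple record type. The following is only a sketch under our own assumptions: the field names paraphrase the attributes listed in the text, the three task names are taken from Fig. 5, and their assignment to steps is our reading of that figure, not an extract from the actual knowledge base.

```python
from dataclasses import dataclass, field

# The five major steps of the modelling process (Fig. 4).
STEPS = (
    "Model Study Plan",
    "Data and Conceptualisation",
    "Model Set-up",
    "Calibration and Validation",
    "Simulation and Evaluation",
)


@dataclass
class Task:
    """One task in the knowledge base; field names are illustrative."""
    name: str
    step: str
    definition: str = ""
    explanation: str = ""
    interrelations: list = field(default_factory=list)  # names of linked tasks
    activities: list = field(default_factory=list)
    methods: list = field(default_factory=list)
    references: list = field(default_factory=list)
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)

    def __post_init__(self):
        # Every task must belong to one of the five steps.
        if self.step not in STEPS:
            raise ValueError(f"unknown step: {self.step}")


def tasks_in_step(tasks, step):
    """Names of the tasks belonging to a given step, in stored order."""
    return [t.name for t in tasks if t.step == step]


kb = [
    Task("Describe Problem and Context", "Model Study Plan"),
    Task("Define Objectives", "Model Study Plan"),
    Task("Construct Model", "Model Set-up"),
]
print(tasks_in_step(kb, "Model Study Plan"))
```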
Model based decision support has several dimensions,<br />
which hinder a ‘one-size-fits-all’-approach. HarmoniQuA<br />
attempts to serve several types of users in<br />
[Fig. 4. HarmoniQuA tools (MoST) to support the QA process. Panels: Knowledge Base (guidelines, software capabilities, glossary) covering the domains groundwater, precipitation-runoff, hydrodynamics, flood forecasting, water quality, biota (ecology) and socio-economics; Model Archive (model journals of individual projects); MoST functions (guidance, monitoring, reporting, advice); and the five steps of the modelling process: Model Study Plan, Data and Conceptualisation, Model Set-up, Calibration and Validation, Simulation and Evaluation. Reporting and client review take place in each step.]<br />
[Fig. 5. The five steps and 45 tasks of the modelling process in the HarmoniQuA knowledge base. Flowchart with one column per step: Model Study Plan; Data and Conceptualisation; Model Set-up; Calibration and Validation; Simulation and Evaluation. Tasks include, for example, Describe Problem and Context, Define Objectives, Construct Model, Select Calibration Parameters, Validation, Uncertainty Analysis of Simulation and Model Study Closure. Legend: ordinary task, decision task, review task, feedforward, feedback.]<br />
a series of water management domains, in jobs of<br />
diverse complexity and diverse application purpose.<br />
In this way, users working on a specific job will only be<br />
confronted with guidelines, instructions, decisions and<br />
activities that are relevant to their role in a particular<br />
modelling job.<br />
The HarmoniQuA tools have been developed in<br />
Protégé 2000 following an ontological approach. More<br />
details can be found in Kassahun et al. (2004). The tools<br />
are available on http://www.harmoniqua.org/.<br />
4.2. Key elements<br />
Some of the key features to be implemented in the<br />
new HarmoniQuA guidelines are:<br />
4.2.1. Interactive guidelines<br />
The dialogue between the different players is crucial<br />
to ensure that the output from the modelling process is<br />
understandable for stakeholders and beneficial for the<br />
client. The importance of involving stakeholders<br />
and public opinion is emphasised by Pahl-Wostl<br />
(2002) and addressed in some Type 3 guidelines (e.g.<br />
BDMF, 2000; Pascual et al., 2003). In HarmoniQuA,<br />
each of the five major steps (Fig. 5) is therefore<br />
concluded with a dialogue task, in terms of either<br />
contract negotiation (first step) or reviews (last four<br />
steps). A dialogue task encourages the assessment of the<br />
present step and provides the opportunity to redefine the<br />
content of the model study plan for the next step based<br />
upon the results and findings of the present step. These<br />
dialogue tasks provide flexibility to the modelling study<br />
and ensure that the tasks that have yet to be performed<br />
can be modified according to the achieved results and<br />
perceptions of modeller and client.<br />
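The role of the dialogue task at the end of each step can be sketched as a loop in which a step is only left once modeller and client agree. This is a deliberate simplification: in Fig. 5 a review may also send the study back to an earlier step, whereas the sketch only repeats the current one. The function `run_study`, the `max_rounds` limit and the example outcome are hypothetical.

```python
STEPS = [
    "Model Study Plan",
    "Data and Conceptualisation",
    "Model Set-up",
    "Calibration and Validation",
    "Simulation and Evaluation",
]


def run_study(steps, dialogue, max_rounds=3):
    """Walk through the steps; each one ends with a dialogue task.

    `dialogue(step, attempt)` returns True when modeller and client
    accept the outcome of the step, False when it must be redone.
    """
    log = []
    for step in steps:
        for attempt in range(1, max_rounds + 1):
            accepted = dialogue(step, attempt)
            log.append((step, attempt, accepted))
            if accepted:
                break
        else:
            # No agreement within the allowed number of rounds.
            raise RuntimeError(f"no agreement reached in step: {step}")
    return log


def dialogue(step, attempt):
    # Hypothetical outcome: the first calibration review is rejected.
    return not (step == "Calibration and Validation" and attempt == 1)


log = run_study(STEPS, dialogue)
for entry in log:
    print(entry)
```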
4.2.2. Transparency and reproducibility<br />
Transparency and reproducibility are important,<br />
especially for large studies involving use of complex<br />
models. This will be ensured through the Monitoring<br />
Tool which enables modelling teams, consisting of<br />
modellers, managers and auditors, to be guided through<br />
the modelling process, to monitor all modelling activities<br />
and to oversee the status of each task to be performed. With<br />
an increasing tendency to reuse existing models or<br />
rebuild them with additional data, modified conceptual<br />
models (revised model structure and/or inclusion of<br />
additional processes) and improved calibration and<br />
validation tests, this functionality of the Monitoring<br />
Tool becomes very important.<br />
4.2.3. Accuracy criteria<br />
Establishment of accuracy criteria for a modelling<br />
study is a very important, but difficult, issue. Modellers<br />
often establish numerical accuracy criteria in order to<br />
classify the goodness of a given model (e.g. Henriksen<br />
et al., 2003; Scholten and Van der Tol, 1998). These<br />
attempts are very useful in making the performance<br />
more transparent and quantitative, but do not provide<br />
an objective means to decide what the optimal accuracy<br />
criteria really should be in a given case. According to<br />
Refsgaard and Henriksen (2004) no universal accuracy<br />
criteria can be established, i.e. it is generally not possible<br />
from a natural scientific point of view to tell when<br />
a model performance is good enough. Such acceptance<br />
criteria will vary from case to case depending on the<br />
socio-economic context, i.e. what is at stake in the<br />
decisions to be supported by the model predictions. An<br />
appropriate question may be: how do we translate the<br />
‘soft’ socio-economic objectives to ‘hard-core’ model<br />
performance criteria? This is obviously a challenge that<br />
cannot be solved by natural science alone, but needs to<br />
be addressed in a much broader context including<br />
aspects of economy, stakeholder interests and risk<br />
perception.<br />
Performance statistics must comprise quantifiable<br />
and objective measures. However, numerical measures<br />
cannot stand alone. Expert opinion is often a necessary<br />
supplement.<br />
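As an illustration of quantifiable performance measures, the sketch below computes two statistics that are widely used in hydrology, the root mean square error and the Nash-Sutcliffe efficiency, and compares them with acceptance thresholds. The choice of these two statistics is our assumption, not something prescribed by the guidelines, and the thresholds and data are hypothetical; as argued above, suitable thresholds depend on the socio-economic context and cannot be derived from the numbers alone.

```python
import math


def rmse(observed, simulated):
    """Root mean square error between observed and simulated values."""
    squared = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    return math.sqrt(squared / len(observed))


def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def meets_criteria(observed, simulated, max_rmse, min_nse):
    """Check both statistics against agreed (case-specific) thresholds."""
    return (rmse(observed, simulated) <= max_rmse
            and nash_sutcliffe(observed, simulated) >= min_nse)


# Illustrative data only.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.0, 2.0, 3.0, 5.0]

print(rmse(observed, simulated))
print(nash_sutcliffe(observed, simulated))
print(meets_criteria(observed, simulated, max_rmse=0.6, min_nse=0.75))
```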
4.2.4. Uncertainty assessments<br />
Quality assurance and uncertainty assessments are<br />
two aspects that are very closely linked. Initially, the<br />
manager has to define accuracy criteria from a perception<br />
of which uncertainty level he/she believes is suitable<br />
for a particular case (see above). Subsequently, as the<br />
modelling study proceeds, the dialogue between modeller<br />
and manager has to continue with the necessary<br />
trade-off between modelling accuracy and the cost of the<br />
modelling study. In the uncertainty assessments it is very<br />
important to go beyond the traditional statistical<br />
uncertainty analysis. For example, aspects of scenario<br />
uncertainty and ignorance should generally be included.<br />
In addition, the uncertainties originating from data<br />
and models often need to be integrated with socio-economic<br />
aspects in order to form a suitable basis for<br />
the further decision process (e.g. Van Asselt and<br />
Rotmans, 2002). Thus, as with the accuracy criteria<br />
(above) the use of uncertainty assessments in water<br />
resources management goes beyond natural science.<br />
Assessment of uncertainty due to errors in the model<br />
structure is a particularly difficult task and is most often<br />
neglected. One way of evaluating this source of uncertainty<br />
is through the establishment of alternative<br />
conceptual models. This aspect is emphasised in the<br />
HarmoniQuA guidelines.<br />
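A very simple way of using alternative conceptual models is to compare their predictions of the same quantity. The sketch below reports the range across models; the model names and values are hypothetical, and the spread is only a rough indicator of structural uncertainty, not a probabilistic estimate.

```python
def structural_spread(predictions):
    """Lowest value, highest value and range across alternative models."""
    values = list(predictions.values())
    return min(values), max(values), max(values) - min(values)


# Hypothetical predictions of one quantity (e.g. a groundwater head in m)
# from three alternative conceptual models of the same system.
predictions = {"model_A": 12.0, "model_B": 13.5, "model_C": 11.0}

low, high, spread = structural_spread(predictions)
print(low, high, spread)
```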
4.2.5. Model validation<br />
Although experience shows that models generally<br />
perform more poorly in validation tests against independent<br />
data than they do in calibration tests, model validation is<br />
in our opinion a neglected issue, both in many modelling
guidelines and in the scientific literature. Many<br />
scientists may have avoided the term validation because<br />
of the controversies it raises in the philosophy of science, but<br />
in any case many scientists do not advocate the need for<br />
model validation. One of the unfortunate consequences<br />
of this ‘lack of interest’ is that not much work has been<br />
devoted to developing suitable validation test schemes<br />
since Klemes (1986). In our opinion further development<br />
of suitable testing schemes, particularly for non-linear<br />
models and for applications comprising extrapolations<br />
beyond the calibration data basis, and imposing them on<br />
all modelling projects is a major future challenge.<br />
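The difference between calibration and validation performance can be illustrated with a split-sample test, one of the schemes commonly associated with Klemes (1986). The sketch below is a toy example under our own assumptions: the model is a one-parameter linear relation, the record is split in two halves, and the data are invented so that the second half departs from the behaviour seen in the first.

```python
import math


def calibrate(x, y):
    """Least-squares slope k of the model y = k * x (through the origin)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def rmse(x, y, k):
    """Root mean square error of the model y = k * x on the given data."""
    return math.sqrt(sum((b - k * a) ** 2 for a, b in zip(x, y)) / len(x))


def split_sample_test(x, y):
    """Calibrate on the first half of the record, validate on the second."""
    half = len(x) // 2
    k = calibrate(x[:half], y[:half])
    return k, rmse(x[:half], y[:half], k), rmse(x[half:], y[half:], k)


# Illustrative data only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 7.0, 10.0]

k, cal_error, val_error = split_sample_test(x, y)
print(k, cal_error, val_error)
```

Here the calibration error is zero while the validation error is not, which is the pattern described above: performance against independent data is generally poorer than performance against the calibration data.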
4.2.6. Dedication aspects<br />
The QA guidelines describe the different tasks and<br />
responsibilities of the different types of users such as (1)<br />
modellers; (2) water managers; (3) auditors; (4) stakeholders<br />
(other than water manager); and (5) general<br />
public.<br />
The QA guidelines are developed so that they<br />
adequately reflect the different requirements in several<br />
modelling domains (and still maintain a common generic<br />
core to ensure coherency). Furthermore, the guidelines<br />
will be applicable for studies where several domains,<br />
including socio-economy, are integrated.<br />
The QA guidelines differentiate according to job<br />
complexity in modelling, e.g. (1) basic (rough calculations);<br />
(2) intermediate (moderately complex calculations);<br />
and (3) comprehensive (sophisticated, detailed<br />
calculations).<br />
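The dedication aspects amount to filtering the guidance by type of user, modelling domain and job complexity. The sketch below shows one possible way of doing this; the `GuidanceItem` structure and the three example items are hypothetical, while the user types and complexity levels follow the text.

```python
from dataclasses import dataclass

ROLES = ("modeller", "water manager", "auditor", "stakeholder", "general public")
COMPLEXITY = ("basic", "intermediate", "comprehensive")


@dataclass(frozen=True)
class GuidanceItem:
    text: str
    roles: tuple          # user types the item is addressed to
    domains: tuple        # empty tuple means generic (all domains)
    min_complexity: str   # lowest job complexity at which the item applies


def relevant(items, role, domain, complexity):
    """Guidance texts that apply to this user, domain and job complexity."""
    level = COMPLEXITY.index(complexity)
    return [
        item.text
        for item in items
        if role in item.roles
        and (not item.domains or domain in item.domains)
        and COMPLEXITY.index(item.min_complexity) <= level
    ]


# Hypothetical guidance items.
items = [
    GuidanceItem("Agree on performance criteria",
                 ("modeller", "water manager"), (), "basic"),
    GuidanceItem("Establish alternative conceptual models",
                 ("modeller",), ("groundwater",), "comprehensive"),
    GuidanceItem("Review the model study report",
                 ("auditor",), (), "basic"),
]

print(relevant(items, "modeller", "groundwater", "basic"))
print(relevant(items, "modeller", "groundwater", "comprehensive"))
```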
5. Discussion and conclusions<br />
5.1. Types and reasons of existing QA guidelines<br />
We have classified quality assurance (QA) guidelines in<br />
three types: Internal technical guidelines (Type 1), Public<br />
technical guidelines (Type 2), and Public interactive<br />
guidelines (Type 3). We have then characterised the<br />
conditions for which the guidelines are used by (a) the<br />
scientific maturity of the underlying discipline(s) and (b)<br />
the maturity of the modelling market in the region/<br />
country for which the guidelines were developed. Our<br />
review of existing QA guidelines is not exhaustive, but<br />
limited to examples aimed at being representative for<br />
conditions in Europe, North America and Australia.<br />
Thus, we have for instance not reviewed QA guidelines<br />
from countries in Asia, where modelling has taken place<br />
for many years. The results of our review revealed<br />
significant variations in the type of guidelines available<br />
and their usage between different modelling domains and<br />
countries. We hypothesised that the stage of QA guideline<br />
development largely depends on the maturity of both the<br />
specific scientific discipline and the modelling market in<br />
the respective country or region (Figs. 2 and 3).<br />
Considering Figs. 2 and 3 it appears that the maturity<br />
of the scientific discipline and market both play an<br />
important role in QA development. However, neither<br />
the scientific level nor the market maturity alone is able<br />
to explain the differences in the stage of QA guideline<br />
development. If the underlying process understanding or<br />
necessary data are too weak, then the modelling process<br />
lacks credibility no matter how well QA procedures are<br />
adhered to. Hence, the motivation to establish sophisticated<br />
QA guidelines in such cases is small. Similarly,<br />
even though a specific discipline may be scientifically<br />
mature, modellers may be reluctant to use sophisticated<br />
QA guidelines if they are not required to do so by<br />
regulators and/or water managers. The general development<br />
of QA guidelines has progressed over time<br />
from Type 1 towards Type 3. A developmental process<br />
that is consistent with the results of the reviews as<br />
reflected in Figs. 2 and 3 is the following.<br />
Initially, when models are introduced for practical<br />
application, internal technical guidelines (Type 1)<br />
originating from the research community are applied.<br />
The development from Type 1 to Type 2 QA guidelines<br />
requires a certain degree of maturity within both the<br />
specific scientific discipline and the market. This implies<br />
that there should be no significant gaps in knowledge<br />
on process descriptions, and that there is a common<br />
agreement about the scientifically sound procedures for<br />
solving the problems in this domain. The development<br />
of Type 2 guidelines is most often driven by the demands<br />
of regulators and water managers. The development<br />
from Type 2 to Type 3 requires a clear and conscious<br />
demand from regulators and water managers.<br />
It would also have been possible to classify the QA<br />
guidelines by other criteria, for example according to<br />
how uncertainty analysis is treated, whether they apply<br />
to single or multiple domains and whether they apply to<br />
natural or social science. We have chosen our classification<br />
for two main reasons. Firstly, an improved mutual<br />
understanding between modeller and water manager is<br />
crucial for a model application to be successful in<br />
practice, and this should be facilitated by the QA<br />
guidelines. Secondly, the trend of increasing stakeholder<br />
involvement in the water resources management process<br />
demands that QA guidelines also enable stakeholders to<br />
observe and take part in parts of the modelling process.<br />
Our characterisation of QA guidelines according to<br />
scientific and market maturity has some weaknesses.<br />
First of all, the assessments have been done subjectively,<br />
because there was no other feasible method. Secondly,<br />
the two characteristics are not completely independent.<br />
Thus, a large and mature market will often create a demand<br />
for new scientific knowledge and hence stimulate<br />
scientific development, and it will also lead to a need for<br />
improved technical standards.<br />
Altogether, it may be concluded that our hypotheses<br />
on the importance of scientific and market maturity for
the development of QA guidelines have not been<br />
falsified. However, due to the above weaknesses and<br />
the limited empirical basis (review not exhaustive but<br />
selected examples) this conclusion should be taken with<br />
some reservation.<br />
5.2. Organisational requirements for QA guidelines to be effective<br />
As emphasised by e.g. Forkel (1996) modelling<br />
studies involve several partners with different responsibilities.<br />
The ‘key players’ are code developers, model<br />
users (modellers) and water managers (including planning<br />
and regulatory authorities). To a large extent the<br />
quality of the modelling study is determined by the<br />
expertise, attitudes and motivation of the teams involved<br />
in the modelling and quality assessment process.<br />
The attitude of the modellers is important. NRC<br />
(1990) characterises this as follows: ‘‘most modellers<br />
enjoy the modelling process but find less satisfaction in<br />
the process of documentation and quality assurance’’.<br />
Scholten and Groot (2002) describe the main problem<br />
with the Dutch Handbook on Good Modelling Practice<br />
as being that everyone likes it, but only a few use it.<br />
QA will only become successful if both of the parties,<br />
modeller and water manager, are motivated and active in<br />
supporting its use. The water manager has a particular<br />
responsibility, because he/she has the power to request<br />
and pay for adequate QA in modelling studies. Therefore,<br />
QA guidelines can only be expected to be used in practice,<br />
if the water manager prescribes their use. In this respect it<br />
is very important that the water manager has the technical<br />
capacity to organise the QA process. A significant<br />
problem for the water manager’s organisation is that it often<br />
lacks individuals who are trained at an appropriate level<br />
to understand and use models. If the water manager does<br />
not possess such skill within his/her own staff, an external<br />
modelling expert can be hired to help the manager in the<br />
QA process. However, this requires that the manager is<br />
aware of the problem and the need.<br />
5.3. The HarmoniQuA guidelines<br />
The approach adopted in the present HarmoniQuA<br />
guidelines corresponds to Type 3. However, in addition<br />
to its focus on the dialogue and role play between the<br />
various actors in the modelling process, i.e. modellers,<br />
water managers, auditors and the public/stakeholders,<br />
the HarmoniQuA approach is innovative compared to<br />
existing Type 3 QA guidelines on the following aspects:<br />
Supporting software tools, beyond simple scoreboards<br />
and templates, are novel and important<br />
elements. These tools, which contain the knowledge<br />
base (KB), can guide the users through the<br />
modelling process, monitor decisions and outcomes,<br />
and provide experience-based advice on the<br />
appropriate route to be followed. This will significantly<br />
improve the transparency and reproducibility<br />
of the modelling process. To our knowledge no such<br />
tools exist or are under development at present.<br />
The focus on performance and accuracy criteria<br />
in the modelling process is not novel as such. However,<br />
the current adaptation of these criteria through<br />
the process in connection with the formalised review<br />
steps is, if not novel, then at least emphasised much<br />
more in the HarmoniQuA guidelines than in any<br />
other existing guidelines. This approach allows the<br />
HarmoniQuA guidelines to fit nicely with the new<br />
ideas of adaptive management (Pahl-Wostl, 2002).<br />
The uncertainty aspects are given a more central role<br />
than in existing guidelines, where uncertainty often<br />
is confined to assessment of predictive uncertainties<br />
towards the end of the study. In the HarmoniQuA<br />
guidelines uncertainty aspects play an important<br />
role in 13 of the 45 tasks. Thus, uncertainty<br />
assessment is a central element in the dialogue<br />
between modeller and water manager already in the<br />
beginning of the model study when the initial<br />
performance criteria are outlined. Furthermore,<br />
HarmoniQuA recommends including less quantifiable<br />
elements such as scenario uncertainty and<br />
model structural uncertainty in the assessment.<br />
Model validation tests against independent data have<br />
more emphasis than in most other guidelines.<br />
For example, although the Dutch guidelines (Van Waveren<br />
et al., 2000), the most comprehensive of the existing<br />
guidelines, recommend that validation be carried out,<br />
they do not describe validation tests beyond the<br />
traditional split-sample test.<br />
The HarmoniQuA guidelines are unique in their<br />
dedication aspects, namely that different tasks and<br />
responsibilities are described for different users,<br />
different modelling domains and different levels of<br />
modelling job complexity. The Australian groundwater<br />
modelling guidelines have the same feature,<br />
but only with respect to the review procedures<br />
(Merrick et al., 2002).<br />
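The traditional split-sample test mentioned in the list above can be sketched as follows. This is a minimal illustration and not part of the HarmoniQuA guidelines or tools: the one-parameter runoff model, the RMSE criterion, the function names and the data are all assumptions made for the example.

```python
import math


def rmse(observed, simulated):
    """Root-mean-square error between two equally long sequences."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("sequences must be non-empty and equally long")
    squared = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    return math.sqrt(squared / len(observed))


def calibrate_runoff_coefficient(rain, flow):
    """Least-squares fit of flow = c * rain (a deliberately trivial model)."""
    return sum(r * q for r, q in zip(rain, flow)) / sum(r * r for r in rain)


def split_sample_test(rain, flow, split_index):
    """Calibrate on the first part of the record, validate on the rest."""
    c = calibrate_runoff_coefficient(rain[:split_index], flow[:split_index])
    cal_error = rmse(flow[:split_index], [c * r for r in rain[:split_index]])
    val_error = rmse(flow[split_index:], [c * r for r in rain[split_index:]])
    return c, cal_error, val_error


# Hypothetical rainfall and flow values, invented for illustration only.
rain = [10.0, 0.0, 25.0, 5.0, 40.0, 12.0, 0.0, 30.0]
flow = [4.1, 0.0, 9.8, 2.2, 16.5, 4.6, 0.0, 12.4]
c, cal_error, val_error = split_sample_test(rain, flow, split_index=4)
print(f"c = {c:.3f}, calibration RMSE = {cal_error:.3f}, "
      f"validation RMSE = {val_error:.3f}")
```

The validation error is computed on data that played no part in the calibration, which is what makes the second half of the record an independent test.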
The HarmoniQuA guidelines consist of a comprehensive<br />
set of QA guidelines for multiple modelling domains<br />
combined with the supporting software tools. These<br />
functionalities appear to be well suited to the challenges<br />
and demands of modern water resources management.<br />
The usefulness, user friendliness and appreciation by the<br />
users will be assessed by testing the guidelines<br />
and tools in a range of river basin modelling projects.<br />
Acknowledgements<br />
The present work was carried out within the<br />
Project ‘Harmonising Quality Assurance in model based
catchments and river basin management (HarmoniQuA)’,<br />
which is partly funded by the EC Energy,<br />
Environment and Sustainable Development programme<br />
(Contract EVK1-CT2001-00097). The constructive comments<br />
of five anonymous reviewers are acknowledged.<br />
References<br />
Anderson, M.G., Bates, P.D., 2001. Hydrological science: model<br />
credibility and scientific integrity. In: Anderson, M.G., Bates, P.D.<br />
(Eds.), Model Validation. Perspectives in Hydrological Science.<br />
John Wiley & Sons, Chichester, pp. 1–10.<br />
Anderson, M.P., Woessner, W.W., 1992. The role of postaudit in<br />
model validation. Advances in Water Resources 15, 167–173.<br />
Anderson, M.P., Ward, D.S., Lappala, E.G., Prickett, T.A., 1993.<br />
Computer models for subsurface water. In: Maidment, D.R. (Ed.),<br />
Handbook of Hydrology. McGraw-Hill, Inc (Chapter 22).<br />
ASTM, 1992. Standard Practice for Evaluating Mathematical Models<br />
for the Environmental Fate of Chemicals. Standard E978-92,<br />
American Society for Testing and Materials, http://www.astm.org.<br />
ASTM, 1994. Standard Guide for Application of a Ground-Water<br />
Flow Model to a Site-Specific Problem. Standard D5447-93,<br />
American Society for Testing and Materials, http://www.astm.org.<br />
Balint, G., 2002. State-of-the-art for flood forecasting modelling. In:<br />
Refsgaard, J.C. (Ed.), State-of-the-Art Report on Quality Assurance<br />
in Modelling Related to River Basin Management. Chapter 7,<br />
Geological Survey of Denmark and Greenland, Copenhagen,<br />
http://www.harmoniqua.org.<br />
BDMF, 2000. Protocols for Water and Environmental Modeling.<br />
Bay-Delta Modeling Forum. Ad hoc Modeling Protocols Committee,<br />
http://www.sfei.org/modelingforum/.<br />
Blind, M., 2004. ICT requirements for an ‘evolutionary’ development<br />
of WFD compliant River Basin Management Plans. In: Pahl, C.,<br />
Schmidt, S., Jakeman, T. (Eds.), iEMSs 2004 International<br />
Congress: “Complexity and Integrated Resources Management”.<br />
International Environmental Modelling and Software Society,<br />
Osnabrück, Germany, June 2004.<br />
CAMASE, 1996. CAMASE was a Concerted Action for the Development<br />
and Testing of Quantitative Methods for research on<br />
Agricultural Systems and the Environment,<br />
http://www.bib.wau.nl/camase/.<br />
Da Silva, M.C., Barbosa, A.E., Rocha, J.S., Fortunato, A.B., 2002.<br />
State-of-the-art for surface water quality modelling. In: Refsgaard,<br />
J.C. (Ed.), State-of-the-Art Report on Quality Assurance in<br />
Modelling Related to River Basin Management. Chapter 8,<br />
Geological Survey of Denmark and Greenland, Copenhagen,<br />
http://www.harmoniqua.org.<br />
DHI, 2002. MIKE 11 User Guide. DHI Water & Environment,<br />
Hørsholm, Denmark.<br />
Forkel, C., 1996. Das numerische Modell – ein schmaler Grat zwischen<br />
vertrauenswürdigem Werkzeug und gefährlichem Spielzeug. Presented<br />
at the 26. IWASA, RWTH Aachen, 4–5 January 1996.<br />
GWP-TAC, 2000. Integrated Water Management, TEC Background<br />
Papers No. 4, Global Water Partnership, SE-105 25 Stockholm,<br />
Sweden, ISBN: 91-630-9229-8.<br />
Heinz, I., Eberle, S., 2002. State-of-the-art for socio-economic<br />
modelling. In: Refsgaard, J.C. (Ed.), State-of-the-Art Report on<br />
Quality Assurance in Modelling Related to River Basin Management.<br />
Chapter 10, Geological Survey of Denmark and Greenland,<br />
Copenhagen, http://www.harmoniqua.org.<br />
Henriksen, H.J., 2002a. Australian groundwater modelling guidelines.<br />
In: Refsgaard, J.C. (Ed.), State-of-the-Art Report on Quality<br />
Assurance in Modelling Related to River Basin Management.<br />
Chapter 13, Geological Survey of Denmark and Greenland,<br />
Copenhagen, http://www.harmoniqua.org.<br />
Henriksen, H.J., 2002b. Danish groundwater modelling guidelines. In:<br />
Refsgaard, J.C. (Ed.), State-of-the-Art Report on Quality Assurance<br />
in Modelling Related to River Basin Management. Chapter<br />
14, Geological Survey of Denmark and Greenland, Copenhagen,<br />
http://www.harmoniqua.org.<br />
Henriksen, H.J., Troldborg, L., Nyegaard, P., Sonnenborg, T.O.,<br />
Refsgaard, J.C., Madsen, B., 2003. Methodology for construction,<br />
calibration and validation of a national hydrological model for<br />
Denmark. Journal of Hydrology 280 (1–4), 52–71.<br />
Kassahun, A., Scholten, H., Zompanakis, G., Gavardinas, C., 2004.<br />
Support for model based water management with the HarmoniQuA<br />
toolbox. In: Pahl, C., Schmidt, S., Jakeman, T. (Eds.),<br />
iEMSs 2004 International Congress: “Complexity and Integrated<br />
Resources Management”. International Environmental Modelling<br />
and Software Society, Osnabrück, Germany, June 2004.<br />
Klemeš, V., 1986. Operational testing of hydrological simulation<br />
models. Hydrological Sciences Journal 31, 13–24.<br />
Merrick, N.P., Middlemis, H., Ross, J.B., 2002. Groundwater<br />
Modelling Guidelines for Australia – Recommended Procedures<br />
for Modelling Reviews. International Groundwater Conference.<br />
Balancing the Groundwater Budget. Northern Territory. Australia.<br />
12–17 May 2002.<br />
Metelka, T., Krejcik, J., 2002a. State-of-the-art for hydrodynamic. In:<br />
Refsgaard, J.C. (Ed.), State-of-the-Art Report on Quality Assurance<br />
in Modelling Related to River Basin Management. Chapter 6,<br />
Geological Survey of Denmark and Greenland, Copenhagen,<br />
http://www.harmoniqua.org.<br />
Metelka, T., Krejcik, J., 2002b. Quality assurance in Central and<br />
Eastern Europe. In: Refsgaard, J.C. (Ed.), State-of-the-Art Report<br />
on Quality Assurance in Modelling Related to River Basin<br />
Management. Chapter 15, Geological Survey of Denmark and<br />
Greenland, Copenhagen, http://www.harmoniqua.org.<br />
Middlemis, H., 2000. Murray-Darling Basin Commission. Groundwater<br />
Flow Modelling Guideline. Aquaterra Consulting Pty Ltd.<br />
South Perth. Western Australia. Project no. 125.<br />
NRC, 1990. Ground Water Models: Scientific and Regulatory<br />
Applications. National Research Council, National Academy<br />
Press, Washington, D.C.<br />
Old, G.H., Packman, J.C., Calver, A.N., 2002. State-of-the-art<br />
for biota (ecological) modelling. In: Refsgaard, J.C. (Ed.),<br />
State-of-the-Art Report on Quality Assurance in Modelling<br />
Related to River Basin Management. Chapter 9, Geological<br />
Survey of Denmark and Greenland, Copenhagen,<br />
http://www.harmoniqua.org.<br />
Packman, J.C., 2002. Quality Assurance in the UK. In: Refsgaard, J.C.<br />
(Ed.), State-of-the-Art Report on Quality Assurance in Modelling<br />
Related to River Basin Management. Chapter 17, Geological<br />
Survey of Denmark and Greenland, Copenhagen,<br />
http://www.harmoniqua.org.<br />
Pahl-Wostl, C., 2002. Towards sustainability in the water sector – the<br />
importance of human actors and processes of social learning.<br />
Aquatic Sciences 64, 394–411.<br />
Pascual, P., Stiber, N., Sunderland, E., 2003. Draft Guidance on the<br />
Development, Evaluation, and Application of Regulatory Environmental<br />
Models. Council for Regulatory Environmental Modeling.<br />
US EPA, Washington D.C.<br />
Perrin, C., Andreassian, V., Michel, C., 2002a. State-of-the-art for<br />
precipitation-runoff modelling. In: Refsgaard, J.C. (Ed.),<br />
State-of-the-Art Report on Quality Assurance in Modelling Related to<br />
River Basin Management. Chapter 5, Geological Survey of Denmark<br />
and Greenland, Copenhagen, http://www.harmoniqua.org.<br />
Perrin, C., Andreassian, V., Michel, C., 2002b. Quality assurance for<br />
precipitation-runoff modelling in France. In: Refsgaard, J.C. (Ed.),<br />
State-of-the-Art Report on Quality Assurance in Modelling<br />
Related to River Basin Management. Chapter 16, Geological<br />
Survey of Denmark and Greenland, Copenhagen,<br />
http://www.harmoniqua.org.<br />
Refsgaard, J.C. (Ed.), 2002. State-of-the-Art Report on Quality<br />
Assurance in Modelling Related to River Basin Management.<br />
Report from the EU research project HarmoniQuA,<br />
http://www.harmoniqua.org. 18 chapters, 182 pp. Geological Survey of<br />
Denmark and Greenland, Copenhagen.<br />
Refsgaard, J.C., Henriksen, H.J., 2002. State-of-the-art for Groundwater<br />
Modelling. In: Refsgaard, J.C. (Ed.), State-of-the-Art Report<br />
on Quality Assurance in Modelling Related to River Basin<br />
Management. Chapter 4, Geological Survey of Denmark and<br />
Greenland, Copenhagen, http://www.harmoniqua.org.<br />
Refsgaard, J.C., Henriksen, H.J., 2004. Modelling guidelines –<br />
terminology and guiding principles. Advances in Water Resources<br />
27, 71–82.<br />
Rumbaugh, J.O., Rumbaugh, D.B., 2001. Guide to Using Groundwater<br />
Vistas. Environmental Simulations, Inc, Virginia, USA.<br />
Rykiel, E.R., 1996. Testing ecological models: the meaning of<br />
validation. Ecological Modelling 90, 229–244.<br />
Scholten, H., Van der Tol, M.W.M., 1998. Quantitative validation of<br />
deterministic models: when is a model acceptable? In: Obaidat, M.S.,<br />
Davoli, F., DeMarinis, D. (Eds.), The Proceedings of the Summer<br />
Computer Simulation Conference. SCS, The Society for Computer<br />
Simulation International, San Diego, CA, USA, pp. 404–409.<br />
Scholten, H., Groot, S., 2002. Dutch guidelines. In: Refsgaard, J.C.<br />
(Ed.), State-of-the-Art Report on Quality Assurance in modelling<br />
related to river basin management. Chapter 12, Geological<br />
Survey of Denmark and Greenland, Copenhagen,<br />
http://www.harmoniqua.org.<br />
Scholten, H., Van Waveren, R.H., Groot, S., Van Geer, F.C., Wösten,<br />
J.H.M., Koeze, R.D., Noort, J.J., 2000. Good Modelling Practice<br />
in Water Management. Paper Presented on Hydroinformatics<br />
2000, Cedar Rapids, IA, USA.<br />
Van Asselt, M.B.A., Rotmans, J., 2002. Uncertainty in integrated<br />
assessment modelling – From positivism to pluralism. Climatic<br />
Change 54 (1–2), 75–105.<br />
Van Gils, J.A.G., Groot, S., 2002. Examples of good modelling<br />
practice in the Danube Basin. In: Refsgaard, J.C. (Ed.),<br />
State-of-the-Art Report on Quality Assurance in Modelling Related to<br />
River Basin Management. Chapter 18, Geological Survey of<br />
Denmark and Greenland, Copenhagen,<br />
http://www.harmoniqua.org.<br />
Van Waveren, R.H., Groot, S., Scholten, H., Van Geer, F.C., Wösten,<br />
J.H.M., Koeze, R.D., Noort, J.J., 2000. Good Modelling Practice<br />
Handbook, STOWA Report 99-05, Utrecht, RWS-RIZA, Lelystad,<br />
The Netherlands, http://waterland.net/riza/aquest/ (In Dutch).
[14]<br />
Refsgaard JC, Nilsson B, Brown J, Klauer B, Moore R, Bech T, Vurro M,<br />
Blind M, Castilla G, Tsanis I, Biza P (2005) Harmonised techniques and<br />
representative river basin data for assessment and use of uncertainty<br />
information in integrated water management (HarmoniRiB).<br />
Environmental Science and Policy, 8, 267-277.<br />
Reprinted from Environmental Science and Policy with permission from Elsevier
Environmental Science & Policy 8 (2005) 267–277<br />
www.elsevier.com/locate/envsci<br />
Harmonised techniques and representative river basin data for<br />
assessment and use of uncertainty information in<br />
integrated water management (HarmoniRiB)<br />
Jens Christian Refsgaard a, *, Bertel Nilsson a , James Brown b ,<br />
Bernd Klauer c , Roger Moore d , Thomas Bech e , Michele Vurro f , Michiel Blind g ,<br />
Guillermo Castilla h , Ioannis Tsanis i , Pavel Biza j<br />
a Geological Survey of Denmark and Greenland (GEUS), Department of Hydrology, Øster Voldgade, DK-1350 Copenhagen, Denmark<br />
b Universiteit van Amsterdam (UVA), Amsterdam, The Netherlands<br />
c Centre for Environmental Research (UFZ), Leipzig, Germany<br />
d Centre for Ecology and Hydrology (CEH), Wallingford, UK<br />
e DHI Water and Environment (DHI), Hørsholm, Denmark<br />
f Istituto di Ricerca Sulle Acque del CNR (IRSA), Bari, Italy<br />
g Institute of Inland Water Management and Waste Water Treatment (RIZA), Lelystad, The Netherlands<br />
h Universidad de Castilla – La Mancha (UCLM), Albacete, Spain<br />
i Technical University Crete (TUC), Chania, Greece<br />
j Povodi Moravi (PM), Brno, Czech Republic<br />
Abstract<br />
This paper describes progress on HarmoniRiB, a European Commission Framework 5 project. The HarmoniRiB project aims to support<br />
the implementation of the EU Water Framework Directive (WFD) by developing concepts and tools for handling uncertainty in data and<br />
modelling, and by designing, building and populating a database containing data and associated uncertainties for a number of representative<br />
basins. This river basin network aims at becoming a ‘virtual laboratory for modelling studies’, and it will be made available for the scientific<br />
community. The data may, e.g. be used for comparison and demonstration of methodologies and models relevant to the WFD.<br />
© 2005 Elsevier Ltd. All rights reserved.<br />
Keywords: Uncertainty; River basin management; Data; Models; River basin network; HarmoniRiB; Water Framework Directive<br />
1. Introduction<br />
1.1. Problems to be addressed<br />
The Water Framework Directive (WFD) provides a<br />
European policy basis at the river basin scale. The river basin<br />
management and planning process prescribed in the WFD is<br />
an adaptation of the Integrated Water Resources Management<br />
principles (GWP, 2000), involving all physical<br />
domains in water management, sectors of water use,<br />
socio-economics and stakeholder participation. As such,<br />
the WFD poses new challenges to water resources managers.<br />
The traditional physical domain specific and sectoral<br />
approaches need to be combined and extended to fulfil<br />
the WFD requirements. The preparation of the river basin<br />
management plans, prescribed in the WFD, is furthermore<br />
influenced by uncertainties on the underlying data and<br />
modelling results. In several sections of the WFD document,<br />
uncertainty is addressed (Blind and de Blois, 2003). In<br />
addition, most of the WFD guidance documents, being more<br />
specific than the WFD document itself, explicitly emphasise<br />
that uncertainty analyses should be performed. However, in<br />
spite of strong recommendations to consider uncertainty<br />
aspects the guidance documents do not include recommendations<br />
on how to do so.<br />
* Corresponding author. Tel.: +45 38 14 27 76; fax: +45 38 14 20 50.<br />
E-mail address: jcr@geus.dk (J.C. Refsgaard).<br />
1462-9011/$ – see front matter © 2005 Elsevier Ltd. All rights reserved.<br />
doi:10.1016/j.envsci.2005.02.001
J.C. Refsgaard et al. / Environmental Science & Policy 8 (2005) 267–277<br />
Therefore, there is a clear and urgent need for developing<br />
new concepts, methodologies and tools that can be used to<br />
assist in implementing the WFD. In order to support such<br />
research and development, it is necessary to have a network<br />
of representative river basins with datasets suitable for this<br />
purpose. This implies that the datasets, in addition to<br />
covering the diversity in terms of ecological regimes and<br />
socio-economic conditions found across Europe, must have<br />
built-in information on the uncertainties in the data.<br />
1.2. Objectives<br />
The paper presents the status of, and preliminary results from, an<br />
ongoing research project, HarmoniRiB, that is supported<br />
under the EU’s 5th Framework Programme. The overall goal of<br />
HarmoniRiB is to develop methodologies for quantifying<br />
uncertainty and its propagation from the raw data to concise<br />
management information. The four specific project objectives<br />
are:<br />
To establish a practical methodology and a set of tools for<br />
assessing and describing uncertainty originating from<br />
data and models used in decision making processes for the<br />
production of integrated water management plans. It will<br />
include a methodology for integrating uncertainties on<br />
basic data and models and socio-economic uncertainties<br />
into a decision support concept applicable for implementation<br />
of the WFD.<br />
To provide a conceptual model for data management that<br />
can handle uncertain data and implement it for a network<br />
of representative river basins.<br />
To provide well documented datasets, suitable for<br />
studying the influence of uncertainty on management<br />
decisions for a network of representative river basins and<br />
to provide examples of their use in the development of<br />
integrated water management plans.<br />
To disseminate intermediate and final results among<br />
researchers and end-users across Europe and obtain and<br />
incorporate feedback on the methodologies, tools and the<br />
datasets.<br />
2. Uncertainty assessments<br />
2.1. Definitions and taxonomy<br />
Uncertainty and associated terms such as error, risk and<br />
ignorance are defined and interpreted differently by different<br />
authors (see Walker et al., 2003 for a review). The different<br />
definitions reflect, among other factors, the different<br />
scientific disciplines and philosophies of the authors<br />
involved, as well as the intended audience. In addition they<br />
vary depending on their purpose. Some are rather generic,<br />
such as Funtowicz and Ravetz (1990), while others apply<br />
more specifically to model based water management, such as<br />
Beck (1987). The terminology used in HarmoniRiB has<br />
emerged after discussions between social scientists and<br />
natural scientists specifically aiming at applications in<br />
model based water management (Klauer and Brown, 2003).<br />
By doing so we adopt a subjective interpretation of<br />
uncertainty in which the degree of confidence that a decision<br />
maker has about possible outcomes and/or probabilities of<br />
these outcomes is the central focus. Thus, according to our<br />
definition a person is uncertain if s/he lacks confidence<br />
about the specific outcomes of an event. Reasons for this lack<br />
of confidence might include a judgement that the information<br />
is incomplete, blurred, inaccurate, imprecise or<br />
potentially false. Similarly, a person is certain if s/he is<br />
confident about the outcome of an event. It is possible that a<br />
person feels certain but has misjudged the situation (i.e. s/he<br />
is wrong).<br />
There are many different (decision) situations, with<br />
different possibilities for characterising what we know or<br />
do not know and what we are certain or uncertain of. A first<br />
distinction is between ignorance as a lack of awareness<br />
about imperfect knowledge and uncertainty as a state of<br />
confidence about knowledge (which includes the act of<br />
ignoring). Our state of confidence may range from being<br />
certain to admitting that we know nothing (of use), and<br />
uncertainty may be expressed at a number of levels in<br />
between. Regardless of our confidence in what we know,<br />
ignorance implies that we can still be wrong (‘in error’). In<br />
this respect Brown (2004) has defined a taxonomy of<br />
imperfect knowledge illustrated in Fig. 1.<br />
In evaluating uncertainty, it is useful to distinguish<br />
between uncertainty that can be quantified, e.g. by<br />
probabilities and uncertainty that can only be qualitatively<br />
described, e.g. by scenarios. If one throws a balanced die, the<br />
precise outcome is uncertain, but the ‘attractor’ of a perfect<br />
die is certain: we know precisely the probability for each of<br />
the 6 outcomes, each being 1/6. This is what we mean by<br />
‘uncertainty in terms of probability’. However, the estimates<br />
for the probability of each outcome can also be uncertain. If<br />
a model study says: ‘‘there is a 30% probability that this area<br />
will flood two times in the next year’’, there is not only<br />
‘uncertainty in terms of probability’ but also uncertainty<br />
regarding whether the estimate of 30% is a reliable estimate.<br />
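The difference between a known probability and an estimated one can be illustrated with a small simulation. The sketch below is an illustration only and is not taken from the HarmoniRiB project: it throws a simulated fair die and shows that probabilities estimated from a limited number of throws are themselves uncertain. The function name and the sample sizes are assumptions made for the example.

```python
import random


def estimate_probabilities(n_throws, seed=0):
    """Estimate the probability of each face from n_throws simulated throws."""
    rng = random.Random(seed)
    counts = [0] * 6
    for _ in range(n_throws):
        counts[rng.randint(1, 6) - 1] += 1
    return [count / n_throws for count in counts]


# The true probability of every face is 1/6; the estimates only approach it
# as the number of throws grows.
for n in (30, 3000, 300000):
    freqs = estimate_probabilities(n)
    worst = max(abs(f - 1 / 6) for f in freqs)
    print(f"{n:>6} throws: largest deviation from 1/6 = {worst:.4f}")
```

A flood probability derived from a short record or an imperfect model is in the same position as the estimate from 30 throws: the stated figure is only an estimate of the underlying probability.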
Secondly, it is useful to distinguish between bounded<br />
uncertainty, where all possible outcomes have been<br />
identified (they can be distinct or indistinct) and unbounded<br />
uncertainty, where the known outcomes are considered<br />
incomplete. Since quantitative probabilities require ‘all<br />
possible outcomes’ of an uncertain event and each of their<br />
individual probabilities to be known, they can only be<br />
defined for ‘bounded uncertainties’. If probabilities cannot<br />
be quantified in any undisputed way, we often can still<br />
qualify the available body of evidence for the possibility of<br />
various outcomes.<br />
The bounded uncertainty where all probabilities are<br />
deemed known (Fig. 1) is often denoted ‘statistical<br />
uncertainty’ (e.g. Walker et al., 2003). This is the case<br />
traditionally addressed in model based uncertainty assessment.<br />
It is important to note that this case constitutes one of<br />
many decision situations outlined in Fig. 1, and in other<br />
situations the main uncertainty in a decision situation cannot<br />
be characterised statistically.<br />
Fig. 1. Taxonomy of imperfect knowledge resulting in different uncertainty situations (Brown, 2004).<br />
2.2. Framework for describing data uncertainty<br />
By considering space–time variability and data type,<br />
Brown et al. (2005) have distinguished 13<br />
categories of uncertain data (Table 1).<br />
By considering measurement scale, it becomes possible<br />
to quickly limit the relevant uncertainty models for a certain<br />
variable. On a discrete measurement scale, for example, it is<br />
only relevant to consider discrete probability distribution<br />
functions, whereas continuous density functions are required<br />
for continuous numerical data. In addition, the use of space<br />
and time variability determines the need for autocorrelation<br />
functions alongside a probability density function (pdf).<br />
Brown et al. (2005) explain that this classification of data by<br />
measurement scale and space–time variability is useful for<br />
uncertainty assessment because: (1) it reduces the amount of<br />
information requested from the user in populating a<br />
database; (2) it reduces the amount of information stored in a<br />
database (model parameter values); (3) it ensures a close<br />
relationship between the structure of the probability model<br />
and the techniques used to estimate its parameters; and (4) it<br />
encourages planning of measurement campaigns for<br />
collecting information on uncertainty.<br />
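The way the category code narrows the choice of uncertainty model can be sketched as a small lookup. The codes follow Table 1, but the class, the function names and the wording of the returned descriptions are hypothetical. In particular, treating narrative data as a purely qualitative description is an assumption of this sketch and is not stated in the text.

```python
from dataclasses import dataclass

# Row codes of Table 1 as (varies in time, varies in space).
SPACE_TIME = {
    "A": (False, False),
    "B": (True, False),
    "C": (False, True),
    "D": (True, True),
}
# Column codes of Table 1; narrative data form the single category "4".
SCALES = {"1": "continuous numerical", "2": "discrete numerical", "3": "categorical"}


@dataclass(frozen=True)
class UncertaintyCategory:
    code: str
    scale: str
    varies_in_time: bool
    varies_in_space: bool

    @property
    def distribution_family(self) -> str:
        if self.scale == "continuous numerical":
            return "continuous probability density function"
        if self.scale == "narrative":
            return "qualitative description"  # assumption of this sketch
        return "discrete probability distribution"

    @property
    def needs_autocorrelation(self) -> bool:
        varies = self.varies_in_time or self.varies_in_space
        return self.scale != "narrative" and varies


def classify(code: str) -> UncertaintyCategory:
    """Translate a Table 1 code such as 'B1' into its implications."""
    if code == "4":
        # Narrative data are not subdivided by space-time variability.
        return UncertaintyCategory("4", "narrative", False, False)
    in_time, in_space = SPACE_TIME[code[0]]
    return UncertaintyCategory(code, SCALES[code[1:]], in_time, in_space)


ALL_CODES = [row + column for row in SPACE_TIME for column in SCALES] + ["4"]

for code in ("A2", "B1", "D3", "4"):
    category = classify(code)
    print(code, "|", category.distribution_family,
          "| autocorrelation needed:", category.needs_autocorrelation)
```

Twelve combinations of row and column plus the single narrative category give the 13 categories referred to above.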
Each data category is associated with a range of<br />
uncertainty models, for which more specific pdfs may be<br />
developed with different simplifying assumptions (e.g.<br />
Gaussian; second-order stationarity; degree of temporal and<br />
spatial autocorrelation). The advantages of allowing a range<br />
of possible models for each data category are threefold.<br />
First, there is a need to explicitly define an appropriate set of<br />
statistical assumptions for a particular dataset. Secondly, a<br />
range of possible assumptions can be defined a priori, and<br />
hence the significance of particular assumptions can be<br />
demonstrated with examples. Finally, the trade-off between<br />
model complexity, identifiability and reliability can be<br />
reviewed over time and balanced against the (changing)<br />
practical constraints on assessing uncertainty. For example,<br />
levels of risk and expertise can be associated with the<br />
simplifying assumptions allowed in a pdf, with default<br />
models for low-risk applications involving users with<br />
limited expertise. Minimum requirements can also be<br />
identified for specific datasets, such as data on toxic<br />
chemicals.<br />
Table 1<br />
The subdivision and coding of uncertainty categories along the ‘axes’ of space–time variability and measurement scale (Brown et al., 2005)<br />
Rows: space–time variability. Columns: measurement scale.<br />
Space–time variability | Continuous numerical | Discrete numerical | Categorical | Narrative<br />
Constant in space and time | A1 | A2 | A3 | 4<br />
Varies in time, not in space | B1 | B2 | B3 | 4<br />
Varies in space, not in time | C1 | C2 | C3 | 4<br />
Varies in time and space | D1 | D2 | D3 | 4<br />
Narrative data form a single category (4) that spans all four rows.<br />
Categorical data (3) differ from numerical data (1, 2) and<br />
narrative data (4) in three important ways. First, categorical data<br />
cannot be manipulated statistically (i.e. by computing a<br />
mean and variance), because the categories are not measured<br />
on a numerical scale. Secondly, individual values may be<br />
assigned to unique classes (one value to one class), where<br />
pdfs are based on the measured frequency, or perceived<br />
probability (Bayes rule), that a value occurs in a particular<br />
‘hard’ class or they can be partially assigned to multiple<br />
classes (fuzzy), where probabilities reflect doubt about the<br />
proportional membership of a value to a particular class<br />
(Heuvelink and Burrough, 1993). For the purposes of an<br />
uncertainty analysis, this distinction is important, because<br />
accuracy assessments are more complicated for fuzzy<br />
descriptions of reality. An important issue often overlooked<br />
with categorical data (e.g. the confusion matrix in landcover<br />
classification) is the problem of correlation in space<br />
and time or between datasets, since traditional statistical<br />
techniques do not apply to categorical data.<br />
Reviews with results on data uncertainty reported in the<br />
literature have been compiled into a guideline report for<br />
assessing uncertainty in various types of data originating<br />
from meteorology, soil physics and geochemistry, hydrogeology,<br />
land cover, topography, discharge, surface water<br />
quality, ecology and socio-economics (Van Loon and<br />
Refsgaard, 2005).<br />
2.3. Software tool to support uncertainty assessment in<br />
data and models<br />
The components of the HarmoniRiB uncertainty software<br />
are shown in Fig. 2.<br />
There are four software components in the HarmoniRiB<br />
design, namely: (1) a module for assessing uncertainties in<br />
data and storing this information within a database (assess<br />
data uncertainty; the database design is described briefly<br />
below); (2) a module for assessing uncertainties in<br />
models (assess model uncertainty); (3) a module for<br />
sampling from a distribution of uncertain inputs and<br />
(possibly) model parameters and running the model<br />
for each realisation of the uncertain inputs and parameters<br />
(uncertainty propagation); and (4) a module for synthesising and<br />
presenting the uncertainty results (present uncertainty).<br />
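The uncertainty propagation step (3) can be sketched as a plain Monte Carlo loop. This is a minimal illustration under stated assumptions and is not the HarmoniRiB software: the toy model, the chosen distributions and all numerical values are invented for the example.

```python
import random
import statistics


def toy_model(rainfall_mm, runoff_coefficient):
    """Hypothetical model: runoff as a fixed fraction of rainfall."""
    return rainfall_mm * runoff_coefficient


def propagate(n_realisations, seed=1):
    """Sample the uncertain input and parameter, run the model for each sample."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_realisations):
        # Assumed uncertainty models: Gaussian input, uniform parameter.
        rainfall = max(0.0, rng.gauss(50.0, 5.0))
        coefficient = rng.uniform(0.3, 0.5)
        outputs.append(toy_model(rainfall, coefficient))
    return outputs


outputs = sorted(propagate(10_000))
mean = statistics.fmean(outputs)
p05 = outputs[int(0.05 * len(outputs))]
p95 = outputs[int(0.95 * len(outputs))]
print(f"mean runoff = {mean:.1f} mm, 90% interval = [{p05:.1f}, {p95:.1f}] mm")
```

The spread of the outputs, summarised here by a mean and a 90% interval, is the kind of result that the fourth module would synthesise and present.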
The Data Uncertainty Engine (DUE) is illustrated in<br />
Fig. 3. It separates the analysis of data uncertainties into four<br />
stages, whereby objects are first imported into the software<br />
(1), the sources of uncertainty are then identified (2)<br />
(important for a structured analysis) and are translated into a<br />
simple model (3) (e.g. probability model) from which<br />
‘alternative realities’ can be generated. These ‘alternative<br />
realities’ are used in an uncertainty propagation analysis to<br />
establish the impacts of data uncertainty on other operations,<br />
such as modelling. Finally, it is necessary to reflect on the<br />
quality of an uncertainty analysis (4), as such analyses are fraught<br />
with assumptions and difficulties and can be misleading<br />
without quality control. The information required to<br />
generate ‘alternative realities’ of one or more environmental<br />
attributes is stored in the project database (see below).<br />
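The generation of ‘alternative realities’ and their use in uncertainty propagation can be sketched as a simple Monte Carlo loop. The sketch below is only an illustration under stated assumptions: the normal probability model, the rainfall figures and the function `toy_runoff_model` are hypothetical stand-ins and are not part of the DUE or of any HarmoniRiB software.

```python
import random
import statistics


def sample_realisations(mean, std, n, seed=0):
    """Draw n 'alternative realities' of an uncertain input.

    Assumption: the uncertainty is described by a normal pdf.
    """
    rng = random.Random(seed)
    return [rng.gauss(mean, std) for _ in range(n)]


def toy_runoff_model(rainfall_mm, runoff_coefficient=0.4):
    """Hypothetical stand-in for a hydrological model: runoff = c * rainfall."""
    return runoff_coefficient * max(rainfall_mm, 0.0)


def propagate(model, realisations):
    """Run the model once per realisation and summarise the output spread."""
    outputs = [model(x) for x in realisations]
    return statistics.mean(outputs), statistics.stdev(outputs)


# Hypothetical uncertain input: rainfall of 50 mm with a standard deviation of 5 mm.
rainfall = sample_realisations(mean=50.0, std=5.0, n=2000)
mean_q, std_q = propagate(toy_runoff_model, rainfall)
print(f"simulated runoff: mean {mean_q:.1f} mm, standard deviation {std_q:.1f} mm")
```

For a real application the toy function would be replaced by a call to the hydrological model, and spatial or temporal correlation between inputs would have to be respected when sampling.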
The methodology proposed for assessing model uncertainty<br />
is outlined in Refsgaard et al. (submitted for<br />
publication).<br />
2.4. Uncertainty in socio-economics<br />
Often uncertainty assessments are confined to uncertainties<br />
in data and models originating from natural science. We<br />
also consider uncertainty in socio-economic aspects by<br />
developing concepts based on the management of water<br />
resources and river basins (e.g. Cech, 2003). This work takes into<br />
account literature on evaluation, e.g. cost-benefit analysis<br />
(Hanley and Spash, 1993; Bergstrom et al., 2001), multicriteria<br />
analysis (Roy, 1996; Munier, 2004) and decision<br />
making under uncertainty (Jungermann et al., 1998). The<br />
innovative aspects of our work lie in the further development<br />
of these ideas to support the implementation of the WFD and<br />
particularly elaborating the role of uncertainty in the process<br />
of creating and selecting management measures.<br />
Fig. 2. HarmoniRiB software components.<br />
J.C. Refsgaard et al. / Environmental Science & Policy 8 (2005) 267–277<br />
Fig. 3. Screen shots from the HarmoniRiB data uncertainty assessment tool.<br />
The uncertainty in socio-economic data of official<br />
statistics (Eurostat, Statistical bureaus of German Länder<br />
and the FRG) has been surveyed. We found that the efforts to<br />
produce accurate economic data are enormous but the<br />
knowledge and awareness of the remaining uncertainties is<br />
generally low. Despite the lack of knowledge and awareness<br />
about uncertainty in socio-economic data and their sources<br />
we judge the consideration of these uncertainties in river<br />
basin management as highly relevant. On the basis of our<br />
investigations and our experience, we expect that it will be<br />
difficult to reach a meaningful quantification of many of<br />
these uncertainties. Methods for the systematic collection of<br />
qualitative information on uncertainties as well as strategies<br />
to deal with uncertainties that are not necessarily based on<br />
quantification are therefore needed.<br />
3. Databases for accommodating uncertain data<br />
3.1. Functionality with respect to data uncertainty<br />
We have designed and developed software for a database<br />
that can handle data and data uncertainty. The novelty of<br />
this database is that it meets the following requirements:<br />
It can store time-series data.<br />
It can store spatial data, both raster and vector, as well as<br />
time-series of spatial data.<br />
It can store information about uncertainty in these data.<br />
The uncertainty characteristics are described according to<br />
the uncertainty categories listed in Table 1. This implies that<br />
for the continuous data types the uncertainty is described by<br />
use of a probability density function (pdf) and a correlation<br />
matrix (or correlation function) for normally distributed<br />
data. For categorical data (such as land cover or soil type), a<br />
non-parametric distribution is typically required, and may be<br />
stored alongside transition probabilities for describing statistical<br />
dependence. The HarmoniRiB database design therefore<br />
allows the user to associate a probability model with<br />
each uncertain data item. In future, the database will be<br />
extended to allow numerical bounds (e.g. confidence intervals)<br />
and scenarios when probabilities cannot be defined.<br />
Information on the sources of uncertainty and the quality of<br />
an uncertainty model is also stored in the database.<br />
An initial list of pdfs and autocorrelation functions is<br />
included in a Probability Distribution Function Dictionary<br />
and an Autocorrelation Function Dictionary of the database.<br />
In addition the software will allow a user to add new<br />
functions when required. In practice, it may not be possible<br />
to calculate the pdf parameters for every attribute value in<br />
the database individually. It may only be feasible to calculate<br />
them at the level of the attribute with which the value is<br />
associated (i.e. an assumption of stationarity in space or<br />
time). In all cases, an uncertainty model is referenced by an<br />
Uncertainty Model ID (UMID), which acts as a pointer to an<br />
uncertainty model that applies to a specific location in space<br />
or time and to the information on statistical dependence<br />
between locations and attributes.
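The role of the UMID as a pointer can be illustrated with a small lookup structure. This is a minimal sketch and not the actual HarmoniRiB table design: the class `UncertaintyModel`, the dictionary contents, the parameter names and the attribute-level fallback (standing in for the stationarity assumption) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class UncertaintyModel:
    pdf: str                                # name held in a pdf dictionary
    parameters: Dict[str, float]            # pdf parameters (names are illustrative)
    autocorrelation: Optional[str] = None   # name held in an autocorrelation dictionary


# Hypothetical initial dictionaries; a user could add new functions to them.
PDF_DICTIONARY = {"normal", "lognormal", "uniform"}
AUTOCORRELATION_DICTIONARY = {"exponential", "spherical"}

uncertainty_models: Dict[int, UncertaintyModel] = {}
attribute_umid = {}   # attribute -> UMID (model assumed stationary in space and time)
value_umid = {}       # (object, attribute, occasion) -> UMID for individual values


def register_model(umid, model):
    """Store a model under its UMID after checking it against the dictionaries."""
    if model.pdf not in PDF_DICTIONARY:
        raise ValueError(f"unknown pdf: {model.pdf}")
    if (model.autocorrelation is not None
            and model.autocorrelation not in AUTOCORRELATION_DICTIONARY):
        raise ValueError(f"unknown autocorrelation function: {model.autocorrelation}")
    uncertainty_models[umid] = model


def resolve_model(obj, attribute, occasion):
    """Prefer a value-specific model; otherwise fall back to the attribute level."""
    umid = value_umid.get((obj, attribute, occasion))
    if umid is None:
        umid = attribute_umid.get(attribute)
    return uncertainty_models.get(umid) if umid is not None else None


register_model(1, UncertaintyModel("normal", {"std": 0.5}, "exponential"))
register_model(2, UncertaintyModel("uniform", {"half_width": 2.0}))
attribute_umid["mean daily river flow"] = 1
value_umid[("Thames at Wallingford", "mean daily river flow", "2002-06-29")] = 2
```

With these entries, a value recorded on 29 June 2002 resolves to model 2, while any other river flow value falls back to the attribute-level model 1.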
3.2. General database functionality<br />
The overall aim of the HarmoniRiB database system is to<br />
enable the HarmoniRiB Data Centre to receive, quality<br />
control, store and make available the representative basin<br />
data being assembled by the project. Ideally, it should be<br />
able to handle any data required for developing WFD-compliant<br />
River Basin Management Plans. This includes<br />
data for underlying modelling studies, and thus exceeds the<br />
WFD needs for reporting or river basin characterisations.<br />
The data will cover a wide range of water related topics but<br />
will mainly take the form of site descriptions and time series<br />
records. They will also include spatial data describing site<br />
locations, networks and variables such as land use or<br />
elevation. The proposed HarmoniRiB database design for<br />
holding these data is generic and is based on the WIS Cube<br />
(Moore, 1997). The major enhancements are not only the<br />
inclusion of uncertainty but also the seamless linking of<br />
metadata to data and a new underlying table design.<br />
At the user level, a HarmoniRiB database perceives the<br />
world as being composed of objects. These are any objects<br />
whose description and history the user wishes to record. The<br />
types or classes of object are decided by the user. Examples<br />
of object classes relevant to the WFD are sampling points,<br />
wells, reservoirs and rivers.<br />
The descriptions of objects and the events observed at<br />
them are recorded in terms of attribute values. Attributes,<br />
like object classes, are decided and defined by the user, the<br />
definitions being held in a dictionary. A wide range of spatial<br />
and non-spatial data types are supported, allowing the<br />
system to record most known or foreseeable types of<br />
attribute information required for the implementation of the<br />
WFD. Examples of attributes are object identifiers (names,<br />
reference codes, serial numbers, etc.), position, mean daily<br />
river flow, concentration (of e.g. nitrate), soil type and<br />
hydraulic conductivity.<br />
At the conceptual level, there is no differentiation<br />
between spatial and non-spatial attributes. They are all<br />
stored within the same logical framework.<br />
One way of visualising the manner in which data are<br />
stored in a HarmoniRiB database is to imagine a large cube,<br />
made up of individual cells as shown in Fig. 4. The three<br />
axes of this cube represent objects (WHERE observations<br />
were made), attributes (which record WHAT the observation<br />
was a measure of) and occasions (WHEN the observations<br />
were made). Thus, each cell in the cube records the value of<br />
an attribute at a particular object for a particular point in<br />
time. For example, one cell might record the concentration<br />
of calcium on 29 June 2002 at 10:20 (GMT) in the river<br />
Thames at Wallingford.<br />
The design regards all attribute values as potentially<br />
changeable over time, thus enabling it to handle time-series<br />
data such as river flow. This facility applies to spatial<br />
attributes as well as conventional time series making it<br />
possible to track an object’s movement. There is no<br />
constraint on the number of objects, attributes or occasions<br />
which can be recorded, other than that imposed by the<br />
physical limits of the hardware. The Cube is otherwise<br />
unlimited in all directions.<br />
Fig. 4. The Cube as a way of visualising how time series data are stored<br />
(Tindal et al., 2004).<br />
The cells in the cube hold the users’ data. Each cell<br />
contains a single attribute value. A cell can also contain<br />
some or all of the following information associated with the<br />
value:<br />
A qualifier for the value. A qualifier is an item of<br />
information which users may enter in order to amplify the<br />
meaning of an attribute value. For example, qualifiers may<br />
be useful in:<br />
- Bird or bacteriological count attributes where the value<br />
may take the form of, say, ‘more than 10,000’. In this<br />
case, the value would be entered as 10,000, and the<br />
qualifier as ‘>’.<br />
- Chemical concentration attributes, where the actual<br />
concentration is unknown, but it is possible to say that it<br />
is less than a certain value, where the value represents<br />
the limit of detection of the analysis method. The value<br />
would be entered as the limit of detection, for example<br />
0.001, and the qualifier as ‘<’.<br />
A method of derivation identifier. The method code is a<br />
user defined code identifying the source from which the<br />
value was obtained or the method by which it was derived.<br />
This information can be used, for example, by future users<br />
of the value, to determine its reliability.<br />
A measure of the value’s uncertainty in the form of a<br />
reference to an uncertainty model stored elsewhere in the<br />
database. This part of the requirement represents the<br />
major area of innovation and is likely to evolve as the<br />
project progresses.<br />
Dataset ID. Every value in the database has a pointer<br />
connecting the value to the dataset of which it is a<br />
member. The definition of what constitutes a dataset is up<br />
to the user. The only mandatory part of its definition is that<br />
the data values that make up a dataset must be owned by<br />
the same person or organisation. This condition is<br />
necessary to facilitate access control which will relate<br />
to ‘owned’ blocks of data.
Uncertainty Model ID. Each value contains a reference to<br />
an uncertainty model, which describes the range of<br />
possible values that an attribute might take at a given<br />
location.<br />
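The Cube and the contents of a cell can be sketched as a mapping from (object, attribute, occasion) to a small record. This is an illustration only: the `Cell` class, the field names, the method code "LAB-01" and the ID numbers are hypothetical, and the physical HarmoniRiB table design is not reproduced here.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Cell:
    value: float
    qualifier: Optional[str] = None    # e.g. "<" (below detection limit) or ">"
    method: Optional[str] = None       # user-defined method of derivation code
    dataset_id: Optional[int] = None   # pointer to the owning dataset
    umid: Optional[int] = None         # pointer to an uncertainty model


# The Cube: one cell per (object, attribute, occasion), i.e. WHERE, WHAT, WHEN.
cube = {}


def store(obj, attribute, occasion, cell):
    cube[(obj, attribute, occasion)] = cell


def describe(cell):
    """Render a value together with its qualifier, e.g. '<0.001'."""
    return f"{cell.qualifier or ''}{cell.value}"


def time_series(obj, attribute):
    """Slice the Cube along the occasion axis for one object and attribute."""
    return sorted(
        (when, cell.value)
        for (o, a, when), cell in cube.items()
        if o == obj and a == attribute
    )


# Hypothetical entries: a concentration below the detection limit and a count above 10,000.
store("Thames at Wallingford", "nitrate concentration",
      datetime(2002, 6, 29, 10, 20),
      Cell(0.001, qualifier="<", method="LAB-01", dataset_id=7, umid=3))
store("Thames at Wallingford", "bacteriological count",
      datetime(2002, 6, 29, 10, 20),
      Cell(10000, qualifier=">", dataset_id=7))
```

Keeping the qualifier separate from the numeric value means the value can still be used in calculations, while the censoring information is not lost.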
At the physical level, the data will be stored in a set of<br />
tables in a relational database such as Oracle. These will be<br />
held in a single account managed by the database administrator.<br />
Approved applications such as the data load facility<br />
will have direct access to this account and will be able to<br />
select and update data. Users and user written applications<br />
will be given read only access to the database via their own<br />
accounts.<br />
The database software is developed for application on an<br />
ArcSDE/ArcGIS platform using ESRI technology.<br />
4. River basin network and data<br />
Many networks of river basin data have been established<br />
for research purposes during the last couple of decades. A<br />
review of the characteristics of existing networks with<br />
respect to type of data, geographical coverage, data<br />
accessibility and data use by third parties is provided by<br />
Passarella and Vurro (2003). Examples of existing international<br />
networks are Flow Regimes from International<br />
Experimental and Network Data (FRIEND); Global Runoff<br />
Data Centre (GRDC); Hydrology for the Environment, Life<br />
and Policy (HELP); World Hydrological Cycle Observing<br />
System (WHYCOS); European River and Catchment<br />
Database Pilot Project (ERICA); Inventory of the Catchments<br />
for Research in Europe (ICARE) metadatabase and<br />
the Experimental Representative Basins (ERB) network and<br />
GLOWA.<br />
In addition to these international networks, many national<br />
databases containing data from national networks of river<br />
basins exist, e.g. Lowland Catchment Research (LOCAR);<br />
Data Storage for the Rijkswaterstaat (DONAR) and British<br />
Oceanographic Data Centre (BODC).<br />
Some of the existing networks provide data for<br />
operational purposes, while most of them have been<br />
established for research purposes. Many of these networks<br />
have existed for long periods and have served (and still serve)<br />
important purposes. However, seen from a Water Framework<br />
Directive perspective, most of them have the key<br />
deficiency that they focus on only some aspects (domains) of<br />
data required for water management under the WFD, and most<br />
typically they do not contain data on ecological and socio-economic<br />
aspects. Even comprehensive national databases<br />
such as LOCAR and DONAR do not contain<br />
much data on groundwater, land use and socio-economics.<br />
Among the international networks HELP has the broadest<br />
scope with a focus on socio-economic aspects. HELP,<br />
however, does not include groundwater or coastal water<br />
data. Furthermore, HELP so far consists of only relatively few<br />
river basins worldwide and does not have good coverage in<br />
Europe.<br />
Fig. 5. Location of the HarmoniRiB network of representative river basins.<br />
Thus, none of the existing river basin networks can<br />
provide suitable datasets for supporting research on<br />
integrated water management of direct relevance for<br />
implementation of the WFD. In addition, none of the<br />
existing networks comprises any quantifiable information on<br />
data uncertainty. Consequently, we conclude that there is a<br />
clear need to supplement the existing networks with a<br />
network of representative river basins whose principal<br />
aim is to provide data supporting research in integrated<br />
water resources management as required by the WFD. The<br />
HarmoniRiB river basin network is meant for this purpose.<br />
The HarmoniRiB network of representative river basins<br />
comprises eight basins; see Fig. 5 for locations and Table 2 for<br />
characteristic features. These basins have been selected to<br />
ensure a good coverage across Europe in terms of ecoregions,<br />
types of water problems, socio-economic conflicts<br />
and amount and quality of existing data. In addition, two of<br />
the river basins (Odense and Jucar) are also included in the<br />
Pilot River Basin Network, where the EC guidance<br />
documents have been tested. The aim of HarmoniRiB is,<br />
through interaction with the respective river basin organisations<br />
and data owners, to provide well documented data for<br />
research purposes, suitable for studying the influence of<br />
uncertainty on management decisions. The data will be<br />
publicly accessible for all research purposes. Thus, scientists<br />
may use the data to, e.g. assess the appropriateness of<br />
models and other tools in relation to the WFD.<br />
For each of the eight river basins a comprehensive<br />
amount of data is presently being collected and uploaded to<br />
the HarmoniRiB database. The data basically include all<br />
data that are required to carry out analysis for the WFD<br />
implementation (Blind and de Blois, 2003). Most of the data<br />
are organised in seven datasets, one for each of the six<br />
domains: climate, rivers, lakes, groundwater, transitional<br />
waters, and coastal waters, and one for spatial data, river<br />
basin characteristics and socio-economic data. Specific lists<br />
of data have been prepared by matching the data<br />
requirements given in the guidance documents on ‘Monitoring’<br />
(EC, 2003b) and ‘Analysis of pressures and impacts’<br />
(EC, 2003a), with the data available in the respective river<br />
basins (Rasmussen, 2003).<br />
Once collected and reformatted, the data are<br />
uploaded to the HarmoniRiB Data Centre. Subsequently,<br />
uncertainty will be assessed and added to the data following<br />
the framework outlined above.<br />
Table 2<br />
Key characteristics of the HarmoniRiB network of representative river basins<br />
Country, river basin | Area (km²) | Population density (person/km²) | GNP (Euro/person/year) | Dominant land use | Main water uses | Main conflicting interest<br />
CZ, Svratka | 3998 | 142 | 5600 | Agriculture, forest | Drinking water, electrical power, recreation, nature | Flood protection, minimum discharges, water quality<br />
DE, Weisse Elster | 5325 | 278 | 15000 | Agriculture | Drinking water, industry | Point and non-point sources; wastewater and contaminated sites; strong economic and social changes<br />
DK, Odense | 1090 | 135 | 25000 | Agriculture | Public water supply, recreation, nature | Agricultural contamination; groundwater abstraction depletes stream flow and wetlands<br />
ES, Jucar | 21328 | 28 | 9900 | Agriculture | Irrigation, hydroelectric, touristic supply, industry | Farming use; hydroelectrical use; touristic water demand<br />
GR, Geropotamou | 600 | 66 | 10000 | Agriculture | Irrigation, touristic | Water shortage, water quality, oversized dam, salt intrusion, difficulties in sharing water among municipalities<br />
IT, Candelaro | 1980 | 230 | 10277 | Agriculture | Irrigation, industry | Water shortage; rainfall rates decrease; intensive horticultural farming<br />
NL + DE, Vecht | 3780 (1980 in NL) | 311 | 19000 | Industry, agriculture, habitation | Agriculture, drinking water, receiving water, recreation | Agriculture, water quality, ecology, flooding - room for water retention<br />
UK, Thames | 12917 | 929 | 30000 | Urban, agriculture | Public water supply, ecosystem, recreation | Water supply vs. ecology<br />
5. Case studies<br />
The methodologies will be<br />
tested through one case study for each of the eight river<br />
basins. The focus in the case studies will be assessment of<br />
uncertainties related to various aspects of the decision<br />
process related to evaluating potential measures for<br />
achieving the WFD objective of good ecological status.<br />
The following aspects of uncertainty will be considered:<br />
Uncertainty related to framing of the decision making<br />
process. This uncertainty will typically be described in<br />
qualitative terms.<br />
Uncertainty related to prediction of effects of a given<br />
measure, i.e. what is the impact of a given management<br />
decision such as changes in agricultural practice or<br />
abstraction of groundwater. Such predictions will often be<br />
made by use of hydrological models and involve the<br />
following sources of uncertainty:<br />
- Uncertainty of input data.<br />
- Uncertainty of model parameter values.<br />
- Uncertainty of model techniques (numerical solution,<br />
software bugs, etc.).<br />
- Uncertainty of model structure.<br />
Uncertainty in economic assessments, which, like<br />
uncertainty in hydrological model predictions, may<br />
originate from economic data and from the choice of<br />
evaluation method.<br />
A key problem in assessing the uncertainty of the effects<br />
of a measure is that the effects usually are estimated as a<br />
difference between two model simulations, e.g. a reference<br />
run describing the present conditions and a run where the<br />
measure is taken into account. Procedures for assessing<br />
uncertainty of a model simulation are well known, while<br />
procedures for assessing uncertainties in differences between<br />
two simulation runs are theoretically difficult and rarely<br />
used. However, here we are mainly interested in the uncertainty<br />
on the difference figures. These uncertainties related<br />
to differences in simulated output may be much smaller than<br />
the uncertainties in the model predictions of each simulation<br />
(Reichert and Borsuk, 2005) as many sources of uncertainty<br />
affect the predictions for different alternatives in similar<br />
ways.<br />
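The reason why the uncertainty on a difference can be much smaller follows from the identity Var(A − B) = Var(A) + Var(B) − 2 Cov(A, B): errors shared by both runs give a large positive covariance and largely cancel. The sketch below illustrates this numerically under purely hypothetical assumptions (a shared error with standard deviation 10, run-specific errors with standard deviation 2, and outputs of 100 and 80 for the reference and measure runs); it is not taken from any of the case studies.

```python
import random
import statistics

rng = random.Random(1)
n = 5000

# Error source common to both runs (e.g. the same uncertain input data and parameters).
shared_error = [rng.gauss(0.0, 10.0) for _ in range(n)]

# Each run also has a small error of its own.
reference_run = [100.0 + e + rng.gauss(0.0, 2.0) for e in shared_error]
measure_run = [80.0 + e + rng.gauss(0.0, 2.0) for e in shared_error]

# Effect of the measure, estimated realisation by realisation.
effect = [r - m for r, m in zip(reference_run, measure_run)]

sd_reference = statistics.stdev(reference_run)
sd_measure = statistics.stdev(measure_run)
sd_effect = statistics.stdev(effect)
print(f"sd of each run: {sd_reference:.1f} and {sd_measure:.1f}; sd of the difference: {sd_effect:.1f}")
```

The cancellation only occurs when both runs are evaluated on the same realisation of the shared uncertain quantities; sampling the two runs independently would instead add their variances.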
The results of the case studies will be uncertainties<br />
expressed partly quantitatively and partly qualitatively. The<br />
quantitative parts may be illustrated as in Fig. 6, where the<br />
uncertainty on the impacts (hydrological models) is shown<br />
along the vertical axis and the uncertainty on the costs of<br />
implementing a measure is shown along the horizontal axis.<br />
In the hypothetical example shown in Fig. 6 measure no. 1<br />
(PoM 1) is clearly suboptimal as compared to the two other<br />
measures, because its effect is much lower and the<br />
implementation cost higher. A decision on whether to<br />
choose PoM 2 or PoM 3 is, however, more difficult, because<br />
the uncertainty ranges are overlapping both with regards to<br />
effects and costs. The choice will also be influenced by the<br />
risk strategy of the decision maker. If the decision maker<br />
wants a high degree of certainty for an effect corresponding<br />
to the dashed line denoted ‘Minimum effect’ s/he will have<br />
to select PoM 3, even if the expected cost efficiency of PoM<br />
2 is more favourable.<br />
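The influence of the risk strategy can be sketched as a simple decision rule: among the measures that reach the minimum effect with at least the required certainty, select the cheapest. The rule itself, the effect samples, the costs and the thresholds below are hypothetical and are chosen only to mimic the situation described for the three measures; they are not values read from the figure.

```python
def prob_effect_at_least(effect_samples, minimum_effect):
    """Fraction of simulated effects that reach the minimum effect."""
    hits = sum(1 for e in effect_samples if e >= minimum_effect)
    return hits / len(effect_samples)


def choose_measure(measures, minimum_effect, required_certainty):
    """Cheapest measure whose probability of reaching the minimum effect
    is at least the certainty demanded by the decision maker."""
    acceptable = [
        (cost, name)
        for name, (effect_samples, cost) in measures.items()
        if prob_effect_at_least(effect_samples, minimum_effect) >= required_certainty
    ]
    return min(acceptable)[1] if acceptable else None


# Hypothetical effect samples (e.g. from uncertainty propagation) and expected costs.
measures = {
    "PoM 1": ([10, 15, 20, 25, 30], 20.0),
    "PoM 2": ([30, 40, 50, 60, 70], 10.0),
    "PoM 3": ([50, 55, 60, 65, 70], 15.0),
}

risk_averse = choose_measure(measures, minimum_effect=45, required_certainty=0.9)
risk_tolerant = choose_measure(measures, minimum_effect=45, required_certainty=0.5)
print(risk_averse, risk_tolerant)
```

In this example the risk-averse decision maker ends up with PoM 3, whereas a decision maker who accepts a lower certainty selects the cheaper PoM 2. Uncertainty in the costs is ignored here for brevity.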
Fig. 6. Graphical representation of uncertainty in simulated effect of measure vs. estimated uncertainty in cost of implementing a measure.
6. Discussion and conclusions<br />
Assessment of uncertainty in model simulations is<br />
important when such models are used to support decisions<br />
in water resources management (Beven and Binley, 1992;<br />
Pahl-Wostl, 2002; Jakeman and Letcher, 2003; Refsgaard<br />
and Henriksen, 2004). This is reflected in the EU’s new water<br />
management approaches as described in the Water Framework<br />
Directive (EC, 2000) and the associated guidance<br />
documents. A basic principle in EU environmental policy on<br />
which the WFD is based is “...to contribute to pursuit of the<br />
objectives of preserving, protecting and improving the<br />
quality of the environment in prudent and rational use of<br />
natural resources, and to be based on the precautionary<br />
principle ...” (paragraph 11 in the directive). The holistic<br />
concept that is prescribed in the WFD with its integrated<br />
approach to natural resources and socio-economic issues<br />
therefore requires that uncertainty be considered in the<br />
decision making process in order for it to become truly<br />
rational. This need to take uncertainties into account is<br />
also explicitly stated in the WFD guidance documents<br />
(Blind and de Blois, 2003).<br />
The key sources of uncertainty of importance for<br />
evaluating the effect and cost of a measure in relation to<br />
preparing a WFD-compliant river basin management plan<br />
are (1) uncertainty related to framing of the decision<br />
making process; (2) uncertainty related to hydrological<br />
models (input data, parameter values, model technique,<br />
model structure); and (3) uncertainty in economic assessments.<br />
The framework adopted in HarmoniRiB addresses<br />
this wide spectrum of uncertainties. The particularly<br />
novel contributions of HarmoniRiB in this respect are<br />
related to the assessment of uncertainty in data and to<br />
the integration of uncertainty in effects of a measure<br />
(outputs from hydrological models) and socio-economic<br />
uncertainty, including uncertainty in costs of implementing a<br />
measure.<br />
New principles often lead to a demand for new research<br />
for supporting their implementation. This is also the case for<br />
the WFD. Hence there is a need for easy access to river basin<br />
datasets suitable for WFD related research. None of the<br />
existing international river basin networks can provide<br />
suitable datasets for supporting research on integrated water<br />
management of direct relevance for implementation of the<br />
WFD. In addition, none of the existing networks comprises<br />
any quantifiable information on data uncertainty. The<br />
HarmoniRiB project aims at filling this gap by designing,<br />
building and populating a database containing data and<br />
associated uncertainties for eight river basins that representatively<br />
characterise the diversity of climatic regimes and<br />
water management challenges across Europe. This river<br />
basin network aims at becoming a ‘virtual laboratory for<br />
modelling studies’, and it will be made available for the<br />
scientific community. The data may, e.g. be used for<br />
comparison and demonstration of methodologies and<br />
models relevant to the WFD.<br />
Acknowledgement<br />
This work is partly funded by the EC Energy,<br />
Environment and Sustainable Development programme<br />
(Contract EVK1-2002-00109).<br />
References<br />
Beck, M.B., 1987. Water quality modelling: a review of the analysis of<br />
uncertainty. Water Resour. Res. 23 (8), 1393–1442.<br />
Bergstrom, J.C., Boyle, K.J., Poe, G.L. (Eds.), 2001. The Economic Value<br />
of Water Quality. Edward Elgar, Cheltenham.<br />
Beven, K., Binley, A.M., 1992. The future of distributed models, model<br />
calibration and uncertainty predictions. Hydrol. Processes 6, 279–298.<br />
Blind, M., de Blois, C., 2003. The Water Framework Directive and its<br />
Guidance Documents — Review of data aspects. In: Refsgaard, J.C.,<br />
Nilsson, B. (Eds.), Requirements Report. Geological Survey of Denmark<br />
and Greenland, Copenhagen (Chapter 5). Available on<br />
http://www.harmonirib.com/.<br />
Brown, J.D., 2004. Knowledge, uncertainty and physical geography:<br />
towards the development of methodologies for questioning belief.<br />
Trans. Inst. Br. Geographers 29 (3), 367–381.<br />
Brown, J.D., Heuvelink, G.B.M., Refsgaard, J.C., 2005. An integrated<br />
framework for assessing and recording uncertainties about environmental<br />
data. To appear in a special issue of Water Sci. Technol.<br />
Cech, T.V., 2003. Principles of Water Resources — History, Development,<br />
Management, and Policy. John Wiley & Sons, New York.<br />
EC, 2000. Water Framework Directive. Directive 2000/60/EC. European<br />
Commission.<br />
EC, 2003a. Guidance for the analysis of Pressures and Impacts in accordance<br />
with the Water Framework Directive. Working Group 2.1.<br />
Available on http://forum.europa.eu.int/Public/irc/env/wfd/library.<br />
EC, 2003b. Water Framework Directive, Common Implementation Strategy.<br />
Working group 2.7. Monitoring. Available on http://forum.europa.eu.int/Public/irc/env/wfd/library.<br />
Funtowicz, S.O., Ravetz, J., 1990. Uncertainty and Quality in Science for<br />
Policy. Kluwer Academic Publishers, Dordrecht.<br />
GWP, 2000. Integrated Water Resources Management. TAC Background<br />
Papers No. 4. Global Water Partnership, Stockholm. Available on<br />
http://www.gwpforum.org/.<br />
Hanley, N., Spash, C.L., 1993. Cost-Benefit Analysis and the Environment.<br />
Edward Elgar, Brookfield.<br />
Heuvelink, G.B.M., Burrough, P.A., 1993. Error propagation in cartographic<br />
modelling using Boolean logic and continuous classification.<br />
Int. J. Geogr. Inform. Sci. 7 (3), 231–246.<br />
Jakeman, A.J., Letcher, R.A., 2003. Integrated assessment and modelling:<br />
features, principles and examples for catchment management. Environ.<br />
Modell. Software 18, 491–501.<br />
Jungermann, H., Pfister, H-R., Fischer, K., 1998. Die Psychologie der<br />
Entscheidung (The Psychology of Decisions). Spektrum Akademischer<br />
Verlag, Heidelberg.<br />
Klauer, B., Brown, J.D., 2003. Conceptualising imperfect knowledge in<br />
public decision making: ignorance, uncertainty, error and ‘risk situations’.<br />
Environ. Res., Eng. Manage.<br />
Moore, R.V., 1997. The logical and physical design of the Land Ocean<br />
Interaction Study database. Sci. Total Environ. 194/195, 137–146.<br />
Munier, N., 2004. Multicriteria Environmental Assessment. Kluwer Academic<br />
Publishers, Dordrecht.<br />
Pahl-Wostl, C., 2002. Towards sustainability in the water sector — the<br />
importance of human actors and processes of social learning. Aquatic<br />
Sci. 64, 394–411.<br />
Passarella, G., Vurro, M., 2003. Review of Existing River Basin Networks.<br />
In: Refsgaard, J.C., Nilsson, B. (Eds.), Requirements Report. Geological
Survey of Denmark and Greenland, Copenhagen (Chapter 3). Available<br />
on http://www.harmonirib.com/.<br />
Rasmussen, P., 2003. Requirements for Data for HarmoniRiB. In:<br />
Refsgaard, J.C., Nilsson, B. (Eds.), Requirements Report. Geological<br />
Survey of Denmark and Greenland, Copenhagen (Chapter 7). Available<br />
on http://www.harmonirib.com/.<br />
Refsgaard, J.C., Henriksen, H.J., 2004. Modelling guidelines — terminology<br />
and guiding principles. Adv. Water Resour. 27, 71–82.<br />
Refsgaard, J.C., van der Sluijs, J.P., Brown, J., van der Keur, P., submitted<br />
for publication. A framework for dealing with uncertainty due to model<br />
structure error.<br />
Reichert, P., Borsuk, M.E., 2005. Does high forecast uncertainty preclude<br />
effective decision support? Environ. Modell. Software 20 (8), 991–1001.<br />
Roy, B., 1996. Multicriteria Methodology for Decision Aiding. Kluwer<br />
Academic Publishers, Dordrecht.<br />
Tindal, C.I., Moore, R.V., Dunbar, M., Goodwin, T., 2004. The HarmoniRiB<br />
project — the effect of uncertainty on catchment management. In:<br />
British Hydrological Society International Conference on Hydrology:<br />
Science and Practice for the 21st Century, 12–16 July 2004, London,<br />
UK.<br />
Walker, W.E., Harremoës, P., Rotmans, J., Van der Sluijs, J.P., Van Asselt,<br />
M.B.A., Janssen, P., Krayer von Krauss, M.P., 2003. Defining uncertainty.<br />
A conceptual basis for uncertainty management in model-based<br />
decision support. Integrated Assess. 4 (1), 5–17.<br />
Van Loon, E., Refsgaard, J.C. (Eds.), 2005. Guidelines for assessing data<br />
uncertainty in hydrological studies. First draft version prepared September<br />
2004. Final version to be published beginning of 2005 on<br />
http://www.harmonirib.com/.<br />
Jens Christian Refsgaard is co-ordinator of the HarmoniRiB project.<br />
Since his graduation in hydrology at the Technical University of Denmark in<br />
1976 he has worked with hydrological modelling and water resources<br />
management at DTU, DHI and now at GEUS, where he holds a position<br />
as research professor. He is currently also WP leader in HarmoniQuA<br />
(quality assurance in the modelling process) and NeWater (new approaches<br />
in water resources management).<br />
Bertel Nilsson has been a research scientist in hydrogeology at the Geological Survey<br />
of Denmark and Greenland since 1988.<br />
James Brown is a postdoctoral research associate at the University of<br />
Amsterdam with interests in environmental modelling, methods for uncertainty<br />
analysis of models, and the impacts of scientific uncertainty on<br />
decision making.<br />
Bernd Klauer has a professional background in mathematics, physics and<br />
economics. After his PhD in economics from the University of Heidelberg<br />
he became engaged at the UFZ Centre for Environmental Research, Leipzig.<br />
There he currently works as a senior scientist and leader of a research group<br />
on integrated assessment and decision support.<br />
Roger Moore is a member of the Centre for Ecology and Hydrology, UK.<br />
His background lies in civil engineering, but he has spent most of his career<br />
working on integrated database design mainly in the UK but also around the<br />
world. Currently, he is also co-ordinator for The FP5 project HarmonIT.<br />
Thomas Bech holds an MSc in electronics engineering and computer<br />
science, and has worked as software developer and project manager at<br />
Seven Technologies and DHI Water & Environment. He is currently<br />
working as a Software Development Manager at DHI Water & Environment.<br />
Michele Vurro graduated in hydraulic engineering. He has been a researcher at<br />
CNR.IRSA since 1982, and is now principal researcher with responsibility<br />
for methodology and techniques for protecting and managing water<br />
resources, with particular emphasis on water budget under scarce water<br />
availability.<br />
Michiel Blind, MSc in Environmental Science (Water Systems Analysis), has<br />
worked 5 years on monitoring network design at Wageningen University,<br />
after which he continued his career at RWS-RIZA, on IT-water management<br />
issues. He is mainly involved in European Research Projects on Catchment<br />
modelling.<br />
Guillermo Castilla is a forest engineer specialized in Remote Sensing and<br />
GIS. He is currently involved in the dissemination activities of HarmoniRiB.<br />
Ioannis K. Tsanis is a professor in the Department of Environmental<br />
Engineering at Technical University of Crete. He obtained his PhD in civil<br />
engineering from University of Toronto. His research activities are in the<br />
areas of hydroinformatics, water resources management and coastal engineering.<br />
His main background is hydrological modelling, water resources<br />
management and hydroinformatics.<br />
Pavel Biza has been educated in civil engineering and developed his career<br />
at the water board Povodi Moravy in the Czech Republic. He is now<br />
involved in development of river basin management plans.
[15]<br />
Refsgaard JC, van der Sluijs JP, Brown J, van der Keur P (2006). A<br />
framework for dealing with uncertainty due to model structure error.<br />
Advances in Water Resources, 29, 1586-1597.<br />
Reprinted from Advances in Water Resources with permission from Elsevier
Advances in Water Resources 29 (2006) 1586–1597<br />
www.elsevier.com/locate/advwatres<br />
A framework for dealing with uncertainty due to model<br />
structure error<br />
Jens Christian Refsgaard a, *, Jeroen P. van der Sluijs b ,<br />
James Brown c , Peter van der Keur a<br />
a Department of Hydrology, Geological Survey of Denmark and Greenland (GEUS), Øster Voldgade 10, 1350 Copenhagen, Denmark<br />
b Copernicus Institute for Sustainable Development and Innovation, Department of Science Technology and Society,<br />
Utrecht University, Utrecht, The Netherlands<br />
c University of Amsterdam (UVA), Amsterdam, The Netherlands<br />
Received 29 July 2004; received in revised form 6 September 2005; accepted 21 November 2005<br />
Available online 5 January 2006<br />
Abstract<br />
Although uncertainty about structures of environmental models (conceptual uncertainty) is often acknowledged to be the main<br />
source of uncertainty in model predictions, it is rarely considered in environmental modelling. Rather, formal uncertainty analyses<br />
have traditionally focused on model parameters and input data as the principal source of uncertainty in model predictions. The traditional<br />
approach to model uncertainty analysis, which considers only a single conceptual model, may fail to adequately sample the<br />
relevant space of plausible conceptual models. As such, it is prone to modelling bias and underestimation of predictive uncertainty.<br />
In this paper we review a range of strategies for assessing structural uncertainties in models. The existing strategies fall into two<br />
categories depending on whether field data are available for the predicted variable of interest. To date, most research has focussed<br />
on situations where inferences on the accuracy of a model structure can be made directly on the basis of field data. This corresponds<br />
to a situation of ‘interpolation’. However, in many cases environmental models are used for ‘extrapolation’; that is, beyond the situation<br />
and the field data available for calibration. In the present paper, a framework is presented for assessing the predictive uncertainties<br />
of environmental models used for extrapolation. It involves the use of multiple conceptual models, assessment of their<br />
pedigree and reflection on the extent to which the sampled models adequately represent the space of plausible models.<br />
© 2005 Elsevier Ltd. All rights reserved.<br />
Keywords: Environmental modelling; Model error; Model structure; Conceptual uncertainty; Scenario analysis; Pedigree<br />
* Corresponding author. Tel.: +45 38 14 27 76; fax: +45 38 14 20 50.<br />
E-mail address: jcr@geus.dk (J.C. Refsgaard).<br />
1. Introduction<br />
1.1. Background<br />
Assessing the uncertainty of model simulations is<br />
important when such models are used to support decisions<br />
about water resources [6,33,23,39]. The key<br />
sources of uncertainty in model predictions are (i) input<br />
data; (ii) model parameter values; and (iii) model structure<br />
(=conceptual model). Other authors further distinguish<br />
uncertainty in model context, model assumptions,<br />
expert judgement and indicator choice [46,54,48] but<br />
these are beyond the scope of this paper. Uncertainties<br />
due to input data and due to parameter values have been<br />
dealt with in many studies, and methodologies to deal<br />
with these are well developed. However, no generic<br />
methodology exists for assessing the effects of model<br />
structure uncertainty, and this source of uncertainty is<br />
frequently neglected.<br />
0309-1708/$ - see front matter © 2005 Elsevier Ltd. All rights reserved.<br />
doi:10.1016/j.advwatres.2005.11.013<br />
Any model is an abstraction, simplification and interpretation<br />
of reality. The incompleteness of a model<br />
structure and the mismatch between the real causal<br />
structure of a system and the assumed causal structure<br />
as represented in a model always result in uncertainty<br />
about model predictions. The importance of the model<br />
structure for predictions is well recognised, even for situations<br />
where predictions are made on output variables,<br />
such as discharge, for which field data are available<br />
[16,8]. The considerable challenge faced in many applications<br />
of environmental models is that predictions are<br />
required beyond the range of available observations,<br />
either in time or in space, e.g. to make extrapolations<br />
towards unobservable futures [2] or to make predictions<br />
for natural systems, such as ecosystems, that are likely<br />
to undergo structural changes [4]. In such cases, uncertainty<br />
in model structure is recognised by many authors<br />
to be the main source of uncertainty in model predictions<br />
[44,13,31,28].<br />
1.2. An example – five alternative conceptual models<br />
The problem is illustrated for a study conducted by<br />
the County of Copenhagen in 2000 involving a real<br />
water management decision [11,37]. The County of<br />
Copenhagen is the authority responsible for water<br />
resources management in the county where the city of<br />
Copenhagen abstracts groundwater for most of its water<br />
supply. According to a new Water Supply Act the<br />
county had to prepare an action plan for protection of<br />
groundwater against pollution. As a first step, the<br />
county asked five groups of Danish consulting firms to<br />
conduct studies of the aquifer’s vulnerability towards<br />
pollution in a 175 km² area west of Copenhagen, where<br />
the groundwater abstraction amounts to about 12 million<br />
m³/year. The key question to be answered was:<br />
which parts of this particular area are most vulnerable<br />
to pollution and need to be protected? The five consultants<br />
were among the most well reputed consulting firms<br />
in Denmark, and they were known to have different<br />
views and preferences on which methodologies are most<br />
suitable for assessing vulnerability. As the task was one<br />
of the first consultancy studies on a new major market<br />
for preparation of groundwater protection plans it was<br />
considered a prestigious job to which the consultants<br />
generally allocated some of their most qualified<br />
professionals.<br />
The five consultants used significantly different<br />
approaches. One consultant based his approach on<br />
annual fluctuations of piezometric heads assuming that<br />
larger fluctuations represent greater interaction between<br />
aquifer and surface water systems and hence a larger<br />
vulnerability. Several consultants used the DRASTIC<br />
multi-criteria method [1], but modified it in different<br />
ways by changing weights and adding new, mainly geochemically<br />
oriented, criteria. One consultant based his<br />
approach on advanced hydrological modelling of both<br />
groundwater and surface water systems using the MIKE<br />
SHE code [40], while two other consultants used simpler<br />
groundwater modelling approaches. Thus, the five consultants<br />
had different perceptions of what causes<br />
groundwater pollution and used models with different<br />
processes and causal relationships to describe the possibility<br />
of groundwater pollution in the area. In addition,<br />
their different interpretations and interpolations made<br />
from common field data resulted in significantly different<br />
figures for e.g. areal means of precipitation and<br />
evapotranspiration and the thickness of various geological<br />
layers [37].<br />
The conclusions of the five consultants regarding vulnerability<br />
to nitrate pollution are shown in Fig. 1. It is<br />
apparent that the five estimates differ substantially from<br />
each other. In the present case, no data exist to validate<br />
the model predictions, because the five models were used<br />
to make extrapolations. Thus, it is not possible, from<br />
existing field data, to tell which of the five model estimates<br />
are more reliable. The differences in prediction<br />
originate from two main sources: (i) data and parameter<br />
uncertainty and (ii) conceptual uncertainty. Although<br />
the data and parameter uncertainties were not explicitly<br />
assessed by any of the consultants (as is common in such<br />
studies), the substantial differences in model structures<br />
and the fact that the consultants all used the same raw<br />
data point to structural uncertainty as the main cause<br />
of difference between the five model results and as a<br />
major source of uncertainty in model predictions.<br />
Fig. 1. Model predictions on aquifer vulnerability towards nitrate<br />
pollution for a 175 km² area west of Copenhagen [11].<br />
Usually a water manager bases their decisions on the<br />
conclusions from only one study. The uniqueness of the<br />
present study was that five consultants were asked to<br />
answer the same question on the basis of the same data.<br />
In this respect the differences between the five estimates<br />
are striking and clearly do not provide a sound basis for<br />
deciding anything about which areas should be protected.<br />
A worrying question, which is left unanswered,<br />
is whether the basis for decisions is similarly poor in<br />
the many other cases where only a single conceptual<br />
model has been adopted and where millions of DKK<br />
have subsequently been used to prepare and implement<br />
action plans.<br />
1.3. Objective and outline of paper<br />
The objective of this paper is to review possible strategies<br />
for dealing with model structure errors and to outline<br />
a framework for handling the effects of model<br />
structure errors on predictive uncertainty, with particular<br />
emphasis on situations where model predictions represent<br />
extrapolations to situations not covered by<br />
calibration data and are often outside the domain on<br />
which our knowledge on the dynamics of the system<br />
and our understanding of its causal relationships is<br />
based.<br />
The paper is organised so that reviews of existing<br />
strategies and the discussion of their potentials and limitations<br />
are given in Section 2. A new framework is presented<br />
in Section 3 for analysing the uncertainties due to<br />
model structure errors when models are used for making<br />
extrapolations beyond their calibration base. Finally,<br />
the problems and perspectives of the new framework<br />
are discussed in Section 4. The terminology used is<br />
defined in the Appendix.<br />
2. Review of possible strategies<br />
2.1. Classification<br />
The existing strategies for assessing uncertainty due<br />
to incomplete or inadequate model structure may be<br />
grouped into the categories shown in Fig. 2. The most<br />
important distinction is whether data exist that make<br />
it possible to make inferences on the model structure<br />
uncertainty directly. This requires that data are available<br />
for the output variable of predictive interest and for conditions<br />
similar to those in the predictive situation. In<br />
other words it is a distinction between whether the<br />
model predictions can be considered as interpolations<br />
or extrapolations relative to the calibration situation.<br />
The two main categories are thus equivalent to different<br />
situations with respect to model validation tests.<br />
According to Klemes’ classical hierarchical test scheme<br />
[26,38], the interpolation case corresponds to situations<br />
where the traditional split-sample test is suitable, while<br />
the extrapolation case corresponds to situations where<br />
no data exist for the concerned output variable<br />
(proxy-basin test) or where the basin characteristics<br />
are considered non-stationary, e.g. for predictions of<br />
effects of climate change or effects of land use change<br />
(differential split-sample test).<br />
In the review of existing strategies given below examples<br />
of studies have been selected to illustrate the classification<br />
and the common approaches. It is not an<br />
exhaustive review, but illustrates the range of<br />
approaches available to diagnose structural uncertainty<br />
in models.<br />
[Figure 2: tree diagram. Root: availability of data for model validation test. Branch "Target data exist (interpolation)": increase parameter uncertainty; estimate structural term. Branch "No direct data (extrapolation)", with the sub-cases "intermediate data (differential split-sample case)" and "no data at all (proxy basin case)": multiple conceptual models; expert elicitation; pedigree analysis.]<br />
Fig. 2. Classification of existing strategies for assessing conceptual model uncertainty.<br />
2.2. Data exist – interpolation<br />
In this situation, calibration is usually carried out<br />
against a sample of the existing field data to ensure some<br />
kind of optimal parameter values, and then the model<br />
predictions are compared with the remaining (‘independent’)<br />
field data. The deviations between model predictions<br />
and independent field observations can be used<br />
to infer the model’s conceptual error. Different methodologies<br />
can be used in this respect.<br />
2.2.1. Increasing parameter uncertainty to account<br />
for structural uncertainty<br />
One strategy is to increase the parameter uncertainty<br />
to a level where it is assumed to compensate for omitting<br />
model structure error from the analysis. Van Griensven<br />
and Meixner [45] provide an example of this. They<br />
assess the total predictive uncertainty without identifying<br />
or quantifying the underlying sources of uncertainty.<br />
They use the split-sample approach assessing ranges of<br />
predictive uncertainty from analyses of predictions and<br />
data for a period different from the calibration period.<br />
Their total predictive uncertainty is assessed by increasing<br />
the model parameter uncertainty beyond the magnitudes<br />
estimated during calibration to a level where the<br />
resulting predictive uncertainty intervals bracket the<br />
observations. This technique does not introduce a separate<br />
stochastic term for the structural uncertainty, but<br />
represents the structural term in the parameter term.<br />
The model structure error is likely to influence the model<br />
simulations in non-random and temporally varying<br />
ways. Because this approach compensates for model structure error by<br />
increasing the variance of a temporally constant random<br />
variable, its results can be questioned,<br />
particularly if used for predictions in situations<br />
where split-sample tests are not made.<br />
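The inflation idea can be illustrated with a minimal Python sketch. It is not the procedure of Van Griensven and Meixner [45]: the one-parameter model `simulate`, the 5-95% band, the `target` coverage, the multiplication factor `step` and the data values are all hypothetical choices made here for illustration.<br />

```python
import random


def simulate(k, rain):
    """Hypothetical one-parameter model: runoff = k * rainfall."""
    return [k * r for r in rain]


def coverage(k_mean, k_sd, rain, observed, n_draws=2000, seed=1):
    """Share of observations that fall inside the 5-95% predictive band."""
    rng = random.Random(seed)
    draws = [simulate(rng.gauss(k_mean, k_sd), rain) for _ in range(n_draws)]
    inside = 0
    for t, obs in enumerate(observed):
        values = sorted(d[t] for d in draws)
        lower = values[int(0.05 * n_draws)]
        upper = values[int(0.95 * n_draws)]
        inside += lower <= obs <= upper
    return inside / len(observed)


def inflate_until_bracketing(k_mean, k_sd, rain, observed,
                             target=0.8, step=1.25, max_iter=50):
    """Scale the calibrated standard deviation until coverage >= target.

    Returns the inflation factor. If max_iter is exhausted, the returned
    factor has not been checked and carries no coverage guarantee.
    """
    factor = 1.0
    for _ in range(max_iter):
        if coverage(k_mean, k_sd * factor, rain, observed) >= target:
            return factor
        factor *= step
    return factor


# Invented validation data, not taken from any study.
rain = [10.0, 20.0, 30.0, 40.0, 50.0]
observed = [6.5, 9.0, 16.5, 19.0, 26.5]
factor = inflate_until_bracketing(0.5, 0.01, rain, observed)
```

Each draw keeps the parameter constant over the whole period, so the widened band cannot follow a structural error that varies in time; this is the limitation noted in the text.<br />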
2.2.2. Estimation of the structural uncertainty term<br />
Other strategies attempt to estimate the structural<br />
contribution to uncertainty in the model predictions.<br />
An example of such an approach is given by Radwan<br />
et al. [35], who estimate the total predictive uncertainty<br />
from a statistical analysis of the residuals between model<br />
predictions and observations. Further, they analyse the<br />
propagated uncertainties from model input and parameter<br />
values. By subtracting these two uncertainties from<br />
the total predictive uncertainty they assign the remaining<br />
predictive uncertainty to be an effect of model structure<br />
uncertainty. It is then possible to add the model<br />
structure uncertainty when making other predictions.<br />
This approach assumes that the uncertainties from different<br />
sources are additive. This assumption is questionable,<br />
because the combination of uncertainties is often<br />
non-linear due to interactions, correlations and dependencies<br />
between variables in a model. It also assumes<br />
that the differences in predictions and observations are<br />
caused by structural error and not by the poor specification<br />
of input and parameter uncertainty, nor by errors in<br />
the observations.<br />
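Written out, and assuming (a notation introduced here, not taken from Radwan et al. [35]) that each uncertainty is expressed as a variance, the additivity assumption amounts to:<br />

```latex
\sigma^2_{\mathrm{structure}} \approx \sigma^2_{\mathrm{total}} - \sigma^2_{\mathrm{input}} - \sigma^2_{\mathrm{parameter}}
```

The relation holds only if the three contributions are independent and the observation error is negligible; otherwise covariance and observation-error terms enter the right-hand side, which is the objection raised above.<br />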
Vrugt et al. [53] present another stochastic approach<br />
based on a simultaneous parameter optimisation and<br />
data assimilation with an ensemble Kalman filter. By<br />
specifying values for measurement error and a so-called<br />
‘stochastic forcing term’, representing structural uncertainty,<br />
they are able to estimate the dynamic behaviour<br />
of the model structure uncertainty. Both techniques<br />
assume a smooth contribution from structural uncertainty,<br />
but an important advantage of the latter is that<br />
parameter innovations (an output from the Kalman filter)<br />
may be used to diagnose non-stationarity in system<br />
structure.<br />
2.3. No direct data – extrapolation<br />
In cases where model structure errors cannot be<br />
assessed directly due to a lack of relevant data, the main<br />
strategy is to do the extrapolation with multiple conceptual<br />
models. Two supporting methods can be used here<br />
for the generation and qualification of each of the alternative<br />
models: expert elicitation and pedigree analysis<br />
(Fig. 2).<br />
2.3.1. Multiple conceptual models<br />
In the scenario approach a number of alternative<br />
conceptual models are considered. For each of these,<br />
the model input and parameter uncertainties may be<br />
analysed and the differences between model predictions<br />
are then seen as a measure of the model structure uncertainty.<br />
The idea of using alternative or competing candidate<br />
model structures was introduced in water quality<br />
modelling some time ago [5]. The issue typically dealt<br />
with here is whether models developed for current conditions<br />
can yield correct predictions when used under<br />
changed control. Van Straten and Keesman [50] note<br />
in this respect that good performance at the calibration<br />
stage does not guarantee correctly predicted behaviour,<br />
due to non-stationarity of the underlying processes in<br />
space or time.<br />
The multiple modelling approach has also been used<br />
in flood forecasting. For example, Butts et al. [8] use 10<br />
different model structures to evaluate structural uncertainty<br />
in flood predictions. They conclude that exploring<br />
an ensemble of model structures provides a useful<br />
approach in assessing simulation uncertainty.<br />
In groundwater modelling different conceptual models<br />
are typically based on different geological interpretations<br />
[18,43,42,30,34]. Højberg and Refsgaard [21]<br />
present an example using three different conceptual
models, based on three alternative geological interpretations<br />
for a multi-aquifer system in Denmark. Each of<br />
the models was calibrated against piezometric head data<br />
using an inverse technique. The three models provided<br />
equally good and very similar predictions of groundwater<br />
heads, including well field capture zones. However,<br />
when using the models to extrapolate beyond the calibration<br />
data to predictions of flow pathways and travel<br />
times the three models differed dramatically. When<br />
assessing the uncertainty contributed by the model<br />
parameter values, the overlap of uncertainty ranges<br />
between the three models significantly decreased when<br />
moving from groundwater heads to capture zones and<br />
travel times. They conclude that the larger the degree<br />
of extrapolation, the more the underlying conceptual<br />
model dominates over the parameter uncertainty and<br />
the effect of calibration.<br />
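A toy Python sketch can illustrate this conclusion. It does not reproduce the study of Højberg and Refsgaard [21]: the three model structures, the parameter distribution and the input values are hypothetical, and the split into within-model and between-model variance is one possible way of comparing parameter and structural contributions.<br />

```python
import random
import statistics


# Three hypothetical model structures, scaled to agree at rain = 10.
def linear(k, rain):
    return k * rain


def threshold(k, rain):
    return 2.0 * k * max(rain - 5.0, 0.0)


def power_law(k, rain):
    return k * rain ** 1.5 / 10.0 ** 0.5


def variance_split(models, rain, k_mean=0.5, k_sd=0.05, n_draws=1000, seed=7):
    """Return (within-model, between-model) variance of the prediction.

    Within-model variance is the average spread caused by sampling the
    parameter k; between-model variance is the spread of the model means.
    """
    rng = random.Random(seed)
    means, variances = [], []
    for model in models:
        preds = [model(rng.gauss(k_mean, k_sd), rain) for _ in range(n_draws)]
        means.append(statistics.fmean(preds))
        variances.append(statistics.pvariance(preds))
    return statistics.fmean(variances), statistics.pvariance(means)


models = [linear, threshold, power_law]
within_cal, between_cal = variance_split(models, rain=10.0)   # calibration range
within_ext, between_ext = variance_split(models, rain=100.0)  # extrapolation
```

In this sketch the parameter spread dominates at the input value where the structures agree, while the spread between structures dominates once the input lies far outside that range.<br />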
The strategy of applying several alternative models<br />
based on codes with different model structures is also<br />
common in climate change modelling. In its description<br />
of uncertainty related to model predictions of both present<br />
and future climates the Intergovernmental Panel on<br />
Climate Change (IPCC) [22] bases its evaluation on scenarios<br />
of many (up to 35) different models. The same<br />
strategy is followed in the dialogue model [52]. Dialogue<br />
is a so-called integrated assessment model (IAM) of climate<br />
change. It has been developed as an interactive<br />
decision-support tool for energy supply policy making.<br />
Dialogue simulates the cause-effect chain of climate<br />
change, using mono-disciplinary sub-models for each<br />
step in the chain. The chain starts with scenarios for economic<br />
growth, energy demand, fuel mix etc., leading to<br />
emissions of greenhouse gases, leading to changes in<br />
atmospheric composition, leading to radiative forcing<br />
of the climate, leading to climate change, leading to<br />
impacts of climate change on societies and ecosystems.<br />
Rather than selecting one mono-disciplinary sub-model<br />
for each step, as most other climate IAMs do, dialogue<br />
uses multiple models for each step (for instance, three<br />
different carbon cycle models, simplified versions of five<br />
different global climate model outcomes, etc.), representing<br />
the major part of the spectrum of expert opinion<br />
in each discipline.<br />
2.3.2. Expert elicitation<br />
Expert elicitation can be used as a supporting method<br />
in uncertainty analysis. It is a structured process to elicit<br />
subjective judgements and ideas from experts. It is<br />
widely used in uncertainty assessment to quantify uncertainties<br />
in cases where there are no, or too few, direct<br />
empirical data available to infer uncertainty. Usually<br />
the subjective judgement is represented as a probability<br />
density function reflecting the experts’ degree of belief.<br />
Expert elicitation aims to specify uncertainties in a structured<br />
and documented way, ensuring the account is both<br />
credible and traceable to its assumptions. Typically it is<br />
applied in situations where there is scarce or insufficient<br />
empirical material for a direct quantification of uncertainty<br />
[20]. An example with use of expert elicitation<br />
to estimate probabilities of alternative conceptual models<br />
is given by Meyer et al. [29]. They assessed probabilities<br />
as subjective values, from expert elicitation,<br />
reflecting a belief about the relative plausibility of each<br />
model based on its apparent consistency with available<br />
knowledge and data.<br />
Expert elicitation can also be used to generate ideas<br />
about alternative causal structures (conceptual models)<br />
that govern the behaviour of a system. Techniques used<br />
in decision analysis include group model building [51]<br />
and the hexagon method [19] but these techniques usually<br />
aim to achieve consensus. From the point of view<br />
of model structure uncertainty, these elicitation techniques<br />
could perhaps be used to generate alternative<br />
conceptual models.<br />
2.3.3. Pedigree analysis<br />
Another supporting method is pedigree analysis. The<br />
idea comes from Funtowicz and Ravetz [17], who note<br />
that statistical uncertainty in terms of inexactness does<br />
not cover all relevant dimensions of uncertainty, including<br />
the methodological and epistemological dimensions.<br />
To promote a more differentiated insight into uncertainty<br />
they propose to extend good scientific practice with five<br />
qualifiers for quantitative scientific information: numeral,<br />
unit, spread, assessment, and pedigree (NUSAP). By<br />
adding expert judgement of reliability (assessment) and<br />
systematic multi-criteria evaluation of the processes by<br />
which numbers have been produced (pedigree), NUSAP<br />
has extended the statistical approach to uncertainty (inexactness)<br />
with the methodological (unreliability) and epistemological<br />
(ignorance) dimensions. By providing a<br />
separate qualification for each dimension of uncertainty,<br />
it enables flexibility in their expression.<br />
Each special sort of information has its own aspects<br />
that are key to its pedigree, so different pedigree matrices<br />
using different pedigree criteria can be used to qualify<br />
different sorts of information. Early applications of pedigree<br />
analysis of environmental models have focussed on<br />
parameter pedigree, using proxy representation, empirical<br />
basis, methodological rigor, theoretical understanding<br />
and validation as pedigree criteria. Later on,<br />
pedigree analysis has been extended to assessment of<br />
model assumptions and problem framing [49,12].<br />
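A pedigree matrix for a parameter can be sketched in Python as follows. The five criteria are those named above; the 0 to 4 scoring scale and the aggregation into a single normalised score are assumptions made here for illustration and are not specified in the text.<br />

```python
# Pedigree criteria named in the text for parameter pedigree.
CRITERIA = (
    "proxy_representation",
    "empirical_basis",
    "methodological_rigour",
    "theoretical_understanding",
    "validation",
)
MAX_SCORE = 4  # assumption: each criterion is scored from 0 (weak) to 4 (strong)


def pedigree_strength(scores):
    """Average the criterion scores and normalise the result to 0..1.

    `scores` maps each criterion name to an expert-assigned integer score.
    The aggregation into one number is a simplification chosen here.
    """
    missing = [c for c in CRITERIA if c not in scores]
    if missing:
        raise ValueError(f"missing criteria: {missing}")
    for c in CRITERIA:
        if not 0 <= scores[c] <= MAX_SCORE:
            raise ValueError(f"score for {c} outside 0..{MAX_SCORE}")
    return sum(scores[c] for c in CRITERIA) / (MAX_SCORE * len(CRITERIA))


# Invented scores for one hypothetical parameter.
example = {
    "proxy_representation": 4,
    "empirical_basis": 3,
    "methodological_rigour": 2,
    "theoretical_understanding": 1,
    "validation": 0,
}
strength = pedigree_strength(example)
```

Keeping the individual scores alongside the aggregate preserves the separate qualification of each criterion, which a single number would otherwise hide.<br />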
2.4. Discussion of strengths/weaknesses and potentials/<br />
limitations<br />
The strategies used in ‘interpolation’, i.e. for situations<br />
that are similar to the calibration situation with<br />
respect to variables of interest and conditions of the natural<br />
system, have the advantage that they can be based<br />
directly on field data. A fundamental weakness is that
field data are themselves uncertain. Nevertheless, in<br />
many cases, they can be expected to provide relatively<br />
accurate estimates of, at least, the total predictive uncertainty<br />
for the specific measured variable and for the<br />
same conditions as those in the calibration and validation<br />
situation. Some of the methods cannot differentiate<br />
how the total predictive uncertainty originates from<br />
model input, model parameter and model structure<br />
uncertainty. Other methods attempt to do so. However,<br />
this distinction is, as recognised by many authors, e.g.<br />
Vrugt et al. [53], problematic. In the case of uncalibrated<br />
models, the parameter uncertainty is very difficult to<br />
assess quantitatively, and wrong estimates of model<br />
parameter uncertainty will influence the estimates of<br />
model structure uncertainty. In the case of calibrated<br />
models, estimates of model parameter uncertainty can<br />
often be derived from autocalibration routines. An inadequate<br />
model structure will, however, be compensated<br />
by biased parameter values to optimise the model fit<br />
with field data during calibration. Hence, the uncertainty<br />
due to model structure will be underestimated in<br />
this case.<br />
A more serious limitation of the strategies depending<br />
on observed data is that they are only applicable for situations<br />
where the output variables of interest are measured<br />
(e.g. [35,45,53]). While relevant field data are<br />
often available for variables such as water levels and<br />
water flows, this is usually not the case for concentrations,<br />
or when predictions are desired for scenarios<br />
involving catchment change, such as land use change<br />
or climate change. Another serious limitation stems<br />
from an assumption that the underlying system does<br />
not undergo structural changes, such as changes in ecosystem<br />
processes due to climate change.<br />
The strategy that uses multiple conceptual models<br />
benefits from an explicit analysis of the effects of alternative<br />
model structures. Furthermore, it makes it possible<br />
to include expert knowledge on plausible model structures.<br />
This strategy is strongly advocated by Neuman<br />
and Wierenga [31] and Poeter and Anderson [34]. They<br />
characterise the traditional approach of relying on a single<br />
conceptual model as one in which plausible conceptual<br />
models are rejected (in this case by omission). They<br />
conclude that the bias and uncertainty that results from<br />
reliance on an inadequate conceptual model are typically<br />
much larger than those introduced through an<br />
inadequate choice of model parameter values.<br />
This view is consistent with Beven [7] who outlines a<br />
new philosophy for modelling of environmental systems.<br />
The basic aim of his approach is to extend traditional<br />
schemes with a more realistic account of uncertainty,<br />
rejecting the idea that a single optimal model exists for<br />
any given case. Instead, environmental models may be<br />
non-unique in their accuracy of both reproduction of<br />
observations and prediction (i.e. unidentifiable or equifinal),<br />
and subject to only a conditional confirmation, due<br />
to e.g. errors in model structure, calibration of parameters<br />
and period of data used for evaluation. A weakness<br />
of the multiple modelling strategy is the absence of<br />
quantitative information about the extent to which each<br />
model is plausible. Furthermore, it may be difficult to<br />
sample from the full range of plausible conceptual models.<br />
In this respect, the expert knowledge on which the formulations<br />
of multiple conceptual models are based is<br />
an important and unavoidable subjective element. The<br />
level of subjectivity can be reduced if the scenarios are<br />
generated in a formalised and reproducible manner.<br />
For example, this is possible with the TPROGS procedure<br />
[9,10], by which alternative geological models can<br />
be generated stochastically. The subjectivity does not<br />
disappear with this approach. Rather, it is transferred<br />
from formulation of the geological model itself to<br />
assumptions on probability functions and correlation<br />
structures of the various geological units that are more<br />
easily constrained in practice.<br />
The strategy of expert elicitation has the advantage<br />
that subjective expert knowledge can be included in<br />
the evaluation. It has the potential to make use of all<br />
available knowledge including knowledge that cannot<br />
be easily formalised otherwise. It can include views of<br />
sceptics, and reveals the level of expert disagreement<br />
on certain estimates. Expert elicitation also has several<br />
limitations. The fraction of experts holding a given view<br />
is not proportional to the probability of that view being<br />
correct. One may safely average estimates of model<br />
parameters, but if the experts’ models are incommensurate,<br />
one cannot average the models [25]. If differences<br />
in expert opinion are irresolvable, weighting and combining<br />
the individual estimates of distributions is impossible.<br />
In practice, the opinions are often weighted<br />
equally, although sometimes self-rating is used to obtain<br />
a weighting factor for each expert’s competence. Finally, the<br />
results of expert elicitation tend to be sensitive to the<br />
selection of the experts whose estimates are gathered.<br />
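The weighting options mentioned above (equal weights, or weights derived from self-rating) can be illustrated with a minimal sketch. The function name and the convention that self-ratings are non-negative numbers are assumptions made for illustration only. The sketch applies to estimates of a single parameter, since the text notes that incommensurate models cannot be averaged.

```python
def combine_expert_estimates(estimates, self_ratings=None):
    """Weighted mean of expert estimates of one model parameter.

    estimates    -- one numeric estimate per expert
    self_ratings -- optional non-negative competence scores (hypothetical
                    convention); when omitted, all experts weigh equally
    """
    if not estimates:
        raise ValueError("at least one estimate is required")
    if self_ratings is None:
        weights = [1.0] * len(estimates)
    else:
        if len(self_ratings) != len(estimates):
            raise ValueError("one self-rating per estimate is required")
        weights = [float(r) for r in self_ratings]
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * e for w, e in zip(weights, estimates)) / total


# Equal weighting versus self-rated weighting of three experts.
equal = combine_expert_estimates([1.0, 2.0, 3.0])
rated = combine_expert_estimates([1.0, 2.0, 3.0], self_ratings=[1, 1, 2])
```

Neither weighting removes the sensitivity to expert selection noted in the text; the sketch only makes the arithmetic explicit.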
In a review of four different case studies in which pedigree<br />
analysis was applied, Van der Sluijs et al. [49] show<br />
that pedigree analysis broadens the scope of uncertainty<br />
assessment and stimulates scrutiny of underlying methods<br />
and assumptions. Craye et al. [12] reported similar<br />
experiences. Pedigree analysis facilitates structured, creative thinking<br />
on conceivable sources of error and fosters an enhanced<br />
appreciation of the issue of quality in information. It<br />
thereby enables a more effective criticism of quantitative<br />
information by providers, clients, and also users of all<br />
sorts, expert and lay. It provides differentiated insight<br />
into which parts of a given knowledge base are the<br />
weakest. It is flexible in its use and can be applied at different<br />
levels of comprehensiveness: from a ‘back of the envelope’<br />
sketch based on self-elicitation to a comprehensive<br />
and sophisticated procedure involving structured informed<br />
in-depth group discussions, covering each pedigree<br />
criterion. The scoring of pedigree criteria is to a certain
1592 J.C. Refsgaard et al. / Advances in Water Resources 29 (2006) 1586–1597<br />
degree subjective. Subjectivity can partly be remedied by<br />
the design of unambiguous pedigree matrices and by<br />
involving multiple experts in the scoring. The choice of<br />
experts to do the scoring is also a potential source of<br />
bias. The method is relatively new, with a limited (but<br />
growing) number of practitioners. There is as yet no settled<br />
guideline for good practice. We must keep in mind<br />
that it is not a panacea for the problem of unquantifiable<br />
uncertainty.<br />
3. New framework<br />
We propose that conceptual uncertainty can be<br />
assessed by adopting a protocol based on the six elements<br />
shown in Fig. 3. The central aim is threefold: (i) to establish<br />
a number of plausible conceptual models, with a range<br />
that adequately samples the space of possible conceptual<br />
models; (ii) to evaluate the tenability of each conceptual<br />
model and of the overall range of models selected in relation<br />
to the perceived uncertainty on model structure;<br />
and (iii) to propagate the uncertainties in each case.<br />
STEP 1: Formulate a conceptual model. A conceptual<br />
model is established. Since we have defined a conceptual<br />
model as a combination of our qualitative process<br />
understanding and the simplifications acceptable for a<br />
particular modelling study, a conceptual model becomes<br />
highly site-specific and even case-specific.<br />
Fig. 3. Protocol for assessing conceptual model uncertainty. Boxes, in order: Formulate a conceptual model; Set up and calibrate model; Sufficient conceptual models; Perform validation tests and accept/reject models; Evaluate tenability and completeness of conceptual models; Make model predictions and assess uncertainty.<br />
For example, a conceptual model of an aquifer may be described as<br />
two-dimensional for a study focussing on regional<br />
groundwater heads, while it may need to include three-dimensional<br />
geological structures for detailed simulation<br />
of contaminant transport. Formulating a new conceptual<br />
model may involve changing or refining the model<br />
structure, e.g. by modifying the hydrogeological interpretations<br />
(in the case of groundwater models), dimensionality,<br />
temporal and spatial resolution, initial and<br />
boundary conditions and process descriptions (governing<br />
equations).<br />
STEP 2: Set up and calibrate model. On the basis of<br />
the formulated conceptual model a site- and case-specific<br />
model is set up. Subsequently the model is calibrated<br />
and the model parameter uncertainty assessed.<br />
For the purposes of ‘interpolation’ (i.e. relevant observations<br />
are available), the parameter uncertainty can<br />
reasonably be constrained through calibration. However,<br />
for the case of ‘extrapolation’, the risk of calibrating<br />
model parameters for prediction of unobserved<br />
variables is that the model becomes biased for the unobserved<br />
variable.<br />
STEP 3: Sufficient conceptual models. The first two<br />
steps are repeated until sufficient conceptual models<br />
are included. This judgement will be influenced by the<br />
practical constraints on including additional models<br />
and the desire to include additional conceptual models<br />
that are substantially different from those already<br />
included.<br />
STEP 4: Perform validation tests (to the extent data<br />
availability allows). In order to evaluate how well the<br />
models describe the system in question, the performance<br />
of each model is tested by comparing<br />
model predictions with independent field data, i.e. data<br />
not used for calibration. This may be achieved by splitting<br />
the sample data into a calibration and validation<br />
set, or, alternatively, by cross-validation (e.g. bootstrapping:<br />
[15]) against ‘independent data’. The models whose<br />
predictive capability is deemed low are discarded and<br />
the reasons for these predictive failures are explored,<br />
where possible, for insight into the origins of structural<br />
uncertainty. In ‘extrapolation’ cases, data will usually<br />
not be available for validation tests and STEP 4 must<br />
be skipped. However, in some cases, it is possible to test<br />
‘intermediate’ model results. For example a groundwater<br />
model aimed at prediction of concentration values<br />
can often be tested against groundwater head and discharge<br />
data, or sparse concentration data may be available<br />
for parts of the study area.<br />
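The accept/reject screening in STEP 4 can be sketched as follows, assuming a split-sample setting in which each candidate model has already been calibrated and then run over an independent validation period. The root mean square error as performance measure, the threshold `max_rmse` and the function names are illustrative assumptions; the protocol itself does not prescribe a particular criterion.

```python
import math


def rmse(observed, simulated):
    """Root mean square error between two equally long series."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared) / len(squared))


def screen_models(observed_validation, simulations, max_rmse):
    """Split candidate models into accepted and rejected sets.

    observed_validation -- field data NOT used for calibration
    simulations         -- dict: model name -> simulated validation series
    max_rmse            -- case-specific acceptance threshold (assumed)
    """
    accepted, rejected = {}, {}
    for name, simulated in simulations.items():
        score = rmse(observed_validation, simulated)
        target = accepted if score <= max_rmse else rejected
        target[name] = score
    return accepted, rejected


observed = [1.0, 2.0, 3.0]
candidates = {
    "model_A": [1.0, 2.0, 3.0],
    "model_B": [2.0, 3.0, 4.0],
    "model_C": [3.0, 4.0, 5.0],
}
accepted, rejected = screen_models(observed, candidates, max_rmse=1.5)
```

The rejected set is kept rather than discarded silently, so that the reasons for predictive failure can be explored as the step recommends.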
STEP 5: Evaluate tenability and completeness of conceptual<br />
models. The aim of this step is to analyse the<br />
retained models with respect to their predictive bias<br />
and uncertainty. This has two elements: (i) to evaluate<br />
the tenability of each conceptual model; and (ii) as far<br />
as possible, to evaluate the extent to which the retained<br />
models represent the space of plausible conceptual models.<br />
Table 1<br />
Pedigree matrix for evaluating the tenability of a conceptual model<br />
Score | Supporting empirical evidence: Proxy | Supporting empirical evidence: Quality and quantity | Theoretical understanding | Representation of understood underlying mechanisms | Plausibility | Colleague consensus<br />
4 | Exact measures of the modelled quantities | Controlled experiments and large sample direct measurements | Well-established theory | Model equations reflect high mechanistic process detail | Highly plausible | All but cranks<br />
3 | Good fits or measures of the modelled quantities | Historical/field data, uncontrolled experiments, small sample direct measurements | Accepted theory with partial nature (in view of the phenomenon it describes) | Model equations reflect acceptable mechanistic process detail | Reasonably plausible | All but rebels<br />
2 | Well correlated but not measuring the same thing | Modelled/derived data, indirect measurements | Accepted theory with partial nature and limited consensus on reliability | Aggregated parameterised meta model | Somewhat plausible | Competing schools<br />
1 | Weak correlation but commonalities in measure | Educated guesses, indirect approx. rule of thumb estimate | Preliminary theory | Grey box model | Not very plausible | Embryonic field<br />
0 | Not correlated and not clearly related | Crude speculation | Crude speculation | Black box model | Not at all plausible | No opinion<br />
The tenability of the conceptual models is evaluated<br />
through expert reviews. First, the strength of the tenability<br />
of each conceptual model is evaluated by using the<br />
pedigree matrix in Table 1. A structured procedure for<br />
the elicitation of pedigree scores is given by Van der Sluijs<br />
et al. [47]. Note that there is no need to arrive at a<br />
consensus pedigree score for each criterion: if experts<br />
disagree on the pedigree scores for a given model, this<br />
reflects further epistemological uncertainty surrounding<br />
that model. Next, the adequacy of the retained conceptual<br />
models to represent the range of plausible models is<br />
evaluated. This is an assessment of whether the space of<br />
the retained conceptual models is sufficient to encapsulate<br />
the relevant range of plausible conceptual models<br />
without becoming impractical. This has strong similarities<br />
to Dunn’s concept of context validation [14]. Context<br />
validity refers to the validity of inferences that we<br />
have estimated the proximal range of rival hypotheses.<br />
Context validation can be performed by a bottom-up<br />
process to elicit from experts rival hypotheses on causal<br />
relations governing the dynamics of a system. One could<br />
argue that an infinite number of conceivable models<br />
might exist. However, it has been shown in projects<br />
where such elicitation processes were used, that the<br />
cumulative distribution of unique rival models flattens<br />
out after consultation of a limited number of experts,<br />
usually somewhere between 20 and 25 when chosen with<br />
diverse enough backgrounds [27].<br />
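The flattening of the cumulative distribution of unique rival models can be checked with a minimal sketch. It assumes that each expert's elicited hypotheses have already been coded into comparable labels, which is itself a judgement step. The function names and the `window` parameter are hypothetical.

```python
def cumulative_unique_hypotheses(elicited):
    """Count distinct rival hypotheses after each consulted expert.

    elicited -- list of iterables, one per expert, in consultation order;
                each item is a label identifying a rival hypothesis
    """
    seen = set()
    counts = []
    for hypotheses in elicited:
        seen.update(hypotheses)
        counts.append(len(seen))
    return counts


def has_flattened(counts, window):
    """True if the last `window` experts added no new hypothesis."""
    if window < 1 or len(counts) <= window:
        return False
    return counts[-1] == counts[-1 - window]


answers = [["h1", "h2"], ["h2", "h3"], ["h1"], ["h3"]]
counts = cumulative_unique_hypotheses(answers)
flat = has_flattened(counts, window=2)
```

A flat tail indicates only that the consulted group has stopped producing new hypotheses; it does not show that the space of plausible models has been covered.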
STEP 6: Make model predictions and assess uncertainty.<br />
Together with model predictions of the desired<br />
variables, uncertainty assessments are carried out. This<br />
will typically include uncertainty in input data and<br />
parameter values in addition to the conceptual uncertainty.<br />
Furthermore, the goodness of the assessed predictive<br />
uncertainty associated with the model structure should be<br />
evaluated on the basis of the goodness of the conceptual<br />
models, as assessed in STEP 5.<br />
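One simple way to summarise STEP 6 across the retained conceptual models is sketched below. It assumes that each model already provides a prediction interval reflecting its own input and parameter uncertainty. Taking the envelope of these intervals is an illustrative choice, not a method prescribed by the protocol, and all names are hypothetical.

```python
def structural_envelope(intervals):
    """Envelope of per-model prediction intervals.

    intervals -- dict: model name -> (lower, upper) interval that already
                 accounts for input and parameter uncertainty (assumed)
    """
    if not intervals:
        raise ValueError("at least one retained model is required")
    lower = min(lo for lo, _ in intervals.values())
    upper = max(hi for _, hi in intervals.values())
    return lower, upper


def central_spread(central_predictions):
    """Range of central predictions, a crude indicator of the
    contribution of model structure to predictive uncertainty."""
    values = list(central_predictions.values())
    if not values:
        raise ValueError("at least one retained model is required")
    return max(values) - min(values)


envelope = structural_envelope({"model_A": (1.0, 3.0), "model_B": (2.0, 5.0)})
spread = central_spread({"model_A": 2.0, "model_B": 3.5})
```

The envelope is conditional on the retained set of models, so it should be reported together with the tenability assessment from STEP 5.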
4. Discussion and conclusions<br />
4.1. Methodologies to assess conceptual uncertainty<br />
As discussed above, the existing strategies fall into<br />
two main categories, each with limitations. The strategies<br />
where model structure errors are assessed from<br />
observed data are confined to interpolation cases,<br />
understood as cases where the model can be calibrated<br />
and validated against field data for the variables of predictive<br />
interest and where the natural system does not<br />
undergo structural change. The strategies used for situations<br />
involving extrapolation depend either on multiple<br />
conceptual models (preferred) or on expert elicitation or<br />
pedigree analysis for a single conceptual model (usually<br />
less preferred).<br />
The novelty of our proposed framework is the combination<br />
of multiple conceptual models and the pedigree
approach for assessing the overall tenability of these<br />
models in one formalised protocol. Some of our proposed<br />
steps are similar to other approaches for dealing<br />
with equifinality, multiple possible models and the<br />
rejection of non-behavioural models [6,31]. Other steps<br />
are based on qualitative approaches, including expert<br />
knowledge in a structured manner [20,49]. The aim of<br />
our new framework is not to identify the ‘‘true’’ model<br />
structure or the cause of the errors in the existing model<br />
structure. Instead, we propose an approach that integrates<br />
different types of knowledge, not previously combined,<br />
such as quantitative and qualitative uncertainty,<br />
to estimate the impact of model structure uncertainty<br />
on model predictions.<br />
The GLUE approach (generalised likelihood uncertainty<br />
estimation, [6,7]) also operates with a range of<br />
alternative models. Although almost all applications<br />
of GLUE reported so far operate with only one model<br />
structure and many alternative model parameter sets, it<br />
is possible to use GLUE with alternative model structures<br />
[24]. In addition to prescribing multiple conceptual<br />
models, an important difference between our<br />
proposed approach and GLUE is that we recommend<br />
that parameter optimisation be conducted as part of the calibration<br />
in order to take full advantage of the information<br />
in field data. There are different opinions about<br />
whether calibration by parameter optimisation is advisable<br />
or not. The main advantage of calibration is that it<br />
improves the ability of the model to reproduce hydrological<br />
behaviour of a system within the limits of<br />
observed behaviour [31]. An important by-product is<br />
that it provides useful information about the uncertainty<br />
of model parameters. The disadvantage is that<br />
parameter optimisation may result in biased parameter<br />
values to compensate for errors in model structure and<br />
that many parameter sets (i.e. many models) perform<br />
more or less equally well but provide different results.<br />
In implementing our framework, model calibration<br />
might be skipped and many models with different<br />
parameter sets retained, as in the GLUE approach.<br />
We do not advocate such an approach,<br />
partly for pragmatic reasons (very large computational<br />
requirements) and partly because we aim to focus<br />
on model structure uncertainty rather than parameter<br />
uncertainty.<br />
Although intended for use in a very different context,<br />
the central aim behind our proposed protocol is similar<br />
to the approach of IPCC [22], who assign a level of confidence<br />
to their assessment of climate change by evaluating<br />
predictions from multiple models. The level of<br />
confidence placed in a particular finding reflects both<br />
the degree of consensus amongst modellers and the<br />
quantity of evidence that is available to support the finding.<br />
IPCC [22] classifies the confidence qualitatively in<br />
three levels: (i) ‘well established’, (ii) ‘evolving’ and (iii)<br />
‘speculative’.<br />
4.2. Critical issues for implementing the new protocol<br />
4.2.1. Performance criteria – threshold for accepting/<br />
rejecting models<br />
A critical issue in relation to acceptance/rejection of<br />
models (STEP 4 above) is how to define performance<br />
criteria. We agree with Beven [7] that any conceptual<br />
model is (known to be) wrong in an absolute sense,<br />
and hence that any model will be rejected if we investigate<br />
it in sufficient detail and specify very high performance<br />
criteria. On the other hand, the whole point in<br />
modelling is to simplify.<br />
A good reference for model performance is to compare<br />
it with uncertainties of the available field observations.<br />
If the model performance is within this<br />
uncertainty range we may characterise the model as<br />
good enough. However, usually it is less straightforward.<br />
For example, how wide should the confidence<br />
bands be before we reject models or accept them within<br />
observational uncertainties – ranges corresponding to<br />
65%, 95% or 99%? Indeed, the differences between<br />
95% and 99% may be significant in practical terms. Do<br />
we always then reject a model if it cannot perform<br />
within the observational uncertainty range? How reasonable<br />
are our estimates of uncertainty in observations?<br />
In many cases, even the results from less<br />
accurate models may be very useful.<br />
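The sensitivity to the chosen band width can be made concrete with a minimal sketch. It assumes that observation errors are Gaussian with a known standard deviation `obs_std`, which is a strong assumption introduced here for illustration. The function name is hypothetical.

```python
from statistics import NormalDist


def fraction_within_band(observed, simulated, obs_std, coverage):
    """Share of simulated values inside a symmetric observation band.

    obs_std  -- assumed standard deviation of the observation error
    coverage -- nominal two-sided coverage of the band, e.g. 0.95
    """
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    # Two-sided band: half-width is z * sigma for the given coverage.
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)
    half_width = z * obs_std
    inside = sum(
        1 for o, s in zip(observed, simulated) if abs(o - s) <= half_width
    )
    return inside / len(observed)


observed = [0.0, 0.0, 0.0, 0.0]
simulated = [0.5, 1.5, 2.2, 3.0]
shares = {
    level: fraction_within_band(observed, simulated, 1.0, level)
    for level in (0.65, 0.95, 0.99)
}
```

For the same residuals, the share of values judged acceptable changes with the coverage level, which is why the text treats the choice of band as a decision that cannot be settled on statistical grounds alone.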
Another reference for what is acceptable accuracy is<br />
the use of a benchmark model as discussed by e.g. Seibert<br />
[41]. The difficulty is then transferred to selecting<br />
an appropriate benchmark.<br />
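A comparison against a benchmark model can be expressed as a skill score, sketched here in a form analogous to the Nash-Sutcliffe efficiency with the benchmark series in place of the observed mean. Whether this matches the exact formulation in [41] is an assumption, and the function name is hypothetical.

```python
def benchmark_efficiency(observed, simulated, benchmark):
    """Skill of a model relative to a benchmark model.

    Returns 1 for a perfect model, 0 for a model that is only as good
    as the benchmark, and negative values for a model that is worse.
    """
    if not (len(observed) == len(simulated) == len(benchmark)) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    model_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    bench_error = sum((o - b) ** 2 for o, b in zip(observed, benchmark))
    if bench_error == 0:
        raise ValueError("benchmark reproduces the observations exactly")
    return 1.0 - model_error / bench_error


observed = [1.0, 2.0, 3.0]
benchmark = [2.0, 2.0, 2.0]
perfect = benchmark_efficiency(observed, observed, benchmark)
no_gain = benchmark_efficiency(observed, benchmark, benchmark)
```

The score is only as informative as the benchmark itself, which is the difficulty the text points to.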
Our answer is that the decision on performance criteria<br />
must, in general, be taken in a socio-economic context,<br />
for which predictive uncertainties must be clearly<br />
explained and open to interpretation beyond small<br />
groups of scientists. Thus, we believe that the accuracy<br />
criteria cannot be decided universally by modellers or<br />
researchers, but must be different from case to case<br />
depending on the nature of a decision and the risks<br />
involved.<br />
4.2.2. Qualitative assessment of tenability of conceptual<br />
models<br />
Pedigree analysis structures the critical appraisal of<br />
alternative model structures and provides insight into the<br />
state of knowledge on which each of the conceivable<br />
model structures is based. However, it does not give<br />
an indication of the relative quality of the various model<br />
structures. With reference to Table 1, the pedigree analysis<br />
for a simple statistical model (A) and a complex<br />
mechanistic model (B) could, for example, result in<br />
statements like:<br />
• Model A is weakly correlated to the predicted variable<br />
(Proxy, score 1), based on a large sample of<br />
direct measurements (Quality and quantity, score
4), built on a preliminary theory and a black box<br />
model (Theoretical understanding, score 1; Representation<br />
of mechanisms, score 1), somewhat plausible<br />
(Plausibility, score 2) and controversial among colleagues<br />
(Colleague consensus, score 2);<br />
• Model B exactly addresses the desired predictive variable<br />
(Proxy, score 4), is based on data with rule of<br />
thumb estimates (Quality and quantity, score 1), built<br />
on a well-established theory with model equations<br />
reflecting high process details (Theoretical understanding,<br />
score 4; Representation of mechanisms,<br />
score 4), reasonably plausible and accepted by all colleagues<br />
except rebels (Plausibility and Colleague consensus,<br />
score 3).<br />
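The two example statements can be recorded in a simple data structure, sketched below. The scores are copied from the statements above as written; the criterion keys and helper functions are hypothetical. The sketch deliberately does not aggregate the scores into a single number, since the pedigree analysis is not meant to rank the models.

```python
CRITERIA = (
    "proxy",
    "quality_and_quantity",
    "theoretical_understanding",
    "representation_of_mechanisms",
    "plausibility",
    "colleague_consensus",
)

# Scores as stated in the two example statements (scale 0 to 4).
MODEL_A = dict(zip(CRITERIA, (1, 4, 1, 1, 2, 2)))
MODEL_B = dict(zip(CRITERIA, (4, 1, 4, 4, 3, 3)))


def check_scores(scores):
    """Verify that every criterion has an integer score from 0 to 4."""
    for criterion in CRITERIA:
        value = scores[criterion]
        if not isinstance(value, int) or not 0 <= value <= 4:
            raise ValueError(f"invalid score for {criterion}: {value!r}")
    return True


def weakest_criteria(scores):
    """Criteria with the lowest score, i.e. the weakest parts of the
    knowledge base underlying a conceptual model."""
    check_scores(scores)
    lowest = min(scores[c] for c in CRITERIA)
    return sorted(c for c in CRITERIA if scores[c] == lowest)


weak_a = weakest_criteria(MODEL_A)
weak_b = weakest_criteria(MODEL_B)
```

Where experts disagree, one such record per expert could be kept instead of a consensus record, since disagreement itself carries information.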
Such statements cannot be integrated in a quantitative<br />
uncertainty analysis in terms of probabilities, but<br />
they should be available as the best possible scientifically<br />
based characterisation of uncertainties and as such be<br />
made available to those involved in the decision making<br />
process.<br />
Furthermore, as the selected conceptual models can<br />
never cover all possibilities, but instead cover a limited<br />
range, it is important to emphasise that the overall<br />
uncertainty of model predictions cannot be assessed in<br />
an absolute sense, only in a conditional or relative sense<br />
[7,31]. Our suggested method does not alter this fundamentally.<br />
However, we believe that the outcome of the<br />
proposed formalised review is a qualitative assessment<br />
that is more useful in a decision making context than<br />
unstructured information, or verbose information from<br />
scientific outlets that is not always available to the decision<br />
maker. The challenge is to design environmental<br />
management strategies that are robust against the uncertainties<br />
identified. Inclusion of a wider range of conceivable<br />
model structures may help to anticipate surprises<br />
that would have been overlooked otherwise.<br />
4.2.3. Different degrees of extrapolation<br />
Our proposed framework deals with situations where<br />
predictions involve extrapolations beyond available field<br />
data. However, there are different degrees of extrapolation<br />
(Fig. 2). If we look at the situation where a three-dimensional<br />
groundwater model is calibrated against<br />
groundwater head and discharge data, predicting<br />
groundwater recharge to a given layer is a smaller<br />
extrapolation than predicting groundwater<br />
age or contaminant concentration. In both situations,<br />
model predictions are carried out for variables that have<br />
not been used as calibration targets and for which no<br />
traditional split-sample validation tests are possible.<br />
The type of validation test recommended for such situations<br />
is a proxy-basin test, which, according to the principles<br />
in Klemes [26] and Refsgaard [38], for instance,<br />
could imply that validation tests have to be conducted<br />
in two similar catchments where relevant data (e.g. concentrations)<br />
exist, and where such data are not used for<br />
calibration. The residuals in the other catchments can<br />
then be seen as a measure of the uncertainty to be<br />
expected in the catchment of interest.<br />
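The use of proxy-basin residuals as a measure of expected uncertainty can be sketched as follows. Pooling the residuals from the proxy catchments into one root mean square error is an illustrative choice made here, and the catchment names and function name are hypothetical.

```python
import math


def pooled_rmse(residuals_by_catchment):
    """Root mean square of validation residuals pooled over catchments.

    residuals_by_catchment -- dict: catchment name -> list of residuals
        (simulated minus observed) for the variable of interest, from
        data not used for calibration
    """
    pooled = [
        r for residuals in residuals_by_catchment.values() for r in residuals
    ]
    if not pooled:
        raise ValueError("no residuals supplied")
    return math.sqrt(sum(r * r for r in pooled) / len(pooled))


expected_error = pooled_rmse({
    "proxy_catchment_1": [3.0, -3.0],
    "proxy_catchment_2": [3.0, -3.0],
})
```

The transfer of this value to the catchment of interest rests on the similarity of the catchments, which has to be argued separately.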
If model predictions are made for groundwater heads<br />
in cases involving groundwater abstraction, and the<br />
existing data available for calibration and validation<br />
tests do not include such abstraction, we also have an<br />
extrapolation case, although of a different nature. In this<br />
case we have data for the variable of predictive interest,<br />
but the catchment characteristics are non-stationary.<br />
This corresponds to the situation of model validation<br />
denoted by a differential split-sample test [26,38]. The<br />
differential split-sample test scheme recommended by<br />
Klemes also operates by tests on similar catchments<br />
where data for the type of non-stationary situation exist.<br />
Differential split-sample tests are often less demanding<br />
than proxy-basin tests [36]. A similar type of differential<br />
split-sample situation arises when predictions are<br />
required for a system in which structural change is<br />
expected (e.g. [50,4]).<br />
In cases where the conceptual models can be transferred<br />
to other catchments in a reliable and reproducible<br />
way, such proxy-basin and differential split-sample tests<br />
could be conducted and the results used to evaluate the<br />
goodness of the underlying conceptual models. It is<br />
worth noting that Klemes’ test schemes, which also<br />
apply for cases of extrapolation, operate with tests for<br />
two alternative catchments. This has clear similarities<br />
with our strategy of recommending the use of multiple<br />
conceptual models.<br />
4.3. Perspectives<br />
In many cases where environmental models are used<br />
to make predictions that are extrapolations beyond the<br />
calibration base, no suitable framework exists for assessing<br />
the effects of model structure error. The proposed<br />
framework is composed of elements originating from<br />
different scientific disciplines. The elements are well<br />
tested individually, but not previously applied in such<br />
an integrated manner for water resources or environmental<br />
modelling applications. The full framework still<br />
needs to be tested in real-life cases.<br />
Acknowledgement<br />
For the three authors from GEUS and UVA the present<br />
work was supported by the Project ‘Harmonised<br />
Techniques and Representative River Basin Data for<br />
Assessment and Use of Uncertainty Information in<br />
Integrated Water Management’ (www.harmonirib.com),<br />
which is partly funded by the EC Energy, Environment<br />
and Sustainable Development programme (Contract<br />
EVK1-2002-00109). The constructive comments of
Hoshin V. Gupta and two anonymous reviewers are<br />
acknowledged.<br />
Appendix. Terminology<br />
The terminology used is mainly based on Refsgaard<br />
and Henriksen [39]:<br />
Reality: The system that we aim to represent with the<br />
model, understood here as the study area.<br />
Conceptual model: A representation of ‘reality’ in<br />
terms of verbal descriptions, equations, governing relationships<br />
or ‘natural laws’ that purport to describe reality.<br />
This is the user’s perception of the key hydrological<br />
and ecological processes in the study area (perceptual<br />
model) and the corresponding simplifications and<br />
numerical accuracy limits that are assumed acceptable<br />
in order to achieve the purpose of the modelling. A conceptual<br />
model therefore includes a mathematical<br />
description (equations) of assumed processes and a<br />
description of the objects they interact with, including<br />
river system elements, ecological structures, geological<br />
features, etc. that are required for the particular purpose<br />
of modelling.<br />
Model code: A generic mathematical description of a<br />
conceptual model, implemented in a computer program.<br />
It is generic in the sense that, without program changes,<br />
it can be used to establish a model with the same basic<br />
type of equations (but allowing different input variables<br />
and parameter values) for a different study area.<br />
Model: A case-specific tailored version of a model<br />
code established for a particular study area and set of<br />
modelling objectives (output variables) including specific<br />
input data and parameter values.<br />
Model confirmation: Determination of the adequacy<br />
of the conceptual model to provide an acceptable performance<br />
for the domain of intended application.<br />
Code verification: Substantiation that a model code<br />
adequately represents a conceptual model within certain<br />
specified limits or ranges of application and corresponding<br />
ranges of accuracy.<br />
Model calibration: The procedure of adjusting the<br />
parameter values of a model in such a way that the<br />
model reproduces an observed response of the system<br />
represented in the model within the range of accuracy<br />
specified in the performance criteria.<br />
Model validation: Substantiation that a model, within<br />
its domain of applicability, possesses a satisfactory<br />
range of accuracy, consistent with the intended application<br />
of the model. Note that various authors have criticised<br />
the use of the word validation for predictive<br />
models because universal validation of a model is in<br />
principle impossible and therefore prefer to use the term<br />
model evaluation [32,3]. In our definition [39] the term<br />
validation is not used in a universal sense, but is always<br />
restricted to clearly defined domains of applicability and<br />
performance accuracy (‘numerical universal’ in Popperian<br />
sense).<br />
Pedigree: Pedigree conveys an evaluative account of<br />
the production process of information, and indicates different<br />
aspects of the underpinning and scientific status<br />
of the knowledge used. Pedigree is expressed by means<br />
of a set of pedigree criteria to assess these different<br />
aspects. Criteria for model parameter pedigree are for<br />
instance proxy representation, empirical basis, methodological<br />
rigor, theoretical understanding and validation.<br />
Assessment of pedigree involves qualitative expert<br />
judgement. To minimise arbitrariness and subjectivity<br />
in measuring strength, a pedigree matrix is used to code<br />
qualitative expert judgements for each criterion into a<br />
discrete numeral scale from 0 (weak) to 4 (strong) with<br />
linguistic descriptions (modes) of each level on the scale<br />
[49].<br />
References<br />
[1] Aller LT, Bennet T, Lehr JH, Petty RJ. DRASTIC: a standardized<br />
system for evaluating ground water pollution potential using<br />
hydrogeologic setting, US EPA Robert S. Kerr Environmental<br />
Research Laboratory, EPA/600/287/035, Ada, OK, 1987.<br />
[2] Babendreier JE. National-scale multimedia risk assessment for<br />
hazardous waste disposal. In: International workshop on uncertainty,<br />
sensitivity and parameter estimation for multimedia<br />
environmental modelling held at US Nuclear Regulatory Commission,<br />
Rockville (MD), August 19–21, 2003. Proceedings, pp.<br />
103–9.<br />
[3] Beck MB. Model evaluation and performance. In: El-Shaarawi<br />
AH, Piegorsch WW, editors. Encyclopedia of environmetrics, vol.<br />
3. Chichester: John Wiley & Sons, Ltd; 2002. p. 1275–9.<br />
[4] Beck MB. Environmental foresight and structural change. Environ<br />
Modell Software 2005;20:651–70.<br />
[5] Beck MB, van Straten G, editors. Uncertainty and forecasting of<br />
water quality. Springer-Verlag; 1983.<br />
[6] Beven K, Binley AM. The future of distributed models, model<br />
calibration and uncertainty predictions. Hydrol Process<br />
1992;6:279–98.<br />
[7] Beven K. Towards a coherent philosophy for modelling the<br />
environment. Proc Roy Soc London, A 2002;458(2026):<br />
2465–84.<br />
[8] Butts MB, Payne JT, Kristensen M, Madsen H. An evaluation of<br />
the impact of model structure on hydrological modelling uncertainty<br />
for streamflow prediction. J Hydrol 2004;298:242–66.<br />
[9] Carle SF, Fogg GE. Transition probability-based indicator<br />
geostatistics. Math Geol 1996;28(4):453–77.<br />
[10] Carle SF, Fogg GE. Modeling spatial variability with one and<br />
multidimensional continuous-lag Markov chains. Math Geol<br />
1997;29(7):891–917.<br />
[11] Copenhagen County. Pilot project on establishment of methodology<br />
for zonation of groundwater vulnerability. In: Proceedings<br />
from seminar on groundwater zonation, November 7, 2000,<br />
County of Copenhagen [in Danish].<br />
[12] Craye M, van der Sluijs JP, Funtowicz S. A reflexive approach to<br />
dealing with uncertainties in environmental health risk science and<br />
policy. Int J Risk Assess Manage 2005;5(2):216–36.<br />
[13] Dubus IG, Brown CD, Beulke S. Sources of uncertainty in<br />
pesticide fate modelling. Sci Total Environ 2003;317:53–72.<br />
[14] Dunn W. Using the method of context validation to mitigate type<br />
III errors in environmental policy analysis. In: Hisschemöller M,
Hoppe HV, Dunn W, Ravetz J, editors. Knowledge, power and<br />
participation in environmental policy. Policy studies review<br />
annual, vol. 12. New Jersey (USA): Transaction Publishers. p.<br />
417–36.<br />
[15] Efron B, Tibshirani RJ. An introduction to the bootstrap.<br />
Monographs on statistics and applied probability. New<br />
York: Chapman and Hall; 1993.<br />
[16] Franchini M, Pacciani M. Comparative analysis of several<br />
conceptual rainfall-runoff models. J Hydrol 1992;122:161–219.<br />
[17] Funtowicz SO, Ravetz JR. Uncertainty and quality in science for<br />
policy. Dordrecht: Kluwer; 1990. p. 229.<br />
[18] Harrar WG, Sonnenborg TO, Henriksen HJ. Capture zone, travel<br />
time and solute transport predictions using inverse modelling and<br />
different geological models. Hydrogeol J 2003;11(5):536–48.<br />
[19] Hodgson AM. Hexagons for systems thinking. Eur J Oper Res<br />
1992;59:220–30.<br />
[20] Hora SC. Acquisition of expert judgement: examples from risk<br />
assessment. J Energy Eng 1992;118:136–48.<br />
[21] Højberg AL, Refsgaard JC. Model uncertainty – parameter<br />
uncertainty versus conceptual models. Water Sci Technol<br />
2005;52(6):177–86.<br />
[22] IPCC. Climate change 2001: the scientific basis. Contribution of<br />
working group I to the third assessment report of the Intergovernmental<br />
Panel on Climate Change [Houghton JT, Ding Y, Griggs<br />
DJ, Noguer M, van der Linden PJ, Dai X, Maskell K, Johnson<br />
CA, editors]. Cambridge University Press, Cambridge (UK) and<br />
New York (NY, USA). p. 881.<br />
[23] Jakeman AJ, Letcher RA. Integrated assessment and modelling:<br />
features, principles and examples for catchment management.<br />
Environ Modell Software 2003;18:491–501.<br />
[24] Jensen JB. Parameter and uncertainty estimation in groundwater<br />
modelling. PhD thesis, Department of Civil Engineering, Aalborg<br />
University, Series Paper no. 23, 2003.<br />
[25] Keith DW. When is it appropriate to combine expert judgements?<br />
Clim Change 1996;33:139–43.<br />
[26] Klemes V. Operational testing of hydrological simulation models.<br />
Hydrol Sci J 1986;31:13–24.<br />
[27] Kloprogge P, van der Sluijs JP. The inclusion of stakeholder<br />
knowledge and perspectives in integrated assessment of climate<br />
change. Climatic Change, in press.<br />
[28] Linkov I, Burmistrov D. Model uncertainty and choices made by<br />
modelers: lessons learned from the International Atomic Energy<br />
Agency model intercomparisons. Risk Anal 2003;23(6):1297–308.<br />
[29] Meyer PD, Ye M, Neuman SP, Cantrell KJ. Combined estimation<br />
of hydrogeologic conceptual model and parameter uncertainty.<br />
NUREG/CR-6843 Report, NRC, Washington, DC, 2004.<br />
[30] National Research Council. Conceptual models of flow and<br />
transport in the vadose zone. Washington, DC: National Academy<br />
Press; 2001.<br />
[31] Neuman SP, Wierenga PJ. A comprehensive strategy of hydrogeologic<br />
modeling and uncertainty analysis for nuclear facilities<br />
and sites. University of Arizona, Report NUREG/CR-6805,<br />
2003.<br />
[32] Oreskes N, Shrader-Frechette K, Belitz K. Verification, validation,<br />
and confirmation of numerical models in the Earth Sciences.<br />
Science 1994;263:641–6.<br />
[33] Pahl-Wostl C. Towards sustainability in the water sector – the<br />
importance of human actors and processes of social learning.<br />
Aquat Sci 2002;64:394–411.<br />
[34] Poeter E, Anderson D. Multimodel ranking and inference in ground<br />
water modeling. Ground Water 2005;43(4):597–605.<br />
[35] Radwan M, Willems P, Berlamont J. Sensitivity and uncertainty<br />
analysis for river quality modelling. J Hydroinform 2004:83–99.<br />
[36] Refsgaard JC, Knudsen J. Operational validation and intercomparison<br />
of different types of hydrological models. Water<br />
Resources Res 1996;32(7):2189–202.<br />
[37] Refsgaard JC, Hansen LK, Vahman M. Groundwater zonation in<br />
Copenhagen County – Intercomparison of thematic results from<br />
different consultants. In: Seminar on groundwater zonation,<br />
County of Copenhagen, November 7, 2000 [in Danish].<br />
[38] Refsgaard JC. Towards a formal approach to calibration and<br />
validation of models using spatial data. In: Grayson R, Blöschl G,<br />
editors. Spatial patterns in catchment hydrology: observations<br />
and modelling. Cambridge University Press; 2001. p. 329–54.<br />
[39] Refsgaard JC, Henriksen HJ. Modelling guidelines – terminology<br />
and guiding principles. Adv Water Resources 2004;27:71–82.<br />
[40] Refsgaard JC, Storm B. MIKE SHE. In: Singh VP, editor.<br />
Computer models of watershed hydrology. Water Resources<br />
Publication; 1995. p. 809–46.<br />
[41] Seibert J. On the need for benchmarks in hydrological modelling.<br />
Hydrol Process 2001;15(6):1063–4.<br />
[42] Selroos JO, Walker DD, Strom A, Gylling B, Follin S. Comparison<br />
of alternative modelling approaches for groundwater flow in<br />
fractured rock. J Hydrol 2001;257:174–88.<br />
[43] Troldborg L. The influence of conceptual geological models on<br />
the simulation of flow and transport in quaternary aquifer<br />
systems. PhD Thesis. Geological Survey of Denmark and Greenland,<br />
Report 2004/107.<br />
[44] Usunoff E, Carrera J, Mousavi SF. An approach to the design of<br />
experiments for discriminating among alternative conceptual<br />
models. Adv Water Resources 1992;15:199–214.<br />
[45] Van Griensven A, Meixner T. Dealing with unidentifiable sources<br />
of uncertainty within environmental models. In: Pahl C, Schmidt<br />
S, Jakeman T, editors. iEMSs 2004 international congress:<br />
“Complexity and integrated resources management”. International<br />
Environmental Modelling and Software Society, Osnabrück,<br />
Germany, June 2004.<br />
[46] Van der Sluijs JP. Anchoring amid uncertainty; On the management<br />
of uncertainties in risk assessment of anthropogenic climate<br />
change, Ph.D. thesis, Utrecht University, 1997. p. 260.<br />
[47] Van der Sluijs JP, Potting J, Risbey JS, Van Vuuren D, de Vries B,<br />
Beusen A, et al. Uncertainty assessment of the IMAGE/TIMER<br />
B1 CO2 emissions scenario, using the NUSAP method. Report<br />
commissioned by the Netherlands National Research Program on<br />
global Air Pollution and Climate Change, RIVM, Bilthoven, The<br />
Netherlands, 2002. p. 225.<br />
[48] Van der Sluijs JP, Risbey JS, Kloprogge P, Ravetz JR, Funtowicz<br />
SO, Corral Quintana S, et al. RIVM/MNP Guidance for<br />
uncertainty assessment and communication: detailed guidance,<br />
report commissioned by RIVM/MNP – Copernicus Institute,<br />
Department of Science, Technology and Society, Utrecht University,<br />
Utrecht, The Netherlands, 2003. p. 71.<br />
[49] Van der Sluijs JP, Craye M, Funtowicz SO, Kloprogge P, Ravetz<br />
J, Risbey JS. Combining quantitative and qualitative measures of<br />
uncertainty in model based foresight studies: the NUSAP system.<br />
Risk Anal 2005;25(2):481–92.<br />
[50] Van Straten G, Keesman KJ. Uncertainty propagation and<br />
speculation in projective forecasts of environmental change: a<br />
lake-eutrophication example. J Forecast 1991;10:163–90.<br />
[51] Vennix JAM. Group model-building: tackling messy problems.<br />
Syst Dyn Rev 1999;15(4).<br />
[52] Visser H, Folkert RJM, Hoekstra J, De Wolff JJ. Identifying key<br />
sources of uncertainty in climate change projections. Clim Change<br />
2000;45:421–57.<br />
[53] Vrugt JA, Diks CGH, Gupta HV. Improved treatment of<br />
uncertainty in hydrologic modelling: combining the strengths of<br />
global optimization and data assimilation. Water Resources Res<br />
2005;41(1). Art No W01017.<br />
[54] Walker WE, Harremoës P, Rotmans J, Van der Sluijs JP, Van<br />
Asselt MBA, Janssen P, et al. Defining uncertainty. A conceptual<br />
basis for uncertainty management in model-based decision support.<br />
Integr Assessment 2003;4(1):5–17.