
HYDROLOGICAL MODELLING AND RIVER BASIN MANAGEMENT


Danmarks og Grønlands Geologiske Undersøgelse, Special Issue 2007

Hydrological Modelling and River Basin Management

Doctoral Thesis

Jens Christian Refsgaard

Geological Survey of Denmark and Greenland

Danish Ministry of the Environment


This thesis has been accepted by the Faculty of Natural Science at the University of Copenhagen for public defence in fulfilment of the degree of Doctor of Science.

Copenhagen, 5th January 2007

Nils O. Andersen

Dean

The defence will take place on Friday 1st June 2007 at 14:00 in Anneksauditorium A, Studiestræde 6, University of Copenhagen.

Special Issue

Author: Jens Christian Refsgaard

Illustrations: Kristian A. Rasmussen and reproductions from existing publications

Cover: Kristian A. Rasmussen

Date: January 2007

The Report is available on the internet at http://www.geus.dk/

ISBN 978-87-7871-185-4

Geological Survey of Denmark and Greenland (GEUS)

Øster Voldgade 10

DK-1350 København K

Tel: +45 38142000

Fax: +45 38142050

Email: geus@geus.dk

http://www.geus.dk/


Refsgaard JC – Doctoral Thesis January 2007

Hydrological Modelling and River Basin Management

Table of Contents

Dansk Resume

Abstract

1. Introduction

1.1 Water Resources Management and Hydrological Modelling

1.2 Objective and Content

2 Water Resources Management and the Modelling Process

2.1 Modelling as Part of the Planning and Management Process

2.2 Terminology and Scientific Philosophical Basis for the Modelling Process

2.2.1 Background

2.2.2 Terminology and guiding principles

2.2.3 Scientific philosophical aspects

2.3 Modelling Protocol

2.4 Classification of Models

3 Simulation of Hydrological Processes at Catchment Scale

3.1 Flow modelling

3.1.1 Groundwater/surface water model for the Suså catchment ([1], [2])

3.1.2 Application of SHE to catchments in India ([4], [5])

3.1.3 Intercomparison of different types of hydrological models ([6])

3.2 Reactive Transport

3.2.1 Oxygen transport and consumption in the unsaturated zone ([3])

3.2.2 An integrated model for the Danubian Lowland ([9])

3.2.3 Large scale modelling of groundwater contamination ([10])

3.3 Real-time Flood Forecasting

3.3.1 Intercomparison of updating procedures for real-time forecasting ([8])

4. Key Issues in Catchment Scale Hydrological Modelling

4.1 Scaling

4.1.1 Catchment heterogeneity

4.1.2 A scaling framework

4.1.3 Scaling - an example

4.1.4 Discussion – post evaluation

4.2 Confirmation, Verification, Calibration and Validation

4.2.1 Confirmation of conceptual model

4.2.2 Code verification

4.2.3 Model calibration

4.2.4 Model validation

4.2.5 Discussion – post evaluation

4.3 Uncertainty Assessment

4.3.1 Modelling uncertainty in a water resources management context

4.3.2 Data uncertainty

4.3.3 Parameter uncertainty

4.3.4 Model structure uncertainty

4.3.5 Discussion – post evaluation

4.4 Quality Assurance in Model based Water Management

4.4.1 Background

4.4.2 The HarmoniQuA approach

4.4.3 Organisational requirements for QA guidelines to be effective

4.4.4 Performance criteria and uncertainty – when is a model good enough

4.4.5 Discussion – post evaluation

5 Conclusions and Perspectives for Future Work

5.1 Summary of Main Scientific Contributions

5.2 Modelling Issues for Future Research

6 References


Appendices: Publications [1] – [15]

[1] Refsgaard JC, Hansen E (1982) A Distributed Groundwater/Surface Water Model for the Suså Catchment. Part 1: Model Description. Nordic Hydrology, 13, 299-310.

[2] Refsgaard JC, Hansen E (1982) A Distributed Groundwater/Surface Water Model for the Suså Catchment. Part 2: Simulations of Streamflow Depletions Due to Groundwater Abstraction. Nordic Hydrology, 13, 311-322.

[3] Refsgaard JC, Christensen TH, Ammentorp HC (1991) A model for oxygen transport and consumption in the unsaturated zone. Journal of Hydrology, 129, 349-369.

[4] Refsgaard JC, Seth SM, Bathurst JC, Erlich M, Storm B, Jørgensen GH, Chandra S (1992) Application of the SHE to catchments in India - Part 1: General results. Journal of Hydrology, 140, 1-23.

[5] Jain SK, Storm B, Bathurst JC, Refsgaard JC, Singh RD (1992) Application of the SHE to catchments in India - Part 2: Field experiments and simulation studies with the SHE on the Kolar subcatchment of the Narmada River. Journal of Hydrology, 140, 25-47.

[6] Refsgaard JC, Knudsen J (1996) Operational validation and intercomparison of different types of hydrological models. Water Resources Research, 32(7), 2189-2202.

[7] Refsgaard JC (1997) Parametrisation, calibration and validation of distributed hydrological models. Journal of Hydrology, 198, 69-97.

[8] Refsgaard JC (1997) Validation and Intercomparison of Different Updating Procedures for Real-Time Forecasting. Nordic Hydrology, 28, 65-84.

[9] Refsgaard JC, Sørensen HR, Mucha I, Rodak D, Hlavaty Z, Bansky L, Klucovska J, Topolska J, Takac J, Kosc V, Enggrob HG, Engesgaard P, Jensen JK, Fiselier J, Griffioen J, Hansen S (1998) An Integrated Model for the Danubian Lowland – Methodology and Applications. Water Resources Management, 12, 433-465.

[10] Refsgaard JC, Thorsen M, Jensen JB, Kleeschulte S, Hansen S (1999) Large scale modelling of groundwater contamination from nitrogen leaching. Journal of Hydrology, 221(3-4), 117-140.

[11] Thorsen M, Refsgaard JC, Hansen S, Pebesma E, Jensen JB, Kleeschulte S (2001) Assessment of uncertainty in simulation of nitrate leaching to aquifers at catchment scale. Journal of Hydrology, 242, 210-227.

[12] Refsgaard JC, Henriksen HJ (2004) Modelling guidelines – terminology and guiding principles. Advances in Water Resources, 27(1), 71-82.

[13] Refsgaard JC, Henriksen HJ, Harrar WG, Scholten H, Kassahun A (2005) Quality assurance in model based water management – review of existing practice and outline of new approaches. Environmental Modelling & Software, 20, 1201-1215.

[14] Refsgaard JC, Nilsson B, Brown J, Klauer B, Moore R, Bech T, Vurro M, Blind M, Castilla G, Tsanis I, Biza P (2005) Harmonised techniques and representative river basin data for assessment and use of uncertainty information in integrated water management (HarmoniRiB). Environmental Science and Policy, 8, 267-277.

[15] Refsgaard JC, van der Sluijs JP, Brown J, van der Keur P (2006) A framework for dealing with uncertainty due to model structure error. Advances in Water Resources, 29, 1586-1597.


Preface

The work presented in this thesis, together with the 15 publications published between 1982 and 2006, forms the material for evaluation for the degree of doctor scientiarum (dr. scient.) at the University of Copenhagen. The papers have all been published in peer reviewed international scientific journals. They are referred to by the numbers [1] to [15].

In the present report I have assembled and summarised my most important scientific contributions to catchment modelling, which has been my research interest during the past three decades. In this connection I wish to thank all my co-authors for a very inspiring co-operation over the years. Research does not take place in a vacuum, and without the interactions with them my work would not have been possible.

I wish to acknowledge former and present colleagues and managements at the three organisations where I have been employed. At the Institute of Hydrodynamics and Hydraulic Engineering, Technical University of Denmark (now Environment and Resources, DTU) I was given the opportunity to explore and develop new integrated groundwater/surface water catchment models at a time when hydrological modelling was still in its infancy. This showed me the enormous potential of this new field. At Danish Hydraulic Institute (now DHI Water & Environment) I was then entrusted with further development of modelling tools and with testing them in real life applications. This taught me the limitations and difficulties we encounter and the need to be humble when applying models in water resources management. Finally, the Geological Survey of Denmark and Greenland (GEUS) has provided a very inspiring scientific environment and given me the opportunity to get involved in broader international research projects that have matured many of my previous views and allowed me to assemble this work.

Special thanks go to Kristian A. Rasmussen, GEUS, for using his magic touch to polish some of the old dusty figures from the last century to make them easier to read in this thesis.

Last, but not least, I wish to thank my family for their patience and support and for accepting that I have always been too busy with this topic.

Copenhagen, January 2007

Jens Christian Refsgaard

"Life can only be understood backwards; but it must be lived forwards"

Søren Kierkegaard (1813-1855)


Dansk Resume (Danish Summary)

The publications and material in this doctoral thesis describe a series of scientific investigations of hydrological modelling at catchment scale in relation to water resources management. Each of the 15 publications focuses on parts of the overall topic, ranging from development of new concepts and model codes to model applications; from point scale to catchment scale; from modelling of water flows to transport of dissolved and reactive substances; from a focus on planning to real-time flood forecasting and further on to crosscutting issues and protocols for the modelling process itself.

Chapter 2 of the thesis presents protocols for hydrological modelling and a discussion of the interaction between hydrological modelling and water resources management. It also explains the terminology, the underlying scientific philosophy and the classification of model types used in the rest of the thesis. Chapter 3 contains summaries of model studies based on nine of the publications. The assessments of these publications' contributions to new knowledge at the time they were published, and of issues that were not addressed in the publications, show a considerable development over the last 25 years. For example, the first publications on development of new model codes contain nothing on verification of model code, validation of models against independent data or uncertainty assessments, issues that are today considered very important. The examples also illustrate how general issues such as scaling problems and model validation gradually evolved from experiences and recognised problems in model studies that actually had other purposes. Chapter 4 then presents and discusses four general issues: (a) heterogeneity and scaling; (b) confirmation, verification, calibration and validation of models; (c) uncertainty assessments; and (d) quality assurance of the modelling process.

My most important contributions to new scientific knowledge have been within the following five areas:

• New conceptual understanding and associated code development. The Suså model was based on a new understanding of the interaction between surface water and groundwater in moraine areas and brought new knowledge on how groundwater abstraction affects streams in such catchments.

• Validation of models. The work on rigorous principles for validation of models, and examples of applications for both ‘lumped conceptual’ and ‘distributed physically-based’ models, has been a cornerstone of my research during the last 15 years. In particular, the introduction of the term ‘conditional validation’ is new.

• Scaling. My work has not ‘solved’ the scaling problems, but it helps to clarify the fundamentally different methods with a focus on their respective assumptions and limitations.

• Uncertainty assessments. A considerable part of my research activity during the last 10 years has focused on uncertainty aspects. My main contribution in this context has been the introduction of broader uncertainty aspects into the entire modelling process, together with the work on model structure uncertainty.

• Protocols for hydrological modelling and quality assurance of the modelling process. The comprehensive and detailed modelling protocol developed in the HarmoniQuA project is a formalisation and implementation of experience from the preceding 25 years of work with hydrological modelling. The new elements are the focus placed on (a) the interactive dialogue between modeller, water manager, reviewer, stakeholders and the public; (b) uncertainty assessments as a continuous element throughout the modelling process; (c) model validation; and (d) introduction of experience and subjective knowledge via external reviews.


Abstract

The publications and material presented in this thesis describe a series of scientific investigations on catchment modelling in relation to water resources management. Each of the 15 publications represents parts of the overall topic, ranging from development of new concepts and model codes to model applications; from point scale to catchment scale; from flow modelling to transport and reactive modelling; from planning type applications to real-time forecasting and further on to crosscutting issues and protocols for the modelling process.

The thesis starts with a presentation of protocols for the hydrological modelling process together with a discussion of the interaction between the water resources planning and management process and the hydrological modelling process. This includes a definition of terminology, a discussion of the underlying scientific philosophy and a classification of hydrological models. The following chapter comprises summaries of cases of simulation models based on nine of the publications. The post evaluations of the contributions to scientific knowledge in the publications and the issues not taken into account in the earlier publications reveal significant developments over the years. For example, the first publications focussing on development of new model codes did not put any emphasis on rigorous verification or validation tests nor on uncertainty assessments, which are key issues today. The cases furthermore illustrate how general issues such as scaling and model validation gradually emerged from experiences and problems encountered in catchment studies that had other primary objectives. The next chapter then provides a presentation and discussion of four general issues: (a) catchment heterogeneity and scaling; (b) confirmation, verification, calibration and model validation; (c) uncertainty assessment; and (d) quality assurance in model based water management.

My main contributions to scientific knowledge have been in the following five areas:

• New conceptual understanding and code development. The Suså model was based on a new conceptual understanding of the surface water/groundwater interaction in moraine catchments and brought new insight into the effect of groundwater abstraction on streamflow in catchments with such hydrogeological characteristics.

• Model validation. The work on rather rigorous principles for model validation and the examples of their application both for lumped conceptual and distributed physically based models is a cornerstone in my research. In particular, the introduction of the term ‘conditional validation’ is novel.

• Scaling. The framework on scaling does not ‘solve’ the scaling problem but helps to clarify the applicable methodologies, with focus on their respective assumptions and limitations.

• Uncertainty assessment. During the past decade a considerable part of my research work has focussed on uncertainty aspects. I consider my main contributions in this respect to be the introduction of the broader uncertainty aspects integrated into the modelling framework and the work with model structure uncertainty.

• Modelling protocols and guidelines for quality assurance in the modelling process. The comprehensive modelling protocol developed within the HarmoniQuA project is a formalisation of experience and practices that have gradually emerged over the years. The novel elements are the emphasis on (a) the interactive dialogue between modeller, water manager, reviewer, stakeholders and the public; (b) uncertainty assessments throughout the modelling process; (c) model validation; and (d) experience and subjective knowledge introduced through external model reviews.


1. Introduction

1.1 Water Resources Management and Hydrological Modelling

"Scarcity and misuse of fresh water pose a serious and growing threat to sustainable development and protection of the environment. Human health and welfare, food security, industrial development and the ecosystems on which they depend, are all at risk, unless water and land resources are managed more effectively in the present decade and beyond than they have been in the past". (ICWE, 1992)

“The fact that the world faces a water crisis has become increasingly clear in recent years. Challenges remain widespread and reflect severe problems in the management of water resources in many parts of the world. These problems will intensify unless effective and concerted actions are taken”. (WWAP, 2003)

The first of the above quotes presents the status and the future challenges facing hydrologists and water resources managers as summarised in the introductory paragraph of the Dublin Statement on Water and Sustainable Development (ICWE, 1992). The second quote is from the first chapter of the UN World Water Development Report “Water for People, Water for Life”, which is a collaborative effort of 23 UN agencies and convention secretariats co-ordinated by the World Water Assessment Programme.

Thus the challenges in water resources management are enormous, both at the global scale as illustrated above and at smaller scales, as for instance outlined in the vision for the European water sector recently formulated by the European Water Supply and Sanitation Technology Platform (WSSTP, 2005).

The present thesis deals with hydrological modelling. It must be emphasised that modelling in itself is not sufficient to address these challenges. Modelling constitutes only one among several sets of tools that can be used to support water resources management. Computer based hydrological models have been developed and applied at an ever increasing rate during the past four decades. The key reasons for that are twofold: (a) improved models and methodologies are continuously emerging from the research community, and (b) the demand for improved tools increases with the increasing pressure on water resources. Overviews of the status and development trends in catchment scale hydrological modelling during this period can be found in Fleming (1975) and Singh (1995).

1.2 Objective and Content

The objective of this thesis is to present the contributions to scientific knowledge that have emerged from the research described in the 15 appended publications. I have structured the thesis with the aim of presenting my research contributions within a framework of catchment modelling and its application to support water resources management.


The next chapter (Chapter 2) therefore presents an overall framework of the water resources management and planning process and the modelling process and the interaction between these two processes. Here the terminology and modelling protocol are introduced and discussed. This chapter is based on publications [7], [12] and [13], i.e. mainly some of my most recent work.

Chapter 3 comprises a number of examples of simulation models ranging from point scale to catchment scale, from flow modelling to transport and reactive modelling and from planning type applications to real-time forecasting. This chapter is based on publications [1], [2], [3], [4], [5], [6], [8], [9] and [10], i.e. mainly some of my earlier work.

Chapter 4 then provides a presentation and discussion of key and cross-cutting issues in hydrological modelling such as scaling, model validation, uncertainty assessment and quality assurance. These issues, which were introduced as part of the overall framework in Chapter 2, are here discussed with reference to the experience and findings made in the publications. This chapter includes ideas, views and material from all the 15 publications, but with more emphasis on some of the more general purpose publications [6], [7], [10], [11], [12], [13], [14] and [15].

Finally, Chapter 5 contains some conclusions and perspectives for future work.

Thus I have not structured the content of this report according to the chronology of my publications [1] – [15]. The reason for this is that my most recent work provides a broader and better overview of the topic and is thus better suited for providing a framework for my earlier work.


2 Water Resources Management and the Modelling Process

2.1 Modelling as Part of the Planning and Management Process

Integrated Water Resources Management (IWRM) is “a process, which promotes the co-ordinated development and management of water, land and related resources, in order to maximise the resultant economic and social welfare in an equitable manner without compromising the sustainability of vital ecosystems” (GWP, 2000). In the EU Water Framework Directive (WFD) Guidance Document on Planning Processes, planning is defined as “a systematic, integrative and iterative process that is comprised of a number of steps executed over a specified time schedule” (EC, 2003b). All new guidelines on water resources management emphasise the importance of integrated approaches, cross-sectoral planning and public participation in the planning process (GWP, 2000; EC, 2003b; Jønch-Clausen, 2004).

Models describing water flows, water quality, ecology and economy are being developed and used in increasing number and variety to support water management decisions. The interactions between the modelling process and the water management process are illustrated in Figs. 1 and 2. Fig. 1 shows the key actors in the water management process and the five steps into which the modelling process may typically be decomposed. The organisation that commissions a modelling study is denoted the water manager. This is often the competent authority, but can also be a stakeholder such as a water supply company. The role of the government is most often limited to providing the enabling environment such as legislation, research and information infrastructure. The typical cyclic and iterative character of the water management process, such as the WFD process, is illustrated in Fig. 2, where the interaction with the modelling process is illustrated by the large circle (water management) and the four smaller supporting circles (modelling). The WFD planning process, like most other planning processes, contains four main elements:

• Identification, including assessment of present status, analysis of impacts and pressures and establishment of environmental objectives. Here modelling may be useful, for example, for supporting assessments of what the reference conditions are and what the impacts of the various pressures are (EC, 2004).

• Designing, including the set-up and analysis of a programme of measures designed to reach the environmental objectives in a cost-effective way. Here modelling will typically be used for supporting assessments of the effects and costs of various measures under consideration.

• Implementing the measures. Here on-line modelling may in some cases support the operational decisions to be made.

• Evaluation of the effects of the measures on the environment. Here modelling may support the monitoring in order to extract maximum information from the monitoring data, e.g. by indicating errors and inadequacies in the data and by filtering out the effects of climate variability.
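The cyclic character of the four planning elements can be illustrated with a small data structure. The following is only a sketch: the names `ELEMENTS`, `MODELLING_ROLE` and `next_element` are hypothetical and introduced here for illustration, and the role descriptions are paraphrased from the list above rather than taken from the WFD guidance documents.

```python
# The four main elements of the planning cycle, in the order listed above.
ELEMENTS = ("Identification", "Designing", "Implementing", "Evaluation")

# How modelling may support each element (paraphrased from the list above;
# the wording is an illustrative assumption, not an official definition).
MODELLING_ROLE = {
    "Identification": "assess reference conditions and impacts of pressures",
    "Designing": "assess effects and costs of candidate measures",
    "Implementing": "support operational decisions with on-line modelling",
    "Evaluation": "support monitoring, e.g. flag data errors and filter out climate variability",
}


def next_element(current: str) -> str:
    """Return the element that follows `current`; Evaluation wraps to Identification."""
    i = ELEMENTS.index(current)
    return ELEMENTS[(i + 1) % len(ELEMENTS)]


if __name__ == "__main__":
    # Walk once around the cycle, starting from Identification.
    element = "Identification"
    for _ in range(len(ELEMENTS)):
        print(f"{element}: {MODELLING_ROLE[element]}")
        element = next_element(element)
```

The wrap-around in `next_element` is the point of the sketch: after Evaluation the process returns to Identification, which is what makes the planning process iterative rather than a one-off sequence.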


[Figure 1: diagram linking the water management process (the environment, problem identification, public opinion, stakeholders, competent authority, government, water management decision, implementation) to the five steps of the modelling process:

1. Model Study Plan: identify problem; define requirements; assess uncertainties; prepare model study plan

2. Data and Conceptualisation: collect and process data; develop conceptual model; select model code; review and dialogue

3. Model Set-up: construct model; reassess performance criteria; review and dialogue

4. Calibration and Validation: model calibration; model validation; uncertainty assessment; review and dialogue

5. Simulation and Evaluation: model predictions; uncertainty assessment; review and dialogue]

Fig. 1 The role of the modelling process and the water management decision process (inspired by Pascual et al., 2003).

It is important to note that modelling studies typically do not address the entire planning and management process, but rather support certain elements of the process. Modelling is applied as a response (but usually not the only response) to an identified problem and can provide support for water management decisions. The types of interactions between the modelling process and the planning and management process are:


• The modelling process starts with a thorough framing of the problem to be addressed and definition of modelling objectives and requirements for the modelling study (Step 1 in Fig. 1). Water managers and stakeholders dominate this step, which basically is identical to part of the broader planning process. A participatory based assessment of the most important sources of uncertainty for the decision process should be used as a basis for prioritising the elements of the modelling study. The uncertainty assessments made at this stage will typically be qualitative.

• The main modelling itself is composed of steps 2, 3 and 4 of Fig. 1. Here the link with the main planning process consists of dialogue, reviews and discussions of preliminary results. The amount and type of interaction depends on the level of public participation, which may vary from case to case, from providing information through consultation to active involvement (Henriksen et al., submitted).

• The modelling study is finalised in the last step in Fig. 1, typically with scenario simulations. Here the water managers and the stakeholders again have a dominant role. The decisions made at the outcome of this step on the basis of modelling results are made in the context of the main planning process. Uncertainty assessment of model predictions is a crucial aspect of the modelling results and should be communicated in a way that is accessible for the stakeholders in the further water management process.
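The grouping of the five modelling steps of Fig. 1 into the three types of interaction described above can be written down as a small lookup structure. This is a minimal sketch: the class `Step`, the phase labels ("framing", "main modelling", "finalisation") and the helper functions are hypothetical names chosen here for illustration, while the step names and tasks are those shown in Fig. 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    number: int
    name: str
    tasks: tuple
    phase: str  # illustrative label: "framing", "main modelling" or "finalisation"


# Step names and tasks follow Fig. 1; the phase labels are assumptions made
# here to mirror the three types of interaction listed above.
STEPS = (
    Step(1, "Model Study Plan",
         ("Identify problem", "Define requirements",
          "Assess uncertainties", "Prepare model study plan"),
         "framing"),
    Step(2, "Data and Conceptualisation",
         ("Collect and process data", "Develop conceptual model",
          "Select model code", "Review and dialogue"),
         "main modelling"),
    Step(3, "Model Set-up",
         ("Construct model", "Reassess performance criteria",
          "Review and dialogue"),
         "main modelling"),
    Step(4, "Calibration and Validation",
         ("Model calibration", "Model validation",
          "Uncertainty assessment", "Review and dialogue"),
         "main modelling"),
    Step(5, "Simulation and Evaluation",
         ("Model predictions", "Uncertainty assessment",
          "Review and dialogue"),
         "finalisation"),
)


def steps_in_phase(phase):
    """Return the step numbers that belong to the given interaction phase."""
    return [s.number for s in STEPS if s.phase == phase]


def steps_with_task(task):
    """Return the step numbers in which the given task appears."""
    return [s.number for s in STEPS if task in s.tasks]


if __name__ == "__main__":
    print(steps_in_phase("main modelling"))
    print(steps_with_task("Review and dialogue"))
```

Querying the structure for "Review and dialogue" shows that a review task closes every step after the initial study plan, which is where the dialogue with the main planning process takes place.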

[Figure 2: the WFD process drawn as a large circle with the elements Identification, Designing, Implementation and Evaluation, each supported by a smaller Modelling circle.]

Fig. 2 The role of modelling in the water management process within the context of the EU Water Framework Directive (WFD)


2.2 Terminology and Scientific Philosophical Basis for the Modelling<br />

Process<br />

2.2.1 Background<br />

As pointed out in [12], a key problem in establishing a theoretical modelling framework is<br />

confusion over terminology. For example, the terms validation and verification are used with different, and<br />

sometimes interchangeable, meanings by different authors. The confusion arises from both semantic<br />

and philosophical considerations (Rykiel, 1996). Another important problem is the lack of consensus<br />

in the so far inconclusive debate on the fundamental question of whether a water<br />

resources model can be validated or verified, and whether it as such can be claimed to be suitable or<br />

valid for particular applications (Konikow and Bredehoeft, 1992; De Marsily et al., 1992; Oreskes et al.,<br />

1994).<br />

An important issue in relation to validation/verification is the distinction between open and closed systems.<br />

A system is a closed system if its true conditions can be predicted or computed exactly. This applies<br />

to mathematics and mostly to physics and chemistry. Systems where the true behaviour cannot be<br />

computed due to uncertainties and lack of knowledge on e.g. input data and parameter values are<br />

called open systems. The systems we are dealing with in water resources management, based on geosciences,<br />

biology and socio-economy, are open systems. According to Konikow and Bredehoeft (1992)<br />

and Oreskes et al. (1994) it is not possible to verify or validate models of open systems.<br />

Finally, the principles have to reflect and be in line with the underlying philosophy of environmental<br />

modelling, which has changed significantly during the past decades. In the early days many of us<br />

focussed on the huge potential of sophisticated models in a way that in retrospect may be characterised<br />

as rather naive enthusiasm (e.g. Freeze and Harlan, 1969; Abbott, 1992). The dominant view<br />

today appears to be much more balanced and mature (e.g. Beven, 2002a; Beven, 2002b).<br />

2.2.2 Terminology and guiding principles<br />

According to the terminology presented in [12] the simulation environment is divided into four basic<br />

elements as shown in Fig. 3. The inner arrows describe the processes that relate the elements to each<br />

other, and the outer circle refers to the procedures that evaluate the credibility of these processes.<br />

In general terms a model is understood as a simplified representation of the natural system it attempts to<br />

describe. However, a distinction is made between three different meanings of the general term model,<br />

namely the conceptual model, the model code and the model that here is defined as a site-specific model.<br />

The most important elements in the terminology and their interrelationships are defined as follows:<br />

Reality: The natural system, understood here as the study area.<br />

Conceptual model: A description of reality in terms of verbal descriptions, equations, governing<br />

relationships or ‘natural laws’ that purport to describe reality. This is the user's perception of the key<br />

hydrological and ecological processes in the study area (perceptual model) and the corresponding<br />


simplifications and numerical accuracy limits that are assumed acceptable in order to achieve the purpose<br />

of the modelling. A conceptual model thus includes both a mathematical description (equations) and<br />

descriptions of flow processes, river system elements, ecological structures, geological features, etc. that<br />

are required for the particular purpose of modelling. By analogy with the scientific philosophical<br />

discussion, the conceptual model constitutes the scientific hypothesis or theory that we<br />

assume for our particular modelling study.<br />

Fig. 3 Elements of a modelling terminology [12].<br />

Model code: A mathematical formulation in the form of a computer program that is so generic that it,<br />

without program changes, can be used to establish a model with the same basic type of equations (but<br />

allowing different input variables and parameter values) for different study areas.<br />

Model: A site-specific model established for a particular study area, including input data and parameter<br />

values.<br />


Model confirmation: Determination of adequacy of the conceptual model to provide an acceptable level of<br />

agreement for the domain of intended application. This is in other words the scientific confirmation of the<br />

theories/hypotheses included in the conceptual model.<br />

Code verification: Substantiation that a model code is in some sense a true representation of a conceptual<br />

model within certain specified limits or ranges of application and corresponding ranges of accuracy.<br />

Model calibration: The procedure of adjustment of parameter values of a model to reproduce the response<br />

of reality within the range of accuracy specified in the performance criteria.<br />

Model validation: Substantiation that a model within its domain of applicability possesses a satisfactory<br />

range of accuracy consistent with the intended application of the model.<br />

Model set-up: Establishment of a site-specific model using a model code. This requires, among other<br />

things, the definition of boundary and initial conditions and parameter assessment from field and laboratory<br />

data.<br />

Simulation: Use of a validated model to gain insight into reality and obtain predictions that can be used by<br />

water managers. This includes insight into how reality can be expected to respond to human interventions.<br />

In this connection uncertainty assessments of the model predictions are very important.<br />

Performance criteria: Level of acceptable agreement between model and reality. The performance criteria<br />

apply both for model calibration and model validation.<br />

Domain of applicability (of conceptual model): Prescribed conditions for which the conceptual model<br />

has been tested, i.e. compared with reality to the extent possible and judged suitable for use (by model<br />

confirmation).<br />

Domain of applicability (of model code): Prescribed conditions for which the model code has been<br />

tested, i.e. compared with analytical solutions, other model codes or similar to the extent possible and<br />

judged suitable for use (by code verification).<br />

Domain of applicability (of model): Prescribed conditions for which the site-specific model has been<br />

tested, i.e. compared with reality to the extent possible and judged suitable for use (by model validation).<br />
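The relationship between performance criteria, model calibration and model validation defined above can be illustrated with a minimal sketch. It assumes the Nash-Sutcliffe efficiency as the measure of agreement and a threshold of 0.8; neither is prescribed by the terminology, and the function names and the data are hypothetical. The point is only that the same criterion is applied first to the calibration period and then to an independent validation period.

```python
from typing import Sequence


def nash_sutcliffe(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def meets_criterion(observed: Sequence[float], simulated: Sequence[float],
                    threshold: float) -> bool:
    """Performance criterion: acceptable level of agreement between model and reality."""
    return nash_sutcliffe(observed, simulated) >= threshold


# Hypothetical streamflow records (m3/s), split into two independent periods.
THRESHOLD = 0.8  # assumed value, to be agreed for the particular study
obs_calibration, sim_calibration = [1.0, 3.0, 5.0, 2.0], [1.2, 2.8, 4.7, 2.1]
obs_validation, sim_validation = [2.0, 4.0, 6.0, 3.0], [2.5, 3.6, 5.2, 3.3]

calibration_ok = meets_criterion(obs_calibration, sim_calibration, THRESHOLD)
validation_ok = meets_criterion(obs_validation, sim_validation, THRESHOLD)
print(f"calibration accepted: {calibration_ok}, validation accepted: {validation_ok}")
```

A model that passes only the calibration check has not been validated in the sense used here, since validation requires agreement with data that were not used for the parameter adjustment.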

2.2.3 Scientific philosophical aspects<br />

The credibility of the descriptions or the agreements between reality, conceptual model, model code and<br />

model are evaluated through the terms confirmation, verification, calibration and validation. Thus, the relation<br />

between reality and the scientific description of reality which is constituted by the conceptual model<br />

with its theories and equations on flow and transport processes, its interpretation of the geological system<br />

and ecosystem at hand, etc., is evaluated through the confirmation of the conceptual model. By using the<br />

term confirmation in connection with conceptual model, it is implied that it is never considered possible<br />

to prove the truth of a theory/hypothesis and as such of a conceptual model. And even if a site-specific<br />


model is eventually accepted as valid for specific conditions, this is not a proof that the conceptual<br />

model is true, because, due to non-uniqueness, the site-specific model may turn out to perform right for<br />

the wrong reasons.<br />

The fundamental view expressed by scientific philosophers is that verification and validation of numerical<br />

models of natural systems is impossible, because natural systems are never closed and because<br />

the mapping of model results is always non-unique (Popper, 1959; Oreskes et al., 1994). I agree that<br />

it is not possible to carry out model verification or model validation, if these terms are used universally,<br />

without restriction to domains of applicability and levels of accuracy.<br />

[12] note, however, that Popper (1959) distinguished between two kinds of universal statements: the<br />

'strictly universal' and the 'numerical universal'. The strictly universal statements are those usually dealt<br />

with when speaking about theories or natural laws. They are a kind of 'all-statement' claiming to be true<br />

for any place and any time. In contrast, numerical universal statements refer only to a finite class of<br />

specific elements within a finite individual spatio-temporal region. A numerical universal statement is<br />

thus in fact equivalent to conjunctions of singular statements.<br />

The restrictions in use of the terms confirmation, verification and validation imposed by the respective<br />

domains of applicability imply, according to Popper's views, that the conceptual model, model code and<br />

site-specific models can only be classified as numerical universal statements as opposed to strictly universal<br />

statements. This distinction is fundamental for the terminology described in [12] and its link to<br />

scientific philosophical theories. Consequently the terms verification and validation should never be<br />

used without qualifiers.<br />

An important aspect of the framework outlined in [12] lies in the separation between the three different<br />

‘versions’ of the word model, namely the conceptual model, the model code and the site-specific model.<br />

Due to this distinction it is possible, at a general level, to talk about confirmation of a theory or a hypothesis<br />

about how nature can be described using the relevant scientific method for that purpose, and,<br />

at a site-specific level, to talk about validity of a given model within certain domains of applicability and<br />

associated with specified accuracy limits.<br />


2.3 Modelling Protocol<br />

The procedure for applying a hydrological model is often denoted a modelling protocol. It comprises a<br />

series of actions to be followed in a sequential or iterative form. The modelling protocol presented in [7]<br />

for distributed catchment modelling was inspired by the groundwater community (Anderson and<br />

Woessner, 1992). It was subsequently used in the Danish Handbook for Groundwater Modelling (Henriksen<br />

et al., 2001) that has been used extensively in practice since its emergence. A more recent modelling<br />

protocol, developed within the context of the EU research project HarmoniQuA, is reported in [13]<br />

and Scholten et al. (2007). The two protocols are illustrated in Figs. 4 and 5.<br />

Fig. 4 The modelling protocol from [7].<br />


A modelling study will involve several phases and several actors. A typical modelling study will involve<br />

the following four different types of actors:<br />

• The water manager, i.e. the person or organisation responsible for the management or protection of<br />

the water resources, and thus responsible for the modelling study and the outcome (the problem<br />

owner).<br />

• The modeller, i.e. a person or an organisation that works with the model conducting the modelling<br />

study. If the modeller and the water manager belong to different organisations, their roles will typically<br />

be denoted consultant and client, respectively.<br />

• The reviewer, i.e. a person that is conducting some kind of external review of a modelling study.<br />

The review may be more or less comprehensive depending on the requirements of the particular<br />

case. The reviewer is typically appointed by the water manager to help the water manager<br />

match the modelling expertise of the modeller.<br />

• The stakeholders/public. A stakeholder is an interested party with a stake in the water management<br />

issue, either in exploiting or protecting the resource. Stakeholders include the following different<br />

groups: (i) competent water resource authority (typically the water manager, cf. above); (ii) interest<br />

groups; and (iii) general public.<br />

The modelling process may, according to [13], be decomposed into five major steps which again are<br />

decomposed into 48 tasks (Fig. 5). The contents of the five steps are:<br />

• STEP 1 (Model Study Plan). This step aims to agree on a Model Study Plan comprising answers to<br />

the questions: Why is modelling required for this particular model study? What is the overall modelling<br />

approach and which work should be carried out? Who will do the modelling work? Who should<br />

do the technical reviews? Which stakeholders/public should be involved and to what degree? What<br />

are the resources available for the project? The water manager needs to describe the problem and<br />

its context as well as the available data. A very important task is then to analyse and determine the<br />

various requirements of the modelling study in terms of the expected accuracy of modelling results.<br />

The acceptable level of accuracy will vary from case to case and must be seen in a socio-economic<br />

context. It should, therefore, be defined through a dialogue between the modeller, water manager<br />

and stakeholders/public. In this respect an analysis of the key sources of uncertainty is crucial in<br />

order to focus the study on the elements that produce most information of relevance to the problem<br />

at hand.<br />

• STEP 2 (Data and Conceptualisation). In this step the modeller should gather all the relevant<br />

knowledge about the study basin and develop an overview of the processes and their interactions in<br />

order to conceptualise how the system should be modelled in sufficient detail to meet the requirements<br />

specified in the Model Study Plan. Consideration must be given to the spatial and temporal<br />

detail required of a model, to the system dynamics, to the boundary conditions and to how the<br />

model parameters can be determined from the available data. The need to model certain processes<br />

in alternative ways or to differing levels of detail in order to enable assessments of model structure<br />

uncertainty should be evaluated. The availability of existing computer codes that can address the<br />

model requirements should also be addressed.<br />

• STEP 3 (Model Set-up). Model Set-up implies transforming the conceptual model into a site-specific<br />

model that can be run in the selected model code. A major task in Model Set-up is the processing of<br />

data in order to prepare the input files necessary for executing the model. Usually, the model is run<br />

within a Graphical User Interface (GUI) where many tasks have been automated. The GUI speeds<br />


up the generation of input files, but it does not guarantee that the input files are error free. The<br />

modeller performs this work.<br />

• STEP 4 (Calibration and Validation). This step is concerned with the process of analysing the model<br />

that was constructed during the previous step, first by calibrating the model, and then by validating<br />

its performance against independent field data. Finally, the reliability of model simulations for the intended<br />

domain of applicability is assessed through uncertainty analyses. The results are described<br />

so that the scope of model use and its associated limitations are documented and made explicit.<br />

The modeller performs this work.<br />

• STEP 5 (Simulation and Evaluation). In this step the modeller uses the calibrated and validated<br />

model to make simulations to meet the objectives and requirements of the model study. Depending<br />

on the objectives of the study, these simulations may result in specific results that can be used in<br />

subsequent decision making (e.g. for planning or design purposes) or to improve understanding<br />

(e.g. of the hydrological/ecological regime of the study area). It is important to carry out suitable uncertainty<br />

assessments of the model predictions in order to arrive at a robust decision. As with the<br />

other steps, the quality of the results needs to be assessed through internal and external reviews.<br />

Each of the last four steps is concluded with a reporting task followed by a review task. The review<br />

tasks include dialogues between water manager, modeller, reviewer and, often, stakeholders/public.<br />

The protocol includes many feedback possibilities (Fig. 5).<br />
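The sequence of steps, each closed by a review before the next one starts, can be sketched as a simple loop. This is only an illustration under simplifying assumptions: a rejected review sends the work back to the same step, whereas the protocol in Fig. 5 also allows feedback to earlier steps, and the function `run_protocol`, the `review` callback and the limit on attempts are hypothetical.

```python
from typing import Callable, List

STEPS = [
    "Model Study Plan",
    "Data and Conceptualisation",
    "Model Set-up",
    "Calibration and Validation",
    "Simulation and Evaluation",
]


def run_protocol(review: Callable[[str, int], bool], max_attempts: int = 3) -> List[str]:
    """Walk through the five steps; repeat a step until its review is OK."""
    log: List[str] = []
    for step in STEPS:
        for attempt in range(1, max_attempts + 1):
            log.append(f"{step} (attempt {attempt})")
            if review(step, attempt):  # review outcome: True means OK
                break
        else:
            raise RuntimeError(f"{step} not accepted after {max_attempts} attempts")
    return log


def example_review(step: str, attempt: int) -> bool:
    """Hypothetical review: the first Model Set-up is rejected, all else is accepted."""
    return not (step == "Model Set-up" and attempt == 1)


for entry in run_protocol(example_review):
    print(entry)
```

In practice the review outcome is the result of a dialogue between water manager, modeller, reviewer and stakeholders rather than a single function call.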

A comparison of the old protocol (Fig. 4) and the one decade younger HarmoniQuA protocol (Fig. 5)<br />

shows some interesting developments:<br />

• The basic sequence of the prescribed activities in the protocols is the same. The HarmoniQuA protocol<br />

is much more detailed than the old one, but there are no fundamental disagreements between<br />

the two.<br />

• The HarmoniQuA protocol puts much more emphasis on the framing of the modelling study. This<br />

is only considered in one box in Fig. 4 and not given much weight in [7], while it is one full Step<br />

comprising seven tasks in Fig. 5. This implies for instance that requirements on performance criteria<br />

and uncertainty assessments are introduced rather late in the old protocol, while they are an important<br />

part of Step 1 in the HarmoniQuA protocol.<br />

• There is much emphasis on uncertainty assessments throughout the modelling process in the<br />

HarmoniQuA protocol, while uncertainty assessments are only considered as part of model calibration<br />

and simulation in the old protocol.<br />

• The HarmoniQuA protocol is part of a quality assurance framework with much emphasis on the<br />

interplay of roles between the various actors in the modelling process. This results in stakeholder involvement,<br />

peer reviews, focus on reporting and dialogue between water manager and modeller. In contrast<br />

to this, the old protocol only focuses on the modeller.<br />

These developments reflect a shift from guidance to the modeller only (old protocol) towards guidance<br />

to all actors involved in the modelling process (HarmoniQuA). This shift has been inspired by<br />

feedback from applying the old protocol in real-world applications, where it was realised that a<br />

broader concept was required.<br />



[Figure 5: flowchart of the HarmoniQuA modelling protocol. The five steps are Model Study Plan, Data and Conceptualisation, Model Set-up, Calibration and Validation, and Simulation and Evaluation. Each step consists of tasks and decision points with OK / Not OK feedback loops, and the last four steps end with a reporting task and a review task. The flowchart ends with Model Study Closure.]

Fig. 5 The five modelling steps and the 48 tasks in the HarmoniQuA modelling protocol. The diagram is an updated version of Fig. 5 in [13]<br />

(Refsgaard et al., 2006).



2.4 Classification of Models<br />

Many attempts have been made to classify hydrological models (or model codes). Refsgaard (1996)<br />

presented the classification shown in Fig. 6 that I have used in all papers of the present thesis. Deterministic<br />

models can be classified according to whether the model gives a lumped or a distributed description<br />

of the considered area, and whether the description of the hydrological processes is empirical,<br />

conceptual, or more physically-based. A lumped model implies that the catchment is considered as one<br />

computational unit. A distributed model, on the other hand, provides a description of catchment processes<br />

at geo-referenced computational grid points within the catchment. An intermediate approach is a<br />

semi-distributed model, which uses some kind of distribution, either in sub-catchments or in hydrological<br />

response units, where areas with the same key characteristics are aggregated to sub-units without<br />

considering their actual locations within the catchment. Examples of hydrological response units considered<br />

in semi-distributed models are elevation zones, which are relevant for snow modelling, and<br />

combinations of soil and vegetation type, which may be relevant for simulation of root zone processes<br />

such as evapotranspiration and nitrate leaching.<br />

As most conceptual models are also lumped, and as most physically-based models are also distributed,<br />

the three main classes emerge:<br />

• Empirical (black box)<br />

• Lumped conceptual models (grey box)<br />

• Distributed physically-based (white box)<br />

The classification is discussed in some detail in Refsgaard (1996). Here, the focus is on the two traditional<br />

approaches in deterministic hydrological catchment modelling, namely the lumped conceptual<br />

and the distributed physically-based ones. The fundamental difference between these two types of<br />

models lies in their process descriptions and the way spatial variability is treated. The distributed physically-based<br />

models contain equations which have originally been developed for point scales and which<br />

provide detailed descriptions of flows of water and solutes. The variability of catchment characteristics<br />

is accounted for explicitly through the variations of hydrological parameter values among the different<br />

computational grid points. This approach leaves the variability within a grid cell unaccounted for, which<br />

in some cases is of minor importance but in other cases may pose a serious constraint. The lumped<br />

conceptual models use empirical process descriptions, which have built-in accounting for the spatial<br />

variability of catchment characteristics.<br />
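The difference in how the two model types treat spatial variability can be illustrated with a deliberately simple sketch. It assumes a linear reservoir as the process description and equal-area grid cells, neither of which is taken from the model codes discussed here; the function names and the numbers are hypothetical.

```python
from typing import Sequence


def lumped_outflow(storage_mm: float, k_per_day: float) -> float:
    """Lumped: the catchment is one computational unit with one storage and one parameter."""
    return k_per_day * storage_mm


def distributed_outflow(storages_mm: Sequence[float], k_per_day: Sequence[float]) -> float:
    """Distributed: each equal-area grid cell has its own storage and parameter value.

    Returns the catchment-average outflow (mm/day).
    """
    cell_flows = [k * s for k, s in zip(k_per_day, storages_mm)]
    return sum(cell_flows) / len(cell_flows)


# Hypothetical two-cell catchment with the same mean storage and mean parameter
# as the lumped case, but with variation between the cells.
lumped = lumped_outflow(storage_mm=100.0, k_per_day=0.375)
distributed = distributed_outflow(storages_mm=[50.0, 150.0], k_per_day=[0.5, 0.25])
print(f"lumped: {lumped} mm/day, distributed: {distributed} mm/day")
```

With uniform cells the two descriptions coincide. Once storages and parameters vary between cells, the catchment-average outflow generally differs from what the averaged values give, which is why a lumped model needs effective parameters that absorb the spatial variability.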


Fig. 6 Classification of hydrological models according to process description (Refsgaard, 1996).<br />

Typical examples of lumped conceptual model codes are the Stanford Watershed Model (Crawford and<br />

Linsley, 1966), the Sacramento (Burnash, 1995), the HBV (Bergström, 1995) and the NAM (Nielsen<br />

and Hansen, 1973). Typical examples of distributed physically-based model codes are the MIKE SHE<br />

(Abbott et al., 1986a, b; Refsgaard and Storm, 1995) and the Thales (Grayson et al., 1992a, b).<br />

Groundwater model codes like MODFLOW belong to the distributed physically-based class.<br />

The classification has some shortcomings that should be noted. First of all, the use of the term ‘conceptual<br />

model’ is unfortunate, because this is a different meaning of the term as compared to the definition<br />

given in Section 2.2 and used in the modelling protocols (Section 2.3). This can cause some confusion,<br />

but to introduce a new term completely different from what is used by almost all other scientists in the<br />

community of catchment modelling may cause even more confusion. Secondly, and more fundamentally,<br />

the names of the classes should be considered as relative rather than absolute. For example Beven<br />

(1989) argued that in most applications physically-based models are used as lumped conceptual models<br />

at the grid scale. As discussed in [4] I agree that some degree of lumping and conceptualisation will<br />

always need to take place, but in spite of this there is a fundamental difference in the functioning<br />

and, as will be discussed later, in the applicability of the two model types.<br />


3 Simulation of Hydrological Processes at Catchment Scale<br />

In this chapter some modelling examples from the publications are briefly summarised and discussed<br />

within the framework outlined in Chapter 2.<br />

3.1 Flow modelling<br />

3.1.1 Groundwater/surface water model for the Suså catchment ([1], [2])<br />

Summary<br />

The publications [1] and [2] describe a new model code and the set-up, calibration and validation of a<br />

model for a 1,000 km² area. Further details can be found in Stang (1981), Refsgaard (1981) and<br />

Refsgaard and Stang (1981). The objectives of the study were (i) to develop a spatially distributed<br />

groundwater/surface water model code and apply it to the Suså catchment, with a particular focus on<br />

the stream-aquifer interaction in a hydrogeological system consisting of a confined aquifer, an aquitard and a phreatic<br />

aquifer, and (ii) to test the model for prediction of the hydrological consequences of groundwater abstraction<br />

for streamflows and hydraulic heads.<br />

The new model code was rather complex and computationally demanding at the time of development.<br />

Thus, standard 30-year model simulations could only be carried out as overnight runs on the mainframe<br />

computer at DTU’s computer centre.<br />

The model area comprising the Suså and the neighbouring Køge Å catchments is located in the central<br />

and southern part of Zealand. The model area, the topographic divides and the groundwater model<br />

polygonal mesh are shown in Fig. 7. The overall structure of the model is outlined in Fig. 8. It consists<br />

of four separate components for the confined regional aquifer, the aquitard, the phreatic aquifer and the<br />

root zone. The spatial distribution and the degree of physical basis differ between the four components.<br />

The time steps in the calculations are one day in all parts of the model.<br />

The confined aquifer is described by a two-dimensional integrated finite difference model with 112 polygons.<br />

For the phreatic aquifer, consisting of till with very small transmissivities, and for the aquitard, each<br />

polygon is further divided into four sub-polygons based on hypsographic curves (Fig. 9).<br />

Due to small-scale topographic variations the flows in the aquitard in most polygons are upwards in<br />

some parts and downwards in other parts of the polygon. A correct representation of these flows between<br />

the regional aquifer and the phreatic aquifer that discharges to the rivers is crucial for achieving a<br />

good description of the stream-aquifer interaction. Without such an approach, allowing a description of both<br />

upwards and downwards flows in the aquitard within the same polygon, a much finer spatial resolution<br />

with 10-100 times as many polygons would have been required. This would have been impossible 25<br />

years ago due to computational constraints.<br />
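The idea of sub-dividing a polygon by elevation can be sketched as follows. The elevation class limits are those given for polygon 21 in Fig. 9 (24, 28 and 34 m above MSL). Everything else is an assumption made for illustration: the function names, the treatment of values that fall exactly on a class limit, the sample elevations and heads, and the rule that the flow in the aquitard is directed from the higher towards the lower head.

```python
from bisect import bisect_right
from collections import Counter
from typing import List, Sequence

BAND_LIMITS_M = [24.0, 28.0, 34.0]  # class limits for polygon 21 (m above MSL)


def sub_polygon_index(elevation_m: float) -> int:
    """Return 0..3 for the classes < 24, 24-28, 28-34 and > 34 m above MSL."""
    return bisect_right(BAND_LIMITS_M, elevation_m)


def area_fractions(elevations_m: Sequence[float]) -> List[float]:
    """Fraction of equally weighted elevation samples in each sub-polygon."""
    counts = Counter(sub_polygon_index(z) for z in elevations_m)
    return [counts.get(i, 0) / len(elevations_m) for i in range(4)]


def aquitard_flow_direction(water_table_m: float, regional_head_m: float) -> str:
    """Assumed rule: flow goes from the higher head towards the lower head."""
    if water_table_m > regional_head_m:
        return "downwards"
    if water_table_m < regional_head_m:
        return "upwards"
    return "none"


# Hypothetical values: one regional head, one water table level per sub-polygon.
regional_head_m = 27.0
water_tables_m = [22.0, 25.5, 30.0, 35.0]
for index, water_table in enumerate(water_tables_m):
    print(index, aquitard_flow_direction(water_table, regional_head_m))
```

With these hypothetical heads the two lowest sub-polygons receive upward flow and the two highest lose water downwards, all within a single polygon of the regional aquifer model.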


The root zone component calculated the net precipitation that recharged the phreatic aquifer. The modelling<br />

area was divided into seven sub-areas with separate precipitation input and soil parameters. Further,<br />

the spatial variation in vegetation was accounted for by dividing each of these seven areas into five<br />

vegetation areas based on agricultural statistics and one meadow (wetland) area. This gives a total of<br />

42 sub-areas (7 × 6), where each sub-area is a kind of ‘hydrological response unit’, i.e. a semi-distributed<br />

approach. The root zone calculations were based on a box approach with four layers in the<br />

root zone.<br />
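The count of 42 sub-areas follows from combining every precipitation sub-area with every land cover class, as the short sketch below shows. The labels are hypothetical placeholders; only the numbers of classes (seven sub-areas, five vegetation areas and one meadow area) are taken from the description above.

```python
from itertools import product

# Seven sub-areas with separate precipitation input and soil parameters.
precipitation_sub_areas = [f"sub_area_{i}" for i in range(1, 8)]

# Five vegetation areas plus one meadow (wetland) area.
land_cover_classes = [f"vegetation_{i}" for i in range(1, 6)] + ["meadow"]

# Each combination is one 'hydrological response unit' of the root zone component.
response_units = list(product(precipitation_sub_areas, land_cover_classes))
print(len(response_units))
```

The units carry no information about where within a sub-area a given vegetation type is located, which is what makes the approach semi-distributed rather than fully distributed.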

Fig. 7 Topographic divides, groundwater polygonal mesh, precipitation gauging stations and precipitation<br />

zones of the Suså model.<br />


Fig. 8 The structure of the Suså model<br />

[Figure 9: map of polygon 21 (scale bar 0 to 3 km; streams Suså, Lilleå, Vendebæk and Gasmose Bæk) divided into four elevation classes (< 24 m, 24–28 m, 28–34 m and > 34 m above MSL), together with a hypsographic curve (elevation against 0 to 100 % of area) showing ground surface, water table (lower outlet), head of the regional aquifer, aquitard, regional aquifer and pre-Quaternary surface for sub-polygons 1 to 4.]

Fig. 9 Hypsographic curve for polygon 21 and areas represented by the four sub-polygons.<br />


Fig. 10 Examples of simulation results from soil moisture in root zone, hydraulic head of regional confined<br />

aquifer and river discharge.<br />

The model was calibrated against soil moisture data from four experimental plots, time series of hydraulic<br />

heads from 40 observation wells in the regional aquifer and streamflow from six gauging stations.<br />

Examples of simulation results from the calibration period are shown in Fig. 10 which shows excellent<br />

curve fits. The groundwater and aquitard models were calibrated, along with the code development<br />

itself, using all available hydraulic head data from the period 1950-80. Between 1964 and 1970 the<br />

groundwater abstraction to Copenhagen Water Supply from the Regnemark Waterworks in the Køge Å<br />

catchment was increased from zero to about 15 million m³/year. The remaining model components<br />

were calibrated against only some of the available streamflow data, namely some of the data from the<br />

Suså catchment, while amongst others Køge Å data were not used for calibration.<br />

While the simulation of streamflows in the Køge Å catchment in [1] was characterised as a “half-way<br />

test of the model’s ability to simulate streamflow from ungauged catchments” no systematic validation<br />

tests against independent data were carried out as part of the study. Some years later the model simulations<br />

were extended with new data from the period 1981-87, where the groundwater abstractions had<br />

changed slightly. In this post audit validation study the model simulations were found to match the observations<br />

to the same degree of accuracy as during the calibration period (Jensen and Jørgensen,<br />

1988).<br />

The model’s ability to simulate the streamflow depletion caused by a groundwater abstraction from the<br />

regional confined aquifer was tested on historical data from the Køge Å catchment. Fig. 11 shows simulated<br />

streamflow assuming actual groundwater abstraction from the Regnemark Waterworks starting in<br />

1964, Q_sim, and assuming no abstraction from Regnemark, Q1_sim. The recorded streamflow fits reasonably<br />

well with Q_sim. The difference Q1_sim − Q_sim, which is the simulated streamflow depletion caused<br />

by the increased groundwater abstraction, is seen to have a clear seasonal variation with smaller depletion<br />

during the dry summer periods and larger depletion during the wet winter season.<br />
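
The depletion calculation described above can be sketched in a few lines. This is only an illustration: the function names are hypothetical, the trailing 15-day window mirrors the moving average mentioned for Fig. 11, and expressing the relative depletion as a fraction of the no-abstraction flow is an assumption, since the definition is not stated here.<br />

```python
from typing import List


def moving_average(values: List[float], window: int = 15) -> List[float]:
    """Trailing moving average; returns len(values) - window + 1 points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    averages = [running / window]
    for i in range(window, len(values)):
        # Slide the window one step: add the new value, drop the oldest.
        running += values[i] - values[i - window]
        averages.append(running / window)
    return averages


def depletion(q_without: List[float], q_with: List[float]) -> List[float]:
    """Simulated depletion Q1_sim - Q_sim for each time step."""
    return [q1 - q for q1, q in zip(q_without, q_with)]


def relative_depletion(q_without: List[float], q_with: List[float]) -> List[float]:
    """Depletion as a fraction of the no-abstraction flow (assumed definition)."""
    return [(q1 - q) / q1 if q1 > 0 else 0.0
            for q1, q in zip(q_without, q_with)]
```

In use, the two simulated daily series would first be smoothed with `moving_average` and then passed to `depletion` or `relative_depletion`.<br />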

Fig. 11 Comparison of 15-day moving average streamflows for Køge Å (lower) and the relative streamflow<br />

depletion caused by the groundwater abstraction (upper)<br />

Discussion - post evaluation<br />

Most other catchment models existing when the Suså model code was developed were either purely<br />

rainfall runoff models of the lumped conceptual type, such as the classical Stanford Watershed Model<br />

(Crawford and Linsley, 1966), the HBV (Bergström and Forsman, 1973; Bergström, 1976) and the NAM<br />

(Nielsen and Hansen, 1973) or purely groundwater models (Prickett and Lonnquist, 1971; Thomas,<br />

1973). A few authors had concluded that coupled groundwater/surface water modelling was essential<br />

(e.g. Luckner, 1978; Lloyd, 1980) and some had outlined specific, but not yet operational, concepts<br />

(e.g. Freeze and Harlan, 1969; Wardlaw, 1978; Jønch-Clausen, 1979). In some studies groundwater<br />

models and rainfall-runoff models were used at the same catchment, but without coupling (e.g. Weeks<br />

et al., 1974). Thus, apparently no other model had previously been used to dynamically simulate coupled<br />

groundwater/surface water conditions at catchment scale (rainfall, evapotranspiration, near-surface<br />

runoff, groundwater recharge, groundwater heads, baseflow discharge from aquifers to streams).<br />

During the decade following [1] and [2] a few model codes with integrated groundwater/surface water<br />

descriptions emerged. The most prominent of these codes was the SHE (Abbott et al., 1986a, b) and its<br />

operational daughter codes, MIKE SHE from DHI (Refsgaard and Storm, 1995) and SHETRAN from<br />

University of Newcastle (Bathurst and O’Connell, 1992), both of which are still in use today, although in later<br />

versions. Other operational models from that period were described by Miles and Rushton (1983),<br />

Christensen (1994) and Wardlaw et al. (1994). Miles and Rushton (1983) used a simpler root zone and surface<br />

water component than [1] together with a two-dimensional finite difference groundwater model and<br />

monthly time steps. Christensen (1994) developed a model for the Tude Å catchment (a neighbour to<br />

Suså) that was conceptually similar to, and slightly simpler than, [1]. Wardlaw et al. (1994) used the concepts<br />

outlined in Wardlaw (1978) coupling the Stanford Watershed Model with a finite-difference<br />

groundwater model and a channel routing model for simulation of discharge and groundwater levels in<br />

the Allen catchment in England.<br />

During the past decade the number of integrated modelling codes has exploded. The existing codes<br />

today can be considered to fall into three classes: (a) fully integrated codes such as MIKE SHE (Graham<br />

and Butts, 2005); (b) couplings of existing groundwater codes and surface water codes such as MODFLOW<br />

and SWAT (Perkins and Sophocleous, 1999); and (c) codes based on the fully 3-dimensional<br />

Richards’ equation (Panday and Huyakorn, 2004). Independent reviews of the scientific basis and practical<br />

applicability of a number of recent integrated model codes are provided by e.g. Kaiser-Hill (2001)<br />

and Tampa Bay Water (2001).<br />

A major novelty of [1] and [2] was that the Suså model code was one of the first codes that integrated<br />

surface water and groundwater descriptions, and the first of its kind applied operationally to moraine<br />

landscapes. The model results were unique with respect to simulation of the dynamics of the groundwater/surface<br />

water interaction, as for instance reflected by the annual hydraulic head fluctuations and the<br />

streamflow depletion due to the groundwater abstraction. Furthermore, the study provided new insight<br />

into the mechanisms that governed streamflow depletion due to groundwater abstraction<br />

from confined aquifers in moraine catchments. In contrast to the traditional type curve analyses<br />

which were used extensively in hydrogeology to analyse test pumpings and to predict the effects of<br />

abstractions, [1] and [2] were based on non-stationary analysis which, as evident from the annual variations<br />

of streamflow depletion shown in Fig. 11, turns out to be crucial. The only modelling study from<br />

the following decade that considered the dynamics of the stream-aquifer interaction in moraine catchments<br />

in connection with groundwater abstraction was Christensen (1994), who basically confirmed the<br />

results of [2].<br />

The spatial distribution and the degree of physical basis differ between the four components of the<br />

Suså model. The groundwater model can be characterised as distributed physically-based, the aquitard<br />

model as semi-distributed physically-based and the phreatic aquifer and root zone models as semi-distributed<br />

conceptual. In contrast to, for instance, the later SHE code (Abbott et al., 1986a, b), the Suså<br />

model code was not generic, because it could not be applied to other catchments without changes in<br />

the code. Furthermore, it was tailored to the specific hydrological conditions prevailing in the Suså<br />

catchment and could for instance not be applied to an alluvial unconfined aquifer.<br />

In retrospect, it is interesting to observe that issues related to the credibility of model simulations were<br />

not critically analysed or discussed in [1] and [2]. First of all, aspects of code verification were not dealt<br />

with in the publications, although a major novelty of the work was the development of a completely new<br />

code. Secondly, and maybe more surprisingly, model validation and uncertainty assessments of model<br />

simulations were hardly addressed. Because all the available groundwater head data were used for calibration,<br />

the opportunity to make split-sample validation tests against parts of the data, or even the unique opportunity<br />

to calibrate on data before the groundwater abstraction and validate on data after the abstraction<br />

(differential split-sample test according to Klemes (1986)), was not utilised. Because the uncertainty was not addressed<br />

and rigorous validation tests were not conducted, the reader may be left with the undocumented<br />

impression that the curve fitting in Fig. 10 is supposed to reflect the predictive capability of the model.<br />

That the model proved to perform well in a subsequent post-audit validation study could not be known<br />

at the time of [1] and [2].<br />

The other integrated groundwater/surface water modelling studies from the following decade (Miles and<br />

Rushton, 1983; Christensen, 1994; Wardlaw et al., 1994) had the same characteristics, i.e. a focus only on<br />

calibration and model prediction, with no mention of verification of the new model codes, no model<br />

validation tests against independent data and no uncertainty assessments. The SHE study reported by<br />

Bathurst (1986a, b) focussing on surface water hydrology did include split-sample validation testing and<br />

sensitivity analysis. For surface water (rainfall-runoff) modelling studies focusing more on model applications<br />

than code developments split-sample testing was more common (e.g. Bergström, 1976; WMO,<br />

1975; WMO 1988) but uncertainty assessment was not systematically carried out and usually not even<br />

considered until Beven called for it (Beven, 1989; Beven and Binley, 1992). Altogether, this illustrates a<br />

very significant development in modelling practice during these three decades.<br />

3.1.2 Application of SHE to catchments in India ([4], [5])<br />

Summary<br />

The publications [4] and [5] describe the set-up, calibration and validation of the ‘Système Hydrologique<br />

Européen’ (SHE) code for six sub-catchments totalling about 15,000 km² of the Narmada basin in India<br />

(Fig. 12). The objective of the papers was to describe experiences from applying a distributed physically-based<br />

code like SHE to large basins with rather limited data coverage compared to previous SHE applications<br />

to research catchments. In contrast to the Suså study in [1] and [2], the India study did not<br />

include any code development, except for data processing utility software. Instead it comprised application<br />

of an existing code (Abbott et al., 1986a,b) to conditions that were far beyond the conditions for<br />

which the SHE had previously been tested in terms of catchment size, data coverage and hydrological<br />

regime (Bathurst, 1986a).<br />

Fig. 12 Location map for the Narmada and the six sub-catchments.<br />

In terms of application, the study focused on simulation of catchment runoff, i.e. surface water aspects only.<br />

The model structure was as illustrated in Fig. 17. The groundwater zone was, however, represented by only<br />

one layer, i.e. a 2-dimensional groundwater model, and there were no data from observation wells<br />

to allow a calibration of the groundwater part of the model. The six models were set up with a 2 km × 2<br />

km computational grid. A split-sample approach was used with typically three years for model calibration<br />

and another three years for the subsequent model validation.<br />

The data requirements for a SHE-based model are substantial and much larger than for a rainfall-runoff<br />

model of the lumped conceptual type, which had previously been applied to such catchments. A major<br />

challenge of the study was therefore to identify, collect and process data and to check their quality.<br />

Data were collected from more than 15 different agencies belonging to many different ministries and the<br />

data quality varied substantially.<br />

Another challenge was how to assess parameter values in a distributed model when data, in contrast to<br />

the previous tests on small experimental catchments such as those in Bathurst (1986a), are scarce. Each of the<br />

grid points in a distributed model is characterised by one or more parameters. Although the parameter<br />

values in principle (as in nature) vary from grid point to grid point, it is neither feasible nor desirable to allow<br />

the parameter values to vary so freely. Instead, a given parameter should only reflect the significant and<br />

systematic variation described in the available field data. Therefore a parameterisation procedure was<br />

developed, where representative parameter values were associated with individual soil types, vegetation<br />

types, geological layers, etc. This process of defining the spatial pattern of parameter values effectively<br />

reduced the number of free parameter coefficients that need to be adjusted in the subsequent<br />

calibration procedure. For example, the 820 km² Kolar catchment was parameterised into three soil classes<br />

and 10 land use/soil depth classes. For the soil type classes calibration was allowed for the hydraulic<br />

conductivity in the unsaturated zone (for each soil type class the conductivity could vary among three<br />

different land uses, giving nine parameter values). For the land use/soil depth classes the calibration<br />

parameters comprised soil depths (10 parameters in total) and the Strickler overland flow coefficients for<br />

four land use types (four parameters in total). A further three parameters were subject to calibration<br />

(hydraulic conductivity in the saturated zone, an empirical by-pass coefficient and a surface retention<br />

parameter; all kept constant throughout the catchment). Although the 26 calibration parameters could not<br />

be assessed from field data alone, but had to be modified through calibration, the physical realism of the<br />

parameter values resulting from the subsequent calibration procedure could be evaluated from available<br />

field data.<br />
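
The parameter count for the Kolar catchment can be checked with a short sketch, which also shows how class-based values might be spread over a computational grid. The group names, the `assign_to_grid` helper and the example values are hypothetical and are not taken from [4] or [5]; only the counts (9, 10, 4 and 3) come from the text above.<br />

```python
from typing import Dict, List

# Free calibration parameters per group, as counted in the text.
FREE_PARAMETERS: Dict[str, int] = {
    "unsaturated_conductivity": 3 * 3,  # three soil classes x three land uses
    "soil_depth": 10,                   # one per land use/soil depth class
    "strickler_overland_flow": 4,       # one per land use type
    "catchment_wide": 3,                # saturated K, by-pass, surface retention
}

TOTAL_FREE_PARAMETERS = sum(FREE_PARAMETERS.values())


def assign_to_grid(class_grid: List[List[str]],
                   class_values: Dict[str, float]) -> List[List[float]]:
    """Give every grid cell the value of the class it belongs to.

    The grid may hold thousands of cells, but the number of values that
    can be adjusted in calibration equals the number of classes only.
    """
    return [[class_values[cell] for cell in row] for row in class_grid]
```

For example, `assign_to_grid([["clay", "sand"], ["sand", "clay"]], {"clay": 0.5, "sand": 2.0})` fills a 2 × 2 grid from just two adjustable values.<br />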

The simulation results are illustrated in Fig. 13 as hydrographs for the largest sub-catchment and in Fig.<br />

14 as annual runoff and annual peaks for all six sub-catchments. In both figures the results are for the<br />

validation periods, where results are slightly poorer than for the calibration periods. In [4] the<br />

rainfall-runoff simulation results were characterised as having the same degree of accuracy as would<br />

have been expected with simpler hydrological models of the lumped conceptual type. The results therefore<br />

suggested that the application of complex, data-demanding models like the present SHE approach is<br />

not justified in cases where the modelling objective is limited to simulation of catchment runoff and<br />

where observed runoff records exist for calibration purposes. No attempts were made in the study to<br />

test the capability of a model without calibration.<br />

After the first calibration and validation tests had been made, field investigations were carried out in the<br />

Kolar catchment during a 2½ week period to improve the parameter estimates, mainly for soil and vegetation<br />

parameters, and to evaluate the importance of additional field data. Subsequently, the Kolar<br />

model was recalibrated in such a way that rather narrow constraints were put on the range of values<br />

allowed for the key parameters. The final model, based on the additional data, produced simulation<br />

results of the same quality as the preliminary model with respect to the simulated hydrograph. Although it is<br />

argued in [5] that the final model is believed to give an improved physical representation of the hydrological<br />

regime, it is concluded that a good match between observed and simulated outlet hydrographs<br />

does not provide a sufficient guarantee of a hydrologically realistic process description.<br />

Fig. 13 Observed and simulated hydrographs for the Narmada at Manot during the validation period<br />

1985 and 1987.<br />

Fig. 14 Simulated monthly runoff during monsoon season (left) and simulated annual peak discharge<br />

compared with measured values during validation periods for all six sub-catchments.<br />

Discussion - post evaluation<br />

At the time of [4] and [5] lumped conceptual catchment model codes such as HBV (Bergström, 1992)<br />

and NAM (Jønch-Clausen and Refsgaard, 1984) had been used operationally for two decades, typically<br />

for catchments ranging from a few km² to more than 10,000 km².<br />

At the same time distributed physically-based models had mainly been tested on flood events on small<br />

catchments that typically had very good data due to experimental instrumentation (Loague and Freeze,<br />

1985; Bathurst 1986a; Grayson et al., 1992a,b; Troch et al., 1993). Loague and Freeze (1985) compared<br />

a quasi-physically based model with a regression model and a unit hydrograph model on three<br />

experimental catchments, the 0.1 km² R-5, Chickasha, Oklahoma, the 7.2 km² WE-38, Klingerstown,<br />

Pennsylvania and the 0.1 km² HB-6, West Thornton, New Hampshire. Bathurst (1986a) applied the SHE<br />

to the simulation of flood events for the 10.6 km² experimental Wye catchment in Wales. Grayson et al.<br />

(1992a,b) applied the THALES to the simulation of flood events for the 7.0 ha Wagga catchment in Australia<br />

and the 4.4 ha Lucky Hill catchment at the Walnut Gulch Experimental Area in Arizona. Troch et<br />

al. (1993) applied a model based on a 3-dimensional numerical solution to Richards’ equation to the 7.2<br />

km² WE-38 catchment and a 0.64 km² subcatchment.<br />

To my knowledge the only examples until then of distributed physically-based model studies including<br />

applications to catchments of several hundred km² and continuous simulation for periods of several years<br />

were the coupled groundwater/surface water models discussed in the previous section ([1]; [2]; Miles<br />

and Rushton, 1983; Christensen, 1994; Wardlaw et al., 1994) that all had distributed physically-based<br />

groundwater components and lumped (or semi-distributed) conceptual surface water components and<br />

some models such as WATBAL (Knudsen et al., 1986) that had semi-distributed surface water components<br />

and lumped conceptual groundwater components.<br />

During the following few years a few additional catchment scale studies with continuous simulations of<br />

distributed physically-based models emerged. One example is Querner (1997) who applied the<br />

MOGROW to the 6.5 km² Hupselse Beek catchment simulating both discharge and groundwater head<br />

dynamics. Another example is Kuchment et al. (1996), who simulated surface water processes for the<br />

3315 km² Ouse catchment. The study of Kuchment et al. (1996) had many similarities with [4] and [5]<br />

with respect to model conceptualisation and conclusions.<br />

The main scientific contribution of [4] and [5] was therefore that it was the first study to demonstrate that distributed<br />

physically-based models could be established for catchments of this size and with ordinary data<br />

availability. Previous studies reported in the literature had either been tests on small research catchments<br />

or been models with major components of the lumped conceptual type. As outlined above, it is worth<br />

noting the different traditions in the communities that had dealt with (large scale) lumped conceptual<br />

models, (small scale) physically-based models and groundwater models, respectively. I believe that an<br />

important characteristic of the team who performed the present study ([4] and [5]) was that it comprised<br />

scientists who together had comprehensive experience from all these communities.<br />

Another key contribution was the parameterisation approach introduced. The point of departure, exemplified by<br />

[1] and Bathurst (1986a), was an approach allowing parameter values to vary as required<br />

to fit the observed data during the calibration phase. That earlier approach had been criticised by Beven<br />

(1989) for resulting in overparameterisation. The new procedure resulted in 26 parameters to be calibrated for<br />

the Kolar catchment. Although this number is significantly less than e.g. the number of free parameters<br />

in [1], it is still very high and it is very likely that a sensitivity analysis would have shown that this number<br />

could easily be reduced without loss of model performance. It is interesting to note that similar parameterisation<br />

approaches reported for other catchments in 1997 ([7]) and 2001 (Andersen et al., 2001)<br />

resulted in 11 and 4 free parameters, respectively, implying that the parameterisation approach adopted<br />

in [4] and [5] was not yet fully developed.<br />

Beven (1989) had provided a fundamental critique of the way physically-based models such as the<br />

SHE had been promoted by e.g. Abbott et al. (1986a) and Bathurst (1986a). His main critique was that<br />

the attitudes in these early SHE papers were not realistic with respect to the abilities and achievements<br />

of physically-based models. Beven pointed, among other things, to the following key problems:<br />

• The process equations are simplifications leading to model structure uncertainty.<br />

• Spatial heterogeneity at subgrid scale is not included in the physically-based models. The current<br />

generation of distributed physically-based models is in reality lumped conceptual models.<br />

• There is a great danger of overparameterisation if an attempt is made to simulate all hydrological processes<br />

thought to be relevant and to calibrate the related parameters against observed discharge data only.<br />

As a conclusion Beven argued that for future applications attempts must be made to obtain realistic<br />

estimates of the uncertainty associated with their predictions, particularly in the case of evaluating future<br />

scenarios of the effects of management strategies.<br />

[4] noted some of Beven’s critique, acknowledging that the process representation at the 2 km × 2 km<br />

grid squares causes significant violations of some of the process descriptions, that “some degree of<br />

lumping and conceptualisation has taken place at the grid scale” and that “scale problems are important”.<br />

[4] stressed, however, that in spite of these acknowledged limitations “the present basin model is<br />

much more physically based and distributed than the traditional lumped conceptual model, where the<br />

entire catchment is represented in effect by one grid square, and where the process representations<br />

due to averaging over characteristics of topography, soil type and vegetation type are fundamentally<br />

different from the basic physical laws”.<br />

[4] and [5] concluded that the SHE is a suitable tool to support water management for conditions in India.<br />

In contrast, Beven (1989) had stated that the physically-based models “are not well suited to<br />

applications to real catchments”. In retrospect, it is remarkable that [4] and [5] did not engage more substantially<br />

with the very fundamental critique raised by Beven (1989). For instance, [4] and [5]<br />

did not comment at all on Beven’s main conclusion on the need for uncertainty assessment, although<br />

[5] actually used the model to study the impact of soil and land use by performing sensitivity analyses.<br />

A more comprehensive response and dialogue took place a few years later (Beven, 1996a; Refsgaard<br />

et al., 1996; Beven, 1996b).<br />

Seen from the perspective of present protocols for good modelling practice ([12] and [13]), the approach<br />

and conclusions in [4] and [5] are especially deficient in their lack of focus on uncertainty assessment. A<br />

main reason for the lack of dialogue with Beven’s critique and the lack of focus on uncertainty in [4] and<br />

[5] may be that we were too preoccupied with the real achievement of being the first to set up and run<br />

this type of model for such large catchments. Another reason may be that some of us had a background<br />

in groundwater modelling, where large scale distributed physically-based models had been successfully<br />

used to support practical water resources management for more than a decade, so we considered<br />

Beven’s statement that the physically-based models “are not well suited to applications to real<br />

catchments” as a large exaggeration.<br />

3.1.3 Intercomparison of different types of hydrological models ([6])<br />

Summary<br />

The research study reported in publication [6] had two objectives. The first objective was to identify a<br />

rigorous framework for the testing of model capabilities for different types of tasks. The second objective<br />

was to use this theoretical framework and conduct an intercomparison study involving application of<br />

three model codes of different complexity to a number of tasks ranging from traditional simulation of<br />

stationary, gauged catchments to simulation of ungauged catchments and of catchments with non-stationary<br />

climate conditions. Data from three catchments in Zimbabwe were used for the tests.<br />

The three codes used in the study were (a) NAM (Nielsen and Hansen, 1973; Havnø et al., 1995) – Fig.<br />

15; (b) WATBAL (Knudsen et al., 1986) – Fig. 16; and (c) MIKE SHE (Abbott et al., 1986a,b; Refsgaard<br />

and Storm, 1995) – Fig. 17. The NAM and MIKE SHE can be characterised as very typical of their<br />

lumped conceptual and distributed physically-based types, respectively, while the WATBAL with its<br />

semi-distributed approach falls in between these two standard classes.<br />

Fig. 15 Structure of the NAM rainfall-runoff model code<br />

Fig. 16 Structure of the WATBAL code.<br />

Fig. 17 Schematic representation of the model structure of the ‘Système Hydrologique Européen’ (SHE)<br />

code.<br />

The three catchments in Zimbabwe that were selected for the tests were Ngezi-South (1090 km²), Lundi<br />

(254 km²) and Ngezi-North (1040 km²). For two of the catchments the model simulations started with a<br />

blind simulation, i.e. a simulation where no calibration was conducted, but where model parameters<br />

were assessed directly from field data and indirectly by considering parameter values in the first catchment<br />

(proxy basin test). Then one year was made available for calibration and finally the full calibration<br />

period of 4–5 years was used. In all cases an independent period was used for validation tests (split-sample<br />

test). The hydrological regime in Zimbabwe is semi-arid and characterised by very large interannual<br />

variations. It was therefore possible to construct a test scheme in such a way that a model’s<br />

ability to predict differences in climate input could be tested by calibrating on a dry period and validating<br />

on a wet period or vice versa (differential split-sample test).<br />
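
The two record-splitting tests described above can be sketched as follows. This is a minimal illustration with hypothetical function names. Splitting at the middle of the chronological record, and classing years as dry or wet by the median annual rainfall, are assumptions made for the sketch rather than the procedure used in [6].<br />

```python
from typing import Dict, List, Tuple


def split_sample(years: List[int]) -> Tuple[List[int], List[int]]:
    """Split a record chronologically into (calibration, validation) years."""
    ordered = sorted(years)
    mid = len(ordered) // 2
    return ordered[:mid], ordered[mid:]


def differential_split_sample(
        annual_rainfall: Dict[int, float]) -> Tuple[List[int], List[int]]:
    """Return (dry_years, wet_years), divided at the median annual rainfall.

    A model can then be calibrated on one group and validated on the
    other, which tests how well it copes with a change in climate input.
    """
    ranked = sorted(annual_rainfall, key=lambda year: annual_rainfall[year])
    mid = len(ranked) // 2
    return sorted(ranked[:mid]), sorted(ranked[mid:])
```

A proxy-basin test would reuse the same idea across catchments instead of across years: parameters assessed on one catchment are applied, without calibration, to another.<br />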

The model performance was evaluated for annual runoff and criteria focussing on the shape of the discharge<br />

hydrograph, i.e. rainfall-runoff modelling. The modelling work was carried out by three different<br />

persons/teams that were very experienced in applying their respective model codes. A general conclusion<br />

from the study was that the performances of the three codes were surprisingly similar. Thus, the<br />

ability of WATBAL and SHE to explicitly utilise data such as topography, soil and vegetation data that<br />

the NAM could not use turned out to make no significant difference in most cases. In summary the conclusions<br />

were:<br />

• Given a few (1–3) years of runoff measurements, a lumped model of the NAM type would be a<br />

suitable tool from the point of view of technical and economic feasibility. This applies to catchments<br />

with homogeneous climatic input as well as cases where significant variations in the exogenous<br />

input are encountered.<br />

• For ungauged catchments, however, where accurate simulations are critical for water resources<br />

decisions, a distributed model is expected to give better results than a lumped model if appropriate<br />

information on catchment characteristics can be obtained.<br />

Discussion - post evaluation<br />

A scientific contribution of [6] was the adoption and demonstration of Klemes’s model validation testing<br />

scheme, which had not been much used since the basic idea was published by Klemes (1986). This is<br />

discussed further in Section 4.2.4.<br />

Furthermore, the results from the intercomparison contributed to the ongoing scientific discussion on<br />

which types of model codes should be recommended for which application purpose. Only a few intercomparison<br />

studies involving different model types had been reported in literature and only two studies<br />

included physically-based models (Loague and Freeze, 1985; Michaud and Sorooshian, 1994). Most of<br />

these previous studies had been conducted on small research catchments and none of them had included<br />

tests for non-stationary climate conditions as in [6].<br />

From the emergence of the distributed physically-based models it was widely stated and believed that<br />

these new model types generally would be able to provide more accurate simulation of the hydrological<br />

cycle (Abbott et al., 1986a). In the absence of hard facts from suitable tests, the scientific debate had to<br />

a very large extent been based on expectations and qualitative arguments, such as that the models with<br />

more physical basis in their model structure would be able to provide more accurate simulation<br />

results, or the opposite view, advocated e.g. by Beven (1989), that such expectations of the superior<br />

performance of the physically-based models were unrealistic. In [4] we basically agreed with<br />

Beven (1989) with respect to the SHE’s capability to simulate discharge for large scale catchments with<br />

ordinary data, i.e. that the rainfall-runoff simulation results were of the same degree of accuracy “as<br />

would have been expected” with simpler hydrological models of the lumped conceptual type.<br />

With the results from [6] it was now possible to more firmly conclude that if the purpose of modelling is<br />

limited to simulation of runoff under stationary catchment conditions and if data exist for calibration purposes,<br />

there is no scientifically documented reason to go beyond lumped conceptual models. This issue<br />

has been subject to several studies since then, where the conclusions from [6] basically have been<br />

confirmed (e.g. Perrin et al., 2001; Reed et al., 2004). I believe that the only thing that may change that<br />

conclusion is the introduction of new spatial data from new airborne or satellite sensors. Whereas these<br />

new data types have proven to have great value for many hydrological purposes and for special conditions<br />

(e.g. snow cover), they have in general not yet documented that they can provide distributed<br />

models with comparative advantages in simulation of catchment runoff.<br />

3.2 Reactive Transport<br />

3.2.1 Oxygen transport and consumption in the unsaturated zone ([3])<br />

Summary<br />

Publication [3] describes the development of a new code for simulation of oxygen transport and consumption<br />

in the unsaturated zone. The code was linked as a sub-component to the SHE modelling system<br />

(Abbott et al., 1986a,b). The objective of the paper was to describe the new process formulation,<br />

document its applicability through two case studies and outline the perspectives in relation to its use as<br />

part of the comprehensive SHE code.<br />

The unsaturated zone water flow calculations in SHE were based on a finite difference solution to the<br />

full Richards’ equation for unsteady soil water flow. The solute transport calculations were based on the<br />

traditional convection-dispersion equation. The new code for oxygen transport and consumption was an<br />

add-on to these first two steps and used information on soil moisture content, water flows and solute<br />

concentrations and fluxes as input. Thus the spatial representation is given by the underlying flow and<br />

solute transport discretisation, implying a one-dimensional description with spatial resolution ranging<br />

from a few cm close to the terrain to 20-40 cm further down in the soil column.<br />
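For reference, the two governing equations named above are commonly written in the following one-dimensional (vertical) forms. This is a textbook sketch rather than the exact formulation used in [3] or in SHE. The symbols are assumptions introduced here: θ is the volumetric water content, ψ the pressure head, K(ψ) the unsaturated hydraulic conductivity, z the vertical coordinate (positive upwards), S a sink term for root uptake, c the solute concentration, q the Darcy flux, D the dispersion coefficient and R_c a source/sink term.

```latex
% Richards' equation for unsteady vertical soil water flow
\frac{\partial \theta}{\partial t}
  = \frac{\partial}{\partial z}\left[ K(\psi)\left( \frac{\partial \psi}{\partial z} + 1 \right) \right] - S

% Convection-dispersion equation for a dissolved solute
\frac{\partial (\theta c)}{\partial t}
  = \frac{\partial}{\partial z}\left( \theta D \frac{\partial c}{\partial z} \right)
  - \frac{\partial (q c)}{\partial z} + R_c
```

In this notation the oxygen code would take θ, q and c from these two solutions as input at every grid point of the soil column.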

The process description in [3] is based on a three-phase system (soil, water, air) and accounts for<br />

spatial heterogeneity at this small scale. Fig. 18 shows a microscale illustration of the soil. Air tends to<br />

fill the larger pores in the soil matrix whereas water is drawn into the narrow necks and finer pore<br />

spaces in aggregates, forming capillary films and wedges. The air and water coexist in the soil by occupying<br />

different geometric configurations. Oxygen movement within these different portions of the pore<br />

space can occur by: convective transport in the water, diffusion in water, convective transport in soil air,<br />

diffusion in soil air, diffusion into water-saturated soil crumbs, and consumption in free and fixed water.<br />

Microorganisms and plant roots are generally found in the finer pores of the soil because they require<br />

close contact with the soil particles for uptake of substrate and nutrients. Transport of oxygen to these<br />

respiring sites usually occurs in the water phase of soil crumbs. It is the rate of oxygen diffusion through<br />

this fixed water in micropores that will determine the availability of oxygen for respiration and the anaerobic<br />

fraction of the soil. A soil crumb is considered to be any fully water-saturated subvolume of soil,<br />

the physical size of which is determined by the nearness of air-filled soil pores. The crumb is thus defined<br />

by the fact that oxygen transport within the crumb is primarily due to diffusion in water-filled pores.<br />

The size of the soil crumbs is dependent on the water content of the soil and the corresponding number<br />

of air-filled pores.<br />

The relation between soil water content and size of the water crumbs is derived from the soil water retention<br />

curve that is already used in Richards’ equation. The idea behind this is illustrated in Fig. 19 and<br />

described in more detail in [3]. The number of air-filled pores at a given soil moisture content can be<br />

calculated from the retention curve (Fig. 19b). It is furthermore assumed that the distance between two<br />

air-filled pores, d_i, corresponds to the average diameter of a water-saturated crumb (Fig. 19a).<br />

Fig. 18 Microscale representation of the three-phase soil system with respect to oxygen transport. Labels in the figure: air, “free” water, solids/aggregates, “fixed” water, anaerobic zone, aerobic zone.<br />

Fig. 19 (a) The assumed pore distribution within the unit L x L. (b) Retention curve showing the relation<br />

between tension, water content and pore radius of a soil. Labels in the figure: air-filled pore, d_i, water-saturated crumb, L; axes in (b): tension (ψ), pore radius (p) and water content (θ), with θ_i and θ_(i+1) marked.<br />
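The geometric idea in Fig. 19 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the formulation of [3]: the capillary relation r ≈ 0.149/ψ (r and ψ in cm, for water at about 20 °C), the tabulated retention curve and the helper name `crumb_diameter` are all introduced here. Pores drained in each retention-curve class are treated as cylindrical tubes crossing a square of side L, and the mean spacing between the N air-filled pores is taken as L/sqrt(N).

```python
import math


def crumb_diameter(tension_cm, theta, theta_now, side_cm=1.0):
    """Estimate the mean crumb diameter d_i (cm) at water content theta_now.

    tension_cm: increasing tensions (cm) of a tabulated retention curve.
    theta:      matching water contents, decreasing with tension;
                theta[0] is the saturated water content.
    Assumption (not from [3]): pores drained between two tabulated points
    are cylinders of radius 0.149 / tension (capillary law, cm units).
    """
    n_air_pores = 0.0
    for k in range(1, len(theta)):
        # water released from pore class k once the soil has dried to theta_now
        drained = theta[k - 1] - max(theta[k], theta_now)
        if drained <= 0.0:
            break  # this class and all finer ones are still water-filled
        radius_cm = 0.149 / tension_cm[k]
        # number of cylindrical pores needed to hold the drained volume
        n_air_pores += drained * side_cm ** 2 / (math.pi * radius_cm ** 2)
    if n_air_pores == 0.0:
        return math.inf  # fully saturated: no air-filled pore bounds the crumb
    return side_cm / math.sqrt(n_air_pores)


# Hypothetical retention curve, for illustration only
tensions = [1.0, 10.0, 100.0, 1000.0]
contents = [0.45, 0.40, 0.30, 0.15]
wet = crumb_diameter(tensions, contents, 0.40)
dry = crumb_diameter(tensions, contents, 0.30)
```

With these invented numbers the crumb diameter shrinks as the soil dries, which is the qualitative behaviour described in the text: more air-filled pores mean shorter diffusion paths in water.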

The two case studies where the model code was tested and demonstrated dealt with operation of a<br />

waste water infiltration plant and assessment of anaerobic zones of importance for denitrification in<br />

agricultural soils.<br />

Discussion - post evaluation<br />

Previous research in oxygen transport processes in heterogeneous soils (e.g. Currie, 1961; Smith,<br />

1980; Troeh et al., 1982) was based on the assumption of steady-state conditions with regard to<br />

crumb/aggregate size and aerobic-anaerobic fractions. The novel scientific contribution of this paper<br />

was the new concept of calculating the size of the water crumbs as a function of the water retention<br />

curve and the time varying soil moisture content originating from SHE calculations and the linking of this<br />

concept to the previous research in this field. In this way it became possible to calculate aerobic-anaerobic<br />

fractions dynamically.<br />

Although the scale of consideration in this study is the smallest possible in a catchment modelling perspective,<br />

namely point or column scale, it illustrates that smaller scale phenomena (here diffusion into<br />

soil crumbs that are of mm or less in size and temporally varying) often dominate the oxygen conditions<br />

at grid (cm - dm) scale. The approach in [3] is an upscaling from grain size to computational model grid<br />

point, where the within grid heterogeneity is accounted for by developing a set of process equations<br />

that includes the effect of the smaller scale heterogeneity at the larger grid scale.<br />

In retrospect, it is interesting to consider the issues that were not discussed in [3]. In this respect it<br />

should be noted that code verification aspects were not mentioned in [3], although a completely new<br />

code was developed. Furthermore, [3] did not discuss the issue of upscaling the present grid scale<br />

processes to application at catchment scale. Interesting issues in this regard would be evaluations of<br />

how data and parameter values could be assessed for catchment scale applications and discussions of<br />

whether it would still be the mm-scale (crumbs) processes that would be dominating when simulating at<br />

large scale, or whether larger scale heterogeneities, such as differences in crops, soil types or topography,<br />

would become more important and thus reduce the importance of the present process description.<br />

The model code presented in [3] was developed in a ‘research version’ of the SHE code. After the<br />

completion of the study it was not upgraded to become part of the ‘commercial version’ of MIKE SHE<br />

that emerged a few years later. The oxygen model has not been used for practical purposes.<br />

To my knowledge, process description of the same detail as in [3] has not been included in any catchment<br />

model, and not even in the most comprehensive physically-based root zone models such as<br />

DAISY (Hansen et al., 1991; Abrahamsen and Hansen, 2000). In DAISY, which provides state-of-the-art<br />

descriptions of root zone processes with focus on water, plant growth and nitrogen, a much simpler and<br />

more empirical process formulation is used for calculating denitrification as a function of anaerobic subsoil<br />

conditions.<br />

3.2.2 An integrated model for the Danubian Lowland ([9])<br />

Summary<br />

Publication [9] is concerned with environmental assessment studies in connection with the Gabcikovo<br />

hydropower scheme along the Danube. The objective of the underlying study was to develop and apply<br />

a comprehensive integrated modelling system to support management decisions in this respect.<br />

The Danubian Lowland (Fig. 20) in Slovakia and Hungary downstream of Bratislava is an inland delta<br />

formed in the past by river sediments from the Danube. The entire area forms an alluvial aquifer, which<br />

throughout the year receives around 30 m³/s infiltration water from the Danube in the upper parts of the<br />

area and returns it to the Danube and the drainage canals in the downstream part. The aquifer is an<br />

important water resource for municipal and agricultural water supply, and the floodplain area with its<br />

alluvial forests and associated ecosystems represents a unique landscape of outstanding ecological<br />

importance.<br />

Fig. 20 The Danubian Lowland with the new reservoir and the Gabcikovo hydropower scheme.<br />

The Gabcikovo hydropower scheme was put into operation in 1992. A large number of hydraulic structures<br />

were established as part of the hydropower scheme. The key structures are a system of weirs<br />

across the Danube at Cunovo 15 km downstream of Bratislava, a reservoir created by the damming at<br />

Cunovo, a 30 km long lined navigation canal, outside the floodplain area, parallel to the Danube River<br />

with intake to the hydropower plant, a hydropower plant and two ship-locks at Gabcikovo, and an intake<br />

structure at Dobrohost, 10 km downstream of Cunovo, diverting water from the new canal to the river<br />

branch system. The entire scheme has significantly affected the hydrological regime and the ecosystem<br />

of the region. The scheme was originally planned as a joint effort between former Czecho-Slovakia and<br />

Hungary, and the major parts of the construction were carried out as such on the basis of an international<br />

treaty from 1977. However, since 1989 Gabcikovo has been a major matter of controversy between<br />

Slovakia and Hungary, who have referred some disputed questions to the International Court of<br />

Justice in The Hague (ICJ, 1997).<br />

The hydrological regime in the area is very dynamic with so many crucial links and feedback mechanisms<br />

between the various parts of the surface- and subsurface water regimes that no single existing model code<br />

was able to describe the entire regime. Therefore, the modelling system illustrated in Fig. 21 was established.<br />

It integrates four model codes: (a) MIKE 21 (DHI, 1995) for describing the reservoir (2D flow, eutrophication,<br />

sediment transport); (b) MIKE 11 (Havnø et al., 1995) describing the river and river<br />

branches (1D flow including effects of hydraulic control structures, water quality, sediment transport);<br />

(c) MIKE SHE (Refsgaard and Storm, 1995) describing the ground water (3D flow, solute transport,<br />

geochemistry) and flood plain conditions (dynamics of inundation pattern, ground water and soil moisture<br />

conditions); and (d) DAISY (Hansen et al., 1991) describing agricultural aspects (crop yield, irrigation,<br />

nitrogen leaching).<br />

Fig. 21 Structure of the integrated modelling system with indication of the interactions between the individual<br />

models<br />

The interfaces between the various models were:<br />

A) MIKE SHE forms the core of the integrated modelling system having interfaces to all the individual<br />

modelling systems. The coupling of MIKE SHE and MIKE 11 is a fully dynamic coupling<br />

where data is exchanged within each computational time step.<br />

B) Results of eutrophication simulations with MIKE 21 in the reservoir are used to estimate the concentration<br />

of various water quality parameters in the water that enters the Danube downstream of<br />

the reservoir. This information serves as boundary conditions for water quality simulations for the<br />

Danube using MIKE 11.<br />

C) Sediment transport simulations in the reservoir with MIKE 21 provide information on the amount<br />

of fine sediment on the bottom of the reservoir. The simulated grain size distribution and sediment<br />

layer thickness are used to calculate leakage coefficients, which are used in ground water modelling<br />

with MIKE SHE to calculate the exchange of water between the reservoir and the aquifer.<br />

D) DAISY simulates vegetation parameters that are used in MIKE SHE to simulate the actual<br />

evapotranspiration. Ground water levels simulated with MIKE SHE act as lower boundary conditions<br />

for DAISY unsaturated zone simulations. Consequently, this process is iterative and requires<br />

several model simulations.<br />

E) Results from water quality simulations with MIKE 11 and MIKE 21 provide estimates of the concentration<br />

of various components/parameters in the water that infiltrates to the aquifer from the<br />

Danube and the reservoir. This can be used in the ground water quality simulations (geochemistry)<br />

with MIKE SHE.<br />
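The difference between a fully dynamic coupling (interface A) and a one-way hand-over of results (interfaces B, C and E) can be illustrated with a toy example. The sketch below is not MIKE SHE or MIKE 11 code: the two storages, the leakage coefficient and the function name `run_dynamic_coupling` are hypothetical, chosen only to show two components exchanging a flux within every computational time step.

```python
def run_dynamic_coupling(river_level, gw_level, leakage, inflow, dt, n_steps):
    """Toy dynamic coupling of a river store and an aquifer store.

    Levels are in m over unit-area stores, leakage in 1/day,
    inflow in m/day and dt in days (all values hypothetical).
    Within every time step both components use the same exchange flux,
    computed from the current state of the other component.
    """
    history = []
    for _ in range(n_steps):
        # exchange flux from river to aquifer, positive when the river is higher
        exchange = leakage * (river_level - gw_level)
        river_level += dt * (inflow - exchange)
        gw_level += dt * exchange
        history.append((river_level, gw_level))
    return history


# Without inflow the two levels relax towards a common value
levels = run_dynamic_coupling(river_level=2.0, gw_level=1.0,
                              leakage=0.1, inflow=0.0, dt=1.0, n_steps=200)
```

Because the same flux is removed from one store and added to the other in each step, water is conserved across the interface. A sequential coupling would instead run one component for the whole period and pass its results on as boundary conditions, as in interfaces B, C and E.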

The integrated model was established for the 3,000 km² area on the basis of a large amount of good<br />

quality data. Most of the model parameters were assessed directly from field data, and some were estimated<br />

through calibration. For most of the individual model components, traditional split-sample validation<br />

tests were carried out.<br />
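A split-sample test calibrates a model on one part of the record and evaluates it, without further adjustment, on the remainder. The sketch below assumes the Nash-Sutcliffe efficiency as the performance measure and uses invented numbers; neither the criterion nor the data are taken from [9], and the function names are hypothetical.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than the mean."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - squared_error / variance


def split_sample_scores(observed, simulated, split_index):
    """Return (calibration score, validation score) for one split of the record."""
    calibration = nash_sutcliffe(observed[:split_index], simulated[:split_index])
    validation = nash_sutcliffe(observed[split_index:], simulated[split_index:])
    return calibration, validation


# Invented discharge values, for illustration only
obs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
sim = [1.1, 2.0, 2.9, 4.5, 5.0, 5.5]
cal_score, val_score = split_sample_scores(obs, sim, split_index=3)
```

A validation score clearly below the calibration score would indicate that the calibrated parameters do not transfer well to the independent period.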

The modelling system was used in a scenario approach to assess the environmental impacts of alternative<br />

water management options. The uncertainties of the model predictions were assessed through<br />

sensitivity analyses. As an example, Figs. 22 and 23 show a characterisation of the floodplain area<br />

between the (old) main Danube river channel (western model boundary) and the power canal for pre-dam<br />

(Fig. 22) and a hypothetical post-dam condition (Fig. 23) where the major part of the water is diverted<br />

from the main Danube channel to the power canal. The classes with different ground water depths<br />

and flooding have been determined from ecological considerations according to requirements of<br />

(semi)terrestrial (floodplain) ecotopes. For the pre-dam condition (Fig. 22) the contacts between the main<br />

Danube river and the river branch system are clearly seen. Similar results for a hypothetical post-dam water<br />

management regime (Fig. 23) show significant differences in hydrological regime, e.g. many areas are<br />

characterised by high groundwater tables and small/seldom flooding, while the pre-dam situation (Fig. 22)<br />

generally has deeper ground water tables and more frequent flooding. From such changes in hydrological<br />

conditions inferences can be made on possible changes in the floodplain ecosystem.<br />

Fig. 22 Hydrological regime in the river branch area for 1988 pre-dam conditions characterised in ecological<br />

classes<br />

Fig. 23 Hydrological regime in the river branch area for a post-dam water management regime characterised<br />

in ecological classes. The scenario has been simulated using 1988 observed upstream discharge<br />

data and a given hypothetical operation of the hydraulic structures.<br />

Discussion - post evaluation<br />

The uniqueness of the established modelling system is the integration between the individual model<br />

codes, each of which provides complex distributed physically-based descriptions of the various processes.<br />

The validation tests have generally been carried out for the individual models, whereas only few<br />

tests on the integrated model were possible. Altogether, the integrated modelling system and the applications<br />

were more comprehensive and complex in terms of interactive dynamics between different<br />

components of an ecosystem than had previously been reported in the scientific literature.<br />

In the years following [9] a few comprehensive large scale studies with coupled models emerged. The<br />

most comprehensive of those was probably Wolf et al. (2003), who developed STONE for calculating<br />

nutrient emissions from agriculture in The Netherlands. Although based on different codes,<br />

STONE resembles the integrated modelling system in [9] in terms of number of codes and complexity<br />

of process descriptions. One main difference, however, was that STONE consists of a chain of models<br />

without the feedback couplings that characterise [9]. Simpler, although still comprehensive, modelling<br />

systems were presented by Birkinshaw and Ewen (2000) as the SHETRAN code with a built-in nitrate<br />

transformation component and Conan et al. (2003) with a coupling of SWAT, MODFLOW and MT3DMS<br />

also focusing on nitrate fate at catchment scale.<br />

The complexity of the modelling studies in [9] may be compared to coupled modelling studies in<br />

neighbouring fields. The hydrology related field with the strongest modelling traditions is no doubt<br />

atmospheric science. Here very comprehensive coupled models have been used in connection with<br />

hydrology oriented climate change studies. An example of a sequentially coupled atmospheric-hydrological<br />

model from that period is Graham (1999), who used the ECHAM4 regional atmospheric<br />

model coupled with the HBV hydrological model to simulate discharge for the entire 1.6 × 10⁶ km² Baltic<br />

Sea basin. The atmospheric modelling component is in itself more demanding in terms of computer<br />

power than comprehensive hydrological modelling such as [9], and the complexity of the atmospheric<br />

modelling is maybe larger than the complexity of the individual process model codes in [9]. Otherwise<br />

the complexity of the coupled atmospheric-hydrological studies with respect to feedback couplings between<br />

process descriptions, data requirements, different scales for different processes, etc., may be<br />

considered comparable to the complexity of [9].<br />

In retrospect it is interesting to evaluate how much this comprehensive modelling system actually was used<br />

as part of the political decision process. Was the full potential of the models utilised by the decision<br />

makers? In the following, my personal perception of these aspects is presented. The application of the<br />

integrated modelling and information system in practice may be categorised in three principally different<br />

functions: (a) to assist in design of structures and details of water management regimes, (b) to assist in<br />

policy analysis by assessing the environmental impacts of alternative water management regimes, and (c)<br />

to assist in resolving different views between interest groups on environmental assessments.<br />

The use of models to assist in designs is the classical "engineering" way of using such models. There were<br />

a number of such applications. The best example of this is the final design in 1993 of the guiding structures<br />

of the Cunovo reservoir that was based on model simulations. Such model use was possible, because the<br />

objectives of the decision-makers were clear and there was an urgent need for the results before the<br />

construction works actually started.<br />

Use of models to assess the environmental impacts of alternative water management regimes was one of<br />

the primary reasons for establishing the modelling systems. There were several examples of such model<br />

applications. A key example was a combined field and modelling study of the geochemical conditions in the<br />

aquifer to assess whether the changed boundary conditions with the new reservoir would affect the redox<br />

conditions and hence the groundwater quality in the aquifer that forms the basis for the water supply of<br />

Bratislava. Another example is a combined field and modelling study of the eutrophication conditions in the<br />

reservoir. Such studies were conducted in close dialogue with the decision-makers in order to assist in their<br />

policy formulation.<br />

Finally, the modelling system was an invaluable tool in connection with the international attempts made to<br />

assist in resolving some of the issues that were disputed between Slovakia and Hungary. Many of the<br />

arguments brought forward on these highly controversial issues were mixtures of scientifically based facts<br />

and politically based views, but they were often claimed as purely scientifically based. It is very natural and<br />

fully legitimate that all parties have political interests and do their best to pursue them. However, the mixing<br />

of scientific facts and political interest makes the whole scene less transparent and may be an obstacle for<br />

arriving at rational decisions. The role the modelling system had in this context was that it made it possible<br />

on some occasions to help distinguish between facts and fiction with respect to the scientific arguments. In<br />

this way the modelling tools assisted in separating scientific and political problems. Thus, the modelling<br />

system was often used as an important tool in resolving technical disagreements between the Slovakian<br />

and Hungarian delegations in the international expert groups (EC, 1992, 1993a, 1993b). Similarly, it is my<br />

impression that the modelling results played a significant role for the International Court of Justice when<br />

dealing with the question of whether the ecological situation could be characterised as a catastrophe<br />

justifying the use of the legal principle of “the ecological state of necessity” as done when Hungary stopped<br />

the construction works on the Gabcikovo scheme in 1989 (ICJ, 1997).<br />

However, there were also clear limitations to the application of the modelling tools. These limitations<br />

occurred when the political objectives were not clearly defined. It was for instance imagined that the<br />

modelling tools should be used to identify the optimal solution for the water management regime in the river<br />

branch system. This unique area is, however, subject to considerable interest from different sectors such<br />

as commercial forestry, fishery, tourism and natural conservation. The requirements of these different<br />

sectoral interests are not common and in some cases even contradictory with respect to how the water<br />

regime should be. Thus, until the balance of interests between these different stakeholders has been<br />

decided in terms of clear political goals from the government, an optimal solution does not exist. Another<br />

example of lack of clear political goals was related to the overall sharing of water between hydropower and<br />

the environment.<br />

3.2.3 Large scale modelling of groundwater contamination ([10])<br />

Summary<br />

Publication [10] describes results from an EU research project on groundwater pollution from non-point<br />

sources. The rationale outlined in [10] is that physically based models for describing nitrate may, owing to their better<br />

process descriptions, be expected to have better predictive capabilities than simpler empirical<br />

models for certain applications related to assessing the impacts of changes in agricultural management<br />

practice. Such models were well proven for simulation of nitrate contamination at small scale with good<br />

data availability. Two of the main constraints for using such models operationally were that (a) the databases<br />

existing at national or European scale had not previously been tested as input for such models;<br />

and (b) almost no tests had been conducted for such models at large scale. The objectives of the paper<br />

were therefore to study the data availability at the large scale and develop methodologies for model<br />

upscaling/aggregation to represent conditions at larger scale. The theoretical aspects on scaling included<br />

in [10] are dealt with in Section 4.1. Here some key results from one of the two catchments (Karup)<br />

are discussed.<br />

The modelling system used was MIKE SHE (Refsgaard and Storm, 1995) coupled with the DAISY root<br />

zone model (Hansen et al., 1991). Two Danish catchments of about 500 km² each, Karup and Odense,<br />

were used for the tests.<br />

The principles used for collecting input data and assessing values of model parameters were:<br />

• The data must be easily accessible. This implied that most of the data were aggregated data from<br />

national or European databases.<br />

• No model calibration is carried out. Instead parameter values are estimated from generic transfer<br />

functions.<br />

Data were collected from the following sources:<br />

• Topography: 1 km grid data downloadable from USGS and GISCO (Geographical Information System<br />

of the European Commission)<br />

• Catchment boundaries and river network: generated from the topographical data using standard<br />

GIS functionality.<br />

• River cross-sections: derived from a special GIS application where the cross-section was estimated<br />

based on upstream catchment area, slope and a characteristic discharge.<br />

• Soil type: GISCO soil map.<br />

• Soil organic matter: experience values.<br />

• Vegetation: EEA CORINE land cover map.<br />

• Agricultural management practice: agricultural statistics and government-prescribed norms<br />

• Geology and groundwater abstraction: EC report<br />

• Climatic variables and discharge data: national data<br />

The MIKE SHE models were run with 1, 2 and 4 km grids. For describing the nitrate leaching from the<br />

root zone, 17 crop rotation schemes were established by use of DAISY. The crop rotations were based<br />

on the statistical information on crop type and livestock densities. The 17 schemes were distributed<br />

randomly over the catchment in such a way that the statistical distribution was in accordance with the<br />

agricultural statistics. As an alternative, all the agricultural area was described by one representative<br />

crop instead of 17 cropping patterns. These two approaches are denoted ‘Distributed’ and ‘Uniform’ in<br />

Figs. 24 and 25 below.<br />
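The 'Distributed' allocation can be sketched as follows. This is an assumed procedure, not the code used in [10]: the function name `allocate_schemes`, the largest-remainder rounding and the example shares are hypothetical. Each agricultural grid cell receives exactly one scheme, the cell counts follow the target shares, and the positions are shuffled at random.

```python
import random


def allocate_schemes(n_cells, shares, seed=0):
    """Assign one crop rotation scheme to each of n_cells grid cells.

    shares maps scheme name -> fraction of the agricultural area
    (fractions are expected to sum to 1). Cell counts are rounded with
    the largest-remainder rule so that they add up to n_cells.
    """
    exact = {name: fraction * n_cells for name, fraction in shares.items()}
    counts = {name: int(value) for name, value in exact.items()}
    leftover = n_cells - sum(counts.values())
    # hand the remaining cells to the schemes with the largest remainders
    by_remainder = sorted(exact, key=lambda name: exact[name] - counts[name],
                          reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    cells = [name for name, count in counts.items() for _ in range(count)]
    random.Random(seed).shuffle(cells)  # random placement, reproducible via seed
    return cells


# Three hypothetical schemes instead of the 17 used in the study
layout = allocate_schemes(7, {"A": 0.5, "B": 0.25, "C": 0.25})
```

The 'Uniform' alternative corresponds to passing a single scheme with a share of 1, so that every cell carries the same representative crop.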

The Karup model was validated by comparison of model simulations and field data on annual water<br />

balances, discharge hydrographs (Fig. 24) and nitrate concentrations in the upper groundwater layer<br />

from 35 observation wells (Fig. 25). The results of the validation tests were characterised as follows:<br />

• The annual water balance was simulated remarkably well with only 2% difference as average value<br />

over the five years validation period. The variation over the year (Fig. 24) is less well described.<br />

• The simulated nitrate concentrations (Fig. 25) match the observed data remarkably well both with<br />

respect to average concentrations and statistical distribution of concentrations within the catchment.<br />

• The simulations are clearly affected by various scale effects (1, 2, 4 km grid and Distributed/Uniform).<br />

This is addressed further in Section 4.1 below.<br />
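The two comparisons reported above, the relative water balance error and the cumulative frequency curves of Fig. 25, are simple to compute. The sketch below uses hypothetical helper names and invented numbers purely for illustration; it does not reproduce the data of [10].

```python
def relative_balance_error(observed_runoff, simulated_runoff):
    """Relative difference in total runoff over the whole test period."""
    total_observed = sum(observed_runoff)
    return (sum(simulated_runoff) - total_observed) / total_observed


def cumulative_frequency(concentrations, thresholds):
    """Fraction of values at or below each threshold (empirical distribution)."""
    n = len(concentrations)
    return [sum(1 for c in concentrations if c <= t) / n for t in thresholds]


# Invented annual runoff (mm) and nitrate concentrations (mg/l)
error = relative_balance_error([400.0, 500.0, 600.0], [410.0, 500.0, 620.0])
curve = cumulative_frequency([10.0, 30.0, 50.0, 70.0], [0.0, 30.0, 100.0])
```

Evaluating `cumulative_frequency` for the observed wells and for the simulated grid cells on the same thresholds gives curves that can be compared directly, as in Fig. 25.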

Fig. 24 Comparison of the recorded discharge hydrograph for the Karup catchment with simulations<br />

based on 1, 2 and 4 km grids. The two simulated curves correspond to the combined upscaling/aggregation<br />

procedure (Distributed) and the simpler upscaling procedure (Uniform).<br />

[Fig. 25, two panels titled “Distribution of groundwater concentrations (ultimo 1993)”: upper panel, uniform agricultural representation; lower panel, distributed agricultural representation. Horizontal axis: concentration (mg/l), 0 to 180; vertical axis: cumulative frequency; curves: Measured and the simulations det1000, det2000 and det4000.]<br />

Fig. 25 Comparison of the statistical distribution of nitrate concentrations in groundwater for the Karup<br />

catchment simulated by the model with 1, 2 and 4 km grids and observed in 35 wells. The lower figure corresponds<br />

to the upscaling procedure resulting in a distributed representation of agricultural crops, while<br />

the upper figure is from the run with the upscaling procedure, where all agricultural area is represented<br />

by one uniform crop.<br />

Discussion - post evaluation<br />

The model codes used in [10] were well known and previously used in one of the catchments (Styczen<br />

and Storm, 1993a, b). The scientific contributions of [10] relate partly to scaling issues, which are dealt<br />

with in Section 4.1 below, and partly to testing the performance of nitrate catchment models when<br />

scarce data are used and when no model calibration is carried out. The most important finding with<br />

respect to data availability is probably that aggregated data in many cases can provide sufficient input<br />

to perform useful model simulations. This message is similar to the output from the first large scale application<br />

of SHE to catchments in India with scarce data ([4] and [5]), namely that an apparent lack of<br />

primary data should not always prevent you from using a model.<br />

With regard to data availability at large scale it was concluded that the most critical data that may cause<br />

problems for large scale applications are the geological data for which no suitable global or European<br />

digital database exists. In this respect the development of a national hydrological model in Denmark<br />

(Henriksen et al., 2003) that is based on comprehensive geological data from the very large national<br />

geological database is an important development.<br />

The study showed that one of the strengths of physically-based models is the possibility to assess<br />

many parameter values from standard values, achieved from experience through a number of other<br />

applications. It also showed some of the limitations in this respect. While the key results in terms of<br />

annual runoff and nitrogen concentration distributions are encouraging, the discharge hydrographs<br />

clearly illustrate that it would be very easy to obtain a better hydrograph fit through calibration of a couple<br />

of parameter values. When parameters are assessed in this way they are subject to considerable<br />

uncertainty, which will generate significant uncertainty in model predictions. This aspect is addressed in<br />

([11]) which is discussed in Section 4.3 below.<br />

The attempt to assess parameter values directly from data without any model calibration can be seen<br />

as the extreme end of the development starting with hundreds of free parameters in the Suså model<br />

([1]), over 26 parameters in the Kolar basin in India ([5]), to 11 free parameters in a previous Karup<br />

study ([7]). The results from the present study showed some obvious shortcomings of this approach,<br />

and in a later study of the Senegal basin (Andersen et al., 2001) we used 4 free parameters for calibration.<br />


3.3 Real-time Flood Forecasting<br />

3.3.1 Intercomparison of updating procedures for real-time forecasting ([8])<br />

Summary<br />

Publication [8] presents a classification of updating procedures used in real-time flood forecasting modelling<br />

and a review of the results from the WMO project ‘Simulated Real-Time Intercomparison of Hydrological<br />

Models’ (WMO, 1992) comprising more than 10 commonly used hydrological model codes<br />

and a variety of different updating procedures. The objective of the paper was to analyse the performance<br />

of different types of updating procedures and to assess which is more important, the simulation<br />

model or the updating procedure.<br />

In the context of real-time forecasting a hydrological catchment model, like those in the remaining part of<br />

this thesis, may be denoted a process model (Fig. 26). A process model consists of a model structure<br />

including process equations, model parameters that are constant throughout a model run and state<br />

variables. The transformation from input to output by the process model is called simulation, in accordance<br />

with the terminology defined in Section 2.2 above. Process models that operate in real-time may<br />

take into consideration the measured discharge/water level at the time of preparing the forecast. This<br />

feedback process of assimilating the measured data into the forecasting procedure is referred to as<br />

updating, or data assimilation. Updating procedures can be classified according to four different methodologies<br />

(Fig. 26):<br />

1. Updating of input variables, typically by adjusting precipitation.<br />

2. Updating of state variables, e.g. the soil moisture content.<br />

3. Updating of model parameters.<br />

4. Updating of output variables (error prediction).<br />
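
The fourth methodology, updating of output variables, can be illustrated with a minimal Python sketch. It assumes a first-order autoregressive error model with a hypothetical persistence coefficient `rho`; this is a generic textbook form and not the specific error prediction routine of NAMS11 or of any other code in the intercomparison. All numbers are invented for illustration.<br />

```python
import numpy as np

def update_forecast(simulated_future, last_error, rho):
    """Add a predicted error to the process-model simulation.

    Assumption: the error at lead time k is rho**k times the last
    observed error (first-order autoregressive error model).
    """
    sim = np.asarray(simulated_future, dtype=float)
    leads = np.arange(1, sim.size + 1)
    return sim + last_error * rho ** leads

# Hypothetical discharges (m3/s) up to the time of forecast
observed = np.array([10.0, 12.0, 15.0, 19.0])
simulated = np.array([9.0, 10.5, 13.0, 16.5])
last_error = observed[-1] - simulated[-1]  # 2.5

# Hypothetical process-model simulation for the next three time steps
simulated_future = np.array([18.0, 17.0, 15.0])
rho = 0.8  # assumed error persistence, would normally be estimated

forecast = update_forecast(simulated_future, last_error, rho)
```

With `rho` below one the correction decays with lead time, so the updated forecast approaches the plain simulation at long lead times.<br />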

The core of the WMO project was a workshop held in Vancouver during the period July 30 – August 8,<br />

1987, where 15 models from 14 different organisations were run in a simulated real-time environment.<br />

Data from three catchments with significantly different hydrological characteristics were used for the<br />

tests. Before the workshop the modellers had received historical data for several years for calibration<br />

and validation and two ‘warm up’ flood events. During the workshop four additional flood events were<br />

forecasted as blind tests, each with seven forecasts at consecutive times. Each event was forecasted<br />

within one workshop day, often under considerable time pressure.<br />

I participated in the workshop with two models that differed both with respect to process model and<br />

updating procedure:<br />

• NAMS11 comprising the NAM as catchment model, St. Venant river routing and an error prediction<br />

model as updating procedure. This is basically identical to what later became known as the flood<br />

forecasting module of MIKE 11 (Havnø et al., 1995).<br />


• NAMKAL comprising the NAM formulated in a state-space form and built into an extended Kalman<br />

filter for updating. This version had no separate river routing but relied on the linear reservoirs in<br />

NAM.<br />

The two models were tested on the 104 km² Orgeval catchment (France) and the 2,344 km² Bird Creek<br />

catchment (United States). The models were not tested on the third, snow-dominated catchment.<br />

Fig. 26 Schematic diagram of simulation and forecasting with illustration of four different updating<br />

methodologies [8].<br />

Summary results from the two catchments are shown in Fig. 27 as root mean square errors (RMSE) as<br />

a function of forecast lead time (lag). As can be seen from the figure the intercomparison test turned out<br />

to be a very close ‘race’ with at least one third of the models performing almost equally well. Depending<br />

on the selected criteria for comparison (which catchment, priority to short, medium or long lead times,<br />

etc.) several of these could claim to be the ‘best model’. Perhaps more interesting are some of the<br />

general findings:<br />

• The process models belonged to two of the classes shown in Fig. 6, namely empirical (black box)<br />

models and lumped conceptual models. From the results it was not possible to clearly distinguish<br />

which model type performed better.<br />

• All four types of updating procedures were represented, both among the models with the best performance<br />

and among the models with the poorest performance. This indicates that the selection of<br />

a specific updating methodology is only one out of several important factors.<br />

• The forecast error (RMSE) generally increases with forecast lead time. This shows that updating<br />

procedures most often significantly improve the performance of hydrological models for short-range<br />

forecasting.<br />

• In most cases the models with the best performance for short lead times were also those with the<br />

best results for the long lead times. This indicates that the goodness of the basic simulation (by the<br />


process model) is crucial to forecast accuracy, or in other words that a good updating procedure<br />

cannot compensate for a poor process model.<br />
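
The root mean square error as a function of forecast lead time, the measure behind the findings above, can be computed as in the following sketch. The array layout (one row per forecast, one column per lead time) and the numbers are assumptions made for illustration; they are not data from the WMO project.<br />

```python
import numpy as np

def rmse_by_lead_time(forecasts, observations):
    """Return the RMSE for each lead time.

    Both inputs have shape (n_forecasts, n_lead_times); the error is
    averaged over the forecasts (rows) for every lead time (column).
    """
    f = np.asarray(forecasts, dtype=float)
    o = np.asarray(observations, dtype=float)
    return np.sqrt(np.mean((f - o) ** 2, axis=0))

# Two hypothetical forecasts, each with three lead times
forecasts = np.array([[10.0, 12.0, 15.0],
                      [20.0, 22.0, 21.0]])
observations = np.array([[11.0, 14.0, 18.0],
                         [19.0, 20.0, 24.0]])

rmse = rmse_by_lead_time(forecasts, observations)
```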

Discussion - post evaluation<br />

Real-time forecasting is the toughest field I have experienced in hydrological modelling with respect to<br />

model validation, because the results of the model forecasts are continuously confronted with observations.<br />

In many studies involving model simulations for planning purposes it is often not possible to conduct<br />

a validation test that exactly fits the conditions for which model simulations of future conditions are<br />

needed. Therefore, the validation test results will often have many qualifiers and be considered together<br />

with other arguments. In real-time flood forecasting there is no need for such qualifiers and arguments<br />

(‘no nonsense’) and therefore only the hard facts are considered.<br />

Fig. 27 Root Mean Square Errors (RMSE) as a function of forecast lead time for all models participating<br />

in the Orgeval and Bird Creek catchments. The RMSE values are averaged over the four forecasted<br />

flood events with blind tests (events 3-6), [8].<br />


The main scientific contribution of [8] was the analysis of the performance of different types of process<br />

models and updating procedures and combinations thereof. Our motivations for participating in this unique<br />

WMO intercomparison project were (a) to test DHI’s code NAMS11 (now MIKE 11), which was used<br />

operationally in India at that time, in an intercomparison with some of the internationally leading codes<br />

and modellers; and (b) to test whether an extended Kalman filter could provide a better updating routine<br />

than the more commonly used and simpler error prediction routine. In addition to noting that the<br />

NAMS11 performed very well and that the extended Kalman filter under ideal conditions could perform<br />

marginally better than the standard updating procedure, the analysis led to the following interesting<br />

findings:<br />

• It was not possible to conclude which model type, black box or lumped conceptual, is better suited<br />

for simulation of runoff. This is in good agreement with [6] and later studies such as Reed et al.<br />

(2004), which concluded that lumped conceptual and distributed physically-based models performed<br />

equally well for split-sample tests. Thus it may be argued that all three model types described<br />

in Section 2.4 in many cases can be expected to be able to perform equally well in rainfall-runoff<br />

modelling.<br />

• It turned out that the personal factor is maybe the most important aspect of hydrological modelling.<br />

It was clear after the workshop that the difference in model performances between the participating<br />

codes could often not be explained by differences in model codes. Personal factors such as the<br />

modeller’s ability to make a good model calibration, experience from working in hydrological regimes<br />

different from the regime you see in your home office, ability to work under extreme stress,<br />

level of preparation beforehand and random luck also played important roles. The personal factor is<br />

most often overlooked in natural science, maybe because it is subjective in nature and therefore<br />

does not fit well into the methods usually adopted in natural science. The ultimate consequence of<br />

this finding is that good quality of modelling results requires both use of good scientifically based<br />

methodologies and adoption of sound practices by competent professionals. This consequence was<br />

not derived in [6] but is central for recent work on quality assurance guidelines in the modelling<br />

process ([13]).<br />

Most of the model codes that participated in the intercomparison study were state-of-the-art hydrological<br />

model codes such as Sacramento (Burnash, 1995), HBV (Bergström, 1995) and MIKE 11<br />

(NAMS11) with comprehensive experience in operational flood forecasting. These codes are still<br />

among the most commonly used today. The updating techniques tested in [8] are also still the basic<br />

techniques used operationally today, although more sophisticated developments and improvements<br />

have taken place, e.g. a combination of the Kalman filtering and the error prediction procedure (Madsen<br />

and Skotner, 2005).<br />


4. Key Issues in Catchment Scale Hydrological Modelling<br />

4.1 Scaling<br />

This section provides a discussion of catchment heterogeneity and upscaling in relation to catchment<br />

modelling based partly on the publications in the present thesis (most importantly [7] and [10]) and<br />

partly on other previous work such as Refsgaard (1981), the foundation of [1] and [2], and Refsgaard<br />

and Butts (1999) that was heavily inspired by the EU research project behind [10] and [11].<br />

Hydrological modelling is being carried out at spatial scales ranging from pore scale to global scale and<br />

a variety of scaling theories has been developed, see e.g. Blöschl and Sivapalan (1995) and Beven<br />

(1995). Many of the scaling theories consider different spatial scales for single processes. For catchment<br />

modelling it is necessary to include several processes and their linkages.<br />

4.1.1 Catchment heterogeneity<br />

Catchment properties exhibit spatial variability. For almost all properties this heterogeneity is very large<br />

and dominates the behaviour of the catchment. Scaling is basically a question of how to handle heterogeneity<br />

at different spatial scales. Different model types do this in fundamentally different ways. Let us illustrate<br />

this by two examples.<br />

As the first example, let us consider an idealised description of flow through the root zone (Fig. 28). If a<br />

soil column, initially dry, is supplied with a certain amount of water it will retain water, until it is filled to a<br />

certain level, the field capacity θ’_F, whereupon all the supplied water will pass through. This is illustrated<br />

in Fig. 28 A, B, C, where also the frequency and the distribution of θ_F are shown. If we then consider a<br />

catchment with a spatial variability in soil physical properties, the frequency and the distribution of the<br />

field capacity are illustrated in Fig. 28 D and E respectively. If the root zone of this catchment, initially<br />

dry, is being supplied with water, not all of the area will contribute to throughflow at the same time, as θ_F<br />

varies in the catchment. When, for instance, the rainfall has supplied the water amount θ’_F,m, it is seen<br />

from Fig. 28 E that field capacity has been reached in one half of the catchment, thus contributing to<br />

throughflow, while the other half of the catchment still retains the rain in its root zone.<br />

In a lumped model, such as NAM, such spatial variability is taken into account by using semi-empirical<br />

relations such as the dashed line in Fig. 28 F, where θ’_1 and θ’_2 typically have to be estimated from calibration.<br />

The difference between θ’_1 and θ’_2 can be seen as a measure of the heterogeneity of the catchment,<br />

or of the catchment input that is also assumed homogeneously distributed in a lumped approach.<br />

This way of accounting for the spatial variability in the process equations can be considered the heart of<br />

lumped models and also explains why the process equations in lumped models are fundamentally different<br />

from point scale physical process equations.<br />
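
The idealised root zone example can be reproduced numerically. The sketch below assumes a catchment of four equally sized units with invented field capacities; the function names and values are hypothetical. It contrasts the areal mean throughflow with the throughflow of a single column that uses the mean field capacity.<br />

```python
import numpy as np

def column_throughflow(supplied, field_capacity):
    """Idealised column: water is retained until field capacity is
    reached, after which all further supply passes through."""
    return np.maximum(0.0, supplied - field_capacity)

def catchment_response(supplied, field_capacities):
    """Return (fraction of area contributing, areal mean throughflow)
    for equally sized units with different field capacities."""
    fc = np.asarray(field_capacities, dtype=float)
    fraction = float(np.mean(supplied >= fc))
    mean_flow = float(np.mean(column_throughflow(supplied, fc)))
    return fraction, mean_flow

# Hypothetical field capacities (mm) of four equally sized units
field_capacities = np.array([50.0, 100.0, 150.0, 200.0])
supplied = 125.0  # mm of water supplied to an initially dry root zone

fraction, mean_flow = catchment_response(supplied, field_capacities)

# A single homogeneous column with the mean field capacity (125 mm)
lumped_flow = float(column_throughflow(supplied, field_capacities.mean()))
```

Here half of the area contributes and the areal mean throughflow is 25 mm, whereas the single column with the mean field capacity yields no throughflow at all. This is the reason why a lumped model needs a different, semi-empirical relation rather than the point scale equation with averaged parameters.<br />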

In a distributed model the spatial variability is taken into account by dividing the catchment into several<br />

smaller elements, which are then usually treated as homogeneous units, i.e. as a column in Fig. 28.<br />


However, the spatial variability of soil physical properties comprises both variability between different soil<br />

types and variability within the same soil type as illustrated in Fig. 29. It has been demonstrated in several<br />

studies (Nielsen et al., 1973; Jensen and Refsgaard, 1991a,b,c; Djurhus et al., 1999) that the spatial<br />

variability of e.g. soil properties within one standard soil type at field scale is very high and can significantly<br />

influence the water balance and solute transport at this scale.<br />

[Figure 28: panels A, B and C (soil column) and D, E and F (catchment), showing the frequency of θ_F, the distribution of θ_F, and through flow as a function of supplied water.]<br />

Fig. 28 Idealised description of the variation of field capacity, θ_F, and its effect on flow through the root<br />

zone in a soil column and in a catchment (Refsgaard, 1981).<br />

[Figure 29: frequency curves of field capacity, θ_F, showing the spatial variability within one of the soil types and in the entire catchment.]<br />

Fig. 29 The principle of spatial variability of a soil physical property within a single soil type and within a<br />

catchment containing more than one soil type (Refsgaard, 1981).<br />

Let us then turn to another example focusing on the limitation of a distributed model to resolve key features<br />

of a catchment. Fig. 30 shows the topography and river network for two models that are identical<br />


except for differences in spatial discretisation. It is clearly seen that the 500 m grid provides a much<br />

better resolution of the topography and the river network, and also of other catchment characteristics as<br />

explained in [7]. In the 2000 m grid the river valley cannot be described well and many of the smaller<br />

streams have to be omitted, where the distance between neighbouring streams is smaller than the<br />

model grid size. This significantly affects the stream-aquifer interaction and in this way the simulation of<br />

both river discharge and groundwater heads. As discussed in [7] a change in scale (grid size) in this<br />

way changes the model simulations. This can in some cases be compensated by adjusting parameter<br />

values. But it implies that parameter values are scale dependent and that the physical basis is reduced<br />

if the grid size is increased.<br />

Fig. 30 Topography, river network and model grid for two models with discretisations of 500 m and<br />

2000 m [7].<br />

This example focussed on river discharges and hydraulic heads at some given observational locations<br />

for which [7] argues that a 500 m resolution provides an adequate description. If we instead had focussed<br />

on other processes such as reactive transport in aquifers or in river valleys, we would have needed<br />

to account for geological and geomorphological heterogeneity of much smaller scale than 500 m. This<br />

line of argument can continue down to pore scale processes such as those described in [3]. The point is<br />

that, no matter which resolution a model has, it is always possible to find processes that require a<br />

smaller scale in order to provide a physically based description. Consequently, the ultimate distributed<br />

physically based model where everything is described can never be achieved. This implies that any<br />

distributed model needs to provide a kind of lumped conceptual representation at its scale of operation.<br />

An excellent example of this is the traditional advection dispersion equation with its associated dispersivities,<br />

where the dispersivities show the well known scale dependence (Gelhar, 1986). The process<br />

description of oxygen transport and consumption given in [3] is another example. Although meant for<br />


inclusion as a submodel in a distributed physically based model, [3] incorporates spatial heterogeneity<br />

of processes at pore scale (mm) to a process equation assumed valid at its scale of operation (grid<br />

points with 10-40 cm distance). This process equation can therefore be considered a lumped conceptual<br />

description at this scale.<br />

4.1.2 A scaling framework<br />

In this section we only consider the case of moving from the smaller to the larger scale, which is often<br />

denoted upscaling. When moving to larger scales the spatial variability of physical parameters and variables<br />

has to be taken into account. This can in principle be done in two ways, either by aggregation or<br />

upscaling (Heuvelink and Pebesma, 1999):<br />

• Upscaling means that the process equations and the associated parameters that basically constitute<br />

the model in principle are modified or substituted when moving from the smaller scale to the<br />

larger scale.<br />

• Aggregation means that the process equations are applied at the smaller scale (where they were<br />

derived) and the large-scale results are obtained by aggregating the small-scale results at the larger<br />

scale.<br />

Hence, in order not to confuse the terminology with two different meanings of the term upscaling the<br />

term scaling will in the following be used for the case of moving from modelling at the smaller scale to<br />

modelling at the larger scale. Thus, the term upscaling is reserved for the specific approach of scaling<br />

defined above.<br />

The differences between upscaling and aggregation are illustrated in Fig. 31 and some key characteristics<br />

are summarised in Table 1. At the smaller scale, the hydrological processes can be described by<br />

smaller scale equations and associated smaller scale parameters. If the aggregation approach is<br />

adopted for large-scale modelling, then the model is operated at the smaller scale units with smaller<br />

scale equations and parameters and the model output valid for the larger scale emerges after aggregation<br />

of the results. The aggregation consists of estimating the spatial mean and in some cases also the<br />

statistical distribution of the model outputs. If the model is linear or the parameters and variables are<br />

spatially constant, computational time may be saved by averaging of model parameters and input before<br />

running the model; otherwise the model runs must be made before the aggregation step.<br />
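
The condition in the last sentence can be written compactly. The notation is an assumption introduced here, not taken from the thesis: f denotes the smaller scale model, p_i the parameters or inputs of the N equally sized smaller scale units, and p̄ their spatial mean.<br />

```latex
\frac{1}{N}\sum_{i=1}^{N} f(p_i) = f(\bar{p})
\quad \text{if } f \text{ is linear in } p \text{ or } p_i = \bar{p} \text{ for all } i,
\qquad \text{where } \bar{p} = \frac{1}{N}\sum_{i=1}^{N} p_i .
```

For a non-linear f with spatially variable p_i the two sides generally differ, which is why the model then has to be run on every smaller scale unit before the results are aggregated.<br />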

Table 1. Characteristics of different scaling procedures when moving from a smaller scale (SS) to a<br />

larger scale (LS).<br />

| Aggregation | Upscaling: SS equations used at LS | Upscaling: large-scale PDE | Upscaling: LS equations developed<br />
Basis of process descriptions | Smaller scale | Smaller scale | Smaller scale | Larger scale<br />
Computational unit | Smaller scale | Larger scale | Larger scale | Larger scale<br />
Parameter estimation possible from field data | Yes | No, some values need calibration | Yes | No, some values need calibration<br />


Fig. 31 Upscaling and aggregation methods for extending hydrological processes from small-scale (SS)<br />

to large-scale (LS) models (Refsgaard and Butts, 1999).<br />

If the upscaling approach is adopted for the large-scale modelling, the smaller scale equations and parameters<br />

are in principle substituted by larger scale ones. The upscaling approach can be carried out in<br />

three different ways:<br />

• The smaller scale equations are assumed valid also at the larger scale. In this case the parameter<br />

values have to be estimated as effective parameters corresponding to the larger scale computational<br />

unit. Effective parameters are single values, similar to point scale parameters, that somehow<br />

reproduce the bulk behaviour of a heterogeneous medium. The estimation of parameter values is in<br />

such cases often done by calibration, at least for a handful of the key parameters. An example of this<br />

approach is given in [5] describing an application of the SHE to a large catchment in India using<br />

spatial grid sizes of 2 km x 2 km.<br />

• The equations at the larger scale are derived in a theoretical framework from a set of deterministic<br />

partial differential equations (PDE) assumed valid at the smaller scale and assumptions on the spatial<br />

variability of key parameters and/or input data. This is often carried out in a stochastic framework<br />

where quantities such as the average value and higher order statistical moments of the desired<br />

model output variables can be assessed. An example of this approach is Jensen and Mantoglou<br />

(1992) who consider the spatial variability of soil hydraulic parameters in field scale modelling.<br />

In this case the parameter values may be assessed directly on the basis of smaller scale information.<br />

• The equations at the larger scale are developed at the larger scale using a concept that does not<br />

explicitly consider the smaller scale equations, i.e. the formulation of laws that apply at the large<br />

scale. Examples of this approach are the conceptual rainfall-runoff models such as the NAM (Nielsen<br />

and Hansen, 1973; [6]; [8]), cf. Fig. 28 and the discussion above. The oxygen model described<br />

in [3] is also an example of this approach, although smaller scale and larger scale here refer to mm<br />

and dm scales and not to catchment scale. As a result of the larger scale concepts such codes are<br />

often not adequate for smaller scale applications, and parameters can most often not be assessed directly<br />

from small scale information.<br />

4.1.3 Scaling - an example<br />

The above four scaling approaches each have their advantages and limitations and the specific approach<br />

to use in particular applications will depend on many factors such as the purpose of a given<br />

study, the dominating processes in the particular hydrological regime and the data availability. Thus, no<br />

unique approach can be claimed superior in all cases. As illustrated below, scaling procedures are in<br />

practice often based on combinations of the above approaches.<br />

The example outlines the scaling methodologies adopted under an EU research project dealing with<br />

uncertainties of assessing non-point pollution to aquifers at the European scale (Refsgaard et al., 1998;<br />

[10]). During this project two model codes were used:<br />

• SMART2 for studying leaching to groundwater of nitrate and aluminium from natural areas due to<br />

atmospheric deposition. SMART2 is a relatively simple dynamic model operating in vertical columns<br />

with annual time steps (Kros et al., 1995).<br />

• MIKE SHE/DAISY for studying groundwater contamination from agricultural areas. Both MIKE SHE<br />

(Refsgaard and Storm, 1995) and DAISY (Hansen et al., 1991) are physically-based model codes<br />

with detailed process descriptions and typically hourly time steps.<br />

The objective of the project was to assess the uncertainty in model predictions when applied at the<br />

European scale. As both codes had been developed for and previously mainly been applied at much<br />

smaller scales a scaling procedure had to be adopted. The two scaling procedures, illustrated in Fig.<br />

32, show significant differences:<br />

SMART2 operates at a 1 km grid scale. It was developed on the basis of experience with the NUCSAM<br />

code (Groenenberg et al., 1995), which is a detailed physically-based code operating at point<br />

scale. Thus, SMART2 can be considered as an upscaling of NUCSAM with new equations and parameters<br />

applicable at the 1 km scale, equivalent to the upscaling procedure of the conceptual hydrological<br />

models described above. For application to the Netherlands the SMART2 model results were aggregated to a 5<br />

km x 5 km grid by selecting the median value among the 25 grids of 1 km x 1 km size. The parameters<br />

were assessed by pedotransfer functions from field data without prior model calibration. The scaling<br />

procedure from point scale to national or European scales thus consists of a combination of an upscaling<br />

and an aggregation step.<br />
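
The aggregation step described above, taking the median of the 25 grids of 1 km x 1 km within each 5 km x 5 km grid, can be sketched as follows. The function name and the test grid are hypothetical, and the sketch assumes a rectangular array whose dimensions are multiples of the block size; it is not the procedure as implemented in the project.<br />

```python
import numpy as np

def block_median(grid, block=5):
    """Aggregate a 2-D grid to coarser cells by taking the median of
    each block x block window of fine cells."""
    g = np.asarray(grid, dtype=float)
    rows, cols = g.shape
    if rows % block or cols % block:
        raise ValueError("grid dimensions must be multiples of the block size")
    # Axes after reshape: (coarse row, fine row, coarse col, fine col);
    # swap so that the two fine axes of every block are adjacent.
    blocks = g.reshape(rows // block, block, cols // block, block).swapaxes(1, 2)
    flat = blocks.reshape(rows // block, cols // block, block * block)
    return np.median(flat, axis=2)

# Hypothetical 10 km x 10 km area of 1 km cells with value 10*row + col
fine = np.arange(100, dtype=float).reshape(10, 10)
coarse = block_median(fine, block=5)
```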

MIKE SHE/DAISY, on the other hand, is in this case run with equations and parameter values in each<br />

model grid point representing field scale conditions. The field scale is characterised by ‘effective’ soil<br />

and vegetation parameters, but assuming only one soil type and one cropping pattern. The smallest<br />

horizontal discretisation in the model is the grid scale (1-5 km) that is larger than the field scale. This<br />

implies that all the variations between categories of soil type and crop type within the area of each grid<br />

cannot be resolved and described at the grid level. Input data, whose variations are not included in the<br />


grid scale representation, are distributed randomly at the catchment scale so that their statistical distributions<br />

are preserved at that scale. The results from the grid scale modelling are then aggregated to<br />

catchment scale (10-50 km) and the statistical properties of model output and field data are then compared<br />

at catchment scale (Hansen et al., 1999; [10]). Thus the scaling procedure from point scale to<br />

catchment scale is again a combination of an upscaling step and an aggregation step. In contrast to the<br />

NUCSAM-SMART2 case the upscaling step here is simply the (important) assumption that the point<br />

scale equations are valid at field scale. The aggregation step highlights a key issue from the concept of<br />

Representative Elementary Area, REA (Wood et al., 1988), namely that variability can be explicitly represented<br />

only at scales larger than the model grid size.<br />
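
The random distribution of input data at catchment scale, with the statistical distribution preserved, can be sketched as below. The crop names, area fractions, cell count and function name are invented for illustration, and the largest-remainder rounding is an assumption; the thesis does not describe how the assignment was implemented.<br />

```python
import numpy as np

def distribute_randomly(categories, fractions, n_cells, seed=0):
    """Assign one category to each grid cell at random so that the
    category shares over all cells match `fractions` as closely as
    the number of cells allows."""
    fractions = np.asarray(fractions, dtype=float)
    if not np.isclose(fractions.sum(), 1.0):
        raise ValueError("fractions must sum to 1")
    exact = fractions * n_cells
    counts = np.floor(exact).astype(int)
    # Hand out the remaining cells to the largest remainders first.
    shortfall = n_cells - int(counts.sum())
    for idx in np.argsort(counts - exact)[:shortfall]:
        counts[idx] += 1
    cells = np.repeat(np.asarray(categories), counts)
    np.random.default_rng(seed).shuffle(cells)  # location is random
    return cells

# Hypothetical catchment of 20 grid cells and three crop types
crops = ["wheat", "barley", "grass"]
cells = distribute_randomly(crops, [0.5, 0.3, 0.2], n_cells=20)
shares = {c: float(np.mean(cells == c)) for c in crops}
```

The shares over the catchment equal the prescribed fractions, while the location of each crop within the catchment carries no information.<br />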

Validation tests against field data suggested that the two different scaling procedures basically could be<br />

assumed valid for their respective cases, although important limitations were also identified. An important<br />

question regarding the differences between the two upscaling methods is why it was apparently<br />

possible to make the large upscaling step from the smaller scale NUCSAM to the larger scale SMART2<br />

code, while a similar step was not judged possible for the MIKE SHE/DAISY code. The answer may be<br />

that the nitrogen leaching in agricultural fields is a highly non-linear and dynamic process that depends<br />

on cropping pattern and agricultural management practice, which cannot be lumped to a larger scale<br />

description, while the geochemical processes below natural lands, where no management practice is<br />

interfering, more easily can be represented by long term average simulations focussing on the gradual<br />

reduction of the chemical buffer capacities due to the acids in the atmospheric deposition.<br />

An inherent limitation of the scaling methodologies illustrated in this example is that they do not preserve<br />

the georeferenced location of simulated concentrations, but only their statistical distribution over<br />

the catchment area (e.g. Fig. 25). Therefore, comparisons with field data make no sense on a well by<br />

well or subcatchment by subcatchment basis, and no information on the actual location of the simulated<br />

‘hot spots’ within the catchment is provided. If, from a management point of view, a more detailed<br />

spatial resolution of the model predictions is required, then the same scaling method has to be carried<br />

out at a finer scale with all the statistical input data being supplied on a subcatchment basis. This is<br />

in principle straightforward, but in reality it may often be limited by data availability.<br />

4.1.4 Discussion – post evaluation<br />

The issue of scaling represents both a major scientific challenge and a practical problem in water resources<br />

management. Scaling is dealt with as a key issue in two of the publications in this thesis ([7],<br />

[10]). As the studies behind the other publications operate on scales ranging from point scale ([3]) to<br />

thousands of km² ([4], [5], [9]), catchment heterogeneity and scaling are dealt with and discussed in<br />

many of the publications.<br />



Fig. 32 Scaling methodology adopted by the SMART2 and MIKE SHE/DAISY models in the UNCERSDSS project (Refsgaard and Butts, 1999).



In the beginning of my career I had the rather naive view that it might be possible to develop a universal<br />

model code and a methodology that could be used to address most problems in hydrological management.<br />

This is reflected in the dualism of statements in the MIKE SHE description in Refsgaard and<br />

Storm (1995), where on the one hand it is stated that “MIKE SHE is applicable on spatial scales ranging<br />

from a single soil profile to a large regions”, while on the other hand it is acknowledged that “there are a<br />

number of fundamental scale problems which need to be carefully considered in the model applications”.<br />

I no longer believe that a universally applicable code and modelling methodology is theoretically<br />

realistic, and it is certainly not feasible in practice. The main reason for this is the scaling problem.<br />

Because scaling is interlinked with modelling concepts, I do not believe it will ever be<br />

possible to derive a universal scaling theory of practical applicability.<br />

Scaling implies taking spatial heterogeneity into account. In catchment modelling it is furthermore<br />

complicated by the need to include and link several processes, such as subsurface processes (Dagan,<br />

1986; Gelhar, 1986; Wen and Gómez-Hernández, 1996), root zone processes including land surface-atmosphere<br />

interaction (Michaud and Shuttleworth, 1997), and surface water processes including<br />

stream-aquifer interaction (Saulnier et al., 1997; [7]).<br />

Many researchers have expressed doubts whether it is feasible to use the same model process descriptions<br />

at different scales. For instance Beven (1995) states that “… the aggregation approach towards<br />

macroscale hydrological modelling, in which it is assumed that a model applicable at small<br />

scales can be applied at larger scales using ‘effective’ parameter values, is an inadequate approach to<br />

the scale problem. It is also unlikely in the future that any general scaling theory can be developed due<br />

to the dependence of hydrological systems on historical and geological perturbations.”<br />

Beven’s view can be considered a universal and fundamental statement with which it is difficult to disagree.<br />

A more pragmatic, but not necessarily conflicting, view is expressed by Grayson and Blöschl<br />

(2000): “As modellers, we are often left with little choice but to use the effective parameter approach,<br />

but we must recognise that effective parameters may have a narrow range of application and an effective<br />

parameter value that “works” for one process may not be valid for another process.” The scaling<br />

framework presented above should be seen in this context. It is not a fundamental theory but rather a<br />

collection of different methods and an emphasis on their respective assumptions and associated costs<br />

in terms of lost information. These methods or building blocks can then be used in composing specific<br />

scaling methodologies depending on the purposes of the particular modelling studies. In this respect it<br />

is crucial that the modeller is aware of the limitations of the scaling methodology chosen in a particular<br />

study.<br />


4.2 Confirmation, Verification, Calibration and Validation<br />

As illustrated in Fig. 3 the credibility of the descriptions or the agreements between reality, conceptual<br />

model, model code and model are evaluated through confirmation of the conceptual model, verification<br />

of the code, model calibration and model validation. These four terms are addressed in this section.<br />

4.2.1 Confirmation of conceptual model<br />

The conceptual model, with its selection of process descriptions, equations, etc., is the foundation for the<br />

model structure. Therefore a good conceptual model is most often a prerequisite for obtaining trustworthy<br />

model results. In groundwater modelling, establishment of the conceptual model is often considered the<br />

most important part of the entire modelling process (Middlemis, 2000). Evaluation of conceptual models is<br />

an important part in assessing uncertainty due to model structure error (Section 4.3 below and [15]).<br />

Methods for conceptual model confirmation should follow the standard procedures for confirmation of<br />

scientific theories. This implies that conceptual models should be confronted with actual field data and be<br />

subject to critical peer reviews. Furthermore, the feedback from the calibration and validation process may<br />

also serve as a means by which one or a number of alternative conceptual models may be either<br />

confirmed or falsified.<br />

As Beven (2002b) argues, we need to distinguish between our qualitative understanding (perceptual model)<br />

and the practical implementation of that understanding in our conceptual model. As a conceptual model is<br />

defined in [12] as a combination of a perceptual model and the simplifications acceptable for a particular<br />

model study, a conceptual model becomes site-specific and even case-specific. For example, a conceptual<br />

model of a groundwater aquifer may be described as two-dimensional for a study focussing on regional<br />

groundwater heads, while it may need to include more complex three-dimensional geological structures for<br />

a study requiring detailed solute transport simulations.<br />

4.2.2 Code verification<br />

The ability of a given model code to adequately describe the theory and equations defined in the<br />

conceptual model by use of numerical algorithms is evaluated through the verification of the model code.<br />

Use of the term verification in this respect is in accordance with Oreskes et al. (1994), because<br />

mathematical equations are closed systems. The methodologies used for code verification include<br />

comparing a numerical solution with an analytical solution or with a numerical solution from other verified<br />

codes. However, some program errors only appear under circumstances that do not routinely occur,<br />

and may not have been anticipated. Furthermore, for complex codes it is virtually impossible to verify that<br />

the code is universally accurate and error-free. Therefore, the term code verification must be qualified in<br />

terms of specified ranges of application and corresponding ranges of accuracy.<br />
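The comparison of a numerical solution with an analytical one can be illustrated with a minimal sketch. It assumes a hypothetical test problem that is not taken from the thesis: a linear reservoir dS/dt = -kS solved with an explicit Euler scheme, for which the analytical solution S(t) = S0·exp(-kt) is known. All names and values (`euler_linear_reservoir`, `k`, the time steps) are illustrative assumptions.

```python
import math


def euler_linear_reservoir(s0, k, dt, n_steps):
    """Explicit Euler solution of dS/dt = -k*S (hypothetical test problem)."""
    storage = s0
    series = [storage]
    for _ in range(n_steps):
        storage = storage + dt * (-k * storage)
        series.append(storage)
    return series


def analytical_linear_reservoir(s0, k, t):
    """Closed-form solution S(t) = S0 * exp(-k*t) of the same equation."""
    return s0 * math.exp(-k * t)


def max_abs_error(s0, k, dt, t_end):
    """Largest absolute difference between the numerical and analytical solutions."""
    n_steps = round(t_end / dt)
    numerical = euler_linear_reservoir(s0, k, dt, n_steps)
    return max(
        abs(numerical[i] - analytical_linear_reservoir(s0, k, i * dt))
        for i in range(n_steps + 1)
    )


# Accuracy is reported per time step, i.e. for a specified range of application.
errors = {dt: max_abs_error(s0=100.0, k=0.1, dt=dt, t_end=50.0)
          for dt in (5.0, 1.0, 0.1)}
for dt, err in errors.items():
    print(f"dt = {dt:>4}: max abs error = {err:.3f}")
```

A table of this kind documents the accuracy only for the tested combinations of parameter value and time step; outside that range the code would remain unverified in the sense used above.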

Code verification is not an activity that is carried out from scratch in every modelling study. In a particular<br />

study it has to be ascertained that the domain of applicability for which the selected model code has been<br />

verified covers the conditions specified in the actual conceptual model. If that is not the case, additional<br />


verification tests have to be conducted. Otherwise, the code must explicitly be classified as not verified for<br />

this particular study, and the subsequent simulation results therefore have to be considered with extra caution.<br />

4.2.3 Model calibration<br />

The application of a model code to be used for setting up a site-specific model is usually associated with<br />

model calibration. The model performance during calibration depends on the quantity and quality of the<br />

available input and observation data as well as on the conceptual model. If sufficient accuracy cannot be<br />

achieved, the conceptual model and/or the data have to be re-evaluated.<br />

Many of the publications ([1], [4], [5], [6], [7], [8], [9]) have involved model calibration. This was in all<br />

cases done manually. Today automatic calibration (inverse modelling) is state-of-the-art (Duan et al.,<br />

1994; Hill, 1998; Doherty, 2003), also as part of the calibration process for rather complex distributed<br />

physically-based models (Sonnenborg et al., 2003; Henriksen et al., 2003).<br />

A key issue related to calibration of distributed models with potentially hundreds or thousands of parameter<br />

values is a rigorous parameterisation procedure, where the spatial pattern of the parameter<br />

values is defined and the number of free parameters adjustable through calibration is reduced as<br />

much as possible. A methodology for this is presented in [7], and this issue is further discussed in [4],<br />

[5], [10] and Andersen et al. (2001).<br />

4.2.4 Model validation<br />

Often the model performance during calibration is used as a measure of the predictive capability of a<br />

model. This is a fundamental error. Many studies (e.g. [4]; [6]; Andersen et al., 2001) have<br />

demonstrated that the model performance against independent data not used for calibration is generally<br />

poorer than the performance achieved in the calibration situation. Therefore, the credibility of a site-specific<br />

model’s capability to make predictions about reality must be evaluated against independent<br />

data. This process is denoted model validation.<br />

In designing suitable model validation tests a guiding principle should be that a model should be tested<br />

to show how well it can perform the kind of task for which it is specifically intended (Klemes, 1986).<br />

Klemes proposed the following scheme comprising four types of test corresponding to different situations<br />

with regard to whether data are available for calibration and whether the catchment conditions are<br />

stationary or the impact of some kind of intervention has to be simulated:<br />

• The split-sample test is the classical test, being applicable to cases where there is sufficient data for<br />

calibration and where the catchment conditions are stationary. The available data record is divided into<br />

two parts. A calibration is carried out on one part and then a validation on the other part. Both the<br />

calibration and validation exercises should give acceptable results.<br />

• The proxy-basin test should be applied when there is not sufficient data for a calibration of the<br />

catchment in question. If, for example, streamflow has to be predicted in an ungauged catchment Z,<br />

two gauged catchments X and Y within the region should be selected. The model should be calibrated<br />

on catchment X and validated on catchment Y and vice versa. Only if the two validation results are<br />


acceptable and similar can the model command a basic level of credibility with regard to its ability to<br />

simulate the streamflow in catchment Z adequately.<br />

• The differential split-sample test should be applied whenever a model is to be used to simulate flows,<br />

soil moisture patterns and other variables in a given gauged catchment under conditions different from<br />

those corresponding to the available data. The test may have several variants depending on the<br />

specific nature of the modelling study. If for example a simulation of the effects of a change in climate is<br />

intended, the test should have the following form. Two periods with different values of the climate<br />

variables of interest should be identified in the historical record, such as one with a high average<br />

precipitation and the other with a low average precipitation. If the model is intended to simulate<br />

streamflow for a wet climate scenario, then it should be calibrated on a dry segment of the historical<br />

record and validated on a wet segment. Similar test variants can be defined for the prediction of<br />

changes in land use, effects of groundwater abstraction and other such changes. In general, the model<br />

should demonstrate an ability to perform through the required transition regime.<br />

• The proxy-basin differential split-sample test is the most difficult test for a hydrological model, because<br />

it deals with cases where there is no data available for calibration and where the model is directed to<br />

predicting non-stationary conditions. An example of a case that requires such a test is simulation of<br />

hydrological conditions for a future period with a change in climate and for a catchment, where no<br />

calibration data presently exist. The test is a combination of the two previous tests.<br />

The above test types are very general and need to be translated into specific tests in each case, depending<br />

on data availability, hydrological regime and purpose of the modelling study. Except for situations<br />

where the split-sample test is sufficient, rather limited work has been carried out so far on validation<br />

test schemes.<br />
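The split-sample and differential split-sample tests described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the publications: the one-parameter model (runoff = c × precipitation), the Nash-Sutcliffe efficiency as performance criterion, the 900 mm threshold and the synthetic annual data are all hypothetical. For brevity the dry and wet "periods" are formed from individual years rather than from contiguous segments of the record.

```python
def calibrate_runoff_coefficient(precip, runoff):
    """Least-squares fit of the hypothetical model runoff = c * precip."""
    return sum(p * q for p, q in zip(precip, runoff)) / sum(p * p for p in precip)


def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency (1 is a perfect fit); an assumed criterion."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread


def calibrate_and_validate(calibration, validation):
    """Calibrate on one set of (precip, runoff) pairs and score both sets."""
    c = calibrate_runoff_coefficient(*zip(*calibration))

    def score(pairs):
        precip, runoff = zip(*pairs)
        return nash_sutcliffe(runoff, [c * p for p in precip])

    return {"c": c, "calibration": score(calibration), "validation": score(validation)}


# Synthetic annual precipitation (mm); runoff is deliberately non-linear in precip,
# so a model calibrated on dry years cannot reproduce wet years.
precip = [600, 1100, 650, 1200, 700, 1000, 620, 1150, 680, 1050]
record = [(p, 0.2 * p + 0.0003 * p * p) for p in precip]

# Split-sample test: first half of the record against the second half.
split_sample = calibrate_and_validate(record[:5], record[5:])

# Differential split-sample test for a wet scenario: dry years against wet years.
dry_years = [pair for pair in record if pair[0] < 900]
wet_years = [pair for pair in record if pair[0] >= 900]
differential = calibrate_and_validate(dry_years, wet_years)

print("split-sample validation score:", round(split_sample["validation"], 2))
print("differential validation score:", round(differential["validation"], 2))
```

With these synthetic data the differential test gives a much lower validation score than the ordinary split-sample test. That outcome is built into the data and only illustrates why a model intended for changed conditions needs the more demanding test.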

From a theoretical point of view the procedures outlined by Klemes (1986) for the proxy-basin and the<br />

differential split-sample tests, where tests have to be carried out using data from similar catchments,<br />

are weaker than the usual split-sample test, where data from the specific catchment are available.<br />

However, no obviously better testing schemes exist.<br />

It must be realised that the validation test schemes proposed above are so demanding that many applications<br />

today would fail to meet them. Thus, for many cases where either proxy-basin or differential<br />

split-sample tests are required, suitable test data simply do not exist. This is for example the case for<br />

prediction of regional scale transport of potential contamination from underground radionuclide deposits<br />

over the next thousands of years. In such cases model validation is not possible. This does not imply<br />

that these modelling studies are not useful, only that their output should be recognised to be somewhat<br />

more uncertain than is often stated and that the term ‘validated model’ should not be used. Thus, a<br />

model’s validity will always be confined in terms of space, time, boundary conditions, types of application,<br />

etc.<br />

4.2.5 Discussion – post evaluation<br />

Relative to confirmation, verification and calibration, the main scientific contributions in my publications<br />

[1] – [15] are on the model validation issue. The motivation for this research was twofold: First of all,<br />

there were too many undocumented claims (over-selling) in the modelling community on model capabilities<br />

during the years following the development of many comprehensive model codes such as MIKE<br />


SHE. This over-selling was most obvious in practical studies conducted by consultants, but it was also<br />

common in large parts of the scientific community, e.g. Abbott et al. (1986a,b) and many others. Secondly,<br />

dominant parts of the hydrological scientific community advocated that model validation was not<br />

possible (Konikow and Bredehoeft, 1992; Beven, 1996a). This left the practising world in a vacuum<br />

without scientifically based methodologies to test and document the degree of credibility of particular<br />

model predictions. The methodologies described in [6] and [7] should be seen as pragmatic approaches<br />

to help fill this vacuum, and the discussions in [12] should be seen as an attempt to provide a scientific<br />

basis for adopting rigorous model validation schemes as part of good modelling practice.<br />

The principles and schemes proposed by Klemes have been extensively used in the last 12 of the publications<br />

([4] – [15]). Thus, the intercomparison study in [6] was based on a rigorous use of all four types<br />

of tests. Furthermore, [7] ‘translated’ Klemes’ principles that were developed with lumped conceptual<br />

models in mind to use in distributed modelling. After demonstrating that a distributed model that was<br />

validated for simulating catchment response often performs much more poorly for internal sites, [7] emphasised<br />

that a model should only be assumed valid with respect to the outputs that have been directly<br />

validated. This implies e.g. that multi-site validation is needed if predictions of spatial patterns are required.<br />

Furthermore, a model which is validated against catchment runoff cannot automatically be assumed<br />

valid also for simulation of erosion on a hillslope within the catchment, because smaller scale processes<br />

may dominate here; it will need specific validation against hillslope soil erosion data. In addition,<br />

systematic split-sample tests were made in [4], [5] and [9], and proxy-basin tests were conducted in [10].<br />

Finally, the validation requirements are emphasised in the publications related to quality assurance [12]<br />

and [13].<br />

[6] and [7] were not the first studies to use Klemes’ principles for validation. For example Quinn and<br />

Beven (1993) used split-sample tests, proxy-basin tests and differential split-sample tests (wet/dry periods)<br />

to analyse TOPMODEL’s predictive capabilities for the Plynlimon catchment in Wales. The key<br />

contribution of [7] and [12] in this respect was the integration of Klemes’ principles as core elements of<br />

a protocol for good modelling practice.<br />

The principles outlined in [7] and consolidated in [12], that a model should never be considered universally<br />

validated but can only be conditionally validated, restricted by the availability of data and specifically<br />

performed validation tests, are well in line with Lane and Richards (2001), who argue that “evidence<br />

of a successful prediction in observed spaces and times (conventional validation) cannot provide a sufficient<br />

basis for use of a model beyond the set of situations for which the model has been empirically<br />

tested”. The principles are also in accordance with the new coherent philosophy for modelling of the<br />

environment proposed by Beven (2002b) where he argues that it is required to be able to “define those<br />

areas of the model space where behavioural models occur”.<br />


4.3 Uncertainty Assessment<br />

This section presents a broad framework originating from Refsgaard et al. (2005) and [14] followed by a<br />

discussion on data uncertainty (including [14]), parameter uncertainty (including [11]) and model structure<br />

uncertainty (including [15]) and how they affect model output uncertainty.<br />

4.3.1 Modelling uncertainty in a water resources management context<br />

Definitions and Taxonomy<br />

Uncertainty and associated terms such as error, risk and ignorance are defined and interpreted differently<br />

by different authors (see Walker et al. (2003) for a review). The different definitions reflect, among<br />

other factors, the different scientific disciplines and philosophies of the authors involved, as well as the<br />

intended audience. In addition they vary depending on their purpose. Here I will use the terminology<br />

used in Refsgaard et al. (2005) and [14] that has emerged after discussions between social scientists<br />

and natural scientists specifically aiming at applications in model based water management (Klauer and<br />

Brown, 2003). It is based on a subjective interpretation of uncertainty in which the degree of confidence<br />

that a decision maker has about possible outcomes and/or probabilities of these outcomes is the central<br />

focus. Thus, according to this definition a person is uncertain if s/he lacks confidence about the specific<br />

outcomes of an event. Reasons for this lack of confidence might include a judgement that the information<br />

is incomplete, blurred, inaccurate, imprecise or potentially false. Similarly, a person is certain if s/he<br />

is confident about the outcome of an event. It is possible that a person feels certain but has misjudged<br />

the situation (i.e. s/he is wrong).<br />

There are many different (decision) situations, with different possibilities for characterising what we<br />

know or do not know and what we are certain or uncertain of. A first distinction is between ignorance as<br />

a lack of awareness about imperfect knowledge and uncertainty as a state of confidence about knowledge<br />

(which includes the act of ignoring). Our state of confidence may range from being certain to admitting<br />

that we know nothing (of use), and uncertainty may be expressed at a number of levels in between.<br />

Regardless of our confidence in what we know, ignorance implies that we can still be wrong (‘in<br />

error’). In this respect Brown (2004) has defined a taxonomy of imperfect knowledge illustrated in Fig.<br />

33.<br />


[Figure 33, diagram labels: Ignorance: unaware of imperfect knowledge. Spectrum of confidence (a state of awareness): Certainty; ‘Bounded’ uncertainty; ‘Unbounded’ uncertainty; Indeterminacy (‘cannot know’).<br />

‘Bounded’ uncertainty: all possible outcomes and all probabilities known; all possible outcomes and some probabilities known; all possible outcomes but no probabilities known.<br />

‘Unbounded’ uncertainty: some possible outcomes and probabilities known; some possible outcomes, but no probabilities known; no possible outcomes known (‘do not know’).]<br />

Fig. 33 Taxonomy of imperfect knowledge resulting in different uncertainty situations (Brown, 2004)<br />

In evaluating uncertainty, it is useful to distinguish between uncertainty that can be quantified e.g. by<br />

probabilities and uncertainty that can only be qualitatively described e.g. by scenarios. If one throws a<br />

balanced die, the precise outcome is uncertain, but the ‘attractor’ of a perfect die is certain: we know<br />

precisely the probability for each of the 6 outcomes, each being 1/6. This is what we mean by ‘uncertainty<br />

in terms of probability’. However, the estimates for the probability of each outcome can also be<br />

uncertain. If a model study says: “there is a 30% probability that this area will flood two times in the next<br />

year”, there is not only ‘uncertainty in terms of probability’ but also uncertainty regarding whether the<br />

estimate of 30% is a reliable estimate.<br />
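The distinction between an uncertain outcome and an uncertain probability estimate can be made concrete with a small simulation. The sketch below is illustrative only: it assumes that the probabilities of a die are not known beforehand but estimated from a limited number of throws, and it uses the binomial standard error as a simple measure of how reliable each estimate is. The function names, seeds and sample sizes are hypothetical.

```python
import random
from collections import Counter


def estimate_die_probabilities(n_throws, seed):
    """Estimate the probability of each face from simulated throws of a fair die."""
    rng = random.Random(seed)
    counts = Counter(rng.randint(1, 6) for _ in range(n_throws))
    return {face: counts[face] / n_throws for face in range(1, 7)}


def standard_error(p, n_throws):
    """Binomial standard error of a probability estimated from n_throws trials."""
    return (p * (1.0 - p) / n_throws) ** 0.5


few = estimate_die_probabilities(60, seed=1)
many = estimate_die_probabilities(60000, seed=1)

for label, estimate, n in (("60 throws", few, 60), ("60000 throws", many, 60000)):
    worst = max(abs(p - 1 / 6) for p in estimate.values())
    print(f"{label}: largest deviation from 1/6 = {worst:.4f}, "
          f"standard error = {standard_error(1 / 6, n):.4f}")
```

With few throws the estimated probabilities themselves carry a noticeable uncertainty, which corresponds to the question of whether a stated 30% flood probability is a reliable estimate.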

Secondly, it is useful to distinguish between bounded uncertainty, where all possible outcomes have<br />

been identified and unbounded uncertainty, where the known outcomes are considered incomplete.<br />

Since quantitative probabilities require ‘all possible outcomes’ of an uncertain event and each of their<br />

individual probabilities to be known, they can only be defined for ‘bounded uncertainties’. If probabilities<br />

cannot be quantified in any undisputed way, we often can still qualify the available body of evidence for<br />

the possibility of various outcomes.<br />

The bounded uncertainty where all probabilities are deemed known (Fig. 33) is often denoted ‘statistical<br />

uncertainty’ (e.g. Walker et al., 2003). This is the case traditionally addressed in model based uncertainty<br />

assessment. It is important to note that this case constitutes one of many decision situations outlined<br />

in Fig. 33, and in other situations the main uncertainty in a decision situation cannot be characterised<br />

statistically.<br />


Sources of uncertainty<br />

Walker et al. (2003) describe uncertainty as manifesting itself at different locations in the model<br />

based water management process. These locations, or sources, may be characterised as follows:<br />

• Context, i.e. at the boundaries of the system to be modelled. The model context is typically determined<br />

at the initial stage of the study where the problem is identified and the focus of the model<br />

study selected as a confined part of the overall problem. This includes, for example, the external<br />

economic, environmental, political, social and technological circumstances that form the context of<br />

the problem.<br />

• Input uncertainty in terms of external driving forces (within or outside the control of the water manager)<br />

and system data that drive the model such as land use maps, pollution sources and climate<br />

data.<br />

• Model structure uncertainty is the conceptual uncertainty due to incomplete understanding and simplified<br />

descriptions of processes as compared to nature.<br />

• Parameter uncertainty, i.e. the uncertainties related to parameter values.<br />

• Model technical uncertainty is the uncertainty arising from computer implementation of the model,<br />

e.g. due to numerical approximations and bugs in the software.<br />

• Model output uncertainty, i.e. the total uncertainty on the model simulations taking all the above<br />

sources into account, e.g. by uncertainty propagation.<br />

Nature of uncertainty<br />

Many authors (e.g. Walker et al., 2003) categorise the nature of uncertainty into:<br />

• Epistemic uncertainty, i.e. the uncertainty due to imperfect knowledge.<br />

• Stochastic uncertainty, i.e. uncertainty due to inherent variability, e.g. climate variability.<br />

Epistemic uncertainty is reducible by more studies, e.g. research or data collection. Stochastic uncertainty<br />

is non-reducible.<br />

Often the uncertainty on a certain event includes both epistemic and stochastic uncertainty. An example<br />

is the uncertainty of the 100 year flood at a given site. This flood event can be estimated, e.g. by use of<br />

standard flood frequency analysis on the basis of existing flow data. The (epistemic) uncertainty may be<br />

reduced by improving the data analysis, by additional monitoring (longer time series) or by<br />

deepening our understanding of how the modelled system works. However, no matter how much we<br />

improve our knowledge, there will always be some (stochastic) uncertainty inherent to the natural system,<br />

related to the stochastic and chaotic nature of several natural phenomena, such as weather. Perfect<br />

knowledge on these phenomena cannot give us a deterministic prediction, but would have the form<br />

of a perfect characterisation of the natural variability; for example, a probability density function for rainfall<br />

in a month of the year.<br />
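The 100 year flood example can be sketched numerically. The code below is a hedged illustration, not the method of any particular study: it assumes a Gumbel distribution fitted by the method of moments as the "standard flood frequency analysis", uses bootstrap resampling to express the epistemic (sampling) uncertainty of the estimate, and works on synthetic annual maxima. All names, distribution parameters and record lengths are hypothetical.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649


def gumbel_quantile(annual_maxima, return_period):
    """T-year event from a Gumbel distribution fitted by the method of moments."""
    scale = statistics.stdev(annual_maxima) * math.sqrt(6.0) / math.pi
    location = statistics.mean(annual_maxima) - EULER_GAMMA * scale
    return location - scale * math.log(-math.log(1.0 - 1.0 / return_period))


def bootstrap_interval(annual_maxima, return_period, n_resamples, rng):
    """5% and 95% bootstrap percentiles of the estimated T-year event."""
    estimates = sorted(
        gumbel_quantile(rng.choices(annual_maxima, k=len(annual_maxima)), return_period)
        for _ in range(n_resamples)
    )
    return estimates[int(0.05 * n_resamples)], estimates[int(0.95 * n_resamples)]


def synthetic_annual_maxima(n_years, rng, location=100.0, scale=30.0):
    """Draw synthetic annual maxima from a Gumbel distribution (inverse CDF)."""
    return [location - scale * math.log(-math.log(rng.random())) for _ in range(n_years)]


rng = random.Random(42)
short_record = synthetic_annual_maxima(20, rng)
long_record = synthetic_annual_maxima(400, rng)

short_low, short_high = bootstrap_interval(short_record, 100, 300, rng)
long_low, long_high = bootstrap_interval(long_record, 100, 300, rng)
print(f"20 years of data:  {short_low:.0f} to {short_high:.0f}")
print(f"400 years of data: {long_low:.0f} to {long_high:.0f}")
```

A longer record narrows the interval around the estimated 100 year flood, which is the reducible, epistemic part. The year-to-year variability described by the fitted distribution itself remains, which is the stochastic part.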


The uncertainty matrix<br />

The uncertainty matrix in Table 2 can be used as a tool to get an overview of the various sources of<br />

uncertainty in a modelling study. The matrix is modified after Walker et al. (2003) in such a way that it<br />

matches Fig. 33 and so that the taxonomy now gives ‘uncertainty type’ in descriptions that indicate in<br />

what terms uncertainty can best be described. The vertical axis identifies the source of uncertainty<br />

while the horizontal axis covers the level and nature of uncertainty. Note that the matrix is in reality<br />

three-dimensional (source, type, nature), because the categories Type and Nature are not mutually<br />

exclusive.<br />

Table 2 The uncertainty matrix (modified after Walker et al., 2003).<br />

Rows (source of uncertainty): Context (natural, technological, economic, social, political); Inputs (system data; driving forces); Model (model structure; technical; parameters); Model outputs.<br />

Columns, taxonomy (types of uncertainty): Statistical uncertainty; Scenario uncertainty; Qualitative uncertainty; Recognised ignorance.<br />

Columns, nature: Epistemic uncertainty; Stochastic uncertainty.<br />


Methodologies for assessing uncertainty<br />

A list of the most common methodologies applicable for addressing different types of uncertainty has<br />

been compiled and briefly described in Refsgaard et al. (2005). Table 3 provides an overview.<br />

Table 3 Applicability of different methodologies to address different types and sources of uncertainty<br />

(modified after Refsgaard et al., 2005).<br />

Taxonomy (types of uncertainty)<br />

Source of uncertainty | Statistical uncertainty | Scenario uncertainty | Qualitative uncertainty | Recognised ignorance<br />
Context (natural, technological, economic, social, political) | EE | EE, SC, SI | EE, EPR, NUSAP, SI, UM | EE, EPR, NUSAP, SI, UM<br />
Inputs: system data | DA, EPE, EE, MCA, SA | DA, EE, SC | DA, EE | DA, EE<br />
Inputs: driving forces | DA, EPE, EE, MCA, SA | DA, EE, SC | DA, EE, EPR | DA, EE, EPR<br />
Model: model structure | EE, MMS, QA | EE, MMS, SC, QA | EE, NUSAP, QA | EE, NUSAP, QA<br />
Model: technical | QA | QA | QA | QA<br />
Model: parameters | EE, IN-PA, SA | EE, IN-PA, SA | EE | EE<br />
Model outputs | EPE, EE, IN-UN, MCA, MMS, SA | EE, IN-UN, MMS, SA | EE, NUSAP | EE, NUSAP<br />

Abbreviations of methodologies:<br />

DA Data Uncertainty<br />

EPE Error Propagation Equations<br />

EE Expert Elicitation<br />

EPR Extended Peer Review (review by stakeholders)<br />

IN-PA Inverse modelling (parameter estimation)<br />

IN-UN Inverse modelling (predictive uncertainty)<br />

MCA Monte Carlo Analysis<br />

MMS Multiple Model Simulation<br />

NUSAP NUSAP<br />

QA Quality Assurance<br />

SC Scenario Analysis<br />

SA Sensitivity Analysis<br />

SI Stakeholder Involvement<br />

UM Uncertainty Matrix<br />


4.3.2 Data uncertainty<br />

Uncertainty in data is a major source of uncertainty when assessing uncertainty of model outputs. It is<br />

also an uncertainty source that is very visible for people outside the modelling community. One of the<br />

scientific contributions of the HarmoniRiB project ([14]) is to address data uncertainty. This has been<br />

done in three steps:<br />

• A methodology has been developed for characterising uncertainty in different types of data (Brown<br />

et al., 2005).<br />

• A software tool (Data Uncertainty Engine – DUE) for supporting the assessment of data uncertainty<br />

has been developed (Brown and Heuvelink, 2005).<br />

• Reviews with results on data uncertainty reported in the literature have been compiled into a guideline<br />

report for assessing uncertainty in various types of data originating from meteorology, soil physics<br />

and geochemistry, hydrogeology, land cover, topography, discharge, surface water quality,<br />

ecology and socio-economics (Van Loon and Refsgaard, 2005).<br />

The categorisation of data types distinguishes 13 categories (Table 4) for each of which a conceptual<br />

data uncertainty model is developed. By considering measurement scale, it becomes possible to<br />

quickly limit the relevant uncertainty models for a certain variable. On a discrete measurement scale, for<br />

example, it is only relevant to consider discrete probability distribution functions, whereas continuous<br />

density functions are required for continuous numerical data. In addition, whether a variable varies in space<br />

and time determines the need for autocorrelation functions alongside a probability density function<br />

(pdf). Each data category is associated with a range of uncertainty models, for which more specific pdfs<br />

may be developed with different simplifying assumptions (e.g. Gaussian; second-order stationarity; degree<br />

of temporal and spatial autocorrelation).<br />

Table 4 The subdivision of uncertainty categories, along the ‘axes’ of space-time variability and measurement<br />

scale (Brown et al., 2005).<br />

Space-time variability | Measurement scale: Continuous numerical | Discrete numerical | Categorical | Narrative (single category)<br />

Constant in space and time | A1 | A2 | A3 | 4<br />

Varies in time, not in space | B1 | B2 | B3 | 4<br />

Varies in space, not in time | C1 | C2 | C3 | 4<br />

Varies in time and space | D1 | D2 | D3 | 4<br />
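The lookup implied by Table 4 can be sketched in a few lines. This is only an illustration of the classification logic described above; the function names and the string labels for the measurement scales are assumptions made here, not part of DUE or of Brown et al. (2005).

```python
from typing import Literal

Scale = Literal["continuous", "discrete", "categorical", "narrative"]


def uncertainty_category(varies_in_time: bool, varies_in_space: bool, scale: Scale) -> str:
    """Return the Table 4 category code for a variable (hypothetical helper)."""
    if scale == "narrative":
        # Narrative data form a single category, whatever the variability.
        return "4"
    row = {
        (False, False): "A",  # constant in space and time
        (True, False): "B",   # varies in time, not in space
        (False, True): "C",   # varies in space, not in time
        (True, True): "D",    # varies in time and space
    }[(varies_in_time, varies_in_space)]
    column = {"continuous": "1", "discrete": "2", "categorical": "3"}[scale]
    return row + column


def needs_autocorrelation(varies_in_time: bool, varies_in_space: bool) -> bool:
    """An autocorrelation function is only relevant when the variable varies."""
    return varies_in_time or varies_in_space


# Example: daily discharge at one gauging station (continuous, varies in time only).
code = uncertainty_category(True, False, "continuous")
```

Counting the twelve lettered codes plus the narrative category gives the 13 categories mentioned in the text.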

4.3.3 Parameter uncertainty<br />

In addition to data uncertainty, uncertainty of parameter values is the most commonly considered<br />

source of uncertainty in hydrological modelling. The scientifically soundest way of assessing parameter<br />

uncertainty is through inverse modelling (Duan et al., 1994; Hill, 1998; Doherty, 2003). These techniques<br />

have the benefit that they, in addition to optimal parameter values, also produce calibration statistics<br />

in terms of parameter- and observation sensitivities, parameter correlation and parameter uncertainties.<br />

Once parameter uncertainties have been assessed, they can be propagated through the model to infer the<br />

model output uncertainty. A serious constraint in this respect is the interdependence between model<br />

parameters and model structure as discussed under model structure uncertainty below.<br />
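For orientation, the calibration statistics mentioned above are commonly obtained from a linearised (Gauss-Newton type) regression. The expressions below are the standard textbook form under that assumption; the symbols (residual vector r, weight matrix W, Jacobian J of simulated values with respect to parameters, n observations, p parameters) are introduced here for illustration and are not notation from the thesis.

```latex
\begin{aligned}
s^2 &= \frac{\mathbf{r}^{\mathsf T}\,\mathbf{W}\,\mathbf{r}}{n-p}, \\
\operatorname{Cov}(\hat{\boldsymbol\theta}) &\approx s^2\left(\mathbf{J}^{\mathsf T}\mathbf{W}\,\mathbf{J}\right)^{-1}, \\
\rho_{ij} &= \frac{\operatorname{Cov}(\hat\theta_i,\hat\theta_j)}
                 {\sqrt{\operatorname{Cov}(\hat\theta_i,\hat\theta_i)\,\operatorname{Cov}(\hat\theta_j,\hat\theta_j)}}
\end{aligned}
```

The diagonal of the covariance matrix gives the parameter uncertainties and the off-diagonal terms give the parameter correlations. Both are conditional on the model structure used to compute J, which is the interdependence noted above.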

[11] describe an example of how (input) data uncertainty and parameter uncertainty are propagated<br />

through a model to assess uncertainty in model simulation of nitrate concentrations in groundwater. The<br />

uncertainty of data and parameter values was assessed by expert judgement, and a Monte Carlo technique<br />

with Latin hypercube sampling was used for the uncertainty propagation. The simulated uncertainty<br />

band around the deterministic model simulation of Fig. 25 is shown in Fig. 34, based on 25 Monte<br />

Carlo realisations. The uncertainty is considerable: for example, the estimated areal fraction<br />

of the aquifer with concentrations below 50 mg NO3/l ranges between 30% and 80%.<br />
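The propagation step can be illustrated with a minimal sketch of Latin hypercube sampling followed by repeated model runs. Everything specific here is an assumption made for illustration: the two parameters, their uniform ranges and the one-line `toy_model` merely stand in for the distributed catchment model of [11], which is not reproduced.

```python
import random


def latin_hypercube(n_samples, bounds, rng):
    """Split each (uniform) parameter range into n_samples equal strata and
    draw one value per stratum, pairing strata at random across parameters."""
    columns = []
    for low, high in bounds:
        points = [low + (high - low) * (i + rng.random()) / n_samples
                  for i in range(n_samples)]
        rng.shuffle(points)  # random pairing between parameters
        columns.append(points)
    return list(zip(*columns))


def toy_model(leaching, dilution):
    """Hypothetical stand-in for the model: a concentration in mg/l."""
    return leaching / dilution


def propagate(n_samples=25, seed=1):
    rng = random.Random(seed)
    bounds = [(40.0, 120.0), (1.0, 2.0)]  # assumed expert-judgement ranges
    samples = latin_hypercube(n_samples, bounds, rng)
    return sorted(toy_model(*s) for s in samples)


outputs = propagate()
# Share of realisations below the 50 mg/l threshold used in the text.
fraction_below_50 = sum(c < 50.0 for c in outputs) / len(outputs)
```

With 25 realisations every stratum of every parameter is sampled exactly once, which is the reason Latin hypercube sampling can cover the parameter ranges with far fewer runs than simple random sampling.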

[Figure 34: cumulative frequency (0 to 1) plotted against NO3 concentration (0 to 180 mg/l), end of 1993.]<br />

Fig. 34 Measured (•) and simulated (×) areal distribution of NO3 concentrations in groundwater at a<br />

point in time. Measured values are based on 35 groundwater observations. [11].<br />

As noted in [11], a fundamental limitation of the approach adopted there is that the errors due to incorrect<br />

model structure are neglected. As also discussed below, one approach to assessing such model structure<br />

error is through comparison of predicted and observed values. In the present case (Figs 25 and 34)<br />

the deviation between observed and simulated values is so small that this term may be neglected. This<br />

is, however, by no means a proof of a correct model structure. It only shows that the particular model<br />

performs without apparent model errors for this particular application.<br />


4.3.4 Model structure uncertainty<br />

Existing approaches and new framework<br />

Any model is an abstraction, simplification and interpretation of reality. The incompleteness of a model<br />

structure and the mismatch between the real causal structure of a system and the assumed causal<br />

structure as represented in a model will therefore always result in uncertainty about model predictions.<br />

The importance of the model structure for predictions is well recognised, even for situations where predictions<br />

are made on output variables, such as discharge, for which field data are available (Franchini<br />

and Pacciani, 1992; Butts et al., 2004). The considerable challenge faced in many applications of environmental<br />

models is that predictions are required beyond the range of available observations, either in<br />

time or in space, e.g. to make extrapolations towards unobservable futures (Babendreier, 2003) or to<br />

make predictions for natural systems, such as ecosystems, that are likely to undergo structural changes<br />

(Beck, 2005). In such cases, uncertainty in model structure is recognised by many authors to be the<br />

main source of uncertainty in model predictions (Dubus et al., 2003; Neuman and Wierenga, 2003;<br />

Linkov and Burmistrov, 2003).<br />

The existing strategies for assessing uncertainty due to incomplete or inadequate model structure may<br />

be grouped into the categories shown in Fig. 35. The most important distinction is whether data exist<br />

that make it possible to infer the model structure uncertainty directly. This requires that data are<br />

available for the output variable of predictive interest and for conditions similar to those in the predictive<br />

situation. In other words it is a distinction between whether the model predictions can be considered as<br />

interpolations or extrapolations relative to the calibration situation.<br />

[Figure 35: classification tree based on the availability of data for a model validation test.<br />

Target data exist (interpolation): increase parameter uncertainty; estimate structural term.<br />

No direct data (extrapolation), covering intermediate data (differential split-sample case) and no data at all (proxy basin case): multiple conceptual models; expert elicitation; pedigree analysis.]<br />

Fig. 35 Classification of existing strategies for assessing conceptual model uncertainty [15].<br />


The two main categories are thus equivalent to different situations with respect to model validation<br />

tests. According to Klemes’ classical hierarchical test scheme (Klemes, 1986; see Section 4.2 above),<br />

the interpolation case corresponds to situations where the traditional split-sample test is suitable, while<br />

the extrapolation case corresponds to situations where no data exist for the concerned output variable<br />

(proxy-basin test) or where the basin characteristics are considered non-stationary, e.g. for predictions<br />

of effects of climate change or effects of land use change (differential split-sample test).<br />

The strategies used in ‘interpolation’, i.e. for situations that are similar to the calibration situation with<br />

respect to variables of interest and conditions of the natural system, have the advantage that they can<br />

be based directly on field data (e.g. Radwan et al., 2004; van Griensven and Meixner, 2004; and Vrugt<br />

et al., 2005). A fundamental weakness is that field data are themselves uncertain. Nevertheless, in<br />

many cases, they can be expected to provide relatively accurate estimates of, at least, the total predictive<br />

uncertainty for the specific measured variable and for the same conditions as those in the calibration<br />

and validation situation. A more serious limitation of the strategies depending on observed data is<br />

that they are only applicable for situations where the output variables of interest are measured. While<br />

relevant field data are often available for variables such as water levels and water flows, this is usually<br />

not the case for concentrations, or when predictions are desired for scenarios involving catchment<br />

change, such as land use change or climate change. Another serious limitation stems from an assumption<br />

that the underlying system does not undergo structural changes, such as changes in ecosystem<br />

processes due to climate change.<br />

The strategy that uses multiple conceptual models benefits from an explicit analysis of the effects of<br />

alternative model structures, e.g. IPCC (2001), Harrar et al. (2003), Troldborg (2004), Poeter and<br />

Anderson (2005) and Højberg and Refsgaard (2005). The multiple conceptual model strategy makes it<br />

possible to include expert knowledge on plausible model structures. This strategy is strongly advocated<br />

by Neuman and Wierenga (2003) and Poeter and Anderson (2005). They characterise the traditional<br />

approach of relying on a single conceptual model as one in which plausible conceptual models are rejected<br />

(in this case by omission). They conclude that the bias and uncertainty that results from reliance<br />

on an inadequate conceptual model are typically much larger than those introduced through an inadequate<br />

choice of model parameter values. This view is consistent with Beven (2002b) who outlines a<br />

new philosophy for modelling of environmental systems. The basic aim of his approach is to extend<br />

traditional schemes with a more realistic account of uncertainty, rejecting the idea that a single optimal<br />

model exists for any given case. Instead, environmental models may be non-unique in their accuracy of<br />

both reproduction of observations and prediction (i.e. unidentifiable or equifinal), and subject to only a<br />

conditional confirmation, due to e.g. errors in model structure, calibration of parameters and period of<br />

data used for evaluation.<br />

A weakness of the multiple modelling strategy is the absence of quantitative information about the extent<br />

to which each model is plausible. Furthermore, it may be difficult to sample from the full range of<br />

plausible conceptual models. In this respect, the expert knowledge on which the formulations of multiple<br />

conceptual models are based is an important and unavoidable subjective element.<br />

The framework presented in [15] for assessing the predictive uncertainties of environmental models<br />

used for extrapolation includes a combination of use of multiple conceptual models and assessment by<br />

use of the pedigree approach of their credibility as well as a reflection on the extent to which the sampled<br />

models adequately represent the space of plausible models.<br />


The role of model calibration<br />

Some of the existing strategies used in ‘interpolation’ cannot differentiate how the total predictive uncertainty<br />

originates from model input, model parameter and model structure uncertainty. Other methods<br />

attempt to do so, but as discussed in [15] this is problematic. In the case of uncalibrated models, the<br />

parameter uncertainty is very difficult to assess quantitatively, and wrong estimates of model parameter<br />

uncertainty will influence the estimates of model structure uncertainty. In the case of calibrated models,<br />

estimates of model parameter uncertainty can often be derived from autocalibration routines. An inadequate<br />

model structure will, however, be compensated by biased parameter values to optimise the<br />

model fit with field data during calibration. Hence, the uncertainty due to model structure will be underestimated<br />

in this case.<br />

The importance of model calibration can be illustrated by the example described in Højberg and<br />

Refsgaard (2005). They use three different conceptual models, based on three alternative geological<br />

interpretations, for a multi-aquifer system in Denmark. Each of the models was calibrated against piezometric<br />

head data using an inverse technique. The three models provided equally good and very similar<br />

predictions of groundwater heads, including well field capture zones. However, when using the models<br />

to extrapolate beyond the calibration data to predictions of flow pathways and travel times the three<br />

models differed dramatically. When assessing the uncertainty contributed by the model parameter values,<br />

the overlap of uncertainty ranges between the three models significantly decreased when moving<br />

from groundwater heads to capture zones and travel times. They conclude that the larger the degree of<br />

extrapolation, the more the underlying conceptual model dominates over the parameter uncertainty and<br />

the effect of calibration.<br />
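The effect can be reproduced with a deliberately simple synthetic sketch. It is not the multi-aquifer study of Højberg and Refsgaard (2005): the "observations", the two one-parameter model structures and the extrapolation point are all invented here to show how calibration can make different structures look equally good inside the calibration range while they diverge outside it.

```python
import math


def fit_scale(basis, xs, ys):
    """Least-squares estimate of c in the one-parameter model y = c * basis(x)."""
    num = sum(basis(x) * y for x, y in zip(xs, ys))
    den = sum(basis(x) ** 2 for x in xs)
    return num / den


def rmse(basis, c, xs, ys):
    """Root mean square error of the calibrated model on the given data."""
    return math.sqrt(sum((y - c * basis(x)) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Synthetic calibration data from a narrow range of conditions.
xs = [0.8, 0.9, 1.0, 1.1, 1.2]
ys = [x ** 0.75 for x in xs]

# Two alternative (both wrong) model structures.
structures = {"linear": lambda x: x, "square_root": math.sqrt}

results = {}
for name, basis in structures.items():
    c = fit_scale(basis, xs, ys)
    results[name] = {
        "parameter": c,
        "calibration_rmse": rmse(basis, c, xs, ys),
        "extrapolated": c * basis(16.0),  # far outside the calibration range
    }
```

Both structures reach a similar, small calibration error because the fitted parameter absorbs the structural error, yet their predictions at the extrapolation point differ by several hundred percent. The calibration fit alone therefore says little about which structure to trust.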

This diminishing effect of calibration as the prediction situation is extrapolated further and further away<br />

from the calibration base resembles the conclusion on the effects of updating relative to the underlying<br />

process model, when forecast lead times are increased in real-time forecasting (Fig. 27, Section 3.3).<br />

Here the effect of updating is reduced and the forecast error therefore increases as the forecast lead<br />

time (= degree of extrapolation) increases.<br />

4.3.5 Discussion – post evaluation<br />

Uncertainty is a key, crosscutting issue that I consider a useful platform or catalyst for establishing<br />

a common understanding in hydrological modelling and water resources management. By this I mean<br />

a common understanding of natural science based modelling issues such as scaling and validation, a common<br />

understanding between people from the modelling and the monitoring communities, and a broader<br />

dialogue between modellers and stakeholders on issues such as when a model is accurate and credible<br />

enough for its purpose of application (see Subsection 4.4.4 below).<br />

In the publications on developing the Suså model ([1], [2]) and the oxygen module ([3]) no explicit consideration<br />

is given to the goodness of the model structure and uncertainty assessment was not an issue<br />

at all. In the later work on catchment modelling in India ([4], [5]), where the physical realism of the<br />

model was compromised somewhat because of scaling problems, it was noted that the model results might be<br />

‘right for the wrong reasons’, and the limitations of model applicability were emphasised in this respect,<br />

but no uncertainty assessments were made. In the paper describing a methodology for parameterisation,<br />

calibration and validation of distributed hydrological models ([7]) uncertainty is also neglected. In<br />

the publications [6], [8], [9] and [10] uncertainty is discussed, but as a secondary issue only.<br />

Although examples of model prediction uncertainty assessments had been reported previously from<br />

different modelling disciplines (e.g. Refsgaard et al., 1983; Beck, 1987), the first to emphasise the need<br />

to systematically perform uncertainty assessments related to catchment model predictions was probably<br />

Beven (1989). This was followed by Binley et al. (1991) who used Monte Carlo analysis to assess<br />

the predictive uncertainty for the Institute of Hydrology Distributed Model and by the introduction of the<br />

Generalised Likelihood Uncertainty Estimation (GLUE) methodology (Beven and Binley, 1992) after<br />

which uncertainty in catchment modelling was high on the agenda in the scientific community.<br />

My main scientific contributions on uncertainty are the publications [11], [14] and [15] and the link of<br />

uncertainty to principles and protocols for good modelling practice in [12] and [13]. Although reported 10<br />

years later than Binley et al. (1991), [11] was one of the first studies with uncertainty propagation<br />

through a complex, coupled distributed physically based catchment model with a focus on water quality.<br />

A key contribution of [14] and Refsgaard et al. (2005) is the broad framework for characterising uncertainty.<br />

This framework provides the link to uncertainty in the quality assurance work ([12], [13]). This<br />

broad framework is inspired by research in social science (Pahl-Wostl, 2002; van Asselt and Rotmans,<br />

2002; Dewulf et al., 2005). The main difference between the traditions in social science and natural<br />

science is that social scientists emphasise participatory processes including consultation and involvement<br />

of users, also on uncertainty aspects, right from the beginning of a study, while natural scientists<br />

often talk about users as someone to whom uncertainty results should be communicated, e.g. Pappenberger<br />

and Beven (2006).<br />

The most difficult uncertainty problem (in natural science) to handle today is the model structure uncertainty,<br />

and the most important and novel contribution is probably the efforts made in this respect, primarily<br />

the new framework outlined in [15] but also the inclusion of options for evaluating multiple conceptual<br />

models in the HarmoniQuA modelling protocol ([13] and Fig. 5). The approach suggested in [15]<br />

of using multiple conceptual models (model structures) is not new (IPCC, 2001; Beven, 2002a; Neuman<br />

and Wierenga, 2003) and the use of pedigree analysis to qualitatively assess the credibility of something<br />

is not new either (van der Sluijs et al., 2005). The novelty lies in the combination of the two approaches<br />

that originate from different disciplines.<br />


4.4 Quality Assurance in Model based Water Management<br />

4.4.1 Background<br />

During the last decade many problems have emerged in river basin modelling projects, including poor<br />

quality of modelling, unrealistic expectations, and lack of credibility of modelling results. Some of the<br />

reasons for this lack of quality can be evaluated ([13]; Scholten et al., 2007) as the effect of:<br />

• Ambiguous terminology and a lack of understanding between key-players (modellers, clients, reviewers,<br />

stakeholders and concerned members of the public)<br />

• Bad practice (careless handling of input data, inadequate model set-up, insufficient calibration/validation<br />

and model use outside of its scope)<br />

• Lack of data or poor quality of available data<br />

• Insufficient knowledge on the processes<br />

• Poor communication between modellers and end-users on the possibilities and limitations of the<br />

modelling project and overselling of model capabilities<br />

• Confusion on how to use model results in decision making<br />

• Lack of documentation and clarity on the modelling process, leading to results that are difficult to<br />

audit or reproduce<br />

• Insufficient consideration of economic, institutional and political issues and a lack of integrated<br />

modelling.<br />

In the water resources management community many different guidelines on good modelling practice<br />

have been developed; see [13] for a review. One of the most comprehensive modelling guidelines, if not the most<br />

comprehensive, has been developed in The Netherlands (Van Waveren et al., 2000) as a result of a process<br />

involving all the main players in the Dutch water management field. The background for this was a<br />

perceived need to improve the quality of modelling (Scholten et al., 2000). Similarly, modelling guidelines<br />

for the Murray-Darling Basin in Australia were developed due to the perception among end-users<br />

that model capabilities may have been ‘over-sold’, and that there was a lack of consistency in approaches,<br />

communication and understanding among and between the modellers and the water managers,<br />

which often resulted in considerable uncertainty for decision making (Middlemis, 2000).<br />

4.4.2 The HarmoniQuA approach<br />

A software tool, MoST, with its associated knowledge base (KB), has been developed by the HarmoniQuA<br />

project ([13]; Scholten et al., 2007) to provide QA in modelling through guidance, monitoring<br />

and reporting. As defined in HarmoniQuA: “Quality Assurance (QA) is the procedural and operational<br />

framework used by an organisation managing the modelling study to build consensus among the organisations<br />

concerned in its implementation, to assure technically and scientifically adequate execution<br />

of all tasks included in the study, and to assure that all modelling-based analysis is reproducible and<br />

justifiable”. This modification of the older NRC (1990) definition includes the organisational, technical<br />


and scientific aspects, but also the need to build consensus among the organisations concerned in accordance<br />

with the discussion in Section 2.1 above.<br />

Guidelines for good modelling practice are included in the Knowledge Base (KB) of MoST. The modelling<br />

process has been decomposed into five steps, see the flowchart in Fig. 5. Each step includes several<br />

tasks. Each task has an internal structure i.e. name, definition, explanation, interrelations with other<br />

tasks, activities, activity related methods, references, sensitivity/pitfalls, task inputs and outputs.<br />

The KB contains knowledge specific to seven domains (groundwater, precipitation-runoff, river hydrodynamics,<br />

flood forecasting, water quality, ecology and socio-economics), and forms the heart of the<br />

tool. A computer based journal is produced within MoST where the water manager and modelling team<br />

record the progress and decisions made during a model study according to the tasks in the flowchart.<br />

This record can be used when reviewing the model study to judge its quality.<br />
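To make the task structure and the journal idea concrete, the sketch below models them as plain data classes. It is a hypothetical illustration only: the class and field names are chosen here to mirror the list of task attributes in the text and do not reflect how MoST stores its Knowledge Base or journal.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Task:
    """One task within a modelling step, with the attributes listed in the text."""
    name: str
    definition: str
    explanation: str = ""
    interrelations: list[str] = field(default_factory=list)  # related task names
    activities: list[str] = field(default_factory=list)
    methods: list[str] = field(default_factory=list)         # activity related methods
    references: list[str] = field(default_factory=list)
    pitfalls: list[str] = field(default_factory=list)        # sensitivity/pitfalls
    inputs: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)


@dataclass
class JournalEntry:
    task_name: str
    recorded_by: str   # e.g. "modeller" or "water manager"
    decision: str
    recorded_on: date


@dataclass
class Journal:
    entries: list[JournalEntry] = field(default_factory=list)

    def record(self, task: Task, recorded_by: str, decision: str, recorded_on: date) -> None:
        """Append one decision made while carrying out a task."""
        self.entries.append(JournalEntry(task.name, recorded_by, decision, recorded_on))

    def for_task(self, task_name: str) -> list[JournalEntry]:
        """Entries a reviewer would read when judging one task."""
        return [e for e in self.entries if e.task_name == task_name]


# Illustrative use with an invented task and decision.
calibration = Task(name="Model calibration", definition="Adjust parameters to fit field data")
journal = Journal()
journal.record(calibration, "modeller", "Calibrate against head data only", date(2007, 1, 5))
```

Keeping the journal as an append-only list of dated entries is one simple way to support the review use described above, since a reviewer can filter by task and read the decisions in the order they were made.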

The most important QA principles incorporated in the KB are:<br />

• The five modelling steps conclude with a formal dialogue between the modeller and manager,<br />

where activities and results from the present step are reported, and details of plans for the next step<br />

(a revised work plan) are discussed.<br />

• External reviews are prescribed as the key mechanism of ensuring that the knowledge and experience<br />

of other independent modellers are used.<br />

• The KB provides public interactive guidelines to facilitate dialogue between modellers and the water<br />

manager, with options to include auditors (reviewers), stakeholders and the public.<br />

• There are many feedback loops, some technical involving only the modeller, and others that may<br />

require a decision before doing costly additional work.<br />

• The KB allows performance and accuracy criteria to be updated during the modelling process. In<br />

the first step the water manager’s objectives and requirements are translated into performance criteria<br />

that may include qualitative and quantitative measures. These criteria may be modified during<br />

the formal reviews of subsequent steps.<br />

• Emphasis is put on validation schemes, i.e. tests of model performance against data that have not<br />

been used for model calibration.<br />

• Uncertainties must be explicitly recognised and assessed (qualitatively and/or quantitatively)<br />

throughout the modelling process.<br />

MoST supports multi-domain studies and working in teams of different user types (water managers,<br />

modellers, auditors, stakeholders and members of the public). It contains an interactive glossary that is<br />

accessible via hyperlinked text. The key functionality of MoST is to:<br />

• Guide, to ensure a model has been properly applied. This is based on the Knowledge Base.<br />

• Monitor, to record decisions, methods and data used in the modelling work and in this way enable<br />

transparency and reproducibility of the modelling process.<br />

• Report, to provide suitable reports of what has been done for managers/clients, modellers, auditors,<br />

stakeholders and the general public.<br />


4.4.3 Organisational requirements for QA guidelines to be effective<br />

Modelling studies involve several parties with different responsibilities. The key players are modellers<br />

and water managers, but often reviewers, stakeholders and the general public are also involved. To a<br />

large extent the quality of the modelling study is determined by the expertise, attitudes and motivation<br />

of the teams involved in the modelling and QA process.<br />

QA will only be successful if all parties actively support its use. The attitude of the modellers is important.<br />

NRC (1990) characterises this as follows: “most modellers enjoy the modelling process but find<br />

less satisfaction in the process of documentation and quality assurance”. Scholten and Groot (2002)<br />

describe the main problem with the Dutch Handbook on Good Modelling Practice as “they all like it, but<br />

only a few use it”. The water manager, however, has a particular responsibility, because he/she has the<br />

power to request and pay for adequate QA in modelling studies. Therefore, QA guidelines can only be<br />

expected to be used in practice if the water manager prescribes their use. It is therefore very important<br />

that the water manager has the technical capacity to organise the QA process. Often, water managers<br />

do not have individuals available with the appropriate training to understand and use models. An external<br />

modelling expert should then be sought to help with the QA process. However, this requires that the<br />

manager is aware of the problem and the need.<br />

4.4.4 Performance criteria and uncertainty – when is a model good enough?<br />

A critical issue is how to define the performance criteria. We agree with Beven (2002b) that any conceptual<br />

model is known to be wrong and hence any model will be falsified if we investigate it in sufficient<br />

detail and specify very high performance criteria. Clearly, if one attempts to establish a model that<br />

should simulate the truth it would always be falsified. However, this is not very useful information.<br />

Therefore, we use conditional validation, i.e. validation restricted to a domain of applicability<br />

(numerically universal as opposed to strictly universal in Popperian terms). The question is then:<br />

what is good enough? In other words, what are the criteria, and how do we select them?<br />

A good reference for model performance is to compare it with uncertainties of the available field observations.<br />

If the model performance is within this uncertainty range we often characterise the model as<br />

good enough. However, it is usually not so simple. How wide should the confidence bands on observational<br />

uncertainties be: ranges corresponding to 65%, 95% or 99%? Should we always reject a model<br />

if it cannot perform within the observational uncertainty range? In many cases even results from less<br />

accurate models may be useful.<br />
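One way to make the comparison with observational uncertainty operational is sketched below. It assumes, purely for illustration, that observation errors are Gaussian with a known standard deviation; the function names and the example numbers are invented here and are not a criterion prescribed in the thesis.

```python
from statistics import NormalDist


def z_for_confidence(level):
    """Two-sided standard normal quantile for a confidence level in (0, 1)."""
    return NormalDist().inv_cdf(0.5 + level / 2.0)


def fraction_within_band(simulated, observed, obs_sigma, level=0.95):
    """Share of simulated values inside the observational uncertainty band
    observed +/- z * obs_sigma, with z set by the chosen confidence level."""
    z = z_for_confidence(level)
    hits = sum(abs(s - o) <= z * obs_sigma for s, o in zip(simulated, observed))
    return hits / len(observed)


# Invented groundwater heads (m) with an assumed observation error of 0.5 m.
simulated = [1.0, 2.0, 5.0]
observed = [1.1, 2.0, 3.0]
share_95 = fraction_within_band(simulated, observed, obs_sigma=0.5, level=0.95)
share_99 = fraction_within_band(simulated, observed, obs_sigma=0.5, level=0.99)
```

The chosen confidence level widens or narrows the band, so the same model can pass at one level and fail at another. This is the sense in which the threshold is a decision to be agreed on rather than a property of the model.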

Therefore, the decision on what is good enough generally must be taken in a socio-economic context.<br />

For instance, the accuracy requirements for a model to be used for an initial screening of alternative<br />

options for locating a new small well field for a small water supply will be much lower than the requirements<br />

for a model that is intended to be used for the final design of a large well field for a major<br />

water supply in an area with potentially damaging effects on precious nature and other significant conflicts<br />

of interests. Thus, the accuracy criteria cannot be decided universally by modellers or researchers,<br />

but must differ from case to case depending on how much is at stake in the decision that the model<br />

predictions are to support. This implies that the performance criteria must be discussed<br />

and agreed between the manager and the modeller beforehand.<br />


Accuracy requirements and uncertainty assessments of model simulations are two sides of the same<br />

coin, just seen from two different perspectives, namely those of the water manager and the modeller. As not all<br />

uncertainty can be characterised as statistical uncertainty (see Fig. 33 and Tables 2 and 3 in Subsection<br />

4.3.1), accuracy requirements must also be characterised in qualitative terms. Furthermore, the<br />

risk perception of the water manager and the stakeholders/public has to be considered. Therefore, involvement<br />

of stakeholders and the public is most often required as an integral part of this process (see<br />

also Section 2.1 and Figs. 1-2). According to the HarmoniQuA methodology stakeholder/public involvement<br />

is crucial at the beginning of a modelling project to frame the problem, define the requirements<br />

and assess the uncertainties (Henriksen et al., submitted).<br />

This way of thinking is well in line with the principles behind some of the Water Framework Directive<br />

Guidance Documents. For example the Guidance Document on Monitoring (EC, 2003a) does not specify<br />

the levels of precision and confidence required from the monitoring programmes, but rather states<br />

that the precision and confidence level should be sufficient to enable a meaningful assessment of for<br />

instance the status of the environment and should be sufficient to achieve an acceptable risk of making<br />

the wrong decision. This obviously calls for uncertainty assessments and public participation to have a<br />

central role in the entire process, which paves the way towards making adaptive management an important<br />

part of the river basin management process (Pahl-Wostl, 2002).<br />

4.4.5 Discussion – post evaluation<br />

The ideas and concepts behind the HarmoniQuA guidelines ([12], [13]) summarised above have been<br />

inspired by previous QA guidelines. The novel contributions have been inspired both by previous<br />

research activities (including [4], [5], [6], [7], [9], [11]) and by participation in a wide range of national<br />

and international consultancy projects. Without having been at this crossroads between the research<br />

world and the practical world for more than two decades, this would not have been possible. I consider<br />

my most important contributions in this respect to be:<br />

• The terminology and guiding principles behind the guidelines [12] are novel in their attempt to formulate<br />

a coherent approach that on the one hand has a solid foundation in the philosophy of science<br />

and on the other hand can be useful for practitioners. In the very controversial issue of model validation,<br />

where there has been almost a deadlock between different schools with respect to whether<br />

validation is possible at all, the philosophy of conditional validation is novel.<br />

• The major novelty of the HarmoniQuA approach does not lie in its guidance on model technical<br />

issues, but in its emphasis and more elaborate focus on the dialogue between modeller, water<br />

manager, reviewer, stakeholders and the public. In addition, there are novel elements in the strong<br />

emphasis on uncertainty assessments throughout the modelling process and on model validation. Finally,<br />

the emphasis on model reviews allows bringing in subjective knowledge and experience in the<br />

QA process.<br />

Both the HarmoniQuA guidelines and other recent good modelling practice guidelines have been<br />

deeply rooted both in the scientific community and among practitioners ([13]). By comparison, ideas<br />

originating solely from the natural science community, such as the Code of Practice for performing<br />

uncertainty analysis suggested by Pappenberger and Beven (2006), are typically limited to valuable contributions<br />

on model technical issues, and often do not consider the broader aspects of the modelling<br />

process such as the involvement of water managers and stakeholders.<br />


5 Conclusions and Perspectives for Future Work<br />

5.1 Summary of Main Scientific Contributions<br />

The contributions to scientific knowledge in the papers of the present thesis are discussed in the previous<br />

chapters. The main contributions have been in the following five areas:<br />

• New conceptual understanding and code development. The Suså model ([1], [2]) was based on a<br />

new conceptual understanding of the surface water/groundwater interaction in moraine catchments.<br />

The code and its application brought new insight regarding the effect of groundwater abstraction on<br />

streamflow in catchments with such hydrogeological characteristics.<br />

• Model validation. The adoption and adaptation of rather rigorous principles for model validation and<br />

the examples of their application both for lumped conceptual and distributed physically based models<br />

are a cornerstone of my research. This work was first published in [6] and [7] and later brought<br />

into a broader modelling framework in [12] and [13]. In particular, the introduction of the term ‘conditional<br />

validation’ in [7] and the outline of its scientific philosophical basis in [12] are novel.<br />

• Scaling. The publications focussing on scaling ([7], [10]) present ideas crystallised from work with<br />

scaling problems in many modelling studies ranging from point scale to thousands of km². The later<br />

framework, outlined in Section 4.1 above, does not in any way ‘solve’ the scaling problem but contributes<br />

to clarifications on applicable methodologies with focus on their respective assumptions and<br />

limitations.<br />

• Uncertainty assessment. During the past decade a considerable part of my research work has focussed<br />

on uncertainty aspects. I consider my main contributions in this respect to be the introduction<br />

of the broader uncertainty framework integrated into the modelling framework ([13], [14]) and<br />

the work with model structure uncertainty ([15]).<br />

• Modelling protocols and guidelines for quality assurance in the modelling process. The modelling<br />

protocol in [7] and the later and more comprehensive one presented as part of the guidelines for<br />

quality assurance in the modelling framework in [13] are a formalisation of experience and practices<br />

that have gradually emerged over the years. The novel elements in [13] are the emphasis on (a) the<br />

interactive dialogue between modeller, water manager, reviewer, stakeholders and the public; (b)<br />

uncertainty assessments throughout the modelling process; (c) model validation; and (d) experience<br />

and subjective knowledge introduced through external model reviews.<br />

These main contributions to scientific knowledge would, however, not have been possible without the<br />

experience and insight gained in modelling studies ranging from point scale ([3]) to large catchments<br />

([4], [5], [8], [9], [11]).<br />


5.2 Modelling Issues for Future Research<br />

Hydrological modelling has developed significantly during the three decades I have worked in this field.<br />

I started with editing punch cards and could only run one simulation per day (overnight) using model<br />

codes that today are considered small and simple. Since then, comprehensive new knowledge has<br />

been built into model codes and into the methodologies used in the modelling process.<br />

During the process of writing this thesis, where I had to review my older publications, it was interesting<br />

to note the gradual change in research focus. In the first decade my research focused on development of<br />

new codes. During the second decade more general methodological problem areas such as scaling<br />

and model validation were addressed. Towards the end of the third decade the emphasis is now on the<br />

broader issues such as uncertainty assessment and quality assurance frameworks for the entire modelling<br />

process, and the interaction between the modelling and the water management processes. While<br />

this no doubt is affected by personal and career developments, it also reflects a general trend. We are<br />

no longer satisfied with being able to produce beautiful simulations with sophisticated new model<br />

codes; we also want to evaluate the credibility of such simulations and to apply them in real-world water<br />

management decisions.<br />

Certainly I did not foresee this development three decades ago. Against this background it is therefore not<br />

wise to make long-range forecasts of what we can expect to be the key issues for future modelling research.<br />

Hence, the following list does not pretend to cover all the most important research issues for<br />

modelling in the many years to come. Rather, it presents the issues which I, seen from the perspective<br />

dealt with in the present thesis, consider to be the most important and fundamental problems at present,<br />

requiring more research during the coming years.<br />

• Improved representation of heterogeneity in reactive transport modelling. There will always be a<br />

need to improve our conceptual understanding of hydrological processes. It appears that, whereas<br />

we have had some success with prediction of flows and hydraulic heads, the existing paradigms in<br />

hydrological modelling are not good enough to simulate concentrations of conservative and reactive<br />

contaminants. Flows and hydraulic heads are much less dependent on heterogeneity than concentrations,<br />

and it will be necessary to include heterogeneity much more explicitly in the modelling than<br />

has been done until now. Examples of areas where this is important include simulation of transport and fate<br />

of contaminants in aquifers and simulation of the stream-aquifer interaction governed by processes<br />

in river valleys.<br />

• Utilisation of new data types. Whenever possible we should try to make use of new data types. New<br />

techniques for collecting satellite data on surface conditions and geophysical data on subsurface<br />

features are promising and have not been fully exploited yet. We can hope and expect that better<br />

techniques will be developed during the coming years. Thus, it is not unrealistic to expect, within some years,<br />

improved data providing both a much better spatial resolution of catchment/aquifer properties<br />

and on-line information on state variables. The improved spatial resolution can help us give a better<br />

representation of heterogeneities in models (see above), while on-line information provides interesting<br />

potential for improved management. In order to utilise on-line data optimally, new and improved<br />

data assimilation (updating) techniques will be required.<br />


• Model structure error. Probably the most important single issue related to uncertainty of model predictions<br />

is how to assess uncertainty caused by model structure error. It is important, because the<br />

most interesting fields of model applications deal with assessments of the effects on the ecosystem<br />

of human activities. And it is at the same time fundamentally difficult, because in such situations we<br />

are using models beyond the conditions under which we can test the model performance against field<br />

data. I consider the framework based on multiple conceptual models ([15]) to be only a very first<br />

step in this respect.<br />

• Uncertainty and credibility of modelling in relation to water resources management. Uncertainty<br />

assessments of model predictions are crucial for a sound use of models in water resources management<br />

in practice. Model predictions without uncertainty assessments correspond to presenting only<br />

a (minor) part of the available information. Uncertainty in relation to water resources management<br />

in practice is not confined to statistical uncertainty. It must also include aspects of<br />

qualitative uncertainty and ignorance. Furthermore, uncertainty must be seen in a broad socioeconomic<br />

context where stakeholder and policy views are taken into account. There are many future<br />

challenges on this multi-disciplinary road. How do we ensure that models incorporate the best<br />

available information and adequately address the issues and the priorities set by water managers<br />

and stakeholders? How should we translate objectives and requirements formulated in qualitative<br />

language by water managers and stakeholders to accuracy criteria for a modelling study? And how<br />

should we compile and present uncertainties from a modelling study in a way that is understandable<br />

by non-modellers? Some of these questions are likely to be answered within the context of new water<br />

management paradigms such as adaptive management.<br />


6 References<br />

Abbott MB (1992) The theory of the hydrological model, or: the struggle for the soul of hydrology. In: O’Kane<br />

JP (Ed.) Advances in theoretical hydrology, Elsevier, 237-254.<br />

Abbott MB, Bathurst JC, Cunge JA, O'Connell PE, Rasmussen J (1986a) An introduction to the European<br />

Hydrological System - Systeme Hydrologique Européen "SHE", 1: History and philosophy of a physically-based<br />

distributed modelling system. Journal of Hydrology, 87, 45-59.<br />

Abbott MB, Bathurst JC, Cunge JA, O'Connell PE, Rasmussen J (1986b) An introduction to the European<br />

Hydrological System - Systeme Hydrologique Européen "SHE", 2: Structure of a physically-based distributed<br />

modelling system. Journal of Hydrology, 87, 61-77.<br />

Abrahamsen P, Hansen S (2000) Daisy: an open soil-crop-atmosphere system model. Environmental Modelling<br />

& Software, 15, 313-330.<br />

Andersen J, Refsgaard JC, Jensen KH (2001) Distributed hydrological modelling of the Senegal River Basin<br />

– model construction and validation. Journal of Hydrology, 247, 200-214.<br />

Anderson MP, Woessner WW (1992) The role of postaudit in model validation. Advances in Water Resources,<br />

15, 167-173.<br />

Babendreier JE (2003) National-scale multimedia risk assessment for hazardous waste disposal. International<br />

Workshop on Uncertainty, Sensitivity and Parameter Estimation for Multimedia Environmental<br />

Modelling held at U.S Nuclear Regulatory Commission, Rockville, Maryland, August 19-21, 2003. Proceedings,<br />

103-109.<br />

Bathurst JC (1986a) Physically-based distributed modelling of an upland catchment using the Systeme Hydrologique<br />

Européen. Journal of Hydrology, 87, 79-102.<br />

Bathurst JC (1986b) Sensitivity analysis of the Systeme Hydrologique Européen for an upland catchment.<br />

Journal of Hydrology, 87, 103-123.<br />

Beck MB (1987) Water quality modelling: a review of the analysis of uncertainty. Water Resources Research,<br />

23(8), 1393-1442.<br />

Beck MB (2005) Environmental foresight and structural change. Environmental Modelling & Software, 20,<br />

651-670.<br />

Bergström S (1976) Development and application of a conceptual runoff model for Scandinavian catchments.<br />

PhD Thesis, University of Lund, Bulletin Series A No 52.<br />

Bergström S (1992) The HBV model – its structure and applications. SMHI RH No 4. Norrköping.<br />

Bergström S (1995) The HBV model. In: Singh VP (Ed) Computer Models of Watershed Hydrology. Water<br />

Resources Publications, Highlands Ranch, Colorado, 443-476.<br />

Bergström S, Forsman A (1973) Development of a conceptual deterministic rainfall-runoff model. Nordic<br />

Hydrology, 4, 147-170.<br />

Beven K (1989) Changing ideas in hydrology – the case of physically based models. Journal of Hydrology,<br />

105, 157-172.<br />

Beven K (1995) Linking parameters across scales: Subgrid parameterization and scale dependent hydrological<br />

models. Hydrological Processes, 9, 507-525.<br />

Beven K (1996a) A discussion of distributed hydrological modelling. In: Abbott MB, Refsgaard JC (Eds):<br />

Distributed Hydrological Modelling, Kluwer Academic Publishers, 255-278.<br />

Beven K (1996b) Response to comments on ‘A discussion of distributed hydrological modelling’. In: Abbott<br />

MB, Refsgaard JC (Eds): Distributed Hydrological Modelling, Kluwer Academic Publishers, 289-295.<br />

Beven K (2001) How far can we go in distributed hydrological modelling? Hydrology and Earth System Sciences,<br />

5(1), 1-12.<br />

Beven K (2002a) Towards an alternative blueprint for a physically based digitally simulated hydrologic response<br />

modelling system. Hydrological Processes, 16(2), 189-206.<br />

Beven K (2002b) Towards a coherent philosophy for modelling the environment. Proceedings of the Royal<br />

Society of London, A, 458 (2026), 2465-2484.<br />


Beven K, Binley AM (1992) The future of distributed models: model calibration and uncertainty prediction.<br />

Hydrological Processes, 6, 279-298.<br />

Binley AM, Beven KJ, Calver A, Watts LG (1991) Changing Responses in Hydrology: Assessing the Uncertainty<br />

in Physically Based Model Predictions. Water Resources Research, 27(6), 1253-1261.<br />

Birkinshaw SJ, Ewen J (2000) Nitrogen transformation component for SHETRAN catchment nitrate transport<br />

modelling. Journal of Hydrology, 230, 1-17.<br />

Blöschl G, Sivapalan M (1995) Scale issues in hydrological modelling: A review. Hydrological Processes, 9,<br />

251-290.<br />

Brown JD (2004) Knowledge, uncertainty and physical geography: towards the development of methodologies<br />

for questioning belief. Transactions of the Institute of British Geographers 29(3), 367-381.<br />

Brown JD, Heuvelink GBM, Refsgaard JC (2005) An integrated framework for assessing and recording uncertainties<br />

about environmental data. Water Science and Technology, 52(6), 153-160.<br />

Brown JD, Heuvelink GBM (2005) Data Uncertainty Engine (DUE) User’s Manual. University of Amsterdam.<br />

http://www.harmonirib.com.<br />

Butts MB, Payne JT, Kristensen M, Madsen H (2004) An evaluation of the impact of model structure on hydrological<br />

modelling uncertainty for streamflow prediction. Journal of Hydrology, 298, 242-266.<br />

Burnash RJC (1995) The NWS river forecast system - catchment modelling. In: Singh VP (Ed): Computer<br />

Models of Watershed Hydrology, Water Resources Publications, 311-366.<br />

Christensen S (1994) Hydrological Model for the Tude Å Catchment. Nordic Hydrology, 25, 145-166.<br />

Conan C, Bouraoui F, Turpin N, de Marsily G, Bidoglio G (2003) Modelling Flow and Nitrate Fate at Catchment<br />

Scale in Brittany (France). Journal of Environmental Quality, 32, 2026-2032.<br />

Crawford NH, Linsley RK (1966) Digital simulation in hydrology, Stanford Watershed Model IV, Department<br />

of Civil Engineering, Stanford University, Technical Report 39.<br />

Currie JA (1961) Gaseous diffusion in the aeration of aggregated soils. Soil Science, 92, 40-45.<br />

Dagan G (1986) Statistical theory of groundwater flow and transport: pore to laboratory, laboratory to formation<br />

and formation to regional scale. Water Resources Research, 22(9), 120-134.<br />

De Marsily G, Combes P, Goblet P (1992) Comments on 'Ground-water models cannot be validated', by<br />

Konikow LF, Bredehoeft JD. Advances in Water Resources, 15, 367-369.<br />

Dewulf A, Craps M, Bouwen R, Pahl-Wostl C (2005) Integrated management of natural resources dealing<br />

with ambiguous issues, multiple actors and diverging frames. Water Science and Technology, 52(6),<br />

115-124.<br />

DHI (1995) MIKE 21 Short Description. Danish Hydraulic Institute, Hørsholm, Denmark.<br />

Djurhuus J, Hansen S, Schelde K, Jacobsen OH (1999) Modelling mean nitrate leaching from spatially variable<br />

fields using effective parameters. Geoderma, 87, 261-279.<br />

Doherty J (2003) Ground water model calibration using pilot points and regularization. Ground Water, 41(2),<br />

170-177.<br />

Duan Q, Sorooshian S, Gupta VK (1994) Optimal use of the SCE-UA global optimization method for calibrating<br />

watershed models. Journal of Hydrology 158, 265–284.<br />

Dubus IG, Brown CD, Beulke S (2003) Sources of uncertainty in pesticide fate modelling. The Science of<br />

the Total Environment, 317, 53-72.<br />

EC (1992) Working Group of Independent Experts on Variant C of the Gabcikovo-Nagymaros Project, Working<br />

Group Report, Commission of the European Communities, Czech and Slovak Federative Republic,<br />

Republic of Hungary, Budapest November 23, 1992.<br />

EC (1993a) Working Group of Monitoring and Water Management Experts for the Gabcikovo System of<br />

Locks - Data Report, Commission of the European Communities, Republic of Hungary, Slovak Republic,<br />

Budapest November 2, 1993.<br />

EC (1993b) Working Group of Monitoring and Water Management Experts for the Gabcikovo System of<br />

Locks - Report on Temporary Water Management Regime, Commission of the European Communities,<br />

Republic of Hungary, Slovak Republic, Bratislava, December 1, 1993.<br />

EC (2003a) Common Implementation Strategy for the Water Framework Directive (2000/60/EC). Guidance<br />

Document No. 7. Monitoring under the Water Framework Directive. Working Group 2.7. Office for the<br />

Official Publications of the European Communities, Luxembourg.<br />


EC (2003b) Common Implementation Strategy for the Water Framework Directive (2000/60/EC). Guidance<br />

Document No. 11. Planning Processes. Working Group 2.9. Office for the Official Publications of the<br />

European Communities, Luxembourg.<br />

EC (2004) Common Implementation Strategy for the Water Framework Directive (2000/60/EC) Guidance<br />

Document No 3, pressures and impacts, IMPRESS. Working Group 2.3. Office for the Official Publications<br />

of the European Communities, Luxembourg.<br />

Fleming G (1975) Computer simulation techniques in hydrology. Elsevier, New York.<br />

Franchini M, Pacciani M (1992) Comparative analysis of several conceptual rainfall-runoff models. Journal of<br />

Hydrology, 122, 161-219.<br />

Freeze RA, Harlan RL (1969) Blueprint for a physically-based digitally-simulated hydrologic response model.<br />

Journal of Hydrology, 9, 237-258.<br />

Gelhar LW (1986) Stochastic subsurface hydrology. From theory to application. Water Resources Research,<br />

22(9), 135-145.<br />

Graham DN, Butts MB (2005) Flexible integrated watershed modelling with MIKE SHE. In: Singh VP, Frevert<br />

DK (Eds) Watershed Models. CRC Press, Chapter 10.<br />

Graham LP (1999) Modelling runoff to the Baltic Sea, Ambio, 28, 328-334.<br />

Grayson RB, Moore ID, McMahon TA (1992a) Physically based hydrologic modelling, 1. A terrain-based<br />

model for investigative purposes. Water Resources Research, 28(10), 2639-2658.<br />

Grayson RB, Moore ID, McMahon TA (1992b) Physically based hydrologic modelling, 2. Is the concept realistic?<br />

Water Resources Research, 28(10), 2659-2665.<br />

Grayson R, Blöschl G (2000) Spatial Modelling of Catchment Dynamics. In: Grayson R, Blöschl G (Eds.)<br />

Spatial Patterns in Catchment Hydrology: Observations and Modelling. Cambridge University Press,<br />

UK.<br />

Groenenberg JE, Kros J, van der Salm C, de Vries W (1995) Application of the model NUCSAM to the<br />

Solling spruce site. Ecological Modelling, 83, 97-107.<br />

GWP (2000) Integrated Water Resources Management. TAC Background Papers No. 4. Global Water Partnership,<br />

Stockholm.<br />

Hansen S, Jensen HE, Nielsen NE, Svendsen H (1991) Simulation of nitrogen dynamics and biomass production<br />

in winter wheat using the Danish simulation model DAISY. Fertilizer Research, 27, 245-259.<br />

Hansen S, Thorsen M, Pebesma E, Kleeschulte S, Svendsen H (1999) Uncertainty in simulated leaching<br />

due to uncertainty in input data. A case study. Soil Use and Management, 15, 167-175.<br />

Harrar WG, Sonnenborg TO, Henriksen HJ (2003) Capture zone, travel time and solute transport predictions<br />

using inverse modelling and different geological models. Hydrogeology Journal, 11(5), 536-548.<br />

Havnø K, Madsen MN, Dørge J (1995) MIKE 11 - A Generalized River Modelling Package. In: Singh VP (Ed)<br />

Computer Models of Watershed Hydrology, Water Resources Publications, Highlands Ranch, Colorado,<br />

733-782.<br />

Henriksen HJ, Refsgaard JC, Sonnenborg TO, Gravesen P, Brun A, Refsgaard A, Jensen KH (2001) STÅBI i<br />

grundvandsmodellering (Handbook in groundwater modelling). Danmarks og Grønlands Geologiske<br />

Undersøgelse, Rapport 2001/56. (In Danish)<br />

Henriksen HJ, Troldborg L, Nyegaard P, Sonnenborg TO, Refsgaard JC, Madsen B (2003) Methodology for<br />

construction, calibration and validation of a national hydrological model for Denmark. Journal of Hydrology<br />

280, 52-71.<br />

Henriksen HJ, Refsgaard JC, Højberg AL, Ferrand N, Gijsbers P, Scholten H (submitted) Public participation<br />

in relation to quality assurance of water resources modelling (HarmoniQuA).<br />

Heuvelink GBM, Pebesma EJ (1999) Spatial aggregation and soil process modelling. Geoderma, 89, 47-65.<br />

Hill MC (1998) Methods and guidelines for effective model calibration. U.S. Geological Survey, Water-<br />

Resources Investigations Report 98-4005. Denver CO.<br />

Højberg AL, Refsgaard JC (2005) Model Uncertainty - Parameter uncertainty versus conceptual models.<br />

Water Science and Technology, 52(6), 177-186.<br />

ICJ (1997) Case Concerning Gabcikovo-Nagymaros project (Hungary/Slovakia). Summary of the Judgement of<br />

25 September 1997. International Court of Justice, The Hague.<br />


ICWE (1992) The Dublin Statement and report of the conference. International Conference on Water and the<br />

Environment: Development issues for the 21st century. 26-31 January 1992, Dublin, Ireland.<br />

IPCC (2001) Climate Change 2001: The Scientific Basis. Contribution of Working Group I to the Third Assessment<br />

Report of the Intergovernmental Panel on Climate Change [Houghton JT, Ding Y, Griggs DJ,<br />

Noguer M, van der Linden PJ, Dai X, Maskell K and Johnson CA (eds)]. Cambridge University Press,<br />

Cambridge, UK and New York, NY, USA, 881 pp.<br />

Jensen KH, Mantoglou A (1992) Application of stochastic unsaturated flow theory, numerical simulations,<br />

and comparisons to field observations. Water Resources Research, 28, 269-284.<br />

Jensen RA, Jørgensen GH (1988) Hydrologisk overfladevands/grundvands model (Hydrological surface<br />

water/groundwater model). Technical report prepared by Danish Hydraulic Institute for the County of<br />

Storstrøm and the County of Vestsjælland. (in Danish)<br />

Jensen KH, Refsgaard JC (1991a) Spatial variability of physical parameters and processes in two field soils.<br />

Part I: Water Flow and Solute Transport at Local Scale. Nordic Hydrology, 22, 275-302.<br />

Jensen KH, Refsgaard JC (1991b) Spatial variability of physical parameters and processes in two field soils.<br />

Part II: Water flow at field scale. Nordic Hydrology, 22, 303-326.<br />

Jensen KH, Refsgaard JC (1991c) Spatial variability of physical parameters and processes in two field soils.<br />

Part III: Solute Transport at Field Scale. Nordic Hydrology, 22, 327-340.<br />

Jønch-Clausen T (1979) SHE. Système Hydrologique Européen. A short description. Danish Hydraulic Institute,<br />

Hørsholm, Denmark.<br />

Jønch-Clausen T (2004) Integrated Water Resources Management (IWRM) and Water Efficiency Plans by<br />

2005. Why, What and How? Global Water Partnership, TEC Background Papers No. 10, Stockholm.<br />

Jønch-Clausen T, Refsgaard JC (1984) A Mathematical Modelling System for Flood Forecasting. Nordic<br />

Hydrology, 15, 307-318.<br />

Kaiser-Hill (2001) Model Code and Scenario Selection Report Site-Wide Water Balance Rocky Flats Environmental<br />

Technology Site. Report 01-RF-00337. Kaiser-Hill Company LLC.<br />

Klauer B, Brown JD (2003) Conceptualising imperfect knowledge in public decision making: ignorance, uncertainty,<br />

error and ‘risk situations’. Environmental Research, Engineering and Management.<br />

Klemes V (1986) Operational testing of hydrological simulation models. Hydrological Sciences Journal, 31,<br />

13-24.<br />

Knudsen J, Thomsen A, Refsgaard JC (1986) WATBAL: A semi-distributed, physically based hydrological<br />

modelling system. Nordic Hydrology, 17, 347-362.<br />

Konikow LF, Bredehoeft JD (1992) Ground-water models cannot be validated. Advances in Water Resources,<br />

15, 75-83.<br />

Kros J, Reinds GJ, de Vries W, Latour JB, Bollen M (1995) Modelling of soil acidity and nitrogen availability<br />

in natural ecosystems in response to changes in acid deposition and hydrology. Report 95, DLO Winand<br />

Staring Centre, Wageningen.<br />

Kuchment LS, Demidov VN, Naden PS, Cooper DM, Broadhurst P (1996) Rainfall-runoff modelling of the<br />

Ouse basin, North Yorkshire: an application of a physically based distributed model. Journal of Hydrology,<br />

181, 323-342.<br />

Lane SA, Richards KS (2001) The ‘Validation’ of Hydrodynamic Models: Some Critical Perspectives. In:<br />

Anderson MG, Bates PD (Eds) Model Validation: Perspectives in Hydrological Science, 413-438. John<br />

Wiley & Sons, Ltd.<br />

Linkov I, Burmistrov D (2003) Model Uncertainty and Choices Made by Modelers: Lessons Learned from the<br />

International Atomic Energy Model Intercomparisons. Risk Analysis, 23(6), 1297-1308.<br />

Lloyd JW (1980) The importance of drift deposit influences on the hydrogeology of major British aquifers.<br />

Institution of Water Engineers and Scientists, Journal, 34, 346-356.<br />

Loague KM, Freeze RA (1985) A Comparison of Rainfall-Runoff Modelling Techniques on Small Upland<br />

Catchments. Water Resources Research, 21(2), 1985.<br />

Luckner L (1978) Gekoppelte Grundwasser-Oberflächenwassermodelle (A coupled groundwater-surface<br />

water model). Wasserwirtschaft-Wassertechnik, 1978, 276-278 (In German).<br />

Madsen H, Skotner C (2005) Adaptive state updating in real-time river flow forecasting – a combined filtering<br />

and error forecasting procedure. Journal of Hydrology, 308, 302-312.<br />


Michaud J, Sorooshian S (1994) Comparison of simple versus complex distributed runoff models on a midsized<br />

semiarid watershed. Water Resources Research, 30(3), 593-605.<br />

Michaud JD, Shuttleworth WJ (1997) Executive summary of the Tucson aggregation workshop. Journal of<br />

Hydrology, 190, 176-181.<br />

Middlemis H (2000) Murray-Darling Basin Commission. Groundwater flow modelling guideline. Aquaterra<br />

Consulting Pty Ltd., South Perth. Western Australia. Project no. 125.<br />

Miles JC, Rushton KR (1983) A coupled surface water and groundwater catchment model. Journal of Hydrology,<br />

62, 159-177.<br />

Neuman SP, Wierenga PJ (2003) A comprehensive strategy of hydrogeologic modeling and uncertainty<br />

analysis for nuclear facilities and sites. University of Arizona, Report NUREG/CR-6805.<br />

Nielsen DR, Biggar JW, Erh KT (1973) Spatial variability of field measured soil water properties. Hilgardia,<br />

42, 215-259.<br />

Nielsen SA, Hansen E (1973) Numerical simulation of the rainfall-runoff process on a daily basis. Nordic<br />

Hydrology, 4, 171-190.<br />

NRC (1990) Ground Water Models: Scientific and Regulatory Applications. National Research Council, National<br />

Academy Press, Washington, D.C.<br />

Oreskes N, Shrader-Frechette K, Belitz K (1994) Verification, validation and confirmation of numerical models<br />

in the earth sciences. Science, 264, 641-646.<br />

Pahl-Wostl C (2002) Towards sustainability in the water sector – The importance of human actors and processes<br />

of social learning. Aquatic Sciences, 64, 394-411.<br />

Panday S, Huyakorn PS (2004) A fully coupled physically-based spatially-distributed model for evaluating<br />

surface/subsurface flow. Advances in Water Resources, 27, 361-382.<br />

Pappenberger F, Beven KJ (2006) Ignorance is bliss: Or seven reasons not to use uncertainty analysis. Water<br />

Resources Research 42, W05302, doi:10.1029/2005WR004820.<br />

Pascual P, Steiber N, Sunderland E (2003) Draft guidance on development, evaluation and application of<br />

regulatory environmental models. The Council for Regulatory Environmental Modeling. Office of Science<br />

Policy, Office of Research and Development. US Environmental Protection Agency, Washington<br />

D.C. 60 pp.<br />

Perkins SP, Sophocleous M (1999) Development of a Comprehensive Watershed Model Applied to Study<br />

Stream Yield under Drought Conditions. Ground Water, 37(3), 418-426.<br />

Perrin C, Michel C, Andréassian V (2001) Does a large number of parameters enhance model performance?<br />

Comparative assessment of common catchment model structures on 429 catchments. Journal of Hydrology,<br />

242, 275-301.<br />

Poeter E, Anderson D (2005) Multiple Ranking and Inference in Ground Water Modeling. Ground Water,<br />

43(4), 597-605.<br />

Popper KR (1959) The logic of scientific discovery. Hutchinson & Co, London.<br />

Prickett TA, Lonnquist CG (1971) Selected digital computer techniques for groundwater resource evaluation.<br />

Illinois State Water Survey, Bulletin 55.<br />

Querner EP (1997) Description and application of the combined surface and groundwater flow model<br />

MOGROW. Journal of Hydrology, 192, 158-188.<br />

Quinn PF, Beven KJ (1993) Spatial and temporal predictions of soil moisture dynamics, runoff, variable<br />

source areas and evapotranspiration for Plynlimon, Mid-Wales. Hydrological Processes, 7, 425-448.<br />

Radwan M, Willems P, Berlamont J (2004) Sensitivity and uncertainty analysis for river quality modelling.<br />

Journal of Hydroinformatics, 6, 83-99.<br />

Reed S, Koren V, Smith M, Zhang Z, Moreda F, Seo D-J (2004) Overall distributed model intercomparison<br />

project results. Journal of Hydrology, 298, 27-60.<br />

Refsgaard JC (1981) The surface water component of an integrated hydrological model. Danish Committee for<br />

Hydrology. Suså Report No. H12.<br />

Refsgaard JC (1996) Terminology, modelling protocol and classification of hydrological model codes. In:<br />

Abbott MB, Refsgaard JC (Eds): Distributed Hydrological Modelling, Kluwer Academic Publishers, 17-39.<br />


Refsgaard JC, Stang O (1981) An integrated groundwater/surface water hydrological model. Danish Committee<br />

for Hydrology. Suså Report No. H13.<br />

Refsgaard JC, Rosbjerg D, Markussen LM (1983) Application of Kalman filter to real-time operation and to<br />

uncertainty analyses in hydrological modelling. IAHS Publication No 147, 273-282.<br />

Refsgaard JC, Storm B (1995) MIKE SHE. In: Singh VP (Ed) Computer Models of Watershed Hydrology.<br />

Water Resources Publications, Highlands Ranch, Colorado, 809-846.<br />

Refsgaard JC, Storm B, Abbott MB (1996) Comments on ‘A discussion of distributed hydrological modelling’.<br />

In: Abbott MB, Refsgaard JC (Eds): Distributed Hydrological Modelling, Kluwer Academic Publishers,<br />

279-287.<br />

Refsgaard JC, Ramaekers D, Heuvelink GBM, Schreurs V, Kros H, Rosén L, Hansen S (1998) Assessment<br />

of ‘cumulative’ uncertainty in spatial decision support systems: Application to examine the contamination<br />

of groundwater from diffuse sources (UNCERSDSS). Presented at the European Climate Science<br />

Conference, Vienna, 19-23 October 1998.<br />

Refsgaard JC, Butts MB (1999) Determination of grid scale parameters in catchment modelling by upscaling<br />

local scale parameters. Key note presentation. Proceedings of the EurAgEng International Workshop<br />

on Modelling of transport processes in soils at various scales in time and space, 24-26 November<br />

1999, Leuven, Belgium.<br />

Refsgaard JC, van der Sluijs JP, Højberg AL, Vanrolleghem P (2005) Harmoni-CA Guidance Uncertainty<br />

Analysis. Guidance 1. 46 pp. www.harmoni-ca.info.<br />

Rykiel ER (1996) Testing ecological models: the meaning of validation. Ecological Modelling, 90, 229-244.<br />

Saulnier GM, Beven K, Obled C (1997) Digital elevation analysis for distributed hydrological modelling: Reducing<br />

scale dependence in effective hydraulic conductivity values. Water Resources Research,<br />

33(9), 2097-2101.<br />

Scholten H, Van Waveren RH, Groot S, Van Geer FC, Wösten JHM, Koeze RD, Noort JJ (2000) Good Modelling<br />

Practice in water management. Paper presented at Hydroinformatics 2000, Cedar Rapids, IA,<br />

USA.<br />

Scholten H, Groot S (2002) Dutch guidelines. In: Refsgaard, JC (Ed) State-of-the-Art Report on Quality Assurance<br />

in modelling related to river basin management. Chapter 12, Geological Survey of Denmark<br />

and Greenland, Copenhagen. www.harmoniqua.org.<br />

Scholten H, Kassahun A, Refsgaard JC, Kargas T, Gavardinas C, Beulens AJM (2007) A methodology to<br />

support multidisciplinary model-based water management. Environmental Modelling & Software, 22,<br />

743-759.<br />

Singh VP (Ed) (1995) Computer Models of Watershed Hydrology. Water Resources Publications, Highlands<br />

Ranch, Colorado.<br />

Smith KA (1980) A model of the extent of anaerobic zones in aggregated soils and its potential application to<br />

estimates of denitrification. Journal of Soil Science, 31, 263-277.<br />

Sonnenborg TO, Christensen BSB, Nyegaard P, Henriksen HJ, Refsgaard JC (2003) Transient modelling of<br />

regional groundwater flow using parameter estimates from steady-state automatic calibration. Journal<br />

of Hydrology, 273, 188-204.<br />

Stang O (1981) A regional groundwater model for the Suså area. Danish Committee for Hydrology. Suså Report<br />

No. H9.<br />

Styczen M, Storm B (1993a) Modelling of N-movements on catchment scale – a tool for analysis and decision-making.<br />

1. Model description. Fertilizer Research, 36, 1-6.<br />

Styczen M, Storm B (1993b) Modelling of N-movements on catchment scale – a tool for analysis and decision-making.<br />

2. A case study. Fertilizer Research, 36, 7-17.<br />

Tampa Bay Water (2001) Scientific review of integrated hydrologic model ISGW/CNTB121. Prepared by West<br />

Consultants, Gartner Lee Ltd and AQUA TERRA Consultants for Tampa Bay Water, Florida.<br />

Thomas RG (1973) Groundwater models. FAO, Irrigation and Drainage Paper 21, Rome.<br />

Troch PA, Mancini M, Paniconi C, Wood EF (1993) Evaluation of a Distributed Catchment Scale Water<br />

Balance Model. Water Resources Research, 29(6), 1805-1817.<br />

Troeh FR, Jabro JD, Kirkham D (1982) Gaseous diffusion equations for porous materials. Geoderma, 27,<br />

239-253.<br />


Troldborg L (2004) The influence of conceptual geological models on the simulation of flow and transport in<br />

Quaternary aquifer systems. PhD Thesis. Geological Survey of Denmark and Greenland, Report<br />

2004/107.<br />

Van Asselt MBA, Rotmans J (2002) Uncertainty in Integrated Assessment Modelling. From Positivism to<br />

Pluralism. Climatic Change, 54: 75-105.<br />

Van der Sluijs JP, Craye M, Funtowicz SO, Kloprogge P, Ravetz J, Risbey JS (2005) Combining Quantitative<br />

and Qualitative Measures of Uncertainty in Model based Foresight Studies: the NUSAP System. Risk<br />

Analysis, 25(2), 481-492.<br />

Van Griensven A, Meixner T (2004) Dealing with unidentifiable sources of uncertainty within environmental<br />

models. In: Pahl C, Schmidt S, Jakeman T. (Eds.), iEMSs 2004 International Congress: "Complexity<br />

and Integrated Resources Management". International Environmental Modelling and Software Society,<br />

Osnabrück, Germany, June 2004.<br />

Van Loon E, Refsgaard JC (eds.) (2005) Guidelines for assessing data uncertainty in hydrological studies.<br />

HarmoniRiB Report. Geological Survey of Denmark and Greenland. http://www.harmonirib.com.<br />

Van Waveren RH, Groot S, Scholten H, Van Geer FC, Wösten JHM, Koeze RD, Noort JJ (2000) Good Modelling<br />

Practice Handbook, STOWA Report 99-05, Utrecht, RWS-RIZA, Lelystad, The Netherlands,<br />

http://waterland.net/riza/aquest/<br />

Vrugt J, Diks CGH, Gupta HV (2005) Improved treatment of uncertainty in hydrologic modelling: Combining<br />

the strengths of global optimization and data assimilation. Water Resources Research, 41, W01017,<br />

doi:10.1029/2004WR003059.<br />

Walker WE, Harremoës P, Rotmans J, Van der Sluijs JP, Van Asselt MBA, Janssen P, Krayer von Krauss<br />

MP (2003) Defining Uncertainty: A Conceptual Basis for Uncertainty Management in Model-Based Decision<br />

Support. Integrated Assessment, 4(1), 5-17.<br />

Wardlaw RB (1978) The development of a deterministic integrated surface/subsurface hydrological response<br />

model. PhD Thesis, University of Strathclyde, Glasgow.<br />

Wardlaw RB, Wyness A, Rippon P (1994) Integrated catchment modelling. Surveys in Geophysics, 15, 311-330.<br />

Weeks JB (1974) Simulated effects of oil-shale development on the hydrology of the Piceance basin, Colorado.<br />

US Geological Survey, Professional Paper 908.<br />

Wen X-H, Gómez-Hernández JJ (1996) Upscaling hydraulic conductivities in heterogeneous media: An overview.<br />

Journal of Hydrology, 183, ix-xxxii.<br />

WMO (1975) Intercomparison of conceptual models used in operational hydrological forecasting. WMO Operational<br />

Hydrology Report No 7, WMO No 429, World Meteorological Organisation, Geneva.<br />

WMO (1988) Intercomparison of models for snowmelt runoff. WMO Operational Hydrology Report No 23,<br />

WMO No 646, World Meteorological Organisation, Geneva.<br />

WMO (1992) Simulated real-time intercomparison of hydrological models. WMO Operational Hydrology Report<br />

No 38, WMO No 779, World Meteorological Organisation, Geneva.<br />

Wolf J, Beusen AHW, Groenendijk P, Kroon T, Rötter R, van Zeijts H (2003) The integrated modelling system<br />

STONE for calculating nutrient emissions from agriculture in the Netherlands. Environmental Modelling &<br />

Software, 18, 597-617.<br />

Wood EF, Sivapalan M, Beven KJ, Band L (1988) Effects of spatial variability and scale with implications to<br />

hydrologic modelling. Journal of Hydrology, 102, 29-47.<br />

WSSTP (2005) Water safe strong and sustainable. A European vision for water supply and sanitation in<br />

2030. Water Supply and Sanitation Technology Platform. October 2005. http://www.wsstp.org<br />

WWAP (2003) Water for People, Water for Life. UN World Water Development Report. Prepared as a collaborative<br />

effort of 23 UN agencies and convention secretariats co-ordinated by the World Water Assessment<br />

Programme. UNESCO, Paris. http://www.unesco.org/water/wwap/index.shtml<br />



[1]<br />

Refsgaard JC, Hansen E (1982) A Distributed Groundwater/Surface Water<br />

Model for the Suså Catchment. Part 1: Model Description.<br />

Nordic Hydrology, 13, 299-310.<br />

Reprinted with permission from Nordic Hydrology


[2]<br />

Refsgaard JC, Hansen E (1982) A Distributed Groundwater/Surface Water<br />

Model for the Suså Catchment. Part 2: Simulations of Streamflow Depletions<br />

Due to Groundwater Abstraction.<br />

Nordic Hydrology, 13, 311-322.<br />

Reprinted with permission from Nordic Hydrology


[3]<br />

Refsgaard JC, Christensen TH, Ammentorp HC (1991) A model for oxygen<br />

transport and consumption in the unsaturated zone.<br />

Journal of Hydrology, 129, 349-369.<br />

Reprinted from Journal of Hydrology with permission from Elsevier


[4]<br />

Refsgaard JC, Seth SM, Bathurst JC, Erlich M, Storm B, Jørgensen GH,<br />

Chandra S (1992) Application of the SHE to catchments in India - Part 1:<br />

General results.<br />

Journal of Hydrology, 140, 1-23.<br />

Reprinted from Journal of Hydrology with permission from Elsevier


[5]<br />

Jain SK, Storm B, Bathurst JC, Refsgaard JC, Singh RD (1992) Application of<br />

the SHE to catchments in India - Part 2: Field experiments and simulation<br />

studies with the SHE on the Kolar subcatchment of the Narmada River.<br />

Journal of Hydrology, 140, 25-47.<br />

Reprinted from Journal of Hydrology with permission from Elsevier


[6]<br />

Refsgaard JC, Knudsen J (1996) Operational validation and intercomparison<br />

of different types of hydrological models.<br />

Water Resources Research, 32 (7), 2189-2202.<br />

Reproduced by permission of American Geophysical Union


WATER RESOURCES RESEARCH, VOL. 32, NO. 7, PAGES 2189–2202, JULY 1996<br />

Operational validation and intercomparison of different types<br />

of hydrological models<br />

Jens Christian Refsgaard and Jesper Knudsen<br />

Danish Hydraulic Institute, Hørsholm, Denmark<br />

Abstract. A theoretical framework for model validation, based on the methodology<br />

originally proposed by Klemes [1985, 1986], is presented. It includes a hierarchical<br />

validation testing scheme for model application to runoff prediction in gauged and<br />

ungauged catchments subject to stationary and nonstationary climate conditions. A case<br />

study on validation and intercomparison of three different models on three catchments in<br />

Zimbabwe is described. The three models represent a lumped conceptual modeling system<br />

(NAM), a distributed physically based system (MIKE SHE), and an intermediate<br />

approach (WATBAL). It is concluded that all models performed equally well when at<br />

least 1 year’s data were available for calibration, while the distributed models performed<br />

marginally better for cases where no calibration was allowed.<br />

Introduction<br />

Copyright 1996 by the American Geophysical Union.<br />

Paper number 96WR00896.<br />

0043-1397/96/96WR-00896$09.00<br />

In recent years water resources studies have become increasingly<br />

concerned with aspects of water resources for which data<br />

are not directly available. Examples include studies of the<br />

development potential of ungauged areas, environmental impacts<br />

of land use changes related to agricultural and forestry<br />

practices, conjunctive use of groundwater and surface water,<br />

and climate impact studies concerned with the effects on water<br />

resources of an anticipated climate change.<br />

In these and other types of studies, hydrological simulation<br />

models are often used to provide the missing information as a<br />

basis for decisions regarding the development and management<br />

of water and land resources.<br />

Traditionally, hydrological simulation modeling systems are<br />

classified in three main groups, namely, (1) empirical black<br />

box, (2) lumped conceptual, and (3) distributed physically<br />

based systems. The great majority of the modeling systems<br />

used in practice today belong to the simple types (1) or (2)<br />

and require a modest number of parameters (approximately<br />

5–10) to be calibrated for their operation. Despite their simplicity,<br />

many models have proven quite successful in representing<br />

an already measured hydrograph.<br />

A severe drawback of these traditional modeling systems,<br />

however, is that their parameters are not directly related to the<br />

physical conditions of the catchment. Accordingly, it may be<br />

expected that their applicability is limited to areas where runoff<br />

has been measured for some years and where no significant<br />

change in catchment conditions has occurred.<br />

To provide a more appropriate tool for the type of studies<br />

mentioned above, considerable efforts within hydrological research<br />

have been directed toward development of distributed<br />

physically based catchment models. Such models use parameters<br />

which are related directly to the physical characteristics of<br />

the catchment (topography, soil, vegetation, and geology) and<br />

operate within a distributed framework to account for the<br />

spatial variability of both physical characteristics and meteorological<br />

conditions. These models aim at describing the hydrological<br />

processes and their interaction as and where they<br />

occur in the catchment and therefore offer the prospect of<br />

remedying the shortcomings of the traditional rainfall runoff<br />

models.<br />

Although there appears to be a certain degree of consensus<br />

at the theoretical level regarding the potential of the distributed<br />

physically based types of models, there are widely divergent<br />

points of view as to whether they offer a significant improvement<br />

in actual performance when compared to the well-proven<br />

lumped conceptual model type. Beven [1989, p. 161]<br />

argues from theoretical considerations of scale problems that<br />

“the current generation of distributed physically based models<br />

are lumped conceptual models,” and, further, that all current<br />

physically based models “are not well suited to applications to<br />

real catchments.” Grayson et al. [1992] support this view and<br />

claim that physically based models have been oversold by their<br />

developers. Other authors, for example, Smith et al. [1994],<br />

argue that this criticism is “overly pessimistic.”<br />

An evaluation of the capabilities of hydrological models<br />

when applied in the absence of site calibration data and limited<br />

validation data to predict the effects of major land use changes<br />

was made by the Task Committee on Quantifying Land-Use<br />

Change Effects [U.S. Committee, 1985], which reported a great<br />

belief among committee members in the capabilities of 28<br />

surface water hydrological modeling systems, most of which<br />

can be classified as lumped conceptual models. In view of the<br />

limited number of model comparison studies conducted and<br />

the less-than-encouraging results often obtained, this confidence<br />

is remarkable. According to the U.S. Committee [1985, p.<br />

1], “the reasons for this confidence were explored and appear<br />

to be based upon personal experience, possibly tempered by<br />

belief in the model originators.”<br />

Owing to the complexity of the problems involved, further<br />

theoretical evaluation is not likely to provide a definite conclusion<br />

regarding the capability and limitation of distributed,<br />

physically based modeling systems. For establishing a basis to<br />

better advance the discussion, relevant model validations appear<br />

to be a more fruitful approach, where the models concerned<br />

simply are subjected to a range of practical modeling<br />

tests to validate their capability for undertaking particular<br />

tasks.<br />


In this respect, Klemes [1986, p. 17] has developed a hierarchical<br />

scheme for model testing, which is based on the philosophy<br />

that “a hydrological simulation model must demonstrate,<br />

before it is used operationally, how well it can perform<br />

the kind of task for which it is intended.” It may appear needless<br />

to advocate such a basic and evident requirement. Unfortunately,<br />

it is well justified in view of the current practice in<br />

hydrological model testing.<br />

The present paper is based on results from a research<br />

project conducted at the Danish Hydraulic Institute (DHI)<br />

[1993a]. The project had two major objectives. The first objective<br />

was to identify a rigorous framework for the testing of<br />

model capabilities for different types of tasks. The second<br />

objective was to use this theoretical framework and conduct an<br />

intercomparison study involving application of three modeling<br />

systems of different complexity to a number of tasks ranging<br />

from traditional simulation of stationary, gauged catchments to<br />

simulation of ungauged catchments and of catchments with<br />

nonstationary climate conditions. Data from three catchments<br />

in Zimbabwe were used for the tests. The research project was<br />

a contribution to project D.5, “Testing the transferability of<br />

hydrological simulation models,” forming part of the World<br />

Climate Programme—Water [World Meteorological Organization<br />

(WMO), 1985].<br />

Some of the results of DHI [1993a] were presented by<br />

Refsgaard [1996] with a focus on modeling the land surface<br />

processes and the coupling between hydrological and atmospheric<br />

models within the global change context. Thus Refsgaard<br />

[1996] presents some of the results from two of the<br />

Zimbabwean catchments to illustrate data requirements and<br />

form the basis for conclusions regarding which type of hydrological<br />

model is required for climate change modeling. The<br />

present paper, on the other hand, emphasizes the modeling<br />

methodology and contains a summary of all the test results<br />

from all three Zimbabwean catchments. It furthermore<br />

provides a general discussion of these results with references to<br />

similar studies reported in the literature.<br />

Theoretical Framework for Model Validation<br />

Terminology<br />

No unique and generally accepted terminology is presently<br />

used in the hydrological community with regard to issues related<br />

to model validation. The framework used in the present<br />

paper is basically in line with the terminology defined by<br />

Schlesinger et al. [1979], Tsang [1991], and Flavelle [1992] and<br />

comprises the following key definitions.<br />

A modeling system (i.e., code) is a generalized software<br />

package, which can be used for different catchments without<br />

modifying the source code. Examples of modeling systems are<br />

MIKE SHE, SACRAMENTO, and MODFLOW.<br />

A model is a site-specific application of a modeling system,<br />

including given input data and specific parameter values. An<br />

example of a model is a MIKE SHE–based model for the<br />

Ngezi catchment (cf. the case study below).<br />

A modeling system or a code can be “verified.” A code<br />

verification involves comparison of the numerical solution generated<br />

by the code with one or more analytical solutions or<br />

with other numerical solutions. Verification ensures that the<br />

computer program accurately solves the equations that constitute<br />

the mathematical model.<br />
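The idea of code verification can be illustrated with a minimal sketch. It assumes a single linear reservoir, dS/dt = -kS, as the mathematical model and an explicit Euler scheme as the "code" under test; neither is taken from the modeling systems discussed in this paper, and all names and values are hypothetical.

```python
import math


def euler_linear_reservoir(s0, k, dt, n_steps):
    """Explicit Euler solution of dS/dt = -k*S, returned as a list of storages."""
    storages = [s0]
    for _ in range(n_steps):
        storages.append(storages[-1] * (1.0 - k * dt))
    return storages


def analytical_linear_reservoir(s0, k, t):
    """Exact solution S(t) = S0 * exp(-k*t) of the same equation."""
    return s0 * math.exp(-k * t)


def max_abs_error(s0, k, dt, n_steps):
    """Largest absolute difference between numerical and analytical storages."""
    numerical = euler_linear_reservoir(s0, k, dt, n_steps)
    return max(
        abs(s - analytical_linear_reservoir(s0, k, i * dt))
        for i, s in enumerate(numerical)
    )


# Hypothetical values: the same 50 time units simulated with two step sizes.
coarse_error = max_abs_error(s0=100.0, k=0.1, dt=1.0, n_steps=50)
fine_error = max_abs_error(s0=100.0, k=0.1, dt=0.1, n_steps=500)
```

If the scheme is coded correctly, the error should shrink as the time step is refined, which is the kind of evidence a code verification looks for.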

Model validation is here defined as the process of demonstrating<br />

that a given site-specific model is capable of making<br />

accurate predictions for periods outside a calibration period. A<br />

model is said to be validated if its accuracy and predictive<br />

capability in the validation period have been proven to lie<br />

within acceptable limits or errors. It is important to notice that<br />

the term model validation refers to a site-specific validation of<br />

a model. This must not be confused with a more general<br />

validation of a generalized modeling system which, in principle,<br />

will never be possible.<br />
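The definition of validation above may be sketched in code as follows. The sketch assumes a toy one-parameter model (runoff = c × rainfall), a relative volume error as the accuracy measure, and a user-chosen acceptance limit; the model, the criterion, the limit and the data series are all illustrative assumptions and not those of the present study.

```python
def calibrate_runoff_coefficient(rain, runoff):
    """Least-squares estimate of c in the toy model runoff = c * rain."""
    return sum(p * q for p, q in zip(rain, runoff)) / sum(p * p for p in rain)


def relative_volume_error(simulated, observed):
    """Absolute difference in total volume, relative to the observed volume."""
    return abs(sum(simulated) - sum(observed)) / sum(observed)


def split_sample_validation(rain, runoff, n_calibration, acceptance_limit):
    """Calibrate on the first n_calibration steps, validate on the remainder."""
    c = calibrate_runoff_coefficient(rain[:n_calibration], runoff[:n_calibration])
    simulated = [c * p for p in rain[n_calibration:]]
    error = relative_volume_error(simulated, runoff[n_calibration:])
    return c, error, error <= acceptance_limit


# Synthetic, purely illustrative series (not data from the study).
rain = [10.0, 0.0, 25.0, 5.0, 40.0, 15.0, 0.0, 30.0]
runoff = [3.1, 0.0, 7.4, 1.6, 12.2, 4.4, 0.0, 9.1]
c, error, validated = split_sample_validation(rain, runoff, 4, 0.10)
```

The point of the design is that the acceptance limit is fixed before the validation period is examined, so that the verdict is not adjusted to fit the result.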

Testing Scheme for Validation of Hydrological Models<br />

The hierarchical testing scheme proposed by Klemes [1985,<br />

1986] appears suitable for testing the capability of a model to<br />

predict the hydrological effect of climate change, land use<br />

change, and other nonstationary conditions. Klemes distinguished<br />

between simulations conducted for the same station<br />

(catchment) used for calibration and simulations conducted<br />

for ungauged catchments. He also distinguished between cases<br />

where climate, land use, and other catchment characteristics<br />

remain unchanged (are stationary) and cases where they are<br />

not. This leads to the definitions of four basic categories of<br />

typical modeling tests.<br />

1. The split-sample test (SS) involves calibration of a<br />

model based on 3–5 years of data and validation on another<br />

period of a similar length.<br />

2. The differential split-sample test (DSS) involves calibration<br />

of a model based on data before catchment change occurs,<br />

adjustment of model parameters to characterize the change,<br />

and validation on the subsequent period.<br />

3. In the proxy-basin test (PB) no direct calibration is allowed,<br />

but advantage may be taken of information from other<br />

gauged catchments. Hence validation will comprise identification<br />

of a gauged catchment deemed to be of a nature similar to<br />

that of the validation catchment; initial calibration; transfer of<br />

model, including adjustment of parameters to reflect actual<br />

conditions within validation catchment; and validation.<br />

4. With the proxy-basin differential split-sample test (PB-<br />

DSS), again no direct calibration is allowed, but information<br />

from other catchments may be used. Hence validation will<br />

comprise initial calibration on the other relevant catchment,<br />

transfer of model to validation catchment, selection of two<br />

parameter sets to represent the periods before and after the<br />

change, and subsequent validations on both periods.<br />
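The two distinctions underlying the four categories above can be summarized in a small sketch. The function name and its boolean arguments are hypothetical; "gauged" is taken here to mean that calibration data exist for the catchment in question.

```python
def select_validation_test(gauged: bool, stationary: bool) -> str:
    """Return the test category for a given combination of conditions."""
    if gauged and stationary:
        return "SS"      # split-sample test
    if gauged:
        return "DSS"     # differential split-sample test
    if stationary:
        return "PB"      # proxy-basin test
    return "PB-DSS"      # proxy-basin differential split-sample test


# Example: an ungauged catchment under changing climate conditions.
test_for_ungauged_change = select_validation_test(gauged=False, stationary=False)
```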

Relevant Literature on Model Intercomparison<br />

Studies<br />

The testing of hydrological models through validation on<br />

independent data has for a long time been emphasized by the<br />

World Meteorological Organization (WMO). In their pioneering<br />

studies [WMO, 1975, 1986, 1992] several hydrological modeling<br />

systems of the empirical black box and the lumped conceptual<br />

types were tested on the same data from different<br />

catchments. The actual testing, however, only included the<br />

standard SS test comprising an initial calibration of a model<br />

and subsequent validation based on data from an independent<br />

period. No firm conclusions were derived regarding significant<br />

differences in performance among different model types.<br />

Franchini and Pacciani [1991] made a comparative analysis<br />

of seven different lumped conceptual models. They used an SS<br />

testing approach calibrating on a 1-month period and validating<br />

on a subsequent 3-month period. They concluded that in<br />

spite of a wide range of structural complexity all the models<br />

produced similar and equally valid results. With regard to the



question of whether the simpler or the more complex variants<br />

within this group of models are better, they concluded that<br />

significantly different models produced basically equivalent results,<br />

with calibration times being generally proportional to the<br />

complexity of their structure. On the other hand, they concluded<br />

that the model structure should not be made too simple,<br />

because it will then cause a loss of the link with the physics<br />

of the problem and of the possibility of taking advantage of<br />

prior knowledge of the geomorphological nature of the catchment.<br />

Other researchers have conducted similar intercomparison<br />

studies involving empirical black box models and lumped conceptual<br />

models [Naef, 1981; Wilcox et al., 1990] with similar<br />

conclusions.<br />

Only a few studies have included comparisons of distributed<br />

physically based models with simpler models. Loague and<br />

Freeze [1985] in a classical study compared two empirical black<br />

box modeling systems (a regression model and a unit hydrograph<br />

model) and a quasi physically based system on three<br />

small experimental catchments ranging from 10 ha to 7.2 km².<br />

The models were used on an event basis to simulate runoff<br />

peaks. The two empirical models were calibrated against runoff<br />

data and subsequently validated on independent data in an SS<br />

approach. The parameter values for the quasi physically based<br />

model were assessed directly from field data and not subject to<br />

any calibration before being validated against the same data as<br />

the two other models. Loague and Freeze [1985] found that all<br />

models performed poorly. For one catchment the quasi physically<br />

based model was subsequently applied with and without<br />

calibration of one key model parameter. Such calibration had<br />

little impact on the model performance during the validation<br />

period.<br />

In a study in the semiarid 150 km² Walnut Gulch experimental<br />

watershed, Michaud and Sorooshian [1994] compared a<br />

lumped conceptual model (SCS), a distributed conceptual<br />

model (SCS with eight subcatchments, one per raingauge) and<br />

a distributed physically based model (KINEROS) for simulation<br />

of storm events. They found that with calibration, the<br />

accuracies of the two distributed models were similar. Without<br />

calibration the distributed physically based model performed<br />

better than the distributed conceptual model, and in both cases<br />

the lumped conceptual model performed poorly.<br />

Thus, as far as the test experience for distributed physically<br />

based models is concerned, both Loague and Freeze [1985] and<br />

Michaud and Sorooshian [1994] have performed tests on relatively<br />

small experimental catchments with very good data coverage.<br />

Both studies have used the models on ungauged conditions<br />

(without calibration) but in all cases under stationary<br />

climate conditions. The present paper presents results from<br />

larger catchments in Zimbabwe with ordinary data coverage<br />

and performs a sequence of rigorous tests of increasing complexity<br />

according to the hierarchical scheme outlined by Klemes<br />

[1986], involving intercomparisons between lumped conceptual<br />

and distributed physically based models.<br />

Hydrological Modeling Systems<br />

The following three modeling systems (codes) are used in<br />

the present study: a lumped conceptual rainfall-runoff modeling<br />

system (NAM), a semidistributed hydrological modeling<br />

system (WATBAL), and a distributed physically based hydrological<br />

modeling system (MIKE SHE). The NAM and MIKE<br />

SHE can be characterized as very typical of their respective<br />

classes, while the WATBAL falls in between these two standard<br />

classes. All three modeling systems are being used on a<br />

routine basis at the Danish Hydraulic Institute (DHI) in connection<br />

with consultancy and research projects.<br />

NAM<br />

NAM is a traditional hydrological modeling system of the<br />

lumped conceptual type operating by continuously accounting<br />

for the moisture contents in four mutually interrelated storages.<br />

The NAM was originally developed at the Technical<br />

University of Denmark [Nielsen and Hansen, 1973] and has<br />

been modified and extensively applied by DHI in a large number<br />

of engineering projects covering all climatic regimes of the<br />

world. Furthermore, the NAM has been transferred to more<br />

than 100 other organizations worldwide as part of DHI’s<br />

MIKE 11 generalized river modeling package. The structure of<br />

NAM is illustrated in Figure 1. The NAM has in its present<br />

version a total of 17 parameters; however, in most cases only<br />

about 10 of these are adjusted during calibration.<br />

WATBAL<br />

WATBAL was developed in the early 1980s by DHI in an<br />

attempt to enable full utilization of readily available, distributed<br />

data on land surface properties (topography, vegetation,<br />

and soil) in a physically based model that is still simple<br />

enough to allow large-scale applications within reasonable<br />

computational requirements. Here the WATBAL is briefly<br />

introduced; more detailed information has been given by<br />

Knudsen et al. [1986].<br />

WATBAL has been designed to account for the spatial and<br />

temporal variations of soil moisture. On the basis of distributed<br />

information on meteorological conditions, topography,<br />

vegetation, and soil types, the catchment area is divided into a<br />

number of hydrological response units, as illustrated in Figure<br />

2, with each unit being characterized by a different composition<br />

of the above features. These units are used to provide the<br />

spatial representation of soil moisture, while temporal variations<br />

within each unit are accounted for by means of empirical<br />

relations for the processes affecting soil moisture, using physical<br />

parameters particular to each unit.<br />
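The division into hydrological response units can be pictured as a grouping of grid cells by their combination of features. The sketch below is an assumption made for illustration only: the actual delineation procedure of WATBAL is described by Knudsen et al. [1986], and the field names and class labels used here are hypothetical.

```python
from collections import defaultdict


def build_response_units(cells):
    """Group cell ids by (meteorological zone, topography, vegetation, soil)."""
    units = defaultdict(list)
    for cell in cells:
        key = (cell["meteo_zone"], cell["topography"],
               cell["vegetation"], cell["soil"])
        units[key].append(cell["id"])
    return dict(units)


# Hypothetical cells; the class labels are placeholders, not study data.
cells = [
    {"id": 1, "meteo_zone": "A", "topography": "flat", "vegetation": "grass", "soil": "sand"},
    {"id": 2, "meteo_zone": "A", "topography": "flat", "vegetation": "grass", "soil": "sand"},
    {"id": 3, "meteo_zone": "A", "topography": "steep", "vegetation": "forest", "soil": "clay"},
    {"id": 4, "meteo_zone": "B", "topography": "flat", "vegetation": "grass", "soil": "sand"},
]
units = build_response_units(cells)
```

Each resulting unit would then carry its own set of physical parameters for the soil moisture calculations.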

For the representation of subsurface flows a simple lumped,<br />

conceptual approach is applied, using a cascade of linear reservoirs<br />

to account for the interflow and baseflow components<br />

(Figure 3). In summary, WATBAL provides a distributed physically<br />

based description of the surface processes affecting soil<br />

moisture (interception, infiltration, evapotranspiration, and<br />

percolation), while a lumped conceptual approach is used to<br />

represent subsurface flows. WATBAL has previously been<br />

used successfully for prediction of runoff from ungauged catchments<br />

[Nielsen and Bari, 1988].<br />
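A cascade of linear reservoirs can be sketched as below. Each reservoir is assumed to obey dS/dt = I - S/K with outflow Q = S/K, and an implicit Euler step is used for simplicity; the arrangement of reservoirs in WATBAL itself (Figure 3) is not reproduced, and the time constants and inflow pulse are hypothetical.

```python
def route_linear_cascade(inflow, time_constants, dt=1.0):
    """Route an inflow series through linear reservoirs connected in series.

    Returns the outflow series from the last reservoir and the final storages.
    """
    storages = [0.0] * len(time_constants)
    outflow = []
    for rate in inflow:
        q = rate
        for j, k in enumerate(time_constants):
            # Implicit Euler: S_new = S_old + dt * (q_in - S_new / K)
            storages[j] = (storages[j] + q * dt) / (1.0 + dt / k)
            q = storages[j] / k  # outflow feeds the next reservoir
        outflow.append(q)
    return outflow, storages


# Hypothetical recharge pulse routed through two reservoirs (K = 2 and 5 steps).
inflow = [10.0] + [0.0] * 29
outflow, final_storages = route_linear_cascade(inflow, [2.0, 5.0])
```

With this time stepping the water balance closes exactly: total inflow equals total outflow plus the water still held in the reservoirs.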

MIKE SHE<br />

MIKE SHE is a further development of the European Hydrological<br />

System—SHE [Abbott et al., 1986a, b]. It is a deterministic,<br />

fully distributed and physically based modeling system<br />

for describing the major flow processes of the entire land phase<br />

of the hydrological cycle. MIKE SHE solves the partial differential<br />

equations for the processes of overland and channel flow<br />

and unsaturated and saturated subsurface flow. The system is<br />

completed by a description of the processes of snow melt,<br />

interception, and evapotranspiration. The flow equations are<br />

solved numerically using finite difference methods.<br />

In the horizontal plane the catchment is discretized in a



REFSGAARD <strong>AND</strong> KNUDSEN: INTERCOMPARISON OF <strong>HYDROLOGICAL</strong> MODELS<br />

Figure 1. Structure of the NAM rainfall runoff modeling system [DHI, 1994].<br />

network of grid squares. The river system is assumed to run<br />

along the boundaries of these. Within each square the soil<br />

profile is represented by a number of computational nodes in<br />

the vertical direction, which above the groundwater table may<br />

become partly saturated. Lateral subsurface flow is only considered<br />

in the saturated part of the profile. Figure 4 illustrates<br />

the structure of the MIKE SHE. A description of the methodology<br />

and some experiences of model application to ordinary<br />

catchments have been given by Refsgaard et al. [1992] and<br />

Jain et al. [1992]. A more detailed description has been given<br />

by Refsgaard and Storm [1995].<br />

Figure 2. WATBAL representation of catchment characteristics and definition of hydrological response units [Knudsen et al., 1986].<br />

Figure 3. Principal structure of WATBAL [Knudsen et al., 1986].<br />

MIKE SHE is usually categorized as a physically based system.<br />

The characterization is, strictly speaking, correct only if it<br />

is applied on an appropriate scale. A number of scale problems<br />

arise when the MIKE SHE is used on a regional scale [Refsgaard<br />

and Storm, 1995]. In addition, if there is a considerable<br />

Figure 4. Schematic presentation of the MIKE SHE [DHI, 1993b].<br />

Figure 5. Location of the three catchments in Zimbabwe.<br />

uncertainty attached to the basic information, and if the spatial<br />

and temporal variables (such as groundwater table elevations)<br />

cannot be validated against observations, a MIKE SHE model<br />

of that particular site cannot be considered fully physically<br />

based but will degenerate towards a detailed conceptual<br />

model. In this case the calibration procedure is usually to<br />

adjust the parameters with the largest uncertainties attached,<br />

within a reasonable range.<br />

Case Study: Methodology<br />

Selected Catchments in Zimbabwe<br />

The three catchments in Zimbabwe that were selected for<br />

the model tests are Ngezi-South (1090 km²), Lundi (254 km²),<br />

and Ngezi-North (1040 km²). The locations of the catchments<br />

are shown in Figure 5.<br />

A brief data collection/field reconnaissance to Zimbabwe<br />

was arranged to obtain relevant information. Daily series of<br />

rainfall and monthly series of pan evaporation were obtained<br />

from the Department of Meteorological Services. Records of<br />

mean daily discharges as well as information on water rights<br />

were obtained from the Hydrological Branch, Ministry of Energy<br />

Water Resources and Development. Detailed information<br />

on land use was obtained through subcontracting R. Whitlow,<br />

University of Zimbabwe, to prepare land-use maps based<br />

upon 1:25,000 aerial photographs. Furthermore, 1:50,000 topographical<br />

maps were collected and digitized. Information on<br />

vegetation characteristics was obtained from Timberlake [1989]<br />

as well as from J. Timberlake and N. Nobanda, National Herbarium<br />

(personal communication, 1989); B. Campell, Department<br />

of Biological Sciences (personal communication, 1989);<br />

and G. MacLaureen, Department of Crop Science, University<br />

of Zimbabwe (personal communication, 1989). Information on<br />

soil characteristics and hydrogeology was obtained from Anderson<br />

[1989]. Finally, valuable information of various kinds was<br />

provided by R. Whitlow, Department of Geography, University<br />

of Zimbabwe (personal communication, 1989); H. Elwell,<br />

Agritex (personal communication, 1989); J. Anderson, Chemistry<br />

and Soil Research Institute, Ministry of Agriculture (personal<br />

communication, 1989); and others. A more detailed description<br />

is given in DHI [1993a].<br />

The annual catchment rainfall and runoff for the periods<br />

selected for modeling are shown in Table 1, while some of the<br />

key features for the three catchments are presented in Table 2.<br />

The rainfall and runoff figures in Table 1 show<br />

very large interannual variations. From Table 2 it<br />

appears that there are significant differences in the vegetation<br />

and soil characteristics from catchment to catchment.<br />

Model Testing Scheme<br />

The model testing scheme is illustrated in Figure 6. The<br />

testing of the involved models has been undertaken in parallel<br />

and in the following sequence.<br />

1. The SS test was based on data from Ngezi-South comprising<br />

an initial calibration of the models and a subsequent<br />

validation using data for an independent period.<br />

2. The PB test involved transfer of models to the Lundi<br />

catchment and adjustment of parameters to reflect the prevailing<br />

catchment characteristics and validation without any calibration.<br />

Table 1. Annual Rainfall and Runoff Values for the Three Zimbabwean Test Catchments<br />

Hydrological Year | Rainfall, mm/yr | Runoff, mm/yr<br />
Ngezi-South<br />
1971/1972 | 890 | 131<br />
1972/1973 | 317 | 2<br />
1973/1974 | 1290 | 349<br />
1974/1975 | 1087 | 236<br />
1975/1976 | 879 | 90<br />
1976/1977 | 872 | 116<br />
1977/1978 | 1131 | 245<br />
1978/1979 | 609 | 59<br />
Lundi<br />
1971/1972 | 920 | 89<br />
1972/1973 | 371 | 2<br />
1973/1974 | 1384 | 460<br />
1974/1975 | 1046 | 217<br />
1975/1976 | 857 | 89<br />
1981/1982 | 416 | 10<br />
1982/1983 | 528 | 7<br />
1983/1984 | 547 | 8<br />
Ngezi-North<br />
1977/1978 | 1047 | 156<br />
1978/1979 | 730 | 64<br />
1981/1982 | 430 | 12<br />
1982/1983 | 395 | 1<br />
1983/1984 | 436 | 4<br />

3. The modified proxy-basin (M-PB) test was as above, but<br />

was adjusted by allowing model calibration based on 1 year of<br />

runoff data.<br />

4. For the DSS test, model calibration was based on data<br />

from an initial calibration period, and validation was based on<br />

data from a subsequent period. The differential nature of this<br />

test is justified by the fact that the later independent period<br />

includes three successive years (1981/1982–1983/1984) with<br />

markedly lower rainfall than in most other years and hence<br />

represents a nonstationary climate scenario.<br />

5. The PB-DSS test involved transferring the models to the<br />

Ngezi-North catchment, adjusting the parameters to represent<br />

the catchment characteristics, and validating them by runoff<br />

simulation over a nonstationary climate period.<br />

6. The modified proxy-basin differential split-sample (M-PB-DSS)<br />

test was as above, though it allowed models to be<br />

calibrated using a short-term (1 year) record.<br />
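The six tests can be summarized in a small lookup structure. The sketch below only restates what the list and the later test descriptions say about the catchment used, the calibration data allowed, and whether the validation period is nonstationary; the class name `ValidationTest` and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidationTest:
    name: str
    catchment: str
    calibration: str      # runoff data allowed for calibration
    nonstationary: bool   # True if validation covers the markedly drier years


TESTS = [
    ValidationTest("SS", "Ngezi-South", "initial period", False),
    ValidationTest("PB", "Lundi", "none", False),
    ValidationTest("M-PB", "Lundi", "1 year of runoff", False),
    ValidationTest("DSS", "Lundi", "initial (wet) period", True),
    ValidationTest("PB-DSS", "Ngezi-North", "none", True),
    ValidationTest("M-PB-DSS", "Ngezi-North", "1 year of runoff", True),
]

# Tests that treat the catchment as ungauged (no calibration allowed).
ungauged = [t.name for t in TESTS if t.calibration == "none"]
print(ungauged)
```

Laid out this way, the scheme varies two factors: how much runoff data may be used for calibration, and whether the climate of the validation period differs from that of the calibration data.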

Evaluation Criteria<br />

For measuring the performance of the models for each test,<br />

a standard set of criteria has been defined. The criteria have<br />

been designed with the sole purpose of measuring how closely<br />

the simulated series of daily flows agree with the measured<br />

series. Owing to the generalized nature of the defined model<br />

validations, it has been necessary to introduce several criteria<br />

for measuring the performance with regard to water balance,<br />

low flows, and peak flows.<br />

The standard set of performance criteria comprises a combination<br />

of the following four graphical plots and three numerical<br />

measures: (1) joint plots of the simulated and observed<br />

hydrographs; (2) scatter diagram of monthly runoffs; (3) flow<br />

duration curves; (4) scatter diagram of annual maximum discharges;<br />

(5) overall water balance; (6) the Nash-Sutcliffe coefficient<br />

(R2); and (7) an index (EI) measuring the agreement<br />

between the simulated and observed flow duration curves.<br />

The coefficient R2, introduced by Nash and Sutcliffe [1970],<br />

is computed on the basis of the sequence of observed and<br />

simulated monthly flows over the whole testing period (perfect<br />

agreement for R2 is 1):<br />

$$\mathrm{R2} = 1 - \frac{\sum_{m=1}^{M} \left(Q_m^{o} - Q_m^{s}\right)^2}{\sum_{m=1}^{M} \left(Q_m^{o} - \bar{Q}^{o}\right)^2}$$

where<br />

$M$ is the total number of months;<br />

$Q_m^{s}$ are the simulated monthly flows;<br />

$Q_m^{o}$ are the observed monthly flows;<br />

$\bar{Q}^{o}$ is the average observed monthly flow over the whole period.<br />
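The Nash-Sutcliffe coefficient is one minus the ratio of the sum of squared simulation errors to the sum of squared deviations of the observations from their mean. A minimal Python sketch of that definition follows; the function name and the sample flows are illustrative assumptions, not values from the study.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe coefficient (R2 in the text) for paired flow series."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    error_ss = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance_ss = sum((o - mean_obs) ** 2 for o in observed)
    # variance_ss is zero for a constant observed series; R2 is undefined then.
    return 1.0 - error_ss / variance_ss


# Hypothetical monthly flows (mm): a perfect simulation and a "mean only" simulation.
obs = [2.0, 4.0, 6.0]
print(nash_sutcliffe(obs, [2.0, 4.0, 6.0]))  # perfect agreement
print(nash_sutcliffe(obs, [4.0, 4.0, 4.0]))  # no better than the observed mean
```

A value of 1 indicates perfect agreement, 0 means the simulation is no better than using the observed mean, and negative values mean it is worse.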

The flow duration curve error index, EI, provides a numerical<br />

measure of the difference between the flow duration curves<br />

of simulated and observed daily flows (perfect agreement for<br />

EI is 1):<br />

$$\mathrm{EI} = 1 - \frac{\int \left| f_o(q) - f_s(q) \right| \, dq}{\int f_o(q) \, dq}$$

where $f_o(q)$ is the flow duration curve based on observed daily<br />

flows, and $f_s(q)$ is the flow duration curve based on simulated<br />

daily flows.<br />
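A discrete version of the index can be sketched as follows, assuming that EI is one minus the area between the two flow duration curves divided by the area under the observed curve, and that both series have the same length so the sorted flows can be paired rank by rank. The function name and the sample flows are hypothetical.

```python
def flow_duration_error_index(observed, simulated):
    """Discrete EI: 1 - (area between duration curves) / (area under observed curve)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    # Sorting in descending order gives each flow duration curve at equal
    # exceedance-frequency steps; timing information is deliberately discarded.
    obs_sorted = sorted(observed, reverse=True)
    sim_sorted = sorted(simulated, reverse=True)
    area_between = sum(abs(o - s) for o, s in zip(obs_sorted, sim_sorted))
    area_observed = sum(obs_sorted)
    return 1.0 - area_between / area_observed


# Hypothetical daily flows: same values in a different order give EI = 1,
# because the index compares flow distributions rather than timing.
print(flow_duration_error_index([1.0, 5.0, 3.0], [3.0, 1.0, 5.0]))
```

Because the index ignores timing, it complements R2, which is sensitive to when the flows occur.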

Table 2. Land-Use Vegetation and Soil Characteristics Estimated From Available Information and a Brief Field Visit<br />

Characteristic | Ngezi-South | Lundi | Ngezi-North<br />
Land use/vegetation (area %)<br />
Dense/closed woody vegetation | 7 | 13 | 10<br />
Open woody vegetation | 36 | 25 | 35<br />
Sparse woody vegetation | 14 | 19 | 14<br />
Grassland | 11 | 39 | 16<br />
Cropland | 29 | 3 | 19<br />
Abandoned cropland | 2 | 0 | 6<br />
Rock outcrops | 1 | 0 | 0<br />
Soil depth range, m | 0–2.5 | 0–1 | 0.5–6<br />
Saturated hydraulic conductivity in root zone soil, mm/hr | range: 1–250, average: 80 | range: 1–70, average: 60 | range: 2–100, average: 50<br />
Available water content in root zone soil, vol % | range: 10–14, average: 12 | range: 10–12, average: 11 | range: 9–29, average: 17<br />


Figure 6. Model validation test schemes.<br />

Model Construction, Calibration, and Application<br />

All models have had access to the same hydrometeorological<br />

data and catchment information at any time. Due to the nature<br />

of the different models, however, the WATBAL and SHE have<br />

been able to make more direct use of the available information<br />

than the NAM.<br />

In this respect, the NAM has disregarded the spatial variation<br />

of rainfall and used the catchment average series as input,<br />

and for the simulation of ungauged catchments, a subjective<br />

evaluation of catchment characteristics has been undertaken<br />

for estimation of the appropriate model parameters. On the<br />

other hand, the WATBAL and SHE have attempted to account<br />

for the spatial variability of rainfalls as well as information<br />

on typical storm durations to convert daily rainfall series<br />

to realistic hourly rainfalls. Furthermore, these models have<br />

directly used the available information on the spatial variation<br />

of topography and soil and vegetation types and their characteristics<br />

for model setup and estimation of appropriate model<br />

parameters.<br />

As an illustration of the differences in model complexity and<br />

the different abilities of the three modeling systems to utilize<br />

the available distributed catchment data, some key facts for the<br />

three model applications to the 1090 km² Ngezi-South catchment<br />

are given in the following three paragraphs.<br />

The NAM model considered the entire catchment as one<br />

unit, utilized only catchment areal rainfall, and initially disregarded<br />

information on soil, vegetation, and geology. Such information<br />

was subsequently used on a subjective basis for assessing<br />

likely parameter values in the PB tests on the other two<br />

catchments. During the model calibrations (when allowed) the<br />

values of the 10 parameters were assessed.<br />

The WATBAL model was established on the basis of six<br />

meteorological zones, eight soil types, and 11 vegetation types.<br />

The spatial occurrences of these three features resulted in 129<br />

hydrological response units. During the model calibrations<br />

(when allowed) parameter values reflecting root depths, soil<br />

water retention capacity, soil hydraulic conductivities, and time<br />

constants in subsurface flow routing were adjusted.<br />

The MIKE SHE also distributed the rainfall information to<br />

different inputs in six meteorological zones. Information on<br />

topography, soil, vegetation, and geology was distributed to a<br />

1-km grid. Thus MIKE SHE carried out calculations at 1090<br />

horizontal grid points. During the model calibrations (when<br />

allowed) parameter values reflecting soil depth and maximum<br />

root depths, as well as an empirical drainage time constant,<br />

were adjusted. In order to minimize the calibration work the<br />

parameter values were not varied within all 1090 grid points,<br />

but kept identical within each of the 13 land-use classes. In<br />

general, the parameters for which field data were available,<br />

such as soil water retention curves and leaf area index, were<br />

not modified during the calibration process.<br />

The present study has aimed at testing various types of<br />

general modeling systems. However, it should be emphasized<br />

that validation results are not solely dependent on the modeling<br />

system but, indeed, also depend on the hydrologist operating<br />

the model, including his or her personal interpretation of<br />

available information and subjective assessments. In the<br />

present study this element of uncertainty has been minimized<br />

to the extent possible by assigning three hydrologists,<br />

each with comprehensive experience in the application of one<br />

of the three modeling systems, and by providing each of them<br />

with the same catchment data.<br />

The calibration procedure adopted was that of “trial and<br />

error,” implying that the hydrologists made subjective adjustments<br />

of parameter values in between the calibration runs. The<br />

numerical and graphical performance criteria described above<br />

were used as important guidance for the hydrologists when<br />

deciding upon the set of parameter values which they assessed<br />

to be the optimal ones. As these decisions inevitably depend on<br />

the personal experiences and judgments of the hydrologists, it<br />

may be argued that this procedure adds an undesirable degree<br />

of subjectivity to the results. However, given the large number<br />

of performance criteria and the large number of adjustable<br />

parameters, especially in the WATBAL and MIKE SHE models,<br />

suitable and well-proven automatic parameter optimization<br />

techniques did not exist. Instead, by applying the standard<br />

calibration procedure by which the three hydrologists had comprehensive<br />

experience, the results may be seen as typical results<br />

from three different modeling systems, when using standard<br />

engineering procedures for data collection, model<br />

construction, and calibration.<br />

Results of Model Validation Test Scheme<br />

The results of the six tests outlined in Figure 6 are summarized<br />

in Figure 7, which shows the overall water balances and


Figure 7. Summary of key validation results for all tests.<br />

the R2 and EI numerical criteria. Simulated and observed<br />

hydrographs are shown in Figure 8 for two of the tests from the<br />

Lundi and Ngezi-North catchments. Annual water balances<br />

are shown for all the tests in Figures 9–15. Assessments of<br />

uncertainties in the PB predictions are shown in Figures 16 and<br />

17. Note that the different performance criteria presented in<br />

the figures focus on different aspects, such as overall annual<br />

water balances (Figures 9–17), monthly flows (R2 in Figure 7),<br />

flow pattern on a daily basis (EI in Figure 7) and hydrograph<br />

shapes (Figure 8). The results are discussed test by test in the<br />

following sections.<br />

SS Test<br />

This test is based on data from Ngezi-South and comprises<br />

an initial calibration of the models and a subsequent validation<br />

using data for an independent period. As indicated in Figures<br />

7, 9, and 10 the performances of the three models are very<br />

similar. All models are able to provide a close fit to the recorded<br />

flows for the calibration period, while for the independent<br />

validation period the performance is somewhat reduced,<br />

as expected. The reduction is, however, limited, and all models<br />

are able to maintain a very good representation of the overall<br />

water balance and the interannual and seasonal variations, as<br />

well as the general flow pattern.<br />

PB Test<br />

This test comprises a transfer of models to the Lundi catchment,<br />

adjustment of parameters to reflect the prevailing catchment<br />

characteristics, and validation without any calibration.<br />

The PB test was arranged to test the capability of the different<br />

models to represent runoff from an ungauged catchment area,<br />

and hence no calibration was allowed prior to the simulation.<br />

All models have used the experience from the Ngezi-South<br />

calibrations in combination with the available information on<br />

the particular catchment characteristics for Lundi. While the<br />

NAM model has used this information in a purely subjective<br />

manner to revise model parameters, both the WATBAL and<br />

MIKE SHE models have directly used this information for the<br />

model setup. The estimates prepared by the latter two models<br />

have, however, also been influenced by the individual modelers’<br />

subjective interpretation of the available information on<br />

soil and vegetation characteristics.<br />

In order to assess the effects of the uncertainty in parameter<br />

estimation as perceived by the individual modelers, three alternative<br />

runoff simulations were prepared, reflecting expected<br />

low, central, and high (runoff) estimates, respectively. The results<br />

of the central estimates are included in Figures 7, 8a, and<br />

11, while annual runoff figures for the assessed uncertainty<br />

intervals are shown in Figure 16.<br />

In general, all models provide an excellent representation of<br />

the general flow pattern and the overall water balance, while<br />

maintaining the significant interannual variability to a satisfactory<br />

degree. The predicted hydrographs for the rainy season of<br />

1973/1974, shown in Figure 8a, confirm that the overall hydrograph<br />

pattern is predicted quite well by all three models.<br />

The overall performance of the central estimates by the<br />

NAM and MIKE SHE models is somewhat reduced compared<br />

to validation runs for the Ngezi-South catchment, as expected<br />

when no calibration is possible. The estimates would, however,<br />

still be very valuable for all practical purposes. For the<br />

WATBAL model, the central estimate is even better than<br />

obtained for the validation period for Ngezi-South, providing<br />

for a very accurate representation of the observed runoff record.<br />

From Figure 16 it appears that the assessed uncertainty<br />

interval for the NAM predictions of annual runoff is about<br />

twice as wide as for the WATBAL and MIKE SHE predictions.<br />

M-PB Test<br />

This test is based on the same data from Lundi as the above<br />

PB test. The M-PB test was undertaken to evaluate whether<br />

better model performance could be obtained should short-term<br />

measurements be available for calibration. Hence, before<br />

the results of the previous test were revealed, 1 year (1975/1976)<br />

of runoff record was released for calibration, and the PB<br />

test repeated. The main results of this test are summarized in<br />

Figure 7, and annual water balances are shown in Figure 12.<br />

For the NAM model the short-term calibration leads to an<br />

improved performance, decreasing the deviation of the overall<br />

water balance to some 15%. At the same time, the statistics of<br />

R2 and EI confirm the good representation of monthly flows<br />

and the overall flow pattern in general.<br />

For the WATBAL model the short-term calibration introduces<br />

only a slight improvement in the overall performance.<br />

This is thought to be due to the originally very<br />

good performance, which in any case would be difficult to<br />

improve. The main benefit of the short runoff record is in this<br />

case primarily to confirm the validity of the central estimate



Figure 8. (a) Lundi (central estimates) proxy-basin (PB) test hydrographs for 1973/1974. (b) Ngezi-North<br />

(central estimates) proxy-basin differential split-sample (PB-DSS) test hydrographs for 1977/1978.<br />

and hence to reduce the uncertainty related to the final runoff<br />

estimate. In this sense the calibration has proven quite valuable<br />

and would indeed be so in any practical case.<br />

For the MIKE SHE model the calibration has not introduced<br />

any improvement in the overall performance. As compared<br />

to the best of the original estimates (i.e., the low case)<br />

the calibration has in fact caused a deterioration of the performance.<br />

This rather unfortunate incident may occur for all<br />

Figure 9. Annual water balances for the calibration part of<br />

the SS test on Ngezi-South catchment.<br />

Figure 10. Annual water balances for the validation part of<br />

the SS test on Ngezi-South catchment.


Figure 11. Annual water balances for PB test on Lundi catchment.<br />

Figure 13. Annual water balances for differential split sample<br />

(DSS) test on Lundi catchment.<br />

types of models when calibration data are not fully consistent,<br />

but it appears that the SHE type of model requires a greater<br />

reliability of input data than other, more simple types of models<br />

to avoid the pitfall of miscalibration.<br />

DSS Test<br />

This test consists of model calibrations based on data from<br />

Lundi for 4 wet years (1971/1972–1975/1976 with mean annual<br />

runoff of 171 mm) and validation on data from 3 very dry years<br />

(1981/1982–1983/1984 with mean annual runoff of 8 mm). The<br />

purpose of this test is to assess the capability of the models to<br />

do simulations under nonstationary climate conditions. A summary<br />

of the main results of the differential SS tests is given in<br />

Figure 7, and the annual water balances are shown in Figure<br />

13.<br />
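The quoted mean annual runoffs can be checked against the Lundi rows of Table 1. The short calculation below is a sketch under one assumption: the calibration mean is taken over all five Table 1 rows from 1971/1972 to 1975/1976, which is how the quoted 171 mm appears to have been obtained. The variable and function names are illustrative.

```python
# Lundi annual runoff (mm/yr) as listed in Table 1.
lundi_runoff = {
    "1971/1972": 89, "1972/1973": 2, "1973/1974": 460,
    "1974/1975": 217, "1975/1976": 89,
    "1981/1982": 10, "1982/1983": 7, "1983/1984": 8,
}

# Assumed split: all rows up to 1975/1976 for calibration, the rest for validation.
calibration_years = ["1971/1972", "1972/1973", "1973/1974", "1974/1975", "1975/1976"]
validation_years = ["1981/1982", "1982/1983", "1983/1984"]


def mean_runoff(years):
    """Mean annual runoff (mm/yr) over the given hydrological years."""
    return sum(lundi_runoff[y] for y in years) / len(years)


print(round(mean_runoff(calibration_years)))  # 171
print(round(mean_runoff(validation_years)))   # 8
```

The roughly twentyfold drop in mean runoff between the two periods is what makes this split a differential test.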

As is evident from the results, both NAM and MIKE SHE<br />

predict the water balance well. The WATBAL model, however,<br />

grossly overestimates the peaks in the relative sense,<br />

causing the simulated average runoff to be about twice that<br />

measured (15 mm compared to 8 mm). The related statistics<br />

are poorer than those in the other testing schemes, but it<br />

should be noted that even small deviations cause poor statistics<br />

when mean flows are as low as those in this case.<br />

PB-DSS Test<br />

This test is based on data from the third catchment, Ngezi-North.<br />

Without allowing for any prior calibration, all modelers<br />

were requested to prepare low, central, and high estimates of<br />

the expected series of flows for the 1977/1978–1983/1984 period.<br />

This period contained a sequence of mainly wet years<br />

(1977/1978–1980/1981) followed by 3 consecutive dry years,<br />

with rainfalls being less than half of that experienced in the<br />

former period.<br />

At the stage when the measured flow record was revealed, it<br />

was unfortunately discovered that the record for the<br />

1979/1980–1980/1981 years was erroneous and hence had to be<br />

disregarded when computing the test statistics. The results of<br />

this test are summarized in Figure 7, while the annual water<br />

Figure 12. Annual water balances for modified proxy-basin<br />

(M-PB) test on Lundi catchment.<br />

Figure 14. Annual water balances for proxy-basin differential<br />

split-sample (PB-DSS) test on Ngezi-North catchment.



Figure 17. Assessments of uncertainty interval for prediction<br />

of annual water balances in the PB-DSS test on Ngezi-North<br />

catchment.<br />

Figure 15. Annual water balances for modified proxy-basin<br />

differential split-sample (M-PB-DSS) test on Ngezi-North<br />

catchment.<br />

balances are shown in Figure 14. The assessed uncertainty<br />

intervals of the model predicted annual runoff are shown in<br />

Figure 17.<br />

From Figure 17 it appears that all models have managed to<br />

provide for a nonbiased range of estimates of the overall water<br />

balance, which for some models is quite narrow: NAM, 50%;<br />

WATBAL, 30%; and MIKE SHE, 10%. In terms of the<br />

overall water balance, the central estimates of the models<br />

agree within 25% (NAM), 5% (WATBAL), and 2% (MIKE<br />

SHE). The agreement between the recorded and simulated<br />

monthly flows and the flow duration curves, however, is less<br />

accurate for NAM and MIKE SHE than for the WATBAL<br />

model, which provides for an excellent fit in terms of these<br />

measures. The reason for the somewhat lower R2 and EI<br />

figures for the NAM model is related to its generally less<br />

accurate prediction of flows, while for the MIKE SHE model<br />

this is directly linked to the erroneous assessment of a key<br />

drainage parameter, causing the model to produce much more<br />

base flow than actually exists.<br />

Hydrographs showing measured discharge and predictions<br />

by the three models for the rainy season of 1977/1978 are<br />

presented in Figure 8b. These graphs confirm the conclusions<br />

derived from the numerical criteria, R2, and EI, namely, that<br />

Figure 16. Assessments of uncertainty interval for prediction<br />

of annual water balances in the PB test on Lundi catchment.<br />

the WATBAL reproduces the observed hydrograph very well,<br />

while the daily hydrograph for MIKE SHE reveals major errors<br />

in overall flow pattern. Note that the model which produces<br />

the best overall water balance (MIKE SHE) has at the<br />

same time the poorest fit when compared on daily values.<br />

M-PB-DSS Test<br />

This test is based on the same data from Ngezi-North as the<br />

previous PB-DSS test. Following the calibration of all models<br />

based on only 1 year of data (1977/1978), and before the results for<br />

other years were revealed, the above test was repeated. The<br />

main results of the modified test are shown in Figures 7 and 15.<br />

These results clearly demonstrate that access to only 1 year of<br />

runoff data has enabled all models to provide an excellent<br />

representation of the runoff within the entire testing period.<br />

The overall water balance agrees within 7% for all models,<br />

and despite the fact that the calibration was based on a wet<br />

year, annual flows for the dry period come within the right<br />

order of magnitude, although the relative deviation in some<br />

cases is quite significant. The high R2 and EI scores achieved<br />

by all models confirm that the representation of the monthly<br />

flow sequence and the overall flow pattern has become very<br />

good after the calibration.<br />

Discussion and Conclusions<br />

The three generalized modeling systems, NAM, WATBAL,<br />

and MIKE SHE, have been subject to a rigorous testing<br />

scheme on data from three Zimbabwean catchments. NAM is<br />

a typical representative for the lumped conceptual class of<br />

models, while MIKE SHE similarly belongs to the distributed<br />

physically based class. WATBAL falls between the two classes.<br />

However, for the specific applications in Zimbabwe, where<br />

surface water hydrological aspects have dominated, it can<br />

be argued that WATBAL can be considered as another representative<br />

of the distributed physically based class.<br />

Although establishing an objective framework for the model<br />

tests and intercomparisons has been attempted, it should be<br />

recognized that the results of a certain validation will be influenced<br />

by the specific test conditions, including the particular<br />

climate, catchment characteristics, data availability, and quality<br />

as well as subjective assessments made by the user (e.g., interpretation<br />

of available information for determining model parameters).<br />

Hence the obtained results are not only a function



of the modeling system itself, but also of the user and numerous<br />

other factors. To arrive at a firm conclusion many validations<br />

would usually be required, and the limited number of<br />

tests undertaken therefore means that conclusions from individual<br />

results should be drawn only with caution.<br />

With this caution regarding generality in mind, a number of<br />

specific conclusions may be derived from the case study. First,<br />

in view of the difficult tasks given to the models involving<br />

simulation for ungauged catchments and nonstationary time<br />

periods, the overall performance of the models is considered<br />

quite impressive. The overall water balance agrees within<br />

25% in all cases but one, and good results are achieved<br />

without balancing out excessive positive and negative deviations<br />

within individual years. In most cases the models score an<br />

R2 value of about 0.8 or greater and an EI value generally<br />

above 0.7.<br />

Secondly, the following is noted with regard to the specific<br />

types of validation tests:<br />

1. For the SS test the NAM, WATBAL, and MIKE SHE<br />

systems generally exhibit similar performance. All models are<br />

able to provide a close fit to the recorded flows for the calibration<br />

period, without severely reducing the performance<br />

during the independent validation period. Hence this test suggests<br />

that if an adequate runoff record covering a few (3–5) years<br />

exists, any of the modeling systems could be used as a reliable<br />

tool for filling in gaps in such records or used to extend runoff<br />

series based on long-term rainfall series. Considering the data<br />

requirements and efforts involved in the setup of the different<br />

models, however, a simple model of the NAM type should<br />

generally be selected for such tasks.<br />

2. For the PB tests, designed for validating the capability of<br />

the models to represent flow series of ungauged catchments, it<br />

had been expected that the physically based models would<br />

produce better results than the simple type of models. The<br />

results, however, do not provide unambiguous support for this<br />

hypothesis. All three modeling systems generated good results,<br />

with the WATBAL providing slightly more accurate results<br />

than the others. Hence for the Zimbabwean conditions the<br />

additional capabilities of the MIKE SHE, as compared to the<br />

WATBAL, namely, the distributed physically based features<br />

relating to subsurface flow, proved to be of little value in<br />

simulating the water balance. For the PB tests it is noticed that<br />

the uncertainty range represented by the low and high estimates<br />

is significantly larger for the NAM than for the WATBAL<br />

and MIKE SHE cases. This probably reflects the fact that<br />

parameter estimation for ungauged catchments is generally<br />

more uncertain for the NAM, whose parameters are semiempirical<br />

coefficients without direct links to catchment characteristics.<br />

3. A general experience of the M-PB tests is that allowing<br />

for model calibration based on only 1 year of runoff data<br />

improves the overall performance of all models. The improvement<br />

appears to be particularly significant for the NAM model,<br />

which also showed the largest uncertainties in the cases where<br />

no calibration was possible.<br />

4. For the DSS tests all models have been able to simulate<br />

flows of the right order of magnitude and correct pattern.<br />

Hence all models have proven their ability to simulate the<br />

runoff pattern in periods with much reduced rainfall and runoff<br />

as compared to the calibration period. On the basis of these<br />

results there appears no immediate justification for using an<br />

advanced type of model to represent flows following a significant<br />

change of rainfall, provided that a number of years are available<br />

for calibration purposes. It is tempting to extend this<br />

finding to suggest that the simple type of model could be used<br />

to assess the impact of climate change on water resources. It<br />

should be recognized, however, that the above results cannot fully<br />

justify such a hypothesis, since a long-term climate change<br />

would probably bring about changes in vegetation and their<br />

evaporation. This type of nonstationarity has not been adequately<br />

tested.<br />
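
The difference between the SS and DSS tests discussed above can be sketched as two ways of partitioning a record into calibration and validation years. This is only an illustration under assumptions: the function names, the 50/50 split, the rainfall threshold and the rainfall figures are hypothetical and are not taken from the study.

```python
def split_sample(years, calibration_fraction=0.5):
    """SS test: calibrate on the first part of the record, validate on the rest."""
    ordered = sorted(years)
    cut = int(len(ordered) * calibration_fraction)
    return ordered[:cut], ordered[cut:]


def differential_split_sample(annual_rainfall, threshold):
    """DSS test: separate wet years (calibration) from dry years (validation)."""
    wet = sorted(y for y, p in annual_rainfall.items() if p >= threshold)
    dry = sorted(y for y, p in annual_rainfall.items() if p < threshold)
    return wet, dry


# Hypothetical annual rainfall (mm) per year, for illustration only.
rainfall = {1971: 820, 1972: 790, 1973: 410, 1974: 880, 1975: 450, 1976: 430}
cal_years, val_years = split_sample(rainfall)
wet_years, dry_years = differential_split_sample(rainfall, threshold=600)
print(cal_years, val_years)
print(wet_years, dry_years)
```

In the SS case the two periods are simply consecutive, whereas in the DSS case the validation years differ systematically in rainfall from the calibration years, which is what makes the test relevant for nonstationary conditions.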

As far as the SS tests are concerned the above conclusion is<br />

in full agreement with results of other studies [e.g., Michaud<br />

and Sorooshian, 1994]. With regard to the PB tests the present<br />

conclusion in favor of the distributed physically based modeling<br />

systems is in agreement with, albeit more vague than, that<br />

of Michaud and Sorooshian [1994].<br />

In summary, the present study, as well as similar studies<br />

reported in literature, suggests the following conclusions with<br />

regard to rainfall runoff modeling.<br />

1. Given a few (1–3) years of runoff measurements, a<br />

lumped model of the NAM type would be a suitable tool from<br />

the point of view of technical and economical feasibility. This<br />

applies for catchments with homogeneous climatic input as<br />

well as cases where significant variations in the exogenous<br />

input are encountered.<br />

2. For ungauged catchments, however, where accurate<br />

simulations are critical for water resources decisions, a distributed<br />

model is expected to give better results than a lumped<br />

model if appropriate information on catchment characteristics<br />

can be obtained.<br />

Acknowledgments. The modeling work on the Zimbabwe catchments<br />

was carried out by our colleagues Børge Storm and Merete<br />

Styczen (MIKE SHE) and Roar Jensen (NAM), while the second<br />

author was responsible for the WATBAL work. During the data collection<br />

and field reconnaissance in Zimbabwe, kind help and assistance<br />

was provided by University of Zimbabwe; National Herbarium; and<br />

Department of Meteorological Services and Hydrological Branch,<br />

Ministry of Energy, Water Resources and Development. The study was<br />

carried out with financial support from the Danish Council of Technology,<br />

and the paper preparation was supported by the Danish Technical<br />

Research Council.<br />

References<br />

Abbott, M. B., J. C. Bathurst, J. A. Cunge, P. E. O’Connell, and J.<br />

Rasmussen, An introduction to the European Hydrological System—Systeme<br />

Hydrologique Europeen, “SHE,” 1, History and philosophy<br />

of a physically based distributed modelling system, J. Hydrol.,<br />

87, 45–59, 1986a.<br />

Abbott, M. B., J. C. Bathurst, J. A. Cunge, P. E. O’Connell, and J.<br />

Rasmussen, An introduction to the European Hydrological System—Système<br />

Hydrologique Européen “SHE,” 2, Structure of a<br />

physically based distributed modelling system, J. Hydrol., 87, 61–77,<br />

1986b.<br />

Anderson, J., Communal land physical resource inventory, Mhondoro<br />

and Ngezi, Draft Rep. A 551, Chem. and Soil Res. Inst., Minist. of<br />

Agric., Harare, Zimbabwe, 1989.<br />

Beven, K. J., Changing ideas in hydrology—The case of physically<br />

based models, J. Hydrol., 105, 157–172, 1989.<br />

Danish Hydraulic Institute (DHI), Validation of hydrological models,<br />

Phase II, Hørsholm, 1993a.<br />

Danish Hydraulic Institute (DHI), MIKE SHE WM, short description,<br />

1993b.<br />

Danish Hydraulic Institute (DHI), MIKE11 short description, 1994.<br />

Flavelle, P., A quantitative measure of model validation and its potential<br />

use for regulatory purposes, Adv. Water Resour., 15, 5–13, 1992.<br />

Franchini, M., and M. Pacciani, Comparative analysis of several conceptual<br />

rainfall-runoff models, J. Hydrol., 122, 161–219, 1991.<br />

Grayson, R. B., I. D. Moore, and T. A. McMahon, Physically based


hydrologic modeling, 2, Is the concept realistic, Water Resour. Res.,<br />

28(10), 2659–2666, 1992.<br />

Jain, S. K., B. Storm, J. C. Bathurst, J. C. Refsgaard, and R. D. Singh,<br />

Application of the SHE to catchments in India, 2, Field experiments<br />

and simulation studies with the SHE on the Kolar subbasin to the<br />

Narmada River, J. Hydrol., 140, 25–47, 1992.<br />

Klemes, V., Sensitivity of water resources systems to climate variations,<br />

WCP Rep. 98, World Meteorological Organisation, Geneva, 1985.<br />

Klemes, V., Operational testing of hydrological simulation models,<br />

Hydrol. Sci. J., 31(1), 13–24, 1986.<br />

Knudsen, J., A. Thomsen, and J. C. Refsgaard, WATBAL: A semidistributed,<br />

physically based hydrological modelling system, Nordic<br />

Hydrol., 17, 347–362, 1986.<br />

Loague, K. M., and R. A. Freeze, A comparison of rainfall-runoff<br />

modeling techniques on small upland catchments, Water Resour.<br />

Res., 21(2), 229–248, 1985.<br />

Michaud, J., and S. Sorooshian, Comparison of simple versus complex<br />

distributed runoff models on a midsized semiarid watershed, Water<br />

Resour. Res., 30(3), 593–605, 1994.<br />

Naef, F., Can we model the rainfall-runoff process today, Hydrol. Sci.<br />

Bull., 26(3), 281–289, 1981.<br />

Nash, J. E., and J. V. Sutcliffe, River flow forecasting through conceptual<br />

models, I, J. Hydrol., 10, 282–290, 1970.<br />

Nielsen, S. A., and Bari, Simulation of runoff from ungauged catchments<br />

by a semi-distributed hydrological modelling system, Proceedings,<br />

6th IAHR Congress, Int. Assoc. for Hydraul. Res., Delft, Netherlands,<br />

1988.<br />

Nielsen, S. A., and E. Hansen, Numerical simulation of the rainfall-runoff<br />

process on a daily basis, Nordic Hydrol., 4, 171–190, 1973.<br />

Refsgaard, J. C., Model and data requirements for simulation of runoff<br />

and land surface processes, in Proceedings from NATO Advanced<br />

Research Workshop “Global Environmental Change and Land Surface<br />

Processes in Hydrology: The Trials and Tribulations of Modelling and<br />

Measuring,” Tucson, May 17–21, 1993, edited by S. Sorooshian and<br />

V. K. Gupta, Springer-Verlag, New York, 1996.<br />

Refsgaard, J. C., and B. Storm, MIKE SHE, in Computer Models of<br />

Watershed Hydrology, edited by V. P. Singh, pp. 809–846, Water<br />

Resour. Publ., Littleton, Colo., 1995.<br />

Refsgaard, J. C., S. M. Seth, J. C. Bathurst, M. Erlich, B. Storm, G. H.<br />

Jørgensen, and S. Chandra, Application of the SHE to catchments in<br />

India, 1, General results, J. Hydrol., 140, 1–23, 1992.<br />

Schlesinger, S., R. E. Crosbie, R. E. Gagné, G. S. Innis, C. S. Lalwani,<br />

J. Loch, J. Sylvester, R. D. Wright, N. Kheir, and D. Bartos, Terminology<br />

for model credibility, Simulation, 32(3), 103–104, 1979.<br />

Smith, R. E., D. R. Goodrich, D. A. Woolhiser, and J. R. Simanton,<br />

Comment on “Physically based modeling, 2, Is the concept realistic”<br />

by R. B. Grayson, I. D. Moore, and T. A. McMahon, Water<br />

Resour. Res., 30(3), 851–854, 1994.<br />

Timberlake, J., Brief description of the vegetation of Mhondoro and<br />

Ngezi communal lands, Mashonaland West, Natl. Herbarium,<br />

Harare, Zimbabwe, 1989.<br />

Tsang, C.-F., The modelling process and model validation, Ground<br />

Water, 29(6), 825–831, 1991.<br />

U.S. Committee, Task Committee on Quantifying Land-Use Change<br />

Effects, Evaluation of hydrological models used to quantify major<br />

land-use change effects, J. Irrig. Drain. Eng., 111(1), 1–17, 1985.<br />

Wilcox, B. P., W. J. Rawls, D. L. Brakensiek, and J. R. Wright, Predicting<br />

runoff from rangeland catchments: A comparison of two<br />

models, Water Resour. Res., 26(10), 2401–2410, 1990.<br />

World Meteorological Organization (WMO), Intercomparison of<br />

conceptual models used in operational hydrological forecasting,<br />

WMO Oper. Hydrol. Rep. 7, WMO 429, Geneva, 1975.<br />

World Meteorological Organization (WMO), Third planning meeting<br />

on World Climate Programme Water, WCP 114, WMO/TD 106,<br />

Geneva, 1985.<br />

World Meteorological Organization (WMO), Intercomparison of<br />

models for snowmelt runoff, WMO Oper. Hydrol. Rep. 23, WMO 646,<br />

Geneva, 1986.<br />

World Meteorological Organization (WMO), Simulated real-time intercomparison<br />

of hydrological models, WMO Oper. Hydrol. Rep. 38,<br />

WMO 779, Geneva, 1992.<br />

J. Knudsen and J. C. Refsgaard, Danish Hydraulic Institute, Agern<br />

Alle 5, DK-2970 Hørsholm, Denmark.<br />

(Received September 25, 1995; revised March 15, 1996;<br />

accepted March 20, 1996.)


[7]<br />

Refsgaard JC (1997) Parametrisation, calibration and validation of distributed<br />

hydrological models.<br />

Journal of Hydrology, 198, 69-97.<br />

Reprinted from Journal of Hydrology with permission from Elsevier


[8]<br />

Refsgaard JC (1997) Validation and Intercomparison of Different Updating<br />

Procedures for Real-Time Forecasting.<br />

Nordic Hydrology, 28, 65-84.<br />

Reprinted with permission from Nordic Hydrology


[9]<br />

Refsgaard JC, Sørensen HR, Mucha I, Rodak D, Hlavaty Z, Bansky L,<br />

Klucovska J, Topolska J, Takac J, Kosc V, Enggrob HG, Engesgaard P,<br />

Jensen JK, Fiselier J, Griffioen J, Hansen S (1998) An Integrated Model for<br />

the Danubian Lowland – Methodology and Applications.<br />

Water Resources Management, 12, 433-465.<br />

Reprinted from Water Resources Management with permission from Springer<br />

(www.springerlink.com)


Water Resources Management 12: 433–465, 1998.<br />

© 1998 Kluwer Academic Publishers. Printed in the Netherlands.<br />

An Integrated Model for the Danubian Lowland –<br />

Methodology and Applications<br />

J. C. REFSGAARD 1, H. R. SØRENSEN 1, I. MUCHA 2, D. RODAK 2,<br />

Z. HLAVATY 2, L. BANSKY 2, J. KLUCOVSKA 2, J. TOPOLSKA 4, J. TAKAC 3,<br />

V. KOSC 3, H. G. ENGGROB 1, P. ENGESGAARD 5, J. K. JENSEN 5,<br />

J. FISELIER 6, J. GRIFFIOEN 7 and S. HANSEN 8<br />

1 Danish Hydraulic Institute, Denmark<br />

2 Ground Water Consulting Ltd., Bratislava, Slovakia<br />

3 Irrigation Research Institute (VUZH), Bratislava, Slovakia<br />

4 Water Research Institute (VUVH), Bratislava, Slovakia<br />

5 Water Quality Institute (VKI), Denmark<br />

6 DHV Consultants BV, The Netherlands<br />

7 Netherlands Institute of Applied Geosciences TNO, The Netherlands<br />

8 Royal Veterinary and Agricultural University, Denmark<br />

(Received: 30 December 1997; in final form: 10 November 1998)<br />

Abstract. A unique integrated modelling system has been developed and applied for environmental<br />

assessment studies in connection with the Gabcikovo hydropower scheme along the Danube.<br />

The modelling system integrates model codes for describing the reservoir (2D flow, eutrophication,<br />

sediment transport), the river and river branches (1D flow including effects of hydraulic control structures,<br />

water quality, sediment transport), the ground water (3D flow, solute transport, geochemistry),<br />

agricultural aspects (crop yield, irrigation, nitrogen leaching) and flood plain conditions (dynamics<br />

of inundation pattern, ground water and soil moisture conditions, and water quality). The uniqueness<br />

of the established modelling system is the integration between the individual model codes, each of<br />

which provides complex descriptions of the various processes. The validation tests have generally<br />

been carried out for the individual models, whereas only a few tests on the integrated model were<br />

possible. Based on discussion and examples, it is concluded that the results from the integrated model<br />

can be assumed less uncertain than outputs from the individual model components. In an example,<br />

the impacts of the Gabcikovo scheme on the ecologically unique wetlands created by the river branch<br />

system downstream of the new reservoir have been simulated. In this case, the impacts of alternative<br />

water management scenarios on ecologically important factors such as flood frequency and duration,<br />

depth of flooding, depth to ground water table, capillary rise, flow velocities, sedimentation and water<br />

quality in the river system have been explicitly calculated.<br />

Key words: Danube, environmental impacts, floodplain, Gabcikovo, groundwater, hydropower, integrated<br />

modelling, river branch.<br />

Figure 1. The Danubian Lowland with the new reservoir and the Gabcikovo scheme.<br />

1. Introduction<br />

1.1. THE DANUBIAN LOWLAND AND THE GABCIKOVO HYDROPOWER SCHEME<br />

The Danubian Lowland (Figure 1) in Slovakia and Hungary between Bratislava and<br />

Komárno is an inland delta (an alluvial fan) formed in the past by river sediments<br />

from the Danube. The entire area forms an alluvial aquifer, which throughout the year receives<br />

around 30 m³ s⁻¹ of infiltration water from the Danube in the upper parts<br />

of the area and returns it to the Danube and the drainage canals in the downstream<br />

part. The aquifer is an important water resource for municipal and agricultural<br />

water supply.<br />

Human influence has gradually changed the hydrological regime in the area.<br />

Construction of dams upstream of Bratislava together with straightening and embanking<br />

of the river for navigational and flood protection purposes as well as<br />

exploitation of river sediments have significantly deepened the river bed and lowered<br />

the water level in the river and surrounding ground water level. These changes<br />

have had a significant influence on the ground water regime as well as the sensitive<br />

riverine forests downstream of Bratislava. Despite this basically negative trend the<br />

floodplain area with its alluvial forests and associated ecosystems still represents a<br />

unique landscape of outstanding ecological importance.<br />

The Gabcikovo hydropower scheme was put into operation in 1992. A large<br />

number of hydraulic structures has been established as part of the hydropower<br />

scheme. The key structures are a system of weirs across the Danube at Cunovo<br />

15 km downstream of Bratislava, a reservoir created by the damming at Cunovo, a<br />

30 km long lined power and navigation canal, outside the floodplain area, parallel to<br />

the Danube River with intake to the hydropower plant, a hydropower plant and two


ship-locks at Gabcikovo, and an intake structure at Dobrohost, 10 km downstream<br />

of Cunovo, diverting water from the new canal to the river branch system. The<br />

entire scheme has significantly affected the hydrological regime and the ecosystem<br />

of the region, see, e.g., Mucha et al. (1997). The scheme was originally planned as<br />

a joint effort between former Czecho-Slovakia and Hungary, and the major parts of<br />

the construction were carried out as such on the basis of a 1977 international treaty.<br />

However, since 1989 Gabcikovo has been a major matter of controversy between<br />

Slovakia and Hungary, who have referred some disputed questions to international<br />

expert groups (EC, 1992, 1993a, b) and others to the International Court of Justice<br />

in The Hague (ICJ, 1997).<br />

Comprehensive monitoring and assessments of environmental impacts have been<br />

made, see Mucha (1995) for an overview. Since 1995 a joint Slovak-Hungarian<br />

monitoring program has been carried out (JAR, 1995, 1996, 1997).<br />

1.2. NEED FOR INTEGRATED MODELLING<br />

The hydrological regime in the area is very dynamic with so many crucial links<br />

and feedback mechanisms between the various parts of the surface- and subsurface<br />

water regimes that integrated modelling is required to thoroughly assess environmental<br />

impacts of the hydropower scheme. This is illustrated by the following three<br />

examples:<br />

• Ground water quality. Based on qualitative arguments it was hypothesised<br />

that the damming and creation of the reservoir might lead to changes in the<br />

oxidation-reduction state of the ground water. The reason for this is that the<br />

reservoir might increase infiltration from the Danube to the aquifer because of<br />

increased head gradients. On the other hand, fine sediment matter might accumulate<br />

on the reservoir bottom, thereby creating a reactive sediment layer. The<br />

river water infiltrating to the aquifer has to pass this layer, which might induce<br />

a change in the oxidation status of the infiltrating water. This could affect the<br />

quality of the ground water from being oxic or suboxic towards being anoxic,<br />

which is undesirable for Bratislava’s water works, most of which are located<br />

near the reservoir. Thus, the oxidation-reduction state of the groundwater is<br />

intimately linked to a balance between the rates of infiltrating reducing water<br />

and the aquifer oxidizing capacity. The infiltrating water is linked to the hydraulic<br />

behaviour of the reservoir: how large is the infiltration area and at which<br />

rates does the infiltration take place at different locations. However, without<br />

an integrated model it is not possible to quantify whether and under which<br />

conditions these mechanisms play a significant role in practice, whether they<br />

are correct in principle but without practical importance, and what measures<br />

should be realised.<br />

• Agricultural production. Changes in discharges in the Danube caused by diversion<br />

of some of the water through the power canal and creation of a reservoir<br />

Figure 2. Important processes and their interactions with regard to floodplain hydrology.<br />

would lead to changes in the ground water levels. As the agricultural crops<br />

depend on capillary rise from the shallow ground water table and irrigation, the<br />

new hydrological situation created by the damming of the Danube might influence<br />

the crop yield, the irrigation requirements and the nitrogen leaching.<br />

Traditional crop models describing the root zone are not sufficient in this case,<br />

because the lower boundary conditions (ground water levels) are changed in a<br />

way that can only be quantified if the reservoir, the river and canal system<br />

and the aquifer are also explicitly included in the modelling.<br />

• Floodplain ecosystem. The flora and fauna, which in the floodplain area are<br />

dominated by the river side branches, depend on many factors such as flooding<br />

dynamics, flow velocities, depth of ground water table, soil moisture, water<br />

quality and sediments. Also in this case the important factors depend on the<br />

interaction between the groundwater and the surface water systems (illustrated<br />

in Figure 2), and even on water quality and sediments in the surface water<br />

system, so that quantitative impact assessments require an integrated modelling<br />

approach.<br />

2. Integrated Modelling System<br />

2.1. INDIVIDUAL MODEL COMPONENTS<br />

An integrated modelling system (Figure 3) has been established by combining the<br />

following existing and well proven model codes:<br />

• MIKE SHE (Refsgaard and Storm, 1995) which, on a catchment scale, can<br />

simulate the major flow and transport processes in the hydrological cycle:<br />

– 1-D flow and transport in the unsaturated zone


Figure 3. Structure of the integrated modelling system with indication of the interactions<br />

between the individual models.<br />

– 3-D flow and transport in the ground water zone<br />

– 2-D flow and transport on the ground surface<br />

– 1-D flow and transport in the river.<br />

All of the above processes are fully coupled, allowing for feedbacks and interactions<br />

between components. In addition, MIKE SHE includes modules for<br />

multi-component geochemical and biodegradation reactions in the saturated<br />

zone (Engesgaard, 1996).<br />

• MIKE 11 (Havnø et al., 1995) is a one-dimensional river modelling system.<br />

MIKE 11 is used for simulating hydraulics, sediment transport and morphology,<br />

and water quality. MIKE 11 is based on the complete dynamic wave<br />

formulation of the Saint Venant equations. The modules for sediment transport<br />

and morphology are able to deal with cohesive and noncohesive sediment<br />

transport, as well as the accompanying morphological changes of the river bed.<br />

The noncohesive model operates on a number of different grain sizes.<br />

• MIKE 21 (DHI, 1995), which has the same basic characteristics as MIKE 11,<br />

extended to two horizontal dimensions, and is used for reservoir modelling.<br />

• MIKE 11 and MIKE 21 include River/Reservoir Water Quality (WQ) and<br />

Eutrophication (EU) (Havnø et al., 1995; VKI, 1995) modules to describe oxygen,<br />

ammonium, nitrate and phosphorus concentrations and oxygen demands<br />

as well as eutrophication issues such as bio-mass production and degradation.<br />

• DAISY (Hansen et al., 1991) is a one-dimensional root zone model for simulation<br />

of soil water dynamics, crop growth and nitrogen dynamics for various<br />

agricultural management practices and strategies.<br />

2.2. INTEGRATION OF MODEL COMPONENTS<br />

The integrated modelling system is formed by the exchange of data and feedbacks<br />

between the individual modelling systems. The structure of the integrated<br />

modelling system and the exchange of data between the various modelling systems<br />

are illustrated in general in Figure 3, and the steps in the integrated modelling are<br />

described further in Section 6.2 and illustrated in Figure 10 for the case of flood<br />

plain modelling. The interfaces between the various models indicated in Figure 3<br />

are<br />

A) MIKE SHE forms the core of the integrated modelling system having interfaces<br />

to all the individual modelling systems. The coupling of MIKE SHE and<br />

MIKE 11 is a fully dynamic coupling where data is exchanged within each<br />

computational time step, see Section 2.3 below.<br />

B) Results of eutrophication simulations with MIKE 21 in the reservoir are used<br />

to estimate the concentration of various water quality parameters in the water<br />

that enters the Danube downstream of the reservoir. This information serves as<br />

boundary conditions for water quality simulations for the Danube using MIKE<br />

11.<br />

C) Sediment transport simulations in the reservoir with MIKE 21 provide information<br />

on the amount of fine sediment on the bottom of the reservoir. The<br />

simulated grain size distribution and sediment layer thickness are used to calculate<br />

leakage coefficients, which are used in ground water modelling with MIKE<br />

SHE to calculate the exchange of water between the reservoir and the aquifer.<br />

D) The DAISY model simulates vegetation parameters which are used in MIKE<br />

SHE to simulate the actual evapotranspiration. Ground water levels simulated<br />

with MIKE SHE act as lower boundary conditions for DAISY unsaturated zone<br />

simulations. Consequently, this process is iterative and requires several model<br />

simulations.<br />

E) Results from water quality simulations with MIKE 11 and MIKE 21 provide<br />

estimates of the concentration of various components/parameters in the water<br />

that infiltrates to the aquifer from the Danube and the reservoir. This can be<br />

used in the ground water quality simulations (geochemistry) with MIKE SHE.<br />

A general discussion on the limitations in the above couplings is given in Section 7<br />

below.<br />
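
The iterative exchange described under interface D can be sketched as a simple fixed-point loop. This is a minimal illustration under stated assumptions: the two toy functions merely stand in for full DAISY and MIKE SHE runs, and all names, coefficients and the convergence tolerance are hypothetical rather than taken from either code.

```python
def iterate_coupling(run_root_zone, run_groundwater, initial_depth,
                     tol=0.01, max_iter=20):
    """Alternate two model runs until the exchanged water table depth stabilises."""
    depth = initial_depth
    for iteration in range(1, max_iter + 1):
        evapotranspiration = run_root_zone(depth)        # stand-in for a DAISY run
        new_depth = run_groundwater(evapotranspiration)  # stand-in for a MIKE SHE run
        if abs(new_depth - depth) < tol:
            return new_depth, iteration
        depth = new_depth
    raise RuntimeError("coupling did not converge")


# Toy stand-ins: a deeper water table gives less evapotranspiration,
# and less evapotranspiration gives a shallower water table.
def toy_root_zone(depth_m):
    return 500.0 - 50.0 * depth_m   # seasonal evapotranspiration (mm)


def toy_groundwater(et_mm):
    return 1.0 + 0.002 * et_mm      # depth to ground water table (m)


depth, n_runs = iterate_coupling(toy_root_zone, toy_groundwater, initial_depth=3.0)
print(round(depth, 2), n_runs)
```

Each pass through the loop corresponds to one pair of model simulations, which is why the text notes that this interface requires several runs.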

2.3. A COUPLING OF MIKE SHE AND MIKE 11<br />

The focus in MIKE SHE lies on catchment processes with a comparatively less<br />

advanced description of river processes. In contrast, MIKE 11 has a more advanced<br />

description of river processes and a simpler catchment description than MIKE<br />

SHE. Hence, for cases where full emphasis is needed for both river and catchment<br />

processes a coupling of the two modelling systems is required.


Figure 4. Principles of the coupling between the MIKE SHE catchment code and the MIKE<br />

11 river code.<br />

A full coupling between MIKE SHE and MIKE 11 has been developed (Figure<br />

4). In the combined modelling system, the simulation takes place simultaneously<br />

in MIKE 11 and MIKE SHE, and data transfer between the two models<br />

takes place through shared memory. MIKE 11 calculates water levels in rivers<br />

and floodplains. The calculated water levels are transferred to MIKE SHE, where<br />

flood depth and areal extent are mapped by comparing the calculated water levels<br />

with surface topographic information stored in MIKE SHE. Subsequently, MIKE<br />

SHE calculates water fluxes in the remaining part of the hydrological cycle. Exchange<br />

of water between MIKE 11 and MIKE SHE may occur due to evaporation<br />

from surface water, infiltration, overland flow or river-aquifer exchange. Finally,<br />

water fluxes calculated with MIKE SHE are exchanged with MIKE 11 through<br />

source/sink terms in the continuity part of the Saint Venant equations in MIKE 11.<br />
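
The sequence within one coupled time step, as described in this paragraph, can be sketched as follows. It is a toy illustration only: the functions, the capped infiltration rule and the numbers are assumptions made for the example and do not represent the MIKE SHE or MIKE 11 interfaces, and fluxes are expressed as depths per unit cell area for simplicity.

```python
def map_flood_depth(river_level, ground_elevation):
    """Flood depth per grid cell: water level minus topography, never negative."""
    return [max(river_level - z, 0.0) for z in ground_elevation]


def exchange_flux(flood_depth, infiltration_rate, dt):
    """Water infiltrating in each cell during one step (m), capped by the ponded depth."""
    return [min(d, infiltration_rate * dt) for d in flood_depth]


def coupled_step(river_level, ground_elevation, infiltration_rate, dt):
    # 1. The river code supplies a water level; the catchment code maps flooding.
    depth = map_flood_depth(river_level, ground_elevation)
    # 2. The catchment code computes fluxes for the rest of the hydrological cycle
    #    (only infiltration is represented here).
    flux = exchange_flux(depth, infiltration_rate, dt)
    # 3. The total is handed back to the river code as a sink term.
    sink = sum(flux)
    return depth, sink


# Hypothetical values: water level and ground elevations in m, rate in m per day.
depth, sink = coupled_step(river_level=101.0,
                           ground_elevation=[100.5, 100.9, 101.4],
                           infiltration_rate=0.05, dt=1.0)
print(depth, sink)
```

The third cell lies above the water level and therefore stays dry, which mirrors how the flooded extent follows from comparing water levels with the stored topography.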

The MIKE SHE–MIKE 11 coupling is crucial for a correct description of the<br />

dynamics of the river-aquifer interaction. Firstly, the river width is larger than<br />

one MIKE SHE grid, in which case the MIKE SHE river-aquifer description is<br />

no longer valid. Secondly, the river/reservoir system comprises a large number of<br />

hydraulic structures, the operation of which is accurately modelled in MIKE 11,<br />

but cannot be accounted for in MIKE SHE. Thirdly, the very complex river branch<br />

system with loops and flood cells needs a very efficient hydrodynamic formulation<br />

such as in MIKE 11.<br />

2.4. COMPARISON TO OTHER MODELLING SYSTEMS REPORTED IN<br />

LITERATURE<br />

Yan and Smith (1994) described the demand and outlined a concept for a fully<br />

integrated ground water–surface water modelling system including descriptions of<br />

hydraulic structures and agricultural irrigation as a decision support tool for water<br />

resources management in South Florida. Typical examples of integrated codes<br />

described in the literature are Menetti (1995) and Koncsos et al. (1995).<br />

In a review of recent advances in understanding the interaction of groundwater<br />

and surface water Winter (1995) mainly describes groundwater codes, such as<br />

MODFLOW, which have been expanded with some, but very limited, surface water<br />

simulation capabilities. The research activities are characterized as ‘... although<br />

studies of these systems have increased in recent years, this effort is minimal compared<br />

to what is needed’. Winter (1995) sees the prospects for the future as follows:<br />

‘Future studies of the interaction of groundwater and surface water would benefit<br />

from, and indeed should emphasise, interdisciplinary approaches. Physical hydrologists,<br />

geochemists, and biologists have a great deal to learn from each other, and<br />

contribute to each other, from joint studies of the interface between groundwater<br />

and surface water.’<br />

Integrated three-dimensional descriptions of flow, transport and geochemical<br />

processes are still rarely seen for groundwater modelling of large basins. Thus,<br />

according to a recent review of basin-scale hydrogeological modelling (Person<br />

et al., 1996) most of the existing reactive transport model codes are based on<br />

one-dimensional descriptions.<br />

While many model codes contain a distributed physically-based representation<br />

of one of the three main components: ground water, unsaturated zone, and surface<br />

water systems, only a few codes provide a fully integrated description of all<br />

these three main components. For example in an up-to-date book (Singh, 1995)<br />

presenting descriptions of 25 hydrological codes only three codes, SHE/SHESED<br />

(Bathurst et al., 1995), IHDM (Calver and Wood, 1995) and MIKE SHE (Refsgaard<br />

and Storm, 1995) provide such integrated descriptions. Among these three<br />

codes only MIKE SHE has capabilities for modelling advection-dispersion and<br />

water quality. None of the three codes contained options for computations of hydraulic<br />

structures in river systems, nor agricultural modelling such as crop yield<br />

and nitrogen leaching.<br />

The individual components of the integrated modelling system presented in this<br />

paper, we believe, represent state-of-the-art within their respective disciplines. The<br />

uniqueness is the full integration.<br />

3. Methodology for Model Construction, Calibration, Validation and<br />

Application<br />

The terminology and methodology used in the following is based on the concepts<br />

outlined in Refsgaard (1997).


AN INTEGRATED MODEL FOR THE DANUBIAN LOWLAND 441<br />

3.1. MODEL CONSTRUCTION<br />

All of the applied models are based on distributed physically-based model codes.<br />

This implies that most of the required input data and model parameters can ideally<br />

be measured directly in nature.<br />

3.2. MODEL CALIBRATION<br />

The calibration of a physically-based model implies that simulation runs are carried<br />

out and model results are compared with measured data. The adopted calibration<br />

procedure was based on ‘trial and error’ implying that the model user in between<br />

calibration runs made subjective adjustments of parameter values within physically<br />

realistic limits. The most important guidance for the model user in this process was<br />

the graphical display of model results against measured values. It may be argued that<br />

such a manual procedure adds a degree of subjectivity to the results. However, given<br />

the very complex and integrated modelling focusing on a variety of output results<br />

and containing a large number of adjustable parameters, automatic parameter optimisation<br />

is not yet possible, and ‘trial and error’ remains the only feasible<br />

method in practice.<br />

3.3. MODEL VALIDATION<br />

Good model results during a calibration process cannot automatically ensure that<br />

the model can perform equally well for other time periods, because the<br />

calibration process involves some manipulation of parameter values. Therefore,<br />

model validations based on independent data sets are required. To the extent possible,<br />

limited by data availability, the models have been validated by demonstrating<br />

the ability to reproduce measured data for a period outside the calibration period,<br />

using a so-called split-sample test (Klemes, 1986). For some of the models, the<br />

model was even calibrated on pre-dam conditions and validated on post-dam conditions,<br />

where the flow regime at some locations was significantly altered due to<br />

the construction of the reservoir and related hydraulic structures and canals.<br />
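The split-sample idea described above can be sketched in a few lines. This is a minimal illustration and not the procedure used in the study: the function names, the choice of RMSE as the performance measure, and the ground water level values are all hypothetical.

```python
import math

def rmse(simulated, observed):
    """Root-mean-square error between two equally long series."""
    if len(simulated) != len(observed) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(s - o) ** 2 for s, o in zip(simulated, observed)]
    return math.sqrt(sum(squared) / len(squared))

def split_sample(records, split_year):
    """Split (year, simulated, observed) records at split_year.

    Records before split_year form the calibration period; the remaining
    records form the independent validation period.
    """
    calibration = [r for r in records if r[0] < split_year]
    validation = [r for r in records if r[0] >= split_year]
    return calibration, validation

def period_rmse(records):
    """RMSE over one period of (year, simulated, observed) records."""
    return rmse([r[1] for r in records], [r[2] for r in records])

# Hypothetical ground water levels (m a.s.l.), for illustration only.
records = [
    (1989, 128.10, 128.00),
    (1990, 128.40, 128.50),
    (1991, 128.20, 128.30),
    (1993, 129.30, 129.00),
    (1994, 129.10, 129.40),
]
calibration, validation = split_sample(records, split_year=1992)
print(f"calibration RMSE: {period_rmse(calibration):.2f} m")
print(f"validation RMSE:  {period_rmse(validation):.2f} m")
```

A validation error clearly larger than the calibration error, as in these made-up numbers, is the kind of signal the test is designed to expose.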

3.4. MODEL APPLICATION<br />

The validated models have finally been used, as an integrated system, in a scenario<br />

approach to assess the environmental impacts of alternative water management<br />

options. The uncertainties of the model predictions have been assessed through<br />

sensitivity analyses.<br />

442 J. C. REFSGAARD ET AL.<br />

4. Selected Results from Model Construction, Calibration and Validation of<br />

Individual Components<br />

Comprehensive data collection and processing as well as model calibration and<br />

validation were carried out (DHI et al., 1995). In the following sections a few<br />

selected results are presented for the individual components. Further aspects of<br />

model validation focusing on integrated aspects are discussed in Section 5.<br />

4.1. RIVER AND RESERVOIR FLOW MODELLING<br />

The following models have been constructed, calibrated and validated:<br />

• one-dimensional MIKE 11 model for the Danube from Bratislava to Komarno,<br />

• one-dimensional MIKE 11 model for the river branch system at the Slovak<br />

floodplain, and<br />

• two-dimensional MIKE 21 model for the reservoir.<br />

The MIKE 11 models have been established in two versions reflecting post- and<br />

pre-dam conditions, respectively.<br />

4.1.1. MIKE 11 River Model for the Danube<br />

The MIKE 11 model for the Danube is based on river cross-sections measured in<br />

1989 and 1991. The applied boundary conditions were measured daily discharges<br />

at Bratislava (upstream) and a discharge rating curve at Komarno (downstream).<br />

The model was initially calibrated for two steady state situations reflecting a low<br />

flow situation (905 m³ s⁻¹) and a flow situation close to the long term average<br />

(2390 m³ s⁻¹), respectively. Subsequently, the model was calibrated in a nonsteady<br />

state against daily water level and discharge measurements from 1991. The model<br />

was finally validated by demonstrating the ability to reproduce measured daily<br />

water level data from 1990. Calibration and validation results are presented in<br />

Topolska and Klucovska (1995). For the post-dam model some river reaches were<br />

updated with cross-sections measured in 1993. In addition, the reservoir and related<br />

hydraulic structures and canals were included. As the conditions after damming<br />

of the Danube have changed significantly, re-calibration of the post-dam model<br />

was carried out for the period April 1993–July 1993. Subsequently, the model was<br />

validated against measured data from the period November 1992–March 1993.<br />

4.1.2. MIKE 11 Model for the River Branch System<br />

The Danubian floodplain is a forest area of major ecological interest characterised<br />

by a complex system of river branches. A layout of the river branch system is shown<br />

in Figure 5. The cross-sections in the river branch system were measured during the<br />

1960s and 1970s. The pre-dam model was calibrated against water level and flow<br />

data from the 1965 flood. In the post-dam situation, the branch system is fed by an


Figure 5. Layout of the river branch system on the Slovakian side of the Danube.<br />

inlet structure with water from the power canal. The system consists of a number<br />

of compartments (cascades) separated by small dikes. On each of these dikes combined<br />

structures of culverts and spillways are located enabling some control of the<br />

water levels and flows in the system. Results of the model calibration against data<br />

measured during the summer 1994 are shown in Klucovska and Topolska (1995).<br />

Finally, the model was validated by demonstrating the ability to reproduce water<br />

levels measured during the summer of 1993. Some of these results are presented in<br />

Sørensen et al. (1996).<br />

4.1.3. MIKE 21 Reservoir Model<br />

A MIKE 21 hydrodynamic model for the reservoir was established based on a<br />

reservoir bathymetry measured in 1994. The spatial resolution of the finite difference<br />

model is 100 × 50 m. The model was calibrated against flow velocities<br />

measured in the reservoir in the autumn of 1994.<br />

4.2. GROUND WATER FLOW MODELLING<br />

Ground water modelling has been carried out at three different spatial scales:<br />

• A regional ground water model for pre-dam conditions (3000 km², 500 m<br />

horizontal grid, 5 vertical layers).<br />

• A regional ground water model for post-dam conditions (3000 km², 500 m<br />

horizontal grid, 5 vertical layers).<br />

• A local ground water model for an area surrounding the reservoir for both pre- and<br />

post-dam conditions (200 km², 250 m horizontal grid, 7 vertical layers).<br />

• A local ground water model for the river branch system for both pre- and post-dam<br />

conditions (50 km², 100 m horizontal grid, 2 vertical layers).<br />

• A cross-sectional (vertical profile) model near Kalinkovo at the left side of the<br />

reservoir (2 km long, 10 m horizontal grid, 24 vertical layers).<br />
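To give a feel for the relative size of these model set-ups, the sketch below computes nominal cell counts from the areas, grid sizes and layer numbers listed above. It assumes that the whole stated area is covered by active square cells, which the text does not state, so the numbers are upper-bound estimates only. The function name and the dictionary labels are chosen here for illustration.

```python
def nominal_cells(area_km2, grid_m, layers):
    """Nominal cell count: horizontal cells per layer times number of layers."""
    cells_per_layer = round(area_km2 * 1_000_000 / (grid_m * grid_m))
    return cells_per_layer * layers

# (area in km2, horizontal grid size in m, vertical layers), as listed above.
models = {
    "regional": (3000, 500, 5),
    "local, reservoir": (200, 250, 7),
    "local, river branch system": (50, 100, 2),
}

for name, (area_km2, grid_m, layers) in models.items():
    print(f"{name}: {nominal_cells(area_km2, grid_m, layers)} cells")

# The cross-sectional model is a vertical profile:
# 2 km long with a 10 m horizontal grid and 24 vertical layers.
cross_section_cells = round(2000 / 10) * 24
print(f"cross-section: {cross_section_cells} cells")
```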

The regional and local ground water models all use the coupled version of<br />

MIKE SHE and MIKE 11 and hence include modelling of evapotranspiration and<br />

snowmelt processes, river flow, unsaturated flow and ground water flow. The cross-sectional<br />

model only includes ground water processes.<br />

4.2.1. Model Construction<br />

Comprehensive input data were available and used in the construction of the models.<br />

In general, the regional and the local models are based on the same data with<br />

the main difference being that the local models provide finer resolutions and less<br />

averaging of measured input data. The two regional models, reflecting pre- and<br />

post-dam conditions, are basically the same. The only difference is that the post-dam<br />

model includes the reservoir and related hydraulic structures and seepage<br />

canals.<br />

The models are based on information on location of river systems and cross-sectional<br />

river geometry, surface topography, land use and cropping pattern, soil<br />

physical properties and hydrogeology. In addition, time series of daily precipitation,<br />

potential evapotranspiration and temperature as well as discharge inflow at<br />

Bratislava have been used. Comprehensive geological data exist from this area, see<br />

e.g., Mucha (1992) and Mucha (1993). The aquifer, ranging in thickness from about<br />

10 m at Bratislava to about 450 m at Gabcikovo, consists of Danube river sediments<br />

(sand and gravel) of late Tertiary and mainly Quaternary age. The present model is<br />

based on the work of Mucha et al. (1992a, b).<br />

4.2.2. Model Calibration<br />

The ground water model was calibrated against selected measured time series of<br />

ground water levels. The following parameters were subject to calibration: specific<br />

yield in the upper aquifer layer, leakage coefficients for the river bed and hydraulic<br />

conductivities for the aquifer layers. The soil physical characteristics for the unsaturated<br />

zone have been adopted directly from the unsaturated zone/agricultural<br />

modelling.<br />

The river model that has been used in the ground water modelling is identical<br />

to the MIKE 11 river model of the Danube, which was successfully validated independently<br />

as a ‘stand alone model’ (Subsection 4.1, above). When coupling MIKE<br />

SHE and MIKE 11, water is exchanged between the two models. The amount of<br />

water that recharges the aquifer in the upstream part and re-enters the river further<br />

downstream is in the order of 10–60 m³ s⁻¹, depending on the Danube discharge<br />

and on the actual ground water level. The recharge is typically two orders of magnitude<br />

less than the Danube discharge, and hence, a re-calibration of the MIKE<br />

11 river model is not required. As the major part of the ground water recharge<br />

originates from infiltration through the river bed, the leakage coefficient for the<br />

river bed becomes very important. Limited field information was available on this<br />

parameter, and hence, it was assumed spatially constant and through calibration<br />

assessed to be 5 × 10⁻⁵ s⁻¹ for the Danube and Vah rivers and 5 × 10⁻⁶ s⁻¹ for<br />

the Little Danube. These values are in good agreement with previous modelling<br />

experiences (Mucha et al., 1992b).<br />
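The statement that the river-aquifer exchange is roughly two orders of magnitude smaller than the Danube discharge can be checked with simple arithmetic. The sketch below pairs the lower exchange value with the low-flow discharge and the upper value with the near-average discharge quoted in Subsection 4.1.1. That pairing is an assumption made here for illustration and is not given in the text.

```python
def exchange_fraction(exchange_m3_s, river_discharge_m3_s):
    """River-aquifer exchange expressed as a fraction of the river discharge."""
    if river_discharge_m3_s <= 0:
        raise ValueError("river discharge must be positive")
    return exchange_m3_s / river_discharge_m3_s

# Assumed pairing: 10 m3/s exchange at low flow, 60 m3/s near average flow.
low = exchange_fraction(10.0, 905.0)
average = exchange_fraction(60.0, 2390.0)
print(f"low flow: {low:.1%}, near-average flow: {average:.1%}")
# prints "low flow: 1.1%, near-average flow: 2.5%"
```

Under this pairing the exchange stays at a few percent of the river flow, which is consistent with the argument that the river model does not need re-calibration after coupling.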

When keeping the specific yield and the leakage coefficients for the river bed<br />

fixed, the main calibration parameters were the hydraulic conductivities of the saturated<br />

zone. About 300 time series of ground water level observations were available<br />

for the model area, typically in terms of 30–40 yr of weekly observations. The<br />

calibration was carried out on the basis of about 80 of these series for the period<br />

1986–1990. In the parameter adjustments the overall spatial pattern described in the<br />

geological model was maintained. Some of the calibration results are illustrated<br />

in Figure 6 showing observed Danube discharge data together with simulated and<br />

measured ground water levels for three wells located at different distances from the<br />

Danube. Wells 694 and 740 are seen to react relatively quickly to fluctuations in<br />

river discharge as compared to well 7221, which is located further away from the<br />

river. This illustrates how the dynamics of the Danube propagate and are damped<br />

in the aquifer.<br />

4.2.3. Model Validation<br />

The calibrated ground water model was validated by demonstrating the ability to<br />

reproduce measured ground water tables after damming of the Danube. In this<br />

regard the only model modification is the inclusion of the reservoir and related<br />

structures and canals. Due to the nonstationarity of the hydrological regime such<br />

a validation test, which according to Klemes (1986) is denoted a differential split-sample<br />

test, is a demanding test. Figure 7 shows the simulated and observed ground<br />

water levels for the same three observation wells as shown for the calibration period<br />

in Figure 6. The effects of the damming of the Danube in October 1992, when the<br />

new reservoir was established, are clearly seen in terms of increased ground water<br />

levels and reduced ground water dynamics when comparing the two figures. These<br />

features are well captured by the model.<br />

4.3. GROUND WATER QUALITY<br />

A geochemical field investigation was carried out in a cross-section north of the<br />

reservoir near Kalinkovo as a basis for identifying the key geochemical processes<br />

and estimating parameter values (see Mucha, 1995). Eleven multi-screen wells<br />

were installed close to the water supply wells at Kalinkovo forming a 7.5 km long<br />

cross-section parallel to the regional ground water flow direction. The multi-screen<br />

wells have been sampled frequently to investigate the ongoing bio-geochemical<br />

processes during infiltration of the Danube river water into the aquifer.<br />

A ground water quality model was established for the Kalinkovo cross-sectional<br />

profile based on all the measured field data. This model includes a comprehensive<br />

description of the bio-geochemical processes such as kinetically controlled<br />

denitrification and equilibrium controlled inorganic chemistry based on the well<br />

known PHREEQE code. More details are given in Griffioen et al. (1995) and<br />

Figure 6. Danube discharge at Bratislava together with simulated and observed ground water<br />

levels for three wells before the damming of the Danube (calibration period).


Figure 7. Simulated and observed ground water levels for three wells after damming of the<br />

Danube (validation period).<br />

Engesgaard (1996). The transport part of the Kalinkovo cross-section has been<br />

calibrated against ¹⁸O isotope data. The parameters describing reactive processes<br />

have been assessed and adjusted on the basis of the detailed field measurements<br />

in the Kalinkovo cross-sectional profile. It was shown that the geochemical model<br />

behaves qualitatively correctly (Engesgaard, 1996).<br />

4.4. UNSATURATED ZONE AND AGRICULTURAL MODELLING<br />

Modelling of the pre-dam and post-dam conditions of agricultural potential and<br />

nitrate leaching risk was carried out using a representative selection of soil units,<br />

cropping pattern and meteorological data covering the area between Danube and<br />

Maly Danube (Figure 1). The DAISY model uses time-varying ground water levels<br />

(simulated with the regional MIKE SHE ground water model) as lower boundary<br />

condition for the unsaturated flow simulations. Cropping pattern and fertiliser<br />

application are included in the model based on measurements and statistical data.<br />

The model was calibrated on the basis of data from field experiments carried<br />

out during the years 1981–1987 at the experimental station in Most near Bratislava.<br />

During this process the crop parameters used in the model were adjusted to Slovak<br />

conditions. After the initial model construction and calibration, the model performance<br />

was evaluated through preliminary simulations using data from a number of<br />

plots located on an experimental field site at Lehnice in the middle of the project<br />

area. On the basis of comparisons between measured and simulated values of<br />

nitrogen uptake, dry matter yield and nitrate concentrations in soil moisture, the<br />

model performance under Slovak conditions was considered satisfactory (DHI et<br />

al., 1995).<br />

4.5. RIVER AND RESERVOIR SEDIMENT TRANSPORT MODELLING<br />

4.5.1. Danube River Sediment Transport<br />

A one-dimensional morphological model was established for the Danube. Because the<br />

model operates with cross-sectionally averaged parameters representing the river<br />

reach between every computational point (i.e. approximately 500 m), a special<br />

technique for comparing ‘real’ and simulated state variables was required. Therefore,<br />

the changes in mean water level over a decade rather than changes in bed<br />

elevations were compared between observations and simulations. For this purpose<br />

the changes in the so-called ‘Low Regulation and Navigable Water Level’<br />

(LR-NWL) were used. LR-NWL is specified by the Danube Commission as the water<br />

level corresponding to Q94% which is approximately 980 m³ s⁻¹. By using such<br />

an approach, perturbations in bed levels from one cross-section to another did not<br />

destroy the picture of the overall trends in aggradation and degradation of the river<br />

bed. The results of the calibration (1974–84) and validation runs (1984–90) are<br />

described in Topolska and Klucovska (1995).<br />
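Reading Q94% as the discharge that is equalled or exceeded 94% of the time (an assumption, since the Danube Commission's exact procedure is not given here), such a low-flow value can be extracted from a discharge record as sketched below. The function name, the simple ranking rule and the sample discharges are hypothetical.

```python
import math

def exceedance_discharge(discharges, exceedance_percent):
    """Discharge equalled or exceeded for the given percentage of the record.

    Uses a simple empirical rule: sort in descending order and take the
    value whose rank corresponds to the required exceedance share.
    """
    if not discharges:
        raise ValueError("discharge series is empty")
    if not 0 < exceedance_percent <= 100:
        raise ValueError("exceedance_percent must be in (0, 100]")
    ranked = sorted(discharges, reverse=True)
    # 1-based rank of the smallest value that still meets the exceedance share.
    rank = max(1, math.ceil(exceedance_percent * len(ranked) / 100))
    return ranked[rank - 1]

# Hypothetical discharge record (m3/s): 100, 200, ..., 2000.
sample = list(range(100, 2100, 100))
q94 = exceedance_discharge(sample, 94)
print(f"Q94% of the sample record: {q94} m3/s")
```

With a real daily record the water level corresponding to this discharge would then be taken from the rating curve or from the river model. Comparing that level between decades gives the trend measure used in the text.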

4.5.2. Sediment Transport in the River Branch System<br />

A one-dimensional fine sediment model was constructed for the river branch system<br />

in order to have a tool for quantitative evaluation of the possible sedimentation<br />

in the river branch system for alternative water management options. The upstream<br />

boundary condition for the model was provided in terms of concentration of suspended<br />

sediments simulated by the reservoir model. As virtually no field data on<br />

sedimentation in the river branch system were available, neither calibration nor validation<br />

was possible. Instead, experience-based model parameter values from other<br />

similar studies, as reported in the literature, were used.<br />

4.5.3. Reservoir Sediment Model<br />

A two-dimensional fine graded sediment model was constructed for the reservoir.<br />

The suspended sediment input was imposed as a boundary condition in Bratislava<br />

with time series of sediment concentrations of six suspended sediment fractions<br />

with their own grain sizes and fall velocities. The fall velocity for each of the six<br />

fractions was assessed according to field measurements. No further model calibration<br />

was carried out. The only field data available for validation were a few bed


sediment samples from summer 1994 with data on sedimentation thickness and<br />

grain size analyses (Holobrada et al., 1994). A comparison of model results and<br />

field data indicated that a reservoir sedimentation of the right order of magnitude<br />

was simulated. The simulated reservoir sedimentation corresponded to 42% of the<br />

total suspended load at Bratislava.<br />

4.6. SURFACE WATER QUALITY MODELLING<br />

4.6.1. Danube River Model<br />

A BOD-DO model (MIKE 11 WQ) has been used to describe the water quality<br />

in the main stream of the Danube between Bratislava and Komarno. This model<br />

describes oxygen concentration (DO) as a function of the decay of organic matter<br />

(BOD), transformation of nitrogen components, re-aeration, oxygen consumption<br />

by the bottom and oxygen production and respiration by living organisms.<br />

As the conditions from pre-dam to post-dam have changed significantly, separate<br />

calibrations and validations were carried out. The pre-dam model was calibrated<br />

against data from October 1991 and validated against data from April and August/September<br />

1991. The post-dam model was calibrated against data from May<br />

1993 and validated against data from June 1993.<br />

4.6.2. Model for the River Branch System<br />

The water quality in the river branches was simulated with a eutrophication model<br />

(MIKE 11 EU), in which the algae production is the driving force. The algae<br />

growth in this model is described as a function of incoming light, transparency<br />

of the water, temperature, sedimentation and growth rate of the algae and of the<br />

available inorganic nutrients. The calibration was carried out on the basis of the few<br />

data available during the period June–August 1993. Due to lack of further data no<br />

independent model validation was possible and hence, the uncertainties related to<br />

applying the model for making quantitative predictions of the effects of alternative<br />

water management schemes may be considerable.<br />

4.6.3. Reservoir Model<br />

In the reservoir the driving force is also the algae growth and hence, a eutrophication<br />

model (MIKE 21 EU) was applied. The reservoir model was calibrated<br />

against measured data from August 1994. This field programme was substantial<br />

and resulted in much more data than available for the river branch system. Good<br />

correspondence between simulated and observed values was achieved during the<br />

calibration period. However, no further data have been available for independent<br />

validation tests.<br />

5. Validation of Integrated Model<br />

The model calibration and validation have basically been carried out for the individual<br />

models using separate domain data for river system, aquifer system, etc.<br />

Rigorous validation tests of the integrated model were generally not possible due<br />

to lack of specific and simultaneous data on the processes describing the various<br />

couplings. Furthermore, although reasonably good assessments of uncertainties<br />

of the individual model predictions could be made, it was not obvious how such<br />

uncertainty would propagate in the integrated model.<br />

It can be argued that uncertainties in output from one model would in principle<br />

influence the uncertainties in other components of the integrated modelling system,<br />

thus adding to the total uncertainty of the integrated model. Following this line of<br />

argument would lead to the conclusion that the uncertainty of predictions by the<br />

integrated model would be larger than the corresponding uncertainty of predictions<br />

made by traditional individual models. On the other hand it can also be argued<br />

that in the integrated modelling approach the uncertainties in the crucial boundary<br />

conditions are reduced, because assumptions needed for executing individual<br />

models are substituted by model simulations based on data from neighbouring<br />

domains, which, if properly calibrated and validated, better represent the boundary<br />

effects. This would lead to the conclusion that the uncertainties in predictions by<br />

the integrated model would be smaller than those of the individual models.<br />

In the present study, no theoretical analyses have been made of this problem.<br />

Instead, a few validation tests have been made for cases where the couplings could<br />

indirectly be checked by testing the performance of the integrated model against<br />

independent data. In the following, results from one of these validation tests for the<br />

integrated model are shown.<br />

The river-aquifer interaction changed significantly, when the reservoir was established.<br />

An important model parameter describing this interaction is the leakage<br />

coefficient, which was calibrated on the basis of ground water level data for the pre-dam<br />

situation (Subsection 4.2). For the post-dam situation the MIKE 21 reservoir<br />

model calculates the thickness and grain sizes of the sedimentation at all points<br />

in the reservoir. Using the Carman-Kozeny formula, the leakage factors are<br />

recalculated for the area that is now covered by the reservoir. The model results<br />

were then checked against ground water level observations from wells near the<br />

reservoir, and it was found that a calibration factor of 10 had to be applied to the<br />

Carman-Kozeny formula. This can theoretically be justified by the fact that the<br />

sediments are stratified or layered due to variations in flow velocities during the<br />

sedimentation process. The same formula and the same calibration factor were also<br />

used for converting all texture data from aquifer sediment samples to hydraulic<br />

conductivity values in the model.<br />
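As an illustration of the conversion from sediment texture to leakage, the sketch below uses one common textbook form of the Carman-Kozeny (Kozeny-Carman) relation, K = (ρg/μ) · n³/(1 − n)² · d²/180, and assumes that the leakage coefficient equals the hydraulic conductivity divided by the thickness of the sediment layer. Neither the specific form nor that assumption is taken from the paper, and the grain size, porosity and thickness values are hypothetical. The paper also does not state whether the calibration factor of 10 multiplies or divides the formula, so it is exposed here as a plain multiplier.

```python
RHO_WATER = 1000.0   # water density, kg m^-3
GRAVITY = 9.81       # gravitational acceleration, m s^-2
MU_WATER = 1.0e-3    # dynamic viscosity of water near 20 degrees C, Pa s

def kozeny_carman_conductivity(grain_diameter_m, porosity, calibration_factor=1.0):
    """Hydraulic conductivity (m/s) from a representative grain diameter and porosity."""
    if grain_diameter_m <= 0.0:
        raise ValueError("grain diameter must be positive")
    if not 0.0 < porosity < 1.0:
        raise ValueError("porosity must be between 0 and 1")
    # Intrinsic permeability (m^2) for the textbook form with the constant 180.
    permeability = (grain_diameter_m ** 2 / 180.0) * porosity ** 3 / (1.0 - porosity) ** 2
    return calibration_factor * RHO_WATER * GRAVITY / MU_WATER * permeability

def leakage_coefficient(conductivity_m_s, thickness_m):
    """Leakage coefficient (s^-1), assumed here to be conductivity / layer thickness."""
    if thickness_m <= 0.0:
        raise ValueError("thickness must be positive")
    return conductivity_m_s / thickness_m

# Hypothetical fine sediment layer: 0.02 mm grains, porosity 0.4, 0.1 m thick.
k = kozeny_carman_conductivity(2.0e-5, 0.4)
leakage = leakage_coefficient(k, 0.1)
print(f"K = {k:.2e} m/s, leakage coefficient = {leakage:.2e} 1/s")
```

In this form a thicker or finer sediment layer lowers the leakage coefficient, which is the mechanism by which reservoir sedimentation feeds into the ground water simulation.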

Now, how can the validity of the integrated model be tested? The ground water<br />

level observations from a few wells have been used to assess the leakage calibration<br />

factor, so although the model output was subsequently checked against data from


Figure 8. Measured and simulated discharges in seepage canals. The data are from a particular<br />

day in May 1995 and are given in m³ s⁻¹.<br />

many more wells, it may be argued that this in itself is not sufficient for a true model<br />

validation. Consider instead a comparison of simulated and measured discharges<br />

in the so-called seepage canals, which are small canals constructed a few hundred<br />

meters away from the reservoir with the aim of intercepting part of the infiltration<br />

through the bottom of the reservoir. In Figure 8 it can be seen that the model<br />

simulations match the measured data remarkably well at different locations along<br />

the seepage canals. Thus, at the two stations most downstream on both seepage<br />

canals (stations 2809 and 3214) the agreements between model predictions and<br />

field data are within 5%. This is a powerful test, because the discharge data have<br />

not been used at all in the calibration process, and because it integrates the effects of<br />

reservoir sedimentation, calculation of leakage factors and geological parameters.<br />

6. Model Application – Case Study of River Branch System<br />

6.1. HYDROLOGY OF RIVER BRANCH SYSTEM<br />

The hydrology of the river branch system is highly complex with many processes<br />

influencing the water characteristics of importance for flora and fauna (Figure 2).<br />

These processes are highly interrelated and dynamic with large variations in time<br />

and space. The complexity of the floodplain, with its river branch system, is indicated<br />

in Figures 5 and 9 for the 20 km reach downstream of the reservoir on the<br />

Slovakian side, where alluvial forest occurs. Before the damming of the Danube<br />

Figure 9. Plan and perspective view of the surface topography, of the river branches and the<br />

related flood plains as represented in a model network of 100 m grid squares.<br />

in 1992 the river branches were connected with the Danube during periods with<br />

discharge above average. However, some of the branches were only active during<br />

flood situations a few days per year. It was anticipated that after the damming,<br />

the water level in the Danube would decrease significantly. Therefore, in order to<br />

prevent water from draining from the river branches to the Danube, resulting in totally<br />

dry river branches, the outflows from the branches into the Danube have been<br />

blocked except for the downstream one at chainage 1820 rkm (Figure 5). Now, the<br />

river branch system receives water from an inlet structure in the hydropower canal<br />

at Dobrohost (Figure 5). This weir has a design capacity of 234 m³ s⁻¹. Together<br />

with the various hydraulic structures in the river branches, it controls the hydraulic,<br />

hydrological and ecological regime in the river branches and on the flood plains.


Figure 10. Steps in integrated model for floodplain hydrology.<br />

6.2. MODELLING APPROACH<br />

Comprehensive field studies and modelling analyses are often carried out in connection<br />

with assessing environmental impacts of hydropower schemes. Recent examples<br />

from the Danube include the studies of the Austrian schemes Altenwörth<br />

(Nachtnebel, 1989) and Freudenau (Perspektiven, 1989). However, as in the Austrian<br />

cases, the modelling studies have most often been limited to independent<br />

modelling of river systems, groundwater systems or other subsystems, without<br />

providing an integrated approach such as the one presented in this paper.<br />

The models in this study were applied in a scenario approach simulating the<br />

hydrological conditions resulting from alternative possible operations of the entire<br />

system of hydraulic structures (alternative water management regimes). Thus, one<br />

historical (pre-dam) regime and three hypothetical (post-dam) water regimes<br />

corresponding to alternative operation schemes for the structures of the Gabcikovo<br />

system were simulated (DHI et al., 1995). Due to the integration of the overall<br />

modelling system, each scenario simulation involves a sequence, sometimes in an<br />

iterative mode, of model calculations. For the case of river branch modelling a<br />

hierarchical scheme of simulation runs (Figure 10) included the following major<br />

steps:<br />

Step 1. Hydraulic river modelling (MIKE 11)<br />

Model simulation: The MIKE 11 model simulates the river flows and water<br />

levels in the entire river system and river branches.<br />

Coupling: The model outputs, in terms of flows into the reservoir at the upstream<br />

end and downstream outflows through the reservoir structures are used<br />

as boundary conditions for the reservoir modelling (Step 2). Furthermore, the<br />

flow velocities and water levels are used in the river water quality simulations<br />

(Step 4a).<br />

Step 2. Reservoir modelling (MIKE 21)<br />

Model simulation: The MIKE 21 reservoir model simulates velocities, sedimentation<br />

and eutrophication/water quality in the reservoir.<br />

Coupling: The flow boundary conditions are generated by the river model<br />

(Step 1). Results on sedimentation are used to calculate leakage coefficients.<br />

Results on oxygen, nitrogen and carbon can be used as boundary conditions for<br />

river water quality and for the water quality of infiltrating water (Step 3a).<br />

Step 3a. Regional ground water flow (MIKE SHE/MIKE 11)<br />

Model simulation: The coupled MIKE SHE/MIKE 11 model simulates the<br />

ground water flow and levels including the interaction with the river system<br />

and the reservoir.<br />

Coupling: In the reservoir, the infiltration is simulated on the basis of leakage<br />

coefficients, which have been calculated from the amount and composition<br />

(grain sizes) of the sedimentation on the reservoir bottom (Step 2). This link<br />

between reservoir sedimentation and ground water was shown to be crucial<br />

for the model results. Furthermore, an iterative link to the DAISY agricultural<br />

model exists (Step 3b). Hence, spatially and temporally varying ground<br />

water levels from MIKE SHE/MIKE 11 are used as lower boundary conditions<br />

in DAISY, which in turn simulates the leaf area index and the root zone<br />

depth which are used as input time series data in MIKE SHE/MIKE 11. The<br />

model outputs, in terms of ground water flow velocities, are used as input<br />

to the ground water quality simulation. The model results, in terms of river<br />

flow velocities and water levels, ground water flow velocities and water levels,<br />

are used as time varying boundary conditions for the local flood plain model<br />

(Step 4b).



Step 3b. Root zone (DAISY)<br />

Model simulation: The DAISY model simulates the unsaturated zone flows and<br />

the vegetation development, including crop yield.<br />

Coupling: DAISY has an iterative link to the MIKE SHE/MIKE 11 model<br />

(as described above under Step 3a).<br />

Step 4a. River branches water quality (MIKE 11)<br />

Model simulation: The MIKE 11 model simulates the river water quality (BOD,<br />

DO, COD, NO3, etc).<br />

Coupling: The model uses data from Step 2 and Step 4b and produces output<br />

on concentrations of COD and DO, which are used as input to the ecological<br />

assessments (Step 5).<br />

Step 4b. Flood plain model (MIKE SHE/MIKE 11)<br />

Model simulation: The coupled MIKE SHE/MIKE 11 model simulates all the<br />

flow processes in the flood plain area including water flows and storages on<br />

the ground surface, river flows and water levels, ground water flows and water<br />

levels, evapotranspiration, soil moisture content in the unsaturated zone and<br />

capillary rise.<br />

Coupling: The model uses data from Step 3a as boundary conditions and provides<br />

river flow velocities as the basis for the water quality and sediment<br />

simulations (Steps 4a and c). The model provides data on flood frequency and<br />

duration, depth of flooding, depth to ground water table, moisture content in the<br />

unsaturated zone and flow velocities in river branches, which are key figures in<br />

the subsequent ecological assessments (Step 5).<br />

Step 4c. River branches sedimentation (MIKE 11)<br />

Model simulation: The MIKE 11 model simulates the transport of fine sediments<br />

through the river branch system. As a result the sedimentation/erosion<br />

and the suspended sediment concentrations are simulated.<br />

Coupling: The model uses sediment concentrations simulated by the reservoir<br />

model (Step 2) as input. Furthermore, the flow velocities simulated by the local<br />

flood plain model (Step 4b) are used as the basis for the sediment calculations.<br />

The results, in terms of grain size of the river bed and concentrations of<br />

suspended material, are used as input to the ecological assessments (Step 5).<br />

Step 5. Ecology<br />

A correlation matrix between the physical/chemical parameters provided by<br />

the model simulations (Steps 4a, b and c) and the aquatic and terrestrial ecotopes<br />

has been established for the project area. Alternative water management<br />

regimes can be described in terms of specific operation of certain hydraulic<br />

structures and corresponding distribution of water discharges primarily between<br />

the Danube, the Gabcikovo hydropower scheme and the river branch<br />


system. The hydrological effects of such alternative operations can be simulated<br />

by the integrated model and subsequently, the ecological impacts can be<br />

assessed in terms of likely changes of ecotopes.<br />
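The couplings listed in Steps 1 to 5 above can be read as a dependency graph that fixes the order of the model runs. The following is a minimal sketch, assuming each step is treated as a single run and that the iteratively linked Steps 3a and 3b are lumped into one node; the node names and the `run_order` helper are illustrative and not part of the modelling system.<br />

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key is a modelling step; the set lists the steps whose output it needs,
# as read from the "Coupling" descriptions. Steps 3a and 3b are linked
# iteratively, so they are lumped into one node here (an assumption).
DEPENDS_ON = {
    "1 river hydraulics": set(),
    "2 reservoir": {"1 river hydraulics"},
    "3a+3b ground water / root zone": {"2 reservoir"},
    "4b flood plain": {"3a+3b ground water / root zone"},
    "4a river water quality": {"1 river hydraulics", "2 reservoir", "4b flood plain"},
    "4c river sedimentation": {"2 reservoir", "4b flood plain"},
    "5 ecology": {"4a river water quality", "4b flood plain", "4c river sedimentation"},
}


def run_order(depends_on):
    """Return one valid sequential order in which the steps can be run."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for step in run_order(DEPENDS_ON):
        print(step)
```

Steps 4a and 4c do not depend on each other, so either may run first once Step 4b has finished.<br />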

6.3. THE FLOODPLAIN MODEL<br />

The extent of the floodplain model area is indicated in Figure 5 and a perspective<br />

view of the area with the river branch system and floodplains is shown in Figure 9.<br />

The horizontal discretization of the finite difference model is 100 m, and the ground<br />

water zone is represented by two layers. Several hundred cross-sections and<br />

more than 50 hydraulic structures in the river branch system were included in the<br />

MIKE 11 model for the river system.<br />

For the pre-dam model, the surface water boundary conditions comprise a discharge<br />

time series at Bratislava and a discharge rating curve at the downstream end<br />

(Komarno). For the post-dam model, the Bratislava discharge time series has been<br />

divided into three discharge boundary conditions, namely at Dobrohost (intake<br />

from hydropower canal to river branch system), at the inlet to the hydropower<br />

canal and at the inlet to Danube from the reservoir. For the groundwater system,<br />

time varying ground water levels simulated with the regional ground water models<br />

act as boundary conditions. The Danube river forms an important natural boundary<br />

for the area. The Danube is included in the model, located on the model boundary,<br />

and symmetric ground water flow is assumed below the river. Hence, a zero-flux<br />

boundary condition is used for ground water flow below the river.<br />

To illustrate the complex hydrology, and in particular the interaction between<br />

the surface and subsurface processes, results from a model simulation for a<br />

period in June–July 1993 are shown in Figures 11 and 12.<br />

Figure 11 presents the inlet discharges at the upstream point of the river branch<br />

system (Dobrohost), while the discharges and water levels at the confluence between<br />

the Danube and the hydropower outlet canal downstream of Gabcikovo<br />

during the same period are shown in Figure 12. Figure 11 further shows the soil<br />

moisture conditions for the upper two m below terrain and the water depth on the<br />

surface at location 2. Similar information is shown for location 1 in Figure 12. A<br />

soil water content above 0.40 (40 vol.%) corresponds to saturation. Location 2 is<br />

situated in the upstream part of the river branch system, while location 1 is located<br />

in the downstream part (see Figure 9).<br />

At location 2 (Figure 11) flooding is seen to occur as a result of river spilling<br />

(surface inundation occurs before the ground water table rises to the surface) whenever<br />

the inlet discharge exceeds approximately 60 m³ s⁻¹. The soil moisture content<br />

is seen to react relatively fast to the flooding and the soil column becomes<br />

saturated. In contrast, full saturation and inundation do not occur in connection<br />

with the flood in the Danube in July, but the event is recognised through increasing<br />

ground water levels following the temporal pattern of the Danube flood.



Figure 11. Observed inlet discharge to the river branch system at Dobrohost; simulated moisture<br />

contents in the upper two m of the soil profile at location 2 and simulated depths of<br />

inundation at location 2 during June–July 1993.<br />

At location 1 (Figure 12) the conditions are somewhat different. During the<br />

simulation period location 1 never becomes inundated due to high inlet flows at<br />

Dobrohost. However, during the July flood in the Danube, inundation at location 1<br />

occurs as a result of a raised ground water table caused by higher water levels in<br />

river branches due to backwater effects from the Danube. The surface elevation at<br />

location 1 is 116.4 m which is 0.4 m below the flood water level shown in Figure 12<br />

at the confluence (5 km downstream of location 1). It is noticed that the inundation<br />

at this location occurs as a result of ground water table rise and not due to spilling<br />

of the river (surface inundation occurs after the ground water table has reached<br />

ground surface).<br />
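The distinction drawn above between the two inundation mechanisms rests on the order of events: whether water appears on the surface before or after the ground water table reaches the surface. A minimal sketch of that test follows, assuming two aligned simulated time series per location; the function name and the sample values are hypothetical and are not taken from the model output.<br />

```python
def inundation_mechanism(ponding_depth_m, depth_to_water_table_m):
    """Classify the first inundation event in two aligned time series.

    ponding_depth_m: water depth on the ground surface at each time step.
    depth_to_water_table_m: depth from terrain to the ground water table
        at each time step (0 means the table has reached the surface).
    """
    for ponding, gw_depth in zip(ponding_depth_m, depth_to_water_table_m):
        if ponding > 0.0:
            # Water table already at the surface when ponding starts:
            # inundation caused by ground water table rise.
            if gw_depth <= 0.0:
                return "ground water rise"
            # Ponding while the water table is still below terrain:
            # inundation caused by spilling from the river.
            return "river spilling"
    return "no inundation"


if __name__ == "__main__":
    # Hypothetical series resembling the behaviour described for location 2.
    print(inundation_mechanism([0.0, 0.2, 0.3], [1.5, 1.0, 0.0]))
    # Hypothetical series resembling the behaviour described for location 1.
    print(inundation_mechanism([0.0, 0.0, 0.1], [0.8, 0.0, 0.0]))
```

With coarse output time steps both events may fall in the same step, in which case this sketch reports ground water rise; a finer output interval would be needed to separate them.<br />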

6.4. EXAMPLE OF MODEL RESULTS<br />

As an example of the results which can be obtained by the floodplain model, Figure<br />

13 shows a characterisation of the area according to flooding and depths to<br />

groundwater. The map has been processed on the basis of simulations for 1988 for<br />

pre-dam conditions. The classes with different ground water depths and flooding<br />


Figure 12. Simulated discharge and water levels in the Danube at the confluence between<br />

Danube and the outlet canal from the hydropower plant; simulated moisture contents in the<br />

upper two m of the soil profile at location 1 and simulated depths of inundation at location<br />

1 in the river branch system during June–July 1993.<br />

have been determined from ecological considerations according to requirements<br />

of (semi)terrestrial (floodplain) ecotopes. From the figure the contacts between the<br />

main Danube river and the river branch system are clearly seen. Similar computations<br />

have been made for alternative water management schemes after damming of<br />

the Danube. The results of one of the hypothetical post-dam water management<br />

regimes, characterized by average water flows in the power canal, Danube and<br />

river branch system intake of 1470 m³ s⁻¹, 400 m³ s⁻¹ and 45 m³ s⁻¹, respectively,<br />

are shown in Figure 14. By comparing Figure 13 and Figure 14 the differences<br />

in hydrological conditions can clearly be seen. For instance the pre-dam conditions<br />

(Figure 13) are in many places characterised by high groundwater tables



Figure 13. Hydrological regime in the river branch area for 1988 pre-dam conditions<br />

characterized in ecological classes.<br />

and small/seldom flooding, while the post-dam situation (Figure 14) generally has<br />

deeper ground water tables and more frequent flooding. From such changes in hydrological<br />

conditions inferences can be made on possible changes in the floodplain<br />

ecosystem.<br />

Further scenarios (not shown here) have, amongst others, investigated the<br />

effects of establishing underwater weirs in the Danube and in this way improving<br />

the connectivity between the Danube and the river branch system.<br />

7. Limitations in the Couplings made in the Integrated Model<br />

The integrated modelling system and the way it was applied includes different<br />

degrees of integration ranging from sequential runs, where results from one model<br />

are used as input to the next model, to a full integration, such as the coupling<br />

between MIKE SHE and MIKE 11. Hence, the system is not truly integrated in<br />

all respects. The justification for these different levels lies in assessments of where<br />

it was required in the present project area to account for feed back mechanisms<br />

and where such feed backs could be considered to be of minor importance for all<br />

practical purposes. For other areas with different hydrological characteristics, the<br />

required levels of integration are not necessarily the same. Therefore, a discussion<br />


Figure 14. Hydrological regime in the river branch area for a post-dam water management<br />

regime characterized in ecological classes. The scenario has been simulated using<br />

1988 observed upstream discharge data and a given hypothetical operation of the hydraulic<br />

structures.<br />

is given below on the universality and limitations of the various couplings made in<br />

the present case.<br />

A. Hydrological catchment/river hydraulics (MIKE SHE/MIKE 11)<br />

This coupling between the hydrological code and the river hydraulic code is fully<br />

dynamic and fully integrated with feed back mechanisms between the two codes<br />

within the same computational time step. This coupling cannot be treated sequentially<br />

in this area, since the feedback between river and aquifer works in both<br />

directions, with the river functioning as a source in part of the area and as a drain<br />

in other parts, and since the direction of the stream-aquifer interaction changes<br />

dynamically in time and space as a consequence of discharge fluctuations in the<br />

Danube. This coupling was shown to be crucial during the course of the project,<br />

and, due to the full integration, it is fully generic.<br />

B. Reservoir/river (MIKE 21/MIKE 11)<br />

This coupling is a simple one-way coupling with the reservoir model providing<br />

input data to the downstream river model, both in terms of sediment and water



quality parameters. This coupling is sufficient in the present case, because there is<br />

no feedback from the downstream river to the reservoir. Even though this coupling<br />

is not fully generic, it may be sufficient in most cases, even in cases with a network<br />

of reservoirs and connecting river reaches.<br />

C. Reservoir/groundwater water exchange (MIKE 21/MIKE SHE)<br />

This coupling is a simple one-way coupling with the reservoir model providing<br />

data on sedimentation to the groundwater module of MIKE SHE, where they are<br />

used to calculate leakage coefficients in the surface water/ground water flow calculations.<br />

This coupling is sufficient in the present case, where the reservoir water<br />

table always is higher than the ground water table, and where the flow always is<br />

from the reservoir to the aquifer. However, for cases where water flows in both<br />

directions, or where there are significant temporal variations in the sedimentation,<br />

the present coupling is not necessarily sufficient.<br />

D. Hydrology catchment/crop growth (MIKE SHE/DAISY)<br />

This coupling is an iterative coupling with data flowing in both directions. However,<br />

it is not a full integration with the two model codes running simultaneously.<br />

Therefore, a number of iterations are required until the input data used in MIKE<br />

SHE (vegetation data simulated by DAISY) are consistent with the input data used in DAISY<br />

(ground water levels), and vice versa. For example, changes in river water levels<br />

affect the ground water levels, implying that the crop growth conditions change and<br />

hence, the DAISY simulated vegetation data used by MIKE SHE to simulate the<br />

ground water levels are not correct. In such a case, the MIKE SHE simulation has to<br />

be repeated with the new crop growth data and subsequently, the DAISY simulation<br />

has to be repeated with the new ground water levels, etc., until the differences<br />

become negligible. This coupling has been used successfully in previous studies<br />

(Styczen and Storm, 1993), but may, due to the iterative mode, be troublesome in<br />

practice.<br />
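The iterative procedure described for coupling D amounts to a fixed-point iteration between the two codes. A minimal sketch follows, assuming each model run can be reduced to a function of a single number; in practice the exchanged data are time series and spatial fields. The two toy functions are hypothetical stand-ins and do not represent MIKE SHE or DAISY.<br />

```python
def iterate_coupling(run_groundwater, run_crop, vegetation0, tol=1e-3, max_iter=50):
    """Alternate the two model runs until the exchanged vegetation data settle.

    run_groundwater: maps vegetation data to ground water levels.
    run_crop: maps ground water levels to vegetation data.
    Returns (ground water level, vegetation, number of iterations).
    """
    vegetation = vegetation0
    for iteration in range(1, max_iter + 1):
        gw_level = run_groundwater(vegetation)   # hydrological run
        new_vegetation = run_crop(gw_level)      # crop growth run
        if abs(new_vegetation - vegetation) < tol:
            return gw_level, new_vegetation, iteration
        vegetation = new_vegetation
    raise RuntimeError("coupling did not converge within max_iter iterations")


# Hypothetical toy stand-ins, chosen only so that the iteration converges.
def toy_groundwater(leaf_area_index):
    return 2.0 + 0.1 * leaf_area_index   # depth to water table, m


def toy_crop(depth_to_water_table):
    return 4.0 - 0.5 * depth_to_water_table   # leaf area index


if __name__ == "__main__":
    gw, lai, n = iterate_coupling(toy_groundwater, toy_crop, vegetation0=0.0)
    print(f"converged after {n} iterations: depth {gw:.3f} m, LAI {lai:.3f}")
```

Each pass through the loop corresponds to one full simulation with each code, which is why the iterative mode can be laborious when the feedback is strong.<br />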

E. Surface water/ground water quality (MIKE 11 – MIKE 21/MIKE SHE)<br />

In contrast to the full coupling of flows (coupling A), the corresponding water<br />

quality coupling is a simple one-way coupling with the river and reservoir models<br />

providing the water quality parameters of the infiltrating water, which are used as<br />

boundary conditions for the ground water quality simulations. This coupling is<br />

sufficient in the present case with respect to the reservoir, where the flow always<br />

is from the reservoir to the aquifer. The river-aquifer interaction involves flows in<br />

both directions, but the return flow from the aquifer to the Danube is very small<br />

(about 1%) as compared to the Danube flow, and hence, the feedback from the<br />

ground water quality to Danube water quality is assumed negligible. However, for<br />

other cases where the mass flux from the aquifer to the river system is important<br />

for the river water quality, the present one-way coupling will not be sufficient.<br />
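The argument for neglecting the feedback in coupling E can be checked with a simple flow-weighted mixing calculation. The sketch below assumes complete mixing of the return flow into the river; the discharge and concentration values are hypothetical and serve only to show the arithmetic for a return flow of 1% of the river flow.<br />

```python
def mixed_concentration(q_river, c_river, q_return, c_return):
    """Flow-weighted concentration after the return flow mixes fully into the river."""
    return (q_river * c_river + q_return * c_return) / (q_river + q_return)


if __name__ == "__main__":
    q_river = 2000.0            # m3/s, hypothetical river discharge
    c_river = 10.0              # mg/l, hypothetical river concentration
    q_return = 0.01 * q_river   # return flow of about 1% of the river flow
    c_return = 50.0             # mg/l, hypothetical ground water concentration

    c_mixed = mixed_concentration(q_river, c_river, q_return, c_return)
    change = 100.0 * (c_mixed - c_river) / c_river
    print(f"mixed concentration {c_mixed:.2f} mg/l, change {change:.1f}%")
```

Even with the aquifer concentration set five times higher than the river concentration, the mixed value changes by only a few percent, which illustrates why the one-way coupling can be adequate when the return flow is this small.<br />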


8. Discussion and Conclusions<br />

The hydrological and ecological system of the Danubian Lowland is so complex<br />

with so many interactions between the surface and the subsurface water regimes<br />

and between physical, chemical and biological changes, that an integrated numerical<br />

modelling system of the distributed physically-based type is required in order<br />

to provide quantitative assessments of environmental impacts on the ground water,<br />

the surface water and the floodplain ecosystem of alternative management options<br />

for the Gabcikovo hydropower scheme.<br />

Such an integrated modelling system has been developed, and an integrated<br />

model has been constructed, calibrated and, to the extent possible, validated for<br />

the 3000 km² area. The individual components of the modelling system represent<br />

state-of-the-art techniques within their respective disciplines. The uniqueness is the<br />

full integration. The integrated system enables a quite detailed level of modelling,<br />

including quantitative predictions of the surface and ground water regimes in the<br />

floodplain area, ground water levels and dynamics, ground water quality, crop<br />

yield and nitrogen leaching from agricultural land, sedimentation and erosion in<br />

rivers and reservoirs, surface water quality as well as frequency, magnitude and<br />

duration of inundations in floodplain areas. The computations were carried out on<br />

Hewlett Packard Apollo 9000/735 UNIX workstations with 132 MB RAM. With<br />

a 300 MHz Pentium II NT computer, a typical computational time for one of<br />

the steps described in Section 6.2 (Figure 10) would be 2–10 hr. Thus, although<br />

the integrated system is rather computationally demanding, the computational requirements<br />

are not a serious constraint in practice as compared to the demand for<br />

comprehensive field data.<br />

For most of the individual model components, traditional split-sample validation<br />

tests have been carried out, thus documenting the predictive capabilities of<br />

these models. However, this was not possible for some aspects of the integrated<br />

model. Hence, according to rigorous scientific modelling protocols, the integrated<br />

model can be argued to have a rather limited predictive capability associated with<br />

large uncertainties. A theoretical analysis of error propagation in such an integrated<br />

model would be quite interesting, but was outside the scope of the present study<br />

which was limited to the comprehensive task of developing the integrated modelling<br />

system and establishing the integrated model on the basis of all available<br />

data. However, on the basis of the few possible tests (e.g. Figure 7) of the integrated<br />

model against independent data not used in the calibration-validation process for<br />

the individual models, it is our opinion that the uncertainties of the integrated model<br />

are significantly smaller than those of the individual models. The two key reasons<br />

for this are: (1) in the integrated model the internal boundaries are simulated by<br />

neighbouring model components and not just assessed through qualified but subjective<br />

estimates by the modeller; and (2) the integrated model makes it possible<br />

to explicitly include more sources of data in validation tests that can not all be<br />

utilised in the individual models. Thus, by adding independent validation tests for



the integrated model, such as the one shown in Figure 7 on discharges in seepage<br />

canals, to the validation tests for the individual models, the outputs of the integrated<br />

model have been subject to a more comprehensive test based on more data and<br />

hence, must be considered less uncertain than outputs from the individual models.<br />

The environmental impacts of the new reservoir and the diversion of water from<br />

the Danube through the Gabcikovo power plant can be simulated in rather fine<br />

detail by the integrated model established for the area. The integrated nature of<br />

the model has been illustrated by a case study focusing on hydrology and ecology<br />

in the wetland comprising the river branch system. The integrated model is not<br />

claimed to be capable of predicting detailed ecological changes at the species level.<br />

However, it is believed to be capable of simulating changes in the hydrological<br />

regime resulting from alternative water management decisions to such a degree of<br />

detail that it becomes a valuable tool for broader assessments of possible ecological<br />

changes in the area.<br />

Acknowledgements<br />

The present paper is based on results from the project ‘Danubian Lowland – Ground<br />

Water Model’ supported by the European Commission under the PHARE program.<br />

The project was executed by the Slovak Ministry of the Environment. The work<br />

was carried out by an international group of research and consulting organisations<br />

as reflected by the team of authors. The constructive criticisms of two anonymous<br />

reviewers are acknowledged.<br />

References<br />

Bathurst, J. C., Wicks, J. M. and O’Connell, P. E.: 1995, The SHE/SHESED basin scale water flow<br />

and sediment transport modelling system, In V. P. Singh (ed.), Computer Models of Watershed<br />

Hydrology, Water Resources Publications, pp. 563–594.<br />

Calver, A. and Wood, W. L.: 1995, The institute of hydrology distributed model, In V. P. Singh (ed.),<br />

Computer Models of Watershed Hydrology, Water Resources Publications, pp. 595–626.<br />

CEC: 1991, Commission of European Communities, Czech and Slovak Federative Republic,<br />

Danubian Lowland-Ground Water Model, No. PHARE/90/062/030/001/EC/WAT/1<br />

DHI: 1995, MIKE 21 Short Description. Danish Hydraulic Institute, Hørsholm, Denmark.<br />

DHI, DHV, TNO, VKI, Krüger and KVL: 1995, PHARE project Danubian Lowland – Ground Water<br />

Model (EC/WAT/1), Final Report. Prepared by a consultant group for the Ministry of the Environment,<br />

Slovak Republic and for the Commission of the European Communities, Vol. 1, 65 pp.;<br />

Vol. 2, 439 pp.; Vol. 3, 297 pp., Bratislava.<br />

EC: 1992, Working group of independent experts on variant C of the Gabcikovo-Nagymaros project,<br />

working Group Report, Commission of the European Communities, Czech and Slovak Federative<br />

Republic, Republic of Hungary, Budapest, 23 November, 1992.<br />

EC: 1993a, Working group of monitoring and water management experts for the Gabcikovo system<br />

of locks – Data Report, Commission of the European Communities, Republic of Hungary, Slovak<br />

Republic, Budapest, 2 November, 1993.<br />


EC: 1993b, Working group of monitoring and water management experts for the Gabcikovo system<br />

of locks – Report on temporary water management regime, Commission of the European<br />

Communities, Republic of Hungary, Slovak Republic, Bratislava, 1 December, 1993.<br />

Engesgaard, P.: 1996, Multi-Species Reactive Transport, In M. B. Abbott and J. C. Refsgaard (eds),<br />

Distributed Hydrological Modelling, Kluwer Academic Publishers, pp. 71–91.<br />

Griffioen, J., Engesgaard, P., Brun, A., Rodak, R., Mucha, I. and Refsgaard, J. C.: 1995, Nitrate<br />

and Mn-chemistry in the alluvial Danubian Lowland aquifer, Slovakia. Ground Water Quality:<br />

Remediation and Protection (GQ95), Proceedings of the Prague Conference, May 1995, IAHS<br />

Publ. No. 225, pp. 87–96.<br />

Hansen, S., Jensen, H. E., Nielsen, N. E. and Svendsen, H.: 1991, Simulation of nitrogen dynamics<br />

and biomass production in winter wheat using the Danish simulation model DAISY. Fertilizer<br />

Research 27, 245–259.<br />

Havnø, K., Madsen, M. N. and Dørge, J.: 1995, ‘MIKE 11 – A Generalized River Modelling<br />

Package’, In V. P. Singh (ed), Computer Models of Watershed Hydrology, Water Resources<br />

Publications, pp. 733–782.<br />

Holobrada, M., Capekova, Z., Lukac, M. and Misik, M.: 1994, Prognoses of the Hrusov reservoir<br />

eutrophication and siltation under various discharge distribution to the Old Danube (in Slovak),<br />

Water Research Institute (VUVH), Bratislava.<br />

ICJ: 1997, Case Concerning Gabcikovo-Nagymaros project (Hungary/Slovakia). Summary of the<br />

Judgement of 25 September 1997. International Court of Justice, The Hague, (available on<br />

www.icj-cij.org).<br />

JAR: 1995, 1996, 1997, Joint Annual Report of the environment monitoring in 1995, 1996, 1997<br />

according to the ‘Agreement between the Government of the Slovak Republic and the Government<br />

of Hungary about Certain Temporary Measures and Discharges to the Danube and Mosoni<br />

Danube’, signed 19 April, 1995.<br />

Klemes, V.: 1986, Operational testing of hydrological simulation models, Hydrological Sciences<br />

Journal, 13–24.<br />

Klucovska, J. and Topolska, J.: 1995, Water regime in the Danube river and its river branches, In I.<br />

Mucha (ed.), Gabcikovo Part of the Hydroelectric Power Project. Environmental Impact Review,<br />

Faculty of Natural Sciences, Comenius University, Bratislava, pp. 33–42.<br />

Kocinger, D.: 1995, Gabcikovo Part of the Hydroelectric Power Project, Basic Characteristics, In I.<br />

Mucha (ed.), Gabcikovo Part of the Hydroelectric Power Project – Environmental Impact Review,<br />

Faculty of Natural Sciences, Comenius University, Bratislava, pp. 5–14.<br />

Koncsos, L., Schütz, E. and Windau, U.: 1995, Application of a comprehensive decision support<br />

system for the water quality management of the river Ruhr, Germany, In S. P. Simonovic, Z.<br />

Kundzewicz, D. Rosbjerg and K. Takeuchi (eds), Modelling and Management of Sustainable<br />

Basin-Scale Water Resources Systems, IAHS Publ. No. 231, pp. 49–59.<br />

Menetti, M.: 1995, Analysis of regional water resources and their management by means of numerical<br />

simulation models and satellites in Mendoza, Argentina, In S. P. Simonovic, Z. Kundzewicz, D.<br />

Rosbjerg and K. Takeuchi (eds), Modelling and Management of Sustainable Basin-Scale Water<br />

Resources Systems, IAHS Publ. No. 231, pp. 49–59.<br />

Mucha, I.: 1992, Database processing of the hydropedological parameters for the ground water flow<br />

model of the Danubian Lowland (in Slovak), Ground Water Division, Faculty of Natural Science,<br />

Comenius University, Bratislava.<br />

Mucha, I., Paulikova, E., Hlavaty, Z., Rodak, D. and Pokorna, L.: 1992a, Danubian Lowland Ground<br />

Water Model, Working Manual to consortium of invited specialists for workshop in Bratislava,<br />

Ground Water Division, Faculty of Natural Sciences, Comenius University, Bratislava.<br />

Mucha, I., Paulikova, E., Hlavaty, Z. and Rodak, D.: 1992b, Elaboration of basis data for preparation<br />

of hydrogeological parameters for the model of the ground water flow of the Danubian Lowland<br />

area (in Slovak), Ground Water Division, Faculty of Natural Science, Comenius University,<br />

Bratislava.



Mucha, I., Paulikova, E., Hlavaty, Z., Rodak, D. and Pokorna, L.: 1993, Surface and ground water<br />

regime in the Slovak part of the Danube alluvium, Ground Water Division, Faculty of Natural<br />

Science, Comenius University.<br />

Mucha, I. (ed): 1995, Gabcikovo part of the hydroelectric power project environmental impact<br />

review. Evaluation based on two years monitoring, Faculty of Natural Sciences, Comenius<br />

University, Bratislava.<br />

Mucha, I., Rodak, D., Hlavaty, Z. and Bansky, L.: 1997, Environmental aspects of the design<br />

and construction of the Gabcikovo Hydroelectric Power Project on the river Danube, Proceedings<br />

International Symposium on Engineering Geology and the Environment, organized by the<br />

Greek National Group of IAEG, Athens, June 1997, Engineering Geology and the Environment,<br />

pp. 2809–2817.<br />

Nachtnebel, H.-P. (ed): 1989, Ökosystemstudie Donaustau Altenwörth, Veränderungen durch das<br />

Donaukraftwerk Altenwörth, Österreichische Akademie der Wissenschaften, Veröffentlichungen<br />

des Österreichischen MaB-Programms, Band 14, Universitätsverlag Wagner, Innsbruck.<br />

Person, M., Raffensperger, J. P., Ge, S. and Garven, G.: 1996, Basin-scale hydrogeologic modelling,<br />

Rev. Geophys. 34(1), 61–87.<br />

Perspektiven: 1989, Staustufe Freudenau, Perspektiven, Magazin für Stadtgestaltung und Lebensqualität,<br />

Dezember 1989.<br />

Refsgaard, J. C.: 1997, Parameterisation, calibration and validation of distributed hydrological<br />

models, J. Hydrology 198, 69–97.<br />

Refsgaard, J. C. and Storm, B.: 1995, MIKE SHE, In V. P. Singh (ed), Computer Models of Watershed<br />

Hydrology, Water Resources Publications, pp. 809–846.<br />

Singh, V. P. (ed): 1995, Computer Models of Watershed Hydrology, Water Resources Publications.<br />

Sørensen, H. R., Klucovska, J., Topolska, T., Clausen, T. and Refsgaard, J. C.: 1996, An engineering<br />

case study – Modelling the influences of the Gabcikovo hydropower plant in the hydrology and<br />

ecology in the Slovak part of the river branch system, In M. B. Abbott and J. C. Refsgaard (eds),<br />

Distributed Hydrological Modelling, Kluwer Academic Publishers, pp. 233–253.<br />

Styczen, M. and Storm, B.: 1993, Modelling of N-movements on catchment scale – a tool for analysis<br />

and decision making. 1. Model description. 2. A case study, Fertilizer Research 36, 1–17.<br />

Topolska, J. and Klucovska, J.: 1995, River morphology, In I. Mucha (ed.), Gabcikovo Part of the Hydroelectric<br />

Power Project. Environmental Impact Review, Faculty of Natural Sciences, Comenius<br />

University, Bratislava, pp. 23–32.<br />

VKI: 1995, Short Description of water quality and eutrophication modules, Water Quality Institute,<br />

Hørsholm, Denmark.<br />

Winter, T. C.: 1995, Recent advances in understanding the interaction of groundwater and surface<br />

water, Rev. Geophys., Supplement, U.S. National Report 1991–94 to IUGG, pp. 985–994.<br />

Yan, J. and Smith, K. R.: 1994, Simulation of integrated surface water and ground water systems –<br />

model formulation, Water Resources Bulletin 30(5), 879–890.


[10]<br />

Refsgaard JC, Thorsen M, Jensen JB, Kleeschulte S, Hansen S (1999) Large<br />

scale modelling of groundwater contamination from nitrate leaching.<br />

Journal of Hydrology, 221(3-4), 117-140.<br />

Reprinted from Journal of Hydrology with permission from Elsevier


Journal of Hydrology 221 (1999) 117–140<br />

Large scale modelling of groundwater contamination from<br />

nitrate leaching<br />

J.C. Refsgaard a, *, M. Thorsen a , J.B. Jensen a , S. Kleeschulte b , S. Hansen c<br />

a Danish Hydraulic Institute, Hørsholm, Denmark<br />

b GIM, Luxembourg<br />

c Royal Veterinary and Agricultural University, Copenhagen, Denmark<br />

Received 20 July 1998; received in revised form 3 May 1999; accepted 31 May 1999<br />

Abstract<br />

Groundwater pollution from non-point sources, such as nitrate from agricultural activities, is a problem of increasing<br />

concern. Comprehensive modelling tools of the physically based type are well proven for small-scale applications with<br />

good data availability, such as plots or small experimental catchments. The two key problems related to large-scale simulation<br />

are data availability at the large scale and model upscaling/aggregation to represent conditions at larger scale. This paper<br />

presents a methodology and two case studies for large-scale simulation of aquifer contamination due to nitrate leaching. Readily<br />

available data from standard European level databases such as GISCO, EUROSTAT and the European Environment Agency<br />

(EEA) have been used as the basis of modelling. These data were supplemented by selected readily available data from national<br />

sources. The model parameters were all assessed from these data by use of various transfer functions, and no model calibration<br />

was carried out. The adopted upscaling procedure combines upscaling from point to field scale using effective parameters with a<br />

statistically based aggregation procedure from field to catchment scale, preserving the areal distribution of soil types, vegetation<br />

types and agricultural practices on a catchment basis. The methodology was tested on two Danish catchments with good<br />

simulation results on water balance and nitrate concentration distributions in groundwater. The upscaling/aggregation procedure<br />

appears to be applicable in many areas with regard to root zone processes such as runoff generation and nitrate leaching,<br />

while it has important limitations with regard to hydrograph shape due to its lack of accounting for scale effects in relation to<br />

stream–aquifer interaction. © 1999 Elsevier Science B.V. All rights reserved.<br />

Keywords: Upscaling; Databases; Non-point pollution; Nitrate leaching; Distributed model; Water balance<br />

1. Introduction<br />

Groundwater is a significant source of freshwater<br />

used by industry, agriculture and domestic users.<br />

However, increasing demand for water, increasing<br />

use of pesticides and fertilisers as well as atmospheric<br />

deposition constitute a threat to the quality of groundwater.<br />

The use of fertilisers and manure leads to the<br />

leaching of nitrates into the groundwater and atmospheric<br />

deposition contributes to the acidification of<br />

soils that may have an indirect effect on the contamination<br />

of water.<br />

* Corresponding author. E-mail address: jcr@dhi.dk (J.C. Refsgaard)<br />

In Europe, for instance, the present situation is<br />

summarised in EEA (1995), where it is assessed that<br />

the major part of aquifers in Northern and Central<br />

Europe is subject to risk of nitrate contamination<br />

due, amongst other causes, to agricultural activities. Therefore,<br />

policy makers and legislators in the EU are<br />

concerned about the issue and a number of preventive<br />

0022-1694/99/$ - see front matter © 1999 Elsevier Science B.V. All rights reserved.<br />

PII: S0022-1694(99)00081-5



J.C. Refsgaard et al. / Journal of Hydrology 221 (1999) 117–140<br />

legislative steps are currently being taken (EU<br />

Council of Ministers, 1991; EC, 1996).<br />

In the scientific community, concerns on groundwater<br />

contamination have motivated the development<br />

of numerous simulation models for groundwater quality<br />

management. Groundwater models describing the<br />

flow and transport mechanisms of aquifers have been<br />

developed since the 1970s and applied in numerous<br />

pollution studies. They have mainly described the<br />

advection and dispersion of conservative solutes.<br />

More recently, geochemical and biochemical reactions<br />

have been included to simulate the transport<br />

and fate of pollutants from point sources such as industrial<br />

and municipal waste disposal sites, see e.g. Mangold<br />

and Tsang (1991); Engesgaard et al. (1996) for overviews.<br />

Fewer attempts have been made to simulate<br />

non-point pollution at catchment scale resulting<br />

from agricultural activities, see e.g. Thorsen et al.<br />

(1996); Person et al. (1996) for overviews. The<br />

approaches range from relatively simple models<br />

with semi-empirical process descriptions of the<br />

lumped conceptual type such as ANSWERS (Beasley<br />

et al., 1980), CREAMS (Knisel, 1980; Knisel and<br />

Williams, 1995), GLEAMS (Leonard et al., 1987),<br />

SWRRB (Arnold and Williams, 1990; Arnold et al.,<br />

1995) and AGNPS (Young et al., 1995) to more<br />

complex models with a physically based process<br />

description. The physically based models are most<br />

commonly one-dimensional leaching models, such<br />

as RZWQM (DeCoursey et al., 1989, 1992), Daisy<br />

(Hansen et al., 1991) and WAVE (Vereecken et al.,<br />

1991; Vanclooster et al., 1994, 1995), which basically<br />

describe root zone processes only, while true,<br />

spatially distributed, catchment models based on<br />

comprehensive process descriptions, such as the<br />

coupled MIKE SHE/Daisy (Styczen and Storm,<br />

1993), are seldom reported. The simple conceptual<br />

models are attractive because they require relatively<br />

few data, which are usually easily accessible, but<br />

the predictive capability of these models with regard<br />

to assessing the impacts of alternative agricultural<br />

practices is questionable due to the semi-empirical<br />

nature of the process descriptions. Conversely, a<br />

key problem in using the more complex catchment<br />

models operationally lies in the generally large data<br />

requirements prescribed by the developers of such<br />

model codes. However, due to the better process<br />

descriptions these models may for some types of<br />

application be expected to have better predictive<br />

capabilities than the simpler models (Heng and Nikolaidis,<br />

1998).<br />

Input data for the complex catchment models have<br />

traditionally been available in practice only for small<br />

areas such as experimental research catchments.<br />

However, as more and more data have been gathered<br />

in computerised databases and, in particular, in<br />

Geographical Information Systems (GIS), the data<br />

availability has improved significantly. Further,<br />

experience from case studies indicates that a considerable<br />

part of the input data may be derived from<br />

statistical data and more general databases (Styczen<br />

and Storm, 1995).<br />

The database of EUROSTAT, the statistical office<br />

of the European Commission, holds statistical information<br />

about different topics from all Member States<br />

of the European Union. Agricultural statistics provide<br />

information on main crops, on the structure of agricultural<br />

holdings and on crop and animal production.<br />

Environment statistics provide figures on the impacts of<br />

other sectors’ work on the environment, such as fertiliser<br />

and pesticide input, groundwater withdrawal,<br />

water quality or manure production on animal<br />

farms. These figures are mostly aggregated and<br />

published at national level.<br />

In order to use these statistics in a spatially distributed<br />

simulation model, the information needs to be<br />

spatially referenced to represent a unit on the ground.<br />

Therefore the statistical information needs to be<br />

linked to a GIS data set. Such GIS data are stored in<br />

the GISCO (Geographic Information System of the<br />

European Commission) database. The GISCO database<br />

holds spatial data about administrative boundaries<br />

down to commune level, thematic data sets<br />

such as the soil database, CORINE land cover (managed<br />

by the EEA) or climatic time series for about 2000<br />

measuring stations in the European Union.<br />

Thus on one hand, there is a clearly expressed need<br />

from decision makers at national and international<br />

level to have tools, which on the basis of readily available<br />

data can predict the risks of groundwater pollution<br />

from non-point sources and the impacts of<br />

alternative agricultural management practices; and<br />

on the other hand, the scientific community has<br />

achieved new knowledge and developed new tools<br />

aiming at this. However, there are some important<br />

gaps to be filled before the scientifically based tools



Fig. 1. Schematic structure of the MIKE SHE.<br />

can be applied operationally for supporting the decision<br />

makers:<br />

• The physically based models are very promising<br />

tools for assessing the impacts of alternative agricultural<br />

practices, but have so far been tested on<br />

plot scale and very small experimental catchments,<br />

whereas the need from a policy making point of<br />

view mainly relates to application on a much larger<br />

scale. Hence, there is a need to derive and test<br />

methodologies for upscaling of such models to<br />

run with model grid sizes one to two orders of<br />

magnitude larger than usual.<br />

• Readily available data on large (national and international)<br />

scales do exist, although in a somewhat<br />

aggregated form. However, such data have not yet<br />

been used as the basis for comprehensive modelling,<br />

which so far has always been based on more<br />

detailed data, often from experimental catchments.<br />

Hence, there is a need to test to what extent these<br />

readily available data are suitable for modelling.<br />

• There is a need to assess the predictive<br />

uncertainties, before it can be evaluated whether<br />

the approach of combining complex predictive<br />

models with existing data bases is of any practical<br />

use in the decision making process or whether the<br />

uncertainties are too large.<br />

This paper presents results from a joint EU research<br />

project on prediction of non-point nitrate contamination<br />

at catchment scale due to agricultural activities.<br />

Other results from the same study focussing on uncertainty<br />

aspects are presented in UNCERSDSS (1998),<br />

Refsgaard et al. (1998a, 1999) and Hansen et al.<br />

(1999).<br />

2. Methodology<br />

2.1. Materials and methods<br />

2.1.1. MIKE SHE<br />

MIKE SHE is a modelling system describing the<br />

flow of water and solutes in a catchment in a distributed<br />

physically based way. This implies numerical



solutions of the coupled partial differential equations<br />

for overland (2D) and channel flow (1D), unsaturated<br />

flow (1D) and saturated flow (3D) together with a<br />

description of evapotranspiration and snowmelt<br />

processes. The model structure is illustrated in Fig.<br />

1. For further details reference is made to the literature<br />

(Abbott et al., 1986; Refsgaard and Storm, 1995).<br />

2.1.2. Daisy<br />

Daisy (Hansen et al., 1991) is a one-dimensional<br />

physically based modelling tool for the simulation<br />

of crop production and water and nitrogen balance<br />

in the root zone. Daisy includes modules for description<br />

of evapotranspiration, soil water dynamics based<br />

on Richards’ equation, water uptake by plants, soil<br />

temperature, soil mineral nitrogen dynamics based<br />

on the advection–dispersion equation, nitrate uptake<br />

by plants and nitrogen transformations in the soil. The<br />

nitrogen transformations simulated by Daisy are<br />

mineralization–immobilization turnover, nitrification<br />

and denitrification. In addition, Daisy includes a<br />

module for description of agricultural management<br />

practices. Details on the Daisy application in the<br />

present study are given by Hansen et al. (1999).<br />
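
For orientation, the two governing equations named above are commonly written in the following one-dimensional textbook forms. The notation is assumed here for illustration and is not taken from the Daisy documentation: θ is volumetric water content, h pressure head, K(h) hydraulic conductivity, z the vertical coordinate (positive upwards), S a sink term for root water uptake, c solute concentration, D the dispersion coefficient, q the water flux and Φ a net source/sink term. The exact formulation used in Daisy is given by Hansen et al. (1991).

```latex
% Richards' equation for vertical soil water flow (z positive upwards)
\frac{\partial \theta}{\partial t}
  = \frac{\partial}{\partial z}\left[ K(h)\left( \frac{\partial h}{\partial z} + 1 \right) \right] - S

% Advection--dispersion equation for a solute such as nitrate
\frac{\partial (\theta c)}{\partial t}
  = \frac{\partial}{\partial z}\left( \theta D \frac{\partial c}{\partial z} \right)
  - \frac{\partial (q c)}{\partial z} + \Phi
```

In this form the water flux is q = −K(h)(∂h/∂z + 1), so the second equation takes its advective term from the solution of the first.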

2.1.3. MIKE SHE/Daisy coupling<br />

By combining MIKE SHE and Daisy, a complete<br />

modelling system is available for the simulation of<br />

water and nitrate transport in an entire catchment. In<br />

the present case the coupling is a sequential one. Thus<br />

for all agricultural areas, Daisy first produces calculations<br />

of water and nitrogen behaviour from the soil<br />

surface and through the root zone. The percolation of<br />

water and nitrate at the bottom of the root zone simulated<br />

by Daisy, is then used as input to MIKE SHE<br />

calculations for the remaining part of the catchment.<br />

For natural areas, MIKE SHE also calculates the root<br />

zone processes assuming no nitrate contribution from<br />

these areas. Owing to the sequential execution of the<br />

two codes, it has to be assumed that there is no feedback<br />

from the groundwater zone (MIKE SHE) to the<br />

root zone (Daisy). Further, overland flow generated by<br />

high intensity rainfall (Hortonian) cannot be simulated<br />

by this coupling, while overland flow due to<br />

saturation from below (Dunne) can be accounted for<br />

by MIKE SHE.<br />

Thus, MIKE SHE does not in the present case<br />

handle evapotranspiration and other root zone<br />

processes in the agricultural areas. As Daisy is one-dimensional,<br />

one Daisy run in principle should be<br />

carried out for each of MIKE SHE’s horizontal<br />

grids. However, several MIKE SHE grids are assumed<br />

to have identical root zone properties (soil, crop, agricultural<br />

management practices, etc.), so that in practice<br />

the outputs from each Daisy run can be used as<br />

input to several MIKE SHE grids.<br />
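
The sequential coupling described above can be sketched as follows. This is a minimal illustration only, not the actual MIKE SHE or Daisy interface: `run_root_zone_model` is a hypothetical stand-in for one Daisy run, the grid cell classes are invented, and the returned numbers are arbitrary placeholders. The point is that one root zone run is made per unique combination of root zone properties, and its output is reused as input for every grid cell sharing that combination.

```python
from collections import defaultdict


def run_root_zone_model(soil, crop):
    """Hypothetical stand-in for one Daisy run.

    Returns (percolation in mm/year, nitrate leaching in kg N/ha/year).
    The numbers are arbitrary placeholders, not Daisy results.
    """
    percolation = 300.0 if soil == "sand" else 200.0
    leaching = 60.0 if crop == "winter wheat" else 40.0
    return percolation, leaching


def couple_sequentially(grid_cells):
    """Run the root zone model once per unique (soil, crop) class and
    reuse its output as the upper boundary input of every matching cell."""
    cells_by_class = defaultdict(list)
    for cell_id, root_zone_class in grid_cells.items():
        cells_by_class[root_zone_class].append(cell_id)

    boundary_input = {}
    for (soil, crop), cell_ids in cells_by_class.items():
        output = run_root_zone_model(soil, crop)  # one run per class
        for cell_id in cell_ids:
            boundary_input[cell_id] = output
    return boundary_input, len(cells_by_class)


# Four hypothetical grid cells, three of which share root zone properties.
grid_cells = {
    0: ("sand", "winter wheat"),
    1: ("sand", "winter wheat"),
    2: ("loam", "grass"),
    3: ("sand", "winter wheat"),
}
boundary_input, n_runs = couple_sequentially(grid_cells)
```

Because information flows in one direction only, nothing computed for the groundwater zone is passed back to the root zone run, which mirrors the no-feedback assumption stated above.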

2.2. Data availability at European databases<br />

Input data for modelling at the European scale need<br />

to satisfy certain requirements to make them useful for<br />

large-scale applications:<br />

• The data must be available for the whole of<br />

Europe.<br />

• The data must be harmonised according to a<br />

common nomenclature in order to avoid regional<br />

or national inconsistencies.<br />

• The data should be available in a seamless database.<br />

• The data should be available from one single<br />

source to avoid regional or national inconsistencies.<br />

• The data should be available in a format which can<br />

be directly integrated into a Geographical Information<br />

System (GIS).<br />

The use of “European” data sets also entails<br />

certain problems. The data are generalised in<br />

geometric as well as in thematic detail, and local particularities<br />

that are especially important for hydrological<br />

simulations are not always accounted for. Often<br />

information that is required for specific modelling<br />

objectives is not directly available at European<br />

level, demanding the establishment and use of transfer<br />

functions instead. Conversely, information is<br />

sometimes too specific when it has been collected in<br />

the framework of a particular research project, e.g.<br />

information on a particular soil property has been<br />

collected in natural soils but not in agricultural soils.<br />

Given these formal requirements, a first task of the<br />

project was to study the availability of data sets suited<br />

for large-scale hydrological modelling of groundwater<br />

contamination from diffuse sources. After<br />

intensive searches of on-line data catalogues, paper<br />

publications and direct contacts with organisations<br />

holding relevant information, it was possible to<br />

identify sources for all the information requirements.<br />

Table 1<br />

Data sources for European scale hydrological modelling<br />

| Data | Potential data source identified in European database | Source actually used for modelling | Scale of available data used |
| --- | --- | --- | --- |
| Topography | USGS (a)/GISCO | USGS/GISCO | 1 km grid |
| Soil type | GISCO soil map | GISCO soil map | 1 km grid |
| Soil organic matter | RIVM (b) report | Experience value for Danish arable soils (c) | Denmark |
| Vegetation | EEA: CORINE land cover | EEA: CORINE land cover | 1 km grid |
| River network and river cross sections | DCW (d) | Provided by an application developed within the project | 1 km grid |
| Geology | Report on groundwater resources in Denmark (EC, 1982); RIVM: digital map data of report | Report on groundwater resources in Denmark (EC, 1982) | County, i.e. approximately 3000 km² |
| Groundwater abstraction | Report on groundwater resources in Denmark (EC, 1982); RIVM: digital map data of report | Report on groundwater resources in Denmark (EC, 1982) | Commune, i.e. approximately 200 km² |
| Management practices | SC-DLO (e) report | Plantedirektoratet (1996) | Denmark |
| Crop type | Eurostat: Regional Statistics | Agricultural Statistics (1995) | County, i.e. approximately 3000 km² |
| Livestock density | Eurostat: Regional Statistics; Eurostat: Eurofarm | Agricultural Statistics (1995) | County, i.e. approximately 3000 km² |
| Fertilizer consumption | Eurostat: Environmental Statistics | Agricultural Statistics (1995) | County, i.e. approximately 3000 km² |
| Manure production | Eurostat: Environmental Statistics | Agricultural Statistics (1995) | County, i.e. approximately 3000 km² |
| Atmospheric deposition | MARS project | National data | Denmark |
| Climatic variables | MARS project (f) | National data | Denmark |
| River runoff | GRDC (g) | National data | Catchment |

(a) USGS: United States Geological Survey.<br />

(b) RIVM: National Institute of Public Health and the Environment of The Netherlands.<br />

(c) RIVM data only include natural areas, not arable land. Instead the figure was assessed on the basis of previous experience with Danish agricultural soils.<br />

(d) DCW: Digital Chart of the World.<br />

(e) SC-DLO: Winand Staring Centre, The Netherlands.<br />

(f) MARS: Monitoring Agriculture by Remote Sensing database.<br />

(g) GRDC: Global Runoff Data Centre, database mainly for large river basins.<br />

However, after evaluation of all the potential sources<br />

the following deficiencies became apparent:<br />

• Not all information was available in spatially referenced<br />

GIS format, so other sources such as<br />

tables and statistics had to be considered.<br />

• Not all information was available from<br />

“European” databases, so national sources ultimately<br />

had to be considered. For these national sources<br />

strict requirements in terms of ease of availability,<br />

data quality and data comparability were<br />

imposed.<br />

• The scale of the available data was often too coarse<br />

for the application. Global data sets with 1° × 1°<br />

longitude/latitude resolution are often not detailed<br />

enough.<br />
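
To give a sense of why a 1° × 1° resolution is too coarse, the ground extent of such a cell can be worked out with a spherical-Earth approximation. The sketch below is illustrative only; the latitude of 56°N is an assumed, approximate value for Denmark and is not taken from the study.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def one_degree_cell_size_km(latitude_deg):
    """Return the (north-south, east-west) extent in km of a
    1 degree x 1 degree cell centred at the given latitude."""
    km_per_degree = 2.0 * math.pi * EARTH_RADIUS_KM / 360.0
    north_south = km_per_degree
    # Meridians converge towards the poles, so the east-west extent shrinks.
    east_west = km_per_degree * math.cos(math.radians(latitude_deg))
    return north_south, east_west


# 56 degrees north is an assumed, approximate latitude for Denmark.
ns_km, ew_km = one_degree_cell_size_km(56.0)
cell_area_km2 = ns_km * ew_km
```

Under these assumptions a single cell measures roughly 111 km by 62 km, i.e. several thousand km², which is larger than the county scale (approximately 3000 km²) listed in Table 1 and far larger than the 1 km grid used for topography, soil and vegetation.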

The potential “European scale” data sources and the<br />

data sources which ultimately were used for the model<br />

are shown in Table 1.<br />

Data about climatic variables were obtained from



the national meteorological institutes and river runoff<br />

from the national hydrological institutes. These data<br />

were only available from national sources, but on the<br />

other hand these data are probably the most easily available<br />

(if the issue of price charges is disregarded) and<br />

the most easily comparable due to internationally<br />

harmonised measuring techniques at these organisations.<br />

Regional statistics on Denmark obtained from<br />

EUROSTAT proved to be not detailed enough<br />

(country level only). The required statistical information<br />

could easily be recovered from Danish national<br />

statistics.<br />

Cost estimates for the compilation of the database<br />

have only been undertaken to a limited extent. The<br />

project data themselves have mostly been obtained in<br />

exchange for the anticipated project results, i.e. at<br />

no cost. The main data that in a fully commercial<br />

environment cost a substantial amount of money are<br />

meteorological data which are available from the<br />

national meteorological institutes (Kleeschulte,<br />

1998).<br />

2.3. Change of scale<br />

Large scale hydrological models are required for a<br />

variety of applications in hydrological, environmental<br />

and land surface-atmosphere studies, both for research<br />

and for day to day water resources management<br />

purposes. The physically based models have so far<br />

mainly been tested and applied at small scale and<br />

therefore require upscaling. The complex interaction<br />

between spatial scale and spatial variability is widely<br />

perceived as a substantial obstacle to progress in this<br />

respect (Blöschl and Sivapalan, 1995; and many<br />

others).<br />

The research results on the scaling issue reported<br />

during the past decade have, depending on the particular<br />

applications, focussed on different aspects,<br />

which may be categorised as follows:<br />

• Subsurface processes focussing on the effect of<br />

geological heterogeneity.<br />

• Root zone processes including interactions<br />

between land surface and atmospheric processes.<br />

• Surface water processes focussing on topographic<br />

effects and stream–aquifer interactions.<br />

The effect of spatial heterogeneity on the description<br />

of subsurface processes has been the subject of<br />

comprehensive research for two decades, see e.g.<br />

Dagan (1986) and Gelhar (1986) for some of the<br />

first consolidated results and Wen and Gómez-<br />

Hernández (1996) for a more recent review, mainly<br />

related to aquifer systems. The focus in this area is<br />

largely concerned with upscaling of hydraulic<br />

conductivity and its implications on solute transport<br />

and dispersion processes in the unsaturated zone and<br />

aquifer system, typically at length scales less than<br />

1 km.<br />

Research on land surface processes has<br />

mainly been driven by climate change research<br />

where the meteorologists typically focus on length<br />

scales up to 100 km. Michaud and Shuttleworth<br />

(1997), in a recent overview, conclude that substantial<br />

progress has been made for the description of surface<br />

energy fluxes by using simple aggregation rules. Sellers<br />

et al. (1997) conclude that “it appears that simple<br />

averages of topographic slope and vegetation parameters<br />

can be used to calculate surface energy and<br />

heat fluxes over a wide range of spatial scales, from<br />

a few meters up to many kilometers at least for grassland<br />

and sites with moderate topography”. An interesting<br />

finding is the apparent existence of a threshold<br />

scale, or representative elementary area (REA) for<br />

evapotranspiration and runoff generation processes<br />

(Wood et al., 1988, 1990, 1995). Famiglietti and<br />

Wood (1995) conclude on the implications of such<br />

an REA in a study of catchment evapotranspiration<br />

that “the existence of an REA for evapotranspiration<br />

modelling suggests that in catchment areas smaller<br />

than this threshold scale, actual patterns of model<br />

parameters and inputs may be important factors<br />

governing catchment-scale evapotranspiration rates<br />

in hydrological models. In models applied at scales<br />

greater than the REA scale, spatial patterns of dominant<br />

process controls can be represented by their<br />

statistical distribution functions”. The REA scales<br />

reported in the literature are in the order of 1–5 km².<br />

The research on scale effects related to topography<br />

and stream–aquifer interactions has been rather<br />

limited as compared to the above two areas. Saulnier<br />

et al. (1997) have examined the effect of the grid sizes<br />

in digital terrain maps (DTM) on the model simulations<br />

using the topography-based TOPMODEL. They<br />

concluded that in particular for channel pixels the<br />

spatial resolution of the underlying DTM is important.<br />

Refsgaard (1997), applying the distributed MIKE SHE



Fig. 2. Schematic representation of upscaling/aggregation procedure.<br />

model to the Danish Karup catchment with grid sizes<br />

of 0.5, 1, 2 and 4 km, found that the discharge hydrograph<br />

shape was significantly affected for the 2 and<br />

4 km grids as compared to the almost identical model<br />

results with 0.5 and 1 km grids. He concluded that the<br />

main reason for this change was that the density of<br />

smaller tributaries within the catchment was smaller<br />

for the models with the larger grids.<br />

Many researchers doubt whether it is feasible to use<br />

the same model process descriptions at different<br />

scales. For instance Beven (1995) states that “… the<br />

aggregation approach towards macroscale hydrological<br />

modelling, in which it is assumed that a model<br />

applicable at small scales can be applied at larger<br />

scales using ‘effective’ parameter values, is an inadequate<br />

approach to the scale problem. It is also unlikely<br />

in the future that any general scaling theory can be<br />

developed due to the dependence of hydrological<br />

systems on historical and geological perturbations”.<br />

We have experienced some of the same problems<br />

and agree that it is generally not possible to apply<br />

the same model without recalibration at small and<br />

large scales. Therefore, we have used another<br />

approach based on a combination of aggregation and<br />

upscaling in accordance with the principles recommended<br />

by Heuvelink and Pebesma (1998). The<br />

scale terminology and the upscaling procedure<br />

adopted here are as follows (Fig. 2):<br />

• The basic modelling system is of the distributed<br />

physically based type. For application at point<br />

scale (where it is not used spatially distributed)<br />

the process descriptions of this model type can be<br />

tested directly against field data.<br />

• The model is in this case run with (equations and)<br />

parameter values in each horizontal grid point<br />

representing field scale (50–200 m) conditions.<br />

The field scale is characterised by ‘effective’ soil<br />

and vegetation parameters, but assuming only one<br />

soil type and one cropping pattern. Thus the spatial<br />

variability within a typical field is aggregated and<br />

accounted for in the ‘effective’ parameter values.<br />

• The smallest horizontal discretization in the model<br />

is the grid scale or grid size (1–5 km) that is larger<br />

than the field scale. This implies that all the variations<br />

between categories of soil type and crop type



Fig. 3. Locations of the Karup and Odense catchments in Denmark.<br />

within the area of each grid cannot be resolved and<br />

described at the grid level. Such input data whose<br />

variations are not included in the grid scale model<br />

representation, are distributed randomly at the<br />

catchment scale so that their statistical distributions<br />

are preserved at that scale.<br />

• The results from the grid scale modelling are then<br />

aggregated to catchment scale (10–50 km) and the<br />

statistical properties of model output and field data<br />

are then compared at catchment scale.<br />

• For applications to larger scales than catchment<br />

scale, such as continental scale, the catchment<br />

scale concept is used, just with more grid points.<br />

This implies that the continental scale can be<br />

considered to consist of several catchments, within<br />

each of which the field scale statistical variations<br />

are preserved and at which scale the predictive<br />

capability of the model thus lies.<br />

In the upscaling procedure a distinction is made<br />

between the terms upscaling and aggregation. Thus,<br />

spatial attributes are aggregated and model parameters<br />

are scaled up. A principal difference between<br />

aggregation and upscaling is that whereas aggregation<br />

can be defined irrespective of a model operating on<br />

the aggregated values, upscaling must always be<br />

defined in the context of a model that uses the parameters<br />

that have been scaled up (Heuvelink and<br />

Pebesma, 1998). In this respect the main principle<br />

of the upscaling procedure can be summarised as<br />

follows:<br />

• Upscale model from point scale to field scale.<br />

• Run model at grid scale using field scale parameters<br />

in such a way that their statistical properties<br />

are preserved at catchment scale.<br />

• Aggregate grid scale model output to catchment<br />

scale.<br />
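
The three steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure as implemented in the study: the class names, area fractions and leaching values are invented, `assign_classes` and `aggregate_to_catchment` are hypothetical helper names, and the per-class output stands in for a full model run at each grid cell.

```python
import random
from collections import Counter


def assign_classes(n_cells, class_fractions, seed=0):
    """Assign one field scale class to each grid cell at a random location,
    so that the class frequencies over the catchment match class_fractions
    as closely as whole cells allow (largest-remainder rounding)."""
    quotas = {c: f * n_cells for c, f in class_fractions.items()}
    counts = {c: int(q) for c, q in quotas.items()}
    leftover = n_cells - sum(counts.values())
    by_remainder = sorted(quotas, key=lambda c: quotas[c] - counts[c], reverse=True)
    for c in by_remainder[:leftover]:
        counts[c] += 1
    cells = [c for c, n in counts.items() for _ in range(n)]
    random.Random(seed).shuffle(cells)  # position within the catchment is random
    return cells


def aggregate_to_catchment(cells, output_per_class):
    """Average the grid scale outputs to a single catchment scale value."""
    return sum(output_per_class[c] for c in cells) / len(cells)


# Hypothetical area fractions and leaching values (kg N/ha/year), illustration only.
fractions = {"wheat on sand": 0.5, "grass on sand": 0.3, "natural": 0.2}
leaching = {"wheat on sand": 80.0, "grass on sand": 40.0, "natural": 0.0}

cells = assign_classes(n_cells=20, class_fractions=fractions)
class_counts = Counter(cells)
catchment_mean = aggregate_to_catchment(cells, leaching)
```

In this sketch the value attached to any individual cell carries no locational meaning, because the classes are shuffled; only the catchment scale statistics, such as `class_counts` and `catchment_mean`, are intended to be interpreted.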

This methodology mainly attempts to address scaling<br />

within the second of the above fields, namely root



zone processes, while scaling in relation to subsurface<br />

processes and stream–aquifer interaction has not been<br />

considered when designing the present upscaling<br />

procedure. The methodology has some complications<br />

and critical assumptions:<br />

• The assumption of upscaling from point scale to<br />

field scale is crucial. This assumption is documented<br />

to be fulfilled in many cases (Jensen and<br />

Refsgaard, 1991a–c; Djurhuus et al., 1999), but<br />

may fail in other cases (Bresler and Dagan,<br />

1983), for instance in areas where overland flow<br />

is a dominant flow mechanism.<br />

• Running the model at grid scale but using model<br />

parameters valid at a field scale, which is typically<br />

2 to 3 orders of magnitude smaller, is necessary to<br />

make the computational demand acceptable for<br />

catchment and continental scale applications. The<br />

solution to this is to assign inputs on soil and vegetation<br />

types not correctly georeferenced but such<br />

that their statistical distribution at catchment scale<br />

is preserved. This implies that results at grid scale<br />

are dubious and should not be used. The aggregation<br />

step up to catchment scale is therefore essential.<br />

• While the statistical properties of the critical<br />

root zone parameters have, due to the aggregation<br />

step, been preserved at catchment scale,<br />

this is not the case for the geological, topographical<br />

and stream data, which are used directly<br />

at the grid scale. A critical question is therefore<br />

how the catchment scale model output<br />

is influenced, through these other data, by the selection<br />

of grid scale. Here, investigations with 1, 2<br />

and 4 km grids are made.<br />

3. Application<br />

3.1. Modelling approach for the Karup and Odense<br />

catchments<br />

The modelling studies have focussed on two<br />

aspects, namely the feasibility of using coarse aggregated<br />

data available at European level databases, and<br />

the effect of the upscaling procedure. The modelling<br />

aims at describing the integrated runoff at the catchment<br />

outlet and the distribution function of the nitrate<br />

concentrations sampled from available wells over the<br />

catchment (aquifer). On this basis the following<br />

approach has been adopted:<br />

1. Simulation models have been established for<br />

two catchments in Denmark, Karup Å and<br />

Odense Å (Fig. 3), in the following denoted<br />

the Karup and Odense models, respectively.<br />

The topographical area of the Karup catchment at<br />

gauging station 20.05 Hagebro is<br />

518 km². Correspondingly, the catchment area<br />

at the gauging station used for the model validation<br />

tests in the Odense catchment, 45.26<br />

Ejby Mølle, is 536 km². The most detailed<br />

studies were carried out for the Karup catchment,<br />

while the results for the Odense catchment<br />

were included mainly to check the<br />

generality of the conclusions derived from the<br />

Karup catchment.<br />

2. The models are established directly from the<br />

European level databases and all input parameter<br />

values are assessed from these data or in a predefined<br />

objective way from experience values<br />

obtained from previous model studies. Thus, the<br />

models are not calibrated at all.<br />

3. The results of the models are compared with field<br />

data, on which basis the model performance is<br />

assessed.<br />

4. The effects of upscaling have been examined in<br />

two ways:<br />

• The models are run with different grid sizes (1, 2<br />

and 4 km) and the results compared.<br />

• For the Karup catchment two different procedures<br />

have been compared, namely:<br />

the upscaling/aggregation procedure described<br />

above (Fig. 2), which according to its representation<br />

of agricultural crops is denoted ‘distributed’;<br />

a simpler procedure where the agricultural crops<br />

are upscaled all the way from field scale to<br />

catchment scale. This implies that one crop<br />

type represents all the agricultural areas. The<br />

dominant crop in the area, namely winter<br />

wheat, has been selected as the crop for the<br />

70% agricultural area, while the 30% natural/



J.C. Refsgaard et al. / Journal of Hydrology 221 (1999) 117–140<br />

Fig. 4. Surface topography, catchment delineation and river network for the Karup-EU model.<br />

urban areas remain as the only other vegetation<br />

type. This procedure is denoted ‘uniform’.<br />

3.2. Karup model<br />

3.2.1. Catchment and river system<br />

The catchment area and locations of the river<br />

branches (Fig. 4) were generated from the DEM by<br />

use of standard ARC/Info functionalities. The generated<br />

catchment areas for 1, 2 and 4 km grids were<br />

within 4% of the correct one at station 20.05 Hagebro.<br />

The river cross-sections were subsequently automatically<br />

derived on the basis of the following assumptions:<br />

• The bankfull discharge (i.e. water flow up to the top of<br />

cross-section) corresponds to a typical annual<br />

maximum discharge. This characteristic discharge<br />

is further assumed uniform in terms of specific<br />

runoff (l s$^{-1}$ km$^{-2}$), so that the actual discharge<br />

at any cross section is estimated as the specific<br />

runoff multiplied by the upstream catchment area<br />

that can be estimated from the DEM.<br />

• The river slope corresponds to the slope of the<br />

surrounding surface, which can be derived from<br />

the DEM.<br />

• The cross-section has a trapezium shape with a<br />

fixed given angle and relation between depth and<br />

width.<br />

• The relation between discharge, slope and river<br />

cross-section can be determined by the Manning<br />

formula with a given Manning number.<br />
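The four assumptions above can be combined into a closed-form estimate of the bankfull depth. The following Python sketch is illustrative only: the function names, the side slope expressed as horizontal run per unit depth, and all numerical values (specific runoff, slope, Manning number, width-to-depth ratio) are assumptions made for the example and are not taken from the study. The Manning formula is used in the form Q = M A R^(2/3) S^(1/2), with M the Manning number.

```python
import math


def trapezoid_coefficients(width_depth_ratio, side_slope):
    """Return coefficients (a, r) such that A = a * d**2 and R = r * d.

    The bottom width is width_depth_ratio * d and side_slope is the
    horizontal run of each bank per unit of depth d.
    """
    area_coeff = width_depth_ratio + side_slope
    perimeter_coeff = width_depth_ratio + 2.0 * math.sqrt(1.0 + side_slope ** 2)
    return area_coeff, area_coeff / perimeter_coeff


def manning_discharge(depth, slope, manning_m, width_depth_ratio, side_slope):
    """Discharge (m3/s) of the trapezoidal section flowing full to `depth`."""
    a, r = trapezoid_coefficients(width_depth_ratio, side_slope)
    area = a * depth ** 2
    radius = r * depth
    return manning_m * area * radius ** (2.0 / 3.0) * math.sqrt(slope)


def bankfull_depth(specific_runoff_l_s_km2, upstream_area_km2, slope,
                   manning_m, width_depth_ratio, side_slope):
    """Depth (m) at which the section carries the characteristic discharge."""
    # Characteristic discharge: specific runoff times upstream area (l/s -> m3/s).
    discharge = specific_runoff_l_s_km2 * upstream_area_km2 / 1000.0
    # Discharge grows with depth**(8/3), so scale from the value at d = 1 m.
    unit_discharge = manning_discharge(1.0, slope, manning_m,
                                       width_depth_ratio, side_slope)
    return (discharge / unit_discharge) ** (3.0 / 8.0)


# Hypothetical values for a cross-section at the catchment outlet.
depth = bankfull_depth(20.0, 518.0, 0.001, 20.0, 5.0, 1.0)
check = manning_discharge(depth, 0.001, 20.0, 5.0, 1.0)
print(f"bankfull depth {depth:.2f} m carries {check:.2f} m3/s")
```

Because the width is tied to the depth by a fixed ratio, no iteration is needed; a section with an independent width would require a numerical root search instead.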

Most areas in Denmark are drained in order to make<br />

the land suitable for agriculture. Agricultural areas are<br />

typically artificially drained with tile drains in combination<br />

with small ditches. Other areas may be naturally<br />

drained by creeks and rivers. It is not possible to<br />

include a detailed and fully correct drainage description<br />

in a coarse model like the Karup model. Moreover,<br />

detailed information on drainage network is not<br />

available. Therefore, when establishing a coarse scale



model, a lumped description must be used. In the<br />

present case it is simply assumed that the entire catchment<br />

area is drained and that the drains are located<br />

1 m below ground surface. Drainage water is<br />

produced whenever the groundwater table is located<br />

above this drainage level. Drainage water is routed to<br />

the nearest river node where it contributes as a source<br />

to the river flow. Routing of groundwater to the drains<br />

and further to the ultimate recipient is in MIKE SHE<br />

described using a linear routing technique, where a<br />

time constant is specified by the user. In this case a<br />

time constant of $2.3 \times 10^{-7}$ s$^{-1}$ was used, corresponding<br />

to an average retention time (in the linear reservoir)<br />

of 50 days. This time constant represents a<br />

typical value for Danish catchments.<br />
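The relation between the 50 day retention time and the time constant, and the lumped drainage rule described above, can be sketched as follows. This is a simplified illustration, not the MIKE SHE implementation: the function names are hypothetical, and the drain flow is assumed to be the time constant multiplied by the height of the groundwater table above the drain level.

```python
SECONDS_PER_DAY = 86400.0


def time_constant_from_retention(retention_days):
    """Linear-reservoir time constant (1/s) for a mean retention time in days."""
    return 1.0 / (retention_days * SECONDS_PER_DAY)


def drain_flow(groundwater_level_m, ground_level_m, time_constant,
               drain_depth_m=1.0):
    """Drain flow per unit area (m/s); zero when the table is below the drains."""
    drain_level = ground_level_m - drain_depth_m
    head_above_drain = max(0.0, groundwater_level_m - drain_level)
    return time_constant * head_above_drain


k = time_constant_from_retention(50.0)
print(f"time constant: {k:.2e} 1/s")
# Groundwater table 0.5 m below ground, i.e. 0.5 m above the drains.
print(f"drain flow: {drain_flow(49.5, 50.0, k) * 1000 * SECONDS_PER_DAY:.1f} mm/day")
```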

3.2.2. Soil properties<br />

The soil texture classes in a 1 × 1 km resolution<br />

were provided by the GISCO soil data base. The<br />

texture classes were translated into soil parameters<br />

in terms of hydraulic conductivity functions and soil<br />

water retention curves using pedo-transfer functions<br />

(Cosby et al., 1984). According to the GISCO database, the<br />

Karup catchment is covered by coarse sandy soil for<br />

which the following key parameter values were estimated:<br />

(a) saturated hydraulic conductivity<br />

$K_s = 1.7 \times 10^{-5}$ m/s; (b) moisture content at saturation<br />

$\theta_s = 40$ vol%; (c) moisture content at field capacity<br />

$\theta_{FC} = 20$ vol%; and (d) moisture content at<br />

wilting point $\theta_{wp} = 6$ vol%.<br />

A specific problem was related to assessment of soil<br />

organic matter, which is an important parameter for<br />

nitrogen turnover processes. As indicated in Table 1<br />

such information was not identified in any of the<br />

European data bases. Instead a value based on<br />

previous experience (Lamm, 1971) with Danish agricultural<br />

soils was estimated. In the plough layer (0–<br />

20 cm) a value of 1.5%C was used, and this value<br />

decreased rapidly with depth to a minimum of<br />

0.01%C below 1 m depth.<br />

3.2.3. Hydrogeology<br />

The geological perception of the area and the basis<br />

for estimation of the hydrogeological parameters used<br />

in the model are all based on EC (1982), where the<br />

aquifer is described as composed of two main geological<br />

layers.<br />

The upper layer is Quaternary sediments consisting<br />

of sands and gravel. The transmissivity of these sediments<br />

is assessed to be in the order of $2 \times 10^{-3}$ m$^2$/s<br />

and the thickness about 15 m (EC, 1982). This leads to<br />

a horizontal hydraulic conductivity of $1.3 \times 10^{-4}$ m/s<br />

that was used in the model calculations. An anisotropy<br />

factor of 10 between horizontal and vertical hydraulic<br />

conductivities was assumed leading to a vertical<br />

hydraulic conductivity of $1.3 \times 10^{-5}$ m/s. Moreover,<br />

a specific yield of 0.2 and a storage coefficient of<br />

$10^{-4}$ m$^{-1}$ were assumed.<br />

Below the Quaternary sediments there are Miocene<br />

quartz-sand sediments with a relatively high transmissivity<br />

of $3 \times 10^{-3}$ m$^2$/s and a thickness of typically<br />

10–20 m (EC, 1982). Hence, in the model a thickness<br />

of 15 m has been used. This leads to a horizontal<br />

hydraulic conductivity of $2.0 \times 10^{-4}$ m/s. The same<br />

assumptions on anisotropy, specific yield and storage<br />

coefficients as for the Quaternary sediments were<br />

applied for the Miocene sediments.<br />
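The conductivities quoted for the two layers follow from dividing the transmissivity by the layer thickness and applying the anisotropy factor. A minimal Python check of that arithmetic, using only the values stated in the text (the function name is hypothetical):

```python
def layer_conductivities(transmissivity_m2_s, thickness_m, anisotropy=10.0):
    """Return (horizontal, vertical) hydraulic conductivity in m/s."""
    horizontal = transmissivity_m2_s / thickness_m
    vertical = horizontal / anisotropy
    return horizontal, vertical


quaternary = layer_conductivities(2e-3, 15.0)
miocene = layer_conductivities(3e-3, 15.0)
print(f"Quaternary: Kh = {quaternary[0]:.1e} m/s, Kv = {quaternary[1]:.1e} m/s")
print(f"Miocene:    Kh = {miocene[0]:.1e} m/s, Kv = {miocene[1]:.1e} m/s")
```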

EC (1982) provides information on groundwater<br />

abstraction on a commune (local administrative unit)<br />

basis. The Miocene sediments are described as suitable<br />

for drinking water supply, so it is assumed that<br />

all groundwater abstractions are made from these<br />

sediments that are the lower layer in the model. The<br />

total abstraction is given as $13 \times 10^{6}$ m$^3$/year. The<br />

exact location of the individual water supply wells<br />

is not given in EC (1982), so the abstraction has been evenly distributed<br />

among 10–20 model grids located along the river<br />

system.<br />

The location of the reduction front in the aquifer is<br />

an important parameter for nitrate conditions. As<br />

percolation water containing nitrate moves into<br />

areas with reduced geochemical conditions the nitrate<br />

will disappear. No information on this important parameter<br />

was provided in EC (1982). It was assumed that<br />

the front separating oxic and reduced aquifer conditions<br />

all over the aquifer is located in the Miocene<br />

sediments, 3 m below the interface to the Quaternary<br />

sediments. This corresponds to a location 18 m below<br />

the terrain surface.<br />

3.2.4. Hydrometeorology<br />

Time series of daily precipitation and temperature<br />

based on standard meteorological stations within the<br />

catchment were used. In addition, monthly values of<br />

potential evapotranspiration were calculated by the<br />

Makkink equation on the basis of climate data from



the synoptic station at Karup airport. The data from<br />

synoptic stations are generally easily available internationally.<br />
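For readers unfamiliar with the Makkink approach, the sketch below shows one common radiation-based form of the equation. The text does not state which variant or coefficient was used in the study, so the coefficient of 0.65, the psychrometric constant, the latent heat value and the saturation vapour pressure formula are all assumptions for illustration, as are the function names and the input values.

```python
import math

LATENT_HEAT_MJ_KG = 2.45       # latent heat of vaporisation (assumed constant)
PSYCHROMETRIC_KPA_C = 0.066    # psychrometric constant near sea level (assumed)


def saturation_slope(temp_c):
    """Slope of the saturation vapour pressure curve (kPa per deg C)."""
    saturation_pressure = 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))
    return 4098.0 * saturation_pressure / (temp_c + 237.3) ** 2


def makkink_pet(temp_c, global_radiation_mj_m2_day, coefficient=0.65):
    """Potential evapotranspiration in mm/day from temperature and radiation."""
    delta = saturation_slope(temp_c)
    weighting = delta / (delta + PSYCHROMETRIC_KPA_C)
    # 1 kg of water per m2 corresponds to a depth of 1 mm.
    return coefficient * weighting * global_radiation_mj_m2_day / LATENT_HEAT_MJ_KG


# Hypothetical summer day: 15 deg C and 20 MJ/m2/day of global radiation.
print(f"{makkink_pet(15.0, 20.0):.1f} mm/day")
```

Monthly values would be obtained by applying the function to daily or monthly mean inputs and summing over the days of the month.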

3.2.5. Crop growth, evapotranspiration and nitrate<br />

leaching model<br />

Distributions of crop types and livestock densities<br />

were obtained from Agricultural Statistics (1995) and<br />

converted to slurry production using standard values<br />

for nitrogen content. Based on typical crop rotations<br />

proposed by The Danish Agricultural Advisory<br />

Centre and the constraints offered by crop distribution<br />

and livestock density two cattle farm rotations, one<br />

pig farm rotation and one arable farm rotation were<br />

constructed. In order to capture the effect of the interaction<br />

between weather conditions and crops, simulations<br />

were performed in such a way that each crop at<br />

its particular position in the considered rotation<br />

occurred exactly once in each of the years, which<br />

resulted in a total of 17 crop rotation schemes.<br />

These 17 schemes were distributed randomly over<br />

the area in such a way that the statistical distribution<br />

was in accordance with the agricultural statistics.<br />
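The construction of the schemes can be illustrated as follows: each rotation is shifted in time so that every crop occurs once in every year, and the resulting schemes are then spread randomly over the agricultural grid cells according to prescribed shares. The rotations, crop names, shares and number of cells below are hypothetical placeholders (chosen only so that the rotation lengths add up to 17); the actual rotations are not listed in the text.

```python
import random


def phase_shifted_schemes(rotation):
    """One scheme per starting position, so each crop occurs in every year."""
    return [rotation[i:] + rotation[:i] for i in range(len(rotation))]


def assign_schemes(n_cells, share_by_rotation, schemes_by_rotation, seed=0):
    """Randomly place (rotation, scheme index) pairs, preserving the shares."""
    assignment = []
    for name, share in share_by_rotation.items():
        n_schemes = len(schemes_by_rotation[name])
        for i in range(round(share * n_cells)):
            # Cycle through the shifted schemes so all are equally represented.
            assignment.append((name, i % n_schemes))
    random.Random(seed).shuffle(assignment)
    return assignment


# Hypothetical rotations; lengths 5 + 4 + 4 + 4 give 17 schemes in total.
rotations = {
    "cattle_1": ["barley", "grass", "grass", "beet", "wheat"],
    "cattle_2": ["barley", "grass", "maize", "wheat"],
    "pig": ["wheat", "barley", "rape", "barley"],
    "arable": ["wheat", "wheat", "peas", "barley"],
}
schemes = {name: phase_shifted_schemes(rot) for name, rot in rotations.items()}
shares = {"cattle_1": 0.25, "cattle_2": 0.25, "pig": 0.3, "arable": 0.2}
cells = assign_schemes(360, shares, schemes)
print(sum(len(s) for s in schemes.values()), "schemes on", len(cells), "cells")
```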

To simulate the trend in the nitrate concentrations<br />

in the groundwater and in the streams, it is<br />

necessary to have information on the history of<br />

the fertiliser application in space and time. In<br />

Denmark, norms and regulations for fertilisation<br />

practice are defined (Plantedirektoratet, 1996)<br />

which regulate the maximum amount of nutrients<br />

allowed for a particular crop depending on forefruit<br />

and soil type, and in addition, provide norms<br />

for the lower limit of nitrogen utilisation for<br />

organic fertilisers. It was assumed that the farmers<br />

follow the statutory norms, and that the proportion<br />

of organic fertiliser to the individual crop in a<br />

rotation is proportional to the production of<br />

organic fertiliser in the rotation and to the relative<br />

nitrogen demand of the crop (the fertiliser norm of<br />

the particular crop in relation to the fertiliser norm<br />

of the rotation). Based on estimated application<br />

rates of organic and mineral fertilisers to the individual<br />

crops each year, the Daisy model simulated<br />

time series of nitrate leaching from the root zone<br />

for each agricultural grid. The MIKE SHE model<br />

then routed these fluxes further through the<br />

unsaturated zone and in the groundwater layers<br />

accounting for dispersion and dilution processes<br />

and finally into the Karup stream where the integrated<br />

load from the entire catchment was estimated.<br />
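The allocation rule for organic fertiliser described above can be sketched in a few lines. The crop norms, the amount of organic nitrogen and the utilisation fraction below are hypothetical, and the mineral top-up (norm minus the utilised part of the organic nitrogen) is an assumed reading of how the norms are met, not a rule quoted from the regulations.

```python
def allocate_fertiliser(crop_norms, organic_n_total, utilisation=0.5):
    """Split organic N between crops in proportion to their fertiliser norms.

    Returns {crop: (organic N, mineral N)} in the same units as the inputs.
    """
    rotation_norm = sum(crop_norms.values())
    plan = {}
    for crop, norm in crop_norms.items():
        organic = organic_n_total * norm / rotation_norm
        # Assumed top-up: mineral N covers what the utilised organic N does not.
        mineral = max(0.0, norm - utilisation * organic)
        plan[crop] = (organic, mineral)
    return plan


# Hypothetical norms (kg N/ha) for a three-crop rotation.
norms = {"spring barley": 120.0, "winter wheat": 180.0, "grass": 300.0}
plan = allocate_fertiliser(norms, organic_n_total=240.0)
for crop, (organic, mineral) in plan.items():
    print(f"{crop}: organic {organic:.0f}, mineral {mineral:.0f} kg N/ha")
```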

The parameterisation of the Daisy model is<br />

adopted from previous studies. The basic<br />

processes and standard parameter values were<br />

originally assessed from results of Danish agricultural<br />

field experiments (Hansen et al., 1990). Since<br />

then the process description and standard parameters<br />

have only been subject to minor modifications<br />

in connection with model tests against data<br />

from The Netherlands, Germany, Denmark and<br />

Slovakia (Hansen et al., 1991; Jensen et al., 1994,<br />

1996, 1997; Svendsen et al., 1995). Hence, the<br />

parameters related both to evapotranspiration/<br />

water balance processes and to the nitrogen transformation<br />

processes have, except for the soil parameters<br />

described in Section 3.2.2, been taken as<br />

the standard values. More details on the parameter<br />

values, their assessed uncertainties and results<br />

from the Daisy simulations are provided in<br />

Hansen et al. (1999).<br />

3.2.6. Boundary and initial conditions<br />

In addition to precipitation and groundwater<br />

abstraction rates the following boundary conditions<br />

are used:<br />

• The area included in the catchment is per definition<br />

a hydrological catchment as based on topography.<br />

Thus a zero-flux boundary is used along the catchment<br />

boundaries, also for the aquifer layers. The<br />

bottom of the model is considered impermeable.<br />

• For all upstream river ends a zero-flux boundary<br />

condition is applied. For the downstream end, a<br />

constant water level was applied.<br />

The most important initial conditions are the moisture<br />

content in the unsaturated zone and the elevation<br />

of the groundwater table. The initial soil moisture<br />

content was assumed equal to field capacity, while<br />

the initial groundwater tables were assumed equal to<br />

the groundwater tables after a seven-year simulation<br />

period with guessed initial conditions. The model was<br />

run for seven years (1987–1993). In order to reduce<br />

the importance of uncertain initial conditions, the two<br />

first years were considered as a ‘warming-up period’<br />

and the last five years were considered the simulation<br />

period.



Table 2<br />

Water balance in mm/year for the Karup catchment at station 20.05 Hagebro (518 km$^2$)<br />

Year     Precipitation  River flow<br />
                        Model 1 km grid  Model 2 km grid  Model 4 km grid  Observed<br />
1989     812            428              392              353              460<br />
1990     1020           496              518              512              476<br />
1991     863            446              441              424              449<br />
1992     892            499              531              527              437<br />
1993     835            434              425              405              432<br />
Average  884            460              461              444              451<br />

3.3. Odense model<br />

The same procedure as outlined above for the<br />

Karup model was followed. The two main differences<br />

as compared to the Karup catchment are<br />

that the topsoil belongs to more fine-textured<br />

classes with lower hydraulic conductivities and<br />

that the aquifer having groundwater abstraction<br />

is confined in the Odense catchment. This results<br />

in an assumption that the covering sediments are<br />

less permeable than the aquifer material. As no<br />

direct information on these confining sediments<br />

is given in EC (1982) the hydraulic properties of<br />

the soil in the root zone are assumed valid. This<br />

implies in practice that recharge rates to the<br />

aquifer are lower than in the Karup catchment<br />

and that the horizontal flow towards the drains<br />

and the river system is correspondingly larger. A<br />

similar geological geometry as in the Karup<br />

catchment is assumed, i.e. the upper less<br />

permeable, confining layer is assumed to have a<br />

thickness of 15 m and the reduction front is<br />

assumed to be located in the lower aquifer, 3 m<br />

below this confining layer.<br />

4. Results<br />

To test the model performance a number of validation<br />

tests were carried out for both catchments. Validation<br />

is here defined as substantiation that a site<br />

specific model performs simulations at a satisfactory<br />

level of accuracy. Hence, no universal validity of the<br />

general model code is tested nor claimed. In Tables 2<br />

and 3 and Figs. 5–8 results are shown for model grid<br />

sizes 1, 2 and 4 km and for the Karup catchment additionally<br />

for both the distributed and uniform upscaling<br />

procedures. The validation tests described below<br />

consider only the 1 km grid model runs, while the remaining<br />

results are discussed further below in the section<br />

dealing with scaling effects.<br />

4.1. Karup catchment<br />

The Karup model (1 km grid) was validated by<br />

comparison of model simulations and field data on<br />

the following aspects:<br />

• Annual water balances. Table 2 shows the annual<br />

water balances for the five-year simulation period<br />

together with the observed annual discharge. The<br />

Table 3<br />

Water balance in mm/year for the Odense catchment at station 45.21 Ejby Mølle (536 km$^2$)<br />

Year     Precipitation  River flow<br />
                        Model 1 km grid  Model 2 km grid  Model 4 km grid  Observed<br />
1989     649            220              177              187              181<br />
1990     943            349              351              394              299<br />
1991     760            312              291              308              265<br />
1992     770            308              306              332              243<br />
1993     906            334              329              353              306<br />
Average  805            305              291              315              259<br />



Fig. 5. Comparison of the recorded discharge hydrograph for the Karup catchment with simulations based on 1, 2 and 4 km grids. The two<br />

simulated curves correspond to the combined upscaling/aggregation procedure (Distributed) and the simpler upscaling procedure (Uniform).<br />

simulated and observed hydrographs are shown in<br />

Fig. 5.<br />

• Nitrate concentrations in the upper groundwater<br />

layer. Simulated values are compared to observed<br />

values from 35 wells in terms of statistical distributions<br />

over the aquifer (Fig. 6).<br />

The main findings from these validation tests can be<br />

summarised as follows:<br />

• The annual water balance is simulated remarkably<br />

well. Thus the simulated and recorded flows, which<br />

also reflect the annual groundwater recharges in<br />

this area, differ by only 2% as average values over<br />

the five year simulation period (Table 2).<br />

• The variation of the river runoff over the year is<br />

relatively well described, although not nearly as<br />

well as the long term average water balance<br />

(Fig. 5). The model generally underestimates the<br />

runoff in the summer periods (low flows) and overestimates<br />

the winter flow. There may be many<br />

reasons for this. The most important is probably<br />

that the observed groundwater levels and dynamics<br />

are poorly reproduced by the model. The runoff<br />

from the Karup catchment is dominated by drainage<br />

flow and baseflow components. Thus a good<br />

simulation of groundwater levels and dynamics is<br />

required in order to produce a good runoff simulation.<br />

An improved simulation of groundwater<br />

levels and dynamics requires that the model<br />

includes, in particular, spatial variations of the<br />

transmissivity of the aquifer, which is not possible<br />

based on the available input data.<br />

• The nitrate concentrations simulated by the model<br />

are seen to match the observed data remarkably<br />

well, both with respect to average concentrations<br />

and statistical distribution of concentrations within<br />

the catchment. It may be noticed that the critical<br />

NO 3 concentration level of 50 mg/l (maximum<br />

admissible concentration according to drinking<br />

water standards) is exceeded in about 60% of the<br />

area.<br />
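The water balance deviation quoted in the first finding can be reproduced from the annual values in Table 2. The sketch below assumes that the last numeric column of the table holds the observed river flow and the three preceding columns hold the 1, 2 and 4 km model runs, a reading that is consistent with the 2% deviation stated for the 1 km grid. The variable and function names are illustrative.

```python
# Annual river flow in mm/year, 1989-1993, for station 20.05 Hagebro.
simulated = {
    "1 km": [428, 496, 446, 499, 434],
    "2 km": [392, 518, 441, 531, 425],
    "4 km": [353, 512, 424, 527, 405],
}
observed = [460, 476, 449, 437, 432]


def mean(values):
    return sum(values) / len(values)


def relative_deviation(sim, obs):
    """Deviation of the simulated mean from the observed mean, as a fraction."""
    return (mean(sim) - mean(obs)) / mean(obs)


deviations = {grid: relative_deviation(flows, observed)
              for grid, flows in simulated.items()}
for grid, deviation in deviations.items():
    print(f"{grid} grid: {deviation * 100:+.1f}% relative to observed")
```

The same five-year averaging hides larger deviations in individual years, for example 1992, which is why the hydrograph comparison is reported separately.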

4.2. Odense catchment<br />

The Odense model (1 km grid) was validated by



Fig. 6. Comparison of the statistical distribution of nitrate concentrations in groundwater for the Karup catchment predicted by the model with<br />

1, 2 and 4 km grids and observed in 35 wells. The upper figure corresponds to the upscaling/aggregation procedure resulting in a distributed<br />

representation of agricultural crops, while the lower figure is from the run with the upscaling procedure, where all the agricultural area is<br />

represented by one uniform crop.<br />

comparison of model simulations and field data on the<br />

following aspects:<br />

• Annual water balances. Table 3 shows the annual<br />

water balances for the five-year simulation period<br />

together with the observed annual discharge. The<br />

simulated and observed hydrographs are shown in<br />

Fig. 7.<br />

• Nitrate concentrations in the upper groundwater<br />

layer. Simulated values are compared<br />

to observed values from 42 wells in terms<br />

of statistical distributions over the aquifer<br />

(Fig. 8).



Fig. 7. Discharge hydrographs for Odense catchment simulated with 1, 2 and 4 km grids.<br />

The main findings from these validation tests are:<br />

• The annual water balance is simulated reasonably<br />

well, although not with the same accuracy as for<br />

the Karup catchment. Thus the simulated and<br />

recorded flows differ by 18% for the 1 km grid<br />

model as average values over the five year simulation<br />

period (Table 3). A comparison with another<br />

model study for this area reveals that one of the<br />

reasons for this deviation is uncertainties (errors) in<br />

the catchment delineation in the flat downstream<br />

part of the catchment. Another reason may be that<br />

Fig. 8. Comparison of the statistical distribution of nitrate concentrations in groundwater for the Odense catchment predicted by the model with<br />

1, 2 and 4 km grids and observed in 35 wells.



the soil hydraulic conductivity functions and the<br />

soil water retention curves that significantly affect<br />

the evapotranspiration are not very accurately<br />

determined. These inaccuracies may originate<br />

either from non-representative soil texture data in<br />

the 1 km × 1 km GISCO database or by errors<br />

introduced by use of the pedo-transfer functions.<br />

• The variation of the river runoff over the year is<br />

relatively well described, although the winter<br />

peaks are simulated too small and the summer<br />

low flows too high, reflecting that some of the<br />

internal hydrological processes may not be simulated<br />

correctly.<br />

• The distribution of groundwater concentrations by<br />

the end of the simulation period is seen not to<br />

compare very well to the observations from 42<br />

wells. Thus, in 80% of the observation wells no<br />

nitrate was found, whereas the model simulates<br />

zero concentration in only 25% of the area. With<br />

respect to the critical concentration value of 50 mg/<br />

l, the observations indicate that such high concentrations<br />

are not found in the area, while the model<br />

simulates such concentrations to exist in about 5%<br />

of the catchment area. The main reason for this<br />

disagreement is most likely that in reality the<br />

nitrate is in most of the area reduced (disappears)<br />

in the confining sediments overlaying the aquifer.<br />

This is not simulated by the model, because the<br />

reduction front was assumed to be located within<br />

the aquifer, while analysis of local geological data<br />

reveals that it in reality is located in the upper<br />

confining layer over most of the aquifer.<br />

• It is noticed that the nitrate concentrations are<br />

significantly lower in the Odense catchment than<br />

in the Karup catchment, both the observed and the<br />

simulated values. The main reason for this is that<br />

the different soil properties and the smaller number of<br />

animals result in a lower nitrate leaching from the<br />

root zone in the Odense catchment.<br />
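The area fractions used in these comparisons (share of the catchment with zero nitrate, share above the 50 mg/l limit) are simple statistics of the simulated concentration per grid cell. The sketch below shows the calculation on made-up concentrations; the 20 values are hypothetical and were chosen only so that the result mimics the simulated Odense fractions quoted above.

```python
CRITICAL_MG_L = 50.0


def area_fraction(concentrations, condition):
    """Fraction of equally sized grid cells whose value meets the condition."""
    return sum(1 for c in concentrations if condition(c)) / len(concentrations)


# Hypothetical simulated nitrate concentrations (mg/l), one per grid cell.
simulated_mg_l = [0.0] * 5 + [3, 6, 9, 12, 15, 18, 21, 24, 27, 30,
                              34, 38, 42, 46, 58]

zero_fraction = area_fraction(simulated_mg_l, lambda c: c == 0.0)
exceed_fraction = area_fraction(simulated_mg_l, lambda c: c > CRITICAL_MG_L)
print(f"zero nitrate: {zero_fraction:.0%}, above limit: {exceed_fraction:.0%}")
```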

4.3. Scaling effects<br />

The results of running the Karup and Odense<br />

models with different computational grid sizes, 1, 2<br />

and 4 km, appear from Tables 2 and 3 for annual water<br />

balances and Figs. 5 and 7 for discharge hydrographs.<br />

Further, the results in terms of groundwater<br />

concentrations are shown in Figs. 6 and 8. From<br />

these results the following findings appear:<br />

• The simulated annual runoff is almost identical and<br />

thus independent of grid sizes. A reason for some<br />

of the small differences is that the catchment areas<br />

in the 1, 2 and 4 km models are not quite identical.<br />

Thus, the root zone processes responsible for<br />

generating the evapotranspiration and consequently<br />

the runoff do not appear to be scale<br />

dependent as long as the statistical properties of<br />

the soil and vegetation types are preserved, which<br />

is the case with the upscaling/aggregation procedure<br />

used in this case.<br />

• The hydrograph shape differs significantly for the<br />

three grid sizes. For the Karup model, the simulation<br />

with 1 km grid reproduces the low flow conditions<br />

reasonably well, whereas the 2 and 4 km grids<br />

have a rather poor description of the baseflow<br />

recession in general and the low flow conditions<br />

in particular. For the Odense model, the simulation<br />

with the 1 km grid shows too large baseflows<br />

during the low flow season, while the 2 km grid<br />

model has the right level and the 4 km grid<br />

model simulates less low flow than observed.<br />

This indicates that there are significant scale effects<br />

on the stream–aquifer interaction that are not properly<br />

described in the present upscaling/aggregation<br />

procedure.<br />

• The nitrate concentrations in the groundwater are<br />

not clearly influenced by the grid size for the<br />

Karup catchment, while there appears to be some<br />

effect for the Odense catchment. The reason for<br />

this difference is related to the different hydrogeological<br />

situations in the two catchments. In the<br />

Karup catchment the groundwater table is generally<br />

located a couple of meters below terrain<br />

surface and the horizontal flows take place in<br />

both the Quaternary and the Miocene sediments.<br />

Hence, for all three grid sizes (1, 2 and 4 km), the<br />

main part of the horizontal groundwater flow takes<br />

place in the about 15 m of the aquifer located<br />

above the reduction front, and only a relatively<br />

small part of the flow lines are crossing the reduction<br />

front, below which the nitrate disappears. In<br />

the Odense catchment, the horizontal groundwater<br />

flows take place almost exclusively in the lower<br />

aquifer, of which only the upper 3 m is located



above the reduction front. This implies that a large<br />

part of the groundwater flow is crossing the reduction<br />

front on its route from the infiltration zones in<br />

the hilly areas towards the discharge zones near the<br />

river. As the size of the grid influences the smoothness<br />

of the aquifer geometry, the grid size will<br />

significantly influence the number of flow lines<br />

crossing the reduction front and hence the nitrate<br />

concentrations. Such scaling effect on geological<br />

conditions is not accounted for in the present<br />

upscaling/aggregation procedure.<br />
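The smoothing effect of grid size on geometry mentioned in the last finding can be illustrated by block-averaging a fine grid to coarser resolutions. This is a generic sketch, not the procedure used to build the models: the function name is hypothetical and the 4 x 4 elevation grid is invented for the example.

```python
def block_mean(grid, factor):
    """Coarsen a 2-D grid by averaging non-overlapping factor x factor blocks."""
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of the factor")
    return [
        [
            sum(grid[r + i][c + j]
                for i in range(factor) for j in range(factor)) / factor ** 2
            for c in range(0, cols, factor)
        ]
        for r in range(0, rows, factor)
    ]


# Hypothetical elevations (m) of a layer surface on a 1 km grid.
grid_1km = [
    [10, 14, 18, 30],
    [12, 16, 22, 34],
    [8, 12, 20, 28],
    [6, 10, 16, 24],
]
grid_2km = block_mean(grid_1km, 2)
grid_4km = block_mean(grid_1km, 4)
print(grid_2km)  # range narrows from 6-34 m to 9-26 m
print(grid_4km)  # a single cell at the overall mean
```

As the range of elevations narrows with each coarsening step, local highs and lows of a surface such as the reduction front disappear, which changes how many flow paths cross it.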

Further, for evaluating the importance of the<br />

combined upscaling/aggregation method (‘distributed’)<br />

a model run has been carried out for the Karup<br />

catchment with another upscaling method. This alternative<br />

method is based on upscaling of soil/crop types<br />

all the way from point scale to catchment scale. This<br />

implies that all the agricultural area is described by<br />

one representative (‘uniform’) crop instead of the 17<br />

cropping patterns used in the ‘distributed’ method.<br />

This representative crop has been assumed to have<br />

the same characteristics as the dominant crop, namely<br />

winter wheat, and further to be fertilised by the same<br />

total amount of the organic manure as in the other<br />

simulations, supplemented by some mineral fertiliser<br />

up to the nitrate amount prescribed in the norms<br />

defined by Plantedirektoratet (1996).<br />

The results are illustrated in Figs. 5 and 6 by the<br />

legend denoted ‘uniform’. The effects on the<br />

discharge hydrographs (Fig. 5) are seen to be negligible,<br />

indicating that the dominant crop (by chance) has<br />

similar evapotranspiration characteristics as the sum<br />

of the different crops weighted according to their<br />

actual occurrence. The nitrate concentrations in<br />

groundwater (Fig. 6) show some differences in terms<br />

of a lower average concentration and a less smooth<br />

areal distribution as compared to the distributed agricultural<br />

representation. Thus, in case of the ‘uniform’<br />

representation the nitrate concentrations fall in two<br />

main groups. Around 30% of the area, corresponding<br />

to the natural areas with no nitrate leaching, has<br />

concentrations between 0 and 20 mg/l, while the<br />

remaining 70%, corresponding to the agricultural<br />

area with the ‘uniform’ crop, has concentrations<br />

between 70 and 90 mg/l. In the ‘distributed’ agricultural<br />

representation the areal distribution curve is<br />

much smoother in accordance with the measured data.<br />

5. Discussion and conclusions<br />

Two prerequisites are required for performing large<br />

scale simulations of nitrate leaching on an operational<br />

basis: firstly access to readily available global (or in<br />

the present case European) databases, and secondly an<br />

adequate scaling enabling suitable models to be<br />

applied at a larger scale than the field scales for<br />

which they usually have been proven valid. A key<br />

challenge as compared to the experiences reported<br />

in the literature is then how to make use of the physically<br />

based model at large scale without possibility for<br />

detailed calibration at that scale, when we know that<br />

its physically based equations are developed for small<br />

scales. Such a model can only be regarded as well proven<br />

for small scales, and the few attempts made so far to<br />

use it on scales above 1000 km$^2$ have applied calibration<br />

at that scale (Refsgaard et al., 1992, 1998b; Jain et<br />

al., 1992).<br />

5.1. Data availability<br />

From the experiences gathered and the lessons<br />

learnt with regard to availability of European data<br />

bases the following conclusions can be drawn:<br />

• Not all of the existing “European” databases are<br />

generally applicable due to various restrictions<br />

(e.g. copyright, not open to other projects, pointers<br />

only).<br />

• Not all databases maintained by international institutions<br />

contain harmonised and integrated data<br />

sets. Many databases in fact only contain a collection<br />

of national data sets that are neither integrated<br />

in one seamless data set, nor harmonised in their<br />

contents or nomenclatures.<br />

• Not all input data requirements could be satisfied<br />

from GIS (spatial) data sets, so tables and paper<br />

maps are needed to supplement the information.<br />

However, the available data are often too coarse<br />

in scale (e.g. EU statistics at a higher administrative<br />

unit than needed) or too specific (e.g. transfer<br />

functions for natural soils only but not for agricultural<br />

soils).<br />

• Use of national data sets is to some extent necessary,<br />

with restrictions to data quality and origin.<br />

• The search for data sets could have been largely<br />

improved by the existence of a European spatial



data clearinghouse and the association of the<br />

available data sets with meta information.<br />

It is noted that, in spite of comprehensive efforts<br />

made during recent years for assessing spatial data<br />

by use of advanced remote sensing technology, the<br />

only data in the “European” databases which<br />

originate from remote sensing data are the<br />

CORINE land cover data, which were useful for<br />

distinguishing between natural, urban and agricultural<br />

areas, but which did not contain any further<br />

information about agricultural crops of importance in<br />

the present context.<br />

In spite of the above limitations, the attempts in<br />

the present study to identify suitable data sources<br />

at the European scale have shown that useful data<br />

are available at that scale for most of the required<br />

model input data. Although these data require<br />

some kind of transformation, e.g. through pedo-transfer<br />

functions, the data appear adequate for overall<br />

model simulations at this scale. However, some<br />

gaps exist in the European level databases. Thus,<br />

for the following data it was necessary to use<br />

national data sources:<br />

• Meteorological data on a daily basis.<br />

• Soil organic matter from arable land.<br />

• Agricultural statistics.<br />

• Agricultural practices.<br />

These data were all easily available at a national<br />

scale, and hence their availability is not expected to<br />

pose significant constraints for large scale modelling<br />

in other parts of Europe.<br />

The most critical data that may cause problems in<br />

terms of availability at larger scale are the geological<br />

data, for which no global (or European) digital database<br />

apparently exists. The present case study relied<br />

heavily on an EC report produced by the Danish<br />

Geological Survey. The information in this report<br />

proved adequate for the present purpose, although<br />

the lack of geochemical information turned out to<br />

have some importance for one of the two catchments.<br />

Similar readily available EC reports exist for other<br />

countries, but they appear to be non-standardised<br />

and comprise information at a variable level of detail.<br />

Hence, the positive conclusions from using the geological<br />

data in EC (1982) for Denmark cannot<br />

necessarily be generalised.<br />

5.2. Parameter assessment—no calibration<br />

An important element of the present methodology<br />

is the principle not to carry out any calibration. The<br />

parameter values were assessed in three different<br />

ways:<br />

• Directly from the available data, e.g. topography<br />

and geology.<br />

• Indirectly from the available data through application<br />

of predefined transfer functions, e.g. the soil<br />

hydraulic parameters.<br />

• Use of standard parameter values that have been<br />

assessed in previous studies on other locations.<br />

While the first two methods can be characterised as<br />

fully objective and transparent, it may be argued that<br />

there always will be some elements of subjective<br />

assessment hidden in the use of standard parameter<br />

values and that the possible calibration exercises in<br />

previous studies may question the “no calibration”<br />

statement.<br />

In the present case the standard parameters originate<br />

from two model codes and associated accumulated<br />

experiences:<br />

• Parameters in the MIKE SHE part. The standard<br />

parameter used here is the time constant for routing<br />

of groundwater to drains (50 days). From comprehensive<br />

hydrological modelling experience on<br />

dozens of Danish catchments starting with<br />

Refsgaard and Hansen (1982) this value can be<br />

characterised as a typical value. It is not the optimal<br />

value that would be estimated in a calibration<br />

for either of the two catchments. Thus, for<br />

instance, the calibrated value for Karup was estimated<br />

in Refsgaard (1997) at 33 days.<br />

• Parameters in the Daisy part. The standard parameters<br />

used here are the ones controlling the vegetation<br />

part of the evapotranspiration and the<br />

nitrogen turnover processes in the root zone.<br />

These parameters are essential both for the water<br />

balance and the nitrogen concentrations. Daisy<br />

has standard parameters that can be used if no calibration<br />

is possible (or desirable). These standard<br />

parameter values have originally been assessed<br />

from agricultural field experiments on plot scales<br />

(Hansen et al., 1990). Since then, the process descriptions<br />

and associated standard parameter values



have only been subject to minor adjustments<br />

through a number of additional tests on new data<br />

sets from different countries. It should be emphasised<br />

that Daisy has not previously been calibrated<br />

on the Karup and Odense catchments. These two<br />

catchments, and in particular the Karup catchment,<br />

have been subject to modelling studies which have<br />

included calibration of the water balance (evapotranspiration)<br />

parameters. However, in the<br />

previous studies of the Karup catchment (Styczen<br />

and Storm, 1993; Refsgaard, 1997) the water<br />

balance in the root zone was simulated by MIKE<br />

SHE, which is not the case in the present study. As<br />

the process descriptions for evapotranspiration in<br />

MIKE SHE and Daisy are fundamentally different,<br />

the Daisy standard parameters used in the present<br />

study have not been affected at all by the previous<br />

MIKE SHE studies in the same catchment.<br />
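
To illustrate what the difference between the standard drain time constant (50 days) and the value calibrated for Karup (33 days) implies, the sketch below treats drain routing as a single linear reservoir without recharge.<br />
The linear-reservoir form, the function names and the initial storage of 100 mm are assumptions made here for illustration; they are not taken from the MIKE SHE code or from the study.<br />

```python
import math


def drain_outflow(storage_mm: float, time_constant_days: float, t_days: float) -> float:
    """Outflow (mm/day) at time t from a linear reservoir with no recharge.

    Assumes Q = S / k, so that storage decays as S(t) = S0 * exp(-t / k).
    """
    return storage_mm / time_constant_days * math.exp(-t_days / time_constant_days)


def half_life_days(time_constant_days: float) -> float:
    """Time (days) for the outflow to fall to half of its initial value."""
    return time_constant_days * math.log(2.0)


if __name__ == "__main__":
    initial_storage_mm = 100.0  # hypothetical initial storage
    for k in (50.0, 33.0):  # standard value vs. value calibrated for Karup
        q0 = drain_outflow(initial_storage_mm, k, 0.0)
        q60 = drain_outflow(initial_storage_mm, k, 60.0)
        print(f"k = {k:.0f} d: half-life = {half_life_days(k):.1f} d, "
              f"Q(0) = {q0:.2f} mm/d, Q(60) = {q60:.2f} mm/d")
```

Under this assumption the total drained volume is the same for both time constants, while the recession is faster for the shorter one. This is consistent with the observation that annual runoff is less sensitive to this parameter than the hydrograph shape.<br />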

Thus, although it may correctly be argued that the<br />

standard model parameters are results of previous<br />

studies where calibration was carried out, the specific<br />

parameters used in the present study have not been<br />

subject to, and are not results of, calibration in either<br />

the Karup or the Odense catchment.<br />

In our opinion, one of the strengths of physically<br />

based models is the possibility to assess many parameter<br />

values from standard values, achieved from<br />

experience through a number of other applications.<br />

We think that the results of the present study show<br />

both this strength and some of the limitations in this<br />

respect. Thus, on the one hand, the key results in terms<br />

of annual runoff and nitrogen concentration distributions<br />

are encouraging, while on the other hand Figs. 5<br />

and 7 clearly illustrate that it would be very easy to<br />

obtain a better hydrograph fit through calibration of a<br />

couple of parameter values.<br />

When parameter values are assessed in this way<br />

they inevitably are subject to considerable uncertainty,<br />

which again will generate significant uncertainty<br />

in model results. It is therefore highly relevant<br />

to conduct uncertainty analyses in order to assess<br />

whether the resulting uncertainty becomes so large<br />

that the model results are not of any use for water<br />

management in practice. A methodology and some<br />

results of such uncertainty analyses are provided in<br />

Hansen et al. (1999) for the root zone processes and in<br />

Refsgaard et al. (1998a) for the catchment processes.<br />
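
The general idea of propagating parameter uncertainty to model results can be sketched with a simple Monte Carlo loop.<br />
This is a generic illustration and not the methodology of the cited papers: the toy model, the parameter distributions and all numbers are hypothetical stand-ins for a real model run.<br />

```python
import math
import random
import statistics


def toy_leaching(mineralisation_kg_ha: float, uptake_fraction: float) -> float:
    """Hypothetical stand-in for one model run (kg N/ha/yr leached)."""
    return mineralisation_kg_ha * (1.0 - uptake_fraction)


def propagate(n_runs: int, seed: int = 42) -> list:
    """Sample the uncertain parameters and collect the output of each run."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_runs):
        # Invented distributions: lognormal mineralisation, uniform crop uptake.
        mineralisation = rng.lognormvariate(math.log(120.0), 0.3)
        uptake = rng.uniform(0.5, 0.8)
        outputs.append(toy_leaching(mineralisation, uptake))
    return outputs


def summarise(outputs: list) -> dict:
    """Median and 5th/95th percentiles of the simulated outputs."""
    cuts = statistics.quantiles(outputs, n=20)  # 19 cut points at 5 % steps
    return {"p05": cuts[0], "median": statistics.median(outputs), "p95": cuts[-1]}


if __name__ == "__main__":
    print(summarise(propagate(1000)))
```

The width of the interval between the 5th and 95th percentiles, relative to the median, gives one possible measure of whether the results remain useful for management purposes.<br />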

5.3. Upscaling<br />

The adopted upscaling methodology is a combination<br />

of upscaling and aggregation. Hence, upscaling in<br />

its traditional definition (Beven, 1995) is used only<br />

from point scale to field scale, where the same equations<br />

are assumed valid and where ‘effective’ parameter<br />

values are used. The parameter values<br />

estimated through pedo-transfer functions (soil data)<br />

and the vegetation parameters representing the different<br />

crops are assumed valid at field scale. Subsequently,<br />

an aggregation procedure is used to<br />

represent catchment scale conditions with regard to<br />

soil and vegetation types. This aggregation procedure<br />

is in full agreement with the findings made regarding<br />

the apparent existence of a threshold area (REA)<br />

above which “… spatial patterns of dominant process<br />

controls can be represented by their statistical distribution<br />

functions” (Famiglietti and Wood, 1995).<br />

This theoretical consideration is supported empirically<br />

by the model results, which show that the annual<br />

catchment runoff can be simulated well, even when<br />

using different model grid sizes. For the Karup catchment,<br />

where the nitrate reduction in the aquifer does<br />

not appear to have influenced the results adversely,<br />

even the statistical distribution of nitrate concentrations<br />

is simulated well.<br />

For simulation of annual runoff and nitrate concentration<br />

distributions, both of which are affected<br />

primarily by root zone processes, the impact of<br />

changes of scale is thus relatively small. In contrast<br />

to this, the impact on hydrograph shape is consistently<br />

rather large. This finding, which is also documented<br />

earlier in Refsgaard (1997), indicates that the applied<br />

upscaling/aggregation procedure has important<br />

limitations with regard to describing the stream–aquifer<br />

interactions. Thus, in summary, upscaling of<br />

processes described by vertical, non-correlated, but<br />

patchy, columns is successful, while the upscaling<br />

fails in the case of processes where horizontal flows<br />

between grids dominate. The differences in hydrograph<br />

shapes caused by the differences in grid sizes<br />

illustrate how careful a model user has to be when<br />

changing grid size. In our opinion it is not relevant<br />

to talk about an ‘optimal’ scale for hydrograph simulation.<br />

The important point is rather that the present<br />

methodology is scale dependent with regard to hydrograph<br />

simulation; hence a change of scale (grid size)



generates a need for recalibration of parameters<br />

responsible for baseflow recession and low flow simulation.<br />

An alternative, and commonly used, upscaling<br />

procedure, where upscaling is used all the way from<br />

point scale to catchment scale by selecting the dominant<br />

crop type in each grid, resulted in one uniform<br />

crop representing all the agricultural area. Results<br />

indicate that whereas this uniform upscaling procedure<br />

may be sufficient for simulating annual water<br />

balance and discharge hydrographs, it is not satisfactory<br />

for simulation of nitrate leaching and groundwater<br />

concentrations. This is in agreement with<br />

Beven (1995) who states that upscaling from small<br />

scales to larger scales using effective parameter values<br />

cannot be assumed to be generally adequate.<br />
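
The difference between representing the agricultural area by its dominant crop and preserving the statistical distribution of crops can be shown with a small numerical sketch.<br />
The crop shares, the leachate concentrations per crop and the 50 mg/l limit used below are invented for illustration and are not results of the study.<br />

```python
def dominant_crop(shares: dict) -> str:
    """Crop with the largest area share (the 'uniform' upscaling choice)."""
    return max(shares, key=shares.get)


def area_weighted_mean(shares: dict, conc: dict) -> float:
    """Mean concentration when every crop is represented by its area share."""
    return sum(shares[crop] * conc[crop] for crop in shares)


def area_fraction_above(shares: dict, conc: dict, limit: float) -> float:
    """Fraction of the area whose concentration exceeds the given limit."""
    return sum(share for crop, share in shares.items() if conc[crop] > limit)


# Hypothetical catchment: area shares and nitrate concentrations (mg/l) per crop.
SHARES = {"winter wheat": 0.40, "spring barley": 0.35, "grass": 0.25}
CONC = {"winter wheat": 40.0, "spring barley": 70.0, "grass": 15.0}

if __name__ == "__main__":
    crop = dominant_crop(SHARES)
    print("dominant crop:", crop, "->", CONC[crop], "mg/l everywhere")
    print("area-weighted mean:", area_weighted_mean(SHARES, CONC), "mg/l")
    print("area above 50 mg/l (distribution kept):",
          area_fraction_above(SHARES, CONC, 50.0))
    print("area above 50 mg/l (dominant crop only):",
          area_fraction_above({crop: 1.0}, CONC, 50.0))
```

With a single dominant crop the simulated concentrations have no spread at all, so any part of the area exceeding a limit is lost, whereas the distribution-preserving aggregation retains it.<br />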

An inherent limitation of the applied upscaling/<br />

aggregation method is that it does not preserve the<br />

georeferenced location of simulated concentrations,<br />

but only their statistical distribution over the catchment<br />

area. Therefore, comparisons with field data<br />

make no sense on a well by well or subcatchment<br />

by subcatchment basis, and no information on the<br />

actual location of the simulated “hot spots” within<br />

the catchment can be provided. If, from a management<br />

point of view, a more detailed spatial<br />

resolution of the model predictions is required, then the same<br />

upscaling method has to be carried out at a finer<br />

scale with all the statistical input data being supplied<br />

on a subcatchment basis. This is in principle straightforward,<br />

but in reality it may often be limited by data<br />

availability.<br />

A critical assumption in the upscaling procedure is<br />

the application of the point scale equations at the field<br />

scale with effective parameters. This corresponds to<br />

interpreting the field as a single equivalent soil<br />

column using effective hydraulic parameters. This<br />

approach was evaluated on two Danish experimental<br />

0.25 ha plots, a coarse sandy soil and sandy loam,<br />

using the Daisy model (Djurhuus et al., 1999). The<br />

two plots were monitored with respect to soil water<br />

content and nitrate in soil water at several depths at 57<br />

points, where also texture, soil water retention and<br />

hydraulic conductivity functions had been measured.<br />

The conclusion from comparing the field-measured<br />

data with the model simulations over the experimental<br />

plot, represented by the 57 points, was that the<br />

observed mean nitrate concentrations were matched<br />

well by a simulation using the geometric means as<br />

effective parameters. This conclusion is in agreement<br />

with previous studies for the Danish hydrological regime<br />

(Jensen and Refsgaard, 1991a–c; Jensen and Mantoglou,<br />

1992). Other studies from other regimes (Bresler<br />

and Dagan, 1983) conclude that effective soil hydraulic<br />

parameters are not adequate for modelling water<br />

flow in spatially variable fields. The critical issue<br />

determining whether such an approach is feasible or<br />

not may depend on whether Hortonian overland<br />

flow is created in the hydrological regime in question.<br />

Thus, although the upscaling methodology from point<br />

to field scale is far from universally valid, there are<br />

good reasons to believe that this assumption was satisfactorily<br />

fulfilled in the present case studies.<br />
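
The use of a geometric mean as an effective parameter for a field can be sketched as follows.<br />
The point values below are invented and only stand in for measurements such as saturated hydraulic conductivity; they are not the data from the 57 measurement points.<br />

```python
import math


def geometric_mean(values: list) -> float:
    """Geometric mean, computed as the exponential of the mean logarithm."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs a non-empty list of positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def arithmetic_mean(values: list) -> float:
    """Ordinary arithmetic mean, shown for comparison."""
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical point measurements within one field (e.g. cm/day).
    point_values = [5.0, 12.0, 30.0, 80.0, 200.0]
    print("effective (geometric) value:", round(geometric_mean(point_values), 1))
    print("arithmetic mean:", round(arithmetic_mean(point_values), 1))
```

For strongly skewed parameters the geometric mean is less influenced by a few large values than the arithmetic mean, which is one reason it is often considered as an effective value.<br />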

The spatial patterns, which in subsurface hydrology<br />

are considered to be of significant importance (Wen and<br />

Gómez-Hernández, 1996), have been treated in different<br />

ways with regard to continuous data (parameter<br />

values) and categorical data (soil and vegetation<br />

classes). The effects of spatial autocorrelation of soil<br />

and vegetation parameters within a field have been<br />

assumed incorporated into the ‘effective parameters’,<br />

which in the present case are assessed in a rather crude<br />

way through pedo-transfer functions and use of standard<br />

values. The categorical data have been treated<br />

differently in the aggregation procedure for soil and<br />

vegetation classes. The soil data (one soil type for<br />

Karup and two soil types for Odense) were assessed<br />

from the soil map and assigned at a grid basis so that<br />

the percentage of each soil type within a catchment<br />

was preserved and the individual grids to the largest<br />

possible extent were characterised by the dominant<br />

soil type within the respective grid. For the vegetation<br />

types, the same procedure was applied to initially<br />

distinguish between agricultural and non-agricultural<br />

areas by use of the land cover map. Subsequently, it<br />

was assumed that the spatial distribution of cropping<br />

patterns is random and without spatial autocorrelation.<br />

This is justified by the agricultural management<br />

practice of rotating the crops within the individual<br />

farms.<br />
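
A random allocation of crops to agricultural grids that preserves the catchment-wide crop shares can be sketched as below.<br />
The rounding rule (largest remainder), the function name and the example shares are assumptions made for this illustration; the study itself does not prescribe this particular implementation.<br />

```python
import random


def allocate_crops(n_cells: int, shares: dict, seed: int = 0) -> list:
    """Assign one crop per cell, at random, while preserving the crop shares.

    Cell counts per crop are rounded with the largest-remainder rule so that
    they always add up to n_cells.
    """
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("crop shares must sum to 1")
    crops = list(shares)
    raw = [shares[c] * n_cells for c in crops]
    counts = [int(r) for r in raw]
    missing = n_cells - sum(counts)
    # Hand the remaining cells to the crops with the largest fractional parts.
    by_remainder = sorted(range(len(crops)), key=lambda i: raw[i] - counts[i], reverse=True)
    for i in by_remainder[:missing]:
        counts[i] += 1
    cells = [crop for crop, k in zip(crops, counts) for _ in range(k)]
    # Shuffle: no spatial autocorrelation between neighbouring cells.
    random.Random(seed).shuffle(cells)
    return cells


if __name__ == "__main__":
    layout = allocate_crops(20, {"winter wheat": 0.5, "spring barley": 0.3, "grass": 0.2})
    print(layout)
```

Because only the shares are preserved, the position of a given crop in the grid carries no geographical meaning, which matches the limitation that simulated concentrations are valid as a statistical distribution only.<br />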

5.4. General applicability of methodology<br />

From the results of the present study it appears that<br />

it is possible to use distributed physically based<br />

models of the same type as the MIKE SHE/Daisy



for catchment scale assessment of nitrate contamination<br />

from agricultural land. It appears obvious that<br />

such a model application is straightforward and that the<br />

above conclusion is valid for other areas in Denmark.<br />

The interesting question is therefore how far this<br />

conclusion can be generalised to other areas in Europe (and on other<br />

continents) and what the scientific and practical<br />

limitations are. In this respect the following considerations<br />

may be noted:<br />

• Except for the geological data, the general availability<br />

of which is somewhat uncertain, there is<br />

no reason to expect that the application of similar<br />

data for other catchments in other European countries<br />

should not be as relatively easy as the application<br />

for the two Danish catchments. Likewise, the<br />

encouraging simulation results of using European<br />

level databases, in spite of their often coarse resolution<br />

and high level of aggregation, may also be<br />

expected for other areas. With regard to geological<br />

data it may be noted that considerable efforts are<br />

being made at most (if not all) national geological<br />

institutes to provide geological data to users in<br />

digital form; hence the limitation on non-easy<br />

data availability existing so far is likely to be overcome,<br />

at least nationally, during the coming years.<br />

• The combined aggregation/upscaling procedure<br />

appears valid in many areas. The catchments for<br />

which it was used in the present study were limited<br />

to a maximum of about 500 km². However,<br />

further upscaling to larger areas poses no fundamental<br />

problems, as it consists of just a larger<br />

number of computational grids. Computationally,<br />

running a model like MIKE SHE/Daisy for an area<br />

of, for instance, 100 000 km² with e.g. 250<br />

subcatchments of 100 grids each is perhaps close<br />

to the limit of what is practically feasible today<br />

(a five-year run would require 100 h of CPU time on<br />

a Pentium 300 MHz), but this problem will soon<br />

disappear as computers become faster.<br />

• The MIKE SHE/Daisy modelling methodology is<br />

general and applicable to many other areas. Some<br />

limitations, however, are related to special geological<br />

conditions such as karstic flow and fissured<br />

aquifers, which cannot be described explicitly.<br />

Another important limitation is related to the<br />

upscaling procedure from point to field scale,<br />

which may fail in areas where Hortonian overland<br />

flow is a dominant mechanism. In this respect it<br />

should be noted that many areas with dominant<br />

overland flow regimes are mountainous regions<br />

characterised by thin soil layers and steep slopes,<br />

which generally are not regions with important<br />

aquifers.<br />
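
The computational figures quoted above can be broken down with simple arithmetic.<br />
The sketch below only rearranges the numbers given in the text; the function name is hypothetical, and the area per grid is a nominal value that assumes the grids together represent the whole area.<br />

```python
def run_cost(area_km2: float, n_subcatchments: int, grids_per_subcatchment: int,
             cpu_hours: float, simulated_years: float) -> dict:
    """Break a quoted CPU requirement down per grid and per simulated year."""
    n_grids = n_subcatchments * grids_per_subcatchment
    return {
        "n_grids": n_grids,
        # Nominal area represented by one grid (assumes full coverage).
        "km2_per_grid": area_km2 / n_grids,
        "cpu_hours_per_year": cpu_hours / simulated_years,
        "cpu_seconds_per_grid_year": cpu_hours * 3600.0 / (n_grids * simulated_years),
    }


if __name__ == "__main__":
    # Figures from the text: 100 000 km2, 250 subcatchments of 100 grids each,
    # 100 h of CPU time for a five-year run.
    for key, value in run_cost(100_000.0, 250, 100, 100.0, 5.0).items():
        print(f"{key}: {value}")
```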

Hence, it may be concluded that the methodology<br />

can relatively easily be applied to larger areas and<br />

used as a decision support tool for evaluation of legislative<br />

and management measures aiming at reducing<br />

nitrate contamination risks.<br />

Acknowledgements<br />

The present work was partly funded by the EC<br />

Environment and Climate Research Programme<br />

(contract number ENV4-CT95-0070). Good ideas<br />

and constructive comments on the manuscript by<br />

Gerard Heuvelink, University of Amsterdam, are<br />

gratefully acknowledged. Further, the constructive criticism<br />

of Marnik Vanclooster, Université Catholique de<br />

Louvain, and an anonymous reviewer is<br />

acknowledged.<br />

References<br />

Abbott, M.B., Bathurst, J.C., Cunge, J.A., O’Connell, P.E., Rasmussen,<br />

J., 1986. An introduction to the European Hydrological<br />

system—systéme hydrologique européen SHE 2: structure of<br />

a physically based distributed modelling system. Journal of<br />

Hydrology 87, 61–77.<br />

Agricultural Statistics, 1995. Danmarks Statistik, 294 pp. (In<br />

Danish).<br />

Arnold, J.G., Williams, J.R., 1995. SWRRB—a watershed scale<br />

model for soil and water resources management. In: Singh,<br />

V.P. (Ed.). Computer Models of Watershed Hydrology, Water<br />

Resources Publication, pp. 847–908.<br />

Arnold, J.G., Williams, J.R., Nicks, A.D., Sammons, N.B., 1990.<br />

SWRRB—A basin scale simulation model for soil and water<br />

resources management, Texas A & M University Press, College<br />

Station 241 pp.<br />

Beasley, D.B., Huggins, L.F., Monke, E.J., 1980. ANSWERS: a<br />

model for watershed planning. Transactions of ASAE 23 (4),<br />

938–944.<br />

Beven, K., 1995. Linking parameters across scales: subgrid parameterizations<br />

and scale dependent hydrological models. Hydrological<br />

Processes 9, 507–525.<br />

Blöschl, G., Sivapalan, M., 1995. Scale issues in hydrological<br />

modelling: a review. Hydrological Processes 9, 251–290.<br />

Bresler, E., Dagan, G., 1983. Unsaturated flow in spatially variable



fields: application of water flow models to various fields II.<br />

Water Resources Research 19, 421–428.<br />

Cosby, B.J., Hornberger, G.M., Clapp, R.B., Ginn, T.R., 1984. A statistical<br />

exploration of relationships of soil moisture characteristics to<br />

the physical properties of soils. Water Resources Research 20,<br />

682–690.<br />

Dagan, G., 1986. Statistical theory of groundwater flow and transport:<br />

pore to laboratory, laboratory to formation, and formation<br />

to regional scale. Water Resources Research 22 (9), 120–134.<br />

DeCoursey, D.G., Rojas, K.W., Ahuja, L.R., 1989. Potentials for<br />

non-point source groundwater contamination analyzed using<br />

RZWQM. Paper No. SW892562, presented at the International<br />

American Society of Agricultural Engineers’ Winter Meeting,<br />

New Orleans, Louisiana.<br />

DeCoursey, D.G., Ahuja, L.R., Hanson, J., Shaffer, M., Nash, R.,<br />

Rojas, K.W., Hebson, C., Hodges, T., Ma, Q., Johnsen, K.E.,<br />

Ghidey, F., 1992. Root zone water quality model, Version 1.0,<br />

Technical Documentation. United States Department of Agriculture,<br />

Agricultural Research Service, Great Plains Systems<br />

Research Unit, Fort Collins, Colorado, USA.<br />

Djurhuus, J., Hansen, S., Schelde, K., Jacobsen, O.H., 1999. Modelling<br />

mean nitrate leaching from spatially variable fields using<br />

effective parameters. Geoderma 87, 261–279.<br />

EC, 1982. Groundwater resources in Denmark. Commission of the<br />

European Communities. EUR 7941 (In Danish).<br />

EC, 1996. Commission proposal for an Action Programme for Integrated<br />

Groundwater Protection and Management, Brussels.<br />

EEA, 1995. Europe’s Environment. The Dobris Assessment. The<br />

European Environment Agency, Copenhagen.<br />

Engesgaard, P., 1996. Multi-species reactive transport modelling.<br />

In: Abbott, M.B., Refsgaard, J.C. (Eds.). Distributed Hydrological<br />

Modelling, Kluwer Academic Publishers, Dordrecht, pp.<br />

71–91.<br />

EU, 1991. Resolution from Ministerial seminar held in The Hague<br />

in November 1991.<br />

Famiglietti, J.S., Wood, E.F., 1995. Effects of spatial variability and<br />

scale on areally averaged evapotranspiration. Water Resources<br />

Research 31 (3), 699–712.<br />

Gelhar, L.W., 1986. Stochastic subsurface hydrology. From theory<br />

to applications. Water Resources Research 22 (9), 135–145.<br />

Hansen, S., Jensen, H.E., Nielsen, N.E., Svendsen, H., 1990. Daisy,<br />

a soil plant system model. NPO-forskning fra Miljøstyrelsen,<br />

Report no. A10. Danish Environmental Protection Agency,<br />

Copenhagen.<br />

Hansen, S., Jensen, H.E., Nielsen, N.E., Svendsen, H., 1991. Simulation<br />

of nitrogen dynamics and biomass production in winter<br />

wheat using the Danish simulation model Daisy. Fertilizer<br />

Research 27, 245–259.<br />

Hansen, S., Thorsen, M., Pebesma, E., Kleeschulte, S., Svendsen,<br />

H., 1999. Uncertainty in simulated leaching due to uncertainty<br />

in input data. A case study. Soil Use and Management.<br />

Heng, H.H., Nikolaidis, N.P., 1998. Modelling of non-point source<br />

pollution of nitrogen at the watershed scale. Journal of the<br />

American Water Resources Association 34 (2), 359–374.<br />

Heuvelink, G.B.M., Pebesma, E.J., 1998. Spatial aggregation and<br />

soil process modelling. Geoderma.<br />

Jain, S.K., Storm, B., Bathurst, J.C., Refsgaard, J.C., Singh, R.D.,<br />

1992. Application of the SHE to catchments in India. Part 2.<br />

Field experiments and simulation studies with the SHE on the<br />

Kolar subcatchment of the Narmada River. Journal of<br />

Hydrology 140, 25–47.<br />

Jensen, C., Stougaard, B., Østergaard, H.S., 1996. The performance<br />

of the Danish simulation model Daisy in prediction of Nmin at<br />

spring. Fertilizer Research 44, 79–85.<br />

Jensen, C., Stougaard, B., Østergaard, H.S., 1994. Simulation of the<br />

nitrogen dynamics in farm land areas in Denmark 1989–1993.<br />

Soil Use and Management 10, 111–118.<br />

Jensen, K.H., Refsgaard, J.C., 1991. Spatial variability of physical<br />

parameters in two fields. Part II: Water flow at field scale.<br />

Nordic Hydrology 22, 303–326.<br />

Jensen, K.H., Refsgaard, J.C., 1991. Spatial variability of physical<br />

parameters in two fields. Part III. Solute transport at field scale.<br />

Nordic Hydrology 22, 327–340.<br />

Jensen, K.H., Refsgaard, J.C., 1991. Spatial variability of physical<br />

parameters in two fields. Part I. Water flow and solute transport<br />

at local scale. Nordic Hydrology 22, 275–302.<br />

Jensen, K.H., Mantoglou, A., 1992. Application of stochastic unsaturated<br />

flow theory, numerical simulations, and comparisons to<br />

field observations. Water Resources Research 28, 269–284.<br />

Jensen, L.S., Mueller, T., Nielsen, N.E., Hansen, S., Crocker, G.J.,<br />

Grace, P.R., Klir, J., Körschens, M., Poulton, P.R., 1997. Simulating<br />

trends in soil organic carbon in long-term experiments<br />

using the soil–plant–atmosphere model DAISY. Geoderma 81<br />

(1–2), 5–28.<br />

Kleeschulte, S., 1998. Assessment of data availability for direct<br />

modelling use at the European scale. In: Refsgaard, J.C.,<br />

Ramaekers, D.A. (Eds.), Assessment of ‘cumulative’ uncertainty<br />

in spatial decision support systems: Application to examine<br />

the contamination of groundwater from diffuse sources.<br />

Final Report, vol. 1, EU contract ENV-CT95-070. http://projects.gim.lu/uncersdss.<br />

Knisel, W.G. (Ed.), 1980. CREAMS: a field-scale model for<br />

chemicals, runoff, and erosion from agricultural management<br />

systems. US Department of Agriculture, Science,<br />

and Education Administration. Conservation Research Report<br />

no. 26, 643 pp.<br />

Knisel, W.G., Williams, J.R., 1995. Hydrology component of<br />

CREAMS and GLEAMS models. In: Singh, V.P. (Ed.). Computer<br />

Models of Watershed Hydrology, Water Resources Publication,<br />

pp. 1069–1114.<br />

Lamm, C.G., 1971. Det danske jordarkiv (The Danish soil<br />

archive), Tidsskrift for Planteavl, pp. 703–720 (in Danish).<br />

Leonard, R.A., Knisel, W.G., Still, D.A., 1987. GLEAMS: groundwater<br />

loading effects of agricultural management systems.<br />

Transactions of ASAE 30, 1403–1418.<br />

Mangold, D.C., Tsang, C.F., 1991. A summary of subsurface hydrological<br />

and hydrochemical models. Reviews of Geophysics 29<br />

(1), 51–79.<br />

Michaud, J.D., Shuttleworth, W.J., 1997. Executive summary of the<br />

Tucson aggregation workshop. Journal of Hydrology 190, 176–<br />

181.<br />

Person, M., Raffensperger, J.P., Ge, S., Garven, G., 1996. Basin-scale<br />

hydrogeologic modelling. Reviews of Geophysics 34 (1),<br />

61–97.



Plantedirektoratet, 1996. Guidelines and forms 1996/1997. Ministry<br />

for Food, Agriculture and Fishery, 38 pp. (In Danish).<br />

Refsgaard, J.C., 1997. Parameterisation, calibration and validation<br />

of distributed hydrological models. Journal of Hydrology 198,<br />

69–97.<br />

Refsgaard, J.C., Hansen, E., 1982. A distributed groundwater/<br />

surface water model for the Suså catchment. Part 1. Model<br />

description. Nordic Hydrology 13, 299–310.<br />

Refsgaard, J.C., Storm, B., 1995. MIKE SHE. In: Singh, V.P. (Ed.).<br />

Computer Models of Watershed Hydrology, Water Resources<br />

Publication, pp. 809–846.<br />

Refsgaard, J.C., Seth, S.M., Bathurst, J.C., Erlich, M., Storm, B.,<br />

Jørgensen, G.H., Chandra, S., 1992. Application of the SHE to<br />

catchments in India. Part 1. General results. Journal of Hydrology<br />

140, 1–23.<br />

Refsgaard, J.C., Thorsen, M., Jensen, J.B., Hansen, S., Heuvelink,<br />

G., Pebesma, E., Kleeschulte, S., Ramaekers, D., 1998.<br />

Uncertainty in spatial decision support systems—Methodology<br />

related to prediction of groundwater pollution. In: Babovic, V.,<br />

Larsen, L.C. (Eds.), Hydroinformatics ‘98. Proceedings of the<br />

Third International Conference on Hydroinformatics, Copenhagen,<br />

Balkema, 24–26 August 1998, pp. 1153–1159.<br />

Refsgaard, J.C., Sørensen, H.R., Mucha, I., Rodak, D., Hlavaty, Z.,<br />

Bansky, L., Klucovska, J., Topolska, J., Takac, J., Kosc, V.,<br />

Enggrob, H.G., Engesgaard, P., Jensen, J.K., Fiselier, J., Griffioen,<br />

J., Hansen, S., 1998. An integrated model for the Danubian<br />

Lowland—methodology and applications. Water<br />

Resources Management 12, 433–465.<br />

Refsgaard, J.C., Ramaekers, D., Heuvelink, G.B.M., Schreurs, V.,<br />

Kros, H., Rosén, L., Hansen, S., 1998. Assessment of ‘cumulative’<br />

uncertainty in spatial decision support systems: Application<br />

to examine the contamination of groundwater from diffuse<br />

sources (UNCERSDSS). Presented at the European Climate<br />

Science Conference, Vienna, 19–23 October.<br />

Saulnier, G.M., Beven, K., Obled, C., 1997. Digital elevation analysis<br />

for distributed hydrological modelling: Reducing scale<br />

dependence in effective hydraulic conductivity values. Water<br />

Resources Research 33 (9), 2097–2101.<br />

Sellers, P.J., Heiser, M.D., Hall, F.G., Verma, S.B., Desjardins,<br />

R.L., Schuepp, P.M., MacPherson, J.I., 1997. The impact of<br />

using area-averaged land surface properties—topography,<br />

vegetation conditions, soil wetness—in calculations of intermediate<br />

scale (approximately 10 km²) surface-atmosphere<br />

heat and moisture fluxes. Journal of Hydrology 190, 269–301.<br />

Styczen, M., Storm, B., 1993. Modelling of N-movements on catchment<br />

scale—a tool for analysis and decision making. 1. Model<br />

description. 2. A case study. Fertilizer Research 36, 1–17.<br />

Styczen, M., Storm, B., 1995. Modelling of the effects of management<br />

practices on nitrogen in soils and groundwater. In: Bacon,<br />

P.E. (Ed.). Nitrogen Fertilization in the Environment, Marcel<br />

Dekker, New York, pp. 537–564.<br />

Svendsen, H., Hansen, S., Jensen, H.E., 1995. Simulation of crop<br />

production, water and nitrogen balances in two German agroecosystems<br />

using the Daisy model. Ecological Modelling 81,<br />

197–212.<br />

Thorsen, M., Feyen, J., Styczen, M., 1996. Agrochemical modelling.<br />

In: Abbott, M.B., Refsgaard, J.C. (Eds.). Distributed<br />

Hydrological Modelling, Kluwer Academic Publishers,<br />

Dordrecht, pp. 121–141.<br />

UNCERSDSS, 1998. Assessment of cumulative uncertainty in<br />

Spatial Decision Support Systems: Application to examine the<br />

contamination of groundwater from diffuse sources<br />

(UNCERSDSS). EU contract ENV4-CT95-070. Final Report,<br />

available on http://projects.gim.lu/uncersdss.<br />

Vanclooster, M., Viaene, P., Christians, K., 1994. WAVE—a mathematical<br />

model for simulating agrochemicals in the soil and<br />

vadose environment. Reference and user’s manual (release<br />

2.0). Institute for Land and Water Management, Katholieke<br />

Universiteit Leuven, Belgium.<br />

Vanclooster, M., Viaene, P., Diels, J., Feyen, J., 1995. A deterministic<br />

validation procedure applied to the integrated soil crop<br />

model. Ecological Modelling 81, 183–195.<br />

Vereecken, H., Vanclooster, M., Swerts, M., Diels, J., 1991. Simulating<br />

nitrogen behaviour in soil cropped with winter wheat.<br />

Fertilizer Research 27, 233–243.<br />

Wen, X.-H., Gómez-Hernández, J.J., 1996. Upscaling hydraulic<br />

conductivities in heterogeneous media: An overview. Journal<br />

of Hydrology 183, ix–xxxii.<br />

Wood, E.F., Sivapalan, M., Beven, K.J., Band, L., 1988. Effects of<br />

spatial variability and scale with implications to hydrologic<br />

modelling. Journal of Hydrology 102, 29–47.<br />

Wood, E.F., Sivapalan, M., Beven, K., 1990. Similarity and scale in<br />

catchment storm response. Reviews of Geophysics 28, 1–18.<br />

Woods, R., Sivapalan, M., Duncan, M., 1995. Investigating the<br />

representative elementary area concept: an approach based on<br />

field data. Hydrological Processes 9, 291–312.<br />

Young, R.A., Onstad, C.A., Bosch, D.D., 1995. AGNPS: an agricultural<br />

nonpoint source model. In: Singh, V.P. (Ed.). Computer<br />

Models of Watershed Hydrology, Water Resources Publication,<br />

pp. 1001–1020.


[11]<br />

Thorsen M, Refsgaard JC, Hansen S, Pebesma E, Jensen JB, Kleeschulte S<br />

(2001) Assessment of uncertainty in simulation of nitrate leaching to aquifers<br />

at catchment scale.<br />

Journal of Hydrology, 242, 210-227.<br />

Reprinted from Journal of Hydrology with permission from Elsevier


Journal of Hydrology 242 (2001) 210–227<br />

www.elsevier.com/locate/jhydrol<br />

Assessment of uncertainty in simulation of nitrate leaching to<br />

aquifers at catchment scale<br />

M. Thorsen a , J.C. Refsgaard a, *, S. Hansen b , E. Pebesma c , J.B. Jensen a , S. Kleeschulte d<br />

a DHI Water and Environment, Hørsholm, Denmark<br />

b Royal Veterinary and Agricultural University, Copenhagen, Denmark<br />

c University of Amsterdam, Amsterdam, The Netherlands<br />

d GIM, Luxembourg, Luxembourg<br />

Received 21 February 2000; revised 21 July 2000; accepted 23 October 2000<br />

Abstract<br />

Deterministic models are used to predict the risk of groundwater contamination from non-point sources and to evaluate the<br />

effect of alleviation measures. Such model predictions are associated with considerable uncertainty due to uncertainty in the<br />

input data used, especially when applied at large scales. The present paper presents a case study related to prediction of nitrate<br />

concentrations in groundwater aquifers using a spatially distributed catchment model. Input data were primarily obtained from<br />

databases at a European level. The model parameters were all assessed from these data by use of transfer functions, and no<br />

model calibration was carried out. The Monte Carlo simulation technique was used to analyse how uncertainty in input data<br />

propagates to model output. It appeared that the magnitude of the uncertainty depends significantly on the considered temporal<br />

and spatial scale. Thus simulations of flux concentrations leaving the root zone at grid level were associated with large<br />

uncertainties, whereas uncertainties in simulated concentrations at aquifer level on catchment scale were much smaller.<br />

© 2001 Elsevier Science B.V. All rights reserved.<br />

Keywords: Nitrate; Non-point pollution; Distributed model; Catchment scale; Uncertainty; Monte Carlo method<br />

1. Introduction<br />

1.1. Background<br />

Deterministic models are important tools for assessing<br />

nitrate leaching, transport and transformation in<br />

connection with groundwater resources management.<br />

Such models may be classified according to the<br />

description of the physical processes as black box,<br />

* Corresponding author. Present address. Department of Hydrology,<br />

Geological Survey of Denmark and Greenland, Thoravej 8,<br />

DK-2400 Copenhagen, Denmark.<br />

E-mail address: jcr@geus.dk (J.C. Refsgaard).<br />

conceptual and physically-based and according to<br />

the spatial description as lumped and distributed<br />

(Wood and O'Connell, 1985; Nemec, 1994;<br />

Refsgaard, 1996; and others). In this respect three<br />

typical model types are the lumped black box<br />

model, the lumped conceptual and the distributed<br />

physically-based. Most nitrogen leaching models<br />

such as RZWQM (DeCoursey et al., 1989) and<br />

DAISY (Hansen et al., 1991) are of the physically-based<br />

type, but cover only the root zone at plot or field<br />

scale. Within the field of nitrogen modelling at a<br />

catchment scale, typical examples of a black box, a<br />

conceptual and a distributed physically-based model<br />

are statistical regression models (Simmelsgaard,<br />

0022-1694/01/$ - see front matter © 2001 Elsevier Science B.V. All rights reserved.<br />

PII: S0022-1694(00)00396-6



1991), the SWRRB (Arnold et al., 1990; Arnold and<br />

Williams, 1995) and the MIKE SHE/DAISY (Styczen<br />

and Storm, 1993), respectively.<br />

The black box and conceptual models are<br />

attractive because they require relatively less<br />

data, which are usually easily accessible, while<br />

the predictive capability of these models with<br />

regard to assessing the impacts of alternative agricultural<br />

practices is questionable due to the semi-empirical<br />

nature of the process descriptions. A key<br />

problem in using the more complex physically-based<br />

catchment models operationally lies in the<br />

generally large data requirements prescribed by<br />

the developers of such model codes. However,<br />

due to the better process descriptions these models<br />

may for some types of application be expected to<br />

have better predictive capabilities than the simpler<br />

models (Heng and Nikolaidis, 1998). Traditionally,<br />

complex leaching models are only used on plot or<br />

field scales in areas with extraordinarily good data<br />

availability, and even for such cases the relevance<br />

of such an approach is often questioned because<br />

of the perceived uncertainty related to the model<br />

simulations (Skop, 1993). Hence, there is an<br />

evident need to assess the uncertainty related to<br />

large scale simulation of aquifer pollution from<br />

diffuse sources.<br />

When analysing for uncertainties in model<br />

simulations the two fundamentally different<br />

sources of uncertainty are: (1) uncertainty on<br />

input data in terms of input variables (time varying<br />

input such as climate data) and model parameters<br />

(e.g. soil physical characteristics); and (2)<br />

inadequate model structure (process descriptions,<br />

equations). When comparing the model outputs<br />

to measured field data a third source of uncertainty<br />

has to be added, namely the error in the<br />

measurement of output from nature.<br />

Stochastic approaches are useful tools in uncertainty<br />

analyses. Assessment of uncertainties of<br />

model simulations requires a joint stochastic–deterministic<br />

approach, where the input data and/or the<br />

structure of the deterministic model somehow are<br />

considered stochastic. By considering input data as<br />

realisations of stochastic variables with given statistical<br />

properties, the governing equations become so-called<br />

stochastic partial differential equations<br />

(PDEs). The three traditional approaches to solving<br />

the stochastic PDEs are (1) state space formulations,<br />

i.e. Kalman filtering (Gelb, 1974; Ahsan and O'Connor,<br />

1994), (2) Monte Carlo techniques (Smith and<br />

Freeze, 1979a,b; Freeze, 1980; Zhang et al., 1993),<br />

and (3) analytical solutions to the stochastic PDEs<br />

(Gelhar, 1986; Dagan, 1986; Jensen and Mantoglou,<br />

1992). A severe limitation of the above three methods<br />

is that they only consider uncertainties on input data,<br />

while all of them assume the model structure to be<br />

correct. A more comprehensive approach also allowing<br />

consideration of the uncertainty in the model<br />

structure and process equations is the generalised likelihood<br />

uncertainty estimation (GLUE) methodology<br />

outlined in Beven and Binley (1992). Although no<br />

such studies have been reported yet, the GLUE in<br />

principle allows the uncertainty on model structure<br />

to be considered by introducing several alternative<br />

models, so that the Monte Carlo procedure includes<br />

both uncertainties on input data and on model structure.<br />

The objective of the present paper is, by use of<br />

Monte Carlo simulations, to assess whether a distributed<br />

physically-based model can provide fairly accurate<br />

predictions of nitrate concentrations in aquifers<br />

when applied at a catchment scale with input data only<br />

from readily available, aggregated data sources such<br />

as European databases. A limitation of the present<br />

paper is that only uncertainties in input data are<br />

considered, while errors in model structures are not<br />

taken into account.<br />

The studies reported in literature dealing with<br />

assessment of uncertainty of physically-based<br />

models consider only individual components of<br />

the hydrological cycle, typically groundwater,<br />

while the studies dealing with conceptual models,<br />

including both surface water, root zone and groundwater<br />

processes, have not considered uncertainties<br />

on nitrogen or other water quality aspects. Thus, to<br />

our knowledge, no similar attempts have been<br />

reported so far. The present paper focussing on<br />

uncertainty assessment at catchment scale is an<br />

extension of Refsgaard et al. (1999) and Hansen<br />

et al. (1999), where details on the deterministic<br />

modelling at catchment scale and the uncertainty<br />

aspects of the nitrogen leaching from the root<br />

zone, respectively, have been described. All three<br />

papers present results from the UNCERSDSS<br />

project (Refsgaard et al., 1998).



2. Methodology<br />

2.1. Modelling approach<br />

The deterministic simulation is carried out by the<br />

coupled MIKE SHE/DAISY system. This is a<br />

coupling of a 1D root zone model (DAISY) and a<br />

3D distributed catchment model (MIKE SHE).<br />

MIKE SHE is a modelling system describing the<br />

flow of water and solutes in a catchment in a distributed<br />

physically-based way. This implies numerical<br />

solutions of the coupled PDEs for overland (2D) and<br />

channel flow (1D), unsaturated flow (1D) and saturated<br />

flow (3D) together with a description of evapotranspiration<br />

and snowmelt processes. For further<br />

details reference is made to the literature (Abbott et<br />

al., 1986; Refsgaard and Storm, 1995).<br />

DAISY (Hansen et al., 1991) is a 1D physically-based<br />

modelling tool for the simulation of crop<br />

production and water and nitrogen balance in the<br />

root zone. DAISY includes modules for description<br />

of evapotranspiration, soil water dynamics based on<br />

Richards' equation, water uptake by plants, soil<br />

temperature, soil mineral nitrogen dynamics based<br />

on the advection–dispersion equation, nitrate uptake<br />

by plants and nitrogen transformations in the soil. The<br />

nitrogen transformations simulated by DAISY are<br />

mineralisation-immobilisation turnover (MIT), nitrification<br />

and denitrification. In addition, DAISY<br />

includes a module for description of agricultural<br />

management practices.<br />

By combining MIKE SHE and DAISY, a complete<br />

modelling system is available for the simulation of<br />

water and nitrate transport in an entire catchment. In<br />

the present case the coupling is a sequential one. Thus<br />

for all agricultural areas, DAISY first performs calculations<br />

of water and nitrogen behaviour from the soil<br />

surface and through the root zone. The percolation of<br />

water and nitrate at the bottom of the root zone, simulated<br />

by DAISY, is then used as input to MIKE SHE<br />

calculations for the remaining part of the catchment.<br />

For natural areas, MIKE SHE also calculates the root<br />

zone processes assuming no nitrate contribution from<br />

these areas. Due to the sequential execution of the two<br />

codes, it has to be assumed that there is no feedback<br />

from the groundwater zone (MIKE SHE) to the root<br />

zone (DAISY). As the riparian buffer zone, where<br />

such a feedback mechanism is effective, often lies mainly<br />

within the natural areas (as in our case study),<br />

this limitation is of minor practical importance.<br />

Furthermore, overland flow generated by high intensity<br />

rainfall (Hortonian) cannot be simulated by this<br />

coupling, while saturation-excess overland flow<br />

(Dunne) can be accounted for by MIKE SHE.<br />

Thus, MIKE SHE does not in the present case<br />

handle evapotranspiration and other root zone<br />

processes in the agricultural areas. As DAISY is 1D,<br />

one DAISY run in principle should be carried out for<br />

each of MIKE SHE's horizontal grids. However,<br />

several MIKE SHE grids are assumed to have identical<br />

root zone properties (soil, crop, agricultural<br />

management practices, etc.), so that in practice the<br />

outputs from each DAISY run can be used as input<br />

to several MIKE SHE grids.<br />

To fulfil one of the overall objectives of the project,<br />

which was to assess the quality of European data sets<br />

for direct use for modelling at the European scale, two<br />

key constraints were applied to the modelling<br />

approach. One constraint was that, if possible, input<br />

data such as model parameters and driving variables<br />

should be based on publicly available information,<br />

which preferably could be accessed from the standard<br />

European databases such as GISCO or EUROSTAT,<br />

or from very easily available national sources.<br />

Another constraint was that all model parameters<br />

obtained from standard databases were to be used<br />

directly or by way of transfer functions without any<br />

model calibration.<br />

2.2. Scaling<br />

As the equations in both the MIKE SHE and the<br />

DAISY codes basically are point scale equations, a<br />

scaling procedure had to be adopted in order to<br />

apply the codes at a catchment scale. MIKE SHE/<br />

DAISY is in this case run with equations and parameter<br />

values in each model grid point representing<br />

field scale conditions. The field scale is characterised<br />

by 'effective' soil and vegetation parameters, but<br />

assuming only one soil type and one cropping pattern.<br />

The smallest horizontal discretisation in the model is<br />

the grid scale (2 × 2 km²), which is larger than the field<br />

scale. This implies that all the variations between<br />

categories of soil type and crop type within the area<br />

of each grid cannot be resolved and described at the<br />

grid level. Input data, whose variations are not



Fig. 1. Location of the Karup catchment in Jutland, Denmark.<br />

included in the grid scale representation, are distributed<br />

randomly at the catchment scale so that their<br />

statistical distributions are preserved at that scale.<br />

The results from the grid scale modelling are then<br />

aggregated to catchment scale (130 grids) and the<br />

statistical properties of model output and field data<br />

are then compared at catchment scale. Thus the scaling<br />

procedure from point scale to catchment scale<br />

may be characterised as a combination of an upscaling<br />

step and an aggregation step. The upscaling step is<br />

simply the important assumption that the point scale<br />

equations are valid at field scale. The aggregation step<br />

highlights a key issue from the concept of representative<br />

elementary area (REA) (Wood et al., 1988),<br />

namely that variability can be explicitly represented<br />

only at scales larger than the model grid size.<br />

More details on the adopted scaling approach are<br />

provided in Refsgaard et al. (1999), where it is also<br />

documented that the approach can be assumed valid<br />

for the case study in question.<br />

2.3. Input error assessment<br />

The MIKE SHE/DAISY model contains a very<br />

large number of input parameters. Ideally, all these<br />

parameters should be treated stochastically and<br />

included in the uncertainty analyses. However, this<br />

would result in an unrealistically high number of<br />

Monte Carlo simulations and CPU-time. Therefore,<br />

the input uncertainty was limited to five key parameters<br />

(see Section 3.2 below), which were selected<br />

so that they, by experience, are known to be the dominant<br />

parameters in the processes governing the water<br />

balance and nitrate leaching and transformation.



The actual input error assessment, i.e. the choice<br />

and parameterisation of the joint probability distribution<br />

of the stochastic variables was partly based on the<br />

analysis of available data and partly on expert judgement.<br />

Available data comprised data from national<br />

surveys or previous studies. The expert judgement<br />

refers for instance to the choice of the distribution<br />

type if no data were present, and the assessment of<br />

`realistic' ranges between which the true parameter<br />

values were expected to vary. Although this assessment<br />

seems rather subjective, it was hard to find a<br />

better way of doing this where data were lacking.<br />

Since the basic unit of calculation is a field, the variation<br />

of field-effective values was used for determining<br />

the range of the parameter probability distributions. A<br />

single realisation of such a parameter was then used in<br />

the model for each grid cell. All stochastic parameters<br />

were treated as being mutually independent. The<br />

reasons for this are that no significant correlation was<br />

suspected a priori, and that no data were available to<br />

actually estimate possible correlation.<br />

2.4. Error propagation<br />

The propagation of errors in the input data to the<br />

model output was assessed using Monte Carlo analysis.<br />

This means that a number of realisations were drawn at<br />

random from the stochastic input parameter distributions<br />

and that the model was run for each realisation.<br />

The ensemble of model outputs then is an estimate of the<br />

model output probability distribution, as influenced only<br />

by uncertainty in model input parameters. In order to<br />

reduce the number of Monte Carlo runs, Latin hypercube<br />

sampling was used to draw realisations from the<br />

input variables (McKay et al., 1979). This essentially<br />

means that the distribution of each stochastic input variable<br />

was stratified into N strata with equal probability mass,<br />

with one value drawn from each stratum,<br />

where N equals the number of Monte Carlo runs. The<br />

theoretical background for the adopted Latin hypercube<br />

sampling method is described in Pebesma and Heuvelink<br />

(1999).<br />
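The stratified sampling described above can be illustrated with a minimal sketch. It assumes uniform input distributions (the clay content and reduction-front ranges are taken from Table 1) and 25 runs; the function name `latin_hypercube_uniform`, the seed and the random pairing of strata between variables are illustrative assumptions, not the implementation used in the study.

```python
import random

def latin_hypercube_uniform(n_runs, low, high, rng):
    """Draw n_runs values from U(low, high), one per equal-probability stratum."""
    width = (high - low) / n_runs
    # One random point inside each of the n_runs strata.
    samples = [low + (k + rng.random()) * width for k in range(n_runs)]
    # Shuffle so that the stratum index is not tied to the run index;
    # doing this independently per variable pairs the strata at random.
    rng.shuffle(samples)
    return samples

rng = random.Random(42)  # fixed seed, for reproducibility only
n_runs = 25
clay = latin_hypercube_uniform(n_runs, 0.0, 17.0, rng)    # clay content, %
front = latin_hypercube_uniform(n_runs, 18.0, 27.0, rng)  # reduction front depth, m
runs = list(zip(clay, front))  # one (clay, front) pair per Monte Carlo run
```

Each stratum contributes exactly one value, so the full range of every input is covered even with a small number of runs, which is the motivation for using the method to reduce the number of model runs.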

3. Application<br />

3.1. Study area<br />

The area used in the study is the Karup river basin,<br />

located in the middle part of Jutland, Denmark<br />

(Fig. 1). The topographic catchment covers approximately<br />

500 km² of which 70% are used for agricultural<br />

purposes and 30% are natural areas. The<br />

catchment characteristics are described in Styczen<br />

and Storm (1993). The data used for the present<br />

study and the model construction are described in<br />

detail in Refsgaard et al. (1999) and Hansen et al.<br />

(1999). In the following a brief summary is provided.<br />

The catchment was in the model represented in a<br />

3D network. The discretisation used for the uncertainty<br />

analysis was 2 km in the horizontal direction<br />

and varied in the vertical from 5 to 40 cm in the unsaturated<br />

zone, and from 10 to 15 m in the saturated<br />

zone. The catchment area and the location of the<br />

river branches as well as the stream geometry were<br />

generated on the basis of a digital elevation map from<br />

USGS/GISCO using Arc/Info facilities. Spatial distributions<br />

of land use and soil types were derived from<br />

the GISCO database and hydrogeological data were<br />

obtained from EC (1982). Distributions of crop types<br />

and livestock densities were obtained from Agricultural<br />

Statistics (1995) and converted to slurry production<br />

using standard values for nitrogen content. Based<br />

on typical crop rotations proposed by The Danish<br />

Agricultural Advisory Centre and the constraints<br />

offered by crop distribution and livestock density<br />

two cattle farm rotations, one pig farm rotation and<br />

one arable farm rotation were constructed. In order to<br />

capture the effect of the interaction between weather<br />

conditions and crop, simulations were performed in<br />

such a way that each crop at its particular position in<br />

the considered rotation occurred once in each of the<br />

years in the rotation. This resulted in a total of 17<br />

agricultural crop rotation schemes and one scheme<br />

representing natural areas with no assumed nitrate<br />

leaching. These 18 schemes were distributed<br />

randomly over the area in such a way that the statistical<br />

distribution was in accordance with the agricultural<br />

statistics.<br />
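The random distribution of schemes over the grids can be sketched as follows. The sketch assumes 130 grids, a 30% share for the natural-area scheme and, purely for illustration, equal shares for the 17 agricultural schemes; the real shares follow the agricultural statistics and are not reproduced here. The function name `assign_schemes` and the largest-remainder rounding are likewise assumptions.

```python
import random

def assign_schemes(n_grids, shares, rng):
    """Assign one scheme per grid so that scheme counts follow `shares`."""
    total = sum(shares.values())
    quotas = {s: n_grids * w / total for s, w in shares.items()}
    counts = {s: int(q) for s, q in quotas.items()}
    # Hand the grids lost to rounding down to the largest fractional remainders.
    leftover = n_grids - sum(counts.values())
    by_remainder = sorted(shares, key=lambda s: quotas[s] - counts[s], reverse=True)
    for s in by_remainder[:leftover]:
        counts[s] += 1
    layout = [s for s, c in counts.items() for _ in range(c)]
    rng.shuffle(layout)  # random position in the catchment
    return layout

# Hypothetical shares: natural areas 30 %, 17 rotation schemes sharing 70 % equally.
shares = {"natural": 0.30}
shares.update({f"rotation_{i:02d}": 0.70 / 17 for i in range(1, 18)})
layout = assign_schemes(130, shares, random.Random(1))
```

The positions are random, but the number of grids per scheme is fixed in advance, so the statistical distribution is preserved at catchment scale.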

To simulate the trend in the nitrate concentrations<br />

in the groundwater and in the streams, it is necessary<br />

to have information on the history of the fertiliser<br />

application in space and time. In Denmark, norms<br />

and regulations for fertilisation practice are defined<br />

(Plantedirektoratet, 1996). These regulate the maximum<br />

amount of nutrients allowed for a particular<br />

crop depending on the preceding crop and soil type and, in addition,<br />

provide norms for the lower limit of nitrogen



Table 1<br />
Statistical properties of the input error considered in the Monte Carlo analysis<br />

Parameter | Unit | Distribution | Mean | Std. | Range<br />
Daily rainfall, standard error | % | | | 50 (a) | <br />
Clay content | % | Uniform | 8.5 | | 0.0–17.0<br />
SOM2 | % | Truncated normal | 0.5 | 0.22 | 0.06–0.94<br />
Cattle slurry, dry matter content | % | Truncated normal | 7.5 | 2.5 | 1.89–14.35<br />
Cattle slurry, total N content | % | Truncated normal | 0.5 | 0.12 | 0.24–1.02<br />
Pig slurry, dry matter content | % | Truncated normal | 4.9 | 2.5 | 0.82–13.79<br />
Pig slurry, total N content | % | Truncated normal | 0.61 | 0.18 | 0.24–1.02<br />
Depth of reduction front | m | Uniform | 22.5 | | 18–27<br />

(a) The series was normalised so that the mean value was preserved.<br />

utilisation for organic fertilisers. It was assumed that<br />

the farmers follow these statutory norms. Based on<br />

estimated application rates of organic and mineral<br />

fertiliser to the individual crops each year, the<br />

DAISY model simulated time series of nitrate leaching<br />

from the root zone for each agricultural grid. The<br />

MIKE SHE model then routed these fluxes further<br />

through the unsaturated zone and in the groundwater<br />

layers accounting for dispersion and dilution<br />

processes and finally into the Karup stream where<br />

the integrated load from the entire catchment was<br />

estimated.<br />

The model was run for seven years, from 1987 to<br />

1993. The large storage possibilities in the unsaturated<br />

zone and the aquifer imply that the initial conditions<br />

influence the simulation results for several years. The<br />

initial conditions were established by running the<br />

deterministic model twice for the period 1987–<br />

1993. In the first run the initial conditions were<br />

guessed and in the second run they were taken as<br />

the simulated conditions by the end of the period.<br />

The simulated 1993 conditions in the second run<br />

were then used as initial conditions for the Monte<br />

Carlo runs. This procedure ensures that the initial<br />

conditions are consistent with the assumptions made<br />

in the deterministic simulation, but not necessarily<br />

with the parameter values drawn in the Monte Carlo<br />

runs, where e.g. a run with a parameter value resulting<br />

in higher nitrate leaching, in principle, should have<br />

been associated with higher initial nitrate concentrations<br />

in the aquifer. In order to reduce the effect of<br />

this, the first two years were considered as a<br />

'warming-up period' and the last five years were<br />

considered the simulation period.<br />

3.2. Assessment of input errors<br />

Uncertainty on the following five parameters was<br />

introduced in the analysis: precipitation, soil hydraulic<br />

properties, soil organic matter (SOM) content,<br />

slurry composition, and depth of the nitrate reduction<br />

front in the aquifer. The rationale for selecting these<br />

five parameters and details on their assessment are<br />

provided in Sections 3.2.2–3.2.6 below. The statistical<br />

characteristics of the data included in the Monte<br />

Carlo analysis are shown in Table 1.<br />
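As an illustration of how the truncated normal entries of Table 1 can be turned into parameter realisations, the following minimal sketch uses plain rejection sampling. The paper does not state how the truncation was implemented, so the rejection step, the function name `truncated_normal`, the dictionary keys and the seed are assumptions; simple random draws are used here instead of stratified draws to keep the sketch short.

```python
import random

# (mean, std, lower, upper) in %, copied from Table 1.
TABLE_1 = {
    "SOM2": (0.5, 0.22, 0.06, 0.94),
    "cattle_slurry_dry_matter": (7.5, 2.5, 1.89, 14.35),
    "pig_slurry_dry_matter": (4.9, 2.5, 0.82, 13.79),
    "pig_slurry_total_N": (0.61, 0.18, 0.24, 1.02),
}

def truncated_normal(mean, std, lower, upper, rng):
    """Rejection sampling: redraw until the value falls inside [lower, upper]."""
    while True:
        value = rng.gauss(mean, std)
        if lower <= value <= upper:
            return value

rng = random.Random(7)  # fixed seed, for reproducibility only
draws = {name: [truncated_normal(*spec, rng) for _ in range(25)]
         for name, spec in TABLE_1.items()}
```

Every draw lies inside the range given in the table, while values near the mean remain the most likely.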

3.2.1. Length scale and spatial correlation<br />

A fundamental question in the assessment of uncertainty<br />

of input data for a spatially distributed model<br />

like MIKE SHE/DAISY is whether the input data are<br />

spatially correlated or not. It is possible to take spatial<br />

correlation into account; however, this complicates<br />

the Monte Carlo sampling considerably (Kros et al.,<br />

1999). The critical question in this relation is whether<br />

the spatial autocorrelation length scale of the input<br />

data is larger than the computational scale, or whether<br />

the dominating spatial variability takes place within a<br />

computational length scale, in which case it should be<br />

incorporated into the effective model parameters and<br />

their inherent uncertainties.<br />

As discussed above, the basic unit of calculation is<br />

the model grid (2 × 2 km²) with some of the parameters,<br />

however, representing field-effective values<br />



(typically 1–10 ha in size). Hence the soil hydraulic<br />

parameters, the SOM content and slurry composition<br />

represent field length scales in the order of<br />

100–300 m, while the precipitation and reduction<br />

front are represented at a 2 km length scale.<br />

For the field related parameters the correlation<br />

length scales can be assumed smaller than 100 m.<br />

For soil hydraulic properties this is documented in<br />

previous studies (Hansen and Jensen, 1988), while<br />

no data exist on length scales for SOM. With respect<br />

to slurry composition this parameter is the result of<br />

farm management and storage conditions, and it is<br />

known that the temporal variability of the produced<br />

slurry on the individual farm is considerable. Hence, it<br />

is assumed that the variability within the individual<br />

fields is much larger than the variability among the<br />

fields.<br />

Daily rainfall data are known to have correlation<br />

length scales that are usually larger than the<br />

2 km grid scale used in the present case. Geostatistical<br />

analysis (Storm et al., 1988) suggests that<br />

the length scale for Danish conditions is in the<br />

order of 10 km. Similarly, the correlation length scale of the<br />

location of the reduction/oxidation front, which is mainly dependent<br />

on geological conditions, may be assumed to<br />

be significantly larger than the 2 km grid.<br />

This implies that the three field related parameters<br />

in principle should be treated as spatially independent<br />

in the Monte Carlo analysis, while the two other input<br />

data could be treated as almost spatially constant.<br />

As a consequence of the adopted scaling approach<br />

the relevant scale for which the uncertainty on the<br />

input data should be generated in the Monte Carlo<br />

analysis is the catchment scale and not the grid<br />

scale. The uncertainty at catchment scale can be<br />

generated either by allowing spatial variation among<br />

grids and using a variance applicable for grid scale in<br />

the Monte Carlo sampling or by assuming a spatially<br />

constant value and using the (smaller) catchment scale<br />

variance. In the present study we have adopted the<br />

latter approach. This has two important limitations.<br />

Firstly, the nitrate reduction processes in the aquifer,<br />

where the horizontal dimension with flows between<br />

neighbouring grids is important, are not described fully<br />

correctly because the autocorrelation length scale is<br />

not preserved. Secondly, the output uncertainties are<br />

only simulated correctly at the catchment scale, while<br />

they are underestimated at grid scales.<br />

3.2.2. Precipitation<br />

In general the required daily climate data are available<br />

throughout Europe from the national meteorological<br />

institutes. Among the required meteorological<br />

variables, precipitation is the one subject to the most<br />

local variations. Therefore uncertainty on the daily<br />

amount of precipitation was included in the present<br />

analysis. The uncertainty was described by adding a<br />

random error to the measured series. This error was<br />

assumed to follow a normal distribution with zero<br />

mean and a standard deviation equivalent to 50% of<br />

the measured daily value. Thus, dry days were kept<br />

dry. The error was assumed to contain no temporal<br />

autocorrelation. Finally, the series was normalised so<br />

that the mean value, taken over the 25 Monte Carlo<br />

runs, was preserved. The adopted variance is in agreement<br />

with Allerup et al. (1982) as standard error of<br />

daily rainfall for a catchment of this size.<br />
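The rainfall perturbation described above can be sketched as follows. The error is drawn per day with a standard deviation of 50% of the measured value, so dry days stay dry, and all runs are rescaled so that the mean over the ensemble equals the measured mean. The clipping of negative values at zero, the function name `perturb_rainfall`, the seed and the short measured series are assumptions made for illustration; the paper does not say how negative draws were handled.

```python
import random

def perturb_rainfall(measured, n_runs, rel_std, rng):
    """Return n_runs perturbed copies of a daily rainfall series (mm/day)."""
    runs = []
    for _ in range(n_runs):
        # Error std is proportional to the measured value: dry days stay dry.
        # Negative totals are clipped at zero (assumption, not from the paper).
        series = [max(0.0, p + rng.gauss(0.0, rel_std * p)) for p in measured]
        runs.append(series)
    # Rescale all runs so the mean over the whole ensemble is preserved.
    measured_mean = sum(measured) / len(measured)
    ensemble_mean = sum(sum(s) for s in runs) / (n_runs * len(measured))
    factor = measured_mean / ensemble_mean
    return [[p * factor for p in s] for s in runs]

measured = [0.0, 4.2, 0.0, 12.5, 1.1, 0.0, 7.8]  # hypothetical week, mm/day
ensemble = perturb_rainfall(measured, n_runs=25, rel_std=0.5, rng=random.Random(3))
```

Because a single scaling factor is applied to the whole ensemble, individual runs may still be wetter or drier than the measured series; only the ensemble mean is preserved.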

3.2.3. Soil hydraulic properties<br />

The modelling system requires soil hydraulic parameters<br />

in terms of retention curves and hydraulic<br />

conductivity functions. Such data were not directly<br />

available through European databases. Instead, these<br />

properties were estimated using pedo-transfer functions<br />

based on soil information in terms of texture<br />

composition obtained from the GISCO soil database<br />

in which soils are divided into five texture classes<br />

according to FAO classification. All soil types of the<br />

Karup catchment fall within one texture class (coarse<br />

texture) which covers soils with less than 18% clay<br />

and more than 65% sand. As the texture class covers a<br />

wide range of different texture compositions, soil<br />

hydraulic properties derived from this information<br />

will be associated with considerable uncertainty.<br />

Based on a review by Tietje and Tapkenhinrichs<br />

(1993) evaluating available pedo-transfer functions<br />

and based on the constraints imposed by the available<br />

information on texture (clay, silt and sand content),<br />

the pedo-transfer functions proposed by Cosby et al.<br />

(1984) were selected. These functions estimate the<br />

saturated hydraulic conductivity and the parameters<br />

in the soil water retention function proposed by<br />

Campbell (1974). The hydraulic conductivity function<br />

was calculated according to Burdine (1952) using the<br />

same parameters. In order to facilitate a smooth retention<br />

function, the Campbell functions were modified<br />

according to the modifications of the Brooks–Corey


M. Thorsen et al. / Journal of Hydrology 242 (2001) 210–227<br />

function (Brooks and Corey, 1966) proposed by Smith<br />

(1992). In Danish soils the clay and the silt content are<br />

correlated. Based on information in the Danish Soil<br />

Library (Lamm, 1971) a relation between clay and silt<br />

has been established:<br />

$\text{Silt content} = 0.035 + 0.82 \times \text{Clay content} \quad (r^2 = 0.68)$<br />

Adopting this relation and assuming that clay, silt<br />

and sand constitute all soil solids, the soil hydraulic<br />

properties can be calculated once the clay content is<br />

known. In the uncertainty analysis, the clay content<br />

was drawn by stratified random sampling from a uniform distribution<br />

ranging from 0 to 17% (Table 1). In reality, the<br />

uncertainty on the soil hydraulic parameters originates<br />

from two sources, namely the uncertainty on soil<br />

texture and the uncertainty related to use of the<br />

adopted pedotransfer function. In the present<br />

approach, uncertainty is associated only with soil texture.<br />

Data from the Danish Soil Textural Database show<br />

that a uniform distribution, as adopted in the present<br />

study, clearly overestimates the uncertainty on soil<br />

texture (Børgesen, 2000). The assumed large uncertainty<br />

range on soil texture may therefore compensate<br />

for the lack of uncertainty on the pedotransfer function,<br />

so that the integrated uncertainty on the soil<br />

hydraulic parameters is of the right order of magnitude.<br />

Considering that the autocorrelation length scale<br />

for soil texture is of the order of 100 m, this adopted<br />

uncertainty range may at first glance appear to be a<br />

rather high uncertainty for soil texture at the catchment<br />

scale. However, as the FAO texture class is so<br />

broad that it actually covers different soil types with<br />

large differences in hydraulic properties, the adopted<br />

catchment-scale variance should be seen to cover<br />

uncertainty on which soil type is actually present in<br />

the catchment rather than uncertainty on hydraulic<br />

properties due to small scale variations.<br />
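The texture sampling described in this section can be sketched as below. The sketch assumes that the clay–silt relation is expressed in mass fractions (so that the intercept 0.035 means 3.5%) and that "stratified random" means one draw from each of 25 equal-width strata; both are interpretations, and the function names are hypothetical. The pedo-transfer step (Cosby et al., 1984) is not reproduced here.

```python
import random


def sample_clay_stratified(n_runs=25, low=0.0, high=0.17, seed=0):
    """Draw one clay fraction from each of n_runs equal-width strata."""
    rng = random.Random(seed)
    width = (high - low) / n_runs
    samples = [low + (i + rng.random()) * width for i in range(n_runs)]
    rng.shuffle(samples)  # random assignment of strata to Monte Carlo runs
    return samples


def texture_from_clay(clay):
    """Return (clay, silt, sand) mass fractions from the clay fraction.

    Silt follows the regression quoted in the text; sand is the remainder,
    assuming clay, silt and sand constitute all soil solids.
    """
    silt = 0.035 + 0.82 * clay
    sand = 1.0 - clay - silt
    return clay, silt, sand


if __name__ == "__main__":
    for clay in sample_clay_stratified()[:3]:
        print(texture_from_clay(clay))
```

Under the fraction interpretation, the upper bound of 17% clay gives about 17.4% silt and 65.6% sand, which is consistent with the coarse texture class (less than 18% clay and more than 65% sand).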

3.2.4. Soil organic matter<br />

In DAISY, the MIT model considers three types of<br />

organic matter: newly added, relatively fresh organic<br />

matter (AOM) with a relatively short turnover time,<br />

the living soil microbial biomass (SMB), and old<br />

native SOM with slow turnover. The<br />

former two can be initialised with default values<br />

when the model is run with a 'warm-up' period of a<br />

couple of years prior to the actual simulation period.<br />

The latter comprises by far the largest part of the organic matter<br />

found in the soil. However, SOM is divided into two<br />

sub-pools, SOM1 and SOM2. The turnover of SOM1 is<br />

so slow that its contribution to the annual nitrogen<br />

mineralisation in agricultural soils is negligible.<br />

Hence, when initialising the MIT model the important<br />

factor is the quantity of SOM2. As the European databases<br />

did not provide this information, we had to rely<br />

on estimates of both the amount of organic matter<br />

present in the soil and the fraction of this organic<br />

matter that is allocated to SOM2. The assumed<br />

statistical properties of this uncertainty are shown in<br />

Table 1.<br />

3.2.5. Slurry composition<br />

Due to the high livestock density, slurry is a<br />

substantial source of nitrogen in the Karup region.<br />

Hence the management of slurry is of prime importance<br />

for the leaching losses. A main problem in<br />

management of slurry is the large variability found<br />

in the composition of the slurry. This variability<br />

makes the actual fertiliser application in slurry differ<br />

from the planned application and therefore introduces<br />

a considerable source of uncertainty. In the uncertainty<br />

analysis this has been accounted for by introducing<br />

uncertainty on the dry matter content and the<br />

nitrogen content of the slurry. The assumed error<br />

statistics are shown in Table 1. Further details on<br />

the agricultural management and the rationale behind<br />

the error statistics are provided in Hansen et al.<br />

(1999).<br />

3.2.6. Depth of reduction front<br />

In the uncertainty analysis the depth of the reduction<br />

front in the saturated zone was drawn from a<br />

uniform distribution in the interval 18–27 m below<br />

soil surface.<br />

3.3. Uncertainty analyses<br />

The initial part of the uncertainty analysis<br />

comprised an evaluation of the selected number of<br />

Monte Carlo runs. As the CPU-time required to run<br />

the model for the seven-year period is substantial, it<br />

was necessary to keep the number of Monte Carlo<br />

runs to a minimum. Therefore an initial choice of 25



Table 2<br />

Evaluation of the representativeness of 25 Monte Carlo runs<br />

Variable | 1–25 Mean | 1–25 Std. | 26–50 Mean | 26–50 Std. | 51–75 Mean | 51–75 Std. | 1–75 Mean (a) | 1–75 Std. | CV (%)<br />

Leaching from root zone (kg N/ha/year) | 64.7 | 19.2 | 68.2 | 18.9 | 67.2 | 16.7 | 66.7 | 18.1 | 27.1<br />

Groundwater concentration (mg NO3/l) | 47.7 | 8.0 | 48.3 | 7.2 | 47.6 | 6.0 | 47.8 | 7.0 | 14.6<br />

River flow (mm/year) | 464.0 | 22.0 | 464.0 | 23.0 | 464.0 | 17.0 | 464.0 | 21.0 | 4.5<br />

River concentration (mg NO3/l) | 45.1 | 7.8 | 46.2 | 7.3 | 45.7 | 6.6 | 45.7 | 7.1 | 15.5<br />

(a) Homogeneity of means accepted by F-test.<br />

runs was made. In order to investigate whether 25<br />

Monte Carlo runs are sufficient to capture the variability,<br />

75 Monte Carlo runs were performed, the<br />

results were split into three groups of 25 runs each, and the<br />

statistical distributions of the three groups were<br />

compared. The output variables analysed were river<br />

flow, average NO3 concentration in groundwater, and<br />

average NO3 concentration in the stream. The three<br />

sets of Monte Carlo runs were evaluated by comparing<br />

the statistical distribution of simulation results, i.e.<br />

testing whether the simulation results can be<br />

described by a normal distribution and whether homogeneity<br />

of mean and variance can be assumed.<br />

In the second part of the uncertainty analysis, the<br />

contribution of each of the selected Monte Carlo<br />

parameters to the total uncertainty<br />

was evaluated by performing five sets of<br />

Monte Carlo simulations, in each of which one of<br />

the initially stochastic parameters was kept deterministic.<br />

The uncertainty contributions of the different<br />

parameters were then evaluated. As annual leaching<br />

depends on weather, crop and crop position in the<br />

rotation, groundwater concentrations in single years<br />

were not considered; instead, data averaged over the<br />

five-year simulation period, 1989–1993, were used for<br />

the uncertainty analysis.<br />

4. Results – uncertainties of model results<br />

4.1. Evaluation of the number of Monte Carlo runs<br />

The main results of the comparison between three<br />

individual sets of 25 Monte Carlo runs are given in<br />

Table 2. Statistical tests showed that the hypothesis of<br />

homogeneity of means and variances cannot be<br />

Fig. 2. Statistical distribution from 25 Monte Carlo runs of simulated average annual river flow at the catchment outlet. The corresponding<br />

measured value based on daily river flow data was 451 mm/year.



Fig. 3. Statistical distribution over 25 Monte Carlo runs of simulated areal average NO3 concentrations in upper aquifer layer by the end of<br />

1993. The corresponding measured value based on data from 35 wells was 58 mg/l.<br />

rejected. As the three sub-sets appear statistically<br />

similar it was concluded that 25 Monte Carlo runs<br />

were sufficient to assess the uncertainty on the simulation<br />

results. It should be emphasised that the small<br />

number of Monte Carlo runs is only possible because<br />

we focus on mean values and standard deviations. If<br />

the aim were to assess uncertainties on extreme<br />

values, such as the 1% fractile, 25 runs would<br />

obviously not have been sufficient.<br />
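The homogeneity-of-means check can be reproduced approximately from the summary statistics in Table 2. The sketch below computes a one-way ANOVA F statistic for three groups of 25 runs; it is an illustration under the assumption that the F-test mentioned in the table footnote is of this kind, and the function name is hypothetical. The critical value of roughly 3.1 (5% level, 2 and 72 degrees of freedom) is an approximate tabulated value, not a figure from the paper.

```python
def anova_f_from_summary(means, stds, n):
    """One-way ANOVA F statistic from group means and standard deviations.

    Assumes all groups have the same size n.
    """
    k = len(means)
    grand_mean = sum(means) / k
    # Between-group mean square, k - 1 degrees of freedom.
    ms_between = n * sum((m - grand_mean) ** 2 for m in means) / (k - 1)
    # Within-group mean square: with equal group sizes this is the
    # average of the group variances, k * (n - 1) degrees of freedom.
    ms_within = sum(s ** 2 for s in stds) / k
    return ms_between / ms_within


# Means and standard deviations for runs 1-25, 26-50 and 51-75 (Table 2).
TABLE2 = {
    "Leaching from root zone": ([64.7, 68.2, 67.2], [19.2, 18.9, 16.7]),
    "Groundwater concentration": ([47.7, 48.3, 47.6], [8.0, 7.2, 6.0]),
    "River flow": ([464.0, 464.0, 464.0], [22.0, 23.0, 17.0]),
    "River concentration": ([45.1, 46.2, 45.7], [7.8, 7.3, 6.6]),
}

f_values = {
    name: anova_f_from_summary(means, stds, 25)
    for name, (means, stds) in TABLE2.items()
}

if __name__ == "__main__":
    for name, f in f_values.items():
        print(f"{name}: F = {f:.3f}")
```

All four F values are well below 1, so under these assumptions the hypothesis of equal means would not be rejected, in line with the conclusion drawn in the text.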

4.2. Comparisons with field data<br />

The simulated uncertainty intervals on selected<br />

model results were, if possible, compared to corresponding<br />

measured data available from monitoring<br />

programmes conducted in the area. In this context it<br />

is noted that due to the adopted scaling approach, the<br />

simulation results are only supposed to reflect the field<br />

observations at a catchment scale and not at a point<br />

scale.<br />

The simulated water balance, represented by average<br />

annual river discharge at the catchment outlet,<br />

varies from 428 to 502 mm/year (Fig. 2). The corresponding<br />

measured value is 451 mm/year which<br />

falls within the simulated interval and within 5% of<br />

both the median (462 mm) and the average (463 mm)<br />

Fig. 4. Statistical distribution over 25 Monte Carlo runs of percentage of catchment area with NO3 concentrations above the drinking water limit<br />

of 50 mg/l. The corresponding measured value based on data from 35 wells was 57%.



Fig. 5. Measured (■) and simulated (×) areal distribution of NO3 concentrations in groundwater at eight points in time. Measured values are based on 35 groundwater observations.



Fig. 6. (a) Simulated time series of six-monthly flux concentrations from the root zone obtained in three different crop rotations (■ = mean,<br />

□ = ±1 × std). The range of seasonal variation in standard errors is shown inside the figures. (b) Simulated time series of average areal aquifer<br />

concentrations (■ = mean, □ = ±1 × std). The range of seasonal variation in standard errors is shown inside the figures.



Fig. 6. (continued)<br />

of the simulated values. Fig. 3 presents the simulated<br />

distribution of average nitrate concentrations in the<br />

upper groundwater layer averaged over the entire<br />

catchment and over the five-year simulation period.<br />

The corresponding value obtained from observations<br />

in 35 wells is 58 mg/l, which falls within the simulated<br />

interval (35.4–61.4 mg/l) and within 25% of<br />

both the median (46.7 mg/l) and the average<br />

(47.4 mg/l) of the Monte Carlo runs. In Fig. 4 the<br />

fraction of the catchment area with groundwater<br />

concentrations above the drinking water limit of<br />

50 mg/l is shown in terms of statistical distribution<br />

for the 25 Monte Carlo runs. Also in this case the<br />

observed value from the 35 observation wells (57%)<br />

falls within the simulated interval (27–65%) and<br />

within 10% of the median (53%) of the Monte Carlo<br />

runs.<br />

A visual comparison is shown in Fig. 5, where<br />

observed areal distributions of nitrate concentrations<br />

from existing wells are compared to similar results<br />

from the Monte Carlo runs on a six-monthly basis.<br />

From this figure it is seen that the measured concentration<br />

distribution in general is within the uncertainty<br />

band generated from the Monte Carlo simulations,<br />

though not always centred. It appears that, in general,<br />

the simulated fraction of the area with nitrate concentrations<br />

exceeding 50 mg/l is slightly overestimated in<br />

the summer period and slightly underestimated in the<br />

winter period, indicating that the overall trend in the<br />

concentration level is simulated adequately whereas<br />

the seasonal variation in observed concentrations is<br />

not fully represented in the simulations.<br />

4.3. Nitrate concentrations in aquifer – at different<br />

temporal and spatial scales<br />

The results regarding the uncertainty on simulated<br />

nitrogen leaching from different cropping patterns and<br />

the importance of the contribution from different error<br />

sources are described in detail in Hansen et al. (1999).<br />

The present paper focuses on the catchment scale and<br />

on how uncertainties at a point scale propagate and are<br />

transformed (reduced) at larger spatial and temporal<br />

scales.<br />

The transformation process is illustrated in Fig. 6<br />

which shows the uncertainty, characterised by time<br />

series of the means and standard deviations among<br />

the 25 Monte Carlo runs for (a) six-monthly flux<br />

concentrations from the root zone (DAISY output)<br />

for three different crop rotations, and (b) mean six-monthly<br />

concentrations in the upper aquifer layer<br />

averaged over the entire aquifer. The figures clearly<br />

show how the uncertainties are<br />

reduced when moving from root zone leakage to aquifer<br />

concentrations at catchment scale. It is<br />

remarkable, for instance, that the average standard<br />

errors (standard deviation divided by mean) of six-monthly<br />

root zone flux concentrations, of the order<br />

of 33–44%, are reduced to a standard error of 18%<br />

on the assessed mean six-monthly values for groundwater<br />

water concentrations at the catchment scale.<br />

The large seasonal variation in concentration levels<br />

observed in the percolation water (Fig. 6a) is levelled<br />

out in the simulated groundwater concentrations at<br />

both grid level and catchment level. This is mainly a



Table 3<br />

Simulations used for evaluation of uncertainty contributions. All six sets are based on the input uncertainties drawn for the first set of Monte Carlo simulations (1–25)<br />

Monte Carlo run series | Status of parameters<br />

O | All five parameters are treated as stochastic<br />

A | Precipitation is treated as deterministic<br />

B | Texture is treated as deterministic<br />

C | Soil organic matter is treated as deterministic<br />

D | Slurry composition is treated as deterministic<br />

E | Depth of reduction front is treated as deterministic<br />

result of dilution and averaging in the entire groundwater<br />

volume of the upper layer which accounts for<br />

8–13 m of the saturated zone. The differences in<br />

concentration levels between crop rotations are, on<br />

the other hand, still reflected in the groundwater<br />

concentrations of corresponding grids (Fig. 6b), with<br />

the lowest concentrations arising from the plant production<br />

rotations and the highest concentrations from the pig rotations.<br />

4.4. Analyses of different sources of input error<br />

In addition to the basic set of Monte Carlo simulations<br />

(1–25), where all five selected parameters were<br />

treated stochastically, five series were simulated, in<br />

each of which one of the Monte Carlo parameters<br />

was kept deterministic (Table 3). The results of<br />

these five extra series were compared to the result of<br />

the basic set in order to evaluate the uncertainty associated<br />

with each of the selected parameters. Table<br />

4 shows the uncertainty contribution of each parameter, given as<br />

a variance. The variance contribution of a<br />

single parameter was obtained by subtracting the<br />

total simulated variance obtained with only four<br />

stochastic parameters (e.g. series A) from the total<br />

variance obtained with five stochastic parameters<br />

(series O). Ideally, the sum of the variances corresponding<br />

to the simulation series A–E should equal<br />

the variance associated with Monte Carlo run series<br />

O, if no covariance components were generated. It is,<br />

however, noted that discrepancies occur, indicating<br />

that not all variance and covariance components are<br />

accounted for. In spite of this, the results can give a<br />

rough estimate on the relative importance of the<br />

selected sources of uncertainty.<br />
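The subtraction scheme and the comparison between the summed contributions and the total variance can be illustrated with the values reported in Table 4. This is a sketch only: the function names are hypothetical, and the relative shares are computed here for illustration rather than taken from the paper.

```python
def variance_contribution(var_all, var_without):
    """Contribution of one parameter: variance with all five parameters
    stochastic (series O) minus variance with that parameter deterministic."""
    return var_all - var_without


def summarise(contributions, var_all):
    """Return the sum of contributions, each parameter's share of that sum,
    and the discrepancy relative to the series O variance."""
    total = sum(contributions)
    shares = [c / total for c in contributions]
    return total, shares, total - var_all


# Variance contributions A-E and series O variance, as listed in Table 4.
SERIES = ["A", "B", "C", "D", "E"]
TABLE4 = {
    "Leaching from root zone": ([0, 192, 100, 114, 0], 370),
    "Groundwater concentration": ([2, 30, 29, 28, 0], 64),
    "River flow": ([284, 345, 6, 6, 0], 499),
    "River concentration": ([0, 27, 21, 19, 0], 61),
}

if __name__ == "__main__":
    for name, (contributions, var_all) in TABLE4.items():
        total, shares, gap = summarise(contributions, var_all)
        share_text = ", ".join(
            f"{s}: {100 * v:.0f}%" for s, v in zip(SERIES, shares)
        )
        print(f"{name}: sum = {total}, O = {var_all}, gap = {gap}; {share_text}")
```

The positive gap between the sum of A–E and the series O variance in every row is the discrepancy the text attributes to variance and covariance components that are not accounted for.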

As can be seen from Table 2 (runs 1–25), the uncertainty<br />

on the simulated annual river flows (CV = std./<br />

mean = 5%) was significantly less than the uncertainty<br />

related to the components of the nitrogen<br />

balance, i.e. nitrogen leaching (CV = 30%) and nitrate<br />

concentrations in groundwater and stream water<br />

(CV = 17%). According to Table 4, the uncertainty<br />

on simulated river flow was dominated by contributions<br />

from uncertainty on soil texture and on precipitation,<br />

whereas the uncertainties associated with<br />

components of the nitrogen balance were dominated<br />

by the uncertainty contributions from soil<br />

texture, SOM and slurry composition. Uncertainty<br />

on precipitation contributed only little to the simulated<br />

uncertainties on the nitrogen components despite<br />

the influence it had on the water balance. The depth of<br />

Table 4<br />

Estimation of uncertainty on selected simulation results, distributed on calculated variance contributions (s²) from precipitation (A), soil texture (B), soil organic matter (C), slurry composition (D), and depth of the reduction front (E)<br />

Variable | A | B | C | D | E | SUM (A:E) | All parameters, O (a)<br />

Leaching from root zone (kg/ha/year) | 0 | 192 | 100 | 114 | 0 | 406 | 370<br />

Groundwater concentration (mg/l) | 2 | 30 | 29 | 28 | 0 | 89 | 64<br />

River flow (mm/year) | 284 | 345 | 6 | 6 | 0 | 641 | 499<br />

River concentration (mg/l) | 0 | 27 | 21 | 19 | 0 | 67 | 61<br />

(a) Variance from simulations with all five Monte Carlo parameters included.



the reduction front appeared to have only minor influence<br />

on the uncertainty of stream water concentrations<br />

in the present simulations.<br />
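The coefficients of variation quoted in this section (CV = std./mean) can be checked against the means and standard deviations for runs 1–25 in Table 2. The short sketch below is only a worked arithmetic example; the variable names are hypothetical.

```python
# (mean, standard deviation) for runs 1-25, taken from Table 2.
RUNS_1_25 = {
    "Leaching from root zone": (64.7, 19.2),
    "Groundwater concentration": (47.7, 8.0),
    "River flow": (464.0, 22.0),
    "River concentration": (45.1, 7.8),
}

# Coefficient of variation in percent: 100 * std / mean.
cv_percent = {
    name: 100.0 * std / mean for name, (mean, std) in RUNS_1_25.items()
}

if __name__ == "__main__":
    for name, cv in cv_percent.items():
        print(f"{name}: CV = {cv:.1f}%")
```

Rounded to whole percentages, these values reproduce the 5%, 30% and 17% cited in the text for river flow, nitrogen leaching and nitrate concentrations.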

5. Discussion and conclusions<br />

From the analysis of input error contributions it was<br />

observed that only three of the five input parameters<br />

included in the uncertainty analysis contributed<br />

significantly to the simulated variation in the model<br />

output related to the nitrogen balance, i.e. areal leaching<br />

from the root zone and average nitrate concentrations<br />

in groundwater and stream water. Of these three<br />

only one, soil texture, is related to the transport<br />

processes. The two others, SOM and slurry composition,<br />

are related to the nitrogen turnover processes.<br />

The uncertainty introduced to the driving variable<br />

precipitation influenced the simulated water balance<br />

but not the simulated nitrogen balance. This indicates<br />

that the timing of the percolating water governed by<br />

the hydraulic parameters is more important for the<br />

simulated nitrogen loads than the total annual<br />

amounts of percolation. This result is supported by<br />

other studies showing that one of the major factors<br />

influencing nitrogen losses from the root zone under<br />

northern temperate climate is the amount of readily<br />

available organic nitrogen present in the soil at the end<br />

of the growing season, when groundwater recharge is<br />

initiated (Landbrugets Rådgivningscenter, 1996). The<br />

predicted uncertainty on the simulated river flow is in<br />

good agreement with results from Storm et al. (1988).<br />

The uncertainty introduced to the depth of the<br />

reduction front in the saturated zone had no influence<br />

on the simulation results. The main reason for this is<br />

that the simulated groundwater levels were shallower<br />

than normally observed in the area. This prevented the<br />

percolating water from passing through the reduced<br />

zone before entering the stream. If the hydrogeological<br />

parameters had been included in the Monte Carlo<br />

analysis, the depth of the reduction front might have<br />

contributed to the simulated variation in the nitrogen<br />

balance component, in particular stream flow concentrations,<br />

as well.<br />

A fundamental limitation of the adopted approach<br />

is that the errors due to incorrect model structure are<br />

neglected. One approach to assess such model error is<br />

through comparison of predicted and observed values.<br />

In the present case it was, however, not possible<br />

during the validation tests to identify a significant<br />

model error. This must not be taken as a general<br />

proof for a correct model structure. It only shows<br />

that the model performs without apparent model<br />

error for the particular case study.<br />

Another limitation of the adopted approach lies in<br />

the choice of associating input uncertainty with only five<br />

parameters. Although these five parameters, according<br />

to our experience, are the most important ones in the<br />

different processes governing the nitrate leaching and<br />

transformation, this has not been documented by<br />

systematic sensitivity analyses, either by us or by<br />

other authors. It can be argued that the uncertainties<br />

have been underestimated by neglecting the uncertainty<br />

on the other input parameters. Hence, the absolute<br />

uncertainty figures should be considered with<br />

some reservation.<br />

A third limitation is the mostly subjective method<br />

of assessing errors in input data. If suitable data had<br />

been available for assessing such errors in a statistically<br />

more rigorous way this should have been done.<br />

Cases where such data are available are typically<br />

studies on small experimental areas, while our case<br />

is more comparable to practical studies, where such<br />

data most often are not available. In spite of the weak<br />

data basis for the input error assessment, the adopted<br />

Monte Carlo analysis is still valuable as a rigorous<br />

method of analysing uncertainty propagation,<br />

although the predicted uncertainties should be treated<br />

with some caution.<br />

When considering uncertainties at different scales it<br />

must be noticed that due to the adopted approaches<br />

with respect to upscaling and Monte Carlo sampling<br />

the uncertainties can only be assumed to be correctly<br />

assessed at the catchment scale, while the uncertainties<br />

at smaller scales are underestimated. This amplifies<br />

the finding reflected in Fig. 6, namely that the<br />

uncertainty in flux concentrations leaving the root<br />

zone is much larger than the uncertainty at the catchment/aquifer<br />

scale. Taking this into account one could<br />

argue that the uncertainty in simulated flux concentrations<br />

leaving the root zone at point/grid scale is so<br />

large that this in itself may lead to the conclusion<br />

that modelling with this type of model, this grid<br />

size, and this data basis is of minor practical use.<br />

However, the uncertainty at the catchment (or aquifer)<br />

scale, which is an interesting scale seen from a water



supply and policy point of view, is reduced so much<br />

that the results may be useful in practice. This duality<br />

illustrates that discussions of model uncertainty are<br />

useless unless the type of simulation result is defined<br />

precisely in terms of spatial and temporal scale, which<br />

is probably one of the reasons why 'field/process<br />

study oriented scientists' and 'modellers/large scale<br />

oriented scientists' often misunderstand each other.<br />

One way of reducing the simulated uncertainty<br />

would be to increase the quality of the input data<br />

support either by using national databases instead of<br />

the European data sets or by actually gathering site-specific<br />

data through field monitoring. The uncertainty<br />

related to the texture composition could be<br />

reduced by using national soil databases, which<br />

often include more detailed classification systems<br />

than the FAO approach provided in the GISCO database.<br />

Keeping the procedure of using pedo-transfer<br />

functions for obtaining hydraulic parameters, this<br />

would decrease the uncertainty within each defined<br />

soil class. Based on the effect of keeping soil texture<br />

deterministic (Table 4) it could for example be<br />

expected that a 50% reduction in the input error<br />

related to soil texture obtained by collecting better<br />

data in this way would reduce the uncertainty on<br />

simulated groundwater concentration by approximately<br />

25%. Gathering of better precipitation data<br />

would, on the other hand, only improve simulation<br />

of the water balance and not influence the simulated<br />

uncertainty in groundwater concentrations significantly.<br />

Another way of decreasing the uncertainty would<br />

be to carry out model calibration, as this in principle<br />

would decrease the uncertainty related to the input<br />

parameters. In practice it is, however, difficult to<br />

quantify how much the input error of a single parameter<br />

should be reduced if calibration involving this<br />

parameter is conducted. In the present study, calibration<br />

of the hydrogeological parameters by use of<br />

measured groundwater levels and observed stream<br />

flow might have influenced both the simulated<br />

groundwater concentrations by introducing a more<br />

diverse hydrology and in particular the simulated<br />

stream concentrations as the reduction front may<br />

have come into function. Calibration of the root<br />

zone processes would have required field data in<br />

terms of e.g. soil moisture contents, nitrogen concentrations<br />

in the root zone, crop yields, etc., data which<br />

are not often available. In order to get some idea of the<br />

quality of the simulated mass balances, one possibility<br />

could be to calibrate the simulated crop yields using<br />

regional agricultural statistics, though these can only<br />

provide rather rough estimates.<br />

From the results of the present study it can be<br />

concluded that the present modelling approach appears<br />

feasible for estimating uncertainties in predicted<br />

nitrate concentrations at larger scales, and hereby<br />

also for evaluating the reliability of the simulation<br />

results. The results also indicate that the use of distributed<br />

physically-based models is feasible at the catchment<br />

scale, even if data have to be obtained from<br />

readily available aggregated data sources such as<br />

European databases. Given the constraints for obtaining<br />

data and given that no model calibration was<br />

performed in the present case study, the validation<br />

tests came out surprisingly well as measured groundwater<br />

concentrations were within the uncertainty<br />

intervals of the simulated groundwater concentration.<br />

The uncertainty of the model simulations at catchment<br />

scale is at a relatively low level, and thus the predictive<br />

capability of the model appears very interesting<br />

from a practical water resources management point<br />

of view.<br />

Acknowledgements<br />

The present work was partly funded by the EC<br />

Environment and Climate Research Programme<br />

(contract number ENV4-CT95-0070). We thank the<br />

two reviewers, Tim Burt and Bernd Huwe, for valuable<br />

comments on an earlier version of this manuscript.<br />

References<br />

Abbott, M.B., Bathurst, J.C., Cunge, J.A., O'Connell, P.E., Rasmussen,<br />

J., 1986. An introduction to the European hydrological<br />

system – Système Hydrologique Européen 'SHE'. 1. History<br />

and philosophy of a physically based distributed modelling<br />

system. 2. Structure of a physically based distributed modelling<br />

system. Journal of Hydrology 87, 45–77.<br />

Agricultural Statistics, 1995. Danmarks Statistik, 294pp.<br />

Ahsan, M., O'Connor, K.M., 1994. A reappraisal of the Kalman<br />

filtering technique as applied in river flow forecasting. Journal<br />

of Hydrology 161, 197–226.<br />

Allerup, P., Madsen, H., Riis, J., 1982. Methods for calculating areal



precipitation – applied to the Suså-catchment. Nordic Hydrology<br />

13, 263–278.<br />

Arnold, J.G., Williams, J.R., Nicks, A.D., Sammons, N.B., 1990.<br />

SWRRB – A Basin Scale Simulation Model for Soil and Water<br />

Resources Management. Texas A & M University Press,<br />

College Station (241 pp).<br />

Arnold, J.G., Williams, J.R., 1995. SWRRB – a watershed scale<br />

model for soil and water resources management. In: Singh, V.P.<br />

(Ed.). Computer Models of Watershed Hydrology. Water<br />

Resources Publication, pp. 847–908.<br />

Beven, K., Binley, A.M., 1992. The future role of distributed<br />

models: model calibration and predictive uncertainty. Hydrological<br />

Processes 6, 279–298.<br />

Brooks, R.H., Corey, A.T., 1966. Properties of porous media affecting<br />

fluid flow. Journal of the Irrigation and Drainage Division of<br />

the American Society of Civil Engineering 92, 61–88.<br />

Burdine, N.T., 1952. Relative permeability calculations from pore-size<br />

distribution data. Transactions of the AIME 198, 35–42.<br />

Børgesen, C.D., 2000. Personal communication. Danish Institute of<br />

Agricultural Science.<br />

Campbell, G.S., 1974. A simple method for determining unsaturated<br />

conductivity from moisture retention data. Soil Science<br />

117, 311–314.<br />

Cosby, B.J., Hornberger, G.M., Clapp, R.B., Ginn, T.R., 1984. A statistical<br />

exploration of relationships of soil moisture characteristics to<br />

the physical properties of soils. Water Resources Research 20,<br />

682–690.<br />

Dagan, G., 1986. Statistical theory of groundwater flow and transport:<br />

pore to laboratory, laboratory to formation, and formation<br />

to regional scale. Water Resources Research 22 (9), 120–134.<br />

DeCoursey, D.G., Rojas, K.W., Ahuja, L.R., 1989. Potentials for<br />

non-point source groundwater contamination analyzed using<br />

RZWQM. Paper no. SW892562. Presented at the International<br />

American Society of Agricultural Engineers' Winter Meeting,<br />

New Orleans, Louisiana.<br />

EC, 1982. Groundwater resources in Denmark. Commission of the<br />

European Communities. EUR 7941 (in Danish).<br />

Freeze, R.A., 1980. A stochastic-conceptual analysis of the rainfall-runoff<br />

process on a hillslope. Water Resources Research 16 (2),<br />

391–408.<br />

Gelb, A. (Ed.), 1974. Applied Optimal Estimation. MIT Press,<br />

Cambridge, MA.<br />

Gelhar, L.W., 1986. Stochastic subsurface hydrology. From theory<br />

to applications. Water Resources Research 22 (9), 135–145.<br />

Hansen, S., Jensen, H.E., 1988. Spatial variability of soil physical<br />

properties. Theoretical and experimental analysis. II. Soil water<br />

variables-data acquisition, processing and basic statistics.<br />

Research report no. 1210. Department of Soil and Water and<br />

Plant Nutrition. The Royal Veterinary and Agricultural University,<br />

Copenhagen, 54pp.<br />

Hansen, S., Jensen, H.E., Nielsen, N.E., Svendsen, H., 1991. Simulation<br />

of nitrogen dynamics and biomass production in winter<br />

wheat using the Danish simulation model DAISY. Fertiliser<br />

Research 27, 245±259.<br />

Hansen, S., Thorsen, M., Pebesma, E., Kleeschulte, S., Svendsen, H.,<br />

1999. Uncertainty in simulated leaching due to uncertainty in input<br />

data. A case study. Soil Use and Management 15, 167±175.<br />

Heng, H.H., Nikolaidis, N.P., 1998. Modelling of nonpoint source<br />

pollution of nitrogen at the watershed scale. Journal of the<br />

American Water Resources Association 34 (2), 359±374.<br />

Jensen, K.H., Mantoglou, A., 1992. Application of stochastic<br />

unsaturated ¯ow theory, numerical simulations and comparison<br />

to ®eld observations. Water Resources Research 28 (1),<br />

269±284.<br />

Kros, J., Pebesma, E.J., Reinds, G.J., Finke, P.A., 1999. Uncertainty<br />

assessment in modelling soil acidi®cation at the European scale:<br />

a case study. Journal of Environmental Quality 28 (2), 366±377.<br />

Lamm, C.G., 1971. The Danish soil database. Tidskrift for Planteavl<br />

75, 703±720 (in Danish).<br />

Landbrugets RaÊdgivningscenter, 1996. Square grid for nitrate investigations<br />

in Danmark 1990±1993. Landskontoret for Planteavl,<br />

Skejby, Denmark (in Danish).<br />

McKay, M.D., Conover, W.J., Beckman, R.J., 1979. A comparison<br />

of three methods for selection values of input variables in the<br />

analysis of output from a computer code. Technometrics 2,<br />

239±245.<br />

Nemec, J., 1994. Distributed hydrological models in the perspective<br />

of forecasting operational real time hydrological systems<br />

(FORTHS). In: Rosso, P., Peano, A., Becchi, I., Bemporad,<br />

G.A. (Eds.). Advances in Distributed Hydrology. Water<br />

Resources Publications, pp. 69±84.<br />

Pebesma, E.J., Heuvelink, G.B.M., 1999. Latin hypercube sampling<br />

of Gaussian random ®elds. Technometrics 41 (4), 303±312.<br />

Plantedirektoratet, 1996. Vejledninger og skemaer 1996/1997.<br />

Ministry for Food, Agriculture and Fishery, 38pp.<br />

Refsgaard, J.C., 1996. Terminology, modelling protocol and classi-<br />

®cation of hydrological model codes. In: Abbott, M.B.,<br />

Refsgaard, J.C. (Eds.). Distributed Hydrological Modelling.<br />

Kluwer Academic, pp. 17±39.<br />

Refsgaard, J.C., Storm, B., 1995. MIKE SHE. In: Singh, V.P. (Ed.).<br />

Computer Models of Watershed Hydrology. Water Resources<br />

Publication, pp. 809±846.<br />

Refsgaard, J.C., Ramaekers, D., Heuvelink, G.B.M., Schreurs, V.,<br />

Kros, H., RoseÂn, L., Hansen, S., 1998. Assessment of cumulative<br />

uncertainty in spatial decision support systems: application<br />

to examine the contamination of groundwater from diffuse<br />

sources (UNCERSDSS). Presented at the European Climate<br />

Science Conference, Vienna, 19±23 October, 1998. To appear<br />

in conference proceedings.<br />

Refsgaard, J.C., Thorsen, M., Birk Jensen, J., Kleeschulte, S.,<br />

Hansen, S., 1999. Large scale modelling of groundwater<br />

contamination from nitrogen leaching. Journal of Hydrology<br />

221, 117±140.<br />

Simmelsgaard, S.E., 1991. Estimating functions for nitrogen leaching:<br />

nitrogen fertilizers in agriculture Ð requirement and leaching<br />

now and in the future. National Institute of Agricultural<br />

Economics, Copenhagen, Denmark (in Danish).<br />

Skop, E., 1993. Calculation of nitrogen leaching on a regional scale.<br />

Technical report no. 65. National Environmental Research Institute,<br />

Silkeborg, Denmark, 54 pp (in Danish).<br />

Smith, L., Freeze, R.A., 1979a. Stochastic analysis of steady state<br />

¯ow in a bounded domain. 1. One-dimensional simulations.<br />

Water Resources Research 15 (3), 521±528.<br />

Smith, L., Freeze, R.A., 1979b. Stochastic analysis of steady state


M. Thorsen et al. / Journal of Hydrology 242 (2001) 210±227 227<br />

¯ow in a bounded domain. 2. Two-dimensional simulations.<br />

Water Resources Research 15 (6), 1543±1559.<br />

Smith, R.E., 1992. An integrated simulation model of<br />

nonpoint-source pollutants at the ®eld scale. Department of<br />

Agriculture, Agricultural Research Service, 120pp.<br />

Storm, B., Jensen, K.H., Refsgaard, J.C., 1988. Estimation of catchment<br />

rainfall uncertainty and its in¯uence on runoff prediction.<br />

Nordic Hydrology 19, 77±88.<br />

Styczen, M., Storm, B., 1993. Modelling of N-movements on catchment<br />

scale Ð a tool for analysis and decision making. 1. Model<br />

description. & 2. A case study. Fertiliser Research 36, 1±17.<br />

Tietje, O., Tapkenhinrichs, M., 1993. Evaluation of pedo-transfer<br />

functions. Soil Science Society of America Journal 57, 1088±<br />

1095.<br />

Wood, E., O'Connell, P.E., 1985. Real-time forecasting. In: Anderson,<br />

M.G., Burt, T.P. (Eds.). Hydrological Forecasting. Wiley,<br />

New York, pp. 505±558.<br />

Wood, E.F., Sivapalan, M., Beven, K.J., Band, L., 1988. Effects of<br />

spatial variability and scale with implications to hydrologic<br />

modelling. Journal of Hydrology 102, 29±47.<br />

Zhang, H., Haan, C.T., Nofziger, D.L., 1993. An approach to estimating<br />

uncertainties in modelling transport of solutes through<br />

soils. Journal of Contaminant Hydrology 12, 35±50.


[12]<br />

Refsgaard JC, Henriksen HJ (2004) Modelling guidelines – terminology and<br />

guiding principles.<br />

Advances in Water Resources, 27(1), 71-82.<br />

Reprinted from Advances in Water Resources with permission from Elsevier


Advances in Water Resources 27 (2004) 71–82<br />


Modelling guidelines – terminology and guiding principles<br />

Jens Christian Refsgaard * , Hans Jørgen Henriksen<br />

Department of Hydrology, Geological Survey of Denmark and Greenland (GEUS), Øster Voldgade 10, Copenhagen DK-1350, Denmark<br />

Received 29 October 2002; received in revised form 7 August 2003; accepted 18 August 2003<br />

Abstract<br />

Some scientists argue, with reference to Popper’s scientific philosophical school, that models cannot be verified or validated.<br />

Other scientists and many practitioners nevertheless use these terms, but with very different meanings. As a result of an increasing<br />

number of examples of model malpractice and mistrust in the credibility of models, several modelling guidelines have been elaborated<br />

in recent years with the aim of improving the quality of modelling studies. This gap between the views and the lack of<br />

consensus experienced in the scientific community and the strongly perceived need for commonly agreed modelling guidelines is<br />

constraining the optimal use and benefits of models. This paper proposes a framework for quality assurance guidelines, including a<br />

consistent terminology and a foundation for a methodology bridging the gap between scientific philosophy and pragmatic modelling.<br />

A distinction is made between the conceptual model, the model code and the site-specific model. A conceptual model is<br />

subject to confirmation or falsification like scientific theories. A model code may be verified within given ranges of applicability and<br />

ranges of accuracy, but it can never be universally verified. Similarly, a model may be validated, but only with reference to site-specific<br />

applications and to pre-specified performance (accuracy) criteria. Thus, a model’s validity will always be limited in terms<br />

of space, time, boundary conditions and types of application. This implies a continuous interaction between manager and modeller<br />

in order to establish suitable accuracy criteria and predictions associated with uncertainty analysis.<br />

© 2003 Elsevier Ltd. All rights reserved.<br />

Keywords: Model guidelines; Scientific philosophy; Validation; Verification; Confirmation; Domain of applicability; Uncertainty<br />

1. Introduction<br />

Models describing water flows, water quality and<br />

ecology are being developed and applied in increasing<br />

number and variety. With the requirements imposed by<br />

the EU Water Framework Directive the trend in recent<br />

years to base water management decisions to a larger<br />

extent on model studies and to use more sophisticated<br />

models is likely to be reinforced. At the same time<br />

insufficient attention is generally given to documenting<br />

the predictive capability of the models. Therefore, contradictions<br />

emerge regarding the various claims of<br />

model applicability on the one hand and the lack of<br />

documentation of these claims on the other hand.<br />

Hence, the credibility of the models is often questioned,<br />

and sometimes with good reason.<br />

As emphasised by e.g. Forkel [12] modelling studies<br />

involve several partners with different responsibilities.<br />

* Corresponding author. Tel.: +45-38-14-27-76; fax: +45-38-14-20-50.<br />

E-mail address: jcr@geus.dk (J.C. Refsgaard).<br />

The ‘key players’ are code developers, model users and<br />

water resources managers. However, due to the complexity<br />

of the modelling process and the different backgrounds<br />

of these groups, gaps in terms of lack of mutual<br />

understanding easily develop. For example, the strengths<br />

and limitations of modelling applications are most often<br />

difficult, if not impossible, to assess by the water resources<br />

managers. Similarly, the transformation of water<br />

managers’ objectives to specific performance criteria can<br />

be very difficult to assess for the model users. Due to lack<br />

of documentation and transparency, modelling projects<br />

can be difficult to audit, and without a considerable effort<br />

it is hardly possible to reconstruct, repeat and reproduce<br />

the modelling process and its results.<br />

In the water resources management community a<br />

number of different guidelines on good modelling practice<br />

have been prepared. One of the most, if not the most,<br />

comprehensive examples of modelling guidelines has<br />

been developed in The Netherlands [37] as a result of a<br />

process involving all the main players in the Dutch water<br />

management field. The background for this process was<br />

a perceived need for improving the quality in modelling<br />


doi:10.1016/j.advwatres.2003.08.006


72 J.C. Refsgaard, H.J. Henriksen / Advances in Water Resources 27 (2004) 71–82<br />

by addressing malpractice such as careless handling of<br />

input data, insufficient calibration and validation and<br />

model use outside its scope [34]. Similarly, the background<br />

for modelling guidelines for the Murray–Darling<br />

Basin in Australia was a perception among the end-users<br />

that model capabilities may have been ‘over-sold’, and<br />

that there is a lack of consistency in approaches, communication<br />

and understanding among and between<br />

modellers and water resources managers, often resulting<br />

in considerable uncertainty for decision making [25].<br />

A key problem in relation to establishment of generally<br />

acceptable modelling guidelines is confusion on<br />

terminology. For example the terms validation and<br />

verification are used with different, and sometimes<br />

interchangeable, meanings by different authors. The<br />

confusion arises from both semantic and philosophical<br />

considerations [32]. Another important problem is the<br />

lack of consensus related to the so far non-conclusive<br />

debate on the fundamental question concerning whether<br />

a water resources model can be validated or verified, and<br />

whether it as such can be claimed to be suitable or valid<br />

for particular applications [3,11,16,20,26].<br />

Finally, modelling guidelines have to reflect and be in<br />

line with the underlying philosophy of environmental<br />

modelling, which has changed significantly during the<br />

past decades from what in retrospect may be called<br />

rather naive enthusiasm (see for example Freeze and<br />

Harlan [13]; Abbott [1]; many of us focussed on the<br />

huge potentials of sophisticated models outlined in these<br />

early days without reflecting too much on the associated<br />

limitations) to what now appears to be a much more<br />

balanced and mature view (e.g. Beven [7,9]).<br />

Thus, there is a gap between the theory and practice,<br />

i.e. between the various, contradictory views and the<br />

lack of a common terminology and methodology in the<br />

scientific community on the one side, and the need for<br />

quality assurance guidelines for practical model<br />

applications on the other side. The objective of the<br />

present paper is to establish guiding principles for<br />

quality assurance guidelines, including establishing a<br />

consistent terminology and a foundation for a methodology<br />

bridging the gap between scientific philosophy and<br />

pragmatic modelling.<br />

2. Key opinions in the scientific community<br />

The present paper does not attempt to provide a full<br />

review of all relevant papers on this subject. Rather, it<br />

provides a review of a few selected characteristic<br />

examples.<br />

2.1. Terminology<br />

No unique and generally accepted terminology and<br />

methodology exist at present in the scientific community<br />

with respect to modelling protocol and guidelines for<br />

good modelling practice. Examples of general methodologies<br />

exist [4,32,33], but they use different terminology<br />

and have significant differences with respect to the<br />

underlying scientific philosophy.<br />

A rigorous and comprehensive terminology for model<br />

credibility was presented by Schlesinger et al. [33]. This<br />

terminology was developed by a committee composed of<br />

members from diverse disciplines and backgrounds with<br />

the intent that it could be employed in all types of simulation<br />

applications. In regard to terminology, distinctions<br />

are made between model qualification (adequacy<br />

of conceptual model), model verification (adequacy of<br />

computer programme) and model validation (adequacy<br />

of site-specific model). With the exception of a few<br />

important terms, such as generic model code and model<br />

calibration, which are not considered by Schlesinger<br />

et al. [33], their proposed terminology includes all the<br />

important elements of the modelling process.<br />

Konikow and Bredehoeft [20], in their thought-provoking<br />

paper, express the view that ‘‘the terms validation<br />

and verification have little or no place in<br />

groundwater science; these terms lead to a false impression<br />

of model capability’’. Their main argument relates<br />

to the anti-positivistic view that a theory (in this case a<br />

model) can never be proved to be generally valid, but<br />

may on the contrary be falsified by just one example. They<br />

argue and recommend that the term history matching,<br />

which does not indicate a claim of predictive capability,<br />

should be used instead.<br />

Oreskes et al. [26], in their classic and philosophically<br />

based paper, distinguish between verification, validation<br />

and confirmation:<br />

• Verify is ‘‘an assertion or establishment of truth’’. To<br />

verify a model therefore means to demonstrate its<br />

truth. According to the authors ‘‘verification is only<br />

possible in closed systems in which all the components<br />

of the system is established independently and<br />

are known to be correct. In its application to models<br />

of natural systems, the term verification is highly misleading.<br />

It suggests a demonstration of proof that is<br />

simply not accessible’’. They argue that mathematical<br />

components are subject to verification, because they<br />

are part of closed systems, but numerical models in<br />

application cannot be verified because of uncertainty<br />

of input parameters, scaling problems and uncertainty<br />

in observations.<br />

• The term validation is weaker than the term verification.<br />

Thus validation does not necessarily denote an<br />

establishment of truth, but rather ‘‘the establishment<br />

of legitimacy, typically given in terms of contracts,<br />

arguments and methods’’. They argue that ‘‘the term<br />

valid may be useful for assertions about a generic<br />

model code but is clearly misleading if used to refer<br />

to actual model results in any particular realisation’’.



• The term confirmation is weaker than the terms verification<br />

and validation. It is used with regard to a theory,<br />

when it is found that the theory is in agreement<br />

with empirical observations. As discussed below such<br />

agreement does not prove that the theory is true, it<br />

only confirms it.<br />

Oreskes et al. [26] do not define how the terms verification<br />

and validation should be used, but rather define<br />

their meaning and set limitations to the contexts in<br />

which they meaningfully can be used.<br />

An important distinction is made between open and<br />

closed systems. A system is a closed system if its true<br />

conditions can be predicted or computed exactly. This<br />

applies to mathematics and mostly to physics and<br />

chemistry. Systems where the true behaviour cannot be<br />

computed due to uncertainties and lack of knowledge on<br />

e.g. input data and parameter values are called open<br />

systems. The systems we are dealing with in water resources<br />

management, based on geosciences, biology and<br />

socio-economy, are open systems.<br />

It may be argued that e.g. the behaviour of a<br />

groundwater flow system can be predicted correctly if all<br />

the details of the subsurface (soil system and geological<br />

system) media were known, because the fundamental<br />

physical laws governing the flow are known. However,<br />

in practice it will never be possible to know all the details<br />

of the media down to molecular scale, and hence<br />

uncertainties will always exist. For instance, several<br />

alternative representations of the subsurface system at<br />

microscopic scale will be able to provide the same<br />

flow field at a macroscopic scale. Therefore, the results<br />

from a groundwater flow model are said to be non-unique.<br />

In addition, as the system is a so-called open<br />

system, the boundary conditions generate further<br />

uncertainty.<br />

Matalas et al. [24] draw a distinction between the<br />

terms ‘model’ and ‘theory’. They state that ‘‘a theory<br />

represents a synthesis of understanding, which provides<br />

not only a description of what constitutes the states of<br />

the system and their connectedness (i.e. postulated<br />

concepts), but also deducted consequences from these<br />

postulates. A model is an analogy or an abstraction,<br />

which ...may be derived intuitively and without formal<br />

deductive capability’’.<br />

Rykiel [32] argues that models can be validated as<br />

acceptable for pragmatic purposes, whereas theoretical<br />

validity is always provisional. In this respect he, like<br />

Matalas et al. [24], distinguishes between scientific<br />

models and predictive (engineering) models. Scientific<br />

models can be corroborated (confirmed) or refuted<br />

(falsified) in the sense of hypothesis testing, while predictive<br />

models can be validated or invalidated in the<br />

sense of engineering performance testing. Thus according<br />

to Rykiel [32], validation is not a procedure for<br />

testing scientific theory or for certifying the ‘truth’ of<br />

current scientific understanding, but rather a testing<br />

of whether a model is acceptable for its intended use.<br />

Within the hydraulic engineering community attempts<br />

have been made to establish a common quality<br />

assurance methodology (IAHR [18]). The IAHR methodology<br />

comprises guidelines for standard validation<br />

documents, where validation of a software package is<br />

considered in four steps [10,23]: conceptual validation,<br />

algorithmic validation, software validation and functional<br />

validation. It is noted that the term validation in<br />

the IAHR methodology corresponds to what other authors<br />

call code verification, while schemes for validation<br />

of site-specific models are not included.<br />

2.2. Scientific philosophical aspects of verification and<br />

validation<br />

Different principal schools of philosophical thought<br />

exist on the issue of verification and validation. During<br />

the second half of the 19th century and the first half of<br />

the 20th century positivism was the dominant philosophical<br />

school. Matalas et al. [24] characterise the<br />

positivistic school in the following way: ‘‘...theories are<br />

proposed through inductive logic, and the proposed<br />

theories are confirmed or refuted on the basis of critical<br />

experiments designed to verify the consequences of the<br />

theories. And through theory reduction or adoption of<br />

new or modified theories, science is able to approach<br />

truth’’. The logical rationale behind positivism is the<br />

inductive method, i.e. the inference from singular<br />

statements, such as accounts of results of observations<br />

or experiments, to universal statements, such as hypotheses<br />

or theories.<br />

Popper [29] opposed the positivistic school arguing<br />

that science is deductive rather than inductive, and that<br />

theories cannot be verified, only falsified. The deductive<br />

method implies inferences from a universal statement to<br />

a singular statement, where conclusions are logically<br />

derived from given premises. Science is considered as a<br />

hypothetico-deductive activity, implying that empirical<br />

observations must be framed as deductive consequences<br />

of a general theory or scientific law. If the observations<br />

can be shown to be true then the theory or law is said to<br />

be corroborated. Popper used the term corroboration instead<br />

of confirmation, because he ‘‘wanted a neutral<br />

term to describe the degree to which a theory has stood<br />

up to severe tests and proved its mettle’’.<br />

The greater the number and diversity of confirming<br />

observations the more credible the theory or law becomes.<br />

But no matter how much data and how many<br />

confirmations we have, there will always be the possibility<br />

that more than one theory can explain the observations.<br />

Over time the false theories are likely to be<br />

confronted with observations that falsify them. Thus,<br />

scientific theories are never certain or proved but only<br />

hypotheses subject to corroboration or falsification.



Popper [29] distinguished between two kinds of universal<br />

statements: the ‘strictly universal’ and the ‘numerical<br />

universal’. The strictly universal statements are<br />

those usually dealt with when speaking about theories<br />

or natural laws. They are a kind of ‘all-statement’<br />

claiming to be true for any place and any time. In contrast,<br />

numerical universal statements refer only to a<br />

finite class of specific elements within a finite individual<br />

spatio-temporal region. A numerical universal statement<br />

is thus in fact equivalent to conjunctions of singular<br />

statements.<br />
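The distinction can be written compactly. The notation here is only an illustrative assumption and is not Popper's own: $P$ stands for some property claimed by the statement, and $a_1, \dots, a_n$ for the finite class of elements within the given spatio-temporal region.

```latex
\begin{align*}
\text{strictly universal:}  \quad & \forall x \; P(x) \\
\text{numerical universal:} \quad & P(a_1) \wedge P(a_2) \wedge \dots \wedge P(a_n)
\end{align*}
```

Written this way, the second statement could in principle be checked element by element, whereas the first ranges over an unbounded domain and can only be falsified by a counter-example.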

Kuhn [21] also strongly criticised positivism, and in<br />

a discussion of selection of correct scientific theories<br />

(paradigms) states ‘‘... few philosophers of science still<br />

seek absolute criteria for the verification of scientific<br />

theories. Noting that no theory can ever be exposed to<br />

all possible relevant tests, they ask not whether a theory<br />

has been verified but rather about its probability in the<br />

light of the evidence that actually exists. And to answer<br />

that question one important school is driven to compare<br />

the ability of different theories to explain the evidence at<br />

hand.’’<br />

According to the deductive approach a given system<br />

is reduced into elements or sub-systems that are closed,<br />

i.e. without uncertainties from the boundary or initial<br />

conditions, and a given hypothesis is then confirmed by<br />

use of causal relationships and rigorous logic. The<br />

deductive method is the traditional scientific philosophy<br />

and methodology for ‘exact sciences’ such as physics and<br />

chemistry. Hansen [15] and Baker [5] argue that this<br />

deductive or ‘theory-directed’ scientific method is not<br />

suitable to earth sciences, such as geology and biology,<br />

which are characterised by open systems, and where<br />

many of the signs in the historical development process<br />

are not preserved. Instead, they argue for another scientific<br />

method, which they, respectively, denote ‘holistic’<br />

or ‘earth-directed’. The earth-directed scientific method<br />

does not focus on idealised theories verified in experimental<br />

laboratories. Instead, it is oriented towards<br />

observations in nature, uncontrolled by artificial constraints.<br />

The earth-directed method, being more ‘soft’<br />

and accepting conclusions on the complex state of nature<br />

from an integration of many observations, but<br />

without the logically rigorous proof required by the<br />

deductive method, can be argued to be well in line with<br />

Popper’s philosophy where the scientific knowledge<br />

comprises a variety of falsifiable theories that are subject<br />

to tests against observations [15].<br />

2.3. Philosophy of environmental modelling<br />

Following several papers (ranging from Beven [6] to<br />

[7]) with comprehensive critique of the predominant<br />

philosophy underlying most environmental modelling,<br />

Beven [9] outlines a new philosophy for modelling<br />

of environmental systems. The basic aim of this new<br />

approach is to extend the most common, past approach<br />

with a more realistic account of uncertainty rejecting the<br />

idea of being able to identify only one optimal model as<br />

being the most reliable for a given case. His basic idea is<br />

in line with Oreskes et al. [26] that verification and<br />

validation of environmental models is impossible, because<br />

natural systems are open. Instead environmental<br />

models may be non-unique subject to only a conditional<br />

confirmation, due to e.g. errors in model structure, calibration<br />

of parameters and period of data used for<br />

evaluation. Due to this there will always be the possibility<br />

of equifinality in that many different model<br />

structures and parameter sets may give simulations that<br />

cannot be falsified from the available observational<br />

data. Beven therefore argues that the range of behavioural<br />

models (structures and parameter sets) is best<br />

represented in terms of mapping of the ‘landscape space’<br />

into the ‘model space’, and that uncertainty predictions<br />

should consider all the behavioural models.<br />
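The notion of equifinality can be illustrated with a minimal Python sketch. Everything in it is an assumption introduced for illustration and is not taken from Beven [9]: the toy function `linear_reservoir`, the synthetic ‘observations’ (a known parameter set with an alternating 5% error), the Nash-Sutcliffe efficiency used as the measure of fit, and the behavioural threshold of 0.8.

```python
def linear_reservoir(rain, k, c):
    """Hypothetical toy model: one linear reservoir.
    k = storage constant (time steps), c = runoff coefficient."""
    storage, flows = 0.0, []
    for p in rain:
        storage += c * p          # effective rainfall enters storage
        q = storage / k           # outflow proportional to storage
        storage -= q
        flows.append(q)
    return flows


def nse(sim, obs):
    """Nash-Sutcliffe efficiency (1 = perfect fit); an assumed measure of fit."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((s - o) ** 2 for s, o in zip(sim, obs))
    return 1.0 - sse / sum((o - mean_obs) ** 2 for o in obs)


# Synthetic "observations": a known parameter set plus an alternating 5% error.
RAIN = [10, 0, 0, 0, 5, 0, 0, 20, 0, 0, 0, 0] * 3
K_TRUE, C_TRUE = 4.0, 0.5
obs = [q * (1.05 if t % 2 == 0 else 0.95)
       for t, q in enumerate(linear_reservoir(RAIN, K_TRUE, C_TRUE))]

# Evaluate a grid of candidate parameter sets against the observations.
THRESHOLD = 0.8  # assumed limit for calling a parameter set "behavioural"
candidates = [(i / 2, j / 20) for i in range(4, 17) for j in range(6, 15)]
behavioural = [(k, c) for k, c in candidates
               if nse(linear_reservoir(RAIN, k, c), obs) >= THRESHOLD]

# Predict the peak flow of a design storm with every behavioural set.
storm = [30, 0, 0, 0, 0]
peaks = [max(linear_reservoir(storm, k, c)) for k, c in behavioural]

print(len(behavioural), "behavioural parameter sets out of", len(candidates))
print("predicted peak flow range:", round(min(peaks), 2), "to", round(max(peaks), 2))
```

Running the sketch yields more than one behavioural parameter set. The spread of predicted peak flows across those sets, rather than a single ‘optimal’ value, is what would be reported as predictive uncertainty.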

3. Proposed terminology and methodological framework<br />

The following terminology is inspired by the generalised<br />

terminology for model credibility proposed by<br />

Schlesinger et al. [33], but modified and extended to<br />

accommodate some of the scientific philosophical issues<br />

raised above. The simulation environment is divided<br />

into four basic elements as shown in Fig. 1. The inner<br />

arrows describe the processes that relate the elements to<br />

each other, and the outer circle refers to the procedures<br />

that evaluate the credibility of these processes.<br />

In general terms a model is understood as a simplified<br />

representation of the natural system it attempts to describe.<br />

However, in the terminology proposed below a<br />

distinction is made between three different meanings of<br />

the general term model, namely the conceptual model,<br />

the model code and the model that here is defined as a<br />

site-specific model. The most important elements in the<br />

terminology and their interrelationships are defined as<br />

follows:<br />

Reality: The natural system, understood here as the<br />

study area.<br />

Conceptual model: A description of reality in terms of<br />

verbal descriptions, equations, governing relationships<br />

or ‘natural laws’ that purport to describe reality. This is<br />

the user’s perception of the key hydrological and ecological<br />

processes in the study area (perceptual model)<br />

and the corresponding simplifications and numerical<br />

accuracy limits that are assumed acceptable in order to<br />

achieve the purpose of the modelling. A conceptual<br />

model thus includes both a mathematical description<br />

(equations) and a description of flow processes, river<br />

system elements, ecological structures, geological features,<br />

etc. that are required for the particular purpose of<br />

modelling. By drawing an analogy to the scientific



Fig. 1. Elements of a modelling terminology. Modified after Schlesinger et al. [33].<br />

philosophical discussion above the conceptual model in<br />

other words constitutes the scientific hypothesis or theory<br />

that we assume for our particular modelling study.<br />

Model code: A mathematical formulation in the form<br />

of a computer program that is so generic that it, without<br />

program changes, can be used to establish a model with<br />

the same basic type of equations (but allowing different<br />

input variables and parameter values) for different study<br />

areas.<br />

Model: A site-specific model established for a particular<br />

study area, including input data and parameter<br />

values.<br />

Model confirmation: Determination of adequacy of<br />

the conceptual model to provide an acceptable level of<br />

agreement for the domain of intended application. This<br />

is in other words the scientific confirmation of the theories/hypotheses<br />

included in the conceptual model.<br />

Code verification: Substantiation that a model code is<br />

in some sense a true representation of a conceptual<br />

model within certain specified limits or ranges of application<br />

and corresponding ranges of accuracy.<br />

Model calibration: The procedure of adjustment of<br />

parameter values of a model to reproduce the response<br />

of reality within the range of accuracy specified in the<br />

performance criteria.<br />

Model validation: Substantiation that a model within<br />

its domain of applicability possesses a satisfactory range<br />

of accuracy consistent with the intended application of<br />

the model.<br />

Model set-up: Establishment of a site-specific model<br />

using a model code. This requires, among other things,<br />

the definition of boundary and initial conditions and<br />

parameter assessment from field and laboratory data.<br />

Simulation: Use of a validated model to gain insight<br />

into reality and obtain predictions that can be used by<br />

water managers. This includes insight into how reality<br />

can be expected to respond to human interventions. In<br />

this connection uncertainty assessments of the model<br />

predictions are very important.<br />

Performance criteria: Level of acceptable agreement<br />

between model and reality. The performance criteria<br />

apply both for model calibration and model validation.<br />

Domain of applicability (of conceptual model): Prescribed<br />

conditions for which the conceptual model has<br />

been tested, i.e. compared with reality to the extent<br />

possible and judged suitable for use (by model confirmation).<br />

Domain of applicability (of model code): Prescribed<br />

conditions for which the model code has been tested, i.e.<br />

compared with analytical solutions, other model codes<br />

or similar to the extent possible and judged suitable for<br />

use (by code verification).<br />

Domain of applicability (of model): Prescribed conditions<br />

for which the site-specific model has been tested,<br />

i.e. compared with reality to the extent possible and<br />

judged suitable for use (by model validation).<br />

The credibility of the descriptions or the agreements<br />

between reality, conceptual model, model code and<br />

model is evaluated through the terms confirmation,<br />

verification, calibration and validation. Thus, the relation<br />

between reality and the scientific description of reality<br />

which is constituted by the conceptual model with its<br />

theories and equations on flow and transport processes,<br />

its interpretation of the geological system and ecosystem<br />

at hand, etc., is evaluated through the confirmation of<br />

the conceptual model. As a logical consequence of our



position on scientific methodology, we use the term<br />

confirmation in connection with the conceptual model. This<br />

implies that we agree that it is never possible to prove<br />

the truth of a theory/hypothesis and as such of a conceptual<br />

model. And even if a site-specific model is<br />

eventually accepted as valid for specific conditions, this<br />

is not a proof that the conceptual model is true, because,<br />

due to non-uniqueness, the site-specific model may turn<br />

out to perform right for the wrong reasons.<br />

Methods for conceptual model confirmation should<br />

follow the standard procedures for confirmation of scientific<br />

theories. This implies that conceptual models<br />

should be confronted with actual field data and be<br />

subject to critical peer reviews. Furthermore, the feedback<br />

from the calibration and validation process may<br />

also serve as a means by which one or a number of<br />

alternative conceptual model(s) may be either confirmed<br />

or falsified.<br />

The ability of a given model code to adequately describe<br />

the theory and equations defined in the conceptual<br />

model by use of numerical algorithms is evaluated<br />

through the verification of the model code. Use of the<br />

term verification in this respect is in accordance with<br />

Oreskes et al. [26], because mathematical equations are<br />

closed systems. The methodologies used for code verification<br />

include comparing a numerical solution with an<br />

analytical solution or with a numerical solution from<br />

other verified codes. However, some programme errors<br />

only appear under circumstances that do not routinely<br />

occur, and may not have been anticipated. Furthermore,<br />

for complex codes it is virtually impossible to verify that<br />

the code is universally accurate and error-free. Therefore,<br />

the term code verification must be qualified in<br />

terms of specified ranges of application and corresponding<br />

ranges of accuracy. A code may be applied<br />

outside its documented ranges of application, but in<br />

such cases it must not carry the label ‘verified’ and<br />

caution should be expressed with respect to its results.<br />
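
The comparison of a numerical solution with an analytical solution can be illustrated with a minimal sketch. It assumes a hypothetical, deliberately tiny "model code" (an explicit Euler scheme for a linear reservoir, dS/dt = -k S), which is not taken from the paper; the function names, the candidate time steps and the 5% tolerance are likewise illustrative assumptions. The point is that the outcome is a range of application with a corresponding accuracy, not a universal statement.

```python
import math


def simulate_linear_reservoir(s0, k, dt, n_steps):
    """Hypothetical 'model code': explicit Euler solution of dS/dt = -k * S."""
    storage = [s0]
    for _ in range(n_steps):
        storage.append(storage[-1] * (1.0 - k * dt))
    return storage


def analytical_storage(s0, k, t):
    """Closed-form reference solution S(t) = S0 * exp(-k * t)."""
    return s0 * math.exp(-k * t)


def max_relative_error(s0, k, dt, t_end):
    """Largest relative deviation between the numerical and analytical storage."""
    n_steps = int(round(t_end / dt))
    numerical = simulate_linear_reservoir(s0, k, dt, n_steps)
    errors = []
    for i, s_num in enumerate(numerical):
        s_ref = analytical_storage(s0, k, i * dt)
        errors.append(abs(s_num - s_ref) / s_ref)
    return max(errors)


def verified_range(s0, k, t_end, candidate_dts, tolerance):
    """Time steps for which the code meets the stated accuracy (assumed criterion)."""
    return [dt for dt in candidate_dts
            if max_relative_error(s0, k, dt, t_end) <= tolerance]


# Illustrative call: which of the tested time steps fall inside the verified range?
print(verified_range(100.0, 0.5, 10.0, [0.01, 0.1, 0.5, 1.0], 0.05))
```

A code used with a time step outside the list returned here would, in the terminology above, be applied outside its documented range of application.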

The application of a model code to be used for setting<br />

up a site-specific model is usually associated with model<br />

calibration. The model performance during calibration<br />

depends on the quantity and quality of the available<br />

input and observation data as well as on the conceptual<br />

model. If sufficient accuracy cannot be achieved, the<br />

conceptual model and/or the data have to be re-evaluated.<br />

A discussion of the problems and methodologies<br />

in model calibration is provided by Gupta et al.<br />

[14].<br />

Often the model performance during calibration is<br />

used as a measure of the predictive capability of a<br />

model. This is a fundamental error. Many studies (e.g.<br />

Refsgaard and Knudsen [31]; Liden [22]) have demonstrated<br />

that the model performance against independent<br />

data not used for calibration is generally poorer than the<br />

performance achieved in the calibration situation.<br />

Therefore, the credibility of a site-specific model’s<br />

capability to make predictions about reality must be<br />

evaluated against independent data. This process is denoted<br />

model validation. In designing suitable model<br />

validation tests a guiding principle should be that a<br />

model should be tested to show how well it can perform<br />

the kind of task for which it is specifically intended [19].<br />

This implies for instance that for the case where a model<br />

is intended to be used for conditions similar to conditions<br />

where test data exist, such as extension of<br />

streamflow records, a standard split-sample test may be<br />

applied. However, models are often intended to be used<br />

as management tools to help answer questions such as:<br />

What happens to the water resources if land use is<br />

changed? In such a case no site-specific test data exist and<br />

the question of defining a validation test scheme becomes<br />

non-trivial.<br />
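
A standard split-sample test can be sketched as follows. The sketch assumes a hypothetical one-parameter model (runoff = c × rainfall) and uses the Nash-Sutcliffe efficiency as the performance measure; both are illustrative choices, not prescribed by the text, and the function names are invented for this example. The model is calibrated on the first part of the record only and then evaluated against the remaining, independent part.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than the mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def calibrate_runoff_coefficient(rain, flow):
    """Least-squares estimate of c in the hypothetical model flow = c * rain."""
    return sum(p * q for p, q in zip(rain, flow)) / sum(p * p for p in rain)


def split_sample_test(rain, flow, split_index):
    """Calibrate on records before split_index, validate on the independent rest."""
    c = calibrate_runoff_coefficient(rain[:split_index], flow[:split_index])
    simulated = [c * p for p in rain]
    nse_calibration = nash_sutcliffe(flow[:split_index], simulated[:split_index])
    nse_validation = nash_sutcliffe(flow[split_index:], simulated[split_index:])
    return c, nse_calibration, nse_validation


# Synthetic, made-up records used only to show the call pattern.
rain = [10.0, 0.5, 5.0, 20.0, 8.0, 12.0, 3.0, 15.0, 1.0, 7.0]
flow = [4.2, 0.3, 1.8, 8.5, 3.0, 5.5, 0.9, 5.0, 0.4, 3.6]
print(split_sample_test(rain, flow, split_index=5))
```

Only the validation score says anything about predictive capability; the calibration score is reported for comparison.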

4. Discussion<br />

4.1. Scientific philosophical aspects<br />

The fundamental view expressed by scientific philosophers<br />

is that verification and validation of numerical<br />

models of natural systems is impossible, because natural<br />

systems are never closed and because the mapping of<br />

model results is always non-unique [26]. Thus, seen<br />

from a theoretical point of view, it is tempting to conclude that<br />

the establishment of modelling guidelines comprising<br />

these terms simply is not possible.<br />

On the other hand, there is a large and increasing<br />

need to establish guidelines to improve the quality of<br />

modelling, and such guidelines need to address the issues<br />

of verification and validation in order to be operational<br />

in practice. Irrespective of what the scientific community<br />

decides regarding terminology and validation methodology,<br />

including the associated philosophical aspects,<br />

models are being used more and more to support water<br />

resources management in practice. As long as the present<br />

situation continues, characterised by a large degree<br />

of confusion on terminology and methodology, the potential<br />

benefits of using models are severely constrained.<br />

They are often subject to either ‘overselling’ or ‘mistrust’,<br />

and misunderstandings between model users and<br />

water resources managers may easily occur in the absence<br />

of a commonly accepted and understood ‘language’.<br />

Thus, establishment of a terminology and<br />

methodology that bridge the gap between scientific<br />

philosophy and pragmatic modelling is a key challenge<br />

and an important one.<br />

This gap between a scientific philosophical and a<br />

pragmatic modelling position is also clearly reflected in<br />

the dialogue between Konikow and Bredehoeft [20] and<br />

De Marsily et al. [11]. Following the Popperian school,<br />

Konikow and Bredehoeft [20] express the view that “the<br />

terms validation and verification have little or no place<br />

in ground-water science; these terms lead to a false<br />

impression of model capability”. De Marsily et al. [11],<br />

in a response, argue for a more pragmatic view: “...<br />

using the model in a predictive mode and comparing it<br />

with new data is not a futile exercise; it makes a lot of<br />

sense to us. It does not prove that the model will be<br />

correct for all circumstances, it only increases our confidence<br />

in its value. We do not want certainty; we will be<br />

satisfied with engineering confidence.”<br />

With regard to scientific methodology we fundamentally<br />

agree with the views of Popper [29] and the<br />

earth-directed theoretical method described by Baker<br />

[5]. Consequently, we agree with the view of Oreskes<br />

et al. [26], Konikow and Bredehoeft [20] and many<br />

others that it is not possible to carry out model verification<br />

or model validation, if these terms are used<br />

without restriction to domains of applicability and levels<br />

of accuracy.<br />

The restrictions in use of the terms confirmation,<br />

verification and validation imposed by the respective<br />

domains of applicability imply, according to Popper’s<br />

views, that the conceptual model, model code and<br />

site-specific models can only be classified as numerical<br />

universal statements as opposed to strictly universal<br />

statements. This distinction is fundamental for our<br />

proposed methodology and its link to scientific philosophical<br />

theories.<br />

4.2. Model confirmation, verification and validation<br />

An important aspect of our proposed methodology<br />

lies in the separation between the three different ‘versions’<br />

of the word model, namely the conceptual model,<br />

the model code and the site-specific model. This separation<br />

is in line with Matalas et al. [24] and Rykiel [32],<br />

who distinguish between the theory (conceptual model)<br />

and the engineering model (the site-specific model).<br />

Similarly, Schlesinger et al. [33] distinguish between<br />

conceptual model and computerised model. Schlesinger<br />

et al. [33], Matalas et al. [24] and Rykiel [32] do not<br />

separate the model code from the site-specific model.<br />

Due to this distinction it is possible, at a general level,<br />

to talk about confirmation of a theory or a hypothesis<br />

about how nature can be described using the relevant<br />

scientific method for that purpose, and, at a site-specific<br />

level, to talk about validity of a given model within<br />

certain domains of applicability and associated with<br />

specified accuracy limits.<br />

As Beven [9] argues, we need to distinguish between<br />

our qualitative understanding (perceptual model) and<br />

the practical implementation of that understanding in<br />

our conceptual model. As we have defined a conceptual<br />

model as a combination of a perceptual model and the<br />

simplifications acceptable for a particular model study, a<br />

conceptual model becomes site-specific and even<br />

case-specific. For example, a conceptual model of a groundwater<br />

aquifer may be described as two-dimensional for a<br />

study focussing on regional groundwater heads, while it<br />

may need to include more complex three-dimensional<br />

geological structures for detailed simulation of solute<br />

transport studies.<br />

Confirmation of a conceptual model is a non-trivial<br />

issue. It is hardly possible to prescribe general test<br />

procedures, in particular not exact tests. Conceptual<br />

models are more difficult in some domains than in<br />

others. For example, the process descriptions/equations<br />

and the actual system are relatively easily identifiable in<br />

a hydrodynamic river flow system as compared to a<br />

groundwater system or an ecosystem, because the geology<br />

will never be completely known in a groundwater<br />

system and the biological processes may not be well<br />

known in an ecosystem. The more complex and difficult<br />

the conceptual model becomes, the more ‘soft’ the confirmation<br />

tests may turn out to be. Thus, expert<br />

knowledge in terms of peer reviews may be an important<br />

element of such tests.<br />

In cases where considerable uncertainty exists in the<br />

conceptual model, the possibility of testing alternative<br />

conceptual models should be promoted. An example of<br />

this is given by Troldborg [35], who reports a study<br />

where three scientists developed alternative geological<br />

interpretations for the same area, and three numerical<br />

groundwater models were set up and calibrated on this<br />

basis. During this process, or in the subsequent validation<br />

phase, one or more of these models may turn out to<br />

perform so poorly that the underlying conceptual model<br />

has to be rejected. This approach of building the<br />

uncertainty of our knowledge of reality into alternative<br />

conceptual models, which are subsequently subject to a<br />

confirmation test, is fully in line with Popper’s scientific<br />

philosophical school. Unfortunately, this is very seldom<br />

pursued in practice.<br />

Code verification is not an activity that is carried out<br />

from scratch in every modelling study. In a particular<br />

study it has to be ascertained that the domain of<br />

applicability for which the selected model code has been<br />

verified covers the conditions specified in the actual<br />

conceptual model. If that is not the case, additional<br />

verification tests have to be conducted. Otherwise, the<br />

code explicitly must be classified as not verified for this<br />

particular study, and the subsequent simulation results<br />

therefore have to be considered with extra caution.<br />

Establishment of validation test schemes for situations<br />

where the split-sample test is not sufficient is an<br />

area where limited work has been carried out so far.<br />

The only rigorous and comprehensive methodology reported<br />

in literature is that of Klemes [19]. He proposed a<br />

systematic scheme of validation tests, where a distinction<br />

is made between simulations conducted for the<br />

same catchment as was used for calibration (split-sample<br />

test) and simulations conducted for ungauged catchments<br />

(proxy-basin tests). He also distinguished between



cases where catchment conditions such as climate, land<br />

use and ground water abstraction are stationary (split-sample<br />

test) and cases where they are not (differential<br />

split-sample test). A further discussion, including examples,<br />

of Klemes’s test scheme is given in Refsgaard<br />

[30]. The two key principles are: (a) the validation tests<br />

must be carried out against independent data, i.e. data<br />

that have not been used during calibration, and (b) the<br />

model should be tested to show how well it can perform<br />

the kind of task for which it is specifically intended to be<br />

applied subsequently. This implies e.g. that multi-site<br />

validation is needed if predictions of spatial patterns are<br />

required, and multi-variable checks are required if predictions<br />

of the behaviour of individual subsystems<br />

within a catchment are needed. Thus, a model should only<br />

be assumed valid with respect to outputs that have been<br />

explicitly validated. This means for instance that a<br />

model which is validated against catchment runoff cannot<br />

automatically be assumed valid also for simulation<br />

of erosion on a hillslope within the catchment, because<br />

smaller scale processes may dominate here; it will need<br />

validation against hillslope soil erosion data.<br />
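
The test scheme described above can be summarised as a small decision rule. The sketch below is an illustration only: the function and parameter names are assumptions, and the fourth, combined case (no site-specific data and non-stationary conditions) is named after Klemes's original scheme rather than after anything stated in the paragraph above.

```python
def select_validation_test(site_data_available: bool,
                           stationary_conditions: bool) -> str:
    """Pick a validation test type from two yes/no questions about the study.

    site_data_available:   test data exist for the catchment being modelled
    stationary_conditions: climate, land use, abstraction etc. are unchanged
                           between the test period and the prediction period
    """
    if site_data_available and stationary_conditions:
        return "split-sample test"
    if site_data_available:
        return "differential split-sample test"
    if stationary_conditions:
        return "proxy-basin test"
    # Combined case; name assumed from Klemes's scheme, not from the text above.
    return "proxy-basin differential split-sample test"


# Example: extension of streamflow records in a gauged, unchanged catchment.
print(select_validation_test(site_data_available=True, stationary_conditions=True))
```

Whichever test is selected, the two key principles still apply: independent data, and a test that matches the intended use of the model.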

From a theoretical point of view the procedures<br />

outlined by Klemes [19] for the proxy-basin and the<br />

differential split-sample tests, where tests have to be<br />

carried out using data from similar catchments, are<br />

weaker than the usual split-sample test, where data from<br />

the specific catchment are available. However, no<br />

obviously better testing schemes exist. Therefore, this<br />

will have to be reflected in the performance criteria in<br />

terms of larger expected uncertainties in the predictions.<br />

It must be realised that the validation test schemes<br />

proposed above are so demanding that many applications<br />

today would fail to meet them. Thus, for many<br />

cases where either proxy-basin or differential split-sample<br />

tests are required, suitable test data simply do<br />

not exist. This is for example the case for prediction of<br />

regional scale transport of potential contamination from<br />

underground radionuclide deposits over the next thousands<br />

of years. In such a case model validation is not<br />

possible. This does not imply that these modelling<br />

studies are not useful, only that their output should be<br />

recognised to be somewhat more uncertain than is often<br />

stated and that the term ‘validated model’ should not<br />

be used. Thus, a model’s validity will always be confined<br />

in terms of space, time, boundary conditions, types of<br />

application, etc.<br />

According to the methodology, model validation<br />

implies substantiating that a site-specific model can<br />

produce simulation results within the range of accuracy<br />

specified in the performance criteria for the particular<br />

study. Hence, before carrying out the model calibration<br />

and the subsequent validation tests, quantitative performance<br />

criteria must be established. In determining<br />

the acceptable level of accuracy a trade-off will, either<br />

explicitly or implicitly, have to be made between costs,<br />

in terms of data collection and modelling work, and<br />

associated benefits that can be obtained due to more<br />

accurate model results. Consequently, the acceptable<br />

level of accuracy will vary from case to case and must be<br />

seen in a socio-economic context. It should therefore<br />

usually not be defined by the modeller, but in a dialogue<br />

between the modeller and the manager.<br />

4.3. Need for interaction between manager, code developer<br />

and modeller<br />

As discussed above, the validation methodologies<br />

presently used, even in research projects, are generally<br />

not rigorous and far from satisfactory. At the same time<br />

models are being used in practice, and claims are made daily<br />

about the validity of models on the basis of, at<br />

best, not very strict and rigorous test schemes. An<br />

important question, then, is how the situation can be<br />

improved in the future. As emphasised by Forkel [12],<br />

improvements cannot be achieved by the research<br />

community alone, but require an interaction between<br />

the three main ‘players’, namely water resources managers,<br />

code developers and model users (modellers).<br />

The key responsibilities of the water resources manager<br />

are to specify the objectives and define the acceptance<br />

limits of accuracy performance criteria for the<br />

model application. Furthermore, it is the manager’s<br />

responsibility to define requirements for code verification<br />

and model validation. In many consultancy jobs<br />

accuracy criteria and validation requirements are not<br />

specified at all, with the result being that the model user<br />

implicitly defines them in accordance with the achieved<br />

model results. In this respect it is important in the terms<br />

of reference for a given model application to ensure<br />

consistency between the objectives, the specified accuracy<br />

criteria, the data availability and the financial<br />

resources. In order for the manager to make such evaluations,<br />

some knowledge of the modelling process is<br />

required.<br />

The model user has the responsibility for selection of<br />

a suitable code as well as for construction, calibration<br />

and validation of the site-specific model. In particular,<br />

the model user is responsible for preparing validation<br />

documents in such a way that the domain of applicability<br />

and the range of accuracy of the model are<br />

explicitly specified. Furthermore, the documentation of<br />

the modelling process should ideally be done in enough<br />

detail that it can be repeated several years later, if required.<br />

The model user has to interact with the water<br />

resources manager on assessments of realistic model<br />

accuracies. Furthermore, the model user must be aware<br />

of the capabilities and limitations of the selected code<br />

and interact with the code developer with regard to<br />

reporting of user experience such as shortcomings in<br />

documentation, errors in code, market demands for<br />

extensions, etc.



The key responsibilities of the developer of the model<br />

code are to develop and verify a model code. In this<br />

connection it is important that the capabilities and<br />

limitations of the code appear in the documentation. As<br />

code development is a continuous process, code maintenance<br />

and regular updating with new versions improved<br />

as a response to user reactions become important. Although<br />

a model code should be comprehensively documented,<br />

doubts about its functioning will in practice always arise once<br />

in a while, even for experienced users.<br />

Hence, active support to and dialogue with model users<br />

are crucial for ensuring operational model applications<br />

at a high professional level.<br />

4.4. Performance criteria: when is a model good enough?<br />

A critical issue in relation to the methodological<br />

framework is how to define the performance criteria. We<br />

agree with Beven [9] that any conceptual model is<br />

known to be wrong and hence any model will be falsified<br />

if we investigate it in sufficient detail and specify very<br />

high performance criteria.<br />

Clearly, if one attempts to establish a model that<br />

should simulate the truth it would always be falsified.<br />

However, this is not very useful information. Therefore,<br />

we use conditional validation, i.e.<br />

validation restricted to a domain of applicability (or<br />

numerical universal as opposed to strictly universal in<br />

Popperian terms). The question is then: what is<br />

good enough? Or, in other words, what are the criteria?<br />

How do we select them?<br />

A good reference for model performance is to compare<br />

it with uncertainties of the available field observations.<br />

If the model performance is within this uncertainty<br />

range we often characterise the model as good enough.<br />

However, usually it is not so simple. How wide should the confidence<br />

bands on observational uncertainties be: ranges<br />

corresponding to 65%, 95% or 99%? Do<br />

we then always reject a model if it cannot perform within<br />

the observational uncertainty range? In many cases even<br />

results from less accurate models may be very useful.<br />

Therefore, our answer is that the decision on what is<br />

good enough generally must be taken in a socio-economic<br />

context. For instance, the accuracy requirements<br />

for a model to be used for an initial screening of alternative<br />

options for location of a new small well field for a<br />

small water supply will be much lower than the<br />

requirements for a model that is intended to be used for<br />

the final design of a large well field for a major water<br />

supply in an area with potential damaging effects on<br />

precious nature and other significant conflicts of interests.<br />

Thus, we believe that the accuracy criteria cannot<br />

be decided universally by modellers or researchers, but<br />

must be different from case to case depending on how<br />

much is at stake in the decision to depend on the support<br />

from model predictions. This implies that the performance<br />

criteria must be discussed and agreed between the<br />

manager and the modeller beforehand. However, as the<br />

modelling process and the underlying study progress<br />

with improved knowledge on the data and model<br />

uncertainties as well as on the risk perception of the<br />

concerned stakeholders it may well be required to adjust<br />

the performance criteria in a sort of adaptive project<br />

management context [27].<br />
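
The comparison of model performance with observational uncertainty discussed in this section can be made concrete with a minimal sketch. It assumes Gaussian observation errors with a known standard deviation, which the text does not state, and all function and parameter names are hypothetical. The confidence level and the required fraction of points inside the band stand for the quantities that manager and modeller would have to agree on beforehand.

```python
from statistics import NormalDist


def band_half_width(sigma, confidence):
    """Half-width of a symmetric confidence band for Gaussian observation errors."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return z * sigma


def fraction_within_band(observed, simulated, sigma, confidence):
    """Share of simulated values lying inside the observational uncertainty band."""
    half_width = band_half_width(sigma, confidence)
    hits = sum(1 for o, s in zip(observed, simulated) if abs(o - s) <= half_width)
    return hits / len(observed)


def meets_criterion(observed, simulated, sigma, confidence, required_fraction):
    """Agreed performance criterion: enough simulated values inside the band."""
    return fraction_within_band(observed, simulated, sigma, confidence) >= required_fraction


# Made-up groundwater heads (m) with an assumed observation error of 1.0 m.
observed = [10.0, 10.0, 10.0, 10.0]
simulated = [10.5, 11.0, 13.0, 10.0]
print(meets_criterion(observed, simulated, sigma=1.0,
                      confidence=0.95, required_fraction=0.7))
```

Tightening or relaxing `confidence` and `required_fraction` is where the socio-economic context enters; the code only makes the agreed criterion explicit and checkable.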

4.5. The role of uncertainty assessments<br />

Should we then trust a model if it happens to pass a<br />

validation test? Are we sure that this model is the best<br />

one and that the underlying conceptual basis and input<br />

data are basically correct?<br />

Yes, on the one hand, in such a case we may trust a<br />

model as a suitable tool to make predictions through<br />

model simulations. But on the other hand, we can never<br />

be sure that a model that passes a validation test will<br />

have a sound conceptual basis. It could be right for the<br />

wrong reasons, e.g. by compensating for an error in the conceptual<br />

model (model structure) with errors in parameter values.<br />

And we know that it would be possible to find many<br />

other models that can pass the validation test, and that it<br />

would not be possible beforehand to identify one of these<br />

models as the best one in all respects. Having realised this<br />

equifinality problem the relevant question is what we<br />

should do to address it in practical cases. In this respect<br />

our framework prescribes that model predictions (see<br />

definition of ‘simulation’ in Section 3) made subsequent<br />

to passing a validation test should include uncertainty<br />

assessments. Hence, we basically agree with Beven [9]<br />

that uncertainty assessments are necessary, and that such<br />

uncertainty analyses should include uncertainty on<br />

model structure, parameter values etc. Different methodologies<br />

exist for conducting uncertainty assessments,<br />

e.g. Beven [8] and Van Asselt and Rotmans [36].<br />
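
One simple way to carry the equifinality problem into a prediction is to keep every candidate model that passes the validation test and to report the spread of their predictions. The sketch below is an assumption-laden illustration, not an implementation of the methodologies cited above: it uses a hypothetical one-parameter model (flow = c × rainfall), the Nash-Sutcliffe efficiency as the acceptance measure, and an arbitrary acceptance threshold.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than the mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def accepted_parameter_sets(candidates, rain, flow, threshold):
    """Keep every candidate coefficient whose validation score passes the threshold."""
    accepted = []
    for c in candidates:
        simulated = [c * p for p in rain]
        if nash_sutcliffe(flow, simulated) >= threshold:
            accepted.append(c)
    return accepted


def prediction_range(accepted, design_rain):
    """Lowest and highest prediction over all accepted models for one input."""
    predictions = [c * design_rain for c in accepted]
    return min(predictions), max(predictions)


# Made-up validation data and candidate coefficients, for illustration only.
rain = [1.0, 2.0, 3.0, 4.0]
flow = [0.5, 1.0, 1.5, 2.0]
accepted = accepted_parameter_sets([0.3, 0.45, 0.5, 0.55, 0.8], rain, flow, threshold=0.9)
print(accepted, prediction_range(accepted, design_rain=10.0))
```

The same pattern extends to alternative model structures: each structure that survives validation contributes its predictions to the reported range instead of a single "best" model being chosen.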

5. Guiding principles and future perspectives for modelling<br />

guidelines<br />

5.1. Guiding principles<br />

In our opinion the two key factors causing the poor<br />

quality of the modelling work in practice are: (a) too<br />

poor quality of the modelling work done by practitioners<br />

(inadequate use of guidelines and quality assurance<br />

procedures and inadequate role play between manager<br />

(client) and modeller (consultant)) and (b) lack of data<br />

and methodology in the hydrological science. Modelling<br />

guidelines like [25,37] almost exclusively address the<br />

former issue while scientific literature like [7,9] focuses on<br />

the latter issue. In our opinion it is crucial that the two<br />

lines of action are combined. This implies that we need<br />

to define modelling guidelines that are both operational



in practice and scientifically founded. The framework we<br />

have described here attempts to establish one such<br />

bridge between the two fields, i.e. pragmatic modelling<br />

and natural science. An important aspect of this<br />

framework is to enable, in a scientifically consistent way,<br />

the manager and the modeller to make the compromises<br />

that are required in practice.<br />

Against this background, the following five key principles<br />

for pragmatic modelling have emerged:<br />

• A terminology that is internally consistent. We<br />

acknowledge that many authors in the scientific literature<br />

use different terminology and that, in particular,<br />

some authors do not use the terms verification<br />

and validation. However, these terms are also widely<br />

used, and in practice we need understandable<br />

terms for these operations. Thus, with the clear distinction<br />

between conceptual model, model code and<br />

site-specific model and the restrictions to domains<br />

of applicability (numerical universal in Popperian<br />

sense) we believe that our terminology is in accordance<br />

with the main stream of scientific philosophy.<br />

• We never talk about universal code verification or universal<br />

model validation, but always restrict these<br />

terms to clearly defined domains of applicability. This<br />

is a necessary assumption for the consistency of the<br />

terminology and methodology and must be emphasised<br />

explicitly in any guidelines.<br />

• Validation tests against independent data that have<br />

not also been used for calibration are necessary in order<br />

to be able to document the predictive capability<br />

of a model.<br />

• Model predictions achieved through simulation<br />

should be associated with uncertainty assessments<br />

where, among other things, the uncertainty in model structure<br />

and parameter values should be accounted for.<br />

• A continuous interaction between manager and modeller<br />

is crucial for the success of the modelling process.<br />

One of the key aspects in this regard is to establish suitable<br />

performance criteria for the model calibration<br />

and validation tests. This dialogue is also very important<br />

in connection with uncertainty assessments.<br />

5.2. Future challenges<br />

Some of the issues dealt with in the present manuscript<br />

are still not fully explored. The four most<br />

important future challenges are:<br />

• Establishment of accuracy criteria for a modelling<br />

study is a very important issue and one where we<br />

maybe differ from most scientific literature. Modellers<br />

often establish numerical accuracy criteria in order to<br />

classify the goodness of a given model [2,17,28].<br />

These attempts are very useful in making the performance<br />

more transparent and quantitative, but do not<br />

provide an objective means to decide what the optimal<br />

accuracy criteria really should be in a given case.<br />

According to our framework no universal accuracy<br />

criteria can be established, i.e. it is generally not possible<br />

from a natural scientific point of view to tell<br />

when a model performance is good enough. Such<br />

acceptance criteria will vary from case to case<br />

depending on the socio-economic context, i.e. what<br />

is at stake in the decisions to be supported by the<br />

model predictions. The good question now is: how<br />

do we translate the Ôsoft’ socio-economic objectives<br />

to Ôhard-core’ model performance criteria This is<br />

obviously a challenge that cannot be solved by natural<br />

science alone, but need to be addressed in a much<br />

broader context including aspects of economy, stakeholder<br />

interests and risk perception. Until we become<br />

better at overcoming this challenge we will, however,<br />

not be able to arrive at the optimal balance between<br />

the costs of modelling and the derived societal benefits.<br />

Although this work has hardly begun yet, and<br />

we know that it is a very difficult road, we see no real<br />

alternative.<br />

• Although all experience shows that models generally<br />

perform more poorly in validation tests against independent<br />

data than they do in calibration tests, model validation<br />

is in our opinion a much neglected issue, both<br />

in many modelling guidelines and in the scientific<br />

literature. Maybe many scientists have not wanted<br />

to use the term validation because of the controversies<br />

in the philosophy of science, but in any case<br />

many scientists are not advocating the need for model<br />

validation. One of the unfortunate consequences of<br />

this ‘lack of interest’ is that not much work has<br />

been devoted to developing suitable validation test<br />

schemes since Klemes [19]. In our opinion further<br />

development of suitable testing schemes and imposing<br />

them on all modelling projects is a major future<br />

challenge.<br />

• A third issue that requires considerable attention is<br />

how do we decide among alternative model structures<br />

and parameter sets (the equifinality problem). If we<br />

use multiple criteria one model may be better on<br />

one criterion and another on another criterion. In our<br />

opinion we need not necessarily choose. We know that<br />

all conceptual models are wrong and we know that<br />

wrong conceptual models are compensated by biased<br />

model parameter values through calibration. But, unless<br />

we can falsify a conceptual model directly, which<br />

is very difficult, or unless the resulting model is falsified<br />

through the validation test, this model is a possible<br />

candidate for predictions. And if several models<br />

pass the validation tests we may not be able to tell<br />

which one is the best. In such cases they should all<br />

be considered suitable, and the fact that they provide<br />

different predictive results should be used as part of<br />

the uncertainty assessments. Work on this relatively


J.C. Refsgaard, H.J. Henriksen / Advances in Water Resources 27 (2004) 71–82 81<br />

new paradigm has just begun [9] and a lot of work is<br />

still required to further develop and operationalise it.<br />

• Finally, there are many more challenges related to<br />

uncertainty in water resources management. Quality<br />

assurance and uncertainty assessments are two<br />

aspects that are very closely linked. Initially, the manager<br />

has to define accuracy criteria from a perception<br />

of which uncertainty level he believes is suitable in a<br />

particular case (see above). Subsequently, as the modelling<br />

study proceeds, the dialogue between modeller<br />

and manager has to continue with the necessary<br />

trade-off between modelling accuracy and cost of<br />

modelling study. In the uncertainty assessments it is<br />

very important to go beyond the traditional statistical<br />

uncertainty analysis. Thus, e.g. aspects of scenario<br />

uncertainty and ignorance should generally be included<br />

and in addition the uncertainties originating<br />

from data and models often need to be integrated<br />

with socio-economic aspects in order to form a suitable<br />

basis for the further decision process [36]. Thus,<br />

as with the accuracy criteria (above), the use of<br />

uncertainty assessments in water resources management<br />

goes beyond natural science.<br />
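The third challenge above (keeping every model that passes its validation test and treating the spread of their predictions as part of the uncertainty assessment) can be illustrated with a minimal sketch. It is not the authors' procedure: the Nash-Sutcliffe efficiency is used here only as one possible performance criterion, the threshold of 0.8 stands in for an acceptance criterion that manager and modeller would have to agree on, and all model names and numbers are hypothetical.

```python
from statistics import mean


def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1.0 is a perfect fit, 0.0 is no better than the observed mean."""
    obs_mean = mean(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    observed_variance = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - squared_error / observed_variance


def retain_validated(models, observed_validation, threshold):
    """Keep every model whose validation-period score meets the agreed threshold."""
    return {
        name: model
        for name, model in models.items()
        if nash_sutcliffe(observed_validation, model["validation"]) >= threshold
    }


def prediction_envelope(retained):
    """Per-time-step (min, max) range across the predictions of the retained models."""
    series = [model["prediction"] for model in retained.values()]
    return [(min(values), max(values)) for values in zip(*series)]


# Hypothetical data: observations from an independent validation period,
# plus each candidate model's validation simulation and its prediction.
observed_validation = [2.0, 4.0, 6.0, 8.0]
models = {
    "A": {"validation": [2.5, 3.5, 6.5, 7.5], "prediction": [10.0, 12.0]},
    "B": {"validation": [1.0, 4.0, 6.0, 9.0], "prediction": [11.0, 11.5]},
    "C": {"validation": [5.0, 5.0, 5.0, 5.0], "prediction": [20.0, 20.0]},
}

retained = retain_validated(models, observed_validation, threshold=0.8)
envelope = prediction_envelope(retained)
```

In this sketch models A and B pass and are both retained, while model C is rejected. The envelope is only the structural part of the uncertainty; parameter, data and scenario uncertainty would have to be added separately.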

Acknowledgements<br />

The present work was carried out within the Project<br />

‘Harmonising Quality Assurance in model based catchments<br />

and river basin management (HarmoniQuA)’,<br />

which is partly funded by the EC Energy, Environment<br />

and Sustainable Development programme (Contract<br />

EVK2-CT2001-00097). The constructive comments and<br />

suggestions to the manuscript by the HarmoniQuA<br />

project team and by our colleague William (Bill) G.<br />

Harrar are acknowledged. Finally, the constructive<br />

criticisms by Keith Beven, University of Lancaster;<br />

Rodger Grayson, University of Melbourne and a third,<br />

anonymous referee helped to improve the manuscript<br />

significantly.<br />

References<br />

[1] Abbott MB. The theory of the hydrological model, or: the<br />

struggle for the soul of hydrology. In: O’Kane JP, editor.<br />

Advances in theoretical hydrology. Elsevier; 1992. p. 237–54.<br />

[2] Andersen J, Refsgaard JC, Jensen KH. Distributed hydrological<br />

modelling of the Senegal River Basin––model construction and<br />

validation. J Hydrol 2001;247:200–14.<br />

[3] Anderson MG, Bates PD, editors. Model validation: perspectives<br />

in hydrological science. John Wiley and Sons; 2001.<br />

[4] Anderson MP, Woessner WW. The role of postaudit in model<br />

validation. Adv Water Resour 1992;15:167–73.<br />

[5] Baker VR. Conversing with the Earth: the geological approach to<br />

understanding. In: Frodeman R, editor. Earth matters: the earth sciences,<br />

philosophy and the claims of community. Prentice Hall; 2000.<br />

[6] Beven K. Changing ideas in hydrology––the case of physically<br />

based models. J Hydrol 1989;105:157–72.<br />

[7] Beven K. Towards an alternative blueprint for a physically based<br />

digitally simulated hydrologic response modelling system. Hydrol<br />

Process 2002;16(2):189–206.<br />

[8] Beven K, Binley AM. The future of distributed models: model<br />

calibration and uncertainty prediction. Hydrol Process 1992;6:<br />

279–98.<br />

[9] Beven K. Towards a coherent philosophy for modelling the<br />

environment. Proc Roy Soc Lond A 2002;458(2026):2465–84.<br />

[10] Dee DP. A pragmatic approach to model validation. In: Lynch<br />

DR, Davies AM, editors. Quantitative skill assessment of coastal<br />

ocean models. Washington: AGU; 1995. p. 1–13.<br />

[11] De Marsily G, Combes P, Goblet P. Comments on ’Ground-water<br />

models cannot be validated’, by Konikow LF, Bredehoeft, JD.<br />

Adv Water Resour 1992;15:367–9.<br />

[12] Forkel C. Das numerische Modell––ein schmaler Grat zwischen<br />

vertrauenswürdigem Werkzeug und gefährlichem Spielzeug. Presented<br />

at the 26. IWASA, RWTH Aachen, 4–5 January 1996.<br />

[13] Freeze RA, Harlan RL. Blueprint for a physically-based digitally-simulated<br />

hydrologic response model. J Hydrol 1969;9:237–58.<br />

[14] Gupta HV, Sorooshian S, Yapo PO. Toward improved calibration<br />

of hydrologic models: multiple and noncommensurable<br />

measures of information. Water Resour Res 1998;34(4):751–<br />

63.<br />

[15] Hansen JM. The line in the sand the wave on the water––Steno’s<br />

theory on the language of nature and the limits of the knowledge.<br />

Copenhagen: Fremad; 2000. 440 pp (in Danish).<br />

[16] Hassanizadeh SM, Carrera J. Editorial, special issue on validation<br />

of geo-hydrological models. Adv Water Resour 1992;15:1–3.<br />

[17] Henriksen HJ, Troldborg L, Nyegaard P, Sonnenborg TO,<br />

Refsgaard JC, Madsen B. Methodology for construction, calibration<br />

and validation of a national hydrological model for<br />

Denmark. J Hydrol 2003;280(1–4):52–71.<br />

[18] IAHR. Publication of guidelines for validation documents and<br />

call for discussion. Int Assoc Hydraul Res Bull 1994;11:41.<br />

[19] Klemes V. Operational testing of hydrological simulation models.<br />

Hydrol Sci J 1986;31:13–24.<br />

[20] Konikow LF, Bredehoeft JD. Ground-water models cannot be<br />

validated. Adv Water Resour 1992;15:75–83.<br />

[21] Kuhn TS. The structure of scientific revolutions. Chicago:<br />

University of Chicago Press; 1962.<br />

[22] Liden R. Conceptual runoff models for material transport<br />

estimations. PhD dissertation, Report No. 1028, Lund Institute<br />

of Technology, Lund University, Sweden, 2000.<br />

[23] Los H, Gerritsen H. Validation of water quality and ecological<br />

models. Presented at the 26th IAHR Conference, London, Delft<br />

Hydraulics, 11–15 September 1995, 8 pp.<br />

[24] Matalas NC, Landwehr JM, Wolman MG. Prediction in water<br />

management. In: Scientific basis of water resource management.<br />

Washington, DC: National Research Council, National Academy<br />

Press; 1982. p. 118–27.<br />

[25] Middlemis H. Murray–Darling Basin Commission. Groundwater<br />

flow modelling guideline. Aquaterra Consulting Pty Ltd, South<br />

Perth, Western Australia. Project no. 125, 2000.<br />

[26] Oreskes N, Shrader-Frechette K, Belitz K. Verification, validation<br />

and confirmation of numerical models in the earth sciences.<br />

Science 1994;264:641–6.<br />

[27] Pahl-Wostl C. Towards sustainability in the water sector––the<br />

importance of human actors and processes of social learning.<br />

Aquat Sci 2002;64:394–411.<br />

[28] Parkin G, O’Donnell GO, Ewen J, Bathurst JC, O’Connell PE,<br />

Lavabre J. Validation of catchment models for predicting land-use<br />

and climate change impacts. 2. Case study for a Mediterranean<br />

catchment. J Hydrol 1996;175:595–613.<br />

[29] Popper KR. The logic of scientific discovery. London: Hutchinson<br />

& Co; 1959.<br />

[30] Refsgaard JC. Towards a formal approach to calibration and<br />

validation of models using spatial data. In: Grayson R, Blöschl G,



editors. Spatial patterns in catchment hydrology: Observations<br />

and modelling. Cambridge University Press; 2001. p. 329–54.<br />

[31] Refsgaard JC, Knudsen J. Operational validation and intercomparison<br />

of different types of hydrological models. Water Resour<br />

Res 1996;32(7):2189–202.<br />

[32] Rykiel ER. Testing ecological models: The meaning of validation.<br />

Ecol Modell 1996;90:229–44.<br />

[33] Schlesinger S, Crosbie RE, Gagne RE, Innis GS, Lalwani CS,<br />

Loch J, et al. Terminology for model credibility. SCS Tech Comm<br />

Model Credibil Simul 1979;32(3):103–4.<br />

[34] Scholten H, Van Waveren RH, Groot S, Van Geer FC, Wösten<br />

JHM, Koeze RD, et al. Good modelling practice in water<br />

management. Paper presented on Hydroinformatics 2000, Cedar<br />

Rapids, IA, USA, 2000.<br />

[35] Troldborg L. Effects of geological complexity on groundwater age<br />

prediction. Poster Session 62C, AGU December 2000. EOS<br />

Transactions, 81(48), F435.<br />

[36] Van Asselt MBA, Rotmans J. Uncertainty in integrated assessment<br />

modelling––from positivism to pluralism. Climat Change<br />

2002;54(1–2):75–105.<br />

[37] Van Waveren RH, Groot S, Scholten H, Van Geer FC, Wösten<br />

JHM, Koeze RD, et al. Good modelling practice handbook.<br />

STOWA Report 99-05, Utrecht, RWS-RIZA, Lelystad, The<br />

Netherlands. Available from: http://waterland.net/riza/aquest/.


[13]<br />

Refsgaard JC, Henriksen HJ, Harrar WG, Scholten H, Kassahun A (2005)<br />

Quality assurance in model based water management – review of existing<br />

practice and outline of new approaches.<br />

Environmental Modelling & Software, 20, 1201-1215.<br />

Reprinted from Environmental Modelling & Software with permission from Elsevier


Environmental Modelling & Software 20 (2005) 1201–1215<br />

www.elsevier.com/locate/envsoft<br />

Quality assurance in model based water management – review of<br />

existing practice and outline of new approaches<br />

Jens Christian Refsgaard a,*, Hans Jørgen Henriksen a, William G. Harrar a,<br />

Huub Scholten b, Ayalew Kassahun b<br />

a Geological Survey of Denmark and Greenland (GEUS), Øster Voldgade 10, DK-1350 Copenhagen K, Denmark<br />

b Wageningen University (WU), Dreijenplein 2, 6703 HB, Wageningen, The Netherlands<br />

Received 11 December 2003; received in revised form 30 March 2004; accepted 30 July 2004<br />

Abstract<br />

Quality assurance (QA) is defined as protocols and guidelines to support the proper application of models. In the water<br />

management context we classify QA guidelines according to how much focus is put on the dialogue between the modeller and the<br />

water manager as: (Type 1) Internal technical guidelines developed and used internally by the modeller’s organisation; (Type 2)<br />

Public technical guidelines developed in a public consensus building process; and (Type 3) Public interactive guidelines developed as<br />

public guidelines to promote and regulate the interaction between the modeller and the water manager throughout the modelling<br />

process. State-of-the-art QA practices vary considerably between different modelling domains and countries. It is suggested that<br />

these differences can be explained by the scientific maturity of the underlying discipline and differences in modelling markets in terms<br />

of volume of jobs outsourced and level of competition. The structure and key aspects of new generic guidelines and a set of<br />

electronically based supporting tools that are under development within the HarmoniQuA project are presented. Model credibility<br />

can be enhanced by a proper modeller-manager dialogue, rigorous validation tests against independent data, uncertainty<br />

assessments, and peer reviews of a model at various stages throughout its development.<br />

© 2004 Elsevier Ltd. All rights reserved.<br />

Keywords: Modelling guidelines; Quality assurance; Water resources management; Uncertainty; Support tools<br />

1. Introduction<br />

Models describing water flows, water quality and<br />

ecology are being developed and applied in increasing<br />

number and variety. The trend in recent years has been<br />

to base water management decisions to a larger extent<br />

on modelling studies, and to use more sophisticated<br />

models. In Europe this trend is likely to be reinforced by<br />

the EU Water Framework Directive due to its demand<br />

for integrating groundwater, surface water, ecological<br />

* Corresponding author. Tel.: +45 38 142 776; fax: +45 38 142<br />

050.<br />

E-mail address: jcr@geus.dk (J.C. Refsgaard).<br />

and economic aspects of water management at the river<br />

basin scale and due to the explicit requirement to study<br />

impacts of alternative measures (human interventions)<br />

intended to improve the ecological status in the river<br />

basin. Insufficient attention is often given to documenting<br />

the predictive capability of models. Therefore,<br />

contradictions may emerge regarding the various claims<br />

of model applicability on the one hand and the lack of<br />

documentation of these claims on the other hand.<br />

Hence, the credibility of the model is often questioned,<br />

and sometimes with good reason.<br />

Another important trend is the demand to involve<br />

different stakeholders in the water resources management<br />

process, and therefore also indirectly in the<br />

modelling process (Pahl-Wostl, 2002). This stakeholder<br />


doi:10.1016/j.envsoft.2004.07.006



involvement does not imply active participation in<br />

the technical modelling itself, but rather appears as<br />

a demand to be able to understand and review the<br />

various assumptions and their implications for the<br />

modelling results. This trend is seen at the global scale<br />

in connection with the generally accepted principles<br />

behind integrated water resources management, where<br />

public participation is a key element (GWP-TAC, 2000).<br />

In Europe, this is reflected in the EU Water Framework<br />

Directive, where it is explicitly prescribed that stakeholders<br />

and the general public should be involved in the<br />

water resources management process.<br />

The need for improving the quality of the modelling<br />

process has been emphasised by the research community,<br />

e.g. Klemes (1986), NRC (1990), Anderson and<br />

Woessner (1992), Forkel (1996), and Rykiel (1996). The<br />

recommendations made in this respect primarily focus<br />

on scientific/technical guidance on how the modeller<br />

should carry out various steps during the modelling<br />

process in order to achieve the best and most reliable<br />

results.<br />

Anderson and Bates (2001) in a discussion of model<br />

credibility and scientific integrity state that “over the last<br />

decade we have begun to have an appreciation of the<br />

need to be much more rigorous in establishing<br />

procedures for defining model credibility”. They argue<br />

further that this demand has not evolved from the<br />

hydrological science itself due to immaturity and data<br />

limitations, but instead comes from policy makers and<br />

regulators who wish to have some kind of certification<br />

of model results.<br />

As emphasised by e.g. Forkel (1996) modelling<br />

studies involve several partners with different responsibilities.<br />

The ‘key players’ are code developers, model<br />

users and water managers. However, a lack of mutual<br />

understanding may develop due to the complexity of the<br />

modelling process and the different backgrounds of the<br />

‘key players’. For example, the strengths and limitations<br />

of modelling applications are often difficult, if not<br />

impossible, for the water managers to assess. Similarly,<br />

the transformation of objectives defined by the water<br />

manager to specific performance criteria can be very<br />

difficult for the model users to assess. It can be difficult<br />

to audit modelling projects due to the lack of proper<br />

documentation and transparency. Furthermore, it is<br />

often difficult to reconstruct and reproduce the modelling<br />

process and its results.<br />

In the water resources management community many<br />

different guidelines on good modelling practice have<br />

been developed. One of the most comprehensive examples<br />

of a modelling guideline, if not the most comprehensive, has been developed in<br />

The Netherlands (Van Waveren et al., 2000; Scholten<br />

and Groot, 2002) as a result of a process involving all<br />

the main players in the Dutch water management field.<br />

The background for this process was a perceived need<br />

for improving the quality in modelling by addressing<br />

malpractice issues such as careless handling of input<br />

data, insufficient calibration and validation, and model<br />

use outside its intended scope (Scholten et al., 2000).<br />

Similarly, modelling guidelines for the Murray-Darling<br />

Basin in Australia were developed due to the perception<br />

among end-users that model capabilities may have been<br />

‘over-sold’, and that there was a lack of consistency in<br />

approaches, communication and understanding among<br />

and between the modellers and the water managers,<br />

which often resulted in considerable uncertainty for<br />

decision making (Middlemis, 2000).<br />

As pointed out by Merrick et al. (2002) good<br />

modelling practice cannot be decomposed into a set of<br />

rigid rules that can be followed without communication<br />

between modellers and water managers. Furthermore,<br />

there is a risk that modellers will not embrace guidelines<br />

aiming to inject too much consistency in the review<br />

procedure. Experiences from Australia have shown that<br />

review reports are commonly interpreted by water<br />

managers (non-modellers) as quite negative. Non-modellers<br />

may tend to focus mainly on the negative<br />

review comments rather than balance those against the<br />

positive comments. This may mostly be the case for<br />

projects where there has not been a proper specification<br />

of the purpose and conditions at the initiation of the<br />

model study or where previous reviews during earlier<br />

project stages have been inadequate. External reviews<br />

performed at the end of a project when things may have<br />

already gone wrong may often result in defensive<br />

responses both from the modellers and the water<br />

managers (Henriksen, 2002a).<br />

All the existing modelling guidelines that we are<br />

aware of exist as reports. Electronically based support is<br />

only available as text forms to record modelling<br />

activities. No electronically based tool that is coupled<br />

to a knowledge base defining how to carry out the<br />

modelling (electronic version of guidelines with comprehensive<br />

guidance to different types of users) exists at<br />

present. This is a paradox, considering the significant<br />

resources that are invested in improving modelling<br />

software packages with respect to new sophisticated<br />

information technology.<br />

Poor modelling results may be caused by the lack of<br />

adequate model codes, or data of insufficient quantity or<br />

quality. However, according to our experience the most<br />

prevalent reason for poor modelling results is the<br />

inadequate use of guidelines and quality assurance<br />

procedures, and improper interaction between the<br />

manager (client) and the modeller (consultant). Our<br />

work has been carried out within the context of an EU<br />

supported research project (http://www.harmoniqua.org)<br />

aimed at developing a common set of quality<br />

assurance guidelines and supporting software tools. The<br />

scientific philosophical basis for the adopted terminology<br />

and guiding principles are described by Refsgaard<br />

and Henriksen (2004). The objective of the present



paper is to establish new approaches and outline the<br />

requirements of supporting tools for quality assurance<br />

procedures in the modelling process.<br />

2. Theoretical framework<br />

2.1. Terminology and scientific basis<br />

The terminology and methodology used in the<br />

following are based on Refsgaard and Henriksen (2004).<br />

The key elements in the terminology are illustrated in<br />

Fig. 1 and the most important definitions are:<br />

• A model code is a generic software program, which<br />

can be used for different study areas without<br />

modifying the source code.<br />

• A model is a site application of a code to a particular<br />

study area, including input data and parameter<br />

values.<br />

• A model code can be verified. A code verification<br />

involves comparison of the numerical solution<br />

generated by the code with one or more analytical<br />

solutions or with other numerical solutions. Verification<br />

ensures that the computer programme accurately<br />

solves the equations that constitute the<br />

mathematical model.<br />

• Model validation is here defined as the process of<br />

demonstrating that a given site-specific model is<br />

capable of making accurate predictions for periods<br />

outside a calibration period. A model is said to be<br />

validated if its accuracy and predictive capability in<br />

the validation period have been proven to lie within<br />

acceptable limits or errors.<br />
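The definition of code verification above can be made concrete with a small sketch. It assumes a deliberately simple mathematical model that is not taken from the paper: a linear reservoir, dS/dt = -kS, with the analytical solution S(t) = S0 exp(-kt). The "code" being verified is an explicit Euler scheme, and the function names, parameter values and time steps are all hypothetical.

```python
import math


def simulate_storage(s0, k, dt, n_steps):
    """Numerical solution of dS/dt = -k*S with the explicit Euler scheme."""
    storage = [s0]
    for _ in range(n_steps):
        storage.append(storage[-1] * (1.0 - k * dt))
    return storage


def analytical_storage(s0, k, dt, n_steps):
    """Exact solution S(t) = s0 * exp(-k*t) sampled at the same time steps."""
    return [s0 * math.exp(-k * dt * i) for i in range(n_steps + 1)]


def max_abs_error(s0, k, dt, t_end):
    """Largest absolute difference between numerical and analytical solution."""
    n_steps = round(t_end / dt)
    numerical = simulate_storage(s0, k, dt, n_steps)
    exact = analytical_storage(s0, k, dt, n_steps)
    return max(abs(a - b) for a, b in zip(numerical, exact))


# Hypothetical verification run: halving the time step should roughly halve
# the error, because the explicit Euler scheme is first-order accurate.
coarse_error = max_abs_error(s0=100.0, k=0.1, dt=1.0, t_end=50.0)
fine_error = max_abs_error(s0=100.0, k=0.1, dt=0.5, t_end=50.0)
error_ratio = coarse_error / fine_error
```

A check of this kind says something about the code only. Whether a site-specific model built with that code predicts well is a separate question, addressed by validation against independent data.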

These terms are commonly used, although with<br />

differences in meaning between authors. Our views on<br />

Fig. 1. Elements of a modelling terminology (Refsgaard and<br />

Henriksen, 2004).<br />

these terms and the ongoing discussion on validation-falsification-confirmation,<br />

as well as on the distinction between the terms<br />

perceptual model, conceptual model and site-specific<br />

model are given in Refsgaard and Henriksen (2004).<br />

Here we just note that, from a quality assurance<br />

guideline point of view, it is fundamental for us to<br />

make a clear distinction between the terms conceptual<br />

model, model code and (site-specific) model. Furthermore,<br />

we never use the terms verification and validation<br />

in a universal sense, but always restricted to clearly<br />

defined domains of applicability (numerical universal in<br />

Popperian sense).<br />

In addition to ensuring a proper quality of work, the<br />

three most important underlying principles that have<br />

been identified from an analysis of the modelling process<br />

are (Refsgaard and Henriksen, 2004):<br />

• Validation tests against independent data that have<br />

not also been used for calibration are necessary in<br />

order to be able to document the predictive<br />

capability of a model.<br />

• Model predictions achieved through simulation<br />

should be associated with uncertainty assessments<br />

where amongst others the uncertainty in model<br />

structure and parameter values should be accounted<br />

for.<br />

• A continuous interaction between water manager and<br />

modeller is crucial for the success of the modelling<br />

process. One of the key aspects in this regard is to<br />

establish suitable performance criteria for the model<br />

calibration and validation tests. This dialogue is also<br />

very important in connection with uncertainty<br />

assessments.<br />

2.2. Types of QA guidelines<br />

2.2.1. Definition and classification<br />

of quality assurance (QA)<br />

Quality assurance (QA) is defined by NRC (1990) as<br />

the procedural and operational framework used by an<br />

organisation managing the modelling study to assure<br />

technically and scientifically adequate execution of all<br />

tasks included in the study, and to assure that all<br />

modelling-based analysis is reproducible and defensible.<br />

In line with this we define QA guidelines as protocols<br />

and guidelines to support good application of models in<br />

water management.<br />

QA in the modelling process has two main components:<br />

(a) QA in development of model codes; and (b)<br />

QA in relation to application studies. Our paper focuses<br />

on the second component only.<br />

QA in model application studies includes data<br />

analyses, methodologies of good modelling practice,<br />

reviews and administrative procedures. Such QA guidelines<br />

can be classified according to how much focus is



put on the consensus building process between the<br />

modeller and the water manager in the following three<br />

classes:<br />

• Internal technical guidelines (Type 1) established and<br />

used internally by the modeller’s organisation.<br />

• Public technical guidelines (Type 2) established as<br />

public guidelines and used internally by the modeller’s<br />

organisation.<br />

• Public interactive guidelines (Type 3) established as<br />

public guidelines and based on regulation of the<br />

interaction between the modeller and the water<br />

manager throughout the modelling process.<br />
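The three classes can be read as a small decision rule on two properties of a guideline: whether it was established publicly, and whether it regulates the interaction between modeller and water manager. The sketch below encodes that reading. The names `GuidelineType` and `classify_guideline` are hypothetical, and treating every internally established guideline as Type 1, whatever its content, is an assumption made for this illustration.

```python
from enum import Enum


class GuidelineType(Enum):
    INTERNAL_TECHNICAL = 1  # Type 1
    PUBLIC_TECHNICAL = 2    # Type 2
    PUBLIC_INTERACTIVE = 3  # Type 3


def classify_guideline(is_public, regulates_interaction):
    """Map two properties of a QA guideline onto the three classes."""
    if not is_public:
        # Established unilaterally by the modeller's organisation.
        return GuidelineType.INTERNAL_TECHNICAL
    if regulates_interaction:
        return GuidelineType.PUBLIC_INTERACTIVE
    return GuidelineType.PUBLIC_TECHNICAL


# Properties assigned here follow the examples given in the text.
company_procedure = classify_guideline(is_public=False, regulates_interaction=False)
technical_standard = classify_guideline(is_public=True, regulates_interaction=False)
dutch_guidelines = classify_guideline(is_public=True, regulates_interaction=True)
```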

2.2.2. Type 1: Internal technical guidelines<br />

Most organisations involved in modelling studies<br />

have some kind of internal QA procedures. They usually<br />

focus on the technical aspects, i.e. to ensure that the<br />

modelling work itself is done without making unqualified<br />

judgements or errors. The better of these are<br />

based on the modelling protocols and similar scientifically<br />

based procedures originating from the research<br />

community. These procedures are internal in nature<br />

because they have been established or adopted unilaterally<br />

by the modeller’s organisation, and because they<br />

seldom deal with the interaction between modeller and<br />

end-user. Examples of Type 1 guidelines include:<br />

• Internal QA procedures, common in many companies.<br />

• Textbooks. Many textbooks contain chapters with<br />

recommended modelling protocols (e.g. Anderson<br />

et al., 1993).<br />

• Manuals to software packages with hints on the best<br />

way to use a model (e.g. Rumbaugh and Rumbaugh,<br />

2001; DHI, 2002).<br />

2.2.3. Type 2: Public technical guidelines<br />

These guidelines often contain the same substance as<br />

the internal technical guides mentioned above. However,<br />

they differ in the sense that they have been<br />

prepared through a consultative and consensus building<br />

process involving many persons and organisations. They<br />

focus on the technical aspects and give no or little<br />

emphasis to the interaction between the modeller and<br />

the end-user. Examples of Type 2 guidelines include:<br />

• The CAMASE guidelines for modelling that were<br />

developed after substantial consultation within the<br />

scientific modelling community (CAMASE, 1996).<br />

• Standards from the American Society for Testing and<br />

Materials (e.g. ASTM, 1994).<br />

• Many of the UK standards, especially the older ones<br />

(Packman, 2002).<br />

2.2.4. Type 3: Public interactive guidelines<br />

These guidelines have, like the public technical<br />

guidelines (Type 2), been established through a public<br />

consultative and consensus building process. However,<br />

they differ from the Type 2 guidelines by an additional<br />

focus on regulating the interaction between the modeller<br />

and the water manager, who often have the roles of<br />

consultant and client, respectively.<br />

Important elements in public interactive guidelines<br />

are reviews that, in addition to QA in the sense of technical<br />

guidance, can facilitate the consensus-building process<br />

between the parties. Experience shows that such a<br />

process is crucial for the overall credibility of the modelling<br />

process. Examples of such QA guidelines include<br />

(more details on these guidelines are provided in the next<br />

chapter):<br />

• The Dutch guidelines (Van Waveren et al., 2000;<br />

Scholten and Groot, 2002).<br />

• The Australian groundwater flow modelling guidelines<br />

established by the Murray-Darling Basin<br />

Commission (Middlemis, 2000; Merrick et al.,<br />

2002; Henriksen, 2002a).<br />

• The Danish groundwater modelling guidelines<br />

(Henriksen, 2002b).<br />

• Some of the recent UK standards (Packman, 2002).<br />

• Californian guidelines prepared by the Bay-Delta Modelling<br />

Forum (BDMF, 2000).<br />

2.3. Development stage and prevalence<br />

of QA guidelines<br />

Reviews of a number of existing QA guidelines (see<br />

details in the next chapter) revealed significant differences in<br />

current practice, both between domains and between<br />

different countries. In some domains and some countries<br />

there has been a clear trend over the past couple of<br />

decades to move from Type 1 to Type 2 or Type 3<br />

guidelines. In order to understand the development of<br />

QA guidelines and be able to provide recommendations<br />

based on anticipated future needs, it is important to try<br />

to understand why the present differences in the<br />

developmental stage of QA guidelines exist. The<br />

hypothesis that we will test is that the development<br />

stage depends on two main factors:<br />

• The scientific maturity of the underlying discipline,<br />

i.e. how well understood are the underlying processes<br />

and how easily available are the data<br />

necessary for practical applications. In this respect,<br />

a mature scientific discipline is one where there is<br />

a general acceptance in the scientific community on<br />

how the processes are described, there are no<br />

significant controversies on key issues, and it is<br />

feasible to acquire the necessary data for practical


J.C. Refsgaard et al. / Environmental Modelling & Software 20 (2005) 1201–1215<br />


studies. Similarly, an immature scientific discipline is<br />

one where some processes are not well understood,<br />

where there are several alternative ‘schools’ on how<br />

to describe things, and where it is often not possible<br />

to obtain sufficient field data necessary to perform<br />

scientifically sound modelling. Immature scientific<br />

disciplines are often considered as being complex,<br />

and are characterised by unresolved problems such<br />

as scale problems. For example, whereas biology is<br />

a relatively old science in comparison with hydrogeology,<br />

biota (ecological) modelling is considered<br />

to be immature in contrast to groundwater flow<br />

modelling which is considered to be mature. Biota<br />

modelling is rather uncertain due to the inherent<br />

complexity of ecological systems and the general<br />

limited availability of relevant field data, whereas the<br />

mathematical principles describing groundwater<br />

flow are well established and flow systems are<br />

readily characterised in the field.<br />

The modelling market maturity, i.e. how well developed<br />

is the market for modelling studies. In this<br />

respect, a mature market is characterised by (a) the<br />

modelling market is relatively old with numerous<br />

examples of good and poor quality modelling<br />

studies, and the motivation for establishing QA<br />

guidelines is largely due to water managers having<br />

experience with studies of poor quality; (b) most jobs<br />

are outsourced to private consultants; (c) the volume<br />

of modelling work is large, so that a number of<br />

consultants can be sustained and standard routines<br />

can evolve; and (d) there is a considerable competition<br />

among modellers in getting the jobs. Similarly,<br />

an immature market is characterised by (a) it is relatively<br />

new (typically less than 10 years); (b) most modelling<br />

studies are carried out by government agencies themselves;<br />

(c) the volume of work for the consultants is<br />

small; and (d) there is virtually no competition<br />

among modellers, instead the work is carried out by<br />

a few specialised groups which are often located in or<br />

have close ties to the research community.<br />

If these hypotheses were true one would a priori<br />

expect that a considerable degree of scientific maturity is<br />

required for QA guidelines of Type 2 to develop, and<br />

that further a mature modelling market is a necessary<br />

prerequisite for the development of Type 3 guidelines.<br />
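The a priori expectation above can be written as a small decision rule. The following is a minimal sketch, assuming that both maturity assessments are reduced to yes/no judgements (the paper assesses them subjectively and on a graded scale); the function name `expected_guideline_type` is hypothetical.

```python
def expected_guideline_type(science_mature: bool, market_mature: bool) -> int:
    """Return the most advanced guideline type (1, 2 or 3) expected a priori.

    Assumption: maturity is simplified to a yes/no judgement. The result is an
    upper bound implied by the hypothesis, not an observed classification.
    """
    if not science_mature:
        # Without scientific maturity only internal (Type 1) procedures are expected.
        return 1
    if not market_mature:
        # Scientific maturity alone is expected to allow public (Type 2) guidelines.
        return 2
    # Both prerequisites are met, so interactive (Type 3) guidelines can develop.
    return 3


for science, market in [(False, False), (True, False), (True, True)]:
    print(science, market, "-> Type", expected_guideline_type(science, market))
```

The rule only encodes necessary conditions: a domain that satisfies both prerequisites may still have less developed guidelines in practice.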

3. Existing guidelines<br />

Reviews of existing QA guidelines were conducted<br />

(Refsgaard, 2002). The reviews attempted to cover two<br />

aspects: (a) variation of practices between seven different<br />

modelling domains (groundwater, precipitation-runoff,<br />

hydrodynamics, flood forecasting, surface water quality,<br />

biota (ecology) and socio-economy); and (b) differences<br />

between geographical regions. The reviews of state-of-the-art<br />

in the seven domains were carried out by<br />

seven different organisations with special expertise in the<br />

respective domains. During these reviews a broad search<br />

of relevant QA guidelines was made with primary focus<br />

on existing guidelines in Europe and secondarily<br />

on guidelines from North America and Australia.<br />

Subsequently, a few cases with guidelines from different<br />

geographical areas were selected for a more detailed<br />

review. The reviews did not intend to be exhaustive by<br />

including all important QA guidelines, but aimed at<br />

selecting guidelines representative for conditions in<br />

Europe, North America and Australia.<br />

In order to test the above hypotheses the conclusions<br />

of the state-of-the-art of QA guidelines for the different<br />

domains summarised in Section 3.1 are plotted in Fig. 2<br />

as a function of scientific maturity. Furthermore,<br />

examples of guidelines from different countries are<br />

presented in Section 3.2 and Fig. 3 with focus on market<br />

maturity.<br />

[Figure: modelling domains plotted by scientific maturity (vertical axis, Immature to Mature) against type of QA guidelines (horizontal axis: Type 1 Internal, Type 2 Public, Type 3 Interactive). Modelling domains: GW-HD: Groundwater flow; GW-AD: Groundwater solute transport; GW-WQ: Groundwater geochemistry; PR: Precipitation runoff; HD: Hydrodynamic, surface water flow; HD-Sed: Sediment transport/morphology; FF: Flood forecasting; SWQ: Surface water quality; Biota: Biota (ecology); SE: Socio-economy.]<br />

Fig. 2. State-of-the-art for QA guidelines in different modelling domains plotted against maturity of the underlying scientific disciplines.<br />

[Figure: guideline cases plotted by maturity of the modelling market (vertical axis, from Immature (new, small, specialised) to Mature (old, big, competitive)) against type of QA guidelines (horizontal axis: Type 1 Internal, Type 2 Public, Type 3 Interactive). Cases: BDMF: Bay Delta Modelling Forum (California); AUS-GW: Australia, groundwater; NL-GMP: Dutch Good Modelling Practice; DK-GW: Denmark, groundwater; UK: United Kingdom, several domains; ASTM: American Society for Testing and Materials; CEE: Central and Eastern Europe; FR-FF: France, flood forecasting.]<br />

Fig. 3. Different types of guidelines as a function of maturity in the modelling market.<br />

3.1. State-of-the-art in different modelling domains<br />

Groundwater modelling (Refsgaard and Henriksen,<br />

2002): In this field, QA guidelines are well developed<br />

and used in many countries, but mostly in groundwater<br />

flow modelling, where the state-of-the-art corresponds<br />

to Type 3 guidelines. For solute transport, and in<br />

particular for geochemical modelling, relatively few<br />

guidelines exist and they are not commonly used. The<br />

need for QA guidelines differs from country to country,<br />

among other reasons because of the different stages of development of<br />

the groundwater modelling market. For instance, the<br />

guides from the American Society for Testing and<br />

Materials (ASTM) were among the first of their kind to<br />

be developed, in the early 1990s, because the practical<br />

application of groundwater models at that time had<br />

progressed further in the USA than in most other<br />

countries.<br />

Precipitation-runoff modelling (Perrin et al., 2002a):<br />

Relatively few guidelines exist for this domain as standalone<br />

guidelines. The guidelines that do exist are generally<br />

confined to relatively simple (lumped) approaches,<br />

while no generic guidelines exist for the more complex<br />

models of the distributed physically-based type. Thus,<br />

the state-of-the-art for precipitation-runoff as a standalone<br />

domain may be characterised as Type 1/Type 2.<br />

However, it is also noted that precipitation-runoff<br />

modelling is often used as an integral part of other<br />

domains, e.g. groundwater models, hydrodynamic<br />

models, flood forecasting models and surface water<br />

quality models. For some of these integrated applications<br />

some guidelines have been developed which<br />

include the precipitation-runoff domain. This is, for<br />

instance, the case for the Danish groundwater guidelines<br />

(Henriksen, 2002b) which include aspects of precipitation-runoff<br />

modelling.<br />

Hydrodynamic modelling (Metelka and Krejcik,<br />

2002a): This domain includes environmental applications<br />

such as modelling of urban drainage and sewer<br />

systems, rivers, floodplains, estuaries and coastal waters<br />

both with respect to flows, sediment and morphological<br />

issues. QA guidelines are well developed in some fields<br />

(e.g. in urban drainage and river modelling), but not in<br />

other fields (e.g. sediment and morphological modelling).<br />

For hydrodynamic modelling in coastal areas and<br />

estuaries few QA guidelines have been identified. The<br />

state-of-the-art may be characterised as Type 2 for most<br />

parts of the domain and Type 1 for other parts. It is<br />

noted that hydrodynamic modelling is often an integral<br />

part of flood forecasting and surface water quality<br />

modelling. Although very similar in theoretical scientific<br />

background, this domain is different from the field of<br />

Computational Fluid Dynamics that typically is used for<br />

industrial purposes.<br />

Flood forecasting modelling (Balint, 2002): This<br />

domain differs fundamentally from the other domains<br />

by being based on real-time operation. This implies that<br />

the models, once established, are applied on a routine<br />

(daily) basis although often under extreme boundary<br />

conditions. The focus on QA in this domain is often<br />

concentrated on data quality for the on-line data<br />

acquisition. Due to this fundamental difference in nature,<br />

the status of QA guidelines for this domain does not fit<br />

well into the above classification, and it is not easily<br />

comparable to the status of the other domains.<br />

Surface water quality modelling (Da Silva et al.,<br />

2002): Surface water quality modelling is based on<br />

a description of physical, chemical and biological<br />

processes. Often the data availability to assess model<br />

processes and parameters is sparse and often the<br />

key processes are not well understood. QA guidelines



are generally not well developed. The state-of-the-art<br />

may be characterised as Type 1.<br />

Biota (ecological) modelling (Old et al., 2002):<br />

Ecology is a diverse branch of biology that focuses on<br />

the relations of flora and fauna to one another and to<br />

their physical environment. Ecological models are<br />

widely used today, but perceived as being rather<br />

uncertain due to the inherent complexity of ecological<br />

systems and the general limited availability of relevant<br />

field data. QA guidelines are generally not well<br />

developed. The state-of-the-art may be characterised as<br />

Type 1.<br />

Socio-economic modelling (Heinz and Eberle, 2002):<br />

No general QA guidelines exist for socio-economic<br />

modelling. The few existing guidelines, such as the<br />

CAMS, CFMPS and RBMPs in the UK, are specific for<br />

particular types of application, and they are so far only<br />

used in practice in a few countries. The state-of-the-art<br />

may be characterised as Type 1/Type 2.<br />

In Fig. 2 the state-of-the-art for QA guidelines in the<br />

respective modelling domains has been plotted against<br />

the scientific maturity of the underlying disciplines. The<br />

scientific maturity of the respective domains has been<br />

assessed subjectively on the basis of the criteria outlined<br />

in Section 2.3 above. There is a tendency that the least<br />

developed guidelines (Type 1) appear in domains where<br />

the underlying scientific basis is characterised as<br />

immature, i.e. in surface water quality, biota (ecology)<br />

and groundwater quality, reflecting that many fundamental<br />

scientific issues remain to be solved. Similarly,<br />

the Type 2 and Type 3 guidelines are dominant in<br />

domains characterised by scientific maturity. However,<br />

there are clear exceptions such as precipitation-runoff<br />

and flood forecasting, where factors other than scientific<br />

maturity must play a role for the development stage of<br />

QA guidelines.<br />

3.2. Current practice in different countries<br />

The current practice of using QA guidelines in<br />

different countries has been illustrated through some<br />

selected cases that have been reviewed in Refsgaard<br />

(2002). In Fig. 3 the type of QA guidelines used in the<br />

case studies is plotted against the maturity of the<br />

modelling market that has been assessed subjectively on<br />

the basis of the criteria given in Section 2.3 above. The<br />

practice as reflected by the case studies and shown on<br />

the figure is summarised as follows:<br />

Dutch guidelines (Scholten and Groot, 2002): The<br />

Dutch guidelines are the most generic of the existing<br />

guidelines in the sense that they cover all the domains<br />

relevant for river basin management. The technical<br />

guidance for different modelling domains exists, but is<br />

not as detailed as some of the guidelines that only cover<br />

one domain (e.g. ASTM guides or Australian guidelines<br />

on groundwater flow modelling). The Dutch guidelines<br />

emphasise the dialogue process between modeller and<br />

water manager, including the review procedures. The<br />

Dutch guidelines belong to Type 3. The Dutch<br />

modelling market may be characterised as mature.<br />

Australian groundwater flow modelling guidelines<br />

(Henriksen, 2002a): The Australian guidelines are<br />

technically comprehensive. They focus on the dialogue<br />

between the modeller and the water manager in general<br />

and on review procedures in particular. The guidelines<br />

were developed over several years with involvement of<br />

all of the key stakeholders. The Australian guidelines<br />

belong to Type 3. The Australian groundwater modelling<br />

market may be characterised as mature.<br />

Danish groundwater modelling guidelines (Henriksen,<br />

2002b): The Danish Handbook of Good Modelling<br />

Practice and draft guidelines are similar to the Australian<br />

ones, although some important details differ. The Danish<br />

guidelines were initiated by the water managers, who also<br />

ensure that they are presently used in most studies.<br />

The Danish guidelines belong to Type 3. The<br />

Danish groundwater modelling market may be characterised<br />

as mature.<br />

Central and Eastern Europe (Metelka and Krejcik,<br />

2002b; Van Gils and Groot, 2002): Public QA guidelines<br />

are neither well developed nor used. Many modellers<br />

therefore rely only on internal QA procedures (Type 1)<br />

adopted by their respective organisations. This situation<br />

reflects a new and unregulated market for modelling<br />

services, and a market where the managers and their<br />

organisations often are technically too weak to adopt<br />

and enforce QA guidelines.<br />

French guidelines in flood forecasting (Perrin et al.,<br />

2002b): Public or interactive guidelines do not exist in<br />

this area, and the case study describes a set of internal<br />

technical guidelines (Type 1). Although flood forecasting<br />

is an old modelling discipline, the modelling<br />

market is virtually non-existent, because flood forecasting<br />

modelling in France (as well as in most other<br />

countries) is carried out either by a government agency<br />

or by a specialised research institute.<br />

UK guidelines (Packman, 2002): QA guidelines are<br />

generally very well developed in the UK. Application of<br />

guidelines is prescribed as a routine in most areas of<br />

model application. Thus, in general the UK market for<br />

modelling services is well regulated and characterised as<br />

being mature. Most of the guidelines are of Type 2 and<br />

some recent ones of Type 3. The exceptions to this are<br />

the surface water quality and biota (ecological) domains<br />

where no general guidelines exist. The guidelines in these<br />

domains are therefore confined to internal procedures<br />

inspired by textbooks and manuals (Type 1).<br />

Bay Delta Modelling Forum, California (BDMF,<br />

2000): The Californian guidelines provide a framework,<br />

but very few technical details. The main emphasis of<br />

these guidelines is on the interaction between modellers,<br />

managers and the public (Type 3). In this respect various



kinds of reviews are prescribed at various stages of the<br />

modelling process. The American market in general and<br />

the Californian in particular are well established<br />

(mature).<br />

American Society for Testing and Materials (ASTM,<br />

1992, 1994): The American guidelines are especially<br />

comprehensive in the groundwater domain, where they<br />

have served as inspiration for all the other groundwater<br />

guidelines, including the Australian and the Danish<br />

guidelines. There are a number of guidelines on various<br />

elements of the modelling process. These guides are 5–10<br />

years old and are mainly technical in nature, while<br />

limited focus is put on the interaction and review<br />

process.<br />

In addition to the above QA guidelines ISO (the<br />

International Organisation for Standardisation) regularly<br />

publishes quality management and quality assurance<br />

standards. ISO standards provide guidance on<br />

fundamental principles and procedures, but on a rather<br />

general level. We have found ISO standards addressing<br />

development, supply and maintenance of computer<br />

software (ISO 9000-3:1997) and other standards providing<br />

guidance for a general process based quality<br />

management system in an organisation (ISO<br />

9004:2000(E)). However, none of the ISO standards<br />

include any particular guidance on matters related to<br />

water resources modelling or management, and they are<br />

therefore of limited practical use as compared to the<br />

above other QA guidelines dedicated to water resources<br />

modelling.<br />

3.3. Content of existing guidelines<br />

3.3.1. Key elements<br />

The existing guidelines all comprise modelling protocols<br />

with recommended steps and technical guidance<br />

on how to perform these steps in the modelling process.<br />

The key elements may be divided into two groups,<br />

namely: (1) technical guides on how to use models; and<br />

(2) guides for regulating the interaction between<br />

modeller and end-user/water manager. The key elements<br />

in the technical guides include:<br />

Definition of the purpose of the modelling study.<br />

Collection and processing of data.<br />

Establishment of a conceptual model.<br />

Selection of code or alternatively programming and<br />

verification of code.<br />

Model set-up.<br />

Establishment of performance criteria.<br />

Model calibration.<br />

Model validation.<br />

Uncertainty assessments.<br />

Simulation with model application for a specific<br />

purpose.<br />

Reporting.<br />

The key elements in the interaction between the<br />

modeller and the end-user in addition to some of the<br />

above elements also include other aspects:<br />

Definition of the purpose of the modelling study,<br />

including translation of the end-user's needs to<br />

preliminary performance criteria.<br />

Establishment of performance criteria. The accuracy<br />

of the model predictions has to be established via<br />

a trade off between the benefits of improving the<br />

accuracy in terms of less uncertainty on the<br />

management decisions and the costs of improving<br />

the accuracy through additional model studies and/<br />

or collection of additional field data.<br />

Reviews with subsequent consultation between the<br />

modeller and the end-user at different phases of the<br />

modelling project.<br />

The content of the technical guides is to a large<br />

extent domain specific, while the elements of the<br />

interaction between the modeller and the end-user are<br />

more general in nature and differ only slightly from one<br />

domain to another.<br />
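The trade-off behind the performance criteria (benefit of reduced uncertainty against the cost of extra modelling and field data) can be illustrated with a toy calculation. This is only a sketch under stated assumptions: the three candidate options and all cost figures are hypothetical, and the helper `cheapest_option` is not taken from any of the reviewed guidelines.

```python
from typing import NamedTuple, Sequence


class Option(NamedTuple):
    name: str
    study_cost: float        # cost of modelling work and field data (hypothetical units)
    uncertainty_cost: float  # expected cost of poorer decisions due to remaining uncertainty


def cheapest_option(options: Sequence[Option]) -> Option:
    """Pick the option with the lowest sum of study cost and uncertainty cost."""
    return min(options, key=lambda o: o.study_cost + o.uncertainty_cost)


# Hypothetical figures, chosen only to show that more accuracy is not always better value.
options = [
    Option("coarse model, existing data", study_cost=50.0, uncertainty_cost=400.0),
    Option("refined model, existing data", study_cost=120.0, uncertainty_cost=200.0),
    Option("refined model, extra field data", study_cost=300.0, uncertainty_cost=150.0),
]
best = cheapest_option(options)
print(best.name, best.study_cost + best.uncertainty_cost)
```

In practice the uncertainty cost is rarely known as a single number, which is one reason the criteria are settled through dialogue between modeller and end-user rather than computed.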

3.3.2. Integration across modelling domains<br />

Almost all the existing guidelines were developed for<br />

a specific domain e.g. groundwater modelling. As<br />

integrated modelling may be expected to play an<br />

important role in connection with implementation of<br />

the EU Water Framework Directive and adoption of<br />

Integrated Water Resources Management principles,<br />

guidelines not including integrated modelling aspects are<br />

inadequate. Even the Dutch guidelines (Scholten and<br />

Groot, 2002) which cover a large number of domains are<br />

essentially single domain guidelines, because they do not<br />

provide guidance on how to integrate across domains<br />

(interdependencies etc.). However, the Dutch guidelines<br />

do have the clear advantage over other existing guidelines<br />

in that they are based on a common methodology<br />

and a common glossary.<br />

It should be noted though that some guidelines cover<br />

more than one modelling domain, as they are defined<br />

here. For instance hydrodynamic modelling or groundwater<br />

modelling are often combined with precipitation-runoff,<br />

and guidelines combining these domains exist.<br />

3.3.3. Differences in terminology<br />

As illustrated in Refsgaard (2002) the terminology<br />

used in the modelling community varies significantly<br />

between domains and even to some extent from one<br />

country to another. This clearly demonstrates the need<br />

for establishing one common terminology and glossary<br />

for modelling applications as addressed by Refsgaard<br />

and Henriksen (2004).



4. Outline of new guidelines – HarmoniQuA<br />

4.1. Overall aim and structure<br />

On the basis of the knowledge achieved through the<br />

review of existing guidelines, the HarmoniQuA project<br />

aims to develop a new comprehensive set of guidelines<br />

and supporting software tools to facilitate an improved<br />

quality of the modelling process and hence enhance the<br />

confidence of all stakeholders.<br />

HarmoniQuA forms part of the CATCHMOD<br />

cluster of EU research projects (Blind, 2004). It aims<br />

to be a methodological component of a future infrastructure<br />

for model based decision support for water<br />

management at catchment and river basin scale. This<br />

main goal will be reached by providing the elements of<br />

a methodological layer in this infrastructure, embodied<br />

in a knowledge base (KB) and software tools. HarmoniQuA<br />

will collect methodological expertise, structure<br />

this knowledge and identify and fill in gaps. It will<br />

consist of generic and domain specific knowledge,<br />

modelling software specific aspects, and a transparent<br />

and consistent glossary of terms and concepts. This<br />

body of knowledge will be structured in a knowledge<br />

base. The following set of software tools will provide<br />

functionality for the HarmoniQuA system:<br />

guideline tool: will generate guidelines from the KB;<br />

monitoring tool: will monitor all activities within<br />

a modelling job and store these activities as a single<br />

model journal in a model archive;<br />

report tool: generates reports from a model journal;<br />

advisor tool: advises modellers in new modelling jobs<br />

based on decisions and choices of previous jobs and<br />

associated model journals in the model archive.<br />

An overview of the HarmoniQuA products (KB and<br />

tools) and how these interact with the activities of the<br />

users is presented in Fig. 4. The lower part of Fig. 4<br />

depicts the five major steps of the modelling process.<br />

These five major steps are decomposed into 45 tasks,<br />

with interrelations (order and feedback) as shown in<br />

Fig. 5. Each task has an internal structure, i.e. name,<br />

definition, explanation, interrelations with other tasks,<br />

activities, activity related methods, references, task<br />

inputs and outputs. This knowledge structure (steps,<br />

tasks, within-task-knowledge) is stored in the KB. The<br />

five steps and the tasks have been selected on the basis of<br />

existing modelling protocols and QA guidelines and<br />

include the key elements outlined in Section 3.3 above.<br />
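The knowledge structure described above (steps, tasks, within-task knowledge) can be pictured as nested records. The sketch below is an illustration only and does not reproduce the actual HarmoniQuA knowledge base: the class names `Task` and `Step` and the example output value are assumptions, while the field list, the five step names and the task name "Define Objectives" follow the text and Figs. 4 and 5.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """One task of the modelling process, with the fields listed in the text."""
    name: str
    definition: str = ""
    explanation: str = ""
    related_tasks: list = field(default_factory=list)  # interrelations with other tasks
    activities: list = field(default_factory=list)
    methods: list = field(default_factory=list)        # activity related methods
    references: list = field(default_factory=list)
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


@dataclass
class Step:
    """One of the five major steps, holding its tasks in order."""
    name: str
    tasks: list = field(default_factory=list)


STEP_NAMES = [
    "Model Study Plan",
    "Data and Conceptualisation",
    "Model Set-up",
    "Calibration and Validation",
    "Simulation and Evaluation",
]

# A dictionary stands in for the knowledge base; only one example task is filled in.
knowledge_base = {name: Step(name) for name in STEP_NAMES}
knowledge_base["Model Study Plan"].tasks.append(
    Task(name="Define Objectives", outputs=["agreed objectives"])  # hypothetical output
)
print([step.name for step in knowledge_base.values()])
```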

Model based decision support has several dimensions,<br />

which hinder a ‘one-size-fits-all’-approach. HarmoniQuA<br />

attempts to serve several types of users in<br />

a series of water management domains, in jobs of<br />

diverse complexity and diverse application purpose.<br />

In this way, users working on a specific job will only be<br />

confronted with guidelines, instructions, decisions and<br />

activities that are relevant to their role in a particular<br />

modelling job.<br />

[Figure: overview diagram. A Knowledge Base (guidelines, software capabilities, glossary; domains: groundwater, precipitation-runoff, hydrodynamics, flood forecasting, water quality, biota (ecology), socio-economics) and a Model Archive (model journals for Projects A to D) are connected through the MoST tools to the user (model team, single or multiple domain). MoST functions: Guidance and Monitoring (generic and specific for model domain, user and job complexity), Reporting (specific for types of users) and Advice (from previous model projects). The model study proceeds through five steps: Model Study Plan; Data and Conceptualisation; Model Set-up; Calibration and Validation; Simulation and Evaluation. Reporting and client review take place in each step.]<br />

Fig. 4. HarmoniQuA tools (MoST) to support the QA process.<br />

[Figure: flowchart of the modelling process in five steps (Model Study Plan; Data and Conceptualisation; Model Set-up; Calibration and Validation; Simulation and Evaluation). Legend: ordinary task, decision task, review task; feedforward and feedback arrows. The first step, for example, comprises the tasks Describe Problem and Context; Define Objectives; Identify Data Availability; Determine Requirements; Prepare Terms of Reference; Proposal and Tendering; Agree on Model Study Plan and Budget.]<br />

Fig. 5. The five steps and 45 tasks of modelling process in the HarmoniQuA knowledge base.<br />

The HarmoniQuA tools have been developed in<br />

Protégé 2000 following an ontological approach. More<br />

details can be found in Kassahun et al. (2004). The tools<br />

are available on http://www.harmoniqua.org/.<br />

4.2. Key elements<br />

Some of the key features to be implemented in the<br />

new HarmoniQuA guidelines are:<br />

4.2.1. Interactive guidelines<br />

The dialogue between the different players is crucial<br />

to ensure that the output from the modelling process is<br />

understandable for stakeholders and beneficial for the<br />

client. The importance of involving stakeholders<br />

and public opinion is emphasised by Pahl-Wostl<br />

(2002) and addressed in some Type 3 guidelines (e.g.<br />

BDMF, 2000; Pascual et al., 2003). In HarmoniQuA,<br />

each of the five major steps (Fig. 5) is therefore<br />

concluded with a dialogue task, in terms of either<br />

contract negotiation (first step) or reviews (last four<br />

steps). A dialogue task encourages the assessment of the<br />

present step and provides the opportunity to redefine the<br />

content of the model study plan for the next step based<br />

upon the results and findings of the present step. These<br />

dialogue steps provide flexibility to the modelling study<br />

and ensure that the tasks that have yet to be performed<br />

can be modified according to the achieved results and<br />

perceptions of modeller and client.<br />
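The role of the dialogue tasks can be sketched as a loop in which each step is closed only after modeller and client agree. This is a simplified illustration, not the HarmoniQuA implementation: it assumes that a rejected step is simply repeated (the flowchart in Fig. 5 also allows feedback to earlier tasks), and the callback `dialogue_accepts` and the limit `max_attempts` are hypothetical.

```python
STEPS = [
    "Model Study Plan",
    "Data and Conceptualisation",
    "Model Set-up",
    "Calibration and Validation",
    "Simulation and Evaluation",
]


def dialogue_kind(step_index):
    """The first step closes with contract negotiation, the other four with a review."""
    return "contract negotiation" if step_index == 0 else "review"


def run_study(dialogue_accepts, max_attempts=5):
    """Walk through the steps, repeating a step until its dialogue task accepts it.

    `dialogue_accepts(step, attempt)` is a hypothetical stand-in for the outcome
    agreed between modeller and client.
    """
    log = []
    for index, step in enumerate(STEPS):
        for attempt in range(1, max_attempts + 1):
            if dialogue_accepts(step, attempt):
                log.append((step, dialogue_kind(index), "accepted"))
                break
            log.append((step, dialogue_kind(index), "revise"))
        else:
            raise RuntimeError(f"No agreement reached for step: {step}")
    return log


# Example: the review of the model set-up asks for one revision before accepting.
log = run_study(lambda step, attempt: attempt >= 2 if step == "Model Set-up" else True)
for entry in log:
    print(entry)
```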

4.2.2. Transparency and reproducibility<br />

Transparency and reproducibility are important,<br />

especially for large studies involving use of complex<br />

models. This will be ensured through the Monitoring<br />

Tool which enables modelling teams, consisting of<br />

modellers, managers and auditors, to be guided through<br />

the modelling process, to monitor all modelling activities<br />

and to oversee the status of each task to perform. With<br />

an increasing tendency to reuse existing models or<br />

rebuild them with additional data, modified conceptual<br />

models (revised model structure and/or inclusion of<br />

additional processes) and improved calibration and<br />

validation tests, this functionality of the Monitoring<br />

Tool becomes very important.<br />
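What recording all modelling activities in a model journal might look like can be sketched as an append-only log. The class below is a hypothetical illustration and not the Monitoring Tool itself: the class name `ModelJournal`, its methods and the JSON layout are assumptions, while the roles and task names are borrowed from the text and Fig. 5.

```python
import json
from datetime import datetime, timezone


class ModelJournal:
    """Append-only record of the activities in one modelling job (illustrative only)."""

    def __init__(self, project):
        self.project = project
        self.entries = []

    def record(self, task, actor, note):
        """Store who did what, with a UTC timestamp, so the job can be retraced later."""
        self.entries.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "task": task,
            "actor": actor,
            "note": note,
        })

    def status(self):
        """Return the latest note per task, giving an overview of where each task stands."""
        latest = {}
        for entry in self.entries:
            latest[entry["task"]] = entry["note"]
        return latest

    def to_json(self):
        """Serialise the journal, for example for storage in a model archive."""
        return json.dumps({"project": self.project, "entries": self.entries}, indent=2)


journal = ModelJournal("Project A")
journal.record("Process Raw Data", "modeller", "started")
journal.record("Process Raw Data", "modeller", "completed")
journal.record("Construct Model", "modeller", "started")
print(journal.status())
```

Because entries are only appended and never overwritten, a later team that reuses or rebuilds the model can see the sequence of choices as well as the final state.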

4.2.3. Accuracy criteria<br />

Establishment of accuracy criteria for a modelling<br />

study is a very important, but difficult, issue. Modellers<br />

often establish numerical accuracy criteria in order to<br />

classify the goodness of a given model (e.g. Henriksen<br />

et al., 2003; Scholten and Van der Tol, 1998). These<br />

attempts are very useful in making the performance<br />

more transparent and quantitative, but do not provide<br />

an objective means to decide what the optimal accuracy<br />

criteria really should be in a given case. According to<br />

Refsgaard and Henriksen (2004) no universal accuracy<br />

criteria can be established, i.e. it is generally not possible<br />

from a natural scientific point of view to tell when<br />

a model performance is good enough. Such acceptance<br />

criteria will vary from case to case depending on the<br />

socio-economic context, i.e. what is at stake in the<br />

decisions to be supported by the model predictions. An<br />

appropriate question may be: how do we translate the<br />

‘soft’ socio-economic objectives to ‘hard-core’ model<br />

performance criteria? This is obviously a challenge that<br />

cannot be solved by natural science alone, but needs to<br />

be addressed in a much broader context including<br />

aspects of economy, stakeholder interests and risk<br />

perception.<br />

Performance statistics must comprise quantifiable<br />

and objective measures. However, numerical measures<br />

cannot stand alone. Often expert opinions are necessary<br />

supplements.<br />
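
As a minimal sketch of how numerical accuracy criteria can make model performance transparent, the Python example below compares simulated and observed values against two thresholds. The statistics chosen (root mean square error and mean error), the function names and the threshold values are illustrative assumptions and are not prescribed by the HarmoniQuA guidelines; in practice the thresholds would follow from the dialogue between modeller and water manager.<br />

```python
import math


def rmse(observed, simulated):
    """Root mean square error between two equally long series."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(s - o) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared) / len(squared))


def mean_error(observed, simulated):
    """Mean error (bias); positive when the model over-predicts."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    return sum(s - o for o, s in zip(observed, simulated)) / len(observed)


def check_accuracy(observed, simulated, max_rmse, max_abs_bias):
    """Report both statistics and whether the agreed thresholds are met.

    max_rmse and max_abs_bias are hypothetical acceptance thresholds,
    assumed to have been agreed between modeller and water manager.
    """
    result = {
        "rmse": rmse(observed, simulated),
        "bias": mean_error(observed, simulated),
    }
    result["accepted"] = (
        result["rmse"] <= max_rmse and abs(result["bias"]) <= max_abs_bias
    )
    return result


if __name__ == "__main__":
    # Hypothetical groundwater heads in metres.
    observed = [10.0, 12.0, 11.0, 13.0]
    simulated = [10.5, 11.5, 11.5, 12.5]
    print(check_accuracy(observed, simulated, max_rmse=1.0, max_abs_bias=0.1))
```

A check of this kind only reports whether the agreed thresholds are met. It does not say whether the thresholds are appropriate for the decision at stake, which is why expert opinion remains a necessary supplement.<br />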

4.2.4. Uncertainty assessments<br />

Quality assurance and uncertainty assessments are<br />

two aspects that are very closely linked. Initially, the<br />

manager has to define accuracy criteria from a perception<br />

of which uncertainty level he/she believes is suitable<br />

for a particular case (see above). Subsequently, as the<br />

modelling study proceeds, the dialogue between modeller<br />

and manager has to continue with the necessary<br />

trade off between modelling accuracy and the cost of the<br />

modelling study. In the uncertainty assessments it is very<br />

important to go beyond the traditional statistical<br />

uncertainty analysis. Thus, e.g. aspects of scenario<br />

uncertainty and ignorance should generally be included<br />

and in addition the uncertainties originating from data<br />

and models often need to be integrated with socio-economic<br />

aspects in order to form a suitable basis for<br />

the further decision process (e.g. Van Asselt and<br />

Rotmans, 2002). Thus, like with the accuracy criteria<br />

(above) the use of uncertainty assessments in water<br />

resources management goes beyond natural science.<br />

Assessment of uncertainty due to errors in the model<br />

structure is a particularly difficult task and is most often<br />

neglected. One way of evaluating this source of uncertainty<br />

is through the establishment of alternative<br />

conceptual models. This aspect is emphasised in the<br />

HarmoniQuA guidelines.<br />

4.2.5. Model validation<br />

Although experience shows that models generally<br />

perform more poorly in validation tests against independent<br />

data than they do in calibration tests, model validation is<br />

in our opinion a neglected issue, both in many modelling


1212 J.C. Refsgaard et al. / Environmental Modelling & Software 20 (2005) 1201–1215<br />

guidelines and in the scientific literature. Many scientists<br />

may have avoided the term validation because of the<br />

controversies surrounding it in the philosophy of science, but<br />

in any case many scientists do not advocate the need for<br />

model validation. One of the unfortunate consequences<br />

of this ‘lack of interest’ is that not much work has been<br />

devoted to developing suitable validation test schemes<br />

since Klemes (1986). In our opinion further development<br />

of suitable testing schemes, particularly for non-linear<br />

models and for applications comprising extrapolations<br />

beyond the calibration data basis, and imposing them on<br />

all modelling projects is a major future challenge.<br />
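
The traditional split-sample test can be sketched as follows: the record is divided into two periods, the model is calibrated on the first, and its performance is then evaluated on the second, independent period. The one-parameter runoff model (runoff = c × rainfall), the function names and the data are hypothetical and chosen only to keep the example short; they do not represent any of the models or guidelines reviewed here.<br />

```python
import math


def calibrate_runoff_coefficient(rain, runoff):
    """Least-squares estimate of c in the toy model runoff = c * rain."""
    denominator = sum(r * r for r in rain)
    if denominator == 0:
        raise ValueError("calibration rainfall must not be all zero")
    return sum(r * q for r, q in zip(rain, runoff)) / denominator


def rmse(observed, simulated):
    """Root mean square error between two equally long series."""
    squared = [(s - o) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared) / len(squared))


def split_sample_test(rain, runoff):
    """Calibrate on the first half of the record, validate on the second."""
    if len(rain) != len(runoff) or len(rain) < 2:
        raise ValueError("need two equally long series with at least 2 values")
    half = len(rain) // 2
    c = calibrate_runoff_coefficient(rain[:half], runoff[:half])
    calibration_rmse = rmse(runoff[:half], [c * r for r in rain[:half]])
    validation_rmse = rmse(runoff[half:], [c * r for r in rain[half:]])
    return c, calibration_rmse, validation_rmse


if __name__ == "__main__":
    # Hypothetical record in which the catchment responds differently
    # in the second (validation) period.
    rain = [1.0, 2.0, 3.0, 4.0]
    runoff = [0.5, 1.0, 3.0, 4.0]
    c, cal, val = split_sample_test(rain, runoff)
    print(f"c={c:.2f}  calibration RMSE={cal:.2f}  validation RMSE={val:.2f}")
```

Reporting the calibration and validation errors side by side makes any loss of performance on independent data visible, which a calibration statistic alone cannot show.<br />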

4.2.6. Dedication aspects<br />

The QA guidelines describe the different tasks and<br />

responsibilities of the different types of users such as (1)<br />

modellers; (2) water managers; (3) auditors; (4) stakeholders<br />

(other than water manager); and (5) general<br />

public.<br />

The QA guidelines are developed so that they<br />

adequately reflect the different requirements in several<br />

modelling domains (and still maintain a common generic<br />

core to ensure coherency). Furthermore, the guidelines<br />

will be applicable for studies where several domains,<br />

including socio-economy, are integrated.<br />

The QA guidelines differentiate according to job<br />

complexity in modelling, e.g. (1) basic (rough calculations);<br />

(2) intermediate (moderately complex calculations);<br />

and (3) comprehensive (sophisticated, detailed<br />

calculations).<br />

5. Discussion and conclusions<br />

5.1. Types and reasons of existing QA guidelines<br />

We have classified quality assurance (QA) guidelines in<br />

three types: Internal technical guidelines (Type 1), Public<br />

technical guidelines (Type 2), and Public interactive<br />

guidelines (Type 3). We have then characterised the<br />

conditions for which the guidelines are used by (a) the<br />

scientific maturity of the underlying discipline(s) and (b)<br />

the maturity of the modelling market in the region/<br />

country for which the guidelines were developed. Our<br />

review of existing QA guidelines is not exhaustive, but<br />

limited to examples aimed at being representative for<br />

conditions in Europe, North America and Australia.<br />

Thus, we have for instance not reviewed QA guidelines<br />

from countries in Asia, where modelling has taken place<br />

for many years. The results of our review revealed<br />

significant variations in the type of guidelines available<br />

and their usage between different modelling domains and<br />

countries. We hypothesised that the stage of QA guideline<br />

development largely depends on the maturity of both the<br />

specific scientific discipline and the modelling market in<br />

the respective country or region (Figs. 2 and 3).<br />

Considering Figs. 2 and 3 it appears that the maturity<br />

of the scientific discipline and market both play an<br />

important role in QA development. However, neither<br />

the scientific level nor the market maturity alone is able<br />

to explain the differences in the stage of QA guideline<br />

development. If the underlying process understanding or<br />

necessary data are too weak, then the modelling process<br />

lacks credibility no matter how well QA procedures are<br />

adhered to. Hence, the motivation to establish sophisticated<br />

QA guidelines in such cases is small. Similarly,<br />

even though a specific discipline may be scientifically<br />

mature, modellers may be reluctant to use sophisticated<br />

QA guidelines if they are not required to do so by<br />

regulators and/or water managers. The general development<br />

of QA guidelines has progressed over time<br />

from Type 1 towards Type 3. A developmental process<br />

that is consistent with the results of the reviews as<br />

reflected in Figs. 2 and 3 is the following.<br />

Initially, when models are introduced for practical<br />

application, internal technical guidelines (Type 1)<br />

originating from the research community are applied.<br />

The development from Type 1 to Type 2 QA guidelines<br />

requires a certain degree of maturity within both the<br />

specific scientific discipline and the market. This implies<br />

that there should not be significant gaps in knowledge<br />

on process descriptions, and that there is a common<br />

agreement about the scientifically sound procedures for<br />

solving the problems in this domain. The development<br />

of Type 2 guidelines is most often driven by the demands<br />

of regulators and water managers. The development<br />

from Type 2 to Type 3 requires a clear and conscious<br />

demand from regulators and water managers.<br />

It would also have been possible to classify the QA<br />

guidelines after other criteria, for example according to<br />

how uncertainty analysis is treated, whether they apply<br />

to single or multiple domains and whether they apply to<br />

natural or social science. We have chosen our classification<br />

for two main reasons. Firstly, an improved mutual<br />

understanding between modeller and water manager is<br />

crucial for a model application to be successful in<br />

practice, and this should be facilitated by the QA<br />

guidelines. Secondly, the trend of increasing stakeholder<br />

involvement in the water resources management process<br />

demands that QA guidelines also enable stakeholders to<br />

observe and take part in parts of the modelling process.<br />

Our characterisation of QA guidelines according to<br />

scientific and market maturity has some weaknesses.<br />

First of all, the assessments have been done subjectively,<br />

because there was no other feasible method. Secondly,<br />

the two characteristics are not completely independent.<br />

Thus, a large and mature market will often demand<br />

new scientific knowledge and hence enhance<br />

scientific development, and it will also create a need for<br />

improved technical standards.<br />

Altogether, it may be concluded that our hypotheses<br />

on the importance of scientific and market maturity for



the development of QA guidelines have not been<br />

falsified. However, due to the above weaknesses and<br />

the limited empirical basis (review not exhaustive but<br />

selected examples) this conclusion should be taken with<br />

some reservation.<br />

5.2. Organisational requirements<br />

for QA guidelines to be effective<br />

As emphasised by e.g. Forkel (1996) modelling<br />

studies involve several partners with different responsibilities.<br />

The ‘key players’ are code developers, model<br />

users (modellers) and water managers (including planning<br />

and regulatory authorities). To a large extent the<br />

quality of the modelling study is determined by the<br />

expertise, attitudes and motivation of the teams involved<br />

in the modelling and quality assessment process.<br />

The attitude of the modellers is important. NRC<br />

(1990) characterises this as follows: ‘‘most modellers<br />

enjoy the modelling process but find less satisfaction in<br />

the process of documentation and quality assurance’’.<br />

Scholten and Groot (2002) describe the main problem<br />

with the Dutch Handbook on Good Modelling Practice<br />

as being that everyone likes it, but only a few use it.<br />

QA will only become successful if both of the parties,<br />

modeller and water manager, are motivated and active in<br />

supporting its use. The water manager has a particular<br />

responsibility, because he/she has the power to request<br />

and pay for adequate QA in modelling studies. Therefore,<br />

QA guidelines can only be expected to be used in practice,<br />

if the water manager prescribes their use. In this respect it<br />

is very important that the water manager has the technical<br />

capacity to organise the QA process. A significant<br />

problem for the water manager’s organisation is that it often<br />

lacks individuals who are trained at an appropriate level<br />

to understand and use models. If the water manager does<br />

not possess such skill within his/her own staff, an external<br />

modelling expert can be hired to help the manager in the<br />

QA process. However, this requires that the manager is<br />

aware of the problem and the need.<br />

5.3. The HarmoniQuA guidelines<br />

The approach adopted in the present HarmoniQuA<br />

guidelines corresponds to Type 3. However, in addition<br />

to its focus on the dialogue and role play between the<br />

various actors in the modelling process, i.e. modellers,<br />

water managers, auditors and the public/stakeholders,<br />

the HarmoniQuA approach is innovative compared to<br />

existing Type 3 QA guidelines on the following aspects:<br />

Supporting software tools, beyond simple scoreboards<br />

and templates, are novel and important<br />

elements. These tools, which contain the knowledge<br />

base (KB), can guide the users through the<br />

modelling process, monitor decisions and outcomes,<br />

and provide experience-based advice on the<br />

appropriate route to be followed. This will significantly<br />

improve the transparency and reproducibility<br />

of the modelling process. To our knowledge no such<br />

tools exist or are under development at present.<br />

The focus on performance and accuracy criteria<br />

in the modelling process is not novel as such. However,<br />

the current adaptation of these criteria through<br />

the process in connection with the formalised review<br />

steps is, if not novel, then at least emphasised much<br />

more in the HarmoniQuA guidelines than in any<br />

other existing guidelines. This approach allows the<br />

HarmoniQuA guidelines to fit nicely with the new<br />

ideas of adaptive management (Pahl-Wostl, 2002).<br />

The uncertainty aspects are given a more central role<br />

than in existing guidelines, where uncertainty often<br />

is confined to assessment of predictive uncertainties<br />

towards the end of the study. In the HarmoniQuA<br />

guidelines uncertainty aspects play an important<br />

role in 13 of the 45 tasks. Thus, uncertainty<br />

assessment is a central element in the dialogue<br />

between modeller and water manager already in the<br />

beginning of the model study when the initial<br />

performance criteria are outlined. Furthermore,<br />

HarmoniQuA recommends including less quantifiable<br />

elements such as scenario uncertainty and<br />

model structural uncertainty in the assessment.<br />

Model validation tests against independent data have<br />

more emphasis than in most other guidelines.<br />

Although the most comprehensive of the existing<br />

guidelines, the Dutch guidelines (Van Waveren<br />

et al., 2000), for example, recommend validation<br />

to be carried out, they do not describe validation<br />

tests beyond the traditional split-sample test.<br />

The HarmoniQuA guidelines are unique in their<br />

dedication aspects, namely that different tasks and<br />

responsibilities are described for different users,<br />

different modelling domains and different levels of<br />

modelling job complexity. The Australian groundwater<br />

modelling guidelines have the same feature,<br />

but only with respect to the review procedures<br />

(Merrick et al., 2002).<br />

The HarmoniQuA guidelines consist of a comprehensive<br />

set of QA guidelines for multiple modelling domains<br />

combined with the supporting software tools. These<br />

functionalities appear to be well suited to the challenges<br />

and demands of modern water resources management.<br />

The usefulness, user friendliness and appreciation by the<br />

users will be assessed through a testing of the guidelines<br />

and tools in a range of river basin modelling projects.<br />

Acknowledgements<br />

The present work was carried out within the<br />

Project ‘Harmonising Quality Assurance in model based



catchments and river basin management (HarmoniQuA)’,<br />

which is partly funded by the EC Energy,<br />

Environment and Sustainable Development programme<br />

(Contract EVK1-CT2001-00097). The constructive comments<br />

of five anonymous reviewers are acknowledged.<br />

References<br />

Anderson, M.G., Bates, P.D., 2001. Hydrological science: model<br />

credibility and scientific integrity. In: Anderson, M.G., Bates, P.D.<br />

(Eds.), Model Validation. Perspectives in Hydrological Science.<br />

John Wiley & Sons, Chichester, pp. 1–10.<br />

Anderson, M.P., Woessner, W.W., 1992. The role of postaudit in<br />

model validation. Advances in Water Resources 15, 167–173.<br />

Anderson, M.P., Ward, D.S., Lappala, E.G., Prickett, T.A., 1993.<br />

Computer models for subsurface water. In: Maidment, D.R. (Ed.),<br />

Handbook of Hydrology. McGraw-Hill, Inc (Chapter 22).<br />

ASTM, 1992. Standard Practice for Evaluating Mathematical Models<br />

for the Environmental Fate of Chemicals. Standard E978-92,<br />

American Society for Testing and Materials, http://www.astm.org.<br />

ASTM, 1994. Standard Guide for Application of a Ground-Water<br />

Flow Model to a Site-Specific Problem. Standard D5447-93,<br />

American Society for Testing and Materials, http://www.astm.org.<br />

Balint, G., 2002. State-of-the-art for flood forecasting modelling. In:<br />

Refsgaard, J.C. (Ed.), State-of-the-Art Report on Quality Assurance<br />

in Modelling Related to River Basin Management. Chapter 7,<br />

Geological Survey of Denmark and Greenland, Copenhagen,<br />

http://www.harmoniqua.org.<br />

BDMF, 2000. Protocols for Water and Environmental Modeling.<br />

Bay-Delta Modeling Forum. Ad hoc Modeling Protocols Committee,<br />

http://www.sfei.org/modelingforum/.<br />

Blind, M., 2004. ICT requirements for an ‘evolutionary’ development<br />

of WFD compliant River Basin Management Plans. In: Pahl, C.,<br />

Schmidt, S., Jakeman, T. (Eds.), iEMSs 2004 International<br />

Congress: ‘‘Complexity and Integrated Resources Management’’.<br />

International Environmental Modelling and Software Society,<br />

Osnabrück, Germany, June 2004.<br />

CAMASE, 1996. CAMASE was a Concerted Action for the Development<br />

and Testing of Quantitative Methods for research on<br />

Agricultural Systems and the Environment, http://www.bib.wau.nl/camase/.<br />

Da Silva, M.C., Barbosa, A.E., Rocha, J.S., Fortunato, A.B., 2002.<br />

State-of-the-art for surface water quality modelling. In: Refsgaard,<br />

J.C. (Ed.), State-of-the-Art Report on Quality Assurance in<br />

Modelling Related to River Basin Management. Chapter 8,<br />

Geological Survey of Denmark and Greenland, Copenhagen,<br />

http://www.harmoniqua.org.<br />

DHI, 2002. MIKE 11 User Guide. DHI Water & Environment,<br />

Hørsholm, Denmark.<br />

Forkel, C., 1996. Das numerische Modell – ein schmaler Grat zwischen<br />

vertrauenswürdigem Werkzeug und gefährlichem Spielzeug. Presented<br />

at the 26. IWASA, RWTH Aachen, 4–5 January 1996.<br />

GWP-TAC, 2000. Integrated Water Management, TEC Background<br />

Papers No. 4, Global Water Partnership, SE-105 25 Stockholm,<br />

Sweden, ISBN: 91-630-9229-8.<br />

Heinz, I., Eberle, S., 2002. State-of-the-art for socio-economic<br />

modelling. In: Refsgaard, J.C. (Ed.), State-of-the-Art Report on<br />

Quality Assurance in Modelling Related to River Basin Management.<br />

Chapter 10, Geological Survey of Denmark and Greenland,<br />

Copenhagen, http://www.harmoniqua.org.<br />

Henriksen, H.J., 2002a. Australian groundwater modelling guidelines.<br />

In: Refsgaard, J.C. (Ed.), State-of-the-Art Report on Quality<br />

Assurance in Modelling Related to River Basin Management.<br />

Chapter 13, Geological Survey of Denmark and Greenland,<br />

Copenhagen, http://www.harmoniqua.org.<br />

Henriksen, H.J., 2002b. Danish groundwater modelling guidelines. In:<br />

Refsgaard, J.C. (Ed.), State-of-the-Art Report on Quality Assurance<br />

in Modelling Related to River Basin Management. Chapter<br />

14, Geological Survey of Denmark and Greenland, Copenhagen,<br />

http://www.harmoniqua.org.<br />

Henriksen, H.J., Troldborg, L., Nyegaard, P., Sonnenborg, T.O.,<br />

Refsgaard, J.C., Madsen, B., 2003. Methodology for construction,<br />

calibration and validation of a national hydrological model for<br />

Denmark. Journal of Hydrology 280 (1–4), 52–71.<br />

Kassahun, A., Scholten, H., Zompanakis, G., Gavardinas, C., 2004.<br />

Support for model based water management with the HarmoniQuA<br />

toolbox. In: Pahl, C., Schmidt, S., Jakeman, T. (Eds.),<br />

iEMSs 2004 International Congress: ‘‘Complexity and Integrated<br />

Resources Management’’. International Environmental Modelling<br />

and Software Society, Osnabrück, Germany, June 2004.<br />

Klemes, V., 1986. Operational testing of hydrological simulation<br />

models. Hydrological Sciences Journal 31, 13–24.<br />

Merrick, N.P., Middlemis, H., Ross, J.B., 2002. Groundwater<br />

Modelling Guidelines for Australia – Recommended Procedures<br />

for Modelling Reviews. International Groundwater Conference.<br />

Balancing the Groundwater Budget. Northern Territory. Australia.<br />

12–17 May 2002.<br />

Metelka, T., Krejcik, J., 2002a. State-of-the-art for hydrodynamic. In:<br />

Refsgaard, J.C. (Ed.), State-of-the-Art Report on Quality Assurance<br />

in Modelling Related to River Basin Management. Chapter 6,<br />

Geological Survey of Denmark and Greenland, Copenhagen,<br />

http://www.harmoniqua.org.<br />

Metelka, T., Krejcik, J., 2002b. Quality assurance in Central and<br />

Eastern Europe. In: Refsgaard, J.C. (Ed.), State-of-the-Art Report<br />

on Quality Assurance in Modelling Related to River Basin<br />

Management. Chapter 15, Geological Survey of Denmark and<br />

Greenland, Copenhagen, http://www.harmoniqua.org.<br />

Middlemis, H., 2000. Murray-Darling Basin Commission. Groundwater<br />

Flow Modelling Guideline. Aquaterra Consulting Pty Ltd.<br />

South Perth. Western Australia. Project no. 125.<br />

NRC, 1990. Ground Water Models: Scientific and Regulatory<br />

Applications. National Research Council, National Academy<br />

Press, Washington, D.C.<br />

Old, G.H., Packman, J.C., Calver, A.N., 2002. State-of-the-art<br />

for biota (ecological) modelling. In: Refsgaard, J.C. (Ed.),<br />

State-of-the-Art Report on Quality Assurance in Modelling<br />

Related to River Basin Management. Chapter 9, Geological<br />

Survey of Denmark and Greenland, Copenhagen,<br />

http://www.harmoniqua.org.<br />

Packman, J.C., 2002. Quality Assurance in the UK. In: Refsgaard, J.C.<br />

(Ed.), State-of-the-Art Report on Quality Assurance in Modelling<br />

Related to River Basin Management. Chapter 17, Geological<br />

Survey of Denmark and Greenland, Copenhagen,<br />

http://www.harmoniqua.org.<br />

Pahl-Wostl, C., 2002. Towards sustainability in the water sector – the<br />

importance of human actors and processes of social learning.<br />

Aquatic Sciences 64, 394–411.<br />

Pascual, P., Stiber, N., Sunderland, E., 2003. Draft Guidance on the<br />

Development, Evaluation, and Application of Regulatory Environmental<br />

Models. Council for Regulatory Environmental Modeling.<br />

US EPA, Washington D.C.<br />

Perrin, C., Andreassian, V., Michel, C., 2002a. State-of-the-art for<br />

precipitation-runoff modelling. In: Refsgaard, J.C. (Ed.), State-of-the-Art<br />

Report on Quality Assurance in Modelling Related to<br />

River Basin Management. Chapter 5, Geological Survey of Denmark<br />

and Greenland, Copenhagen, http://www.harmoniqua.org.<br />

Perrin, C., Andreassian, V., Michel, C., 2002b. Quality assurance for<br />

precipitation-runoff modelling in France. In: Refsgaard, J.C. (Ed.),<br />

State-of-the-Art Report on Quality Assurance in Modelling<br />

Related to River Basin Management. Chapter 16, Geological<br />

Survey of Denmark and Greenland, Copenhagen,<br />

http://www.harmoniqua.org.<br />



Refsgaard, J.C. (Ed.), 2002. State-of-the-Art Report on Quality<br />

Assurance in Modelling Related to River Basin Management.<br />

Report from the EU research project HarmoniQuA,<br />

http://www.harmoniqua.org. 18 chapters, 182 pp. Geological Survey of<br />

Denmark and Greenland, Copenhagen.<br />

Refsgaard, J.C., Henriksen, H.J., 2002. State-of-the-art for Groundwater<br />

Modelling. In: Refsgaard, J.C. (Ed.), State-of-the-Art Report<br />

on Quality Assurance in Modelling Related to River Basin<br />

Management. Chapter 4, Geological Survey of Denmark and<br />

Greenland, Copenhagen, http://www.harmoniqua.org.<br />

Refsgaard, J.C., Henriksen, H.J., 2004. Modelling guidelines –<br />

terminology and guiding principles. Advances in Water Resources<br />

27, 71–82.<br />

Rumbaugh, J.O., Rumbaugh, D.B., 2001. Guide to Using Groundwater<br />

Vistas. Environmental Simulations, Inc, Virginia, USA.<br />

Rykiel, E.R., 1996. Testing ecological models: the meaning of<br />

validation. Ecological Modelling 90, 229–244.<br />

Scholten, H., Van der Tol, M.W.M., 1998. Quantitative validation of<br />

deterministic models: when is a model acceptable? In: Obaidat, M.S.,<br />

Davoli, F., DeMarinis, D. (Eds.), The Proceedings of the Summer<br />

Computer Simulation Conference. SCS, The Society for Computer<br />

Simulation International, San Diego, CA, USA, pp. 404–409.<br />

Scholten, H., Groot, S., 2002. Dutch guidelines. In: Refsgaard, J.C.<br />

(Ed.), State-of-the-Art Report on Quality Assurance in modelling<br />

related to river basin management. Chapter 12, Geological<br />

Survey of Denmark and Greenland, Copenhagen,<br />

http://www.harmoniqua.org.<br />

Scholten, H., Van Waveren, R.H., Groot, S., Van Geer, F.C., Wösten,<br />

J.H.M., Koeze, R.D., Noort, J.J., 2000. Good Modelling Practice<br />

in Water Management. Paper Presented on Hydroinformatics<br />

2000, Cedar Rapids, IA, USA.<br />

Van Asselt, M.B.A., Rotmans, J., 2002. Uncertainty in integrated<br />

assessment modelling – From positivism to pluralism. Climatic<br />

Change 54 (1–2), 75–105.<br />

Van Gils, J.A.G., Groot, S., 2002. Examples of good modelling<br />

practice in the Danube Basin. In: Refsgaard, J.C. (Ed.), State-of-the-Art<br />

Report on Quality Assurance in Modelling Related to<br />

River Basin Management. Chapter 18, Geological Survey of<br />

Denmark and Greenland, Copenhagen,<br />

http://www.harmoniqua.org.<br />

Van Waveren, R.H., Groot, S., Scholten, H., Van Geer, F.C., Wösten,<br />

J.H.M., Koeze, R.D., Noort, J.J., 2000. Good Modelling Practice<br />

Handbook, STOWA Report 99-05, Utrecht, RWS-RIZA, Lelystad,<br />

The Netherlands, http://waterland.net/riza/aquest/ (In Dutch).


[14]<br />

Refsgaard JC, Nilsson B, Brown J, Klauer B, Moore R, Bech T, Vurro M,<br />

Blind M, Castilla G, Tsanis I, Biza P (2005) Harmonised techniques and<br />

representative river basin data for assessment and use of uncertainty<br />

information in integrated water management (HarmoniRiB).<br />

Environmental Science and Policy, 8, 267-277.<br />

Reprinted from Environmental Science and Policy with permission from Elsevier


Environmental Science & Policy 8 (2005) 267–277<br />

www.elsevier.com/locate/envsci<br />

Harmonised techniques and representative river basin data for<br />

assessment and use of uncertainty information in<br />

integrated water management (HarmoniRiB)<br />

Jens Christian Refsgaard a, *, Bertel Nilsson a , James Brown b ,<br />

Bernd Klauer c , Roger Moore d , Thomas Bech e , Michele Vurro f , Michiel Blind g ,<br />

Guillermo Castilla h , Ioannis Tsanis i , Pavel Biza j<br />

a Geological Survey of Denmark and Greenland (GEUS), Department of Hydrology, Øster Voldgade, DK-1350 Copenhagen, Denmark<br />

b Universiteit van Amsterdam (UVA), Amsterdam, The Netherlands<br />

c Centre for Environmental Research (UFZ), Leipzig, Germany<br />

d Centre for Ecology and Hydrology (CEH), Wallingford, UK<br />

e DHI Water and Environment (DHI), Hørsholm, Denmark<br />

f Istituto di Ricerca Sulle Acque del CNR (IRSA), Bari, Italy<br />

g Institute of Inland Water Management and Waste Water Treatment (RIZA), Lelystad, The Netherlands<br />

h Universidad de Castilla – La Mancha (UCLM), Albacete, Spain<br />

i Technical University Crete (TUC), Chania, Greece<br />

j Povodi Moravi (PM), Brno, Czech Republic<br />

Abstract<br />

This paper describes progress on HarmoniRiB, a European Commission Framework 5 project. The HarmoniRiB project aims to support<br />

the implementation of the EU Water Framework Directive (WFD) by developing concepts and tools for handling uncertainty in data and<br />

modelling, and by designing, building and populating a database containing data and associated uncertainties for a number of representative<br />

basins. This river basin network aims at becoming a ‘virtual laboratory for modelling studies’, and it will be made available for the scientific<br />

community. The data may, e.g. be used for comparison and demonstration of methodologies and models relevant to the WFD.<br />

© 2005 Elsevier Ltd. All rights reserved.<br />

Keywords: Uncertainty; River basin management; Data; Models; River basin network; HarmoniRiB; Water Framework Directive<br />

1. Introduction<br />

1.1. Problems to be addressed<br />

The Water Framework Directive (WFD) provides a<br />

European policy basis at the river basin scale. The river basin<br />

management and planning process prescribed in the WFD is<br />

an adaptation of the Integrated Water Resources Management<br />

principles (GWP, 2000), involving all physical<br />

domains in water management, sectors of water use,<br />

socio-economics and stakeholder participation. As such,<br />

* Corresponding author. Tel.: +45 38 14 27 76; fax: +45 38 14 20 50.<br />

E-mail address: jcr@geus.dk (J.C. Refsgaard).<br />

the WFD poses new challenges to water resources managers.<br />

The traditional physical domain specific and sectoral<br />

approaches need to be combined and extended to fulfil<br />

the WFD requirements. The preparation of the river basin<br />

management plans, prescribed in the WFD, is furthermore<br />

influenced by uncertainties in the underlying data and<br />

modelling results. In several sections of the WFD document,<br />

uncertainty is addressed (Blind and de Blois, 2003). In<br />

addition, most of the WFD guidance documents, being more<br />

specific than the WFD document itself, explicitly emphasise<br />

that uncertainty analyses should be performed. However, in<br />

spite of strong recommendations to consider uncertainty<br />

aspects the guidance documents do not include recommendations<br />

on how to do so.<br />

1462-9011/$ – see front matter © 2005 Elsevier Ltd. All rights reserved.<br />

doi:10.1016/j.envsci.2005.02.001



Therefore, there is a clear and urgent need for developing<br />

new concepts, methodologies and tools that can be used to<br />

assist in implementing the WFD. In order to support such<br />

research and development, it is necessary to have a network<br />

of representative river basins with datasets suitable for this<br />

purpose. This implies that the datasets, in addition to<br />

covering the diversity in terms of ecological regimes and<br />

socio-economic conditions found across Europe, must have<br />

built-in information on the uncertainties in the data.<br />

1.2. Objectives<br />

The paper presents the status and preliminary results of an<br />

ongoing research project, HarmoniRiB, that is supported<br />

under the EU’s 5th Framework Programme. The overall goal of<br />

HarmoniRiB is to develop methodologies for quantifying<br />

uncertainty and its propagation from the raw data to concise<br />

management information. The four specific project objectives<br />

are:<br />

To establish a practical methodology and a set of tools for<br />

assessing and describing uncertainty originating from<br />

data and models used in decision making processes for the<br />

production of integrated water management plans. It will<br />

include a methodology for integrating uncertainties on<br />

basic data and models and socio-economic uncertainties<br />

into a decision support concept applicable for implementation<br />

of the WFD.<br />

To provide a conceptual model for data management that<br />

can handle uncertain data and implement it for a network<br />

of representative river basins.<br />

To provide well documented datasets, suitable for<br />

studying the influence of uncertainty on management<br />

decisions for a network of representative river basins and<br />

to provide examples of their use in the development of<br />

integrated water management plans.<br />

To disseminate intermediate and final results among<br />

researchers and end-users across Europe and obtain and<br />

incorporate feedback on the methodologies, tools and the<br />

datasets.<br />

2. Uncertainty assessments<br />

2.1. Definitions and taxonomy<br />

Uncertainty and associated terms such as error, risk and<br />

ignorance are defined and interpreted differently by different<br />

authors (see Walker et al., 2003 for a review). The different<br />

definitions reflect, among other factors, the different<br />

scientific disciplines and philosophies of the authors<br />

involved, as well as the intended audience. In addition they<br />

vary depending on their purpose. Some are rather generic,<br />

such as Funtowicz and Ravetz (1990), while others apply<br />

more specifically to model based water management, such as<br />

Beck (1987). The terminology used in HarmoniRiB has<br />

emerged after discussions between social scientists and<br />

natural scientists specifically aiming at applications in<br />

model based water management (Klauer and Brown, 2003).<br />

By doing so we adopt a subjective interpretation of<br />

uncertainty in which the degree of confidence that a decision<br />

maker has about possible outcomes and/or probabilities of<br />

these outcomes is the central focus. Thus, according to our<br />

definition a person is uncertain if s/he lacks confidence<br />

about the specific outcomes of an event. Reasons for this lack<br />

of confidence might include a judgement that the information<br />

is incomplete, blurred, inaccurate, imprecise or<br />

potentially false. Similarly, a person is certain if s/he is<br />

confident about the outcome of an event. It is possible that a<br />

person feels certain but has misjudged the situation (i.e. s/he<br />

is wrong).<br />

There are many different (decision) situations, with<br />

different possibilities for characterising what we know or<br />

do not know and of what we are certain or uncertain. A first<br />

distinction is between ignorance as a lack of awareness<br />

about imperfect knowledge and uncertainty as a state of<br />

confidence about knowledge (which includes the act of<br />

ignoring). Our state of confidence may range from being<br />

certain to admitting that we know nothing (of use), and<br />

uncertainty may be expressed at a number of levels in<br />

between. Regardless of our confidence in what we know,<br />

ignorance implies that we can still be wrong (‘in error’). In<br />

this respect Brown (2004) has defined a taxonomy of<br />

imperfect knowledge illustrated in Fig. 1.<br />

In evaluating uncertainty, it is useful to distinguish<br />

between uncertainty that can be quantified, e.g. by<br />

probabilities and uncertainty that can only be qualitatively<br />

described, e.g. by scenarios. If one throws a balanced die, the<br />

precise outcome is uncertain, but the ‘attractor’ of a perfect<br />

die is certain: we know precisely the probability for each of<br />

the 6 outcomes, each being 1/6. This is what we mean by<br />

‘uncertainty in terms of probability’. However, the estimates<br />

for the probability of each outcome can also be uncertain. If<br />

a model study says: ‘‘there is a 30% probability that this area<br />

will flood two times in the next year’’, there is not only<br />

‘uncertainty in terms of probability’ but also uncertainty<br />

regarding whether the estimate of 30% is a reliable estimate.<br />

Secondly, it is useful to distinguish between bounded<br />

uncertainty, where all possible outcomes have been<br />

identified (they can be distinct or indistinct) and unbounded<br />

uncertainty, where the known outcomes are considered<br />

incomplete. Since quantitative probabilities require ‘all<br />

possible outcomes’ of an uncertain event and each of their<br />

individual probabilities to be known, they can only be<br />

defined for ‘bounded uncertainties’. If probabilities cannot<br />

be quantified in any undisputed way, we often can still<br />

qualify the available body of evidence for the possibility of<br />

various outcomes.<br />

The bounded uncertainty where all probabilities are<br />

deemed known (Fig. 1) is often denoted ‘statistical<br />

uncertainty’ (e.g. Walker et al., 2003). This is the case<br />

traditionally addressed in model based uncertainty assessment.<br />

Fig. 1. Taxonomy of imperfect knowledge resulting in different uncertainty situations (Brown, 2004).<br />

It is important to note that this case constitutes one of<br />

many decision situations outlined in Fig. 1, and in other<br />

situations the main uncertainty in a decision situation cannot<br />

be characterised statistically.<br />

2.2. Framework for describing data uncertainty<br />

By considering space–time variability and data type,<br />

Brown et al. (2005) have distinguished 13 categories of<br />

uncertain data (Table 1).<br />

By considering measurement scale, it becomes possible<br />

to quickly limit the relevant uncertainty models for a certain<br />

variable. On a discrete measurement scale, for example, it is<br />

only relevant to consider discrete probability distribution<br />

functions, whereas continuous density functions are required<br />

for continuous numerical data. In addition, the use of space<br />

and time variability determines the need for autocorrelation<br />

functions alongside a probability density function (pdf).<br />

Brown et al. (2005) explain that this classification of data by<br />

measurement scale and space–time variability is useful for<br />

uncertainty assessment because: (1) it reduces the amount of<br />

information requested from the user in populating a<br />

database; (2) it reduces the amount of information stored in a<br />

database (model parameter values); (3) it ensures a close<br />

relationship between the structure of the probability model<br />

and the techniques used to estimate its parameters; and (4) it<br />

encourages planning of measurement campaigns for<br />

collecting information on uncertainty.<br />

Each data category is associated with a range of<br />

uncertainty models, for which more specific pdfs may be<br />

developed with different simplifying assumptions (e.g.<br />

Gaussian; second-order stationarity; degree of temporal and<br />

spatial autocorrelation). The advantages of allowing a range<br />

of possible models for each data category are threefold.<br />

First, there is a need to explicitly define an appropriate set of<br />

statistical assumptions for a particular dataset. Secondly, a<br />

range of possible assumptions can be defined a priori, and<br />

hence the significance of particular assumptions can be<br />

demonstrated with examples. Finally, the trade-off between<br />

model complexity, identifiability and reliability can be<br />

reviewed over time and balanced against the (changing)<br />

practical constraints on assessing uncertainty. For example,<br />

levels of risk and expertise can be associated with the<br />

simplifying assumptions allowed in a pdf, with default<br />

models for low-risk applications involving users with<br />

limited expertise. Minimum requirements can also be<br />

identified for specific datasets, such as data on toxic<br />

chemicals.<br />

Table 1<br />

The subdivision and coding of uncertainty categories, along the ‘axes’ of space–time variability and measurement scale (Brown et al., 2005)<br />

Space–time variability | Continuous numerical | Discrete numerical | Categorical | Narrative<br />

Constant in space and time | A1 | A2 | A3 | 4<br />

Varies in time, not in space | B1 | B2 | B3 | 4<br />

Varies in space, not in time | C1 | C2 | C3 | 4<br />

Varies in time and space | D1 | D2 | D3 | 4<br />
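The coding of Table 1 can be sketched as a small lookup. The Python fragment below is an illustration only: the function names and string labels are hypothetical and are not taken from the HarmoniRiB software. It assumes that narrative data share the single code 4 whatever their space–time variability, which is consistent with the total of 13 categories, and that an autocorrelation function is relevant whenever a variable varies in space and/or time.<br />

```python
# Illustrative lookup for the category codes of Table 1.
# All names here are hypothetical; they are not part of the HarmoniRiB software.
VARIABILITY = {
    "constant": "A",     # constant in space and time
    "time": "B",         # varies in time, not in space
    "space": "C",        # varies in space, not in time
    "space-time": "D",   # varies in time and space
}
SCALE = {
    "continuous": 1,     # continuous numerical
    "discrete": 2,       # discrete numerical
    "categorical": 3,
    "narrative": 4,
}


def uncertainty_category(variability: str, scale: str) -> str:
    """Return the Table 1 code, e.g. 'B1' for a continuous time series."""
    letter = VARIABILITY[variability]   # raises KeyError for unknown labels
    number = SCALE[scale]
    if number == 4:
        # Assumption: narrative data form one category, whatever the variability.
        return "4"
    return f"{letter}{number}"


def needs_autocorrelation(code: str) -> bool:
    """Variability in space and/or time calls for an autocorrelation function."""
    return code[0] in "BCD"


ALL_CODES = sorted({uncertainty_category(v, s) for v in VARIABILITY for s in SCALE})
```

Enumerating all combinations in `ALL_CODES` gives the twelve lettered codes plus the single narrative code.<br />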

Categorical data (3) differ from numerical data (1, 2) and<br />

narrative (4) in three important ways. First, categorical data<br />

cannot be manipulated statistically (i.e. computation of<br />

mean and variance), because the categories are not measured<br />

on a numerical scale. Secondly, individual values may be<br />

assigned to unique classes (one value to one class), where<br />

pdfs are based on the measured frequency, or perceived<br />

probability (Bayes rule), that a value occurs in a particular<br />

‘hard’ class or they can be partially assigned to multiple<br />

classes (fuzzy), where probabilities reflect doubt about the<br />

proportional membership of a value to a particular class<br />

(Heuvelink and Burrough, 1993). For the purposes of an<br />

uncertainty analysis, this distinction is important, because<br />

accuracy assessments are more complicated for fuzzy<br />

descriptions of reality. An important issue often overlooked<br />

with categorical data (e.g. the confusion matrix in land cover<br />

classification) is the problem of correlation in space<br />

and time or between datasets, since traditional statistical<br />

techniques do not apply to categorical data.<br />

Reviews with results on data uncertainty reported in the<br />

literature have been compiled into a guideline report for<br />

assessing uncertainty in various types of data originating<br />

from meteorology, soil physics and geochemistry, hydrogeology,<br />

land cover, topography, discharge, surface water<br />

quality, ecology and socio-economics (Van Loon and<br />

Refsgaard, 2005).<br />

2.3. Software tool to support uncertainty assessment in<br />

data and models<br />

The components of the HarmoniRiB uncertainty software<br />

are shown in Fig. 2.<br />

There are four software components in the HarmoniRiB<br />

design, namely: (1) a module for assessing uncertainties in<br />

data and storing this information within a database design<br />

(assess data uncertainty; the database design is described<br />

briefly below); (2) a module for assessing uncertainties in<br />

models (assess model uncertainty); (3) a module for<br />

sampling from a distribution of uncertain inputs and<br />

(possibly) model parameters and implementing the model<br />

for each realisation of the uncertain inputs and parameters<br />

(uncertainty propagation); (4) a module for synthesising and<br />

presenting the uncertainty results (present uncertainty).<br />

The Data Uncertainty Engine (DUE) is illustrated in<br />

Fig. 3. It separates the analysis of data uncertainties into four<br />

stages, whereby objects are first imported into the software<br />

(1), the sources of uncertainty are then identified (2)<br />

(important for a structured analysis) and are translated into a<br />

simple model (3) (e.g. probability model) from which<br />

‘alternative realities’ can be generated. These ‘alternative<br />

realities’ are used in an uncertainty propagation analysis to<br />

establish the impacts of data uncertainty on other operations,<br />

such as modelling. Finally, it is necessary to reflect on the<br />

quality of an uncertainty analysis (4), as such analyses are fraught<br />

with assumptions and difficulties and can be misleading<br />

without quality control. The information required to<br />

generate ‘alternative realities’ of one or more environmental<br />

attributes is stored in the project database (see below).<br />
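The uncertainty propagation step described above can be sketched as a simple Monte Carlo loop. This is a minimal illustration, not the HarmoniRiB implementation: the toy model, the Gaussian assumption and all numerical values are hypothetical, and a real application would replace `toy_model` with a call to the hydrological model.<br />

```python
import random
import statistics


def sample_inputs(rng: random.Random) -> dict:
    """Draw one 'alternative reality' of the uncertain inputs (assumed Gaussian)."""
    return {
        "rainfall_mm": rng.gauss(800.0, 60.0),   # hypothetical annual rainfall
        "runoff_coeff": rng.gauss(0.35, 0.05),   # hypothetical model parameter
    }


def toy_model(inputs: dict) -> float:
    """Stand-in for a hydrological model: annual runoff in mm."""
    return inputs["rainfall_mm"] * inputs["runoff_coeff"]


def propagate(n_realisations: int, seed: int = 1) -> list:
    """Run the model once for each realisation of the uncertain inputs."""
    rng = random.Random(seed)
    return [toy_model(sample_inputs(rng)) for _ in range(n_realisations)]


def summarise(outputs: list) -> dict:
    """Condense the ensemble of outputs into a few summary statistics."""
    ordered = sorted(outputs)
    n = len(ordered)
    return {
        "mean": statistics.fmean(ordered),
        "stdev": statistics.stdev(ordered),
        "p05": ordered[int(0.05 * (n - 1))],   # approximate 5% quantile
        "p95": ordered[int(0.95 * (n - 1))],   # approximate 95% quantile
    }


results = summarise(propagate(2000))
```

The four functions loosely mirror the software components: sampling, model execution for each realisation, and synthesis of the results for presentation.<br />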

The methodology proposed for assessing model uncertainty<br />

is outlined in Refsgaard et al. (submitted for<br />

publication).<br />

2.4. Uncertainty in socio-economics<br />

Often uncertainty assessments are confined to uncertainties<br />

in data and models originating from natural science. We<br />

also consider uncertainty in socio-economic aspects by<br />

developing concepts based on the management of water<br />

resources and river basins (e.g. Cech, 2003). It takes into<br />

account literature on evaluation, e.g. cost-benefit analysis<br />

(Hanley and Spash, 1993; Bergstrom et al., 2001), multi-criteria<br />

analysis (Roy, 1996; Munier, 2004) and decision<br />

making under uncertainty (Jungermann et al., 1998). The<br />

innovative aspects of our work lie in the further development<br />

of these ideas to support the implementation of the WFD and<br />

particularly elaborating the role of uncertainty in the process<br />

of creating and selecting management measures.<br />

Fig. 2. HarmoniRiB software components.<br />

Fig. 3. Screen shots from the HarmoniRiB data uncertainty assessment tool.<br />

The uncertainty in socio-economic data of official<br />

statistics (Eurostat, Statistical bureaus of German Länder<br />

and the FRG) has been surveyed. We found that the efforts to<br />

produce accurate economic data are enormous but the<br />

knowledge and awareness of the remaining uncertainties is<br />

generally low. Despite the lack of knowledge and awareness<br />

about uncertainty in socio-economic data and their sources<br />

we judge the consideration of these uncertainties in river<br />

basin management as highly relevant. On the basis of our<br />

investigations and our experience, we expect that it will be<br />

difficult to reach a meaningful quantification of many of<br />

these uncertainties. Methods for the systematic collection of<br />

qualitative information on uncertainties as well as strategies<br />

to deal with uncertainties that are not necessarily based on<br />

quantification are therefore needed.<br />

3. Databases for accommodating uncertain data<br />

3.1. Functionality with respect to data uncertainty<br />

We have designed and developed software for a database<br />

that can handle data and data uncertainty. The novelty of<br />

this database is that it meets the following requirements:<br />

It can store time-series data.<br />

It can store spatial data, both raster and vector, as well as<br />

time-series of spatial data.<br />

It can store information about uncertainty in these data.<br />

The uncertainty characteristics are described according to<br />

the uncertainty categories listed in Table 1. This implies that<br />

for the continuous data types the uncertainty is described by<br />

use of a probability density function (pdf) and a correlation<br />

matrix (or correlation function) for normally distributed<br />

data. For categorical data (such as land cover or soil type), a<br />

non-parametric distribution is typically required, and may be<br />

stored alongside transition probabilities for describing statistical<br />

dependence. The HarmoniRiB database design therefore<br />

allows the user to associate a probability model with<br />

each uncertain data item. In future, the database will be<br />

extended to allow numerical bounds (e.g. confidence intervals)<br />

and scenarios when probabilities cannot be defined.<br />

Information on the sources of uncertainty and the quality of<br />

an uncertainty model is also stored in the database.<br />
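For a continuous time series (category B1), a pdf combined with an autocorrelation function is enough to generate realisations. The sketch below assumes Gaussian errors with a constant standard deviation and an exponential autocorrelation (correlation `rho**k` at lag `k`), which are simplifying assumptions of the kind mentioned in Section 2.2. The function name, the flow values and the parameter values are hypothetical.<br />

```python
import math
import random


def realisation(observed, sigma, rho, rng):
    """Return one realisation of a time series with autocorrelated Gaussian errors.

    observed : list of measured values
    sigma    : standard deviation of the error (assumed constant in time)
    rho      : lag-one correlation of the errors; correlation at lag k is rho**k
    """
    errors = []
    e = rng.gauss(0.0, sigma)   # start from the stationary distribution
    for _ in observed:
        errors.append(e)
        # First-order autoregressive update; keeps the variance equal to sigma**2.
        e = rho * e + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, sigma)
    return [x + err for x, err in zip(observed, errors)]


rng = random.Random(42)
flow = [12.0, 11.5, 11.1, 14.8, 19.3, 16.0, 13.2]   # hypothetical daily flows (m3/s)
ensemble = [realisation(flow, sigma=0.8, rho=0.7, rng=rng) for _ in range(500)]
```

Storing only `sigma` and `rho` at the level of the attribute, rather than for every value, corresponds to an assumption of stationarity in time.<br />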

Initial lists of pdfs and autocorrelation functions are<br />

included in a Probability Distribution Function Dictionary<br />

and an Autocorrelation Function Dictionary of the database.<br />

In addition the software will allow a user to add new<br />

functions when required. In practice, it may not be possible<br />

to calculate the pdf parameters for every attribute value in<br />

the database individually. It may only be feasible to calculate<br />

them at the level of the attribute with which the value is<br />

associated (i.e. an assumption of stationarity in space or<br />

time). In all cases, an uncertainty model is referenced by an<br />

Uncertainty Model ID (UMID), which acts as a pointer to an<br />

uncertainty model that applies to a specific location in space<br />

or time and to the information on statistical dependence<br />

between locations and attributes.



3.2. General database functionality<br />

The overall aim of the HarmoniRiB database system is to<br />

enable the HarmoniRiB Data Centre to receive, quality<br />

control, store and make available the representative basin<br />

data being assembled by the project. Ideally, it should be<br />

able to handle any data required for developing WFD-compliant<br />

River Basin Management Plans. This includes<br />

data for underlying modelling studies, and thus exceeds the<br />

WFD needs for reporting or river basin characterisations.<br />

The data will cover a wide range of water related topics but<br />

will mainly take the form of site descriptions and time series<br />

records. They will also include spatial data describing site<br />

locations, networks and variables such as land use or<br />

elevation. The proposed HarmoniRiB database design for<br />

holding these data is generic and is based on the WIS Cube<br />

(Moore, 1997). The major enhancements are not only the<br />

inclusion of uncertainty but also the seamless linking of<br />

metadata to data and a new underlying table design.<br />

At the user level, a HarmoniRiB database perceives the<br />

world as being composed of objects. These are any objects<br />

whose description and history the user wishes to record. The<br />

types or classes of object are decided by the user. Examples<br />

of object classes relevant to the WFD are sampling points,<br />

wells, reservoirs and rivers.<br />

The descriptions of objects and the events observed at<br />

them are recorded in terms of attribute values. Attributes,<br />

like object classes, are decided and defined by the user, the<br />

definitions being held in a dictionary. A wide range of spatial<br />

and non-spatial data types are supported, allowing the<br />

system to record most known or foreseeable types of<br />

attribute information required for the implementation of the<br />

WFD. Examples of attributes are object identifiers (names,<br />

reference codes, serial numbers, etc.), position, mean daily<br />

river flow, concentration (of e.g. nitrate), soil type and<br />

hydraulic conductivity.<br />

At the conceptual level, there is no differentiation<br />

between spatial and non-spatial attributes. They are all<br />

stored within the same logical framework.<br />

One way of visualising the manner in which data are<br />

stored in a HarmoniRiB database is to imagine a large cube,<br />

made up of individual cells as shown in Fig. 4. The three<br />

axes of this cube represent objects (WHERE observations<br />

were made), attributes (which record WHAT the observation<br />

was a measure of) and occasions (WHEN the observations<br />

were made). Thus, each cell in the cube records the value of<br />

an attribute at a particular object for a particular point in<br />

time. For example, one cell might record the concentration<br />

of calcium on 29 June 2002 at 10:20 (GMT) in the river<br />

Thames at Wallingford.<br />

The design regards all attribute values as potentially<br />

changeable over time, thus enabling it to handle time-series<br />

data such as river flow. This facility applies to spatial<br />

attributes as well as conventional time series making it<br />

possible to track an object’s movement. There is no<br />

constraint on the number of objects, attributes or occasions<br />

which can be recorded, other than that imposed by the<br />

physical limits of the hardware. The Cube is otherwise<br />

unlimited in all directions.<br />

Fig. 4. The Cube as a way of visualising how time series data are stored (Tindal et al., 2004).<br />

The cells in the cube hold the users’ data. Each cell<br />

contains a single attribute value. A cell can also contain<br />

some or all of the following information associated with the<br />

value:<br />

A qualifier for the value. A qualifier is an item of<br />

information which users may enter in order to amplify the<br />

meaning of an attribute value. For example, qualifiers may<br />

be useful in:<br />

Bird or bacteriological count attributes where the value<br />

may take the form of, say, ‘more than 10,000’. In this<br />

case, the value would be entered as 10,000, and the<br />

qualifier as ><br />

Chemical concentration attributes, where the actual<br />

concentration is unknown, but it is possible to say that it<br />

is less than a certain value, where the value represents<br />

the limit of detection of the analysis method. The value<br />

would be entered as the limit of detection, for example<br />

0.001, and the qualifier as <<br />

A method of derivation identifier. The method code is a<br />

user defined code identifying the source from which the<br />

value was obtained or the method by which it was derived.<br />

This information can be used, for example, by future users<br />

of the value, to determine its reliability.<br />

A measure of the value’s uncertainty in the form of a<br />

reference to an uncertainty model stored elsewhere in the<br />

database. This part of the requirement represents the<br />

major area of innovation and is likely to evolve as the<br />

project progresses.<br />

Dataset ID. Every value in the database has a pointer<br />

connecting the value to the dataset of which it is a<br />

member. The definition of what constitutes a dataset is up<br />

to the user. The only mandatory part of its definition is that<br />

the data values that make up a dataset must be owned by<br />

the same person or organisation. This condition is<br />

necessary to facilitate access control which will relate<br />

to ‘owned’ blocks of data.



Uncertainty Model ID. Each value contains a reference to<br />

an uncertainty model, which describes the range of<br />

possible values that an attribute might take at a given<br />

location.<br />
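The Cube and the contents of a cell can be sketched as a simple in-memory data structure. This is an illustration of the concept only and says nothing about the actual table design. The class and field names, the codes `LAB-01` and `DS-THAMES`, the UMID values and the measured values are hypothetical; only the object, the calcium attribute, the first occasion and the `<` qualifier with a limit of detection of 0.001 follow the examples in the text.<br />

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional, Tuple


@dataclass
class Cell:
    value: float
    qualifier: Optional[str] = None    # e.g. "<" or ">"
    method: Optional[str] = None       # method of derivation code
    dataset_id: Optional[str] = None   # pointer to the owning dataset
    umid: Optional[int] = None         # pointer to an uncertainty model


# Key: (object, attribute, occasion) = (WHERE, WHAT, WHEN)
CubeKey = Tuple[str, str, datetime]


class Cube:
    def __init__(self) -> None:
        self._cells: Dict[CubeKey, Cell] = {}

    def put(self, obj: str, attribute: str, occasion: datetime, cell: Cell) -> None:
        self._cells[(obj, attribute, occasion)] = cell

    def get(self, obj: str, attribute: str, occasion: datetime) -> Cell:
        return self._cells[(obj, attribute, occasion)]

    def time_series(self, obj: str, attribute: str) -> List[Tuple[datetime, Cell]]:
        """All (occasion, cell) pairs for one object and attribute, in time order."""
        hits = [(key[2], cell) for key, cell in self._cells.items()
                if key[0] == obj and key[1] == attribute]
        return sorted(hits, key=lambda pair: pair[0])


cube = Cube()
site = "Thames at Wallingford"
# Measured values below are invented for illustration.
cube.put(site, "calcium concentration", datetime(2002, 6, 29, 10, 20),
         Cell(value=98.0, method="LAB-01", dataset_id="DS-THAMES", umid=1))
cube.put(site, "calcium concentration", datetime(2002, 6, 30, 9, 45),
         Cell(value=97.2, method="LAB-01", dataset_id="DS-THAMES", umid=1))
# A concentration below the limit of detection: value 0.001 with qualifier "<".
cube.put(site, "pesticide concentration", datetime(2002, 6, 29, 10, 20),
         Cell(value=0.001, qualifier="<", method="LAB-01",
              dataset_id="DS-THAMES", umid=2))
```

Because many cells can point to the same UMID, one uncertainty model can serve a whole attribute, as discussed in Section 3.1.<br />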

At the physical level, the data will be stored in a set of<br />

tables in a relational database such as Oracle. These will be<br />

held in a single account managed by the database administrator.<br />

Approved applications such as the data load facility<br />

will have direct access to this account and will be able to<br />

select and update data. Users and user written applications<br />

will be given read only access to the database via their own<br />

accounts.<br />

The database software is developed for application on an<br />

ArcSDE/ArcGIS platform using ESRI technology.<br />

4. River basin network and data<br />

Many networks of river basin data have been established<br />

for research purposes during the last couple of decades. A<br />

review of the characteristics of existing networks with<br />

respect to type of data, geographical coverage, data<br />

accessibility and data use by third parties is provided by<br />

Passarella and Vurro (2003). Examples of existing international<br />

networks are Flow Regimes from International<br />

Experimental and Network Data (FRIEND); Global Runoff<br />

Data Centre (GRDC); Hydrology for the Environment, Life<br />

and Policy (HELP); World Hydrological Cycle Observing<br />

System (WHYCOS); European River and Catchment<br />

Database Pilot Project (ERICA); Inventory of the Catchments<br />

for Research in Europe (ICARE) metadatabase and<br />

the Experimental Representative Basins (ERB) network and<br />

GLOWA.<br />

In addition to these international networks, many national<br />

databases containing data from national networks of river<br />

basins exist, e.g. Lowland Catchment Research (LOCAR);<br />

Data Storage for the Rijkswaterstaat (DONAR) and British<br />

Oceanographic Data Centre (BODC).<br />

Some of the existing networks provide data for<br />

operational purposes, while most of them have been<br />

established for research purposes. Many of these networks<br />

have existed for long periods and have served (and still do)<br />

important purposes. However, seen from a Water Framework<br />

Directive perspective, most of them have the key<br />

deficiency that they focus on only some aspects (domains) of<br />

data required for water management under the WFD, and most<br />

typically they do not contain data on ecological and socio-economic<br />

aspects. Even comprehensive national databases<br />

such as LOCAR and DONAR do not contain<br />

much data on groundwater, land use and socio-economics.<br />

Among the international networks HELP has the broadest<br />

scope with a focus on socio-economic aspects. HELP,<br />

however, does not include groundwater or coastal water<br />

data. Furthermore, HELP so far only consists of rather few<br />

river basins worldwide and does not have a good coverage in<br />

Europe.<br />

Fig. 5. Location of the HarmoniRiB network of representative river basins.<br />

Thus, none of the existing river basin networks can<br />

provide suitable datasets for supporting research on<br />

integrated water management of direct relevance for<br />

implementation of the WFD. In addition, none of the<br />

existing networks comprise any quantifiable information on<br />

data uncertainty. Consequently, it is concluded that there is a<br />

clear need to supplement the existing networks with a<br />

network of representative river basins whose principal<br />

aim is to provide data supporting research in integrated<br />

water resources management as required by the WFD. The<br />

HarmoniRiB river basin network is meant for this purpose.<br />

The HarmoniRiB network of representative river basins<br />

comprises eight basins; see Fig. 5 for locations and Table 2 for<br />

characteristic features. These basins have been selected to<br />

ensure a good coverage across Europe in terms of ecoregions,<br />

types of water problems, socio-economic conflicts<br />

and amount and quality of existing data. In addition, two of<br />

the river basins (Odense and Jucar) are also included in the<br />

Pilot River Basin Network, where the EC guidance<br />

documents have been tested. The aim of HarmoniRiB is,<br />

through interaction with the respective river basin organisations<br />

and data owners, to provide well documented data for<br />

research purposes, suitable for studying the influence of<br />

uncertainty on management decisions. The data will be<br />

publicly accessible for all research purposes. Thus, scientists<br />

may use the data to, e.g. assess the appropriateness of<br />

models and other tools in relation to the WFD.<br />

For each of the eight river basins a comprehensive<br />

amount of data is presently being collected and uploaded to<br />

the HarmoniRiB database. The data basically include all<br />

data that are required to carry out analysis for the WFD<br />

implementation (Blind and de Blois, 2003). Most of the data<br />

are organised in seven datasets, one for each of the six<br />

domains: climate, rivers, lakes, groundwater, transitional<br />

waters, and coastal waters, and one for spatial data, river<br />

basin characteristics and socio-economic data. Specific lists<br />

of data have been prepared by matching the data<br />

requirements given in the guidance documents on ‘Monitoring’<br />

(EC, 2003b) and ‘Analysis of pressures and impacts’<br />

(EC, 2003a), with the data available in the respective river<br />

basins (Rasmussen, 2003).<br />

After collecting and reformatting the data they are being<br />

uploaded to the HarmoniRiB Data Centre. Subsequently,<br />

uncertainty will be assessed and added to the data following<br />

the framework outlined above.<br />

Table 2<br />

Key characteristics of the HarmoniRiB network of representative river basins<br />

Country, river basin | Area (km²) | Population density (person/km²) | GNP (Euro/pers/year) | Dominant land use | Main water uses | Main conflicting interest<br />

CZ, Svratka | 3998 | 142 | 5600 | Agriculture, forest | Drinking water, electrical power, recreation, nature | Flood protection, minimum discharges, water quality<br />

DE, Weisse Elster | 5325 | 278 | 15000 | Agriculture | Drinking water, industry | Point and non-point sources; wastewater and contaminated sites; strong economic and social changes<br />

DK, Odense | 1090 | 135 | 25000 | Agriculture | Public water supply, recreation, nature | Agricultural contamination; groundwater abstraction depletes stream flow and wetlands<br />

ES, Jucar | 21328 | 28 | 9900 | Agriculture | Irrigation, hydroelectric, touristic supply, industry | Farming use; hydroelectrical use; touristic water demand<br />

GR, Geropotamou | 600 | 66 | 10000 | Agriculture | Irrigation, touristic | Water shortage, water quality, oversized dam, salt intrusion, difficulties in sharing water among municipalities<br />

IT, Candelaro | 1980 | 230 | 10277 | Agriculture | Irrigation, industry | Water shortage; rainfall rates decrease; intensive horticultural farming<br />

NL + DE, Vecht | 3780 (1980 in NL) | 311 | 19000 | Industry, agriculture, habitation | Agriculture, drinking water, receiving water, recreation | Agriculture, water quality, ecology, flooding - room for water retention<br />

UK, Thames | 12917 | 929 | 30000 | Urban, agriculture | Public water supply, ecosystem, recreation | Water supply vs. ecology<br />



5. Case studies<br />

The methodologies will be<br />

tested through one case study for each of the eight river<br />

basins. The focus in the case studies will be assessment of<br />

uncertainties related to various aspects of the decision<br />

process related to evaluating potential measures for<br />

achieving the WFD objective of good ecological status.<br />

The following aspects of uncertainty will be considered:<br />

Uncertainty related to framing of the decision making<br />

process. This uncertainty will typically be described in<br />

qualitative terms.<br />

Uncertainty related to prediction of effects of a given<br />

measure, i.e. what is the impact of a given management<br />

decision such as changes in agricultural practice or<br />

abstraction of groundwater. Such predictions will often be<br />

made by use of hydrological models and involve the<br />

following sources of uncertainty:<br />

- Uncertainty of input data.<br />

- Uncertainty of model parameter values.<br />

- Uncertainty of model techniques (numerical solution,<br />

software bugs, etc.).<br />

- Uncertainty of model structure.<br />

Uncertainty in economic assessments, which, like<br />

uncertainty in hydrological model predictions, may<br />

originate from economic data and from the choice of<br />

evaluation method.<br />

A key problem in assessing the uncertainty of the effects<br />

of a measure is that the effects usually are estimated as a<br />

difference between two model simulations, e.g. a reference<br />

run describing the present conditions and a run where the<br />

measure is taken into account. Procedures for assessing<br />

uncertainty of a model simulation are well known, while<br />

procedures for assessing uncertainties in differences between<br />

two simulation runs are theoretically difficult and rarely<br />

used. However, here we are mainly interested in the uncertainty<br />

on the difference figures. These uncertainties related<br />

to differences in simulated output may be much smaller than<br />

the uncertainties in the model predictions of each simulation<br />

(Reichert and Borsuk, 2005) as many sources of uncertainty<br />

affect the predictions for different alternatives in similar<br />

ways.<br />
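The cancellation argument above can be illustrated with a small Monte Carlo sketch. It is not part of the HarmoniRiB methodology. It assumes, purely for illustration, that the reference run and the measure run share one additive error term (standing in for common input and parameter errors) and that each run also carries a smaller independent error. All names and numbers (`simulate_pair`, `shared_sd`, `own_sd`, the output values) are hypothetical.

```python
import random
import statistics


def simulate_pair(rng, shared_sd=10.0, own_sd=2.0, true_ref=100.0, true_measure=80.0):
    """Return one realisation of (reference run, run with the measure).

    The shared term represents error sources that affect both runs alike;
    the independent terms represent run-specific error. Both are assumptions.
    """
    shared = rng.gauss(0.0, shared_sd)
    ref = true_ref + shared + rng.gauss(0.0, own_sd)
    measure = true_measure + shared + rng.gauss(0.0, own_sd)
    return ref, measure


rng = random.Random(42)  # fixed seed so the sketch is repeatable
pairs = [simulate_pair(rng) for _ in range(20000)]

sd_ref = statistics.stdev([ref for ref, _ in pairs])
sd_measure = statistics.stdev([measure for _, measure in pairs])
sd_diff = statistics.stdev([ref - measure for ref, measure in pairs])

print(f"sd of reference run:   {sd_ref:.2f}")
print(f"sd of measure run:     {sd_measure:.2f}")
print(f"sd of the difference:  {sd_diff:.2f}")
```

Under these assumptions the shared term drops out of the difference, so its standard deviation depends only on the independent terms. If a measure changed how the model responds to the shared errors, the cancellation would be weaker.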

The results of the case study will be uncertainties<br />

expressed partly quantitatively and partly qualitatively. The<br />

quantitative parts may be illustrated as in Fig. 6, where the<br />

uncertainty on the impacts (hydrological models) is shown<br />

along the vertical axis and the uncertainty on the costs of<br />

implementing a measure is shown along the horizontal axis.<br />

In the hypothetical example shown in Fig. 6 measure no. 1<br />

(PoM 1) is clearly suboptimal as compared to the two other<br />

measures, because its effect is much lower and the<br />

implementation cost higher. A decision on whether to<br />

choose PoM 2 or PoM 3 is, however, more difficult, because<br />

the uncertainty ranges are overlapping both with regards to<br />

effects and costs. The choice will also be influenced by the<br />

risk strategy of the decision maker. If the decision maker<br />

wants a high degree of certainty for an effect corresponding<br />

to the dashed line denoted ‘Minimum effect’ s/he will have<br />

to select PoM 3, even if the expected cost efficiency of PoM<br />

2 is more favourable.<br />

Fig. 6. Graphical representation of uncertainty in simulated effect of measure vs. estimated uncertainty in cost of implementing a measure.
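The reasoning around Fig. 6 can be made concrete with a minimal sketch. The figures below are invented to mimic the hypothetical example, the effects are assumed to be normally distributed, and the ratio of expected effect to expected cost is used as a crude stand-in for expected cost efficiency. None of these choices come from the paper.

```python
from statistics import NormalDist

# Hypothetical programmes of measures: expected effect, its standard
# deviation, and expected implementation cost (arbitrary units).
measures = {
    "PoM 1": {"effect": 20.0, "effect_sd": 5.0, "cost": 90.0},
    "PoM 2": {"effect": 50.0, "effect_sd": 15.0, "cost": 40.0},
    "PoM 3": {"effect": 60.0, "effect_sd": 5.0, "cost": 60.0},
}

MIN_EFFECT = 45.0          # the 'Minimum effect' line (assumed value)
REQUIRED_CERTAINTY = 0.95  # risk strategy of the decision maker (assumed)


def prob_effect_at_least(measure, threshold):
    """Probability that the effect reaches the threshold, assuming normality."""
    dist = NormalDist(measure["effect"], measure["effect_sd"])
    return 1.0 - dist.cdf(threshold)


def expected_cost_efficiency(measure):
    """Expected effect per unit of expected cost (a simplification)."""
    return measure["effect"] / measure["cost"]


# Measures that meet the minimum effect with the required certainty.
acceptable = [
    name for name, m in measures.items()
    if prob_effect_at_least(m, MIN_EFFECT) >= REQUIRED_CERTAINTY
]
# Measure with the best expected cost efficiency, ignoring risk.
most_efficient = max(measures, key=lambda name: expected_cost_efficiency(measures[name]))

print("meets minimum effect with required certainty:", acceptable)
print("best expected cost efficiency:", most_efficient)
```

With these invented numbers the two criteria point to different measures, which is the situation described in the text: a risk-averse decision maker and one who looks only at expected cost efficiency need not agree.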


6. Discussion and conclusions<br />

Assessment of uncertainty in model simulations is<br />

important when such models are used to support decisions<br />

in water resources management (Beven and Binley, 1992;<br />

Pahl-Wostl, 2002; Jakeman and Letcher, 2003; Refsgaard<br />

and Henriksen, 2004). This is reflected in the EU’s new water<br />

management approaches as described in the Water Framework<br />

Directive (EC, 2000) and the associated guidance<br />

documents. A basic principle in EU environmental policy on<br />

which the WFD is based is “...to contribute to pursuit of the<br />

objectives of preserving, protecting and improving the<br />

quality of the environment in prudent and rational use of<br />

natural resources, and to be based on the precautionary<br />

principle ...” (paragraph 11 in the directive). The holistic<br />

concept that is prescribed in the WFD with its integrated<br />

approach to natural resources and socio-economic issues<br />

therefore requires that uncertainty be considered in the<br />

decision making process in order for it to become truly<br />

rational. This need for taking uncertainties into account is<br />

also explicitly stated in the WFD guidance documents<br />

(Blind and de Blois, 2003).<br />

The key sources of uncertainty of importance for<br />

evaluating the effect and cost of a measure in relation to<br />

preparing a WFD-compliant river basin management plan<br />

are (1) uncertainty related to framing of the decision<br />

making process; (2) uncertainty related to hydrological<br />

models (input data, parameter values, model technique,<br />

model structure); and (3) uncertainty in economic assessments.<br />

The framework adopted in HarmoniRiB addresses<br />

this wide spectrum of uncertainties. The particularly<br />

novel contributions of HarmoniRiB in this respect are<br />

related to the assessment of uncertainty in data and to<br />

the integration of uncertainty in effects of a measure<br />

(outputs from hydrological models) and socio-economic<br />

uncertainty, including uncertainty in costs of implementing a<br />

measure.<br />

New principles often lead to a demand for new research<br />

for supporting their implementation. This is also the case for<br />

the WFD. Hence there is a need for easy access to river basin<br />

datasets suitable for WFD related research. None of the<br />

existing international river basin networks can provide<br />

suitable datasets for supporting research on integrated water<br />

management of direct relevance for implementation of the<br />

WFD. In addition, none of the existing networks comprise<br />

any quantifiable information on data uncertainty. The<br />

HarmoniRiB project aims at filling this gap by designing,<br />

building and populating a database containing data and<br />

associated uncertainties for eight river basins representatively<br />

characterising the diversity of climatic regimes and<br />

water management challenges across Europe. This river<br />

basin network aims at becoming a ‘virtual laboratory for<br />

modelling studies’, and it will be made available for the<br />

scientific community. The data may, e.g. be used for<br />

comparison and demonstration of methodologies and<br />

models relevant to the WFD.<br />

Acknowledgement<br />

This work is partly funded by the EC Energy,<br />

Environment and Sustainable Development programme<br />

(Contract EVK1-2002-00109).<br />

References<br />

Beck, M.B., 1987. Water quality modelling: a review of the analysis of<br />

uncertainty. Water Resour. Res. 23 (8), 1393–1442.<br />

Bergstrom, J.C., Boyle, K.J., Poe, G.L. (Eds.), 2001. The Economic Value<br />

of Water Quality. Edward Elgar, Cheltenham.<br />

Beven, K., Binley, A.M., 1992. The future of distributed models, model<br />

calibration and uncertainty predictions. Hydrol. Processes 6, 279–298.<br />

Blind, M., de Blois, C., 2003. The Water Framework Directive and its<br />

Guidance Documents — Review of data aspects. In: Refsgaard, J.C.,<br />

Nilsson, B. (Eds.), Requirements Report. Geological Survey of Denmark<br />

and Greenland, Copenhagen (Chapter 5). Available on<br />

http://www.harmonirib.com/.<br />

Brown, J.D., 2004. Knowledge, uncertainty and physical geography:<br />

towards the development of methodologies for questioning belief.<br />

Trans. Inst. Br. Geographers 29 (3), 367–381.<br />

Brown, J.D., Heuvelink, G.B.M., Refsgaard, J.C., 2005. An integrated<br />

framework for assessing and recording uncertainties about environmental<br />

data. To appear in a special issue of Water Sci. Technol.<br />

Cech, T.V., 2003. Principles of Water Resources — History, Development,<br />

Management, and Policy. John Wiley & Sons, New York.<br />

EC, 2000. Water Framework Directive. Directive 2000/60/EC. European<br />

Commission.<br />

EC, 2003a. Guidance for the analysis of Pressures and Impacts in accordance<br />

with the Water Framework Directive. Working Group 2.1.<br />

Available on http://forum.europa.eu.int/Public/irc/env/wfd/library.<br />

EC, 2003b. Water Framework Directive, Common Implementation Strategy.<br />

Working group 2.7. Monitoring. Available on http://forum.europa.eu.int/Public/irc/env/wfd/library.<br />

Funtowicz, S.O., Ravetz, J., 1990. Uncertainty and Quality in Science for<br />

Policy. Kluwer Academic Publishers, Dordrecht.<br />

GWP, 2000. Integrated Water Resources Management. TAC Background<br />

Papers No. 4. Global Water Partnership, Stockholm. Available on<br />

http://www.gwpforum.org/.<br />

Hanley, N., Spash, C.L., 1993. Cost-Benefit Analysis and the Environment.<br />

Edward Elgar, Brookfield.<br />

Heuvelink, G.B.M., Burrough, P.A., 1993. Error propagation in cartographic<br />

modelling using Boolean logic and continuous classification.<br />

Int. J. Geogr. Inform. Sci. 7 (3), 231–246.<br />

Jakeman, A.J., Letcher, R.A., 2003. Integrated assessment and modelling:<br />

features, principles and examples for catchment management. Environ.<br />

Modell. Software 18, 491–501.<br />

Jungermann, H., Pfister, H-R., Fischer, K., 1998. Die Psychologie der<br />

Entscheidung (The Psychology of Decisions). Spektrum Akademischer<br />

Verlag, Heidelberg.<br />

Klauer, B., Brown, J.D., 2003. Conceptualising imperfect knowledge in<br />

public decision making: ignorance, uncertainty, error and ‘risk situations’.<br />

Environ. Res., Eng. Manage.<br />

Moore, R.V., 1997. The logical and physical design of the Land Ocean<br />

Interaction Study database. Sci. Total Environ. 194/195, 137–146.<br />

Munier, N., 2004. Multicriteria Environmental Assessment. Kluwer Academic<br />

Publishers, Dordrecht.<br />

Pahl-Wostl, C., 2002. Towards sustainability in the water sector — the<br />

importance of human actors and processes of social learning. Aquatic<br />

Sci. 64, 394–411.<br />

Passarella, G., Vurro, M., 2003. Review of Existing River Basin Networks.<br />

In: Refsgaard, J.C., Nilsson, B. (Eds.), Requirements Report. Geological



Survey of Denmark and Greenland, Copenhagen (Chapter 3). Available<br />

on http://www.harmonirib.com/.<br />

Rasmussen, P., 2003. Requirements for Data for HarmoniRiB. In:<br />

Refsgaard, J.C., Nilsson, B. (Eds.), Requirements Report. Geological<br />

Survey of Denmark and Greenland, Copenhagen (Chapter 7). Available<br />

on http://www.harmonirib.com/.<br />

Refsgaard, J.C., Henriksen, H.J., 2004. Modelling guidelines — terminology<br />

and guiding principles. Adv. Water Resour. 27, 71–82.<br />

Refsgaard, J.C., van der Sluijs, J.P., Brown, J., van der Keur, P., submitted<br />

for publication. A framework for dealing with uncertainty due to model<br />

structure error.<br />

Reichert, P., Borsuk, M.E., 2005. Does high forecast uncertainty preclude<br />

effective decision support? Environ. Modell. Software 20 (8), 991–1001.<br />

Roy, B., 1996. Multicriteria Methodology for Decision Aiding. Kluwer<br />

Academic Publishers, Dordrecht.<br />

Tindal, C.I., Moore, R.V., Dunbar, M., Goodwin, T., 2004. The HarmoniRiB<br />

project — the effect of uncertainty on catchment management. In:<br />

British Hydrological Society International Conference on Hydrology:<br />

Science and Practice for the 21st Century, 12–16 July 2004, London,<br />

UK.<br />

Walker, W.E., Harremoës, P., Rotmans, J., Van der Sluijs, J.P., Van Asselt,<br />

M.B.A., Janssen, P., Krayer von Krauss, M.P., 2003. Defining uncertainty.<br />

A conceptual basis for uncertainty management in model-based<br />

decision support. Integrated Assess. 4 (1), 5–17.<br />

Van Loon, E., Refsgaard, J.C. (Eds.), 2005. Guidelines for assessing data<br />

uncertainty in hydrological studies. First draft version prepared September<br />

2004. Final version to be published beginning of 2005 on<br />

http://www.harmonirib.com/.<br />

Jens Christian Refsgaard is co-ordinator of the HarmoniRiB project.<br />

Since his graduation in hydrology at the Technical University of Denmark in<br />

1976 he has worked with hydrological modelling and water resources<br />

management at DTU, DHI and now at GEUS, where he holds a position<br />

as research professor. He is currently also WP leader in HarmoniQuA<br />

(quality assurance in the modelling process) and NeWater (new approaches<br />

in water resources management).<br />

Bertel Nilsson has been a research scientist in hydrogeology at the Geological Survey<br />

of Denmark and Greenland since 1988.<br />

James Brown is a postdoctoral research associate at the University of<br />

Amsterdam with interests in environmental modelling, methods for uncertainty<br />

analysis of models, and the impacts of scientific uncertainty on<br />

decision making.<br />

Bernd Klauer has a professional background in mathematics, physics and<br />

economics. After his PhD in economics from the University of Heidelberg<br />

he became engaged at the UFZ Centre for Environmental Research, Leipzig.<br />

There he currently works as a senior scientist and leader of a research group<br />

on integrated assessment and decision support.<br />

Roger Moore is a member of the Centre for Ecology and Hydrology, UK.<br />

His background lies in civil engineering, but he has spent most of his career<br />

working on integrated database design, mainly in the UK but also around the<br />

world. Currently, he is also co-ordinator for the FP5 project HarmonIT.<br />

Thomas Bech holds an MSc in electronics engineering and computer<br />

science, and has worked as software developer and project manager at<br />

Seven Technologies and DHI Water & Environment. He is currently<br />

working as a Software Development Manager at DHI Water & Environment.<br />

Michele Vurro graduated in hydraulic engineering. He has been a researcher at<br />

CNR.IRSA since 1982 and is now principal researcher with responsibility<br />

for methodology and techniques for protecting and managing water<br />

resources, with particular emphasis on water budget under scarce water<br />

availability.<br />

Michiel Blind, MSc in Environmental Science (Water Systems Analysis), has<br />

worked for 5 years on monitoring network design at Wageningen University,<br />

after which he continued his career at RWS-RIZA, on IT-water management<br />

issues. He is mainly involved in European Research Projects on Catchment<br />

modelling.<br />

Guillermo Castilla is a forest engineer specialized in Remote Sensing and<br />

GIS. He is currently involved in the dissemination activities of HarmoniRiB.<br />

Ioannis K. Tsanis is a professor in the Department of Environmental<br />

Engineering at Technical University of Crete. He obtained his PhD in civil<br />

engineering from University of Toronto. His research activities are in the<br />

areas of hydroinformatics, water resources management and coastal engineering.<br />

His main background is hydrological modelling, water resources<br />

management and hydroinformatics.<br />

Pavel Biza has been educated in civil engineering and developed his career<br />

at the water board Povodi Moravy in the Czech Republic. He is now<br />

involved in development of river basin management plans.


[15]<br />

Refsgaard JC, van der Sluijs JP, Brown J, van der Keur P (2006). A<br />

framework for dealing with uncertainty due to model structure error.<br />

Advances in Water Resources, 29, 1586-1597.<br />

Reprinted from Advances in Water Resources with permission from Elsevier


Advances in Water Resources 29 (2006) 1586–1597<br />

www.elsevier.com/locate/advwatres<br />

A framework for dealing with uncertainty due to model<br />

structure error<br />

Jens Christian Refsgaard a, *, Jeroen P. van der Sluijs b ,<br />

James Brown c , Peter van der Keur a<br />

a Department of Hydrology, Geological Survey of Denmark and Greenland (GEUS), Øster Voldgade 10, 1350 Copenhagen, Denmark<br />

b Copernicus Institute for Sustainable Development and Innovation, Department of Science Technology and Society,<br />

Utrecht University, Utrecht, The Netherlands<br />

c University of Amsterdam (UVA), Amsterdam, The Netherlands<br />

Received 29 July 2004; received in revised form 6 September 2005; accepted 21 November 2005<br />

Available online 5 January 2006<br />

Abstract<br />

Although uncertainty about structures of environmental models (conceptual uncertainty) is often acknowledged to be the main<br />

source of uncertainty in model predictions, it is rarely considered in environmental modelling. Rather, formal uncertainty analyses<br />

have traditionally focused on model parameters and input data as the principal source of uncertainty in model predictions. The traditional<br />

approach to model uncertainty analysis, which considers only a single conceptual model, may fail to adequately sample the<br />

relevant space of plausible conceptual models. As such, it is prone to modelling bias and underestimation of predictive uncertainty.<br />

In this paper we review a range of strategies for assessing structural uncertainties in models. The existing strategies fall into two<br />

categories depending on whether field data are available for the predicted variable of interest. To date, most research has focussed<br />

on situations where inferences on the accuracy of a model structure can be made directly on the basis of field data. This corresponds<br />

to a situation of ‘interpolation’. However, in many cases environmental models are used for ‘extrapolation’; that is, beyond the situation<br />

and the field data available for calibration. In the present paper, a framework is presented for assessing the predictive uncertainties<br />

of environmental models used for extrapolation. It involves the use of multiple conceptual models, assessment of their<br />

pedigree and reflection on the extent to which the sampled models adequately represent the space of plausible models.<br />

© 2005 Elsevier Ltd. All rights reserved.<br />

Keywords: Environmental modelling; Model error; Model structure; Conceptual uncertainty; Scenario analysis; Pedigree<br />

* Corresponding author. Tel.: +45 38 14 27 76; fax: +45 38 14 20 50.<br />

E-mail address: jcr@geus.dk (J.C. Refsgaard).<br />

1. Introduction<br />

1.1. Background<br />

Assessing the uncertainty of model simulations is<br />

important when such models are used to support decisions<br />

about water resources [6,33,23,39]. The key<br />

sources of uncertainty in model predictions are (i) input<br />

data; (ii) model parameter values; and (iii) model structure<br />

(=conceptual model). Other authors further distinguish<br />

uncertainty in model context, model assumptions,<br />

expert judgement and indicator choice [46,54,48] but<br />

these are beyond the scope of this paper. Uncertainties<br />

due to input data and due to parameter values have been<br />

dealt with in many studies, and methodologies to deal<br />

with these are well developed. However, no generic<br />

methodology exists for assessing the effects of model<br />

structure uncertainty, and this source of uncertainty is<br />

frequently neglected.<br />

0309-1708/$ - see front matter © 2005 Elsevier Ltd. All rights reserved.<br />

doi:10.1016/j.advwatres.2005.11.013<br />

Any model is an abstraction, simplification and interpretation<br />

of reality. The incompleteness of a model<br />

structure and the mismatch between the real causal<br />

structure of a system and the assumed causal structure<br />

as represented in a model always result in uncertainty<br />

about model predictions. The importance of the model<br />

structure for predictions is well recognised, even for situations<br />

where predictions are made on output variables,<br />

such as discharge, for which field data are available<br />

[16,8]. The considerable challenge faced in many applications<br />

of environmental models is that predictions are<br />

required beyond the range of available observations,<br />

either in time or in space, e.g. to make extrapolations<br />

towards unobservable futures [2] or to make predictions<br />

for natural systems, such as ecosystems, that are likely<br />

to undergo structural changes [4]. In such cases, uncertainty<br />

in model structure is recognised by many authors<br />

to be the main source of uncertainty in model predictions<br />

[44,13,31,28].<br />

1.2. An example – five alternative conceptual models<br />

The problem is illustrated for a study conducted by<br />

the County of Copenhagen in 2000 involving a real<br />

water management decision [11,37]. The County of<br />

Copenhagen is the authority responsible for water<br />

resources management in the county where the city of<br />

Copenhagen abstracts groundwater for most of its water<br />

supply. According to a new Water Supply Act the<br />

county had to prepare an action plan for protection of<br />

groundwater against pollution. As a first step, the<br />

county asked five groups of Danish consulting firms to<br />

conduct studies of the aquifer’s vulnerability towards<br />

pollution in a 175 km² area west of Copenhagen, where<br />

the groundwater abstraction amounts to about 12 million<br />

m³/year. The key question to be answered was:<br />

which parts of this particular area are most vulnerable<br />

to pollution and need to be protected? The five consultants<br />

were among the most well reputed consulting firms<br />

in Denmark, and they were known to have different<br />

views and preferences on which methodologies are most<br />

suitable for assessing vulnerability. As the task was one<br />

of the first consultancy studies on a new major market<br />

for preparation of groundwater protection plans it was<br />

considered a prestigious job to which the consultants<br />

generally allocated some of their most qualified<br />

professionals.<br />

The five consultants used significantly different<br />

approaches. One consultant based his approach on<br />

annual fluctuations of piezometric heads assuming that<br />

larger fluctuations represent greater interaction between<br />

aquifer and surface water systems and hence a larger<br />

vulnerability. Several consultants used the DRASTIC<br />

multi-criteria method [1], but modified it in different<br />

ways by changing weights and adding new, mainly geochemically<br />

oriented, criteria. One consultant based his<br />

approach on advanced hydrological modelling of both<br />

groundwater and surface water systems using the MIKE<br />

SHE code [40], while two other consultants used simpler<br />

groundwater modelling approaches. Thus, the five consultants<br />

had different perceptions of what causes<br />

groundwater pollution and used models with different<br />

processes and causal relationships to describe the possibility<br />

of groundwater pollution in the area. In addition,<br />

their different interpretations and interpolations made<br />

from common field data resulted in significantly different<br />

figures for e.g. areal means of precipitation and<br />

evapotranspiration and the thickness of various geological<br />

layers [37].<br />

The conclusions of the five consultants regarding vulnerability<br />

to nitrate pollution are shown in Fig. 1. It is<br />

apparent that the five estimates differ substantially from<br />

each other. In the present case, no data exist to validate<br />

the model predictions, because the five models were used<br />

to make extrapolations. Thus, it is not possible, from<br />

existing field data, to tell which of the five model estimates<br />

are more reliable. The differences in prediction<br />

originate from two main sources: (i) data and parameter<br />

uncertainty and (ii) conceptual uncertainty. Although<br />

the data and parameter uncertainties were not explicitly<br />

assessed by any of the consultants (as is common in such<br />

studies), the substantial differences in model structures<br />

and the fact that the consultants all used the same raw<br />

data point to structural uncertainty as the main cause<br />

of difference between the five model results and as a<br />

major source of uncertainty in model predictions.<br />

Fig. 1. Model predictions on aquifer vulnerability towards nitrate<br />

pollution for a 175 km² area west of Copenhagen [11].



Usually a water manager bases their decisions on the<br />

conclusions from only one study. The uniqueness of the<br />

present study was that five consultants were asked to<br />

answer the same question on the basis of the same data.<br />

In this respect the differences between the five estimates<br />

are striking and clearly do not provide a sound basis for<br />

deciding anything about which areas should be protected.<br />

A worrying question, which is left unanswered,<br />

is whether the basis for decisions is similarly poor in<br />

the many other cases where only a single conceptual<br />

model has been adopted and where millions of DKK<br />

have subsequently been used to prepare and implement<br />

action plans.<br />

1.3. Objective and outline of paper<br />

The objective of this paper is to review possible strategies<br />

for dealing with model structure errors and to outline<br />

a framework for handling the effects of model<br />

structure errors on predictive uncertainty, with particular<br />

emphasis on situations where model predictions represent<br />

extrapolations to situations not covered by<br />

calibration data and are often outside the domain on<br />

which our knowledge on the dynamics of the system<br />

and our understanding of its causal relationships is<br />

based.<br />

The paper is organised so that reviews of existing<br />

strategies and the discussion of their potentials and limitations<br />

are given in Section 2. A new framework is presented<br />

in Section 3 for analysing the uncertainties due to<br />

model structure errors when models are used for making<br />

extrapolations beyond their calibration base. Finally,<br />

the problems and perspectives of the new framework<br />

are discussed in Section 4. The terminology used is<br />

defined in the Appendix.<br />

2. Review of possible strategies<br />

2.1. Classification<br />

The existing strategies for assessing uncertainty due<br />

to incomplete or inadequate model structure may be<br />

grouped into the categories shown in Fig. 2. The most<br />

important distinction is whether data exist that make<br />

it possible to make inferences on the model structure<br />

uncertainty directly. This requires that data are available<br />

for the output variable of predictive interest and for conditions<br />

similar to those in the predictive situation. In<br />

other words it is a distinction between whether the<br />

model predictions can be considered as interpolations<br />

or extrapolations relative to the calibration situation.<br />

The two main categories are thus equivalent to different<br />

situations with respect to model validation tests.<br />

According to Klemeš’ classical hierarchical test scheme<br />

[26,38], the interpolation case corresponds to situations<br />

where the traditional split-sample test is suitable, while<br />

the extrapolation case corresponds to situations where<br />

no data exist for the concerned output variable<br />

(proxy-basin test) or where the basin characteristics<br />

are considered non-stationary, e.g. for predictions of<br />

effects of climate change or effects of land use change<br />

(differential split-sample test).<br />
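The mapping described above between data availability and the applicable validation test can be written as a small decision function. This is a sketch that follows only the three cases named in the text; the function name, its arguments and the returned labels are illustrative assumptions, and the combined case (no target data and non-stationary conditions) is simply treated here as a proxy-basin situation.

```python
def select_validation_test(target_data_available, conditions_stationary):
    """Return (case, test) for a prediction task, following the text above.

    target_data_available: field data exist for the output variable of interest.
    conditions_stationary: the predictive conditions resemble those of the
        calibration period (no climate or land use change, for example).
    """
    if not target_data_available:
        # No data for the output variable concerned.
        return "extrapolation", "proxy-basin test"
    if not conditions_stationary:
        # Data exist, but basin characteristics are expected to change.
        return "extrapolation", "differential split-sample test"
    # Data exist for similar conditions.
    return "interpolation", "split-sample test"


for available, stationary in [(True, True), (True, False), (False, True)]:
    case, test = select_validation_test(available, stationary)
    print(f"data={available}, stationary={stationary}: {case}, {test}")
```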

In the review of existing strategies given below examples<br />

of studies have been selected to illustrate the classification<br />

and the common approaches. It is not an<br />

exhaustive review, but illustrates the range of<br />

approaches available to diagnose structural uncertainty<br />

in models.<br />

Fig. 2. Classification of existing strategies for assessing conceptual model uncertainty.<br />

[Diagram boxes: Availability of data for model validation test. Target data exist (interpolation): increase parameter uncertainty; estimate structural term. No direct data (extrapolation): multiple conceptual models; expert elicitation; pedigree analysis. The extrapolation branch comprises intermediate data (differential split-sample case) and no data at all (proxy basin case).]<br />

2.2. Data exist – interpolation<br />

In this situation, calibration is usually carried out<br />

against a sample of the existing field data to ensure some<br />

kind of optimal parameter values, and then the model<br />

predictions are compared with the remaining (‘independent’)<br />

field data. The deviations between model predictions<br />

and independent field observations can be used<br />

to infer the model’s conceptual error. Different methodologies<br />

can be used in this respect.<br />

2.2.1. Increasing parameter uncertainty to account<br />

for structural uncertainty<br />

One strategy is to increase the parameter uncertainty<br />

to a level where it is assumed to compensate for omitting<br />

model structure error from the analysis. Van Griensven<br />

and Meixner [45] provide an example of this. They<br />

assess the total predictive uncertainty without identifying<br />

or quantifying the underlying sources of uncertainty.<br />

They use the split-sample approach assessing ranges of<br />

predictive uncertainty from analyses of predictions and<br />

data for a period different from the calibration period.<br />

Their total predictive uncertainty is assessed by increasing<br />

the model parameter uncertainty beyond the magnitudes<br />

estimated during calibration to a level where the<br />

resulting predictive uncertainty intervals bracket the<br />

observations. This technique does not introduce a separate<br />

stochastic term for the structural uncertainty, but<br />

represents the structural term in the parameter term.<br />

The model structure error is likely to influence the model<br />

simulations in non-random and temporally varying<br />

ways. Because this approach compensates for model structure error by<br />

increasing the variance of a temporally constant random<br />

variable, its results can be questioned,<br />

particularly if used for predictions in situations<br />

where split-sample tests are not made.<br />
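The inflation idea can be sketched in a few lines. This is not the procedure of Van Griensven and Meixner [45]. It is a toy example that assumes a one-parameter model y = a·x, a normally distributed parameter, and an analytical prediction band, with invented calibration results and validation data. The names `coverage` and `inflate_until_bracketed` are hypothetical.

```python
def coverage(inflation, a_hat, a_sd, xs, ys, z=1.96):
    """Fraction of observations inside the inflated prediction band.

    Assumed model: y = a * x with a ~ Normal(a_hat, (inflation * a_sd)**2),
    so the band at x is a_hat * x +/- z * inflation * a_sd * |x|.
    """
    inside = sum(
        abs(y - a_hat * x) <= z * inflation * a_sd * abs(x)
        for x, y in zip(xs, ys)
    )
    return inside / len(xs)


def inflate_until_bracketed(a_hat, a_sd, xs, ys, target=1.0, step=0.5, max_inflation=100.0):
    """Raise the inflation factor in fixed steps until coverage >= target."""
    inflation = 1.0
    while coverage(inflation, a_hat, a_sd, xs, ys) < target:
        inflation += step
        if inflation > max_inflation:
            raise ValueError("observations cannot be bracketed within max_inflation")
    return inflation


# Hypothetical calibration result and validation-period observations.
a_hat, a_sd = 2.0, 0.05
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.3, 3.7, 6.5, 7.6, 10.6]

factor = inflate_until_bracketed(a_hat, a_sd, xs, ys)
print("coverage with calibrated parameter sd:", coverage(1.0, a_hat, a_sd, xs, ys))
print("inflation factor needed to bracket all observations:", factor)
```

The sketch also shows the weakness noted in the text: a single inflation factor widens the band everywhere, whether or not the structural error is actually present at a given time or condition.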

2.2.2. Estimation of the structural uncertainty term<br />

Other strategies attempt to estimate the structural<br />

contribution to uncertainty in the model predictions.<br />

An example of such an approach is given by Radwan<br />

et al. [35], who estimate the total predictive uncertainty<br />

from a statistical analysis of the residuals between model<br />

predictions and observations. Further, they analyse the<br />

propagated uncertainties from model input and parameter<br />

values. By subtracting these two uncertainties from<br />

the total predictive uncertainty they assign the remaining<br />

predictive uncertainty to be an effect of model structure<br />

uncertainty. It is then possible to add the model<br />

structure uncertainty when making other predictions.<br />

This approach assumes that the uncertainties from different<br />

sources are additive. This assumption is questionable,<br />

because the combination of uncertainties is often<br />

non-linear due to interactions, correlations and dependencies<br />

between variables in a model. It also assumes<br />

that the differences in predictions and observations are<br />

caused by structural error and not by the poor specification<br />

of input and parameter uncertainty, nor by errors in<br />

the observations.<br />
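The subtraction step described above can be illustrated with a minimal sketch. It adopts the same additive assumption that the text questions, namely that variances from independent sources can simply be summed. The function name and all numbers are hypothetical and are not taken from Radwan et al. [35].

```python
from statistics import pvariance


def structural_variance(residuals, input_variance, parameter_variance):
    """Residual variance left after subtracting propagated input and parameter variance.

    Assumes the three sources are independent and additive, which is the
    questionable assumption discussed in the text.
    """
    total_variance = pvariance(residuals)
    remainder = total_variance - input_variance - parameter_variance
    # A negative remainder signals that the additive assumption (or the
    # specified input/parameter uncertainty) is inconsistent with the data.
    return total_variance, max(remainder, 0.0)


# Hypothetical residuals (observed minus simulated discharge, m3/s)
residuals = [0.4, -0.6, 0.9, -0.2, -0.8, 0.5, 0.3, -0.5]
total, structural = structural_variance(
    residuals, input_variance=0.10, parameter_variance=0.08
)
print(f"total variance = {total:.3f}, structural remainder = {structural:.3f}")
```

If the propagated input and parameter variances are overstated, the remainder is clipped at zero, which mirrors the point that a poor specification of those terms distorts the structural estimate.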

Vrugt et al. [53] present another stochastic approach<br />

based on a simultaneous parameter optimisation and<br />

data assimilation with an ensemble Kalman filter. By<br />

specifying values for measurement error and a so-called<br />

‘stochastic forcing term’, representing structural uncertainty,<br />

they are able to estimate the dynamic behaviour<br />

of the model structure uncertainty. Both techniques<br />

assume a smooth contribution from structural uncertainty,<br />

but an important advantage of the latter is that<br />

parameter innovations (an output from the Kalman filter)<br />

may be used to diagnose non-stationarity in system<br />

structure.<br />

2.3. No direct data – extrapolation<br />

In cases where model structure errors cannot be<br />

assessed directly due to a lack of relevant data, the main<br />

strategy is to do the extrapolation with multiple conceptual<br />

models. Two supporting methods can be used here<br />

for the generation and qualification of each of the alternative<br />

models: expert elicitation and pedigree analysis<br />

(Fig. 2).<br />

2.3.1. Multiple conceptual models<br />

In the scenario approach a number of alternative<br />

conceptual models are considered. For each of these,<br />

the model input and parameter uncertainties may be<br />

analysed and the differences between model predictions<br />

are then seen as a measure of the model structure uncertainty.<br />

The idea of using alternative or competing candidate<br />

model structures was introduced in water quality<br />

modelling some time ago [5]. The issue typically dealt<br />

with here is whether models developed for current conditions<br />

can yield correct predictions when used under<br />

changed control. Van Straten and Keesman [50] note<br />

in this respect that good performance at the calibration<br />

stage does not guarantee correctly predicted behaviour,<br />

due to non-stationarity of the underlying processes in<br />

space or time.<br />

The multiple modelling approach has also been used<br />

in flood forecasting. For example, Butts et al. [8] use 10<br />

different model structures to evaluate structural uncertainty<br />

in flood predictions. They conclude that exploring<br />

an ensemble of model structures provides a useful<br />

approach in assessing simulation uncertainty.<br />
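As a minimal sketch of how an ensemble of model structures can be summarised, the following computes the envelope of predictions across structures at each time step. The model names and forecast values are hypothetical and are not results from Butts et al. [8]; the spread is only a measure conditional on the structures included.

```python
from statistics import mean


def ensemble_envelope(predictions_by_model):
    """Per-time-step minimum, mean and maximum across alternative model structures."""
    series = list(predictions_by_model.values())
    n_steps = len(series[0])
    if any(len(s) != n_steps for s in series):
        raise ValueError("all models must predict the same time steps")
    envelope = []
    for step in zip(*series):
        envelope.append((min(step), mean(step), max(step)))
    return envelope


# Hypothetical flow forecasts (m3/s) from three model structures
forecasts = {
    "lumped": [12.0, 30.0, 18.0],
    "semi_distributed": [10.0, 36.0, 20.0],
    "distributed": [11.0, 27.0, 16.0],
}
for low, centre, high in ensemble_envelope(forecasts):
    print(f"min={low:.1f} mean={centre:.1f} max={high:.1f} spread={high - low:.1f}")
```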

In groundwater modelling different conceptual models<br />

are typically based on different geological interpretations<br />

[18,43,42,30,34]. Højberg and Refsgaard [21]<br />

present an example using three different conceptual


1590 J.C. Refsgaard et al. / Advances in Water Resources 29 (2006) 1586–1597<br />

models, based on three alternative geological interpretations<br />

for a multi-aquifer system in Denmark. Each of<br />

the models was calibrated against piezometric head data<br />

using an inverse technique. The three models provided<br />

equally good and very similar predictions of groundwater<br />

heads, including well field capture zones. However,<br />

when using the models to extrapolate beyond the calibration<br />

data to predictions of flow pathways and travel<br />

times the three models differed dramatically. When<br />

assessing the uncertainty contributed by the model<br />

parameter values, the overlap of uncertainty ranges<br />

between the three models significantly decreased when<br />

moving from groundwater heads to capture zones and<br />

travel times. They conclude that the larger the degree<br />

of extrapolation, the more the underlying conceptual<br />

model dominates over the parameter uncertainty and<br />

the effect of calibration.<br />
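One simple way to express the overlap of uncertainty ranges between models is the shared part of the ranges relative to their combined span. The sketch below is an assumption made for illustration, not the measure used by Højberg and Refsgaard [21], and the intervals are hypothetical.

```python
def range_overlap(intervals):
    """Shared part of several uncertainty ranges as a fraction of their combined span."""
    lows, highs = zip(*intervals)
    shared = max(0.0, min(highs) - max(lows))
    span = max(highs) - min(lows)
    return shared / span if span > 0 else 1.0


# Hypothetical 95% ranges from three conceptual models
head_ranges = [(20.1, 20.9), (20.2, 21.0), (20.0, 20.8)]            # groundwater head (m)
travel_time_ranges = [(5.0, 12.0), (30.0, 60.0), (90.0, 200.0)]     # travel time (years)

print(f"heads: {range_overlap(head_ranges):.2f}")
print(f"travel times: {range_overlap(travel_time_ranges):.2f}")
```

In this constructed example the ranges largely coincide for heads and do not overlap at all for travel times, which is the qualitative pattern described in the text.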

The strategy of applying several alternative models<br />

based on codes with different model structures is also<br />

common in climate change modelling. In its description<br />

of uncertainty related to model predictions of both present<br />

and future climates the Intergovernmental Panel on<br />

Climate Change (IPCC) [22] bases its evaluation on scenarios<br />

of many (up to 35) different models. The same<br />

strategy is followed in the dialogue model [52]. Dialogue<br />

is a so-called integrated assessment model (IAM) of climate<br />

change. It has been developed as an interactive<br />

decision-support tool for energy supply policy making.<br />

Dialogue simulates the cause-effect chain of climate<br />

change, using mono-disciplinary sub-models for each<br />

step in the chain. The chain starts with scenarios for economic<br />

growth, energy demand, fuel mix etc., leading to<br />

emissions of greenhouse gases, leading to changes in<br />

atmospheric composition, leading to radiative forcing<br />

of the climate, leading to climate change, leading to<br />

impacts of climate change on societies and ecosystems.<br />

Rather than selecting one mono-disciplinary sub-model<br />

for each step, as most other climate IAMs do, dialogue<br />

uses multiple models for each step (for instance, three<br />

different carbon cycle models, simplified versions of five<br />

different global climate model outcomes, etc.), representing<br />

the major part of the spectrum of expert opinion<br />

in each discipline.<br />

2.3.2. Expert elicitation<br />

Expert elicitation can be used as a supporting method<br />

in uncertainty analysis. It is a structured process to elicit<br />

subjective judgements and ideas from experts. It is<br />

widely used in uncertainty assessment to quantify uncertainties<br />

in cases where there are no or too few direct<br />

empirical data available to infer uncertainty. Usually<br />

the subjective judgement is represented as a probability<br />

density function reflecting the experts’ degree of belief.<br />

Expert elicitation aims to specify uncertainties in a structured<br />

and documented way, ensuring the account is both<br />

credible and traceable to its assumptions. Typically it is<br />

applied in situations where there is scarce or insufficient<br />

empirical material for a direct quantification of uncertainty<br />

[20]. An example with use of expert elicitation<br />

to estimate probabilities of alternative conceptual models<br />

is given by Meyer et al. [29]. They assessed probabilities<br />

as subjective values, from expert elicitation,<br />

reflecting a belief about the relative plausibility of each<br />

model based on its apparent consistency with available<br />

knowledge and data.<br />

Expert elicitation can also be used to generate ideas<br />

about alternative causal structures (conceptual models)<br />

that govern the behaviour of a system. Techniques used<br />

in decision analysis include group model building [51]<br />

and the hexagon method [19] but these techniques usually<br />

aim to achieve consensus. From the point of view<br />

of model structure uncertainty, these elicitation techniques<br />

could perhaps be used to generate alternative<br />

conceptual models.<br />

2.3.3. Pedigree analysis<br />

Another supporting method is pedigree analysis. The<br />

idea comes from Funtowicz and Ravetz [17], who note<br />

that statistical uncertainty in terms of inexactness does<br />

not cover all relevant dimensions of uncertainty, including<br />

the methodological and epistemological dimensions.<br />

To promote a more differentiated insight into uncertainty<br />

they propose to extend good scientific practice with five<br />

qualifiers for quantitative scientific information: numeral,<br />

unit, spread, assessment, and pedigree (NUSAP). By<br />

adding expert judgement of reliability (assessment) and<br />

systematic multi-criteria evaluation of the processes by<br />

which numbers have been produced (pedigree), NUSAP<br />

has extended the statistical approach to uncertainty (inexactness)<br />

with the methodological (unreliability) and epistemological<br />

(ignorance) dimensions. By providing a<br />

separate qualification for each dimension of uncertainty,<br />

it enables flexibility in their expression.<br />

Each special sort of information has its own aspects<br />

that are key to its pedigree, so different pedigree matrices<br />

using different pedigree criteria can be used to qualify<br />

different sorts of information. Early applications of pedigree<br />

analysis of environmental models have focussed on<br />

parameter pedigree, using proxy representation, empirical<br />

basis, methodological rigor, theoretical understanding<br />

and validation as pedigree criteria. Later on,<br />

pedigree analysis has been extended to assessment of<br />

model assumptions and problem framing [49,12].<br />

2.4. Discussion of strengths/weaknesses and potentials/<br />

limitations<br />

The strategies used in ‘interpolation’, i.e. for situations<br />

that are similar to the calibration situation with<br />

respect to variables of interest and conditions of the natural<br />

system, have the advantage that they can be based<br />

directly on field data. A fundamental weakness is that<br />

field data are themselves uncertain. Nevertheless, in<br />

many cases, they can be expected to provide relatively<br />

accurate estimates of, at least, the total predictive uncertainty<br />

for the specific measured variable and for the<br />

same conditions as those in the calibration and validation<br />

situation. Some of the methods cannot differentiate<br />

how the total predictive uncertainty originates from<br />

model input, model parameter and model structure<br />

uncertainty. Other methods attempt to do so. However,<br />

this distinction is, as recognised by many authors, e.g.<br />

Vrugt et al. [53], problematic. In the case of uncalibrated<br />

models, the parameter uncertainty is very difficult to<br />

assess quantitatively, and wrong estimates of model<br />

parameter uncertainty will influence the estimates of<br />

model structure uncertainty. In the case of calibrated<br />

models, estimates of model parameter uncertainty can<br />

often be derived from autocalibration routines. An inadequate<br />

model structure will, however, be compensated<br />

by biased parameter values to optimise the model fit<br />

with field data during calibration. Hence, the uncertainty<br />

due to model structure will be underestimated in<br />

this case.<br />

A more serious limitation of the strategies depending<br />

on observed data is that they are only applicable for situations<br />

where the output variables of interest are measured<br />

(e.g. [35,45,53]). While relevant field data are<br />

often available for variables such as water levels and<br />

water flows, this is usually not the case for concentrations,<br />

or when predictions are desired for scenarios<br />

involving catchment change, such as land use change<br />

or climate change. Another serious limitation stems<br />

from an assumption that the underlying system does<br />

not undergo structural changes, such as changes in ecosystem<br />

processes due to climate change.<br />

The strategy that uses multiple conceptual models<br />

benefits from an explicit analysis of the effects of alternative<br />

model structures. Furthermore, it makes it possible<br />

to include expert knowledge on plausible model structures.<br />

This strategy is strongly advocated by Neuman<br />

and Wierenga [31] and Poeter and Anderson [34]. They<br />

characterise the traditional approach of relying on a single<br />

conceptual model as one in which plausible conceptual<br />

models are rejected (in this case by omission). They<br />

conclude that the bias and uncertainty that result from<br />

reliance on an inadequate conceptual model are typically<br />

much larger than those introduced through an<br />

inadequate choice of model parameter values.<br />

This view is consistent with Beven [7] who outlines a<br />

new philosophy for modelling of environmental systems.<br />

The basic aim of his approach is to extend traditional<br />

schemes with a more realistic account of uncertainty,<br />

rejecting the idea that a single optimal model exists for<br />

any given case. Instead, environmental models may be<br />

non-unique in their accuracy of both reproduction of<br />

observations and prediction (i.e. unidentifiable or equifinal),<br />

and subject to only a conditional confirmation, due<br />

to e.g. errors in model structure, calibration of parameters<br />

and period of data used for evaluation. A weakness<br />

of the multiple modelling strategy is the absence of<br />

quantitative information about the extent to which each<br />

model is plausible. Furthermore, it may be difficult to<br />

sample from the full range of plausible conceptual models.<br />

In this respect, the expert knowledge on which the formulations<br />

of multiple conceptual models are based is<br />

an important and unavoidable subjective element. The<br />

level of subjectivity can be reduced if the scenarios are<br />

generated in a formalised and reproducible manner.<br />

For example, this is possible with the TPROGS procedure<br />

[9,10], by which alternative geological models can<br />

be generated stochastically. The subjectivity does not<br />

disappear with this approach. Rather, it is transferred<br />

from formulation of the geological model itself to<br />

assumptions on probability functions and correlation<br />

structures of the various geological units that are more<br />

easily constrained in practice.<br />

The strategy of expert elicitation has the advantage<br />

that subjective expert knowledge can be included in<br />

the evaluation. It has the potential to make use of all<br />

available knowledge including knowledge that cannot<br />

be easily formalised otherwise. It can include views of<br />

sceptics, and reveals the level of expert disagreement<br />

on certain estimates. Expert elicitation also has several<br />

limitations. The fraction of experts holding a given view<br />

is not proportional to the probability of that view being<br />

correct. One may safely average estimates of model<br />

parameters, but if the experts’ models are incommensurate,<br />

one cannot average models [25]. If differences<br />

in expert opinion are irresolvable, weighting and combining<br />

the individual estimates of distributions is impossible.<br />

In practice, the opinions are often weighted<br />

equally, although sometimes self-rating is used to obtain<br />

a weight factor for the experts’ competence. Finally, the<br />

results of expert elicitation tend to be sensitive to the<br />

selection of the experts whose estimates are gathered.<br />
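The difference between equal weighting and self-rated weighting can be shown with a minimal sketch of a weighted average of elicited values. The elicited probabilities and self-ratings below are hypothetical, and the pooling rule is one common choice assumed here for illustration. As noted above, such averaging is only meaningful when the experts' estimates refer to commensurate quantities.

```python
def pool_estimates(estimates, weights=None):
    """Weighted average of expert estimates; equal weights if none are given."""
    if weights is None:
        weights = [1.0] * len(estimates)
    if len(weights) != len(estimates) or sum(weights) <= 0:
        raise ValueError("need one weight per expert and a positive weight sum")
    total = sum(weights)
    return sum(w * e for w, e in zip(weights, estimates)) / total


# Hypothetical elicited probabilities that a given conceptual model is adequate
elicited = [0.2, 0.5, 0.8]
self_rating = [1, 2, 5]  # hypothetical self-rated competence scores

equal = pool_estimates(elicited)
weighted = pool_estimates(elicited, self_rating)
disagreement = max(elicited) - min(elicited)
print(f"equal weights: {equal:.2f}, self-rated weights: {weighted:.2f}, "
      f"range of expert views: {disagreement:.2f}")
```

Reporting the range of expert views alongside the pooled value keeps the level of disagreement visible rather than hiding it in a single number.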

In a review of four different case studies in which pedigree<br />

analysis was applied, Van der Sluijs et al. [49] show<br />

that pedigree analysis broadens the scope of uncertainty<br />

assessment and stimulates scrutiny of underlying methods<br />

and assumptions. Craye et al. [12] reported similar<br />

experiences. Pedigree analysis facilitates structured, creative thinking<br />

on conceivable sources of error and fosters an enhanced<br />

appreciation of the issue of quality in information. It<br />

thereby enables a more effective criticism of quantitative<br />

information by providers, clients, and also users of all<br />

sorts, expert and lay. It provides differentiated insight<br />

into what the weakest parts of a given knowledge base<br />

are. It is flexible in its use and can be used on different<br />

levels of comprehensiveness: from a ‘back of the envelope’<br />

sketch based on self-elicitation to a comprehensive<br />

and sophisticated procedure involving structured informed<br />

in-depth group discussions, covering each pedigree<br />

criterion. The scoring of pedigree criteria is to a certain<br />

degree subjective. Subjectivity can partly be remedied by<br />

the design of unambiguous pedigree matrices and by<br />

involving multiple experts in the scoring. The choice of<br />

experts to do the scoring is also a potential source of<br />

bias. The method is relatively new, with a limited (but<br />

growing) number of practitioners. There is as yet no settled<br />

guideline for good practice. We must keep in mind<br />

that it is not a panacea for the problem of unquantifiable<br />

uncertainty.<br />

3. New framework<br />

We propose that conceptual uncertainty can be<br />

assessed by adopting a protocol based on the six elements<br />

shown in Fig. 3. The central aim is to establish<br />

a number of plausible conceptual models, with a range<br />

that adequately samples the space of possible conceptual<br />

models, to evaluate the tenability of each conceptual<br />

model and the overall range of models selected in relation<br />

to the perceived uncertainty on model structure<br />

and to propagate the uncertainties in each case.<br />

[Fig. 3. Protocol for assessing conceptual model uncertainty. Flowchart elements, in order: Formulate a conceptual model; Set up and calibrate model; Sufficient conceptual models; Perform validation tests and accept/reject models; Evaluate tenability and completeness of conceptual models; Make model predictions and assess uncertainty.]<br />

STEP 1: Formulate a conceptual model. A conceptual<br />

model is established. Since we have defined a conceptual<br />

model as a combination of our qualitative process<br />

understanding and the simplifications acceptable for a<br />

particular modelling study, a conceptual model becomes<br />

highly site-specific and even case-specific. For example, a<br />

conceptual model of an aquifer may be described as<br />

two-dimensional for a study focussing on regional<br />

groundwater heads, while it may need to include three-dimensional<br />

geological structures for detailed simulation<br />

of contaminant transport. Formulating a new conceptual<br />

model may involve changing or refining the model<br />

structure, e.g. by modifying the hydrogeological interpretations<br />

(in the case of groundwater models), dimensionality,<br />

temporal and spatial resolution, initial and<br />

boundary conditions and process descriptions (governing<br />

equations).<br />

STEP 2: Set up and calibrate model. On the basis of<br />

the formulated conceptual model a site- and case-specific<br />

model is set up. Subsequently the model is calibrated<br />

and the model parameter uncertainty assessed.<br />

For the purposes of ‘interpolation’ (i.e. relevant observations<br />

are available), the parameter uncertainty can<br />

reasonably be constrained through calibration. However,<br />

for the case of ‘extrapolation’, the risk of calibrating<br />

model parameters for prediction of unobserved<br />

variables is that the model becomes biased for the unobserved<br />

variable.<br />

STEP 3: Sufficient conceptual models. The first two<br />

steps are repeated until sufficient conceptual models<br />

are included. This judgement will be influenced by the<br />

practical constraints on including additional models<br />

and the desire to include additional conceptual models<br />

that are substantially different from those already<br />

included.<br />

STEP 4: Perform validation tests (to the extent data<br />

availability allows). In order to evaluate how well the<br />

models describe the system in question, the performance<br />

of each model is tested by comparing<br />

model predictions with independent field data, i.e. data<br />

not used for calibration. This may be achieved by splitting<br />

the sample data into a calibration and validation<br />

set, or, alternatively, by cross-validation (e.g. bootstrapping:<br />

[15]) against ‘independent data’. The models whose<br />

predictive capability is deemed low are discarded and<br />

the reasons for these predictive failures are explored,<br />

where possible, for insight into the origins of structural<br />

uncertainty. In ‘extrapolation’ cases, data will usually<br />

not be available for validation tests and STEP 4 must<br />

be skipped. However, in some cases, it is possible to test<br />

‘intermediate’ model results. For example a groundwater<br />

model aimed at prediction of concentration values<br />

can often be tested against groundwater head and discharge<br />

data, or sparse concentration data may be available<br />

for parts of the study area.<br />
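A split-sample test of the kind described in STEP 4 can be sketched as follows. The error measure (root mean square error), the acceptance threshold and all data are assumptions chosen for illustration; the text does not prescribe a particular performance criterion.

```python
from math import sqrt


def rmse(observed, simulated):
    """Root mean square error between two equally long series."""
    return sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / len(observed))


def split_sample_test(observed, simulations, n_calibration, threshold):
    """Accept or reject each model from its error on data not used for calibration."""
    validation_obs = observed[n_calibration:]
    verdicts = {}
    for name, simulated in simulations.items():
        error = rmse(validation_obs, simulated[n_calibration:])
        verdicts[name] = (error, error <= threshold)
    return verdicts


# Hypothetical groundwater heads (m); the first three values were used for calibration
observed = [10.0, 10.4, 10.1, 9.8, 10.2, 10.6]
simulations = {
    "model_A": [10.1, 10.3, 10.0, 9.9, 10.1, 10.5],
    "model_B": [10.0, 10.4, 10.1, 10.6, 9.4, 11.4],  # perfect in calibration only
}
for name, (error, accepted) in split_sample_test(observed, simulations, 3, 0.3).items():
    print(f"{name}: validation RMSE = {error:.2f}, accepted = {accepted}")
```

In this constructed example model_B reproduces the calibration period exactly but fails on the independent data, which is the situation a validation test is meant to expose.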

STEP 5: Evaluate tenability and completeness of conceptual<br />

models. The aim of this step is to analyse the<br />

retained models with respect to their predictive bias<br />

and uncertainty. This has two elements: (i) to evaluate<br />

the tenability of each conceptual model; and (ii) as far<br />

as possible, to evaluate the extent to which the retained<br />

models represent the space of plausible conceptual models.<br />

Table 1<br />

Pedigree matrix for evaluating the tenability of a conceptual model<br />

| Score | Supporting empirical evidence: Proxy | Supporting empirical evidence: Quality and quantity | Theoretical understanding | Representation of understood underlying mechanisms | Plausibility | Colleague consensus |
| 4 | Exact measures of the modelled quantities | Controlled experiments and large sample direct measurements | Well-established theory | Model equations reflect high mechanistic process detail | Highly plausible | All but cranks |
| 3 | Good fits or measures of the modelled quantities | Historical/field data, uncontrolled experiments, small sample direct measurements | Accepted theory with partial nature (in view of the phenomenon it describes) | Model equations reflect acceptable mechanistic process detail | Reasonably plausible | All but rebels |
| 2 | Well correlated but not measuring the same thing | Modelled/derived data, indirect measurements | Accepted theory with partial nature and limited consensus on reliability | Aggregated parameterised meta model | Somewhat plausible | Competing schools |
| 1 | Weak correlation but commonalities in measure | Educated guesses, indirect approx., rule of thumb estimate | Preliminary theory | Grey box model | Not very plausible | Embryonic field |
| 0 | Not correlated and not clearly related | Crude speculation | Crude speculation | Black box model | Not at all plausible | No opinion |

The tenability of the conceptual models is evaluated<br />

through expert reviews. First, the strength of the tenability<br />

of each conceptual model is evaluated by using the<br />

pedigree matrix in Table 1. A structured procedure for<br />

the elicitation of pedigree scores is given by Van der Sluijs<br />

et al. [47]. Note that there is no need to arrive at a<br />

consensus pedigree score for each criterion: if experts<br />

disagree on the pedigree scores for a given model, this<br />

reflects further epistemological uncertainty surrounding<br />

that model. Next, the adequacy of the retained conceptual<br />

models to represent the range of plausible models is<br />

evaluated. This is an assessment of whether the space of<br />

the retained conceptual models is sufficient to encapsulate<br />

the relevant range of plausible conceptual models<br />

without becoming impractical. This has strong similarities<br />

to Dunn’s concept of context validation [14]. Context<br />

validity refers to the validity of inferences that we<br />

have estimated the proximal range of rival hypotheses.<br />

Context validation can be performed by a bottom-up<br />

process to elicit from experts rival hypotheses on causal<br />

relations governing the dynamics of a system. One could<br />

argue that an infinite number of conceivable models<br />

might exist. However, it has been shown in projects<br />

where such elicitation processes were used, that the<br />

cumulative distribution of unique rival models flattens<br />

out after consultation of a limited number of experts,<br />

usually somewhere between 20 and 25 when chosen with<br />

diverse enough backgrounds [27].<br />
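The flattening of the cumulative number of unique rival models can be tracked with a simple count as each expert is consulted. The sketch below uses hypothetical model labels and assumes that rival models proposed by different experts can be recognised as identical.

```python
def cumulative_unique_models(models_per_expert):
    """Number of distinct rival models known after consulting each successive expert."""
    seen = set()
    counts = []
    for proposed in models_per_expert:
        seen.update(proposed)
        counts.append(len(seen))
    return counts


# Hypothetical elicitation: each set holds the rival models one expert proposed
consultations = [{"A", "B"}, {"B", "C"}, {"A", "C"}, {"C", "D"}, {"A", "D"}, {"B"}]
print(cumulative_unique_models(consultations))
```

A curve that stops rising over several consecutive consultations suggests, but does not prove, that the range of rival hypotheses has been approximated.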

STEP 6: Make model predictions and assess uncertainty.<br />

Together with model predictions of the desired<br />

variables, uncertainty assessments are carried out. This<br />

will typically include uncertainty in input data and<br />

parameter values in addition to the conceptual uncertainty.<br />

Furthermore, on the basis of the goodness of<br />

the conceptual models, evaluated in STEP 5, the goodness<br />

of the assessed predictive uncertainty associated<br />

with the model structure should be evaluated.<br />
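The control flow of the six steps can be summarised in a short sketch. Every function passed in is a user-supplied placeholder, and the toy stand-ins below (layer counts, error values, travel times) are invented solely to make the example runnable; they do not represent any real model.

```python
def run_protocol(candidates, calibrate, passes_validation, score_pedigree, predict):
    """Control flow of the six-step protocol; all callables are user-supplied placeholders."""
    retained = {}
    for name, concept in candidates.items():      # STEPs 1 and 3: one pass per conceptual model
        model = calibrate(concept)                 # STEP 2: set up and calibrate
        if not passes_validation(model):           # STEP 4: accept/reject where data allow
            continue
        retained[name] = {
            "pedigree": score_pedigree(concept),   # STEP 5: tenability from expert review
            "prediction": predict(model),          # STEP 6: prediction for this structure
        }
    values = [entry["prediction"] for entry in retained.values()]
    spread = (min(values), max(values)) if values else None
    return retained, spread


# Toy stand-ins, purely illustrative
candidates = {"one_layer": 1, "two_layer": 2, "three_layer": 3}


def calibrate(n_layers):
    return {"layers": n_layers, "validation_rmse": 0.9 / n_layers}


def passes_validation(model):
    return model["validation_rmse"] <= 0.5


def score_pedigree(n_layers):
    return {"plausibility": min(4, n_layers + 1)}


def predict(model):
    return 10.0 * model["layers"]  # hypothetical travel time in years


retained, spread = run_protocol(candidates, calibrate, passes_validation,
                                score_pedigree, predict)
print(sorted(retained), spread)
```

In an extrapolation case without validation data, `passes_validation` would simply return `True` for every model, corresponding to skipping STEP 4. The pedigree scores are carried alongside the predictions rather than merged into them.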

4. Discussion and conclusions<br />

4.1. Methodologies to assess conceptual uncertainty<br />

As discussed above, the existing strategies fall into<br />

two main categories, each with limitations. The strategies<br />

where model structure errors are assessed from<br />

observed data are confined to interpolation cases,<br />

understood as cases where the model can be calibrated<br />

and validated against field data for the variables of predictive<br />

interest and where the natural system does not<br />

undergo structural change. The strategies used for situations<br />

involving extrapolation depend either on multiple<br />

conceptual models (preferred) or on expert elicitation or<br />

pedigree analysis for a single conceptual model (usually<br />

less preferred).<br />

The novelty of our proposed framework is the combination<br />

of multiple conceptual models and the pedigree


1594 J.C. Refsgaard et al. / Advances in Water Resources 29 (2006) 1586–1597<br />

approach for assessing the overall tenability of these<br />

models in one formalised protocol. Some of our proposed<br />

steps are similar to other approaches for dealing<br />

with equifinality, multiple possible models and the<br />

rejection of non-behavioural models [6,31]. Other steps<br />

are based on qualitative approaches, including expert<br />

knowledge in a structured manner [20,49]. The aim of<br />

our new framework is not to identify the ‘‘true’’ model<br />

structure or the cause of the errors in the existing model<br />

structure. Instead, we propose an approach that integrates<br />

different types of knowledge, not previously combined,<br />

such as quantitative and qualitative uncertainty,<br />

to estimate the impact of model structure uncertainty<br />

on model predictions.<br />

The GLUE approach (generalised likelihood uncertainty<br />

estimation, [6,7]) also operates with a range of<br />

alternative models. Although almost all applications<br />

of GLUE reported so far operate with only one model<br />

structure and many alternative model parameter sets, it<br />

is possible to use GLUE with alternative model structures<br />

[24]. In addition to prescribing multiple conceptual<br />

models, an important difference between our<br />

proposed approach and GLUE is that we recommend that<br />

parameter optimisation be conducted as part of the calibration<br />

in order to take full advantage of the information<br />

in field data. There are different opinions about<br />

whether calibration by parameter optimisation is advisable<br />

or not. The main advantage of calibration is that it<br />

improves the ability of the model to reproduce hydrological<br />

behaviour of a system within the limits of<br />

observed behaviour [31]. An important by-product is<br />

that it provides useful information about the uncertainty<br />

of model parameters. The disadvantage is that<br />

parameter optimisation may result in biased parameter<br />

values to compensate for errors in model structure and<br />

that many parameter sets (i.e. many models) perform<br />

more or less equally well but provide different results.<br />

In implementing our framework, model calibration<br />

might be skipped and many models with different<br />

parameter sets retained, as in the GLUE approach.<br />

We do not advocate such an approach,<br />

partly for pragmatic reasons (very large computational<br />

requirements) and partly because we aim to focus<br />

on model structure uncertainty rather than parameter<br />

uncertainty.<br />
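The alternative of retaining many parameter sets can be sketched in the spirit of GLUE, although the example below is not the full procedure: it only samples parameter sets and keeps those above a performance threshold, and it omits the likelihood weighting of predictions. The toy linear model, the data, the use of the Nash-Sutcliffe efficiency and the threshold of 0.8 are all assumptions made for illustration.

```python
import random


def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def behavioural_sets(model, inputs, observed, sampler, n_samples, threshold, seed=0):
    """Monte Carlo sampling that keeps every parameter set scoring above the threshold."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n_samples):
        params = sampler(rng)
        score = nash_sutcliffe(observed, [model(x, params) for x in inputs])
        if score >= threshold:
            kept.append((params, score))
    return kept


def linear_model(rain, k):
    return k * rain  # toy runoff model with a single coefficient


def sample_k(rng):
    return rng.uniform(0.1, 1.0)


rainfall = [1.0, 2.0, 3.0, 4.0]
runoff = [0.5, 1.1, 1.4, 2.1]  # hypothetical observations
kept = behavioural_sets(linear_model, rainfall, runoff, sample_k, 200, 0.8)
coefficients = [k for k, _ in kept]
print(f"{len(kept)} of 200 sets retained, k from {min(coefficients):.2f} "
      f"to {max(coefficients):.2f}")
```

Even for this one-parameter toy model a range of coefficient values performs acceptably, which illustrates why many parameter sets may fit equally well yet give different predictions.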

Although intended for use in a very different context,<br />

the central aim behind our proposed protocol is similar<br />

to the approach of IPCC [22], who assign a level of confidence<br />

to their assessment of climate change by evaluating<br />

predictions from multiple models. The level of<br />

confidence placed in a particular finding reflects both<br />

the degree of consensus amongst modellers and the<br />

quantity of evidence that is available to support the finding.<br />

IPCC [22] classifies the confidence qualitatively in<br />

three levels: (i) ‘well established’, (ii) ‘evolving’ and (iii)<br />

‘speculative’.<br />

4.2. Critical issues for implementing the new protocol<br />

4.2.1. Performance criteria – threshold for accepting/<br />

rejecting models<br />

A critical issue in relation to acceptance/rejection of<br />

models (STEP 4 above) is how to define performance<br />

criteria. We agree with Beven [7] that any conceptual<br />

model is (known to be) wrong in an absolute sense,<br />

and hence that any model will be rejected if we investigate<br />

it in sufficient detail and specify very high performance<br />

criteria. On the other hand, the whole point in<br />

modelling is to simplify.<br />

A good reference for model performance is to compare<br />

it with uncertainties of the available field observations.<br />

If the model performance is within this<br />

uncertainty range we may characterise the model as<br />

good enough. However, usually it is less straightforward.<br />

For example, how wide should the confidence<br />

bands be before we reject models or accept them within<br />

observational uncertainties – ranges corresponding to<br />

65%, 95% or 99%? Indeed, the differences between<br />

95% and 99% may be significant in practical terms. Do<br />

we always then reject a model if it cannot perform<br />

within the observational uncertainty range? How reasonable<br />

are our estimates of uncertainty in observations?<br />

In many cases, even the results from less<br />

accurate models may be very useful.<br />
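The sensitivity of an accept/reject decision to the chosen confidence level can be shown with a minimal sketch. It assumes Gaussian observation errors with a known standard deviation, which is a simplification introduced here, and the data are hypothetical.

```python
from statistics import NormalDist


def within_band_fraction(observed, simulated, obs_std, coverage):
    """Share of simulated values inside the observational uncertainty band.

    Assumes Gaussian observation errors with a known standard deviation.
    """
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)  # two-sided band half-width in std units
    half_width = z * obs_std
    inside = sum(abs(o - s) <= half_width for o, s in zip(observed, simulated))
    return inside / len(observed)


# Hypothetical observed and simulated values with an observation std of 0.1
observed = [5.0, 6.0, 7.0, 8.0]
simulated = [5.1, 6.25, 6.5, 8.05]
for coverage in (0.65, 0.95, 0.99):
    share = within_band_fraction(observed, simulated, obs_std=0.1, coverage=coverage)
    print(f"{coverage:.0%} band: {share:.0%} of simulated values inside")
```

With the same simulation, the share of values judged acceptable changes with the band chosen, so the choice of level is itself part of the performance criterion.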

Another reference for what is acceptable accuracy is<br />

the use of a benchmark model as discussed by e.g. Seibert<br />

[41]. The difficulty is then transferred to selecting<br />

an appropriate benchmark.<br />

Our answer is that the decision on performance criteria<br />

must, in general, be taken in a socio-economic context,<br />

for which predictive uncertainties must be clearly<br />

explained and open to interpretation beyond small<br />

groups of scientists. Thus, we believe that the accuracy<br />

criteria cannot be decided universally by modellers or<br />

researchers, but must be different from case to case<br />

depending on the nature of a decision and the risks<br />

involved.<br />

4.2.2. Qualitative assessment of tenability of conceptual<br />

models<br />

Pedigree analysis structures the critical appraisal of<br />

alternative model structures and provides insight into the<br />

state of knowledge on which each of the conceivable<br />

model structures is based. However, it does not give<br />

an indication of the relative quality of the various model<br />

structures. With reference to Table 1, the pedigree analysis<br />

for a simple statistical model (A) and a complex<br />

mechanistic model (B) could, for example, result in<br />

statements like:<br />

• Model A is weakly correlated to the predicted variable<br />

(Proxy, score 1), based on a large sample of<br />

direct measurements (Quality and quantity, score


J.C. Refsgaard et al. / Advances in Water Resources 29 (2006) 1586–1597 1595<br />

4), built on a preliminary theory and a black box<br />

model (Theoretical understanding, score 1; Representation<br />

of mechanisms, score 1), somewhat plausible<br />

(Plausibility, score 2) and controversial among colleagues<br />

(Colleague consensus, score 2);<br />

• Model B exactly addresses the desired predictive variable<br />

(Proxy, score 4), is based on data with rule of<br />

thumb estimates (Quality and quantity, score 1), built<br />

on a well-established theory with model equations<br />

reflecting high process details (Theoretical understanding,<br />

score 4; Representation of mechanisms,<br />

score 4), reasonably plausible and accepted by all colleagues<br />

except rebels (Plausibility and Colleague consensus,<br />

score 3).<br />

Such statements cannot be integrated in a quantitative<br />

uncertainty analysis in terms of probabilities, but<br />

they should be available as the best possible scientifically<br />

based characterisation of uncertainties and as such be<br />

made available to those involved in the decision making<br />

process.<br />
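The two example statements can be recorded in a small structure and shown side by side, without collapsing them into a single number or a probability. This is only a sketch: the scores are copied from the statements above, the 0 to 4 range follows the pedigree scale given in the Appendix, and the names `PEDIGREE`, `check_scores` and `stronger_criteria` are hypothetical.

```python
CRITERIA = [
    "Proxy",
    "Quality and quantity",
    "Theoretical understanding",
    "Representation of mechanisms",
    "Plausibility",
    "Colleague consensus",
]

# Scores copied from the two example statements (scale 0 = weak, 4 = strong).
PEDIGREE = {
    "A": {
        "Proxy": 1,
        "Quality and quantity": 4,
        "Theoretical understanding": 1,
        "Representation of mechanisms": 1,
        "Plausibility": 2,
        "Colleague consensus": 2,
    },
    "B": {
        "Proxy": 4,
        "Quality and quantity": 1,
        "Theoretical understanding": 4,
        "Representation of mechanisms": 4,
        "Plausibility": 3,
        "Colleague consensus": 3,
    },
}


def check_scores(pedigree):
    """Raise ValueError if a criterion is missing or a score is off the scale."""
    for model, scores in pedigree.items():
        for criterion in CRITERIA:
            if criterion not in scores:
                raise ValueError(f"model {model}: missing criterion {criterion!r}")
            if not 0 <= scores[criterion] <= 4:
                raise ValueError(f"model {model}: score for {criterion!r} outside 0..4")


def stronger_criteria(pedigree, model, other):
    """Criteria on which `model` has a higher pedigree score than `other`."""
    return [c for c in CRITERIA if pedigree[model][c] > pedigree[other][c]]


check_scores(PEDIGREE)
print(f"{'Criterion':<30}{'A':>3}{'B':>3}")
for criterion in CRITERIA:
    print(f"{criterion:<30}{PEDIGREE['A'][criterion]:>3}{PEDIGREE['B'][criterion]:>3}")
print("A stronger on:", stronger_criteria(PEDIGREE, "A", "B"))
print("B stronger on:", stronger_criteria(PEDIGREE, "B", "A"))
```

The scores are deliberately kept per criterion rather than summed, in line with the point that such statements are a qualitative characterisation and not an input to a probabilistic analysis.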

Furthermore, as the selected conceptual models can<br />

never cover all possibilities, but instead cover a limited<br />

range, it is important to emphasise that the overall<br />

uncertainty of model predictions cannot be assessed in<br />

an absolute sense, only in a conditional or relative sense<br />

[7,31]. Our suggested method does not alter this fundamentally.<br />

However, we believe that the outcome of the<br />

proposed formalised review is a qualitative assessment<br />

that is more useful in a decision making context than<br />

unstructured information, or verbose information from<br />

scientific outlets that is not always available to the decision<br />

maker. The challenge is to design environmental<br />

management strategies that are robust against the uncertainties<br />

identified. Inclusion of a wider range of conceivable<br />

model structures may help to anticipate surprises<br />

that would have been overlooked otherwise.<br />

4.2.3. Different degrees of extrapolation<br />

Our proposed framework deals with situations where<br />

predictions involve extrapolations beyond available field<br />

data. However, there are different degrees of extrapolation<br />

(Fig. 2). If we look at the situation where a three-dimensional<br />

groundwater model is calibrated against<br />

groundwater head and discharge data, model predictions<br />

of groundwater recharge to a given layer are a smaller<br />

extrapolation than model predictions of groundwater<br />

age or contaminant concentration. In both situations,<br />

model predictions are carried out for variables that have<br />

not been used as calibration targets and for which no<br />

traditional split-sample validation tests are possible.<br />

The type of validation test recommended for such situations<br />

is a proxy-basin test, which according to the principles<br />

in Klemes [26] and Refsgaard [38], for instance,<br />

could imply that validation tests have to be conducted<br />

in two similar catchments where relevant data (e.g. concentrations)<br />

exist, and where such data are not used for<br />

calibration. The residuals in the other catchments can<br />

then be seen as a measure of the uncertainty to be<br />

expected in the catchment of interest.<br />
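The use of proxy-basin residuals as an uncertainty measure can be sketched as follows. The example assumes two hypothetical proxy catchments with made-up observed and simulated concentrations that were not used for calibration; the function name `residual_summary` and all values are illustrative only.

```python
from math import sqrt


def residual_summary(observed, simulated):
    """Return (mean error, root-mean-square error) of simulated minus observed."""
    residuals = [s - o for o, s in zip(observed, simulated)]
    n = len(residuals)
    mean_error = sum(residuals) / n
    rmse = sqrt(sum(r * r for r in residuals) / n)
    return mean_error, rmse


# Made-up concentrations (mg/l) in two proxy catchments: (observed, simulated).
# None of these observations are assumed to have been used for calibration.
proxy_catchments = {
    "proxy_1": ([20.0, 35.0, 50.0], [25.0, 30.0, 55.0]),
    "proxy_2": ([10.0, 40.0, 30.0], [15.0, 45.0, 25.0]),
}

pooled_obs, pooled_sim = [], []
for name, (obs, sim) in proxy_catchments.items():
    mean_error, rmse = residual_summary(obs, sim)
    print(f"{name}: mean error {mean_error:.2f} mg/l, RMSE {rmse:.2f} mg/l")
    pooled_obs.extend(obs)
    pooled_sim.extend(sim)

# Pooled residual statistics serve as a rough indication of the error to be
# expected in the catchment of interest, where no such data exist.
pooled_mean_error, pooled_rmse = residual_summary(pooled_obs, pooled_sim)
print(f"pooled: mean error {pooled_mean_error:.2f} mg/l, RMSE {pooled_rmse:.2f} mg/l")
```

How far the pooled statistics can be transferred depends on how similar the proxy catchments really are to the catchment of interest, which the sketch does not address.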

If model predictions are made for groundwater heads<br />

in cases involving groundwater abstraction, and the<br />

existing data available for calibration and validation<br />

tests do not include such abstraction, we also have an<br />

extrapolation case, although of a different nature. In this<br />

case we have data for the variable of predictive interest,<br />

but the catchment characteristics are non-stationary.<br />

This corresponds to the situation of model validation<br />

denoted by a differential split-sample test [26,38]. The<br />

differential split-sample test scheme recommended by<br />

Klemes also operates by tests on similar catchments<br />

where data for the type of non-stationary situation exist.<br />

Differential split-sample tests are often less demanding<br />

than proxy-basin tests [36]. A similar type of differential<br />

split-sample situation arises when predictions are<br />

required for a system in which structural change is<br />

expected (e.g. [50,4]).<br />

In cases where the conceptual models can be transferred<br />

to other catchments in a reliable and reproducible<br />

way, such proxy-basin and differential split-sample tests<br />

could be conducted and the results used to evaluate the<br />

goodness of the underlying conceptual models. It is<br />

worth noting that Klemes’ test schemes, which also<br />

apply for cases of extrapolation, operate with tests for<br />

two alternative catchments. This has clear similarities<br />

with our strategy of recommending the use of multiple<br />

conceptual models.<br />

4.3. Perspectives<br />

In many cases where environmental models are used<br />

to make predictions that are extrapolations beyond the<br />

calibration base, no suitable framework exists for assessing<br />

the effects of model structure error. The proposed<br />

framework is composed of elements originating from<br />

different scientific disciplines. The elements are well<br />

tested individually, but not previously applied in such<br />

an integrated manner for water resources or environmental<br />

modelling applications. The full framework still<br />

needs to be tested in real-life cases.<br />

Acknowledgement<br />

For the three authors from GEUS and UVA the present<br />

work was supported by the Project ‘Harmonised<br />

Techniques and Representative River Basin Data for<br />

Assessment and Use of Uncertainty Information in<br />

Integrated Water Management’ (www.harmonirib.com),<br />

which is partly funded by the EC Energy, Environment<br />

and Sustainable Development programme (Contract<br />

EVK1-2002-00109). The constructive comments of



Hoshin V. Gupta and two anonymous reviewers are<br />

acknowledged.<br />

Appendix. Terminology<br />

The terminology used is mainly based on Refsgaard<br />

and Henriksen [39]:<br />

Reality: The system that we aim to represent with the<br />

model, understood here as the study area.<br />

Conceptual model: A representation of ‘reality’ in<br />

terms of verbal descriptions, equations, governing relationships<br />

or ‘natural laws’ that purport to describe reality.<br />

This is the user’s perception of the key hydrological<br />

and ecological processes in the study area (perceptual<br />

model) and the corresponding simplifications and<br />

numerical accuracy limits that are assumed acceptable<br />

in order to achieve the purpose of the modelling. A conceptual<br />

model therefore includes a mathematical<br />

description (equations) of assumed processes and a<br />

description of the objects they interact with, including<br />

river system elements, ecological structures, geological<br />

features, etc. that are required for the particular purpose<br />

of modelling.<br />

Model code: A generic mathematical description of a<br />

conceptual model, implemented in a computer program.<br />

It is generic in the sense that, without program changes,<br />

it can be used to establish a model with the same basic<br />

type of equations (but allowing different input variables<br />

and parameter values) for a different study area.<br />

Model: A case-specific tailored version of a model<br />

code established for a particular study area and set of<br />

modelling objectives (output variables) including specific<br />

input data and parameter values.<br />

Model confirmation: Determination of the adequacy<br />

of the conceptual model to provide an acceptable performance<br />

for the domain of intended application.<br />

Code verification: Substantiation that a model code<br />

adequately represents a conceptual model within certain<br />

specified limits or ranges of application and corresponding<br />

ranges of accuracy.<br />

Model calibration: The procedure of adjusting the<br />

parameter values of a model in such a way that the<br />

model reproduces an observed response of the system<br />

represented in the model within the range of accuracy<br />

specified in the performance criteria.<br />

Model validation: Substantiation that a model, within<br />

its domain of applicability, possesses a satisfactory<br />

range of accuracy, consistent with the intended application<br />

of the model. Note that various authors have criticised<br />

the use of the word validation for predictive<br />

models because universal validation of a model is in<br />

principle impossible and therefore prefer to use the term<br />

model evaluation [32,3]. In our definition [39] the term<br />

validation is not used in a universal sense, but is always<br />

restricted to clearly defined domains of applicability and<br />

performance accuracy (‘numerical universal’ in Popperian<br />

sense).<br />

Pedigree: Pedigree conveys an evaluative account of<br />

the production process of information, and indicates different<br />

aspects of the underpinning and scientific status<br />

of the knowledge used. Pedigree is expressed by means<br />

of a set of pedigree criteria to assess these different<br />

aspects. Criteria for model parameter pedigree are for<br />

instance proxy representation, empirical basis, methodological<br />

rigor, theoretical understanding and validation.<br />

Assessment of pedigree involves qualitative expert<br />

judgement. To minimise arbitrariness and subjectivity<br />

in measuring strength, a pedigree matrix is used to code<br />

qualitative expert judgements for each criterion into a<br />

discrete numeral scale from 0 (weak) to 4 (strong) with<br />

linguistic descriptions (modes) of each level on the scale<br />

[49].<br />

References<br />

[1] Aller LT, Bennet T, Lehr JH, Petty RJ. DRASTIC: a standardized<br />

system for evaluating ground water pollution potential using<br />

hydrogeologic setting, US EPA Robert S. Kerr Environmental<br />

Research Laboratory, EPA/600/287/035, Ada, OK, 1987.<br />

[2] Babendreier JE. National-scale multimedia risk assessment for<br />

hazardous waste disposal. In: International workshop on uncertainty,<br />

sensitivity and parameter estimation for multimedia<br />

environmental modelling held at US Nuclear Regulatory Commission,<br />

Rockville (MD), August 19–21, 2003. Proceedings, pp.<br />

103–9.<br />

[3] Beck MB. Model evaluation and performance. In: El-Shaarawi<br />

AH, Piegorsch WW, editors. Encyclopedia of environmetrics, vol.<br />

3. Chichester: John Wiley & Sons, Ltd; 2002. p. 1275–9.<br />

[4] Beck MB. Environmental foresight and structural change. Environ<br />

Modell Software 2005;20:651–70.<br />

[5] Beck MB, van Straten G, editors. Uncertainty and forecasting of<br />

water quality. Springer-Verlag; 1983.<br />

[6] Beven K, Binley AM. The future of distributed models, model<br />

calibration and uncertainty predictions. Hydrol Process<br />

1992;6:279–98.<br />

[7] Beven K. Towards a coherent philosophy for modelling the<br />

environment. Proc Roy Soc London, A 2002;458(2026):<br />

2465–84.<br />

[8] Butts MB, Payne JT, Kristensen M, Madsen H. An evaluation of<br />

the impact of model structure on hydrological modelling uncertainty<br />

for streamflow prediction. J Hydrol 2004;298:242–66.<br />

[9] Carle SF, Fogg GE. Transition probability based on indicator<br />

geostatistics. Math Geol 1996;28(4):453–77.<br />

[10] Carle SF, Fogg GE. Modeling spatial variability with one and<br />

multidimensional continuous-lag Markov chains. Math Geol<br />

1997;29(7):891–917.<br />

[11] Copenhagen County. Pilot project on establishment of methodology<br />

for zonation of groundwater vulnerability. In: Proceedings<br />

from seminar on groundwater zonation, November 7, 2000,<br />

County of Copenhagen [in Danish].<br />

[12] Craye M, van der Sluijs JP, Funtowicz S. A reflexive approach to<br />

dealing with uncertainties in environmental health risk science and<br />

policy. Int J Risk Assess Manage 2005;5(2):216–36.<br />

[13] Dubus IG, Brown CD, Beulke S. Sources of uncertainty in<br />

pesticide fate modelling. Sci Total Environ 2003;317:53–72.<br />

[14] Dunn W. Using the method of context validation to mitigate type<br />

III errors in environmental policy analysis. In: Hisschemöller M,



Hoppe HV, Dunn W, Ravetz J, editors. Knowledge, power and<br />

participation in environmental policy. Policy studies review<br />

annual, vol. 12. New Jersey (USA): Transaction Publishers. p.<br />

417–36.<br />

[15] Efron B, Tibshirani RJ. An introduction to the bootstrap.<br />

Monographs on statistics and applied probability. New<br />

York: Chapman and Hall; 1993.<br />

[16] Franchini M, Pacciani M. Comparative analysis of several<br />

conceptual rainfall-runoff models. J Hydrol 1992;122:161–219.<br />

[17] Funtowicz SO, Ravetz JR. Uncertainty and quality in science for<br />

policy. Dordrecht: Kluwer; 1990. p. 229.<br />

[18] Harrar WG, Sonnenborg TO, Henriksen HJ. Capture zone, travel<br />

time and solute transport predictions using inverse modelling and<br />

different geological models. Hydrogeol J 2003;11(5):536–48.<br />

[19] Hodgson AM. Hexagons for systems thinking. Eur J Oper Res<br />

1992;59:220–30.<br />

[20] Hora SC. Acquisition of expert judgement: examples from risk<br />

assessment. J Energy Eng 1992;118:136–48.<br />

[21] Højberg AL, Refsgaard JC. Model uncertainty – parameter<br />

uncertainty versus conceptual models. Water Sci Technol<br />

2005;52(6):177–86.<br />

[22] IPCC. Climate change 2001: the scientific basis. Contribution of<br />

working group I to the third assessment report of the intergovernmental<br />

panel of climate change [Houghton JT, Ding Y, Griggs<br />

DJ, Noguer M, van der Linden PJ, Dai X, Maskell K, Johnson<br />

CA, editors]. Cambridge University Press, Cambridge (UK) and<br />

New York (NY, USA). p. 881.<br />

[23] Jakeman AJ, Letcher RA. Integrated assessment and modelling:<br />

features, principles and examples for catchment management.<br />

Environ Modell Software 2003;18:491–501.<br />

[24] Jensen JB. Parameter and uncertainty estimation in groundwater<br />

modelling. PhD thesis, Department of Civil Engineering, Aalborg<br />

University, Series Paper no. 23, 2003.<br />

[25] Keith DW. When is it appropriate to combine expert judgements?<br />

Clim Change 1996;33:139–43.<br />

[26] Klemes V. Operational testing of hydrological simulation models.<br />

Hydrol Sci J 1986;31:13–24.<br />

[27] Kloprogge P, van der Sluijs JP. The inclusion of stakeholder<br />

knowledge and perspectives in integrated assessment of climate<br />

change. Climatic Change, in press.<br />

[28] Linkov I, Burmistrov D. Model uncertainty and choices made by<br />

modelers: lessons learned from the international atomic energy<br />

model intercomparisons. Risk Anal 2003;23(6):1297–308.<br />

[29] Meyer PD, Ye M, Neuman SP, Cantrell KJ. Combined estimation<br />

of hydrogeologic conceptual model and parameter uncertainty.<br />

NUREG/CR-6843 Report, NRC, Washington, DC, 2004.<br />

[30] National Research Council. Conceptual models of flow and<br />

transport in the vadose zone. Washington, DC: National Academy<br />

Press; 2001.<br />

[31] Neuman SP, Wierenga PJ. A comprehensive strategy of hydrogeologic<br />

modeling and uncertainty analysis for nuclear facilities<br />

and sites. University of Arizona, Report NUREG/CR-6805,<br />

2003.<br />

[32] Oreskes N, Shrader-Frechette K, Belitz K. Verification, validation,<br />

and confirmation of numerical models in the Earth Sciences.<br />

Science 1994;263:641–6.<br />

[33] Pahl-Wostl C. Towards sustainability in the water sector – the<br />

importance of human actors and processes of social learning.<br />

Aquat Sci 2002;64:394–411.<br />

[34] Poeter E, Anderson D. Multiple ranking and inference in ground<br />

water modeling. Ground Water 2005;43(4):597–605.<br />

[35] Radwan M, Willems P, Berlamont J. Sensitivity and uncertainty<br />

analysis for river quality modelling. J Hydroinform 2004:83–99.<br />

[36] Refsgaard JC, Knudsen J. Operational validation and intercomparison<br />

of different types of hydrological models. Water<br />

Resources Res 1996;32(7):2189–202.<br />

[37] Refsgaard JC, Hansen LK, Vahman M. Groundwater zonation in<br />

Copenhagen County – Intercomparison of thematic results from<br />

different consultants. In: Seminar on groundwater zonation,<br />

County of Copenhagen, November 7, 2000 [in Danish].<br />

[38] Refsgaard JC. Towards a formal approach to calibration and<br />

validation of models using spatial data. In: Grayson R, Blöschl G,<br />

editors. Spatial patterns in catchment hydrology: observations<br />

and modelling. Cambridge University Press; 2001. p. 329–54.<br />

[39] Refsgaard JC, Henriksen HJ. Modelling guidelines – terminology<br />

and guiding principles. Adv Water Resources 2004;27:71–82.<br />

[40] Refsgaard JC, Storm B. MIKE SHE. In: Singh VP, editor.<br />

Computer models of watershed hydrology. Water Resources<br />

Publication; 1995. p. 809–46.<br />

[41] Seibert J. On the need for benchmarks in hydrological modelling.<br />

Hydrol Process 2001;15(6):1063–4.<br />

[42] Selroos JO, Walker DD, Strom A, Gylling B, Follin S. Comparison<br />

of alternative modelling approaches for groundwater flow in<br />

fractured rock. J Hydrol 2001;257:174–88.<br />

[43] Troldborg L. The influence of conceptual geological models on<br />

the simulation of flow and transport in quaternary aquifer<br />

systems. PhD Thesis. Geological Survey of Denmark and Greenland,<br />

Report 2004/107.<br />

[44] Usunoff E, Carrera J, Mousavi SF. An approach to the design of<br />

experiments for discriminating among alternative conceptual<br />

models. Adv Water Resources 1992;15:199–214.<br />

[45] Van Griensven A, Meixner T. Dealing with unidentifiable sources<br />

of uncertainty within environmental models. In: Pahl C, Schmidt<br />

S, Jakeman T, editors. iEMSs 2004 international congress:<br />

‘‘Complexity and integrated resources management’’. International<br />

Environmental Modelling and Software Society, Osnabrück,<br />

Germany, June 2004.<br />

[46] Van der Sluijs JP. Anchoring amid uncertainty; On the management<br />

of uncertainties in risk assessment of anthropogenic climate<br />

change, Ph.D. thesis, Utrecht University, 1997. p. 260.<br />

[47] Van der Sluijs JP, Potting J, Risbey JS, Van Vuuren D, de Vries B,<br />

Beusen A, et al. Uncertainty assessment of the IMAGE/TIMER<br />

B1 CO2 emissions scenario, using the NUSAP method. Report<br />

commissioned by the Netherlands National Research Program on<br />

global Air Pollution and Climate Change, RIVM, Bilthoven, The<br />

Netherlands, 2002. p. 225.<br />

[48] Van der Sluijs JP, Risbey JS, Kloprogge P, Ravetz JR, Funtowicz<br />

SO, Corral Quintana S, et al. RIVM/MNP Guidance for<br />

uncertainty assessment and communication: detailed guidance,<br />

report commissioned by RIVM/MNP – Copernicus Institute,<br />

Department of Science, Technology and Society, Utrecht University,<br />

Utrecht, The Netherlands, 2003. p. 71.<br />

[49] Van der Sluijs JP, Craye M, Funtowicz SO, Kloprogge P, Ravetz<br />

J, Risbey JS. Combining quantitative and qualitative measures of<br />

uncertainty in model based foresight studies: the NUSAP system.<br />

Risk Anal 2005;25(2):481–92.<br />

[50] Van Straten G, Keesman KJ. Uncertainty propagation and<br />

speculation in projective forecasts of environmental change: a<br />

lake-eutrophication example. J Forecast 1991;10:163–90.<br />

[51] Vennix JAM. Group model-building: tackling messy problems.<br />

Syst Dyn Rev 1999;15(4).<br />

[52] Visser H, Folkert RJM, Hoekstra J, De Wolff JJ. Identifying key<br />

sources of uncertainty in climate change projections. Clim Change<br />

2000;45:421–57.<br />

[53] Vrugt JA, Diks CGH, Gupta HV. Improved treatment of<br />

uncertainty in hydrologic modelling: combining the strengths of<br />

global optimization and data assimilation. Water Resources Res<br />

2005;41(1). Art No W01017.<br />

[54] Walker WE, Harremoës P, Rotmans J, Van der Sluijs JP, Van<br />

Asselt MBA, Janssen P, et al. Defining uncertainty. A conceptual<br />

basis for uncertainty management in model-based decision support.<br />

Integr Assessment 2003;4(1):5–17.
