Euradwaste '08 - EU Bookshop - Europa
specific sampling strategy, with the only exception of the simplest method (correlation ratios - CR). Graphical methods provide complementary visual information that helps in understanding the meaning of numerical sensitivity indices and the global structure of the system model.

2.2.1 Monte Carlo based methods

Monte Carlo based methods may be divided into three types: regression/correlation based methods, Monte Carlo Filtering and gridding methods.

Regression/correlation methods assume that inputs and outputs are linearly related. The simplest method is Pearson's correlation coefficient, a measure of linear relation that takes values between -1 and +1. Positive values indicate that input and output increase or decrease jointly; negative values indicate that when the input increases the output decreases, and vice versa. The closer the value is to +1 or -1, the stronger the relation and hence the more important the input parameter. An alternative, related measure of sensitivity is the slope, or regression coefficient, of the output versus the input, computed after standardising the sampled values (subtracting the corresponding sample mean and dividing by the corresponding standard deviation) in order to avoid scale effects in the sensitivity indices. When the importance of several inputs is analysed at the same time, the tool used is multiple regression, again with input and output standardisation. In this case the indices used are the Partial Correlation Coefficients (PCC) and the Standardised Regression Coefficients (SRC). These indices become really important when the inputs are correlated; otherwise they coincide exactly with Pearson's correlation coefficient and with the regression coefficients of simple regression, respectively. The value of PCCs and SRCs comes from the fact that they measure, respectively, the correlation and the regression coefficient between one input and one output after removing the influence of all the other inputs.
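As a minimal sketch of these indices, assuming a hypothetical two-input linear model (not one of the models discussed in this paper), Pearson's correlations and the SRCs can be computed as follows:

```python
# Sketch of Pearson correlation and Standardised Regression Coefficients (SRC)
# on an assumed two-input linear model; numpy only.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x1 = rng.uniform(0.0, 1.0, n)
x2 = rng.uniform(0.0, 1.0, n)
y = 2.0 * x1 + 0.5 * x2          # hypothetical linear test model

# Pearson correlation coefficient of each input with the output
r1 = np.corrcoef(x1, y)[0, 1]
r2 = np.corrcoef(x2, y)[0, 1]

# Standardise inputs and output (subtract the sample mean, divide by the
# sample standard deviation), then fit a multiple regression;
# the fitted slopes are the SRCs.
X = np.column_stack([x1, x2])
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
ys = (y - y.mean()) / y.std()
src, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
print(r1, r2)    # x1 correlates more strongly with y than x2
print(src)       # SRCs: larger magnitude -> more important input
```

Because this test model is deterministic, linear and has independent inputs, each SRC coincides with the corresponding Pearson coefficient and the squared SRCs sum to one (R² = 1); with correlated inputs the two sets of indices would differ, which is the situation where PCCs and SRCs earn their keep.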
Unfortunately, these indices are by default used to analyse only main effects; interactions are hardly ever considered in the analysis. Additionally, practitioners hardly ever study possible transformations of inputs and outputs that would yield a more appropriate regression model. In many cases the relation between inputs and outputs is monotonic rather than linear. In those cases it is convenient to transform inputs and outputs into their ranks: the largest sampled value is transformed into n (the sample size), the second largest into n-1, and so on down to the smallest, which becomes 1. This way the same tools, renamed Partial Rank Correlation Coefficients (PRCC) and Standardised Rank Regression Coefficients (SRRC), may be used to assess the importance of each input parameter. The reliability of the results obtained via linear regression depends on the coefficient of determination (R²) of the regression model obtained. If R² is close to 1 the results are very reliable; if it is close to 0, this sensitivity method is not appropriate for the system model at hand.

Monte Carlo Filtering (MCF) is based on dividing the output sample into two or more subsets according to some criterion (achievement of a given condition, exceedance of a threshold, etc.) and testing whether the inputs associated with those subsets differ. As an example, we could divide the output sample into two parts, the one that exceeds a safety limit and the rest, and ask whether the points in the two subsamples are related to different regions of a given input or could come from any region of that input. In the first case, knowing the value of that input parameter would be important for predicting whether the safety limit will be exceeded; in the second case it would not.
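The filtering idea just described can be sketched numerically. The model, the safety limit and the choice of SciPy's two-sample Kolmogorov-Smirnov test for the comparison are illustrative assumptions here, not choices made in the paper:

```python
# Sketch of Monte Carlo Filtering: split the output sample at an assumed
# "safety limit", then test whether the input values behind the two subsets
# come from different distributions (two-sample Kolmogorov-Smirnov test).
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)
n = 5_000
x1 = rng.uniform(0.0, 1.0, n)    # influential input (drives the output)
x2 = rng.uniform(0.0, 1.0, n)    # non-influential input
y = x1 ** 2                      # hypothetical system model

limit = 0.5                      # assumed safety limit on the output
exceed = y > limit

# Compare the input values behind the two output subsets, input by input
for name, x in [("x1", x1), ("x2", x2)]:
    stat, p = ks_2samp(x[exceed], x[~exceed])
    print(name, stat, p)   # small p-value -> the input separates the subsets
```

In this sketch the exceedance set corresponds exactly to one region of x1, so the test flags x1 as important, while x2 shows no separation.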
The tools used to answer this type of question are a set of parametric and non-parametric statistics and their associated tests, for example the two-sample t test, the two-sample F test, the two-sample Smirnov test, the k-sample Smirnov test, the Cramér-von Mises test, the Wilcoxon test (also known as the Mann-Whitney test) and the Kruskal-Wallis test; see Conover (1980) for details about each specific statistic and test. In general, non-parametric tests should be preferred because they impose fewer restrictions on the samples used. The idea behind gridding, and the tests used, are similar to Monte Carlo Filtering; the only real difference is that the criteria used to divide the sample into two or more parts are set on the input space, and the test is performed using the corresponding points in the output space.

2.2.2 Variance based methods

The variance (or equivalently the standard deviation) and the entropy are the main measures of uncertainty in probability theory. The larger the variance of a random variable, the less accurate our knowledge about it. Decreasing the variance of a given output variable is an attractive target that may sometimes be achieved by decreasing the variance of the input parameters (this is not always true; remember the possibility of risk dilution). This is what makes methods that try to find out what fraction of the output uncertainty (variance) may be attributed to the uncertainty (variance) in each input so attractive. Variance based methods find their theoretical support in Sobol's decomposition of any integrable function on the unit reference hypercube into 2^k orthogonal summands of different dimension: the mean value of the function, k functions that each depend on only one input parameter, k(k-1)/2 functions that depend on only two input parameters, k(k-1)(k-2)/6 that depend on only three input parameters, and so on. Replacing any output variable of the system model (our function) by its Sobol decomposition in the integral used to compute its variance produces, in a straightforward manner, the decomposition of the variance into its components. The quotient between each component of the variance and the total variance gives the fraction of the variance attributed to each single input parameter (main effects), to each combination of two input parameters (second order interactions), and so on. These are called Sobol sensitivity indices; see Sobol (1993). It is important to remark that Sobol's decomposition is equivalent to the classical Analysis of Variance (ANOVA) used in statistics.
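As a minimal numerical sketch, first order Sobol indices can be estimated with a pick-freeze (Saltelli-type) estimator. The linear test function y = x1 + 2·x2 with independent uniform inputs, whose analytic indices are S1 = 0.2 and S2 = 0.8, is an assumption for illustration, not a model from the paper:

```python
# Sketch of first order Sobol' indices via a pick-freeze (Saltelli-type)
# estimator on an assumed linear test model y = x1 + 2*x2,
# xi ~ U(0,1) independent; the analytic indices are S1 = 0.2, S2 = 0.8.
import numpy as np

def model(x):
    return x[:, 0] + 2.0 * x[:, 1]

rng = np.random.default_rng(2)
n, k = 100_000, 2
A = rng.uniform(size=(n, k))      # two independent input sample matrices
B = rng.uniform(size=(n, k))
yA, yB = model(A), model(B)
var_y = np.var(np.concatenate([yA, yB]))

for i in range(k):
    ABi = A.copy()
    ABi[:, i] = B[:, i]           # replace ("pick") only the i-th column
    yABi = model(ABi)
    # first order index estimator: V_i ~ mean(yB * (yABi - yA))
    S_i = np.mean(yB * (yABi - yA)) / var_y
    print(f"S_{i + 1} ~= {S_i:.3f}")
```

Note the cost pattern the text refers to: each index needs its own set of model runs (here one extra matrix per input on top of the two base matrices), which is what makes the method expensive for costly system models.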
Several algorithms have been proposed to compute Sobol's indices, the first one by Sobol himself. The main problem is the efficiency of the method: it needs one specific sample to compute each sensitivity index. Since its development, considerable effort has been devoted to improving the strategies (algorithms) used to compute Sobol's indices; see for example Saltelli (2002) and Tarantola et al. (2006). It remains a powerful but, in terms of computational cost, expensive method. Independently, and well before the development of Sobol's decomposition and indices, a method had been developed to compute first order sensitivity indices (equivalent to first order Sobol indices): the Fourier Amplitude Sensitivity Test (FAST); see Cukier et al. (1973), Schaibly and Shuler (1973) and Cukier et al. (1975). To compute the sensitivity indices, these authors create a search curve that covers the input space reasonably well. Each input parameter is assigned an integer frequency, and varying all input parameters simultaneously according to that set of frequencies generates the search curve. Equally spaced points are sampled from the search curve and used to perform a Fourier analysis; the coefficients corresponding to the frequency (and its harmonics) assigned to each input parameter are used to compute the corresponding sensitivity index. Saltelli et al. (1999) introduced further improvements to the method, among them the possibility of computing total sensitivity indices for a given input parameter (the fraction of the variance due to that parameter and all its interactions of any order); FAST remains unable to compute sensitivity indices for individual interactions. Correlation ratios are an alternative to Sobol's method and FAST for computing first order sensitivity indices from an ordinary sample (SRS, LHS, etc.); so, although used to compute variance based sensitivity indices, the method could also be considered Monte Carlo based.
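A correlation ratio can be estimated from a plain random sample by binning an input and comparing the variance of the binwise output means with the total output variance. The sketch below again uses an assumed linear test model (analytic indices S1 = 0.2, S2 = 0.8), not one from the paper, and the bin count is an arbitrary illustrative choice:

```python
# Sketch of the correlation ratio (CR) as a first order index estimate from
# a single plain Monte Carlo sample: bin the input, then take the variance
# of the per-bin output means over the total output variance.
import numpy as np

rng = np.random.default_rng(3)
n, nbins = 200_000, 50
x1 = rng.uniform(size=n)
x2 = rng.uniform(size=n)
y = x1 + 2.0 * x2                # assumed linear test model; S1=0.2, S2=0.8

def correlation_ratio(x, y, nbins):
    """Variance of the binwise output means over the total output variance."""
    bins = np.minimum((x * nbins).astype(int), nbins - 1)  # x in [0, 1)
    means = np.array([y[bins == b].mean() for b in range(nbins)])
    counts = np.array([(bins == b).sum() for b in range(nbins)])
    # count-weighted variance of the conditional means around the grand mean
    var_cond_mean = np.sum(counts * (means - y.mean()) ** 2) / len(y)
    return var_cond_mean / y.var()

print(correlation_ratio(x1, y, nbins))   # close to S1 = 0.2
print(correlation_ratio(x2, y, nbins))   # close to S2 = 0.8
```

Unlike the pick-freeze estimators, this reuses one ordinary sample (SRS, LHS, etc.) for every input, which is why correlation ratios sit at the border between variance based and Monte Carlo based methods.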