20.04.2013 Views

Documentation of the Evaluation of CALPUFF and Other Long ...


C.4.2.1 SUMMARY OF LRT MODEL RANKINGS FOR CTEX3 USING STATISTICAL PERFORMANCE MEASURES

Table C‐5 summarizes the rankings of the six LRT models for the 11 performance statistics analyzed. Depending on the statistical metric, three different models ranked as the best performing model for a particular statistic: CAMx ranked first most often (46% of the statistics), followed by FLEXPART (36%). CALGRID was consistently ranked the worst performing model, with the poorest performance for 6 of the 11 performance statistics.

To test the efficacy of the RANK statistic for providing an overall ranking of model performance, we compared the ranking of the six LRT models based on the average rank across the 11 performance statistics with the ranking from the RANK statistical metric (Table C‐5). For the CTEX3 experiment, the two rankings of the six LRT dispersion models were as follows:

Ranking   Average of 11 Statistics   RANK
1.        CAMx                       CAMx
2.        SCIPUFF                    SCIPUFF
3.        FLEXPART                   FLEXPART
4.        HYSPLIT                    CALPUFF
5.        CALPUFF                    HYSPLIT
6.        CALGRID                    CALGRID
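As a minimal sketch, the two orderings listed above can be compared programmatically to confirm where they differ. The model orderings are copied from the listing; the variable names and the helper `rank_differences` are illustrative assumptions and are not part of the original evaluation.

```python
# Orderings copied from the listing above (best to worst).
avg_rank_order = ["CAMx", "SCIPUFF", "FLEXPART", "HYSPLIT", "CALPUFF", "CALGRID"]
rank_stat_order = ["CAMx", "SCIPUFF", "FLEXPART", "CALPUFF", "HYSPLIT", "CALGRID"]


def rank_differences(order_a, order_b):
    """Return {model: (position_in_a, position_in_b)} for every model
    whose 1-based position differs between the two orderings.
    Hypothetical helper, for illustration only."""
    pos_b = {model: i for i, model in enumerate(order_b, start=1)}
    return {
        model: (i, pos_b[model])
        for i, model in enumerate(order_a, start=1)
        if pos_b[model] != i
    }


differences = rank_differences(avg_rank_order, rank_stat_order)
print(differences)  # {'HYSPLIT': (4, 5), 'CALPUFF': (5, 4)}
```

Only HYSPLIT and CALPUFF appear in the result, which matches the single exchange of the fourth and fifth positions.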

For the CTEX3 experiment, the average rankings across the 11 statistics are nearly identical to the rankings produced by the RANK integrated statistic, which combines the four statistics for correlation (PCC), bias (FB), spatial (FMS) and cumulative distribution (KS); only HYSPLIT and CALPUFF exchange places as the 4th and 5th best performing models. CALPUFF was pulled down in the average statistic rankings by lower scores than HYSPLIT in the FA2 and FA5 metrics. If not for this, the average rank across all 11 metrics would have been the same as Draxler’s RANK score. Although this deviation occurred in the fourth and fifth ranked positions, the RANK statistic remains a valid performance statistic for indicating the overall model performance of an LRT dispersion model. However, the analyst should be cautious about relying too heavily on the RANK score without considering which performance metrics are important for the particular evaluation goals. For example, if performance goals are not concerned with a model’s ability to perform well in space and time, then reliance upon spatial statistics such as the FMS in the composite RANK value may not be appropriate. In this evaluation, since space/time considerations are paramount for proper LRT model performance, the RANK metric is a valuable tool for rapidly assessing model performance across a broad range of metrics.
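For orientation, the following is a minimal sketch of how a composite score of this kind can be computed. It assumes the commonly cited form of Draxler's RANK, RANK = R^2 + (1 - |FB/2|) + FMS/100 + (1 - KS/100), with FMS and KS expressed in percent and a score ranging from 0 (worst) to 4 (best). That formula is not stated in this section, so it should be checked against the report's own definition. The function name `rank_score` and its parameter names are illustrative, and the inputs shown are idealized limits, not results from the CTEX3 evaluation.

```python
def rank_score(pcc, fb, fms_percent, ks_percent):
    """Composite RANK score under the assumed form described above.

    pcc          -- Pearson correlation coefficient R, in [-1, 1]
    fb           -- fractional bias, in [-2, 2]
    fms_percent  -- figure of merit in space, in percent [0, 100]
    ks_percent   -- Kolmogorov-Smirnov parameter, in percent [0, 100]
    """
    correlation_term = pcc ** 2               # 1 when perfectly correlated
    bias_term = 1.0 - abs(fb / 2.0)           # 1 when unbiased
    spatial_term = fms_percent / 100.0        # 1 when plumes fully overlap
    distribution_term = 1.0 - ks_percent / 100.0  # 1 when distributions match
    return correlation_term + bias_term + spatial_term + distribution_term


# Idealized limits (not measured values): a perfect model and a worst case.
perfect = rank_score(1.0, 0.0, 100.0, 0.0)
worst = rank_score(0.0, 2.0, 0.0, 100.0)
print(perfect, worst)  # 4.0 0.0
```

Under this assumed form each of the four terms contributes between 0 and 1, which makes the caution above concrete: the FMS term carries a quarter of the attainable score, so an evaluation that is not concerned with spatial agreement may prefer to inspect the individual terms instead of the sum.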
