D5 Annex report WP 3: ETIS Database methodology ... - ETIS plus

<strong>D5</strong> <strong>Annex</strong> <strong>WP</strong> 3: DATABASE METHODOLOGY AND DATABASE USER MANUAL –<br />

FREIGHT TRANSPORT DEMAND<br />

The COMEXT database for the Year 2000 has been used for the testing phase of <strong>ETIS</strong>. The<br />

first test has considered whether it is necessary to apply additional error checking routines to<br />

avoid the inclusion of data errors in the subsequent O/D matrices. This has been achieved by<br />

comparing ‘smoothed’ data to the raw data to measure the impact of erratic values in the<br />

database.<br />

A technique for identifying outliers (errors or “erratics”) has been developed within the MDS<br />

Transmodal trade forecasting model, taking into account a long time series of trade data.<br />

During this process, the COMEXT data is converted into time series vectors for individual trade<br />

flows (e.g. French exports of SITC 56 in tonnes to Italy) for quarterly time periods covering<br />

approximately fifteen years. The smoothing software samples four data points for each year and<br />

calculates the mean and the standard deviation for that year. It then compares each year's mean<br />

and standard deviation with all the others. Then if there are any years with unusual levels of<br />

variance, they are investigated by the software, and according to certain thresholds individual<br />

quarterly values may be marked as outliers and the software will replace them with interpolated<br />

values. This means that normally erratic series will be left untouched, but erratic points or<br />

sequences within normally stable series will be changed. Every year, new data is collected, and<br />

the process is repeated, so it is possible that what is regarded as an outlier may change over time<br />

as the software learns more about the time series.<br />
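The procedure described above can be sketched in a few lines of Python. This is only an illustrative approximation, not the MDS Transmodal implementation: the function name `smooth_quarterly`, the thresholds `var_factor` and `dev_factor`, and the choice of linear interpolation are all assumptions, since the report does not state the actual thresholds or interpolation rule. For brevity the sketch compares only the yearly standard deviations, whereas the text says the yearly means are compared as well.

```python
from statistics import median, pstdev


def smooth_quarterly(series, var_factor=5.0, dev_factor=2.0):
    """Flag and interpolate erratic quarterly values (hypothetical sketch).

    series      -- quarterly values, four per year, oldest first
    var_factor  -- assumed threshold: a year is 'unusual' if its standard
                   deviation exceeds var_factor times the typical yearly one
    dev_factor  -- assumed threshold for marking a single quarter
    """
    if len(series) % 4 != 0:
        raise ValueError("expected whole years of quarterly data")

    # Four data points per year, with one standard deviation per year.
    years = [series[i:i + 4] for i in range(0, len(series), 4)]
    stds = [pstdev(y) for y in years]
    typical_std = median(stds)  # what is 'normal' for this particular series

    outliers = set()
    for yi, (vals, sd) in enumerate(zip(years, stds)):
        if sd <= var_factor * typical_std:
            continue  # variance in line with the other years
        centre = median(vals)
        for qi, v in enumerate(vals):
            if abs(v - centre) > dev_factor * typical_std:
                outliers.add(4 * yi + qi)

    # Replace each outlier by interpolating between its nearest good neighbours.
    good = [i for i in range(len(series)) if i not in outliers]
    smoothed = [float(v) for v in series]
    for i in sorted(outliers):
        left = max((g for g in good if g < i), default=None)
        right = min((g for g in good if g > i), default=None)
        if left is not None and right is not None:
            w = (i - left) / (right - left)
            smoothed[i] = series[left] + w * (series[right] - series[left])
        elif left is not None or right is not None:
            smoothed[i] = float(series[left if left is not None else right])
    return smoothed, sorted(outliers)


# Made-up data: a stable series with one erratic quarter, and an erratic series.
stable = [100, 102, 98, 101, 99, 103, 100, 97,
          101, 1000, 99, 102, 100, 98, 102, 101]
erratic = [10, 200, 50, 400, 300, 5, 120, 60,
           15, 250, 90, 330, 70, 410, 20, 180]

stable_smoothed, stable_outliers = smooth_quarterly(stable)
erratic_smoothed, erratic_outliers = smooth_quarterly(erratic)
```

Because the thresholds are scaled by the series' own typical yearly variation, the spike in the stable series is replaced while the consistently erratic series is returned unchanged, which mirrors the behaviour described in the text.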

The use of the smoothing algorithm can be illustrated by comparing the smoothed data to the<br />

original data.<br />

In 2000, imports into EU countries amounted to 2.501 billion tonnes according to COMEXT.<br />

After smoothing, the estimate was 2.391 billion tonnes, a change of only about 4%. The largest<br />

absolute error in a single 2-digit SITC category is 13 million tonnes, for SITC 33, petroleum.<br />

However, this is only a 2% difference within that category. The largest percentage error is for<br />

SITC 83, travel goods, with a 63% difference. However, this only amounts to an<br />

absolute difference of 1.133 million tonnes. Most of the difference can be traced to a figure of<br />

1.079 million tonnes for travel goods between the UK and Germany.<br />

Looking at the same trade flow using German export data a total of 0.001 million tonnes can be<br />

found, a level that agrees more readily with the smoothed data. This difference (a factor of about 1000) is<br />

atypical, however. Absolute differences are typically about 1 million tonnes per 2-digit SITC<br />

category, and relative differences are typically about 6%. It should also be noted that<br />

differences between smoothed and unsmoothed series do not necessarily imply that the unsmoothed<br />

series contains errors, only values that are unlikely to be repeated.<br />
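The headline figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text. It assumes that the overall change is measured relative to the raw COMEXT total, which the report does not state explicitly, and the variable names are illustrative.

```python
# Total EU imports in 2000, in billion tonnes (figures from the text).
raw_total = 2.501
smoothed_total = 2.391

# Relative change, assuming the raw COMEXT total is the base.
overall_change_pct = 100 * (raw_total - smoothed_total) / raw_total

# Travel goods (SITC 83), UK-Germany flow, in million tonnes:
# the COMEXT figure versus the figure found in German export data.
comext_flow = 1.079
german_export_flow = 0.001
flow_ratio = comext_flow / german_export_flow

print(f"overall change: {overall_change_pct:.1f}%")
print(f"ratio between the two travel goods figures: about {flow_ratio:.0f}")
```

On these assumptions the overall change comes to roughly 4.4%, and the two travel goods figures differ by a factor of about 1000.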

At these levels, and given the scope of <strong>ETIS</strong>, the potential impacts of measurement errors are<br />

not alarming, particularly when the annual version of COMEXT is used. However, the example<br />

related above does suggest that a high level comparison between the trade data used within<br />

<strong>ETIS</strong> and the data based upon a smoothed quarterly time series will reveal a small number of<br />

27 May 2004
