
name-collision-02aug13-en


organizations that operate as widely-trusted certification authorities (i.e., those with root certificates embedded in browsers and OSs); and

web browser and other software vendors.

3.4.2 Data sources and root server assistance

Preliminary discussions on the practicalities of this study took place during the RIPE66 and DNS-OARC meetings in mid-May 2013 since several of the RSOs were attending. These discussions considered what data would be needed for the study, how to collect and deliver that data, likely timelines/milestones, and what levels of assistance the RSOs could provide.

The consensus was that the main data source for DNS analysis should be DNS-OARC’s DITL (Day in the life of the Internet)¹⁵ data sets. Many RSOs already contributed data to that initiative and had begun preparations for the 2013 DITL exercise, which would start later that month. Using DITL data for this study had the benefit of not requiring RSOs to commit resources to some other type of data gathering. In addition, access to the DITL data was covered by a single data sharing agreement common to all DNS-OARC members. This meant that the study team would be able to analyze those data almost immediately and avoid the potential delays that might arise from the legal complexities of arranging confidentiality and/or data access agreements with individual root server operators.

3.4.3 DITL data processing

The sheer size of the DITL data sets presented many challenges; managing roughly 8TB of compressed data spread across more than 500,000 files and organizing the workflow around them was a non-trivial exercise. All of this work had to be performed at DNS-OARC under the terms of its data sharing agreement. DITL data could not be copied or moved off-site; they could be accessed only across the local network from DNS-OARC’s file servers.
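To give a feel for the scale described above, the following minimal sketch works out the average compressed file size implied by the two figures in the text. It assumes decimal terabytes and treats "more than 500,000 files" as exactly 500,000, so the result is an upper bound on the average; the function name is illustrative and is not taken from the study's tooling.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the text.
# Assumptions: "roughly 8TB" is taken as 8 * 10**12 bytes (decimal TB) and
# "more than 500,000 files" is taken as exactly 500,000 files.
TOTAL_COMPRESSED_BYTES = 8 * 10**12
FILE_COUNT = 500_000


def average_file_size_mb(total_bytes: int, file_count: int) -> float:
    """Return the mean file size in decimal megabytes."""
    if file_count <= 0:
        raise ValueError("file_count must be positive")
    return total_bytes / file_count / 10**6


if __name__ == "__main__":
    avg = average_file_size_mb(TOTAL_COMPRESSED_BYTES, FILE_COUNT)
    print(f"average compressed file size: about {avg:.0f} MB")
```

Under these assumptions the average comes to about 16 MB per file, which suggests the difficulty lay less in any single file than in the number of files and the total volume that had to be read across the local network.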

Before any data gathering was carried out, the team made an assessment of the available hardware at DNS-OARC and the potential software that could be used. Some benchmarking was done to assess the hardware or network footprint of these tools and how long particular tools would take to process the data. Pragmatic choices were then made about how best to proceed: which tools would be most suited to the available platforms; what approaches to processing the data would and would not work well; how to arrange the workflows; and how long each run over the data sets would take. Appropriate scripts were then developed and tested. These produced summary results which were submitted for statistical analysis.
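The kind of estimate described above, extrapolating from a benchmark on a sample to the duration of a full run, can be sketched as follows. This is only an illustration and not the study team's actual method: the function name, the sample figures, and the worker count are hypothetical, and the model assumes throughput scales linearly with the number of parallel workers, which is optimistic when all workers read from the same file servers over a shared network.

```python
def estimate_run_hours(
    sample_bytes: float,
    sample_seconds: float,
    total_bytes: float,
    parallel_workers: int = 1,
) -> float:
    """Extrapolate a benchmark on a data sample to a full run, in hours.

    Assumes every worker sustains the throughput measured on the sample
    (an idealization; shared disks and networks usually scale worse).
    """
    if sample_bytes <= 0 or sample_seconds <= 0:
        raise ValueError("sample measurements must be positive")
    if parallel_workers < 1:
        raise ValueError("parallel_workers must be at least 1")
    bytes_per_second = sample_bytes / sample_seconds  # per-worker throughput
    total_seconds = total_bytes / (bytes_per_second * parallel_workers)
    return total_seconds / 3600


if __name__ == "__main__":
    # Hypothetical benchmark: one 16 MB file processed in 2 seconds.
    # The 8 * 10**12 total reflects the "roughly 8TB" quoted in the text.
    for workers in (1, 8):
        hours = estimate_run_hours(16e6, 2.0, 8e12, parallel_workers=workers)
        print(f"{workers} worker(s): about {hours:.0f} hours per run")
```

Even with invented numbers, a calculation of this shape shows why each pass over the data had to be planned: at the hypothetical rate above, a single sequential run would take well over a week, so tool choice and workflow layout directly determine how many runs are feasible.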

15 https://www.dns-oarc.net/oarc/data/ditl

Name Collision Study Report Page 18

Version 1.5 2013.08.02
