13.08.2022 Views

advanced-algorithmic-trading


The Reuters-21578 dataset can be found at http://kdd.ics.uci.edu/databases/reuters21578/reuters21578.tar.gz as a gzip-compressed tar archive. The first task is to create a new working directory and download the file into it. Please modify the directory name below as you see fit:

cd ~
mkdir -p quantstart/classification/data
cd quantstart/classification/data
wget http://kdd.ics.uci.edu/databases/reuters21578/reuters21578.tar.gz

On Windows you will need to use the equivalent command-line syntax to create the directories, or use Windows Explorer, and then use a web browser to download the data.
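Since the shell commands above are Unix-specific, the same two steps (create the directory, fetch the file) can also be sketched in Python, which behaves the same way on Windows. This is a minimal sketch using only the standard library; the function name `download_dataset` and the constant `REUTERS_URL` are hypothetical names introduced here for illustration, and the URL is simply the one quoted in the text.

```python
from pathlib import Path
from urllib.request import urlretrieve

# URL quoted in the text above.
REUTERS_URL = "http://kdd.ics.uci.edu/databases/reuters21578/reuters21578.tar.gz"


def download_dataset(url: str, dest_dir: Path) -> Path:
    """Create dest_dir if needed and download url into it.

    Returns the path of the downloaded file. An existing file with the
    same name is left untouched, so the download is not repeated.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p`
    target = dest_dir / url.rsplit("/", 1)[-1]   # keep the remote file name
    if not target.exists():
        urlretrieve(url, str(target))
    return target


# Example call (the directory is an assumption, modify it as you see fit):
# download_dataset(REUTERS_URL, Path.home() / "quantstart" / "classification" / "data")
```

The example call is left commented out so that nothing is downloaded until you choose the directory and run it yourself.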

We can then decompress and extract the archive:

tar -zxvf reuters21578.tar.gz

On Windows you can use 7-Zip for this procedure.
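As an alternative to tar or 7-Zip, the archive can probably be unpacked with Python's standard `tarfile` module on any platform. The sketch below is not taken from the original text; `extract_archive` is a hypothetical helper name. It passes the `"data"` extraction filter where the running Python version supports it and otherwise falls back to the plain call.

```python
import tarfile
from pathlib import Path


def extract_archive(archive: Path, dest_dir: Path) -> list[str]:
    """Extract a gzip-compressed tar archive into dest_dir.

    Returns the names of the archive members.
    """
    with tarfile.open(archive, "r:gz") as tar:
        names = tar.getnames()
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions: refuse unsafe members such as
            # absolute paths or links pointing outside dest_dir.
            tar.extractall(dest_dir, filter="data")
        else:
            tar.extractall(dest_dir)
    return names


# Example call, assuming the archive sits in the current directory:
# extract_archive(Path("reuters21578.tar.gz"), Path("."))
```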

If we list the contents of the directory (ls -l) we can see the following (the permissions and ownership details have been omitted for brevity):

... 186 Dec 4 1996 all-exchanges-strings.lc.txt
... 316 Dec 4 1996 all-orgs-strings.lc.txt
... 2474 Dec 4 1996 all-people-strings.lc.txt
... 1721 Dec 4 1996 all-places-strings.lc.txt
... 1005 Dec 4 1996 all-topics-strings.lc.txt
... 28194 Dec 4 1996 cat-descriptions_120396.txt
... 273802 Dec 10 1996 feldman-cia-worldfactbook-data.txt
... 1485 Jan 23 1997 lewis.dtd
... 36388 Sep 26 1997 README.txt
... 1324350 Dec 4 1996 reut2-000.sgm
... 1254440 Dec 4 1996 reut2-001.sgm
... 1217495 Dec 4 1996 reut2-002.sgm
... 1298721 Dec 4 1996 reut2-003.sgm
... 1321623 Dec 4 1996 reut2-004.sgm
... 1388644 Dec 4 1996 reut2-005.sgm
... 1254765 Dec 4 1996 reut2-006.sgm
... 1256772 Dec 4 1996 reut2-007.sgm
... 1410117 Dec 4 1996 reut2-008.sgm
... 1338903 Dec 4 1996 reut2-009.sgm
... 1371071 Dec 4 1996 reut2-010.sgm
... 1304117 Dec 4 1996 reut2-011.sgm
... 1323584 Dec 4 1996 reut2-012.sgm
... 1129687 Dec 4 1996 reut2-013.sgm
... 1128671 Dec 4 1996 reut2-014.sgm
... 1258665 Dec 4 1996 reut2-015.sgm
... 1316417 Dec 4 1996 reut2-016.sgm
... 1546911 Dec 4 1996 reut2-017.sgm
... 1258819 Dec 4 1996 reut2-018.sgm
... 1261780 Dec 4 1996 reut2-019.sgm
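To check the extraction without relying on ls, for instance on Windows, the reut2-*.sgm files and their sizes can be listed from Python and compared with the listing above. This is only a sketch; `list_sgm_files` is a hypothetical helper name, and the file-name pattern is inferred from the listing.

```python
from pathlib import Path


def list_sgm_files(data_dir: Path) -> list[tuple[str, int]]:
    """Return (file name, size in bytes) for each reut2-*.sgm file, sorted by name."""
    return [
        (path.name, path.stat().st_size)
        for path in sorted(data_dir.glob("reut2-*.sgm"))
    ]


# Example call, assuming the current directory holds the extracted files:
# for name, size in list_sgm_files(Path(".")):
#     print(f"{size:>8} {name}")
```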
