21.04.2013

Eckhard Bick - VISL


(2c) Interfering adverbs in preposition-infinitive sequences

The teaching domain is only one, quite specific, area where corpus data are of interest. In the case of research corpora, factors like size, coverage, diversity and annotation correctness are usually much more important than colourful interfaces.

So far, the morphological and syntactic modules of the parser have been used in the following corpus annotation tasks and tests (for a quantitative performance evaluation, cf. chapters 3.9 and 8.1):

The ECI corpus (excerpt from the Borba-Ramsey corpus, published on CD-ROM by the European Corpus Initiative)

ca. 670,000 words, used for internal research in the development of the parser

mixed-genre Brazilian Portuguese texts (science, fiction, plays, conversation etc.)

This corpus has been re-tagged with the latest version of the parser, in collaboration with Diana Santos at SINTEF (Oslo), and will be made available at www.oslo.sintef.no/portug/.

VEJA articles (1996 editions, kindly provided by the editor)

ca. 600,000 words, used for internal research and teaching examples

Brazilian Portuguese news magazine texts (mixed topics)
