Eckhard Bick - VISL


circumflex-) accented words without R-forms, but these are typically one-syllable words with a word-final vowel, where Portuguese orthographic convention adds accents with a phonetically distinctive value. These words are covered by procedure (a1).

-> if the word is still unanalysable, try the acute accent on one vowel after the other.

This procedure covers the few cases of multi-syllable function words (i.e., words without R-forms) and of missing-accent errors in verb forms - other than the monosyllabic ones handled by (a1) - that are not covered by the Luso-Brazilian variation rules.

(b1) the word does contain an accented vowel

-> remove the accent, unless it is located word-final

Word-final accents may be changed but not removed, because (a) this accent position is rarely chosen by error, and (b) word-final unaccented vowels, which mimic inflexion endings, carry a high risk of overgeneration, i.e. false positive analyses.

-> if the word is still unanalysable, exchange acute and circumflex instead
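The accent heuristics described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the parser's actual implementation: the function names, the `is_analysable` callback (standing in for a lexicon lookup or full morphological analysis) and the toy lexicon are all hypothetical. The sketch covers only the steps visible in this excerpt, namely the acute-accent fallback for unaccented words and the two (b1) steps for accented words.

```python
import unicodedata

ACUTE = "\u0301"       # combining acute accent
CIRCUMFLEX = "\u0302"  # combining circumflex accent
SWAP = {ACUTE: CIRCUMFLEX, CIRCUMFLEX: ACUTE}
VOWELS = set("aeiou")


def nfc(text):
    """Return the precomposed (NFC) form, so that 'e' + accent is one character."""
    return unicodedata.normalize("NFC", text)


def try_acute_on_each_vowel(word, is_analysable):
    """Unaccented word: try the acute accent on one vowel after the other."""
    word = nfc(word)
    for i, ch in enumerate(word):
        if ch.lower() in VOWELS:
            candidate = word[:i] + nfc(ch + ACUTE) + word[i + 1:]
            if is_analysable(candidate):
                return candidate
    return None


def accented_vowels(word):
    """Yield (index, base letter, accent) for each acute or circumflex letter."""
    for i, ch in enumerate(word):
        decomposed = unicodedata.normalize("NFD", ch)
        if len(decomposed) == 2 and decomposed[1] in SWAP:
            yield i, decomposed[0], decomposed[1]


def repair_accented(word, is_analysable):
    """(b1): remove a non-final accent; failing that, swap acute and circumflex."""
    word = nfc(word)
    positions = list(accented_vowels(word))
    # Step 1: remove the accent, unless it is located word-final.
    for i, base, _ in positions:
        if i == len(word) - 1:
            continue
        candidate = word[:i] + base + word[i + 1:]
        if is_analysable(candidate):
            return candidate
    # Step 2: exchange acute and circumflex (word-final accents included).
    for i, base, accent in positions:
        candidate = word[:i] + nfc(base + SWAP[accent]) + word[i + 1:]
        if is_analysable(candidate):
            return candidate
    return None


# Hypothetical toy lexicon standing in for the real analyser.
TOY_LEXICON = {nfc(w) for w in ("inaudível", "francesa", "você", "casa")}


def in_toy_lexicon(word):
    return nfc(word) in TOY_LEXICON


print(try_acute_on_each_vowel("inaudivel", in_toy_lexicon))  # inaudível
print(repair_accented("francêsa", in_toy_lexicon))           # francesa
print(repair_accented("vocé", in_toy_lexicon))               # você
print(repair_accented("casá", in_toy_lexicon))               # None
```

The last call shows the effect of the word-final restriction: although the toy lexicon contains "casa", the final accent of "casá" is never stripped, so no analysis is produced instead of a possibly false positive one.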

In the final analysis, in order to retain corpus fidelity (note 39), all changes - variation or spelling correction - are marked with the ALT-tag (='altered') after the word form in question. The only exceptions are variations listed separately in the main or inflexion endings lexicon. These will sometimes be marked as rare (), Brazilian (B) or European (L), but no canonical form will be given.

Below is a short list of examples illustrating the use of the ALT-tag:

(2)
     moiro ALT mouro
          "mouro" ADJ M S
          "mouro" N M S
(a1) ve ALT vê
          "vê" N M S
          "ver" V IMP 2S VFIN
          "ver" V PR 3S IND VFIN
(a2) inaudivel ALT inaudível
          "inaudível" ADJ M/F S
(b1) francêsa ALT francesa
          "francês" N F S
          "francês" ADJ F S
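Because the original word form is kept in front of the ALT-tag, the source text stays recoverable from the analysis. The following minimal sketch illustrates this, assuming word-form lines look exactly like those in (2) as printed here ("original ALT canonical", or a bare word form when nothing was altered). The function names are hypothetical, and the sketch ignores the other preprocessor changes (splitting of fused forms, polylexicals).

```python
def split_word_line(line):
    """Return (original, canonical) for a word-form line such as 'moiro ALT mouro'."""
    tokens = line.split()
    if len(tokens) == 3 and tokens[1] == "ALT":
        return tokens[0], tokens[2]
    if len(tokens) == 1:
        # Unaltered word form: original and canonical form coincide.
        return tokens[0], tokens[0]
    raise ValueError(f"unexpected word-form line: {line!r}")


def reconstruct_text(word_lines):
    """Rebuild the original token sequence from the word-form lines."""
    return " ".join(split_word_line(line)[0] for line in word_lines)


print(reconstruct_text(["moiro ALT mouro", "ve ALT vê", "inaudivel ALT inaudível"]))
# moiro ve inaudivel
```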

39 Ideally, any analysed corpus excerpt should allow the reconstruction of the original text. Therefore, all word form changes, typically introduced by the preprocessor, like splitting of fused preposition-determiner units (da, nele, marked by -tags), fusion into polylexicals (em=vez=de) or orthographical canonisation (the "ALT-case") must be marked on the altered form.
