21.04.2013 Views

Eckhard Bick - VISL


“base form-1” .. .. WORD CLASS-2 INFLEXION
“base form-2” .. .. WORD CLASS-3 INFLEXION
“base form-2” .. .. WORD CLASS-4 INFLEXION

A rules file ordinarily consists of the following sections:

DELIMITERS (1 section, defines sentence boundaries)
SETS (1 or more lists of set definitions, compiled as one)
MAPPINGS (1 list of mapping rules for adding context-dependent tags)
CONSTRAINTS (1 or more lists of CG rules, compiled one section at a time)
END

If there are several CONSTRAINTS sections containing constraint grammar rules, they are applied to the input text in the order in which they appear in the file. This makes it possible to distinguish, for instance, between morphological disambiguation, done before the mapping of syntactic tags, and syntactic disambiguation, done after it.

Comments can be added anywhere in the rules file after a #-sign.
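The section layout and the comment convention can be illustrated with a small Python sketch. It is not part of the cg2 compiler: the function names, the placeholder rule lines, and the assumption that each section keyword stands alone on its line are all made up for illustration, and the comment stripping is naive (it ignores any quoting).

```python
# Section keywords as listed in the text above.
SECTION_NAMES = {"DELIMITERS", "SETS", "MAPPINGS", "CONSTRAINTS", "END"}


def strip_comment(line):
    """Drop everything from the first '#' onwards (naive: ignores quoting)."""
    return line.split("#", 1)[0].rstrip()


def split_sections(text):
    """Return a list of (section_name, [lines]) pairs in file order."""
    sections = []
    for raw in text.splitlines():
        line = strip_comment(raw).strip()
        if not line:
            continue
        if line in SECTION_NAMES:
            # Assumption: a keyword alone on a line opens a new section.
            sections.append((line, []))
        elif sections:
            sections[-1][1].append(line)
    return sections


# Placeholder content only; these are not real CG rules.
example = """\
DELIMITERS
delimiter-definition   # placeholder
SETS
set-definition-1
MAPPINGS
mapping-rule-1
CONSTRAINTS   # e.g. morphological disambiguation
rule-a
CONSTRAINTS   # e.g. a later pass
rule-b
END
"""

sections = split_sections(example)
# Constraint sections are kept in file order, one list per section.
constraint_passes = [body for name, body in sections if name == "CONSTRAINTS"]
```

Keeping the sections as an ordered list rather than a dictionary preserves the file order of repeated CONSTRAINTS sections, which is the property the text relies on.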

DELIMITERS

This section tells the compiler which text window the rules are to be applied to. In the case of PALAVRAS the following punctuation delimiters are included:

;

Note that quotes and single hyphens are not included. As a result, complex sentences with parenthetical clauses may cause trouble for rules based on, e.g., the uniqueness principle. On the other hand, it is easier to satisfy, for instance, verbal valency in a larger window.
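The trade-off between a smaller and a larger window can be made concrete with a short Python sketch. The function `split_windows`, the token list, and the decision to keep each delimiter as the last token of its window are assumptions made for illustration; they do not describe how the cg2 compiler itself segments text.

```python
def split_windows(tokens, delimiters):
    """Cut a token sequence into windows, closing a window at each delimiter.

    Assumption: the delimiter token stays in the window it closes.
    """
    windows, current = [], []
    for tok in tokens:
        current.append(tok)
        if tok in delimiters:
            windows.append(current)
            current = []
    if current:
        # Trailing tokens without a final delimiter form their own window.
        windows.append(current)
    return windows


tokens = ["He", "left", "-", "quietly", "-", "at", "noon", ".",
          "She", "stayed", "."]

# Treating the hyphen as a delimiter cuts the parenthetical out of its clause.
narrow = split_windows(tokens, {".", "-"})

# Leaving the hyphen out keeps the whole sentence in one window.
wide = split_windows(tokens, {"."})
```

With the hyphen excluded, the verb and the later adverbial end up in the same window, so a rule looking for material to the right of the verb can still see it. The price is that the parenthetical material sits in that window as well.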

A few special non-punctuation delimiters are used: which is automatically added to mark the left-hand border of the first sentence in a text, and which is used for graphical line breaks in newspaper corpora, in connection with otherwise undelimited headlines or pictures.

SETS

In the cg2 compiler, rules can apply not only to word forms or their tags, but also to sets of words or tags or combinations of these. A set definition is introduced by:

(a) LIST set-name =
followed by a list of set elements (tags or tag combinations), separated by blanks, or

(b) SET set-name =
followed by a list of pre-defined sets (or tags in parentheses), linked by set operators.
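The difference between the two kinds of definition can be modelled loosely in Python. This is only an analogy and not cg2 syntax: the set names, the tags, and the `matches` helper are hypothetical, and set union stands in for whatever set operators the compiler provides.

```python
# LIST-style definitions: each element is a tuple of tags, so a tag
# combination such as ("N", "PL") can sit next to a single tag ("ADJ",).
# All names and tags here are hypothetical.
LISTS = {
    "NOUN": {("N",)},
    "PLURAL-NOUN": {("N", "PL")},
    "ADJ": {("ADJ",)},
}


def matches(reading_tags, set_elements):
    """A reading matches if it carries every tag of at least one element."""
    return any(set(elem) <= set(reading_tags) for elem in set_elements)


# SET-style definition: built from pre-defined sets with an operator.
# Union is used here purely as an illustrative operator.
NOMINAL = LISTS["NOUN"] | LISTS["ADJ"]
```

A reading tagged `["N", "PL", "F"]` would match `PLURAL-NOUN` under this model because it carries both tags of the combination, whereas `["N", "SG"]` would not.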

Elements in (a) can be:<br />
