Eckhard Bick - VISL

is due to the fact that, in my CG notation, subclause function is tagged as a number two tag onto complementisers or main verbs, but appears as its own node in the constituent tree analysis (on which the table is based). Though PALAVRAS tags co-ordinators for what they co-ordinate (e.g. for direct object co-ordination), the CG-to-tree transformation program used in the evaluation did not yet handle co-ordination, so paratactic attachment was not quantified. A distinction was made between clause- and group-level @PRED, but appositions (@APP) were regarded as clause-level constituents by the tree-generator.

The shaded columns (tag only) contain data directly reflecting CG tag output, while the bold-face columns (tag/attachment) show the decrease in performance if attachment errors are counted, too, even where function tags are correct. The third column type (attachment only), finally, reflects pure attachment performance, judging the tree as such, without taking function tag errors into account.
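The three column types can be sketched as three scoring modes over token-level analyses. The following is a minimal illustration, assuming each token is represented as a hypothetical (function_tag, head_index) pair; the actual evaluation tooling used for the table is not described in the text.

```python
def evaluate(gold, system):
    """Score a system analysis against a gold standard in three modes.

    gold, system: lists of (function_tag, head_index) pairs, one per token.
    Returns the proportion of correct tokens for each evaluation mode.
    """
    assert len(gold) == len(system)
    n = len(gold)
    tag_ok = sum(g[0] == s[0] for g, s in zip(gold, system))   # function tag correct
    att_ok = sum(g[1] == s[1] for g, s in zip(gold, system))   # attachment correct
    both_ok = sum(g == s for g, s in zip(gold, system))        # both correct
    return {
        "tag only": tag_ok / n,          # shaded columns
        "tag/attachment": both_ok / n,   # bold-face columns
        "attachment only": att_ok / n,   # third column type
    }

# Toy example: token 2 has a wrong tag, token 3 a wrong attachment.
gold = [("@SUBJ", 2), ("@ACC", 2), ("@>N", 1)]
system = [("@SUBJ", 2), ("@SC", 2), ("@>N", 3)]
scores = evaluate(gold, system)
```

Counting tag and attachment errors separately in this way is what allows the "decrease in performance" between the shaded and bold-face columns to be read off directly.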

For the system as a whole, PALAVRAS’ recall and precision converge on the 97% syntactic tag correctness mark known from other text samples (cp. chapter 3.9).[249] The fact that not much (0.3%) is lost when pure attachment errors are included is encouraging proof that CG-to-tree transformation is, in fact, feasible. A recall and precision for dependency per se[250] of over 97.9% and 99.5%, respectively, suggest that the attachment information contained in PALAVRAS’ output is actually more robust than its function tag information.

There is a fair deal of variation in the specific performance data for individual constituents. By comparison to the English FDG, PALAVRAS has a better recall for @SUBJ and @ACC, and a slightly worse one for @SC.

Interestingly, subjects have a high precision and a (relatively) low recall, while direct objects have a (relatively) low precision but a high recall, suggesting that the present rule set could be biased in favour of @ACC and against @SUBJ.
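The high-precision/low-recall vs. low-precision/high-recall pattern is exactly what a systematic @SUBJ → @ACC confusion would produce. A minimal per-label precision/recall computation, with invented toy counts (not the thesis data), makes the asymmetry concrete:

```python
def per_label_pr(gold_tags, sys_tags, label):
    """Precision and recall for one function tag label."""
    pairs = list(zip(gold_tags, sys_tags))
    tp = sum(g == label and s == label for g, s in pairs)  # correctly tagged
    fp = sum(g != label and s == label for g, s in pairs)  # wrongly assigned
    fn = sum(g == label and s != label for g, s in pairs)  # missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical sample: one gold @SUBJ is mis-tagged as @ACC.
gold = ["@SUBJ"] * 4 + ["@ACC"] * 4
sys  = ["@SUBJ"] * 3 + ["@ACC"] * 5
subj_p, subj_r = per_label_pr(gold, sys, "@SUBJ")  # high P, lower R
acc_p, acc_r = per_label_pr(gold, sys, "@ACC")     # lower P, high R
```

In this toy sample @SUBJ comes out with perfect precision but reduced recall, while @ACC shows the mirror image, matching the bias described above.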

Apart from the @ACC - @SUBJ discrepancy, performance was best for verbal function and subordinators (where the real disambiguation task resides in the morphology), as well as prenominals (@>N) and arguments of prepositions (@P&lt;).
