21.04.2013 Views

Eckhard Bick - VISL


Intuitively, one might assume

(a) that a subject reading is more likely before the predicator than after it, and

(b) that noun phrases denoting humans are more likely than others to function as agents, and might therefore have a greater affinity for the subject function.

Whereas (a) is a syntactic rule and fits in naturally with the CG rules on the syntactic level, (b) presupposes semantic lexical information that must be expressed as secondary tags, i.e. tags that are not themselves (on this level!) intended for disambiguation.

In order to test the two assumptions, I have statistically analysed the computer's parses for one and a quarter million words, as shown in table (1). Since shorter, manually controlled texts show the parser's syntactic error rate to be lower than 3% (cf. chapter 3.9), the dubious cases will disappear in a sea of safe, correct readings (such as those where the uniqueness principle can be applied, or where verbs have obligatory direct objects), and therefore distributional patterns may be trusted even when derived from automatic analysis alone. Even if all errors were subject-object errors (which they are not!), a ±3% margin of error would not change much in the ratios calculated below.
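The worst-case reasoning in the preceding paragraph can be made concrete with a small sketch. It assumes, purely for illustration, that the parser's overall error ceiling of 3% applies directly to the subject and direct object tags counted within one semantic group, and that every erroneous tag is a subject-object swap in the same direction. The function name and the counts are hypothetical and are not taken from the corpus.

```python
def subject_share_bounds(n_subject, n_object, error_rate=0.03):
    """Return (low, point, high) for the subject share among subject+object tags.

    Assumption (worst case): at most `error_rate` of the counted tags are
    wrong, and all wrong tags are subject/object swaps in one direction.
    """
    total = n_subject + n_object
    if total <= 0:
        raise ValueError("need at least one subject or object tag")
    max_swapped = error_rate * total          # largest number of tags that could flip
    point = n_subject / total                 # share as measured
    low = max(0.0, (n_subject - max_swapped) / total)   # all errors were false subjects
    high = min(1.0, (n_subject + max_swapped) / total)  # all errors were false objects
    return low, point, high


# Hypothetical counts, invented for illustration only.
low, point, high = subject_share_bounds(n_subject=800, n_object=200)
print(f"subject share: {point:.1%} (worst case {low:.1%} to {high:.1%})")
```

Under these assumptions the measured share can shift by at most three percentage points in either direction, so a clear preference for one function within a group survives even the most pessimistic reading of the error rate.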

(1) The influence of the semantic feature on the probability of subject tags vs. direct object tags (573.285 words from VEJA, plain numbers, and 690.269 words from the Borba-Ramsey corpus, numbers in italics). Percentages measure the frequency of a given function within a certain semantic group.

