21.04.2013

Eckhard Bick - VISL

VFIN                   9185   3748     19    375   1079    540    28274
INF                             11      0      0      0     26     4664
GER                                     0      0      0      1       35
PCP                                           88     16     23     5033
ADV                                                2670     33     7023
PROP                                                       283     3762
all                                                               54633
words:  69603  17950  30619   4970    903   5335  13938  11704   121170

Since this statistical analysis ignores closed class words, the overall ambiguity figures will
obviously be lower than what is found for the language as a whole (about 1.7 readings
per word form when using portmanteau tags, 2.0 when not). When word class internal
inflexion and subclass ambiguity (shaded) are also ignored, the 121,170 potential open
class words get 155,022 different word class readings (about 1.28 per potential open
class word form). In all, the text contains 170,998 open class readings (about 1.41 per
potential open class word form). The remaining 0.3 readings per word form (to reach
1.7) can be accounted for as the sum of cross-group ambiguity between the closed and
open word class groups, plus closed-class internal ambiguity.
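The per-word ratios quoted above can be rechecked with a few lines of arithmetic. The sketch below is only an illustration: the variable names are invented here, and it assumes that the "words:" row of the preceding table lists per-class counts in the column order N, ADJ, VFIN, INF, GER, PCP, ADV, PROP, followed by the number of potential open class words.

```python
# Per-class figures from the "words:" row (column order is an assumption).
words_row = {
    "N": 69603, "ADJ": 17950, "VFIN": 30619, "INF": 4970,
    "GER": 903, "PCP": 5335, "ADV": 13938, "PROP": 11704,
}
potential_open_class_words = 121_170
open_class_readings = 170_998        # all open class readings in the text
overall_readings_per_word = 1.7      # whole language, with portmanteau tags

# Summing the per-class figures gives the number of word class readings.
word_class_readings = sum(words_row.values())

word_class_ratio = word_class_readings / potential_open_class_words
open_class_ratio = open_class_readings / potential_open_class_words
# What is left for cross-group and closed-class internal ambiguity.
remainder = overall_readings_per_word - open_class_ratio

print(word_class_readings)           # 155022
print(f"{word_class_ratio:.2f}")     # 1.28
print(f"{open_class_ratio:.2f}")     # 1.41
print(f"{remainder:.1f}")            # 0.3
```

Under that reading of the row, the per-class figures add up to exactly the 155,022 word class readings mentioned in the text.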

As can be seen, the most common ambiguity is the N-VFIN class, followed
closely by N-ADJ and VFIN-VFIN internal ambiguity. Of these, the first is syntactically
most important, since an error here will cause additional errors in the syntactic tags. The
risk of such error spreading is smaller for N-ADJ and very small for word class internal
ambiguities like VFIN-VFIN.

Apart from sheer number, the importance of an ambiguity class must, however, be
measured against the size of the word classes in question. N is a very large word
class, which may explain its high ambiguity count in absolute terms. But how large is
the ambiguity risk for, say, a noun in relative terms?

(2) Table: relative frequencies for word class ambiguity

WC1 \ WC2      N    ADJ   VFIN    INF    GER    PCP    ADV   PROP   ambiguity
             (%)    (%)    (%)    (%)    (%)    (%)    (%)    (%)       index
N            3.1   13.3   15.7    1.1    0.0    3.1    3.0    2.8        42.2
ADJ         51.6    1.3   13.2    0.6    0.0   13.0    6.5    5.1        91.5
VFIN        35.8    7.7   30.0   12.2    0.1    1.2    3.5    1.8        96.0
INF         15.4    2.3   75.4    0.2    0.0    0.0    0.0    0.5        93.8
GER          0.7    1.0    2.1    0.0    0.0    0.0    0.0    0.1         3.9
PCP         41.2   43.7    7.0    0.0    0.0    1.6    0.3    0.4        94.3
ADV         14.8    8.3    7.7    0.0    0.0    0.1   19.2    0.2        50.4
PROP        16.6    7.8    4.6    0.2    0.0    0.2    0.3    2.4        32.2
all                                                                      45.1
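The relation between the absolute counts and the percentages in table (2) can be illustrated with a short sketch. It rests on assumptions not stated explicitly in the text: that each percentage is the pair count divided by the number of words in the row class WC1, that pair counts are symmetric (N-ADJ equals ADJ-N), and that the count fragment above is aligned to the column order N, ADJ, VFIN, INF, GER, PCP, ADV, PROP. Only a handful of cells are included, and all names are hypothetical.

```python
# Words per class, from the "words:" row of the count table.
WORDS = {
    "N": 69603, "ADJ": 17950, "VFIN": 30619, "INF": 4970,
    "GER": 903, "PCP": 5335, "ADV": 13938, "PROP": 11704,
}
# A few absolute pair counts from the count table (assumed column alignment).
PAIR_COUNTS = {
    ("VFIN", "VFIN"): 9185,
    ("VFIN", "INF"): 3748,
    ("PCP", "PCP"): 88,
    ("ADV", "ADV"): 2670,
    ("PROP", "PROP"): 283,
}
# Row totals ("all" column of the count table) for some classes.
ROW_TOTALS = {"INF": 4664, "GER": 35, "PCP": 5033, "ADV": 7023}
ALL_AMBIGUOUS = 54633
POTENTIAL_OPEN_CLASS_WORDS = 121170


def relative_frequency(wc1: str, wc2: str) -> float:
    """Percentage of WC1 words that are ambiguous with WC2."""
    count = PAIR_COUNTS.get((wc1, wc2), PAIR_COUNTS.get((wc2, wc1)))
    if count is None:
        raise KeyError(f"no count recorded for {wc1}-{wc2}")
    return 100.0 * count / WORDS[wc1]


def ambiguity_index(wc1: str) -> float:
    """Percentage of WC1 words with any word class ambiguity."""
    return 100.0 * ROW_TOTALS[wc1] / WORDS[wc1]


print(f"{relative_frequency('VFIN', 'INF'):.1f}")  # 12.2
print(f"{relative_frequency('INF', 'VFIN'):.1f}")  # 75.4
print(f"{ambiguity_index('ADV'):.1f}")             # 50.4
print(f"{100.0 * ALL_AMBIGUOUS / POTENTIAL_OPEN_CLASS_WORDS:.1f}")  # 45.1
```

The same pair count thus yields very different risks depending on the class it is measured against: the 3,748 VFIN-INF cases affect about 12% of finite verb forms but about 75% of infinitives.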
