15.06.2013 Views

Selected Papers from the Fourteenth International ... - STIBA Malang


Ans van Kemenade, Tanja Milicev & R. Harald Baayen

The format gives first the text reference, then a number of parameter values, then the source file, named after the query files with which YCOE was searched, then the source text, and, finally, the subperiod of the text as defined in the Helsinki Corpus.

The results coded in the comma-separated file were then read into R (R Development Core Team 2004). We analysed the data with a generalized linear mixed model (Baayen 2007 & Faraway 2006), with NP Specificity as the binary dependent variable, to model the probability of a specific realization of the high NP. The text in which an example was attested was included as a random-effect factor in the model. Two fixed-effect predictors emerged as significant. As shown in Figure 1, the likelihood of a specific realization of the high NP decreased when the position of the NP was mid rather than high (log odds contrast coefficient –1.46, p < 0.0002). Figure 1 also shows that this likelihood increased for NPs realizing proper names (log odds contrast coefficient 2.76, p = 0.0186) and decreased for indefinite NPs (log odds contrast coefficient –4.24, p < 0.0001). The standard deviation of the text random variable was estimated at 1.055. The estimated scale was 0.993, indicating that the use of a binomial link function for this data set is fully justified.
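Because the coefficients above are reported on the log odds scale, it may help to see how they translate into odds ratios and probabilities. The following is a minimal sketch, not the authors' analysis (which was carried out in R): the function names are illustrative, and the baseline log odds of 0 (a probability of 0.5) is a purely hypothetical value chosen for illustration, since the intercept of the model is not reported in the text.

```python
import math


def odds_ratio(log_odds_contrast: float) -> float:
    """Convert a log odds contrast coefficient into an odds ratio."""
    return math.exp(log_odds_contrast)


def inverse_logit(log_odds: float) -> float:
    """Map a value on the log odds scale to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Contrast coefficients as reported in the text (log odds scale).
reported_contrasts = {
    "mid vs. high position": -1.46,
    "proper name": 2.76,
    "indefinite": -4.24,
}

# ASSUMPTION: hypothetical baseline log odds; the model intercept is not
# given in the text, so 0.0 (probability 0.5) is used only for illustration.
BASELINE_LOG_ODDS = 0.0

for label, coefficient in reported_contrasts.items():
    ratio = odds_ratio(coefficient)
    probability = inverse_logit(BASELINE_LOG_ODDS + coefficient)
    print(f"{label}: odds ratio {ratio:.3f}, "
          f"probability under assumed baseline {probability:.3f}")
```

A negative coefficient corresponds to an odds ratio below 1 (a specific realization becomes less likely), and a positive coefficient to an odds ratio above 1. The probabilities printed here depend entirely on the assumed baseline and should not be read as estimates from the model.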

[Figure: two panels plotting the probability of a specific NP (vertical axis, 0.0 to 1.0) against position of NP (high, mid) and against definiteness (definite, indefinite, proper name).]

Figure 1. Relation between NP-type, NP position and specificity of NP.

Furthermore, we measured the relevance of a discourse antecedent. A generalized linear mixed-effects model with a binomial link revealed that the likelihood of a specific realization of the high NP decreased when it appeared in the mid position rather than in the high position (estimated log odds contrast coefficient −1.18, p < 0.0001) and that it increased when an antecedent was present (estimated log odds contrast coefficient 1.67, p < 0.0001); see Figure 2. The estimated
