
Unni Cathrine Eiken February 2005



Training and testing the classifier on the EPAS_arg2 list with no pronouns produced an accuracy of 49.73%. As was the case for the corresponding classification of argument 1, it is likely that the relatively small dataset is a disadvantage for the classification process.

4.1.3 Comments on the results

The results obtained by classifying the EPAS indicate that the information present in the EPAS derived from the text collection does provide clues about which word to expect in a specific position. The accuracy scores obtained by training and testing on the EPAS extracted from a collection of texts suggest that even a small collection of texts from the same domain provides enough information to enable a classification approach based on contextual distribution. In the tests described above, there was a recurring tendency: in a number of the cases where the wrong category was assigned in the test phase, the assigned category bore some semantic resemblance to the correct category. This reinforces the initial intuition that similar words are used in similar environments and that the environment can contribute clues to the meaning of a word.

In the following section, the notion of finding words which are similar to each other by virtue of occurring in the same environments will be explored further.

4.2 Step II: Association of concept groups<br />

The fundamental idea in this thesis is that words display certain semantic features based solely on the context they are found in. Therefore, when looking for possible antecedents for an anaphoric expression, the candidates should be weighted not only according to their co-occurrence in an identical context pattern in a corpus, but also according to their co-occurrence with similar context patterns. The assumption that words which occur in identical contexts have related meanings can be used to retrieve words with similar meanings from the data material.
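
As a minimal illustration of this assumption, and not the association method used in the thesis, the following Python sketch collects the contexts in which each argument 2 word occurs and returns the words that share at least one context with a given word. The tuple layout (predicate, argument 1, argument 2), the function name shared_context_words and the toy entries are all hypothetical and chosen only for this example.

```python
from collections import Counter, defaultdict

# Hypothetical toy EPAS entries, assumed here to be
# (predicate, argument_1, argument_2) tuples. Not data from the thesis.
EPAS = [
    ("arrest", "police", "man"),
    ("arrest", "police", "suspect"),
    ("question", "police", "suspect"),
    ("question", "police", "witness"),
    ("find", "police", "car"),
]


def shared_context_words(epas, target):
    """Return (word, shared_context_count) pairs for argument 2 words
    that occur in at least one identical context as `target`.

    A context is assumed to be the (predicate, argument_1) pair.
    """
    # Map every argument 2 word to the set of contexts it occurs in.
    contexts = defaultdict(set)
    for predicate, arg1, arg2 in epas:
        contexts[arg2].add((predicate, arg1))

    target_contexts = contexts.get(target, set())

    # Count how many contexts each other word shares with the target.
    counts = Counter()
    for word, word_contexts in contexts.items():
        if word == target:
            continue
        shared = len(word_contexts & target_contexts)
        if shared:
            counts[word] = shared

    # Words sharing more contexts with the target come first.
    return counts.most_common()


if __name__ == "__main__":
    print(shared_context_words(EPAS, "suspect"))
```

In this toy data, "man" and "witness" each share one context with "suspect", while "car" shares none and is therefore not returned. Counting identical contexts is the simplest possible choice; weighting by similar, rather than identical, context patterns would require an additional similarity measure that is not sketched here.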

With a target argument and a target predicate as the starting point, the association method goes through the EPAS list and returns words which occur in environments similar to those of the target argument. This association is performed in three steps:
