Unni Cathrine Eiken February 2005



4.4 Are concept classes useful for anaphora resolution?

The EPAS list has been processed in different ways in this chapter. The tests described provide an indication of how context patterns extracted from the text collection can be used to create expectations about which words (or which types of words) are likely to occur in a given contextual environment. These expectations can be used to anticipate which word, or rather which concept, might be the antecedent of an anaphor. The concept groups which emerged in the association process are simply classes of semantically related words which tend to have similar contextual distributions within the domain of the text corpus.

In order to indicate the usefulness of such concept classes in the process of resolving an anaphor, the test set of the EPAS list (all EPAS containing pronouns) was processed with different methods. The results of these methods are shown in (4-13). In addition to the TiMBL tests described above, the anaphors in the test set were resolved manually using the Lappin and Leass approach described in section 2.1.2. For these tests, the sentence containing the anaphor, as well as the preceding sentence, was considered in each case. This purely syntactic approach identified the correct antecedent in 16 of the 32 test instances.

(4-13)
Method                      Correct assignments
Syntactic method            50% (16/32)
TiMBL                       46.87% (15/32)
TiMBL with concept groups   78.12% (25/32)

The results shown in (4-13) suggest that using concept groups may indeed be a useful approach in anaphora resolution. Especially in the case of anaphoric expressions where the antecedent is not clearly stated in the text, it may be useful to have an idea of which type of antecedent one might expect. 10 of the 32 EPAS containing pronouns were of this kind. The syntactic approach could naturally not resolve these anaphors, as an antecedent which is not clearly present in the text can hardly feature on a list of possible candidates.
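To make the notion of an expected antecedent more concrete, the following is a minimal Python sketch of how an EPAS list might be consulted for a pronoun argument. It is an illustration under stated assumptions, not the TiMBL-based procedure used in the tests: the triples are the ones listed in example (4-14) of this section, while the function names, the fallback to EPAS sharing the second argument, and the single concept group are hypothetical choices made for the example.

```python
from collections import Counter

# EPAS triples (predicate, first argument, second argument) as listed in
# example (4-14); a full EPAS list would be extracted from the text corpus.
EPAS_LIST = [
    ("ha", "etterforsker", "observasjon"),
    ("ha", "politi", "medarbeider"),
    ("ha", "politi", "teori"),
    ("forkaste", "politi", "teori"),
    ("komme-i-kontakt-med", "politi", "bilfører"),
    ("komme-i-kontakt-med", "politi", "generic-nom"),
]

# Assumed concept group: words taken to have similar contextual distributions.
CONCEPT_GROUPS = [{"politi", "etterforsker", "lensmann", "Fonn"}]


def candidate_antecedents(predicate, second_arg, epas_list):
    """Return first arguments attested in similar EPAS, most frequent first.

    EPAS with the same predicate are used when there are any; otherwise
    the search falls back to EPAS that share the second argument.
    """
    similar = [e for e in epas_list if e[0] == predicate]
    if not similar:
        similar = [e for e in epas_list if e[2] == second_arg]
    counts = Counter(first for _, first, _ in similar)
    return [word for word, _ in counts.most_common()]


def expand_with_concepts(candidates, concept_groups):
    """Return further words from any concept group containing a candidate."""
    extra = set()
    for group in concept_groups:
        if group & set(candidates):
            extra |= group
    return sorted(extra - set(candidates))


# The pronoun EPAS (ha, pron, teori): the predicate ha is attested elsewhere.
attested = candidate_antecedents("ha", "teori", EPAS_LIST)
# The pronoun EPAS (jobbe-utfra, pron, teori): the predicate occurs only once,
# so the lookup falls back to EPAS with teori as second argument.
fallback = candidate_antecedents("jobbe-utfra", "teori", EPAS_LIST)
print(attested, expand_with_concepts(attested, CONCEPT_GROUPS))
print(fallback, expand_with_concepts(fallback, CONCEPT_GROUPS))
```

Ranking candidates by frequency is only one possible design choice; with a data set this small, the ranking mainly serves to separate attested antecedents from those that are added through the concept group.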
Anaphors whose antecedent is not clearly stated in the text require real-world or domain knowledge to be resolved. For 4 of the 10 EPAS of this kind, the EPAS list could not be consulted to find likely antecedents. Because of the small size of the data set, some predicates feature only once. This was the case for the five predicates jobbe-utfra (work-from), kartlegge (map), ta (take), varsle (notify) and ville (want), which all co-occur only with pronouns. With the exception of jobbe-utfra, none of the antecedents in these cases can be predicted on the basis of the distribution of predicates and arguments in the EPAS list.

(4-14) shows the instances where the EPAS list could be consulted in the process of finding likely antecedents for these anaphors. In the case of ha (have) and komme-i-kontakt-med (come-into-contact-with), other EPAS with the same predicates were retrieved from the EPAS list. Since ha and komme-i-kontakt-med occur in identical or very similar patterns with politi as the first argument, politi would be the preferred candidate for the antecedent in (4-14a), (4-14c) and (4-14d). In the case of (4-14b), the predicate jobbe-utfra has only this one occurrence in the EPAS list. This means that similar patterns must be examined in the search for a possible antecedent. By consulting the EPAS list, it can be found that teori (theory) occurs as a second argument only in connection with politi as first argument. This would suggest that politi is a potential antecedent for the pronoun.

By applying the concept groups, the list of possible antecedents motivated by the texts can be expanded to also include the other arguments which have been found to display a similar distribution to the arguments which actually co-occur with the predicate in question. In the case of the pronouns in (4-14), politi is the correct antecedent in all of the cases.

(4-14)
   EPAS with pronoun                  Similar EPAS from list                  Antecedents           Concepts
a. ha,pron,teori                      ha,etterforsker,observasjon             politi, etterforsker  lensmann, Fonn
                                      ha,politi,medarbeider
                                      ha,politi,teori
b. jobbe-utfra,pron,teori             ha,politi,teori                         politi                etterforsker, lensmann, Fonn
                                      forkaste,politi,teori
c. komme-i-kontakt-med,pron,bilfører  komme-i-kontakt-med,politi,bilfører     politi                etterforsker, lensmann, Fonn
                                      komme-i-kontakt-med,politi,generic-nom
d. komme-i-kontakt-med,pron,syklist   komme-i-kontakt-med,politi,bilfører     politi                etterforsker, lensmann, Fonn
                                      komme-i-kontakt-med,politi,generic-nom

The examples in (4-14) indicate how the method described in this thesis can function. In cases where there is no clearly expressed antecedent in a text, or where the resolution of an antecedent

