Unni Cathrine Eiken February 2005
4.4 Are concept classes useful for anaphora resolution?

The EPAS list has been processed in different ways in this chapter. The tests described so far indicate how context patterns extracted from the text collection can be used to create expectations about which words (or which types of words) are likely to occur in a given contextual environment. These expectations can be used to anticipate which word, or rather which concept, might be the antecedent of an anaphor. The concept groups which emerged in the association process are simply classes of semantically related words which tend to have similar contextual distributions within the domain of the text corpus.

To indicate how useful such concept classes are when resolving an anaphor, the test set of the EPAS list (all EPAS containing pronouns) was processed with different methods. The results of these methods are shown in (4-13). In addition to the TiMBL tests described above, the anaphors in the test set were resolved manually using the Lappin and Leass approach described in section 2.1.2. For these tests, the sentence containing the anaphor and the preceding sentence were considered in each case. This purely syntactic approach identified the correct antecedent in 16 of the 32 test instances.

(4-13)
Method                      Correct assignments
Syntactic method            50% (16/32)
TiMBL                       46.87% (15/32)
TiMBL with concept groups   78.12% (25/32)

The results shown in (4-13) suggest that using concept groups may indeed be a useful approach in anaphora resolution. Especially for anaphoric expressions whose antecedent is not clearly stated in the text, it may be useful to have an idea of which type of antecedent one might expect. 10 of the 32 EPAS containing pronouns were of this kind. The syntactic approach could naturally not resolve these anaphors, as an antecedent which is not clearly present in the text can hardly feature on a list of possible candidates.
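The idea of turning the EPAS list into an expectation about a missing antecedent can be illustrated with a small sketch. It is not the procedure used in the thesis (the tests were run in TiMBL and partly by hand); it only assumes that EPAS are stored as (predicate, argument 1, argument 2) triples, that first arguments seen with the same predicate are ranked by frequency, and that EPAS sharing the second argument serve as a fallback when the predicate is otherwise unseen. The triples and the concept group are taken from example (4-14) later in this section; the function names are hypothetical.

```python
from collections import Counter

# EPAS triples (predicate, arg1, arg2), copied from example (4-14).
EPAS_LIST = [
    ("ha", "etterforsker", "observasjon"),
    ("ha", "politi", "medarbeider"),
    ("ha", "politi", "teori"),
    ("forkaste", "politi", "teori"),
    ("komme-i-kontakt-med", "politi", "bilfører"),
    ("komme-i-kontakt-med", "politi", "generic-nom"),
]

# One concept group, as listed in (4-14); the grouping step itself is not shown.
CONCEPT_GROUPS = [{"politi", "etterforsker", "lensmann", "Fonn"}]


def candidates_from_list(predicate, arg2, epas_list):
    """Return first arguments that could fill the pronoun slot.

    Assumption of this sketch: candidates are ranked by how often they
    occur as argument 1 of the same predicate. If the predicate never
    occurs with a non-pronoun argument, EPAS with the same second
    argument are used instead.
    """
    counts = Counter(a1 for pred, a1, _ in epas_list if pred == predicate)
    if not counts:
        counts = Counter(a1 for _, a1, a2 in epas_list if a2 == arg2)
    # Most frequent first; ties broken alphabetically for a stable order.
    return sorted(counts, key=lambda word: (-counts[word], word))


def expand_with_concepts(candidates, groups):
    """Append the other members of every concept group a candidate belongs to."""
    expanded = list(candidates)
    for group in groups:
        if any(word in group for word in candidates):
            for word in sorted(group):
                if word not in expanded:
                    expanded.append(word)
    return expanded


if __name__ == "__main__":
    for predicate, arg2 in [("ha", "teori"), ("jobbe-utfra", "teori")]:
        found = candidates_from_list(predicate, arg2, EPAS_LIST)
        print(predicate, found, expand_with_concepts(found, CONCEPT_GROUPS))
```

A predicate with no usable pattern at all simply yields an empty candidate list, in which case the concept groups have nothing to expand.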
Anaphors whose antecedent is not clearly stated in the text require real-world or domain knowledge to be resolved. For 4 of these 10 EPAS, the EPAS list could not be consulted to find likely antecedents. Because of the small size of the data set, some predicates feature only once. This was the case for the five predicates jobbe-utfra (work-from), kartlegge (map), ta (take), varsle (notify) and ville (want), all of which co-occur only with pronouns. With the exception of jobbe-utfra, none of the antecedents in these cases can be predicted on the basis of the distribution of predicates and arguments in the EPAS list.

(4-14) shows the instances where the EPAS list could be consulted in the process of finding likely antecedents for these anaphors. In the case of ha (have) and komme-i-kontakt-med (come-into-contact-with), other EPAS with the same predicates were retrieved from the EPAS list. Since ha and komme-i-kontakt-med occur in identical or very similar patterns with politi as the first argument, politi would be the preferred candidate for the antecedent in (4-14a), (4-14c) and (4-14d). In the case of (4-14b), the predicate jobbe-utfra has only this one occurrence in the EPAS list, so similar patterns must be examined in the search for a possible antecedent. Consulting the EPAS list shows that teori (theory) occurs as a second argument only in connection with politi as first argument. This suggests that politi is a potential antecedent for the pronoun.

By applying the concept groups, the list of possible antecedents motivated by the texts can be expanded to include the other arguments which have been found to display a distribution similar to that of the arguments which actually co-occur with the predicate in question. For the pronouns in (4-14), politi is the correct antecedent in all cases.

(4-14)
    EPAS with pronoun                    similar EPAS                              antecedents from list   concepts
a.  ha,pron,teori                        ha,etterforsker,observasjon               politi, etterforsker    lensmann, Fonn
                                         ha,politi,medarbeider
                                         ha,politi,teori
b.  jobbe-utfra,pron,teori               ha,politi,teori                           politi                  etterforsker, lensmann, Fonn
                                         forkaste,politi,teori
c.  komme-i-kontakt-med,pron,bilfører    komme-i-kontakt-med,politi,bilfører       politi                  etterforsker, lensmann, Fonn
                                         komme-i-kontakt-med,politi,generic-nom
d.  komme-i-kontakt-med,pron,syklist     komme-i-kontakt-med,politi,bilfører       politi                  etterforsker, lensmann, Fonn
                                         komme-i-kontakt-med,politi,generic-nom

The examples in (4-14) indicate how the method described in this thesis can function. In cases where there is no clearly expressed antecedent in a text, or where the resolution of an antecedent