Unni Cathrine Eiken February 2005



with such an approach is stated by the developers: “the strategy is simple, but requires a fairly large amount of knowledge to be useful for a broad range of cases” (Carbonell and Brown 1988, p. 97). Generally speaking, the knowledge bases that knowledge-based systems for anaphora resolution rely on are difficult to represent and process, and require a considerable amount of human input (Mitkov 2001, p. 110). The information is structured using different frameworks; often each anaphora resolution system structures its knowledge base in a system-specific manner. Rather than giving an outline of various specific methods belonging to the traditional approaches, some of the formats used for knowledge representation are briefly mentioned below.

Several frameworks have been developed to meet the need for a formalism to represent real-world or domain knowledge. Most of these have been part of specific anaphora resolution systems and have not constituted independent frameworks for the representation of real-world knowledge.

Minsky’s Frames (Minsky 1975, in Botley and McEnery 2000) is a framework for representing knowledge about stereotyped objects and events. The frames are dynamic in the sense that the information they hold about a particular object or event can change if new information is encountered. Input into the system is interpreted in accordance with the information present in the frames; the frames generate expectations about the input (Botley and McEnery 2000, p. 12). When a “shooting frame” is evoked by the sentence in (2-9a), it creates the expectation that whoever misses is likely to be the same person who was doing the shooting. Given such an expectation, it is easy to identify the correct antecedent for the anaphor.
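The way a frame turns stored roles into an expectation about later input can be illustrated with a minimal Python sketch. It is not Minsky's formalism or any existing system: the `Frame` class, the slot names `shooter` and `target`, the `resolve_pronoun` helper and the two-sentence input are all hypothetical, and the input merely stands in for example (2-9a), which is not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """A stereotyped event with named role slots and role expectations."""
    name: str
    slots: dict = field(default_factory=dict)
    # Maps a follow-up event (e.g. "miss") to the slot expected as its subject.
    expectations: dict = field(default_factory=dict)

    def fill(self, slot, value):
        # Frames are dynamic: new input may overwrite what a slot holds.
        self.slots[slot] = value

    def expected_subject(self, verb):
        # Return the filler the frame expects as subject of `verb`, if any.
        slot = self.expectations.get(verb)
        return self.slots.get(slot) if slot else None


def resolve_pronoun(frame, verb, candidates):
    """Pick the candidate that matches the frame's expectation, else None."""
    expected = frame.expected_subject(verb)
    return expected if expected in candidates else None


# Hypothetical input: "John shot at Bill. He missed."
shooting = Frame("shooting", expectations={"miss": "shooter"})
shooting.fill("shooter", "John")
shooting.fill("target", "Bill")

antecedent = resolve_pronoun(shooting, "miss", ["John", "Bill"])
print(antecedent)
```

In this sketch the expectation attached to "miss" selects the filler of the shooter slot, so the pronoun in the second sentence is resolved to the shooter. A verb for which the frame holds no expectation yields no decision, which reflects how much hand-coded knowledge such a system needs to cover a broad range of cases.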
Schank’s Scripts (Schank 1972, in Botley and McEnery 2000) have some similarity to Minsky’s Frames, but are primarily used to represent knowledge about events which do not undergo change (Botley and McEnery 2000, p. 12). Information about role assignment and the sequence of events in given contexts is represented in the script.

2.1.2.3 Alternative approaches to anaphora resolution

Hand-coded knowledge bases that aim at representing real-world or domain knowledge are expensive and labor-intensive to build and maintain. As a consequence, the focus has shifted in the last 15 years toward systems that rely less heavily on world knowledge (see Mitkov 2003 for an overview). Many of these systems incorporate semantic and real-world knowledge, but use methods that allow this information to be collected with a high degree of automation (Baldwin 1997; Dagan and Itai 1990; Dagan et al. 1995; Nasukawa 1994; inter al.). Mitkov (2003) terms these systems knowledge-poor and attributes their growth in number in recent years to the fact that corpora and similar electronic linguistic resources have become better, larger and more available. Some of these systems do not really attempt to build a world- or domain-knowledge base (Baldwin 1997; Nasukawa 1994), but rather look at features such as co-occurrence patterns in the text itself, while others integrate corpora and use them as a form of abstract knowledge base (Dagan and Itai 1990; Dagan et al. 1995).

Among the different “alternative” approaches, Dagan and Itai’s (1990) statistical approach, Dagan et al.’s (1995) estimation of unseen patterns and Nasukawa’s (1994) knowledge-free method are of particular interest for this project.

Dagan and Itai’s (1990) method uses co-occurrence patterns observed in a corpus as a type of selectional restriction. Co-occurrence patterns observed in a large corpus are thought to reflect the semantic constraints that apply to natural language. Candidate antecedents for the anaphor it are identified in the text and substituted for the anaphor to be resolved. This produces co-occurrence patterns that are checked against the corpus. The candidate present in the most frequently occurring co-occurrence pattern is then chosen as the antecedent. This method relies on a large corpus, as only patterns which have actually been seen in the corpus are considered. Infrequent patterns will not be picked, since they will generally not appear at the top of the pattern list.

Dagan et al. (1995) offer a solution to this problem by presenting a similar method which also estimates the probability of co-occurrence patterns that have not been observed in the training data. They state the importance of distinguishing between probable and improbable unobserved co-occurrence patterns and emphasise that the “distinctions ought to be made using the data that do occur in the corpus” (Dagan et al. 1995, p. 164). Analogies are made between specific unseen co-occurrence patterns and observed co-occurrences which contain similar words, with word similarity determined by a similarity metric. Patterns that contain words similar to the target word and that have been observed in the training data are used to calculate how likely the target word is to occur in the same pattern.

Nasukawa (1994) reports a resolution rate of 93.8% with an even more knowledge-poor method for pronoun resolution. Instead of drawing information from a corpus, word frequency and co-occurrence patterns in the text itself are used to filter out the most likely candidate for the antecedent. In Nasukawa’s approach, inter-sentential data is

