
Unni Cathrine Eiken February 2005



neighbours. For example, a target word’s local context can be seen as its subject and object, or as the adjective preceding it.

Several studies show that classifying words by the local context in which they occur yields information about the meaning of the words themselves, rather than about their membership in a thematic domain, which is what examining the topical context reveals (Hindle 1990; Grefenstette 1992; Lin 1998; Lin and Pantel 2001; Pereira et al. 1993; inter al.). This indicates that features within a word’s local context can say something about the meaning of the word and can ultimately serve as a foundation for forming concept groups of semantically similar words. Distributional representations based on a word’s local context are useful for measuring the semantic similarity of words. Lin (1997) exploits this in an algorithm for word sense disambiguation and states that local context gives crucial clues about the meaning of a word, following the intuition that:

“Two different words are likely to have similar meanings if they occur in identical local contexts.” (Lin 1997, p. 64).
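The intuition quoted above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the local context of a word is the set of its immediate left and right neighbours in a tokenised sentence; the function names, the window size and the toy corpus are hypothetical and are not taken from Lin (1997), who works with syntactic dependencies rather than a plain word window.

```python
from collections import defaultdict


def local_contexts(sentences, window=1):
    """Map each word to the set of (offset, neighbour) features seen around it.

    Assumption: local context = words within `window` positions of the target.
    """
    contexts = defaultdict(set)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            for offset in range(-window, window + 1):
                j = i + offset
                if offset != 0 and 0 <= j < len(tokens):
                    contexts[word].add((offset, tokens[j]))
    return contexts


def shared_contexts(contexts, a, b):
    """Return the local-context features that words a and b have in common."""
    return contexts[a] & contexts[b]


# Toy corpus, invented for illustration only.
corpus = [
    "the police arrested the suspect".split(),
    "the police arrested the man".split(),
    "the old man slept".split(),
]

ctx = local_contexts(corpus)
print(sorted(shared_contexts(ctx, "suspect", "man")))
```

In this toy corpus the two nouns share only the preceding determiner, which shows why a plain word window is a weak stand-in for the subject, object and adjective relations described in the text.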

Since the local context can comprise both syntactic and semantic information, it provides a means of accessing the kind of information that is relevant to the analysis to be performed on the material. Several approaches describe methods for finding similar nouns based on the distributional patterns of words in the local context (Hindle 1990; Grefenstette 1992; Lin 1998; Lin and Pantel 2001; Pantel and Lin 2002; Pereira et al. 1993; inter al.). These methods classify words according to their distributional patterns. Rather than relying on hand-coded semantic knowledge, they infer the required knowledge from a text corpus as part of the analysis process. The approaches differ in how they judge the similarity of words. Below, some of the approaches to finding similar words are described briefly; the similarity metrics, however, will not be discussed in this outline.
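As a rough sketch of what classifying words by their distributional patterns can look like in practice, the example below counts local-context features per word and ranks other words by similarity to a target. Cosine similarity is used only as a generic stand-in, since the metrics of the cited approaches are not discussed here; the feature labels (such as `obj-of:drink`), the function names and the observations are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def context_vectors(pairs):
    """Build a feature-count vector for each word from (word, feature) pairs."""
    vectors = defaultdict(Counter)
    for word, feature in pairs:
        vectors[word][feature] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (stand-in metric)."""
    dot = sum(u[f] * v[f] for f in u if f in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_similar(vectors, target):
    """Rank all other words by similarity to the target, most similar first."""
    target_vec = vectors[target]
    scores = [
        (other, cosine(target_vec, vec))
        for other, vec in list(vectors.items())
        if other != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical observations: each noun paired with a local-context feature.
observations = [
    ("beer", "obj-of:drink"), ("beer", "obj-of:brew"),
    ("wine", "obj-of:drink"), ("wine", "obj-of:pour"),
    ("car", "obj-of:drive"), ("car", "obj-of:park"),
]

vectors = context_vectors(observations)
ranked = most_similar(vectors, "beer")
print(ranked)
```

No semantic knowledge is coded by hand here: the ranking follows entirely from the co-occurrence counts, which is the property the approaches above have in common.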

Hindle (1990) shows that the contextual distribution of words provides a useful semantic classification, even when the classification process is fully automated and involves no human intervention. His method examines predicate-argument structures in a large corpus and automatically classifies nouns into semantically similar sets on the basis of the predicates they combine with. The similarity between nouns is measured as a function of mutual

