Unni Cathrine Eiken February 2005
2.2.2 Different types of context<br />
So far, we have argued that using context as a tool to indicate the semantic meaning of a word is<br />
a useful method in linguistics. The method’s theoretical foundation dates back to the middle of<br />
the twentieth century, but the method itself has not been pursued much in the last few decades. Even though the<br />
linguistic foundation of this method has been discussed, the advance in computational resources<br />
in recent years has brought this approach forward again. The different<br />
types of context that can be taken into consideration have, however, not yet been discussed in this thesis.<br />
Accepting that the semantic meaning of a word is suggested by the linguistic<br />
context in which it occurs, or “the company it keeps”, supports the notion that different words<br />
used in the same context are semantically similar. It does not, however, provide a means for<br />
calculating the degree of this similarity or even for finding out exactly which words are similar to<br />
each other. Depending on the information one wants to obtain about a target word, different<br />
context types mirror different aspects of the semantic meaning of a word. Any approach that<br />
attempts to describe semantic meaning based on the contextual distribution of words in a text<br />
collection must first define the type of context that will best reflect the desired information.<br />
Somewhat simplified, we distinguish between topical context and local context.<br />
2.2.2.1 Topical context<br />
Topical context (Miller and Leacock 2000), or document context, is a rather broad term that<br />
covers what we could call the “wide conception” of what context is. All other content words<br />
which occur in the same environment as a target word are considered to make up the context of<br />
the word, and following the discussion above, contribute to indicating the semantic meaning of<br />
the target word. A target word’s contextual environment can be further specified depending on<br />
the purpose; in short, the context is simply all the words which occur within a context window<br />
of varying size. The window can be set to cover a certain number of words before and after a<br />
target word, or to consist of the entire document the target word occurs in. Different<br />
parameters determine the weighting of each word found within the context window; for example,<br />
words can be weighted according to their distance from the target word. One extreme way of<br />
looking at topical context might be the bag of words model, where a document is seen as an<br />
unordered collection of words, and the words are weighted by the number of times they occur in<br />