Unni Cathrine Eiken February 2005
process. To be as useful as possible, the meaning structures should be normalised and<br />
generalisable.<br />
The examples above show how normalisation through use of EPAS realises the concept of<br />
canonical form to some degree and seems particularly useful for the purpose of the present<br />
work. If grammatical relations such as subject and object were used as reference points, semantically<br />
equivalent sentences, such as (3-2a) and (3-2b), would be given different meaning structures due<br />
to the difference in verbal voice. Structuring the meanings conveyed by the sentences in (3-2)<br />
within a grammatical relations paradigm would make it necessary to mark the verbal voice as<br />
well as the grammatical relations. In addition, active and passive structures would have to be<br />
treated differently in the subsequent analysis. Basing the extraction merely on syntactic<br />
properties of the sentences in the corpus would make the extracted material very difficult to<br />
classify, mainly because similar meanings would be represented differently.<br />
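The contrast described above can be illustrated with a minimal sketch. It assumes a simplified clause record and a two-argument normalisation; the names `Clause`, `by_grammatical_relation` and `normalised` are hypothetical and do not reproduce the EPAS format used in this work. The two clauses are assumed to be an active and a passive rendering of the same event, built from the lemmas drepe, morder and kvinne.

```python
from typing import NamedTuple, Optional


class Clause(NamedTuple):
    """A parsed clause (illustrative field names, not the thesis's format)."""
    predicate: str              # lemma of the verb
    voice: str                  # "active" or "passive"
    subject: Optional[str]      # syntactic subject
    complement: Optional[str]   # object (active) or by-phrase agent (passive)


def by_grammatical_relation(c: Clause) -> tuple:
    """Structure keyed on surface relations; voice is ignored."""
    return (c.predicate, c.subject, c.complement)


def normalised(c: Clause) -> tuple:
    """Structure keyed on argument order: (predicate, first arg, second arg).

    For a passive clause the by-phrase agent becomes the first argument
    and the syntactic subject becomes the second.
    """
    if c.voice == "passive":
        return (c.predicate, c.complement, c.subject)
    return (c.predicate, c.subject, c.complement)


# Assumed active and passive variants of the same meaning.
active = Clause("drepe", "active", "morder", "kvinne")
passive = Clause("drepe", "passive", "kvinne", "morder")

print(by_grammatical_relation(active) == by_grammatical_relation(passive))  # False
print(normalised(active) == normalised(passive))                            # True
```

Under these assumptions the grammatical-relation structures differ for the two clauses, so a later classification step would also need the voice marking, whereas the normalised structures coincide.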
The advantages of a normalised and generalisable dataset are further clarified by the following<br />
example. Upon a simple grammatical analysis, the sentences shown in (3-2) can be categorised<br />
based on the syntactic roles predicate, subject and object. The result of such a classification is<br />
shown in examples (3-5) and (3-6):<br />
(3-5)<br />
   predicate   subject    object<br />
a. drepe       morder     kvinne<br />
   kill        murderer   woman<br />
b. drepe       kvinne     morder<br />
   kill        woman      murderer<br />
c. drepe       kvinne     ?<br />
   kill        woman<br />
The structures in (3-5) above can be extracted after part-of-speech tagging of the sentences in<br />
(3-2). The active and passive predicates receive the same structure, and as no semantic<br />
information is available, the structuring of the arguments is in accordance with their status as<br />
subject or object. Attempting to classify these subjects and objects based on their co-occurrence<br />
with the predicate produces groupings of words which are not directly generalisable. Murderer<br />