
Unni Cathrine Eiken February 2005



3.4 Altering the source

As already mentioned, parsing randomly selected Norwegian texts is not an entirely straightforward task. Although NorGram provides a fairly broad grammar, not all linguistic constructions are parsed and, more importantly, not all words are covered in the lexicon. Ideally, I would have collected a limited-domain treebank consisting of parsed sentences from the original texts as I found them on the internet. In practice, this was not feasible. It became evident early on that the texts to be analyzed would have to be simplified for practical reasons. For the purpose of classification, I needed to extract the EPAS present in the texts. The remaining information in each sentence was not essential to the project. Although I am aware that it would be more scientific, and in every respect better, to extract the EPAS from original texts that I had not altered, this was not possible within the framework of this thesis. Given that I would have to simplify the texts in any case, I decided to cut most information that was irrelevant to the extraction of the (most central) EPAS. This process was performed on all sentences in the text collection. It was mainly adverbial phrases that were excluded, on the basis that they would not be included in the extracted EPAS in any case. Example (3-11) below illustrates a typical case:

(3-11)

a. Original sentence:
Etter at hun ble funnet opplyste et vitne at hun hadde hørt høye rop om hjelp fra stedet tidlig søndag morgen.
After she was found a witness informed that she had heard loud screams for help from the area early Sunday morning.

b. Simplified form:
Et vitne opplyste at hun hadde hørt høye rop.
A witness informed that she had heard loud screams.

c. Extracted structures:
høre,vitne,rop       hear, witness, scream
høy,rop,?            loud, scream,?
opplyse,vitne,?      inform, witness,?
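The extracted structures in (3-11c) can be read as three-place records in which `?` marks an argument slot that is not filled. The following is only a minimal sketch of how such lines might be loaded for later processing. It is not the procedure used in the thesis, and the names `Epas`, `predicate`, `arg1`, `arg2` and `parse_epas` are assumptions introduced here for illustration.

```python
from typing import NamedTuple, Optional


class Epas(NamedTuple):
    """One extracted structure; field names are hypothetical, not from the thesis."""
    predicate: str
    arg1: Optional[str]
    arg2: Optional[str]


def parse_epas(line: str) -> Epas:
    """Parse a comma-separated line such as 'høy,rop,?' into an Epas record.

    A '?' in an argument slot is taken to mean 'unfilled' and is stored as None.
    """
    parts = [part.strip() for part in line.split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected three comma-separated fields, got: {line!r}")
    predicate, arg1, arg2 = parts
    return Epas(
        predicate,
        None if arg1 == "?" else arg1,
        None if arg2 == "?" else arg2,
    )


# The three structures listed in (3-11c).
lines = ["høre,vitne,rop", "høy,rop,?", "opplyse,vitne,?"]
structures = [parse_epas(line) for line in lines]
```

Storing the unfilled slot as `None` rather than the literal `?` keeps it from being confused with a real lexical item when the structures are compared or counted.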

