Unni Cathrine Eiken February 2005




Appendix A: Ekstraktor.pl – algorithm

The algorithm behind Ekstraktor is divided into two separate parts: retrieving information from the Prolog file, and processing the information that was found and stored. First, a Prolog output file is opened and each line of the file is read. Using pattern matching, lines from the file are stored in different arrays according to which pattern they match.
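The retrieval step can be illustrated with a minimal sketch. Ekstraktor itself is a Perl script and this appendix does not show the format of the Prolog output, so the Python below is only an approximation: the line shapes such as `semform(1, 'lete')`, the `PATTERNS` table and the function name `sort_lines` are hypothetical, while the array names are the ones used in this appendix.

```python
import re
from collections import defaultdict

# Hypothetical line shapes: the real Prolog output format is not given in
# this appendix. Only the array names are taken from the text.
PATTERNS = {
    "catsuff": re.compile(r"^catsuff\("),
    "semform": re.compile(r"^semform\("),
    "navn": re.compile(r"^navn\("),
    "prt": re.compile(r"^prt\("),
}


def sort_lines(lines):
    """Store each line in the array named after the first pattern it matches."""
    arrays = defaultdict(list)
    for raw in lines:
        line = raw.strip()
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                arrays[name].append(line)
                break  # a line is stored in at most one array
    return arrays
```

Lines that match none of the patterns are simply skipped. Because `sort_lines` accepts any iterable of strings, an open file handle can be passed to it directly.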

After the information has been extracted from the Prolog file, the information stored in the arrays is processed to create predicate-argument structures. In the following, I give a brief outline of the processing steps by describing each of the central functions in Ekstraktor.

The term epmor (Eng: EP mother) refers to the first EP in the ARG0ep array, in most cases the EP “in question”.

finnHoved();

Finds the semantic forms of the main (first) predicate-argument structure in the sentence. This function calls the following subfunctions:

finnEP1();

Since the entities parsed are full sentences, the main structure is limited to having a verb as its head. This function searches the array catsuff for a pattern with the first member of ARG0ep as its EP. If such a pattern is found, the EP is discarded and the first members of the arrays ARG0ep and ARG0verdi are removed.
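The check performed by finnEP1 might look roughly as follows. This is a Python sketch under stated assumptions, not the Perl code of Ekstraktor: the entry shape `catsuff(<EP>, ...)` and the name `finn_ep1` are hypothetical, and ARG0ep and ARG0verdi are assumed to be parallel lists of strings.

```python
import re


def finn_ep1(catsuff, arg0ep, arg0verdi):
    """Drop the first EP if catsuff holds an entry for it.

    Returns True when the first members of both ARG0 lists were removed.
    """
    if not arg0ep:
        return False
    epmor = arg0ep[0]  # the EP "in question"
    # Hypothetical entry shape: catsuff(<EP>, ...)
    pattern = re.compile(rf"catsuff\({re.escape(epmor)},")
    if any(pattern.search(line) for line in catsuff):
        arg0ep.pop(0)
        if arg0verdi:
            arg0verdi.pop(0)
        return True
    return False
```

The lists are modified in place, mirroring the description above in which the first members of the arrays are removed rather than copied.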

finnPred();

Finds the semantic value of the sentence’s predicate/ARG0. The function goes through the array semform, searching for a pattern with the first member of ARG0ep as its EP. If such a pattern is found, the semantic form is retrieved and stored in the array predikat.

To avoid an “empty” semantic form when the argument is a proper noun, the function checks whether the retrieved form matches named. If so, the array navn is searched for a pattern with the first member of ARG0ep as its EP. If such an entry is found, predikat is emptied and the new semantic form is stored there.

Some predicates have an extra attribute, which is stored in the array prt. Each line in this array is searched for a pattern with the first member of ARG0ep as its EP. If such an entry is found, the semantic form is retrieved and stored in the array ekstra.
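The three lookups described for finnPred can be sketched together. Again this is Python rather than the original Perl, and several details are assumptions: the entry shape `name(<EP>, '<value>')`, the helper `lookup`, and the use of a substring test for "matches named" are all hypothetical choices made for illustration.

```python
import re

# Hypothetical entry shape shared by semform, navn and prt: name(<EP>, '<value>')
ENTRY = re.compile(r"\w+\((?P<ep>\d+),\s*'(?P<value>[^']*)'\)")


def lookup(array, ep):
    """Return the value of the first entry in array whose EP equals ep."""
    for line in array:
        match = ENTRY.search(line)
        if match and match.group("ep") == ep:
            return match.group("value")
    return None


def finn_pred(semform, navn, prt, arg0ep):
    """Collect the predicate's semantic form and any extra attribute."""
    predikat, ekstra = [], []
    if not arg0ep:
        return predikat, ekstra
    epmor = arg0ep[0]
    form = lookup(semform, epmor)
    if form is not None:
        predikat.append(form)
        # Assumption: "matches named" is modelled as a substring test.
        if "named" in form:
            name = lookup(navn, epmor)
            if name is not None:
                predikat[:] = [name]  # empty predikat, store the name instead
    particle = lookup(prt, epmor)
    if particle is not None:
        ekstra.append(particle)
    return predikat, ekstra
```

With the toy entries `semform(1, 'lete')` and `prt(1, 'etter')`, the sketch yields "lete" in predikat and "etter" in ekstra, the two parts of the predicate “lete etter”.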

lagVerbStruktur();

Creates the correct verbal structure for the predicate. This covers the cases where the predicate has an additional attribute, as in the predicate “lete etter” (Eng: look for). The

