21.04.2013 Views

Eckhard Bick - VISL



appear truncated or not, depending on phonetic harmony and vowel distribution: 'N' may become either '-ene-' or 'en-'.

To solve this puzzle, I introduced all letter names in their various forms into the suffix lexicon, with combination restrictions stating that they belong to the word class 'b' (abbreviation) and have inward compatibility only with other elements of the same type. Certain suffixes (like '-ista') then allow for left-hand combination with these letter elements. Since letter names also appear in the root form lexicon, the program can now analyse party member expressions as long derivation chains of abbreviation letters (which, formally, stand for the word elements of the party name).

'petebista' is thus recognised as a Portuguese word, and reads in the analysis file:

P N M/F S

In the same way, other productive expressions phonetically derived from abbreviations can now be tagged.
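The mechanism described above can be illustrated with a minimal sketch. It is not the system's actual lexicon or code: the dictionary names, the handful of letter-name forms (including a truncated 'b' for B before a vowel-initial suffix) and the function names are assumptions made for illustration only. The sketch treats letter names as class-'b' elements that combine only with each other, and lets a licensed suffix such as '-ista' attach to the left-hand letter chain.

```python
# Hypothetical toy entries: surface form of a letter name -> letter it stands for.
# Several forms per letter model the "truncated or not" variation (e.g. 'ene'/'en').
LETTER_FORMS = {"pe": "P", "te": "T", "be": "B", "b": "B", "ene": "N", "en": "N"}

# Hypothetical list of suffixes that allow left-hand combination with
# class-'b' (abbreviation) elements, with the tags they contribute.
SUFFIXES_AFTER_LETTERS = {"ista": "N M/F S"}


def split_letters(stem):
    """Segment stem exhaustively into letter-name forms.

    Returns the list of letters, or None if no full segmentation exists.
    """
    if not stem:
        return []
    for form, letter in LETTER_FORMS.items():
        if stem.startswith(form):
            rest = split_letters(stem[len(form):])
            if rest is not None:
                return [letter] + rest
    return None


def analyse(word):
    """Return (letters, suffix, tags) for a letter chain plus licensed suffix, else None."""
    for suffix, tags in SUFFIXES_AFTER_LETTERS.items():
        if word.endswith(suffix):
            letters = split_letters(word[:-len(suffix)])
            if letters:  # requires at least one letter element
                return letters, suffix, tags
    return None
```

With these toy entries, `analyse("petebista")` segments the stem as pe + te + b and attaches '-ista', while an ordinary word such as 'artista' is rejected because its stem is not a chain of letter names. The sketch does not restrict truncated forms to the position directly before the suffix, which a fuller treatment would presumably need to do.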

2.2.4.4 Names: problems with an immigrant society

In my system, I define the word class of proper nouns (lexicon entry 'n', PoS tag 'PROP') as capitalised words distinguished from nouns and adjectives by featuring both number (S/P) and gender (M/F) as lexeme categories, not word form categories.

(1) LEXICON ENTRY     TAG SEQUENCE

    Filipinas         PROP F P
    Dardanelos        PROP M P
    Estados=Unidos    PROP M P
    Amado             PROP M S
    Berlim            PROP M S
    Andrómeda         PROP F S
    OMS               PROP F S
    PC                PROP M S
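As a rough illustration of what it means for gender and number to be lexeme categories, the entries in (1) can be stored as fixed attributes of each name rather than computed from its ending. The class and function names below are hypothetical and the representation is only one possible sketch, not the system's actual lexicon format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProperNoun:
    """A proper noun whose gender and number are fixed per lexeme."""
    form: str
    gender: str  # 'M' or 'F'
    number: str  # 'S' or 'P'

    def tag_sequence(self):
        return f"PROP {self.gender} {self.number}"


# The eight entries shown in table (1).
LEXICON = {
    entry.form: entry
    for entry in [
        ProperNoun("Filipinas", "F", "P"),
        ProperNoun("Dardanelos", "M", "P"),
        ProperNoun("Estados=Unidos", "M", "P"),
        ProperNoun("Amado", "M", "S"),
        ProperNoun("Berlim", "M", "S"),
        ProperNoun("Andrómeda", "F", "S"),
        ProperNoun("OMS", "F", "S"),
        ProperNoun("PC", "M", "S"),
    ]
}


def lookup(word):
    """Return the tag sequence for a known name, or None if it is not listed."""
    entry = LEXICON.get(word)
    return entry.tag_sequence() if entry else None
```

Here `lookup("Filipinas")` yields the tag sequence from the table, and a name absent from the toy lexicon yields `None`.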

Presently, there are about 1,300 names in the lexicon, consisting of single-word proper nouns or lexicalised name chains [21], about 8% being abbreviations, with a male/female ratio of roughly 4:3 (about the same as for ordinary nouns). Since proper nouns, like ordinary nouns, can trigger agreement in verb chains ('A OMS foi lançada ...') or modifiers ('o grande Amado'), lexicon information is quite important for disambiguation. The word 'a', which can be, among other things, either a preposition of movement or a feminine article, can be disambiguated with the help of the neighbouring noun's gender information in the following example.
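The code below is not the example referred to above, nor the system's actual rule set. It is only a minimal sketch of the agreement principle, assuming a toy lexicon taken from table (1), the hypothetical reading labels 'DET F S' and 'PRP', and a simplified setting in which only the word immediately following 'a' is inspected.

```python
# Toy lexicon: name -> (gender, number), taken from table (1).
LEXICON = {
    "OMS": ("F", "S"),
    "Filipinas": ("F", "P"),
    "Berlim": ("M", "S"),
    "Amado": ("M", "S"),
}


def readings_of_a(next_word):
    """Return the readings of 'a' that survive before next_word.

    The article reading is discarded when the following name is known
    and is not feminine singular; unknown words leave 'a' ambiguous.
    """
    readings = ["DET F S", "PRP"]  # hypothetical labels: article vs. preposition
    features = LEXICON.get(next_word)
    if features is not None and features != ("F", "S"):
        readings.remove("DET F S")
    return readings
```

Under these assumptions, 'a' before 'Berlim' keeps only the preposition reading, since a masculine name cannot agree with a feminine article, whereas 'a' before 'OMS' stays ambiguous and would need further context.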

[21] I define a name chain as consisting of at least one proper noun followed by any number of non-clausal dependents (with capitalised nouns and adjectives) and/or (possibly capitalised) distinctors (like jr., VI), and preceded by any number of capitalised prenominals and/or (possibly capitalised) pre-name nouns (titles etc.).

