21.04.2013 Views

Eckhard Bick - VISL

Unlike other words, abbreviations may contain [35]:

a) word-internal capitalisation even in non-headline text (VARIG, eV)

b) punctuation and other non-letter characters, word-internal or word-final:

- full stop: av. (avenida)
- dash: c.-a (companhia)
- slash: d/d (dias de data)

While recognising dashes and slashes as word-internal is not trivial (one needs corresponding lexicon entries and a tagger with a "soft" notion of word delimiters), full stops are a particular nuisance. In order to weed out the alternative reading "sentence delimiter", it is necessary (a) to distinguish between those abbreviations that can appear in sentence-final position and those that can't (especially "title" abbreviations like cap., card., com., dr., fr., gen., gov., insp., l., maj., pres., prof., r., rev., s., sarg., sr., ten.), and (b) to check the following word for potential "sentence-initiality" (i.e., an upper case first letter). The last check (c) is for single capital letters, which may be part of a name chain when followed by an upper case word (e.g. J.P.Jacobsen, where, incidentally, the 'J.' is so much part of the name that its pronunciation, 'I', does not disturb any educated Dane).
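The "soft" notion of word delimiters mentioned above can be illustrated with a minimal Python sketch. It is not the tagger's actual implementation: the function name `tokenize`, the three-entry lexicon and the set of strippable punctuation marks are all assumptions made for illustration. Trailing punctuation is split off a chunk only as long as what remains is not itself a lexicon entry.

```python
# Hypothetical mini-lexicon, built from the three examples in the text.
# A real lexicon would be far larger.
ABBREVIATIONS = {"av.", "c.-a", "d/d"}

# Assumed set of punctuation marks that may be split off the end of a chunk.
TRAILING_PUNCTUATION = ".,;:!?"


def tokenize(text, lexicon=ABBREVIATIONS):
    """Split on whitespace, then peel trailing punctuation off each chunk
    unless the remaining chunk is a known abbreviation."""
    tokens = []
    for chunk in text.split():
        trailing = []
        # Stop peeling as soon as the chunk matches a lexicon entry,
        # so that "av." keeps its full stop and "d/d" keeps its slash.
        while (chunk
               and chunk.lower() not in lexicon
               and chunk[-1] in TRAILING_PUNCTUATION):
            trailing.append(chunk[-1])
            chunk = chunk[:-1]
        if chunk:
            tokens.append(chunk)
        # Punctuation was collected right-to-left; restore original order.
        tokens.extend(reversed(trailing))
    return tokens


print(tokenize("mora na av. Paulista, pagamento a 30 d/d."))
```

The sketch only decides what counts as one token. Whether the full stop of a recognised abbreviation also closes the sentence is a separate question, which is what checks (a) to (c) address.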

(2) Flow chart: abbreviation or clause boundary?

title abbreviation? (a)
  yes: in-sentence
  no:  followed by lower case? (b)
    yes: in-sentence
    no:  lower case abbreviation? (c1)
      yes: in-sentence
      no:  one-letter abbreviation? (c2)
        yes: in-sentence
        no:  in-sentence + $. (sentence delimiter)
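The decision cascade of flow chart (2) can be sketched in Python as follows. This is an illustrative reading of the chart, not the system's own code: the function name `classify_full_stop`, the return labels and the case-insensitive title lookup are assumptions, and the order of the branches is taken from the chart as printed. The title list is the one given in the text.

```python
# Title abbreviations listed in the text; these cannot end a sentence.
TITLE_ABBREVIATIONS = {
    "cap.", "card.", "com.", "dr.", "fr.", "gen.", "gov.", "insp.", "l.",
    "maj.", "pres.", "prof.", "r.", "rev.", "s.", "sarg.", "sr.", "ten.",
}

IN_SENTENCE = "in-sentence"
SENTENCE_FINAL = "in-sentence + $."  # abbreviation plus sentence delimiter


def classify_full_stop(abbrev, next_word):
    """Classify an abbreviation ending in a full stop.

    abbrev is the abbreviation including its full stop, next_word the
    following token (or None at the end of the text).
    """
    # (a) title abbreviations never close a sentence
    #     (assumption: the lookup ignores case, so "Dr." matches "dr.").
    if abbrev.lower() in TITLE_ABBREVIATIONS:
        return IN_SENTENCE
    # (b) a lower case follower cannot be sentence-initial.
    if next_word is not None and next_word[:1].islower():
        return IN_SENTENCE
    # (c1) lower case abbreviation.
    if abbrev.islower():
        return IN_SENTENCE
    # (c2) single capital letter followed by an upper case word:
    #      treated as part of a name chain, as in J.P.Jacobsen.
    is_single_capital = len(abbrev.rstrip(".")) == 1 and abbrev[:1].isupper()
    if is_single_capital and next_word is not None and next_word[:1].isupper():
        return IN_SENTENCE
    # Otherwise the full stop also serves as sentence delimiter.
    return SENTENCE_FINAL


print(classify_full_stop("dr.", "Silva"))   # title abbreviation, branch (a)
print(classify_full_stop("J.", "P."))       # name chain, branch (c2)
print(classify_full_stop("Cd.", "Depois"))  # falls through all checks
```

Each test returns as soon as it succeeds, so only a full stop that fails every check is also read as a sentence delimiter.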

[35] Since these traits are not universal, they can't be used by the tagger for defining abbreviations. Compare the "ordinary looking" Ag (silver) and cd (the SI unit candela) with the more distinctly "abbreviational" ag. (August) and CD (compact disc) or Cd. (cadmium).

