21.04.2013 Views

Eckhard Bick - VISL


definitions on the “target grammar” (i.e. the particular grammatical description of Portuguese to be implemented by my system).

1.2 The parser and the text

This dissertation is a Janus work, both practical and theoretical at the same time, one face mirroring and complementing the other. After all, a major point was simply showing that “it could be done”: that a Constraint Grammar for a Romance language would work just as well as for English.

As a practical product, the parser and its applications can speak for themselves and, in fact, do so every day at http://visl.hum.sdu.dk/, serving users across the internet. In what could be called the theoretical or text part of this dissertation, apart from discussing the architecture and performance of the parser, I will be concerned both with the process of building the parser and with its linguistic spin-off for Constraint Grammar and parsing in general, and the analysis of Portuguese in particular. Both tool and target grammar will be discussed, with chapter 3 focusing on the first, and chapter 4 focusing on the second.

Chapter 2 describes the system’s lexicon-based morphological analyser. Since the quality of any CG-system is heavily dependent on the accuracy and coverage of its lexico-morphological input base, the analyser and its lexicon constitute an important first piece of the puzzle. However, chapters 2.1, 2.2 and 2.3, which treat the architecture of the program as such, as well as the interplay of its root-, suffix-, prefix- and inflexion-lexica, are rather technical in nature and are not necessary for understanding the following chapters, which may be addressed directly and individually. In 2.2.4, the Beast will raise its head in the section on the dynamic lexicon, where non-word words like abbreviations, enclitics, complex names and polylexical expressions are discussed, and the principle of structural morphological heuristics is explained. 2.2.5 is a reference chapter, where morphological word classes and inflexion features are defined, and 2.2.6 quantifies the analyser’s lexical coverage.

Chapter 3 introduces the Constraint Grammar formalism as a tag-based disambiguation technique, compares it to other approaches, and discusses the types of ambiguity it can be used to resolve, as well as the lexical, morphological and structural information that can be used in the process. It is in chapter 3 that the “tool grammar” as such is evaluated, both quantitatively and qualitatively, with special emphasis on level interaction and rule typology. Finally, the system’s performance is measured on different types of text (and speech) data and for different levels of analysis.

“Level interaction” is central to the concept of Incremental Parsing (or Progressive Level Parsing) and addresses the interplay between lower level tags (already
