
YEARS OF EUROPEAN ONLINE ANNÉES DE EN LIGNE ...


deals with fruit plantations, it should be classified in the second group. The third group is relevant if, for instance, materials for furniture are discussed. The fourth group is based on a metaphorical application of the term; it deals with electricity or artificial illumination.

2. METHODOLOGICAL PARTICULARITIES OF TEXT MINING

The methodologies of text mining include technologies developed in the context of computational linguistics or linguistic informatics. Indeed, it is often emphasised that text mining has led to a revival of the corresponding ideas and features. In particular, mathematical and statistical approaches are considered to be of high importance. The identification of so-called stop words (function words which describe relations between terms without having a meaning of their own) and the calculation of word frequencies are of basic interest. They contribute to the analysis of those patterns which are essential to the relevant meaning of the text. The methods may be paraphrased, as done by Hippner and Rentzmann (2006):

(Text) ‘Mining Methoden: Nachdem Terme aus den Textdokumenten extrahiert worden sind und die textuellen Daten somit eine Struktur erhalten haben, können Verfahren angewandt werden, die aus dem klassischen Data Mining bekannt sind: Texte können automatisch vorgegebenen Kategorien zugeordnet werden (Klassifikation) oder sie können so gruppiert werden, dass ähnliche Texte zusammengeführt werden (Segmentierung). Ebenso kann das gemeinsame Auftreten von Termen analysiert werden (Abhängigkeitsanalyse).’ (2)
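The basic steps named above (removing stop words, counting word frequencies, and analysing the joint occurrence of terms) can be illustrated with a minimal Python sketch. It is not the procedure of Hippner and Rentzmann; the stop-word list, the two sample sentences and all function names are assumptions made purely for illustration.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical, deliberately tiny German stop-word list (an assumption;
# real systems use far longer, curated lists).
STOP_WORDS = {"die", "der", "das", "und", "in", "ist", "eine", "mit", "im", "wird", "aus"}


def extract_terms(text):
    """Lower-case the text, split it into word tokens and drop stop words."""
    tokens = re.findall(r"[a-zäöüß]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def term_frequencies(documents):
    """Count how often each remaining term occurs across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(extract_terms(doc))
    return counts


def cooccurrences(documents):
    """Count how often two distinct terms appear in the same document."""
    pairs = Counter()
    for doc in documents:
        terms = sorted(set(extract_terms(doc)))
        pairs.update(combinations(terms, 2))
    return pairs


# Invented sample documents, used only to exercise the functions.
docs = [
    "Die Birne ist eine Frucht und die Birne wird im Herbst geerntet",
    "Die Birne in der Lampe ist defekt",
]

print(term_frequencies(docs).most_common(1))
print(cooccurrences(docs)[("birne", "lampe")])
```

In this toy corpus the frequency count singles out one dominant term, while the co-occurrence count records which other terms appear alongside it in the same document.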

Extending the abovementioned example, this could lead to the recognition of ‘pear’ (Birne) as a central term of the document. It is, however, still not known which of the different meanings is concerned. Ontologies could help to take other terms into account as well, and finally to arrive at a clear document classification. The network of an ontological description of the concepts around Birne is certainly much more complex than this illustration might express:

(2) Translation: ‘Text-mining methods: After expressions have been extracted from the text documents, thus giving the textual data a structure, methods can be applied which are well known from classical data mining: texts can be automatically assigned to predefined categories (classification), or they can be grouped in such a way that similar texts are brought together (segmentation). It is also possible to analyse the joint occurrence of expressions (analysis of dependencies).’
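The idea that an ontology lets other terms of a document decide which meaning of Birne applies can be sketched as follows. This is a simplified illustration under stated assumptions, not the ontology discussed in the text: the related-term sets are invented, the function name is hypothetical, and only the three groups described in this passage are modelled.

```python
# Hypothetical mini-ontology: each sense of "Birne" is linked to a few
# related terms. The term sets are invented for illustration only.
BIRNE_SENSES = {
    "fruit growing": {"plantage", "obstbau", "ernte", "baum"},
    "furniture material": {"holz", "möbel", "furnier", "tischler"},
    "illumination": {"lampe", "strom", "licht", "fassung"},
}


def classify_birne(terms):
    """Return the sense whose related terms overlap most with the document.

    Returns None when no related term occurs at all; ties are resolved in
    favour of the sense listed first.
    """
    doc_terms = {t.lower() for t in terms}
    scores = {
        sense: len(related & doc_terms)
        for sense, related in BIRNE_SENSES.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(classify_birne(["Birne", "Lampe", "Strom", "defekt"]))
```

A plain overlap count is the simplest possible scoring; a real ontology would also exploit the relations between the concepts rather than flat term sets.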
