21.11.2013 Views

YEARS OF EUROPEAN ONLINE ANNÉES DE EN LIGNE ...

WORKSHOP

uments. The tasks and objectives for the analysis process are more or less the same (Dörre, Gerstl and Seiffert, 2004, p. 480). Information is typically identified through processes that discover patterns and relations, mainly by means of statistical pattern learning. Texts are generally regarded as unstructured data, in contrast to database information, which is supposed to be structured. Text mining usually involves structuring the input text by 'parsing', which is completed by the addition and/or removal of linguistic features. This restructuring of the data permits the derivation of patterns as well as the evaluation and interpretation of the output. The quality of text mining is usually judged on the combination of relevance, novelty and tractability. Typical text-mining tasks include text classification, text clustering, concept or entity extraction, document summarisation and modelling of entity relations.
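The structuring step described above can be illustrated with a minimal Python sketch. It is not taken from the cited literature: the stop-word list, the function name `structure_text` and the record fields are assumptions chosen for illustration. Stop words are removed (removal of a linguistic feature), and position and length are attached to each remaining token (addition of features).

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a fuller,
# language-specific resource.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "in", "to"}


def structure_text(text):
    """Turn raw text into a list of token records (a crude form of 'parsing')."""
    tokens = re.findall(r"[a-zäöüß]+", text.lower())
    records = []
    for position, token in enumerate(tokens):
        if token in STOP_WORDS:
            # Removal of a linguistic feature: function words are dropped.
            continue
        # Addition of features: position in the text and token length.
        records.append({"token": token, "position": position, "length": len(token)})
    return records


records = structure_text("The pear is the fruit of the pear tree.")
counts = Counter(record["token"] for record in records)
```

Once the text is held as records of this kind, pattern derivation can work on countable fields (here `counts`) rather than on the raw character string.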

Text-mining processes may be described as a sequential flow of activities. By means of statistical algorithms, the key terms of a textual entity are identified. Comparison with entries in ontologies then makes it possible to group those texts with similar ones. In this way, a knowledge base is created and extended as further documents are analysed.
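As a rough illustration of this flow, the following Python sketch ranks the terms of each document by TF-IDF (one common statistical weighting, assumed here; the text does not name a specific algorithm) and then groups documents by the ontology class that shares the most key terms. The corpus `DOCS`, the toy `ONTOLOGY` and all function names are invented for the example.

```python
import math
from collections import Counter

# Toy corpus and toy ontology: both are invented for illustration only.
DOCS = {
    "d1": "pear apple fruit harvest pear",
    "d2": "lamp bulb socket light bulb",
    "d3": "apple fruit juice harvest",
}
ONTOLOGY = {
    "food": {"pear", "apple", "fruit", "juice"},
    "lighting": {"lamp", "bulb", "socket", "light"},
}


def key_terms(doc_id, top_n=2):
    """Rank the terms of one document by TF-IDF and return the top_n terms."""
    term_counts = Counter(DOCS[doc_id].split())
    n_docs = len(DOCS)
    scores = {}
    for term, tf in term_counts.items():
        # Document frequency: number of documents containing the term.
        df = sum(1 for text in DOCS.values() if term in text.split())
        scores[term] = tf * math.log(n_docs / df)
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


def classify(doc_id):
    """Pick the ontology class sharing the most key terms with the document."""
    terms = set(key_terms(doc_id))
    return max(ONTOLOGY, key=lambda cls: len(terms & ONTOLOGY[cls]))


# Group documents by class; the result plays the role of a small knowledge base.
groups = {}
for doc_id in DOCS:
    groups.setdefault(classify(doc_id), []).append(doc_id)
```

Analysing a further document amounts to adding it to `DOCS` and rerunning the grouping, which extends the knowledge base in the sense described above.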

An example will show the complexity of the necessary methods. Imagine that a document contains the German word Birne 'pear'. It has to be taken into account that the use of this term could be an ellipsis or a metaphor. That leads us to the following virtual classes, which are distinguished from each other by the different meanings of the key term:

(1) a kind of fruit,
(2) the tree which produces the fruits ('pear tree'); this is the elliptic use for Birnenbaum,
(3) the wood of a pear tree which is used for the construction of furniture; this is an ellipsis for Birnenholz,
(4) an electric bulb which in many cases has a form resembling a pear; this is a metaphor well established in the German vocabulary and at the same time an ellipsis for Glühbirne,
(5) ironically the head of a human being which in certain stylistic contexts may be compared with a pear; in that case it could be regarded as a metaphor.
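To make the five classes concrete, here is a deliberately simple Python sketch that guesses the sense of Birne from the other words in a sentence. It uses plain cue-word overlap, which is an assumption of this example and far cruder than the deeper analysis the text calls for. The cue-word sets in `SENSE_CUES` and the function name `guess_sense` are hypothetical.

```python
import re

# Hypothetical German cue words for each sense of "Birne"; invented for this sketch.
SENSE_CUES = {
    "fruit": {"obst", "apfel", "essen", "reif", "saft"},
    "tree (Birnenbaum)": {"garten", "pflanzen", "blüte", "ernte"},
    "wood (Birnenholz)": {"möbel", "tisch", "schrank", "holz"},
    "bulb (Glühbirne)": {"lampe", "watt", "licht", "fassung"},
    "head (ironic)": {"kopf", "schlag", "weich"},
}


def guess_sense(sentence):
    """Return the sense whose cue words overlap most with the sentence, or None."""
    words = set(re.findall(r"\w+", sentence.lower()))
    scores = {sense: len(words & cues) for sense, cues in SENSE_CUES.items()}
    best = max(scores, key=scores.get)
    # Without any cue word in the context the sense stays undecided.
    return best if scores[best] > 0 else None


bulb_sense = guess_sense("Die Birne in der Lampe hat 60 Watt")
wood_sense = guess_sense("Der Schrank ist aus Birne, einem harten Holz")
unknown = guess_sense("Eine Birne")
```

A sentence that mentions Birne without any cue word returns `None`, which reflects the point of the example: the key term alone does not determine the class.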

Although the last of these variants has to be taken into account only in certain stylistic contexts, the other ones need deeper analysis so that the documents concerned can be related to similar ones. In the first case, this could consist of references to other types of fruit or foods. If the document
