YEARS OF EUROPEAN ONLINE ANNÉES DE EN LIGNE ...
004.6 Data
004.7 Computer communication
004.8 Artificial intelligence
004.9 Application-oriented computer-based techniques
005 Management (Revision from 2001)
006 Standardisation of products, operations, weights, measures and time
007 Activity and organising. Information. Communication and control theory generally (cybernetics)
008 Civilisation. Culture. Progress
009 Humanities. Arts subjects in general
01 Bibliography and bibliographies. Catalogues
02 Librarianship
03 General reference works. Encyclopaedias
050 Serial publications. Periodicals (their function, business and editorial management)
06 Organisations and other types of cooperation. Including: Associations. Congresses. Exhibitions. Museums
070 Newspapers. The press. Including: Journalism
08 Polygraphies. Collective works
09 Manuscripts. Rare and remarkable works
1 Philosophy. Psychology
2 Religion. Theology
3 Social sciences
4 Unassigned
5 Natural sciences
6 Technology
7 The arts
8 Language. Linguistics. Literature
9 Geography. Biography. History

It has to be underlined that classification systems are static, although relations may be indicated. An object can only be stored in one category; for other aspects, additional methods have to be applied. The classification schema has to be fixed before its first use; later modifications are very difficult to implement, because there is a risk of having to reclassify data already stored.

Nevertheless, the classification methodology was revitalised by the Internet, as it is quite simple to create a hyperlink system on this basis. Most search engines offer thematic categories which simplify users' searches.
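The two properties stressed above, one category per object and a schema fixed in advance, can be illustrated with a minimal sketch. It assumes, as in decimal classification schemes generally, that a longer notation is a subdivision of its digit prefix; the `Catalogue` class, its method names and the document identifiers are hypothetical and serve only as illustration.

```python
# Hypothetical excerpt of the scheme listed above: notation -> caption.
SCHEME = {
    "004.6": "Data",
    "004.7": "Computer communication",
    "004.8": "Artificial intelligence",
    "02": "Librarianship",
    "3": "Social sciences",
}


def digits(notation):
    """Drop the separator dot so that prefixes can be compared digit by digit."""
    return notation.replace(".", "")


class Catalogue:
    """Stores each object under exactly one class of a fixed scheme."""

    def __init__(self, scheme):
        self.scheme = scheme      # fixed before first use
        self._class_of = {}       # object id -> its single notation

    def store(self, obj_id, notation):
        if notation not in self.scheme:
            raise KeyError(f"unknown class: {notation}")
        if obj_id in self._class_of:
            # The static nature of the system: no second category per object.
            raise ValueError(f"{obj_id} is already classified")
        self._class_of[obj_id] = notation

    def browse(self, prefix):
        """Return all objects filed under a class or any of its subdivisions."""
        wanted = digits(prefix)
        return sorted(
            obj for obj, notation in self._class_of.items()
            if digits(notation).startswith(wanted)
        )


catalogue = Catalogue(SCHEME)
catalogue.store("doc-a", "004.8")
catalogue.store("doc-b", "004.6")
catalogue.store("doc-c", "02")
print(catalogue.browse("004"))
```

Browsing by prefix is what makes such a scheme easy to turn into a hyperlink hierarchy: each notation becomes a page that links to its subdivisions. A second aspect of "doc-a" cannot be recorded without an additional mechanism, which is the limitation the text describes.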
Thesauri

Thesauri are controlled vocabularies with a hierarchical structure. In this respect, they reuse the same approach as the classification methodologies, but at the level of the components of natural language. The main items registered in a thesaurus are generally called descriptors, as they are used to describe the contents of a textual object. The entries in the thesaurus are refined by a sophisticated system of links to broader terms, narrower terms, synonyms and related terms. The advantage of using thesauri is that a search can automatically be:

• redirected from registered synonyms to the corresponding descriptors;
• limited to narrower terms if the number of results exceeds a certain threshold;
• extended to broader and/or related terms if the results are too poor.

Although a big step forward, thesauri also have their shortcomings. The quality of the search results depends on how well the descriptors were assigned. Furthermore, only the main concepts of a text will be taken into consideration. So there is always the risk that texts are only retrieved in the context they were indexed for, while other concepts present in them are not found.

Semantic networks

The idea of thesauri is taken further by the technologies of semantic networks. These are mainly based on ontologies, which are comparable to thesauri but, instead of linking terms, describe the relations between concepts. The example overleaf illustrates this approach in comparison with thesauri. The example is simplified, but it is obvious that a description of the nature of the relation is much more useful than a pure link. Evaluating these descriptions allows some logical conclusions to be drawn automatically. In semantic networks, inheritance is an important aspect: on a deeper level, the properties of the hypernymic level are inherited and are also available for association with other concepts.
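The three automatic search behaviours listed under Thesauri can be sketched as follows. This is one possible reading, not a reference implementation: the vocabulary, the document index, the thresholds `max_hits` and `min_hits`, and the choice to narrow by taking the union of the narrower terms' results are all assumptions made for illustration.

```python
# Hypothetical thesaurus: descriptor -> links to other descriptors.
THESAURUS = {
    "vehicle": {"broader": [], "narrower": ["car", "bicycle", "tram"], "related": ["transport"]},
    "car": {"broader": ["vehicle"], "narrower": [], "related": []},
    "bicycle": {"broader": ["vehicle"], "narrower": [], "related": []},
    "tram": {"broader": ["vehicle"], "narrower": [], "related": []},
    "transport": {"broader": [], "narrower": [], "related": ["vehicle"]},
}

# Registered synonyms (non-preferred terms) -> descriptor.
SYNONYMS = {"automobile": "car"}

# Hypothetical index: descriptor -> documents indexed with it.
INDEX = {
    "vehicle": {"d1", "d2", "d3"},
    "car": {"d2"},
    "bicycle": {"d3"},
    "transport": {"d4"},
}


def search(term, max_hits=2, min_hits=1):
    """Return (descriptors actually searched, matching document ids)."""
    # 1. Redirect a registered synonym to its descriptor.
    descriptor = SYNONYMS.get(term.lower(), term.lower())
    entry = THESAURUS.get(descriptor)
    if entry is None:
        return [], set()

    used = [descriptor]
    hits = set(INDEX.get(descriptor, set()))

    if len(hits) > max_hits and entry["narrower"]:
        # 2. Too many results: limit the search to the narrower terms.
        used = list(entry["narrower"])
    elif len(hits) < min_hits:
        # 3. Too few results: extend to broader and related terms.
        used = [descriptor] + entry["broader"] + entry["related"]
    else:
        return used, hits

    hits = set()
    for d in used:
        hits |= INDEX.get(d, set())
    return used, hits


for query in ("automobile", "vehicle", "tram"):
    print(query, search(query))
```

The sketch also shows the shortcoming mentioned in the text: a document is only found through the descriptors it was indexed with, so "d1" disappears as soon as the search for "vehicle" is narrowed.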
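The difference between a pure link and a described relation, and the role of inheritance, can be made concrete with a small sketch. It assumes a network stored as (subject, relation, object) triples in which the relation name `is_a` marks the hypernym link; the concepts and relation names are invented for illustration and are not taken from the example in the source.

```python
# Hypothetical semantic network as (subject, relation, object) triples.
TRIPLES = [
    ("sparrow", "is_a", "bird"),
    ("bird", "is_a", "animal"),
    ("bird", "has_part", "wings"),
    ("animal", "needs", "food"),
    ("sparrow", "lives_in", "hedgerows"),
]


def inherited_relations(concept, triples=TRIPLES):
    """Collect a concept's own relations plus those inherited via 'is_a'."""
    found = set()
    seen = set()          # guards against cycles in the network
    stack = [concept]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        for subject, relation, obj in triples:
            if subject != current:
                continue
            if relation == "is_a":
                stack.append(obj)      # climb to the hypernym
            else:
                found.add((relation, obj))
    return found


print(sorted(inherited_relations("sparrow")))
```

Because each link carries the nature of the relation, a simple conclusion can be drawn automatically: nothing in the data says directly that a sparrow needs food, yet the function derives it from the properties of the higher levels.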
The creation and maintenance of thesauri is already very time-consuming and complex; the elaboration of ontologies is even more complicated. This is why this technology is not yet used for big projects or large amounts of data, but only for small excerpts. The spanning of the Internet by