Sharing Knowledge: Scientific Communication - SSOAR

160 Judith Plümer

modification, the status of the document (preprint/article), the MIME type of the original file, its language, source, URL and a rights statement. The first discipline-specific extension to the standard Dublin Core element set was the use of "DC.subject.msc" to store the MSC codes.

The metadata are currently encoded via the HTML 2.0 META tag. However, encoding metadata in HTML 2.0 leads to some problems. For example, there is no grouping mechanism to state that a given email address belongs to a given person's name. That is why we store only one email address in the metadata files. HTML 4.0 gives the META tag more attributes, but it does not solve the grouping problem. Another problem with the META tag in HTML is that data are stored twice: once in the head as metadata and again in the body of the document as visible text for human readers. If someone changes only the visible part of the metadata document, it looks correct in the browser but may still contain incorrect metadata. These problems can be solved by switching to RDF, using the DC and vCard vocabularies and object typing, as we will describe below.

As already mentioned, the metadata files, or title pages, of the preprints contain an abstract and MSC codes. The files should therefore be created by the authors of the papers themselves. This requires either detailed knowledge of Dublin Core and HTML encoding or a tool that takes care of syntactic correctness. Such a tool, called the Mathematics Metadata Markup Editor, was developed at Osnabrück in collaboration with E. Hilf, Th. Severiens, M. Jost and M. Kaplan. The first version already offered input checks, MSC browsing, and the output of Dublin Core metadata encoded in HTML, and the tool has been enhanced step by step since then. The current version additionally supports RDF.
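The flat META encoding and its grouping problem can be illustrated with a minimal sketch. It assumes a simple name/content layout for the tags; apart from "DC.subject.msc", the element names used here (including the hypothetical "DC.creator.email") and all sample values are illustrative and not taken from the MPRESS files.

```python
from html import escape

def dc_meta_tags(record):
    """Render a flat metadata record as HTML META tags, one tag per value."""
    lines = []
    for name, values in record.items():
        for value in values:
            lines.append('<meta name="%s" content="%s">'
                         % (escape(name, quote=True), escape(value, quote=True)))
    return "\n".join(lines)

# Illustrative record; only DC.subject.msc is named in the text.
record = {
    "DC.title": ["An example preprint"],
    "DC.creator": ["A. Author", "B. Author"],
    "DC.creator.email": ["a.author@example.org"],  # hypothetical element name
    "DC.subject.msc": ["05C10", "68R10"],
}

print(dc_meta_tags(record))
```

In the output the email tag stands next to two creator tags, and nothing in the flat list says which creator it belongs to. This is the missing grouping that RDF with typed objects can express.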
The author of a paper types in the respective metadata, and with a mouse click the metadata file is generated by a Perl script. The Mathematics Metadata Markup Version 3.1 tool can be used remotely or downloaded for local installation from ftp://ftp.math.uos.de/pub/MMM/.

The stored metadata allow a much higher retrieval quality than, for example, a full-text search on the original documents. On the one hand the result sets are more specific, and on the other hand the results can be presented in a well-structured way.

How documents enter the system and how they are presented

Once the metadata are stored on a WWW server, they must be collected in some way. However, the whole document cannot be stored in a central index. The authors' copyright allows copies to be stored for private or personal use, but it forbids keeping them in a database and distributing them without the author's permission.
MPRESS - transition of metadata formats 161

The gathering part of the Harvest [Hardy, 1996, Technical Report] software, which is used in Math-Net, is responsible for this collecting process. It is a configurable robot. Gathering documents does not mean making copies and keeping them somewhere. It means taking a temporary copy, extracting the relevant information from this copy, storing this information in a database and deleting the temporary copy.

A gathering agent is defined by several configuration files that have to be created by an administrator. One configuration file contains the URLs that the agent should gather and evaluate, together with many parameters that specify whether it should run recursively, at which depth the recursion should stop, the maximum number of documents to gather, which documents should be included or excluded during the recursion, how many different hosts may be visited during the recursion and which protocols should be used (ftp, http). The other configuration files can be used in their default form or modified as well. They specify how to handle different MIME types and formats, how to interpret suffixes, and how to summarize different formats.

The gatherer pipes the retrieved document through the essence machine [Hardy, 1996, Trans. Comp. Sci], which generates summaries of the documents in SOIF [Hardie, 1999] (summary object interchange format, see below). The SOIF documents are stored by the gatherer and the original resources are deleted. The SOIF documents of one agent reside in a gzip-compressed ASCII file together with some additional information. For example, there is a small database that stores the MD5 [Rivest, 1992] checksums of the original files to avoid multiple copies of one document. Internally, the MD5 checksum is also used to check whether a document has changed when the agent revisits a site.
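The gathering step described above (temporary copy, summary, deletion of the copy) can be sketched as follows. This is not the Harvest implementation: `fetch` and `summarize` are hypothetical callables supplied by the caller, and the SOIF layout shown (template type, URL, then `Name{size}:<tab>value` lines with the size counted in bytes) follows the Harvest documentation as far as we recall it and should be treated as an assumption.

```python
import os
import tempfile

def to_soif(url, attributes, template="FILE"):
    """Serialize attribute/value pairs as one SOIF-style record (assumed layout)."""
    lines = ["@%s { %s" % (template, url)]
    for name, value in attributes.items():
        size = len(value.encode("utf-8"))  # size is counted in bytes
        lines.append("%s{%d}:\t%s" % (name, size, value))
    lines.append("}")
    return "\n".join(lines) + "\n"

def gather_one(url, fetch, summarize):
    """Take a temporary copy, summarize it, and delete the copy again.

    fetch(url) -> bytes and summarize(path) -> dict of str are hypothetical
    helpers; only the summary survives this function.
    """
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(fetch(url))
        attributes = summarize(path)
    finally:
        os.remove(path)  # the original resource is never kept
    return to_soif(url, attributes)
```

The `finally` clause ensures that the temporary copy is removed even if summarizing fails, which mirrors the requirement that only extracted information is retained.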
(The agents of MPRESS run every other week.) If the document has not changed, the essence process does not have to be run again on the same document. In this case only the time-to-live (TTL) of the respective SOIF record is updated. This saves local computing power, but it does not reduce network load, which would also be desirable. What we wanted, therefore, was an agent that runs incrementally, and the original Harvest software was modified so that it does so for MPRESS.

There are summarizers for a variety of formats. Web pages that contain mathematically relevant information are usually stored in PostScript, PDF or HTML. The summarizers of the Harvest software do not handle 8-bit and Unicode characters correctly. We therefore had to modify the PostScript and HTML summarizers to interpret at least umlauts and "ß" for German needs, so that queries for German names return correct results. This problem is solved with X-Harvest and HyREX because they are able to handle Unicode characters (see below).
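The change check on revisits can be sketched as follows. This is a minimal illustration, not the modified Harvest code: the in-memory `store`, the record fields, the `summarize` callable and the TTL value are all assumptions.

```python
import hashlib

TTL_SECONDS = 4 * 7 * 24 * 3600  # assumed lifetime of a record, not from the paper

def refresh_or_summarize(store, url, content, summarize, now):
    """Summarize only new or changed documents; otherwise just extend the TTL.

    Returns True if the (hypothetical) summarize callable was run.
    """
    digest = hashlib.md5(content).hexdigest()
    record = store.get(url)
    if record is not None and record["md5"] == digest:
        record["expires"] = now + TTL_SECONDS  # unchanged: refresh TTL only
        return False
    store[url] = {
        "md5": digest,
        "summary": summarize(content),
        "expires": now + TTL_SECONDS,
    }
    return True
```

Note that the content must still be downloaded before its checksum can be compared, so this scheme saves summarizing work but not network traffic, as described in the text.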