19.11.2012 Views

Bull's Head and Mermaid - The Bernstein Project - Österreichische ...


Ill. 5: Bernstein database architecture

Architecture

After detailed discussions with the project partners, it was decided not to create a new large database holding copies of the existing databases; instead, searches should take place directly in the original databases. Various protocols were investigated for this approach, and “Search/Retrieval via URL” (SRU) was found to be an appropriate solution. To realise this protocol, an SRU-gateway was implemented that can be easily configured for each database. It was specified that this SRU-gateway should support MySQL and Microsoft Access databases (Ill. 5).

As shown in the figure above, the SRU-gateway can be installed either near the original database (dashed line) or directly on the Bernstein server (dash-dotted line).

SRU/SRW Protocol<br />

SRU (Search/Retrieval via URL) and SRW (Search/Retrieve Web Service) are standard search protocols for Internet search queries. They were developed and published by the United States Library of Congress (see http://www.loc.gov/standards/sru/). The actual requests use CQL (Contextual Query Language), a standard syntax for representing queries, and the results are returned in XML (eXtensible Markup Language).

Since August 2007, SRU version 1.2 has been the current standard version; it no longer distinguishes between the SRU and SRW protocols, but includes both.

The search queries are sent via HTTP (HyperText Transfer Protocol) using simple GET or POST requests, or enveloped in XML using SOAP (originally an abbreviation for Simple Object Access Protocol).

The following operations are specified in the SRU protocol:

• explain: provides a description of the facilities available on the SRU server

• scan: lists, in ascending order, the range of available terms at any given point

• searchRetrieve: allows a search to be submitted and matching records to be retrieved in a specific sort order
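As a rough illustration of the three operations sent as plain HTTP GET requests, the following Python sketch assembles SRU 1.2 request URLs. The gateway address and the index name `dc.title` are assumptions made for this example; the parameter names (`operation`, `version`, `query`, `scanClause`, `startRecord`, `maximumRecords`) are those defined by SRU 1.2.

```python
from urllib.parse import urlencode

# Hypothetical gateway address; the real endpoints are not given in the text.
BASE_URL = "http://example.org/sru/wzma"


def sru_url(operation, **params):
    """Build an SRU 1.2 GET URL for the given operation."""
    query = {"operation": operation, "version": "1.2"}
    query.update(params)
    # urlencode percent-encodes the CQL string so it is safe inside a URL.
    return BASE_URL + "?" + urlencode(query)


# explain: ask the server to describe itself.
explain_url = sru_url("explain")

# scan: list index terms around a given point (index name is an assumption).
scan_url = sru_url("scan", scanClause='dc.title = "vogel"')

# searchRetrieve: submit a CQL query and ask for the first ten records.
search_url = sru_url(
    "searchRetrieve",
    query='dc.title = "vogel"',
    startRecord=1,
    maximumRecords=10,
)

print(search_url)
```

Setting `maximumRecords=0` in such a request would typically return only the hit count without any records.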

CQL allows a search with logical operators (AND, OR, NOT) and numerical relations (=, <>, <, >, <=, >=), as well as an exact search and a search for substrings. Furthermore, the sort order of the results can be specified in advance.
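To make the query syntax more concrete, the sketch below composes a CQL string from individual clauses. The helper functions and the index names `dc.title` and `dc.date` are hypothetical and only illustrate the syntax; the `sortBy` clause with the `sort.ascending` modifier follows CQL 1.2.

```python
def cql_term(term):
    """Quote a search term if it contains whitespace or CQL special characters."""
    if any(ch in term for ch in ' ()=<>/"'):
        escaped = term.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    return term


def clause(index, relation, term):
    """Build a single 'index relation term' search clause."""
    return f"{index} {relation} {cql_term(term)}"


def combine(operator, left, right):
    """Join two clauses with a boolean operator (and, or, not)."""
    return f"({left}) {operator} ({right})"


# Index names are assumptions; a real server lists its indexes via "explain".
motif = clause("dc.title", "all", "vogel krone")
dated = clause("dc.date", ">=", "1450")

# Append a sort specification so results come back ordered by date.
query = combine("and", motif, dated) + " sortBy dc.date/sort.ascending"
print(query)
```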

The response to a search request is returned in XML. The default schema is “Dublin Core” with its fifteen elements. A response can either consist only of the number of hits, or include a specified number of matching records starting at a given position.
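The following Python sketch shows how a client might read the hit count and the Dublin Core fields from such a response. The sample XML is invented for illustration and is not output from any of the project databases; the namespaces are those of SRU 1.2 and Dublin Core.

```python
import xml.etree.ElementTree as ET

NS = {
    "srw": "http://www.loc.gov/zing/srw/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented sample response: three hits in total, one record returned.
SAMPLE_RESPONSE = """<srw:searchRetrieveResponse
    xmlns:srw="http://www.loc.gov/zing/srw/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <srw:version>1.2</srw:version>
  <srw:numberOfRecords>3</srw:numberOfRecords>
  <srw:records>
    <srw:record>
      <srw:recordSchema>info:srw/schema/1/dc-v1.1</srw:recordSchema>
      <srw:recordPacking>xml</srw:recordPacking>
      <srw:recordData>
        <dc:title>vogel krone</dc:title>
        <dc:identifier>example-1</dc:identifier>
      </srw:recordData>
      <srw:recordPosition>1</srw:recordPosition>
    </srw:record>
  </srw:records>
</srw:searchRetrieveResponse>
"""


def parse_response(xml_text):
    """Return (total hit count, list of records as {dc element: [values]})."""
    root = ET.fromstring(xml_text)
    total = int(root.findtext("srw:numberOfRecords", namespaces=NS))
    dc_prefix = "{" + NS["dc"] + "}"
    records = []
    for data in root.iterfind("srw:records/srw:record/srw:recordData", NS):
        fields = {}
        # Collect every Dublin Core element, however deeply it is wrapped.
        for elem in data.iter():
            if elem.tag.startswith(dc_prefix):
                name = elem.tag[len(dc_prefix):]
                fields.setdefault(name, []).append(elem.text)
        records.append(fields)
    return total, records


total, records = parse_response(SAMPLE_RESPONSE)
print(total, records)
```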

SRU Gateway<br />

The SRU-gateway was developed as a Java servlet and is therefore platform-independent. All the functions of SRU version 1.2 that are necessary for searches in watermark databases have been implemented. The technical requirements are a Java runtime environment (JRE ≥ 1.6_3), a servlet container (Apache Tomcat ≥ 5.5.23) and read access to the respective MySQL or Microsoft Access database.

The individual databases are configured by means of an XML file (“config.xml”) that contains the database access data and the mapping of all database fields. The following example shows an extract of the configuration file for the WZMA database:

bernstein_wzma_g.motif_long
bernstein_wzma_m.refnr_wm
bernstein_wzma_m.parA
bernstein_wzma_m.parH
bernstein_wzma_m.parW
bernstein_wzma_m.origin
bernstein_wzma_m.date_begin
bernstein_wzma_m.date_end
bernstein_wzma_m.source
bernstein_wzma_m.path_wm
bernstein_wzma_m.id

http://www.ksbm.oeaw.ac.at/imgjpg/%imgpath%%dcx.refnr%.jpg
http://www.ksbm.oeaw.ac.at/_scripts/php/loadWmarkImg.php?id=%id%

In the case of a “searchRetrieve” request, the transmitted field names are first mapped onto the field names of the specific database according to the configuration file, and the CQL is transformed into SQL. Then the individual items of the result set are transformed into XML according to the mapping (and the SRU protocol) and returned.
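The mapping and translation step can be sketched roughly as follows. This is not the gateway's actual code (which is a Java servlet) but a minimal Python illustration under stated assumptions: the portal-side index names (`motif`, `height`, `date_begin`) are hypothetical, the column names are taken from the WZMA configuration extract, and only simple clauses joined by `and` are handled.

```python
import re

# Hypothetical mapping from portal-side index names to database columns.
FIELD_MAP = {
    "motif": "bernstein_wzma_g.motif_long",
    "height": "bernstein_wzma_m.parH",
    "date_begin": "bernstein_wzma_m.date_begin",
}

# The CQL relations used here happen to be spelled the same way in SQL.
SQL_OPERATORS = {"=": "=", "<>": "<>", "<": "<", ">": ">", "<=": "<=", ">=": ">="}

# One clause: index, relation, then a quoted or a bare term.
CLAUSE = re.compile(r'^\s*([\w.]+)\s*(<=|>=|<>|=|<|>)\s*(?:"([^"]*)"|(\S+))\s*$')


def cql_to_where(cql):
    """Translate clauses joined by 'and' into a parameterised WHERE fragment.

    Simplification: quoted terms that themselves contain ' and ' are not
    supported, because the query is split on that keyword.
    """
    conditions, params = [], []
    for part in re.split(r"\s+and\s+", cql, flags=re.IGNORECASE):
        match = CLAUSE.match(part)
        if match is None:
            raise ValueError(f"unsupported clause: {part!r}")
        index, relation, quoted, bare = match.groups()
        if index not in FIELD_MAP:
            raise ValueError(f"unknown index: {index!r}")
        # Use placeholders so search terms never end up inside the SQL text.
        conditions.append(f"{FIELD_MAP[index]} {SQL_OPERATORS[relation]} ?")
        params.append(quoted if quoted is not None else bare)
    return " AND ".join(conditions), params


where, params = cql_to_where('motif = "vogel krone" and date_begin >= 1450')
print(where, params)
```

Keeping the search terms in a separate parameter list, rather than pasting them into the SQL string, guards against SQL injection through user-supplied queries.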

In the following example we assume that an advanced search for the motif “vogel krone” was executed via the Bernstein portal, which results in 107 hits in the Piccard-Online database:
