28.06.2013 Views

Papers in PDF format


4.6.1 Access times

To reduce access times there are two mechanisms currently used: mirroring and caching.

Mirroring is quite common with FTP archives. To make optimal use of the mirrors, ARCHIE
makes it possible to locate the closest copy of a file. However, the Web cannot be compared
directly to FTP because a) files tend to be much smaller, and b) it is not common to use a
search engine to locate just one file, and most Web users do not know or do not obey
netiquette. Thus, for best usage, the lookup of the closest copy must be automated somehow,
e.g. by resolving names depending on the location of the user.
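Location-dependent name resolution can be illustrated with a minimal sketch. It assumes that the user's location is approximated by the domain suffix of the client's host name, and that a table of mirrors per suffix exists. The table, the host names and the function `resolve_mirror` are hypothetical and serve only as illustration.

```python
# Hypothetical mirror table: client domain suffix -> mirror host.
# All host names are illustrative assumptions, not real servers.
MIRRORS = {
    "de": "ftp.mirror.example.de",
    "uk": "ftp.mirror.example.uk",
    "edu": "ftp.mirror.example.edu",
}
MASTER = "ftp.master.example.org"


def resolve_mirror(client_host, mirrors=MIRRORS, master=MASTER):
    """Return the mirror registered for the longest matching domain suffix."""
    labels = client_host.lower().rstrip(".").split(".")
    # Try the most specific suffix first: a.b.de, then b.de, then de.
    for start in range(len(labels)):
        suffix = ".".join(labels[start:])
        if suffix in mirrors:
            return mirrors[suffix]
    # No mirror is known for this client, so fall back to the master source.
    return master
```

A client named `www.uni-ulm.de` would be sent to the mirror registered for `de`, while a client without a matching suffix is sent to the master. A domain suffix is only a rough stand-in for network distance; a real service would need a better measure.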

Advantages and disadvantages of replication (mirroring) are as follows:

+ Mirroring guarantees fast access, even in poorly connected countries

+ The update frequency can (and should) be selected as required

- Where is the master source?

- Is the information up to date?

- Cache servers may be filled with copies of the same information

Caching is very helpful if some information is required by many people using one common
proxy/cache server. Thus caching works best if one server is used by a group with common
interests.

Advantages and disadvantages of caching are as follows:

+ Transparent to the user

+ Fine for teams and frequently requested information

+ Easy to implement

- Delays every request

- In practice usually shared by diverse groups rather than by a team with common interests

- Average hit rates of only 30%

- Slow/no access to information not in the cache
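The effect of common interests on the hit rate can be made concrete with a toy model. The sketch below is not a real proxy: the class `ProxyCache`, the function `simulate` and the request lists are hypothetical, and the cache never expires or evicts entries.

```python
class ProxyCache:
    """Toy in-memory cache standing in for a shared proxy/cache server."""

    def __init__(self, fetch):
        self.fetch = fetch    # function that retrieves a document from its origin
        self.store = {}       # URL -> cached document
        self.hits = 0
        self.requests = 0

    def get(self, url):
        """Return the document for url, fetching it only on a cache miss."""
        self.requests += 1
        if url in self.store:
            self.hits += 1
        else:
            self.store[url] = self.fetch(url)
        return self.store[url]

    def hit_rate(self):
        """Fraction of requests answered from the cache."""
        return self.hits / self.requests if self.requests else 0.0


def simulate(urls):
    """Replay a list of requested URLs and return the resulting hit rate."""
    cache = ProxyCache(fetch=lambda url: "content of " + url)
    for url in urls:
        cache.get(url)
    return cache.hit_rate()
```

Replaying `["/spec", "/spec", "/faq", "/spec", "/faq"]` (a team asking for the same few documents) gives a hit rate of 3 out of 5, while `["/a", "/b", "/c", "/d", "/a"]` (a diverse group) gives only 1 out of 5.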

4.6.2 Consistent view

Another problem with distributed data is offering a consistent view to the user. Distributed
databases provide good mechanisms, but they can be used only if very close cooperation exists
between the partners involved. Usually this will not be the case for organizations that would
like to combine their data for access through the Web.

Fortunately, there are mechanisms that allow common search interfaces without a distributed
database. Think of Lycos, which is, in the widest sense, a common query interface for (nearly)
all documents on the Web. However, these search engines are a) not focused on user interests,
and b) not capable of indexing databases, because a database presents an infinite query space.

The best way to deal with this problem is to generate a dynamic or static HTML tree from
your database. This is especially helpful if your data is structured in a simple way. For
example, if you have stored contact information for your employees, you may generate a company
phone book structured by departments. This will also allow full-text indexing, e.g. with WAIS.
An update is easy: just generate the structure again and delete the old one.
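As a minimal sketch of the phone book example, the following code turns employee records into a static HTML tree with one index page and one page per department. The record layout `(name, department, phone)`, the file naming scheme and the function `build_phone_book` are assumptions made for illustration. A real system would read the records from the database and write the pages to disk.

```python
import html
from collections import defaultdict


def build_phone_book(employees):
    """Return {relative_path: html_text}: an index page plus one page per department.

    employees is an iterable of (name, department, phone) tuples (assumed layout).
    """
    by_dept = defaultdict(list)
    for name, dept, phone in employees:
        by_dept[dept].append((name, phone))

    pages = {}
    links = []
    for dept in sorted(by_dept):
        # Hypothetical naming scheme: "Human Resources" -> "human_resources.html".
        path = dept.lower().replace(" ", "_") + ".html"
        rows = "\n".join(
            "<li>%s: %s</li>" % (html.escape(name), html.escape(phone))
            for name, phone in sorted(by_dept[dept])
        )
        pages[path] = "<h1>%s</h1>\n<ul>\n%s\n</ul>\n" % (html.escape(dept), rows)
        links.append('<li><a href="%s">%s</a></li>' % (path, html.escape(dept)))

    pages["index.html"] = "<h1>Phone book</h1>\n<ul>\n%s\n</ul>\n" % "\n".join(links)
    return pages
```

Because the whole tree is rebuilt from the records on every call, an update amounts to generating the pages again and replacing the old set. The resulting files are plain HTML, so a full-text indexer can process them like any other documents.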
