28.06.2013 Views

Papers in PDF format

between subsequent requests, and an understanding of server structure to recognise which items a user is likely to access based on their current position in the server. It may be possible to do this analysis in real time as each request is served, but it would impose a large overhead on already overloaded servers.

Authentication is often used by sites needing a fail-safe and universal way to track users through their servers. The advantages of this method are that it works with most browsers and that it identifies individual users on subsequent visits even if they connect from a different computer. However, it is extremely inconvenient, as it requires users first to register with the site and then to remember a username and password for all subsequent visits. Many users find that they generally cannot use the same username and password at all such sites, as many sites won't allow users their first choice of username or password. Furthermore, based on our own experience, we expect that many users who would otherwise look through the site may be dissuaded by the inconvenience of registering or of remembering their username. Thus, in general, we would not recommend this technique for sites that hope to attract a high volume of traffic and would not otherwise use authentication to control access.
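As an illustration of how an authenticating server can attribute each request to a registered user, here is a minimal sketch. It assumes the site uses HTTP Basic authentication, which the text above does not specify, and the function name username_from_authorization is hypothetical.

```python
import base64
from typing import Optional


def username_from_authorization(header_value: str) -> Optional[str]:
    """Return the username carried in a Basic 'Authorization' header value.

    Returns None if the scheme is not Basic or the credentials are malformed.
    Assumption: the site uses HTTP Basic authentication.
    """
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return None
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        # Not valid base64, or not valid UTF-8 once decoded.
        return None
    # Credentials have the form "username:password".
    username, separator, _password = decoded.partition(":")
    return username if separator else None
```

Because the browser resends the credentials with every request, the returned username can serve as the key for that user's click-trail, whichever computer they connect from.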

HTTP/1.0 Extensions

The Cookie mechanism for client-side stateful transactions in HTTP is an extension to the HTTP protocol proposed by Netscape Communications Corporation and implemented by the Netscape browser and several servers [Cookies 1995]. When a browser requests a resource from a server for the first time, the server responds with a cookie, which the browser stores and sends as part of each subsequent request. This allows the server to link requests from a particular browser into a click-trail. Cookies can be persistent, linking requests from one browsing session with requests from the previous one.
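The linking step described above can be sketched as follows. This is only an illustration: it assumes the server has already logged each request as a (timestamp, cookie identifier, path) triple, and both that record layout and the name click_trails are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical log record: (timestamp in seconds, cookie identifier, requested path)
Request = Tuple[float, str, str]


def click_trails(requests: Iterable[Request]) -> Dict[str, List[str]]:
    """Group requests by cookie identifier, each trail in time order."""
    trails: Dict[str, List[str]] = defaultdict(list)
    # Sorting the tuples orders the requests by timestamp first.
    for _timestamp, cookie_id, path in sorted(requests):
        trails[cookie_id].append(path)
    return dict(trails)
```

With a persistent cookie the same identifier reappears in later sessions, so the trail for that identifier simply continues across visits.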

The Keep-Alive extension to HTTP/1.0 allows several resources to be requested over a single connection. It is implemented by the Netscape browser and several servers. In principle, it allows several requests to be matched up as coming from one browser. In practice, however, browsers use it only for single pages, requesting the page itself and all its embedded objects over one connection. This limits the usefulness of the extension for following a browser between distinct pages.

In HTTP/1.1<br />

The HTTP/1.1 proposal introduces a new persistent connection architecture as the default connection type [Fielding et al. 1996]. This supersedes the Keep-Alive extension header described in 2.1.2. Any number of requests can be made on a single connection until either the server or the browser closes it. The specification does not make clear the circumstances under which connections should be closed or maintained and, as there are no widespread implementations of the protocol at the time of writing, it is difficult to say whether this new architecture will improve user tracking. It is likely, however, that this architecture will not help in matching click-trails whose requests are separated by hours or days.

Tracking at the Browser

The chief difficulty with server-side click-trail tracking using any of the mechanisms described in section 2.1 is that only requests that reach the server can be tracked. Browsers frequently cache pages and provide history mechanisms that allow navigation 'Back' to the previous page and 'Forward' to the next page in the history. The browsers, quite correctly, do not generate new requests for this navigation. A consequence, however, is that it is not possible to maintain an accurate position of a user within a site if the user has navigated using the history mechanism. This is a major problem: a study at the Georgia Institute of Technology analysed browsing strategies and determined that a total of 42.7% of navigation was through the history mechanism [Catledge et al. 1995].
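To make the gap between what the user sees and what the server sees concrete, here is a small simulation. It is a sketch under simplifying assumptions that are ours, not the paper's: every followed link produces a server request, and every Back or Forward step is served entirely from the browser's history. The name simulate_navigation and the action format are hypothetical.

```python
from typing import List, Tuple

# Hypothetical action format: ("goto", path), ("back", "") or ("forward", "")
Action = Tuple[str, str]


def simulate_navigation(actions: List[Action]) -> Tuple[List[str], List[str]]:
    """Return (pages the user viewed, pages the server saw requested)."""
    history: List[str] = []
    position = -1
    viewed: List[str] = []
    server_log: List[str] = []
    for kind, path in actions:
        if kind == "goto":
            # Following a link discards any forward history and hits the server.
            history = history[: position + 1] + [path]
            position += 1
            server_log.append(path)
        elif kind == "back" and position > 0:
            position -= 1  # served from history: no request is made
        elif kind == "forward" and position < len(history) - 1:
            position += 1  # served from history: no request is made
        else:
            continue  # navigation not possible, nothing new is displayed
        viewed.append(history[position])
    return viewed, server_log
```

For the sequence goto /a, goto /b, back, goto /c, back, forward, the user views six pages (/a, /b, /a, /c, /a, /c) while the server logs only three requests (/a, /b, /c), so a click-trail reconstructed from the log would misplace the user after each history step.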

An alternative to user tracking at the server side is to extend users' browsers to send usage information to interested parties whenever a new page is accessed. This can be implemented very easily with Mosaic's Common Client Interface (CCI) and a small helper application which connects to the CCI port of the browser and relays WWW movement information via TCP or multicast to interested parties. Both WebCast [Burns 1995] and FollowWWW [Brown et al. 1996] are applications which make use of this technique. An alternative would be a Netscape plug-in which monitors the actions of the user and sends movement information.
