
Papers in PDF format

Shame and War Revisited

Adding Semantic Markup to HTML

Philip Greenspun

Laboratory for Computer Science and Artificial Intelligence Laboratory

Massachusetts Institute of Technology

Abstract

"HTML represents the worst of two worlds. We could have taken a formatting language and added hypertext anchors so that users had beautifully designed documents on their desktops. We could have developed a powerful document structure language so that browsers could automatically do intelligent things with Web documents. What we have got with HTML is ugly documents without formatting or structural information." I wrote that in August 1994. In the intervening 18 months, style sheets have substantially enhanced HTML's formatting capabilities, but no progress has been made on the structure problem. I propose a class-based semantic markup system compatible with existing browsers and HTTP servers.

Introduction

"Owing to the neglect of our defences and the mishandling of the German problem in the last five years, we seem to be very near the bleak choice between War and Shame. My feeling is that we shall choose Shame, and then have War thrown in a little later, on even more adverse terms than at present."

Winston Churchill in a letter to Lord Moyne, 1938 [Gilbert 1991]

If you asked a naive user what the Web would do for them, they'd probably say "I could ask my computer to find me the cheapest pair of blue jeans being sold on the Internet and 10 seconds later, I'd be staring at a photo of the product and being asked to confirm the purchase. I'd see an announcement for a concert and click a button on my Web browser to add the date to my calendar; the information would get transferred automatically."

We computer scientists know that the Web doesn't actually work this way for naive users. Of course, armed with 20 years of Internet experience and the latest in equipment and software, we computer scientists go out into the Web and... fall into exactly the same morass. When we find a conference announcement, we can't click the mouse and watch entries show up in our electronic calendars. We will have to wait for computers to develop natural language understanding and common sense reasoning. That doesn't seem like such a long way off until one reflects that, given the ability to understand language and reason a bit, the computer could go to college for four years and come back capable of taking over our job.

Recently adopted HTML style sheets offer us a glimmer of hope on the formatting front. It may yet be possible to render a novel readably in HTML. However, style sheets can't fix all of HTML's formatting deficiencies and certainly don't accomplish anything on the semantic tagging front.
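
To make the semantic tagging gap concrete, here is a minimal sketch in Python of what a program could do if a concert announcement carried class-based markup. Everything specific in it is an assumption made for illustration: the class names `event-title` and `event-date`, the sample HTML, and the `extract_fields` helper are hypothetical and are not the scheme proposed in this paper.

```python
from html.parser import HTMLParser

# Hypothetical announcement. The class names are illustrative assumptions,
# not part of any standard or of the markup scheme discussed in this paper.
SAMPLE = (
    '<p>Join us for the <span class="event-title">Spring Concert</span> '
    'on <span class="event-date">1996-04-12</span> at 8 pm.</p>'
)


class ClassFieldParser(HTMLParser):
    """Collect the text of every <span> whose class is in `wanted`."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.fields = {}
        self._current = None  # class of the span currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            cls = dict(attrs).get("class")
            if cls in self.wanted:
                self._current = cls
                self.fields[cls] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.fields[self._current] += data

    def handle_endtag(self, tag):
        # Nested spans are not handled; this is only a sketch.
        if tag == "span":
            self._current = None


def extract_fields(html, wanted):
    """Return a dict mapping each wanted class name to its text."""
    parser = ClassFieldParser(wanted)
    parser.feed(html)
    parser.close()
    return parser.fields


fields = extract_fields(SAMPLE, ["event-title", "event-date"])
print(fields)
```

A calendar program could read the date straight out of `fields` without any natural language understanding. A browser that ignores the `class` attribute would still display the announcement as ordinary text.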

Fixing the formatting problem; frames are not the answer (but maybe style sheets are)

When a reader connects to a $100,000 Web server via a $10,000/month T3 line, his first thought is likely to be "Wow, this document looks almost as good as it would if it had been hastily printed out from a simple word processor." His second thought is likely to be "Wow, this document looks only almost as good as it would if it had been hastily printed out from a simple word processor."

Increased formatting capabilities are fundamentally beneficial. It is more efficient for one person to spend a few days formatting a document well than for 20 million users to each spend five minutes formatting a document badly. Yet the original Web model was the latter. Users would edit resource files on the Unix machines or dialog boxes on their Macintosh to choose the fonts, sizes, and colors that best suited their hardware and taste. Still, when they were all done, Travels with Samantha and the Bible ended up looking more or less the same.
