Beginning Microsoft SQL Server 2008 ... - S3 Tech Training

cdn.s3techtraining.com
17.06.2013 Views

Chapter 16: A Brief XML Primer

So, with all that said, in this chapter we’ll look at:

❑ What XML is
❑ What other technologies are closely tied to XML

I mentioned a bit ago that XML is usually not a good way to store data, but there are exceptions. One way that XML is being utilized for data storage is for archival purposes. XML compresses very well, and it is in a very open kind of format that will be well understood for many years to come, if not forever. Compare that to, say, just taking a SQL Server 2008 backup. A decade from now, when you need to restore some old data to review archival information, you may very well not have a SQL Server installation that can handle such an old backup file, but odds are very strong indeed that you’ll have something around that can both decompress (assuming you used a mainstream compression format such as ZIP) and read your data. Very handy for such “deep” archives.

XML Basics

There are tons and tons of books out there on XML (for example, Wrox’s Professional XML, by Evjen et al.). Given how full this book already is, my first inclination was to shy away from adding too much information about XML itself, and assume that you already knew something about XML. I have, however, come to realize that even all these years after XML hit the mainstream, I continue to know an awful lot of database people who think that XML “is just some Web technology,” and, therefore, have spent zero time on it. They couldn’t be more wrong.

XML is first and foremost an information technology. It is not a Web-specific technology at all. Instead, it just tends to be thought of that way (usually by people who don’t understand XML) for several reasons, such as:

❑ XML is a markup language, and looks a heck of a lot like HTML to the untrained eye.
❑ XML is often easily transformed into HTML. As such, it has become a popular way to keep the information part of a page, with a final transformation into HTML only on request; a separate transformation can take place based on criteria (such as what browser is asking for the information).
❑ One of the first widely used products to support XML was Microsoft’s Internet Explorer.
❑ The Internet is quite often used as a way to exchange information, and that’s something that XML is ideally suited for.

Like HTML, XML is a text-based markup language. Indeed, they are both derived from the same original language, called SGML. SGML has been around for much longer than the Internet (at least what we think of as the Internet today), and is most often used in the printing industry or in government-related documentation. Simply put, the “S” in SGML doesn’t stand for simple (for the curious, SGML stands for “Standard Generalized Markup Language”); SGML is anything but intuitive and is actually a downright pain to learn. (I can only read about 35 percent of SGML documents that I’ve seen. I have, however, been able to achieve a full 100 percent nausea rate when reading any SGML.) XML, on the other hand, tends to be reasonably easy to decipher.

So, this might have you asking the question: “Great, where can I get a listing of XML tags?” Well, you can’t, at least not in the sense that you’re thinking when you ask the question. XML has very few tags that are actually part of the language. Instead, it provides ways of defining your own tags and utilizing tags defined by others (such as the industry groups I mentioned earlier in the chapter). XML is largely about flexibility, which includes the ability for you to set your own rules for your XML through the use of either an XML schema document or the older Document Type Definition (DTD).

An XML document has very few rules placed on it just because it happens to be XML. The biggie is that it must be what is called well formed. We’ll look into what well formed means shortly. Now, just because an XML document meets the criteria of being well formed doesn’t mean that it would be classified as being valid. Valid XML must not only be well formed, but must also live up to any restrictions placed on the XML document by the XML schemas or DTDs that the document references. We will briefly examine DTDs and XML schemas later on in this chapter.
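
To make the well formed side of that distinction concrete, here is a minimal sketch in Python (not a language this chapter otherwise uses) that asks a parser whether a string is well formed. The helper name `is_well_formed` and both sample documents are made up for illustration. Note that the standard-library parser used here is non-validating: it checks well-formedness only and never checks a document against a schema or DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents, invented for illustration.
good_doc = "<Order><Item>Widget</Item></Order>"
bad_doc = "<Order><Item>Widget</Order></Item>"  # tags closed in the wrong order

print(is_well_formed(good_doc))  # True
print(is_well_formed(bad_doc))   # False
```

A document that passes this check could still be invalid; deciding that would take a validating parser plus the schema or DTD the document references.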

XML can also be transformed. The short rendition of what this means is that it is relatively easy for you to turn XML into a completely different XML representation or even a non-XML format. One of the most common uses for this is to transform XML into HTML for rendering on the Web. The need for this transformation presents us with our first mini-opportunity to compare and contrast HTML with XML. In the simplest terms, XML is about information, and HTML is about presentation.
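
As a rough illustration of the information-versus-presentation split, the sketch below turns a small XML document into an HTML list. It is hand-written Python, not a dedicated transformation language, and the `Customers`, `Customer`, and `Name` tags, the `id` attribute, and the function name are all assumptions invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical source data, invented for illustration.
SOURCE_XML = """
<Customers>
  <Customer id="1"><Name>Alice</Name></Customer>
  <Customer id="2"><Name>Bob</Name></Customer>
</Customers>
"""


def customers_to_html(xml_text: str) -> str:
    """Turn the <Customers> document into an HTML unordered list."""
    root = ET.fromstring(xml_text.strip())
    ul = ET.Element("ul")
    for customer in root.findall("Customer"):
        # One list item per customer: the name plus the id attribute.
        li = ET.SubElement(ul, "li")
        li.text = f"{customer.findtext('Name')} (#{customer.get('id')})"
    return ET.tostring(ul, encoding="unicode", method="html")


print(customers_to_html(SOURCE_XML))
```

The XML says nothing about how a customer should look on screen; the choice of a bulleted list lives entirely in the transformation, so a different transformation could present the same data another way.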

The information stored in XML is denoted through the use of what are called elements and attributes. Elements are usually created through the use of an opening and a closing tag (there’s an exception, but we’ll see that later) and are identified with a case-sensitive name (no spaces allowed). Attributes are items that further describe elements and are embedded in the element’s start tag. Attribute values must be in matched single or double quotes.
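
The rules in that paragraph can be poked at with a short Python sketch. The `Product` fragment, its attribute names, and the `parses` helper are hypothetical, chosen only to show one element carrying two attributes (one in double quotes, one in single quotes) and a child element.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment, invented for illustration.
doc = ET.fromstring(
    '<Product ProductID="42" color=\'red\'><Name>Widget</Name></Product>'
)

print(doc.tag)                # Product
print(doc.attrib)             # {'ProductID': '42', 'color': 'red'}
print(doc.find("Name").text)  # Widget
print(doc.find("name"))       # None: names are case sensitive


def parses(xml_text: str) -> bool:
    """Return True if the parser accepts xml_text, False if it rejects it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(parses('<Product id="42\'/>'))  # False: the quotes do not match
print(parses("<Name>Widget</name>"))  # False: closing tag differs in case
```

Both rejected strings break a rule from the paragraph above: the first opens an attribute value with a double quote and tries to close it with a single quote, and the second closes `Name` with `name`.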

Parts of an XML Document

Well, a few of the names have already flown by, but it makes sense, before we get too deep into things, to stop and create something of a glossary of terms that we’re going to be utilizing while talking about XML documents.

What we’re really going to be doing here is providing a listing of all the major parts of an XML document that you will run into, as shown in Figure 16-1. Many of the parts of the document are optional, though a few are not. In some cases, having one thing means that you have to have another. In other cases, the parts of the document are relatively independent of each other.

We will take things in something of a hierarchical approach (things that belong “inside” of something will be listed after whatever they belong inside of), and where it makes sense, in the order you’ll come across them in a given XML document.

The Document

The document encompasses everything from the very first character to the last. When we refer to an XML document, we are referring to both the structure and the content of that particular XML document.
