EXtensible Markup Language (XML) - Cultural View

culturalview.com
14.07.2013 Views

EXtensible Markup Language (XML) Visit the Cultural View of Technology XML Tutorial page for videos and exercises PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information. PDF generated at: Thu, 17 Jun 2010 01:47:38 UTC


Contents<br />

Articles<br />

Binary <strong>XML</strong> 1<br />

Business Process Definition Metamodel 2<br />

CDATA 3<br />

CDuce 6<br />

Character entity reference 7<br />

CodeSynthesis XSD 9<br />

D3L 10<br />

Darwin Information Typing Architecture 10<br />

DITA Open Toolkit 14<br />

Document Structure Description 15<br />

Document-Centric 16<br />

Document-centric <strong>XML</strong> processing 17<br />

Dynamic <strong>XML</strong> 18<br />

ECMAScript for <strong>XML</strong> 18<br />

Efficient <strong>XML</strong> Interchange 20<br />

Embedded RDF 21<br />

EpiDoc 21<br />

eXtensible Server Pages 23<br />

Fast Infoset 24<br />

Global listings format 26<br />

GMX 26<br />

GMX-V 27<br />

Head-Body Pattern 28<br />

HyTime 28<br />

Internationalization Tag Set 29<br />

Klip 32<br />

List of <strong>XML</strong> and HTML character entity references 33<br />

Log4js 44<br />

MAREC 46<br />

Media Object Server 47<br />

METS 47<br />

Numeric character reference 50<br />

Office Open <strong>XML</strong> 52<br />

Office Open <strong>XML</strong> file formats 61


OIO<strong>XML</strong> 70<br />

Open <strong>XML</strong> Paper Specification 71<br />

PCDATA 77<br />

Plain Old <strong>XML</strong> 78<br />

Portable Application Description 79<br />

Publishing Requirements for Industry Standard Metadata 80<br />

QName 82<br />

QTI 83<br />

Resource Description Framework 89<br />

Resources of a Resource 98<br />

Reverse Ajax 99<br />

Root element 100<br />

Schematron 101<br />

Simple Outline <strong>XML</strong> 103<br />

Simple <strong>XML</strong> 104<br />

Streaming <strong>XML</strong> 105<br />

Styled Layer Descriptor 105<br />

Topic (<strong>XML</strong>) 106<br />

Unique Particle Attribution 107<br />

VTD-<strong>XML</strong> 108<br />

X-expression 114<br />

XBRLS 114<br />

Xdos 116<br />

XDR Schema 116<br />

XEE (Starlight) 117<br />

XEP 118<br />

<strong>XML</strong> 119<br />

<strong>XML</strong> and MIME 132<br />

<strong>XML</strong> appliance 133<br />

<strong>XML</strong> Base 135<br />

<strong>XML</strong> Catalog 136<br />

<strong>XML</strong> Certification Program 138<br />

<strong>XML</strong> Configuration Access Protocol 143<br />

<strong>XML</strong> Control Protocol 144<br />

<strong>XML</strong> data binding 145<br />

<strong>XML</strong> database 146<br />

<strong>XML</strong> editor 150<br />

<strong>XML</strong> Enabled Directory 153


<strong>XML</strong> Encryption 154<br />

<strong>XML</strong> Events 154<br />

<strong>XML</strong> framework 156<br />

<strong>XML</strong> Literals 157<br />

<strong>XML</strong> namespace 157<br />

<strong>XML</strong> Pretty Printer 158<br />

<strong>XML</strong> Protocol 159<br />

<strong>XML</strong> schema 160<br />

<strong>XML</strong> Schema Editor 162<br />

<strong>XML</strong> Schema <strong>Language</strong> Comparison 165<br />

<strong>XML</strong> Studio 171<br />

<strong>XML</strong> Telemetric and Command Exchange 172<br />

<strong>XML</strong> template engine 174<br />

<strong>XML</strong> tree 177<br />

<strong>XML</strong> validation 177<br />

<strong>XML</strong>-Enabled Networking 178<br />

<strong>XML</strong>-Retrieval 180<br />

<strong>XML</strong>HttpRequest 182<br />

<strong>XML</strong>Socket 187<br />

XPath 188<br />

XPath 2.0 189<br />

Xs3p 192<br />

XSQL 193<br />

References<br />

Article Sources and Contributors 194<br />

Image Sources, Licenses and Contributors 198<br />

Article Licenses<br />

License 199


Binary <strong>XML</strong><br />

Binary <strong>XML</strong> refers to any specification which defines the compact representation of <strong>XML</strong> (Extensible <strong>Markup</strong><br />

<strong>Language</strong>) in a binary format. While there are several competing formats, none has been widely adopted by a<br />

standards organization or accepted as a de facto standard. Using a binary <strong>XML</strong> format generally reduces the<br />

verbosity of <strong>XML</strong> documents and the cost of parsing [1], but hinders the use of ordinary text editors and third-party tools<br />

to view and edit the document. Binary <strong>XML</strong> is typically used in applications where standard <strong>XML</strong> is not an option<br />

due to performance limitations, but the ability to convert the document to and from a form which is easily viewed<br />

and edited is valued. Other advantages may include enabling random access and indexing of <strong>XML</strong> documents.<br />

The major challenge for binary <strong>XML</strong> is to create a single, widely adopted standard. The International Organization<br />

for Standardization (ISO) and the International Telecommunication Union (ITU) published the Fast Infoset standard<br />

in 2007 and 2005, respectively. The World Wide Web Consortium (W3C) has produced the first draft of the EXI<br />

format specification. Another standard (ISO/IEC 23001-1), known as Binary MPEG format for <strong>XML</strong> (BiM), was<br />

standardized by the ISO in 2001. BiM is used by many ETSI standards for Digital TV and Mobile TV. The<br />

Open Geospatial Consortium also provides a Binary <strong>XML</strong> Encoding Specification (currently a Best Practice Paper)<br />

optimized for geo-related data (GML).<br />

Alternatives to binary <strong>XML</strong> include using traditional file compression methods on <strong>XML</strong> documents (for example<br />

gzip); or using an existing standard such as ASN.1. Traditional compression methods, however, offer only the<br />

advantage of compression, without the advantage of decreased parsing time or random access. ASN.1 is being used<br />

as the basis of Fast Infoset, which is one binary <strong>XML</strong> standard. There are also hybrid approaches (e.g., VTD-<strong>XML</strong>)<br />

that attach a small index file to an <strong>XML</strong> document to eliminate the overhead of parsing [2].<br />
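The trade-off of the compression alternative can be illustrated with a short Python sketch. It uses only the standard library, and the sample catalog document is invented for this example. The gzip output is smaller than the original text, but the consumer still has to decompress it and run an ordinary text parse, so the parsing cost is not reduced.

```python
import gzip
import xml.etree.ElementTree as ET

# Hypothetical sample document: 200 repetitive <item> elements.
items = "".join(f'<item id="{i}"><name>Item {i}</name></item>' for i in range(200))
xml_bytes = f"<catalog>{items}</catalog>".encode("utf-8")

# Traditional compression shrinks the serialized form...
compressed = gzip.compress(xml_bytes)

# ...but the consumer must still decompress and parse the full text.
root = ET.fromstring(gzip.decompress(compressed))

print(len(xml_bytes), len(compressed), len(root))
```

The exact byte counts depend on the input and the compression level, so none are quoted here.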

Adoption<br />

Projects and file formats which use binary <strong>XML</strong> include:<br />

• Fast Infoset, a standard published by ISO/IEC and ITU-T<br />

• Efficient <strong>XML</strong> from AgileDelta, Inc., selected as the basis for the W3C Standard for Binary <strong>XML</strong> (EXI)<br />

• Extensible Binary Meta <strong>Language</strong> (EBML) from Matroska<br />

• Wireless Binary <strong>XML</strong> (WB<strong>XML</strong>)<br />

Other projects that have functionality related to (or competing with) binary representations include:<br />

• VTD-<strong>XML</strong> from XimpleWare and VTD-<strong>XML</strong> project<br />

• BiM Standard, from the ISO, developed by the MPEG working group<br />

• Protocol Buffers from Google<br />

• Data Distribution Service from OMG<br />

References<br />

[1] The performance woe of binary <strong>XML</strong> http://webservices.sys-con.com/read/250512.htm<br />

[2] Index <strong>XML</strong> documents with VTD-<strong>XML</strong> (http://xml.sys-con.com/read/453082.htm)


Business Process Definition Metamodel<br />

The Business Process Definition Metamodel (BPDM) is a standard definition of concepts used to express business<br />

process models (a metamodel), adopted by the OMG (Object Management Group). Metamodels define concepts,<br />

relationships, and semantics for exchange of user models between different modeling tools. The exchange format is<br />

defined by XSD (<strong>XML</strong> Schema) and XMI (<strong>XML</strong> for Metadata Interchange), a specification for transformation of<br />

OMG metamodels to <strong>XML</strong>. Pursuant to the OMG's policies, the metamodel is the result of an open process<br />

involving submissions by member organizations, following a Request for Proposal [1] (RFP) issued in 2003. BPDM<br />

was adopted in initial form in July 2007, and finalized in July 2008.<br />

BPDM provides abstract concepts as the basis for consistent interpretation of specialized concepts used by business<br />

process modelers. For example, the ordering of many of the graphical elements in a BPMN (Business Process<br />

Modeling Notation) diagram is depicted by arrows between those elements, but the specific elements can have a<br />

variety of characteristics. For example, all BPMN events have some common characteristics, and a variety of<br />

specific events are designated by the type of circle and the icon in the circle. The abstract BPDM concepts ensure<br />

implementers of different modeling tools will associate the same characteristics and semantics with the modeling<br />

elements to ensure models are interpreted the same way when moved to a different tool. Users of the modeling tools<br />

do not need to be concerned with the abstractions; they only see the specialized elements.<br />

BPDM extends business process modeling beyond the elements defined by BPMN and BPEL to include interactions<br />

between otherwise-independent business processes executing in different business units or enterprises<br />

(choreography). A choreography can be specified independently of its participants, and used as a requirement for the<br />

specification of the orchestration implemented by a participant. BPDM provides for the binding of orchestration to<br />

choreography to ensure compatibility. Many current business process models focus on specification of executable<br />

business processes that execute within an enterprise (orchestration).<br />

The BPDM specification addresses the objectives of the OMG RFP [1] on which it is based:<br />

• BPDM "will define a set of abstract business process definition elements for specification of executable business<br />

processes that execute within an enterprise, and may collaborate between otherwise-independent business<br />

processes executing in different business units or enterprises."<br />

• A common metamodel to unify the diverse business process definition notations that exist in the industry, containing<br />

semantics compatible with leading business process modeling notations.<br />

• A metamodel that complements existing UML metamodels so that business process specifications can be part<br />

of complete system specifications to assure consistency and completeness.<br />

• The ability to integrate process models for workflow management processes, automated business processes, and<br />

collaborations between business units.<br />

• Support for the specification of web services choreography, describing the collaboration between participating<br />

entities and the ability to reconcile the choreography with supporting internal business processes.<br />

• The ability to exchange business process specifications between modeling tools, and between tools and execution<br />

environments using XMI.<br />

The RFP seeks to "improve communication between modelers, including between business and software modelers,<br />

provide flexible selection of tools and execution environments, and promote the development of more specialized<br />

tools for the analysis and design of processes."<br />

For exchange of business process models, BPDM is an alternative to the existing process interchange format XPDL<br />

(<strong>XML</strong> Process Definition <strong>Language</strong>) from the WfMC (Workflow Management Coalition). The two specifications<br />

are similar in that they can be used by process design tools to exchange business process definitions. They are<br />

different in that BPDM provides a specification of semantics integrated in a metamodel, and it includes additional<br />

modeling capabilities such as choreography, discussed above. In addition, XPDL has many implementations, though<br />

only some of them support XPDL 2.x, which is needed for interchanging BPMN. BPDM implementations are in preparation,<br />

including support for BPMN, and translation to XPDL.<br />

External links<br />

• BPDM Tutorial [2]<br />

• Design Rationale [3] (see Section 4, also Sections 7.6 and 7.9).<br />

• Other introductory presentations [4]<br />

• Web pages showing metamodels [5] in UML notation<br />

• Specification documents, in two parts:<br />

• Common Infrastructure [6] (see Section 4.4.1.1 for an overview of metamodeling).<br />

• Process Definition [7] .<br />

References<br />

[1] http://www.omg.org/cgi-bin/doc?bei/03-01-06<br />

[2] http://doc.omg.org/omg/08-06-32<br />

[3] http://doc.omg.org/bmi/08-09-07<br />

[4] http://www.conradbock.org/#BPDM<br />

[5] ftp://ftp.omg.org/pub/docs/dtc/08-05-11/pages/188c21b53f42002f.htm<br />

[6] http://doc.omg.org/dtc/08-05-07<br />

[7] http://doc.omg.org/dtc/08-05-10<br />

CDATA<br />

The term CDATA, meaning character data, is used for distinct, but related purposes in the markup languages<br />

SGML and <strong>XML</strong>. The term indicates that a certain portion of the document is general character data, rather than<br />

non-character data or character data with a more specific, limited structure.<br />

CDATA sections in <strong>XML</strong><br />

In an <strong>XML</strong> document or external parsed entity, a CDATA section is a section of element content that is marked for<br />

the parser to interpret as only character data, not markup. A CDATA section is merely an alternative syntax for<br />

expressing character data; there is no semantic difference between character data that manifests as a CDATA section<br />

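and character data that manifests in the usual syntax, in which "<" and "&" are represented by "&lt;" and "&amp;"<br />

respectively. A CDATA section starts with the sequence <![CDATA[ and ends with the first occurrence of ]]>. For example,<br />

if text is written like this:<br />

<![CDATA[<sender>John Smith</sender>]]><br />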

then the code is interpreted the same as if it had been written like this:<br />

&lt;sender&gt;John Smith&lt;/sender&gt;<br />

That is, the "sender" tags have exactly the same status as the text "John Smith": both are treated as text.<br />

Similarly, if the numeric character reference &#240; appears in element content, it will be interpreted as the single<br />

Unicode character 00F0 (small letter eth). But if the same appears in a CDATA section, it will be parsed as six<br />

characters: ampersand, hash mark, digit 2, digit 4, digit 0, semicolon.<br />
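The behaviour described above can be checked with a small Python sketch using the standard library's ElementTree parser. The wrapper element name msg is an assumption made for this example only.

```python
import xml.etree.ElementTree as ET

# Markup inside a CDATA section is treated as plain text, exactly as if
# it had been written with escaped angle brackets.
cdata = ET.fromstring("<msg><![CDATA[<sender>John Smith</sender>]]></msg>")
escaped = ET.fromstring("<msg>&lt;sender&gt;John Smith&lt;/sender&gt;</msg>")

# A numeric character reference is expanded in element content...
ref_in_content = ET.fromstring("<msg>&#240;</msg>")
# ...but stays as six literal characters inside a CDATA section.
ref_in_cdata = ET.fromstring("<msg><![CDATA[&#240;]]></msg>")

print(cdata.text == escaped.text)                          # True
print(len(ref_in_content.text), len(ref_in_cdata.text))    # 1 6
```

Note that the parsed tree keeps no record of which syntax was used; both documents yield the same text node.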

Uses of CDATA sections<br />

New authors of <strong>XML</strong> documents often misunderstand the purpose of a CDATA section, mistakenly believing that its<br />

purpose is to "protect" data from being treated as ordinary character data during processing. Some APIs for working<br />

with <strong>XML</strong> documents do offer options for independent access to CDATA sections, but such options exist above and<br />

beyond the normal requirements of <strong>XML</strong> processing systems, and still do not change the implicit meaning of the<br />

data. Character data is character data, regardless of whether it is expressed via a CDATA section or ordinary markup.<br />

CDATA sections are useful for writing <strong>XML</strong> code as text data within an <strong>XML</strong> document. For example, if one wishes<br />

to typeset a book with XSL explaining the use of an <strong>XML</strong> application, the <strong>XML</strong> markup to appear in the book itself<br />

will be written in the source file in a CDATA section. However, a CDATA section cannot contain the string "]]>"<br />

and therefore it is not possible for a CDATA section to contain nested CDATA sections. The preferred approach to<br />

using CDATA sections for encoding text that contains the triad "]]>" is to use multiple CDATA sections by splitting<br />

each occurrence of the triad just before the ">". For example, to encode "]]>" one would write:<br />

<![CDATA[]]]]><![CDATA[>]]><br />

This means that to encode "]]>" in the middle of a CDATA section, replace all occurrences of "]]>" with the<br />

following:<br />

]]]]><![CDATA[><br />

This effectively stops and restarts the CDATA section.<br />
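A minimal Python sketch of this splitting rule follows. The helper name wrap_cdata and the sample payload are hypothetical; the round trip through the standard library parser is only there to confirm that the original text is recovered.

```python
import xml.etree.ElementTree as ET


def wrap_cdata(text: str) -> str:
    """Wrap text in a CDATA section, splitting every "]]>" just before the ">"."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


# Hypothetical payload that happens to contain the forbidden triad.
payload = "if (a[b[0]]>c) { run(); }"
document = f"<code>{wrap_cdata(payload)}</code>"

# The parser joins the adjacent CDATA sections back into one text node.
parsed = ET.fromstring(document)
print(parsed.text == payload)  # True
```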

Use of CDATA in program output<br />

For generating <strong>XML</strong> "by hand", CDATA sections do not remove the need for escaping. The string ]]> (the CDATA<br />

end marker) must be escaped with a string such as ]]]]><![CDATA[>, which breaks the string across separate<br />

CDATA sections. An alternative to using CDATA sections which may be simpler in some circumstances is to<br />

escape the single characters & and < (normally using &amp; or &#38; and &lt; or &#60;). The different approaches<br />

produce equally valid <strong>XML</strong>, and most <strong>XML</strong> parsers will not preserve the distinctions between them in their output.<br />

CDATA sections in XHTML documents are liable to be parsed differently by web browsers if they render the<br />

document as HTML, since HTML parsers do not recognise the CDATA start and end markers, nor do they recognise<br />

HTML entity references such as &lt; within script elements. This can cause rendering problems in web browsers and<br />

can lead to cross-site scripting vulnerabilities if used to display data from untrusted sources, since the two kinds of<br />

parser will disagree on where the CDATA section ends.<br />

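Since it is useful to be able to use less-than signs (<) and ampersands (&) in inline scripts and styles without escaping<br />

them, the text of inline script and style elements in XHTML documents is commonly wrapped in CDATA markers. So that<br />

HTML parsers, which do not recognise the markers, can also read the document, the markers are usually commented out, as<br />

in this JavaScript example:<br />

<script type="text/javascript"><br />

//<![CDATA[<br />

...<br />

//]]><br />

</script><br />

or this CSS example:<br />

<style type="text/css"><br />

/*<![CDATA[*/<br />

...<br />

/*]]>*/<br />

</style><br />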

This technique is only necessary when using inline scripts and stylesheets, and is language-specific. CSS stylesheets,<br />

for example, only support the second style of commenting-out (/* ... */), but CSS also has less need for the < and &<br />

characters than JavaScript and so less need for explicit CDATA markers.<br />

CDATA in DTDs<br />

CDATA-type attribute value<br />

In Document Type Definition (DTD) files for SGML and <strong>XML</strong>, an attribute value may be designated as being of<br />

type CDATA: arbitrary character data. Within a CDATA-type attribute, character and entity reference markup is<br />

allowed and will be processed when the document is read.<br />

For example, if an <strong>XML</strong> DTD contains<br />

<!ATTLIST foo a CDATA #IMPLIED><br />

it means that elements named foo may optionally have an attribute named "a" which is of type CDATA. In an <strong>XML</strong><br />

document that is valid according to this DTD, an element like this might appear:<br />

<foo a="1 &amp; 2 are &lt; 3"/><br />

and an <strong>XML</strong> parser would interpret the "a" attribute's value as being the character data "1 & 2 are < 3".<br />
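The following Python sketch shows this interpretation with the standard library parser. The ELEMENT declaration is added here only to make the internal DTD subset complete; ElementTree does not validate, so the sketch demonstrates how references in the attribute value are expanded, not DTD validity checking.

```python
import xml.etree.ElementTree as ET

# Internal DTD subset declaring the optional CDATA-type attribute "a".
doc = """<!DOCTYPE foo [
  <!ELEMENT foo EMPTY>
  <!ATTLIST foo a CDATA #IMPLIED>
]>
<foo a="1 &amp; 2 are &lt; 3"/>"""

foo = ET.fromstring(doc)

# The entity references in the attribute value have been expanded.
print(foo.get("a"))  # 1 & 2 are < 3
```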

CDATA-type entity<br />

An SGML or <strong>XML</strong> DTD may also include entity declarations in which the token CDATA is used to indicate that the<br />

entity consists of character data. The character data may appear within the declaration itself or may be available<br />

externally, referenced by a URI. In either case, character reference and parameter entity reference markup is allowed<br />

in the entity, and will be processed as such when it is read.<br />

CDATA-type element content<br />

An SGML DTD may declare an element's content as being of type CDATA. Within a CDATA-type element, no<br />

markup will be processed. It is similar to a CDATA section in <strong>XML</strong>, but has no special boundary markup, as it<br />

applies to the entire element.


External links<br />

• CDATA Confusion [1]<br />

• Character Data and <strong>Markup</strong> (in <strong>XML</strong>) [2]<br />

References<br />

[1] http://www.flightlab.com/~joe/sgml/cdata.html<br />

[2] http://www.w3.org/TR/REC-xml/#syntax<br />

CDuce<br />

CDuce is an <strong>XML</strong>-oriented functional language that extends XDuce in several directions. It features <strong>XML</strong> regular<br />

expression types, <strong>XML</strong> regular expression patterns, and <strong>XML</strong> iterators. Strictly speaking, CDuce is not an <strong>XML</strong><br />

transformation language, since it can also be used for general-purpose programming.<br />

CDuce conforms to basic standards: Unicode, <strong>XML</strong>, DTD, and Namespaces are fully supported, while <strong>XML</strong> Schema is<br />

partially supported.<br />

Benefits of CDuce<br />

• static verifications (e.g.: ensure that a transformation produces a valid document);<br />

• in particular, we aim at smooth and safe compositions of <strong>XML</strong> transformations, and incremental programming;<br />

• static optimizations and efficient execution model (knowing the type of a document is crucial to extract<br />

information efficiently).<br />

Features particular to CDuce<br />

• <strong>XML</strong> objects can be manipulated as first-class citizen values: elements, sequences, tags, characters and strings,<br />

attribute sets; sequences of <strong>XML</strong> elements can be specified by regular expressions, which also apply to character<br />

strings;<br />

• functions themselves are first-class values: they can be manipulated, stored in data structures, returned by a<br />

function, and so on;<br />

• a powerful pattern matching operation can perform complex extractions from sequences of <strong>XML</strong> elements;<br />

• a rich type algebra, with recursive types and arbitrary boolean combinations (union, intersection, complement)<br />

allows precise definitions of data structures and <strong>XML</strong> types; general-purpose types and type constructors are<br />

taken seriously (products, extensible records, arbitrary precision integers with interval constraints, Unicode<br />

characters);<br />

• polymorphism through a natural notion of subtyping, and overloaded functions with dynamic dispatch;<br />

• a highly effective type-driven compilation scheme.<br />

External links<br />

• CDuce [1]<br />

References<br />

[1] http://www.cduce.org


Character entity reference<br />

In the markup languages SGML, HTML, XHTML and <strong>XML</strong>, a character entity reference is a reference to a<br />

particular kind of named entity that has been predefined or explicitly declared in a Document Type Definition<br />

(DTD). The "replacement text" of the entity consists of a single character from the Universal Character Set/Unicode.<br />

The purpose of a character entity reference is to provide a way to refer to a character that is not universally<br />

encodable.<br />

Although in popular usage character references are often called "entity references" or even "entities", this usage is<br />

incorrect. A character reference refers to a character, not to an entity, whereas an entity reference refers to the content of a<br />

named entity. An entity is declared with the <!ENTITY name "value"> syntax in a document type<br />

definition (DTD). The name defined in the entity declaration is then referenced in the<br />

<strong>XML</strong> document as &name;, which is called an entity reference.<br />

Concepts<br />

<strong>XML</strong> has two relevant concepts:<br />

Predefined entity<br />

A "predefined entity reference" is a reference to one of the special characters denoted by:<br />

Character coding<br />

entity character code (dec) meaning<br />

&quot; " x22 (34) (double) quotation mark<br />

&amp; & x26 (38) ampersand<br />

&apos; ' x27 (39) apostrophe (= apostrophe-quote)<br />

&lt; < x3C (60) less-than sign<br />

&gt; > x3E (62) greater-than sign<br />

A "character reference" is a construct such as &#xa0; or, equivalently, &#160; that refers to a character by means of its<br />

numeric Unicode code point. Here, the character code 160 (xA0 in hexadecimal) refers to the<br />

non-breaking space, which HTML also names &nbsp;.<br />
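The difference between the two concepts can be sketched in Python with the standard library. The element name t and the sample strings are assumptions for illustration. The five predefined entities and numeric character references are accepted without any declaration, while a named entity such as &nbsp; is rejected by an XML parser unless a DTD declares it.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The five predefined entities need no declaration in XML.
root = ET.fromstring("<t>&quot; &amp; &apos; &lt; &gt;</t>")
print(root.text)

# Decimal and hexadecimal character references name the same code point.
nbsp = ET.fromstring("<t>&#160;|&#xa0;</t>").text
print([hex(ord(ch)) for ch in nbsp])

# A named entity that is not predefined must be declared before use.
try:
    ET.fromstring("<t>&nbsp;</t>")
    nbsp_rejected = False
except ET.ParseError:
    nbsp_rejected = True
print(nbsp_rejected)

# Going the other way: escape() always handles &, < and >;
# quotation marks are escaped only if requested.
escaped = escape('a < b & "c"', {'"': "&quot;"})
print(escaped)
```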

See also<br />

• SGML entity<br />

• Character encodings in HTML<br />

• Numeric character reference<br />

• List of <strong>XML</strong> and HTML character entity references


External links<br />

• Entities Table [1]<br />

• A Simple Character Entity Chart [2]<br />

• A character entity chart with images for entities [3]<br />

• A Clear and Quick Reference to HTML Symbol Entities Codes [4]<br />

References<br />

[1] http://www.elizabethcastro.com/html/extras/entities.html<br />

[2] http://www.evolt.org/article/ala/17/21234/<br />

[3] http://www.escapecodes.info/<br />

[4] http://www.entitycode.com/


CodeSynthesis XSD<br />

Written in: C++<br />

Type: library or framework

CodeSynthesis XSD is an <strong>XML</strong> Data Binding compiler for C++ developed by Code Synthesis and dual-licensed<br />

under the GNU GPL and a proprietary license. Given an <strong>XML</strong> instance specification (<strong>XML</strong> Schema), it generates<br />

C++ classes that represent the given vocabulary as well as parsing and serialization code. It is supported on a large<br />

number of platforms, including AIX, GNU/Linux, HP-UX, Mac OS X, Solaris, Windows, HP OpenVMS, and IBM<br />

z/OS. Supported C++ compilers include GNU G++, Intel C++, HP aCC, Sun C++, IBM XL C++, and Microsoft<br />

Visual C++. A version for mobile and embedded systems, called CodeSynthesis XSD/e, is also available.<br />

One of the unique features of CodeSynthesis XSD is its support for two different <strong>XML</strong> Schema to C++ mappings:<br />

in-memory C++/Tree and stream-oriented C++/Parser. The C++/Tree mapping is a traditional mapping with a<br />

tree-like, in-memory data structure. C++/Parser is a new, SAX-like mapping which represents the information stored<br />

in <strong>XML</strong> instance documents as a hierarchy of vocabulary-specific parsing events. In comparison to C++/Tree, the<br />

C++/Parser mapping allows one to handle large <strong>XML</strong> documents that would not fit in memory, perform<br />

stream-oriented processing, or use an existing in-memory representation.<br />
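The contrast between an in-memory tree and stream-oriented processing can be sketched with Python's standard library. This is only an analogy: it does not use CodeSynthesis XSD or its generated C++ classes, and the people/person/name vocabulary is invented for the example.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical instance document with 1000 <person> records.
xml_text = "<people>" + "".join(
    f"<person><name>P{i}</name></person>" for i in range(1000)
) + "</people>"

# Tree style: the whole document is built in memory, then navigated.
tree_names = [p.findtext("name") for p in ET.fromstring(xml_text)]

# Stream style: handle each <person> when its end tag arrives, then discard it.
stream_names = []
source = io.BytesIO(xml_text.encode("utf-8"))
for event, elem in ET.iterparse(source, events=("end",)):
    if elem.tag == "person":
        stream_names.append(elem.findtext("name"))
        elem.clear()  # drop the record's children once it has been handled

print(tree_names == stream_names)  # True
```

Both styles extract the same data; the stream style trades convenient navigation for a smaller working set on large documents.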

CodeSynthesis XSD itself is written in C++ [1] .<br />

External links<br />

• CodeSynthesis XSD Home Page [2]<br />

• An Introduction to the C++/Tree Mapping [3]<br />

• An Introduction to the C++/Parser Mapping [4]<br />

• An Introduction to <strong>XML</strong> Data Binding in C++ [5]<br />

References<br />

[1] Bjarne Stroustrup. C++ applications (http://www.research.att.com/~bs/applications.html), 2007-05-25. Retrieved on 2007-06-18.<br />

[2] http://www.codesynthesis.com/products/xsd/<br />

[3] http://www.codesynthesis.com/projects/xsd/documentation/cxx/tree/guide/<br />

[4] http://www.codesynthesis.com/projects/xsd/documentation/cxx/parser/guide/<br />

[5] http://www.artima.com/cppsource/xml_data_binding.html


D3L<br />

D3L (Data Definition Description <strong>Language</strong>) is an <strong>XML</strong>-based message description language that describes the<br />

structure that an application's native, non-<strong>XML</strong> format message (known also as its native view) must follow to<br />

communicate. Currently used in Oracle Application Server InterConnect, D3L message description language is used<br />

to interact through several transport adapters, including FTP, HTTP(S), MQ Series, and SMTP.<br />

External links<br />

http://download-uk.oracle.com/docs/cd/B10465_01/integrate.904/b10404/appx_d3l.htm#620714<br />

Darwin Information Typing Architecture<br />

The Darwin Information Typing Architecture (DITA) is an <strong>XML</strong>-based architecture for authoring, producing,<br />

and delivering information. Although its main applications have so far been in technical publications, DITA is also<br />

used for other types of documents such as policies and procedures.<br />

Origin and name<br />

The DITA architecture and a related DTD and <strong>XML</strong> Schema were originally developed by IBM. The architecture<br />

incorporates ideas in <strong>XML</strong> architecture, such as modular information architecture, various features for content reuse,<br />

and specialization, that had been developed over previous decades. [1] DITA is now an OASIS standard.<br />

The first word in the name "Darwin Information Typing Architecture" is a reference to the naturalist Charles Darwin.<br />

The key concept of "specialization" in DITA is in some ways analogous to Darwin's concept of evolutionary<br />

adaptation, with a specialized element inheriting the properties of the base element from which it is specialized.<br />

Features and limitations<br />

Topic orientation<br />

DITA content is written as modular topics, as opposed to long "book-oriented" files. A DITA map contains links to<br />

topics, organized in the sequence (which may be hierarchical) in which they are intended to appear in finished<br />

documents. A DITA map defines the table of contents for deliverables. Relationship tables in DITA maps can also<br />

specify which topics link to each other.<br />

Modular topics can be easily reused in different deliverables. However, the strict topic-orientation of DITA makes it<br />

an awkward fit for content that contains lengthy narratives that do not lend themselves to being broken into small,<br />

standalone chunks. Experts stress the importance of content analysis in the early stages of implementing structured<br />

authoring. [2] [3] [4]


Content references<br />

Fragments of content within topics (or less commonly, the topics themselves) can be reused through the use of<br />

content references (conref), a transclusion mechanism.<br />

Conditional text<br />

Conditional text allows filtering or styling content based on attributes for audience, platform, product, and other<br />

properties.<br />
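A minimal Python sketch of attribute-based filtering follows. It assumes a space-separated audience attribute and an invented topic fragment; the helper filter_by_audience is hypothetical and much simpler than a real DITA processor, which would normally read its filtering rules from a separate configuration.

```python
import xml.etree.ElementTree as ET

# Hypothetical topic body with audience-specific paragraphs.
topic = ET.fromstring(
    "<body>"
    '<p audience="admin">Restart the service.</p>'
    '<p audience="user">Contact your administrator.</p>'
    "<p>Save your work first.</p>"
    "</body>"
)


def filter_by_audience(parent, audience):
    """Remove descendants whose audience attribute does not list `audience`."""
    for child in list(parent):
        values = child.get("audience")
        if values is not None and audience not in values.split():
            parent.remove(child)
        else:
            filter_by_audience(child, audience)


filter_by_audience(topic, "user")
print([p.text for p in topic])
```

Elements without the attribute are treated as unconditional and are always kept.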

Metadata<br />

DITA includes extensive metadata elements and attributes, which make topics easier to find.<br />

Information typing<br />

DITA specifies three basic topic types: Task, Concept and Reference. Each of the three basic topic types is a<br />

specialization of a generic Topic type, which contains a title element, a prolog element for metadata, and a body<br />

element. The body element contains paragraph, table, and list elements, similar to HTML.<br />

1. A Task topic is intended for a procedure that describes how to accomplish a task. A Task topic lists a series of<br />

steps that users follow to produce an intended outcome. The steps are contained in a taskbody element, which is a<br />

specialization of the generic body element. The steps element is a specialization of an ordered list element.<br />

2. A Concept topic is more objective, containing definitions, rules, and guidelines.<br />

3. A Reference topic describes command syntax, programming instructions, and other reference<br />

material, and usually contains detailed, factual content.<br />
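The shape of a Task topic can be sketched by building one with Python's standard library. The title, taskbody, and steps elements come from the description above; the step and cmd child elements and the sample content are assumptions added to make the example complete, and the output is not checked against the DITA DTDs.

```python
import xml.etree.ElementTree as ET

# A task is a specialized topic: title, then taskbody instead of body.
task = ET.Element("task", id="replace-filter")
ET.SubElement(task, "title").text = "Replacing the filter"
taskbody = ET.SubElement(task, "taskbody")

# steps plays the role of an ordered list; each step carries one command.
steps = ET.SubElement(taskbody, "steps")
for command in ("Switch off the unit.", "Remove the old filter.", "Insert the new filter."):
    step = ET.SubElement(steps, "step")
    ET.SubElement(step, "cmd").text = command

print(ET.tostring(task, encoding="unicode"))
```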

Specialization<br />

DITA allows adding new elements and attributes through specialization of base DITA elements and attributes.<br />

Through specialization, DITA can accommodate new topic types, element types, and attributes as needed for specific<br />

industries or companies. Specializations of DITA for specific industries, such as the semiconductor industry, are<br />

standardized through OASIS technical committees or subcommittees. A significant percentage of organizations<br />

using DITA also develop their own specializations.<br />

The extensibility of DITA permits organizations to specialize DITA by defining specific information structures and<br />

still use standard tools to work with them. The ability to define company-specific information architectures enables<br />

companies to use DITA to enrich content with metadata that is meaningful to them, and to enforce company-specific<br />

rules on document structure.<br />

Compatibility with non-DITA content<br />

The element types and structures in DITA topics are similar to popular languages such as HTML. For example, a<br />

bulleted or numbered list can be copied and pasted directly from HTML to DITA.<br />

DITA maps can include both DITA topics and non-DITA documents (such as HTML files and Microsoft Word<br />

documents) in document hierarchies. However, processors are generally limited in their ability to merge DITA and<br />

non-DITA content into consolidated printed documents.



Creating content in DITA<br />

DITA map and topic documents are <strong>XML</strong> files. As with HTML, any images, video files, or other files which need to<br />

appear in output are inserted via reference. Any <strong>XML</strong> editor can therefore be used to write DITA content, with the<br />

exception of editors that support only a limited set of <strong>XML</strong> schemas (such as XHTML editors). Various editing tools<br />

have been developed that provide specific features to support DITA, such as visualization of conrefs.<br />

Publishing content written in DITA<br />

DITA is conceived as an end-to-end architecture. In addition to indicating what elements, attributes, and rules are<br />

part of the DITA language, the DITA specification [5] includes rules for publishing DITA content in print, HTML,<br />

online Help, and other formats.<br />

For example, the DITA specification indicates that if the conref attribute of element A contains a path to element B,<br />

the contents of element B will be displayed in the location of element A. Applications that transform DITA content<br />

into other formats, and meet the DITA specification's requirements for interpreting DITA markup, are known as DITA<br />

processors. A DITA processor must handle the conref attribute according to the specified behaviour. Rules also exist<br />

for processing other rich features such as conditional text, index markers, and topic-to-topic links.<br />
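The conref behaviour described above can be sketched in Python. This is a simplified illustration, not the resolution algorithm of the DITA specification: it assumes that both elements live in the same document and that the conref value is just "#" followed by the target's id, whereas real conref paths can also point into other files. The sample markup and the `resolve_conrefs` helper are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical sample: the second paragraph reuses the first one.
SAMPLE = """<topic id="t1">
  <body>
    <p id="warning">Disconnect the <b>power</b> first.</p>
    <p conref="#warning"/>
  </body>
</topic>"""


def resolve_conrefs(root):
    """Fill each referencing element (A) with a copy of its target's (B) content."""
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}
    for el in list(root.iter()):
        ref = el.attrib.pop("conref", None)
        if ref is None:
            continue
        target = by_id[ref.lstrip("#")]
        el.text = target.text
        for child in list(el):
            el.remove(child)
        el.extend([copy.deepcopy(child) for child in target])
    return root


root = resolve_conrefs(ET.fromstring(SAMPLE))
texts = ["".join(p.itertext()) for p in root.iter("p")]
```

After resolution both paragraphs carry the same content, and the source paragraph is left untouched.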

DITA Open Toolkit<br />

When DITA was released as a public <strong>XML</strong> standard in 2001, IBM contributed the DITA Open Toolkit (DITA OT)<br />

to the wider community. The DITA OT was therefore the first DITA processor, and continues to be the foundation of<br />

most publishing of DITA content. It is currently an active open-source project, with contributions from several<br />

companies.<br />

Out of the box, the DITA OT handles all valid DITA specializations and produces several output formats, including:<br />

• PDF, through XSL-FO<br />

• XHTML<br />

• Microsoft Compiled HTML Help<br />

• Eclipse Help<br />

• Java Help<br />

• Oracle Help<br />

• Rich Text Format<br />

The DITA OT can also be extended to produce other (arbitrary) output formats. The raw DITA OT can be run from<br />

the command line. Some DITA authoring tools and content management systems now integrate the DITA OT, or<br />

parts of it, into their own publishing workflows. Standalone tools have also been developed to run the DITA OT via<br />

a graphical user interface instead of the command line.<br />

The DITA OT includes customizable stylesheets that control the formatting and layout of human-readable<br />

deliverables.



Brief history<br />

• March 2001 Introduction by IBM<br />

• May 2002 Domain specialization added to topic specialization<br />

• April 2004 OASIS [6] Technical Committee for DITA formed<br />

• February 2005 SourceForge [7] begins DITA Open Toolkit support<br />

• June 2005 DITA v1.0 approved as an OASIS standard<br />

• August 2005 DITA Open Toolkit v1.1 is released<br />

• March 2006 OASIS launches DITA.<strong>XML</strong>.org [8]<br />

• August 2007 DITA V1.1 is approved by OASIS, including Bookmap specialization<br />

See also<br />

• DocBook<br />

• S1000D<br />

• List of document markup languages<br />

• Comparison of document markup languages<br />

References<br />

• IBM's Introduction to DITA [9]<br />

• DITA Architectural Specification, v 1.1 [5]<br />

• DITA <strong>Language</strong> Specification, v 1.1 [10]<br />

Further reading<br />

• Priestley, Michael; Swope, Amber (2008) (PDF). The DITA Maturity Model Whitepaper [11] . IBM Corp and<br />

JustSystems.<br />

• Doyle, Bob (2008) (PDF). DITA Tools from A to Z [12] . Society for Technical Communication.<br />

External links<br />

• DITA <strong>XML</strong>.org community site [8]<br />

• DITA World [13] — Comprehensive list of DITA resources: articles, vendors, user groups and more<br />

• DITA Open Toolkit User Guide and Reference [14]<br />

• Roadmap for DITA Development [15] , OASIS DITA Technical Committee<br />

• DITA News [16] - aggregates DITA bloggers, has extensive resources, and DITA tools listing<br />

• RuDI: Ruby Utilities for DITA processing [17]



References<br />

[1] Doyle, Bob. "History of DITA" (http://dita.xml.org/book/history-of-dita). Retrieved 2009-07-31.<br />

[2] "Implementing DITA versus implementing custom <strong>XML</strong> architecture" (http://www.scriptorium.com/whitepapers/dita_assessment/dita_assessment4.html). Scriptorium Publishing Services, Inc. 2008. Retrieved 2009-07-29.<br />

[3] "Structure, DITA, and content other than technical documentation …" (http://rockley.com/blog/?p=22). The Rockley Group. October 16, 2007. Retrieved 2009-07-29.<br />

[4] "Survey on DITA Challenges" (http://writepoint.com/blog/?p=1011). WritePoint Ltd. January 18, 2010. Retrieved 2010-01-21.<br />

[5] http://docs.oasis-open.org/dita/v1.1/CS01/archspec/archspec.html<br />

[6] http://www.oasis-open.org<br />

[7] http://dita-ot.sourceforge.net<br />

[8] http://dita.xml.org<br />

[9] http://www.ibm.com/developerworks/xml/library/x-dita1/<br />

[10] http://docs.oasis-open.org/dita/v1.1/CS01/langspec/ditaref-type.html<br />

[11] http://na.justsystems.com/files/Whitepaper-DITA_MM.pdf<br />

[12] http://www.ditanews.com/tools/STC_Intercom.pdf<br />

[13] http://www.ditaworld.com<br />

[14] http://dita-ot.sourceforge.net/doc/ot-userguide/xhtml/<br />

[15] http://wiki.oasis-open.org/dita/Roadmap_for_DITA_development<br />

[16] http://www.ditanews.com<br />

[17] http://kenai.com/projects/rudi/pages/Home<br />

DITA Open Toolkit<br />

The DITA Open Toolkit is a free and open-source implementation of the OASIS DITA Technical Committee's<br />

specification for Darwin Information Typing Architecture (DITA) DTDs and Schemas. [1]<br />

The Toolkit transforms DITA content (topics and maps) into deliverable formats like web (XHTML), print (PDF),<br />

and online Help.<br />

The DITA Open Toolkit, or dita-ot for short, is a set of Ant- and Java-based, open source tools that provide a<br />

"reference implementation" for processing DITA maps and topical content to multiple output formats.<br />

It is a demonstration of DITA's capabilities for single source publishing, modularity, structured writing, information<br />

typing, inheritance, specialization, topic-based authoring, conditional processing, component publishing, task<br />

orientation, and content reuse.<br />

Several <strong>XML</strong> editors and <strong>XML</strong> content management systems integrate the DITA Open Toolkit into their products,<br />

including Oxygen <strong>XML</strong> Editor, XMetaL, and Syntext Serna.<br />

See also<br />

Darwin Information Typing Architecture<br />

Further reading<br />

• Linton, Jen and Bruski, Kylene (2006). Introduction to DITA: A Basic User Guide to the Darwin Information<br />

Typing Architecture [2] . Denver, CO: Comtech Services.<br />

External links<br />

• http://dita.xml.org<br />

• SourceForge page on the DITA OT [3]<br />

• Don Day's Resources Page for the DITA OT [4]<br />

• DITA Open Toolkit User Guide [5]



• Download page for the DITA OT [6]<br />

• DITA Users [7] - a member organization with workspace folders and online version of the DITA Open Toolkit<br />

• PHP debugging tools for the DITA OT [8]<br />

• DITA Infocenter [9] - DITA OT User Guide in online help format<br />

References<br />

[1] http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=dita<br />

[2] http://www.comtech-serv.com/dita2.shtml#book<br />

[3] http://sourceforge.net/projects/dita-ot/<br />

[4] http://www.ditaopentoolkit.org/<br />

[5] http://dita-ot.sourceforge.net/SourceForgeFiles/doc/user_guide.html<br />

[6] http://sourceforge.net/project/showfiles.php?group_id=132728<br />

[7] http://www.ditausers.org<br />

[8] http://www.vrcommunications.com/Code/ditaotug131-18042007-tools.zip<br />

[9] http://www.ditainfocenter.com<br />

Document Structure Description<br />

Document Structure Description, or DSD, is a schema language for <strong>XML</strong>, that is, a language for describing valid<br />

<strong>XML</strong> documents. It's an alternative to DTD or the W3C <strong>XML</strong> Schema.<br />

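An example of DSD in its simplest form:<br />

<dsd xmlns="http://www.brics.dk/DSD/2.0" xmlns:my="http://example.com"><br />

  <if><element name="my:foo"/><br />

    <declare><br />

      <attribute name="first"/><br />

      <attribute name="second"/><br />

      <contents><br />

        <element name="my:bar"/><br />

      </contents><br />

    </declare><br />

  </if><br />

  <if><element name="my:bar"/><br />

    <declare></declare><br />

  </if><br />

</dsd><br />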

This says that an element named "foo" in the <strong>XML</strong> namespace "http://example.com" may have two attributes, named<br />

"first" and "second". A "foo" element may not have any character data. It must contain one subelement, named "bar",<br />

also in the "http://example.com" namespace. A "bar" element is not allowed any attributes, character data or<br />

subelements.
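The rules stated above can also be checked by hand. The Python sketch below is not a DSD processor; it only hard-codes the constraints from the description using xml.etree.ElementTree. The `check_foo` and `has_text` helpers are hypothetical names, and treating whitespace-only text as "no character data" is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

NS = "http://example.com"
FOO, BAR = f"{{{NS}}}foo", f"{{{NS}}}bar"


def has_text(element):
    """True if the element holds non-whitespace text directly inside it."""
    chunks = [element.text] + [child.tail for child in element]
    return any(chunk and chunk.strip() for chunk in chunks)


def check_foo(xml_text):
    """Check a document against the rules described above; return a list of problems."""
    root = ET.fromstring(xml_text)
    if root.tag != FOO:
        return [f"root element is not foo in namespace {NS}"]
    problems = []
    extra = set(root.attrib) - {"first", "second"}
    if extra:
        problems.append(f"unexpected attributes on foo: {sorted(extra)}")
    if has_text(root):
        problems.append("foo contains character data")
    children = list(root)
    if len(children) != 1 or children[0].tag != BAR:
        problems.append("foo must contain exactly one bar subelement")
    for bar in root.iter(BAR):
        if bar.attrib or len(bar) or has_text(bar):
            problems.append("bar must be empty and have no attributes")
    return problems
```

An empty list means that the document satisfies every rule; each violated rule contributes one message.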



One <strong>XML</strong> document that would be valid according to the above DSD would be:<br />

<foo second="2" xmlns="http://example.com"><br />

  <bar/><br />

</foo><br />

Current Software store<br />

• Prototype Java Processor [1] from BRICS<br />

External links<br />

• DSD home page [2]<br />

• Full DSD specification [3]<br />

• Comparison of DTD, W3C <strong>XML</strong> Schema, and DSD [4]<br />

References<br />

[1] http://www.brics.dk/DSD/dsd2<br />

[2] http://www.brics.dk/DSD/<br />

[3] http://www.brics.dk/DSD/dsd2.html<br />

[4] http://www.brics.dk/~amoeller/<strong>XML</strong>/schemas/<br />

Document-Centric<br />

Document-centric <strong>XML</strong> processing is a notion first introduced in VTD-<strong>XML</strong>. Before VTD-<strong>XML</strong>, traditional<br />

<strong>XML</strong> processing models (e.g. DOM, SAX and JAXB) were designed around the notion of objects. The <strong>XML</strong> text,<br />

merely as the serialization of the objects, is relegated to the status of a second-class citizen. You base your<br />

applications on DOM nodes, strings and various business objects, but rarely on the physical documents. If you have<br />

followed my articles on DevX so far, it should quickly become obvious that this object-oriented approach to <strong>XML</strong><br />

processing makes little sense because of the performance hits from virtually all directions. Not only are object<br />

creation and garbage collection inherently memory and CPU inefficient, but your applications incur the cost of<br />

re-serialization with even the smallest changes to the original text.<br />

With document-centric <strong>XML</strong> processing, the <strong>XML</strong> document (the persistent format of data) is the starting point from<br />

which everything else comes about. Whether it is parsing, XPath evaluation, modifying content, or slicing element<br />

fragments, by default you no longer work directly with objects. You only do that when it makes sense. More often<br />

than not, you treat documents purely as syntax, and think in bytes, byte arrays, integers, offsets, lengths, fragments<br />

and namespace-compensated fragments. The first-class citizen in this paradigm is the <strong>XML</strong> text. And the<br />

object-centric notions of <strong>XML</strong> processing, such as serialization and de-serialization (or marshalling and<br />

unmarshalling) are often displaced, if not replaced, by more document-centric notions of parsing and composition.<br />

Increasingly you will find that your <strong>XML</strong> programming experience is getting simpler. And not surprisingly, the<br />

simpler, intuitive way to think about <strong>XML</strong> processing is also the most efficient and powerful.



Document-centric <strong>XML</strong> processing<br />

Document-centric <strong>XML</strong> processing is one of two conceptual approaches to processing <strong>XML</strong> content, along with<br />

Data-centric <strong>XML</strong> processing. Although there is no universally accepted definition of the term, the following articles<br />

discuss features typically associated with this approach:<br />

• Data-centric vs Document-centric <strong>XML</strong> [1]<br />

• Text-centric vs data-centric <strong>XML</strong> retrieval [2]<br />

Applications based on Document-centric Approach<br />

VTD-<strong>XML</strong><br />

Before VTD-<strong>XML</strong>, traditional <strong>XML</strong> processing models (e.g. DOM, SAX and JAXB) were designed around the<br />

notion of objects. The <strong>XML</strong> text, merely as the serialization of the objects, is relegated to the status of a second-class<br />

citizen. Applications are based on DOM nodes, strings and various business objects, but rarely on the physical<br />

documents. This object-oriented approach to <strong>XML</strong> processing has serious issues because of the performance hits<br />

from virtually all directions. Not only are object creation and garbage collection inherently memory and CPU<br />

inefficient, but applications incur the cost of re-serialization with even the smallest changes to the original text.<br />

With document-centric <strong>XML</strong> processing, the <strong>XML</strong> document (the persistent format of data) is the starting point from<br />

which everything else comes about. Whether it is parsing, XPath evaluation, modifying content, or slicing element<br />

fragments, by default you no longer work directly with objects. You only do that when it makes sense. More often<br />

than not, you treat documents purely as syntax, and think in bytes, byte arrays, integers, offsets, lengths, fragments<br />

and namespace-compensated fragments. The first-class citizen in this paradigm is the <strong>XML</strong> text. And the<br />

object-centric notions of <strong>XML</strong> processing, such as serialization and de-serialization (or marshalling and<br />

unmarshalling) are often displaced, if not replaced, by more document-centric notions of parsing and composition.<br />
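To make the idea of working with offsets and lengths concrete, here is a minimal Python sketch that changes one value in a document by splicing bytes instead of building objects and re-serializing them. It does not use or imitate the VTD-XML API; the `locate_text` and `splice` helpers and the sample document are hypothetical, and the naive scan assumes an element without attributes or nested markup.

```python
def locate_text(doc: bytes, tag: bytes):
    """Return (offset, length) of the text inside the first <tag>...</tag> pair.

    Naive scan for illustration only: assumes the element has no attributes
    and no nested markup.
    """
    start_tag = b"<" + tag + b">"
    end_tag = b"</" + tag + b">"
    start = doc.index(start_tag) + len(start_tag)
    end = doc.index(end_tag, start)
    return start, end - start


def splice(doc: bytes, offset: int, length: int, new: bytes) -> bytes:
    """Replace `length` bytes at `offset`; every other byte is copied unchanged."""
    return doc[:offset] + new + doc[offset + length:]


doc = b'<?xml version="1.0"?>\n<order>\n  <!-- kept as is -->\n  <qty>3</qty>\n</order>\n'
offset, length = locate_text(doc, b"qty")
updated = splice(doc, offset, length, b"12")
```

Because only the located bytes are replaced, the declaration, comment, and whitespace of the original text survive unchanged, which is the property the document-centric view emphasizes.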

References<br />

[1] http://techessence.info/node/51<br />

[2] http://nlp.stanford.edu/IR-book/html/htmledition/text-centric-vs-data-centric-xml-retrieval-1.html



Dynamic <strong>XML</strong><br />

Dynamic <strong>XML</strong> means dynamic data that is in an <strong>XML</strong> format.<br />

The term is also commonly used for information that is extracted from a database (commonly a<br />

relational database) and placed into <strong>XML</strong> format. This is a different case, as it does not involve<br />

any updates to the data, which is in fact static. In this context the word "dynamic" takes the alternative<br />

meaning of "automated", in the sense that something performed dynamically is carried out without manual effort.<br />

ECMAScript for <strong>XML</strong><br />

ECMAScript for <strong>XML</strong> (E4X) is a programming language extension that adds native <strong>XML</strong> support to ECMAScript<br />

(which includes ActionScript, DMDScript, JavaScript, and JScript). The goal is to provide an alternative to DOM<br />

interfaces that uses a simpler syntax for accessing <strong>XML</strong> documents. It also offers a new way of making <strong>XML</strong><br />

visible. Before the release of E4X, <strong>XML</strong> was always accessed at an object level. E4X instead treats <strong>XML</strong> as a<br />

primitive (like characters, integers, and booleans). This implies faster access, better support, and acceptance as a<br />

building block (data structure) of a program.<br />

E4X is standardized by Ecma International in the ECMA-357 standard [1] . The first edition was published in June<br />

2004, the second edition in December 2005.<br />

Browser support<br />

E4X is currently supported by Mozilla's Rhino, used in OpenOffice.org and several other projects, and<br />

SpiderMonkey, used in Firefox, Thunderbird, and other XUL-based applications. It is also supported by Tamarin, the<br />

JavaScript engine used in the Flash virtual machine. It is not currently supported by Nitro (Safari), V8 (Google<br />

Chrome), or Internet Explorer.[2]<br />

Example<br />

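var sales = <sales vendor="John"><br />

    <item type="peas" price="4" quantity="6"/><br />

    <item type="carrot" price="3" quantity="10"/><br />

    <item type="chips" price="5" quantity="3"/><br />

  </sales>;<br />

alert( sales.item.(@type == "carrot").@quantity );<br />

alert( sales.@vendor );<br />

for each( var price in sales..@price ) {<br />

  alert( price );<br />

}<br />

delete sales.item[0];<br />

sales.item += <item type="oranges" price="4"/>;<br />

sales.item.(@type == "oranges").@quantity = 4;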
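For comparison, the same kinds of operations can be written against a conventional object API. The sketch below uses Python's xml.etree.ElementTree as a stand-in for such an API; it is not E4X, and the sample data is assumed for this sketch. The comments note the E4X expression each statement corresponds to.

```python
import xml.etree.ElementTree as ET

# Sample data assumed for this sketch.
sales = ET.fromstring(
    '<sales vendor="John">'
    '<item type="peas" price="4" quantity="6"/>'
    '<item type="carrot" price="3" quantity="10"/>'
    '<item type="chips" price="5" quantity="3"/>'
    "</sales>"
)

# E4X: sales.item.(@type == "carrot").@quantity
carrot_quantity = sales.find("item[@type='carrot']").get("quantity")

# E4X: sales.@vendor
vendor = sales.get("vendor")

# E4X: sales..@price (every price attribute in the tree)
prices = [el.get("price") for el in sales.iter() if el.get("price") is not None]

# E4X: delete sales.item[0]
sales.remove(sales.findall("item")[0])

# E4X: sales.item += <item .../> followed by an attribute assignment
ET.SubElement(sales, "item", type="oranges", price="4")
sales.find("item[@type='oranges']").set("quantity", "4")

types = [item.get("type") for item in sales.findall("item")]
```

The object API needs explicit method calls and path strings for each step, where E4X expresses the same queries as language-level syntax.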



Implementations<br />

The first implementation of E4X was designed by Terry Lucas and John Schneider and appeared in BEA's Weblogic<br />

Workshop 7.0 released in February 2002. BEA's implementation was based on Rhino and released before the<br />

ECMAScript E4X spec was completed in June 2004. John Schneider wrote an article [3] on the <strong>XML</strong> extensions in<br />

BEA's Workshop at the time.<br />

• E4X is implemented in SpiderMonkey (Gecko's JavaScript engine) since version 1.6.0 [4] and in Rhino (Mozilla's<br />

other JavaScript engine written in Java instead of C) since version 1.6R1 [5] .<br />

• As Mozilla Firefox is based on Gecko, it can be used to run scripts using E4X. The specification is supported in<br />

the 1.5 release or later.<br />

• Adobe's ActionScript 3 scripting language fully supports E4X. Early previews of ActionScript 3 were first made<br />

available in late 2005. Adobe officially released the language with Flash Player 9 on June 28, 2006.<br />

• E4X is available in Flash CS3, Adobe AIR and Adobe Flex as they use ActionScript 3 as a scripting language.<br />

• E4X is also available in Adobe Acrobat and Adobe Reader versions 8.0 or higher.<br />

• E4X is also available in Aptana's Jaxer Ajax application server which uses the Mozilla engine server-side.<br />

• Since the release of Alfresco Community Edition 2.9B, E4X is also available in this enterprise document<br />

management system.<br />

External links<br />

• ECMA-357 standard [1]<br />

• E4X at faqts.com [6]<br />

• Slides from 2005 E4X Presentation by Brendan Eich, Mozilla Chief Architect [7]<br />

• E4X at Mozilla Developer Center [8]<br />

• Introducing E4X at xml.com [9] : compares E4X and JSON<br />

• Processing <strong>XML</strong> with E4X [10] at Mozilla Developer Center<br />

• Tutorial from W3 Schools [11]<br />

• E4X: Beginner to Advanced [12] at Yahoo Developer Network<br />

References<br />

[1] http://www.ecma-international.org/publications/standards/Ecma-357.htm<br />

[2] http://code.google.com/p/chromium/issues/detail?id=30975<br />

[3] http://web.archive.org/web/20080403052807/http://dev2dev.bea.com/pub/a/2002/09/JSchneider_<strong>XML</strong>.html<br />

[4] SpiderMonkey 1.6.0 release notes (http://www.mozilla.org/js/spidermonkey/release-notes/JS_160.html)<br />

[5] Rhino 1.6R1 Change log (http://www.mozilla.org/rhino/rhino16R1.html)<br />

[6] http://www.faqts.com/knowledge_base/index.phtml/fid/1762<br />

[7] https://developer.mozilla.org/presentations/xtech2005/e4x/<br />

[8] https://developer.mozilla.org/en/docs/E4X<br />

[9] http://www.xml.com/pub/a/2007/11/28/introducing-e4x.html<br />

[10] https://developer.mozilla.org/index.php?title=En/Core_JavaScript_1.5_Guide/Processing_<strong>XML</strong>_with_E4X<br />

[11] http://www.w3schools.com/e4x/default.asp<br />

[12] http://developer.yahoo.com/flash/articles/e4x-beginner-to-advanced.html



Efficient <strong>XML</strong> Interchange<br />

Efficient <strong>XML</strong> Interchange (EXI) is a proposed data format from the Efficient <strong>XML</strong> Interchange Working Group<br />

of the World Wide Web Consortium (W3C). It is one of the various efforts to encode <strong>XML</strong> documents in a binary<br />

data format, rather than plain text.<br />

Using a binary <strong>XML</strong> format generally reduces the verbosity of <strong>XML</strong> documents, and may reduce the cost of parsing.<br />

Performance of writing (generating) content is usually not similarly improved, although this depends on actual<br />

binary representation used.<br />

The EXI format is derived from the AgileDelta Efficient <strong>XML</strong> format [1] .<br />

See also<br />

• Binary <strong>XML</strong><br />

• Fast Infoset<br />

External links<br />

• Efficient <strong>XML</strong> Interchange Format 1.0 (Candidate Recommendation) [2]<br />

• Efficient <strong>XML</strong> Interchange Working Group home page [3]<br />

• EXIficient - Open Source implementation of the EXI Format 1.0 [4]<br />

• W3C binary <strong>XML</strong> requirements [5]<br />

References<br />

[1] "Lightning-Fast Delivery of <strong>XML</strong> to More Devices in More Locations" (http://www.agiledelta.com/product_efx.html). AgileDelta.<br />

2007-05-08. Retrieved 2007-07-17.<br />

[2] http://www.w3.org/TR/exi/<br />

[3] http://www.w3.org/<strong>XML</strong>/EXI/<br />

[4] http://exificient.sourceforge.net/<br />

[5] http://www.w3.org/TR/2005/NOTE-xbc-characterization-20050331/



Embedded RDF<br />

Embedded RDF (eRDF) is a syntax for writing HTML in such a way that the information in the HTML document<br />

can be extracted (with an eRDF parser or XSLT stylesheet) into Resource Description Framework.<br />

It was invented by Ian Davis in 2005, and partly inspired by microformats, a simplified approach to semantically<br />

annotate data in websites. [1]<br />

See also<br />

• RDFa, W3C's approach at embedding RDF<br />

• GRDDL, a way to extract (annotated) data out of XHTML and <strong>XML</strong> documents and transform it into an RDF<br />

graph<br />

• Microdata (HTML5), a proposed feature of HTML5 that improves on the capabilities of microformats<br />

External links<br />

• eRDF [2]<br />

References<br />

[1] Ian Davis (http://iandavis.com/)<br />

[2] http://research.talis.com/2005/erdf/wiki/Main/RdfInHtml<br />

EpiDoc<br />

The EpiDoc Collaborative [1] , building recommendations for structured markup of epigraphic documents in TEI<br />

<strong>XML</strong>, was originally formed in 2000 by scholars at the University of North Carolina at Chapel Hill: Tom Elliott, the<br />

former director of the Ancient World Mapping Center, with Hugh Cayless and Amy Hawkins. The guidelines have<br />

matured considerably through extensive discussion on the <strong>Markup</strong> list [2] and other discussion fora, at several<br />

conferences, and through the experience of various pilot projects. The first major—but not by any means the<br />

only—epigraphic project to adopt and pilot the EpiDoc recommendations has been the Inscriptions of Aphrodisias,<br />

and the guidelines have reached a degree of stability for the first time during this process.<br />

The EpiDoc schema and guidelines may also be applied, perhaps with some local modification, to related<br />

palaeographical fields including Papyrology (projects in progress), Sigillography, and Numismatics.<br />

Guidelines<br />

The EpiDoc Guidelines are available in two forms:<br />

1. the stable guidelines, released periodically and available at: http://www.stoa.org/epidoc/gl/ (current version: 5 [3])<br />

2. the source code, available in its most up-to-date form in the CVS repository at SourceForge [4] ; the GL source<br />

files are a series of <strong>XML</strong> documents



Tools<br />

Tools developed by and for the EpiDoc community include:<br />

• The EpiDoc webapp, available from the SourceForge [4] CVS repository (the same application is used to deliver<br />

the guidelines).<br />

• The EpiDoc Crosswalker, a tool to transform data in both directions between EpiDoc and other encoding<br />

schemes, markup schemas, and databases. (In progress.)<br />

• CHET-C (the Chapel Hill Electronic Text-Converter), an application originally written in VBA, then as a<br />

free-standing Java app, and now available as a self-contained JavaScript platform written by Hugh Cayless. [5] (A<br />

Python and XSLT version of CHET-C is under construction as part of the IDP project.)<br />

• Transcoder: a Java tool for converting between Beta Code, Unicode NF C, Unicode NF D, and GreekKeys<br />

encoding for Greek script on the fly (download link to follow).<br />

Projects<br />

• Concordia [6] , King's College London and New York University<br />

• Inscriptions of Aphrodisias [7] , King's College London, UK<br />

• Inscriptions of Roman Cyrenaica [7] , KCL<br />

• Integrating Digital Papyrology (Duke University, Columbia University, Heidelberg University, King's College<br />

London), see now http://papyri.info/<br />

• US Epigraphy Project [8] , Brown University, Providence RI, USA<br />

• Vindolanda Tablets Online [9] , Oxford University, UK<br />

• Etruscan Texts Project [10] , University of Massachusetts Amherst, Amherst MA, USA<br />

Bibliography<br />

• G. Bodard, 'Digital Epigraphy and Lexicographical and Onomastic <strong>Markup</strong>', in (edd. Aitken, Fraser, Thompson)<br />

Ancient Greek Lexicography: Electronic Databanks and the design of new dictionaries, Cardiff: University Press<br />

of Wales, (forthcoming 2007).<br />

• G. Bodard / Ch. Roueché, 'The Epidoc Aphrodisias Pilot Project', Forum Archaeologiae 23/VI/2002, online at<br />

http://farch.net (available: 2006-04-07)<br />

• J. Flanders / C. Roueché, 'Introduction for Epigraphers', online at http://epidoc.sf.net/IntroEpigraphers.shtml<br />

(available: 2006-04-25)<br />

• A. Mahoney, 'Epigraphy', in (edd. Burnard, O'Brian, Unsworth) Electronic Textual Editing (2006), preview online<br />

at http://www.tei-c.org/Activities/ETE/Preview/mahoney.xml (available: 2006-04-07)<br />

See also<br />

• Leiden Conventions<br />

• Epigraphy<br />

• Text Encoding Initiative<br />

• Digital Classicist<br />

References<br />

[1] http://epidoc.sourceforge.net/<br />

[2] http://lsv.uky.edu/archives/markup.html<br />

[3] http://www.stoa.org/epidoc/gl/5/<br />

[4] http://sourceforge.net/projects/epidoc<br />

[5] http://www.stoa.org/projects/epidoc/stable/chetc-js/chetc.html<br />

[6] http://concordia.atlantides.org/



[7] http://insaph.kcl.ac.uk/<br />

[8] http://usepigraphy.brown.edu/<br />

[9] http://vindolanda.csad.ox.ac.uk/<br />

[10] http://etp.classics.umass.edu/<br />

eXtensible Server Pages<br />

eXtensible Server Pages (XSP) is an <strong>XML</strong>-based language that makes it possible to embed Java code in <strong>XML</strong><br />

documents in order to generate content dynamically.<br />

It was developed by the Apache Software Foundation for the Web Publishing Framework Cocoon. The focus of XSP<br />

is the separation of content, logic and presentation. The Java program code is placed in its own <strong>XML</strong> section, an<br />

xsp:logic element, that can occur either within or outside of the root element.<br />

The Java code is compiled with the first call. These directives are replaced by the generated content so that the<br />

resulting, augmented <strong>XML</strong> document can be subject to further processing with XSL Transformations.<br />

XSP pages are transformed into Cocoon producers, typically as Java classes, though any scripting language for<br />

which a Java-based processor exists could also be used.<br />

Directives can be either XSP built-in processing tags or user-defined library tags. XSP built-in tags are used to<br />

embed procedural logic, substitute expressions and dynamically build <strong>XML</strong> nodes. User-defined library tags act as<br />

templates that dictate how program code is generated from information encoded in each dynamic tag.<br />

External links<br />

• Cocoon XSP 2.1 [1]<br />

• XSP 1.x - Working Draft [2]<br />

References<br />

[1] http://cocoon.apache.org/2.1/userdocs/xsp/logicsheet.html<br />

[2] http://cocoon.apache.org/1.x/wd-xsp.html



Fast Infoset<br />

Fast Infoset (or FI) is an international standard that specifies a binary encoding format for the <strong>XML</strong> Information Set<br />

(<strong>XML</strong> Infoset) as an alternative to the <strong>XML</strong> document format. It aims to provide more efficient serialization than the<br />

text-based <strong>XML</strong> format.<br />

One can think of FI as gzip for <strong>XML</strong>, though FI aims to optimize both document size and processing performance,<br />

whereas gzip optimizes only the size. While the original formatting is lost, no information is lost in the conversion<br />

from <strong>XML</strong> to FI and back to <strong>XML</strong>.<br />

The Fast Infoset specification is defined by both the ITU-T and the ISO standards bodies. FI is officially named<br />

ITU-T Rec. X.891 and ISO/IEC 24824-1 (Fast Infoset), respectively. However, it is commonly referred to by the<br />

name Fast Infoset. The standard was published by ITU-T on May 14, 2005, and by ISO on May 4, 2007.<br />

The Fast Infoset standard can be downloaded from the ITU website at [1]. There are no intellectual property<br />

restrictions on its implementation and use.<br />

A common misconception is that FI requires ASN.1 tool support. Although the formal specification uses ASN.1<br />

formalisms, ASN.1 tools are not required by implementations.<br />

Structure<br />

The underlying file format is ASN.1, with tag/length/value blocks. Text values of attributes and elements are therefore<br />

stored with length prefixes rather than end delimiters, so there is no need to escape special characters. There is also<br />

no need for any end tags, and binary data need not be base64 encoded.<br />
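The effect of length prefixes can be illustrated with a small Python sketch. It is not the Fast Infoset encoding: the fixed 4-byte big-endian length field and the `encode_strings` and `decode_strings` helpers are assumptions chosen to keep the example short. The point is only that a reader who knows the length never has to scan for a delimiter, so characters such as "<" and "&" need no escaping.

```python
import struct


def encode_strings(values):
    """Encode each string as a 4-byte big-endian length followed by UTF-8 bytes."""
    out = bytearray()
    for value in values:
        data = value.encode("utf-8")
        out += struct.pack(">I", len(data)) + data
    return bytes(out)


def decode_strings(blob):
    """Read the strings back by following the length prefixes."""
    values, pos = [], 0
    while pos < len(blob):
        (length,) = struct.unpack_from(">I", blob, pos)
        pos += 4
        values.append(blob[pos:pos + length].decode("utf-8"))
        pos += length
    return values


texts = ['a < b && "c"', "</end>", ""]
blob = encode_strings(texts)
```

Decoding `blob` returns the original strings, including the one that looks like an end tag, without any escape sequences in the encoded bytes.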

Although ASN.1 is used for storage, Fast Infoset is a higher level protocol built upon it. In particular, element and<br />

attribute names are stored within the octet stream, unlike raw ASN.1. This means that it is possible to recover a<br />

conventional <strong>XML</strong> file from the binary stream without the need to reference any <strong>XML</strong> Schema. It does not attempt<br />

to convert an <strong>XML</strong> Schema directly into an ASN.1 definition. (ASN.1 "Tags" are just type names, e.g. String,<br />

Integer, or complex types.)<br />

An index table is built for most strings, which includes element and attribute names, and their values. This means<br />

that the text of repeated tags and values only appears once per document. The details are complex.<br />
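The indexing idea can be sketched as follows in Python. This is a simplified illustration rather than the table layout defined by the standard: the `index_strings` and `restore` helpers, the ("new", string) and ("ref", index) stream entries, and the sample tokens are all assumptions of this sketch.

```python
def index_strings(tokens):
    """Replace repeated strings by references into a table built on the fly.

    Returns (table, stream), where each stream entry is either
    ("new", string) for a first occurrence or ("ref", index) for a repeat.
    """
    table, positions, stream = [], {}, []
    for token in tokens:
        if token in positions:
            stream.append(("ref", positions[token]))
        else:
            positions[token] = len(table)
            table.append(token)
            stream.append(("new", token))
    return table, stream


def restore(stream):
    """Rebuild the original token sequence from the stream alone."""
    table, tokens = [], []
    for kind, value in stream:
        if kind == "new":
            table.append(value)
            tokens.append(value)
        else:
            tokens.append(table[value])
    return tokens


tokens = ["item", "type", "peas", "item", "type", "carrot", "item"]
table, stream = index_strings(tokens)
```

Each distinct string is stored once, and the reader can rebuild the same table while decoding, so the table itself does not have to be transmitted separately in this sketch.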

Implementations<br />

Reference implementation<br />

A Java implementation [2] of the FI specification is available as part of the GlassFish project. The library is open<br />

source and is distributed under the terms of the Apache License 2.0. Several projects use this implementation,<br />

including the reference implementation for JAX-RPC and JAX-WS used in JWSDP.<br />

Alternative implementations<br />

The OSS Fast Infoset Tools [3] are designed for use with applications written in C or C++.<br />

Liquid Technologies [4] provides both C++ and C# .NET implementations of Fast Infoset with its <strong>XML</strong> Data<br />

Binding product Liquid <strong>XML</strong>.<br />

Applied Informatics [5] provides a C++ implementation [6] of Fast Infoset based on the POCO C++ Libraries.<br />

FastInfoset.NET [7] is a C# implementation for the .NET Framework. It is licensed under a proprietary licence.<br />

The XIOT [8] library has parts of Fast Infoset implemented to read and write compressed binary X3D files. It is<br />

licensed under LGPL.


Performance<br />

Because a Fast Infoset document is compacted as part of the <strong>XML</strong> generation process, producing it is much faster than<br />

applying Zip-style compression algorithms to an <strong>XML</strong> stream, although the resulting files can be slightly larger.<br />

SAX-type parsing performance of Fast Infoset is also much faster than parsing performance of <strong>XML</strong> 1.0, even<br />

without any Zip-style compression. Typical increases in parsing speed observed for the reference Java<br />

implementation are a factor of 10 compared to Java Xerces, and a factor of 4 compared to the Piccolo driver [9] (one<br />

of the fastest Java-based <strong>XML</strong> parsers). [10] [11] [12]<br />

Typical applications<br />

Portable Devices - Mobile devices typically have access only to low-bandwidth data connections and have slower<br />

CPUs. This can make Fast Infoset a better choice, lowering both data transmission and data processing times.<br />

Persisting Large Volumes of Data - When persisting <strong>XML</strong> either to file or a database, the volume of data your<br />

system produces can often get out of hand. This has a number of detrimental effects: access times go up as you're<br />

reading more data, CPU load goes up as <strong>XML</strong> data takes more effort to process, and your storage costs go up. By<br />

persisting your <strong>XML</strong> data in Fast Infoset format, it is possible to reduce the data volume by up to 80 percent.<br />

Passing <strong>XML</strong> via the internet - As soon as an application starts passing information over the internet, one of the<br />

main bottlenecks is bandwidth. If you send sizeable chunks of data, this bottleneck can seriously degrade the<br />

performance of your client applications and limit your server's ability to process requests. Reducing the amount of<br />

data moving across the internet reduces the time it takes a message to be sent or received, while increasing the<br />

number of transactions a server can process per hour.<br />

See also<br />

• Binary <strong>XML</strong><br />

• EXI<br />

• X3D<br />

External links<br />

• A heavy technical description on Sun [13]<br />

• FastInfoset.NET home page [7]<br />

• FI project home page [14]<br />

• Fast Infoset page at the ASN.1 site [15]<br />

• OSS Fast Infoset Tools page [3]<br />

• Free download of the Fast Infoset standard (ITU-T Rec. X.891) from the ITU Web site [1]<br />

• Free download of the Fast Infoset standard (ISO/IEC 24824-1:2007) from ISO Freely Available Standards [16]


References<br />

[1] http://www.itu.int/rec/T-REC-X.891-200505-I/en<br />

[2] https://fi.dev.java.net/<br />

[3] http://www.oss.com/xml/products/fi.html<br />

[4] http://www.liquid-technologies.com/Product_XmlCompression.aspx<br />

[5] http://www.appinf.com/<br />

[6] http://www.appinf.com/en/products/fis.html<br />

[7] http://www.noemax.com/products/fastinfoset/index.html<br />

[8] http://forge.collaviz.org/community/xiot<br />

[9] http://piccolo.sourceforge.net/<br />

[10] "Fast Infoset performance reports" (https://fi.dev.java.net/performance.html). 2005-10-06. Retrieved 2007-10-11.<br />

[11] "Japex Report: ParsingPerformance" (https://fi.dev.java.net/reports/parsing/report.html). 2005-01-10. Retrieved 2007-10-11.<br />

[12] "Japex Report: SizePerformance" (https://fi.dev.java.net/reports/size/report.html). 2005-01-10. Retrieved 2007-10-11.<br />

[13] http://java.sun.com/developer/technicalArticles/xml/fastinfoset/<br />

[14] http://fi.dev.java.net/<br />

[15] http://asn1.elibel.tm.fr/xml/finf.htm<br />

[16] http://standards.iso.org/ittf/PubliclyAvailableStandards/index.html<br />

Global listings format<br />

Global listings format (GLF) refers to metadata for transferring program guide information and multimedia<br />

information. It is coded in <strong>XML</strong> format.<br />

GMX<br />

GMX [1] (global mail exchange) is also the name of a German company with an international webmail product<br />

GMX Mail.<br />

GMX is a collection of current and proposed standards, primarily targeted at the needs of the translation industry,<br />

although they can also be used for other purposes. They are concerned with quantitatively measuring aspects of a<br />

document, particularly those relevant to the translation process (e.g. word counts, complexity). The primary<br />

use cases are quoting, estimating and billing translation work.<br />

GMX-V is the first of the three standards to be completed. Work will commence in 2007 on GMX-Q and GMX-C.<br />

Quality (GMX-Q) will deal with the level of quality required for a task. For example, the quality required for the<br />

translation of a legal document is much higher than that for technical documentation that will have a relatively small<br />

audience. Complexity (GMX-C) will take into consideration the source and format of the original document and its<br />

subject matter. For example, a highly complex document dealing with a specific tight domain is far more complex to<br />

translate than user instructions for a simple consumer device.<br />

GMX-V forms part of the Open Architecture for <strong>XML</strong> Authoring and Localization (OAXAL) reference architecture.<br />

References<br />

[1] http://www.gmx.net/


GMX-V<br />

GMX-V (Global Information Management Metrics eXchange - Volume: Word and Character Count Standard) is a<br />

word and character count standard for electronic documents. GMX-V is developed and maintained by OSCAR [1]<br />

(Open Standards for Container/Content Allowing Re-use), a special interest group of LISA [2] (Localization Industry<br />

Standards Association).<br />

GMX-V is one of a series of three standards from the Localization Industry Standards Association (LISA).<br />

GMX-V deals with electronic document metrics.<br />

GMX is made up of the following standards:<br />

• GMX-V - Volume<br />

• GMX-C - Complexity<br />

• GMX-Q - Quality<br />

GMX-V forms part of the Open Architecture for <strong>XML</strong> Authoring and Localization (OAXAL) reference architecture.<br />

Scope and Primary Goal<br />

GMX-V is designed to fulfill two primary roles:<br />

• Establish a verifiable way of calculating the primary word and character counts for a given electronic document.<br />

• Establish a specific <strong>XML</strong> vocabulary that enables the automatic exchange of metric data.<br />

Description<br />

GMX-V is itself based on other well established standards:<br />

• Unicode 5.0 normalized form<br />

• Unicode Technical Report 29 – Text Boundaries<br />

• OASIS <strong>XML</strong> Localization Interchange File Format (XLIFF) 1.2<br />

• LISA OSCAR Segmentation Rules Exchange (SRX) 2.0<br />
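
To give a feel for what a normalization-aware count involves, here is a rough sketch. It is not the GMX-V algorithm: the choice of NFC as the normalized form, the regular expression used in place of the Unicode Technical Report 29 word boundary rules, and the decision to leave white space out of the character count are all simplifying assumptions, and the function name rough_counts is hypothetical.

```python
import re
import unicodedata

# Approximation: a word is a run of word characters, optionally joined by an
# apostrophe or hyphen. Real word segmentation follows UTR 29, which is richer.
WORD = re.compile(r"\w+(?:['’-]\w+)*")


def rough_counts(text):
    """Return approximate word and character counts for a piece of text."""
    # Normalize first so that "e" + combining acute counts as one character.
    normalized = unicodedata.normalize("NFC", text)
    words = WORD.findall(normalized)
    characters = sum(1 for ch in normalized if not ch.isspace())
    return {"words": len(words), "characters": characters}


counts = rough_counts("Cafe\u0301 isn't far.")
```

Normalizing before counting is what makes the result repeatable: the same visible text yields the same counts regardless of whether accented letters were stored precomposed or as base letter plus combining mark.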

External links<br />

• GMX-V page on the LISA OSCAR web site [3]<br />

• GMX-V specification [4]<br />

References<br />

[1] OSCAR (http://www.lisa.org/sigs/oscar/) - Open Standards for Container/Content Allowing Re-use<br />

[2] LISA (http://www.lisa.org/index.html) - Localization Industry Standards Association<br />

[3] http://www.lisa.org/Global-information-m.104.0.html<br />

[4] http://www.lisa.org/fileadmin/standards/GMX-V.html


Head-Body Pattern<br />

The Head-Body Pattern is a common <strong>XML</strong> design pattern, used for example in the SOAP protocol. This pattern is<br />

useful when a message, or parcel of data, requires considerable metadata. Mixing the metadata with the data<br />

is possible, but it makes the whole confusing. In this pattern the metadata or meta-information are structured as the<br />

header, sometimes known as the envelope. The ordinary data or information are structured as the body, sometimes<br />

known as the payload. <strong>XML</strong> is employed for both head and body.<br />
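
A minimal sketch of the pattern, using Python's standard xml.etree.ElementTree module. The element names message, head, body, sender, sent, order and item are hypothetical and chosen only for illustration; they do not come from SOAP or any other specific vocabulary.

```python
import xml.etree.ElementTree as ET


def build_message(metadata, payload):
    """Wrap a payload element in a message with separate head and body parts."""
    message = ET.Element("message")
    head = ET.SubElement(message, "head")
    for name, value in metadata.items():
        # Each piece of metadata becomes one child element of the head.
        ET.SubElement(head, name).text = value
    body = ET.SubElement(message, "body")
    body.append(payload)  # the ordinary data travels untouched in the body
    return message


order = ET.Element("order")
ET.SubElement(order, "item").text = "widget"
msg = build_message({"sender": "alice", "sent": "2010-06-17"}, order)
xml_text = ET.tostring(msg, encoding="unicode")
```

A receiver can route or validate the message by reading only the head, and hand the body to the application without inspecting it.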

HyTime<br />

HyTime (Hypermedia/Time-based Structuring <strong>Language</strong>) is a markup language that is an "application" of SGML.<br />

HyTime defines a set of hypertext-oriented element types that, in effect, supplement SGML and allow SGML<br />

document authors to build hypertext and multimedia presentations in a standardized way.<br />

HyTime is an international standard published by the ISO and IEC. The first edition was published in 1992, and the<br />

second edition was published in 1997.<br />

Legacy<br />

Some of the concepts formalized in HyTime were later incorporated into HTML and <strong>XML</strong>:<br />

• HTML is an application of SGML for hypertext document presentations, that assigns specific semantics and<br />

processing expectations to a fixed set of element types.<br />

• <strong>XML</strong> defines a simplified subset of SGML that focuses on providing an open vocabulary of element types for data<br />

modeling and establishes precise expectations for how the marked-up data is read and subsequently fed to another<br />

software application for further processing, but does not assign semantics to the element types or establish<br />

expectations for how the data is processed.<br />

Standard<br />

The HyTime standard itself is ISO/IEC 10744, first published in 1992 and available from the International<br />

Organization for Standardization. It was developed by ISO/IEC JTC1/SC34 (ISO/IEC Joint Technical Committee 1,<br />

Subcommittee 34 - Document description and processing languages). [1] [2]<br />

Further reading<br />

• Steven DeRose and David Durand, "Making Hypermedia Work: A User's Guide to HyTime," Kluwer Academic<br />

Publishers 1994 (ISBN 0-7923-9432-1).<br />

External links<br />

• ISO/IEC 10744:1992 - Information technology -- Hypermedia/Time-based Structuring <strong>Language</strong> (HyTime) [3]<br />

• Robin Cover's HyTime resource list [4]<br />

• ISO/IEC 10744 Amendment 1 [5] - an amendment to ISO/IEC 10744:1997 Annex A.3<br />

• Standards: HyTime: A standard for structured hypermedia interchange [6] by Charles Goldfarb, from IEEE<br />

Computer magazine, vol. 24, iss. 8 (Aug. 1991), pp. 81–84<br />

• A Brief History of the Development of SMDL and HyTime [7]


References<br />

[1] ISO. "JTC 1/SC 34 - Document description and processing languages" (http://www.iso.org/iso/iso_technical_committee.html?commid=45374). ISO. Retrieved 2009-12-25.<br />

[2] ISO JTC1/SC34. "JTC 1/SC 34 - Document Description and Processing <strong>Language</strong>s" (http://www.itscj.ipsj.or.jp/sc34/). Retrieved 2009-12-25.<br />

[3] http://www.iso.org/iso/iso_catalogue/catalogue_ics/catalogue_detail_ics.htm?csnumber=18834<br />

[4] http://xml.coverpages.org/hytime.html<br />

[5] http://www.y12.doe.gov/sgml/wg8/document/1957.htm<br />

[6] http://ieeexplore.ieee.org/iel1/2/2778/00084880.pdf?tp=&arnumber=84880&isnumber=2778<br />

[7] http://www.sgmlsource.com/history/hthist.htm<br />

Internationalization Tag Set<br />

The Internationalization Tag Set (ITS) [1] is a set of attributes and elements designed to provide<br />

internationalization and localization support in <strong>XML</strong> documents.<br />

The ITS specification identifies concepts (called "ITS data categories") which are important for internationalization<br />

and localization. It also defines implementations of these concepts through a set of elements and attributes grouped<br />

in the ITS namespace. <strong>XML</strong> developers can use this namespace to integrate internationalization features directly into<br />

their own <strong>XML</strong> schemas and documents.<br />

Overview<br />

ITS v1.0 includes seven data categories:<br />

• Translate: Defines what parts of a document are translatable or not.<br />

• Localization Note: Provides alerts, hints, instructions, and other information to help the localizers or the<br />

translators.<br />

• Terminology: Indicates parts of the documents that are terms and optionally pointers to information about these<br />

terms.<br />

• Directionality: Indicates what type of display directionality should be applied to parts of the document.<br />

• Ruby: Indicates what parts of the document should be displayed as ruby text. (Ruby is a short run of text<br />

alongside a base text, typically used in East Asian documents to indicate pronunciation or to provide a brief<br />

annotation).<br />

• <strong>Language</strong> Information: Identifies the language of the different parts of the document.<br />

• Elements Within Text: Indicates how elements should be treated with regard to linguistic segmentation.<br />

The vocabulary is designed to work on two different fronts: First by providing markup usable directly in the <strong>XML</strong><br />

documents. Secondly, by offering a way to indicate if there are parts of a given markup that correspond to some of<br />

the ITS data categories and should be treated as such by ITS processors.<br />

ITS applies to both new document types and existing ones. It also applies both to markup without any<br />

internationalization features and to the class of documents already supporting some internationalization or<br />

localization-related functions.<br />

ITS can be specified using global rules and local rules.<br />

• The global rules are expressed anywhere in the document (embedded global rules), or even outside the document<br />

(external global rules), using the its:rules element.<br />

• The local rules are expressed by specialized attributes (and sometimes elements) specified inside the document<br />

instance, at the location where they apply.


Examples<br />

Example of ITS markup for the Translate data category:<br />

The elements and attributes with the its prefix are part of the ITS namespace. The its:rules element lists the different<br />

rules to apply to this file. There is one its:translateRule rule that indicates that any content inside the head element<br />

should not be translated.<br />

The its:translate attributes used in some elements override the global rule. Here, they make the content of title<br />

translatable and the text "faux pas" non-translatable.<br />

Sep-10-2006 v5<br />

Ealasaidh McIan<br />

ealasaidh@hogw.ac.uk<br />

The Origins of Modern Novel<br />

Introduction<br />

It would certainly be quite a faux pas to start a dissertation on the origin of modern novel without mentioning the Epic of Gilgamesh...<br />
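
The interaction between the global rule and the local overrides can be sketched in Python. The document below is a hypothetical reconstruction built from the description above: the element names text, revision, author, contact, body, div, p and span are assumptions, while head, title, its:rules, its:translateRule and its:translate come from the text. The sketch applies the single rule by hand rather than evaluating arbitrary XPath selectors, so it is not a general ITS processor.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
TRANSLATE_ATTR = f"{{{ITS_NS}}}translate"

# Hypothetical reconstruction of a document like the one described above.
DOC = """<text xmlns:its="http://www.w3.org/2005/11/its">
  <head>
    <revision>Sep-10-2006 v5</revision>
    <author>Ealasaidh McIan</author>
    <contact>ealasaidh@hogw.ac.uk</contact>
    <title its:translate="yes">The Origins of Modern Novel</title>
    <its:rules version="1.0">
      <its:translateRule translate="no" selector="/text/head"/>
    </its:rules>
  </head>
  <body>
    <div>
      <head>Introduction</head>
      <p>It would certainly be quite a <span its:translate="no">faux
      pas</span> to start a dissertation on the origin of modern novel without
      mentioning the Epic of Gilgamesh...</p>
    </div>
  </body>
</text>"""


def translatable_text(root):
    """Collect the text runs that remain translatable."""
    # Simplification: the selector "/text/head" is resolved by hand as
    # "head children of the root element".
    globally_off = {id(elem) for elem in root.findall("head")}
    runs = []

    def walk(elem, inherited):
        if elem.tag.startswith(f"{{{ITS_NS}}}"):
            return  # ITS rule elements carry no document content
        flag = inherited
        if id(elem) in globally_off:
            flag = False
        local = elem.get(TRANSLATE_ATTR)
        if local is not None:
            flag = local == "yes"  # local markup overrides the global rule
        if flag and elem.text and elem.text.strip():
            runs.append(" ".join(elem.text.split()))
        for child in elem:
            walk(child, flag)
            # Tail text belongs to the parent, so the parent's flag applies.
            if flag and child.tail and child.tail.strip():
                runs.append(" ".join(child.tail.split()))

    walk(root, True)
    return runs


runs = translatable_text(ET.fromstring(DOC))
```

With this input the title and the body text are collected, while the revision, author and contact entries and the words "faux pas" are left out.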

Example of ITS markup for the Localization Note data category:<br />

The its:locNoteRule element specifies that any node corresponding to the XPath expression "//msg/data" has an<br />

associated note. The location of that note is expressed by the locNotePointer attribute, which holds a relative XPath<br />

expression pointing to the node where the note is, here "../notes".<br />

Note also the use of the its:translate attribute to mark the notes elements as non-translatable.<br />

A division by 0 was going to be computed.<br />

Invalid parameter.<br />
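
How a pointer such as "../notes" ties a note to its message can be sketched as follows. The document is a hypothetical reconstruction: the element names Res, prolog, body, msg and the id value are assumptions, and the global rule is written with the its:locNoteRule element defined in ITS 1.0. Only pointers of the form "../name" are handled, because xml.etree.ElementTree does not keep parent links.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"

# Hypothetical reconstruction of a resource file like the one described above.
DOC = """<Res xmlns:its="http://www.w3.org/2005/11/its">
  <prolog>
    <its:rules version="1.0">
      <its:locNoteRule locNoteType="description"
                       selector="//msg/data" locNotePointer="../notes"/>
    </its:rules>
  </prolog>
  <body>
    <msg id="DivByZero">
      <notes its:translate="no">A division by 0 was going to be computed.</notes>
      <data>Invalid parameter.</data>
    </msg>
  </body>
</Res>"""


def notes_for(root):
    """Map each selected text to the localization note that the rule points at."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    rule = root.find(f".//{{{ITS_NS}}}locNoteRule")
    selector = "." + rule.get("selector")  # "//msg/data" becomes ".//msg/data"
    pointer = rule.get("locNotePointer")
    if not pointer.startswith("../"):
        raise NotImplementedError("this sketch handles only '../name' pointers")
    result = {}
    for node in root.findall(selector):
        note = parent_of[node].find(pointer[len("../"):])
        result[node.text] = note.text
    return result


notes = notes_for(ET.fromstring(DOC))
```

A localization tool could show the note next to the string being translated, while the its:translate="no" attribute keeps the note itself out of the translation job.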

ITS limitations<br />

ITS does not have a solution to all <strong>XML</strong> internationalization and localization issues.<br />

One reason is that version 1.0 does not have data categories for everything. For example, there is currently no<br />

way to indicate a source/target relation in bilingual files where some parts of a document store the source text and<br />

some other parts the corresponding translation.<br />

The other reason is that many aspects of internationalization cannot be resolved with markup. They have to do with<br />

the design of the DTD or the schema itself. There are best practices, design and authoring guidelines [2] that are<br />

necessary to follow to make sure documents are correctly internationalized and easy to localize. For example, using<br />

attributes to store translatable text is a bad idea for many different reasons, but ITS cannot prevent an <strong>XML</strong><br />

developer from making such a choice.<br />

External links<br />

• Internationalization Tag Set (ITS) Version 1.0 [3]<br />

• W3C Internationalization Home [4]<br />

• Best Practices for <strong>XML</strong> Internationalization (Working Draft) [5]<br />

• List of ITS implementations and articles about ITS [6]<br />

References<br />

[1] http://www.w3.org/TR/its/<br />

[2] http://www.w3.org/TR/xml-i18n-bp/<br />

[3] http://www.w3.org/TR/2007/REC-its-20070403/<br />

[4] http://www.w3.org/International/<br />

[5] http://www.w3.org/TR/2007/WD-xml-i18n-bp-20070427/<br />

[6] http://www.w3.org/International/its/links.html


Klip<br />

A Klip is an <strong>XML</strong> file containing markup, styles and JavaScript; it provides the<br />

Klipfolio desktop dashboard platform with rules for the retrieval, interpretation, and<br />

presentation of arbitrary information sources such as web pages, RSS feeds, and<br />

proprietary <strong>XML</strong> back-ends. The Klip file extension is ".klip".<br />

When opened in Klipfolio, a Klip is rendered as a small window that displays text<br />

and image content. The size, position and visibility of the Klip on-screen are managed by the user. Settings particular<br />

to each Klip can be found in a "Klip Setup" dialog.<br />

Klips are considered by most to be widgets, and KlipFolio a widget engine. There are thousands of different Klips<br />

available as free downloads at Klipfolio.com [1]. Klips provide all manner of information such as weather conditions,<br />

news headlines, stock quotes etc. The consumer version of KlipFolio is freeware and can be downloaded, installed,<br />

and used by anyone.<br />

Example usage<br />

This very simple example can be written using a plain text or <strong>XML</strong> editor.<br />

My Klip<br />

Your Description here....<br />

The author of the Klip<br />

15 keywords maximum to upload to KlipFarm the Klip directory<br />

http://mydomain.com/myxml.xml<br />

http://mydomain.com/myicon.jpg<br />

http://mydomain.com/mybanner.gif<br />

15<br />

Saving it as first.klip will allow you to open it using KlipFolio.<br />

Note: "klip" is also the word for "clip" in the languages of several Eastern European countries (e.g. Czechia, Lithuania, Poland,<br />

Serbia, Slovakia).


See also<br />

• KlipFolio<br />

• Serence<br />

Links<br />

• KlipFolio Homepage [1]<br />

References<br />

[1] http://www.klipfolio.com/<br />

List of <strong>XML</strong> and HTML character entity references<br />

In SGML, HTML and <strong>XML</strong> documents, the logical constructs known as character data and attribute values consist<br />

of sequences of characters, in which each character can manifest directly (representing itself), or can be represented<br />

by a series of characters called a character reference, of which there are two types: a numeric character reference<br />

and a character entity reference. This article lists the character entity references that are valid in HTML and <strong>XML</strong><br />

documents.<br />

Character reference overview<br />

A numeric character reference refers to a character by its Universal Character Set/Unicode code point, and uses the<br />

format<br />

&#nnnn;<br />

or<br />

&#xhhhh;<br />

where nnnn is the code point in decimal form, and hhhh is the code point in hexadecimal form. The x must be<br />

lowercase in <strong>XML</strong> documents. The nnnn or hhhh may be any number of digits and may include leading zeros. The<br />

hhhh may mix uppercase and lowercase, though uppercase is the usual style.<br />
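
Both forms can be produced and decoded with a few lines of Python. The helper name numeric_refs is hypothetical; html.unescape is the standard library function for decoding character references, and it follows HTML rather than strict XML parsing rules.

```python
import html


def numeric_refs(ch):
    """Return the decimal and hexadecimal numeric character references for ch."""
    code_point = ord(ch)
    # Lowercase "x" as required in XML; uppercase hex digits as the usual style.
    return f"&#{code_point};", f"&#x{code_point:X};"


decimal_ref, hex_ref = numeric_refs("\u00e9")  # Latin small letter e with acute

# Leading zeros and mixed-case hex digits decode to the same character.
decoded = [html.unescape(ref) for ref in (decimal_ref, hex_ref, "&#0233;", "&#x00e9;")]
```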

In contrast, a character entity reference refers to a character by the name of an entity which has the desired character<br />

as its replacement text. The entity must either be predefined (built-in to the markup language) or explicitly declared<br />

in a Document Type Definition (DTD). The format is the same as for any entity reference:<br />

&name;<br />

where name is the name of the entity. The semicolon is required.
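
A name lookup can be sketched with Python's html.entities.name2codepoint dictionary, which maps HTML 4 entity names to code points. The function resolve_entity is a hypothetical helper for illustration; it deliberately rejects references that lack the closing semicolon.

```python
from html.entities import name2codepoint


def resolve_entity(reference):
    """Map a reference such as "&copy;" to its character, or None if unknown."""
    if not (reference.startswith("&") and reference.endswith(";")):
        return None  # the semicolon is required
    code_point = name2codepoint.get(reference[1:-1])
    return None if code_point is None else chr(code_point)


copyright_sign = resolve_entity("&copy;")
```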


Predefined entities in <strong>XML</strong><br />

The <strong>XML</strong> specification does not use the term "character entity" or "character entity reference". The <strong>XML</strong><br />

specification defines five "predefined entities" representing special characters, and requires that all <strong>XML</strong> processors<br />

honor them. The entities can be explicitly declared in a DTD, as well, but if this is done, the replacement text must<br />

be the same as the built-in definitions. <strong>XML</strong> also allows other named entities of any size to be defined on a<br />

per-document basis.<br />

The table below lists the five <strong>XML</strong> predefined entities. The "Name" column mentions the entity's name. The<br />

"Character" column shows the character, if it is renderable. In order to render the character, the format &name; is<br />

used; for example, &amp; renders as &. The "Unicode code point" column cites the character via standard<br />

UCS/Unicode "U+" notation, which shows the character's code point in hexadecimal. The decimal equivalent of the<br />

code point is then shown in parentheses. The "Standard" column indicates the first version of <strong>XML</strong> that includes the<br />

entity. The "Description" column cites the character via its canonical UCS/Unicode name, in English.<br />

Name Character Unicode code point (decimal) Standard Description<br />

quot " U+0022 (34) <strong>XML</strong> 1.0 (double) quotation mark<br />

amp & U+0026 (38) <strong>XML</strong> 1.0 ampersand<br />

apos ' U+0027 (39) <strong>XML</strong> 1.0 apostrophe (= apostrophe-quote)<br />

lt < U+003C (60) <strong>XML</strong> 1.0 less-than sign<br />

gt > U+003E (62) <strong>XML</strong> 1.0 greater-than sign<br />
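
In practice these five entities are used when writing text into an XML document. The sketch below uses escape and unescape from Python's xml.sax.saxutils module, which handle the ampersand and the two angle brackets by default; the two quote characters are passed as an extra mapping. The helper names to_xml_text and from_xml_text are hypothetical.

```python
from xml.sax.saxutils import escape, unescape

# The quote characters are not escaped by default, so they are supplied here.
QUOTES = {'"': "&quot;", "'": "&apos;"}
UNQUOTES = {reference: char for char, reference in QUOTES.items()}


def to_xml_text(raw):
    """Replace the five special characters by their predefined entity references."""
    return escape(raw, QUOTES)


def from_xml_text(escaped):
    """Turn the five predefined entity references back into characters."""
    return unescape(escaped, UNQUOTES)


original = 'Tom & "Jerry" <b>'
escaped = to_xml_text(original)
```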

Character entity references in HTML<br />

The HTML 4 DTDs define 252 named entities, references to which act as mnemonic aliases for certain Unicode<br />

characters. The HTML 4 specification requires the use of the standard DTDs and does not allow users to define<br />

additional entities.<br />

In the table below, the "Standard" column indicates the first version of the HTML DTD that defines the character<br />

entity reference. HTML 4.01 did not provide any new character references.<br />

Name Character Unicode code point (decimal) Standard DTD Old ISO subset Description<br />

quot " U+0022 (34) HTML 2.0 HTMLspecial ISOnum quotation mark (= APL quote)<br />

amp & U+0026 (38) HTML 2.0 HTMLspecial ISOnum ampersand<br />

apos ' U+0027 (39) XHTML 1.0 HTMLspecial ISOnum apostrophe (= apostrophe-quote); see below<br />

lt < U+003C (60) HTML 2.0 HTMLspecial ISOnum less-than sign<br />

gt > U+003E (62) HTML 2.0 HTMLspecial ISOnum greater-than sign<br />

nbsp U+00A0 (160) HTML 3.2 HTMLlat1 ISOnum no-break space (= non-breaking space) spaces<br />

iexcl ¡ U+00A1 (161) HTML 3.2 HTMLlat1 ISOnum inverted exclamation mark<br />

cent ¢ U+00A2 (162) HTML 3.2 HTMLlat1 ISOnum cent sign<br />

pound £ U+00A3 (163) HTML 3.2 HTMLlat1 ISOnum pound sign<br />

curren ¤ U+00A4 (164) HTML 3.2 HTMLlat1 ISOnum currency sign<br />

yen ¥ U+00A5 (165) HTML 3.2 HTMLlat1 ISOnum yen sign (= yuan sign)<br />

brvbar ¦ U+00A6 (166) HTML 3.2 HTMLlat1 ISOnum broken bar (= broken vertical bar)


sect § U+00A7 (167) HTML 3.2 HTMLlat1 ISOnum section sign<br />

uml ¨ U+00A8 (168) HTML 3.2 HTMLlat1 ISOdia diaeresis (= spacing diaeresis); see German umlaut<br />

copy © U+00A9 (169) HTML 3.2 HTMLlat1 ISOnum copyright sign<br />

ordf ª U+00AA (170) HTML 3.2 HTMLlat1 ISOnum feminine ordinal indicator<br />

laquo « U+00AB (171) HTML 3.2 HTMLlat1 ISOnum left-pointing double angle quotation mark (= left pointing guillemet)<br />

not ¬ U+00AC (172) HTML 3.2 HTMLlat1 ISOnum not sign<br />

shy U+00AD (173) HTML 3.2 HTMLlat1 ISOnum soft hyphen (= discretionary hyphen)<br />

reg ® U+00AE (174) HTML 3.2 HTMLlat1 ISOnum registered sign ( = registered trade mark sign)<br />

macr ¯ U+00AF (175) HTML 3.2 HTMLlat1 ISOdia macron (= spacing macron = overline = APL overbar)<br />

deg ° U+00B0 (176) HTML 3.2 HTMLlat1 ISOnum degree sign<br />

plusmn ± U+00B1 (177) HTML 3.2 HTMLlat1 ISOnum plus-minus sign (= plus-or-minus sign)<br />

sup2 ² U+00B2 (178) HTML 3.2 HTMLlat1 ISOnum superscript two (= superscript digit two = squared)<br />

sup3 ³ U+00B3 (179) HTML 3.2 HTMLlat1 ISOnum superscript three (= superscript digit three = cubed)<br />

acute ´ U+00B4 (180) HTML 3.2 HTMLlat1 ISOdia acute accent (= spacing acute)<br />

micro µ U+00B5 (181) HTML 3.2 HTMLlat1 ISOnum micro sign<br />

para ¶ U+00B6 (182) HTML 3.2 HTMLlat1 ISOnum pilcrow sign (= paragraph sign)<br />

middot · U+00B7 (183) HTML 3.2 HTMLlat1 ISOnum middle dot (= Georgian comma = Greek middle dot)<br />

cedil ¸ U+00B8 (184) HTML 3.2 HTMLlat1 ISOdia cedilla (= spacing cedilla)<br />

sup1 ¹ U+00B9 (185) HTML 3.2 HTMLlat1 ISOnum superscript one (= superscript digit one)<br />

ordm º U+00BA (186) HTML 3.2 HTMLlat1 ISOnum masculine ordinal indicator<br />

raquo » U+00BB (187) HTML 3.2 HTMLlat1 ISOnum right-pointing double angle quotation mark (= right pointing guillemet)<br />

frac14 ¼ U+00BC (188) HTML 3.2 HTMLlat1 ISOnum vulgar fraction one quarter (= fraction one quarter)<br />

frac12 ½ U+00BD (189) HTML 3.2 HTMLlat1 ISOnum vulgar fraction one half (= fraction one half)<br />

frac34 ¾ U+00BE (190) HTML 3.2 HTMLlat1 ISOnum vulgar fraction three quarters (= fraction three quarters)<br />

iquest ¿ U+00BF (191) HTML 3.2 HTMLlat1 ISOnum inverted question mark (= turned question mark)<br />

Agrave À U+00C0 (192) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter A with grave (= Latin capital letter A grave)<br />

Aacute Á U+00C1 (193) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter A with acute<br />

Acirc  U+00C2 (194) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter A with circumflex<br />

Atilde à U+00C3 (195) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter A with tilde<br />

Auml Ä U+00C4 (196) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter A with diaeresis<br />

Aring Å U+00C5 (197) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter A with ring above (= Latin capital letter A ring)<br />


AElig Æ U+00C6 (198) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter AE (= Latin capital ligature AE)<br />

Ccedil Ç U+00C7 (199) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter C with cedilla<br />

Egrave È U+00C8 (200) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter E with grave<br />

Eacute É U+00C9 (201) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter E with acute<br />

Ecirc Ê U+00CA (202) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter E with circumflex<br />

Euml Ë U+00CB (203) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter E with diaeresis<br />

Igrave Ì U+00CC (204) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter I with grave<br />

Iacute Í U+00CD (205) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter I with acute<br />

Icirc Î U+00CE (206) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter I with circumflex<br />

Iuml Ï U+00CF (207) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter I with diaeresis<br />

ETH Ð U+00D0 (208) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter ETH<br />

Ntilde Ñ U+00D1 (209) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter N with tilde<br />

Ograve Ò U+00D2 (210) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter O with grave<br />

Oacute Ó U+00D3 (211) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter O with acute<br />

Ocirc Ô U+00D4 (212) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter O with circumflex<br />

Otilde Õ U+00D5 (213) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter O with tilde<br />

Ouml Ö U+00D6 (214) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter O with diaeresis<br />

times × U+00D7 (215) HTML 3.2 HTMLlat1 ISOnum multiplication sign<br />

Oslash Ø U+00D8 (216) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter O with stroke (= Latin capital letter O slash)<br />

Ugrave Ù U+00D9 (217) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter U with grave<br />

Uacute Ú U+00DA (218) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter U with acute<br />

Ucirc Û U+00DB (219) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter U with circumflex<br />

Uuml Ü U+00DC (220) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter U with diaeresis<br />

Yacute Ý U+00DD (221) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter Y with acute<br />

THORN Þ U+00DE (222) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter THORN<br />

szlig ß U+00DF (223) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter sharp s (= ess-zed); see German Eszett<br />

agrave à U+00E0 (224) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter a with grave<br />

aacute á U+00E1 (225) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter a with acute<br />

acirc â U+00E2 (226) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter a with circumflex<br />

atilde ã U+00E3 (227) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter a with tilde<br />

auml ä U+00E4 (228) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter a with diaeresis<br />

aring å U+00E5 (229) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter a with ring above<br />

aelig æ U+00E6 (230) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter ae (= Latin small ligature ae)<br />

ccedil ç U+00E7 (231) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter c with cedilla<br />

egrave è U+00E8 (232) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter e with grave<br />

eacute é U+00E9 (233) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter e with acute<br />

ecirc ê U+00EA (234) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter e with circumflex


euml ë U+00EB (235) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter e with diaeresis<br />

igrave ì U+00EC (236) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter i with grave<br />

iacute í U+00ED (237) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter i with acute<br />

icirc î U+00EE (238) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter i with circumflex<br />

iuml ï U+00EF (239) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter i with diaeresis<br />

eth ð U+00F0 (240) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter eth<br />

ntilde ñ U+00F1 (241) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter n with tilde<br />

ograve ò U+00F2 (242) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter o with grave<br />

oacute ó U+00F3 (243) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter o with acute<br />

ocirc ô U+00F4 (244) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter o with circumflex<br />

otilde õ U+00F5 (245) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter o with tilde<br />

ouml ö U+00F6 (246) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter o with diaeresis<br />

divide ÷ U+00F7 (247) HTML 3.2 HTMLlat1 ISOnum division sign<br />

oslash ø U+00F8 (248) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter o with stroke (= Latin small letter o slash)<br />

ugrave ù U+00F9 (249) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter u with grave<br />

uacute ú U+00FA (250) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter u with acute<br />

ucirc û U+00FB (251) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter u with circumflex<br />

uuml ü U+00FC (252) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter u with diaeresis<br />

yacute ý U+00FD (253) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter y with acute<br />

thorn þ U+00FE (254) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter thorn<br />

yuml ÿ U+00FF (255) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter y with diaeresis<br />

OElig Œ U+0152 (338) HTML 4.0 HTMLspecial ISOlat2 Latin capital ligature oe ligature<br />

oelig œ U+0153 (339) HTML 4.0 HTMLspecial ISOlat2 Latin small ligature oe ligature<br />

Scaron Š U+0160 (352) HTML 4.0 HTMLspecial ISOlat2 Latin capital letter s with caron<br />

scaron š U+0161 (353) HTML 4.0 HTMLspecial ISOlat2 Latin small letter s with caron<br />

Yuml Ÿ U+0178 (376) HTML 4.0 HTMLspecial ISOlat2 Latin capital letter y with diaeresis<br />

fnof ƒ U+0192 (402) HTML 4.0 HTMLsymbol ISOtech Latin small letter f with hook (= function = florin)<br />

circ ˆ U+02C6 (710) HTML 4.0 HTMLspecial ISOpub modifier letter circumflex accent<br />

tilde ˜ U+02DC (732) HTML 4.0 HTMLspecial ISOdia small tilde<br />

Alpha Α U+0391 (913) HTML 4.0 HTMLsymbol Greek capital letter Alpha<br />

Beta Β U+0392 (914) HTML 4.0 HTMLsymbol Greek capital letter Beta<br />

Gamma Γ U+0393 (915) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Gamma<br />

Delta Δ U+0394 (916) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Delta<br />

Epsilon Ε U+0395 (917) HTML 4.0 HTMLsymbol Greek capital letter Epsilon<br />

Zeta Ζ U+0396 (918) HTML 4.0 HTMLsymbol Greek capital letter Zeta<br />

Eta Η U+0397 (919) HTML 4.0 HTMLsymbol Greek capital letter Eta<br />

Theta Θ U+0398 (920) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Theta<br />

Iota Ι U+0399 (921) HTML 4.0 HTMLsymbol Greek capital letter Iota



Kappa Κ U+039A (922) HTML 4.0 HTMLsymbol Greek capital letter Kappa<br />

Lambda Λ U+039B (923) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Lambda<br />

Mu Μ U+039C (924) HTML 4.0 HTMLsymbol Greek capital letter Mu<br />

Nu Ν U+039D (925) HTML 4.0 HTMLsymbol Greek capital letter Nu<br />

Xi Ξ U+039E (926) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Xi<br />

Omicron Ο U+039F (927) HTML 4.0 HTMLsymbol Greek capital letter Omicron<br />

Pi Π U+03A0 (928) HTML 4.0 HTMLsymbol Greek capital letter Pi<br />

Rho Ρ U+03A1 (929) HTML 4.0 HTMLsymbol Greek capital letter Rho<br />

Sigma Σ U+03A3 (931) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Sigma<br />

Tau Τ U+03A4 (932) HTML 4.0 HTMLsymbol Greek capital letter Tau<br />

Upsilon Υ U+03A5 (933) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Upsilon<br />

Phi Φ U+03A6 (934) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Phi<br />

Chi Χ U+03A7 (935) HTML 4.0 HTMLsymbol Greek capital letter Chi<br />

Psi Ψ U+03A8 (936) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Psi<br />

Omega Ω U+03A9 (937) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Omega<br />

alpha α U+03B1 (945) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter alpha<br />

beta β U+03B2 (946) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter beta<br />

gamma γ U+03B3 (947) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter gamma<br />

delta δ U+03B4 (948) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter delta<br />

epsilon ε U+03B5 (949) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter epsilon<br />

zeta ζ U+03B6 (950) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter zeta<br />

eta η U+03B7 (951) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter eta<br />

theta θ U+03B8 (952) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter theta<br />

iota ι U+03B9 (953) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter iota<br />

kappa κ U+03BA (954) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter kappa<br />

lambda λ U+03BB (955) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter lambda<br />

mu μ U+03BC (956) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter mu<br />

nu ν U+03BD (957) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter nu<br />

xi ξ U+03BE (958) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter xi<br />

omicron ο U+03BF (959) HTML 4.0 HTMLsymbol NEW Greek small letter omicron<br />

pi π U+03C0 (960) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter pi<br />

rho ρ U+03C1 (961) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter rho<br />

sigmaf ς U+03C2 (962) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter final sigma<br />

sigma σ U+03C3 (963) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter sigma<br />

tau τ U+03C4 (964) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter tau<br />

upsilon υ U+03C5 (965) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter upsilon<br />

phi φ U+03C6 (966) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter phi<br />

chi χ U+03C7 (967) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter chi<br />

psi ψ U+03C8 (968) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter psi



omega ω U+03C9 (969) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter omega<br />

thetasym ϑ U+03D1 (977) HTML 4.0 HTMLsymbol NEW Greek theta symbol<br />

upsih ϒ U+03D2 (978) HTML 4.0 HTMLsymbol NEW Greek Upsilon with hook symbol<br />

piv ϖ U+03D6 (982) HTML 4.0 HTMLsymbol ISOgrk3 Greek pi symbol<br />

ensp U+2002 (8194) HTML 4.0 HTMLspecial ISOpub en space spaces<br />

emsp U+2003 (8195) HTML 4.0 HTMLspecial ISOpub em space spaces<br />

thinsp U+2009 (8201) HTML 4.0 HTMLspecial ISOpub thin space spaces<br />

zwnj U+200C (8204) HTML 4.0 HTMLspecial NEW RFC 2070 zero-width non-joiner<br />

zwj U+200D (8205) HTML 4.0 HTMLspecial NEW RFC 2070 zero-width joiner<br />

lrm U+200E (8206) HTML 4.0 HTMLspecial NEW RFC 2070 left-to-right mark<br />

rlm U+200F (8207) HTML 4.0 HTMLspecial NEW RFC 2070 right-to-left mark<br />

ndash – U+2013 (8211) HTML 4.0 HTMLspecial ISOpub en dash<br />

mdash — U+2014 (8212) HTML 4.0 HTMLspecial ISOpub em dash<br />

lsquo ‘ U+2018 (8216) HTML 4.0 HTMLspecial ISOnum left single quotation mark<br />

rsquo ’ U+2019 (8217) HTML 4.0 HTMLspecial ISOnum right single quotation mark<br />

sbquo ‚ U+201A (8218) HTML 4.0 HTMLspecial NEW single low-9 quotation mark<br />

ldquo “ U+201C (8220) HTML 4.0 HTMLspecial ISOnum left double quotation mark<br />

rdquo ” U+201D (8221) HTML 4.0 HTMLspecial ISOnum right double quotation mark<br />

bdquo „ U+201E (8222) HTML 4.0 HTMLspecial NEW double low-9 quotation mark<br />

dagger † U+2020 (8224) HTML 4.0 HTMLspecial ISOpub dagger<br />

Dagger ‡ U+2021 (8225) HTML 4.0 HTMLspecial ISOpub double dagger<br />

bull • U+2022 (8226) HTML 4.0 HTMLspecial ISOpub bullet (= black small circle) black<br />

hellip … U+2026 (8230) HTML 4.0 HTMLsymbol ISOpub horizontal ellipsis (= three dot leader)<br />

permil ‰ U+2030 (8240) HTML 4.0 HTMLspecial ISOtech per mille sign<br />

prime ′ U+2032 (8242) HTML 4.0 HTMLsymbol ISOtech prime (= minutes = feet)<br />

Prime ″ U+2033 (8243) HTML 4.0 HTMLsymbol ISOtech double prime (= seconds = inches)<br />

lsaquo ‹ U+2039 (8249) HTML 4.0 HTMLspecial ISO proposed single left-pointing angle quotation mark proposed<br />

rsaquo › U+203A (8250) HTML 4.0 HTMLspecial ISO proposed single right-pointing angle quotation mark proposed<br />

oline ‾ U+203E (8254) HTML 4.0 HTMLsymbol NEW overline (= spacing overscore)<br />

frasl ⁄ U+2044 (8260) HTML 4.0 HTMLsymbol NEW fraction slash (= solidus)<br />

euro € U+20AC (8364) HTML 4.0 HTMLspecial NEW euro sign<br />

image ℑ U+2111 (8465) HTML 4.0 HTMLsymbol ISOamso black-letter capital I (= imaginary part)<br />

weierp ℘ U+2118 (8472) HTML 4.0 HTMLsymbol ISOamso script capital P (= power set = Weierstrass p)<br />

real ℜ U+211C (8476) HTML 4.0 HTMLsymbol ISOamso black-letter capital R (= real part symbol)<br />

trade ™ U+2122 (8482) HTML 4.0 HTMLsymbol ISOnum trademark sign<br />

alefsym ℵ U+2135 (8501) HTML 4.0 HTMLsymbol NEW alef symbol (= first transfinite cardinal) alefsym<br />

larr ← U+2190 (8592) HTML 4.0 HTMLsymbol ISOnum leftwards arrow<br />

uarr ↑ U+2191 (8593) HTML 4.0 HTMLsymbol ISOnum upwards arrow



rarr → U+2192 (8594) HTML 4.0 HTMLsymbol ISOnum rightwards arrow<br />

darr ↓ U+2193 (8595) HTML 4.0 HTMLsymbol ISOnum downwards arrow<br />

harr ↔ U+2194 (8596) HTML 4.0 HTMLsymbol ISOamsa left right arrow<br />

crarr ↵ U+21B5 (8629) HTML 4.0 HTMLsymbol NEW downwards arrow with corner leftwards (= carriage return)<br />

lArr ⇐ U+21D0 (8656) HTML 4.0 HTMLsymbol ISOtech leftwards double arrow lArr<br />

uArr ⇑ U+21D1 (8657) HTML 4.0 HTMLsymbol ISOamsa upwards double arrow<br />

rArr ⇒ U+21D2 (8658) HTML 4.0 HTMLsymbol ISOnum rightwards double arrow rArr<br />

dArr ⇓ U+21D3 (8659) HTML 4.0 HTMLsymbol ISOamsa downwards double arrow<br />

hArr ⇔ U+21D4 (8660) HTML 4.0 HTMLsymbol ISOamsa left right double arrow<br />

forall ∀ U+2200 (8704) HTML 4.0 HTMLsymbol ISOtech for all<br />

part ∂ U+2202 (8706) HTML 4.0 HTMLsymbol ISOtech partial differential<br />

exist ∃ U+2203 (8707) HTML 4.0 HTMLsymbol ISOtech there exists<br />

empty ∅ U+2205 (8709) HTML 4.0 HTMLsymbol ISOamso empty set (= null set = diameter)<br />

nabla ∇ U+2207 (8711) HTML 4.0 HTMLsymbol ISOtech nabla (= backward difference)<br />

isin ∈ U+2208 (8712) HTML 4.0 HTMLsymbol ISOtech element of<br />

notin ∉ U+2209 (8713) HTML 4.0 HTMLsymbol ISOtech not an element of<br />

ni ∋ U+220B (8715) HTML 4.0 HTMLsymbol ISOtech contains as member<br />

prod ∏ U+220F (8719) HTML 4.0 HTMLsymbol ISOamsb n-ary product (= product sign) prod<br />

sum ∑ U+2211 (8721) HTML 4.0 HTMLsymbol ISOamsb n-ary summation sum<br />

minus − U+2212 (8722) HTML 4.0 HTMLsymbol ISOtech minus sign<br />

lowast ∗ U+2217 (8727) HTML 4.0 HTMLsymbol ISOtech asterisk operator<br />

radic √ U+221A (8730) HTML 4.0 HTMLsymbol ISOtech square root (= radical sign)<br />

prop ∝ U+221D (8733) HTML 4.0 HTMLsymbol ISOtech proportional to<br />

infin ∞ U+221E (8734) HTML 4.0 HTMLsymbol ISOtech infinity<br />

ang ∠ U+2220 (8736) HTML 4.0 HTMLsymbol ISOamso angle<br />

and ∧ U+2227 (8743) HTML 4.0 HTMLsymbol ISOtech logical and (= wedge)<br />

or ∨ U+2228 (8744) HTML 4.0 HTMLsymbol ISOtech logical or (= vee)<br />

cap ∩ U+2229 (8745) HTML 4.0 HTMLsymbol ISOtech intersection (= cap)<br />

cup ∪ U+222A (8746) HTML 4.0 HTMLsymbol ISOtech union (= cup)<br />

int ∫ U+222B (8747) HTML 4.0 HTMLsymbol ISOtech integral<br />

there4 ∴ U+2234 (8756) HTML 4.0 HTMLsymbol ISOtech therefore<br />

sim ∼ U+223C (8764) HTML 4.0 HTMLsymbol ISOtech tilde operator (= varies with = similar to) sim<br />

cong ≅ U+2245 (8773) HTML 4.0 HTMLsymbol ISOtech congruent to<br />

asymp ≈ U+2248 (8776) HTML 4.0 HTMLsymbol ISOamsr almost equal to (= asymptotic to)<br />

ne ≠ U+2260 (8800) HTML 4.0 HTMLsymbol ISOtech not equal to<br />

equiv ≡ U+2261 (8801) HTML 4.0 HTMLsymbol ISOtech identical to; sometimes used for 'equivalent to'<br />

le ≤ U+2264 (8804) HTML 4.0 HTMLsymbol ISOtech less-than or equal to<br />

ge ≥ U+2265 (8805) HTML 4.0 HTMLsymbol ISOtech greater-than or equal to



sub ⊂ U+2282 (8834) HTML 4.0 HTMLsymbol ISOtech subset of<br />

sup ⊃ U+2283 (8835) HTML 4.0 HTMLsymbol ISOtech superset of sup<br />

nsub ⊄ U+2284 (8836) HTML 4.0 HTMLsymbol ISOamsn not a subset of<br />

sube ⊆ U+2286 (8838) HTML 4.0 HTMLsymbol ISOtech subset of or equal to<br />

supe ⊇ U+2287 (8839) HTML 4.0 HTMLsymbol ISOtech superset of or equal to<br />

oplus ⊕ U+2295 (8853) HTML 4.0 HTMLsymbol ISOamsb circled plus (= direct sum)<br />

otimes ⊗ U+2297 (8855) HTML 4.0 HTMLsymbol ISOamsb circled times (= vector product)<br />

perp ⊥ U+22A5 (8869) HTML 4.0 HTMLsymbol ISOtech up tack (= orthogonal to = perpendicular) perp<br />

sdot ⋅ U+22C5 (8901) HTML 4.0 HTMLsymbol ISOamsb dot operator sdot<br />

lceil ⌈ U+2308 (8968) HTML 4.0 HTMLsymbol ISOamsc left ceiling (= APL upstile)<br />

rceil ⌉ U+2309 (8969) HTML 4.0 HTMLsymbol ISOamsc right ceiling<br />

lfloor ⌊ U+230A (8970) HTML 4.0 HTMLsymbol ISOamsc left floor (= APL downstile)<br />

rfloor ⌋ U+230B (8971) HTML 4.0 HTMLsymbol ISOamsc right floor<br />

lang 〈 U+2329 (9001) HTML 4.0 HTMLsymbol ISOtech left-pointing angle bracket (= bra) lang<br />

rang 〉 U+232A (9002) HTML 4.0 HTMLsymbol ISOtech right-pointing angle bracket (= ket) rang<br />

loz ◊ U+25CA (9674) HTML 4.0 HTMLsymbol ISOpub lozenge<br />

spades ♠ U+2660 (9824) HTML 4.0 HTMLsymbol ISOpub black spade suit black<br />

clubs ♣ U+2663 (9827) HTML 4.0 HTMLsymbol ISOpub black club suit (= shamrock) black<br />

hearts ♥ U+2665 (9829) HTML 4.0 HTMLsymbol ISOpub black heart suit (= valentine) black<br />

diams ♦ U+2666 (9830) HTML 4.0 HTMLsymbol ISOpub black diamond suit black<br />

Notes:<br />

• DTD: the full public DTD name (where the character entity name is defined) is actually mapped from one of the following three defined named entities:<br />

HTMLlat1 maps to:<br />

• PUBLIC "-//W3C//ENTITIES Latin 1//EN//HTML" in HTML (the DTD is implicitly defined, no system URI is needed);<br />

• PUBLIC "-//W3C//ENTITIES Latin 1 for XHTML//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent" in XHTML 1.0;<br />

HTMLsymbol maps to:<br />

• PUBLIC "-//W3C//ENTITIES Symbols//EN//HTML" in HTML (the DTD is implicitly defined, no system URI is needed);<br />

• PUBLIC "-//W3C//ENTITIES Symbols for XHTML//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent" in XHTML 1.0;<br />

HTMLspecial maps to:<br />

• PUBLIC "-//W3C//ENTITIES Special//EN//HTML" in HTML (the DTD is implicitly defined, no system URI is needed);<br />

• PUBLIC "-//W3C//ENTITIES Special for XHTML//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent" in XHTML 1.0.<br />

• Old ISO subset: these are old (documented) character subsets used in legacy encodings before the unification<br />

within ISO 10646.<br />

• Description: the standard ISO 10646 and Unicode character name is displayed first for each character, with<br />

non-standard but legacy synonyms shown in italics between parentheses after an equal sign.<br />

• spaces: a blue background has been used in order to display each space's width.<br />

• ISO proposed: these characters have been standardized in ISO 10646 after the release of HTML 4.0.<br />

• ligature: this is a standard misnomer as this is a separate character in some languages.<br />

• black: here it seems to mean filled as opposed to hollow.<br />

• alefsym: 'alef symbol' is NOT the same as U+05D0 'Hebrew letter alef', although the same glyph could be used<br />

to depict both characters.<br />

• lArr: ISO 10646 does not say that 'leftwards double arrow' is the same as the 'is implied by' arrow but also does<br />

not have any other character for that function. So lArr can be used for 'is implied by' as ISOtech suggests.<br />

• rArr: ISO 10646 does not say that 'rightwards double arrow' is the 'implies' character but does not have another<br />

character with this function so rArr can be used for 'implies' as ISOtech suggests.<br />

• prod: 'n-ary product' is NOT the same character as U+03A0 'Greek capital letter Pi' though the same glyph might<br />

be used for both.<br />

• sum: 'n-ary summation' is NOT the same character as U+03A3 'Greek capital letter Sigma' though the same glyph<br />

might be used for both.<br />

• sim: 'tilde operator' is NOT the same character as U+007E 'tilde', although the same glyph might be used to<br />

represent both.<br />

• sup: note that nsup, U+2285 'not a superset of', is not covered by the Symbol font encoding and is not included.<br />

Should it be, for symmetry? It is in the ISOamsn subset.<br />

• perp: Unicode only defines U+22A5 as the "up tack". The Unicode symbol for "perpendicular" is U+27C2. The<br />

two symbols look similar, but are separate in Unicode. However, HTML uses U+22A5 as its "perpendicular"<br />

symbol. This is a discrepancy between HTML and Unicode.<br />

• sdot: 'dot operator' is NOT the same character as U+00B7 'middle dot'.<br />

• lang: 'left-pointing angle bracket' is NOT the same character as U+003C 'less than' or U+2039 'single<br />

left-pointing angle quotation mark'.<br />

• rang: 'right-pointing angle bracket' is NOT the same character as U+003E 'greater than' or U+203A 'single<br />

right-pointing angle quotation mark'.<br />
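The table columns (entity name, character, code point, decimal value) can be cross-checked programmatically. The following is a minimal sketch using Python's standard html.entities module, whose name2codepoint dictionary covers the HTML 4 entity names; the helper name describe is an assumption made for this illustration and is not part of any specification.<br />

```python
from html.entities import name2codepoint


def describe(name):
    """Return (character, 'U+XXXX', decimal value) for an HTML 4 entity name."""
    code_point = name2codepoint[name]
    return chr(code_point), f"U+{code_point:04X}", code_point


if __name__ == "__main__":
    # A few rows from the table above, including two from the notes.
    for name in ("eacute", "OElig", "perp", "sum"):
        print(name, *describe(name))

    # As the note on sum says, n-ary summation is not Greek capital Sigma.
    print(describe("sum")[0] == describe("Sigma")[0])
```

Looking up sum and Sigma side by side makes the distinction drawn in the notes concrete: the glyphs may look alike, but the code points differ.<br />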

Entities representing special characters in XHTML<br />

The XHTML DTDs explicitly declare 253 entities (including the 5 predefined entities of <strong>XML</strong> 1.0) whose expansion<br />

is a single character, which can therefore be informally referred to as "character entities". These (with the exception<br />

of the &apos; entity) have the same names and represent the same characters as the 252 character entities in HTML.<br />

Also, by virtue of being <strong>XML</strong>, XHTML documents may reference the predefined &apos; entity, which is not one of<br />

the 252 character entities in HTML. Additional entities of any size may be defined on a per-document basis.<br />

However, the usability of entity references in XHTML is affected by how the document is being processed:<br />

• If the document is read by a conforming HTML processor, then only the 252 HTML character entities can safely<br />

be used. The use of &apos; or custom entity references may not be supported and may produce unpredictable<br />

results.<br />

• If the document is read by an <strong>XML</strong> parser that does not or cannot read external entities, then only the five built-in<br />

<strong>XML</strong> character entities (see above) can safely be used, although other entities may be used if they are declared in



the internal DTD subset.<br />

• If the document is read by an <strong>XML</strong> parser that does read external entities, then the five built-in <strong>XML</strong> character<br />

entities can safely be used. The other 248 HTML character entities can be used as long as the XHTML DTD is<br />

accessible to the parser at the time the document is read. Other entities may also be used if they are declared in the<br />

internal DTD subset.<br />

Because of the special &apos; case mentioned above, only &quot;, &amp;, &lt;, and &gt; will work in all processing<br />

situations.<br />
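The difference between these processing situations can be observed with a small experiment. The sketch below assumes Python's standard xml.etree.ElementTree parser, which does not fetch external DTDs, and therefore corresponds to the second case in the list above; the helper name parse_text is hypothetical.<br />

```python
import xml.etree.ElementTree as ET


def parse_text(xml_source):
    """Return the root element's text, or None if the parser rejects the input."""
    try:
        return ET.fromstring(xml_source).text
    except ET.ParseError:
        return None


# The five predefined XML entities need no declaration.
predefined = parse_text("<p>&lt;&amp;&gt;&quot;&apos;</p>")

# An HTML entity name is undefined when no DTD has been read.
undeclared = parse_text("<p>&eacute;</p>")

# The same name works once it is declared in the internal DTD subset.
declared = parse_text(
    '<!DOCTYPE p [<!ENTITY eacute "&#233;">]><p>&eacute;</p>'
)

if __name__ == "__main__":
    print(repr(predefined), repr(undeclared), repr(declared))
```

Declaring only the entities a document actually uses in its internal subset is one way to stay independent of whether the XHTML DTD is reachable at parse time.<br />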

See also<br />

• Character encodings in HTML<br />

• HTML decimal character rendering<br />

• SGML entity<br />

References<br />

• Unicode Consortium [1] . See also: Unicode Consortium<br />

• UnicodeData.txt from the Unicode Consortium [2]<br />

• World Wide Web Consortium [3] . See also: World Wide Web Consortium<br />

• <strong>XML</strong> 1.0 spec [4]<br />

• HTML 2.0 spec [5]<br />

• HTML 3.2 spec [6]<br />

• HTML 4.0 spec [7]<br />

• HTML 4.01 spec [8]<br />

• XHTML 1.0 spec [9]<br />

• <strong>XML</strong> Entity Definitions for Characters [10]<br />

• The normative reference to RFC 2070 (still found in DTDs defining the character entities for HTML or XHTML)<br />

is historic; this RFC (along with other RFCs related to different parts of the HTML specification) has been<br />

deprecated in favor of the newer informational RFC 2854 which defines the "text/html" MIME type and<br />

references directly the W3C specifications for the actual HTML content.<br />

• Numerical Reference of Unicode code points at Wikibooks<br />

External links<br />

• Character entity references in HTML 4 [11] at the W3C<br />

• Multilanguage special character entity list [12] - List of special characters, entities and their names.<br />

References<br />

[1] http://www.unicode.org/<br />

[2] http://www.unicode.org/Public/UNIDATA/UnicodeData.txt<br />

[3] http://www.w3.org/<br />

[4] http://www.w3.org/TR/REC-xml/<br />

[5] http://www.w3.org/MarkUp/html-spec/html-spec_toc.html<br />

[6] http://www.w3.org/TR/REC-html32<br />

[7] http://www.w3.org/TR/1998/REC-html40-19980424/<br />

[8] http://www.w3.org/TR/REC-html40/<br />

[9] http://www.w3.org/TR/xhtml1/<br />

[10] http://www.w3.org/TR/xml-entity-names/<br />

[11] http://www.w3.org/TR/html4/sgml/entities.html<br />

[12] http://www.seomister.com/ch



Log4js<br />

Developer(s) Stephan Strittmatter, Seth Chisamore<br />

Stable release 1.0 / August 4, 2008<br />

Operating system Windows, Linux, Mac OS<br />

Type Framework<br />

License Apache Software Foundation<br />

Website http://log4js.berlios.de [1]<br />

Log4js is a framework written in JavaScript to log application events.<br />

Its API closely follows that of Log4j. It is distributed under the licence of the Apache Software<br />

Foundation.<br />

Functionality<br />

The basic concept is the same as in Log4j: the log levels and almost<br />

all methods are identical.<br />

One special feature of Log4js is the ability to log browser events<br />

remotely on the server. Using Ajax, the logging events can be sent<br />

in several formats (<strong>XML</strong>, JSON, plain ASCII etc.) to<br />

the server to be evaluated there.<br />

Appender<br />

The following appenders are currently implemented:<br />

AjaxAppender<br />

Sends the logs via XMLHttpRequest (Ajax) to the server to be processed there.<br />

ConsoleAppender<br />

Logs within the HTML page or in a separate window.<br />

FileAppender<br />

Writes to a local file (Internet Explorer and Mozilla supported).<br />

JSConsoleAppender<br />

Appender for the JavaScript Console of Mozilla, Opera and Safari.<br />

MetatagAppender<br />

Adds the log events to meta tags in the DOM of the document.<br />

WindowsEventsAppender<br />

Using Internet Explorer it is possible to log to Windows System Events.<br />

[Figure: class diagram]<br />

Layout<br />

The Layout classes provide different formattings of the events:<br />

BasicLayout<br />

Simple textual output of the events.<br />

HtmlLayout<br />

Formats the event as an HTML element.<br />

JSONLayout<br />

Converts the events to JSON objects, which can be read in many other programming languages such as Perl, PHP<br />

and Java.<br />

<strong>XML</strong>Layout<br />

<strong>XML</strong> formatted output.<br />
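The separation between appenders (where an event goes) and layouts (how it is formatted) can be illustrated by analogy. The sketch below does not use the Log4js API; it assumes Python's standard logging module, in which a handler plays the role of an appender and a formatter the role of a layout. The names JsonFormatter, build_logger and "layout-demo" are made up for this illustration.<br />

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Layout analogue: renders each log event as a JSON object."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


def build_logger(stream):
    """Attach one stream handler (appender analogue) with a JSON layout."""
    logger = logging.getLogger("layout-demo")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


# Log one event into an in-memory buffer instead of a server or file.
buffer = io.StringIO()
build_logger(buffer).warning("disk almost full")

if __name__ == "__main__":
    print(buffer.getvalue())
```

Swapping the formatter changes the output format without touching the destination, which is the same division of labour the appender and layout lists above describe.<br />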

External links<br />

• Log4js Homepage [1]<br />

• Log4js Wiki [2]<br />

• Apache Logging Homepage [3]<br />

References<br />

[1] http://log4js.berlios.de<br />

[2] http://scratchpad.wikia.com/wiki/Log4js<br />

[3] http://logging.apache.org/



MAREC<br />

The MAtrixware REsearch Collection (MAREC) is a standardised patent data corpus available for research<br />

purposes. MAREC could be defined as a corpus that seeks to represent patent documents of several languages in order<br />

to answer specific research questions. [1] [2] It consists of 19 million patent documents in different languages,<br />

normalised to a highly specific <strong>XML</strong> schema.<br />

MAREC is intended as raw material for research in areas such as information retrieval, natural language processing<br />

or machine translation, which require large amounts of complex documents. [3] The collection contains documents in<br />

19 languages, the majority being English, German and French, and about half of the documents include full text.<br />

In MAREC, the documents from different countries and sources are normalised to a common <strong>XML</strong> format with a<br />

uniform patent numbering scheme and citation format. The standardised fields include dates, countries, languages,<br />

references, person names, and companies as well as subject classifications such as IPC codes. [4]<br />

MAREC is a comparable corpus, where many documents are available in similar versions in other languages. A<br />

comparable corpus can be defined as consisting of texts that share similar topics (for example, news text from the same<br />

time period in different countries), while a parallel corpus is defined as a collection of documents with aligned translations<br />

from the source to the target language. [5] Since the versions of a patent document refer to the same “invention” or “concept of<br />

idea”, each text is a rendering of the same invention, but it does not have to be a direct translation of the other text:<br />

parts could have been removed or added for clarification.<br />

The 19,386,697 <strong>XML</strong> files measure a total of 621 GB and are hosted by the Information Retrieval Facility. Access<br />

and support are free of charge for research purposes.<br />

External links<br />

• User guide and statistics [6]<br />

• Information Retrieval Facility [7]<br />

• "One week of MAREC" sample [8]<br />

References<br />

[1] Merz C., (2003) A Corpus Query Tool For Syntactically Annotated Corpora Licentiate Thesis, The University of Zurich, Department of<br />

Computational Linguistics, Switzerland<br />

[2] Biber D., Conrad S., and Reppen R. (2000) Corpus Linguistics: Investigating <strong>Language</strong> Structure and Use. Cambridge University Press, 2nd<br />

edition<br />

[3] Manning, C. D. and Schütze, H. (2002) Foundations of statistical natural language processing Cambridge, MA, Massachusetts Institute of<br />

Technology (MIT) ISBN 0-262-13360-1.<br />

[4] European Patent Office (2009) Guidelines for examination in the European Patent Office (http://documents.epo.org/projects/babylon/<br />

eponet.nsf/0/1AFC30805E91D074C125758A0051718A/$File/guidelines_2009_complete_en.pdf), Published by European Patent Office,<br />

Germany (April 2009)<br />

[5] Järvelin A. , Talvensaari T. , Järvelin Anni, (2008) Data driven methods for improving mono- and cross-lingual IR performance in noisy<br />

environments, Proceedings of the second workshop on Analytics for noisy unstructured text data, (Singapore)<br />

[6] http://www.matrixware.com/documentation/marec/index.jsp?topic=/com.MxW.MAREC/ch02.html<br />

[7] http://ir-facility.org<br />

[8] http://matrixware.net/tos/marec/



Media Object Server<br />

Media Object Server (MOS) is an <strong>XML</strong>-based protocol for transferring information between newsroom automation<br />

systems and other associated systems such as media servers.<br />

The MOS protocol allows a variety of devices to be controlled from one central device or piece of software. This<br />

limits the need to have operators in multiple locations throughout the studio environment. For example, multiple<br />

character generators can be fired from a single control workstation, without needing an operator at each CG console.<br />

External references<br />

• http://www.mosprotocol.com/<br />

• http://www.codeproject.com/KB/cs/mosprotocol.aspx by Rizwan Qureshi<br />

METS<br />

The Metadata Encoding and Transmission Standard is a metadata standard for encoding descriptive, administrative,<br />

and structural metadata regarding objects within a digital library, expressed using the <strong>XML</strong> schema language of the<br />

World Wide Web Consortium. The standard is maintained in the Network Development and MARC Standards Office<br />

of the Library of Congress, and is being developed as an initiative of the Digital Library Federation.<br />

Introduction<br />

METS is an <strong>XML</strong> Schema designed for the purpose of:<br />

• Creating <strong>XML</strong> document instances that express the hierarchical structure of digital library objects.<br />

• Recording the names and locations of the files that comprise those objects.<br />

• Recording associated metadata.<br />

METS can, therefore, be used as a tool for modeling real world objects, such as particular document types.<br />

Depending on its use, a METS document could be used in the role of Submission Information Package (SIP),<br />

Archival Information Package (AIP), or Dissemination Information Package (DIP) within the Open Archival<br />

Information System (OAIS) Reference Model.<br />

Digital libraries vs. traditional libraries<br />

Maintaining a library of digital objects requires maintaining metadata about those objects. The metadata necessary<br />

for successful management and use of digital objects is both more extensive than and different from the metadata<br />

used for managing collections of printed works and other physical materials.<br />

• Where a traditional library may record descriptive metadata regarding a book in its collection, the book will not<br />

dissolve into a series of unconnected pages if the library fails to record structural metadata regarding the book's<br />

organization, nor will scholars be unable to evaluate the book's worth if the library fails to note that the book was<br />

produced using a Ryobi offset press.<br />

• The same cannot be said for a digital library. Without structural metadata, the page image or text files<br />

comprising the digital work are of little use, and without technical metadata regarding the digitization process,<br />

scholars may be unsure of how accurate a reflection of the original the digital version provides.<br />

• However, in a digital library it is possible to create an e-book-like PDF or TIFF file, which can be viewed as a single<br />

physical book and reflects the integrity of the original.



Characteristics of METS documents<br />

The METS standard has the following characteristics:<br />

• An open standard (non-proprietary)<br />

• Developed by the library community<br />

• Relatively simple<br />

• Extensible<br />

• Modular<br />

Sections of a METS document<br />

[Figure: Example of a METS document]<br />

A METS document has seven sections:<br />

• METS header: Contains metadata describing the METS document itself, such as its creator, editor, etc.<br />

• Descriptive Metadata: May contain internally embedded metadata or point to metadata external to the METS<br />

document. Multiple instances of both internal and external descriptive metadata may be included.<br />

• Administrative Metadata: Provides information regarding how files were created and stored, intellectual<br />

property rights, metadata regarding the original source object from which the digital library object derives, and<br />

information regarding the provenance of files comprising the digital library object (such as master/derivative<br />

relationships, migrations, and transformations). As with descriptive metadata, administrative metadata may be<br />

internally encoded or external to the METS document.<br />

• File Section: Lists all files containing content which comprise the electronic versions of the digital object. file<br />

elements may be grouped within fileGrp elements to subdivide files by object version.<br />

• Structural Map: Outlines a hierarchical structure for the digital library object, and links the elements of that<br />

structure to associated content files and metadata.<br />

• Structural Links: Allows METS creators to record the existence of hyperlinks between nodes in the Structural<br />

Map. This is of particular value in using METS to archive Websites.<br />

• Behavioral: Used to associate executable behaviors with content in the METS object. Each behavior has a<br />

mechanism element identifying a module of executable code that implements behaviors defined abstractly by its<br />

interface definition.
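The seven sections listed above can be sketched as an empty document skeleton. This is a minimal illustration, not a valid METS document: the element names (`metsHdr`, `dmdSec`, `amdSec`, `fileSec`, `structMap`, `structLink`, `behaviorSec`) and the namespace URI are given as commonly used with the METS schema and should be checked against the schema itself, and the helper names `build_skeleton` and `local_names` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI commonly used for the METS schema.
METS_NS = "http://www.loc.gov/METS/"

# One element name per section, in the order the sections are listed above.
SECTIONS = [
    "metsHdr",      # METS header
    "dmdSec",       # descriptive metadata
    "amdSec",       # administrative metadata
    "fileSec",      # file section
    "structMap",    # structural map
    "structLink",   # structural links
    "behaviorSec",  # behavioral section
]


def build_skeleton():
    """Return an empty <mets> element with one empty child per section."""
    ET.register_namespace("mets", METS_NS)
    root = ET.Element(f"{{{METS_NS}}}mets")
    for name in SECTIONS:
        ET.SubElement(root, f"{{{METS_NS}}}{name}")
    return root


def local_names(root):
    """Return the child element names with the namespace part removed."""
    return [child.tag.split("}", 1)[1] for child in root]


if __name__ == "__main__":
    print(ET.tostring(build_skeleton(), encoding="unicode"))
```

A real document would fill each section with content; this sketch only shows how the sections sit side by side under a single root element.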



METS profiles<br />

METS Profiles are intended to describe a class of METS documents in sufficient detail to provide both document<br />

authors and programmers the guidance they require to create and process METS documents conforming with a<br />

particular profile.<br />

A profile is expressed as an <strong>XML</strong> document. There is a schema for this purpose. The profile expresses the<br />

requirements that a METS document must satisfy. A sufficiently explicit METS Profile may be considered a data<br />

standard.<br />

METS Profiles in use<br />

• Musical Score (may be a score, score and parts, or a set of parts only)<br />

• Print Material (books, pamphlets, etc.)<br />

• Music Manuscript (score or sketches)<br />

• Recorded Event (audio or video)<br />

• PDF Document<br />

• Bibliographic Record<br />

• Photograph<br />

• Compact Disc<br />

• Collection<br />

See also<br />

• Digital Item Declaration <strong>Language</strong><br />

• Dublin Core, an ISO metadata standard<br />

• Preservation Metadata: Implementation Strategies (PREMIS)<br />

• Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)<br />

External links<br />

• Network Development and MARC Standards Office [1]<br />

• Library of Congress [2]<br />

• Digital Library Federation [3]<br />

• METS Official web site [1]<br />

References<br />

[1] http://www.loc.gov/standards/mets/<br />

[2] http://www.loc.gov/index.html<br />

[3] http://www.diglib.org/


Numeric character reference 50<br />

Numeric character reference<br />

A numeric character reference (NCR) is a common markup construct used in SGML and other SGML-related<br />

markup languages such as HTML and <strong>XML</strong>. It consists of a short sequence of characters that, in turn, represent a<br />

single character from the Universal Character Set (UCS) of Unicode. NCRs are typically used in order to represent<br />

characters that are not directly encodable in a particular document. When the document is interpreted by a<br />

markup-aware reader, each NCR is treated as if it were the character it represents.<br />

Example<br />

In SGML, HTML, and <strong>XML</strong>, the following are all valid numeric character references for the Greek capital letter<br />

Sigma ("Σ"):<br />

Numerical character reference of Unicode character Σ<br />

Σ = U+03A3: GREEK CAPITAL LETTER SIGMA (3A3 in base 16 = 931 in base 10)<br />

Unicode character | Numerical base | Numerical reference in markup | Effect<br />

U+03A3 | Decimal | &#931; | Σ<br />

U+03A3 | Decimal | &#0931; | Σ<br />

U+03A3 | Hexadecimal | &#x3A3; | Σ<br />

U+03A3 | Hexadecimal | &#x03A3; | Σ<br />

U+03A3 | Hexadecimal | &#x3a3; | Σ<br />
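As a quick check, the five spellings listed above can be decoded with a markup-aware routine. The sketch below uses `html.unescape` from the Python standard library as a stand-in for a markup-aware reader; it follows HTML rules, so it illustrates the idea rather than the behaviour of every SGML or XML processor.

```python
import html

# The five spellings of the reference: decimal, zero-padded decimal,
# upper-case hexadecimal, zero-padded hexadecimal, lower-case hexadecimal.
REFERENCES = ["&#931;", "&#0931;", "&#x3A3;", "&#x03A3;", "&#x3a3;"]

# Each reference is replaced by the character it represents.
decoded = [html.unescape(ref) for ref in REFERENCES]

if __name__ == "__main__":
    for ref, char in zip(REFERENCES, decoded):
        print(f"{ref!r} -> {char!r} (U+{ord(char):04X})")
```

Leading zeros and the case of the hexadecimal digits do not change the referenced code point.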

Discussion<br />

<strong>Markup</strong> languages are typically defined in terms of UCS or Unicode characters. That is, a document consists, at its<br />

most fundamental level of abstraction, of a sequence of characters, which are abstract units that exist independently<br />

of any encoding.<br />

Ideally, when the characters of a document utilizing a markup language are encoded for storage or transmission over<br />

a network as a sequence of bits, the encoding that is used will be one that supports representing each and every<br />

character in the document, if not in the whole of Unicode, directly as a particular bit sequence.<br />

Sometimes, though, for reasons of convenience or due to technical limitations, documents are encoded with an<br />

encoding that cannot represent some characters directly. For example, the widely used encodings based on ISO 8859<br />

can only represent, at most, 256 unique characters as one 8-bit byte each.<br />

Documents are rarely, in practice, ever allowed to use more than one encoding internally, so the onus is usually on<br />

the markup language to provide a means for document authors to express unencodable characters in terms of<br />

encodable ones. This is generally done through some kind of "escaping" mechanism.<br />

The SGML-based markup languages allow document authors to use special sequences of characters from the ASCII<br />

range (the first 128 code points of Unicode) to represent, or reference, any Unicode character, regardless of whether<br />

the character being represented is directly available in the document's encoding. These special sequences are<br />

character references.<br />
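The escaping idea can be tried out directly. The sketch below assumes Python's built-in `xmlcharrefreplace` error handler, which substitutes a decimal character reference for every character the target encoding cannot represent; the helper name `encode_with_ncrs` is hypothetical.

```python
TEXT = "Sigma: \u03a3"  # U+03A3 has no byte value in ISO 8859-1


def encode_with_ncrs(s, encoding):
    """Encode s, replacing unencodable characters with decimal references."""
    return s.encode(encoding, errors="xmlcharrefreplace")


if __name__ == "__main__":
    try:
        TEXT.encode("iso-8859-1")
    except UnicodeEncodeError as exc:
        print("direct encoding fails:", exc.reason)
    # The reference itself uses only ASCII characters, so it always encodes.
    print(encode_with_ncrs(TEXT, "iso-8859-1"))
```

Characters that the encoding can represent pass through unchanged; only the unencodable ones are rewritten.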

Character references that are based on the referenced character's UCS or Unicode "code point" are called numeric<br />

character references. In HTML 4 and in all versions of XHTML and <strong>XML</strong>, the code point can be expressed either as<br />

a decimal (base 10) number or as a hexadecimal (base 16) number. The syntax is as follows:



Character U+0026 (ampersand), followed by character U+0023 (number sign), followed by one of the following<br />

choices:<br />

• one or more decimal digits zero (U+0030) through nine (U+0039); or<br />

• character U+0078 ("x") followed by one or more hexadecimal digits, which are zero (U+0030) through nine<br />

(U+0039), Latin capital letter A (U+0041) through F (U+0046), and Latin small letter a (U+0061) through f<br />

(U+0066);<br />

all followed by character U+003B (semicolon). Older versions of HTML disallowed the hexadecimal syntax.<br />
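The syntax described above maps closely onto a regular expression. The following is a minimal sketch that checks only the syntax (lower-case "x", at least one digit, terminating semicolon); it does not check whether the resulting code point is permitted by a particular markup language. The names `NCR` and `parse_ncr` are hypothetical.

```python
import re

# "&#", then decimal digits or "x" plus hexadecimal digits, then ";".
NCR = re.compile(r"&#(?:([0-9]+)|x([0-9A-Fa-f]+));")


def parse_ncr(ref):
    """Return the code point named by ref, or None if the syntax does not match."""
    match = NCR.fullmatch(ref)
    if match is None:
        return None
    decimal, hexadecimal = match.groups()
    if decimal is not None:
        return int(decimal)
    return int(hexadecimal, 16)


if __name__ == "__main__":
    for candidate in ["&#931;", "&#x3a3;", "&#931", "&#xZZ;"]:
        print(candidate, "->", parse_ncr(candidate))
```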

The characters that make up a numeric character reference can be represented in every character encoding used in<br />

computing and telecommunications today, so there is no risk of the reference itself being unencodable.<br />

There is another kind of character reference called a character entity reference, which allows a character to be<br />

referred to by a name instead of a number. (Naming a character creates a character entity.) HTML defines some<br />

character entities, but not many; all other characters can only be included by direct encoding or using NCRs.<br />

Restrictions<br />

The Universal Character Set defined by ISO 10646 is the "document character set" of SGML and HTML 4, so by<br />

default, any character in such a document, and any character referenced in such a document, must be in the UCS.<br />

While the syntax of SGML does not prohibit references to unassigned code points, such as &#xFFFF;,<br />

SGML-derived markup languages such as HTML and <strong>XML</strong> can, and often do, restrict numeric character references<br />

to only those code points that are assigned to characters or that have not been permanently left unassigned.<br />

Restrictions may also apply for other reasons. For example, in HTML 4, &#12;, which is a reference to a<br />

non-printing "form feed" control character, is allowed because a form feed character is allowed. But in <strong>XML</strong>, the<br />

form feed character cannot be used, not even by reference. As another example, &#128;, which is a reference to<br />

another control character, is not allowed to be used or referenced in HTML 4 (<strong>XML</strong> 1.0 permits it but discourages its use). When used in HTML,<br />

it is usually not flagged as an error by web browsers, some of which attempt to interpret it as a reference to the<br />

character represented by code value 128 in the Windows-1252 encoding: "€", which actually should be represented<br />

as &#8364;. As a further example, prior to the publication of <strong>XML</strong> 1.0 Second Edition on October 6, 2000, <strong>XML</strong> 1.0<br />

was based on an older version of ISO 10646 and prohibited using characters above U+FFFD, except in character<br />

data, thus making a reference like &#65536; (U+10000) illegal. In <strong>XML</strong> 1.1 and newer editions of <strong>XML</strong> 1.0, such a<br />

reference is allowed, because the available character repertoire was explicitly extended.<br />
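Some of these restrictions can be observed with an actual parser. The sketch below uses the XML parser in the Python standard library, which implements XML 1.0 in its current edition; other parsers may report the error differently. The helper name `resolve` is hypothetical.

```python
import xml.etree.ElementTree as ET


def resolve(reference):
    """Parse <a>reference</a> and return the resolved text.

    Returns None when the parser rejects the document.
    """
    try:
        return ET.fromstring(f"<a>{reference}</a>").text
    except ET.ParseError:
        return None


if __name__ == "__main__":
    print(repr(resolve("&#931;")))    # an ordinary assigned character
    print(repr(resolve("&#12;")))     # form feed, rejected by XML
    print(repr(resolve("&#65536;")))  # U+10000, allowed in current XML 1.0
```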

<strong>Markup</strong> languages also place restrictions on where character references can occur.<br />

See also<br />

• Character entity reference<br />

• List of <strong>XML</strong> and HTML character entity references


Office Open <strong>XML</strong> 52<br />

Office Open <strong>XML</strong><br />

Office Open <strong>XML</strong> series: Office Open <strong>XML</strong> file formats; Open Packaging Conventions; Open Specification Promise; Vector <strong>Markup</strong> <strong>Language</strong>; Office Open <strong>XML</strong> software; Comparison of Office Open <strong>XML</strong> software; Office Open <strong>XML</strong> standardization<br />

Type of format | Filename extension | Internet media type [1] | Extended from<br />

Document file format | .docx or .docm | application/vnd.openxmlformats-officedocument.wordprocessingml.document | <strong>XML</strong>, DOC, WordProcessingML<br />

Presentation | .pptx or .pptm | application/vnd.openxmlformats-officedocument.presentationml.presentation | <strong>XML</strong>, PPT<br />

Spreadsheet | .xlsx or .xlsm | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | <strong>XML</strong>, XLS, SpreadsheetML<br />

All three formats: developed by Microsoft, Ecma, ISO/IEC; standard(s) ECMA-376, ISO/IEC 29500; websites ECMA-376 [2], ISO/IEC 29500:2008 [3]<br />

Office Open <strong>XML</strong> (also informally known as OO<strong>XML</strong> or Open<strong>XML</strong>) is a zipped, <strong>XML</strong>-based file format<br />

developed by Microsoft [4] for representing spreadsheets, charts, presentations and word processing documents. The<br />

Office Open <strong>XML</strong> specification has been standardised both by Ecma and, in a later edition, by ISO and IEC as an<br />

International Standard (ISO/IEC 29500).<br />

Starting with Microsoft Office 2007, the Office Open <strong>XML</strong> file formats (ECMA-376) have become the default [5]<br />

target file format of Microsoft Office, [6] [7] although the Strict variant of the standard is not fully supported. [8]<br />

Background<br />

In 2000, Microsoft released an initial version of an <strong>XML</strong>-based format for Microsoft Excel, which was incorporated<br />

in Office XP. In 2002, a new file format for Microsoft Word followed. [9] The Excel and Word formats—known as<br />

the Microsoft Office <strong>XML</strong> formats—were later incorporated into the 2003 release of Microsoft Office.<br />

Microsoft announced in November 2005 that it would co-sponsor standardization of the new version of their<br />

<strong>XML</strong>-based formats through Ecma International, as "Office Open <strong>XML</strong>". [10]



Standardization process<br />

Microsoft submitted initial material to Ecma International Technical Committee TC45, where it was standardized to<br />

become ECMA-376, approved in December 2006. [11]<br />

This standard was then fast-tracked in the Joint Technical Committee 1 of ISO and IEC.<br />

After initially failing to pass, an amended version of the format received the necessary votes for approval as an<br />

ISO/IEC Standard as the result of a JTC 1 fast tracking standardization process that concluded in April 2008. [12] The<br />

resulting four part International Standard (designated ISO/IEC 29500:2008) was published in November 2008 [13]<br />

and can be downloaded from the ITTF. [14] A technically equivalent set of texts is published by Ecma as ECMA-376<br />

Office Open <strong>XML</strong> File Formats — 2nd edition (December 2008); they can be downloaded from their web site. [15]<br />

Licensing<br />

Under the Ecma International code of conduct in patent matters, [16] participating and approving member<br />

organisations of ECMA are required to make available their patent rights on a Reasonable and Non Discriminatory<br />

(RAND) basis.<br />

Holders of patents which concern ISO/IEC International Standards may agree to a standardized license governing the<br />

terms under which such patents may be licensed, in accord with the ISO/IEC/ITU common patent policy [17] .<br />

Microsoft, the main contributor to the standard, provided a Covenant Not to Sue [18] for its patent licensing. The<br />

covenant received a mixed reception, with some, like the Groklaw blog, criticizing it [19] and others, such as Lawrence<br />

Rosen (an attorney and lecturer at Stanford Law School), endorsing it. [20]<br />

Microsoft has added the format to their Open Specification Promise [21] in which<br />

Microsoft irrevocably promises not to assert any Microsoft Necessary Claims against you for making,<br />

using, selling, offering for sale, importing or distributing any implementation to the extent it conforms to<br />

a Covered Specification […]<br />

This is limited to applications which do not deviate from the ISO/IEC 29500:2008 or Ecma-376 standard and to<br />

parties that do not "file, maintain or voluntarily participate in a patent infringement lawsuit against a Microsoft<br />

implementation of such Covered Specification". [22] [23] The Open Specification Promise was included in documents<br />

submitted to ISO/IEC in support of the ECMA-376 fast track submission. [24] Ecma International asserted that, "The<br />

OSP enables both open source and commercial software to implement [the specification]". [25]<br />

Versions<br />

The Office Open <strong>XML</strong> specification exists in a number of versions.<br />

ECMA-376 1st edition (2006)<br />

The ECMA standard is structured in five parts to meet the needs of different audiences. [15]<br />

Part 1. Fundamentals<br />

Vocabulary, notational conventions and abbreviations<br />

Summary of primary and supporting markup languages<br />

Conformance conditions and interoperability guidelines<br />

Constraints within the Open Packaging Conventions that apply to each document type<br />

Part 2. Open Packaging Conventions<br />

The Open Packaging Conventions (OPC), for the package model and physical package, is defined and used by<br />

various document types in various applications from multiple vendors.



It defines core properties, thumbnails, digital signatures, and authorizations and encryption capabilities for<br />

parts or all the contents in the package.<br />

<strong>XML</strong> schemas for the OPC are declared as <strong>XML</strong> Schema Definitions (XSD) and (non-normatively) using<br />

RELAX NG (ISO/IEC 19757-2)<br />

Part 3. Primer<br />

Informative (non-normative) introduction to WordprocessingML, SpreadsheetML, PresentationML,<br />

DrawingML, VML and Shared MLs, providing context and illustrating elements through examples and<br />

diagrams<br />

Describes the custom <strong>XML</strong> data storing facility within a package to support integration with business data<br />

Part 4. <strong>Markup</strong> <strong>Language</strong> Reference<br />

Contains the reference material for WordprocessingML, SpreadsheetML, PresentationML, DrawingML,<br />

Shared MLs and Custom <strong>XML</strong> Schema, defining every element and attribute including the element hierarchy<br />

(parent/child relationships)<br />

<strong>XML</strong> schemas for the markup languages are declared as XSD and (non-normatively) using RELAX NG<br />

Defines the custom <strong>XML</strong> data storing facility<br />

Part 5. <strong>Markup</strong> Compatibility and Extensibility<br />

Describes extension facilities of Open<strong>XML</strong> documents and specifies elements and attributes by which<br />

applications with different extensions can interoperate<br />

ISO/IEC 29500:2008<br />

The ISO/IEC standard is structured into four parts. [26] Parts 1, 2 and 3 are independent standards; for example Part 2,<br />

specifying Open Packaging Conventions, is used by other file formats including XPS and Design Web Format. Part<br />

4 is to be read as a modification to Part 1, on which it depends.<br />

A technically equivalent set of texts is also published by Ecma as ECMA-376 2nd edition (2008).<br />

Part 1 (Fundamentals and <strong>Markup</strong> <strong>Language</strong> Reference)<br />

This part has 5560 pages. It contains:<br />

• Conformance definitions<br />

• Reference material for the <strong>XML</strong> document markup languages defined by the Standard<br />

• <strong>XML</strong> schemas for the document markup languages declared using XSD and (non-normatively) RELAX NG<br />

• Defines the foreign markup facilities<br />

Part 2 (Open Packaging Conventions)<br />

This part has 129 pages. It contains:<br />

• A description of the Open Packaging Conventions (package model, physical package)<br />

• Core properties, thumbnails and digital signatures<br />

• <strong>XML</strong> schemas for the OPC, declared using XSD and (non-normatively) RELAX NG<br />

Part 3 (<strong>Markup</strong> Compatibility and Extensibility)<br />

This part has 40 pages. It contains:<br />

• A description of extensions: elements and attributes which define mechanisms allowing applications to specify<br />

alternative means of negotiating content<br />

• Extensibility rules are expressed using NVDL<br />

Part 4 (Transitional Migration Features)<br />

This part has 1464 pages. It contains:



• Legacy material such as compatibility settings and the graphics markup language VML<br />

• A list of syntactic differences between this text and ECMA-376 1st edition<br />

The standard specifies two levels of document and application conformance, strict and transitional, for each of<br />

WordprocessingML, PresentationML and SpreadsheetML. The standard also specifies application descriptions of<br />

base and full.<br />

Compatibility between versions<br />

The intent of the changes from ECMA-376 1st edition to ISO/IEC 29500:2008 was that a valid ECMA-376<br />

document would be a valid ISO 29500 "transitional" document [27] , but at least one change introduced at the BRM<br />

(refusing to allow further values for xsd:boolean) had the effect of breaking backwards compatibility for most<br />

documents. [28] A fix for this has been suggested to ISO/IEC JTC1/SC34/WG4, and was approved in June 2009 to go<br />

forward as a recommendation for the first amendment to Office Open <strong>XML</strong>. [29]<br />
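To see why tightening a type to xsd:boolean can break existing documents, consider a strict reader. W3C XML Schema gives xsd:boolean exactly four lexical values: "true", "false", "1" and "0". The sketch below accepts only those; the rejected value "on" is a hypothetical example of a "further value" chosen for illustration, and the function name is likewise hypothetical.

```python
# The four lexical values of xsd:boolean defined by W3C XML Schema.
XSD_BOOLEAN_LEXICAL = {"true": True, "false": False, "1": True, "0": False}


def parse_xsd_boolean(value):
    """Parse an xsd:boolean lexical value; raise ValueError for anything else."""
    try:
        return XSD_BOOLEAN_LEXICAL[value]
    except KeyError:
        raise ValueError(f"not an xsd:boolean value: {value!r}") from None


if __name__ == "__main__":
    print(parse_xsd_boolean("true"))
    print(parse_xsd_boolean("0"))
    try:
        # Hypothetical attribute value outside the xsd:boolean value space.
        parse_xsd_boolean("on")
    except ValueError as exc:
        print("rejected:", exc)
```

A document that stores any other spelling in such an attribute stops validating once the schema type is narrowed, even though nothing in the document itself changed.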

File formats<br />

The Office Open <strong>XML</strong> file formats are a set of file formats that can be used to represent electronic office documents.<br />

The format defines a set of <strong>XML</strong> markup vocabularies for word processing documents, spreadsheets and<br />

presentations as well as specific <strong>XML</strong> markup vocabularies for material such as mathematical formulae, graphics,<br />

bibliographies etc. The stated goal of the Office Open <strong>XML</strong> standard is to be capable of faithfully representing the<br />

pre-existing corpus of word-processing documents, spreadsheets and presentations that had been produced by the<br />

Microsoft Office applications and to facilitate extensibility and interoperability by enabling implementations by<br />

multiple vendors and on multiple platforms.<br />

An Office Open <strong>XML</strong> file is a ZIP-compatible OPC package containing <strong>XML</strong> documents and other resources. That<br />

is, one can see the insides of a .xlsm file, for example, by renaming it as a .zip file. Then, the file can be opened by any<br />

zip tool and the actual .xml files contained therein can be viewed in a web browser or a plain text editor.<br />
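The same inspection can be done programmatically, without renaming the file. The sketch below builds a tiny ZIP archive in memory as a stand-in for a package and then lists its parts; the part names and their contents are placeholders for illustration and do not form a valid Office document. The function names are hypothetical.

```python
import io
import zipfile


def make_package():
    """Build a tiny in-memory ZIP archive standing in for a package."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        # Placeholder parts; a real package holds many more, with real content.
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("xl/workbook.xml", "<workbook/>")
    return buffer.getvalue()


def list_parts(data):
    """Return the names of the parts stored in a ZIP-based package."""
    with zipfile.ZipFile(io.BytesIO(data)) as package:
        return package.namelist()


if __name__ == "__main__":
    for name in list_parts(make_package()):
        print(name)
```

For a file on disk, passing its path to `zipfile.ZipFile` works the same way regardless of the filename extension.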

Adoption<br />

Several countries have formally announced either adoption or evaluation of Office Open <strong>XML</strong>,<br />

while others have rejected it completely. In some cases the Office Open <strong>XML</strong> standard has a national standard<br />

identifier; in some cases the standard is permitted to be used where national regulation says that<br />

non-proprietary formats must be used; in other cases a government body has decided that<br />

Office Open <strong>XML</strong> will be used in some specific context; and in still other cases a government body has decided<br />

that it will not use Office Open <strong>XML</strong> at all.<br />

Belgium<br />

Belgium's Federal Public Service for Information and Communication Technology was evaluating the adoption of the Office Open <strong>XML</strong> format in 2006. It confirmed at that time that it would consider all ISO standards to be open standards, mentioning Office Open <strong>XML</strong> as a possible future ISO standard. [30]<br />

Denmark<br />

In June 2007, the Danish Ministry of Science, Technology and Innovation recommended that, beginning January 1, 2008, public authorities must support at least one of the two word processing document formats, Office Open <strong>XML</strong> or Open Document Format, in all new IT solutions, where appropriate. [31]<br />

Germany<br />

In Germany the Office Open <strong>XML</strong> standard is currently under observation by the governmental office for standards in public IT, the "Koordinierungs- und Beratungsstelle der Bundesregierung für Informationstechnik in der Bundesverwaltung" (KBSt). The latest release of "SAGA" (Standards and Architectures for E-Government-Applications) includes Office Open <strong>XML</strong> file formats. The standard may be used to exchange complex documents when further processing is required. [32]<br />

Japan<br />

On June 29, 2007, the government of Japan published a new interoperability framework which gives preference to the procurement of products that follow open standards. [33] [34] On July 2 the government declared its view that formats such as Office Open <strong>XML</strong>, which organizations such as Ecma International and ISO had also approved, are open standards. It also said that whether a format is open is one of the criteria for choosing which software the government deploys.<br />

Lithuania<br />

The Lithuanian Standards Board has adopted the ISO/IEC 29500:2008 Office Open <strong>XML</strong> format standard as a Lithuanian national standard. The decision was made by Technical Committee 4 Information Technology on March 5, 2009. The proposal to adopt the Office Open <strong>XML</strong> format standard was submitted by the Lithuanian Archives Department under the Government of the Republic of Lithuania. [35]<br />

Norway<br />

Norway's Ministry of Government Administration and Reform is evaluating the adoption of the Office Open <strong>XML</strong> format. The ministry put the document standard under observation in December 2007. [36]<br />

Sweden<br />

The Kingdom of Sweden has adopted Office Open <strong>XML</strong> as a four-part Swedish National Standard, SS-ISO/IEC 29500:2009. [37] [38] [39] [40]<br />

Switzerland<br />

In July 2007, the Swiss Federal Council announced that adherence to the SAGA.ch e-Government standards is mandatory<br />

for its departments as well as for cantons, cities and municipalities. The latest version of SAGA.ch includes<br />

Office Open <strong>XML</strong> file formats. [41]<br />

United Kingdom<br />

The UK has put out an action plan for use of open standards, which includes ISO/IEC 29500 as one of several<br />

formats to be supported. [42] [43]<br />

United States of America<br />

On April 15, 2009, the ANSI-accredited INCITS organisation voted to adopt ISO/IEC 29500:2008 as an<br />

American National Standard. [44]<br />

The state of Massachusetts has been examining its options for implementing <strong>XML</strong>-based document<br />

processing. In early 2005, Eric Kriss, Secretary of Administration and Finance in Massachusetts, was the first<br />

government official in the United States to publicly connect open formats to a public policy purpose: "It is an<br />

overriding imperative of the American democratic system that we cannot have our public documents locked up<br />

in some kind of proprietary format, perhaps unreadable in the future, or subject to a proprietary system license<br />

that restricts access". [45] Since 2007 Massachusetts has classified Office Open <strong>XML</strong> as "Open Format" and has<br />

amended [46] its approved technical standards list — the Enterprise Technical Reference Model (ETRM) — to<br />

include Office Open <strong>XML</strong>. Massachusetts, under heavy pressure from some vendors, now formally endorses<br />

Office Open <strong>XML</strong> formats for its public records. [47]



Application support<br />

Starting with Microsoft Office 2007, the Office Open <strong>XML</strong> file formats (ECMA-376) have become the default [5] file<br />

format of Microsoft Office. [6] [7] However, due to the changes introduced in a later version, Office 2007 is not<br />

entirely in compliance with ISO/IEC 29500:2008. [48] [49] [50] [51] Microsoft Office 2010 includes support for the<br />

ISO/IEC 29500:2008 compliant version of Office Open <strong>XML</strong>. [49] Office 2010 does not yet support saving<br />

documents that conform to the strict schema of the ISO/IEC 29500:2008 specification, but saves documents that conform to the<br />

transitional schema of the ISO/IEC 29500:2008 specification. [52] [53] The intent of the ISO/IEC is to allow the<br />

removal of the transitional variant from the ISO/IEC 29500 standard. [53]<br />

The SoftMaker Office 2010 Suite claims to be able to reliably read and write .DOCX and .XLSX files in its word<br />

processor and spreadsheet applications.<br />

The OpenOffice.org office suite has been able to import Office Open <strong>XML</strong> files (.docx, .xlsx, .pptx, etc.) since<br />

version 3. [54]<br />

The KOffice office suite has been able to import Office Open <strong>XML</strong> files since version 2.2.<br />

Other mainstream Office products that have started to offer import support for the Office Open <strong>XML</strong> formats are<br />

Apple's TextEdit (included with Mac OS X) and iWork, IBM Lotus Notes, Corel WordPerfect, Kingsoft Office and<br />

Google Apps.<br />

Controversies<br />

The ISO standardization of Office Open <strong>XML</strong> was controversial and embittered. According to InfoWorld:<br />

OO<strong>XML</strong> was opposed by many on grounds it was unneeded, as software makers could use<br />

OpenDocument Format (ODF), a less complicated office software format that was already an<br />

international standard. [55]<br />

The same InfoWorld article reported that IBM (which supports the ODF format) threatened to leave standards bodies<br />

that it said allow dominant corporations like Microsoft to wield undue influence. Microsoft was accused of co-opting<br />

the standardization process by leaning on countries to ensure that it got enough votes at the ISO for Office Open<br />

<strong>XML</strong> to pass. [56]<br />

Richard Stallman of the Free Software Foundation has stated that "Microsoft offers a gratis patent license for<br />

OO<strong>XML</strong> on terms which do not allow free implementations." [57]<br />

See also<br />

• List of document markup languages<br />

• Comparison of document markup languages<br />

• Open Document Format<br />

External links<br />

• ECMA-376 site [2]<br />

• ISO/IEC 29500:2008 [3]<br />

• Open<strong>XML</strong>Developer.org [58] , Microsoft's site for developers<br />

• Open <strong>XML</strong> Community site [59] Microsoft's site for customers and partners<br />

• "The WordprocessingML Vocabulary", sample chapter from O'Reilly book Office 2003 <strong>XML</strong> [60] PDF (1.22 MB)<br />

• OpenOffice.org [61] , How do I open Microsoft Office 2007 files? Article by OpenOffice.org<br />

• Information technology -- Office Open <strong>XML</strong> file formats [62] , ISO Standards, JTC 1 Information technology, SC<br />

34<br />

• FAQs on ISO/IEC 29500 [63] , ISO's FAQ site on ISO/IEC 29500



• DOCX reference document [64] , contains a file with fairly complex formatting and can be used to quickly check<br />

compatibility of an implementation<br />

• Open<strong>XML</strong> site [65] , contains resources, articles and tools for Office Open <strong>XML</strong><br />

• Interoperability study [66] showing an indication of the percentage of support for Office Open <strong>XML</strong> by several<br />

different office suite implementations in August 2008<br />

References<br />

[1] Microsoft. "Register file extensions on third party servers" (http://technet.microsoft.com/en-us/library/cc179224.aspx). microsoft.com. Retrieved 2009-09-04.<br />

[2] http://www.ecma-international.org/publications/standards/Ecma-376.htm<br />

[3] http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_tc_browse.htm?commid=45374<br />

[4] "Q&A: Microsoft Co-Sponsors Submission of Office Open <strong>XML</strong> Document Formats to Ecma International for Standardization" (https://www.microsoft.com/presspass/features/2005/nov05/11-21Ecma.mspx). Microsoft. 2005-11-21.<br />

[5] "Microsoft Expands List of Formats Supported in Microsoft Office" (http://www.microsoft.com/Presspass/press/2008/may08/05-21ExpandedFormatsPR.mspx?rss_fdn=Press Releases). Microsoft. Retrieved 2008-05-21.<br />

[6] "Microsoft's future lies somewhere beyond the Vista by Evansville Courier & Press" (http://www.courierpress.com/news/2008/oct/24/microsofts-future-lies-somewhere-beyond-the/). Courierpress.com. Retrieved 2009-05-19.<br />

[7] "Rivals Set Their Sights on Microsoft Office: Can They Topple the Giant? - Knowledge@Wharton" (http://knowledge.wharton.upenn.edu/article.cfm?articleid=1795). Knowledge.wharton.upenn.edu. Retrieved 2009-05-19.<br />

[8] ISO OO<strong>XML</strong> convener: Microsoft's format "heading for failure" (http://arstechnica.com/microsoft/news/2010/04/iso-ooxml-convener-microsofts-format-heading-for-failure.ars)<br />

[9] Brian Jones (2007-01-25). "History of office <strong>XML</strong> formats (1998–2006)" (http://blogs.msdn.com/brian_jones/archive/2007/01/25/office-xml-formats-1998-2006.aspx). MSDN blogs.<br />

[10] "Microsoft Co-Sponsors Submission of Office Open <strong>XML</strong> Document Formats to Ecma International for Standardization" (http://www.microsoft.com/presspass/features/2005/nov05/11-21Ecma.mspx). Microsoft. 2005-11-21.<br />

[11] "Ecma International approves Office Open <strong>XML</strong> standard" (http://www.ecma-international.org/news/PressReleases/PR_TC45_Dec2006.htm). Ecma International. 2006-12-07.<br />

[12] "ISO/IEC DIS 29500 receives necessary votes for approval as an International Standard" (http://www.iso.org/iso/pressrelease.htm?refid=Ref1123). ISO. 2008-04-02.<br />

[13] ISO/IEC (2008-11-18). "Publication of ISO/IEC 29500:2008, Information technology — Office Open <strong>XML</strong> formats" (http://www.iso.org/iso/pressrelease.htm?refid=Ref1181). ISO. Retrieved 2008-11-19.<br />

[14] "Freely Available Standards" (http://standards.iso.org/ittf/PubliclyAvailableStandards/index.html). ITTF (ISO/IEC). 2008-11-18.<br />

[15] "Standard ECMA-376" (http://www.ecma-international.org/publications/standards/Ecma-376.htm). Ecma-international.org. Retrieved 2009-05-19.<br />

[16] "Code of Conduct in Patent Matters" (http://www.ecma-international.org/memento/codeofconduct.htm). Ecma International.<br />

[17] "ISO/IEC/ITU common patent policy" (http://isotc.iso.org/livelink/livelink/fetch/2000/2122/3770791/Common_Policy.htm).<br />

[18] "Microsoft Covenant Regarding Office 2003 <strong>XML</strong> Reference Schemas" (http://www.microsoft.com/office/xml/covenant.mspx). Microsoft. Retrieved 2006-07-11.<br />

[19] "2 Escape Hatches in MS's Covenant Not to Sue" (http://www.groklaw.net/articlebasic.php?story=20051202135844482). Groklaw. Retrieved 2007-01-29.<br />

[20] Berlind, David (November 28, 2005). "Top open source lawyer blesses new terms on Microsoft's <strong>XML</strong> file format" (http://blogs.zdnet.com/BTL/?p=2192). ZDNet. Retrieved 2007-01-27.<br />

[21] "Microsoft Open Specification Promise" (http://www.microsoft.com/interop/osp/default.mspx). Microsoft. 2006-09-12. Retrieved 2007-04-22.<br />

[22] http://www.ecma-international.org/publications/index.html. Ecma International. "Ecma Standards and Technical Reports are made available to all interested persons or organizations, free of charge and licensing restrictions"

[23] "Microsoft Open Specification Promise" (http://www.microsoft.com/Interop/osp/default.mspx). Microsoft.com. .<br />

[24] "Licensing conditions that Microsoft offers for Office Open <strong>XML</strong>" (http://www.jtc1sc34.org/repository/0810c.htm). Jtc1sc34.org.<br />

2006-12-20. . Retrieved 2009-05-19.<br />

[25] "Microsoft Word — Responses to Comments and Perceived Contradictions.doc" (http://www.ecma-international.org/news/<br />

TC45_current_work/Ecma responses.pdf) (PDF). . Retrieved 2009-09-16.<br />

[26] "ISO (You searched for "29500" in title and abstract" (http://www.iso.org/iso/search.htm?qt=29500&published=on&<br />

active_tab=standards). International Organization for Standardization. 2009-06-05. .<br />

[27] "Re-introducing on/off-values to ST-OnOff in OO<strong>XML</strong> Part 4" (http://idippedut.dk/post/2009/06/23/<br />

Re-introducing-onoff-values-to-ST-OnOff-in-OO<strong>XML</strong>-Part-4.aspx). . Retrieved 2009-09-29.<br />

[28] "OO<strong>XML</strong> and Office 2007 Conformance: a Smoke Test" (http://www.adjb.net/post/<br />

OO<strong>XML</strong>-and-Office-2007-Conformance-a-Smoke-Test.aspx). . Retrieved 2009-09-29.



[29] "Minutes of the Copenhagen Meeting of ISO/IEC JTC1/SC34/WG4" (http://www.itscj.ipsj.or.jp/sc34/open/1239.pdf). 2009-06-22. .<br />

Retrieved 2009-09-29. page 15<br />

[30] "FED13321-docsPeterStrickx.indd" (http://www.fedict.belgium.be/nl/binaries/Open_Standaarden_NL_V1_tcm167-16667.pdf) (PDF).<br />

. Retrieved 2009-09-16.<br />

[31] "Bilag 8 – Sammenligning af rapporten om "Estimering af omkostningerne ved indførelse af Office Open <strong>XML</strong> (OO<strong>XML</strong>) og Open<br />

Document Format (ODF) i centraladministrationen" i forhold til de spørgsmål, der skal belyses i de økonomiske konsekvensvurderinger, jf.<br />

rapporten om "Anvendelse af åbne standarder i det offentlige"" (http://vtu.dk/nyheder/aktuelle-temaer/2007/aabne-standarder/bilag/<br />

bilag-8.html/). Vtu.dk. . Retrieved 2009-05-19.<br />

[32] "SAGA 4.0" (http://gsb.download.bva.bund.de/KBSt/SAGA/SAGA_v4.0.pdf) (PDF). . Retrieved 2009-09-16.<br />

[33] Gardner, David (2007-07-10). "Office Software Formats Battle Moves To Asia" (http://www.informationweek.com/news/showArticle.<br />

jhtml?articleID=201000546). Information Week. . Retrieved 2007-07-27.<br />

[34] "Interoperability framework for information systems (in Japanese)" (http://www.meti.go.jp/press/20070629014/20070629014.html).<br />

Ministry of Economy, Trade and Industry, Japan. 2007-06-29. . Retrieved 2007-07-27.<br />

[35] "Latest News" (http://www.openxmlcommunity.com/latestnews.aspx). Open <strong>XML</strong> Community. . Retrieved 2009-05-19.<br />

[36] "Referansekatalog for IT-standarder i offentlig sektor" (http://www.regjeringen.no/en/dep/fad/Documents/Rundskriv/2007/<br />

Referansekatalog-for-IT-standarder-i-off.html?id=494951). regjeringen.no. . Retrieved 2009-05-19.<br />

[37] "SS-ISO/IEC 29500-1:2009" (http://www.sis.se/DesktopDefault.aspx?tabName=@DocType_1&Doc_ID=68693&PresID=2&<br />

Desc=SS-ISO/IEC 29500-1:2009). Sis.se. 2009-01-19. . Retrieved 2009-09-16.<br />

[38] "SS-ISO/IEC 29500-2:2009" (http://www.sis.se/DesktopDefault.aspx?tabName=@DocType_1&Doc_ID=68694&PresID=1&<br />

Desc=SS-ISO/IEC 29500-2:2009). Sis.se. . Retrieved 2009-09-16.<br />

[39] "SS-ISO/IEC 29500-3:2009" (http://www.sis.se/DesktopDefault.aspx?tabName=@DocType_1&Doc_ID=68695&PresID=2&<br />

Desc=SS-ISO/IEC 29500-3:2009). Sis.se. . Retrieved 2009-09-16.<br />

[40] "SS-ISO/IEC 29500-4:2009" (http://www.sis.se/DesktopDefault.aspx?tabName=@DocType_1&Doc_ID=68696&PresID=1&<br />

Desc=SS-ISO/IEC 29500-4:2009). Sis.se. . Retrieved 2009-09-16.<br />

[41] "eCH — Downloads | Standards/Normes | eCH-0014 d SAGA.ch" (http://www.ech.ch/index.php?option=com_docman&<br />

task=cat_view&gid=92&lang=en). Ech.ch. . Retrieved 2009-05-19.<br />

[42] "Open Source, Open Standards and Re–Use: Government Action Plan" (http://www.cabinetoffice.gov.uk/government_it/open_source/<br />

action.aspx). UK Government Cabinet Office. 2009-02-24. .<br />

[43] Rick Jelliffe (2009-02-26). "Open standards: the UK gets it, probably" (http://broadcast.oreilly.com/2009/02/<br />

open-standards-the-uk-gets-it.html). .<br />

[44] "INCITS Letter Ballot 3025" (http://ballot.itic.org/itic/archive.taf?function=detail&ballot_id=3025&<br />

_UserReference=9B6726AA59D4BAC249E6E82E). INCITS. 2009-04-15. .<br />

[45] "Informal comments on Open Formats" (http://web.archive.org/web/20061013201242/http://www.mass.gov/eoaf/<br />

open_formats_comments.html). Web.archive.org. . Retrieved 2009-09-16.<br />

[46] http://www.mass.gov/?pageID=itdterminal&L=3&L0=Home&L1=Policies%2c+Standards+%26+Guidance&L2=Drafts+for+<br />

Review&sid=Aitd&b=terminalcontent&f=policies_standards_etrmv4_etrmv4dot0revisions&csid=Aitd<br />

[47] "Cover Pages: Major Revision of Massachusetts Enterprise Technical Reference Model (ETRM)" (http://xml.coverpages.org/<br />

ni2007-07-03-a.html). Xml.coverpages.org. . Retrieved 2009-05-19.<br />

[48] "OO<strong>XML</strong> Implementations: A Community of One" (http://www.odfalliance.org/resources/IssueBriefImplementations.pdf). ODF<br />

Alliance. 2008-02-20. . Retrieved 2009-05-19.<br />

[49] "Microsoft Expands List of Formats Supported in Microsoft Office" (http://www.microsoft.com/Presspass/press/2008/may08/<br />

05-21ExpandedFormatsPR.mspx). Microsoft.com. 2008-05-21. . Retrieved 2009-05-19.<br />

[50] Lai, Eric (2008-05-27). "FAQ: Office 14 and Microsoft's support for ODF" (http://www.computerworld.com/action/article.do?command=viewArticleBasic&taxonomyName=Protocols+and+Standards&articleId=9089258&taxonomyId=141&pageNumber=1). Computerworld.com. Retrieved 2009-05-19.

[51] Andy Updegrove. "Microsoft Office 2007 to Support ODF — and not OO<strong>XML</strong>" (http://consortiuminfo.org/standardsblog/article.<br />

php?story=20080521092930864). ConsortiumInfo.org. . Retrieved 2009-05-19.<br />

[52]



[59] http://www.openxmlcommunity.org/<br />

[60] http://www.oreilly.com/catalog/officexml/chapter/ch02.pdf<br />

[61] http://wiki.services.openoffice.org/wiki/Documentation/FAQ/General/OpeningMSO2007Files<br />

[62] http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=45515<br />

[63] http://www.iso.org/iso/faqs_isoiec29500<br />

[64] http://katana.oooninja.com/w/reference_sample_documents<br />

[65] http://www.openxml.biz/<br />

[66] http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1201708<br />

Office Open <strong>XML</strong> file formats<br />

Office Open <strong>XML</strong><br />

• Office Open <strong>XML</strong> file formats<br />

• Open Packaging Conventions<br />

• Open Specification Promise<br />

• Vector <strong>Markup</strong> <strong>Language</strong><br />

• Office Open <strong>XML</strong> software<br />

• Comparison of Office Open XML software

• Office Open <strong>XML</strong> standardization<br />

Filename extension: .docx or .docm
Internet media type: application/vnd.openxmlformats-officedocument.wordprocessingml.document [1]
Developed by: Microsoft, Ecma, ISO/IEC
Type of format: Document file format
Extended from: XML, DOC, WordProcessingML
Standard(s): ECMA-376, ISO/IEC 29500
Website: ECMA-376 [2], ISO/IEC 29500:2008 [3]

Filename extension: .pptx or .pptm
Internet media type: application/vnd.openxmlformats-officedocument.presentationml.presentation [1]
Developed by: Microsoft, Ecma, ISO/IEC
Type of format: Presentation
Extended from: XML, PPT
Standard(s): ECMA-376, ISO/IEC 29500
Website: ECMA-376 [2], ISO/IEC 29500:2008 [3]

Filename extension: .xlsx or .xlsm
Internet media type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet [1]
Developed by: Microsoft, Ecma, ISO/IEC
Type of format: Spreadsheet
Extended from: XML, XLS, SpreadsheetML
Standard(s): ECMA-376, ISO/IEC 29500
Website: ECMA-376 [2], ISO/IEC 29500:2008 [3]

The Office Open <strong>XML</strong> file formats are a set of file formats that can be used to represent electronic office<br />

documents. There are formats for word processing documents, spreadsheets and presentations as well as specific<br />

formats for material such as mathematical formulae, graphics, bibliographies etc.<br />

The formats were developed by Microsoft and first appeared in Microsoft Office 2007. They were standardized<br />

between December 2006 and November 2008, first by the Ecma International consortium, where they became



ECMA-376, and subsequently, after a contentious standardization process, by the ISO/IEC's Joint Technical<br />

Committee 1, where they became ISO/IEC 29500:2008.<br />

Container<br />

Office Open XML documents are stored in Open Packaging Conventions (OPC) packages, which are ZIP files containing XML and other data files, along with a specification of the relationships between them. [2] Depending on the type of the document, the packages have different internal directory structures and names. An application uses the relationships files to locate the individual parts (files) of the package; each part has accompanying metadata, in particular a MIME type.

A basic package contains an XML file called [Content_Types].xml at the root, along with three directories: _rels, docProps, and a directory specific to the document type (for example, a .docx word processing package has a word directory). The word directory contains the document.xml file, which holds the core content of the document.
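The layout described above can be sketched with Python's standard zipfile module. This is only an illustration: the part bodies below are empty placeholders rather than valid Office Open XML, and the helper names build_package and top_level_entries are made up for this sketch.

```python
import io
import zipfile

# Placeholder parts for illustration only; the XML bodies are NOT valid
# Office Open XML, they merely give the archive the expected layout.
MINIMAL_PARTS = {
    "[Content_Types].xml": "<Types/>",
    "_rels/.rels": "<Relationships/>",
    "docProps/core.xml": "<coreProperties/>",
    "word/document.xml": "<document/>",
}


def build_package(parts):
    """Write the given {name: text} parts into an in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        for name, text in parts.items():
            package.writestr(name, text)
    return buffer.getvalue()


def top_level_entries(package_bytes):
    """Return the sorted root-level files and directories of a package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return sorted({name.split("/", 1)[0] for name in package.namelist()})


package_bytes = build_package(MINIMAL_PARTS)
print(top_level_entries(package_bytes))
```

The same top_level_entries call can be pointed at the bytes of a real .docx file to compare its layout with the one described here.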

Figure: Container structure of Part 2 of the Ecma Office Open XML standard, ECMA-376

[Content_Types].xml

This file provides MIME type information for parts of the package, using defaults for certain file extensions and overrides for parts specified by IRI.

_rels

This directory contains relationships for the files within the package. To find the relationships for a specific file, look for the _rels directory that is a sibling of the file, and then for a file that has the original file name with .rels appended to it. For example, if the content types file had any relationships, there would be a file called [Content_Types].xml.rels inside the _rels directory.

_rels/.rels

This file is where the package relationships are located. Applications look here first. Viewed in a text editor, it lists each relationship for the package. In a minimal document containing only the basic document.xml file, the relationships detailed are metadata and document.xml.

docProps/core.xml<br />

This file contains the core properties for any Office Open <strong>XML</strong> document.<br />

word/document.xml<br />

This file is the main part for any Word document.<br />
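The default-and-override lookup performed through [Content_Types].xml can be sketched as follows. The sample part is illustrative and its entries are assumptions; the namespace and the Default and Override element names follow the published Open Packaging Conventions, and content_type_for is a hypothetical helper name. An override for an exact part name takes precedence over a default keyed on the file extension.

```python
import xml.etree.ElementTree as ET

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"

# Illustrative content-types part; the listed entries are assumptions.
CONTENT_TYPES_XML = f"""<Types xmlns="{CT_NS}">
  <Default Extension="rels"
      ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>
  <Override PartName="/word/document.xml"
      ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
</Types>"""


def content_type_for(part_name, content_types_xml):
    """Return the MIME type of a part: overrides first, then extension defaults."""
    root = ET.fromstring(content_types_xml)
    for override in root.findall(f"{{{CT_NS}}}Override"):
        if override.get("PartName") == part_name:
            return override.get("ContentType")
    extension = part_name.rsplit(".", 1)[-1].lower()
    for default in root.findall(f"{{{CT_NS}}}Default"):
        if default.get("Extension", "").lower() == extension:
            return default.get("ContentType")
    return None  # no declared type for this part


print(content_type_for("/word/document.xml", CONTENT_TYPES_XML))
print(content_type_for("/docProps/core.xml", CONTENT_TYPES_XML))
```

The sketch compares part names exactly; a fuller implementation would also have to follow the case-handling rules of the packaging specification.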

Relationships<br />

An example relationship file (word/_rels/document.xml.rels), is:<br />

<br />

<br />



Target="http://en.wikipedia.org/images/wiki-en.png"<br />

TargetMode="External" /><br />

<br />

<br />

As such, images referenced in the document can be found in the relationship file by looking for all relationships that<br />

are of type http://schemas.microsoft.com/office/2006/relationships/image. To change the used image, edit the<br />

relationship.<br />
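A minimal sketch of that lookup, using Python's standard library, is shown here. The relationships part in the sketch is illustrative: the IDs and the hyperlink target are assumptions. Note also that the published ECMA-376 packages use type URIs under http://schemas.openxmlformats.org/officeDocument/2006/relationships/ rather than the microsoft.com URI quoted above, so the sketch matches on the trailing /image segment to accept either form.

```python
import xml.etree.ElementTree as ET

RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
REL_TYPES = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"

# Illustrative relationships part; IDs and the hyperlink target are assumptions.
RELS_XML = f"""<Relationships xmlns="{RELS_NS}">
  <Relationship Id="rId1" Type="{REL_TYPES}/image"
      Target="http://en.wikipedia.org/images/wiki-en.png" TargetMode="External"/>
  <Relationship Id="rId2" Type="{REL_TYPES}/hyperlink"
      Target="http://www.wikipedia.org" TargetMode="External"/>
</Relationships>"""


def image_relationships(root):
    """Return the Relationship elements whose type URI ends in /image."""
    return [rel for rel in root.iter(f"{{{RELS_NS}}}Relationship")
            if rel.get("Type", "").endswith("/image")]


root = ET.fromstring(RELS_XML)
images = image_relationships(root)
print([(rel.get("Id"), rel.get("Target")) for rel in images])

# Changing the image amounts to editing the relationship's Target.
# Here it is pointed at a (hypothetical) part inside the package instead.
images[0].set("Target", "media/image1.png")
del images[0].attrib["TargetMode"]  # internal targets need no TargetMode
```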

The following code shows an example of inline markup for a hyperlink:<br />

<br />

In this example, the Uniform Resource Locator (URL) is represented by "rId2". The actual URL is in the<br />

accompanying relationships file, located by the corresponding "rId2" item. Linked images, templates, and other<br />

items are referenced in the same way.<br />
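The indirection through "rId2" can be sketched in Python as follows. Both XML snippets are illustrative assumptions (in particular the link text and the target URL); the w:hyperlink element, its r:id attribute and the namespaces are those of the published WordprocessingML and packaging schemas, and hyperlink_targets is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"

# Illustrative main document part: the hyperlink carries only an ID.
DOCUMENT_XML = f"""<w:document xmlns:w="{W_NS}" xmlns:r="{R_NS}">
  <w:body><w:p>
    <w:hyperlink r:id="rId2"><w:r><w:t>Wikipedia</w:t></w:r></w:hyperlink>
  </w:p></w:body>
</w:document>"""

# Illustrative relationships part: the ID is mapped to the actual URL.
RELS_XML = f"""<Relationships xmlns="{RELS_NS}">
  <Relationship Id="rId2" Type="{R_NS}/hyperlink"
      Target="http://www.wikipedia.org" TargetMode="External"/>
</Relationships>"""


def hyperlink_targets(document_xml, rels_xml):
    """Return (link text, URL) pairs by resolving each r:id via the rels part."""
    targets = {rel.get("Id"): rel.get("Target")
               for rel in ET.fromstring(rels_xml).iter(f"{{{RELS_NS}}}Relationship")}
    links = []
    for link in ET.fromstring(document_xml).iter(f"{{{W_NS}}}hyperlink"):
        rel_id = link.get(f"{{{R_NS}}}id")
        text = "".join(t.text or "" for t in link.iter(f"{{{W_NS}}}t"))
        links.append((text, targets.get(rel_id)))
    return links


print(hyperlink_targets(DOCUMENT_XML, RELS_XML))
```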

Pictures can be embedded or linked using a tag:<br />

<br />

This is the reference to the image file. All references are managed via relationships. For example, document.xml has a relationship to the image. There is a _rels directory in the same directory as document.xml; inside _rels is a file called document.xml.rels. This file contains a relationship definition with a type, an ID and a location. The ID is the identifier referenced in the XML document. The type is a reference schema definition for the media type, and the location is either an internal location within the ZIP package or an external location defined with a URL.
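The naming rule for relationships files and the two kinds of location can be sketched with Python's posixpath module. The function names rels_part_for and resolve_target are hypothetical, and the example targets are made up; the sketch assumes internal targets are written relative to the directory of the part that owns the relationship.

```python
import posixpath


def rels_part_for(part_name):
    """Return the relationships part for a part: sibling _rels dir + '.rels'."""
    directory, filename = posixpath.split(part_name)
    return posixpath.join(directory, "_rels", filename + ".rels")


def resolve_target(source_part, target, target_mode="Internal"):
    """Resolve a relationship target to a URL or to a part name in the package."""
    if target_mode == "External":
        return target  # external locations are used as given
    base = posixpath.dirname(source_part)
    # Relative targets are resolved against the source part's directory;
    # a leading slash denotes the package root.
    return posixpath.normpath(posixpath.join(base, target)).lstrip("/")


print(rels_part_for("word/document.xml"))
print(rels_part_for("[Content_Types].xml"))
print(resolve_target("word/document.xml", "media/image1.png"))
print(resolve_target("word/document.xml",
                     "http://en.wikipedia.org/images/wiki-en.png", "External"))
```

Passing an empty part name to rels_part_for yields _rels/.rels, the package-level relationships file.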

Document properties<br />

Office Open <strong>XML</strong> uses the Dublin Core Metadata Element Set and DCMI Metadata Terms to store document<br />

properties. Dublin Core is a standard for cross-domain information resource description and is defined in ISO<br />

15836:2003 [3] .<br />

An example document properties file (docProps/core.xml) that uses Dublin Core metadata, is:<br />

<br />



2008-06-19T20:00:00Z<br />

2008-06-19T20:42:00Z<br />

Document file format<br />

Final<br />
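Reading such a properties part can be sketched in Python as follows. The two timestamps and the values "Document file format" and "Final" are taken from the example above, but which element each value belongs to is an assumption of this sketch, as is the title. The namespaces are those of the Dublin Core vocabularies and of the published core-properties schema; read_core_properties and parse_w3cdtf are hypothetical helper names.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

CP_NS = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
DC_NS = "http://purl.org/dc/elements/1.1/"
DCTERMS_NS = "http://purl.org/dc/terms/"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Illustrative core properties part; the element-to-value mapping is assumed.
CORE_XML = f"""<cp:coreProperties xmlns:cp="{CP_NS}" xmlns:dc="{DC_NS}"
    xmlns:dcterms="{DCTERMS_NS}" xmlns:xsi="{XSI_NS}">
  <dc:title>Example title</dc:title>
  <dcterms:created xsi:type="dcterms:W3CDTF">2008-06-19T20:00:00Z</dcterms:created>
  <dcterms:modified xsi:type="dcterms:W3CDTF">2008-06-19T20:42:00Z</dcterms:modified>
  <cp:category>Document file format</cp:category>
  <cp:contentStatus>Final</cp:contentStatus>
</cp:coreProperties>"""


def read_core_properties(core_xml):
    """Return {local element name: text} for the children of the root."""
    root = ET.fromstring(core_xml)
    return {child.tag.rsplit("}", 1)[-1]: (child.text or "") for child in root}


def parse_w3cdtf(value):
    """Parse the UTC timestamp form used in the example (second precision)."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


props = read_core_properties(CORE_XML)
elapsed = parse_w3cdtf(props["modified"]) - parse_w3cdtf(props["created"])
print(props["title"], props["contentStatus"], elapsed)
```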

<br />

Document markup languages<br />

An Office Open <strong>XML</strong> file may contain several documents encoded in specialized markup languages corresponding<br />

to applications within the Microsoft Office product line. Office Open <strong>XML</strong> defines multiple vocabularies using 27<br />

namespaces and 89 schema modules.<br />

The primary markup languages are:<br />

• WordprocessingML for word-processing<br />

• SpreadsheetML for spreadsheets<br />

• PresentationML for presentations<br />

Shared markup language materials include:<br />

• Office Math <strong>Markup</strong> <strong>Language</strong> (OMML)<br />

• DrawingML used for vector drawing, charts, and for example, text art (additionally, though deprecated, VML is<br />

supported for drawing)<br />

• Extended properties<br />

• Custom properties<br />

• Variant Types<br />

• Custom <strong>XML</strong> data properties<br />

• Bibliography<br />

In addition to the above markup languages, custom XML schemas can be used to extend Office Open XML.

Design approach<br />

Patrick Durusau, the editor of ODF, has viewed the markup style of OO<strong>XML</strong> and ODF as representing two sides of<br />

a debate: the "element side" and the "attribute side". He notes that OO<strong>XML</strong> represents "the element side of this<br />

approach" and singles out the KeepNext element as an example:<br />

<br />

<br />

…<br />

<br />

In contrast, he notes ODF would use the single attribute fo:keep-next, rather than an element, for the same<br />

semantic. [4]<br />

The <strong>XML</strong> Schema of Office Open <strong>XML</strong> emphasizes reducing load time and improving parsing speed. [5] In a test<br />

with applications current in April 2007, <strong>XML</strong>-based office documents were slower to load than binary formats. [6] To<br />

enhance performance, Office Open <strong>XML</strong> uses very short element names for common elements and spreadsheets save<br />

dates as index numbers (starting from 1899 or from 1904). In order to be systematic and generic, Office Open <strong>XML</strong><br />

typically uses separate child elements for data and metadata (element names ending in Pr for properties) rather than<br />

using multiple attributes, which allows structured properties. Office Open <strong>XML</strong> does not use mixed content but uses<br />

elements to put a series of text runs (element name r) into paragraphs (element name p). The result is terse and<br />

highly nested in contrast to HTML, for example, which is fairly flat, designed for humans to write in text editors and<br />

is more congenial for humans to read.
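The paragraph and run structure can be made concrete with a short Python sketch. The sample document is an assumption written for illustration; the element names p, r, t, pPr, rPr and keepNext and the namespace are those of the published WordprocessingML schema, and paragraphs is a hypothetical helper name. Text never appears directly inside a paragraph: it sits in t elements inside runs, next to property elements whose names end in Pr.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Illustrative document: properties are child elements (pPr, rPr), and the
# text of each paragraph is split across runs.
DOCUMENT_XML = f"""<w:document xmlns:w="{W_NS}">
  <w:body>
    <w:p>
      <w:pPr><w:keepNext/></w:pPr>
      <w:r><w:t xml:space="preserve">Hello, </w:t></w:r>
      <w:r><w:rPr><w:b/></w:rPr><w:t>world</w:t></w:r>
    </w:p>
    <w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>
  </w:body>
</w:document>"""


def paragraphs(document_xml):
    """Return the plain text of each paragraph by joining its runs' text."""
    root = ET.fromstring(document_xml)
    result = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        runs = paragraph.findall(f"{{{W_NS}}}r")
        result.append("".join(t.text or ""
                              for run in runs
                              for t in run.findall(f"{{{W_NS}}}t")))
    return result


print(paragraphs(DOCUMENT_XML))
```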



The naming of elements and attributes within the text has attracted some criticism. There are three different

syntaxes in OO<strong>XML</strong> (ECMA-376) for specifying the color and alignment of text depending on whether the<br />

document is a text, spreadsheet, or presentation. Rob Weir (an IBM employee and co-chair of the OASIS<br />

OpenDocument Format TC) asks "What is the engineering justification for this horror?". He contrasts with<br />

OpenDocument: "ODF uses the W3C's XSL-FO vocabulary for text styling, and uses this vocabulary<br />

consistently". [7]<br />

Some have argued the design is based too closely on Microsoft applications. In August 2007, the Linux Foundation<br />

published a blog post calling upon ISO National Bodies to vote "No, with comments" during the International<br />

Standardization of OO<strong>XML</strong>. It said, "OO<strong>XML</strong> is a direct port of a single vendor's binary document formats. It<br />

avoids the re-use of relevant existing international standards (e.g. several cryptographic algorithms, VML, etc.).<br />

There are literally hundreds of technical flaws that should be addressed before standardizing OO<strong>XML</strong> including<br />

continued use of binary code tied to platform specific features, propagating bugs in MS-Office into the standard,<br />

proprietary units, references to proprietary/confidential tags, unclear IP and patent rights, and much more". [8]<br />

The version of the standard submitted to JTC 1 was 6546 pages long. The need and appropriateness of such length<br />

has been questioned. [9] [10] Google stated that "the ODF standard, which achieves the same goal, is only 867<br />

pages" [9]<br />

WordprocessingML (WML)<br />

Word processing documents use the <strong>XML</strong> vocabulary known as WordprocessingML normatively defined by the<br />

schema wml.xsd which accompanies the standard. This vocabulary is defined in clause 11 of Part 1. [11]<br />

SpreadsheetML (SML)<br />

Spreadsheet documents use the <strong>XML</strong> vocabulary known as SpreadsheetML normatively defined by the schema<br />

sml.xsd which accompanies the standard. This vocabulary is described in clause 12 of Part 1. [11]<br />

Each worksheet in a spreadsheet is represented by an XML document with a root element named <worksheet> in the http://schemas.openxmlformats.org/spreadsheetml/2006/main namespace.
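A worksheet part can be recognised by the qualified name of its root element, worksheet in the SpreadsheetML namespace. The Python sketch below checks that name before reading cell values; the sample sheet, its cell values and the helper name cell_values are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

SML_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

# Illustrative worksheet part with two numeric cells (values are made up).
SHEET_XML = f"""<worksheet xmlns="{SML_NS}">
  <sheetData>
    <row r="1"><c r="A1"><v>39448</v></c><c r="B1"><v>3.5</v></c></row>
  </sheetData>
</worksheet>"""


def cell_values(sheet_xml):
    """Return {cell reference: raw value text} after checking the root name."""
    root = ET.fromstring(sheet_xml)
    if root.tag != f"{{{SML_NS}}}worksheet":
        raise ValueError(f"not a SpreadsheetML worksheet: {root.tag}")
    return {cell.get("r"): cell.findtext(f"{{{SML_NS}}}v")
            for cell in root.iter(f"{{{SML_NS}}}c")}


print(cell_values(SHEET_XML))
```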

The representation of date and time values in SpreadsheetML has attracted some criticism. ECMA-376 1st edition does not conform to ISO 8601:2004 "Representation of Dates and Times". It requires that implementations replicate a Lotus 1-2-3 [12] bug that treats 1900 as a leap year, which it was not. Products complying with ECMA-376 would be required to implement the WEEKDAY() spreadsheet function in this bug-compatible way, and would therefore assign incorrect days of the week to some dates and miscalculate the number of days between certain dates. [13] ECMA-376 2nd edition (ISO/IEC 29500) allows the use of ISO 8601:2004 "Representation of Dates and Times" in addition to the Lotus 1-2-3 bug-compatible form. [14] [15]
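The effect of the leap-year bug on date serial numbers can be sketched in Python. The sketch assumes the conventional numbering in which serial 1 is 1 January 1900 in the 1900-based system and serial 0 is 1 January 1904 in the 1904-based system; the function names are hypothetical. Because the 1900-based system counts a 29 February 1900 that never existed, serial 60 has no real calendar date and every later serial is shifted by one day.

```python
from datetime import date, timedelta
from typing import Optional


def serial_to_date_1900(serial: int) -> Optional[date]:
    """Convert a 1900-based date serial to a real calendar date.

    Returns None for serial 60, the fictitious 29 February 1900.
    """
    if serial < 1:
        raise ValueError("serials in the 1900-based system start at 1")
    if serial == 60:
        return None  # the day that exists only because of the leap-year bug
    if serial < 60:
        return date(1899, 12, 31) + timedelta(days=serial)
    # After the phantom day, serials run one day ahead of the calendar.
    return date(1899, 12, 30) + timedelta(days=serial)


def serial_to_date_1904(serial: int) -> date:
    """Convert a 1904-based date serial (day 0 = 1 January 1904)."""
    if serial < 0:
        raise ValueError("serials in the 1904-based system start at 0")
    return date(1904, 1, 1) + timedelta(days=serial)


print(serial_to_date_1900(59), serial_to_date_1900(60), serial_to_date_1900(61))
print(serial_to_date_1900(39448), serial_to_date_1904(0))
```

A naive day count between serials 59 and 61 gives two days, although 28 February and 1 March 1900 are adjacent; this is the kind of miscalculation the criticism refers to.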

Office MathML (OMML)<br />

Office Math <strong>Markup</strong> <strong>Language</strong> is a mathematical markup language which can be embedded in WordprocessingML,<br />

with intrinsic support for including word processing markup like revision markings, [16] footnotes, comments, images<br />

and elaborate formatting and styles. [17] The OMML format is different from the World Wide Web Consortium<br />

(W3C) MathML recommendation that does not support those office features, but is partially compatible [18] through<br />

XSL Transformations.<br />

The following Office MathML example defines the fraction:<br />

<br />



<br />

π



Foreign resources<br />

Non-<strong>XML</strong> content<br />

OOXML documents are typically composed of other resources in addition to XML content (graphics, video, etc.). Some have criticised the choice of permitted formats for such resources: ECMA-376 1st edition specifies "Embedded Object Alternate Image Requests Types" and "Clipboard Format Types", which refer to Windows Metafiles or Enhanced Metafiles, both of which are proprietary formats with hard-coded dependencies on Windows itself. The critics state that the standard should instead have referenced the platform-neutral standard ISO/IEC 8632 "Computer Graphics Metafile". [13]

Foreign markup<br />

The Standard provides three mechanisms to allow foreign markup to be embedded within content for editing<br />

purposes:<br />

• Smart tags<br />

• Custom <strong>XML</strong> markup<br />

• Structured Document Tags<br />

These are defined in clause 17.5 of Part 1.<br />

Compatibility settings<br />

Versions of Office Open <strong>XML</strong> contain what are termed "compatibility settings". These are contained in Part 4<br />

("<strong>Markup</strong> <strong>Language</strong> Reference") of ECMA-376 1st Edition, but during standardization were moved to become a new<br />

part (also called Part 4) of ISO/IEC 29500:2008 ("Transitional Migration Features").<br />

These settings (including elements with names such as autoSpaceLikeWord95, footnoteLayoutLikeWW8, lineWrapLikeWord6, mwSmallCaps, shapeLayoutLikeWW8, suppressTopSpacingWP, truncateFontHeightsLikeWP6, uiCompat97To2003, useWord2002TableStyleRules, useWord97LineBreakRules, wpJustification and wpSpaceWidth) were the focus of some controversy during the standardisation of DIS 29500. [24] As a result, new text was added to ISO/IEC 29500 to document them. [25]

An article in Free Software Magazine has criticized the markup used for these settings. Office Open <strong>XML</strong> uses<br />

distinctly named elements for each compatibility setting, each of which is declared in the schema. The repertoire of<br />

settings is thus limited — for new compatibility settings to be added, new elements may need to be declared,<br />

"potentially creating thousands of them, each having nothing to do with interoperability". [26]<br />

Extensibility<br />

The standard provides two types of extensibility mechanism, <strong>Markup</strong> Compatibility and Extensibility (MCE) defined<br />

in Part 3 (ISO/IEC 29500-3:2008) and Extension Lists defined in clause 18.2.10 of Part 1.<br />

References<br />

[1] Microsoft. "Register file extensions on third party servers" (http://technet.microsoft.com/en-us/library/cc179224.aspx). microsoft.com. .<br />

Retrieved 2009-09-04.<br />

[2] Tom Ngo (December 11, 2006). "Office Open <strong>XML</strong> Overview" (http://www.ecma-international.org/news/TC45_current_work/<br />

Open<strong>XML</strong> White Paper.pdf) (PDF). Ecma International. p. 6. . Retrieved 2007-01-23.<br />

[3] http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=37629<br />

[4] Patrick Durusau (21 October 2008). "Old Wine In New Skins" (http://www.durusau.net/publications/old_wine.pdf). .<br />

[5] Intellisafe Technologies. "Software Developer uses Office Open <strong>XML</strong> to Minimize File Space, Increase Interoperability" (http://www.<br />

openxmlcommunity.org/documents/casestudies/Intellisafe_Open<strong>XML</strong>_Final.pdf). .



[6] George Ou (2007-04-27). "MS Office 2007 versus Open Office 2.2 shootout" (http://blogs.zdnet.com/Ou/?p=480). ZDnet.com. .<br />

Retrieved 2007-04-27.<br />

[7] Rob Weir (14 March 2008). "Disharmony of OO<strong>XML</strong>" (http://www.robweir.com/blog/2008/03/disharmony-of-ooxml.html). .<br />

[8] John Cherry (14 March 2008). "OO<strong>XML</strong> — vote "No, with comments"" (http://www.linux-foundation.org/weblogs/cherry/2007/08/29/<br />

ooxml-vote-no-with-comments/). .<br />

[9] "Google's Position on OO<strong>XML</strong> as a Proposed ISO Standard" (http://www.odfalliance.org/resources/Google OO<strong>XML</strong> Q A.pdf). Google.<br />

2008-02. . "If ISO were to give OO<strong>XML</strong> with its 6546 pages the same level of review that other standards have seen, it would take 18 years<br />

(6576 days for 6546 pages) to achieve comparable levels of review to the existing ODF standard (871 days for 867 pages) which achieves the<br />

same purpose and is thus a good comparison. Considering that OO<strong>XML</strong> has only received about 5.5% of the review that comparable standards<br />

have undergone, reports about inconsistencies, contradictions and missing information are hardly surprising"<br />

[10] "OO<strong>XML</strong>: What's the big deal?" (http://www.ibm.com/developerworks/library/x-ooxmlstandard.html). IBM. 2008-02-19. .<br />

[11] "ISO/IEC 29500-1:2008" (http://standards.iso.org/ittf/PubliclyAvailableStandards/c051463_ISOIEC 29500-1_2008(E).zip). ISO and<br />

IEC. 2008-09. .<br />

[12] Kyd, Charley (October 2006). "How to Work With Dates Before 1900 in Excel" (http://www.exceluser.com/explore/earlydates.htm).<br />

ExcelUser. . Retrieved 2009-09-16.<br />

[13] "The Contradictory Nature of OO<strong>XML</strong>" (http://www.consortiuminfo.org/standardsblog/article.php?story=20070117145745854).<br />

ConsortiumInfo.org. .<br />

[14] "ECMA-376 2nd edition Part 1 (3. Normative references)" (http://www.ecma-international.org/publications/standards/Ecma-376.htm).<br />

Ecma-international.org. . Retrieved 2009-09-16.<br />

[15] "New set of proposed dispositions posted, including more positive changes to the Ecma Office Open <strong>XML</strong> formats – Dispositions now<br />

proposed for more than half of National Bodies' comments" (http://www.ecma-international.org/news/TC45_current_work/New set of<br />

proposed dispositions posted.htm). Ecma-international.org. 2007-12-11. . Retrieved 2009-09-16.<br />

[16] Jesper Lund Stocholm (2008-01-29). "Do your math — OO<strong>XML</strong> and OMML" (http://idippedut.dk/post/2008/01/<br />

Do-your-math---OO<strong>XML</strong>-and-OMML.aspx). A Mooh Point blog. . Retrieved 2008-02-12.<br />

[17] Murray Sargent (2007-06-05). "Science and Nature have difficulties with Word 2007 mathematics" (http://blogs.msdn.com/murrays/<br />

archive/2007/06/05/science-and-nature-have-difficulties-with-word-2007-mathematics.aspx). MSDN blogs. . Retrieved 2007-07-31.<br />

[18] David Carlisle (2007-05-09). "XHTML and MathML from Office 2007" (http://dpcarlisle.blogspot.com/2007/04/<br />

xhtml-and-mathml-from-office-20007.html). David Carlisle. . Retrieved 2007-09-20.<br />

[19] "Microsoft Office dumped by Science and Nature" (http://www.zdnet.com.au/news/software/soa/<br />

Microsoft-Office-dumped-by-Science-and-Nature/0,130061733,339278690,00.htm). ZDNet Australia. 18 June 2007. .<br />

[20] Wouter Van Vugt (2008-11-01). "Open <strong>XML</strong> Explained e-book" (http://openxmldeveloper.org/articles/1970.aspx).<br />

Openxmldeveloper.org. . Retrieved 2007-09-14.<br />

[21] Rick Jelliffe in Technical (2007-04-16). "Why EMUs? - O'Reilly <strong>XML</strong> Blog" (http://www.oreillynet.com/xml/blog/2007/04/<br />

what_is_an_emu.html). Oreillynet.com. . Retrieved 2009-05-19.<br />

[22] "The X Factor" (http://reddevnews.com/features/article.aspx?editorialsid=2356). reddevnews.com. October 2007. .<br />

[23] "VML — the Vector <strong>Markup</strong> <strong>Language</strong>" (http://www.w3.org/TR/NOTE-VML). W3.org. 1998-05-13. . Retrieved 2009-05-19.<br />

[24] "ODF/OO<strong>XML</strong> technical white paper — A white paper based on a technical comparison between the ODF and OO<strong>XML</strong> formats" (http://<br />

www.freesoftwaremagazine.com/articles/odf_ooxml_technical_white_paper?page=0,9). Free Software Magazine. .<br />

[25] "ECMA-376 2nd edition Part 4 (paragraph 9.7.3)" (http://www.ecma-international.org/publications/standards/Ecma-376.htm).<br />

Ecma-international.org. . Retrieved 2009-09-16.<br />

[26] "ODF/OO<strong>XML</strong> technical white paper — A white paper based on a technical comparison between the ODF and OO<strong>XML</strong> formats" (http://<br />

www.freesoftwaremagazine.com/articles/odf_ooxml_technical_white_paper?page=0,7). Free Software Magazine. . ""... OO<strong>XML</strong> chose this<br />

route. Rather than create an application-definable configuration tag there is a unique tag for each setting ... Currently, the only application's<br />

unique settings that are catered for are the applications that the standard's authors have decided to include, ... For other applications to be<br />

added, further tag names would need to be defined in the specification, potentially creating thousands of them, each having nothing to do with<br />

interoperability .."."



OIO<strong>XML</strong><br />

OIOXML is a project by the Danish government to develop a number of reusable data components that can be serialized in various formats, although XML is currently the only serialization available for OIOXML data. The project was undertaken to ease communication from, to and between Danish government bodies. It is part of the Danish government's transition to what it refers to as eGovernment, in which communication between government bodies, companies and the public should be paper-free. There has been some confusion as to what OIOXML is, because the most prominent OIOXML format, the Danish Efaktura format (a localization of UBL), is itself referred to as OIOXML in many governmental documents. All invoices submitted to a Danish governmental organization are currently required to be in the Efaktura format.

Sources<br />

• The interoperability framework [1]<br />

• OIO - Offentlig Information Online (Public Information Online) - English main page of the site [2]

• Description of OIO<strong>XML</strong> and its reasons [3]<br />

• Reference to the OIO<strong>XML</strong> markup language [4]<br />

• Validator for OIO<strong>XML</strong> [5]<br />

• Examples of OIOXML invoices in comparison with regular invoices (Danish) [6]

References<br />

[1] http://standarder.oio.dk/my-home-your-home/view?set_language=en<br />

[2] http://www.oio.dk/?o=a54bd5e3b9e3e94209f94882ac0c9301<br />

[3] http://isb.oio.dk/Info/Standardization/OIO<strong>XML</strong>%20Classes.htm<br />

[4] http://xmltools.oio.dk/oioonlinevalidator/ehandel/0p71/Invoice/<br />

[5] http://xmltools.oio.dk/oioonlinevalidator/<br />

[6] http://www.oio.dk/dataudveksling/ehandel/eFaktura/eksempler



Open <strong>XML</strong> Paper Specification<br />

Filename extension: .oxps, .xps
Internet media type: application/oxps, application/vnd.ms-xpsdocument
Developed by: Microsoft, Ecma International
Initial release: October 2006
Latest release: First Edition / June 16, 2009
Type of format: Page description language / Document file format
Contained by: Open Packaging Conventions
Extended from: ZIP, XML, XAML
Standard(s): ECMA-388
Website: [1]

The Open XML Paper Specification (also referred to as OpenXPS) is an open specification for a page description language and a fixed-document format. It was originally developed by Microsoft as the XML Paper Specification (XPS) and was later standardized by Ecma International as ECMA-388. It is an XML-based (more precisely, XAML-based) specification, built on a new print path and a color-managed, vector-based document format that supports device independence and resolution independence. OpenXPS was standardized as an open document format on June 16, 2009. [2]

Development of the <strong>XML</strong> Paper Specification<br />

In 2003 Global Graphics was chosen by Microsoft to provide consultancy and proof of concept development<br />

services on XPS and worked with the Windows development teams on the specification and reference architecture<br />

for the new format. [3]<br />

The XPS document format consists of structured <strong>XML</strong> markup that defines the layout of a document and the visual<br />

appearance of each page, along with rendering rules for distributing, archiving, rendering, processing and printing<br />

the documents. Notably, the markup language for XPS is a subset of XAML, allowing it to incorporate<br />

vector-graphic elements in documents, using XAML to mark up the WPF primitives. The elements used are<br />

described in terms of paths and other geometrical primitives.<br />

An XPS file is in fact a ZIP archive using the Open Packaging Conventions, containing the files which make up the<br />

document. These include an <strong>XML</strong> markup file for each page, text, embedded fonts, raster images, 2D vector<br />

graphics, as well as the digital rights management information. The contents of an XPS file can be examined simply<br />

by opening it in an application which supports ZIP files.
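Because an XPS file is an ordinary ZIP container, its parts can also be listed programmatically. The following is a minimal sketch using Python's standard zipfile module. It builds a small stand-in package in memory rather than reading a real document, and the part names (such as Documents/1/Pages/1.fpage) and placeholder bodies are illustrative assumptions, not a complete or valid XPS package.

```python
import io
import zipfile

# Illustrative part names only (assumption): a real package is produced by an
# XPS writer and contains full markup, fonts, images and relationship parts.
DEMO_PARTS = {
    "[Content_Types].xml": "<Types/>",
    "_rels/.rels": "<Relationships/>",
    "FixedDocumentSequence.fdseq": "<FixedDocumentSequence/>",
    "Documents/1/Pages/1.fpage": "<FixedPage/>",
    "Documents/1/Pages/2.fpage": "<FixedPage/>",
}


def build_demo_package() -> bytes:
    """Create a tiny ZIP archive that mimics the layout of an XPS package."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, body in DEMO_PARTS.items():
            archive.writestr(name, body)
    return buffer.getvalue()


def list_page_parts(package: bytes) -> list:
    """Return the names of the per-page markup parts found in the archive."""
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        return sorted(n for n in archive.namelist() if n.endswith(".fpage"))


demo_package = build_demo_package()
page_parts = list_page_parts(demo_package)
print(page_parts)
```

For a file on disk, the same listing function could be given the bytes read from the .xps or .oxps file.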



Features<br />

XPS specifies a set of document layout functionality for paged, printable documents. It also has support for features<br />

such as color gradients, transparencies, CMYK color spaces, printer calibration, multiple-ink systems and print<br />

schemas. XPS supports the Windows Color System color management technology for color conversion precision<br />

across devices and higher dynamic range. It also includes a software raster image processor (RIP) which is<br />

downloadable separately. [4] The print subsystem also has support for named colors, simplifying color definition for<br />

images transmitted to printers supporting those colors.<br />

XPS also supports HD Photo images natively for raster images. [5] The XPS format used in the spool file represents<br />

advanced graphics effects such as 3D images, glow effects, and gradients as Windows Presentation Foundation<br />

primitives, which are processed by the printer drivers without rasterization, preventing rendering artifacts and<br />

reducing computational load.<br />

Similarities with PDF and PostScript<br />

Like Adobe Systems' PDF format, XPS is a fixed-layout document format designed to preserve document
fidelity, [6] providing a device-independent document appearance. PDF is a database of objects, created from

PostScript and also directly generated from many applications, whereas XPS is based on <strong>XML</strong>. The filter pipeline<br />

architecture of XPS is also similar to the one used in printers supporting the PostScript page description language.<br />

PDF includes dynamic capabilities not supported by the XPS format. [7]<br />

<strong>View</strong>ing and creating XPS documents<br />

XPS is supported on several versions of Windows.<br />

Because the printing architecture of Windows Vista uses XPS as the spooler format, [6] it has native support for<br />

generating and reading XPS documents. [8] XPS documents can be created by printing to the virtual XPS printer<br />

driver. The XPS <strong>View</strong>er is installed by default in Windows Vista and Windows 7. The viewer is hosted within<br />

Internet Explorer in Windows Vista, but is a native application in Windows 7. The IE-hosted XPS viewer and the<br />

XPS Document Writer are also available to Windows XP users when they download the .NET Framework 3.0. The<br />

IE-hosted viewer supports digital rights management and digital signatures. Users who do not wish to view XPS<br />

documents in the browser can download the XPS Essentials Pack, [9] which includes a standalone viewer and the XPS<br />

Document Writer. The XPS Essentials Pack also includes providers to enable the IPreview and IFilter capabilities<br />

used by Windows Desktop Search, as well as shell handlers to enable thumbnail views and file properties for XPS<br />

documents in Windows Explorer. [10] The XPS Essentials Pack is available for Windows XP, Windows Server 2003,<br />

and Windows Vista. [10] Installing this pack enables operating systems prior to Windows Vista to use the XPS print<br />

processor, instead of the GDI-based WinPrint, which can produce better quality prints for printers that support XPS<br />

in hardware (directly consume the format). [11] The print spooler format on these operating systems when printing to<br />

older, non-XPS-aware printers, however, remains unchanged.<br />

Windows 7 contains a standalone version of the XPS viewer that supports digital signatures. [12]<br />

Third-party support<br />

Software



Name | Publisher | Platform | Function

GhostXPS | Artifex Software Inc. [13] | Cross-platform | The Ghostscript software suite for processing various page description languages includes an input parser for XPS called GhostXPS. The software may be downloaded in source code form from ghostscript.com [14].

Okular | Okular team [15] | Linux, FreeBSD, Microsoft Windows, Solaris | Okular, the document viewer of the KDE project, can display XPS documents.

STDU Viewer | STDUtility [16] | Microsoft Windows | STDU Viewer can display and organize XPS documents (as well as other electronic document formats).

XPS Annotator | www.xpsdev.com [17] | Microsoft Windows | XPS Annotator can display, digitally sign and annotate XPS documents. In addition, it can convert XPS documents to common picture formats.

Aspose.Words product family | ASPOSE [18] | .NET Framework, Java, Microsoft Sharepoint, SQL Server Reporting Services, JasperReports | Aspose.Words enables application developers to build applications that "generate, modify, convert, render and print" XPS documents as well as some other formats. Aspose.Words is a .NET Framework class library rather than standalone software, so it is aimed at developers rather than end users. [19]

Multilizer | Multilizer [20] | Microsoft Windows | Multilizer localization products support the translation of documents through an XPS Scanner plug-in. This plug-in enables users to extract text from an XPS document, translate it, and write a translated XPS document with the same structure.

NiXPS View | NiXPS [21] | Microsoft Windows, Mac OS X | NiXPS View can display, search and print XPS documents. [22]

NiXPS Edit | NiXPS [21] | Microsoft Windows, Mac OS X | NiXPS Edit can view, edit, search, print and export XPS documents. [23]

NiXPS SDK | NiXPS [21] | Microsoft Windows, Mac OS X | NiXPS SDK enables application developers to develop applications that can view, edit or export XPS documents. [24]

Pagemark XpsViewer | Pagemark Technology, Inc. [25] | Microsoft Windows, Mac OS, Linux | Pagemark XpsViewer can display and organize XPS documents as well as convert them to common picture formats. [26]

Pagemark XpsConvert | Pagemark Technology, Inc. [25] | Microsoft Windows, Mac OS, Linux | Pagemark XpsConverter, a command-line interface tool, can convert XPS documents to PDF documents, as well as common picture formats. [26]

Pagemark XpsPlugin | Pagemark Technology, Inc. [25] | Mozilla Firefox, Safari | Pagemark XpsPlugin, an add-on for the Mozilla Firefox and Safari web browsers, enables these browsers to display XPS documents inside the browser window. This commercial product is not yet available for purchase, but a demo version is available. [26]

PDFTron XPSConvert | PDFTron [27] | Microsoft Windows, Mac OS X, Linux | PDFTron XPSConvert, a command-line interface tool, can convert XPS documents to PDF format or common picture formats. [28]

PDFTron PDF2XPS | PDFTron [27] | Microsoft Windows, Mac OS X, Linux | PDFTron PDF2XPS, a command-line interface tool, can convert PDF documents into XPS documents. [29]

Software Imaging XPSViewer | Software Imaging [30] | Microsoft Windows | Software Imaging XPSViewer, a freeware alternative to Microsoft XPS Viewer, can view and print XPS documents. [31]

NDesk XPS | NDesk [32] | Mono | NDesk XPS can view and convert XPS documents. [33]

Danet Studio | Danetsoft [34] | Microsoft Windows | Danet Studio can create, display, sign, convert and annotate XPS documents. It can split and merge existing XPS documents to create new XPS documents. [35]

xps2pdf.org [36] | | World Wide Web | xps2pdf.org, an online tool, can convert XPS documents to PDF format.

TreasureUP XPS to Image Converter 1.1 | TreasureUP [37] | Microsoft Windows | Converts XPS pages to JPEG, PNG and GIF image files. Supports batch conversion and automatic conversion of files in a specified folder. [38]

Hardware

XPS has the support of printing companies such as Konica Minolta, Sharp, [39] Canon, Epson, Hewlett-Packard, [40]<br />

and Xerox [41] and software and hardware companies such as Software Imaging, [42] Pagemark Technology Inc., [43]<br />

Informative Graphics Corp. (IGC), [44] NiXPS NV, [45] Zoran, [46] and Global Graphics. [47]<br />

Native XPS printers have been introduced by Canon, Konica Minolta, Toshiba, and Xerox. [48]

Devices certified at the "Certified for Windows Vista" level of Windows Logo conformance have been required to have XPS drivers for printing since 1 June 2007. [49]

Licensing<br />

In order to encourage wide use of the format, Microsoft has released XPS under a royalty-free patent license called<br />

the Community Promise for XPS, [50] [51] allowing users to create implementations of the specification that read, write<br />

and render XPS files as long as they include a notice within the source that technologies implemented may be<br />

encumbered by patents held by Microsoft. Microsoft also requires that organizations "engaged in the business of<br />

developing (i) scanners that output XPS Documents; (ii) printers that consume XPS Documents to produce<br />

hard-copy output; or (iii) print driver or raster image software products or components thereof that convert XPS<br />

Documents for the purpose of producing hard-copy output, [...] will not sue Microsoft or any of its licensees under<br />

the <strong>XML</strong> Paper Specification or customers for infringement of any <strong>XML</strong> Paper Specification Derived Patents (as<br />

defined below) on account of any manufacture, use, sale, offer for sale, importation or other disposition or promotion<br />

of any <strong>XML</strong> Paper Specification implementations." The specification itself is released under a royalty-free copyright<br />

license, allowing its free distribution. [52]<br />

Standardization<br />

Microsoft submitted the XPS specification to Ecma International. [53]<br />

In June 2007 Ecma International Technical Committee 46 (TC46) was set up to develop a standard based on the<br />

Open <strong>XML</strong> Paper Specification (OpenXPS). [54]<br />

At the 97th General Assembly held in Budapest, June 16, 2009, Ecma International approved Open <strong>XML</strong> Paper<br />

Specification (OpenXPS) as an Ecma standard (ECMA-388). [2]<br />

TC46's members are:

• Autodesk
• Brother Industries
• Canon
• Fujifilm
• Fujitsu
• Global Graphics
• Hewlett Packard
• Konica Minolta
• Lexmark
• Microsoft
• Monotype Imaging
• Océ Technologies
• Pagemark Technology
• Panasonic/Matsushita
• QualityLogic
• Ricoh
• Software Imaging Limited
• Toshiba
• Xerox
• Zoran Corporation

See also

• Comparison of OpenXPS and PDF
• Windows Vista printing technologies
• Functional specification

External links

• XML Paper Specification [55]

• Microsoft XPS Development Team Blog [56]<br />

• Standard ECMA-388 Open <strong>XML</strong> Paper Specification [1]<br />

• XPS FAQ and white papers on office and professional printing from a software technology provider [57]<br />

• <strong>View</strong>ing XPS Documents [58]<br />

References<br />

[1] http://www.ecma-international.org/publications/standards/Ecma-388.htm<br />

[2] Steve McGibbon (Microsoft) (2009-06-17). "OpenXPS - Open<strong>XML</strong> Paper Specification" (http://notes2self.net/archive/2009/06/17/<br />

openxps-openxml-paper-specification.aspx). .<br />

[3] "Global Graphics XPS reference" (http://www.redorbit.com/news/technology/665662/<br />

global_graphics_xps_reference_rip_available_from_microsoft/index.html). Redorbit.com. 2006-09-21. . Retrieved 2009-12-10.<br />

[4] "Reference Raster Image Processor (RIP)" (http://www.microsoft.com/whdc/device/print/RRIP.mspx). Microsoft.com. 2007-01-09. .<br />

Retrieved 2009-12-10.<br />

[5] "HD Photo information on Microsoft Photography team blog" (http://blogs.msdn.com/pix/archive/2007/03/12/hd-photo.aspx).<br />

Blogs.msdn.com. 2007-03-12. . Retrieved 2009-12-10.<br />

[6] Foley, Mary Jo (2005-04-25). "Microsoft Readies New Document Printing Specification" (http://www.microsoft-watch.com/content/<br />

operating_systems/microsoft_readies_new_document_printing_specification.html). Microsoft-watch.com. . Retrieved 2009-12-10.<br />

[7] "Comparison of PDF, XPS and ODF by an ISV providing PDF solutions" (http://www.amyuni.com/blog/?p=8). Amyuni.com. . Retrieved<br />

2009-12-10.<br />

[8] "XPS Documents in Windows Vista" (http://www.microsoft.com/windows/products/windowsvista/features/details/xps.mspx).<br />

Microsoft.com. . Retrieved 2009-12-10.<br />

[9] Download details: XPS Essentials Pack Version 1.0 (http://www.microsoft.com/downloads/details.<br />

aspx?FamilyID=b8dcffdd-e3a5-44cc-8021-7649fd37ffee&displaylang=en) Microsoft <strong>XML</strong> Paper Specification Essentials Pack<br />

[10] "<strong>View</strong> and generate XPS" (http://www.microsoft.com/whdc/xps/viewxps.mspx). Microsoft.com. . Retrieved 2009-12-10.<br />

[11] XPSDrv Filter Pipeline: Implementation and Best Practice (http://download.microsoft.com/download/9/c/5/<br />

9c5b2167-8017-4bae-9fde-d599bac8184a/XPSDrv_FilterPipe.doc)<br />

[12] "<strong>View</strong> and Generate XPS" (http://www.microsoft.com/whdc/xps/viewxps.mspx). Microsoft.com. . Retrieved 2009-12-10.<br />

[13] http://www.artifex.com/<br />

[14] http://www.ghostscript.com/GhostPCL.html<br />

[15] http://okular.kde.org/team.php<br />

[16] http://www.stdutility.com<br />

[17] http://www.xpsdev.com<br />

[18] http://www.aspose.com/<br />

[19] "Aspose.Words Product Family" (http://www.aspose.com/categories/product-family-packs/aspose.words-product-family/default.<br />

aspx). Aspose.com. . Retrieved 2010-03-24.<br />

[20] http://www.multilizer.com



[21] http://www.nixps.com<br />

[22] "NiXPS <strong>View</strong>" (http://www.nixps.com/view3/index.html). Nixps.com. . Retrieved 2010-03-24.<br />

[23] "NiXPS Edit" (http://www.nixps.com/nixps_edit_20.html). Nixps.com. . Retrieved 2010-03-24.<br />

[24] "Nixps Sdk" (http://www.nixps.com/library.html). Nixps.com. . Retrieved 2010-03-24.<br />

[25] http://www.pagemarktechnology.com/<br />

[26] "Pagemark: XPS <strong>View</strong>er, XPS Converter and XPS Plug-in" (http://www.pagemarktechnology.com/home/products.html).<br />

Pagemarktechnology.com. . Retrieved 2010-03-24.<br />

[27] http://www.pdftron.com/<br />

[28] "PDFTron XPSConvert" (http://www.pdftron.com/xpsconvert/index.html). Pdftron.com. 2007-04-02. . Retrieved 2010-03-24.<br />

[29] "PDFTron PDF2XPS" (http://www.pdftron.com/pdf2xps/index.html). Pdftron.com. 2007-04-02. . Retrieved 2010-03-24.<br />

[30] http://softwareimaging.com/<br />

[31] http://softwareimaging.com/products-services/XPS<strong>View</strong>er/index.asp<br />

[32] http://www.ndesk.org/<br />

[33] "NDesk XPS" (http://www.ndesk.org/Xps). Ndesk.org. . Retrieved 2010-03-24.<br />

[34] http://www.danetsoft.com/<br />

[35] Danet Studio (http://www.danetsoft.com/product)<br />

[36] http://www.xps2pdf.org<br />

[37] http://www.treasureup.com/page1.aspx<br />

[38] "XPS to Image" (http://download.cnet.com/TreasureUP-XPS-to-Image-Converter/3000-6675_4-10838983.html). download.cnet.com.<br />

2010-04-05. .<br />

[39] "Sharp Open Systems Architecture supports XPS in multi-function printers" (http://www.sharpusa.com/products/<br />

FunctionPressReleaseSingle/0,1080,650-5,00.html#). Sharpusa.com. . Retrieved 2009-12-10.<br />

[40] Monckton, Paul. "''IT Week'' 10 November 2006, Canon, Epson and HP support for XPS" (http://www.itweek.co.uk/<br />

personal-computer-world/features/2167665/photo-printing-under-windows). Itweek.co.uk. . Retrieved 2009-12-10.<br />

[41] "''Fuji Xerox and Microsoft Collaborate in Document Management Solutions Field''" (http://www.fujixerox.co.jp/eng/headline/2006/<br />

1128_withms.html). Fujixerox.co.jp. 2006-11-28. . Retrieved 2009-12-10.<br />

[42] "XPS & Windows Vista" (http://softwareimaging.com/xps). Software Imaging. . Retrieved 2009-12-10.<br />

[43] "Bot generated title ->" (http://www.pagemarktechnology.com). Pagemark Technology



PCDATA<br />

PCDATA is a term that originated in SGML; it is short for "Parsed Character Data".

#PCDATA in <strong>XML</strong> DTD<br />

In <strong>XML</strong> DTD[1], #PCDATA is the keyword to specify "mixed content", meaning an element can contain character<br />

data and/or child elements in arbitrary order and number of occurrences. For example:<br />

<br />

<br />

In declarations of this kind, an element declared as (#PCDATA) must contain character data only, while an element declared with a mixed-content model can contain any combination of character data and the listed child elements.
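As a hedged illustration of mixed content, the sketch below declares a few elements in an internal DTD subset and uses the Expat parser from Python's standard library to report which of them have a mixed-content model. The element names (doc, title, para, em, link) are hypothetical and are not the declarations from the original example, which did not survive in this copy.

```python
from xml.parsers import expat
from xml.parsers.expat import model

# Hypothetical document with an internal DTD subset (element names are assumptions).
DOCUMENT = """<!DOCTYPE doc [
<!ELEMENT doc (title, para)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT para (#PCDATA | em | link)*>
<!ELEMENT em (#PCDATA)>
<!ELEMENT link (#PCDATA)>
]>
<doc><title>Mixed content</title><para>Plain <em>text</em> and a <link>link</link>.</para></doc>"""


def mixed_content_models(document: str) -> dict:
    """Map each element declared with #PCDATA to the child elements it allows."""
    found = {}

    def on_element_decl(name, content_model):
        # Expat reports each content model as (type, quantifier, name, children).
        content_type, _quantifier, _name, children = content_model
        if content_type == model.XML_CTYPE_MIXED:
            found[name] = [child[2] for child in children]

    parser = expat.ParserCreate()
    parser.ElementDeclHandler = on_element_decl
    parser.Parse(document, True)
    return found


models = mixed_content_models(DOCUMENT)
for element_name, allowed_children in models.items():
    print(element_name, allowed_children)
```

Here title allows character data only, while para allows character data interleaved with em and link elements in any order; doc is not reported because its model contains no #PCDATA. Expat is a non-validating parser, so this inspects the declarations rather than validating the document against them.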

Although its name and its appearance in DTDs suggest otherwise, #PCDATA is not a general-purpose keyword for character data; it can appear only as the first item in a mixed-content declaration. The following usages are illegal:

<br />

<br />

<br />

<br />
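The restriction can be checked with a small experiment. The sketch below feeds a handful of hypothetical element declarations (not the ones from the original article, which are missing from this copy) to the Expat parser in Python's standard library and records whether each is accepted. It assumes that Expat reports a declaration with #PCDATA in a non-leading position as a syntax error.

```python
from xml.parsers import expat


def dtd_is_accepted(element_decl: str) -> bool:
    """Return True if the parser accepts the declaration in an internal subset."""
    document = f"<!DOCTYPE x [\n{element_decl}\n]>\n<x/>"
    parser = expat.ParserCreate()  # a parser object can only be used once
    try:
        parser.Parse(document, True)
    except expat.ExpatError:
        return False
    return True


# Hypothetical declarations: the first two follow the mixed-content rule,
# the last two place #PCDATA somewhere other than the leading position.
results = {
    decl: dtd_is_accepted(decl)
    for decl in (
        "<!ELEMENT x (#PCDATA)>",
        "<!ELEMENT x (#PCDATA | a | b)*>",
        "<!ELEMENT x (a | #PCDATA)*>",
        "<!ELEMENT x (a, #PCDATA)>",
    )
}
for decl, accepted in results.items():
    print("accepted" if accepted else "rejected", decl)
```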

[1] http://www.w3.org/TR/REC-xml/#sec-mixed-content



Plain Old <strong>XML</strong><br />

Plain Old <strong>XML</strong> (POX) is a term used to describe basic <strong>XML</strong>, sometimes mixed in with other, blendable<br />

specifications like <strong>XML</strong> Namespaces, Dublin Core, XInclude and XLink. People typically use the term as a contrast<br />

with complicated, multilayered <strong>XML</strong> specifications like those for web services or RDF. The term may have been<br />

derived from or inspired by the expression plain old telephone service (POTS) and, similarly, Plain Old Java Object (POJO).

An interesting question is how POX relates to XML Schema. On the one hand, POX is completely compatible with XML Schema. On the other hand, many POX users avoid XML Schema because of the poor or inconsistent quality of XML Schema-to-Java tools.

POX is complementary to REST: REST refers to a communication pattern, while POX refers to an information<br />

format style.<br />

The primary competitors to POX are more strictly-defined <strong>XML</strong>-based information formats such as RDF and SOAP<br />

section 5 encoding, as well as general non-<strong>XML</strong> information formats such as JSON and CSV.<br />
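A minimal sketch of what a POX payload might look like in practice follows. It builds a small XML message with Python's standard ElementTree module and reads it back, with no envelope, schema or additional layering. The order and item element names and the sku attribute are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET


def build_order(order_id: str, items: dict) -> bytes:
    """Serialize an order as plain XML (element names are illustrative)."""
    order = ET.Element("order", {"id": order_id})
    for sku, quantity in items.items():
        item = ET.SubElement(order, "item", {"sku": sku})
        item.text = str(quantity)
    return ET.tostring(order, encoding="utf-8")


def read_order(payload: bytes) -> dict:
    """Parse the payload back into a mapping of SKU to quantity."""
    root = ET.fromstring(payload)
    return {item.get("sku"): int(item.text) for item in root.iter("item")}


payload = build_order("42", {"A-1": 2, "B-7": 1})
quantities = read_order(payload)
print(payload.decode("utf-8"))
print(quantities)
```

In a REST-style exchange, bytes like these would typically be the body of an HTTP request or response; the transport is outside the scope of this sketch.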

External links<br />

• REST and POX article [1] from the Microsoft Developer Network<br />

• Plain Old <strong>XML</strong> Considered Harmful [2] from Microformats.org<br />

• Support for POX [3] in the Java Spring Framework<br />

• Plain<strong>XML</strong> on SourceForge.net [4]<br />

References<br />

[1] http://msdn.microsoft.com/en-us/library/aa395208.aspx<br />

[2] http://microformats.org/wiki/plain-old-xml-considered-harmful<br />

[3] http://static.springsource.org/spring-ws/sites/1.5/apidocs/org/springframework/ws/pox/package-summary.html<br />

[4] http://sourceforge.net/projects/plainxml/



Portable Application Description<br />

Portable Application Description is a machine-readable document format designed by the Association of<br />

Shareware Professionals.<br />

It allows authors to provide product descriptions and specifications to online sources in a standard way, using a<br />

standard data format, a simplified subset of <strong>XML</strong>, that will allow webmasters and program librarians to automate<br />

program listings. PAD saves time for both authors and webmasters.<br />

Each field in the specification has a regular expression (regex) associated with it. The regex acts as a constraint on<br />

the field: if the regex matches, the field value is legal and if it fails to match, the field and the PAD file as a whole<br />

are out of spec. Only files where all fields in the file pass validation are properly called PAD files.<br />
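The per-field validation described above can be sketched as follows. The field names and regular expressions here are hypothetical stand-ins; the actual rules are defined by the official PAD specification and are not reproduced in this article.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical rules for illustration only; the real PAD specification
# defines its own fields and regular expressions.
FIELD_RULES = {
    "Program_Name": re.compile(r".{1,80}"),
    "Program_Version": re.compile(r"\d+(\.\d+){0,3}"),
    "Program_Release_Year": re.compile(r"(19|20)\d{2}"),
}


def invalid_fields(pad_xml: str) -> list:
    """Return the names of fields whose values fail their regex (or are missing)."""
    root = ET.fromstring(pad_xml)
    values = {element.tag: (element.text or "") for element in root.iter()}
    return [
        tag
        for tag, rule in FIELD_RULES.items()
        if not rule.fullmatch(values.get(tag, ""))
    ]


TEMPLATE = (
    "<XML_DIZ_INFO><Program_Info>"
    "<Program_Name>Demo Viewer</Program_Name>"
    "<Program_Version>{version}</Program_Version>"
    "<Program_Release_Year>2010</Program_Release_Year>"
    "</Program_Info></XML_DIZ_INFO>"
)

good_result = invalid_fields(TEMPLATE.format(version="1.2.0"))
bad_result = invalid_fields(TEMPLATE.format(version="one"))
print(good_result, bad_result)
```

A file counts as valid only when the returned list is empty, mirroring the rule that every field must pass for the file to be called a PAD file.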

The main simplification in PAD relative to XML is that PAD does not use name/value pairs (attributes) in tags: all tags are attribute-free. This is less expressive than XML but easier to parse. The official PAD spec uses unique tags, so extracting the fields in the official spec does not require descending through the tag path. However, if multiple languages are represented in a single PAD file, correct parsing does require descending through the tag path, because leaf tags are duplicated for each supported language.
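The difference between looking up a unique tag and descending through the tag path can be sketched as below. The structure and tag names (Pad, Descriptions, English, German, Summary) are hypothetical and simplified; they are not taken from the official PAD specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified structure: the Summary leaf tag is repeated
# once per language, while Name occurs only once.
PAD_TEXT = """<Pad>
  <Name>Demo Viewer</Name>
  <Descriptions>
    <English><Summary>Fast file viewer</Summary></English>
    <German><Summary>Schneller Dateibetrachter</Summary></German>
  </Descriptions>
</Pad>"""

root = ET.fromstring(PAD_TEXT)

# A unique tag can be found anywhere in the tree without knowing its path.
program_name = root.find(".//Name").text

# A duplicated leaf tag is ambiguous when searched for by name alone...
all_summaries = [element.text for element in root.iter("Summary")]

# ...so the full tag path is needed to select one language.
german_summary = root.findtext("Descriptions/German/Summary")

print(program_name, all_summaries, german_summary)
```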

External links<br />

• Official PAD site [1]<br />

• The Official PAD specification [2]<br />

• The Official PAD validator [3]<br />

• 30 or so free and commercial PAD products, services, and links [4]<br />

• PAD database and graphics updated weekly [5]<br />

• About PAD files (Software Industry Professionals) [6]<br />

• PAD Validation Tool [7]<br />

• Online PAD Generator [8]<br />

• Taşınabilir Uygulama Tanımı [9]<br />

References<br />

[1] http://www.asp-shareware.org/pad/<br />

[2] http://www.asp-shareware.org/pad/spec/spec.php<br />

[3] http://www.asp-shareware.org/pad/spec/validate.php<br />

[4] http://www.asp-shareware.org/pad/padlinks.php<br />

[5] http://paddatacenter.net/<br />

[6] http://www.siprofessionals.org/developers/viewarticle.php?id=si20070802<br />

[7] http://www.sharewarepromotions.com/PAD_Validation.asp<br />

[8] http://www.padbuilder.com/<br />

[9] http://www.tankado.com/pad-portable-application-description/



Publishing Requirements for Industry Standard<br />

Metadata<br />

PRISM Metadata Standard<br />

Introduction<br />

The Publishing Requirements for Industry Standard Metadata (PRISM) [1] specification defines a set of XML metadata vocabularies for syndicating, aggregating, post-processing and multi-purposing content. PRISM provides a framework for the interchange and preservation of content and metadata, a collection of elements to describe that content, and a set of controlled vocabularies listing the values for those elements. PRISM metadata can be expressed in XML, RDF/XML, or XMP and incorporates Dublin Core elements. PRISM can be thought of as a set of XML tags used to hold the metadata of articles and even to tag article content.

PRISM conforms to the W3C Namespaces in XML recommendation. PRISM namespaces are PRISM (prism:), PRISM Usage Rights (pur:), Dublin Core (dc: and dcterms:), PRISM Inline Metadata (pim:), PRISM Rights Language (prl:), PRISM Aggregator Message (pam:), and PRISM Controlled Vocabulary (pcv:). PRISM incorporates existing industry standards such as Dublin Core and XHTML in order to build on work already done in the publishing industry. New elements were created only when required, and were assigned to PRISM-specific namespaces.
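As a rough illustration of how namespaced PRISM and Dublin Core elements can sit side by side, the sketch below builds a small metadata record with Python's ElementTree. The Dublin Core namespace URI is the standard one; the PRISM namespace URI, the record wrapper element, and the choice of elements (publicationName, coverDate) are assumptions made for this example and should be checked against the PRISM specification.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    # Assumed PRISM namespace URI; verify against the PRISM specification.
    "prism": "http://prismstandard.org/namespaces/basic/2.0/",
}


def build_record(title: str, publication: str, cover_date: str) -> ET.Element:
    """Build a metadata record mixing dc: and prism: elements."""
    for prefix, uri in NAMESPACES.items():
        ET.register_namespace(prefix, uri)
    record = ET.Element("record")  # hypothetical wrapper element
    ET.SubElement(record, f"{{{NAMESPACES['dc']}}}title").text = title
    ET.SubElement(
        record, f"{{{NAMESPACES['prism']}}}publicationName"
    ).text = publication
    ET.SubElement(record, f"{{{NAMESPACES['prism']}}}coverDate").text = cover_date
    return record


record = build_record("An Example Article", "Example Monthly", "2010-06-01")
xml_text = ET.tostring(record, encoding="unicode")
publication_name = record.findtext("prism:publicationName", namespaces=NAMESPACES)
print(xml_text)
print(publication_name)
```

Reusing dc:title rather than defining a new PRISM title element reflects the design choice described above: new elements are introduced only where an existing standard does not already cover the need.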

Overview<br />

PRISM consists of three specifications. The PRISM Specification itself defines the overall PRISM framework. A second specification, the PRISM Aggregator Message (PAM) Schema/DTD, is a standard format for publishers to use for delivery of content to websites, aggregators, and syndicators. PAM is available as an XML DTD and an XML schema (XSD). Both PAM formats provide a simple, flexible model for transmitting content and PRISM metadata. The third, and newest, specification provides an XML schema (XSD) for capture of content usage rights metadata. This Guide to PRISM Usage Rights uses the elements found in PRISM’s Usage Rights Namespace to allow users to comprehensively capture and relay rights metadata for text and media content.

Background<br />

In 1999, IDEAlliance contracted Linda Burman to found the PRISM Working Group to address emerging publisher requirements for a metadata standard to facilitate “agile” content for search, digital asset management, and content aggregation. Since that time, individuals from more than 50 IDEAlliance member companies have participated in the development of the specifications.

PRISM is an IDEAlliance specification but is available free of charge. IDEAlliance (International Digital Enterprise<br />

Alliance) is a not-for-profit membership organization. Its mission is to advance user-driven, cross-industry solutions<br />

for all publishing and content-related processes by developing standards, fostering business alliances, and identifying<br />

best practices.<br />

Many organizations use PRISM because it provides a common metadata standard across platforms, media types and business units. Organizations involved in any type of content creation, categorization, management, aggregation or distribution, whether commercially or within intranet and extranet frameworks, can use the PRISM standards.

The PRISM Working Group is open to all IDEAlliance members and includes: Adobe Systems, Hachette Filipacchi<br />

Media, Hearst, L.A. Burman Associates, LexisNexis, The McGraw-Hill Companies, Reader’s Digest, Source<br />

Interlink Media Companies, Time Inc., The Nature Publishing Group, and U.S. News and World Report.



Usage and Applications<br />

PRISM can be incorporated into other standards. At this time, the only such incorporation the PRISM Working Group is aware of is with RSS 1.0. See RSS 1.0 [2] and the RSS 1.0 PRISM Module for more information.

The PRISM specification defines a set of metadata vocabularies. PRISM metadata may be expressed in a different syntax depending on the specific use-case scenario. Currently PRISM metadata can be encoded as XML, XML/RDF, or XMP. Each of these expressions of PRISM metadata is called a profile.

• Profile 1 is for the expression of PRISM metadata in <strong>XML</strong>. An example is the <strong>XML</strong> PRISM Aggregator Message<br />

(PAM).<br />

• Profile 2 is for the expression of PRISM metadata in <strong>XML</strong>/RDF such as for expressing PRISM metadata in RSS<br />

feeds.<br />

• Profile 3 is for embedding PRISM metadata in media objects such as digital images or PDFs using XMP<br />

technology.<br />

PRISM describes many components of print, online, mobile, and multimedia content, including the following:

• Who created, contributed to, and owns the rights to the content

• What locations, organizations, topics, people, and/or events it covers, what media it contains, and under what conditions it may be reproduced

• When it was published (cover date, post date, volume, number) or withdrawn

• Where it can be republished, and the original platform on which it appeared

• How it can be reused

Common PRISM Usage<br />

• Syndication to partners<br />

• Content aggregation<br />

• Content repurposing<br />

• Resource discovery and search optimization<br />

• Multiple platform and channel distribution<br />

• Content archiving<br />

• Capture rights usage information<br />

• Creation of feeds, such as RSS<br />

• Standalone services<br />

• Embedded descriptions, such as XMP<br />

• Web publishing<br />

See also<br />

• Dublin Core<br />

• DTD<br />

• Comparison of document markup languages<br />

• Controlled vocabulary<br />

• Interoperability



• Dublin Core Metadata Initiative<br />

• Bibliographic Ontology<br />

Further reading<br />

• IDEAlliance [3]<br />

• PRISM Standard [4]<br />

• PRISM FAQ [5]<br />

• RSS 1.0 PRISM Module [6]<br />

• Using PRISM - The PRISM Cookbook [7] is a systematic guide that demonstrates how to apply PRISM elements<br />

in particular business scenarios. The existing PRISM Cookbook addresses only PRISM Profile 1 (<strong>XML</strong>).<br />

• W3C – Namespaces in <strong>XML</strong> [8]<br />

References<br />

[1] PRISM Metadata Standard (http://www.idealliance.org/industry_resources/intelligent_content_informed_workflow/prism)<br />

[2] http://web.resource.org/rss/1.0/spec<br />

[3] http://www.idealliance.org<br />

[4] http://www.prismstandard.org<br />

[5] http://www.prismstandard.org/faq/<br />

[6] http://nurture.nature.com/rss/modules/mod_prism.html<br />

[7] http://www.prismstandard.org/resources/<br />

[8] http://www.w3.org/TR/2006/REC-xml-names11-20060816/<br />

QName<br />

QNames were introduced by <strong>XML</strong> Namespaces in order to be used as URI references [1] . QName stands for<br />

"qualified name" and defines a valid identifier for elements and attributes. QNames are generally used to reference<br />

particular elements or attributes within <strong>XML</strong> documents. [2]<br />

Motivation<br />

Since URI references can be long and may contain characters that are not allowed in element or attribute names, QNames are<br />

used to create a mapping between the URI and a namespace prefix. The mapping allows URIs to be abbreviated,<br />

which makes XML documents more convenient to write (see Example).<br />

Formal definition<br />

QNames are formally defined by the W3C as [3] :<br />

QName ::= PrefixedName | UnprefixedName<br />

PrefixedName ::= Prefix ':' LocalPart<br />

UnprefixedName ::= LocalPart<br />

Here the Prefix serves as a placeholder for the namespace and the LocalPart is the local part of the qualified<br />

name. A local part can be an attribute name or an element name.
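
The grammar above can be sketched in a few lines of Python. This is a minimal illustration, not a conforming parser: the function name `split_qname` is hypothetical, and the name check is deliberately restricted to ASCII, whereas the real NCName production admits many more Unicode characters.

```python
import re

# Simplified stand-in for NCName (assumption: ASCII only, no colon allowed).
_NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")


def split_qname(qname):
    """Split a QName into (prefix, local_part).

    prefix is None for an UnprefixedName.
    """
    prefix, sep, local = qname.rpartition(":")
    if not sep:
        # UnprefixedName ::= LocalPart
        prefix = None
    elif not _NCNAME.fullmatch(prefix):
        # PrefixedName ::= Prefix ':' LocalPart, and Prefix may not be
        # empty or contain a further colon.
        raise ValueError(f"invalid prefix in QName: {qname!r}")
    if not _NCNAME.fullmatch(local):
        raise ValueError(f"invalid local part in QName: {qname!r}")
    return prefix, local


print(split_qname("x:p"))  # ('x', 'p')
print(split_qname("doc"))  # (None, 'doc')
```

A string with more than one colon, such as "a:b:c", is rejected because neither Prefix nor LocalPart may itself contain a colon.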



Example<br />

<?xml version='1.0'?><br />

<doc xmlns:x="http://example.com/ns/foo"><br />

  <x:p/><br />

</doc><br />

In line two the prefix "x" is declared to be associated with the URI "http://example.com/ns/foo". This prefix can<br />

further on be used as abbreviation for this namespace. Subsequently the tag "x:p" is a valid QName because it uses<br />

the "x" as namespace reference and "p" as local part. The tag "doc" is also a valid QName, but it consists only of a<br />

local part. [4]<br />
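
To see how a namespace-aware parser resolves these names, the following sketch feeds a four-line document matching the description above to Python's standard `xml.etree.ElementTree` module. The document text is a reconstruction from the prose; ElementTree reports each name in "{URI}local" form, which is its own notation rather than part of the XML Namespaces specification.

```python
import xml.etree.ElementTree as ET

# Four-line document matching the description in the text
# (prefix "x" bound to http://example.com/ns/foo on line two).
DOC = """<?xml version='1.0'?>
<doc xmlns:x="http://example.com/ns/foo">
  <x:p/>
</doc>"""

root = ET.fromstring(DOC)
child = root[0]

# "doc" has no prefix and no default namespace, so only the local part remains.
print(root.tag)   # doc

# "x:p" is expanded: the prefix is replaced by the URI it was bound to.
print(child.tag)  # {http://example.com/ns/foo}p

# ET.QName builds the same expanded form from a URI and a local part.
expanded = ET.QName("http://example.com/ns/foo", "p").text
print(expanded == child.tag)  # True
```

The prefix itself is not preserved in the parsed tree, which reflects the point that "x" is only an abbreviation for the namespace URI.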

See also<br />

• CURIE<br />

References<br />

[1] Namespaces in <strong>XML</strong> 1.0 (Second Edition) (http://www.w3.org/TR/REC-xml-names/#dt-qualname)<br />

[2] Using Qualified Names (QNames) as Identifiers in <strong>XML</strong> Content (http://www.w3.org/2001/tag/doc/qnameids.html#sec-qnames-xml)<br />

[3] Namespaces in <strong>XML</strong> 1.0 (Second Edition) (http://www.w3.org/TR/REC-xml-names/#NT-QName)<br />

[4] Namespaces in <strong>XML</strong> 1.0 (Second Edition) (http://www.w3.org/TR/REC-xml-names/#NT-LocalPart)<br />

QTI<br />

The IMS Question and Test Interoperability specification (QTI) defines a standard format for the representation<br />

of assessment content and results, supporting the exchange of this material between authoring and delivery systems,<br />

repositories and other learning management systems. It allows assessment materials to be authored and delivered on<br />

multiple systems interchangeably. It is, therefore, designed to facilitate interoperability between systems [1] .<br />

The specification consists of a data model that defines the structure of questions, assessments and results from<br />

questions and assessments together with an <strong>XML</strong> data binding that essentially defines a language for interchanging<br />

questions and other assessment material. The <strong>XML</strong> binding is widely used for exchanging questions between<br />

different authoring tools and by publishers. The assessment and results parts of the specification are less widely used.<br />

Background<br />

QTI was produced by the IMS Global Learning Consortium, which is an industry and academic consortium that<br />

develops specifications for interoperable learning technology. QTI was inspired by the need for interoperability in<br />

question design, and to avoid people losing or having to re-type questions when technology changes. Developing and<br />

validating good questions is time consuming, and it's desirable to be able to create them in a platform and technology<br />

neutral format.<br />

QTI version 1.0 was materially based on the proprietary Questions Markup Language (QML) defined by<br />

Questionmark, but the language has evolved over the years and can now describe almost any reasonable question<br />

that one might want to describe. (QML is still in use by Questionmark and is generated for interoperability by tools<br />

like Adobe Captivate.)<br />

The most widely used version of QTI at the time of writing is version 1.2, which was finalized in 2002. This works<br />

well for exchanging simple question types, and is supported by many tools that allow the creation of questions.<br />

Version 2.0 was released in 2005, with v2.1 due for release in 2008 [2]. Version 2.0 addressed the item (individual question)<br />

level of the specification only, with 2.1 covering assessments and results as well as correcting errors which had<br />

become apparent in 2.0. Version 2.x is a significant improvement on earlier versions, defining a new underlying<br />

interaction model. It is also notable for its significantly greater degree of integration with other specifications (some<br />

of which did not exist during the production of v1): the specification addresses the relationship with IMS Content<br />

Packaging v1.2, IEEE Learning Object Metadata, IMS Learning Design, IMS Simple Sequencing and other<br />

standards such as XHTML. It also provides guidance on representing context-specific usage data and information to<br />

support the migration of content from earlier versions of the specification.<br />

Because v2.0 was limited to items only, and v2.1 has yet to be formally released by IMS (although two public drafts<br />

plus an addendum are currently available), uptake of v2.x has been slow to date. The delay between the release of 2.0<br />

and 2.1 (over three years to date) may have hindered uptake to some extent, with developers reluctant to commit to<br />

v2.0 knowing that v2.1 is in development. The use of a profile of v1.2.1 in the IMS Common Cartridge specification<br />

may exacerbate this. A number of implementations are emerging, however, and uptake may increase once the<br />

specification is finally available in a stable form.<br />

In early 2009, the IMS Global Learning Consortium withdrew QTI 2.1, stating that "Adequate feedback on the<br />

specification has not been received, and therefore, the specification has been put back into the IMS project group<br />

process for further work." [3] The most recent version of QTI that is fully endorsed by IMS GLC is v1.2.1. This<br />

decision met with disapproval on the IMS-QTI mailing list. [4] A further clarification on the QTI 2.1 withdrawal<br />

acknowledged the work done on implementing the QTI 2.1 draft specification, and cited criticism on the lack of<br />

interoperability of IMS specifications as a reason for endorsing only IMS QTI 1.2. [5] A few weeks later IMS GLC<br />

reposted the QTI v2.1 draft specification on their website [6] with a warning that the specification is incomplete:<br />

Caution: The QTIv2.1PD Version 2 specification is incomplete in its current state. The IMS QTI project group<br />

is in the process of evolving this specification based on input from market participants. Suppliers of products<br />

and services are encouraged to participate by contacting Mark McKell at [e-mail address removed]. This<br />

specification will be superseded by an updated release based on the input of the project group participants.<br />

Please note that supplier's claims as to implementation of QTI v2.1 and conformance to it HAVE NOT BEEN<br />

VALIDATED by IMS GLC. While such suppliers are likely well-intentioned, IMS GLC member<br />

organizations have not yet put in place the testing process to validate these claims. IMS GLC currently grants a<br />

conformance mark to the Common Cartridge profile of QTI v1.2.1. [7]<br />

Timeline<br />

Date | Version | Comments<br />
March 1999 | 0.5 | Internal to IMS<br />
February 2000 | 1.0 public draft |<br />
May 2000 | 1.0 final release |<br />
August 2000 | 1.01 |<br />
March 2001 | 1.1 |<br />
January 2002 | 1.2 |<br />
March 2003 | 1.2.1 addendum |<br />
September 2003 | 2.0 charter | Initiation of working group<br />
January 2005 | 2.0 final release |<br />
January 2006 | 2.1 public draft |<br />
July 2006 | 2.1 public draft version 2 |<br />
April 2008 | 2.1 public draft addendum |<br />
early 2009 | 2.1 removed from website |<br />
January 2010 | 2.1 reinstated on website |<br />

Applications with IMS QTI support<br />

Name | QTI version | Type of tool | Comment<br />
ANGEL Learning Management Suite | 2.1 [8] | LMS | also supports IMS Common Cartridge [8]<br />
APIS QTIv2 Assessment Engine | 2.0 draft [9] | Java library & demo application. | Incomplete. Author recommends using QTITools instead.<br />
AQuRate | 2.1 [10] | authoring tool | see QTITools<br />
ASDEL | 2.1 [11] | assessment delivery system | see QTITools<br />
ATutor | 1.2, 2.1 [12] | LCMS |<br />
Canvas Learning [13] | 1.2.1 | Authoring tools and SCORM compatible item renderer available as middle-ware solutions. | Creators - Can Studios contributed to the development of the QTI specification. A number of LMS systems used the Canvas Learning Player to achieve compatibility with the Becta learning platform conformance regime. The system is currently being distributed to schools in the UK as a result of this integration work.<br />
CCReader | 1.2.1 CC Profile [14] | Common Cartridge Viewer |<br />
Cognero | 1.2 and 2.1 [15] | Assessment authoring and delivery system. | Cognero imports QTI 1.2 and exports QTI 1.2 and 2.1 to allow content to work with other systems.<br />
Content-e | 1.2 & 2.0 [16] | Professional authoring tool Content-e. | Imports QTI 1.2 and 2.0.<br />
DB Primary | 2.0 [17] [18] | LMS |<br />
Diploma | 1.2, 2.1 [19] | | export QTI 1.2 & 2.1<br />
Dokeos | 1.2 and 2.0 [20] | LMS/LCMS | export QTI 1.2 & 2.0 (1.2 disabled by default but available) (supports SCORM 1.2)<br />
Elques | 2.1 [21] [22] | authoring tool | exports QTI 2.1 and QTI 1.2 (for LMS OLAT only); imports QTI 2.1, Tests from Blackboard and OLAT (kind of QTI 1.2 too)<br />
it's learning | 2.1 [23] | VLE | import and export questions in QTI 2.1 format<br />
ILIAS | not stated [24] | LMS | supports SCORM 1.2 and SCORM 2004<br />
Lectora | not stated [25] | authoring tool | supports SCORM 1.2 and SCORM 2004<br />
Mathqurate | 2.1 [26] | authoring tool | see QTITools. Embedded Gecko engine and support for multiple interactions<br />
Moodle | not stated [27] | LCMS | supports adaptive questions; QTI 2.0 export is still unfinished<br />
Online Learning And Training | QTI 1.2 [28] | LCMS | QTI 2.1 compliance can be achieved with ONYX as plugin<br />
ONYX | 2.1 [29] | modular assessment delivery system | open-source, QTI 2.1 import and export, Report Viewer for graphical visualization of QTI-Result-Files<br />
OWL Testing Software | not stated [30] | test management system | can import IMS QTI<br />
QTITools | 2.1 [31] | collection of tools and libraries | Test authoring tool Spectatus produces QTI 2.1 [32]<br />
QuestionMark Perception | not stated [33] | authoring tool and delivery system | can export IMS QTI, an online tool provides QTI 1.2 import<br />
Question Writer 2.0 Publisher Edition | 1.2 [34] | authoring tool | Exports as QTI 1.2 and SCORM 1.2 [35]<br />
Question Writer 3.5 Professional | 1.2 [36] | authoring tool | Exports as QTI 1.2 and SCORM 1.2 [37] Also specific QTI Export for Pearson VUE [38]<br />
Respondus | 1.2 [39] [40] | authoring tool | QTI export<br />
RM Test Authoring System | 2.1 [41] | authoring tool |<br />
Sakai | 1.2 [42] | LMS |<br />
SToMP (Software Teaching of Modular Physics) | 2.1 [43] | assessment system | mostly unavailable as of July 2008<br />
Studywiz | 1.2 [44] | Virtual Learning Environment Module | An optional module for creating and assigning QTI v1.2 questions to students. Available as of June 2008<br />
Wimba Create | QTI Lite [45] | authoring tool | only export<br />

Other software:<br />

• QTI Migration Tool (University of Cambridge): converts QTI version 1.x data into QTI 2.0 content packages. [46]<br />

External links<br />

• IMS Global Learning Consortium: IMS Question & Test Interoperability Specification [47]<br />

• TOIA (Technologies for Online Interoperable Assessment) [48] - this project ended in 2007 and software is no<br />

longer available.<br />

• QTI Tools [49]<br />

• JISC CETIS Assessment special interest group [50]<br />

• JISC CETIS wiki: Assessment tools, projects and resources [51]<br />

• IMS Question & Test Interoperability mailing list [52]



References<br />

[1] Effective Practice with e-Assessment guide, p.44 (http://www.jisc.ac.uk/media/documents/themes/elearning/effpraceassess.pdf)<br />

[2] QTI Update (http://wiki.cetis.ac.uk/Assessment_and_EC_SIGs_meeting_Feb_2008#QTI_Update)<br />

[3] IMS Global Learning Consortium: IMS Question & Test Interoperability Specification (http://www.imsglobal.org/question/index.html).<br />

Accessed March 29, 2009.<br />

[4] E-mail thread "QTI 2.1 draft specification withdrawn" (http://lists.ucles.org.uk/public/ims-qti/2009-March/001456.html), starting<br />

March 27, 2009.<br />

[5] Rob Abel: Further clarification on the removal of QTI v2.1 from the IMS web site (http://www.imsglobal.org/community/forum/<br />

messageview.cfm?catid=21&threadid=36&enterthread=y), on the IMS Global Learning Consortium's Question and Test Interoperability<br />

Forum, March 30, 2009. Accessed March 29, 2009.<br />

[6] rabel: We are reposting the QTI v2.1 (http://www.imsglobal.org/community/forum/messageview.cfm?catid=21&threadid=41&<br />

enterthread=y). Question and Test Interoperability Forum, April 14, 2009. Accessed April 17, 2009.<br />

[7] IMS Global Learning Consortium: IMS Question & Test Interoperability Specification (http://www.imsglobal.org/question/index.html).<br />

Accessed April 17, 2009.<br />

[8] ANGEL Learning Management Suite: Standards Leadership (http://www.angellearning.com/products/lms/standards.html). Accessed<br />

March 30, 2009.<br />

[9] Sourceforge.net: APIS QTIv2 Assessment Engine (http://sourceforge.net/projects/apis). Accessed March 30, 2009.<br />

[10] AQuRate: A QTI-2.x Authoring Tool (http://aqurate.kingston.ac.uk/). Accessed March 30, 2009.<br />

[11] ASDEL: assessment delivery system for QTIv2 questions (http://www.asdel.ecs.soton.ac.uk/). Accessed March 30, 2009.<br />

[12] ATutor Learning Content Management System: Information (http://www.atutor.ca/atutor/). Accessed March 30, 2009.<br />

[13] Canvas Learning (http://www.canvaslearning.com). Accessed August, 2009.<br />

[14] CCReader project in Sourceforge (http://sourceforge.net/projects/ccreader). Accessed March 30, 2009.<br />

[15] Cognero: Cognero Features (http://www.cognero.com/features.html). Accessed February 19, 2009<br />

[16] Professional authoring tool content-e. (http://eng.content-e.nl/) Accessed July, 2009.<br />

[17] iBoard content available in DB Primary (http://www.e2bn.org/services/120/iboard-content-available-in-db-primary.html). Accessed<br />

March 30, 2009.<br />

[18] DB Primary's own Technical Overview (http://www.getprimary.com/tech_spec.html) does not mention QTI.<br />

[19] Diploma 6 (Windows) Release Notes (6.61 (Build 0087 - 8/8/2008)) (http://www.brownstone.net/support/Dip6-ReleaseNotes.asp).<br />

Accessed March 30, 2009.<br />

[20] Dokeos code (no other reference available) (http://dokeos.svn.sourceforge.net/viewvc/dokeos/trunk/dokeos/main/exercice/export/)<br />

[21] Elques: Elques Features (http://elques.bps-system.de/en/?Features). Accessed March 30, 2009.<br />

[22] Elques: Elques 2.0 (http://elques.bps-system.de/) (in German). Accessed<br />

September 30, 2009.<br />

[23] it's learning: Importing and exporting (https://www.itslearning.com/Ntt/Help/en-GB/Default_Left.htm#StartTopic=Adding). Accessed<br />

June 19, 2009.<br />

[24] ILIAS France (http://ilias-france.info/ilias.htm). Accessed March 30, 2009.<br />

[25] Lectora Supports eLearning Standards (http://www.trivantis.com/products/elearningstandards.html). Accessed March 30, 2009.<br />

[26] Mathqurate: Maths-enabled QTI-2.1 item authoring (http://aqurate.kingston.ac.uk/mathqurate/). Accessed April 3, 2009.<br />

[27] Development:Question engine - MoodleDocs (http://docs.moodle.org/en/Question_engine). Accessed March 30, 2009.<br />

[28] OLAT Feature List and Some Screenshots (http://www.olat.org/website/en/html/about_features.html). Accessed March 30, 2009.<br />

[29] Onyx Feature List and more Infos (http://onyx.bps-system.de/en/?Features). Accessed March 30, 2009.<br />

[30] OWL Test Conversion Service (http://www.owlts.com/test-conversion.html). Accessed March 30, 2009.<br />

[31] SourceForge.net: QTItools (http://sourceforge.net/projects/qtitools/). Accessed March 30, 2009.<br />

[32] Paul Neve: " Spectatus - QTI 2.1 test authoring tool (http://lists.ucles.org.uk/public/ims-qti/2010-February/001571.html)", IMS-QTI<br />

mailing list, February 26, 2010. Accessed April 14, 2010.<br />

[33] Questionmark - Windows Based Authoring - Question Types (http://www.questionmark.com/us/perception/<br />

authoring_windows_qm_qtypes.aspx). Accessed March 30, 2009.<br />

[34] Publisher's Legacy Software Page (http://www.questionwriter.com/pricing/custom-development.html). Accessed March 31, 2009.<br />

[35] Question Writer 2.0 Publisher Edition Manual (http://downloads.centralquestion.com/QuestionWriterManual.pdf). Accessed March 31,<br />

2009.<br />

[36] Question Writer Blog Announcement (http://www.questionwriterblog.com/archives/2009/05/question_writer_34.html). Accessed May<br />

18, 2009.<br />

[37] Question Writer Features Description (http://www.questionwriter.com/features.html). Accessed May 18, 2009.<br />

[38] Question Writer Blog Entry on Feature (http://www.questionwriterblog.com/archives/2009/06/qti_for_pearson_vue.html). Accessed<br />

July 29, 2009.<br />

[39] Respondus Plug-in for Moodle (http://www.respondus.com/update/2007-11-c.shtml). Accessed March 30, 2009.<br />

[40] The Respondus Version 3.5 page (http://www.respondus.com/products/respondus.shtml) does not mention the QTI version.<br />

[41] RM: Test Authoring System (http://www.rm.com/generic.asp?cref=GP1002551). Accessed March 31, 2009.



[42] Sakai: SAMigo/Test and Quizzes (http://bugs.sakaiproject.org/confluence/display/SAM/Home). Accessed March 30, 2009.<br />

[43] SToMP: An Overview (http://www.stomp.ac.uk/). Accessed March 31, 2009.<br />

[44] Studywiz QT Assessment (http://www.europe.studywiz.com/?page_id=72). Accessed April 03, 2009.<br />

[45] Wimba Create Brochure (http://www.wimba.com/assets/resources/wimbaCrBrochure_HE.pdf). Accessed March 30, 2009.<br />

[46] QTI Migration Tool (http://qtitools.caret.cam.ac.uk/index.php?option=com_docman&task=cat_view&gid=18&Itemid=28). Accessed<br />

March 30, 2009.<br />

[47] http://www.imsglobal.org/question<br />

[48] http://www.toia.ac.uk<br />

[49] http://qtitools.caret.cam.ac.uk/<br />

[50] http://jisc.cetis.ac.uk/domain/assessment<br />

[51] http://wiki.cetis.ac.uk/Assessment_tools%2C_projects_and_resources<br />

[52] http://lists.ucles.org.uk/lists/listinfo/ims-qti



Resource Description Framework<br />

Current Status Published<br />

Editors Frank Manola, Eric Miller<br />

Base Standards <strong>XML</strong>, URI<br />

Related Standards RDFS, OWL<br />

Domain Semantic Web<br />

Abbreviation RDF<br />

Website RDF Primer [1]<br />

The Resource Description Framework (RDF) is a family of World Wide Web Consortium (W3C) specifications<br />

originally designed as a metadata data model. It has come to be used as a general method for conceptual description<br />

or modeling of information that is implemented in web resources, using a variety of syntax formats.<br />

Overview<br />

The RDF data model [2] is similar to classic conceptual modeling approaches such as Entity-Relationship or Class<br />

diagrams, as it is based upon the idea of making statements about resources (in particular Web resources) in the form<br />

of subject-predicate-object expressions. These expressions are known as triples in RDF terminology. The subject<br />

denotes the resource, and the predicate denotes traits or aspects of the resource and expresses a relationship between<br />

the subject and the object. For example, one way to represent the notion "The sky has the color blue" in RDF is as<br />

the triple: a subject denoting "the sky", a predicate denoting "has the color", and an object denoting "blue". RDF is<br />

an abstract model with several serialization formats (i.e., file formats), and so the particular way in which a resource<br />

or triple is encoded varies from format to format.<br />
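
As a minimal sketch of the triple idea, the "sky" statement can be held in plain Python data. The `Triple` type, the `objects` helper and the `http://example.org/` namespace are all hypothetical names chosen for illustration; they are not taken from any RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


EX = "http://example.org/"  # hypothetical namespace for the illustration

# "The sky has the color blue"
sky_is_blue = Triple(EX + "sky", EX + "hasColor", "blue")

# A graph is simply a set of such statements.
graph = {sky_is_blue}


def objects(graph, subject, predicate):
    """Return every object asserted for the given subject and predicate."""
    return {t.obj for t in graph
            if t.subject == subject and t.predicate == predicate}


print(objects(graph, EX + "sky", EX + "hasColor"))  # {'blue'}
```

This in-memory form is independent of any serialization, which mirrors the distinction the text draws between the abstract model and its file formats.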

This mechanism for describing resources is a major component in what is proposed by the W3C's Semantic Web<br />

activity: an evolutionary stage of the World Wide Web in which automated software can store, exchange, and use<br />

machine-readable information distributed throughout the Web, in turn enabling users to deal with the information<br />

with greater efficiency and certainty. RDF's simple data model and ability to model disparate, abstract concepts have<br />

also led to its increasing use in knowledge management applications unrelated to Semantic Web activity.<br />

A collection of RDF statements intrinsically represents a labeled, directed multi-graph. As such, an RDF-based data<br />

model is more naturally suited to certain kinds of knowledge representation than the relational model and other<br />

ontological models. However, in practice, RDF data is often persisted in relational databases or in native stores<br />

called triplestores, or quad stores if context (i.e. the named graph) is also persisted for each RDF triple. [3] As<br />

RDFS and OWL demonstrate, additional ontology languages can be built upon RDF.<br />

History<br />

There were several ancestors to the W3C's RDF. Technically the closest was MCF, a project initiated by<br />

Ramanathan V. Guha while at Apple Computer and continued, with contributions from Tim Bray, during his tenure<br />

at Netscape Communications Corporation. Ideas from the Dublin Core community, and from PICS, the Platform for<br />

Internet Content Selection (the W3C's early Web content labelling system) were also key in shaping the direction of<br />

the RDF project.<br />

The W3C published a specification of RDF's data model and <strong>XML</strong> syntax as a Recommendation in 1999. [4] Work<br />

then began on a new version that was published as a set of related specifications in 2004. While there are a few



implementations based on the 1999 Recommendation that have yet to be completely updated, adoption of the<br />

improved specifications has been rapid since they were developed in full public view, unlike some earlier<br />

technologies of the W3C. Most newcomers to RDF are unaware that the older specifications even exist.<br />

RDF Topics<br />

RDF Vocabulary<br />

The vocabulary defined by the RDF specification is:<br />

• rdf:type - a predicate used to state that a resource is an instance of a class<br />

• rdf:XMLLiteral - the class of XML literal values<br />

• rdf:Property - the class of properties<br />

• rdf:Alt, rdf:Bag, rdf:Seq - containers of alternatives, unordered containers, and ordered containers (rdfs:Container<br />

is a super-class of the three)<br />

• rdf:List - the class of RDF Lists<br />

• rdf:nil - an instance of rdf:List representing the empty list<br />

• rdf:Statement, rdf:subject, rdf:predicate, rdf:object – used for reification (see below)<br />

This vocabulary is used as a foundation for RDF Schema where it is extended.<br />

Serialization formats<br />

Two common serialization formats are in use.<br />

The first is an <strong>XML</strong> format. This format is often called simply RDF because it was introduced among the other W3C<br />

specifications defining RDF. However, it is important to distinguish the <strong>XML</strong> format from the abstract RDF model<br />

itself. Its MIME media type, application/rdf+xml, was registered by RFC 3870, which recommends that RDF documents<br />

follow the 2004 specifications.<br />

In addition to serializing RDF as <strong>XML</strong>, the W3C introduced Notation 3 (or N3) as a non-<strong>XML</strong> serialization of RDF<br />

models designed to be easier to write by hand, and in some cases easier to follow. Because it is based on a tabular<br />

notation, it makes the underlying triples encoded in the documents more easily recognizable compared to the <strong>XML</strong><br />

serialization. N3 is closely related to the Turtle and N-Triples formats.<br />
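
To illustrate why line-oriented notations make triples easy to recognize, here is a small sketch that writes triples in the style of N-Triples, one statement per line. The tagged-tuple term representation and both function names are assumptions made for this example, and the escaping covers only backslash, double quote and newline rather than the full set required by the format.

```python
def term_to_ntriples(term):
    """Render one term; a term is a (kind, value) pair (assumed representation)."""
    kind, value = term
    if kind == "uri":
        return f"<{value}>"
    if kind == "bnode":
        return f"_:{value}"
    if kind == "literal":
        # Simplified escaping: backslash, double quote and newline only.
        escaped = (value.replace("\\", "\\\\")
                        .replace('"', '\\"')
                        .replace("\n", "\\n"))
        return f'"{escaped}"'
    raise ValueError(f"unknown term kind: {kind!r}")


def to_ntriples(triples):
    """Serialize triples as one 'subject predicate object .' line each."""
    return "\n".join(
        " ".join(term_to_ntriples(t) for t in triple) + " ."
        for triple in triples
    )


sky = (("uri", "http://example.org/sky"),
       ("uri", "http://example.org/hasColor"),
       ("literal", "blue"))
print(to_ntriples([sky]))
# <http://example.org/sky> <http://example.org/hasColor> "blue" .
```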

Triples may be stored in a triplestore.<br />

Resource identification<br />

The subject of an RDF statement is either a Uniform Resource Identifier (URI) or a blank node, both of which<br />

denote resources. Resources indicated by blank nodes are called anonymous resources. They are not directly<br />

identifiable from the RDF statement. The predicate is a URI which also indicates a resource, representing a<br />

relationship. The object is a URI, blank node or a Unicode string literal.<br />
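
The position rules in the paragraph above can be written down as a small check. The three term classes and the function `is_valid_triple` are hypothetical names for this sketch; the rules themselves are the ones stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    text: str


def is_valid_triple(subject, predicate, obj):
    """Apply the position rules described in the text.

    subject:   URI or blank node
    predicate: URI only
    object:    URI, blank node or literal
    """
    return (isinstance(subject, (URI, BlankNode))
            and isinstance(predicate, URI)
            and isinstance(obj, (URI, BlankNode, Literal)))


name = URI("http://example.org/name")  # hypothetical predicate URI
print(is_valid_triple(BlankNode("b0"), name, Literal("Eric")))  # True
print(is_valid_triple(Literal("Eric"), name, BlankNode("b0")))  # False
```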

In Semantic Web applications, and in relatively popular applications of RDF like RSS and FOAF (Friend of a<br />

Friend), resources tend to be represented by URIs that intentionally denote, and can be used to access, actual data on<br />

the World Wide Web. But RDF, in general, is not limited to the description of Internet-based resources. In fact, the<br />

URI that names a resource does not have to be dereferenceable at all. For example, a URI that begins with "http:"<br />

and is used as the subject of an RDF statement does not necessarily have to represent a resource that is accessible via<br />

HTTP, nor does it need to represent a tangible, network-accessible resource — such a URI could represent<br />

absolutely anything. However, there is broad agreement that a bare URI (without a # symbol) which returns a<br />

300-level coded response when used in an HTTP GET request should be treated as denoting the internet resource that it<br />

succeeds in accessing.



Therefore, producers and consumers of RDF statements must agree on the semantics of resource identifiers. Such<br />

agreement is not inherent to RDF itself, although there are some controlled vocabularies in common use, such as<br />

Dublin Core Metadata, which is partially mapped to a URI space for use in RDF. The intent of publishing<br />

RDF-based ontologies on the Web is often to establish, or circumscribe, the intended meanings of the resource<br />

identifiers used to express data in RDF. For example, the URI<br />

http://www.w3.org/TR/2004/REC-owl-guide-20040210/wine#merlot is intended by its owners to refer to the class of all Merlot red wines, an<br />

intent which is expressed by the OWL ontology — itself an RDF document — in which it occurs. Note that this is<br />

not a 'bare' resource identifier, but is rather a URI reference, containing the '#' character and ending with a fragment<br />

identifier.<br />

Statement reification and context<br />

The body of knowledge modeled by a collection of statements may be subjected to reification, in which each<br />

statement (that is each triple subject-predicate-object altogether) is assigned a URI and treated as a resource about<br />

which additional statements can be made, as in "Jane says that John is the author of document X". Reification is<br />

sometimes important in order to deduce a level of confidence or degree of usefulness for each statement.<br />

In a reified RDF database, each original statement, being itself a resource, most likely has at least three additional<br />

statements made about it: one to assert that its subject is some resource, one to assert that its predicate is some<br />

resource, and one to assert that its object is some resource or literal. More statements about the original statement<br />

may also exist, depending on the application's needs.<br />
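
A sketch of the bookkeeping described above: given a statement and a URI assigned to it, produce the extra statements that describe it. The function name `reify` and the `http://example.org/` URIs are hypothetical; the `rdf:` terms are the reification vocabulary listed earlier in this article. The sketch also emits an `rdf:type rdf:Statement` triple in addition to the three mentioned in the text.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/"  # hypothetical namespace for the illustration


def reify(statement_uri, triple):
    """Return the triples that describe `triple` as a resource."""
    subject, predicate, obj = triple
    return [
        (statement_uri, RDF + "type", RDF + "Statement"),
        (statement_uri, RDF + "subject", subject),
        (statement_uri, RDF + "predicate", predicate),
        (statement_uri, RDF + "object", obj),
    ]


# "John is the author of document X"
original = (EX + "documentX", EX + "author", EX + "john")
stmt = EX + "statement1"
described = reify(stmt, original)

# "Jane says that ..." is now an ordinary statement about the statement.
jane_says = (EX + "jane", EX + "says", stmt)

for t in described + [jane_says]:
    print(t)
```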

Borrowing from concepts available in logic (and as illustrated in graphical notations such as conceptual graphs and<br />

topic maps), some RDF model implementations acknowledge that it is sometimes useful to group statements<br />

according to different criteria, called situations, contexts, or scopes, as discussed in articles by RDF specification<br />

co-editor Graham Klyne [5] [6] . For example, a statement can be associated with a context, named by a URI, in order<br />

to assert an "is true in" relationship. As another example, it is sometimes convenient to group statements by their<br />

source, which can be identified by a URI, such as the URI of a particular RDF/<strong>XML</strong> document. Then, when updates<br />

are made to the source, corresponding statements can be changed in the model, as well.<br />

Implementation of scopes does not necessarily require fully reified statements. Some implementations allow a single<br />

scope identifier to be associated with a statement that has not itself been assigned a URI [7] [8]. Likewise, named<br />

graphs, in which a set of triples is named by a URI, can represent context without the need to reify the triples. [9]<br />
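
Grouping statements by their source, as described above, can be sketched with a toy store that keeps a context URI next to each triple. The class `QuadStore`, its method names and the example URIs are hypothetical; real quad stores offer far richer interfaces.

```python
from collections import defaultdict


class QuadStore:
    """Toy store that keeps each triple together with a context URI."""

    def __init__(self):
        self._by_context = defaultdict(set)

    def add(self, triple, context):
        self._by_context[context].add(triple)

    def replace_context(self, context, triples):
        """Swap in fresh statements after the source document has changed."""
        self._by_context[context] = set(triples)

    def triples(self, context=None):
        """Return triples from one context, or from all contexts if None."""
        if context is not None:
            return set(self._by_context.get(context, ()))
        return set().union(*self._by_context.values())


DOC_A = "http://example.org/a.rdf"  # hypothetical source documents
DOC_B = "http://example.org/b.rdf"

store = QuadStore()
store.add(("ex:doc", "ex:author", "ex:john"), DOC_A)
store.add(("ex:doc", "ex:title", "X"), DOC_B)

# Source A was updated: only the statements that came from it are replaced.
store.replace_context(DOC_A, [("ex:doc", "ex:author", "ex:jane")])
print(store.triples(DOC_A))  # {('ex:doc', 'ex:author', 'ex:jane')}
```

No statement is reified here; the context is carried as a fourth component next to the triple.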

Query and inference languages<br />

The predominant query language for RDF graphs is SPARQL. SPARQL is an SQL-like language, and a<br />

recommendation of the W3C as of January 15, 2008.<br />

An example of a SPARQL query to show country capitals in Africa, using a fictional ontology.<br />

PREFIX abc: .<br />

SELECT ?capital ?country<br />

WHERE {<br />

?x abc:cityname ?capital ;<br />

abc:isCapitalOf ?y.<br />

?y abc:countryname ?country ;<br />

abc:isInContinent abc:Africa.<br />

}<br />
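To show what such a query asks of the graph, the sketch below evaluates the same four triple patterns against a toy in-memory list of triples. It is a hedged illustration only: it is not a SPARQL engine, and the data, the identifiers such as city:1, and the functions match and query are assumptions made for this example.<br />

```python
# Toy triple store; every identifier and value here is hypothetical.
TRIPLES = [
    ("city:1", "abc:cityname", "Nairobi"),
    ("city:1", "abc:isCapitalOf", "country:1"),
    ("country:1", "abc:countryname", "Kenya"),
    ("country:1", "abc:isInContinent", "abc:Africa"),
    ("city:2", "abc:cityname", "Paris"),
    ("city:2", "abc:isCapitalOf", "country:2"),
    ("country:2", "abc:countryname", "France"),
    ("country:2", "abc:isInContinent", "abc:Europe"),
]


def match(pattern, triple, bindings):
    """Unify one pattern with one triple; return extended bindings or None."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def query(patterns, triples):
    """Return every set of variable bindings that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for bindings in solutions:
            for triple in triples:
                candidate = match(pattern, triple, bindings)
                if candidate is not None:
                    extended.append(candidate)
        solutions = extended
    return solutions


patterns = [
    ("?x", "abc:cityname", "?capital"),
    ("?x", "abc:isCapitalOf", "?y"),
    ("?y", "abc:countryname", "?country"),
    ("?y", "abc:isInContinent", "abc:Africa"),
]
results = [(b["?capital"], b["?country"]) for b in query(patterns, TRIPLES)]
print(results)
```

The shared variables ?x and ?y are what join the patterns together, in the same way that the semicolons in the query reuse one subject across several predicates.<br />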

Other ways to query RDF graphs include:<br />

• RDQL, precursor to SPARQL, SQL-like<br />

• Versa, compact syntax (non–SQL-like), solely implemented in 4Suite (Python)



• RQL, one of the first declarative languages for uniformly querying RDF schemas and resource descriptions,<br />

implemented in RDFSuite.<br />

• XUL has a template [10] element in which to declare rules for matching data in RDF. XUL uses RDF extensively<br />

for databinding.<br />

Examples<br />

Example 1: RDF Description of a person named Eric Miller [11]<br />

Here is an example taken from the W3C website [11] describing a resource with statements "there is a Person<br />

identified by http://www.w3.org/People/EM/contact#me, whose name is Eric Miller, whose email address is<br />

em@w3.org, and whose title is Dr.".<br />

Figure: An RDF Graph Describing Eric Miller [11]<br />

The resource "http://www.w3.org/People/EM/contact#me" is the subject. The objects are: (i) "Eric Miller" (with a<br />

predicate "whose name is"), (ii) em@w3.org (with a predicate "whose email address is"), and (iii) "Dr." (with a<br />

predicate "whose title is"). The subject is a URI. The predicates also have URIs. For example, the URI for the predicate:<br />

(i) "whose name is" is http://www.w3.org/2000/10/swap/pim/contact#fullName, (ii) "whose email address is" is<br />

http://www.w3.org/2000/10/swap/pim/contact#mailbox, (iii) "whose title is" is<br />

http://www.w3.org/2000/10/swap/pim/contact#personalTitle. In addition, the subject has a type (with URI<br />

http://www.w3.org/1999/02/22-rdf-syntax-ns#type), which is person (with URI<br />

http://www.w3.org/2000/10/swap/pim/contact#Person), and a mailbox (with URI<br />

http://www.w3.org/2000/10/swap/pim/contact#mailbox). Therefore, the following "subject, predicate, object" RDF<br />

triples can be expressed:<br />

(i) http://www.w3.org/People/EM/contact#me, http://www.w3.org/2000/10/swap/pim/contact#fullName,<br />

"Eric Miller"<br />

(ii) http://www.w3.org/People/EM/contact#me, http://www.w3.org/2000/10/swap/pim/contact#personalTitle,<br />

"Dr."<br />

(iii) http://www.w3.org/People/EM/contact#me, http://www.w3.org/1999/02/22-rdf-syntax-ns#type,<br />

http://www.w3.org/2000/10/swap/pim/contact#Person<br />

(iv) http://www.w3.org/People/EM/contact#me, http://www.w3.org/2000/10/swap/pim/contact#mailbox,<br />

em@w3.org<br />

Example 2: The postal abbreviation for New York<br />

Certain concepts in RDF are taken from logic and linguistics, where subject-predicate and subject-predicate-object<br />

structures have meanings similar to, yet distinct from, the uses of those terms in RDF. This example demonstrates:<br />

In the English language statement 'New York has the postal abbreviation NY' , 'New York' would be the subject, 'has<br />

the postal abbreviation' the predicate and 'NY' the object.<br />

Encoded as an RDF triple, the subject and predicate would have to be resources named by URIs. The object could be<br />

a resource or literal element. For example, in the Notation 3 form of RDF, the statement might look like:<br />

&lt;urn:x-states:New%20York&gt; &lt;http://purl.org/dc/terms/alternative&gt; "NY" .<br />

In this example, "urn:x-states:New%20York" is the URI for a resource that denotes the U.S. state New York,<br />

"http://purl.org/dc/terms/alternative" is the URI for a predicate (whose human-readable definition can be found<br />

here [12] ), and "NY" is a literal string. Note that the URIs chosen here are not standard, and don't need to be, as long<br />

as their meaning is known to whatever is reading them.<br />

N-Triples is just one of several standard serialization formats for RDF. The triple above can also be equivalently<br />

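represented in the standard RDF/<strong>XML</strong> format as:<br />

&lt;rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dcterms="http://purl.org/dc/terms/"&gt;<br />

&lt;rdf:Description rdf:about="urn:x-states:New%20York"&gt;<br />

&lt;dcterms:alternative&gt;NY&lt;/dcterms:alternative&gt;<br />

&lt;/rdf:Description&gt;<br />

&lt;/rdf:RDF&gt;<br />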

However, because of the restrictions on the syntax of QNames (such as dcterms:alternative above), there are some<br />

RDF graphs that are not representable with RDF/<strong>XML</strong>.<br />
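The QName restriction can be illustrated with a short Python sketch that serializes a single triple as RDF/XML using the standard library. The function triple_to_rdfxml and the simplified name check are assumptions made for this example; the real rules for XML names are broader than the regular expression used here.<br />

```python
import re
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCTERMS_NS = "http://purl.org/dc/terms/"

# Simplified approximation of a legal XML local name (an assumption of this sketch).
LOCAL_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*\Z")


def triple_to_rdfxml(subject, predicate, literal):
    """Serialize one triple with a literal object as RDF/XML text."""
    # RDF/XML writes the predicate as an element name, so the URI must be
    # split into a namespace part and a local name that is a legal XML name.
    cut = max(predicate.rfind("/"), predicate.rfind("#")) + 1
    namespace, local = predicate[:cut], predicate[cut:]
    if not LOCAL_NAME.match(local):
        raise ValueError("predicate cannot be written as a QName: " + predicate)
    root = ET.Element("{%s}RDF" % RDF_NS)
    description = ET.SubElement(
        root, "{%s}Description" % RDF_NS, {"{%s}about" % RDF_NS: subject}
    )
    prop = ET.SubElement(description, "{%s}%s" % (namespace, local))
    prop.text = literal
    return ET.tostring(root, encoding="unicode")


ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dcterms", DCTERMS_NS)

xml_text = triple_to_rdfxml(
    "urn:x-states:New%20York", "http://purl.org/dc/terms/alternative", "NY"
)
print(xml_text)

# A predicate whose last segment starts with a digit has no usable local name.
try:
    triple_to_rdfxml("urn:example:s", "http://example.org/123", "x")
except ValueError as error:
    print(error)
```

The second call fails because "123" cannot serve as an element name, which is the kind of graph that RDF/<strong>XML</strong> cannot represent.<br />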

Example 3: A Wikipedia article about Tony Benn<br />

In a like manner, given that "http://en.wikipedia.org/wiki/Tony_Benn" identifies a particular resource (regardless of<br />

whether that URI could be traversed as a hyperlink, or whether the resource is actually the Wikipedia article about<br />

Tony Benn), to say that the title of this resource is "Tony Benn" and its publisher is "Wikipedia" would be two<br />

assertions that could be expressed as valid RDF statements. In the N-Triples form of RDF, these statements might<br />

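look like the following:<br />

&lt;http://en.wikipedia.org/wiki/Tony_Benn&gt; &lt;http://purl.org/dc/elements/1.1/title&gt; "Tony Benn" .<br />

&lt;http://en.wikipedia.org/wiki/Tony_Benn&gt; &lt;http://purl.org/dc/elements/1.1/publisher&gt; "Wikipedia" .<br />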

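And these statements might be expressed in RDF/<strong>XML</strong> as:<br />

&lt;rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/"&gt;<br />

&lt;rdf:Description rdf:about="http://en.wikipedia.org/wiki/Tony_Benn"&gt;<br />

&lt;dc:title&gt;Tony Benn&lt;/dc:title&gt;<br />

&lt;dc:publisher&gt;Wikipedia&lt;/dc:publisher&gt;<br />

&lt;/rdf:Description&gt;<br />

&lt;/rdf:RDF&gt;<br />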

To an English-speaking person, the same information could be represented simply as:<br />

The title of this resource, which is published by Wikipedia, is 'Tony Benn'<br />

However, RDF puts the information in a formal way that a machine can understand. The purpose of RDF is to<br />

provide an encoding and interpretation mechanism so that resources can be described in a way that particular<br />

software can understand; in other words, so that software can access and use information that it otherwise couldn't<br />

use.<br />

Both versions of the statements above are wordy because one requirement for an RDF resource (as a subject or a<br />

predicate) is that it be unique. The subject resource must be unique in an attempt to pinpoint the exact resource being<br />

described. The predicate needs to be unique in order to reduce the chance that the idea of Title or Publisher will be<br />

ambiguous to software working with the description. If the software recognizes http://purl.org/dc/elements/1.1/title<br />

(a specific definition for the concept of a title established by the Dublin Core Metadata Initiative), it will also know<br />

that this title is different from a land title or an honorary title or just the letters t-i-t-l-e put together.



The following example shows how such simple claims can be elaborated on, by combining multiple RDF<br />

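vocabularies. Here, we note that the primary topic of the Wikipedia page is a "Person" whose name is "Tony Benn":<br />

&lt;rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"<br />

xmlns:foaf="http://xmlns.com/foaf/0.1/"<br />

xmlns:dc="http://purl.org/dc/elements/1.1/"&gt;<br />

&lt;rdf:Description rdf:about="http://en.wikipedia.org/wiki/Tony_Benn"&gt;<br />

&lt;dc:title&gt;Tony Benn&lt;/dc:title&gt;<br />

&lt;dc:publisher&gt;Wikipedia&lt;/dc:publisher&gt;<br />

&lt;foaf:primaryTopic&gt;<br />

&lt;foaf:Person&gt;<br />

&lt;foaf:name&gt;Tony Benn&lt;/foaf:name&gt;<br />

&lt;/foaf:Person&gt;<br />

&lt;/foaf:primaryTopic&gt;<br />

&lt;/rdf:Description&gt;<br />

&lt;/rdf:RDF&gt;<br />

Applications<br />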

• Sigma [13] - Application from DERI at the National University of Ireland, Galway (NUIG).<br />

• Creative Commons - Uses RDF to embed license information in web pages and mp3 files.<br />

• DOAC (Description of a Career) - supplements FOAF to allow the sharing of résumé information.<br />

• FOAF (Friend of a Friend) - designed to describe people, their interests and interconnections.<br />

• Haystack client - Semantic web browser from MIT CS & AI lab. [14]<br />

• IDEAS Group - developing a formal 4D Ontology for Enterprise Architecture using RDF as the encoding. [15]<br />

• Microsoft shipped a product, Connected Services Framework [16], which provides RDF-based Profile Management<br />

capabilities.<br />

• MusicBrainz - Publishes information about Music Albums. [17]<br />

• NEPOMUK, an open-source software specification for a Social Semantic desktop, uses RDF as a storage format<br />

for collected metadata. NEPOMUK is mostly known for its integration into the KDE4 desktop<br />

environment.<br />

• RDF Site Summary - one of several "RSS" languages for publishing information about updates made to a web<br />

page; it is often used for disseminating news article summaries and sharing weblog content.<br />

• Simple Knowledge Organization System (SKOS) - a knowledge representation (KR) intended to support<br />

vocabulary/thesaurus applications<br />

• SIOC (Semantically-Interlinked Online Communities) - designed to describe online communities and to create<br />

connections between Internet-based discussions from message boards, weblogs and mailing lists. [18]<br />

• Smart-M3 - provides an infrastructure for using RDF and specifically uses the ontology agnostic nature of RDF to<br />

enable heterogeneous mashing-up of information [19]<br />

• Many other RDF schemas are available by searching SchemaWeb. [20]<br />

Some uses of RDF include research into social networking. This is important because it could help governments<br />

keep track of terrorist cells. It could also help people in business better understand their relationships with<br />

members of industries that could be of use for product placement [21], and help scientists understand how<br />

people are connected to one another.<br />

RDF is also being used to gain a better understanding of traffic patterns. Information about traffic is spread across<br />

different websites, and RDF is used to integrate information from these different sources on the web. Previously,<br />

the common approach was keyword searching, which is problematic because it does not consider synonyms;<br />

ontologies are useful in this situation. One difficulty in studying traffic efficiently is that concepts related to<br />

people, streets, and roads must be well understood. Since these are human concepts, they require the addition of<br />

fuzzy logic: values that are useful when describing roads, such as slipperiness, are not precise concepts and cannot<br />

be measured. This implies that the best solution would incorporate both fuzzy logic and ontology. [22]<br />

See also<br />

Notations for RDF<br />

• N3<br />

• N-Triples<br />

• TRiG<br />

• TRiX<br />

• Turtle<br />

• RDF/<strong>XML</strong><br />

• RDFa<br />

Ontology/vocabulary languages<br />

• OWL<br />

• SKOS<br />

• RDF schema<br />

Similar concepts<br />

• Entity-attribute-value model<br />

• Graph theory - An RDF model is a labeled, directed multi-graph.<br />

• Website Parse Template<br />

• Tagging<br />

• Topic Maps - Topic Maps is, in some ways, similar to RDF.<br />

• Semantic network<br />

Other (unsorted)<br />

• Associative model of data<br />

• Business Intelligence 2.0 (BI 2.0)<br />

• DataPortability<br />

• Folksonomy<br />

• GRDDL<br />

• Life Science Identifiers<br />

• Meta Content Framework<br />

• Semantic Web<br />

• Swoogle<br />

• Universal Networking <strong>Language</strong> (UNL)



Further reading<br />

• RDF at W3C [23] : specifications, guides, and resources<br />

• RDF Semantics [24] : specification of semantics, and complete systems of inference rules for both RDF and RDFS<br />

Tutorials and documents<br />

• Quick Intro to RDF [25]<br />

• RDF in Depth [26]<br />

• Introduction to the RDF Model [27]<br />

• What is RDF? [28]<br />

• An introduction to RDF [29]<br />

• RDF and XUL [30] , with examples.<br />

External links<br />

News and resources<br />

• Dave Beckett's RDF Resource Guide [31]<br />

• Resource Description Framework: According to W3C specifications and Mozilla's documentation [30]<br />

• RDF Datasources [32] : RDF datasources in Mozilla<br />

• The Finance Ontology [33] Semantic web application under construction.<br />

RDF software tools<br />

• Raptor RDF Parser Library [34]<br />

• Listing of RDF and OWL tools at W3C wiki [35]<br />

• SemWebCentral [36] Open Source semantic web tools<br />

• Intellidimension [37] Semantic web software and tools for Windows, .NET/C# and SQL Server<br />

• Listing of RDF software at xml.com [38]<br />

• Rhodonite [39] : freeware RDF editor and RDF browser with a drag-and-drop interface<br />

• D2R Server [40] : tool to publish relational databases as an RDF-graph<br />

• Virtuoso Universal Server: a SPARQL compliant platform for RDF data management, SQL-RDF integration, and<br />

RDF based Linked Data deployment<br />

• ROWLEX [41] : .NET library and toolkit built to create and browse RDF documents easily. It abstracts away the<br />

level of RDF triples and elevates the level of the programming work to (OWL) classes and properties.<br />

• AlchemyAPI [42] : web service API / SDK that converts unstructured text into RDF & Linked Data.<br />

• The Sweet Tools [43] listing of 800+ RDF and -related tools, most open source, and sortable by category and<br />

language (among other facets).<br />

RDF datasources<br />

• Wikipedia 3 [44] : System One's RDF conversion of the English Wikipedia, updated monthly<br />

• DBpedia: a Linking Open Data Community Project [45] that exposes an ever-increasing collection of RDF-based<br />

Linked Data sources<br />

• Semantic Systems Biology [46]



References<br />

[1] http://www.w3.org/TR/rdf-primer/<br />

[2] "Resource Description Framework (RDF) Model and Syntax Specification" (http://www.w3.org/TR/PR-rdf-syntax/)<br />

[3] Optimized Index Structures for Querying RDF from the Web (http://sw.deri.org/2005/02/dexa/yars.pdf) Andreas Harth, Stefan Decker,<br />

3rd Latin American Web Congress, Buenos Aires, Argentina, October 31 to November 2, 2005, pp. 71-80<br />

[4] W3C 1999 specification (http://www.w3.org/TR/rdf-syntax-grammar/)<br />

[5] Contexts for RDF Information Modelling (http://www.ninebynine.org/RDFNotes/RDFContexts.html)<br />

[6] Circumstance, Provenance and Partial Knowledge (http://www.ninebynine.org/RDFNotes/UsingContextsWithRDF.html)<br />

[7] The Concept of 4Suite RDF Scopes (http://uche.ogbuji.net/tech/akara/nodes/2003-01-01/scopes)<br />

[8] Redland RDF Library - Contexts (http://librdf.org/notes/contexts.html)<br />

[9] Named Graphs (http://www.w3.org/2004/03/trix/)<br />

[10] http://developer.mozilla.org/en/docs/XUL:Template_Guide:Introduction<br />

[11] "RDF Primer" (http://www.w3.org/TR/rdf-primer/). W3C. Retrieved 2009-03-13.<br />

[12] http://dublincore.org/documents/library-application-profile/index.shtml#Alternative<br />

[13] http://sig.ma/<br />

[14] Haystack (http://groups.csail.mit.edu/haystack/home.html)<br />

[15] The IDEAS Group Website (http://www.ideasgroup.org)<br />

[16] Connected Services Framework (http://www.microsoft.com/serviceproviders/solutions/connectedservicesframework.mspx)<br />

[17] RDF on MusicBrainz Wiki (http://wiki.musicbrainz.org/RDF)<br />

[18] SIOC (Semantically-Interlinked Online Communities) (http://sioc-project.org/)<br />

[19] Oliver Ian, Honkola Jukka, Ziegler Jurgen (2008). “Dynamic, Localized Space Based Semantic Webs”. IADIS WWW/Internet 2008.<br />

Proceedings, p.426, IADIS Press, ISBN 978-972-8924-68-3<br />

[20] SchemaWeb (http://www.schemaweb.info)<br />

[21] An RDF Approach for Discovering the Relevant Semantic Associations in a Social Network By Thushar A.K, and P. Santhi Thilagam<br />

[22] Traffic Information Retrieval Based on Fuzzy Ontology and RDF on the Semantic Web By Jun Zhai, Yi Yu, Yiduo Liang, and Jiatao Jiang<br />

(2008)<br />

[23] http://www.w3.org/RDF/<br />

[24] http://www.w3.org/TR/2004/REC-rdf-mt-20040210/<br />

[25] http://rdfabout.com/quickintro.xpd<br />

[26] http://rdfabout.com/intro/<br />

[27] http://www.xulplanet.com/tutorials/mozsdk/rdfstart.php<br />

[28] http://www.xml.com/pub/a/2001/01/24/rdf.html<br />

[29] http://www-128.ibm.com/developerworks/library/w-rdf/<br />

[30] http://www.xul.fr/en-xml-rdf.html<br />

[31] http://planetrdf.com/guide/<br />

[32] http://xulplanet.com/tutorials/mozsdk/rdfsources.php<br />

[33] http://www.fadyart.com/ontology.html<br />

[34] http://librdf.org/raptor/<br />

[35] http://esw.w3.org/topic/SemanticWebTools<br />

[36] http://projects.semwebcentral.org/<br />

[37] http://www.intellidimension.com/<br />

[38] http://www.xml.com/pub/rg/RDF_Software<br />

[39] http://rhodonite.angelite.nl<br />

[40] http://sites.wiwiss.fu-berlin.de/suhl/bizer/d2r-server/<br />

[41] http://rowlex.nc3a.nato.int<br />

[42] http://www.alchemyapi.com/api/entity/ldata.html<br />

[43] http://www.mkbergman.com/new-version-sweet-tools-sem-web/<br />

[44] http://labs.systemone.at/wikipedia3<br />

[45] http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData<br />

[46] http://www.semantic-systems-biology.org


Resources of a Resource 98<br />

Resources of a Resource<br />

Resources of a Resource (ROR) is an <strong>XML</strong> format for describing the content of an internet resource or website in a<br />

generic fashion so this content can be better understood by search engines, spiders, web applications, etc. The ROR<br />

format provides several pre-defined terms for describing objects like sitemaps, products, events, reviews, jobs,<br />

classifieds, etc. The format can be extended with custom terms.<br />

RORweb.com [1] is the official website of ROR; the ROR format was created by AddMe.com [2] as a way to help<br />

search engines better understand content and meaning. Similar concepts, like Google Sitemaps and Google Base,<br />

have also been developed since the introduction of the ROR format.<br />

ROR objects are placed in an ROR feed called ror.xml. This file is typically located in the root directory of the<br />

resource or website it describes. When a search engine like Google or Yahoo searches the web to determine how to<br />

categorize content, the ROR feed allows the search engine's "spider" to quickly identify all the content and attributes<br />

of the website.<br />

This has three main benefits:<br />

1. It allows the spider to correctly categorize the content of the website into its engine.<br />

2. It allows the spider to extract very detailed information about the objects on a website (sitemaps, products,<br />

events, reviews, jobs, classifieds, etc)<br />

3. It allows the website owner to optimize his site for inclusion of its content into the search engines.<br />

External links<br />

• RORweb.com [1]<br />

References<br />

[1] http://www.rorweb.com<br />

[2] http://www.AddMe.com


Reverse Ajax 99<br />

Reverse Ajax<br />

Reverse Ajax refers to an Ajax design pattern that uses long-lived HTTP connections to enable low-latency<br />

communication between a web server and a browser. Basically it is a way of sending data from client to server and a<br />

mechanism for pushing server data back to the browser. [1] [2]<br />

This server–client communication takes one of two forms:<br />

• Client polling: the client repeatedly queries (polls) the server and waits for an answer.<br />

• Server pushing: a connection between a server and client is kept open and the server sends data when available.<br />

Reverse Ajax describes the implementation of either of these models, or a combination of both. The design pattern is<br />

also known as Ajax Push, Full Duplex Ajax and Streaming Ajax.<br />

Examples<br />

The following is a simple example. Imagine we have 2 clients and 1 server, and client1 wants to send the message<br />

"hello" to every other client.<br />

With traditional Ajax (polling):<br />

• client1 sends the message "hello"<br />

• server receives the message "hello"<br />

• client2 polls the server<br />

• client2 receives the message "hello"<br />

• client1 polls the server<br />

• client1 receives the message "hello"<br />

With reverse Ajax (pushing):<br />

• client1 sends the message "hello"<br />

• server receives the message "hello"<br />

• server sends the message "hello" to all clients<br />

Less traffic is generated with Reverse Ajax and messages are transferred with less delay (low-latency).<br />
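The difference in traffic can be made concrete with a small in-process simulation. This is only a sketch under stated assumptions: no HTTP is involved, the classes PollingServer and PushServer are hypothetical, and the three polling rounds are an arbitrary choice, so the counts describe this toy run rather than any real deployment.<br />

```python
class PollingServer:
    """Clients must ask for new messages; every poll costs one request."""

    def __init__(self):
        self.messages = []
        self.requests = 0

    def send(self, text):
        self.requests += 1
        self.messages.append(text)

    def poll(self, since):
        self.requests += 1
        return self.messages[since:]


class PushServer:
    """Clients connect once; the server delivers messages as they arrive."""

    def __init__(self):
        self.inboxes = {}
        self.requests = 0

    def connect(self, client_name):
        self.requests += 1  # the long-lived connection is opened once
        self.inboxes[client_name] = []

    def send(self, text):
        self.requests += 1
        for inbox in self.inboxes.values():
            inbox.append(text)  # pushed over the connections that are already open


# Polling: both clients poll in each of three rounds; "hello" is sent in round 1.
polling = PollingServer()
seen = {"client1": 0, "client2": 0}
received = {"client1": [], "client2": []}
for round_number in range(3):
    if round_number == 1:
        polling.send("hello")
    for name in seen:
        new_messages = polling.poll(seen[name])
        received[name].extend(new_messages)
        seen[name] += len(new_messages)

# Pushing: both clients connect once, then the message is delivered to all.
push = PushServer()
push.connect("client1")
push.connect("client2")
push.send("hello")

print("polling requests:", polling.requests, received)
print("push requests:", push.requests, push.inboxes)
```

In the polling run most requests return nothing, and a message is only seen at the next poll; in the push run each client receives it as soon as the server has it.<br />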

External links<br />

• The Slow Load Technique/Reverse AJAX - Simulating Server Push in a Standard Web Browser [3]<br />

• Exploring Reverse Ajax [4]<br />

• Reverse Ajax with DWR (a Java Ajax framework) [5]<br />

• Changing the Web Paradigm - Moving from traditional Web applications to Streaming-AJAX [6]<br />

References<br />

[1] Crane, Dave; McCarthy, Phil (July 2008) (in English). Comet and Reverse Ajax: The Next Generation Ajax 2.0. Apress. ISBN 1590599985.<br />

[2] Martin, Katherine (2007-03-22). "Developing Applications using Reverse Ajax" (http://today.java.net/pub/a/today/2007/03/22/developing-applications-using-reverse-ajax.html). java.net, O'Reilly and CollabNet.<br />

[3] http://www.obviously.com/tech_tips/slow_load_technique<br />

[4] http://gmapsdotnetcontrol.blogspot.com/2006/08/exploring-reverse-ajax-ajax.html<br />

[5] http://ajaxian.com/archives/reverse-ajax-with-dwr<br />

[6] http://www.lightstreamer.com/Lightstreamer_Paradigm.pdf


Root element 100<br />

Root element<br />

Each <strong>XML</strong> document has exactly one single root element. This element is also known as the document element. It<br />

encloses all the other elements and is therefore the sole parent element to all the other elements.<br />

The World Wide Web Consortium defines not only the specifications for <strong>XML</strong> itself [1] , but also the DOM, which is<br />

a platform- and language-independent standard object model for representing <strong>XML</strong> documents. DOM Level 1<br />

defines, for every <strong>XML</strong> document, an object representation of the document itself and an attribute or property on the<br />

document called documentElement. This property provides access to an object of type element which directly<br />

represents the root element of the document [2] .<br />

&lt;rootElement&gt;<br />

content<br />

&lt;/rootElement&gt;<br />

There can be other <strong>XML</strong> nodes outside of the root element [3] , in particular the root element may be preceded by a<br />

prolog, which itself may consist of an <strong>XML</strong> declaration, optional comments, processing instructions and whitespace,<br />

followed by an optional DOCTYPE declaration and more optional comments, processing instructions and<br />

whitespace. After the document element there may be further optional comments, processing instructions and<br />

whitespace within the document [4] .<br />

Within the document element, apart from any number of attributes and other elements, there may also be more<br />

optional text, comments, processing instructions and whitespace.<br />
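As a rough illustration of the documentElement property and of nodes that sit outside the root element, the following Python sketch parses a small document with the standard library's minidom. The document text, the app-hint processing instruction and the id attribute are all made up for this example.<br />

```python
from xml.dom import minidom

# Hypothetical document: the prolog (declaration, comment, processing
# instruction) and a trailing comment sit outside the single root element.
SOURCE = """<?xml version="1.0"?>
<!-- a comment in the prolog -->
<?app-hint sample?>
<rootElement id="r1">
  <!-- a comment inside the root -->
  text
</rootElement>
<!-- a comment after the root -->
"""

doc = minidom.parseString(SOURCE)

# documentElement gives direct access to the one root element.
root = doc.documentElement

# The document node can have other children besides the root element.
top_level = [node.nodeName for node in doc.childNodes]

print(root.tagName)
print(top_level)
```

However many comments or processing instructions surround it, only one of the document's children is an element, and that is the node documentElement returns.<br />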

A more expanded example of an <strong>XML</strong> document follows, demonstrating some of these extra nodes along with a<br />

single rootElement element.<br />

<br />

<br />

<br />

<br />

<br />

<br />

text<br />

<br />



References<br />

[1] The current W3C <strong>XML</strong> 1.0 specification (http://www.w3.org/TR/xml/)<br />

[2] The 'documentElement' definition in the W3C DOM Level 1 specification (http://www.w3.org/TR/1998/REC-DOM-Level-1-19981001/<br />

level-one-core.html#i-Document)<br />

[3] The 'well-formed document' section of the W3C <strong>XML</strong> specification (http://www.w3.org/TR/2006/REC-xml-20060816/<br />

#sec-well-formed)<br />

[4] The 'prolog' section of the W3C <strong>XML</strong> specification (http://www.w3.org/TR/2006/REC-xml-20060816/#NT-prolog)<br />

Schematron<br />

In markup languages, Schematron is a rule-based validation language for making assertions about the presence or<br />

absence of patterns in <strong>XML</strong> trees. It is a structural schema language expressed in <strong>XML</strong> using a small number of<br />

elements and XPath.<br />

In a typical implementation, the Schematron schema <strong>XML</strong> is processed into normal XSLT code for deployment<br />

anywhere that XSLT can be used.<br />

Schematron is capable of expressing constraints in ways that XDR and DTD cannot. For example, it can require that<br />

the content of an element be controlled by one of its siblings. Or it can request or require that the root element,<br />

regardless of what element that is, must have specific attributes. Schematron can also specify required relationships<br />

between multiple <strong>XML</strong> files.<br />

Constraints and content rules may be associated with "plain-English" validation error messages. This may be<br />

preferred by some users who might otherwise have to cross-reference numeric error codes to understand what they<br />

mean.<br />

Uses<br />

Schematron's design, which expresses constraints through an XPath-based language that can be deployed as XSLT code,<br />

makes it practical for applications such as the following:<br />

Adjunct to Structural Validation<br />

by testing for co-occurrence constraints, non-regular constraints, and inter-document constraints, Schematron<br />

can extend the validations able to be expressed in languages such as DTDs, RELAX NG or <strong>XML</strong> Schema.<br />

Lightweight Business Rules Engine<br />

Schematron is not a comprehensive Rete rules engine, but it can be used to express rules about complex<br />

structures within an <strong>XML</strong> document.<br />

<strong>XML</strong> Editor Syntax Highlighting Rules<br />

<strong>XML</strong> Editors use Schematron rules to conditionally highlight <strong>XML</strong> files for errors.



Versions<br />

Schematron was invented by Rick Jelliffe at Academia Sinica Computing Centre, Taiwan. He described Schematron<br />

as "a feather duster to reach the parts other schema languages cannot reach".<br />

The most common versions of Schematron are:<br />

• Schematron 1.0 (1999)<br />

• Schematron 1.3 (2000): this version used the namespace http://xml.ascc.net/schematron/. It was supported by<br />

an XSLT implementation with a plug-in architecture.<br />

• Schematron 1.5 [1] (2001): this version was widely implemented and is still found.<br />

• Schematron 1.6 [2] (2002): this version was the base of ISO Schematron and was obsoleted by it.<br />

• ISO Schematron [16] (2006): this version regularizes several features, and provides an <strong>XML</strong> output format SVRL.<br />

It uses the new namespace http://purl.oclc.org/dsdl/schematron.<br />

• ISO Schematron (2010): this proposed version adds support for XSLT2 and arbitrary properties.<br />

Schematron as an ISO Standard<br />

Schematron has been standardized to become part of ISO/IEC 19757 - Document Schema Definition <strong>Language</strong>s<br />

(DSDL) - Part 3: Rule-based validation - Schematron.<br />

This standard is available free on the ISO Publicly Available Specifications [16] list. Paper versions may be<br />

purchased from ISO or national standards bodies.<br />

Schemas that use ISO/IEC FDIS 19757-3 should use the following namespace:<br />

http://purl.oclc.org/dsdl/schematron<br />

Sample Rule<br />

Schematron rules are very simple to create using a standard <strong>XML</strong> editor or XForms application. The following is a<br />

sample schema:<br />

<br />

<br />

Date rules<br />

<br />

ContractDate should be in the pa<br />

are not allowed.<br />

<br />

<br />

<br />

This rule checks to make sure that the ContractDate <strong>XML</strong> element has a date that is before the current date. If this<br />

rule fails the validation will fail and an error message which is the body of the assert element will be returned to the<br />

user.
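What the rule asserts can be sketched procedurally in Python. This is not Schematron and not how a Schematron processor works (a real schema would state the test in XPath and be compiled to XSLT); it only shows the check itself. The Contract wrapper element, the ISO date format, the fixed value of today and the wording of the message are assumptions of this sketch.<br />

```python
import datetime
import xml.etree.ElementTree as ET


def check_contract_dates(xml_text, today):
    """Return one message for each ContractDate that is not before `today`."""
    errors = []
    root = ET.fromstring(xml_text)
    for node in root.iter("ContractDate"):
        # Assumes dates are written as YYYY-MM-DD.
        value = datetime.date.fromisoformat(node.text.strip())
        if not value < today:
            errors.append("ContractDate %s is not in the past" % value)
    return errors


today = datetime.date(2010, 6, 17)  # fixed so the example is repeatable
past = "<Contract><ContractDate>2009-01-31</ContractDate></Contract>"
future = "<Contract><ContractDate>2011-01-31</ContractDate></Contract>"

past_errors = check_contract_dates(past, today)
future_errors = check_contract_dates(future, today)
print(past_errors)
print(future_errors)
```

As with the assert element, an empty result means the document passed, and each failure carries a readable message rather than a numeric code.<br />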



Implementation<br />

Schematron source files are usually transformed into XSLT files (using XSLT) and placed in an <strong>XML</strong> Pipeline. This<br />

allows workflow process designers to build and maintain rules using standard <strong>XML</strong> manipulation tools.<br />

For example an Apache Ant task can be used to convert Schematron rules into XSLT files.<br />

See also<br />

• <strong>XML</strong> Schema <strong>Language</strong> Comparison - Comparison to other <strong>XML</strong> Schema languages.<br />

• Service Modeling <strong>Language</strong> - Service Modeling <strong>Language</strong> uses Schematron.<br />

External links<br />

• ISO Schematron Home Page [3]<br />

• Academia Sinica Computing Centre's Schematron Home Page [4]<br />

• Schematron Wiki including Implementer's FAQ [5]<br />

References<br />

[1] http://xml.ascc.net/schematron/<br />

[2] http://xml.ascc.net/resource/schematron/Schematron2000.html<br />

[3] http://www.schematron.com<br />

[4] http://www.ascc.net/xml/resource/schematron/<br />

[5] http://www.eccnet.com/schematron/index.php/Main_Page<br />

Simple Outline <strong>XML</strong><br />

Simple Outline <strong>XML</strong> (SOX) is a compressed way of writing <strong>XML</strong>.<br />

SOX uses indenting to represent the structure of an <strong>XML</strong> document, eliminating the need for closing tags.<br />

Example<br />

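The following XHTML markup fragment:<br />

&lt;html xmlns="http://www.w3.org/1999/xhtml"&gt;<br />

&lt;head&gt;<br />

&lt;title&gt;Sample page&lt;/title&gt;<br />

&lt;/head&gt;<br />

&lt;body&gt;<br />

&lt;p&gt;A very brief page&lt;/p&gt;<br />

&lt;/body&gt;<br />

&lt;/html&gt;<br />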

... would appear in SOX as:<br />

html&gt;<br />

    xmlns=http://www.w3.org/1999/xhtml<br />

    head&gt;<br />

        title&gt; Sample page<br />

    body&gt;<br />

        p&gt; A very brief page<br />



SOX can be readily converted to <strong>XML</strong>.<br />
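A conversion along these lines can be sketched in Python. The sketch handles only the three line forms that appear in the example above (an element name followed by &gt;, an element with inline text, and a name=value attribute line) and relies on indentation for nesting; it is not a full SOX implementation, and the function names are hypothetical.<br />

```python
from xml.sax.saxutils import escape, quoteattr


def sox_to_xml(sox_text):
    """Convert a small subset of SOX (elements, attributes, inline text) to XML."""
    roots = []   # top-level nodes
    stack = []   # (indent, node) pairs for the currently open elements
    for raw in sox_text.splitlines():
        if not raw.strip():
            continue
        indent = len(raw) - len(raw.lstrip())
        line = raw.strip()
        # Any open element indented at least as far as this line is now closed.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        first_token = line.split(None, 1)[0]
        if first_token.endswith(">"):
            name, _, text = line.partition(">")
            node = {"name": name, "attrs": [], "text": text.strip(), "children": []}
            (stack[-1][1]["children"] if stack else roots).append(node)
            stack.append((indent, node))
        else:
            # A name=value line is an attribute of the enclosing element.
            key, _, value = line.partition("=")
            stack[-1][1]["attrs"].append((key, value))
    return "".join(render(node) for node in roots)


def render(node):
    """Serialize one node and its children with explicit closing tags."""
    attrs = "".join(" %s=%s" % (k, quoteattr(v)) for k, v in node["attrs"])
    inner = escape(node["text"]) + "".join(render(c) for c in node["children"])
    return "<%s%s>%s</%s>" % (node["name"], attrs, inner, node["name"])


SOX = "\n".join([
    "html>",
    "    xmlns=http://www.w3.org/1999/xhtml",
    "    head>",
    "        title> Sample page",
    "    body>",
    "        p> A very brief page",
])

xml_text = sox_to_xml(SOX)
print(xml_text)
```

The stack of open elements plays the role that closing tags play in <strong>XML</strong>: an element ends when a later line is indented no deeper than it.<br />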

See also<br />

• Haml is a meta-XHTML representation that integrates with Ruby on Rails and has a similar mark-up structure.<br />

Sources<br />

• http://www.langdale.com.au/SOX/<br />

• http://www.ibm.com/developerworks/xml/library/x-syntax.html<br />

Simple <strong>XML</strong><br />

Simple <strong>XML</strong> is a variation of <strong>XML</strong> containing only elements. All attributes are converted into elements. The absence<br />

of attributes and of other <strong>XML</strong> constructs, such as the <strong>XML</strong> declaration and DTDs, allows the use of simple and fast<br />

parsers. The format is also compatible with mainstream <strong>XML</strong> parsers.<br />
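The conversion of attributes into elements can be sketched with the Python standard library. The function name and the sample document are hypothetical (in particular the slot element is invented for illustration), and the choice to place the new elements before existing children is an assumption of this sketch.<br />

```python
import xml.etree.ElementTree as ET


def attributes_to_elements(element):
    """Rewrite `element` in place so every attribute becomes a leading child element."""
    # Convert the existing children first, before new ones are inserted.
    for child in list(element):
        attributes_to_elements(child)
    for position, (name, value) in enumerate(list(element.attrib.items())):
        converted = ET.Element(name)
        converted.text = value
        element.insert(position, converted)
    element.attrib.clear()
    return element


source = (
    '<Agenda type="gardening">'
    '<Activity type="Watering"><slot time="6:00"/></Activity>'
    "</Agenda>"
)
simple = ET.tostring(attributes_to_elements(ET.fromstring(source)), encoding="unicode")
print(simple)
```

The result contains no attributes at all, so a parser for it only needs to recognize tags and text.<br />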

Structure<br />

For example:<br />

gardening Watering 6:00<br />

7:00 cooking <br />

12:00 <br />

would represent:<br />

<br />

<br />

Validation<br />

Simple <strong>XML</strong> uses a simple XPath list for validation. The <strong>XML</strong> snippet above for example, would be represented by:<br />

/Agenda/type|(Activity/type|(*/time))<br />

or a bit more human readable as:<br />

/Agenda/type /Agenda/Activity/type /Agenda/Activity/*/time<br />

This allows the <strong>XML</strong> to be processed as a stream (without creating an object model in memory) with fast validation.<br />
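One plausible reading of this scheme is that the list names the element paths at which text values may occur. Under that assumption, the sketch below checks a document against the three paths while streaming through it with Python's SAX parser, without building a tree. The class and function names are hypothetical, and * is taken to match exactly one element name.<br />

```python
import xml.sax

# The three paths from the list above, split into steps.
ALLOWED_LEAF_PATHS = [
    ("Agenda", "type"),
    ("Agenda", "Activity", "type"),
    ("Agenda", "Activity", "*", "time"),
]


def path_allowed(path):
    """True if `path` matches an allowed pattern; '*' stands for any one name."""
    return any(
        len(path) == len(pattern)
        and all(step in ("*", name) for step, name in zip(pattern, path))
        for pattern in ALLOWED_LEAF_PATHS
    )


class PathChecker(xml.sax.ContentHandler):
    """Tracks the current element path and records text found at other paths."""

    def __init__(self):
        super().__init__()
        self.path = []
        self.errors = []

    def startElement(self, name, attrs):
        self.path.append(name)

    def characters(self, content):
        if content.strip() and not path_allowed(self.path):
            location = "/" + "/".join(self.path)
            if location not in self.errors:
                self.errors.append(location)

    def endElement(self, name):
        self.path.pop()


def validate(xml_text):
    """Return the paths at which unexpected text was found."""
    handler = PathChecker()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.errors


GOOD = ("<Agenda><type>gardening</type><Activity><type>Watering</type>"
        "<slot><time>6:00</time></slot></Activity></Agenda>")
BAD = "<Agenda><note>oops</note></Agenda>"

good_errors = validate(GOOD)
bad_errors = validate(BAD)
print(good_errors, bad_errors)
```

Only the current path is kept in memory, which is what makes this kind of validation compatible with stream processing.<br />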

References<br />

1. http://www.w3.org/<strong>XML</strong>/simple-<strong>XML</strong>.html


Streaming <strong>XML</strong> 105<br />

Streaming <strong>XML</strong><br />

Streaming <strong>XML</strong> means dynamic data which is in an <strong>XML</strong> format.<br />

Another popular use of this term refers to one method of consuming <strong>XML</strong> data, largely known as the Simple API for<br />

<strong>XML</strong> (SAX). This works via asynchronous events that are generated as the <strong>XML</strong> data is parsed. In this context, the consumer<br />

streams through the <strong>XML</strong> data one item at a time. This has nothing to do with whether the underlying data is<br />

being updated via dynamic or static means.<br />
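Both senses of the term can be seen in a short Python sketch that feeds a document to the parser in pieces and handles each item as soon as it is complete. It uses XMLPullParser from the standard library, a pull-style relative of SAX rather than SAX itself; the stream and message element names and the way the text is split into chunks are made up for this example.<br />

```python
import xml.etree.ElementTree as ET


def consume(chunks):
    """Feed XML text piece by piece and yield each <message> body once it is complete."""
    parser = ET.XMLPullParser(events=("end",))
    for chunk in chunks:
        parser.feed(chunk)
        # read_events() returns only the events completed so far.
        for _event, element in parser.read_events():
            if element.tag == "message":
                yield element.text


# Hypothetical stream: the document arrives in arbitrary pieces,
# as it might when read from a network connection.
chunks = [
    "<stream><message>hel",
    "lo</message><mess",
    "age>world</message>",
    "</stream>",
]

messages = list(consume(chunks))
print(messages)
```

The consumer never waits for the closing tag of the whole document, so it would work just as well on a stream that stays open indefinitely.<br />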

Uses<br />

• Extensible Messaging and Presence Protocol (XMPP). This is the protocol used for example in Google Talk.<br />

Styled Layer Descriptor<br />

A Styled Layer Descriptor (SLD) is an <strong>XML</strong> schema specified by the Open Geospatial Consortium (OGC) for<br />

describing the appearance of map layers. It is capable of describing the rendering of vector and raster data. A typical<br />

use of SLDs is to instruct a Web Map Service (WMS) of how to render a specific layer.<br />

In August 2007 the SLD specification was split into two new OGC specifications [1]:<br />

• Symbology Encoding Implementation Specification (SE)<br />

• Styled Layer Descriptor<br />

The Styled Layer Descriptor specification now contains only the protocol for communicating with a WMS about how to<br />

style a layer. The styling itself is now described exclusively in the Symbology Encoding<br />

Implementation Specification.<br />

Open source SLD supporting software<br />

Desktop software<br />

• JUMP GIS<br />

• UDig<br />

Server-side software<br />

• GeoServer<br />

• Mapserver<br />

See also<br />

• UDig<br />

• GeoServer


External links<br />

• AtlasStyler SLD Editor [2] is a free-software (LGPL) SLD Editor developed with GeoTools+Java+Swing.<br />

• OpenGIS Styled Layer Descriptor Implementation Specification [3]<br />

• OpenGIS Symbology Encoding Implementation Specification [4]<br />

References<br />

[1] OGC press release about Symbology Encoding and SLD (http://www.opengeospatial.org/press/?page=pressrelease&year=0&prid=306)<br />

[2] http://wald.intevation.org/projects/atlas-framework<br />

[3] http://www.opengeospatial.org/standards/sld<br />

[4] http://www.opengeospatial.org/standards/symbol<br />

Topic (<strong>XML</strong>)<br />

In <strong>XML</strong> terminology, topic can mean<br />

1. A resource that acts as a proxy for some subject; the topic map system's representation of that subject. The<br />

relationship between a topic and its subject is defined to be one of reification. Reification of a subject allows topic<br />

characteristics to be assigned to the topic that reifies it.<br />

2. A short document which is written in such a way that it completely answers a single question. For example, an<br />

online help system typically consists of hundreds of topics, each describing a single procedure or concept. See<br />

topic-based authoring.<br />

3. A topic element, used in many <strong>XML</strong> formats.<br />

See also<br />

• Topic Maps<br />

External links<br />

• Specification in <strong>XML</strong> Topic Maps (XTM) 1.0 (topicmaps.org) [1]<br />

• FAQ: The Topic Architecture of DITA [2]<br />

References<br />

[1] http://www.topicmaps.org/xtm/index.html<br />

[2] http://dita.xml.org/node/1230


Unique Particle Attribution<br />

The Unique Particle Attribution (UPA) rule is <strong>XML</strong> Schema's mechanism to prevent schema ambiguity.<br />

Due to the UPA rule the schema fragment given below is prohibited.<br />

<br />

<br />

<br />

<br />

Given the instance fragment:<br />

<x>42</x><br />

It is not possible to create a Post-Schema-Validation Infoset, because it is ambiguous whether the x element should be<br />

associated with the element declaration x, or the wildcard (xsd:any).<br />

The W3C schema workgroup is considering weak wildcards for schema version 1.1. Using weak wildcards, the<br />

explicit element declaration would always take precedence (the x element is associated with the element declaration), thus<br />

removing the ambiguity.<br />
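The rule can be observed with the schema compiler bundled with the JDK. The sketch below is not the fragment from this article; it is a hypothetical schema in which an optional element declaration x is followed by an optional wildcard, alongside a variant whose wildcard is restricted to other namespaces. The class name `UpaCheck` and the element name `root` are assumptions, and the exact error text depends on the schema processor.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class UpaCheck {

    private static String schemaWithWildcard(String wildcardNamespace) {
        return "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
            + "<xs:element name='root'><xs:complexType><xs:sequence>"
            + "<xs:element name='x' type='xs:int' minOccurs='0'/>"
            + "<xs:any namespace='" + wildcardNamespace + "' processContents='lax' minOccurs='0'/>"
            + "</xs:sequence></xs:complexType></xs:element>"
            + "</xs:schema>";
    }

    /** An x child could match either the declaration or the wildcard. */
    static final String AMBIGUOUS = schemaWithWildcard("##any");

    /** The wildcard only admits namespace-qualified elements, so an unqualified x is unambiguous. */
    static final String UNAMBIGUOUS = schemaWithWildcard("##other");

    /** Returns null if the schema compiles, otherwise the processor's error message. */
    static String compileError(String xsd) {
        try {
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)));
            return null;
        } catch (SAXException e) {
            return String.valueOf(e.getMessage());
        }
    }

    public static void main(String[] args) {
        System.out.println("##any:   " + compileError(AMBIGUOUS));
        System.out.println("##other: " + compileError(UNAMBIGUOUS));
    }
}
```

With the JDK's default XML Schema 1.0 processor, the first schema is expected to be rejected at compile time, before any instance document is validated.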

See also<br />

• W3C <strong>XML</strong> Schema<br />

External links<br />

• Schema Component Constraint: Unique Particle Attribution [1]<br />

• An Approach for Evolving <strong>XML</strong> Vocabularies Using <strong>XML</strong> Schema [2]<br />

• <strong>XML</strong> Schema 1.1 Part 1: Structures [3]<br />

• <strong>XML</strong> Schema 1.1 Part 2: Datatypes [4]<br />

References<br />

[1] http://www.w3.org/TR/xmlschema-1/#cos-nonambig<br />

[2] http://lists.w3.org/Archives/Public/www-tag/2004Aug/att-0010/NRMVersioningProposal.html<br />

[3] http://www.w3.org/TR/xmlschema11-1/<br />

[4] http://www.w3.org/TR/xmlschema11-2/


VTD-<strong>XML</strong><br />

Developer(s) XimpleWare<br />

Stable release 2.8 / April 12, 2009<br />

Operating system Portable<br />

Type <strong>XML</strong> parser/indexer/slicer/editor library<br />

License GPL and Proprietary License<br />

Website vtd-xml.sourceforge.net [1], VTD-<strong>XML</strong> blog [2]<br />

Virtual Token Descriptor for eXtensible <strong>Markup</strong> <strong>Language</strong> (VTD-<strong>XML</strong>) refers to a collection of cross-platform<br />

<strong>XML</strong> processing technologies centered around a non-extractive [3] [4], "document-centric" <strong>XML</strong> parsing technique<br />

called Virtual Token Descriptor (VTD). Depending on the perspective, VTD-<strong>XML</strong> can be viewed as one of the<br />

following:<br />

• A "Document-Centric" <strong>XML</strong> parser [5] [6] [7] [8] [9]<br />

• A native <strong>XML</strong> indexer or a file format that uses binary data to enhance the text <strong>XML</strong> [10] [11] [12]<br />

• An incremental <strong>XML</strong> content modifier<br />

• An <strong>XML</strong> slicer/splitter/assembler [13]<br />

• An <strong>XML</strong> editor/eraser [14] [15] [16]<br />

• A way to port <strong>XML</strong> processing on chip<br />

• A non-blocking, stateless XPath evaluator [17]<br />

VTD-<strong>XML</strong> is developed by XimpleWare and dual-licensed under the GPL and a proprietary license. It was originally<br />

written in Java and is now also available in C [18] and C#. An extended version supporting files of up to 256 GB is also<br />

available.<br />

Basic Concept<br />

Non-Extractive, Document-Centric Parsing<br />

Traditionally, a lexical analyzer represents tokens (the small units of indivisible character values) as discrete string<br />

objects. This approach is designated extractive parsing. In contrast, non-extractive tokenization mandates that one<br />

keeps the source text intact, and uses offsets and lengths to describe those tokens.<br />

Virtual Token Descriptor<br />

Virtual Token Descriptor (VTD) applies the concept of non-extractive, document-centric parsing to <strong>XML</strong><br />

processing. A VTD record uses a 64-bit integer to encode the offset, length, token type and nesting depth of a token<br />

in an <strong>XML</strong> document. Because all VTD records are 64-bit in length, they can be stored efficiently and managed as<br />

an array. [19]
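The idea of packing a token description into one 64-bit integer can be sketched as follows. The bit layout used here (4 bits of type, 8 bits of depth, 20 bits of length, 30 bits of offset) and the class name `TokenRecord` are assumptions chosen for illustration; they are not presented as the layout VTD-XML itself uses.

```java
import java.nio.charset.StandardCharsets;

final class TokenRecord {

    // Illustrative layout (an assumption): [type:4][depth:8][length:20][unused:2][offset:30]
    static long pack(int type, int depth, int length, int offset) {
        return ((long) (type & 0xF) << 60)
            | ((long) (depth & 0xFF) << 52)
            | ((long) (length & 0xFFFFF) << 32)
            | (offset & 0x3FFFFFFFL);
    }

    static int type(long record) {
        return (int) (record >>> 60) & 0xF;
    }

    static int depth(long record) {
        return (int) (record >>> 52) & 0xFF;
    }

    static int length(long record) {
        return (int) (record >>> 32) & 0xFFFFF;
    }

    static int offset(long record) {
        return (int) (record & 0x3FFFFFFFL);
    }

    /** Materializes the token text only on demand; the source bytes stay untouched. */
    static String text(byte[] document, long record) {
        return new String(document, offset(record), length(record), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] doc = "<a>hello</a>".getBytes(StandardCharsets.UTF_8);
        long[] records = { pack(5, 1, 5, 3) }; // hypothetical type code 5 for character data
        System.out.println(text(doc, records[0]));
    }
}
```

Because every record is a primitive `long`, a whole document's tokens fit in a single `long[]` with no per-token object allocation.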


Location Cache<br />

Location Caches (LC) build on VTD records to provide efficient random access. Organized as tables, with one table<br />

per nesting depth level, LCs contain entries modeling an <strong>XML</strong> document's element hierarchy. An LC entry is a<br />

64-bit integer encoding a pair of 32-bit values. The upper 32 bits identify the VTD record for the corresponding<br />

element. The lower 32 bits identify that element's first child in the LC at the next lower nesting level.<br />
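A location cache entry as described above, two 32-bit values in one 64-bit integer, might be packed and unpacked like this. The class name `LocationCacheEntry` and the use of -1 to mean "no child" are assumptions made for the sketch.

```java
final class LocationCacheEntry {

    /** Upper 32 bits: index of the element's VTD record. Lower 32 bits: index of its first child. */
    static long pack(int vtdIndex, int firstChildIndex) {
        return ((long) vtdIndex << 32) | (firstChildIndex & 0xFFFFFFFFL);
    }

    static int vtdIndex(long entry) {
        return (int) (entry >>> 32);
    }

    static int firstChildIndex(long entry) {
        return (int) entry; // keeps the low 32 bits
    }

    public static void main(String[] args) {
        long withChild = pack(7, 12);
        long leaf = pack(9, -1); // assumption: -1 marks an element without children
        System.out.println(vtdIndex(withChild) + " -> " + firstChildIndex(withChild));
        System.out.println(vtdIndex(leaf) + " -> " + firstChildIndex(leaf));
    }
}
```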

Benefits<br />

Overview<br />

Virtually all the core benefits of VTD-<strong>XML</strong> are inherent to non-extractive, document-centric parsing which provides<br />

these characteristics:<br />

• The source <strong>XML</strong> text is kept intact in memory without decoding.<br />

• The internal representation of VTD-<strong>XML</strong> is inherently persistent.<br />

• Obviates object-oriented modeling of the hierarchical representation as it relies entirely on primitive data types<br />

(e.g., 64-bit integers) to represent the <strong>XML</strong> hierarchy, thus reducing object creation cost to nearly zero [20] .<br />

Combining those characteristics permits thinking of <strong>XML</strong> purely as syntax (bits, bytes, offsets, lengths, fragments,<br />

namespace-compensated fragments, and document composition) instead of the serialization/deserialization of<br />

objects. This is a powerful way to think about <strong>XML</strong>/SOA applications.<br />

Simplicity<br />

Developers' typical first impression is that, with VTD-<strong>XML</strong>, there are relatively few classes and methods to<br />

remember in order to write applications.<br />

As Parser<br />

When used in parsing mode, VTD-<strong>XML</strong> is a general purpose, extremely high performance [21] <strong>XML</strong> parser which<br />

compares favorably with others:<br />

• VTD-<strong>XML</strong> typically outperforms SAX (with NULL content handler) while still providing full random access and<br />

built-in XPath support.<br />

• VTD-<strong>XML</strong> typically consumes 1.3-1.5 times the <strong>XML</strong> document's size in memory, which is about 1/5 the<br />

memory usage of DOM.<br />

• Applications written in VTD-<strong>XML</strong> are usually much shorter and cleaner than their DOM or SAX versions.<br />

As Indexer<br />

Because of the inherent persistence of VTD-<strong>XML</strong>, developers can write the internal representation of a parsed <strong>XML</strong><br />

document to disk and later reload it to avoid repetitive parsing. To this end, XimpleWare has introduced VTD+<strong>XML</strong><br />

as a binary packaging format combining VTD, LC and the <strong>XML</strong> text. It can typically be viewed in one of the<br />

following two ways:<br />

• A native <strong>XML</strong> index that completely eliminates the parsing cost and also retains all benefits of <strong>XML</strong>. It is a file<br />

format that is human readable and backward compatible with <strong>XML</strong>.<br />

• A binary <strong>XML</strong> format that uses binary data to enhance the processing of the <strong>XML</strong> text.


<strong>XML</strong> Content Modifier<br />

Because VTD-<strong>XML</strong> keeps the <strong>XML</strong> text intact without decoding, when an application intends to modify the content<br />

of <strong>XML</strong> it only needs to modify the portions most relevant to the changes. This is in stark contrast with DOM, SAX,<br />

or StAX parsing, which incur the cost of parsing and re-serialization no matter how small the changes are.<br />

Since VTDs refer to document elements by their offsets, changes to the length of elements occurring earlier in a<br />

document require adjustments to VTDs referring to all later elements. However, those adjustments are integer<br />

additions, albeit to many integers in multiple tables, so they are quick.<br />
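The adjustment step can be illustrated with a plain array of token offsets. This is a simplified sketch under the assumption that offsets are held in an `int[]`; the class and method names are hypothetical and do not mirror VTD-XML's internal tables.

```java
import java.util.Arrays;

final class OffsetShift {

    /**
     * An edit replaced oldLength bytes at editOffset with newLength bytes.
     * Every token that started after the replaced region moves by the difference.
     */
    static void afterReplace(int[] tokenOffsets, int editOffset, int oldLength, int newLength) {
        int delta = newLength - oldLength;
        int firstAffected = editOffset + oldLength;
        for (int i = 0; i < tokenOffsets.length; i++) {
            if (tokenOffsets[i] >= firstAffected) {
                tokenOffsets[i] += delta; // one integer addition per later token
            }
        }
    }

    public static void main(String[] args) {
        // Offsets of "a", "hi", "b", "yo" in <a>hi</a><b>yo</b>
        int[] offsets = {1, 3, 10, 12};
        // Replace "hi" (2 bytes at offset 3) with "hello" (5 bytes).
        afterReplace(offsets, 3, 2, 5);
        System.out.println(Arrays.toString(offsets));
    }
}
```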

<strong>XML</strong> Slicer/Splitter/Assembler<br />

An application based on VTD-<strong>XML</strong> can also use offsets and lengths to address tokens, or element fragments. This<br />

allows <strong>XML</strong> documents to be manipulated like arrays of bytes.<br />

• As a slicer, VTD-<strong>XML</strong> can "slice" off a token or an element fragment from an <strong>XML</strong> document, then insert it back<br />

into another location in the same document, or into a different document.<br />

• As a splitter, VTD-<strong>XML</strong> can split sub-elements in an <strong>XML</strong> document and dump each into a separate <strong>XML</strong><br />

document.<br />

• As an assembler, VTD-<strong>XML</strong> can "cut" chunks out of multiple <strong>XML</strong> documents and assemble them into a new<br />

<strong>XML</strong> document.<br />

<strong>XML</strong> Editor/Eraser<br />

Used as an editor/eraser, VTD-<strong>XML</strong> can directly edit/erase the underlying byte content of the <strong>XML</strong> text, provided<br />

that the existing token is long enough to hold the intended new content. A benefit of this approach is that the<br />

application can immediately reuse the original VTD and LC. In contrast, when using VTD-<strong>XML</strong> to incrementally<br />

update an <strong>XML</strong> document, an application needs to reparse the updated document before the application can process<br />

it.<br />

An editor can be made smart enough to track the location of each token, permitting new, longer tokens to replace<br />

existing, shorter tokens by merely addressing the new token in separate memory outside that used to store the<br />

original document. Likewise, when reordering the document, element text does not need to be copied; only the LCs<br />

need to be updated. When a complete, contiguous <strong>XML</strong> document is needed, such as when saving it, the disparate<br />

parts can be reassembled into a new, contiguous document.<br />

Other Benefits<br />

VTD-<strong>XML</strong> also pioneers the non-blocking, stateless XPath evaluation approach.<br />

Weaknesses<br />

VTD-<strong>XML</strong> also exhibits a few noticeable shortcomings:<br />

• As an <strong>XML</strong> parser, it does not support external entities declared in the DTD.<br />

• As a file format, it increases the document size by about 30% to 50%.<br />

• As an API, it is not compatible with DOM or SAX.<br />

• It is difficult to support certain validation techniques, employed by DTD and <strong>XML</strong> Schema (e.g., default<br />

attributes and elements), that require modifications to the <strong>XML</strong> instances being parsed.


Areas of Applications<br />

General-purpose Replacement for DOM or SAX<br />

Because of VTD-<strong>XML</strong>'s performance and memory advantages, it covers a larger portion of <strong>XML</strong> use cases than<br />

either DOM or SAX [22] .<br />

• Compared to DOM, VTD-<strong>XML</strong> processes <strong>XML</strong> documents 3 to 5 times larger in the same amount of physical<br />

memory, at about 3 to 10 times the speed.<br />

• Compared to SAX, VTD-<strong>XML</strong> provides random access and XPath support and outperforms SAX by at least 2x.<br />

XPath over Huge <strong>XML</strong> documents<br />

The extended edition of VTD-<strong>XML</strong>, combined with a 64-bit JVM, makes XPath-based <strong>XML</strong> processing possible over<br />

huge <strong>XML</strong> documents (up to 256 GB in size).<br />

For SOA/WS/<strong>XML</strong> Security<br />

The combination of VTD-<strong>XML</strong>'s high performance and incremental-update capability makes it essential<br />

to achieve the desired level of Quality of Service for SOA/WS/<strong>XML</strong> security applications. [23] [24] [25]<br />

For SOA/WS/<strong>XML</strong> Intermediary<br />

VTD-<strong>XML</strong> is well suited for SOA intermediary applications such as <strong>XML</strong> routers/switches/gateways, Enterprise<br />

Service Buses, and services aggregation points. All those applications perform the basic "store and forward"<br />

operations for which retaining the original <strong>XML</strong> is critical for minimizing latency. VTD-<strong>XML</strong>'s incremental update<br />

capability also contributes significantly to the forwarding performance.<br />

VTD-<strong>XML</strong>'s random-access capability lends itself well to XPath-based <strong>XML</strong> routing/switching/filtering common in<br />

AJAX and SOA deployment.<br />

Intelligent SOA/WS/<strong>XML</strong> Load-balancing and Offloading<br />

When an <strong>XML</strong> document travels through several middle-tier SOA components, the first message stop, after finishing<br />

the inspection of the <strong>XML</strong> document, can choose to send the VTD+<strong>XML</strong> file format to the downstream components<br />

to avoid repetitive parsing, thus improving throughput.<br />

By the same token, an intelligent SOA load balancer can choose to generate VTD+<strong>XML</strong> for incoming/outgoing<br />

SOAP messages to offload <strong>XML</strong> parsing from the application servers that receive those messages.<br />

<strong>XML</strong> Persistence Data Store<br />

When viewed from the perspective of native <strong>XML</strong> persistence, VTD-<strong>XML</strong> can be used as a human-readable, easy to<br />

use, general-purpose <strong>XML</strong> index. <strong>XML</strong> documents stored this way can be loaded into memory to be queried,<br />

updated, or edited without the overhead of parsing/re-serialization.<br />

Schemaless <strong>XML</strong> Data Binding<br />

VTD-<strong>XML</strong>'s combination of high performance, low memory usage, and non-blocking XPath evaluation makes<br />

possible a new <strong>XML</strong> data binding approach based entirely on XPath. The biggest benefits of this approach are that it no longer<br />

requires an <strong>XML</strong> schema, avoids needless object creation, and takes advantage of <strong>XML</strong>'s inherent loose encoding [26].<br />

It is worth noting that data binding discussed in the article mentioned above needs to be implemented by the<br />

application: VTD-<strong>XML</strong> itself only offers accessors. In this regard VTD-<strong>XML</strong> is not a data binding solution itself<br />

(unlike JiBX, JAXB, <strong>XML</strong>Beans), although it offers extraction functionality for data binding packages, much like<br />

other <strong>XML</strong> parsers (SAX, StAX).
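To make the idea of XPath-only binding concrete, the following sketch fills a small application object from XPath expressions, with no schema involved. It deliberately uses the XPath engine bundled with the JDK rather than VTD-XML, so it builds a DOM internally and does not share VTD-XML's memory profile. The `Item` class and the sample document are hypothetical; the element names follow the purchase-order paths used in this article's code sample.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class XPathBinding {

    /** Application-side object; nothing here is generated from a schema. */
    static final class Item {
        final String partNum;
        final double usPrice;

        Item(String partNum, double usPrice) {
            this.partNum = partNum;
            this.usPrice = usPrice;
        }
    }

    /** The binding is written by the application: each field is one XPath lookup. */
    static Item bindFirstItem(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node doc = (Node) xpath.evaluate("/", new InputSource(new StringReader(xml)), XPathConstants.NODE);
            String partNum = xpath.evaluate("/purchaseOrder/items/item[1]/@partNum", doc);
            Double price = (Double) xpath.evaluate(
                "number(/purchaseOrder/items/item[1]/USPrice)", doc, XPathConstants.NUMBER);
            return new Item(partNum, price);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<purchaseOrder><items><item partNum='872-AA'>"
            + "<USPrice>148.95</USPrice></item></items></purchaseOrder>";
        Item item = bindFirstItem(xml);
        System.out.println(item.partNum + " costs " + item.usPrice);
    }
}
```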


Essential Classes<br />

As of Version 2.6, the Java and C# versions of VTD-<strong>XML</strong> consist of the following classes:<br />

• VTDGen (VTD Generator) is the class that encapsulates the main parsing, index loading and index writing<br />

functions.<br />

• VTDNav (VTD Navigator) is the class that (1) encapsulates <strong>XML</strong>, VTD, and hierarchical info, (2) contains<br />

various navigation methods, (3) performs various comparisons between VTD records and strings, and (4) converts<br />

VTD records to primitive data types.<br />

• AutoPilot is a class containing functions that perform node-level iteration and XPath.<br />

• <strong>XML</strong>Modifier is a class that offers incremental update capability, such as delete, insert and update.<br />

The extended VTD-<strong>XML</strong> consists of the following classes:<br />

• VTDGenHuge (Extended VTD Generator) encapsulates the main parsing.<br />

• <strong>XML</strong>Buffer performs in-memory loading of <strong>XML</strong> documents.<br />

• <strong>XML</strong>MemMappedBuffer performs memory mapped loading of <strong>XML</strong> documents.<br />

• VTDNavHuge (Extended VTD Navigator) (1) encapsulates <strong>XML</strong>, Extended VTD, and hierarchical info, (2)<br />

contains various navigation methods, (3) performs various comparisons between VTD records and strings, and (4)<br />

converts VTD records to primitive data types.<br />

• AutoPilotHuge performs node-level iteration and XPath.<br />

Code Sample<br />

/* In this Java program, we demonstrate how to use XMLModifier to incrementally
 * update a simple XML purchase order. We also use VTDGen's parseFile
 * to simplify programming.
 */
import com.ximpleware.*;

public class Update {
    // "throws Exception" covers the checked exceptions raised by parsing,
    // XPath evaluation, modification and output.
    public static void main(String[] argv) throws Exception {
        // parseFile opens the file and reads its content into a byte array;
        // the second argument turns on namespace awareness
        VTDGen vg = new VTDGen();
        if (vg.parseFile("oldpo.xml", true)) {
            VTDNav vn = vg.getNav();
            AutoPilot ap = new AutoPilot(vn);
            XMLModifier xm = new XMLModifier(vn);
            // select every item whose part number is 872-AA
            ap.selectXPath("/purchaseOrder/items/item[@partNum='872-AA']");
            int i = -1;
            while ((i = ap.evalXPath()) != -1) {
                xm.remove();                  // remove the element at the cursor
                xm.insertBeforeElement("\n"); // insert new content before that element
            }
            // A second query follows in the source, but its text is incomplete:
            // ap.selectXPath("/purchaseOrder/items/item/USPrice[.
            // Write the modified document; the output file name is an assumption.
            xm.output("newpo.xml");
        }
    }
}


X-expression<br />

X-expressions are the unification of S-expressions found in the Lisp programming language with <strong>XML</strong>.<br />

X-expressions unify notions of computation with data sharing.<br />

XBRLS<br />

XBRLS (XBRL Simple Application Profile) is an application profile of XBRL.<br />

XBRLS is designed to be 100% XBRL compliant. The stated goals of XBRLS are "to maximize XBRL's benefits,<br />

reduce costs of implementation, and maximize the functionality and effectiveness of XBRL" [1] . XBRL is a general<br />

purpose specification, based on the idea that no one is likely to use 100% of the components of XBRL in building<br />

any one solution. XBRLS specifies a subset of XBRL that is designed to meet the needs of most business users in<br />

most situations, and offers it as a starting point for others. This approach creates an application profile of XBRL<br />

(equivalent to a database view but concerned with metadata, not data).<br />

XBRLS is intended to enable the non-XBRL expert to create both XBRL metadata and XBRL reports in a simple<br />

and convenient manner. At the same time, it seeks to improve the usability of XBRL, the interoperability among<br />

XBRL-based solutions, the effectiveness of XBRL extensions and to reduce software development costs.<br />

The profile was created by Rene van Egmond and Charlie Hoffman, who was the initial creator of XBRL. It borrows<br />

heavily from the US GAAP Taxonomy Architecture.<br />

XBRLS Architecture<br />

The XBRLS architecture is based on many ideas used by the US GAAP Taxonomy Architecture. The intent of the<br />

XBRLS architecture is to make it easier for business users to use XBRL, to make it easier for software<br />

vendors to support XBRL, and to make the features of XBRL safe to use. XBRLS is a subset of what is allowed by the<br />

complete XBRL Specification. Examples of the limitations it places on XBRL are the following:<br />

• Uses no tuples.<br />

• Only uses the segment element of the instance context and disallows the use of the scenario element.<br />

• Allows only XBRL dimensional information as content for the segment element in the instance context.<br />

Furthermore, it requires that every concept (member, primary item) participates in a hypercube and that all<br />

hypercubes are closed.<br />

• Allows no use of simple or complex typed members within XBRL Dimensions.<br />

• Never uses the precision attribute; always uses the decimals attribute.<br />

• Requires that every measure exists in at least one XBRL Dimension.<br />

XBRL Components not used in XBRLS


XBRL Specification | Topic | Explanation<br />

Instance | Context: entity identifier, entity scheme | Although not required when using XBRLS, it is highly encouraged that the entity scheme and identifier be “held static” or synchronized with an explicit member, and that XBRL Dimensions be used instead to articulate entity information, perhaps with an XBRLS “Entity [Axis]” dimension. The “entity identifier” and “entity scheme” portions of a context should not be used. Rather, they are static, using constant values (i.e., dummy values in order to pass XBRL validation). The information relating to the entity identifier and entity scheme is moved to an XBRLS-specific taxonomy that makes use of XBRL Dimensions to communicate this information.<br />

Instance | Context: period | Although not required when using XBRLS, it is highly encouraged that the period context be “held static” or synchronized with an explicit member, and that XBRL Dimensions be used instead to articulate this information, perhaps with an XBRLS “Period [Axis]” dimension. Uses XBRL Dimensions to articulate this XBRL quasi dimension.<br />

Instance (sections 4.7.4 and 4.7.3.2) | Context: segments, scenarios | Only uses XBRL Dimensions to articulate the content of segments and scenarios, excluding the use of <strong>XML</strong> Schema-based contextual information allowed by those sections. Furthermore, mixing <strong>XML</strong> Schema-based contextual information and XBRL Dimensions is technically dangerous.<br />

Instance | Fact Value: precision | Uses only the decimals attribute; precision must not be used.<br />

Taxonomy | Elements: tuples | Tuples are not allowed.<br />

Taxonomies | Weight | The weight attribute value of calculations must be either “1” or “-1”; no decimal value between the two is allowed.<br />

Taxonomies | Annotation, Documentation | Each schema and each linkbase must provide documentation that describes the contents of the file and that is readable by a computer application.<br />

Dimensions | Open Hypercubes | Open hypercubes are not allowed; only closed hypercubes are allowed.<br />

Dimensions | notAll | Only “all” has-hypercube arcroles are allowed; “notAll” is not allowed.<br />

Dimensions | Typed Members | Typed members (simple or complex) are not allowed.<br />

External links<br />

• XBRL Business Information Exchange [2]<br />

• XBRLS: how a simpler XBRL can make a better XBRL [3]<br />

• Comprehensive Example [4]<br />

• XBRLS - XBRL Made Easy [5]<br />

• Data Interactive: An Interview with Charlie Hoffman [6]<br />

References<br />

[1] XBRL Business Information Exchange (http://xbrl.squarespace.com/xbrls/) website<br />

[2] http://xbrl.squarespace.com/xbrls/<br />

[3] http://xbrl.squarespace.com/storage/xbrls/XBRLS-How-simpler-can-be-better-2008-03-11.pdf<br />

[4] http://xbrl.squarespace.com//storage/xbrls/XBRLS-ComprehensiveExample-2008-04-18.zip<br />

[5] http://www.ubmatrix.com/company/innovation.htm<br />

[6] http://hitachidatainteractive.com/2008/04/23/an-interview-with-charlie-hoffman


Xdos<br />

XDoS is an acronym for <strong>XML</strong> denial-of-service.<br />

An XDoS attack is a content-borne attack whose purpose is to shut down a web service or the system running that<br />

service. A common XDoS attack occurs when an <strong>XML</strong> message is sent with a multitude of digital signatures: a<br />

naive parser examines each signature and consumes all available CPU cycles, exhausting the system's resources. Such attacks are less common<br />

than inadvertent XDoS attacks, which occur when a programming error by a trusted customer causes a handshake to<br />

go into an infinite loop.<br />
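One possible defensive measure against the signature-flooding case is to scan a message with a streaming reader and give up as soon as it contains more signature elements than a policy allows, before any signature is verified. The sketch below assumes signatures appear as elements with the local name `Signature`; the limit and the class name `SignatureLimit` are hypothetical.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class SignatureLimit {

    /** Returns false as soon as more than maxSignatures Signature elements have been seen. */
    static boolean withinLimit(String xml, int maxSignatures) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Refuse DTDs so that entity tricks cannot inflate the input during the scan.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            int seen = 0;
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "Signature".equals(reader.getLocalName())) {
                    seen++;
                    if (seen > maxSignatures) {
                        return false; // stop reading; no signature has been verified yet
                    }
                }
            }
            return true;
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("message is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        String message = "<m><Signature/><Signature/><Signature/></m>";
        System.out.println(withinLimit(message, 2));
        System.out.println(withinLimit(message, 3));
    }
}
```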

XDR Schema<br />

<strong>XML</strong>-Data Reduced (XDR) is a schema language used in the W3C <strong>XML</strong>-Data Note and the Document Content Description (DCD)<br />

initiative for <strong>XML</strong>.<br />

MS<strong>XML</strong> provided XDR schema support from versions 2.0 up to - but not including - version 6.0 [1] .<br />

See also<br />

• <strong>XML</strong> Schema <strong>Language</strong> Comparison - Comparison of other <strong>XML</strong> Schema languages (not XDR).<br />

• List of <strong>XML</strong> Schemas - list of <strong>XML</strong> schemas in use on the Internet sorted by purpose<br />

External links<br />

• XDR Schema Data Types Reference [2]<br />

References<br />

[1] Version and Conformance (http://msdn2.microsoft.com/en-us/library/ms757825(VS.85).aspx)<br />

[2] http://msdn2.microsoft.com/en-us/library/ms256049.aspx


XEE (Starlight)<br />

XEE (<strong>XML</strong> Engineering Environment) is a visual language for data processing and ETL tasks. It is designed for the<br />

Starlight Information Visualization System as a method for producing and processing <strong>XML</strong> data.


XEP<br />

Developer(s) RenderX<br />

Stable release 4.18 / March 2010<br />

Written in Java<br />

Operating system Microsoft Windows, Linux, FreeBSD<br />

Type Layout engine<br />

Website [1]<br />

XEP is a commercial XSL-FO layout engine written in Java. XEP is proprietary software by RenderX.<br />

History<br />

Started in 1999 as a working prototype written in Perl and soon completely rewritten in Java, XEP has evolved into a<br />

complete engine. XEP runs on any platform where a Java runtime is available, including Windows, Linux, FreeBSD<br />

and other server platforms.<br />

Features<br />

XEP accepts XSL-FO as input, as well as <strong>XML</strong>+XSLT. Its output formats are: PDF, PostScript, AFP, PPML, XPS,<br />

HTML, SVG, and an internal <strong>XML</strong>-based format called XEPOUT.<br />

XEP conforms to the XSL-FO Recommendation v1.0, offers a wide range of extensions, and supports a<br />

good subset of XSL 1.1 features. [2]<br />

Available font types, depending on the output format generator, are Type 1, TrueType and OpenType, with<br />

support for embedding and subsetting.<br />

Accepted image formats include most flavors of raster graphics, as well as SVG, EPS and PDF.<br />

API<br />

For integration, XEP provides a Java API and examples covering a number of approaches such as SAX, JAXP and<br />

DOM. XEP has a flexible configuration, which allows it to run concurrently in threads on huge input documents,<br />

but also in a small heap in diskless environments such as application servers.<br />

Satellite software<br />

For Windows users there exists a .NET wrapper called XEPWin, and an accompanying .NET development kit with<br />

API in C#, VB and ASP.NET.<br />

Satellite software includes EnMasse - a multiplexer of a grid of XEP engines, with simple networked API and<br />

examples in C, Java, Perl and Python.


External links<br />

• XEP on RenderX site [1]<br />

• Official W3C XSL recommendation formatted by XEP [3]<br />

• How to use XEP with Stylus Studio [4]<br />

References<br />

[1] http://www.renderx.com/tools/xep.html<br />

[2] http://xml.coverpages.org/ni2001-11-08-b.html<br />

[3] http://www.w3.org/TR/2006/REC-xsl11-20061205/xsl11.pdf<br />

[4] http://www.stylusstudio.com/renderx/xep.html<br />

<strong>XML</strong><br />

Filename extension .xml<br />

Internet media type application/xml [1], text/xml [2] (deprecated)<br />

Uniform Type Identifier public.xml<br />

Developed by World Wide Web Consortium<br />

Type of format <strong>Markup</strong> language<br />

Extended from SGML<br />

Extended to Numerous, including XHTML, RSS, Atom<br />

Standard(s) 1.0 (Fifth Edition) [3] November 26, 2008<br />

1.1 (Second Edition) [4] August 16, 2006<br />

Open format? Yes


Current Status Published<br />

Year Started 1996<br />

Editors Tim Bray, Jean Paoli, C. M. Sperberg-McQueen, Eve Maler, François Yergeau, John Cowan<br />

Related Standards <strong>XML</strong> Schema<br />

Domain Data Serialization<br />

Abbreviation <strong>XML</strong><br />

Website <strong>XML</strong> 1.0 [5]<br />

<strong>XML</strong> (Extensible <strong>Markup</strong> <strong>Language</strong>) is a set of rules for encoding documents in machine-readable form. It is<br />

defined in the <strong>XML</strong> 1.0 Specification [6] produced by the W3C, and several other related specifications, all gratis<br />

open standards. [7]<br />

<strong>XML</strong>'s design goals emphasize simplicity, generality, and usability over the Internet. [8] It is a textual data format,<br />

with strong support via Unicode for the languages of the world. Although <strong>XML</strong>'s design focuses on documents, it is<br />

widely used for the representation of arbitrary data structures, for example in web services.<br />

There are many programming interfaces that software developers may use to access <strong>XML</strong> data, and several schema<br />

systems designed to aid in the definition of <strong>XML</strong>-based languages.<br />

As of 2009, hundreds of <strong>XML</strong>-based languages have been developed, [9] including RSS, Atom, SOAP, and XHTML.<br />

<strong>XML</strong>-based formats have become the default for most office-productivity tools, including Microsoft Office (Office<br />

Open <strong>XML</strong>), OpenOffice.org (OpenDocument), and Apple's iWork. [10]<br />

Key terminology<br />

The material in this section is based on the <strong>XML</strong> Specification. This is not an exhaustive list of all the constructs<br />

which appear in <strong>XML</strong>; it provides an introduction to the key constructs most often encountered in day-to-day use.<br />

(Unicode) Character<br />

By definition, an <strong>XML</strong> document is a string of characters. Almost every legal Unicode character may appear<br />

in an <strong>XML</strong> document.<br />

Processor and Application<br />

The processor analyzes the markup and passes structured information to an application. The specification<br />

places requirements on what an <strong>XML</strong> processor must do and not do, but the application is outside its scope.<br />

The processor (as the specification calls it) is often referred to colloquially as an <strong>XML</strong> parser.<br />

<strong>Markup</strong> and Content<br />

The characters which make up an <strong>XML</strong> document are divided into markup and content. <strong>Markup</strong> and content<br />

may be distinguished by the application of simple syntactic rules. All strings which constitute markup either<br />

begin with the character "&lt;" and end with a "&gt;", or begin with the character "&amp;" and end with a ";". Strings of<br />

characters which are not markup are content.<br />

Tag<br />

A markup construct that begins with "&lt;" and ends with "&gt;". Tags come in three flavors: start-tags, for example<br />

&lt;section&gt;, end-tags, for example &lt;/section&gt;, and empty-element tags, for example &lt;line-break /&gt;.<br />

Element<br />

A logical component of a document which either begins with a start-tag and ends with a matching end-tag, or<br />

consists only of an empty-element tag. The characters between the start- and end-tags, if any, are the element's<br />

content, and may contain markup, including other elements, which are called child elements. An example of an<br />

element is &lt;Greeting&gt;Hello, world.&lt;/Greeting&gt; (see hello world). Another is &lt;line-break /&gt;.<br />

Attribute<br />

A markup construct consisting of a name/value pair that exists within a start-tag or empty-element tag. In the<br />

example (below) the element img has two attributes, src and alt:<br />

&lt;img src="madonna.jpg" alt='Foligno Madonna, by Raphael' /&gt;. Another example would be<br />

&lt;step number="3"&gt;Connect A to B.&lt;/step&gt; where the name of the attribute is "number" and the value is "3".<br />

<strong>XML</strong> Declaration<br />

<strong>XML</strong> documents may begin by declaring some information about themselves, as in the following example.<br />

&lt;?xml version="1.0" encoding="UTF-8" ?&gt;<br />

Example<br />

Here is a small, complete <strong>XML</strong> document, which uses all of these constructs and concepts.<br />

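&lt;?xml version="1.0" encoding="UTF-8" ?&gt;<br />

&lt;painting&gt;<br />

&lt;img src="madonna.jpg" alt='Foligno Madonna, by Raphael'/&gt;<br />

&lt;caption&gt;This is Raphael's "Foligno" Madonna, painted in<br />

&lt;date&gt;1511&lt;/date&gt;–&lt;date&gt;1512&lt;/date&gt;.<br />

&lt;/caption&gt;<br />

&lt;/painting&gt;<br />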

There are five elements in this example document: painting, img, caption, and two dates. The date elements are<br />

children of caption, which is a child of the root element painting. img has two attributes, src and alt.<br />
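As a rough illustration of the structure just described, the sketch below parses a painting document with Python's standard xml.etree.ElementTree module and lists its elements and attributes. The attribute values (madonna.jpg and the alt text) are assumed for illustration, and ElementTree is only one of many possible parsers.<br />

```python
import xml.etree.ElementTree as ET

# Sample document matching the description above; attribute values are
# illustrative assumptions, not taken from any real file.
DOC = """<?xml version="1.0" encoding="UTF-8" ?>
<painting>
  <img src="madonna.jpg" alt='Foligno Madonna, by Raphael'/>
  <caption>This is Raphael's "Foligno" Madonna, painted in
    <date>1511</date>\u2013<date>1512</date>.
  </caption>
</painting>"""

# Parse from bytes so the encoding declaration is honoured by the parser.
root = ET.fromstring(DOC.encode("utf-8"))

# root.iter() walks the tree in document order, starting with the root itself.
element_names = [el.tag for el in root.iter()]

# Attributes of an element are exposed as a dictionary.
img = root.find("img")
img_attributes = dict(img.attrib)

# The two date elements are children of caption.
dates = [d.text for d in root.find("caption")]

print(element_names)
print(img_attributes)
print(dates)
```

The list of element names has five entries, one for each element counted in the text.<br />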

Characters and escaping<br />

<strong>XML</strong> documents consist entirely of characters from the Unicode repertoire. Except for a small number of<br />

specifically excluded control characters, any character defined by Unicode may appear within the content of an <strong>XML</strong><br />

document. The selection of characters which may appear within markup is somewhat more limited but still large.<br />

<strong>XML</strong> includes facilities for identifying the encoding of the Unicode characters which make up the document, and for<br />

expressing characters which, for one reason or another, cannot be used directly.<br />

Details on valid characters<br />

Unicode characters in the following code point ranges are valid in <strong>XML</strong> 1.0 documents: [11]<br />

• U+0009<br />

• U+000A<br />

• U+000D<br />

• U+0020–U+D7FF<br />

• U+E000–U+FFFD<br />

• U+10000–U+10FFFF<br />
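The ranges listed above translate directly into a membership test. The following is a minimal sketch in Python; the function names is_xml10_char and first_invalid_index are hypothetical helpers chosen for this example.<br />

```python
# Code point ranges permitted in XML 1.0 documents, as listed above.
XML10_RANGES = [
    (0x0009, 0x0009),
    (0x000A, 0x000A),
    (0x000D, 0x000D),
    (0x0020, 0xD7FF),
    (0xE000, 0xFFFD),
    (0x10000, 0x10FFFF),
]


def is_xml10_char(ch: str) -> bool:
    """Return True if the single character ch may appear in an XML 1.0 document."""
    cp = ord(ch)
    return any(low <= cp <= high for low, high in XML10_RANGES)


def first_invalid_index(text: str) -> int:
    """Return the index of the first disallowed character, or -1 if none."""
    for i, ch in enumerate(text):
        if not is_xml10_char(ch):
            return i
    return -1


print(is_xml10_char("\t"))             # tab is allowed
print(is_xml10_char("\x0b"))           # vertical tab is not
print(first_invalid_index("ok\x01"))   # position of the control character
```

A check of this kind is typically applied before serializing arbitrary application strings, since a single disallowed control character makes the whole document ill-formed.<br />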

Unicode characters in the following code point ranges are restricted in <strong>XML</strong> 1.1 documents and may appear only as numeric character references: [12]<br />

• U+0001–U+0008<br />

• U+000B–U+000C<br />

• U+000E–U+001F<br />

• U+007F–U+0084<br />

• U+0086–U+009F



The preceding code points fall within the following ranges, which together contain every character permitted in<br />

<strong>XML</strong> 1.1 documents:<br />

• U+0001–U+D7FF<br />

• U+E000–U+FFFD<br />

• U+10000–U+10FFFF<br />

Encoding detection<br />

The Unicode character set can be encoded into bytes for storage or transmission in a variety of different ways, called<br />

"encodings". Unicode itself defines encodings which cover the entire repertoire; well-known ones include UTF-8<br />

and UTF-16. [13] There are many other text encodings which pre-date Unicode, such as ASCII and ISO/IEC 8859;<br />

their character repertoires in almost every case are subsets of the Unicode character set.<br />

<strong>XML</strong> allows the use of any of the Unicode-defined encodings, and any other encodings whose characters also appear<br />

in Unicode. <strong>XML</strong> also provides a mechanism whereby an <strong>XML</strong> processor can reliably, without any prior knowledge,<br />

determine which encoding is being used. [14] Encodings other than UTF-8 and UTF-16 will not necessarily be<br />

recognized by every <strong>XML</strong> parser.<br />
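The detection mechanism can be approximated by inspecting the first few bytes of a document. The sketch below is a simplified illustration, not the full procedure from the specification: it checks for a byte order mark, then for the byte pattern of "&lt;?" in UTF-16, and finally reads the encoding name from the declaration. The function name sniff_xml_encoding is hypothetical, and cases such as UTF-32 without a byte order mark or EBCDIC are deliberately left out.<br />

```python
import re

# Byte order marks. UTF-32 comes first because the UTF-32LE mark begins
# with the same two bytes as the UTF-16LE mark.
_BOMS = [
    (b"\x00\x00\xfe\xff", "utf-32-be"),
    (b"\xff\xfe\x00\x00", "utf-32-le"),
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xfe\xff", "utf-16-be"),
    (b"\xff\xfe", "utf-16-le"),
]

# Encoding pseudo-attribute inside an XML declaration (ASCII-compatible input).
_DECL = re.compile(
    rb"<\?xml[^>]*?encoding\s*=\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']"
)


def sniff_xml_encoding(data: bytes) -> str:
    """Guess the encoding of an XML document from its leading bytes."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # No byte order mark: the bytes of "<?" still reveal 16-bit encodings.
    if data.startswith(b"\x00<\x00?"):
        return "utf-16-be"
    if data.startswith(b"<\x00?\x00"):
        return "utf-16-le"
    # ASCII-compatible encoding: trust the declaration if it names one.
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    # With no other information, UTF-8 is the default.
    return "utf-8"


print(sniff_xml_encoding(b'<?xml version="1.0" encoding="ISO-8859-1"?><a/>'))
print(sniff_xml_encoding(b"\xef\xbb\xbf<a/>"))
```

The approach works because every document that carries a declaration starts with the same known characters, so their byte pattern identifies the encoding family before any real decoding takes place.<br />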

Escaping<br />

There are several reasons why it may be difficult or impossible to include some character directly in an <strong>XML</strong><br />

document.<br />

• The characters "&lt;" and "&amp;" are key syntax markers and may never appear in content outside a CDATA section.



Comments<br />

Comments may appear anywhere in a document outside other markup. A comment must not precede the <strong>XML</strong><br />

declaration, which, when present, has to be the very first thing in the document. The string "--" (double-hyphen) is not<br />

allowed inside a comment (it is reserved for the comment delimiters), and entity references are not recognized within comments.<br />

An example of a valid comment: "&lt;!-- no need to escape &lt;code&gt; &amp; such in comments --&gt;"<br />

International use<br />

<strong>XML</strong> supports the direct use of almost any Unicode character in element names, attributes, comments, character<br />

data, and processing instructions (other than the ones that have special symbolic meaning in <strong>XML</strong> itself, such as the<br />

open corner bracket, "&lt;").



DTD<br />

The oldest schema language for <strong>XML</strong> is the Document Type Definition (DTD), inherited from SGML.<br />

DTDs have the following benefits:<br />

• DTD support is ubiquitous due to its inclusion in the <strong>XML</strong> 1.0 standard.<br />

• DTDs are terse compared to element-based schema languages and consequently present more information in a<br />

single screen.<br />

• DTDs allow the declaration of standard public entity sets for publishing characters.<br />

• DTDs define a document type rather than the types used by a namespace, thus grouping all constraints for a<br />

document in a single collection.<br />

DTDs have the following limitations:<br />

• They have no explicit support for newer features of <strong>XML</strong>, most importantly namespaces.<br />

• They lack expressiveness. <strong>XML</strong> DTDs are simpler than SGML DTDs and there are certain structures that cannot<br />

be expressed with regular grammars. DTDs only support rudimentary datatypes.<br />

• They lack readability. DTD designers typically make heavy use of parameter entities (which behave essentially as<br />

textual macros), which make it easier to define complex grammars, but at the expense of clarity.<br />

• They use a syntax based on regular expression syntax, inherited from SGML, to describe the schema. Typical<br />

<strong>XML</strong> APIs such as SAX do not attempt to offer applications a structured representation of the syntax, so it is less<br />

accessible to programmers than an element-based syntax may be.<br />

Two peculiar features that distinguish DTDs from other schema types are the syntactic support for embedding a<br />

DTD within <strong>XML</strong> documents and for defining entities, which are arbitrary fragments of text and/or markup that the<br />

<strong>XML</strong> processor inserts in the DTD itself and in the <strong>XML</strong> document wherever they are referenced, like character<br />

escapes.<br />
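The two features mentioned above, a DTD embedded in the document and an entity defined by it, can be seen together in a short sketch. The document and the entity name publisher are invented for this example. Python's xml.etree.ElementTree is used here only because its underlying parser expands entities declared in the internal DTD subset; it does not validate the document against the DTD.<br />

```python
import xml.etree.ElementTree as ET

# A document with an embedded (internal) DTD that declares one element type
# and one general entity. All names here are made up for illustration.
DOC = """<!DOCTYPE note [
  <!ELEMENT note (#PCDATA)>
  <!ENTITY publisher "Example Press">
]>
<note>Published by &publisher;.</note>"""

# The parser replaces the entity reference with its declared text.
# Note: no validation against the ELEMENT declaration takes place.
root = ET.fromstring(DOC)
expanded_text = root.text

print(expanded_text)
```

Because entity expansion happens inside the parser, documents from untrusted sources that define their own entities are usually handled with extra care.<br />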

DTD technology is still used in many applications because of its ubiquity.<br />

<strong>XML</strong> Schema<br />

A newer schema language, described by the W3C as the successor of DTDs, is <strong>XML</strong> Schema, often referred to by<br />

the initialism for <strong>XML</strong> Schema instances, XSD (<strong>XML</strong> Schema Definition). XSDs are far more powerful than DTDs<br />

in describing <strong>XML</strong> languages. They use a rich datatyping system and allow for more detailed constraints on an <strong>XML</strong><br />

document's logical structure. XSDs also use an <strong>XML</strong>-based format, which makes it possible to use ordinary <strong>XML</strong><br />

tools to help process them.<br />

RELAX NG<br />

RELAX NG was initially specified by OASIS and is now also an ISO international standard (as part of DSDL).<br />

RELAX NG schemas may be written in either an <strong>XML</strong> based syntax or a more compact non-<strong>XML</strong> syntax; the two<br />

syntaxes are isomorphic and James Clark's Trang conversion tool can convert between them without loss of<br />

information. RELAX NG has a simpler definition and validation framework than <strong>XML</strong> Schema, making it easier to<br />

use and implement. It also has the ability to use datatype framework plug-ins; a RELAX NG schema author, for<br />

example, can require values in an <strong>XML</strong> document to conform to definitions in <strong>XML</strong> Schema Datatypes.



Schematron<br />

Schematron is a language for making assertions about the presence or absence of patterns in an <strong>XML</strong> document. It<br />

typically uses XPath expressions.<br />

ISO DSDL and other schema languages<br />

The ISO DSDL (Document Schema Description <strong>Language</strong>s) standard brings together a comprehensive set of small<br />

schema languages, each targeted at specific problems. DSDL includes RELAX NG full and compact syntax,<br />

Schematron assertion language, and languages for defining datatypes, character repertoire constraints, renaming and<br />

entity expansion, and namespace-based routing of document fragments to different validators. DSDL schema<br />

languages do not have the vendor support of <strong>XML</strong> Schemas yet, and are to some extent a grassroots reaction of<br />

industrial publishers to the lack of utility of <strong>XML</strong> Schemas for publishing.<br />

Some schema languages not only describe the structure of a particular <strong>XML</strong> format but also offer limited facilities to<br />

influence processing of individual <strong>XML</strong> files that conform to this format. DTDs and XSDs both have this ability;<br />

they can for instance provide the infoset augmentation facility and attribute defaults. RELAX NG and Schematron<br />

intentionally do not provide these.<br />

Related specifications<br />

A cluster of specifications closely related to <strong>XML</strong> have been developed, starting soon after the initial publication of<br />

<strong>XML</strong> 1.0. It is frequently the case that the term "<strong>XML</strong>" is used to refer to <strong>XML</strong> together with one or more of these<br />

other technologies which have come to be seen as part of the <strong>XML</strong> core.<br />

• <strong>XML</strong> Namespaces enable the same document to contain <strong>XML</strong> elements and attributes taken from different<br />

vocabularies, without any naming collisions occurring. Essentially all software which is advertised as supporting<br />

<strong>XML</strong> also supports <strong>XML</strong> Namespaces.<br />

• <strong>XML</strong> Base defines the xml:base attribute, which may be used to set the base for resolution of relative URI<br />

references within the scope of a single <strong>XML</strong> element.<br />

• The <strong>XML</strong> Information Set or <strong>XML</strong> infoset describes an abstract data model for <strong>XML</strong> documents in terms of<br />

information items. The infoset is commonly used in the specifications of <strong>XML</strong> languages, for convenience in<br />

describing constraints on the <strong>XML</strong> constructs those languages allow.<br />

• xml:id Version 1.0 asserts that an attribute named xml:id functions as an "ID attribute" in the sense used in a<br />

DTD.<br />

• XPath defines a syntax named XPath expressions which identifies one or more of the internal components<br />

(elements, attributes, and so on) included in an <strong>XML</strong> document. XPath is widely used in other core-<strong>XML</strong><br />

specifications and in programming libraries for accessing <strong>XML</strong>-encoded data.<br />

• XSLT is a language with an <strong>XML</strong>-based syntax that is used to transform <strong>XML</strong> documents into other <strong>XML</strong><br />

documents, HTML, or other, unstructured formats such as plain text or RTF. XSLT is very tightly coupled with<br />

XPath, which it uses to address components of the input <strong>XML</strong> document, mainly elements and attributes.<br />

• XSL Formatting Objects, or XSL-FO, is a markup language for <strong>XML</strong> document formatting which is most often<br />

used to generate PDFs.<br />

• XQuery is an <strong>XML</strong>-oriented query language strongly rooted in XPath and <strong>XML</strong> Schema. It provides methods to<br />

access, manipulate and return <strong>XML</strong>.<br />

• <strong>XML</strong> Signature defines syntax and processing rules for creating digital signatures on <strong>XML</strong> content.<br />

• <strong>XML</strong> Encryption defines syntax and processing rules for encrypting <strong>XML</strong> content.<br />
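Two of the specifications in this list, XML Namespaces and XPath, can be illustrated together. The sketch below is an assumption-laden example: the catalog vocabulary and its namespace URI are hypothetical, and Python's xml.etree.ElementTree implements only a limited subset of XPath, which is nevertheless enough to show the idea.<br />

```python
import xml.etree.ElementTree as ET

# One document mixing two vocabularies: a hypothetical catalog vocabulary
# (default namespace) and Dublin Core elements (dc prefix).
DOC = """<catalog xmlns="http://example.org/catalog"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <item id="a1"><dc:title>First</dc:title></item>
  <item id="b2"><dc:title>Second</dc:title></item>
</catalog>"""

# Prefixes used in the path expressions below; they need not match the
# prefixes used in the document, only the namespace names must agree.
NS = {
    "c": "http://example.org/catalog",
    "dc": "http://purl.org/dc/elements/1.1/",
}

root = ET.fromstring(DOC)

# ElementTree stores names in "{namespace}local-name" form.
root_tag = root.tag

# Path expression selecting every title inside an item.
titles = [t.text for t in root.findall("c:item/dc:title", NS)]

# Path expression with an attribute predicate.
second_title = root.find("c:item[@id='b2']/dc:title", NS).text

print(root_tag)
print(titles)
print(second_title)
```

The namespace name, not the prefix, identifies the vocabulary, which is why the query can use the prefix c even though the document relies on a default namespace.<br />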

Some other specifications conceived as part of the "<strong>XML</strong> Core" have failed to find wide adoption, including<br />

XInclude, XLink, and XPointer.



Use on the Internet<br />

It is common for <strong>XML</strong> to be used in interchanging data over the Internet. RFC 3023 gives rules for the construction<br />

of Internet Media Types for use when sending <strong>XML</strong>. It also defines the types "application/xml" and "text/xml",<br />

which say only that the data is in <strong>XML</strong>, and nothing about its semantics. The use of "text/xml" has been criticized as a<br />

potential source of encoding problems and is now in the process of being deprecated. [18] RFC 3023 also<br />

recommends that <strong>XML</strong>-based languages be given media types ending in "+xml"; for<br />

example "image/svg+xml" for SVG.<br />
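A receiver can use these conventions to decide whether a payload should be handed to an XML processor. The following sketch assumes a hypothetical helper, is_xml_media_type, and accepts the "+xml" suffix under any top-level type; media type parameters such as charset are ignored.<br />

```python
def is_xml_media_type(media_type: str) -> bool:
    """Return True if the media type denotes XML under the RFC 3023 conventions."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    essence = media_type.split(";", 1)[0].strip().lower()
    # The two generic XML types.
    if essence in ("application/xml", "text/xml"):
        return True
    # Types for specific XML-based languages follow the "+xml" suffix convention.
    return "/" in essence and essence.endswith("+xml")


print(is_xml_media_type("application/atom+xml"))
print(is_xml_media_type("text/xml; charset=utf-8"))
print(is_xml_media_type("application/json"))
```

The suffix check lets generic tools process formats they have never heard of, as long as the format's media type follows the convention.<br />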

Further guidelines for the use of <strong>XML</strong> in a networked context may be found in RFC 3470, also known as IETF BCP<br />

70; this document is very wide-ranging and covers many aspects of designing and deploying an <strong>XML</strong>-based<br />

language.<br />

Programming interfaces<br />

The design goals of <strong>XML</strong> include "It shall be easy to write programs which process <strong>XML</strong> documents." [8] Despite<br />

this fact, the <strong>XML</strong> specification contains almost no information about how programmers might go about doing such<br />

processing. The <strong>XML</strong> Infoset provides a vocabulary to refer to the constructs within an <strong>XML</strong> document, but once<br />

again does not provide any guidance on how to access this information. A variety of APIs for accessing <strong>XML</strong> have<br />

been developed and used, and some have been standardized.<br />

Existing APIs for <strong>XML</strong> processing tend to fall into these categories:<br />

• Stream-oriented APIs accessible from a programming language, for example SAX and StAX.<br />

• Tree-traversal APIs accessible from a programming language, for example DOM.<br />

• <strong>XML</strong> data binding, which provides an automated translation between an <strong>XML</strong> document and<br />

programming-language objects.<br />

• Declarative transformation languages such as XSLT and XQuery.<br />

Stream-oriented facilities require less memory and, for certain tasks which are based on a linear traversal of an <strong>XML</strong><br />

document, are faster and simpler than other alternatives. Tree-traversal and data-binding APIs typically require the<br />

use of much more memory, but are often found more convenient for use by programmers; some include declarative<br />

retrieval of document components via the use of XPath expressions.<br />

XSLT is designed for declarative description of <strong>XML</strong> document transformations, and has been widely implemented<br />

both in server-side packages and Web browsers. XQuery overlaps XSLT in its functionality, but is designed more for<br />

searching of large <strong>XML</strong> databases.<br />

Simple API for <strong>XML</strong> (SAX)<br />

SAX is a lexical, event-driven interface in which a document is read serially and its contents are reported as<br />

callbacks to various methods on a handler object of the user's design. SAX is fast and efficient to implement, but<br />

difficult to use for extracting information at random from the <strong>XML</strong>, since it tends to burden the application author<br />

with keeping track of what part of the document is being processed. It is better suited to situations in which certain<br />

types of information are always handled the same way, no matter where they occur in the document.<br />
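The callback style can be sketched with Python's standard xml.sax module. The handler class DateCollector and the sample document are invented for this example. The handler has to remember for itself whether it is currently inside a date element, which is the bookkeeping burden described above.<br />

```python
import xml.sax


class DateCollector(xml.sax.ContentHandler):
    """Collects the text of every <date> element, wherever it occurs."""

    def __init__(self):
        super().__init__()
        self.dates = []
        self._buffer = None  # None means "not inside a <date> element"

    def startElement(self, name, attrs):
        if name == "date":
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so pieces are accumulated.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "date" and self._buffer is not None:
            self.dates.append("".join(self._buffer))
            self._buffer = None


DOC = b"<caption>painted in <date>1511</date>-<date>1512</date>.</caption>"

handler = DateCollector()
xml.sax.parseString(DOC, handler)
print(handler.dates)
```

The parser never builds a tree; it only reports events, so memory use stays small regardless of document size.<br />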

Pull parsing<br />

Pull parsing [19] treats the document as a series of items which are read in sequence using the Iterator design pattern.<br />

This allows for writing of recursive-descent parsers in which the structure of the code performing the parsing mirrors<br />

the structure of the <strong>XML</strong> being parsed, and intermediate parsed results can be used and accessed as local variables<br />

within the methods performing the parsing, or passed down (as method parameters) into lower-level methods, or<br />

returned (as method return values) to higher-level methods. Examples of pull parsers include StAX in the Java<br />

programming language, Simple<strong>XML</strong> in PHP and System.Xml.XmlReader in the .NET Framework.



A pull parser creates an iterator that sequentially visits the various elements, attributes, and data in an <strong>XML</strong><br />

document. Code which uses this iterator can test the current item (to tell, for example, whether it is a start or end<br />

element, or text), and inspect its attributes (local name, namespace, values of <strong>XML</strong> attributes, value of text, etc.), and<br />

can also move the iterator to the next item. The code can thus extract information from the document as it traverses<br />

it. The recursive-descent approach tends to lend itself to keeping data as typed local variables in the code doing the<br />

parsing, while SAX, for instance, typically requires a parser to manually maintain intermediate data within a stack of<br />

elements which are parent elements of the element being parsed. Pull-parsing code can be more straightforward to<br />

understand and maintain than SAX parsing code.<br />
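For comparison with SAX, the same kind of extraction can be written in pull style. The sketch uses Python's standard xml.dom.pulldom module, whose event stream acts as the iterator; the functions read_text and read_dates are hypothetical helpers, and the code assumes that date elements contain only text.<br />

```python
from xml.dom import pulldom

DOC = "<caption>painted in <date>1511</date>-<date>1512</date>.</caption>"


def read_text(events):
    """Advance the iterator to the end of the current element; return its text."""
    parts = []
    for event, node in events:
        if event == pulldom.CHARACTERS:
            parts.append(node.data)
        elif event == pulldom.END_ELEMENT:
            break  # assumes the element has no child elements
    return "".join(parts)


def read_dates(xml_text):
    """Pull items one at a time and collect the text of each <date> element."""
    events = pulldom.parseString(xml_text)
    dates = []  # intermediate results live in an ordinary local variable
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == "date":
            # Hand the same iterator to a lower-level function.
            dates.append(read_text(events))
    return dates


print(read_dates(DOC))
```

The calling code decides when to advance the iterator, so the structure of the functions can follow the structure of the document instead of being split across callbacks.<br />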

Document Object Model (DOM)<br />

DOM (Document Object Model) is an interface-oriented Application Programming Interface that allows for<br />

navigation of the entire document as if it were a tree of "Node" objects representing the document's contents. A<br />

DOM document can be created by a parser, or can be generated manually by users (with limitations). Data types in<br />

DOM Nodes are abstract; implementations provide their own programming language-specific bindings. DOM<br />

implementations tend to be memory intensive, as they generally require the entire document to be loaded into<br />

memory and constructed as a tree of objects before access is allowed.<br />
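A brief sketch of tree navigation and manual construction, using Python's standard xml.dom.minidom implementation of the DOM interfaces. The sample document and the added note element are invented for this example.<br />

```python
from xml.dom import minidom

DOC = (
    "<painting>"
    "<img src='madonna.jpg' alt='Foligno Madonna'/>"
    "<caption>Painted in <date>1511</date></caption>"
    "</painting>"
)

# The whole document is loaded into memory as a tree of Node objects.
dom = minidom.parseString(DOC)
root = dom.documentElement

# Navigate: find an element and read one of its attributes.
img = root.getElementsByTagName("img")[0]
src = img.getAttribute("src")

# Text is held in a separate Text node below the element.
date_text = root.getElementsByTagName("date")[0].firstChild.data

# Generate content manually: create nodes and attach them to the tree.
note = dom.createElement("note")
note.appendChild(dom.createTextNode("added via DOM"))
root.appendChild(note)

child_names = [
    child.tagName for child in root.childNodes
    if child.nodeType == child.ELEMENT_NODE
]

print(src, date_text, child_names)
```

Random access of this kind is what the tree model buys at the cost of holding every node in memory.<br />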

Data binding<br />

Another form of <strong>XML</strong> processing API is <strong>XML</strong> data binding, where <strong>XML</strong> data is made available as a hierarchy of<br />

custom, strongly typed classes, in contrast to the generic objects created by a Document Object Model parser. This<br />

approach simplifies code development, and in many cases allows problems to be identified at compile time rather<br />

than run-time. Example data binding systems include the Java Architecture for <strong>XML</strong> Binding (JAXB), <strong>XML</strong><br />

Serialization in .NET, [20] [21] [22]<br />

and CodeSynthesis XSD for C++.<br />
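The idea behind data binding can be shown with a hand-written sketch. Tools such as those named above typically generate the classes and the mapping code from a schema; here the classes Image and Painting and the function bind_painting are written by hand, purely as an assumption-based illustration of what the application ends up working with.<br />

```python
from dataclasses import dataclass
from typing import List
import xml.etree.ElementTree as ET


@dataclass
class Image:
    src: str
    alt: str


@dataclass
class Painting:
    image: Image
    years: List[int]


def bind_painting(xml_text: str) -> Painting:
    """Map a <painting> document onto typed objects."""
    root = ET.fromstring(xml_text)
    img = root.find("img")
    image = Image(src=img.get("src"), alt=img.get("alt"))
    # Values are converted to their declared types during binding.
    years = [int(date.text) for date in root.iter("date")]
    return Painting(image=image, years=years)


painting = bind_painting(
    "<painting><img src='m.jpg' alt='A'/>"
    "<caption><date>1511</date><date>1512</date></caption></painting>"
)
print(painting.image.src, painting.years)
```

Application code then reads painting.years as a list of integers instead of walking generic nodes and converting strings at each use.<br />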

<strong>XML</strong> as data type<br />

<strong>XML</strong> is beginning to appear as a first-class data type in other languages. The ECMAScript for <strong>XML</strong> (E4X)<br />

extension to the ECMAScript/JavaScript language explicitly defines two specific objects (<strong>XML</strong> and <strong>XML</strong>List) for<br />

JavaScript, which support <strong>XML</strong> document nodes and <strong>XML</strong> document lists as distinct objects and use a dot-notation<br />

specifying parent-child relationships. E4X is supported by the Mozilla 2.5+ browsers and Adobe ActionScript, but<br />

has not been adopted more universally. Similar notations are used in Microsoft's LINQ implementation for Microsoft<br />

.NET 3.5 and above, and in Scala (which uses the Java VM). The open-source xmlsh application, which provides a<br />

Linux-like shell with special features for <strong>XML</strong> manipulation, similarly treats <strong>XML</strong> as a data type, using the &lt;[ ]&gt;<br />

notation. [23] The Resource Description Framework defines a data type rdf:<strong>XML</strong>Literal to hold wrapped, canonical<br />

<strong>XML</strong>. [24]<br />

History<br />

<strong>XML</strong> is an application profile of SGML (ISO 8879). [25]<br />

The versatility of SGML for dynamic information display was understood by early digital media publishers in the<br />

late 1980s prior to the rise of the Internet. [26] [27] By the mid-1990s some practitioners of SGML had gained<br />

experience with the then-new World Wide Web, and believed that SGML offered solutions to some of the problems<br />

the Web was likely to face as it grew. Dan Connolly added SGML to the list of W3C's activities when he joined the<br />

staff in 1995; work began in mid-1996 when Sun Microsystems engineer Jon Bosak developed a charter and<br />

recruited collaborators. Bosak was well connected in the small community of people who had experience both in<br />

SGML and the Web. [28]<br />

<strong>XML</strong> was compiled by a working group of eleven members, [29] supported by an (approximately) 150-member<br />

Interest Group. Technical debate took place on the Interest Group mailing list and issues were resolved by consensus



or, when that failed, majority vote of the Working Group. A record of design decisions and their rationales was<br />

compiled by Michael Sperberg-McQueen on December 4, 1997. [30] James Clark served as Technical Lead of the<br />

Working Group, notably contributing the empty-element "&lt;empty/&gt;" syntax and the name "<strong>XML</strong>". Other names that<br />

had been put forward for consideration included "MAGMA" (Minimal Architecture for Generalized <strong>Markup</strong><br />

Applications), "SLIM" (Structured <strong>Language</strong> for Internet <strong>Markup</strong>) and "MGML" (Minimal Generalized <strong>Markup</strong><br />

<strong>Language</strong>). The co-editors of the specification were originally Tim Bray and Michael Sperberg-McQueen. Halfway<br />

through the project Bray accepted a consulting engagement with Netscape, provoking vociferous protests from<br />

Microsoft. Bray was temporarily asked to resign the editorship. This led to intense dispute in the Working Group,<br />

eventually solved by the appointment of Microsoft's Jean Paoli as a third co-editor.<br />

The <strong>XML</strong> Working Group never met face-to-face; the design was accomplished using a combination of email and<br />

weekly teleconferences. The major design decisions were reached in twenty weeks of intense work between July and<br />

November 1996, when the first Working Draft of an <strong>XML</strong> specification was published. [31] Further design work<br />

continued through 1997, and <strong>XML</strong> 1.0 became a W3C Recommendation on February 10, 1998.<br />

Sources<br />

<strong>XML</strong> is a profile of the ISO standard SGML, and most of <strong>XML</strong> comes from SGML unchanged. From SGML comes<br />

the separation of logical and physical structures (elements and entities), the availability of grammar-based validation<br />

(DTDs), the separation of data and metadata (elements and attributes), mixed content, the separation of processing<br />

from representation (processing instructions), and the default angle-bracket syntax. Removed was the SGML<br />

Declaration (<strong>XML</strong> has a fixed delimiter set and adopts Unicode as the document character set).<br />

Other sources of technology for <strong>XML</strong> were the Text Encoding Initiative (TEI), which defined a profile of SGML for<br />

use as a 'transfer syntax'; HTML, in which elements were synchronous with their resource, the separation of<br />

document character set from resource encoding, the xml:lang attribute, and the HTTP notion that metadata<br />

accompanied the resource rather than being needed at the declaration of a link. The Extended Reference Concrete<br />

Syntax (ERCS) project of the SPREAD (Standardization Project Regarding East Asian Documents) project of the<br />

ISO-related China/Japan/Korea Document Processing expert group was the basis of <strong>XML</strong> 1.0's naming rules;<br />

SPREAD also introduced hexadecimal numeric character references and the concept of references to make available<br />

all Unicode characters. To support ERCS, <strong>XML</strong> and HTML better, the SGML standard IS 8879 was revised in 1996<br />

and 1998 with WebSGML Adaptations. The <strong>XML</strong> header followed that of ISO HyTime.<br />

Ideas that developed during discussion which were novel in <strong>XML</strong> included the algorithm for encoding detection and<br />

the encoding header, the processing instruction target, the xml:space attribute, and the new close delimiter for<br />

empty-element tags. The notion of well-formedness as opposed to validity (which enables parsing without a schema)<br />

was first formalized in <strong>XML</strong>, although it had been implemented successfully in the Electronic Book Technology<br />

"Dynatext" software [32] ; the software from the University of Waterloo New Oxford English Dictionary Project; the<br />

RISP LISP SGML text processor at Uniscope, Tokyo; the US Army Missile Command IADS hypertext system;<br />

Mentor Graphics Context; Interleaf and Xerox Publishing System.<br />

Versions<br />

There are two current versions of <strong>XML</strong>. The first (<strong>XML</strong> 1.0) was initially defined in 1998. It has undergone minor<br />

revisions since then, without being given a new version number, and is currently in its fifth edition, as published on<br />

November 26, 2008. It is widely implemented and still recommended for general use.<br />

The second (<strong>XML</strong> 1.1) was initially published on February 4, 2004, the same day as <strong>XML</strong> 1.0 Third Edition [33] , and<br />

is currently in its second edition, as published on August 16, 2006. It contains features (some contentious) that are<br />

intended to make <strong>XML</strong> easier to use in certain cases [34] . The main changes are to enable the use of line-ending<br />

characters used on EBCDIC platforms, and the use of scripts and characters absent from Unicode 3.2. <strong>XML</strong> 1.1 is<br />

not very widely implemented and is recommended for use only by those who need its unique features. [35]



Prior to its fifth edition release, <strong>XML</strong> 1.0 differed from <strong>XML</strong> 1.1 in having stricter requirements for characters<br />

available for use in element and attribute names and unique identifiers: in the first four editions of <strong>XML</strong> 1.0 the<br />

characters were exclusively enumerated using a specific version of the Unicode standard (Unicode 2.0 to Unicode<br />

3.2). The fifth edition substitutes the mechanism of <strong>XML</strong> 1.1, which is more future-proof but reduces redundancy.<br />

The approach taken in the fifth edition of <strong>XML</strong> 1.0 and in all editions of <strong>XML</strong> 1.1 is that only certain characters are<br />

forbidden in names, and everything else is allowed, in order to accommodate the use of suitable name characters in<br />

future versions of Unicode. In the fifth edition, <strong>XML</strong> names may contain characters in the Balinese, Cham, or<br />

Phoenician scripts among many others which have been added to Unicode since Unicode 3.2. [36]<br />

Almost any Unicode code point can be used in the character data and attribute values of an <strong>XML</strong> 1.0 or 1.1<br />

document, even if the character corresponding to the code point is not defined in the current version of Unicode. In<br />

character data and attribute values, <strong>XML</strong> 1.1 allows the use of more control characters than <strong>XML</strong> 1.0, but, for<br />

"robustness", most of the control characters introduced in <strong>XML</strong> 1.1 must be expressed as numeric character<br />

references (and #x7F through #x9F, which had been allowed in <strong>XML</strong> 1.0, are in <strong>XML</strong> 1.1 even required to be<br />

expressed as numeric character references [37] ). Among the supported control characters in <strong>XML</strong> 1.1 are two line<br />

break codes that must be treated as whitespace. Whitespace characters are the only control codes that can be written<br />

directly.<br />

There has been discussion of an <strong>XML</strong> 2.0, although no organization has announced plans for work on such a project.<br />

<strong>XML</strong>-SW (SW for skunk works), written by one of the original developers of <strong>XML</strong>, contains some proposals for<br />

what an <strong>XML</strong> 2.0 might look like: elimination of DTDs from syntax, integration of namespaces, <strong>XML</strong> Base and<br />

<strong>XML</strong> Information Set (infoset) into the base standard.<br />

The World Wide Web Consortium also has an <strong>XML</strong> Binary Characterization Working Group doing preliminary<br />

research into use cases and properties for a binary encoding of the <strong>XML</strong> infoset. The working group is not chartered<br />

to produce any official standards. Since <strong>XML</strong> is by definition text-based, ITU-T and ISO are using the name Fast<br />

Infoset for their own binary infoset to avoid confusion (see ITU-T Rec. X.891 | ISO/IEC 24824-1).<br />

See also<br />

• Category:<strong>XML</strong><br />

• Binary <strong>XML</strong><br />

• <strong>XML</strong> Protocol<br />

• List of <strong>XML</strong> markup languages<br />

• Category:<strong>XML</strong>-based standards<br />

• Comparison of layout engines (<strong>XML</strong>)<br />

• Comparison of data serialization formats<br />

• OpenDocument<br />

Further reading<br />

• Annex A of ISO 8879:1986 (SGML)<br />

• Lawrence A. Cunningham (2005). "<strong>Language</strong>, Deals and Standards: The Future of <strong>XML</strong> Contracts". Washington<br />

University Law Review. SSRN 900616 [38] .<br />

• Bosak, Jon; Tim Bray (May 1999). "<strong>XML</strong> and the Second-Generation Web". Scientific American. Online at <strong>XML</strong><br />

and the Second-Generation Web [39] .



External links<br />

• W3C <strong>XML</strong> homepage [40]<br />

• <strong>XML</strong> 1.0 Specification [41]<br />

• Introduction to Generalized <strong>Markup</strong> [42] by Charles Goldfarb<br />

• Making Mistakes with <strong>XML</strong> [43] by Sean Kelly<br />

• The Multilingual WWW [44] by Gavin Nicol<br />

• Retrospective on Extended Reference Concrete Syntax [45] by Rick Jelliffe<br />

• <strong>XML</strong>, Java and the Future of the Web [46] by Jon Bosak<br />

• <strong>XML</strong> tutorials in w3schools [47]<br />

• <strong>XML</strong>.gov [48]<br />

• Thinking <strong>XML</strong>: The <strong>XML</strong> decade [49] by Uche Ogbuji<br />

• <strong>XML</strong>: Ten year anniversary [50] by Eliot Kimber<br />

• Five years later, <strong>XML</strong>... [51] by Simon St. Laurent<br />

• 23 <strong>XML</strong> fallacies to watch out for [52] by Sean McGrath<br />

• <strong>XML</strong> Injection [53] - Web Application Security Consortium<br />

• W3C <strong>XML</strong> is Ten! [54] , <strong>XML</strong> 10 years press release<br />

References<br />

[1] "<strong>XML</strong> Media Types, RFC 3023" (http://tools.ietf.org/html/rfc3023#section-3.2). IETF. 2001-01. pp. 9–11. Retrieved 2010-01-04.<br />

[2] "<strong>XML</strong> Media Types, RFC 3023" (http://tools.ietf.org/html/rfc3023#section-3.1). IETF. 2001-01. pp. 7–9. Retrieved 2010-01-04.<br />

[3] http://www.w3.org/TR/2008/REC-xml-20081126/<br />

[4] http://www.w3.org/TR/2006/REC-xml11-20060816/<br />

[5] http://www.w3.org/TR/rec-xml<br />

[6] <strong>XML</strong> 1.0 Specification (http://www.w3.org/TR/REC-xml)<br />

[7] "W3C DOCUMENT LICENSE" (http://www.w3.org/Consortium/Legal/2002/copyright-documents-20021231).<br />

[8] "<strong>XML</strong> 1.0 Origin and Goals" (http://www.w3.org/TR/REC-xml/#sec-origin-goals). Retrieved July 2009.<br />

[9] "<strong>XML</strong> Applications and Initiatives" (http://xml.coverpages.org/xmlApplications.html).<br />

[10] "Introduction to iWork Programming Guide. Mac OS X Reference Library" (http://developer.apple.com/mac/library/documentation/AppleApplications/Conceptual/iWork2-0_XML/Chapter01/Introduction.html). Apple.<br />

[11] http://www.w3.org/TR/2006/REC-xml-20060816/#charsets<br />

[12] http://www.w3.org/TR/xml11/#charsets<br />

[13] "Characters vs. Bytes" (http://www.tbray.org/ongoing/When/200x/2003/04/26/UTF).<br />

[14] "Autodetection of Character Encodings" (http://www.w3.org/TR/REC-xml/#sec-guessing).<br />

[15] It is allowed, but not recommended, to use "



[27] Sueann Ambron and Kristina Hooper, eds.; foreword by John Sculley (1988). "Publishers, multimedia, and interactivity". Interactive multimedia. Cobb Group. ISBN 1-55615-124-1.<br />

[28] Eliot Kimber (2006). "<strong>XML</strong> is 10" (http://drmacros-xml-rants.blogspot.com/#116460437782808906).<br />

[29] The working group was originally called the "Editorial Review Board." The original members, and seven who were added before the first edition was complete, are listed at the end of the first edition of the <strong>XML</strong> Recommendation, at http://www.w3.org/TR/1998/REC-xml-19980210.<br />

[30] "Reports From the W3C SGML ERB to the SGML WG And from the W3C <strong>XML</strong> ERB to the <strong>XML</strong> SIG" (http://www.w3.org/XML/9712-reports.html). W3.org. Retrieved 2009-07-31.<br />

[31] "Extensible <strong>Markup</strong> <strong>Language</strong> (<strong>XML</strong>)" (http://www.w3.org/TR/WD-xml-961114.html). W3.org. 1996-11-14. Retrieved 2009-07-31.<br />

[32] Jon Bosak, Sun Microsystems (2006-12-07). "Closing Keynote, <strong>XML</strong> 2006" (http://2006.xmlconference.org/proceedings/162/presentation.html). 2006.xmlconference.org. Retrieved 2009-07-31.<br />

[33] Extensible <strong>Markup</strong> <strong>Language</strong> (<strong>XML</strong>) 1.0 (Third Edition) (http://www.w3.org/TR/2004/REC-xml-20040204)<br />

[34] "Extensible <strong>Markup</strong> <strong>Language</strong> (<strong>XML</strong>) 1.1 (Second Edition) – Rationale and list of changes for <strong>XML</strong> 1.1" (http://www.w3.org/TR/xml11/#sec-xml11). W3C. Retrieved 2006-12-21.<br />

[35] Harold, Elliotte Rusty (2004). Effective <strong>XML</strong> (http://www.cafeconleche.org/books/effectivexml/). Addison-Wesley. pp. 10–19. ISBN 0321150406.<br />

[36] "Extensible <strong>Markup</strong> <strong>Language</strong> (<strong>XML</strong>) 1.1 (Second Edition) – Rationale and list of changes for <strong>XML</strong> 1.1" (http://www.w3.org/TR/xml11/#dt-name). W3C. Retrieved 2009-12-11.<br />

[37] http://www.w3.org/TR/xml11/#sec-xml11<br />

[38] http://ssrn.com/abstract=900616<br />

[39] http://www.scientificamerican.com/article.cfm?id=xml-and-the-second-genera<br />

[40] http://www.w3.org/XML/<br />

[41] http://www.w3.org/TR/REC-xml<br />

[42] http://www.sgmlsource.com/history/AnnexA.htm<br />

[43] http://www.developer.com/xml/article.php/10929_3583081_1<br />

[44] http://www.mind-to-mind.com/library/papers/multilingual/multilingual-www.html<br />

[45] http://xml.ascc.net/en/utf-8/ercsretro.html<br />

[46] http://www.xml.com/pub/a/w3j/s3.bosak.html<br />

[47] http://www.w3schools.com/xml/default.asp<br />

[48] http://xml.gov/<br />

[49] http://www-128.ibm.com/developerworks/library/x-think38.html<br />

[50] http://drmacros-xml-rants.blogspot.com/2006/11/xml-ten-year-aniversary.html<br />

[51] http://www.oreillynet.com/xml/blog/2003/02/five_years_later_xml.html<br />

[52] http://www.itworld.com/xml-fallacies-nlstipsm-080122<br />

[53] http://projects.webappsec.org/XML-Injection<br />

[54] http://www.w3.org/2008/02/xml10-pressrelease



<strong>XML</strong> and MIME<br />

<strong>XML</strong><br />

An <strong>XML</strong> document is a text document that consists of an optional <strong>XML</strong> declaration followed by a single root<br />

element with well-formed content.<br />
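As a rough illustration of that definition, the following Java sketch asks the JDK's built-in parser whether a string is a well-formed document. The class name and the element names root and item are made up for the example and do not come from the article.<br />

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {
    /** Returns true if the text parses as a well-formed XML document. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors, so failures surface as
            // exceptions instead of messages printed by the parser.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the document
        } catch (Exception e) {
            throw new IllegalStateException(e); // configuration or I/O problem
        }
    }

    public static void main(String[] args) {
        // Hypothetical documents: a declaration plus one root element,
        // and a variant with a missing end tag.
        String good = "<?xml version=\"1.0\"?><root><item>Blah</item></root>";
        String bad = "<root><item>Blah</root>";
        System.out.println(isWellFormed(good));
        System.out.println(isWellFormed(bad));
    }
}
```

The check covers well-formedness only; it does not validate the document against a DTD or schema.<br />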

Example <strong>XML</strong> Document<br />

Blah<br />

MIME<br />

MIME (Multipurpose Internet Mail Extensions) is an Internet Standard that allows email systems to interpret<br />

complex data. Web browsers also use the MIME type to accurately display information or launch a separate<br />

application to handle the data.<br />

All MIME types (formally, Internet media types) consist of two parts, written in the form type/subtype.<br />

A web server sends this information to the browser in the Content-Type header. Usually, the server determines the MIME type based on the<br />

document's file extension. For example, the server would interpret an extension of .txt (plain text file) to have a<br />

MIME type of text/plain.<br />
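The extension lookup and the type/subtype split can be sketched in a few lines of Java. The extension table below is a made-up stand-in for a server's configuration, and the class and method names are assumptions for this example only.<br />

```java
import java.util.Locale;
import java.util.Map;

class MediaTypes {
    // Hypothetical table; a real server reads this mapping from its configuration.
    private static final Map<String, String> BY_EXTENSION = Map.of(
        "txt", "text/plain",
        "html", "text/html",
        "xml", "application/xml");

    /** Picks a media type from the file extension, with a generic fallback. */
    static String forFileName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        String ext = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.getOrDefault(ext, "application/octet-stream");
    }

    /** Splits "type/subtype" into its two parts, ignoring any ";parameter" suffix. */
    static String[] split(String mediaType) {
        String bare = mediaType.split(";", 2)[0].trim();
        int slash = bare.indexOf('/');
        if (slash <= 0 || slash == bare.length() - 1) {
            throw new IllegalArgumentException("not type/subtype: " + mediaType);
        }
        return new String[] { bare.substring(0, slash), bare.substring(slash + 1) };
    }

    public static void main(String[] args) {
        String type = forFileName("notes.txt");
        String[] parts = split(type);
        System.out.println(type + " = " + parts[0] + " + " + parts[1]);
    }
}
```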

<strong>XML</strong> Specific MIME Types<br />

There are two MIME assignments for <strong>XML</strong> data. These are:<br />

• application/xml (RFC 3023)<br />

• text/xml (RFC 3023)<br />

Because of the wide variety of documents that can be expressed using an <strong>XML</strong> syntax, additional MIME types are<br />

needed to differentiate between languages. <strong>XML</strong>-based formats add a suffix of +xml to the MIME type.<br />

The following are some examples of common <strong>XML</strong> media types.<br />

• Registered<br />

• Extensible HyperText <strong>Markup</strong> <strong>Language</strong> (XHTML): application/xhtml+xml (RFC 3236)<br />

• Atom: application/atom+xml (RFC 4287)<br />

• Registration-In-Progress<br />

• Extensible Stylesheet <strong>Language</strong> Transformations (XSLT): application/xslt+xml [1]<br />

• Scalable Vector Graphics (SVG): image/svg+xml [2]<br />

• Unregistered<br />

• Mathematical <strong>Markup</strong> <strong>Language</strong> (MathML): application/mathml+xml<br />

• Really Simple Syndication (RSS 2.0): application/rss+xml
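
One practical consequence of the +xml suffix is that software can recognize an XML-based format without knowing the specific language. The Java sketch below shows one way to write that test; the class and method names are hypothetical.<br />

```java
import java.util.Locale;

class XmlMediaTypes {
    /**
     * Returns true for the two generic XML types and for any type
     * whose subtype carries the +xml suffix.
     */
    static boolean isXml(String mediaType) {
        // Drop parameters such as "; charset=utf-8" before comparing.
        String bare = mediaType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return bare.equals("application/xml")
            || bare.equals("text/xml")
            || bare.endsWith("+xml");
    }

    public static void main(String[] args) {
        for (String type : new String[] {
                "application/atom+xml", "image/svg+xml", "text/xml", "text/plain" }) {
            System.out.println(type + " -> " + isXml(type));
        }
    }
}
```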



External links<br />

• Official List of MIME Types [3]<br />

• IBM article [4]<br />

References<br />

[1] http://www.w3.org/TR/xslt20/#xslt-mime-definition<br />

[2] http://www.w3.org/TR/SVGMobile12/mimereg.html<br />

[3] http://www.iana.org/assignments/media-types/<br />

[4] http://www-128.ibm.com/developerworks/xml/library/x-mxd2.html<br />

<strong>XML</strong> appliance<br />

An <strong>XML</strong> appliance is a separate computer system with deliberately narrow functionality that exchanges <strong>XML</strong><br />

messages with other computer systems. <strong>XML</strong> appliances secure, accelerate and route <strong>XML</strong> messages, typically in<br />

support of messaging and service-oriented architectures (SOAs). They are designed<br />

specifically to be easy to install, configure and manage. Some <strong>XML</strong> appliances rely on specialized<br />

hardware and software to accelerate the processing of <strong>XML</strong> messages, while others accomplish the same tasks using<br />

standards-based hardware and operating systems.<br />

History of <strong>XML</strong> appliances<br />

The first <strong>XML</strong> appliances were created by DataPower in 1999 and by Sarvega and Forum Systems in 2001. Their<br />

engineers generally fell into two groups: those focused on large volumes of <strong>XML</strong> transformations and those<br />

focused on high-speed <strong>XML</strong> processing and security. The transformation group created specialized<br />

software or application-specific integrated circuits (ASICs) that performed transformations up to 100 times faster than basic<br />

software-only solutions. Early adoption of these systems was restricted to large<br />

e-commerce sites such as Yahoo! and Amazon. The <strong>XML</strong> processing group created highly optimized appliances that<br />

secured and integrated <strong>XML</strong> across many use cases. Early entrants in <strong>XML</strong> appliances include vendors such as<br />

DataPower (now owned by IBM), Reactivity, Inc. (acquired by Cisco), Forum Systems, Layer 7 Technologies,<br />

Vordel, and Sarvega (now owned by Intel).<br />

These two approaches began to converge when a second generation of <strong>XML</strong> appliances started to appear around<br />

2003, when these devices were used to exchange SOAP <strong>XML</strong> messages between computers on public networks.<br />

These messages required advanced security features such as encryption, digital signatures and denial of service<br />

attack prevention. Because the setup and configuration of software-only systems was time consuming, companies<br />

could save a great deal of money by using appliances that were pre-packaged with WS-Security standards built in.



Common features of <strong>XML</strong> appliances<br />

• They can validate <strong>XML</strong> messages for well-formedness as they enter or exit the appliance.<br />

• They include hardware and/or software customized for efficient <strong>XML</strong> parsing and analysis.<br />

• They have built-in support for many <strong>XML</strong> standards such as XSLT, XPath, SOAP and WS-Security.<br />

Classification of <strong>XML</strong> appliances<br />

Although <strong>XML</strong> appliance is the most general term for these devices, most vendors use alternative<br />

terminology that describes more specific functionality. The following are alternative names used for<br />

<strong>XML</strong> appliances:<br />

• <strong>XML</strong> accelerators are devices that typically use custom hardware, or software built on standards-based<br />

hardware, to accelerate XPath processing. They typically increase the number of messages that can be<br />

processed per second by a factor of 10 to 100.<br />

• Integration appliances (also known as application routers) are devices designed to make the integration<br />

of computer systems easier.<br />

• <strong>XML</strong> security gateways (also known as <strong>XML</strong> firewalls) are devices that support the WS-Security standards.<br />

These appliances typically offload encryption and decryption to specialized hardware devices.<br />

• <strong>XML</strong> Enabled Networking is an abstraction layer that exists alongside the traditional IP network. This layer<br />

addresses the security, incompatibility and latency issues encumbering <strong>XML</strong> messages, web services and<br />

service-oriented architectures (SOAs).<br />

Notable <strong>XML</strong> appliance vendors<br />

• Bloombase<br />

• Citrix Systems (through acquisition of QuickTree [1])<br />

• DataPower (now owned by IBM), see IBM WebSphere DataPower SOA Appliances<br />

• F5 Networks<br />

• Radware<br />

• Solace Systems<br />

• Xtradyne<br />

• Cisco<br />

See also<br />

• <strong>XML</strong><br />

• XSLT<br />

• SOAP<br />

• <strong>XML</strong> Enabled Networking<br />

• WS-Security<br />

• Apache Axis<br />

• Integration appliance



References<br />

[1] http://community.citrix.com/display/ocb/2008/11/14/XML+Security+Features+in+Netscaler+9.0<br />

<strong>XML</strong> Base<br />

<strong>XML</strong> Base is a World Wide Web Consortium recommended facility for defining base URIs for parts of <strong>XML</strong><br />

documents.<br />

The <strong>XML</strong> Base recommendation was adopted on 2001-06-27.<br />

The attribute xml:base may be inserted in <strong>XML</strong> documents to specify a base URI other than the base URI of the<br />

document or external entity. The value of this attribute is interpreted as a URI reference as defined in RFC 3986<br />

[IETF RFC 3986]. It serves the function described in section 5.1.1 of RFC 3986, establishing the base URI (or IRI)<br />

for resolving any relative references found within the effective scope of the xml:base attribute.<br />

In namespace-aware <strong>XML</strong> processors, the "xml" prefix is bound to the namespace name<br />

http://www.w3.org/XML/1998/namespace, as described in Namespaces in <strong>XML</strong> [<strong>XML</strong> Names]. Note that xml:base can still<br />

be used by non-namespace-aware processors.<br />
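
The effect of xml:base on relative references can be sketched with java.net.URI, which implements standard relative-reference resolution. The URIs, class name and method name below are invented for the example; the sketch only assumes that an xml:base value, which may itself be relative, is resolved against the enclosing base before it is used.<br />

```java
import java.net.URI;

class XmlBaseDemo {
    /** Resolves a reference found inside the scope of an xml:base attribute. */
    static URI resolve(URI documentBase, String xmlBase, String reference) {
        // Step 1: the xml:base value is resolved against the enclosing base URI.
        URI effectiveBase = documentBase.resolve(xmlBase);
        // Step 2: relative references in scope are resolved against the result.
        return effectiveBase.resolve(reference);
    }

    public static void main(String[] args) {
        // Hypothetical document location.
        URI doc = URI.create("http://example.org/today/index.xml");
        // Absolute-path xml:base replaces the whole path of the document base.
        System.out.println(resolve(doc, "/hotpicks/", "pick1.xml"));
        // Relative xml:base extends the directory of the document base.
        System.out.println(resolve(doc, "archive/", "pick2.xml"));
    }
}
```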

External links<br />

• <strong>XML</strong> Base W3C Recommendation [1]<br />

References<br />

[1] http://www.w3.org/TR/xmlbase/



<strong>XML</strong> Catalog<br />

<strong>XML</strong> documents typically refer to external entities, for example the public and/or system ID for the Document Type<br />

Definition. These external relationships are expressed using URIs, typically as URLs.<br />

However, if they are absolute URLs, they only work when your network can reach them. Relying on remote<br />

resources makes <strong>XML</strong> processing susceptible to both planned and unplanned network downtime.<br />

Conversely, if they are relative URLs, they're only useful in the context where they were initially created. For<br />

example, the URL "../../xml/dtd/docbookx.xml" will usually only be useful in very limited circumstances.<br />

One way to avoid these problems is to use an entity resolver (a standard part of SAX) or a URI Resolver (a standard<br />

part of JAXP). A resolver can examine the URIs of the resources being requested and determine how best to satisfy<br />

those requests. The <strong>XML</strong> catalog is a document describing a mapping between external entity references and<br />

locally-cached equivalents.<br />

Example Catalog.xml<br />

The following simple catalog shows how one might provide locally-cached DTDs for an XHTML page validation<br />

tool, for example.<br />


This catalog makes it possible to resolve -//W3C//DTD XHTML 1.0 Strict//EN to the local URI<br />

dtd/xhtml1/xhtml1-strict.dtd. Similarly, it provides local URIs for two other public IDs.<br />
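
To make the mapping concrete, here is a minimal Java sketch that writes a one-entry catalog to a temporary file and looks up a public ID in it. It uses the javax.xml.catalog API that ships with JDK 9 and later, which is a different implementation from the Apache resolver discussed in this article. The catalog contains only the single mapping named above and omits the DOCTYPE; the class and method names are assumptions for the example.<br />

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.Catalog;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;

class CatalogLookupDemo {
    static final String XHTML_STRICT = "-//W3C//DTD XHTML 1.0 Strict//EN";

    /** Returns the URI the sketch catalog maps the public ID to, or null if none. */
    static String lookup(String publicId) {
        // One-entry catalog for illustration; not the article's original listing.
        String catalogXml =
              "<?xml version=\"1.0\"?>\n"
            + "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\"\n"
            + "         prefer=\"public\">\n"
            + "  <public publicId=\"" + XHTML_STRICT + "\"\n"
            + "          uri=\"dtd/xhtml1/xhtml1-strict.dtd\"/>\n"
            + "</catalog>\n";
        try {
            Path file = Files.createTempFile("catalog", ".xml");
            try {
                Files.write(file, catalogXml.getBytes(StandardCharsets.UTF_8));
                Catalog catalog =
                    CatalogManager.catalog(CatalogFeatures.defaults(), file.toUri());
                // The relative uri attribute is interpreted relative to the catalog file.
                return catalog.matchPublic(publicId);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (java.io.IOException e) {
            throw new java.io.UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(lookup(XHTML_STRICT));
    }
}
```

Because the uri attribute is relative, the value returned depends on where the catalog file lives, which is why catalogs are usually installed next to the cached DTDs.<br />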

Note that the document above includes a DOCTYPE - this may cause the parser to attempt to access the system ID<br />

URL for the DOCTYPE (i.e. http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd) before the<br />

catalog resolver is fully functioning, which is probably undesirable. To prevent this, simply remove the DOCTYPE<br />

declaration.<br />

The following example shows this, and also shows the equivalent system declarations as an alternative to public<br />

declarations.



Using a Catalog - Java SAX Example<br />

Catalog resolvers are available for various programming languages. The following example shows how, in Java, a<br />

SAX parser may be created to parse some input source in which the<br />

org.apache.xml.resolver.tools.CatalogResolver is used to resolve external entities to<br />

locally-cached instances. This resolver originates from Apache Xerces but is now included with the Sun Java<br />

runtime.<br />

Simply create a SAXParser in the normal way, using factories. Obtain the <strong>XML</strong> reader and set the entity resolver<br />

to the standard one (CatalogResolver) or another of your own.<br />

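import javax.xml.parsers.SAXParser;<br />

import javax.xml.parsers.SAXParserFactory;<br />

import org.apache.xml.resolver.tools.CatalogResolver;<br />

import org.xml.sax.ContentHandler;<br />

import org.xml.sax.InputSource;<br />

import org.xml.sax.XMLReader;<br />

final SAXParser saxParser =<br />

SAXParserFactory.newInstance().newSAXParser();<br />

final XMLReader reader = saxParser.getXMLReader();<br />

final ContentHandler handler = ...;<br />

final InputSource input = ...;<br />

// Resolve external entities through the catalog before parsing.<br />

reader.setEntityResolver( new CatalogResolver() );<br />

reader.setContentHandler( handler );<br />

reader.parse( input );<br />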

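It is important to call the parse method on the reader, not on the SAX parser: SAXParser.parse installs the handler passed to it as the entity resolver, which would replace the CatalogResolver set above.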
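
For the "another of your own" case, a resolver can be written directly against the SAX EntityResolver interface. The sketch below is an illustration only: the public ID, the DTD text and the class name are invented, and the DTD is held in memory rather than in a local file so that the example is self-contained.<br />

```java
import java.io.StringReader;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class LocalDtdResolver implements EntityResolver {
    private final Map<String, String> dtdTextByPublicId;
    private int resolved = 0;

    LocalDtdResolver(Map<String, String> dtdTextByPublicId) {
        this.dtdTextByPublicId = dtdTextByPublicId;
    }

    int resolvedCount() {
        return resolved;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String dtdText = publicId == null ? null : dtdTextByPublicId.get(publicId);
        if (dtdText == null) {
            return null; // unknown entity: the parser falls back to the system ID
        }
        resolved++;
        InputSource source = new InputSource(new StringReader(dtdText));
        source.setPublicId(publicId);
        source.setSystemId(systemId);
        return source;
    }

    /** Parses a small document and returns how many entities were served locally. */
    static int demo() {
        // Hypothetical public ID; the system ID is deliberately unreachable.
        String publicId = "-//EXAMPLE//DTD Note//EN";
        String doc = "<!DOCTYPE note PUBLIC \"" + publicId
            + "\" \"http://example.invalid/note.dtd\"><note>hello</note>";
        LocalDtdResolver resolver =
            new LocalDtdResolver(Map.of(publicId, "<!ELEMENT note (#PCDATA)>"));
        try {
            XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setEntityResolver(resolver);
            reader.setContentHandler(new DefaultHandler());
            reader.parse(new InputSource(new StringReader(doc))); // parse on the reader
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return resolver.resolvedCount();
    }

    public static void main(String[] args) {
        System.out.println("entities resolved locally: " + demo());
    }
}
```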



See also<br />

• <strong>XML</strong> Catalogs. OASIS Standard, Version 1.1. 07-October-2005. [1]<br />

• <strong>XML</strong> Entity and URI Resolvers [2] , Sun<br />

• <strong>XML</strong> Catalog Manager [3] project on Sourceforge<br />

• <strong>XML</strong> Catalogs for .NET and Mono [4]<br />

References<br />

[1] http://www.oasis-open.org/committees/download.php/14810/xml-catalogs.pdf<br />

[2] http://java.sun.com/webservices/docs/1.6/jaxb/catalog.html<br />

[3] http://xmlcatmgr.sourceforge.net/<br />

[4] http://xmlcatalog.net/<br />

<strong>XML</strong> Certification Program<br />

<strong>XML</strong> Certification Program (<strong>XML</strong> Master) is an IT professional certification for <strong>XML</strong> and related technologies.<br />

There are two levels of <strong>XML</strong> certification, <strong>XML</strong> Master Basic and <strong>XML</strong> Master Professional,<br />

and more than 18,000 examinees have passed the examinations.<br />

Certification paths<br />

<strong>XML</strong> Master Professional Application Developer Certification<br />

• <strong>XML</strong> Master Professional Application Developer is a certification for professionals who have demonstrated the<br />

ability to use technology in developing applications that deal with <strong>XML</strong> data.<br />

<strong>XML</strong> Master Professional Application Developer Certification Requirements<br />

• Pass the <strong>XML</strong> Master Basic exam and the <strong>XML</strong> Master Professional Application Developer certification exam.<br />

<strong>XML</strong> Master Professional Application Developer Certification Exam<br />

• Duration => 90 minutes<br />

• Number of Questions => 45 questions<br />

• Required Passing Score => 70%<br />

<strong>XML</strong> Master Professional Application Developer Certification Exam Topics<br />

• Section 1 - DOM / SAX<br />

• Section 2 - DOM / SAX Programming<br />

• Section 3 - XSLT<br />

• Section 4 - <strong>XML</strong> Schema<br />

• Section 5 - <strong>XML</strong> Processing System Design Technology<br />

• Section 6 - Utilizing <strong>XML</strong>



<strong>XML</strong> Master Professional Database Administrator Certification<br />

• The <strong>XML</strong> Master Professional Database Administrator is a certification for professionals who have demonstrated<br />

the ability to use technology in XQuery and <strong>XML</strong>DB.<br />

<strong>XML</strong> Master Professional Database Administrator Certification Requirements<br />

• Pass the <strong>XML</strong> Master Basic exam and the <strong>XML</strong> Master Professional Database Administrator certification exam.<br />

<strong>XML</strong> Master Professional Database Administrator Certification Exam<br />

• Duration => 90 minutes<br />

• Number of Questions => 30 questions<br />

• Required Passing Score => 80%<br />

<strong>XML</strong> Master Professional Database Administrator Certification Exam Topics<br />

• Section 1 - Overview<br />

• Section 2 - XQuery, XPath<br />

• Section 3 - Manipulating <strong>XML</strong> Data<br />

• Section 4 - Creating <strong>XML</strong> Schema and Other <strong>XML</strong> Database Objects<br />

<strong>XML</strong> Master Basic Certification<br />

• <strong>XML</strong> Master Basic is a certification for professionals who have demonstrated the ability to use <strong>XML</strong> and related<br />

technologies.<br />

<strong>XML</strong> Master Basic Certification Requirements<br />

• Pass the <strong>XML</strong> Master Basic certification exam.<br />

<strong>XML</strong> Master Basic Certification Exam<br />

• Duration => 90 minutes<br />

• Number of Questions => 50 questions<br />

• Minimum Passing Score => 70%<br />

<strong>XML</strong> Master Basic Certification Exam Topics<br />

• Section 1 - <strong>XML</strong> Overview<br />

• Section 2 - Creating <strong>XML</strong> Documents<br />

• Section 3 - DTD<br />

• Section 4 - <strong>XML</strong> Schema<br />

• Section 5 - XSLT, XPath<br />

• Section 6 - Namespace



For Certification Exam Takers<br />

Exam Fee<br />

Each certification exam costs US$125.<br />

Exam Enrollment<br />

The <strong>XML</strong> Master exams are available daily at Prometric Authorized Testing Centers. To take an exam, schedule a<br />

day and time on the Prometric website [1].<br />

External links<br />

<strong>XML</strong> Certification Program (<strong>XML</strong> Master) official website<br />

• Introduction to <strong>XML</strong> Certification Program: <strong>XML</strong> Master [2]<br />

• <strong>XML</strong> Master Certification Practice Exam [3]<br />

• <strong>XML</strong> Master Certification Success Stories [4]<br />

<strong>XML</strong> Master Basic Certification Exam Preparation Links<br />

Section 1 - <strong>XML</strong> Overview<br />

• a. Overview of <strong>XML</strong><br />

• <strong>XML</strong> features [5]<br />

• Purpose of <strong>XML</strong> [6]<br />

• b. Overview of related <strong>XML</strong> technologies<br />

• Names for and overview of <strong>XML</strong>-related technologies defined by the W3C or other standards<br />

organizations (XPath, XLink, XQuery, XPointer, DOM, SAX, SOAP, XHTML, etc.) [7]<br />

• Names for and overview of applicable <strong>XML</strong> specifications defined according to industry or purpose by the<br />

W3C or other standards organizations [6]<br />

• Purpose of schema definition language defining <strong>XML</strong> structure [8]<br />

• Differences in defined content and functions of <strong>XML</strong> Schema and DTD [8]<br />

Section 2 - Creating <strong>XML</strong> Documents<br />

• a. Syntax<br />

• Naming rules, usable characters defined within an <strong>XML</strong> document [9]<br />

• Methods for coding <strong>XML</strong> documents utilizing tags [10]<br />

• Rules for coding declarations, elements, comments, character references, and processing instructions<br />

comprising an <strong>XML</strong> document [10]<br />

• Methods for coding character data and markups (tags, references, comments, etc.) comprising an <strong>XML</strong><br />

document [11]<br />

• The role of an <strong>XML</strong> processor (<strong>XML</strong> parser) [12]<br />

• b. Elements, attributes, entities<br />

• Coding elements that include attributes [13]<br />

• Types of entities [14]<br />

• Handling entities and references using an <strong>XML</strong> processor [15]<br />

• Usage of character references [16]<br />

• Usage of Predefined entities [17]<br />

• Method for referencing entities [17]<br />

• c. Valid <strong>XML</strong> documents, well-formed <strong>XML</strong> documents



• Well-formed <strong>XML</strong> document coding methods [18]<br />

• Coding methods to ensure valid <strong>XML</strong> documents [8]<br />

• Differences between valid <strong>XML</strong> documents and well-formed <strong>XML</strong> documents [19]<br />

• Creating valid <strong>XML</strong> documents for defined DTDs [19]<br />

• Creating valid <strong>XML</strong> documents for defined <strong>XML</strong> Schema [19]<br />

• d. Special characters/ character codes, encoding/ normalizing <strong>XML</strong> documents<br />

• Character references [16]<br />

• <strong>XML</strong> declarations and text declarations [20]<br />

• Handling white spaces [21]<br />

• End-of-line handling in <strong>XML</strong> documents [22]<br />

• Normalizing attribute values [23]<br />

Section 3 - DTD<br />

• a. Basics<br />

• Document type declarations [24]<br />

• Methods for coding DTD internal subsets and external subsets [24]<br />

• Differences between DTD internal subsets and external subsets [24]<br />

• Internal entities and external entities, Parsed entities and unparsed entities [24]<br />

• b. Content model/element type declarations/attribute-list declarations/actual processing/entity declarations<br />

• Element type declarations [25]<br />

• Content model definitions for elements [25]<br />

• Attribute-list declarations [26]<br />

• Attribute types [26]<br />

• Attribute defaults [26]<br />

• Entity declarations [27]<br />

Section 4 - <strong>XML</strong> Schema<br />

• a. Basics<br />

• <strong>XML</strong> Schema document structure [28]<br />

• <strong>XML</strong> Schema Namespace [28]<br />

• Mapping between <strong>XML</strong> documents and <strong>XML</strong> schema documents [29]<br />

• b. Data types/ coding methods/ actual processing<br />

• <strong>XML</strong> Schema embedded data types [30]<br />

• Simple type and complex type [31]<br />

• Type extensions and restrictions [29]<br />

• Element definitions [30]<br />

• Attribute definitions [30]<br />

Section 5 - XSLT, XPath<br />

• a. Basics<br />

• Purpose of XSLT [32]<br />

• Application use of XSLT [33]<br />

• XSLT stylesheet structure [34]<br />

• XSLT Namespace [34]<br />

• b. Elements/ templates/ character encoding/ actual transformation processing<br />

• Coding methods and related functions for well-known XSLT elements [35]<br />

• Template rules and templates [36]



• Pattern coding and matching patterns and nodes [36]<br />

• Output processing using XSLT [37]<br />

• c. Coding XPath expressions within a stylesheet<br />

• Basic operators [36]<br />

• Basic functions [36]<br />

• Basic coding methods for location paths (designating tree structure nodes) [36]<br />

Section 6 - Namespace<br />

• a. <strong>XML</strong> namespaces<br />

• <strong>XML</strong> namespace defined content [38]<br />

• Application use of <strong>XML</strong> namespace [38]<br />

• <strong>XML</strong> namespace coding methods [38]<br />

• <strong>XML</strong> namespace scope (effective scope) [39]<br />

External links<br />

• <strong>XML</strong> Master Trainings [40] (German)<br />

• <strong>XML</strong> Master Basic Training [41] (German)<br />

References<br />

[1] http://securereg3.prometric.com/<br />

[2] http://www.xmlmaster.org/en<br />

[3] http://www.xmlmaster.org/en/practice_exam/<br />

[4] http://www.xmlmaster.org/en/success/index.html<br />

[5] http://www.w3schools.com/xml/xml_whatis.asp<br />

[6] http://www.w3schools.com/xml/xml_usedfor.asp<br />

[7] http://www.xmlmaster.org/en/article/d01/c01/<br />

[8] http://www.w3schools.com/schema/schema_why.asp<br />

[9] http://www.w3schools.com/xml/xml_elements.asp<br />

[10] http://www.w3schools.com/xml/xml_syntax.asp<br />

[11] http://www.w3schools.com/xml/xml_cdata.asp<br />

[12] http://www.w3schools.com/dtd/dtd_validation.asp<br />

[13] http://www.w3schools.com/xml/xml_attributes.asp<br />

[14] http://www.w3.org/TR/2006/REC-xml-20060816/#sec-entity-decl<br />

[15] http://www.w3.org/TR/2006/REC-xml-20060816/#TextEntities<br />

[16] http://www.w3.org/TR/2004/REC-xml-20040204/#sec-entexpand<br />

[17] http://www.w3.org/TR/2006/REC-xml-20060816/#sec-references<br />

[18] http://www.xmlmaster.org/en/article/d01/c02/<br />

[19] http://www.w3schools.com/xml/xml_dtd.asp<br />

[20] http://www.w3schools.com/xml/xml_encoding.asp<br />

[21] http://www.w3.org/TR/2006/REC-xml-20060816/#sec-white-space<br />

[22] http://www.w3.org/TR/2006/REC-xml-20060816/#sec-line-ends<br />

[23] http://www.w3.org/TR/2006/REC-xml-20060816/<br />

[24] http://www.xmlmaster.org/en/article/d01/c03/<br />

[25] http://www.w3schools.com/dtd/dtd_elements.asp<br />

[26] http://www.w3schools.com/dtd/dtd_attributes.asp<br />

[27] http://www.w3schools.com/dtd/dtd_entities.asp<br />

[28] http://www.w3schools.com/schema/schema_schema.asp<br />

[29] http://www.xmlmaster.org/en/article/d01/c06/<br />

[30] http://www.xmlmaster.org/en/article/d01/c04/<br />

[31] http://www.xmlmaster.org/en/article/d01/c05/<br />

[32] http://www.w3schools.com/xsl/xsl_intro.asp<br />

[33] http://www.xmlmaster.org/en/article/d01/c07/<br />

[34] http://www.w3schools.com/xsl/xsl_transformation.asp



[35] http://www.w3schools.com/xsl/xsl_templates.asp<br />

[36] http://www.xmlmaster.org/en/article/d01/c08/<br />

[37] http://www.w3schools.com/xsl/el_output.asp<br />

[38] http://www.w3schools.com/xml/xml_namespaces.asp<br />

[39] http://www.xmlmaster.org/en/article/d01/c10/<br />

[40] http://www.digicomp.ch/xml<br />

[41] http://www.data2type.de/leistungen/schulungen/xmlmaster<br />

<strong>XML</strong> Configuration Access Protocol<br />

The <strong>XML</strong> Configuration Access Protocol (XCAP) is an application layer protocol that allows a client to read, write,<br />

and modify application configuration data stored in <strong>XML</strong> format on a server.<br />

Overview<br />

XCAP maps <strong>XML</strong> document sub-trees and element attributes to HTTP URIs, so that these components can be<br />

directly accessed by clients using HTTP. XCAP clients use an XCAP server to store data such as buddy<br />

lists and presence policy. Combined with a SIP Presence server that supports the PUBLISH, SUBSCRIBE and<br />

NOTIFY methods, it provides a complete SIP SIMPLE server solution.<br />

Features<br />

The following operations are supported via XCAP in a client-server interaction:<br />

• Retrieve an item<br />

• Delete an item<br />

• Modify an item<br />

• Add an item<br />

The operations above can be executed on the following items:<br />

• Document<br />

• Element<br />

• Attribute<br />

The XCAP addressing mechanism is based on XPath, which provides the ability to navigate around the <strong>XML</strong> tree.<br />
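
The addressing scheme and the four operations can be sketched in Java as follows. The layout (XCAP root, AUID, users tree, user identifier, document name, then a "~~" separator before the node selector) and the mapping of operations to GET, PUT and DELETE follow RFC 4825. The host name, user and list name are invented for the example, and the percent-encoding helper is a deliberate simplification that escapes only the characters appearing here.<br />

```java
class XcapRequest {
    /** The four XCAP operations and the HTTP method each one uses. */
    enum Operation {
        RETRIEVE("GET"), ADD("PUT"), MODIFY("PUT"), DELETE("DELETE");

        final String httpMethod;

        Operation(String httpMethod) {
            this.httpMethod = httpMethod;
        }
    }

    /** Builds the URI of a user's document: root/auid/users/xui/document. */
    static String documentUri(String xcapRoot, String auid, String xui, String documentName) {
        return xcapRoot + "/" + auid + "/users/" + xui + "/" + documentName;
    }

    /** Appends the "~~" separator and an XPath-like node selector to a document URI. */
    static String nodeUri(String documentUri, String nodeSelector) {
        return documentUri + "/~~" + encode(nodeSelector);
    }

    // Simplification: escapes only the characters used in this example.
    // A real client needs complete RFC 3986 percent-encoding.
    private static String encode(String selector) {
        return selector.replace("[", "%5B").replace("]", "%5D").replace("\"", "%22");
    }

    public static void main(String[] args) {
        // Hypothetical server, user and list name.
        String doc = documentUri("http://xcap.example.com", "resource-lists",
                                 "sip:joe@example.com", "index");
        String element = nodeUri(doc, "/resource-lists/list[@name=\"friends\"]");
        System.out.println(Operation.RETRIEVE.httpMethod + " " + element);
        System.out.println(Operation.DELETE.httpMethod + " " + doc);
    }
}
```

Adding and modifying an item both use PUT, because the client sends the complete new value of the selected document, element or attribute.<br />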

Application Usages<br />

The following application usages are defined for XCAP, each identified by a specific AUID (Application Unique ID):<br />

• XCAP capabilities (auid = xcap-caps).<br />

• Resource lists (auid = resource-lists). A resource lists application is any application that needs access to a list of<br />

resources, identified by a URI, to which operations, such as subscriptions, can be applied.<br />

• Presence rules (auid = pres-rules, org.openmobilealliance.pres-rules). A Presence Rules application is an<br />

application which uses authorization policies, also known as authorization rules, to specify what presence<br />

information can be given to which watchers, and when.<br />

• RLS services (auid = rls-services). A Resource List Server (RLS) services application is a Session Initiation<br />

Protocol (SIP) application in which a server receives SIP SUBSCRIBE requests for a resource and generates<br />

subscriptions towards the resource list.<br />

• PIDF manipulation (auid = pidf-manipulation). Pidf-manipulation application usage defines how XCAP is used<br />

to manipulate the contents of PIDF based presence documents.



Standards<br />

The XCAP protocol is based on the following IETF standards:<br />

RFC4825 [1] , RFC4826 [2] , RFC4827 [3] , RFC5025 [4]<br />

External links<br />

• XCAP Tutorial [5]<br />

• OpenXCAP [6]<br />

References<br />

[1] RFC4825 (http://www.ietf.org/rfc/rfc4825.txt)<br />

[2] RFC4826 (http://www.ietf.org/rfc/rfc4826.txt)<br />

[3] RFC4827 (http://www.ietf.org/rfc/rfc4827.txt)<br />

[4] RFC5025 (http://www.ietf.org/rfc/rfc5025.txt)<br />

[5] http://www.jdrosen.net/papers/xcap-tutorial.ppt<br />

[6] http://openxcap.org/<br />

<strong>XML</strong> Control Protocol<br />

<strong>XML</strong> Control Protocol, or XCP, was launched as an April Fools' Day joke on April 1, 2004. It was pitched as a<br />

drop-in replacement for TCP with the slogan "Light the Fiber!". The web site put up for the occasion now seems to<br />

be owned by a link farm.<br />

External links<br />

• TCP is So Over by Tim Bray [1]<br />

• Former XCP home page [2]<br />

References<br />

[1] http://www.tbray.org/ongoing/When/200x/2004/04/01/XCP<br />

[2] http://www.x-cp.org/



<strong>XML</strong> data binding<br />

<strong>XML</strong> data binding refers to the process of representing the information in an <strong>XML</strong> document as an object in<br />

computer memory. This allows applications to access the data in the <strong>XML</strong> from the object rather than using the<br />

DOM or SAX to retrieve the data from a direct representation of the <strong>XML</strong> itself.<br />

An <strong>XML</strong> data binder accomplishes this by automatically creating a mapping between elements of the <strong>XML</strong> schema<br />

of the document we wish to bind and members of a class to be represented in memory.<br />

When this process is applied to convert an <strong>XML</strong> document to an object, it is called unmarshalling. The reverse<br />

process, to serialize an object as <strong>XML</strong>, is called marshalling.<br />
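As a rough illustration of the two directions, the sketch below writes the mapping by hand for one class using Python's standard `xml.etree.ElementTree`. A real data binder would generate this mapping from a schema; the `Person` class and the element names are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int

def unmarshal(xml_text):
    """XML document -> object."""
    root = ET.fromstring(xml_text)
    return Person(name=root.findtext("name"), age=int(root.findtext("age")))

def marshal(person):
    """Object -> XML document."""
    root = ET.Element("person")
    ET.SubElement(root, "name").text = person.name
    ET.SubElement(root, "age").text = str(person.age)
    return ET.tostring(root, encoding="unicode")

source = "<person><name>Ada</name><age>36</age></person>"
person = unmarshal(source)
round_tripped = marshal(person)
```

The application works with `person.name` and `person.age` directly and never touches the element tree.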

Since <strong>XML</strong> is inherently sequential and objects are (usually) not, <strong>XML</strong> data binding mappings often have difficulty<br />

preserving all the information in an <strong>XML</strong> document. Specifically, information like comments, <strong>XML</strong> entity<br />

references, and sibling order may fail to be preserved in the object representation created by the binding application.<br />

This is not always the case; sufficiently complex data binders are capable of preserving 100% of the information in<br />

an <strong>XML</strong> document.<br />
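The kind of loss described above can be seen in a small sketch. It assumes a naive binding into a Python dictionary (an illustrative choice, not how any particular binder works) and uses the standard `xml.etree.ElementTree` parser, which discards comments by default.

```python
import xml.etree.ElementTree as ET

doc = "<note><!-- draft --><to>Ann</to><body>Hi</body><to>Bob</to></note>"
root = ET.fromstring(doc)

# Naive binding: group children by tag name.
bound = {
    "to": [e.text for e in root.findall("to")],
    "body": root.findtext("body"),
}

# The comment did not survive parsing, and the bound object no longer
# records that <body> sat between the two <to> elements.
reserialized = ET.tostring(root, encoding="unicode")
```

A binder that must preserve everything would need to store comments and the original child sequence alongside the typed fields.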

Similarly, since objects in computer memory are not inherently sequential, and may include links to other objects<br />

(including self-referential links), <strong>XML</strong> data binding mappings often have difficulty preserving all the information<br />

about an object when it is marshalled to <strong>XML</strong>.<br />

An alternative approach to automatic data binding relies instead on hand-crafted XPath expressions that extract the<br />

data from <strong>XML</strong>. This approach has a number of benefits. First, the data binding code only needs proximate<br />

knowledge (e.g., topology, tag names, etc.) of the <strong>XML</strong> tree structure, which developers can determine by looking at<br />

the <strong>XML</strong> data; <strong>XML</strong> schemas are no longer mandatory. Furthermore, XPath allows the application to bind the<br />

relevant data items and filter out everything else, avoiding the unnecessary processing that would be required to<br />

completely unmarshall the entire <strong>XML</strong> document. The drawback of this approach is the lack of automation in<br />

implementing the object model and XPath expressions. Instead the application developers have to create these<br />

artifacts manually.<br />
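A minimal sketch of the hand-crafted approach, assuming a made-up order document. It uses the limited XPath subset supported by Python's standard `xml.etree.ElementTree`; only the data the application needs is bound, and everything else is ignored.

```python
import xml.etree.ElementTree as ET

doc = """<orders>
  <order id="1" status="open"><total>10.50</total><note>rush</note></order>
  <order id="2" status="closed"><total>99.00</total></order>
  <order id="3" status="open"><total>5.25</total></order>
</orders>"""

root = ET.fromstring(doc)

# Bind only the totals of open orders; notes and closed orders are never touched.
open_totals = [float(t.text) for t in root.findall("./order[@status='open']/total")]
```

No schema is involved: the path expression encodes just enough knowledge of the tree (tag names and one attribute) to reach the data.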

Data binding in general<br />

One of <strong>XML</strong> data binding's strengths is the ability to serialize and deserialize objects across programs, languages, and platforms.<br />

You can dump a time series of structured objects from a datalogger written in C on an embedded processor, bring it<br />

across the network to process in Perl and finally visualize it in Mathematica. The structure and the data remain<br />

consistent and coherent throughout the journey, and no custom formats or parsers are required. This is not unique to<br />

<strong>XML</strong>. YAML, for example, is emerging as a powerful data binding alternative to <strong>XML</strong>. JSON (which can be<br />

regarded as a subset of YAML) is often suitable for lightweight or restricted applications.



External links<br />

• <strong>XML</strong> Data Binding Resources [1] , by Ronald Bourret<br />

• <strong>XML</strong> Schema Patterns for Databinding Working Group [2]<br />

See also<br />

• Bound control<br />

• Data structure<br />

• JSON<br />

• Serialization<br />

• YAML<br />

References<br />

[1] http://www.rpbourret.com/xml/<strong>XML</strong>DataBinding.htm<br />

[2] http://www.w3.org/2002/ws/databinding<br />

<strong>XML</strong> database<br />

An <strong>XML</strong> database is a data persistence software system that allows data to be stored in <strong>XML</strong> format. This data can<br />

then be queried, exported and serialized into the desired format.<br />

Two major classes of <strong>XML</strong> database exist:<br />

1. <strong>XML</strong>-enabled: these map all <strong>XML</strong> to a traditional database (such as a relational database [1] ), accepting <strong>XML</strong> as<br />

input and rendering <strong>XML</strong> as output. This term implies that the database does the conversion itself (as opposed to<br />

relying on middleware).<br />

2. Native <strong>XML</strong> (NXD): the internal model of such databases depends on <strong>XML</strong> and uses <strong>XML</strong> documents as the<br />

fundamental unit of storage, which are, however, not necessarily stored in the form of text files.<br />

Rationale for <strong>XML</strong> in databases<br />

O'Connell (2005, 9.2) gives one reason for the use of <strong>XML</strong> in databases: the increasingly common use of <strong>XML</strong> for<br />

data transport, which has meant that "data is extracted from databases and put into <strong>XML</strong> documents and vice-versa".<br />

It may prove more efficient (in terms of conversion costs) and easier to store the data in <strong>XML</strong> format.<br />

Native <strong>XML</strong> databases<br />

The term "native <strong>XML</strong> database" (NXD) can lead to confusion. Many NXDs do not function as standalone databases<br />

at all, and do not really store the native (text) form.<br />

The formal definition from the <strong>XML</strong>:DB initiative (which appears to be inactive since 2003 [2] ) states that a native<br />

<strong>XML</strong> database:<br />

• Defines a (logical) model for an <strong>XML</strong> document — as opposed to the data in that document — and stores and<br />

retrieves documents according to that model. At a minimum, the model must include elements, attributes,<br />

PCDATA, and document order. Examples of such models include the XPath data model, the <strong>XML</strong> Infoset, and<br />

the models implied by the DOM and the events in SAX 1.0.<br />

• Has an <strong>XML</strong> document as its fundamental unit of (logical) storage, just as a relational database has a row in a<br />

table as its fundamental unit of (logical) storage.



• Need not have any particular underlying physical storage model. For example, NXDs can use relational,<br />

hierarchical, or object-oriented database structures, or use a proprietary storage format (such as indexed,<br />

compressed files).<br />

Additionally, many <strong>XML</strong> databases provide a logical model of grouping documents, called "collections". Databases<br />

can set up and manage many collections at one time. In some implementations, a hierarchy of collections can exist,<br />

much in the same way that an operating system's directory-structure works.<br />

All <strong>XML</strong> databases now support at least one form of querying syntax. Minimally, just about all of them support<br />

XPath for performing queries against documents or collections of documents. XPath provides a simple pathing<br />

system that allows users to identify nodes that match a particular set of criteria.<br />

In addition to XPath, many <strong>XML</strong> databases support XSLT as a method of transforming documents or query-results<br />

retrieved from the database. XSLT provides a declarative language written using an <strong>XML</strong> grammar. An XSLT stylesheet defines<br />

template rules, selected by XPath patterns, that transform documents (in part or in whole) into other formats including plain text,<br />

<strong>XML</strong>, or HTML.<br />

Many <strong>XML</strong> databases also support XQuery to perform querying. XQuery includes XPath as a node-selection<br />

method, but extends XPath to provide transformational capabilities. Users sometimes refer to its syntax as<br />

"FLWOR" (pronounced 'Flower') because the flow may include the following statements: 'For', 'Let', 'Where', 'Order'<br />

and 'Return'. Traditional RDBMS vendors (which historically shipped SQL-only engines) are now shipping hybrid<br />

SQL and XQuery engines. Hybrid SQL/XQuery engines help to query <strong>XML</strong> data alongside the Relational data, in<br />

the same query expression. This approach helps in combining Relational and <strong>XML</strong> data.<br />
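The FLWOR clauses map naturally onto an ordinary loop. The sketch below is a Python analogue, not XQuery itself; the catalog document is invented for the example, and the XQuery shown in the comment is for comparison only and is not executed.

```python
import xml.etree.ElementTree as ET

catalog = ET.fromstring("""<catalog>
  <book><title>XQuery Basics</title><price>30</price></book>
  <book><title>Advanced XSLT</title><price>55</price></book>
  <book><title>XPath Pocket Guide</title><price>12</price></book>
</catalog>""")

# Roughly equivalent XQuery:
#   for $b in /catalog/book
#   let $p := number($b/price)
#   where $p < 50
#   order by $p
#   return $b/title/text()
rows = []
for b in catalog.findall("book"):            # for
    p = float(b.findtext("price"))           # let
    if p < 50:                               # where
        rows.append((p, b.findtext("title")))
titles = [title for _, title in sorted(rows)]  # order by, return
```

In an XML database the same expression would typically run over a stored collection rather than a single in-memory document.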

Some <strong>XML</strong> databases support an API called the <strong>XML</strong>:DB API (or XAPI) as a form of implementation-independent<br />

access to the <strong>XML</strong> datastore. In <strong>XML</strong> databases, XAPI resembles ODBC and JDBC as used with relational<br />

databases. On 24 June 2009, the Java Community Process released the final version of the XQuery API for<br />

Java specification (XQJ) [3] - "a common API that allows an application to submit queries conforming to the W3C<br />

XQuery 1.0 specification and to process the results of such queries".<br />

Databases known to support XQuery, XQJ, <strong>XML</strong>:DB, or a RESTful API<br />

<strong>XML</strong> Database | License | <strong>Language</strong> | XQJ API | <strong>XML</strong>:DB API | RESTful API | Transaction Support<br />

Apache XIndice (no longer maintained [4] ) | Open source | Java | No | Yes | No | No<br />

BaseX | Open source | Java | Yes | Yes | Yes | Yes<br />

Gemfire Enterprise | Commercial | Unknown | No | Yes | No | Yes<br />

DOMSafe<strong>XML</strong> | Commercial | Unknown | No | Yes | No | Yes<br />

eXist | Open source | Java | No | Yes | Yes | No<br />

MarkLogic Server | Commercial | C++ | No | No | Yes | Yes<br />

MonetDB/XQuery | Open source | C++ | No | Yes | No | No<br />

my<strong>XML</strong>DB | Open source | Java | No | Yes | No | Unknown<br />

OZONE | Open source | Java | No | Yes | No | Yes<br />

Sedna | Open source | C++ | Yes | Yes | No | Yes<br />

Tamino | Commercial | Unknown | No | Partial | No | Unknown<br />

TeXtML | Commercial | Unknown | Unknown | Unknown | No | Yes<br />

Xpriori XMS | Commercial | C++ | No | No | No | Yes<br />



Implementations<br />

• Apache Xindice [5] (previous name:dbxml)<br />

• BaseX [6] native, open-source <strong>XML</strong> Database developed at the University of Konstanz. Supports XQuery and Full<br />

Text [7] and Update [8] extensions.<br />

• BSn/NONMONOTONIC Lab: IB Search Engine [9] , embeddable <strong>XML</strong>++ search engine using a generic/abstract<br />

model and a mix of polymorphic objects types. Spin-off from the Isearch project.<br />

• Clusterpoint Storage Engine [10] , an <strong>XML</strong> storage engine geared towards high-volume applications and<br />

millisecond query times.<br />

• DB2 9 Express-C [11] , no-charge hybrid relational/<strong>XML</strong> data server with Pure<strong>XML</strong><br />

• EMC Documentum xDB [12] , a commercial native <strong>XML</strong> database including XQuery implementation, embeddable<br />

• eXist-db [13] , open-source native <strong>XML</strong> database, written in Java<br />

• Gemstone System's GemFire Enterprise [14] commercial <strong>XML</strong> database<br />

• MarkLogic Server [15] , a native <strong>XML</strong> database which uses XQuery.<br />

• M/DB:X [16] , a lightweight, REST-interfaced native <strong>XML</strong> database designed for use as a Cloud database.<br />

• MonetDB/XQuery [17] - XQuery processor on top of the MonetDB relational database system. Also supports<br />

W3C XQUF [8] updates. Open source.<br />

• Oracle <strong>XML</strong> DB [18] , <strong>XML</strong>-enabled (known as Oracle XDB as of Oracle 10g); despite its name, it does not support<br />

the <strong>XML</strong>:DB API.<br />

• Oracle Berkeley DB <strong>XML</strong> [19] , <strong>XML</strong> Enabled, embedded database; built on top of the Berkeley DB (a key-value<br />

database).<br />

• Sedna <strong>XML</strong> Database [20] , Open source <strong>XML</strong> database developed by MODIS [21] team at Institute for System<br />

Programming [22] . Supports XQuery, Updates, XQJ API, Transactions and Triggers<br />

• SQL Server 2005 [23] , Free Express Edition with full xml features<br />

• Tamino <strong>XML</strong> Server [24] , native <strong>XML</strong> database. Supports XQuery, XQuery Update, Transactions and Server<br />

Extensions.<br />

• TEXTML Server [25] , a native <strong>XML</strong> database combined with a full-text search engine.<br />

• TigerLogic XDMS [26] native <strong>XML</strong> Database<br />

• Timber [27] , a native <strong>XML</strong> database system developed at the University of Michigan<br />

• Qizx 3.0 [28] a native XQuery database engine written in Java (free & open source edition available)<br />

• XStreamDB [29] , native <strong>XML</strong> Database<br />

• Xpriori XMS [30] , XMS is a completely self constructing native <strong>XML</strong> database.<br />

External references<br />

• <strong>XML</strong> Databases - The Business Case, Charles Foster, June 2008 [31] - Talks about the current state of Databases<br />

and data persistence, how the current Relational Database model is starting to crack at the seams and gives an<br />

insight into a strong alternative for today's requirements.<br />

• An <strong>XML</strong>-based Database of Molecular Pathways (2005-06-02) [32] Speed / Performance comparisons of eXist,<br />

X-Hive, Sedna and Qizx/open<br />

• <strong>XML</strong> Native Database Systems: Review of Sedna, Ozone, NeoCoreXMS [33] 2006<br />

• <strong>XML</strong> Data Stores: Emerging Practices [34]<br />

• Bhargava, P.; Rajamani, H.; Thaker, S.; Agarwal, A. (2005) <strong>XML</strong> Enabled Relational Databases, Texas, The<br />

University of Texas at Austin.<br />

• O'Connell, S. Advanced Databases Course Notes, Southampton, University of Southampton, 2005<br />

• Initiative for <strong>XML</strong> Databases [35]<br />

• <strong>XML</strong> and Databases, Ronald Bourret, September 2005 [36]<br />

• <strong>XML</strong> Database Products, Ronald Bourret, 2000-2009 [37]



• The State of Native <strong>XML</strong> Databases, Elliotte Rusty Harold, August 13, 2007 [38]<br />

• <strong>XML</strong> for DB2 Information Integration [39] , an IBM Redbook that has a chapter on <strong>XML</strong> and databases (1st<br />

chapter).<br />

References<br />

[1] Mustafa Atay and Shiyong Lu, “Storing and Querying <strong>XML</strong>: An Efficient Approach Using Relational Databases”, ISBN 3639115813, VDM<br />

Verlag, 2009.<br />

[2] http://xmldb-org.sourceforge.net/faqs.html<br />

[3] http://jcp.org/en/jsr/detail?id=225<br />

[4] http://www.oreillynet.com/onjava/blog/2006/03/dont_be_misled_xindice_is_dead.html<br />

[5] http://xml.apache.org/xindice/<br />

[6] http://basex.org/<br />

[7] http://www.w3.org/TR/xpath-full-text-10/<br />

[8] http://www.w3.org/TR/xqupdate/<br />

[9] http://www.ibu.de/node/52<br />

[10] http://www.clusterpoint.com/<br />

[11] http://ibm.com/db2/viper/<br />

[12] http://www.emc.com/products/detail/software/documentum-xdb.htm<br />

[13] http://exist.sourceforge.net/<br />

[14] http://www.gemstone.com/products/gemfire/enterprise.php<br />

[15] http://www.marklogic.com/<br />

[16] http://www.mgateway.com/mdbx.html<br />

[17] http://monetdb.cwi.nl/XQuery/<br />

[18] http://www.oracle.com/technology/tech/xml/xmldb/index.html<br />

[19] http://www.oracle.com/database/berkeley-db/xml/index.html<br />

[20] http://modis.ispras.ru/sedna<br />

[21] http://modis.ispras.ru<br />

[22] http://ispras.ru<br />

[23] http://www.microsoft.com/sql/default.mspx<br />

[24] http://www.softwareag.com/corporate/products/wm/tamino/<br />

[25] http://www.ixiasoft.com/textmlserver<br />

[26] http://www.rainingdata.com/products/tl/index.html<br />

[27] http://www.eecs.umich.edu/db/timber/<br />

[28] http://www.xmlmind.com/qizx/<br />

[29] http://bluestream.com/products/xstreamdb32<br />

[30] http://www.xpriori.com<br />

[31] http://www.cfoster.net/articles/xmldb-business-case<br />

[32] http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-3717<br />

[33] http://swing.felk.cvut.cz/index.php?option=com_docman&task=doc_view&gid=5&Itemid=62<br />

[34] http://csdl2.computer.org/persagen/DLAbsToc.jsp?resourcePath=/dl/mags/ic/&toc=comp/mags/ic/2005/02/w2toc.xml&DOI=10.<br />

1109/MIC.2005.48<br />

[35] http://xmldb-org.sourceforge.net<br />

[36] http://www.rpbourret.com/xml/<strong>XML</strong>AndDatabases.htm<br />

[37] http://www.rpbourret.com/xml/<strong>XML</strong>DatabaseProds.htm<br />

[38] http://cafe.elharo.com/xml/the-state-of-native-xml-databases/<br />

[39] http://publib-b.boulder.ibm.com/Redbooks.nsf/RedbookAbstracts/sg246994.html



<strong>XML</strong> editor<br />

An <strong>XML</strong> editor is a markup language editor with added functionality to facilitate the editing of <strong>XML</strong>. <strong>XML</strong> can be<br />

edited using a plain text editor, with all the code visible, but <strong>XML</strong> editors add facilities such as tag completion<br />

and menus and buttons for tasks that are common in <strong>XML</strong> editing, based on data supplied with a document type<br />

definition (DTD) or the <strong>XML</strong> tree.<br />

There are also graphical <strong>XML</strong> editors that hide the code in the background and present the content to the user in a<br />

more user-friendly format, approximating the rendered version or editing forms. This is helpful for situations where<br />

people who are not fluent in <strong>XML</strong> code need to enter information in <strong>XML</strong> based documents such as time sheets and<br />

expenditure reports. And even if the user is familiar with <strong>XML</strong>, use of such editors, which take care of syntax<br />

details, is often faster and more convenient.<br />

Functionality beyond syntax highlighting<br />

An <strong>XML</strong> editor goes beyond the syntax highlighting offered by many plaintext editors and generic source code<br />

editors, verifying the <strong>XML</strong> source based on an <strong>XML</strong> Schema or <strong>XML</strong> DTD, and some can do it as the document is<br />

being edited in real time. Other features of an editor designed specifically for editing <strong>XML</strong> might include element<br />

word completion and automatic appending of a closing tag whenever an opening tag is entered. These features can<br />

help to prevent typographical errors in the <strong>XML</strong> code. Some <strong>XML</strong> editors provide the ability to run<br />

an XSLT transform, or series of transforms, over a document. Some of the larger <strong>XML</strong> packages even offer XSLT<br />

debugging features and XSL-FO processors for generation of PDF files from documents.<br />
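The simplest check an editor can run while the user types is for well-formedness. The sketch below uses Python's standard `xml.etree.ElementTree`, which reports the line and column of the first error. It does not validate against a DTD or XML Schema (the standard library has no such validator), and the function name is an assumption made up for the example.

```python
import xml.etree.ElementTree as ET

def check_well_formed(text):
    """Return None if the text parses, else the (line, column) of the first error."""
    try:
        ET.fromstring(text)
        return None
    except ET.ParseError as err:
        return err.position

ok = check_well_formed("<a><b></b></a>")
bad = check_well_formed("<a><b></a>")  # <b> is never closed
```

An editor could call such a check after each edit and underline the reported position.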

Textual editors<br />

Text <strong>XML</strong> editors generally provide features for working with element tags. Syntax highlighting is a basic<br />

feature of any <strong>XML</strong> editor; that is, they color element text differently from regular text. Element and attribute<br />

completion based on a DTD or schema is also available from many text <strong>XML</strong> editors. Displaying line numbers is<br />

also a common and useful feature, as is providing the ability to reformat a document to conform to a particular style<br />

of indentation.<br />

Here is an example of editing in a text editor with syntax coloring:<br />

The advantage of text editors is that they present exactly the information that is stored in the <strong>XML</strong> file. It is the best<br />

way to control the formatting of the file (such as indentations), to do low-level operations (such as a find/replace on<br />

element names) and to edit <strong>XML</strong> files without any schema or configuration file.<br />
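Reformatting a document to a particular indentation style, as mentioned above, can be sketched with the standard library. This assumes Python 3.9 or later, where `xml.etree.ElementTree.indent` is available; the two-space default and the function name `reindent` are arbitrary choices for the example.

```python
import xml.etree.ElementTree as ET

def reindent(xml_text, indent="  "):
    """Re-serialize a document with one nested level per indent step."""
    root = ET.fromstring(xml_text)
    ET.indent(root, space=indent)  # requires Python 3.9+
    return ET.tostring(root, encoding="unicode")

pretty = reindent("<table><tr><td>1</td><td>2</td></tr></table>")
print(pretty)
```

Note that this changes whitespace-only text between elements, which is harmless for data-oriented XML but may matter in mixed-content documents.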

Graphical editors<br />

Graphical editors based on GUIs may be easier for some people to use than text editors, and may not require<br />

knowledge of <strong>XML</strong> syntax. These are often called WYSIWYG ("What You See Is What You Get") editors, but not<br />

all of them are WYSIWYG: graphical <strong>XML</strong> editors can be WYSIWYG when they try to display the final rendering<br />

or WYSIWYM ("What You See Is What You Mean") when they try to display the actual meaning of <strong>XML</strong> elements.<br />

When they are not WYSIWYG, they do not display the graphical end result (or one of the possible end results) of a document, but<br />

instead focus on conveying the meaning of the text. They use DTDs or <strong>XML</strong> schemas and/or configuration files to



map <strong>XML</strong> elements to graphical components.<br />

These kinds of editors are generally more useful for <strong>XML</strong> languages for data rather than for storing documents.<br />

Documents tend to be fairly freeform in structure, which conflicts with the generally rigid nature of many graphical<br />

editors.<br />

In the above example, the editor is using a configuration file to know that the TABLE element represents a table, the<br />

TR element represents a row of the table, and the TD element represents a cell of the table. It is using this<br />

information to display the table based on this structuring information, in order to make editing easier.<br />

Schema and configuration file information can also be used to ensure that users do not create invalid documents.<br />

For instance, in a text editor, it is possible to create a row with too many cells in the table, while this would not be<br />

possible with the above graphical user interface.<br />

WYSIWYG editors<br />

WYSIWYG editors let people edit files directly with the tags represented by some form of graphical viewing rather<br />

than bare <strong>XML</strong> code. Often, WYSIWYG editors attempt to emulate the end result of some transform or CSS<br />

stylesheet application. This emulation may or may not be possible, depending on the transformation from <strong>XML</strong> into<br />

the end result.<br />

Naive use of a WYSIWYG editor can lead to the creation of documents that do not have the intrinsic semantics of<br />

the particular <strong>XML</strong> language. This comes about if the user is focused on trying to achieve a certain visual<br />

presentation with the editor, rather than using the WYSIWYG to make editing the document easier. For instance,<br />

someone creating a web page could use an H2 element (meaning: second level title) instead of H1 (meaning: first<br />

level title) because it looks smaller on their current WYSIWYG editor. Such an author is making a choice based on<br />

the apparent visual representation, but a visitor to the author's web page may see a very different rendering in their<br />

browser.<br />

However, as long as the underlying meaning of the document is understood by the author, and the author does not<br />

make decisions based on the exact look in the WYSIWYG editor, such an editor can be of value to the writer. It is<br />

generally much easier to read a document that is being rendered in some fashion than it is to read the raw <strong>XML</strong> code.<br />

Also, editing can be much more intuitive, as the WYSIWYG editor can use tools similar to many word processing<br />

applications. Some WYSIWYG editors even allow the user to use a DTD or Schema and define their own user<br />

interface for editing.<br />

Usually WYSIWYG editors support CSS but not XSLT, because XSLT transformations can be very complex, and<br />

guessing what the user meant when changing the end result can be impossible. The WYSIWYG editors that do<br />

support XSLT, such as Syntext Serna, will therefore apply changes directly to the original <strong>XML</strong>, while updating the<br />

view by running the XSLT for every change.



In the above example, a stylesheet is used to color table cells in a particular way. For instance, even rows do not have<br />

the same background color as odd rows, in order to make reading easier.<br />

Application domains<br />

• Computer programming<br />

• Technical editing<br />

See also<br />

• List of <strong>XML</strong> editors<br />

• Authoring system<br />

• Editing<br />

• Source code editor<br />

Edited formats<br />

• <strong>XML</strong><br />

• Darwin Information Typing Architecture (DITA)<br />

• DocBook<br />

External links<br />

• <strong>XML</strong> Editors [1] at the Open Directory Project<br />

• List of editors from xml.com [2]<br />

References<br />

[1] http://www.dmoz.org/Computers/Data_Formats/<strong>Markup</strong>_<strong>Language</strong>s/<strong>XML</strong>/Tools/Editors//<br />

[2] http://www.xml.com/pub/pt/3



<strong>XML</strong> Enabled Directory<br />

<strong>XML</strong> Enabled Directory (XED) is a framework for managing objects represented using the Extensible <strong>Markup</strong><br />

<strong>Language</strong> (<strong>XML</strong>). XED builds on X.500 and LDAP directory services technologies.<br />

XED was originally designed in 2003 by Steven Legg of eB2Bcom (formerly of Adacel Technologies) and Daniel<br />

Prager (formerly of Deakin University).<br />

The <strong>XML</strong> Enabled Directory (XED) framework leverages existing Lightweight Directory Access Protocol (LDAP)<br />

and X.500 directory technology to create a directory service that stores, manages and transmits Extensible <strong>Markup</strong><br />

<strong>Language</strong> (<strong>XML</strong>) format data, while maintaining interoperability with LDAP clients, X.500 Directory User Agents<br />

(DUAs), and X.500 Directory System Agents (DSAs).<br />

The main features of XED are:<br />

• semantically equivalent <strong>XML</strong> renditions of existing directory protocols,<br />

• <strong>XML</strong> renditions of directory data,<br />

• the ability to accept, at run time, user-defined attribute syntaxes specified in a variety of <strong>XML</strong> schema languages,<br />

• the ability to perform filter matching on the parts of <strong>XML</strong> format attribute values,<br />

• the flexibility for implementors to develop XED clients using only their favoured <strong>XML</strong> schema language.<br />

The <strong>XML</strong> Enabled Directory allows directory entries to contain <strong>XML</strong> formatted data as attribute values.<br />

Furthermore, the attribute syntax can be specified in any one of a variety of <strong>XML</strong> schema languages that the<br />

directory understands.<br />

The directory server is then able to perform data validation and semantically meaningful matching of <strong>XML</strong><br />

documents, or their parts, on behalf of client applications, making the implementation of <strong>XML</strong>-based applications<br />

easier and faster.<br />

<strong>XML</strong> applications can also exploit the directory's traditional capabilities of cross-application data sharing, data<br />

distribution, data replication, user authentication and user access control, further lowering the cost of building new<br />

<strong>XML</strong> applications.<br />

XED Implementations<br />

eB2Bcom's <strong>View</strong>500 Identity Server provides organisations with a fast, scalable and flexible directory system. It<br />

has been developed in strict adherence to open standards and supports the X.500, LDAP, XED and<br />

ACP133 standards. Being standards compliant, <strong>View</strong>500 will interface with a variety of applications, both now and<br />

into the future.<br />

External links<br />

• <strong>XML</strong> Enabled Directory [1]<br />

• A work-in-progress XED specification [2]<br />

References<br />

[1] http://www.xmled.info/<br />

[2] http://www.xmled.info/specs.htm



<strong>XML</strong> Encryption<br />

<strong>XML</strong> Encryption, also known as <strong>XML</strong>-Enc, is a specification, governed by a W3C recommendation, that defines<br />

how to encrypt the contents of an <strong>XML</strong> element.<br />

Although <strong>XML</strong> Encryption can be used to encrypt any kind of data, it is nonetheless known as "<strong>XML</strong> Encryption"<br />

because an <strong>XML</strong> element (either an EncryptedData or EncryptedKey element) contains or refers to the cipher text,<br />

keying information, and algorithms.<br />

Both <strong>XML</strong> Signature and <strong>XML</strong> Encryption use the KeyInfo element, which appears as the child of a Signature,<br />

EncryptedData, or EncryptedKey element and provides information to a recipient about what keying material to use<br />

in validating a signature or decrypting encrypted data.<br />

The KeyInfo element is optional: keying information can be included in the message or delivered through a separate secure channel.<br />
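The shape of an EncryptedData element can be sketched as follows. The element names, namespaces, and the AES-128-CBC algorithm identifier come from the W3C XML Encryption and XML Signature specifications; the key name and the cipher bytes are placeholders, and no actual encryption is performed here.

```python
import base64
import xml.etree.ElementTree as ET

XENC = "http://www.w3.org/2001/04/xmlenc#"
DS = "http://www.w3.org/2000/09/xmldsig#"

def build_encrypted_data(cipher_bytes, key_name):
    """Wrap already-encrypted bytes in an EncryptedData element."""
    ed = ET.Element(f"{{{XENC}}}EncryptedData")
    # Which algorithm produced the cipher text.
    ET.SubElement(ed, f"{{{XENC}}}EncryptionMethod", Algorithm=XENC + "aes128-cbc")
    # Optional hint telling the recipient which key to use.
    key_info = ET.SubElement(ed, f"{{{DS}}}KeyInfo")
    ET.SubElement(key_info, f"{{{DS}}}KeyName").text = key_name
    # The cipher text itself, base64-encoded.
    cipher_data = ET.SubElement(ed, f"{{{XENC}}}CipherData")
    value = ET.SubElement(cipher_data, f"{{{XENC}}}CipherValue")
    value.text = base64.b64encode(cipher_bytes).decode("ascii")
    return ed

# Placeholder bytes stand in for real cipher text; "shared-key-1" is hypothetical.
element = build_encrypted_data(b"\x00\x01", "shared-key-1")
print(ET.tostring(element, encoding="unicode"))
```

Leaving out the KeyInfo child corresponds to the case where the recipient learns which key to use through some other channel.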

External links<br />

• W3C info [1]<br />

References<br />

[1] http://www.w3.org/TR/xmlenc-core/<br />

<strong>XML</strong> Events<br />

In computer science and web development, <strong>XML</strong> Events is a W3C standard [1] for handling events that occur in an<br />

<strong>XML</strong> document. These events are typically caused by users interacting with the web page using a device such as a<br />

web browser on a personal computer or mobile phone.<br />

Formal Definition<br />

An <strong>XML</strong> Event is the representation of some asynchronous occurrence (such as a mouse button click) that gets<br />

associated with a data element in an <strong>XML</strong> document. <strong>XML</strong> Events provides a static, syntactic binding to the DOM<br />

Events interface, allowing the event to be handled.<br />

Motivation<br />

The <strong>XML</strong> Events standard is defined to provide <strong>XML</strong>-based languages with the ability to uniformly integrate event<br />

listeners and associated event handlers with Document Object Model (DOM) Level 2 event interfaces. The result is<br />

to provide a declarative, interoperable way of associating behaviors with <strong>XML</strong>-based documents such as XHTML.<br />

Advantages of <strong>XML</strong> Events<br />

<strong>XML</strong> Events uses a separation of concerns design pattern, and is technology-neutral with regard to handlers. It<br />

gives authors freedom in organizing their code and allows separation of document content from scripting.<br />

Legacy HTML and early SVG versions bind events to presentation elements by encoding the event name in an<br />

attribute name, such that the value of the attribute is the action for that event at that element. For example, an element can<br />

carry JavaScript in its onclick attribute (the markup of this example was not preserved; only its text content remains):<br />

Stay here!



This design has three drawbacks:<br />

1. It hard-wires the events into the language, so that adding new event types requires changes to the language.<br />

2. It forces authors to mix the content of the document with the specifications of the scripting and event handling,<br />

rather than allowing them to keep the two separate.<br />

3. It restricts authors to a single scripting language per document.<br />

Relationship to Other Standards<br />

Unlike DOM Events, which are usually associated with HTML documents, <strong>XML</strong> Events is designed to be<br />

independent of specific devices. <strong>XML</strong> Events is used extensively in XForms and in version 1.2 of the SVG<br />

specification, which as of July 2006 was still a working draft.<br />

Example of <strong>XML</strong> Events using Listener in XForms<br />

The following is an example of how <strong>XML</strong> events are used in the XForms specification:<br />

[The markup of this example was not preserved. The surviving fragments are the label text "Do it!" and the handler body alert("test");]<br />

In this example, when the DOMActivate event occurs on the data element with an id attribute of myButton, the<br />

handler doit (for example, a JavaScript script element) is executed.<br />
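The binding described above can be sketched as follows. The document in the string is an assumed reconstruction, not the original example: only the names DOMActivate, myButton and doit, the label "Do it!" and the handler body come from the text, while the element layout and the XForms trigger are illustrative guesses. The sketch only reads the listener declaration; it does not dispatch any events.<br />

```python
import xml.etree.ElementTree as ET

EV = "http://www.w3.org/2001/xml-events"  # XML Events namespace name

# Assumed reconstruction: the element layout is illustrative; only the
# DOMActivate / myButton / doit names come from the description in the text.
DOC = """
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:ev="http://www.w3.org/2001/xml-events"
      xmlns:xf="http://www.w3.org/2002/xforms">
  <head>
    <ev:listener event="DOMActivate" observer="myButton" handler="#doit"/>
  </head>
  <body>
    <xf:trigger id="myButton"><xf:label>Do it!</xf:label></xf:trigger>
    <script id="doit" type="application/ecmascript">alert("test");</script>
  </body>
</html>
"""


def listener_bindings(xml_text):
    """Yield (event, observer id, handler id) for each ev:listener element."""
    root = ET.fromstring(xml_text)
    for listener in root.iter("{%s}listener" % EV):
        handler = listener.get("handler", "")
        # The handler attribute is a URI reference such as "#doit".
        yield (listener.get("event"), listener.get("observer"), handler.lstrip("#"))


bindings = list(listener_bindings(DOC))
print(bindings)
```

Because the listener is a separate element, the trigger and the script carry no event attributes of their own, which is the separation of content from scripting that the standard aims for.<br />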

See also<br />

• ECMAScript<br />

• DOM Events<br />

• XForms<br />

• XHTML<br />

External links<br />

• W3C <strong>XML</strong> Events Specification [2] , a W3C Recommendation since 14 October 2003 [3]<br />

• W3C <strong>XML</strong> Events for HTML Authors [4] tutorial



References<br />

[1] "<strong>XML</strong> Events: An Events Syntax for <strong>XML</strong>" (http://www.w3.org/TR/xml-events/). World Wide Web Consortium. 2003-10-14.<br />

Retrieved 2008-11-19.<br />

[2] http://www.w3.org/TR/xml-events<br />

[3] http://www.w3.org/TR/2003/REC-xml-events-20031014/<br />

[4] http://www.w3.org/MarkUp/2004/xmlevents-for-html-authors<br />

<strong>XML</strong> framework<br />

An <strong>XML</strong> framework is a software framework for <strong>XML</strong>. Like other frameworks, it implements features that aid<br />

the programmer in creating an application, but an <strong>XML</strong> framework differs from other frameworks in that all<br />

data produced is <strong>XML</strong>. The programmer defines and produces pure data in <strong>XML</strong> format, and the framework<br />

transforms the document into any desired format.<br />

A single code base and a single <strong>XML</strong> document can thus yield several outputs, such as XHTML, SVG, WML, Excel or Word<br />

formats, or any other document type.<br />

Features in an <strong>XML</strong> framework<br />

• Classes that abstract the use of <strong>XML</strong> documents<br />

• Classes that abstract data access, so that all data is presented as <strong>XML</strong> regardless of its source (<strong>XML</strong>, database, or text<br />

files)<br />

• An XSLT cache<br />

• An easy way to create XSLT documents, such as code snippets<br />

• Extensibility: the framework must be extensible because <strong>XML</strong> is extensible by definition<br />

Pure <strong>XML</strong> frameworks<br />

• <strong>XML</strong>Nuke



<strong>XML</strong> Literals<br />

In the Microsoft .NET Framework, <strong>XML</strong> literals allow a computer program to include <strong>XML</strong> directly in its code. They are<br />

currently supported only in VB.NET 9.0. When a Visual Basic expression is embedded in an <strong>XML</strong> literal, the<br />

application creates a LINQ to <strong>XML</strong> object for each literal at run time.<br />

<strong>XML</strong> namespace<br />

<strong>XML</strong> namespaces are used for providing uniquely named elements and attributes in an <strong>XML</strong> document. They are<br />

defined in Namespaces in <strong>XML</strong> [1] , a W3C recommendation. An <strong>XML</strong> instance may contain element or attribute<br />

names from more than one <strong>XML</strong> vocabulary. If each vocabulary is given a namespace then the ambiguity between<br />

identically named elements or attributes can be resolved.<br />

A simple example would be to consider an <strong>XML</strong> instance that contained references to a customer and an ordered<br />

product. Both the customer element and the product element could have a child element named id. References to the<br />

id element would therefore be ambiguous; placing them in different namespaces would remove the ambiguity.<br />
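The customer and product case can be sketched with Python's standard ElementTree parser. The two namespace names and the sample values are hypothetical, chosen only for this illustration. ElementTree reports each element under an expanded name of the form {namespace}local-name, so the two id elements are no longer confused.<br />

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace names, chosen only for this example.
CUSTOMER_NS = "http://example.com/customer"
PRODUCT_NS = "http://example.com/product"

DOC = """
<order xmlns:c="http://example.com/customer"
       xmlns:p="http://example.com/product">
  <c:customer><c:id>C-1001</c:id></c:customer>
  <p:product><p:id>P-42</p:id></p:product>
</order>
"""

root = ET.fromstring(DOC)

# Both elements have the local name "id", but their expanded names differ.
customer_ids = [e.text for e in root.iter("{%s}id" % CUSTOMER_NS)]
product_ids = [e.text for e in root.iter("{%s}id" % PRODUCT_NS)]
print(customer_ids, product_ids)
```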

Namespace declaration<br />

A namespace is declared using the reserved <strong>XML</strong> attribute xmlns, the value of which must be an Internationalized<br />

Resource Identifier (IRI), usually a Uniform Resource Identifier (URI) reference.<br />

For example:<br />

xmlns="http://www.w3.org/1999/xhtml"<br />

Note, however, that the namespace specification neither requires nor suggests that the namespace URI be used to<br />

retrieve information; it is simply treated by an <strong>XML</strong> parser as a string. For example, the document at<br />

http://www.w3.org/1999/xhtml itself does not contain any code. It simply describes the XHTML namespace to human readers.<br />

Using a URI (such as "http://www.w3.org/1999/xhtml") to identify a namespace, rather than a simple string (such as<br />

"xhtml"), reduces the possibility of different namespaces using duplicate identifiers.<br />

It is also possible to map namespaces to prefixes in namespace declarations. For example:<br />

xmlns:xhtml="http://www.w3.org/1999/xhtml"<br />

In this case, any element or attribute names that start with the prefix "xhtml:" are considered to be in the XHTML<br />

namespace.<br />
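A small sketch of both declaration forms, assuming Python's standard ElementTree parser: the default declaration and two different prefixes all yield the same expanded name, while a namespace name that differs only in letter case is treated as a different namespace, since the name is compared as a plain string. No network access takes place.<br />

```python
import xml.etree.ElementTree as ET

# The same element written with a default declaration and with two prefixes.
default_form = ET.fromstring('<p xmlns="http://www.w3.org/1999/xhtml">hi</p>')
prefixed_form = ET.fromstring(
    '<xhtml:p xmlns:xhtml="http://www.w3.org/1999/xhtml">hi</xhtml:p>'
)
other_prefix = ET.fromstring('<h:p xmlns:h="http://www.w3.org/1999/xhtml">hi</h:p>')

# Differs only in letter case, so it is a different namespace name.
different_name = ET.fromstring('<p xmlns="http://www.w3.org/1999/XHTML">hi</p>')

tags = [default_form.tag, prefixed_form.tag, other_prefix.tag, different_name.tag]
print(tags)
```

The prefix itself carries no meaning; only the namespace name it is bound to does.<br />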

Namespace names<br />

Although the term namespace URI is widespread, the W3C Recommendation refers to it as the namespace name.<br />

The specification is not entirely prescriptive about the precise rules for namespace names (it does not explicitly say<br />

that parsers must reject documents where the namespace name is not a valid Uniform Resource Identifier), and many<br />

<strong>XML</strong> parsers allow any character string to be used. In version 1.1 of the recommendation, the namespace name<br />

becomes an Internationalized Resource Identifier, which licenses the use of non-ASCII characters that in practice<br />

were already accepted by nearly all <strong>XML</strong> software. The term namespace URI persists, however, not only in popular<br />

usage but also in many other specifications from W3C and elsewhere.<br />

Following publication of the Namespaces recommendation, there was an intensive debate about how a relative URI<br />

should be handled, with some arguing that it should simply be treated as a character string, and others that it should<br />

be turned into an absolute URI by resolving it against the base URI of the document [2] . The result of the debate was



a ruling from W3C that relative URIs were deprecated [3] .<br />

The use of URIs taking the form of URLs in the http scheme (such as http://www.w3.org/1999/xhtml) is<br />

common, despite the absence of any formal relationship with the HTTP protocol. The Namespaces specification does<br />

not say what should happen if such a URL is dereferenced (that is, if software attempts to retrieve a document from<br />

this location). One convention adopted by some users is to place an RDDL document at the location [4] . In general,<br />

however, users should assume that the namespace URI is simply a name, not the address of a document on the web.<br />

See also<br />

• Namespace<br />

External links<br />

• Namespaces in <strong>XML</strong> 1.0 (Third Edition) [1]<br />

• Namespaces in <strong>XML</strong> 1.1 (Second Edition) [8]<br />

References<br />

[1] http://www.w3.org/TR/REC-xml-names/<br />

[2] Leigh Dodds (24 May 2000), News from the trenches (http://www.xml.com/pub/a/2000/05/24/deviant/index.html)<br />

[3] Dan Connolly (11 Sep 2000), W3C <strong>XML</strong> Plenary decision on relative URI references in namespace declarations<br />

[4] Elliotte Rusty Harold (20 Feb 2001), RDDL Me This: What Does a Namespace URL Locate?<br />

(http://www.oreillynet.com/pub/a/oreilly/xml/news/xmlnut2_0201.html)<br />

<strong>XML</strong> Pretty Printer<br />

An <strong>XML</strong> pretty printer is a type of prettyprinter, or code beautifier, that specifically improves the readability of <strong>XML</strong>.<br />

<strong>XML</strong> is designed to be human readable, but machine-generated <strong>XML</strong> is often written compactly, without line breaks<br />

or indentation, and is therefore more difficult to read and edit. Running such an <strong>XML</strong> file through a pretty printer<br />

improves its readability and editability.<br />
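A minimal sketch of pretty printing, using the minidom module from Python's standard library rather than any of the tools listed below. The sample document and the pretty_print helper are made up for this illustration.<br />

```python
from xml.dom import minidom

# Compact, machine-style XML on a single line (sample data for illustration).
compact = (
    '<order><customer id="C-1001"><name>John Doe</name></customer>'
    '<item sku="P-42"/></order>'
)


def pretty_print(xml_text, indent="  "):
    """Re-serialize XML with line breaks and nested indentation."""
    return minidom.parseString(xml_text).toprettyxml(indent=indent)


pretty = pretty_print(compact)
print(pretty)
```

Note that indentation adds whitespace text between elements, so the output is easier to read but is not byte-for-byte equivalent to the input; this matters for formats in which whitespace is significant.<br />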

Examples of <strong>XML</strong> Pretty Printers<br />

• xmllint (utility in open source library libxml2)<br />

• xmlindent, an open source tool (see its homepage [1] )<br />

Online:<br />

• <strong>XML</strong> Pretty Printer Online<br />

• DecisionSoft <strong>XML</strong> Pretty Printer<br />

Windows:<br />

• xmlpp (command line)



See also<br />

• Prettyprint<br />

• <strong>XML</strong><br />

External links<br />

• <strong>XML</strong> Pretty Printer Online [2]<br />

• DecisionSoft <strong>XML</strong> Pretty Printer [3]<br />

• xmlpp pretty printer [4]<br />

• <strong>XML</strong> Indent [1] , an <strong>XML</strong> stream reformatter<br />

References<br />

[1] http://xmlindent.sourceforge.net/<br />

[2] http://www.iconv.com/xmllint.htm<br />

[3] http://tools.decisionsoft.com/xmlpp.html<br />

[4] http://www.cheztabor.com/xmlpp/index.htm<br />

<strong>XML</strong> Protocol<br />

The <strong>XML</strong> Protocol ("<strong>XML</strong>P") is a standard being developed by the W3C <strong>XML</strong> Protocol Working Group to the<br />

following guidelines, outlined in the group's charter:<br />

1. An envelope for encapsulating <strong>XML</strong> data to be transferred in an interoperable manner that allows for distributed<br />

extensibility.<br />

2. A convention for the content of the envelope when used for RPC (Remote Procedure Call) applications. The<br />

protocol aspects of this should be coordinated closely with the IETF and make an effort to leverage any work they<br />

are doing, see below for details.<br />

3. A mechanism for serializing data representing non-syntactic data models such as object graphs and directed<br />

labeled graphs, based on the data types of <strong>XML</strong> Schema.<br />

4. A mechanism for using HTTP transport in the context of an <strong>XML</strong> Protocol. This does not mean that HTTP is the<br />

only transport mechanism that can be used for the technologies developed, nor that support for HTTP transport is<br />

mandatory. This component merely addresses the fact that HTTP transport is expected to be widely used, and so<br />

should be addressed by this Working Group. There will be coordination with the Internet Engineering Task Force<br />

(IETF). (See Blocks Extensible Exchange Protocol)<br />

Further, the protocol developed must meet the following requirements, as per the working group's charter:<br />

1. The envelope and the serialization mechanisms developed by the Working Group may not preclude any<br />

programming model nor assume any particular mode of communication between peers.<br />

2. Focus must be put on simplicity and modularity and must support the kind of extensibility actually seen on the<br />

Web. In particular, it must support distributed extensibility where the communicating parties do not have a priori<br />

knowledge of each other.



See also<br />

• <strong>XML</strong><br />

• Internet Engineering Task Force<br />

External links<br />

• <strong>XML</strong> Protocol Working Group Charter [1]<br />

• <strong>XML</strong> Protocol Working Group [2]<br />

References<br />

[1] http://www.w3.org/2004/02/XML-Protocol-Charter<br />

[2] http://www.w3.org/2000/xp/Group/<br />

<strong>XML</strong> schema<br />

An <strong>XML</strong> schema is a description of a type of <strong>XML</strong> document, typically expressed in terms of constraints on the<br />

structure and content of documents of that type, above and beyond the basic syntactical constraints imposed by <strong>XML</strong><br />

itself. These constraints are generally expressed using some combination of grammatical rules governing the order of<br />

elements, Boolean predicates that the content must satisfy, data types governing the content of elements and<br />

attributes, and more specialized rules such as uniqueness and referential integrity constraints.<br />

There are languages developed specifically to express <strong>XML</strong> schemas. The Document Type Definition (DTD)<br />

language, which is native to the <strong>XML</strong> specification, is a schema language that is of relatively limited capability, but<br />

that also has other uses in <strong>XML</strong> aside from the expression of schemas. Two more expressive <strong>XML</strong> schema<br />

languages in widespread use are <strong>XML</strong> Schema (with a capital S) and RELAX NG.<br />

The mechanism for associating an <strong>XML</strong> document with a schema varies according to the schema language. The<br />

association may be achieved via markup within the <strong>XML</strong> document itself, or via some external means.<br />

Validation<br />

The process of checking to see if an <strong>XML</strong> document conforms to a schema is called validation, which is separate<br />

from <strong>XML</strong>'s core concept of syntactic well-formedness. All <strong>XML</strong> documents must be well-formed, but it is not<br />

required that a document be valid unless the <strong>XML</strong> parser is "validating," in which case the document is also checked<br />

for conformance with its associated schema. DTD-validating parsers are most common, but some support W3C<br />

<strong>XML</strong> Schema or RELAX NG as well.<br />

Documents are only considered valid if they satisfy the requirements of the schema with which they have been<br />

associated. These requirements typically include such constraints as:<br />

• Elements and attributes that must/may be included, and their permitted structure<br />

• The structure as specified by a regular expression syntax<br />

• How character data is to be interpreted, e.g. as a number, a date, a URL, a Boolean, etc.<br />

Validation of an instance document against a schema can be regarded as a conceptually separate operation from<br />

<strong>XML</strong> parsing. In practice, however, many schema validators are integrated with an <strong>XML</strong> parser.
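The difference between well-formedness and validity can be sketched as follows. Python's standard library has no validating parser, so check_person below is a hand-written stand-in for what a schema validator does, not a schema language; the person vocabulary and its rules are assumptions made for this example.<br />

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML at all."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def check_person(xml_text):
    """Hand-written stand-in for a validator; returns a list of violations."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "person":
        problems.append("root element must be <person>")
    # Structural constraint: these child elements are required.
    for required in ("firstName", "lastName"):
        if root.find(required) is None:
            problems.append("missing required element <%s>" % required)
    # Data type constraint: <age>, if present, must hold a non-negative integer.
    age = root.find("age")
    if age is not None and not (age.text or "").strip().isdigit():
        problems.append("<age> must contain a non-negative integer")
    return problems


broken = "<person><firstName>John</person>"
well_formed_invalid = "<person><firstName>John</firstName><age>unknown</age></person>"
valid = "<person><firstName>John</firstName><lastName>Doe</lastName><age>42</age></person>"

results = (is_well_formed(broken), check_person(well_formed_invalid), check_person(valid))
print(results)
```

The first document is rejected by the parser itself, while the second parses cleanly and fails only the additional constraints, which is the distinction a validating parser draws.<br />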



<strong>XML</strong> schema languages<br />

• Document Content Description facility for <strong>XML</strong>, an RDF framework [1]<br />

• Document Definition <strong>Markup</strong> <strong>Language</strong> (DDML)<br />

• Document Schema Definition <strong>Language</strong>s (DSDL)<br />

• Document Structure Description (DSD)<br />

• Document Type Definition (DTD)<br />

• Namespace Routing <strong>Language</strong> (NRL)<br />

• RELAX NG and its predecessors RELAX and TREX<br />

• SGML<br />

• Schema for Object-Oriented <strong>XML</strong> (SOX)<br />

• Schematron<br />

• <strong>XML</strong>-Data Reduced (XDR)<br />

• <strong>XML</strong> Schema (WXS or XSD)<br />

Capitalization<br />

There is some confusion as to when to use the capitalized spelling "Schema" and when to use the lowercase spelling.<br />

The lowercase form is a generic term and may refer to any type of schema, including DTD, <strong>XML</strong> Schema (aka<br />

XSD), RELAX NG, or others, and should always be written using lowercase except when appearing at the start of a<br />

sentence. The form "Schema" (capitalized) in common use in the <strong>XML</strong> community always refers to W3C <strong>XML</strong><br />

Schema.<br />

See also<br />

• Data structure<br />

• Structuring information<br />

• List of <strong>XML</strong> schemas<br />

• <strong>XML</strong> Information Set<br />

• <strong>XML</strong> Schema <strong>Language</strong> Comparison<br />

• Schema (for other uses of the term)<br />

External links<br />

• Comparing <strong>XML</strong> Schema <strong>Language</strong>s [2] by Eric van der Vlist (2001)<br />

• Comparative Analysis of Six <strong>XML</strong> Schema <strong>Language</strong>s [3] by Dongwon Lee, Wesley W. Chu, In ACM SIGMOD<br />

Record, Vol. 29, No. 3, page 76-87, September 2000<br />

• Taxonomy of <strong>XML</strong> Schema <strong>Language</strong>s using Formal <strong>Language</strong> Theory [4] by Makoto Murata, Dongwon Lee,<br />

Murali Mani, Kohsuke Kawaguchi, In ACM Trans. on Internet Technology (TOIT), Vol. 5, No. 4, page 1-45,<br />

November 2005<br />

• Application of <strong>XML</strong> Schema in Web Services Security [5] by Sridhar Guthula, W3C Schema Experience Report,<br />

May 2005



References<br />

[1] "Document Content Description for <strong>XML</strong>: Submission to the World Wide Web Consortium 31-July-1998"<br />

(http://www.w3.org/TR/NOTE-dcd).<br />

[2] http://www.xml.com/pub/a/2001/12/12/schemacompare.html<br />

[3] http://pike.psu.edu/publications/sigmod-record-00.pdf<br />

[4] http://pike.psu.edu/publications/toit05.pdf<br />

[5] http://www.w3.org/2005/05/25-schema/guthula.html<br />

<strong>XML</strong> Schema Editor<br />

The W3C's <strong>XML</strong> Schema Recommendation defines a formal mechanism for describing <strong>XML</strong> documents. The<br />

standard has become very popular and is used by the majority of standards bodies when describing their data. [1]<br />

The standard is very versatile, allowing for programming concepts such as inheritance and type creation. However,<br />

its high complexity is one of its main drawbacks. The standard itself is highly technical and published in three parts,<br />

making it difficult to understand without a substantial investment of time.<br />

<strong>XML</strong> Schema Editor Tools<br />

The problems users face when working with the XSD standard can largely be mitigated with the use of graphical<br />

editing tools. Although any text-based editor can be used to edit an <strong>XML</strong> Schema, a graphical editor offers the<br />

biggest advantages, allowing the structure of the document to be viewed graphically and edited with validation<br />

support, entry helpers and other useful features.<br />

The editors that have been developed so far take several different approaches to the presentation of information:<br />

Text <strong>View</strong><br />

The text view of an <strong>XML</strong> Schema shows the schema in its native form. <strong>XML</strong> Schema Editors generally add to the<br />

text view with features like inline entry helpers and entry helper windows, code completion, line numbering, source<br />

folding, and syntax coloring.<br />

In longer and more complex schema documents, the text view is often difficult for even highly trained content model<br />

architects to work with, which has led software companies to devise new ways for users to<br />

visualize these documents.<br />

Physical <strong>View</strong><br />

A physical view of an <strong>XML</strong> Schema displays a graphical entity for each element within the <strong>XML</strong> Schema. This can<br />

make an XSD document easier to read, but does little to simplify editing. This is largely due to the structure of the<br />

XSD standard: container elements are required, and which ones are needed depends on the base type used and the types contained<br />

within. As a result, small changes to the logical structure can ripple through the document.<br />

The structure of the XSD standard also means that entities are referenced from other locations within the document. Some<br />

editors allow these to be expanded and viewed at the location they are referenced from; others do not, which means a lot of<br />

manual cross-referencing.



Logical <strong>View</strong><br />

A logical view shows the structure of the <strong>XML</strong> Schema without showing all the detail of the syntax used to describe<br />

it. This provides a much clearer view of the <strong>XML</strong> Schema, making it easier to understand the structure of the<br />

document, and makes it easier to edit. Because the editor shows the logical structure of the XSD document, there is<br />

no need to show every element, removing much of the complexity and allowing the editor to automatically manage<br />

the syntactical rules.<br />

Example<br />

The following example will show the source XSD, logical and physical views for a simple schema.<br />

[The XSD source listing was not preserved.]<br />

A Sample <strong>XML</strong> Document for the schema<br />

[The markup of the sample document was not preserved; the surviving values are "John" and "Doe".]<br />

Physical <strong>View</strong> and Logical <strong>View</strong> [the two diagrams were not preserved]<br />

As you can see the logical view provides more information, but without the syntactical clutter, making it easier to<br />

understand and work with.<br />

<strong>XML</strong> Schema Editors<br />

As the XSD standard has gained support, a host of <strong>XML</strong> Schema editors have been developed.<br />

The comparison table listed, for each application, a Screenshot link and entries for the columns Code Editor, Physical<br />

Editor, Logical Editor, and Split Code/Diagram <strong>View</strong>. Only the entries below survive; the column to which each note<br />

belongs is not recoverable.<br />

• Altova <strong>XML</strong>Spy: screenshots [2]<br />

• Eclipse XSD Editor (eclipse.org [3] ): screenshots [3] ; Limited Editing<br />

• Liquid <strong>XML</strong> Studio: screenshots [4]<br />

• Oxygen xml: screenshots [5] ; Read only<br />

• Stylus Studio: screenshots [6] ; Read only<br />

• <strong>XML</strong> Fox - Freeware Edition: screenshots [7]<br />

References<br />

[1] http://www.w3.org/TR/xmlschema-0/ (W3C Primer)<br />

[2] http://www.altova.com/features_dtdschema.html<br />

[3] http://wiki.eclipse.org/index.php/Introduction_to_the_XSD_Editor<br />

[4] http://www.liquid-technologies.com/XmlStudio/XmlStudio.aspx<br />

[5] http://www.oxygenxml.com/xml_schema_editor.html<br />

[6] http://www.stylusstudio.com/xml_schema_editor.html<br />

[7] http://www.xmlfox.com/xml_schema_editor.htm<br />



<strong>XML</strong> Schema <strong>Language</strong> Comparison<br />

An <strong>XML</strong> schema is a description of a type of <strong>XML</strong> document, typically expressed in terms of constraints on the<br />

structure and content of documents of that type, above and beyond the basic syntax constraints imposed by <strong>XML</strong><br />

itself. There are several different languages available for specifying an <strong>XML</strong> schema. Each language has its strengths<br />

and weaknesses.<br />

Note: the W3C defined schema language is called "<strong>XML</strong> Schema". However, this name can be confusing in the<br />

context of referring to a number of <strong>XML</strong> schema languages. As such, throughout this document, the<br />

term "<strong>XML</strong> schema" refers to any <strong>XML</strong> schema language, while the term<br />

"W3C <strong>XML</strong> Schema" (abbreviated in this article as WXS) is used for the W3C-defined <strong>XML</strong> schema language.<br />

Overview<br />

Though there are a number of schema languages available, the primary three languages are Document Type<br />

Definitions, W3C <strong>XML</strong> Schema, and RELAX NG. Each language has its own advantages and disadvantages.<br />

This article also covers a brief review of other schema languages.<br />

The primary purpose of a schema language is to specify what the structure of an <strong>XML</strong> document can be. This means<br />

which elements can reside in which other elements, which attributes are and are not legal to have on a particular<br />

element, and so forth. A schema is somewhat equivalent to a grammar for a language; a schema defines what the<br />

vocabulary for the language may be and what a valid "sentence" is.<br />

Document Type Definitions<br />

Advantages<br />

Of the primary three languages, DTDs are the only one that can be defined inline. That is, the DTD can actually be<br />

embedded directly into the document.<br />

DTDs can define more than merely the content model. They can also define data elements (entities) that can be used in the document,<br />

much as a C or C++ preprocessor has #defines that are expanded in the source.<br />
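Both points can be sketched with a DTD embedded in the document's internal subset that declares a general entity, which the parser expands much like a macro. The memo vocabulary and the entity name are made up for this illustration; the parser is Python's standard ElementTree, which expands entities declared in the internal subset but does not validate against the DTD.<br />

```python
import xml.etree.ElementTree as ET

# The DOCTYPE declaration carries the DTD inline (the internal subset).
# "company" is a hypothetical entity declared only for this example.
DOC = """<!DOCTYPE memo [
  <!ELEMENT memo (#PCDATA)>
  <!ENTITY company "Example Corp">
]>
<memo>Welcome to &company;.</memo>"""

memo = ET.fromstring(DOC)

# The entity reference has been replaced by its declared text.
print(memo.text)
```

A processor that ignores DTD information cannot resolve the reference to the company entity, which is the portability concern raised under Disadvantages.<br />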

The DTD language is compact and highly readable, though it does require some experience to understand.<br />

Disadvantages<br />

The primary disadvantage of DTDs is their lack of specificity. The content models for DTDs are very basic,<br />

particularly compared to the other two languages.<br />

Overuse of DTD-defined elements may make a document illegible or incomprehensible without the associated DTD.<br />

Additionally, there are several <strong>XML</strong> processors that, typically for ease-of-implementation reasons, do not understand<br />

DTDs. As such, if DTD-defined entities are being used, these <strong>XML</strong> processors will not recognize them.<br />

The language that DTDs are written in is not <strong>XML</strong>. Therefore, DTDs cannot use the various frameworks that have<br />

been built around <strong>XML</strong>. <strong>XML</strong> editors that support writing DTDs must do so by parsing an additional language, for<br />

example. Some <strong>XML</strong> processors, typically for economy of implementation or execution, simply ignore DTD<br />

information, including DTD data elements.<br />

The DTD concept for <strong>XML</strong> was borrowed from the SGML DTD concept, so the construct could not be<br />

changed when <strong>XML</strong> was extended with namespaces. As a result, DTDs are namespace unaware.<br />

There is limited support for defining the type of the contained data. DTDs are primarily structural in nature. They do<br />

not have the ability to specify that an element contains an integral number, real number, a date, or anything of that<br />

nature.



Tool Support<br />

DTDs are perhaps the most widely supported schema language for <strong>XML</strong>, because they are one of the earliest<br />

schema languages for <strong>XML</strong>, defined before <strong>XML</strong> even had namespace support. Internal<br />

DTDs are often supported in <strong>XML</strong> processors; external DTDs are supported slightly less often. Most large<br />

<strong>XML</strong> parsers, those that support multiple <strong>XML</strong> technologies, provide support for DTDs as well.<br />

W3C <strong>XML</strong> Schema<br />

Advantages over DTDs<br />

Compared to DTDs, W3C <strong>XML</strong> Schemas are exceptionally powerful. They provide much greater specificity than<br />

DTDs could. They are namespace aware, and provide support for types.<br />

W3C <strong>XML</strong> Schema is written in <strong>XML</strong> itself, and therefore has a schema of its own (appropriately, written in W3C<br />

<strong>XML</strong> Schema).<br />

W3C <strong>XML</strong> Schema has a large number of built-in and derived data types. These are specified by the W3C <strong>XML</strong><br />

Schema specification, so all W3C <strong>XML</strong> Schema validators and processors must support them.<br />

Due to the nature of the schema language, after an <strong>XML</strong> document is validated, the entire <strong>XML</strong> document, both<br />

content and structure, can be expressed in terms of the schema itself. This functionality, known as<br />

Post-Schema-Validation Infoset (PSVI), can be used to transform the document into a hierarchy of typed objects that<br />

can be accessed in a programming language through a neutral interface.<br />

Commonality with RELAX NG<br />

Both RELAX NG and W3C <strong>XML</strong> Schema allow for similar mechanisms of specificity. Both allow for a degree of<br />

modularity in their languages, going so far as to being able to split the schema into multiple files. And both of them<br />

are, or can be, defined in an <strong>XML</strong> language.<br />

Advantages over RELAX NG<br />

RELAX NG lacks any analog to PSVI. Unlike W3C <strong>XML</strong> Schema, RELAX NG was not designed with type<br />

assignment and data binding in mind.<br />

W3C <strong>XML</strong> Schema has a formal mechanism for attaching a schema to an <strong>XML</strong> document.<br />

RELAX NG has no ability to apply default attribute data to an element's list of attributes (i.e., changing the <strong>XML</strong><br />

info set), while W3C <strong>XML</strong> Schema does. [1]<br />

W3C <strong>XML</strong> Schema has a rich "simple type" system built in (xs:decimal, xs:date, etc., plus derivation of custom<br />

types), while RELAX NG has an extremely simplistic one because it's meant to use type libraries developed<br />

independently of RELAX NG, rather than grow its own. This is seen by some as a disadvantage. In practice it's<br />

common for a RELAX NG schema to use the predefined "simple types" and "restrictions" (pattern, maxLength, etc.)<br />

of W3C <strong>XML</strong> Schema.<br />

In W3C <strong>XML</strong> Schema, a specific number or range of repetitions of a pattern can be expressed more elegantly than<br />

in RELAX NG. For large repetition counts, it is practically impossible to specify them in RELAX NG at all.



Disadvantages<br />

W3C <strong>XML</strong> Schema is complex and hard to learn, although that's partially because it tries to do more than mere<br />

validation (see PSVI).<br />

Although being written in <strong>XML</strong> is an advantage, it is also a disadvantage in some ways. The W3C <strong>XML</strong> Schema<br />

language in particular can be quite verbose, while a DTD can be terse and relatively easily editable.<br />

Likewise, WXS's formal mechanism for associating a document with a schema can pose a potential security<br />

problem. For WXS validators that will follow a URI to an arbitrary online location, there is the potential for reading<br />

something malicious from the other side of the stream. [2]<br />
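One way to reduce this exposure is to inspect the schema hint before fetching anything. The sketch below reads the xsi:schemaLocation attribute (whitespace-separated namespace/location pairs) and keeps only locations on an allow-list. The function names, the example hosts, and the allow-list policy itself are assumptions made for illustration, not a mechanism defined by WXS.<br />

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

XSI = "http://www.w3.org/2001/XMLSchema-instance"


def schema_hints(root):
    """Return (namespace, location) pairs from xsi:schemaLocation, if present."""
    tokens = root.get("{%s}schemaLocation" % XSI, "").split()
    return list(zip(tokens[0::2], tokens[1::2]))


def allowed_locations(root, trusted_hosts):
    """Keep only hinted schema locations whose host is explicitly trusted."""
    return [loc for _, loc in schema_hints(root)
            if urlparse(loc).hostname in trusted_hosts]


if __name__ == "__main__":
    doc = ET.fromstring(
        '<r xmlns="urn:example:r" '
        'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
        'xsi:schemaLocation="urn:example:r http://schemas.example.org/r.xsd"/>'
    )
    print(allowed_locations(doc, {"schemas.example.org"}))
```

A validator wrapped this way would refuse to follow a URI to an arbitrary host, at the price of maintaining the list of trusted hosts.<br />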

W3C <strong>XML</strong> Schema does not implement most of the DTD's ability to provide data elements to a document. While<br />

technically a comparative deficiency, this also spares it the problems that ability can create, which can be<br />

counted as a strength.<br />

Although W3C <strong>XML</strong> Schema's ability to add default attributes to elements is an advantage, it is a disadvantage in<br />

some ways as well. It means that an <strong>XML</strong> file may not be usable in the absence of its schema, even if the document<br />

would validate against that schema. In effect, all users of such an <strong>XML</strong> document must also implement the W3C<br />

<strong>XML</strong> Schema specification, thus ruling out minimalist or older <strong>XML</strong> parsers. It can also dramatically slow down<br />

processing of the document, as the processor must potentially download and process a second <strong>XML</strong> file (the<br />

schema).<br />
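The dependency on the schema can be illustrated with a small sketch. It does not use a real WXS processor: the DEFAULTS table stands in for default values a schema might declare, and the element and attribute names are hypothetical. A consumer that skips this step sees no currency attribute at all.<br />

```python
import xml.etree.ElementTree as ET

# Hypothetical defaults a schema might declare: element name -> {attribute: default}
DEFAULTS = {"item": {"currency": "USD"}}


def apply_defaults(root, defaults):
    """Add each declared default attribute wherever the document omits it."""
    for elem in root.iter():
        for name, value in defaults.get(elem.tag, {}).items():
            if name not in elem.attrib:
                elem.set(name, value)
    return root


if __name__ == "__main__":
    root = ET.fromstring('<order><item price="5"/><item price="7" currency="EUR"/></order>')
    print([i.get("currency") for i in root.findall("item")])  # before: first value is missing
    apply_defaults(root, DEFAULTS)
    print([i.get("currency") for i in root.findall("item")])  # after: default filled in
```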

Tool Support<br />

WXS support exists in a number of large <strong>XML</strong> parsing packages. Xerces and the .NET Framework's Base Class<br />

Library both provide support for WXS validation.<br />

RELAX NG<br />

RELAX NG provides for most of the advantages that W3C <strong>XML</strong> Schema does over DTDs.<br />

Advantages over W3C <strong>XML</strong> Schema<br />

While the language of RELAX NG can be written in <strong>XML</strong>, it also has an equivalent form that is much more like a<br />

DTD, but with greater specifying power. This form is known as the compact syntax. Tools can easily convert<br />

between these forms with no loss of features or even commenting. Even arbitrary elements specified between<br />

RELAX NG <strong>XML</strong> elements can be converted into the compact form.<br />

RELAX NG provides very strong support for unordered content. That is, it allows the schema to state that a<br />

sequence of patterns may appear in any order.<br />

RELAX NG also allows for non-deterministic content models. What this means is that RELAX NG allows the<br />

specification of a sequence like the following:<br />

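&lt;zeroOrMore&gt;<br />

  &lt;ref name="odd" /&gt;<br />

  &lt;ref name="even" /&gt;<br />

&lt;/zeroOrMore&gt;<br />

&lt;optional&gt;<br />

  &lt;ref name="odd" /&gt;<br />

&lt;/optional&gt;<br />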

When the validator encounters something that matches the "odd" pattern, it is unknown whether this is the optional<br />

last "odd" reference or simply one in the zeroOrMore sequence without looking ahead at the data. RELAX NG<br />

allows this kind of specification. W3C <strong>XML</strong> Schema requires all of its sequences to be fully deterministic, so


mechanisms like the above must be either specified in a different way or omitted altogether.<br />
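The need for lookahead can be sketched with a regular expression over child element names. The content model assumed here is (odd, even)*, odd?, where "even" is an assumed name for the second pattern in the repeated group. Python's backtracking matcher stands in for a validator that tolerates non-determinism; it is not how any particular RELAX NG implementation works.<br />

```python
import re

# Content model written as a regular expression over child element names:
#   (odd, even)*, odd?
# Each name is followed by one space so that names cannot run together.
CONTENT_MODEL = re.compile(r"(?:odd even )*(?:odd )?")


def matches(children):
    """Return True if the sequence of child names fits (odd, even)*, odd?."""
    encoded = "".join(name + " " for name in children)
    return CONTENT_MODEL.fullmatch(encoded) is not None


if __name__ == "__main__":
    # On seeing "odd", the matcher cannot know which branch applies until it
    # has looked at what follows (an "even", or the end of the content).
    print(matches(["odd", "even", "odd"]))
    print(matches(["odd", "odd"]))
```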

RELAX NG allows attributes to be treated as elements in content models. In particular, this means that one can<br />

provide the following:<br />

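&lt;element name="some_element"&gt;<br />

  &lt;choice&gt;<br />

    &lt;attribute name="has_name"&gt;<br />

      &lt;value&gt;false&lt;/value&gt;<br />

    &lt;/attribute&gt;<br />

    &lt;group&gt;<br />

      &lt;attribute name="has_name"&gt;<br />

        &lt;value&gt;true&lt;/value&gt;<br />

      &lt;/attribute&gt;<br />

      &lt;element name="name"&gt;&lt;text /&gt;&lt;/element&gt;<br />

    &lt;/group&gt;<br />

  &lt;/choice&gt;<br />

&lt;/element&gt;<br />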

This block states that the element "some_element" must have an attribute named "has_name". This attribute can only<br />

take true or false as values, and if it is true, the first child element of the element must be "name", which stores text.<br />

If "name" did not need to be the first element, then the choice could be wrapped in an "interleave" element along<br />

with other elements. The order of the specification of attributes in RELAX NG has no meaning, so this block need<br />

not be the first block in the element definition.<br />

W3C <strong>XML</strong> Schema cannot specify such a dependency between the content of an attribute and child elements.<br />
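Where a schema language cannot express this dependency, it is often checked in application code. The following is a minimal sketch of the rule described above (has_name must be true or false, and true requires "name" as the first child). The function name is an assumption and the check is no substitute for a RELAX NG validator.<br />

```python
import xml.etree.ElementTree as ET


def check_some_element(elem):
    """Return a list of problems; an empty list means the constraint holds."""
    flag = elem.get("has_name")
    if flag not in ("true", "false"):
        return ['has_name must be present and be "true" or "false"']
    children = list(elem)
    if flag == "true" and (not children or children[0].tag != "name"):
        return ['has_name="true" requires "name" as the first child']
    return []


if __name__ == "__main__":
    good = ET.fromstring('<some_element has_name="true"><name>x</name></some_element>')
    bad = ET.fromstring('<some_element has_name="true"/>')
    print(check_some_element(good))
    print(check_some_element(bad))
```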

RELAX NG's specification only lists two built-in types (string and token), but it allows for the definition of many<br />

more. In theory, the lack of a specific list allows a processor to support data types that are very problem-domain<br />

specific.<br />

Most RELAX NG schemas can be algorithmically converted into W3C <strong>XML</strong> Schemas and even DTDs (except when<br />

using RELAX NG features not supported by those languages, as above). The reverse is not true. As such, RELAX<br />

NG can be used as a normative version of the schema, and the user can convert it to other forms for tools that do not<br />

support RELAX NG.<br />

Disadvantages<br />

Most of RELAX NG's disadvantages are covered under the section on W3C <strong>XML</strong> Schema's advantages over<br />

RELAX NG.<br />

Though RELAX NG's ability to support user-defined data types is useful, it comes at the cost of having only<br />

two data types that the user can rely upon. In theory, this means that using a RELAX NG schema across<br />

multiple validators requires either providing those user-defined data types to each validator or using only the two<br />

basic types. In practice, however, most RELAX NG processors support the W3C <strong>XML</strong> Schema set of data types.<br />


Tool Support<br />

RELAX NG's tool support is significant, but it is less widespread than that of W3C <strong>XML</strong> Schema. The Mono Project's<br />

implementation of the .NET Framework includes a RELAX NG validator. The C library libxml2 provides RELAX<br />

NG support as well. Sun Microsystems's Multiple Schema Validator for Java also provides RELAX NG support.<br />

Schematron<br />

Schematron is an unusual schema language. Unlike the main three, it defines an <strong>XML</strong> file's syntax as a list of<br />

XPath-based rules. If the document passes these rules, then it is valid.<br />

Advantages<br />

Because of its rule-based nature, Schematron's specificity is very strong. It can require that the content of an element<br />

be controlled by one of its siblings. It can also request or require that the root element, regardless of what element<br />

that happens to be, have specific attributes. It can even specify required relationships between multiple <strong>XML</strong> files.<br />
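The rule-based style can be sketched in a few lines. This is not Schematron and does not use its syntax: rules are (path, assertion, message) triples, the paths use the limited XPath subset of Python's ElementTree, and the element names (order, status, tracking) and the version attribute are hypothetical.<br />

```python
import xml.etree.ElementTree as ET

# Each rule: (context path, assertion on one context element, failure message).
RULES = [
    # The root element, whatever its name, must carry a "version" attribute.
    (".", lambda root: root.get("version") is not None,
     "root element must have a version attribute"),
    # The content of one sibling (status) controls whether another (tracking) is required.
    (".//order",
     lambda o: o.findtext("status") != "shipped" or o.find("tracking") is not None,
     "a shipped order must contain a tracking element"),
]


def validate(root):
    """Return the messages of all failed assertions; empty means the document passes."""
    failures = []
    for path, assertion, message in RULES:
        for node in root.findall(path):
            if not assertion(node):
                failures.append(message)
    return failures


if __name__ == "__main__":
    doc = ET.fromstring("<anything><order><status>shipped</status></order></anything>")
    for message in validate(doc):
        print(message)
```

Note that nothing here says which elements may appear where; that structural side is what the rule-based approach expresses poorly.<br />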

Disadvantages<br />

While Schematron is good at relational constructs, specifying the basic structure of a document with it, that is,<br />

which elements can go where, results in a very verbose schema.<br />

The typical way to solve this is to combine Schematron with RELAX NG or W3C <strong>XML</strong> Schema. There are several<br />

schema processors available for both languages that support this combined form. This allows Schematron rules to<br />

specify additional constraints to the structure defined by W3C <strong>XML</strong> Schema or RELAX NG.<br />

Tool Support<br />

Schematron's reference implementation is actually an XSLT transformation that transforms the Schematron<br />

document into an XSLT that validates the <strong>XML</strong> file. As such, Schematron's potential toolset is any XSLT processor,<br />

though libxml2 provides an implementation that does not require XSLT. Sun Microsystems's Multiple Schema<br />

Validator for Java has an add-on that allows it to validate RELAX NG schemas that have embedded Schematron<br />

rules.<br />

Namespace Routing <strong>Language</strong> (NRL)<br />

This is not technically a schema language. Its sole purpose is to direct parts of documents to individual schemas<br />

based on the namespace of the encountered elements. An NRL is merely a list of <strong>XML</strong> namespaces and a path to a<br />

schema that each corresponds to. This allows each schema to be concerned with only its own language definition,<br />

and the NRL file routes the schema validator to the correct schema file based on the namespace of that element.<br />

This <strong>XML</strong> format is schema-language agnostic and works for just about any schema language.
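The routing idea can be sketched as a lookup from namespace URI to schema path. The sketch below is not NRL syntax and not a conforming processor (a real one hands whole sections of the document to a validator); the namespace URIs and schema file names are hypothetical.<br />

```python
import xml.etree.ElementTree as ET


def namespace_of(elem):
    """Extract the namespace URI from ElementTree's '{uri}local' tag form."""
    return elem.tag[1:].split("}", 1)[0] if elem.tag.startswith("{") else ""


def route(root, routes):
    """Yield (element, schema_path) for every element whose namespace is routed."""
    for elem in root.iter():
        schema = routes.get(namespace_of(elem))
        if schema is not None:
            yield elem, schema


if __name__ == "__main__":
    # Hypothetical routing table: namespace URI -> schema file in any schema language.
    routes = {"urn:example:a": "a.rng", "urn:example:b": "b.xsd"}
    doc = ET.fromstring(
        '<a:doc xmlns:a="urn:example:a" xmlns:b="urn:example:b"><b:item/><a:para/></a:doc>'
    )
    for elem, schema in route(doc, routes):
        print(elem.tag, "->", schema)
```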


See also<br />

• Document Type Definition<br />

• Document Structure Description<br />

• W3C <strong>XML</strong> Schema<br />

• RELAX NG<br />

• Schematron<br />

• Namespace Routing <strong>Language</strong><br />

• Namespace-based Validation Dispatching <strong>Language</strong><br />

References<br />

• Comparative Analysis of Six <strong>XML</strong> Schema <strong>Language</strong>s [3] by Dongwon Lee, Wesley W. Chu, In ACM SIGMOD<br />

Record, Vol. 29, No. 3, page 76-87, September 2000<br />

• Taxonomy of <strong>XML</strong> Schema <strong>Language</strong>s using Formal <strong>Language</strong> Theory [4] by Makoto Murata, Dongwon Lee,<br />

Murali Mani, Kohsuke Kawaguchi, In ACM Trans. on Internet Technology (TOIT), Vol. 5, No. 4, page 1-45,<br />

November 2005<br />

[1] While annotations in RELAX NG can support default attribute values, the RELAX NG specification does not mandate that a validator<br />

provide this ability to modify an <strong>XML</strong> infoset as part of validation. The WXS specification does mandate this behavior. An additional<br />

specification associated with RELAX NG does provide this ability. See RELAX NG DTD Compatibility (default value)<br />

(http://www.oasis-open.org/committees/relax-ng/compatibility.html#default-value).<br />

[2] James Clark (co-creator of RELAX NG). RELAX NG and W3C <strong>XML</strong> Schema<br />

(http://www.imc.org/ietf-xml-use/mail-archive/msg00217.html)<br />


<strong>XML</strong> Studio<br />

Editing an <strong>XML</strong> Schema in <strong>XML</strong> Studio<br />

Developer(s): Liquid Technologies<br />

Operating system: Microsoft Windows<br />

Type: <strong>XML</strong> Editor<br />

License: EULA<br />

Website: [1]<br />

Liquid <strong>XML</strong> Studio is an <strong>XML</strong> Editor and Integrated Development Environment (IDE) from Liquid Technologies.<br />

Liquid <strong>XML</strong> Studio allows developers to create <strong>XML</strong>-based and Web services applications using technologies such<br />

as <strong>XML</strong>, <strong>XML</strong> Schema, XSLT, XPath, WSDL, and SOAP. [2] Liquid <strong>XML</strong> Studio is also available as a plug-in for<br />

Microsoft Visual Studio. [3]<br />

Editions<br />

• Starter Edition<br />

• Designer Edition. Adds Visual Studio Integration and an <strong>XML</strong> Differencing tool.<br />

• Developer Edition. Adds Code generation to the features found in the Designer Edition. The <strong>XML</strong> Data Binder<br />

generates code for C++, C#, VB.Net, Java, Silverlight & Visual Basic.<br />

Editing <strong>View</strong>s<br />

• Graphical <strong>XML</strong> Schema Editor (XSD).<br />

• <strong>XML</strong> editor - with syntax highlighting and intellisense<br />

• DTD & CSS Editor - with syntax highlighting and Validation<br />

• XSLT Editor - Test Transform, syntax highlighting, intellisense and Validation<br />

Features<br />

• XPath Expression Builder - shows the results of your queries in realtime<br />

• Web Service Call Composer - allows developers to browse and call web services<br />

• <strong>XML</strong> Sample Generator - generates sample <strong>XML</strong> from an <strong>XML</strong> Schema<br />

• XSD Documentation Generation - creates HTML documentation from an <strong>XML</strong> Schema<br />

• <strong>XML</strong> Differencing tool - visualize the differences between 2 <strong>XML</strong> files<br />

• <strong>XML</strong> Schema Code Generation (<strong>XML</strong> Data Binding) for C++, C#, Java, VB.Net & Visual Basic 6<br />

• XSLT Editor - edits and executes XSL Transforms<br />

• Fast Infoset Support - Load and Save <strong>XML</strong> as Fast InfoSet [4]


See also<br />

• Liquid Technologies<br />

• <strong>XML</strong><br />

• Category:<strong>XML</strong> editors<br />

• IDE<br />

• <strong>XML</strong> Schema<br />

• XSLT<br />

• XPath<br />

• Web services<br />

• Web Services Description <strong>Language</strong><br />

• SOAP<br />

External links<br />

• <strong>XML</strong> Studio product page [1]<br />

References<br />

[1] http://www.liquid-technologies.com/Product_XmlStudio.aspx<br />

[2] Liquid <strong>XML</strong> Studio Product Page (http://www.liquid-technologies.com/Product_XmlStudio.aspx)<br />

[3] Microsoft Visual Studio Gallery<br />

(http://visualstudiogallery.com/ExtensionDetails.aspx?ExtensionID=33d43486-e73a-4f64-a342-f32c702abc19)<br />

[4] OSS Nokalva 'Market Wire' (http://www.marketwire.com/press-release/Oss-Nokalva-714198.html)<br />

<strong>XML</strong> Telemetric and Command Exchange<br />

XTCE (for <strong>XML</strong> Telemetric and Command Exchange) is an <strong>XML</strong>-based exchange format for spacecraft telemetry and<br />

command metadata. Using XTCE, the format and content of a space system's command and telemetry links can be readily<br />

exchanged between spacecraft operators and manufacturers. XTCE was originally standardized by the OMG. In April 2007 the<br />

OMG released revision 1.1 of XTCE as an OMG available specification. Version 1.0 of the XTCE specification is a CCSDS<br />

red-book specification and version 1.1 is a candidate CCSDS blue-book specification.<br />

Overview<br />

During the entire ground system development and operation phases of a mission, telemetry and telecommand<br />

definitions may be exchanged between multiple systems and organizations. Without a standard format, databases<br />

need dedicated converters to convert between the various proprietary database formats and editors. Allowing for a<br />

common database exchange format throughout the entire mission lifecycle will significantly reduce the cost of<br />

database conversions that occur in many space projects. XTCE has been developed as part of an international<br />

cooperation involving the National Aeronautics and Space Administration, the Jet Propulsion Laboratory, the<br />

Goddard Space Flight Center, the European Space Agency, the United States Air Force and private industry


including RT Logic, Harris, SciSys, Boeing and Lockheed Martin. The standards development effort has been<br />

coordinated via the Consultative Committee for Space Data Systems and the Object Management Group. The <strong>XML</strong><br />

Telemetric and Command Exchange standard is now in active use as a means to exchange mission databases,<br />

improving interoperability while reducing mission readiness costs.<br />

External links<br />

• XTCE home [1]<br />

References<br />

• AIAA conference - SpaceOps2006, The XTCE Standardization approach of Telemetry and Command Databases -<br />

The ESA example: http://pdf.aiaa.org/preview/CDReadyMSPOPS06_1317/PV2006_5582.pdf<br />

• AIAA conference - SpaceOps 2006, Exchanging Databases with Dissimilar Systems using CCSDS XTCE:<br />

http://pdf.aiaa.org/preview/CDReadyMSPOPS06_1317/PV2006_5801.pdf<br />

• CCSDS, MOIMS-SMC Working Group: http://cwe.ccsds.org/moims/docs/MOIMS-SMandC<br />

• GSAW conference - 2006, Exchanging Databases with Dissimilar Systems using CCSDS XTCE,<br />

http://sunset.usc.edu/gsaw/gsaw2006/s2/merri.pdf<br />

• Aerospace Conference, 2004, XTCE: a standard <strong>XML</strong>-schema for describing mission operations databases,<br />

http://ieeexplore.ieee.org/Xplore/login.jsp?url=/iel5/9422/29904/01368138.pdf<br />

• AIAA conference - SpaceOps2006, A Model for a Spacecraft Operations <strong>Language</strong>,<br />

http://www.rheagroup.com/AIAA-2006-5708-129.pdf<br />

References<br />

[1] http://www.omg.org/space/xtce


<strong>XML</strong> template engine<br />

An <strong>XML</strong> template engine (or <strong>XML</strong> template processor) is a specialized template processor for <strong>XML</strong> input and/or<br />

output, working in an <strong>XML</strong> template system context. There are two main types:<br />

• "<strong>XML</strong>-suite standards" compliant engines:<br />

• XSLT engines, named also XSLT processors<br />

• XQuery engines, named also XQuery processors<br />

• Others, like Web template engines<br />

XSLT processors<br />

XSLT processors may be delivered as standalone applications, or as software components or libraries intended for<br />

use by applications. Many web browsers and web server software have XSLT processor components built into them.<br />

Most current operating systems have an XSLT processor installed. For example, Windows XP comes with the<br />

MS<strong>XML</strong>3 library, which includes an XSLT processor.<br />

Optimizations<br />

Early XSLT processors had very few optimizations; stylesheet documents were read using the Document Object<br />

Model and the processor would act on them directly. XPath engines were also not optimized.<br />

By 2000, however, implementors saw optimization opportunities in both XPath evaluation and template rule<br />

processing. For example, the Java programming language's Transformation API for <strong>XML</strong> (TrAX), later subsumed<br />

into the Java API for <strong>XML</strong> Processing (JAXP), acknowledged one such optimization: before processing, the XSLT<br />

processor could condense the template rules and other stylesheet tree information into a single, compact Templates<br />

object, free from the constraints and bloat of standard DOMs, in an implementation-specific manner. This<br />

intermediate representation of the stylesheet tree allows for more efficient processing by potentially reducing<br />

preparation time and memory overhead. Additionally, the formal API allows for the object to be cached and reused<br />

for multiple transformations, potentially providing higher performance if several input documents are to be<br />

processed with the same XSLT stylesheet. Parallels are often drawn between this optimization and the compilation<br />

of programming language source code to bytecode: the stylesheets are said to be "compiled", even though they don't<br />

usually produce native programming language bytecode; rather, they produce intermediate structures and routines<br />

that are stored and processed internally. [1]<br />
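The "compile once, apply many times" pattern described above can be sketched with a toy stand-in. Nothing below is XSLT or the JAXP API: the stylesheet format (rule elements with a match attribute and a {text} placeholder) and the function name compile_rules are invented for illustration only.<br />

```python
import xml.etree.ElementTree as ET


def compile_rules(stylesheet_xml):
    """Parse a toy 'stylesheet' once and return a reusable transform function."""
    sheet = ET.fromstring(stylesheet_xml)
    # Compact internal form: element name -> output template. The stylesheet
    # tree itself is no longer needed after this point.
    rules = {rule.get("match"): rule.text or "" for rule in sheet.findall("rule")}

    def transform(document_xml):
        root = ET.fromstring(document_xml)
        return "".join(
            rules[elem.tag].replace("{text}", elem.text or "")
            for elem in root.iter() if elem.tag in rules
        )

    return transform


if __name__ == "__main__":
    # Compile once...
    transform = compile_rules('<sheet><rule match="title">[{text}]</rule></sheet>')
    # ...then reuse the compiled object for several input documents.
    for doc in ("<doc><title>A</title></doc>", "<doc><title>B</title><title>C</title></doc>"):
        print(transform(doc))
```

The returned function plays the role of the cached, reusable object: the stylesheet is parsed only once, however many documents are transformed.<br />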

In contrast, Eugene Kuznetsov (DataPower, IBM) and Jacek Ambroziak (Sun Microsystems: XSLT, Ambrosoft:<br />

Gregor/XSLT) independently created the industry's first genuine optimizing compilers that emit executable<br />

binaries. The approach has two major benefits: 1) the transformation executable can be run anywhere, including servers,<br />

mobile devices, and embedded environments lacking memory for the complete interpreter/compiler system, and 2)<br />

transformation performance may reach the highest possible levels. Optimized compilation leads to the<br />

fastest transformation execution only when complemented by equally careful runtime system design.<br />

XPath evaluation also has room for significant optimizations, and most processor vendors have implemented at least<br />

some of them, for speed. For example, in &lt;xsl:if test="/some/nodes"&gt; the test will evaluate to true if /some/nodes<br />

identifies any nodes, so evaluation can stop as soon as the first matching node is found; continuing to look for the<br />

entire set of matching nodes would not change the result. Similar optimizations can be undertaken when processing<br />

xsl:when and xsl:value-of, as well as expressions relying on, either implicitly or explicitly, string(), boolean(), or<br />

number(), and those that use numeric and position()/last()-based predicates.
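The early-exit idea can be sketched with Python's ElementTree, whose iterfind method yields matches one at a time. The helper names are assumptions, ElementTree paths are relative and cover only a subset of XPath, and how much work the library does before yielding the first match is an implementation detail.<br />

```python
import xml.etree.ElementTree as ET


def exists(root, path):
    """True if at least one node matches; stops pulling matches after the first."""
    return next(root.iterfind(path), None) is not None


def count_all(root, path):
    """For contrast: collects every matching node before answering."""
    return len(root.findall(path))


if __name__ == "__main__":
    doc = ET.fromstring("<some>" + "<nodes/>" * 1000 + "</some>")
    # A boolean test needs only the first match...
    print(exists(doc, "nodes"))
    # ...whereas a count has to visit all of them.
    print(count_all(doc, "nodes"))
```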


Implementations<br />

Some of these are only libraries for specific programming languages, but some form the basis for command<br />

line or shell script utilities for one or more operating systems. Such utilities are either bundled with the<br />

libraries or independently maintained, and some are incorporated into other applications, such as database<br />

engines and web browsers, in order to add XSLT functionality to them. With the exception of web browsers,<br />

such utilities and applications are not listed here.<br />

Implementations for Java<br />

Xalan: Xalan-Java [2]<br />

SAXON by Michael Kay<br />

Gregor/XSLT [3] optimizing compiler and runtime by Jacek Ambroziak<br />

XT [4] originally by James Clark<br />

Oracle XSLT, in the Oracle XDK [5]<br />

Implementations for the .NET Framework<br />

Saxon .NET (SourceForge project page [6]), an IKVM.NET-based port of Dr. Michael Kay's and<br />

Saxonica's Saxon processor, provides XSLT 2.0, XPath 2.0, and XQuery 1.0 support on the .NET<br />

platform.<br />

The .NET System.Xml assembly provides a compiled XSLT 1.0 implementation, as well as an<br />

interpreted XSLT 1.0 implementation.<br />

Implementations for C or C++<br />

Xalan: Xalan-C++ [7]<br />

libxslt the XSLT C library for GNOME<br />

Sablotron [8], which is integrated into PHP4<br />

XJR [9], with XSLT 2.0, XPath 2.0, and JSON support<br />

Implementations for Perl<br />

<strong>XML</strong>::LibXSLT [10] is a Perl interface to the libxslt C library<br />

<strong>XML</strong>::Sablotron [11] is a Perl interface to the Sablotron [8] processor<br />

Implementations for PHP<br />

XSLT [12] is the PHP4 interface to the Sablotron [8] processor<br />

XSL [13] is the new interface to XSL introduced in PHP5. The extension uses the libxslt library.<br />

Implementations for Python<br />

4XSLT, in the 4Suite [14] toolkit by Fourthought, Inc.<br />

lxml [15] is a Pythonic wrapper of the libxslt C library<br />

Implementations for Ruby<br />

Ruby/XSLT [16] is a simple XSLT class based on libxml and libxslt<br />

Sablotron module for Ruby [17] is a Ruby interface to Sablotron<br />

Implementations for Tcl<br />

TclXSLT [18] wraps the libxslt library.<br />

tDOM [19] is a generic <strong>XML</strong> package, based on the expat library, that includes an XSLT<br />

implementation. In 2003, it was deemed "very probably the fastest available open source XSLT<br />

implementation, especially for bigger source files". [20]<br />

Implementations for JavaScript


Google AJAXSLT [21] is an implementation of XSLT in JavaScript, intended for use in Ajax<br />

applications.<br />

Implementations for specific operating systems<br />

Microsoft's MS<strong>XML</strong> library may be used in various Microsoft Windows application development<br />

environments and languages, such as Visual Basic, C, and JScript.<br />

Microsoft offers a new XSLT processor in the System.Xml component of the .NET Framework.<br />

Implementations integrated into web browsers<br />

(Comparison of layout engines (<strong>XML</strong>))<br />

Mozilla has native XSLT support [22] based on TransforMiiX.<br />

Safari 1.2+ has native XSLT support, but Safari 1.2 is unable to perform XSL transformations via<br />

JavaScript [23], a limitation that does not occur in Mozilla, Internet Explorer, or Safari 3. This limits<br />

the capabilities of Ajax applications that would run in Safari 2. Safari's <strong>XML</strong> parser (possibly in all versions) is<br />

also not standards-compliant; it will parse <strong>XML</strong> strings according to HTML rules. Therefore, under<br />

certain circumstances, it will omit data from the DOM tree if it encounters malformed "HTML", even<br />

though it actually encountered valid <strong>XML</strong>. These errors will propagate to XSL-processed DOM trees.<br />

X-Smiles has native XSLT support.<br />

Opera has had partial native XSLT support since version 9. Notable exceptions include the absence of the<br />

document() function.<br />

Internet Explorer 6 supports XSLT 1.0 via the MS<strong>XML</strong> library (described above). IE5 and IE5.5 came<br />

with an earlier MS<strong>XML</strong> component that only supported an older, nonrecommended dialect of XSLT. A<br />

newer version of MS<strong>XML</strong> can be downloaded and installed separately to enable IE5 and IE5.5 to<br />

support XSLT 1.0 through scripting, and if certain Windows Registry keys are modified, the newer<br />

library will replace the older version as the default used by IE.<br />

References<br />

[1] Saxon: Anatomy of an XSLT processor (http://www-128.ibm.com/developerworks/xml/library/x-xslt2/) - An article describing the<br />

implementation and optimization details of a popular Java-based XSLT processor.<br />

[2] http://xml.apache.org/xalan-j/<br />

[3] http://ambrosoft.com/<br />

[4] http://www.blnz.com/xt/<br />

[5] http://www.oracle.com/technology/tech/xml/xdkhome.html<br />

[6] http://saxon.sourceforge.net/<br />

[7] http://xml.apache.org/xalan-c/<br />

[8] http://www.gingerall.org/sablotron.html<br />

[9] https://www.p6r.com/software/xjr.html<br />

[10] http://search.cpan.org/~msergeant/<strong>XML</strong>-LibXSLT-1.57/LibXSLT.pm<br />

[11] http://search.cpan.org/~pavelh/<strong>XML</strong>-Sablotron-1.01/Sablotron.pm<br />

[12] http://no.php.net/manual/en/ref.xslt.php<br />

[13] http://no.php.net/manual/en/book.xsl.php<br />

[14] http://4suite.org/<br />

[15] http://codespeak.net/lxml/<br />

[16] http://raa.ruby-lang.org/project/ruby-xslt/<br />

[17] http://www.rubycolor.org/sablot/<br />

[18] http://tclxml.sourceforge.net/tclxslt.html<br />

[19] http://www.tdom.org/<br />

[20] Loewer, Jochen; Ade, Rolf. "tDOM manual: tDOM Overview" (http://www.tdom.org/). Retrieved 2009-11-12.<br />

[21] http://goog-ajaxslt.sourceforge.net/<br />

[22] http://www.mozilla.org/projects/xslt/<br />

[23] http://developer.apple.com/internet/safari/faq.html#anchor21


<strong>XML</strong> tree<br />

<strong>XML</strong> documents have a hierarchical structure and can conceptually be interpreted as a tree structure, called an <strong>XML</strong><br />

tree.<br />

Unlike ordinary tree structures, this tree cannot be described solely in terms of a root, nodes and leaves. Although there is no<br />

consensus on the terminology used on <strong>XML</strong> Trees, at least two standard terminologies exist:<br />

• The terminology used in the XPath Data Model<br />

• The terminology used in the <strong>XML</strong> Information Set.<br />
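Why "root, nodes and leaves" is not enough can be seen by listing what a small document actually contains. The sketch below uses Python's ElementTree, whose own object model matches neither of the two terminologies above exactly; the function name outline and the labels it prints are assumptions, and text that follows a child element (the "tail") is ignored for brevity.<br />

```python
import xml.etree.ElementTree as ET


def outline(elem, depth=0):
    """Return indented lines naming the elements, attributes and text of a tree."""
    lines = ["  " * depth + "element " + elem.tag]
    for name, value in elem.attrib.items():
        # Attributes hang off an element but are not its children.
        lines.append("  " * (depth + 1) + "attribute " + name + "=" + value)
    if elem.text and elem.text.strip():
        # Character data is yet another kind of item in the tree.
        lines.append("  " * (depth + 1) + "text " + elem.text.strip())
    for child in elem:
        lines.extend(outline(child, depth + 1))
    return lines


if __name__ == "__main__":
    doc = ET.fromstring('<book id="1"><title>XML</title></book>')
    print("\n".join(outline(doc)))
```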

<strong>XML</strong> validation<br />

<strong>XML</strong> validation is the process of checking a document written in <strong>XML</strong> (eXtensible <strong>Markup</strong> <strong>Language</strong>) to confirm<br />

that it is both "well-formed" and also "valid" in that it follows a defined structure. A "well-formed" document<br />

follows the basic syntactic rules of <strong>XML</strong>, which are the same for all <strong>XML</strong> documents. [1] A valid document also<br />

respects the rules dictated by a particular DTD or <strong>XML</strong> schema, which reflect the choices made for a specific<br />

application. [2]<br />

In addition, extended tools are available such as OASIS CAM standard specification that provide contextual<br />

validation of content and structure that is more flexible than basic schema validations.<br />

xmllint is a command line <strong>XML</strong> tool that can perform <strong>XML</strong> validation. It can be found in UNIX / Linux<br />

environments. An example with the use of this program for validation of a file called example.xml is<br />

xmllint --valid --noout example.xml<br />
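The first half of validation, the well-formedness check, can also be sketched with Python's standard library. The function name is an assumption. Note that xml.etree.ElementTree does not validate against a DTD or schema, so this covers only what "well-formed" means above, not "valid".<br />

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print(is_well_formed("<a><b/></a>"))   # properly nested
    print(is_well_formed("<a><b></a>"))    # <b> is never closed
```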

External links<br />

Example C program<br />

• Validate <strong>XML</strong> against XSD in C [3] (using libxml)<br />

<strong>XML</strong> toolkit<br />

• The <strong>XML</strong> C parser and toolkit of Gnome [4] – libxml includes xmllint<br />

• Windows port of libxml [5] – maintained by Igor Zlatkovic<br />

Online validators for <strong>XML</strong> files<br />

• http://www.xmlvalidation.com/<br />

• http://www.stg.brown.edu/service/xmlvalid/<br />

• http://www.jcam.org.uk<br />

Articles discussing <strong>XML</strong> validation<br />

• DEVX March, 2009 - Taking <strong>XML</strong> Validation to the Next Level: Introducing CAM [6]


References<br />

[1] "Well-Formed <strong>XML</strong> Documents" (http://www.w3.org/TR/2004/REC-xml11-20040204/#sec-well-formed). Extensible <strong>Markup</strong> <strong>Language</strong><br />

(<strong>XML</strong>) 1.1. W3C. 2004.<br />

[2] "Constraints and Validation Rules" (http://www.w3.org/TR/xmlschema-1/#concepts-schemaConstraints). <strong>XML</strong> Schema Part 1:<br />

Structures Second Edition. W3C. 2004.<br />

[3] http://knol2share.blogspot.com/2009/05/validate-xml-against-xsd-in-c.html<br />

[4] http://xmlsoft.org/xmldtd.html<br />

[5] http://www.zlatkovic.com/libxml.en.html<br />

[6] http://www.devx.com/xml/Article/41066<br />

<strong>XML</strong>-Enabled Networking<br />

<strong>XML</strong> Enabled Networking provides an abstraction layer that exists alongside the traditional Internet Protocol (IP)<br />

network. This layer addresses the security, incompatibility and latency issues encumbering <strong>XML</strong> messages, web<br />

services and service oriented architectures (SOAs).<br />

History of <strong>XML</strong> Enabled Networking<br />

Many organizations have adopted <strong>XML</strong> technologies - often as Web services or service oriented architectures<br />

(SOAs) - as the standard for new application development and integration. Applications based on <strong>XML</strong> and Web<br />

services offer rapid interoperability and seamless service re-use by establishing a standard data format and a standard<br />

interface.<br />

With faster development cycles, less development effort and improved agility, <strong>XML</strong> and Web services enable IT to<br />

deliver more solutions to the business at a substantially lower cost. However, using these technologies also creates<br />

some potential problems:<br />

• Security concerns: <strong>XML</strong> messages are text-based, human readable, verbose, and self-describing. An <strong>XML</strong><br />

message could include descriptions of identities and credentials used to authenticate services, signatures requiring<br />

verification etc. <strong>XML</strong> by itself does not provide an infrastructure for integrating with multiple identity/access<br />

control systems across the organization, ensuring trust and compliance for <strong>XML</strong> message processing, or<br />

protecting the organization from the threats that malicious individuals could introduce into the organization with<br />

<strong>XML</strong>.<br />

• Incompatibilities: Many <strong>XML</strong> standards have emerged. <strong>XML</strong> messages use a variety of security standards,<br />

transport protocols, credential types and data structures. Web service developers need some way to mediate<br />

between these different standards and protocols, especially when they are integrating with business partners who<br />

may employ entirely different standards and protocols.<br />

• Application latency: <strong>XML</strong> messages can consume significant processing resources from application servers,<br />

lowering performance for the <strong>XML</strong>-based service and for other applications that run on the same platform.<br />

<strong>XML</strong> Enabled Networking attempts to address these issues by creating an abstraction layer that exists alongside the<br />

traditional Internet Protocol (IP) network to provide security and access enforcement, accelerated <strong>XML</strong> message<br />

processing, mediation between standards and protocols, policy control and auditing. <strong>XML</strong> Enabled Networks have<br />

typically been sold as network appliances. Initially they required application-specific integrated circuits, but<br />

appliances that run on standards-based hardware and operating systems are now available.



Common Features of <strong>XML</strong> Enabled Networking<br />

• It is powered by hardened network appliances, ready to incorporate into the network with minimal disruption<br />

• <strong>XML</strong> Enabled Networking appliances have software to make the appliances easy to install, configure and manage<br />

• They can validate <strong>XML</strong> messages for well-formedness as they enter or exit the appliance<br />

• They can convert <strong>XML</strong> to any data format<br />

• They have built-in storage capabilities to enable on-device logging for compliance and debugging purposes.<br />

• They have built-in support for many <strong>XML</strong> standards such as XSLT, XPath, SOAP and WS-Security<br />

• They are easily upgradeable<br />

Classification of <strong>XML</strong> Enabled Networking<br />

<strong>XML</strong> Security Gateways or <strong>XML</strong> Firewalls offer comprehensive <strong>XML</strong> security processing. <strong>XML</strong> Security Gateways<br />

include acceleration and integration functionality. Enterprise class <strong>XML</strong> Security Gateways include robust policy<br />

management, correlated event/message/policy logging for visibility and extensibility frameworks.<br />

<strong>XML</strong> Routers deliver robust access control and integration with identity authorities with acceleration and integration<br />

functionality. Enterprise class <strong>XML</strong> Routers include robust policy management, correlated event/message/policy<br />

logging for visibility and extensibility frameworks.<br />

<strong>XML</strong> Accelerators optimize both message throughput and server performance for <strong>XML</strong> operations including schema<br />

validation, encryption/decryption, authentication, signing, data transformation and protocol mediation. Enterprise<br />

class <strong>XML</strong> Accelerators include robust policy management, correlated event/message/policy logging for visibility<br />

and extensibility frameworks.<br />

<strong>XML</strong> Enabled Networking vendors<br />

• Citrix Systems<br />

• DataPower (IBM)<br />

• F5 Networks<br />

• Forum Systems<br />

• Layer 7 Technologies<br />

• Reactivity, Inc. (Cisco [1] )<br />

• Solace Systems<br />

• Sonoa Systems<br />

• Strangeloop Networks<br />

• Vordel<br />

• Zeus Systems<br />

See also<br />

<strong>XML</strong><br />

SOAP<br />

WS-Security<br />

<strong>XML</strong> appliance<br />

References<br />

[1] http://newsroom.cisco.com/dlls/2007/corp_022107.html


<strong>XML</strong>-Retrieval 180<br />

<strong>XML</strong>-Retrieval<br />

<strong>XML</strong> Retrieval, or <strong>XML</strong> Information Retrieval, is the content-based retrieval of documents structured with <strong>XML</strong><br />

(eXtensible <strong>Markup</strong> <strong>Language</strong>). As such it is used for computing relevance of <strong>XML</strong> documents. [1]<br />

Queries<br />

Most XML retrieval approaches compute relevance using techniques from the information retrieval (IR) area, e.g. by computing the similarity between a query consisting of keywords (query terms) and the document. However, in XML-Retrieval the query can also contain structural hints. So-called "content and structure" (CAS) queries enable users to specify what structure the requested content can or must have.

Exploiting <strong>XML</strong> structure<br />

Taking advantage of the self-describing structure of <strong>XML</strong> documents can improve the search for <strong>XML</strong> documents<br />

significantly. This includes the use of CAS queries, the weighting of different <strong>XML</strong> elements differently and the<br />

focused retrieval of subdocuments.<br />

Ranking<br />

Ranking in <strong>XML</strong>-Retrieval can incorporate both content relevance and structural similarity, which is the<br />

resemblance between the structure given in the query and the structure of the document. Also, the retrieval units<br />

resulting from an <strong>XML</strong> query may not always be entire documents, but can be any deeply nested <strong>XML</strong> elements, i.e.<br />

dynamic documents. The aim is to find the smallest retrieval unit that is highly relevant. Relevance can be defined<br />

according to the notion of specificity, which is the extent to which a retrieval unit focuses on the topic of request. [2]<br />
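The combination of content relevance and structural similarity described above can be sketched as a weighted sum. This is only an illustration under stated assumptions: the function names, the weight alpha, and the idea that both scores are already normalized to the range 0 to 1 are hypothetical and are not taken from any particular XML retrieval system.

```javascript
// Hypothetical scoring: a weighted sum of content relevance and structural
// similarity. Both inputs are assumed to be numbers between 0 and 1.
function combinedScore(contentScore, structureScore, alpha) {
  return alpha * contentScore + (1 - alpha) * structureScore;
}

// Returns a new array of candidate retrieval units, best score first.
// Each candidate is assumed to look like { id, content, structure }.
function rank(candidates, alpha) {
  return candidates.slice().sort(function (x, y) {
    return combinedScore(y.content, y.structure, alpha) -
           combinedScore(x.content, x.structure, alpha);
  });
}

var units = [
  { id: "a", content: 0.9, structure: 0.1 },
  { id: "b", content: 0.6, structure: 0.8 }
];
var ranked = rank(units, 0.5);
console.log(ranked[0].id);
```

With alpha set to 1 the ranking depends on content alone, which corresponds to a flat keyword query.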

Existing <strong>XML</strong> search engines<br />

An overview of two potential approaches is available. [3] [4] The INitiative for the Evaluation of <strong>XML</strong>-Retrieval<br />

(INEX) was founded in 2002 and provides a platform for evaluating such algorithms. [2] Three different areas<br />

influence <strong>XML</strong>-Retrieval: [5]<br />

Traditional <strong>XML</strong> query languages<br />

Query languages such as the W3C standard XQuery [6] support complex queries, but only look for exact matches. Therefore, they need to be extended to allow for vague search with relevance computing. Most XML-centered approaches imply a quite exact knowledge of the documents' schemas. [7]

Databases<br />

Classic database systems have adopted the ability to store semi-structured data, [5] which resulted in the development of XML databases. Often, they are very formal, concentrate more on searching than on ranking, and are used by experienced users able to formulate complex queries.

Information retrieval<br />

Classic information retrieval models such as the vector space model provide relevance ranking, but do not include<br />

document structure; only flat queries are supported. Also, they apply a static document concept, so retrieval units<br />

usually are entire documents. [7] They can be extended to consider structural information and dynamic document retrieval. Approaches that extend the vector space model exist: they use document subtrees (index terms plus structure) as dimensions of the vector space. [8]
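A much simplified sketch of a structure-aware vector space follows. It is not the method of the cited work: instead of full subtrees, each dimension here is an element path combined with a term, which is an assumption made to keep the example short. The function names and the input shape are likewise hypothetical.

```javascript
// Builds a sparse vector whose dimensions are "path#term" keys, so the same
// term under different elements counts as a different dimension.
// Input (assumed shape): [{ path: "/article/title", text: "xml retrieval" }]
function toStructuredVector(elements) {
  var vector = {};
  elements.forEach(function (el) {
    el.text.toLowerCase().split(/\s+/).forEach(function (term) {
      if (term === "") return;
      var key = el.path + "#" + term;
      vector[key] = (vector[key] || 0) + 1;
    });
  });
  return vector;
}

// Cosine similarity between two sparse vectors.
function cosine(a, b) {
  var dot = 0, normA = 0, normB = 0, key;
  for (key in a) {
    normA += a[key] * a[key];
    if (Object.prototype.hasOwnProperty.call(b, key)) dot += a[key] * b[key];
  }
  for (key in b) normB += b[key] * b[key];
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

var query = toStructuredVector([{ path: "/article/title", text: "xml" }]);
var docA = toStructuredVector([{ path: "/article/title", text: "xml retrieval" }]);
var docB = toStructuredVector([{ path: "/article/body", text: "xml retrieval" }]);
console.log(cosine(query, docA) > cosine(query, docB));
```

Both documents contain the query term, but only docA contains it in the element named by the query, so only docA shares a dimension with the query vector.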



See also<br />

• Document retrieval<br />

• Information retrieval applications<br />

References<br />

[1] Winter, Judith; Drobnik, Oswald (November 9, 2007). "An Architecture for XML Information Retrieval in a Peer-to-Peer Environment" (ftp://ftp.tm.informatik.uni-frankfurt.de/pub/papers/ir/An%20Architecture%20for%20XML%20Information%20Retrieval%20in%20a%20Peer-to-Peer%20Environment_2007.pdf). ACM. Retrieved 2009-02-10.

[2] Malik, Saadia; Trotman, Andrew; Lalmas, Mounia; Fuhr, Norbert (2007). "Overview of INEX 2006" (http://www.cs.otago.ac.nz/<br />

homepages/andrew/2006-10.pdf). Proceedings of the Fifth Workshop of the INitiative for the Evaluation of <strong>XML</strong> Retrieval. . Retrieved<br />

2009-02-10.<br />

[3] Amer-Yahia, Sihem; Lalmas, Mounia (2006). "<strong>XML</strong> Search: <strong>Language</strong>s, INEX and Scoring" (http://www.sigmod.org/record/issues/<br />

0612/p16-article-yahia.pdf). SIGMOD Rec. Vol. 35, No. 4. . Retrieved 2009-02-10.<br />

[4] Pal, Sukomal (June 30, 2006). "<strong>XML</strong> Retrieval: A Survey" (http://66.102.1.104/scholar?q=cache:R6ZYFNoTRrUJ:citeseerx.ist.psu.edu/<br />

viewdoc/download?doi=10.1.1.109.5986&rep=rep1&type=pdf). Technical Report, CVPR. . Retrieved 2009-02-10.<br />

[5] Fuhr, Norbert; Gövert, N.; Kazai, Gabriella; Lalmas, Mounia (2003). "INEX: Initiative for the Evaluation of <strong>XML</strong> Retrieval" (http://www.<br />

is.informatik.uni-duisburg.de/bib/pdf/ir/Fuhr_etal:02a.pdf). Proceedings of the First INEX Workshop, Dagstuhl, Germany, 2002. ERCIM<br />

Workshop Proceedings, France. . Retrieved 2009-02-10.<br />

[6] Boag, Scott; Chamberlin, Don; Fernández, Mary F.; Florescu, Daniela; Robie, Jonathan; Siméon, Jérôme (23 January 2007). "XQuery 1.0:<br />

An <strong>XML</strong> Query <strong>Language</strong>" (http://www.w3.org/TR/2007/REC-xquery-20070123/). W3C Recommendation. World Wide Web<br />

Consortium. . Retrieved 2009-02-10.<br />

[7] Schlieder, Torsten; Meuss, Holger (2002). "Querying and Ranking <strong>XML</strong> Documents" (http://209.85.173.132/<br />

search?q=cache:KHBo9BRjO7QJ:www.cis.uni-muenchen.de/people/Meuss/Pub/JASIS02.ps.gz). Journal of the American Society for<br />

Information Science and Technology, Vol. 53, No. 6. . Retrieved 2009-02-10.<br />

[8] Liu, Shaorong; Zou, Qinghua; Chu, Wesley W. (2004). "Configurable Indexing and Ranking for <strong>XML</strong> Information Retrieval" (http://www.<br />

cobase.cs.ucla.edu/tech-docs/sliu/SIGIR04.pdf). SIGIR'04. ACM. . Retrieved 2009-02-10.


<strong>XML</strong>HttpRequest 182<br />

<strong>XML</strong>HttpRequest<br />


<strong>XML</strong>HttpRequest (XHR) is an API available in web browser scripting languages such as JavaScript. It is used to<br />

send HTTP or HTTPS requests directly to a web server and load the server response data directly back into the<br />

script. [1] The data might be received from the server as <strong>XML</strong> text [2] or as plain text. [3] Data from the response can be<br />

used directly to alter the DOM of the currently active document in the browser window without loading a new web<br />

page document. The response data can also be evaluated by the client-side scripting. For example, if it was formatted<br />

as JSON by the web server, it can easily be converted into a client-side data object for further use.<br />
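As a minimal sketch of the JSON case, the response text can be turned into a client-side object with the standard JSON.parse function. The helper name parseJsonResponse and the sample payload are assumptions made for illustration; in a browser the text would come from the responseText property of the request object.

```javascript
// Hypothetical helper: converts a JSON-formatted response body into an object.
// Returns null instead of throwing when the body is not valid JSON.
function parseJsonResponse(responseText) {
  try {
    return JSON.parse(responseText);
  } catch (e) {
    return null;
  }
}

// Sample payload standing in for xhr.responseText (assumed for illustration).
var sampleResponseText = '{"user": "alice", "unread": 3}';
var data = parseJsonResponse(sampleResponseText);
console.log(data.user, data.unread);
```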

<strong>XML</strong>HttpRequest has an important role in the Ajax web development technique. It is currently used by many<br />

websites to implement responsive and dynamic web applications. Examples of these web applications include Gmail,<br />

Google Maps, Facebook, and many others.<br />

History and support<br />

The concept behind the <strong>XML</strong>HttpRequest object was originally created by the developers of Outlook Web Access for<br />

Microsoft Exchange Server 2000. [4] An interface called I<strong>XML</strong>HTTPRequest was developed and implemented into<br />

the second version of the MS<strong>XML</strong> library using this concept. [4] [5] The second version of the MS<strong>XML</strong> library was<br />

shipped with Internet Explorer 5.0 in March 1999, allowing access, via ActiveX, to the I<strong>XML</strong>HTTPRequest interface<br />

using the <strong>XML</strong>HTTP wrapper of the MS<strong>XML</strong> library. [6]<br />

The Mozilla Foundation developed and implemented an interface called nsIXMLHttpRequest into the Gecko layout engine. This interface was modelled to work as closely to Microsoft's IXMLHTTPRequest interface as possible. [7] [8] Mozilla created a wrapper to use this interface through a JavaScript object which they called XMLHttpRequest. [9] The XMLHttpRequest object was accessible as early as Gecko version 0.6 released on December 6 of 2000, [10] [11] but it was not completely functional until as late as version 1.0 of Gecko released on June 5, 2002. [10] [11] The XMLHttpRequest object became a de facto standard amongst other major user agents, implemented in Safari 1.2 released in February 2004, [12] Konqueror, Opera 8.0 released in April 2005, [13] and iCab 3.0b352 released in September 2005. [14]

The World Wide Web Consortium published a Working Draft specification for the XMLHttpRequest object on April 5, 2006, edited by Anne van Kesteren of Opera Software and Dean Jackson of W3C. [15] Its goal is "to document a minimum set of interoperable features based on existing implementations, allowing Web developers to use these features without platform-specific code." The last revision to the XMLHttpRequest object specification was on November 19 of 2009, being a last call working draft. [16] [17]



Microsoft added the <strong>XML</strong>HttpRequest object identifier to its scripting languages in Internet Explorer 7.0 released in<br />

October 2006. [6]<br />

With the advent of cross-browser JavaScript libraries such as jQuery and the Prototype JavaScript Framework, developers can invoke XMLHttpRequest functionality without coding directly to the API. Prototype provides an asynchronous requester object called Ajax.Request that wraps the browser's underlying implementation and provides access to it. [18] jQuery objects represent or wrap elements from the current client-side DOM. They all have a .load() method that takes a URI parameter and makes an XMLHttpRequest to that URI, then by default places any returned HTML into the HTML element represented by the jQuery object. [19] [20]

The W3C has since published another Working Draft specification for the <strong>XML</strong>HttpRequest object,<br />

"<strong>XML</strong>HttpRequest Level 2", on February 25 of 2008. [21] Level 2 consists of extended functionality to the<br />

<strong>XML</strong>HttpRequest object, including, but not currently limited to, progress events, support for cross-site requests, and<br />

the handling of byte streams. The latest revision of the <strong>XML</strong>HttpRequest Level 2 specification is that of 20th August<br />

2009, which is still a working draft. [22]<br />

Support in Internet Explorer versions 5, 5.5 and 6<br />

Internet Explorer versions 5 and 6 did not define the <strong>XML</strong>HttpRequest object identifier in their scripting languages<br />

as the <strong>XML</strong>HttpRequest identifier itself was not standard at the time of their releases. [6] Backward compatibility can<br />

be achieved through object detection if the <strong>XML</strong>HttpRequest identifier does not exist.<br />

An example of how to instantiate an XMLHttpRequest object with support for Internet Explorer versions 5 and 6, using the JScript ActiveXObject method, is below. [23]

/*
   Provide the XMLHttpRequest constructor for IE 5.x-6.x:
   Other browsers (including IE 7.x-8.x) do not redefine
   XMLHttpRequest if it already exists.

   This example is based on findings at:
   http://blogs.msdn.com/xmlteam/archive/2006/10/23/using-the-right-version-of-msxml-in-inte
*/
if (typeof XMLHttpRequest == "undefined")
  XMLHttpRequest = function () {
    try { return new ActiveXObject("Msxml2.XMLHTTP.6.0"); }
    catch (e) {}
    try { return new ActiveXObject("Msxml2.XMLHTTP.3.0"); }
    catch (e) {}
    try { return new ActiveXObject("Msxml2.XMLHTTP"); }
    catch (e) {}
    //Microsoft.XMLHTTP points to Msxml2.XMLHTTP.3.0 and is redundant
    throw new Error("This browser does not support XMLHttpRequest.");
  };

Web pages that use XMLHttpRequest or XMLHTTP can mitigate the minor differences between implementations either by encapsulating the XMLHttpRequest object in a JavaScript wrapper, or by using an existing framework that does so. In either case, the wrapper should detect the capabilities of the current implementation and work within its requirements.



HTTP request<br />

The following sections demonstrate how a request using the XMLHttpRequest object functions within a conforming user agent based on the W3C Working Draft. As the W3C standard for the XMLHttpRequest object is still a draft, user agents may not implement every behavior in the W3C definition, and any of the following is subject to change. Extreme care should be taken when scripting with the XMLHttpRequest object across multiple user agents. This article will try to list the inconsistencies between the major user agents.

The open method<br />

The HTTP and HTTPS requests of the <strong>XML</strong>HttpRequest object must be initialized through the open method. This<br />

method must be invoked prior to the actual sending of a request to validate and resolve the request method, URL,<br />

and URI user information to be used for the request. This method does not assure that the URL exists or the user<br />

information is correct. This method can accept up to five parameters, but requires only two, to initialize a request.<br />

The first parameter of the method is a text string indicating the HTTP request method to use. The request methods<br />

that must be supported by a conforming user agent, defined by the W3C draft for the <strong>XML</strong>HttpRequest object, are<br />

currently listed as the following. [24]<br />

• GET (Supported by IE7+, Mozilla 1+)<br />

• POST (Supported by IE7+, Mozilla 1+)<br />

• HEAD (Supported by IE7+)<br />

• PUT<br />

• DELETE<br />

• OPTIONS (Supported by IE7+)<br />

However, request methods are not limited to the ones listed above. The W3C draft states that a browser may support additional request methods at its own discretion.

The second parameter of the method is another text string, this one indicating the URL of the HTTP request. The W3C recommends that browsers should raise an error and not allow the request of a URL with either a different port or host URI component from the current document. [25]

The third parameter, a boolean value indicating whether or not the request will be asynchronous, is not a required<br />

parameter by the W3C draft. The default value of this parameter should be assumed to be true by a W3C conforming<br />

user agent if it is not provided. An asynchronous request ("true") will not wait on a server response before continuing<br />

on with the execution of the current script. It will instead invoke the onreadystatechange event listener of the<br />

<strong>XML</strong>HttpRequest object throughout the various stages of the request. A synchronous request ("false") however will<br />

block execution of the current script until the request has been completed, thus not invoking the onreadystatechange<br />

event listener.<br />

The fourth and fifth parameters are the URI user and password, respectively. These parameters are not required and<br />

should default to the current user and password of the document if not supplied, as defined by the W3C draft.<br />
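A minimal sketch of initializing a request with the open method follows. The helper name initRequest and the URL "/status" are hypothetical. The xhr argument is assumed to be an XMLHttpRequest object, or any object exposing a compatible open method, so the sketch can also be exercised outside a browser.

```javascript
// Hypothetical helper: initializes a request on an XMLHttpRequest-like object.
// Only the method and URL are required; the asynchronous flag defaults to
// true, matching the default described for conforming user agents.
function initRequest(xhr, method, url, isAsync) {
  if (isAsync === undefined) {
    isAsync = true;
  }
  xhr.open(method, url, isAsync);
  return xhr;
}

// In a browser this might be used as:
//   var xhr = initRequest(new XMLHttpRequest(), "GET", "/status");
```

Passing false as the fourth argument would request a synchronous call, which blocks the script until the request has completed.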

The setRequestHeader method<br />

Upon successful initialization of a request, the setRequestHeader method of the <strong>XML</strong>HttpRequest object can be<br />

invoked to send HTTP headers with the request. The first parameter of this method is the text string name of the<br />

header. The second parameter is the text string value. This method must be invoked for each header that needs to be<br />

sent with the request. Any headers attached here will be removed the next time the open method is invoked in a W3C<br />

conforming user agent.
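Because setRequestHeader must be invoked once per header, a small loop is a common convenience. The sketch below assumes a hypothetical helper named applyHeaders and an xhr object on which open has already been invoked; the header names in the usage comment are examples only.

```javascript
// Hypothetical helper: applies a map of header names to values by calling
// setRequestHeader once for each entry. Call it after open() and before send().
function applyHeaders(xhr, headers) {
  for (var name in headers) {
    if (Object.prototype.hasOwnProperty.call(headers, name)) {
      xhr.setRequestHeader(name, headers[name]);
    }
  }
  return xhr;
}

// In a browser this might be used as:
//   xhr.open("GET", "/status", true);
//   applyHeaders(xhr, { "Accept": "application/xml", "X-Example": "1" });
```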



The send method<br />

To send an HTTP request, the send method of the <strong>XML</strong>HttpRequest must be invoked. This method accepts a single<br />

parameter containing the content to be sent with the request. This parameter may be omitted if no content needs to be<br />

sent. The W3C draft states that this parameter may be any type available to the scripting language as long as it can be<br />

turned into a text string, with the exception of the DOM document object. If a user agent cannot stringify the<br />

parameter, then the parameter should be ignored.<br />

If the parameter is a DOM document object, a user agent should assure the document is turned into well-formed<br />

<strong>XML</strong> using the encoding indicated by the inputEncoding property of the document object. If the Content-Type<br />

request header was not added through setRequestHeader yet, it should automatically be added by a conforming user<br />

agent as "application/xml;charset=charset," where charset is the encoding used to encode the document.<br />
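As a rough sketch of the send step, the following hypothetical helper posts a set of form fields as a text string. The helper name postForm, the field names, and the choice of the application/x-www-form-urlencoded content type are assumptions made for illustration. The xhr argument is expected to be an XMLHttpRequest object, or any object with compatible open, setRequestHeader and send methods.

```javascript
// Hypothetical helper: encodes the fields as a URL-encoded string and sends
// them as the body of an asynchronous POST request. Returns the body sent.
function postForm(xhr, url, fields) {
  var pairs = [];
  for (var key in fields) {
    if (Object.prototype.hasOwnProperty.call(fields, key)) {
      pairs.push(encodeURIComponent(key) + "=" + encodeURIComponent(fields[key]));
    }
  }
  var body = pairs.join("&");
  xhr.open("POST", url, true);
  xhr.setRequestHeader("Content-Type", "application/x-www-form-urlencoded");
  xhr.send(body);
  return body;
}

// In a browser this might be used as:
//   postForm(new XMLHttpRequest(), "/search", { q: "a b", n: 1 });
```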

The onreadystatechange event listener<br />

If the open method of the <strong>XML</strong>HttpRequest object was invoked with the third parameter set to true for an<br />

asynchronous request, the onreadystatechange event listener will be automatically invoked for each of the<br />

following actions that change the readyState property of the <strong>XML</strong>HttpRequest object.<br />

• After the open method has been invoked successfully, the readyState property of the <strong>XML</strong>HttpRequest object<br />

should be assigned a value of 1.<br />

• After the send method has been invoked and the HTTP response headers have been received, the readyState<br />

property of the <strong>XML</strong>HttpRequest object should be assigned a value of 2.<br />

• Once the HTTP response content begins to load, the readyState property of the <strong>XML</strong>HttpRequest object should<br />

be assigned a value of 3.<br />

• Once the HTTP response content has finished loading, the readyState property of the <strong>XML</strong>HttpRequest object<br />

should be assigned a value of 4.<br />

The major user agents are inconsistent with the handling of the onreadystatechange event listener.<br />
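The readyState sequence above can be handled with a listener that waits for the value 4. The sketch below is a minimal illustration under stated assumptions: the helper name getAsync and its callback signature are hypothetical, and the constructor is passed in as a parameter (XMLHttpRequest in a browser) so that the flow can be tried with a stand-in object as well.

```javascript
// Hypothetical helper: sends an asynchronous GET request and invokes
// onDone(status, responseText) once readyState reaches 4 (response loaded).
// XhrCtor is the constructor to use, e.g. XMLHttpRequest in a browser.
function getAsync(XhrCtor, url, onDone) {
  var xhr = new XhrCtor();
  xhr.onreadystatechange = function () {
    // States 1 to 3 are intermediate; only 4 means the content has loaded.
    if (xhr.readyState === 4) {
      onDone(xhr.status, xhr.responseText);
    }
  };
  xhr.open("GET", url, true);
  xhr.send(null);
  return xhr;
}

// In a browser this might be used as:
//   getAsync(XMLHttpRequest, "/status", function (status, text) {
//     alert(status + ": " + text);
//   });
```

Given the inconsistencies between user agents, the listener should not assume that every intermediate state is reported exactly once.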

The HTTP response<br />

After a successful and completed call to the send method of the XMLHttpRequest, if the server response was valid XML and the Content-Type header sent by the server is understood by the user agent as an Internet media type for XML, the responseXML property of the XMLHttpRequest object will contain a DOM document object. Another property, responseText, will contain the server response as plain text in a conforming user agent, regardless of whether or not it was understood as XML.
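A short sketch of choosing between the two response properties follows. The helper name readResponse and the shape of its return value are assumptions for illustration; it relies only on the responseXML and responseText properties described above.

```javascript
// Hypothetical helper: prefers the parsed DOM document when the user agent
// recognized the response as XML, otherwise falls back to the plain text.
function readResponse(xhr) {
  if (xhr.responseXML) {
    return { kind: "xml", value: xhr.responseXML };
  }
  return { kind: "text", value: xhr.responseText };
}

// In a browser, after readyState has reached 4:
//   var result = readResponse(xhr);
//   if (result.kind === "xml") { /* walk result.value with DOM methods */ }
```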

See also<br />

• Hypertext Transfer Protocol<br />

• Representational State Transfer<br />

• Ajax<br />

External links<br />

• Level 1 specification of the <strong>XML</strong>HttpRequest object from W3C [26]<br />

• Level 2 specification of the <strong>XML</strong>HttpRequest object from W3C [27]<br />

• Specification of the <strong>XML</strong>HttpRequest object for Apple developers [28]<br />

• Specification of the <strong>XML</strong>HttpRequest object for Microsoft developers [29]<br />

• Specification of the <strong>XML</strong>HttpRequest object for Mozilla developers [30]<br />

• Specification of the <strong>XML</strong>HttpRequest object for Opera developers [31]



• "Attacking AJAX Applications" [32] , a presentation given at the Black Hat security conference. Discusses several<br />

issues involving XHR and the future of cross-domain AJAX.<br />

References<br />

[1] "<strong>XML</strong>HttpRequest object explained by the W3C Working Draft" (http://www.w3.org/TR/<strong>XML</strong>HttpRequest/). W3.org. . Retrieved<br />

2009-07-14.<br />

[2] "The response<strong>XML</strong> attribute of the <strong>XML</strong>HttpRequest object explained by the W3C Working Draft" (http://www.w3.org/TR/<br />

<strong>XML</strong>HttpRequest/#responsexml). W3.org. . Retrieved 2009-07-14.<br />

[3] "The responseText attribute of the <strong>XML</strong>HttpRequest object explained by the W3C Working Draft" (http://www.w3.org/TR/<br />

<strong>XML</strong>HttpRequest/#responsetext). W3.org. . Retrieved 2009-07-14.<br />

[4] "Article on the history of <strong>XML</strong>HTTP by an original developer" (http://www.alexhopmann.com/xmlhttp.htm). Alexhopmann.com.<br />

2007-01-31. . Retrieved 2009-07-14.<br />

[5] "Specification of the I<strong>XML</strong>HTTPRequest interface from the Microsoft Developer Network" (http://msdn.microsoft.com/en-us/library/<br />

ms759148(VS.85).aspx). Msdn.microsoft.com. . Retrieved 2009-07-14.<br />

[6] Dutta, Sunava (2006-01-23). "Native <strong>XML</strong>HTTPRequest object" (http://blogs.msdn.com/ie/archive/2006/01/23/516393.aspx). IEBlog.<br />

Microsoft. . Retrieved 2006-11-30.<br />

[7] "Specification of the nsI<strong>XML</strong>HttpRequest interface from the Mozilla Developer Center" (https://developer.mozilla.org/en/<br />

nsI<strong>XML</strong>HttpRequest). Developer.mozilla.org. 2008-05-16. . Retrieved 2009-07-14.<br />

[8] "Specification of the nsIJS<strong>XML</strong>HttpRequest interface from the Mozilla Developer Center" (https://developer.mozilla.org/en/<br />

NsIJS<strong>XML</strong>HttpRequest). Developer.mozilla.org. 2009-05-03. . Retrieved 2009-07-14.<br />

[9] "Specification of the <strong>XML</strong>HttpRequest object from the Mozilla Developer Center" (https://developer.mozilla.org/en/XmlHttpRequest).<br />

Developer.mozilla.org. 2009-05-03. . Retrieved 2009-07-14.<br />

[10] "Version history for the Mozilla Application Suite" (http://www.mozilla.org/releases/history.html). Mozilla.org. . Retrieved 2009-07-14.<br />

[11] "Downloadable, archived releases for the Mozilla browser" (http://www-archive.mozilla.org/releases/). Archive.mozilla.org. . Retrieved<br />

2009-07-14.<br />

[12] "Archived news from Mozillazine stating the release date of Safari 1.2" (http://weblogs.mozillazine.org/hyatt/archives/2004_02.html).<br />

Weblogs.mozillazine.org. . Retrieved 2009-07-14.<br />

[13] "Press release stating the release date of Opera 8.0 from the Opera website" (http://www.opera.com/press/releases/2005/06/16/).<br />

Opera.com. 2005-04-19. . Retrieved 2009-07-14.<br />

[14] Soft-Info.org. "Detailed browser information stating the release date of iCab 3.0b352 from" (http://www.soft-info.org/browsers/<br />

icab-10109.html). Soft-Info.com. . Retrieved 2009-07-14.<br />

[15] "Specification of the <strong>XML</strong>HttpRequest object from the Level 1 W3C Working Draft released on April 5th, 2006" (http://www.w3.org/<br />

TR/2006/WD-<strong>XML</strong>HttpRequest-20060405/). W3.org. . Retrieved 2009-07-14.<br />

[16] "<strong>XML</strong>HttpRequest W3C Working Draft 19 November 2009" (http://www.w3.org/TR/2009/WD-<strong>XML</strong>HttpRequest-20091119/).<br />

W3.org. . Retrieved 2009-12-17.<br />

[17] "W3C Process Document, Section 7.4.2 Last Call Announcement" (http://www.w3.org/2005/10/Process-20051014/tr#last-call).<br />

W3.org. . Retrieved 2009-12-17.<br />

[18] Porteneuve, Christophe (2007). "9". in Daniel H Steinberg. Raleigh, North Carolina: Pragmatic Bookshelf. pp. 183. ISBN 1-934356-01-8.<br />

[19] Chaffer, Jonathan; Karl Swedberg (2007). Learning jQuery. Birmingham: Packt Publishing. pp. 107. ISBN 978-1-847192-50-9.<br />

[20] Chaffer, Jonathan; Karl Swedberg (2007). jQuery Reference Guide. Birmingham: Packt Publishing. pp. 156. ISBN 978-1-847193-81-0.<br />

[21] "Specification of the <strong>XML</strong>HttpRequest object from the Level 2 W3C Working Draft released on February 25th, 2008" (http://www.w3.<br />

org/TR/2008/WD-<strong>XML</strong>HttpRequest2-20080225/). W3.org. . Retrieved 2009-07-14.<br />

[22] "<strong>XML</strong>HttpRequest Level 2, W3C Working Draft 20 August 2009" (http://www.w3.org/TR/<strong>XML</strong>HttpRequest2/). W3.org. . Retrieved<br />

2010-04-08.<br />

[23] "Ajax Reference (<strong>XML</strong>HttpRequest object)" (http://www.javascriptkit.com/jsref/ajax.shtml). JavaScript Kit. 2008-07-22. . Retrieved<br />

2009-07-14.<br />

[24] "Dependencies of the <strong>XML</strong>HttpRequest object explained by the W3C Working Draft" (http://www.w3.org/TR/<strong>XML</strong>HttpRequest/<br />

#dependencies). W3.org. . Retrieved 2009-07-14.<br />

[25] "The "open" method of the <strong>XML</strong>HttpRequest object explained by the W3C Working Draft" (http://www.w3.org/TR/<strong>XML</strong>HttpRequest/<br />

#the-open-method). W3.org. . Retrieved 2009-10-13.<br />

[26] http://www.w3.org/TR/<strong>XML</strong>HttpRequest/<br />

[27] http://www.w3.org/TR/<strong>XML</strong>HttpRequest2/<br />

[28] http://developer.apple.com/internet/webcontent/xmlhttpreq.html<br />

[29] http://msdn.microsoft.com/en-us/library/ms535874(VS.85).aspx<br />

[30] https://developer.mozilla.org/en/<strong>XML</strong>HttpRequest<br />

[31] http://www.opera.com/docs/specs/opera9/xhr/<br />

[32] http://www.isecpartners.com/files/iSEC-Attacking_AJAX_Applications.BH2006.pdf


<strong>XML</strong>Socket 187<br />

<strong>XML</strong>Socket<br />

<strong>XML</strong>Socket is a class in ActionScript which allows Adobe Flash content to use socket communication, via TCP<br />

stream sockets. It can be used for plain text, although, as the name implies, it was made for <strong>XML</strong>. It is often used in<br />

chat applications and multiplayer games.<br />

Examples<br />

ActionScript 2.0<br />

For a simple Hello, World! application in ActionScript 2.0, you could use the code below:<br />

var xmlSocket:XMLSocket=new XMLSocket();
xmlSocket.onConnect=function() {
    xmlSocket.send(new XML("Hello, World!"));
}
xmlSocket.onXML=function(myXML) {
    trace(myXML.firstChild.childNodes[0].firstChild.nodeValue);
    xmlSocket.close();
}
xmlSocket.connect("localhost",8463);

This would result in the output window of the Flash IDE opening and displaying "Hello, World!", assuming that a<br />

socket server was running on port 8463 of the local machine, and was echoing everything sent to it. <br />

External links<br />

• <strong>XML</strong> Sockets: the basics of multiplayer games [1] , gotoAndPlay Flash Tutorials<br />

• <strong>XML</strong>Socket Simplified [2] , Heliant Whitepaper for ActionScript<br />

• Utilizing Flash Player <strong>XML</strong>Sockets for JavaScript applications [3]<br />

• Palabre, Simple open source <strong>XML</strong> socket server for Flash written in python [4]<br />

References<br />

[1] http://www.gotoandplay.it/_articles/2003/12/xmlSocket.php<br />

[2] http://www.heliant.net/~stsai/code/<br />

[3] http://www.devpro.it/xmlsocket/<br />

[4] http://palabre.gavroche.net


XPath 188<br />

XPath<br />

Paradigm: Query language
Appeared in: 1999
Developer: W3C
Stable release: 2.0 (January 23, 2007)
Major implementations: JavaScript, C#, Java
Influenced by: XSLT, XPointer
Influenced: XML Schema, XForms

XPath, the <strong>XML</strong> Path <strong>Language</strong>, is a query language for selecting nodes from an <strong>XML</strong> document. In addition,<br />

XPath may be used to compute values (e.g., strings, numbers, or Boolean values) from the content of an <strong>XML</strong><br />

document. XPath was defined by the World Wide Web Consortium (W3C).<br />

History<br />

The XPath language is based on a tree representation of the <strong>XML</strong> document, and provides the ability to navigate<br />

around the tree, selecting nodes by a variety of criteria. [1] In popular use (though not in the official specification), an<br />

XPath expression is often referred to simply as an XPath.<br />
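As a rough illustration of navigating the tree and selecting nodes by different criteria, the sketch below uses Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath 1.0. The sample document and its element names are invented for this example.<br />

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
DOC = """
<library>
  <book lang="en"><title>Dune</title><year>1965</year></book>
  <book lang="fr"><title>Candide</title><year>1759</year></book>
  <book lang="en"><title>Emma</title><year>1815</year></book>
</library>
"""
root = ET.fromstring(DOC)

# Child steps: every <title> child of every <book> child of the root.
all_titles = [t.text for t in root.findall("./book/title")]

# Attribute predicate: only books whose lang attribute equals "en".
english_titles = [t.text for t in root.findall("./book[@lang='en']/title")]

# Positional predicate: the title of the second <book> child.
second_title = root.find("./book[2]/title").text
```

Each expression is a path through the tree; the predicates in square brackets narrow the selection by attribute value or by position.<br />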

XPath was originally motivated by a desire to provide a common syntax and behavior model between XPointer and XSLT.<br />

Subsets of the XPath query language are also used in other W3C specifications such as <strong>XML</strong> Schema and XForms.<br />

Versions<br />

There are currently two versions in use.<br />

• XPath 1.0 became a Recommendation on 16 November 1999 and is widely implemented and used, either on its<br />

own (called via an API from languages such as Java, C# or JavaScript), or embedded in languages such as XSLT<br />

or XForms.<br />

• XPath 2.0 is the current version of the language; it became a Recommendation on 23 January 2007. A number of<br />

implementations exist but are not as widely used as XPath 1.0. The XPath 2.0 language specification is much<br />

larger than XPath 1.0 and changes some of the fundamental concepts of the language such as the type system.<br />

The most notable change is that XPath 2.0 has a much richer type system. [2] Every value is now a sequence (a single<br />

atomic value or node is regarded as a sequence of length one). XPath 1.0 node-sets are replaced by node sequences,<br />

which may be in any order.<br />

To support richer type sets, XPath 2.0 offers a greatly expanded set of functions and operators.<br />

XPath 2.0 is in fact a subset of XQuery 1.0. It offers a for expression which is a cut-down version of the "FLWOR"<br />

expressions in XQuery. It is possible to describe the language by listing the parts of XQuery that it leaves out: the<br />

main examples are the query prolog, element and attribute constructors, the remainder of the "FLWOR" syntax, and<br />

the typeswitch expression.



See also<br />

• XPath 1.0<br />

• XPath 2.0<br />

External links<br />

• XPath syntax [3]<br />

• XPath 1.0 specification [4]<br />

• XPath 2.0 specification [5]<br />

• What's New in XPath 2.0 [6]<br />

References<br />

[1] Article on xpath in techsoftcomputing.com<br />

[2] XPath 2.0 supports atomic types, defined as built-in types in <strong>XML</strong> Schema, and may also import user-defined types from a schema.<br />

(http://www.techsoftcomputing.com)<br />

[3] http://www.w3schools.com/XPath/xpath_syntax.asp<br />

[4] http://www.w3.org/TR/xpath<br />

[5] http://www.w3.org/TR/xpath20/<br />

[6] http://www.xml.com/pub/a/2002/03/20/xpath2.html<br />

XPath 2.0<br />

XPath 2.0 is the current version of the XPath language defined by the World Wide Web Consortium (W3C). It<br />

became a Recommendation on 23 January 2007.<br />

XPath is used primarily for selecting parts of an <strong>XML</strong> document. For this purpose the <strong>XML</strong> document is modelled as<br />

a tree of nodes. XPath allows nodes to be selected by means of a hierarchic navigation path through the document<br />

tree.<br />

The language is significantly larger than its predecessor, XPath 1.0, and some of the basic concepts such as the data<br />

model and type system have changed. The two language versions are therefore described in separate articles.<br />

XPath 2.0 is used as a sublanguage of XSLT 2.0, and it is also a subset of XQuery 1.0. All three languages share the<br />

same data model, type system, and function library, and were developed together and published on the same day.<br />

Data model<br />

Every value in XPath 2.0 is a sequence of items. The items may be nodes or atomic values. An individual node or<br />

atomic value is considered to be a sequence of length one. Sequences may not be nested.<br />
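The rule that sequences are flat can be modelled in a few lines of Python. This is only a sketch of the data model, not an XPath engine: Python tuples stand in for sequences, and the helper name sequence is hypothetical.<br />

```python
def sequence(*parts):
    """Build a flat sequence from single items and other sequences."""
    flat = []
    for part in parts:
        if isinstance(part, tuple):
            # A sequence inside a sequence is spliced in, never nested.
            flat.extend(sequence(*part))
        else:
            # A single item behaves like a sequence of length one.
            flat.append(part)
    return tuple(flat)


nested_input = sequence(1, (2, 3), ())
single_item = sequence(5)
```

Here nested_input has the three items 1, 2, 3, mirroring how the XPath 2.0 expression (1, (2, 3), ()) denotes the same value as (1, 2, 3).<br />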

Nodes are of seven kinds, corresponding to different constructs in the syntax of <strong>XML</strong>: elements, attributes, text<br />

nodes, comments, processing instructions, namespace nodes, and document nodes. (The document node replaces the<br />

root node of XPath 1.0, because the XPath 2.0 model allows trees to be rooted at other kinds of node, notably<br />

elements.)<br />

Nodes may be typed or untyped. A node acquires a type as a result of validation against an <strong>XML</strong> Schema. If an<br />

element or attribute is successfully validated against a particular complex type or simple type defined in a schema,<br />

the name of that type is attached as an annotation to the node, and determines the outcome of operations applied to<br />

that node: for example, when sorting, nodes that are annotated as integers will be sorted as integers.<br />
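The effect of a type annotation on sorting can be imitated in Python. The sketch assumes that values without an integer annotation are ordered as plain strings; the sample values are invented.<br />

```python
values = ["10", "9", "100"]

# Without a type annotation the values are ordered as strings.
sorted_as_strings = sorted(values)

# With an integer annotation the same values are ordered numerically.
sorted_as_integers = sorted(values, key=int)
```

The string ordering places "100" before "9", while the integer ordering gives 9, 10, 100.<br />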

Atomic values may belong to any of the 19 primitive types defined in the <strong>XML</strong> Schema specification (for example,<br />

string, boolean, double, float, decimal, dateTime, QName, and so on). They may also belong to a type derived from



one of these primitive types: either a built-in derived type such as integer or Name, or a user-defined derived type<br />

defined in a user-written schema.<br />

Type system<br />

The type system of XPath 2.0 is noteworthy for the fact that it mixes strong typing and weak typing within a single<br />

language.<br />

Operations such as arithmetic and boolean comparison require atomic values as their operands. If an operand returns<br />

a node (for example, @price * 1.2), then the node is automatically atomized to extract the atomic value. If the input<br />

document has been validated against a schema, then the node will typically have a type annotation, and this<br />

determines the type of the resulting atomic value (in this example, the price attribute might have the type decimal). If<br />

no schema is in use, the node will be untyped, and the type of the resulting atomic value will be untypedAtomic.<br />

Typed atomic values are checked to ensure that they have an appropriate type for the context where they are used:<br />

for example, it is not possible to multiply a date by a number. Untyped atomic values, by contrast, follow a weak<br />

typing discipline: they are automatically converted to a type appropriate to the operation where they are used: for<br />

example with an arithmetic operation an untyped atomic value is converted to the type double.<br />
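The mix of strong and weak typing can be sketched in Python. This is a simplified model under stated assumptions: the class UntypedAtomic and the function multiply are hypothetical, Python float stands in for the type double, and only int and float count as typed numeric values.<br />

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class UntypedAtomic:
    """Stands in for untypedAtomic: text taken from an unvalidated node."""
    text: str


def multiply(left, right):
    """Multiply two atomic values following the rules described above."""
    operands = []
    for value in (left, right):
        if isinstance(value, UntypedAtomic):
            # Weak typing: untyped values are converted to double for arithmetic.
            value = float(value.text)
        elif isinstance(value, bool) or not isinstance(value, (int, float)):
            # Strong typing: a typed value must already suit the operation.
            raise TypeError(f"cannot multiply a value of type {type(value).__name__}")
        operands.append(value)
    return operands[0] * operands[1]


untyped_result = multiply(UntypedAtomic("10.5"), 2)
```

Passing a typed date, for example multiply(date(2010, 6, 17), 2), raises TypeError, which mirrors the rule that a date cannot be multiplied by a number.<br />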

Path expressions<br />

The location paths of XPath 1.0 are referred to in XPath 2.0 as path expressions. Informally, a path expression is a<br />

sequence of steps separated by the "/" operator, for example a/b/c (which is short for child::a/child::b/child::c). More<br />

formally, however, "/" is simply a binary operator that applies the expression on its right-hand side to each item in<br />

turn selected by the expression on the left-hand side. So in this example, the expression a selects all the element<br />

children of the context node that are named a; the expression child::b is then applied to each of these nodes,<br />

selecting all the b children of the a elements; and the expression child::c is then applied to each node in this<br />

sequence, which selects all the c children of these b elements.<br />

The "/" operator is generalized in XPath 2.0 to allow any kind of expression to be used as an operand: in XPath 1.0,<br />

the right-hand side was always an axis step. For example, a function call can be used on the right-hand side. The<br />

typing rules for the operator require that the result of the first operand is a sequence of nodes. The right-hand operand<br />

can return either nodes or atomic values (but not a mixture). If the result consists of nodes, then duplicates are<br />

eliminated and the nodes are returned in document order, an ordering defined in terms of the relative positions of<br />

the nodes in the original <strong>XML</strong> tree.<br />
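The behaviour of "/" as a binary operator can be modelled with Python's xml.etree.ElementTree. The sketch below is not an XPath 2.0 implementation; the helpers child and slash and the sample document are hypothetical, and the model covers only the case where the right-hand side returns nodes.<br />

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the id attributes only make results easy to read.
DOC = ('<r><a><b id="b1"><c id="c1"/></b></a>'
       '<a><b id="b2"><c id="c2"/><c id="c3"/></b></a></r>')
root = ET.fromstring(DOC)


def child(name):
    """Return a step function that selects the child elements called `name`."""
    return lambda node: [c for c in node if c.tag == name]


def slash(left_nodes, step):
    """Apply `step` to each left-hand node, drop duplicates, sort in document order."""
    position = {id(node): index for index, node in enumerate(root.iter())}
    unique = {}
    for node in left_nodes:
        for result in step(node):
            unique[id(result)] = result
    return sorted(unique.values(), key=lambda node: position[id(node)])


# a/b/c evaluated with the root element as the context node.
a_nodes = slash([root], child("a"))
b_nodes = slash(a_nodes, child("b"))
c_ids = [c.get("id") for c in slash(b_nodes, child("c"))]

# Feeding the left-hand nodes reversed and duplicated gives the same result.
shuffled_ids = [c.get("id") for c in slash(b_nodes[::-1] + b_nodes, child("c"))]
```

Both c_ids and shuffled_ids list the c elements in document order, showing that the order and multiplicity of the left-hand nodes do not affect a node result.<br />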

In many cases the operands of "/" will be axis steps: these are largely unchanged from XPath 1.0, and are described<br />

in the article on XPath 1.0.<br />

Other operators<br />

Other operators available in XPath 2.0 include the following:



Operators | Effect<br />

+, -, *, div, mod, idiv | Arithmetic on numbers, dates, and durations<br />

=, !=, &lt;, &gt;, &lt;=, &gt;= | General comparison: compare arbitrary sequences. The result is true if any pair of items, one from each sequence, satisfies the comparison<br />

eq, ne, lt, gt, le, ge | Value comparison: compare single items<br />

is | Compare node identity: true if both operands are the same node<br />

&lt;&lt;, &gt;&gt; | Compare node position, based on document order<br />

union, intersect, except | Compare sequences of nodes, treating them as sets, returning the set union, intersection, or difference<br />

and, or | Boolean conjunction and disjunction. Negation is achieved using the not() function.<br />

to | Defines an integer range, for example 1 to 10<br />

instance of | Determines whether a value is an instance of a given type<br />

cast as | Converts a value to a given type<br />

castable as | Tests whether a value is convertible to a given type<br />
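The difference between general comparison and value comparison is easy to misread, so here is a small Python model. The function names are hypothetical, tuples stand in for sequences, and None stands in for the empty sequence that a value comparison returns when an operand is empty.<br />

```python
import operator
from itertools import product


def general_compare(op, left, right):
    """General comparison: true if ANY pair of items, one from each side, satisfies op."""
    return any(op(a, b) for a, b in product(left, right))


def value_compare(op, left, right):
    """Value comparison: each operand must hold at most one item."""
    if len(left) > 1 or len(right) > 1:
        raise TypeError("value comparison needs a single item on each side")
    if not left or not right:
        return None  # models the empty sequence
    return op(left[0], right[0])


overlap_equal = general_compare(operator.eq, (1, 2), (2, 3))
same_but_unequal = general_compare(operator.ne, (1, 2), (1, 2))
single_less = value_compare(operator.lt, (1,), (2,))
```

Both overlap_equal and same_but_unequal are true: with general comparison, a sequence of several items can be both "equal" and "not equal" to itself, because only one matching pair is needed.<br />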

Conditional expressions may be written using the syntax if (A) then B else C.<br />

XPath 2.0 also offers a for expression, which is a small subset of the FLWOR expression from XQuery. The<br />

expression for $x in X return Y evaluates the expression Y for each value in the result of expression X in turn,<br />

referring to that value using the variable reference $x.<br />
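The for and if expressions map closely onto a loop with a conditional. The Python sketch below models them with a hypothetical helper, for_expression, in which tuples stand in for sequences and range(1, 6) stands in for the XPath range 1 to 5.<br />

```python
def for_expression(items, body):
    """Model of `for $x in X return Y`: evaluate body per item and concatenate the results."""
    result = []
    for x in items:
        value = body(x)
        # Results are spliced together into one flat sequence.
        result.extend(value if isinstance(value, tuple) else (value,))
    return tuple(result)


# for $x in 1 to 5 return $x * $x
squares = for_expression(range(1, 6), lambda x: x * x)

# for $x in 1 to 4 return if ($x mod 2 = 0) then "even" else "odd"
labels = for_expression(range(1, 5), lambda x: "even" if x % 2 == 0 else "odd")

# for $x in (1, 2) return ($x, $x)
doubled = for_expression((1, 2), lambda x: (x, x))
```

The last call shows that when the body returns several items per iteration, the overall result is still a single flat sequence.<br />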

Function library<br />

The function library in XPath 2.0 is greatly extended from the function library in XPath 1.0.<br />

The functions available include the following:<br />

Purpose | Example functions<br />

General string handling | lower-case, upper-case, substring, substring-before, substring-after, translate, starts-with, ends-with, contains, string-length, concat, normalize-space, normalize-unicode<br />

Regular expressions | matches, replace, tokenize<br />

Arithmetic | count, sum, avg, min, max, round, floor, ceiling, abs<br />

Dates and times | adjust-dateTime-to-timezone, current-dateTime, day-from-dateTime, month-from-dateTime, days-from-duration, months-from-duration, etc.<br />

Properties of nodes | name, node-name, local-name, namespace-uri, base-uri, nilled<br />

Document handling | doc, doc-available, document-uri, collection, id, idref<br />

URIs | encode-for-uri, escape-html-uri, iri-to-uri, resolve-uri<br />

QNames | QName, namespace-uri-from-QName, prefix-from-QName, resolve-QName<br />

Sequences | insert-before, remove, subsequence, index-of, distinct-values, reverse, unordered, empty, exists<br />

Type checking | one-or-more, exactly-one, zero-or-one<br />
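To give a feel for the sequence functions, the sketch below models three of them in Python. It is a simplified model, not the library itself: tuples stand in for sequences, arguments are restricted to integers, and positions are counted from 1 as in XPath.<br />

```python
def index_of(seq, search):
    """Model of index-of: the 1-based positions at which `search` occurs."""
    return tuple(i for i, item in enumerate(seq, start=1) if item == search)


def subsequence(seq, start, length):
    """Model of subsequence: items at positions start .. start + length - 1."""
    first = max(start, 1)
    end = start + length           # exclusive upper position
    return tuple(seq[first - 1:max(end - 1, first - 1)])


def remove(seq, position):
    """Model of remove: drop the item at `position`; out-of-range positions change nothing."""
    return tuple(item for i, item in enumerate(seq, start=1) if i != position)


positions = index_of((10, 20, 30, 20), 20)
middle = subsequence(("a", "b", "c", "d", "e"), 2, 3)
shorter = remove(("a", "b", "c"), 2)
```

Positions that fall outside the sequence are simply ignored rather than treated as errors, which the model reproduces for subsequence and remove.<br />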



Backwards compatibility<br />

Because of the changes in the data model and type system, not all expressions in XPath 2.0 have exactly the same<br />

effect as in 1.0. The main difference is that XPath 1.0 was more relaxed about type conversion. For example,<br />

comparing two strings with a relational operator ("4" > "4.0") was permitted but performed a numeric comparison; in XPath 2.0 this is<br />

defined to compare the two values as strings using a context-defined collating sequence.<br />
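The change in comparison semantics can be imitated in Python. The two functions below are hypothetical models, and Python's code point ordering of strings stands in for the context-defined collating sequence, which in a real processor may order strings differently.<br />

```python
def greater_than_xpath1(left: str, right: str) -> bool:
    """XPath 1.0 model: a relational operator converts both strings to numbers."""
    return float(left) > float(right)


def greater_than_xpath2(left: str, right: str) -> bool:
    """XPath 2.0 model: two strings are compared as strings (code point order here)."""
    return left > right


numeric_result = greater_than_xpath1("10", "9")
string_result = greater_than_xpath2("10", "9")
```

For the operands "10" and "9" the two models disagree: the numeric comparison is true, while the string comparison is false because "1" sorts before "9".<br />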

To ease transition, XPath 2.0 defines a mode of execution in which the semantics are modified to be as close as<br />

possible to XPath 1.0 behavior. When using XSLT 2.0, this mode is activated by setting version="1.0" as an attribute<br />

on the xsl:stylesheet element. This still doesn't offer 100% compatibility, but any remaining differences are only<br />

likely to be encountered in unusual cases.<br />

Support<br />

Support for XPath 2.0 is still limited.<br />

• For browser support, see Comparison of layout engines (<strong>XML</strong>).<br />

External links<br />

• XPath 2.0 specification [5]<br />

• What's New in XPath 2.0 [6]<br />

Xs3p<br />

xs3p is an XSLT stylesheet that generates XHTML documentation from <strong>XML</strong> Schema Definition language (XSD)<br />

schema.<br />

xs3p requires an XSLT processor such as Xalan from the Apache Software Foundation. The results can generally be viewed<br />

with any browser that supports Cascading Style Sheets Level 2 (CSS2) and XHTML 1.0, such as Internet Explorer 5.5,<br />

Mozilla 1.0, Netscape 6 or Opera 5 (or later).<br />

xs3p was developed by Project Titanium [1] , Distributed Systems Technology Centre (DSTC) Pty Ltd. and<br />

distributed under a Mozilla Public License (MPL). xs3p is used by both the Oxygen <strong>XML</strong> Editor and Stylus Studio<br />

to generate schema documentation, and a modified version of the stylesheet is included with this program.[2]<br />

Recently the DSTC website, which was officially hosting the xs3p stylesheet, has become unavailable. A download<br />

of the xs3p stylesheet is available from the FiForms <strong>XML</strong> Definitions [3] project.<br />

References<br />

[1] http://titanium.dstc.edu.au/xml/xs3p/<br />

[2] http://www.oxygenxml.com/forum/ftopic2027.html<br />

[3] http://xml.fiforms.org/xs3p/



XSQL<br />

XSQL combines the power of <strong>XML</strong> and SQL to provide a language- and database-independent means to store and<br />

retrieve SQL queries and their results.<br />

Description<br />

XSQL is the combination of <strong>XML</strong> (Extensible <strong>Markup</strong> <strong>Language</strong>) and SQL (Structured Query <strong>Language</strong>) to provide<br />

a language- and database-independent means for storing SQL queries, clauses and query results. XSQL development<br />

is still in its infancy and welcomes suggestions for improvement (especially in the form of patches).<br />

The XSQL project currently has a DTD (Document Type Definition) that defines the structure of an XSQL document,<br />

and researchers are modifying the <strong>XML</strong>::Generator::DBI Perl module so that it can parse XSQL<br />

documents and provide a tree- and event-based API (Application Programming Interface) to their elements. These<br />

modifications are being submitted as patches to the module's maintainer, Matt Sergeant. Thus, the source code does<br />

not live at this site.<br />

It is hoped that XSQL will provide an end-to-end solution for handling SQL in Perl (other languages can be<br />

supported if there is interest). Creating XSQL implementations in other languages will allow all databases to support<br />

<strong>XML</strong> without having to alter the database source code in any way. The XSQL implementations can take care of<br />

turning XSQL into SQL and turning results into XSQL.<br />

External links<br />

• XSQL project website [1]<br />

References<br />

[1] http://xsql.sourceforge.net/



Article Sources and Contributors<br />

Binary <strong>XML</strong> Source: http://en.wikipedia.org/w/index.php?oldid=353493919 Contributors: Chrisch, Cpl Syx, CyberSkull, Cybercobra, DSosnoski, Hervegirod, Hooperbloob, Joriki,<br />

Jzhang2007, Mac D83, Mipadi, Ordinant, Pengo, Potato32, Qutezuce, Semog, Skrapion, Sneftel, Tbleier, The Anome, Thumperward, 44 anonymous edits<br />

Business Process Definition Metamodel Source: http://en.wikipedia.org/w/index.php?oldid=349128636 Contributors: BPDM, Baudoin1, Diveintobpm, Ehheh, Goflow6206, Jpbowen, Lurp,<br />

Sisyph, Tomdebevoise, 8 anonymous edits<br />

CDATA Source: http://en.wikipedia.org/w/index.php?oldid=365608659 Contributors: Archer3, Barefootliam, CesarB, Ded.morris, Duke33, Ehn, ILikeThings, Luislobo, MC10, Mjb, Npowell,<br />

Phluid61, PoliticalJunkie, Renesis, Rjwilmsi, Thickycat, WakiMiko, Wiml, 49 anonymous edits<br />

CDuce Source: http://en.wikipedia.org/w/index.php?oldid=367963828 Contributors: AndrewGNF, Apokrif, Elonka, Elwikipedista, Frisch, Hans Adler, Jaxhere, Sourada, Stentie, The Thing<br />

That Should Not Be, Trovatore, VoluntarySlave<br />

Character entity reference Source: http://en.wikipedia.org/w/index.php?oldid=365999121 Contributors: ANONYMOUS COWARD0xC0DE, Bitnap, Clixus, DePiep, Derekread, Gazpacho,<br />

Gdr, Jatkins, Koujimachi07, Loadmaster, M7, Martin451, Mhkay, Mjb, Mzajac, Oashi, Svick, Tokek, UU, 12 anonymous edits<br />

CodeSynthesis XSD Source: http://en.wikipedia.org/w/index.php?oldid=333209308 Contributors: Boseko, Bunnyhop11, Csabo, Nicolas1981, Pedram.salehpoor, Soumyasch, 4 anonymous<br />

edits<br />

D3L Source: http://en.wikipedia.org/w/index.php?oldid=344098822 Contributors: Dawynn, Fabrictramp, Jackollie, Malcolma, Squids and Chips, Vgiasolli, 4 anonymous edits<br />

Darwin Information Typing Architecture Source: http://en.wikipedia.org/w/index.php?oldid=357405956 Contributors: AlexSpurling, Andy Dingley, Biker JR, Bobdoyle, Bruce Esrig,<br />

ChrisLott, Clayoquot, Cmsreview, Cschleifstein, Deathphoenix, DeweyQ, Dmccreary, Doug Bell, Elharo, Eslchip, Ghettoblaster, Hgkamath, Infoprosmktg, JDBravo, JamesBWatson,<br />

JosebaAbaitua, Jwalling, Krusch, LCP, LeeHunter, Masiano, MatisseEnzer, Mhedblom, Mythobeast, Ndenison, Nozipedia, Ohnoitsjamie, Roberto999, Ru.spider, Sernauser, Sibersandi,<br />

Skierpage, Terrillja, Toussaint, Tsemii, Walk Up Trees, Who, WissenVeredeln, Yorrose, 78 anonymous edits<br />

DITA Open Toolkit Source: http://en.wikipedia.org/w/index.php?oldid=367972391 Contributors: Andy Dingley, Bobdoyle, Cander0000, Elwikipedista, Ewlyahoocom, Sernauser<br />

Document Structure Description Source: http://en.wikipedia.org/w/index.php?oldid=344099967 Contributors: Amalas, Asser hassanain, Bunnyhop11, Dawynn, Dreftymac, Jerazol,<br />

Kbdank71, Mamling, Minghong, Rene Mas, 8 anonymous edits<br />

Document-Centric Source: http://en.wikipedia.org/w/index.php?oldid=319489730 Contributors: Canis Lupus, Jzhang2007, Malcolma, Oh Snap<br />

Document-centric <strong>XML</strong> processing Source: http://en.wikipedia.org/w/index.php?oldid=363018500 Contributors: Aj00200, Gary King, Jzhang2007, LilHelpa, R'n'B, RJFJR, Victor Lopes, 3<br />

anonymous edits<br />

Dynamic <strong>XML</strong> Source: http://en.wikipedia.org/w/index.php?oldid=302412968 Contributors: Aboriginal Noise, Egpetersen, Filmackay, Malcolma, 1 anonymous edits<br />

ECMAScript for <strong>XML</strong> Source: http://en.wikipedia.org/w/index.php?oldid=368395484 Contributors: AVRS, Aaronbrick, Ale jrb, Asqueella, Bobince, CesarB, Cybit, David Gerard, Deineka,<br />

DonToto, Drdamour, Drukepple, Everyking, Ffangs, Ghettoblaster, Guppie, Herorev, Imroy, Intgr, Jasonglchu, Klondike, Kuteni, Mbini, Mysterd429, Niqueco, Onevalefan, Pfurla, Pointillist,<br />

Schepers, Schristie, Shepard, Simonster, Spankman, Speck-Made, Tabletop, Vishnava, William Graham, WulfTheSaxon, Ysangkok, 83 anonymous edits<br />

Efficient <strong>XML</strong> Interchange Source: http://en.wikipedia.org/w/index.php?oldid=335112487 Contributors: Biscuittin, Cybercobra, Darobin, Erechtheus, Hervegirod, Jeffhos, Pengo, Sdw,<br />

TuukkaH, 10 anonymous edits<br />

Embedded RDF Source: http://en.wikipedia.org/w/index.php?oldid=344100836 Contributors: 4th-otaku, Cander0000, Dawynn, Earle Martin, Iridescent, Keithalexander, Mathiastck, Mdd, O<br />

keyes, Prodoc, Shepard, The Anome, Themfromspace, Ultimatewisdom, 1 anonymous edits<br />

EpiDoc Source: http://en.wikipedia.org/w/index.php?oldid=331916994 Contributors: Bluemoose, Bpiche, El C, Gabrielbodard, Paregorios, Polon11, Tobias Bergemann, XPtr, 3 anonymous<br />

edits<br />

eXtensible Server Pages Source: http://en.wikipedia.org/w/index.php?oldid=171773630 Contributors: Honestcurio, John Vandenberg, Jutta234, 2 anonymous edits<br />

Fast Infoset Source: http://en.wikipedia.org/w/index.php?oldid=363168067 Contributors: Beetstra, Doug Bell, Drano, Dreftymac, Ernstdehaan, Ghettoblaster, Gurch, Hervegirod, Iharjw,<br />

JavaIsGroovy, Johndrinkwater, Jzhang2007, Ksn, Merlin12, Obiltschnig, Pelegri, Precious Roy, Prickus, Torc2, Tuntable, Tycoon de, Warreed, 35 anonymous edits<br />

Global listings format Source: http://en.wikipedia.org/w/index.php?oldid=323960620 Contributors: Alvin Seville, Capnstank, 1 anonymous edits<br />

GMX Source: http://en.wikipedia.org/w/index.php?oldid=297460141 Contributors: Azydron, Canadian, GEn3S!Z, GregorB, Ikar.us, Jared Preston, Malinaccier, Pegship, Radon210, 8<br />

anonymous edits<br />

GMX-V Source: http://en.wikipedia.org/w/index.php?oldid=325214239 Contributors: Azydron, Emeraude, 2 anonymous edits<br />

Head-Body Pattern Source: http://en.wikipedia.org/w/index.php?oldid=332049421 Contributors: Duncharris, Pegship, RedWolf, Robertvan1, Timc, Uthbrian, Ynhockey, 3 anonymous edits<br />

HyTime Source: http://en.wikipedia.org/w/index.php?oldid=334129102 Contributors: Andreas Kaufmann, Klimov, Mjb, Mosca, Onlyemarie, Sderose, Thumperward, 9 anonymous edits<br />

Internationalization Tag Set Source: http://en.wikipedia.org/w/index.php?oldid=247890861 Contributors: Ghettoblaster, Sintaku, Ysavourel, 18 anonymous edits<br />

Klip Source: http://en.wikipedia.org/w/index.php?oldid=359761468 Contributors: Bogrady, Diveloop, Gdrori, Melaen, SDC, Utcursch, Wizard191, Wykis, Xe7al, 16 anonymous edits<br />

List of <strong>XML</strong> and HTML character entity references Source: http://en.wikipedia.org/w/index.php?oldid=365723538 Contributors: Adoniscik, Alerante, Andrew Carlssin, AxSkov, Beland,<br />

BenjaminHare, Cbrunet, Christian75, Clixus, Cy21, DePiep, DmitTrix, ERcheck, Fudo, Gaius Cornelius, George Hernandez, Gerbrant, Happy-melon, Isaac Dupree, J4 james, Jatkins, Joejava,<br />

John Vandenberg, Kf4yfd, Kieff, LiborX, Loadmaster, Mathtinder, Mhkay, Mindmatrix, Mjb, Monedula, NJJ.Rocher, Ohnoitsjamie, Phil Boswell, Psychonaut, Radon210, Reinyday, Reisio,<br />

RetiredUser2, Ringbang, Rjwilmsi, Rwwww, SallyForth123, Sam 1123, Suruena, Tamfang, Tezza2k1, The Thing That Should Not Be, The wub, Thinboy00P, Tokek, TreasuryTag, Wavelength,<br />

Wolf1728, Wwoods, 93 anonymous edits<br />

Log4js Source: http://en.wikipedia.org/w/index.php?oldid=333453341 Contributors: Amux, Euchiasmus, Ian Moody, JLaTondre, Stritti, Wdflake, 5 anonymous edits<br />

MAREC Source: http://en.wikipedia.org/w/index.php?oldid=352689127 Contributors: Hydrox, Mpgarnier, Ofalk, 13 anonymous edits<br />

Media Object Server Source: http://en.wikipedia.org/w/index.php?oldid=282541918 Contributors: Chungkuo, The Anome, Theroachman, Xezbeth, 1 anonymous edits<br />

METS Source: http://en.wikipedia.org/w/index.php?oldid=357913828 Contributors: Buiras, CBM, Charles Brooking, Davissp, DerHexer, Elonka, Grumpycraig, Isnow, Lyc. cooperi,<br />

M4gnum0n, Nicolas1981, Paulerb, Rich Farmbrough, Sallyrenee, SchfiftyThree, Stf, Thryduulf, Trovatore, WilliamDenton, 15 anonymous edits<br />

Numeric character reference Source: http://en.wikipedia.org/w/index.php?oldid=364363130 Contributors: ABCD, ANONYMOUS COWARD0xC0DE, Ahoerstemeier, Ajgorhoe, D99figge,<br />

David H. Flint, DePiep, Gudeldar, Hytri, Indefatigable, Karl Dickman, Kjoonlee, LeoNomis, Million Moments, Mjb, Ringbang, Shlomital, TreasuryTag, Voidvector, 11 anonymous edits<br />

Office Open <strong>XML</strong> Source: http://en.wikipedia.org/w/index.php?oldid=368502841 Contributors: AJRobbins, AVRS, Adi86, Adiel, Agentbla, AlbinoFerret, Ale2006, AlexHudson, Alexbrn,<br />

Alexmaco, AlistairMcMillan, Aljullu, AllTheThings, Alvestrand, Amux, Ancheta Wis, Andrew J. MacDonald, AnonMoos, Ans, Arebenti, Arnieswap, ArnoldReinhold, Artw, Asbjornu, Atchom,<br />

Avenue, BCable, Bbatsell, BeSherman, Beetstra, BenLanghinrichs, Bender235, Bento00, Biztalkguy, Blaisorblade, Blakkandekka, Bobblehead, Bobman52, Boing! said Zebedee, Booyabazooka,<br />

BradC, Brucevdk, Brumle72, Bryan Derksen, Bull Market, Cahill1, Cander0000, Catskul, CattleGirl, CesarB, Cfauck, Charles Esson, Chealer, CheesePlease NL, Cheros, Chowbok,<br />

Chuckhoffmann, Cibumamo, Clicketyclack, Cloud02, CodeNaked, Codyrank, CritterNYC, CyberSkull, D-Notice, DMacks, Damian Yerrick, Danfuzz, Danieldotcom, Dave souza, David Gerard,



DavidJ710, Davidprior, Delafield, Denis.labaye, DennyColt, DerHexer, Dguertin, Diamonddavej, Discospinster, Dockurt2k, Dolda2000, Donho, Dougofborg, Dovi, Downcreate, Dreftymac,<br />

Dwheeler, Długosz, Earthsound, EatMyShortz, Ebyabe, Edschofield, Egandrews, Elagatis, Emurphy42, Etscrivner, Euchiasmus, EvenT, Evice, Existhigh, Feedmecereal, Fingerz, Fjarlq,<br />

Fleminra, Froth, Fulldecent, Gabriella11758, Gabrielzorz, Gagravarr, Gakrivas, GangsterPanda, Garnwraly, Ghettoblaster, Gilliam, Greg L, GregorB, Guyjohnston, H2g2bob, HAl, HPSCHD,<br />

HaeB, Hamish Lawson, Hankwang, HarryHenryGebel, Harumphy, Hebrides, Helpsloose, Herorev, Hervegirod, Herzen, HiDrNick, HorsePunchKid, Hu12, HubertRoksor, Ildefonso Giron, Innv,<br />

Intgr, Iridescent, Ironiridis, Irperez, Isaac Dupree, ItsProgrammable, Iunaw, JLaTondre, Jac16888, Jacob Poon, JanusDC, Jeffmcneill, Jeltz, Jleedev, Jlovick, Joelpt, John Nevard, John of Reading,<br />

John zhu, JohnOwens, Johndrinkwater, Joker1984, Joker2007, Jonathan888, Joshua Issac, Jstaniek, Jtnn, Juliancolton, Justin545, Juventas, Jynus, KAMiKAZOW, Kaern, Karada, Karnesky,<br />

Kayano, Kedar damle, Kegart, Kenb215, Kenyon, Ketil, Khalid hassani, Khukri, KiloByte, Kilz, Klauys, Kneale, Kozuch, Kravietz, Kungfuadam, Latha P Nair, Laughton.andrew, Leandrod,<br />

LeeHunter, Leotohill, Lester, Liftarn, Lisamh, Lulu of the Lotus-Eaters, MBisanz, MZMcBride, Mahanga, Marbux, Mardus, Masterpjz9, Mat macwilliam, Mateo LeFou, Mathias<br />

Schindler, Mauro Bieg, Max Naylor, Mcld, Melomel, Mentaka, Merbenz, Micro01, Midnightcomm, Mipadi, Mitchoyoshitaka, Mmj, MonirTime, Mrand, Mratzloff, Mxn, NJA, Nberardi, Nbibler,<br />

Nealmcb, NeutralPoint, Niemeyerstein en, Nigelj, Nil Einne, Nitesh.dubey, Nmagedman, Noloader, Octahedron80, Odie5533, Odoncaoa, Oggiejnr, Oneiros, Opium, Orrc, Osaeris, Oub, Pairadox,<br />

Palfrey, Pandion auk, ParticleMan, Partyoffive, Paul Foxworthy, Paul1337, Pdfpdf, Peak Freak, Peashy, Perfect Proposal, Phil153, Piano non troppo, Pieterh, Piken, Piperh, Pixelface, PlainHolds,<br />

Plopez339, PokeYourHeadOff, PonThePony, Praetor alpha, Promethean promise, Putt1ck, Quantumelfmage, R3m0t, RS Ren, Rafert, Rainwarrior, Ramdrake, Rasmus.p, Raul654, Rcandelori,<br />

Reedy, RekishiEJ, Remiel, Reuqr, Rick Jelliffe, Rizox, Rjwilmsi, Rlmorgan, Robdurbar, RockMFR, Ronark, RossPatterson, Rursus, Ruud Koot, Ryuch, Régis Décamps, Salimfadhley, Scarian,<br />

Scientus, Scisonic, Scj2315, Sdedeo, SeanDuggan, Segedunum, Seweso, Shd, Shir Khan, Shmget, SigmaEpsilon, Sigmundg, Signalhead, Simosx, Sir Anon, SkyWalker, Sladen, SmartWarthog,<br />

Smartse, Soumyasch, Spartaz, Spitzak, SpuriousQ, Stang99gtv8, Stannered, Stephenchou0722, SteveSims, Stevenfruitsmaak, Stevenj, Subsume, Sumb, Superluser, Superm401, Svdb, Swiftdove,<br />

Syncrosoft, TKD, Ta bu shi da yu, Tabletop, Tackit, TakuyaMurata, Tarmle, Tatoute, Tawker, Tayste, Tgape, The Anome, The Divine Fluffalizer, The Thing That Should Not Be, TheMadGerman,<br />

Thelennonorth, Theonlyedge, Theosch, Thiseye, Thrapper, Thumperward, Tigernike1, Tiptoety, Tmpsantos, Todd Vierling, Tomdobb, Toolnut, Torfason, Towsonu2003, Tprit, Trails, TraxPlayer,<br />

Tregoweth, Trevordevore, Ttiotsw, Tunah, Turlo Lomon, Tvhuang, Tvol, Ultramandk, Utcursch, Veinor, Verbal, Verdy p, Vexorian, Virtualt333, WalterGR, Warren, Webhat, West London<br />

Dweller, Wheelybrook, WhiteCat, WiebeVanDerWorp, Wiki Raja, Wiki1959, WikiLaurent, Witoldp, Wmorein, Womble, Work permit, Wrightbus, WurmWoode, X-Bert, X-dark, Xpclient,<br />

Xx521xx, Yellowdesk, Yesudeep, Yoonkit, Zayani, Zero0w, Zoobab, Zsvedic, 1036 anonymous edits<br />

Office Open <strong>XML</strong> file formats Source: http://en.wikipedia.org/w/index.php?oldid=363884163 Contributors: Alvestrand, CommonsDelinker, Nigelj, Rjwilmsi, Verdy p, 3 anonymous edits<br />

OIO<strong>XML</strong> Source: http://en.wikipedia.org/w/index.php?oldid=232294100 Contributors: Covergaard, JosefAssad, Part Deux, 2 anonymous edits<br />

Open <strong>XML</strong> Paper Specification Source: http://en.wikipedia.org/w/index.php?oldid=368337454 Contributors: A.Ou, Akhristov, Alecamiga, Alexander Abramov, Ambarish, Azakea,<br />

Benhutchings, Bfinn, Blicktek, Bokarevitch, Callidior, Chris Chittleborough, Chris the speller, Chuck Marean, CobraA1, Csiahistorian, Cwolfsheep, Cynical, DBrane, Danglobalgraph, David<br />

Haslam, Dawnseeker2000, Digita, Etienne.navarro, Feedmecereal, Filemon, FleetCommand, Fleminra, Frap, Fritz Saalfeld, Gertyk, Ghettoblaster, Gioto, Gordonf, HAl, Hervegirod, Inarius,<br />

JLaTondre, JanSöderback, Javalenok, Joaopaulo1511, Joelholdsworth, Joker1984, Jonhall, Jutiphan, Kpearce, Lasindi, Lboonsen, Leafnode, Lhammer610, LobStoR, LodesterreLLC, Maerk,<br />

Marasmusine, Marcosw, Mathrick, Morris lin, Mpbailey, Msiebuhr, Mythobeast, Nihiltres, Nil Einne, Nixps, Objectivesea, Oneiros, Orderud, Owen Ambur, Paul A, Paulej, Pelago, Philippe,<br />

PseudoSudo, Psiphiorg, Qef, Quiggles, RedAznor, Rjwilmsi, SURIV, SW2000, Seth Nimbosa, Simaocampos, Snailshoes, Soumyasch, Stephenchou0722, Sterrys, Sugeina, Superm401, Svick,<br />

Thumperward, Todd Vierling, Tooki, TotoBaggins, Toussaint, TreasuryTag, Uzume, Voidxor, Warren, WatchAndObserve, Wikianon, Woohookitty, Wq-man, Xpclient, ZimZalaBim, 159<br />

anonymous edits<br />

PCDATA Source: http://en.wikipedia.org/w/index.php?oldid=360253969 Contributors: Chealer, Fæ, Lobner, Malcolma, Renata3, Winterheat, 4 anonymous edits<br />

Plain Old <strong>XML</strong> Source: http://en.wikipedia.org/w/index.php?oldid=268137155 Contributors: Alynna Kasmira, Arto B, Atifmk, BrokenSegue, Bunnyhop11, Chalisa, Charivari, CondeNasty,<br />

Djmackenzie, Dpm64, Emersoni, Evil Monkey, GermanX, Hoos-foos, Julesd, LittleDan, MarXidad, Mindmatrix, Minghong, Tantek, Thumperward, Toby Woodwark, ZayZayEM, 16 anonymous<br />

edits<br />

Portable Application Description Source: http://en.wikipedia.org/w/index.php?oldid=353334808 Contributors: Aaleksanyants, Bitsmith, Christopher.widdowson, Gesslein, Here, Jll, MER-C,<br />

Pegship, RenegadeMinds, Riki, TheParanoidOne, 22 anonymous edits<br />

Publishing Requirements for Industry Standard Metadata Source: http://en.wikipedia.org/w/index.php?oldid=367751412 Contributors: Malcolma, Mauro Bieg, Prismwg, Rettetast, Rich<br />

Farmbrough<br />

QName Source: http://en.wikipedia.org/w/index.php?oldid=319026346 Contributors: Amire80, Anthony Appleyard, Frap, Gurch, Jnutting512, Motine, Stezton, Zundark, 1 anonymous edits<br />

QTI Source: http://en.wikipedia.org/w/index.php?oldid=361214744 Contributors: Alexcq, Bektur, Benscripps, Carnildo, ChristopheS, Fujnky, Gcm, Gimboid13, Grussak, Hammersmith38,<br />

J04n, Ja6a, JimTittsler, Larham, Lastkaled, Lindsey Kuper, Olak Ksirrin, RobertG, Ruale, Staffordaz, The7thone1188, Ysangkok, 33 anonymous edits<br />

Resource Description Framework Source: http://en.wikipedia.org/w/index.php?oldid=368011899 Contributors: 213.253.39.xxx, A5b, Acaciz, Akinyemi, Alcalazar, Alexius08,<br />

AlistairMcMillan, Amire80, AnAj, Andy Dingley, Angela, Ankitasdeveloper, Anrie Nord, Arto B, Asqueella, Backoftheboat, Barticus88, Bawolff, BeakerK44, BernhardBauer, Blathnaid,<br />

Blue.death, BobKeim, Booles, Broosty, C1932, Caoimhin, Carbuncle, Carlo.Ierna, Carmenutzadd, Cedringen, Chmod007, Cjcollier, Clan-destine, CloCkWeRX, Conversion script, Cygri, DRE,<br />

DanBri, Dancter, Daniele Gallesio, Davemck, Deodar, Dmccreary, Donald Albury, Dpv, Dr Shorthair, Dtcdthingy, Earle Martin, EddyVanderlinden, Emperor, EoGuy, Erick.Antezana, Esprit15d,<br />

Finell, Fleminra, FrankTobia, Fredrik, Funandtrvl, Ghettoblaster, Graham87, GregorB, Gyuri10, Haakon, Harrigan, Hetar, Hu12, Ian Spackman, IanDBailey, Ianalchemy, Jdthood,<br />

JesseChisholm, Jhammerb, Joe Jarvis, John Vandenberg, JonHarder, Jonathan O'Donnell, Jpbowen, Kaihsu, Kbdank71, Khurrad, KimvdLinde, KingsleyIdehen, Kiranoush, Kku, Knavesdied,<br />

Kwan, Langec, Liftarn, Lokatzis, Luk, Lysy, M3wiki1, Maduskis, Mandarax, Mark Renier, Mathiastck, Mauro Bieg, Mav, Mccaffry, Mdd, Mecanismo, Michael Hardy, MichaelBillington,<br />

Michal Nebyla, Midnight Madness, Minghong, Mjb, N2e, N3c, Nicolas1981, Nikevich, Niteowlneils, Nkour, Novum, Nux, Ojw, Onlyemarie, Pagatiponon, PatHayes, Pemboid, Pete142, Piet<br />

Delport, Pointillist, Pvosta, RaymondYee, RedWolf, Roland2, RossPatterson, Rursus, SEWilco, SMcCandlish, SamuelScarano, Sanxiyn, Sapoguapo, Schandi, Sdorrance, Securiger,<br />

ShaunMacPherson, Shepard, Shermanmonroe, Shinkolobwe, Sibersandi, Sina2, Smalljim, Soumyasch, Sstair, StWeasel, SteinbDJ, StephenReed, Stevertigo, Stoni, Stw, TNLNYC, Tezza2k1, The<br />

Anome, TikaKino, Tomlzz1, Toussaint, Triadic2000, Trixter, Turnstep, Ultimatewisdom, Universimmedia, Uriyan, Venullian, Vsddkjn, Wavelength, Wesleyneo, Wiki alf, WojPob, Xezbeth,<br />

Yaron K., Yitzhak, 217 anonymous edits<br />

Resources of a Resource Source: http://en.wikipedia.org/w/index.php?oldid=252504394 Contributors: GregorB, Jjordanpedia, NawlinWiki, Pearle, Robocoder, 8 anonymous edits<br />

Reverse Ajax Source: http://en.wikipedia.org/w/index.php?oldid=354518378 Contributors: Agentscott00, Anaraug, Brest, CarlManaster, CometGuru, Damiens.rf, Fadookie, FatalError,<br />

Furrykef, Gregdan, In side the pc, Inquisitus, Jacobolus, Jwoodger, Kalan, Kdknigga, MrOllie, MuffledThud, Pohta ce-am pohtit, Psilya, Sleepyhead81, Sprocketonline, Stefan Hintz, Ødipus sic,<br />

52 anonymous edits<br />

Root element Source: http://en.wikipedia.org/w/index.php?oldid=292478129 Contributors: Ferkelparade, Malcolma, Mike the k, Nigelj, Pegship, RJFJR, Rich Farmbrough, Robertvan1,<br />

Sardine, 6 anonymous edits<br />

Schematron Source: http://en.wikipedia.org/w/index.php?oldid=345263915 Contributors: Aqueenan, Bunnyhop11, Canadabear, Chsimps, Dmccreary, Dreftymac, Ghettoblaster, HoodedMan,<br />

Hymek, JukoFF, Kbdank71, Korval, Modify, Nickcarr, Pnkrockr, Rjwilmsi, Samdutton, Securiger, Wellithy, Žiedas, 23 anonymous edits<br />

Simple Outline <strong>XML</strong> Source: http://en.wikipedia.org/w/index.php?oldid=245061146 Contributors: CDV, Dreftymac, KennethJ, Krusch, Nfwu, Qu3a, Stevage, Tadman, Verdatum, 4<br />

anonymous edits<br />

Simple <strong>XML</strong> Source: http://en.wikipedia.org/w/index.php?oldid=290618000 Contributors: Codebytez, Danlev, Melab-1, Sydius, 8 anonymous edits<br />

Streaming <strong>XML</strong> Source: http://en.wikipedia.org/w/index.php?oldid=313593995 Contributors: Clq, Deathy, Egpetersen, Fikus, Filmackay, Maustrauser, Neustradamus, Patdreams<br />

Styled Layer Descriptor Source: http://en.wikipedia.org/w/index.php?oldid=345632921 Contributors: Beautyod, Ebyabe, Firsfron, Lars Washington, Lordsatri, Mabdul, Oskosk, SEWilco,<br />

SheldonYoung, Vitomeuli, 4 anonymous edits<br />

Topic (<strong>XML</strong>) Source: http://en.wikipedia.org/w/index.php?oldid=305325594 Contributors: Barticus88, Blathnaid, Clayoquot, Eleusis, Fool, Hbent, Lheuer, Pearle, Quaque, Treborbassett, Walk<br />

Up Trees, 8 anonymous edits<br />

Unique Particle Attribution Source: http://en.wikipedia.org/w/index.php?oldid=272335554 Contributors: Bunnyhop11, Frandsen, Politepunk, Rich Farmbrough, 2 anonymous edits<br />

VTD-<strong>XML</strong> Source: http://en.wikipedia.org/w/index.php?oldid=356464294 Contributors: AnmaFinotera, Beefyt, CamTarn, CambridgeBayWeather, EurekaLott, FayssalF, Greatestrowerever,<br />

Hervegirod, Hut 8.5, Jacosi, Jzhang2007, Katieh5584, LilHelpa, Paul8046, Pegship, Raise exception, Rjwilmsi, Rookkey, Switchercat, Toohool, Torc2, UncleDouggie, םודנר, 187 anonymous<br />

edits



X-expression Source: http://en.wikipedia.org/w/index.php?oldid=272451914 Contributors: Dragentsheets, Greenrd, JLaTondre, 1 anonymous edits<br />

XBRLS Source: http://en.wikipedia.org/w/index.php?oldid=338054643 Contributors: Blowdart, CharlesHoffman, Glennfcowan, Lancet75, Niente21, Pohta ce-am pohtit, 3 anonymous edits<br />

Xdos Source: http://en.wikipedia.org/w/index.php?oldid=352934824 Contributors: Dawynn, FreeKresge, Malcolma, Pearle, Salad Days, Smthng2sav, Tinucherian, 3 anonymous edits<br />

XDR Schema Source: http://en.wikipedia.org/w/index.php?oldid=335801797 Contributors: Aaaidan, Abelson, Greenrd, Jonnie d smith, Sergey.Radkevich, 2 anonymous edits<br />

XEE (Starlight) Source: http://en.wikipedia.org/w/index.php?oldid=245059907 Contributors: Elblanco, Malcolma, Ratarsed, 1 anonymous edits<br />

XEP Source: http://en.wikipedia.org/w/index.php?oldid=359239404 Contributors: Msulyaev, Odo1982, Toddst1, Zundark<br />

<strong>XML</strong> Source: http://en.wikipedia.org/w/index.php?oldid=367508169 Contributors: .:Ajvol:., 207.172.11.xxx, 213.253.39.xxx, 24ten, AHMartin, AThing, Aadaam, Actam, AdamCarden, Adeio,<br />

Ahabr, Ahkond, Ahoerstemeier, Aitias, Ajcumming, Aklauss, Aksi great, Alan Liefting, Alansohn, Alexbrn, AlistairMcMillan, Allkeyword, Amire80, AndersFeder, Andrisi, Angeltoribio, Ani td,<br />

Ankitasdeveloper, Anna Lincoln, Anon lynx, AnonMoos, Anti stupidity, Anu-43, Aomarks, Asqueella, Asteiner, Asymmetric, Atanveer9, AzaToth, B4hand, Barek, Barticus88, Bdesham,<br />

Beetstra, Belamp, Bernd in Japan, BertSen, Bevo, Bhadani, Biezl, BigFatBuddha, Bissinger, Bje2089, Blinklmc, Bluemoose, BlurTento, Bobdc, Bobianite, Boehm, Bonbayel, Bonethugnd, Booles,<br />

BorgQueen, Borgdylan, Boseko, BrianCully, Brick Thrower, Brighterorange, Brion VIBBER, Bryan Derksen, Brz7, Bunnyhop11, Burschik, Businessman332211, Bvajet,<br />

C.M.Sperberg-McQueen, CLD, CambridgeBayWeather, Cameltrader, Can't sleep, clown will eat me, CanadianLinuxUser, Caomhin, CapitalSasha, Carewolf, CarlHewitt, Cbdorsett, Cels2, Centrx,<br />

Charivari, Chininazu12, ChongDae, Chowbok, Chris 73, Chris Roy, Chrislk02, Chrisnewell, ChristopheS, Chzz, Cipherynx, Clayoquot, ClementSeveillac, CoSort2007, Coconut99 99, Cody5,<br />

Colonies Chris, Comesuntbob, Contraverse, Conversion script, CptAnonymous, Crosstowns, Cspan64, Cybercobra, D6, DKEdwards, Da monster under your bed, Dan100, DanConnolly, Daniel<br />

Olsen, Daniel.Cardenas, DanielVonEhren, DarkFalls, Darkfred, David spector, Davis685, Dcattell, Dcoetzee, DeadEyeArrow, Delcnsltmd, Deodar, Derek Ross, Derekread, Dicklyon, Dickpenn,<br />

DigitalEnthusiast, Dingbats, Dino72, Dkrms, Dlohcierekim, Dlrohrer2003, Dolcecars, DominiqueHazaelMassieux, Donmay12, DopefishJustin, DoriSmith, DougBarry, Dpattison2007, Dpbsmith,<br />

Dpm64, Dr Headgear, Dreftymac, Dthvt, Dullhunk, Dwheeler, Ebruchez, Edcolins, Edward Z. Yang, Efcavanaugh, Egandrews, Egil, Eisnel, ElBenevolente, Elharo, Ellmist, Elwikipedista,<br />

EngineerScotty, Eranb, Ericjs, Erik Zachte, Erikdw, Eritain, Etu, Evaluist, Ewsers, Fang.zheng, Fantasticfears, FatalError, Feline Hymnic, Ferdinand Pienaar, Figure, Fleminra, FloatingMind,<br />

Fnielsen, Folajimi, Fragglet, Fran Rogers, Francl, Frap, Freyr, Frisket, Fsolda, Furrykef, Fvw, GTBacchus, Gaius Cornelius, Gc9580, Gdrori, Geniac, Gennaro Prota, GentlemanGhost,<br />

GeoffPurchase, Ghettoblaster, Giftlite, Gjlubbertsen, Gjs238, Glass of water, Glenn, Gogo Dodo, Golwengaud, GrEp, GraemeL, Graham, Greg Murray, Ground Zero, Grumpycraig, Gudeldar,<br />

Haakon, Hairy Dude, Hannes Hirzel, Harold f, Hashar, Hervegirod, Hicketyhicketyhack, Highwayman65251, Hirzel, Hogman500, Hu12, Hurricane111, Hypertrek, Hyuri, IMSoP, Ian Moody,<br />

IanBurrell, Iftikhar88hussaini, Ijmorlan, IlanaDavidi, Imars, Imjustmatthew, Int21h, Intgr, Iridescent, Isilanes, Itai, J.delanoy, JForget, JKing, JLaTondre, JPalonus, JRocketeer, Jackacon,<br />

Jacobko, Jacobolus, JakobVoss, JamesBrownJr, Jao, Jargon64, Jauerback, JavaWoman, Jaxad0127, Jaxsam1, Jay, Jeenuv, Jeff G., Jeff3000, Jehzlau, Jerazol, Jesin,<br />

Jhannah, Jibjibjib, Jilplo Haggins, Jimthing, Jmlipton, Joachim Wuttke, Joanjoc, John Vandenberg, JohnSmith777, JohnWhitlock, Johnmarkh, Johnwcowan, Joku, Jonabbey, Jonkerz,<br />

Jonnyamazing, Jor, Jpbowen, Jshadias, Jzhang2007, Kai.Klesatschke, Kaldosh, Kamalakannanprogrammer, Kanags, Kapoing, Karderio, Karl Dickman, Katalaveno, Kbrose, Kc2idf,<br />

Keithgabryelski, Kenmccallum, Kensall, Kevinconroy, Kgaughan, Kha0sK1d, KickAssClown, Kl4m, Klaws, Koavf, Korval, Krauss, Kubigula, Kx1186, LDiracDelta, Lambiam, Larala,<br />

Lazynitwit, Lianmei, Liao, Lifefeed, Liftarn, Ligulem, Ling.Nut, LittleDan, Loveenatayal, Lumi71, Lycurgus, M.franceschet, M4gnum0n, MER-C, MK8, MaBoehm, Madir, Mah159, Mak<br />

Thorpe, Manishtomar, Maoj-wsu-sp, Mark Renier, MarkSweep, Martijn faassen, Martin451, Martinp23, MartynDavies, Mathmo, Matthäus Wander, MaxEnt, Maximaximax, Maximus06,<br />

Mayfare, Mbbradford, Mbell, Mcintyem, Mcorazao, Melab-1, Melon039, Meszigues, Mhkay, Michael Hardy, MichaelJanich, Miguelfms, Minghong, Mion, Miss Dark, Mjb, Mjpieters,<br />

Mola8sses, Montgomery '39, Mp, Mr. Shoeless, Mr.Z-man, MrJones, MrOllie, Mrjmcneil, Ms2ger, Mthibault, Mvulpe, Mwtoews, Mww113, Mxn, NO ACMLM,AND XKEPPER SUCK !,<br />

Nannus, Nanshu, Natasha2006, NawlinWiki, Neckro, Nemo bis, Netsnipe, Nicmila, Nigelj, Nikkimaria, Nile, Ninly, Niteowlneils, Nivaca, Nixeagle, Noldoaran, Nomediga, Norm mit, Nowa,<br />

Nsh, Nwbeeson, Octane, Ogmios, Ohnoitsjamie, Okyea, OliD, OsamaK, Oscar-ja, Osquar F, OverlordQ, Oxblood, P3x984, PTSE, Patrick, Paul Foxworthy, PaulXemdli, Pavel Vozenilek,<br />

Paxsimius, Peashy, Pelle, Pengo, PeteVerdon, Peterl, Pgk, Philip Trueman, Phluid61, Phoenix-forgotten, Phyzome, Pianohacker, Pikiwyn, Pmberry, Poccil, Porges, Pozcircuitboy, Prakash<br />

Nadkarni, Prodoc, Quarl, Quasipalm, Quiddity, Quilokos, Ramesses the Great, Rbonvall, Rbstimers, Rdmsoft, Red660, RedWolf, Redherring, Reinthal, Remy B, RenniePet, Rich Farmbrough,<br />

RichMorin, Richalex2010, Rick Block, Rick Jelliffe, RickBeton, Risi, Ritvikbhatnagar1, Rivecoder, Rje, Rjstott, Rjwilmsi, Rklawton, Robert K S, Robert Merkel, Robinjwest, Robomaeyhem,<br />

Rodney Boyd, Roger costello, Rory096, RoseParks, Rr2bwreain, Rror, Rvmolen, Ryanrs, Sam Hocevar, SamHathaway, SandiCastle, Sandius, Saqib, Saucepan, Sbvb, Schnolle, Scjessey, Scott<br />

MacLean, Scottielad, Sderose, Seanhan, Seidenstud, Semper discens, Sen Mon, ShaneCavanaugh, Shanes, Shibboleth, Shii, Shinkolobwe, Shizhao, Shlomital, SickTwist, Signsofstatic, Simetrical,<br />

SivaKumar, Sj, Sjc, Sleepyhead81, Smyth, Sosinfo, Sound effx, Spankman, Spe88, Spudstud, SqueakBox, Stefan.ciobaca, Stephen Gilbert, Steve R Barnes, SteveRwanda, Stevy76, StewartMH,<br />

Stf, Stijn Vermeeren, Stupiddestyredgasd, Stwalkerster, Superm401, Suruena, Suwayya, Svetovid, Syangtar, Sydius, TPK, Tagith, Taknik, Talktovalentine, TastyPoutine, Technopilgrim, Teddyb,<br />

Terjen, Terrifictriffid, Terrycojones, Thadius856, The Thing That Should Not Be, TheMightyOrb, Thierryc, Think777, Thumperward, Thunderhead, TimBray, TimR, Timc, Timur.shemsedinov,<br />

Tobias Bergemann, Todd Vierling, Tony1, ToonArmy, Topbanana, Toussaint, Trade2tradewell, Trankin, Traroth, Treekids, Trovatore, Trscavo, Tsunaminoai, Turnstep, TwoOneTwo, Twocs,<br />

Typhoonhurricane, Typochimp, UkPaolo, Unforgettableid, Unixxx, Unknown W. Brackets, Vaganyik, Varlaam, Versageek, Vespristiano, Vigilius, Violetriga, Vladkornea, Vojta, Volphy,<br />

WSU-AW-AK, Waskage, Wavelength, Wellithy, Wereon, Whale plane, Whkoh, Wickorama, Wiki alf, Wiki0709, Wikilibrarian, Wmahan, WojPob, Woohookitty, Wrs1864, Wulfila, Ww,<br />

XJamRastafire, Xompanthy, Xpclient, Yaronf, Ygramul, Yonkie, Zhaolei, Zoeb, Zootm, Олександр Кравчук, 1175 anonymous edits<br />

<strong>XML</strong> and MIME Source: http://en.wikipedia.org/w/index.php?oldid=359215170 Contributors: Crowne, Ellymelly, Hawky, John Vandenberg, Mgungora, O keyes, Roger costello,<br />

ShakespeareFan00, SpK, Typhoonhurricane, Wdflake, Wrs1864, 8 anonymous edits<br />

<strong>XML</strong> appliance Source: http://en.wikipedia.org/w/index.php?oldid=358713658 Contributors: AJR, Abesford, Alfe, Biot, Bunnly, Bunnyhop11, Comindico, CommonsDelinker, Darraghs,<br />

Dmccreary, Glace, Haakon, Hoagtim, Hughser, Iamrohit, Irishguy, Isotope23, Jbromhead, JonHarder, Jpbowen, Julesd, Kakarrott64, Kmorozov, L200817s, Layer7, Layer7tech, Lisfire, Lsonne,<br />

Martpol, MinorContributor, Ohthelameness, Reedy, Sherool, Sreekesh, Staffwaterboy, Stephen Compall, Tcramer1234, Vikingforties, 30 anonymous edits<br />

<strong>XML</strong> Base Source: http://en.wikipedia.org/w/index.php?oldid=333780510 Contributors: Anrie Nord, Fullstop, Furrykef, Pegship, Suruena, TimBray, Toussaint, Utcursch, 2 anonymous edits<br />

<strong>XML</strong> Catalog Source: http://en.wikipedia.org/w/index.php?oldid=350443768 Contributors: Abcoates, Alex.g, Markup854, Nate1481, RickBeton, TubularWorld, 4 anonymous edits<br />

<strong>XML</strong> Certification Program Source: http://en.wikipedia.org/w/index.php?oldid=365135620 Contributors: Melon039, Michel7789, Sykamoore, WestCity, 26 anonymous edits<br />

<strong>XML</strong> Configuration Access Protocol Source: http://en.wikipedia.org/w/index.php?oldid=367225868 Contributors: Calment, Kbrose, Mondoblu, R'n'B, 9 anonymous edits<br />

<strong>XML</strong> Control Protocol Source: http://en.wikipedia.org/w/index.php?oldid=294538733 Contributors: Asbjornu, Malcolma, Melab-1, Mild Bill Hiccup, Salmar<br />

<strong>XML</strong> data binding Source: http://en.wikipedia.org/w/index.php?oldid=362549516 Contributors: Beetstra, Biehl, Boseko, Cander0000, Coconut99 99, DSosnoski, Doug Bell, Drrngrvy,<br />

Dsevilla, Emerks, Eshear, Jnutting512, Khookguy, Liempt, Miami33139, MrOllie, Mrflip, Nskhan84, Objsys, Payxystaxna, Poccil, Precious Roy, RedWolf, Redvers, Robert van Engelen,<br />

Sebastian.Dietrich, Simon sprott, SprottS, Squash, Stephen B Streater, Teeks99, Tirkfl, Trident job, Venango, Virgiltrasca, Wavelength, Yourfired101, 86 anonymous edits<br />

<strong>XML</strong> database Source: http://en.wikipedia.org/w/index.php?oldid=366561321 Contributors: 16x9, AJackl, Abukaspar, Adrianwn, Amirfr, Andionita, Arnabdotorg, Barefootliam, Belovedfreak,<br />

Bernd vdB old, Bohumir Zamecnik, Bradjamesbrown, Brick Thrower, Bunnyhop11, Ccouvrette, ChristianGruen, Colonies Chris, CorcaighAbu, DickieRose, Dilane, Dizzzz, Dmccreary,<br />

Doclabyrinth, DoriSmith, Edward C. Zimmermann, Eedeebee, Enric Naval, Epbr123, EricBloch, GVogeler, Glen Pepicelli, Gpallis, Gregburd, Happygiraffe, Hgkamath, Hobartimus, Joerg84,<br />

John Vandenberg, Johndbritton, Juansempere, Jzhang2007, Klingon, Kmorozov, Kokotero, Lamdk, Libcub, Mdd, Metaperl, Michael Slone, MiddleEarth, Nichtich, Nikkimaria, OlliX, Pearle,<br />

Pedant17, Philip Trueman, Playmobilonhishorse, Radim Baca, Rastgoo, Rayngwf, Rjwilmsi, Rtweed1955, Signalhead, Slakr, Snodnipper, Stevertigo, Sykamoore, TRosenbaum, Tbradford,<br />

Terrifictriffid, Thumperward, Tide rolls, Touko vk, Xmlchamp, Xpriori, Xshezang, Xxanthippe, 216 anonymous edits<br />

<strong>XML</strong> editor Source: http://en.wikipedia.org/w/index.php?oldid=358875951 Contributors: Alcalazar, Asqueella, Booles, Cedric dlb, Cinnamon42, Clayoquot, Damien1, DirkvdM, Dulciana,<br />

Efcavanaugh, Egandrews, Furrykef, GeoffPurchase, Geralds, Icairns, Julesd, Korval, LeeHunter, Mark Richards, Mjb, Mzajac, Nabeth, Owens1, Ownlyanangel, Quasipalm, RedWolf, Remuel,<br />

Richardmtl, Saqib, Sernauser, SimonP, Sjoerd visscher, Skreyola, Spankman, Srbauer, Swaq, Thv, Tobias Bergemann, Wrs1864, 72 anonymous edits<br />

<strong>XML</strong> Enabled Directory Source: http://en.wikipedia.org/w/index.php?oldid=291296534 Contributors: Chowbok, EagleOne, Kdz, Melab-1, MerryMorris, 3 anonymous edits<br />

<strong>XML</strong> Encryption Source: http://en.wikipedia.org/w/index.php?oldid=354384058 Contributors: Alekseysanin, ArnoldReinhold, AutumnSnow, Cuonghuyto, Gudeldar, Jc3s5h, Mabdul, Ntsimp,<br />

Pmerson, Samsara, Sverdrup, Westenra, Wrs1864, 15 anonymous edits<br />

<strong>XML</strong> Events Source: http://en.wikipedia.org/w/index.php?oldid=328519050 Contributors: Ahoerstemeier, Dmccreary, Dmyersturnbull, Dvunkannon, Ghettoblaster, Groupsixty, Hawky, I<br />

already forgot, Lev Matematik, Mathiastck, Pemboid, Reinthal, Risi, Rjwilmsi, Toussaint, Xaje, Zundark, 10 anonymous edits<br />

<strong>XML</strong> framework Source: http://en.wikipedia.org/w/index.php?oldid=322139018 Contributors: Bunnyhop11, Byjg, Kateshortforbob, Libcub, 2 anonymous edits<br />

<strong>XML</strong> Literals Source: http://en.wikipedia.org/w/index.php?oldid=297983431 Contributors: Biscuittin, Drilnoth, Highpitch, Maniamin, 1 anonymous edits



<strong>XML</strong> namespace Source: http://en.wikipedia.org/w/index.php?oldid=347806300 Contributors: Anthony Appleyard, Anwar saadat, AutumnSnow, CardinalDan, Detroit, Dpm64, Dreftymac,<br />

Ear1grey, Eh kia, Ehn, Franl, Gagsie, Hairy Dude, I am neuron, Ilyanep, ImperfectlyInformed, Juanpablosoto, Korval, Mabdul, Mhkay, Nigelj, Pitoutom, Reinthal, Robina Fox, Sciurinæ,<br />

Sourcejedi, SuperHamster, The.Modificator, TimBray, TubularWorld, 24 anonymous edits<br />

<strong>XML</strong> Pretty Printer Source: http://en.wikipedia.org/w/index.php?oldid=349425356 Contributors: Ashburnite, BackToThePast, KeithTyler, Malcolma, Oneiros, Tiberiusgrant, 4 anonymous<br />

edits<br />

<strong>XML</strong> Protocol Source: http://en.wikipedia.org/w/index.php?oldid=272452094 Contributors: ClementSeveillac, Imjustmatthew, Longhair, Pegship<br />

<strong>XML</strong> schema Source: http://en.wikipedia.org/w/index.php?oldid=340715184 Contributors: ABCD, Acdx, Ahoerstemeier, Alik Kirillovich, AutumnSnow, Beetstra, Bunnyhop11, Cbdorsett,<br />

Choster, Crystallina, Derekread, Dongwon, Doug Bell, Dreftymac, Ehn, Fryed-peach, Gardenstew, Hervegirod, Hymek, Jamelan, Jaxsam1, Korval, Krauss, Kucing, Mamling, MariahX, Mark<br />

Renier, MarkSweep, Mhkay, Minghong, Mjb, Ninly, Pi8ch, Pmerson, Poccil, Pxma, Rich Farmbrough, Runnerupnj, SheepNotGoats, Smyth, Stevage, SteveLoughran, Tobias Bergemann,<br />

Vernanimalcula, Vishrave, Wellithy, Xan 213, Þjóðólfr, 51 anonymous edits<br />

<strong>XML</strong> Schema Editor Source: http://en.wikipedia.org/w/index.php?oldid=364480930 Contributors: Bunnyhop11, Ched Davis, Egandrews, Fabrictramp, Gsgsgsgs, Kostmo, Pjcwikip, Rhubbarb,<br />

Rklear, Simon sprott, 12 anonymous edits<br />

<strong>XML</strong> Schema <strong>Language</strong> Comparison Source: http://en.wikipedia.org/w/index.php?oldid=349051677 Contributors: Ahoerstemeier, Bunnyhop11, Cfeet77, Crystallina, Decrease789, Dongwon,<br />

Dreftymac, Ghettoblaster, Giraffedata, Grumpycraig, Hsivonen, Jlowery, Korval, Penter ghost, Q Chris, Sloop Jon, Sześćsetsześćdziesiątsześć, Tuntable, 31 anonymous edits<br />

<strong>XML</strong> Studio Source: http://en.wikipedia.org/w/index.php?oldid=345963200 Contributors: Beetstra, Fabrictramp, Simon sprott, 7 anonymous edits<br />

<strong>XML</strong> Telemetric and Command Exchange Source: http://en.wikipedia.org/w/index.php?oldid=368146406 Contributors: Briangregory2000, BuffaloChip97, Eyreland, GerryInColorado,<br />

Iridescent, Jsafranek, Minizinim, Nasa-verve, O keyes, Pan Dan, Rich Farmbrough, SamFCooper, Timmerlj, 4 anonymous edits<br />

<strong>XML</strong> template engine Source: http://en.wikipedia.org/w/index.php?oldid=362160823 Contributors: Akmg, Crystallina, FatalError, Ishnigarrab, JacekA, Krauss, Markjoseph sc, Mhkay,<br />

MichaK, RHaworth, Radiant!, Rjwilmsi, Sanxiyn, Stevage, Stf, Tokek, 13 anonymous edits<br />

<strong>XML</strong> tree Source: http://en.wikipedia.org/w/index.php?oldid=352933815 Contributors: Booyabazooka, Dawynn, Malcolma, Nagle, Tinucherian, Velle, WereSpielChequers<br />

<strong>XML</strong> validation Source: http://en.wikipedia.org/w/index.php?oldid=361641348 Contributors: 3nx, Andy Dingley, David Haslam, Dawynn, Dreftymac, Drrwebber, EdJogg, Fnielsen, Hmains,<br />

Hymek, Jaxsam1, Korval, Pmerson, Rich Farmbrough, Waacstats, 10 anonymous edits<br />

<strong>XML</strong>-Enabled Networking Source: http://en.wikipedia.org/w/index.php?oldid=352338066 Contributors: Asparagus, Hybernator, Kakarrott64, Krbabu, Lsonne, MaxDel, Mbenna, Melab-1,<br />

MinorContributor, 7 anonymous edits<br />

<strong>XML</strong>-Retrieval Source: http://en.wikipedia.org/w/index.php?oldid=342675426 Contributors: DoriSmith, JudithWinter, Magioladitis, Nikkimaria<br />

<strong>XML</strong>HttpRequest Source: http://en.wikipedia.org/w/index.php?oldid=366820907 Contributors: .:Ajvol:., A3r0, Aditsu, Ahoerstemeier, Alaa.moustafa, Alansohn, Alcalazar, Alex Smotrov,<br />

Alexandre Martins, Algae, Alphachimp, Anirvan, Apv, Arjun G. Menon, Artw, Bezenek, Blackdenimgumby, BobBagwill, Bobo192, Bovineone, CDV, Caged.danimal, CambridgeBayWeather,<br />

CanisRufus, CapitalR, Catamorphism, Chealer, Christopherlin, Cic, Coffeeflower, DJ Rubbie, Damicatz, Dantman, Darklama, Delfuego, Digita, Dionyziz, Dirus, Discospinster, Djkenzie,<br />

Downfromzero, Drano, Dsnell923, EatMyShortz, Ej0c, Eloi.sanmartin, Enyo, Eric B. and Rakim, Eve Teschlemacher, Fabiob, FatalError, Filipvr, Fred Bradstadt, Fromz, Furrykef, Gabrielsroka,<br />

Gerbrant, Gilgamesh, Gilliam, Gimboid13, GraemeL, GregorB, Haza-w, Hondavice, Ignacio Javier Igjav, Isnow, J.delanoy, Jaray, Javalenok, Javawizard, Jaw959, Jdowland, Jeroldan, Jmabel,<br />

John Vandenberg, Jriffel, Keelypavan, Khalid hassani, Kozuch, Krellis, Kugland, Lee J Haywood, LemonairePaides, Liberatus, Lindsay-mclennan, Locos epraix, Lupin, Macaldo, Maian,<br />

Mamund, Manop, Marktmilligan, Marskind, Martin Hampl, Martnym, Masonbarge, Meand, Merc64, Metaeducation, Mindmatrix, Minghong, Mnot, Molily, Mrcs, Nickshanks, Nigelj,<br />

Nightstallion, Niven, Nkour, Norm mit, Oeln, Ohgyun Ahn, Pcj, Pctopp, Ph0t0phobic, Phloopy, Pjakubo86, Pjdonnelly, Proton.mule, Quilokos, Ramu50, RedWolf, Reisio, Remember the dot,<br />

Renku, RidinHood25, Ringbang, Rjwilmsi, Robert p levy, Rohan Jayasekera, Rufous, SalM, Sega381, Shamesspwns, Simon Lieschke, SineSwiper, Skeejay, Slant, Sleepyhead81, Spankman,<br />

Speight, Stephen Morley, Suruena, SvartMan, Taka, TakuyaMurata, Tamlyn, Teiladnam, The Anome, TheJosh, Thedangerouskitchen, Thumperward, Timc, Timeroot, Timwi, Tolmaion, Twsx,<br />

Urkle0, Vberger, VictorAnyakin, Vladogr, Wengier, White 720, WhiteHatLurker, Widgetguy, WikHead, Zippedmartin, Zoef1234, Zvn, Zzuuzz, ~K, 380 anonymous edits<br />

<strong>XML</strong>Socket Source: http://en.wikipedia.org/w/index.php?oldid=313909088 Contributors: Icktoofay, O keyes, Tomjenkins52, 1 anonymous edits<br />

XPath Source: http://en.wikipedia.org/w/index.php?oldid=355507321 Contributors: Bitbit, Bunnyhop11, D.c.camero, Girlo2111, Gondooley, JLaTondre, Jasondburkert, Jeffz1, Mabdul,<br />

Mathiastck, Mhkay, Ninly, Norro, Pgfearo, RSStockdale, Ringbang, Tibti, Walk Up Trees, 15 anonymous edits<br />

XPath 2.0 Source: http://en.wikipedia.org/w/index.php?oldid=344816885 Contributors: Bunnyhop11, D.c.camero, Fredrik, Girlo2111, Gudeldar, Int19h, Jan.Sievers, K1Bond007, Lar, Mabdul,<br />

Mathiastck, Mhkay, Roland Beker, Stevage, TheParanoidOne, Typhoonhurricane, Xiroth, 7 anonymous edits<br />

Xs3p Source: http://en.wikipedia.org/w/index.php?oldid=352203304 Contributors: AriManninen, Ashburnite, Databases, Dawynn, Hysteria18, 4 anonymous edits<br />

XSQL Source: http://en.wikipedia.org/w/index.php?oldid=362589846 Contributors: Bunnyhop11, Cander0000, Fatal!ty, HJWeng, Intgr, Legoktm, Levin, Melab-1, Tabletop, Xezbeth, 4<br />

anonymous edits



Image Sources, Licenses and Contributors<br />

Image:Klip-logo1.png Source: http://en.wikipedia.org/w/index.php?title=File:Klip-logo1.png License: unknown Contributors: User:Awille, User:Cydebot, User:Diveloop<br />

Image:Log4js.png Source: http://en.wikipedia.org/w/index.php?title=File:Log4js.png License: GNU Free Documentation License Contributors: Stritti<br />

Image:Log4JS-UML.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Log4JS-UML.jpg License: GNU Free Documentation License Contributors: Stritti<br />

Image:PARTSangles.jpg Source: http://en.wikipedia.org/w/index.php?title=File:PARTSangles.jpg License: Public Domain Contributors: Buiras<br />

Image:METSdocument.jpg Source: http://en.wikipedia.org/w/index.php?title=File:METSdocument.jpg License: unknown Contributors: Buiras<br />

Image:X-office-document.svg Source: http://en.wikipedia.org/w/index.php?title=File:X-office-document.svg License: unknown Contributors: Bdesham, Rocket000, Sasa Stefanovic<br />

Image:X-office-presentation.svg Source: http://en.wikipedia.org/w/index.php?title=File:X-office-presentation.svg License: unknown Contributors: Linuxerist, Rocket000, Túrelio, 1<br />

anonymous edits<br />

Image:X-office-spreadsheet.svg Source: http://en.wikipedia.org/w/index.php?title=File:X-office-spreadsheet.svg License: unknown Contributors: Bdesham, Rocket000, Sasa Stefanovic<br />

Image:Open Packaging Convention.png Source: http://en.wikipedia.org/w/index.php?title=File:Open_Packaging_Convention.png License: GNU General Public License Contributors:<br />

various<br />

Image:DrawingML example.png Source: http://en.wikipedia.org/w/index.php?title=File:DrawingML_example.png License: Public Domain Contributors: Original uploader was Tuanese at<br />

en.wikipedia<br />

Image:XPSIcon.png Source: http://en.wikipedia.org/w/index.php?title=File:XPSIcon.png License: unknown Contributors: Athaenara, Cristan, Joelholdsworth, Salavat, Sfan00 IMG, 2<br />

anonymous edits<br />

Image:Rdf graph for Eric Miller.png Source: http://en.wikipedia.org/w/index.php?title=File:Rdf_graph_for_Eric_Miller.png License: Attribution Contributors: W3C<br />

Image:XML.svg Source: http://en.wikipedia.org/w/index.php?title=File:XML.svg License: Creative Commons Attribution-Sharealike 2.5 Contributors: AutumnSnow, Fryed-peach, JeffyP,<br />

Jusjih, Karl Dickman, Latics, Platonides, SKvalen, Soeb, Verdy p, 3 anonymous edits<br />

Image:Xml_text_editor.png Source: http://en.wikipedia.org/w/index.php?title=File:Xml_text_editor.png License: Public Domain Contributors: Damien1, 1 anonymous edits<br />

Image:xml_graphical_editor.png Source: http://en.wikipedia.org/w/index.php?title=File:Xml_graphical_editor.png License: Public Domain Contributors: Damien1<br />

Image:xml_wysiwyg_editor.png Source: http://en.wikipedia.org/w/index.php?title=File:Xml_wysiwyg_editor.png License: Public Domain Contributors: Damien1, 1 anonymous edits<br />

Image:SimpleXsd Physical.png Source: http://en.wikipedia.org/w/index.php?title=File:SimpleXsd_Physical.png License: Creative Commons Attribution 3.0 Contributors: User:Simon sprott<br />

Image:SimpleXsd Logical.png Source: http://en.wikipedia.org/w/index.php?title=File:SimpleXsd_Logical.png License: Creative Commons Attribution 3.0 Contributors: User:Simon sprott<br />

Image:Tick-green.png Source: http://en.wikipedia.org/w/index.php?title=File:Tick-green.png License: Public Domain Contributors: Wesley Warren<br />

Image:ScreenShot XsdEditor.png Source: http://en.wikipedia.org/w/index.php?title=File:ScreenShot_XsdEditor.png License: Creative Commons Attribution 3.0 Contributors: Simon sprott<br />

(talk). Original uploader was Simon sprott at en.wikipedia<br />

Image:XTCE exchange.gif Source: http://en.wikipedia.org/w/index.php?title=File:XTCE_exchange.gif License: Public Domain Contributors: GerryInColorado



License<br />

Creative Commons Attribution-Share Alike 3.0 Unported<br />

http://creativecommons.org/licenses/by-sa/3.0/
