EXtensible Markup Language (XML) - Cultural View


From culturalview.com

14.07.2013


EXtensible Markup Language (XML)

Visit the Cultural View of Technology XML Tutorial page for videos and exercises

PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information.

PDF generated at: Thu, 17 Jun 2010 01:47:38 UTC

Contents

Articles

Binary XML
Business Process Definition Metamodel
CDATA
CDuce
Character entity reference
CodeSynthesis XSD
D3L
Darwin Information Typing Architecture
DITA Open Toolkit
Document Structure Description
Document-Centric
Document-centric XML processing
Dynamic XML
ECMAScript for XML
Efficient XML Interchange
Embedded RDF
EpiDoc
eXtensible Server Pages
Fast Infoset
Global listings format
GMX
GMX-V
Head-Body Pattern
HyTime
Internationalization Tag Set
Klip
List of XML and HTML character entity references
Log4js
MAREC
Media Object Server
METS
Numeric character reference
Office Open XML
Office Open XML file formats
OIOXML
Open XML Paper Specification
PCDATA
Plain Old XML
Portable Application Description
Publishing Requirements for Industry Standard Metadata
QName
QTI
Resource Description Framework
Resources of a Resource
Reverse Ajax
Root element
Schematron
Simple Outline XML
Simple XML
Streaming XML
Styled Layer Descriptor
Topic (XML)
Unique Particle Attribution
VTD-XML
X-expression
XBRLS
Xdos
XDR Schema
XEE (Starlight)
XEP
XML
XML and MIME
XML appliance
XML Base
XML Catalog
XML Certification Program
XML Configuration Access Protocol
XML Control Protocol
XML data binding
XML database
XML editor
XML Enabled Directory
XML Encryption
XML Events
XML framework
XML Literals
XML namespace
XML Pretty Printer
XML Protocol
XML schema
XML Schema Editor
XML Schema Language Comparison
XML Studio
XML Telemetric and Command Exchange
XML template engine
XML tree
XML validation
XML-Enabled Networking
XML-Retrieval
XMLHttpRequest
XMLSocket
XPath
XPath 2.0
Xs3p
XSQL

References

Article Sources and Contributors
Image Sources, Licenses and Contributors

Article Licenses

License


Binary XML

Binary XML refers to any specification which defines the compact representation of XML (Extensible Markup

Language) in a binary format. While there are several competing formats, none has been widely adopted by a

standards organization or accepted as a de facto standard. Using a binary XML format generally reduces the

verbosity of XML documents and the cost of parsing [1], but hinders the use of ordinary text editors and third-party tools

to view and edit the document. Binary XML is typically used in applications where standard XML is not an option

due to performance limitations, but the ability to convert the document to and from a form which is easily viewed

and edited is valued. Other advantages may include enabling random access and indexing of XML documents.

The major challenge for binary XML is to create a single, widely adopted standard. The International Organization

for Standardization (ISO) and the International Telecommunication Union (ITU) published the Fast Infoset standard

in 2007 and 2005, respectively. The World Wide Web Consortium (W3C) has produced the first draft of the EXI

format specification. Another standard (ISO/IEC 23001-1), known as Binary MPEG format for XML (BiM), was

standardized by the ISO in 2001. BiM is used by many ETSI standards for Digital TV and Mobile TV. The

Open Geospatial Consortium also provides a Binary XML Encoding Specification (currently a Best Practice Paper)

optimized for geo-related data (GML).

Alternatives to binary XML include using traditional file compression methods on XML documents (for example

gzip); or using an existing standard such as ASN.1. Traditional compression methods, however, offer only the

advantage of compression, without the advantage of decreased parsing time or random access. ASN.1 is being used

as the basis of Fast Infoset, which is one binary XML standard. There are also hybrid approaches (e.g., VTD-XML)

that attach a small index file to an XML document to eliminate the overhead of parsing [2].
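The limitation of generic compression can be illustrated with a minimal Python sketch. The sample document and all variable names are hypothetical; the sketch only shows that a gzip-compressed document is smaller but must still be fully decompressed and parsed as text before any element can be read.

```python
import gzip
import xml.etree.ElementTree as ET

# Hypothetical sample document, deliberately repetitive so that it compresses well.
xml_text = "<items>" + "".join(
    f"<item id='{i}'>value</item>" for i in range(1000)
) + "</items>"

raw = xml_text.encode("utf-8")
packed = gzip.compress(raw)  # smaller on the wire or on disk

# Reading the data back still needs a full decompress followed by a full
# text parse, so compression alone gives no parsing or random-access benefit.
root = ET.fromstring(gzip.decompress(packed))
item_count = len(root)  # number of <item> children
```

A binary XML format aims to improve on this by making the encoded form itself cheaper to decode, rather than only smaller.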

Adoption

Projects and file formats which use binary XML include:

• Fast Infoset, a standard published by ISO/IEC and ITU-T

• Efficient XML from AgileDelta, Inc., selected as the basis for the W3C Standard for Binary XML (EXI)

• Extensible Binary Meta Language (EBML) from Matroska

• Wireless Binary XML (WBXML)

Other projects that have functionality related to (or competing with) binary representations include:

• VTD-XML from XimpleWare and VTD-XML project

• BiM Standard, from the ISO, developed by the MPEG working group

• Protocol Buffers from Google

• Data Distribution Service from OMG

References

[1] The performance woe of binary XML (http://webservices.sys-con.com/read/250512.htm)

[2] Index XML documents with VTD-XML (http://xml.sys-con.com/read/453082.htm)


Business Process Definition Metamodel

The Business Process Definition Metamodel (BPDM) is a standard definition of concepts used to express business

process models (a metamodel), adopted by the OMG (Object Management Group). Metamodels define concepts,

relationships, and semantics for exchange of user models between different modeling tools. The exchange format is

defined by XSD (XML Schema) and XMI (XML for Metadata Interchange), a specification for transformation of

OMG metamodels to XML. Pursuant to the OMG's policies, the metamodel is the result of an open process

involving submissions by member organizations, following a Request for Proposal [1] (RFP) issued in 2003. BPDM

was adopted in initial form in July 2007, and finalized in July 2008.

BPDM provides abstract concepts as the basis for consistent interpretation of specialized concepts used by business

process modelers. For example, the ordering of many of the graphical elements in a BPMN (Business Process

Modeling Notation) diagram is depicted by arrows between those elements, but the specific elements can have a

variety of characteristics. For example, all BPMN events have some common characteristics, and a variety of

specific events are designated by the type of circle and the icon in the circle. The abstract BPDM concepts ensure

implementers of different modeling tools will associate the same characteristics and semantics with the modeling

elements to ensure models are interpreted the same way when moved to a different tool. Users of the modeling tools

do not need to be concerned with the abstractions; they only see the specialized elements.

BPDM extends business process modeling beyond the elements defined by BPMN and BPEL to include interactions

between otherwise-independent business processes executing in different business units or enterprises

(choreography). A choreography can be specified independently of its participants, and used as a requirement for the

specification of the orchestration implemented by a participant. BPDM provides for the binding of orchestration to

choreography to ensure compatibility. Many current business process models focus on specification of executable

business processes that execute within an enterprise (orchestration).

The BPDM specification addresses the objectives of the OMG RFP [1] on which it is based:

• BPDM "will define a set of abstract business process definition elements for specification of executable business

processes that execute within an enterprise, and may collaborate between otherwise-independent business

processes executing in different business units or enterprises."

• A common metamodel to unify the diverse business process definition notations that exist in the industry containing

semantics compatible with leading business process modeling notations.

• A metamodel that complements existing UML metamodels so that business process specifications can be part

of complete system specifications to assure consistency and completeness.

• The ability to integrate process models for workflow management processes, automated business processes, and

collaborations between business units.

• Support for the specification of web services choreography, describing the collaboration between participating

entities and the ability to reconcile the choreography with supporting internal business processes.

• The ability to exchange business process specifications between modeling tools, and between tools and execution

environments using XMI.

The RFP seeks to "improve communication between modelers, including between business and software modelers,

provide flexible selection of tools and execution environments, and promote the development of more specialized

tools for the analysis and design of processes."

For exchange of business process models, BPDM is an alternative to the existing process interchange format XPDL

(XML Process Definition Language) from the WfMC (Workflow Management Coalition). The two specifications

are similar in that they can be used by process design tools to exchange business process definitions. They are

different in that BPDM provides a specification of semantics integrated in a metamodel, and it includes additional

modeling capabilities such as choreography, discussed above. In addition, XPDL has many implementations, though

only some support XPDL 2.x, which is needed for interchanging BPMN. BPDM implementations are in preparation,

including support for BPMN, and translation to XPDL.

External links

• BPDM Tutorial [2]

• Design Rationale [3] (see Section 4, also Sections 7.6 and 7.9).

• Other introductory presentations [4]

• Web pages showing metamodels [5] in UML notation

• Specification documents, in two parts:

• Common Infrastructure [6] (see Section 4.4.1.1 for an overview of metamodeling).

• Process Definition [7].

References

[1] http://www.omg.org/cgi-bin/doc?bei/03-01-06

[2] http://doc.omg.org/omg/08-06-32

[3] http://doc.omg.org/bmi/08-09-07

[4] http://www.conradbock.org/#BPDM

[5] ftp://ftp.omg.org/pub/docs/dtc/08-05-11/pages/188c21b53f42002f.htm

[6] http://doc.omg.org/dtc/08-05-07

[7] http://doc.omg.org/dtc/08-05-10

CDATA

The term CDATA, meaning character data, is used for distinct, but related purposes in the markup languages

SGML and XML. The term indicates that a certain portion of the document is general character data, rather than

non-character data or character data with a more specific, limited structure.

CDATA sections in XML

In an XML document or external parsed entity, a CDATA section is a section of element content that is marked for

the parser to interpret as only character data, not markup. A CDATA section is merely an alternative syntax for

expressing character data; there is no semantic difference between character data that manifests as a CDATA section

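and character data that manifests in the usual syntax, in which "<" and "&" would be represented by "&lt;" and "&amp;", respectively.

A CDATA section starts with the following sequence:

<![CDATA[

and ends with the first occurrence of the sequence:

]]>

All characters enclosed between these two sequences are interpreted as characters, not markup or entity references. For example, in a line like this:

<sender>John Smith</sender>

the opening and closing "sender" tags are interpreted as markup. However, if written like this:

<![CDATA[<sender>John Smith</sender>]]>

then the code is interpreted the same as if it had been written like this:

&lt;sender&gt;John Smith&lt;/sender&gt;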

That is, the "sender" tags will have exactly the same status as the "John Smith": they will be treated as text.

Similarly, if the numeric character reference &#240; appears in element content, it will be interpreted as the single

Unicode character 00F0 (small letter eth). But if the same appears in a CDATA section, it will be parsed as six

characters: ampersand, hash mark, digit 2, digit 4, digit 0, semicolon.
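The equivalence described above can be checked with a short Python sketch using the standard library's ElementTree parser. The wrapper element name `msg` is a hypothetical choice for illustration; only the content syntax differs between the documents.

```python
import xml.etree.ElementTree as ET

# The same character data, written once as a CDATA section and once escaped.
as_cdata = ET.fromstring("<msg><![CDATA[<sender>John Smith</sender>]]></msg>")
as_escaped = ET.fromstring("<msg>&lt;sender&gt;John Smith&lt;/sender&gt;</msg>")

# Both forms yield identical text and no child elements.
same_text = as_cdata.text == as_escaped.text
child_count = len(as_cdata)

# A numeric character reference is expanded in ordinary element content...
eth = ET.fromstring("<msg>&#240;</msg>").text
# ...but stays as six literal characters inside a CDATA section.
literal = ET.fromstring("<msg><![CDATA[&#240;]]></msg>").text
```

After parsing, the application cannot tell from `as_cdata.text` which of the two syntaxes the author used, which is the point the section makes.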

Uses of CDATA sections

New authors of XML documents often misunderstand the purpose of a CDATA section, mistakenly believing that its

purpose is to "protect" data from being treated as ordinary character data during processing. Some APIs for working

with XML documents do offer options for independent access to CDATA sections, but such options exist above and

beyond the normal requirements of XML processing systems, and still do not change the implicit meaning of the

data. Character data is character data, regardless of whether it is expressed via a CDATA section or ordinary markup.

CDATA sections are useful for writing XML code as text data within an XML document. For example, if one wishes

to typeset a book with XSL explaining the use of an XML application, the XML markup to appear in the book itself

will be written in the source file in a CDATA section. However, a CDATA section cannot contain the string "]]>"

and therefore it is not possible for a CDATA section to contain nested CDATA sections. The preferred approach to

using CDATA sections for encoding text that contains the triad "]]>" is to use multiple CDATA sections by splitting

each occurrence of the triad just before the ">". For example, to encode "]]>" one would write:

<![CDATA[]]]]><![CDATA[>]]>

This means that to encode "]]>" in the middle of a CDATA section, replace all occurrences of "]]>" with the

following:

]]]]><![CDATA[>

This effectively stops and restarts the CDATA section.
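A minimal Python sketch of this splitting technique follows. The helper name `to_cdata` and the sample payload are hypothetical; the replacement string is the one described above, which closes the current section after "]]" and opens a new one before ">".

```python
import xml.etree.ElementTree as ET

def to_cdata(text: str) -> str:
    """Wrap text in CDATA, splitting every "]]>" just before its ">"."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"

# Hypothetical payload that happens to contain the CDATA end marker.
payload = "if (a[b[0]]> 1) { /* contains the end marker */ }"
document = "<code>" + to_cdata(payload) + "</code>"

# A parser joins the adjacent CDATA sections back into one run of text.
round_trip = ET.fromstring(document).text
```

Because adjacent CDATA sections are just character data, the parser returns the original payload unchanged.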

Use of CDATA in program output

For generating XML "by hand", CDATA sections do not remove the need for escaping. The string ]]> (the CDATA

end marker) must be escaped with a string such as ]]]]><![CDATA[>, which breaks the string across separate

CDATA sections. An alternative to using CDATA sections which may be simpler in some circumstances is to

escape the single characters & and < (normally using &amp; or &#38; and &lt; or &#60;). The different approaches

produce equally valid XML, and most XML parsers will not preserve the distinctions between them in their output.
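As a sketch of the escaping alternative, the Python standard library provides `xml.sax.saxutils.escape`, which replaces `&`, `<`, and `>` with entity references. The element name `v` and the sample text are hypothetical.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

text = "1 < 2 && 'ok'"

# Approach 1: escape the special characters individually.
escaped_doc = "<v>" + escape(text) + "</v>"

# Approach 2: a CDATA section (safe here only because text has no "]]>").
cdata_doc = "<v><![CDATA[" + text + "]]></v>"

from_escaped = ET.fromstring(escaped_doc).text
from_cdata = ET.fromstring(cdata_doc).text
```

Both documents parse to the same text, so the choice between them is a matter of convenience for whoever generates the XML.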

CDATA sections in XHTML documents are liable to be parsed differently by web browsers if they render the

document as HTML, since HTML parsers do not recognise the CDATA start and end markers, nor do they recognise

HTML entity references such as &lt; within <script> tags. This can cause rendering problems in web browsers and

can lead to cross-site scripting vulnerabilities if used to display data from untrusted sources, since the two kinds of

parser will disagree on where the CDATA section ends.

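Since it is useful to be able to use less-than signs (<) and ampersands (&) in web page scripts, and to a lesser extent styles, without having to remember to escape them, it is common to use CDATA markers around the text of inline <script> and <style> elements in XHTML documents. But so that the document can also be parsed by HTML parsers, which do not recognise the CDATA markers, the CDATA markers are usually commented-out, as in this JavaScript example:

<script type="text/javascript">
//<![CDATA[
document.write("<");
//]]>
</script>

or this CSS example:

<style type="text/css">
/*<![CDATA[*/
body { background-image: url("marble.png?width=300&height=300") }
/*]]>*/
</style>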

This technique is only necessary when using inline scripts and stylesheets, and is language-specific. CSS stylesheets,

for example, only support the second style of commenting-out (/* ... */), but CSS also has less need for the < and &

characters than JavaScript and so less need for explicit CDATA markers.

CDATA in DTDs

CDATA-type attribute value

In Document Type Definition (DTD) files for SGML and XML, an attribute value may be designated as being of

type CDATA: arbitrary character data. Within a CDATA-type attribute, character and entity reference markup is

allowed and will be processed when the document is read.

For example, if an XML DTD contains

<!ATTLIST foo a CDATA #IMPLIED>

it means that elements named foo may optionally have an attribute named "a" which is of type CDATA. In an XML

document that is valid according to this DTD, an element like this might appear:

<foo a="1 &amp; 2 are &lt; 3" />

and an XML parser would interpret the "a" attribute's value as being the character data "1 & 2 are < 3".
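The attribute behaviour can be reproduced with a small Python sketch. The internal DTD subset shown here is an assumed declaration of the kind the passage describes (an optional CDATA attribute "a" on foo); ElementTree's parser is non-validating, so the declaration is read but not enforced.

```python
import xml.etree.ElementTree as ET

# Assumed declaration: an optional (#IMPLIED) attribute "a" of type CDATA on foo.
document = """<!DOCTYPE foo [
  <!ATTLIST foo a CDATA #IMPLIED>
]>
<foo a="1 &amp; 2 are &lt; 3"/>"""

# Entity references inside the attribute value are processed on read.
value = ET.fromstring(document).get("a")
```

The application sees the plain character data, with the two entity references already resolved.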

CDATA-type entity

An SGML or XML DTD may also include entity declarations in which the token CDATA is used to indicate that the

entity consists of character data. The character data may appear within the declaration itself or may be available

externally, referenced by a URI. In either case, character reference and parameter entity reference markup is allowed

in the entity, and will be processed as such when it is read.

CDATA-type element content

An SGML DTD may declare an element's content as being of type CDATA. Within a CDATA-type element, no

markup will be processed. It is similar to a CDATA section in XML, but has no special boundary markup, as it

applies to the entire element.


External links

• CDATA Confusion [1]

• Character Data and Markup (in XML) [2]

References

[1] http://www.flightlab.com/~joe/sgml/cdata.html

[2] http://www.w3.org/TR/REC-xml/#syntax

CDuce

CDuce is an XML-oriented functional language that extends XDuce in a few directions. It features XML regular

expression types, XML regular expression patterns, and XML iterators. Strictly speaking, CDuce is not an XML

transformation language, since it can be used for general-purpose programming.

CDuce conforms to basic standards: Unicode, XML, DTD, and Namespaces are fully supported; XML Schema is

partially supported.

Benefits of CDuce

• static verifications (e.g.: ensure that a transformation produces a valid document);

• in particular, smooth and safe composition of XML transformations, and incremental programming;

• static optimizations and efficient execution model (knowing the type of a document is crucial to extract

information efficiently).

Features particular to CDuce

• XML objects can be manipulated as first-class values: elements, sequences, tags, characters and strings,

attribute sets; sequences of XML elements can be specified by regular expressions, which also apply to character

strings;

• functions themselves are first-class values: they can be manipulated, stored in data structures, returned by a

function, and so on;

• a powerful pattern matching operation can perform complex extractions from sequences of XML elements;

• a rich type algebra, with recursive types and arbitrary boolean combinations (union, intersection, complement)

allows precise definitions of data structures and XML types; general purpose types and type constructors are

taken seriously (products, extensible records, arbitrary precision integers with interval constraints, Unicode

characters);

• polymorphism through a natural notion of subtyping, and overloaded functions with dynamic dispatch;

• a highly-effective type-driven compilation schema.

External links

• CDuce [1]

References

[1] http://www.cduce.org


Character entity reference

In the markup languages SGML, HTML, XHTML and XML, a character entity reference is a reference to a

particular kind of named entity that has been predefined or explicitly declared in a Document Type Definition

(DTD). The "replacement text" of the entity consists of a single character from the Universal Character Set/Unicode.

The purpose of a character entity reference is to provide a way to refer to a character that is not universally

encodable.

Although in popular usage character references are often called "entity references" or even "entities", this usage is

wrong. A character reference is a reference to a character, not to an entity. Entity reference refers to the content of a

named entity. An entity declaration is created by using the <!ENTITY name "value"> syntax in a document type

definition (DTD) or XML schema. Then, the name defined in the entity declaration is subsequently used in the

XML. When used in the XML, it is called an entity reference.

Concepts

XML has two relevant concepts:

Predefined entity

A "predefined entity reference" is a reference to one of the special characters denoted by:

entity   character   code (dec)   meaning
&quot;   "           x22 (34)     (double) quotation mark
&amp;    &           x26 (38)     ampersand
&apos;   '           x27 (39)     apostrophe (= apostrophe-quote)
&lt;     <           x3C (60)     less-than sign
&gt;     >           x3E (62)     greater-than sign

Character coding

A "character reference" is a construct such as &#xA0; or equally &#160; that refers to a character by means of its

numeric Unicode code point; here, the character code 160 (xA0 in hexadecimal) refers to the non-breaking space

character (&nbsp; in HTML).
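Both concepts can be observed with a standard XML parser. The following Python sketch uses a hypothetical wrapper element `t`; it relies only on the five predefined XML entities and on numeric character references, which need no DTD.

```python
import xml.etree.ElementTree as ET

# The five predefined entities are available in every XML document.
predefined = ET.fromstring("<t>&quot;&amp;&apos;&lt;&gt;</t>").text

# Decimal and hexadecimal character references name the same code point.
decimal = ET.fromstring("<t>&#160;</t>").text
hexadecimal = ET.fromstring("<t>&#xA0;</t>").text
```

Here `decimal` and `hexadecimal` both hold the single character U+00A0, the non-breaking space.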

See also

• SGML entity

• Character encodings in HTML

• Numeric character reference

• List of XML and HTML character entity references


External links

• Entities Table [1]

• A Simple Character Entity Chart [2]

• A character entity chart with images for entities [3]

• A Clear and Quick Reference to HTML Symbol Entities Codes [4]

References

[1] http://www.elizabethcastro.com/html/extras/entities.html

[2] http://www.evolt.org/article/ala/17/21234/

[3] http://www.escapecodes.info/

[4] http://www.entitycode.com/


CodeSynthesis XSD

Written in: C++

Type: library or framework

CodeSynthesis XSD is an XML Data Binding compiler for C++ developed by Code Synthesis and dual-licensed

under the GNU GPL and a proprietary license. Given an XML instance specification (XML Schema), it generates

C++ classes that represent the given vocabulary as well as parsing and serialization code. It is supported on a large

number of platforms, including AIX, GNU/Linux, HP-UX, Mac OS X, Solaris, Windows, HP OpenVMS, and IBM

z/OS. Supported C++ compilers include GNU G++, Intel C++, HP aCC, Sun C++, IBM XL C++, and Microsoft

Visual C++. A version for mobile and embedded systems, called CodeSynthesis XSD/e, is also available.

One of the unique features of CodeSynthesis XSD is its support for two different XML Schema to C++ mappings:

in-memory C++/Tree and stream-oriented C++/Parser. The C++/Tree mapping is a traditional mapping with a

tree-like, in-memory data structure. C++/Parser is a new, SAX-like mapping which represents the information stored

in XML instance documents as a hierarchy of vocabulary-specific parsing events. In comparison to C++/Tree, the

C++/Parser mapping allows one to handle large XML documents that would not fit in memory, perform

stream-oriented processing, or use an existing in-memory representation.
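The difference between the two processing styles can be sketched in Python. This is only an analogy using the standard library's generic ElementTree API, not CodeSynthesis XSD's generated C++ classes, and the sample document and names are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

doc = (b"<people>"
       b"<person><name>Ann</name></person>"
       b"<person><name>Bob</name></person>"
       b"</people>")

# Tree style: the whole document is built in memory, then navigated.
tree_names = [p.findtext("name") for p in ET.fromstring(doc).findall("person")]

# Stream style: react to each <person> as its end tag arrives, then
# discard it, so memory use does not grow with the number of records.
stream_names = []
for _event, elem in ET.iterparse(io.BytesIO(doc), events=("end",)):
    if elem.tag == "person":
        stream_names.append(elem.findtext("name"))
        elem.clear()  # drop the children of the record just handled
```

Both loops extract the same names; the stream style trades convenient random navigation for a bounded memory footprint.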

CodeSynthesis XSD itself is written in C++ [1].

External links

• CodeSynthesis XSD Home Page [2]

• An Introduction to the C++/Tree Mapping [3]

• An Introduction to the C++/Parser Mapping [4]

• An Introduction to XML Data Binding in C++ [5]

References

[1] Bjarne Stroustrup. C++ applications (http://www.research.att.com/~bs/applications.html), 2007-05-25. Retrieved on 2007-06-18.

[2] http://www.codesynthesis.com/products/xsd/

[3] http://www.codesynthesis.com/projects/xsd/documentation/cxx/tree/guide/

[4] http://www.codesynthesis.com/projects/xsd/documentation/cxx/parser/guide/

[5] http://www.artima.com/cppsource/xml_data_binding.html


D3L

D3L (Data Definition Description Language) is an XML-based message description language that describes the

structure that an application's native, non-XML format message (known also as its native view) must follow to

communicate. D3L is currently used in Oracle Application Server InterConnect to interact through several transport

adapters, including FTP, HTTP(S), MQ Series, and SMTP.

External links

http://download-uk.oracle.com/docs/cd/B10465_01/integrate.904/b10404/appx_d3l.htm#620714

Darwin Information Typing Architecture

The Darwin Information Typing Architecture (DITA) is an XML-based architecture for authoring, producing,

and delivering information. Although its main applications have so far been in technical publications, DITA is also

used for other types of documents such as policies and procedures.

Origin and name

The DITA architecture and a related DTD and XML Schema were originally developed by IBM. The architecture

incorporates ideas in XML architecture, such as modular information architecture, various features for content reuse,

and specialization, that had been developed over previous decades. [1] DITA is now an OASIS standard.

The first word in the name "Darwin Information Typing Architecture" is a reference to the naturalist Charles Darwin.

The key concept of "specialization" in DITA is in some ways analogous to Darwin's concept of evolutionary

adaptation, with a specialized element inheriting the properties of the base element from which it is specialized.

Features and limitations

Topic orientation

DITA content is written as modular topics, as opposed to long "book-oriented" files. A DITA map contains links to

topics, organized in the sequence (which may be hierarchical) in which they are intended to appear in finished

documents. A DITA map defines the table of contents for deliverables. Relationship tables in DITA maps can also

specify which topics link to each other.

Modular topics can be easily reused in different deliverables. However, the strict topic-orientation of DITA makes it

an awkward fit for content that contains lengthy narratives that do not lend themselves to being broken into small,

standalone chunks. Experts stress the importance of content analysis in the early stages of implementing structured

authoring. [2] [3] [4]


Content references

Fragments of content within topics (or less commonly, the topics themselves) can be reused through the use of

content references (conref), a transclusion mechanism.

Conditional text

Conditional text allows filtering or styling content based on attributes for audience, platform, product, and other

properties.
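A minimal sketch of attribute-based filtering, written in Python, is shown below. It is a simplification for illustration, not the behaviour of any particular DITA processor: the function name `filter_by` and the sample content are hypothetical, and elements without the attribute are assumed to apply to every output.

```python
import xml.etree.ElementTree as ET

SOURCE = """<body>
  <p>Common step.</p>
  <p platform="windows">Open the Start menu.</p>
  <p platform="linux">Open a terminal.</p>
</body>"""

def filter_by(root, attribute, keep_value):
    """Remove elements whose attribute is set but does not list keep_value."""
    for parent in list(root.iter()):
        for child in list(parent):
            values = child.get(attribute)
            # Attribute values are treated as space-separated lists.
            if values is not None and keep_value not in values.split():
                parent.remove(child)
    return root

root = filter_by(ET.fromstring(SOURCE), "platform", "linux")
kept = [p.text for p in root.iter("p")]
```

Running the filter for "linux" keeps the unmarked paragraph and the Linux-specific one, and drops the Windows-specific one.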

Metadata

DITA includes extensive metadata elements and attributes, which make topics easier to find.

Information typing

DITA specifies three basic topic types: Task, Concept and Reference. Each of the three basic topic types is a

specialization of a generic Topic type, which contains a title element, a prolog element for metadata, and a body

element. The body element contains paragraph, table, and list elements, similar to HTML.

1. A Task topic is intended for a procedure that describes how to accomplish a task. A Task topic lists a series of

steps that users follow to produce an intended outcome. The steps are contained in a taskbody element, which is a

specialization of the generic body element. The steps element is a specialization of an ordered list element.

2. A Concept topic is more objective, containing definitions, rules, and guidelines.

3. A Reference topic is for topics that describe command syntax, programming instructions, and other reference

material, and usually contains detailed, factual material.

Specialization

DITA allows adding new elements and attributes through specialization of base DITA elements and attributes.

Through specialization, DITA can accommodate new topic types, element types, and attributes as needed for specific

industries or companies. Specializations of DITA for specific industries, such as the semiconductor industry, are

standardized through OASIS technical committees or subcommittees. A significant percentage of organizations

using DITA also develop their own specializations.

The extensibility of DITA permits organizations to specialize DITA by defining specific information structures and

still use standard tools to work with them. The ability to define company-specific information architectures enables

companies to use DITA to enrich content with metadata that is meaningful to them, and to enforce company-specific

rules on document structure.

Compatibility with non-DITA content

The element types and structures in DITA topics are similar to popular languages such as HTML. For example, a

bulleted or numbered list can be copied and pasted directly from HTML to DITA.

DITA maps can include both DITA topics and non-DITA documents (such as HTML files and Microsoft Word

documents) in document hierarchies. However, processors are generally limited in their ability to merge DITA and

non-DITA content into consolidated printed documents.


Creating content in DITA

DITA map and topic documents are XML files. As with HTML, any images, video files, or other files which need to

appear in output are inserted via reference. Any XML editor can therefore be used to write DITA content, with the

exception of editors that support only a limited set of XML schemas (such as XHTML editors). Various editing tools

have been developed that provide specific features to support DITA, such as visualization of conrefs.

Publishing content written in DITA

DITA is conceived as an end-to-end architecture. In addition to indicating what elements, attributes, and rules are

part of the DITA language, the DITA specification [5] includes rules for publishing DITA content in print, HTML,

online Help, and other formats.

For example, the DITA specification indicates that if the conref attribute of element A contains a path to element B,

the contents of element B will be displayed in the location of element A. Applications that transform DITA content

into other formats, and meet the DITA specification's requirements for interpreting DITA markup, are known as DITA

processors; they must handle the conref attribute according to the specified behaviour. Rules also exist for

processing other rich features such as conditional text, index markers, and topic-to-topic links.
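The conref rule can be sketched in a few lines of Python. This is a deliberately simplified illustration, not a conforming DITA processor: it assumes a hypothetical same-document reference of the form "#id", whereas real conref values address a file, topic, and element. The function name and sample topic are also hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical single-file topic; element A is the <p> carrying conref.
SOURCE = """<topic id="t1">
  <body>
    <p id="warning">Disconnect power before servicing.</p>
    <p conref="#warning"/>
  </body>
</topic>"""

def resolve_conrefs(root):
    """Copy the content of each referenced element into the referencing one."""
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}
    for el in list(root.iter()):
        ref = el.get("conref")
        if ref and ref.startswith("#"):
            target = by_id.get(ref[1:])
            if target is not None:
                el.text = target.text
                el[:] = [copy.deepcopy(child) for child in target]
                del el.attrib["conref"]
    return root

root = resolve_conrefs(ET.fromstring(SOURCE))
paragraphs = [p.text for p in root.iter("p")]
```

After resolution, both paragraphs carry the same text, which is the transclusion effect the specification requires of a processor.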

DITA Open Toolkit

When DITA was released as a public XML standard in 2001, IBM contributed the DITA Open Toolkit (DITA OT)

to the wider community. The DITA OT was therefore the first DITA processor, and continues to be the foundation of

most publishing of DITA content. It is currently an active open-source project, with contributions from several

companies.

Out of the box, the DITA OT handles all valid DITA specializations and produces several output formats, including:

• PDF, through XSL-FO

• XHTML

• Microsoft Compiled HTML Help

• Eclipse Help

• Java Help

• Oracle Help

• Rich Text Format

The DITA OT can also be extended to produce other (arbitrary) output formats. The raw DITA OT can be run from

the command line. Some DITA authoring tools and content management systems now integrate the DITA OT, or

parts of it, into their own publishing workflows. Standalone tools have also been developed to run the DITA OT via

a graphical user interface instead of the command line.

The DITA OT includes customizable stylesheets that control the formatting and layout of human-readable

deliverables.


Brief history

• March 2001 Introduction by IBM

• May 2002 Domain specialization added to topic specialization

• April 2004 OASIS [6] Technical Committee for DITA formed

• February 2005 SourceForge [7] begins DITA Open Toolkit support

• June 2005 DITA v1.0 approved as an OASIS standard

• August 2005 DITA Open Toolkit v1.1 is released

• March 2006 OASIS launches DITA.XML.org [8]

• August 2007 DITA V1.1 is approved by OASIS, including Bookmap specialization

See also

• DocBook

• S1000D

• List of document markup languages

• Comparison of document markup languages

References

• IBM's Introduction to DITA [9]

• DITA Architectural Specification, v 1.1 [5]

• DITA Language Specification, v 1.1 [10]

Further reading

• Priestley, Michael; Swope, Amber (2008) (PDF). The DITA Maturity Model Whitepaper [11]. IBM Corp and

JustSystems.

• Doyle, Bob (2008) (PDF). DITA Tools from A to Z [12]. Society for Technical Communication.

External links

• DITA XML.org community site [8]

• DITA World [13] - Comprehensive list of DITA resources: articles, vendors, user groups and more

• DITA Open Toolkit User Guide and Reference [14]

• Roadmap for DITA Development [15], OASIS DITA Technical Committee

• DITA News [16] - aggregates DITA bloggers, has extensive resources, and DITA tools listing

• RuDI: Ruby Utilities for DITA processing [17]


References

[1] Doyle, Bob. "History of DITA" (http://dita.xml.org/book/history-of-dita). Retrieved 2009-07-31.

[2] "Implementing DITA versus implementing custom XML architecture" (http://www.scriptorium.com/whitepapers/dita_assessment/dita_assessment4.html). Scriptorium Publishing Services, Inc. 2008. Retrieved 2009-07-29.

[3] "Structure, DITA, and content other than technical documentation …" (http://rockley.com/blog/?p=22). The Rockley Group. October 16, 2007. Retrieved 2009-07-29.

[4] "Survey on DITA Challenges" (http://writepoint.com/blog/?p=1011). WritePoint Ltd. January 18, 2010. Retrieved 2010-01-21.

[5] http://docs.oasis-open.org/dita/v1.1/CS01/archspec/archspec.html

[6] http://www.oasis-open.org

[7] http://dita-ot.sourceforge.net

[8] http://dita.xml.org

[9] http://www.ibm.com/developerworks/xml/library/x-dita1/

[10] http://docs.oasis-open.org/dita/v1.1/CS01/langspec/ditaref-type.html

[11] http://na.justsystems.com/files/Whitepaper-DITA_MM.pdf

[12] http://www.ditanews.com/tools/STC_Intercom.pdf

[13] http://www.ditaworld.com

[14] http://dita-ot.sourceforge.net/doc/ot-userguide/xhtml/

[15] http://wiki.oasis-open.org/dita/Roadmap_for_DITA_development

[16] http://www.ditanews.com

[17] http://kenai.com/projects/rudi/pages/Home

DITA Open Toolkit

The DITA Open Toolkit is a free and open-source implementation of the OASIS DITA Technical Committee's

specification for Darwin Information Typing Architecture (DITA) DTDs and Schemas. [1]

The Toolkit transforms DITA content (topics and maps) into deliverable formats like web (XHTML), print (PDF),

and online Help.

The DITA Open Toolkit, or dita-ot for short, is a set of Ant- and Java-based, open source tools that provide a

"reference implementation" for processing DITA maps and topical content to multiple output formats.

It is a demonstration of DITA's capabilities for single source publishing, modularity, structured writing, information

typing, inheritance, specialization, topic-based authoring, conditional processing, component publishing, task

orientation, and content reuse.

Several XML editors and XML content management systems integrate the DITA Open Toolkit into their products,

including Oxygen XML Editor, XMetaL, and Syntext Serna.

See also

• Darwin Information Typing Architecture

Further reading

• Linton, Jen and Bruski, Kylene (2006). Introduction to DITA: A Basic User Guide to the Darwin Information

Typing Architecture [2] . Denver, CO: Comtech Services.

External links

• http://dita.xml.org

• SourceForge page on the DITA OT [3]

• Don Day's Resources Page for the DITA OT [4]

• DITA Open Toolkit User Guide [5]

• Download page for the DITA OT [6]

• DITA Users [7] - a member organization with workspace folders and online version of the DITA Open Toolkit

• PHP debugging tools for the DITA OT [8]

• DITA Infocenter [9] - DITA OT User Guide in online help format

References

[1] http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=dita

[2] http://www.comtech-serv.com/dita2.shtml#book

[3] http://sourceforge.net/projects/dita-ot/

[4] http://www.ditaopentoolkit.org/

[5] http://dita-ot.sourceforge.net/SourceForgeFiles/doc/user_guide.html

[6] http://sourceforge.net/project/showfiles.php?group_id=132728

[7] http://www.ditausers.org

[8] http://www.vrcommunications.com/Code/ditaotug131-18042007-tools.zip

[9] http://www.ditainfocenter.com

Document Structure Description

Document Structure Description, or DSD, is a schema language for XML, that is, a language for describing valid

XML documents. It's an alternative to DTD or the W3C XML Schema.

An example of DSD in its simplest form:

This says that an element named "foo" in the XML namespace "http://example.com" may have two attributes, named

"first" and "second". A "foo" element may not have any character data. It must contain one subelement, named "bar",

also in the "http://example.com" namespace. A "bar" element is not allowed any attributes, character data or

subelements.

One XML document that would be valid according to the above DSD would be:
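
As a minimal sketch, a document satisfying the constraints described above can be written out and checked by hand with Python's standard library. The function name `conforms`, the attribute value, and the decision to tolerate whitespace are assumptions made for illustration; this is a hand-coded check, not a DSD processor.

```python
import xml.etree.ElementTree as ET

NS = "http://example.com"

# One candidate document: a "foo" element with an optional attribute
# and exactly one empty "bar" child, both in the example namespace.
DOC = '<foo xmlns="http://example.com" second="2"><bar/></foo>'


def conforms(xml_text: str) -> bool:
    """Hand-coded check of the constraints described in the text (not a DSD engine)."""
    foo = ET.fromstring(xml_text)
    if foo.tag != f"{{{NS}}}foo":
        return False
    # Only the attributes "first" and "second" are allowed on foo.
    if not set(foo.attrib) <= {"first", "second"}:
        return False
    # foo may not carry character data (whitespace is tolerated here by assumption).
    if (foo.text or "").strip():
        return False
    children = list(foo)
    if len(children) != 1 or children[0].tag != f"{{{NS}}}bar":
        return False
    bar = children[0]
    if (bar.tail or "").strip():
        return False
    # bar allows no attributes, character data or subelements.
    return not bar.attrib and not (bar.text or "").strip() and len(bar) == 0


print(conforms(DOC))
```

A document that omits the "bar" child, or that adds an attribute other than "first" or "second", is rejected by the same check.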

Current Software store

• Prototype Java Processor [1] from BRICS

External links

• DSD home page [2]

• Full DSD specification [3]

• Comparison of DTD, W3C XML Schema, and DSD [4]

References

[1] http://www.brics.dk/DSD/dsd2

[2] http://www.brics.dk/DSD/

[3] http://www.brics.dk/DSD/dsd2.html

[4] http://www.brics.dk/~amoeller/XML/schemas/

Document-Centric

Document-centric XML processing is a notion first introduced in VTD-XML. Before VTD-XML, traditional

XML processing models (e.g. DOM, SAX and JAXB) were designed around the notion of objects. The XML text,

merely as the serialization of the objects, is relegated to the status of a second-class citizen. You base your

applications on DOM nodes, strings and various business objects, but rarely on the physical documents. If you have

followed my articles on DevX so far, it should quickly become obvious that this object-oriented approach of XML

processing makes little sense because of the performance hits from virtually all directions. Not only are object

creation and garbage collection inherently memory and CPU inefficient, but your applications incur the cost of

re-serialization with even the smallest changes to the original text.

With document-centric XML processing, the XML document (the persistent format of data) is the starting point from

which everything else comes about. Whether it is parsing, XPath evaluation, modifying content, or slicing element

fragments, by default you no longer work directly with objects. You only do that when it makes sense. More often

than not, you treat documents purely as syntax, and think in bytes, byte arrays, integers, offsets, lengths, fragments

and namespace-compensated fragments. The first-class citizen in this paradigm is the XML text. And the

object-centric notions of XML processing, such as serialization and de-serialization (or marshalling and

unmarshalling) are often displaced, if not replaced, by more document-centric notions of parsing and composition.

Increasingly you will find that your XML programming experience is getting simpler. And not surprisingly, the

simpler, intuitive way to think about XML processing is also the most efficient and powerful.

Document-centric XML processing

Document-centric XML processing is one of two conceptual approaches to processing XML content, along with

Data-centric XML processing. Although there is no universally accepted definition of the term, the following articles

discuss features typically associated with this approach:

• Data-centric vs Document-centric XML [1]

• Text-centric vs data-centric XML retrieval [2]

Applications based on Document-centric Approach

VTD-XML

Before VTD-XML, traditional XML processing models (e.g. DOM, SAX and JAXB) were designed around the

notion of objects. The XML text, merely as the serialization of the objects, is relegated to the status of a second-class

citizen. Applications are based on DOM nodes, strings and various business objects, but rarely on the physical

documents. This object-oriented approach of XML processing has serious issues because of the performance hits

from virtually all directions. Not only are object creation and garbage collection inherently memory and CPU

inefficient, but applications incur the cost of re-serialization with even the smallest changes to the original text.

With document-centric XML processing, the XML document (the persistent format of data) is the starting point from

which everything else comes about. Whether it is parsing, XPath evaluation, modifying content, or slicing element

fragments, by default you no longer work directly with objects. You only do that when it makes sense. More often

than not, you treat documents purely as syntax, and think in bytes, byte arrays, integers, offsets, lengths, fragments

and namespace-compensated fragments. The first-class citizen in this paradigm is the XML text. And the

object-centric notions of XML processing, such as serialization and de-serialization (or marshalling and

unmarshalling) are often displaced, if not replaced, by more document-centric notions of parsing and composition.
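
The idea of working with offsets and lengths rather than objects can be sketched in a few lines of Python. This is an illustration only and does not use the VTD-XML API: the helper names `text_span` and `splice` are hypothetical, and the scan is deliberately naive (it assumes an element without attributes or nested markup).

```python
def text_span(doc: bytes, tag: bytes) -> tuple[int, int]:
    """Return (offset, length) of the text inside the first <tag>...</tag>.

    Naive scan for illustration: assumes the element has no attributes and
    no nested markup, and that the tag name is not inside a comment or CDATA.
    """
    open_tag = b"<" + tag + b">"
    close_tag = b"</" + tag + b">"
    start = doc.index(open_tag) + len(open_tag)
    end = doc.index(close_tag, start)
    return start, end - start


def splice(doc: bytes, offset: int, length: int, new: bytes) -> bytes:
    """Replace one byte range; all other bytes are copied through untouched."""
    return doc[:offset] + new + doc[offset + length:]


doc = b'<?xml version="1.0"?>\n<order>\n  <!-- keep me -->\n  <qty>3</qty>\n</order>\n'
offset, length = text_span(doc, b"qty")
updated = splice(doc, offset, length, b"42")
print(updated.decode("utf-8"))
```

Because only one byte range changes, the declaration, comment and whitespace of the original document survive exactly, with no re-serialization step.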

References

[1] http://techessence.info/node/51

[2] http://nlp.stanford.edu/IR-book/html/htmledition/text-centric-vs-data-centric-xml-retrieval-1.html

Dynamic XML

Dynamic XML means dynamic data that is in an XML format.

The term is also commonly used for information that is extracted from a database (commonly a

relational database) and placed into XML format. This is a different case, as it does not involve

any updates to the data, which is in fact static. In this context the word "dynamic" takes the alternative

meaning of "automated", in the sense that something performed dynamically is carried out without manual effort.
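
The second sense, XML generated automatically from a relational database, can be sketched with Python's standard library. The table, column and element names below are hypothetical and chosen only for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# In-memory database with a small, made-up table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO product (name, price) VALUES (?, ?)",
    [("pen", 1.5), ("ink", 4.0)],
)

# Each row becomes one <product> element; the XML is a snapshot of the data.
root = ET.Element("products")
for pid, name, price in conn.execute("SELECT id, name, price FROM product ORDER BY id"):
    item = ET.SubElement(root, "product", {"id": str(pid)})
    ET.SubElement(item, "name").text = name
    ET.SubElement(item, "price").text = f"{price:.2f}"
conn.close()

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

The resulting document is static once produced; only the act of producing it is automated.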

ECMAScript for XML

ECMAScript for XML (E4X) is a programming language extension that adds native XML support to ECMAScript

(which includes ActionScript, DMDScript, JavaScript, and JScript). The goal is to provide an alternative to DOM

interfaces that uses a simpler syntax for accessing XML documents. It also offers a new way of making XML

visible. Before the release of E4X, XML was always accessed at an object level. E4X instead treats XML as a

primitive (like characters, integers, and booleans). This implies faster access, better support, and acceptance as a

building block (data structure) of a program.

E4X is standardized by Ecma International in the ECMA-357 standard [1] . The first edition was published in June

2004, the second edition in December 2005.

Browser support

E4X is currently supported by Mozilla's Rhino, used in OpenOffice.org and several other projects, and

SpiderMonkey, used in Firefox, Thunderbird, and other XUL-based applications. It is also supported by Tamarin, the

JavaScript engine used in the Flash virtual machine. It is not currently supported by Nitro (Safari), V8 (Google

Chrome), or Internet Explorer.[2]

Example

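// Sample data; the attribute values are illustrative.
var sales = <sales vendor="John">
    <item type="peas" price="4" quantity="6"/>
    <item type="carrot" price="3" quantity="10"/>
    <item type="chips" price="5" quantity="3"/>
  </sales>;

alert( sales.item.(@type == "carrot").@quantity );

alert( sales.@vendor );

for each( var price in sales..@price ) {
  alert( price );
}

delete sales.item[0];

sales.item += <item type="oranges" price="4"/>;

sales.item.(@type == "oranges").@quantity = 4;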
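
For readers without an E4X-capable engine, the same queries and updates can be approximated with Python's standard xml.etree.ElementTree module. This is a sketch under assumptions: the sample data (vendor name, item types, prices, quantities) is illustrative, and ElementTree is only an analogue of the E4X operations, not an E4X implementation.

```python
import xml.etree.ElementTree as ET

# Illustrative data; the element and attribute names follow the queries above.
sales = ET.fromstring(
    '<sales vendor="John">'
    '<item type="peas" price="4" quantity="6"/>'
    '<item type="carrot" price="3" quantity="10"/>'
    '<item type="chips" price="5" quantity="3"/>'
    '</sales>'
)

# sales.item.(@type == "carrot").@quantity
carrot_quantity = sales.find("item[@type='carrot']").get("quantity")

# sales.@vendor
vendor = sales.get("vendor")

# for each (var price in sales..@price): every price attribute at any depth
prices = [el.get("price") for el in sales.iter() if el.get("price") is not None]

# delete sales.item[0]
sales.remove(sales.findall("item")[0])

# sales.item += <item type="oranges" price="4"/>
ET.SubElement(sales, "item", {"type": "oranges", "price": "4"})

# sales.item.(@type == "oranges").@quantity = 4
sales.find("item[@type='oranges']").set("quantity", "4")

print(carrot_quantity, vendor, prices)
print(ET.tostring(sales, encoding="unicode"))
```

The comparison shows what E4X adds: each commented E4X expression is a single piece of language syntax, where the library version needs a method call and an XPath-like string.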

Implementations

The first implementation of E4X was designed by Terry Lucas and John Schneider and appeared in BEA's WebLogic

Workshop 7.0 released in February 2002. BEA's implementation was based on Rhino and released before the

ECMAScript E4X spec was completed in June 2004. John Schneider wrote an article [3] on the XML extensions in

BEA's Workshop at the time.

• E4X is implemented in SpiderMonkey (Gecko's JavaScript engine) since version 1.6.0 [4] and in Rhino (Mozilla's

other JavaScript engine written in Java instead of C) since version 1.6R1 [5] .

• As Mozilla Firefox is based on Gecko, it can be used to run scripts using E4X. The specification is supported in

the 1.5 release or later.

• Adobe's ActionScript 3 scripting language fully supports E4X. Early previews of ActionScript 3 were first made

available in late 2005. Adobe officially released the language with Flash Player 9 on June 28, 2006.

• E4X is available in Flash CS3, Adobe AIR and Adobe Flex as they use ActionScript 3 as a scripting language.

• E4X is also available in Adobe Acrobat and Adobe Reader versions 8.0 or higher.

• E4X is also available in Aptana's Jaxer Ajax application server which uses the Mozilla engine server-side.

• Since the release of Alfresco Community Edition 2.9B, E4X is also available in this enterprise document

management system.

External links

• ECMA-357 standard [1]

• E4X at faqts.com [6]

• Slides from 2005 E4X Presentation by Brendan Eich, Mozilla Chief Architect [7]

• E4X at Mozilla Developer Center [8]

• Introducing E4X at xml.com [9] : compares E4X and JSON

• Processing XML with E4X [10] at Mozilla Developer Center

• Tutorial from W3 Schools [11]

• E4X: Beginner to Advanced [12] at Yahoo Developer Network

References

[1] http://www.ecma-international.org/publications/standards/Ecma-357.htm

[2] http://code.google.com/p/chromium/issues/detail?id=30975

[3] http://web.archive.org/web/20080403052807/http://dev2dev.bea.com/pub/a/2002/09/JSchneider_XML.html

[4] SpiderMonkey 1.6.0 release notes (http://www.mozilla.org/js/spidermonkey/release-notes/JS_160.html)

[5] Rhino 1.6R1 Change log (http://www.mozilla.org/rhino/rhino16R1.html)

[6] http://www.faqts.com/knowledge_base/index.phtml/fid/1762

[7] https://developer.mozilla.org/presentations/xtech2005/e4x/

[8] https://developer.mozilla.org/en/docs/E4X

[9] http://www.xml.com/pub/a/2007/11/28/introducing-e4x.html

[10] https://developer.mozilla.org/index.php?title=En/Core_JavaScript_1.5_Guide/Processing_XML_with_E4X

[11] http://www.w3schools.com/e4x/default.asp

[12] http://developer.yahoo.com/flash/articles/e4x-beginner-to-advanced.html

Efficient XML Interchange

Efficient XML Interchange (EXI) is a proposed data format from the Efficient XML Interchange Working Group

of the World Wide Web Consortium (W3C). It is one of the various efforts to encode XML documents in a binary

data format, rather than plain text.

Using a binary XML format generally reduces the verbosity of XML documents, and may reduce the cost of parsing.

Performance of writing (generating) content is usually not similarly improved, although this depends on the actual

binary representation used.

The EXI format is derived from the AgileDelta Efficient XML format [1] .

See also

• Binary XML

• Fast Infoset

External links

• Efficient XML Interchange Format 1.0 (Candidate Recommendation) [2]

• Efficient XML Interchange Working Group home page [3]

• EXIficient - Open Source implementation of the EXI Format 1.0 [4]

• W3C binary XML requirements [5]

References

[1] "Lightning-Fast Delivery of XML to More Devices in More Locations" (http://www.agiledelta.com/product_efx.html). AgileDelta.

2007-05-08. Retrieved 2007-07-17.

[2] http://www.w3.org/TR/exi/

[3] http://www.w3.org/XML/EXI/

[4] http://exificient.sourceforge.net/

[5] http://www.w3.org/TR/2005/NOTE-xbc-characterization-20050331/

Embedded RDF

Embedded RDF (eRDF) is a syntax for writing HTML in such a way that the information in the HTML document

can be extracted (with an eRDF parser or XSLT stylesheet) into Resource Description Framework.

It was invented by Ian Davis in 2005, and partly inspired by microformats, a simplified approach to semantically

annotating data in websites. [1]

See also

• RDFa, W3C's approach to embedding RDF

• GRDDL, a way to extract (annotated) data out of XHTML and XML documents and transform it into an RDF

graph

• Microdata (HTML5), a proposed feature of HTML5 that improves on the capabilities of microformats

External links

• eRDF [2]

References

[1] Ian Davis (http://iandavis.com/)

[2] http://research.talis.com/2005/erdf/wiki/Main/RdfInHtml

EpiDoc

The EpiDoc Collaborative [1] , building recommendations for structured markup of epigraphic documents in TEI

XML, was originally formed in 2000 by scholars at the University of North Carolina at Chapel Hill: Tom Elliott, the

former director of the Ancient World Mapping Center, with Hugh Cayless and Amy Hawkins. The guidelines have

matured considerably through extensive discussion on the Markup list [2] and other discussion fora, at several

conferences, and through the experience of various pilot projects. The first major—but not by any means the

only—epigraphic project to adopt and pilot the EpiDoc recommendations has been the Inscriptions of Aphrodisias,

and the guidelines have reached a degree of stability for the first time during this process.

The EpiDoc schema and guidelines may also be applied, perhaps with some local modification, to related

palaeographical fields including Papyrology (projects in progress), Sigillography, and Numismatics.

Guidelines

The EpiDoc Guidelines are available in two forms:

1. the stable guidelines, released periodically and available at: http://www.stoa.org/epidoc/gl/ (current version: 5 [3])

2. the source code, available in its most up-to-date form in the CVS repository at SourceForge [4] ; the GL source

files are a series of XML documents

Tools

Tools developed by and for the EpiDoc community include:

• The EpiDoc webapp, available from the SourceForge [4] CVS repository (the same application is used to deliver

the guidelines).

• The EpiDoc Crosswalker, a tool to transform data in both directions between EpiDoc and other encoding

schemes, markup schemas, and databases. (In progress.)

• CHET-C (the Chapel Hill Electronic Text-Converter), an application originally written in VBA, then as a

free-standing Java app, and now available as a self-contained JavaScript platform written by Hugh Cayless. [5] (A

Python and XSLT version of CHET-C is under construction as part of the IDP project.)

• Transcoder: a Java tool for converting between Beta Code, Unicode NFC, Unicode NFD, and GreekKeys

encoding for Greek script on the fly (download link to follow).
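
The difference between the two Unicode normalization forms named above can be shown with Python's standard unicodedata module. This is only an illustration of NFC and NFD on a sample Greek word; it is not the Transcoder tool and does not cover Beta Code or GreekKeys.

```python
import unicodedata

# Greek word with a precomposed accented letter (U+03AC, alpha with tonos).
word = "\u03ba\u03b1\u03bb\u03ac"

nfd = unicodedata.normalize("NFD", word)  # decomposed: base letter + combining mark
nfc = unicodedata.normalize("NFC", nfd)   # recomposed: single precomposed code point

print(len(word), len(nfd), len(nfc))
print(nfc == word)
```

The decomposed form is one code point longer because the accent is stored as a separate combining character; converting back to NFC restores the original string.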

Projects

• Concordia [6] , King's College London and New York University

• Inscriptions of Aphrodisias [7] , King's College London, UK

• Inscriptions of Roman Cyrenaica [7] , KCL

• Integrating Digital Papyrology (Duke University, Columbia University, Heidelberg University, King's College

London), see now http://papyri.info/

• US Epigraphy Project [8] , Brown University, Providence RI, USA

• Vindolanda Tablets Online [9] , Oxford University, UK

• Etruscan Texts Project [10] , University of Massachusetts Amherst, Amherst MA, USA

Bibliography

• G. Bodard, 'Digital Epigraphy and Lexicographical and Onomastic Markup', in (edd. Aitken, Fraser, Thompson)

Ancient Greek Lexicography: Electronic Databanks and the design of new dictionaries, Cardiff: University Press

of Wales, (forthcoming 2007).

• G. Bodard / Ch. Roueché, 'The Epidoc Aphrodisias Pilot Project', Forum Archaeologiae 23/VI/2002, online at

http://farch.net (available: 2006-04-07)

• J. Flanders / C. Roueché, 'Introduction for Epigraphers', online at http://epidoc.sf.net/IntroEpigraphers.shtml

(available: 2006-04-25)

• A. Mahoney, 'Epigraphy', in (edd. Burnard, O'Brian, Unsworth) Electronic Textual Editing (2006), preview online

at http://www.tei-c.org/Activities/ETE/Preview/mahoney.xml (available: 2006-04-07)

See also

• Leiden Conventions

• Epigraphy

• Text Encoding Initiative

• Digital Classicist

References

[1] http://epidoc.sourceforge.net/

[2] http://lsv.uky.edu/archives/markup.html

[3] http://www.stoa.org/epidoc/gl/5/

[4] http://sourceforge.net/projects/epidoc

[5] http://www.stoa.org/projects/epidoc/stable/chetc-js/chetc.html

[6] http://concordia.atlantides.org/

[7] http://insaph.kcl.ac.uk/

[8] http://usepigraphy.brown.edu/

[9] http://vindolanda.csad.ox.ac.uk/

[10] http://etp.classics.umass.edu/

eXtensible Server Pages

eXtensible Server Pages (XSP) is an XML-based language that allows Java code to be embedded in XML documents

to generate content dynamically.

It was developed by the Apache Software Foundation for the Web Publishing Framework Cocoon. The focus of XSP

is the separation of content, logic and presentation. The Java program code is placed in its own XML section,

which can occur either within or outside of the root element.

The Java code is compiled on the first call. These directives are replaced by the generated content so that the

resulting, augmented XML document can be subject to further processing with XSL Transformations.

XSP pages are transformed into Cocoon producers, typically as Java classes, though any scripting language for

which a Java-based processor exists could also be used.

Directives can be either XSP built-in processing tags or user-defined library tags. XSP built-in tags are used to

embed procedural logic, substitute expressions and dynamically build XML nodes. User-defined library tags act as

templates that dictate how program code is generated from information encoded in each dynamic tag.

External links

• Cocoon XSP 2.1 [1]

• XSP 1.x - Working Draft [2]

References

[1] http://cocoon.apache.org/2.1/userdocs/xsp/logicsheet.html

[2] http://cocoon.apache.org/1.x/wd-xsp.html

Fast Infoset

Fast Infoset (or FI) is an international standard that specifies a binary encoding format for the XML Information Set

(XML Infoset) as an alternative to the XML document format. It aims to provide more efficient serialization than the

text-based XML format.

One can think of FI as gzip for XML, though FI aims to optimize both document size and processing performance,

whereas gzip optimizes only the size. While the original formatting is lost, no information is lost in the conversion

from XML to FI and back to XML.

The Fast Infoset specification is defined by both the ITU-T and the ISO standards bodies. FI is officially named

ITU-T Rec. X.891 and ISO/IEC 24824-1 (Fast Infoset), respectively. However, it is commonly referred to by the

name Fast Infoset. The standard was published by ITU-T on May 14, 2005, and by ISO on May 4, 2007.

The Fast Infoset standard can be downloaded from the ITU website at [1]. There are no intellectual property

restrictions on its implementation and use.

A common misconception is that FI requires ASN.1 tool support. Although the formal specification uses ASN.1

formalisms, ASN.1 tools are not required by implementations.

Structure

The underlying file format is ASN.1, with tag/length/value blocks. Text values of attributes and elements are therefore

stored with length prefixes rather than end delimiters, so there is no need to escape special characters. There is also

no need for any end tags, and binary data need not be base64 encoded.
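
Why length prefixes remove the need for escaping can be seen in a toy encoder. The layout below (a 4-byte big-endian length followed by UTF-8 bytes) and the function names are assumptions made for illustration; the actual Fast Infoset octet layout is different.

```python
import struct
from xml.sax.saxutils import escape


def encode_length_prefixed(text: str) -> bytes:
    """Store a 4-byte big-endian length followed by the UTF-8 bytes (toy layout)."""
    data = text.encode("utf-8")
    return struct.pack(">I", len(data)) + data


def decode_length_prefixed(buf: bytes, offset: int = 0) -> tuple[str, int]:
    """Read one value and return it with the offset of the next value."""
    (length,) = struct.unpack_from(">I", buf, offset)
    start = offset + 4
    return buf[start:start + length].decode("utf-8"), start + length


value = "a < b && c > d"
stream = encode_length_prefixed(value) + encode_length_prefixed("next")

first, pos = decode_length_prefixed(stream)
second, _ = decode_length_prefixed(stream, pos)

print(first == value, second)
print(escape(value))  # the delimiter-based text form needs escaping instead
```

The reader never searches for a closing delimiter, so characters such as "<" and "&" can be stored as they are.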

Although ASN.1 is used for storage, Fast Infoset is a higher level protocol built upon it. In particular, element and

attribute names are stored within the octet stream, unlike raw ASN.1. This means that it is possible to recover a

conventional XML file from the binary stream without the need to reference any XML Schema. It does not attempt

to convert an XML Schema directly into an ASN.1 definition. (ASN.1 "Tags" are just type names, e.g. String,

Integer, or complex types.)

An index table is built for most strings, which includes element and attribute names, and their values. This means

that the text of repeated tags and values only appears once per document. The details are complex.
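
The effect of such an index table can be sketched as follows. This is a simplified illustration, not the table mechanism defined by the standard; the function names `build_index` and `expand` are hypothetical.

```python
def build_index(strings):
    """Replace repeated strings by references into a table of unique strings."""
    table = []      # each distinct string is stored once, in first-seen order
    positions = {}  # string -> index in table
    refs = []       # one table index per input string
    for s in strings:
        if s not in positions:
            positions[s] = len(table)
            table.append(s)
        refs.append(positions[s])
    return table, refs


def expand(table, refs):
    """Recover the original sequence from the table and the references."""
    return [table[i] for i in refs]


names = ["item", "type", "item", "type", "item", "price"]
table, refs = build_index(names)
print(table)
print(refs)
```

Six names are reduced to three stored strings plus six small integers, and `expand` recovers the original sequence without loss.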

Implementations

Reference implementation

A Java implementation [2] of the FI specification is available as part of the GlassFish project. The library is open

source and is distributed under the terms of the Apache License 2.0. Several projects use this implementation,

including the reference implementation for JAX-RPC and JAX-WS used in JWSDP.

Alternative implementations

The OSS Fast Infoset Tools [3] are designed for use with applications written in C or C++.

Liquid Technologies [4] provides both C++ and C# .NET implementations of Fast Infoset with its XML Data

Binding product Liquid XML.

Applied Informatics [5] provides a C++ implementation [6] of Fast Infoset based on the POCO C++ Libraries.

FastInfoset.NET [7] is a C# implementation for the .NET Framework. It is licensed under a proprietary licence.

The XIOT [8] library has parts of Fast Infoset implemented to read and write compressed binary X3D files. It is

licensed under LGPL.

Performance

Because Fast Infosets are compressed as part of the XML generation process, they are much faster than using

Zip-style compression algorithms on an XML stream, although they can produce slightly larger files.

SAX-type parsing performance of Fast Infoset is also much faster than parsing performance of XML 1.0, even

without any Zip-style compression. Typical increases in parsing speed observed for the reference Java

implementation are a factor of 10 compared to Java Xerces, and a factor of 4 compared to the Piccolo driver [9] (one

of the fastest Java-based XML parsers). [10] [11] [12]

Typical applications

Portable Devices - Mobile devices typically have access only to low-bandwidth data connections and have slower

CPUs. This can make Fast Infoset a better choice, lowering both data transmission and data processing times.

Persisting Large Volumes of Data - When persisting XML either to file or a database, the volume of data your

system produces can often get out of hand. This has a number of detrimental effects; the access times go up as you're

reading more data, CPU load goes up as XML data takes more effort to process, and your storage costs go up. By

persisting your XML data in Fast Infoset format, it is possible to reduce the data volume by up to 80 percent.

Passing XML via the internet - As soon as an application starts passing information over the internet, one of the

main bottlenecks is bandwidth. If you send reasonable chunks of data, this bottleneck can seriously degrade the

performance of your client applications and limit your server's ability to process requests. Reducing the amount of

data moving across the internet reduces the time it takes a message to be sent or received, while increasing the

number of transactions a server can process per hour.

See also

• Binary XML

• EXI

• X3D

External links

• A heavy technical description on Sun [13]

• FastInfoset.NET home page [7]

• FI project home page [14]

• Fast Infoset page at the ASN.1 site [15]

• OSS Fast Infoset Tools page [3]

• Free download of the Fast Infoset standard (ITU-T Rec. X.891) from the ITU Web site [1]

• Free download of the Fast Infoset standard (ISO/IEC 24824-1:2007) from ISO Freely Available Standards [16]

References

[1] http://www.itu.int/rec/T-REC-X.891-200505-I/en

[2] https://fi.dev.java.net/

[3] http://www.oss.com/xml/products/fi.html

[4] http://www.liquid-technologies.com/Product_XmlCompression.aspx

[5] http://www.appinf.com/

[6] http://www.appinf.com/en/products/fis.html

[7] http://www.noemax.com/products/fastinfoset/index.html

[8] http://forge.collaviz.org/community/xiot

[9] http://piccolo.sourceforge.net/

[10] "Fast Infoset performance reports" (https://fi.dev.java.net/performance.html). 2005-10-06. Retrieved 2007-10-11.

[11] "Japex Report: ParsingPerformance" (https://fi.dev.java.net/reports/parsing/report.html). 2005-01-10. Retrieved 2007-10-11.

[12] "Japex Report: SizePerformance" (https://fi.dev.java.net/reports/size/report.html). 2005-01-10. Retrieved 2007-10-11.

[13] http://java.sun.com/developer/technicalArticles/xml/fastinfoset/

[14] http://fi.dev.java.net/

[15] http://asn1.elibel.tm.fr/xml/finf.htm

[16] http://standards.iso.org/ittf/PubliclyAvailableStandards/index.html

Global listings format

Global listings format (GLF) refers to metadata for transferring program guide information and multimedia

information. It is coded in XML format.

GMX

GMX [1] (global mail exchange) is also the name of a German company with an international webmail product

GMX Mail.

GMX is a collection of current and proposed standards, primarily targeted at the needs of the translation industry,

although they can also be used for other purposes. They are concerned with quantitatively measuring aspects of a

document, particularly those with relevance to the translation process (e.g. word counts, complexity). The primary

use cases are in quoting, estimating and billing translation work.

GMX-V is the first of the three standards to be completed. Work will commence in 2007 on GMX-Q and GMX-C.

Quality (GMX-Q) will deal with the level of quality required for a task. For example, the quality required for the

translation of a legal document is much higher than that for technical documentation that will have a relatively small

audience. Complexity (GMX-C) will take into consideration the source and format of the original document and its

subject matter. For example, a highly complex document dealing with a specific, narrow domain is far more complex to

translate than user instructions for a simple consumer device.

GMX-V forms part of the Open Architecture for XML Authoring and Localization (OAXAL) reference architecture.

References

[1] http://www.gmx.net/

GMX-V

GMX-V (Global Information Management Metrics eXchange - Volume: Word and Character Count Standard) is a

word and character count standard for electronic documents. GMX-V is developed and maintained by OSCAR [1]

(Open Standards for Container/Content Allowing Re-use), a special interest group of LISA [2] (Localization Industry

Standards Association).

GMX-V is one of the tripartite series of standards from the Localization Industry Standards Association (LISA).

GMX-V deals with electronic document metrics.

GMX is made up of the following standards:

• GMX-V - Volume

• GMX-C - Complexity

• GMX-Q - Quality

GMX-V forms part of the Open Architecture for XML Authoring and Localization (OAXAL) reference architecture.

Scope and Primary Goal

GMX-V is designed to fulfill two primary roles:

• Establish a verifiable way of calculating the primary word and character counts for a given electronic document.

• Establish a specific XML vocabulary that enables the automatic exchange of metric data.

Description

GMX-V is itself based on other well established standards:

• Unicode 5.0 normalized form

• Unicode Technical Report 29 – Text Boundaries

• OASIS XML Localization Interchange File Format (XLIFF) 1.2

• LISA OSCAR Segmentation Rules Exchange (SRX) 2.0
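
The kind of measurement involved can be sketched roughly in Python. This is not a GMX-V conformant counter: it normalizes the text to NFC and then approximates word boundaries with a regular expression rather than the Unicode Technical Report 29 rules, and counting every non-whitespace character is an assumption made for illustration. The function name `rough_counts` is hypothetical.

```python
import re
import unicodedata


def rough_counts(text: str) -> dict:
    """Approximate word and character counts (not the GMX-V algorithm)."""
    normalized = unicodedata.normalize("NFC", text)
    # A word is a run of word characters, optionally joined by an
    # apostrophe or hyphen (a crude stand-in for real boundary rules).
    words = re.findall(r"\w+(?:['\u2019-]\w+)*", normalized)
    # Assumption: every non-whitespace character counts, punctuation included.
    characters = sum(1 for ch in normalized if not ch.isspace())
    return {"words": len(words), "characters": characters}


counts = rough_counts("The translator's fee is word-based.")
print(counts)
```

Two tools that disagree on details such as hyphenated words or punctuation will report different totals for the same text, which is the problem a shared counting standard addresses.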

External links

• GMX-V page on the LISA OSCAR web site [3]

• GMX-V specification [4]

References

[1] OSCAR (http://www.lisa.org/sigs/oscar/) - Open Standards for Container/Content Allowing Re-use

[2] LISA (http://www.lisa.org/index.html) - Localization Industry Standards Association

[3] http://www.lisa.org/Global-information-m.104.0.html

[4] http://www.lisa.org/fileadmin/standards/GMX-V.html

Head-Body Pattern

The Head-Body Pattern is a common XML design pattern, used for example in the SOAP protocol. This pattern is

useful when a message, or parcel of data, requires considerable metadata. Mixing the metadata with the data

is possible, but it makes the whole confusing. In this pattern the metadata or meta-information is structured as the

header, sometimes known as the envelope. The ordinary data or information are structured as the body, sometimes

known as the payload. XML is employed for both head and body.
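
A minimal sketch of the pattern, using Python's standard library, is given here. The element names ("message", "head", "body") and the sample fields are hypothetical and are not taken from SOAP or any other specification.

```python
import xml.etree.ElementTree as ET


def make_message(metadata: dict, payload: dict) -> ET.Element:
    """Build <message><head>...</head><body>...</body></message>."""
    message = ET.Element("message")
    head = ET.SubElement(message, "head")   # metadata about the message
    for name, value in metadata.items():
        ET.SubElement(head, name).text = value
    body = ET.SubElement(message, "body")   # the payload itself
    for name, value in payload.items():
        ET.SubElement(body, name).text = value
    return message


msg = make_message(
    {"sender": "warehouse", "sent": "2010-06-17"},
    {"sku": "A-100", "quantity": "12"},
)
print(ET.tostring(msg, encoding="unicode"))

# A receiver can route on the head without touching the body.
sender = msg.findtext("head/sender")
print(sender)
```

Keeping the two parts in separate subtrees lets intermediaries read or rewrite the head while passing the body through unchanged.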

HyTime

HyTime (Hypermedia/Time-based Structuring Language) is a markup language that is an "application" of SGML.

HyTime defines a set of hypertext-oriented element types that, in effect, supplement SGML and allow SGML

document authors to build hypertext and multimedia presentations in a standardized way.

HyTime is an international standard published by the ISO and IEC. The first edition was published in 1992, and the

second edition was published in 1997.

Legacy

Some of the concepts formalized in HyTime were later incorporated into HTML and XML:

• HTML is an application of SGML for hypertext document presentations, that assigns specific semantics and

processing expectations to a fixed set of element types.

• XML defines a simplified subset of SGML that focuses on providing an open vocabulary of element types for data

modeling and establishes precise expectations for how the marked-up data is read and subsequently fed to another

software application for further processing, but does not assign semantics to the element types or establish

expectations for how the data is processed.

Standard

The HyTime standard itself is ISO/IEC 10744, first published in 1992 and available from the International

Organization for Standardization. It was developed by ISO/IEC JTC1/SC34 (ISO/IEC Joint Technical Committee 1,

Subcommittee 34 - Document description and processing languages). [1] [2]

Further reading

• Steven DeRose and David Durand, "Making Hypermedia Work: A User's Guide to HyTime," Kluwer Academic

Publishers 1994 (ISBN 0-7923-9432-1).

External links

• ISO/IEC 10744:1992 - Information technology -- Hypermedia/Time-based Structuring Language (HyTime) [3]

• Robin Cover's HyTime resource list [4]

• ISO/IEC 10744 Amendment 1 [5] - an amendment to ISO/IEC 10744:1997 Annex A.3

• Standards: HyTime: A standard for structured hypermedia interchange [6] by Charles Goldfarb, from IEEE

Computer magazine, vol. 24, iss. 8 (Aug. 1991), pp. 81–84

• A Brief History of the Development of SMDL and HyTime [7]


References

[1] ISO. "JTC 1/SC 34 - Document description and processing languages" (http://www.iso.org/iso/iso_technical_committee.html?commid=45374). ISO. Retrieved 2009-12-25.

[2] ISO JTC1/SC34. "JTC 1/SC 34 - Document Description and Processing Languages" (http://www.itscj.ipsj.or.jp/sc34/). Retrieved 2009-12-25.

[3] http://www.iso.org/iso/iso_catalogue/catalogue_ics/catalogue_detail_ics.htm?csnumber=18834

[4] http://xml.coverpages.org/hytime.html

[5] http://www.y12.doe.gov/sgml/wg8/document/1957.htm

[6] http://ieeexplore.ieee.org/iel1/2/2778/00084880.pdf?tp=&arnumber=84880&isnumber=2778

[7] http://www.sgmlsource.com/history/hthist.htm

Internationalization Tag Set

The Internationalization Tag Set (ITS) [1] is a set of attributes and elements designed to provide

internationalization and localization support in XML documents.

The ITS specification identifies concepts (called "ITS data categories") which are important for internationalization

and localization. It also defines implementations of these concepts through a set of elements and attributes grouped

in the ITS namespace. XML developers can use this namespace to integrate internationalization features directly into

their own XML schemas and documents.

Overview

ITS v1.0 includes seven data categories:

• Translate: Defines what parts of a document are translatable or not.

• Localization Note: Provides alerts, hints, instructions, and other information to help the localizers or the

translators.

• Terminology: Indicates parts of the documents that are terms and optionally pointers to information about these

terms.

• Directionality: Indicates what type of display directionality should be applied to parts of the document.

• Ruby: Indicates what parts of the document should be displayed as ruby text. (Ruby is a short run of text

alongside a base text, typically used in East Asian documents to indicate pronunciation or to provide a brief

annotation).

• Language Information: Identifies the language of the different parts of the document.

• Elements Within Text: Indicates how elements should be treated with regard to linguistic segmentation.

The vocabulary is designed to work on two fronts: first, by providing markup usable directly in XML documents; second, by offering a way to indicate that parts of a given markup correspond to some of the ITS data categories and should be treated as such by ITS processors.

ITS applies to new document types as well as existing ones. It applies both to markup without any internationalization features and to the class of documents that already support some internationalization or localization-related functions.

ITS can be specified using global rules and local rules.

• The global rules are expressed anywhere in the document (embedded global rules), or even outside the document

(external global rules), using the its:rules element.

• The local rules are expressed by specialized attributes (and sometimes elements) specified inside the document

instance, at the location where they apply.


Examples

Example of ITS markup for the Translate data category:

The elements and attributes with the its prefix are part of the ITS namespace. The its:rules element lists the different rules to apply to this file. There is one its:translateRule rule, which indicates that any content inside the head element should not be translated.

The its:translate attributes used on some elements override the global rule: here they make the content of title translatable and the text "faux pas" non-translatable.

Sep-10-2006 v5

Ealasaidh McIan

ealasaidh@hogw.ac.uk

The Origins of Modern Novel

Introduction

It would certainly be quite a faux

pas to start a dissertation on the origin of modern novel without

mentioning the Epic of Gilgamesh...
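How a global rule and local overrides combine can be sketched in Python. This is a minimal illustration under stated assumptions, not an ITS processor: apart from head and title, the element names (text, author, body, p, span) are hypothetical, and the sketch only understands selectors of the form "//name".

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
ITS_TRANSLATE = f"{{{ITS_NS}}}translate"

# Hypothetical document; only "head", "title" and the its:* names come from ITS usage above.
DOC = """<text xmlns:its="http://www.w3.org/2005/11/its">
 <head>
  <its:rules version="1.0">
   <its:translateRule selector="//head" translate="no"/>
  </its:rules>
  <author>Ealasaidh McIan</author>
  <title its:translate="yes">The Origins of Modern Novel</title>
 </head>
 <body>
  <p>It would certainly be quite a <span its:translate="no">faux pas</span> to start a dissertation...</p>
 </body>
</text>"""


def translatable_text(xml_source):
    """Return (text, translatable) pairs in document order."""
    root = ET.fromstring(xml_source)

    # Global rules: remember the flag for every element a selector matches.
    global_flags = {}
    for rule in root.iter(f"{{{ITS_NS}}}translateRule"):
        selector = rule.get("selector")
        if not selector.startswith("//"):
            raise ValueError("this sketch only handles '//name' selectors")
        for element in root.findall("." + selector):
            global_flags[element] = rule.get("translate") == "yes"

    results = []

    def walk(element, inherited):
        # Precedence: inherited value < global rule < local its:translate.
        flag = global_flags.get(element, inherited)
        local = element.get(ITS_TRANSLATE)
        if local is not None:
            flag = local == "yes"
        if element.text and element.text.strip():
            results.append((element.text.strip(), flag))
        for child in element:
            walk(child, flag)
            # Text after a child belongs to the current element.
            if child.tail and child.tail.strip():
                results.append((child.tail.strip(), flag))

    walk(root, True)  # content is translatable unless stated otherwise
    return results


segments = translatable_text(DOC)
for text, translatable in segments:
    print("yes" if translatable else "no ", text)
```

The author text inherits "no" from head, the title is switched back to "yes" by its local attribute, and "faux pas" is switched to "no" inside an otherwise translatable paragraph.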

Example of ITS markup for the Localization Note data category:

The its:locNoteRule element specifies that any node matching the XPath expression "//msg/data" has an associated note. The location of that note is given by the locNotePointer attribute, which holds a relative XPath expression pointing to the node where the note is, here "../notes".

Note also the use of the its:translate attribute to mark the notes elements as non-translatable.


A division by 0 was going to be computed.

Invalid parameter.
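The pointer mechanism described for this example can be sketched in Python. The sketch rests on several assumptions: the wrapper elements (Res, prolog, body), the id values and the note texts are hypothetical, and only "//..." selectors and "../name" pointers are handled rather than full XPath.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"

# Hypothetical resource file; note texts are invented for the sketch.
RESOURCES = """<Res xmlns:its="http://www.w3.org/2005/11/its">
 <prolog>
  <its:rules version="1.0">
   <its:locNoteRule locNoteType="description"
                    selector="//msg/data" locNotePointer="../notes"/>
  </its:rules>
 </prolog>
 <body>
  <msg id="DivByZero">
   <data>A division by 0 was going to be computed.</data>
   <notes its:translate="no">Hypothetical note: shown after a divide-by-zero.</notes>
  </msg>
  <msg id="InvalidParam">
   <data>Invalid parameter.</data>
   <notes its:translate="no">Hypothetical note: shown for a bad argument.</notes>
  </msg>
 </body>
</Res>"""


def localization_notes(xml_source):
    """Map each selected node's text to the text of its associated note."""
    root = ET.fromstring(xml_source)
    # ElementTree has no parent links, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    notes = {}
    for rule in root.iter(f"{{{ITS_NS}}}locNoteRule"):
        selector = rule.get("selector")
        pointer = rule.get("locNotePointer")
        if not (selector.startswith("//") and pointer.startswith("../")):
            raise ValueError("sketch handles only '//...' and '../name'")
        for node in root.findall("." + selector):
            # "../notes" means: go to the parent, then to its notes child.
            note = parents[node].find(pointer[3:])
            if note is not None:
                notes[node.text] = note.text
    return notes


notes_by_message = localization_notes(RESOURCES)
for message, note in notes_by_message.items():
    print(message, "->", note)
```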

ITS limitations

ITS does not solve every XML internationalization and localization issue.

One reason is that version 1.0 does not have data categories for everything. For example, there is currently no way to indicate a source/target relation in bilingual files, where some parts of a document store the source text and other parts the corresponding translation.

The other reason is that many aspects of internationalization cannot be resolved with markup. They have to do with the design of the DTD or the schema itself. There are best practices and design and authoring guidelines [2] that must be followed to make sure documents are correctly internationalized and easy to localize. For example, using attributes to store translatable text is a bad idea for many reasons, but ITS cannot prevent an XML developer from making such a choice.

External links

• Internationalization Tag Set (ITS) Version 1.0 [3]

• W3C Internationalization Home [4]

• Best Practices for XML Internationalization (Working Draft) [5]

• List of ITS implementations and articles about ITS [6]

References

[1] http://www.w3.org/TR/its/

[2] http://www.w3.org/TR/xml-i18n-bp/

[3] http://www.w3.org/TR/2007/REC-its-20070403/

[4] http://www.w3.org/International/

[5] http://www.w3.org/TR/2007/WD-xml-i18n-bp-20070427/

[6] http://www.w3.org/International/its/links.html


Klip

A Klip is an XML file containing markup, styles and JavaScript that provide the Klipfolio desktop dashboard platform with rules for the retrieval, interpretation, and presentation of arbitrary information sources such as web pages, RSS feeds, and proprietary XML back-ends. The Klip file extension is ".klip".

When opened in Klipfolio, a Klip is rendered as a small window that displays text

and image content. The size, position and visibility of the Klip on-screen is managed by the user. Settings particular

to each Klip can be found in a "Klip Setup" dialog.

Klips are generally considered to be widgets, and KlipFolio a widget engine. Thousands of different Klips are available as free downloads at Klipfolio.com [1]. Klips provide all manner of information, such as weather conditions, news headlines and stock quotes. The consumer version of KlipFolio is freeware and can be downloaded, installed, and used by anyone.

Example usage

This very simple example can be written using a plain text or XML editor.

My Klip

Your Description here....

The author of the Klip

15 keywords maximum to upload to KlipFarm the Klip directory

http://mydomain.com/myxml.xml

http://mydomain.com/myicon.jpg

http://mydomain.com/mybanner.gif

15

Saving it as first.klip will allow you to open it using KlipFolio.

Note: in several Eastern European countries (e.g. Czechia, Lithuania, Poland, Serbia, Slovakia), "klip" is also the ordinary word for "clip".


See also

• KlipFolio

• Serence

Links

• KlipFolio Homepage [1]

References

[1] http://www.klipfolio.com/


List of XML and HTML character entity references

In SGML, HTML and XML documents, the logical constructs known as character data and attribute values consist

of sequences of characters, in which each character can manifest directly (representing itself), or can be represented

by a series of characters called a character reference, of which there are two types: a numeric character reference

and a character entity reference. This article lists the character entity references that are valid in HTML and XML

documents.

Character reference overview

A numeric character reference refers to a character by its Universal Character Set/Unicode code point, and uses the format

&#nnnn;

or

&#xhhhh;

where nnnn is the code point in decimal form, and hhhh is the code point in hexadecimal form. The x must be

lowercase in XML documents. The nnnn or hhhh may be any number of digits and may include leading zeros. The

hhhh may mix uppercase and lowercase, though uppercase is the usual style.
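These rules can be checked with a short Python sketch that uses the standard library XML parser. The helper names numeric_references and parses_as_xml are hypothetical, introduced only for this illustration.

```python
import xml.etree.ElementTree as ET


def numeric_references(char):
    """Return the decimal and hexadecimal references for one character."""
    code_point = ord(char)
    return f"&#{code_point};", f"&#x{code_point:X};"


def parses_as_xml(fragment):
    """Report whether an XML parser accepts the fragment inside a <p> element."""
    try:
        ET.fromstring(f"<p>{fragment}</p>")
        return True
    except ET.ParseError:
        return False


decimal_ref, hex_ref = numeric_references("\u20ac")  # the euro sign, U+20AC

# Leading zeros and mixed-case hex digits still name the same code point.
decoded = ET.fromstring(f"<p>{decimal_ref}{hex_ref}&#x0020aC;</p>").text

print(decimal_ref, hex_ref)
print([f"U+{ord(c):04X}" for c in decoded])
# A lowercase x is accepted in XML; an uppercase X is not.
print(parses_as_xml("&#x20AC;"), parses_as_xml("&#X20AC;"))
```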

In contrast, a character entity reference refers to a character by the name of an entity which has the desired character

as its replacement text. The entity must either be predefined (built-in to the markup language) or explicitly declared

in a Document Type Definition (DTD). The format is the same as for any entity reference:

&name;

where name is the name of the entity. The semicolon is required.
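The difference between a built-in and a declared entity can be sketched in Python. The example assumes the standard html.entities table for HTML 4 names and declares eacute in an internal DTD subset for the XML case; the sample strings and the helper resolve_html_entity are hypothetical.

```python
import html
import xml.etree.ElementTree as ET
from html.entities import name2codepoint


def resolve_html_entity(name):
    """Look up an HTML 4 entity name and return the character it stands for."""
    return chr(name2codepoint[name])


e_acute = resolve_html_entity("eacute")  # U+00E9
decoded_html = html.unescape("caf&eacute; &amp; th&eacute;")

# An XML parser only knows a name such as "eacute" if the DTD declares it.
undeclared = "<p>caf&eacute;</p>"
declared = '<!DOCTYPE p [<!ENTITY eacute "&#233;">]><p>caf&eacute;</p>'

try:
    ET.fromstring(undeclared)
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False

declared_text = ET.fromstring(declared).text

print(f"U+{ord(e_acute):04X}", undeclared_ok, declared_text == decoded_html[:4])
```

The same reference that HTML resolves from its standard DTDs is an error in XML until the document's own DTD supplies the replacement text.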


Predefined entities in XML

The XML specification does not use the term "character entity" or "character entity reference". The XML

specification defines five "predefined entities" representing special characters, and requires that all XML processors

honor them. The entities can be explicitly declared in a DTD, as well, but if this is done, the replacement text must

be the same as the built-in definitions. XML also allows other named entities of any size to be defined on a

per-document basis.

The table below lists the five XML predefined entities. The "Name" column mentions the entity's name. The

"Character" column shows the character, if it is renderable. In order to render the character, the format &name; is

used; for example, &amp; renders as &. The "Unicode code point" column cites the character via standard

UCS/Unicode "U+" notation, which shows the character's code point in hexadecimal. The decimal equivalent of the

code point is then shown in parentheses. The "Standard" column indicates the first version of XML that includes the

entity. The "Description" column cites the character via its canonical UCS/Unicode name, in English.

Name Character Unicode code point (decimal) Standard Description

quot " U+0022 (34) XML 1.0 (double) quotation mark

amp & U+0026 (38) XML 1.0 ampersand

apos ' U+0027 (39) XML 1.0 apostrophe (= apostrophe-quote)

lt < U+003C (60) XML 1.0 less-than sign

gt > U+003E (62) XML 1.0 greater-than sign
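A round trip through the five predefined entities can be sketched with Python's xml.sax.saxutils helpers. The sample string and the two mapping names are hypothetical; escape() and unescape() handle &, < and > on their own and take the quote characters as an optional mapping.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

# escape() always handles &, < and >; the two quote characters are opt-in.
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}
QUOTE_CHARACTERS = {"&quot;": '"', "&apos;": "'"}

raw = 'Tom & Jerry\'s <b>"best"</b> episode'
escaped = escape(raw, QUOTE_ENTITIES)

# Both a full XML parser and unescape() recover the original text.
parsed_back = ET.fromstring(f"<t>{escaped}</t>").text
unescaped = unescape(escaped, QUOTE_CHARACTERS)

print(escaped)
print(parsed_back == raw, unescaped == raw)
```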

Character entity references in HTML

The HTML 4 DTDs define 252 named entities, references to which act as mnemonic aliases for certain Unicode

characters. The HTML 4 specification requires the use of the standard DTDs and does not allow users to define

additional entities.

In the table below, the "Standard" column indicates the first version of the HTML DTD that defines the character

entity reference. HTML 4.01 did not provide any new character references.

Name Character Unicode code point (decimal) Standard DTD Old ISO subset Description

quot " U+0022 (34) HTML 2.0 HTMLspecial ISOnum quotation mark (= APL quote)

amp & U+0026 (38) HTML 2.0 HTMLspecial ISOnum ampersand

apos ' U+0027 (39) XHTML 1.0 HTMLspecial ISOnum apostrophe (= apostrophe-quote); see below

lt < U+003C (60) HTML 2.0 HTMLspecial ISOnum less-than sign

gt > U+003E (62) HTML 2.0 HTMLspecial ISOnum greater-than sign

nbsp U+00A0 (160) HTML 3.2 HTMLlat1 ISOnum no-break space (= non-breaking space) spaces

iexcl ¡ U+00A1 (161) HTML 3.2 HTMLlat1 ISOnum inverted exclamation mark

cent ¢ U+00A2 (162) HTML 3.2 HTMLlat1 ISOnum cent sign

pound £ U+00A3 (163) HTML 3.2 HTMLlat1 ISOnum pound sign

curren ¤ U+00A4 (164) HTML 3.2 HTMLlat1 ISOnum currency sign

yen ¥ U+00A5 (165) HTML 3.2 HTMLlat1 ISOnum yen sign (= yuan sign)

brvbar ¦ U+00A6 (166) HTML 3.2 HTMLlat1 ISOnum broken bar (= broken vertical bar)


sect § U+00A7 (167) HTML 3.2 HTMLlat1 ISOnum section sign

uml ¨ U+00A8 (168) HTML 3.2 HTMLlat1 ISOdia diaeresis (= spacing diaeresis); see German umlaut

copy © U+00A9 (169) HTML 3.2 HTMLlat1 ISOnum copyright sign

ordf ª U+00AA (170) HTML 3.2 HTMLlat1 ISOnum feminine ordinal indicator

laquo « U+00AB (171) HTML 3.2 HTMLlat1 ISOnum left-pointing double angle quotation mark (= left pointing guillemet)

not ¬ U+00AC (172) HTML 3.2 HTMLlat1 ISOnum not sign

shy U+00AD (173) HTML 3.2 HTMLlat1 ISOnum soft hyphen (= discretionary hyphen)

reg ® U+00AE (174) HTML 3.2 HTMLlat1 ISOnum registered sign ( = registered trade mark sign)

macr ¯ U+00AF (175) HTML 3.2 HTMLlat1 ISOdia macron (= spacing macron = overline = APL overbar)

deg ° U+00B0 (176) HTML 3.2 HTMLlat1 ISOnum degree sign

plusmn ± U+00B1 (177) HTML 3.2 HTMLlat1 ISOnum plus-minus sign (= plus-or-minus sign)

sup2 ² U+00B2 (178) HTML 3.2 HTMLlat1 ISOnum superscript two (= superscript digit two = squared)

sup3 ³ U+00B3 (179) HTML 3.2 HTMLlat1 ISOnum superscript three (= superscript digit three = cubed)

acute ´ U+00B4 (180) HTML 3.2 HTMLlat1 ISOdia acute accent (= spacing acute)

micro µ U+00B5 (181) HTML 3.2 HTMLlat1 ISOnum micro sign

para ¶ U+00B6 (182) HTML 3.2 HTMLlat1 ISOnum pilcrow sign (= paragraph sign)

middot · U+00B7 (183) HTML 3.2 HTMLlat1 ISOnum middle dot (= Georgian comma = Greek middle dot)

cedil ¸ U+00B8 (184) HTML 3.2 HTMLlat1 ISOdia cedilla (= spacing cedilla)

sup1 ¹ U+00B9 (185) HTML 3.2 HTMLlat1 ISOnum superscript one (= superscript digit one)

ordm º U+00BA (186) HTML 3.2 HTMLlat1 ISOnum masculine ordinal indicator

raquo » U+00BB (187) HTML 3.2 HTMLlat1 ISOnum right-pointing double angle quotation mark (= right pointing guillemet)

frac14 ¼ U+00BC (188) HTML 3.2 HTMLlat1 ISOnum vulgar fraction one quarter (= fraction one quarter)

frac12 ½ U+00BD (189) HTML 3.2 HTMLlat1 ISOnum vulgar fraction one half (= fraction one half)

frac34 ¾ U+00BE (190) HTML 3.2 HTMLlat1 ISOnum vulgar fraction three quarters (= fraction three quarters)

iquest ¿ U+00BF (191) HTML 3.2 HTMLlat1 ISOnum inverted question mark (= turned question mark)

Agrave À U+00C0 (192) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter A with grave (= Latin capital letter A grave)

Aacute Á U+00C1 (193) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter A with acute

Acirc Â U+00C2 (194) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter A with circumflex

Atilde Ã U+00C3 (195) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter A with tilde

Auml Ä U+00C4 (196) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter A with diaeresis

Aring Å U+00C5 (197) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter A with ring above (= Latin capital letter A ring)

AElig Æ U+00C6 (198) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter AE (= Latin capital ligature AE)

Ccedil Ç U+00C7 (199) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter C with cedilla

Egrave È U+00C8 (200) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter E with grave

Eacute É U+00C9 (201) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter E with acute

Ecirc Ê U+00CA (202) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter E with circumflex

Euml Ë U+00CB (203) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter E with diaeresis

Igrave Ì U+00CC (204) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter I with grave

Iacute Í U+00CD (205) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter I with acute

Icirc Î U+00CE (206) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter I with circumflex

Iuml Ï U+00CF (207) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter I with diaeresis

ETH Ð U+00D0 (208) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter ETH

Ntilde Ñ U+00D1 (209) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter N with tilde

Ograve Ò U+00D2 (210) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter O with grave

Oacute Ó U+00D3 (211) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter O with acute

Ocirc Ô U+00D4 (212) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter O with circumflex

Otilde Õ U+00D5 (213) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter O with tilde

Ouml Ö U+00D6 (214) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter O with diaeresis

times × U+00D7 (215) HTML 3.2 HTMLlat1 ISOnum multiplication sign

Oslash Ø U+00D8 (216) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter O with stroke (= Latin capital letter O slash)

Ugrave Ù U+00D9 (217) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter U with grave

Uacute Ú U+00DA (218) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter U with acute

Ucirc Û U+00DB (219) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter U with circumflex

Uuml Ü U+00DC (220) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter U with diaeresis

Yacute Ý U+00DD (221) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter Y with acute

THORN Þ U+00DE (222) HTML 2.0 HTMLlat1 ISOlat1 Latin capital letter THORN

szlig ß U+00DF (223) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter sharp s (= ess-zed); see German Eszett

agrave à U+00E0 (224) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter a with grave

aacute á U+00E1 (225) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter a with acute

acirc â U+00E2 (226) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter a with circumflex

atilde ã U+00E3 (227) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter a with tilde

auml ä U+00E4 (228) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter a with diaeresis

aring å U+00E5 (229) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter a with ring above

aelig æ U+00E6 (230) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter ae (= Latin small ligature ae)

ccedil ç U+00E7 (231) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter c with cedilla

egrave è U+00E8 (232) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter e with grave

eacute é U+00E9 (233) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter e with acute

ecirc ê U+00EA (234) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter e with circumflex


euml ë U+00EB (235) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter e with diaeresis

igrave ì U+00EC (236) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter i with grave

iacute í U+00ED (237) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter i with acute

icirc î U+00EE (238) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter i with circumflex

iuml ï U+00EF (239) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter i with diaeresis

eth ð U+00F0 (240) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter eth

ntilde ñ U+00F1 (241) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter n with tilde

ograve ò U+00F2 (242) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter o with grave

oacute ó U+00F3 (243) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter o with acute

ocirc ô U+00F4 (244) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter o with circumflex

otilde õ U+00F5 (245) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter o with tilde

ouml ö U+00F6 (246) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter o with diaeresis

divide ÷ U+00F7 (247) HTML 3.2 HTMLlat1 ISOnum division sign

oslash ø U+00F8 (248) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter o with stroke (= Latin small letter o slash)

ugrave ù U+00F9 (249) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter u with grave

uacute ú U+00FA (250) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter u with acute

ucirc û U+00FB (251) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter u with circumflex

uuml ü U+00FC (252) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter u with diaeresis

yacute ý U+00FD (253) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter y with acute

thorn þ U+00FE (254) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter thorn

yuml ÿ U+00FF (255) HTML 2.0 HTMLlat1 ISOlat1 Latin small letter y with diaeresis

OElig Œ U+0152 (338) HTML 4.0 HTMLspecial ISOlat2 Latin capital ligature oe ligature

oelig œ U+0153 (339) HTML 4.0 HTMLspecial ISOlat2 Latin small ligature oe ligature

Scaron Š U+0160 (352) HTML 4.0 HTMLspecial ISOlat2 Latin capital letter s with caron

scaron š U+0161 (353) HTML 4.0 HTMLspecial ISOlat2 Latin small letter s with caron

Yuml Ÿ U+0178 (376) HTML 4.0 HTMLspecial ISOlat2 Latin capital letter y with diaeresis

fnof ƒ U+0192 (402) HTML 4.0 HTMLsymbol ISOtech Latin small letter f with hook (= function = florin)

circ ˆ U+02C6 (710) HTML 4.0 HTMLspecial ISOpub modifier letter circumflex accent

tilde ˜ U+02DC (732) HTML 4.0 HTMLspecial ISOdia small tilde

Alpha Α U+0391 (913) HTML 4.0 HTMLsymbol Greek capital letter Alpha

Beta Β U+0392 (914) HTML 4.0 HTMLsymbol Greek capital letter Beta

Gamma Γ U+0393 (915) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Gamma

Delta Δ U+0394 (916) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Delta

Epsilon Ε U+0395 (917) HTML 4.0 HTMLsymbol Greek capital letter Epsilon

Zeta Ζ U+0396 (918) HTML 4.0 HTMLsymbol Greek capital letter Zeta

Eta Η U+0397 (919) HTML 4.0 HTMLsymbol Greek capital letter Eta

Theta Θ U+0398 (920) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Theta

Iota Ι U+0399 (921) HTML 4.0 HTMLsymbol Greek capital letter Iota


Kappa Κ U+039A (922) HTML 4.0 HTMLsymbol Greek capital letter Kappa

Lambda Λ U+039B (923) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Lambda

Mu Μ U+039C (924) HTML 4.0 HTMLsymbol Greek capital letter Mu

Nu Ν U+039D (925) HTML 4.0 HTMLsymbol Greek capital letter Nu

Xi Ξ U+039E (926) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Xi

Omicron Ο U+039F (927) HTML 4.0 HTMLsymbol Greek capital letter Omicron

Pi Π U+03A0 (928) HTML 4.0 HTMLsymbol Greek capital letter Pi

Rho Ρ U+03A1 (929) HTML 4.0 HTMLsymbol Greek capital letter Rho

Sigma Σ U+03A3 (931) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Sigma

Tau Τ U+03A4 (932) HTML 4.0 HTMLsymbol Greek capital letter Tau

Upsilon Υ U+03A5 (933) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Upsilon

Phi Φ U+03A6 (934) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Phi

Chi Χ U+03A7 (935) HTML 4.0 HTMLsymbol Greek capital letter Chi

Psi Ψ U+03A8 (936) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Psi

Omega Ω U+03A9 (937) HTML 4.0 HTMLsymbol ISOgrk3 Greek capital letter Omega

alpha α U+03B1 (945) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter alpha

beta β U+03B2 (946) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter beta

gamma γ U+03B3 (947) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter gamma

delta δ U+03B4 (948) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter delta

epsilon ε U+03B5 (949) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter epsilon

zeta ζ U+03B6 (950) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter zeta

eta η U+03B7 (951) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter eta

theta θ U+03B8 (952) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter theta

iota ι U+03B9 (953) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter iota

kappa κ U+03BA (954) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter kappa

lambda λ U+03BB (955) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter lambda

mu μ U+03BC (956) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter mu

nu ν U+03BD (957) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter nu

xi ξ U+03BE (958) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter xi

omicron ο U+03BF (959) HTML 4.0 HTMLsymbol NEW Greek small letter omicron

pi π U+03C0 (960) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter pi

rho ρ U+03C1 (961) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter rho

sigmaf ς U+03C2 (962) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter final sigma

sigma σ U+03C3 (963) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter sigma

tau τ U+03C4 (964) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter tau

upsilon υ U+03C5 (965) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter upsilon

phi φ U+03C6 (966) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter phi

chi χ U+03C7 (967) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter chi

psi ψ U+03C8 (968) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter psi


omega ω U+03C9 (969) HTML 4.0 HTMLsymbol ISOgrk3 Greek small letter omega

thetasym ϑ U+03D1 (977) HTML 4.0 HTMLsymbol NEW Greek theta symbol

upsih ϒ U+03D2 (978) HTML 4.0 HTMLsymbol NEW Greek Upsilon with hook symbol

piv ϖ U+03D6 (982) HTML 4.0 HTMLsymbol ISOgrk3 Greek pi symbol

ensp U+2002 (8194) HTML 4.0 HTMLspecial ISOpub en space spaces

emsp U+2003 (8195) HTML 4.0 HTMLspecial ISOpub em space spaces

thinsp U+2009 (8201) HTML 4.0 HTMLspecial ISOpub thin space spaces

zwnj U+200C (8204) HTML 4.0 HTMLspecial NEW RFC 2070 zero-width non-joiner

zwj U+200D (8205) HTML 4.0 HTMLspecial NEW RFC 2070 zero-width joiner

lrm U+200E (8206) HTML 4.0 HTMLspecial NEW RFC 2070 left-to-right mark

rlm U+200F (8207) HTML 4.0 HTMLspecial NEW RFC 2070 right-to-left mark

ndash – U+2013 (8211) HTML 4.0 HTMLspecial ISOpub en dash

mdash — U+2014 (8212) HTML 4.0 HTMLspecial ISOpub em dash

lsquo ‘ U+2018 (8216) HTML 4.0 HTMLspecial ISOnum left single quotation mark

rsquo ’ U+2019 (8217) HTML 4.0 HTMLspecial ISOnum right single quotation mark

sbquo ‚ U+201A (8218) HTML 4.0 HTMLspecial NEW single low-9 quotation mark

ldquo “ U+201C (8220) HTML 4.0 HTMLspecial ISOnum left double quotation mark

rdquo ” U+201D (8221) HTML 4.0 HTMLspecial ISOnum right double quotation mark

bdquo „ U+201E (8222) HTML 4.0 HTMLspecial NEW double low-9 quotation mark

dagger † U+2020 (8224) HTML 4.0 HTMLspecial ISOpub dagger

Dagger ‡ U+2021 (8225) HTML 4.0 HTMLspecial ISOpub double dagger

bull • U+2022 (8226) HTML 4.0 HTMLspecial ISOpub bullet (= black small circle) black

hellip … U+2026 (8230) HTML 4.0 HTMLsymbol ISOpub horizontal ellipsis (= three dot leader)

permil ‰ U+2030 (8240) HTML 4.0 HTMLspecial ISOtech per mille sign

prime ′ U+2032 (8242) HTML 4.0 HTMLsymbol ISOtech prime (= minutes = feet)

Prime ″ U+2033 (8243) HTML 4.0 HTMLsymbol ISOtech double prime (= seconds = inches)

lsaquo ‹ U+2039 (8249) HTML 4.0 HTMLspecial ISO proposed single left-pointing angle quotation mark proposed

rsaquo › U+203A (8250) HTML 4.0 HTMLspecial ISO proposed single right-pointing angle quotation mark proposed

oline ‾ U+203E (8254) HTML 4.0 HTMLsymbol NEW overline (= spacing overscore)

frasl ⁄ U+2044 (8260) HTML 4.0 HTMLsymbol NEW fraction slash (= solidus)

euro € U+20AC (8364) HTML 4.0 HTMLspecial NEW euro sign

image ℑ U+2111 (8465) HTML 4.0 HTMLsymbol ISOamso black-letter capital I (= imaginary part)

weierp ℘ U+2118 (8472) HTML 4.0 HTMLsymbol ISOamso script capital P (= power set = Weierstrass p)

real ℜ U+211C (8476) HTML 4.0 HTMLsymbol ISOamso black-letter capital R (= real part symbol)

trade ™ U+2122 (8482) HTML 4.0 HTMLsymbol ISOnum trademark sign

alefsym ℵ U+2135 (8501) HTML 4.0 HTMLsymbol NEW alef symbol (= first transfinite cardinal) alefsym

larr ← U+2190 (8592) HTML 4.0 HTMLsymbol ISOnum leftwards arrow

uarr ↑ U+2191 (8593) HTML 4.0 HTMLsymbol ISOnum upwards arrow


rarr → U+2192 (8594) HTML 4.0 HTMLsymbol ISOnum rightwards arrow

darr ↓ U+2193 (8595) HTML 4.0 HTMLsymbol ISOnum downwards arrow

harr ↔ U+2194 (8596) HTML 4.0 HTMLsymbol ISOamsa left right arrow

crarr ↵ U+21B5 (8629) HTML 4.0 HTMLsymbol NEW downwards arrow with corner leftwards (= carriage return)

lArr ⇐ U+21D0 (8656) HTML 4.0 HTMLsymbol ISOtech leftwards double arrow lArr

uArr ⇑ U+21D1 (8657) HTML 4.0 HTMLsymbol ISOamsa upwards double arrow

rArr ⇒ U+21D2 (8658) HTML 4.0 HTMLsymbol ISOnum rightwards double arrow rArr

dArr ⇓ U+21D3 (8659) HTML 4.0 HTMLsymbol ISOamsa downwards double arrow

hArr ⇔ U+21D4 (8660) HTML 4.0 HTMLsymbol ISOamsa left right double arrow

forall ∀ U+2200 (8704) HTML 4.0 HTMLsymbol ISOtech for all

part ∂ U+2202 (8706) HTML 4.0 HTMLsymbol ISOtech partial differential

exist ∃ U+2203 (8707) HTML 4.0 HTMLsymbol ISOtech there exists

empty ∅ U+2205 (8709) HTML 4.0 HTMLsymbol ISOamso empty set (= null set = diameter)

nabla ∇ U+2207 (8711) HTML 4.0 HTMLsymbol ISOtech nabla (= backward difference)

isin ∈ U+2208 (8712) HTML 4.0 HTMLsymbol ISOtech element of

notin ∉ U+2209 (8713) HTML 4.0 HTMLsymbol ISOtech not an element of

ni ∋ U+220B (8715) HTML 4.0 HTMLsymbol ISOtech contains as member

prod ∏ U+220F (8719) HTML 4.0 HTMLsymbol ISOamsb n-ary product (= product sign) prod

sum ∑ U+2211 (8721) HTML 4.0 HTMLsymbol ISOamsb n-ary summation sum

minus − U+2212 (8722) HTML 4.0 HTMLsymbol ISOtech minus sign

lowast ∗ U+2217 (8727) HTML 4.0 HTMLsymbol ISOtech asterisk operator

radic √ U+221A (8730) HTML 4.0 HTMLsymbol ISOtech square root (= radical sign)

prop ∝ U+221D (8733) HTML 4.0 HTMLsymbol ISOtech proportional to

infin ∞ U+221E (8734) HTML 4.0 HTMLsymbol ISOtech infinity

ang ∠ U+2220 (8736) HTML 4.0 HTMLsymbol ISOamso angle

and ∧ U+2227 (8743) HTML 4.0 HTMLsymbol ISOtech logical and (= wedge)

or ∨ U+2228 (8744) HTML 4.0 HTMLsymbol ISOtech logical or (= vee)

cap ∩ U+2229 (8745) HTML 4.0 HTMLsymbol ISOtech intersection (= cap)

cup ∪ U+222A (8746) HTML 4.0 HTMLsymbol ISOtech union (= cup)

int ∫ U+222B (8747) HTML 4.0 HTMLsymbol ISOtech integral

there4 ∴ U+2234 (8756) HTML 4.0 HTMLsymbol ISOtech therefore

sim ∼ U+223C (8764) HTML 4.0 HTMLsymbol ISOtech tilde operator (= varies with = similar to) sim

cong ≅ U+2245 (8773) HTML 4.0 HTMLsymbol ISOtech congruent to

asymp ≈ U+2248 (8776) HTML 4.0 HTMLsymbol ISOamsr almost equal to (= asymptotic to)

ne ≠ U+2260 (8800) HTML 4.0 HTMLsymbol ISOtech not equal to

equiv ≡ U+2261 (8801) HTML 4.0 HTMLsymbol ISOtech identical to; sometimes used for 'equivalent to'

le ≤ U+2264 (8804) HTML 4.0 HTMLsymbol ISOtech less-than or equal to

ge ≥ U+2265 (8805) HTML 4.0 HTMLsymbol ISOtech greater-than or equal to


sub ⊂ U+2282 (8834) HTML 4.0 HTMLsymbol ISOtech subset of

sup ⊃ U+2283 (8835) HTML 4.0 HTMLsymbol ISOtech superset of sup

nsub ⊄ U+2284 (8836) HTML 4.0 HTMLsymbol ISOamsn not a subset of

sube ⊆ U+2286 (8838) HTML 4.0 HTMLsymbol ISOtech subset of or equal to

supe ⊇ U+2287 (8839) HTML 4.0 HTMLsymbol ISOtech superset of or equal to

oplus ⊕ U+2295 (8853) HTML 4.0 HTMLsymbol ISOamsb circled plus (= direct sum)

otimes ⊗ U+2297 (8855) HTML 4.0 HTMLsymbol ISOamsb circled times (= vector product)

perp ⊥ U+22A5 (8869) HTML 4.0 HTMLsymbol ISOtech up tack (= orthogonal to = perpendicular) perp

sdot ⋅ U+22C5 (8901) HTML 4.0 HTMLsymbol ISOamsb dot operator sdot

lceil ⌈ U+2308 (8968) HTML 4.0 HTMLsymbol ISOamsc left ceiling (= APL upstile)

rceil ⌉ U+2309 (8969) HTML 4.0 HTMLsymbol ISOamsc right ceiling

lfloor ⌊ U+230A (8970) HTML 4.0 HTMLsymbol ISOamsc left floor (= APL downstile)

rfloor ⌋ U+230B (8971) HTML 4.0 HTMLsymbol ISOamsc right floor

lang U+2329 (9001) HTML 4.0 HTMLsymbol ISOtech left-pointing angle bracket (= bra) lang

rang U+232A (9002) HTML 4.0 HTMLsymbol ISOtech right-pointing angle bracket (= ket) rang

loz ◊ U+25CA (9674) HTML 4.0 HTMLsymbol ISOpub lozenge

spades ♠ U+2660 (9824) HTML 4.0 HTMLsymbol ISOpub black spade suit black

clubs ♣ U+2663 (9827) HTML 4.0 HTMLsymbol ISOpub black club suit (= shamrock) black

hearts ♥ U+2665 (9829) HTML 4.0 HTMLsymbol ISOpub black heart suit (= valentine) black

diams ♦ U+2666 (9830) HTML 4.0 HTMLsymbol ISOpub black diamond suit black

Notes:

• DTD: the full public DTD name (where the character entity name is defined) is actually mapped from one of the following three defined named entities:

HTMLlat1 maps to:

• PUBLIC "-//W3C//ENTITIES Latin 1//EN//HTML" in HTML (the DTD is implicitly defined, no system URI is needed);

• PUBLIC "-//W3C//ENTITIES Latin 1 for XHTML//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent" in XHTML 1.0;

HTMLsymbol maps to:

• PUBLIC "-//W3C//ENTITIES Symbols//EN//HTML" in HTML (the DTD is implicitly defined, no system URI is needed);

• PUBLIC "-//W3C//ENTITIES Symbols for XHTML//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent" in XHTML 1.0;

HTMLspecial maps to:

• PUBLIC "-//W3C//ENTITIES Special//EN//HTML" in HTML (the DTD is implicitly defined, no system URI is needed);

• PUBLIC "-//W3C//ENTITIES Special for XHTML//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent" in XHTML 1.0.

• Old ISO subset: these are old (documented) character subsets used in legacy encodings before the unification

within ISO 10646.

• Description: the standard ISO 10646 and Unicode character name is displayed first for each character, with
non-standard but legacy synonyms shown in italics between parentheses after an equal sign.

• spaces: a blue background has been used in order to display each space's width.

• ISO proposed: these characters have been standardized in ISO 10646 after the release of HTML 4.0.

• ligature: this is a standard misnomer as this is a separate character in some languages.

• black: here it seems to mean filled as opposed to hollow.

• alefsym: 'alef symbol' is NOT the same as U+05D0 'Hebrew letter alef', although the same glyph could be used

to depict both characters.

• lArr: ISO 10646 does not say that 'leftwards double arrow' is the same as the 'is implied by' arrow but also does

not have any other character for that function. So lArr can be used for 'is implied by' as ISOtech suggests.

• rArr: ISO 10646 does not say that 'rightwards double arrow' is the 'implies' character but does not have another

character with this function so rArr can be used for 'implies' as ISOtech suggests.

• prod: 'n-ary product' is NOT the same character as U+03A0 'Greek capital letter Pi' though the same glyph might

be used for both.

• sum: 'n-ary summation' is NOT the same character as U+03A3 'Greek capital letter Sigma' though the same glyph

might be used for both.

• sim: 'tilde operator' is NOT the same character as U+007E 'tilde', although the same glyph might be used to

represent both.

• sup: note that nsup, U+2285 'not a superset of', is not covered by the Symbol font encoding and is not included.

Should it be, for symmetry? It is in the ISOamsn subset.

• perp: Unicode only defines U+22A5 as the "up tack". The Unicode symbol for "perpendicular" is U+27C2. The

two symbols look similar, but are separate in Unicode. However, HTML uses U+22A5 as its "perpendicular"

symbol. This is a discrepancy between HTML and Unicode.

• sdot: 'dot operator' is NOT the same character as U+00B7 'middle dot'.

• lang: 'left-pointing angle bracket' is NOT the same character as U+003C 'less than' or U+2039 'single

left-pointing angle quotation mark'.

• rang: 'right-pointing angle bracket' is NOT the same character as U+003E 'greater than' or U+203A 'single

right-pointing angle quotation mark'.
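The distinctions drawn in the notes above can be checked against the Unicode character database. The following is a minimal sketch using Python's standard html.entities and unicodedata modules; the helper name entity_name is illustrative and is not part of either module.

```python
import unicodedata
from html.entities import name2codepoint


def entity_name(entity: str) -> str:
    """Return the Unicode character name behind an HTML 4 entity name."""
    return unicodedata.name(chr(name2codepoint[entity]))


# Entities whose glyphs resemble other characters without being identical to them.
pairs = [("sum", "\u03a3"), ("prod", "\u03a0"), ("sdot", "\u00b7"), ("perp", "\u27c2")]
for entity, lookalike in pairs:
    print(f"&{entity}; is {entity_name(entity)}; "
          f"U+{ord(lookalike):04X} is {unicodedata.name(lookalike)}")
```

Each pair prints two different character names, which is the point the notes make: a shared glyph does not imply a shared character.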

Entities representing special characters in XHTML

The XHTML DTDs explicitly declare 253 entities (including the 5 predefined entities of XML 1.0) whose expansion
is a single character, which can therefore be informally referred to as "character entities". These (with the exception
of the &apos; entity) have the same names and represent the same characters as the 252 character entities in HTML.
Also, by virtue of being XML, XHTML documents may reference the predefined &apos; entity, which is not one of
the 252 character entities in HTML. Additional entities of any size may be defined on a per-document basis.

However, the usability of entity references in XHTML is affected by how the document is being processed:

• If the document is read by a conforming HTML processor, then only the 252 HTML character entities can safely
be used. The use of &apos; or custom entity references may not be supported and may produce unpredictable
results.

• If the document is read by an XML parser that does not or cannot read external entities, then only the five built-in

XML character entities (see above) can safely be used, although other entities may be used if they are declared in

the internal DTD subset.

• If the document is read by an XML parser that does read external entities, then the five built-in XML character

entities can safely be used. The other 248 HTML character entities can be used as long as the XHTML DTD is

accessible to the parser at the time the document is read. Other entities may also be used if they are declared in the

internal DTD subset.

Because of the special &apos; case mentioned above, only &quot;, &amp;, &lt;, and &gt; will work in all processing
situations.
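The processing scenarios above can be reproduced with a parser that does not read external entities. The sketch below assumes Python's standard xml.etree.ElementTree (which does not fetch an external DTD) and html modules; the variable names are illustrative only.

```python
import html
import xml.etree.ElementTree as ET

# The five predefined XML entities need no DTD at all.
predefined = ET.fromstring("<p>&lt;&gt;&amp;&quot;&apos;</p>").text

# An HTML-only entity is undefined for an XML parser with no DTD to consult.
try:
    ET.fromstring("<p>a&nbsp;b</p>")
    nbsp_without_dtd = "parsed"
except ET.ParseError:
    nbsp_without_dtd = "rejected"

# Declaring the entity in the internal DTD subset makes it usable.
declared = ET.fromstring(
    '<!DOCTYPE p [<!ENTITY nbsp "&#160;">]><p>a&nbsp;b</p>'
).text

# An HTML-oriented decoder knows the HTML entity names without any DTD.
html_decoded = html.unescape("a&nbsp;b")

print(repr(predefined), nbsp_without_dtd, repr(declared), repr(html_decoded))
```

The internal-subset declaration is the portable option when a document must be readable by XML parsers that cannot reach the XHTML DTD.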

See also

• Character encodings in HTML

• HTML decimal character rendering

• SGML entity

References

• Unicode Consortium [1] . See also: Unicode Consortium

• UnicodeData.txt from the Unicode Consortium [2]

• World Wide Web Consortium [3] . See also: World Wide Web Consortium

• XML 1.0 spec [4]

• HTML 2.0 spec [5]

• HTML 3.2 spec [6]

• HTML 4.0 spec [7]

• HTML 4.01 spec [8]

• XHTML 1.0 spec [9]

• XML Entity Definitions for Characters [10]

• The normative reference to RFC 2070 (still found in DTDs defining the character entities for HTML or XHTML)

is historic; this RFC (along with other RFCs related to different parts of the HTML specification) has been

deprecated in favor of the newer informational RFC 2854, which defines the "text/html" MIME type and

refers directly to the W3C specifications for the actual HTML content.

• Numerical Reference of Unicode code points at Wikibooks

External links

• Character entity references in HTML 4 [11] at the W3C

• Multilanguage special character entity list [12] - List of special characters, entities and their names.

References

[1] http://www.unicode.org/

[2] http://www.unicode.org/Public/UNIDATA/UnicodeData.txt

[3] http://www.w3.org/

[4] http://www.w3.org/TR/REC-xml/

[5] http://www.w3.org/MarkUp/html-spec/html-spec_toc.html

[6] http://www.w3.org/TR/REC-html32

[7] http://www.w3.org/TR/1998/REC-html40-19980424/

[8] http://www.w3.org/TR/REC-html40/

[9] http://www.w3.org/TR/xhtml1/

[10] http://www.w3.org/TR/xml-entity-names/

[11] http://www.w3.org/TR/html4/sgml/entities.html

[12] http://www.seomister.com/ch

Log4js

Developer(s): Stephan Strittmatter, Seth Chisamore

Stable release: 1.0 / August 4, 2008

Operating system: Windows, Linux, Mac OS

Type: Framework

License: Apache Software Foundation

Website: http://log4js.berlios.de [1]

Log4js is a framework written in JavaScript for logging application events.

Its API closely follows that of Log4j, and it is likewise available under the licence of the Apache Software
Foundation.

Functionality

The basic concept is the same as in Log4j: the log levels are the same and almost
all methods are identical.

One special feature of Log4js is the ability to log browser events
remotely on the server. Using Ajax, the logging events can be sent
in several formats (XML, JSON, plain ASCII, etc.) to
the server and evaluated there.

Appender

The following appenders are currently implemented:

AjaxAppender

Sends the logs via XMLHttpRequest (Ajax) to the server to be processed there.

ConsoleAppender

Logs within the HTML page or in a separate window.

FileAppender

Writes to a local file (Internet Explorer and Mozilla supported).

JSConsoleAppender

Appender for the JavaScript Console of Mozilla, Opera and Safari.

MetatagAppender

Adds the log events as meta tags to the DOM of the document.

WindowsEventsAppender

Using Internet Explorer, it is possible to log to Windows System Events.

Figure: class diagram

Layout

The Layout classes format the events in different ways:

BasicLayout

Simple textual output of the events.

HtmlLayout

Formats the event as an HTML element.

JSONLayout

Converts the events to JSON objects, which are readable from many other programming languages such as Perl, PHP
and Java.

XMLLayout

XML formatted output.
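The split between appenders (where an event is sent) and layouts (how it is formatted) can be illustrated outside JavaScript. The sketch below is not Log4js code: it uses Python's standard logging module, on the assumption that its Handler and Formatter classes are a fair analogy for an appender and a layout. The names JsonLayout, build_logger and layout_demo are made up for the example.

```python
import io
import json
import logging


class JsonLayout(logging.Formatter):
    """Layout analogue: turns one log event into a JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


def build_logger(stream) -> logging.Logger:
    """Attach one appender analogue (a stream handler) with a JSON layout."""
    logger = logging.getLogger("layout_demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonLayout())
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = build_logger(buffer)
log.info("page loaded in %d ms", 120)
log.debug("below the configured level, so not written")
print(buffer.getvalue(), end="")
```

Swapping the formatter changes the output format without touching the destination, which mirrors how a Log4js appender is combined with a layout.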

External links

• Log4js Homepage [1]

• Log4js Wiki [2]

• Apache Logging Homepage [3]

References

[1] http://log4js.berlios.de

[2] http://scratchpad.wikia.com/wiki/Log4js

[3] http://logging.apache.org/

MAREC

The MAtrixware REsearch Collection (MAREC) is a standardised patent data corpus available for research

purposes. MAREC can be defined as a corpus that seeks to represent patent documents in several languages in order

to answer specific research questions. [1] [2] It consists of 19 million patent documents in different languages,

normalised to a highly specific XML schema.

MAREC is intended as raw material for research in areas such as information retrieval, natural language processing

or machine translation, which require large amounts of complex documents. [3] The collection contains documents in

19 languages, the majority being English, German and French, and about half of the documents include full text.

In MAREC, the documents from different countries and sources are normalised to a common XML format with a

uniform patent numbering scheme and citation format. The standardised fields include dates, countries, languages,

references, person names, and companies as well as subject classifications such as IPC codes. [4]

MAREC is a comparable corpus, in which many documents are available in similar versions in other languages. A

comparable corpus can be defined as consisting of texts that share similar topics (for example, news text from the same time

period in different countries), while a parallel corpus is defined as a collection of documents with aligned translations

from the source to the target language. [5] Since the patent documents refer to the same “invention” or “concept of

idea”, each text is a translation of the invention, but it does not have to be a direct translation of the text itself: text

parts may have been removed or added for clarification.

The 19,386,697 XML files measure a total of 621 GB and are hosted by the Information Retrieval Facility. Access

and support are free of charge for research purposes.

External links

• User guide and statistics [6]

• Information Retrieval Facility [7]

• "One week of MAREC" sample [8]

References

[1] Merz C., (2003) A Corpus Query Tool For Syntactically Annotated Corpora Licentiate Thesis, The University of Zurich, Department of

Computation linguistic, Switzerland

[2] Biber D., Conrad S., and Reppen R. (2000) Corpus Linguistics: Investigating Language Structure and Use. Cambridge University Press, 2nd

edition

[3] Manning, C. D. and Schütze, H. (2002) Foundations of statistical natural language processing Cambridge, MA, Massachusetts Institute of

Technology (MIT) ISBN 0-262-13360-1.

[4] European Patent Office (2009) Guidelines for examination in the European Patent Office (http://documents.epo.org/projects/babylon/

eponet.nsf/0/1AFC30805E91D074C125758A0051718A/$File/guidelines_2009_complete_en.pdf), Published by European Patent Office,

Germany (April 2009)

[5] Järvelin A. , Talvensaari T. , Järvelin Anni, (2008) Data driven methods for improving mono- and cross-lingual IR performance in noisy

environments, Proceedings of the second workshop on Analytics for noisy unstructured text data, (Singapore)

[6] http://www.matrixware.com/documentation/marec/index.jsp?topic=/com.MxW.MAREC/ch02.html

[7] http://ir-facility.org

[8] http://matrixware.net/tos/marec/

Media Object Server

Media Object Server (MOS) is an XML-based protocol for transferring information between newsroom automation

systems and other associated systems such as media servers.

The MOS protocol allows a variety of devices to be controlled from one central device or piece of software. This

limits the need to have operators in multiple locations throughout the studio environment. For example, multiple

character generators can be fired from a single control workstation, without needing an operator at each CG console.

External references

• http://www.mosprotocol.com/

• http://www.codeproject.com/KB/cs/mosprotocol.aspx by Rizwan Qureshi

METS

The Metadata Encoding and Transmission Standard (METS) is a metadata standard for encoding descriptive, administrative,

and structural metadata regarding objects within a digital library, expressed using the XML schema language of the

World Wide Web Consortium. The standard is maintained in the Network Development and MARC Standards Office

of the Library of Congress, and is being developed as an initiative of the Digital Library Federation.

Introduction

METS is an XML Schema designed for the purpose of:

• Creating XML document instances that express the hierarchical structure of digital library objects.

• Recording the names and locations of the files that comprise those objects.

• Recording associated metadata.

METS can, therefore, be used as a tool for modeling real world objects, such as particular document types.

Depending on its use, a METS document could be used in the role of Submission Information Package (SIP),

Archival Information Package (AIP), or Dissemination Information Package (DIP) within the Open Archival

Information System (OAIS) Reference Model.

Digital libraries vs. traditional libraries

Maintaining a library of digital objects requires maintaining metadata about those objects. The metadata necessary

for successful management and use of digital objects is both more extensive than and different from the metadata

used for managing collections of printed works and other physical materials.

• Where a traditional library may record descriptive metadata regarding a book in its collection, the book will not

dissolve into a series of unconnected pages if the library fails to record structural metadata regarding the book's

organization, nor will scholars be unable to evaluate the book's worth if the library fails to note that the book was

produced using a Ryobi offset press.

• The same cannot be said for a digital library. Without structural metadata, the page image or text files

comprising the digital work are of little use, and without technical metadata regarding the digitization process,

scholars may be unsure of how accurate a reflection of the original the digital version provides.

• However, in a digital library it is possible to create an e-book-like file, such as a PDF or TIFF file, which can be viewed as a single

physical book and reflects the integrity of the original.

Characteristics of METS documents

Any METS document has the following features:

• An open standard (non-proprietary)

• Developed by the library community

• Relatively simple

• Extensible

• Modular

Sections of a METS document

Figure: Example of a METS document

A METS document has seven sections:

• METS header: Contains metadata describing the METS document itself, such as its creator, editor, etc.

• Descriptive Metadata: May contain internally embedded metadata or point to metadata external to the METS

document. Multiple instances of both internal and external descriptive metadata may be included.

• Administrative Metadata: Provides information regarding how files were created and stored, intellectual

property rights, metadata regarding the original source object from which the digital library object derives, and

information regarding the provenance of files comprising the digital library object (such as master/derivative

relationships, migrations, and transformations). As with descriptive metadata, administrative metadata may be

internally encoded or external to the METS document.

• File Section: Lists all files containing content that make up the electronic versions of the digital object. <file>

elements may be grouped within <fileGrp> elements to subdivide files by object version.

• Structural Map: Outlines a hierarchical structure for the digital library object, and links the elements of that

structure to associated content files and metadata.

• Structural Links: Allows METS creators to record the existence of hyperlinks between nodes in the Structural

Map. This is of particular value in using METS to archive Websites.

• Behavioral: Used to associate executable behaviors with content in the METS object. Each behavior has a

mechanism element identifying a module of executable code that implements behaviors defined abstractly by its

interface definition.
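The seven sections listed above can be laid out as a bare skeleton. The following is a rough sketch using Python's standard xml.etree.ElementTree and the METS element names (metsHdr, dmdSec, amdSec, fileSec, structMap, structLink, behaviorSec). The identifiers DMD1, AMD1 and FILE1, the attribute values, and the helper names are hypothetical, and the result is not claimed to be schema-valid: a real document needs further child elements such as metadata wrappers and file locations.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def q(tag: str) -> str:
    """Qualify a local element name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_skeleton() -> ET.Element:
    root = ET.Element(q("mets"))
    ET.SubElement(root, q("metsHdr"))               # METS header
    ET.SubElement(root, q("dmdSec"), ID="DMD1")     # descriptive metadata
    ET.SubElement(root, q("amdSec"), ID="AMD1")     # administrative metadata

    file_sec = ET.SubElement(root, q("fileSec"))    # file section
    group = ET.SubElement(file_sec, q("fileGrp"), USE="master")
    ET.SubElement(group, q("file"), ID="FILE1", MIMETYPE="image/tiff")

    struct_map = ET.SubElement(root, q("structMap"))  # structural map
    book = ET.SubElement(struct_map, q("div"), TYPE="book")
    page = ET.SubElement(book, q("div"), TYPE="page", ORDER="1")
    ET.SubElement(page, q("fptr"), FILEID="FILE1")  # links the page to its file

    ET.SubElement(root, q("structLink"))            # structural links
    ET.SubElement(root, q("behaviorSec"))           # behavior section
    return root


skeleton = build_skeleton()
print(ET.tostring(skeleton, encoding="unicode"))
```

The fptr element pointing at FILE1 shows how the structural map ties a node of the hierarchy to an entry in the file section.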

METS profiles

METS Profiles are intended to describe a class of METS documents in sufficient detail to provide both document

authors and programmers the guidance they require to create and process METS documents conforming with a

particular profile.

A profile is expressed as an XML document. There is a schema for this purpose. The profile expresses the

requirements that a METS document must satisfy. A sufficiently explicit METS Profile may be considered a data

standard.

METS Profiles in use

• Musical Score (may be a score, score and parts, or a set of parts only)

• Print Material (books, pamphlets, etc.)

• Music Manuscript (score or sketches)

• Recorded Event (audio or video)

• PDF Document

• Bibliographic Record

• Photograph

• Compact Disc

• Collection

See also

• Digital Item Declaration Language

• Dublin Core, an ISO metadata standard

• Preservation Metadata: Implementation Strategies (PREMIS)

• Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)

External links

• Network Development and MARC Standards Office [1]

• Library of Congress [2]

• Digital Library Federation [3]

• METS Official web site [1]

References

[1] http://www.loc.gov/standards/mets/

[2] http://www.loc.gov/index.html

[3] http://www.diglib.org/

Numeric character reference

A numeric character reference (NCR) is a common markup construct used in SGML and other SGML-related

markup languages such as HTML and XML. It consists of a short sequence of characters that, in turn, represent a

single character from the Universal Character Set (UCS) of Unicode. NCRs are typically used in order to represent

characters that are not directly encodable in a particular document. When the document is interpreted by a

markup-aware reader, each NCR is treated as if it were the character it represents.

Example

In SGML, HTML, and XML, the following are all valid numeric character references for the Greek capital letter

Sigma ("Σ"):

Numerical character reference of Unicode character Σ

Σ = U+03A3: GREEK CAPITAL LETTER SIGMA (3A3 in base 16 = 931 in base 10)

Unicode character | Numerical base | Numerical reference in markup | Effect

U+03A3 | Decimal | &#931; | Σ

U+03A3 | Hexadecimal | &#x3A3; | Σ
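The two forms in the example can be generated from a character's code point and decoded again. This is a small sketch; the function name to_ncr is illustrative, and html.unescape from Python's standard library is used only as a convenient markup-aware decoder.

```python
import html


def to_ncr(ch: str, hexadecimal: bool = False) -> str:
    """Build a numeric character reference for a single character."""
    code_point = ord(ch)
    return f"&#x{code_point:X};" if hexadecimal else f"&#{code_point};"


decimal_ref = to_ncr("Σ")
hex_ref = to_ncr("Σ", hexadecimal=True)
print(decimal_ref, hex_ref, html.unescape(decimal_ref), html.unescape(hex_ref))
```

Both references decode to the same character because 3A3 in base 16 and 931 in base 10 are the same code point.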

Discussion

Markup languages are typically defined in terms of UCS or Unicode characters. That is, a document consists, at its

most fundamental level of abstraction, of a sequence of characters, which are abstract units that exist independently

of any encoding.

Ideally, when the characters of a document utilizing a markup language are encoded for storage or transmission over

a network as a sequence of bits, the encoding that is used will be one that supports representing each and every

character in the document, if not in the whole of Unicode, directly as a particular bit sequence.

Sometimes, though, for reasons of convenience or due to technical limitations, documents are encoded with an

encoding that cannot represent some characters directly. For example, the widely used encodings based on ISO 8859

can only represent, at most, 256 unique characters as one 8-bit byte each.

In practice, documents are rarely allowed to use more than one encoding internally, so the onus is usually on

the markup language to provide a means for document authors to express unencodable characters in terms of

encodable ones. This is generally done through some kind of "escaping" mechanism.

The SGML-based markup languages allow document authors to use special sequences of characters from the ASCII

range (the first 128 code points of Unicode) to represent, or reference, any Unicode character, regardless of whether

the character being represented is directly available in the document's encoding. These special sequences are

character references.
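The escaping mechanism described above can be seen when text is written in an encoding that lacks a character. The sketch below relies on the xmlcharrefreplace error handler of Python's built-in str.encode; the sample string is arbitrary.

```python
text = "Sum: Σ"

# ISO 8859-1 has no code for the Greek capital letter Sigma.
try:
    text.encode("iso-8859-1")
    directly_encodable = True
except UnicodeEncodeError:
    directly_encodable = False

# Falling back to numeric character references keeps the output within ASCII.
escaped = text.encode("iso-8859-1", errors="xmlcharrefreplace")
print(directly_encodable, escaped)
```

Only the unencodable character is replaced; everything the target encoding can represent is written directly.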

Character references that are based on the referenced character's UCS or Unicode "code point" are called numeric

character references. In HTML 4 and in all versions of XHTML and XML, the code point can be expressed either as

a decimal (base 10) number or as a hexadecimal (base 16) number. The syntax is as follows:

Character U+0026 (ampersand), followed by character U+0023 (number sign), followed by one of the following

choices:

• one or more decimal digits zero (U+0030) through nine (U+0039); or

• character U+0078 ("x") followed by one or more hexadecimal digits, which are zero (U+0030) through nine

(U+0039), Latin capital letter A (U+0041) through F (U+0046), and Latin small letter a (U+0061) through f

(U+0066);

all followed by character U+003B (semicolon). Older versions of HTML disallowed the hexadecimal syntax.
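The syntax just described maps onto a short regular expression. This is a sketch of the lexical form only, with illustrative names NCR_PATTERN and is_ncr; it does not check whether the referenced code point is actually permitted by a given markup language.

```python
import re

# "&#", then decimal digits or "x" plus hexadecimal digits, then ";".
NCR_PATTERN = re.compile(r"&#(?:[0-9]+|x[0-9A-Fa-f]+);")


def is_ncr(candidate: str) -> bool:
    """Return True if the whole string has the form of a numeric character reference."""
    return NCR_PATTERN.fullmatch(candidate) is not None


for sample in ["&#931;", "&#x3a3;", "&#x3A3;", "&#;", "&#x;", "&#931", "&#xG1;"]:
    print(sample, is_ncr(sample))
```

The explicit character classes are used instead of \d so that only the ASCII digits named in the syntax are accepted.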

The characters that comprise a numeric character reference can be represented in every character encoding used in

computing and telecommunications today, so there is no risk of the reference itself being unencodable.

There is another kind of character reference called a character entity reference, which allows a character to be

referred to by a name instead of a number. (Naming a character creates a character entity.) HTML defines some

character entities, but not many; all other characters can only be included by direct encoding or using NCRs.

Restrictions

The Universal Character Set defined by ISO 10646 is the "document character set" of SGML and HTML 4, so by

default, any character in such a document, and any character referenced in such a document, must be in the UCS.

While the syntax of SGML does not prohibit references to unassigned code points,

SGML-derived markup languages such as HTML and XML can, and often do, restrict numeric character references

to only those code points that are assigned to characters or that have not been permanently left unassigned.

Restrictions may also apply for other reasons. For example, in HTML 4, &#12;, which is a reference to a

non-printing "form feed" control character, is allowed because a form feed character is allowed. But in XML, the

form feed character cannot be used, not even by reference. As another example, &#128;, which is a reference to

another control character, is not allowed to be used or referenced in HTML (XML 1.0 permits the code point but discourages it); when used in HTML,

it is usually not flagged as an error by web browsers, some of which attempt to interpret it as a reference to the

character represented by code value 128 in the Windows-1252 encoding: "€", which actually should be represented

as &#8364;. As a further example, prior to the publication of XML 1.0 Second Edition on October 6, 2000, XML 1.0

was based on an older version of ISO 10646 and prohibited using characters above U+FFFD, except in character

data, thus making a reference like &#x10000; (U+10000) illegal. In XML 1.1 and newer editions of XML 1.0, such a

reference is allowed, because the available character repertoire was explicitly extended.
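These restrictions can be observed with real parsers. The sketch below assumes Python's standard xml.etree.ElementTree as an XML 1.0 parser and html.unescape as a browser-like HTML decoder; the helper name xml_accepts is illustrative.

```python
import html
import xml.etree.ElementTree as ET


def xml_accepts(reference: str) -> bool:
    """Return True if an XML 1.0 parser accepts the reference in element content."""
    try:
        ET.fromstring(f"<a>{reference}</a>")
        return True
    except ET.ParseError:
        return False


# A form feed may not appear in XML 1.0, not even by reference.
form_feed_ok = xml_accepts("&#12;")

# A supplementary-plane character is accepted by a current XML 1.0 parser.
supplementary_ok = xml_accepts("&#x10000;")

# An HTML decoder applies the Windows-1252 interpretation to code value 128.
euro = html.unescape("&#128;")

print(form_feed_ok, supplementary_ok, euro == html.unescape("&#8364;"))
```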

Markup languages also place restrictions on where character references can occur.

See also

• Character entity reference

• List of XML and HTML character entity references

Office Open XML

• Office Open XML file formats

• Open Packaging Conventions

• Open Specification Promise

• Vector Markup Language

• Office Open XML software

• Comparison of Office Open XML software

• Office Open XML standardization

Filename extension: .docx or .docm

Internet media type: application/vnd.openxmlformats-officedocument.wordprocessingml.document [1]

Developed by: Microsoft, Ecma, ISO/IEC

Type of format: Document file format

Extended from: XML, DOC, WordProcessingML

Standard(s): ECMA-376, ISO/IEC 29500

Website: ECMA-376 [2], ISO/IEC 29500:2008 [3]

Filename extension: .pptx or .pptm

Internet media type: application/vnd.openxmlformats-officedocument.presentationml.presentation [1]

Developed by: Microsoft, Ecma, ISO/IEC

Type of format: Presentation

Extended from: XML, PPT

Standard(s): ECMA-376, ISO/IEC 29500

Website: ECMA-376 [2], ISO/IEC 29500:2008 [3]

Filename extension: .xlsx or .xlsm

Internet media type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet [1]

Developed by: Microsoft, Ecma, ISO/IEC

Type of format: Spreadsheet

Extended from: XML, XLS, SpreadsheetML

Standard(s): ECMA-376, ISO/IEC 29500

Website: ECMA-376 [2], ISO/IEC 29500:2008 [3]

Office Open XML (also informally known as OOXML or OpenXML) is a zipped, XML-based file format

developed by Microsoft [4] for representing spreadsheets, charts, presentations and word processing documents. The

Office Open XML specification has been standardised both by Ecma and, in a later edition, by ISO and IEC as an

International Standard (ISO/IEC 29500).

Starting with Microsoft Office 2007, the Office Open XML file formats (ECMA-376) have become the default [5]

target file format of Microsoft Office, [6] [7] although the Strict variant of the standard is not fully supported. [8]

Background

In 2000, Microsoft released an initial version of an XML-based format for Microsoft Excel, which was incorporated

in Office XP. In 2002, a new file format for Microsoft Word followed. [9] The Excel and Word formats—known as

the Microsoft Office XML formats—were later incorporated into the 2003 release of Microsoft Office.

Microsoft announced in November 2005 that it would co-sponsor standardization of the new version of their

XML-based formats through Ecma International, as "Office Open XML". [10]

Standardization process

Microsoft submitted initial material to Ecma International Technical Committee TC45, where it was standardized to

become ECMA-376, approved in December 2006. [11]

This standard was then fast-tracked in the Joint Technical Committee 1 of ISO and IEC.

After initially failing to pass, an amended version of the format received the necessary votes for approval as an

ISO/IEC Standard as the result of a JTC 1 fast tracking standardization process that concluded in April 2008. [12] The

resulting four part International Standard (designated ISO/IEC 29500:2008) was published in November 2008 [13]

and can be downloaded from the ITTF. [14] A technically equivalent set of texts is published by Ecma as ECMA-376

Office Open XML File Formats — 2nd edition (December 2008); they can be downloaded from their web site. [15]

Licensing

Under the Ecma International code of conduct in patent matters, [16] participating and approving member

organisations of ECMA are required to make available their patent rights on a Reasonable and Non Discriminatory

(RAND) basis.

Holders of patents which concern ISO/IEC International Standards may agree to a standardized license governing the

terms under which such patents may be licensed, in accord with the ISO/IEC/ITU common patent policy. [17]

Microsoft, the main contributor to the standard, provided a Covenant Not to Sue [18] for its patent licensing. The

covenant received a mixed reception: some, such as the Groklaw blog, criticized it, [19] while others, such as Lawrence

Rosen (an attorney and lecturer at Stanford Law School), endorsed it. [20]

Microsoft has added the format to their Open Specification Promise [21] in which

Microsoft irrevocably promises not to assert any Microsoft Necessary Claims against you for making,

using, selling, offering for sale, importing or distributing any implementation to the extent it conforms to

a Covered Specification […]

This is limited to applications which do not deviate from the ISO/IEC 29500:2008 or Ecma-376 standard and to

parties that do not "file, maintain or voluntarily participate in a patent infringement lawsuit against a Microsoft

implementation of such Covered Specification". [22] [23] The Open Specification Promise was included in documents

submitted to ISO/IEC in support of the ECMA-376 fast track submission. [24] Ecma International asserted that, "The

OSP enables both open source and commercial software to implement [the specification]". [25]

Versions

The Office Open XML specification exists in a number of versions.

ECMA-376 1st edition (2006)

The ECMA standard is structured in five parts to meet the needs of different audiences. [15]

Part 1. Fundamentals

Vocabulary, notational conventions and abbreviations

Summary of primary and supporting markup languages

Conformance conditions and interoperability guidelines

Constraints within the Open Packaging Conventions that apply to each document type

Part 2. Open Packaging Conventions

The Open Packaging Conventions (OPC), for the package model and physical package, is defined and used by

various document types in various applications from multiple vendors.

It defines core properties, thumbnails, digital signatures, and authorizations and encryption capabilities for

parts or all the contents in the package.

XML schemas for the OPC are declared as XML Schema Definitions (XSD) and (non-normatively) using

RELAX NG (ISO/IEC 19757-2)

Part 3. Primer

Informative (non-normative) introduction to WordprocessingML, SpreadsheetML, PresentationML,

DrawingML, VML and Shared MLs, providing context and illustrating elements through examples and

diagrams

Describes the custom XML data storing facility within a package to support integration with business data

Part 4. Markup Language Reference

Contains the reference material for WordprocessingML, SpreadsheetML, PresentationML, DrawingML,

Shared MLs and Custom XML Schema, defining every element and attribute including the element hierarchy

(parent/child relationships)

XML schemas for the markup languages are declared as XSD and (non-normatively) using RELAX NG

Defines the custom XML data storing facility

Part 5. Markup Compatibility and Extensibility

Describes extension facilities of OpenXML documents and specifies elements and attributes by which

applications with different extensions can interoperate

ISO/IEC 29500:2008

The ISO/IEC standard is structured into four parts. [26] Parts 1, 2 and 3 are independent standards; for example, Part 2,

which specifies the Open Packaging Conventions, is used by other file formats including XPS and Design Web Format. Part

4 is to be read as a modification to Part 1, on which it depends.

A technically equivalent set of texts is also published by Ecma as ECMA-376 2nd edition (2008).

Part 1 (Fundamentals and Markup Language Reference)

This part has 5560 pages. It contains:

• Conformance definitions

• Reference material for the XML document markup languages defined by the Standard

• XML schemas for the document markup languages declared using XSD and (non-normatively) RELAX NG

• Definitions of the foreign markup facilities

Part 2 (Open Packaging Conventions)

This part has 129 pages. It contains:

• A description of the Open Packaging Conventions (package model, physical package)

• Core properties, thumbnails and digital signatures

• XML schemas for the OPC, declared using XSD and (non-normatively) RELAX NG

Part 3 (Markup Compatibility and Extensibility)

This part has 40 pages. It contains:

• A description of extensions: elements and attributes which define mechanisms allowing applications to specify

alternative means of negotiating content

• Extensibility rules are expressed using NVDL

Part 4 (Transitional Migration Features)

This part has 1464 pages. It contains:

• Legacy material such as compatibility settings and the graphics markup language VML

• A list of syntactic differences between this text and ECMA-376 1st edition

The standard specifies two levels of document and application conformance, strict and transitional, for each of WordprocessingML, PresentationML and SpreadsheetML. The standard also specifies two application descriptions, base and full.

Compatibility between versions

The intent of the changes from ECMA-376 1st edition to ISO/IEC 29500:2008 was that a valid ECMA-376

document would be a valid ISO 29500 "transitional" document, [27] but at least one change introduced at the BRM

(refusing to allow further values for xsd:boolean) had the effect of breaking backwards compatibility for most

documents. [28] A fix for this has been suggested to ISO/IEC JTC1/SC34/WG4, and was approved in June 2009 to go

forward as a recommendation for the first amendment to Office Open XML. [29]
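The compatibility break described above can be illustrated with a small sketch. It assumes, based on the title of reference [27], that the affected values are the legacy "on" and "off" spellings of the ST_OnOff type; the function name and the `allow_legacy` flag are hypothetical and are not part of any standard API. The four lexical values accepted by xsd:boolean are "true", "false", "1" and "0".

```python
# Lexical values permitted by xsd:boolean.
XSD_BOOLEAN_LEXICAL = {"true": True, "false": False, "1": True, "0": False}

# Assumption: the extra spellings that older documents may contain.
LEGACY_ON_OFF = {"on": True, "off": False}


def parse_on_off(value: str, allow_legacy: bool) -> bool:
    """Parse an on/off attribute value (hypothetical helper).

    With allow_legacy=False only xsd:boolean spellings are accepted,
    so a document that uses "on" or "off" is rejected.
    """
    if value in XSD_BOOLEAN_LEXICAL:
        return XSD_BOOLEAN_LEXICAL[value]
    if allow_legacy and value in LEGACY_ON_OFF:
        return LEGACY_ON_OFF[value]
    raise ValueError(f"not a valid on/off value: {value!r}")


# A value that is accepted under both rules.
print(parse_on_off("1", allow_legacy=False))
# A value that is accepted only when the legacy spellings are allowed.
print(parse_on_off("on", allow_legacy=True))
```

Under these assumptions, a consumer that accepts only xsd:boolean would reject any older document that writes such attributes as "on" or "off", which is the kind of breakage the proposed amendment is meant to undo.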

File formats

The Office Open XML file formats are a set of file formats that can be used to represent electronic office documents.

The format defines a set of XML markup vocabularies for word processing documents, spreadsheets and

presentations as well as specific XML markup vocabularies for material such as mathematical formulae, graphics,

bibliographies etc. The stated goal of the Office Open XML standard is to be capable of faithfully representing the

pre-existing corpus of word-processing documents, spreadsheets and presentations that had been produced by the

Microsoft Office applications and to facilitate extensibility and interoperability by enabling implementations by

multiple vendors and on multiple platforms.

An Office Open XML file is a ZIP-compatible OPC package containing XML documents and other resources. One can therefore inspect the contents of an .xlsm file, for example, by renaming it to a .zip file. The file can then be opened by any zip tool, and the .xml files contained in it can be viewed in a web browser or a plain text editor.
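Renaming is not strictly needed when working programmatically, since any ZIP library can open the package directly. The sketch below uses Python's standard `zipfile` module. So that it runs without a real Office file, it first builds a tiny in-memory archive; the part names and their placeholder contents are illustrative assumptions, not a valid spreadsheet.

```python
import io
import zipfile
from typing import List


def build_demo_package() -> bytes:
    """Build a tiny in-memory ZIP that mimics an OOXML package layout.

    The parts contain placeholder XML only; this is not a valid document.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("[Content_Types].xml", "<Types/>")
        zf.writestr("_rels/.rels", "<Relationships/>")
        zf.writestr("xl/workbook.xml", "<workbook/>")
    return buf.getvalue()


def list_parts(package_bytes: bytes) -> List[str]:
    """Return the names of all entries stored in the package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        return zf.namelist()


parts = list_parts(build_demo_package())
for name in parts:
    print(name)
```

For a file on disk, `zipfile.ZipFile("workbook.xlsm")` can be used in the same way, where the file name is a placeholder.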

Adoption

Several countries have formally announced either adoption or evaluation of Office Open XML, while others have rejected it completely. Adoption means different things in different places: in some cases the standard has received a national standard identifier; in some it is permitted where national regulation requires non-proprietary formats; in others a government body has decided that Office Open XML will be used in some specific context; and in still others a government body has decided that it will not use Office Open XML at all.

Belgium

Belgium's Federal Public Service for Information and Communication Technology was evaluating the adoption of the Office Open XML format in 2006. It had already confirmed then that it would consider all ISO standards to be open standards, mentioning Office Open XML as a possible future ISO standard. [30]

Denmark

In June 2007, the Danish Ministry of Science, Technology and Innovation recommended that, beginning January 1, 2008, public authorities must support at least one of the two word processing document formats, Office Open XML or Open Document Format, in all new IT solutions, where appropriate. [31]

Germany

In Germany the Office Open XML standard is currently under observation by the governmental office for standards in public IT, the "Koordinierungs- und Beratungsstelle der Bundesregierung für Informationstechnik in der Bundesverwaltung" (KBSt). The latest release of "SAGA" (Standards and Architectures for E-Government-Applications) includes Office Open XML file formats. The standard may be used to exchange complex documents when further processing is required. [32]

Japan

On June 29, 2007, the government of Japan published a new interoperability framework which gives preference to the procurement of products that follow open standards. [33] [34] On July 2 the government declared its view that formats such as Office Open XML, which organizations such as Ecma International and ISO had also approved, are open standards. It also said that whether a format is open is one of the criteria for choosing which software the government deploys.

Lithuania

The Lithuanian Standards Board has adopted the ISO/IEC 29500:2008 Office Open XML format standard as a Lithuanian National standard. The decision was made by Technical Committee 4 Information Technology on March 5, 2009. The proposal to adopt the Office Open XML format standard was submitted by the Lithuanian Archives Department under the Government of the Republic of Lithuania. [35]

Norway

Norway's Ministry of Government Administration and Reform is evaluating the adoption of the Office Open XML format. The ministry put the document standard under observation in December 2007. [36]

Sweden

The Kingdom of Sweden has adopted Office Open XML as a four-part Swedish National Standard, SS-ISO/IEC 29500:2009. [37] [38] [39] [40]

Switzerland

In July 2007, the Swiss Federal Council announced that adherence to the SAGA.ch e-Government standards is mandatory

for its departments as well as for cantons, cities and municipalities. The latest version of SAGA.ch includes

Office Open XML file formats. [41]

United Kingdom

The UK has put out an action plan for use of open standards, which includes ISO/IEC 29500 as one of several formats to be supported. [42] [43]

United States of America

On April 15, 2009, the ANSI-accredited INCITS organisation voted to adopt ISO/IEC 29500:2008 as an

American National Standard. [44]

The state of Massachusetts has been examining its options for implementing XML-based document

processing. In early 2005, Eric Kriss, Secretary of Administration and Finance in Massachusetts, was the first

government official in the United States to publicly connect open formats to a public policy purpose: "It is an

overriding imperative of the American democratic system that we cannot have our public documents locked up

in some kind of proprietary format, perhaps unreadable in the future, or subject to a proprietary system license

that restricts access". [45] Since 2007 Massachusetts has classified Office Open XML as "Open Format" and has

amended [46] its approved technical standards list — the Enterprise Technical Reference Model (ETRM) — to

include Office Open XML. Massachusetts, under heavy pressure from some vendors, now formally endorses

Office Open XML formats for its public records. [47]

Application support

Starting with Microsoft Office 2007, the Office Open XML file formats (ECMA-376) have become the default [5] file format of Microsoft Office. [6] [7] However, due to the changes introduced in a later version of the standard, Office 2007 is not entirely in compliance with ISO/IEC 29500:2008. [48] [49] [50] [51] Microsoft Office 2010 includes support for the ISO/IEC 29500:2008 compliant version of Office Open XML. [49] Office 2010 does not yet support saving documents that conform to the strict schema of the ISO/IEC 29500:2008 specification, but saves documents that conform to the transitional schema. [52] [53] The intent of ISO/IEC is to allow the removal of the transitional variant from the ISO/IEC 29500 standard. [53]

The SoftMaker Office 2010 Suite claims to be able to reliably read and write .DOCX and .XLSX files in its word

processor and spreadsheet applications.

The OpenOffice.org office suite has been able to import Office Open XML files (.docx, .xlsx, .pptx, etc.) since

version 3. [54]

The KOffice office suite has been able to import Office Open XML files since version 2.2.

Other mainstream Office products that have started to offer import support for the Office Open XML formats are

Apple's TextEdit (included with Mac OS X) and iWork, IBM Lotus Notes, Corel Wordperfect, Kingsoft Office and

Google apps.

Controversies

The ISO standardization of Office Open XML was controversial and embittered. According to InfoWorld:

OOXML was opposed by many on grounds it was unneeded, as software makers could use

OpenDocument Format (ODF), a less complicated office software format that was already an

international standard. [55]

The same InfoWorld article reported that IBM (which supports the ODF format) threatened to leave standards bodies

that it said allow dominant corporations like Microsoft to wield undue influence. Microsoft was accused of co-opting

the standardization process by leaning on countries to ensure that it got enough votes at the ISO for Office Open

XML to pass. [56]

Richard Stallman of the Free Software Foundation has stated that "Microsoft offers a gratis patent license for

OOXML on terms which do not allow free implementations." [57]

See also

• List of document markup languages

• Comparison of document markup languages

• Open Document Format

External links

• ECMA-376 site [2]

• ISO/IEC 29500:2008 [3]

• OpenXMLDeveloper.org [58] , Microsoft's site for developers

• Open XML Community site [59] Microsoft's site for customers and partners

• "The WordprocessingML Vocabulary", sample chapter from O'Reilly book Office 2003 XML [60] PDF (1.22 MB)

• OpenOffice.org [61] , How do I open Microsoft Office 2007 files? Article by OpenOffice.org

• Information technology -- Office Open XML file formats [62] , ISO Standards, JTC 1 Information technology, SC

34

• FAQs on ISO/IEC 29500 [63] , ISO's FAQ site on ISO/IEC 29500

• DOCX reference document [64] , contains a file with fairly complex formatting and can be used to quickly check

compatibility of an implementation

• OpenXML site [65] , contains resources, articles and tools for Office Open XML

• Interoperability study [66] showing an indication of the percentage of support for Office Open XML by several

different office suite implementations in August 2008

References

[1] Microsoft. "Register file extensions on third party servers" (http://technet.microsoft.com/en-us/library/cc179224.aspx). microsoft.com. .

Retrieved 2009-09-04.

[2] http://www.ecma-international.org/publications/standards/Ecma-376.htm

[3] http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_tc_browse.htm?commid=45374

[4] "Q&A: Microsoft Co-Sponsors Submission of Office Open XML Document Formats to Ecma International for Standardization" (https://

www.microsoft.com/presspass/features/2005/nov05/11-21Ecma.mspx). Microsoft. 2005-11-21. .

[5] "Microsoft Expands List of Formats Supported in Microsoft Office" (http://www.microsoft.com/Presspass/press/2008/may08/

05-21ExpandedFormatsPR.mspx?rss_fdn=Press Releases). Microsoft. . Retrieved 2008-05-21.

[6] "Microsoft's future lies somewhere beyond the Vista by Evansville Courier & Press" (http://www.courierpress.com/news/2008/oct/24/

microsofts-future-lies-somewhere-beyond-the/). Courierpress.com. . Retrieved 2009-05-19.

[7] "Rivals Set Their Sights on Microsoft Office: Can They Topple the Giant? - Knowledge@Wharton" (http://knowledge.wharton.upenn.edu/

article.cfm?articleid=1795). Knowledge.wharton.upenn.edu. . Retrieved 2009-05-19.

[8] ISO OOXML convener: Microsoft's format "heading for failure" (http://arstechnica.com/microsoft/news/2010/04/

iso-ooxml-convener-microsofts-format-heading-for-failure.ars)

[9] Brian Jones (2007-01-25). "History of office XML formats (1998–2006)" (http://blogs.msdn.com/brian_jones/archive/2007/01/25/

office-xml-formats-1998-2006.aspx). MSDN blogs. .

[10] "Microsoft Co-Sponsors Submission of Office Open XML Document Formats to Ecma International for Standardization" (http://www.

microsoft.com/presspass/features/2005/nov05/11-21Ecma.mspx). Microsoft. 2005-11-21. .

[11] "Ecma International approves Office Open XML standard" (http://www.ecma-international.org/news/PressReleases/

PR_TC45_Dec2006.htm). Ecma International. 2006-12-07. .

[12] "ISO/IEC DIS 29500 receives necessary votes for approval as an International Standard" (http://www.iso.org/iso/pressrelease.

htm?refid=Ref1123). ISO. 2008-04-02. .

[13] ISO/IEC (2008-11-18). "Publication of ISO/IEC 29500:2008, Information technology — Office Open XML formats" (http://www.iso.

org/iso/pressrelease.htm?refid=Ref1181). ISO. . Retrieved 2008-11-19.

[14] "Freely Available Standards" (http://standards.iso.org/ittf/PubliclyAvailableStandards/index.html). ITTF (ISO/IEC). 2008-11-18. .

[15] "Standard ECMA-376" (http://www.ecma-international.org/publications/standards/Ecma-376.htm). Ecma-international.org. . Retrieved

2009-05-19.

[16] "Code of Conduct in Patent Matters" (http://www.ecma-international.org/memento/codeofconduct.htm). Ecma International. .

[17] "ISO/IEC/ITU common patent policy" (http://isotc.iso.org/livelink/livelink/fetch/2000/2122/3770791/Common_Policy.htm). .

[18] "Microsoft Covenant Regarding Office 2003 XML Reference Schemas" (http://www.microsoft.com/office/xml/covenant.mspx).

Microsoft. . Retrieved 2006-07-11.

[19] "2 Escape Hatches in MS's Covenant Not to Sue" (http://www.groklaw.net/articlebasic.php?story=20051202135844482). Groklaw. .

Retrieved 2007-01-29.

[20] Berlind, David (November 28, 2005). "Top open source lawyer blesses new terms on Microsoft's XML file format" (http://blogs.zdnet.

com/BTL/?p=2192). ZDNet. . Retrieved 2007-01-27.

[21] "Microsoft Open Specification Promise" (http://www.microsoft.com/interop/osp/default.mspx). Microsoft. 2006-09-12. . Retrieved

2007-04-22.

[22] "" (http://www.ecma-international.org/publications/index.html). Ecma International. . ""Ecma Standards and Technical Reports are

made available to all interested persons or organizations, free of charge and licensing restrictions""

[23] "Microsoft Open Specification Promise" (http://www.microsoft.com/Interop/osp/default.mspx). Microsoft.com. .

[24] "Licensing conditions that Microsoft offers for Office Open XML" (http://www.jtc1sc34.org/repository/0810c.htm). Jtc1sc34.org.

2006-12-20. . Retrieved 2009-05-19.

[25] "Microsoft Word — Responses to Comments and Perceived Contradictions.doc" (http://www.ecma-international.org/news/

TC45_current_work/Ecma responses.pdf) (PDF). . Retrieved 2009-09-16.

[26] "ISO (You searched for "29500" in title and abstract" (http://www.iso.org/iso/search.htm?qt=29500&published=on&

active_tab=standards). International Organization for Standardization. 2009-06-05. .

[27] "Re-introducing on/off-values to ST-OnOff in OOXML Part 4" (http://idippedut.dk/post/2009/06/23/

Re-introducing-onoff-values-to-ST-OnOff-in-OOXML-Part-4.aspx). . Retrieved 2009-09-29.

[28] "OOXML and Office 2007 Conformance: a Smoke Test" (http://www.adjb.net/post/

OOXML-and-Office-2007-Conformance-a-Smoke-Test.aspx). . Retrieved 2009-09-29.

[29] "Minutes of the Copenhagen Meeting of ISO/IEC JTC1/SC34/WG4" (http://www.itscj.ipsj.or.jp/sc34/open/1239.pdf). 2009-06-22. .

Retrieved 2009-09-29. page 15

[30] "FED13321-docsPeterStrickx.indd" (http://www.fedict.belgium.be/nl/binaries/Open_Standaarden_NL_V1_tcm167-16667.pdf) (PDF).

. Retrieved 2009-09-16.

[31] "Bilag 8 – Sammenligning af rapporten om "Estimering af omkostningerne ved indførelse af Office Open XML (OOXML) og Open

Document Format (ODF) i centraladministrationen" i forhold til de spørgsmål, der skal belyses i de økonomiske konsekvensvurderinger, jf.

rapporten om "Anvendelse af åbne standarder i det offentlige"" (http://vtu.dk/nyheder/aktuelle-temaer/2007/aabne-standarder/bilag/

bilag-8.html/). Vtu.dk. . Retrieved 2009-05-19.

[32] "SAGA 4.0" (http://gsb.download.bva.bund.de/KBSt/SAGA/SAGA_v4.0.pdf) (PDF). . Retrieved 2009-09-16.

[33] Gardner, David (2007-07-10). "Office Software Formats Battle Moves To Asia" (http://www.informationweek.com/news/showArticle.

jhtml?articleID=201000546). Information Week. . Retrieved 2007-07-27.

[34] "Interoperability framework for information systems (in Japanese)" (http://www.meti.go.jp/press/20070629014/20070629014.html).

Ministry of Economy, Trade and Industry, Japan. 2007-06-29. . Retrieved 2007-07-27.

[35] "Latest News" (http://www.openxmlcommunity.com/latestnews.aspx). Open XML Community. . Retrieved 2009-05-19.

[36] "Referansekatalog for IT-standarder i offentlig sektor" (http://www.regjeringen.no/en/dep/fad/Documents/Rundskriv/2007/

Referansekatalog-for-IT-standarder-i-off.html?id=494951). regjeringen.no. . Retrieved 2009-05-19.

[37] "SS-ISO/IEC 29500-1:2009" (http://www.sis.se/DesktopDefault.aspx?tabName=@DocType_1&Doc_ID=68693&PresID=2&

Desc=SS-ISO/IEC 29500-1:2009). Sis.se. 2009-01-19. . Retrieved 2009-09-16.

[38] "SS-ISO/IEC 29500-2:2009" (http://www.sis.se/DesktopDefault.aspx?tabName=@DocType_1&Doc_ID=68694&PresID=1&

Desc=SS-ISO/IEC 29500-2:2009). Sis.se. . Retrieved 2009-09-16.

[39] "SS-ISO/IEC 29500-3:2009" (http://www.sis.se/DesktopDefault.aspx?tabName=@DocType_1&Doc_ID=68695&PresID=2&

Desc=SS-ISO/IEC 29500-3:2009). Sis.se. . Retrieved 2009-09-16.

[40] "SS-ISO/IEC 29500-4:2009" (http://www.sis.se/DesktopDefault.aspx?tabName=@DocType_1&Doc_ID=68696&PresID=1&

Desc=SS-ISO/IEC 29500-4:2009). Sis.se. . Retrieved 2009-09-16.

[41] "eCH — Downloads | Standards/Normes | eCH-0014 d SAGA.ch" (http://www.ech.ch/index.php?option=com_docman&

task=cat_view&gid=92&lang=en). Ech.ch. . Retrieved 2009-05-19.

[42] "Open Source, Open Standards and Re–Use: Government Action Plan" (http://www.cabinetoffice.gov.uk/government_it/open_source/

action.aspx). UK Government Cabinet Office. 2009-02-24. .

[43] Rick Jelliffe (2009-02-26). "Open standards: the UK gets it, probably" (http://broadcast.oreilly.com/2009/02/

open-standards-the-uk-gets-it.html). .

[44] "INCITS Letter Ballot 3025" (http://ballot.itic.org/itic/archive.taf?function=detail&ballot_id=3025&

_UserReference=9B6726AA59D4BAC249E6E82E). INCITS. 2009-04-15. .

[45] "Informal comments on Open Formats" (http://web.archive.org/web/20061013201242/http://www.mass.gov/eoaf/

open_formats_comments.html). Web.archive.org. . Retrieved 2009-09-16.

[46] http://www.mass.gov/?pageID=itdterminal&L=3&L0=Home&L1=Policies%2c+Standards+%26+Guidance&L2=Drafts+for+

Review&sid=Aitd&b=terminalcontent&f=policies_standards_etrmv4_etrmv4dot0revisions&csid=Aitd

[47] "Cover Pages: Major Revision of Massachusetts Enterprise Technical Reference Model (ETRM)" (http://xml.coverpages.org/

ni2007-07-03-a.html). Xml.coverpages.org. . Retrieved 2009-05-19.

[48] "OOXML Implementations: A Community of One" (http://www.odfalliance.org/resources/IssueBriefImplementations.pdf). ODF

Alliance. 2008-02-20. . Retrieved 2009-05-19.

[49] "Microsoft Expands List of Formats Supported in Microsoft Office" (http://www.microsoft.com/Presspass/press/2008/may08/

05-21ExpandedFormatsPR.mspx). Microsoft.com. 2008-05-21. . Retrieved 2009-05-19.

[50] Lai, Eric (2008-05-27). "FAQ: Office 14 and Microsoft's support for ODF" (http://www.computerworld.com/action/article.do?command=viewArticleBasic&taxonomyName=Protocols+and+Standards&articleId=9089258&taxonomyId=141&pageNumber=1). Computerworld.com. . Retrieved 2009-05-19.

[51] Andy Updegrove. "Microsoft Office 2007 to Support ODF — and not OOXML" (http://consortiuminfo.org/standardsblog/article.

php?story=20080521092930864). ConsortiumInfo.org. . Retrieved 2009-05-19.

[52]

[59] http://www.openxmlcommunity.org/

[60] http://www.oreilly.com/catalog/officexml/chapter/ch02.pdf

[61] http://wiki.services.openoffice.org/wiki/Documentation/FAQ/General/OpeningMSO2007Files

[62] http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=45515

[63] http://www.iso.org/iso/faqs_isoiec29500

[64] http://katana.oooninja.com/w/reference_sample_documents

[65] http://www.openxml.biz/

[66] http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1201708

Office Open XML file formats

Office Open XML

• Office Open XML file formats

• Open Packaging Conventions

• Open Specification Promise

• Vector Markup Language

• Office Open XML software

• Comparison of Office Open XML software

• Office Open XML standardization

Filename extension: .docx or .docm
Internet media type: application/vnd.openxmlformats-officedocument.wordprocessingml.document [1]
Developed by: Microsoft, Ecma, ISO/IEC
Type of format: Document file format
Extended from: XML, DOC, WordProcessingML
Standard(s): ECMA-376, ISO/IEC 29500
Website: ECMA-376 [2], ISO/IEC 29500:2008 [3]

Filename extension: .pptx or .pptm
Internet media type: application/vnd.openxmlformats-officedocument.presentationml.presentation [1]
Developed by: Microsoft, Ecma, ISO/IEC
Type of format: Presentation
Extended from: XML, PPT
Standard(s): ECMA-376, ISO/IEC 29500
Website: ECMA-376 [2], ISO/IEC 29500:2008 [3]

Filename extension: .xlsx or .xlsm
Internet media type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet [1]
Developed by: Microsoft, Ecma, ISO/IEC
Type of format: Spreadsheet
Extended from: XML, XLS, SpreadsheetML
Standard(s): ECMA-376, ISO/IEC 29500
Website: ECMA-376 [2], ISO/IEC 29500:2008 [3]

The Office Open XML file formats are a set of file formats that can be used to represent electronic office

documents. There are formats for word processing documents, spreadsheets and presentations as well as specific

formats for material such as mathematical formulae, graphics, bibliographies etc.

The formats were developed by Microsoft and first appeared in Microsoft Office 2007. They were standardized

between December 2006 and November 2008, first by the Ecma International consortium, where they became

ECMA-376, and subsequently, after a contentious standardization process, by the ISO/IEC's Joint Technical

Committee 1, where they became ISO/IEC 29500:2008.

Container

Office Open XML documents are stored in Open Packaging

Convention (OPC) packages, which are ZIP files containing

XML and other data files, along with a specification of the

relationships between them. [2] Depending on the type of the

document, the packages have different internal directory

structures and names. An application will use the relationships

files to locate individual sections (files), with each having

accompanying metadata, in particular MIME metadata.

A basic package contains an XML file called

[Content_Types].xml at the root, along with three directories:

_rels, docProps, and a directory specific for the document

type (for example, in a .docx word processing package, there

would be a word directory). The word directory contains the

document.xml file which is the core content of the document.

(Figure: Container structure of Part 2 of the Ecma Office Open XML standard, ECMA-376)

[Content_Types].xml

This file provides MIME type information for parts of the package, using defaults for certain file extensions and overrides for parts specified by IRI.

_rels

This directory contains relationships for the files within the package. To find the relationships for a specific file, look for the _rels directory that is a sibling of the file, and then for a file that has the original file name with .rels appended to it. For example, if the content types file had any relationships, there would be a file called [Content_Types].xml.rels inside the _rels directory.

_rels/.rels

This file is where the package relationships are located. Applications look here first. Viewed in a text editor, it outlines each relationship for that section. In a minimal document containing only the basic document.xml file, the relationships detailed are metadata and document.xml.
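The default-and-override lookup performed through [Content_Types].xml can be sketched as follows. The sample XML is a hand-written illustration rather than a file taken from a real package, and the helper name `content_type_for` is hypothetical. The namespace and the `Default`/`Override` element names follow the Open Packaging Conventions.

```python
import xml.etree.ElementTree as ET
from typing import Optional

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"

# Illustrative content types part (not taken from a real document).
SAMPLE_TYPES = f"""<Types xmlns="{CT_NS}">
  <Default Extension="rels"
      ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>
  <Override PartName="/word/document.xml"
      ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
</Types>"""


def content_type_for(part_name: str, types_xml: str) -> Optional[str]:
    """Return the MIME type of a part: overrides first, then extension defaults."""
    root = ET.fromstring(types_xml)
    # An Override names one specific part and takes precedence.
    for override in root.findall(f"{{{CT_NS}}}Override"):
        if override.get("PartName") == part_name:
            return override.get("ContentType")
    # Otherwise fall back to the Default for the file extension.
    extension = part_name.rsplit(".", 1)[-1].lower() if "." in part_name else ""
    for default in root.findall(f"{{{CT_NS}}}Default"):
        if default.get("Extension", "").lower() == extension:
            return default.get("ContentType")
    return None


print(content_type_for("/word/document.xml", SAMPLE_TYPES))
print(content_type_for("/docProps/core.xml", SAMPLE_TYPES))
```

Checking overrides before defaults mirrors the description above: a default covers every part with a given extension, while an override singles out one part by name.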

docProps/core.xml

This file contains the core properties for any Office Open XML document.

word/document.xml

This file is the main part for any Word document.

Relationships

An example relationship file (word/_rels/document.xml.rels), is:

Target="http://en.wikipedia.org/images/wiki-en.png"

TargetMode="External" />

As such, images referenced in the document can be found in the relationship file by looking for all relationships that

are of type http://schemas.microsoft.com/office/2006/relationships/image. To change the used image, edit the

relationship.
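As a rough illustration of that lookup, the sketch below parses a relationships part and collects the targets of all image relationships. The sample XML is an assumption assembled for this example: it reuses the image type URI and target quoted above, but the `Id` values, the second relationship and its type URI are hypothetical. The `Relationships` namespace is the one used by the Open Packaging Conventions.

```python
import xml.etree.ElementTree as ET
from typing import Dict

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
IMAGE_TYPE = "http://schemas.microsoft.com/office/2006/relationships/image"

# Illustrative relationships part; Ids and the second entry are hypothetical.
SAMPLE_RELS = f"""<Relationships xmlns="{REL_NS}">
  <Relationship Id="rId1" Type="{IMAGE_TYPE}"
      Target="http://en.wikipedia.org/images/wiki-en.png"
      TargetMode="External"/>
  <Relationship Id="rId2" Type="http://example.org/hypothetical/hyperlink"
      Target="http://example.org/" TargetMode="External"/>
</Relationships>"""


def targets_by_id(rels_xml: str) -> Dict[str, str]:
    """Map every relationship Id to its Target."""
    root = ET.fromstring(rels_xml)
    return {
        rel.get("Id"): rel.get("Target")
        for rel in root.findall(f"{{{REL_NS}}}Relationship")
    }


def image_targets(rels_xml: str) -> Dict[str, str]:
    """Map the Id of each image relationship to its Target."""
    root = ET.fromstring(rels_xml)
    return {
        rel.get("Id"): rel.get("Target")
        for rel in root.findall(f"{{{REL_NS}}}Relationship")
        if rel.get("Type") == IMAGE_TYPE
    }


print(image_targets(SAMPLE_RELS))
print(targets_by_id(SAMPLE_RELS)["rId2"])
```

Because the document markup stores only the identifier, changing an image or link means editing the `Target` in the relationships part, as noted above.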

The following code shows an example of inline markup for a hyperlink:

In this example, the Uniform Resource Locator (URL) is represented by "rId2". The actual URL is in the

accompanying relationships file, located by the corresponding "rId2" item. Linked images, templates, and other

items are referenced in the same way.

Pictures can be embedded or linked using a tag:

This is the reference to the image file. All references are managed via relationships. For example, a document.xml

has a relationship to the image. There is a _rels directory in the same directory as document.xml, inside _rels is a file

called document.xml.rels. In this file there will be a relationship definition that contains type, ID and location. The

ID is the referenced ID used in the XML document. The type will be a reference schema definition for the media

type and the location will be an internal location within the ZIP package or an external location defined with a URL.
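The naming rule for relationships parts, and the resolution of an internal location, can be sketched with plain path arithmetic. Both helper names are hypothetical, and the example part names other than word/document.xml are illustrative assumptions. Paths inside the ZIP always use forward slashes, so `posixpath` is used rather than `os.path`.

```python
import posixpath


def rels_part_for(part_name: str) -> str:
    """Return the relationships part that belongs to a given part.

    The rule is: sibling "_rels" directory, original file name plus ".rels".
    An empty part name stands for the package itself.
    """
    directory, filename = posixpath.split(part_name)
    return posixpath.join(directory, "_rels", filename + ".rels")


def resolve_internal_target(source_part: str, target: str) -> str:
    """Resolve a relative internal Target against the part that owns it."""
    base = posixpath.dirname(source_part)
    return posixpath.normpath(posixpath.join(base, target))


print(rels_part_for("word/document.xml"))
print(rels_part_for(""))
print(resolve_internal_target("word/document.xml", "media/image1.png"))
```

External locations (those marked with `TargetMode="External"`) are URLs and would be used as they are instead of being resolved against the package.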

Document properties

Office Open XML uses the Dublin Core Metadata Element Set and DCMI Metadata Terms to store document

properties. Dublin Core is a standard for cross-domain information resource description and is defined in ISO

15836:2003. [3]

An example document properties file (docProps/core.xml) that uses Dublin Core metadata, is:

2008-06-19T20:00:00Z

2008-06-19T20:42:00Z

Document file format

Final
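Reading such a properties part programmatically might look like the sketch below. The sample XML is a reconstruction made for illustration only: the pairing of the four values above with the `created`, `modified`, `category` and `contentStatus` elements is an assumption, and the title is invented. The three namespace URIs are those of the OPC core properties, the Dublin Core element set and the DCMI terms.

```python
import xml.etree.ElementTree as ET
from typing import Dict

NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Illustrative core properties part; the element/value pairing is assumed.
SAMPLE_CORE = """<cp:coreProperties
    xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dcterms="http://purl.org/dc/terms/">
  <dc:title>Hypothetical title</dc:title>
  <dcterms:created>2008-06-19T20:00:00Z</dcterms:created>
  <dcterms:modified>2008-06-19T20:42:00Z</dcterms:modified>
  <cp:category>Document file format</cp:category>
  <cp:contentStatus>Final</cp:contentStatus>
</cp:coreProperties>"""

# Property name -> qualified element name to look up.
FIELDS = {
    "title": "dc:title",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
    "category": "cp:category",
    "status": "cp:contentStatus",
}


def read_core_properties(core_xml: str) -> Dict[str, str]:
    """Collect the properties that are present in a core properties part."""
    root = ET.fromstring(core_xml)
    found = {}
    for name, path in FIELDS.items():
        node = root.find(path, NS)
        if node is not None and node.text:
            found[name] = node.text
    return found


properties = read_core_properties(SAMPLE_CORE)
for name, value in properties.items():
    print(f"{name}: {value}")
```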

Document markup languages

An Office Open XML file may contain several documents encoded in specialized markup languages corresponding

to applications within the Microsoft Office product line. Office Open XML defines multiple vocabularies using 27

namespaces and 89 schema modules.

The primary markup languages are:

• WordprocessingML for word-processing

• SpreadsheetML for spreadsheets

• PresentationML for presentations

Shared markup language materials include:

• Office Math Markup Language (OMML)

• DrawingML used for vector drawing, charts, and for example, text art (additionally, though deprecated, VML is

supported for drawing)

• Extended properties

• Custom properties

• Variant Types

• Custom XML data properties

• Bibliography

In addition to the above markup languages custom XML schemas can be used to extend Office Open XML.

Design approach

Patrick Durusau, the editor of ODF, has viewed the markup style of OOXML and ODF as representing two sides of

a debate: the "element side" and the "attribute side". He notes that OOXML represents "the element side of this

approach" and singles out the KeepNext element as an example:

…

In contrast, he notes ODF would use the single attribute fo:keep-next, rather than an element, for the same

semantic. [4]

The XML Schema of Office Open XML emphasizes reducing load time and improving parsing speed. [5] In a test

with applications current in April 2007, XML-based office documents were slower to load than binary formats. [6] To

enhance performance, Office Open XML uses very short element names for common elements and spreadsheets save

dates as index numbers (starting from 1899 or from 1904). In order to be systematic and generic, Office Open XML

typically uses separate child elements for data and metadata (element names ending in Pr for properties) rather than

using multiple attributes, which allows structured properties. Office Open XML does not use mixed content but uses

elements to put a series of text runs (element name r) into paragraphs (element name p). The result is terse and

highly nested in contrast to HTML, for example, which is fairly flat, designed for humans to write in text editors and

is more congenial for humans to read.
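The paragraph, run and properties structure described above can be seen in a small hand-written fragment. The sketch below walks the `p`, `r` and `t` elements to recover plain text and checks the paragraph properties (`pPr`) for a `keepNext` child. The sample markup is an illustrative assumption rather than output from a real word processor, and `read_paragraphs` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Illustrative WordprocessingML fragment (hand-written for this example).
SAMPLE_DOCUMENT = f"""<w:document xmlns:w="{W}">
  <w:body>
    <w:p>
      <w:pPr><w:keepNext/></w:pPr>
      <w:r><w:t>Hello, </w:t></w:r>
      <w:r><w:rPr><w:b/></w:rPr><w:t>world</w:t></w:r>
    </w:p>
    <w:p>
      <w:r><w:t>Second paragraph.</w:t></w:r>
    </w:p>
  </w:body>
</w:document>"""


def read_paragraphs(document_xml: str) -> List[Tuple[str, bool]]:
    """Return (text, keep_next) for each paragraph in document order."""
    root = ET.fromstring(document_xml)
    result = []
    for paragraph in root.iter(f"{{{W}}}p"):
        # Text lives in t elements nested inside runs (r).
        text = "".join(t.text or "" for t in paragraph.iter(f"{{{W}}}t"))
        # Properties are child elements of pPr rather than attributes.
        keep_next = paragraph.find("w:pPr/w:keepNext", NS) is not None
        result.append((text, keep_next))
    return result


paragraphs = read_paragraphs(SAMPLE_DOCUMENT)
for text, keep_next in paragraphs:
    print(repr(text), keep_next)
```

The bold formatting of the second run sits in a separate `rPr` child, which shows why text with mixed formatting is split into several runs instead of using mixed content.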

The naming of elements and attributes within the text has attracted some criticism. There are three different

syntaxes in OOXML (ECMA-376) for specifying the color and alignment of text depending on whether the

document is a text, spreadsheet, or presentation. Rob Weir (an IBM employee and co-chair of the OASIS

OpenDocument Format TC) asks "What is the engineering justification for this horror?". He contrasts with

OpenDocument: "ODF uses the W3C's XSL-FO vocabulary for text styling, and uses this vocabulary

consistently". [7]

Some have argued the design is based too closely on Microsoft applications. In August 2007, the Linux Foundation

published a blog post calling upon ISO National Bodies to vote "No, with comments" during the International

Standardization of OOXML. It said, "OOXML is a direct port of a single vendor's binary document formats. It

avoids the re-use of relevant existing international standards (e.g. several cryptographic algorithms, VML, etc.).

There are literally hundreds of technical flaws that should be addressed before standardizing OOXML including

continued use of binary code tied to platform specific features, propagating bugs in MS-Office into the standard,

proprietary units, references to proprietary/confidential tags, unclear IP and patent rights, and much more". [8]

The version of the standard submitted to JTC 1 was 6546 pages long. The need for such length, and its appropriateness, have been questioned. [9] [10] Google stated that "the ODF standard, which achieves the same goal, is only 867 pages". [9]

WordprocessingML (WML)

Word processing documents use the XML vocabulary known as WordprocessingML normatively defined by the

schema wml.xsd which accompanies the standard. This vocabulary is defined in clause 11 of Part 1. [11]

SpreadsheetML (SML)

Spreadsheet documents use the XML vocabulary known as SpreadsheetML normatively defined by the schema

sml.xsd which accompanies the standard. This vocabulary is described in clause 12 of Part 1. [11]

Each worksheet in a spreadsheet is represented by an XML document with a root element named worksheet in the http://schemas.openxmlformats.org/spreadsheetml/2006/main namespace.

The representation of date and time values in SpreadsheetML has attracted some criticism. ECMA-376 1st edition does not conform to ISO 8601:2004 "Representation of Dates and Times". It requires that implementations replicate a Lotus 1-2-3 [12] bug that treats 1900 as a leap year, which it was not. Products complying with ECMA-376 would be required to use the WEEKDAY() spreadsheet function, and therefore assign incorrect dates to some days of the week, and also miscalculate the number of days between certain dates. [13] ECMA-376 2nd edition (ISO/IEC 29500) allows the use of ISO 8601:2004 "Representation of Dates and Times" in addition to the Lotus 1-2-3 bug-compatible form. [14] [15]
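The effect of the leap-year bug on serial date numbers can be made concrete with a short sketch. It assumes the conventions commonly used by spreadsheet applications: in the 1900 date system serial 1 is 1 January 1900 and serial 60 is the non-existent 29 February 1900, while in the 1904 date system serial 0 is 1 January 1904. The function name and the `date1904` parameter are hypothetical.

```python
from datetime import date, timedelta


def serial_to_date(serial: int, date1904: bool = False) -> date:
    """Convert a whole-number spreadsheet serial to a calendar date.

    Assumes the bug-compatible 1900 system unless date1904 is True.
    """
    if date1904:
        # No fictitious day exists in the 1904 system.
        return date(1904, 1, 1) + timedelta(days=serial)
    if serial == 60:
        # The bug-compatible form counts 29 February 1900, which never existed.
        raise ValueError("serial 60 denotes the non-existent 1900-02-29")
    if serial < 60:
        return date(1899, 12, 31) + timedelta(days=serial)
    # Past the fictitious day, every serial is one higher than a true day count.
    return date(1899, 12, 30) + timedelta(days=serial)


print(serial_to_date(59))
print(serial_to_date(61))
print(serial_to_date(0, date1904=True))
```

Serials 59 and 61 map to consecutive calendar days, so a difference computed from the serial numbers across that point is one day too large. This is the kind of miscalculation the criticism refers to.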

Office MathML (OMML)

Office Math Markup Language is a mathematical markup language which can be embedded in WordprocessingML,

with intrinsic support for including word processing markup like revision markings, [16] footnotes, comments, images

and elaborate formatting and styles. [17] The OMML format is different from the World Wide Web Consortium

(W3C) MathML recommendation that does not support those office features, but is partially compatible [18] through

XSL Transformations.

The following Office MathML example defines the fraction:

Foreign resources

Non-XML content

OOXML documents are typically composed of other resources in addition to XML content (graphics, video, etc.).

Some have criticised the choice of permitted format for such resources: ECMA-376 1st edition specifies "Embedded

Object Alternate Image Requests Types" and "Clipboard Format Types", which refer to Windows Metafiles or

Enhanced Metafiles – each of which are proprietary formats that have hard-coded dependencies on Windows itself.

The critics state the standard should instead have referenced the platform neutral standard ISO/IEC 8632 "Computer

Graphics Metafile". [13]

Foreign markup

The Standard provides three mechanisms to allow foreign markup to be embedded within content for editing

purposes:

• Smart tags

• Custom XML markup

• Structured Document Tags

These are defined in clause 17.5 of Part 1.

Compatibility settings

Versions of Office Open XML contain what are termed "compatibility settings". These are contained in Part 4

("Markup Language Reference") of ECMA-376 1st Edition, but during standardization were moved to become a new

part (also called Part 4) of ISO/IEC 29500:2008 ("Transitional Migration Features").

These settings (including elements with names such as autoSpaceLikeWord95, footnoteLayoutLikeWW8,

lineWrapLikeWord6, mwSmallCaps, shapeLayoutLikeWW8, suppressTopSpacingWP, truncateFontHeightsLikeWP6,

uiCompat97To2003, useWord2002TableStyleRules, useWord97LineBreakRules, wpJustification and wpSpaceWidth)

were the focus of some controversy during the standardisation of DIS 29500. [24] As a result, new text was added to

ISO/IEC 29500 to document them. [25]

An article in Free Software Magazine has criticized the markup used for these settings. Office Open XML uses
a distinctly named element for each compatibility setting, each of which is declared in the schema. The repertoire
of settings is thus limited: for new compatibility settings to be added, new elements may need to be declared,
"potentially creating thousands of them, each having nothing to do with interoperability". [26]

Extensibility

The standard provides two extensibility mechanisms: Markup Compatibility and Extensibility (MCE), defined
in Part 3 (ISO/IEC 29500-3:2008), and Extension Lists, defined in clause 18.2.10 of Part 1.

References

[1] Microsoft. "Register file extensions on third party servers" (http://technet.microsoft.com/en-us/library/cc179224.aspx). microsoft.com. .

Retrieved 2009-09-04.

[2] Tom Ngo (December 11, 2006). "Office Open XML Overview" (http://www.ecma-international.org/news/TC45_current_work/

OpenXML White Paper.pdf) (PDF). Ecma International. p. 6. . Retrieved 2007-01-23.

[3] http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=37629

[4] Patrick Durusau (21 October 2008). "Old Wine In New Skins" (http://www.durusau.net/publications/old_wine.pdf). .

[5] Intellisafe Technologies. "Software Developer uses Office Open XML to Minimize File Space, Increase Interoperability" (http://www.

openxmlcommunity.org/documents/casestudies/Intellisafe_OpenXML_Final.pdf). .


[6] George Ou (2007-04-27). "MS Office 2007 versus Open Office 2.2 shootout" (http://blogs.zdnet.com/Ou/?p=480). ZDnet.com. .

Retrieved 2007-04-27.

[7] Rob Weir (14 March 2008). "Disharmony of OOXML" (http://www.robweir.com/blog/2008/03/disharmony-of-ooxml.html). .

[8] John Cherry (14 March 2008). "OOXML — vote "No, with comments"" (http://www.linux-foundation.org/weblogs/cherry/2007/08/29/

ooxml-vote-no-with-comments/). .

[9] "Google's Position on OOXML as a Proposed ISO Standard" (http://www.odfalliance.org/resources/Google OOXML Q A.pdf). Google.

2008-02. . "If ISO were to give OOXML with its 6546 pages the same level of review that other standards have seen, it would take 18 years

(6576 days for 6546 pages) to achieve comparable levels of review to the existing ODF standard (871 days for 867 pages) which achieves the

same purpose and is thus a good comparison. Considering that OOXML has only received about 5.5% of the review that comparable standards

have undergone, reports about inconsistencies, contradictions and missing information are hardly surprising"

[10] "OOXML: What's the big deal?" (http://www.ibm.com/developerworks/library/x-ooxmlstandard.html). IBM. 2008-02-19. .

[11] "ISO/IEC 29500-1:2008" (http://standards.iso.org/ittf/PubliclyAvailableStandards/c051463_ISOIEC 29500-1_2008(E).zip). ISO and

IEC. 2008-09. .

[12] Kyd, Charley (October 2006). "How to Work With Dates Before 1900 in Excel" (http://www.exceluser.com/explore/earlydates.htm).

ExcelUser. . Retrieved 2009-09-16.

[13] "The Contradictory Nature of OOXML" (http://www.consortiuminfo.org/standardsblog/article.php?story=20070117145745854).

ConsortiumInfo.org. .

[14] "ECMA-376 2nd edition Part 1 (3. Normative references)" (http://www.ecma-international.org/publications/standards/Ecma-376.htm).

Ecma-international.org. . Retrieved 2009-09-16.

[15] "New set of proposed dispositions posted, including more positive changes to the Ecma Office Open XML formats – Dispositions now

proposed for more than half of National Bodies' comments" (http://www.ecma-international.org/news/TC45_current_work/New set of

proposed dispositions posted.htm). Ecma-international.org. 2007-12-11. . Retrieved 2009-09-16.

[16] Jesper Lund Stocholm (2008-01-29). "Do your math — OOXML and OMML" (http://idippedut.dk/post/2008/01/

Do-your-math---OOXML-and-OMML.aspx). A Mooh Point blog. . Retrieved 2008-02-12.

[17] Murray Sargent (2007-06-05). "Science and Nature have difficulties with Word 2007 mathematics" (http://blogs.msdn.com/murrays/

archive/2007/06/05/science-and-nature-have-difficulties-with-word-2007-mathematics.aspx). MSDN blogs. . Retrieved 2007-07-31.

[18] David Carlisle (2007-05-09). "XHTML and MathML from Office 2007" (http://dpcarlisle.blogspot.com/2007/04/

xhtml-and-mathml-from-office-20007.html). David Carlisle. . Retrieved 2007-09-20.

[19] "Microsoft Office dumped by Science and Nature" (http://www.zdnet.com.au/news/software/soa/

Microsoft-Office-dumped-by-Science-and-Nature/0,130061733,339278690,00.htm). ZDNet Australia. 18 June 2007. .

[20] Wouter Van Vugt (2008-11-01). "Open XML Explained e-book" (http://openxmldeveloper.org/articles/1970.aspx).

Openxmldeveloper.org. . Retrieved 2007-09-14.

[21] Rick Jelliffe in Technical (2007-04-16). "Why EMUs? - O'Reilly XML Blog" (http://www.oreillynet.com/xml/blog/2007/04/

what_is_an_emu.html). Oreillynet.com. . Retrieved 2009-05-19.

[22] "The X Factor" (http://reddevnews.com/features/article.aspx?editorialsid=2356). reddevnews.com. October 2007. .

[23] "VML — the Vector Markup Language" (http://www.w3.org/TR/NOTE-VML). W3.org. 1998-05-13. . Retrieved 2009-05-19.

[24] "ODF/OOXML technical white paper — A white paper based on a technical comparison between the ODF and OOXML formats" (http://

www.freesoftwaremagazine.com/articles/odf_ooxml_technical_white_paper?page=0,9). Free Software Magazine. .

[25] "ECMA-376 2nd edition Part 4 (paragraph 9.7.3)" (http://www.ecma-international.org/publications/standards/Ecma-376.htm).

Ecma-international.org. . Retrieved 2009-09-16.

[26] "ODF/OOXML technical white paper — A white paper based on a technical comparison between the ODF and OOXML formats" (http://

www.freesoftwaremagazine.com/articles/odf_ooxml_technical_white_paper?page=0,7). Free Software Magazine. . ""... OOXML chose this

route. Rather than create an application-definable configuration tag there is a unique tag for each setting ... Currently, the only application's

unique settings that are catered for are the applications that the standard's authors have decided to include, ... For other applications to be

added, further tag names would need to be defined in the specification, potentially creating thousands of them, each having nothing to do with

interoperability .."."


OIOXML

OIOXML is a project by the Danish government to develop a number of reusable data components that can be
serialized in various formats, although currently the only serialization method for OIOXML data is XML. The
project was undertaken to ease communication from, to and between Danish government bodies. It was
made as part of the Danish government's transition to what it refers to as eGovernment, in which
communication between government bodies, companies and the public should be paper-free. There has been
some confusion as to what OIOXML is, because the most prominent OIOXML format, the Danish Efaktura format
(a localization of UBL), is itself referred to as OIOXML in many governmental documents. Currently, all
invoices given to a Danish governmental organization are required to be in the Efaktura format.

Sources

• The interoperability framework [1]
• OIO - Offentlig Information Online (public information online) - English main page of the site [2]
• Description of OIOXML and its reasons [3]
• Reference to the OIOXML markup language [4]
• Validator for OIOXML [5]
• Examples of OIOXML invoices in comparison with regular invoices (in Danish) [6]

References

[1] http://standarder.oio.dk/my-home-your-home/view?set_language=en

[2] http://www.oio.dk/?o=a54bd5e3b9e3e94209f94882ac0c9301

[3] http://isb.oio.dk/Info/Standardization/OIOXML%20Classes.htm

[4] http://xmltools.oio.dk/oioonlinevalidator/ehandel/0p71/Invoice/

[5] http://xmltools.oio.dk/oioonlinevalidator/

[6] http://www.oio.dk/dataudveksling/ehandel/eFaktura/eksempler


Open XML Paper Specification

Filename extension: .oxps, .xps
Internet media type: application/oxps, application/vnd.ms-xpsdocument
Developed by: Microsoft, Ecma International
Initial release: October 2006
Latest release: First Edition / June 16, 2009
Type of format: Page description language / Document file format
Contained by: Open Packaging Conventions
Extended from: ZIP, XML, XAML
Standard(s): ECMA-388
Website: [1]

The Open XML Paper Specification (also referred to as OpenXPS) is an open specification for a page description
language and a fixed-document format. It was originally developed by Microsoft as the XML Paper Specification
(XPS) and was later standardized by Ecma International as ECMA-388. It is an XML-based (more
precisely XAML-based) specification, based on a new print path and a color-managed vector-based document format
that supports device independence and resolution independence. OpenXPS was standardized as an open standard
document format on June 16, 2009. [2]

Development of the XML Paper Specification

In 2003 Global Graphics was chosen by Microsoft to provide consultancy and proof of concept development

services on XPS and worked with the Windows development teams on the specification and reference architecture

for the new format. [3]

The XPS document format consists of structured XML markup that defines the layout of a document and the visual

appearance of each page, along with rendering rules for distributing, archiving, rendering, processing and printing

the documents. Notably, the markup language for XPS is a subset of XAML, allowing it to incorporate

vector-graphic elements in documents, using XAML to mark up the WPF primitives. The elements used are

described in terms of paths and other geometrical primitives.

An XPS file is in fact a ZIP archive using the Open Packaging Conventions, containing the files which make up the

document. These include an XML markup file for each page, text, embedded fonts, raster images, 2D vector

graphics, as well as the digital rights management information. The contents of an XPS file can be examined simply

by opening it in an application which supports ZIP files.
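Because an XPS file is a ZIP archive, its parts can be listed with any ZIP library. The sketch below uses Python's standard zipfile module. To stay self-contained it builds a tiny stand-in archive in memory; the part names in that stand-in are illustrative assumptions, not a complete or valid XPS package.

```python
import io
import zipfile


def list_parts(xps_file) -> list:
    """Return the names of all parts stored in an XPS (ZIP) package."""
    with zipfile.ZipFile(xps_file) as archive:
        return archive.namelist()


def page_parts(xps_file) -> list:
    """Return only the per-page markup parts (assumed to end in .fpage)."""
    return [name for name in list_parts(xps_file) if name.endswith(".fpage")]


# Stand-in archive with hypothetical part names, built in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("[Content_Types].xml", "<Types/>")
    archive.writestr("Documents/1/Pages/1.fpage", "<FixedPage/>")
    archive.writestr("Documents/1/Pages/2.fpage", "<FixedPage/>")

parts = list_parts(buffer)
pages = page_parts(buffer)
```

With a real document, a file path such as "report.xps" (hypothetical) can be passed to list_parts in place of the in-memory buffer.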


Features

XPS specifies a set of document layout functionality for paged, printable documents. It also has support for features

such as color gradients, transparencies, CMYK color spaces, printer calibration, multiple-ink systems and print

schemas. XPS supports the Windows Color System color management technology for color conversion precision

across devices and higher dynamic range. It also includes a software raster image processor (RIP) which is

downloadable separately. [4] The print subsystem also has support for named colors, simplifying color definition for

images transmitted to printers supporting those colors.

XPS also supports HD Photo images natively for raster images. [5] The XPS format used in the spool file represents

advanced graphics effects such as 3D images, glow effects, and gradients as Windows Presentation Foundation

primitives, which are processed by the printer drivers without rasterization, preventing rendering artifacts and

reducing computational load.

Similarities with PDF and PostScript

Like Adobe Systems' PDF format, XPS is a fixed-layout document format designed to preserve document
fidelity, [6] providing a device-independent document appearance. PDF is a database of objects, created from
PostScript and also directly generated from many applications, whereas XPS is based on XML. The filter pipeline
architecture of XPS is also similar to the one used in printers supporting the PostScript page description language.
PDF includes dynamic capabilities not supported by the XPS format. [7]

Viewing and creating XPS documents

XPS is supported on several versions of Windows.

Because the printing architecture of Windows Vista uses XPS as the spooler format, [6] it has native support for

generating and reading XPS documents. [8] XPS documents can be created by printing to the virtual XPS printer

driver. The XPS Viewer is installed by default in Windows Vista and Windows 7. The viewer is hosted within

Internet Explorer in Windows Vista, but is a native application in Windows 7. The IE-hosted XPS viewer and the

XPS Document Writer are also available to Windows XP users when they download the .NET Framework 3.0. The

IE-hosted viewer supports digital rights management and digital signatures. Users who do not wish to view XPS

documents in the browser can download the XPS Essentials Pack, [9] which includes a standalone viewer and the XPS

Document Writer. The XPS Essentials Pack also includes providers to enable the IPreview and IFilter capabilities

used by Windows Desktop Search, as well as shell handlers to enable thumbnail views and file properties for XPS

documents in Windows Explorer. [10] The XPS Essentials Pack is available for Windows XP, Windows Server 2003,

and Windows Vista. [10] Installing this pack enables operating systems prior to Windows Vista to use the XPS print

processor, instead of the GDI-based WinPrint, which can produce better quality prints for printers that support XPS

in hardware (directly consume the format). [11] The print spooler format on these operating systems when printing to

older, non-XPS-aware printers, however, remains unchanged.

Windows 7 contains a standalone version of the XPS viewer that supports digital signatures. [12]

Third-party support

Software

Name (publisher; platform): function

• GhostXPS (Artifex Software Inc. [13]; cross-platform): The Ghostscript software suite for processing various page description languages includes an input parser for XPS called GhostXPS. The software may be downloaded in source code form from ghostscript.com. [14]
• Okular (Okular team [15]; Linux, FreeBSD, Microsoft Windows, Solaris): Okular, the document viewer of the KDE project, can display XPS documents.
• STDU Viewer (STDUtility [16]; Microsoft Windows): STDU Viewer can display and organize XPS documents (as well as other electronic document formats).
• XPS Annotator (www.xpsdev.com [17]; Microsoft Windows): XPS Annotator can display, digitally sign and annotate XPS documents. In addition, it can convert XPS documents to common picture formats.
• Aspose.Words product family (ASPOSE [18]; .NET Framework, Java, Microsoft Sharepoint, SQL Server Reporting Services, JasperReports): Aspose.Words enables application developers to build applications that "generate, modify, convert, render and print" XPS documents as well as some other formats. Aspose.Words is a .NET Framework class library rather than independent computer software; hence it cannot be used by consumers. [19]
• Multilizer (Multilizer [20]; Microsoft Windows): Multilizer localization products support the translation of documents through an XPS Scanner plug-in. This plug-in enables users to extract text from an XPS document, translate it, and write a translated XPS document with the same structure.
• NiXPS View (NiXPS [21]; Microsoft Windows, Mac OS X): NiXPS View can display, search and print XPS documents. [22]
• NiXPS Edit (NiXPS [21]; Microsoft Windows, Mac OS X): NiXPS Edit can view, edit, search, print and export XPS documents. [23]
• NiXPS SDK (NiXPS [21]; Microsoft Windows, Mac OS X): NiXPS SDK enables application developers to develop applications that can view, edit or export XPS documents. [24]
• Pagemark XpsViewer (Pagemark Technology, Inc. [25]; Microsoft Windows, Mac OS, Linux): Pagemark XpsViewer can display and organize XPS documents as well as convert them to common picture formats. [26]
• Pagemark XpsConvert (Pagemark Technology, Inc. [25]; Microsoft Windows, Mac OS, Linux): Pagemark XpsConverter, a command-line interface tool, can convert XPS documents to PDF documents, as well as common picture formats. [26]
• Pagemark XpsPlugin (Pagemark Technology, Inc. [25]; Mozilla Firefox, Safari): Pagemark XpsPlugin, an add-on for the Mozilla Firefox and Safari web browsers, enables these web browsers to display XPS documents inside the browser window. This commercial product is not yet available for purchase, but a demo version is available. [26]
• PDFTron XPSConvert (PDFTron [27]; Microsoft Windows, Mac OS X, Linux): PDFTron XPSConvert, a command-line interface tool, can convert XPS documents to PDF format or common picture formats. [28]
• PDFTron PDF2XPS (PDFTron [27]; Microsoft Windows, Mac OS X, Linux): PDFTron PDF2XPS, a command-line interface tool, can convert PDF documents into XPS documents. [29]
• Software Imaging XPSViewer (Software Imaging [30]; Microsoft Windows): Software Imaging XPSViewer, a freeware alternative to Microsoft XPS Viewer, can view and print XPS documents. [31]
• NDesk XPS (NDesk [32]; Mono): NDesk XPS can view and convert XPS documents. [33]
• Danet Studio (Danetsoft [34]; Microsoft Windows): Danet Studio can create, display, sign, convert and annotate XPS documents. It can split and merge existing XPS documents to create new XPS documents. [35]
• xps2pdf.org ([36]; World Wide Web): xps2pdf.org, an online tool, can convert XPS documents to PDF format.
• TreasureUP XPS to Image Converter 1.1 (TreasureUP [37]; Microsoft Windows): Converts XPS pages to image file formats: JPEG, PNG and GIF. Supports batch file conversion and automatic conversion of files in a specified folder. [38]

Hardware

XPS has the support of printing companies such as Konica Minolta, Sharp, [39] Canon, Epson, Hewlett-Packard, [40]

and Xerox [41] and software and hardware companies such as Software Imaging, [42] Pagemark Technology Inc., [43]

Informative Graphics Corp. (IGC), [44] NiXPS NV, [45] Zoran, [46] and Global Graphics. [47]

Native XPS printers have been introduced by Canon, Konica Minolta, Toshiba, and Xerox. [48]

Since 1 June 2007, devices that hold the "Certified for Windows Vista" level of Windows Logo conformance
certificate have been required to have XPS drivers for printing. [49]

Licensing

In order to encourage wide use of the format, Microsoft has released XPS under a royalty-free patent license called

the Community Promise for XPS, [50] [51] allowing users to create implementations of the specification that read, write

and render XPS files as long as they include a notice within the source that technologies implemented may be

encumbered by patents held by Microsoft. Microsoft also requires that organizations "engaged in the business of

developing (i) scanners that output XPS Documents; (ii) printers that consume XPS Documents to produce

hard-copy output; or (iii) print driver or raster image software products or components thereof that convert XPS

Documents for the purpose of producing hard-copy output, [...] will not sue Microsoft or any of its licensees under

the XML Paper Specification or customers for infringement of any XML Paper Specification Derived Patents (as

defined below) on account of any manufacture, use, sale, offer for sale, importation or other disposition or promotion

of any XML Paper Specification implementations." The specification itself is released under a royalty-free copyright

license, allowing its free distribution. [52]

Standardization

Microsoft submitted the XPS specification to Ecma International. [53]

In June 2007 Ecma International Technical Committee 46 (TC46) was set up to develop a standard based on the

Open XML Paper Specification (OpenXPS). [54]

At the 97th General Assembly held in Budapest, June 16, 2009, Ecma International approved Open XML Paper

Specification (OpenXPS) as an Ecma standard (ECMA-388). [2]

TC46's members are:

• Autodesk • Konica Minolta • QualityLogic
• Brother Industries • Lexmark • Ricoh
• Canon • Microsoft • Software Imaging Limited
• Fujifilm • Monotype Imaging • Toshiba
• Fujitsu • Océ Technologies • Xerox
• Global Graphics • Pagemark Technology • Zoran Corporation
• Hewlett Packard • Panasonic/Matsushita

See also

• Comparison of OpenXPS and PDF
• Windows Vista printing technologies
• Functional specification

External links

• XML Paper Specification [55]
• Microsoft XPS Development Team Blog [56]
• Standard ECMA-388 Open XML Paper Specification [1]
• XPS FAQ and white papers on office and professional printing from a software technology provider [57]
• Viewing XPS Documents [58]

References

[1] http://www.ecma-international.org/publications/standards/Ecma-388.htm

[2] Steve McGibbon (Microsoft) (2009-06-17). "OpenXPS - OpenXML Paper Specification" (http://notes2self.net/archive/2009/06/17/

openxps-openxml-paper-specification.aspx). .

[3] "Global Graphics XPS reference" (http://www.redorbit.com/news/technology/665662/

global_graphics_xps_reference_rip_available_from_microsoft/index.html). Redorbit.com. 2006-09-21. . Retrieved 2009-12-10.

[4] "Reference Raster Image Processor (RIP)" (http://www.microsoft.com/whdc/device/print/RRIP.mspx). Microsoft.com. 2007-01-09. .

Retrieved 2009-12-10.

[5] "HD Photo information on Microsoft Photography team blog" (http://blogs.msdn.com/pix/archive/2007/03/12/hd-photo.aspx).

Blogs.msdn.com. 2007-03-12. . Retrieved 2009-12-10.

[6] Foley, Mary Jo (2005-04-25). "Microsoft Readies New Document Printing Specification" (http://www.microsoft-watch.com/content/

operating_systems/microsoft_readies_new_document_printing_specification.html). Microsoft-watch.com. . Retrieved 2009-12-10.

[7] "Comparison of PDF, XPS and ODF by an ISV providing PDF solutions" (http://www.amyuni.com/blog/?p=8). Amyuni.com. . Retrieved

2009-12-10.

[8] "XPS Documents in Windows Vista" (http://www.microsoft.com/windows/products/windowsvista/features/details/xps.mspx).

Microsoft.com. . Retrieved 2009-12-10.

[9] Download details: XPS Essentials Pack Version 1.0 (http://www.microsoft.com/downloads/details.

aspx?FamilyID=b8dcffdd-e3a5-44cc-8021-7649fd37ffee&displaylang=en) Microsoft XML Paper Specification Essentials Pack

[10] "View and generate XPS" (http://www.microsoft.com/whdc/xps/viewxps.mspx). Microsoft.com. . Retrieved 2009-12-10.

[11] XPSDrv Filter Pipeline: Implementation and Best Practice (http://download.microsoft.com/download/9/c/5/

9c5b2167-8017-4bae-9fde-d599bac8184a/XPSDrv_FilterPipe.doc)

[12] "View and Generate XPS" (http://www.microsoft.com/whdc/xps/viewxps.mspx). Microsoft.com. . Retrieved 2009-12-10.

[13] http://www.artifex.com/

[14] http://www.ghostscript.com/GhostPCL.html

[15] http://okular.kde.org/team.php

[16] http://www.stdutility.com

[17] http://www.xpsdev.com

[18] http://www.aspose.com/

[19] "Aspose.Words Product Family" (http://www.aspose.com/categories/product-family-packs/aspose.words-product-family/default.

aspx). Aspose.com. . Retrieved 2010-03-24.

[20] http://www.multilizer.com


[21] http://www.nixps.com

[22] "NiXPS View" (http://www.nixps.com/view3/index.html). Nixps.com. . Retrieved 2010-03-24.

[23] "NiXPS Edit" (http://www.nixps.com/nixps_edit_20.html). Nixps.com. . Retrieved 2010-03-24.

[24] "Nixps Sdk" (http://www.nixps.com/library.html). Nixps.com. . Retrieved 2010-03-24.

[25] http://www.pagemarktechnology.com/

[26] "Pagemark: XPS Viewer, XPS Converter and XPS Plug-in" (http://www.pagemarktechnology.com/home/products.html).

Pagemarktechnology.com. . Retrieved 2010-03-24.

[27] http://www.pdftron.com/

[28] "PDFTron XPSConvert" (http://www.pdftron.com/xpsconvert/index.html). Pdftron.com. 2007-04-02. . Retrieved 2010-03-24.

[29] "PDFTron PDF2XPS" (http://www.pdftron.com/pdf2xps/index.html). Pdftron.com. 2007-04-02. . Retrieved 2010-03-24.

[30] http://softwareimaging.com/

[31] http://softwareimaging.com/products-services/XPSViewer/index.asp

[32] http://www.ndesk.org/

[33] "NDesk XPS" (http://www.ndesk.org/Xps). Ndesk.org. . Retrieved 2010-03-24.

[34] http://www.danetsoft.com/

[35] Danet Studio (http://www.danetsoft.com/product)

[36] http://www.xps2pdf.org

[37] http://www.treasureup.com/page1.aspx

[38] "XPS to Image" (http://download.cnet.com/TreasureUP-XPS-to-Image-Converter/3000-6675_4-10838983.html). download.cnet.com.

2010-04-05. .

[39] "Sharp Open Systems Architecture supports XPS in multi-function printers" (http://www.sharpusa.com/products/

FunctionPressReleaseSingle/0,1080,650-5,00.html#). Sharpusa.com. . Retrieved 2009-12-10.

[40] Monckton, Paul. "''IT Week'' 10 November 2006, Canon, Epson and HP support for XPS" (http://www.itweek.co.uk/

personal-computer-world/features/2167665/photo-printing-under-windows). Itweek.co.uk. . Retrieved 2009-12-10.

[41] "''Fuji Xerox and Microsoft Collaborate in Document Management Solutions Field''" (http://www.fujixerox.co.jp/eng/headline/2006/

1128_withms.html). Fujixerox.co.jp. 2006-11-28. . Retrieved 2009-12-10.

[42] "XPS & Windows Vista" (http://softwareimaging.com/xps). Software Imaging. . Retrieved 2009-12-10.

[43] "Bot generated title ->" (http://www.pagemarktechnology.com). Pagemark Technology


PCDATA

PCDATA is a term that originated in SGML; it is short for "Parsed Character Data".

#PCDATA in XML DTD

In an XML DTD [1], #PCDATA is the keyword that specifies "mixed content", meaning that an element can contain
character data and/or child elements in arbitrary order and number of occurrences. An element whose content
model is (#PCDATA) alone must contain character data only; an element whose content model starts with #PCDATA
followed by a choice of element names can contain any mixture of character data and those elements.

Although its name and its appearance in a DTD suggest so, #PCDATA itself is not a semantic term for character
data; it can only appear as the leading syntactic construct in a "mixed content" definition. Usages that place
it anywhere else, such as after an element name in the choice or inside a sequence, are illegal.
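The mixed-content rule can be checked mechanically. The sketch below is a simplified Python rendering of the mixed-content production in the XML specification [1]; the element names bold and italic are hypothetical, and the name pattern is restricted to ASCII for brevity, which is an assumption rather than the full XML Name rule.

```python
import re

# Simplified, ASCII-only stand-in for an XML Name (an assumption for brevity).
NAME = r"[A-Za-z_:][A-Za-z0-9_.:\-]*"

# Mixed content is either "(#PCDATA | name | ...)*" or just "(#PCDATA)".
MIXED = re.compile(
    rf"\(\s*#PCDATA(?:\s*\|\s*{NAME})*\s*\)\*"
    rf"|\(\s*#PCDATA\s*\)"
)


def is_valid_mixed(content_model: str) -> bool:
    """Return True if the content model is a legal mixed-content declaration."""
    return MIXED.fullmatch(content_model.strip()) is not None


legal = ["(#PCDATA)", "(#PCDATA | bold | italic)*"]
illegal = ["(bold | #PCDATA)*", "(#PCDATA, bold)", "(#PCDATA | bold)+"]
```

Each string in legal passes the check, while each string in illegal fails it because #PCDATA is not leading, is used in a sequence, or is followed by an occurrence indicator other than the asterisk.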

[1] http://www.w3.org/TR/REC-xml/#sec-mixed-content


Plain Old XML

Plain Old XML (POX) is a term used to describe basic XML, sometimes mixed in with other, blendable

specifications like XML Namespaces, Dublin Core, XInclude and XLink. People typically use the term as a contrast

with complicated, multilayered XML specifications like those for web services or RDF. The term may have been
derived from or inspired by the expression plain old telephone service (POTS) and, similarly, Plain Old Java
Object (POJO).

An interesting question is how POX relates to XML Schema. On the one hand, POX is completely compatible with

XML Schema. However, many POX users eschew XML Schema to avoid the poor or inconsistent quality of XML

Schema-to-Java tools.

POX is complementary to REST: REST refers to a communication pattern, while POX refers to an information

format style.

The primary competitors to POX are more strictly-defined XML-based information formats such as RDF and SOAP

section 5 encoding, as well as general non-XML information formats such as JSON and CSV.

External links

• REST and POX article [1] from the Microsoft Developer Network

• Plain Old XML Considered Harmful [2] from Microformats.org

• Support for POX [3] in the Java Spring Framework

• PlainXML on SourceForge.net [4]

References

[1] http://msdn.microsoft.com/en-us/library/aa395208.aspx

[2] http://microformats.org/wiki/plain-old-xml-considered-harmful

[3] http://static.springsource.org/spring-ws/sites/1.5/apidocs/org/springframework/ws/pox/package-summary.html

[4] http://sourceforge.net/projects/plainxml/


Portable Application Description

Portable Application Description is a machine-readable document format designed by the Association of

Shareware Professionals.

It allows authors to provide product descriptions and specifications to online sources in a standard way, using a

standard data format, a simplified subset of XML, that will allow webmasters and program librarians to automate

program listings. PAD saves time for both authors and webmasters.

Each field in the specification has a regular expression (regex) associated with it. The regex acts as a constraint on

the field: if the regex matches, the field value is legal and if it fails to match, the field and the PAD file as a whole

are out of spec. Only files where all fields in the file pass validation are properly called PAD files.
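The per-field regex constraint can be sketched as follows. The field names and patterns here are hypothetical placeholders chosen for illustration; the real constraints are defined in the official PAD specification and are not reproduced here.

```python
import re

# Hypothetical field rules; the official spec defines the real ones.
FIELD_RULES = {
    "Program_Name": re.compile(r".{1,40}"),
    "Program_Version": re.compile(r"\d+(\.\d+)*"),
}


def invalid_fields(fields: dict) -> list:
    """Return the names of fields whose values fail their regex constraint."""
    failures = []
    for name, pattern in FIELD_RULES.items():
        value = fields.get(name, "")
        # The whole value must match, not just a prefix of it.
        if pattern.fullmatch(value) is None:
            failures.append(name)
    return failures


def is_pad_file(fields: dict) -> bool:
    """A file qualifies only if every field passes validation."""
    return not invalid_fields(fields)


good = {"Program_Name": "Example Tool", "Program_Version": "1.2.3"}
bad = {"Program_Name": "Example Tool", "Program_Version": "beta"}
```

A single failing field, such as the non-numeric version in bad, is enough to put the whole file out of spec.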

The main simplification in PAD relative to XML is that PAD does not use name/value pairs (attributes) in tags:
all tags are attribute-free. This is less expressive than XML but easier to parse. The official PAD spec uses
unique tags, so to extract the fields in the official spec it is not necessary to descend through the tag path.
However, if multiple languages are represented in a single PAD file, then correct parsing does require
descending through the tag path, because leaf tags are duplicated for each language supported.
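The difference between the two lookup styles can be sketched with Python's standard XML parser. The document below is a made-up fragment: its tag names are illustrative assumptions and it is not a valid PAD file.

```python
import xml.etree.ElementTree as ET

# Hypothetical attribute-free fragment with one leaf tag repeated per language.
SAMPLE = """<XML_DIZ_INFO>
  <Program_Info><Program_Name>Example Tool</Program_Name></Program_Info>
  <Program_Descriptions>
    <English><Char_Desc_45>Short text</Char_Desc_45></English>
    <German><Char_Desc_45>Kurzer Text</Char_Desc_45></German>
  </Program_Descriptions>
</XML_DIZ_INFO>"""

root = ET.fromstring(SAMPLE)


def unique_field(tree: ET.Element, tag: str) -> str:
    """Look up a field by tag name alone; valid only when the tag is unique."""
    matches = list(tree.iter(tag))
    if len(matches) != 1:
        raise ValueError(f"tag {tag!r} occurs {len(matches)} times")
    return matches[0].text


def field_by_path(tree: ET.Element, path: str):
    """Look up a field by its full tag path, needed for duplicated leaf tags."""
    node = tree.find(path)
    return None if node is None else node.text


name = unique_field(root, "Program_Name")
german = field_by_path(root, "Program_Descriptions/German/Char_Desc_45")
```

Looking up the duplicated Char_Desc_45 tag by name alone raises an error in this sketch, which is why the multi-language case needs the full path.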

External links

• Official PAD site [1]

• The Official PAD specification [2]

• The Official PAD validator [3]

• 30 or so free and commercial PAD products, services, and links [4]

• PAD database and graphics updated weekly [5]

• About PAD files (Software Industry Professionals) [6]

• PAD Validation Tool [7]

• Online PAD Generator [8]

• Portable Application Description (article in Turkish) [9]

References

[1] http://www.asp-shareware.org/pad/

[2] http://www.asp-shareware.org/pad/spec/spec.php

[3] http://www.asp-shareware.org/pad/spec/validate.php

[4] http://www.asp-shareware.org/pad/padlinks.php

[5] http://paddatacenter.net/

[6] http://www.siprofessionals.org/developers/viewarticle.php?id=si20070802

[7] http://www.sharewarepromotions.com/PAD_Validation.asp

[8] http://www.padbuilder.com/

[9] http://www.tankado.com/pad-portable-application-description/

Publishing Requirements for Industry Standard Metadata

PRISM Metadata Standard

Introduction

The Publishing Requirements for Industry Standard Metadata (PRISM) [1] specification defines a set of XML

metadata vocabularies for syndicating, aggregating, post-processing and multi-purposing content. PRISM provides a

framework for the interchange and preservation of content and metadata, a collection of elements to describe that

content, and a set of controlled vocabularies listing the values for those elements. PRISM metadata can be
expressed in XML, RDF/XML, or XMP and incorporates Dublin Core elements. PRISM can be thought of as a set of
XML tags used to contain the metadata of articles and even to tag article content.

PRISM conforms to the World Wide Web Consortium (W3C) standard for XML namespaces. PRISM namespaces are PRISM (prism:),

PRISM Usage Rights (pur:), Dublin Core (dc: and dcterms:), PRISM Inline Metadata (pim:), PRISM Rights

Language (prl:), PRISM Aggregator Message (pam:), and PRISM Controlled Vocabulary (pcv:). PRISM

incorporated existing industry standards such as Dublin Core and XHTML in order to leverage work that had already

been done in the publishing industry. New elements were created only when required, and were assigned to PRISM

specific namespaces.
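How the namespace prefixes combine in a single record can be sketched with Python's standard library. The wrapper element article, the sample values, and the PRISM namespace URI are assumptions for illustration (the URI shown is the one commonly associated with PRISM 2.0 and should be checked against the specification); the Dublin Core URI is the standard one for the dc: element set.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    # Assumed PRISM basic namespace URI; verify against the PRISM specification.
    "prism": "http://prismstandard.org/namespaces/basic/2.0/",
}
for prefix, uri in NAMESPACES.items():
    ET.register_namespace(prefix, uri)


def qname(prefix: str, local: str) -> str:
    """Return the namespace-qualified element name for a prefix and local name."""
    return f"{{{NAMESPACES[prefix]}}}{local}"


# Hypothetical wrapper element holding one Dublin Core and one PRISM element.
record = ET.Element("article")
ET.SubElement(record, qname("dc", "title")).text = "Example article"
ET.SubElement(record, qname("prism", "publicationName")).text = "Example Magazine"

xml_text = ET.tostring(record, encoding="unicode")
publication = ET.fromstring(xml_text).find("prism:publicationName", NAMESPACES).text
```

Reusing dc:title for the title while adding only publicationName under the prism: prefix mirrors the approach described above, where new elements are created only when no existing standard covers them.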

Overview

PRISM consists of three specifications. The PRISM Specification itself defines the overall PRISM

framework. A second specification, the PRISM Aggregator Message (PAM) Schema/DTD, is a standard format for

publishers to use for delivery of content to websites, aggregators, and syndicators. PAM is available as an XML

DTD and an XML schema (XSD). Both PAM formats provide a simple, flexible model for transmitting content and

PRISM metadata. The third, and newest, specification provides an XML schema (XSD) for capture of content usage

rights metadata. This Guide to PRISM Usage Rights utilizes the elements found in PRISM’s Usage Rights

Namespace to allow users to comprehensively capture and relay rights metadata for text and media content.

Background

In 1999, IDEAlliance contracted Linda Burman to found the PRISM Working Group to address emerging publisher

requirements for a metadata standard to facilitate “agile” content for search, digital asset management, and content

aggregation. Since that time, individuals from more than 50 IDEAlliance member companies have participated in the

development of the specifications.

PRISM is an IDEAlliance specification but is available free of charge. IDEAlliance (International Digital Enterprise

Alliance) is a not-for-profit membership organization. Its mission is to advance user-driven, cross-industry solutions

for all publishing and content-related processes by developing standards, fostering business alliances, and identifying

best practices.

Many organizations use PRISM because it provides a common metadata standard across platforms, media types and

business units. Organizations that are involved in any type of content creation, categorization, management,

aggregation and distribution, both commercially and within intranet and extranet frameworks can use the PRISM

standards.

The PRISM Working Group is open to all IDEAlliance members and includes: Adobe Systems, Hachette Filipacchi

Media, Hearst, L.A. Burman Associates, LexisNexis, The McGraw-Hill Companies, Reader’s Digest, Source

Interlink Media Companies, Time Inc., The Nature Publishing Group, and U.S. News and World Report.

Usage and Applications

PRISM can be incorporated into other standards. At this time, the PRISM Working Group is aware of only one such

incorporation, with RSS 1.0. See RSS 1.0 [2] and the RSS 1.0 PRISM Module for more information.

The PRISM specification defines a set of metadata vocabularies. PRISM metadata may be expressed in a different

syntax depending on the specific use-case scenario. Currently PRISM metadata can be encoded as XML, RDF/XML, or

XMP. Each of these expressions of PRISM metadata is called a profile.

• Profile 1 is for the expression of PRISM metadata in XML. An example is the XML PRISM Aggregator Message

(PAM).

• Profile 2 is for the expression of PRISM metadata in RDF/XML, for example in RSS

feeds.

• Profile 3 is for embedding PRISM metadata in media objects such as digital images or PDFs using XMP

technology.
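To make Profile 1 more concrete, the following is a minimal sketch of reading PRISM and Dublin Core elements from a plain XML record with Python's standard library. The record itself, its `article` wrapper element, and the sample values are invented for illustration. The `prism:` namespace URI is an assumption based on the PRISM 2.0 basic namespace and should be checked against the version of the specification in use.

```python
import xml.etree.ElementTree as ET

# The Dublin Core URI is the standard one. The PRISM URI is an assumption
# (PRISM 2.0 basic namespace) and may differ between specification versions.
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "prism": "http://prismstandard.org/namespaces/basic/2.0/",
}

# Hypothetical record: the <article> wrapper and all values are made up.
RECORD = """<article
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:prism="http://prismstandard.org/namespaces/basic/2.0/">
  <dc:title>Sample Feature Story</dc:title>
  <dc:creator>A. Writer</dc:creator>
  <prism:publicationName>Example Magazine</prism:publicationName>
  <prism:volume>12</prism:volume>
  <prism:number>3</prism:number>
</article>"""


def read_metadata(xml_text):
    """Return a dict mapping 'prefix:name' to the text of each child element."""
    root = ET.fromstring(xml_text)
    uri_to_prefix = {uri: prefix for prefix, uri in NS.items()}
    metadata = {}
    for child in root:
        # ElementTree expands qualified names to '{namespace-uri}local'.
        uri, _, local = child.tag[1:].partition("}")
        prefix = uri_to_prefix.get(uri, "?")
        metadata[f"{prefix}:{local}"] = (child.text or "").strip()
    return metadata


metadata = read_metadata(RECORD)
print(metadata["dc:title"], "-", metadata["prism:publicationName"])
```

Because the lookup goes through namespace URIs rather than prefixes, the same function would still work if a publisher bound the vocabularies to different prefixes.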

PRISM describes many components of print, online, mobile, and multimedia content including the following:

• Who created, contributed to, and owns the rights to the content

• What locations, organizations, topics, people, and events it covers, what media it contains, and under what

conditions it may be reproduced

• When it was published (cover date, post date, volume, number) or withdrawn

• Where it can be republished, and the original platform on which it appeared

• How it can be reused

Common PRISM Usage

• Syndication to partners

• Content aggregation

• Content repurposing

• Resource discovery and search optimization

• Multiple platform and channel distribution

• Content archiving

• Capture rights usage information

• Creation of feeds, such as RSS

• Standalone services

• Embedded descriptions, such as XMP

• Web publishing

See also

• Dublin Core

• DTD

• Comparison of document markup languages

• Controlled vocabulary

• Interoperability

• Dublin Core Metadata Initiative

• Bibliographic Ontology

Further reading

• IDEAlliance [3]

• PRISM Standard [4]

• PRISM FAQ [5]

• RSS 1.0 PRISM Module [6]

• Using PRISM - The PRISM Cookbook [7] is a systematic guide that demonstrates how to apply PRISM elements

in particular business scenarios. The existing PRISM Cookbook addresses only PRISM Profile 1 (XML).

• W3C – Namespaces in XML [8]

References

[1] PRISM Metadata Standard (http://www.idealliance.org/industry_resources/intelligent_content_informed_workflow/prism)

[2] http://web.resource.org/rss/1.0/spec

[3] http://www.idealliance.org

[4] http://www.prismstandard.org

[5] http://www.prismstandard.org/faq/

[6] http://nurture.nature.com/rss/modules/mod_prism.html

[7] http://www.prismstandard.org/resources/

[8] http://www.w3.org/TR/2006/REC-xml-names11-20060816/

QName

QNames were introduced by XML Namespaces so that names could be qualified by URI references [1]. QName stands for

"qualified name" and defines a valid identifier for elements and attributes. QNames are generally used to reference

particular elements or attributes within XML documents. [2]

Motivation

Since URI references can be long and may contain characters that are prohibited in element and attribute names, QNames

rely on a mapping between the URI and a namespace prefix. The mapping allows the URI to be abbreviated,

which makes XML documents more convenient to write (see Example).

Formal definition

QNames are formally defined by the W3C as [3] :

QName ::= PrefixedName | UnprefixedName

PrefixedName ::= Prefix ':' LocalPart

UnprefixedName ::= LocalPart

Here the Prefix serves as a placeholder for the namespace and the LocalPart is the local part of the qualified

name. A local part can be an attribute name or an element name.
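The grammar can be illustrated with a small Python sketch that splits a QName into its prefix and local part and then resolves the prefix against a mapping. The function names `split_qname` and `expand` are hypothetical, and the check is deliberately simplified: it only enforces the position of the colon and does not validate the full set of characters permitted in a name.

```python
def split_qname(qname):
    """Split a QName into (prefix, local_part); prefix is None if absent."""
    prefix, colon, local = qname.partition(":")
    if not colon:
        # UnprefixedName ::= LocalPart
        if not qname:
            raise ValueError("a QName cannot be empty")
        return None, qname
    # PrefixedName ::= Prefix ':' LocalPart (exactly one colon, both parts non-empty)
    if not prefix or not local or ":" in local:
        raise ValueError(f"not a valid QName: {qname!r}")
    return prefix, local


def expand(qname, prefix_map, default_ns=None):
    """Resolve a QName to (namespace_uri, local_part) using a prefix mapping."""
    prefix, local = split_qname(qname)
    if prefix is None:
        return default_ns, local
    return prefix_map[prefix], local


print(split_qname("x:p"))
print(expand("x:p", {"x": "http://example.com/ns/foo"}))
```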

Example

In line two the prefix "x" is declared to be associated with the URI "http://example.com/ns/foo". This prefix can

then be used as an abbreviation for this namespace. Consequently, the tag "x:p" is a valid QName because it uses

"x" as the namespace reference and "p" as the local part. The tag "doc" is also a valid QName, but it consists only of a

local part. [4]
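A document matching this description might look like the one embedded in the sketch below. The markup is an assumption inferred from the prose (a prefix declaration on line two, an "x:p" element, and a "doc" element), not a copy of an original listing. Parsing it with Python's standard library shows how the prefixed name is expanded to its namespace URI while the unprefixed name is left as a bare local part.

```python
import xml.etree.ElementTree as ET

# Assumed example document: line two declares the prefix "x".
DOC = """<?xml version='1.0'?>
<doc xmlns:x="http://example.com/ns/foo">
  <x:p/>
</doc>"""

root = ET.fromstring(DOC)
child = root[0]

# ElementTree reports expanded names in the form '{namespace-uri}local'.
print(root.tag)   # unprefixed QName, no namespace applied
print(child.tag)  # prefixed QName resolved through the xmlns:x declaration
```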

See also

• CURIE

References

[1] Namespaces in XML 1.0 (Second Edition) (http://www.w3.org/TR/REC-xml-names/#dt-qualname)

[2] Using Qualified Names (QNames) as Identifiers in XML Content (http://www.w3.org/2001/tag/doc/qnameids.html#sec-qnames-xml)

[3] Namespaces in XML 1.0 (Second Edition) (http://www.w3.org/TR/REC-xml-names/#NT-QName)

[4] Namespaces in XML 1.0 (Second Edition) (http://www.w3.org/TR/REC-xml-names/#NT-LocalPart)

QTI

The IMS Question and Test Interoperability specification (QTI) defines a standard format for the representation

of assessment content and results, supporting the exchange of this material between authoring and delivery systems,

repositories and other learning management systems. It allows assessment materials to be authored and delivered on

multiple systems interchangeably. It is, therefore, designed to facilitate interoperability between systems [1].

The specification consists of a data model that defines the structure of questions, assessments and results from

questions and assessments together with an XML data binding that essentially defines a language for interchanging

questions and other assessment material. The XML binding is widely used for exchanging questions between

different authoring tools and by publishers. The assessment and results parts of the specification are less widely used.

Background

QTI was produced by the IMS Global Learning Consortium, which is an industry and academic consortium that

develops specifications for interoperable learning technology. QTI was inspired by the need for interoperability in

question design, and to avoid people losing or having to re-type questions when technology changes. Developing and

validating good questions is time consuming, and it's desirable to be able to create them in a platform and technology

neutral format.

QTI version 1.0 was materially based on the proprietary Questions Markup Language (QML) defined by

Questionmark, but the language has evolved over the years and can now describe almost any reasonable question

that one might want to describe. (QML is still in use by Questionmark and is generated for interoperability by tools

like Adobe Captivate.)

The most widely used version of QTI at the time of writing is version 1.2, which was finalized in 2002. This works

well for exchanging simple question types, and is supported by many tools that allow the creation of questions.

Version 2.0 was released in 2005, with v2.1 due for release in 2008 [2]. Version 2.0 addressed the item (individual question)

level of the specification only, with 2.1 covering assessments and results as well as correcting errors which had

become apparent in 2.0. Version 2.x is a significant improvement on earlier versions, defining a new underlying

interaction model. It is also notable for its significantly greater degree of integration with other specifications (some

of which did not exist during the production of v1): the specification addresses the relationship with IMS Content

Packaging v1.2, IEEE Learning Object Metadata, IMS Learning Design, IMS Simple Sequencing and other

standards such as XHTML. It also provides guidance on representing context-specific usage data and information to

support the migration of content from earlier versions of the specification.

Because v2.0 was limited to items only, and v2.1 has yet to be formally released by IMS (although two public drafts

plus an addendum are currently available), uptake of v2.x has been slow to date. The delay between the release of 2.0

and 2.1 (over three years to date) may have hindered uptake to some extent, with developers reluctant to commit to

v2.0 knowing that v2.1 is in development. The use of a profile of v1.2.1 in the IMS Common Cartridge specification

may exacerbate this. A number of implementations are emerging, however, and uptake may increase once the

specification is finally available in a stable form.

In early 2009, the IMS Global Learning Consortium withdrew QTI 2.1, stating that "Adequate feedback on the

specification has not been received, and therefore, the specification has been put back into the IMS project group

process for further work." [3] The most recent version of QTI that is fully endorsed by IMS GLC is v1.2.1. This

decision met with disapproval on the IMS-QTI mailing list. [4] A further clarification on the QTI 2.1 withdrawal

acknowledged the work done on implementing the QTI 2.1 draft specification, and cited criticism on the lack of

interoperability of IMS specifications as a reason for endorsing only IMS QTI 1.2. [5] A few weeks later IMS GLC

reposted the QTI v2.1 draft specification on their website [6] with a warning that the specification is incomplete:

Caution: The QTIv2.1PD Version 2 specification is incomplete in its current state. The IMS QTI project group

is in the process of evolving this specification based on input from market participants. Suppliers of products

and services are encouraged to participate by contacting Mark McKell at [e-mail address removed]. This

specification will be superseded by an updated release based on the input of the project group participants.

Please note that supplier's claims as to implementation of QTI v2.1 and conformance to it HAVE NOT BEEN

VALIDATED by IMS GLC. While such suppliers are likely well-intentioned, IMS GLC member

organizations have not yet put in place the testing process to validate these claims. IMS GLC currently grants a

conformance mark to the Common Cartridge profile of QTI v1.2.1. [7]

Timeline

Date | Version | Comments
March 1999 | 0.5 | Internal to IMS
February 2000 | 1.0 public draft |
May 2000 | 1.0 final release |
August 2000 | 1.01 |
March 2001 | 1.1 |
January 2002 | 1.2 |
March 2003 | 1.2.1 addendum |
September 2003 | 2.0 charter | Initiation of working group
January 2005 | 2.0 final release |
January 2006 | 2.1 public draft |
July 2006 | 2.1 public draft version 2 |
April 2008 | 2.1 public draft addendum |
early 2009 | 2.1 | removed from website
January 2010 | 2.1 | reinstated on website

Applications with IMS QTI support

Name | QTI version | Type of tool | Comment
ANGEL Learning Management Suite | 2.1 [8] | LMS | also supports IMS Common Cartridge [8]
APIS QTIv2 Assessment Engine | 2.0 draft [9] | Java library & demo application. | Incomplete. Author recommends using QTITools instead.
AQuRate | 2.1 [10] | authoring tool | see QTITools
ASDEL | 2.1 [11] | assessment delivery system | see QTITools
ATutor | 1.2, 2.1 [12] | LCMS |
Canvas Learning [13] | 1.2.1 | Authoring tools and SCORM compatible item renderer available as middle-ware solutions. | Creators - Can Studios contributed to the development of the QTI specification. A number of LMS systems used the Canvas Learning Player to achieve compatibility with the Becta learning platform conformance regime. The system is currently being distributed to schools in the UK as a result of this integration work.
CCReader | 1.2.1 CC Profile [14] | Common Cartridge Viewer |
Cognero | 1.2 and 2.1 [15] | Assessment authoring and delivery system. | Cognero imports QTI 1.2 and exports QTI 1.2 and 2.1 to allow content to work with other systems.
Content-e | 1.2 & 2.0 [16] | Professional authoring tool Content-e. | Imports QTI 1.2 and 2.0.
DB Primary | 2.0 [17] [18] | LMS |
Diploma | 1.2, 2.1 [19] | | export QTI 1.2 & 2.1
Dokeos | 1.2 and 2.0 [20] | LMS/LCMS | export QTI 1.2 & 2.0 (1.2 disabled by default but available) (supports SCORM 1.2)
Elques | 2.1 [21] [22] | authoring tool | exports QTI 2.1 and QTI 1.2 (for LMS OLAT only); imports QTI 2.1, Tests from Blackboard and OLAT (kind of QTI 1.2 too)
it's learning | 2.1 [23] | VLE | import and export questions in QTI 2.1 format
ILIAS | not stated [24] | LMS | supports SCORM 1.2 and SCORM 2004
Lectora | not stated [25] | authoring tool | supports SCORM 1.2 and SCORM 2004
Mathqurate | 2.1 [26] | authoring tool | see QTITools. Embedded Gecko engine and support for multiple interactions
Moodle | not stated [27] | LCMS | supports adaptive questions; QTI 2.0 export is still unfinished
Online Learning And Training | 1.2 [28] | LCMS | QTI 2.1 compliance can be achieved with ONYX as plugin
ONYX | 2.1 [29] | modular assessment delivery system | open-source, QTI 2.1 import and export, Report Viewer for graphical visualization of QTI-Result-Files
OWL Testing Software | not stated [30] | test management system | can import IMS QTI
QTITools | 2.1 [31] | collection of tools and libraries | Test authoring tool Spectatus produces QTI 2.1 [32]
QuestionMark Perception | not stated [33] | authoring tool and delivery system | can export IMS QTI, an online tool provides QTI 1.2 import
Question Writer 2.0 Publisher Edition [34] | 1.2 | authoring tool | Exports as QTI 1.2 and SCORM 1.2 [35]
Question Writer 3.5 Professional [36] | 1.2 | authoring tool | Exports as QTI 1.2 and SCORM 1.2 [37] Also specific QTI Export for Pearson VUE [38]
Respondus | 1.2 [39] [40] | authoring tool | QTI export
RM Test Authoring System | 2.1 [41] | authoring tool |
Sakai | 1.2 [42] | LMS |
SToMP (Software Teaching of Modular Physics) | 2.1 [43] | assessment system | mostly unavailable as of July 2008
Studywiz | 1.2 [44] | Virtual Learning Environment Module | An optional module for creating and assigning QTI v1.2 questions to students. Available as of June 2008
Wimba Create | QTI Lite [45] | authoring tool | only export

Other software:

• QTI Migration Tool (University of Cambridge): converts QTI version 1.x data into QTI 2.0 content packages. [46]

External links

• IMS Global Learning Consortium: IMS Question & Test Interoperability Specification [47]

• TOIA (Technologies for Online Interoperable Assessment) [48] - this project ended in 2007 and software is no

longer available.

• QTI Tools [49]

• JISC CETIS Assessment special interest group [50]

• JISC CETIS wiki: Assessment tools, projects and resources [51]

• IMS Question & Test Interoperability mailing list [52]

References

[1] Effective Practice with e-Assessment guide, p.44 (http://www.jisc.ac.uk/media/documents/themes/elearning/effpraceassess.pdf)

[2] QTI Update (http://wiki.cetis.ac.uk/Assessment_and_EC_SIGs_meeting_Feb_2008#QTI_Update)

[3] IMS Global Learning Consortium: IMS Question & Test Interoperability Specification (http://www.imsglobal.org/question/index.html).

Accessed March 29, 2009.

[4] E-mail thread "QTI 2.1 draft specification withdrawn" (http://lists.ucles.org.uk/public/ims-qti/2009-March/001456.html), starting

March 27, 2009.

[5] Rob Abel: Further clarification on the removal of QTI v2.1 from the IMS web site (http://www.imsglobal.org/community/forum/

messageview.cfm?catid=21&threadid=36&enterthread=y), on the IMS Global Learning Consortium's Question and Test Interoperability

Forum, March 30, 2009. Accessed March 29, 2009.

[6] rabel: We are reposting the QTI v2.1 (http://www.imsglobal.org/community/forum/messageview.cfm?catid=21&threadid=41&

enterthread=y). Question and Test Interoperability Forum, April 14, 2009. Accessed April 17, 2009.

[7] IMS Global Learning Consortium: IMS Question & Test Interoperability Specification (http://www.imsglobal.org/question/index.html).

Accessed April 17, 2009.

[8] ANGEL Learning Management Suite: Standards Leadership (http://www.angellearning.com/products/lms/standards.html). Accessed

March 30, 2009.

[9] Sourceforge.net: APIS QTIv2 Assessment Engine (http://sourceforge.net/projects/apis). Accessed March 30, 2009.

[10] AQuRate: A QTI-2.x Authoring Tool (http://aqurate.kingston.ac.uk/). Accessed March 30, 2009.

[11] ASDEL: assessment delivery system for QTIv2 questions (http://www.asdel.ecs.soton.ac.uk/). Accessed March 30, 2009.

[12] ATutor Learning Content Management System: Information (http://www.atutor.ca/atutor/). Accessed March 30, 2009.

[13] Canvas Learning (http://www.canvaslearning.com). Accessed August, 2009.

[14] CCReader project in Sourceforge (http://sourceforge.net/projects/ccreader). Accessed March 30, 2009.

[15] Cognero: Cognero Features (http://www.cognero.com/features.html). Accessed February 19, 2009

[16] Professional authoring tool content-e. (http://eng.content-e.nl/) Accessed July, 2009.

[17] iBoard content available in DB Primary (http://www.e2bn.org/services/120/iboard-content-available-in-db-primary.html). Accessed

March 30, 2009.

[18] DB Primary's own Technical Overview (http://www.getprimary.com/tech_spec.html) does not mention QTI.

[19] Diploma 6 (Windows) Release Notes (6.61 (Build 0087 - 8/8/2008)) (http://www.brownstone.net/support/Dip6-ReleaseNotes.asp).

Accessed March 30, 2009.

[20] Dokeos code (no other reference available) (http://dokeos.svn.sourceforge.net/viewvc/dokeos/trunk/dokeos/main/exercice/export/)

[21] Elques: Elques Features (http://elques.bps-system.de/en/?Features). Accessed March 30, 2009.

[22] Elques: Elques 2.0 (http://elques.bps-system.de/) (in German). Accessed

September 30, 2009.

[23] it's learning: Importing and exporting (https://www.itslearning.com/Ntt/Help/en-GB/Default_Left.htm#StartTopic=Adding). Accessed

June 19, 2009.

[24] ILIAS France (http://ilias-france.info/ilias.htm). Accessed March 30, 2009.

[25] Lectora Supports eLearning Standards (http://www.trivantis.com/products/elearningstandards.html). Accessed March 30, 2009.

[26] Mathqurate: Maths-enabled QTI-2.1 item authoring (http://aqurate.kingston.ac.uk/mathqurate/). Accessed April 3, 2009.

[27] Development:Question engine - MoodleDocs (http://docs.moodle.org/en/Question_engine). Accessed March 30, 2009.

[28] OLAT Feature List and Some Screenshots (http://www.olat.org/website/en/html/about_features.html). Accessed March 30, 2009.

[29] Onyx Feature List and more Infos (http://onyx.bps-system.de/en/?Features). Accessed March 30, 2009.

[30] OWL Test Conversion Service (http://www.owlts.com/test-conversion.html). Accessed March 30, 2009.

[31] SourceForge.net: QTItools (http://sourceforge.net/projects/qtitools/). Accessed March 30, 2009.

[32] Paul Neve: "Spectatus - QTI 2.1 test authoring tool (http://lists.ucles.org.uk/public/ims-qti/2010-February/001571.html)", IMS-QTI

mailing list, February 26, 2010. Accessed April 14, 2010.

[33] Questionmark - Windows Based Authoring - Question Types (http://www.questionmark.com/us/perception/

authoring_windows_qm_qtypes.aspx). Accessed March 30, 2009.

[34] Publisher's Legacy Software Page (http://www.questionwriter.com/pricing/custom-development.html). Accessed March 31, 2009.

[35] Question Writer 2.0 Publisher Edition Manual (http://downloads.centralquestion.com/QuestionWriterManual.pdf). Accessed March 31,

2009.

[36] Question Writer Blog Announcement (http://www.questionwriterblog.com/archives/2009/05/question_writer_34.html). Accessed May

18, 2009.

[37] Question Writer Features Description (http://www.questionwriter.com/features.html). Accessed May 18, 2009.

[38] Question Writer Blog Entry on Feature (http://www.questionwriterblog.com/archives/2009/06/qti_for_pearson_vue.html). Accessed

July 29, 2009.

[39] Respondus Plug-in for Moodle (http://www.respondus.com/update/2007-11-c.shtml). Accessed March 30, 2009.

[40] The Respondus Version 3.5 page (http://www.respondus.com/products/respondus.shtml) does not mention the QTI version.

[41] RM: Test Authoring System (http://www.rm.com/generic.asp?cref=GP1002551). Accessed March 31, 2009.

[42] Sakai: SAMigo/Test and Quizzes (http://bugs.sakaiproject.org/confluence/display/SAM/Home). Accessed March 30, 2009.

[43] SToMP: An Overview (http://www.stomp.ac.uk/). Accessed March 31, 2009.

[44] Studywiz QT Assessment (http://www.europe.studywiz.com/?page_id=72). Accessed April 03, 2009.

[45] Wimba Create Brochure (http://www.wimba.com/assets/resources/wimbaCrBrochure_HE.pdf). Accessed March 30, 2009.

[46] QTI Migration Tool (http://qtitools.caret.cam.ac.uk/index.php?option=com_docman&task=cat_view&gid=18&Itemid=28). Accessed

March 30, 2009.

[47] http://www.imsglobal.org/question

[48] http://www.toia.ac.uk

[49] http://qtitools.caret.cam.ac.uk/

[50] http://jisc.cetis.ac.uk/domain/assessment

[51] http://wiki.cetis.ac.uk/Assessment_tools%2C_projects_and_resources

[52] http://lists.ucles.org.uk/lists/listinfo/ims-qti

Resource Description Framework

Current Status: Published

Editors: Frank Manola, Eric Miller

Base Standards: XML, URI

Related Standards: RDFS, OWL

Domain: Semantic Web

Abbreviation: RDF

Website: RDF Primer [1]

The Resource Description Framework (RDF) is a family of World Wide Web Consortium (W3C) specifications

originally designed as a metadata data model. It has come to be used as a general method for conceptual description

or modeling of information that is implemented in web resources, using a variety of syntax formats.

Overview

The RDF data model [2] is similar to classic conceptual modeling approaches such as Entity-Relationship or Class

diagrams, as it is based upon the idea of making statements about resources (in particular Web resources) in the form

of subject-predicate-object expressions. These expressions are known as triples in RDF terminology. The subject

denotes the resource, and the predicate denotes traits or aspects of the resource and expresses a relationship between

the subject and the object. For example, one way to represent the notion "The sky has the color blue" in RDF is as

the triple: a subject denoting "the sky", a predicate denoting "has the color", and an object denoting "blue". RDF is

an abstract model with several serialization formats (i.e., file formats), and so the particular way in which a resource

or triple is encoded varies from format to format.
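As a minimal sketch of the data model, independent of any serialization format, a triple can be represented in Python as a three-field tuple and a graph as a set of such tuples. The `http://example.org/` URIs, the `Triple` type, and the `objects` helper are hypothetical names chosen for illustration and are not part of any RDF vocabulary or library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical namespace used only for this example.
EX = "http://example.org/"

# "The sky has the color blue" as subject, predicate, object.
statement = Triple(EX + "sky", EX + "hasColor", EX + "blue")

# A graph is a set of triples; adding the same statement twice has no effect.
graph = {statement}
graph.add(Triple(EX + "sky", EX + "isAbove", EX + "ground"))
graph.add(statement)


def objects(graph, subject, predicate):
    """Return all objects related to subject by predicate, sorted."""
    return sorted(t.obj for t in graph
                  if t.subject == subject and t.predicate == predicate)


print(objects(graph, EX + "sky", EX + "hasColor"))
```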

This mechanism for describing resources is a major component in what is proposed by the W3C's Semantic Web

activity: an evolutionary stage of the World Wide Web in which automated software can store, exchange, and use

machine-readable information distributed throughout the Web, in turn enabling users to deal with the information

with greater efficiency and certainty. RDF's simple data model and ability to model disparate, abstract concepts has

also led to its increasing use in knowledge management applications unrelated to Semantic Web activity.

A collection of RDF statements intrinsically represents a labeled, directed multi-graph. As such, an RDF-based data

model is more naturally suited to certain kinds of knowledge representation than the relational model and other

ontological models. However, in practice, RDF data is often persisted in relational databases or in native representations

called triplestores, or quad stores if context (i.e. the named graph) is also persisted for each RDF triple. [3] As

RDFS and OWL demonstrate, additional ontology languages can be built upon RDF.

History

There were several ancestors to the W3C's RDF. Technically the closest was MCF, a project initiated by

Ramanathan V. Guha while at Apple Computer and continued, with contributions from Tim Bray, during his tenure

at Netscape Communications Corporation. Ideas from the Dublin Core community, and from PICS, the Platform for

Internet Content Selection (the W3C's early Web content labelling system) were also key in shaping the direction of

the RDF project.

The W3C published a specification of RDF's data model and XML syntax as a Recommendation in 1999. [4] Work

then began on a new version that was published as a set of related specifications in 2004. While there are a few

implementations based on the 1999 Recommendation that have yet to be completely updated, adoption of the

improved specifications has been rapid since they were developed in full public view, unlike some earlier

technologies of the W3C. Most newcomers to RDF are unaware that the older specifications even exist.

RDF Topics

RDF Vocabulary

The vocabulary defined by the RDF specification is:

• rdf:type - a predicate used to state that a resource is an instance of a class

• rdf:XMLLiteral - the class of XML literal values

• rdf:Property - the class of properties

• rdf:Alt, rdf:Bag, rdf:Seq - containers of alternatives, unordered containers, and ordered containers (rdfs:Container

is a super-class of the three)

• rdf:List - the class of RDF Lists

• rdf:nil - an instance of rdf:List representing the empty list

• rdf:Statement, rdf:subject, rdf:predicate, rdf:object – used for reification (see below)

This vocabulary is used as a foundation for RDF Schema where it is extended.

Serialization formats

Two common serialization formats are in use.

The first is an XML format. This format is often called simply RDF because it was introduced among the other W3C

specifications defining RDF. However, it is important to distinguish the XML format from the abstract RDF model

itself. Its MIME media type, application/rdf+xml, was registered by RFC 3870, which recommends that RDF documents

follow the 2004 specifications.

In addition to serializing RDF as XML, the W3C introduced Notation 3 (or N3) as a non-XML serialization of RDF

models designed to be easier to write by hand, and in some cases easier to follow. Because it is based on a tabular

notation, it makes the underlying triples encoded in the documents more easily recognizable compared to the XML

serialization. N3 is closely related to the Turtle and N-Triples formats.
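To show why the line-oriented formats make triples easy to recognize, here is a simplified sketch that writes triples in the N-Triples style, one statement per line. The term encoding as `(kind, value)` pairs and the function names are assumptions made for this example. The escaping covers only backslashes, double quotes, and line breaks, and language tags and datatypes are left out, so this is not a complete N-Triples writer.

```python
def nt_term(term):
    """Format one term given as ('uri', value), ('bnode', label) or ('literal', text)."""
    kind, value = term
    if kind == "uri":
        return f"<{value}>"
    if kind == "bnode":
        return f"_:{value}"
    if kind == "literal":
        escaped = (value.replace("\\", "\\\\")
                        .replace('"', '\\"')
                        .replace("\n", "\\n")
                        .replace("\r", "\\r"))
        return f'"{escaped}"'
    raise ValueError(f"unknown term kind: {kind!r}")


def to_ntriples(triples):
    """Serialize triples as one 'subject predicate object .' line each."""
    return "".join(f"{nt_term(s)} {nt_term(p)} {nt_term(o)} .\n"
                   for s, p, o in triples)


sky = [(("uri", "http://example.org/sky"),
        ("uri", "http://example.org/hasColor"),
        ("literal", "blue"))]
print(to_ntriples(sky), end="")
```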

Triples may be stored in a triplestore.

Resource identification

The subject of an RDF statement is either a Uniform Resource Identifier (URI) or a blank node, both of which

denote resources. Resources indicated by blank nodes are called anonymous resources. They are not directly

identifiable from the RDF statement. The predicate is a URI which also indicates a resource, representing a

relationship. The object is a URI, blank node or a Unicode string literal.

In Semantic Web applications, and in relatively popular applications of RDF like RSS and FOAF (Friend of a

Friend), resources tend to be represented by URIs that intentionally denote, and can be used to access, actual data on

the World Wide Web. But RDF, in general, is not limited to the description of Internet-based resources. In fact, the

URI that names a resource does not have to be dereferenceable at all. For example, a URI that begins with "http:"

and is used as the subject of an RDF statement does not necessarily have to represent a resource that is accessible via

HTTP, nor does it need to represent a tangible, network-accessible resource — such a URI could represent

absolutely anything. However, there is broad agreement that a bare URI (without a # symbol) which returns a

300-level coded response when used in an HTTP GET request should be treated as denoting the internet resource that it

succeeds in accessing.

Therefore, producers and consumers of RDF statements must agree on the semantics of resource identifiers. Such

agreement is not inherent to RDF itself, although there are some controlled vocabularies in common use, such as

Dublin Core Metadata, which is partially mapped to a URI space for use in RDF. The intent of publishing

RDF-based ontologies on the Web is often to establish, or circumscribe, the intended meanings of the resource

identifiers used to express data in RDF. For example, the URI http://www.w3.org/TR/2004/REC-owl-guide-20040210/wine#merlot

is intended by its owners to refer to the class of all Merlot red wines, an

intent which is expressed by the OWL ontology — itself an RDF document — in which it occurs. Note that this is

not a 'bare' resource identifier, but is rather a URI reference, containing the '#' character and ending with a fragment

identifier.

Statement reification and context

The body of knowledge modeled by a collection of statements may be subjected to reification, in which each

statement (that is each triple subject-predicate-object altogether) is assigned a URI and treated as a resource about

which additional statements can be made, as in "Jane says that John is the author of document X". Reification is

sometimes important in order to deduce a level of confidence or degree of usefulness for each statement.

In a reified RDF database, each original statement, being itself a resource, most likely has at least three additional

statements made about it: one to assert that its subject is some resource, one to assert that its predicate is some

resource, and one to assert that its object is some resource or literal. More statements about the original statement

may also exist, depending on the application's needs.
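The reification pattern can be sketched in Python using the rdf:Statement, rdf:subject, rdf:predicate and rdf:object terms from the RDF vocabulary. The `reify` helper and every `http://example.org/` URI are hypothetical. The sketch also adds an rdf:type statement, so it produces four descriptive triples rather than the minimum of three mentioned above.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/"  # hypothetical namespace for the example


def reify(statement_uri, subject, predicate, obj):
    """Describe the triple (subject, predicate, obj) as a resource of its own."""
    return [
        (statement_uri, RDF + "type", RDF + "Statement"),
        (statement_uri, RDF + "subject", subject),
        (statement_uri, RDF + "predicate", predicate),
        (statement_uri, RDF + "object", obj),
    ]


# "John is the author of document X", given the URI ex:stmt1 ...
triples = reify(EX + "stmt1", EX + "documentX", EX + "author", EX + "John")

# ... so that "Jane says that ..." can be stated about the statement itself.
triples.append((EX + "Jane", EX + "says", EX + "stmt1"))

for triple in triples:
    print(triple)
```

Note that the reified description does not assert the original triple; an application that also wants "John is the author of document X" to hold would store that triple separately.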

Borrowing from concepts available in logic (and as illustrated in graphical notations such as conceptual graphs and

topic maps), some RDF model implementations acknowledge that it is sometimes useful to group statements

according to different criteria, called situations, contexts, or scopes, as discussed in articles by RDF specification

co-editor Graham Klyne [5] [6]. For example, a statement can be associated with a context, named by a URI, in order

to assert an "is true in" relationship. As another example, it is sometimes convenient to group statements by their

source, which can be identified by a URI, such as the URI of a particular RDF/XML document. Then, when updates

are made to the source, corresponding statements can be changed in the model, as well.

Implementation of scopes does not necessarily require fully reified statements. Some implementations allow a single

scope identifier to be associated with a statement that has not been assigned a URI, itself [7] [8]. Likewise named

graphs in which a set of triples is named by a URI can represent context without the need to reify the triples. [9]
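Grouping statements by source without reifying them can be sketched as a small quad store in which each triple is filed under the URI of the graph it came from. The `QuadStore` class and its method names are hypothetical and do not correspond to any particular RDF library. Replacing a graph models the case where a source document is updated and its statements are reloaded.

```python
from collections import defaultdict


class QuadStore:
    """Triples grouped by the URI of the named graph (context) they belong to."""

    def __init__(self):
        self._by_graph = defaultdict(set)

    def add(self, graph_uri, subject, predicate, obj):
        self._by_graph[graph_uri].add((subject, predicate, obj))

    def replace_graph(self, graph_uri, triples):
        """Drop everything previously loaded from graph_uri and load the new triples."""
        self._by_graph[graph_uri] = set(triples)

    def triples(self, graph_uri=None):
        """Return the triples of one graph, or of all graphs merged if none is given."""
        if graph_uri is not None:
            return set(self._by_graph.get(graph_uri, set()))
        merged = set()
        for graph_triples in self._by_graph.values():
            merged |= graph_triples
        return merged


store = QuadStore()
store.add("http://example.org/doc1.rdf", "ex:sky", "ex:hasColor", "blue")
store.add("http://example.org/doc2.rdf", "ex:grass", "ex:hasColor", "green")

# doc1.rdf was edited: its old statements are replaced, doc2.rdf is untouched.
store.replace_graph("http://example.org/doc1.rdf",
                    [("ex:sky", "ex:hasColor", "grey")])
print(sorted(store.triples()))
```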

Query and inference languages

The predominant query language for RDF graphs is SPARQL. SPARQL is an SQL-like language, and a

recommendation of the W3C as of January 15, 2008.

An example of a SPARQL query to show country capitals in Africa, using a fictional ontology.

PREFIX abc: .

SELECT ?capital ?country

WHERE {
  ?x abc:cityname ?capital ;
     abc:isCapitalOf ?y .
  ?y abc:countryname ?country ;
     abc:isInContinent abc:Africa .
}
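To make the matching behind this query concrete, the following Python sketch evaluates the same four triple patterns over a small in-memory list of triples. The sample data and the city: and country: identifiers are made up for illustration, and the nested loops are a naive join, not how a real SPARQL engine works.

```python
# Made-up sample data that uses the fictional abc: ontology from the query.
TRIPLES = [
    ("city:Nairobi", "abc:cityname", "Nairobi"),
    ("city:Nairobi", "abc:isCapitalOf", "country:Kenya"),
    ("country:Kenya", "abc:countryname", "Kenya"),
    ("country:Kenya", "abc:isInContinent", "abc:Africa"),
    ("city:Paris", "abc:cityname", "Paris"),
    ("city:Paris", "abc:isCapitalOf", "country:France"),
    ("country:France", "abc:countryname", "France"),
    ("country:France", "abc:isInContinent", "abc:Europe"),
]


def objects(subject, predicate):
    """All objects of triples with the given subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def african_capitals():
    """Bind ?x and ?y as in the WHERE clause and return (?capital, ?country) rows."""
    rows = []
    for x, predicate, y in TRIPLES:
        if predicate != "abc:isCapitalOf":
            continue
        if "abc:Africa" not in objects(y, "abc:isInContinent"):
            continue
        for capital in objects(x, "abc:cityname"):
            for country in objects(y, "abc:countryname"):
                rows.append((capital, country))
    return rows


print(african_capitals())
```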

Other ways to query RDF graphs include:

• RDQL, precursor to SPARQL, SQL-like

• Versa, compact syntax (non–SQL-like), solely implemented in 4Suite (Python)


• RQL, one of the first declarative languages for uniformly querying RDF schemas and resource descriptions,

implemented in RDFSuite.

• XUL has a template [10] element in which to declare rules for matching data in RDF. XUL uses RDF extensively

for databinding.

Examples

Example 1: RDF Description of a person named Eric Miller [11]

Here is an example taken from the W3C website [11] describing a resource with the statements "there is a Person identified by http://www.w3.org/People/EM/contact#me, whose name is Eric Miller, whose email address is em@w3.org, and whose title is Dr.".

[Figure: An RDF Graph Describing Eric Miller [11]]

The resource "http://www.w3.org/People/EM/contact#me" is the subject. The objects are: (i) "Eric Miller" (with a predicate "whose name is"), (ii) em@w3.org (with a predicate "whose email address is"), and (iii) "Dr." (with a predicate "whose title is"). The subject is a URI. The predicates also have URIs. For example, the URI for the predicate: (i) "whose name is" is http://www.w3.org/2000/10/swap/pim/contact#fullName, (ii) "whose email address is" is http://www.w3.org/2000/10/swap/pim/contact#mailbox, (iii) "whose title is" is http://www.w3.org/2000/10/swap/pim/contact#personalTitle. In addition, the subject has a type (with URI http://www.w3.org/1999/02/22-rdf-syntax-ns#type), which is person (with URI http://www.w3.org/2000/10/swap/pim/contact#Person), and a mailbox (with URI http://www.w3.org/2000/10/swap/pim/contact#mailbox). Therefore, the following "subject, predicate, object" RDF triples can be expressed:

(i) http://www.w3.org/People/EM/contact#me, http://www.w3.org/2000/10/swap/pim/contact#fullName, "Eric Miller"

(ii) http://www.w3.org/People/EM/contact#me, http://www.w3.org/2000/10/swap/pim/contact#personalTitle, "Dr."

(iii) http://www.w3.org/People/EM/contact#me, http://www.w3.org/1999/02/22-rdf-syntax-ns#type, http://www.w3.org/2000/10/swap/pim/contact#Person

(iv) http://www.w3.org/People/EM/contact#me, http://www.w3.org/2000/10/swap/pim/contact#mailbox, em@w3.org
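As a hedged illustration, the first three triples can be held as Python tuples and written out one statement per line in N-Triples style. The Literal marker class and the to_ntriples helper are names invented for this sketch, and the escaping covers only backslashes and double quotes rather than the full N-Triples grammar.

```python
CONTACT = "http://www.w3.org/2000/10/swap/pim/contact#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
ME = "http://www.w3.org/People/EM/contact#me"


class Literal(str):
    """Marks a plain string literal; every other term is treated as a URI."""


TRIPLES = [
    (ME, CONTACT + "fullName", Literal("Eric Miller")),
    (ME, CONTACT + "personalTitle", Literal("Dr.")),
    (ME, RDF_TYPE, CONTACT + "Person"),
]


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    lines = []
    for subject, predicate, obj in triples:
        if isinstance(obj, Literal):
            escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
            term = '"%s"' % escaped
        else:
            term = "<%s>" % obj
        lines.append("<%s> <%s> %s ." % (subject, predicate, term))
    return "\n".join(lines)


print(to_ntriples(TRIPLES))
```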

Example 2: The postal abbreviation for New York

Certain concepts in RDF are taken from logic and linguistics, where subject-predicate and subject-predicate-object

structures have meanings similar to, yet distinct from, the uses of those terms in RDF. This example demonstrates:

In the English language statement 'New York has the postal abbreviation NY' , 'New York' would be the subject, 'has

the postal abbreviation' the predicate and 'NY' the object.

Encoded as an RDF triple, the subject and predicate would have to be resources named by URIs. The object could be a resource or literal element. For example, in the N-Triples form of RDF, the statement might look like:

<urn:x-states:New%20York> <http://purl.org/dc/terms/alternative> "NY" .

In this example, "urn:x-states:New%20York" is the URI for a resource that denotes the U.S. state New York, "http://purl.org/dc/terms/alternative" is the URI for a predicate (whose human-readable definition can be found here [12]), and "NY" is a literal string. Note that the URIs chosen here are not standard, and don't need to be, as long as their meaning is known to whatever is reading them.

N-Triples is just one of several standard serialization formats for RDF. The triple above can also be equivalently

represented in the standard RDF/XML format as:

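<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dcterms="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="urn:x-states:New%20York">
    <dcterms:alternative>NY</dcterms:alternative>
  </rdf:Description>
</rdf:RDF>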

However, because of the restrictions on the syntax of QNames (such as dcterms:alternative above), there are some

RDF graphs that are not representable with RDF/XML.

Example 3: A Wikipedia article about Tony Benn

In a like manner, given that "http://en.wikipedia.org/wiki/Tony_Benn" identifies a particular resource (regardless of

whether that URI could be traversed as a hyperlink, or whether the resource is actually the Wikipedia article about

Tony Benn), to say that the title of this resource is "Tony Benn" and its publisher is "Wikipedia" would be two

assertions that could be expressed as valid RDF statements. In the N-Triples form of RDF, these statements might

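look like the following:

<http://en.wikipedia.org/wiki/Tony_Benn> <http://purl.org/dc/elements/1.1/title> "Tony Benn" .
<http://en.wikipedia.org/wiki/Tony_Benn> <http://purl.org/dc/elements/1.1/publisher> "Wikipedia" .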

And these statements might be expressed in RDF/XML as:

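<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Tony_Benn">
    <dc:title>Tony Benn</dc:title>
    <dc:publisher>Wikipedia</dc:publisher>
  </rdf:Description>
</rdf:RDF>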

To an English-speaking person, the same information could be represented simply as:

The title of this resource, which is published by Wikipedia, is 'Tony Benn'

However, RDF puts the information in a formal way that a machine can understand. The purpose of RDF is to

provide an encoding and interpretation mechanism so that resources can be described in a way that particular

software can understand it; in other words, so that software can access and use information that it otherwise couldn't

use.

Both versions of the statements above are wordy because one requirement for an RDF resource (as a subject or a

predicate) is that it be unique. The subject resource must be unique in an attempt to pinpoint the exact resource being

described. The predicate needs to be unique in order to reduce the chance that the idea of Title or Publisher will be

ambiguous to software working with the description. If the software recognizes http://purl.org/dc/elements/1.1/title

(a specific definition for the concept of a title established by the Dublin Core Metadata Initiative), it will also know

that this title is different from a land title or an honorary title or just the letters t-i-t-l-e put together.


The following example shows how such simple claims can be elaborated on, by combining multiple RDF

vocabularies. Here, we note that the primary topic of the Wikipedia page is a "Person" whose name is "Tony Benn":

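<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Tony_Benn">
    <dc:title>Tony Benn</dc:title>
    <dc:publisher>Wikipedia</dc:publisher>
    <foaf:primaryTopic>
      <foaf:Person>
        <foaf:name>Tony Benn</foaf:name>
      </foaf:Person>
    </foaf:primaryTopic>
  </rdf:Description>
</rdf:RDF>

Applications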

• Sigma [13] - an application from DERI at the National University of Ireland, Galway (NUIG).

• Creative Commons - Uses RDF to embed license information in web pages and mp3 files.

• DOAC (Description of a Career) - supplements FOAF to allow the sharing of résumé information.

• FOAF (Friend of a Friend) - designed to describe people, their interests and interconnections.

• Haystack client - Semantic web browser from MIT CS & AI lab. [14]

• IDEAS Group - developing a formal 4D Ontology for Enterprise Architecture using RDF as the encoding. [15]

• Microsoft shipped a product, Connected Services Framework [16], which provides RDF-based Profile Management

capabilities.

• MusicBrainz - Publishes information about Music Albums. [17]

• NEPOMUK, an open-source software specification for a Social Semantic desktop, uses RDF as a storage format

for collected metadata. NEPOMUK is mostly known for its integration into the KDE4 desktop

environment.

• RDF Site Summary - one of several "RSS" languages for publishing information about updates made to a web

page; it is often used for disseminating news article summaries and sharing weblog content.

• Simple Knowledge Organization System (SKOS) - a knowledge representation (KR) scheme intended to support

vocabulary/thesaurus applications.

• SIOC (Semantically-Interlinked Online Communities) - designed to describe online communities and to create

connections between Internet-based discussions from message boards, weblogs and mailing lists. [18]

• Smart-M3 - provides an infrastructure for using RDF and specifically uses the ontology-agnostic nature of RDF to

enable heterogeneous mashing-up of information. [19]

• Many other RDF schemas are available by searching SchemaWeb. [20]

Some uses of RDF include research into social networking. This is important because it could help governments keep track of terrorist cells. It could also help people in business fields better understand their relationships with members of industries that could be of use for product placement [21], and help scientists understand how people are connected to one another.

RDF is also being used to gain a better understanding of traffic patterns. Information about traffic is spread across different websites, and RDF is used to integrate information from different sources on the web. Previously, the common methodology was keyword searching, but this method is problematic because it does not consider synonyms. This is why ontologies are useful in this situation. One issue that arises when trying to study traffic efficiently is that, to fully understand traffic, concepts related to people, streets, and roads must be well understood. Since these are human concepts, they require the addition of fuzzy logic: values that are useful when describing roads, like slipperiness, are not precise concepts and cannot be measured. This implies that the best solution would incorporate both fuzzy logic and ontology. [22]

See also

Notations for RDF

• N3

• N-Triples

• TRiG

• TRiX

• Turtle

• RDF/XML

• RDFa

Ontology/vocabulary languages

• OWL

• SKOS

• RDF schema

Similar concepts

• Entity-attribute-value model

• Graph theory - An RDF model is a labeled, directed multi-graph.

• Website Parse Template

• Tagging

• Topic Maps - Topic Maps is, in some ways, similar to RDF.

• Semantic network

Other (unsorted)

• Associative model of data

• Business Intelligence 2.0 (BI 2.0)

• DataPortability

• Folksonomy

• GRDDL

• Life Science Identifiers

• Meta Content Framework

• Semantic Web

• Swoogle

• Universal Networking Language (UNL)


Further reading

• W3C's RDF at W3C [23] : specifications, guides, and resources

• RDF Semantics [24] : specification of semantics, and complete systems of inference rules for both RDF and RDFS

Tutorials and documents

• Quick Intro to RDF [25]

• RDF in Depth [26]

• Introduction to the RDF Model [27]

• What is RDF? [28]

• An introduction to RDF [29]

• RDF and XUL [30] , with examples.

External links

News and resources

• Dave Beckett's RDF Resource Guide [31]

• Resource Description Framework: According to W3C specifications and Mozilla's documentation [30]

• RDF Datasources [32] : RDF datasources in Mozilla

• The Finance Ontology [33] Semantic web application under construction.

RDF software tools

• Raptor RDF Parser Library [34]

• Listing of RDF and OWL tools at W3C wiki [35]

• SemWebCentral [36] Open Source semantic web tools

• Intellidimension [37] Semantic web software and tools for Windows, .NET/C# and SQL Server

• Listing of RDF software at xml.com [38]

• Rhodonite [39] : freeware RDF editor and RDF browser with a drag-and-drop interface

• D2R Server [40] : tool to publish relational databases as an RDF-graph

• Virtuoso Universal Server: a SPARQL compliant platform for RDF data management, SQL-RDF integration, and

RDF based Linked Data deployment

• ROWLEX [41] : .NET library and toolkit built to create and browse RDF documents easily. It abstracts away the

level of RDF triples and elevates the level of the programming work to (OWL) classes and properties.

• AlchemyAPI [42] : web service API / SDK that converts unstructured text into RDF & Linked Data.

• The Sweet Tools [43] listing of 800+ RDF and related tools, most of them open source, sortable by category and

language (among other facets).

RDF datasources

• Wikipedia 3 [44] : System One's RDF conversion of the English Wikipedia, updated monthly

• DBpedia: a Linking Open Data Community Project [45] that exposes an ever-increasing collection of RDF-based

Linked Data sources

• Semantic Systems Biology [46]


References

[1] http://www.w3.org/TR/rdf-primer/

[2] "Resource Description Framework (RDF) Model and Syntax Specification" (http://www.w3.org/TR/PR-rdf-syntax/)

[3] Optimized Index Structures for Querying RDF from the Web (http://sw.deri.org/2005/02/dexa/yars.pdf) Andreas Harth, Stefan Decker,

3rd Latin American Web Congress, Buenos Aires, Argentina, October 31 to November 2, 2005, pp. 71-80

[4] W3C 1999 specification (http://www.w3.org/TR/rdf-syntax-grammar/)

[5] Contexts for RDF Information Modelling (http://www.ninebynine.org/RDFNotes/RDFContexts.html)

[6] Circumstance, Provenance and Partial Knowledge (http://www.ninebynine.org/RDFNotes/UsingContextsWithRDF.html)

[7] The Concept of 4Suite RDF Scopes (http://uche.ogbuji.net/tech/akara/nodes/2003-01-01/scopes)

[8] Redland RDF Library - Contexts (http://librdf.org/notes/contexts.html)

[9] Named Graphs (http://www.w3.org/2004/03/trix/)

[10] http://developer.mozilla.org/en/docs/XUL:Template_Guide:Introduction

[11] "RDF Primer" (http://www.w3.org/TR/rdf-primer/). W3C. Retrieved 2009-03-13.

[12] http://dublincore.org/documents/library-application-profile/index.shtml#Alternative

[13] http://sig.ma/

[14] Haystack (http://groups.csail.mit.edu/haystack/home.html)

[15] The IDEAS Group Website (http://www.ideasgroup.org)

[16] Connected Services Framework (http://www.microsoft.com/serviceproviders/solutions/connectedservicesframework.mspx)

[17] RDF on MusicBrainz Wiki (http://wiki.musicbrainz.org/RDF)

[18] SIOC (Semantically-Interlinked Online Communities) (http://sioc-project.org/)

[19] Oliver Ian, Honkola Jukka, Ziegler Jurgen (2008). “Dynamic, Localized Space Based Semantic Webs”. IADIS WWW/Internet 2008.

Proceedings, p.426, IADIS Press, ISBN 978-972-8924-68-3

[20] SchemaWeb (http://www.schemaweb.info)

[21] An RDF Approach for Discovering the Relevant Semantic Associations in a Social Network By Thushar A.K, and P. Santhi Thilagam

[22] Traffic Information Retrieval Based on Fuzzy Ontology and RDF on the Semantic Web By Jun Zhai, Yi Yu, Yiduo Liang, and Jiatao Jiang

(2008)

[23] http://www.w3.org/RDF/

[24] http://www.w3.org/TR/2004/REC-rdf-mt-20040210/

[25] http://rdfabout.com/quickintro.xpd

[26] http://rdfabout.com/intro/

[27] http://www.xulplanet.com/tutorials/mozsdk/rdfstart.php

[28] http://www.xml.com/pub/a/2001/01/24/rdf.html

[29] http://www-128.ibm.com/developerworks/library/w-rdf/

[30] http://www.xul.fr/en-xml-rdf.html

[31] http://planetrdf.com/guide/

[32] http://xulplanet.com/tutorials/mozsdk/rdfsources.php

[33] http://www.fadyart.com/ontology.html

[34] http://librdf.org/raptor/

[35] http://esw.w3.org/topic/SemanticWebTools

[36] http://projects.semwebcentral.org/

[37] http://www.intellidimension.com/

[38] http://www.xml.com/pub/rg/RDF_Software

[39] http://rhodonite.angelite.nl

[40] http://sites.wiwiss.fu-berlin.de/suhl/bizer/d2r-server/

[41] http://rowlex.nc3a.nato.int

[42] http://www.alchemyapi.com/api/entity/ldata.html

[43] http://www.mkbergman.com/new-version-sweet-tools-sem-web/

[44] http://labs.systemone.at/wikipedia3

[45] http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData

[46] http://www.semantic-systems-biology.org

Resources of a Resource 98

Resources of a Resource

Resources of a Resource (ROR) is an XML format for describing the content of an internet resource or website in a

generic fashion so this content can be better understood by search engines, spiders, web applications, etc. The ROR

format provides several pre-defined terms for describing objects like sitemaps, products, events, reviews, jobs,

classifieds, etc. The format can be extended with custom terms.

RORweb.com [1] is the official website of ROR; the ROR format was created by AddMe.com [2] as a way to help

search engines better understand content and meaning. Similar concepts, like Google Sitemaps and Google Base,

have also been developed since the introduction of the ROR format.

ROR objects are placed in an ROR feed called ror.xml. This file is typically located in the root directory of the resource or website it describes. When a search engine like Google or Yahoo crawls the web to determine how to categorize content, the ROR feed allows the search engine's "spider" to quickly identify all the content and attributes of the website.

This has three main benefits:

1. It allows the spider to correctly categorize the content of the website in its engine.

2. It allows the spider to extract very detailed information about the objects on a website (sitemaps, products,

events, reviews, jobs, classifieds, etc.).

3. It allows the website owner to optimize the site for inclusion of its content in the search engines.

External links

• RORweb.com [1]

References

[1] http://www.rorweb.com

[2] http://www.AddMe.com

Reverse Ajax 99

Reverse Ajax

Reverse Ajax refers to an Ajax design pattern that uses long-lived HTTP connections to enable low-latency

communication between a web server and a browser. Basically it is a way of sending data from client to server and a

mechanism for pushing server data back to the browser. [1] [2]

This server–client communication takes one of two forms:

• Client polling: the client repeatedly queries (polls) the server and waits for an answer.

• Server pushing: a connection between a server and client is kept open and the server sends data when available.

Reverse Ajax describes the implementation of either of these models, or a combination of both. The design pattern is

also known as Ajax Push, Full Duplex Ajax and Streaming Ajax.

Examples

The following is a simple example. Imagine we have 2 clients and 1 server, and client1 wants to send the message

"hello" to every other client.

With traditional Ajax (polling):

• client1 sends the message "hello"

• server receives the message "hello"

• client2 polls the server

• client2 receives the message "hello"

• client1 polls the server

• client1 receives the message "hello"

With reverse Ajax (pushing):

• client1 sends the message "hello"

• server receives the message "hello"

• server sends the message "hello" to all clients

Less traffic is generated with Reverse Ajax and messages are transferred with less delay (low-latency).
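The two message flows listed above can be mimicked with a small in-memory simulation. This is only a sketch: PollingServer and PushServer are hypothetical classes, and a plain Python list stands in for the long-lived HTTP connection that a real Reverse Ajax server would keep open.

```python
class PollingServer:
    """Clients must ask for new messages (traditional Ajax)."""

    def __init__(self):
        self.messages = []

    def post(self, message):
        self.messages.append(message)

    def poll(self, already_seen):
        # Return only the messages the caller has not seen yet.
        return self.messages[already_seen:]


class PushServer:
    """The server delivers each message over connections kept open by clients."""

    def __init__(self):
        self.open_connections = []

    def connect(self, inbox):
        # A plain list stands in for a long-lived HTTP connection.
        self.open_connections.append(inbox)

    def post(self, message):
        for inbox in self.open_connections:
            inbox.append(message)


# Polling: nothing reaches a client until that client asks.
polling = PollingServer()
polling.post("hello")                # client1 sends "hello"
client2_polled = polling.poll(0)     # client2 polls and receives "hello"
client1_polled = polling.poll(0)     # client1 polls and receives "hello"

# Pushing: one post reaches every connected client without further requests.
push = PushServer()
client1_inbox, client2_inbox = [], []
push.connect(client1_inbox)
push.connect(client2_inbox)
push.post("hello")

print(client2_polled, client2_inbox)
```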

External links

• The Slow Load Technique/Reverse AJAX - Simulating Server Push in a Standard Web Browser [3]

• Exploring Reverse Ajax [4]

• Reverse Ajax with DWR (a Java Ajax framework) [5]

• Changing the Web Paradigm - Moving from traditional Web applications to Streaming-AJAX [6]

References

[1] Crane, Dave; McCarthy, Phil (July 2008) (in English). Comet and Reverse Ajax: The Next Generation Ajax 2.0. Apress. ISBN 1590599985.

[2] Martin, Katherine (2007-03-22). "Developing Applications using Reverse Ajax" (http://today.java.net/pub/a/today/2007/03/22/developing-applications-using-reverse-ajax.html). java.net, O'Reilly and CollabNet.

[3] http://www.obviously.com/tech_tips/slow_load_technique

[4] http://gmapsdotnetcontrol.blogspot.com/2006/08/exploring-reverse-ajax-ajax.html

[5] http://ajaxian.com/archives/reverse-ajax-with-dwr

[6] http://www.lightstreamer.com/Lightstreamer_Paradigm.pdf

Root element 100

Root element

Each XML document has exactly one root element. This element is also known as the document element. It

encloses all the other elements and is therefore the sole parent element to all the other elements.

The World Wide Web Consortium defines not only the specifications for XML itself [1] , but also the DOM, which is

a platform- and language-independent standard object model for representing XML documents. DOM Level 1

defines, for every XML document, an object representation of the document itself and an attribute or property on the

document called documentElement. This property provides access to an object of type element which directly

represents the root element of the document [2] .

<rootElement>content</rootElement>

There can be other XML nodes outside of the root element [3] , in particular the root element may be preceded by a

prolog, which itself may consist of an XML declaration, optional comments, processing instructions and whitespace,

followed by an optional DOCTYPE declaration and more optional comments, processing instructions and

whitespace. After the document element there may be further optional comments, processing instructions and

whitespace within the document [4] .

Within the document element, apart from any number of attributes and other elements, there may also be more

optional text, comments, processing instructions and whitespace.

A more expanded example of an XML document follows, demonstrating some of these extra nodes along with a

single rootElement element.

text
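As a hedged illustration of these node types, the following Python sketch parses a small, made-up document with the standard library's minidom and reaches the root element through the DOM documentElement property. The comments, the processing instruction and the childElement name are assumptions for this sketch and are not taken from the original example.

```python
from xml.dom import minidom

# A made-up document with nodes before, inside and after the root element.
DOCUMENT = """<?xml version="1.0"?>
<!-- a comment in the prolog -->
<?example-pi some data?>
<rootElement attribute="value">
  text
  <childElement>more text</childElement>
</rootElement>
<!-- a comment after the document element -->
"""

doc = minidom.parseString(DOCUMENT)

# documentElement gives direct access to the single root element.
root = doc.documentElement

# The document node itself may have other children besides the root element.
top_level = [node.nodeName for node in doc.childNodes]

print(root.tagName)
print(top_level)
```

However many comments and processing instructions surround it, the list of top-level nodes contains exactly one element node.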


References

[1] The current W3C XML 1.0 specification (http://www.w3.org/TR/xml/)

[2] The 'documentElement' definition in the W3C DOM Level 1 specification (http://www.w3.org/TR/1998/REC-DOM-Level-1-19981001/

level-one-core.html#i-Document)

[3] The 'well-formed document' section of the W3C XML specification (http://www.w3.org/TR/2006/REC-xml-20060816/

#sec-well-formed)

[4] The 'prolog' section of the W3C XML specification (http://www.w3.org/TR/2006/REC-xml-20060816/#NT-prolog)

Schematron

In markup languages, Schematron is a rule-based validation language for making assertions about the presence or

absence of patterns in XML trees. It is a structural schema language expressed in XML using a small number of

elements and XPath.

In a typical implementation, the Schematron schema XML is processed into normal XSLT code for deployment

anywhere that XSLT can be used.

Schematron is capable of expressing constraints in ways that XDR and DTD cannot. For example, it can require that

the content of an element be controlled by one of its siblings. Or it can request or require that the root element,

regardless of what element that is, must have specific attributes. Schematron can also specify required relationships

between multiple XML files.

Constraints and content rules may be associated with "plain-English" validation error messages. This may be

preferred by some users who might otherwise have to cross-reference numeric error codes to understand what they

mean.

Uses

Schematron's design of expressing constraints through an XPath-based language that can be deployed as XSLT code

makes it practical for applications such as the following:

Adjunct to Structural Validation

by testing for co-occurrence constraints, non-regular constraints, and inter-document constraints, Schematron

can extend the validations able to be expressed in languages such as DTDs, RELAX NG or XML Schema.

Lightweight Business Rules Engine

Schematron is not a comprehensive Rete rules engine, but it can be used to express rules about complex

structures within an XML document.

XML Editor Syntax Highlighting Rules

XML Editors use Schematron rules to conditionally highlight XML files for errors.


Versions

Schematron was invented by Rick Jelliffe at Academia Sinica Computing Centre, Taiwan. He described Schematron

as "a feather duster to reach the parts other schema languages cannot reach".

The most common versions of Schematron are:

• Schematron 1.0 (1999)

• Schematron 1.3 (2000): this version used the namespace http://xml.ascc.net/schematron/. It was supported by an XSLT implementation with a plug-in architecture.

• Schematron 1.5 [1] (2001): this version was widely implemented and is still found.

• Schematron 1.6 [2] (2002): this version was the base of ISO Schematron and was obsoleted by it.

• ISO Schematron [16] (2006): this version regularizes several features and provides an XML output format, SVRL. It uses the new namespace http://purl.oclc.org/dsdl/schematron.

• ISO Schematron (2010): this proposed version adds support for XSLT2 and arbitrary properties.

Schematron as an ISO Standard

Schematron has been standardized as part of ISO/IEC 19757 - Document Schema Definition Languages

(DSDL) - Part 3: Rule-based validation - Schematron.

This standard is available free on the ISO Publicly Available Specifications [16] list. Paper versions may be

purchased from ISO or national standards bodies.

Schemas that use ISO/IEC FDIS 19757-3 should use the following namespace:

http://purl.oclc.org/dsdl/schematron

Sample Rule

Schematron rules are very simple to create using a standard XML editor or XForms application. The following is a

sample schema:

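<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <pattern>
    <title>Date rules</title>
    <rule context="Contract">
      <assert test="ContractDate &lt; current-date()">ContractDate should be in the past because future contracts are not allowed.</assert>
    </rule>
  </pattern>
</schema>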

This rule checks that the ContractDate XML element has a date that is before the current date. If this rule fails, the validation fails and an error message, which is the body of the assert element, is returned to the user.
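The effect of this rule can be imitated in plain Python to show what the assertion tests. This is not a Schematron processor: the function name, the sample document, the message wording and the assumption that dates are written in ISO format (YYYY-MM-DD) are all choices made for this sketch.

```python
import datetime
import xml.etree.ElementTree as ET

MESSAGE = "ContractDate should be in the past."  # hypothetical wording


def failed_assertions(xml_text, today):
    """Return one message per Contract whose ContractDate is not before `today`."""
    failures = []
    for contract in ET.fromstring(xml_text).iter("Contract"):
        raw = contract.findtext("ContractDate")
        if raw is None:
            continue  # this rule says nothing about a missing date
        contract_date = datetime.date.fromisoformat(raw.strip())
        if not contract_date < today:
            failures.append(MESSAGE)
    return failures


SAMPLE = """<Contracts>
  <Contract><ContractDate>2001-05-01</ContractDate></Contract>
  <Contract><ContractDate>2999-01-01</ContractDate></Contract>
</Contracts>"""

# A fixed "current date" keeps the sketch reproducible.
today = datetime.date(2010, 6, 17)
failures = failed_assertions(SAMPLE, today)
print(failures)
```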


Implementation

Schematron source files are usually transformed into XSLT files (using XSLT) and placed in an XML Pipeline. This

allows workflow process designers to build and maintain rules using standard XML manipulation tools.

For example, an Apache Ant task can be used to convert Schematron rules into XSLT files.

See also

• XML Schema Language Comparison - Comparison to other XML Schema languages.

• Service Modeling Language - Service Modeling Language uses Schematron.

External links

• ISO Schematron Home Page [3]

• Academia Sinica Computing Centre's Schematron Home Page [4]

• Schematron Wiki including Implementer's FAQ [5]

References

[1] http://xml.ascc.net/schematron/

[2] http://xml.ascc.net/resource/schematron/Schematron2000.html

[3] http://www.schematron.com

[4] http://www.ascc.net/xml/resource/schematron/

[5] http://www.eccnet.com/schematron/index.php/Main_Page

Simple Outline XML

Simple Outline XML (SOX) is a compressed way of writing XML.

SOX uses indenting to represent the structure of an XML document, eliminating the need for closing tags.

Example

The following XHTML markup fragment:

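<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Sample page</title>
  </head>
  <body>
    <p>A very brief page</p>
  </body>
</html>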

... would appear in SOX as:

html>
    xmlns=http://www.w3.org/1999/xhtml
    head>
        title> Sample page
    body>
        p> A very brief page


SOX can be readily converted to XML.
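A minimal sketch of such a conversion, written in Python, is shown below. It handles only the constructs used in the example above ("name>" lines with optional text, "name=value" attribute lines, and nesting by indentation); the function name sox_to_xml is hypothetical and the full SOX syntax is not covered.

```python
import xml.etree.ElementTree as ET


def sox_to_xml(sox_text):
    """Convert the small SOX subset used in this article into an XML string."""
    root = None
    open_elements = []  # stack of (indent, element) pairs
    for line in sox_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        indent = len(line) - len(line.lstrip())
        # Elements indented at least as far as this line are finished.
        while open_elements and open_elements[-1][0] >= indent:
            open_elements.pop()
        parent = open_elements[-1][1] if open_elements else None
        name, marker, rest = stripped.partition(">")
        if marker and "=" not in name:
            # "name>" opens an element; anything after ">" is its text.
            if parent is None:
                element = ET.Element(name)
            else:
                element = ET.SubElement(parent, name)
            element.text = rest.strip() or None
            if root is None:
                root = element
            open_elements.append((indent, element))
        else:
            # "name=value" sets an attribute on the enclosing element.
            attr_name, _, value = stripped.partition("=")
            parent.set(attr_name, value)
    return ET.tostring(root, encoding="unicode")


SOX = """html>
    xmlns=http://www.w3.org/1999/xhtml
    head>
        title> Sample page
    body>
        p> A very brief page
"""

print(sox_to_xml(SOX))
```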

See also

• Haml is a meta-XHTML representation that integrates with Ruby on Rails and has a similar mark-up structure.

Sources

• http://www.langdale.com.au/SOX/

• http://www.ibm.com/developerworks/xml/library/x-syntax.html

Simple XML

Simple XML is a variation of XML containing only elements. All attributes are converted into elements. Not having attributes or other XML constructs such as the XML declaration or DTDs allows the use of simple and fast parsers. This format is also compatible with mainstream XML parsers.
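The conversion of attributes into elements can be sketched with Python's standard ElementTree module. The function name and the two-element sample document are assumptions for this sketch; it simply turns every attribute into a leading child element of the same name.

```python
import xml.etree.ElementTree as ET


def attributes_to_elements(element):
    """Recursively replace every attribute with a child element of the same name."""
    for child in list(element):
        attributes_to_elements(child)
    # Insert in reverse so the new children keep the attribute order at the front.
    for name, value in reversed(list(element.attrib.items())):
        child = ET.Element(name)
        child.text = value
        element.insert(0, child)
    element.attrib.clear()
    return element


doc = ET.fromstring('<Agenda type="gardening"><Activity type="Watering"/></Agenda>')
simple = ET.tostring(attributes_to_elements(doc), encoding="unicode")
print(simple)
```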

Structure

For example:

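<Agenda>
  <type>gardening</type>
  <Activity>
    <type>Watering</type>
    <golf-course>
      <time>6:00</time>
    </golf-course>
    <yard>
      <time>7:00</time>
    </yard>
  </Activity>
  <Activity>
    <type>cooking</type>
    <lunch>
      <time>12:00</time>
    </lunch>
  </Activity>
</Agenda>

would represent:

<Agenda type="gardening">
  <Activity type="Watering">
    <golf-course time="6:00"/>
    <yard time="7:00"/>
  </Activity>
  <Activity type="cooking">
    <lunch time="12:00"/>
  </Activity>
</Agenda>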

Validation

Simple XML uses a simple XPath list for validation. The XML snippet above for example, would be represented by:

/Agenda/type|(Activity/type|(*/time))

or a bit more human readable as:

/Agenda/type /Agenda/Activity/type /Agenda/Activity/*/time

This allows the XML to be processed as a stream (without creating an object model in memory) with fast validation.

References

1. http://www.w3.org/XML/simple-XML.html

Streaming XML 105

Streaming XML

Streaming XML means dynamic data which is in an XML format.

Another popular use of this term refers to one method of consuming XML data, best known through the Simple API for XML (SAX), in which asynchronous events are generated as the XML data is parsed. In this context, the consumer streams through the XML data one item at a time. This has nothing to do with whether the underlying data is being updated via dynamic or static means.
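The event-based style of consumption can be illustrated with the xml.sax module from the Python standard library. The handler class name and the sample feed are made up for this sketch; the handler records each event as the parser reports it, without building a tree in memory.

```python
import xml.sax


class EventRecorder(xml.sax.ContentHandler):
    """Collects start and end events in the order the parser reports them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))


handler = EventRecorder()
xml.sax.parseString(b"<feed><item>a</item><item>b</item></feed>", handler)

for event in handler.events:
    print(event)
```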

Uses

• Extensible Messaging and Presence Protocol (XMPP). This is the protocol used for example in Google Talk.

Styled Layer Descriptor

A Styled Layer Descriptor (SLD) is an XML schema specified by the Open Geospatial Consortium (OGC) for

describing the appearance of map layers. It is capable of describing the rendering of vector and raster data. A typical

use of SLDs is to instruct a Web Map Service (WMS) on how to render a specific layer.

In August 2007 the SLD specification was split into two new OGC specifications [1]:

• Symbology Encoding Implementation Specification (SE)

• Styled Layer Descriptor

The Styled Layer Descriptor Specification now only contains the protocol for communicating with a WMS about how to style a layer. The actual styling is now described exclusively in the Symbology Encoding Implementation Specification.

Open source SLD supporting software

Desktop software

• JUMP GIS

• UDig

Server-side software

• GeoServer

• Mapserver

See also

• UDig

• GeoServer


External links

• AtlasStyler SLD Editor [2] is a free-software (LGPL) SLD Editor developed with GeoTools+Java+Swing.


• OpenGIS Styled Layer Descriptor Implementation Specification [3]

• OpenGIS Symbology Encoding Implementation Specification [4]

References

[1] OGC press release about Symbology Encoding and SLD (http://www.opengeospatial.org/press/?page=pressrelease&year=0&prid=306)

[2] http://wald.intevation.org/projects/atlas-framework

[3] http://www.opengeospatial.org/standards/sld

[4] http://www.opengeospatial.org/standards/symbol

Topic (XML)

In XML terminology, topic can mean

1. A resource that acts as a proxy for some subject; the topic map system's representation of that subject. The

relationship between a topic and its subject is defined to be one of reification. Reification of a subject allows topic

characteristics to be assigned to the topic that reifies it.

2. A short document which is written in such a way that it completely answers a single question. For example, an

online help system typically consists of hundreds of topics, each describing a single procedure or concept. See

topic-based authoring.

3. A <topic> element, used in many XML formats.

See also

• Topic Maps

External links

• Specification in XML Topic Maps (XTM) 1.0 (topicmaps.org) [1]

• FAQ: The Topic Architecture of DITA [2]

References

[1] http://www.topicmaps.org/xtm/index.html

[2] http://dita.xml.org/node/1230

Unique Particle Attribution 107

Unique Particle Attribution

The Unique Particle Attribution (UPA) rule is XML Schema's mechanism to prevent schema ambiguity.

Due to the UPA rule the schema fragment given below is prohibited.

Given the instance fragment:

<x>42</x>

It is not possible to create a Post-Schema-Validation Infoset, because it is ambiguous whether <x> should be associated with the element declaration x, or the wildcard (xsd:any).

The W3C schema workgroup is considering weak wildcards for schema version 1.1. Using weak wildcards, the explicit element declaration would always take precedence (<x> is associated with the element declaration), thus removing the ambiguity.

See also

• W3C XML Schema

External links

• Schema Component Constraint: Unique Particle Attribution [1]

• An Approach for Evolving XML Vocabularies Using XML Schema [2]

• XML Schema 1.1 Part 1: Structures [3]

• XML Schema 1.1 Part 2: Datatypes [4]

References

[1] http://www.w3.org/TR/xmlschema-1/#cos-nonambig

[2] http://lists.w3.org/Archives/Public/www-tag/2004Aug/att-0010/NRMVersioningProposal.html

[3] http://www.w3.org/TR/xmlschema11-1/

[4] http://www.w3.org/TR/xmlschema11-2/

VTD-XML 108

VTD-XML

Developer(s): XimpleWare

Stable release: 2.8 / April 12, 2009

Operating system: Portable

Type: XML parser/indexer/slicer/editor library

License: GPL and Proprietary License

Website: vtd-xml.sourceforge.net [1], VTD-XML blog [2]

Virtual Token Descriptor for eXtensible Markup Language (VTD-XML) refers to a collection of cross-platform XML processing technologies centered around a non-extractive, [3] [4] "document-centric" XML parsing technique called Virtual Token Descriptor (VTD). Depending on the perspective, VTD-XML can be viewed as one of the following:

• A "Document-Centric" XML parser [5] [6] [7] [8] [9]

• A native XML indexer or a file format that uses binary data to enhance the text XML [10] [11] [12]

• An incremental XML content modifier

• An XML slicer/splitter/assembler [13]

• An XML editor/eraser

• A way to port XML processing on chip [14] [15] [16]

• A non-blocking, stateless XPath evaluator [17]

VTD-XML is developed by XimpleWare and dual-licensed under GPL and proprietary license. It is originally

written in Java, but is now available in C [18] and C#. An extended version supporting 256 GB file size is also

available.

Basic Concept

Non-Extractive, Document-Centric Parsing

Traditionally, a lexical analyzer represents tokens (the small units of indivisible character values) as discrete string

objects. This approach is designated extractive parsing. In contrast, non-extractive tokenization mandates that one

keeps the source text intact, and uses offsets and lengths to describe those tokens.
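The difference can be sketched in a few lines. The example below is a minimal illustration, not VTD-XML code: the class NonExtractive and its Span type are hypothetical names, and the "tokens" are simply space-separated words in an ASCII byte array. Each token is recorded as an offset and a length, and comparisons run against the source bytes without creating string objects.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class NonExtractive {
    /** A token described by position only; no characters are copied. */
    static final class Span {
        final int offset;
        final int length;
        Span(int offset, int length) { this.offset = offset; this.length = length; }
    }

    /** Splits on ASCII spaces, recording (offset, length) for each token. */
    static List<Span> tokenize(byte[] src) {
        List<Span> out = new ArrayList<>();
        int start = -1;
        for (int i = 0; i <= src.length; i++) {
            boolean boundary = (i == src.length) || src[i] == ' ';
            if (!boundary && start < 0) {
                start = i;                          // a token begins here
            } else if (boundary && start >= 0) {
                out.add(new Span(start, i - start)); // a token ends here
                start = -1;
            }
        }
        return out;
    }

    /** Compares a token with an ASCII string directly against the source bytes. */
    static boolean matches(byte[] src, Span t, String s) {
        if (t.length != s.length()) return false;
        for (int i = 0; i < t.length; i++) {
            if (src[t.offset + i] != (byte) s.charAt(i)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] src = "alpha beta  gamma".getBytes(StandardCharsets.US_ASCII);
        for (Span t : tokenize(src)) {
            System.out.println("offset=" + t.offset + " length=" + t.length);
        }
    }
}
```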

Virtual Token Descriptor

Virtual Token Descriptor (VTD) applies the concept of non-extractive, document-centric parsing to XML

processing. A VTD record uses a 64-bit integer to encode the offset, length, token type and nesting depth of a token

in an XML document. Because all VTD records are 64-bit in length, they can be stored efficiently and managed as

an array. [19]
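A record of this kind can be sketched with plain bit operations. The field widths below (32-bit offset, 20-bit length, 8-bit depth, 4-bit token type) are an assumption chosen for illustration and are not taken from the VTD specification; the class name VtdRecord is likewise hypothetical.

```java
final class VtdRecord {
    // Illustrative layout (an assumption, not the library's documented one):
    // bits 0-31 offset, bits 32-51 length, bits 52-59 depth, bits 60-63 type.
    static long pack(int type, int depth, int length, long offset) {
        return ((long) (type & 0xF) << 60)
             | ((long) (depth & 0xFF) << 52)
             | ((long) (length & 0xFFFFF) << 32)
             | (offset & 0xFFFFFFFFL);
    }

    static int type(long r)    { return (int) (r >>> 60) & 0xF; }
    static int depth(long r)   { return (int) (r >>> 52) & 0xFF; }
    static int length(long r)  { return (int) (r >>> 32) & 0xFFFFF; }
    static long offset(long r) { return r & 0xFFFFFFFFL; }

    public static void main(String[] args) {
        // Fixed-width records fit in a primitive array, one long per token.
        long[] records = { pack(0, 0, 4, 1), pack(5, 1, 12, 1000) };
        for (long r : records) {
            System.out.println("type=" + type(r) + " depth=" + depth(r)
                + " length=" + length(r) + " offset=" + offset(r));
        }
    }
}
```

Because every record has the same width, the whole token table is a single long[] and no per-token objects are allocated.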

Location Cache

Location Caches (LC) build on VTD records to provide efficient random access. Organized as tables, with one table

per nesting depth level, LCs contain entries modeling an XML document's element hierarchy. An LC entry is a

64-bit integer encoding a pair of 32-bit values. The upper 32 bits identify the VTD record for the corresponding

element. The lower 32 bits identify that element's first child in the LC at the next lower nesting level.
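The entry format described above can be sketched as follows. The class name LocationCacheEntry is hypothetical, and using -1 to mean "no child" is an assumption made for this sketch only.

```java
final class LocationCacheEntry {
    /** Upper 32 bits: index of the element's VTD record.
     *  Lower 32 bits: index of its first child in the next-level table. */
    static long pack(int vtdIndex, int firstChildIndex) {
        return ((long) vtdIndex << 32) | (firstChildIndex & 0xFFFFFFFFL);
    }

    static int vtdIndex(long entry)   { return (int) (entry >>> 32); }
    static int firstChild(long entry) { return (int) entry; }

    public static void main(String[] args) {
        long withChild = pack(7, 3);
        long leaf = pack(9, -1); // -1 marks "no child" in this sketch
        System.out.println(vtdIndex(withChild) + " -> " + firstChild(withChild));
        System.out.println(vtdIndex(leaf) + " -> " + firstChild(leaf));
    }
}
```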

Benefits

Overview

Virtually all the core benefits of VTD-XML are inherent to non-extractive, document-centric parsing which provides

these characteristics:

• The source XML text is kept intact in memory without decoding.

• The internal representation of VTD-XML is inherently persistent.

• It obviates object-oriented modeling of the hierarchical representation, relying entirely on primitive data types (e.g., 64-bit integers) to represent the XML hierarchy, which reduces object creation cost to nearly zero [20].

Combining those characteristics permits thinking of XML purely as syntax (bits, bytes, offsets, lengths, fragments,

namespace-compensated fragments, and document composition) instead of the serialization/deserialization of

objects. This is a powerful way to think about XML/SOA applications.

Simplicity

Developers' typical first impression is that, with VTD-XML, there are relatively few classes and methods to

remember in order to write applications.

As Parser

When used in parsing mode, VTD-XML is a general purpose, extremely high performance [21] XML parser which

compares favorably with others:

• VTD-XML typically outperforms SAX (with NULL content handler) while still providing full random access and

built-in XPath support.

• VTD-XML typically consumes 1.3 to 1.5 times the XML document's size in memory, which is about 1/5 the memory usage of DOM.

• Applications written in VTD-XML are usually much shorter and cleaner than their DOM or SAX versions.

As Indexer

Because of the inherent persistence of VTD-XML, developers can write the internal representation of a parsed XML

document to disk and later reload it to avoid repetitive parsing. To this end, XimpleWare has introduced VTD+XML

as a binary packaging format combining VTD, LC and the XML text. It can typically be viewed in one of the

following two ways:

• A native XML index that completely eliminates the parsing cost and also retains all benefits of XML. It is a file

format that is human readable and backward compatible with XML.

• A binary XML format that uses binary data to enhance the processing of the XML text.

XML Content Modifier

Because VTD-XML keeps the XML text intact without decoding, an application that intends to modify the content of XML only needs to modify the portions most relevant to the changes. This is in stark contrast with DOM, SAX, or StAX parsing, which incur the cost of parsing and re-serialization no matter how small the changes are.

Since VTDs refer to document elements by their offsets, changes to the length of elements occurring earlier in a

document require adjustments to VTDs referring to all later elements. However, those adjustments are integer

additions, albeit to many integers in multiple tables, so they are quick.
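The adjustment amounts to a single pass over the stored offsets. The sketch below is a simplified illustration with hypothetical names (OffsetShift, shift); it treats the offsets as a plain int array rather than as fields packed inside VTD records.

```java
import java.util.Arrays;

final class OffsetShift {
    /**
     * Adds delta to every offset located at or after editPos.
     * Returns the number of offsets that were adjusted.
     */
    static int shift(int[] offsets, int editPos, int delta) {
        int changed = 0;
        for (int i = 0; i < offsets.length; i++) {
            if (offsets[i] >= editPos) {
                offsets[i] += delta;
                changed++;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        int[] offsets = { 0, 10, 25, 40 };
        // Five bytes were inserted at position 20: later tokens move right.
        shift(offsets, 20, 5);
        System.out.println(Arrays.toString(offsets));
    }
}
```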

XML Slicer/Splitter/Assembler

An application based on VTD-XML can also use offsets and lengths to address tokens, or element fragments. This

allows XML documents to be manipulated like arrays of bytes.

• As a slicer, VTD-XML can "slice" off a token or an element fragment from an XML document, then insert it back

into another location in the same document, or into a different document.

• As a splitter, VTD-XML can split sub-elements in an XML document and dump each into a separate XML

document.

• As an assembler, VTD-XML can "cut" chunks out of multiple XML documents and assemble them into a new

XML document.

XML Editor/Eraser

Used as an editor/eraser, VTD-XML can directly edit or erase the underlying byte content of the XML text, provided that the existing token is at least as long as the intended new content. An immediate benefit of this approach is that the application can reuse the original VTD and LC right away. In contrast, when using VTD-XML to incrementally update an XML document, an application needs to reparse the updated document before it can process it.

An editor can be made smart enough to track the location of each token, permitting new, longer tokens to replace

existing, shorter tokens by merely addressing the new token in separate memory outside that used to store the

original document. Likewise, when reordering the document, element text does not need to be copied; only the LCs

need to be updated. When a complete, contiguous XML document is needed, such as when saving it, the disparate

parts can be reassembled into a new, contiguous document.

Other Benefits

VTD-XML also pioneers the non-blocking, stateless XPath evaluation approach.

Weaknesses

VTD-XML also exhibits a few noticeable shortcomings:

• As an XML parser, it does not support external entities declared in the DTD.

• As a file format, it increases the document size by about 30% to 50%.

• As an API, it is not compatible with DOM or SAX.

• It is difficult to support certain validation techniques, employed by DTD and XML Schema (e.g., default

attributes and elements), that require modifications to the XML instances being parsed.

Areas of Applications

General-purpose Replacement for DOM or SAX

Because of VTD-XML's performance and memory advantages, it covers a larger portion of XML use cases than

either DOM or SAX [22] .

• Compared to DOM, VTD-XML processes XML documents 3 to 5 times larger for the same amount of physical memory, at about 3 to 10 times the performance.

• Compared to SAX, VTD-XML provides random access and XPath support and outperforms SAX by at least 2x.

XPath over Huge XML documents

The extended edition of VTD-XML, combined with a 64-bit JVM, makes possible XPath-based XML processing over huge XML documents (up to 256 GB in size).

For SOA/WS/XML Security

The combination of VTD-XML's high performance and incremental-update capability makes it essential for achieving the desired level of Quality of Service for SOA/WS/XML security applications. [23] [24] [25]

For SOA/WS/XML Intermediary

VTD-XML is well suited for SOA intermediary applications such as XML routers/switches/gateways, Enterprise

Service Buses, and services aggregation points. All those applications perform the basic "store and forward"

operations for which retaining the original XML is critical for minimizing latency. VTD-XML's incremental update

capability also contributes significantly to the forwarding performance.

VTD-XML's random-access capability lends itself well to XPath-based XML routing/switching/filtering common in

AJAX and SOA deployment.

Intelligent SOA/WS/XML Load-balancing and Offloading

When an XML document travels through several middle-tier SOA components, the first message stop, after finishing

the inspection of the XML document, can choose to send the VTD+XML file format to the downstream components

to avoid repetitive parsing, thus improving throughput.

By the same token, an intelligent SOA load balancer can choose to generate VTD+XML for incoming/outgoing

SOAP messages to offload XML parsing from the application servers that receive those messages.

XML Persistence Data Store

When viewed from the perspective of native XML persistence, VTD-XML can be used as a human-readable, easy to

use, general-purpose XML index. XML documents stored this way can be loaded into memory to be queried,

updated, or edited without the overhead of parsing/re-serialization.

Schemaless XML Data Binding

VTD-XML's combination of high performance, low memory usage, and non-blocking XPath evaluation makes possible a new XML data binding approach based entirely on XPath. This approach's biggest benefits are that it no longer requires an XML schema, avoids needless object creation, and takes advantage of XML's inherent loose encoding [26].

Note that the data binding discussed in the cited article needs to be implemented by the application: VTD-XML itself only offers accessors. In this regard VTD-XML is not itself a data binding solution (unlike JiBX, JAXB, or XMLBeans), although it offers extraction functionality for data binding packages, much like other XML parsers (SAX, StAX).

Essential Classes

As of Version 2.6, the Java and C# versions of VTD-XML consist of the following classes:

• VTDGen (VTD Generator) is the class that encapsulates the main parsing, index loading and index writing

functions.

• VTDNav (VTD Navigator) is the class that (1) encapsulates XML, VTD, and hierarchical info, (2) contains various navigation methods, (3) performs various comparisons between VTD records and strings, and (4) converts VTD records to primitive data types.

• AutoPilot is a class containing functions that perform node-level iteration and XPath.

• XMLModifier is a class that offers incremental update capability, such as delete, insert and update.

The extended VTD-XML consists of the following classes:

• VTDGenHuge (Extended VTD Generator) encapsulates the main parsing.

• XMLBuffer performs in-memory loading of XML documents.

• XMLMemMappedBuffer performs memory mapped loading of XML documents.

• VTDNavHuge (Extended VTD Navigator) (1) encapsulates XML, Extended VTD, and hierarchical info, (2) contains various navigation methods, (3) performs various comparisons between VTD records and strings, and (4) converts VTD records to primitive data types.

• AutoPilotHuge performs node-level iteration and XPath.

Code Sample

    /* This Java program demonstrates how to use XMLModifier to incrementally
     * update a simple XML purchase order. It uses VTDGen's parseFile to
     * simplify programming.
     */
    import com.ximpleware.*;
    import java.io.IOException;

    public class Update {
        public static void main(String argv[]) throws NavException, ModifyException,
                XPathParseException, XPathEvalException, IOException {
            // parse the file (second argument: namespace awareness)
            VTDGen vg = new VTDGen();
            if (vg.parseFile("oldpo.xml", true)) {
                VTDNav vn = vg.getNav();
                AutoPilot ap = new AutoPilot(vn);
                XMLModifier xm = new XMLModifier(vn);
                // select the items to be replaced
                ap.selectXPath("/purchaseOrder/items/item[@partNum='872-AA']");
                int i = -1;
                while ((i = ap.evalXPath()) != -1) {
                    xm.remove();                  // remove the element at the cursor
                    xm.insertBeforeElement("\n"); // insert text before that element
                }
                // the next query is cut off in the source text:
                // ap.selectXPath("/purchaseOrder/items/item/USPrice[.
            }
        }
    }

X-expression

X-expressions are the unification of S-expressions found in the Lisp programming language with XML.

X-expressions unify notions of computation with data sharing.

XBRLS

XBRLS (XBRL Simple Application Profile) is an application profile of XBRL.

XBRLS is designed to be 100% XBRL compliant. The stated goals of XBRLS are "to maximize XBRL's benefits,

reduce costs of implementation, and maximize the functionality and effectiveness of XBRL" [1] . XBRL is a general

purpose specification, based on the idea that no one is likely to use 100% of the components of XBRL in building

any one solution. XBRLS specifies a subset of XBRL that is designed to meet the needs of most business users in

most situations, and offers it as a starting point for others. This approach creates an application profile of XBRL

(equivalent to a database view but concerned with metadata, not data).

XBRLS is intended to enable the non-XBRL expert to create both XBRL metadata and XBRL reports in a simple

and convenient manner. At the same time, it seeks to improve the usability of XBRL, the interoperability among

XBRL-based solutions, the effectiveness of XBRL extensions and to reduce software development costs.

The profile was created by Rene van Egmond and Charlie Hoffman, who was the initial creator of XBRL. It borrows

heavily from the US GAAP Taxonomy Architecture.

XBRLS Architecture

The XBRLS architecture is based on many ideas used by the US GAAP Taxonomy Architecture. The intent of the

XBRLS architecture is to make it easier for business users to make use of XBRL, to make it easier for software

vendors to support XBRL, and to safely use the features of XBRL. XBRLS is a subset of what is allowed by the

complete XBRL Specification. Examples of these limitations placed on XBRL are the following:

• Uses no tuples.

• Only uses the segment element of the instance context and disallows the use of the scenario element.

• Allows only XBRL dimensional information as content for the segment element in the instance context.

Furthermore, it requires that every concept (member, primary item) participates in a hypercube and that all

hypercubes are closed.

• Allows no use of simple or complex typed members within XBRL Dimensions.

• Never uses the precision attribute; always uses the decimals attribute.

• Requires that every measure exists in at least one XBRL Dimension.

XBRL Components not used in XBRLS

XBRL Specification | Topic | Explanation

Instance | Context: entity identifier, entity scheme | Although not required when using XBRLS, it is highly encouraged that the entity scheme and identifier be “held static” or synchronized with an explicit member, and that XBRL Dimensions be used instead to articulate entity information, perhaps with an XBRLS “Entity [Axis]” dimension. The “entity identifier” and “entity scheme” portion of a context should not be used. Rather, the “entity identifier” and “entity scheme” are static (i.e., dummy values in order to pass XBRL validation), using constant values. The information relating to the entity identifier and entity scheme is moved to an XBRLS-specific taxonomy that makes use of XBRL Dimensions to communicate this information.

Instance | Context: period | Although not required when using XBRLS, it is highly encouraged that the period context be “held static” or synchronized with an explicit member, and that XBRL Dimensions be used instead to articulate this information, perhaps with an XBRLS “Period [Axis]” dimension. Uses XBRL Dimensions to articulate this XBRL quasi dimension.

Instance (sections 4.7.4 and 4.7.3.2) | Context: segments, scenarios | Only uses XBRL Dimensions to articulate the content of segments and scenarios, excluding the use of XML Schema-based contextual information allowed by those sections. Furthermore, mixing XML Schema-based contextual information and XBRL Dimensions is technically dangerous.

Instance | Fact Value: precision | Uses only the decimals attribute; precision must not be used.

Taxonomy | Elements: tuples | Tuples are not allowed.

Taxonomies | Weight | The weight attribute value of calculations must be either “1” or “-1”; no decimal value between the two is allowed.

Taxonomies | Annotation, Documentation | Each schema and each linkbase must provide documentation that describes the contents of the file and that is readable by a computer application.

Dimensions | Open Hypercubes | Open hypercubes are not allowed; only closed hypercubes are allowed.

Dimensions | notAll | Only “all” has-hypercube arcroles are allowed; “notAll” is not allowed.

Dimensions | Typed Members | Typed members (simple or complex) are not allowed.

External links

• XBRL Business Information Exchange [2]

• XBRLS: how a simpler XBRL can make a better XBRL [3]

• Comprehensive Example [4]

• XBRLS - XBRL Made Easy [5]

• Data Interactive: An Interview with Charlie Hoffman [6]

References

[1] XBRL Business Information Exchange (http://xbrl.squarespace.com/xbrls/) website

[2] http://xbrl.squarespace.com/xbrls/

[3] http://xbrl.squarespace.com/storage/xbrls/XBRLS-How-simpler-can-be-better-2008-03-11.pdf

[4] http://xbrl.squarespace.com//storage/xbrls/XBRLS-ComprehensiveExample-2008-04-18.zip

[5] http://www.ubmatrix.com/company/innovation.htm

[6] http://hitachidatainteractive.com/2008/04/23/an-interview-with-charlie-hoffman

Xdos

XDoS is an acronym for XML denial-of-service.

An XDoS attack is a content-borne attack whose purpose is to shut down a web service or the system running that service. A common XDoS attack occurs when an XML message is sent with a multitude of digital signatures: a naive parser examines each signature, consuming all available CPU cycles and exhausting the system's resources. Such attacks are less common than inadvertent XDoS attacks, which occur when a programming error by a trusted customer causes a handshake to go into an infinite loop.

XDR Schema

XML-Data Reduced (XDR) is a schema language for XML, used in the W3C XML-Data Note and the Document Content Description (DCD) initiative.

MSXML provided XDR schema support from version 2.0 up to, but not including, version 6.0 [1].

See also

• XML Schema Language Comparison - Comparison of other XML Schema languages (not XDR).

• List of XML Schemas - list of XML schemas in use on the Internet sorted by purpose

External links

• XDR Schema Data Types Reference [2]

References

[1] Version and Conformance (http://msdn2.microsoft.com/en-us/library/ms757825(VS.85).aspx)

[2] http://msdn2.microsoft.com/en-us/library/ms256049.aspx

XEE (Starlight)

XEE (XML Engineering Environment) is a visual language for data processing and ETL tasks. It is designed for the

Starlight Information Visualization System as a method for producing and processing XML data.

XEP

Developer(s) RenderX

Stable release 4.18 / March 2010

Written in Java

Operating system Microsoft Windows, Linux, FreeBSD

Type Layout engine

Website [1]

XEP is a commercial XSL-FO layout engine written in Java. XEP is proprietary software by RenderX.

History

XEP started in 1999 as a working prototype written in Perl and was soon completely rewritten in Java; it has since evolved into a complete engine. XEP runs on any platform where a Java runtime is available, including Windows, Linux, FreeBSD and other server platforms.

Features

XEP accepts XSL-FO as input, as well as XML+XSLT. Its output formats are PDF, PostScript, AFP, PPML, XPS, HTML, SVG, and an internal XML-based format called XEPOUT.

XEP demonstrates conformance with the XSL-FO Recommendation v1.0, offers a wide range of extensions, and supports a good subset of XSL 1.1 features. [2]

Available font types, depending on the output format generator, are Type 1, TrueType and OpenType, with support for embedding and subsetting.

Accepted image formats are most flavors of raster graphics, SVG, EPS and PDF.

API

For integration, XEP provides a Java API and examples covering a number of approaches such as SAX, JAXP and DOM. XEP has a flexible configuration, which allows it to run concurrently in threads on huge input documents, but also in a small heap in diskless environments such as application servers.

Satellite software

For Windows users there exists a .NET wrapper called XEPWin, and an accompanying .NET development kit with

API in C#, VB and ASP.NET.

Satellite software includes EnMasse - a multiplexer of a grid of XEP engines, with simple networked API and

examples in C, Java, Perl and Python.

External links

• XEP on RenderX site [1]

• Official W3C XSL recommendation formatted by XEP [3]

• How to use XEP with Stylus Studio [4]

References

[1] http://www.renderx.com/tools/xep.html

[2] http://xml.coverpages.org/ni2001-11-08-b.html

[3] http://www.w3.org/TR/2006/REC-xsl11-20061205/xsl11.pdf

[4] http://www.stylusstudio.com/renderx/xep.html

XML

Filename extension .xml

Internet media type application/xml, text/xml (deprecated) [1] [2]

Uniform Type Identifier public.xml

Developed by World Wide Web Consortium

Type of format Markup language

Extended from SGML

Extended to Numerous, including: XHTML, RSS, Atom

Standard(s) 1.0 (Fifth Edition) [3] November 26, 2008; 1.1 (Second Edition) [4] August 16, 2006

Open format? Yes

Current Status Published

Year Started 1996

Editors Tim Bray, Jean Paoli, C. M. Sperberg-McQueen, Eve Maler, François Yergeau, John Cowan

Related Standards XML Schema

Domain Data Serialization

Abbreviation XML

Website XML 1.0 [5]

XML (Extensible Markup Language) is a set of rules for encoding documents in machine-readable form. It is

defined in the XML 1.0 Specification [6] produced by the W3C, and several other related specifications, all gratis

open standards. [7]

XML's design goals emphasize simplicity, generality, and usability over the Internet. [8] It is a textual data format,

with strong support via Unicode for the languages of the world. Although XML's design focuses on documents, it is

widely used for the representation of arbitrary data structures, for example in web services.

There are many programming interfaces that software developers may use to access XML data, and several schema

systems designed to aid in the definition of XML-based languages.

As of 2009, hundreds of XML-based languages have been developed, [9] including RSS, Atom, SOAP, and XHTML.

XML-based formats have become the default for most office-productivity tools, including Microsoft Office (Office

Open XML), OpenOffice.org (OpenDocument), and Apple's iWork. [10]

Key terminology

The material in this section is based on the XML Specification. This is not an exhaustive list of all the constructs

which appear in XML; it provides an introduction to the key constructs most often encountered in day-to-day use.

(Unicode) Character

By definition, an XML document is a string of characters. Almost every legal Unicode character may appear

in an XML document.

Processor and Application

The processor analyzes the markup and passes structured information to an application. The specification

places requirements on what an XML processor must do and not do, but the application is outside its scope.

The processor (as the specification calls it) is often referred to colloquially as an XML parser.

Markup and Content

The characters which make up an XML document are divided into markup and content. Markup and content may be distinguished by the application of simple syntactic rules. All strings which constitute markup either begin with the character "<" and end with a ">", or begin with the character "&" and end with a ";". Strings of characters which are not markup are content.

Tag

A markup construct that begins with "<" and ends with ">". Tags come in three flavors: start-tags, for example <section>, end-tags, for example </section>, and empty-element tags, for example <line-break/>.

Element

A logical component of a document which either begins with a start-tag and ends with a matching end-tag, or consists only of an empty-element tag. The characters between the start- and end-tags, if any, are the element's content, and may contain markup, including other elements, which are called child elements. An example of an element is <Greeting>Hello, world.</Greeting>. Another is <line-break/>.

Attribute

A markup construct consisting of a name/value pair that exists within a start-tag or empty-element tag. In the example (below) the element img has two attributes, src and alt: <img src="madonna.jpg" alt='Foligno Madonna, by Raphael'/>. Another example would be <step number="3">Connect A to B.</step> where the name of the attribute is "number" and the value is "3".

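XML Declaration

XML documents may begin by declaring some information about themselves, as in the following example.

<?xml version="1.0" encoding="UTF-8" ?>

Example

Here is a small, complete XML document, which uses all of these constructs and concepts.

<?xml version="1.0" encoding="UTF-8" ?>
<painting>
  <img src="madonna.jpg" alt='Foligno Madonna, by Raphael'/>
  <caption>This is Raphael's "Foligno" Madonna, painted in
    <date>1511</date>–<date>1512</date>.
  </caption>
</painting>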

There are five elements in this example document: painting, img, caption, and two dates. The date elements are

children of caption, which is a child of the root element painting. img has two attributes, src and alt.
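The structure described above can be checked programmatically. The sketch below uses the JDK's DOM parser; the class name PaintingExample is hypothetical, and the document string is a stand-in built from the description in this section, with assumed values for the src and alt attributes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class PaintingExample {
    // Stand-in document; the attribute values are assumptions for illustration.
    static final String XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
      + "<painting>\n"
      + "  <img src=\"madonna.jpg\" alt=\"Foligno Madonna, by Raphael\"/>\n"
      + "  <caption>This is Raphael's \"Foligno\" Madonna, painted in\n"
      + "    <date>1511</date>\u2013<date>1512</date>.\n"
      + "  </caption>\n"
      + "</painting>\n";

    /** Parses the text into a DOM tree, wrapping checked exceptions. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        // "*" matches every element in the document, including the root.
        System.out.println("elements: " + doc.getElementsByTagName("*").getLength());
        Element img = (Element) doc.getElementsByTagName("img").item(0);
        System.out.println("src=" + img.getAttribute("src"));
        System.out.println("alt=" + img.getAttribute("alt"));
    }
}
```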

Characters and escaping

XML documents consist entirely of characters from the Unicode repertoire. Except for a small number of

specifically excluded control characters, any character defined by Unicode may appear within the content of an XML

document. The selection of characters which may appear within markup is somewhat more limited but still large.

XML includes facilities for identifying the encoding of the Unicode characters which make up the document, and for

expressing characters which, for one reason or another, cannot be used directly.

Details on valid characters

Unicode characters in the following code point ranges are valid in XML 1.0 documents: [11]

• U+0009

• U+000A

• U+000D

• U+0020–U+D7FF

• U+E000–U+FFFD

• U+10000–U+10FFFF
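The XML 1.0 ranges listed above translate directly into a code point test. This is a minimal sketch; the class and method names (XmlChars, isXml10Char, isXml10Text) are hypothetical.

```java
final class XmlChars {
    /** True if the code point lies in one of the XML 1.0 ranges listed above. */
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** True if every code point of the string is a valid XML 1.0 character. */
    static boolean isXml10Text(String s) {
        return s.codePoints().allMatch(XmlChars::isXml10Char);
    }

    public static void main(String[] args) {
        System.out.println(isXml10Char(0x09));  // tab
        System.out.println(isXml10Char(0x0B));  // vertical tab
        System.out.println(isXml10Text("plain text\n"));
    }
}
```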

XML 1.1 extends the set of allowed characters to all code points in the following ranges: [12]

• U+0001–U+D7FF

• U+E000–U+FFFD

• U+10000–U+10FFFF

Within these ranges, the following control characters are restricted: they are valid in XML 1.1 documents only in certain contexts, namely when written as character references.

• U+0001–U+0008

• U+000B–U+000C

• U+000E–U+001F

• U+007F–U+0084

• U+0086–U+009F

Encoding detection

The Unicode character set can be encoded into bytes for storage or transmission in a variety of different ways, called

"encodings". Unicode itself defines encodings which cover the entire repertoire; well-known ones include UTF-8

and UTF-16. [13] There are many other text encodings which pre-date Unicode, such as ASCII and ISO/IEC 8859;

their character repertoires in almost every case are subsets of the Unicode character set.

XML allows the use of any of the Unicode-defined encodings, and any other encodings whose characters also appear

in Unicode. XML also provides a mechanism whereby an XML processor can reliably, without any prior knowledge,

determine which encoding is being used. [14] Encodings other than UTF-8 and UTF-16 will not necessarily be

recognized by every XML parser.
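The detection mechanism relies on the first few bytes of the document: a byte order mark, or the byte pattern of the opening "<?xml", followed by the encoding named in the XML declaration. The following simplified sketch illustrates the idea; the class EncodingSniffer is hypothetical, UTF-32 and EBCDIC are deliberately not handled, and a real processor does considerably more.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EncodingSniffer {
    private static final Pattern ENCODING =
        Pattern.compile("encoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._\\-]*)[\"']");

    /** Guesses the encoding name from the leading bytes of a document. */
    static String sniff(byte[] b) {
        if (startsWith(b, 0xEF, 0xBB, 0xBF)) return "UTF-8";          // UTF-8 BOM
        if (startsWith(b, 0xFE, 0xFF)) return "UTF-16BE";             // BOM
        if (startsWith(b, 0xFF, 0xFE)) return "UTF-16LE";             // BOM
        if (startsWith(b, 0x00, 0x3C, 0x00, 0x3F)) return "UTF-16BE"; // "<?" without BOM
        if (startsWith(b, 0x3C, 0x00, 0x3F, 0x00)) return "UTF-16LE"; // "<?" without BOM
        if (startsWith(b, 0x3C, 0x3F, 0x78, 0x6D)) return declared(b); // "<?xm"
        return "UTF-8"; // default when there is no BOM and no declaration
    }

    /** Reads the encoding pseudo-attribute from an ASCII-compatible declaration. */
    private static String declared(byte[] b) {
        String head = new String(b, 0, Math.min(b.length, 256), StandardCharsets.ISO_8859_1);
        int end = head.indexOf("?>");
        if (end >= 0) head = head.substring(0, end);
        Matcher m = ENCODING.matcher(head);
        return m.find() ? m.group(1) : "UTF-8";
    }

    private static boolean startsWith(byte[] b, int... prefix) {
        if (b.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((b[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] doc = "<?xml version='1.0' encoding='ISO-8859-1'?><a/>"
            .getBytes(StandardCharsets.US_ASCII);
        System.out.println(sniff(doc));
    }
}
```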

Escaping

There are several reasons why it may be difficult or impossible to include some character directly in an XML

document.

• The characters "

Comments

Comments may appear anywhere in a document outside other markup. For compatibility with XML processors, comments should not appear on the first line or otherwise above the XML declaration. The string "--" (double-hyphen) is not allowed inside a comment (as it is used to delimit comments), and entities must not be recognized within comments.

An example of a valid comment: "<!-- no need to escape <code> & such in comments -->"

International use

XML supports the direct use of almost any Unicode character in element names, attributes, comments, character

data, and processing instructions (other than the ones that have special symbolic meaning in XML itself, such as the

open corner bracket, "

DTD

The oldest schema language for XML is the Document Type Definition (DTD), inherited from SGML.

DTDs have the following benefits:

• DTD support is ubiquitous due to its inclusion in the XML 1.0 standard.

• DTDs are terse compared to element-based schema languages and consequently present more information in a

single screen.

• DTDs allow the declaration of standard public entity sets for publishing characters.

• DTDs define a document type rather than the types used by a namespace, thus grouping all constraints for a

document in a single collection.

DTDs have the following limitations:

• They have no explicit support for newer features of XML, most importantly namespaces.

• They lack expressiveness. XML DTDs are simpler than SGML DTDs and there are certain structures that cannot

be expressed with regular grammars. DTDs only support rudimentary datatypes.

• They lack readability. DTD designers typically make heavy use of parameter entities (which behave essentially as

textual macros), which make it easier to define complex grammars, but at the expense of clarity.

• They use a syntax based on regular expression syntax, inherited from SGML, to describe the schema. Typical

XML APIs such as SAX do not attempt to offer applications a structured representation of the syntax, so it is less

accessible to programmers than an element-based syntax may be.

Two peculiar features that distinguish DTDs from other schema types are the syntactic support for embedding a

DTD within XML documents and for defining entities, which are arbitrary fragments of text and/or markup that the

XML processor inserts in the DTD itself and in the XML document wherever they are referenced, like character

escapes.
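Both features can be seen in one small document. The sketch below embeds a DTD in the document itself (an internal subset), declares an entity there, and lets the JDK's DOM parser expand the reference. The class name EntityDemo, the element name note and the entity name publisher are all hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class EntityDemo {
    // A DTD embedded in the document, declaring one element and one entity.
    static final String XML =
        "<!DOCTYPE note [\n"
      + "  <!ELEMENT note (#PCDATA)>\n"
      + "  <!ENTITY publisher \"Example Press\">\n"
      + "]>\n"
      + "<note>Published by &publisher;.</note>";

    /** Parses the document and returns the text of its root element. */
    static String textOf(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The processor replaces &publisher; with the declared text.
        System.out.println(textOf(XML));
    }
}
```

Applications that parse untrusted input often disable DTD processing entirely, since entity expansion can be abused; the default parser settings are used here only to keep the sketch short.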

DTD technology is still used in many applications because of its ubiquity.

XML Schema

A newer schema language, described by the W3C as the successor of DTDs, is XML Schema, often referred to by

the initialism for XML Schema instances, XSD (XML Schema Definition). XSDs are far more powerful than DTDs

in describing XML languages. They use a rich datatyping system and allow for more detailed constraints on an XML

document's logical structure. XSDs also use an XML-based format, which makes it possible to use ordinary XML

tools to help process them.
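As an illustration of the datatyping mentioned above, the sketch below compiles a one-element schema and validates two instances against it using the JDK's javax.xml.validation API. The schema, the element name x and the class name IntValidator are hypothetical examples.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class IntValidator {
    // Hypothetical schema: a single element x whose content must be an xsd:int.
    static final String XSD =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
      + "<xsd:element name='x' type='xsd:int'/>"
      + "</xsd:schema>";

    /** Returns true if the instance document is valid against XSD. */
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed, or does not satisfy the schema
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<x>42</x>"));
        System.out.println(isValid("<x>forty-two</x>"));
    }
}
```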

RELAX NG

RELAX NG was initially specified by OASIS and is now also an ISO international standard (as part of DSDL).

RELAX NG schemas may be written in either an XML based syntax or a more compact non-XML syntax; the two

syntaxes are isomorphic and James Clark's Trang conversion tool can convert between them without loss of

information. RELAX NG has a simpler definition and validation framework than XML Schema, making it easier to

use and implement. It also has the ability to use datatype framework plug-ins; a RELAX NG schema author, for

example, can require values in an XML document to conform to definitions in XML Schema Datatypes.

Schematron

Schematron is a language for making assertions about the presence or absence of patterns in an XML document. It

typically uses XPath expressions.

ISO DSDL and other schema languages

The ISO DSDL (Document Schema Description Languages) standard brings together a comprehensive set of small

schema languages, each targeted at specific problems. DSDL includes RELAX NG full and compact syntax,

Schematron assertion language, and languages for defining datatypes, character repertoire constraints, renaming and

entity expansion, and namespace-based routing of document fragments to different validators. DSDL schema

languages do not have the vendor support of XML Schemas yet, and are to some extent a grassroots reaction of

industrial publishers to the lack of utility of XML Schemas for publishing.

Some schema languages not only describe the structure of a particular XML format but also offer limited facilities to

influence processing of individual XML files that conform to this format. DTDs and XSDs both have this ability;

they can for instance provide the infoset augmentation facility and attribute defaults. RELAX NG and Schematron

intentionally do not provide these.

Related specifications

A cluster of specifications closely related to XML have been developed, starting soon after the initial publication of

XML 1.0. It is frequently the case that the term "XML" is used to refer to XML together with one or more of these

other technologies which have come to be seen as part of the XML core.

• XML Namespaces enable the same document to contain XML elements and attributes taken from different

vocabularies, without any naming collisions occurring. Essentially all software which is advertised as supporting

XML also supports XML Namespaces.

• XML Base defines the xml:base attribute, which may be used to set the base for resolution of relative URI

references within the scope of a single XML element.

• The XML Information Set or XML infoset describes an abstract data model for XML documents in terms of

information items. The infoset is commonly used in the specifications of XML languages, for convenience in

describing constraints on the XML constructs those languages allow.

• xml:id Version 1.0 asserts that an attribute named xml:id functions as an "ID attribute" in the sense used in a

DTD.

• XPath defines a syntax named XPath expressions which identifies one or more of the internal components

(elements, attributes, and so on) included in an XML document. XPath is widely used in other core-XML

specifications and in programming libraries for accessing XML-encoded data.

• XSLT is a language with an XML-based syntax that is used to transform XML documents into other XML

documents, HTML, or other, unstructured formats such as plain text or RTF. XSLT is very tightly coupled with

XPath, which it uses to address components of the input XML document, mainly elements and attributes.

• XSL Formatting Objects, or XSL-FO, is a markup language for XML document formatting which is most often

used to generate PDFs.

• XQuery is an XML-oriented query language strongly rooted in XPath and XML Schema. It provides methods to

access, manipulate and return XML.

• XML Signature defines syntax and processing rules for creating digital signatures on XML content.

• XML Encryption defines syntax and processing rules for encrypting XML content.

Some other specifications conceived as part of the "XML Core" have failed to find wide adoption, including

XInclude, XLink, and XPointer.
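As a rough illustration of the XML Namespaces entry in the list above, the following sketch parses a document that mixes two vocabularies which both define a `title` element. The namespace URIs `urn:example:book` and `urn:example:person`, the class name, and the sample data are assumptions; the DOM calls are the standard JAXP ones.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical demo: same local name, different namespaces, no collision. */
final class NamespaceDemo {

    /** Counts elements with the given namespace URI and local name. */
    static int count(String xml, String namespaceUri, String localName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // JAXP factories are not namespace-aware by default
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagNameNS(namespaceUri, localName).getLength();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        String xml =
                "<catalog xmlns:b='urn:example:book' xmlns:p='urn:example:person'>"
              + "<b:title>XML Basics</b:title>"
              + "<p:title>Dr.</p:title>"
              + "<b:title>XPath Notes</b:title>"
              + "</catalog>";
        System.out.println(count(xml, "urn:example:book", "title"));   // book titles
        System.out.println(count(xml, "urn:example:person", "title")); // personal titles
    }
}
```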

Use on the Internet

XML is commonly used to interchange data over the Internet. RFC 3023 gives rules for the construction

of Internet Media Types for use when sending XML. It also defines the types "application/xml" and "text/xml",

which say only that the data is in XML, and nothing about its semantics. The use of "text/xml" has been criticized as a

potential source of encoding problems and is now in the process of being deprecated. [18] RFC 3023 also

recommends that XML-based languages be given media types ending in "+xml"; for

example "image/svg+xml" for SVG.
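A receiver can apply these naming rules to decide whether a payload should be handed to an XML parser. The sketch below is one possible check, not an API defined by RFC 3023; the class and method names are hypothetical.

```java
import java.util.Locale;

/** Hypothetical helper that recognizes XML media types by name. */
final class XmlMediaTypes {

    /** Returns true for application/xml, text/xml, or any type ending in +xml. */
    static boolean isXml(String mediaType) {
        // Drop parameters such as "; charset=utf-8" and normalize case.
        String essence = mediaType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return essence.equals("application/xml")
                || essence.equals("text/xml")
                || essence.endsWith("+xml");
    }

    public static void main(String[] args) {
        System.out.println(isXml("image/svg+xml"));
        System.out.println(isXml("text/xml; charset=utf-8"));
        System.out.println(isXml("text/plain"));
    }
}
```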

Further guidelines for the use of XML in a networked context may be found in RFC 3470, also known as IETF BCP

70; this document is very wide-ranging and covers many aspects of designing and deploying an XML-based

language.

Programming interfaces

The design goals of XML include "It shall be easy to write programs which process XML documents." [8] Despite

this, the XML specification contains almost no information about how programmers might go about doing such

processing. The XML Infoset provides a vocabulary to refer to the constructs within an XML document, but once

again does not provide any guidance on how to access this information. A variety of APIs for accessing XML have

been developed and used, and some have been standardized.

Existing APIs for XML processing tend to fall into these categories:

• Stream-oriented APIs accessible from a programming language, for example SAX and StAX.

• Tree-traversal APIs accessible from a programming language, for example DOM.

• XML data binding, which provides an automated translation between an XML document and

programming-language objects.

• Declarative transformation languages such as XSLT and XQuery.

Stream-oriented facilities require less memory and, for certain tasks which are based on a linear traversal of an XML

document, are faster and simpler than other alternatives. Tree-traversal and data-binding APIs typically require the

use of much more memory, but are often found more convenient for use by programmers; some include declarative

retrieval of document components via the use of XPath expressions.

XSLT is designed for declarative description of XML document transformations, and has been widely implemented

both in server-side packages and Web browsers. XQuery overlaps XSLT in its functionality, but is designed more for

searching of large XML databases.
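A declarative transformation can be run from Java through the standard `javax.xml.transform` API. The stylesheet and input below are made-up examples, and `XsltDemo` is a hypothetical class name; the sketch turns a small list into plain text.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical demo: apply an XSLT 1.0 stylesheet held in a string. */
final class XsltDemo {

    // Emits the text of every /list/item, each followed by a semicolon.
    static final String STYLESHEET =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='text'/>"
          + "<xsl:template match='/'>"
          + "<xsl:for-each select='/list/item'>"
          + "<xsl:value-of select='.'/><xsl:text>;</xsl:text>"
          + "</xsl:for-each>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    /** Transforms the XML text with the stylesheet and returns the result as a string. */
    static String transform(String stylesheet, String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(STYLESHEET, "<list><item>a</item><item>b</item></list>"));
    }
}
```

The stylesheet describes what the output looks like for each matched node; the order of traversal is left to the XSLT processor.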

Simple API for XML (SAX)

SAX is a lexical, event-driven interface in which a document is read serially and its contents are reported as

callbacks to various methods on a handler object of the user's design. SAX is fast and efficient to implement, but

difficult to use for extracting information at random from the XML, since it tends to burden the application author

with keeping track of what part of the document is being processed. It is better suited to situations in which certain

types of information are always handled the same way, no matter where they occur in the document.
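The callback style can be sketched as follows. The handler collects the text of every `title` element regardless of where it occurs, which is the kind of task the paragraph above describes as a good fit. The element name `title` and the class name are assumptions; the note on `text` shows the state the application has to track by hand.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical demo: gather all <title> texts with SAX callbacks. */
final class SaxTitles {

    static List<String> titles(String xml) {
        List<String> found = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside a <title>

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (qName.equals("title")) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) {
                    text.append(ch, start, length); // text may arrive in several chunks
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (qName.equals("title") && text != null) {
                    found.add(text.toString());
                    text = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(titles("<lib><book><title>A</title></book><title>B</title></lib>"));
    }
}
```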

Pull parsing

Pull parsing [19] treats the document as a series of items which are read in sequence using the Iterator design pattern.

This allows for writing of recursive-descent parsers in which the structure of the code performing the parsing mirrors

the structure of the XML being parsed, and intermediate parsed results can be used and accessed as local variables

within the methods performing the parsing, or passed down (as method parameters) into lower-level methods, or

returned (as method return values) to higher-level methods. Examples of pull parsers include StAX in the Java

programming language, XMLReader in PHP and System.Xml.XmlReader in the .NET Framework.

A pull parser creates an iterator that sequentially visits the various elements, attributes, and data in an XML

document. Code which uses this iterator can test the current item (to tell, for example, whether it is a start or end

element, or text), and inspect its attributes (local name, namespace, values of XML attributes, value of text, etc.), and

can also move the iterator to the next item. The code can thus extract information from the document as it traverses

it. The recursive-descent approach tends to lend itself to keeping data as typed local variables in the code doing the

parsing, while SAX, for instance, typically requires a parser to manually maintain intermediate data within a stack of

elements which are parent elements of the element being parsed. Pull-parsing code can be more straightforward to

understand and maintain than SAX parsing code.
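A minimal StAX sketch of this style follows. The `person` vocabulary, the class name, and the method names are assumptions. Note how `parsePerson` mirrors the element it reads and keeps intermediate values in typed local variables.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical demo: recursive-descent style parsing with a pull parser. */
final class PullPerson {

    /** Parses <person><name>..</name><age>..</age></person> into "name (age)". */
    static String describe(String xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            reader.nextTag(); // advance to the root <person> start tag
            return parsePerson(reader);
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
    }

    /** Called with the reader positioned on the <person> start tag. */
    private static String parsePerson(XMLStreamReader reader) throws XMLStreamException {
        String name = null;
        int age = -1;
        // nextTag() skips whitespace and stops at the next start or end tag.
        while (reader.nextTag() == XMLStreamConstants.START_ELEMENT) {
            switch (reader.getLocalName()) {
                case "name":
                    name = reader.getElementText();
                    break;
                case "age":
                    age = Integer.parseInt(reader.getElementText().trim());
                    break;
                default:
                    reader.getElementText(); // skip unknown text-only children
            }
        }
        return name + " (" + age + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe("<person><name>Ann</name><age>30</age></person>"));
    }
}
```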

Document Object Model (DOM)

DOM (Document Object Model) is an interface-oriented Application Programming Interface that allows for

navigation of the entire document as if it were a tree of "Node" objects representing the document's contents. A

DOM document can be created by a parser, or can be generated manually by users (with limitations). Data types in

DOM Nodes are abstract; implementations provide their own programming language-specific bindings. DOM

implementations tend to be memory intensive, as they generally require the entire document to be loaded into

memory and constructed as a tree of objects before access is allowed.
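The tree view can be illustrated with a short walk over the `Node` objects. The class name and the indentation format are assumptions; the parsing and traversal calls are the standard JAXP and W3C DOM ones.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical demo: print the element tree of a document held in memory. */
final class DomOutline {

    /** Returns one line per element, indented two spaces per level of depth. */
    static String outline(String xml) {
        try {
            // The whole document is parsed into a tree before any node is visited.
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringBuilder sb = new StringBuilder();
            walk(doc.getDocumentElement(), 0, sb);
            return sb.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
    }

    private static void walk(Node node, int depth, StringBuilder sb) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return; // ignore text, comments and other node types
        }
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
        sb.append(node.getNodeName()).append('\n');
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            walk(child, depth + 1, sb);
        }
    }

    public static void main(String[] args) {
        System.out.print(outline("<a><b><c/></b><d/></a>"));
    }
}
```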

Data binding

Another form of XML processing API is XML data binding, where XML data is made available as a hierarchy of

custom, strongly typed classes, in contrast to the generic objects created by a Document Object Model parser. This

approach simplifies code development, and in many cases allows problems to be identified at compile time rather

than run-time. Example data binding systems include the Java Architecture for XML Binding (JAXB), XML

Serialization in .NET, [20] [21] [22] and CodeSynthesis XSD for C++.

XML as data type

XML is beginning to appear as a first-class data type in other languages. The ECMAScript for XML (E4X)

extension to the ECMAScript/JavaScript language explicitly defines two specific objects (XML and XMLList) for

JavaScript, which support XML document nodes and XML document lists as distinct objects and use a dot-notation

specifying parent-child relationships. E4X is supported by the Mozilla 2.5+ browsers and Adobe ActionScript, but

has not been adopted more universally. Similar notations are used in Microsoft's LINQ implementation for Microsoft

.NET 3.5 and above, and in Scala (which uses the Java VM). The open-source xmlsh application, which provides a

Linux-like shell with special features for XML manipulation, similarly treats XML as a data type, using the

<[ ]> notation. [23] The Resource Description Framework defines a data type rdf:XMLLiteral to hold wrapped, canonical

XML. [24]

History

XML is an application profile of SGML (ISO 8879). [25]

The versatility of SGML for dynamic information display was understood by early digital media publishers in the

late 1980s prior to the rise of the Internet. [26] [27] By the mid-1990s some practitioners of SGML had gained

experience with the then-new World Wide Web, and believed that SGML offered solutions to some of the problems

the Web was likely to face as it grew. Dan Connolly added SGML to the list of W3C's activities when he joined the

staff in 1995; work began in mid-1996 when Sun Microsystems engineer Jon Bosak developed a charter and

recruited collaborators. Bosak was well connected in the small community of people who had experience both in

SGML and the Web. [28]

XML was compiled by a working group of eleven members, [29] supported by an (approximately) 150-member

Interest Group. Technical debate took place on the Interest Group mailing list and issues were resolved by consensus

or, when that failed, majority vote of the Working Group. A record of design decisions and their rationales was

compiled by Michael Sperberg-McQueen on December 4, 1997. [30] James Clark served as Technical Lead of the

Working Group, notably contributing the empty-element "<empty />" syntax and the name "XML". Other names that

had been put forward for consideration included "MAGMA" (Minimal Architecture for Generalized Markup

Applications), "SLIM" (Structured Language for Internet Markup) and "MGML" (Minimal Generalized Markup

Language). The co-editors of the specification were originally Tim Bray and Michael Sperberg-McQueen. Halfway

through the project Bray accepted a consulting engagement with Netscape, provoking vociferous protests from

Microsoft. Bray was temporarily asked to resign the editorship. This led to intense dispute in the Working Group,

eventually resolved by the appointment of Microsoft's Jean Paoli as a third co-editor.

The XML Working Group never met face-to-face; the design was accomplished using a combination of email and

weekly teleconferences. The major design decisions were reached in twenty weeks of intense work between July and

November 1996, when the first Working Draft of an XML specification was published. [31] Further design work

continued through 1997, and XML 1.0 became a W3C Recommendation on February 10, 1998.

Sources

XML is a profile of the ISO standard SGML, and most of XML comes from SGML unchanged. From SGML comes

the separation of logical and physical structures (elements and entities), the availability of grammar-based validation

(DTDs), the separation of data and metadata (elements and attributes), mixed content, the separation of processing

from representation (processing instructions), and the default angle-bracket syntax. Removed was the SGML

Declaration (XML has a fixed delimiter set and adopts Unicode as the document character set).

Other sources of technology for XML were the Text Encoding Initiative (TEI), which defined a profile of SGML for

use as a 'transfer syntax'; HTML, in which elements were synchronous with their resource, the separation of

document character set from resource encoding, the xml:lang attribute, and the HTTP notion that metadata

accompanied the resource rather than being needed at the declaration of a link. The Extended Reference Concrete

Syntax (ERCS) project of the SPREAD (Standardization Project Regarding East Asian Documents) project of the

ISO-related China/Japan/Korea Document Processing expert group was the basis of XML 1.0's naming rules;

SPREAD also introduced hexadecimal numeric character references and the concept of references to make available

all Unicode characters. To support ERCS, XML and HTML better, the SGML standard IS 8879 was revised in 1996

and 1998 with WebSGML Adaptations. The XML header followed that of ISO HyTime.

Ideas that developed during discussion which were novel in XML included the algorithm for encoding detection and

the encoding header, the processing instruction target, the xml:space attribute, and the new close delimiter for

empty-element tags. The notion of well-formedness as opposed to validity (which enables parsing without a schema)

was first formalized in XML, although it had been implemented successfully in the Electronic Book Technology

"Dynatext" software [32]; the software from the University of Waterloo New Oxford English Dictionary Project; the

RISP LISP SGML text processor at Uniscope, Tokyo; the US Army Missile Command IADS hypertext system;

Mentor Graphics Context; Interleaf and Xerox Publishing System.

Versions

There are two current versions of XML. The first (XML 1.0) was initially defined in 1998. It has undergone minor

revisions since then, without being given a new version number, and is currently in its fifth edition, as published on

November 26, 2008. It is widely implemented and still recommended for general use.

The second (XML 1.1) was initially published on February 4, 2004, the same day as XML 1.0 Third Edition [33], and

is currently in its second edition, as published on August 16, 2006. It contains features (some contentious) that are

intended to make XML easier to use in certain cases [34]. The main changes are to enable the use of line-ending

characters used on EBCDIC platforms, and the use of scripts and characters absent from Unicode 3.2. XML 1.1 is

not very widely implemented and is recommended for use only by those who need its unique features. [35]

Prior to its fifth edition release, XML 1.0 differed from XML 1.1 in having stricter requirements for characters

available for use in element and attribute names and unique identifiers: in the first four editions of XML 1.0 the

characters were exclusively enumerated using a specific version of the Unicode standard (Unicode 2.0 to Unicode

3.2). The fifth edition substitutes the mechanism of XML 1.1, which is more future-proof but reduces redundancy.

The approach taken in the fifth edition of XML 1.0 and in all editions of XML 1.1 is that only certain characters are

forbidden in names, and everything else is allowed, in order to accommodate the use of suitable name characters in

future versions of Unicode. In the fifth edition, XML names may contain characters in the Balinese, Cham, or

Phoenician scripts among many others which have been added to Unicode since Unicode 3.2. [36]

Almost any Unicode code point can be used in the character data and attribute values of an XML 1.0 or 1.1

document, even if the character corresponding to the code point is not defined in the current version of Unicode. In

character data and attribute values, XML 1.1 allows the use of more control characters than XML 1.0, but, for

"robustness", most of the control characters introduced in XML 1.1 must be expressed as numeric character

references (and #x7F through #x9F, which had been allowed in XML 1.0, are in XML 1.1 even required to be

expressed as numeric character references [37] ). Among the supported control characters in XML 1.1 are two line

break codes that must be treated as whitespace. Whitespace characters are the only control codes that can be written

directly.
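The difference in control-character handling can be observed with a parser that supports both versions. The sketch below assumes the JDK's built-in JAXP parser, which implements XML 1.1; `ControlCharDemo` and `isWellFormed` are hypothetical names. The same numeric character reference to U+0001 is rejected in an XML 1.0 document and accepted in an XML 1.1 document.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical demo: the declared XML version changes which characters are legal. */
final class ControlCharDemo {

    /** Returns true if the parser accepts the text as well-formed XML. */
    static boolean isWellFormed(String xml) {
        try {
            // DefaultHandler rethrows fatal (well-formedness) errors as exceptions.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // U+0001 written as a numeric character reference.
        System.out.println(isWellFormed("<?xml version=\"1.0\"?><a>&#x1;</a>"));
        System.out.println(isWellFormed("<?xml version=\"1.1\"?><a>&#x1;</a>"));
    }
}
```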

There has been discussion of an XML 2.0, although no organization has announced plans for work on such a project.

XML-SW (SW for skunk works), written by one of the original developers of XML, contains some proposals for

what an XML 2.0 might look like: elimination of DTDs from syntax, integration of namespaces, XML Base and

XML Information Set (infoset) into the base standard.

The World Wide Web Consortium also has an XML Binary Characterization Working Group doing preliminary

research into use cases and properties for a binary encoding of the XML infoset. The working group is not chartered

to produce any official standards. Since XML is by definition text-based, ITU-T and ISO are using the name Fast

Infoset for their own binary infoset to avoid confusion (see ITU-T Rec. X.891 | ISO/IEC 24824-1).

See also

• Category:XML

• Binary XML

• XML Protocol

• List of XML markup languages

• Category:XML-based standards

• Comparison of layout engines (XML)

• Comparison of data serialization formats

• OpenDocument

Further reading

• Annex A of ISO 8879:1986 (SGML)

• Lawrence A. Cunningham (2005). "Language, Deals and Standards: The Future of XML Contracts". Washington

University Law Review. SSRN 900616 [38].

• Bosak, Jon; Tim Bray (May 1999). "XML and the Second-Generation Web". Scientific American. Online at XML

and the Second-Generation Web [39].

External links

• W3C XML homepage [40]

• XML 1.0 Specification [41]

• Introduction to Generalized Markup [42] by Charles Goldfarb

• Making Mistakes with XML [43] by Sean Kelly

• The Multilingual WWW [44] by Gavin Nicol

• Retrospective on Extended Reference Concrete Syntax [45] by Rick Jelliffe

• XML, Java and the Future of the Web [46] by Jon Bosak

• XML tutorials in w3schools [47]

• XML.gov [48]

• Thinking XML: The XML decade [49] by Uche Ogbuji

• XML: Ten year anniversary [50] by Eliot Kimber

• Five years later, XML... [51] by Simon St. Laurent

• 23 XML fallacies to watch out for [52] by Sean McGrath

• XML Injection [53] - Web Application Security Consortium

• W3C XML is Ten! [54], XML 10 years press release

References

[1] "XML Media Types, RFC 3023" (http://tools.ietf.org/html/rfc3023#section-3.2). IETF. 2001-01. pp. 9–11. Retrieved 2010-01-04.

[2] "XML Media Types, RFC 3023" (http://tools.ietf.org/html/rfc3023#section-3.1). IETF. 2001-01. pp. 7–9. Retrieved 2010-01-04.

[3] http://www.w3.org/TR/2008/REC-xml-20081126/

[4] http://www.w3.org/TR/2006/REC-xml11-20060816/

[5] http://www.w3.org/TR/rec-xml

[6] XML 1.0 Specification (http://www.w3.org/TR/REC-xml)

[7] "W3C DOCUMENT LICENSE" (http://www.w3.org/Consortium/Legal/2002/copyright-documents-20021231).

[8] "XML 1.0 Origin and Goals" (http://www.w3.org/TR/REC-xml/#sec-origin-goals). Retrieved July 2009.

[9] "XML Applications and Initiatives" (http://xml.coverpages.org/xmlApplications.html).

[10] "Introduction to iWork Programming Guide. Mac OS X Reference Library" (http://developer.apple.com/mac/library/documentation/AppleApplications/Conceptual/iWork2-0_XML/Chapter01/Introduction.html). Apple.

[11] http://www.w3.org/TR/2006/REC-xml-20060816/#charsets

[12] http://www.w3.org/TR/xml11/#charsets

[13] "Characters vs. Bytes" (http://www.tbray.org/ongoing/When/200x/2003/04/26/UTF).

[14] "Autodetection of Character Encodings" (http://www.w3.org/TR/REC-xml/#sec-guessing).

[15] It is allowed, but not recommended, to use "

[27] Sueann Ambron and Kristina Hooper, eds.; foreword by John Sculley (1988). "Publishers, multimedia, and interactivity". Interactive multimedia. Cobb Group. ISBN 1-55615-124-1.

[28] Eliot Kimber (2006). "XML is 10" (http://drmacros-xml-rants.blogspot.com/#116460437782808906).

[29] The working group was originally called the "Editorial Review Board." The original members, and seven who were added before the first edition was complete, are listed at the end of the first edition of the XML Recommendation, at http://www.w3.org/TR/1998/REC-xml-19980210.

[30] "Reports From the W3C SGML ERB to the SGML WG And from the W3C XML ERB to the XML SIG" (http://www.w3.org/XML/9712-reports.html). W3.org. Retrieved 2009-07-31.

[31] "Extensible Markup Language (XML)" (http://www.w3.org/TR/WD-xml-961114.html). W3.org. 1996-11-14. Retrieved 2009-07-31.

[32] Jon Bosak, Sun Microsystems (2006-12-07). "Closing Keynote, XML 2006" (http://2006.xmlconference.org/proceedings/162/presentation.html). 2006.xmlconference.org. Retrieved 2009-07-31.

[33] Extensible Markup Language (XML) 1.0 (Third Edition) (http://www.w3.org/TR/2004/REC-xml-20040204)

[34] "Extensible Markup Language (XML) 1.1 (Second Edition) – Rationale and list of changes for XML 1.1" (http://www.w3.org/TR/xml11/#sec-xml11). W3C. Retrieved 2006-12-21.

[35] Harold, Elliotte Rusty (2004). Effective XML (http://www.cafeconleche.org/books/effectivexml/). Addison-Wesley. pp. 10–19. ISBN 0321150406.

[36] "Extensible Markup Language (XML) 1.1 (Second Edition) – Rationale and list of changes for XML 1.1" (http://www.w3.org/TR/xml11/#dt-name). W3C. Retrieved 2009-12-11.

[37] http://www.w3.org/TR/xml11/#sec-xml11

[38] http://ssrn.com/abstract=900616

[39] http://www.scientificamerican.com/article.cfm?id=xml-and-the-second-genera

[40] http://www.w3.org/XML/

[41] http://www.w3.org/TR/REC-xml

[42] http://www.sgmlsource.com/history/AnnexA.htm

[43] http://www.developer.com/xml/article.php/10929_3583081_1

[44] http://www.mind-to-mind.com/library/papers/multilingual/multilingual-www.html

[45] http://xml.ascc.net/en/utf-8/ercsretro.html

[46] http://www.xml.com/pub/a/w3j/s3.bosak.html

[47] http://www.w3schools.com/xml/default.asp

[48] http://xml.gov/

[49] http://www-128.ibm.com/developerworks/library/x-think38.html

[50] http://drmacros-xml-rants.blogspot.com/2006/11/xml-ten-year-aniversary.html

[51] http://www.oreillynet.com/xml/blog/2003/02/five_years_later_xml.html

[52] http://www.itworld.com/xml-fallacies-nlstipsm-080122

[53] http://projects.webappsec.org/XML-Injection

[54] http://www.w3.org/2008/02/xml10-pressrelease

XML and MIME

XML

An XML document is a text document that consists of an XML declaration (optional in XML 1.0) and a single root

element with well-formed content.

Example XML Document
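A minimal sketch of such a document, embedded in Java so that it can be checked with the standard JAXP parser. The element names `note`, `subject` and `body`, the text content, and the class name are illustrative assumptions rather than a prescribed vocabulary.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical demo: an XML declaration followed by exactly one root element. */
final class MinimalDocument {

    static final String DOC =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
          + "<note>\n"
          + "  <subject>MIME</subject>\n"
          + "  <body>Blah</body>\n"
          + "</note>\n";

    /** Parses the text and returns the name of its root element. */
    static String rootName(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getNodeName();
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a well-formed XML document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootName(DOC));
    }
}
```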

MIME

Blah

MIME (Multipurpose Internet Mail Extensions) is an Internet Standard that allows email systems to interpret

complex data. Web browsers also use the MIME type to accurately display information or launch a separate

application to handle the data.

All MIME types (also called Internet media types) consist of two parts, in the form type/subtype.

This information is sent to the browser by a web server. Usually, the server determines the MIME type based on the

document's file extension. For example, the server would interpret an extension of .txt (plain text file) to have a

MIME type of text/plain.

XML Specific MIME Types

There are two MIME assignments for XML data. These are:

• application/xml (RFC 3023)

• text/xml (RFC 3023)

Because of the wide variety of documents that can be expressed using an XML syntax, additional MIME types are

needed to differentiate between languages. XML-based formats add a suffix of +xml to the MIME type.

The following are some examples of common XML media types.

• Registered

  • Extensible HyperText Markup Language (XHTML): application/xhtml+xml (RFC 3236)

  • Atom: application/atom+xml (RFC 4287)

• Registration-In-Progress

  • Extensible Stylesheet Language Transformations (XSLT): application/xslt+xml [1]

  • Scalable Vector Graphics (SVG): image/svg+xml [2]

• Unregistered

  • Mathematical Markup Language (MathML): application/mathml+xml

  • Really Simple Syndication (RSS 2.0): application/rss+xml

External links

• Official List of MIME Types [3]

• IBM article [4]

References

[1] http://www.w3.org/TR/xslt20/#xslt-mime-definition

[2] http://www.w3.org/TR/SVGMobile12/mimereg.html

[3] http://www.iana.org/assignments/media-types/

[4] http://www-128.ibm.com/developerworks/xml/library/x-mxd2.html

XML appliance

An XML appliance is a separate computer system with deliberately narrow functionality that exchanges XML

messages with other computer systems. XML appliances secure, accelerate and route XML traffic for messaging and

service-oriented architectures (SOAs). They are designed to be easy to install, configure and manage. While some

XML appliances rely on specialized

hardware and software to accelerate the processing of XML messages, others accomplish the same tasks using

standards-based hardware and operating systems.

History of XML appliances

The first XML appliances were created by DataPower in 1999 and by Sarvega and Forum Systems in 2001. The

engineers involved generally fell into two groups: some focused on large volumes of XML transformations, and

others focused on high-speed XML processing and security. The transformation group created specialized

software or application-specific integrated circuits (ASICs) that performed transformations up to 100 times faster than basic

software-only solutions. Although these systems had some early adopters, adoption was initially restricted to large

e-commerce sites such as Yahoo! and Amazon. The XML processing group created highly optimized appliances that

secured and integrated XML across many use cases. Early entrants in XML appliances include vendors such as

DataPower (now owned by IBM), Reactivity, Inc. (acquired by Cisco), Forum Systems, Layer 7 Technologies,

Vordel, and Sarvega (now owned by Intel).

These two approaches began to converge when a second generation of XML appliances started to appear around

2003, when these devices were used to exchange SOAP XML messages between computers on public networks.

These messages required advanced security features such as encryption, digital signatures and denial of service

attack prevention. Because the setup and configuration of software-only systems was time consuming, companies

could save a great deal of money by using appliances that were pre-packaged with WS-Security standards built in.

Common features of XML appliances

• They can validate XML messages for well-formedness as they enter or exit the appliance

• They include hardware and/or software customized for efficient XML parsing and analysis.

• They have built-in support for many XML standards such as XSLT, XPath, SOAP and WS-Security

Classification of XML appliances

Although the term XML appliance is the most general term to describe these devices, most vendors use alternative

terminology that describes more specific functionality of these devices. The following are alternative names used for

XML appliances:

• XML accelerators are devices that typically use custom hardware, or software built on standards-based

hardware, to accelerate XPath processing. This hardware typically provides a performance boost between 10 and

100 times in the number of messages per second that can be processed.

• Integration appliances (also known as application routers) are devices that are designed to make the integration

of computer systems easier.

• XML security gateways (also known as XML firewalls) are devices that support the WS-Security standards.

These appliances typically offload encryption and decryption to specialized hardware devices.

• XML Enabled Networking is an abstraction layer that exists alongside the traditional IP network. This layer

addresses the security, incompatibility and latency issues encumbering XML messages, web services and

service-oriented architectures (SOAs).

Notable XML appliance vendors

• Bloombase

• Citrix Systems (through acquisition of QuickTree [1])

• DataPower (now owned by IBM), see IBM WebSphere DataPower SOA Appliances

• F5 Networks

• Radware

• Solace Systems

• Xtradyne

• Cisco

See also

• XML

• XSLT

• SOAP

• XML Enabled Networking

• WS-Security

• Apache Axis

• Integration appliance

References

[1] http://community.citrix.com/display/ocb/2008/11/14/XML+Security+Features+in+Netscaler+9.0

XML Base

XML Base is a World Wide Web Consortium recommended facility for defining base URIs for parts of XML

documents.

The XML Base recommendation was adopted on 2001-06-27.

The attribute xml:base may be inserted in XML documents to specify a base URI other than the base URI of the

document or external entity. The value of this attribute is interpreted as a URI Reference as defined in RFC 3986

[IETF RFC 3986]. It serves the function described in section 5.1.1 of RFC 3986, establishing the base URI (or IRI)

for resolving any relative references found within the effective scope of the xml:base attribute.
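The resolution step can be sketched in Java with `java.net.URI`. This simplified example reads `xml:base` from the root element only and resolves a hypothetical `href` attribute against it; a complete implementation would also honor `xml:base` attributes inherited from ancestor elements and the base URI of the document itself. The class name, the `href` attribute, and the example.org URLs are assumptions.

```java
import java.io.StringReader;
import java.net.URI;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

/** Hypothetical demo: resolve a relative reference against xml:base. */
final class XmlBaseDemo {

    /** Resolves the root element's href attribute against its xml:base attribute. */
    static String resolveRootHref(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Element root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            // XMLConstants.XML_NS_URI is "http://www.w3.org/XML/1998/namespace".
            String base = root.getAttributeNS(XMLConstants.XML_NS_URI, "base");
            String href = root.getAttribute("href");
            return URI.create(base).resolve(href).toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not resolve reference", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolveRootHref(
                "<doc xml:base='http://example.org/today/' href='new.xml'/>"));
    }
}
```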

In namespace-aware XML processors, the "xml" prefix is bound to the namespace name

http://www.w3.org/XML/1998/namespace as described in Namespaces in XML [XML Names]. Note that xml:base can still be used by

non-namespace-aware processors.

External links

• XML Base W3C Recommendation [1]

References

[1] http://www.w3.org/TR/xmlbase/

XML Catalog

XML documents typically refer to external entities, for example the public and/or system ID for the Document Type

Definition. These external relationships are expressed using URIs, typically as URLs.

However, if they are absolute URLs, they only work when your network can reach them. Relying on remote

resources makes XML processing susceptible to both planned and unplanned network downtime.

Conversely, if they are relative URLs, they're only useful in the context where they were initially created. For

example, the URL "../../xml/dtd/docbookx.xml" will usually only be useful in very limited circumstances.

One way to avoid these problems is to use an entity resolver (a standard part of SAX) or a URI Resolver (a standard

part of JAXP). A resolver can examine the URIs of the resources being requested and determine how best to satisfy

those requests. The XML catalog is a document describing a mapping between external entity references and

locally-cached equivalents.

Example Catalog.xml

The following simple catalog shows how one might provide locally-cached DTDs for an XHTML page validation

tool, for example.

This catalog makes it possible to resolve -//W3C//DTD XHTML 1.0 Strict//EN to the local URI

dtd/xhtml1/xhtml1-strict.dtd. Similarly, it provides local URIs for two other public IDs.
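The mapping described above can be sketched in Java. This example swaps in the `javax.xml.catalog` API that ships with JDK 9 and later, rather than the Apache resolver discussed in the next section, and it contains only the one `public` entry whose public ID and local URI are named in the text. The class name, the temporary file, and the method name are assumptions.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.Catalog;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;

/** Hypothetical demo: look up a public ID in an OASIS XML catalog (JDK 9+). */
final class CatalogLookup {

    // One entry mapping the XHTML 1.0 Strict public ID to a locally cached DTD.
    static final String CATALOG =
            "<catalog xmlns='urn:oasis:names:tc:entity:xmlns:xml:catalog'>"
          + "<public publicId='-//W3C//DTD XHTML 1.0 Strict//EN'"
          + " uri='dtd/xhtml1/xhtml1-strict.dtd'/>"
          + "</catalog>";

    /** Writes the catalog to a temporary file and returns the URI mapped to the public ID. */
    static String resolvePublic(String publicId) {
        try {
            Path file = Files.createTempFile("catalog", ".xml");
            Files.write(file, CATALOG.getBytes(StandardCharsets.UTF_8));
            Catalog catalog = CatalogManager.catalog(CatalogFeatures.defaults(), file.toUri());
            return catalog.matchPublic(publicId); // null when the catalog has no match
        } catch (IOException e) {
            throw new IllegalStateException("Could not write catalog file", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolvePublic("-//W3C//DTD XHTML 1.0 Strict//EN"));
    }
}
```

The relative `uri` value is resolved against the location of the catalog file, so the returned URI is absolute and ends with the local path.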

Note that the document above includes a DOCTYPE - this may cause the parser to attempt to access the system ID

URL for the DOCTYPE (i.e. http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd) before the

catalog resolver is fully functioning, which is probably undesirable. To prevent this, simply remove the DOCTYPE

declaration.

The following example shows this, and also shows the equivalent <system/> declarations as an alternative to

<public/> declarations.

Using a Catalog - Java SAX Example

Catalog resolvers are available for various programming languages. The following example shows how, in Java, a

SAX parser may be created to parse some input source in which the

org.apache.xml.resolver.tools.CatalogResolver is used to resolve external entities to

locally-cached instances. This resolver originates from Apache Xerces but is now included with the Sun Java

runtime.

Simply create a SAXParser in the normal way, using factories. Obtain the XML reader and set the entity resolver

to the standard one (CatalogResolver) or another of your own.

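import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.apache.xml.resolver.tools.CatalogResolver;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Create the parser through the usual factory and obtain its XMLReader.
final SAXParser saxParser = SAXParserFactory.newInstance().newSAXParser();
final XMLReader reader = saxParser.getXMLReader();

final ContentHandler handler = ...;
final InputSource input = ...;

// Resolve external entities through the catalog.
reader.setEntityResolver( new CatalogResolver() );
reader.setContentHandler( handler );
reader.parse( input );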

It is important to call the parse method on the reader, not on the SAX parser.

See also

• XML Catalogs. OASIS Standard, Version 1.1. 07-October-2005. [1]

• XML Entity and URI Resolvers [2], Sun

• XML Catalog Manager [3] project on Sourceforge

• XML Catalogs for .NET and Mono [4]

References

[1] http://www.oasis-open.org/committees/download.php/14810/xml-catalogs.pdf

[2] http://java.sun.com/webservices/docs/1.6/jaxb/catalog.html

[3] http://xmlcatmgr.sourceforge.net/

[4] http://xmlcatalog.net/

XML Certification Program

XML Certification Program (XML Master) is an IT professional certification for XML and related technologies. There are two levels of XML certification, XML Master Basic and XML Master Professional, and more than 18,000 examinees have passed these examinations.

Certification paths

XML Master Professional Application Developer Certification

• XML Master Professional Application Developer is a certification for professionals who have demonstrated the

ability to use technology in developing applications that deal with XML data.

XML Master Professional Application Developer Certification Requirements

• Pass the XML Master Basic exam and the XML Master Professional Application Developer certification exam.

XML Master Professional Application Developer Certification Exam

• Duration => 90 minutes

• Number of Questions => 45 questions

• Required Passing Score => 70%

XML Master Professional Application Developer Certification Exam Topics

• Section 1 - DOM / SAX

• Section 2 - DOM / SAX Programming

• Section 3 - XSLT

• Section 4 - XML Schema

• Section 5 - XML Processing System Design Technology

• Section 6 - Utilizing XML

XML Master Professional Database Administrator Certification

• The XML Master Professional Database Administrator is a certification for professionals who have demonstrated the ability to work with XQuery and XML databases (XMLDB).

XML Master Professional Database Administrator Certification Requirements

• Pass the XML Master Basic exam and the XML Master Professional Database Administrator certification exam.

XML Master Professional Database Administrator Certification Exam

• Duration => 90 minutes

• Number of Questions => 30 questions

• Required Passing Score => 80%

XML Master Professional Database Administrator Certification Exam Topics

• Section 1 - Overview

• Section 2 - XQuery, XPath

• Section 3 - Manipulating XML Data

• Section 4 - Creating XML Schema and Other XML Database Objects

XML Master Basic Certification

• XML Master Basic is a certification for professionals who have demonstrated the ability to use XML and related

technologies.

XML Master Basic Certification Requirements

• Pass the XML Master Basic certification exam.

XML Master Basic Certification Exam

• Duration => 90 minutes

• Number of Questions => 50 questions

• Minimum Passing Score => 70%

XML Master Basic Certification Exam Topics

• Section 1 - XML Overview

• Section 2 - Creating XML Documents

• Section 3 - DTD

• Section 4 - XML Schema

• Section 5 - XSLT, XPath

• Section 6 - Namespace

For Certification Exam Takers

Exam Fee

Each certification exam costs US$125.

Exam Enrollment

The XML Master exams are available daily at Prometric Authorized Testing Centers. To take an exam, schedule a day and time on the Prometric website [1].

External links

XML Certification Program (XML Master) official website

• Introduction to XML Certification Program: XML Master [2]

• XML Master Certification Practice Exam [3]

• XML Master Certification Success Stories [4]

XML Master Basic Certification Exam Preparation Links

Section 1 - XML Overview

• a. Overview of XML

• XML features [5]

• Purpose of XML [6]

• b. Overview of related XML technologies

• Names for and overview of XML-related technologies defined by the W3C or other standards organizations (XPath, XLink, XQuery, XPointer, DOM, SAX, SOAP, XHTML, etc.) [7]

• Names for and overview of applicable XML specifications defined according to industry or purpose by the

W3C or other standards organizations [6]

• Purpose of schema definition language defining XML structure [8]

• Differences in defined content and functions of XML Schema and DTD [8]

Section 2 - Creating XML Documents

• a. Syntax

• Naming rules, usable characters defined within an XML document [9]

• Methods for coding XML documents utilizing tags [10]

• Rules for coding declarations, elements, comments, character references, and processing instructions comprising an XML document [10]

• Methods for coding character data and markup (tags, references, comments, etc.) comprising an XML document [11]

• The role of an XML processor (XML parser) [12]

• b. Elements, attributes, entities

• Coding elements that include attributes [13]

• Types of entities [14]

• Handling entities and references using an XML processor [15]

• Usage of character references [16]

• Usage of Predefined entities [17]

• Method for referencing entities [17]

• c. Valid XML documents, well-formed XML documents

• Well-formed XML document coding methods [18]

• Coding methods to ensure valid XML documents [8]

• Differences between valid XML documents and well-formed XML documents [19]

• Creating valid XML documents for defined DTDs [19]

• Creating valid XML documents for defined XML Schema [19]

• d. Special characters/ character codes, encoding/ normalizing XML documents

• Character references [16]

• XML declarations and text declarations [20]

• Handling white spaces [21]

• End-of-line handling in XML documents [22]

• Normalizing attribute values [23]

Section 3 - DTD

• a. Basics

• Document type declarations [24]

• Methods for coding DTD internal subsets and external subsets [24]

• Differences between DTD internal subsets and external subsets [24]

• Internal entities and external entities, Parsed entities and unparsed entities [24]

• b. Content model/element type declarations/attribute-list declarations/actual processing/entity declarations

• Element type declarations [25]

• Content model definitions for elements [25]

• Attribute-list declarations [26]

• Attribute types [26]

• Attribute defaults [26]

• Entity declarations [27]

Section 4 - XML Schema

• a. Basics

• XML Schema document structure [28]

• XML Schema Namespace [28]

• Mapping between XML documents and XML schema documents [29]

• b. Data types/ coding methods/ actual processing

• XML Schema embedded data types [30]

• Simple type and complex type [31]

• Type extensions and restrictions [29]

• Element definitions [30]

• Attribute definitions [30]

Section 5 - XSLT, XPath

• a. Basics

• Purpose of XSLT [32]

• Application use of XSLT [33]

• XSLT stylesheet structure [34]

• XSLT Namespace [34]

• b. Elements/ templates/ character encoding/ actual transformation processing

• Coding methods and related functions for well-known XSLT elements [35]

• Template rules and templates [36]

• Pattern coding and matching patterns and nodes [36]

• Output processing using XSLT [37]

• c. Coding XPath expressions within a stylesheet

• Basic operators [36]

• Basic functions [36]

• Basic coding methods for location paths (designating tree structure nodes) [36]

Section 6 - Namespace

• a. XML namespaces

• XML namespace defined content [38]

• Application use of XML namespace [38]

• XML namespace coding methods [38]

• XML namespace scope (effective scope) [39]

External links

• XML Master Trainings [40] (German)

• XML Master Basic Training [41] (German)

References

[1] http://securereg3.prometric.com/

[2] http://www.xmlmaster.org/en

[3] http://www.xmlmaster.org/en/practice_exam/

[4] http://www.xmlmaster.org/en/success/index.html

[5] http://www.w3schools.com/xml/xml_whatis.asp

[6] http://www.w3schools.com/xml/xml_usedfor.asp

[7] http://www.xmlmaster.org/en/article/d01/c01/

[8] http://www.w3schools.com/schema/schema_why.asp

[9] http://www.w3schools.com/xml/xml_elements.asp

[10] http://www.w3schools.com/xml/xml_syntax.asp

[11] http://www.w3schools.com/xml/xml_cdata.asp

[12] http://www.w3schools.com/dtd/dtd_validation.asp

[13] http://www.w3schools.com/xml/xml_attributes.asp

[14] http://www.w3.org/TR/2006/REC-xml-20060816/#sec-entity-decl

[15] http://www.w3.org/TR/2006/REC-xml-20060816/#TextEntities

[16] http://www.w3.org/TR/2004/REC-xml-20040204/#sec-entexpand

[17] http://www.w3.org/TR/2006/REC-xml-20060816/#sec-references

[18] http://www.xmlmaster.org/en/article/d01/c02/

[19] http://www.w3schools.com/xml/xml_dtd.asp

[20] http://www.w3schools.com/xml/xml_encoding.asp

[21] http://www.w3.org/TR/2006/REC-xml-20060816/#sec-white-space

[22] http://www.w3.org/TR/2006/REC-xml-20060816/#sec-line-ends

[23] http://www.w3.org/TR/2006/REC-xml-20060816/

[24] http://www.xmlmaster.org/en/article/d01/c03/

[25] http://www.w3schools.com/dtd/dtd_elements.asp

[26] http://www.w3schools.com/dtd/dtd_attributes.asp

[27] http://www.w3schools.com/dtd/dtd_entities.asp

[28] http://www.w3schools.com/schema/schema_schema.asp

[29] http://www.xmlmaster.org/en/article/d01/c06/

[30] http://www.xmlmaster.org/en/article/d01/c04/

[31] http://www.xmlmaster.org/en/article/d01/c05/

[32] http://www.w3schools.com/xsl/xsl_intro.asp

[33] http://www.xmlmaster.org/en/article/d01/c07/

[34] http://www.w3schools.com/xsl/xsl_transformation.asp

[35] http://www.w3schools.com/xsl/xsl_templates.asp

[36] http://www.xmlmaster.org/en/article/d01/c08/

[37] http://www.w3schools.com/xsl/el_output.asp

[38] http://www.w3schools.com/xml/xml_namespaces.asp

[39] http://www.xmlmaster.org/en/article/d01/c10/

[40] http://www.digicomp.ch/xml

[41] http://www.data2type.de/leistungen/schulungen/xmlmaster

XML Configuration Access Protocol

The XML Configuration Access Protocol (XCAP) is an application layer protocol that allows a client to read, write, and modify application configuration data stored in XML format on a server.

Overview

XCAP maps XML document sub-trees and element attributes to HTTP URIs, so that clients can access these components directly over HTTP. XCAP clients use an XCAP server to store data such as buddy lists and presence policy. Combined with a SIP Presence server that supports the PUBLISH, SUBSCRIBE and NOTIFY methods, this provides a complete SIP SIMPLE server solution.

Features

The following operations are supported via XCAP in a client-server interaction:

• Retrieve an item

• Delete an item

• Modify an item

• Add an item

The operations above can be executed on the following items:

• Document

• Element

• Attribute

The XCAP addressing mechanism is based on XPath, which provides the ability to navigate the XML tree.
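As a rough illustration of this addressing scheme, the sketch below assembles a node URI following the layout described in RFC 4825: the XCAP root, the AUID, the "users" tree, the user identifier and the document name, then the "~~" separator and an XPath-like node selector. The host name, user and list name are made-up values, and the escaping is deliberately simplified to the three characters this example needs.

```java
/** Sketch of XCAP URI construction; the names and values are illustrative. */
class XcapUri {
    /** Percent-encodes the few node-selector characters used in this sketch. */
    static String escape(String nodeSelector) {
        return nodeSelector.replace("[", "%5B")
                           .replace("]", "%5D")
                           .replace("\"", "%22");
    }

    /** Builds the URI of a document in a user's home directory. */
    static String documentUri(String xcapRoot, String auid, String xui, String docName) {
        return xcapRoot + "/" + auid + "/users/" + xui + "/" + docName;
    }

    /** Appends the "~~" separator and the escaped node selector. */
    static String nodeUri(String documentUri, String nodeSelector) {
        return documentUri + "/~~/" + escape(nodeSelector);
    }

    public static void main(String[] args) {
        String doc = documentUri("http://xcap.example.com", "resource-lists",
                                 "sip:joe@example.com", "index");
        // Selects the <list> element whose name attribute is "l1".
        System.out.println(nodeUri(doc, "resource-lists/list[@name=\"l1\"]"));
    }
}
```

A client would then issue an HTTP GET, PUT or DELETE against the resulting URI to retrieve, add or modify, or delete the selected element.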

Application Usages

The following application usages are defined for XCAP, each identified by a specific AUID (Application Unique ID):

• XCAP capabilities (auid = xcap-caps).

• Resource lists (auid = resource-lists). A resource lists application is any application that needs access to a list of

resources, identified by a URI, to which operations, such as subscriptions, can be applied.

• Presence rules (auid = pres-rules, org.openmobilealliance.pres-rules). A Presence Rules application is an

application which uses authorization policies, also known as authorization rules, to specify what presence

information can be given to which watchers, and when.

• RLS services (auid = rls-services). A Resource List Server (RLS) services application is a Session Initiation Protocol (SIP) application in which a server receives SIP SUBSCRIBE requests for a resource and generates subscriptions towards the resource list.

• PIDF manipulation (auid = pidf-manipulation). The pidf-manipulation application usage defines how XCAP is used to manipulate the contents of PIDF-based presence documents.

Standards

The XCAP protocol is based on the following IETF standards:

RFC4825 [1] , RFC4826 [2] , RFC4827 [3] , RFC5025 [4]

External links

• XCAP Tutorial [5]

• OpenXCAP [6]

References

[1] RFC4825 (http://www.ietf.org/rfc/rfc4825.txt)

[2] RFC4826 (http://www.ietf.org/rfc/rfc4826.txt)

[3] RFC4827 (http://www.ietf.org/rfc/rfc4827.txt)

[4] RFC5025 (http://www.ietf.org/rfc/rfc5025.txt)

[5] http://www.jdrosen.net/papers/xcap-tutorial.ppt

[6] http://openxcap.org/

XML Control Protocol

XML Control Protocol, or XCP, was launched as an April Fools' Day joke on April 1, 2004. It was pitched as a

drop-in replacement for TCP with the slogan "Light the Fiber!". The web site put up for the occasion now seems to

be owned by a link farm.

External links

• TCP is So Over by Tim Bray [1]

• Former XCP home page [2]

References

[1] http://www.tbray.org/ongoing/When/200x/2004/04/01/XCP

[2] http://www.x-cp.org/

XML data binding

XML data binding refers to the process of representing the information in an XML document as an object in

computer memory. This allows applications to access the data in the XML from the object rather than using the

DOM or SAX to retrieve the data from a direct representation of the XML itself.

An XML data binder accomplishes this by automatically creating a mapping between elements of the XML schema

of the document we wish to bind and members of a class to be represented in memory.

When this process is applied to convert an XML document to an object, it is called unmarshalling. The reverse process, to serialize an object as XML, is called marshalling.
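The two directions can be sketched by hand with the DOM and transformer APIs in the JDK. The BoundBook class and its XML layout (a book element with a year attribute and a title child) are hypothetical; a data binder would generate an equivalent mapping automatically from the schema rather than require it to be written out like this.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

/** Hypothetical bound class for <book year="..."><title>...</title></book>. */
class BoundBook {
    final String title;
    final int year;

    BoundBook(String title, int year) {
        this.title = title;
        this.year = year;
    }

    /** Unmarshalling: XML text to object. */
    static BoundBook unmarshal(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            String title = root.getElementsByTagName("title").item(0).getTextContent();
            return new BoundBook(title, Integer.parseInt(root.getAttribute("year")));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Marshalling: object to XML text. */
    String marshal() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("book");
            root.setAttribute("year", Integer.toString(year));
            Element titleElement = doc.createElement("title");
            titleElement.setTextContent(title);
            root.appendChild(titleElement);
            doc.appendChild(root);
            // An identity transformer serializes the DOM tree to text.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling BoundBook.unmarshal on a document and then marshal on the result yields XML carrying the same title and year, although anything the class does not model (comments, extra elements) is dropped along the way.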

Since XML is inherently sequential and objects are (usually) not, XML data binding mappings often have difficulty

preserving all the information in an XML document. Specifically, information like comments, XML entity

references, and sibling order may fail to be preserved in the object representation created by the binding application.

This is not always the case; sufficiently complex data binders are capable of preserving 100% of the information in

an XML document.

Similarly, since objects in computer memory are not inherently sequential, and may include links to other objects

(including self-referential links), XML data binding mappings often have difficulty preserving all the information

about an object when it is marshalled to XML.

An alternative approach to automatic data binding relies instead on hand-crafted XPath expressions that extract the

data from XML. This approach has a number of benefits. First, the data binding code only needs proximate

knowledge (e.g., topology, tag names, etc.) of the XML tree structure, which developers can determine by looking at

the XML data; XML schemas are no longer mandatory. Furthermore, XPath allows the application to bind the

relevant data items and filter out everything else, avoiding the unnecessary processing that would be required to

completely unmarshall the entire XML document. The drawback of this approach is the lack of automation in

implementing the object model and XPath expressions. Instead the application developers have to create these

artifacts manually.
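The XPath-based alternative might look like the following sketch, which uses the javax.xml.xpath API from the JDK. The OrderSummary class, the order document layout and the two XPath expressions are assumptions made for illustration; only the customer name and the sum of the item prices are bound, and everything else in the document is ignored.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical object model holding only the data the application needs. */
class OrderSummary {
    final String customer;
    final double total;

    OrderSummary(String customer, double total) {
        this.customer = customer;
        this.total = total;
    }

    /** Binds two hand-written XPath expressions to the fields above. */
    static OrderSummary bind(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // String-valued expression: text of the customer's name element.
            String customer = xpath.evaluate("/order/customer/name", doc);
            // Number-valued expression: sum over all price attributes.
            Double total = (Double) xpath.evaluate(
                    "sum(/order/item/@price)", doc, XPathConstants.NUMBER);
            return new OrderSummary(customer, total);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        OrderSummary summary = bind(
            "<order id=\"7\"><customer><name>Ada</name>"
          + "<email>ada@example.org</email></customer>"
          + "<item price=\"10\"/><item price=\"2.5\"/></order>");
        System.out.println(summary.customer + " owes " + summary.total);
    }
}
```

No schema is consulted here: the expressions encode all the structural knowledge, which is both the convenience and the maintenance cost described above.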

Data binding in general

One of XML data binding's strengths is the ability to serialize and deserialize objects across programs, languages, and platforms. You can dump a time series of structured objects from a datalogger written in C on an embedded processor, bring it across the network to process in Perl and finally visualize it in Mathematica. The structure and the data remain consistent and coherent throughout the journey, and no custom formats or parsers are required. This is not unique to XML. YAML, for example, is emerging as a powerful data binding alternative to XML. JSON (which can be regarded as a subset of YAML) is often suitable for lightweight or restricted applications.

External links

• XML Data Binding Resources [1] , by Ronald Bourret

• XML Schema Patterns for Databinding Working Group [2]

See also

• Bound control

• Data structure

• JSON

• Serialization

• YAML

References

[1] http://www.rpbourret.com/xml/XMLDataBinding.htm

[2] http://www.w3.org/2002/ws/databinding

XML database

An XML database is a data persistence software system that allows data to be stored in XML format. This data can

then be queried, exported and serialized into the desired format.

Two major classes of XML database exist:

1. XML-enabled: these map all XML to a traditional database (such as a relational database [1] ), accepting XML as

input and rendering XML as output. This term implies that the database does the conversion itself (as opposed to

relying on middleware).

2. Native XML (NXD): the internal model of such databases depends on XML and uses XML documents as the

fundamental unit of storage, which are, however, not necessarily stored in the form of text files.

Rationale for XML in databases

O'Connell (2005, 9.2) gives one reason for the use of XML in databases: the increasingly common use of XML for

data transport, which has meant that "data is extracted from databases and put into XML documents and vice-versa".

It may prove more efficient (in terms of conversion costs) and easier to store the data in XML format.

Native XML databases

The term "native XML database" (NXD) can lead to confusion. Many NXDs do not function as standalone databases

at all, and do not really store the native (text) form.

The formal definition from the XML:DB initiative (which appears to have been inactive since 2003 [2] ) states that a native XML database:

• Defines a (logical) model for an XML document — as opposed to the data in that document — and stores and

retrieves documents according to that model. At a minimum, the model must include elements, attributes,

PCDATA, and document order. Examples of such models include the XPath data model, the XML Infoset, and

the models implied by the DOM and the events in SAX 1.0.

• Has an XML document as its fundamental unit of (logical) storage, just as a relational database has a row in a

table as its fundamental unit of (logical) storage.

• Need not have any particular underlying physical storage model. For example, NXDs can use relational,

hierarchical, or object-oriented database structures, or use a proprietary storage format (such as indexed,

compressed files).

Additionally, many XML databases provide a logical model of grouping documents, called "collections". Databases

can set up and manage many collections at one time. In some implementations, a hierarchy of collections can exist,

much in the same way that an operating system's directory-structure works.

All XML databases now support at least one form of querying syntax. Minimally, just about all of them support

XPath for performing queries against documents or collections of documents. XPath provides a simple pathing

system that allows users to identify nodes that match a particular set of criteria.
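The idea of running one XPath expression against a collection of documents can be sketched with the JDK alone. The CollectionQuery class below is a hypothetical illustration, not the API of any database listed in this article: the "collection" is simply a list of XML strings held in memory.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical in-memory stand-in for querying a document collection. */
class CollectionQuery {
    /** Returns the text of every node, in any document, matching the expression. */
    static List<String> query(List<String> collection, String expression) {
        List<String> results = new ArrayList<>();
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            XPath xpath = XPathFactory.newInstance().newXPath();
            for (String xml : collection) {
                Document doc = builder.parse(new InputSource(new StringReader(xml)));
                NodeList nodes = (NodeList) xpath.evaluate(
                        expression, doc, XPathConstants.NODESET);
                for (int i = 0; i < nodes.getLength(); i++) {
                    results.add(nodes.item(i).getTextContent());
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return results;
    }
}
```

For instance, the expression /book[@year > 2000]/title selects the titles of only those book documents whose year attribute exceeds 2000. A real XML database would typically answer such a query from indexes rather than by parsing every document.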

In addition to XPath, many XML databases support XSLT as a method of transforming documents or query results retrieved from the database. XSLT is a declarative language written using an XML grammar. A stylesheet defines a set of template rules, selected by XPath patterns, that transform documents (in part or in whole) into other formats, including plain text, XML, or HTML.
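A small transformation of this kind can be run with the javax.xml.transform API included in the JDK. In the sketch below, the TitleList class, the library/book/title document layout and the stylesheet itself are illustrative assumptions; the stylesheet turns a list of books into plain text with one semicolon after each title.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Sketch: apply an XSLT 1.0 stylesheet to an XML string. */
class TitleList {
    // Illustrative stylesheet: one template rule matching the root element.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/library\">"
      + "<xsl:for-each select=\"book\">"
      + "<xsl:value-of select=\"title\"/><xsl:text>;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Compiles the stylesheet and returns the transformed document as text. */
    static String transform(String stylesheet, String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The match and select attributes are XPath expressions, which is where XSLT and XPath meet: the former decides what to emit, the latter which nodes to emit it for.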

Many XML databases also support XQuery to perform querying. XQuery includes XPath as a node-selection method, but extends XPath to provide transformational capabilities. Users sometimes refer to its syntax as "FLWOR" (pronounced 'Flower') because a query may include the following clauses: 'For', 'Let', 'Where', 'Order by' and 'Return'. Traditional RDBMS vendors, who previously offered SQL-only engines, are now shipping hybrid SQL and XQuery engines. Hybrid SQL/XQuery engines make it possible to query XML data alongside relational data in the same query expression, which helps in combining relational and XML data.

Some XML databases support an API called the XML:DB API (or XAPI) as a form of implementation-independent access to the XML datastore. In XML databases, XAPI resembles ODBC and JDBC as used with relational databases. On 24 June 2009, the Java Community Process released the final version of the XQuery API for Java specification (XQJ) [3] - "a common API that allows an application to submit queries conforming to the W3C XQuery 1.0 specification and to process the results of such queries".

Databases known to support XQuery, XQJ, XML:DB, or a RESTful API

XML Database | License | Language | XQJ API | XML:DB API | RESTful API | Transaction Support
Apache XIndice (no longer maintained [4]) | Open source | Java | No | Yes | No | No
BaseX | Open source | Java | Yes | Yes | Yes | Yes
Gemfire Enterprise | Commercial | Unknown | No | Yes | No | Yes
DOMSafeXML | Commercial | Unknown | No | Yes | No | Yes
eXist | Open source | Java | No | Yes | Yes | No
MarkLogic Server | Commercial | C++ | No | No | Yes | Yes
MonetDB/XQuery | Open source | C++ | No | Yes | No | No
myXMLDB | Open source | Java | No | Yes | No | Unknown
OZONE | Open source | Java | No | Yes | No | Yes
Sedna | Open source | C++ | Yes | Yes | No | Yes
Tamino | Commercial | Unknown | No | Partial | No | Unknown
TeXtML | Commercial | Unknown | Unknown | Unknown | No | Yes
Xpriori XMS | Commercial | C++ | No | No | No | Yes

Implementations

• Apache Xindice [5] (previous name: dbxml)

• BaseX [6] native, open-source XML Database developed at the University of Konstanz. Supports XQuery and Full

Text [7] and Update [8] extensions.

• BSn/NONMONOTONIC Lab: IB Search Engine [9] , embeddable XML++ search engine using a generic/abstract

model and a mix of polymorphic objects types. Spin-off from the Isearch project.

• Clusterpoint Storage Engine [10] , an XML storage engine geared towards high-volume applications and

millisecond query times.

• DB2 9 Express-C [11] , no-charge hybrid relational/XML data server with PureXML

• EMC Documentum xDB [12] , a commercial native XML database including XQuery implementation, embeddable

• eXist-db [13] , open-source native XML database, written in Java

• Gemstone Systems' GemFire Enterprise [14] , a commercial XML database

• MarkLogic Server [15] , a native XML database which uses XQuery.

• M/DB:X [16] , a lightweight, REST-interfaced native XML database designed for use as a Cloud database.

• MonetDB/XQuery [17] - XQuery processor on top of the MonetDB relational database system. Also supports

W3C XQUF [8] updates. Open source.

• Oracle XML DB [18] , XML Enabled (known as Oracle XDB as of Oracle 10g); despite its name, it does not support the XML:DB API.

• Oracle Berkeley DB XML [19] , XML Enabled, embedded database; built on top of the Berkeley DB (a key-value

database).

• Sedna XML Database [20] , Open source XML database developed by MODIS [21] team at Institute for System

Programming [22] . Supports XQuery, Updates, XQJ API, Transactions and Triggers

• SQL Server 2005 [23] , free Express Edition with full XML features

• Tamino XML Server [24] , native XML database with support for XQuery, XQuery Update, Transactions and Server Extensions.

• TEXTML Server [25] , a native XML database combined with a full-text search engine.

• TigerLogic XDMS [26] native XML Database

• Timber [27] , a native XML database system developed at the University of Michigan

• Qizx 3.0 [28] a native XQuery database engine written in Java (free & open source edition available)

• XStreamDB [29] , native XML Database

• Xpriori XMS [30] , XMS is a completely self constructing native XML database.

External references

• XML Databases - The Business Case, Charles Foster, June 2008 [31] - Talks about the current state of Databases

and data persistence, how the current Relational Database model is starting to crack at the seams and gives an

insight into a strong alternative for today's requirements.

• An XML-based Database of Molecular Pathways (2005-06-02) [32] Speed / Performance comparisons of eXist,

X-Hive, Sedna and Qizx/open

• XML Native Database Systems: Review of Sedna, Ozone, NeoCoreXMS [33] 2006

• XML Data Stores: Emerging Practices [34]

• Bhargava, P.; Rajamani, H.; Thaker, S.; Agarwal, A. (2005) XML Enabled Relational Databases, Texas, The

University of Texas at Austin.

• O'Connell, S. Advanced Databases Course Notes, Southampton, University of Southampton, 2005

• Initiative for XML Databases [35]

• XML and Databases, Ronald Bourret, September 2005 [36]

• XML Database Products, Ronald Bourret, 2000-2009 [37]

• The State of Native XML Databases, Elliotte Rusty Harold, August 13, 2007 [38]

• XML for DB2 Information Integration [39] , an IBM Redbook that has a chapter on XML and databases (1st

chapter).

References

[1] Mustafa Atay and Shiyong Lu, “Storing and Querying XML: An Efficient Approach Using Relational Databases”, ISBN 3639115813, VDM

Verlag, 2009.

[2] http://xmldb-org.sourceforge.net/faqs.html

[3] http://jcp.org/en/jsr/detail?id=225

[4] http://www.oreillynet.com/onjava/blog/2006/03/dont_be_misled_xindice_is_dead.html

[5] http://xml.apache.org/xindice/

[6] http://basex.org/

[7] http://www.w3.org/TR/xpath-full-text-10/

[8] http://www.w3.org/TR/xqupdate/

[9] http://www.ibu.de/node/52

[10] http://www.clusterpoint.com/

[11] http://ibm.com/db2/viper/

[12] http://www.emc.com/products/detail/software/documentum-xdb.htm

[13] http://exist.sourceforge.net/

[14] http://www.gemstone.com/products/gemfire/enterprise.php

[15] http://www.marklogic.com/

[16] http://www.mgateway.com/mdbx.html

[17] http://monetdb.cwi.nl/XQuery/

[18] http://www.oracle.com/technology/tech/xml/xmldb/index.html

[19] http://www.oracle.com/database/berkeley-db/xml/index.html

[20] http://modis.ispras.ru/sedna

[21] http://modis.ispras.ru

[22] http://ispras.ru

[23] http://www.microsoft.com/sql/default.mspx

[24] http://www.softwareag.com/corporate/products/wm/tamino/

[25] http://www.ixiasoft.com/textmlserver

[26] http://www.rainingdata.com/products/tl/index.html

[27] http://www.eecs.umich.edu/db/timber/

[28] http://www.xmlmind.com/qizx/

[29] http://bluestream.com/products/xstreamdb32

[30] http://www.xpriori.com

[31] http://www.cfoster.net/articles/xmldb-business-case

[32] http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-3717

[33] http://swing.felk.cvut.cz/index.php?option=com_docman&task=doc_view&gid=5&Itemid=62

[34] http://csdl2.computer.org/persagen/DLAbsToc.jsp?resourcePath=/dl/mags/ic/&toc=comp/mags/ic/2005/02/w2toc.xml&DOI=10.1109/MIC.2005.48

[35] http://xmldb-org.sourceforge.net

[36] http://www.rpbourret.com/xml/XMLAndDatabases.htm

[37] http://www.rpbourret.com/xml/XMLDatabaseProds.htm

[38] http://cafe.elharo.com/xml/the-state-of-native-xml-databases/

[39] http://publib-b.boulder.ibm.com/Redbooks.nsf/RedbookAbstracts/sg246994.html

XML editor

An XML editor is a markup language editor with added functionality to facilitate the editing of XML. XML can be edited in a plain text editor, with all the code visible, but XML editors add facilities such as tag completion and menus and buttons for tasks that are common in XML editing, based on data supplied with a document type definition (DTD) or the XML tree.

There are also graphical XML editors that hide the code in the background and present the content to the user in a

more user-friendly format, approximating the rendered version or editing forms. This is helpful for situations where

people who are not fluent in XML code need to enter information in XML based documents such as time sheets and

expenditure reports. And even if the user is familiar with XML, use of such editors, which take care of syntax

details, is often faster and more convenient.

Functionality beyond syntax highlighting

An XML editor goes beyond the syntax highlighting offered by many plaintext editors and generic source code

editors, verifying the XML source based on an XML Schema or XML DTD, and some can do it as the document is

being edited in real time. Other features of an editor designed specifically for editing XML might include element

word completion and automatic appending of a closing tag whenever an opening tag is entered. These features can

help to prevent typographically originating errors in the XML code. Some XML editors provide for the ability to run

an XSLT transform, or series of transforms, over a document. Some of the larger XML packages even offer XSLT

debugging features and XSL-FO processors for generation of PDF files from documents.
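The validation step that such an editor runs behind the scenes can be approximated with the javax.xml.validation API in the JDK. The SchemaCheck class and the one-element schema used in the usage note are hypothetical; the sketch collects error messages with their line numbers, much as an editor would before underlining the offending lines.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

/** Sketch: validate an XML string against an XML Schema and list the problems. */
class SchemaCheck {
    static String describe(SAXParseException e) {
        return "line " + e.getLineNumber() + ": " + e.getMessage();
    }

    /** Returns an empty list when the document is valid. */
    static List<String> problems(String xsd, String xml) {
        final List<String> found = new ArrayList<>();
        try {
            Schema schema = SchemaFactory
                    .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) {
                    // Warnings are ignored in this sketch.
                }
                @Override public void error(SAXParseException e) {
                    found.add(describe(e)); // keep going to collect more errors
                }
                @Override public void fatalError(SAXParseException e) throws SAXException {
                    found.add(describe(e));
                    throw e; // not well-formed: stop here
                }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException e) {
            if (found.isEmpty()) {
                found.add(e.getMessage());
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return found;
    }
}
```

With a schema that declares a single year element of type xs:integer, the document <year>2010</year> produces an empty list, while <year>soon</year> produces at least one message. An editor that validates as the user types would rerun a check like this after each change.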

Textual editors

Text XML editors generally provide features for working with element tags. Syntax highlighting is a basic standard of any XML editor; that is, they color element text differently from regular text. Element and attribute completion based on a DTD or schema is also available from many text XML editors. Displaying line numbers is also a common and useful feature, as is the ability to reformat a document to conform to a particular indentation style.

[Figure: an XML document being edited in a text editor with syntax coloring]

The advantage of text editors is that they present exactly the information that is stored in the XML file. It is the best

way to control the formatting of the file (such as indentations), to do low-level operations (such as a find/replace on

element names) and to edit XML files without any schema or configuration file.

Graphical editors

Graphical editors based on GUIs may be easier for some people to use than text editors, and may not require

knowledge of XML syntax. These are often called WYSIWYG ("What You See Is What You Get") editors, but not

all of them are WYSIWYG: graphical XML editors can be WYSIWYG when they try to display the final rendering

or WYSIWYM ("What You See Is What You Mean") when they try to display the actual meaning of XML elements.

When they are not WYSIWYG, they do not display the (or one of the) graphical end result of a document, but

instead focus on conveying the meaning of the text. They use DTDs or XML schemas and/or configuration files to

map XML elements to graphical components.

These kinds of editors are generally more useful for XML languages that hold data than for those that store documents. Documents tend to be fairly freeform in structure, which defies the generally rigid nature of many graphical editors.

In the above example, the editor is using a configuration file to know that the TABLE element represents a table, the

TR element represents a row of the table, and the TD element represents a cell of the table. It is using this

information to display the table based on this structuring information, in order to make editing easier.

Schema and configuration file information can also be used to ensure that users do not create invalid documents.

For instance, in a text editor, it is possible to create a row with too many cells in the table, while this would not be

possible with the above graphical user interface.

WYSIWYG editors

WYSIWYG editors let people edit files directly with the tags represented by some form of graphical viewing rather

than bare XML code. Often, WYSIWYG editors attempt to emulate the end result of some transform or CSS

stylesheet application. This emulation may or may not be possible, depending on the transformation from XML into

the end result.

Naive use of a WYSIWYG editor can lead to the creation of documents that do not have the intrinsic semantics of

the particular XML language. This comes about if the user is focused on trying to achieve a certain visual

presentation with the editor, rather than using the WYSIWYG view to make editing the document easier. For instance, someone creating a web page could use an H2 element (meaning: second level title) instead of H1 (meaning: first level title) because it looks smaller in their current WYSIWYG editor. Such an author is making a choice based on the apparent visual representation, but a visitor to the author's web page may see a very different rendering in their browser.

However, as long as the underlying meaning of the document is understood by the author, and the author does not

make decisions based on the exact look in the WYSIWYG editor, such an editor can be of value to the writer. It is

generally much easier to read a document that is being rendered in some fashion than it is to read the raw XML code.

Also, editing can be much more intuitive, as the WYSIWYG editor can use tools similar to many word processing

applications. Some WYSIWYG editors even allow the user to use a DTD or Schema and define their own user

interface for editing.

Usually WYSIWYG editors support CSS but not XSLT, because XSLT transformations can be very complex, and

guessing what the user meant when changing the end result can be impossible. The WYSIWYG editors that do

support XSLT, such as Syntext Serna, will therefore apply changes directly to the original XML, while updating the

view by running the XSLT for every change.

In the above example, a stylesheet is used to color table cells in a particular way. For instance, even rows do not have

the same background color as odd rows, in order to make reading easier.

Application domains

• Computer programming

• Technical editing

See also

• List of XML editors

• Authoring system

• Editing

• Source code editor

Edited formats

• XML

• Darwin Information Typing Architecture (DITA)

• DocBook

External links

• XML Editors [1] at the Open Directory Project

• List of editors from xml.com [2]

References

[1] http://www.dmoz.org/Computers/Data_Formats/Markup_Languages/XML/Tools/Editors//

[2] http://www.xml.com/pub/pt/3

XML Enabled Directory

XML Enabled Directory (XED) is a framework for managing objects represented using the Extensible Markup

Language (XML). XED builds on X.500 and LDAP directory services technologies.

XED was originally designed in 2003 by Steven Legg of eB2Bcom (formerly of Adacel Technologies) and Daniel

Prager (formerly of Deakin University).

The XML Enabled Directory (XED) framework leverages existing Lightweight Directory Access Protocol (LDAP)

and X.500 directory technology to create a directory service that stores, manages and transmits Extensible Markup

Language (XML) format data, while maintaining interoperability with LDAP clients, X.500 Directory User Agents

(DUAs), and X.500 Directory System Agents (DSAs).

The main features of XED are:

• semantically equivalent XML renditions of existing directory protocols,

• XML renditions of directory data,

• the ability to accept, at run time, user-defined attribute syntaxes specified in a variety of XML schema languages,

• the ability to perform filter matching on the parts of XML format attribute values,

• the flexibility for implementors to develop XED clients using only their favoured XML schema language.

The XML Enabled Directory allows directory entries to contain XML formatted data as attribute values.

Furthermore, the attribute syntax can be specified in any one of a variety of XML schema languages that the

directory understands.

The directory server is then able to perform data validation and semantically meaningful matching of XML

documents, or their parts, on behalf of client applications, making the implementation of XML-based applications

easier and faster.

XML applications can also exploit the directory's traditional capabilities of cross-application data sharing, data

distribution, data replication, user authentication and user access control, further lowering the cost of building new

XML applications.

XED Implementations

eB2Bcom's View500 Identity Server provides organisations with a fast, scalable and flexible directory system. It

has been developed in strict adherence to open standards and features support for the X.500, LDAP, XED and

ACP133 Standards. Being standards compliant, View500 will interface with a variety of applications, both now and

into the future.

External links

• XML Enabled Directory [1]

• A work-in-progress XED specification [2]

References

[1] http://www.xmled.info/

[2] http://www.xmled.info/specs.htm

XML Encryption

XML Encryption, also known as XML-Enc, is a specification, governed by a W3C recommendation, that defines

how to encrypt the contents of an XML element.

Although XML Encryption can be used to encrypt any kind of data, it is nonetheless known as "XML Encryption"

because an XML element (either an EncryptedData or EncryptedKey element) contains or refers to the cipher text,

keying information, and algorithms.

Both XML Signature and XML Encryption use the KeyInfo element, which appears as the child of a Signature,

EncryptedData, or EncryptedKey element and provides information to a recipient about what keying material to use

in validating a signature or decrypting encrypted data.

The KeyInfo element is optional: it can be attached in the message, or be delivered through a secure channel.
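The element structure described above can be sketched in Python. This is only an illustration of the container markup, not an implementation of the specification: no encryption is performed, the function name build_encrypted_data and the key name are hypothetical, and the AES-128-CBC algorithm identifier is simply one value chosen for the example.

```python
import base64
import xml.etree.ElementTree as ET

# Namespace names of XML Encryption and XML Signature.
XENC = "http://www.w3.org/2001/04/xmlenc#"
DS = "http://www.w3.org/2000/09/xmldsig#"


def build_encrypted_data(cipher_bytes, key_name=None):
    """Wrap already-encrypted bytes in an EncryptedData element.

    Nothing is encrypted here: cipher_bytes is assumed to be cipher
    text produced elsewhere. key_name is optional, mirroring the
    optional KeyInfo element.
    """
    enc = ET.Element(f"{{{XENC}}}EncryptedData")
    # The algorithm is an assumption made for this example.
    ET.SubElement(enc, f"{{{XENC}}}EncryptionMethod",
                  Algorithm=XENC + "aes128-cbc")
    if key_name is not None:
        key_info = ET.SubElement(enc, f"{{{DS}}}KeyInfo")
        ET.SubElement(key_info, f"{{{DS}}}KeyName").text = key_name
    cipher_data = ET.SubElement(enc, f"{{{XENC}}}CipherData")
    cipher_value = ET.SubElement(cipher_data, f"{{{XENC}}}CipherValue")
    cipher_value.text = base64.b64encode(cipher_bytes).decode("ascii")
    return enc


with_key = build_encrypted_data(b"\x00\x01", key_name="shared-key")
without_key = build_encrypted_data(b"\x00\x01")
```

When key_name is omitted, the KeyInfo element is left out entirely, which corresponds to the case where keying information is delivered through a separate channel.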

External links

• W3C info [1]

References

[1] http://www.w3.org/TR/xmlenc-core/

XML Events

In computer science and web development, XML Events is a W3C standard [1] for handling events that occur in an

XML document. These events are typically caused by users interacting with the web page using a device such as a

web browser on a personal computer or mobile phone.

Formal Definition

An XML Event is the representation of some asynchronous occurrence (such as a mouse button click) that gets

associated with a data element in an XML document. XML Events provides a static, syntactic binding to the DOM

Events interface, allowing the event to be handled.

Motivation

The XML Events standard is defined to provide XML-based languages with the ability to uniformly integrate event

listeners and associated event handlers with Document Object Model (DOM) Level 2 event interfaces. The result is

to provide a declarative, interoperable way of associating behaviors with XML-based documents such as XHTML.

Advantages of XML Events

XML Events uses a separation of concerns design pattern, and is technology-neutral with regards to handlers. It

gives authors freedom in organizing their code and allows separation of document content from scripting.

Legacy HTML and early SVG versions bind events to presentation elements by encoding the event name in an

attribute name, such that the value of the attribute is the action for that event at that element. For example (with

JavaScript's onclick attribute):

Stay here!

This design has three drawbacks:

1. it hard-wires the events into the language, so that adding new event types requires changes to the language

2. it forces authors to mix the content of the document with the specifications of the scripting and event handling,

rather than allowing them to separate them.

3. it restricts authors to a single scripting language per document.

Relationship to Other Standards

Unlike DOM Events, which are usually associated with HTML documents, XML Events is designed to be

independent of specific devices. XML Events is used extensively in XForms and in version 1.2 of the SVG

specification, which as of July 2006 was still a working draft.

Example of XML Events using Listener in XForms

The following is an example of how XML events are used in the XForms specification:

Do it!

alert("test");

In this example, when the DOMActivate event occurs on the data element with an id attribute of myButton, the

handler doit (for example a Javascript script element) is executed.
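The binding just described can be illustrated with a short Python sketch. The embedded document is a hypothetical stand-in written for this example (in particular the button and script elements are assumptions); only the listener element with its event, observer and handler attributes, and the XML Events namespace name, follow the specification. The code does not dispatch any events; it only resolves which element is observed and which element handles the event.

```python
import xml.etree.ElementTree as ET

EV = "http://www.w3.org/2001/xml-events"  # XML Events namespace name

# Hypothetical document: the listener ties the DOMActivate event on
# the element with id "myButton" to the handler with id "doit".
DOC = """\
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:ev="http://www.w3.org/2001/xml-events">
  <head>
    <ev:listener event="DOMActivate" observer="myButton" handler="#doit"/>
  </head>
  <body>
    <button id="myButton">Do it!</button>
    <script id="doit" type="text/javascript">alert("test");</script>
  </body>
</html>
"""


def listener_bindings(xml_text):
    """Return (event, observer element, handler element) triples."""
    root = ET.fromstring(xml_text)
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}
    bindings = []
    for listener in root.iter(f"{{{EV}}}listener"):
        observer = by_id[listener.get("observer")]
        # The handler attribute is a URI reference such as "#doit".
        handler = by_id[listener.get("handler").lstrip("#")]
        bindings.append((listener.get("event"), observer, handler))
    return bindings


bindings = listener_bindings(DOC)
```

Because the binding lives in a separate listener element, neither the observed element nor the handler needs an event-specific attribute of its own.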

See also

• ECMAScript

• DOM Events

• XForms

• XHTML

External links

• W3C XML Events Specification [2], which became a W3C Recommendation on 14 October 2003 [3]

• W3C XML Events for HTML Authors [4] tutorial

References

[1] "XML Events: An Events Syntax for XML" (http://www.w3.org/TR/xml-events/). World Wide Web Consortium. 2003-10-14. Retrieved 2008-11-19.

[2] http://www.w3.org/TR/xml-events

[3] http://www.w3.org/TR/2003/REC-xml-events-20031014/

[4] http://www.w3.org/MarkUp/2004/xmlevents-for-html-authors

XML framework

An XML framework is a software framework for XML. Basically, the framework implements several features to aid

the programmer in creating her own application, but an XML framework differs from other frameworks in that all

data produced is XML. The programmer defines and produces pure data in XML format and the framework

transforms the document to any format desired.

From one code base and one XML document, several transformations may result, such as XHTML, SVG, WML, Excel or Word format, or any

other document type.
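The idea of producing pure XML data once and deriving several output formats from it can be sketched as follows. This is not how any particular framework works: a real XML framework would typically apply XSLT stylesheets, whereas this sketch uses plain Python functions as stand-ins because the Python standard library has no XSLT processor. The sample data and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# The application produces pure data in XML (hypothetical sample).
DATA = (
    "<people>"
    "<person><name>Ada</name><role>Engineer</role></person>"
    "<person><name>Lin</name><role>Editor</role></person>"
    "</people>"
)


def to_xhtml_table(root):
    """Render each person as a table row (stand-in for one stylesheet)."""
    table = ET.Element("table")
    for person in root.iter("person"):
        row = ET.SubElement(table, "tr")
        for field in person:
            ET.SubElement(row, "td").text = field.text
    return ET.tostring(table, encoding="unicode")


def to_csv(root):
    """Render each person as a comma-separated line (a second output)."""
    return "\n".join(
        ",".join(field.text for field in person)
        for person in root.iter("person")
    )


root = ET.fromstring(DATA)
html_output = to_xhtml_table(root)
csv_output = to_csv(root)
```

The data is defined once; adding another output format means adding another transformation, not changing the code that produces the data.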

Features in an XML framework

• Classes to abstract the use of XML documents

• Classes to abstract data access: all data is XML regardless of its source (XML, database, text

files)

• XSLT cache.

• Easy way to create XSLT documents like code snippets

• Framework must be extensible because XML is extensible by definition.

Pure XML frameworks

• XMLNuke

XML Literals

In the Microsoft .NET framework, XML Literals allow a computer program to include XML directly in the code. They are

currently only supported in VB.NET 9.0. When a Visual Basic expression is embedded in an XML literal, the

application creates a LINQ to XML object for each literal at run time.

XML namespace

XML namespaces are used for providing uniquely named elements and attributes in an XML document. They are

defined in Namespaces in XML [1], a W3C recommendation. An XML instance may contain element or attribute

names from more than one XML vocabulary. If each vocabulary is given a namespace then the ambiguity between

identically named elements or attributes can be resolved.

A simple example would be to consider an XML instance that contained references to a customer and an ordered

product. Both the customer element and the product element could have a child element named id. References to the

id element would therefore be ambiguous; placing them in different namespaces would remove the ambiguity.

Namespace declaration

A namespace is declared using the reserved XML attribute xmlns, the value of which must be an Internationalized

Resource Identifier (IRI), usually a Uniform Resource Identifier (URI) reference.

For example:

xmlns="http://www.w3.org/1999/xhtml"

Note, however, that the namespace specification neither requires nor suggests that the namespace URI be used to

retrieve information; it is simply treated by an XML parser as a string. For example, the document at

http://www.w3.org/1999/xhtml itself does not contain any code. It simply describes the XHTML namespace to human readers.

Using a URI (such as "http://www.w3.org/1999/xhtml") to identify a namespace, rather than a simple string (such as

"xhtml"), reduces the possibility of different namespaces using duplicate identifiers.

It is also possible to map namespaces to prefixes in namespace declarations. For example:

xmlns:xhtml="http://www.w3.org/1999/xhtml"

In this case, any element or attribute names that start with the prefix "xhtml:" are considered to be in the XHTML

namespace.
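The customer and product case mentioned earlier can be illustrated with Python's standard xml.etree.ElementTree module. The namespace names under example.com are made up for this sketch. Note that the prefixes used for the lookup (cust, prod) differ from those in the document (c, p): only the namespace name matters, and the parser treats it purely as a string.

```python
import xml.etree.ElementTree as ET

# Both vocabularies define an "id" element; the namespace names
# (hypothetical URIs) keep the two apart.
DOC = """\
<order xmlns:c="http://example.com/ns/customer"
       xmlns:p="http://example.com/ns/product">
  <c:customer><c:id>C-17</c:id></c:customer>
  <p:product><p:id>P-42</p:id></p:product>
</order>
"""

root = ET.fromstring(DOC)

# Prefixes are local shorthand: this mapping uses different prefixes
# from the document but the same namespace names.
ns = {
    "cust": "http://example.com/ns/customer",
    "prod": "http://example.com/ns/product",
}
customer_id = root.find("cust:customer/cust:id", ns).text
product_id = root.find("prod:product/prod:id", ns).text

# ElementTree stores each name in expanded form: {namespace name}local.
expanded_tags = [el.tag for el in root.iter() if el.tag.endswith("}id")]
```

The two id elements end up with different expanded names, so references to them are no longer ambiguous.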

Namespace names

Although the term namespace URI is widespread, the W3C Recommendation refers to it as the namespace name.

The specification is not entirely prescriptive about the precise rules for namespace names (it does not explicitly say

that parsers must reject documents where the namespace name is not a valid Uniform Resource Identifier), and many

XML parsers allow any character string to be used. In version 1.1 of the recommendation, the namespace name

becomes an Internationalized Resource Identifier, which licenses the use of non-ASCII characters that in practice

were already accepted by nearly all XML software. The term namespace URI persists, however, not only in popular

usage but also in many other specifications from W3C and elsewhere.

Following publication of the Namespaces recommendation, there was an intensive debate about how a relative URI

should be handled, with some arguing that it should simply be treated as a character string, and others that it should

be turned into an absolute URI by resolving it against the base URI of the document [2] . The result of the debate was

a ruling from W3C that relative URIs were deprecated [3] .

The use of URIs taking the form of URLs in the http scheme (such as http://www.w3.org/1999/xhtml) is

common, despite the absence of any formal relationship with the HTTP protocol. The Namespaces specification does

not say what should happen if such a URL is dereferenced (that is, if software attempts to retrieve a document from

this location). One convention adopted by some users is to place a RDDL document at the location [4] . In general,

however, users should assume that the namespace URI is simply a name, not the address of a document on the web.

See also

• Namespace

External links

• Namespaces in XML 1.0 (Third Edition) [1]

• Namespaces in XML 1.1 (Second Edition) [8]

References

[1] http://www.w3.org/TR/REC-xml-names/

[2] Leigh Dodds (24 May 2000), News from the trenches (http://www.xml.com/pub/a/2000/05/24/deviant/index.html)

[3] Dan Connolly (11 Sep 2000), W3C XML Plenary decision on relative URI references in namespace declarations

[4] Elliotte Rusty Harold (20 Feb 2001), RDDL Me This: What Does a Namespace URL Locate? (http://www.oreillynet.com/pub/a/oreilly/xml/news/xmlnut2_0201.html)

XML Pretty Printer

XML pretty printers are a type of prettyprinter, or code beautifier, that specifically improves the readability of XML.

XML as a standard is designed to be human readable, but is sometimes generated by a computer as tightly

compressed or compacted, and hence more difficult to read and edit. Running the XML file through a pretty printer

will improve its readability and editability.
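As a minimal sketch of what such a tool does, the following Python example re-indents a compact document using the standard library's xml.dom.minidom module. The sample document and the function name pretty_print are made up for illustration. One caveat: if the input already contains whitespace between elements, minidom keeps it and adds its own, so this approach suits machine-generated, compacted XML best.

```python
from xml.dom import minidom


def pretty_print(xml_text, indent="  "):
    """Return xml_text re-serialized with one element per line."""
    return minidom.parseString(xml_text).toprettyxml(indent=indent)


# Compact, machine-generated input with no line breaks at all.
compact = '<catalog><book id="1"><title>XML</title></book><book id="2"/></catalog>'
pretty = pretty_print(compact)
print(pretty)
```

Each nesting level is indented by the chosen indent string, while elements that contain only text stay on a single line.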

Examples of XML Pretty Printers

• xmllint (utility in open source library libxml2)

• xmlindent, an open source tool [1]

Online:

• XML Pretty Printer Online

• DecisionSoft XML Pretty Printer

Windows:

• xmlpp (command line)

See also

• Prettyprint

• XML

External links

• XML Pretty Printer Online [2]

• DecisionSoft XML Pretty Printer [3]

• xmlpp pretty printer [4]

• XML Indent [1] , an XML stream reformatter

References

[1] http://xmlindent.sourceforge.net/

[2] http://www.iconv.com/xmllint.htm

[3] http://tools.decisionsoft.com/xmlpp.html

[4] http://www.cheztabor.com/xmlpp/index.htm

XML Protocol

The XML Protocol ("XMLP") is a standard being developed by the W3C XML Protocol Working Group to the

following guidelines, outlined in the group's charter:

1. An envelope for encapsulating XML data to be transferred in an interoperable manner that allows for distributed

extensibility.

2. A convention for the content of the envelope when used for RPC (Remote Procedure Call) applications. The

protocol aspects of this should be coordinated closely with the IETF and make an effort to leverage any work they

are doing, see below for details.

3. A mechanism for serializing data representing non-syntactic data models such as object graphs and directed

labeled graphs, based on the data types of XML Schema.

4. A mechanism for using HTTP transport in the context of an XML Protocol. This does not mean that HTTP is the

only transport mechanism that can be used for the technologies developed, nor that support for HTTP transport is

mandatory. This component merely addresses the fact that HTTP transport is expected to be widely used, and so

should be addressed by this Working Group. There will be coordination with the Internet Engineering Task Force

(IETF). (See Blocks Extensible Exchange Protocol)

Further, the protocol developed must meet the following requirements, as per the working group's charter:

1. The envelope and the serialization mechanisms developed by the Working Group may not preclude any

programming model nor assume any particular mode of communication between peers.

2. Focus must be put on simplicity and modularity and must support the kind of extensibility actually seen on the

Web. In particular, it must support distributed extensibility where the communicating parties do not have a priori

knowledge of each other.

See also

• XML

• Internet Engineering Task Force

External links

• XML Protocol Working Group Charter [1]

• XML Protocol Working Group [2]

References

[1] http://www.w3.org/2004/02/XML-Protocol-Charter

[2] http://www.w3.org/2000/xp/Group/

XML schema

An XML schema is a description of a type of XML document, typically expressed in terms of constraints on the

structure and content of documents of that type, above and beyond the basic syntactical constraints imposed by XML

itself. These constraints are generally expressed using some combination of grammatical rules governing the order of

elements, Boolean predicates that the content must satisfy, data types governing the content of elements and

attributes, and more specialized rules such as uniqueness and referential integrity constraints.

There are languages developed specifically to express XML schemas. The Document Type Definition (DTD)

language, which is native to the XML specification, is a schema language that is of relatively limited capability, but

that also has other uses in XML aside from the expression of schemas. Two more expressive XML schema

languages in widespread use are XML Schema (with a capital S) and RELAX NG.

The mechanism for associating an XML document with a schema varies according to the schema language. The

association may be achieved via markup within the XML document itself, or via some external means.

Validation

The process of checking to see if an XML document conforms to a schema is called validation, which is separate

from XML's core concept of syntactic well-formedness. All XML documents must be well-formed, but it is not

required that a document be valid unless the XML parser is "validating," in which case the document is also checked

for conformance with its associated schema. DTD-validating parsers are most common, but some support W3C

XML Schema or RELAX NG as well.

Documents are only considered valid if they satisfy the requirements of the schema with which they have been

associated. These requirements typically include such constraints as:

• Elements and attributes that must/may be included, and their permitted structure

• The structure as specified by a regular expression syntax

• How character data is to be interpreted, e.g. as a number, a date, a URL, a Boolean, etc.

Validation of an instance document against a schema can be regarded as a conceptually separate operation from

XML parsing. In practice, however, many schema validators are integrated with an XML parser.
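The difference between well-formedness and validity can be shown with a small Python sketch. The Python standard library does not include a schema validator, so the validity check here is a hand-written stand-in for a schema, not a use of DTD, W3C XML Schema or RELAX NG. The event vocabulary and both function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date


def is_well_formed(xml_text):
    """Syntactic check only: can the text be parsed as XML at all?"""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def is_valid_event(xml_text):
    """Stand-in for a schema: <event> must contain exactly a <name>
    followed by a <date> whose content is an ISO 8601 calendar date."""
    root = ET.fromstring(xml_text)
    # Structural constraint: which elements may appear, in which order.
    if root.tag != "event" or [child.tag for child in root] != ["name", "date"]:
        return False
    # Data type constraint: how the character data is interpreted.
    try:
        date.fromisoformat((root.find("date").text or "").strip())
    except ValueError:
        return False
    return True


good = "<event><name>Launch</name><date>2010-06-17</date></event>"
bad_date = "<event><name>Launch</name><date>yesterday</date></event>"
bad_order = "<event><date>2010-06-17</date><name>Launch</name></event>"
broken = "<event><name>Launch</event>"
```

The documents bad_date and bad_order are well-formed but invalid, while broken is not even well-formed and is rejected by the parser before any validity check can run.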

XML schema languages

• Document Content Description facility for XML, an RDF framework [1]

• Document Definition Markup Language (DDML)

• Document Schema Definition Languages (DSDL)

• Document Structure Description (DSD)

• Document Type Definition (DTD)

• Namespace Routing Language (NRL)

• RELAX NG and its predecessors RELAX and TREX

• SGML

• Schema for Object-Oriented XML (SOX)

• Schematron

• XML-Data Reduced (XDR)

• XML Schema (WXS or XSD)

Capitalization

There is some confusion as to when to use the capitalized spelling "Schema" and when to use the lowercase spelling.

The lowercase form is a generic term and may refer to any type of schema, including DTD, XML Schema (aka

XSD), RELAX NG, or others, and should always be written using lowercase except when appearing at the start of a

sentence. The form "Schema" (capitalized) in common use in the XML community always refers to W3C XML

Schema.

See also

• Data structure

• Structuring information

• List of XML schemas

• XML Information Set

• XML Schema Language Comparison

• Schema (for other uses of the term)

External links

• Comparing XML Schema Languages [2] by Eric van der Vlist (2001)

• Comparative Analysis of Six XML Schema Languages [3] by Dongwon Lee, Wesley W. Chu, In ACM SIGMOD

Record, Vol. 29, No. 3, page 76-87, September 2000

• Taxonomy of XML Schema Languages using Formal Language Theory [4] by Makoto Murata, Dongwon Lee,

Murali Mani, Kohsuke Kawaguchi, In ACM Trans. on Internet Technology (TOIT), Vol. 5, No. 4, page 1-45,

November 2005

• Application of XML Schema in Web Services Security [5] by Sridhar Guthula, W3C Schema Experience Report,

May 2005

References

[1] "Document Content Description for XML: Submission to the World Wide Web Consortium 31-July-1998" (http://www.w3.org/TR/NOTE-dcd).

[2] http://www.xml.com/pub/a/2001/12/12/schemacompare.html

[3] http://pike.psu.edu/publications/sigmod-record-00.pdf

[4] http://pike.psu.edu/publications/toit05.pdf

[5] http://www.w3.org/2005/05/25-schema/guthula.html

XML Schema Editor

The W3C's XML Schema Recommendation defines a formal mechanism for describing XML documents. The

standard has become very popular and is used by the majority of standards bodies when describing their data. [1]

The standard is very versatile, allowing for programming concepts such as inheritance and type creation. However,

its high complexity is one of its main issues. The standard itself is highly technical and published in three different parts,

making it difficult to understand without committing large amounts of time to it.

XML Schema Editor Tools

The problems users face when working with the XSD standard can largely be mitigated with the use of graphical

editing tools. Although any text-based editor can be used to edit an XML Schema, a graphical editor offers the

biggest advantages, allowing the structure of the document to be viewed graphically and edited with validation

support, entry helpers and other useful features.

The editors that have been developed so far take several different approaches to the presentation of information:

Text View

The text view of an XML Schema shows the schema in its native form. XML Schema Editors generally add to the

text view with features like inline entry helpers and entry helper windows, code completion, line numbering, source

folding, and syntax coloring.

In more lengthy and complex schema documents, this is often difficult for even highly trained content model

architects to work with, paving the way for software companies to come up with new and inventive ways for users to

visualize these documents.

Physical View

A physical view of an XML Schema displays a graphical entity for each element within the XML Schema. This can

make an XSD document easier to read, but does little to simplify editing. This is largely down to the structure of the

XSD standard: container elements are required, and which ones depends on the base type used and the types contained

within, so small changes to the logical structure can cause changes to ripple through the document.

The structure of the XSD standard also means entities are referenced from other locations within the document. Some

editors allow these to be expanded and viewed in the location they are referenced from; others do not, which means a lot of

manual cross-referencing.

Logical View

A logical view shows the structure of the XML Schema without showing all the detail of the syntax used to describe

it. This provides a much clearer view of the XML Schema, making it easier to understand the structure of the

document, and makes it easier to edit. Because the editor shows the logical structure of the XSD document, there is

no need to show every element, removing much of the complexity and allowing the editor to automatically manage

the syntactical rules.

Example

The following example will show the source XSD, logical and physical views for a simple schema.

A Sample XML Document for the schema

[Figures: Physical View; Logical View]

John

Doe

As you can see, the logical view provides more information, but without the syntactical clutter, making it easier to

understand and work with.

XML Schema Editors

As the XSD standard has gained support, a host of XML Schema editors have been developed.

Application Name | Screenshot | Code Editor | Physical Editor | Logical Editor | Split Code/Diagram View

Altova XMLSpy screenshots [2]

Eclipse XSD Editor (eclipse.org [3]) screenshots [3] Limited Editing

Liquid XML Studio screenshots [4]

Oxygen xml screenshots [5] Read only

Stylus Studio screenshots [6] Read only

XML Fox - Freeware Edition screenshots [7]

References

[1] http://www.w3.org/TR/xmlschema-0/ (W3C Primer)

[2] http://www.altova.com/features_dtdschema.html

[3] http://wiki.eclipse.org/index.php/Introduction_to_the_XSD_Editor

[4] http://www.liquid-technologies.com/XmlStudio/XmlStudio.aspx

[5] http://www.oxygenxml.com/xml_schema_editor.html

[6] http://www.stylusstudio.com/xml_schema_editor.html

[7] http://www.xmlfox.com/xml_schema_editor.htm

XML Schema Language Comparison

An XML schema is a description of a type of XML document, typically expressed in terms of constraints on the

structure and content of documents of that type, above and beyond the basic syntax constraints imposed by XML

itself. There are several different languages available for specifying an XML schema. Each language has its strengths

and weaknesses.

Note: the W3C-defined schema language is called "XML Schema". However, this name can be confusing in the

context of referring to a number of XML schema languages. As such, throughout this document, the

term "XML schema" refers to any XML schema language wherever the meaning might otherwise be ambiguous, while the term

"W3C XML Schema" (referred to in this article as WXS) is used for the W3C-defined XML schema language.

Overview

Though there are a number of schema languages available, the primary three languages are Document Type

Definitions, W3C XML Schema, and RELAX NG. Each language has its own advantages and disadvantages.

This article also covers a brief review of other schema languages.

The primary purpose of a schema language is to specify what the structure of an XML document can be. This means

which elements can reside in which other elements, which attributes are and are not legal to have on a particular

element, and so forth. A schema is somewhat equivalent to a grammar for a language; a schema defines what the

vocabulary for the language may be and what a valid "sentence" is.

Document Type Definitions

Advantages

Of the primary three languages, DTD is the only one that can be defined inline. That is, the DTD can actually be

embedded directly into the document.

DTDs can define more than merely the content model. They can also define data elements (entities) that can be used in the document,

much like the #define macros that a C or C++ preprocessor expands.
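Both points can be seen in a short Python sketch, using a made-up note document: the DTD is embedded in the document itself, and it declares an entity that is expanded wherever it is referenced. Python's xml.etree.ElementTree module is used here only because it expands entities declared in the internal DTD subset; it does not validate the document against the DTD's content model.

```python
import xml.etree.ElementTree as ET

# The DTD is written inline, between "[" and "]" in the DOCTYPE
# declaration. The entity name and its value are hypothetical.
DOC = """\
<!DOCTYPE note [
  <!ELEMENT note (#PCDATA)>
  <!ENTITY publisher "Example Press">
]>
<note>Published by &publisher;.</note>
"""

root = ET.fromstring(DOC)
# The parser replaces &publisher; with the declared text, much as a
# preprocessor replaces a macro name with its definition.
expanded_text = root.text
```

A processor that ignores DTDs would be unable to resolve the publisher reference, which is the practical risk of relying on DTD-defined entities.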

The DTD language is compact and highly readable, though it does require some experience to understand.

Disadvantages

The primary disadvantage of DTDs is their lack of specificity. The content models for DTDs are very basic,

particularly compared to the other two languages.

Overuse of DTD-defined elements may make a document illegible or incomprehensible without the associated DTD.

Additionally, there are several XML processors that, typically for ease-of-implementation reasons, do not understand

DTDs. As such, if DTD-defined entities are being used, these XML processors will not recognize them.

The language that DTDs are written in is not XML. Therefore, DTDs cannot use the various frameworks that have

been built around XML. XML editors that support writing DTDs must do so by parsing an additional language, for

example. Some XML processors, typically for economy of implementation or execution, simply ignore DTD

information, including DTD data elements.

The DTD concept for XML was borrowed from the SGML DTD concept. As a result, the construct could not be

changed when XML was extended with namespaces, and DTDs are namespace-unaware.

There is limited support for defining the type of the contained data. DTDs are primarily structural in nature. They do

not have the ability to specify that an element contains an integral number, real number, a date, or anything of that

nature.

Tool Support

DTDs are perhaps the most widely supported schema language for XML, because they are one of the earliest

schema languages for XML, defined before XML even had namespace support. Internal

DTDs are often supported in XML processors; external DTDs are supported only slightly less often. Most large

XML parsers, ones that support multiple XML technologies, will provide support for DTDs as well.

W3C XML Schema

Advantages over DTDs

Compared to DTDs, W3C XML Schemas are exceptionally powerful. They provide much greater specificity than

DTDs could. They are namespace aware, and provide support for types.

W3C XML Schema is written in XML itself, and therefore has a schema of its own (appropriately, written in W3C

XML Schema).

W3C XML Schema has a large number of built-in and derived data types. These are specified by the W3C XML

Schema specification, so all W3C XML Schema validators and processors must support them.

Due to the nature of the schema language, after an XML document is validated, the entire XML document, both

content and structure, can be expressed in terms of the schema itself. This functionality, known as

Post-Schema-Validation Infoset (PSVI), can be used to transform the document into a hierarchy of typed objects that

can be accessed in a programming language through a neutral interface.

Commonality with RELAX NG

Both RELAX NG and W3C XML Schema allow for similar mechanisms of specificity. Both allow for a degree of

modularity in their languages, going so far as to being able to split the schema into multiple files. And both of them

are, or can be, defined in an XML language.

Advantages over RELAX NG

RELAX NG lacks any analog to PSVI. Unlike W3C XML Schema, RELAX NG was not designed with type

assignment and data binding in mind.

W3C XML Schema has a formal mechanism for attaching a schema to an XML document.

RELAX NG has no ability to apply default attribute data to an element's list of attributes (i.e., changing the XML

info set), while W3C XML Schema does. [1]

W3C XML Schema has a rich "simple type" system built in (xs:decimal, xs:date, etc., plus derivation of custom

types), while RELAX NG has an extremely simplistic one because it's meant to use type libraries developed

independently of RELAX NG, rather than grow its own. This is seen by some as a disadvantage. In practice it's

common for a RELAX NG schema to use the predefined "simple types" and "restrictions" (pattern, maxLength, etc.)

of W3C XML Schema.

In W3C XML Schema a specific number or range of repetitions of patterns can be expressed more elegantly than

under RELAX NG. For large numbers of repetitions, it is practically impossible to specify them at all in RELAX NG.

Disadvantages

W3C XML Schema is complex and hard to learn, although that's partially because it tries to do more than mere

validation (see PSVI).

Although being written in XML is an advantage, it is also a disadvantage in some ways. The W3C XML Schema

language in particular can be quite verbose, while a DTD can be terse and relatively easily editable.

Likewise, WXS's formal mechanism for associating a document with a schema can pose a potential security

problem. For WXS validators that will follow a URI to an arbitrary online location, there is the potential for reading

something malicious from the other side of the stream. [2]

W3C XML Schema does not implement most of the DTD ability to provide data elements to a document. While

technically a comparative deficiency, this also means it avoids the problems that the ability can create, which

can be seen as a strength.

Although W3C XML Schema's ability to add default attributes to elements is an advantage, it is a disadvantage in

some ways as well. It means that an XML file may not be usable in the absence of its schema, even if the document

would validate against that schema. In effect, all users of such an XML document must also implement the W3C

XML Schema specification, thus ruling out minimalist or older XML parsers. It can also dramatically slow down

processing of the document, as the processor must potentially download and process a second XML file (the

schema).
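To make the infoset point concrete, here is a minimal Python sketch of what a schema-aware processor does when it applies attribute defaults. The `SCHEMA_DEFAULTS` table, the `item` element and the `currency` attribute are hypothetical stand-ins for declarations that would normally live in the schema; no real W3C XML Schema processing takes place.

```python
import xml.etree.ElementTree as ET

# Hypothetical defaults a schema might declare: element name -> {attribute: default}.
SCHEMA_DEFAULTS = {"item": {"currency": "USD"}}

def apply_defaults(root, defaults):
    """Add every missing defaulted attribute, as a schema-aware processor would."""
    for elem in root.iter():
        for name, value in defaults.get(elem.tag, {}).items():
            if name not in elem.attrib:
                elem.set(name, value)
    return root

doc = ET.fromstring('<order><item price="5"/><item price="7" currency="EUR"/></order>')

# What a schema-unaware parser sees: the first item has no currency at all.
raw = [item.get("currency") for item in doc.iter("item")]

# What a consumer sees once the defaults have been merged into the tree.
augmented = [item.get("currency") for item in apply_defaults(doc, SCHEMA_DEFAULTS).iter("item")]
```

A consumer that reads only the raw document gets different data from one that has the schema, which is the interoperability cost described above.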

Tool Support

WXS support exists in a number of large XML parsing packages. Xerces and the .NET Framework's Base Class

Library both provide support for WXS validation.

RELAX NG

RELAX NG provides most of the advantages over DTDs that W3C XML Schema does.

Advantages over W3C XML Schema

While the language of RELAX NG can be written in XML, it also has an equivalent form that is much more like a

DTD, but with greater specifying power. This form is known as the compact syntax. Tools can easily convert

between these forms with no loss of features or even commenting. Even arbitrary elements specified between

RELAX NG XML elements can be converted into the compact form.

RELAX NG provides very strong support for unordered content. That is, it allows the schema to state that a

sequence of patterns may appear in any order.

RELAX NG also allows for non-deterministic content models. What this means is that RELAX NG allows the

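specification of a sequence like the following:

    <zeroOrMore>
      <ref name="odd"/>
      <ref name="even"/>
    </zeroOrMore>
    <optional>
      <ref name="odd"/>
    </optional>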

When the validator encounters something that matches the "odd" pattern, it is unknown whether this is the optional

last "odd" reference or simply one in the zeroOrMore sequence without looking ahead at the data. RELAX NG

allows this kind of specification. W3C XML Schema requires all of its sequences to be fully deterministic, so

mechanisms like the above must be either specified in a different way or omitted altogether.
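The lookahead problem can be illustrated with a small Python sketch. It assumes, purely for illustration, that the repeated group consists of an "odd" pattern followed by an "even" pattern, and it encodes each child element as a single letter so that a regular expression can stand in for the content model. The function names are hypothetical.

```python
import re

# One letter per child element: "O" matches the "odd" pattern, "E" the "even" pattern.
# The model is zeroOrMore(odd, even) followed by optional(odd).
CONTENT_MODEL = re.compile(r"(?:OE)*O?")

def is_valid(children):
    """Return True if the whole sequence of child kinds matches the content model."""
    return CONTENT_MODEL.fullmatch(children) is not None

def role_of_odd(children, index):
    """Decide which "odd" reference the element at `index` corresponds to.

    The answer depends on the element that follows, which is exactly the
    lookahead a fully deterministic content model is not allowed to need.
    """
    following = children[index + 1:index + 2]
    return "zeroOrMore" if following == "E" else "optional"
```

For the input `"OEO"`, the first "O" belongs to the repeated group and the last one to the optional tail, but a validator reading left to right cannot tell which is which until it has seen the next element.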

RELAX NG allows attributes to be treated as elements in content models. In particular, this means that one can

provide the following:

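    <element name="some_element">
      <choice>
        <attribute name="has_name">
          <value>false</value>
        </attribute>
        <group>
          <attribute name="has_name">
            <value>true</value>
          </attribute>
          <element name="name">
            <text/>
          </element>
        </group>
      </choice>
    </element>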

This block states that the element "some_element" must have an attribute named "has_name". This attribute can only

take true or false as values, and if it is true, the first child element of the element must be "name", which stores text.

If "name" did not need to be the first element, then the choice could be wrapped in an "interleave" element along

with other elements. The order of the specification of attributes in RELAX NG has no meaning, so this block need

not be the first block in the element definition.

W3C XML Schema cannot specify such a dependency between the content of an attribute and child elements.
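For readers who want to see the constraint spelled out procedurally, the following Python sketch checks the same dependency by hand. It uses the element and attribute names from the description above ("some_element", "has_name", "name"); the function name is hypothetical, and the code is a hand-written check, not a RELAX NG validator.

```python
import xml.etree.ElementTree as ET

def satisfies_dependency(elem):
    """Check the co-occurrence rule between has_name and the first child."""
    flag = elem.get("has_name")
    if flag == "false":
        return True  # no "name" child is required in this branch
    if flag == "true":
        first = elem[0] if len(elem) else None
        # The first child must be <name> and must hold text only (no child elements).
        return first is not None and first.tag == "name" and len(first) == 0
    return False  # attribute missing, or not one of the two allowed values

with_name = ET.fromstring('<some_element has_name="true"><name>Ann</name></some_element>')
wrong_child = ET.fromstring('<some_element has_name="true"><other/></some_element>')
without_name = ET.fromstring('<some_element has_name="false"/>')
no_attribute = ET.fromstring('<some_element/>')
```

In RELAX NG the schema itself expresses this rule, so no application code of this kind is needed.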

RELAX NG's specification only lists two built-in types (string and token), but it allows for the definition of many

more. In theory, the lack of a specific list allows a processor to support data types that are very problem-domain

specific.

Most RELAX NG schemas can be algorithmically converted into W3C XML Schemas and even DTDs (except when

using RELAX NG features not supported by those languages, as above). The reverse is not true. As such, RELAX

NG can be used as a normative version of the schema, and the user can convert it to other forms for tools that do not

support RELAX NG.

Disadvantages

Most of RELAX NG's disadvantages are covered under the section on W3C XML Schema's advantages over

RELAX NG.

Though RELAX NG's ability to support user-defined data types is useful, it comes at the disadvantage of only

having two data types that the user can rely upon. In theory, this means that using a RELAX NG schema across

multiple validators requires either providing those user-defined data types to each validator or using only the two

basic types. In practice however, most RELAX NG processors support the W3C XML Schema set of data types.

Tool Support

RELAX NG's tool support is significant, but it is less widespread than that of W3C XML Schema. The Mono Project's

implementation of the .NET Framework includes a RELAX NG validator. The C library libxml2 provides RELAX

NG support as well. Sun Microsystems's Multiple Schema Validator for Java also provides RELAX NG support.

Schematron

Schematron is an unusual schema language. Unlike the main three, it defines an XML file's syntax as a list of

XPath-based rules. If the document passes these rules, then it is valid.

Advantages

Because of its rule-based nature, Schematron's specificity is very strong. It can require that the content of an element

be controlled by one of its siblings. It can also request or require that the root element, regardless of what element

that happens to be, have specific attributes. It can even specify required relationships between multiple XML files.
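The rule-based idea can be sketched in a few lines of Python. This is not Schematron: real Schematron rules are written in XML and use full XPath, whereas the sketch below pairs the limited path syntax of `xml.etree.ElementTree` with Python predicates. The `orders`/`order`/`item`/`total` vocabulary and the rule messages are invented for the example.

```python
import xml.etree.ElementTree as ET

def total_matches_items(order):
    """Sibling-dependent constraint: <total> must equal the sum of the item prices."""
    expected = sum(float(item.get("price", "0")) for item in order.findall("item"))
    return float(order.findtext("total") or "nan") == expected

# Each rule: (path selecting the context nodes, assertion on one node, failure message).
RULES = [
    (".", lambda root: root.get("version") is not None,
     "root element needs a version attribute"),
    (".//order", total_matches_items,
     "total must equal the sum of item prices"),
]

def validate(root, rules=RULES):
    """Return the messages of all failed assertions; an empty list means valid."""
    failures = []
    for context, test, message in rules:
        for node in root.findall(context):
            if not test(node):
                failures.append(message)
    return failures

good = ET.fromstring(
    '<orders version="1"><order><item price="2"/><item price="3"/>'
    '<total>5</total></order></orders>')
bad = ET.fromstring('<orders><order><item price="2"/><total>5</total></order></orders>')
```

The first rule applies to the root element whatever its name is, and the second ties the content of one element to its siblings, two of the capabilities mentioned above.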

Disadvantages

While Schematron is good at relational constructs, using it to specify the basic structure of a document, that is,

which elements can go where, results in a very verbose schema.

The typical way to solve this is to combine Schematron with RELAX NG or W3C XML Schema. There are several

schema processors available for both languages that support this combined form. This allows Schematron rules to

specify additional constraints to the structure defined by W3C XML Schema or RELAX NG.

Tool Support

Schematron's reference implementation is actually an XSLT transformation that transforms the Schematron

document into an XSLT that validates the XML file. As such, Schematron's potential toolset is any XSLT processor,

though libxml2 provides an implementation that does not require XSLT. Sun Microsystems's Multiple Schema

Validator for Java has an add-on that allows it to validate RELAX NG schemas that have embedded Schematron

rules.

Namespace Routing Language (NRL)

This is not technically a schema language. Its sole purpose is to direct parts of documents to individual schemas

based on the namespace of the encountered elements. An NRL is merely a list of XML namespaces and a path to a

schema that each corresponds to. This allows each schema to be concerned with only its own language definition,

and the NRL file routes the schema validator to the correct schema file based on the namespace of that element.

This XML format is schema-language agnostic and works for just about any schema language.
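As a rough illustration of namespace-based routing, the Python sketch below maps each namespace found in a document to a schema file. It does not parse NRL syntax; the `ROUTES` table and the schema paths in it are hypothetical, and a real NRL processor dispatches whole sections of the document to validators instead of only reporting the mapping.

```python
import xml.etree.ElementTree as ET

# Hypothetical routing table: namespace URI -> path of the schema responsible for it.
ROUTES = {
    "http://www.w3.org/1999/xhtml": "schemas/xhtml.rng",
    "http://www.w3.org/2000/svg": "schemas/svg.rng",
}

def namespace_of(elem):
    """Return the namespace URI of an element, or "" if it has none."""
    tag = elem.tag  # ElementTree stores qualified names as "{uri}local"
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""

def route(root, routes=ROUTES):
    """Map every namespace used in the document to its schema (None if unregistered)."""
    plan = {}
    for elem in root.iter():
        ns = namespace_of(elem)
        plan[ns] = routes.get(ns)
    return plan

doc = ET.fromstring(
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<svg xmlns="http://www.w3.org/2000/svg"><circle/></svg>'
    '</body></html>')
```

Because the table only names schema files, the same routing works whether those files are written in RELAX NG, W3C XML Schema or another schema language.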

See also

• Document Type Definition

• Document Structure Description

• W3C XML Schema

• RELAX NG

• Schematron

• Namespace Routing Language

• Namespace-based Validation Dispatching Language

References

• Comparative Analysis of Six XML Schema Languages [3] by Dongwon Lee, Wesley W. Chu, In ACM SIGMOD

Record, Vol. 29, No. 3, page 76-87, September 2000

• Taxonomy of XML Schema Languages using Formal Language Theory [4] by Makoto Murata, Dongwon Lee,

Murali Mani, Kohsuke Kawaguchi, In ACM Trans. on Internet Technology (TOIT), Vol. 5, No. 4, page 1-45,

November 2005

[1] While annotations in RELAX NG can support default attribute values, the RELAX NG specification does not mandate that a validator provide this ability to modify an XML infoset as part of validation. The WXS specification does mandate this behavior. An additional specification associated with RELAX NG does provide this ability. See RELAX NG DTD Compatibility (default value) (http://www.oasis-open.org/committees/relax-ng/compatibility.html#default-value).

[2] James Clark (co-creator of RELAX NG). RELAX NG and W3C XML Schema (http://www.imc.org/ietf-xml-use/mail-archive/msg00217.html)

XML Studio

Editing an XML Schema in XML Studio

Developer(s): Liquid Technologies

Operating system: Microsoft Windows

Type: XML editor

License: EULA

Website: [1]

Liquid XML Studio is an XML Editor and Integrated Development Environment (IDE) from Liquid Technologies.

Liquid XML Studio allows developers to create XML-based and Web services applications using technologies such

as XML, XML Schema, XSLT, XPath, WSDL, and SOAP [2]. Liquid XML Studio is also available as a plug-in for

Microsoft Visual Studio [3].

Editions

• Starter Edition

• Designer Edition. Adds Visual Studio Integration and an XML Differencing tool.

• Developer Edition. Adds Code generation to the features found in the Designer Edition. The XML Data Binder

generates code for C++, C#, VB.Net, Java, Silverlight & Visual Basic.

Editing Views

• Graphical XML Schema Editor (XSD).

• XML editor - with syntax highlighting and IntelliSense

• DTD & CSS Editor - with syntax highlighting and validation

• XSLT Editor - test transform, syntax highlighting, IntelliSense and validation

Features

• XPath Expression Builder - shows the results of your queries in realtime

• Web Service Call Composer - allows developers to browse and call web services

• XML Sample Generator - generates sample XML from an XML Schema

• XSD Documentation Generation - creates HTML documentation from an XML Schema

• XML Differencing tool - visualizes the differences between two XML files

• XML Schema Code Generation (XML Data Binding) for C++, C#, Java, VB.Net & Visual Basic 6

• XSLT Editor - edits and executes XSL Transforms

• Fast Infoset Support - loads and saves XML as Fast Infoset [4]

See also

• Liquid Technologies

• XML

• IDE

• XML Schema

• XSLT

• XPath

• Web services

• Web Services Description Language

• SOAP

External links

• XML Studio product page [1]

References

[1] http://www.liquid-technologies.com/Product_XmlStudio.aspx

[2] Liquid XML Studio Product Page (http://www.liquid-technologies.com/Product_XmlStudio.aspx)

[3] Microsoft Visual Studio Gallery (http://visualstudiogallery.com/ExtensionDetails.aspx?ExtensionID=33d43486-e73a-4f64-a342-f32c702abc19)

[4] OSS Nokalva 'Market Wire' (http://www.marketwire.com/press-release/Oss-Nokalva-714198.html)

XML Telemetric and Command Exchange

XTCE (for XML Telemetric and Command Exchange) is an XML-based exchange format for spacecraft telemetry and command metadata. Using XTCE, the format and content of a space system's command and telemetry links can be readily exchanged between spacecraft operators and manufacturers. XTCE was originally standardized by the OMG. In April 2007 the OMG released revision 1.1 of XTCE as an OMG available specification. Version 1.0 of the XTCE specification is a CCSDS red-book specification and version 1.1 is a candidate CCSDS blue-book specification.

Overview

During the entire ground system development and operation phases of a mission, telemetry and telecommand

definitions may be exchanged between multiple systems and organizations. Without a standard format, databases

need dedicated converters to convert between the various proprietary database formats and editors. Allowing for a

common database exchange format throughout the entire mission lifecycle will significantly reduce the cost of

database conversions that occur in many space projects. XTCE has been developed as part of an international

cooperation involving the National Aeronautics and Space Administration, the Jet Propulsion Laboratory, the

Goddard Space Flight Center, the European Space Agency, the United States Air Force and private industry

including RT Logic, Harris, SciSys, Boeing and Lockheed Martin. The standards development effort has been

coordinated via the Consultative Committee for Space Data Systems and the Object Management Group. The XML

Telemetric and Command Exchange standard is now in active use as a means to exchange mission databases,

improving interoperability while reducing mission readiness costs.

External links

• XTCE home [1]

References

• AIAA conference - SpaceOps2006, The XTCE Standardization approach of Telemetry and Command Databases - The ESA example: http://pdf.aiaa.org/preview/CDReadyMSPOPS06_1317/PV2006_5582.pdf

• AIAA conference - SpaceOps 2006, Exchanging Databases with Dissimilar Systems using CCSDS XTCE: http://pdf.aiaa.org/preview/CDReadyMSPOPS06_1317/PV2006_5801.pdf

• CCSDS, MOIMS-SMC Working Group: http://cwe.ccsds.org/moims/docs/MOIMS-SMandC

• GSAW conference - 2006, Exchanging Databases with Dissimilar Systems using CCSDS XTCE, http://sunset.usc.edu/gsaw/gsaw2006/s2/merri.pdf

• Aerospace Conference, 2004, XTCE: a standard XML-schema for describing mission operations databases, http://ieeexplore.ieee.org/Xplore/login.jsp?url=/iel5/9422/29904/01368138.pdf

• AIAA conference - SpaceOps2006, A Model for a Spacecraft Operations Language, http://www.rheagroup.com/AIAA-2006-5708-129.pdf

References

[1] http://www.omg.org/space/xtce

XML template engine

An XML template engine (or XML template processor) is a specialized template processor for XML input and/or

output, working in an XML template system context. There are two main types:

• "XML-suite standards" compliant engines:

• XSLT engines, also called XSLT processors

• XQuery engines, also called XQuery processors

• Others, like Web template engines

XSLT processors

XSLT processors may be delivered as standalone applications, or as software components or libraries intended for

use by applications. Many web browsers and web server software have XSLT processor components built into them.

Most current operating systems have an XSLT processor installed. For example, Windows XP comes with the

MSXML3 library, which includes an XSLT processor.

Optimizations

Early XSLT processors had very few optimizations; stylesheet documents were read using the Document Object

Model and the processor would act on them directly. XPath engines were also not optimized.

By 2000, however, implementors saw optimization opportunities in both XPath evaluation and template rule

processing. For example, the Java programming language's Transformation API for XML (TrAX), later subsumed

into the Java API for XML Processing (JAXP), acknowledged one such optimization: before processing, the XSLT

processor could condense the template rules and other stylesheet tree information into a single, compact Templates

object, free from the constraints and bloat of standard DOMs, in an implementation-specific manner. This

intermediate representation of the stylesheet tree allows for more efficient processing by potentially reducing

preparation time and memory overhead. Additionally, the formal API allows for the object to be cached and reused

for multiple transformations, potentially providing higher performance if several input documents are to be

processed with the same XSLT stylesheet. Parallels are often drawn between this optimization and the compilation

of programming language source code to bytecode: the stylesheets are said to be "compiled", even though they don't

usually produce native programming language bytecode; rather, they produce intermediate structures and routines

that are stored and processed internally. [1]

In contrast, Eugene Kuznetsov (DataPower, IBM) and Jacek Ambroziak (Sun Microsystems: XSLT, Ambrosoft: Gregor/XSLT) independently created optimizing compilers that emit executable binary code, described as the industry's first. The approach has two major benefits: 1) the transformation executable can be run anywhere, including servers, mobile devices, and embedded environments lacking memory for the complete interpreter/compiler system, and 2) the transformation performance may reach the highest possible levels. Compilation of this kind yields the fastest transformation execution only when complemented by equally careful runtime system design.

XPath evaluation also has room for significant optimizations, and most processor vendors have implemented at least

some of them, for speed. For example, in <xsl:if test="/some/nodes">, the test will evaluate to true if /some/nodes

identifies any nodes, so evaluation can stop as soon as the first matching node is found; continuing to look for the

entire set of matching nodes would not change the result. Similar optimizations can be undertaken when processing

xsl:when and xsl:value-of, as well as expressions relying on, either implicitly or explicitly, string(), boolean(), or

number(), and those that use numeric and position()/last()-based predicates.
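The early-exit idea can be sketched in Python with the standard library's ElementTree, which supports only a small subset of XPath. Both function names are hypothetical, and because ElementTree does not accept absolute paths on an element, the example uses the relative path `nodes` from a `some` root where the text uses /some/nodes.

```python
import xml.etree.ElementTree as ET

def exists_eager(root, path):
    """Unoptimized: build the complete list of matches, then test it."""
    return len(root.findall(path)) > 0

def exists_lazy(root, path):
    """Optimized: ask only for the first match and stop there."""
    return next(iter(root.iterfind(path)), None) is not None

root = ET.fromstring("<some><nodes/><nodes/><other/></some>")
```

Both functions return the same boolean for any path; the lazy version simply avoids collecting matches that cannot change the answer.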

Implementations

Some of these are only libraries for specific programming languages, but some form the basis for command

line or shell script utilities for one or more operating systems. Such utilities are either bundled with the

libraries or independently maintained, and some are incorporated into other applications, such as database

engines and web browsers, in order to add XSLT functionality to them. With the exception of web browsers,

such utilities and applications are not listed here.

Implementations for Java

Xalan: Xalan-Java [2]

SAXON by Michael Kay

Gregor/XSLT [3] optimizing compiler and runtime by Jacek Ambroziak

XT [4] originally by James Clark

Oracle XSLT, in the Oracle XDK [5]

Implementations for the .NET Framework

Saxon .NET SourceForge Project Page [6], an IKVM.NET-based port of Dr. Michael Kay's and

Saxonica's Saxon Processor, provides XSLT 2.0, XPath 2.0, and XQuery 1.0 support on the .NET

platform.

The .NET System.Xml assembly provides a compiled XSLT 1.0 implementation, as well as an

interpreted XSLT 1.0 implementation.

Implementations for C or C++

Xalan: Xalan-C++ [7]

libxslt the XSLT C library for GNOME

Sablotron [8], which is integrated into PHP4

XJR [9], with XSLT 2.0, XPath 2.0, and JSON support

Implementations for Perl

XML::LibXSLT [10] is a Perl interface to the libxslt C library

XML::Sablotron [11] is a Perl interface to the Sablotron [8] processor

Implementations for PHP

XSLT [12] is the PHP4 interface to the Sablotron [8] processor

XSL [13] is the new interface to XSL introduced in PHP5. The extension uses the libxslt library.

Implementations for Python

4XSLT, in the 4Suite [14] toolkit by Fourthought, Inc.

lxml [15] is a Pythonic wrapper of the libxslt C library

Implementations for Ruby

Ruby/XSLT [16] is a simple XSLT class based on libxml and libxslt

Sablotron module for Ruby [17] is a Ruby interface to Sablotron

Implementations for Tcl

TclXSLT [18] wraps the libxslt library.

tDOM [19] is a generic XML package, based on the expat library, that includes an XSLT implementation. In 2003, it was deemed "very probably the fastest available open source XSLT implementation, especially for bigger source files". [20]

Implementations for JavaScript

Google AJAXSLT [21] is an implementation of XSLT in JavaScript, intended for use in Ajax

applications.

Implementations for specific operating systems

Microsoft's MSXML library may be used in various Microsoft Windows application development

environments and languages, such as Visual Basic, C, and JScript.

Microsoft offers a new XSLT processor in the System.Xml component of the .NET Framework.

Implementations integrated into web browsers

See also: Comparison of layout engines (XML)

Mozilla has native XSLT support [22] based on TransforMiiX.

Safari 1.2+ has native XSLT support, but Safari 1.2 is unable to perform XSL transformations via JavaScript [23], a limitation that does not occur in Mozilla, Internet Explorer, or Safari 3. This limits the capabilities of Ajax applications that would run in Safari 2. Safari's XML parser (possibly in all versions) is also not standards-compliant; it will parse XML strings according to HTML rules. Therefore, under certain circumstances, it will omit data from the DOM tree if it encounters malformed "HTML", even though it actually encountered valid XML. These errors will propagate to XSL-processed DOM trees.

X-Smiles has native XSLT support.

Opera has had partial native XSLT support since version 9. Notable exceptions include the absence of the document() function.

Internet Explorer 6 supports XSLT 1.0 via the MSXML library (described above). IE5 and IE5.5 came with an earlier MSXML component that only supported an older, nonrecommended dialect of XSLT. A newer version of MSXML can be downloaded and installed separately to enable IE5 and IE5.5 to support XSLT 1.0 through scripting, and if certain Windows Registry keys are modified, the newer library will replace the older version as the default used by IE.

References

[1] Saxon: Anatomy of an XSLT processor (http://www-128.ibm.com/developerworks/xml/library/x-xslt2/) - An article describing the

implementation and optimization details of a popular Java-based XSLT processor.

[2] http://xml.apache.org/xalan-j/

[3] http://ambrosoft.com/

[4] http://www.blnz.com/xt/

[5] http://www.oracle.com/technology/tech/xml/xdkhome.html

[6] http://saxon.sourceforge.net/

[7] http://xml.apache.org/xalan-c/

[8] http://www.gingerall.org/sablotron.html

[9] https://www.p6r.com/software/xjr.html

[10] http://search.cpan.org/~msergeant/XML-LibXSLT-1.57/LibXSLT.pm

[11] http://search.cpan.org/~pavelh/XML-Sablotron-1.01/Sablotron.pm

[12] http://no.php.net/manual/en/ref.xslt.php

[13] http://no.php.net/manual/en/book.xsl.php

[14] http://4suite.org/

[15] http://codespeak.net/lxml/

[16] http://raa.ruby-lang.org/project/ruby-xslt/

[17] http://www.rubycolor.org/sablot/

[18] http://tclxml.sourceforge.net/tclxslt.html

[19] http://www.tdom.org/

[20] Loewer, Jochen; Ade, Rolf. "tDOM manual: tDOM Overview" (http://www.tdom.org/). Retrieved 2009-11-12.

[21] http://goog-ajaxslt.sourceforge.net/

[22] http://www.mozilla.org/projects/xslt/

[23] http://developer.apple.com/internet/safari/faq.html#anchor21

XML tree

XML documents have a hierarchical structure and can conceptually be interpreted as a tree structure, called an XML

tree.

This tree structure cannot be divided into just root, nodes and leaves, as ordinary tree structures can. Although there is no

consensus on the terminology used for XML trees, at least two standard terminologies exist:

• The terminology used in the XPath Data Model

• The terminology used in the XML Information Set.

XML validation

XML validation is the process of checking a document written in XML (eXtensible Markup Language) to confirm

that it is both "well-formed" and also "valid" in that it follows a defined structure. A "well-formed" document

follows the basic syntactic rules of XML, which are the same for all XML documents. [1] A valid document also

respects the rules dictated by a particular DTD or XML schema, which reflect application-specific choices. [2]

In addition, extended tools are available, such as the OASIS CAM standard specification, which provides contextual

validation of content and structure that is more flexible than basic schema validation.

xmllint is a command-line XML tool that can perform XML validation. It can be found in UNIX / Linux

environments. For example, to validate a file called example.xml:

xmllint --valid --noout example.xml
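The first half of the definition, well-formedness, can also be checked from a program. The sketch below uses Python's standard `xml.etree.ElementTree` parser; the function name is hypothetical. Note the limitation: this parser reports syntax errors only and does not validate against a DTD or schema, so it cannot replace the xmllint call above.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if `text` parses as XML, i.e. obeys the basic syntax rules."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

balanced = "<a><b/></a>"
unbalanced = "<a><b></a>"  # <b> is never closed
```

A document that fails this check cannot be valid either, since validity presupposes well-formedness.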

External links

Example C program

• Validate XML against XSD in C [3] (using libxml)

XML toolkit

• The XML C parser and toolkit of Gnome [4] – libxml includes xmllint

• Windows port of libxml [5] – maintained by Igor Zlatkovic

Online validators for XML files

• http://www.xmlvalidation.com/

• http://www.stg.brown.edu/service/xmlvalid/

• http://www.jcam.org.uk

Articles discussing XML validation

• DEVX March, 2009 - Taking XML Validation to the Next Level: Introducing CAM [6]

References

[1] "Well-Formed XML Documents" (http://www.w3.org/TR/2004/REC-xml11-20040204/#sec-well-formed). Extensible Markup Language (XML) 1.1. W3C. 2004.

[2] "Constraints and Validation Rules" (http://www.w3.org/TR/xmlschema-1/#concepts-schemaConstraints). XML Schema Part 1: Structures Second Edition. W3C. 2004.

[3] http://knol2share.blogspot.com/2009/05/validate-xml-against-xsd-in-c.html

[4] http://xmlsoft.org/xmldtd.html

[5] http://www.zlatkovic.com/libxml.en.html

[6] http://www.devx.com/xml/Article/41066

XML-Enabled Networking

XML Enabled Networking provides an abstraction layer that exists alongside the traditional Internet Protocol (IP)

network. This layer addresses the security, incompatibility and latency issues encumbering XML messages, web

services and service oriented architectures (SOAs).

History of XML Enabled Networking

Many organizations have adopted XML technologies - often as Web services or service oriented architectures

(SOAs) - as the standard for new application development and integration. Applications based on XML and Web

services offer rapid interoperability and seamless service re-use by establishing a standard data format and a standard

interface.

With faster development cycles, less development effort and improved agility, XML and Web services enable IT to

deliver more solutions to the business at a substantially lower cost. However, using these technologies also creates

some potential problems:

• Security concerns: XML messages are text-based, human readable, verbose, and self-describing. An XML

message could include descriptions of identities and credentials used to authenticate services, signatures requiring

verification etc. XML by itself does not provide an infrastructure for integrating with multiple identity/access

control systems across the organization, ensuring trust and compliance for XML message processing, or

protecting the organization from the threats that malicious individuals could introduce into the organization with

XML.

• Incompatibilities: Many XML standards have emerged. XML messages use a variety of security standards,

transport protocols, credential types and data structures. Web service developers need some way to mediate

between these different standards and protocols, especially when they are integrating with business partners who

may employ entirely different standards and protocols.

• Application latency: XML messages can consume significant processing resources from application servers,

lowering performance for the XML-based service and for other applications that run on the same platform.

XML Enabled Networking attempts to address these issues by creating an abstraction layer that exists alongside the

traditional Internet Protocol (IP) network to provide security and access enforcement, accelerated XML message

processing, mediation between standards and protocols, policy control and auditing. XML Enabled Networks have

typically been sold as network appliances. Initially they required application-specific integrated circuits, but

appliances that run on standards-based hardware and operating systems are now available.

Common Features of XML Enabled Networking

• It is powered by hardened network appliances, ready to incorporate into the network with minimal disruption

• XML Enabled Networking appliances have software to make the appliances easy to install, configure and manage

• They can validate XML messages for well-formedness as they enter or exit the appliance

• They can convert XML to any data format

• They have built-in storage capabilities to enable on-device logging for compliance and debugging purposes.

• They have built-in support for many XML standards such as XSLT, XPath, SOAP and WS-Security

• They are easily upgradeable

Classification of XML Enabled Networking

XML Security Gateways or XML Firewalls offer comprehensive XML security processing. XML Security Gateways

include acceleration and integration functionality. Enterprise class XML Security Gateways include robust policy

management, correlated event/message/policy logging for visibility and extensibility frameworks.

XML Routers deliver robust access control and integration with identity authorities with acceleration and integration

functionality. Enterprise class XML Routers include robust policy management, correlated event/message/policy

logging for visibility and extensibility frameworks.

XML Accelerators optimize both message throughput and server performance for XML operations including schema

validation, encryption/decryption, authentication, signing, data transformation and protocol mediation. Enterprise

class XML Accelerators include robust policy management, correlated event/message/policy logging for visibility

and extensibility frameworks.

XML Enabled Networking vendors

• Citrix Systems

• DataPower (IBM)

• F5 Networks

• Forum Systems

• Layer 7 Technologies

• Reactivity, Inc. (Cisco [1])

• Solace Systems

• Sonoa Systems

• Strangeloop Networks

• Vordel

• Zeus Systems

See also

• XML

• SOAP

• WS-Security

• XML appliance

References

[1] http://newsroom.cisco.com/dlls/2007/corp_022107.html

XML-Retrieval

XML Retrieval, or XML Information Retrieval, is the content-based retrieval of documents structured with XML

(eXtensible Markup Language). As such it is used for computing relevance of XML documents. [1]

Queries

Most XML retrieval approaches compute relevance using techniques from the information retrieval (IR) area, e.g. by

computing the similarity between a query consisting of keywords (query terms) and the document. However, in

XML-Retrieval the query can also contain structural hints. So-called "content and structure" (CAS) queries enable

users to specify what structure the requested content can or must have.

Exploiting XML structure

Taking advantage of the self-describing structure of XML documents can improve the search for XML documents

significantly. This includes the use of CAS queries, the weighting of different XML elements differently and the

focused retrieval of subdocuments.

Ranking

Ranking in XML-Retrieval can incorporate both content relevance and structural similarity, which is the

resemblance between the structure given in the query and the structure of the document. Also, the retrieval units

resulting from an XML query may not always be entire documents, but can be any deeply nested XML elements, i.e.

dynamic documents. The aim is to find the smallest retrieval unit that is highly relevant. Relevance can be defined

according to the notion of specificity, which is the extent to which a retrieval unit focuses on the topic of request. [2]
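A toy Python sketch can make the idea of element-level retrieval units concrete. It scores every element by a crude specificity measure, the share of its words that are query terms, and returns the best-scoring element. The scoring formula, the function names and the sample document are assumptions made for illustration; they are not taken from INEX or from any particular retrieval system.

```python
import xml.etree.ElementTree as ET

def specificity(elem, terms):
    """Share of the element's words (including descendants) that are query terms."""
    words = " ".join(elem.itertext()).lower().split()
    if not words:
        return 0.0
    return sum(word in terms for word in words) / len(words)

def best_unit(root, query):
    """Return the element that focuses most tightly on the query terms."""
    terms = set(query.lower().split())
    return max(root.iter(), key=lambda elem: specificity(elem, terms))

doc = ET.fromstring(
    "<article><title>XML basics</title><sec>"
    "<p>schema validation rules</p><p>unrelated filler text here</p>"
    "</sec></article>")
hit = best_unit(doc, "schema validation")
```

Here the enclosing `sec` and `article` elements also contain the query terms, but they are diluted by unrelated text, so the single paragraph is returned as the retrieval unit.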

Existing XML search engines

An overview of two potential approaches is available. [3] [4] The INitiative for the Evaluation of XML-Retrieval

(INEX) was founded in 2002 and provides a platform for evaluating such algorithms. [2] Three different areas

influence XML-Retrieval: [5]

Traditional XML query languages

Query languages such as the W3C standard XQuery [6] support complex queries, but only look for exact matches.

Therefore, they need to be extended to allow for vague search with relevance computing. Most XML-centered

approaches imply a quite exact knowledge of the documents' schemas. [7]

Databases

Classic database systems have adopted the possibility to store semi-structured data, [5] which resulted in the development

of XML databases. Often, they are very formal, concentrate more on searching than on ranking, and are used by

experienced users able to formulate complex queries.

Information retrieval

Classic information retrieval models such as the vector space model provide relevance ranking, but do not include

document structure; only flat queries are supported. Also, they apply a static document concept, so retrieval units

usually are entire documents. [7] They can be extended to consider structural information and dynamic document

retrieval. Examples for approaches extending the vector space models are available: they use document subtrees

(index terms plus structure) as dimensions of the vector space. [8]

See also

• Document retrieval

• Information retrieval applications

References

[1] Winter, Judith; Drobnik, Oswald (November 9, 2007). "An Architecture for XML Information Retrieval in a Peer-to-Peer Environment" (ftp://ftp.tm.informatik.uni-frankfurt.de/pub/papers/ir/An%20Architecture%20for%20XML%20Information%20Retrieval%20in%20a%20Peer-to-Peer%20Environment_2007.pdf). ACM. Retrieved 2009-02-10.

[2] Malik, Saadia; Trotman, Andrew; Lalmas, Mounia; Fuhr, Norbert (2007). "Overview of INEX 2006" (http://www.cs.otago.ac.nz/homepages/andrew/2006-10.pdf). Proceedings of the Fifth Workshop of the INitiative for the Evaluation of XML Retrieval. Retrieved 2009-02-10.

[3] Amer-Yahia, Sihem; Lalmas, Mounia (2006). "XML Search: Languages, INEX and Scoring" (http://www.sigmod.org/record/issues/0612/p16-article-yahia.pdf). SIGMOD Rec. Vol. 35, No. 4. Retrieved 2009-02-10.

[4] Pal, Sukomal (June 30, 2006). "XML Retrieval: A Survey" (http://66.102.1.104/scholar?q=cache:R6ZYFNoTRrUJ:citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.109.5986&rep=rep1&type=pdf). Technical Report, CVPR. Retrieved 2009-02-10.

[5] Fuhr, Norbert; Gövert, N.; Kazai, Gabriella; Lalmas, Mounia (2003). "INEX: Initiative for the Evaluation of XML Retrieval" (http://www.is.informatik.uni-duisburg.de/bib/pdf/ir/Fuhr_etal:02a.pdf). Proceedings of the First INEX Workshop, Dagstuhl, Germany, 2002. ERCIM Workshop Proceedings, France. Retrieved 2009-02-10.

[6] Boag, Scott; Chamberlin, Don; Fernández, Mary F.; Florescu, Daniela; Robie, Jonathan; Siméon, Jérôme (23 January 2007). "XQuery 1.0: An XML Query Language" (http://www.w3.org/TR/2007/REC-xquery-20070123/). W3C Recommendation. World Wide Web Consortium. Retrieved 2009-02-10.

[7] Schlieder, Torsten; Meuss, Holger (2002). "Querying and Ranking XML Documents" (http://209.85.173.132/search?q=cache:KHBo9BRjO7QJ:www.cis.uni-muenchen.de/people/Meuss/Pub/JASIS02.ps.gz). Journal of the American Society for Information Science and Technology, Vol. 53, No. 6. Retrieved 2009-02-10.

[8] Liu, Shaorong; Zou, Qinghua; Chu, Wesley W. (2004). "Configurable Indexing and Ranking for XML Information Retrieval" (http://www.cobase.cs.ucla.edu/tech-docs/sliu/SIGIR04.pdf). SIGIR'04. ACM. Retrieved 2009-02-10.


XMLHttpRequest


XMLHttpRequest (XHR) is an API available in web browser scripting languages such as JavaScript. It is used to

send HTTP or HTTPS requests directly to a web server and load the server response data directly back into the

script. [1] The data might be received from the server as XML text [2] or as plain text. [3] Data from the response can be

used directly to alter the DOM of the currently active document in the browser window without loading a new web

page document. The response data can also be evaluated by the client-side scripting. For example, if it was formatted

as JSON by the web server, it can easily be converted into a client-side data object for further use.
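As a minimal sketch of the JSON case, the snippet below turns response text into a client-side object. The variable responseText stands in for the text an XMLHttpRequest would deliver, the payload is invented for illustration, and parseJsonResponse is a hypothetical helper; JSON.parse is assumed to be available in the scripting environment.

```javascript
// Stand-in for xhr.responseText; the payload is made up for this example.
const responseText = '{"user": "alice", "unread": 3}';

// Convert JSON text into an object, returning null for malformed input
// so the caller can decide how to handle a bad response.
function parseJsonResponse(text) {
  try {
    return JSON.parse(text);
  } catch (e) {
    return null;
  }
}

const data = parseJsonResponse(responseText);
```

After parsing, fields such as data.user can be read like any other object property and used to update the page.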

XMLHttpRequest has an important role in the Ajax web development technique. It is currently used by many

websites to implement responsive and dynamic web applications. Examples of these web applications include Gmail,

Google Maps, Facebook, and many others.

History and support

The concept behind the XMLHttpRequest object was originally created by the developers of Outlook Web Access for

Microsoft Exchange Server 2000. [4] An interface called IXMLHTTPRequest was developed and implemented into

the second version of the MSXML library using this concept. [4] [5] The second version of the MSXML library was

shipped with Internet Explorer 5.0 in March 1999, allowing access, via ActiveX, to the IXMLHTTPRequest interface

using the XMLHTTP wrapper of the MSXML library. [6]

The Mozilla Foundation developed and implemented an interface called nsIXMLHttpRequest into the Gecko layout

engine. This interface was modelled to work as closely to Microsoft's IXMLHTTPRequest interface as possible. [7] [8]

Mozilla created a wrapper to use this interface through a JavaScript object which they called XMLHttpRequest. [9]

The XMLHttpRequest object was accessible as early as Gecko version 0.6, released on December 6, 2000, [10] [11]

but it was not completely functional until as late as version 1.0 of Gecko, released on June 5, 2002. [10] [11] The

XMLHttpRequest object became a de facto standard amongst other major user agents, implemented in Safari 1.2

released in February 2004, [12] Konqueror, Opera 8.0 released in April 2005, [13] and iCab 3.0b352 released in

September 2005. [14]

The World Wide Web Consortium published a Working Draft specification for the XMLHttpRequest object on April

5, 2006, edited by Anne van Kesteren of Opera Software and Dean Jackson of W3C. [15] Its goal is "to document a

minimum set of interoperable features based on existing implementations, allowing Web developers to use these

features without platform-specific code." The most recent revision of the XMLHttpRequest object specification,

a Last Call Working Draft, was published on November 19, 2009. [16] [17]


Microsoft added the XMLHttpRequest object identifier to its scripting languages in Internet Explorer 7.0 released in

October 2006. [6]

With the advent of cross-browser JavaScript libraries such as jQuery and the Prototype JavaScript Framework,

developers can invoke XMLHttpRequest functionality without coding directly to the API. Prototype provides an

asynchronous requester object called Ajax.Request that wraps the browser's underlying implementation and provides

access to it. [18] jQuery objects represent or wrap elements from the current client-side DOM. They all have a .load()

method that takes a URI parameter and makes an XMLHttpRequest to that URI, then by default places any returned

HTML into the HTML element represented by the jQuery object. [19] [20]

The W3C has since published another Working Draft specification for the XMLHttpRequest object,

"XMLHttpRequest Level 2", on February 25 of 2008. [21] Level 2 consists of extended functionality to the

XMLHttpRequest object, including, but not currently limited to, progress events, support for cross-site requests, and

the handling of byte streams. The latest revision of the XMLHttpRequest Level 2 specification is that of 20th August

2009, which is still a working draft. [22]

Support in Internet Explorer versions 5, 5.5 and 6

Internet Explorer versions 5 and 6 did not define the XMLHttpRequest object identifier in their scripting languages

as the XMLHttpRequest identifier itself was not standard at the time of their releases. [6] Backward compatibility can

be achieved through object detection if the XMLHttpRequest identifier does not exist.

An example of how to instantiate an XMLHttpRequest object with support for Internet Explorer versions 5 and 6

using JScript method ActiveXObject is below. [23]

/*
  Provide the XMLHttpRequest constructor for IE 5.x-6.x:
  Other browsers (including IE 7.x-8.x) do not redefine
  XMLHttpRequest if it already exists.

  This example is based on findings at:
  http://blogs.msdn.com/xmlteam/archive/2006/10/23/using-the-right-version-of-msxml-in-inte
*/
if (typeof XMLHttpRequest == "undefined")
  XMLHttpRequest = function () {
    // Try the newest MSXML version first and fall back to older ones.
    try { return new ActiveXObject("Msxml2.XMLHTTP.6.0"); }
      catch (e) {}
    try { return new ActiveXObject("Msxml2.XMLHTTP.3.0"); }
      catch (e) {}
    try { return new ActiveXObject("Msxml2.XMLHTTP"); }
      catch (e) {}
    //Microsoft.XMLHTTP points to Msxml2.XMLHTTP.3.0 and is redundant
    throw new Error("This browser does not support XMLHttpRequest.");
  };

Web pages that use XMLHttpRequest or XMLHTTP can mitigate the current minor differences in the

implementations either by encapsulating the XMLHttpRequest object in a JavaScript wrapper, or by using an

existing framework that does so. In either case, the wrapper should detect the abilities of current implementation and

work within its requirements.


HTTP request

The following sections demonstrate how a request using the XMLHttpRequest object functions within a conforming

user agent based on the W3C Working Draft. As the W3C standard for the XMLHttpRequest object is still a draft,

user agents may not follow every behavior the W3C defines, and any of the following is subject to

change. Take extreme care when scripting with the XMLHttpRequest object across

multiple user agents. This article tries to list the inconsistencies between the major user agents.

The open method

The HTTP and HTTPS requests of the XMLHttpRequest object must be initialized through the open method. This

method must be invoked prior to the actual sending of a request to validate and resolve the request method, URL,

and URI user information to be used for the request. This method does not assure that the URL exists or the user

information is correct. This method can accept up to five parameters, but requires only two, to initialize a request.

The first parameter of the method is a text string indicating the HTTP request method to use. The request methods

that must be supported by a conforming user agent, defined by the W3C draft for the XMLHttpRequest object, are

currently listed as the following. [24]

• GET (Supported by IE7+, Mozilla 1+)

• POST (Supported by IE7+, Mozilla 1+)

• HEAD (Supported by IE7+)

• PUT

• DELETE

• OPTIONS (Supported by IE7+)

However, request methods are not limited to the ones listed above. The W3C draft states that a browser may support

additional request methods at their own discretion.

The second parameter of the method is another text string, this one indicating the URL of the HTTP request. The

W3C recommends that browsers should raise an error and not allow the request of a URL with either a different port

or host URI component from the current document. [25]

The third parameter, a boolean value indicating whether or not the request will be asynchronous, is not a required

parameter by the W3C draft. The default value of this parameter should be assumed to be true by a W3C conforming

user agent if it is not provided. An asynchronous request ("true") will not wait on a server response before continuing

on with the execution of the current script. It will instead invoke the onreadystatechange event listener of the

XMLHttpRequest object throughout the various stages of the request. A synchronous request ("false") however will

block execution of the current script until the request has been completed, thus not invoking the onreadystatechange

event listener.

The fourth and fifth parameters are the URI user and password, respectively. These parameters are not required and

should default to the current user and password of the document if not supplied, as defined by the W3C draft.
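The parameter handling described above can be sketched as a small wrapper. The function openRequest and its options object are hypothetical names introduced for illustration; xhr is assumed to be any object exposing an XMLHttpRequest-style open method. The sketch is deliberately stricter than the draft, which allows browsers to accept additional request methods.

```javascript
// Request methods the article lists as required by the W3C draft.
const DRAFT_METHODS = ["GET", "POST", "HEAD", "PUT", "DELETE", "OPTIONS"];

// Hypothetical wrapper around open(): only method and URL are required.
function openRequest(xhr, method, url, options) {
  const opts = options || {};
  const verb = String(method).toUpperCase();
  if (DRAFT_METHODS.indexOf(verb) === -1) {
    throw new Error("Method not in the draft's required list: " + verb);
  }
  // The third parameter defaults to true (asynchronous) when omitted.
  const async = opts.async === undefined ? true : Boolean(opts.async);
  if (opts.user !== undefined) {
    // Fourth and fifth parameters: URI user and password.
    xhr.open(verb, url, async, opts.user, opts.password);
  } else {
    xhr.open(verb, url, async);
  }
  return xhr;
}
```

In a browser this might be called as openRequest(new XMLHttpRequest(), "GET", "/data.xml"), where the URL is a placeholder. As the text notes, a successful call does not mean the URL exists or that the credentials are valid.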

The setRequestHeader method

Upon successful initialization of a request, the setRequestHeader method of the XMLHttpRequest object can be

invoked to send HTTP headers with the request. The first parameter of this method is the text string name of the

header. The second parameter is the text string value. This method must be invoked for each header that needs to be

sent with the request. Any headers attached here will be removed the next time the open method is invoked in a W3C

conforming user agent.


The send method

To send an HTTP request, the send method of the XMLHttpRequest must be invoked. This method accepts a single

parameter containing the content to be sent with the request. This parameter may be omitted if no content needs to be

sent. The W3C draft states that this parameter may be any type available to the scripting language as long as it can be

turned into a text string, with the exception of the DOM document object. If a user agent cannot stringify the

parameter, then the parameter should be ignored.

If the parameter is a DOM document object, a user agent should assure the document is turned into well-formed

XML using the encoding indicated by the inputEncoding property of the document object. If the Content-Type

request header was not added through setRequestHeader yet, it should automatically be added by a conforming user

agent as "application/xml;charset=charset," where charset is the encoding used to encode the document.
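A sketch of sending string content, assuming form-encoded data: encodeForm and postForm are hypothetical helper names, and xhr is assumed to be an XMLHttpRequest-like object. Because the body here is a plain string rather than a DOM document, the sketch sets the Content-Type header explicitly through setRequestHeader after open and before send.

```javascript
// Encode an object's fields as application/x-www-form-urlencoded text.
function encodeForm(fields) {
  return Object.keys(fields)
    .map(function (key) {
      return (
        encodeURIComponent(key) + "=" + encodeURIComponent(String(fields[key]))
      );
    })
    .join("&");
}

// Open an asynchronous POST, attach the header, then send the body.
function postForm(xhr, url, fields) {
  xhr.open("POST", url, true);
  xhr.setRequestHeader("Content-Type", "application/x-www-form-urlencoded");
  xhr.send(encodeForm(fields));
}
```

The order matters: headers are attached only after the request has been initialized, and the single argument to send carries the content.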

The onreadystatechange event listener

If the open method of the XMLHttpRequest object was invoked with the third parameter set to true for an

asynchronous request, the onreadystatechange event listener will be automatically invoked for each of the

following actions that change the readyState property of the XMLHttpRequest object.

• After the open method has been invoked successfully, the readyState property of the XMLHttpRequest object

should be assigned a value of 1.

• After the send method has been invoked and the HTTP response headers have been received, the readyState

property of the XMLHttpRequest object should be assigned a value of 2.

• Once the HTTP response content begins to load, the readyState property of the XMLHttpRequest object should

be assigned a value of 3.

• Once the HTTP response content has finished loading, the readyState property of the XMLHttpRequest object

should be assigned a value of 4.

The major user agents are inconsistent in their handling of the onreadystatechange event listener.
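The state changes listed above can be observed with a listener along these lines. watchRequest is a hypothetical helper, and xhr is assumed to be an XMLHttpRequest-like object; the sketch only records the readyState values it sees and calls back once the value reaches 4.

```javascript
// Attach a listener that records each readyState and reports completion.
function watchRequest(xhr, onDone) {
  const seen = [];
  xhr.onreadystatechange = function () {
    seen.push(xhr.readyState);
    if (xhr.readyState === 4) {
      // Response content has finished loading.
      onDone(xhr, seen);
    }
  };
  return seen;
}
```

Given the inconsistencies between user agents, code should not depend on every intermediate value being reported; waiting for 4 is the conservative choice.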

The HTTP response

After a successful and completed call to the send method of the XMLHttpRequest, if the server response was valid

XML and the Content-Type header sent by the server is understood by the user agent as an Internet media type for

XML, the responseXML property of the XMLHttpRequest object will contain a DOM document object. In a conforming

user agent, another property, responseText, will contain the server's response as plain text, regardless of

whether or not it was understood as XML.
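A minimal sketch of choosing between the two properties, assuming a completed request: readResponse is a hypothetical helper that prefers the parsed document when one is available and falls back to the plain text otherwise.

```javascript
// Prefer the DOM document when the response was understood as XML,
// otherwise fall back to the raw text, which is always present.
function readResponse(xhr) {
  if (xhr.responseXML) {
    return { kind: "xml", value: xhr.responseXML };
  }
  return { kind: "text", value: xhr.responseText };
}
```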

See also

• Hypertext Transfer Protocol

• Representational State Transfer

• Ajax

External links

• Level 1 specification of the XMLHttpRequest object from W3C [26]

• Level 2 specification of the XMLHttpRequest object from W3C [27]

• Specification of the XMLHttpRequest object for Apple developers [28]

• Specification of the XMLHttpRequest object for Microsoft developers [29]

• Specification of the XMLHttpRequest object for Mozilla developers [30]

• Specification of the XMLHttpRequest object for Opera developers [31]


• "Attacking AJAX Applications" [32] , a presentation given at the Black Hat security conference. Discusses several

issues involving XHR and the future of cross-domain AJAX.

References

[1] "XMLHttpRequest object explained by the W3C Working Draft" (http://www.w3.org/TR/XMLHttpRequest/). W3.org. Retrieved 2009-07-14.

[2] "The responseXML attribute of the XMLHttpRequest object explained by the W3C Working Draft" (http://www.w3.org/TR/XMLHttpRequest/#responsexml). W3.org. Retrieved 2009-07-14.

[3] "The responseText attribute of the XMLHttpRequest object explained by the W3C Working Draft" (http://www.w3.org/TR/XMLHttpRequest/#responsetext). W3.org. Retrieved 2009-07-14.

[4] "Article on the history of XMLHTTP by an original developer" (http://www.alexhopmann.com/xmlhttp.htm). Alexhopmann.com. 2007-01-31. Retrieved 2009-07-14.

[5] "Specification of the IXMLHTTPRequest interface from the Microsoft Developer Network" (http://msdn.microsoft.com/en-us/library/ms759148(VS.85).aspx). Msdn.microsoft.com. Retrieved 2009-07-14.

[6] Dutta, Sunava (2006-01-23). "Native XMLHTTPRequest object" (http://blogs.msdn.com/ie/archive/2006/01/23/516393.aspx). IEBlog. Microsoft. Retrieved 2006-11-30.

[7] "Specification of the nsIXMLHttpRequest interface from the Mozilla Developer Center" (https://developer.mozilla.org/en/nsIXMLHttpRequest). Developer.mozilla.org. 2008-05-16. Retrieved 2009-07-14.

[8] "Specification of the nsIJSXMLHttpRequest interface from the Mozilla Developer Center" (https://developer.mozilla.org/en/NsIJSXMLHttpRequest). Developer.mozilla.org. 2009-05-03. Retrieved 2009-07-14.

[9] "Specification of the XMLHttpRequest object from the Mozilla Developer Center" (https://developer.mozilla.org/en/XmlHttpRequest). Developer.mozilla.org. 2009-05-03. Retrieved 2009-07-14.

[10] "Version history for the Mozilla Application Suite" (http://www.mozilla.org/releases/history.html). Mozilla.org. Retrieved 2009-07-14.

[11] "Downloadable, archived releases for the Mozilla browser" (http://www-archive.mozilla.org/releases/). Archive.mozilla.org. Retrieved 2009-07-14.

[12] "Archived news from Mozillazine stating the release date of Safari 1.2" (http://weblogs.mozillazine.org/hyatt/archives/2004_02.html). Weblogs.mozillazine.org. Retrieved 2009-07-14.

[13] "Press release stating the release date of Opera 8.0 from the Opera website" (http://www.opera.com/press/releases/2005/06/16/). Opera.com. 2005-04-19. Retrieved 2009-07-14.

[14] Soft-Info.org. "Detailed browser information stating the release date of iCab 3.0b352 from" (http://www.soft-info.org/browsers/icab-10109.html). Soft-Info.com. Retrieved 2009-07-14.

[15] "Specification of the XMLHttpRequest object from the Level 1 W3C Working Draft released on April 5th, 2006" (http://www.w3.org/TR/2006/WD-XMLHttpRequest-20060405/). W3.org. Retrieved 2009-07-14.

[16] "XMLHttpRequest W3C Working Draft 19 November 2009" (http://www.w3.org/TR/2009/WD-XMLHttpRequest-20091119/). W3.org. Retrieved 2009-12-17.

[17] "W3C Process Document, Section 7.4.2 Last Call Announcement" (http://www.w3.org/2005/10/Process-20051014/tr#last-call). W3.org. Retrieved 2009-12-17.

[18] Porteneuve, Christophe (2007). "9". in Daniel H Steinberg. Raleigh, North Carolina: Pragmatic Bookshelf. pp. 183. ISBN 1-934356-01-8.

[19] Chaffer, Jonathan; Karl Swedberg (2007). Learning jQuery. Birmingham: Packt Publishing. pp. 107. ISBN 978-1-847192-50-9.

[20] Chaffer, Jonathan; Karl Swedberg (2007). jQuery Reference Guide. Birmingham: Packt Publishing. pp. 156. ISBN 978-1-847193-81-0.

[21] "Specification of the XMLHttpRequest object from the Level 2 W3C Working Draft released on February 25th, 2008" (http://www.w3.org/TR/2008/WD-XMLHttpRequest2-20080225/). W3.org. Retrieved 2009-07-14.

[22] "XMLHttpRequest Level 2, W3C Working Draft 20 August 2009" (http://www.w3.org/TR/XMLHttpRequest2/). W3.org. Retrieved 2010-04-08.

[23] "Ajax Reference (XMLHttpRequest object)" (http://www.javascriptkit.com/jsref/ajax.shtml). JavaScript Kit. 2008-07-22. Retrieved 2009-07-14.

[24] "Dependencies of the XMLHttpRequest object explained by the W3C Working Draft" (http://www.w3.org/TR/XMLHttpRequest/#dependencies). W3.org. Retrieved 2009-07-14.

[25] "The "open" method of the XMLHttpRequest object explained by the W3C Working Draft" (http://www.w3.org/TR/XMLHttpRequest/#the-open-method). W3.org. Retrieved 2009-10-13.

[26] http://www.w3.org/TR/XMLHttpRequest/

[27] http://www.w3.org/TR/XMLHttpRequest2/

[28] http://developer.apple.com/internet/webcontent/xmlhttpreq.html

[29] http://msdn.microsoft.com/en-us/library/ms535874(VS.85).aspx

[30] https://developer.mozilla.org/en/XMLHttpRequest

[31] http://www.opera.com/docs/specs/opera9/xhr/

[32] http://www.isecpartners.com/files/iSEC-Attacking_AJAX_Applications.BH2006.pdf


XMLSocket

XMLSocket is a class in ActionScript which allows Adobe Flash content to use socket communication, via TCP

stream sockets. It can be used for plain text, although, as the name implies, it was made for XML. It is often used in

chat applications and multiplayer games.

Examples

ActionScript 2.0

For a simple Hello, World! application in ActionScript 2.0, you could use the code below:

var xmlSocket:XMLSocket=new XMLSocket();
// Send the greeting once the connection is established.
xmlSocket.onConnect=function() {
    xmlSocket.send(new XML("Hello, World!"));
}
// Print the echoed message, then close the socket.
xmlSocket.onXML=function(myXML) {
    trace(myXML.firstChild.childNodes[0].firstChild.nodeValue);
    xmlSocket.close();
}
xmlSocket.connect("localhost",8463);

This would result in the output window of the Flash IDE opening and displaying "Hello, World!", assuming that a

socket server was running on port 8463 of the local machine, and was echoing everything sent to it.

External links

• XML Sockets: the basics of multiplayer games [1] , gotoAndPlay Flash Tutorials

• XMLSocket Simplified [2] , Heliant Whitepaper for ActionScript

• Utilizing Flash Player XMLSockets for JavaScript applications [3]

• Palabre, Simple open source XML socket server for Flash written in python [4]

References

[1] http://www.gotoandplay.it/_articles/2003/12/xmlSocket.php

[2] http://www.heliant.net/~stsai/code/

[3] http://www.devpro.it/xmlsocket/

[4] http://palabre.gavroche.net


XPath

Paradigm: Query language
Appeared in: 1999
Developer: W3C
Stable release: 2.0 (January 23, 2007)
Major implementations: JavaScript, C#, Java
Influenced by: XSLT, XPointer
Influenced: XML Schema, XForms

XPath, the XML Path Language, is a query language for selecting nodes from an XML document. In addition,

XPath may be used to compute values (e.g., strings, numbers, or Boolean values) from the content of an XML

document. XPath was defined by the World Wide Web Consortium (W3C).

History

The XPath language is based on a tree representation of the XML document, and provides the ability to navigate

around the tree, selecting nodes by a variety of criteria. [1] In popular use (though not in the official specification), an

XPath expression is often referred to simply as an XPath.

Originally motivated by a desire to provide a common syntax and behavior model between XPointer and XSLT,

subsets of the XPath query language are used in other W3C specifications such as XML Schema and XForms.

Versions

There are currently two versions in use.

• XPath 1.0 became a Recommendation on 16 November 1999 and is widely implemented and used, either on its

own (called via an API from languages such as Java, C# or JavaScript), or embedded in languages such as XSLT

or XForms.

• XPath 2.0 is the current version of the language; it became a Recommendation on 23 January 2007. A number of

implementations exist but are not as widely used as XPath 1.0. The XPath 2.0 language specification is much

larger than XPath 1.0 and changes some of the fundamental concepts of the language such as the type system.

The most notable change is that XPath 2.0 has a much richer type system. [2] Every value is now a sequence (a single

atomic value or node is regarded as a sequence of length one). XPath 1.0 node-sets are replaced by node sequences,

which may be in any order.

To support richer type sets, XPath 2.0 offers a greatly expanded set of functions and operators.

XPath 2.0 is in fact a subset of XQuery 1.0. It offers a for expression which is a cut-down version of the "FLWOR"

expressions in XQuery. It is possible to describe the language by listing the parts of XQuery that it leaves out: the

main examples are the query prolog, element and attribute constructors, the remainder of the "FLWOR" syntax, and

the typeswitch expression.


See also

• XPath 1.0

• XPath 2.0

External links

• XPath syntax [3]

• XPath 1.0 specification [4]

• XPath 2.0 specification [5]

• What's New in XPath 2.0 [6]

References

[1] Article on xpath in techsoftcomputing.com

[2] XPath 2.0 supports atomic types, defined as built-in types in XML Schema, and may also import user-defined types from a schema. (http://www.techsoftcomputing.com)

[3] http://www.w3schools.com/XPath/xpath_syntax.asp

[4] http://www.w3.org/TR/xpath

[5] http://www.w3.org/TR/xpath20/

[6] http://www.xml.com/pub/a/2002/03/20/xpath2.html

XPath 2.0

XPath 2.0 is the current version of the XPath language defined by the World Wide Web Consortium, W3C. It

became a recommendation on 23 January 2007.

XPath is used primarily for selecting parts of an XML document. For this purpose the XML document is modelled as

a tree of nodes. XPath allows nodes to be selected by means of a hierarchic navigation path through the document

tree.

The language is significantly larger than its predecessor, XPath 1.0, and some of the basic concepts such as the data

model and type system have changed. The two language versions are therefore described in separate articles.

XPath 2.0 is used as a sublanguage of XSLT 2.0, and it is also a subset of XQuery 1.0. All three languages share the

same data model, type system, and function library, and were developed together and published on the same day.

Data model

Every value in XPath 2.0 is a sequence of items. The items may be nodes or atomic values. An individual node or

atomic value is considered to be a sequence of length one. Sequences may not be nested.
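The rule that sequences cannot be nested can be modelled with arrays. The function seq below is a hypothetical stand-in for the XPath comma operator, written only to illustrate the flattening; it is not part of any XPath API.

```javascript
// Build a sequence from items; any item that is itself a sequence
// is spliced in, so the result is always flat.
function seq(...items) {
  const out = [];
  for (const item of items) {
    if (Array.isArray(item)) {
      out.push(...seq(...item));
    } else {
      out.push(item);
    }
  }
  return out;
}

// Models the XPath expression (1, (2, 3), (), ((4))).
const nested = seq(1, seq(2, 3), seq(), seq(seq(4)));
// A single item behaves as a sequence of length one.
const single = seq("a");
```

In this model the first value contains the four items 1, 2, 3, 4 with no inner structure, and the empty sequence contributes nothing.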

Nodes are of seven kinds, corresponding to different constructs in the syntax of XML: elements, attributes, text

nodes, comments, processing instructions, namespace nodes, and document nodes. (The document node replaces the

root node of XPath 1.0, because the XPath 2.0 model allows trees to be rooted at other kinds of node, notably

elements.)

Nodes may be typed or untyped. A node acquires a type as a result of validation against an XML Schema. If an

element or attribute is successfully validated against a particular complex type or simple type defined in a schema,

the name of that type is attached as an annotation to the node, and determines the outcome of operations applied to

that node: for example, when sorting, nodes that are annotated as integers will be sorted as integers.

Atomic values may belong to any of the 19 primitive types defined in the XML Schema specification (for example,

string, boolean, double, float, decimal, dateTime, QName, and so on). They may also belong to a type derived from


one of these primitive types: either a built-in derived type such as integer or Name, or a user-defined derived type

defined in a user-written schema.

Type system

The type system of XPath 2.0 is noteworthy for the fact that it mixes strong typing and weak typing within a single

language.

Operations such as arithmetic and boolean comparison require atomic values as their operands. If an operand returns

a node (for example, @price * 1.2), then the node is automatically atomized to extract the atomic value. If the input

document has been validated against a schema, then the node will typically have a type annotation, and this

determines the type of the resulting atomic value (in this example, the price attribute might have the type decimal). If

no schema is in use, the node will be untyped, and the type of the resulting atomic value will be untypedAtomic.

Typed atomic values are checked to ensure that they have an appropriate type for the context where they are used:

for example, it is not possible to multiply a date by a number. Untyped atomic values, by contrast, follow a weak

typing discipline: they are automatically converted to a type appropriate to the operation where they are used: for

example with an arithmetic operation an untyped atomic value is converted to the type double.
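The mix of strong and weak typing can be sketched by modelling an atomic value as an object with a type and a value. The representation, the function names, and the short list of numeric types are assumptions made for this illustration; real XPath 2.0 arithmetic has more cases, including result types and operations on durations.

```javascript
// Types this sketch treats as numeric.
const NUMERIC_TYPES = ["decimal", "integer", "double", "float"];

// Prepare one operand for arithmetic with a number.
function arithmeticOperand(atom) {
  if (atom.type === "untypedAtomic") {
    // Weak typing: untyped values are converted to double.
    return { type: "double", value: Number(atom.value) };
  }
  if (NUMERIC_TYPES.indexOf(atom.type) === -1) {
    // Strong typing: a typed value of the wrong type is an error.
    throw new TypeError("Cannot use " + atom.type + " in this arithmetic");
  }
  return atom;
}

function multiply(a, b) {
  return arithmeticOperand(a).value * arithmeticOperand(b).value;
}

// @price * 1.2 where the price attribute was never schema-validated.
const untypedPrice = { type: "untypedAtomic", value: "10" };
const factor = { type: "decimal", value: 1.2 };
const total = multiply(untypedPrice, factor);
```

Passing a value typed as date to multiply raises an error instead of being converted, which mirrors the example in the text of multiplying a date by a number.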

Path expressions

The location paths of XPath 1.0 are referred to in XPath 2.0 as path expressions. Informally, a path expression is a

sequence of steps separated by the "/" operator, for example a/b/c (which is short for child::a/child::b/child::c). More

formally, however, "/" is simply a binary operator that applies the expression on its right-hand side to each item in

turn selected by the expression on the left hand side. So in this example, the expression a selects all the element

children of the context node that are named <a>; the expression child::b is then applied to each of these nodes,

selecting all the <b> children of the <a> elements; and the expression child::c is then applied to each node in this

sequence, which selects all the <c> children of these <b> elements.

The "/" operator is generalized in XPath 2.0 to allow any kind of expression to be used as an operand: in XPath 1.0,

the right-hand side was always an axis step. For example, a function call can be used on the right-hand side. The

typing rules for the operator require that the result of the first operand is a sequence of nodes. The right hand operand

can return either nodes or atomic values (but not a mixture). If the result consists of nodes, then duplicates are

eliminated and the nodes are returned in document order, an ordering defined in terms of the relative positions of

the nodes in the original XML tree.
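The behaviour of "/" as a binary operator can be sketched over a small in-memory tree. Everything here is an assumption for illustration: the node objects, the order field that records document order, and the helper names buildTree, child and slash are not part of any XPath API.

```javascript
// Build a tree from a nested spec, numbering nodes in document order
// (a pre-order walk).
function buildTree(spec, counter) {
  counter = counter || { next: 0 };
  const node = { name: spec.name, order: counter.next++, children: [] };
  (spec.children || []).forEach(function (childSpec) {
    node.children.push(buildTree(childSpec, counter));
  });
  return node;
}

// The axis step child::name, as a function from one node to a node list.
function child(name) {
  return function (node) {
    return node.children.filter(function (c) { return c.name === name; });
  };
}

// The "/" operator: apply the right-hand step to each item selected on
// the left, drop duplicate nodes, and return them in document order.
function slash(left, step) {
  const seen = new Set();
  const result = [];
  left.forEach(function (node) {
    step(node).forEach(function (n) {
      if (!seen.has(n)) {
        seen.add(n);
        result.push(n);
      }
    });
  });
  return result.sort(function (x, y) { return x.order - y.order; });
}

const root = buildTree({
  name: "root",
  children: [
    { name: "a", children: [{ name: "b", children: [{ name: "c" }, { name: "c" }] }] },
    { name: "a", children: [{ name: "b", children: [{ name: "c" }] }, { name: "x" }] },
  ],
});

// a/b/c evaluated with root as the context node.
const selected = slash(slash(child("a")(root), child("b")), child("c"));
```

The sketch covers only the case where the right-hand operand returns nodes; an operand that returns atomic values would skip the deduplication and ordering step.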

In many cases the operands of "/" will be axis steps: these are largely unchanged from XPath 1.0, and are described

in the article on XPath 1.0.

Other operators

Other operators available in XPath 2.0 include the following:


Operators                   Effect
+, -, *, div, mod, idiv     Arithmetic on numbers, dates, and durations
=, !=, <, >, <=, >=         General comparison: compare arbitrary sequences. The result is true if any pair of items, one from each sequence, satisfies the comparison
eq, ne, lt, gt, le, ge      Value comparison: compare single items
is                          Compare node identity: true if both operands are the same node
<<, >>                      Compare node position, based on document order
union, intersect, except    Compare sequences of nodes, treating them as sets, returning the set union, intersection, or difference
and, or                     Boolean conjunction and disjunction. Negation is achieved using the not() function.
to                          Defines an integer range, for example 1 to 10
instance of                 Determines whether a value is an instance of a given type
cast as                     Converts a value to a given type
castable as                 Tests whether a value is convertible to a given type

Conditional expressions may be written using the syntax if (A) then B else C.

XPath 2.0 also offers a for expression, which is a small subset of the FLWOR expression from XQuery. The

expression for $x in X return Y evaluates the expression Y for each value in the result of expression X in turn,

referring to that value using the variable reference $x.
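A few of these constructs can be modelled with arrays to show their semantics. The function names below are hypothetical, and sequences are represented as JavaScript arrays; the sketch covers general comparison, value comparison, the to operator and the for expression.

```javascript
// General comparison (=, <, ...): true if ANY pair of items, one from
// each sequence, satisfies the test.
function generalCompare(left, right, test) {
  return left.some(function (l) {
    return right.some(function (r) { return test(l, r); });
  });
}

// Value comparison (eq, lt, ...): compares single items only.
// (The real language returns an empty sequence when an operand is
// empty; this sketch does not model that case.)
function valueCompare(left, right, test) {
  if (left.length !== 1 || right.length !== 1) {
    throw new TypeError("Value comparison needs single items");
  }
  return test(left[0], right[0]);
}

// The "to" operator: 1 to 3 yields the sequence (1, 2, 3).
function range(from, to) {
  const out = [];
  for (let i = from; i <= to; i++) out.push(i);
  return out;
}

// for $x in X return Y: evaluate Y for each item of X and concatenate
// the results into one flat sequence.
function forExpr(X, Y) {
  return [].concat.apply([], X.map(Y));
}

const eq = function (a, b) { return a === b; };
const lt = function (a, b) { return a < b; };
const gt = function (a, b) { return a > b; };

// for $x in 1 to 3 return ($x, $x * 10)
const pairs = forExpr(range(1, 3), function (x) { return [x, x * 10]; });
```

One consequence of the "any pair" rule is that (1, 5) < 3 and (1, 5) > 3 are both true under general comparison, which is why the single-item operators exist as a stricter alternative.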

Function library

The function library in XPath 2.0 is greatly extended from the function library in XPath 1.0.

The functions available include the following:

Purpose                   Example functions
General string handling   lower-case, upper-case, substring, substring-before, substring-after, translate, starts-with, ends-with, contains, string-length, concat, normalize-space, normalize-unicode
Regular expressions       matches, replace, tokenize
Arithmetic                count, sum, avg, min, max, round, floor, ceiling, abs
Dates and times           adjust-dateTime-to-timezone, current-dateTime, day-from-dateTime, month-from-dateTime, days-from-duration, months-from-duration, etc.
Properties of nodes       name, node-name, local-name, namespace-uri, base-uri, nilled
Document handling         doc, doc-available, document-uri, collection, id, idref
URIs                      encode-for-uri, escape-html-uri, iri-to-uri, resolve-uri
QNames                    QName, namespace-uri-from-QName, prefix-from-QName, resolve-QName
Sequences                 insert-before, remove, subsequence, index-of, distinct-values, reverse, unordered, empty, exists
Type checking             one-or-more, exactly-one, zero-or-one


Backwards compatibility

Because of the changes in the data model and type system, not all expressions in XPath 2.0 have exactly the same

effect as in 1.0. The main difference is that XPath 1.0 was more relaxed about type conversion, for example

comparing two strings ("4" > "4.0") was quite possible but would do a numeric comparison; in XPath 2.0 this is

defined to compare the two values as strings using a context-defined collating sequence.
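The difference can be illustrated with two comparison functions. Both names are hypothetical, and JavaScript's built-in string ordering (by code unit) is used here only as a stand-in for the context-defined collating sequence, which may order strings differently.

```javascript
// XPath 1.0 style: both strings are converted to numbers first.
function greaterThanXPath1(a, b) {
  return Number(a) > Number(b);
}

// XPath 2.0 style: the strings are compared as strings.
// Assumption: JavaScript string ordering stands in for the collation.
function greaterThanXPath2(a, b) {
  return a > b;
}

// "10" > "9" separates the two behaviours.
const numeric = greaterThanXPath1("10", "9");
const textual = greaterThanXPath2("10", "9");
```

For the pair "10" and "9" the numeric comparison holds while the string comparison does not, because "1" sorts before "9"; an expression relying on the old behaviour would change its result unless the operands are converted explicitly.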

To ease transition, XPath 2.0 defines a mode of execution in which the semantics are modified to be as close as

possible to XPath 1.0 behavior. When using XSLT 2.0, this mode is activated by setting version="1.0" as an attribute

on the xsl:stylesheet element. This still doesn't offer 100% compatibility, but any remaining differences are only

likely to be encountered in unusual cases.

Support

Support for XPath 2.0 is still limited.

• For browser support, see Comparison of layout engines (XML).

External links

• XPath 2.0 specification [5]

• What's New in XPath 2.0 [6]

Xs3p

xs3p is an XSLT stylesheet that generates XHTML documentation from XML Schema Definition language (XSD)

schema.

xs3p requires an XSLT processor such as Xalan from the Apache Software Foundation. The results can generally be viewed with any browser that supports Cascading Style Sheets Level 2 (CSS2) and XHTML 1.0, such as Internet Explorer 5.5, Mozilla 1.0, Netscape 6 or Opera 5 (or later).

xs3p was developed by Project Titanium [1], Distributed Systems Technology Centre (DSTC) Pty Ltd., and is distributed under the Mozilla Public License (MPL). xs3p is used by both the Oxygen XML Editor and Stylus Studio to generate schema documentation, and a modified version of the stylesheet is included with this program.[2]

The DSTC website, which officially hosted the xs3p stylesheet, has since become unavailable. The stylesheet can be downloaded from the FiForms XML Definitions [3] project.

References

[1] http://titanium.dstc.edu.au/xml/xs3p/

[2] http://www.oxygenxml.com/forum/ftopic2027.html

[3] http://xml.fiforms.org/xs3p/

XSQL

XSQL combines the power of XML and SQL to provide a language- and database-independent means to store and retrieve SQL queries and their results.

Description

XSQL combines XML (Extensible Markup Language) and SQL (Structured Query Language) to provide a language- and database-independent means of storing SQL queries, clauses and query results. XSQL development is still in its infancy, and the project welcomes suggestions for improvement (especially in the form of patches).

The XSQL project currently has a DTD (Document Type Definition) that defines the structure of an XSQL document. Researchers are working on modifying the XML::Generator::DBI Perl module so that it can parse XSQL documents and provide a tree- and event-based API (Application Programming Interface) to their elements. These modifications are being submitted as patches to the module's maintainer, Matt Sergeant; the source code is therefore not hosted on the XSQL project site.

It is hoped that XSQL will provide an end-to-end solution for handling SQL in Perl (other languages can be supported if there is interest). Creating XSQL implementations in other languages would allow databases to support XML without any change to the database source code. The XSQL implementations can take care of turning XSQL into SQL and turning results into XSQL.
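The round trip described above can be sketched in Python. This is only an illustration under stated assumptions: the element names (queries, query, results, row), the helper function names and the sample people table are all hypothetical and are not taken from the XSQL DTD, and SQLite stands in for an arbitrary database.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical document layout; the real XSQL DTD is not reproduced here.
QUERY_DOC = (
    "<queries>"
    '<query name="adults">'
    "SELECT name, age FROM people WHERE age &gt;= 18 ORDER BY name"
    "</query>"
    "</queries>"
)


def sql_from_xml(doc, name):
    """Return the SQL text stored in the <query> element with the given name."""
    root = ET.fromstring(doc)
    for query in root.findall("query"):
        if query.get("name") == name:
            return query.text.strip()
    raise KeyError(name)


def rows_to_xml(cursor):
    """Serialize result rows as <results><row><column>value</column></row></results>."""
    columns = [description[0] for description in cursor.description]
    results = ET.Element("results")
    for row in cursor:
        row_element = ET.SubElement(results, "row")
        for column, value in zip(columns, row):
            ET.SubElement(row_element, column).text = str(value)
    return ET.tostring(results, encoding="unicode")


def run_demo():
    """Load a query from XML, run it on an in-memory database, return XML results."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE people (name TEXT, age INTEGER)")
    connection.executemany(
        "INSERT INTO people VALUES (?, ?)",
        [("Ann", 34), ("Bob", 12), ("Cy", 18)],
    )
    cursor = connection.execute(sql_from_xml(QUERY_DOC, "adults"))
    xml_text = rows_to_xml(cursor)
    connection.close()
    return xml_text


print(run_demo())
```

The database itself is untouched by the XML layer in this sketch: the query is read out of the XML document as ordinary SQL, and the XML result is built from the rows after the query has run.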

External links

• XSQL project website [1]

References

[1] http://xsql.sourceforge.net/

Article Sources and Contributors

Binary XML Source: http://en.wikipedia.org/w/index.php?oldid=353493919 Contributors: Chrisch, Cpl Syx, CyberSkull, Cybercobra, DSosnoski, Hervegirod, Hooperbloob, Joriki,

Jzhang2007, Mac D83, Mipadi, Ordinant, Pengo, Potato32, Qutezuce, Semog, Skrapion, Sneftel, Tbleier, The Anome, Thumperward, 44 anonymous edits

Business Process Definition Metamodel Source: http://en.wikipedia.org/w/index.php?oldid=349128636 Contributors: BPDM, Baudoin1, Diveintobpm, Ehheh, Goflow6206, Jpbowen, Lurp,

Sisyph, Tomdebevoise, 8 anonymous edits

CDATA Source: http://en.wikipedia.org/w/index.php?oldid=365608659 Contributors: Archer3, Barefootliam, CesarB, Ded.morris, Duke33, Ehn, ILikeThings, Luislobo, MC10, Mjb, Npowell,

Phluid61, PoliticalJunkie, Renesis, Rjwilmsi, Thickycat, WakiMiko, Wiml, 49 anonymous edits

CDuce Source: http://en.wikipedia.org/w/index.php?oldid=367963828 Contributors: AndrewGNF, Apokrif, Elonka, Elwikipedista, Frisch, Hans Adler, Jaxhere, Sourada, Stentie, The Thing

That Should Not Be, Trovatore, VoluntarySlave

Character entity reference Source: http://en.wikipedia.org/w/index.php?oldid=365999121 Contributors: ANONYMOUS COWARD0xC0DE, Bitnap, Clixus, DePiep, Derekread, Gazpacho,

Gdr, Jatkins, Koujimachi07, Loadmaster, M7, Martin451, Mhkay, Mjb, Mzajac, Oashi, Svick, Tokek, UU, 12 anonymous edits

CodeSynthesis XSD Source: http://en.wikipedia.org/w/index.php?oldid=333209308 Contributors: Boseko, Bunnyhop11, Csabo, Nicolas1981, Pedram.salehpoor, Soumyasch, 4 anonymous

edits

D3L Source: http://en.wikipedia.org/w/index.php?oldid=344098822 Contributors: Dawynn, Fabrictramp, Jackollie, Malcolma, Squids and Chips, Vgiasolli, 4 anonymous edits

Darwin Information Typing Architecture Source: http://en.wikipedia.org/w/index.php?oldid=357405956 Contributors: AlexSpurling, Andy Dingley, Biker JR, Bobdoyle, Bruce Esrig,

ChrisLott, Clayoquot, Cmsreview, Cschleifstein, Deathphoenix, DeweyQ, Dmccreary, Doug Bell, Elharo, Eslchip, Ghettoblaster, Hgkamath, Infoprosmktg, JDBravo, JamesBWatson,

JosebaAbaitua, Jwalling, Krusch, LCP, LeeHunter, Masiano, MatisseEnzer, Mhedblom, Mythobeast, Ndenison, Nozipedia, Ohnoitsjamie, Roberto999, Ru.spider, Sernauser, Sibersandi,

Skierpage, Terrillja, Toussaint, Tsemii, Walk Up Trees, Who, WissenVeredeln, Yorrose, 78 anonymous edits

DITA Open Toolkit Source: http://en.wikipedia.org/w/index.php?oldid=367972391 Contributors: Andy Dingley, Bobdoyle, Cander0000, Elwikipedista, Ewlyahoocom, Sernauser

Document Structure Description Source: http://en.wikipedia.org/w/index.php?oldid=344099967 Contributors: Amalas, Asser hassanain, Bunnyhop11, Dawynn, Dreftymac, Jerazol,

Kbdank71, Mamling, Minghong, Rene Mas, 8 anonymous edits

Document-Centric Source: http://en.wikipedia.org/w/index.php?oldid=319489730 Contributors: Canis Lupus, Jzhang2007, Malcolma, Oh Snap

Document-centric XML processing Source: http://en.wikipedia.org/w/index.php?oldid=363018500 Contributors: Aj00200, Gary King, Jzhang2007, LilHelpa, R'n'B, RJFJR, Victor Lopes, 3

anonymous edits

Dynamic XML Source: http://en.wikipedia.org/w/index.php?oldid=302412968 Contributors: Aboriginal Noise, Egpetersen, Filmackay, Malcolma, 1 anonymous edits

ECMAScript for XML Source: http://en.wikipedia.org/w/index.php?oldid=368395484 Contributors: AVRS, Aaronbrick, Ale jrb, Asqueella, Bobince, CesarB, Cybit, David Gerard, Deineka,

DonToto, Drdamour, Drukepple, Everyking, Ffangs, Ghettoblaster, Guppie, Herorev, Imroy, Intgr, Jasonglchu, Klondike, Kuteni, Mbini, Mysterd429, Niqueco, Onevalefan, Pfurla, Pointillist,

Schepers, Schristie, Shepard, Simonster, Spankman, Speck-Made, Tabletop, Vishnava, William Graham, WulfTheSaxon, Ysangkok, 83 anonymous edits

Efficient XML Interchange Source: http://en.wikipedia.org/w/index.php?oldid=335112487 Contributors: Biscuittin, Cybercobra, Darobin, Erechtheus, Hervegirod, Jeffhos, Pengo, Sdw,

TuukkaH, 10 anonymous edits

Embedded RDF Source: http://en.wikipedia.org/w/index.php?oldid=344100836 Contributors: 4th-otaku, Cander0000, Dawynn, Earle Martin, Iridescent, Keithalexander, Mathiastck, Mdd, O

keyes, Prodoc, Shepard, The Anome, Themfromspace, Ultimatewisdom, 1 anonymous edits

EpiDoc Source: http://en.wikipedia.org/w/index.php?oldid=331916994 Contributors: Bluemoose, Bpiche, El C, Gabrielbodard, Paregorios, Polon11, Tobias Bergemann, XPtr, 3 anonymous

edits

eXtensible Server Pages Source: http://en.wikipedia.org/w/index.php?oldid=171773630 Contributors: Honestcurio, John Vandenberg, Jutta234, 2 anonymous edits

Fast Infoset Source: http://en.wikipedia.org/w/index.php?oldid=363168067 Contributors: Beetstra, Doug Bell, Drano, Dreftymac, Ernstdehaan, Ghettoblaster, Gurch, Hervegirod, Iharjw,

JavaIsGroovy, Johndrinkwater, Jzhang2007, Ksn, Merlin12, Obiltschnig, Pelegri, Precious Roy, Prickus, Torc2, Tuntable, Tycoon de, Warreed, 35 anonymous edits

Global listings format Source: http://en.wikipedia.org/w/index.php?oldid=323960620 Contributors: Alvin Seville, Capnstank, 1 anonymous edits

GMX Source: http://en.wikipedia.org/w/index.php?oldid=297460141 Contributors: Azydron, Canadian, GEn3S!Z, GregorB, Ikar.us, Jared Preston, Malinaccier, Pegship, Radon210, 8

anonymous edits

GMX-V Source: http://en.wikipedia.org/w/index.php?oldid=325214239 Contributors: Azydron, Emeraude, 2 anonymous edits

Head-Body Pattern Source: http://en.wikipedia.org/w/index.php?oldid=332049421 Contributors: Duncharris, Pegship, RedWolf, Robertvan1, Timc, Uthbrian, Ynhockey, 3 anonymous edits

HyTime Source: http://en.wikipedia.org/w/index.php?oldid=334129102 Contributors: Andreas Kaufmann, Klimov, Mjb, Mosca, Onlyemarie, Sderose, Thumperward, 9 anonymous edits

Internationalization Tag Set Source: http://en.wikipedia.org/w/index.php?oldid=247890861 Contributors: Ghettoblaster, Sintaku, Ysavourel, 18 anonymous edits

Klip Source: http://en.wikipedia.org/w/index.php?oldid=359761468 Contributors: Bogrady, Diveloop, Gdrori, Melaen, SDC, Utcursch, Wizard191, Wykis, Xe7al, 16 anonymous edits

List of XML and HTML character entity references Source: http://en.wikipedia.org/w/index.php?oldid=365723538 Contributors: Adoniscik, Alerante, Andrew Carlssin, AxSkov, Beland,

BenjaminHare, Cbrunet, Christian75, Clixus, Cy21, DePiep, DmitTrix, ERcheck, Fudo, Gaius Cornelius, George Hernandez, Gerbrant, Happy-melon, Isaac Dupree, J4 james, Jatkins, Joejava,

John Vandenberg, Kf4yfd, Kieff, LiborX, Loadmaster, Mathtinder, Mhkay, Mindmatrix, Mjb, Monedula, NJJ.Rocher, Ohnoitsjamie, Phil Boswell, Psychonaut, Radon210, Reinyday, Reisio,

RetiredUser2, Ringbang, Rjwilmsi, Rwwww, SallyForth123, Sam 1123, Suruena, Tamfang, Tezza2k1, The Thing That Should Not Be, The wub, Thinboy00P, Tokek, TreasuryTag, Wavelength,

Wolf1728, Wwoods, 93 anonymous edits

Log4js Source: http://en.wikipedia.org/w/index.php?oldid=333453341 Contributors: Amux, Euchiasmus, Ian Moody, JLaTondre, Stritti, Wdflake, 5 anonymous edits

MAREC Source: http://en.wikipedia.org/w/index.php?oldid=352689127 Contributors: Hydrox, Mpgarnier, Ofalk, 13 anonymous edits

Media Object Server Source: http://en.wikipedia.org/w/index.php?oldid=282541918 Contributors: Chungkuo, The Anome, Theroachman, Xezbeth, 1 anonymous edits

METS Source: http://en.wikipedia.org/w/index.php?oldid=357913828 Contributors: Buiras, CBM, Charles Brooking, Davissp, DerHexer, Elonka, Grumpycraig, Isnow, Lyc. cooperi,

M4gnum0n, Nicolas1981, Paulerb, Rich Farmbrough, Sallyrenee, SchfiftyThree, Stf, Thryduulf, Trovatore, WilliamDenton, 15 anonymous edits

Numeric character reference Source: http://en.wikipedia.org/w/index.php?oldid=364363130 Contributors: ABCD, ANONYMOUS COWARD0xC0DE, Ahoerstemeier, Ajgorhoe, D99figge,

David H. Flint, DePiep, Gudeldar, Hytri, Indefatigable, Karl Dickman, Kjoonlee, LeoNomis, Million Moments, Mjb, Ringbang, Shlomital, TreasuryTag, Voidvector, 11 anonymous edits

Office Open XML Source: http://en.wikipedia.org/w/index.php?oldid=368502841 Contributors: AJRobbins, AVRS, Adi86, Adiel, Agentbla, AlbinoFerret, Ale2006, AlexHudson, Alexbrn,

Alexmaco, AlistairMcMillan, Aljullu, AllTheThings, Alvestrand, Amux, Ancheta Wis, Andrew J. MacDonald, AnonMoos, Ans, Arebenti, Arnieswap, ArnoldReinhold, Artw, Asbjornu, Atchom,

Avenue, BCable, Bbatsell, BeSherman, Beetstra, BenLanghinrichs, Bender235, Bento00, Biztalkguy, Blaisorblade, Blakkandekka, Bobblehead, Bobman52, Boing! said Zebedee, Booyabazooka,

BradC, Brucevdk, Brumle72, Bryan Derksen, Bull Market, Cahill1, Cander0000, Catskul, CattleGirl, CesarB, Cfauck, Charles Esson, Chealer, CheesePlease NL, Cheros, Chowbok,

Chuckhoffmann, Cibumamo, Clicketyclack, Cloud02, CodeNaked, Codyrank, CritterNYC, CyberSkull, D-Notice, DMacks, Damian Yerrick, Danfuzz, Danieldotcom, Dave souza, David Gerard,

DavidJ710, Davidprior, Delafield, Denis.labaye, DennyColt, DerHexer, Dguertin, Diamonddavej, Discospinster, Dockurt2k, Dolda2000, Donho, Dougofborg, Dovi, Downcreate, Dreftymac,

Dwheeler, Długosz, Earthsound, EatMyShortz, Ebyabe, Edschofield, Egandrews, Elagatis, Emurphy42, Etscrivner, Euchiasmus, EvenT, Evice, Existhigh, Feedmecereal, Fingerz, Fjarlq,

Fleminra, Froth, Fulldecent, Gabriella11758, Gabrielzorz, Gagravarr, Gakrivas, GangsterPanda, Garnwraly, Ghettoblaster, Gilliam, Greg L, GregorB, Guyjohnston, H2g2bob, HAl, HPSCHD,

HaeB, Hamish Lawson, Hankwang, HarryHenryGebel, Harumphy, Hebrides, Helpsloose, Herorev, Hervegirod, Herzen, HiDrNick, HorsePunchKid, Hu12, HubertRoksor, Ildefonso Giron, Innv,

Intgr, Iridescent, Ironiridis, Irperez, Isaac Dupree, ItsProgrammable, Iunaw, JLaTondre, Jac16888, Jacob Poon, JanusDC, Jeffmcneill, Jeltz, Jleedev, Jlovick, Joelpt, John Nevard, John of Reading,

John zhu, JohnOwens, Johndrinkwater, Joker1984, Joker2007, Jonathan888, Joshua Issac, Jstaniek, Jtnn, Juliancolton, Justin545, Juventas, Jynus, KAMiKAZOW, Kaern, Karada, Karnesky,

Kayano, Kedar damle, Kegart, Kenb215, Kenyon, Ketil, Khalid hassani, Khukri, KiloByte, Kilz, Klauys, Kneale, Kozuch, Kravietz, Kungfuadam, Latha P Nair, Laughton.andrew, Leandrod,

LeeHunter, Leotohill, Lester, Liftarn, Lisamh, Lulu of the Lotus-Eaters, MBisanz, MZMcBride, Mahanga, Marbux, Mardus, Masterpjz9, Mat macwilliam, Mateo LeFou, Mathias

Schindler, Mauro Bieg, Max Naylor, Mcld, Melomel, Mentaka, Merbenz, Micro01, Midnightcomm, Mipadi, Mitchoyoshitaka, Mmj, MonirTime, Mrand, Mratzloff, Mxn, NJA, Nberardi, Nbibler,

Nealmcb, NeutralPoint, Niemeyerstein en, Nigelj, Nil Einne, Nitesh.dubey, Nmagedman, Noloader, Octahedron80, Odie5533, Odoncaoa, Oggiejnr, Oneiros, Opium, Orrc, Osaeris, Oub, Pairadox,

Palfrey, Pandion auk, ParticleMan, Partyoffive, Paul Foxworthy, Paul1337, Pdfpdf, Peak Freak, Peashy, Perfect Proposal, Phil153, Piano non troppo, Pieterh, Piken, Piperh, Pixelface, PlainHolds,

Plopez339, PokeYourHeadOff, PonThePony, Praetor alpha, Promethean promise, Putt1ck, Quantumelfmage, R3m0t, RS Ren, Rafert, Rainwarrior, Ramdrake, Rasmus.p, Raul654, Rcandelori,

Reedy, RekishiEJ, Remiel, Reuqr, Rick Jelliffe, Rizox, Rjwilmsi, Rlmorgan, Robdurbar, RockMFR, Ronark, RossPatterson, Rursus, Ruud Koot, Ryuch, Régis Décamps, Salimfadhley, Scarian,

Scientus, Scisonic, Scj2315, Sdedeo, SeanDuggan, Segedunum, Seweso, Shd, Shir Khan, Shmget, SigmaEpsilon, Sigmundg, Signalhead, Simosx, Sir Anon, SkyWalker, Sladen, SmartWarthog,

Smartse, Soumyasch, Spartaz, Spitzak, SpuriousQ, Stang99gtv8, Stannered, Stephenchou0722, SteveSims, Stevenfruitsmaak, Stevenj, Subsume, Sumb, Superluser, Superm401, Svdb, Swiftdove,

Syncrosoft, TKD, Ta bu shi da yu, Tabletop, Tackit, TakuyaMurata, Tarmle, Tatoute, Tawker, Tayste, Tgape, The Anome, The Divine Fluffalizer, The Thing That Should Not Be, TheMadGerman,

Thelennonorth, Theonlyedge, Theosch, Thiseye, Thrapper, Thumperward, Tigernike1, Tiptoety, Tmpsantos, Todd Vierling, Tomdobb, Toolnut, Torfason, Towsonu2003, Tprit, Trails, TraxPlayer,

Tregoweth, Trevordevore, Ttiotsw, Tunah, Turlo Lomon, Tvhuang, Tvol, Ultramandk, Utcursch, Veinor, Verbal, Verdy p, Vexorian, Virtualt333, WalterGR, Warren, Webhat, West London

Dweller, Wheelybrook, WhiteCat, WiebeVanDerWorp, Wiki Raja, Wiki1959, WikiLaurent, Witoldp, Wmorein, Womble, Work permit, Wrightbus, WurmWoode, X-Bert, X-dark, Xpclient,

Xx521xx, Yellowdesk, Yesudeep, Yoonkit, Zayani, Zero0w, Zoobab, Zsvedic, 1036 anonymous edits

Office Open XML file formats Source: http://en.wikipedia.org/w/index.php?oldid=363884163 Contributors: Alvestrand, CommonsDelinker, Nigelj, Rjwilmsi, Verdy p, 3 anonymous edits

OIOXML Source: http://en.wikipedia.org/w/index.php?oldid=232294100 Contributors: Covergaard, JosefAssad, Part Deux, 2 anonymous edits

Open XML Paper Specification Source: http://en.wikipedia.org/w/index.php?oldid=368337454 Contributors: A.Ou, Akhristov, Alecamiga, Alexander Abramov, Ambarish, Azakea,

Benhutchings, Bfinn, Blicktek, Bokarevitch, Callidior, Chris Chittleborough, Chris the speller, Chuck Marean, CobraA1, Csiahistorian, Cwolfsheep, Cynical, DBrane, Danglobalgraph, David

Haslam, Dawnseeker2000, Digita, Etienne.navarro, Feedmecereal, Filemon, FleetCommand, Fleminra, Frap, Fritz Saalfeld, Gertyk, Ghettoblaster, Gioto, Gordonf, HAl, Hervegirod, Inarius,

JLaTondre, JanSöderback, Javalenok, Joaopaulo1511, Joelholdsworth, Joker1984, Jonhall, Jutiphan, Kpearce, Lasindi, Lboonsen, Leafnode, Lhammer610, LobStoR, LodesterreLLC, Maerk,

Marasmusine, Marcosw, Mathrick, Morris lin, Mpbailey, Msiebuhr, Mythobeast, Nihiltres, Nil Einne, Nixps, Objectivesea, Oneiros, Orderud, Owen Ambur, Paul A, Paulej, Pelago, Philippe,

PseudoSudo, Psiphiorg, Qef, Quiggles, RedAznor, Rjwilmsi, SURIV, SW2000, Seth Nimbosa, Simaocampos, Snailshoes, Soumyasch, Stephenchou0722, Sterrys, Sugeina, Superm401, Svick,

Thumperward, Todd Vierling, Tooki, TotoBaggins, Toussaint, TreasuryTag, Uzume, Voidxor, Warren, WatchAndObserve, Wikianon, Woohookitty, Wq-man, Xpclient, ZimZalaBim, 159

anonymous edits

PCDATA Source: http://en.wikipedia.org/w/index.php?oldid=360253969 Contributors: Chealer, Fæ, Lobner, Malcolma, Renata3, Winterheat, 4 anonymous edits

Plain Old XML Source: http://en.wikipedia.org/w/index.php?oldid=268137155 Contributors: Alynna Kasmira, Arto B, Atifmk, BrokenSegue, Bunnyhop11, Chalisa, Charivari, CondeNasty,

Djmackenzie, Dpm64, Emersoni, Evil Monkey, GermanX, Hoos-foos, Julesd, LittleDan, MarXidad, Mindmatrix, Minghong, Tantek, Thumperward, Toby Woodwark, ZayZayEM, 16 anonymous

edits

Portable Application Description Source: http://en.wikipedia.org/w/index.php?oldid=353334808 Contributors: Aaleksanyants, Bitsmith, Christopher.widdowson, Gesslein, Here, Jll, MER-C,

Pegship, RenegadeMinds, Riki, TheParanoidOne, 22 anonymous edits

Publishing Requirements for Industry Standard Metadata Source: http://en.wikipedia.org/w/index.php?oldid=367751412 Contributors: Malcolma, Mauro Bieg, Prismwg, Rettetast, Rich

Farmbrough

QName Source: http://en.wikipedia.org/w/index.php?oldid=319026346 Contributors: Amire80, Anthony Appleyard, Frap, Gurch, Jnutting512, Motine, Stezton, Zundark, 1 anonymous edits

QTI Source: http://en.wikipedia.org/w/index.php?oldid=361214744 Contributors: Alexcq, Bektur, Benscripps, Carnildo, ChristopheS, Fujnky, Gcm, Gimboid13, Grussak, Hammersmith38,

J04n, Ja6a, JimTittsler, Larham, Lastkaled, Lindsey Kuper, Olak Ksirrin, RobertG, Ruale, Staffordaz, The7thone1188, Ysangkok, 33 anonymous edits

Resource Description Framework Source: http://en.wikipedia.org/w/index.php?oldid=368011899 Contributors: 213.253.39.xxx, A5b, Acaciz, Akinyemi, Alcalazar, Alexius08,

AlistairMcMillan, Amire80, AnAj, Andy Dingley, Angela, Ankitasdeveloper, Anrie Nord, Arto B, Asqueella, Backoftheboat, Barticus88, Bawolff, BeakerK44, BernhardBauer, Blathnaid,

Blue.death, BobKeim, Booles, Broosty, C1932, Caoimhin, Carbuncle, Carlo.Ierna, Carmenutzadd, Cedringen, Chmod007, Cjcollier, Clan-destine, CloCkWeRX, Conversion script, Cygri, DRE,

DanBri, Dancter, Daniele Gallesio, Davemck, Deodar, Dmccreary, Donald Albury, Dpv, Dr Shorthair, Dtcdthingy, Earle Martin, EddyVanderlinden, Emperor, EoGuy, Erick.Antezana, Esprit15d,

Finell, Fleminra, FrankTobia, Fredrik, Funandtrvl, Ghettoblaster, Graham87, GregorB, Gyuri10, Haakon, Harrigan, Hetar, Hu12, Ian Spackman, IanDBailey, Ianalchemy, Jdthood,

JesseChisholm, Jhammerb, Joe Jarvis, John Vandenberg, JonHarder, Jonathan O'Donnell, Jpbowen, Kaihsu, Kbdank71, Khurrad, KimvdLinde, KingsleyIdehen, Kiranoush, Kku, Knavesdied,

Kwan, Langec, Liftarn, Lokatzis, Luk, Lysy, M3wiki1, Maduskis, Mandarax, Mark Renier, Mathiastck, Mauro Bieg, Mav, Mccaffry, Mdd, Mecanismo, Michael Hardy, MichaelBillington,

Michal Nebyla, Midnight Madness, Minghong, Mjb, N2e, N3c, Nicolas1981, Nikevich, Niteowlneils, Nkour, Novum, Nux, Ojw, Onlyemarie, Pagatiponon, PatHayes, Pemboid, Pete142, Piet

Delport, Pointillist, Pvosta, RaymondYee, RedWolf, Roland2, RossPatterson, Rursus, SEWilco, SMcCandlish, SamuelScarano, Sanxiyn, Sapoguapo, Schandi, Sdorrance, Securiger,

ShaunMacPherson, Shepard, Shermanmonroe, Shinkolobwe, Sibersandi, Sina2, Smalljim, Soumyasch, Sstair, StWeasel, SteinbDJ, StephenReed, Stevertigo, Stoni, Stw, TNLNYC, Tezza2k1, The

Anome, TikaKino, Tomlzz1, Toussaint, Triadic2000, Trixter, Turnstep, Ultimatewisdom, Universimmedia, Uriyan, Venullian, Vsddkjn, Wavelength, Wesleyneo, Wiki alf, WojPob, Xezbeth,

Yaron K., Yitzhak, 217 anonymous edits

Resources of a Resource Source: http://en.wikipedia.org/w/index.php?oldid=252504394 Contributors: GregorB, Jjordanpedia, NawlinWiki, Pearle, Robocoder, 8 anonymous edits

Reverse Ajax Source: http://en.wikipedia.org/w/index.php?oldid=354518378 Contributors: Agentscott00, Anaraug, Brest, CarlManaster, CometGuru, Damiens.rf, Fadookie, FatalError,

Furrykef, Gregdan, In side the pc, Inquisitus, Jacobolus, Jwoodger, Kalan, Kdknigga, MrOllie, MuffledThud, Pohta ce-am pohtit, Psilya, Sleepyhead81, Sprocketonline, Stefan Hintz, Ødipus sic,

52 anonymous edits

Root element Source: http://en.wikipedia.org/w/index.php?oldid=292478129 Contributors: Ferkelparade, Malcolma, Mike the k, Nigelj, Pegship, RJFJR, Rich Farmbrough, Robertvan1,

Sardine, 6 anonymous edits

Schematron Source: http://en.wikipedia.org/w/index.php?oldid=345263915 Contributors: Aqueenan, Bunnyhop11, Canadabear, Chsimps, Dmccreary, Dreftymac, Ghettoblaster, HoodedMan,

Hymek, JukoFF, Kbdank71, Korval, Modify, Nickcarr, Pnkrockr, Rjwilmsi, Samdutton, Securiger, Wellithy, Žiedas, 23 anonymous edits

Simple Outline XML Source: http://en.wikipedia.org/w/index.php?oldid=245061146 Contributors: CDV, Dreftymac, KennethJ, Krusch, Nfwu, Qu3a, Stevage, Tadman, Verdatum, 4

anonymous edits

Simple XML Source: http://en.wikipedia.org/w/index.php?oldid=290618000 Contributors: Codebytez, Danlev, Melab-1, Sydius, 8 anonymous edits

Streaming XML Source: http://en.wikipedia.org/w/index.php?oldid=313593995 Contributors: Clq, Deathy, Egpetersen, Fikus, Filmackay, Maustrauser, Neustradamus, Patdreams

Styled Layer Descriptor Source: http://en.wikipedia.org/w/index.php?oldid=345632921 Contributors: Beautyod, Ebyabe, Firsfron, Lars Washington, Lordsatri, Mabdul, Oskosk, SEWilco,

SheldonYoung, Vitomeuli, 4 anonymous edits

Topic (XML) Source: http://en.wikipedia.org/w/index.php?oldid=305325594 Contributors: Barticus88, Blathnaid, Clayoquot, Eleusis, Fool, Hbent, Lheuer, Pearle, Quaque, Treborbassett, Walk

Up Trees, 8 anonymous edits

Unique Particle Attribution Source: http://en.wikipedia.org/w/index.php?oldid=272335554 Contributors: Bunnyhop11, Frandsen, Politepunk, Rich Farmbrough, 2 anonymous edits

VTD-XML Source: http://en.wikipedia.org/w/index.php?oldid=356464294 Contributors: AnmaFinotera, Beefyt, CamTarn, CambridgeBayWeather, EurekaLott, FayssalF, Greatestrowerever,

Hervegirod, Hut 8.5, Jacosi, Jzhang2007, Katieh5584, LilHelpa, Paul8046, Pegship, Raise exception, Rjwilmsi, Rookkey, Switchercat, Toohool, Torc2, UncleDouggie, םודנר, 187 anonymous

edits

X-expression Source: http://en.wikipedia.org/w/index.php?oldid=272451914 Contributors: Dragentsheets, Greenrd, JLaTondre, 1 anonymous edits

XBRLS Source: http://en.wikipedia.org/w/index.php?oldid=338054643 Contributors: Blowdart, CharlesHoffman, Glennfcowan, Lancet75, Niente21, Pohta ce-am pohtit, 3 anonymous edits

Xdos Source: http://en.wikipedia.org/w/index.php?oldid=352934824 Contributors: Dawynn, FreeKresge, Malcolma, Pearle, Salad Days, Smthng2sav, Tinucherian, 3 anonymous edits

XDR Schema Source: http://en.wikipedia.org/w/index.php?oldid=335801797 Contributors: Aaaidan, Abelson, Greenrd, Jonnie d smith, Sergey.Radkevich, 2 anonymous edits

XEE (Starlight) Source: http://en.wikipedia.org/w/index.php?oldid=245059907 Contributors: Elblanco, Malcolma, Ratarsed, 1 anonymous edits

XEP Source: http://en.wikipedia.org/w/index.php?oldid=359239404 Contributors: Msulyaev, Odo1982, Toddst1, Zundark

XML Source: http://en.wikipedia.org/w/index.php?oldid=367508169 Contributors: .:Ajvol:., 207.172.11.xxx, 213.253.39.xxx, 24ten, AHMartin, AThing, Aadaam, Actam, AdamCarden, Adeio,

Ahabr, Ahkond, Ahoerstemeier, Aitias, Ajcumming, Aklauss, Aksi great, Alan Liefting, Alansohn, Alexbrn, AlistairMcMillan, Allkeyword, Amire80, AndersFeder, Andrisi, Angeltoribio, Ani td,

Ankitasdeveloper, Anna Lincoln, Anon lynx, AnonMoos, Anti stupidity, Anu-43, Aomarks, Asqueella, Asteiner, Asymmetric, Atanveer9, AzaToth, B4hand, Barek, Barticus88, Bdesham,

Beetstra, Belamp, Bernd in Japan, BertSen, Bevo, Bhadani, Biezl, BigFatBuddha, Bissinger, Bje2089, Blinklmc, Bluemoose, BlurTento, Bobdc, Bobianite, Boehm, Bonbayel, Bonethugnd, Booles,

BorgQueen, Borgdylan, Boseko, BrianCully, Brick Thrower, Brighterorange, Brion VIBBER, Bryan Derksen, Brz7, Bunnyhop11, Burschik, Businessman332211, Bvajet,

C.M.Sperberg-McQueen, CLD, CambridgeBayWeather, Cameltrader, Can't sleep, clown will eat me, CanadianLinuxUser, Caomhin, CapitalSasha, Carewolf, CarlHewitt, Cbdorsett, Cels2, Centrx,

Charivari, Chininazu12, ChongDae, Chowbok, Chris 73, Chris Roy, Chrislk02, Chrisnewell, ChristopheS, Chzz, Cipherynx, Clayoquot, ClementSeveillac, CoSort2007, Coconut99 99, Cody5,

Colonies Chris, Comesuntbob, Contraverse, Conversion script, CptAnonymous, Crosstowns, Cspan64, Cybercobra, D6, DKEdwards, Da monster under your bed, Dan100, DanConnolly, Daniel

Olsen, Daniel.Cardenas, DanielVonEhren, DarkFalls, Darkfred, David spector, Davis685, Dcattell, Dcoetzee, DeadEyeArrow, Delcnsltmd, Deodar, Derek Ross, Derekread, Dicklyon, Dickpenn,

DigitalEnthusiast, Dingbats, Dino72, Dkrms, Dlohcierekim, Dlrohrer2003, Dolcecars, DominiqueHazaelMassieux, Donmay12, DopefishJustin, DoriSmith, DougBarry, Dpattison2007, Dpbsmith,

Dpm64, Dr Headgear, Dreftymac, Dthvt, Dullhunk, Dwheeler, Ebruchez, Edcolins, Edward Z. Yang, Efcavanaugh, Egandrews, Egil, Eisnel, ElBenevolente, Elharo, Ellmist, Elwikipedista,

EngineerScotty, Eranb, Ericjs, Erik Zachte, Erikdw, Eritain, Etu, Evaluist, Ewsers, Fang.zheng, Fantasticfears, FatalError, Feline Hymnic, Ferdinand Pienaar, Figure, Fleminra, FloatingMind,

Fnielsen, Folajimi, Fragglet, Fran Rogers, Francl, Frap, Freyr, Frisket, Fsolda, Furrykef, Fvw, GTBacchus, Gaius Cornelius, Gc9580, Gdrori, Geniac, Gennaro Prota, GentlemanGhost,

GeoffPurchase, Ghettoblaster, Giftlite, Gjlubbertsen, Gjs238, Glass of water, Glenn, Gogo Dodo, Golwengaud, GrEp, GraemeL, Graham, Greg Murray, Ground Zero, Grumpycraig, Gudeldar,

Haakon, Hairy Dude, Hannes Hirzel, Harold f, Hashar, Hervegirod, Hicketyhicketyhack, Highwayman65251, Hirzel, Hogman500, Hu12, Hurricane111, Hypertrek, Hyuri, IMSoP, Ian Moody,

IanBurrell, Iftikhar88hussaini, Ijmorlan, IlanaDavidi, Imars, Imjustmatthew, Int21h, Intgr, Iridescent, Isilanes, Itai, J.delanoy, JForget, JKing, JLaTondre, JPalonus, JRocketeer, Jackacon,

Jacobko, Jacobolus, JakobVoss, JamesBrownJr, Jao, Jargon64, Jauerback, JavaWoman, Jaxad0127, Jaxsam1, Jay, Jeenuv, Jeff G., Jeff3000, Jehzlau, Jerazol, Jesin,

Jhannah, Jibjibjib, Jilplo Haggins, Jimthing, Jmlipton, Joachim Wuttke, Joanjoc, John Vandenberg, JohnSmith777, JohnWhitlock, Johnmarkh, Johnwcowan, Joku, Jonabbey, Jonkerz,

Jonnyamazing, Jor, Jpbowen, Jshadias, Jzhang2007, Kai.Klesatschke, Kaldosh, Kamalakannanprogrammer, Kanags, Kapoing, Karderio, Karl Dickman, Katalaveno, Kbrose, Kc2idf,

Keithgabryelski, Kenmccallum, Kensall, Kevinconroy, Kgaughan, Kha0sK1d, KickAssClown, Kl4m, Klaws, Koavf, Korval, Krauss, Kubigula, Kx1186, LDiracDelta, Lambiam, Larala,

Lazynitwit, Lianmei, Liao, Lifefeed, Liftarn, Ligulem, Ling.Nut, LittleDan, Loveenatayal, Lumi71, Lycurgus, M.franceschet, M4gnum0n, MER-C, MK8, MaBoehm, Madir, Mah159, Mak

Thorpe, Manishtomar, Maoj-wsu-sp, Mark Renier, MarkSweep, Martijn faassen, Martin451, Martinp23, MartynDavies, Mathmo, Matthäus Wander, MaxEnt, Maximaximax, Maximus06,

Mayfare, Mbbradford, Mbell, Mcintyem, Mcorazao, Melab-1, Melon039, Meszigues, Mhkay, Michael Hardy, MichaelJanich, Miguelfms, Minghong, Mion, Miss Dark, Mjb, Mjpieters,

Mola8sses, Montgomery '39, Mp, Mr. Shoeless, Mr.Z-man, MrJones, MrOllie, Mrjmcneil, Ms2ger, Mthibault, Mvulpe, Mwtoews, Mww113, Mxn, NO ACMLM,AND XKEPPER SUCK !,

Nannus, Nanshu, Natasha2006, NawlinWiki, Neckro, Nemo bis, Netsnipe, Nicmila, Nigelj, Nikkimaria, Nile, Ninly, Niteowlneils, Nivaca, Nixeagle, Noldoaran, Nomediga, Norm mit, Nowa,

Nsh, Nwbeeson, Octane, Ogmios, Ohnoitsjamie, Okyea, OliD, OsamaK, Oscar-ja, Osquar F, OverlordQ, Oxblood, P3x984, PTSE, Patrick, Paul Foxworthy, PaulXemdli, Pavel Vozenilek,

Paxsimius, Peashy, Pelle, Pengo, PeteVerdon, Peterl, Pgk, Philip Trueman, Phluid61, Phoenix-forgotten, Phyzome, Pianohacker, Pikiwyn, Pmberry, Poccil, Porges, Pozcircuitboy, Prakash

Nadkarni, Prodoc, Quarl, Quasipalm, Quiddity, Quilokos, Ramesses the Great, Rbonvall, Rbstimers, Rdmsoft, Red660, RedWolf, Redherring, Reinthal, Remy B, RenniePet, Rich Farmbrough,

RichMorin, Richalex2010, Rick Block, Rick Jelliffe, RickBeton, Risi, Ritvikbhatnagar1, Rivecoder, Rje, Rjstott, Rjwilmsi, Rklawton, Robert K S, Robert Merkel, Robinjwest, Robomaeyhem,

Rodney Boyd, Roger costello, Rory096, RoseParks, Rr2bwreain, Rror, Rvmolen, Ryanrs, Sam Hocevar, SamHathaway, SandiCastle, Sandius, Saqib, Saucepan, Sbvb, Schnolle, Scjessey, Scott

MacLean, Scottielad, Sderose, Seanhan, Seidenstud, Semper discens, Sen Mon, ShaneCavanaugh, Shanes, Shibboleth, Shii, Shinkolobwe, Shizhao, Shlomital, SickTwist, Signsofstatic, Simetrical,

SivaKumar, Sj, Sjc, Sleepyhead81, Smyth, Sosinfo, Sound effx, Spankman, Spe88, Spudstud, SqueakBox, Stefan.ciobaca, Stephen Gilbert, Steve R Barnes, SteveRwanda, Stevy76, StewartMH,

Stf, Stijn Vermeeren, Stupiddestyredgasd, Stwalkerster, Superm401, Suruena, Suwayya, Svetovid, Syangtar, Sydius, TPK, Tagith, Taknik, Talktovalentine, TastyPoutine, Technopilgrim, Teddyb,

Terjen, Terrifictriffid, Terrycojones, Thadius856, The Thing That Should Not Be, TheMightyOrb, Thierryc, Think777, Thumperward, Thunderhead, TimBray, TimR, Timc, Timur.shemsedinov,

Tobias Bergemann, Todd Vierling, Tony1, ToonArmy, Topbanana, Toussaint, Trade2tradewell, Trankin, Traroth, Treekids, Trovatore, Trscavo, Tsunaminoai, Turnstep, TwoOneTwo, Twocs,

Typhoonhurricane, Typochimp, UkPaolo, Unforgettableid, Unixxx, Unknown W. Brackets, Vaganyik, Varlaam, Versageek, Vespristiano, Vigilius, Violetriga, Vladkornea, Vojta, Volphy,

WSU-AW-AK, Waskage, Wavelength, Wellithy, Wereon, Whale plane, Whkoh, Wickorama, Wiki alf, Wiki0709, Wikilibrarian, Wmahan, WojPob, Woohookitty, Wrs1864, Wulfila, Ww,

XJamRastafire, Xompanthy, Xpclient, Yaronf, Ygramul, Yonkie, Zhaolei, Zoeb, Zootm, Олександр Кравчук, 1175 anonymous edits

XML and MIME Source: http://en.wikipedia.org/w/index.php?oldid=359215170 Contributors: Crowne, Ellymelly, Hawky, John Vandenberg, Mgungora, O keyes, Roger costello,

ShakespeareFan00, SpK, Typhoonhurricane, Wdflake, Wrs1864, 8 anonymous edits

XML appliance Source: http://en.wikipedia.org/w/index.php?oldid=358713658 Contributors: AJR, Abesford, Alfe, Biot, Bunnly, Bunnyhop11, Comindico, CommonsDelinker, Darraghs,

Dmccreary, Glace, Haakon, Hoagtim, Hughser, Iamrohit, Irishguy, Isotope23, Jbromhead, JonHarder, Jpbowen, Julesd, Kakarrott64, Kmorozov, L200817s, Layer7, Layer7tech, Lisfire, Lsonne,

Martpol, MinorContributor, Ohthelameness, Reedy, Sherool, Sreekesh, Staffwaterboy, Stephen Compall, Tcramer1234, Vikingforties, 30 anonymous edits

XML Base Source: http://en.wikipedia.org/w/index.php?oldid=333780510 Contributors: Anrie Nord, Fullstop, Furrykef, Pegship, Suruena, TimBray, Toussaint, Utcursch, 2 anonymous edits

XML Catalog Source: http://en.wikipedia.org/w/index.php?oldid=350443768 Contributors: Abcoates, Alex.g, Markup854, Nate1481, RickBeton, TubularWorld, 4 anonymous edits

XML Certification Program Source: http://en.wikipedia.org/w/index.php?oldid=365135620 Contributors: Melon039, Michel7789, Sykamoore, WestCity, 26 anonymous edits

XML Configuration Access Protocol Source: http://en.wikipedia.org/w/index.php?oldid=367225868 Contributors: Calment, Kbrose, Mondoblu, R'n'B, 9 anonymous edits

XML Control Protocol Source: http://en.wikipedia.org/w/index.php?oldid=294538733 Contributors: Asbjornu, Malcolma, Melab-1, Mild Bill Hiccup, Salmar

XML data binding Source: http://en.wikipedia.org/w/index.php?oldid=362549516 Contributors: Beetstra, Biehl, Boseko, Cander0000, Coconut99 99, DSosnoski, Doug Bell, Drrngrvy,

Dsevilla, Emerks, Eshear, Jnutting512, Khookguy, Liempt, Miami33139, MrOllie, Mrflip, Nskhan84, Objsys, Payxystaxna, Poccil, Precious Roy, RedWolf, Redvers, Robert van Engelen,

Sebastian.Dietrich, Simon sprott, SprottS, Squash, Stephen B Streater, Teeks99, Tirkfl, Trident job, Venango, Virgiltrasca, Wavelength, Yourfired101, 86 anonymous edits

XML database Source: http://en.wikipedia.org/w/index.php?oldid=366561321 Contributors: 16x9, AJackl, Abukaspar, Adrianwn, Amirfr, Andionita, Arnabdotorg, Barefootliam, Belovedfreak,

Bernd vdB old, Bohumir Zamecnik, Bradjamesbrown, Brick Thrower, Bunnyhop11, Ccouvrette, ChristianGruen, Colonies Chris, CorcaighAbu, DickieRose, Dilane, Dizzzz, Dmccreary,

Doclabyrinth, DoriSmith, Edward C. Zimmermann, Eedeebee, Enric Naval, Epbr123, EricBloch, GVogeler, Glen Pepicelli, Gpallis, Gregburd, Happygiraffe, Hgkamath, Hobartimus, Joerg84,

John Vandenberg, Johndbritton, Juansempere, Jzhang2007, Klingon, Kmorozov, Kokotero, Lamdk, Libcub, Mdd, Metaperl, Michael Slone, MiddleEarth, Nichtich, Nikkimaria, OlliX, Pearle,

Pedant17, Philip Trueman, Playmobilonhishorse, Radim Baca, Rastgoo, Rayngwf, Rjwilmsi, Rtweed1955, Signalhead, Slakr, Snodnipper, Stevertigo, Sykamoore, TRosenbaum, Tbradford,

Terrifictriffid, Thumperward, Tide rolls, Touko vk, Xmlchamp, Xpriori, Xshezang, Xxanthippe, 216 anonymous edits

XML editor Source: http://en.wikipedia.org/w/index.php?oldid=358875951 Contributors: Alcalazar, Asqueella, Booles, Cedric dlb, Cinnamon42, Clayoquot, Damien1, DirkvdM, Dulciana,

Efcavanaugh, Egandrews, Furrykef, GeoffPurchase, Geralds, Icairns, Julesd, Korval, LeeHunter, Mark Richards, Mjb, Mzajac, Nabeth, Owens1, Ownlyanangel, Quasipalm, RedWolf, Remuel,

Richardmtl, Saqib, Sernauser, SimonP, Sjoerd visscher, Skreyola, Spankman, Srbauer, Swaq, Thv, Tobias Bergemann, Wrs1864, 72 anonymous edits

XML Enabled Directory Source: http://en.wikipedia.org/w/index.php?oldid=291296534 Contributors: Chowbok, EagleOne, Kdz, Melab-1, MerryMorris, 3 anonymous edits

XML Encryption Source: http://en.wikipedia.org/w/index.php?oldid=354384058 Contributors: Alekseysanin, ArnoldReinhold, AutumnSnow, Cuonghuyto, Gudeldar, Jc3s5h, Mabdul, Ntsimp,

Pmerson, Samsara, Sverdrup, Westenra, Wrs1864, 15 anonymous edits

XML Events Source: http://en.wikipedia.org/w/index.php?oldid=328519050 Contributors: Ahoerstemeier, Dmccreary, Dmyersturnbull, Dvunkannon, Ghettoblaster, Groupsixty, Hawky, I already forgot, Lev Matematik, Mathiastck, Pemboid, Reinthal, Risi, Rjwilmsi, Toussaint, Xaje, Zundark, 10 anonymous edits

XML framework Source: http://en.wikipedia.org/w/index.php?oldid=322139018 Contributors: Bunnyhop11, Byjg, Kateshortforbob, Libcub, 2 anonymous edits

XML Literals Source: http://en.wikipedia.org/w/index.php?oldid=297983431 Contributors: Biscuittin, Drilnoth, Highpitch, Maniamin, 1 anonymous edits

XML namespace Source: http://en.wikipedia.org/w/index.php?oldid=347806300 Contributors: Anthony Appleyard, Anwar saadat, AutumnSnow, CardinalDan, Detroit, Dpm64, Dreftymac,

Ear1grey, Eh kia, Ehn, Franl, Gagsie, Hairy Dude, I am neuron, Ilyanep, ImperfectlyInformed, Juanpablosoto, Korval, Mabdul, Mhkay, Nigelj, Pitoutom, Reinthal, Robina Fox, Sciurinæ,

Sourcejedi, SuperHamster, The.Modificator, TimBray, TubularWorld, 24 anonymous edits

XML Pretty Printer Source: http://en.wikipedia.org/w/index.php?oldid=349425356 Contributors: Ashburnite, BackToThePast, KeithTyler, Malcolma, Oneiros, Tiberiusgrant, 4 anonymous edits

XML Protocol Source: http://en.wikipedia.org/w/index.php?oldid=272452094 Contributors: ClementSeveillac, Imjustmatthew, Longhair, Pegship

XML schema Source: http://en.wikipedia.org/w/index.php?oldid=340715184 Contributors: ABCD, Acdx, Ahoerstemeier, Alik Kirillovich, AutumnSnow, Beetstra, Bunnyhop11, Cbdorsett, Choster, Crystallina, Derekread, Dongwon, Doug Bell, Dreftymac, Ehn, Fryed-peach, Gardenstew, Hervegirod, Hymek, Jamelan, Jaxsam1, Korval, Krauss, Kucing, Mamling, MariahX, Mark Renier, MarkSweep, Mhkay, Minghong, Mjb, Ninly, Pi8ch, Pmerson, Poccil, Pxma, Rich Farmbrough, Runnerupnj, SheepNotGoats, Smyth, Stevage, SteveLoughran, Tobias Bergemann, Vernanimalcula, Vishrave, Wellithy, Xan 213, Þjóðólfr, 51 anonymous edits

XML Schema Editor Source: http://en.wikipedia.org/w/index.php?oldid=364480930 Contributors: Bunnyhop11, Ched Davis, Egandrews, Fabrictramp, Gsgsgsgs, Kostmo, Pjcwikip, Rhubbarb,

Rklear, Simon sprott, 12 anonymous edits

XML Schema Language Comparison Source: http://en.wikipedia.org/w/index.php?oldid=349051677 Contributors: Ahoerstemeier, Bunnyhop11, Cfeet77, Crystallina, Decrease789, Dongwon,

Dreftymac, Ghettoblaster, Giraffedata, Grumpycraig, Hsivonen, Jlowery, Korval, Penter ghost, Q Chris, Sloop Jon, Sześćsetsześćdziesiątsześć, Tuntable, 31 anonymous edits

XML Studio Source: http://en.wikipedia.org/w/index.php?oldid=345963200 Contributors: Beetstra, Fabrictramp, Simon sprott, 7 anonymous edits

XML Telemetric and Command Exchange Source: http://en.wikipedia.org/w/index.php?oldid=368146406 Contributors: Briangregory2000, BuffaloChip97, Eyreland, GerryInColorado,

Iridescent, Jsafranek, Minizinim, Nasa-verve, O keyes, Pan Dan, Rich Farmbrough, SamFCooper, Timmerlj, 4 anonymous edits

XML template engine Source: http://en.wikipedia.org/w/index.php?oldid=362160823 Contributors: Akmg, Crystallina, FatalError, Ishnigarrab, JacekA, Krauss, Markjoseph sc, Mhkay,

MichaK, RHaworth, Radiant!, Rjwilmsi, Sanxiyn, Stevage, Stf, Tokek, 13 anonymous edits

XML tree Source: http://en.wikipedia.org/w/index.php?oldid=352933815 Contributors: Booyabazooka, Dawynn, Malcolma, Nagle, Tinucherian, Velle, WereSpielChequers

XML validation Source: http://en.wikipedia.org/w/index.php?oldid=361641348 Contributors: 3nx, Andy Dingley, David Haslam, Dawynn, Dreftymac, Drrwebber, EdJogg, Fnielsen, Hmains,

Hymek, Jaxsam1, Korval, Pmerson, Rich Farmbrough, Waacstats, 10 anonymous edits

XML-Enabled Networking Source: http://en.wikipedia.org/w/index.php?oldid=352338066 Contributors: Asparagus, Hybernator, Kakarrott64, Krbabu, Lsonne, MaxDel, Mbenna, Melab-1,

MinorContributor, 7 anonymous edits

XML-Retrieval Source: http://en.wikipedia.org/w/index.php?oldid=342675426 Contributors: DoriSmith, JudithWinter, Magioladitis, Nikkimaria

XMLHttpRequest Source: http://en.wikipedia.org/w/index.php?oldid=366820907 Contributors: .:Ajvol:., A3r0, Aditsu, Ahoerstemeier, Alaa.moustafa, Alansohn, Alcalazar, Alex Smotrov,

Alexandre Martins, Algae, Alphachimp, Anirvan, Apv, Arjun G. Menon, Artw, Bezenek, Blackdenimgumby, BobBagwill, Bobo192, Bovineone, CDV, Caged.danimal, CambridgeBayWeather,

CanisRufus, CapitalR, Catamorphism, Chealer, Christopherlin, Cic, Coffeeflower, DJ Rubbie, Damicatz, Dantman, Darklama, Delfuego, Digita, Dionyziz, Dirus, Discospinster, Djkenzie,

Downfromzero, Drano, Dsnell923, EatMyShortz, Ej0c, Eloi.sanmartin, Enyo, Eric B. and Rakim, Eve Teschlemacher, Fabiob, FatalError, Filipvr, Fred Bradstadt, Fromz, Furrykef, Gabrielsroka,

Gerbrant, Gilgamesh, Gilliam, Gimboid13, GraemeL, GregorB, Haza-w, Hondavice, Ignacio Javier Igjav, Isnow, J.delanoy, Jaray, Javalenok, Javawizard, Jaw959, Jdowland, Jeroldan, Jmabel,

John Vandenberg, Jriffel, Keelypavan, Khalid hassani, Kozuch, Krellis, Kugland, Lee J Haywood, LemonairePaides, Liberatus, Lindsay-mclennan, Locos epraix, Lupin, Macaldo, Maian,

Mamund, Manop, Marktmilligan, Marskind, Martin Hampl, Martnym, Masonbarge, Meand, Merc64, Metaeducation, Mindmatrix, Minghong, Mnot, Molily, Mrcs, Nickshanks, Nigelj,

Nightstallion, Niven, Nkour, Norm mit, Oeln, Ohgyun Ahn, Pcj, Pctopp, Ph0t0phobic, Phloopy, Pjakubo86, Pjdonnelly, Proton.mule, Quilokos, Ramu50, RedWolf, Reisio, Remember the dot,

Renku, RidinHood25, Ringbang, Rjwilmsi, Robert p levy, Rohan Jayasekera, Rufous, SalM, Sega381, Shamesspwns, Simon Lieschke, SineSwiper, Skeejay, Slant, Sleepyhead81, Spankman,

Speight, Stephen Morley, Suruena, SvartMan, Taka, TakuyaMurata, Tamlyn, Teiladnam, The Anome, TheJosh, Thedangerouskitchen, Thumperward, Timc, Timeroot, Timwi, Tolmaion, Twsx,

Urkle0, Vberger, VictorAnyakin, Vladogr, Wengier, White 720, WhiteHatLurker, Widgetguy, WikHead, Zippedmartin, Zoef1234, Zvn, Zzuuzz, ~K, 380 anonymous edits

XMLSocket Source: http://en.wikipedia.org/w/index.php?oldid=313909088 Contributors: Icktoofay, O keyes, Tomjenkins52, 1 anonymous edits

XPath Source: http://en.wikipedia.org/w/index.php?oldid=355507321 Contributors: Bitbit, Bunnyhop11, D.c.camero, Girlo2111, Gondooley, JLaTondre, Jasondburkert, Jeffz1, Mabdul,

Mathiastck, Mhkay, Ninly, Norro, Pgfearo, RSStockdale, Ringbang, Tibti, Walk Up Trees, 15 anonymous edits

XPath 2.0 Source: http://en.wikipedia.org/w/index.php?oldid=344816885 Contributors: Bunnyhop11, D.c.camero, Fredrik, Girlo2111, Gudeldar, Int19h, Jan.Sievers, K1Bond007, Lar, Mabdul,

Mathiastck, Mhkay, Roland Beker, Stevage, TheParanoidOne, Typhoonhurricane, Xiroth, 7 anonymous edits

Xs3p Source: http://en.wikipedia.org/w/index.php?oldid=352203304 Contributors: AriManninen, Ashburnite, Databases, Dawynn, Hysteria18, 4 anonymous edits

XSQL Source: http://en.wikipedia.org/w/index.php?oldid=362589846 Contributors: Bunnyhop11, Cander0000, Fatal!ty, HJWeng, Intgr, Legoktm, Levin, Melab-1, Tabletop, Xezbeth, 4 anonymous edits

Image Sources, Licenses and Contributors

Image:Klip-logo1.png Source: http://en.wikipedia.org/w/index.php?title=File:Klip-logo1.png License: unknown Contributors: User:Awille, User:Cydebot, User:Diveloop

Image:Log4js.png Source: http://en.wikipedia.org/w/index.php?title=File:Log4js.png License: GNU Free Documentation License Contributors: Stritti

Image:Log4JS-UML.jpg Source: http://en.wikipedia.org/w/index.php?title=File:Log4JS-UML.jpg License: GNU Free Documentation License Contributors: Stritti

Image:PARTSangles.jpg Source: http://en.wikipedia.org/w/index.php?title=File:PARTSangles.jpg License: Public Domain Contributors: Buiras

Image:METSdocument.jpg Source: http://en.wikipedia.org/w/index.php?title=File:METSdocument.jpg License: unknown Contributors: Buiras

Image:X-office-document.svg Source: http://en.wikipedia.org/w/index.php?title=File:X-office-document.svg License: unknown Contributors: Bdesham, Rocket000, Sasa Stefanovic

Image:X-office-presentation.svg Source: http://en.wikipedia.org/w/index.php?title=File:X-office-presentation.svg License: unknown Contributors: Linuxerist, Rocket000, Túrelio, 1 anonymous edits

Image:X-office-spreadsheet.svg Source: http://en.wikipedia.org/w/index.php?title=File:X-office-spreadsheet.svg License: unknown Contributors: Bdesham, Rocket000, Sasa Stefanovic

Image:Open Packaging Convention.png Source: http://en.wikipedia.org/w/index.php?title=File:Open_Packaging_Convention.png License: GNU General Public License Contributors: various

Image:DrawingML example.png Source: http://en.wikipedia.org/w/index.php?title=File:DrawingML_example.png License: Public Domain Contributors: Original uploader was Tuanese at en.wikipedia

Image:XPSIcon.png Source: http://en.wikipedia.org/w/index.php?title=File:XPSIcon.png License: unknown Contributors: Athaenara, Cristan, Joelholdsworth, Salavat, Sfan00 IMG, 2 anonymous edits

Image:Rdf graph for Eric Miller.png Source: http://en.wikipedia.org/w/index.php?title=File:Rdf_graph_for_Eric_Miller.png License: Attribution Contributors: W3C

Image:XML.svg Source: http://en.wikipedia.org/w/index.php?title=File:XML.svg License: Creative Commons Attribution-Sharealike 2.5 Contributors: AutumnSnow, Fryed-peach, JeffyP,

Jusjih, Karl Dickman, Latics, Platonides, SKvalen, Soeb, Verdy p, 3 anonymous edits

Image:Xml_text_editor.png Source: http://en.wikipedia.org/w/index.php?title=File:Xml_text_editor.png License: Public Domain Contributors: Damien1, 1 anonymous edits

Image:xml_graphical_editor.png Source: http://en.wikipedia.org/w/index.php?title=File:Xml_graphical_editor.png License: Public Domain Contributors: Damien1

Image:xml_wysiwyg_editor.png Source: http://en.wikipedia.org/w/index.php?title=File:Xml_wysiwyg_editor.png License: Public Domain Contributors: Damien1, 1 anonymous edits

Image:SimpleXsd Physical.png Source: http://en.wikipedia.org/w/index.php?title=File:SimpleXsd_Physical.png License: Creative Commons Attribution 3.0 Contributors: User:Simon sprott

Image:SimpleXsd Logical.png Source: http://en.wikipedia.org/w/index.php?title=File:SimpleXsd_Logical.png License: Creative Commons Attribution 3.0 Contributors: User:Simon sprott

Image:Tick-green.png Source: http://en.wikipedia.org/w/index.php?title=File:Tick-green.png License: Public Domain Contributors: Wesley Warren

Image:ScreenShot XsdEditor.png Source: http://en.wikipedia.org/w/index.php?title=File:ScreenShot_XsdEditor.png License: Creative Commons Attribution 3.0 Contributors: Simon sprott (talk). Original uploader was Simon sprott at en.wikipedia

Image:XTCE exchange.gif Source: http://en.wikipedia.org/w/index.php?title=File:XTCE_exchange.gif License: Public Domain Contributors: GerryInColorado

License

Creative Commons Attribution-Share Alike 3.0 Unported

http://creativecommons.org/licenses/by-sa/3.0/
