Professional Documents
Culture Documents
Chapter 5 - Introduction To XML
Chapter 5 - Introduction To XML
Fekade Getahun
1
What is XML - eXtensible Markup
Language?
A set of rules for defining and representing information as
structured documents for applications on the Internet; a
restricted form of SGML
8
What is XML?
Rule 1: Information is represented in units called XML
documents.
Rule 2: An XML document contains one or more
elements.
Rule 3: An element has a name, it is denoted in the
document by explicit markup, it can contain other
elements, and it can be associated with attributes.
9
An example of XML
<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to>aaucs_gs03@yahoogroups.com</to>
<from>satnafu@gmail.com</from>
<subject>CoSc 705 - Seminar: Article Review Presentation
Schedule</subject >
<body> The presentations of the seminar course are scheduled
from July 4 - 8, 2011. Please find attached the presentation
schedule</body>
</note>
10
Example of an XML document
<?xml version="1.0"?>
<catalog>
<product category = "mobile phone">
<mfg>Nokia</mfg><model>8890</model>
<description> Intended for EGSM 900 and GSM 1900 networks …
</description>
<clock setting= "nist" alarm = "yes"/>
</product>
<product category = "mobile phone">
<mfg>Ericsson</mfg><model>A3618</model>
<description>...</description>
</product>
...
</catalog>
11
Why XML?
Needs:
Simple, common rules that are easy to understand by people
with different backgrounds (like HTML)
Capability to describe Internet resources and their
relationships (like HTML)
Capability to define information structures for different
kinds of business sectors (unlike HTML, like SGML)
12
Why XML?
Needs (cont’d):
Format formal enough for computers and clear enough to be
human-legible (like SGML)
Rules simple enough to allow easy building of software
(unlike SGML)
Strong support for diverse natural languages (unlike SGML)
13
XML 1.0 fundamentals: Physical (storage) units
Each XML document has both a logical and a physical
structure.
Physically, XML document consists of one or more
entities
A “document entity” serves as the root
Entity may contain
external portion of DTD
parsed character data, which replaces any references to the entity
unparsed data
Entities located by URIs
XML provides a mechanism to impose constraints on the
storage layout and logical structure.
14
XML 1.0 fundamentals: Logical structure
XML document is composed of
XML declaration: <?xml version=”1.0”?>
Markup vocabulary: elements, attributes
<product category = "mobile phone">
<mfg>Nokia</mfg><model>8890</model>
... <clock setting = "nitz" alarm = "yes"/> ...
</product>
White space
Parsed and unparsed character data
Entity references &diagram;
Comments <!-- how interesting… -->
Processing instructions '<?' PITarget (S (Char* -
(Char* '?>' Char*)))? '?>'
`
15
XML 1.0 fundamentals: Logical structure
XML document contains
one or more elements, the boundaries of which are either
delimited by
start-tags and end-tags, or,
for empty elements, by an empty-element tag.
Each element has
a type, identified by name, sometimes called its "generic identifier"
(GI),
may have a set of attribute specifications
Each attribute specification has a name and a value
16
XML 1.0 fundamentals:DTD
Internal vs. external declarations
<!DOCTYPE greeting [<!ELEMENT greeting (#PCDATA)>]>
<greeting>Hello, world!</greeting>
<?xml version="1.0"?>
<!DOCTYPE greeting SYSTEM "hello.dtd">
<greeting>Hello, world!</greeting>
Root document type
17
XML 1.0 fundamentals:DTD
Element Type Declaration
The element be constrained using element type and attribute-list
declarations.
An element type declaration constrains the element's content.
Element type declarations often constrain which element types can appear
as children of the element.
Element Type Declaration
elementdecl ::= '<!ELEMENT' S Name S contentspec S>'
contentspec ::= 'EMPTY' | 'ANY' | Mixed | children
Example
<!ELEMENT br EMPTY>
<!ELEMENT p (#PCDATA|emph)* >
<!ELEMENT %name.para; %content.para; >
<!ELEMENT container ANY>
18
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE message [<!ELEMENT message (contents, from)> <!ELEMENT
contents (#PCDATA)>
<!ELEMENT from (#PCDATA)> ] >
<message>
<contents>First XML Page</contents>
<from>www.HiLCoE.com.et</from>
</message>
19
XML 1.0 fundamentals:DTD
Element Content
An element type has element content when elements of that type
must contain only child elements (no character data).
The constraint includes a content model, a simple grammar governing
the allowed types of the child elements and the order in which they
are allowed to appear.
The grammar is built on content particles (cps), which consist of
names, choice lists of content particles, or sequence lists of content
particles:
Element-content Models
children ::= (choice | seq) ('?' | '*' | '+')?
cp ::= (Name | choice | seq) ('?' | '*' | '+')?
choice ::= '(' S? cp ( S? '|' S? cp )* S? ')'
seq ::= '(' S? cp ( S? ',' S? cp )* S? ')'
20
XML 1.0 fundamentals:DTD
Example
<!ELEMENT category (mfg, model, description , clock?)>
<!ELEMENT description (#PCDATA | feature)*>
<!ELEMENT clock EMPTY>
<!ELEMENT spec (front, body, back?)>
<!ELEMENT div1 (head, (p | list | note)*, div2*)>
<!ELEMENT dictionary-body (%div.mix; | %dict.mix;)*>
21
XML 1.0 fundamentals: Attribute
declaration
Attributes are used to associate name-value pairs with elements
AttlistDecl ::= '<!ATTLIST' S EName AttDef* S '>'
AttDef ::= S Name S AttType S DefaultDecl
Example:
<!ATTLIST termdef id ID #REQUIRED name CDATA #IMPLIED>
<!ATTLIST list type (bullets|ordered|glossary) "ordered“>
<!ATTLIST form method CDATA #FIXED "POST">
22
XML 1.0 fundamentals
Conformance:
Well-formed:
syntactically correct tags
matching tags
nested elements
all entities declared before they are used
Valid:
well-formed
DTD + doctype matches DTD
unique IDs
no dangling IDREFs
23
XML 1.0 fundamentals
XML
document
application
24
XML family of languages
XML-related languages fall into the following
classes:
XML accessories
XML transducers
XML applications
25
XML family of languages: XML Accessory
Extends the capabilities specified in XML
Intended for wide, general use
Examples:
XML Names: to allow qualified names
Names ::= Name (S Name)*
XML Schema: extends the definition capabilities of
DTD
XPath: for addressing parts in XML documents
XLink: to create hyperlinks between resources
26
XML family of languages: XML Transducer
Converts XML input data into output
Associated with a processing model
Examples:
XSLT: intended for transforming XML documents
into other XML documents
CSS and XSL intended to produce an external
presentation from some XML data
XQuery: for querying
27
XML family of languages: XML Application
XML applications are languages which define constraints for a
class of XML data for some special application area
28
Naming conventions: XML namespaces
Provides a method for qualifying element and attribute
names so that name collisions can be avoided
Motivation: modularity and documentation
31
Naming conventions: XML namespaces
Collection of names, identified by a URI
No formal rules for defining names in a namespace
Example
Namespace: http://aau.edu.et
Element names: department, name, professor, student,
last_name, first_name, ...
Global attribute names: id, ...
Per-element-type attribute names: student: supervisor, ...
32
Naming conventions
Namespace declaration: defines a label (prefix) for the
namespace and associates it to the namespace
identifier (URI)
Qualified name: a namespace prefix and a local part,
separated by a colon
<?xml version="1.0"?>
<report xmlns:uw="http://aau.edu.et">
<uw:department>
<uw:name>Department of Computer Science</uw:name>
...
</report>
33
Hypertext facilities
Hyperlink: cross-reference primarily for presentation to a
human
Arc: two ends + direction source and destination
Traversal: following a link (for any purpose)
HTML (review):
“HyperText Markup Language” for WWW
Supports simple (binary, unidirectional) links
special elements <A> can be used as anchors
named anchor (identified by id or name attribute) used as
destination
anchor with href attribute is link source
destination name matches href value (URI)
34
HTML Linking Limitations < HREF=
“…”>
Static,
Uni-directional,
Single source anchor,
Single destination anchor,
Implied direction between them,
Automatic traversal,
No content embedding.
36
Hypertext facilities
XML not designed specifically for hypertext
Nevertheless, XML provides:
references to external entities (by URI references)
<!ENTITY spring SYSTEM "../grafix/flower.gif" NDATA gif >...
<figure picture="spring">
references to elements in the same document (via ID and
IDREF attributes)
<!ELEMENT students (student*)>
<!ELEMENT student (name, address*)>
<!ATTLIST student id ID #REQUIRED sibling-of IDREF
#IMPLIED>
<students>
<student id="123"><name>Daniel</name></student>
37
<student id="425" sibling-of="123"><name>Kebede</name></student>
Hypertext facilities
Advanced techniques for referencing & linking:
XPath
for addressing parts of XML document, typically node-sets
intended to be used by other specifications
XPointer
extends addressing capabilities of XPath to include locations
within unstructured (i.e., text) components
locations are points or ranges
XLink
for specifying links between resources
made explicit by an XLink linking element
link ends described by XPointer
38
What is XPath?
XPath is a syntax used for selecting parts of an XML
document
The way XPath describes paths to elements is similar to
the way an operating system describes paths to files
XPath is almost a small programming language; it has
functions, tests, and expressions
XPath is a W3C standard
XPath is not itself written as XML, but is used heavily in
XSLT
39
Terminology
<library>
<book>
parent
Children
<chapter>
</chapter>
siblings (they have the same parent)
Ancestors
<chapter> Descendents
<section>
<paragraph/>
<paragraph/>
</section>
</chapter>
</book>
</library>
40
Paths
Operating system: XPath:
/ = the root directory /library = the root element (if named
library )
/users/dave/foo = the /library/book/chapter/section =
(one) file named foo every section element in a chapter
in dave in users in every book in the library
foo = the (one) file named section = every section element
foo in the current directory that is a child of the current
element
. = the current directory . = the current element
.. = the parent directory .. = parent of the current element
42
Brackets and last()
A number in brackets selects a particular matching child
(counting starts from 1, except in Internet Explorer)
Example: /library/book[1] selects the first book of the library
Example: //chapter/section[2] selects the second section of every
chapter in the XML document
Example: //book/chapter[1]/section[2]
Only matching elements are counted; for example, if a book has both
sections and exercises, the latter are ignored when counting sections
The function last() in brackets selects the last matching child
Example: /library/book/chapter[last()]
You can even do simple arithmetic
Example: /library/book/chapter[last()-1]
43
Stars
A star, or asterisk, is a “wild card”—it means “all the
elements at this level”
Example: /library/book/chapter/* selects every child of
every chapter of every book in the library
Example: //book/* selects every child of every book
(chapters, tableOfContents, index, etc.)
Example: /*/*/*/paragraph selects every paragraph that has
exactly three ancestors
Example: //* selects every element in the entire document
44
Attributes
You can select attributes by themselves, or elements that have
certain attributes
Remember: an attribute consists of a name-value pair, for example in
<chapter num="5">, the attribute is named num
To choose the attribute itself, prefix the name with @
Example: @num will choose every attribute named num
Example: //@* will choose every attribute, everywhere in the document
To choose elements that have a given attribute, put the attribute
name in square brackets
Example: //chapter[@num] will select every chapter element
(anywhere in the document) that has an attribute named num
45
Attributes
//chapter[@num] selects every chapter element with an
attribute num
46
Values of attributes
//chapter[@num='3'] selects every chapter element
with an attribute num with value 3
The normalize-space() function can be used to remove
leading and trailing spaces from a value before
comparison
Example: //chapter[normalize-space(@num)="3"]
47
Axes
Starting from a given node, the self, preceding, following, ancestor, and
descendant axes form a partition of all the nodes (if we ignore attribute and
namespace nodes)
<library>
<book> //chapter[2]/self::*
<chapter/>
<chapter> //chapter[2]/preceding::*
<section>
<paragraph/>
<paragraph/> //chapter[2]/following::*
</section>
</chapter> //chapter[2]/ancestor::*
<chapter/>
</book>
//chapter[2]/descendant::*
<book/>
</library>
49
Arithmetic expressions
+ add
- subtract
* multiply
div (not /) divide
mod modulo (remainder)
50
Some XPath functions
XPath contains a number of functions on node sets,
numbers, and strings; here are a few of them:
count(elem) counts the number of selected elements
Example: //chapter[count(section)=1] selects chapters with
exactly two section children
name() returns the name of the element
Example: //*[name()='section'] is the same as //section
starts-with(arg1, arg2) tests if arg1 starts with arg2
Example: //*[starts-with(name(), 'sec']
contains(arg1, arg2) tests if arg1 contains arg2
Example: //*[contains(name(), 'ect']
51
Introduction: XLink
XLink is the XML Linking Language.
It is a language for creating hyperlinks in XML documents.
It is similar to HTML links - but more powerful.
Any element in an XML document can behave as an Xlink.
XLink supports simple links (like HTML) and extended links
(for linking multiple resources together).
With XLink, the links can be defined outside of the linked
files.
XLink is a W3C Recommendation.
52
XLink Syntax
In HTML, <a> element defines a hyperlink:
In XML documents,
can use whatever element names you want - it is impossible for
browsers to predict what hyperlink elements will be called in
XML documents.
The solution for creating links in XML documents was to
put a marker on elements that should act as hyperlinks.
53
XLink Syntax
A simple example of how to use XLink to create links in an XML
document.
<?xml version="1.0"?>
<homepages xmlns:xlink="http://www.w3.org/1999/xlink">
<homepage xlink:type="simple" xlink:href="http://www.sse.rdg.ac.uk">Visit SSE
</homepage>
<homepage xlink:type="simple" xlink:href="http://www.w3.org">Visit W3C
</homepage>
</homepages>
To get access to the XLink attributes and features, declare the XLink
namespace at the top of the document.
The XLink namespace is "http://www.w3.org/1999/xlink".
The xlink:type and the xlink:href attributes in the <homepage> elements
define that the type and href attributes come from the xlink namespace.
The xlink:type="simple" creates a simple, two-ended link (means "click
from here and go to there").
54
XLink
Goal: provide hypertext links from within XML
documents.
The specification concerns itself (mostly) with the syntax
of links and their properties.
The Semantics are left to the implementation, allows
simple links like HTML, but also more complicated link
structures.
55
XLink
· A namespace (usually prefixed as "xlink” ) with a set of elements
and attributes anchored in that namespace
(http://www.w3.org/1999/xlink/namespace/).
· Two kinds of links:
- Simple link elements (HTML like),
- Extended links.
• Well-Formed XML.
• XLinks described with attributes:
<!ELEMENT Authors>
<!ATTLIST Authors
xmlns:xlink CDATA #FIXED http://www.w3c.org/1999/xlink
xlink:type (simple | extended | locator | arc | resource | title)
xlink:href CDATA #REQUIRED
xlink:role CDATA #IMPLIED
xlink:title CDATA #IMPIED
xlink:show (new | replace | embed | other | none ) “replace”
xlink:actuate {onLoad | onRequest | other | none ) “onRequest” >
56
Type Attribute
Type (required) defines the type of link:
simple - similar to HTML <A HREF=….>,
extended - allows multiple links,
Sub-elements to extended links:
locator - locations to participate in an extended link,
arc - define navigable connections between locations of resources,
title - human readable sub-element,
resource - define participants in a link.
57
XLink Attributes
Category Attribute Description
Semantic arcrole URI reference describes the intended property of the arc
title Describes the meaning of the link in human readable terms
Describes the desired presentation of the data, possible values
show
are new, replace, embed, other, or none.
Behavioural
Describes the desired timing of the link. Possible values are
actuate
onLoad, onRequest, other, or none.
Provides and identifier that can be used to identify how different
label
types of resources and locators are connected within a link
Identifies the origin, using a label value from one of the locator or
Traversal from
resource elements of an extended link
Identifies the destination, using a label value from one of the
to
locator or resource elements of an extended link
58
XML Linking Model
XLink allows the definition of links.
XLink is simply an association
between a number of resources
(locators).
Link does not imply traversal.
Link source and destination are
optional.
It is given that a link may contain a
number of different arcs.
XLink separates associating resources (using a link) from traversal
amongst resources (using arcs)
59
XLink Functionality
Extended links : allow link authors to create new types of
links, typically by using arcs to create relationships (via
multiple traversal rules) between many different local and
remote resources.
60
XLink - Types
Resource - this link type represents a local resource which
participates in the extended link:
it contains some content that serves as the starting point for link
traversal,
<local xlink:type="resource" xlink:label="name"
xlink:title="John Doe" />
61
XLink - Types
Arc: An arc sets up navigation rules between locators and
resources (via the additional "from" and "to" attributes), and is
the primary construct for specifying the direction and
behaviour of link traversal:
<arc xlink:type="arc" xlink:from="name" xlink:to="salary"
xlink:show="replace" xlink:actuate="onRequest" xlink:title="Link from
name to salary" />
Arcs come in very handy when setting up so-called bi-
directional links, e.g, A -> B and B -> A.
An example:
<arc xlink:type="arc" xlink:from="resume" xlink:to="salary"
xlink:show="replace" xlink:actuate="onRequest" xlink:title="Link from
resume to salary" />
<arc xlink:type="arc" xlink:from="salary" xlink:to="resume"
xlink:show="replace" xlink:actuate="onRequest" xlink:title="Link from
salary to resume" />
62
XLink - Types
The XLink specification states that arcs cannot be duplicated -
in other words, a particular from-to relationship can be
represented by one and only one arc.
Title - used to provide additional, human-readable information
for an extended link's participating resources:
The XLink specification suggests that this link type is primarily used in
the context of internationalization, where different languages may
require different titles –
Example:
<ext xmlns:xlink="http://www.w3.org/1999/xlink"; xlink:type="extended">
<greeting xlink:type="title" xml:lang="en">Hello!</greeting>
<greeting xlink:type="title" xml:lang="fr">Bon jour!</greeting>
<greeting xlink:type="title" xml:lang="it">Ciao!</greeting>
</ext>
63
XLink - Attributes
In addition to the "type" attribute, the XLink specification
defines nine additional attributes:
The href attribute is used to specify the URL of a remote
resource, and is mandatory for locator links:
In addition to the URL of the remote resource, it may also contain an
additional "fragment identifier", which drills down to a specific location
within the target document.
The show attribute is used to define the manner in which the
endpoint of a link is presented to the user:
The value of this attribute may be any one of:
new - display linked resource in a new window;
replace - display linked resource in the current window, replacing whatever is
currently there;
Embed - display linked resource in a specific area of the current window;
other - display as per other, application-dependent directives;
none - display method unspecified.
64
XLink - Attributes
actuate is used to specify when a link is traversed - it may take
any of the values
onLoad - display linked resource as soon as loading is complete;
onRequest - display linked resource only when expressly directed to by
the user, either via a click or other input;
other and none.
label attribute is used to identify a link for subsequent use in an
arc.
from and to are used to specify the starting and ending points
for an arc respectively:
Both these attributes use labels to identify the links involved.
65
XLink - Attributes
role and arcrole reference a URL which contains information
on the link's role or purpose:
Normally, you can ignore these attribute, as they are intended primarily
for descriptive purposes,
The exception to this is when the "arcrole" attribute used in conjunction
with a linkbase - more on this later.
The title attribute, must not to be confused with the title type of
link, which provides a human-readable descriptive title for a
link.
It should be noted that it is not necessary for all these attributes
to be present in a particular XLink definition:
It is certainly mandatory for any element defining an XLink to contain a
type attribute, but with this obvious exception, different types of links
have different attribute requirements.
66
The basic rules
1. A simple link may contain href attributes to define the remote
resource, and optionally show and actuate attributes to define
link behaviour; none mandatory.
2. An extended link may contain optional role and title
attributes to define its purpose.
3. A locator link must contain an href attribute to define the
URL of the remote resource:
It typically also uses an optional label attribute to identify itself for arc
traversals.
4. A resource link usually contains an optional label, again for
purposes of arc traversals.
5. An arc typically contains optional from and to attributes to
define the direction of link traversal, optional show and
actuate attributes to specify how and when to traverse links,
Optional arcrole attribute if used in the context of a linkbase.
67
Querying and transforming
Roots in conventional database querying + information
retrieval + Web search engines
David Maier's desiderata
Preserve order and association; produce XML output
Suitable for documents, business records, and metadata
Support for new datatypes
Support {selection, extraction, reduction, restructuring, combination}
No schema required, but exploit available schema
XML representation; mutually embedding with XML; programmatic
manipulation
XLink and XPointer cognizant
Namespace alias independence
68
Querying and transforming: XML Query
Specifications
XML Query Requirements, 16 February 2001
XML Query Use Cases , 8 June 2001
XQuery 1.0 and XPath 2.0 Data Model (replaces XML Query
Data Model), 7 June 2001
XQuery 1.0 Formal Semantics (replaces XML Query Algebra),
7 June 2001
XQuery 1.0: An XML Query Language, 7 June 2001
XML Syntax for XQuery 1.0 (XQueryX), 7 June 2001
69
Objectives
How XML generalizes relational databases
The XQuery Language
70
XQuery 1.0
XML documents naturally generalize database relations
XQuery is the corresponding generalization of SQL
71
From Relations to Trees
People
FirstName LastName Age
Jone Doe 43
Jane Dow 38
Joe Smith 26
Juan Cruz 29
72
Trees Are Not Relations
Not all trees satisfy the previous characterization
Trees are ordered, while both rows and columns of tables
may be permuted without changing the meaning of the
data
<students>
<student id="100026">
<name>Joe Average</name>
<age>21</age>
<major>Biology</major>
<results>
<result course="Math 101" grade="C-"/>
<result course="Biology 101" grade="C+"/>
<result course="Statistics 101" grade="D"/>
</results>
</student>
73
XQuery Design Requirements
Must have at least one XML syntax and at least one
human-readable syntax
Must be declarative
Must be namespace aware
Must coordinate with XML Schema
Must support simple and complex datatypes
Must combine information from multiple documents
Must be able to transform and create XML trees
74
XQuery Syntax
FLWOR
For Let Where Order by Return
75
Example 1
For $x in (1, 2, 3, 4)
Let $y := ("a", "b", "c")
return ($x, $y)
1, a, b, c, 2, a, b, c, 3, a, b, c, 4, a, b, c
76
Example 2
For $x in (1, 2, 3, 4)
For $y in ("a", "b", "c")
return ($x, $y)
1, a, 1, b, 1, c, 2, a, 2, b, 2, c, 3, a, 3, b, 3, c, 4, a, 4, b,
4, c
77
Querying and transforming
prices.xml
<prices>
<book>
<title>Advanced Programming in the Unix
DTD environment</title>
<!ELEMENT prices (book*)> <source>www.amazon.com</source>
<!ELEMENT book (title, source, price)> <price>65.95</price>
<!ELEMENT title (#PCDATA)> </book>
<!ELEMENT source (#PCDATA)> <book>
<!ELEMENT price (#PCDATA)> <title>Advanced Programming in the Unix
environment </title>
<source>www.bn.com</source>
<price>65.95</price>
</book>
<book>
<title>TCP/IP Illustrated</title>
<source>www.amazon.com</source>
<price>65.95</price>
</book>
<book>
<title>TCP/IP Illustrated</title>
<source>www.bn.com</source>
<price>65.95</price>
</book>
<book>
<title>Data on the Web</title>
<source>www.amazon.com</source>
<price>34.95</price>
</book>
<book>
<title>Data on the Web</title>
<source>www.bn.com</source>
78 <price>39.95</price>
</book>
Querying and transforming
Use Case 1.1.9.10 Q10
In the document “prices.xml”, find the minimum price for each
book, in the form of a “minprice” element with the book title as its
title attribute.
Solution in XQuery:
<results>
{
<results>
LET $doc := document("prices.xml") <minprice title="Advanced
FOR $t IN distinct($doc/book/title) Programming in the Unix
LET $p := $doc/book[title = $t]/price environment">
RETURN <price>65.95</price>
<minprice title={ $t/text() }> </minprice>
min($p) } <minprice title="TCP/IP Illustrated">
</minprice> <price>65.95</price>
} </minprice>
<minprice title="Data on the Web">
</results> <price>34.95</price>
</minprice>
</results>
79
DTD
<!ELEMENT bib (book* )> bib.xml
<!ELEMENT book (title, (author+ | editor+ ), publisher, price )> <bib>
<!ATTLIST book year CDATA #REQUIRED > <book year="1994">
<!ELEMENT author (last, first )>
<!ELEMENT editor (last, first, affiliation )>
<title>TCP/IP Illustrated</title>
<!ELEMENT title (#PCDATA )> <author><last>Stevens</last><first>W.</first></author>
<!ELEMENT last (#PCDATA )> <publisher>Addison-Wesley</publisher>
<!ELEMENT first (#PCDATA )> <price>65.95</price>
<!ELEMENT affiliation (#PCDATA )> </book>
<!ELEMENT publisher (#PCDATA )>
<!ELEMENT price (#PCDATA )>
<book year="1992">
<title>Advanced Programming in the Unix environment</title>
<author><last>Stevens</last><first>W.</first></author>
XQuery <publisher>Addison-Wesley</publisher>
<bib> <price>65.95</price>
{ </book>
for $b in
doc("http://bstore1.example.com/bib.xml")/bib <book year="2000">
/book <title>Data on the Web</title>
<author><last>Abiteboul</last><first>Serge</first></author>
where $b/publisher = "Addison-Wesley" and <author><last>Buneman</last><first>Peter</first></author>
$b/@year > 1991 <author><last>Suciu</last><first>Dan</first></author>
return <publisher>Morgan Kaufmann Publishers</publisher>
<book year="{ $b/@year }"> <price>39.95</price>
{ $b/title } </book>
</book>
<book year="1999">
} <title>The Economics of Technology and Content for Digital TV</title>
</bib> <editor>
<last>Gerbarg</last><first>Darcy</first>
<affiliation>CITI</affiliation>
</editor>
<publisher>Kluwer Academic Publishers</publisher>
<price>129.95</price>
</book>
80 </bib>
Transforming XML Documents
with XSLT
81
Introduction
XSLT is particularly good (designed for) extracting info from
XML documents and outputting some other text-based format -
text, xhtml, html, xml, etc.
XSLT is not simply a stylesheet to present XML, though it is
frequently used that way
For publishing applications, often the entire application can be
performed in XSLT.
Add prefix or suffix text to content
Remove, create, re-order and sort elements
Re-use elements elsewhere in the document
Transform data between different XML dialects
XSLT can be used like a stylesheet or embedded in a
programming language
82
Presenting a Business Card
<card xmlns ="http://businesscard.org">
<name>John Doe</name>
<title>CEO, Widget Inc.</title>
<email>john.doe@
widget.inc</email>
<phone>(202) 555-1414</phone>
<logo uri="widget.gif"/>
</card>
<card>
<name>John Doe</name>
<title>CEO, Widget Inc.</title>
<email>john.doe@widget.inc</email>
<phone>(202) 555-1414</phone>
<logo uri="widget.gif"/>
83
</card>
Using XSLT
<?xml-stylesheet type ="text/xsl" href="card.xsl"?>
<card xmlns="http://businesscard.org">
<name>John Doe</name>
<title>CEO, Widget Inc.</title>
<email>john.doe@widget.inc</email>
<phone>(202) 555-1414</phone>
<logo uri="widget.gif"/>
</card>
John Doe
CEO, Widget Inc.
john.doe@widget.inc
Phone: (202) 555-1414
84
XSLT for Business Cards (1/2)
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:b=http://
businesscard.org xmlns="http://www.w3.org/1999/xhtml">
<xsl:template match="b:card">
<html>
<head>
<title><xsl:value-of select="b:name/text()"/></title>
</head>
<body bgcolor="#ffffff">
<table border="3">
<tr>
<td>
<xsl:apply-templates select="b:name"/><br/>
<xsl:apply-templates select ="b:title"/><p/>
<xsl:apply-templates select="b:email"/><br/>
<xsl:if test="b:phone"> Phone: <xsl:apply-templates
select="b:phone"/><br/></xsl:if>
</td>
<td><xsl:if test="b:logo"><img src="{b:logo/@uri}"/></xsl:if></td>
85
XSLT for Business Cards (2/2)
</tr>
</table>
</body>
</html>
</xsl:template>
<xsl:template match="b:name|b:title|b:email|b:phone">
<xsl:value-of select= "text()"/>
</xsl:template>
</xsl:stylesheet>
86
Querying and transforming:
<?xml-stylesheet typeXSLT
<catalog>
="text/xsl" href="catalog.xsl"?>
<cd> Transformations
XML Stylesheet Language
<title>Empire Burlesque</title>
For transforming XML documents,
<artist>Bobincluding converting XML
Dylan</artist>
to HTML, etc. <country>USA</country>
<company>Columbia</company>
<?xml version="1.0" encoding="ISO-8859-1"?>
<price>10.90</price>
<xsl:stylesheet version="1.0“"
<year>1985</year>
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
</cd>
<xsl:template match="/">
<html> <cd>
<body> <title>Hide your heart</title>
<artist>Bonnie Tyler</artist>
<h2>My CD Collection</h2>
<country>UK</country>
<xsl:for-each select="catalog/cd">
<xsl:value-of<company>CBS Records</company>
select="title"/>
<xsl:value-of<price>9.90</price>
select="artist"/>
</xsl:for-each><year>1988</year>
</html> </cd>
</xsl:template> </catalog>
</xsl:stylesheet>
87
Applications
CallXML: a phone markup language, used to describe the user
interface of a telephone, voice over IP, or multi-media call
application to a CallXML browser so it can control and react to
the call itself.
Theological Markup Language (ThML)
Chemical Markup Language (CML)
Mathematical Markup Language (MathML)
etc.
91
Application: Separate data
XML can Separate Data from HTML
Store data in separate XML files
Using HTML for layout and display
Using Data Islands
Data Islands can be bound to HTML elements
Benefits:
Changes in the underlying data will not require any changes to your HTML
92
Application: Exchange data
XML is used to Exchange Data
Text format
Software-independent, hardware-independent
Exchange data between incompatible systems, given that they agree on the
same tag definition.
Can be read by many different types of applications
Benefits:
Reduce the complexity of interpreting data
Easier to expand and upgrade a system
93
Application: Store Data
XML can be used to Store Data
Plain text file
Store data in files or databases
Application can be written to store and retrieve information from the store
Other clients and applications can access your XML files as data sources
Benefits:
Accessible to more applications
94
XML support in IE 5.0+
Internet Explorer 5.0 has the following XML support:
Viewing of XML documents
Full support for W3C DTD standards
XML embedded in HTML as Data Islands
Binding XML data to HTML elements
Transforming and displaying XML with XSL
Displaying XML with CSS
Access to the XML DOM (Document Object Model)
96
Microsoft XML Parser
Comes with IE 5.0
The parser features a language-neutral programming model
that supports:
JavaScript, VBScript, Perl, VB, Java, C++ and more
W3C XML 1.0 and XML DOM
DTD and validation
97
Java APIs for XML
JAXP: Java API for XML Processing
JAXB: Java Architecture for XML Binding
JDOM: Java DOM
DOM4J: an alternative to JDOM
JAXM: Java API for XML Messaging (asynchronous)
JAX-RPC: Java API for XML-based Remote Process
Communications (synchronous)
JAXR: Java API for XML Registries
98