Download as pptx, pdf, or txt
Download as pptx, pdf, or txt
You are on page 1of 85

XML Introduction

Fekade Getahun

1
What is XML - eXtensible Markup
Language?
A set of rules for defining and representing information as
structured documents for applications on the Internet; a
restricted form of SGML

 T. Bray, J. Paoli, and C. M. Sperberg-McQueen (Eds.),


Extensible Markup Language (XML) 1.0, W3C
Recommendation 10- February-1998,
http://www.w3.org/TR/1998/REC-xml-19980210/.

 T. Bray, J. Paoli, C. M. Sperberg-McQueen, and E. Maler


(Eds.), Extensible Markup Language (XML) 1.0 (Second
Edition), W3C Recommendation 6 October 2000,
http://www.w3.org/TR/2000/REC-xml-20001006/.
7
What Is XML?
 A simplified version of SGML
 Maintains the most useful parts of SGML
 Its goal is to enable generic SGML to be served, received, and
processed on the Web in the way that is now possible with
HTML.
 XML has been designed for ease of implementation and for
interoperability with both SGML and HTML.
 XML was designed to carry data, not displaying data

8
What is XML?
 Rule 1: Information is represented in units called XML
documents.
 Rule 2: An XML document contains one or more
elements.
 Rule 3: An element has a name, it is denoted in the
document by explicit markup, it can contain other
elements, and it can be associated with attributes.

9
An example of XML
<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to>aaucs_gs03@yahoogroups.com</to>
<from>satnafu@gmail.com</from>
<subject>CoSc 705 - Seminar: Article Review Presentation
Schedule</subject >
<body> The presentations of the seminar course are scheduled
from July 4 - 8, 2011. Please find attached the presentation
schedule</body>
</note>

10
Example of an XML document
<?xml version="1.0"?>
<catalog>
<product category = "mobile phone">
<mfg>Nokia</mfg><model>8890</model>
<description> Intended for EGSM 900 and GSM 1900 networks …
</description>
<clock setting= "nist" alarm = "yes"/>
</product>
<product category = "mobile phone">
<mfg>Ericsson</mfg><model>A3618</model>
<description>...</description>
</product>
...
</catalog>

11
Why XML?
 Needs:
 Simple, common rules that are easy to understand by people
with different backgrounds (like HTML)
 Capability to describe Internet resources and their
relationships (like HTML)
 Capability to define information structures for different
kinds of business sectors (unlike HTML, like SGML)

12
Why XML?
 Needs (cont’d):
 Format formal enough for computers and clear enough to be
human-legible (like SGML)
 Rules simple enough to allow easy building of software
(unlike SGML)
 Strong support for diverse natural languages (unlike SGML)

13
XML 1.0 fundamentals: Physical (storage) units
 Each XML document has both a logical and a physical
structure.
 Physically, XML document consists of one or more
entities
 A “document entity” serves as the root
 Entity may contain
 external portion of DTD
 parsed character data, which replaces any references to the entity
 unparsed data
 Entities located by URIs
 XML provides a mechanism to impose constraints on the
storage layout and logical structure.
14
XML 1.0 fundamentals: Logical structure
 XML document is composed of
 XML declaration: <?xml version=”1.0”?>
 Markup vocabulary: elements, attributes
<product category = "mobile phone">
<mfg>Nokia</mfg><model>8890</model>
... <clock setting = "nitz" alarm = "yes"/> ...
</product>
 White space
 Parsed and unparsed character data
 Entity references &diagram;
 Comments <!-- how interesting… -->
 Processing instructions '<?' PITarget (S (Char* -
(Char* '?>' Char*)))? '?>'
`

15
XML 1.0 fundamentals: Logical structure
 XML document contains
 one or more elements, the boundaries of which are either
delimited by
 start-tags and end-tags, or,
 for empty elements, by an empty-element tag.
 Each element has
 a type, identified by name, sometimes called its "generic identifier"
(GI),
 may have a set of attribute specifications
 Each attribute specification has a name and a value

16
XML 1.0 fundamentals:DTD
 Internal vs. external declarations
<!DOCTYPE greeting [<!ELEMENT greeting (#PCDATA)>]>
<greeting>Hello, world!</greeting>
<?xml version="1.0"?>
<!DOCTYPE greeting SYSTEM "hello.dtd">
<greeting>Hello, world!</greeting>
 Root document type

17
XML 1.0 fundamentals:DTD
 Element Type Declaration
 The element be constrained using element type and attribute-list
declarations.
 An element type declaration constrains the element's content.
 Element type declarations often constrain which element types can appear
as children of the element.
Element Type Declaration
elementdecl ::= '<!ELEMENT' S Name S contentspec S>'
contentspec ::= 'EMPTY' | 'ANY' | Mixed | children

Example
<!ELEMENT br EMPTY>
<!ELEMENT p (#PCDATA|emph)* >
<!ELEMENT %name.para; %content.para; >
<!ELEMENT container ANY>

18
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE message [<!ELEMENT message (contents, from)> <!ELEMENT
contents (#PCDATA)>
<!ELEMENT from (#PCDATA)> ] >

<message>
<contents>First XML Page</contents>
<from>www.HiLCoE.com.et</from>
</message>

19
XML 1.0 fundamentals:DTD
 Element Content
 An element type has element content when elements of that type
must contain only child elements (no character data).
 The constraint includes a content model, a simple grammar governing
the allowed types of the child elements and the order in which they
are allowed to appear.
 The grammar is built on content particles (cps), which consist of
names, choice lists of content particles, or sequence lists of content
particles:

Element-content Models
children ::= (choice | seq) ('?' | '*' | '+')?
cp ::= (Name | choice | seq) ('?' | '*' | '+')?
choice ::= '(' S? cp ( S? '|' S? cp )* S? ')'
seq ::= '(' S? cp ( S? ',' S? cp )* S? ')'

20
XML 1.0 fundamentals:DTD
 Example
<!ELEMENT category (mfg, model, description , clock?)>
<!ELEMENT description (#PCDATA | feature)*>
<!ELEMENT clock EMPTY>

<!ELEMENT spec (front, body, back?)>
<!ELEMENT div1 (head, (p | list | note)*, div2*)>
<!ELEMENT dictionary-body (%div.mix; | %dict.mix;)*>

21
XML 1.0 fundamentals: Attribute
declaration
 Attributes are used to associate name-value pairs with elements
AttlistDecl ::= '<!ATTLIST' S EName AttDef* S '>'
AttDef ::= S Name S AttType S DefaultDecl

AttType ::= StringType | TokenizedType | EnumeratedType


StringType ::= 'CDATA'
TokenizedType ::= 'ID' | 'IDREF'| 'IDREFS'|'ENTITY'|
 'ENTITIES'| 'NMTOKEN'| 'NMTOKENS'
DefaultDecl ::= '#REQUIRED' | '#IMPLIED'
| (('#FIXED' S)? AttValue)

Example:
<!ATTLIST termdef id ID #REQUIRED name CDATA #IMPLIED>
<!ATTLIST list type (bullets|ordered|glossary)  "ordered“>
<!ATTLIST form method  CDATA  #FIXED "POST">

22
XML 1.0 fundamentals
 Conformance:
 Well-formed:
 syntactically correct tags
 matching tags
 nested elements
 all entities declared before they are used
 Valid:
 well-formed
 DTD + doctype matches DTD
 unique IDs
 no dangling IDREFs

23
XML 1.0 fundamentals
XML
document

XML may or may not be


processor “validating”

“XML Information Set”

application

24
XML family of languages
 XML-related languages fall into the following
classes:
 XML accessories
 XML transducers
 XML applications

25
XML family of languages: XML Accessory
 Extends the capabilities specified in XML
 Intended for wide, general use
 Examples:
 XML Names: to allow qualified names
Names ::= Name (S Name)*
 XML Schema: extends the definition capabilities of
DTD
 XPath: for addressing parts in XML documents
 XLink: to create hyperlinks between resources

26
XML family of languages: XML Transducer
 Converts XML input data into output
 Associated with a processing model

 Examples:
 XSLT: intended for transforming XML documents
into other XML documents
 CSS and XSL intended to produce an external
presentation from some XML data
 XQuery: for querying

27
XML family of languages: XML Application
 XML applications are languages which define constraints for a
class of XML data for some special application area

 Examples (developed at W3C):


 XHTML: reformulation of HTML 4.0
 MathML defined for mathematical data and
 RDF: to describe metadata for resources
 XML-Signature: for digital signatures
 XForms: for Web forms

28
Naming conventions: XML namespaces
 Provides a method for qualifying element and attribute
names so that name collisions can be avoided
 Motivation: modularity and documentation

If a well-understood markup vocabulary for element


and attribute names exists, it should be re-used rather
than re-invented, especially if there is also software
available.

31
Naming conventions: XML namespaces
 Collection of names, identified by a URI
 No formal rules for defining names in a namespace
Example
 Namespace: http://aau.edu.et
 Element names: department, name, professor, student,
last_name, first_name, ...
 Global attribute names: id, ...
 Per-element-type attribute names: student: supervisor, ...

32
Naming conventions
 Namespace declaration: defines a label (prefix) for the
namespace and associates it to the namespace
identifier (URI)
 Qualified name: a namespace prefix and a local part,
separated by a colon
<?xml version="1.0"?>
<report xmlns:uw="http://aau.edu.et">
<uw:department>
<uw:name>Department of Computer Science</uw:name>
...
</report>

33
Hypertext facilities
 Hyperlink: cross-reference primarily for presentation to a
human
 Arc: two ends + direction  source and destination
 Traversal: following a link (for any purpose)
 HTML (review):
 “HyperText Markup Language” for WWW
 Supports simple (binary, unidirectional) links
 special elements <A> can be used as anchors
 named anchor (identified by id or name attribute) used as
destination
 anchor with href attribute is link source
 destination name matches href value (URI)

34
HTML Linking Limitations < HREF=
“…”>
 Static,
 Uni-directional,
 Single source anchor,
 Single destination anchor,
 Implied direction between them,
 Automatic traversal,
 No content embedding.

36
Hypertext facilities
 XML not designed specifically for hypertext
 Nevertheless, XML provides:
 references to external entities (by URI references)
<!ENTITY spring SYSTEM "../grafix/flower.gif" NDATA gif >...
<figure picture="spring">
 references to elements in the same document (via ID and
IDREF attributes)
<!ELEMENT students (student*)>
<!ELEMENT student (name, address*)>
<!ATTLIST student id ID #REQUIRED sibling-of IDREF
#IMPLIED>
<students>
<student id="123"><name>Daniel</name></student>
37
<student id="425" sibling-of="123"><name>Kebede</name></student>
Hypertext facilities
 Advanced techniques for referencing & linking:
 XPath
 for addressing parts of XML document, typically node-sets
 intended to be used by other specifications
 XPointer
 extends addressing capabilities of XPath to include locations
within unstructured (i.e., text) components
 locations are points or ranges
 XLink
 for specifying links between resources
 made explicit by an XLink linking element
 link ends described by XPointer

38
What is XPath?
 XPath is a syntax used for selecting parts of an XML
document
 The way XPath describes paths to elements is similar to
the way an operating system describes paths to files
 XPath is almost a small programming language; it has
functions, tests, and expressions
 XPath is a W3C standard
 XPath is not itself written as XML, but is used heavily in
XSLT

39
Terminology
 <library>
<book>
 parent
 Children
<chapter>
</chapter>
 siblings (they have the same parent)
 Ancestors
<chapter>  Descendents
<section>
<paragraph/>
<paragraph/>
</section>
</chapter>

</book>
</library>

40
Paths
Operating system: XPath:
/ = the root directory /library = the root element (if named
library )
/users/dave/foo = the /library/book/chapter/section =
(one) file named foo every section element in a chapter
in dave in users in every book in the library
foo = the (one) file named section = every section element
foo in the current directory that is a child of the current
element
. = the current directory . = the current element
.. = the parent directory .. = parent of the current element

/users/dave/* = all the files /library/book/chapter/* = all the


in /users/dave elements in /library/book/chapter
41
Slashes
 A path that begins with a / represents an absolute path,
starting from the top of the document
 Example: /email/message/header/from
 Note that even an absolute path can select more than one element
 A slash by itself means “the whole document”
 A path that does not begin with a / represents a path starting
from the current element
 Example: header/from
 A path that begins with // can start from anywhere in the
document
 Example: //header/from selects every element from that is a child of
an element header
 This can be expensive, since it involves searching the entire document

42
Brackets and last()
 A number in brackets selects a particular matching child
(counting starts from 1, except in Internet Explorer)
 Example: /library/book[1] selects the first book of the library
 Example: //chapter/section[2] selects the second section of every
chapter in the XML document
 Example: //book/chapter[1]/section[2]
 Only matching elements are counted; for example, if a book has both
sections and exercises, the latter are ignored when counting sections
 The function last() in brackets selects the last matching child
 Example: /library/book/chapter[last()]
 You can even do simple arithmetic
 Example: /library/book/chapter[last()-1]

43
Stars
 A star, or asterisk, is a “wild card”—it means “all the
elements at this level”
 Example: /library/book/chapter/* selects every child of
every chapter of every book in the library
 Example: //book/* selects every child of every book
(chapters, tableOfContents, index, etc.)
 Example: /*/*/*/paragraph selects every paragraph that has
exactly three ancestors
 Example: //* selects every element in the entire document

44
Attributes
 You can select attributes by themselves, or elements that have
certain attributes
 Remember: an attribute consists of a name-value pair, for example in
<chapter num="5">, the attribute is named num
 To choose the attribute itself, prefix the name with @
 Example: @num will choose every attribute named num
 Example: //@* will choose every attribute, everywhere in the document
 To choose elements that have a given attribute, put the attribute
name in square brackets
 Example: //chapter[@num] will select every chapter element
(anywhere in the document) that has an attribute named num

45
Attributes
 //chapter[@num] selects every chapter element with an
attribute num

 //chapter[not(@num)] selects every chapter element that


does not have a num attribute

 //chapter[@*] selects every chapter element that has any


attribute

 //chapter[not(@*)] selects every chapter element with no


attributes

46
Values of attributes
 //chapter[@num='3'] selects every chapter element
with an attribute num with value 3
 The normalize-space() function can be used to remove
leading and trailing spaces from a value before
comparison
 Example: //chapter[normalize-space(@num)="3"]

47
Axes
Starting from a given node, the self, preceding, following, ancestor, and
descendant axes form a partition of all the nodes (if we ignore attribute and
namespace nodes)

 <library>
<book> //chapter[2]/self::*
<chapter/>
<chapter> //chapter[2]/preceding::*
<section>
<paragraph/>
<paragraph/> //chapter[2]/following::*
</section>
</chapter> //chapter[2]/ancestor::*
<chapter/>
</book>
//chapter[2]/descendant::*
<book/>
</library>
49
Arithmetic expressions
 + add
 - subtract
 * multiply
 div (not /) divide
 mod modulo (remainder)

50
Some XPath functions
 XPath contains a number of functions on node sets,
numbers, and strings; here are a few of them:
 count(elem) counts the number of selected elements
 Example: //chapter[count(section)=1] selects chapters with
exactly two section children
 name() returns the name of the element
 Example: //*[name()='section'] is the same as //section
 starts-with(arg1, arg2) tests if arg1 starts with arg2
 Example: //*[starts-with(name(), 'sec']
 contains(arg1, arg2) tests if arg1 contains arg2
 Example: //*[contains(name(), 'ect']

51
Introduction: XLink
 XLink is the XML Linking Language.
 It is a language for creating hyperlinks in XML documents.
 It is similar to HTML links - but more powerful.
 Any element in an XML document can behave as an Xlink.
 XLink supports simple links (like HTML) and extended links
(for linking multiple resources together).
 With XLink, the links can be defined outside of the linked
files.
 XLink is a W3C Recommendation.

52
XLink Syntax
 In HTML, <a> element defines a hyperlink:
 In XML documents,
 can use whatever element names you want - it is impossible for
browsers to predict what hyperlink elements will be called in
XML documents.
 The solution for creating links in XML documents was to
put a marker on elements that should act as hyperlinks.

53
XLink Syntax
 A simple example of how to use XLink to create links in an XML
document.
<?xml version="1.0"?>
<homepages xmlns:xlink="http://www.w3.org/1999/xlink">
<homepage xlink:type="simple" xlink:href="http://www.sse.rdg.ac.uk">Visit SSE
</homepage>
<homepage xlink:type="simple" xlink:href="http://www.w3.org">Visit W3C
</homepage>
</homepages>

 To get access to the XLink attributes and features, declare the XLink
namespace at the top of the document.
 The XLink namespace is "http://www.w3.org/1999/xlink".
 The xlink:type and the xlink:href attributes in the <homepage> elements
define that the type and href attributes come from the xlink namespace.
 The xlink:type="simple" creates a simple, two-ended link (means "click
from here and go to there").

54
XLink
 Goal: provide hypertext links from within XML
documents.
 The specification concerns itself (mostly) with the syntax
of links and their properties.
 The Semantics are left to the implementation, allows
simple links like HTML, but also more complicated link
structures.

55
XLink
· A namespace (usually prefixed as "xlink” ) with a set of elements
and attributes anchored in that namespace
(http://www.w3.org/1999/xlink/namespace/).
· Two kinds of links:
- Simple link elements (HTML like),
- Extended links.
• Well-Formed XML.
• XLinks described with attributes:
<!ELEMENT Authors>
<!ATTLIST Authors
xmlns:xlink CDATA #FIXED http://www.w3c.org/1999/xlink
xlink:type (simple | extended | locator | arc | resource | title)
xlink:href CDATA #REQUIRED
xlink:role CDATA #IMPLIED
xlink:title CDATA #IMPIED
xlink:show (new | replace | embed | other | none ) “replace”
xlink:actuate {onLoad | onRequest | other | none ) “onRequest” >

56
Type Attribute
 Type (required) defines the type of link:
 simple - similar to HTML <A HREF=….>,
 extended - allows multiple links,
 Sub-elements to extended links:
 locator - locations to participate in an extended link,
 arc - define navigable connections between locations of resources,
 title - human readable sub-element,
 resource - define participants in a link.

57
XLink Attributes
Category Attribute Description

Describes the type of link. Possible values are simple, extended,


type
locator, resource, arc, title, none.
Supplies the location (URI) to allow the XLink application to find a
href
remote resource

role URI reference describes the intended property of the XLink

Semantic arcrole URI reference describes the intended property of the arc
title Describes the meaning of the link in human readable terms
Describes the desired presentation of the data, possible values
show
are new, replace, embed, other, or none.
Behavioural
Describes the desired timing of the link. Possible values are
actuate
onLoad, onRequest, other, or none.
Provides and identifier that can be used to identify how different
label
types of resources and locators are connected within a link
Identifies the origin, using a label value from one of the locator or
Traversal from
resource elements of an extended link
Identifies the destination, using a label value from one of the
to
locator or resource elements of an extended link
58
XML Linking Model
 XLink allows the definition of links.
 XLink is simply an association
between a number of resources
(locators).
 Link does not imply traversal.
 Link source and destination are
optional.
 It is given that a link may contain a
number of different arcs.
XLink separates associating resources (using a link) from traversal
amongst resources (using arcs)

59
XLink Functionality
 Extended links : allow link authors to create new types of
links, typically by using arcs to create relationships (via
multiple traversal rules) between many different local and
remote resources.

60
XLink - Types
 Resource - this link type represents a local resource which
participates in the extended link:
 it contains some content that serves as the starting point for link
traversal,
<local xlink:type="resource" xlink:label="name"
xlink:title="John Doe" />

 Locator - this link type represents a remote resource (via


the additional "href" attribute) that participates in the
extended link:
<remote xlink:type="locator" xlink:href="salary.xml"
xlink:label="salary" xlink:title="John Doe's salary information"
/>

61
XLink - Types
 Arc: An arc sets up navigation rules between locators and
resources (via the additional "from" and "to" attributes), and is
the primary construct for specifying the direction and
behaviour of link traversal:
<arc xlink:type="arc" xlink:from="name" xlink:to="salary"
xlink:show="replace" xlink:actuate="onRequest" xlink:title="Link from
name to salary" />
 Arcs come in very handy when setting up so-called bi-
directional links, e.g, A -> B and B -> A.
 An example:
<arc xlink:type="arc" xlink:from="resume" xlink:to="salary"
xlink:show="replace" xlink:actuate="onRequest" xlink:title="Link from
resume to salary" />
<arc xlink:type="arc" xlink:from="salary" xlink:to="resume"
xlink:show="replace" xlink:actuate="onRequest" xlink:title="Link from
salary to resume" />
62
XLink - Types
 The XLink specification states that arcs cannot be duplicated -
in other words, a particular from-to relationship can be
represented by one and only one arc.
 Title - used to provide additional, human-readable information
for an extended link's participating resources:
 The XLink specification suggests that this link type is primarily used in
the context of internationalization, where different languages may
require different titles –
Example:
<ext xmlns:xlink="http://www.w3.org/1999/xlink"; xlink:type="extended">
<greeting xlink:type="title" xml:lang="en">Hello!</greeting>
<greeting xlink:type="title" xml:lang="fr">Bon jour!</greeting>
<greeting xlink:type="title" xml:lang="it">Ciao!</greeting>
</ext>

63
XLink - Attributes
 In addition to the "type" attribute, the XLink specification
defines nine additional attributes:
 The href attribute is used to specify the URL of a remote
resource, and is mandatory for locator links:
 In addition to the URL of the remote resource, it may also contain an
additional "fragment identifier", which drills down to a specific location
within the target document.
 The show attribute is used to define the manner in which the
endpoint of a link is presented to the user:
 The value of this attribute may be any one of:
 new - display linked resource in a new window;
 replace - display linked resource in the current window, replacing whatever is
currently there;
 Embed - display linked resource in a specific area of the current window;
 other - display as per other, application-dependent directives;
 none - display method unspecified.
64
XLink - Attributes
 actuate is used to specify when a link is traversed - it may take
any of the values
 onLoad - display linked resource as soon as loading is complete;
 onRequest - display linked resource only when expressly directed to by
the user, either via a click or other input;
 other and none.
 label attribute is used to identify a link for subsequent use in an
arc.
 from and to are used to specify the starting and ending points
for an arc respectively:
 Both these attributes use labels to identify the links involved.

65
XLink - Attributes
 role and arcrole reference a URL which contains information
on the link's role or purpose:
 Normally, you can ignore these attribute, as they are intended primarily
for descriptive purposes,
 The exception to this is when the "arcrole" attribute used in conjunction
with a linkbase - more on this later.
 The title attribute, must not to be confused with the title type of
link, which provides a human-readable descriptive title for a
link.
 It should be noted that it is not necessary for all these attributes
to be present in a particular XLink definition:
 It is certainly mandatory for any element defining an XLink to contain a
type attribute, but with this obvious exception, different types of links
have different attribute requirements.

66
The basic rules
1. A simple link may contain href attributes to define the remote
resource, and optionally show and actuate attributes to define
link behaviour; none mandatory.
2. An extended link may contain optional role and title
attributes to define its purpose.
3. A locator link must contain an href attribute to define the
URL of the remote resource:
 It typically also uses an optional label attribute to identify itself for arc
traversals.
4. A resource link usually contains an optional label, again for
purposes of arc traversals.
5. An arc typically contains optional from and to attributes to
define the direction of link traversal, optional show and
actuate attributes to specify how and when to traverse links,
 Optional arcrole attribute if used in the context of a linkbase.
67
Querying and transforming
 Roots in conventional database querying + information
retrieval + Web search engines
 David Maier's desiderata
 Preserve order and association; produce XML output
 Suitable for documents, business records, and metadata
 Support for new datatypes
 Support {selection, extraction, reduction, restructuring, combination}
 No schema required, but exploit available schema
 XML representation; mutually embedding with XML; programmatic
manipulation
 XLink and XPointer cognizant
 Namespace alias independence

68
Querying and transforming: XML Query
Specifications
 XML Query Requirements, 16 February 2001
 XML Query Use Cases , 8 June 2001
 XQuery 1.0 and XPath 2.0 Data Model (replaces XML Query
Data Model), 7 June 2001
 XQuery 1.0 Formal Semantics (replaces XML Query Algebra),
7 June 2001
 XQuery 1.0: An XML Query Language, 7 June 2001
 XML Syntax for XQuery 1.0 (XQueryX), 7 June 2001

69
Objectives
 How XML generalizes relational databases
 The XQuery Language

70
XQuery 1.0
 XML documents naturally generalize database relations
 XQuery is the corresponding generalization of SQL

71
From Relations to Trees
People
FirstName LastName Age
Jone Doe 43
Jane Dow 38
Joe Smith 26
Juan Cruz 29

72
Trees Are Not Relations
 Not all trees satisfy the previous characterization
 Trees are ordered, while both rows and columns of tables
may be permuted without changing the meaning of the
data
<students>
<student id="100026">
<name>Joe Average</name>
<age>21</age>
<major>Biology</major>
<results>
<result course="Math 101" grade="C-"/>
<result course="Biology 101" grade="C+"/>
<result course="Statistics 101" grade="D"/>
</results>
</student>

73
XQuery Design Requirements
 Must have at least one XML syntax and at least one
human-readable syntax
 Must be declarative
 Must be namespace aware
 Must coordinate with XML Schema
 Must support simple and complex datatypes
 Must combine information from multiple documents
 Must be able to transform and create XML trees

74
XQuery Syntax
 FLWOR
For Let Where Order by Return

75
Example 1
For $x in (1, 2, 3, 4)
Let $y := ("a", "b", "c")
return ($x, $y)

1, a, b, c, 2, a, b, c, 3, a, b, c, 4, a, b, c

76
Example 2
For $x in (1, 2, 3, 4)
For $y in ("a", "b", "c")
return ($x, $y)

1, a, 1, b, 1, c, 2, a, 2, b, 2, c, 3, a, 3, b, 3, c, 4, a, 4, b,
4, c

77
Querying and transforming
prices.xml
<prices>
<book>
<title>Advanced Programming in the Unix
DTD environment</title>
<!ELEMENT prices (book*)> <source>www.amazon.com</source>
<!ELEMENT book (title, source, price)> <price>65.95</price>
<!ELEMENT title (#PCDATA)> </book>
<!ELEMENT source (#PCDATA)> <book>
<!ELEMENT price (#PCDATA)> <title>Advanced Programming in the Unix
environment </title>
<source>www.bn.com</source>
<price>65.95</price>
</book>
<book>
<title>TCP/IP Illustrated</title>
<source>www.amazon.com</source>
<price>65.95</price>
</book>
<book>
<title>TCP/IP Illustrated</title>
<source>www.bn.com</source>
<price>65.95</price>
</book>
<book>
<title>Data on the Web</title>
<source>www.amazon.com</source>
<price>34.95</price>
</book>
<book>
<title>Data on the Web</title>
<source>www.bn.com</source>
78 <price>39.95</price>
</book>
Querying and transforming
 Use Case 1.1.9.10 Q10
 In the document “prices.xml”, find the minimum price for each
book, in the form of a “minprice” element with the book title as its
title attribute.
 Solution in XQuery:
<results>
{
<results>
LET $doc := document("prices.xml") <minprice title="Advanced
FOR $t IN distinct($doc/book/title) Programming in the Unix
LET $p := $doc/book[title = $t]/price environment">
RETURN <price>65.95</price>
<minprice title={ $t/text() }> </minprice>
min($p) } <minprice title="TCP/IP Illustrated">
</minprice> <price>65.95</price>
} </minprice>
<minprice title="Data on the Web">
</results> <price>34.95</price>
</minprice>
</results>
79
DTD
<!ELEMENT bib (book* )> bib.xml
<!ELEMENT book (title, (author+ | editor+ ), publisher, price )> <bib>
<!ATTLIST book year CDATA #REQUIRED > <book year="1994">
<!ELEMENT author (last, first )>
<!ELEMENT editor (last, first, affiliation )>
<title>TCP/IP Illustrated</title>
<!ELEMENT title (#PCDATA )> <author><last>Stevens</last><first>W.</first></author>
<!ELEMENT last (#PCDATA )> <publisher>Addison-Wesley</publisher>
<!ELEMENT first (#PCDATA )> <price>65.95</price>
<!ELEMENT affiliation (#PCDATA )> </book>
<!ELEMENT publisher (#PCDATA )>
<!ELEMENT price (#PCDATA )>
<book year="1992">
<title>Advanced Programming in the Unix environment</title>
<author><last>Stevens</last><first>W.</first></author>
XQuery <publisher>Addison-Wesley</publisher>
<bib> <price>65.95</price>
{ </book>
for $b in
doc("http://bstore1.example.com/bib.xml")/bib <book year="2000">
/book <title>Data on the Web</title>
<author><last>Abiteboul</last><first>Serge</first></author>
where $b/publisher = "Addison-Wesley" and <author><last>Buneman</last><first>Peter</first></author>
$b/@year > 1991 <author><last>Suciu</last><first>Dan</first></author>
return <publisher>Morgan Kaufmann Publishers</publisher>
<book year="{ $b/@year }"> <price>39.95</price>
{ $b/title } </book>
</book>
<book year="1999">
} <title>The Economics of Technology and Content for Digital TV</title>
</bib> <editor>
<last>Gerbarg</last><first>Darcy</first>
<affiliation>CITI</affiliation>
</editor>
<publisher>Kluwer Academic Publishers</publisher>
<price>129.95</price>
</book>
80 </bib>
Transforming XML Documents
with XSLT

81
Introduction
 XSLT is particularly good (designed for) extracting info from
XML documents and outputting some other text-based format -
text, xhtml, html, xml, etc.
 XSLT is not simply a stylesheet to present XML, though it is
frequently used that way
 For publishing applications, often the entire application can be
performed in XSLT.
 Add prefix or suffix text to content
 Remove, create, re-order and sort elements
 Re-use elements elsewhere in the document
 Transform data between different XML dialects
 XSLT can be used like a stylesheet or embedded in a
programming language

82
Presenting a Business Card
<card xmlns ="http://businesscard.org">
<name>John Doe</name>
<title>CEO, Widget Inc.</title>
<email>john.doe@
widget.inc</email>
<phone>(202) 555-1414</phone>
<logo uri="widget.gif"/>
</card>
<card>
<name>John Doe</name>
<title>CEO, Widget Inc.</title>
<email>john.doe@widget.inc</email>
<phone>(202) 555-1414</phone>
<logo uri="widget.gif"/>
83
</card>
Using XSLT
<?xml-stylesheet type ="text/xsl" href="card.xsl"?>
<card xmlns="http://businesscard.org">
<name>John Doe</name>
<title>CEO, Widget Inc.</title>
<email>john.doe@widget.inc</email>
<phone>(202) 555-1414</phone>
<logo uri="widget.gif"/>
</card>

John Doe
CEO, Widget Inc.

john.doe@widget.inc
Phone: (202) 555-1414

84
XSLT for Business Cards (1/2)
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:b=http://
businesscard.org xmlns="http://www.w3.org/1999/xhtml">
<xsl:template match="b:card">
<html>
<head>
<title><xsl:value-of select="b:name/text()"/></title>
</head>
<body bgcolor="#ffffff">
<table border="3">
<tr>
<td>
<xsl:apply-templates select="b:name"/><br/>
<xsl:apply-templates select ="b:title"/><p/>
<xsl:apply-templates select="b:email"/><br/>
<xsl:if test="b:phone"> Phone: <xsl:apply-templates
select="b:phone"/><br/></xsl:if>
</td>
<td><xsl:if test="b:logo"><img src="{b:logo/@uri}"/></xsl:if></td>
85
XSLT for Business Cards (2/2)
</tr>
</table>
</body>
</html>
</xsl:template>

<xsl:template match="b:name|b:title|b:email|b:phone">
<xsl:value-of select= "text()"/>
</xsl:template>

</xsl:stylesheet>

86
Querying and transforming:
<?xml-stylesheet typeXSLT
<catalog>
="text/xsl" href="catalog.xsl"?>

 <cd> Transformations
XML Stylesheet Language
<title>Empire Burlesque</title>
 For transforming XML documents,
<artist>Bobincluding converting XML
Dylan</artist>
to HTML, etc. <country>USA</country>
<company>Columbia</company>
<?xml version="1.0" encoding="ISO-8859-1"?>
<price>10.90</price>
<xsl:stylesheet version="1.0“"
<year>1985</year>
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
</cd>
<xsl:template match="/">
<html> <cd>
<body> <title>Hide your heart</title>
<artist>Bonnie Tyler</artist>
<h2>My CD Collection</h2>
<country>UK</country>
<xsl:for-each select="catalog/cd">
<xsl:value-of<company>CBS Records</company>
select="title"/>
<xsl:value-of<price>9.90</price>
select="artist"/>
</xsl:for-each><year>1988</year>
</html> </cd>
</xsl:template> </catalog>
</xsl:stylesheet>
87
Applications
 CallXML: a phone markup language, used to describe the user
interface of a telephone, voice over IP, or multi-media call
application to a CallXML browser so it can control and react to
the call itself.
 Theological Markup Language (ThML)
 Chemical Markup Language (CML)
 Mathematical Markup Language (MathML)
 etc.

91
Application: Separate data
XML can Separate Data from HTML
 Store data in separate XML files
 Using HTML for layout and display
 Using Data Islands
 Data Islands can be bound to HTML elements

Benefits:
Changes in the underlying data will not require any changes to your HTML

92
Application: Exchange data
XML is used to Exchange Data
 Text format
 Software-independent, hardware-independent
 Exchange data between incompatible systems, given that they agree on the
same tag definition.
 Can be read by many different types of applications

Benefits:
 Reduce the complexity of interpreting data
 Easier to expand and upgrade a system

93
Application: Store Data
XML can be used to Store Data
 Plain text file
 Store data in files or databases
 Application can be written to store and retrieve information from the store
 Other clients and applications can access your XML files as data sources

Benefits:
Accessible to more applications

94
XML support in IE 5.0+
Internet Explorer 5.0 has the following XML support:
 Viewing of XML documents
 Full support for W3C DTD standards
 XML embedded in HTML as Data Islands
 Binding XML data to HTML elements
 Transforming and displaying XML with XSL
 Displaying XML with CSS
 Access to the XML DOM (Document Object Model)

*Netscape 6.0 also have full XML support

96
Microsoft XML Parser
 Comes with IE 5.0
 The parser features a language-neutral programming model
that supports:
 JavaScript, VBScript, Perl, VB, Java, C++ and more
 W3C XML 1.0 and XML DOM
 DTD and validation

97
Java APIs for XML
 JAXP: Java API for XML Processing
 JAXB: Java Architecture for XML Binding
 JDOM: Java DOM
 DOM4J: an alternative to JDOM
 JAXM: Java API for XML Messaging (asynchronous)
 JAX-RPC: Java API for XML-based Remote Process
Communications (synchronous)
 JAXR: Java API for XML Registries

98

You might also like