
An Introduction to XML and Web Technologies
The XPath Language

Anders Møller & Michael I. Schwartzbach
© 2006 Addison-Wesley

Objectives

ƒ Location steps and paths
ƒ Typical location paths
ƒ Abbreviations
ƒ General expressions

XPath Expressions

ƒ Flexible notation for navigating around trees
ƒ A basic technology that is widely used
• uniqueness and scope in XML Schema
• pattern matching and selection in XSLT
• relations in XLink and XPointer
• computations on values in XSLT and XQuery

Location Paths

ƒ A location path evaluates to a sequence of nodes
ƒ The sequence is sorted in document order
ƒ The sequence will never contain duplicates

Location Steps

ƒ The location path is a sequence of steps
ƒ A location step consists of
• an axis
• a nodetest
• some predicates

axis :: nodetest [Exp1] [Exp2] …

Evaluating a Location Path

ƒ A step maps a context node into a sequence
ƒ This also maps sequences to sequences
• each node is used as context node
• and is replaced with the result of applying the step
ƒ The path then applies each step in turn
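The evaluation rule (each step maps a sequence to a sequence, and the path applies the steps in turn) can be sketched in Python. This is only an illustrative model built on the standard xml.etree.ElementTree module, not how an XPath engine is implemented. The helper names (document_order, apply_step, evaluate) and the small tree are assumptions made for this example, and only the child and descendant axes with a name test are modelled.

```python
import xml.etree.ElementTree as ET

def document_order(root):
    """Map each element to its index in document order."""
    return {node: i for i, node in enumerate(root.iter())}

def child(node, name):
    return [c for c in node if c.tag == name]

def descendant(node, name):
    # Element.iter() also visits the node itself, so skip it.
    return [d for d in node.iter(name) if d is not node]

def apply_step(axis, name, sequence, order):
    """Use every node as context node, then sort and drop duplicates."""
    hits = {}
    for context in sequence:
        for hit in axis(context, name):
            hits[order[hit]] = hit
    return [hits[i] for i in sorted(hits)]

def evaluate(path, context, order):
    """Apply each (axis, name) step in turn, starting from one context node."""
    sequence = [context]
    for axis, name in path:
        sequence = apply_step(axis, name, sequence, order)
    return sequence

# Hypothetical tree, loosely modelled on the A/B/C/E/F example slides.
root = ET.fromstring(
    "<A><B><C><E><F/></E></C></B>"
    "<B><C><C><E/></C><E><F/><F/></E></C></B></A>"
)
order = document_order(root)

# descendant::C/child::E/child::F
selected = evaluate([(descendant, "C"), (child, "E"), (child, "F")], root, order)
print([n.tag for n in selected])
```

The dictionary keyed by document position is what keeps the result sorted and free of duplicates, even when two context nodes reach the same node.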

An Example

[Figure: an example XML tree with root A and descendants labelled B, C, D, E, and F]

An Example

[Figure: the example tree with the context node marked]

An Example

[Figure: the example tree with the result of descendant::C marked]

An Example

[Figure: the example tree with the result of descendant::C/child::E marked]

An Example

[Figure: the example tree with the result of descendant::C/child::E/child::F marked]

Contexts

ƒ The context of an XPath evaluation consists of
• a context node (a node in an XML tree)
• a context position and size (two nonnegative integers)
• a set of variable bindings
• a function library
• a set of namespace declarations
ƒ The application determines the initial context
ƒ If the path starts with ‘/’ then
• the initial context node is the root
• the initial position and size are 1

Axes

ƒ An axis is a sequence of nodes
ƒ The axis is evaluated relative to the context node
ƒ XPath supports 12 different axes
• child
• descendant
• parent
• ancestor
• following-sibling
• preceding-sibling
• attribute
• following
• preceding
• self
• descendant-or-self
• ancestor-or-self

Axis Directions

ƒ Each axis has a direction
ƒ Forwards means document order:
• child, descendant, following-sibling, following, self, descendant-or-self
ƒ Backwards means reverse document order:
• parent, ancestor, preceding-sibling, preceding
ƒ Stable but depends on the implementation:
• attribute
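To make the axis directions concrete, here is a rough Python sketch of three axes over a small hypothetical tree, using the standard xml.etree.ElementTree module. The function names and the tree are assumptions for illustration. ElementTree keeps no parent pointers, so the sketch builds a parent map first.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<A><B><C/><D/><E/></B><F/></A>")

# ElementTree elements carry no parent pointer, so build a lookup table.
parent = {kid: node for node in root.iter() for kid in node}

def ancestor(node):
    """Backwards axis: nearest ancestor first (reverse document order)."""
    out = []
    while node in parent:
        node = parent[node]
        out.append(node)
    return out

def following_sibling(node):
    """Forwards axis: later siblings in document order."""
    if node not in parent:
        return []
    siblings = list(parent[node])
    return siblings[siblings.index(node) + 1:]

def preceding_sibling(node):
    """Backwards axis: earlier siblings, nearest first."""
    if node not in parent:
        return []
    siblings = list(parent[node])
    return siblings[:siblings.index(node)][::-1]

def tags(nodes):
    return [n.tag for n in nodes]

c, d, e = list(root.find("B"))
print(tags(ancestor(d)))           # ancestors of D, nearest first
print(tags(following_sibling(c)))  # siblings after C
print(tags(preceding_sibling(e)))  # siblings before E, nearest first
```

The direction matters mainly for positional predicates: on a backwards axis, position 1 is the node closest to the context node.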

The parent Axis
[Figure: the parent axis]

The child Axis
[Figure: the child axis]

The descendant Axis
[Figure: the descendant axis]

The ancestor Axis
[Figure: the ancestor axis]

The following-sibling Axis
[Figure: the following-sibling axis]

The preceding-sibling Axis
[Figure: the preceding-sibling axis]

The following Axis
[Figure: the following axis]

The preceding Axis
[Figure: the preceding axis]

Node Tests

ƒ text()
ƒ comment()
ƒ processing-instruction()
ƒ node()
ƒ *
ƒ QName
ƒ *:NCName
ƒ NCName:*

Predicates

ƒ General XPath expressions
ƒ Evaluated with the current node as context
ƒ Result is coerced into a boolean
• a number yields true if it equals the context position
• a string yields true if it is not empty
• a sequence yields true if it is not empty

Abbreviations

/child::rcp:collection/child::rcp:recipe/child::rcp:ingredient
is abbreviated as
/rcp:collection/rcp:recipe/rcp:ingredient

/child::rcp:collection/child::rcp:recipe/child::rcp:ingredient/attribute::amount
is abbreviated as
/rcp:collection/rcp:recipe/rcp:ingredient/@amount

/descendant-or-self::node()/   is abbreviated as   //
self::node()                   is abbreviated as   .
parent::node()                 is abbreviated as   ..

General Expressions

ƒ Every expression evaluates to a sequence of
• atomic values
• nodes
ƒ Atomic values may be
• numbers
• booleans
• Unicode strings
• datatypes defined in XML Schema
ƒ Nodes have identity
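The abbreviated forms are the ones most tools accept in practice. As a hedged illustration, Python's standard xml.etree.ElementTree module supports a small subset of the abbreviated syntax (child steps, .//, .., and simple predicates), with paths evaluated relative to an element. The document and the namespace URI below are made up for this sketch. ElementTree has no @amount step, so attributes are read with get().

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI and sample data, for illustration only.
NS = {"rcp": "http://www.brics.dk/ixwt/recipes"}
doc = ET.fromstring(
    '<rcp:collection xmlns:rcp="http://www.brics.dk/ixwt/recipes">'
    "<rcp:recipe>"
    '<rcp:ingredient name="flour" amount="2"/>'
    '<rcp:ingredient name="salt" amount="0.5"/>'
    "</rcp:recipe>"
    "</rcp:collection>"
)

# Relative form of /rcp:collection/rcp:recipe/rcp:ingredient,
# evaluated with the rcp:collection element as context node.
ingredients = doc.findall("rcp:recipe/rcp:ingredient", NS)

# Stand-in for the final /@amount step.
amounts = [i.get("amount") for i in ingredients]

# './/' abbreviates a descendant-or-self::node() step, '..' abbreviates parent::node().
everywhere = doc.findall(".//rcp:ingredient", NS)
parents = doc.findall(".//rcp:ingredient/..", NS)

print(amounts, len(everywhere), [p.tag for p in parents])
```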

Atomization

ƒ A sequence may be atomized
ƒ This results in a sequence of atomic values
ƒ For element nodes this is the concatenation of all descendant text nodes
ƒ For other nodes this is the obvious string

Literal Expressions

42
3.1415
6.022E23
'XPath is a lot of fun'
"XPath is a lot of fun"
'The cat said "Meow!"'
"The cat said ""Meow!"""
"XPath is just
so much fun"
Arithmetic Expressions

ƒ +, -, *, div, idiv, mod
ƒ Operators are generalized to sequences
• if any argument is empty, the result is empty
• if all arguments are singleton sequences of numbers, the operation is performed
• otherwise, a runtime error occurs

Variable References

$foo
$bar:foo

ƒ $foo-17 refers to the variable "foo-17"
ƒ Possible fixes:
($foo)-17, $foo -17, $foo+-17
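The three rules for arithmetic on sequences can be modelled in a few lines of Python. This is a simplified sketch that represents XPath sequences as Python lists and leaves out atomization. The function names arithmetic and is_number are invented for the example.

```python
import operator

def is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def arithmetic(op, left, right):
    """Apply a binary operator to two sequences, following the three rules."""
    if not left or not right:
        return []                                   # empty argument: empty result
    if (len(left) == 1 and len(right) == 1
            and is_number(left[0]) and is_number(right[0])):
        return [op(left[0], right[0])]              # singleton numbers: compute
    raise TypeError("operands must be singleton sequences of numbers")

print(arithmetic(operator.add, [4], [4]))   # singleton + singleton
print(arithmetic(operator.mul, [], [7]))    # an empty operand gives an empty result
```

Calling arithmetic(operator.add, [1, 2], [3]) raises TypeError, which stands in for the runtime error.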

Sequence Expressions

ƒ The ',' operator concatenates sequences
ƒ Integer ranges are constructed with 'to'
ƒ Operators: union, intersect, except
ƒ Sequences are always flattened
ƒ These expressions give the same result:

(1,(2,3,4),((5)),(),(((6,7),8,9)))
1 to 9
1,2,3,4,5,6,7,8,9

Path Expressions

ƒ Location paths are expressions
ƒ They may start from arbitrary sequences
• evaluate the path for each node
• use the given node as context node
• context position and size are taken from the sequence
• the results are combined in document order
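Flattening means a sequence never contains another sequence as an item. A small Python model shows why the three sequence expressions agree. Here seq plays the role of the ',' operator and to plays the role of the range operator. Both names are assumptions for this sketch.

```python
def seq(*items):
    """Model of the ',' operator: concatenate, splicing in nested sequences."""
    out = []
    for item in items:
        if isinstance(item, list):
            out.extend(item)      # a nested sequence contributes its items
        else:
            out.append(item)      # an atomic value contributes itself
    return out

def to(first, last):
    """Model of 'first to last': an inclusive integer range."""
    return list(range(first, last + 1))

nested = seq(1, seq(2, 3, 4), seq(seq(5)), seq(), seq(seq(seq(6, 7), 8, 9)))
print(nested == to(1, 9) == seq(1, 2, 3, 4, 5, 6, 7, 8, 9))
```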

Filter Expressions

ƒ Predicates generalized to arbitrary sequences
ƒ The expression '.' is the context item
ƒ The expression:
(10 to 40)[. mod 5 = 0 and position()>20]
has the result:
30, 35, 40

Value Comparison

ƒ Operators: eq, ne, lt, le, gt, ge
ƒ Used on atomic values
ƒ When applied to arbitrary values:
• atomize
• if either argument is empty, the result is empty
• if either has length >1, the result is false
• if incomparable, a runtime error
• otherwise, compare the two atomic values

8 eq 4+4
(//rcp:ingredient)[1]/@name eq "beef cube steak"
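The filter expression result can be reproduced with a small Python model in which each item is tested together with its context position and the context size. The helper names filter_seq and keep are invented for this sketch. keep mirrors the predicate coercion rules: a number is compared with the position, and anything else is tested for emptiness.

```python
def keep(value, position):
    """Coerce a predicate result to a boolean, as predicates do."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        return value == position        # a number selects by position
    return bool(value)                  # strings and sequences: true if not empty

def filter_seq(sequence, predicate):
    """predicate(item, position, last) is evaluated once per item."""
    last = len(sequence)
    return [item
            for position, item in enumerate(sequence, start=1)
            if keep(predicate(item, position, last), position)]

# (10 to 40)[. mod 5 = 0 and position()>20]
result = filter_seq(list(range(10, 41)),
                    lambda item, position, last: item % 5 == 0 and position > 20)
print(result)

# A numeric predicate such as [3] selects the third item.
third = filter_seq([10, 20, 30, 40], lambda item, position, last: 3)
print(third)
```

Positions start at 1, so position()>20 refers to the items from 30 onwards, not to items greater than 20.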

General Comparison

ƒ Operators: =, !=, <, <=, >, >=
ƒ Used on general values:
• atomize
• if there exist two values, one from each argument, whose comparison holds, the result is true
• otherwise, the result is false

8 = 4+4
(1,2) = (2,4)
//rcp:ingredient/@name = "salt"

Node Comparison

ƒ Operators: is, <<, >>
ƒ Used to compare nodes on identity and order
ƒ When applied to arbitrary values:
• if either argument is empty, the result is empty
• if both are singleton nodes, the nodes are compared
• otherwise, a runtime error

(//rcp:recipe)[2] is //rcp:recipe[rcp:title eq "Ricotta Pie"]
/rcp:collection << (//rcp:recipe)[4]
(//rcp:recipe)[4] >> (//rcp:recipe)[3]

Be Careful About Comparisons

((//rcp:ingredient)[40]/@name, (//rcp:ingredient)[40]/@amount) eq
((//rcp:ingredient)[53]/@name, (//rcp:ingredient)[53]/@amount)

Yields false, since the arguments are not singletons

((//rcp:ingredient)[40]/@name, (//rcp:ingredient)[40]/@amount) =
((//rcp:ingredient)[53]/@name, (//rcp:ingredient)[53]/@amount)

Yields true, since the two names are found to be equal

((//rcp:ingredient)[40]/@name, (//rcp:ingredient)[40]/@amount) is
((//rcp:ingredient)[53]/@name, (//rcp:ingredient)[53]/@amount)

Yields a runtime error, since the arguments are not singletons

Algebraic Axioms for Comparisons

• Reflexivity: x = x
• Symmetry: x = y ⇒ y = x
• Transitivity: x = y ∧ y = z ⇒ x = z
                x < y ∧ y < z ⇒ x < z
• Anti-symmetry: x ≤ y ∧ y ≤ x ⇒ x = y
• Negation: x ≠ y ⇔ ¬(x = y)

XPath Violates Most Axioms

ƒ Reflexivity?
()=() yields false
ƒ Transitivity?
(1,2)=(2,3), (2,3)=(3,4), not (1,2)=(3,4)
ƒ Anti-symmetry?
(1,4)<=(2,3), (2,3)<=(1,4), not (1,4)=(2,3)
ƒ Negation?
(1)!=() yields false, (1)=() yields false

Boolean Expressions

ƒ Operators: and, or
ƒ Arguments are coerced, false if the value is:
• the boolean false
• the empty sequence
• the empty string
• the number zero
ƒ Constants use functions true() and false()
ƒ Negation uses not(…)
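The counterexamples on the "XPath Violates Most Axioms" slide follow directly from the existential rule for general comparisons: the result is true if some pair of values, one from each side, satisfies the operator. A minimal Python model, with sequences as lists and helper names chosen only for this sketch, lets each case be checked.

```python
import operator

def general(op, left, right):
    """General comparison: true if any pair (a, b) satisfies op."""
    return any(op(a, b) for a in left for b in right)

def eq(left, right):
    return general(operator.eq, left, right)

def ne(left, right):
    return general(operator.ne, left, right)

def le(left, right):
    return general(operator.le, left, right)

checks = {
    "reflexivity holds for ()": eq([], []),
    "(1,2)=(2,3)": eq([1, 2], [2, 3]),
    "(2,3)=(3,4)": eq([2, 3], [3, 4]),
    "(1,2)=(3,4)": eq([1, 2], [3, 4]),
    "(1,4)<=(2,3)": le([1, 4], [2, 3]),
    "(2,3)<=(1,4)": le([2, 3], [1, 4]),
    "(1,4)=(2,3)": eq([1, 4], [2, 3]),
    "(1)!=()": ne([1], []),
    "(1)=()": eq([1], []),
}
for label, value in checks.items():
    print(label, value)
```

With an empty operand there are no pairs at all, so both = and != come out false, which is why negation fails.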

Functions

ƒ XPath has an extensive function library
ƒ Default namespace for functions:
http://www.w3.org/2006/xpath-functions
ƒ 106 functions are required
ƒ More functions with the namespace:
http://www.w3.org/2001/XMLSchema

Function Invocation

ƒ Calling a function with 4 arguments:
fn:avg(1,2,3,4)
ƒ Calling a function with 1 argument:
fn:avg((1,2,3,4))

Arithmetic Functions

fn:abs(-23.4) = 23.4
fn:ceiling(23.4) = 24
fn:floor(23.4) = 23
fn:round(23.4) = 23
fn:round(23.5) = 24

Boolean Functions

fn:not(0) = fn:true()
fn:not(fn:true()) = fn:false()
fn:not("") = fn:true()
fn:not((1)) = fn:false()

String Functions

fn:concat("X","ML") = "XML"
fn:concat("X","ML"," ","book") = "XML book"
fn:string-join(("XML","book")," ") = "XML book"
fn:string-join(("1","2","3"),"+") = "1+2+3"
fn:substring("XML book",5) = "book"
fn:substring("XML book",2,4) = "ML b"
fn:string-length("XML book") = 8
fn:upper-case("XML book") = "XML BOOK"
fn:lower-case("XML book") = "xml book"

Regexp Functions

fn:contains("XML book","XML") = fn:true()
fn:matches("XML book","XM..[a-z]*") = fn:true()
fn:matches("XML book",".*Z.*") = fn:false()
fn:replace("XML book","XML","Web") = "Web book"
fn:replace("XML book","[a-z]","8") = "XML 8888"
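One detail in the string functions is easy to miss: fn:substring counts characters from 1, and its third argument is a length rather than an end position. The following Python sketch imitates that behaviour for integer arguments only. The function name substring is chosen for the example, and the rounding rules that apply to fractional arguments are not modelled.

```python
def substring(text, start, length=None):
    """Imitate fn:substring for integer arguments (positions start at 1)."""
    begin = start - 1                                   # convert to a 0-based index
    end = None if length is None else max(begin + length, 0)
    return text[max(begin, 0):end]

print(substring("XML book", 5))       # from position 5 to the end
print(substring("XML book", 2, 4))    # four characters starting at position 2
```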

Cardinality Functions

fn:exists(()) = fn:false()
fn:exists((1,2,3,4)) = fn:true()
fn:empty(()) = fn:true()
fn:empty((1,2,3,4)) = fn:false()
fn:count((1,2,3,4)) = 4
fn:count(//rcp:recipe) = 5

Sequence Functions

fn:distinct-values((1, 2, 3, 4, 3, 2)) = (1, 2, 3, 4)
fn:insert-before((2, 4, 6, 8), 2, (3, 5)) = (2, 3, 5, 4, 6, 8)
fn:remove((2, 4, 6, 8), 3) = (2, 4, 8)
fn:reverse((2, 4, 6, 8)) = (8, 6, 4, 2)
fn:subsequence((2, 4, 6, 8, 10), 2) = (4, 6, 8, 10)
fn:subsequence((2, 4, 6, 8, 10), 2, 3) = (4, 6, 8)
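The sequence functions all use 1-based positions. A rough Python equivalent, with sequences as lists and function names adapted to Python, reproduces the results listed above. It is a sketch for integer positions only. Keeping first occurrences in distinct_values is a choice made here for the example.

```python
def distinct_values(seq):
    # dict preserves insertion order, so first occurrences are kept.
    return list(dict.fromkeys(seq))

def insert_before(seq, position, inserts):
    index = max(position, 1) - 1          # 1-based position to 0-based index
    return seq[:index] + inserts + seq[index:]

def remove(seq, position):
    return [v for p, v in enumerate(seq, start=1) if p != position]

def subsequence(seq, start, length=None):
    begin = start - 1
    end = None if length is None else max(begin + length, 0)
    return seq[max(begin, 0):end]

print(distinct_values([1, 2, 3, 4, 3, 2]))
print(insert_before([2, 4, 6, 8], 2, [3, 5]))
print(remove([2, 4, 6, 8], 3))
print(subsequence([2, 4, 6, 8, 10], 2, 3))
```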

Aggregate Functions

fn:avg((2, 3, 4, 5, 6, 7)) = 4.5
fn:max((2, 3, 4, 5, 6, 7)) = 7
fn:min((2, 3, 4, 5, 6, 7)) = 2
fn:sum((2, 3, 4, 5, 6, 7)) = 27

Node Functions

fn:doc("http://www.brics.dk/ixwt/recipes/recipes.xml")
fn:position()
fn:last()

Coercion Functions

xs:integer("5") = 5
xs:integer(7.0) = 7
xs:decimal(5) = 5.0
xs:decimal("4.3") = 4.3
xs:decimal("4") = 4.0
xs:double(2) = 2.0E0
xs:double(14.3) = 1.43E1
xs:boolean(0) = fn:false()
xs:boolean("true") = fn:true()
xs:string(17) = "17"
xs:string(1.43E1) = "14.3"
xs:string(fn:true()) = "true"

For Expressions

ƒ The expression
for $r in //rcp:recipe
return fn:count($r//rcp:ingredient[fn:not(rcp:ingredient)])
returns the value
11, 12, 15, 8, 30
ƒ The expression
for $i in (1 to 5),
    $j in (1 to $i)
return $j
returns the value
1, 1, 2, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5
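The nested for expression behaves like a nested loop whose results are concatenated into one flat sequence. As a quick cross-check, the equivalent Python list comprehension produces the same fifteen values.

```python
# The outer variable ranges over 1..5, the inner one over 1..i.
result = [j for i in range(1, 6) for j in range(1, i + 1)]
print(result)
```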

Conditional Expressions

fn:avg(
  for $r in //rcp:ingredient return
    if ( $r/@unit = "cup" )
      then xs:double($r/@amount) * 237
    else if ( $r/@unit = "teaspoon" )
      then xs:double($r/@amount) * 5
    else if ( $r/@unit = "tablespoon" )
      then xs:double($r/@amount) * 15
    else ()
)

Quantified Expressions

some $r in //rcp:ingredient
satisfies $r/@name eq "sugar"

fn:exists(
  for $r in //rcp:ingredient return
    if ($r/@name eq "sugar") then fn:true() else ()
)
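Both slides can be mirrored in Python to show what the expressions compute. The ingredient records below are invented sample data, not taken from the recipe collection. The conditional returns the empty sequence for unknown units, so those ingredients drop out of the average, and the quantified expression corresponds to any().

```python
# Hypothetical sample data standing in for //rcp:ingredient.
ingredients = [
    {"name": "sugar", "unit": "cup", "amount": "0.5"},
    {"name": "salt", "unit": "teaspoon", "amount": "1"},
    {"name": "butter", "unit": "tablespoon", "amount": "2"},
    {"name": "egg", "unit": None, "amount": "3"},
]

# Conversion factors used on the Conditional Expressions slide.
FACTOR = {"cup": 237, "teaspoon": 5, "tablespoon": 15}

# The 'else ()' branch contributes nothing, so unknown units are skipped.
volumes = [float(r["amount"]) * FACTOR[r["unit"]]
           for r in ingredients if r["unit"] in FACTOR]
average = sum(volumes) / len(volumes)

# some $r in ... satisfies $r/@name eq "sugar"
has_sugar = any(r["name"] == "sugar" for r in ingredients)

print(volumes, average, has_sugar)
```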

XPath 1.0 Restrictions

ƒ Many implementations only support XPath 1.0
ƒ Smaller function library
ƒ Implicit casts of values
ƒ Some expressions change semantics:
"4" < "4.0"
is false in XPath 1.0 but true in XPath 2.0

XPointer

ƒ A fragment identifier mechanism based on XPath
ƒ Different ways of pointing to the fourth recipe:
...#xpointer(//recipe[4])
...#xpointer(//rcp:recipe[./rcp:title ='Zuppa Inglese'])
...#element(/1/5)
...#r102
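The change in meaning of "4" < "4.0" comes from the implicit casts: XPath 1.0 converts both operands of < to numbers, whereas XPath 2.0 compares two strings as strings. The Python sketch below imitates the two behaviours. The function names are made up, and Python's code-point string ordering is used as a stand-in for the default collation.

```python
def less_than_xpath1(left, right):
    """XPath 1.0 style: both operands are converted to numbers first."""
    return float(left) < float(right)

def less_than_xpath2(left, right):
    """XPath 2.0 style for two strings: compare them as strings."""
    return left < right

print(less_than_xpath1("4", "4.0"))   # numeric comparison, 4 < 4.0
print(less_than_xpath2("4", "4.0"))   # string comparison, "4" is a proper prefix
```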

XLink

ƒ Generalizing hyperlinks from HTML to XML
ƒ Allow many-to-many relations
ƒ Allow third party links
ƒ Allow arbitrary element names
ƒ Not widely used…

An XLink Link

<mylink xmlns:xlink="http://www.w3.org/1999/xlink"
        xlink:type="extended">
  <myresource xlink:type="locator"
              xlink:href="students.xml#Carl" xlink:label="student"/>
  <myresource xlink:type="locator"
              xlink:href="students.xml#Fred" xlink:label="student"/>
  <myresource xlink:type="locator"
              xlink:href="teachers.xml#Joe" xlink:label="teacher"/>
  <myarc xlink:type="arc"
         xlink:from="student" xlink:to="teacher"/>
</mylink>

A Picture of the XLink

[Figure: the mylink element with three myresource nodes (Carl, Fred, Joe) and two myarc arcs]

Simple Links

<mylink xlink:type="simple" xlink:href="..." xlink:show="..." .../>

<mylink xlink:type="extended">
  <myresource xlink:type="resource"
              xlink:label="local"/>
  <myresource xlink:type="locator"
              xlink:label="remote" xlink:href="..."/>
  <myarc xlink:type="arc"
         xlink:from="local" xlink:to="remote" xlink:show="..."
         .../>
</mylink>

HTML Links in XLink

<a xlink:type="simple"
   xlink:href="..."
   xlink:show="replace"
   xlink:actuate="onRequest"/>

The HLink Alternative

<hlink namespace="http://www.w3.org/1999/xhtml"
       element="a"
       locator="@href"
       effect="replace"
       actuate="onRequest"
       replacement="@target"/>

Essential Online Resources

ƒ http://www.w3.org/TR/xpath/
ƒ http://www.w3.org/TR/xpath20/
ƒ http://www.w3.org/TR/xlink/
ƒ http://www.w3.org/TR/xptr-framework/
