Download as pdf or txt
Download as pdf or txt
You are on page 1of 159

XML and XPath with PHP

TobiasSchlitt <toby@php.net>
IPC SE 2009

2009-05-29

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

1 / 102

About me

Tobias Schlitt <toby@php.net> PHP since 2001 Freelancing consultant Qualied IT Specialist Studying CS at TU Dortmund
(expect to nish this year)

OSS addicted
PHP eZ Components PHPUnit

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

2 / 102

And who are you?

What is your name? Where are you from? What is your business? What are your experiences with XML / XPath? What do you expect from this workshop?

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

3 / 102

Overview

1 XML 2 XML in PHP 3 XPath

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

4 / 102

Outline

1 XML

Overview Terminology The XML tree


2 XML in PHP 3 XPath

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

5 / 102

Outline - XML

1 XML

Overview Terminology The XML tree


2 XML in PHP 3 XPath

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

6 / 102

What is XML?

General specication for creating markup languages Data exchange between computer systems
System independent Human readable Most used on the web Successor of SGML

W3C recommendation 1.0 1998 (last update 2008-11-26) 1.1 2004 (last update 2006-08-16)

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

7 / 102

XML languages by example

XHTML RSS

XML variant of the popular HTML Really Simply Syndication Provide news / updates / ... of websites Read by special clients Aggregation on portals / planets Scalable Vector Graphics Describe vector graphics in XML Potentially interactive / animated (via ECMAScript)

SVG

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

8 / 102

Related technologies - Schemas


Schema A schema denes the structure XML instance documents. XMLSchema Written in XML W3C recommendation Popular 2 syntax variants
XML based Short plain text based

RelaxNG

OASIS / ISO standard Popular DTD Plain text W3C recommendation Deprecated

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

9 / 102

Related technologies - Querying

Query A query extracts a sub-set of information from a data source. XPath W3C recommendation Navigation in XML documents more on that later... Functional programming language Allows complex queries

XQuery

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

10 / 102

Outline - XML

1 XML

Overview Terminology The XML tree


2 XML in PHP 3 XPath

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

11 / 102

An XML document
<?xml version=1.0 encoding=UTF-8?> <bookshelf> <book id=1> <title lang=en>Beautiful code</title> <author>A. Oram</author> <author>G. Wilson</author> <year>2007</year> <price currency=Euro>35.95</price> </book> <book id=2> <title lang=de>eZ Components - Das Entwicklerhandbuch</title> <author>T. Schlitt</author> <author>K. Nordmann</author> <year>2007</year> <price currency=Euro>39.95</price> </book> </bookshelf>
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 12 / 102

Preamble
<?xml version=1.0 encoding=UTF-8?> <bookshelf> <book id=1> <title lang=en>Beautiful code</title> <author>A. Oram</author> <author>G. Wilson</author> <year>2007</year> <price currency=Euro>35.95</price> </book> <book id=2> <title lang=de>eZ Components - Das Entwicklerhandbuch</title> <author>T. Schlitt</author> <author>K. Nordmann</author> <year>2007</year> <price currency=Euro>39.95</price> </book> </bookshelf>
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 12 / 102

Element nodes
<?xml version=1.0 encoding=UTF-8?> <bookshelf> <book id=1> <title lang=en>Beautiful code</title> <author>A. Oram</author> <author>G. Wilson</author> <year>2007</year> <price currency=Euro>35.95</price> </book> <book id=2> <title lang=de>eZ Components - Das Entwicklerhandbuch</title> <author>T. Schlitt</author> <author>K. Nordmann</author> <year>2007</year> <price currency=Euro>39.95</price> </book> </bookshelf>
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 12 / 102

Attribute nodes
<?xml version=1.0 encoding=UTF-8?> <bookshelf> <book id=1> <title lang=en>Beautiful code</title> <author>A. Oram</author> <author>G. Wilson</author> <year>2007</year> <price currency=Euro>35.95</price> </book> <book id=2> <title lang=de>eZ Components - Das Entwicklerhandbuch</title> <author>T. Schlitt</author> <author>K. Nordmann</author> <year>2007</year> <price currency=Euro>39.95</price> </book> </bookshelf>
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 12 / 102

Atomic value nodes


<?xml version=1.0 encoding=UTF-8?> <bookshelf> <book id=1> <title lang=en>Beautiful code</title> <author>A. Oram</author> <author>G. Wilson</author> <year>2007</year> <price currency=Euro>35.95</price> </book> <book id=2> <title lang=de>eZ Components - Das Entwicklerhandbuch</title> <author>T. Schlitt</author> <author>K. Nordmann</author> <year>2007</year> <price currency=Euro>39.95</price> </book> </bookshelf>
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 12 / 102

Document element
<?xml version=1.0 encoding=UTF-8?> <bookshelf> <book id=1> <title lang=en>Beautiful code</title> <author>A. Oram</author> <author>G. Wilson</author> <year>2007</year> <price currency=Euro>35.95</price> </book> <book id=2> <title lang=de>eZ Components - Das Entwicklerhandbuch</title> <author>T. Schlitt</author> <author>K. Nordmann</author> <year>2007</year> <price currency=Euro>39.95</price> </book> </bookshelf>
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 12 / 102

Outline - XML

1 XML

Overview Terminology The XML tree


2 XML in PHP 3 XPath

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

13 / 102

The XML tree


<bookshelf> <book> <book>

id= 1

<title> <author> <author> <year> A. Oram G. Wilson Beautiful lang= code en 2007

<price>

39.95

currency= Euro

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

14 / 102

Children I
<bookshelf> <book> <book>

id= 1

<title> <author> <author> <year> A. Oram G. Wilson Beautiful lang= code en 2007

<price>

39.95

currency= Euro

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

14 / 102

Children II
<bookshelf> <book> <book>

id= 1

<title> <author> <author> <year> A. Oram G. Wilson Beautiful lang= code en 2007

<price>

39.95

currency= Euro

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

14 / 102

Parent
<bookshelf> <book> <book>

id= 1

<title> <author> <author> <year> A. Oram G. Wilson Beautiful lang= code en 2007

<price>

39.95

currency= Euro

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

14 / 102

Descendants
<bookshelf> <book> <book>

id= 1

<title> <author> <author> <year> A. Oram G. Wilson Beautiful lang= code en 2007

<price>

39.95

currency= Euro

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

14 / 102

Ascendants
<bookshelf> <book> <book>

id= 1

<title> <author> <author> <year> A. Oram G. Wilson Beautiful lang= code en 2007

<price>

39.95

currency= Euro

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

14 / 102

Siblings
<bookshelf> <book> <book>

id= 1

<title> <author> <author> <year> A. Oram G. Wilson Beautiful lang= code en 2007

<price>

39.95

currency= Euro

Attribute siblings Attributes are not considered siblings to elements.


Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 14 / 102

CDATA
CDATA Avoid the escaping hell in text content.

Without CDATA
<b o o k s h e l f> <book i d= 1 > < t i t l e l a n g= en > B e a u t i f u l c o d e</ t i t l e > <h i n t> Some e x a m p l e s make u s e o f &lt;xml&gt; . </ h i n t> </ book> </ b o o k s h e l f>

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

15 / 102

CDATA
CDATA Avoid the escaping hell in text content.

With CDATA
<b o o k s h e l f> <book i d= 1 > < t i t l e l a n g= en > B e a u t i f u l c o d e</ t i t l e > <h i n t> <![CDATA[ Some e x a m p l e s make u s e o f <xml> . ]]> </ h i n t> </ book> </ b o o k s h e l f>

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

15 / 102

CDATA
CDATA Avoid the escaping hell in text content.

The CDATA dilemma


<b o o k s h e l f> <book i d= 1 > < t i t l e l a n g= en > B e a u t i f u l c o d e</ t i t l e > <h i n t> <![CDATA[ Some e x a m p l e s show t h e u s a g e o f <![CDATA[ ]]> ]]> </ h i n t> </ book> </ b o o k s h e l f>

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

15 / 102

CDATA
CDATA Avoid the escaping hell in text content.

The CDATA dilemma workaround


<b o o k s h e l f> <book i d= 1 > < t i t l e l a n g= en > B e a u t i f u l c o d e</ t i t l e > <h i n t> <![CDATA[ Some e x a m p l e s show t h e u s a g e o f <![CDATA[ ]]]]><![CDATA[> ]]> </ h i n t> </ book> </ b o o k s h e l f>

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

15 / 102

CDATA
CDATA Avoid the escaping hell in text content.

The CDATA dilemma solution


<b o o k s h e l f> <book i d= 1 > < t i t l e l a n g= en > B e a u t i f u l c o d e</ t i t l e > <h i n t> U29tZSBleGFtcGxlcyBzaG93IHRoZSB1c2FnZSBvZiA8IVtDREFUQVsgXV0+ </ h i n t> </ book> </ b o o k s h e l f>

Base 64 in PHP Use the built in functions base64 encode() and base64 decode().

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

15 / 102

Comments

Comments in XML
<b o o k s h e l f> <book i d= 1 > < t i t l e l a n g= en > B e a u t i f u l c o d e</ t i t l e > <a u t h o r>A . Oram</ a u t h o r> <a u t h o r>G . W i l s o n</ a u t h o r> <y e a r>2007</ y e a r> < p r i c e c u r r e n c y= Euro >3 5 . 9 5</ p r i c e> </ book> <! . . . more b o o k s . . . > </ b o o k s h e l f>

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

16 / 102

Namespaces
Namespaces Allow to avoid naming conicts between dierent XML sources.

Single, default namespace


<b o o k s h e l f x m l n s= h t t p : // e x a m p l e . com/ book > <book i d= 1 > < t i t l e l a n g= en > B e a u t i f u l c o d e</ t i t l e > <a u t h o r>A . Oram</ a u t h o r> <a u t h o r>G . W i l s o n</ a u t h o r> <y e a r>2007</ y e a r> < p r i c e c u r r e n c y= Euro >3 5 . 9 5</ p r i c e> </ book> </ b o o k s h e l f>

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

17 / 102

Namespaces
Namespaces Allow to avoid naming conicts between dierent XML sources.

Multiple namespaces
<b o o k s h e l f x m l n s= h t t p : // e x a m p l e . com/ book x m l n s : b o o k= h t t p : // e x a m p l e . com/ book x m l n s : d c= h t t p : // p u r l . o r g / dc / e l e m e n t s / 1 . 1 / > <book i d= 1 > < d c : t i t l e b o o k : l a n g= en > B e a u t i f u l c o d e</ d c : t i t l e> <a u t h o r>A . Oram</ a u t h o r> <a u t h o r>G . W i l s o n</ a u t h o r> <y e a r>2007</ y e a r> < p r i c e c u r r e n c y= Euro >3 5 . 9 5</ p r i c e> </ book> </ b o o k s h e l f>

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

17 / 102

Outline

1 XML 2 XML in PHP

DOM
Introductional example Essential classes Practical DOM

XMLReader/-Writer (by example) SimpleXml (by example)


3 XPath

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

18 / 102

Overview

XML APIs PHP has quite some XML APIs. The most important are:
DOM XMLReader / XMLWriter SimpleXML

Deprecated are:
DOM XML XML Parser

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

19 / 102

Overview - DOM

Document Object Model Standardized API to access XML tree


W3C recommendation Level 1 in 1999 Currently: Level 3 (2004)

Available in many languages


C Java Perl Python ...

Represents XML nodes as objects Loads full XML tree into memory

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

20 / 102

Overview - XMLReader/-Writer

Popular approach to access XML data Similar implementations available in


Java C#

Pull / push based Does not load XML fully into memory

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

21 / 102

Overview - SimpleXml

Very simple access to XML data Unique (?) to PHP Represents XML structures as objects Initial implementation hackish Loads full XML tree into memory You dont want to use SimpleXML, seriously!

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

22 / 102

APIs compared
DOM Read Write Manipulate Full control Namespaces XPath Validate XMLReader/-Writer SimpleXML -

Comfort

DTD Schema RelaxNG

DTD Schema RelaxNG

Fully supported Supported but not nice Poorly supported - Not supported at all

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

23 / 102

Outline - XML in PHP

1 XML 2 XML in PHP

DOM
Introductional example Essential classes Practical DOM

XMLReader/-Writer (by example) SimpleXml (by example)


3 XPath

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

24 / 102

Outline - DOM

Introductional example

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

25 / 102

Getting started
Printing all authors
$dom = new DOMDocument ( ) ; $dom>l o a d ( s o u r c e s / e x a m p l e . xml ) ; $ a u t h o r s = $dom>getElementsByTagName ( a u t h o r ) ; foreach ( $authors as $author ) { echo A u t h o r : . $ a u t h o r >n o d e V a l u e . \n ; }

Output
Author : Author : Author : Author : A. G. T. K. Oram Wilson Schlitt Nordmann
XML and XPath with PHP 2009-05-29 26 / 102

Tobias Schlitt (IPC SE 2009)

Outline - DOM

Essential classes

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

27 / 102

DOMNode
Purpose Base class for all nodes types (elements, attributes, ...).

Typical tree operations


appendChild() removeChild() replaceChild() hasChildNodes() insertBefore()

Typical tree properties


$parent $childNodes $previousSibling $nextSibling

DOM specic properties


$nodeType $nodeValue $ownerDocument $namespaceURI $prefix $localName
2009-05-29 28 / 102

DOM specic operations


cloneNode() lookupNamespaceURI() lookupPrefix() isDefaultNamespace() normalize()
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP

DOMNode
Purpose Base class for all nodes types (elements, attributes, ...).

Typical tree operations


appendChild() removeChild() replaceChild() hasChildNodes() insertBefore()

Typical tree properties


$parent $childNodes $previousSibling $nextSibling

DOM specic properties


$nodeType $nodeValue $ownerDocument $namespaceURI $prefix $localName
2009-05-29 28 / 102

DOM specic operations


cloneNode() lookupNamespaceURI() lookupPrefix() isDefaultNamespace() normalize()
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP

DOMNode
Purpose Base class for all nodes types (elements, attributes, ...).

Typical tree operations


appendChild() removeChild() replaceChild() hasChildNodes() insertBefore()

Typical tree properties


$parent $childNodes $previousSibling $nextSibling

DOM specic properties


$nodeType $nodeValue $ownerDocument $namespaceURI $prefix $localName
2009-05-29 28 / 102

DOM specic operations


cloneNode() lookupNamespaceURI() lookupPrefix() isDefaultNamespace() normalize()
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP

DOMNode
Purpose Base class for all nodes types (elements, attributes, ...).

Typical tree operations


appendChild() removeChild() replaceChild() hasChildNodes() insertBefore()

Typical tree properties


$parent $childNodes $previousSibling $nextSibling

DOM specic properties


$nodeType $nodeValue $ownerDocument $namespaceURI $prefix $localName
2009-05-29 28 / 102

DOM specic operations


cloneNode() lookupNamespaceURI() lookupPrefix() isDefaultNamespace() normalize()
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP

DOMElement
Purpose Representation of an element (extends node)

Attribute related operations


hasAttribute[NS]() getAttribute[NS]() getAttributeNode[NS]() setAttribute[NS]() removeAttribute[NS]()

Properties
$tagName

Element related operations


getElementsByTagName[NS]() appendChild() (inherited) removeChild() (inherited) replaceChild() (inherited)
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 29 / 102

DOMElement
Purpose Representation of an element (extends node)

Attribute related operations


hasAttribute[NS]() getAttribute[NS]() getAttributeNode[NS]() setAttribute[NS]() removeAttribute[NS]()

Properties
$tagName

Element related operations


getElementsByTagName[NS]() appendChild() (inherited) removeChild() (inherited) replaceChild() (inherited)
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 29 / 102

DOMElement
Purpose Representation of an element (extends node)

Attribute related operations


hasAttribute[NS]() getAttribute[NS]() getAttributeNode[NS]() setAttribute[NS]() removeAttribute[NS]()

Properties
$tagName

Element related operations


getElementsByTagName[NS]() appendChild() (inherited) removeChild() (inherited) replaceChild() (inherited)
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 29 / 102

DOMDocument
Purpose Representation of a XML document

Creation methods
createAttribute[NS]() createElement[NS]()

Load /save methods


load/save() load/saveHTMLFile()

Element retrieval methods


getElementsByTagName[NS]() getElementById()

Properties
$documentElement $documentURI $preserveWhitespace $formatOutput $doctype $xmlVersion $xmlEncoding $recover (not DOM!)

Misc operations
registerNodeClass() validate() schemaValidate() relaxNGValidate()

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

30 / 102

DOMDocument
Purpose Representation of a XML document

Creation methods
createAttribute[NS]() createElement[NS]()

Load /save methods


load/save() load/saveHTMLFile()

Element retrieval methods


getElementsByTagName[NS]() getElementById()

Properties
$documentElement $documentURI $preserveWhitespace $formatOutput $doctype $xmlVersion $xmlEncoding $recover (not DOM!)

Misc operations
registerNodeClass() validate() schemaValidate() relaxNGValidate()

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

30 / 102

DOMDocument
Purpose Representation of a XML document

Creation methods
createAttribute[NS]() createElement[NS]()

Load /save methods


load/save() load/saveHTMLFile()

Element retrieval methods


getElementsByTagName[NS]() getElementById()

Properties
$documentElement $documentURI $preserveWhitespace $formatOutput $doctype $xmlVersion $xmlEncoding $recover (not DOM!)

Misc operations
registerNodeClass() validate() schemaValidate() relaxNGValidate()

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

30 / 102

DOMDocument
Purpose Representation of a XML document

Creation methods
createAttribute[NS]() createElement[NS]()

Load /save methods


load/save() load/saveHTMLFile()

Element retrieval methods


getElementsByTagName[NS]() getElementById()

Properties
$documentElement $documentURI $preserveWhitespace $formatOutput $doctype $xmlVersion $xmlEncoding $recover (not DOM!)

Misc operations
registerNodeClass() validate() schemaValidate() relaxNGValidate()

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

30 / 102

DOMDocument
Purpose Representation of a XML document

Creation methods
createAttribute[NS]() createElement[NS]()

Load /save methods


load/save() load/saveHTMLFile()

Element retrieval methods


getElementsByTagName[NS]() getElementById()

Properties
$documentElement $documentURI $preserveWhitespace $formatOutput $doctype $xmlVersion $xmlEncoding $recover (not DOM!)

Misc operations
registerNodeClass() validate() schemaValidate() relaxNGValidate()

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

30 / 102

DOMNodeList
Purpose Collection of DOMNodes (elements, attributes, ...). Iteratable. Operations
item()

Properties
$length

Standard
f o r ( $ i = 0 ; $ i < $ a u t h o r s >l e n g t h ; ++$ i ) { echo A u t h o r : . $ a u t h o r s >i t e m ( $ i )>n o d e V a l u e . \n ; }

Foreach
foreach ( $authors as $author ) { echo A u t h o r : . $ a u t h o r >n o d e V a l u e . \n ; }

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

31 / 102

DOMNodeList
Purpose Collection of DOMNodes (elements, attributes, ...). Iteratable. Operations
item()

Properties
$length

Standard
f o r ( $ i = 0 ; $ i < $ a u t h o r s >l e n g t h ; ++$ i ) { echo A u t h o r : . $ a u t h o r s >i t e m ( $ i )>n o d e V a l u e . \n ; }

Foreach
foreach ( $authors as $author ) { echo A u t h o r : . $ a u t h o r >n o d e V a l u e . \n ; }

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

31 / 102

Outline - DOM

Practical DOM

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

32 / 102

Reading XML I

Reading
$dom = new DOMDocument ( ) ; $dom>l o a d ( s o u r c e s / e x a m p l e . xml ) ; $ r o o t = $dom>documentElement ; echo Document i s a . $ r o o t >tagName . \n ;

$ b o o k s = $ r o o t >getElementsByTagName ( book ) ;

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

33 / 102

Reading XML II
Reading II
f o r e a c h ( $ b o o k s a s $book ) { echo Book w i t h ID . $book>g e t A t t r i b u t e ( i d ) ; f o r e a c h ( $book>getElementsByTagName ( p r i c e ) a s $price ) { echo c o s t s . $ p r i c e >n o d e V a l u e . . $ p r i c e >g e t A t t r i b u t e ( c u r r e n c y ) . \n ; } }

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

34 / 102

Reading XML III

Output
Document i s a b o o k s h e l f Book w i t h ID 1 c o s t s 3 5 . 9 5 Euro Book w i t h ID 2 c o s t s 3 9 . 9 5 Euro

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

35 / 102

Create XML I

Creating
$dom = new DOMDocument ( 1 . 0 , UTF8 ) ; $dom>f o r m a t O u t p u t = t r u e ; $ r o o t = $dom>a p p e n d C h i l d ( $dom>c r e a t e E l e m e n t ( d v d s h e l f ) ); $dvd = $ r o o t >a p p e n d C h i l d ( $dom>c r e a t e E l e m e n t ( dvd ) );

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

36 / 102

Create XML II
Creating II
$dvd>s e t A t t r i b u t e ( i d , 1 ) ; $dvd>a p p e n d C h i l d ( $dom>c r e a t e E l e m e n t ( t i t l e , S t a r Trek ) ); $dom>s a v e ( s o u r c e s / d o m c r e a t e . xml ) ;

Output
<? xml v e r s i o n= 1 . 0 e n c o d i n g=UTF8 ?> < d v d s h e l f> <dvd i d= 1 > < t i t l e >S t a r Trek</ t i t l e > </ dvd> </ d v d s h e l f>

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

37 / 102

Manipulate XML I

Source XML
<? xml v e r s i o n= 1 . 0 e n c o d i n g=UTF8 ?> <b o o k s h e l f> <book i d= 1 >< t i t l e l a n g= en > B e a u t i f u l c o d e</ t i t l e> <a u t h o r>A . Oram</ a u t h o r> <a u t h o r>G . W i l s o n</ a u t h o r> <y e a r>2007</ y e a r> <p r i c e c u r r e n c y= Euro >3 5 . 9 5</ p r i c e> </ book> </ b o o k s h e l f>

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

38 / 102

Manipulate XML II

Manipulating
$dom = new DOMDocument ( ) ; $dom>f o r m a t O u t p u t = t r u e ; $dom>l o a d ( s o u r c e s / e x a m p l e w e i r d f o r m a t . xml , LIBXML NOBLANKS );

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

39 / 102

Manipulate XML III


Manipulating II
$ r o o t = $dom>documentElement ; $newBook = $ r o o t >a p p e n d C h i l d ( $dom>c r e a t e E l e m e n t ( book ) ); $newBook>s e t A t t r i b u t e ( i d , 2 ) ; $newBook>a p p e n d C h i l d ( $dom>c r e a t e E l e m e n t ( t i t l e , eZ Components Das E n t w i c k l e r h a n d b u c h ) ); $dom>s a v e ( s o u r c e s / d o m m a n i p u l a t e . xml ) ;

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

40 / 102

Manipulate XML IV
Output
<? xml v e r s i o n= 1 . 0 e n c o d i n g=UTF8 ?> <b o o k s h e l f> <book i d= 1 > < t i t l e l a n g= en > B e a u t i f u l c o d e</ t i t l e > <a u t h o r>A . Oram</ a u t h o r> <a u t h o r>G . W i l s o n</ a u t h o r> <y e a r>2007</ y e a r> < p r i c e c u r r e n c y= Euro >3 5 . 9 5</ p r i c e> </ book> <book i d= 2 > < t i t l e >eZ Components Das E n t w i c k l e r h a n d b u c h</ t i t l e > </ book> </ b o o k s h e l f>

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

41 / 102

Outline - XML in PHP

1 XML 2 XML in PHP

DOM
Introductional example Essential classes Practical DOM

XMLReader/-Writer (by example) SimpleXml (by example)


3 XPath

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

42 / 102

Reading XML I

Reading
$ r e a d e r = new XMLReader ( ) ; $ r e a d e r >open ( s o u r c e s / e x a m p l e . xml ) ; w h i l e ( $ r e a d e r >r e a d ( ) ) { i f ( $ r e a d e r >nodeType !== XMLReader : : ELEMENT ) { continue ; }

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

43 / 102

Reading XML II
Reading
i f ( $ r e a d e r >l o c a l N a m e === book ) { echo Book w i t h ID . $ r e a d e r >g e t A t t r i b u t e ( i d ) ; } i f ( $ r e a d e r >l o c a l N a m e === p r i c e ) { echo c o s t s . $ r e a d e r >r e a d S t r i n g ( ) . . $ r e a d e r >g e t A t t r i b u t e ( c u r r e n c y ) . \n ; } }

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

44 / 102

Reading XML III

Output
Book w i t h ID 1 c o s t s 3 5 . 9 5 Euro Book w i t h ID 2 c o s t s 3 9 . 9 5 Euro

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

45 / 102

Create XML I

Creating
$ w r i t e r = new XMLWriter ( ) ; $ w r i t e r >o p e n U r i ( s o u r c e s / x m l w r i t e r c r e a t e . xml ) ; $ w r i t e r >s e t I n d e n t S t r i n g ( $ w r i t e r >s e t I n d e n t ( t r u e ) ; );

$ w r i t e r >s t a r t D o c u m e n t ( 1 . 0 , UTF8 ) ; $ w r i t e r >s t a r t E l e m e n t ( d v d s h e l f ) ;

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

46 / 102

Create XML II

Creating II
$ w r i t e r >s t a r t E l e m e n t ( dvd ) ; $ w r i t e r >w r i t e A t t r i b u t e ( i d , 1 ) ; $ w r i t e r >w r i t e E l e m e n t ( t i t l e , S t a r Trek ) ; $ w r i t e r >e n d E l e m e n t ( ) ; // <dvd> $ w r i t e r >f l u s h ( ) ; $ w r i t e r >e n d E l e m e n t ( ) ; // <d v d s h e l f > $ w r i t e r >endDocument ( ) ;

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

47 / 102

Create XML III

Output
<? xml v e r s i o n= 1 . 0 e n c o d i n g=UTF8 ?> <b o o k s h e l f> <book i d= 1 > < t i t l e > B e a u t i f u l c o d e</ t i t l e > </ book> </ b o o k s h e l f>

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

48 / 102

Outline - XML in PHP

1 XML 2 XML in PHP

DOM
Introductional example Essential classes Practical DOM

XMLReader/-Writer (by example) SimpleXml (by example)


3 XPath

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

49 / 102

Reading XML I

Reading
$xml = s i m p l e x m l l o a d f i l e ( s o u r c e s / e x a m p l e . xml ) ; foreach ( { echo . . . . . } $xml>book a s $book ) Book w i t h ID $book [ i d ] costs $book>p r i c e . $book>p r i c e [ c u r r e n c y ] \n ;

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

50 / 102

Reading XML II

Output
Book w i t h ID 1 c o s t s 3 5 . 9 5 Euro Book w i t h ID 2 c o s t s 3 9 . 9 5 Euro

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

51 / 102

Create XML I
Creating
$xml = new S i m p l e X m l E l e m e n t ( <d v d s h e l f ></d v d s h e l f > ) ; $xml>a d d C h i l d ( dvd ) ; $xml>dvd [0]> a d d A t t r i b u t e ( i d , 1 ) ; $xml>dvd [0]> a d d C h i l d ( t i t l e , S t a r Trek ) ; $xml>asXML ( s o u r c e s / s i m p l e x m l c r e a t e . xml ) ;

Output
<? xml v e r s i o n= 1 . 0 ?> < d v d s h e l f><dvd i d= 1 >< t i t l e >S t a r Trek</ t i t l e ></ dvd></ d v d s h e l f>

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

52 / 102

Outline

1 XML 2 XML in PHP 3 XPath

Overview Basics In depth


Axis Predicates

XPath in action
Namespaces in RDF pQuery

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

53 / 102

Outline - XPath

1 XML 2 XML in PHP 3 XPath

Overview Basics In depth


Axis Predicates

XPath in action
Namespaces in RDF pQuery

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

54 / 102

Overview

XPath Enables you to select information parts from XML documents. Traverse the XML tree Select XML nodes W3C recommendation Version 1: November 1999 Version 2: January 2007 Fields of application
XSLT (XML Stylesheet Language Transformations) Fetching XML nodes within programming languages

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

55 / 102

Overview

XPath Enables you to select information parts from XML documents. Traverse the XML tree Select XML nodes W3C recommendation Version 1: November 1999 Version 2: January 2007 Fields of application
XSLT (XML Stylesheet Language Transformations) Fetching XML nodes within programming languages

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

55 / 102

XML example reminder


<b o o k s h e l f> <book id=1> < t i t l e l a n g= en > B e a u t i f u l c o d e</ t i t l e > <author>A. Oram</author> <author>G. Wilson</author> <y e a r>2007</ y e a r> < p r i c e c u r r e n c y= Euro >3 5 . 9 5</ p r i c e> </book> <book id=2> < t i t l e l a n g= de >eZ Components Das E n t w i c k l e r h a n d b u c h</ t i t l e> <author>T. Schlitt</author> <author>K. Nordmann</author> <y e a r>2007</ y e a r> < p r i c e c u r r e n c y= Euro >3 9 . 9 5</ p r i c e> </book> </ b o o k s h e l f>

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

56 / 102

Introductory example
<b o o k s h e l f> <book id=1> < t i t l e l a n g= en > B e a u t i f u l c o d e</ t i t l e > <author>A. Oram</author> <author>G. Wilson</author> <y e a r>2007</ y e a r> < p r i c e c u r r e n c y= Euro >3 5 . 9 5</ p r i c e> </book> <book id=2> < t i t l e l a n g= de >eZ Components Das E n t w i c k l e r h a n d b u c h</ t i t l e> <author>T. Schlitt</author> <author>K. Nordmann</author> <y e a r>2007</ y e a r> < p r i c e c u r r e n c y= Euro >3 9 . 9 5</ p r i c e> </book> </ b o o k s h e l f>

2 variants to fetch all books


/ b o o k s h e l f / book // book

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

56 / 102

Introductory example
<b o o k s h e l f> <book id=1> < t i t l e l a n g= en > B e a u t i f u l c o d e</ t i t l e > <author>A. Oram</author> <author>G. Wilson</author> <y e a r>2007</ y e a r> < p r i c e c u r r e n c y= Euro >3 5 . 9 5</ p r i c e> </book> <book id=2> < t i t l e l a n g= de >eZ Components Das E n t w i c k l e r h a n d b u c h</ t i t l e> <author>T. Schlitt</author> <author>K. Nordmann</author> <y e a r>2007</ y e a r> < p r i c e c u r r e n c y= Euro >3 9 . 9 5</ p r i c e> </book> </ b o o k s h e l f>

2 variants to fetch all authors


/ b o o k s h e l f / book / a u t h o r // a u t h o r

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

56 / 102

Outline - XPath

1 XML 2 XML in PHP 3 XPath

Overview Basics In depth


Axis Predicates

XPath in action
Namespaces in RDF pQuery

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

57 / 102

Addressing
Every XPath expression matches a set of nodes (0..n) It encodes an address for the selected nodes Simple XPath expressions look similar to Unix le system addresses Two generally dierent ways of addressing are supported Absolute addressing
/ b o o k s h e l f / book / t i t l e / b o o k s h e l f / book / a u t h o r

Relative addressing
book / a u t h o r ../ title

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

58 / 102

Contexts
Every expression step creates a new context The next is evaluated in the context created by the previous one Contexts
/ b o o k s h e l f / book / t i t l e

/ resets the context to global bookshelf selects all <bookshelf> elements in the global context / creates a new context, all children of <bookshelf> book selects all <book> elements in this context / creates a new context, all children of selected <book>s title selects all < title > elements in this context A set of all title element nodes.
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 59 / 102

Attributes

Select attributes Prepend the attribute name with an @. Select all currency attributes
/ b o o k s h e l f / book / p r i c e / @ c u r r e n c y

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

60 / 102

Steps

Navigation Navigation is not only possible in parent child direction. Navigate to parent
../ / b o o k s h e l f / book / t i t l e / . . / a u t h o r

Navigate to descendants
// t i t l e / b o o k s h e l f / book // @ c u r r e n c y

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

61 / 102

Indexing

Access nodes by position It is possible to access a specic node in a set by its position. Indexing
/ b o o k s h e l f / book [ 2 ] / b o o k s h e l f / book / a u t h o r [ 1 ]

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

62 / 102

Indexing

Access nodes by position It is possible to access a specic node in a set by its position. Indexing
/ b o o k s h e l f / book [ 2 ] / b o o k s h e l f / book / a u t h o r [ 1 ]

Start index Indexing generally 1 based Some Internet Explorer versions start with 0

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

63 / 102

Wildcards

Wildcard search A wildcard represents a node of a certain type with arbitrary name. Wildcards
/ b o o k s h e l f // t i t l e / b o o k s h e l f / book /@

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

64 / 102

Union

Union Union the node sets selected by multiple XPath expressions. Union
/ b o o k s h e l f / book / t i t l e | / b o o k s h e l f / book / a u t h o r

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

65 / 102

A rst PHP example


Querying rst book title
$dom = new DOMDocument ( ) ; $dom>l o a d ( s o u r c e s / e x a m p l e . xml ) ; $ x p a t h = new DOMXPath( $dom ) ; $ t i t l e s = $xpath >q u e r y ( / b o o k s h e l f / book [ 1 ] / t i t l e ) ; echo T i t l e o f f i r s t book i s . $ t i t l e s >i t e m ( 0 )>n o d e V a l u e . \n ;

Output
T i t l e o f f i r s t book i s B e a u t i f u l c o d e

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

66 / 102

Second PHP example


Querying all currencies
// . . . $ c u r r e n c i e s = $xpath >q u e r y ( // p r i c e / @ c u r r e n c y ) ; echo F o l l o w i n g c u r r e n c i e s o c c u r : \ n ; foreach ( $ c u r r e n c i e s as $currency ) { echo $ c u r r e n c y >n o d e V a l u e . \n ; }

Output
Following c u r r e n c i e s occur : Euro Euro
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 67 / 102

Outline - XPath

1 XML 2 XML in PHP 3 XPath

Overview Basics In depth


Axis Predicates

XPath in action
Namespaces in RDF pQuery

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

68 / 102

XPath syntax
4 axis: An XPath query consists of steps Each step consists of:
1 2 3

an axis a node test a predicate

child (default) attribute (@) descendant-or-self (//) parent (..)

Node tests:
name of the node (default) wildcard (*)

Predicate:
accessing a node by index

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

69 / 102

XPath syntax
4 axis: An XPath query consists of steps Each step consists of:
1 2 3

an axis a node test a predicate

child (default) attribute (@) descendant-or-self (//) parent (..)

Node tests:
name of the node (default) wildcard (*)

You already saw examples

Predicate:
accessing a node by index

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

69 / 102

XPath syntax
4 axis: An XPath query consists of steps Each step consists of:
1 2 3

an axis a node test a predicate

child (default) attribute (@) descendant-or-self (//) parent (..)

Node tests:
name of the node (default) wildcard (*)

You already saw examples

Predicate:
accessing a node by index

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

69 / 102

XPath syntax
4 axis: An XPath query consists of steps Each step consists of:
1 2 3

an axis a node test a predicate

child (default) attribute (@) descendant-or-self (//) parent (..)

Node tests:
name of the node (default) wildcard (*)

You already saw examples

Predicate:
accessing a node by index

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

69 / 102

XPath syntax
4 axis: An XPath query consists of steps Each step consists of:
1 2 3

an axis a node test a predicate

child (default) attribute (@) descendant-or-self (//) parent (..)

Node tests:
name of the node (default) wildcard (*)

You already saw examples

Predicate:
accessing a node by index

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

69 / 102

XPath syntax
4 axis: An XPath query consists of steps Each step consists of:
1 2 3

an axis a node test a predicate

child (default) attribute (@) descendant-or-self (//) parent (..)

Node tests:
name of the node (default) wildcard (*)

You already saw examples

Predicate:
accessing a node by index

Step syntax
<a x i s >::< n o d e t e s t >[< p r e d i c a t e >]

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

69 / 102

Outline - In depth

Axis

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

70 / 102

Axis
13 dimensions Imagine an XML document to be a 13 dimensional space... Axis ancestor ancestor-or-self attribute child descendant descendant-or-self following following-sibling namespace parent preceding preceding-sibling self
Tobias Schlitt (IPC SE 2009)

Shortcut

@ //

..

.
2009-05-29 71 / 102

XML and XPath with PHP

Axis

Axis syntax
<a x i s n a m e >::< n o d e t e s t >

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

72 / 102

Axis

Axis syntax
<a x i s n a m e >::< n o d e t e s t >

Child axis
/ b o o k s h e l f / book c h i l d : : b o o k s h e l f / c h i l d : : book

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

72 / 102

Axis

Axis syntax
<a x i s n a m e >::< n o d e t e s t >

Descendant-or-self axis
// book d e s c e n d a n t or s e l f : : book

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

72 / 102

Axis

Axis syntax
<a x i s n a m e >::< n o d e t e s t >

Attribute axis
// book / @id d e s c e n d a n t or s e l f : : book / a t t r i b u t e : : i d

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

72 / 102

Axis

Axis syntax
<a x i s n a m e >::< n o d e t e s t >

Parent axis
// book / @id / . . d e s c e n d a n t or s e l f : : book / a t t r i b u t e : : i d / p a r e n t : : node ( )

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

72 / 102

Ancestor axis
XPath with ancestor axis
// t i t l e / a n c e s t o r : :

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

73 / 102

Ancestor axis
XPath with ancestor axis
// t i t l e / a n c e s t o r : :

Selected XML nodes


<? xml v e r s i o n= 1 . 0 e n c o d i n g=UTF8 ?> <bookshelf> <book id=1> < t i t l e l a n g= en > B e a u t i f u l c o d e</ t i t l e > <a u t h o r>A . Oram</ a u t h o r> <a u t h o r>G . W i l s o n</ a u t h o r> <y e a r>2007</ y e a r> < p r i c e c u r r e n c y= Euro >3 5 . 9 5</ p r i c e> </book> <! . . . > </bookshelf>

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

73 / 102

Following axis
XPath with ancestor axis
// book / f o l l o w i n g : :

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

74 / 102

Following axis
XPath with ancestor axis
// book / f o l l o w i n g : :

Selected XML nodes


<? xml v e r s i o n= 1 . 0 e n c o d i n g=UTF8 ?> <b o o k s h e l f> <book i d= 1 > < t i t l e l a n g= en > B e a u t i f u l c o d e</ t i t l e > <! . . . > </ book> <book id=2> <title lang=de>eZ Components - Das Entwicklerhandbuch</title> <! ... > </book> </ b o o k s h e l f>
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 74 / 102

Following-sibling axis
XPath with ancestor axis
// book / f o l l o w i n g s i b l i n g : :

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

75 / 102

Following-sibling axis
XPath with ancestor axis
// book / f o l l o w i n g s i b l i n g : :

Selected XML nodes


<? xml v e r s i o n= 1 . 0 e n c o d i n g=UTF8 ?> <b o o k s h e l f> <book i d= 1 > < t i t l e l a n g= en > B e a u t i f u l c o d e</ t i t l e > <! . . . > </ book> <book id=2> < t i t l e l a n g= de >eZ Components Das E n t w i c k l e r h a n d b u c h </ t i t l e > <! . . . > </book> </ b o o k s h e l f>
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 75 / 102

Namespace axis
XPath with namespace axis
namespace : :

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

76 / 102

Namespace axis
XPath with namespace axis
namespace : :

XML with namespaces


<b o o k s h e l f xmlns=http://example.com/book xmlns:book=http://example.com/book xmlns:dc=http://purl.org/dc/elements/1.1/ > <book i d= 1 > < d c : t i t l e b o o k : l a n g= en > B e a u t i f u l c o d e</ d c : t i t l e> <a u t h o r>A . Oram</ a u t h o r> <a u t h o r>G . W i l s o n</ a u t h o r> <y e a r>2007</ y e a r> < p r i c e c u r r e n c y= Euro >3 5 . 9 5</ p r i c e> </ book> </ b o o k s h e l f>
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 76 / 102

Outline - In depth

Predicates

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

77 / 102

Predicate syntax

You already saw indexing with numeric predicates Predicates can also be booleans Functions and operators allow ne grained tests Predicate syntax
<n o d e t e s t >[< p r e d i c a t e >]

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

78 / 102

Predicate examples
Select all books with ID 1
// book [ @id = 1 ]

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

79 / 102

Predicate examples
Select all books that have any attribute at all
// book [ @ ] // book /@ / . .

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

79 / 102

Predicate examples
Select all books with price round 40
// book [ r o u n d ( p r i c e ) = 4 0 ]

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

79 / 102

Predicate examples
Select all authors with rst name initial T
// book / a u t h o r [ s u b s t r i n g ( . , 1 , 1 ) = T ]

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

79 / 102

Operator overview
Mathematical operators +, , div mod Addition, subtraction, multiplication Division Modulo operation

Comparison operators = != <, <= >, >= Check for equality Check for inequality Less than and less than or equal Greater than and greater than or equal

Logical operators or and Logical or Logical and

Logical negation not() is a function in XPath!


Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 80 / 102

Operator overview
Mathematical operators +, , div mod Addition, subtraction, multiplication Division Modulo operation

Comparison operators = != <, <= >, >= Check for equality Check for inequality Less than and less than or equal Greater than and greater than or equal

Logical operators or and Logical or Logical and

Logical negation not() is a function in XPath!


Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 80 / 102

Operator overview
Mathematical operators +, , div mod Addition, subtraction, multiplication Division Modulo operation

Comparison operators = != <, <= >, >= Check for equality Check for inequality Less than and less than or equal Greater than and greater than or equal

Logical operators or and Logical or Logical and

Logical negation not() is a function in XPath!


Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 80 / 102

Operator overview
Mathematical operators +, , div mod Addition, subtraction, multiplication Division Modulo operation

Comparison operators = != <, <= >, >= Check for equality Check for inequality Less than and less than or equal Greater than and greater than or equal

Logical operators or and Logical or Logical and

Logical negation not() is a function in XPath!


Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 80 / 102

Functions by example
String functions string-join() Concatenates 2 strings substring() Extracts a part from a string Node set functions count() Returns number of nodes in a set position() Returns the position index of each node Boolean functions not() Negates the received boolean expression true() Boolean true Mathematical functions round() Rounds the given number to the next integer oor() Returns the next integer smaller than the given number Function overview An overview on all functions can be found on http://www.w3.org/TR/xpath-functions/
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 81 / 102

Functions by example
String functions string-join() Concatenates 2 strings substring() Extracts a part from a string Node set functions count() Returns number of nodes in a set position() Returns the position index of each node Boolean functions not() Negates the received boolean expression true() Boolean true Mathematical functions round() Rounds the given number to the next integer oor() Returns the next integer smaller than the given number Function overview An overview on all functions can be found on http://www.w3.org/TR/xpath-functions/
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 81 / 102

Functions by example
String functions string-join() Concatenates 2 strings substring() Extracts a part from a string Node set functions count() Returns number of nodes in a set position() Returns the position index of each node Boolean functions not() Negates the received boolean expression true() Boolean true Mathematical functions round() Rounds the given number to the next integer oor() Returns the next integer smaller than the given number Function overview An overview on all functions can be found on http://www.w3.org/TR/xpath-functions/
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 81 / 102

Functions by example
String functions string-join() Concatenates 2 strings substring() Extracts a part from a string Node set functions count() Returns number of nodes in a set position() Returns the position index of each node Boolean functions not() Negates the received boolean expression true() Boolean true Mathematical functions round() Rounds the given number to the next integer oor() Returns the next integer smaller than the given number Function overview An overview on all functions can be found on http://www.w3.org/TR/xpath-functions/
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 81 / 102

Functions by example
String functions string-join() Concatenates 2 strings substring() Extracts a part from a string Node set functions count() Returns number of nodes in a set position() Returns the position index of each node Boolean functions not() Negates the received boolean expression true() Boolean true Mathematical functions round() Rounds the given number to the next integer oor() Returns the next integer smaller than the given number Function overview An overview on all functions can be found on http://www.w3.org/TR/xpath-functions/
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 81 / 102

Outline - XPath

1 XML 2 XML in PHP 3 XPath

Overview Basics In depth


Axis Predicates

XPath in action
Namespaces in RDF pQuery

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

82 / 102

Outline - XPath in action

Namespaces in RDF

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

83 / 102

Find all namespaces in an RDF document


RDF Resource Description Framework Semantic web Makes heavy usage of dierent namespaces Complex variant
/ / [ name ( . ) = r d f : D e s c r i p t i o n ] / [ namespaceu r i ( . ) != namespaceu r i ( . . ) and namespaceu r i ( . ) != and namespaceu r i ( . ) != namespaceu r i ( p r e c e d i n g sibling ::) ]

Simpler variant
/ / [ name ( . ) = r d f : D e s c r i p t i o n ] / n a m e s p a c e s : :

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

84 / 102

Find all namespaces in an RDF document


RDF Resource Description Framework Semantic web Makes heavy usage of dierent namespaces Complex variant
/ / [ name ( . ) = r d f : D e s c r i p t i o n ] / [ namespaceu r i ( . ) != namespaceu r i ( . . ) and namespaceu r i ( . ) != and namespaceu r i ( . ) != namespaceu r i ( p r e c e d i n g sibling ::) ]

Simpler variant
/ / [ name ( . ) = r d f : D e s c r i p t i o n ] / n a m e s p a c e s : :

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

84 / 102

Outline - XPath in action

pQuery

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

85 / 102

jQuery

Cool JavaScript framework Allows selecting HTML elements by CSS selectors http://jquery.com/

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

86 / 102

pQuery

Little example class Models a tiny bits of jQuery, using


DOM XPath

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

87 / 102

Example HTML
<html> <head> < t i t l e>Some w e b s i t e</ t i t l e> </ head> <body> <h1>Some h e a d l i n e</h1> <p> Some n i c e <a h r e f= t e s t . ht m l c l a s s= i n t e r n a l >c o n t e n t</a >. </p> <h1 i d= s e c o n d >Second h e a d l i n e</h1> <p> More n i c e <a h r e f= t e s t . ht m l >c o n t e n t</a> . </p> </ body> </ html>

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

88 / 102

Usage
r e q u i r e p q u e r y / p q u e r y . php ; $dom = new DOMDocument ( ) ; $dom>f o r m a t O u t p u t = t r u e ; $dom>loadHTMLFile ( . . / e x a m p l e . h tm l ) ; $q = new pQuery ( $dom ) ; $q>q u e r y ( h1#s e c o n d ) ; $q>a d d C l a s s ( s o m e c l a s s ) ; $dom>saveHTMLFile ( . . / p q u e r y s i m p l e . ht m l ) ;

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

89 / 102

The pQuery class


c l a s s pQuery { p r o t e c t e d $dom ; protected $context ; p r o t e c t e d $xpath ; public function c o n s t r u c t ( DOMDocument $dom ) { $ t h i s >dom = $dom ; $ t h i s >c o n t e x t = a r r a y ( $dom>documentElement ) ; $ t h i s >x p a t h = new DOMXPath( $dom ) ; } // . . .

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

90 / 102

Issueing a query
// . . . p u b l i c f u nc t i o n query ( $ s t r i n g ) { $ x p a t h = $ t h i s >c r e a t e X P a t h ( $ s t r i n g ) ; echo Q u e r y i n g w i t h XPath $ x p a t h . \ n ; $ n e w C o n t ex t = a r r a y ( ) ; f o r e a c h ( $ t h i s >c o n t e x t a s $node ) { $ n o d e L i s t = $ t h i s >xpath >q u e r y ( $xpath , $node ) ; f o r e a c h ( $ n o d e L i s t a s $newNode ) { $n e w Co nt ext [ ] = $newNode ; } } $ t h i s >c o n t e x t = $ ne w C on tex t ; } // . . .
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 91 / 102

Creating XPath
// . . . protected f u n c t i o n createXPath ( $ s t r i n g ) { $ p a r t s = explode ( , $ s t r i n g ) ; $xpath = array () ; foreach ( $par ts as $part ) { $ x p a t h [ ] = $ t h i s >c r e a t e S i n g l e X P a t h ( $ p a r t ) ; } r e t u r n implode ( | } // . . . , $xpath ) ;

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

92 / 102

Creating XPath part I


// . . . protected function { $parts = p r e g ((#|\.) ) $string , 1, PREG SPLIT ); $ x p a t h = // ; // . . . createSingleXPath ( $string ) split ( ,

DELIM CAPTURE

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

93 / 102

Creating XPath part II


// . . . i f ( count ( $ p a r t s ) < 3 ) { $ x p a t h .= $ s t r i n g ; } else { $ x p a t h .= $ t h i s >c r e a t e S e l e c t o r ( $ p a r t s ) ; } r e t u r n $xpath ; } // . . .

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

94 / 102

Creating selector
// . . . protected function c r e a t e S e l e c t o r ( array $parts ) { switch ( $parts [ 1 ] ) { case . : return $parts [0] . [ contains ( @class , . $parts [ 2 ] . ] ; break ; c a s e # : return $parts [0] . [ @id = . $ p a r t s [ 2 ] . break ; } } }

] ;

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

95 / 102

Adding a class I
// . . . public function addClass ( $ c la s s ) { f o r e a c h ( $ t h i s >c o n t e x t a s $node ) { i f ( $node>nodeType !== XML ELEMENT NODE ) { continue ; } i f ( ! $node>h a s A t t r i b u t e ( c l a s s ) ) { $ c l a s s e s = array () ; } else { $ c l a s s e s = explode ( , $node>g e t A t t r i b u t e ( c l a s s ) ); } // . . .
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 96 / 102

Adding a class II
// . . . i f ( ! in array ( $class , $classes ) ) { $classes [] = $class ; } $node>s e t A t t r i b u t e ( class , implode ( , $ c l a s s e s ) ); } } // . . .

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

97 / 102

Simple example
r e q u i r e p q u e r y / p q u e r y . php ; $dom = new DOMDocument ( ) ; $dom>f o r m a t O u t p u t = t r u e ; $dom>loadHTMLFile ( . . / e x a m p l e . h tm l ) ; $q = new pQuery ( $dom ) ; $q>q u e r y ( h1#s e c o n d ) ; $q>a d d C l a s s ( s o m e c l a s s ) ; $dom>saveHTMLFile ( . . / p q u e r y s i m p l e . ht m l ) ;

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

98 / 102

Simple example result


<!DOCTYPE html PUBLIC //W3C//DTD HTML 4 . 0 T r a n s i t i o n a l //EN h t t p : / /www . w3 . o r g /TR/RECh t m l 4 0 / l o o s e . d t d > <html> <head> <meta h t t p e q u i v= Content Type c o n t e n t= t e x t / ht ml ; c h a r s e t= UTF8> < t i t l e>Some w e b s i t e</ t i t l e> </ head> <body> <h1>Some h e a d l i n e</h1> <p> Some n i c e <a h r e f= t e s t . ht m l c l a s s= i n t e r n a l >c o n t e n t</a >. </p> <h1 i d= s e c o n d c l a s s= s o m e c l a s s >Second h e a d l i n e</h1> <p> More n i c e <a h r e f= t e s t . ht m l >c o n t e n t</a> . </p> </ body> </ html>
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 99 / 102

Advanced example
r e q u i r e p q u e r y / p q u e r y . php ; $dom = new DOMDocument ( ) ; $dom>f o r m a t O u t p u t = t r u e ; $dom>loadHTMLFile ( . . / e x a m p l e . h tm l ) ; $q = new pQuery ( $dom ) ; $q>q u e r y ( h1 a . i n t e r n a l ) ; $q>a d d C l a s s ( c o o l c l a s s ) ; $dom>saveHTMLFile ( . . / p q u e r y a d v a n c e d . ht m l ) ;

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

100 / 102

Advanced example result


<!DOCTYPE html PUBLIC //W3C//DTD HTML 4 . 0 T r a n s i t i o n a l //EN h t t p : / /www . w3 . o r g /TR/RECh t m l 4 0 / l o o s e . d t d > <html> <head> <meta h t t p e q u i v= Content Type c o n t e n t= t e x t / ht ml ; c h a r s e t= UTF8> < t i t l e>Some w e b s i t e</ t i t l e> </ head> <body> <h1 c l a s s= c o o l c l a s s >Some h e a d l i n e</h1> <p> Some n i c e <a h r e f= t e s t . ht m l c l a s s= i n t e r n a l c o o l c l a s s > c o n t e n t</a> . </p> <h1 i d= s e c o n d c l a s s= c o o l c l a s s >Second h e a d l i n e</h1> <p> More n i c e <a h r e f= t e s t . ht m l >c o n t e n t</a> . </p> </ body> </ html>
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 101 / 102

The end
Thank you for listening! Are there any questions left? I hope you learned what you expected? Contact me: Tobias Schlitt <toby@php.net> Enjoy the conference!

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

102 / 102

The end
Thank you for listening! Are there any questions left? I hope you learned what you expected? Contact me: Tobias Schlitt <toby@php.net> Enjoy the conference!

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

102 / 102

The end
Thank you for listening! Are there any questions left? I hope you learned what you expected? Contact me: Tobias Schlitt <toby@php.net> Enjoy the conference!

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

102 / 102

The end
Thank you for listening! Are there any questions left? I hope you learned what you expected? Contact me: Tobias Schlitt <toby@php.net> Enjoy the conference!

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

102 / 102

The end
Thank you for listening! Are there any questions left? I hope you learned what you expected? Contact me: Tobias Schlitt <toby@php.net> Enjoy the conference!

Tobias Schlitt (IPC SE 2009)

XML and XPath with PHP

2009-05-29

102 / 102

You might also like