Professional Documents
Culture Documents
XML and Xpath With PHP: Ipc Se 2009
XML and Xpath With PHP: Ipc Se 2009
TobiasSchlitt <toby@php.net>
IPC SE 2009
2009-05-29
2009-05-29
1 / 102
About me
Tobias Schlitt <toby@php.net> PHP since 2001 Freelancing consultant Qualied IT Specialist Studying CS at TU Dortmund
(expect to nish this year)
OSS addicted
PHP eZ Components PHPUnit
2009-05-29
2 / 102
What is your name? Where are you from? What is your business? What are your experiences with XML / XPath? What do you expect from this workshop?
2009-05-29
3 / 102
Overview
2009-05-29
4 / 102
Outline
1 XML
2009-05-29
5 / 102
Outline - XML
1 XML
2009-05-29
6 / 102
What is XML?
General specication for creating markup languages Data exchange between computer systems
System independent Human readable Most used on the web Successor of SGML
W3C recommendation 1.0 1998 (last update 2008-11-26) 1.1 2004 (last update 2006-08-16)
2009-05-29
7 / 102
XHTML RSS
XML variant of the popular HTML Really Simply Syndication Provide news / updates / ... of websites Read by special clients Aggregation on portals / planets Scalable Vector Graphics Describe vector graphics in XML Potentially interactive / animated (via ECMAScript)
SVG
2009-05-29
8 / 102
RelaxNG
OASIS / ISO standard Popular DTD Plain text W3C recommendation Deprecated
2009-05-29
9 / 102
Query A query extracts a sub-set of information from a data source. XPath W3C recommendation Navigation in XML documents more on that later... Functional programming language Allows complex queries
XQuery
2009-05-29
10 / 102
Outline - XML
1 XML
2009-05-29
11 / 102
An XML document
<?xml version=1.0 encoding=UTF-8?> <bookshelf> <book id=1> <title lang=en>Beautiful code</title> <author>A. Oram</author> <author>G. Wilson</author> <year>2007</year> <price currency=Euro>35.95</price> </book> <book id=2> <title lang=de>eZ Components - Das Entwicklerhandbuch</title> <author>T. Schlitt</author> <author>K. Nordmann</author> <year>2007</year> <price currency=Euro>39.95</price> </book> </bookshelf>
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 12 / 102
Preamble
<?xml version=1.0 encoding=UTF-8?> <bookshelf> <book id=1> <title lang=en>Beautiful code</title> <author>A. Oram</author> <author>G. Wilson</author> <year>2007</year> <price currency=Euro>35.95</price> </book> <book id=2> <title lang=de>eZ Components - Das Entwicklerhandbuch</title> <author>T. Schlitt</author> <author>K. Nordmann</author> <year>2007</year> <price currency=Euro>39.95</price> </book> </bookshelf>
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 12 / 102
Element nodes
<?xml version=1.0 encoding=UTF-8?> <bookshelf> <book id=1> <title lang=en>Beautiful code</title> <author>A. Oram</author> <author>G. Wilson</author> <year>2007</year> <price currency=Euro>35.95</price> </book> <book id=2> <title lang=de>eZ Components - Das Entwicklerhandbuch</title> <author>T. Schlitt</author> <author>K. Nordmann</author> <year>2007</year> <price currency=Euro>39.95</price> </book> </bookshelf>
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 12 / 102
Attribute nodes
<?xml version=1.0 encoding=UTF-8?> <bookshelf> <book id=1> <title lang=en>Beautiful code</title> <author>A. Oram</author> <author>G. Wilson</author> <year>2007</year> <price currency=Euro>35.95</price> </book> <book id=2> <title lang=de>eZ Components - Das Entwicklerhandbuch</title> <author>T. Schlitt</author> <author>K. Nordmann</author> <year>2007</year> <price currency=Euro>39.95</price> </book> </bookshelf>
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 12 / 102
Document element
<?xml version=1.0 encoding=UTF-8?> <bookshelf> <book id=1> <title lang=en>Beautiful code</title> <author>A. Oram</author> <author>G. Wilson</author> <year>2007</year> <price currency=Euro>35.95</price> </book> <book id=2> <title lang=de>eZ Components - Das Entwicklerhandbuch</title> <author>T. Schlitt</author> <author>K. Nordmann</author> <year>2007</year> <price currency=Euro>39.95</price> </book> </bookshelf>
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 12 / 102
Outline - XML
1 XML
2009-05-29
13 / 102
id= 1
<title> <author> <author> <year> A. Oram G. Wilson Beautiful lang= code en 2007
<price>
39.95
currency= Euro
2009-05-29
14 / 102
Children I
<bookshelf> <book> <book>
id= 1
<title> <author> <author> <year> A. Oram G. Wilson Beautiful lang= code en 2007
<price>
39.95
currency= Euro
2009-05-29
14 / 102
Children II
<bookshelf> <book> <book>
id= 1
<title> <author> <author> <year> A. Oram G. Wilson Beautiful lang= code en 2007
<price>
39.95
currency= Euro
2009-05-29
14 / 102
Parent
<bookshelf> <book> <book>
id= 1
<title> <author> <author> <year> A. Oram G. Wilson Beautiful lang= code en 2007
<price>
39.95
currency= Euro
2009-05-29
14 / 102
Descendants
<bookshelf> <book> <book>
id= 1
<title> <author> <author> <year> A. Oram G. Wilson Beautiful lang= code en 2007
<price>
39.95
currency= Euro
2009-05-29
14 / 102
Ascendants
<bookshelf> <book> <book>
id= 1
<title> <author> <author> <year> A. Oram G. Wilson Beautiful lang= code en 2007
<price>
39.95
currency= Euro
2009-05-29
14 / 102
Siblings
<bookshelf> <book> <book>
id= 1
<title> <author> <author> <year> A. Oram G. Wilson Beautiful lang= code en 2007
<price>
39.95
currency= Euro
CDATA
CDATA Avoid the escaping hell in text content.
Without CDATA
<b o o k s h e l f> <book i d= 1 > < t i t l e l a n g= en > B e a u t i f u l c o d e</ t i t l e > <h i n t> Some e x a m p l e s make u s e o f <xml> . </ h i n t> </ book> </ b o o k s h e l f>
2009-05-29
15 / 102
CDATA
CDATA Avoid the escaping hell in text content.
With CDATA
<b o o k s h e l f> <book i d= 1 > < t i t l e l a n g= en > B e a u t i f u l c o d e</ t i t l e > <h i n t> <![CDATA[ Some e x a m p l e s make u s e o f <xml> . ]]> </ h i n t> </ book> </ b o o k s h e l f>
2009-05-29
15 / 102
CDATA
CDATA Avoid the escaping hell in text content.
2009-05-29
15 / 102
CDATA
CDATA Avoid the escaping hell in text content.
2009-05-29
15 / 102
CDATA
CDATA Avoid the escaping hell in text content.
Base 64 in PHP Use the built in functions base64 encode() and base64 decode().
2009-05-29
15 / 102
Comments
Comments in XML
<b o o k s h e l f> <book i d= 1 > < t i t l e l a n g= en > B e a u t i f u l c o d e</ t i t l e > <a u t h o r>A . Oram</ a u t h o r> <a u t h o r>G . W i l s o n</ a u t h o r> <y e a r>2007</ y e a r> < p r i c e c u r r e n c y= Euro >3 5 . 9 5</ p r i c e> </ book> <! . . . more b o o k s . . . > </ b o o k s h e l f>
2009-05-29
16 / 102
Namespaces
Namespaces Allow to avoid naming conicts between dierent XML sources.
2009-05-29
17 / 102
Namespaces
Namespaces Allow to avoid naming conicts between dierent XML sources.
Multiple namespaces
<b o o k s h e l f x m l n s= h t t p : // e x a m p l e . com/ book x m l n s : b o o k= h t t p : // e x a m p l e . com/ book x m l n s : d c= h t t p : // p u r l . o r g / dc / e l e m e n t s / 1 . 1 / > <book i d= 1 > < d c : t i t l e b o o k : l a n g= en > B e a u t i f u l c o d e</ d c : t i t l e> <a u t h o r>A . Oram</ a u t h o r> <a u t h o r>G . W i l s o n</ a u t h o r> <y e a r>2007</ y e a r> < p r i c e c u r r e n c y= Euro >3 5 . 9 5</ p r i c e> </ book> </ b o o k s h e l f>
2009-05-29
17 / 102
Outline
DOM
Introductional example Essential classes Practical DOM
2009-05-29
18 / 102
Overview
XML APIs PHP has quite some XML APIs. The most important are:
DOM XMLReader / XMLWriter SimpleXML
Deprecated are:
DOM XML XML Parser
2009-05-29
19 / 102
Overview - DOM
Represents XML nodes as objects Loads full XML tree into memory
2009-05-29
20 / 102
Overview - XMLReader/-Writer
Pull / push based Does not load XML fully into memory
2009-05-29
21 / 102
Overview - SimpleXml
Very simple access to XML data Unique (?) to PHP Represents XML structures as objects Initial implementation hackish Loads full XML tree into memory You dont want to use SimpleXML, seriously!
2009-05-29
22 / 102
APIs compared
DOM Read Write Manipulate Full control Namespaces XPath Validate XMLReader/-Writer SimpleXML -
Comfort
Fully supported Supported but not nice Poorly supported - Not supported at all
2009-05-29
23 / 102
DOM
Introductional example Essential classes Practical DOM
2009-05-29
24 / 102
Outline - DOM
Introductional example
2009-05-29
25 / 102
Getting started
Printing all authors
$dom = new DOMDocument ( ) ; $dom>l o a d ( s o u r c e s / e x a m p l e . xml ) ; $ a u t h o r s = $dom>getElementsByTagName ( a u t h o r ) ; foreach ( $authors as $author ) { echo A u t h o r : . $ a u t h o r >n o d e V a l u e . \n ; }
Output
Author : Author : Author : Author : A. G. T. K. Oram Wilson Schlitt Nordmann
XML and XPath with PHP 2009-05-29 26 / 102
Outline - DOM
Essential classes
2009-05-29
27 / 102
DOMNode
Purpose Base class for all nodes types (elements, attributes, ...).
DOMNode
Purpose Base class for all nodes types (elements, attributes, ...).
DOMNode
Purpose Base class for all nodes types (elements, attributes, ...).
DOMNode
Purpose Base class for all nodes types (elements, attributes, ...).
DOMElement
Purpose Representation of an element (extends node)
Properties
$tagName
DOMElement
Purpose Representation of an element (extends node)
Properties
$tagName
DOMElement
Purpose Representation of an element (extends node)
Properties
$tagName
DOMDocument
Purpose Representation of a XML document
Creation methods
createAttribute[NS]() createElement[NS]()
Properties
$documentElement $documentURI $preserveWhitespace $formatOutput $doctype $xmlVersion $xmlEncoding $recover (not DOM!)
Misc operations
registerNodeClass() validate() schemaValidate() relaxNGValidate()
2009-05-29
30 / 102
DOMDocument
Purpose Representation of a XML document
Creation methods
createAttribute[NS]() createElement[NS]()
Properties
$documentElement $documentURI $preserveWhitespace $formatOutput $doctype $xmlVersion $xmlEncoding $recover (not DOM!)
Misc operations
registerNodeClass() validate() schemaValidate() relaxNGValidate()
2009-05-29
30 / 102
DOMDocument
Purpose Representation of a XML document
Creation methods
createAttribute[NS]() createElement[NS]()
Properties
$documentElement $documentURI $preserveWhitespace $formatOutput $doctype $xmlVersion $xmlEncoding $recover (not DOM!)
Misc operations
registerNodeClass() validate() schemaValidate() relaxNGValidate()
2009-05-29
30 / 102
DOMDocument
Purpose Representation of a XML document
Creation methods
createAttribute[NS]() createElement[NS]()
Properties
$documentElement $documentURI $preserveWhitespace $formatOutput $doctype $xmlVersion $xmlEncoding $recover (not DOM!)
Misc operations
registerNodeClass() validate() schemaValidate() relaxNGValidate()
2009-05-29
30 / 102
DOMDocument
Purpose Representation of a XML document
Creation methods
createAttribute[NS]() createElement[NS]()
Properties
$documentElement $documentURI $preserveWhitespace $formatOutput $doctype $xmlVersion $xmlEncoding $recover (not DOM!)
Misc operations
registerNodeClass() validate() schemaValidate() relaxNGValidate()
2009-05-29
30 / 102
DOMNodeList
Purpose Collection of DOMNodes (elements, attributes, ...). Iteratable. Operations
item()
Properties
$length
Standard
f o r ( $ i = 0 ; $ i < $ a u t h o r s >l e n g t h ; ++$ i ) { echo A u t h o r : . $ a u t h o r s >i t e m ( $ i )>n o d e V a l u e . \n ; }
Foreach
foreach ( $authors as $author ) { echo A u t h o r : . $ a u t h o r >n o d e V a l u e . \n ; }
2009-05-29
31 / 102
DOMNodeList
Purpose Collection of DOMNodes (elements, attributes, ...). Iteratable. Operations
item()
Properties
$length
Standard
f o r ( $ i = 0 ; $ i < $ a u t h o r s >l e n g t h ; ++$ i ) { echo A u t h o r : . $ a u t h o r s >i t e m ( $ i )>n o d e V a l u e . \n ; }
Foreach
foreach ( $authors as $author ) { echo A u t h o r : . $ a u t h o r >n o d e V a l u e . \n ; }
2009-05-29
31 / 102
Outline - DOM
Practical DOM
2009-05-29
32 / 102
Reading XML I
Reading
$dom = new DOMDocument ( ) ; $dom>l o a d ( s o u r c e s / e x a m p l e . xml ) ; $ r o o t = $dom>documentElement ; echo Document i s a . $ r o o t >tagName . \n ;
$ b o o k s = $ r o o t >getElementsByTagName ( book ) ;
2009-05-29
33 / 102
Reading XML II
Reading II
f o r e a c h ( $ b o o k s a s $book ) { echo Book w i t h ID . $book>g e t A t t r i b u t e ( i d ) ; f o r e a c h ( $book>getElementsByTagName ( p r i c e ) a s $price ) { echo c o s t s . $ p r i c e >n o d e V a l u e . . $ p r i c e >g e t A t t r i b u t e ( c u r r e n c y ) . \n ; } }
2009-05-29
34 / 102
Output
Document i s a b o o k s h e l f Book w i t h ID 1 c o s t s 3 5 . 9 5 Euro Book w i t h ID 2 c o s t s 3 9 . 9 5 Euro
2009-05-29
35 / 102
Create XML I
Creating
$dom = new DOMDocument ( 1 . 0 , UTF8 ) ; $dom>f o r m a t O u t p u t = t r u e ; $ r o o t = $dom>a p p e n d C h i l d ( $dom>c r e a t e E l e m e n t ( d v d s h e l f ) ); $dvd = $ r o o t >a p p e n d C h i l d ( $dom>c r e a t e E l e m e n t ( dvd ) );
2009-05-29
36 / 102
Create XML II
Creating II
$dvd>s e t A t t r i b u t e ( i d , 1 ) ; $dvd>a p p e n d C h i l d ( $dom>c r e a t e E l e m e n t ( t i t l e , S t a r Trek ) ); $dom>s a v e ( s o u r c e s / d o m c r e a t e . xml ) ;
Output
<? xml v e r s i o n= 1 . 0 e n c o d i n g=UTF8 ?> < d v d s h e l f> <dvd i d= 1 > < t i t l e >S t a r Trek</ t i t l e > </ dvd> </ d v d s h e l f>
2009-05-29
37 / 102
Manipulate XML I
Source XML
<? xml v e r s i o n= 1 . 0 e n c o d i n g=UTF8 ?> <b o o k s h e l f> <book i d= 1 >< t i t l e l a n g= en > B e a u t i f u l c o d e</ t i t l e> <a u t h o r>A . Oram</ a u t h o r> <a u t h o r>G . W i l s o n</ a u t h o r> <y e a r>2007</ y e a r> <p r i c e c u r r e n c y= Euro >3 5 . 9 5</ p r i c e> </ book> </ b o o k s h e l f>
2009-05-29
38 / 102
Manipulate XML II
Manipulating
$dom = new DOMDocument ( ) ; $dom>f o r m a t O u t p u t = t r u e ; $dom>l o a d ( s o u r c e s / e x a m p l e w e i r d f o r m a t . xml , LIBXML NOBLANKS );
2009-05-29
39 / 102
2009-05-29
40 / 102
Manipulate XML IV
Output
<? xml v e r s i o n= 1 . 0 e n c o d i n g=UTF8 ?> <b o o k s h e l f> <book i d= 1 > < t i t l e l a n g= en > B e a u t i f u l c o d e</ t i t l e > <a u t h o r>A . Oram</ a u t h o r> <a u t h o r>G . W i l s o n</ a u t h o r> <y e a r>2007</ y e a r> < p r i c e c u r r e n c y= Euro >3 5 . 9 5</ p r i c e> </ book> <book i d= 2 > < t i t l e >eZ Components Das E n t w i c k l e r h a n d b u c h</ t i t l e > </ book> </ b o o k s h e l f>
2009-05-29
41 / 102
DOM
Introductional example Essential classes Practical DOM
2009-05-29
42 / 102
Reading XML I
Reading
$ r e a d e r = new XMLReader ( ) ; $ r e a d e r >open ( s o u r c e s / e x a m p l e . xml ) ; w h i l e ( $ r e a d e r >r e a d ( ) ) { i f ( $ r e a d e r >nodeType !== XMLReader : : ELEMENT ) { continue ; }
2009-05-29
43 / 102
Reading XML II
Reading
i f ( $ r e a d e r >l o c a l N a m e === book ) { echo Book w i t h ID . $ r e a d e r >g e t A t t r i b u t e ( i d ) ; } i f ( $ r e a d e r >l o c a l N a m e === p r i c e ) { echo c o s t s . $ r e a d e r >r e a d S t r i n g ( ) . . $ r e a d e r >g e t A t t r i b u t e ( c u r r e n c y ) . \n ; } }
2009-05-29
44 / 102
Output
Book w i t h ID 1 c o s t s 3 5 . 9 5 Euro Book w i t h ID 2 c o s t s 3 9 . 9 5 Euro
2009-05-29
45 / 102
Create XML I
Creating
$ w r i t e r = new XMLWriter ( ) ; $ w r i t e r >o p e n U r i ( s o u r c e s / x m l w r i t e r c r e a t e . xml ) ; $ w r i t e r >s e t I n d e n t S t r i n g ( $ w r i t e r >s e t I n d e n t ( t r u e ) ; );
2009-05-29
46 / 102
Create XML II
Creating II
$ w r i t e r >s t a r t E l e m e n t ( dvd ) ; $ w r i t e r >w r i t e A t t r i b u t e ( i d , 1 ) ; $ w r i t e r >w r i t e E l e m e n t ( t i t l e , S t a r Trek ) ; $ w r i t e r >e n d E l e m e n t ( ) ; // <dvd> $ w r i t e r >f l u s h ( ) ; $ w r i t e r >e n d E l e m e n t ( ) ; // <d v d s h e l f > $ w r i t e r >endDocument ( ) ;
2009-05-29
47 / 102
Output
<? xml v e r s i o n= 1 . 0 e n c o d i n g=UTF8 ?> <b o o k s h e l f> <book i d= 1 > < t i t l e > B e a u t i f u l c o d e</ t i t l e > </ book> </ b o o k s h e l f>
2009-05-29
48 / 102
DOM
Introductional example Essential classes Practical DOM
2009-05-29
49 / 102
Reading XML I
Reading
$xml = s i m p l e x m l l o a d f i l e ( s o u r c e s / e x a m p l e . xml ) ; foreach ( { echo . . . . . } $xml>book a s $book ) Book w i t h ID $book [ i d ] costs $book>p r i c e . $book>p r i c e [ c u r r e n c y ] \n ;
2009-05-29
50 / 102
Reading XML II
Output
Book w i t h ID 1 c o s t s 3 5 . 9 5 Euro Book w i t h ID 2 c o s t s 3 9 . 9 5 Euro
2009-05-29
51 / 102
Create XML I
Creating
$xml = new S i m p l e X m l E l e m e n t ( <d v d s h e l f ></d v d s h e l f > ) ; $xml>a d d C h i l d ( dvd ) ; $xml>dvd [0]> a d d A t t r i b u t e ( i d , 1 ) ; $xml>dvd [0]> a d d C h i l d ( t i t l e , S t a r Trek ) ; $xml>asXML ( s o u r c e s / s i m p l e x m l c r e a t e . xml ) ;
Output
<? xml v e r s i o n= 1 . 0 ?> < d v d s h e l f><dvd i d= 1 >< t i t l e >S t a r Trek</ t i t l e ></ dvd></ d v d s h e l f>
2009-05-29
52 / 102
Outline
XPath in action
Namespaces in RDF pQuery
2009-05-29
53 / 102
Outline - XPath
XPath in action
Namespaces in RDF pQuery
2009-05-29
54 / 102
Overview
XPath Enables you to select information parts from XML documents. Traverse the XML tree Select XML nodes W3C recommendation Version 1: November 1999 Version 2: January 2007 Fields of application
XSLT (XML Stylesheet Language Transformations) Fetching XML nodes within programming languages
2009-05-29
55 / 102
Overview
XPath Enables you to select information parts from XML documents. Traverse the XML tree Select XML nodes W3C recommendation Version 1: November 1999 Version 2: January 2007 Fields of application
XSLT (XML Stylesheet Language Transformations) Fetching XML nodes within programming languages
2009-05-29
55 / 102
2009-05-29
56 / 102
Introductory example
<b o o k s h e l f> <book id=1> < t i t l e l a n g= en > B e a u t i f u l c o d e</ t i t l e > <author>A. Oram</author> <author>G. Wilson</author> <y e a r>2007</ y e a r> < p r i c e c u r r e n c y= Euro >3 5 . 9 5</ p r i c e> </book> <book id=2> < t i t l e l a n g= de >eZ Components Das E n t w i c k l e r h a n d b u c h</ t i t l e> <author>T. Schlitt</author> <author>K. Nordmann</author> <y e a r>2007</ y e a r> < p r i c e c u r r e n c y= Euro >3 9 . 9 5</ p r i c e> </book> </ b o o k s h e l f>
2009-05-29
56 / 102
Introductory example
<b o o k s h e l f> <book id=1> < t i t l e l a n g= en > B e a u t i f u l c o d e</ t i t l e > <author>A. Oram</author> <author>G. Wilson</author> <y e a r>2007</ y e a r> < p r i c e c u r r e n c y= Euro >3 5 . 9 5</ p r i c e> </book> <book id=2> < t i t l e l a n g= de >eZ Components Das E n t w i c k l e r h a n d b u c h</ t i t l e> <author>T. Schlitt</author> <author>K. Nordmann</author> <y e a r>2007</ y e a r> < p r i c e c u r r e n c y= Euro >3 9 . 9 5</ p r i c e> </book> </ b o o k s h e l f>
2009-05-29
56 / 102
Outline - XPath
XPath in action
Namespaces in RDF pQuery
2009-05-29
57 / 102
Addressing
Every XPath expression matches a set of nodes (0..n) It encodes an address for the selected nodes Simple XPath expressions look similar to Unix le system addresses Two generally dierent ways of addressing are supported Absolute addressing
/ b o o k s h e l f / book / t i t l e / b o o k s h e l f / book / a u t h o r
Relative addressing
book / a u t h o r ../ title
2009-05-29
58 / 102
Contexts
Every expression step creates a new context The next is evaluated in the context created by the previous one Contexts
/ b o o k s h e l f / book / t i t l e
/ resets the context to global bookshelf selects all <bookshelf> elements in the global context / creates a new context, all children of <bookshelf> book selects all <book> elements in this context / creates a new context, all children of selected <book>s title selects all < title > elements in this context A set of all title element nodes.
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 59 / 102
Attributes
Select attributes Prepend the attribute name with an @. Select all currency attributes
/ b o o k s h e l f / book / p r i c e / @ c u r r e n c y
2009-05-29
60 / 102
Steps
Navigation Navigation is not only possible in parent child direction. Navigate to parent
../ / b o o k s h e l f / book / t i t l e / . . / a u t h o r
Navigate to descendants
// t i t l e / b o o k s h e l f / book // @ c u r r e n c y
2009-05-29
61 / 102
Indexing
Access nodes by position It is possible to access a specic node in a set by its position. Indexing
/ b o o k s h e l f / book [ 2 ] / b o o k s h e l f / book / a u t h o r [ 1 ]
2009-05-29
62 / 102
Indexing
Access nodes by position It is possible to access a specic node in a set by its position. Indexing
/ b o o k s h e l f / book [ 2 ] / b o o k s h e l f / book / a u t h o r [ 1 ]
Start index Indexing generally 1 based Some Internet Explorer versions start with 0
2009-05-29
63 / 102
Wildcards
Wildcard search A wildcard represents a node of a certain type with arbitrary name. Wildcards
/ b o o k s h e l f // t i t l e / b o o k s h e l f / book /@
2009-05-29
64 / 102
Union
Union Union the node sets selected by multiple XPath expressions. Union
/ b o o k s h e l f / book / t i t l e | / b o o k s h e l f / book / a u t h o r
2009-05-29
65 / 102
Output
T i t l e o f f i r s t book i s B e a u t i f u l c o d e
2009-05-29
66 / 102
Output
Following c u r r e n c i e s occur : Euro Euro
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 67 / 102
Outline - XPath
XPath in action
Namespaces in RDF pQuery
2009-05-29
68 / 102
XPath syntax
4 axis: An XPath query consists of steps Each step consists of:
1 2 3
Node tests:
name of the node (default) wildcard (*)
Predicate:
accessing a node by index
2009-05-29
69 / 102
XPath syntax
4 axis: An XPath query consists of steps Each step consists of:
1 2 3
Node tests:
name of the node (default) wildcard (*)
Predicate:
accessing a node by index
2009-05-29
69 / 102
XPath syntax
4 axis: An XPath query consists of steps Each step consists of:
1 2 3
Node tests:
name of the node (default) wildcard (*)
Predicate:
accessing a node by index
2009-05-29
69 / 102
XPath syntax
4 axis: An XPath query consists of steps Each step consists of:
1 2 3
Node tests:
name of the node (default) wildcard (*)
Predicate:
accessing a node by index
2009-05-29
69 / 102
XPath syntax
4 axis: An XPath query consists of steps Each step consists of:
1 2 3
Node tests:
name of the node (default) wildcard (*)
Predicate:
accessing a node by index
2009-05-29
69 / 102
XPath syntax
4 axis: An XPath query consists of steps Each step consists of:
1 2 3
Node tests:
name of the node (default) wildcard (*)
Predicate:
accessing a node by index
Step syntax
<a x i s >::< n o d e t e s t >[< p r e d i c a t e >]
2009-05-29
69 / 102
Outline - In depth
Axis
2009-05-29
70 / 102
Axis
13 dimensions Imagine an XML document to be a 13 dimensional space... Axis ancestor ancestor-or-self attribute child descendant descendant-or-self following following-sibling namespace parent preceding preceding-sibling self
Tobias Schlitt (IPC SE 2009)
Shortcut
@ //
..
.
2009-05-29 71 / 102
Axis
Axis syntax
<a x i s n a m e >::< n o d e t e s t >
2009-05-29
72 / 102
Axis
Axis syntax
<a x i s n a m e >::< n o d e t e s t >
Child axis
/ b o o k s h e l f / book c h i l d : : b o o k s h e l f / c h i l d : : book
2009-05-29
72 / 102
Axis
Axis syntax
<a x i s n a m e >::< n o d e t e s t >
Descendant-or-self axis
// book d e s c e n d a n t or s e l f : : book
2009-05-29
72 / 102
Axis
Axis syntax
<a x i s n a m e >::< n o d e t e s t >
Attribute axis
// book / @id d e s c e n d a n t or s e l f : : book / a t t r i b u t e : : i d
2009-05-29
72 / 102
Axis
Axis syntax
<a x i s n a m e >::< n o d e t e s t >
Parent axis
// book / @id / . . d e s c e n d a n t or s e l f : : book / a t t r i b u t e : : i d / p a r e n t : : node ( )
2009-05-29
72 / 102
Ancestor axis
XPath with ancestor axis
// t i t l e / a n c e s t o r : :
2009-05-29
73 / 102
Ancestor axis
XPath with ancestor axis
// t i t l e / a n c e s t o r : :
2009-05-29
73 / 102
Following axis
XPath with ancestor axis
// book / f o l l o w i n g : :
2009-05-29
74 / 102
Following axis
XPath with ancestor axis
// book / f o l l o w i n g : :
Following-sibling axis
XPath with ancestor axis
// book / f o l l o w i n g s i b l i n g : :
2009-05-29
75 / 102
Following-sibling axis
XPath with ancestor axis
// book / f o l l o w i n g s i b l i n g : :
Namespace axis
XPath with namespace axis
namespace : :
2009-05-29
76 / 102
Namespace axis
XPath with namespace axis
namespace : :
Outline - In depth
Predicates
2009-05-29
77 / 102
Predicate syntax
You already saw indexing with numeric predicates Predicates can also be booleans Functions and operators allow ne grained tests Predicate syntax
<n o d e t e s t >[< p r e d i c a t e >]
2009-05-29
78 / 102
Predicate examples
Select all books with ID 1
// book [ @id = 1 ]
2009-05-29
79 / 102
Predicate examples
Select all books that have any attribute at all
// book [ @ ] // book /@ / . .
2009-05-29
79 / 102
Predicate examples
Select all books with price round 40
// book [ r o u n d ( p r i c e ) = 4 0 ]
2009-05-29
79 / 102
Predicate examples
Select all authors with rst name initial T
// book / a u t h o r [ s u b s t r i n g ( . , 1 , 1 ) = T ]
2009-05-29
79 / 102
Operator overview
Mathematical operators +, , div mod Addition, subtraction, multiplication Division Modulo operation
Comparison operators = != <, <= >, >= Check for equality Check for inequality Less than and less than or equal Greater than and greater than or equal
Operator overview
Mathematical operators +, , div mod Addition, subtraction, multiplication Division Modulo operation
Comparison operators = != <, <= >, >= Check for equality Check for inequality Less than and less than or equal Greater than and greater than or equal
Operator overview
Mathematical operators +, , div mod Addition, subtraction, multiplication Division Modulo operation
Comparison operators = != <, <= >, >= Check for equality Check for inequality Less than and less than or equal Greater than and greater than or equal
Operator overview
Mathematical operators +, , div mod Addition, subtraction, multiplication Division Modulo operation
Comparison operators = != <, <= >, >= Check for equality Check for inequality Less than and less than or equal Greater than and greater than or equal
Functions by example
String functions string-join() Concatenates 2 strings substring() Extracts a part from a string Node set functions count() Returns number of nodes in a set position() Returns the position index of each node Boolean functions not() Negates the received boolean expression true() Boolean true Mathematical functions round() Rounds the given number to the next integer oor() Returns the next integer smaller than the given number Function overview An overview on all functions can be found on http://www.w3.org/TR/xpath-functions/
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 81 / 102
Functions by example
String functions string-join() Concatenates 2 strings substring() Extracts a part from a string Node set functions count() Returns number of nodes in a set position() Returns the position index of each node Boolean functions not() Negates the received boolean expression true() Boolean true Mathematical functions round() Rounds the given number to the next integer oor() Returns the next integer smaller than the given number Function overview An overview on all functions can be found on http://www.w3.org/TR/xpath-functions/
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 81 / 102
Functions by example
String functions string-join() Concatenates 2 strings substring() Extracts a part from a string Node set functions count() Returns number of nodes in a set position() Returns the position index of each node Boolean functions not() Negates the received boolean expression true() Boolean true Mathematical functions round() Rounds the given number to the next integer oor() Returns the next integer smaller than the given number Function overview An overview on all functions can be found on http://www.w3.org/TR/xpath-functions/
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 81 / 102
Functions by example
String functions string-join() Concatenates 2 strings substring() Extracts a part from a string Node set functions count() Returns number of nodes in a set position() Returns the position index of each node Boolean functions not() Negates the received boolean expression true() Boolean true Mathematical functions round() Rounds the given number to the next integer oor() Returns the next integer smaller than the given number Function overview An overview on all functions can be found on http://www.w3.org/TR/xpath-functions/
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 81 / 102
Functions by example
String functions string-join() Concatenates 2 strings substring() Extracts a part from a string Node set functions count() Returns number of nodes in a set position() Returns the position index of each node Boolean functions not() Negates the received boolean expression true() Boolean true Mathematical functions round() Rounds the given number to the next integer oor() Returns the next integer smaller than the given number Function overview An overview on all functions can be found on http://www.w3.org/TR/xpath-functions/
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 81 / 102
Outline - XPath
XPath in action
Namespaces in RDF pQuery
2009-05-29
82 / 102
Namespaces in RDF
2009-05-29
83 / 102
Simpler variant
/ / [ name ( . ) = r d f : D e s c r i p t i o n ] / n a m e s p a c e s : :
2009-05-29
84 / 102
Simpler variant
/ / [ name ( . ) = r d f : D e s c r i p t i o n ] / n a m e s p a c e s : :
2009-05-29
84 / 102
pQuery
2009-05-29
85 / 102
jQuery
Cool JavaScript framework Allows selecting HTML elements by CSS selectors http://jquery.com/
2009-05-29
86 / 102
pQuery
2009-05-29
87 / 102
Example HTML
<html> <head> < t i t l e>Some w e b s i t e</ t i t l e> </ head> <body> <h1>Some h e a d l i n e</h1> <p> Some n i c e <a h r e f= t e s t . ht m l c l a s s= i n t e r n a l >c o n t e n t</a >. </p> <h1 i d= s e c o n d >Second h e a d l i n e</h1> <p> More n i c e <a h r e f= t e s t . ht m l >c o n t e n t</a> . </p> </ body> </ html>
2009-05-29
88 / 102
Usage
r e q u i r e p q u e r y / p q u e r y . php ; $dom = new DOMDocument ( ) ; $dom>f o r m a t O u t p u t = t r u e ; $dom>loadHTMLFile ( . . / e x a m p l e . h tm l ) ; $q = new pQuery ( $dom ) ; $q>q u e r y ( h1#s e c o n d ) ; $q>a d d C l a s s ( s o m e c l a s s ) ; $dom>saveHTMLFile ( . . / p q u e r y s i m p l e . ht m l ) ;
2009-05-29
89 / 102
2009-05-29
90 / 102
Issueing a query
// . . . p u b l i c f u nc t i o n query ( $ s t r i n g ) { $ x p a t h = $ t h i s >c r e a t e X P a t h ( $ s t r i n g ) ; echo Q u e r y i n g w i t h XPath $ x p a t h . \ n ; $ n e w C o n t ex t = a r r a y ( ) ; f o r e a c h ( $ t h i s >c o n t e x t a s $node ) { $ n o d e L i s t = $ t h i s >xpath >q u e r y ( $xpath , $node ) ; f o r e a c h ( $ n o d e L i s t a s $newNode ) { $n e w Co nt ext [ ] = $newNode ; } } $ t h i s >c o n t e x t = $ ne w C on tex t ; } // . . .
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 91 / 102
Creating XPath
// . . . protected f u n c t i o n createXPath ( $ s t r i n g ) { $ p a r t s = explode ( , $ s t r i n g ) ; $xpath = array () ; foreach ( $par ts as $part ) { $ x p a t h [ ] = $ t h i s >c r e a t e S i n g l e X P a t h ( $ p a r t ) ; } r e t u r n implode ( | } // . . . , $xpath ) ;
2009-05-29
92 / 102
DELIM CAPTURE
2009-05-29
93 / 102
2009-05-29
94 / 102
Creating selector
// . . . protected function c r e a t e S e l e c t o r ( array $parts ) { switch ( $parts [ 1 ] ) { case . : return $parts [0] . [ contains ( @class , . $parts [ 2 ] . ] ; break ; c a s e # : return $parts [0] . [ @id = . $ p a r t s [ 2 ] . break ; } } }
] ;
2009-05-29
95 / 102
Adding a class I
// . . . public function addClass ( $ c la s s ) { f o r e a c h ( $ t h i s >c o n t e x t a s $node ) { i f ( $node>nodeType !== XML ELEMENT NODE ) { continue ; } i f ( ! $node>h a s A t t r i b u t e ( c l a s s ) ) { $ c l a s s e s = array () ; } else { $ c l a s s e s = explode ( , $node>g e t A t t r i b u t e ( c l a s s ) ); } // . . .
Tobias Schlitt (IPC SE 2009) XML and XPath with PHP 2009-05-29 96 / 102
Adding a class II
// . . . i f ( ! in array ( $class , $classes ) ) { $classes [] = $class ; } $node>s e t A t t r i b u t e ( class , implode ( , $ c l a s s e s ) ); } } // . . .
2009-05-29
97 / 102
Simple example
r e q u i r e p q u e r y / p q u e r y . php ; $dom = new DOMDocument ( ) ; $dom>f o r m a t O u t p u t = t r u e ; $dom>loadHTMLFile ( . . / e x a m p l e . h tm l ) ; $q = new pQuery ( $dom ) ; $q>q u e r y ( h1#s e c o n d ) ; $q>a d d C l a s s ( s o m e c l a s s ) ; $dom>saveHTMLFile ( . . / p q u e r y s i m p l e . ht m l ) ;
2009-05-29
98 / 102
Advanced example
r e q u i r e p q u e r y / p q u e r y . php ; $dom = new DOMDocument ( ) ; $dom>f o r m a t O u t p u t = t r u e ; $dom>loadHTMLFile ( . . / e x a m p l e . h tm l ) ; $q = new pQuery ( $dom ) ; $q>q u e r y ( h1 a . i n t e r n a l ) ; $q>a d d C l a s s ( c o o l c l a s s ) ; $dom>saveHTMLFile ( . . / p q u e r y a d v a n c e d . ht m l ) ;
2009-05-29
100 / 102
The end
Thank you for listening! Are there any questions left? I hope you learned what you expected? Contact me: Tobias Schlitt <toby@php.net> Enjoy the conference!
2009-05-29
102 / 102
The end
Thank you for listening! Are there any questions left? I hope you learned what you expected? Contact me: Tobias Schlitt <toby@php.net> Enjoy the conference!
2009-05-29
102 / 102
The end
Thank you for listening! Are there any questions left? I hope you learned what you expected? Contact me: Tobias Schlitt <toby@php.net> Enjoy the conference!
2009-05-29
102 / 102
The end
Thank you for listening! Are there any questions left? I hope you learned what you expected? Contact me: Tobias Schlitt <toby@php.net> Enjoy the conference!
2009-05-29
102 / 102
The end
Thank you for listening! Are there any questions left? I hope you learned what you expected? Contact me: Tobias Schlitt <toby@php.net> Enjoy the conference!
2009-05-29
102 / 102