
XML

XML - Introduction

What is XML?
Difference between XML and HTML
How XML can be used
XML syntax
XML elements and attributes
XML well-formedness and validity
What is XSL?
DOM and SAX
XML CDATA and entity references

What is XML ?

XML stands for Extensible Markup Language.

XML is a markup language much like HTML.

XML was designed to describe data.

XML tags are not predefined. You must define your own tags.

XML uses a Document Type Definition (DTD) or an XML Schema to describe the data.

XML with a DTD or XML Schema is designed to be self-descriptive.

HTML
<html>
<body>
<b>Hello World</b><br>
</body>
</html>

XML = Extensible Markup Language
DTD = Document Type Definition
XSL = Extensible Stylesheet Language

Difference between XML and HTML

XML and HTML were designed with different goals. XML is not a replacement for HTML.

XML was designed to carry data.

XML was designed to describe data and to focus on what data is.

HTML was designed to display data and to focus on how data looks.

HTML is about displaying information, while XML is about describing information.

How to Make Data Portable
• Tell what the data means
• Tell how the data is structured so computers can understand it
• Tell how it should look

• But do these separately:

• The meaning: XML
• The structure: DTD (Document Type Definition)
• The formatting: XSL (Extensible Stylesheet Language)
• Example: XML catalog structure

How XML Can Be Used

XML can separate data from HTML

With XML, your data is stored outside your HTML.

XML is used to exchange data

With XML, data can be exchanged between incompatible systems. In the real world, computer systems and databases
contain data in incompatible formats. One of the most time-consuming challenges for developers has been to exchange
data between such systems over the Internet.
Converting the data to XML can greatly reduce this complexity and create data that can be read by many different
types of applications.

XML and B2B

With XML, financial information can be exchanged over the Internet.

XML can be used to share data

Since XML data is stored in plain text format, XML provides a software- and hardware-independent way of sharing
data.

XML can be used to store data

With XML, plain text files can be used to store data.

XML can make your data more useful

With XML, your data is available to more users.
Since XML is independent of hardware, software and application, you can make your data available to more than just
standard HTML browsers.

XML at a glance
Well Formed Document:
<Book>
<Author>George Soros</Author>
<Title>The Crisis of Global Capitalism</Title>
<Year>1998</Year>
<Publ>Public Affairs</Publ>
<Price>26.00</Price>
<ISBN>1-891620-27-4</ISBN>
</Book>

DTD: Document Type Definition


<?xml version="1.0"?>
<!DOCTYPE Book [
<!ELEMENT Book (Author, Title, Year, Publ, Price, ISBN)>
]>

SOURCE: PROF. JEROME YEN

XML Document

<?xml version="1.0"?>
<!-- Reference to a DTD to validate our customer data -->
<!DOCTYPE customer SYSTEM "http://www.example.com/customer.dtd">
<customer id="123456">
<first-name>John</first-name>
<last-name>Doe</last-name>
<address>
<street>123 Main</street>
<city>Anytown</city>
<state>CA</state>
<zip>99999</zip>
</address>
<phone>800-555-9999</phone>
<email-address>john@doe.com</email-address>
</customer>

XML Recipe Example
<?xml version="1.0"?>
<Recipe>
<Name>Apple Pie</Name>
<Ingredients>
<Ingredient>
<Qty unit="pint">1</Qty>
<Item>milk</Item>
</Ingredient>
<Ingredient>
<Qty unit="each">10</Qty>
<Item>apples</Item>
</Ingredient>
</Ingredients>
<Instructions>
<Step>Peel the apples</Step>
<Step>Pour the milk into a 10-inch saucepan</Step>
<!-- And so on... -->
</Instructions>
</Recipe>

XML Example
<?xml version="1.0"?>
<PRESCRIPTION>
<MEDNAME MED="Amoxil"> Prescribed medication:
Amoxil</MEDNAME>
<FORM TYPE="capsule"> Form: capsule </FORM>
<DOSAGE AMOUNT="25 mg"> Dosage: 25 mg</DOSAGE>
<RXINSTRUCTIONS FREQUENCY="daily"> daily
</RXINSTRUCTIONS>
</PRESCRIPTION>

•SOURCE: RACHEL SOKOLOWSKI


Sample XML
<?xml version="1.0"?>
<computer>
<Company_Name> HP </Company_Name>
<CPU> Pentium </CPU>
<RAM> 4 </RAM>
<HDD> 120 </HDD>
</computer>

DTD
<?xml version="1.0"?>
<!DOCTYPE computer [
<!ELEMENT computer (Company_Name, CPU, RAM, HDD)>
<!ELEMENT Company_Name (#PCDATA)>
<!ELEMENT CPU (#PCDATA)>
<!ELEMENT RAM (#PCDATA)>
<!ELEMENT HDD (#PCDATA)>
]>

DOM Tree

XML Syntax
The syntax rules of XML are simple and strict, so they are easy to learn and easy to apply.

Because the rules are strict, software that reads and manipulates XML is straightforward to write.

All XML elements must have a closing tag


With XML, it is illegal to omit the closing tag.

XML tags are case sensitive


With XML, the tag <Letter> is different from the tag <letter>.

All XML elements must be properly nested


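In HTML some elements can be improperly nested within each other like this:
<b><i>This text is bold and italic</b></i>
In XML all elements must be properly nested within each other like this:
<b><i>This text is bold and italic</i></b>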

Attribute values must always be quoted


With XML, it is illegal to omit quotation marks around attribute values. 
<?xml version="1.0" encoding="ISO-8859-1"?>

Comments in XML
The syntax for writing comments in XML is similar to that of HTML.
<!-- This is a comment -->

XML Elements and Attributes
XML Elements
XML Elements are Extensible
XML documents are Extensible.
XML Elements have Relationships
Elements are related as parents and children.
Elements have Content
Elements can have different content types.

XML Attributes
XML elements can have attributes.
<person sex="female">
Use of Elements vs. Attributes
Data can be stored in child elements or in attributes.
Take a look at these examples:
<person sex="female"> <firstname>Anna</firstname> <lastname>Smith</lastname> </person>
<person> <sex>female</sex> <firstname>Anna</firstname> <lastname>Smith</lastname> </person>
There are no rules about when to use attributes, and when to use child elements. My experience is
that attributes are handy in HTML, but in XML you should try to avoid them. Use child elements if the
information feels like data.

Well-Formedness & Validity
"Well Formed" XML documents

A "Well Formed" XML document has correct XML syntax: it conforms to the XML syntax rules.

<?xml version="1.0" encoding="ISO-8859-1"?>


<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
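
A minimal sketch of how well-formedness can be checked in code, assuming the standard JAXP parser that ships with the JDK. The class name WellFormedCheck and the method isWellFormed are hypothetical names chosen for this illustration. The parser rejects any text that breaks a syntax rule (missing closing tag, improper nesting, unquoted attribute value).

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// WellFormedCheck is a hypothetical helper name used only for this sketch.
class WellFormedCheck {

    /** Returns true if the parser accepts the text, i.e. the XML is well formed. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler.fatalError rethrows the parse exception.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // a syntax rule was broken
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<note><to>Tove</to><from>Jani</from></note>"));
        System.out.println(isWellFormed("<b><i>bold and italic</b></i>"));
        System.out.println(isWellFormed("<Qty unit=pint>1</Qty>"));
    }
}
```

Only the first of the three sample strings follows all the syntax rules; the other two show improper nesting and an unquoted attribute value.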

"Valid" XML documents


A "Valid" XML document is a "Well Formed" XML document that also conforms to the
rules of a Document Type Definition (DTD) or Schema.

<?xml version="1.0" encoding="ISO-8859-1"?>


<!DOCTYPE note SYSTEM "InternalNote.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
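
Validity can be checked with a validating parser. The sketch below assumes the JDK's JAXP parser and uses an inline DTD for the note example, because the contents of InternalNote.dtd are not shown here; the DTD text, the class name DtdValidityCheck and the method isValid are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for this sketch; the DTD below is an assumed one.
class DtdValidityCheck {

    static final String DTD =
            "<!DOCTYPE note [\n"
          + "<!ELEMENT note (to, from, heading, body)>\n"
          + "<!ELEMENT to (#PCDATA)>\n"
          + "<!ELEMENT from (#PCDATA)>\n"
          + "<!ELEMENT heading (#PCDATA)>\n"
          + "<!ELEMENT body (#PCDATA)>\n"
          + "]>\n";

    /** Returns true if the document is well formed and conforms to its DTD. */
    static boolean isValid(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e; // validity errors are reported here; treat them as failures
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String complete = "<note><to>Tove</to><from>Jani</from>"
                + "<heading>Reminder</heading><body>Don't forget me this weekend!</body></note>";
        String incomplete = "<note><to>Tove</to></note>";
        System.out.println(isValid(DTD + complete));
        System.out.println(isValid(DTD + incomplete));
    }
}
```

The second document is well formed but leaves out elements the DTD requires, so it is not valid.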

XSL (the eXtensible Stylesheet Language)

Displaying XML with XSL

XSL is the preferred style sheet language of XML.

XSL can transform XML into HTML before it is displayed by the browser.

<?xml version="1.0" encoding="ISO-8859-1"?>


<!-- Edited with XML Spy v4.2 -->
<html xsl:version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://www.w3.org/TR/xhtml1/strict">
<body style="font-family:Arial,helvetica,sans-serif;font-size:12pt;
background-color:#EEEEEE">
<xsl:for-each select="breakfast_menu/food">
<div style="background-color:teal;color:white;padding:4px">
<span style="font-weight:bold;color:white">
<xsl:value-of select="name"/></span>
- <xsl:value-of select="price"/>
</div>
<div style="margin-left:20px;margin-bottom:1em;font-size:10pt">
<xsl:value-of select="description"/>
<span style="font-style:italic">
(<xsl:value-of select="calories"/> calories per serving)
</span>
</div>
</xsl:for-each>
</body>
</html>

XML document that references the stylesheet (simple.xsl):
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="D:\XML\simple.xsl"?>
<breakfast_menu>
<food> <name>Belgian Waffles</name><price>$5.95</price>
<description>two of our famous Belgian Waffles with plenty of real maple syrup</description>
<calories>650</calories>
</food>
<food> <name>Strawberry Belgian Waffles</name>
<price>$7.95</price><description>light Belgian waffles covered with strawberries and whipped cream</description>
<calories>900</calories>
</food>
<food> <name>Berry-Berry Belgian Waffles</name>
<price>$8.95</price>
<description>light Belgian waffles covered with an assortment of fresh berries and whipped cream</description>
<calories>900</calories>
</food>
<food> <name>French Toast</name>
<price>$4.50</price>
<description>thick slices made from our homemade sourdough bread</description>
<calories>600</calories>
</food>
<food> <name>Homestyle Breakfast</name>
<price>$6.95</price>
<description>two eggs, bacon or sausage, toast, and our ever-popular hash browns</description>
<calories>950</calories>
</food>
</breakfast_menu>
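
The same kind of transformation can also be run outside the browser. The sketch below assumes the javax.xml.transform API in the JDK and uses a deliberately small stylesheet (a plain list of name and price) rather than the full styled one; the class name MenuTransform, the method toHtml and the stylesheet text are illustrative assumptions.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper for this sketch; the stylesheet is a reduced example.
class MenuTransform {

    static final String XSL =
            "<xsl:stylesheet version=\"1.0\""
          + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:output method=\"html\"/>"
          + "<xsl:template match=\"/\">"
          + "<ul><xsl:for-each select=\"breakfast_menu/food\">"
          + "<li><xsl:value-of select=\"name\"/> - <xsl:value-of select=\"price\"/></li>"
          + "</xsl:for-each></ul>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    /** Applies the stylesheet to the XML text and returns the generated HTML. */
    static String toHtml(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String menu = "<breakfast_menu>"
                + "<food><name>Belgian Waffles</name><price>$5.95</price></food>"
                + "<food><name>French Toast</name><price>$4.50</price></food>"
                + "</breakfast_menu>";
        System.out.println(toHtml(menu));
    }
}
```

The xsl:for-each and xsl:value-of instructions are the same ones used in the stylesheet above; only the surrounding HTML is simpler.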
DOM & SAX

SAX and DOM are two different APIs for accessing information in XML documents.

With DOM parsers, XML documents are represented as a hierarchical tree structure.

SAX-based parsers invoke methods when markup (e.g. a start tag, an end tag, etc.) is encountered. No tree
structure is created - data is passed to the application from the XML document as it is found.
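
To make the difference concrete, here is a small sketch that counts elements with a given tag name both ways, assuming the JAXP DOM and SAX parsers in the JDK. The class and method names (DomVersusSax, countWithDom, countWithSax) are hypothetical. The DOM version builds the whole tree and then queries it; the SAX version only reacts to start-tag events as they arrive.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class for this sketch.
class DomVersusSax {

    /** DOM: parse the whole document into a tree, then query the tree. */
    static int countWithDom(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName(tag).getLength();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** SAX: no tree is built; the handler is called for each start tag. */
    static int countWithSax(String xml, String tag) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                if (qName.equals(tag)) {
                    count[0]++;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        String xml = "<breakfast_menu><food/><food/><food/></breakfast_menu>";
        System.out.println(countWithDom(xml, "food"));
        System.out.println(countWithSax(xml, "food"));
    }
}
```

Both methods give the same count; the difference is that the SAX version never holds the document in memory.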

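When to use DOM and when to use SAX?

DOM: when the whole document must be held in memory and navigated, e.g. importing and exporting data to a database.

SAX: when documents are processed as a stream, e.g. purchase orders and sales orders.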

DOM

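import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// input may be a File, an InputStream, or a URI string
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
Document doc = docBuilder.parse(input);
Node root = doc.getDocumentElement(); // root node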

SAX

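import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Handler class; the name PrintHandler is illustrative
public class PrintHandler extends DefaultHandler {
    private int depth = 0; // current nesting level, used by printIndent()

    public void startDocument() { System.out.println("start document"); }

    public void endDocument() { System.out.println("end document"); }

    public void startElement(String uri, String localName,
                             String qName, Attributes attributes) {
        printIndent();
        System.out.println("starting element: " + qName);
        depth++;
    }

    public void endElement(String uri, String localName, String qName) {
        depth--;
    }

    private void printIndent() {
        for (int i = 0; i < depth; i++) System.out.print("  ");
    }
}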

All text in an XML document is normally parsed by the parser.

Only text inside a CDATA section is not parsed as markup.

Escape Characters

Illegal XML characters have to be replaced by entity references.


If you place a character like "<" inside an XML element, it will generate an error because the parser
interprets it as the start of a new element.

You cannot write something like this:

<message>if salary < 1000 then</message>

To avoid this, you have to replace the "<" character with an entity reference, like this:

<message>if salary &lt; 1000 then</message>

There are 5 predefined entity references in XML:

&lt;      <      less than
&gt;      >      greater than
&amp;     &      ampersand
&apos;    '      apostrophe
&quot;    "      quotation mark
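
When XML is generated by a program, the replacement is usually done in code. The sketch below shows one way to map the five characters to their predefined entity references; XmlEscape and escape are hypothetical names for this illustration, not part of any library.

```java
// Hypothetical helper for this sketch: replaces the five special characters
// with the predefined entity references listed above.
class XmlEscape {

    static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            switch (c) {
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '&':  out.append("&amp;");  break;
                case '\'': out.append("&apos;"); break;
                case '"':  out.append("&quot;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Builds the <message> element from the example with the text escaped.
        System.out.println("<message>" + escape("if salary < 1000 then") + "</message>");
    }
}
```

Each character is examined once, so an ampersand that is already part of the input is escaped exactly one time.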

CDATA
Everything inside a CDATA section is passed through as plain character data; the parser does not interpret it as markup.

If your text contains a lot of "<" or "&" characters - like program code often does - the XML
element can be defined as a CDATA section.
A CDATA section starts with "<![CDATA[" and ends with "]]>":

<![CDATA[
function matchwo(a,b) {
if (a < b && a < 0) { return 1 } else { return 0 }
}
]]>
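
A short sketch of what an application sees when it reads such a section, assuming the JDK's DOM parser. CdataDemo and textOf are hypothetical names, and the script element wrapping the CDATA section is an assumption added so that the fragment forms a complete document.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper for this sketch.
class CdataDemo {

    /** Parses the document and returns the text content of its root element. */
    static String textOf(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The "<" and "&" characters need no escaping inside CDATA.
        String withCdata =
                "<script><![CDATA[if (a < b && a < 0) { return 1; }]]></script>";
        // The same text written with entity references instead.
        String withEntities =
                "<script>if (a &lt; b &amp;&amp; a &lt; 0) { return 1; }</script>";
        System.out.println(textOf(withCdata));
        System.out.println(textOf(withEntities));
    }
}
```

Both documents carry the same text; CDATA only changes how that text has to be written in the file.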

Any Questions?

DTDs and Schemas

•DTDs (Document Type Definitions)


•The structure of an XML document is defined by its DTD.
DTDs define:
•the tags that can or must appear
•how often the tags can appear
•how the tags can be nested
•allowable, required and default attributes
•...but not the type of data
•A parser can read well-formed XML without a DTD
•DTD is defined by the XML 1.0 specification

Introduction to DTD

The purpose of a Document Type Definition is to define the legal building blocks of an XML document.

It defines the document structure with a list of legal elements.

A DTD can be declared inline in your XML document, or as an external reference.

Seen from a DTD point of view, all XML documents are made up of the following simple building
blocks:

•Elements
•Tags
•Attributes
•Entities
•PCDATA
•CDATA

DTD Elements

In a DTD, XML elements are declared with a DTD element declaration.

Syntax : <!ELEMENT element-name category> or <!ELEMENT element-name (element-content)>

Empty elements
<!ELEMENT element-name EMPTY>
Elements with only character data
Elements with only character data are declared with #PCDATA inside parentheses:
<!ELEMENT element-name (#PCDATA)>

Elements with any contents


Elements declared with the category keyword ANY can contain any combination of parsable data:
<!ELEMENT element-name ANY>

Elements with children


Elements with one or more children are declared with the names of the child elements inside
parentheses:
<!ELEMENT element-name (child-element-name)> or
<!ELEMENT element-name (child-element-name,child-element-name,.....)>

Declaring exactly one occurrence of the same element

<!ELEMENT element-name (child-name)>
example: <!ELEMENT note (message)>

Declaring at least one occurrence of the same element

<!ELEMENT element-name (child-name+)>
example: <!ELEMENT note (message+)>

Declaring zero or more occurrences of the same element

<!ELEMENT element-name (child-name*)>
example: <!ELEMENT note (message*)>

Declaring zero or one occurrence of the same element

<!ELEMENT element-name (child-name?)>
example: <!ELEMENT note (message?)>
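
The four occurrence rules can be tried out with a validating parser. This is a sketch assuming the JDK's JAXP parser; OccurrenceDemo and accepts are hypothetical names. The method builds a note with a given number of message children and validates it against the content model passed in.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for this sketch.
class OccurrenceDemo {

    /** Returns true if a note with `messages` message children fits the content model. */
    static boolean accepts(String contentModel, int messages) {
        StringBuilder xml = new StringBuilder();
        xml.append("<!DOCTYPE note [\n")
           .append("<!ELEMENT note ").append(contentModel).append(">\n")
           .append("<!ELEMENT message (#PCDATA)>\n")
           .append("]>\n<note>");
        for (int i = 0; i < messages; i++) {
            xml.append("<message>hi</message>");
        }
        xml.append("</note>");
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // validate against the inline DTD
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e; // a validity error means the model was not satisfied
                }
            });
            builder.parse(new InputSource(new StringReader(xml.toString())));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts("(message)", 2));   // exactly one allowed
        System.out.println(accepts("(message+)", 0));  // at least one required
        System.out.println(accepts("(message*)", 0));  // zero or more allowed
        System.out.println(accepts("(message?)", 1));  // zero or one allowed
    }
}
```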

In a DTD, Attributes are declared with an ATTLIST declaration.

Declaring Attributes
An attribute declaration has the following syntax:

<!ATTLIST element-name attribute-name attribute-type default-value>

example:
DTD example: <!ATTLIST payment type CDATA "check">
XML example: <payment type="check" />

The attribute-type can have the following values:

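Attribute Type    Description
CDATA             The value is character data
(en1|en2|..)      The value must be one from an enumerated list
ID                The value is a unique id
IDREF             The value is the id of another element
IDREFS            The value is a list of other ids
NMTOKEN           The value is a valid XML name token
NMTOKENS          The value is a list of valid XML name tokens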

The default-value can have the following values:

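Value           Explanation
value           The default value of the attribute
#REQUIRED       The attribute is required
#IMPLIED        The attribute is optional
#FIXED value    The attribute value is fixed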

Syntax <!ATTLIST element-name attribute-name attribute-type #IMPLIED>

<!ATTLIST element-name attribute-name attribute-type #FIXED "value">
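
A default value declared in an ATTLIST is supplied by the parser when the attribute is left out. The sketch below uses the payment example with the JDK's DOM parser; AttributeDefaultDemo and paymentType are hypothetical names, and the EMPTY element declaration is an assumption added to complete the DTD.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper for this sketch.
class AttributeDefaultDemo {

    // Inline DTD: payment has a "type" attribute that defaults to "check".
    static final String DTD =
            "<!DOCTYPE payment [\n"
          + "<!ELEMENT payment EMPTY>\n"
          + "<!ATTLIST payment type CDATA \"check\">\n"
          + "]>\n";

    /** Parses DTD + element and returns the value of the type attribute. */
    static String paymentType(String element) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(DTD + element)));
            return doc.getDocumentElement().getAttribute("type");
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(paymentType("<payment/>"));               // attribute omitted
        System.out.println(paymentType("<payment type=\"cash\"/>")); // attribute given
    }
}
```

When the attribute is written out explicitly, its own value is used instead of the default.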

XML Schemas
XML ("Extensible Markup Language") is establishing itself as a useful
tool for data exchange because it has the potential to become a
universal format for structuring information.
To use XML effectively in a community such as the Internet,
there must be some constraints on the valid tags and tag sequences so that the
data exchange can make sense to someone other than the creator.
DTDs (Document Type Definitions), which are still commonly used, fulfilled this need.

DTDs, however, have several disadvantages, such as:

Creating DTDs requires knowledge of a syntax that is completely different
from XML.
DTDs have a very limited ability to specify custom datatypes.
DTDs lack commonly used database datatypes (such as dates).

The answer to these problems is XML Schemas. Schemas overcome
the shortcomings of DTDs and still provide the power that users need.
Here are some advantages of schemas:
Schemas use the same syntax as XML, so there is less to learn.
Schemas allow custom datatypes to be specified.
Schemas offer more predefined types (over 40).
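
As an illustration of the datatype support that DTDs lack, the sketch below validates a date value against a one-element schema using the javax.xml.validation API in the JDK. The schema text, the orderdate element, and the names SchemaDateCheck and isValid are assumptions made for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper for this sketch.
class SchemaDateCheck {

    // A schema is itself XML; xs:date is one of the predefined types.
    static final String XSD =
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
          + "<xs:element name=\"orderdate\" type=\"xs:date\"/>"
          + "</xs:schema>";

    /** Returns true if the document conforms to the schema above. */
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document (or schema) was rejected
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<orderdate>2002-09-24</orderdate>"));
        System.out.println(isValid("<orderdate>yesterday</orderdate>"));
    }
}
```

A DTD could only declare orderdate as (#PCDATA), which would accept both documents.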