DTD - Document Type Definition

You might also like

Download as pptx, pdf, or txt
Download as pptx, pdf, or txt
You are on page 1of 30

DTD – DOCUMENT TYPE

DEFINITION

DONE BY:
HARINI.D
IV M.Sc SS
SYNOPSIS:

 INTRODUCTION
 DTD- DEFINITION
 WHEN AND WHY USE DTD?
 TYPES OF DTD DECLARATION
 DTD- XML BUILDING BLOCKS
 CONTENT MODEL
 DATA TYPES
 ELEMENT TYPE DECLARATIONS
 ENTITY TYPE DECLARATIONS
 ATTRIBUTE TYPE DECLARATION
SYNOPSIS:

 XML ELEMENT ATTRIBUTES


 ATTRIBUTE RULES
 SAMPLE CODE
INTRODUCTION

 An XML document with correct syntax is


called "Well Formed".

 An XML document validated against a DTD is


both "Well Formed" and "Valid".
DTD – DEFINITION:
DTD - Document Type Definition.

A DTD defines the structure and the legal elements and attributes of an XML document.

Only the elements defined in a DTD can be used in an XML document.

DTD can be internal or external.

Processing overhead is incurred when validating XML with a DTD.

It allows the developer to create a set of rules to specify legal content and place restrictions on
an XML file
If the XML document does not follow the rules contained within the DTD, a parser generates an
error
An XML document that conforms to the rules within a DTD is said to be valid.
WHEN AND WHY USE DTD?
 With a DTD, independent groups of people can agree to use a standard
DTD for interchanging data.

 With a DTD, you can verify that the data you receive from the outside world
is valid.

 You can also use a DTD to verify your own data.

 A single DTD ensures a common format for each XML document that
references it .

 An application can use a standard DTD to verify that data that it receives
from the outside world is valid.

 A description of legal, valid data further contributes to the interoperability


and efficiency of using XML
TYPES OF DECLARATIONS:

There are two types of DTD. They are:

 Internal DTD Declaration.

 External DTD Declaration.


• INTERNAL DTD DECLARATION:
If the DTD is declared inside the XML file, it must be wrapped inside the <!DOCTYPE> definition:

XML document with an internal DTD

<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend</body>
</note>
- INTERPRETATION OF DTD:

 !DOCTYPE note defines that the root element of this document is


note.

 !ELEMENT note defines that the note element must contain four


elements: "to,from,heading,body“

 !ELEMENT to defines the to element to be of type "#PCDATA“.

 !ELEMENT from defines the from element to be of type "#PCDATA“.

 !ELEMENT heading defines the heading element to be of type


"#PCDATA“.

 !ELEMENT body defines the body element to be of type "#PCDATA"


• EXTERNAL DTD DECLARATION:

If the DTD is declared in an external file, the <!DOCTYPE> definition


must contain a reference to the DTD file:

XML document with a reference to an external DTD

<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
DTD – XML BUILDING BLOCKS:

Seen from a DTD point of view, all XML documents


are made up by the following building blocks:

 Elements
 Attributes
 Entities
 PCDATA
 CDATA
• ELEMENTS:
 Elements are the main building blocks of both XML and HTML
documents.

 Examples of HTML elements are "body" and "table". Examples of XML


elements could be "note" and "message".

 Elements can contain text, other elements, or be empty.

 Examples:

<body>some text</body>

<message>some text</message>
• ATTRIBUTES:

 Attributes provide extra information about elements.

 Attributes are always placed inside the opening tag of an element.

 Attributes always come in name/value pairs.

 The following "img" element has additional information about a


source file:

<img src="computer.gif" />

 The name of the element is "img". The name of the attribute is "src".
The value of the attribute is "computer.gif". Since the element itself is
empty it is closed by a " /".
• ENTITIES:

 Some characters have a special meaning in XML, like the less than sign (<) that
defines the start of an XML tag.

 Most of you know the HTML entity: "&nbsp;". This "no-breaking-space" entity is
used in HTML to insert an extra space in a document.

 Entities are expanded when a document is parsed by an XML parser.

 The following entities are predefined in XML:


• PC DATA:
 PCDATA means parsed character data.

 Think of character data as the text found between the start tag and the end
tag of an XML element.

 PCDATA is text that WILL be parsed by a parser. The text will be


examined by the parser for entities and markup.

 Tags inside the text will be treated as markup and entities will be expanded.

 However, parsed character data should not contain any &, <, or > characters;
these need to be represented by the &amp; &lt; and &gt; entities,
respectively.
• CDATA:

 CDATA means character data.

 CDATA is text that will NOT be parsed by a parser.

 Tags inside the text will NOT be treated as markup


and entities will not be expanded.
CONTENT MODEL
 Identify the name of the element and the nature of that
element’s content.

 The example declares an element that then describes


the document’s content model
NAME CONTENT
MODEL

<!ELEMENT note (to, from, subject, body)>


ELEMENT
DECLARATION
DATA TYPES

 Parsed Character Data


 #PCDATA
▪ <!ELEMENT firstname (#PCDATA)
▪ <!ELEMENT lastname (#PCDATA)

 Unparsed Character Data


 CDATA
▪ <firstname><![CDATA[<b>Jim</b>]]></firstname>
▪ <lastname><![CDATA[<b>Peters</b>]]></lastname>
ELEMENT TYPE DECLARATIONS

There are 3 types of elements. They are:

 EMPTY elements
 ANY elements
 MIXED elements
• EMPTY ELEMENT:

 An element that can not contain any content.

 The html image tag in xml would typically be empty, such


as <image></image> or <image/>.

 Empty elements are more useful with the use of attributes.

<!ELEMENT test EMPTY>


<!ELEMENT image EMPTY>
<!ELEMENT br EMPTY>
• ANY ELEMENT:

 An element that can contain any content.

 It is recommended not to get into the habit


declaring elements with the ANY keyword.

 It is useful when transferring a lot of mixed or


unknown data.

<!ELEMENT test ANY >


• MIXED ELEMENT:

 Elements that can contain a set of content


alternatives.

 Separate the options with the “or” symbol “|”.

<!ELEMENT test <#PCDATA | name>


ENTITY TYPE DECLARATIONS

 Entities are used to define shortcuts to special


characters.

 An entity has three parts: an ampersand (&), an entity


name, and a semicolon (;).

 Entities can be declared internal or external.


• INTERNAL ENTITY DECLARATION:

 Syntax:

<!ENTITY entity-name "entity-value">

 EXAMPLES:

DTD Example:

<!ENTITY writer "Donald Duck.">


<!ENTITY copyright "Copyright W3Schools.">

XML example:

<author>&writer;&copyright;</author>
• EXTERNAL ENTITY DECLARATION:

 SYNTAX:

<!ENTITY entity-name SYSTEM "URI/URL">

 EXAMPLES

DTD Example:

<!ENTITY writer SYSTEM "https://www.w3schools.com/entities.dtd">

<!ENTITY copyright SYSTEM "https://www.w3schools.com/entities.dtd">

XML example:

<author>&writer;&copyright;</author>
ATTRIBUTE TYPE DECLARATION
 Invoice Element Declaration:
<?xml version=“1.0” ?>
<!ELEMENT employee (#PCDATA)

<!ATTLIST ElementName AttributeName Type Default >

<!ATTLIST employee type (FullTime | PartTime) “FullTime” >

Usage in XML file:


<?xml version=“1.0” ?>
<employee type=“PartTime”/>
XML ELEMENT ATTRIBUTES:

 XML tags can contain attributes similar to


attributes in HTML tags.

HTML Examples:
<h1> align=“center”>An XML Example<h1>
<table width=page> </table>

 Attributes are usually used to provide processing


information to the XML application (the
application that is going to consume the XML).
ATTRIBUTE RULES

 Attribute values must be placed in “ “.


 In HTML this is only required id the attribute
contains the space character.

 Attribute values are not processed by the


XML parser.
 This means the values can’t be automatically
checked by the parser.
SAMPLE CODE:
<?xml version=“1.0” ?>
<!DOCTYPE email[
<!ATTLIST email
language (english | french | spanish) “english”
priority (normal | high | low) “normal” >
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA) >
<!ELEMENT subject (#PCDATA) >
<!ELEMENT message (#PCDATA) > ] >
<email language=“spanish” priorit=“high”>
<to>Peter Brenner</to>
<from>Dick Steflik</from
<subject> Test Reminder</subject>
<message>The exam is a week from today</message>
</email>
ANY QUESTIONS??

You might also like