Professional Documents
Culture Documents
Session03 XML Validation DTD
Session03 XML Validation DTD
DTD
Sep-2009
Slide 2
XML Validation
Slide 4
Validation
Slide 5
Schema Languages
There is more than one language in which you can express such
validation conditions. Generically, these are called schema
languages, and the documents that list the constraints are called
schemas.
Different schema languages have different strengths and
weaknesses.
The document type definition (DTD) is the only schema language
built into most XML parsers and endorsed as a standard part of XML.
The W3C XML Schema Language (schemas for short, though it’s
hardly the only schema language) addresses several limitations of
DTDs.
Many other schema languages have been invented that can easily
be integrated with your systems.
Slide 6
Document Type Definition (DTD)
XML 1.0 included a set of tools for defining XML document structures,
called Document Type Definitions (DTDs).
A DTD focuses on the element structure of a document. It says what
elements a document may contain, what each element may and must
contain in what order, and what attributes each element has. DTDs can be
used for:
defining reusable content (entities),
some kinds of metadata information (notations).
mechanisms for providing default values for attributes.
Document type definitions (DTDs) serve two general purposes.
They provide the syntax for describing/constraining the logical structure of
a document. (Element/attribute declarations are used for it)
They provide syntax for composing a logical document from physical
entities. (entity/notation declarations are used to accomplish it.)
Slide 8
DTD Declarations
DTDs contain
several types
of
declarations
Slide 9
<!DOCTYPE ... >
Slide 10
DOCTYPE Syntax
Slide 11
Internal Declarations
The simplest way to define a DTD is through internal declarations. In this case, all
declarations are simply placed between the open/close square brackets. The obvious
downside to this approach is that you can’t reuse the declarations across different
XML document instances.
<!DOCTYPE name [
<!-- insert declarations here -->
]>
Slide 12
External Declarations
Slide 13
Using external declarations examples
<person> <age>33</age>
<name>Billy Bob</name> </person>
<age>33</age>
</person>
Slide 14
Internal and external declarations
A DOCTYPE declaration can also use both the internal and external declarations.
This is useful when you’ve decided to use external declarations but you need to extend them
further or override certain external declarations.
Note: only ENTITY and ATTLIST declarations may be overridden.
Example
<!-- person.dtd -->
<!ENTITY % person-content "name, age">
<!ELEMENT person (%person-content;)>
<!ELEMENT name (#PCDATA)>
<!-- person2.xml -->
<!ELEMENT age (#PCDATA)> <!DOCTYPE person SYSTEM "person.dtd" [
<!-- change person's content model -->
<!-- person1.xml --> <!ENTITY % person-content "age, name">
<!DOCTYPE person SYSTEM "person.dtd">
]>
<person>
<person>
<name>Billy Bob</name>
<age>33</age>
<age>33</age>
<name>Billy Bob</name>
</person>
</person>
Slide 15
<!ELEMENT name content-model>
An ELEMENT declaration defines an element of the specified name with the
specified content model. The content model defines the element’s allowed
children.
Content Model Basics
Syntax Description
Syntax Description
No modifier means the child or child group must appear exactly once at
the specified location (except in a choice content model).
* Annotated child or child group may appear zero or more times at the
specified location.
+ Annotated child or child group may appear one or more times at the
specified location.
? Annotated child or child group may appear zero or one time at the
specified location.
CDATA Arbitrary character data “Value” Default value for attribute. If the
attribute is not explicitly used on
ID A name that is unique within the
the given element, it will still
document
exist in the logical document
IDREF A reference to an ID value in the with the specified default value.
document
ENTITY The name of an unparsed entity #REQUIRED Attribute is required on the given
declared in the DTD element.
ENTITIES A space-delimited list of ENTITY #IMPLIED Attribute is optional on the given
values element.
NMTOKEN A valid XML name (NMTOKEN is #FIXED Attribute always has the
essentially a word without spaces.) "value" specified fixed value.
NMTOKENS A space-delimited list of
NMTOKEN values
Slide 19
Attribute enumerations
<!ATTLIST eName aName (token1 | token2 | token3 | ...)>
<!ATTLIST eName aName NOTATION (token1 | token2 | token3 |
...)>
It’s also possible to define an attribute as an enumeration of tokens. The tokens may be of type NMTOKEN or NOTATION . In
either case, the attribute value must be one of the specified enumerated values.
Entities are the most atomic unit of information in XML. Entities are used
to construct logical XML documents (as well as DTDs) from physical
resources. There are several types of entities, each of which is declared
using an ENTITY declaration.
•General Entity may only be referenced in an XML document (not the DTD).
•Parameter Entity may only be referenced in a DTD (not the XML document).
Referenced
(%name;) is within
Internal
replaced with ELEMENT,
parameter
the parsed ATTRIBUTE,
entities
content NOTATION,
ENTITY
Used to
parameterize
portions of the Example: Parameter entities in the internal subset
DTD
<!DOCTYPE person [
<!ELEMENT person (name)>
<!ENTITY % nameDecl "<!ELEMENT name (#PCDATA)>">
<!-- parameter entity expands to
complete declaration -->
%nameDecl;
]>
<person><name>Billy Bob</name></person>
Slide 23
External parameter entities
<!ENTITY % name PUBLIC "publicId" "systemId">
<!ENTITY % name SYSTEM "systemId">
Slide 24
Internal general entities
<!ENTITY name "value">
Internal general entities always contain The resulting logical document could
parsed XML content. The parsed content is
be serialized as follows:
placed in the logical XML document
everywhere it’s referenced (&name;). <person>
Slide 25
External general parsed entities and Unparsed entities
Slide 26
Questions
Slide 27
Thank you