Lecture 5
• What is structure?
– An observation of patterns or relationships of entities
– Structure may be hierarchical, linear, network (many-to-many) or lattice
• Where is the structure?
– In biology: atomic, molecular, cellular, tissue, organ
– In a report: title, abstract, contents, introduction, main body, conclusions, bibliography, appendices
J.C.Roberts 4
Document Links are only one “structure” component
“HTML is precisely what we were trying to
PREVENT— ever-breaking links, links
going outward only, quotes you can't follow
to their origins, no version management,
no rights management”. – Ted Nelson
But what are links?
– A reference the reader can follow
– If it has an anchor then it can go to a particular place in that document
• Not only one way?
– In some hypertext documents they can be one-to-one or one-to-many
• Are they typed or guaranteed?
• Where does the information go?
– Traditionally it replaces the current document,
– or it could open a new window.
– Transclusion – inserts the text in place
• “inline linking” (e.g., banner ads)
<img src="http://www.webserver.com/picture.jpg">
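A minimal sketch of the two kinds of link described above, using Python's standard `html.parser`. The page fragment, the file name `notes.html` and the anchor `section2` are assumptions made for illustration; only the image URL comes from the slide. The class `LinkCollector` is a hypothetical helper, not part of any library.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag


class LinkCollector(HTMLParser):
    """Collect links the reader follows (<a href>) and inline links (<img src>)."""

    def __init__(self):
        super().__init__()
        self.followable = []  # targets the reader chooses to visit
        self.inline = []      # resources pulled into the page in place

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and "href" in attributes:
            self.followable.append(attributes["href"])
        elif tag == "img" and "src" in attributes:
            self.inline.append(attributes["src"])


# Hypothetical page fragment: one ordinary link with an anchor, one inline image.
PAGE = '''<p>See <a href="notes.html#section2">the notes</a>.</p>
<img src="http://www.webserver.com/picture.jpg">'''

collector = LinkCollector()
collector.feed(PAGE)
collector.close()

# The part after '#' is the anchor: a particular place inside the target document.
document, anchor = urldefrag(collector.followable[0])
print(document, anchor)
print(collector.inline)
```

Both links here point outward only, which is the limitation the Ted Nelson quote complains about: nothing in the target document records that it is being linked to.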
Need a good way to encode structure of docs
The idea was to separate structure of the document from
its appearance:
e.g. author, title, address, introduction, etc.
http://www.w3.org/TR/html4/intro/sgmltut.html
A generalized markup language should be...
• Declarative
– Describes the structure (and attributes) of the document
– Does not describe the processing that is done to it!
– Does not have side-effects
– Extensible
• Rigorous
– Clear correspondence to mathematics
– So there can be known/pre-defined and consistent ways to manipulate it.
– Can be used in programming and processing documents
A bit more on SGML
• An element is a structural unit
e.g., <poem>
We could simplify the markup (making assumptions)
• For instance
– Rule 2, “a single title”, could mean that we could remove the title's end tag
– Rule 3, we could imply the end of the poem, so we don't need </poem>
– Likewise we don't need </line>
Rules like these need to be made formal
• Need a formal specification -> document type definition (DTD)
• Markup declarations, comparable to BNF (Backus Normal Form or Backus–Naur Form)
An element declaration has three parts:
1. a name or group of names,
2. two characters specifying minimization rules, i.e., whether the start tag and the end tag may be left out. Each is either a hyphen (tag required) or a letter O (for "omissible" or "optional"),
3. and a content model (the material the element may contain).
The content model may use one of several reserved words, of which by far the most commonly encountered is #PCDATA (this means "parsed character data").
Occurrence indicators: the plus sign (+, one or more), the question mark (?, at most one) and the asterisk (*, zero or more)
http://www-sul.stanford.edu/tools/tutorials/html2.0/gentle.html
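To make the occurrence indicators concrete, here is a rough sketch that checks a sequence of child elements against a content model by translating the model into a regular expression. The content model `(title?, line+)` is an assumption modelled on the poem example, and the helpers `content_model_to_regex` and `is_valid` are hypothetical. The sketch handles only comma-separated sequences; it ignores alternation, nested groups and #PCDATA, so it is not a DTD validator.

```python
import re


def content_model_to_regex(model):
    """Turn a model like '(title?, line+)' into a regex over child names.

    Each child name in the checked sequence is followed by one space, so
    'line ' cannot accidentally match the start of a longer name.
    """
    parts = []
    for item in model.strip("() ").split(","):
        item = item.strip()
        indicator = item[-1] if item[-1] in "+?*" else ""
        name = item[:-1] if indicator else item
        parts.append(f"(?:{re.escape(name)} ){indicator}")
    return re.compile("".join(parts))


def is_valid(children, model):
    """Return True if the list of child element names satisfies the model."""
    sequence = "".join(f"{child} " for child in children)
    return content_model_to_regex(model).fullmatch(sequence) is not None


POEM_MODEL = "(title?, line+)"  # assumed: optional title, then one or more lines

print(is_valid(["title", "line", "line"], POEM_MODEL))
print(is_valid(["line"], POEM_MODEL))
print(is_valid(["title"], POEM_MODEL))
```

The same correspondence is what makes the markup "rigorous": a content model is a regular expression over element names, so checking a document against it is a well-understood operation.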
DTD, SGML & HTML
Hypertext Markup Language (HTML) (cont.)
HTML
• HTML has a specific set of tags that allow:
– the structure of a document to be described (e.g., <h1> - heading)
– links to other documents on the web to be defined (e.g., <a> - anchor)
– some control of presentation (e.g., fonts).
• To learn HTML you should start creating simple files using notepad.
<html>
<head>
<title> My Page </title>
</head>
<body>
<h1> My Page </h1>
<p> This is great </p>
</body>
</html>
Uses Tags: html, head, title, body, h1, p
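The sample page is a small hierarchy, and that can be shown by walking it with Python's standard `html.parser` and recording how deeply each tag is nested. This is a sketch only: the class `OutlineBuilder` is a hypothetical helper, and it assumes every element has an explicit end tag, which holds for this page but not for HTML in general.

```python
from html.parser import HTMLParser


class OutlineBuilder(HTMLParser):
    """Record (depth, tag) for every start tag, assuming all end tags are present."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []

    def handle_starttag(self, tag, attrs):
        self.outline.append((self.depth, tag))
        self.depth += 1

    def handle_endtag(self, tag):
        self.depth -= 1


PAGE = """<html>
<head>
<title> My Page </title>
</head>
<body>
<h1> My Page </h1>
<p> This is great </p>
</body>
</html>"""

builder = OutlineBuilder()
builder.feed(PAGE)
builder.close()

# Print the tags indented by nesting depth.
for depth, tag in builder.outline:
    print("  " * depth + tag)
```

The tags appear in the order html, head, title, body, h1, p, with head and body one level inside html.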
Basic HTML tags
Tags and Elements
Elements and Attributes
Presentational markup
URLs: Uniform Resource Locators
The Anchor Tag; Href Attribute; Named anchors
more HTML tags…
HTML Versions and Validity
Document Type Declaration & Character encoding
Validators
HTML Conclusion
XHTML - Making HTML XML Compatible!
HTML 4 vs XHTML
HTML 4.01:
<!DOCTYPE … HTML 4.01 ..>
<html>
…
<body>
<h1> My HTML Page </H1>
<P> A paragraph
<UL>
<li> a list item
</UL>
</body>
</html>
validate
XHTML:
<!DOCTYPE … XHTML ..>
<html>
…
<body>
<h1> My HTML Page </h1>
<p> A paragraph </p>
<ul>
<li> a list item </li>
</ul>
</body>
</html>
validate
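One way to see the difference between the two versions: the XHTML body is well-formed XML, so a generic XML parser accepts it, while the HTML 4 body (mixed-case tags, omitted end tags) is rejected by the same parser even though it is valid HTML 4. The sketch below uses Python's standard `xml.etree.ElementTree` on the body part only, since the DOCTYPE and head are elided on the slide. The helper `is_well_formed` is hypothetical, and it checks well-formedness only, not validity against a DTD.

```python
import xml.etree.ElementTree as ET

HTML4_BODY = """<body>
<h1> My HTML Page </H1>
<P> A paragraph
<UL>
<li> a list item
</UL>
</body>"""

XHTML_BODY = """<body>
<h1> My HTML Page </h1>
<p> A paragraph </p>
<ul>
<li> a list item </li>
</ul>
</body>"""


def is_well_formed(markup):
    """Return True if an XML parser can read the markup without error."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


print("HTML 4 body is well-formed XML:", is_well_formed(HTML4_BODY))
print("XHTML body is well-formed XML:", is_well_formed(XHTML_BODY))
```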
XHTML vs HTML 4
The XHTML document head
• Declaring the XHTML file encoding
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8"/>
</head>
• XHTML documents must have a DTD
– strict: full compliance with XHTML
– transitional: allows some presentational markup
– frameset: pages with frames
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
Extensible Markup Language (XML)
XML Schema
XML Schema, example
XSLT, Extensible Stylesheet Language Transformations
• a declarative, XML-based language
• used for the transformation of XML documents.
• use XSLT to convert XML data into HTML or XHTML
• Typically transform XML into XSL Formatting Objects
– Which are then translated into PDF, PostScript etc.
Stylesheets - CSS (Cascading Stylesheets)
• Stylesheets control presentation.
• Main stylesheet language is CSS
• Stylesheets specify how elements should be displayed
body {background-color: blue}
h1 {color: red; font-family: times}
• Placed where?
– style instructions at top of your HTML file
<style type="text/css">
body {background-color: blue}
h1 {color: red}
</style>
– Or external file
<link rel="stylesheet" type="text/css" href="mystyle.css">
o Good for consistent appearance across site
o Conflicts between internal and external rules of equal specificity: the later one wins (the internal one, when <style> follows the <link>)
More on CSS
• Syntax:
– Style sheets consist of a list of rules
– Each rule consists of one or more comma-separated selectors and a declaration block
– A declaration block consists of a list of semicolon-separated declarations in curly braces
– Each declaration consists of a property, a colon (:) and a value.
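The syntax above is regular enough to take apart with a few lines of Python. The following is a sketch that follows the four statements literally (rules, selectors, declaration block, declarations); the function `parse_css` is hypothetical, and the selector list `h1, h2` in the sample sheet is an assumption added to show comma-separated selectors. It does not cope with at-rules, nested blocks, or braces and semicolons inside quoted strings, so it is not a real CSS parser.

```python
import re


def parse_css(css):
    """Parse a simple style sheet into a list of (selectors, declarations) rules."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.DOTALL)  # drop /* comments */
    rules = []
    # A rule is: selector text, then a declaration block in curly braces.
    for match in re.finditer(r"([^{}]+)\{([^{}]*)\}", css):
        selectors = [s.strip() for s in match.group(1).split(",") if s.strip()]
        declarations = {}
        # Declarations are separated by semicolons: property ':' value.
        for declaration in match.group(2).split(";"):
            if ":" in declaration:
                prop, value = declaration.split(":", 1)
                declarations[prop.strip()] = value.strip()
        rules.append((selectors, declarations))
    return rules


SHEET = """
body {background-color: blue}
h1, h2 /* headings */ {color: red; font-family: times}
"""

rules = parse_css(SHEET)
for selectors, declarations in rules:
    print(selectors, declarations)
```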
CSS example
p /* paragraph */
{ font-family: "Garamond", serif; }
h2 /* heading 2 */
{ font-size: 110%; color: red; background: white; }
p.Leftmargin
{margin-left: 2cm}
img
{ float: right; border: 1px dotted black; margin: 0px 0px 15px 20px; }
Advantages of using CSS
FORM HTML tag
INPUT tag of FORM
• Several kinds of form elements can be defined using the INPUT tag,
– Uses TYPE attribute to indicate (e.g.) button, checkbox, and so on.
INPUT TYPE="BUTTON"
INPUT TYPE="CHECKBOX"
INPUT TYPE="FILE"
INPUT TYPE="HIDDEN"
INPUT TYPE="IMAGE"
INPUT TYPE="PASSWORD"
INPUT TYPE="RADIO"
INPUT TYPE="RESET"
INPUT TYPE="SUBMIT"
INPUT TYPE="TEXT"
<HTML><HEAD>
<TITLE>Form example .. tell me your name </TITLE>
</HEAD>
<BODY>
<H2>Who are you?</H2>
<FORM
METHOD=POST
ACTION="mailto:j.c.roberts@somehost.ac.uk">
<P>Enter your name:<INPUT NAME="theName"></P>
<P><INPUT TYPE="submit"></P>
</FORM>
</BODY>
</HTML>
Enter some text, more fields
<html><head>
<title>Form example .. tell me your name </title>
</head>
<body>
<h2>Who are you?</h2>
<form
method="post"
action="mailto:j.c.roberts@somehost.ac.uk"
enctype="text/plain">
<p>Enter your name:
<input name="theName">
</p>
<p>Enter your age:
<input name="theAge" size="3" maxlength="3">
</p>
<p>Enter your address:<input name="theAddress" type="text"></p>
<p><input type="submit"></p>
</form>
</body>
</html>
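What the browser sends when this form is submitted depends on the enctype. The sketch below contrasts the default encoding, application/x-www-form-urlencoded, with the text/plain encoding requested in the form, using the three field names from the example. The field values are made up for illustration, and `encode_text_plain` is a hypothetical helper that follows the usual browser behaviour of one name=value pair per line; it is not a library function.

```python
from urllib.parse import urlencode

# Field names come from the form above; the values are invented sample input.
fields = [
    ("theName", "Ada Lovelace"),
    ("theAge", "36"),
    ("theAddress", "12 High St"),
]

# Default enctype: pairs joined with '&', spaces become '+'.
urlencoded_body = urlencode(fields)


def encode_text_plain(pairs):
    """Sketch of enctype="text/plain": one name=value pair per CRLF-terminated line."""
    return "".join(f"{name}={value}\r\n" for name, value in pairs)


plain_body = encode_text_plain(fields)

print(urlencoded_body)
print(plain_body)
```

The text/plain form is easier to read in an e-mail, which is presumably why the example uses it with a mailto: action, but it cannot be decoded reliably if a value itself contains a line break.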
Radio buttons...
Setting and Resetting
<FORM METHOD=POST
ACTION="http://www.cs.bangor.ac.uk/cgi-bin/post-query">
Forms
Summary
The Structured web