
Department of Computer Science Institute for System Architecture, Chair for Computer Networks

Semantic Web
Classical Web approach

• The classical web and its extensions are designed to present
information to human beings and not to be processed by computer
programs
• Resources are connected with each other via hyperlinks
• There are two main problems:
1. Search is inefficient and results are often wrong or incomplete
(e.g. multistage search is not possible)
2. It is not possible to automate tasks in a simple manner based on
information published in the web

[Figure: classical web. Elements shown: publisher, databases, web content ((X)HTML, XML, AJAX applications …), search engine, user; web content is presented in a web browser]
Example for a problem…

[Figure: two content examples, each marked “Advertisement?” / “Useful information?”]

How can a computer program “know”?

There needs to be a universal way to add information about information

→ Pages have to be annotated with metadata
The Semantic Web vision
• By adding additional information about data, web content becomes
machine processable and thus computer programs and especially
software agents can do special tasks based on the published
information
• Resources are not only connected by hyperlinks but by semantics and
semantic links
• Applications combine information from different sources and do
reasoning in order to generate additional information

[Figure: annotated websites describing resources such as a person, a book and a web service; a search engine, applications, the user and a software agent work with them]
Simple scenario

[Figure: an agent acting for the user, connected to an appointment calendar, a search engine, additional data and the hospital's website; the arrows are numbered 1 to 6]

1. “I have some problems with my back - contact a doctor”
2. Check for possible dates
3. Ask for a good orthopaedist in a radius of 50 km
4. Find out the consultation hour
5. Make an appointment, November 09, 3.00 p.m.
6. Notification about appointment for user and calendar
Necessary components

• What is necessary to make complex data machine
processable in a standardised way?
1. Language for data exchange
• XML
2. Identifiers for arbitrary resources
• Uniform Resource Identifier Reference (URIref)
3. Standardised data model to describe / connect resources
• Resource Description Framework (RDF)
4. Some way to define the allowed vocabulary and
restrictions for the data model
• RDF Schema, Web Ontology Language (OWL)
5. A possibility to query for resources / information
• SPARQL Protocol and RDF Query Language (SPARQL)
6. Logic and reasoning approaches to combine and deduce
information (not discussed here)
• OWL, classical inference mechanisms

Semantic Web stack

[Figure: Semantic Web stack, listed from top to bottom]
User interface & applications
Trust
Proof
Unifying logic
SPARQL | OWL | Rules: RIF
RDF Schema
Data interchange: RDF
XML
URI | Unicode
(Encryption and Signature are shown beside the layers)

RIF: Rule Interchange Format
Allows uniform representation of rule systems for automated
inference of further knowledge from existing knowledge that is
represented in RDF

Unicode:
A character set that defines code points for the characters of most
of the world's written languages; e.g. XML uses UTF-8 (a
variable-length Unicode encoding with 1 to 4 bytes per character)
as default

→ Semantic Web specifications / standards are released by the W3C
Resource Identifier

• Concept of Uniform Resource Identifier (URI) makes it
possible to identify
– Individuals
– Properties
– Kinds of things
– Values of properties etc.
• URI = scheme : scheme specific part

– http://www.example.com
– urn:isbn:3540205683
– mailto:person@example.com

URN: Uniform Resource Name
A special URI (with the scheme ‘urn’) used to represent things such as
ISBNs or RFCs

• URIs are extended to URI references (URIrefs) by the optional
use of fragment identifiers to identify a portion of a resource:
– URI [+ ‘#’ + fragment identifier] = URIref
– E.g. http://www.hospital.org/departments#orthopaedy
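The relation between a URI, its scheme and a URIref can be sketched with Python's standard library. This is only an illustration under the assumption that the standard `urllib.parse` helpers are an acceptable stand-in for a URI processor; the helper names `split_uriref` and `scheme_of` are hypothetical and not part of the slides.

```python
from urllib.parse import urldefrag, urlsplit


def split_uriref(uriref):
    """Split a URI reference into (URI, fragment identifier)."""
    result = urldefrag(uriref)
    return result.url, result.fragment


def scheme_of(uri):
    """Return the scheme, i.e. the part before the first ':'."""
    return urlsplit(uri).scheme


uri, fragment = split_uriref("http://www.hospital.org/departments#orthopaedy")
print(uri)       # the URI without the fragment identifier
print(fragment)  # the fragment identifier
print(scheme_of("mailto:person@example.com"))
```

Joining the two parts again with ‘#’ gives back the original URIref, which mirrors the rule URI [+ ‘#’ + fragment identifier] = URIref.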
RDF Model

• RDF is a language for representing information about
resources, especially for web content metadata
• A statement about a resource is in the form of a
subject-predicate-object expression (a so-called RDF triple)
• Due to this it is possible to make statements like
– http://www.somedomain.com/index.html has an author whose value is
Arthur Dent
– http://www.hospital.org/departments#orthopaedy has consultation hour
at 03.00 p.m.

subject (resource that has a special property): http://www.somedomain.com/index.html
predicate (property): http://www.somedomain.com/term#author
object (value of the property): Arthur Dent
RDF Model

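• Object of one statement can be subject of another
• Possibility of combining statements to more complex ones

[Figure: RDF graph, listed as subject → predicate → object]
http://www.hospital.org/departments#orthopaedy → http://www.hospital.org/term#consultationHour → 03.00 p.m.
http://www.hospital.org/departments#orthopaedy → http://www.hospital.org/term#hasHead → http://www.hospital.org/staffid#2342
http://www.hospital.org/staffid#2342 → http://www.hospital.org/term#name → Juri Schiwago
http://www.hospital.org/staffid#2342 → http://www.hospital.org/term#age → "42"^^<http://www.w3.org/2001/XMLSchema#integer>
http://www.hospital.org/staffid#2342 → http://www.hospital.org/term#title → Dr.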
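The idea that the object of one statement can be the subject of another can be sketched by storing triples as plain tuples and following a link. This is a minimal sketch, not an RDF library; the helper names `objects` and `head_names` and the tuple used for the typed literal are assumptions made for illustration.

```python
HOSP = "http://www.hospital.org/"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

# Each statement is a (subject, predicate, object) tuple.
# A typed literal is modelled as a (lexical form, datatype) pair (assumption).
triples = [
    (HOSP + "departments#orthopaedy", HOSP + "term#consultationHour", "03.00 p.m."),
    (HOSP + "departments#orthopaedy", HOSP + "term#hasHead", HOSP + "staffid#2342"),
    (HOSP + "staffid#2342", HOSP + "term#name", "Juri Schiwago"),
    (HOSP + "staffid#2342", HOSP + "term#age", ("42", XSD_INTEGER)),
    (HOSP + "staffid#2342", HOSP + "term#title", "Dr."),
]


def objects(graph, subject, predicate):
    """Return all objects of statements with the given subject and predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def head_names(graph, department):
    """Follow hasHead, then name: the object of one triple is the subject of the next."""
    names = []
    for head in objects(graph, department, HOSP + "term#hasHead"):
        names.extend(objects(graph, head, HOSP + "term#name"))
    return names


print(head_names(triples, HOSP + "departments#orthopaedy"))
```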
RDF/XML

• The most common way of serialising RDF information is the use
of XML

<?xml version="1.0"?>
<!-- XML declaration above; the entity below lets &xsd; abbreviate the XML Schema namespace -->
<!DOCTYPE rdf:RDF [<!ENTITY xsd "http://www.w3.org/2001/XMLSchema#">]>

<!-- namespace declarations -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:term="http://www.hospital.org/term#"
         xml:base="http://www.hospital.org/staffid">

  <!-- RDF statements -->
  <rdf:Description rdf:about="http://www.hospital.org/departments#orthopaedy">
    <term:consultationHour>03.00 p.m.</term:consultationHour>
    <term:hasHead rdf:resource="#2342"/>
  </rdf:Description>
  <!-- rdf:about is used because an rdf:ID value must not start with a digit -->
  <rdf:Description rdf:about="#2342">
    <term:name>Juri Schiwago</term:name>
    <term:age rdf:datatype="&xsd;integer">42</term:age>
    <term:title rdf:datatype="&xsd;string">Dr.</term:title>
  </rdf:Description>
</rdf:RDF>
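How such a serialisation maps back to triples can be sketched with Python's standard XML parser. This is not a conforming RDF/XML parser: it only handles `rdf:Description` elements with `rdf:about`, plain literals and `rdf:resource` links, and it assumes absolute URIs (no `xml:base`, no entities). The function name `extract_triples` and the shortened document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

DOCUMENT = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:term="http://www.hospital.org/term#">
  <rdf:Description rdf:about="http://www.hospital.org/departments#orthopaedy">
    <term:consultationHour>03.00 p.m.</term:consultationHour>
    <term:hasHead rdf:resource="http://www.hospital.org/staffid#2342"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.hospital.org/staffid#2342">
    <term:name>Juri Schiwago</term:name>
  </rdf:Description>
</rdf:RDF>"""


def extract_triples(rdf_xml):
    """Turn each property element of an rdf:Description into one triple."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree writes qualified names as "{namespace}local";
            # namespace + local name is the predicate URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            resource = prop.get(f"{{{RDF_NS}}}resource")
            obj = resource if resource is not None else (prop.text or "").strip()
            triples.append((subject, predicate, obj))
    return triples


for triple in extract_triples(DOCUMENT):
    print(triple)
```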
RDF Schema

• RDF only defines syntax for data representation
• RDFS is a language to define RDF vocabulary and to structure
RDF resources by:
1. Defining classes
2. Defining properties
3. Describing class hierarchies
4. Describing property hierarchies
5. Specifying the domain (subject's class) and the range
(object's class) of properties

Main language constructs

Construct       Meaning
Class           Defines a class
Property        Defines a property
subClassOf      Defines inheritance hierarchy of classes (transitive)
subPropertyOf   Defines inheritance hierarchy of properties (transitive)
domain          Any resource that has a given property is defined as
                instance of one or more specified classes
range           The values of a property are defined to be instances of
                one or more specified classes
RDFS example

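[Figure: RDFS layer (classes and properties) above the RDF layer (instances)]

RDFS classes:
– Administration staff member rdfs:subClassOf Staff member
– Technical support staff member rdfs:subClassOf Staff member
– Academic staff member rdfs:subClassOf Staff member

RDFS properties:
– hasHead: rdfs:domain department, rdfs:range Academic staff member
– hasName: rdfs:domain a staff member class, rdfs:range &xsd;string

RDF instances (linked to their classes by type):
– http://www.hospital.org/staffid#2342 hasName Juri Schiwago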
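What a reasoner can derive from subClassOf, domain and range can be sketched in a few lines. This is a simplified illustration, not the RDFS entailment rules in full: every class is assumed to have at most one direct superclass, and the dictionaries and the function names `superclasses` and `infer_types` are hypothetical.

```python
# Schema knowledge taken from the example: class hierarchy, domain and range.
SUBCLASS_OF = {
    "AdministrationStaffMember": "StaffMember",
    "TechnicalSupportStaffMember": "StaffMember",
    "AcademicStaffMember": "StaffMember",
}
DOMAIN = {"hasHead": "Department"}
RANGE = {"hasHead": "AcademicStaffMember"}


def superclasses(cls):
    """Walk the subClassOf chain upwards (subClassOf is transitive)."""
    result = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.append(cls)
    return result


def infer_types(statements):
    """Derive class membership of subjects (domain) and objects (range)."""
    types = {}
    for subject, prop, obj in statements:
        if prop in DOMAIN:
            types.setdefault(subject, set()).add(DOMAIN[prop])
        if prop in RANGE:
            types.setdefault(obj, set()).add(RANGE[prop])
    for classes in types.values():
        for cls in list(classes):
            classes.update(superclasses(cls))
    return types


print(infer_types([("orthopaedy", "hasHead", "2342")]))
```

From the single statement that orthopaedy has head 2342, the sketch derives that orthopaedy is a Department and that 2342 is an AcademicStaffMember and therefore also a StaffMember.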
RDF Schema
<?xml version="1.0"?>

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">

  <!-- long form: a resource typed as rdfs:Class -->
  <rdf:Description rdf:ID="StaffMember">
    <rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
  </rdf:Description>

  <!-- abbreviation of the same pattern -->
  <rdfs:Class rdf:ID="Department"/>

  <rdfs:Class rdf:ID="AcademicStaffMember">
    <rdfs:subClassOf rdf:resource="#StaffMember"/>
  </rdfs:Class>

  <!-- references to locally defined classes need the leading '#' -->
  <rdf:Property rdf:ID="hasHead">
    <rdfs:domain rdf:resource="#Department"/>
    <rdfs:range rdf:resource="#AcademicStaffMember"/>
  </rdf:Property>

  ...

</rdf:RDF>
Instantiating RDFS classes

<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [<!ENTITY xsd "http://www.w3.org/2001/XMLSchema#">]>

<!-- xmlns:term gives the location of the defined RDF Schema -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:term="http://www.hospital.org/term#">

  <term:Department rdf:ID="orthopaedy">
    <term:hasHead rdf:resource="#2342"/>
  </term:Department>

  <!-- rdf:about is used because an rdf:ID value must not start with a digit -->
  <term:AcademicStaffMember rdf:about="#2342">
    <term:hasName rdf:datatype="&xsd;string">Juri Schiwago</term:hasName>
  </term:AcademicStaffMember>

</rdf:RDF>
Shortcomings of RDFS

• Although RDFS provides basic capabilities to describe RDF
vocabulary, it still lacks sufficient expressive power
• It is not possible to describe e.g.:
– Cardinality constraints on properties
• “a department has one and only one head”
– Disjoint classes
• “a technical support staff member cannot be an
academic staff member”
– Inverse properties
• “hasHead is the inverse property of isHeadOf”
– A class by the use of other classes
• “the class ‘staff member’ is defined by the union of
the classes ‘Administration staff member’, ‘Academic
staff member’ and ‘Technical support staff member’ ”

• For more complex relationships a further layer is necessary:
→ Web Ontology Language (OWL)
OWL

• OWL is a logic based language to define and instantiate formal
ontologies
• Designed as extension of RDFS
• An ontology describes existing entities in a special domain and
how these entities are related
• OWL supports reasoning with generic tool support

[Figure: layers from structure / syntax (bottom) to connected knowledge / semantics (top), listed from top to bottom]
OWL    ontology
RDFS   abstraction
RDF    relational model
XML    data exchange
OWL

• Some important language constructs:

Construct                   Meaning
Ontology                    Defines assertions about the ontology,
                            especially import of other ontologies
equivalentClass             Equivalence between classes
allValuesFrom               Universal quantification
someValuesFrom              Existential quantification
FunctionalProperty          P(x,y) and P(x,z) implies y = z
InverseFunctionalProperty   P(y,x) and P(z,x) implies y = z
sameAs                      Defines identical individuals
inverseOf                   P1(x,y) iff P2(y,x)
SymmetricProperty           P(x,y) iff P(y,x)
TransitiveProperty          P(x,y) and P(y,z) implies P(x,z)
disjointWith                Disjointness of classes
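The rules given for inverseOf, SymmetricProperty and TransitiveProperty can be sketched as a small forward-chaining loop. This is a toy illustration under the assumption that facts are stored as (P, x, y) tuples for P(x,y); it is not how an OWL reasoner is implemented, and the function name `apply_property_rules` and the property `partOf` are hypothetical.

```python
def apply_property_rules(facts, inverse_of=None, symmetric=(), transitive=()):
    """Repeat the three rules until no new fact (P, x, y) can be derived."""
    inverse_of = dict(inverse_of or {})
    # inverseOf works in both directions: P1(x,y) iff P2(y,x)
    inverse_of.update({p2: p1 for p1, p2 in list(inverse_of.items())})
    facts = set(facts)
    while True:
        new = set()
        for p, x, y in facts:
            if p in inverse_of:
                new.add((inverse_of[p], y, x))
            if p in symmetric:
                new.add((p, y, x))
            if p in transitive:
                for q, y2, z in facts:
                    if q == p and y2 == y:
                        new.add((p, x, z))
        if new <= facts:          # fixed point reached
            return facts
        facts |= new


closure = apply_property_rules(
    {
        ("hasHead", "orthopaedy", "2342"),
        ("partOf", "ward", "department"),
        ("partOf", "department", "hospital"),
    },
    inverse_of={"hasHead": "isHeadOf"},
    transitive={"partOf"},
)
for fact in sorted(closure):
    print(fact)
```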
OWL
<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [<!ENTITY xsd "http://www.w3.org/2001/XMLSchema#">]>
<rdf:RDF
    xmlns:owl = "http://www.w3.org/2002/07/owl#"
    xmlns:rdf = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs= "http://www.w3.org/2000/01/rdf-schema#">
  <owl:Ontology rdf:about="">
    <owl:priorVersion>
      <owl:Ontology rdf:about="http://www.somedomain.org/oldontology"/>
    </owl:priorVersion>
    <owl:imports rdf:resource="http://www.somedomain.org/furtherontology"/>
    <rdfs:label>Hospital Ontology</rdfs:label>
  </owl:Ontology>
  ...
  <owl:Class rdf:ID="AcademicStaffMember">
    <rdfs:subClassOf rdf:resource="#Person" />
    <owl:disjointWith rdf:resource="#AdministrativeStaffMember"/>
  </owl:Class>
  ...
  <owl:Class rdf:ID="Department">
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="#hasHead"/>
        <owl:minCardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:minCardinality>
      </owl:Restriction>
    </rdfs:subClassOf>
    ...
  </owl:Class>
</rdf:RDF>
OWL sublanguages

• To fulfil different requirements OWL is divided into three sublanguages
with increasing expressive power:
1. OWL Lite
• Subset of full OWL
• Supports classification hierarchies and simple constraint
features, e.g. cardinality constraints with the values 0 and 1
2. OWL Description Logic (DL)
• More powerful subset of OWL
• Decidable fragment of first order logic; includes all OWL
constructs but defines special restrictions such as type
separation (e.g. an individual cannot be a class)
3. OWL Full
• The complete Web Ontology Language
• Allows free mixing with RDF Schema
• Does not enforce a strict separation of classes, properties,
individuals and data values
• Language is undecidable
→ Computability is not guaranteed

[Figure: nested sets, OWL Lite inside OWL DL inside OWL Full]
SPARQL

• SPARQL is a query language for RDF data and reduces the
complexity of information extraction in the semantic web
• Query concept is comparable to the Structured Query
Language (SQL)
• A typical query consists of a SELECT clause and a WHERE clause
• Variables have the prefix “?” or “$”

# introduction of abbreviations
PREFIX hosp: <http://www.hospital.org/term#>
PREFIX dep: <http://www.hospital.org/departments#>
SELECT ?hour
# FROM specifies an RDF graph
FROM <http://www.hospital.org/data.rdf>
WHERE { dep:orthopaedy hosp:consultationHour ?hour }

Result:
hour
03.00 p.m.
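How the WHERE clause of such a query is evaluated can be sketched as matching triple patterns against a graph and binding variables. This is a toy evaluator, not a SPARQL engine: it only handles variables with the prefix “?”, treats all terms as strings, and the function names `match` and `select` are hypothetical.

```python
TERM = "http://www.hospital.org/term#"
DEP = "http://www.hospital.org/departments#"
STAFF = "http://www.hospital.org/staffid#"

graph = [
    (DEP + "orthopaedy", TERM + "consultationHour", "03.00 p.m."),
    (DEP + "orthopaedy", TERM + "hasHead", STAFF + "2342"),
    (STAFF + "2342", TERM + "name", "Juri Schiwago"),
]


def match(pattern, triple, binding):
    """Extend the binding so that the pattern equals the triple, or return None."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):                      # variable
            if extended.setdefault(term, value) != value:
                return None                           # conflicts with earlier binding
        elif term != value:                           # constant must match exactly
            return None
    return extended


def select(variables, where, graph):
    """Evaluate the patterns one after another and project the variables."""
    solutions = [{}]
    for pattern in where:
        next_solutions = []
        for binding in solutions:
            for triple in graph:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return [tuple(solution[v] for v in variables) for solution in solutions]


print(select(["?hour"], [(DEP + "orthopaedy", TERM + "consultationHour", "?hour")], graph))
```

A second pattern that reuses a variable acts as a join, for example asking for the name of the head of the department with the patterns (dep:orthopaedy, hasHead, ?head) and (?head, name, ?name).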
Example Schema: Friend of a Friend

• “Friend of a Friend” (FOAF) is a project to build social networks by
describing persons and their relationships among each other

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:name>Vincent van Gogh</foaf:name>
    <foaf:mbox rdf:resource="mailto:gogh@example.com"/>
    <foaf:homepage rdf:resource="http://www.vincentshome.net"/>
    <foaf:workplaceHomepage rdf:resource="http://www.vincentatwork.com"/>
    <foaf:weblog rdf:resource="http://www.vincentshome.org/blog"/>
    <foaf:phone rdf:resource="tel:00330642623623"/>
    <foaf:img rdf:resource="http://www.vincentshome.net/mypicture.jpg" />
    <foaf:jabberID>vangogh@jabber.org</foaf:jabberID>
    <foaf:knows>
      <foaf:Person>
        <foaf:name>Theo van Gogh</foaf:name>
        <foaf:mbox_sha1sum>cd4236a21ea61342e62341bd2d23e42eac2313dd</foaf:mbox_sha1sum>
        <rdfs:seeAlso rdf:resource="http://theovangogh.org/foaf.rdf"/>
      </foaf:Person>
    </foaf:knows>
  </foaf:Person>
</rdf:RDF>
Ontology mapping

• Two different book stores have published the following RDF
data:

Store 1:
isbn:3257204205 → term:title → A brief history of time
isbn:3257204205 → term:author → (author)
(author) → term:name → Stephen Hawking
(author) → term:homepage → http://www.hawking.org.uk

Store 2:
isbn:3257204205 → ont:titel → A brief history of time
isbn:3257204205 → ont:autor → (author)
(author) → ont:name → Stephen Hawking
(author) → ont:internetseite → http://www.hawking.org.uk

How can software deduce that
• author = autor
• homepage = internetseite
• title = titel?
Ontology mapping

→ Usage of (de-facto-)standards for schemes/ontologies makes
unification possible
→ Below “name” and “homepage” are represented by using FOAF

Store 1:
isbn:3257204205 → term:title → A brief history of time
isbn:3257204205 → term:author → (author, a foaf:Person)
(author) → foaf:name → Stephen Hawking
(author) → foaf:homepage → http://www.hawking.org.uk

Store 2:
isbn:3257204205 → ont:titel → A brief history of time
isbn:3257204205 → ont:autor → (author, a foaf:Person)
(author) → foaf:name → Stephen Hawking
(author) → foaf:homepage → http://www.hawking.org.uk
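The effect of a shared vocabulary can be sketched with two sets of triples. Statements that use FOAF coincide in both stores without further work, while the remaining store-specific predicates still need a mapping. This is an illustration only: the mapping table is hand-written, the identifier `AUTHOR` for the author node is a hypothetical assumption (the slides do not name that node), and the function `normalise` is not part of any library.

```python
BOOK = "isbn:3257204205"
AUTHOR = "author-node"  # hypothetical shared identifier for the author resource

store_1 = {
    (BOOK, "term:title", "A brief history of time"),
    (BOOK, "term:author", AUTHOR),
    (AUTHOR, "foaf:name", "Stephen Hawking"),
    (AUTHOR, "foaf:homepage", "http://www.hawking.org.uk"),
}
store_2 = {
    (BOOK, "ont:titel", "A brief history of time"),
    (BOOK, "ont:autor", AUTHOR),
    (AUTHOR, "foaf:name", "Stephen Hawking"),
    (AUTHOR, "foaf:homepage", "http://www.hawking.org.uk"),
}

# Statements expressed with the shared FOAF vocabulary are identical in both stores.
shared = store_1 & store_2

# Hand-written mapping for the predicates that are still store-specific (assumption).
EQUIVALENT = {"ont:titel": "term:title", "ont:autor": "term:author"}


def normalise(triples, mapping):
    """Rewrite predicates according to the mapping; unknown predicates stay as they are."""
    return {(s, mapping.get(p, p), o) for s, p, o in triples}


print(sorted(shared))
print(normalise(store_2, EQUIVALENT) == store_1)
```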
Web content annotation possibilities

1. Internally linked annotations
W3C recommends the use of a <link>-tag in the (X)HTML header:
<link rel="meta" type="application/rdf+xml" href="data.rdf"/>
[Figure: example.html links to data.rdf]

2. Externally linked annotations
RDF data and external links have to be accessible on a server
[Figure: an external mapping connects example.html and data.rdf]

3. Embedded annotations
RDF data can be embedded:
– in the HTML header
– in an <object> or <script> tag
– in an HTML comment
– by using RDFa
[Figure: example.html contains some RDF statements]
Data integration
[Figure: a personal information base (appointment calendar, personal interests, past activities, social connections) and public information bases (RDBMS, HTML document, XML) are represented as RDF; logic based on ontology1.owl and ontology2.owl connects them to arbitrary software. Arrow legend: includes, information flow, uses]

By using OWL and RDF it is possible to bring data from
different data sources to a uniform representation
which can be used by arbitrary software that is aware of
the semantics (e.g. knows the OWL description)
Further approaches

• The vision of a Semantic Web was defined more than a
decade ago
• Apart from dedicated applications, e.g. in the domain of
bioinformatics, a full realization is not in sight
• Central drawbacks of the proposed approach for a Semantic
Web:
• Complex standards, lack of tool support, ontology
matching problems
• Several simplified approaches have been developed and are
already applied:

Pragmatic approaches for realizing a simplified Semantic Web:
RDFa, Microformats, Microdata
RDFa

• RDFa adds a set of attribute-level extensions to XHTML for annotating
information within Web documents
• Defines a set of attributes which can be used for adding
semantic information to XHTML tags
• Examples for attributes:
• rel: whitespace separated list of URIs, used for expressing
relationships between two resources (predicates in RDF
terminology)
• href: a URI for expressing the partner resource of a relationship
(resource object in RDF terminology)
• about: a URI, used for stating what the data is about (subject in
RDF terminology)
• property: a whitespace separated list of URIs, used for
expressing relationships between a subject and some literal text
(also a predicate in RDF terminology)
• typeof: a whitespace separated list of URIs that indicate the
RDF type(s) to associate with a subject
RDFa example

<div vocab="http://xmlns.com/foaf/0.1/" typeof="Person">
  <p>
    <span property="name">Alice Birpemswick</span>,
    Email: <a property="mbox" href="mailto:alice@example.com">alice@example.com</a>,
    Phone: <a property="phone" href="tel:+1-617-555-7332">+1 617.555.7332</a>
  </p>
  <ul>
    <li property="knows" typeof="Person">
      <a property="homepage" href="http://example.com/bob/"><span property="name">Bob</span></a>
    </li>
    <li property="knows" typeof="Person">
      <a property="homepage" href="http://example.com/eve/"><span property="name">Eve</span></a>
    </li>
    <li property="knows" typeof="Person">
      <a property="homepage" href="http://example.com/manu/"><span property="name">Manu</span></a>
    </li>
  </ul>
</div>

Example taken from: http://www.w3.org/TR/xhtml-rdfa-primer/
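How software can read such attributes can be sketched with Python's standard HTML parser. This is far from a conforming RDFa processor: it ignores `vocab`, `typeof` nesting and subjects, and simply collects each `property` together with either the `href` value or the following text. The class name `PropertyCollector` and the shortened markup are assumptions made for illustration.

```python
from html.parser import HTMLParser


class PropertyCollector(HTMLParser):
    """Collect (property, value) pairs from RDFa-annotated markup."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._pending = None  # property still waiting for its literal text

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("property")
        if prop is None:
            return
        if "href" in attrs:
            # the object is a resource given by href
            self.pairs.append((prop, attrs["href"]))
        elif "typeof" not in attrs:
            # the object is the literal text that follows
            self._pending = prop

    def handle_data(self, data):
        if self._pending and data.strip():
            self.pairs.append((self._pending, data.strip()))
            self._pending = None


MARKUP = """
<div vocab="http://xmlns.com/foaf/0.1/" typeof="Person">
  <span property="name">Alice Birpemswick</span>,
  Email: <a property="mbox" href="mailto:alice@example.com">alice@example.com</a>
</div>
"""

collector = PropertyCollector()
collector.feed(MARKUP)
print(collector.pairs)
```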


Microformats

• Microformats = approach to semantic markup reusing existing
(X)HTML tags to describe metadata and to make recognizable data
items capable of automated processing by software
• No responsible standardization organization for formats; defined by
an open community
• Microformats are available for various areas, such as
events (hCalendar), contact information (hCard), products
(hProduct), reviews (hReview) or news content (hNews)
• Formats are listed at: http://microformats.org/
• Most important (X)HTML attribute: class for describing the type of
information

<ul class="vcard">
  <li class="fn">Smith</li>           <!-- Name -->
  <li class="org">ACME Company</li>   <!-- Associated organization -->
  <li class="tel">1234-4321</li>
  <li>
    <!-- Website associated to entity -->
    <a class="url" href="http://example.com/">http://example.com/</a>
  </li>
</ul>
Microdata

• Microdata is a simple approach to embed semantics directly
into HTML pages proposed by the W3C and embedded into
HTML5
• Two core concepts:
• Item: Entity that is described; content of HTML tag is
marked as item using the attribute itemscope
• Property: name-value pair describing concrete
characteristics of an item; content of HTML tag is marked
as property using the attribute itemprop

<!-- itemscope defines a new item -->
<div itemscope><p>The first customer is called
  <!-- itemprop defines a property with key "name" of the item -->
  <span itemprop="name">Elizabeth</span>.</p>
</div>
<div itemscope><p>The second customer is called
  <span itemprop="name">Daniel</span>.</p>
</div>
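The two core concepts can be sketched as a small extractor that turns each itemscope element into a dictionary of its itemprop values. This is a simplified illustration, not a Microdata implementation: it assumes non-nested items, text-valued properties and markup without void elements such as `<br>` (the depth counter would lose track otherwise). The class name `MicrodataItems` is hypothetical.

```python
from html.parser import HTMLParser


class MicrodataItems(HTMLParser):
    """Collect one dict of name-value pairs per itemscope element."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._depth = 0    # > 0 while the parser is inside an item
        self._prop = None  # itemprop waiting for its text value

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self._depth:
            self._depth += 1
        elif "itemscope" in attrs:
            self.items.append({})  # a new item starts here
            self._depth = 1
        if self._depth and "itemprop" in attrs:
            self._prop = attrs["itemprop"]

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1       # reaches 0 when the item element closes

    def handle_data(self, data):
        if self._prop and data.strip():
            self.items[-1][self._prop] = data.strip()
            self._prop = None


MARKUP = (
    '<div itemscope><p>The first customer is called '
    '<span itemprop="name">Elizabeth</span>.</p></div>'
    '<div itemscope><p>The second customer is called '
    '<span itemprop="name">Daniel</span>.</p></div>'
)

parser = MicrodataItems()
parser.feed(MARKUP)
print(parser.items)
```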
Microdata
• Three further attributes have been introduced:
• itemid: specifies a unique identifier of the item
• itemtype: valid URL of a vocabulary used for items and properties
• itemref: associates properties that are not descendants of the
element with the itemscope attribute with the item (via the id
attributes of the elements that contain these properties)
• Vocabulary is described informally

<div itemscope itemtype="http://data-vocabulary.org/Person"
     itemid="urn:uuid:f3373a7b-4958-4e55-8820-d03a191fb76a"
     itemref="employer">
  The name of the customer is
  <span itemprop="name">Bob Michael</span>.
  His homepage is available here:
  <a href="http://www.example.com" itemprop="url">www.example.com</a>.
  He has got a <span itemprop="title">PhD</span> title from
  <span itemprop="affiliation">ACME University</span>.
</div>
<!-- referenced by itemref above through its id attribute -->
<p id="employer">The customer works at
  <span itemprop="affiliation">ACME Company</span>.
</p>
Conclusion…
[Figure: overview of the components of the Semantic Web]
– Human information access: summarising, sharing, searching, browsing, visualising
– Further consumers: software agent, arbitrary program
– Knowledge engineer: ontology and scheme design and editing tools, (de-facto-)standard ontologies and schemes
– Access: SPARQL, frameworks, ...
– Semantic web repository: RDF, RDFS, OWL (reference information and data)
– Regular web content (<html> … </html>): includes data via RDFa, Microdata and Microformats
– From regular web content: extraction of personalised information, extraction of (semi)structured data, unstructured data
References

Links at W3C:

RDF Primer 1.1            http://www.w3.org/TR/rdf11-primer
OWL Guide                 http://www.w3.org/TR/owl-guide
OWL 2                     http://www.w3.org/TR/owl2-overview/
SPARQL                    http://www.w3.org/TR/sparql11-overview/
OWL-S                     http://www.w3.org/Submission/OWL-S/
RDFa Primer               http://www.w3.org/TR/xhtml-rdfa-primer/
Microdata specification   http://www.w3.org/TR/microdata/
