Development Team: ICT For Libraries Semantic Web, Invisible Web and Deep Web

Paper No: 05 ICT for Libraries
Module: 15 Semantic Web, Invisible Web and Deep Web

Development Team
Principal Investigator & Subject Coordinator: Dr. Jagdish Arora, Director, INFLIBNET Centre, Gandhinagar
Paper Coordinator: Dr Usha Mujoo Munshi, Librarian, Indian Institute of Public Administration
Content Writer: Dr Aditiya, Assistant Professor, Banaras Hindu University
Content Reviewer: Dr Usha Mujoo Munshi, Librarian, Indian Institute of Public Administration

Semantic Web, Invisible Web and Deep Web
I. Objectives

• To attain knowledge about the semantic web, its definition, concept,
history and architecture.
• To get familiarized with various components of semantic web.
• To learn about major tools that are used for developing semantic web.

II. Learning Outcomes


After studying this lesson, learners will attain knowledge about the semantic
web, its definition, concept, history and architecture. They will learn about
the components of a semantic web service, namely the register, decomposer,
reasoner, invoker and matchmaker. They will also learn about the major tools
used for developing the semantic web, viz. Extensible Markup Language (XML),
Resource Description Framework (RDF), Web Ontology Language (OWL), etc.

III. Module Structure


1. Introduction
2. Semantic Web
2.1 Definition
3. History of Semantic Web
4. Architecture of Semantic Web
5. Semantic Web : components and tools
5.1 Components of Semantic Web Service
5.2 Tools for developing Semantic Applications
5.2.1 Extensible Markup Language (XML)
5.2.2 Resource Description Framework (RDF)
5.2.3 Modeling data in RDF
5.2.4 Web Ontology
5.2.5 Web Ontology Language (OWL)
6. Issues and Challenges
7. Promises of Semantic Web
8. Implication and Applicability in Real World
9. Impact on trinity of libraries
10. Summary
11. References

1. Introduction
In the era of information explosion, retrieving information, managing what has
been retrieved and picking out the relevant items from the ocean of available
information are difficult tasks. Finding exactly the information one needs on
the Web is very difficult. New technologies emerge every day to address these
problems. The Semantic Web is one such idea: it aims to improve existing
information retrieval systems by letting machines do more of the work, thereby
reducing human effort.
The Semantic Web is an effort promoted by the World Wide Web Consortium (W3C)
to make information on the Web machine-processable. It is a concept that makes
it possible to organize Web information resources and to use them not only
through syntax and structure but also through the semantics of the concepts
they express. It is an abstract representation of World Wide Web resources
based on a framework known as RDF (Resource Description Framework).

2. Semantic Web
The concept of the Semantic Web has come a long way since its inception. It is
envisaged that applications (search engines or intelligent agents) will not
only understand the semantics of the available information but will also make
devices communicate with one another as and when required. The Semantic Web
therefore holds great promise.

2.1 Definition
According to Tim Berners-Lee, “The Semantic Web is an extension of the current
web in which information is given well-defined meaning, better enabling
computers and people to work in cooperation.”
The term semantics means the study of the meaning expressed by the elements of
a language, characterizable as a symbolic system. The Semantic Web uses
technologies that help machines understand information on the Web, including
both the visible web (information that is indexed in search engine databases)
and the invisible web (information that search engines cannot reach). It
provides search results in a more defined, meaningful and understandable way.
The Semantic Web can also link databases and applications along with the
information they contain, so that the user gets the richest and most relevant
sources.
Thus, we can define the Semantic Web as a Web that can provide semantic search
results. In other words, it can understand the meaning of a searched linguistic
element by analyzing it, and it can present the results in a well-defined way.

3. History of Semantic Web

Tim Berners-Lee invented the Web at CERN in 1989, and it has come a long way
since then. His original vision of the Web was much more ambitious than the
reality of the existing (syntactic) Web. It was also Tim Berners-Lee who
envisaged the Semantic Web: “A new form of Web content that is meaningful to
computers will unleash a revolution of new possibilities”.
Historically, the concept emerged out of successive versions of the Web. The
early Web, also called the static web or Web 1.0, led to a more interactive
version of the Web, which was labeled Web 2.0. The history of semantic
applications, or of the Semantic Web, is therefore part of the development of
the Web itself. Web 2.0, where applications are connected to each other through
web-based ontologies and metadata, is a primitive kind of semantic application
that will lead to the full-blown Semantic Web, or Web 3.0. Tim Berners-Lee
further observed that if the interaction between person and hypertext could be
so intuitive that the machine-readable information space gives an accurate
representation of the state of people's thoughts, interactions and work
patterns, then machine analysis could become a very powerful management tool.

Hence, there is a strong need to develop a system of Web objects in which a
machine can analyze the search text, understand its meaning and context, and
interpret it accordingly in order to automate various procedures.

4. Architecture of Semantic Web
The Semantic Web has a seven-layered architecture built around seven functions,
each of which corresponds roughly to one layer, as follows:

Fig. 1: Architecture of Semantic Web


Starting from the bottom of the diagram, the first layer consists of Unicode
and URI. Unicode is a universal character encoding standard for representing
text in any written script. URI stands for Uniform Resource Identifier, a
standardized form that allows resources to be identified uniquely. There are
different variants of URI, such as URN (Uniform Resource Name), PURL
(Persistent URL), URL (Uniform Resource Locator) and so on. URIs provide an
understandable identification of all resources in a distributed Internet
system.
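
As a small illustration of the difference between a locator and a name, the
sketch below splits two identifiers into their parts using Python's standard
urllib.parse module. The URN shown is only an illustrative ISBN-style name, and
the helper name describe is an assumption made for this example.

```python
from urllib.parse import urlparse

# Two example identifiers. The URN is an illustrative ISBN-style name.
identifiers = [
    "http://www.w3.org/TR/owl-guide/",  # a URL: says where the resource lives
    "urn:isbn:0451450523",              # a URN: names a resource without locating it
]


def describe(uri: str) -> dict:
    """Split a URI into its scheme, authority and path."""
    parts = urlparse(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "path": parts.path,
    }


for uri in identifiers:
    print(uri, "->", describe(uri))
```

The URL carries an authority (the host that serves the resource), while the URN
has none: it only names the resource within the "isbn" namespace.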
The next layer is Extensible Markup Language (XML), a language used for
describing resources in a nested structure, much like HTML (Hypertext Markup
Language). XML is used to define namespaces and to develop XML Schemas in a
standard syntax, forming the very basis of the Semantic Web.
The next layer is the Resource Description Framework (RDF), which describes how
knowledge, an idea or an object is represented as a triple, i.e.
Subject-Predicate-Object.
RDF incorporates metadata about WWW resources and provides a mechanism, RDFS
(RDF Schema), to define taxonomies and ontologies. These taxonomies and
ontologies form the basic constructs for semantic services in the form of
classes and their respective properties.
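
The idea of classes arranged in a taxonomy can be sketched in a few lines of
Python. This is only a simplified illustration of subclass reasoning, not RDFS
itself; the class names, the instance names and the helper functions are all
hypothetical.

```python
# Hypothetical class hierarchy: each key is a class, each value its direct superclass.
SUBCLASS_OF = {
    "Thesis": "Document",
    "Article": "Document",
    "Document": "Resource",
}

# Hypothetical instance data: the class each item was declared to belong to.
INSTANCE_OF = {"thesis42": "Thesis", "paper7": "Article"}


def superclasses(cls: str) -> list[str]:
    """Walk up the hierarchy and return every ancestor of cls, nearest first."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def is_a(item: str, cls: str) -> bool:
    """True if item belongs to cls directly or through a chain of superclasses."""
    declared = INSTANCE_OF[item]
    return declared == cls or cls in superclasses(declared)


print(superclasses("Thesis"))
print(is_a("paper7", "Document"))
```

Nothing states directly that paper7 is a Document; the membership follows from
the taxonomy, which is the kind of inference a schema makes possible.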
For developing large-scale ontologies there is a dedicated language, the Web
Ontology Language (OWL). OWL is derived from description logic and offers more
constructs than RDFS. A construct is an architectural unit, and here these
constructs appear as standardized vocabularies. These vocabularies create a
knowledge structure over which reasoning can be performed using rules and
logic.
The Rule Interchange Format (RIF) and the Semantic Web Rule Language (SWRL)
provide a layer above the ontology layer for reasoning among the various
concepts represented as knowledge constructs. Further, a query layer, SPARQL
(SPARQL Protocol and RDF Query Language), is used to query the whole underlying
architecture using RDF statements and resources. SPARQL is used to query the
RDF data structure (knowledge base) woven together by RDFS and OWL.
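
The way a query is matched against triples can be sketched as follows. This is
not SPARQL itself, only a plain-Python imitation of a single triple pattern in
which None plays the role of a query variable. The "ex:" names and the match
function are assumptions made for illustration.

```python
Triple = tuple[str, str, str]

# Illustrative knowledge base of (subject, predicate, object) triples.
TRIPLES: list[Triple] = [
    ("ex:book1", "ex:author", "ex:ranganathan"),
    ("ex:book1", "ex:title", "Colon Classification"),
    ("ex:book2", "ex:author", "ex:ranganathan"),
    ("ex:book2", "ex:title", "Prolegomena to Library Classification"),
]


def match(s=None, p=None, o=None) -> list[Triple]:
    """Return every triple that fits the pattern; None matches anything."""
    return [
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Roughly what the SPARQL pattern { ?book ex:author ex:ranganathan } asks for.
books = [subject for subject, _, _ in match(p="ex:author", o="ex:ranganathan")]
print(books)
```

A real SPARQL engine joins many such patterns and works over URIs and typed
literals; the sketch shows only the basic matching step.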
Above all this sits an execution layer, which delivers the proof of the logic
applied and builds trust in the relationship between the input given and the
output received. Finally, the user interface is built on top of all these
layers.

5. Semantic Web: components and tools


5.1 Components of Semantic Web Service
The Semantic Web is basically a concept in which the system performs tasks that
normally require human intelligence. The system analyzes the search terms,
understands their meaning and interprets them, rather than simply presenting
matches to users. Computer-based intelligent systems take over the intellectual
work, and humans only supply input in the form of data. The machine
understands the meaning of the data; after processing and arranging it, the
machine gives it a structured format so that reasoning can be performed to
produce more meaningful and understandable output. The following are the
physical components of Semantic Web services:
Register: A register is a place where factual data is stored collectively in
the form of resources or objects. This is the raw state of the data, where no
processing has yet been applied to the object.
Decomposer: This component initiates processing by breaking an object down
into its constituent parts. The parts are arranged in a sequential manner.
Reasoner: This is the most important component of any semantic application. It
is a collection of rules used for analyzing the object. Basically, reasoners
are rule-based systems that apply rules to the collected data or objects. This
reasoning is used for problem solving.
Invoker: This is a triggering component that initiates the process of searching
or the action of the service. It starts with the request made by the client for
the service.
Matchmaker: This is the main execution module, which looks for the most
suitable result for the client's request.
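
How the five components might cooperate can be sketched as a toy pipeline. This
is only an illustration under strong assumptions: the service descriptions, the
synonym rules and every function name below are hypothetical, and a real
semantic web service would reason over ontologies rather than plain words.

```python
# Register: raw, unprocessed descriptions of available services (hypothetical data).
REGISTER = [
    "flight booking delhi mumbai",
    "hotel booking mumbai",
    "train timetable delhi",
]


def decompose(text: str) -> list[str]:
    """Decomposer: break an object into its parts, kept in sequence."""
    return text.lower().split()


def reason(terms: list[str]) -> set[str]:
    """Reasoner: apply rules to the parts; here, two simple synonym rules."""
    rules = {"plane": "flight", "bombay": "mumbai"}
    return {rules.get(term, term) for term in terms}


def matchmaker(wanted: set[str]) -> str:
    """Matchmaker: pick the registered service sharing the most terms with the request."""
    return max(REGISTER, key=lambda entry: len(wanted & set(decompose(entry))))


def invoke(request: str) -> str:
    """Invoker: triggered by the client's request; drives the other components."""
    return matchmaker(reason(decompose(request)))


print(invoke("plane booking Bombay"))
```

The request never mentions "flight" or "mumbai", yet the rules applied by the
reasoner let the matchmaker select the flight booking service.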
5.2 Tools for developing Semantic Applications
5.2.1 eXtensible Markup Language (XML)
XML is used to describe and exchange data on the Web. It allows creators to
mark up pages with their own vocabulary, in which tags carry the meaning and
description of the content. The tags used in XML are more meaningful than the
tags used in HTML. For example, XML can use the tag <Author> where HTML would
use a tag such as <H1>. Author is more meaningful and self-describing than H1.
HTML tags are predefined, while XML tags are of the creator's choice. XML lets
the creator decide what information is placed between the tags, and because
that information has a hierarchical structure, a user can easily understand
what it means. (Aditya Tripathi, 2003)
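
A minimal sketch of reading such self-describing tags with Python's standard
xml.etree.ElementTree module is given below. The record, its tag names and the
helper function field are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# A hypothetical record whose tags were chosen by its creator.
XML_RECORD = """
<Book>
  <Title>An Example Book</Title>
  <Author>A. Writer</Author>
  <Year>2003</Year>
</Book>
""".strip()

book = ET.fromstring(XML_RECORD)


def field(record: ET.Element, tag: str):
    """Return the text inside the first child with this tag, or None if absent."""
    element = record.find(tag)
    return element.text if element is not None else None


print(field(book, "Author"))
print([child.tag for child in book])
```

Because the tag names state what each value is, a program can ask for the
author directly, which is not possible when the same text is marked up only as
a heading.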

5.2.2 Resource Description Framework (RDF)


RDF is a language for representing information about resources on the Web. RDF
identifies things by using URIs and describes them with simple statements
(triples). It is a domain-independent technology that provides a way to build
an object model to which the actual data refers.
Development of RDF started with the PICS (Platform for Internet Content
Selection) project in 1995. PICS was a rating mechanism for the content of web
pages. The idea was to filter out unwanted web pages containing foul language,
pornographic material, violence, etc. Once the project was under way, it was
found that the same mechanism could be used to describe the content of a web
page in a form understandable by machines. The extension of the PICS project
was PICS-NG (PICS Next Generation), which was later called RDF (Resource
Description Framework).

5.2.3 Modeling data in RDF


Representing data in RDF is straightforward, as every statement is a triple of
Resource, Property and Value. A simple RDF model has three parts.
i. Resource (Subject): Any entity that is to be described is known as a
‘Resource’, which is equivalent to the ‘Subject’ in English grammar. It
can be a ‘webpage’ on the Internet, a ‘person’ in a society or any
other object.
ii. Property (Predicate): Any characteristic or attribute of a ‘Resource’
that is used to describe it is known as a Property, which is
equivalent to the ‘Predicate’ in English grammar. For example, a
webpage can be recognized by its ‘Title’ and a person by his or her
‘Name’. Both are attributes for recognizing the resources ‘webpage’
and ‘person’ respectively.
iii. Value (Object): A Property must have a ‘Value’, which is equivalent to
the ‘Object’ in English grammar.
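
The three parts above can be sketched as a small Python data structure. The
page URI and the title are hypothetical; the property URI is the Dublin Core
"title" element. The class name Statement and its method are assumptions made
for this example, and the output imitates the N-Triples notation for a plain
literal without handling escaping of special characters.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    resource: str  # Subject: the thing being described, identified by a URI
    prop: str      # Predicate: the property used to describe it, also a URI
    value: str     # Object: the value of the property, here a plain literal

    def to_ntriples(self) -> str:
        """Write the statement as one N-Triples-style line (no escaping handled)."""
        return f'<{self.resource}> <{self.prop}> "{self.value}" .'


page_title = Statement(
    resource="http://example.org/index.html",      # hypothetical web page
    prop="http://purl.org/dc/elements/1.1/title",  # Dublin Core 'title'
    value="Library Home Page",
)

print(page_title.to_ntriples())
```

Read aloud, the statement says: the web page (Resource) has a title (Property)
whose content is "Library Home Page" (Value).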

5.2.4 Web Ontology


At present, search engines search over the indexes stored in their databases
using pattern-matching algorithms. Such a search does not associate the search
term with a concept. This inherent problem is not due to any weakness of the
search engines; rather, it is due to the representation of data in web pages
using Hyper Text Markup Language (HTML), the language of the Web. Hence, a
mechanism is envisaged for representing the data of web pages in another
language, Extensible Markup Language (XML), together with a standard data
description framework called the Resource Description Framework (RDF). Each
individual web page can be considered an entity with its own attributes or
characteristics. Based on these attributes, pages can be grouped, and they can
further form relations with other web pages or groups of web pages. This
produces a kind of web-based ontology, also known as a Web ontology, for web
documents, although the original idea of ontology remains the same. The
framework uses standard vocabularies such as RDF Schema (RDFS) and Web Ontology
Language (OWL) for describing concepts and their relations with other concepts.
Search engines extract the data from a web page and preserve the relations
within the data, so that meaningful results can be generated.
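
The grouping of pages by a shared attribute, and the relations that follow from
it, can be sketched as below. The page URIs, the "subject" attribute and both
function names are hypothetical, and the sketch stands in for what a real
system would express in RDFS or OWL.

```python
from collections import defaultdict

# Hypothetical web pages, each treated as an entity with attributes.
PAGES = [
    {"uri": "http://example.org/a", "subject": "Cataloguing"},
    {"uri": "http://example.org/b", "subject": "Classification"},
    {"uri": "http://example.org/c", "subject": "Cataloguing"},
]


def group_by(pages: list[dict], attribute: str) -> dict:
    """Group page URIs by the value of one attribute."""
    groups = defaultdict(list)
    for page in pages:
        groups[page[attribute]].append(page["uri"])
    return dict(groups)


def related(pages: list[dict], uri: str, attribute: str) -> list[str]:
    """URIs of the other pages that share this attribute value with the given page."""
    value = next(p[attribute] for p in pages if p["uri"] == uri)
    return [p["uri"] for p in pages if p[attribute] == value and p["uri"] != uri]


print(group_by(PAGES, "subject"))
print(related(PAGES, "http://example.org/a", "subject"))
```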

5.2.5 Web Ontology Language (OWL)


The Web Ontology Language (OWL) is a language for creating Web ontologies. A
Web ontology follows an object-oriented approach and hence facilitates the
description of classes, properties and their instances. These ontologies
preserve formal semantics and specify how logical consequences are derived. An
ontological structure may represent a single web object or a collection of web
objects.
OWL ontologies make it possible to develop agents that can reason. These
agents would provide generic support that is not tied to any particular subject
domain. A standard method of constructing ontologies would encourage
third-party agents, both commercial and public domain. These agents will in
turn build services that ultimately benefit users.

The Species of OWL


The OWL language provides three increasingly expressive sublanguages designed
for use by specific communities of implementers and users. (Smith, Welty and
McGuinness, 2004)
OWL Lite supports those users primarily needing a classification hierarchy and
simple constraint features. For example, while OWL Lite supports cardinality
constraints, it only permits cardinality values of 0 or 1. It should be simpler
to provide tool support for OWL Lite than for its more expressive relatives,
and it provides a quick migration path for thesauri and other taxonomies.
OWL DL supports those users who want the maximum expressiveness without losing
computational completeness (all entailments are guaranteed to be computed) and
decidability (all computations will finish in finite time) of reasoning
systems. OWL DL includes all OWL language constructs, with restrictions such as
type separation (a class cannot also be an individual or property, and a
property cannot also be an individual or class). OWL DL is so named due to its
correspondence with description logics, a field of research that has studied a
particular decidable fragment of first-order logic. OWL DL was designed to
support the existing Description Logic business segment and has desirable
computational properties for reasoning systems.
OWL Full is meant for users who want maximum expressiveness and the syntactic
freedom of RDF with no computational guarantees. For example, in OWL Full a
class can be treated simultaneously as a collection of individuals and as an
individual in its own right. Another significant difference from OWL DL is that
an owl:DatatypeProperty can be marked as an owl:InverseFunctionalProperty. OWL
Full allows an ontology to augment the meaning of the pre-defined (RDF or OWL)
vocabulary. It is unlikely that any reasoning software will be able to support
every feature of OWL Full.
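
Two of the restrictions mentioned above can be sketched as simple checks. This
is not an OWL validator: the ontology is a hypothetical, heavily simplified
dictionary rather than an RDF graph, and the function names are assumptions
made for illustration.

```python
# Hypothetical, simplified ontology description; real OWL documents are RDF graphs.
ontology = {
    "classes": {"Person", "Book"},
    "properties": {"hasAuthor"},
    "individuals": {"book1", "Person"},  # "Person" is also used as an individual
    "cardinalities": {"hasAuthor": 2},   # a cardinality constraint with value 2
}


def violates_owl_lite_cardinality(onto: dict) -> list[str]:
    """OWL Lite only permits cardinality values of 0 or 1."""
    return sorted(p for p, n in onto["cardinalities"].items() if n not in (0, 1))


def violates_type_separation(onto: dict) -> list[str]:
    """OWL DL type separation: no name may be both a class, a property or an individual."""
    c, p, i = onto["classes"], onto["properties"], onto["individuals"]
    return sorted((c & p) | (c & i) | (p & i))


print(violates_owl_lite_cardinality(ontology))
print(violates_type_separation(ontology))
```

In this toy ontology, the cardinality of 2 goes beyond what OWL Lite permits,
and using Person as both a class and an individual breaks the type separation
of OWL DL; only OWL Full would accept both.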
6. Issues and Challenges
The Semantic Web is still a concept and is not yet physically available. To
develop a machine with human intellect is a difficult task, and the work is
still evolving. The major problem with the Semantic Web is its implementation.
Some major issues are:
• Non-standard tags are used for creating content, which makes the meaning
difficult to interpret. Creators can choose tags according to their own
convenience, so different tags are used for the same context. For
example, the writer of a book may be tagged as creator or as author, and
as a result there is no standardization.
• The lack of a common ontology may cause problems in creating databases.
There should be a top-level ontology that is accepted by all. So far, no
such ontology has been worked out.
• Multilingualism can create problems in exact searching because the same
term can have different meanings in different languages.
• Practical implementation of the Semantic Web is a difficult task because
designing a system capable of thinking, analyzing and decision making is
a tough job.
• New information emerges every second; anyone can publish his or her
views, ideas and concepts. The Semantic Web has to handle this large
amount of data generated every second. Further, reasoning over this large
database on the fly for each user request is a challenge in itself.
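
The first issue above, different tags for the same idea, can be illustrated
with a small normalization step. The mapping table and the function name are
hypothetical; in practice such agreement would come from a shared vocabulary or
ontology rather than a hand-written dictionary.

```python
# Hypothetical mapping from creator-chosen tags to one agreed term.
CANONICAL_TAG = {"author": "creator", "writer": "creator", "creator": "creator"}


def normalize(record: dict) -> dict:
    """Rename known tag variants to the agreed term; other tags are only lower-cased."""
    return {
        CANONICAL_TAG.get(tag.lower(), tag.lower()): value
        for tag, value in record.items()
    }


record_a = {"Author": "A. Writer", "Title": "An Example Book"}
record_b = {"creator": "A. Writer", "title": "An Example Book"}

# After normalization the two differently tagged records become identical.
print(normalize(record_a) == normalize(record_b))
```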

7. Promises of Semantic Web


• The Semantic Web can represent information in a more categorized way.
• Taxonomies offer a more standardized way of representing knowledge.
• Data can be linked so that meaningful inferences can be drawn.
• The whole concept of the Semantic Web would function within the web
browser; one need not look for new software and technologies.
• It offers new mechanisms for finding more reliable and trusted information.
• New services or agents will be developed that are intelligent enough to
pipe in useful data from other sources.
• Web surfing on the Semantic Web is more targeted, and the browsers used
are able to provide more customized searching.

8. Implication and Applicability in Real World


As society develops, human beings move towards literacy, gain information and
become knowledgeable. Humans have developed various technologies and are moving
into a technological era. With the advent of information technology, the
modernization of society is increasing day by day. The existing society, or the
information society, is becoming machine-oriented, and today we perform various
tasks with the help of machines. Human beings developed computers to reduce
their physical effort, but their needs remain, and they are continuously moving
towards new inventions and new technologies. The growth of artificial
intelligence and robotics is a good example of such development: a robot may
have almost all the capabilities of a human, but it does not have the
capabilities of thinking, analyzing and decision making. Tim Berners-Lee
started working in this direction and tried to develop a system with all these
capabilities, and as a result the concept of the Semantic Web evolved.
At present we can see many applications and services on the Web that help users
locate the most suitable result. Flight booking agents, travel planner agents
and social networking sites work within the scope of semantic technology. This
is just a beginning, and in future there will be more and more intelligent
applications.

9. Impact on trinity of libraries


Libraries are changing, and changing radically. Whether there will one day be
an agent to replace the librarian is a question to be pondered. Though it
sounds impossible, near solutions can be reached. Libraries are being fully
automated with minimal intervention from the staff. Semantic technology is
going to make its mark in information discovery and retrieval. This may reduce
the mental effort and burden of the staff and further benefit the users. A user
can get all the information at his or her doorstep simply by submitting a
query.
The Semantic Web is also beneficial in document management. Though there is no
shelf, concept maps of classification schemes are going to be a big help in
constructing ontologies, especially for retrieval.
10. Summary
The Semantic Web is in its initial phase, and the current focus is on
developing its basic, static infrastructure. The next step will be to realize
active components on top that make use of this infrastructure to provide
intelligent services to users. Efforts are under way to provide various
high-level services by mechanizing tasks that currently require human effort,
such as searching for vendors, products and services; comparing and combining
products; and forming coalitions of vendors. However, there is still a long
way to go.

11. References

1. Semantic Web introduction (http://infomesh.net/2001/swintro/)
2. Berners-Lee, T., Hendler, J. and Lassila, O. The Semantic Web. Scientific
American 284(5) (2001)
3. Introduction to RDF and RDFS
(http://www.xml.com/pub/a/2001/01/24/rdf.html)
4. Ontologies and Semantic Web
(http://www.obitko.com/tutorials/ontologies-semanticweb/ontologies.html)
5. Abran, A. and Moore, J.W. (Exec. Eds.), Bourque, P. and Dupuis, R.
(Eds.). Guide to the Software Engineering Body of Knowledge (2004)
6. Advanced Computing: An International Journal (ACIJ), Vol. 2, No. 6,
November 2011, 112
7. World Wide Web Consortium (W3C) Semantic Web activity’s homepage
(http://w3c.org/sw.)
8. The Semantic Web by Eric Miller (http://www.w3c.org)
9. IEEE Std 610.12-1990, IEEE Standard Glossary of Software Engineering
Terminology, IEEE, 1990.
10. Yajing Zhao, Jing Dong and Tu Peng. Ontology Classification for
Semantic-Web-Based Software Engineering. IEEE Transactions on Services
Computing, Vol. 2, No. 4, October-December 2009
11. H.-J. Happel and S. Seedorf. Applications of ontologies in software
engineering. 2nd Workshop on Semantic Web Enabled Software Engineering
(SWESE 2006) at ISWC 2006, Galway, Ireland, November 11-15, 2006
12. Ian Horrocks and Alan Rector. The Semantic Web: Ontologies and OWL
13. Nigam Shan. An introduction to OWL and its alternatives, National
Centre for Biomedical Ontologies
14. Ying Ding, Dieter Fensel, Michel Klein and Borys Omelayenko. The
Semantic web: Yet Another Hip, Data Knowledge Engineering, 2002,
6.10.01.
15. OWL Web Ontology Language Guide, Michael K. Smith, Chris Welty and
Deborah L. McGuinness, Editors, W3C Recommendation, 10 February 2004,
http://www.w3.org/TR/owl-guide/
16. Aditya Tripathi. Resource Description Framework: A Tutorial for
Developing Web Ontology. DRTC Workshop on Semantic Web, 8th – 10th
December, 2003, DRTC, Bangalore
http://drtc.isibang.ac.in:8080/bitstream/handle/1849/120/D_aditya-sematic_web.pdf?sequence=2
