Semantic Document-1

Table of Contents
1 Introduction
1.2 Motivation
2 History
2.1 Web 1.0
2.2 Web 2.0
3 Web 3.0 - A Basic Introduction
3.1 Semantic Web Vision
3.2 Difference between Web 1.0, 2.0 and 3.0
3.3 A Layered Approach
4 Key Components
4.1 URI
4.2 RDF
4.3 RDFS
4.4 OWL
4.5 Microformat
5 Project Implementation
5.1 Practical Illustration
6 Challenges
7 Advantages and Disadvantages
8 Conclusion
9 References
SEMANTIC WEB
INTRODUCTION
Currently the focus of a W3C working group, the Semantic Web vision was conceived by
Tim Berners-Lee, the inventor of the World Wide Web. The World Wide Web changed the way
we communicate, the way we do business, the way we seek information and entertainment:
the very way most of us live our daily lives. Calling it the next step in Web evolution,
Berners-Lee defines the Semantic Web as "a web of data that can be processed directly and
indirectly by machines."
In the Semantic Web, data itself becomes part of the Web and can be processed
independently of application, platform, or domain. This is in contrast to the World Wide Web
as we know it today, which contains virtually boundless information in the form of
documents. We can use computers to search for these documents, but they still have to be
read and interpreted by humans before any useful information can be extracted.
Computers can present you with information but cannot understand what the information is
well enough to display the data that is most relevant in a given circumstance. The Semantic
Web, on the other hand, is about having data as well as documents on the Web so that
machines can process, transform, assemble, and even act on the data in useful ways.
Imagine this scenario. You're a software consultant and have just received a new project.
You're to create a series of SOAP-based Web services for one of your biggest clients. First,
you need to learn a bit about SOAP, so you search for the term using your favorite search
engine. Unfortunately, the results you're presented with are hardly helpful. There are listings
for dish detergents, facial soaps, and even soap operas mixed into the results. Only after
sifting through multiple listings and reading through the linked pages are you able to find
information about the W3C's SOAP specifications.
Department of CSE
Because of the different semantic associations of the word "soap," the results you receive
vary in relevance and you still have to do a lot of work to find the information you're
looking for. However, in a Semantic Web-enabled environment, you could use a Semantic
Web agent to search the Web for "SOAP" where SOAP is a type of
technology specification used in Web services. This time, the results of your search will be
relevant. Your Semantic Web agent can also search your corporate network for the SOAP
specification and discover whether your colleagues have completed similar projects or have
posted SOAP-related research on the network. Based on the semantic information available for
SOAP, your agent also presents you with a list of related technologies. Now you know that
WSDL, XML, and URI are all technologies related to SOAP, and that you'll need to do some
research on them, too, before beginning your project. Armed with the information returned by
your Semantic Web agent, you read the related technology specifications and send emails to
the colleagues who have made SOAP-related materials available on the network to ask for
their input before starting your new project.
2. HISTORY
2.1 Web 1.0:

Web 1.0 (1991-2003) is a retronym which refers to the state of the World Wide Web, and
any website design style used, before the advent of the Web 2.0 phenomenon. Web 1.0 began
with the release of the WWW to the public in 1991, and is the general term that has been
created to describe the Web before the "bursting of the dot-com bubble" in 2001, which is
seen by many as a turning point for the internet.
2.1.1 WEB 1.0 DESIGN ELEMENTS
Some typical design elements of a Web 1.0 site include:
• Static pages instead of dynamic user-generated content.
• The use of framesets.
• Proprietary HTML extensions such as the <blink> and <marquee> tags introduced
during the first browser war.
• Online guestbooks.
• GIF buttons, typically 88x31 pixels in size, promoting web browsers and other
products.
• HTML forms sent via email. A user would fill in a form, and upon clicking submit
their email client would attempt to send an email containing the form's details.
Figure 1. Web 1.0 Example
Wikipedia is an example of Web 1.0 because the site allows the user only to view pages or
search for information at best; user interaction is minimal and the site is basically static.
2.2 Web 2.0:
The term "Web 2.0" (2004-present) is commonly associated with web applications that
facilitate interactive information sharing, interoperability, user-centered design and
collaboration on the World Wide Web. Examples of Web 2.0 include web-based
communities, hosted services, web applications, social-networking sites, video-sharing sites,
wikis, blogs, mashups, and folksonomies. A Web 2.0 site allows its users to interact with
other users or to change website content, in contrast to non-interactive websites where users
are limited to the passive viewing of information that is provided to them.
Although the term suggests a new version of the World Wide Web, it does not refer to an
update to any technical specifications, but rather to cumulative changes in the ways software
developers and end-users use the Web.
2.2.1 Web 2.0 Characteristics:

Web 2.0 websites allow users to do more than just retrieve information. They can build on the
interactive facilities of "Web 1.0" to provide "Network as platform" computing, allowing
users to run software applications entirely through a browser. Users can own the data on a
Web 2.0 site and exercise control over that data. These sites may have an "Architecture of
participation" that encourages users to add value to the application as they use it.
The concept of Web-as-participation-platform captures many of these characteristics. Bart
Decrem, a founder and former CEO of Flock, calls Web 2.0 the "participatory Web" and
regards the Web-as-information-source as Web 1.0.
The impossibility of excluding group members who don't contribute to the provision of
goods from sharing profits gives rise to the possibility that rational members will prefer to
withhold their contribution of effort and free-ride on the contribution of others. This requires
what is sometimes called Radical Trust by the management of the website. According to Best,
the characteristics of Web 2.0 are: rich user experience, user participation, dynamic content,
metadata, web standards and scalability. Further characteristics, such as openness, freedom
and collective intelligence by way of user participation, can also be viewed as essential
attributes of Web 2.0.
2.2.2 Web 2.0 Examples:
Figure 2 Web 2.0 Examples
Facebook is a social networking site and a prominent example of Web 2.0. This site
allows users to make friends, write them messages, chat with them, and upload and share
photos, among other activities.
3. Web 3.0: Basic introduction
The Semantic Web is a mesh of information linked up in such a way as to be easily
processable by machines, on a global scale. You can think of it as being an efficient way of
representing data on the World Wide Web, or as a globally linked database.
The Semantic Web was thought up by Tim Berners-Lee, inventor of the WWW, URLs,
HTTP, and HTML. There is a dedicated team of people at the World Wide Web Consortium
(W3C) working to improve, extend and standardize the system, and many languages,
publications, tools and so on have already been developed. However, Semantic Web
technologies are still very much in their infancy, and although the future of the project in
general appears to be bright, there seems to be little consensus about the likely direction and
characteristics of the early Semantic Web.
What's the rationale for such a system? Data that is generally hidden away in HTML files is
often useful in some contexts, but not in others. The problem with the majority of data on the
Web that is in this form at the moment is that it is difficult to use on a large scale, because
there is no global system for publishing data in such a way that it can be easily processed by
anyone. For example, just think of information about local sports events, weather
information, plane times, Major League Baseball statistics, and television guides. All of this
information is presented by numerous sites, but all in HTML.
The Semantic Web is a web of data. There is lots of data we all use every day, and it is not
part of the web. I can see my bank statements on the web, and my photographs, and I can see
my appointments in a calendar. But can I see my photos in a calendar to see what I was doing
when I took them? Can I see bank statement lines in a calendar?
Why not? Because we don't have a web of data. Because data is controlled by applications,
and each application keeps it to itself.
The Semantic Web is about two things. It is about common formats for integration and
combination of data drawn from diverse sources, whereas the original Web mainly
concentrated on the interchange of documents. It is also about language for recording how the
data relates to real-world objects.
3.1 The Semantic Web Vision
Today's Web
The World Wide Web has changed the way people communicate with each other and the way
business is conducted. It lies at the heart of a revolution which is currently transforming the
developed world toward a knowledge economy, and more broadly speaking, to a knowledge
society. This development has also changed the way we think of computers. Originally, they
were used for performing numerical calculations. Currently their predominant use is
information processing, typical applications being databases, text processing, and games. At
present there is a transition of focus towards the view of computers as entry points to the
information highways. Most of today's Web content is suitable for human consumption. Even
Web content that is generated automatically from databases is usually presented without the
original structural information found in databases. Typical uses of the Web today involve
humans seeking and consuming information, searching and getting in touch with other
humans, reviewing the catalogs of online stores and ordering products by filling out forms,
and viewing adult material. These activities are not particularly well supported by software
tools. Apart from the existence of links which establish connections between documents, the
main valuable, indeed indispensable, kind of tools are search engines. Keyword-based search
engines, such as AltaVista, Yahoo and Google, are the main tool for using today's Web. It is
clear that the Web would not have been the huge success it was, were it not for search
engines. However, there are serious problems associated with their use. Here we list the main
ones:
• High recall, low precision: Even if the main relevant pages are retrieved, they are of
little use if another 28,758 mildly relevant or irrelevant documents were also retrieved. Too
much can easily become as bad as too little.
• Low or no recall: Often it happens that we don't get any answer for our request, or
that important and relevant pages are not retrieved. Although low recall is a less
frequent problem with current search engines, it does occur. This is often due to the
third problem:
• Results highly sensitive to vocabulary: Often we have to use semantically similar
keywords to get the results we wish; in these cases, the relevant documents use
different terminology from the original query. This behaviour is unsatisfactory, since
semantically similar queries should return similar results.
• Results are single Web pages: If we need information that is spread over various documents,
then we must initiate several queries to collect the relevant documents, and then we must
manually extract the partial information and put it together.
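The precision and recall problems listed above can be made concrete with a small
calculation. The sketch below uses the standard information-retrieval definitions
(precision = relevant retrieved / total retrieved, recall = relevant retrieved / total
relevant). Only the figure 28,758 comes from the text; the count of 10 relevant pages is an
assumption made purely for illustration.

```python
def precision(relevant_retrieved: int, total_retrieved: int) -> float:
    """Fraction of the retrieved documents that are relevant."""
    return relevant_retrieved / total_retrieved if total_retrieved else 0.0


def recall(relevant_retrieved: int, total_relevant: int) -> float:
    """Fraction of all relevant documents that were retrieved."""
    return relevant_retrieved / total_relevant if total_relevant else 0.0


# Assumption for illustration: 10 relevant pages exist and all are retrieved.
relevant_pages = 10
# Mildly relevant or irrelevant documents retrieved alongside them (from the text).
noise_pages = 28_758

p = precision(relevant_pages, relevant_pages + noise_pages)
r = recall(relevant_pages, relevant_pages)

print(f"recall    = {r:.2f}")
print(f"precision = {p:.5f}")
```

Under these assumed numbers recall is perfect while precision is a small fraction of one
percent, which is the "too much is as bad as too little" situation described above.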
Interestingly, despite obvious improvements in search engine technology, the difficulties
remain essentially the same. It seems that the amount of Web content outgrows the
technological progress. But even if a search is successful, it is the human who has to browse
selected retrieved documents to extract the information he is actually looking for. In other
words, there is not much support for retrieving the information (for some limited exceptions
see the next section), an activity that can be very time-consuming. Therefore the term
information retrieval, used in association with search engines, is somewhat misleading;
location finder might be a more appropriate term. Also, results of Web searches are not
readily accessible by other software tools; search engines are often isolated applications.
Figure. 3: Semantic Web includes features
3.2 Difference between Web 1.0, Web 2.0 and Web 3.0
Web 1.0:
Experts call the Internet before 1999 the Read-Only era. The average internet user's role was
limited only to reading the information presented to him. The best examples are millions of
static websites which mushroomed during the dot-com boom. There was no active
communication or information flow from consumer of the information to producer of the
information.
Web 2.0:
The lack of active interaction of the common user with the web led to the birth of Web 2.0. The
year 1999 marked the beginning of a Read-Write-Publish era with notable contributions from
LiveJournal (launched in April 1999) and Blogger (launched in August 1999). Now even a
non-technical user can actively interact with and contribute to the web using different blog
platforms. This era empowered the common user with a few new concepts, viz. blogs, social
media, and video streaming. Publishing your content is only a few clicks away! A few
remarkable developments of Web 2.0 are Twitter, YouTube, eZineArticles, Flickr and
Facebook.
Web 3.0:
It seems we have everything we had wished for in Web 2.0, but it is way behind
when it comes to intelligence. Perhaps a six-year-old child has a better analytical ability than
the existing search technologies! Keyword-based search in Web 2.0 resulted in an information
overload. The following attributes are going to be a part of Web 3.0:
• Contextual Search
• Tailor-made Search
• Personalized Search
• Evolution of 3D Web
• Deductive Reasoning
Though the Web is yet to see something which can be termed fairly intelligent, the efforts
to achieve this goal have already begun. Two weeks back, the Official Google Blog mentioned
how the Google search algorithm is now getting intelligent, as it can identify many
synonyms.
For example, Pictures and Photos are now treated as similar in meaning. From now onwards,
your search query GM crop will not lead you to the GM (General Motors) website. Why?
Because, first, by synonym identification Google will understand that GM may mean General
Motors or Genetically Modified. Then by context, i.e., by the keyword crop, it will deduce
that the user wants information on genetically modified crops and not on General Motors.
Similarly, GM car will not lead you to genetically modified crops. Try it out yourself to check
how this newly added artificial intelligence works in Google. Also, there are many websites
built on Web 3.0 which personalize your search. The web is indeed getting intelligent.
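The GM example above can be sketched as a toy context-based disambiguation step. This is
not how Google's algorithm works (the text does not describe it); the sense table and its
context words are hypothetical, and the rule is simply "pick the sense that shares the most
words with the rest of the query".

```python
# Hypothetical sense inventory: each sense of an ambiguous term lists
# context words that suggest it. Invented for illustration only.
SENSES = {
    "gm": {
        "General Motors": {"car", "truck", "dealer", "motors"},
        "genetically modified": {"crop", "crops", "food", "seed"},
    }
}


def disambiguate(query: str) -> dict:
    """Map each ambiguous query word to the sense best supported by context."""
    words = query.lower().split()
    resolved = {}
    for word in words:
        senses = SENSES.get(word)
        if not senses:
            continue  # not an ambiguous term we know about
        context = set(words) - {word}
        # Choose the sense whose context words overlap most with the query.
        best = max(senses, key=lambda sense: len(senses[sense] & context))
        if senses[best] & context:  # only commit when context gives evidence
            resolved[word] = best
    return resolved


print(disambiguate("GM crop"))
print(disambiguate("GM car"))
```

A query consisting of "GM" alone stays unresolved in this sketch, because no context word
is available to favour either sense.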
3.3 A Layered Approach
Figure. 4: A Layered Approach of Semantic Web
The main obstacle to providing better support to Web users is that, at present, the meaning
of Web content is not machine accessible. Of course, there are tools that can retrieve texts,
split them into parts, check the spelling, decompose them, put them together in various ways,
and count their words. But when it comes to interpreting sentences and extracting useful
information for users, the capabilities of current software are still very limited.
The development of the Semantic Web proceeds in steps, each step building a layer on top of
another. The pragmatic justification for this approach is that it is easier to achieve consensus
on small steps, while it is much harder to get everyone on board if too much is attempted.
Usually there are several research groups moving in different directions; this competition of
ideas is a major driving force for scientific progress. However, from an engineering
perspective there is a need to standardize. So, if most researchers agree on certain issues and
disagree on others, it makes sense to fix the points of agreement. This way, even if the more
ambitious research efforts should fail, there will be at least partial positive outcomes. Once a
standard has been established, many more groups and companies will adopt it, instead of
waiting to see which of the alternative research lines will be successful in the end. The nature
of the Semantic Web is such that companies and single users must build tools, add content
and use that content. We cannot wait until the Semantic Web vision materializes; it may
take another 10 years for it to be realized to its full extent (as envisioned today, of course!). In
building one layer of the Semantic Web on top of another, there are some principles that
should be followed:
1. Downward compatibility: Agents fully aware of a layer should also be able to interpret and
use information written at lower levels. For example, agents aware of the semantics of OWL
can take full advantage of information written in RDF and RDF Schema.
2. Upward partial understanding: On the other hand, agents fully aware of a layer should take
at least partial advantage of information at higher levels. For example, an agent aware only of
the RDF and RDF Schema semantics can interpret knowledge written in OWL partly, by
disregarding those elements that go beyond RDF and RDF Schema.
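Upward partial understanding can be illustrated with a short sketch: an agent that only
knows RDF and RDF Schema keeps the statements it can interpret and disregards anything that
uses OWL vocabulary. The namespace URIs are the standard W3C ones; the example.org resources
and the filtering rule itself are assumptions made for illustration, not a prescribed
algorithm.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
EX = "http://example.org/"  # hypothetical namespace for the sample data

# A small ontology written with a mix of RDFS and OWL vocabulary.
triples = [
    (EX + "Dog", RDFS + "subClassOf", EX + "Animal"),
    (EX + "Dog", OWL + "disjointWith", EX + "Cat"),
    (EX + "Rex", RDF + "type", EX + "Dog"),
]


def partial_view(statements):
    """Keep only the statements an RDF/RDFS-only agent can interpret.

    Any triple that mentions a term from the OWL namespace is disregarded.
    """
    return [
        triple
        for triple in statements
        if not any(term.startswith(OWL) for term in triple)
    ]


for subject, predicate, obj in partial_view(triples):
    print(subject, predicate, obj)
```

The RDFS-only agent still learns that Rex is a Dog and that every Dog is an Animal; it simply
misses the OWL-level fact that dogs and cats are disjoint.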
Figure 4 shows the "layer cake" of the Semantic Web, which is due to Tim Berners-Lee and
describes the main layers of the Semantic Web design and vision. At the bottom we find XML,
a language that lets one write structured Web documents with a user-defined vocabulary.
XML is particularly suitable for sending documents across the Web. RDF is a basic data
model, like the entity-relationship model, for writing simple statements about Web objects
(resources). The RDF data model does not rely on XML, but RDF has an XML-based syntax.
Therefore, in Figure 4 it is located on top of the XML layer.
RDF Schema provides modelling primitives for organizing Web objects into hierarchies.
Key primitives are classes and properties, subclass and subproperty relationships, and
domain and range restrictions. RDF Schema is based on RDF. RDF Schema can be viewed as
a primitive language for writing ontologies. But there is a need for more powerful ontology
languages that expand RDF Schema and allow the representation of more complex
relationships between Web objects. The logic layer is used to enhance the ontology language
further, and to allow the writing of application-specific declarative knowledge. The proof layer
involves the actual deductive process, as well as the representation of proofs in Web
languages (from lower levels) and proof validation. Finally, trust will emerge through the use
of digital signatures, and other kinds of knowledge, based on recommendations by agents we
trust, or rating and certification agencies and consumer bodies. Sometimes the term Web of
Trust is used, to indicate that trust will be organized in the same distributed and chaotic way
as the WWW itself. Being located at the top of the pyramid, trust is a high-level and crucial
concept: the Web will only achieve its full potential when users have trust in its operations
(security) and the quality of information provided.
Description
The basic architecture of the Semantic Web contains identifiers (Uniform Resource
Identifiers) and a character set (Unicode). Above this layer is the Syntax layer, defining the
syntactical relationships, and the base here is XML. Above this layer is the Data Interchange
layer, defined by RDF. Above it, querying is handled by SPARQL and
taxonomies are defined by RDFS. Ontologies are governed by OWL, and rules by
RIF/SWRL. Above these are the unifying logic and the proof layers. All the above-mentioned
layers are secured using cryptography. Above these is the Trust layer.
A brief description of all the above-mentioned layers and components shall be given in the
upcoming segments of the report.
4. KEY COMPONENTS
Semantic Web has five main components which help in accomplishing the required task and
define the functioning of the web:
4.1 Uniform Resource Identifier
A URI is simply a Web identifier, like the strings starting with "http:" or "ftp:" that you often
find on the World Wide Web. Anyone can create a URI, and the ownership of them is clearly
delegated, so they form an ideal base technology on which to build a global Web.
In fact, the World Wide Web is such a thing: anything that has a URI is considered to be "on
the Web".
A URI may be classified as a locator (URL), or a name (URN), or both. A Uniform Resource
Name (URN) functions like a person’s name, while a Uniform Resource Locator (URL)
resembles that person's street address. In other words: the URN defines an item's identity,
while the URL provides a method for finding it.
The URI syntax consists of a URI scheme name followed by a colon character, and then by a
scheme-specific part. The specifications that govern the schemes determine the syntax and
semantics of the scheme-specific part, although the URI syntax does force all schemes to
adhere to a certain generic syntax that, among other things, reserves certain characters for
special purposes (without always identifying those purposes). The URI syntax also enforces
restrictions on the scheme-specific part, in order to, for example, provide for a degree of
consistency when the part has a hierarchical structure. Percent-encoding allows characters
that are reserved or not otherwise permitted to be represented in a URI.
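The generic syntax described above (scheme, colon, scheme-specific part) can be explored with
Python's standard urllib.parse module. This is only a sketch; the example.org address, the
ISBN URN and the sample string are placeholders chosen for illustration.

```python
from urllib.parse import quote, unquote, urlsplit

# A hierarchical URI: the scheme is followed by an authority, path, query
# and fragment inside the scheme-specific part.
parts = urlsplit("http://example.org/reports/semantic-web.html?lang=en#intro")
print("scheme:  ", parts.scheme)
print("netloc:  ", parts.netloc)
print("path:    ", parts.path)
print("query:   ", parts.query)
print("fragment:", parts.fragment)

# A non-hierarchical URI (a URN): everything after the first colon is the
# scheme-specific part, which urlsplit reports as the path.
urn = urlsplit("urn:isbn:0451450523")
print(urn.scheme, "|", urn.path)

# Percent-encoding: characters that are reserved or not allowed are
# replaced by %XX escapes and can be decoded again.
encoded = quote("sky blue & green")
print(encoded)
print(unquote(encoded))
```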
A URI reference is another type of string that represents a URI, and (in turn) represents the
resource identified by that URI. Informal usage does not often maintain the distinction
between a URI and a URI reference, but protocol documents should not allow for ambiguity.
A URI reference may take the form of a full URI, or just the scheme-specific portion of one,
or even some trailing component thereof, even the empty string. An optional fragment
identifier, preceded by #, may be present at the end of a URI reference. The part of the
reference before the # indirectly identifies a resource, and the fragment identifier identifies
some portion of that resource.
In order to derive a URI from a URI reference, software converts the URI reference to
"absolute" form by merging it with an absolute "base" URI according to a fixed algorithm. The
system treats the URI reference as relative to the base URI, although in the case of an
absolute reference, the base has no relevance. The base URI typically identifies the document
containing the URI reference, although this can be overridden by declarations made within
the document or as part of an external data transmission protocol. If the base URI includes a
fragment identifier, it is ignored during the merging process. If a fragment identifier is
present in the URI reference, it is preserved during the merging process.
Web document markup languages frequently use URI references to point to other resources,
such as external documents or specific portions of the same logical document.
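The merging of a URI reference with a base URI can be tried out with urljoin from Python's
standard library, which implements this kind of resolution. The base URI and the references
below are made-up example.org addresses used only for illustration.

```python
from urllib.parse import urldefrag, urljoin

# Base URI of the (hypothetical) document that contains the references.
# Its own fragment (#summary) is ignored during merging.
base = "http://example.org/docs/report.html#summary"

references = [
    "chapter2.html",                    # relative path
    "../images/fig4.png",               # relative path climbing one level
    "#references",                      # fragment only: same document
    "chapter2.html#rdf",                # the reference's fragment is kept
    "ftp://files.example.org/data.rdf", # absolute: the base is irrelevant
]

resolved = {ref: urljoin(base, ref) for ref in references}
for ref, uri in resolved.items():
    print(f"{ref:35} -> {uri}")

# Splitting a resolved URI back into the resource part and the fragment.
resource, fragment = urldefrag(resolved["#references"])
print(resource, "|", fragment)
```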
4.2 RDF:
The Resource Description Framework (RDF) is a family of W3C specifications originally
designed as a metadata data model. It has come to be used as a general method for conceptual
description or modeling of information that is implemented in web resources, using a variety
of syntax formats.
The RDF data model is similar to classic conceptual modeling approaches such as Entity-
Relationship or Class diagrams, as it is based upon the idea of making statements about
resources (in particular Web resources) in the form of subject-predicate-object expressions.
These expressions are known as triples in RDF terminology. The subject denotes the
resource, and the predicate denotes traits or aspects of the resource and expresses a
relationship between the subject and the object. For example, one way to represent the notion
"The sky has the color blue" in RDF is as the triple: a subject denoting "the sky", a predicate
denoting "has the color", and an object denoting "blue". RDF is an abstract model with
several serialization formats (i.e., file formats), and so the particular way in which a resource
or triple is encoded varies from format to format.
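As a minimal sketch of the triple idea, the statement "The sky has the color blue" can be held
as a three-field record and written out in an N-Triples-like line. The example.org namespace
and the property name hasColor are hypothetical, and storing a literal as a quoted string is a
convention of this sketch rather than part of RDF itself.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


EX = "http://example.org/"  # hypothetical namespace

# Sketch convention: URIs are plain strings, literals carry their own quotes.
sky_is_blue = Triple(EX + "sky", EX + "hasColor", '"blue"')


def to_ntriples(triple: Triple) -> str:
    """Serialize a triple as one N-Triples-style line."""
    def term(value: str) -> str:
        return value if value.startswith('"') else f"<{value}>"

    return f"{term(triple.subject)} {term(triple.predicate)} {term(triple.obj)} ."


print(to_ntriples(sky_is_blue))
```

The same abstract triple could equally be written in RDF/XML or Turtle; only the encoding
changes, not the statement.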
A collection of RDF statements intrinsically represents a labelled, directed multi-graph. As
such, an RDF-based data model is more naturally suited to certain kinds of knowledge
representation than the relational model and other ontological models. However, in practice,
RDF data is often persisted in relational databases or in native representations called
triplestores, or quad stores if context (i.e., the named graph) is also persisted for each RDF
triple. As RDFS and OWL demonstrate, additional ontology languages can be built upon RDF.
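The idea of a triplestore can be sketched as a set of triples plus a pattern-matching lookup
in which None acts as a wildcard. The class name TripleStore and its methods are invented for
this illustration; production triplestores add indexing, persistence and a query language
such as SPARQL.

```python
class TripleStore:
    """A toy in-memory triplestore (illustrative only)."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return all triples that fit the pattern; None matches anything."""
        return sorted(
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


store = TripleStore()
store.add("ex:John", "rdf:type", "foaf:Person")
store.add("ex:John", "ex:employer", "ex:CompanyX")
store.add("ex:Mary", "rdf:type", "foaf:Person")

# Who is a foaf:Person?
people = [s for s, _, _ in store.match(predicate="rdf:type", obj="foaf:Person")]
print(people)

# Everything that is known about ex:John.
for triple in store.match(subject="ex:John"):
    print(triple)
```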
The subject of an RDF statement is either a Uniform Resource Identifier (URI) or a blank
node, both of which denote resources. Resources indicated by blank nodes are called
anonymous resources. They are not directly identifiable from the RDF statement. The
predicate is a URI which also indicates a resource, representing a relationship. The object is a
URI, blank node or a Unicode string literal.
In Semantic Web applications, and in relatively popular applications of RDF like RSS and
FOAF (Friend of a Friend), resources tend to be represented by URIs that intentionally
denote, and can be used to access, actual data on the World Wide Web. But RDF, in general,
is not limited to the description of Internet-based resources. In fact, the URI that names a
resource does not have to be dereferenceable at all. For example, a URI that begins with
"http:" and is used as the subject of an RDF statement does not necessarily have to represent
a resource that is accessible via HTTP, nor does it need to represent a tangible, network-
accessible resource; such a URI could represent absolutely anything. However, there is
broad agreement that a bare URI (without a # symbol) which returns a 300-level coded
response when used in an HTTP GET request should be treated as denoting the Internet
resource that it succeeds in accessing.
4.3 RDFS:
RDF Schema (variously abbreviated as RDFS, RDF(S), RDF-S, or RDF/S) is an extensible
knowledge representation language, providing basic elements for the description of
ontologies, otherwise called Resource Description Framework (RDF) vocabularies, intended
to structure RDF resources. The first version was published by the World Wide Web
Consortium (W3C) in April 1998, and the final W3C recommendation was released in
February 2004. Many RDFS components are included in the more expressive Web
Ontology Language (OWL).
For example, rdfs:Class declares a resource as a class for other resources.
A typical example of an rdfs:Class is foaf:Person in the Friend of a Friend (FOAF)
vocabulary. An instance of foaf:Person is a resource that is linked to the class using the
rdf:type predicate, such as in the following formal expression of the natural language sentence
'John is a Person':
ex:John rdf:type foaf:Person
The definition of rdfs:Class is recursive: rdfs:Class is the rdfs:Class of any rdfs:Class.
rdfs:subClassOf allows hierarchies of classes to be declared.
For example, the following declares that 'Every Person is an Agent':
foaf:Person rdfs:subClassOf foaf:Agent
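The effect of rdfs:subClassOf can be sketched in a few lines: if John is a foaf:Person and
every Person is an Agent, then John is also a foaf:Agent. The dictionaries below are a
hand-rolled stand-in for an RDF graph, and the function names are invented for this
illustration; a real RDFS reasoner applies the same entailment rule over triples.

```python
# Class hierarchy: each class maps to its direct superclasses.
subclass_of = {"foaf:Person": {"foaf:Agent"}}

# Explicitly asserted rdf:type statements.
asserted_types = {"ex:John": {"foaf:Person"}}


def superclasses(cls):
    """All direct and indirect superclasses of cls (transitive closure)."""
    found = set()
    pending = [cls]
    while pending:
        current = pending.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in found:
                found.add(parent)
                pending.append(parent)
    return found


def inferred_types(resource):
    """Asserted types plus every type implied by rdfs:subClassOf."""
    types = set(asserted_types.get(resource, ()))
    for cls in list(types):
        types |= superclasses(cls)
    return types


print(sorted(inferred_types("ex:John")))
```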
Hierarchies of classes support inheritance of a property domain and range from a class to its
subclasses. The RDF Schema specification describes rdf:Property as the class of RDF
properties. Each member of the class is an RDF predicate.
rdfs:domain of an rdf:Property declares the class of the subject in a triple whose second
component is the predicate.
rdfs:range of an rdf:Property declares the class or datatype of the object in a triple whose
second component is the predicate.
For example, the following declarations are used to express that the property ex:employer
relates a subject, which is of type foaf:Person, to an object, which is of type
foaf:Organization:
ex:employer rdfs:domain foaf:Person
ex:employer rdfs:range foaf:Organization
Given the previous two declarations, the following triple entails that ex:John is necessarily
a foaf:Person and ex:CompanyX is necessarily a foaf:Organization:
ex:John ex:employer ex:CompanyX
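The domain and range declarations above act as inference rules rather than as validation
checks. The sketch below applies them to the ex:employer example; the function name
infer_types and the tuple representation are assumptions of this illustration.

```python
RDF_TYPE = "rdf:type"

# Schema-level statements from the example above.
schema = [
    ("ex:employer", "rdfs:domain", "foaf:Person"),
    ("ex:employer", "rdfs:range", "foaf:Organization"),
]

# Instance-level data.
data = [("ex:John", "ex:employer", "ex:CompanyX")]


def infer_types(schema, data):
    """Derive rdf:type triples from rdfs:domain and rdfs:range declarations."""
    inferred = set()
    for subject, predicate, obj in data:
        for prop, rule, cls in schema:
            if prop != predicate:
                continue
            if rule == "rdfs:domain":
                inferred.add((subject, RDF_TYPE, cls))  # type of the subject
            elif rule == "rdfs:range":
                inferred.add((obj, RDF_TYPE, cls))      # type of the object
    return inferred


for triple in sorted(infer_types(schema, data)):
    print(triple)
```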
rdfs:subPropertyOf is an instance of rdf:Property that is used to state that all resources
related by one property are also related by another.
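A sketch of the rdfs:subPropertyOf rule: every pair related by a sub-property is also related
by its super-property. The property ex:affiliatedWith is hypothetical and is introduced here
only so that ex:employer has something to be a sub-property of.

```python
# Hypothetical declaration: ex:employer rdfs:subPropertyOf ex:affiliatedWith
sub_property_of = {"ex:employer": {"ex:affiliatedWith"}}

data = {("ex:John", "ex:employer", "ex:CompanyX")}


def apply_subproperties(sub_property_of, data):
    """Add (s, super, o) for every (s, sub, o) until nothing new appears."""
    result = set(data)
    changed = True
    while changed:
        changed = False
        for subject, predicate, obj in list(result):
            for parent in sub_property_of.get(predicate, ()):
                candidate = (subject, parent, obj)
                if candidate not in result:
                    result.add(candidate)
                    changed = True
    return result


for triple in sorted(apply_subproperties(sub_property_of, data)):
    print(triple)
```

Repeating the loop until no new triple is added also covers chains of sub-properties, since
each derived triple can trigger a further derivation.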
Example statement: "Abhijit stays in Pune."
Figure 5. RDF Example
RDF triple: (Abhijit, stays in, Pune)
This can be mapped to a schema which contains the classes "Citizen" and "Country". If a
citizen "abc" stays in a country "X", then "X" also involves "abc".
The class Citizen has subclasses "Voting citizen" and "non-voting citizen", and the Country
class has the subclass "states", which in turn has subclasses
"city", "town" and "taluka", represented by the "subClassOf" property.
Rectangles represent properties, ellipses in the RDFS layer represent classes, while
ellipses in the RDF layer represent instances. The domain and range enforce constraints on
the subjects and objects of a property.
So, the above diagram suggests that the subject (Abhijit Thatte) is a "type" of voting citizen,
the object (Pune) is a "type" of city, and the relationship between them is "stays in" or
"resides in".
4.4 OWL:
The Web Ontology Language (OWL) is a family of knowledge representation languages for
authoring ontologies endorsed by the World Wide Web Consortium. They are characterised
by formal semantics and RDF/XML-based serializations for the Semantic Web. OWL has
attracted academic, medical and commercial interest.
In October 2007, a new W3C working group was started to extend OWL with several new
features as proposed in the OWL 1.1 member submission. This new version, called OWL 2,
soon found its way into semantic editors such as Protégé and semantic reasoners such as
Pellet, RacerPro and FaCT++. W3C announced the new version on 27 October 2009.
The OWL family contains many species, serializations, syntaxes and specifications with
similar names. This may be confusing unless a consistent approach is adopted. OWL and
OWL2 will be used to refer to the 2004 and 2009 specifications, respectively. Full species
names will be used, including specification version (for example, OWL2 EL). When referring
more generally, OWL Family will be used.
The data described by an ontology in the OWL family is interpreted as a set of "individuals"
and a set of "property assertions" which relate these individuals to each other. An ontology
consists of a set of axioms which place constraints on sets of individuals (called "classes")
and the types of relationships permitted between them. These axioms provide semantics by
allowing systems to infer additional information based on the data explicitly provided. A full
introduction to the expressive power of OWL is provided in the W3C's OWL Guide.
Example:
An ontology describing families might include axioms stating that a "hasMother" property is
only present between two individuals when "hasParent" is also present, and individuals of
class "HasTypeOBlood" are never related via "hasParent" to members of the
"HasTypeABBlood" class. If it is stated that the individual Harriet is related via "hasMother"
to the individual Sue, and that Harriet is a member of the "HasTypeOBlood" class, then it can
be inferred that Sue is not a member of "HasTypeABBlood".
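The inference in this example can be traced by hand in Python. The sketch below is an assumption-laden illustration, not how an OWL reasoner such as Pellet works internally: the two axioms are hard-coded as simple lookup structures, and the names SUBPROPERTY_OF and EXCLUSIONS are invented for this sketch.

```python
# Minimal hand-coded sketch of the family-ontology inference.
# A real OWL reasoner is far more general; this mirrors only the two axioms above.

# Axiom 1: hasMother is only present where hasParent is also present.
SUBPROPERTY_OF = {"hasMother": "hasParent"}

# Axiom 2: (subject class, property, excluded object class).
# Individuals of the subject class are never related via the property
# to members of the excluded class.
EXCLUSIONS = [("HasTypeOBlood", "hasParent", "HasTypeABBlood")]

# Explicitly stated facts.
property_assertions = {("Harriet", "hasMother", "Sue")}
class_assertions = {("Harriet", "HasTypeOBlood")}


def infer(properties, classes):
    """Return (expanded property assertions, inferred non-memberships)."""
    # Step 1: every hasMother assertion also holds as hasParent.
    expanded = set(properties)
    for subject, prop, obj in properties:
        if prop in SUBPROPERTY_OF:
            expanded.add((subject, SUBPROPERTY_OF[prop], obj))

    # Step 2: apply the exclusion axiom to derive negative class memberships.
    not_member = set()
    for subject, prop, obj in expanded:
        for subject_class, excluded_prop, excluded_class in EXCLUSIONS:
            if prop == excluded_prop and (subject, subject_class) in classes:
                not_member.add((obj, excluded_class))
    return expanded, not_member


all_properties, non_memberships = infer(property_assertions, class_assertions)
print(("Harriet", "hasParent", "Sue") in all_properties)
print(("Sue", "HasTypeABBlood") in non_memberships)
```

Neither derived fact was stated explicitly; both follow from the axioms, which is the sense in which axioms "provide semantics".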
4.5 Microformat:
A microformat (sometimes abbreviated µF) is a web-based approach to semantic markup that
seeks to re-use existing HTML/XHTML tags to convey metadata and other attributes in web
pages and other contexts that support (X)HTML, such as RSS. This approach allows
information intended for end-users (such as contact information, geographic coordinates,
calendar events, and the like) to also be automatically processed by software.
Although the content of web pages is technically already capable of "automated processing,"
and has been since the inception of the web, such processing is difficult because the
traditional markup tags used to display information on the web do not describe what the
information means. Microformats are intended to bridge this gap by attaching semantics, and
thereby obviate other, more complicated, methods of automated processing, such as natural
language processing or screen scraping. The use, adoption and processing of microformats
enable data items to be indexed, searched for, saved or cross-referenced, so that information
can be reused or combined.
Current microformats allow the encoding and extraction of events, contact information, social
relationships and so on. More are being developed. Version 3 of the Firefox browser and
version 8 of Internet Explorer were expected to include native support for microformats.
Microformats emerged as part of a grassroots movement to make recognizable data items
(such as events, contact details or geographical locations) capable of automated processing by
software, as well as directly readable by end-users. Link-based microformats emerged first.
These include vote links that express opinions of the linked page, which can be tallied into
instant polls by search engines.
Neither CommerceNet nor Microformats.org is a standards body. The microformats
community operates through an open wiki, a mailing list, and an Internet Relay Chat (IRC)
channel. Most of the existing microformats were created at the Microformats.org wiki and
associated mailing list by a process of gathering examples of web publishing behaviour and
then codifying it. Some other microformats (such as rel=nofollow and unAPI) have been
proposed, or developed, elsewhere.
Example:
In this example, the contact information is presented as follows:
<div>
<div>Joe Doe</div>
<div>The Example Company</div>
<div>604-555-1234</div>
<a href="http://example.com/">http://example.com/</a>
</div>
With hCard microformat markup, that becomes:
<div class="vcard">
<div class="fn">Joe Doe</div>
<div class="org">The Example Company</div>
<div class="tel">604-555-1234</div>
<a class="url" href="http://example.com/">http://example.com/</a>
</div>
Here, the formatted name (fn), organisation (org), telephone number (tel) and web address (url)
have been identified using specific class names, and the whole thing is wrapped in
class="vcard", which indicates that the other classes form an hCard (short for "HTML vCard")
and are not merely coincidentally named. Other, optional, hCard classes also exist. It is now
possible for software, such as browser plug-ins, to extract the information and transfer it to
other applications, such as an address book.
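As a rough sketch of how such software might pull the fields out, the following Python uses only the standard-library html.parser module. The class name HCardExtractor is hypothetical, and the parser handles just the four class names discussed above; a production hCard parser would need to cover the full specification.

```python
from html.parser import HTMLParser

# The hCard markup from the example above.
HCARD_HTML = """
<div class="vcard">
  <div class="fn">Joe Doe</div>
  <div class="org">The Example Company</div>
  <div class="tel">604-555-1234</div>
  <a class="url" href="http://example.com/">http://example.com/</a>
</div>
"""


class HCardExtractor(HTMLParser):
    """Collect the text of fn/org/tel/url elements found inside a vcard.

    Simplification: assumes every start tag has a matching end tag,
    so void elements such as <br> inside the card are not supported.
    """

    FIELDS = {"fn", "org", "tel", "url"}

    def __init__(self):
        super().__init__()
        self.depth_in_vcard = 0    # nesting depth; 0 means outside any vcard
        self.current_field = None  # hCard field of the element being read
        self.card = {}

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.depth_in_vcard:
            self.depth_in_vcard += 1
            matches = self.FIELDS.intersection(classes)
            if matches:
                self.current_field = sorted(matches)[0]
        elif "vcard" in classes:
            self.depth_in_vcard = 1

    def handle_endtag(self, tag):
        if self.depth_in_vcard:
            self.depth_in_vcard -= 1
            self.current_field = None

    def handle_data(self, data):
        if self.current_field and data.strip():
            self.card[self.current_field] = data.strip()


parser = HCardExtractor()
parser.feed(HCARD_HTML)
print(parser.card)
```

Because field names are only recognised inside an element carrying class="vcard", an unrelated element that happens to use class="tel" elsewhere on the page is ignored, which is the point of the wrapper class.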
5. Challenges
1. Vastness: The World Wide Web contains at least 4 billion pages as of this writing (August
2, 2009). The SNOMED CT medical terminology ontology contains 370,000 class names,
and existing technology has not yet been able to eliminate all semantically duplicated terms.
Any automated reasoning system will have to deal with truly huge inputs.
2. Vagueness: These are imprecise concepts like "young" or "tall". Vagueness arises from
user queries, from concepts represented by content providers, from matching query terms to
provider terms, and from trying to combine different knowledge bases with overlapping
but subtly different concepts. Fuzzy logic is the most common technique for dealing with
vagueness.
3. Uncertainty: These are precise concepts with uncertain values. For example, a patient
might present a set of symptoms which correspond to a number of different distinct diagnoses,
each with a different probability. Probabilistic reasoning techniques are generally employed
to address uncertainty.
4. Inconsistency: These are logical contradictions which will inevitably arise during the
development of large ontologies. Deductive reasoning fails catastrophically when faced with
inconsistency, because "anything follows from a contradiction".
5. Deceit: This is when the producer of the information is intentionally misleading the
consumer of the information. Cryptography techniques are currently utilized to alleviate this
threat.
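To make the vagueness item concrete, a fuzzy-logic treatment replaces the yes/no question "is this person young?" with a degree of membership between 0 and 1. The sketch below is only an illustration: the function name and the breakpoints of 25 and 45 years are arbitrary assumptions, not values taken from any standard.

```python
def young_membership(age, fully_young=25.0, not_young=45.0):
    """Degree (0.0 to 1.0) to which `age` counts as "young".

    The breakpoints are arbitrary assumptions chosen for illustration:
    at or below `fully_young` the degree is 1.0, at or above `not_young`
    it is 0.0, and it falls linearly in between.
    """
    if age <= fully_young:
        return 1.0
    if age >= not_young:
        return 0.0
    return (not_young - age) / (not_young - fully_young)


for age in (20, 30, 35, 50):
    print(age, young_membership(age))
```

A query for "young" people could then rank results by this degree instead of applying a hard cut-off at a single age.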
6. Project Implementation:
This section provides some example projects and tools, but is very incomplete. The choice of
projects is somewhat arbitrary but may serve illustrative purposes. It is also remarkable that
in this early stage of the development of semantic web technology, it is already possible to
compile a list of hundreds of components that in one way or another can be used in building
or extending semantic webs.
A) DBPEDIA
DBpedia is an effort to publish structured data extracted from Wikipedia: the data is
published in RDF and made available on the Web for use under the GNU Free
Documentation License, thus allowing Semantic Web agents to provide inferencing and
advanced querying over the Wikipedia-derived dataset and facilitating interlinking, re-use
and extension in other data-sources.
B) FOAF
A popular application of the semantic web is Friend of a Friend (or FOAF), which uses RDF to
describe the relationships people have to other people and the "things" around them. FOAF
permits intelligent agents to make sense of the thousands of connections people have with
each other, their jobs and the items important to their lives; connections that may or may not
be enumerated in searches using traditional web search engines. Because the connections are
so vast in number, human interpretation of the information may not be the best way of
analysing them.
FOAF is an example of how the semantic Web attempts to make use of the relationships
within a social context.
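The kind of connection-following that FOAF enables can be sketched with a handful of triples. In this illustration foaf:knows is the FOAF property for acquaintance, while the people and the helper function names are hypothetical; plain tuples stand in for an RDF store.

```python
# Minimal sketch of walking foaf:knows links stored as plain triples.
# The people named here are hypothetical examples.
TRIPLES = [
    ("alice", "foaf:knows", "bob"),
    ("alice", "foaf:knows", "erin"),
    ("bob", "foaf:knows", "carol"),
    ("bob", "foaf:knows", "dave"),
    ("erin", "foaf:knows", "carol"),
]


def friends(person, triples):
    """People that `person` directly knows."""
    return {o for s, p, o in triples if s == person and p == "foaf:knows"}


def friends_of_friends(person, triples):
    """People known by a friend of `person`, excluding direct friends."""
    direct = friends(person, triples)
    indirect = set()
    for friend in direct:
        indirect |= friends(friend, triples)
    return indirect - direct - {person}


print(sorted(friends("alice", TRIPLES)))
print(sorted(friends_of_friends("alice", TRIPLES)))
```

An agent can repeat this step to any depth, which is how connections that are too numerous for a human to inspect can still be analysed mechanically.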
C) GOODRELATIONS FOR E-COMMERCE

A huge potential for Semantic Web technologies lies in adding data structure and typed links
to the vast amount of offer data, product model features, and tendering/request for quotation
data.
The GoodRelations ontology is a popular vocabulary for expressing product information,
prices, payment options, etc. It also allows expressing demand in a straightforward fashion.
GoodRelations has been adopted by Best Buy, Yahoo, OpenLink Software, O'Reilly Media,
the Book Mashup, and many others.
D). SIOC
The SIOC (Semantically-Interlinked Online Communities) project provides a vocabulary of
terms and relationships that model web data spaces. Examples of such data spaces include,
among others: discussion forums, weblogs, blogrolls/feed subscriptions, mailing lists, shared
bookmarks, and image galleries.
E). SIMILE
SIMILE (Semantic Interoperability of Metadata and Information in unLike Environments)
is a joint project, conducted by the MIT Libraries and MIT CSAIL, which seeks to
enhance interoperability among digital assets, schemata/ontologies, metadata, and services.
F). NEXTBIO
NextBio is a database consolidating high-throughput life sciences experimental data tagged
and connected via biomedical ontologies. NextBio is accessible via a search engine interface.
Researchers can contribute their findings for incorporation into the database. The database
currently supports gene or protein expression data and is steadily expanding to support other
biological data types.
G). LINKING OPEN DATA
(Figure: Datasets in the Linking Open Data project, as of Sept 2008)
(Figure: Class linkages within the Linking Open Data datasets)
The Linking Open Data project is a W3C-led effort to create openly accessible and
interlinked RDF data on the Web. The data in question takes the form of RDF data sets
drawn from a broad collection of data sources. There is a focus on the Linked Data style of
publishing RDF on the Web.
H). OPENPSI
OpenPSI (the OpenPSI project) is a community effort to create a UK government linked data
service that supports research. It is a collaboration between the University of Southampton
and the UK government, led by OPSI at the National Archives, and is supported by JISC
funding.
I). ERFGOEDPLUS.BE
Erfgoedplus.be ('heritage-plus') is a Belgian project aimed at disclosing all types of heritage
from the provinces of Limburg and Flemish Brabant and the city of Leuven to the public by
applying semantic web technology. Erfgoedplus.be uses RDF/XML, OWL and SKOS to
describe relationships to heritage types, concepts, objects, people, place and time. Data are
normalized and enriched by means of thesauri (AAT) and an ontology (CIDOC CRM),
available for input, conversion and navigation.
Erfgoedplus.be is a regional aggregator for EuropeanaLocal (Europeana) and an example of
how semantic web technology is applied within the heterogeneous context of heritage.
6.1 Practical Illustration of Semantic Web Application:

Suppose that a certain Professor Anjali Sharma wishes to make a web site of her own,
encompassing a faculty page, a research page, a blog site and a staff listing page. Using
traditional web modelling, the pages would look like so:
Now, if she decides to use the semantic web instead of the traditional web model, the
complexity and presentability of the web pages would increase immensely. We can link
Professor Sharma's faculty page to her research, and then link the data in her blog to both of
these. We can also link profile data to her staff listing, which could show some of the
other academics she works with, while her research page shows her links with worldwide
research collaborators, who also know one of her colleagues and comment on Professor
Sharma's blog regularly. Because all this data can be displayed simply, it provides a
much richer user experience and offers information that previously might not have been
exposed. The web page would now look like:
7. ADVANTAGES & DISADVANTAGES
7.1 ADVANTAGES
• The semantic web will make search tasks faster and easier.
• It will make searches more personalized.
• A semantic web browser will act as a personal assistant.
• It is usually very professional looking.
• The Internet will be personalized to a greater extent.
7.2 DISADVANTAGES
• Web 3.0 will be inaccessible to less advanced gadgets.
• It is too complex.
• The technology is not yet fully ready for it.
• It is simple to obtain information about a user's public and private life.
• More time will be spent on the Internet.
• Privacy policies will be necessary.
8. CONCLUSION
The Semantic Web is the future of the Internet. It is expected to rewrite the Internet as
we know it and change the way we search for information on the net. Searches will become
personalized, and the results will be more accurate and more relevant. The use of the Resource
Description Framework and microformats will help in the advent of this technology.
Although many challenges have to be overcome in order to do so, the prospects of this
technology overcoming and replacing the traditional web model currently seem bright.
The traditional model of the Internet does not allow for intelligent searches and takes a lot of
time because irrelevant results are displayed too. The Semantic Web can overcome these
problems to provide a better and richer user experience to consumers all over the globe. The
next generation of the web will better connect people and will further advance the information
technology revolution.