KMST Project Report 2018


Project Idea: Interactive Timeline of NASA Space Missions
Information related to around 8050 space explorations has been made available in semi-structured form on the NASA portal [1]. Though the available information is very accurate, there are certain gaps. The following are some examples:
● Whether a mission is manned or unmanned
● For manned missions, who were the astronauts
● What were the mission goals
These gaps can be filled up by
● Extracting data from the portal itself, or
● Linking entities to other linked open datasets

The dataset thus generated may be used to develop a learning application that aims at providing space mission related information to learners.

Project Objective
The objective of this project is to
● Develop an interactive interface to provide space mission related information to the
learners.

Pedagogic Objective
We aim at exploring the following spectrum of Semantic Web technologies:

[Figure: spectrum of Semantic Web technologies: Knowledge …, Knowledge …, Linking Datasets, Retrieval in KG, Semantic Web]

Features
A list of application features is presented below. However, do not limit your
imagination to the enumerated set; think of other interesting features as well.
1. Space Voyage Linked Data: Space mission data is available in the NASA Portal [1] in a
semi-structured format. Generation of linked data from semi-structured data involves:
a. Development of a data model or ontology
b. Web scraping the space mission pages that follow a fixed template
c. Extracting data from the infobox
d. Extracting data from the description text; this will include semantic tagging with
the DBpedia Spotlight service [10].
The created dataset has to be represented in RDF and stored in an RDF store like
Blazegraph [11]. The hosted data should be accessible via SPARQL and REST
endpoints.
2. Timeline View: Different space missions are to be ordered on a timeline. The timeline
can be zoomed in or zoomed out. Each entry in the timeline will be attached with a
Knowledge Card representing basic info about a mission. Please see Histropedia [2]
for reference.

3. Map-based Visualizations: Entities present in the tasks are related to different geo
locations, for example, the country of origin of different astronauts, the launch sites of
different spacecraft, and others. The visualization interface overlays entities on
appropriate geo locations on Google Maps. In this context, the Exhibit platform [4] may be
explored.
4. Connect with DBpedia: The information about an entity can be enriched by
connecting to other datasets. For example, a spacecraft entity can be linked to the
associated page in DBpedia or YAGO to fetch information about the astronauts,
celestial bodies, etc.
5. Faceted Search and Browse: The collected information may be searched or
browsed in different ways. Both services will be entity driven. The search
interface should provide an autocomplete feature. The search or browse results should
be presented in a faceted interface where the results can be filtered. Finally, the result
links should point to the corresponding Wikipedia page. One may also like to explore
how gFacet [3] can be used; gFacet is a new way of browsing RDF data by
combining graph visualization and faceted browsing.
6. Complex Query Interface: An RDF data store can be queried using the SPARQL query
language. However, writing a SPARQL query requires technical knowledge which
may not be possessed by all users. This feature can be implemented as an
interactive and visual query editor that interacts with the relevant dataset and supports
the user in specifying her query. Examples of such systems are FedViz [5][6].
7. Question Answering over Linked Data (QALD): Sometimes users would like to pose
a natural language query to the system, for example, “What are the space missions
Yuri Gagarin was involved with?”. The task here is to translate the natural language
query into a SPARQL query and retrieve the answer from an RDF store; a hedged
sketch of the kind of target query is given after this feature list. There have
been several challenges on this particular task at different Semantic Web
conferences. A detailed description of the task and the related review is presented in [7]. For a
simple implementation of the task, the work presented in [8] can be referred to.

8. Automatic Quiz Agent: This will add fun to the application. The developed dataset
along with DBpedia and Wikidata can be used to automatically generate quizzes.
The generated questions will be MCQs. The questions will be categorized based on
classes or entities. The learners would be provided hints when asked for; the hints
will again be generated from the dataset. The Quiz App can be deployed in a game
mode where the learners gain points. You may think of a baseline implementation
as presented in [9].
9. Integration: Some effort has to be put in to stitch together the different components.
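
As an illustration of the translation targeted by feature 7, the example question above could map to a SPARQL query along the lines of the following sketch. This is only an assumption for illustration: it queries the public DBpedia endpoint and assumes the dbo:mission property links astronauts to their missions; the vocabulary of the finally generated dataset may differ.

# Hedged sketch (Python + SPARQLWrapper): the kind of SPARQL query that the
# example question above might be translated into. dbo:mission is an assumed
# DBpedia property; adapt to the actual dataset vocabulary.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX dbr: <http://dbpedia.org/resource/>
    PREFIX dbo: <http://dbpedia.org/ontology/>
    SELECT ?mission WHERE { dbr:Yuri_Gagarin dbo:mission ?mission . }
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["mission"]["value"])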
References
[1] NASA Space Missions: https://nssdc.gsfc.nasa.gov/nmc/spacecraftSearch.do

[2] Histropedia timeline on ‘Discoveries in Mathematics’: http://histropedia.com/timeline/466wbb666m/Discoveries-in-mathematics

[3] Visually Experiencing Data Web: http://www.visualdataweb.org/gfacet.php

[4] Exhibit Platform: http://www.simile-widgets.org/exhibit/

[5] Zainab et al., FedViz: A Visual Interface for SPARQL Queries Formulation and Execution, Workshop on Visualizations and User Interfaces for Ontologies and Linked Data, co-located with the 14th International Semantic Web Conference (ISWC 2015). URL: http://svn.aksw.org/papers/2015/ISWC-VOILA_FedViz/public.pdf

[6] Damljanovic et al., FREyA: an Interactive Way of Querying Linked Data Using Natural Language, 1st Workshop on Question Answering over Linked Data (QALD-1), collocated with the 8th Extended Semantic Web Conference (ESWC 2011). URL: https://pdfs.semanticscholar.org/4d7b/a77451f4036a13ffb0d8bf29a6d971d9bfaa.pdf

[7] Christina Unger et al., An Introduction to Question Answering over Linked Data, Reasoning Web 2014. URL: http://andrefreitas.org/papers/preprint_introduction_qa_linked_data.pdf

[8] Ngonga et al., Sorry, I don’t speak SPARQL – Translating SPARQL Queries into Natural Language, 22nd International Conference on World Wide Web (WWW 2013). URL: http://jens-lehmann.org/files/2013/www_sparql2nl.pdf

[9] Swe Geng, Drawing Questions from Wikidata, Bachelor Thesis, Computer Engineering and Networks Laboratory, ETH Zurich. URL: https://pub.tik.ee.ethz.ch/students/2015-FS/BA-2015-13.pdf

[10] DBpedia Spotlight: https://github.com/dbpedia-spotlight/dbpedia-spotlight

[11] BlazeGraph: https://www.blazegraph.com/

[12] NASA Taxonomy: http://www.taxonomystrategies.com/wp-content/uploads/2017/01/NASA_Taxonomy.pdf
Group 2 - Space Voyage Linked Data

Project Description: Generation of linked data from semi-structured data on space missions
available on Wikipedia and Wikidata.

Steps:
1. Defining the ontology or data model which describes the domain of interest.
Implementation:
● Understood the domain of interest and defined an informal description of the
domain
● Explored various sites to understand the possible data extractions and
relationships
● Prepared a data model using RDF and RDFS
● Space_Mission was the main class used
● Domain and range of each property were defined
● Properties like orbit period, inclination, epoch, reference, periselene, etc. were
defined as subproperties of OrbitalParameter
● Properties like vehicleManufacturer, maxCapacity, launchMass, decayDate, etc.
were defined as subproperties of VehicleParameter
● Some properties have not been shown in the knowledge graph below but were
extracted based on the availability of data on Wikipedia
2. Web scraping the space mission pages by defining extractors, which are constructed
considering the structure of the web pages
Implementation:
● Extracted the list of all hyperlinks from the Wikipedia page “List of NASA
Missions” ( https://en.wikipedia.org/wiki/List_of_NASA_missions )
● Extracted the Wikipedia infobox of all the space missions in the Wikipedia page
despite the highly unstructured manner in which the content was present
● Dumped the data into a CSV file
● Tools used: Python BeautifulSoup and Requests for web scraping

3. Extracting data in the form of RDF triples and their annotation

Implementation
● Converted the data from the CSV to RDF triples
● Extracted additional data in the form of RDF triples from Wikidata using the
Wikidata SPARQL endpoint
● Stored the data obtained from Wikidata in a CSV and converted it into RDF
triples with the namespace given by us
● Joined the two sources of data using a common namespace (a minimal sketch of
this step is given below)
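
A minimal sketch of steps 1 and 3, assuming the hypothetical namespace http://example.org/kmst/, a hypothetical CSV layout, and a couple of the properties named above; the real conversion handled all the scraped infobox fields:

# Hedged sketch of the data model and CSV-to-RDF conversion described above.
# The namespace URI, file names and CSV column names are assumptions.
import csv
from rdflib import Graph, Literal, Namespace, RDF, RDFS

KMST = Namespace("http://example.org/kmst/")   # assumed project namespace
g = Graph()
g.bind("kmst", KMST)

# Data model: Space_Mission as the main class, with orbit- and vehicle-related
# properties grouped via rdfs:subPropertyOf.
g.add((KMST.Space_Mission, RDF.type, RDFS.Class))
g.add((KMST.orbitPeriod, RDFS.subPropertyOf, KMST.OrbitalParameter))
g.add((KMST.inclination, RDFS.subPropertyOf, KMST.OrbitalParameter))
g.add((KMST.launchMass, RDFS.subPropertyOf, KMST.VehicleParameter))

# Conversion of the scraped CSV into triples under the common namespace.
with open("nasa_missions.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        mission = KMST[row["mission"].replace(" ", "_")]
        g.add((mission, RDF.type, KMST.Space_Mission))
        if row.get("orbit_period"):
            g.add((mission, KMST.orbitPeriod, Literal(row["orbit_period"])))
        if row.get("launch_mass"):
            g.add((mission, KMST.launchMass, Literal(row["launch_mass"])))

g.serialize(destination="nasa_missions.ttl", format="turtle")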
Scope of Improvement:
1. The final dataset obtained can be made more structured by including more
classes, restrictions, and OWL concepts.
2. There is a need for a standard source of data for extraction, as Wikipedia did not have
very structured data.
3. Making the dataset richer and more uniform with respect to the availability of values for
various properties across different space missions.
KMST Group 5 Report

Project: Interactive Timeline of NASA Space Missions


Task: Map Based Visualisation

Project (Task) Description:
This project aims to provide an interactive timeline of space
missions. The task assigned to us is Map Based Visualisation. Different entities
present in the task are related to different geographical locations. The task was
to build a visualisation interface which overlays these entities on the corresponding
geographical locations.

Implementation Details:
The implementation was done in three phases. The following is the
implementation done in each phase:
i) Phase 1:
In this phase, information was obtained from DBpedia and
Wikidata. The required information was obtained from the results of
SPARQL queries written against the DBpedia and Wikidata SPARQL
endpoints by looking at various relations present in their datasets.
ii) Phase 2:
In this phase, the information was mapped onto Google Maps
using the Exhibit platform. To understand this platform, its
documentation was consulted. The Exhibit platform takes a data file
whose link should be mentioned in the head content of the HTML page.
Each entry (row) in the data file must have a label and its coordinates to
plot on the Google Map (a rough sketch of generating such a data file
is given after this list). As it was expected to show the launch place
of different space missions and the birth places of different astronauts, it
was decided to have two web pages.
iii) Phase 3:
In this final phase, a GUI was constructed. A search option and
filters were added. The DBpedia query results (CSV format) link was
used as the link to the data. Wikidata did not provide a link to the results
in CSV format, and handling other formats in Exhibit dynamically is
difficult, so the Wikidata results were stored in a file, which was also
provided as a link to the data. Finally, the website was hosted with
the help of GitHub Pages.
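
The following is a rough sketch, under assumed property names (dbo:launchSite, geo:lat/geo:long) and an assumed output layout, of how an Exhibit-style data file with labels and coordinates could be produced from a DBpedia SPARQL query; in the project itself, the CSV result links described above were used directly.

# Hedged sketch: query launch sites with coordinates from DBpedia and write
# an Exhibit-style JSON data file (one item per mission, with a label and a
# "latLng" string that the map view can be configured to read).
import json
import requests

QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT ?mission ?site ?lat ?long WHERE {
    ?mission dbo:launchSite ?site .
    ?site geo:lat ?lat ; geo:long ?long .
} LIMIT 100
"""

resp = requests.get(
    "https://dbpedia.org/sparql",
    params={"query": QUERY, "format": "application/sparql-results+json"},
)
items = []
for b in resp.json()["results"]["bindings"]:
    items.append({
        "label": b["mission"]["value"].rsplit("/", 1)[-1],
        "launchSite": b["site"]["value"].rsplit("/", 1)[-1],
        "latLng": f'{b["lat"]["value"]},{b["long"]["value"]}',
    })

with open("missions.json", "w", encoding="utf-8") as f:
    json.dump({"items": items}, f, indent=2)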

System Features:
The first webpage shows various space missions plotted at their
launch places on Google Maps. When a lens is clicked, it shows the name
of the space mission, the related space agency, its launch place, the involved astronauts
and their birth places if they exist, and a link to its DBpedia or Wikidata entry. There is also
an option to view the data in tabular format. The right side of the webpage has
search and filter options. Any term in the data can be searched by entering the
term in the search bar. There are two filter options: one filters based on the
space agency involved in the mission, the other filters based on the launch place of
the space mission. When a filter is added, the points on the map and the tabular data
are automatically updated.
The second webpage shows various astronauts plotted at their
birth places on Google Maps. When a lens is clicked, it shows the name of
the astronaut, the space missions involved, the related space agencies, their launch
places, and a link to the astronaut's DBpedia or Wikidata entry. This webpage also has an
option to view the data in tabular format. The right side of the webpage has search
and filter options. Any term in the data can be searched by entering the term in
the search bar. There are two filter options: one filters based on the birth place
of the astronaut, the other filters based on the space missions the astronaut was involved in.
When a filter is added, the points on the map and the tabular data are automatically updated.
Each webpage also has a link to the other.

Screenshots:
The following are the screenshots of the two webpages.

We hope that this application helps the intended learners.

References:
1) Exhibit Documentation: http://simile-widgets.org/wiki/Exhibit
2) GitHub:
https://github.com/sampreeth1999/sampreeth1999.github.io/tree/mast
er/SpaceMissionsMapView
3) Space Missions Map View
4) Astronauts Map View

Group Members:
• 16CS10004 Aitipamula Aravind
• 16CS10010 Boddu Mahesh
• 16CS10027 Inkulu Sampreeth
• 16CS10035 Mulumudy Rohith
• 16CS30039 Veligeti Vineeth
Interactive Timeline of NASA Space Missions
Faceted Search and Browse

Objective: Implement faceted search and browse to produce an interactive
interface for filtering and searching NASA's space mission-related information for learners.

Our Timeline:
1. Query Development: We had to develop a method to implement the search
technique, essentially a backend to the faceted search and browse.
2. GUI Development: We had to develop a suitable GUI for the faceted search and
browse interface, which would include the search bar to start the search, the
facets to filter the search and, later, display the results.
3. Integration: We had to integrate the GUI developed in stage 2 with the query and
search method developed in stage 1.

Development:

1. Stage 1:

We experimented with various approaches to develop queries or index the literals in
order to implement the faceted search and browse.

First, we tried indexing the literals with the Apache Solr search engine. We later realized
that this procedure did not yield the required results: indexing the literals with Solr
indexes only the subject of each literal and does not establish the interlinking unless
the data is in XML format, which is not a practical format for a dataset of an
appreciable size (although the dataset size at the end was rather small).

Then we tried indexing a CSV file using Pandas in Python, but that approach was
really slow and was more of a brute-force search than an indexing.

Lastly, we visited https://query.wikidata.org, where we were able to use the GUI to
deduce literals that could be used in a SPARQL query. This approach was fast,
accurate, and the best suited for implementing faceted search and browse for the
limited-size dataset available to us.

2. Stage 2:

We developed a fairly standard GUI using HTML, CSS and JavaScript that could
execute search queries from the front-end JavaScript files. This GUI has a search bar,
facets (range sliders and checkboxes) and an area for displaying the results.

3. Stage 3:

In the end, we used SPARQL queries to achieve faceted search and browse. This was
done by using the query triples we obtained from Wikidata.

We were then left with the aim of implementing the autocomplete feature in the search bar
and of linking each search result to the corresponding Wikipedia page.

For the former, if the string typed in the search bar is “ABC”, we include a
regular expression of the form “ABC*”, which essentially means the string ABC followed
by any text, and display the matches in a drop-down menu. The user then selects one of
the autocomplete results, and the results are displayed on the webpage (a sketch of such a
query is given at the end of this Development section).

For the latter, we added an extra query for the Wikipedia link using a new
namespace schema (obtained from query.wikidata.org) that was used to look up the
corresponding Wikipedia link from the dataset and display it in the front end.
All these queries are generated in the front end of the browser for speed, and the
results are displayed right away.

In our case, the search query (not the autocomplete, as that is a separate query) is
executed first, and then the results are filtered with respect to the filters.
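
A hedged sketch of the “ABC*” autocomplete idea against the Wikidata endpoint, written here in Python for brevity (the project issued similar queries from front-end JavaScript). The class QID used to restrict the results is an assumption and should be replaced by whatever class the dataset actually uses.

# Hedged sketch: autocomplete against Wikidata labels using a regex filter,
# mirroring the "ABC*" behaviour described above.
import requests

def autocomplete(prefix, limit=10):
    query = f"""
    SELECT DISTINCT ?item ?itemLabel WHERE {{
        ?item wdt:P31 wd:Q5916 ;              # assumed class QID for "spaceflight"
              rdfs:label ?itemLabel .
        FILTER(LANG(?itemLabel) = "en")
        FILTER(REGEX(STR(?itemLabel), "^{prefix}", "i"))   # "ABC*" style match
    }} LIMIT {limit}
    """
    resp = requests.get(
        "https://query.wikidata.org/sparql",
        params={"query": query, "format": "json"},
        headers={"User-Agent": "kmst-faceted-search-sketch"},
    )
    return [b["itemLabel"]["value"]
            for b in resp.json()["results"]["bindings"]]

print(autocomplete("Apo"))   # e.g. Apollo missions, if the class QID matches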

Possible Future Work:

In the future, we aim to build filters in which the query itself changes with the filters. This
can be done essentially by string manipulation.
RESULTS:

The search result

The autocomplete feature (here, the background is not appropriately visible due to a formatting
error which had not occurred during the demo phase)

Group Members:
C Rajesh Khanna - 16CS10014
G Rahul Kranti Kiran - 16CS10018
Rakesh Bal - 16CS10043
Ayan Zunaid - 16CS30006
Mehul Kumar Nirala - 16CS10034
Complex Query Interface
Our term project, the Complex Query Interface, is one of the modules of the Interactive Timeline of
NASA Space Missions project. Our task was to simplify the process of querying an RDF data store via
SPARQL using an interactive visual query editor, because writing a SPARQL query requires
technical knowledge which may not be possessed by all users. An example of such a system is FedViz.
Our module was further sub-divided into three phases, namely:

• Phase 1: We stored all relevant RDF triples related to the space missions from different
sources (DBpedia, Wikidata) into the database (Blazegraph).
• Phase 2: We developed an interactive and visual query editor from which the user can specify
their query.
• Phase 3: We converted the provided visual query into a SPARQL query, executed it over the database
endpoint, and provided the results.
Phase 1:
We downloaded data from https://data.nasa.gov/ontologies/atmonto/NAS, which contains
ontologies and database instances of flights, aircraft, manufacturers, etc. We then proceeded to
download and install Blazegraph. We decided to use the equipment dataset and were successful
in placing queries and retrieving results.
Phase 2:
For phase 2, deciding upon the type of visuals to go for, which would maximise user interactivity and
productivity, required quite a bit of research. We ended up with a UI that allows users to
convert their natural queries to a graph-like structure with directed edges. Such a structure is quite
intuitive for laymen and also serves as an intermediate representation for us, as it breaks down
complex queries into simpler subparts.

Figure 1: A sample of the graphical UI

The library used for this purpose is mxGraph. Since it is written purely in JavaScript, it gave us a lot of
flexibility in terms of implementing our own custom functions and requirements. The graphs thus created
are free-flowing, editable and can be scaled or moved around, giving users an interactive experience. By
giving options for creating subjects, objects or literals and using simple filters, we subconsciously mould the users'
natural queries into a more triple-like format, as can be seen in the screenshot above. Here the user
wanted to query all authors who have books which were published in the year 1950. Due to this
intermediate graph format, converting user queries to SPARQL became quite easy for us.

Phase 3:
We ran into quite a few hurdles in this phase, very early on. Since mxGraph was built with a certain
target audience in mind, implementing certain custom features became difficult. We had to change a lot of
mxGraph's pipeline. Our UI also went through the following changes:

• We used the ontology of our database instance in order to provide users with dropdown boxes
containing options pertaining to the database currently in use. This was done in order to remove any
ambiguity and confusion which comes with giving users a clean slate to write on.
• We also added an option to create nodes which the users actually want to query on, so that we could
clearly separate out the SELECT and WHERE clauses.
• The option to create literal nodes was removed.

Figure 2: Sample of the final UI

We soon realised that converting all types of queries would become too difficult for us to handle, so we
decided to restrict our scope by developing a template upon which further queries could be built.

In order to give users nodes with dropdown boxes instead of the earlier simple editable ones, we set up a
Flask API which, via JavaScript, allowed us to call for options related to the database through Python. Thus we
had to set up a localhost server as well. This made our UI dynamic, since one can simply swap the ontology
file and database with their own and get the system to work. Finally, using the nodes with their edges
and the database information, we converted the given queries into SPARQL queries and returned them as
output (a rough sketch of this conversion step is given below the figure). In order to validate the output
queries, we ran them in Blazegraph. Here is the output of the above example:

Figure 3: SPARQL output of the generated query
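
For intuition, here is a much-simplified sketch of the graph-to-SPARQL conversion idea (not the actual mxGraph-based implementation): nodes marked as query targets become SELECT variables and every directed edge becomes a triple pattern in the WHERE clause. The node and edge fields are invented for illustration.

# Hedged sketch of turning a small node/edge structure into a SPARQL query.
# The real UI stores this information inside mxGraph cells.

def graph_to_sparql(nodes, edges):
    """nodes: {id: {"label": ..., "selected": bool}}, edges: [(src, prop, dst)]"""
    def term(node_id):
        node = nodes[node_id]
        # selected nodes become variables, others are fixed resources
        return f'?{node["label"]}' if node["selected"] else f'<{node["label"]}>'

    select_vars = [f'?{n["label"]}' for n in nodes.values() if n["selected"]]
    where = [f'{term(s)} <{p}> {term(o)} .' for s, p, o in edges]
    return ("SELECT " + " ".join(select_vars)
            + " WHERE {\n  " + "\n  ".join(where) + "\n}")

nodes = {
    "n1": {"label": "author", "selected": True},
    "n2": {"label": "http://example.org/book/B1", "selected": False},
}
edges = [("n2", "http://example.org/ontology/writtenBy", "n1")]
print(graph_to_sparql(nodes, edges))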


Question Answering over Linked Data
KMST PROJECT
GROUP 6

Rishit Sinha (15NA30019) | Sayan Sinha (16CS10048) | Himanshu Mundhra (16CS10057) | Swastika Dutta
(16CS10060) | Sarthak Chakraborty (16CS30044)

Link To Presentation: https://prezi.com/view/6k85M2eYw5So6mdzWpkJ/

PROBLEM STATEMENT
When a natural language query is posed to the system, it should translate the query into a SPARQL
query and retrieve answers from an RDF store. The steps involved in achieving this are:

− Extracting all the entities and predicates from the Natural Language Query (NLQ).

− Obtaining all the candidate URIs related to the entities and predicates in the NLQ from DBpedia and
then keeping the candidates which have high relevance to the NLQ.

− Constructing the SPARQL query considering a predefined template, a rule-based approach, or
other methods.

RESOURCES
− Dataset: QALD 2017 large-scale question answering over RDF.

− Libraries: Natural Language Toolkit, SpaCy, Stanford CoreNLP, DBpedia Spotlight, Blazegraph

PROCEDURE
The methods that we adopted for translating the Natural Language Query into a SPARQL query are
described as follows:

Extracting entities and predicates: We use the QALD dataset for training an
entity-predicate recognizer:
1. Determine the entities using DBpedia Spotlight and then join the entities in a deterministic
fashion to convert them into a single unit.

2. Remove relevant stopwords and entities from the query to get the candidate predicates.

3. Stem the obtained predicates to obtain them in raw form.
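
A minimal sketch of these three steps, assuming the public DBpedia Spotlight REST endpoint and NLTK's stopword list and Porter stemmer; the entity-predicate recognizer trained on QALD is not reproduced here.

# Hedged sketch: entity spotting via DBpedia Spotlight, then stopword removal
# and stemming of the remaining tokens to obtain candidate predicates.
# Requires: nltk.download("stopwords")
import requests
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer

def extract(question):
    resp = requests.get(
        "https://api.dbpedia-spotlight.org/en/annotate",
        params={"text": question, "confidence": 0.5},
        headers={"Accept": "application/json"},
    )
    annotations = resp.json().get("Resources", [])
    entities = [a["@surfaceForm"] for a in annotations]

    stop = set(stopwords.words("english"))
    stemmer = PorterStemmer()
    entity_words = {w.lower() for e in entities for w in e.split()}
    predicates = [stemmer.stem(tok)
                  for tok in question.lower().strip("?").split()
                  if tok not in stop and tok not in entity_words]
    return entities, predicates

print(extract("What are the space missions Yuri Gagarin was involved with?"))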

Obtaining candidate URIs: The following can be used to obtain the
candidate URIs:
1. The joined entities, which have more than a single word, are combined together using
underscores.

2. Blazegraph: this provides the most significant URI of an entity based on context.

Constructing the parse tree: The parse tree of the hyphenated query is
constructed using the Stanford Parser, and a dependency graph is generated.

SPARQL QUERY GENERATION

The SPARQL query generation is discussed using the results obtained above.

− Possible Solutions: Existing libraries for converting to SPARQL queries include Apache
Jena, NLQuery, LODQA and QuePy, but owing to various limitations of these libraries, we took a
template-based heuristic approach to the task.

− Our Algorithm:

if ( NO subject & no. of keywords = 2 )
    if ( no. of entities = 1 )
        then: object = entity
    if ( no. of entities > 1 )
        then: object = the proper noun keyword
              predicate = the other keyword

if ( there is a subject & subject is a proper noun )
    if ( no. of keywords = 3 )
        if ( some keyword is a verb )
            then: predicate = verb keyword
                  object = other keyword
                  // Now the query depends on whether the object is a proper noun or not.
    if ( no. of keywords = 2 )
        then: object is to be queried
              predicate = other keyword

if ( there is a subject & subject is not a proper noun )
    if ( no. of keywords = 2 )
        then: object = the proper noun
              predicate = the other keyword
    if ( no. of keywords = 3 )
        then: object = the proper noun
              predicate = the verb

FRONT END
A React-based front end was developed which connects to a Flask-based backend. The queries are
transferred using POST requests, which are handled at the backend, and the result is returned.
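
A minimal sketch of the backend side of this setup, assuming a hypothetical /query route; the two helper functions are placeholders standing in for the translation pipeline and the Blazegraph call described above.

# Hedged sketch of the Flask backend: the React front end POSTs a natural
# language question and receives the answer as JSON.
from flask import Flask, jsonify, request

app = Flask(__name__)

def translate_to_sparql(question):
    # placeholder for the template-based pipeline described above
    return "SELECT * WHERE { ?s ?p ?o } LIMIT 1"

def run_on_blazegraph(sparql):
    # placeholder; the project executed the query on the Blazegraph endpoint
    return []

@app.route("/query", methods=["POST"])
def query():
    question = request.get_json().get("question", "")
    sparql = translate_to_sparql(question)     # NLQ -> SPARQL
    answers = run_on_blazegraph(sparql)        # execute on the RDF store
    return jsonify({"sparql": sparql, "answers": answers})

if __name__ == "__main__":
    app.run(port=5000)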

RESULTS
We tested our model on the QALD Large dataset and could obtain results for 41 out of the
given 100 questions. A glimpse of the frontend and backend is presented below:

LIMITATIONS
- This model works only for a limited set of queries. It does not handle complex queries having
more than three keywords (UNION of BGPs, FILTER, etc.).
- The predicates obtained are based on the structure of the sentence and generally do not
represent the relationships in DBpedia.
- Class relationships should be the same as those present in DBpedia.
- It cannot handle ‘Is A B’ type queries.
CONCLUSION AND DISCUSSION
We obtained fairly good results on this dataset. This hints that a template-based
heuristic model is suitable for approaching this type of problem. We can make this model more robust
by improving the correspondence between the predicates and the relationships in DBpedia. For ease of
further development, we have made our model open source at https://github.com/americast/qald.

Automatic Quiz Agent

Team Objective: Develop a quiz application that extracts question-answer pairs
from the data automatically.

Project Steps:
1. Obtain all the entities belonging to a given class from the dataset and
randomly choose one of the entities (s).
2. Find the out-degree of the nodes.
3. Weight the edges considering the out-degrees of both entities connected to
the edge.
4. Generate the question using the popular entity (s) and the connected predicate
(p).
5. Randomly obtain other entities which are not the right answer as the remaining
choices.
6. Deploy the quiz in a game mode where the admin has the right to create
quizzes and users are able to attempt quizzes and view past scores.

Implementation note/architecture: The RDF data is loaded into the system in the form of
a graph using the Python rdflib library. Thus, all the entities in the system are obtained;
we then pick one entity for which we can create a question based on its
popularity.
The popularity of an entity is defined in terms of its out-degree, which is calculated as
the count of neighbouring entities of the chosen entity. The weight of an edge is
calculated as the sum of the out-degrees of the two vertices joined by the edge.
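
A minimal sketch of the popularity and edge-weight computation with rdflib; the file name is a placeholder:

# Hedged sketch: load the RDF dump as a graph, compute each entity's
# out-degree (its popularity), and weight every entity-to-entity edge by the
# sum of the out-degrees of its two endpoints.
from collections import Counter
from rdflib import Graph, URIRef

g = Graph()
g.parse("nasa_missions.ttl", format="turtle")      # placeholder file name

out_degree = Counter(s for s, _, _ in g)           # popularity of an entity

edge_weight = {
    (s, p, o): out_degree[s] + out_degree[o]
    for s, p, o in g
    if isinstance(o, URIRef)                       # skip literal-valued edges
}

# The most "popular" edge is a natural candidate for question generation.
s, p, o = max(edge_weight, key=edge_weight.get)
print(s, p, o)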

Aim: Generate the question from the (s, p, o) tokens, stating the approach for single-hop
distance.
Challenge: To form a well-formed grammatical question.
Approach: A discriminative re-ranking approach to find a parse tree for a given
sentence with some probability. The approach can be formulated as a supervised
learning task.
Let X denote a sentence and Y'(X) the set of all possible parse trees for X, where
we wish to learn:
f : X -> Y(X)
Here, Y(X) is a subset of Y'(X), as we are interested in question generation only for
English. As a crude approximation, the tokens themselves are considered to be the
(incomplete) sentences.
M. Heilman's work from CMU-LTI has been used to generate questions; it uses a
logistic regression model on top of the previous model to output the final
probabilities of the questions generated from the tokens.

Observation: The model does not converge (or takes a very long time to parse and
may not halt at all) if all of the tokens are nouns.
Since the predicate initially obtained from the knowledge graph (for example,
film --> resource/actor --> actor_page) is in noun form, we need to consider all
possible verb forms of the predicate.
Finally, the question which has the predicate in verb form and for which the
probability of the classifier is maximum is chosen.

Verb forms are identified from the predicate.

For example, say we have the predicate ‘act’.
GetWordForms(‘act’) = {‘act’, ‘acts’, ‘acted’, ‘acting’}

Identification of the correct question is done by scoring the questions created using
the generated word forms:
Amitabh Bachchan acted in Suryavansham.
Amitabh Bachchan acts in Suryavansham.
To choose the options other than the correct answer for a question, we
randomly choose entities of the same rdfs:type such that there is no edge from s to the
candidate via p.
For example,
Who acted in Suryavansham?
Option 1: Salman Khan.
Cross-checking of the option can be interpreted as
“Did Salman Khan act in Suryavansham?”
If Salman Khan does not come out as an object, we may choose Salman Khan as a
wrong option; otherwise, we pick a new entity of the same rdf:type randomly and
verify it similarly.
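
A small sketch of this check with rdflib: a candidate of the same rdf:type as the correct answer is kept as a wrong option only if the triple (subject, predicate, candidate) is absent from the graph. All names here are placeholders.

# Hedged sketch of distractor selection for an MCQ generated from the triple
# (subject, predicate, answer).
import random
from rdflib import Graph, RDF

def pick_distractors(g: Graph, subject, predicate, answer, n=3):
    answer_types = set(g.objects(answer, RDF.type))
    candidates = {
        entity
        for t in answer_types
        for entity in g.subjects(RDF.type, t)
        if entity != answer and (subject, predicate, entity) not in g
    }
    return random.sample(sorted(candidates), min(n, len(candidates)))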

Work Flow of the agent is given as:


Application features:
@Admin
1. Add questions to question bank.
2. Add new quiz.
3. View results.
4. View users list.
5. Set timing for generated quiz.
6. Attempt a quiz for testing purposes.
@User
1. Attempt a quiz within the time limit.
2. Check out attempted quizzes.
3. Re-attempt a quiz.
References:
http://www.cs.cmu.edu/~mheilman/papers/heilman-smith-qg-tech-report.pdf

Group - 1
Chelsi Raheja (16CS10013)
G. Vishal (16CS10061)
Aurghya Maiti (16CS10059)
Rahul Kumar (16CS10042)
Vedic Partap (16CS10053)
 

KMST 2018

Integration
Interactive Timeline of NASA Space Missions


Task assigned to the group:

Our group was supposed to build the wrapper for all the sub-tasks in this project.
The wrapper must bring all of the work together in one place. In addition to
that, we were supposed to implement an algorithm to decide whether two
entities are similar to each other, and how they are related.

Description:
Wrapper for bringing together all the subtasks:

We have developed a static HTML/CSS webpage deployed on GitHub Pages and
linked all the application links on that page, explaining a bit about each part of the
project. Developing a simple web page and linking all the applications which
contribute to the project is a simple task and was easily done.

RelFinder, for finding relations between entities:

Along with that, for the similarity between two entities in the RDF database and the
relation between them, we have used an open source project called RelFinder. RelFinder is
an application which draws a visual representation of the graph containing the two entities
as end points and thus shows the relation between them. It clearly shows how
two entities are related to each other, not just by showing all the entities between them
but by plotting all the classes and other properties of the entities. RelFinder can be
integrated with any other database and with any other application or web page. We
have successfully integrated RelFinder into our wrapper web page
that connects all the sub-tasks contributing to the project.

The algorithm used in RelFinder for finding the relations is more or less a naive
algorithm which builds a 2D array holding the data of one node being
connected to another node in the graph. The database is, in effect, unbounded,
while the array that holds the connectivity data is not. To counter this problem,
what is implemented in RelFinder is not the best solution, but it works for now:
it simply puts a limit on the size of the array, so if two entities are very, very far
apart in the graph, RelFinder cannot display them as connected. This is a
significant drawback for an application meant to show the similarity between entities.
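
For intuition only, a rough sketch (not RelFinder's actual code) of the bounded search idea described above: paths between two entities are explored breadth-first up to a fixed number of hops, which is why entities that are very far apart are never shown as connected.

# Hedged sketch: depth-limited breadth-first search for paths between two
# entities in an rdflib graph, illustrating the "limiter" behaviour described
# above. The graph and entity URIs are placeholders.
from collections import deque
from rdflib import Graph, URIRef

def find_paths(g, start, goal, max_hops=3):
    paths = []
    queue = deque([(start, [start], 0)])
    while queue:
        node, path, hops = queue.popleft()
        if hops >= max_hops:
            continue                      # the size limit described above
        for p, o in g.predicate_objects(node):
            if not isinstance(o, URIRef) or o in path:
                continue
            if o == goal:
                paths.append(path + [p, o])
            else:
                queue.append((o, path + [p, o], hops + 1))
    return paths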

 
