Customized Search: Using Web Mash-Up and Web Usage Mining: IPASJ International Journal of Computer Science (IIJCS)

IPASJ International Journal of Computer Science(IIJCS)

Web Site:

A Publisher for Research Motivation ........ Email:
Volume 2, Issue 5, May 2014 ISSN 2321-5992

Volume 2 Issue 5 May 2014 Page 37

Nowadays, web search has become very easy, and the choice of search engines available to web users has grown. Yet despite so
many choices, user satisfaction is still not achieved: the user has to search across all the search engines to get the desired results.
Our approach is to make the user's web search easier and more effective. Customized Search means the integration of various web-based
search services on the basis of analyzed user web logs. With this application we shortlist three image search
engines and build a mash-up application from them. The aim is to enhance the quality of internet usage.
Our experimental results demonstrate that our approach can improve user satisfaction by giving customized search results
from three search engines.

Keywords: Mash-Up, Adaptive Resonance Theory, Encog, Web Log Analysis.

There has been exponential growth in the World Wide Web in terms of websites and their users. Nowadays, many
advanced search engines are available on the web for searching data, images, etc. In spite of such advanced
search technologies [7], the user has to go through three or more rounds of keyword trial and error, or switch between two or
more search engines, to get the desired results, and even then the results are not fully satisfactory [7].
Different search engines use different search approaches (algorithms) [7],[8], for example Google's PageRank
algorithm and Yahoo's Yahoo Search, among many others. Each search engine returns different results for the keywords
used, and these may or may not meet the user's requirements.
Our application is an attempt to increase user satisfaction with search results. A few definitions are needed before going into the
concept. A mash-up [11], in web development, is a web page or web application that uses content from more than one
source to create a single new service displayed in a single graphical interface. Clustering [2],[3],[9],[10] is the
process of organizing objects into groups whose members are similar in some way. A web log analyzer [1],[6] is a kind
of web analytics software that parses a server log file from a web server and, based on the values contained in the log
file, derives indicators about when, how, and by whom the web server is visited.
To understand the concept, consider the 10 most popular search engines. These 10 search engines are arranged into groups
of 3 (our system can currently mash up only 3 search engines; the number will be increased in future)
by considering some, but not all, of the possible combinations. For example, one group may consist of Google, Yahoo,
and Flickr, another of Google, Picasa, and Yahoo, and so on. In this way up to 20 groups are formed;
the search engines in each group are mashed up, and a few customizations are added to get good results. We now have 20
customized search engines.
Next, the Customized Search plugin is installed in the user's web browser. The function of this plugin is to
retrieve the user logs from the server (AWStats is used for this), from which the search engines used by the user are deduced.
On the basis of this extracted data, one of the 20 mash-ups described above is allotted to the user, who is directed to the
link of that mash-up.
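
The grouping arithmetic can be illustrated with a short sketch. Ten engines taken three at a time give 120 possible groups, which is why only a subset (up to 20) is used. The engine names and the `shortlist` policy below are placeholders introduced for illustration; the paper does not state how its 20 groups were chosen.

```python
from itertools import combinations
from math import comb

# Placeholder names: the paper does not list its ten search engines.
ENGINES = [f"engine_{i}" for i in range(1, 11)]


def candidate_groups(engines, size=3):
    """Return every unordered group of `size` engines."""
    return list(combinations(engines, size))


def shortlist(groups, limit=20):
    """Keep at most `limit` groups.

    Assumption: taking the first `limit` groups stands in for whatever
    selection rule is actually used; that rule is not given in the paper.
    """
    return groups[:limit]


all_groups = candidate_groups(ENGINES)
mashup_groups = shortlist(all_groups)
print(len(all_groups), "possible groups,", len(mashup_groups), "kept")
```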

Figure 1: System Block Diagram [12]

Customized Search: Using Web Mash-up and Web Usage Mining

Ravi Kumar Mondeti, Manoj Valesha, Vivek Sadhwani, Gresha Bhatia (Mentor)

B.E. Computer Engg., VES Institute of Technology, Mumbai - 400074, Maharashtra, India
Professor, Department of Computer Engineering, VES Institute of Technology, Mumbai - 400074, Maharashtra, India

We divided the implementation of this system into three phases:
User log file Analyzer.
Mash-Up Allocator.
Creating Web Mash-up.
The first two phases work in sequence, as the output of the Analyzer is the input to the Mash-Up Allocator. The third phase does not
need to be carried out for every user.

2.1 User log Analyzer:
In this phase the user log files are retrieved from the remote server using AWStats [6]. The user logs are analyzed to
extract the search engines most frequently used by the user [1]. The extracted data is stored in the following format
as input to the Mash-Up Allocator.
It is a ten-bit bipolar input made up of 'O' and ' ' (empty space) characters. Each bit represents a search engine: if that
search engine is used by the user, the bit is marked as 'O'; otherwise it is marked as ' '. (The size of the input depends on the
number of search engines considered; here we assumed 10.)
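
As a minimal sketch of this encoding, the function below turns the set of engines a user has visited into the ten-character 'O'/blank pattern, together with an equivalent +1/-1 vector. The engine order and the six `engine_N` names are assumptions made for illustration; only Google, Yahoo, Flickr, and Picasa are named in the paper.

```python
# Assumed engine order; positions 5-10 are placeholders not named in the paper.
ENGINES = ["google", "yahoo", "flickr", "picasa"] + [
    f"engine_{i}" for i in range(5, 11)
]


def encode_usage(used_engines):
    """Encode engine usage as an 'O'/' ' string and as a bipolar +1/-1 list."""
    used = {name.lower() for name in used_engines}
    pattern = "".join("O" if engine in used else " " for engine in ENGINES)
    bipolar = [1 if engine in used else -1 for engine in ENGINES]
    return pattern, bipolar


pattern, bipolar = encode_usage(["Google", "Flickr"])
print(repr(pattern))
print(bipolar)
```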

2.2 Mash-Up Allocator:
The Adaptive Resonance Theory (neural network) algorithm [9],[10] is used in the Mash-Up Allocator. The
allocator reads the output of the Analyzer phase as its input and allocates a mash-up. It implements
the clustering algorithm and clusters each user into one of the mash-up groups. The Java Encog [2],[3] library is used to
code the Mash-Up Allocator.
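
To make the clustering step concrete, here is a simplified fast-learning ART1 sketch in plain Python. It is not Encog's implementation and not necessarily the configuration used in this project: the vigilance value, the choice parameter `beta`, and the idea that a cluster index maps directly to a mash-up number are all assumptions for illustration.

```python
def art1_cluster(patterns, vigilance=0.6, beta=0.5):
    """Cluster binary (0/1) patterns with a simplified fast-learning ART1.

    Returns (labels, prototypes): labels[i] is the cluster index of
    patterns[i]; each prototype is the AND of the patterns it absorbed.
    """
    prototypes = []
    labels = []
    for p in patterns:
        norm_p = sum(p)

        def overlap(j):
            return sum(a & b for a, b in zip(p, prototypes[j]))

        # Rank existing clusters by the choice function, best first.
        order = sorted(
            range(len(prototypes)),
            key=lambda j: -overlap(j) / (beta + sum(prototypes[j])),
        )
        chosen = None
        for j in order:
            # Vigilance test: fraction of the input covered by the prototype.
            match = overlap(j) / norm_p if norm_p else 0.0
            if match >= vigilance:
                # Resonance: shrink the prototype to the shared bits.
                prototypes[j] = [a & b for a, b in zip(p, prototypes[j])]
                chosen = j
                break
        if chosen is None:
            # No cluster is close enough, so commit a new one.
            prototypes.append(list(p))
            chosen = len(prototypes) - 1
        labels.append(chosen)
    return labels, prototypes


users = [
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
]
labels, prototypes = art1_cluster(users)
print(labels)
```

A higher vigilance produces more, tighter clusters; a lower one merges users whose engine usage only partly overlaps. An all-blank pattern never passes the vigilance test in this sketch and always starts its own cluster.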

The output of the Mash-Up allocator is as follows:
User-IP Mash-Up allotted 2
2.3 Creating Web Mash-up:

The search engines, grouped as explained in the Introduction section, are mashed up using Yahoo Pipes
[4],[5]. A schema of the search engines is created and implemented as a mash-up. The APIs of the search engines are needed
for the mash-up. For example, consider the mash-up of Google, Picasa, and Flickr. Google is an independent platform
that allows users to access its servers in any available format. Picasa, being a Google product, is also open for
development. The only issue is with Flickr: it restricts its usage to limited purposes, so in order to use Flickr we had to
contact them for a secret key, also known as an API key. Once we had obtained the API key, our project was implemented.

Figure 2: Schema of Google, Picasa, Flickr mash-up[4]

3. Software and Libraries

3.1 Yahoo Pipes:
Yahoo Pipes [4],[5] is used to create all the mash-ups used in this project. Yahoo Pipes is a web mash-up
generator whose purpose is to create new pages by aggregating RSS feeds from different sources. Yahoo
Pipes has many modules, which can be used either to grab data from sources or to edit the data that is grabbed from the
sources. These modules are grouped into the following categories: sources, user inputs, operators, URL, string,
date, location, and number.
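
The source-then-operator flow of a pipe can be mimicked in a few lines of Python. This is only a sketch of the aggregation idea, not Yahoo Pipes itself: the `mash_up` function, the result dictionaries, and the sample URLs are hypothetical, and each result list is assumed to be already fetched from its source and ranked.

```python
from itertools import zip_longest


def mash_up(result_lists, limit=None):
    """Interleave ranked result lists round-robin and drop duplicate URLs."""
    seen = set()
    merged = []
    # Take the 1st result of every source, then the 2nd of every source, etc.
    for tier in zip_longest(*result_lists):
        for item in tier:
            if item is None:  # this source has run out of results
                continue
            if item["url"] in seen:  # same image already returned by another source
                continue
            seen.add(item["url"])
            merged.append(item)
    return merged if limit is None else merged[:limit]


# Hypothetical, already-fetched results from three image sources.
google = [{"url": "a", "source": "google"}, {"url": "b", "source": "google"}]
picasa = [{"url": "b", "source": "picasa"}]
flickr = [
    {"url": "c", "source": "flickr"},
    {"url": "a", "source": "flickr"},
    {"url": "d", "source": "flickr"},
]
print([item["url"] for item in mash_up([google, picasa, flickr])])
```

Round-robin interleaving keeps the top result of every engine near the top of the merged list; other merge policies (for example, weighting engines by how often the user visits them) would fit the same structure.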

3.2 AWStats:
AWStats [1] is used to analyze the user's web log files and generate the input for the Mash-Up Allocator. AWStats is a log file
analyzer that reports the most frequently visited websites and groups them together.
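
The kind of tally the allocator needs from the logs can be sketched as follows. This does not call AWStats or read its report format; it simply counts visits per search engine from a list of visited URLs. The `ENGINE_HOSTS` table and the sample URLs are assumptions for illustration.

```python
from collections import Counter
from urllib.parse import urlparse

# Assumed mapping from host suffix to engine name.
ENGINE_HOSTS = {
    "google.com": "google",
    "yahoo.com": "yahoo",
    "flickr.com": "flickr",
}


def count_engine_visits(urls):
    """Count how many visited URLs belong to each known search engine."""
    counts = Counter()
    for url in urls:
        host = urlparse(url).hostname or ""
        for suffix, engine in ENGINE_HOSTS.items():
            # Match the domain itself or any of its subdomains.
            if host == suffix or host.endswith("." + suffix):
                counts[engine] += 1
                break
    return counts


visited = [
    "https://www.google.com/search?q=cat",
    "https://images.google.com/",
    "https://www.flickr.com/search/?text=cat",
    "https://example.org/",
]
print(count_engine_visits(visited).most_common())
```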

3.3 Encog:
The Encog [2],[3] library is used to code the Mash-Up Allocator in Java. Encog is a machine learning framework available
for Java, .Net, and C++. Encog supports different learning algorithms such as Bayesian Networks, Hidden Markov
Models, and Support Vector Machines. However, its main strength lies in its neural network algorithms. Encog contains
classes to create a wide variety of networks, as well as support classes to normalize and process data for these neural networks.

4. Related Work:
Karthick Murugan, a Yahoo Pipes user, created a Flickr image search module which allows users to search Flickr.
A contributor who has posted the most pipes among the members of the Yahoo blog helped us
with the concepts.

5. Conclusion:
Customized search is an advanced searching approach in which different search engines are mashed up (combining the
most useful methods of the shortlisted search engines) to create an altogether more powerful search engine, chosen according to
the user's search engine usage data.
Experimental results show that our approach increases user search result efficiency and satisfaction.

6. Acknowledgement
This idea would not have been possible without the noteworthy contributions of:
Prof. Gresha Bhatia, who inspired us to make this project the way it is.
The contributor who helped us in finalizing the pipes schema.

[1] AWStats.
[2] Encog Frame.
[3] Encog learning platform.
[4] Yahoo Pipes documentation.
[5] Yahoo Pipes tutorial.
[6] Jaideep Srivastava, Robert Cooley, Mukund Deshpande, Pang-Ning Tan, Department of Computer Science and
Engineering, University of Minnesota, "Web Usage Mining: Discovery and Applications of Usage Patterns from
Web Data," SIGKDD Explorations, ACM SIGKDD, Jan 2000.
[7] Meenakshi Shruti Pal, Dr. Sushil Kumar Garg, "Image Retrieval: A Literature Review," International Journal of
Advanced Research in Computer Engineering and Technology (IJARCET), Volume 2, Issue 6, June 2013.
[8] Poonam Bhusari, Rashmi Gupta, Amit Sinahal, "Personalized Image Search from Photo Sharing Websites Using Ranking
Based Tensor Factorization Model (RMTF)," International Journal of Advanced Research, Volume 3, Issue 8, August
[9] Vaishali A. Zilpe, Dr. Mohammad Atique, "Web Usage Mining Using Neural Network Approach: A Critical Review,"
(IJCSIT) International Journal of Computer Science and Information Technologies, Vol. 3 (1), 2012, 3073 - 3077.
[10] Sharma, M. Varshney, "An Efficient Approach for Web Log Mining Using ART," International Conference on Education
and Management Technology, 2010 (ICEMT 2010).
[11] Volker Hoyer, Katarina Stanoevska-Slabeva, Simone Kramer, Andrea Giessmann, "What Are the Business Benefits of
Enterprise Mashups?" 1530-1605/11 $26.00 2011 IEEE.
[12] Block diagrams are drawn using Creately.


Ravi Kumar Mondeti (Corresponding Author) is pursuing his B.Engg degree at VES Institute of
Technology, Mumbai - 400074, Maharashtra, India, affiliated to Mumbai University. He is currently
working as Planning and Management Officer at ISTE-VESIT. His areas of interest are Neural Networks,
Artificial Intelligence, Data Mining, and Machine Learning.

Manoj Valesha is pursuing his B.Engg degree at VES Institute of Technology, Mumbai - 400074,
Maharashtra, India, affiliated to Mumbai University. His areas of interest are Web Mash-up, Computer
Networks, and Game Design.

Vivek Sadhwani is pursuing his B.Engg degree at VES Institute of Technology, Mumbai - 400074, Maharashtra,
India, affiliated to Mumbai University. His areas of interest are Web Log Analysis and Data Mining.
