Generating Query Facets Using Knowledge Bases: CSE Dept 1 VMTW

Generating Query Facets Using Knowledge Bases MN15_B03
1. INTRODUCTION
With the help of search engines, users can quickly find web pages containing the
information they want by issuing queries and receiving search results comprised of “ten blue
links”. However, previous studies show that many users are not satisfied with this kind of
conventional search result page.
Users often have to click and view many documents to summarize the information they are
seeking, especially when they want to learn about a topic that covers different aspects. This
usually takes a lot of time and is burdensome for users. An automatic summarization of search results
may help users understand the query without browsing many pages, and hence could save them time.
Mining query facets (or query dimensions) is an emerging approach to solving the problem
above. Existing query facet mining algorithms mainly rely on the top search results from search
engines. Dou et al. first introduced the concept of query dimensions, which is the same concept
as the query facet discussed in this paper.
They proposed QDMiner, a system that can automatically mine query facets by aggregating
frequent lists contained in the results. The lists are extracted by HTML tags (like <select> and
<table>), text patterns, and repeat content blocks contained in web pages. Kong et al. proposed
two supervised methods, namely QF-I and QF-J, to mine query facets from the results.
In all these existing solutions, facet items are extracted from the top search results from a
search engine (e.g., top 100 search results from Bing.com). More specifically, facet items are
extracted from the lists contained in the results. The problem is that the coverage of facets mined
by this kind of method might be limited, because some useful words or phrases might not
appear in any list within the search results used, and so have no opportunity to be mined.
In order to solve this problem, we propose leveraging a knowledge base as a
complementary data source to improve the quality of query facets. Knowledge bases contain
high quality structured information such as entities and their properties and are especially useful
when the query is related to an entity.
We propose using both knowledge bases and search results to mine query facets in this
paper. The reason why we don’t abandon search results is that search results reflect user intent
and provide abundant context for facet generation and expansion.
Our target is to improve the recall of facets and facet items by utilizing entities and their
properties contained in knowledge bases while, at the same time, making sure that the accuracy of
facet items is not harmed too much. Our approach consists of two methods: facet
generation and facet expansion. In facet generation, we directly use the properties of the entities
corresponding to a query as its facet candidates.
In facet expansion, we expand initial facets mined by traditional algorithms such as
QDMiner to find more similar items contained in a knowledge base such as Freebase. The
facets constructed by the two methods are further merged and ranked to generate final query
facets.
2. LITERATURE SURVEY
Title: Finding dimensions for queries
Authors: Zhicheng Dou, Sha Hu, Yulong Luo
Description:
We address the problem of finding multiple groups of words or phrases that explain the
underlying query facets, which we refer to as query dimensions. We assume that the
important aspects of a query are usually presented and repeated in the query’s top retrieved
documents in the style of lists, and query dimensions can be mined out by aggregating these
significant lists. Experimental results show that a large number of lists do exist in the top
results, and query dimensions generated by grouping these lists are useful for users to learn
interesting knowledge about the queries.
Title: Automatically mining facets for queries from their search results
Authors: Z. Dou, Z. Jiang, S. Hu, J. Wen, and R.
Description:
We address the problem of finding query facets which are multiple groups of words or
phrases that explain and summarize the content covered by a query. We assume that the
important aspects of a query are usually presented and repeated in the query’s top retrieved
documents in the style of lists, and query facets can be mined out by aggregating these
significant lists. We propose a systematic solution, which we refer to as QDMiner, to
automatically mine query facets by extracting and grouping frequent lists from free text, HTML
tags, and repeat regions within top search results. Experimental results show that a large number
of lists do exist and useful query facets can be mined by QDMiner. We further analyze the
problem of list duplication, and find better query facets can be mined by modeling fine-grained
similarities between lists and penalizing the duplicated lists.
Title: Facetedpedia: dynamic generation of query-dependent faceted interfaces for
Wikipedia
Authors: C. Li, N. Yan, S. B. Roy, L. Lisham, and G. Das
Description:
This paper proposes Facetedpedia, a faceted retrieval system for information
discovery and exploration in Wikipedia. Given the set of Wikipedia articles resulting
from a keyword query, Facetedpedia generates a faceted interface for navigating the
result articles. Compared with other faceted retrieval systems, Facetedpedia is fully
automatic and dynamic in both facet generation and hierarchy construction, and the facets
are based on the rich semantic information from Wikipedia. The essence of our approach
is to build upon the collaborative vocabulary in Wikipedia, more specifically the
intensive internal structures (hyperlinks) and folksonomy (category system). Given the
sheer size and complexity of this corpus, the space of possible choices of faceted
interfaces is prohibitively large. We propose metrics for ranking individual facet
hierarchies by user’s navigational cost, and metrics for ranking interfaces (each with
facets) by both their average pairwise similarities and average navigational costs. We
thus develop faceted interface discovery algorithms that optimize the ranking metrics.
Our experimental evaluation and user study verify the effectiveness of the system.
3. SYSTEM ANALYSIS
EXISTING SYSTEM
Query facets summarize a query in different aspects. They may help users quickly
understand important aspects of the query and help them explore information. Dou et al.
first introduced this problem and proposed the QDMiner algorithm. QDMiner first extracts frequent
lists in top search results using predefined patterns, then weights each list and groups them into
final facets. Similar to QDMiner, Kong and Allan developed supervised approaches, namely QF-I
and QF-J, to mine query facets. Facet item candidates are extracted from frequent lists, which are
obtained in a similar way to QDMiner. Then two Bayesian models are learned to estimate how
likely a candidate is to be a facet item and how likely two candidates are to belong to the same
facet. All the existing works are based on top search results, hence the quality of the final facets
might be limited. If some words or phrases don’t appear in a list within the top search results,
they have no opportunity to be facet items.
Disadvantages
• Facet-based browsers allow only a limited set of queries.
• Developing specific interfaces requires considerable effort, and such interfaces are inflexible
in case of schema changes or extensions.
PROPOSED SYSTEM:
In order to solve this problem, we propose leveraging a knowledge base as a
complementary data source to improve the quality of query facets. Knowledge bases contain
high quality structured information such as entities and their properties and are especially useful
when the query is related to an entity.
We propose using both knowledge bases and search results to mine query facets in this paper.
The reason why we don’t abandon search results is that search results reflect user intent and
provide abundant context for facet generation and expansion. Our target is to improve the recall
of facets and facet items by utilizing entities and their properties contained in knowledge bases
while, at the same time, making sure that the accuracy of facet items is not harmed too much.
Our approach consists of two methods: facet generation and facet expansion. In
facet generation, we directly use the properties of the entities corresponding to a query as its facet
candidates. In facet expansion, we expand initial facets mined by traditional algorithms such as
QDMiner to find more similar items contained in a knowledge base such as Freebase. The
facets constructed by the two methods are further merged and ranked to generate final query
facets.
Advantages
• A knowledge base is leveraged as a complementary data source to improve the
quality of query facets.
• By leveraging both knowledge bases and search results, QDMKB breaks the limitation of
using only search results to generate query facets, and thus could improve the quality of
facets, especially recall.
• More information related to each facet item can be found through the link structure of
knowledge bases.
• The types or properties in knowledge bases can serve as a potential explanation of the
meaning of each facet.
SYSTEM REQUIREMENTS:
HARDWARE REQUIREMENTS:
• System : Pentium IV 2.4 GHz.
• Hard Disk : 40 GB.
• Floppy Drive : 1.44 MB.
• Monitor : 15-inch VGA Colour.
• Mouse : Logitech.
• RAM : 512 MB.
SOFTWARE REQUIREMENTS:
• Operating system : Windows XP/7.
• Coding Language : JAVA/J2EE
• Data Base : MySQL
• Server : Tomcat
• IDE : Eclipse
4. IMPLEMENTATION
MODULES:
• Facet Generation
• Facet Expansion
• Facet Grouping
• Facet Weighting
MODULES DESCRIPTION
Facet Generation:
Retrieve relevant entities: Based on the Freebase Search API, we retrieve a list of entities
relevant to the query. Retrieve related properties: Based on the Freebase Topic API, for each entity,
we retrieve all its direct properties and further construct its second-hop properties. We remove
properties which have only one target entity; the remaining ones are facet candidates.
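The filtering step described above can be sketched in Java as follows. This is only a minimal illustration under stated assumptions, not the project's actual implementation: the class name FacetGenerationSketch, the method facetCandidates, and the in-memory map standing in for the properties already fetched from the knowledge base are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: the input maps a property name to its target entities,
// as if the properties of an entity had already been retrieved.
final class FacetGenerationSketch {

    // Keeps only properties with more than one target entity;
    // each surviving property is treated as a facet candidate.
    static Map<String, Set<String>> facetCandidates(Map<String, Set<String>> properties) {
        Map<String, Set<String>> candidates = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> entry : properties.entrySet()) {
            if (entry.getValue().size() > 1) {
                candidates.put(entry.getKey(), entry.getValue());
            }
        }
        return candidates;
    }
}
```

A property with a single target entity offers the user nothing to choose between, which is why it is dropped before grouping and weighting.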
Facet Expansion
We first use the existing state-of-the-art algorithm QDMiner to generate a set of initial facets
F(q). For each facet f ∈ F(q), we use the following steps: Property based facet expansion: We try to
assign f to the most appropriate property p ∈ P(q) and use the other entities in p to enrich f. The
property p should cover most items in f, while its size should be as small as possible to avoid including
unrelated items. Type based facet expansion: If P(q) is empty or no suitable p is found in P(q), we
try to assign f to the most fine-grained type t which covers most items in f and use the other entities
in t to enrich f. Because Freebase types are usually too broad, we further construct some
constraints to narrow down the semantic scope just like the example in Section 1. The constraints
precisely characterize the relationship between facet and query context and thus improve the
accuracy of expansion.
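Property based facet expansion can be sketched roughly as below. This is a simplified illustration, not the exact selection rule used by QDMKB: the class FacetExpansionSketch, the minCoverage parameter, and the rule "most covered items first, smaller property on ties" are assumptions made for this example.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of property based facet expansion.
final class FacetExpansionSketch {

    // Returns the name of the property that covers the most items of the facet,
    // preferring the smaller property on ties. Returns null when no property
    // covers at least the fraction minCoverage of the facet items.
    static String bestProperty(Set<String> facet, Map<String, Set<String>> properties,
                               double minCoverage) {
        String best = null;
        int bestCovered = 0;
        int bestSize = Integer.MAX_VALUE;
        for (Map.Entry<String, Set<String>> entry : properties.entrySet()) {
            int covered = 0;
            for (String item : facet) {
                if (entry.getValue().contains(item)) {
                    covered++;
                }
            }
            if (covered == 0 || covered < minCoverage * facet.size()) {
                continue; // property does not cover enough of the facet
            }
            int size = entry.getValue().size();
            if (covered > bestCovered || (covered == bestCovered && size < bestSize)) {
                best = entry.getKey();
                bestCovered = covered;
                bestSize = size;
            }
        }
        return best;
    }

    // Enriches the facet with the other entities of the chosen property;
    // the facet is returned unchanged when no suitable property exists.
    static Set<String> expand(Set<String> facet, Map<String, Set<String>> properties,
                              double minCoverage) {
        Set<String> expanded = new LinkedHashSet<>(facet);
        String property = bestProperty(facet, properties, minCoverage);
        if (property != null) {
            expanded.addAll(properties.get(property));
        }
        return expanded;
    }
}
```

When expand leaves the facet unchanged, the type based expansion described above would be the fallback.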
Facet Grouping
All the facet candidates constructed by facet generation and expansion might have
duplicate entities. We use the Quality Threshold algorithm to cluster them into the final facets by
grouping similar candidates together.
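The grouping step can be illustrated with a plain Quality Threshold (QT) clustering sketch. The details below are assumptions for illustration only: the Jaccard distance between item sets, the maxDiameter parameter, and the class name QtClusteringSketch are not taken from the project, and the sketch is unweighted.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of Quality Threshold clustering over facet candidates.
final class QtClusteringSketch {

    // Jaccard distance between two item sets: 1 - |A ∩ B| / |A ∪ B|.
    static double distance(Set<String> a, Set<String> b) {
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            return 0.0;
        }
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        return 1.0 - (double) common.size() / union.size();
    }

    // Grows a candidate cluster from a seed: repeatedly adds the remaining point
    // that keeps the diameter smallest, stopping once the limit would be exceeded.
    static List<Integer> candidate(int seed, List<Integer> remaining,
                                   List<Set<String>> points, double maxDiameter) {
        List<Integer> cluster = new ArrayList<>();
        cluster.add(seed);
        List<Integer> pool = new ArrayList<>(remaining);
        pool.remove(Integer.valueOf(seed));
        while (!pool.isEmpty()) {
            int bestPoint = -1;
            double bestDiameter = Double.MAX_VALUE;
            for (int p : pool) {
                double farthest = 0.0;
                for (int member : cluster) {
                    farthest = Math.max(farthest, distance(points.get(p), points.get(member)));
                }
                if (farthest < bestDiameter) {
                    bestDiameter = farthest;
                    bestPoint = p;
                }
            }
            if (bestDiameter > maxDiameter) {
                break;
            }
            cluster.add(bestPoint);
            pool.remove(Integer.valueOf(bestPoint));
        }
        return cluster;
    }

    // Repeatedly keeps the largest candidate cluster and removes its points.
    // Each returned cluster is a list of indexes into the input list.
    static List<List<Integer>> cluster(List<Set<String>> points, double maxDiameter) {
        List<Integer> remaining = new ArrayList<>();
        for (int i = 0; i < points.size(); i++) {
            remaining.add(i);
        }
        List<List<Integer>> clusters = new ArrayList<>();
        while (!remaining.isEmpty()) {
            List<Integer> best = null;
            for (int seed : remaining) {
                List<Integer> c = candidate(seed, remaining, points, maxDiameter);
                if (best == null || c.size() > best.size()) {
                    best = c;
                }
            }
            remaining.removeAll(best);
            clusters.add(best);
        }
        return clusters;
    }
}
```

The items of the candidates that fall into one cluster would then be merged, so that duplicate entities appear only once in the resulting facet.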
Facet Weighting
We weight each final facet in the same way as QDMiner. A good facet should be
supported by many websites and contain informative items. These final facets are sorted by
weights to form the output of QDMKB.
SYSTEM ARCHITECTURE
SDLC METHODOLOGY:
Spiral Model
The spiral model is a risk-driven process model generator for software projects. Based on
the unique risk patterns of a given project, the spiral model guides a team to adopt elements of
one or more process models, such as incremental, waterfall, or evolutionary prototyping.
5. FEASIBILITY STUDY
The next step in analysis is to verify the feasibility of the proposed system. “All projects
are feasible given unlimited resources and infinite time”. But in reality both resources and time
are scarce. A project should conform to time bounds and should be optimal in its consumption of
resources. This places a constraint on the approval of any project.
Feasibility, as applied to the proposed system, pertains to the following
areas:
• TECHNICAL FEASIBILITY
• OPERATIONAL FEASIBILITY
• ECONOMIC FEASIBILITY
TECHNICAL FEASIBILITY:
To determine whether the proposed system is technically feasible, we should take into
consideration the technical issues involved behind the system.
The proposed system uses web technologies, which are widely
employed worldwide these days. The world without the web is inconceivable today. It
follows that the proposed system is technically feasible.
OPERATIONAL FEASIBILITY:
To determine the operational feasibility of the system we should take into
consideration the awareness level of the users. This system is operationally feasible since the users
are familiar with the technologies and hence there is no need to train the personnel to use the
system. Also, the system is user-friendly and easy to use.
ECONOMIC FEASIBILITY
To decide whether a project is economically feasible, we have to consider various factors such as:
• Cost benefit analysis
• Long-term returns
• Maintenance costs
The proposed project is computer based. It requires average computing capabilities and
access to the internet, which are very basic requirements and can be afforded by any organization.
Hence it doesn’t incur additional economic overheads, which renders the system economically
feasible.
6. SOFTWARE ENVIRONMENT
6.1 HTML
HTML is a language used to create web pages. With HTML you mark up a page
to indicate its format, telling the web browser where you want a new line to begin, how
you want text or images aligned, and more.
We used the following tags in our project.
TABLE:
Tables are popular with web page authors because they let you arrange the elements of a
web page in such a way that the browser won’t rearrange them. Web page authors frequently use
tables to structure web pages.
TR:
TR is used to create a row in a table; it encloses <TH> and
<TD> elements. <TR> has many attributes. Some of them are:
• ALIGN: specifies the horizontal alignment of the text in the table row.
• BGCOLOR: Specifies the background color for the row.
• BORDERCOLOR: Sets the external border color for the row.
• VALIGN: Sets the vertical alignment of the data in this row.
TH:
TH is used to create table heading.
• ALIGN: Sets the horizontal alignment of the content in the table cell. Sets LEFT,
RIGHT, CENTER.
• BACKGROUND: Specifies the background image for the table cell.
• BGCOLOR: Specifies the background color of the table cell
• VALIGN: Sets the vertical alignment of the data. Sets to TOP, MIDDLE, BOTTOM or
BASELINE.
• WIDTH: Specifies the width of the cell. Set to a pixel width or a percentage of the
display area.
TD:
TD is used to create table data that appears in the cells of a table.
• ALIGN: Specifies the horizontal alignment of content in the table cell. Set to LEFT,
CENTER, RIGHT.
• BACKGROUND: Specifies the background image for the table cell.
• BGCOLOR: Sets the background color of the table cell.
• WIDTH: Specifies the width of the cell.
FRAMES:
Frames divide the browser window into separate panes, so that content does not run off the
page or display only small slices of what is supposed to be shown. To configure the frames
we use <FRAMESET>. There are two
important points to consider when working with <FRAMESET>:
• The <FRAMESET> element actually takes the place of the <BODY> element in a document.
• Specifying actual pixel dimensions for frames.
<FRAME> elements are used to create the actual frames.
From the frameset point of view, dividing the browser into two vertical frames means
creating two columns using the <FRAMESET> element’s COLS attribute.
The syntax for vertical fragmentation is,
<FRAMESET COLS="50%, 50%">
</FRAMESET>
Similarly, if we replace COLS with ROWS then we get horizontal fragmentation.
The syntax for horizontal fragmentation is,
<FRAMESET ROWS="50%, 50%">
</FRAMESET>
FORM:
The purpose of FORM is to create an HTML form; it is used to enclose HTML
controls, like buttons and text fields.
ATTRIBUTES:
• ACTION: Gives the URL that will handle the form data.
• NAME: Gives a name to the form so you can reference it in code. Set to an
alphanumeric string.
• METHOD: The method or protocol used for sending data to the target action URL. The
GET method is the default; it sends all form name/value pair information
in the URL. With the POST method, the contents of the form are encoded as with the
GET method, but are sent in the body of the HTTP request.
CONTROLS IN HTML
<INPUT TYPE =BUTTON>:
Creates an html button in a form.
ATTRIBUTES:
• NAME: gives the element a name. Set to alphanumeric characters.
• SIZE: sets the size.
• VALUE: sets the caption of the element.
<INPUT TYPE = PASSWORD>:
Creates a password text field, which masks typed input.
ATTRIBUTES:
• NAME: gives the element a name, set to alphanumeric characters.
• VALUE: sets the default content of the element.
<INPUT TYPE=RADIO>:
Creates a radio button in a form.
ATTRIBUTE:
• NAME: Gives the element a name. Set to alphanumeric character.
• VALUE: Sets the default content of the element.
<INPUT TYPE=SUBMIT>:
Creates a submit button that the user can click to send data in the form back to the web server.
ATTRIBUTES:
NAME: Gives the element a name. Set to alphanumeric characters.
VALUE: Gives this button another label besides the default, Submit Query. Set to alphanumeric
characters.
<INPUT TYPE=TEXT>:
Creates a text field that the user can enter or edit text in.
ATTRIBUTES:
NAME: Gives the element a name. Set to alphanumeric characters.
VALUE: Holds the initial text in the text field. Set to alphanumeric characters.
6.2 JAVASCRIPT
JavaScript, originally supported by Netscape Navigator, is the most popular web
scripting language today. JavaScript lets you embed programs right in your web pages
and run these programs using the web browser. You place these programs in a <SCRIPT>
element, usually within the <HEAD> element. If you want the script to write directly to
the web page, place it in the <BODY> element.
JAVASCRIPT METHODS AND EVENTS:
writeln:
document.writeln() is a method which is used to write some text to the current web page.
onClick:
Occurs when an element is clicked.
onLoad:
Occurs when the page loads.
onMouseDown:
Occurs when a mouse button goes down.
onMouseMove:
Occurs when the mouse moves.
onUnload:
Occurs when a page is unloaded.
6.3 MySQL
JDBC DRIVERS:
The JDBC API only defines interfaces for objects used for performing various database-
related tasks like opening and closing connections, executing SQL commands, and retrieving the
results. We all write our programs to interfaces and not implementations. Either the resource
manager vendor or a third party provides the implementation classes for the standard JDBC
interfaces. These software implementations are called JDBC drivers. JDBC drivers transform the
standard JDBC calls to the external resource manager-specific API calls. The diagram below
depicts how a database client written in java accesses an external resource manager using the
JDBC API and JDBC driver:
Depending on the mechanism of implementation, JDBC drivers are broadly classified
into four types.
TYPE1:
Type1 JDBC drivers implement the JDBC API on top of a lower level API like ODBC.
These drivers are not generally portable because of the dependency on native libraries. These
drivers translate the JDBC calls to ODBC calls and ODBC sends the request to the external data
source using native library calls. The JDBC-ODBC driver that comes with the software
distribution for J2SE is an example of a type1 driver.
TYPE2:
Type2 drivers are written in a mixture of Java and native code. Type2 drivers use vendor-specific
native APIs for accessing the data source. These drivers transform the JDBC calls to
vendor-specific calls using the vendor’s native library.
Like type1 drivers, these drivers are not portable because of the dependency on native code.
TYPE3:
Type3 drivers use an intermediate middleware server for accessing the external data
sources. The calls to the middleware server are database independent. However, the middleware
server makes vendor specific native calls for accessing the data source. In this case, the driver is
purely written in java.
TYPE4:
Type4 drivers are written in pure java and implement the JDBC interfaces and translate
the JDBC specific calls to vendor specific access calls. They implement the data transfer and
network protocol for the target resource manager. Most of the leading database vendors provide
type4 drivers for accessing their database servers.
DRIVER MANAGER AND DRIVER:
The java.sql package defines an interface called java.sql.Driver that must be
implemented by all the JDBC drivers and a class called java.sql.DriverManager that acts as the
interface to the database clients for performing tasks like connecting to external resource
managers and setting log streams. When a JDBC client requests the DriverManager to make a
connection to an external resource manager, it delegates the task to an appropriate driver
class implemented by the JDBC driver provided either by the resource manager vendor or a
third party.
JAVA.SQL.DRIVERMANAGER:
The primary task of the DriverManager class is to manage the various registered JDBC
drivers. It also provides methods for:
• Getting connections to the databases.
• Managing JDBC logs.
• Setting login timeout.
MANAGING DRIVERS:
JDBC clients specify the JDBC URL when they request a connection. The driver
manager can find a driver that matches the requested URL from the list of registered drivers and
delegate the connection request to that driver if it finds a match. JDBC URLs normally take the
following format:
<protocol>:<sub-protocol>:<resource>
The protocol is always jdbc, and the sub-protocol and resource depend on the type of
resource manager. The URL for PostgreSQL is in the format:
jdbc:postgresql://<host>:<port>/<database>
Here host is the address of the host on which the postmaster is running and database is the name of
the database to which the client wishes to connect.
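The URL format and the driver matching described above can be sketched as follows. The class DriverLookupSketch and its methods are hypothetical helpers written for illustration; findDriver only approximates what the driver manager does when it looks for a matching driver, using the standard DriverManager.getDrivers and Driver.acceptsURL calls.

```java
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Enumeration;

// Hypothetical sketch: building a JDBC URL and finding a registered driver for it.
final class DriverLookupSketch {

    // Builds a URL of the form jdbc:<sub-protocol>://<host>:<port>/<database>.
    static String jdbcUrl(String subProtocol, String host, int port, String database) {
        return "jdbc:" + subProtocol + "://" + host + ":" + port + "/" + database;
    }

    // Returns the first registered driver that accepts the URL, or null if none does.
    static Driver findDriver(String url) {
        Enumeration<Driver> drivers = DriverManager.getDrivers();
        while (drivers.hasMoreElements()) {
            Driver driver = drivers.nextElement();
            try {
                if (driver.acceptsURL(url)) {
                    return driver;
                }
            } catch (SQLException e) {
                // This driver could not examine the URL; try the next one.
            }
        }
        return null;
    }
}
```

For example, jdbcUrl("postgresql", "localhost", 5432, "facets") yields jdbc:postgresql://localhost:5432/facets, where the host, port, and database name are placeholder values.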
MANAGING CONNECTION:
DriverManager class is responsible for managing connections to the databases:
public static Connection getConnection (String url,Properties info) throws SQLException
This method gets a connection to the database identified by the specified JDBC URL, using the
connection properties passed in info (normally at least the user name and password). This method
throws an instance of SQLException if a database access error occurs.
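A minimal sketch of calling this method is shown below. The class ConnectionSketch and its helper names are assumptions for illustration; the property keys "user" and "password" are the ones JDBC drivers conventionally expect in the info argument.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

// Hypothetical sketch: opening a connection with getConnection(url, info).
final class ConnectionSketch {

    // Packs the credentials into the Properties object passed as the info argument.
    static Properties credentials(String user, String password) {
        Properties info = new Properties();
        info.setProperty("user", user);
        info.setProperty("password", password);
        return info;
    }

    // Asks the DriverManager for a connection; an SQLException is thrown when no
    // registered driver accepts the URL or a database access error occurs.
    static Connection open(String url, Properties info) throws SQLException {
        return DriverManager.getConnection(url, info);
    }
}
```

The caller is responsible for closing the returned Connection, for instance with a try-with-resources statement.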
CONNECTIONS:
The interface java.sql.Connection defines the methods required for a persistent
connection to the database. The JDBC driver vendor implements this interface. A database
‘vendor-neutral’ client never uses the implementation class and will always use only the
interface. This interface defines methods for the following tasks:
• Creating statements, prepared statements, and callable statements, which are the different
types of statements for issuing SQL statements to the database by the JDBC clients.
• Getting and setting auto-commit mode.
• Getting meta information about the database.
• Committing and rolling back transactions.
CREATING STATEMENTS:
The interface java.sql.Connection defines a set of methods for creating database
statements. Database statements are used for sending SQL statements to the database:
public Statement createStatement() throws SQLException
This method is used for creating instances of the interface java.sql.Statement. This
interface can be used for sending SQL statements to the database. The interface
java.sql.Statement is normally used for sending SQL statements that don’t take any arguments.
This method throws an instance of SQLException if a database access error occurs:
public Statement createStatement(int resType, int resConcurrency) throws SQLException
JDBC RESULTSETS:
A JDBC resultset represents a two-dimensional array of data produced as a result of
executing SQL SELECT statements against databases using JDBC statements. JDBC resultsets
are represented by the interface java.sql.ResultSet. The JDBC driver vendor provides the
implementation class for this interface.
SCROLLING RESULTSETS:
public boolean next() throws SQLException
public boolean previous() throws SQLException
public boolean first() throws SQLException
public boolean last() throws SQLException
ACCESSING RESULTSET DATA:
STATEMENT:
The interface java.sql.Statement is normally used for sending SQL statements that do not have
IN or OUT parameters. The JDBC driver vendor provides the implementation class for this
interface. The common methods required by the different JDBC statements are defined in this
interface. The methods defined by java.sql.Statement can be broadly categorized as follows:
• Executing SQL statements
• Querying results and resultsets
• Handling SQL batches
• Other miscellaneous methods
The interface java.sql.Statement defines
methods for executing different SQL statements like SELECT, UPDATE, INSERT, DELETE,
and CREATE.
public ResultSet executeQuery(String sql) throws SQLException
The following figure shows how the DriverManager, Driver, Connection, Statement,
ResultSet classes are connected.
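How a Connection, a Statement, and a ResultSet work together can be sketched as below. The class QuerySketch and its method names are hypothetical, and any SQL text or column name passed in is supplied by the caller; only standard java.sql calls (createStatement, executeQuery, next, getString) are used.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: running a SELECT and reading one column of the result.
final class QuerySketch {

    // Moves forward through the result set with next() and collects one column.
    static List<String> readColumn(ResultSet rs, String column) throws SQLException {
        List<String> values = new ArrayList<>();
        while (rs.next()) {
            values.add(rs.getString(column));
        }
        return values;
    }

    // Creates a statement, executes a SELECT that takes no arguments, and returns
    // one column. The statement and the result set are closed automatically.
    static List<String> selectColumn(Connection conn, String sql, String column)
            throws SQLException {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            return readColumn(rs, column);
        }
    }
}
```

Queries that take user-supplied arguments would normally go through a prepared statement instead, since a plain Statement is meant for SQL without parameters.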
6.4 JAVA SERVER PAGES (JSP)
INTRODUCTION:
Java Server Pages (JSP) technology enables you to mix regular, static HTML with
dynamically generated content. You simply write the regular HTML in the normal manner, using
familiar Web-page-building tools. You then enclose the code for the dynamic parts in special
tags, most of which start with <% and end with %>.
THE NEED FOR JSP:
Servlets are indeed useful, and JSP by no means makes them obsolete. However, with servlets alone:
• It is hard to write and maintain the HTML.
• You cannot use standard HTML tools.
• The HTML is inaccessible to non-Java developers.
BENEFITS OF JSP:
JSP provides the following benefits over servlets alone:
• It is easier to write and maintain the HTML: there are no extra backslashes, no double
quotes, and no lurking Java syntax.
• You can use standard Web-site development tools:
We use Macromedia Dreamweaver for most of the JSP pages. Even HTML tools that know
nothing about JSP can be used because they simply ignore the JSP tags.
• You can divide up your development team:
The Java programmers can work on the dynamic code. The Web developers can
concentrate on the presentation layer. On large projects, this division is very important.
Depending on the size of your team and the complexity of your project, you can enforce a
weaker or stronger separation between the static HTML and the dynamic content.
CREATING TEMPLATE TEXT:
A large percentage of our JSP document consists of static text known as template text. In
almost all respects, this HTML looks just like normal HTML, follows all the same syntax
rules, and is simply “passed through” to the client by the servlet created to handle the page. Not
only does the HTML look normal, it can be created by whatever tools you already are
using for building Web pages.
There are two minor exceptions to the “template text passed through” rule. First, if you
want to have <% or %> in the output, you need to put <\% or %\> in the template text. Second,
if you want a comment to appear in the JSP page but not in the resultant document, use
<%-- JSP Comment --%>
HTML comments of the form:
<!-- HTML Comment -->
are passed through to the client normally.
TYPES OF JSP SCRIPTING ELEMENTS:
JSP scripting elements allow you to insert Java code into the servlet that will be
generated from the JSP page. There are three forms:
Expressions of the form <%= Java Expression %>, which are evaluated and inserted into
the servlet’s output.
Scriptlets of the form <% Java code %>, which are inserted into the servlet’s _jspService method
(called by service).
Declarations of the form <%! Field/Method Declaration %>, which are inserted into the
body of the servlet class, outside any existing methods.
USING JSP EXPRESSIONS:
A JSP expression is used to insert values directly into the output. It has the following
form:
<%= Java Expression %>
The expression is evaluated, converted to a string, and inserted in the page. This
evaluation is performed at runtime (when the page is requested) and thus has full access to the
information about the request. For example, the following shows the date/time that the page was
requested.
Current time: <%= new java.util.Date() %>
PREDEFINED VARIABLES:
To simplify expressions we can use a number of predefined variables (or “implicit
objects”). There is nothing special about these variables: the system simply tells you what names
it will use for the local variables in _jspService. The most important ones are:
• request, the HttpServletRequest.
• response, the HttpServletResponse.
• session, the HttpSession associated with the request
• out, the writer used to send output to clients.
• application, the ServletContext. This is a data structure shared by all servlets and JSP
pages in the web application and is good for storing shared data.
Here is an example:
Your hostname: <%= request.getRemoteHost() %>
COMPARING SERVLETS TO JSP PAGES
JSP works best when the structure of the HTML page is fixed but the values at various
places need to be computed dynamically. If the structure of the page is dynamic, JSP is less
beneficial. Sometimes servlets are better in such a case. If the page consists of binary data or has
little static content, servlets are clearly superior. Sometimes the answer is neither servlets nor JSP
alone, but rather a combination of both.
WRITING SCRIPTLETS
If you want to do something more complex than output the value of a simple expression,
JSP scriptlets let you insert arbitrary code into the servlet’s _jspService method. Scriptlets
have the following form:
<% Java code %>
Scriptlets have access to the same automatically defined variables as expressions
(request, response, session, out, etc.). So, for example, if you want to explicitly send output to the
resultant page, you could use the out variable, as in the following example:
<%
String queryData = request.getQueryString();
out.println("Attached GET data: " + queryData);
%>
SCRIPTLET EXAMPLE:
As an example of code that is too complex for a JSP expression alone, consider a JSP page that
uses the bgColor request parameter to set the background color of the page. Simply using
<BODY BGCOLOR="<%= request.getParameter("bgColor") %>">
would violate the cardinal rule of reading form data: always check for missing or malformed data.
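The check that a scriptlet would perform before using the parameter can be sketched in plain Java as below. The class FormDataSketch, the method safeColor, and the rule for what counts as a well-formed color (a plain color name or a #RRGGBB value) are assumptions made for this illustration.

```java
// Hypothetical sketch: validating a form parameter before it is written to the page.
final class FormDataSketch {

    // Returns the requested color when it is a plain color name or a #RRGGBB value;
    // returns the fallback when the parameter is missing, blank, or malformed.
    static String safeColor(String param, String fallback) {
        if (param == null) {
            return fallback;
        }
        String value = param.trim();
        if (value.matches("[A-Za-z]+") || value.matches("#[0-9A-Fa-f]{6}")) {
            return value;
        }
        return fallback;
    }
}
```

Inside a scriptlet, the result of safeColor(request.getParameter("bgColor"), "WHITE") could be stored in a local variable and then written into the BGCOLOR attribute with a JSP expression.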
USING DECLARATIONS
A JSP declaration lets you define methods or fields that get inserted into the main body of
the servlet class. A declaration has the following form:
<%! Field or Method Definition %>
Since declarations do not generate output, they are normally used in conjunction with JSP
expressions or scriptlets. In principle, JSP declarations can contain field (instance variable)
definitions, method definitions, inner class definitions, or even static initializer blocks: anything
that is legal to put inside a class definition but outside any existing methods. In practice,
declarations almost always contain field or method definitions.
We should not use JSP declarations to override the standard servlet life cycle methods.
The servlet into which the JSP page gets translated already makes use of these methods. There is
no need for declarations to gain access to service, doGet, or doPost, since calls to service are
automatically dispatched to _jspService, which is where code resulting from expressions and
scriptlets is put. For initialization and cleanup, however, we can use jspInit and jspDestroy; the
standard init and destroy methods are guaranteed to call these methods in the servlets that come
from JSP.
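The effect of a declaration can be sketched with a simplified stand-in for the generated servlet class. This is an illustration only: the class name GeneratedPageSketch, the accessCount field, and the jspService method (standing in for _jspService) are assumptions, and the real servlet API is deliberately left out.

```java
// Simplified stand-in for the servlet class generated from a JSP page.
// It does not use the real servlet API; it only shows where declared members end up.
final class GeneratedPageSketch {
    // Corresponds to a declaration such as: <%! private int accessCount = 0; %>
    private int accessCount = 0;
    private boolean ready = false;

    // Corresponds to a jspInit method supplied in a declaration.
    void jspInit() {
        ready = true;
    }

    // Stand-in for _jspService: code from expressions and scriptlets lands here
    // and can use the declared field.
    String jspService() {
        if (!ready) {
            throw new IllegalStateException("jspInit has not been called");
        }
        accessCount++;
        return "Accesses to page: " + accessCount;
    }

    public static void main(String[] args) {
        GeneratedPageSketch page = new GeneratedPageSketch();
        page.jspInit();
        System.out.println(page.jspService());
        System.out.println(page.jspService());
    }
}
```

Because the field lives in the class rather than inside the service method, its value persists across requests, which is why declarations are the natural place for counters and other per-page state.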
6.5 Apache Tomcat
Apache Tomcat is a web server and Servlet/JSP container that implements the Servlet
2.4 and JavaServer Pages 2.0 specifications. It also includes many additional features that make it
a useful platform for developing and deploying web applications and web services.
TERMINOLOGY:
Context – a Context is a web application.
$CATALINA_HOME – This represents the root of the Tomcat installation.
DIRECTORIES AND FILES:
/bin – Startup, shutdown, and other scripts. The *.sh files (for Unix systems) are functional
duplicates of the *.bat files (for Windows systems). Since the Win32 command-line lacks certain
functionality, there are some additional files in here.
/conf – Configuration files and related DTDs. The most important file in here is server.xml. It is
the main configuration file for the container.
/logs – Log files are here by default.
/webapps – This is where web applications go.
INSTALLATION:
Tomcat will operate under any Java Development Kit (JDK) environment that provides a
JDK 1.2 (also known as Java2 Standard Edition, or J2SE) or later platform. JDK is needed so
that servlets, other classes, and JSP pages can be compiled.
DEPLOYMENT DIRECTORIES FOR DEFAULT WEB APPLICATION:
HTML and JSP Files
• Main Location
$CATALINA_HOME/webapps/ROOT
• Corresponding URLs.
http://host/SomeFile.html
http://host/SomeFile.jsp
• More Specific Location (Arbitrary Subdirectory).
$CATALINA_HOME/webapps/ROOT/SomeDirectory
• Corresponding URLs
http://host/SomeDirectory/SomeFile.html
http://host/SomeDirectory/SomeFile.jsp
Individual Servlet and Utility Class Files
• Main Location (Classes without Packages).
$CATALINA_HOME/webapps/ROOT/WEB-INF/classes
• Corresponding URL (Servlets).
http://host/servlet/ServletName
• More Specific Location (Classes in Packages).
$CATALINA_HOME/webapps/ROOT/WEB-INF/classes/packageName
• Corresponding URL (Servlets in Packages).
http://host/servlet/packageName.ServletName
Servlet and Utility Class Files Bundled in JAR Files
• Location
$CATALINA_HOME/webapps/ROOT/WEB-INF/lib
• Corresponding URLs (Servlets)
http://host/servlet/ServletName
http://host/servlet/packageName.ServletName
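The location-to-URL rules listed above can be summarized in a small sketch. The class and method names are hypothetical, and the servlet URLs assume that the /servlet/ mapping described above is enabled in the container configuration.

```java
// Minimal sketch of the default web application's file-to-URL conventions.
final class DefaultWebAppUrls {
    // URL of an HTML or JSP file stored under $CATALINA_HOME/webapps/ROOT.
    // relativePath is the path below ROOT, for example "SomeDirectory/SomeFile.jsp".
    static String pageUrl(String host, String relativePath) {
        return "http://" + host + "/" + relativePath;
    }

    // URL of a servlet class stored under WEB-INF/classes or in a JAR under WEB-INF/lib.
    // packageName is empty for classes without a package.
    static String servletUrl(String host, String packageName, String servletName) {
        String qualifiedName = packageName.isEmpty()
                ? servletName
                : packageName + "." + servletName;
        return "http://" + host + "/servlet/" + qualifiedName;
    }

    public static void main(String[] args) {
        System.out.println(pageUrl("host", "SomeDirectory/SomeFile.jsp"));
        System.out.println(servletUrl("host", "packageName", "ServletName"));
    }
}
```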
7. TESTING
SOFTWARE TESTING
Software testing is a critical element of software quality assurance and represents the
ultimate review of specification, design and code generation.
7.1 TESTING OBJECTIVES
• To ensure that during operation the system will perform as per specification.
• To make sure that the system meets the user requirements during operation.
• To make sure that during operation, incorrect input, processing, and output will be
detected.
• To see that when correct inputs are fed to the system the outputs are correct.
• To verify that the controls incorporated in the system work as intended.
• Testing is a process of executing a program with the intent of finding an error.
• A good test case is one that has a high probability of finding an as yet undiscovered error.
The software developed has been tested using the following testing strategies. Any errors
that were encountered were corrected, and the affected part of the program, procedure, or
function was tested again until all the errors were removed. A successful test is one
that uncovers an as yet undiscovered error.
The results of system testing demonstrate that the system works as specified. This gives
confidence to the system designer and the users of the system, and helps prevent frustration
during the implementation process.
7.2 TEST CASE DESIGN:
White box testing
White box testing is a test case design method that uses the control structure of the
procedural design to derive test cases. All independent paths in a module are exercised at least
once, all logical decisions are exercised on their true and false sides, all loops are executed at
their boundaries and within their operational bounds, and internal data structures are exercised
to ensure their validity. Here the customer is given three chances to enter a valid choice from
the given menu, after which control exits the current menu.
Black Box Testing
Black box testing attempts to find errors in the following categories: incorrect or
missing functions, interface errors, errors in data structures, performance errors, and
initialization and termination errors. Here all the input data must match the expected data type
to be accepted as a valid entry.
The following are the different tests at various levels:
Unit Testing:
Unit testing is essentially for the verification of the code produced during the coding
phase, and the goal is to test the internal logic of the module/program. In the Generic code project,
unit testing is done during the coding phase of the data entry forms to check whether the functions
are working properly. In this phase all the drivers are also tested to check that they are correctly
connected.
Integration Testing:
All the tested modules are combined into subsystems, which are then tested. The goal is
to see if the modules are properly integrated, with the emphasis on testing the interfaces
between the modules. In the generic code, integration testing is done mainly on the table creation
module and the insertion module.
Validation Testing
This testing concentrates on confirming that the software is error-free in all respects. All
the specified validations are verified and the software is subjected to hard-core testing. It also
aims at determining the degree to which the designed software deviates from the
specification; the deviations are listed and corrected.
System Testing
This testing is a series of different tests whose primary purpose is to fully exercise the
computer-based system. This involves:
• Implementing the system in a simulated production environment and testing it.
• Introducing errors and testing for error handling.
TEST CASE 1 :
Test case for Login form:
When a user tries to log in by submitting an incorrect ID or an incorrect password, the
system displays the error message “NOT A VALID USER NAME”.
TEST CASE 2:
Test case for User Registration form:
When a user enters a user ID to register and that ID already exists, the system displays the
error message “USER ID ALREADY EXISTS”.
TEST CASE 3 :
Test case for Change Password:
When the old password does not match the new password, the system displays the
error message “OLD PASSWORD DOES NOT MATCH WITH THE NEW
PASSWORD”.
TEST CASE 4 :
Test case for Forgot Password:
When a user forgets his password, he is asked to enter his login name, ZIP code, and mobile
number. If these match the stored values, the user gets his original
password.
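Test cases 1 and 2 can be written as executable checks. The sketch below is only an illustration: the class AccountFormChecks, its methods, the in-memory map, and the "OK" success value are assumptions, while the two error messages are the ones quoted in the test cases.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the registration and login forms.
// Passwords are kept in plain text only to keep the illustration short.
final class AccountFormChecks {
    private final Map<String, String> passwordsByUserId = new HashMap<>();

    // Test case 2: registering an ID that already exists is rejected.
    String register(String userId, String password) {
        if (passwordsByUserId.containsKey(userId)) {
            return "USER ID ALREADY EXISTS";
        }
        passwordsByUserId.put(userId, password);
        return "OK";
    }

    // Test case 1: an unknown ID or a wrong password is rejected.
    String login(String userId, String password) {
        String stored = passwordsByUserId.get(userId);
        if (stored == null || !stored.equals(password)) {
            return "NOT A VALID USER NAME";
        }
        return "OK";
    }

    public static void main(String[] args) {
        AccountFormChecks forms = new AccountFormChecks();
        System.out.println(forms.register("user1", "secret"));
        System.out.println(forms.register("user1", "other"));
        System.out.println(forms.login("user1", "wrong"));
    }
}
```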
8. SYSTEM DESIGN
System design is the transition from a user-oriented document to a document oriented to
programmers or database personnel. The design is a solution, a "how to" approach to the creation
of a new system. It is composed of several steps. It provides the understanding and procedural
details necessary for implementing the system recommended in the feasibility study. Design goes
through logical and physical stages of development. Logical design reviews the present physical
system, prepares the input and output specifications, details the implementation plan, and
prepares a logical design walkthrough.
The database tables are designed by analyzing the functions involved in the system, and the
format of the fields is also designed. The fields in the database tables should define their role in
the system. Unnecessary fields should be avoided because they consume storage. In the input
and output screen design, the design should be made user friendly. The
menu should be precise and compact.
SOFTWARE DESIGN
In designing the software, the following principles are followed:
1. Modularity and partitioning: the software is designed so that each system consists of a
hierarchy of modules that partition it into separate functions.
2. Coupling: modules should have little dependence on other modules of the system.
3. Cohesion: modules should carry out a single processing function.
4. Shared use: avoid duplication by allowing a single module to be called by others that need the
function it provides.
UML Concepts
The Unified Modelling Language (UML) is a standard language for writing software
blueprints. The UML is a language for
• Visualizing
• Specifying
• Constructing
• Documenting
the artefacts of a software-intensive system.
The UML is a language that provides a vocabulary and the rules for combining words in
that vocabulary for the purpose of communication. A modelling language is a language whose
vocabulary and rules focus on the conceptual and physical representation of a system.
Modelling yields an understanding of a system.
Building Blocks of the UML:
The vocabulary of the UML encompasses three kinds of building blocks:
• Things
• Relationships
• Diagrams
Things are the abstractions that are first-class citizens in a model; relationships tie these
things together; diagrams group interesting collections of things.
Things in the UML:
There are four kinds of things in the UML:
• Structural things
• Behavioral things
• Grouping things
• Annotational things
Structural things are the nouns of UML models. The structural things used in the project
design are as follows.
First, a class is a description of a set of objects that share the same attributes,
operations, relationships and semantics.
Window
origin
size
open()
close()
move()
display()
Fig: Classes
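The Window class in the figure maps directly onto code. The sketch below is one possible reading: the figure gives only the attribute and operation names, so the integer coordinates, the constructor, and the text returned by display() are assumptions.

```java
// Sketch of the Window class from the figure; field types are assumed.
final class Window {
    // "origin" attribute, modelled here as an (x, y) pair.
    private int originX;
    private int originY;
    // "size" attribute, modelled here as width and height.
    private final int width;
    private final int height;
    private boolean open = false;

    Window(int originX, int originY, int width, int height) {
        this.originX = originX;
        this.originY = originY;
        this.width = width;
        this.height = height;
    }

    void open() { open = true; }

    void close() { open = false; }

    void move(int newX, int newY) {
        originX = newX;
        originY = newY;
    }

    // Returns a textual rendering instead of drawing on screen.
    String display() {
        return (open ? "open" : "closed") + " window at (" + originX + "," + originY
                + ") size " + width + "x" + height;
    }

    public static void main(String[] args) {
        Window w = new Window(0, 0, 640, 480);
        w.open();
        w.move(10, 20);
        System.out.println(w.display());
    }
}
```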
Second, a use case is a description of a set of sequences of actions that a system performs that
yields an observable result of value to a particular actor.
Fig: Use Cases
Third, a node is a physical element that exists at runtime and represents a computational
resource, generally having at least some memory and often processing capability.
Fig: Nodes
Behavioural things are the dynamic parts of UML models. The behavioural thing used is:
Interaction:
An interaction is a behaviour that comprises a set of messages exchanged among a set of
objects within a particular context to accomplish a specific purpose. An interaction involves a
number of other elements, including messages, action sequences (the behaviour invoked by a
message), and links (the connections between objects).
Fig: Messages
Relationships in the UML:
There are four kinds of relationships in the UML:
• Dependency
• Association
• Generalization
• Realization
A dependency is a semantic relationship between two things in which a change to one
thing may affect the semantics of the other thing (the dependent thing).
Fig: Dependencies
An association is a structural relationship that describes a set of links, a link being a
connection among objects. Aggregation is a special kind of association, representing a structural
relationship between a whole and its parts.
Fig: Association
A generalization is a specialization/generalization relationship in which objects of the
specialized element (the child) are substitutable for objects of the generalized element (the
parent).
Fig: Generalization
A realization is a semantic relationship between classifiers, wherein one classifier
specifies a contract that another classifier guarantees to carry out.
Fig: Realization
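The four relationships have rough counterparts in Java. The following sketch uses hypothetical shape classes that are not part of the project; it is meant only to show how each relationship typically appears in code.

```java
import java.util.ArrayList;
import java.util.List;

// Realization: Shape specifies a contract that implementing classes carry out.
interface Shape {
    double area();
}

// Rectangle realizes the Shape contract.
class Rectangle implements Shape {
    private final double width;
    private final double height;

    Rectangle(double width, double height) {
        this.width = width;
        this.height = height;
    }

    public double area() {
        return width * height;
    }
}

// Generalization: a Square (child) is substitutable for a Rectangle (parent).
class Square extends Rectangle {
    Square(double side) {
        super(side, side);
    }
}

// Association (aggregation): a Drawing is a whole made up of Shape parts.
class Drawing {
    private final List<Shape> shapes = new ArrayList<>();

    void add(Shape shape) {
        shapes.add(shape);
    }

    List<Shape> shapes() {
        return shapes;
    }
}

// Dependency: AreaReport uses Drawing only as a parameter,
// so a change to Drawing may affect AreaReport.
final class AreaReport {
    static double totalArea(Drawing drawing) {
        double total = 0.0;
        for (Shape shape : drawing.shapes()) {
            total += shape.area();
        }
        return total;
    }

    public static void main(String[] args) {
        Drawing drawing = new Drawing();
        drawing.add(new Rectangle(2, 3));
        drawing.add(new Square(2));
        System.out.println(totalArea(drawing));
    }
}
```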
Sequence Diagrams:
UML sequence diagrams are used to represent the flow of messages, events and actions
between the objects or components of a system. Time is represented in the vertical direction
showing the sequence of interactions of the header elements, which are displayed horizontally at
the top of the diagram.
Sequence Diagrams are used primarily to design, document and validate the architecture,
interfaces and logic of the system by describing the sequence of actions that need to be
performed to complete a task or scenario. UML sequence diagrams are useful design tools
because they provide a dynamic view of the system behaviour which can be difficult to extract
from static diagrams or specifications.
Actor
Represents an external person or entity that interacts with the system
Object
Represents an object in the system or one of its components
Fig: Object
Unit
Represents a subsystem, component, unit, or other logical entity in the system (may or
may not be implemented by objects)
Fig: Unit
Separator
Represents an interface or boundary between subsystems, components or units (e.g., air
interface, Internet, network)
Fig: Separator
Group
Groups related header elements into subsystems or components
Fig: Group
Sequence Diagram Body Elements
Action
Represents an action taken by an actor, object or unit
Asynchronous Message
An asynchronous message between header elements
Fig: Asynchronous Message
Block
A block representing a loop or conditional for a particular header element
Fig: Block
Call Message
A call (procedure) message between header elements
Fig: Call Message
Create Message
A "create" message that creates a header element (represented by lifeline going from
dashed to solid pattern)
Fig: Create Message
Diagram Link
Represents a portion of a diagram being treated as a functional block. Similar to a
procedure or function call that abstracts functionality or details not shown at this level. Can
optionally be linked to another diagram for elaboration.
Fig: Link
Else Block
Represents an "else" block portion of a diagram block
Fig: Else
Message
A simple message between header elements
Fig: Message
Return Message
A return message between header elements
Fig: Return
Sequence diagram
Use Case Diagram
A use case diagram is a graph of actors, a set of use cases enclosed by a system boundary,
communication associations between the actors and the use cases, and generalization among use
cases. The use case model defines the outside (actors) and inside (use cases) of the system's
behavior. A use case diagram is quite simple in nature and depicts two types of elements: one
representing the business roles and the other representing the business processes.
Figure : actor
To identify an actor, search in the problem statement for business terms that portray roles
in the system. For example, in the statement "patients visit the doctor in the clinic for medical
tests," "doctor" and "patients" are the business roles and can be easily identified as actors in the
system.
Use case: A use case in a use case diagram is a visual representation of a distinct business
functionality in a system. The key term here is "distinct business functionality." To choose a
business process as a likely candidate for modelling as a use case, you need to ensure that the
business process is discrete in nature.
As the first step in identifying use cases, you should list the discrete business functions in
your problem statement. Each of these business functions can be classified as a potential use
case. Remember that identifying use cases is a discovery rather than a creation. As business
functionality becomes clearer, the underlying use cases become more easily evident. A use case
is shown as an ellipse in a use case diagram (see Figure ).
Figure : use case
Figure shows two use cases: "Make appointment" and "Perform medical tests" in the use
case diagram of a clinic system. As another example, consider that a business process such as
"manage patient records" can in turn have sub-processes like "manage patient's personal
information" and "manage patient's medical information." Discovering such implicit use cases is
possible only with a thorough understanding of all the business processes of the system through
discussions with potential users of the system and relevant domain knowledge.
Activity Diagram
Activity diagrams represent the business and operational workflows of a system. An
Activity diagram is a dynamic diagram that shows the activity and the event that causes the
object to be in a particular state.
So, what is the importance of an Activity diagram, as opposed to a State diagram? A
State diagram shows the different states an object is in during its lifecycle in the
system, and the transitions between those states. The transitions, shown by arrows, depict the
activities that cause the state changes.
An Activity diagram talks more about these transitions and the activities causing the changes
in the object states.
Fig: Admin activity diagram (login; enter username and password; invalid username or password;
add product; edit product; change product; delete product; view transaction; facet evaluation)
Fig: User activity diagram (login; enter username and password; invalid username or password;
search product; buy product)
Defining an Activity diagram
Let us take a look at the building blocks of an Activity diagram.
Elements of an Activity diagram
An Activity diagram consists of the following behavioural elements:
Initial Activity: This shows the starting point or first activity of the flow. Denoted by a
solid circle. This is similar to the notation used for Initial State.
Fig: Initial State
Activity: Represented by a rectangle with rounded (almost oval) edges.
Fig: Action State
Decisions: Similar to flowcharts, a point where a decision is to be made is depicted by a
diamond, with the options written on either side of the arrows emerging from the diamond,
within square brackets.
Fig: Decision
Signal: When an activity sends or receives a message, that activity is called a signal.
Signals are of two types: Input signal (Message receiving activity) shown by a concave polygon
and Output signal (Message sending activity) shown by a convex polygon.
Fig: Signal
Concurrent Activities: Some activities occur simultaneously or in parallel. Such activities
are called concurrent activities. For example, listening to the lecturer and looking at the
blackboard are parallel activities. This is represented by a horizontal split (thick dark line) with
the two concurrent activities next to each other, and a second horizontal line to show the end of the
parallel activity.
Fig: Horizontal
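The split and the closing bar correspond to what is usually called fork and join in code. A minimal sketch using the lecture example from the text, with hypothetical class and method names, might look like this:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Sketch of two concurrent activities: fork (start both), then join (wait for both).
final class ConcurrentActivitiesSketch {
    static List<String> attendLecture() {
        // Thread-safe list, since both activities record their completion concurrently.
        List<String> completed = new CopyOnWriteArrayList<>();
        Thread listening = new Thread(() -> completed.add("listen to lecturer"));
        Thread watching = new Thread(() -> completed.add("look at blackboard"));

        // Fork: the first horizontal bar.
        listening.start();
        watching.start();

        // Join: the second horizontal bar; continue only when both have finished.
        try {
            listening.join();
            watching.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for activities", e);
        }
        return completed;
    }

    public static void main(String[] args) {
        System.out.println(attendLecture().size() + " activities completed");
    }
}
```

The order in which the two activities finish is not fixed, which is exactly what the parallel notation in the diagram expresses.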
Final Activity: The end of the Activity diagram is shown by a bull's eye symbol, also called a
final activity.
Fig: Final State
Class Diagram
An object is any person, place, thing, concept, event, screen, or report applicable to your
system. Objects both know things (they have attributes) and they do things (they have methods).
A class is a representation of an object and, in many ways, it is simply a template from which
objects are created. Classes form the main building blocks of an object-oriented application.
Class diagram
Responsibilities
Classes are typically modeled as rectangles with three sections: the top section for the
name of the class, the middle section for the attributes of the class, and the bottom section for the
methods of the class. Attributes are the information stored about an object, while methods are the
things an object or class does. For example, students have student numbers, names, addresses, and
phone numbers. Those are all examples of the attributes of a student. Students also enroll in
courses, drop courses, and request transcripts. Those are all examples of the things a student
does, which get implemented (coded) as methods. You should think of methods as the object-
oriented equivalent of functions and procedures.
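The student example can be sketched as a Java class with the three compartments described above: name, attributes, and methods. The attribute types, the constructor, and the return values are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the Student class from the example; types and return values are assumed.
final class Student {
    // Attributes: the information stored about a student.
    private final String studentNumber;
    private final String name;
    private final String address;
    private final String phoneNumber;
    private final List<String> courses = new ArrayList<>();

    Student(String studentNumber, String name, String address, String phoneNumber) {
        this.studentNumber = studentNumber;
        this.name = name;
        this.address = address;
        this.phoneNumber = phoneNumber;
    }

    // Methods: the things a student does.
    // Returns false if the student is already enrolled in the course.
    boolean enroll(String course) {
        if (courses.contains(course)) {
            return false;
        }
        courses.add(course);
        return true;
    }

    // Returns false if the student was not enrolled in the course.
    boolean drop(String course) {
        return courses.remove(course);
    }

    String requestTranscript() {
        return name + " (" + studentNumber + "): " + String.join(", ", courses);
    }

    public static void main(String[] args) {
        Student s = new Student("S001", "Asha", "12 Main St", "555-0100");
        s.enroll("Databases");
        System.out.println(s.requestTranscript());
    }
}
```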
Object diagram
An object diagram in the Unified Modeling Language (UML) is a diagram that shows a
complete or partial view of the structure of a modeled system at a specific time.
An Object diagram focuses on some particular set of object instances and attributes, and
the links between the instances. A correlated set of object diagrams provides insight into how an
arbitrary view of a system is expected to evolve over time. Object diagrams are more concrete
than class diagrams, and are often used to provide examples, or act as test cases for the class
diagrams. Only those aspects of a model that are of current interest need be shown on an object
diagram.
Deployment Diagram
Deployment diagrams are used to visualize the topology of the physical components of a
system where the software components are deployed.
So deployment diagrams are used to describe the static deployment view of a system.
Deployment diagrams consist of nodes and their relationships.
The name Deployment itself describes the purpose of the diagram. Deployment diagrams are
used for describing the hardware components where software components are deployed.
The purpose of deployment diagrams can be described as:
• Visualize the hardware topology of a system.
• Describe the hardware components used to deploy software components.
• Describe runtime processing nodes.
Fig: Deployment Diagram
Component Diagram
In the Unified Modelling Language, a component diagram depicts how components are
wired together to form larger components and/or software systems. Component diagrams are used
to illustrate the structure of arbitrarily complex systems.
Components are wired together by using an assembly connector to connect the required
interface of one component with the provided interface of another. This illustrates the service
consumer - service provider relation between the two components.
Entity relationship diagram:
An entity-relationship (ER) diagram is a graphical representation of entities and their
relationships to each other, typically used in computing in regard to the organization of data
within databases or information systems.
INPUT/OUTPUT DESIGN
Input design: Considering the requirements, the procedures to collect the necessary input data
are designed to be as efficient as possible. The input design has been done so that the
interaction of the user with the system is as effective and simple as possible.
Measures are also taken for the following:
• Controlling the amount of input
• Avoiding unauthorized access to the classroom
• Eliminating extra steps
• Keeping the process simple
At this stage the input forms and screens are designed.
Output design: All the screens of the system are designed to give the user easy
operation in a simple and efficient way, with the minimum possible key strokes. Instructions and
important information are emphasized on the screen. Almost every screen is provided with
error messages, important messages, and option selection facilities. Emphasis is given to speedy
processing and speedy transitions between the screens. Each screen is designed to be as
user friendly as possible by using interactive procedures, so that the user can operate the system
without much help from the operating manual.
Database Design
Copy your db tables here
9. OUTPUT SCREENS
Copy your screen shots here
CONCLUSION
Existing query facet mining algorithms, including QDMiner, QF-I, and QF-J, mainly
rely on the top search results from search engines. The coverage of facets mined using this
kind of method might be limited, because usually only a small number of results are used. We
propose leveraging knowledge bases as complementary data sources. We use two methods,
namely facet generation and facet expansion, to mine query facets. Facet generation directly uses
properties in Freebase as candidates, while facet expansion expands the initial facets mined
by QDMiner in property-based and type-based manners. Experimental results show that our
approach is effective, especially for improving the recall of facet items.