Download as doc, pdf, or txt
Download as doc, pdf, or txt
You are on page 1of 14

Software Requirements Specification

for

<Truth Discovery with multiple Conflicting Information Providers on web>


Prepared by

A.K.SAINATH(O8BQ1A0504) M.SUNDEEP(08BQ1A0550) CH.ANIL KUMAR(08BQ1A0513) <VASIREDDY VENKATADRI INSTITUTE OF TECHNOLOGY> <Date created>

Page ii

Table of Contents
1. INTODUCTION........................................................................................................................ 1 2. OVERALL DESCRIPTION.................................................................................................... 2 3.SYSTEM FEATURES...............................................................................................................5 4. EXTERNAL INTERFACE REQUIREMENTS.................................................................... 7 5. OTHER NON FUNCTIONAL REQUIREMENTS...............................................................8 6. OTHER REQUIREMENTS.................................................................................................... 9

Page 1

1.
1.1

INTODUCTION
PURPOSE

In this paper, we propose a new problem called the Veracity problem, which is formulated as follows: Given a large amount of conflicting information about many objects, which is provided by multiple websites (or other types of information providers), how can we discover the true fact about each object? We use the word fact to represent something that is claimed as a fact by some website, and such a fact can be either true or false. In this paper, we only study the facts that are either properties of objects (e.g., weights of laptop computers) or relationships between two objects (e.g., authors of books). We also require that the facts can be parsed from the web pages. There are often conflicting facts on the Web, such as different sets of authors for a book. There are also many websites, some of which are more trustworthy than others.2 A fact is likely to be true if it is provided by trustworthy websites (especially if by many of them).
1.2 PROJECT SCOPE
EXISTING SYSTEM

Page Rank and Authority-Hub analysis is to utilize the hyperlinks to find pages with high authorities. These two approaches identifying important web pages that users are interested in, Unfortunately, the popularity of web pages does not necessarily lead to accuracy of information

Finally, we propose an algorithm called TRUTHFINDER for identifying true facts using iterative methods.

DISADVANTAGES

The popularity of web pages does not necessarily lead to accuracy of information. Even the most popular website may contain many errors.

PROPOSED SYSTEM

We formulate the Veracity problem about how to discover true facts from conflicting information.

Page 2

Second, we propose a framework to solve this problem, by defining the trustworthiness of websites, confidence of facts, and influences between facts.

ADVANTAGES

Our experiments show that TRUTHFINDER achieves very high accuracy in discovering true facts. It can select better trustworthy websites than authority-based search engines such as Google.

1.3

REFERENCES

[1] B. Amento,L.G. Terveen, and W.C. Hill, Does Authority Mean Quality? Predicting Expert Quality Ratings of Web Documents, Proc. ACM SIGIR 00, July 2000. [2] M. Blaze, J. Feigenbaum, and J. Lacy, Decentralized Trust Management, Proc. IEEE Symp. Security and Privacy (ISSP 96), May 1996. 3] A. Borodin, G.O. Roberts, J.S. Rosenthal, and P. Tsaparas, Link Analysis Ranking: Algorithms, Theory, and Experiments, ACM Trans. Internet Technology, vol. 5, no. 1, pp. 231-297, 2005.

2. OVERALL DESCRIPTION
2.1 PRODUCT PERSPECTIVE

THE World Wide Web has become a necessary part of our lives and might have become the most important information source for most people. Every day, people retrieve all kinds of information from the Web. For example, when shopping online, people find product specifications from websites like Amazon.com or ShopZilla.com. When looking for interesting DVDs, they get information and read movie reviews on websites such as NetFlix.com or IMDB.com. When they want to know the answer to a certain question, they go to Ask.com or Google.com. Is the World Wide Web always trustable? Unfortunately, the answer is no. There is no guarantee for the correctness of information on the Web. Even worse, different websites often provide conflicting information, as shown in the following examples. Example (Height of Mount Everest). Suppose a user is interested in how high Mount Everest is and queries Ask.com with What is the height of Mount Everest? Among the top 20 results, 1 he or she will find the following facts: four websites (including Ask.com itself) say 29,035

Page 3

feet, five websites say 29,028 feet, one says29, 002 feet, and another one says 29,017 feet. Which answer should the user trust?
2.2 PRODUCT FEATURES

First, we formulate the Veracity problem about how to discover true facts from conflicting information. Second, we propose a framework to solve this problem, by defining the trustworthiness of websites, confidence of facts, and influences between facts. Finally, we propose an algorithm called TRUTHFINDER for identifying true facts using iterative methods. Our experiments show that TRUTHFINDER achieves very high accuracy in discovering true facts, and it can select better trustworthy websites than authority-based search engines such as Google.

Page 4

2.3 USER CLASSES AND CHARACTERISTICS


The UML class diagram also referred to as object modeling, is the main static analysis diagram. Object modeling is the process by which the logical objects in the problem space are represented by the actual objects in the program. These diagrams show a set of classes, interfaces, and collaborations and their relationships. These diagrams address the static design view of a system. This view primarily supports the functional requirements of a system the services the system should provide to its end users. A class diagram is a collection of static modeling elements, such as classes and their relationships, connected as a graph to each other and to their content

Page 5

2.4 OPERATING ENVIRONMENT

The product will be operating in windows environment. Also it will be compatible with any web browser. The only requirement to use this system would be the internet connection.We also need to create a 3d environment,where the user can interact with different objects.Similar to other web applications, the platform required for the is similar to that of a normal web application. This would consist of a client and a server. An example would be as of below. (i.e. an apache web server and a My SQL database using ).

2.5 DESIGN AND IMPLEMENTATION CONSTRAINTS

The design constraints and implementation constraints are as follows : Users selection of web sites is limited We need a separate search engine

2.6ASSUMPTIONS AND DEPENDENCIES

The product needs following third party product. SQL server to store the database. Python to develop the Product

3.SYSTEM FEATURES
3.1 FUNCTIONAL REQUIREMENS

Input Design Requirements:

Input design is a part of overall system design. The main objective during the input design is as follows

Page 6

To achieve the highest possible level of accuracy. To ensure that the input is acceptable and understood by the user

Inputs: Query to the web Outputs: Data with no ambiguity Modules: Login module Validation module Query processing and data retrieval from the database Truth finder algorithm implementation module Send results. GUI module

3.2 INTERFACE REQUIREMENTS


Field accepts numeric data entry Field only accepts dates before the current date Screen can print on-screen data to the printer

3.3 BUSINESS REQUIREMENTS


Data must be entered before a request can approved Clicking the Approve Button moves the request to the Approval Workflow All personnel using the system will be trained according to internal training strategies

3.4 REGULATOTORY/COMPLIANCE REQUIREMENTS


The database will have a functional audit trail The system will limit access to authorized users The spreadsheet can secure data with electronic signatures

Page 7

3.5 SECURITY REQUIREMENTS


Member of the Data Entry group can enter requests but not approve or delete requests Members of the Managers group can enter or approve a request, but not delete requests Members of the Administrators group cannot enter or approve requests, but can delete requests

4. EXTERNAL INTERFACE REQUIREMENTS

4.1 USER INTERFACES

The user needs a visual display unit to interact lively with the system and he also needs a keyboard to physically interact with the system .

4.2 HARDWARE INTERFACES


4.2.1 SERVER SIDE

Operating System: Windows xp Processor: Pentium 3.0 GHz or higher RAM: 256 Mb or more Hard Drive: 10 GB or more

4.2.2 CLIENT SIDE

Operating System: Windows xp or above, MAC or UNIX. Processor: Pentium III or 2.0 GHz or higher. RAM: 256 Mb or more

4.3 SOFTWARE INTERFACES Database: SQL Server. Application: python scripts and java applet,web browser

Page 8

Web Server:apachee is a powerful Web server that provides a highly reliable, manageable, and scalable Web application infrastructure)

4.4 COMMUNICATION INTERFACES The Customer must connect to the Internet to access the Website: Dialup Modem of 52 kbps Broadband Internet Dialup or Broadband Connection with a Internet Provider.

5. OTHER NON FUNCTIONAL REQUIREMENTS

5.1 PERFORMANCE REQUIREMENTS

we make three major distributions in this paper. First, we formulate the Veracity problem about how to discover true facts from conflicting information. Second, we propose a framework to solve this problem, by defining the trustworthiness of websites, confidence of facts, and influences between facts. Finally, we propose an algorithm called TRUTHFINDER for identifying true facts using iterative methods. 5.2 SAFETY REQUIREMENTS The database may get crashed at any certain time due to virus or operating system failure. Therefore, it is required to take the database backup. 5.3 SECURITY REQUIREMENTS Prevention of obtaining password hashes or plaintext passwords from sniffing the network. The 3d password scheme should be able to avoid bruteforce attack and shoulder surfing attacks.

Page 9

6. OTHER REQUIREMENTS
6.1 CLASS DIAGRAMS Class diagrams commonly contain the following things:

Classes Interfaces Collaborations

6.2 USECASE DIAGRAM An actor is anything that interacts with a use case. It could be a human user, external hardware, or another system. An actor represents a category rather than a physical user. A use case diagram is a graph of actors, a set of use cases enclosed by a system boundaries.

Page 10

6.3 SEQUENCE DIAGRAM

Page 11

6.4 COLLABORATION DIAGRAM

Page 12

You might also like