Download as ppt, pdf, or txt
Download as ppt, pdf, or txt
You are on page 1of 32

Randomized Protocols for

Duplicate Elimination in
Peer-to Peer Storage Systems
Objective
 Eliminating duplicates in unstructured
peer-to-peer storage system using two
randomized protocols
 Addresses availability and scalability
with respect to management of
redundant data.
 We abstract the problem of retaining a
copy of a data item to one of electing
leaders in a distributed system.
Scope of the Project
 In this paper, we present a solution that
addresses availability and scalability
with respect to management of
redundant data. Specifically, we
address the problem of duplicate
elimination in the context of systems
connected over an unstructured peer-
to-peer network in which there is no a
priori binding between an object and its
location.
 We propose two randomized protocols
to solve this problem in a scalable and
decentralized fashion that does not
compromise the availability
requirements of the application.
Performance results using both large-
scale simulations and a prototype built
on PlanetLab demonstrate that our
protocols provide high probabilistic
guarantees while incurring minimal
administrative overheads.
Existing System
 Peers contribute data and storage
and, in return, gain access to data at
other peers.
 Effective storage management is an
important issue in the deployment of
such systems.
 Data replication and caching are key
enabling techniques for scalability,
performance, and availability.
 In this context, an important problem
relates to pruning unwanted copies of
data efficiently and safely.
 Attempts at aggressive replication may
lead to significant overheads associated
with thrashing in resource constrained
environments.
 The network as a whole must provide
mechanisms for eliminating replicas that
are not accessed, while leaving a
minimum number of replicas in the
network to satisfy availability constraints.
DisAdvantages
 No of Duplicates in the network is high
 Administrative Overhead to keep.
 Limit our System Storage Space
Proposed System
 We propose two randomized protocols
to solve this problem in a scalable and
decentralized fashion that does not
compromise the availability
requirements of the application.
They are
 Probabilistic Quorum and
 Randomized Election.
Advantages
 Less Replicas we retain only minimal
copy
 Solve Administrative Overhead
 Average storage gain per node
Architectural Diagram

REQUEST SEND REQUEST RECEIVE

LOCAL SEARCH LOCAL SEARCH

DELETE FILE DELETE FILE

RECEIVE ACK SEND ACK

DELETE FILE DELETE FILE

CONTENDER MEDIATOR
Modules Description
We have three major modules in this project
they are
1. Node Selection
i. Contender
ii. Mediator
2. Message Transfer
i. File Information’s
ii. Acknowledgement Basics
3. File Search
i. Local Search
ii. Remote Search
1. NODE SELECTION

 In this module we are selecting the


status of the node, as whether the
node is a contender or mediator.
This module having two sub
modules they are.
i. Contender and
ii. Mediator
i. Contender

 The Contender is the leader for the


file, its work is sending
REQUEST_TO_KEEP to its mediators
with the SenderID and File Content
and proceed next round whether it
receives ACK otherwise it deletes the
file and terminate from the
searching…
ii. Mediator

 The Mediator is the receiver it


receives REQUEST_TO_KEEP from
the contenders. The work of the
mediator is it should wait for a
prescribed time period and processes
all the messages received. At the end
of the time period it check for the
same requests and sends the
acknowledgement to the
corresponding contenders.
2. MESSAGE TRANSFER
 In this module we are sending file
contents as message to its
mediators. The message contains
senderID and file content. The
message transfer module having two
sub modules they are
i. File Information’s
ii. Acknowledgement Basics
i. File Information’s
 In this module we are selecting a
replicated file and sends
REQUEST_TO_KEEP to the
mediator with the details of sender ID
(Source address) and the File content
hash.
ii. Acknowledgement Basics
 In this module the mediator node sends
acknowledgment to the appropriate
mediators. The acknowledgement is
sends based on the request for a file. If
two or more nodes sends requests for a
same file than the mediator sets NACK
and sends it to the contenders. If it
receives only single request for a file than
it sets ACK and sends it to the
appropriate contenders.
 When the mediator sends ACK to the
contender than the corresponding file will
be delete from the mediator else it retain
the file and it act as a leader for that file.
When the contender receives NACK than
it will delete the file and quite from the
round, fatherly it will not act as a leader
for the corresponding file. If it receives
ACK than it will retain the file and search
further to its quorums.
3. FILE SEARCH

 In this module we are going to


search the file in our local and
remote systems. This module having
two sub modules they are
i. Local Search
ii. Remote Search
i. Local Search
 This module search within the local
system (contender). The contender first
select a file and search that file in the
local system as whether it has more than
one copy. If it has more than one copy
than the searched copies are deleted
from the local system and sends request
to the remote system. That selected file
will be deleted based on the receiving
acknowledgement.
ii. Remote Search
 This module search within the remote
system (mediator) after receiving the
request from the contender. The
mediator also select the appropriate file
and search that file in the its local
system as whether it has more than one
copy. If it has more than one copy than
the searched copies are deleted from the
local system. That selected file will be
deleted based on the acknowledgement
sent by the mediator.
Test Case
 Given Input
– User give the input file to request
– User select the type of operation as
Centralized
– User select the type of operation
Decentralized
– User select the node type as contender
– User select the node type as mediator
– User select the receivers IP address
– User sends the Request file
– System automatically search for that file in
our local system
– System automatically search for that file in
the remote system
 Expected Output
– User get the Acknowledgement for that file
– User get the acknowledgement from the local
system as the file is send to the remote system
– User get the acknowledgement from the
remote system as the file is received
– User get the Searching status from our local
system. If it found then it will automatically
delete the particular file.
– User get the Searching status in his remote
system for its local search. If same file found
there that file will be automatically delete
from that node.
– User get the final acknowledgments
– User get positive acknowledgement if there is
only one request for that file
– User get Negative acknowledgement if there
is more than one request for that file.
Algorithms
We are following five algorithms to
implement our system they are.

 In the first phase of the protocol, the


number of contenders in round j + 1 is
approximately half the number of
contenders in round j.
 The probability of having at least one
contender at the end of the first phase.
3. There is exactly one contender
remaining at the end of the protocol.
4. The number of rounds in the protocol
is O(log n).
5. The total number of messages in the
system is O(n).
Software Requirements
 Hardware Interface
Hard disk : 40 GB
RAM : 512 MB
Processor : Pentium IV

 Software Interface
JDK 1.5
Swing Builder
SQL Server
Conclusion
 This paper addresses the problem of
duplicate elimination in storage
systems in the context of unstructured
peer-to-peer networks in which there
is no a priori binding between an
object and its location. We abstract
the problem of retaining a copy of a
data item to one of electing leaders in
a distributed system.
 The experimental results show that
RE performs better than PQ when the
number of duplicates in the network is
high and the content is similar among
the nodes. When the number of
different objects in the network is high
(nodes have unique objects), PQ
performs better than RE.
Future Enhancement
 In the future Enhancement we are
combining these two randomized protocol in
Unstructured Peer-Peer Networks.

The experimental results show that

 RE performs better than PQ when the


number of duplicates in the network is high
and the content is similar among the nodes.
 When the number of different duplicates in
the network is high in a same object PQ
performs better than RE
 So we are combine these two
protocols and deletes the duplicate
like same duplicates in different
system (RE) and different duplicates
in same system (PQ) in a distributed
network.
 We are keeping data in single system
using centralized approach and keep
various system in distributed approch.

You might also like