SCRUTINIZING AND DEDUPLICATION OF

DATA IN CLOUD
A project report submitted in partial fulfilment of the requirement
for the award of degree of

BACHELOR OF TECHNOLOGY
In
INFORMATION TECHNOLOGY
Submitted by

L. HANUMAN SAI
(16341A1227)

Under the esteemed guidance of

Mr. V.S.K.Chaitanya
Assistant Professor, Dept. of IT

GMR Institute of Technology


An Autonomous Institute Affiliated to JNTUK, Kakinada
(Accredited by NBA, NAAC with ‘A’ Grade & ISO 9001:2008 Certified Institution)

GMR Nagar, Rajam – 532 127,


Andhra Pradesh, India
JULY 2020
DEPARTMENT OF INFORMATION TECHNOLOGY

BONAFIDE CERTIFICATE

This is to certify that the report entitled “SCRUTINIZING AND DEDUPLICATION OF DATA IN CLOUD”
is the bonafide work submitted by L. HANUMAN SAI (16341A1227), who completed the internship
program under our guidance and supervision at RISE CORP PVT. LTD.
The results embodied in this report have not been submitted to any other university or
institution for the award of any degree or diploma.

Signature of the Faculty Supervisor
Mr. V.S.K. Chaitanya
Assistant Professor, Department of IT

Signature of Industry Supervisor
Mr. U. Naidu
Project Lead, RISE Corp Pvt. Ltd.

Signature of Head of the Department
Dr. Ajit Kumar Rout
Professor & HoD, Department of IT

INTERNSHIP CERTIFICATE

ACKNOWLEDGEMENT
I would like to sincerely thank my internal supervisor, Mr. V.S.K. Chaitanya, Assistant Professor,
Department of Information Technology, for his wholehearted and valuable guidance throughout the program.

I would like to sincerely thank Mr. Sateesh Vavilapalli, Managing Director India, and Mr. U.
Naidu, Project Lead of RISE Corp Pvt. Ltd., for providing all the necessary facilities that led to the
successful completion of my internship.

It gives me immense pleasure to express my deep sense of gratitude to Dr. Surya Narayan Dash,
HoD & Professor, Department of Chemical Engineering, of the Central Internship team, and to our
department coordinator, Mrs. P. Akhila, Assistant Professor, Department of Information Technology,
for their great support.

I would like to take this opportunity to thank our beloved Principal, Dr. C.L.V.R.S.V. Prasad,
our beloved Vice Principal, Dr. J. Raja Murugudoss, and the Head of the Department, Dr. Ajit Kumar Rout,
Professor, Department of Information Technology, for their great support in completing the
full-semester internship.

I would like to thank all the faculty members and the non-teaching staff of the Department of
Information Technology for their direct and indirect support in the completion of this project work.

Finally, I would like to thank all of my friends and family members for their continuous help
and encouragement.

L. Hanuman Sai (16341A1227)

ABSTRACT

Outsourcing data to cloud storage has become an appealing trend because it spares users the heavy
effort of data maintenance and management. However, outsourced cloud storage is not completely
trustworthy, which raises the question of how to deduplicate data in the cloud while still
scrutinizing (auditing) its integrity. Although data deduplication brings many advantages, security
and privacy concerns arise because users' confidential data are exposed to both insider and
outsider attacks. Convergent encryption enforces data privacy while keeping deduplication feasible.
Traditional deduplication systems based on convergent encryption offer confidentiality but do not
support a duplicate check based on differential privileges. This work aims to achieve both data
integrity and deduplication, and protects data security by taking the differential privileges of
users into account in the duplicate check. In such a deduplication system, the privileges of a
client are considered in the duplicate check in addition to the data itself. For greater security,
files are encrypted with differential privilege keys. Clients are permitted to perform the duplicate
check only for files marked with privileges matching their own. After deduplication, a client can
verify that its file is still held in the cloud with the help of a third-party scrutinizer. The
scrutinizer audits the data and verifies the uploaded file on a schedule. As a result, the system
benefits both the storage provider, through deduplication, and the user, through scrutinizing.

Keywords: Cloud Server, Client Data Security, Integrity, Deduplication and Scrutinizing.

TABLE OF CONTENTS

Bonafide Certificate
Internship Certificate
Acknowledgement
Abstract
List of Figures
List of Abbreviation

1 Introduction
    1.0 Introduction
    1.1 Benefits of Internship
        1.1.1 Benefits to the Students
        1.1.2 Benefits to the Industry
        1.1.3 Benefits to the Institution
    1.2 Ethics
    1.3 Values

2 Profile of the Company
    2.0 About the Company
    2.1 Services
    2.2 Team

3 Tasks Taken Up and Problem Definition
    3.0 Introduction
    3.1 Problem Statement
    3.2 Domain Analysis
        3.2.1 What is Cloud Computing
        3.2.2 How Cloud Computing Works
        3.2.3 Cloud Computing Architecture
        3.2.4 Cloud Deployment Models
        3.2.5 Cloud Service Models
        3.2.6 Features of Cloud
        3.2.7 Benefits of Cloud
    3.3 Existing System
        3.3.1 Disadvantages
    3.4 Proposed System
        3.4.1 Advantages
        3.4.2 Scope
        3.4.3 Need for Proposed System
    3.5 System Specifications
        3.5.1 Functional Requirements
        3.5.2 Non Functional Requirements
        3.5.3 System Requirements
    3.6 Technologies Used

4 Methodology and Learning
    4.0 Feasibility Study
    4.1 Existing System
        4.1.1 Introduction
        4.1.2 Algorithm Used
        4.1.3 Steps in Algorithm
    4.2 Proposed System
        4.2.1 Introduction
        4.2.2 Algorithm Used
        4.2.3 Key Generation
        4.2.4 Blowfish Encryption
        4.2.5 Blowfish Decryption
        4.2.6 Data Encryption Process
        4.2.7 Data Decryption Process

5 System Design
    5.0 System Attribute Entity Model
        5.0.1 System Architecture
    5.1 Module Implementation
        5.1.1 Working Process
        5.1.2 System Protocols
        5.1.3 System Objectives
    5.2 Data Flow Diagram
    5.3 UML Diagrams

6 Coding
    6.0 Sample Source Code

7 Results
    7.0 Output Screenshots
    7.1 Observations

8 Conclusion and Suggestions
    8.0 Conclusion
    8.1 Future Scope

Reference
Copy Right
LIST OF FIGURES

Fig. No.    Name of the Figure

2.1     RISE Pvt. Ltd. Logo
3.1     Various Cloud Providers
3.2     Cloud Computing Architecture
3.3     Cloud Deployment Models
3.4     Cloud Service Models
3.5     Advantages of Cloud
3.6     Benefits of Cloud
3.7     Java Programming Language
3.8     Java Platform
3.9     Structure of JDBC
3.10    SQL Server Architecture
3.11    TCP and IP Stack
3.12    Network Address
3.13    NetBeans Working Architecture
4.1     Convergent Encryption
4.2     Diffie-Hellman Algorithm Example
4.3     Deduplication
4.4     F Function Working in Key Generation
4.5     Blowfish’s Encryption Process
4.6     Blowfish’s Decryption Process
4.7     Data Owner Side (Encryption)
4.8     Data User Side (Decryption)
5.1     Attribute Entity Model
5.2     Proposed System Architecture
5.3     DFD for Data Owner
5.4     DFD for Data User
5.5     DFD for Cloud Server
5.6     Usecase Diagram
5.7     Class Diagram
5.8     Sequence Diagram
5.9     Collaboration Diagram
5.10    Activity Diagram
5.11    Component Diagram
7.1     Wolke’s Home Page
7.2     Wolke’s Client Login Page
7.3     Wolke’s Registration Page
7.4     Client Registered with Wolke
7.5     Wolke’s Server Login Page
7.6     Wolke’s Server Home Page
7.7     Activate User’s Page
7.8     IAM Key Sent to E-Mail
7.9     Client Home Page
7.10    Client Uploading Files
7.11    Wolke’s Scrutinizer Login Page
7.12    Upload File Request
7.13    Decryption Key Request
7.14    Decryption Key Sent to E-Mail
7.15    File Decryption Process
7.16    File Download
7.17    List of Files Stored in the Cloud
7.18    Secure Deduplication of Files
7.19    Initial vs Subsequent Data Owner
7.20    Generation vs Verification
7.21    Challenges vs Proof vs Verification
7.22    Cloud Storage vs Scrutinizer
LIST OF ABBREVIATION

SecCloud    Security Cloud
EnCloud     Encrypted Cloud
ABE         Attribute-Based Encryption
LDSS        Light Weight Data Sharing Scheme
CP-ABE      Ciphertext-Policy Attribute-Based Encryption
SCRUTINIZING AND DEDUPLICATION OF
2019-20
DATA IN CLOUD

1. INTRODUCTION

1.0 Introduction

An internship is a trained and supervised experience in a professional setting in which a
student learns and gains essential experience and expertise. An internship is meant to
introduce candidates, either full-time or part-time, to real-world experience related to
their career goals and interests. It is an excellent way to build the all-important
connections that are invaluable in developing and maintaining a strong professional
network for the future. An internship is relatively short-term in nature, with the primary focus
on getting some on-the-job training and on taking what is learned in the classroom and
applying it to the real world.

1.1 Benefits of Internship

Students learn how their course of study applies to the real world and build valuable
experience that makes them stronger candidates for jobs after graduation.
• An internship at a start-up improves team spirit and helps in adapting to flexible
working times and client services.
• You can get serious work experience, build a portfolio and establish a network of
professional contacts that can help you after you graduate.
• The main advantage is practical knowledge. College provides theoretical knowledge,
which on its own does not help much; working on a project gives practical experience.
• Confidence increases when we are involved in solving problems and succeed in
solving them.
• If you are willing to show initiative and enthusiasm and to work hard, you will be given
further opportunities to develop.
• Working on a project also improves communication and interpersonal skills, because
we need to talk to higher authorities about the project. Having several
internships while in college can be very impressive to potential employers.
• Working in a team on a project teaches us how to interact with our colleagues and
how to deal with them without hurting the feelings of either side.
1.1.1 Benefits to the Students

• Learning by doing.
• All round development.
• Aid in career planning.
• Experience of professional working conditions.
• Smooth transition from campus to company.

1.1.2 Benefits to the Industry

• Steady stream of skilled manpower provides value addition and increased productivity.
• Human Resource Development benefits.
• Conduit for Industrial Partnership.
• Employer Branding.

1.1.3 Benefits to the Institution

• Inputs to quickly adapt curriculum to match the needs of industry.
• Opportunities for research and consultancy.
• Access to industrial expertise and infrastructure.

1.2 Ethics

• Help develop an organizational environment favorable to acting ethically.
• Improve their understanding of the software and related documents on which they
work and of the environment in which they will be used.
• Accept full responsibility for their own work.
• Improve their ability to produce accurate, informative, and well-written documentation.
• Not promote their own interest at the expense of the profession, client or employer.
• Assist colleagues in professional development.
• Strive to fully understand the specifications for software on which they work.
• Improve their knowledge of the Code, its interpretation, and its application to their
work.

1.3 Values

• Professional communications.
• Be proactive and, when invited to work functions, introduce oneself to people.
• Take constructive criticism well.
• Being able to work independently with little guidance is very important in the
working world.
• Always work hard, even if the task is small and seems unimportant.
• Make an effort during the course of the internship to build relationships with people
around the office.

2. PROFILE OF THE COMPANY

2.0 About the Company

Fig-2.1: Resource Integration Solution Enterprise Pvt. Ltd. Logo

Resource Integration Solutions Enterprise Inc. (RISE) is a Dallas, Texas based
company with a global focus on On Demand Applications, Cloud Computing, Data
Management and Services, providing solutions for an integrated enterprise.
RISE leverages its “Platform-as-a-Service” all-in-one solution to help companies
that deploy the entirety of their business processes to the cloud. It also provides
“Head in the cloud, feet on the ground” companies with a quick and cost-effective
integrated solution. We provide insight and value to our customers by leveraging
knowledge, proven methodologies, global talent, innovation and continued focus on
business process optimization as well as non-disruptive technology. This ensures that
we deliver complete solutions that help you to build customer loyalty through
increased levels of service and improved quality of outputs.
RISE specializes in next-gen and immersive technologies with optimized
visualization capabilities to transform, scale and drive business performance for a
seamless enterprise, while keeping laser-focus on “user experience!” and
“Gamification!” Our applications are powered by proven components that integrate
well with enterprise software platforms while empowering users to accelerate their
performance and agility.
RISE has a network of offices and development centers across the US and India.

2.1 Services

The services offered are:

• Business Data Management
• Application Development & Management
• Data Services
• Mobile Applications
• Cloud Services
• Web Portals
• Industry Applications
• SAP on Demand
• Business Intelligence
• Remote Service

2.2 Team

1. Chief Executive Officer : Suresh Ketha
2. Director : Roni Bumpas
3. Managing Director, India : Sateesh Vavilapalli
4. Head of Development : Kalyan Raju
5. Project Lead : U. Naidu


3. TASKS TAKEN UP AND PROBLEM DEFINITION

3.0 Introduction

In recent years, the rapid growth of digital content has raised the demand for new
storage and network capacity, together with an increasing need for more cost-effective
use of storage and of network bandwidth for data transfer. Consequently, remote storage
systems, namely cloud storage based services, are attracting growing interest because they
provide cost-efficient architectures. These architectures support the transmission, the
storage in a multi-tenant environment, and the intensive computation of outsourced data in
a pay-per-use model. To save resources in both network bandwidth and storage capacity, many
cloud services apply client-side deduplication. This approach avoids storing redundant data
on cloud servers and reduces the network bandwidth consumed by transmitting the same
content several times.
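
The client-side deduplication idea above can be illustrated with a minimal Java sketch. It is not the project's implementation: the class `DedupStoreSketch` and its methods are hypothetical names, the cloud server is replaced by an in-memory map, and SHA-256 is assumed as the fingerprint function. The client first offers only the fingerprint, and the file body is transferred and stored only when the server holds no copy yet.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for a cloud store that deduplicates by content hash. */
class DedupStoreSketch {
    private final Map<String, byte[]> blobs = new HashMap<>(); // fingerprint -> single stored copy
    private long bytesReceived = 0;                            // bandwidth actually consumed

    /** SHA-256 fingerprint of the content, hex-encoded (assumed fingerprint function). */
    static String fingerprint(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    /** Duplicate check: the client sends only the fingerprint. */
    boolean hasCopy(String fp) {
        return blobs.containsKey(fp);
    }

    /** Transfers the body only if the duplicate check fails; returns true if data was sent. */
    boolean upload(byte[] data) {
        String fp = fingerprint(data);
        if (hasCopy(fp)) {
            return false;              // duplicate: nothing transferred, nothing stored
        }
        blobs.put(fp, data.clone());
        bytesReceived += data.length;
        return true;
    }

    int storedCopies() { return blobs.size(); }

    long bytesReceived() { return bytesReceived; }
}
```

Uploading the same content twice through `upload` leaves one stored copy and counts the bytes only once, which is the saving in both storage and bandwidth that the paragraph describes.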
Cloud storage service providers perform deduplication to save space by storing only
one copy of each uploaded file. If clients conventionally encrypt their files, however,
these savings are lost. Message-locked encryption (the most prominent instance of which
is convergent encryption) resolves this tension, but it is inherently subject to
brute-force attacks that can recover files falling into a known set.
Customers may nevertheless want their data encrypted, for reasons ranging from personal
privacy to corporate policy to legal regulations. A client could encrypt its file under a
user's key before storing it. But common encryption modes are randomized, making
deduplication impossible because the storage service effectively always sees different
ciphertexts regardless of the data. If a client's encryption is deterministic (so that the
same file always maps to the same ciphertext), deduplication is possible, but only for
that user. Cross-user deduplication, which allows greater storage savings, is not possible
because the encryptions of different clients, being under different keys, are usually
different. Sharing one key across a group of users makes the system brittle in the face of
client compromise.
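
How convergent encryption makes cross-user deduplication possible can be sketched in a few lines of Java. This is an illustration under stated assumptions, not the scheme used later in this report: the class `ConvergentEncryptionSketch` is a hypothetical name, AES-128 in CBC mode stands in for the cipher, and the key and IV derivations (truncated SHA-256 values) are choices made only for this example. The key is derived from the file itself, so two clients holding the same file produce the same ciphertext without sharing any secret.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical sketch of convergent (message-locked) encryption. */
class ConvergentEncryptionSketch {

    static byte[] sha256(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Convergent key: first 16 bytes of SHA-256(file), used as an AES-128 key (assumption). */
    static byte[] deriveKey(byte[] file) {
        return Arrays.copyOf(sha256(file), 16);
    }

    private static byte[] run(int mode, byte[] key, byte[] input) {
        try {
            // The IV is derived from the key so that encryption is deterministic.
            // A fixed IV is tolerable here only because each key encrypts exactly one file.
            byte[] iv = Arrays.copyOf(sha256(key), 16);
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Same file in, same ciphertext out, whichever client performs the encryption. */
    static byte[] encrypt(byte[] file) {
        return run(Cipher.ENCRYPT_MODE, deriveKey(file), file);
    }

    /** Any owner of the file can re-derive the key and decrypt the shared ciphertext. */
    static byte[] decrypt(byte[] key, byte[] ciphertext) {
        return run(Cipher.DECRYPT_MODE, key, ciphertext);
    }
}
```

Because `encrypt` is a deterministic function of the file alone, the storage service can deduplicate ciphertexts across users. The same property is what exposes the scheme to the brute-force attack mentioned above: anyone who can guess the file can recompute its ciphertext.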


3.1 Problem Statement

In this project, we investigate the problem of integrity auditing and secure
deduplication on cloud data. Specifically, aiming to achieve both data integrity and
deduplication in the cloud, we propose two secure systems, namely SecCloud and EnCloud.
SecCloud introduces a scrutinizing entity that maintains a cloud, which helps
clients generate data tags before uploading and audits the integrity of data
already stored in the cloud. Deduplication reduces the storage space required
by keeping only one copy of each identical file. Note that, regarding secure deduplication,
our objective differs from previous work in that we propose a method that allows
deduplication over both files and tags.

3.2 Domain Analysis


3.2.1 What is cloud computing
Cloud computing is the use of computing resources (hardware and software) that
are delivered as a service over a network (typically the Internet). The name comes from the
common use of a cloud-shaped symbol as an abstraction for the complex infrastructure it
contains in system diagrams. Cloud computing entrusts remote services with a user's data,
software and computation. Cloud computing consists of hardware and software resources
made available on the Internet as managed third-party services. These services typically
provide access to advanced software applications and high-end networks of server
computers.

Fig-3.1: Various Cloud Providers


3.2.2 How Cloud Computing Works


The goal of cloud computing is to apply traditional supercomputing, or high-
performance computing power, normally used by military and research facilities to
perform tens of trillions of computations per second, in consumer-oriented applications
such as financial portfolios, to deliver personalized information, to provide data storage or
to power large, immersive computer games.
Cloud computing uses networks of large groups of servers, typically running
low-cost consumer PC technology, with specialized connections to spread data-processing
chores across them. This shared IT infrastructure contains large pools of systems that are
linked together. Often, virtualization techniques are used to maximize the power of cloud
computing.
3.2.3 Cloud Computing Architecture
Cloud Computing architecture refers to the various components and sub-
components of cloud that constitute the structure of the system.
Cloud computing architecture consists of:
a) Front-End Cloud.
b) Back-End platforms, such as servers and storage.
c) Cloud-based delivery.
d) A network (internet, intranet).

Fig-3.2: Cloud Computing Architecture


3.2.3.1 Front End Cloud:


Front-end is the side that is visible to the client, customer, or user. Front-end
pieces include the user interface and the client’s computer system or network that is used
for accessing the cloud system. You have probably noticed that different cloud computing
systems use different user interfaces—for example, not only can you choose from a variety
of web browsers (including Chrome, Safari, Firefox, etc.), but the Google Docs user
interface is different than that of Salesforce.
3.2.3.2 Back End Cloud:
On the other hand, the back-end pieces are on the side used by the service
provider. These include various servers, computers, data storage systems, virtual machines,
and programs that together constitute the cloud of computing services. The back-end side
also is responsible for providing security mechanisms, traffic control, and protocols that
connect networked computers for communication.

3.2.3.3 Cloud-Based Delivery:


As we’ve discussed above, cloud computing services are everywhere these days.
For example, if your company uses Salesforce or you use Google Drive or Office 365 at
home or work, you’re a cloud computing user. These are all examples of subscriptions a
company or individual can purchase that enable them to use the software, typically known
as Software-as-a-Service, or SaaS.
Because of technology like virtualization and hypervisors, it’s possible for many
virtual servers to exist on a single physical server. These technologies power other cloud
subscriptions like Platform-as-a-Service (PaaS), Infrastructure-as-a-Service (IaaS), and more.


3.2.3.4 Cloud Services Network:


Cloud services can be delivered publicly or privately using the internet and can
also remain within a company’s network when delivered over an intranet. Sometimes,
organizations make use of a combination of both. No matter where the actual “cloud” is,
whether in a company’s own data center or a service provider’s data center, cloud computing
uses networking to enable convenient, on-demand access to a shared pool of computing
resources like networks, storage, servers, services, and applications. By using virtualization,
these assets can be provisioned and released quickly and efficiently as necessary.

3.2.4 Cloud Deployment Models


Cloud computing deployment models are based on location. To know which deployment
model would best suit your organization's requirements, it is necessary to
know the four deployment types.

Fig-3.3: Cloud Deployment Models

3.2.4.1 Public Cloud


It is a type of hosting in which cloud services are delivered over a network for public use.
• Customers do not have any control over the location of the infrastructure.
• The cost is shared by all users, and services are either free or offered under a
license policy such as pay per user.
• Public clouds are great for organizations that need to manage the host
application and the various applications that users use.
3.2.4.2 Private Cloud
It is a cloud infrastructure that is solely used by one organization.
• It gives organizations greater control over security and data, which is
safeguarded by a firewall and managed internally.
• It can be hosted internally or externally.
• Private clouds are great for organizations that have high security demands,
high management demands and uptime requirements.
3.2.4.3 Hybrid Cloud
It uses both private and public clouds, which can remain separate entities.
• Resources are managed and can be provided either internally or by
external providers.
• A hybrid cloud is great for scalability, flexibility and security.
• For example, an organization can use a public cloud to interact with
customers while keeping their data secured in a private cloud.
3.2.4.4 Community Cloud
It is an infrastructure that is mutually shared between organizations that belong to a
particular community.
• The community members generally share similar privacy, performance
and security concerns.
• Examples are a community cloud shared by banks, by government bodies in a
country, or by trading firms.
• A community cloud can be managed and hosted internally or by a third-
party provider.
• A community cloud is good for organizations that work on joint ventures
and need a centralized cloud computing capability for managing, building and
executing their projects.


3.2.5 Cloud Service Models

Fig-3.4: Cloud Service Models

Cloud service models focus on providing some type of offering to their clients.
3.2.5.1 Cloud Software as a Service
• It is a type of cloud that offers an application to customers or organizations through a
web browser.
• The data for the app runs on a server on the network, not through an app on the
user’s computer.
• Software is usually sold via subscription.
• Examples of SaaS are Salesforce, Google Docs, Office 365, Basecamp etc.

3.2.5.2 Cloud Infrastructure as a Service

• It provides the hardware and usually a virtualized OS to its customers.
• Customers are charged only for the computing power that is utilized, usually the CPU
hours used in a month.
• Examples of IaaS are Amazon EC2, Rackspace, Google Compute Engine etc.

3.2.5.3 Cloud Platform as a Service

• It provides networked computers running in a hosted environment, and also adds
support for the development environment.
• PaaS offerings generally support a specific programming language or development
environment.
• By deploying your app in this environment, you can take advantage of dynamic
scalability and automated database backups without needing to code for them specifically.
• PaaS is billed as an additional cost on top of the IaaS charges.
• Examples of PaaS are Google App Engine, Cloud Foundry, Engine Yard etc.

3.2.6 Features of Cloud


Following are the characteristics of Cloud Computing:

i. Resource Pooling

The cloud provider pools computing resources to provide services to multiple
customers with the help of a multi-tenant model. Different physical and virtual
resources are assigned and reassigned depending on the demand of the customer.
The customer generally has no control over or knowledge of the exact location of
the provided resources, but may be able to specify the location at a higher
level of abstraction.

ii. On-Demand Self-Service

It is one of the important and valuable features of cloud computing, as the user
can continuously monitor the server uptime, capabilities, and allotted network storage.
With this feature, the user can also monitor the computing capabilities.

iii. Easy Maintenance

The servers are easily maintained and downtime is very low; in some cases
there is no downtime at all. Cloud computing services are updated regularly and
improve gradually. The updates are more compatible with devices, perform faster
than older versions, and fix bugs.


iv. Large Network Access

The user can access the data in the cloud or upload data to the cloud from
anywhere with just a device and an internet connection. These capabilities
are available all over the network and are accessed over the internet.

v. Availability

The capabilities of the cloud can be modified as per use and can be
extended considerably. The service analyzes storage usage and allows the user to buy
extra cloud storage, if needed, for a very small amount.

vi. Automatic System

Cloud computing automatically analyzes the data needed and supports a
metering capability at some level of service. Usage can be monitored, controlled, and
reported, which provides transparency for the host as well as the customer.

vii. Economical

It is a one-time investment: the company (host) buys the storage, and small
parts of it can be provided to many companies, which saves the host from
monthly or yearly costs. The only ongoing spending is on basic maintenance
and a few other minor expenses.

viii. Security

Cloud security is one of the best features of cloud computing. The provider creates
snapshots of the stored data so that the data is not lost even if one of the servers
is damaged. The data is kept on storage devices that are protected against
unauthorized access and use by any other person. The storage service is quick and reliable.

ix. Pay as you go

In cloud computing, the user pays only for the service or the space they
have utilized. There are no hidden or extra charges. The service is
economical, and most of the time some space is allotted for free.

x. Measured Service

The usage of cloud computing resources is monitored, and the provider records it.
This resource utilization is analyzed to support charge-per-use
capabilities. This means that resource usage, for example by virtual server
instances running in the cloud, is monitored, measured and reported
by the service provider. The pay-as-you-go charge varies with the actual
consumption of the organization.

Fig-3.5: Advantages of Cloud


3.2.7 Benefits of cloud computing


i. Achieve economies of scale: increase volume output or productivity with fewer
people. Your cost per unit, project or product plummets.
ii. Reduce spending on technology infrastructure. Maintain easy access to your
information with minimal upfront spending. Pay as you go (weekly, quarterly or
monthly) based on demand.
iii. Globalize your workforce on the cheap. People worldwide can access the cloud,
provided they have an Internet connection.
iv. Streamline processes. Get more work done in less time with fewer people.
v. Reduce capital costs. There’s no need to spend big money on hardware, software or
licensing fees.
vi. Improve accessibility. You have access anytime, anywhere, making your life so
much easier!

Fig-3.6: Benefits of Cloud


3.3 Existing System


A number of deduplication systems have been proposed based on various
deduplication strategies, such as client-side or server-side deduplication and file-level or
block-level deduplication.
A cloud storage architecture with two independent cloud servers has been proposed for
integrity scrutinizing; it reduces the computation load on the client side by using Provable
Data Possession (PDP). PDP, which enables cloud users to verify data integrity without
retrieving the entire file, is highly essential for cloud storage. All the existing PDP schemes
rely on the Public Key Infrastructure and convergent encryption.
Li addressed the key-management issue in block-level deduplication by distributing the
keys across multiple servers after encrypting the files. Bellare et al. showed how to protect
data confidentiality by transforming a predictable message into an unpredictable message.
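
The idea of verifying integrity without retrieving the entire file can be illustrated with a small Java sketch. It deliberately swaps in a Merkle hash tree instead of the PDP schemes cited above, because a hash tree needs only SHA-256; the class `MerkleAuditSketch` and all method names are hypothetical, and at least one block is assumed. The verifier keeps only the root hash, challenges the server for one block, and checks that block against the root using the sibling hashes the server returns.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of a spot-check audit based on a Merkle hash tree (not PDP itself). */
class MerkleAuditSketch {

    static byte[] hash(byte[]... parts) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (byte[] p : parts) {
                md.update(p);
            }
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds every tree level bottom-up: level 0 holds block hashes, the last level the root. */
    static List<byte[][]> buildLevels(byte[][] blocks) {
        List<byte[][]> levels = new ArrayList<>();
        byte[][] level = new byte[blocks.length][];
        for (int i = 0; i < blocks.length; i++) {
            level[i] = hash(blocks[i]);
        }
        levels.add(level);
        while (level.length > 1) {
            byte[][] next = new byte[(level.length + 1) / 2][];
            for (int i = 0; i < next.length; i++) {
                byte[] left = level[2 * i];
                // An unpaired last node is hashed with itself.
                byte[] right = (2 * i + 1 < level.length) ? level[2 * i + 1] : left;
                next[i] = hash(left, right);
            }
            levels.add(next);
            level = next;
        }
        return levels;
    }

    static byte[] root(List<byte[][]> levels) {
        return levels.get(levels.size() - 1)[0];
    }

    /** Server side: sibling hashes on the path from the challenged block to the root. */
    static List<byte[]> prove(List<byte[][]> levels, int index) {
        List<byte[]> path = new ArrayList<>();
        for (int l = 0; l < levels.size() - 1; l++) {
            byte[][] level = levels.get(l);
            int sibling = index ^ 1;
            path.add(sibling < level.length ? level[sibling] : level[index]);
            index /= 2;
        }
        return path;
    }

    /** Verifier side: recomputes the root from one block and its sibling path. */
    static boolean verify(byte[] root, byte[] block, int index, List<byte[]> path) {
        byte[] node = hash(block);
        for (byte[] sibling : path) {
            node = (index % 2 == 0) ? hash(node, sibling) : hash(sibling, node);
            index /= 2;
        }
        return Arrays.equals(node, root);
    }
}
```

A verifier that stores only the 32-byte root can audit any single block with a proof whose size grows with the logarithm of the number of blocks, rather than downloading the whole file.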

3.3.1 Disadvantages
i. The first problem is integrity scrutinizing. The cloud server is able to relieve clients
from the heavy burden of storage management and maintenance. The main
difference between cloud storage and traditional in-house storage is that the data is
transferred via the Internet and stored in an uncertain domain, not under the control of the
clients at all, which inevitably raises clients' concerns about the integrity of their data.

ii. The second problem is client deduplication. The rapid adoption of cloud services is
accompanied by increasing volumes of data stored at remote cloud servers. Among
these remotely stored files, most are duplicates: according to a recent
survey by EMC, 75% of recent digital data consists of duplicated copies.

iii. Data reliability is a critical issue in deduplication storage systems because
there is only one copy of each file stored in the server, shared by all the owners.
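
As a rough worked example of what the figure cited in item ii implies, assume (hypothetically) that a fraction d = 0.75 of the stored bytes S are duplicate copies and that deduplication removes all of them. The symbols S, d and S_dedup are introduced here only for illustration.

```latex
\[
S_{\text{dedup}} = (1 - d)\,S = 0.25\,S,
\qquad
\frac{S}{S_{\text{dedup}}} = \frac{1}{1 - d} = \frac{1}{0.25} = 4 .
\]
```

Under this assumption the store shrinks to a quarter of its original size, a 4:1 deduplication ratio. It also quantifies the reliability concern in item iii: each stored byte then backs, on average, four logical copies, so losing it affects every owner who shares it.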


3.4 Proposed System


i. In this project, aiming at achieving data integrity and deduplication in the cloud, we
propose two secure systems, namely SecCloud and EnCloud.
ii. SecCloud introduces a scrutinizing entity that maintains a cloud and helps clients
generate data tags before uploading, as well as scrutinize the integrity of data
already stored in the cloud. Deduplication reduces the storage space required by
keeping only one copy of identical files.
iii. Besides supporting integrity scrutinizing and secure deduplication, the system
guarantees file confidentiality by using EnCloud.

iv. We propose a method of directly scrutinizing integrity on encrypted data.

3.4.1 Advantages
i. This design fixes the issue in previous work that the computational load on the user or
auditor is too high for tag generation. For fine-grained completeness, the scrutiny
functionality designed in SecCloud is supported at both block level and sector
level. In addition, SecCloud also enables EnCloud.
ii. The challenge of deduplication on encrypted data is the prevention of dictionary attacks.
iii. Our proposed system achieves both integrity auditing and file deduplication.

3.4.2 Scope
Despite these significant advantages in saving resources, client data deduplication
brings many security issues, largely due to the challenges of multi-owner data
possession. For instance, several attacks target either the bandwidth consumption or the
confidentiality and privacy of legitimate cloud users. For example, a user may check
whether another user has already uploaded a file by trying to outsource the same file to
the cloud. Recently, to mitigate these concerns, many schemes have been proposed under
different security models. These schemes are called Proof of Ownership (PoW) systems.
They allow the storage server to check a user's ownership of data based on a short, static
value.

3.4.3 Need for proposed system


Data deduplication is an attractive technology for reducing the storage space
taken up by the vast and increasing amount of duplicated and redundant data. In a cloud
storage system with data deduplication, duplicate copies of data are eliminated and only
one copy is kept in storage. To protect the confidentiality of sensitive data while supporting
deduplication, the convergent encryption technique has been proposed to encrypt the data
before outsourcing. However, the issue of keyword search over encrypted data in a
deduplication storage system has to be addressed for efficient data utilization.

3.5 System Specifications

3.5.1 Functional Requirements

i. Descriptions of data to be entered into the system.

ii. Descriptions of operations performed by each screen.

iii. Descriptions of work-flows performed by the system.

iv. Descriptions of system reports or other outputs.

v. Who can enter the data into the system.

vi. How the system meets applicable regulatory requirements.

3.5.2 Non-functional Requirements

i. Reliability

ii. Usability

iii. Responsiveness

iv. Performance

v. Error handling


3.5.3 System Requirements

3.5.3.1 Hardware Specifications

• System : Intel i3 (or equivalent) and above.

• RAM : 512 MB and higher.

• Hard Disk : 50GB and higher.

3.5.3.2 Software Specifications

• Operating System : Windows XP/7 and above.

• Server : Apache Tomcat Server.

• IDE : Apache NetBeans IDE.

• Front-end : HTML, CSS.

• Back-end : Java and JavaScript.

• Database : MySQL & HeidiSQL.


3.6 Technologies Used

3.6.1 Java Programming Language/Technology


Java technology is both a programming language and a platform.
The Java programming language is a high-level language that can be characterized by
all of the following buzzwords:

1) Simple

2) Architecture neutral

3) Object oriented

4) Portable

5) Distributed

6) High performance

7) Interpreted

8) Multithreaded

9) Robust

10) Dynamic

11) Secure
With most programming languages, you either compile or interpret a program
so that you can run it on your computer. The Java programming language is
unusual in that a program is both compiled and interpreted. With the compiler, you first
translate a program into an intermediate language called Java byte codes, the
platform-independent codes interpreted by the interpreter on the Java platform. The
interpreter parses and runs each Java byte code instruction on the computer. Compilation
happens just once; interpretation occurs each time the program is executed.


Fig-3.7: Java Programming language

Java byte codes can be thought of as the machine code instructions for the Java Virtual
Machine (Java VM). Every Java interpreter, whether it’s a development tool or a Web
browser that can run applets, is an implementation of the Java VM. Java byte codes help
make “write once, run anywhere” possible. A program can be compiled into byte code on
any platform that has a Java compiler.
3.6.2 The Java Platform
A platform is the hardware or software environment in which a program runs. Some of
the most popular platforms are Windows 2000, Linux, Solaris, and
MacOS. Most platforms can be described as a combination of the operating system and
hardware. The Java platform differs from most other platforms in that it’s a software-only
platform that runs on top of other hardware-based platforms. The Java platform has two
components:

(i) The Java Virtual Machine (Java VM)

(ii) The Java Application Programming Interface (Java API)


The Java VM has already been introduced. It’s the base for the Java platform and is ported
onto various hardware-based platforms. The following figure depicts a program that’s
running on the Java platform. As the figure shows, the Java API and the virtual machine
insulate the program from the hardware.

Fig-3.8: Java Platform



3.6.3 ODBC
Microsoft Open Database Connectivity (ODBC) is a standard programming interface for
application developers and database systems providers. Before ODBC became a de facto
standard for Windows programs to interface with database systems, programmers had to
use proprietary languages for each database they wanted to connect to. Now, ODBC has
made the choice of the database system almost irrelevant from a coding perspective,
which is as it should be. Application developers have much more important things to
worry about than the syntax that is needed to port their program from one database to
another when business needs suddenly change. Through the ODBC Administrator in
Control Panel, you can specify the particular database that is associated with a data
source that an ODBC application program is written to use.

3.6.4 JDBC
Java Database Connectivity (JDBC) is an application programming interface (API) for the
Java programming language, which defines how a client may access a database. JDBC
offers a generic SQL database access mechanism that provides a consistent interface to a variety of
RDBMSs. This consistent interface is achieved through the use of “plug-in” database connectivity
modules, or drivers. If a database vendor wishes to have JDBC support, the vendor must provide the
driver for each platform that the database and Java run on.

Fig-3.9: Structure of JDBC


3.6.5 SQL Database


SQL (Structured Query Language) is a computer language for storing,
manipulating and retrieving data stored in a relational database.
SQL is the standard language for relational database systems. All the Relational Database
Management Systems (RDBMS) like MySQL, MS Access, Oracle, Sybase, Informix,
Postgres and SQL Server use SQL as their standard database language.

SQL is widely popular because it offers the following advantages:

1) Allows users to access data in the relational database management systems.

2) Allows users to describe the data.

3) Allows users to define the data in a database and manipulate that data.

4) Allows SQL to be embedded within other languages using SQL modules, libraries &
pre-compilers.

5) Allows users to create and drop databases and tables.

6) Allows users to create views, stored procedures and functions in a database.

7) Allows users to set permissions on tables, procedures and views.

Fig-3.10: SQL Server Architecture


3.6.6 Networking

Networking is the practice of transporting and exchanging data between nodes over a
shared medium in an information system. Networking comprises not only the design,
construction and use of a network, but also the management, maintenance and operation
of the network infrastructure, software and policies.
A. TCP/IP stack

1) TCP/IP, or the Transmission Control Protocol/Internet Protocol, is a
suite of communication protocols used to interconnect network devices
on the internet.
2) TCP is a connection-oriented protocol; UDP (User Datagram Protocol)
is a connectionless protocol.

Fig-3.11:TCP and IP Stack

B. IP Datagram

The IP layer provides a connectionless and unreliable delivery system. It
considers each datagram independently of the others. Any association between
datagrams must be supplied by the higher layers. The IP layer supplies a
checksum that covers its own header. The header includes the source and
destination addresses.


C. UDP & TCP


1) UDP is also connectionless and unreliable. What it adds to IP is a
checksum for the contents of the datagram and port numbers. The port
numbers are used to support a client/server model.

2) TCP supplies logic to give a reliable connection-oriented protocol
above IP. It provides a virtual circuit that two processes can use to
communicate.

D. Internet Addresses
In order to use a service, you must be able to find it. The Internet
uses an address scheme for machines so that they can be located. The address is
a 32-bit integer which gives the IP address. This encodes a network ID and
further addressing.

3.6.6.1 Network Address

Class A uses 8 bits for the network address with 24 bits left over for other addressing.
Class B uses 16 bit network addressing. Class C uses 24 bit network addressing and class
D uses all 32.

a) Subnet address
Internally, the UNIX network is divided into sub networks. Building 11 is
currently on one sub network and uses 10-bit addressing, allowing 1024
different hosts.
b) Host address
8 bits are finally used for host addresses within our subnet. This places a limit of
256 machines that can be on the subnet.

c) Total address
The total address is the full 32-bit address, which is usually written as 4 integers
separated by dots.
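The dotted notation described above can be illustrated with a short Java sketch. The class name `Ipv4Address` and its methods are hypothetical and not part of the project code; the 8-bit host part follows the subnet example in the text.

```java
// Illustrative sketch only: Ipv4Address and its methods are hypothetical names.
final class Ipv4Address {
    /** Formats a 32-bit address as four integers separated by dots. */
    static String toDotted(int address) {
        return ((address >>> 24) & 0xFF) + "." + ((address >>> 16) & 0xFF) + "."
             + ((address >>> 8) & 0xFF) + "." + (address & 0xFF);
    }

    /** Parses dotted notation back into the 32-bit integer. */
    static int parse(String dotted) {
        String[] parts = dotted.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected 4 octets: " + dotted);
        }
        int address = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("octet out of range: " + part);
            }
            address = (address << 8) | octet;
        }
        return address;
    }

    /** Host part, assuming the low hostBits bits identify the host (8 in the text). */
    static int hostPart(int address, int hostBits) {
        return address & ((1 << hostBits) - 1);
    }

    public static void main(String[] args) {
        int addr = parse("192.168.1.20");
        System.out.println(toDotted(addr) + " host=" + hostPart(addr, 8));
    }
}
```

With 8 host bits, `hostPart` can take 256 distinct values, which matches the limit of 256 machines per subnet stated above.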


Fig-3.12: Network Address

d) Port addresses
A service exists on a host, and is identified by its port. This is a 16 bit number.
To send a message to a server, you send it to the port for that service on the host
that it is running on. This is not location transparency! Certain of these ports are
"well known".
e) Socket
A socket is a data structure maintained by the system to handle network
connections. A socket is created using the call socket. It returns a handle that
can be used with the Read File and Write File functions.
3.6.7 NetBeans
NetBeans is an open-source integrated development environment (IDE) for developing
with Java, PHP, C++, and other programming languages. NetBeans is also referred to as a
platform of modular components used for developing Java desktop applications.

Fig-3.13: NetBeans Working Architecture


4. METHODOLOGY AND LEARNING

4.0 Feasibility Study


The feasibility of the project is analyzed in this phase, and a business proposal is put forth
with a very general plan for the project and some cost estimates. During system analysis, the
feasibility study of the proposed system is carried out. This is to ensure that the
proposed system is not a burden to the company. For feasibility analysis, some
understanding of the major requirements for the system is essential.
Three key considerations involved in the feasibility analysis are:

(i) Economical Feasibility

(ii) Technical Feasibility

(iii) Social Feasibility


4.0.1 Economical Feasibility
This aspect of the study is carried out to check the economic impact that the system will
have on the organization. The amount of funds that the company can pour into the research
and development of the system is limited. Thus, the developed system had to be well within
the budget, and this was achieved because most of the technologies used are freely available.
4.0.2 Technical Feasibility
This aspect of the study is carried out to check the technical feasibility, that is, the technical
requirements of the system. Any system developed must not place a high demand on the
available technical resources, as this would lead to high demands being placed on the client.
4.0.3 Social Feasibility
This aspect of the study checks the level of acceptance of the system by the user. This
includes the process of training the user to use the system efficiently. The user must not feel
threatened by the system, but must instead accept it as a necessity. The level of acceptance by
the users depends solely on the methods that are employed to educate the users about the
system and to make them familiar with it.


4.1 Existing System


4.1.1 Introduction
In recent years, many studies on access control in the cloud have been based on the
attribute-based encryption (ABE) algorithm. However, traditional ABE is not suitable for the
mobile cloud because it is computationally intensive and mobile devices have only limited
resources. Decentralized multi-authority ABE has been adopted to protect
outsourced data. However, there are some security issues. Firstly, the user’s attribute
information is leaked to the authorities, and secondly, the access structure sent along
with the ciphertext compromises the user’s privacy. To address these issues, an efficient
decentralized attribute-based encryption scheme with privacy preservation and expressive
access structures has been proposed, using the Diffie-Hellman algorithm.

4.1.1.1 Convergent Encryption


Convergent encryption provides data confidentiality in deduplication. A user (or data
owner) derives a convergent key from the data content and encrypts the data copy with the
convergent key. The user also derives a tag for the data copy; if two data copies are the
same, then their tags are the same.
Formally, a convergent encryption scheme can be defined with four primitive functions.
• KeyGen(F) : The key generation algorithm takes a file content F as input
and outputs the convergent key ckF of F;
• Encrypt(ckF; F) : The encryption algorithm takes the convergent key (ckF)
and file content F as input and outputs the ciphertext (ctF);
• Decrypt(ckF; ctF) : The decryption algorithm takes the convergent key
(ckF) and ciphertext (ctF) as input and outputs the plain file F;
• TagGen(F) : The tag generation algorithm takes a file content F as input
and outputs the tag tagF of F.
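The four primitives can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the scheme used in the project: the key is assumed to be the first 128 bits of SHA-256(F), the cipher is assumed to be AES in CTR mode with a fixed all-zero IV (so that equal files give equal ciphertexts), and the tag is assumed to be a hash of the convergent key. The class name `ConvergentEncryption` is hypothetical.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Illustrative sketch only; the algorithm choices are assumptions, not the report's design.
final class ConvergentEncryption {
    private static byte[] sha256(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** KeyGen(F): derive the convergent key from the file content (128-bit AES key). */
    static byte[] keyGen(byte[] file) {
        return Arrays.copyOf(sha256(file), 16);
    }

    /** TagGen(F): assumed here to be a hash of the convergent key of F. */
    static byte[] tagGen(byte[] file) {
        return sha256(keyGen(file));
    }

    private static byte[] aesCtr(int mode, byte[] key, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            // Fixed IV keeps encryption deterministic; each key is tied to one file content.
            cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(new byte[16]));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Encrypt(ckF; F): returns the ciphertext ctF. */
    static byte[] encrypt(byte[] key, byte[] file) {
        return aesCtr(Cipher.ENCRYPT_MODE, key, file);
    }

    /** Decrypt(ckF; ctF): returns the plain file F. */
    static byte[] decrypt(byte[] key, byte[] ciphertext) {
        return aesCtr(Cipher.DECRYPT_MODE, key, ciphertext);
    }
}
```

Because the key depends only on the content, two owners of the same file produce the same ciphertext and the same tag, which is what lets the server deduplicate without seeing the plaintext.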


Fig-4.1: Convergent Encryption

4.1.2 Algorithm Used


In the existing system, the Diffie-Hellman algorithm is used. The Diffie-Hellman key
agreement protocol was the first practical method for establishing a shared secret over
an unsecured communication channel. The point is to agree on a key that two parties can
use for a symmetric encryption. Diffie-Hellman key exchange is a specific method of
securely exchanging cryptographic keys over a public channel. Traditionally, secure
encrypted communication between two parties requires that they first exchange keys by
some secure physical channel. The Diffie-Hellman key exchange method allows two
parties that have no prior knowledge of each other to jointly establish a shared secret key
over an insecure channel. At the end of the communication both sender and receiver have
the same key.

Fig-4.2: Diffie-Hellman Algorithm Example


4.1.3 Steps in the algorithm

The steps in the algorithm are

1. Alice and Bob agree on a prime number p and a base g.

2. Alice chooses a secret number a, and sends Bob (g^a mod p).

3. Bob chooses a secret number b, and sends Alice (g^b mod p).

4. Alice computes ((g^b mod p)^a mod p).

5. Bob computes ((g^a mod p)^b mod p). Both Alice and Bob can use this
number as their key. Notice that p and g need not be protected.

Alice                                   Bob

Public Keys available = P, G            Public Keys available = P, G

Private Key Selected = a                Private Key Selected = b

Key generated (x) = G^a mod P           Key generated (y) = G^b mod P

Exchange of generated keys takes place

Key received = y                        Key received = x

Generated Secret Key                    Generated Secret Key
(k_a) = y^a mod P                       (k_b) = x^b mod P

Algebraically it can be shown that k_a = k_b

Users now have a symmetric secret key to encrypt
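The exchange can be checked with a small Java sketch using `java.math.BigInteger.modPow`. The class name `DiffieHellmanDemo` and the toy parameters (P = 23, G = 5, a = 6, b = 15) are chosen only for illustration; real deployments use primes of 2048 bits or more.

```java
import java.math.BigInteger;

// Illustrative sketch only: toy parameters, hypothetical class name.
final class DiffieHellmanDemo {
    /** Computes the value a party sends: G^secret mod P. */
    static BigInteger publicValue(BigInteger g, BigInteger secret, BigInteger p) {
        return g.modPow(secret, p);
    }

    /** Computes the shared key from the received value: received^secret mod P. */
    static BigInteger sharedKey(BigInteger received, BigInteger secret, BigInteger p) {
        return received.modPow(secret, p);
    }

    public static void main(String[] args) {
        BigInteger p = BigInteger.valueOf(23);
        BigInteger g = BigInteger.valueOf(5);
        BigInteger a = BigInteger.valueOf(6);   // Alice's private key
        BigInteger b = BigInteger.valueOf(15);  // Bob's private key

        BigInteger x = publicValue(g, a, p);    // Alice sends x
        BigInteger y = publicValue(g, b, p);    // Bob sends y

        BigInteger ka = sharedKey(y, a, p);     // k_a = y^a mod P
        BigInteger kb = sharedKey(x, b, p);     // k_b = x^b mod P
        System.out.println("k_a = " + ka + ", k_b = " + kb);
    }
}
```

Both sides arrive at the same value because (G^b)^a mod P = (G^a)^b mod P = G^(ab) mod P.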


4.2 Proposed System

4.2.1 Introduction
Secure deduplication provides data confidentiality in deduplication. A user
derives a convergent key from the data content and encrypts the data copy with the
convergent key. If two data copies are the same, then their tags are the same.

Fig-4.3: Deduplication
In the proposed system, we develop the architecture of LDSS. The proposed system involves
the following scheme and algorithms.

(i) We propose a Lightweight Data Sharing Scheme (LDSS) for the mobile cloud
computing environment.

(ii) We design an algorithm called LDSS-CP-ABE, based on the Attribute-Based Encryption
(ABE) method, to offer efficient access control over ciphertext.

(iii) We use proxy servers for encryption and decryption operations. In our approach,
computationally intensive operations in ABE are conducted on proxy servers, which
greatly reduces the computational overhead on client-side mobile devices.
Meanwhile, in LDSS-CP-ABE, in order to maintain data privacy, a version
attribute is also added to the access structure. The decryption key format is
modified so that it can be sent to the proxy servers in a secure way.

(iv) We introduce lazy re-encryption and a description field of attributes to reduce the
revocation overhead when dealing with the user revocation problem.

(v) We use the Blowfish algorithm, a symmetric encryption algorithm, to encrypt and
decrypt data and to generate keys. Finally, we implement a data sharing prototype
framework based on LDSS.

4.2.2 Algorithm Used

The Blowfish algorithm is used in the proposed system. It is applied to encrypt and
decrypt data and to generate keys. Blowfish is a symmetric encryption algorithm: a
single key is used for both the encryption and the decryption process. The secret key of
the Blowfish encryption scheme ranges from 32 to 448 bits. If the key length is
448 bits, there are 2^448 possible keys. Blowfish is a block cipher with a fixed 64-bit
block size and a variable-length key. The cipher is a 16-round Feistel network, which uses
key-dependent S-boxes to build the structure by which encryption and decryption take
place. The cipher divides messages into 64-bit blocks and then encrypts them separately.
The algorithm has two groups of subkeys, namely the 18-entry P-array, whose entries are
XORed into the data in each round, and four 256-entry S-boxes (substitution boxes).

4.2.3 Key generation


The P-array consists of 18 32-bit subkeys:
P1, P2, ..., P18
There are four S-boxes, each consisting of 256 32-bit entries:
S1,0, S1,1, ..., S1,255
S2,0, S2,1, ..., S2,255
S3,0, S3,1, ..., S3,255
S4,0, S4,1, ..., S4,255

Fig 4.4: F Function Working

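The function F divides its 32-bit input into four eight-bit quarters a, b, c and d, and
computes F = ((S1,a + S2,b mod 2^32) XOR S3,c) + S4,d mod 2^32.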


4.2.4 Blowfish Encryption Algorithm


The Blowfish symmetric block cipher algorithm encrypts data in blocks of 64 bits at a time.
It follows the Feistel network structure, and the algorithm is divided into two parts: key
expansion and data encryption. During encryption, the subkeys are used in the order
P1, P2, …, P18.

To encrypt: Divide x into two 32-bit halves: xL, xR.

For i = 1 to 16:
    xL = xL XOR Pi
    xR = F(xL) XOR xR
    Swap xL and xR

Swap xL and xR (Undo the last swap.)
xR = xR XOR P17
xL = xL XOR P18
Recombine xL and xR

Fig-4.5: Encryption Process


4.2.5 Blowfish Decryption Algorithm


Decryption is exactly the same as encryption, except that P1, P2, …, P18 are used
in the reverse order.
The subkeys are calculated using the Blowfish algorithm. The exact method follows.
1) Initialize first the P-array and then the four S-boxes, in order, with a fixed string.
This string consists of the hexadecimal digits of pi.
2) XOR P1 with the first 32 bits of the key, XOR P2 with the second 32 bits of the key,
and so on for all bits of the key (up to P18). Repeatedly cycle through the key bits
until the entire P-array has been XORed with key bits.
3) Encrypt the all-zero string with the Blowfish algorithm, using the subkeys described
in steps 1 and 2.
4) Replace P1 and P2 with the output of step 3.
5) Encrypt the output of step 3 using the Blowfish algorithm with the modified subkeys.
6) Replace P3 and P4 with the output of step 5.
7) Continue the process, replacing all elements of the P-array, and then all four S-
boxes in order, with the output of the continuously changing Blowfish algorithm.
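The statement that decryption is the same procedure with the subkeys reversed can be illustrated with a small Java sketch of the 16-round structure. It is not a Blowfish implementation: the round function `TOY_F` is an arbitrary stand-in for the S-box based F, the subkeys are supplied by the caller rather than derived by the key schedule above, and the class name `FeistelSketch` is hypothetical.

```java
import java.util.function.IntUnaryOperator;

// Illustrative sketch only: shows the round structure, not real Blowfish.
final class FeistelSketch {
    /** Toy round function standing in for Blowfish's S-box based F (an assumption). */
    static final IntUnaryOperator TOY_F = x -> Integer.rotateLeft(x, 7) * 0x9E3779B1 + 0x7F4A7C15;

    /** Runs the 16 rounds plus the final two XORs over one 64-bit block. */
    static long crypt(long block, int[] p, IntUnaryOperator f) {
        if (p.length != 18) {
            throw new IllegalArgumentException("need 18 subkeys");
        }
        int xl = (int) (block >>> 32);
        int xr = (int) block;
        for (int i = 0; i < 16; i++) {
            xl ^= p[i];                  // xL = xL XOR Pi
            xr ^= f.applyAsInt(xl);      // xR = F(xL) XOR xR
            int t = xl; xl = xr; xr = t; // swap xL and xR
        }
        int t = xl; xl = xr; xr = t;     // undo the last swap
        xr ^= p[16];                     // xR = xR XOR P17
        xl ^= p[17];                     // xL = xL XOR P18
        return ((long) xl << 32) | (xr & 0xFFFFFFFFL);
    }

    static int[] reversed(int[] p) {
        int[] q = new int[p.length];
        for (int i = 0; i < p.length; i++) {
            q[i] = p[p.length - 1 - i];
        }
        return q;
    }

    static long encrypt(long block, int[] p, IntUnaryOperator f) {
        return crypt(block, p, f);
    }

    /** Decryption reuses crypt() with the subkeys in reverse order. */
    static long decrypt(long block, int[] p, IntUnaryOperator f) {
        return crypt(block, reversed(p), f);
    }
}
```

The round trip works for any round function, because a Feistel network never needs to invert F; only the order of the subkeys changes.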

Fig-4.6: Decryption Process


4.2.6 Data Encryption process


Before uploading a data file to the cloud, the data holder initially logs in with a
unique ID, and then randomly chooses a symmetric data file encryption key to encode the
data.

Fig-4.7: Data Owner side (encryption process)


4.2.7 Data Decryption process
The data consumer first downloads the data from the cloud to the local device, and
then invokes the decryption algorithm to decrypt the data. The user can then read the
corresponding data file.
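The owner-side encryption and user-side decryption described in 4.2.6 and 4.2.7 can be sketched with the Blowfish cipher that ships with the standard Java cryptography API (`javax.crypto`). This is a minimal sketch under assumptions, not the project's implementation: the class name `BlowfishFileCrypto`, the 128-bit key size, the CBC mode and the layout of the output (8-byte IV followed by the ciphertext) are all choices made for the example.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

// Illustrative sketch only: names, key size and mode are assumptions.
final class BlowfishFileCrypto {
    private static final String TRANSFORMATION = "Blowfish/CBC/PKCS5Padding";
    private static final int IV_LENGTH = 8; // Blowfish block size in bytes

    /** Randomly chooses a symmetric file encryption key (128 bits here). */
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("Blowfish");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Data owner side: returns a fresh random IV followed by the ciphertext. */
    static byte[] encrypt(SecretKey key, byte[] plain) {
        try {
            byte[] iv = new byte[IV_LENGTH];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
            byte[] ciphertext = cipher.doFinal(plain);
            byte[] out = new byte[IV_LENGTH + ciphertext.length];
            System.arraycopy(iv, 0, out, 0, IV_LENGTH);
            System.arraycopy(ciphertext, 0, out, IV_LENGTH, ciphertext.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Data user side: reads the IV from the front and decrypts the rest. */
    static byte[] decrypt(SecretKey key, byte[] data) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(data, 0, IV_LENGTH));
            return cipher.doFinal(data, IV_LENGTH, data.length - IV_LENGTH);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Note that a random IV makes two encryptions of the same file differ, so this variant suits the per-user file key of 4.2.6 rather than the deterministic convergent encryption used for deduplication.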

Fig-4.8: Data user side (decryption process)


5. SYSTEM DESIGN

5.0 System Attribute Entity Model


Aiming at allowing for scrutiny and deduplicated storage, we propose the
Integrity Scrutinizing system.

Fig-5.1: Attribute Entity Model


5.0.1 System Architecture

Fig-5.2: System Architecture


5.1 Module Implementation


In the proposed system, we develop the architecture of LDSS using the following six
components:
1) Data Owner (DO)
2) Data User (DU)
3) Scrutinizer
4) Encryption Service Provider (ESP)
5) Decryption Service Provider (DSP)
6) Cloud Service Provider (CSP)
a. Data Owner
In the Data Owner module, the data owner must first register his or her details.
After successful registration, the data owner can log in and upload files to the
cloud server with encrypted keywords and hashing algorithms. He/she can view the
files that have been uploaded to the cloud. The data owner can approve or reject
the file requests sent by data users. After approving a request, the data owner
sends the trapdoor key and verification object by mail.

b. Data User
In the Data User module, data users must first register their details, and
after logging in they have to verify their login with a secret key. Data users can
search all the files uploaded by data owners. A data user can send a request for a
file, which is forwarded to the data owner. If the data owner approves the request,
the data user receives the verification object and decryption key at the registered
mail address.
c. Scrutinizer
The scrutinizer is a third party which facilitates interactions between two parties
who both trust it. It is responsible for generating public and private keys,
and distributing attribute keys to users. With this mechanism, users can share
and access data without being aware of the encryption and decryption operations.


d. ESP and DSP


To relieve the overhead on client-side mobile devices, the encryption service
provider (ESP) and decryption service provider (DSP) are used. Both the
encryption service provider and the decryption service provider are
semi-trusted. We modify the traditional CP-ABE algorithm and design an
LDSS-CP-ABE algorithm to ensure the privacy of the data when outsourcing
computational tasks to the ESP and DSP.
e. Cloud Service Provider (CSP)

The CSP stores the data for the DO. It faithfully executes the operations requested
by the DO, while it may peek at data that the DO has stored in the cloud. The cloud
server can edit and update the files, and can also view the download history.

5.1.1 Working Process


First, the Data Owner sends data to the cloud. Since the cloud is not credible,
data has to be encrypted before it is uploaded. The Data Owner defines the access control
policy in the form of an access control tree, with policies such as read the data and write
the data. A DU must obtain authorization if he wants to access a certain data file. In LDSS,
data files are all encrypted using a symmetric encryption mechanism, and the symmetric key
for data encryption is also encrypted using attribute-based encryption (ABE).
In our proposed system, the data owner and the scrutinizer have an equal level
of authority. The data owner first registers or logs in on the website, which works
like a CSP (cloud service provider), and can then upload his own files to the cloud in
encrypted format. A data user can register or log in on the website to access files. After
the data user logs in to the cloud server, the request goes to the data owner, who decides
whether or not to approve file access for the user. The data user receives an
acknowledgment from the data owner if the request is approved.
A third-party authority is used to monitor the data owner's activities; it can also
check the integrity and durability of the files uploaded by the data owner to the
mobile cloud. The scrutinizer also generates a report for the data owner. When a data
user requests some data from the cloud, the data owner selects the role for the data
user and, after approving the request, sends the public key to the data user by
email. The data user can then retrieve the information from the cloud by entering the key
on the website. Because this information is encrypted, the data owner also provides the
private key to the data user by mail, and the data user uses this key to decrypt
the data.
To relieve the overhead on the client-side mobile devices, the encryption
service provider (ESP) and decryption service provider (DSP) are used. Both the
encryption service provider and the decryption service provider are semi-trusted. We
modify the traditional CP-ABE algorithm and design an LDSS-CP-ABE algorithm to
ensure data privacy when outsourcing computational tasks to the ESP and DSP. We also
used the AES (Advanced Encryption Standard) algorithm to encrypt and decrypt the
overall data uploaded to the mobile cloud by the data owner.

5.1.2 System Protocols


A. File Uploading Protocol:
This protocol aims at allowing clients to upload files via the scrutinizer.
Specifically, the file uploading protocol includes three phases:

a) Phase 1 (cloud client → cloud server): Before uploading a file, the client
performs a duplicate check with the cloud server to confirm whether such a
file is already stored in cloud storage. If there is a duplicate, another
protocol called Proof of Ownership is run between the client and the cloud
storage server. Otherwise, the following phases (phase 2 and phase 3)
are run.
b) Phase 2 (cloud client → scrutinizer): The client uploads the file to the
scrutinizer and receives a receipt from the scrutinizer.
c) Phase 3 (scrutinizer → cloud server): The scrutinizer helps generate a set of
tags for the uploaded file, and sends them along with the file to the cloud server.
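The duplicate check of phase 1 can be sketched in Java with an in-memory map keyed by a file tag. This is only an illustration: the class `DedupStore`, the use of SHA-256 as the tag, and the collapsing of phases 2 and 3 into a single `upload` call are assumptions, and the Proof of Ownership step that the protocol runs on a duplicate is omitted here.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: an in-memory stand-in for the cloud server's storage.
final class DedupStore {
    private final Map<String, byte[]> filesByTag = new HashMap<>();
    private final Map<String, Set<String>> ownersByTag = new HashMap<>();

    /** Tag of a file; assumed here to be the Base64-encoded SHA-256 digest. */
    static String tagOf(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
            return Base64.getEncoder().encodeToString(digest);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Phase 1: the client asks whether a file with this tag is already stored. */
    boolean isDuplicate(String tag) {
        return filesByTag.containsKey(tag);
    }

    /** Stores the content only if it is new; returns true when a new copy was stored. */
    boolean upload(String owner, byte[] content) {
        String tag = tagOf(content);
        boolean stored = false;
        if (!isDuplicate(tag)) {
            filesByTag.put(tag, content.clone());
            stored = true;
        }
        // Every uploader is recorded as an owner of the single stored copy.
        ownersByTag.computeIfAbsent(tag, k -> new HashSet<>()).add(owner);
        return stored;
    }

    int storedCopies() {
        return filesByTag.size();
    }

    int ownerCount(byte[] content) {
        Set<String> owners = ownersByTag.get(tagOf(content));
        return owners == null ? 0 : owners.size();
    }
}
```

Uploading the same content from two owners leaves one stored copy with two recorded owners, which is the storage saving deduplication aims for.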


B. Integrity Scrutinizing Protocol:


It is an interactive protocol for integrity verification and can be
initiated by any entity except the cloud server. In this protocol, the cloud
server plays the role of prover, while the scrutinizer or client works as the
verifier. This protocol includes two phases:
a) Phase 1 (cloud client/scrutinizer → cloud server): The verifier generates a
set of challenges and sends them to the prover.
b) Phase 2 (cloud server → cloud client/scrutinizer): Based on the stored
files and file tags, the prover tries to prove that it actually holds the target
file by sending the proof back to the verifier. At the end of this protocol, the
verifier outputs true if the integrity verification is passed.
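The challenge-response idea can be sketched in Java as below. This is a deliberately simplified stand-in, not a PDP scheme: each block tag is assumed to be an HMAC-SHA256 over the block index and block content under a key held by the verifier, and the prover answers a challenge by returning the challenged blocks and tags themselves, whereas real PDP constructions return a compact aggregated proof. The class name `IntegrityAuditSketch` is hypothetical.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Illustrative sketch only: HMAC tags replace the homomorphic tags of real PDP schemes.
final class IntegrityAuditSketch {
    /** Tag for one block: HMAC-SHA256(key, index || block). */
    static byte[] tag(byte[] key, int index, byte[] block) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            mac.update(ByteBuffer.allocate(4).putInt(index).array());
            return mac.doFinal(block);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Phase 1: the verifier picks distinct random block indices to challenge. */
    static int[] challenge(int blockCount, int samples, SecureRandom random) {
        return random.ints(0, blockCount)
                     .distinct()
                     .limit(Math.min(samples, blockCount))
                     .toArray();
    }

    /** Phase 2: checks the server's blocks and tags at the challenged indices. */
    static boolean audit(byte[] key, int[] indices, byte[][] serverBlocks, byte[][] serverTags) {
        for (int index : indices) {
            byte[] expected = tag(key, index, serverBlocks[index]);
            if (!MessageDigest.isEqual(expected, serverTags[index])) {
                return false;
            }
        }
        return true;
    }
}
```

Sampling a few random blocks lets the verifier detect corruption with high probability without retrieving the entire file, which is the property the text attributes to PDP.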

C. Proof of Ownership Protocol:


It is an interactive protocol initiated by the cloud server to verify that the
client actually owns a claimed file. This protocol is typically triggered along with
the file uploading protocol to prevent the leakage of side-channel information. In
contrast to the integrity scrutinizing protocol, in PoW the cloud server works as
the verifier, while the client plays the role of prover. This protocol also includes
two phases:
a) Phase 1 (cloud server → client): The cloud server generates a set of
challenges and sends them to the client.
b) Phase 2 (client → cloud server): The client responds with the proof of
file ownership, and the cloud server finally verifies the validity of the proof.
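The two phases can be sketched in Java with a fresh random nonce as the challenge. This is a minimal illustration under assumptions, not the PoW scheme used in the project: the proof is assumed to be SHA-256(nonce || file), and the class name `ProofOfOwnership` is hypothetical.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;

// Illustrative sketch only: a nonce-based hash stands in for a full PoW scheme.
final class ProofOfOwnership {
    /** Phase 1 (server): generate a fresh random challenge. */
    static byte[] newChallenge() {
        byte[] nonce = new byte[16];
        new SecureRandom().nextBytes(nonce);
        return nonce;
    }

    /** Phase 2 (client): prove possession by hashing the challenge with the file. */
    static byte[] prove(byte[] nonce, byte[] file) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            digest.update(nonce);
            return digest.digest(file);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Phase 2 (server): recompute the proof from the stored copy and compare. */
    static boolean verify(byte[] nonce, byte[] storedFile, byte[] proof) {
        return MessageDigest.isEqual(prove(nonce, storedFile), proof);
    }
}
```

Because the nonce changes on every run, a client that only knows a short static value such as the file hash cannot answer the challenge without holding the file itself.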


5.1.3 System objectives


a. Integrity Scrutinizing

The first design goal of this work is to provide the capability of verifying the
correctness of remotely stored data. Integrity verification further
requires two features, public verification and stateless verification:

i. Public verification, which allows anyone, not just the clients who originally
stored the file, to perform verification.

ii. Stateless verification, which eliminates the need to maintain state
information at the verifier side between the actions of
auditing and data storage.

To protect data in public cloud servers from unauthorized entities, the client
has to ensure that only authorized users are able to obtain the decryption keys.
As such, the data owner has to encrypt the data deciphering key using the
public key of the recipient user.

This key is then integrated by the data owner into the user metadata, ensuring data
confidentiality against malicious users as well as flexible access control policies.

b. Cost-Effective
The computational overhead of providing integrity scrutinizing and secure
deduplication should not add a major cost over traditional cloud
storage, nor should it alter the way uploading or downloading
operations are performed.

c. Secure Deduplication
The second design goal of this work is secure deduplication. In other words, it
requires that the cloud server is able to decrease the storage space by keeping
only one copy of the same file.


d. File Confidentiality
The design goal of file confidentiality is to prevent the cloud server from
accessing the content of files. Specifically, we require that file
confidentiality be resistant to the “dictionary attack”. That is, even if the
adversaries have prior knowledge of the “dictionary”, which includes all the
possible files, they still cannot recover the target file.
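
One way to see why the dictionary attack matters is to compare two key derivations. The sketch below is an assumption made for illustration and is not taken from this report: a key derived only from the file hash can be recomputed by anyone who holds a candidate file, whereas a keyed derivation (HMAC-SHA256 with a secret held by a trusted party) cannot. FileKeySketch and its method names are hypothetical.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class FileKeySketch {
    /** Key derived from content only: every holder of the same file gets the same key. */
    static byte[] contentOnlyKey(byte[] file) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(file);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Keyed derivation: also needs a secret that the adversary is assumed not to hold. */
    static byte[] keyedFileKey(byte[] file, byte[] secret) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            return mac.doFinal(contentOnlyKey(file));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With the content-only key, an adversary can run through the dictionary, derive the key of each candidate and test it. With the keyed derivation, identical files still map to identical keys, so deduplication remains possible, but the dictionary alone is no longer enough.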

5.2 Data flow Diagram


A Data-Flow Diagram (DFD) is a way of representing the flow of data through a
process or a system (usually an information system). The DFD also provides
information about the inputs and outputs of each entity and of the process itself.
A data flow diagram has no control flow: there are no decision rules and no loops.
The data flow diagrams for the modules are:
a) Data Owner
In the Data Owner module, the data owner must first register their details. After
successful registration, the data owner can log in and upload files to the cloud
server. The data owner can approve or reject the file requests sent by data users.
b) Data User
In the Data User module, data users must first register their details and, after
logging in, verify the login with a secret key. Data users can search all the
files uploaded by data owners.
c) Cloud Server
The cloud service provider (CSP) stores the data for the data owner (DO). It
faithfully executes the operations requested by the DO, although it may peek at the
data that the DO has stored in the cloud. The cloud server can edit and update
files and can also view the download history.

Fig-5.3: DFD for Data Owner

Fig-5.4: DFD for Data User

Fig-5.5: DFD for Cloud Server

5.3 UML diagram


UML stands for Unified Modelling Language. UML is a standardized
general-purpose modelling language in the field of object-oriented software engineering.
The standard is managed by the Object Management Group (OMG), which adopted it as
a standard. The goal is for UML to become a common language for creating models of
object-oriented computer software.
Various UML Diagrams are:
1) Usecase Diagram
2) Object Diagram
3) Sequence Diagram
4) Class Diagram
5) Collaboration Diagram
6) Activity Diagram
7) Deployment Diagram
8) Component Diagram
9) State Machine Diagram

5.3.1 Goals of UML

Various goals of UML are:

1. Be independent of particular programming languages and development processes.

2. Provide a formal basis for understanding the modelling language.

3. Encourage the growth of the OO tools market.

4. Support higher-level development concepts such as collaborations,
frameworks, patterns and components.

5.3.2 Usecase Diagram


Use case diagrams are used to gather the requirements of a system
including internal and external influences. These requirements are mostly design
requirements. Hence, when a system is analyzed to gather its functionalities, use cases are
prepared and actors are identified.

Fig-5.6: Usecase Diagram


5.3.3 Class Diagram
Class diagrams are one of the most useful types of diagrams in UML as they
clearly map out the structure of a particular system by modelling its classes, attributes,
operations, and relationships between objects. Class diagrams are a type of structure
diagram because they describe what must be present in the system being modelled.

Fig-5.7: Class Diagram


5.3.4 Sequence Diagram
Sequence diagrams are used by software developers and business
professionals to understand the requirements for a new system or to document an existing
process. Sequence diagrams are sometimes known as event diagrams or event scenarios.

Fig-5.8: Sequence Diagram

5.3.5 Collaboration Diagram


A collaboration diagram, also called a communication diagram or
interaction diagram, is an illustration of the relationships and interactions
among software objects in the UML. It is used to show how objects interact to
perform the behaviour of a particular use case, or a part of a use case.

Fig-5.9: Collaboration Diagram

5.3.6 Activity Diagram


An activity diagram is another important UML diagram for describing the
dynamic aspects of the system. It is essentially a flowchart that represents the
flow from one activity to another.

Fig-5.10: Activity Diagram


5.3.7 Component Diagram
A component diagram describes the organization and wiring of the
physical components in a system.

Fig-5.11: Component Diagram

6. CODING
6.0 Sample Source Code
Coding (or programming) is the construction of software. Coding involves writing a
'recipe' in a so-called programming language that a computer can understand.
6.0.1 Login Servlet
import java.io.IOException;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;

public class LoginServlet extends HttpServlet {
    @Override
    public void service(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        HttpSession hs = request.getSession();
        String user = request.getParameter("uname");
        String pwd = request.getParameter("pwd");
        // Stay on the login page unless the credentials match a registered user.
        String url = "userlogin.jsp";
        try {
            // DatabaseConnection is the project's own helper; the connection is not
            // closed here because its lifetime policy is not shown in this listing.
            Connection con = DatabaseConnection.getconnection();
            // Parameterised query: user input is never concatenated into the SQL.
            try (PreparedStatement pst = con.prepareStatement(
                    "select * from userregister where username=? and password=?")) {
                pst.setString(1, user);
                pst.setString(2, pwd);
                try (ResultSet rs = pst.executeQuery()) {
                    if (rs.next()) {
                        // Column 2 is read as the username and kept in the session.
                        hs.setAttribute("uname", rs.getString(2));
                        // Next step: the user has to confirm the secret key.
                        url = "userkey.jsp";
                    }
                }
            }
        } catch (Exception e) {
            // Report the failure instead of swallowing it silently.
            e.printStackTrace();
        }
        request.getRequestDispatcher(url).forward(request, response);
    }
}
6.0.2 NewUserRegister
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.PreparedStatement;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
// MultipartRequest is assumed to be the class from the com.oreilly.servlet (COS) library.
import com.oreilly.servlet.MultipartRequest;

public class NewUserRegister extends HttpServlet {
    private static final long serialVersionUID = 1L;
    private static final String UPLOAD_DIR = "images";

    @Override
    public void service(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        PrintWriter out = response.getWriter();
        // Resolve the upload folder inside the deployed web application.
        String dirName = getServletContext().getRealPath("/") + File.separator + UPLOAD_DIR;
        File save = new File(dirName);
        if (!save.exists()) {
            save.mkdir();
        }
        // Parse the multipart form; the request size is limited to 10 MB.
        MultipartRequest multi = new MultipartRequest(request, dirName, 10 * 1024 * 1024);
        String username = multi.getParameter("uname");
        String password = multi.getParameter("pwd");
        String gender = multi.getParameter("gnd");
        String email = multi.getParameter("email");
        String mobile = multi.getParameter("mobile");
        File f = multi.getFile("image");
        boolean saved = false;
        if (f != null) { // no image uploaded means nothing can be stored
            String path = UPLOAD_DIR + "\\" + f.getName();
            try (FileInputStream fs = new FileInputStream(f)) {
                Connection con = DatabaseConnection.getconnection();
                try (PreparedStatement pst = con.prepareStatement(
                        "insert into userregister(username,password,gender,email,mobile,"
                        + "path,filedata,status) values(?,?,?,?,?,?,?,?)")) {
                    pst.setString(1, username);
                    pst.setString(2, password);
                    pst.setString(3, gender);
                    pst.setString(4, email);
                    pst.setString(5, mobile);
                    pst.setString(6, path);
                    // Use the real file length; available() is only an estimate.
                    pst.setBinaryStream(7, fs, (int) f.length());
                    pst.setString(8, "waiting"); // initial status of a new registration
                    // executeUpdate returns the row count, so success means at least one row.
                    saved = pst.executeUpdate() > 0;
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        String message = saved ? "Saved Successfully !..." : "Details are not saved !...";
        String next = saved ? "userlogin.jsp" : "newuserregister.jsp";
        out.write("<script type='text/javascript'>\n");
        out.write("alert('" + message + "');\n");
        out.write("setTimeout(function(){window.location.href='" + next + "'},100);");
        out.write("</script>\n");
    }
}
6.0.3 Encryption
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;

public class encryption {
    // The secret key is an AES key, for example a 128-bit key produced by
    // KeyGenerator.getInstance("AES") after init(128). It can be kept as a
    // Base64 string of secretKey.getEncoded() and rebuilt with
    // new SecretKeySpec(bytes, "AES").
    public String encrypt(String text, SecretKey secretkey) {
        String cipherText = null;
        try {
            // "AES" alone selects the provider's default mode and padding.
            Cipher aesCipher = Cipher.getInstance("AES");
            aesCipher.init(Cipher.ENCRYPT_MODE, secretkey);
            // Encode the text explicitly as UTF-8 so the result is platform independent.
            byte[] byteDataToEncrypt = text.getBytes(StandardCharsets.UTF_8);
            byte[] byteCipherText = aesCipher.doFinal(byteDataToEncrypt);
            // Convert the encrypted bytes to a Base64 string with java.util.Base64.
            cipherText = Base64.getEncoder().encodeToString(byteCipherText);
        } catch (Exception e) {
            System.out.println(e);
        }
        return cipherText;
    }
}
6.0.4 Decryption
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;

public class decryption {
    public String decrypt(String txt, String skey) {
        String decryptedtext = null;
        try {
            // Rebuild the AES key from its Base64 string form.
            byte[] bs = Base64.getMimeDecoder().decode(skey);
            SecretKey sec = new SecretKeySpec(bs, "AES");
            // Initialise the cipher directly in decrypt mode with that key.
            Cipher aesCipher = Cipher.getInstance("AES");
            aesCipher.init(Cipher.DECRYPT_MODE, sec);
            // The MIME decoder tolerates line breaks in the stored cipher text.
            byte[] byteCipherText = Base64.getMimeDecoder().decode(txt);
            byte[] byteDecryptedText = aesCipher.doFinal(byteCipherText);
            decryptedtext = new String(byteDecryptedText, StandardCharsets.UTF_8);
        } catch (Exception e) {
            System.out.println(e);
        }
        return decryptedtext;
    }
}
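
A round trip through encryption and decryption can be sketched in one self-contained file. This is an illustration under assumptions rather than the project's exact code: java.util.Base64 handles both the key and the cipher text, the key is a freshly generated 128-bit AES key, and the class name AesRoundTripSketch and its method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;

public class AesRoundTripSketch {
    /** Generates a 128-bit AES key and returns it as a Base64 string. */
    static String newKey() {
        try {
            KeyGenerator keyGen = KeyGenerator.getInstance("AES");
            keyGen.init(128);
            return Base64.getEncoder().encodeToString(keyGen.generateKey().getEncoded());
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Rebuilds the SecretKey object from its Base64 string form. */
    static SecretKey toKey(String base64Key) {
        return new SecretKeySpec(Base64.getDecoder().decode(base64Key), "AES");
    }

    /** Encrypts UTF-8 text and returns Base64 cipher text. */
    static String encrypt(String text, String base64Key) {
        try {
            Cipher cipher = Cipher.getInstance("AES");
            cipher.init(Cipher.ENCRYPT_MODE, toKey(base64Key));
            byte[] encrypted = cipher.doFinal(text.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(encrypted);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Decrypts Base64 cipher text back to the original UTF-8 text. */
    static String decrypt(String cipherText, String base64Key) {
        try {
            Cipher cipher = Cipher.getInstance("AES");
            cipher.init(Cipher.DECRYPT_MODE, toKey(base64Key));
            byte[] plain = cipher.doFinal(Base64.getDecoder().decode(cipherText));
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Decrypting with the key that was used for encryption returns the original text, while the stored form is a Base64 string that does not reveal it. Keeping the key as a string matches the way the decryption key is mailed to the user in this system.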
6.0.5 Key Validate
import java.io.IOException;
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;

public class KeyValid extends HttpServlet {
    @Override
    public void service(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        HttpSession hs = request.getSession();
        // The username was placed in the session by the login servlet.
        String username = (String) hs.getAttribute("uname");
        String skey = request.getParameter("key");
        boolean valid = false;
        try {
            // A missing or non-numeric key raises an exception and counts as invalid.
            int key = Integer.parseInt(skey);
            Connection con = DatabaseConnection.getconnection();
            try (PreparedStatement pst = con.prepareStatement(
                    "select * from userregister where username=? and userkey=?")) {
                pst.setString(1, username);
                pst.setInt(2, key);
                try (ResultSet rs = pst.executeQuery()) {
                    valid = rs.next();
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        if (valid) {
            request.getRequestDispatcher("userhome.jsp").forward(request, response);
        } else {
            // Wrong key: end the session and send the user back to the login page.
            hs.invalidate();
            PrintWriter out = response.getWriter();
            out.write("<script type='text/javascript'>\n");
            out.write("alert('Invalid Key !...');\n");
            out.write("setTimeout(function(){window.location.href='userlogin.jsp'},100);");
            out.write("</script>\n");
        }
    }
}
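
The key that this servlet checks has to be issued and parsed somewhere. The following sketch shows one possible way to do both; it is hypothetical and not taken from the project: the six-digit length, the use of SecureRandom, and the names UserKeySketch, issueKey, parseKey and matches are all assumptions.

```java
import java.security.SecureRandom;

public class UserKeySketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Issues a numeric key in the range 100000..999999 (the length is an assumption). */
    static int issueKey() {
        return 100000 + RANDOM.nextInt(900000);
    }

    /** Parses the submitted text; returns -1 for anything that is not a plain integer. */
    static int parseKey(String submitted) {
        if (submitted == null) {
            return -1;
        }
        try {
            return Integer.parseInt(submitted.trim());
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    /** True only when the submitted text is the issued key. */
    static boolean matches(int issued, String submitted) {
        return parseKey(submitted) == issued;
    }
}
```

Parsing the input defensively means that an empty or non-numeric form field is simply treated as a wrong key instead of causing a server error.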

7. RESULTS
7.0 Output Screenshots

Fig-7.1: Wolke’s HomePage

Fig-7.2: Wolke’s Client Login Page

Fig-7.3: Wolke’s Registration Page

Fig-7.4: Client Registered with Wolke

Fig-7.5: Wolke’s Server Login Form

Fig-7.6: Wolke Server HomePage

Fig-7.7: Activate User’s Page

Fig-7.8: IAM Key sent to E-Mail

Fig-7.9: Client’s HomePage

Fig-7.10: Client Uploading files

Fig-7.11: Wolke’s Scrutinizer Login Page

Fig-7.12: Upload File Request

Fig-7.13: Decryption Key Request

Fig-7.14: Decryption Key sent to the E-Mail

Fig-7.15: File Decryption Process

Fig-7.16: File Download

Fig-7.17: List of files stored in the cloud

Fig-7.18: Secure Deduplication of Files

7.1 Observations

Fig-7.19: Initial vs Subsequent data owner

Fig-7.20: Generation vs Verification

Fig-7.21: Challenge vs Proof vs Verification

Fig-7.22: Cloud Storage vs Scrutinizer

8. CONCLUSION AND SUGGESTIONS

8.0 Conclusion

In recent years, many studies on access control in the cloud have been based on
attribute-based encryption (ABE). However, traditional ABE is not suitable for mobile cloud
because it is computationally intensive and mobile devices have only limited resources. In
this project, we proposed LDSS to address this issue. It introduces a novel LDSS-CP-ABE
algorithm to migrate the major computation overhead from mobile devices onto proxy servers,
thereby solving the secure data sharing problem in mobile cloud. The experimental results
show that LDSS can ensure data privacy in mobile cloud and reduce the overhead on the users’
side. In future work, new approaches are to be designed to ensure data
integrity. To further tap the potential of mobile cloud, we will also study how to do
ciphertext retrieval over existing data sharing schemes.

8.1 Future Scope

Cloud services will play a predominant role in the IT sector in the future, in terms of
the storage and other computational services provided to clients on demand. This results
in the storage of huge amounts of data within the cloud. To maintain the integrity,
confidentiality, durability and availability of data, the system needs to maintain a high
level of security and storage capacity. Our proposed system provides both security and
storage efficiency by deduplicating the clients' information. For future work, there are
opportunities for researchers to present secure sharing schemes for sharing essential
data among authorized users. Files must be shared among users according to the access
privileges assigned by the data owner to specific authorized users. There is a further
opportunity to decrease the overhead of standard cryptographic algorithms and to research
schemes that afford the same security at lower overhead.

REFERENCES

1. M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. Katz, A. Konwinski, G. Lee, D.
Patterson, A. Rabkin, I. Stoica, and M. Zaharia, “A view of cloud computing,”
Communications of the ACM, vol. 53, no. 4, pp. 50–58, 2010.

2. J. Yuan and S. Yu, “Secure and constant cost public cloud storage auditing with
deduplication,” in IEEE Conference on Communications and Network Security (CNS),
2013, pp. 145–153.

3. H. Shacham and B. Waters, “Compact proofs of retrievability,” in Proceedings of the 14th
International Conference on the Theory and Application of Cryptology and Information
Security: Advances in Cryptology, ser. ASIACRYPT ’08. Springer Berlin Heidelberg,
2008, pp. 90–107.

4. Q. Wang, C. Wang, J. Li, K. Ren, and W. Lou, “Enabling public verifiability and data
dynamics for storage security in cloud computing,” in Computer Security – ESORICS
2009, M. Backes and P. Ning, Eds., vol. 5789. Springer Berlin Heidelberg, 2009, pp.
355–370.

5. J. Xu and E.-C. Chang, “Towards efficient proofs of retrievability,” in Proceedings of
the 7th ACM Symposium on Information, Computer and Communications Security, ser.
ASIACCS ’12. New York, NY, USA: ACM, 2012, pp. 79–80.

6. E. Stefanov, M. van Dijk, A. Juels, and A. Oprea, “Iris: A scalable cloud file system
with efficient integrity checks,” in Proceedings of the 28th Annual Computer Security
Applications Conference, ser. ACSAC ’12. New York, NY, USA: ACM, 2012, pp. 229–238.

COPYRIGHT NOTICE
Copyright © 2020 Wolke. All rights reserved.

No part of this report may be reproduced or used in any manner
without the written permission of the copyright owner. The report was prepared as part of
a Full Semester Internship at RISE Corp Pvt. Ltd., Visakhapatnam, under the
Department of Information Technology, GMR Institute of
Technology, Rajam. The results embodied in this report have not been submitted to any
other university or institution for the award of any degree or diploma.

Report Developed By : Wolke Team

B. Devaki - 16341A1208

G. Supraja - 16341A1216

L. Hanuman Sai - 16341A1227

M. Nihitha - 16341A1230

S.G.S.N.V. Sai Pavan - 16341A1252

S.K. Maheswari - 16341A1250

Supported By :

Head of the Department : Dr. Ajit Kumar Rout

Industrial Supervisor : U. Naidu

Faculty Supervisors : V.S.K. Chaitanya & P. Akhila

Approved By : RISE Corp Pvt. Ltd.

Year : April, 2020
