
A Project Report

On

SECURE DATA TRANSFER AND DELETION FROM COUNTING


BLOOM FILTER
Submitted to Jawaharlal Nehru Technological University for the partial fulfillment of the requirement for
the Award of the Degree in

BACHELOR OF TECHNOLOGY

in

COMPUTER SCIENCE & ENGINEERING


Submitted by

Y. SHRAVANI (17271A05A2)
G. SOUMYA (17271A05A6)
V. SWAPOORVA (17271A05B2)
G. SWATHI (17271A05B3)
Under the Esteemed guidance of

Mr. P. BALAKISHAN
Associate Professor (CSE Dept.)

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

JYOTHISHMATHI INSTITUTE OF TECHNOLOGY & SCIENCE


(Approved by AICTE, New Delhi, Affiliated to JNTUH, Hyderabad)
NUSTULAPUR, KARIMNAGAR- 505481, TELANGANA, INDIA
2017-2021
CERTIFICATE

This is to certify that the Project Report entitled “SECURE DATA TRANSFER AND DELETION
FROM COUNTING BLOOM FILTER” is being submitted by Y.SHRAVANI (17271A05A2),
G.SOUMYA (17271A05A6), V.SWAPOORVA (17271A05B2), G.SWATHI (17271A05B3) in partial
fulfillment of the requirements for the award of the Degree of Bachelor of Technology in Computer
Science & Engineering to the Jyothishmathi Institute of Technology & Science, Karimnagar, during
the academic year 2020-21, is a bona fide work carried out by them under my guidance and supervision.

The results presented in this Project Work have been verified and are found to be satisfactory. The results
embodied in this Project Work have not been submitted to any other University for the award of any other degree
or diploma.

Project Guide Head of the Department

Mr.P. Balakishan Dr. R. Jegadeesan


Associate Professor Professor
Dept. of CSE Dept. of CSE

External Examiner
ACKNOWLEDGEMENT

We would like to express our sincere gratitude to our advisor, Mr. P. Balakishan, Associate Professor,
CSE Dept., whose knowledge and guidance has motivated us to achieve goals we never thought possible. The
time we have spent working under his supervision has truly been a pleasure.

The experience from this kind of work is great and will be useful to us in the future. We thank Dr. R.
Jegadeesan, Professor & HOD, CSE Dept., for his effort, kind cooperation, guidance and encouragement to
do this work, and also for providing the facilities to carry out this work.
It is a great pleasure to convey our thanks to our principal Dr. G. Lakshmi Narayana Rao, Principal,
Jyothishmathi Institute of Technology & Science and the College Management for permitting us to undertake
this project and providing excellent facilities to carry out our project work.
We thank all the Faculty members of the Department of Computer Science & Engineering for sharing
their valuable knowledge with us. We extend our thanks to the Technical Staff of the department for their
valuable suggestions on technical problems.

Finally, special thanks to our parents for their support and encouragement throughout our life and this
course. Thanks to all our friends and well-wishers for their constant support.
DECLARATION

We hereby declare that the work which is being presented in this dissertation entitled “SECURE

DATA TRANSFER AND DELETION FROM COUNTING BLOOM FILTER”, submitted


towards the partial fulfillment of the requirements for the award of the degree of Bachelor of Technology in
Computer Science & Engineering, Jyothishmathi Institute of Technology & Science, Karimnagar is an
authentic record of our own work carried out under the supervision of Dr. R. JEGADEESAN, Professor, &
HOD, Department of CSE, Jyothishmathi Institute of Technology and Science, Karimnagar.

To the best of our knowledge and belief, this project bears no resemblance with any report submitted
to JNTUH or any other University for the award of any degree or diploma.

Y. SHRAVANI (17271A05A2)
G. SOUMYA (17271A05A6)
V. SWAPOORVA (17271A05B2)
G. SWATHI (17271A05B3)

Date:
Place: Karimnagar
ABSTRACT

With the rapid development of cloud storage, an increasing number of data owners prefer to outsource
their data to the cloud server, which can greatly reduce the local storage overhead. Because different cloud
service providers offer distinct qualities of data storage service, e.g., security, reliability, access speed and
price, cloud data transfer has become a fundamental requirement for data owners who wish to change their cloud
service provider. Hence, how to securely migrate the data from one cloud to another and permanently delete the
transferred data from the original cloud becomes a primary concern of data owners. To solve this problem, we
construct a new counting Bloom filter-based scheme in this project. The proposed scheme can not only achieve
secure data transfer but also realize permanent data deletion. Additionally, the proposed scheme satisfies
public verifiability without requiring any trusted third party. Finally, we also develop a simulation
implementation that demonstrates the practicality and efficiency of our proposal.
Table of Contents
LIST OF TABLES i
LIST OF FIGURES ii
1. INTRODUCTION 1-5
1.1 Introduction 1-2
1.2 Existing System 2
1.3 Problem Statement 2-3
1.3.1 System Framework 2-3
1.3.2 Design Goals 3
1.4 Proposed System 3-5
1.4.1 Proposed system 3-5
1.4.2 Objectives 5

2. LITERATURE REVIEW 6-8

3. REQUIREMENTS & DOMAIN INFORMATION 9-14


3.1 Requirement Specifications 9
3.1.1 Hardware Requirements 9
3.1.2 Software Requirements 9
3.2 Domain Information 9-14

4. SYSTEM METHODOLOGY 15-23


4.1 Architecture of Proposed System 15
4.2 Modules 16-17
4.3 System Design 18-23
4.3.1 Data Flow Diagrams 18-19
4.3.2 UML Diagrams 20-23
4.3.2.1 Use case Diagram 20
4.3.2.2 Class Diagram 21
4.3.2.3 Sequence Diagram 22
4.3.2.4 Flowchart 23-24

5. EXPERIMENTATION & ANALYSIS 25-41


5.1 Experimentation 25-27
5.2 Algorithms 27-29
5.2.1 Counting Bloom Filter 29
5.2.2 AES Algorithm
5.3 Testing 29-34
5.3.1 Types of Testing 29-31
5.3.2 Other Testing Methodologies 32-33
5.4 Test Cases 33-34
5.5 Results 34-41
5.5.1 Results 34-36
5.5.2 Screenshots 37-41

6 CONCLUSION AND FUTURE SCOPE 42


6.1 Conclusion 42
6.2 Future Scope 42

REFERENCES
LIST OF TABLES

TABLE NO DESCRIPTION PAGE NO

5.4 Test Cases 33-34

i
LIST OF FIGURES
FIGURE DESCRIPTION PAGE NO

1.3.1 System Framework 3

4.1 Architecture of System 15


4.3 System Design
4.3.1 Data Flow Diagrams 18-19
4.3.2 UML Diagrams
4.3.2.1 Use case Diagram 20
4.3.2.2 Class Diagram 21
4.3.2.3 Sequence Diagram 22
4.3.2.4 Flow Chart 23

5.1 The Main Process of the System 25


5.2.1 Algorithms
5.2.1.1 An Example of Bloom Filter 28
5.2.1.2 An Example of Counting Bloom Filter 28
5.5. Results
5.5.1.1 The time cost of data encryption 34
5.5.1.2 The time of storage proof generation 35
5.5.1.3 The time of storage result verification 35
5.5.1.4 The time cost of data transfer 36
5.5.1.5 The time cost of data deletion 36

ii
CHAPTER-1
INTRODUCTION
1.1. Introduction

Cloud computing, an emerging and very promising computing paradigm, connects large-scale
distributed storage resources, computing resources and network bandwidths together. By using these
resources, it can provide tenants with plenty of high-quality cloud services. Due to the attractive advantages,
the services (especially cloud storage service) have been widely applied, by which resource-constrained data
owners can outsource their data to the cloud server, which can greatly reduce the data owners’ local storage
overhead. According to the report of Cisco, the number of Internet consumers will reach about 3.6 billion in
2019, and about 55 percent of them will employ cloud storage service. Because of the promising market
prospect, an increasing number of companies (e.g., Microsoft, Amazon, Alibaba) offer data owners cloud
storage service with different prices, security, access speed, etc. To enjoy more suitable cloud storage service,
the data owners might change the cloud storage service providers. Hence, they might migrate their outsourced
data from one cloud to another, and then delete the transferred data from the original cloud. According to
Cisco, the cloud traffic is expected to be 95% of the total traffic by the end of 2021, and almost 14% of the
total cloud traffic will be the traffic between different cloud data centers. Foreseeably, the outsourced data
transfer will become a fundamental requirement from the data owners’ point of view.
To realize secure data migration, an outsourced data transfer app, Cloudsfer, has been designed
utilizing cryptographic algorithms to protect the data from privacy disclosure in the transfer phase. But there
are still some security problems in processing the cloud data migration and deletion. Firstly, to save network
bandwidth, the cloud server might merely migrate part of the data, or even deliver some unrelated data to cheat
the data owner. Secondly, because of network instability, some data blocks may be lost during the transfer
process. Meanwhile, the adversary may destroy the transferred data blocks. Hence, the transferred data may
be polluted during the migration process. Last but not least, the original cloud server might maliciously retain
the transferred data to mine implicit benefits from it. Such data retention is unexpected from the data owners’
point of view. In short, the cloud storage service is economically attractive, but it inevitably suffers from some
serious security challenges, specifically secure data transfer, integrity verification, and verifiable deletion.
These challenges, if not solved suitably, might prevent the public from accepting and employing cloud storage
service.

Contributions: In this work, we study the problems of secure data transfer and deletion in cloud storage,
and focus on realizing public verifiability. Then we propose a counting Bloom filter-based scheme, which
can not only realize provable data transfer between two different clouds but also achieve publicly verifiable
data deletion. If the original cloud server does not migrate or remove the data honestly, the verifier (the data

1
owner and the target cloud server) can detect these malicious operations by verifying the returned transfer and
deletion evidence. Moreover, our proposed scheme does not need any trusted third party (TTP), which is
different from the existing solutions. Furthermore, we prove that our new proposal can satisfy the desired
design goals through security analysis. Finally, the simulation experiments show that our new proposal is
efficient and practical.

1.2. Existing System


 Xue et al. studied the goal of secure data deletion, and put forward a key-policy attribute-based
encryption scheme, which can achieve fine-grained data access control and assured deletion. They
achieve data deletion by removing the attribute and use a Merkle hash tree (MHT) to achieve verifiability,
but their scheme requires a trusted authority.
 Du et al. designed a scheme called the Associated deletion scheme for multi-copy (ADM), which uses a pre-
deleting sequence and MHT to achieve data integrity verification and provable deletion. However,
their scheme also requires a TTP to manage the data keys. In 2018, Yang et al. presented a Blockchain-
based cloud data deletion scheme, in which the cloud executes the deletion operation and publishes the
corresponding deletion evidence on the Blockchain. Then any verifier can check the deletion result by
verifying the deletion proof. Besides, they remove the bottleneck of requiring a TTP. Although these
schemes all can achieve verifiable data deletion, they cannot realize secure data transfer.

Drawbacks Of Existing System:


 In the existing work, the system does not provide a data integrity proof.
 The system’s performance is lower due to the lack of strong encryption techniques.

1.3. Problem Statement

In the following, we briefly introduce the system framework and security goals.

1.3.1. System framework:


In our system, we aim to achieve verifiable data transfer between two different clouds and reliable
data deletion in cloud storage. Hence, three entities are included in our new construction, as shown in Fig.1.3.1.

2
Fig-1.3.1: The System framework

In our scenario, the resource-constrained data owner might outsource his large-scale data to the cloud
server A to greatly reduce the local storage overhead. Besides, the data owner might require the cloud A to
move some data to the cloud B, or delete some data from the storage medium. The cloud A and cloud B
provide the data owner with cloud storage service. We assume that the cloud A is the original cloud, which
will be required to migrate some data to the target cloud B, and remove the transferred data. However, the
cloud A might not execute these operations sincerely for economic reasons. Moreover, we assume that the
cloud A and cloud B will not collude together to mislead the data owner because they belong to two different
companies. Hence, the two clouds will independently follow the protocol. Furthermore, we assume that the
target cloud B will not maliciously slander the original cloud A.

1.3.2. Design goals

This system should realize the following three goals.


 Data confidentiality: The outsourced file may contain some private information that should be kept
secret. Hence, to protect the data confidentiality, the data owner needs to use secure algorithms to
encrypt the file before uploading it to the cloud server.
 Data integrity: The cloud A might only migrate part of the data, or deliver some unrelated data to
the cloud B. Besides, the data might be polluted during the transfer process. Hence, the data owner
and the cloud B should be able to verify the transferred data integrity to guarantee that the transferred
data is intact.
 Public verifiability: The cloud A may not move the data to the cloud B or delete the data faithfully.
So, the verifiability of the transfer and deletion results should be satisfied from the data owner’s point
of view.
3
1.4. Proposed System

1.4.1. Proposed System:


The proposed system does not need any trusted third party (TTP), which is different from the
existing system. Furthermore, we prove that our new proposal can satisfy the desired design goals through
security analysis. The proposed system can not only achieve secure data transfer but also realize permanent
data deletion. Additionally, the proposed system can satisfy public verifiability without requiring any
trusted third party.

Overview:

In the proposed work, the system studies the problems of secure data transfer and deletion in cloud
storage, and focuses on realizing public verifiability. Then the system proposes a counting Bloom filter-
based scheme, which can not only realize provable data transfer between two different clouds but also
achieve publicly verifiable data deletion. If the original cloud server does not migrate or remove the data
honestly, the verifier (the data owner and the target cloud server) can detect these malicious operations by
verifying the returned transfer and deletion evidence.
Firstly, the data owner encrypts the data and outsources the ciphertext to the cloud A. Then he checks
the storage result and deletes the local backup. Later, the data owner may change the cloud storage service
provider and migrate some data from cloud A to cloud B. After that, the data owner checks the transfer
result. Finally, when the data transfer is successful, the data owner requires the cloud A to remove the
transferred data and checks the deletion result.

The concrete scheme:


Our new proposed system contains the following six steps.
1) Initialization:
Generate public/private key pairs (PKO, SKO), (PKA, SKA) and (PKB, SKB) for the data
owner, the cloud A and the cloud B, respectively.
2) Data encryption:
To protect the data confidentiality, the data owner uses a secure encryption algorithm to encrypt the
outsourced file before uploading.
3) Data outsourcing:

The cloud A stores D and generates storage proof. Then the data owner checks the storage result and
deletes the local backup.
4) Data transfer:

When the data owner wants to change the service provider, he migrates some data blocks, or even the
whole file from the cloud A to the cloud B.
4
5) Transfer check:

The cloud B checks the correctness of the transfer and returns the transfer result to the data
owner.

6) Data deletion:

The data owner might require the cloud A to delete some data blocks when they have been transferred
to the cloud B successfully.
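
To make the flow of these six steps easier to follow, the outline below sketches them as a single Java interface. This is only an illustrative summary of the narrative above: the method names and the use of plain byte arrays are assumptions introduced here, not part of the scheme's specification.

// Illustrative outline of the six protocol steps (assumed structure, not the
// authors' code); every message and proof is reduced to a byte array so the
// sketch stays self-contained.
public interface SecureTransferProtocol {
    byte[][] initialize();                            // 1) key pairs for the owner, cloud A and cloud B
    byte[]   encrypt(byte[] file, byte[] ownerKey);   // 2) encrypt the file before uploading
    byte[]   outsource(byte[] ciphertext);            // 3) cloud A stores the data and returns a storage proof
    byte[]   transfer(byte[] transferRequest);        // 4) migrate the selected blocks from cloud A to cloud B
    boolean  verifyTransfer(byte[] transferProof);    // 5) cloud B and the owner check the transfer result
    byte[]   delete(byte[] deletionRequest);          // 6) cloud A deletes the blocks and returns deletion evidence
}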

1.4.2. Objectives of Proposed System

 To solve this problem, we propose a new counting Bloom filter-based scheme in this project.
 The proposed scheme not only can achieve secure data transfer but also can realize
permanent data deletion.
 Additionally, the proposed scheme can satisfy public verifiability without requiring any
trusted third party. Here we use a new counting Bloom filter-based scheme.
 The cloud storage service provider must authenticate the data owner.

5
CHAPTER-2
LITERATURE SURVEY

Verifiable data deletion has been well studied for a long time, resulting in many solutions. Xue et al.
studied the goal of secure data deletion and put forward a key-policy attribute-based encryption scheme, which
can achieve fine-grained data access control and assured deletion. They achieve data deletion by removing the
attribute and use a Merkle hash tree (MHT) to achieve verifiability, but their scheme requires a trusted authority.
Du et al. designed a scheme called the Associated deletion scheme for multi-copy (ADM), which uses a pre-deleting
sequence and MHT to achieve data integrity verification and provable deletion. However, their scheme also
requires a TTP to manage the data keys. In 2018, Yang et al. presented a Blockchain-based cloud data deletion
scheme, in which the cloud executes the deletion operation and publishes the corresponding deletion evidence on
the Blockchain. Then any verifier can check the deletion result by verifying the deletion proof. Besides, they remove
the bottleneck of requiring a TTP.
Although these schemes all can achieve verifiable data deletion, they cannot realize secure data
transfer. To migrate the data from one cloud to another and delete the transferred data from the original cloud,
many methods have been proposed. In 2015, Yu et al. presented a Provable data possession (PDP) scheme
that can also support secure data migration. To the best of our knowledge, their scheme is the first one to solve
the data transfer between two clouds efficiently, but it is inefficient in the data deletion process since they achieve
deletion by re-encrypting the transferred data, which requires the data owner to provide a great deal of information.
Xue et al. designed a provable data migration scheme, which is characterized by PDP and verifiable deletion.
The data owner can check the data integrity through the PDP protocol and verify the deletion result by a Rank-based
Merkle hash tree (RMHT). However, Liu et al. pointed out that there exists a security flaw in the scheme and
they designed an improved scheme that can solve the security flaw. In 2018, Yang et al. adopted vector
commitment to design a new data transfer and deletion scheme, which offers the data owner the ability to
verify the transfer and deletion results without any TTP. Moreover, their scheme can realize data integrity
verification on the target cloud.

1. B. Varghese and R. Buyya, “Next generation cloud computing: New trends and research directions”:
The Landscape od computing has significantly changed over the last decade. Not only have more
providers and service offerings crowded the space, but also cloud infrastructure that was traditionally limited
to single provider data centers is now evolving. In this, we firstly discuss the changing cloud infrastructure
and consider the use of infrastructure away from data centers. These trends have resulted in the need for a
variety of new computing architectures that will be offered by future cloud infrastructure.

6
2. W. Shen, J. Qin, J. Yu, et al, “Enabling identity-based integrity auditing and data sharing with
sensitive information hiding for secure cloud storage”:
With cloud storage services, users can remotely store their data to the cloud and realize the data sharing
with others. Remote data integrity auditing is proposed to guarantee the integrity of the data stored in the
cloud. In some common cloud storage systems such as the electronic health records system, the cloud file
might contain some sensitive information. The sensitive information should not be exposed to others when the
cloud file is shared. Encrypting the whole shared file can realize the sensitive information hiding, but will
make this shared file unable to be used by others.

3. R. Kaur, I. Chaua and J. Bhattacharya, “Data deduplication techniques for efficient cloud storage
management: A systematic review”:
The exponential growth of digital data in cloud storage systems is presently a critical issue, as a large
amount of duplicate data in the storage systems exerts an extra load on them. Deduplication is an efficient
technique that has gained attention in large-scale storage systems. Deduplication eliminates redundant data,
improves storage utilization and reduces storage cost. This paper presents a broad methodical literature review
of existing data deduplication techniques along with various existing taxonomies of deduplication that have
been based on cloud storage.

4. K. Ren, C. Wang, and Q. Wang, “Security challenges for the public cloud”:
Cloud computing represents today's most exciting computing paradigm shift in
information technology. However, security and privacy are perceived as primary obstacles to its wide
adoption. Here, the authors outline several critical security challenges and motivate further
investigation of security solutions for a trustworthy public cloud environment.

5. U. Adhikari, T. H. Morris, and S. Pan, “Applying non-nested generalized exemplars classification for
cyber-power event and intrusion detection”:
Non-nested generalized exemplars (NNGE) is a state-of-the-art data mining algorithm which uses
distance between a new example and a set of exemplars for classification. The state extraction method (STEM)
preprocesses power system wide area measurement system data to reduce data size while maintaining critical
patterns. Together NNGE+STEM make an effective event and intrusion detection system which can
effectively classify power system events and cyber-attacks in real time. This paper documents the results of
two experiments in which NNGE+STEM was used to classify cyber power contingency, control action, and
cyber-attack events.

6. R. Curtmola, J. Garay, S. Kamara, and R. Ostrovsky, “Searchable symmetric encryption: improved
definitions and efficient constructions”:

7
Searchable symmetric encryption (SSE) allows a party to outsource the storage of his data to another
party in a private manner, while maintaining the ability to selectively search over it. This problem has been
the focus of active research and several security definitions and constructions have been
proposed. In this paper we begin by reviewing existing notions of security and propose new and
stronger security definitions. We then present two constructions that we show secure under our new
definitions. Interestingly, in addition to satisfying stronger security guarantees, our constructions are more
efficient than all previous constructions. Further, prior work on SSE only considered the setting where only
the owner of the data is capable of submitting search queries. We consider the natural extension
where an arbitrary group of parties other than the owner can submit search queries. We formally define
SSE in this multi-user setting, and present an efficient construction.

8
CHAPTER-3
REQUIREMENTS & DOMAIN INFORMATION

3.1. Requirement Specification


3.1.1. Hardware Requirements:

 Processor - i3/Intel Processor


 RAM - 4 GB (min)
 Hard Disk - 20 GB

3.1.2. Software Requirements:

 Operating System : Windows 7/8/10


 Application Server : Tomcat 9.0
 Front End : HTML, JSP
 Scripts : JavaScript.
 Server-side Script : Java Server Pages.
 Database : MySQL 6.0
 Database Connectivity : JDBC.

3.2. Domain Information


Cloud Computing:
Cloud computing is a type of Internet-based computing that provides shared computer processing
resources and data to computers and other devices on demand. It is a model for enabling ubiquitous, on-
demand access to a shared pool of configurable computing resources (e.g., computer networks, servers,
storage, applications and services), which can be rapidly provisioned and released with minimal management
effort. Cloud computing and storage solutions provide users and enterprises with various capabilities to store
and process their data in third-party data centers that may be located far from the user–ranging in distance
from across a city to across the world. Cloud computing relies on sharing of resources to achieve coherence
and economy of scale, similar to a utility. Cloud computing is the on-demand availability of computer system
resources, especially data storage (cloud storage) and computing power, without direct active management by
the user. The term is generally used to describe data centers available to many users over the Internet. Large

9
clouds, predominant today, often have functions distributed over multiple locations from central servers. If the
connection to the user is relatively close, it may be designated an edge server.
Cloud computing is the delivery of computing services including servers, storage, databases,
networking, software, analytics, and intelligence over the Internet (“the cloud”) to offer faster innovation,
flexible resources, and economies of scale. You typically pay only for cloud services you use, helping lower
your operating costs, run your infrastructure more efficiently and scale as your business needs change. Cloud
computing is named as such because the information being accessed is found remotely in the cloud or a virtual
space. Companies that provide cloud services enable users to store files and applications on remote servers
and then access all the data via the Internet. This means the user is not required to be in a specific place to gain
access to it, allowing the user to work remotely.

Client Server

Among the varied topics in existence in the field of computers, client server is one which has generated
more heat than light, and also more hype than reality. This technology has acquired a certain critical mass of
attention with its dedicated conferences and magazines. Major computer vendors such as IBM and DEC have
declared that client server is their main future market. A survey of DBMS magazine revealed that 76% of
its readers were actively looking at the client server solution. The client server development tools market grew
from $200 million in 1992 to more than $1.2 billion in 1996.

Client server implementations are complex but the underlying concept is simple and powerful. A client
is an application running with local resources but able to request database and related services from a
separate remote server. The software mediating this client server interaction is often referred to as
MIDDLEWARE.

The typical client, either a PC or a workstation, is connected through a network to a more powerful PC,
workstation, midrange or mainframe server, usually capable of handling requests from more than one client.
However, in some configurations a server may also act as a client. A server may need to access other servers in
order to process the original client request.

The key client server idea is that the client, as a user, is essentially insulated from the physical location and
formats of the data needed for their application. With the proper middleware, a client input form or report can
transparently access and manipulate both local databases on the client machine and remote databases on one or
more servers. An added bonus is that client server opens the door to multi-vendor database access, including
heterogeneous table joins.

What is a Client Server


Two prominent systems in existence are client server and file server systems. It is essential to
distinguish between client servers and file server systems. Both provide shared network access to data but the
comparison ends there! The file server simply provides a remote disk drive that can be accessed by LAN
10
applications on a file-by-file basis. The client server offers full relational database services such as SQL
access, record modification, insert and delete with full relational integrity, backup/restore, and performance for high
volumes of transactions, etc. The client server middleware provides a flexible interface between client and
server: who does what, when and to whom.

Why Client Server


Client server has evolved to solve a problem that has been around since the earliest days of computing:
how best to distribute your computing, data generation and data storage resources in order to obtain efficient,
cost-effective departmental and enterprise-wide data processing. During the mainframe era, choices were quite
limited. A central machine housed both the CPU and data (cards, tapes, drums and later disks). Access to
these resources was initially confined to batched runs that produced departmental reports at the appropriate
intervals. A strong central information service department ruled the corporation. The role of the rest of the
corporation was limited to requesting new or more frequent reports and to providing handwritten forms from which
the central data banks were created and updated. The earliest client server solutions therefore could best be
characterized as “SLAVE-MASTER”.

Time-sharing changed the picture. Remote terminals could view and even change the central data,
subject to access permissions. And, as the central data banks evolved into sophisticated relational databases
with non-programmer query languages, online users could formulate ad hoc queries and produce local reports
without adding to the MIS applications software backlog. However, remote access was through dumb
terminals, and the client still remained subordinate in the slave/master relationship.

Front End or User Interface Design

The entire user interface is planned to be developed in a browser-specific environment with a touch of
intranet-based architecture for achieving the distributed concept. The browser-specific components are
designed by using the HTML standards, and the dynamism of the design is achieved by concentrating on the
constructs of the Java Server Pages.

Communication or Database Connectivity Tier

The Communication architecture is designed by concentrating on the Standards of Servlets and


Enterprise Java Beans. The database connectivity is established by using Java Database Connectivity (JDBC).
The standards of three-tier architecture are given major concentration to keep the standards of higher cohesion
and limited coupling for effectiveness of the operations.

Features of the Language Used

In our project, we have chosen the Java language for developing the code.

11
About Java(J2EE)

Initially the language was called “Oak”, but it was renamed “Java” in 1995. The primary
motivation of this language was the need for a platform-independent (i.e., architecture neutral) language that
could be used to create software to be embedded in various consumer electronic devices.

 Java is a programmer’s language.

 Java is cohesive and consistent.

 Except for those constraints imposed by the Internet environment, Java gives the programmer
full control.
Finally, Java is to Internet programming what C was to systems programming.

Importance of Java to the Internet

Java has had a profound effect on the Internet. This is because Java expands the universe of objects
that can move about freely in cyberspace. In a network, two categories of objects are transmitted between
the server and the personal computer. They are: passive information and dynamic active programs. The
dynamic, self-executing programs cause serious problems in the areas of security and portability. But Java
addresses those concerns and by doing so, has opened the door to an exciting new form of program called
the Applet.

Java can be used to create two types of programs:

Applications & Applets:

An application is a program that runs on our computer under the operating system of that computer.
It is more or less like one created using C or C++. Java’s ability to create applets makes it important. An
applet is an application designed to be transmitted over the Internet and executed by a Java-compatible web
browser. An applet is actually a tiny Java program, dynamically downloaded across the network, just like an
image. But the difference is, it is an intelligent program, not just a media file. It can react to the user input and
dynamically change.

Java Script
JavaScript is a script-based programming language that was developed by Netscape Communication
Corporation. JavaScript was originally called LiveScript and renamed JavaScript to indicate its relationship
with Java. JavaScript supports the development of both client and server components of Web-based
applications. On the client side, it can be used to write programs that are executed by a Web browser within
the context of a Web page. On the server side, it can be used to write Web server programs that can process
information submitted by a Web browser and then updates the browser’s display accordingly. Even though

12
JavaScript supports both client and server Web programming, we prefer JavaScript for client-side
programming since most browsers support it. JavaScript is almost as easy to learn as HTML, and
JavaScript statements can be included in HTML documents by enclosing the statements between a pair of
scripting tags.
<SCRIPT>...</SCRIPT>, for example:

<SCRIPT LANGUAGE = “JavaScript”>

JavaScript statements

</SCRIPT>
Here are a few things we can do with JavaScript:
 Validate the contents of a form and make calculations.
 Add scrolling or changing messages to the Browser’s status line.
 Animate images or rotate images that change when we move the mouse over them.
 Detect the browser in use and display different content for different browsers.
 Detect installed plug-ins and notify the user if a plug-in is required.
We can do much more with JavaScript, including creating entire application.

Hyper Text Markup Language

Hypertext Markup Language (HTML), the language of the World Wide Web (WWW), allows users
to produce Web pages that include text, graphics and pointers to other Web pages (hyperlinks). HTML is not
a programming language; it is an application of ISO Standard 8879, SGML (Standard Generalized
Markup Language), specialized to hypertext and adapted to the Web. The idea behind hypertext is that
instead of reading text in rigid linear structure, we can easily jump from one point to another point. We can
navigate through the information based on our interest and preference. A markup language is simply a series
of elements, each delimited with special characters that define how text or other items enclosed within the
elements should be displayed. Hyperlinks are underlined or emphasized words that link to other documents
or to some portions of the same document.
HTML can be used to display any type of document on the host computer, which can be geographically
at a different location. It is a versatile language and can be used on any platform or desktop.
HTML provides tags (special codes) to make the document look attractive. HTML tags are not case-
sensitive. Using graphics, fonts, different sizes, color, etc., can enhance the presentation of the document.
Anything that is not a tag is part of the document itself.

Java Database Connectivity

13
JDBC is a Java API for executing SQL statements. (As a point of interest, JDBC is a trademarked
name and is not an acronym; nevertheless, JDBC is often thought of as standing for Java Database
Connectivity.) It consists of a set of classes and interfaces written in the Java programming language. JDBC
provides a standard API for tool/database developers and makes it possible to write database applications using
a pure Java API.
Using JDBC, it is easy to send SQL statements to virtually any relational database. One can write a
single program using the JDBC API, and the program will be able to send SQL statements to the appropriate
database. The combination of Java and JDBC lets a programmer write a program once and run it anywhere.
What Does JDBC Do?

Simply put, JDBC makes it possible to do three things, as the short sketch after this list illustrates:


 Establish a connection with a database
 Send SQL statements
 Process the results.
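
A minimal sketch of these three steps in Java is given below. The connection URL, the credentials and the files table are placeholders invented for illustration; only the standard java.sql API calls are assumed.

// Hypothetical JDBC sketch: open a connection, send an SQL statement,
// and process the results. URL, credentials and table are placeholders.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class JdbcSketch {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:mysql://localhost:3306/clouddb";   // placeholder database URL
        try (Connection con = DriverManager.getConnection(url, "user", "password");
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("SELECT file_id, file_name FROM files")) {
            while (rs.next()) {                               // process each row of the result set
                System.out.println(rs.getInt("file_id") + " : " + rs.getString("file_name"));
            }
        }
    }
}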
JDBC versus ODBC and other APIs

At this point, Microsoft's ODBC (Open Database Connectivity) API is probably the most widely
used programming interface for accessing relational databases. It offers the ability to connect to almost all
databases on almost all platforms.
1. ODBC is not appropriate for direct use from Java because it uses a C interface. Calls from Java to
native C code have a number of drawbacks in the security, implementation, robustness, and automatic
portability of applications.
2. A literal translation of the ODBC C API into a Java API would not be desirable. For example, Java has
no pointers, and ODBC makes copious use of them, including the notoriously error-prone generic
pointer "void *". You can think of JDBC as ODBC translated into an object-oriented interface that is
natural for Java programmers.
3. ODBC is hard to learn. It mixes simple and advanced features together, and it has complex options
even for simple queries. JDBC, on the other hand, was designed to keep simple things simple while
allowing more advanced capabilities where required.
4. A Java API like JDBC is needed in order to enable a "pure Java" solution. When ODBC is used, the
ODBC driver manager and drivers must be manually installed on every client machine. When the
JDBC driver is written completely in Java, however, JDBC code is automatically installable, portable,
and secure on all Java platforms from network computers to mainframes.

14
CHAPTER-4
SYSTEM METHODOLOGY

4.1. Architecture of System

Fig:4.1 Architecture of the System

15
4.2 . MODULES
4.2.1. Multi-cloud:
Lots of data centers are distributed around the world, and one region, such as America or Asia, usually
has several data centers belonging to the same or different cloud providers. So technically all the data centers
can be accessed by a user in a certain region, but the user would experience different performance. The latency
of some data centers is very low while that of others may be intolerably high. The system chooses clouds for
storing data from all the available clouds which meet the performance requirement, that is, they can offer
acceptable throughput and latency when they are not in outage. The storage mode transition does not impact
the performance of the service. Since it is not a latency-sensitive process, we can decrease the priority of
transition operations, and implement the transition in batch when the proxy has low workload.

4.2.2. Data Owner:


In this section, we elaborate a cost-efficient data hosting model with high availability in heterogeneous
multi-cloud, named “MULTI CLOUD”. The architecture of this model is shown in the figure above. The whole
model is located in the proxy in this system. There are four main components in MULTI CLOUD: Data
Hosting, Storage Mode Switching (SMS), Workload Statistic, and Predictor. Workload Statistic keeps
collecting and tackling access logs to guide the placement of data. It also sends statistic information to
Predictor which guides the action of SMS. Data Hosting stores data using replication or erasure coding,
according to the size and access frequency of the data. SMS decides whether the storage mode of certain data
should be changed from replication to erasure coding or in reverse, according to the output of Predictor. The
implementation of changing storage mode runs in the background, in order not to impact online service.
Predictor is used to predict the future access frequency of files. The time interval for prediction is one month,
that is, we use the former months to predict access frequency of files in the next month.
However, we do not put emphasis on the design of predictor, because there have been lots of good
algorithms for prediction. Moreover, a very simple predictor, which uses the weighted moving average
approach, works well in our data hosting model. Data Hosting and SMS are two important modules in MULTI
CLOUD. Data Hosting decides storage mode and the clouds that the data should be stored in. This is a complex
integer programming problem demonstrated in the following subsections. Then we illustrate how SMS works
in detail, that is, when and how many times the transition should be implemented.

4.2.3. Cloud Storage:


Cloud storage services have become increasingly popular. Because of the importance of privacy, many
cloud storage encryption schemes have been proposed to protect data from those who do not have access. All
such schemes assumed that cloud storage providers are safe and cannot be hacked; however, in practice, some
authorities (i.e., coercers) may force cloud storage providers to reveal user secrets or confidential data on the

16
cloud, thus altogether circumventing storage encryption schemes. We present our design for a new cloud
storage encryption scheme that enables cloud storage providers to create convincing fake user secrets to protect
user privacy. Since coercers cannot tell if obtained secrets are true or not, the cloud storage providers ensure
that user privacy is still securely protected. Most of the proposed schemes assume cloud storage service
providers or trusted third parties handling key management are trusted and cannot be hacked; however, in
practice, some entities may intercept communications between users and cloud storage providers and then
compel storage providers to release user secrets by using government power or other means. In this case,
encrypted data are assumed to be known and storage providers are requested to release user secrets. We aimed
to build an encryption scheme that could help cloud storage providers avoid this predicament. In our approach,
we offer cloud storage providers the means to create fake user secrets. Given such fake user secrets, outside
coercers can only obtain forged data from a user’s stored ciphertext. Once coercers think the received secrets
are real, they will be satisfied and more importantly cloud storage providers will not have revealed any real
secrets. Therefore, user privacy is still protected. This concept comes from a special kind of encryption scheme
called deniable encryption.

4.2.4. Owner Module:


The Owner module allows data owners to upload their files using some access policy. First, the owner gets a
public key for a particular uploaded file; after getting this public key, the owner requests the secret key for that file.
Using that secret key, the owner uploads the file and can perform the following operations: find all cost and memory
details, view the owner’s VM details and purchase, browse, encrypt and upload a file, check the data integrity proof,
transfer data from one cloud to another based on the price (Storage Mode Switching), and check all cloud VM
details and the price list.

4.2.5. User Module:


This module is used to help the client search for a file using the file ID and file name. If the file ID and
name are incorrect, the user does not get the file; otherwise the server asks for the secret key and returns the
encrypted file. If the user wants the decrypted file, the user must have the secret key. The module also performs:
view all attackers, view resource utilization profiles (total memory used for each and every data owner), view all
VM and price details, and resource migration checkpointing (if it exceeds the threshold).

17
4.3. System Design

4.3.1. Data Flow Diagrams:


A Data Flow Diagram (DFD) is a traditional visual representation of the information flows within a
system. It shows how data enters and leaves the system, what changes the information, and where data is
stored. The objective of a DFD is to show the scope and boundaries of a system as a whole.

Level-0:

Fig:4.3.1 Level-0 Data Flow diagram

In Level 0 of the data flow diagram, the data owner can upload files to the cloud servers. Transaction
details, such as logs for accessing the data and uploading, etc., are sent to the proxy server.

18
Level-1:

In Level 1, the receiver requests a file from the cloud servers. Here the cloud server checks the file name and
secret key of the file. If they are entered correctly, then the authorized file is sent to the receiver. If they are wrong,
the receiver is asked to enter the correct file name and secret key.

Fig:4.3.1 Level-1 Data Flow Diagram


Level-2:
In Level 2, the cloud servers report the data integrity check to the proxy server. Meanwhile, the data owner
requests the proxy server to verify the file. The proxy server responds to the data owner with the data integrity
verification.

Fig:4.3.1 Level-2 Data Flow Diagram

19
4.3.2.UML Diagrams
A UML diagram is a diagram based on the UML (Unified Modeling Language) with the purpose
of visually representing a system along with its main actors, roles, actions, artifacts or classes, in order to better
understand, alter, maintain, or document information about the system. UML is a modern approach to
modelling and documenting software. In fact, it’s one of the most popular business process modelling
techniques. It is based on diagrammatic representations of software components.

4.3.2.1. Usecase diagram:


In UML, usecase diagrams model the behavior of a system and help to capture the requirements of the
system. Usecase diagrams describe the high-level functions and scope of a system. These diagrams also
identify the interactions between the system and its actors. Usecase diagrams are a set of use cases, actors,
and their relationships. They represent the usecase view of a system. A usecase represents a particular
functionality of a system.

Fig:4.3.2.1 Usecase diagram of proposed system

20
The owner module is used to upload files using some access policy, and performs: view the owner’s VM
details and purchase, browse, encrypt and upload a file, transfer data from one cloud to another based on the
price, and check all cloud VM details and the price list. Cloud servers can authorize and store files, and
can also show the owner files and registered users. The end user can request files from the cloud
server and receive files. The attacker may modify a file without the cloud server’s response.

4.3.2.2. Class Diagram:


Class diagram is a static diagram. It represents the static view of an application. Class diagram is not
only used for visualizing, describing, and documenting different aspects of a system but also for constructing
executable code of the software application.

Class diagram describes the attributes and operations of a class and also the constraints imposed on
the system. The class diagrams are widely used in the modelling of object-oriented systems because they are
the only UML diagrams, which can be mapped directly with object-oriented languages.

Fig:4.3.2.2 Class Diagram

Classes are the main building blocks in object-oriented design; the diagram shows the different attributes and
methods (operations) and the relationships among the data owner and the cloud server’s, proxy server’s
21
and end user’s functionalities. The attacker can attack the cloud server to view and modify data, i.e., hack the
server and misuse or steal the data.

4.3.2.3. Sequence Diagram:


A sequence diagram is a type of interaction diagram because it describes how and in what order a
group of objects works together. These diagrams are used by software developers and business professionals
to understand requirements for a new system or to document an existing process.

Fig:4.3.2.3 Sequence Diagram

The data owner registers to the cloud and, if the registration is successful, logs in to the cloud. The data
owner requests a VM from the cloud. Here, the data owner can upload files. The cloud servers can view user
files in the cloud. The data owner can verify the data integrity and check the file storage confirmation. The end user
registers to the proxy server and, if registration is successfully completed, logs in.

22
The cloud servers may authorize files, view the user files and view users in the cloud. The end user can
send a file-open request to the cloud. The cloud servers can respond to the request. Here the data owner checks the
file integrity. The proxy server automatically checks the MAC value. The data owner can transfer a file from
one cloud to another cloud. After the file is successfully transferred to another cloud, it is deleted from the
existing cloud VM. The end user can view blocked users and can also unblock them.

4.3.2.4. Flow Chart:


A flowchart is a picture of the separate steps of a process in sequential order. It is a generic tool that
can be adapted for a wide variety of purposes, and can be used to describe various processes, such as a
manufacturing process, an administrative or service process, or a project plan. A flowchart is a type
of diagram that represents a workflow or process. A flowchart can also be defined as a diagrammatic
representation of an algorithm, a step-by-step approach to solving a task.

The flowchart shows the steps as boxes of various kinds, and their order by connecting the boxes with
arrows. This diagrammatic representation illustrates a solution model to a given problem. Flowcharts are used
in analysing, designing, documenting or managing a process or program in various fields.

Fig:4.3.2.4 Flow Chart of the System

23
Firstly, the data owner registers to the cloud. If the data owner is successfully registered to the cloud,
then the owner logs in, assigns memory and a threshold to the VM, and browses and uploads the files. The proxy
server may check the number of files in the cloud. Here, the data owner verifies the data integrity on the cloud
servers. The end user can send a file-open request to the data owner. The owner can respond to that request. Then
the end user can open the file and download the file. The data owner can transfer a file from one cloud to another
cloud; after a successful transfer of the file, the existing file can be deleted from the original cloud.

24
CHAPTER-5
EXPERIMENT ANALYSIS

5.1. Experimentation

This proposed system contains the following six steps.


1. Initialization:
Generate ECDSA public/private key pairs (PKO, SKO), (PKA, SKA) and (PKB, SKB) for the data
owner, the cloud A and the cloud B, respectively. Then the data owner chooses k secure hash functions
g1, g2, · · ·, gk that all map any integer in [1, N] to distinct cells in CBF, i.e., gi: [1, N] → [1, m].
Additionally, the data owner chooses a unique tag tagf for the file that will be outsourced to the cloud
A.
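
A small Java sketch of this initialization step is given below. It assumes ECDSA key pairs over the P-256 curve and realizes the k hash functions gi by hashing the function number i together with the block index; this particular construction of gi is an assumption made for illustration only.

// Sketch of step 1: one ECDSA key pair per party and k index-hash functions
// g_i : [1, N] -> [1, m]. The derivation of g_i from SHA-256 is an assumption.
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.MessageDigest;

public class InitSketch {

    // Generate one ECDSA (elliptic-curve) key pair, e.g. (PKO, SKO) for the data owner.
    static KeyPair generateKeyPair() throws Exception {
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("EC");
        kpg.initialize(256);                              // P-256 curve
        return kpg.generateKeyPair();
    }

    // g_i(index): map a block index into a cell position in [1, m].
    static int g(int i, long index, int m) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] h = md.digest((i + "|" + index).getBytes(StandardCharsets.UTF_8));
        int v = ((h[0] & 0xff) << 24) | ((h[1] & 0xff) << 16) | ((h[2] & 0xff) << 8) | (h[3] & 0xff);
        return Math.floorMod(v, m) + 1;                   // result lies in [1, m]
    }
}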
2. Data encryption:
To protect the data confidentiality, the data owner uses a secure encryption algorithm to encrypt the
outsourced file before uploading.

Fig-5.1: The Main Process of System

i) Firstly, the data owner computes encryption key k = H (tagf ||SKO), and then uses k to encrypt
the file C = Enck(F), where Enc is an IND-CPA secure encryption algorithm. After that, the
data owner divides the ciphertext C into n′ blocks and meanwhile inserts n − n′ random blocks
into the n′ blocks at random positions, which guarantees that the CBF will not be null after
data transfer and deletion. Then the data owner records these random positions in a table PF.
ii) For every data block Ci, the data owner randomly chooses a distinct integer ai as the index of
Ci, and computes the hash value Hi = H(tagf ||ai ||Ci). Thus, the outsourced data set can be
denoted as D = ((a1, C1), · · ·, (an, Cn)). Finally, the data owner sends D to the cloud A, along
with the file tag tagf.
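
The sketch below mirrors this data-encryption step in Java: the key is derived as k = H(tagf || SKO), the ciphertext is produced with AES (standing in for the IND-CPA secure algorithm Enc), and each block receives the hash Hi = H(tagf || ai || Ci). The use of AES/GCM, the 128-bit key size and the concatenation format are assumptions made for illustration.

// Sketch of step 2: key derivation, file encryption and per-block hashing.
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class EncryptSketch {

    // SHA-256 over the concatenation of the given parts, standing in for H(.).
    static byte[] sha256(byte[]... parts) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        for (byte[] p : parts) md.update(p);
        return md.digest();
    }

    // k = H(tagf || SKO): derive a 128-bit AES key from the file tag and the owner's secret key bytes.
    static SecretKeySpec deriveKey(String tagf, byte[] skOwner) throws Exception {
        byte[] digest = sha256(tagf.getBytes(StandardCharsets.UTF_8), skOwner);
        byte[] key128 = new byte[16];
        System.arraycopy(digest, 0, key128, 0, 16);
        return new SecretKeySpec(key128, "AES");
    }

    // C = Enc_k(F): AES/GCM is used here as an example of an IND-CPA secure cipher;
    // the random IV is prepended to the ciphertext.
    static byte[] encrypt(byte[] file, SecretKeySpec k) throws Exception {
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.ENCRYPT_MODE, k, new GCMParameterSpec(128, iv));
        byte[] ct = c.doFinal(file);
        byte[] out = new byte[iv.length + ct.length];
        System.arraycopy(iv, 0, out, 0, iv.length);
        System.arraycopy(ct, 0, out, iv.length, ct.length);
        return out;
    }

    // Hi = H(tagf || ai || Ci): the per-block hash that cloud B later recomputes.
    static byte[] blockHash(String tagf, long ai, byte[] ci) throws Exception {
        return sha256(tagf.getBytes(StandardCharsets.UTF_8),
                      Long.toString(ai).getBytes(StandardCharsets.UTF_8),
                      ci);
    }
}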

25
3. Data outsourcing:

The cloud A stores D and generates storage proof. Then the data owner checks the storage result and
deletes the local backup.
i) Upon receiving data set D and file tag tagf, the cloud A stores D, and uses the indexes (a1, a2,
· · ·, an) to construct a counting Bloom filter CBFs, where i = 1, 2, · · ·, n. Meanwhile, the
cloud A stores tagf as the index of D. Finally, the cloud A computes a signature sigs = SignSKA
(storage||tagf ||CBFs||Ts), and sends the proof λ = (CBFs, Ts, sigs) to the data owner, where
Sign is an ECDSA signature algorithm and Ts is a timestamp.
ii) On receipt of the storage proof λ, the data owner checks its validity. Specifically, the data owner
first checks the validity of the signature sigs. If sigs is invalid, the data owner quits and outputs
failure; otherwise, the data owner randomly chooses half of the indexes from (a1, a2, · · ·, an)
to check the correctness of the CBFs. If the CBFs is not correct, the data owner quits and outputs
failure; otherwise, the data owner deletes the local backup.
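
The proof λ essentially binds the counting Bloom filter to the file tag and a timestamp under cloud A's signing key. The sketch below shows how such a signature could be produced and verified with the standard Java Signature API, assuming ECDSA over SHA-256 and treating the serialized CBF as an opaque byte array; the exact field layout of the signed message is an assumption.

// Sketch of step 3: sigs = Sign_SKA("storage" || tagf || CBFs || Ts) and its verification.
import java.nio.charset.StandardCharsets;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

public class StorageProofSketch {

    // Cloud A signs the storage proof with its private key SKA.
    static byte[] signStorageProof(PrivateKey skA, String tagf, byte[] cbfBytes, long ts) throws Exception {
        Signature signer = Signature.getInstance("SHA256withECDSA");
        signer.initSign(skA);
        signer.update("storage".getBytes(StandardCharsets.UTF_8));
        signer.update(tagf.getBytes(StandardCharsets.UTF_8));
        signer.update(cbfBytes);                               // serialized counting Bloom filter CBFs
        signer.update(Long.toString(ts).getBytes(StandardCharsets.UTF_8));
        return signer.sign();
    }

    // The data owner verifies sigs with cloud A's public key PKA before trusting the proof.
    static boolean verifyStorageProof(PublicKey pkA, String tagf, byte[] cbfBytes, long ts, byte[] sigs) throws Exception {
        Signature verifier = Signature.getInstance("SHA256withECDSA");
        verifier.initVerify(pkA);
        verifier.update("storage".getBytes(StandardCharsets.UTF_8));
        verifier.update(tagf.getBytes(StandardCharsets.UTF_8));
        verifier.update(cbfBytes);
        verifier.update(Long.toString(ts).getBytes(StandardCharsets.UTF_8));
        return verifier.verify(sigs);
    }
}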

4. Data transfer:

When the data owner wants to change the service provider, he migrates some data blocks, or even the
whole file from the cloud A to the cloud B.
i) Firstly, the data owner generates the set ϕ of block indices, which will identify the data
blocks that need to be migrated. Then the data owner computes a signature sigt = SignSKO
(transfer||tagf ||ϕ||Tt), where Tt is a timestamp. After that the data owner generates a transfer
request Rt = (transfer, tagf, ϕ, Tt, sigt), and then sends it to the cloud A. Meanwhile, the data
owner sends the hash values {Hi}i∈ϕ to the cloud B.
ii) On receipt of the transfer request Rt, the cloud A checks the validity of Rt. If Rt is not valid,
the cloud A quits and outputs failure; otherwise, the cloud A computes a signature sigta =
SignSKA (Rt||Tt), and sends the data blocks {(ai, Ci)} i∈ϕ to the cloud B, along with the
signature sigta and the transfer request Rt.

5. Transfer check:

The cloud B checks the correctness of the transfer and returns the transfer result to the data
owner.
i) Firstly, the cloud B checks the validity of the transfer request Rt and the signature sigta. If either
of them is invalid, the cloud B quits and outputs failure; otherwise, the cloud B checks
whether the equation Hi = H(tagf ||ai ||Ci) holds for every i ∈ ϕ. If Hi ≠ H(tagf ||ai ||Ci), the cloud
B requires the cloud A to send (ai, Ci) again; otherwise, the cloud B goes to Step ii).
ii) The cloud B stores the blocks {(ai, Ci)} i∈ϕ, and uses the indexes {ai}i∈ϕ to construct a new
counting Bloom filter CBFb. Then the cloud B computes a signature sigtb = SignSKB

26
(success||tagf ||ϕ||Tt||CBFb). Finally, the cloud B returns the transfer proof π = (sigta, sigtb,
CBFb) to the data owner.
iii) Upon receipt of π, the data owner checks the transfer result. To be specific, the data owner
checks the validity of the signature sigtb. Meanwhile, the data owner randomly chooses half of
the indexes from set ϕ to verify the correctness of the counting Bloom filter CBFb. If and only
if all the verifications pass, the data owner trusts the transfer proof is valid, and the cloud B
stores the transferred data honestly.
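
The hash comparison in Step i) can be expressed compactly in Java. The sketch below assumes the same SHA-256 construction of Hi as in the earlier encryption sketch and uses plain maps keyed by the block index ai; these representation choices are assumptions for illustration.

// Sketch of step 5 i): cloud B recomputes H(tagf || ai || Ci) for each received
// block and compares it with the hash value Hi provided by the data owner.
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Arrays;
import java.util.Map;

public class TransferCheckSketch {

    static byte[] blockHash(String tagf, long ai, byte[] ci) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        md.update(tagf.getBytes(StandardCharsets.UTF_8));
        md.update(Long.toString(ai).getBytes(StandardCharsets.UTF_8));
        return md.digest(ci);
    }

    // Returns true only if every received block matches its expected hash;
    // on a mismatch, cloud B should ask cloud A to resend that block.
    static boolean checkTransferredBlocks(String tagf,
                                          Map<Long, byte[]> receivedBlocks,   // ai -> Ci from cloud A
                                          Map<Long, byte[]> expectedHashes)   // ai -> Hi from the data owner
            throws Exception {
        for (Map.Entry<Long, byte[]> e : receivedBlocks.entrySet()) {
            byte[] recomputed = blockHash(tagf, e.getKey(), e.getValue());
            if (!Arrays.equals(recomputed, expectedHashes.get(e.getKey()))) {
                return false;
            }
        }
        return true;
    }
}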

6. Data deletion:

The data owner might require the cloud A to delete some data blocks when they have been transferred
to the cloud B successfully.
i) Firstly, the data owner computes a signature sigd = SignSKO (delete||tagf ||ϕ||Td), where Td is
a timestamp. Then the data owner generates a data deletion request Rd = (delete, tagf, ϕ, Td,
sigd) and sends it to cloud A.
ii) Upon receiving Rd, the cloud A checks Rd. If Rd is invalid, the cloud A quits and outputs
failure; otherwise, the cloud A deletes the data blocks {(ai, Ci)} i∈ϕ by overwriting. Meantime,
the cloud A removes indexes {aq}q∈ϕ from the CBFs and obtains a new counting Bloom filter
CBFd. Finally, the cloud A computes a signature sigda = SignSKA (delete||Rd||CBFd), and returns
the data deletion evidence τ = (sigda, CBFd) to the data owner.
iii) After receiving τ, the data owner checks the signature sigda. If sigda is invalid, the data owner
quits and outputs failure; otherwise, the data owner randomly chooses half of the indexes from
ϕ to check the equations CBFd(aq) = 0, i.e., to confirm that aq no longer belongs to CBFd. If the equations
hold, the data owner trusts that τ is valid.
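
The deletion check can be read as a failed membership test: for each sampled index aq, at least one of its k counters in CBFd must be zero. A small Java sketch of this test is shown below; the hash construction used to locate the k counters mirrors the illustrative one in the earlier sketches and is an assumption.

// Sketch of step 6 iii): verify that a sampled index aq has been removed from CBFd.
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class DeletionCheckSketch {

    // g_i(index): position of the i-th counter for this index (0-based for array access).
    static int g(int i, long index, int m) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] h = md.digest((i + "|" + index).getBytes(StandardCharsets.UTF_8));
        int v = ((h[0] & 0xff) << 24) | ((h[1] & 0xff) << 16) | ((h[2] & 0xff) << 8) | (h[3] & 0xff);
        return Math.floorMod(v, m);
    }

    // Returns true if aq no longer belongs to the counting Bloom filter, i.e.
    // at least one of its k counters is zero (CBFd(aq) = 0).
    static boolean isDeleted(int[] cbfd, int k, long aq) throws Exception {
        for (int i = 1; i <= k; i++) {
            if (cbfd[g(i, aq, cbfd.length)] == 0) {
                return true;              // membership test fails: aq was deleted
            }
        }
        return false;                     // all counters nonzero: aq may still be present
    }
}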

5.2. Algorithms
5.2.1. Counting Bloom Filter (CBF):
A Bloom filter (BF) is a space-efficient data structure, conceived by Burton Howard Bloom in 1970, that
is used to test whether a set contains a specified element. It is designed to tell, rapidly and memory-efficiently,
whether an element is present in a set. A BF costs constant time overhead to insert an element or to verify
whether an element belongs to the set, no matter how many elements the set and the BF include.
A BF initially is a bit array of m bits, all set to 0. The insertion takes an element and inputs
it to k different hash functions each mapping the element to one of the m array positions, which are then set to
1. When querying the BF on an element, it is considered to be in the BF if all positions obtained by evaluating
the hash evaluations are set to 1. The initial secret key sk output by the generation algorithm of a BFE scheme
corresponds to an empty BF. Encryption takes a message M and the public key pk, samples a random element

27
s (acting as a tag for the ciphertext) corresponding to the universe U of the BF and encrypts a message using
pk with respect to the k positions set in the BF by s.

Fig-5.2.1.1 Example of Bloom Filter

A BF can be viewed as an m-bit array with k hash functions hi(·): {0, 1}* → {1, ..., m}. To insert an element x, we set the corresponding group of k bits to 1; the positions of these bits are determined by the hash values h1(x), ..., hk(x). Membership tests are implemented by executing the same hash calculations and outputting success if all of the corresponding positions are one, as shown in Fig 5.2.1.2.

Fig-5.2.1.2 An Example of Counting Bloom Filter

Note that a BF admits false positives: with a small probability, all the k bits related to an element w may be one even though w does not belong to the set. However, we can choose appropriate parameters to reduce this probability, namely the number of hash functions k, the length of the BF m and the number of elements n; for n inserted elements, the false-positive rate is approximately (1 − e^(−kn/m))^k. With suitable parameters, the probability becomes so small that it is negligible. Besides, a BF cannot delete an element from the data set. As a variant of the BF, the CBF uses a counter cell to replace every "bit" position, as illustrated in Fig 5.2.1.2. To insert an element y, we increase the k related counters by one, where the indexes of the counters are again determined by the hash values h1(y), h2(y), ..., hk(y). Conversely, the deletion operation simply decreases the k corresponding counters by one.
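To make the above description concrete, the following is a minimal counting Bloom filter sketch in Python, written for illustration only. The k hash functions are derived from SHA-256 with different one-byte salts, and the parameters m and k are arbitrary assumptions rather than values fixed by the scheme.

```python
import hashlib

class CountingBloomFilter:
    """Illustrative counting Bloom filter: an array of m counters and k hash functions."""

    def __init__(self, m: int = 1024, k: int = 4):
        self.m, self.k = m, k
        self.counters = [0] * m

    def _positions(self, item: bytes):
        # h_1(x), ..., h_k(x): k counter indexes in {0, ..., m-1}
        return [int.from_bytes(hashlib.sha256(bytes([i]) + item).digest(), "big") % self.m
                for i in range(self.k)]

    def insert(self, item: bytes):
        for p in self._positions(item):
            self.counters[p] += 1

    def delete(self, item: bytes):
        for p in self._positions(item):
            if self.counters[p] > 0:
                self.counters[p] -= 1

    def query(self, item: bytes) -> bool:
        # may return a false positive, but never a false negative
        return all(self.counters[p] > 0 for p in self._positions(item))
```

For example, after cbf.insert(b"block-17") a query on b"block-17" returns True, and after the matching cbf.delete(b"block-17") it returns False (up to false positives caused by other inserted items).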
The data owner uses a counting Bloom filter in the following way. When the data owner wants to change the service provider, he migrates some data blocks, or even the whole file, from one cloud to the other cloud based on the services provided, such as resources, threshold VMs, prices and memory. On uploading the data, the data is encrypted, that is, ciphertext is generated along with a secret key. To transfer or access the data, the secret key is required.
If a user wants to transfer data, a request is sent by the user to the data owner. The data owner checks the request and responds by providing the secret key needed to transfer the data to the other cloud. The other cloud then checks the correctness of the transfer and returns the transfer result to the data owner. If and only if all the verifications pass, the data owner trusts that the transfer proof is valid and that the other cloud stores the transferred data honestly. After the transfer, the file can be downloaded. Finally, the data owner requires the previous cloud to delete the data blocks once they have been transferred to the other cloud successfully.
5.2.2. AES Algorithm:
The Advanced Encryption Standard (AES) is a symmetric-key algorithm, and the MAC in our scheme is built from a block cipher. AES is a block cipher that encrypts and decrypts data in 128-bit blocks using 128-, 192- or 256-bit keys. Symmetric-key algorithms are sometimes referred to as secret-key algorithms because they generally use one key that is kept secret by the systems engaged in the encryption and decryption processes. Symmetric-key algorithms use the same cryptographic keys for both the encryption of plaintext and the decryption of ciphertext. The keys may be identical, or there may be a simple transformation to go between the two keys. In practice, the keys represent a shared secret between two or more parties that can be used to maintain a private information link. The requirement that both parties have access to the secret key is one of the main drawbacks of symmetric-key encryption in comparison to public-key encryption (also known as asymmetric-key encryption).
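As an illustration of symmetric encryption of a data block, the following sketch uses AES in GCM mode from the third-party "cryptography" package; the report does not specify the AES mode or library actually used, so this is an assumption. AES-GCM also produces an authentication tag, which plays a role similar to the MAC mentioned above.

```python
# Sketch only: AES-GCM encryption/decryption of a data block.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_block(key: bytes, plaintext: bytes, aad: bytes = b"") -> tuple:
    nonce = os.urandom(12)                    # 96-bit nonce, recommended for GCM
    ciphertext = AESGCM(key).encrypt(nonce, plaintext, aad)
    return nonce, ciphertext                  # ciphertext already includes the GCM tag

def decrypt_block(key: bytes, nonce: bytes, ciphertext: bytes, aad: bytes = b"") -> bytes:
    return AESGCM(key).decrypt(nonce, ciphertext, aad)

key = AESGCM.generate_key(bit_length=256)     # 128-, 192- and 256-bit keys are supported
nonce, ct = encrypt_block(key, b"data block", b"tag_f||a_i")
assert decrypt_block(key, nonce, ct, b"tag_f||a_i") == b"data block"
```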

5.3. Testing
The purpose of testing is to discover errors. Testing is the process of trying to discover every conceivable fault or weakness in a work product. It provides a way to check the functionality of components, sub-assemblies, assemblies and/or a finished product. It is the process of exercising software with the intent of ensuring that the software system meets its requirements and user expectations and does not fail in an unacceptable manner. There are various types of tests, and each test type addresses a specific testing requirement.

5.3.1. Types of Testing:


5.3.1.1. Unit testing:
Unit testing involves the design of test cases that validate that the internal program logic is functioning properly and that program inputs produce valid outputs. All decision branches and internal code flow should be validated. It is the testing of individual software units of the application, and it is done after the completion of an individual unit, before integration. This is structural testing that relies on knowledge of the unit's construction and is invasive. Unit tests perform basic tests at the component level and test a specific business process, application, and/or system configuration. Unit tests ensure that each unique path of a business process performs accurately to the documented specifications and contains clearly defined inputs and expected results. An illustrative unit test is sketched below.
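As an illustration, a unit test in this project could look like the following pytest-style sketch; block_hash is a hypothetical helper (like the one sketched in the transfer check) that computes H(tagf||ai||Ci), not a function taken from the project code.

```python
# Sketch only: a unit test checking that the block hash is deterministic
# and that any modification of the ciphertext changes the hash.
import hashlib

def block_hash(tag_f: bytes, index: int, ciphertext: bytes) -> bytes:
    return hashlib.sha256(tag_f + index.to_bytes(8, "big") + ciphertext).digest()

def test_block_hash_detects_tampering():
    h = block_hash(b"tag", 1, b"ciphertext")
    assert h == block_hash(b"tag", 1, b"ciphertext")    # deterministic
    assert h != block_hash(b"tag", 1, b"ciphertext!")   # tampering is detected
```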

5.3.1.2. Integration testing:
Integration tests are designed to test integrated software components to determine whether they actually run as one program. Testing is event driven and is more concerned with the basic outcome of screens or fields. Integration tests demonstrate that, although the components were individually satisfactory, as shown by successful unit testing, the combination of components is correct and consistent. Integration testing is specifically aimed at exposing the problems that arise from the combination of components.

5.3.1.3. Functional testing:


Functional tests provide systematic demonstrations that functions tested are available as specified by the
business and technical requirements, system documentation, and user manuals.
Functional testing is centered on the following items:

Invalid Input: identified classes of invalid input must be rejected.
Functions: identified functions must be exercised.
Output: identified classes of application outputs must be exercised.
Systems/Procedures: interfacing systems or procedures must be invoked.
Organization and preparation of functional tests is focused on requirements, key functions, or special test cases. In addition, systematic coverage pertaining to identifying business process flows, data fields, predefined processes, and successive processes must be considered for testing. Before functional testing is complete, additional tests are identified and the effective value of current tests is determined.

5.3.1.4. System Testing:


System testing ensures that the entire integrated software system meets requirements. It tests a
configuration to ensure known and predictable results. An example of system testing is the configuration-
oriented system integration test. System testing is based on process descriptions and flows, emphasizing pre-
driven process links and integration points.

5.3.1.5. White Box Testing:


White box testing is testing in which the software tester has knowledge of the inner workings, structure and language of the software, or at least its purpose. It is used to test areas that cannot be reached from a black box level.

5.3.1.6. Black Box Testing:


Black box testing is testing the software without any knowledge of the inner workings, structure or language of the module being tested. Black box tests, like most other kinds of tests, must be written from a definitive source document, such as a specification or requirements document. It is testing in which the software under test is treated as a black box: you cannot "see" into it. The test provides inputs and responds to outputs without considering how the software works.

Unit Testing:
Unit testing is usually conducted as part of a combined code and unit test phase of the software lifecycle, although it is not uncommon for coding and unit testing to be conducted as two distinct phases.
Test strategy and approach
Field testing will be performed manually and functional tests will be written in detail.
Test objectives
 All field entries must work properly.
 Pages must be activated from the identified link.
 The entry screen, messages and responses must not be delayed.
Features to be tested
 Verify that the entries are of the correct format
 No duplicate entries should be allowed
 All links should take the user to the correct page.

Integration Testing
Software integration testing is the incremental integration testing of two or more integrated software
components on a single platform to produce failures caused by interface defects.
The task of the integration test is to check that components or software applications, e.g. components in a software system or, one step up, software applications at the company level, interact without error.
Test Results:
All the test cases mentioned above passed successfully. No defects encountered.

Acceptance Testing
User Acceptance Testing is a critical phase of any project and requires significant participation by the
end user. It also ensures that the system meets the functional requirements.
Test Results:
All the test cases mentioned above passed successfully. No defects encountered.

31
5.3.2. Other Testing Methodologies

User Acceptance Testing


User Acceptance of a system is the key factor for the success of any system. The system under
consideration is tested for user acceptance by constantly keeping in touch with the prospective system users
at the time of developing and making changes wherever required. The system developed provides a friendly
user interface that can easily be understood even by a person who is new to the system.

Validation Checking
Validation checks are performed on the following fields.
 Text Field:
The text field can contain only a number of characters less than or equal to its size. The text fields are alphanumeric in some tables and alphabetic in other tables. An incorrect entry always flashes an error message.
 Numeric Field:
The numeric field can contain only numbers from 0 to 9. An entry of any other character flashes an error message. The individual modules are checked for accuracy and for what they have to perform. Each module is subjected to a test run along with sample data. The individually tested modules are then integrated into a single system. Testing involves executing the program with real data, and the existence of any program defect is inferred from the output. The testing should be planned so that all the requirements are individually tested.
A successful test is one that brings out the defects for inappropriate data and produces output revealing the errors in the system.

Preparation of Test Data


Taking various kinds of test data does the above testing. Preparation of test data plays a vital role in
the system testing. After preparing the test data the system under study is tested using that test data. While
testing the system by using test data errors are again uncovered and corrected by using above testing steps and
corrections are also noted for future use.

Using Live Test Data:


Live test data are those that are actually extracted from organization files. After a system is partially
constructed, programmers or analysts often ask users to key in a set of data from their normal activities. Then,
the systems person uses this data as a way to partially test the system. In other instances, programmers or
analysts extract a set of live data from the files and have them entered themselves.
It is difficult to obtain live data in sufficient amounts to conduct extensive testing. And, although it is
realistic data that will show how the system will perform for the typical processing requirement, assuming that
the live data entered are in fact typical, such data generally will not test all combinations or formats that can
enter the system. This bias toward typical values then does not provide a true system test and in fact ignores
the cases most likely to cause system failure.

Using Artificial Test Data:

Artificial test data are created solely for test purposes, since they can be generated to test all combinations of formats and values. In other words, the artificial data, which can quickly be prepared by a data-generating utility program in the information systems department, make possible the testing of all logic and control paths through the program.
The most effective test programs use artificial test data generated by persons other than those who wrote the
programs. Often, an independent team of testers formulates a testing plan, using the systems specifications.
The developed package has satisfied all the requirements specified in the software requirement specification and was accepted.

5.4. Test Cases:

S.No | Test Name | Input | Output | Expected Result | Status
1 | Owner browsing the file | File | Data stored | Data stored in encrypted form | PASS
1 | Owner browsing data | Data | Data is not present in directory | Display data | FAIL
2 | Owner migrate file | From and to cloud details provided | Data migration | Delete details of files in existing cloud | PASS
2 | Owner migrate file | No from and to cloud details | Data not migrated | Delete details of files in existing cloud | FAIL
3 | End-user file search from cloud | File name | No. of matching files | Got file with same name | PASS
3 | End-user file search | No name | File not found | Got file with same name | FAIL
4 | End-user file request | File name | Request sent successfully | File request sent | PASS
4 | End-user file request | No name | Request sending failed | File request sent | FAIL
5 | End-user file download | File name | Get MAC and SK | Download file content | PASS
5 | End-user file download | No name | File not found | Download file content | PASS

5.5. Results
5.5.1. Results:
The time cost of data encryption:
In the encryption phase, we increase the file size from 1MB to 8MB with a step of 1MB, and the number of data blocks is fixed at 8000; the time cost comparison is shown in Fig.5.5.1.1. We can see that the time cost of all three schemes increases with the size of the encrypted data. However, the growth rate of our scheme is relatively lower than that of one comparison scheme and almost the same as that of the other. Note that the time cost of our scheme is less than that of the other two schemes because one of them needs many more hash computations to generate encryption keys, and the other needs more encryption operations to generate the MAC. Hence, we think our scheme is more efficient in encrypting the file.

Fig. 5.5.1.1: The time cost of data encryption
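The exact benchmarking code is not part of this report; the following sketch only indicates how such an encryption timing experiment could be reproduced, assuming AES-GCM from the "cryptography" package as the encryption primitive.

```python
# Sketch only: time AES-GCM encryption of files from 1 MB to 8 MB in 1 MB steps.
import os
import time
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)
for size_mb in range(1, 9):
    data = os.urandom(size_mb * 1024 * 1024)
    start = time.perf_counter()
    AESGCM(key).encrypt(os.urandom(12), data, b"")
    print(f"{size_mb} MB encrypted in {time.perf_counter() - start:.3f} s")
```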


The time of storage proof generation:
In the storage phase, the computation overhead comes from storage proof generation and storage result verification. Fig.5.5.1.2 shows the time cost of storage proof generation. We find that the time of our scheme is much less than that of Yang et al.'s scheme, and the growth rate of Yang et al.'s scheme is relatively higher than that of our scheme. Hence, our scheme is more efficient in generating the storage proof.

Fig.5.5.1.2: The time of Storage proof generation

The time of storage result verification:


We can find that the overhead of our scheme is much less than that of Yang et al.’s scheme, because
our scheme only needs to execute some hash calculations and a signature verification operation. But Yang et
al.’s scheme needs to execute plenty of bilinear pairing computations.

Fig.5.5.1.3: The time of storage result verification

The time cost of data transfer:


To simulate the data transfer, we increase the number of transferred data blocks from 10 to 80 with a step of 10. For simplicity, we fix n = 400 and ignore the communication overhead, as shown in Fig. 5.5.1.4. The time cost increases with the number of transferred data blocks. Moreover, Yang et al.'s scheme costs much more time since it needs to execute many bilinear pairing calculations to verify the data integrity, whereas our scheme only needs to compute some hash values. Note that a hash calculation is more efficient than a bilinear pairing calculation. Hence, our scheme is more efficient.
Fig.5.5.1.4: The time cost of data transfer

The time cost of data deletion:


Finally, the data owner wants to delete the transferred data from cloud A; we again fix n = 400, and the performance evaluation is presented in Fig.5.5.1.5. The time overhead of Hao et al.'s scheme is almost constant. However, the time cost of our scheme and Yang et al.'s scheme increases with the number of deleted data blocks, and the growth rate of Yang et al.'s scheme is relatively higher. Moreover, Yang et al.'s scheme costs much more time when more than 20 data blocks are deleted. So, we think that our scheme is more efficient in deleting the transferred data blocks.

Fig.5.5.1.5: The time cost of data deletion

5.5.2. Screenshots:

Home Page:

Fig-5.5.2.1: Home page

Cloud Server:

Fig-5.5.2.2: Cloud Menu

Fig-5.5.2.2.1: View Data Owners in Cloud

Fig-5.5.2.2.2. Migration details in Cloud

Fig-5.5.2.2.3. Cloud files in Cloud

Proxy Server:

Fig-5.5.2.3: Proxy Server menu page

Data Owner:

Fig-5.5.2.4: Data Owner Login

Fig-5.5.2.4.1: Data Owner Menu page

End User:

Fig-5.5.2.5: End User Login

Fig-5.5.2.5.1: End User menu page

Searching a File:

Fig-5.5.2.6: Searching a File

Downloading a File:

Fig-5.5.2.7: Download File

Verifying a File:

Fig-5.5.2.8: Verify a File


Transfer a File:

Fig-5.5.2.9: Transfer a File

CHAPTER-6
CONCLUSION & FUTURE SCOPE

6.1. Conclusion:

In cloud storage, the data owner cannot fully trust that the cloud server will execute the data transfer and deletion operations honestly. To solve this problem, we propose a CBF-based secure data transfer scheme, which can also realize verifiable data deletion. In our scheme, the cloud B can check the integrity of the transferred data, which guarantees that the data is entirely migrated. Moreover, the cloud A must use the CBF to generate deletion evidence after deletion, which the data owner uses to verify the deletion result. Hence, the cloud A cannot behave maliciously and cheat the data owner successfully.
In the proposed scheme, the user can flexibly delete the unnecessary data blocks while the useful data blocks still remain on the physical medium. Meanwhile, the proposed scheme achieves both public and private verifiability of the data deletion result; that is, any verifier who owns the data deletion evidence can verify the data deletion result. If the cloud server does not honestly execute the data deletion command and generate the deletion evidence, the verifier can detect the malicious data retention with an overwhelming probability. Finally, the security analysis and simulation results validate the security and practicability of our proposal, respectively.

6.2. Future Scope:

Similar to all the existing solutions, our scheme considers data transfer between two different cloud servers. However, with the development of cloud storage, the data owner might want to simultaneously migrate the outsourced data from one cloud to two or more target clouds, and these multiple target clouds might collude to cheat the data owner maliciously. Hence, provable data migration among three or more clouds requires further exploration.

REFERENCES
[1] C. Yang and J. Ye, “Secure and efficient fine-grained data access control scheme in cloud computing”,
Journal of High-Speed Networks, Vol.21, No.4, pp.259–271, 2020.
[2] X. Chen, J. Li, J. Ma, et al., “New algorithms for secure outsourcing of modular exponentiations”, IEEE
Transactions on Parallel and Distributed Systems, Vol.25, No.9, pp.2386–2396, 2018.
[3] P. Li, J. Li, Z. Huang, et al., “Privacy-preserving outsourced classification in cloud computing”, Cluster
Computing, Vol.21, No.1, pp.277–286, 2018.
[4] B. Varghese and R. Buyya, “Next generation cloud computing: New trends and research directions”, Future
Generation Computer Systems, Vol.79, pp.849–861, 2019.
[5] W. Shen, J. Qin, J. Yu, et al., “Enabling identity-based integrity auditing and data sharing with sensitive
information hiding for secure cloud storage”, IEEE Transactions on Information
Forensics and Security, Vol.14, No.2, pp.331–346, 2019.
[6] R. Kaur, I. Chana and J. Bhattacharya, “Data deduplication techniques for efficient cloud storage management: A systematic review”, The Journal of Supercomputing, Vol.74, No.5, pp.2035–2085, 2018.
[7] Cisco, “Cisco global cloud index: Forecast and methodology, 2014–2019”, available at:
https://www.cisco.com/c/en/us- /solutions/collateral/service-provider/global-cloud-index-gci/ white-paper-
c11-738085.pdf, 2019-5-5.
[8] Cloudsfer, “Migrate & backup your files from any cloud to any cloud”, available at:
https://www.cloudsfer.com/, 2019-5-5.
[9] Y. Liu, S. Xiao, H. Wang, et al., “New provable data transfer from provable data possession and deletion
for secure cloud storage”, International Journal of Distributed Sensor Networks, Vol.15, No.4, pp.1–12, 2019.
[10] Y. Wang, X. Tao, J. Ni, et al., “Data integrity checking with reliable data transfer for secure cloud
storage”, International Journal of Web and Grid Services, Vol.14, No.1, pp.106–121, 2018.
[11] L. Xue, Y. Yu, Y. Li, et al., “Efficient attribute based encryption with attribute revocation for assured
data deletion”, Information Sciences, Vol.479, pp.640–650, 2019.
[12] L. Du, Z. Zhang, S. Tan, et al., “An Associated Deletion Scheme for Multi-copy in Cloud Storage”, Proc.
of the 18th International Conference on Algorithms and Architectures for Parallel Processing, Guangzhou,
China, pp.511–526, 2018.
[13] C. Yang, X. Chen and Y. Xiang, “Blockchain-based publicly verifiable data deletion scheme for cloud
storage”, Journal of Network and Computer Applications, Vol.103, pp.185–193, 2018.
[14] C. Yang, J. Wang, X. Tao, et al., “Publicly verifiable data transfer and deletion scheme for cloud storage”,
Proc. of the 20th International Conference on Information and Communications Security (ICICS 2018), Lille,
France, pp.445–458, 2018.
[15] F. Hao, D. Clarke and A. F. Zorzo, “Deleting secret data with public verifiability”, IEEE Transactions on Dependable and Secure Computing, Vol.13, No.6, pp.617–629, 2015.

