
International Journal of Scientific & Engineering Research, Volume 9, Issue 1, January-2018
ISSN 2229-5518

AN EFFICIENT DATA COMPRESSION AND STORAGE TECHNIQUE WITH KEY MANAGEMENT AUTHENTICATION IN CLOUD COMPUTING

K. MuthuLakshmi, Associate Professor, Member, IEEE; K. Lalitha, Assistant Professor; S. Uma, Associate Professor; Panimalar Engineering College
akmuthube@gmail.com, lali.kalai@gmail.com, umaokj@gmail.com

Abstract

Large scale data processing has become increasingly common in cloud computing systems such as MapReduce, Hadoop and Dryad in recent years. In these systems, files are split into many small blocks and all blocks are replicated over several servers. The high demand for data processing leads to high computational requirements which are usually not available at the user's end. Compression algorithms reduce the redundancy in data representation, thus increasing the effective data density. Data compression is a very useful technique that helps in reducing the size of text data and storing the same amount of data in relatively fewer bits, thereby reducing the data storage space, resource usage and transmission capacity. In this paper a hybrid data compression algorithm is proposed which combines LZW and run length encoding. This hybrid algorithm increases the compression ratio and minimizes the compression and decompression times compared with existing algorithms.

Keywords: Data compression, LZW algorithm, run length encoding, Third party auditing, Key management authentication.

1. Introduction

Cloud storage has become an important aspect of the IT industry. Cloud storage is an application of cloud computing, and it is the most developed part of cloud applications. It depends on cluster applications, grid technology and distributed file systems, and provides storage service to users through the internet. In most conditions, cloud storage can provide highly reliable and secure storage service at a competitive price. The most developed cloud storage applications are online backup and file syncing. Disaster may happen at any time: fire, flood, tornado, hard drive failure. These disasters can destroy all locally stored data. Users can use remote backup to protect their data from such disasters. The other choice is to store copies of files in cloud storage [1]. Online backup is an Internet based system that is set up to automatically back up all selected files. These files are stored online and can be accessed anywhere, which is especially useful in case a local computer or server is lost or damaged.


The benefit of using online storage services is not limited to protecting data. Cloud storage services also make it easy to share files across different machines and mobile devices.

The benefits of cloud storage include flexibility with reduced cost and management of the data loss risk. Much recent work focuses on third party auditing and remote integrity checking, providing data dynamics. A remote archive service is responsible for properly preserving the data, and a remote data integrity checking protocol detects data corruption and misbehaving servers in the cloud storage. In the proposed work, a data partitioning technique and remote data integrity checking are analyzed in internal and external ways. Partitioning happens in alphabetical order using an index method, whereby the data being used is controlled [2].

The security mechanism is also emphasized in order to prevent unrecoverable data loss. Storage and retrieval are simplified by reducing the storage space, using a merging technique when data needs to be stored and retrieved. The cloud invisibly backs up files and folders and eliminates the potentially endless and costly search for extra storage space, from music files to pictures to sensitive documents. Using cloud storage services means that you and others can access and share files across a range of devices and locations. Files such as photos and videos can sometimes be difficult to email if they are too large or there are a lot of them. Instead, we can quickly circulate a URL and share the files with anyone we choose, which means uploading them to cloud storage [3].

Management of data becomes paramount in maintaining service levels and securing critical business information, because all data in a cloud lives in the same shared system. A small business under cloud computing is also dependent on the reliability of its internet connection, and internet outages make even the most reliable cloud computing service providers suffer now and again. It is noteworthy that even premier cloud computing service providers suffer outages. As smaller organizations plan to move forward with cloud computing, outages, which can happen anywhere, could have a profound impact on them [4]. Cloud computing is thus heavily dependent on the internet: the availability of a robust and reliable connection at all times is the premise on which cloud computing exists. To uphold the efficiency and effectiveness of cloud computing in data management, even with reduced bandwidth and irrespective of the device used to retrieve data, a feasible solution is to compress the data on the cloud. Data compression helps to reduce the consumption of expensive resources, for example disk space or transmission bandwidth [5].


2. Literature survey:

Many researchers have proposed the use of compression in cloud computing, which leads to effective use of storage disks and bandwidth in the cloud. Wang et al. [6] were the first to propose a scheme which supports public verification and fully dynamic data at the same time; previous studies only supported modify and delete operations on a data file. They define public auditability, meaning that verification is delegated to a trusted third party auditor (TPA). They also propose a scheme to improve the handling of the complex file index information, which otherwise consumes a lot of computing resources.

Stanek et al. [7] proposed an innovative encryption scheme which provides different levels of security for known and unknown data. For known information that is not especially delicate or sensitive, traditional encryption is performed, while for unknown information an alternate two-layered encryption scheme with higher security, which still supports deduplication, is proposed. Along these lines, they achieve a better tradeoff between the efficiency and the security of the outsourced information.

Xu et al. [8] additionally addressed the problem and demonstrated a secure convergent encryption for effective encryption, without considering the problems of block-level deduplication and key management. There are likewise different implementations of convergent encryption for secure deduplication; some commercial cloud storage suppliers, for example Bitcasa, also deploy convergent encryption.
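The core idea of convergent encryption is that the key is derived deterministically from the content itself, so identical plaintexts yield identical ciphertexts and can be deduplicated on the server side. The following minimal sketch illustrates that idea only; it is not the scheme of [8], and the SHA-256 based XOR keystream is a toy stand-in for a real cipher:

```python
# Illustrative sketch of convergent encryption (not the scheme of [8]).
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    # Expand the key into a keystream by hashing key || counter.
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def convergent_encrypt(plaintext: bytes) -> tuple:
    # The key is the hash of the content itself, so equal files
    # always produce equal ciphertexts.
    key = hashlib.sha256(plaintext).digest()
    cipher = bytes(p ^ k for p, k in zip(plaintext, keystream(key, len(plaintext))))
    return key, cipher

# Identical plaintexts yield identical ciphertexts, enabling deduplication.
k1, c1 = convergent_encrypt(b"same file contents")
k2, c2 = convergent_encrypt(b"same file contents")
assert k1 == k2 and c1 == c2
```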

Wang et al. [9] proposed a privacy protection scheme which considers the user's data privacy under public auditability. Data privacy concerns whether personally identifiable or sensitive information may be shared with third parties. As far as users are concerned, they depend on the TPA only for the outsourced storage security of their data. However, most studies do not consider the protection of clients' private information in the auditing phase. This is a serious problem, because an auditor may leak information without the client's authorization. Besides, there are legal regulations, such as the Health Insurance Portability and Accountability Act (HIPAA), which guarantees patient confidentiality for all healthcare-related data and demands that outsourced data not be leaked to external parties.

In [10] an architecture is proposed that ensures the privacy of data stored in cloud storage. The proposed architecture is directly applicable to existing clouds without any modifications or changes to the cloud database: a process can connect directly to an encrypted cloud database, without intermediate devices or systems, even with geographically distributed clients.


The architecture also allows executing independent operations, including those changing the database structure. Moreover, it removes intermediate proxies that limit the scalability, elasticity and availability properties that are intrinsic to cloud-based solutions.

Khobragade P. B. et al. [11] reviewed a number of image compression techniques, such as Lempel-Ziv-Welch, Huffman coding, run length encoding, compression based on the discrete cosine transform, discrete wavelet transform (DWT) based compression and integer wavelet transform (IWT) based compression, and concluded that integer wavelet transform based compression along with a lifting scheme gives a better compression ratio and retains the quality of the data.

Mukherjee, Tilak et al. [12] proposed a lossless compression approach using wavelets, in which integer wavelet transforms are used, the wavelet coefficients are converted to integer values, and a lifting scheme is applied to obtain a better peak signal to noise ratio and a shorter execution time for compression, resulting in good compression results.

3. Research methodology:

The main objective of the proposed work is to improve the storage capacity in the cloud and to secure the data in the cloud from unauthorized users, in order to achieve better throughput and end-to-end delay. Figure 1 shows the general architecture of the proposed work. The proposed framework is subdivided into three stages: the requisition phase, the authentication phase and the storage phase. The primary stage is the requisition phase, which uses a basic login and password requisition model. The secondary stage executes authentication and authorization, where each authorized user holds a separate key to access the data; for this, an efficient key management authentication algorithm is used. In the final stage the data are compressed and stored in the network using the hybrid data compression algorithm.

[Figure: block diagram connecting requisition, authentication, key distribution, third party auditing, hybrid data compression and data retrieval.]

Figure 1. System architecture of the proposed methodology
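The three stages can be read as a simple pipeline. The sketch below is only an illustration of this control flow, not the authors' code; every name in it is hypothetical, and zlib merely stands in for the hybrid compressor of Section 3.3:

```python
# Hypothetical sketch of the three-stage flow of Figure 1.
import zlib
from typing import Optional

def requisition(username: str, password: str, accounts: dict) -> bool:
    # Stage 1: basic login/password requisition model.
    return accounts.get(username) == password

def authenticate(username: str, key_store: dict) -> Optional[bytes]:
    # Stage 2: each authorized user holds a separate access key (Section 3.1).
    return key_store.get(username)

def compress_and_store(name: str, data: bytes, storage: dict) -> None:
    # Stage 3: compress the data, then store it in the network.
    storage[name] = zlib.compress(data)
```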

3.1 Efficient key management authentication algorithm:

The authorized users from the requisition phase generate a separate key. For tracing each attribute, users have unique identities.


These identities and user attributes are hidden from the users: from the ciphertexts, one cannot learn anything about whether attributes match or mismatch. The attributes are classified as hidden normal attributes (HN) and hidden identity attributes (HID). The algorithm proceeds in four phases:

A. Key setup: this phase outputs the public key and the master key.

B. Encryption: encrypt the message M with the set of attributes X, keeping the subset Xhide of attributes hidden.

C. Key generation: this phase takes an access structure as input and produces the decryption keys as output.

D. Decryption: decryption is done with the decryption keys for each attribute of the user.
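A minimal sketch of these four phases is given below. It is a toy stand-in, not a real attribute-hiding encryption scheme: the HMAC-derived per-attribute keys, the SHA-256 XOR keystream and all function names are illustrative assumptions only.

```python
# Toy sketch of the four phases; NOT a real attribute-hiding scheme.
import hashlib
import hmac
import secrets

def setup() -> tuple:
    # A. Outputs the (public key, master key) pair; random tokens here.
    return secrets.token_bytes(32), secrets.token_bytes(32)

def attribute_key(master_key: bytes, attribute: str) -> bytes:
    # C. Derive a per-attribute decryption key from the master key.
    return hmac.new(master_key, attribute.encode(), hashlib.sha256).digest()

def _stream(key: bytes, n: int) -> bytes:
    out = bytearray()
    for ctr in range((n + 31) // 32):
        out += hashlib.sha256(key + ctr.to_bytes(4, "big")).digest()
    return bytes(out[:n])

def encrypt(master_key: bytes, attributes: list, message: bytes) -> bytes:
    # B. Bind the ciphertext to the attribute set; the attribute names
    # themselves never appear in the ciphertext.
    keys = sorted(attribute_key(master_key, a) for a in attributes)
    session = hashlib.sha256(b"".join(keys)).digest()
    return bytes(m ^ k for m, k in zip(message, _stream(session, len(message))))

def decrypt(attr_keys: list, ciphertext: bytes) -> bytes:
    # D. A user holding the right per-attribute keys recomputes the session key.
    session = hashlib.sha256(b"".join(sorted(attr_keys))).digest()
    return bytes(c ^ k for c, k in zip(ciphertext, _stream(session, len(ciphertext))))
```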

The TPA (Third Party Auditor) is an entity which has the expertise and capability to provide the encryption and decryption service. When a client wants to store data in the cloud storage, the TPA (acting as the encryption/decryption service) encrypts the data and returns it to the user for storage.

For sensitive attributes, the method chooses a hash function which verifies the identity of the user with the help of the TPA; once the identity verification is cleared, the access clearance is computed. When the user's request clears both checks, the service is fulfilled and the sensitive values are encrypted using a specific key which can be decrypted by the user. For non-sensitive attributes, the method uses public key based encryption which can be decrypted by the user.
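A sketch of the hash-based identity check for sensitive attributes might look as follows; the salted SHA-256 comparison is an assumption, since the paper does not name a specific hash function:

```python
# Sketch of the TPA's hash-based identity check (hash function assumed).
import hashlib
import hmac
import secrets

def register_identity(identity: str) -> tuple:
    # The TPA stores only a salted hash of the identity, never the identity.
    salt = secrets.token_bytes(16)
    return salt, hashlib.sha256(salt + identity.encode()).digest()

def verify_identity(identity: str, salt: bytes, stored: bytes) -> bool:
    # Access clearance is computed only after this check succeeds.
    candidate = hashlib.sha256(salt + identity.encode()).digest()
    return hmac.compare_digest(candidate, stored)
```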

3.2 Key distribution centre:

A Distributed Key Distribution Center (DKDC, for short) is a set of n servers of a network that jointly realize the same function as a KDC. In this setting, users have secure point-to-point channels with all servers. A user who needs to communicate securely with other users sends a key-request message to a subset of his choice of at least k out of the n servers [13]. With this approach, the concentration of secrets and the slowdown factor which arise in a network with a single KDC are removed. A single server by itself does not know the secret keys, since they are shared between the n servers. Moreover, each user can send key-requests in parallel to different servers, so there is no loss of time in computing a key compared with a centralized setting. Finally, users can obtain the keys they need even if they are unable to contact some of the servers [14].
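The k-out-of-n behaviour can be realized with a threshold secret sharing scheme. The sketch below uses Shamir's scheme over a prime field as one possible instantiation (the paper does not fix a concrete scheme): each server holds one share of a session key, and any k shares reconstruct it.

```python
# One possible instantiation of k-out-of-n key distribution: Shamir sharing.
import secrets

P = 2**127 - 1  # a Mersenne prime, large enough for a 16-byte key

def split_key(key: int, n: int, k: int) -> list:
    # Hide the key in the constant term of a random degree-(k-1) polynomial;
    # server i receives the share (i, f(i)).
    coeffs = [key] + [secrets.randbelow(P) for _ in range(k - 1)]
    return [(x, sum(c * pow(x, e, P) for e, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]

def reconstruct(shares: list) -> int:
    # Lagrange interpolation at x = 0 recovers the key from any k shares.
    key = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        key = (key + yi * num * pow(den, -1, P)) % P
    return key

# A user contacting any k of the n servers can rebuild the session key.
session_key = secrets.randbelow(P)
shares = split_key(session_key, n=5, k=3)
assert reconstruct(shares[:3]) == session_key
assert reconstruct(shares[2:]) == session_key
```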

3.3 Hybrid data compression algorithm:

Compression techniques are mainly used to reduce the storage space and increase the capacity of the resources. Data or information which occupies more space is compressed using a compression technique, here a lossless one, and the compressed data can later be decompressed to obtain the original information for future use.


This is mainly used to reduce the resources' storage space and hence to increase productivity. In this section we use lossless data compression, where the data or information which is compressed to minimize its storage size does not undergo any loss. The lossless compression technique is highly secure. Our work combines the LZW algorithm with run length encoding. The LZW algorithm is very fast and simple to implement, but it is limited when compressing files that contain long runs of repetitive data, a case which run length encoding handles well.
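The sketch below shows one plausible way to couple the two coders; the paper does not spell out the exact composition, so the choice of applying run length encoding first (to collapse long runs) and LZW second (to build a dictionary over the RLE output) is an assumption. For clarity, the dictionary here grows without the fixed 12-bit (4096-entry) cap discussed below.

```python
# Minimal sketch of the hybrid RLE + LZW compressor (composition assumed).

def rle_encode(data: bytes) -> bytes:
    # Emit (count, byte) pairs, capping runs at 255.
    out, i = bytearray(), 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes([run, data[i]])
        i += run
    return bytes(out)

def rle_decode(data: bytes) -> bytes:
    out = bytearray()
    for count, value in zip(data[::2], data[1::2]):
        out += bytes([value]) * count
    return bytes(out)

def lzw_encode(data: bytes) -> list:
    # The table starts with the 256 single-byte strings (codes 0-255).
    table = {bytes([i]): i for i in range(256)}
    w, codes = b"", []
    for b in data:
        wc = w + bytes([b])
        if wc in table:
            w = wc
        else:
            codes.append(table[w])
            table[wc] = len(table)  # register the newly seen string
            w = bytes([b])
    if w:
        codes.append(table[w])
    return codes

def hybrid_compress(data: bytes) -> list:
    return lzw_encode(rle_encode(data))
```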


In this case, the encoded data consists entirely of 12-bit codes, each referring to one of the entries in the code table. In the encoding process, the cumulative probabilities are calculated and the range is created at the beginning. While reading the source character by character, the corresponding range of the character within the cumulative probability range is selected. The selected range is then divided into sub-parts according to the probabilities of the alphabet, the next datum is read, and the corresponding sub-range is selected. In this way, data are read repeatedly until the end of the file is encountered. Finally, a number is taken from the final sub-range as the output of the encoding process. This will be a fraction in that sub-range; therefore, the entire source message can be represented by a fraction. To decode the encoded message, the number of characters of the source message and the probability/frequency distribution are needed.

Decompression is achieved by taking each code from the compressed file and translating it through the code table to find what character or characters it represents. Codes 0-255 in the code table are always assigned to represent single bytes from the input file. For example, if only these first 256 codes were used, each byte in the original file would be converted into 12 bits in the LZW encoded file, resulting in a 50% larger file size. During decompression, each 12-bit code is translated via the code table back into single bytes.
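The decoding counterpart to the `lzw_encode` sketch above rebuilds the same code table on the fly, including the one special case where a code is used before its table entry is complete:

```python
# Decoder paired with the hybrid compressor sketch above.
def lzw_decode(codes: list) -> bytes:
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    w = table[codes[0]]
    out = bytearray(w)
    for code in codes[1:]:
        # A code may reference the entry currently being defined.
        entry = table[code] if code in table else w + w[:1]
        table[len(table)] = w + entry[:1]  # finish the pending entry
        out += entry
        w = entry
    return bytes(out)

def hybrid_decompress(codes: list) -> bytes:
    # Undo LZW first, then undo the run length encoding.
    return rle_decode(lzw_decode(codes))

data = b"aaaaabbbbbaaaaa" * 100
assert hybrid_decompress(hybrid_compress(data)) == data
```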


4. Performance analysis

(A) Compression ratio

The compression ratio is the ratio between the size of the compressed file and the size of the source file [15]:

Compression ratio = (size after compression) / (size before compression)
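Metrics of the kind reported in Tables 1-3 can be collected with a small harness like the following; the 100 kB sample and the use of zlib as the algorithm under test are illustrative assumptions:

```python
# Illustrative measurement harness for ratio and timing (zlib as stand-in).
import time
import zlib

def measure(compress, decompress, data: bytes) -> dict:
    t0 = time.perf_counter()
    packed = compress(data)
    t1 = time.perf_counter()
    restored = decompress(packed)
    t2 = time.perf_counter()
    assert restored == data  # lossless round trip
    return {"compression ratio": len(packed) / len(data),
            "compression time (s)": t1 - t0,
            "decompression time (s)": t2 - t1}

sample = (b"some repetitive cloud data " * 4096)[:100 * 1024]  # ~100 kB
print(measure(zlib.compress, zlib.decompress, sample))
```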

Table-1 Compression ratio between existing and proposed algorithms

Algorithm type | Compression ratio
LZW | 0.67
Huffman encoding | 0.32
Reference based compression | 0.59
BAR algorithm | 0.86
Proposed hybrid data compression algorithm | 0.932

Figure 2. Graphical representation of compression ratio

(B) Compression time:

Table-2 Compression time between existing and proposed algorithms

Algorithm type | Average compression time for 100 kB
LZW | 0.25 sec
Huffman encoding | 0.54 sec
Reference based compression | 0.19 sec
BAR algorithm | 0.43 sec
Proposed hybrid data compression algorithm | 0.152 sec

Figure 3. Graphical representation of compression time

(C) Decompression time:

Table-3 Decompression time between existing and proposed algorithms

Algorithm type | Average decompression time for 100 kB
LZW | 0.380 sec
Huffman encoding | 0.54 sec
Reference based compression | 0.57 sec
BAR algorithm | 0.64 sec
Proposed hybrid data compression algorithm | 0.21 sec

Figure 4. Graphical representation of decompression time

(D) Packet loss rate:

Table-5 Packet loss rate (PLR) between existing and proposed algorithms

Algorithm type | Packet loss ratio
LZW | 0.97
Huffman encoding | 0.86
Reference based compression | 0.58
BAR algorithm | 0.23
Proposed hybrid data compression algorithm | 0.15

Figure 5. Graphical representation of packet loss rate (PLR)


5. Conclusion:

This work studies the security issues of ensuring the integrity of data storage in cloud computing. We outline the challenges associated with retrieving data from the cloud in an appropriate manner. Compressing the data leads to a more optimized way of retrieving it from the cloud, and the use of compression in cloud computing leads to effective use of storage disks and bandwidth. This work enables the user to fine-tune the trade-off between storage costs, computation time and bandwidth costs. Different combinations of characters can be represented by fewer bits in compression, which is an efficient way of retrieving data in the cloud environment. As a result, the proposed algorithm achieves a compression ratio of about 0.932, with a compression time of 0.152 sec and a decompression time of 0.21 sec for a 100 kB file.

References:

[1] C.C. Tan, Q. Liu, and J. Wu, "Secure locking for untrusted clouds," Cloud Computing (CLOUD), 2011 IEEE International Conference on, pp. 131-138, IEEE, 2011.

[2] Qian Wang, Cong Wang, Jin Li, Kui Ren, and Wenjing Lou, "Enabling Public Auditability and Data Dynamics for Storage Security in Cloud Computing," IEEE Computer Society, Vol. 22, No. 5, May 2011.

[3] Anil G.N, Swetha M.S, and Muneshwara M.S, "A Smarter Way of Securing and Managing Data for Cloud Storage Applications Using High Throughput Compression in the Cloud Environment," IJARCSMS, Volume 2, Issue 9, September 2014.

[4] M. Effros, "PPM performance with BWT complexity: A fast and effective data compression algorithm," Proceedings of the IEEE, 88(11), pp. 1703-1712, 2000.

[5] Jiawei Yuan and Shucheng Yu, "Secure and constant cost public cloud storage auditing with deduplication," Communications and Network Security (CNS), 2013 IEEE Conference on, IEEE, 2013.

[6] Q. Wang, C. Wang, K. Ren, W. Lou, and J. Li, "Enabling public auditability and data dynamics for storage security in cloud computing," IEEE Transactions on Parallel and Distributed Systems, vol. 22, no. 5, pp. 847-859, 2011.

[7] Jan Stanek et al., "A secure data deduplication scheme for cloud storage," Technical Report, 2012.

[8] Jin Li et al., "Secure deduplication with efficient and reliable convergent key management," 2013.

[9] C. Wang, S. S. M. Chow, Q. Wang, K. Ren, and W. Lou, "Privacy-preserving public auditing for secure cloud storage," IEEE Transactions on Computers, vol. 62, no. 2, pp. 362-375, 2014.

[10] L. Ferretti, M. Colajanni, and M. Marchetti, "Distributed, concurrent, and independent access to encrypted cloud databases," IEEE Transactions on Parallel and Distributed Systems, 25(2), pp. 437-446, 2014.

[11] P. B. Khobragade and S. S. Thakare, "Image Compression Techniques: A Review," International Journal of Computer Science and Information Technologies (IJCSIT), 5.1, 2014, pp. 272-275.

[12] Tilak Mukherjee and M. Koteswara Rao, "Efficient Performance of Lifting Scheme Along With Integer Wavelet Transform In Image Compression," International Journal of Engineering Research and Applications (IJERA), 3.4 (2016), pp. 1950-1953.

[13] S.S.M. Chow, Y. Dodis, Y. Rouselakis, and B. Waters, "Practical Leakage-Resilient Identity-Based Encryption from Simple Assumptions," Proc. ACM Conf. Computer and Comm. Security, pp. 152-161, 2010.

[14] Cheng-Kang Chu, Sherman S.M. Chow, Wen-Guey Tzeng, Jianying Zhou, and Robert H. Deng, "Key-Aggregate Cryptosystem for Scalable Data Sharing in Cloud Storage," IEEE Transactions on Parallel and Distributed Systems, Vol. 25, No. 2, February 2014.

[15] Gurjeevan Singh, Ashwani Kumar Singla, and K.S. Sandha, "Through Put Analysis of Various Encryption Algorithms," IJCST, Vol. 2, Issue 3, September 2011.
