Fuzzy-Folded Bloom Filter-as-a-Service For Big Data Storage in The Cloud

Abstract:

With the ongoing trend of smart and Internet-connected objects being deployed across a broad range of
applications, there is also a corresponding increase in the amount of data movement across different
geographical regions. This, in turn, poses a number of challenges with respect to big data storage across
multiple locations, including cloud computing platforms. For example, the underlying distributed file
system has a large number of directories and files in the form of gigantic trees, which are difficult to parse
in polynomial time. Moreover, with the exponential increase of (big) data streams (i.e. unbounded sets of
continuous data flows), challenges associated with indexing and membership queries are compounded.
The capability to process such large amounts of data with high accuracy can have a significant impact
on decision-making and the formulation of business and risk-related strategies, particularly in our current
Industrial Internet of Things (IIoT) environment. However, existing storage solutions are deterministic in
nature. In other words, they tend to consume considerable memory and CPU time to yield accurate
results. This necessitates the design of efficient quality of service (QoS)-aware IIoT applications that are
able to deal with the challenges of data storage and retrieval in the cloud computing environment. In this
paper, we present a space-efficient strategy for massive data storage using a Bloom filter (BF).
Specifically, in the proposed scheme, the standard BF is extended to incorporate a fuzzy-enabled folding
approach, hereafter referred to as Fuzzy Folded BF (FFBF). In FFBF, fuzzy operations are used to
accommodate the hashed data of one BF into another to reduce storage requirements. Evaluations on UCI
ML AReM and Facebook datasets demonstrate the efficacy of FFBF, in terms of dealing with
approximately 1.9 times more data as compared to using the standard BF. This is also achieved without
affecting the false positive rate and query time.

Existing System:

A drawback of dynamic BFs is that query complexity increases as the filter grows. The initial size of the
filter is an important factor in dynamic BFs: a small initial array may lead to computational overhead,
slice-addition overhead, and query-complexity overhead, whereas a larger initial size may result in memory
wastage. Further, streaming applications, such as approximate caching, duplicate detection, and membership
queries, require one-pass processing of data, and results are required within a stipulated time bound. To
serve this purpose, the BF size should be small and constant so that it maps optimally to the cache. To
accommodate new data, some existing data needs to be deleted from the BF. Thus, staling of data is
required to manage the trade-off between false positives and false negatives [21].
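
To make the query-cost concern concrete, the following is a minimal sketch of a slice-based dynamic BF. It is an illustration only, not the design of any specific scheme cited here: the class name, the `capacity` threshold, and the `indexes` helper are all assumptions introduced for this example. A lookup may have to probe every slice, so its cost grows with the number of slices.

```python
import hashlib


def indexes(item: str, m: int, k: int) -> list[int]:
    """Derive k bit positions for an item (hypothetical helper for this sketch)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    h1 = int.from_bytes(digest[:8], "big")
    h2 = int.from_bytes(digest[8:16], "big") | 1  # forced odd, so never zero
    return [(h1 + i * h2) % m for i in range(k)]


class DynamicBloomFilter:
    """Grows by appending a fixed-size slice once the active slice is full."""

    def __init__(self, m: int, k: int, capacity: int) -> None:
        self.m, self.k, self.capacity = m, k, capacity
        self.slices = [[0] * m]  # one bit array per slice
        self.count = 0           # items inserted into the active slice

    def add(self, item: str) -> None:
        if self.count >= self.capacity:
            # Slice addition: extra memory and one more array to probe later.
            self.slices.append([0] * self.m)
            self.count = 0
        active = self.slices[-1]
        for idx in indexes(item, self.m, self.k):
            active[idx] = 1
        self.count += 1

    def __contains__(self, item: str) -> bool:
        # Worst case probes every slice: O(k * number of slices).
        idxs = indexes(item, self.m, self.k)
        return any(all(s[i] for i in idxs) for s in self.slices)
```

With a small `capacity`, the filter adds slices often and every lookup becomes slower. With a large one, much of the first slice may stay unused.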

Proposed System:

We propose a novel technique for compressing two BFs into one filter without losing any data. The
proposed approach uses fuzzy logic to store data optimally and to use the storage capacity efficiently.
Its key features are:

- Compression of two BFs into one BF using a fuzzy fold operation, whereby a large number of elements
  are accommodated in a single BF of size m.
- Slow decay of data, which allows streaming data to reside in memory for a substantial amount of time.
- Efficient and optimal utilization of storage space without any loss of accuracy.
- Significant reduction in computational cost by leveraging double hashing to compute the k hash
  functions.
- False positives that are not affected by the compression operation.
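
The double hashing mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the class name, the use of SHA-256 as the source of the two base hashes, and the parameter values are all chosen for this example. The idea is to derive the i-th index as (h1 + i * h2) mod m, so only two base hash values are computed per item regardless of k.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter that derives k indexes from two base hashes."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m            # number of bits in the filter
        self.k = k            # number of derived hash functions
        self.bits = [0] * m

    def _indexes(self, item: str) -> list[int]:
        # Assumption: both base hashes are taken from one SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # forced odd, so never zero
        # Double hashing: g_i(x) = (h1(x) + i * h2(x)) mod m
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> None:
        for idx in self._indexes(item):
            self.bits[idx] = 1

    def __contains__(self, item: str) -> bool:
        # No false negatives: every bit set by add() is still set here.
        return all(self.bits[idx] for idx in self._indexes(item))
```

A lookup touches at most k bits, so it stays O(k) whatever the number of stored elements.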

Conclusion:

IIoT is likely to become increasingly the norm in our society, particularly in critical infrastructure sectors
such as the Chemical Sector, the Commercial Facilities Sector, the Communications Sector, the Critical
Manufacturing Sector, the Dams Sector, the Defense Industrial Base Sector, the Emergency Services
Sector, the Energy Sector, the Food and Agriculture Sector, the Government Facilities Sector, and so on.
IIoT also has applications in conflict and adversarial environments, such as the Industrial Internet of
Military Things. Hence, there is a pressing need to address some of the existing challenges, including the
one addressed in this paper. Specifically, our proposed filter uses a novel fuzzy-based technique to resolve
the space requirement problem in BFs. We demonstrated that the proposed approach can accommodate a
higher number of elements in the same space than the standard BF (SBF). The cost of folding and its
associated operations is almost negligible because the proposed filter involves only simple fuzzy
operations on binary sets. The false positive rate of the compressed representation remains the same as
that of the standard BF. The computational time spent on hashing is also significantly reduced by the
double hashing technique, which uses only two hash functions to generate the k hash functions. The query
complexity of FFBF depends on the number of blocks into which the BF is divided. The cost of searching
for an element in an m-sized BF and in a compressed representation of the same size remains unchanged
(i.e., O(k)). Findings from our evaluations using both the UCI ML AReM and Facebook datasets also
demonstrated the efficiency of FFBF.
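
To illustrate what a "simple fuzzy operation on binary sets" can look like, the sketch below applies the standard fuzzy union (element-wise maximum of membership values) to two bit arrays of the same size m. The function name `fuzzy_union` is hypothetical, and the exact fold operation used by FFBF is not described in this summary. On binary values the fuzzy union reduces to a bitwise OR, which is the ordinary union of two Bloom filters, so this is only an assumption-laden illustration of the kind of operation involved.

```python
def fuzzy_union(a: list[int], b: list[int]) -> list[int]:
    """Fuzzy union (element-wise max) of two equal-sized bit arrays.

    Assumption: both filters were built with the same m and the same
    hash functions, so equal positions refer to the same hash outputs.
    """
    if len(a) != len(b):
        raise ValueError("filters must have the same size m")
    # For membership values restricted to {0, 1}, max(x, y) equals x OR y.
    return [max(x, y) for x, y in zip(a, b)]


# Usage: every bit set in either input stays set in the folded filter,
# so a query against it still costs k probes and has no false negatives.
folded = fuzzy_union([0, 1, 0, 1], [0, 0, 1, 1])
print(folded)  # [0, 1, 1, 1]
```

A plain union such as this one generally raises the fraction of set bits. It therefore does not by itself reproduce the false positive behaviour reported for FFBF, which relies on the paper's own fold design.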
