2020 3rd International Conference on Smart BlockChain (SmartBlock)

Blockchain Technologies and Their Applications in Data Science and Cyber Security

Bhavani Thuraisingham
Computer Science Dept.
The University of Texas at Dallas
Richardson, TX, USA
2020 3rd International Conference on Smart BlockChain (SmartBlock) | 978-1-6654-4073-8/20/$31.00 ©2020 IEEE | DOI: 10.1109/SMARTBLOCK52591.2020.00008

bxt043000@utdallas.edu

Abstract— Blockchain technologies have been very effective in processing distributed transactions securely. They have many applications, including handling Bitcoin cryptocurrencies and smart contracts. More recently, the use of blockchain has been explored for data science applications. This paper examines blockchain technologies and discusses their applications in data science and cyber security.

Keywords— Blockchain, Bitcoin, Smart Contracts, Data Science, Privacy, Cyber Security

I. INTRODUCTION

The National Institute of Standards and Technology (NIST) has defined blockchains to be “tamper evident and tamper resistant digital ledgers implemented in a distributed fashion (i.e., without a central repository) and usually without a central authority (i.e., a bank, company, or government)” [1]. Blockchain technologies essentially provide a platform for the secure transfer of the data that are part of any transaction, including financial transactions and contracts. At the heart of blockchain is cryptography, which ensures that the data being transferred is not tampered with and provides authenticity and integrity. The transactions involved in blockchains are simply transfers of assets, and the assets are essentially data that may represent financial information, healthcare information or even corporate information. One of the most popular applications of blockchain is Bitcoin, which is essentially a financial cryptocurrency. Transfers of bitcoins between different individuals are executed through blockchains.

In addition to Bitcoin, another popular application of blockchain is Ethereum. It is stated in [2] that Ethereum “has been created to implement not only transactions, but contracts which contain transactions with conditions and rules.” Such contracts are called smart contracts and have applications in areas such as the Internet of Things (IoT), where billions of devices have to work together and create smart contracts for exchanging data and executing the processes. Essentially, blockchain is a peer-to-peer technology with no central control.

Blockchains essentially deal with transactions, and transactions involve data. Massive amounts of data have to be collected, processed, analyzed and shared in various transactions. That is, data science techniques are at the heart of many of the transactions. Blockchains provide a way to execute these transactions securely. This paper discusses the applications of blockchain in data science and cyber security. Section II discusses blockchain technologies. Section III discusses blockchain for data science, while Section IV discusses blockchain for cyber security. The paper is concluded in Section V. For an overview of blockchain technology we refer the reader to [1] and [2].

II. BLOCKCHAIN TECHNOLOGIES

Blockchain essentially consists of a collection of blocks that are linked together via chains. A block is essentially a file that contains data pertaining to a transaction. The data from one block may be transferred to multiple blocks. Furthermore, a block may receive data from multiple blocks. The data in each block is permanent and immutable. Blocks can be added to the blockchain as the transaction progresses. Furthermore, each transaction has to be verified. However, unlike in non-blockchain applications, where transactions are usually verified by a central authority, a blockchain-based transaction is verified by a distributed collection of processes.

As discussed in the NIST document [1], blocks can be published without permissions, which means anyone can publish a block, or with permissions, where the blocks can be published only with the approval of an authority, either centralized or decentralized. An important component of blockchain is cryptographic hash functions. A cryptographic hash function produces a form of message digest in which checksums are computed based on the contents.
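
As a rough illustration of how cryptographic hash functions can link blocks and make tampering evident, the following Python sketch stores each block's SHA-256 digest in its successor. It is a minimal sketch under stated assumptions, not the scheme of any particular blockchain: the Block fields, the helper names append_block and is_valid, the all-zero genesis value and the example transactions are all hypothetical, and consensus, signatures and networking are omitted.

```python
import hashlib
import json
from dataclasses import dataclass

# Hypothetical placeholder used as the "previous digest" of the first block.
GENESIS_PREV_HASH = "0" * 64


@dataclass(frozen=True)
class Block:
    index: int
    transactions: tuple  # hypothetical payload: a tuple of transaction strings
    prev_hash: str       # digest of the previous block; this is the chain link

    def digest(self) -> str:
        # Serialize the block deterministically, then compute its SHA-256 digest.
        payload = json.dumps(
            {
                "index": self.index,
                "transactions": list(self.transactions),
                "prev_hash": self.prev_hash,
            },
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, transactions):
    """Return a new chain with one more block linked to the current last block."""
    prev_hash = chain[-1].digest() if chain else GENESIS_PREV_HASH
    return chain + [Block(len(chain), tuple(transactions), prev_hash)]


def is_valid(chain):
    """Check that every block stores the digest of the block before it."""
    expected = GENESIS_PREV_HASH
    for block in chain:
        if block.prev_hash != expected:
            return False
        expected = block.digest()
    return True


chain = []
chain = append_block(chain, ["alice pays bob 5"])
chain = append_block(chain, ["bob pays carol 2"])
chain = append_block(chain, ["carol pays dave 1"])
print(is_valid(chain))

# Altering an earlier block changes its digest, so the link stored in the
# following block no longer matches and the chain fails validation.
tampered = list(chain)
tampered[1] = Block(1, ("bob pays carol 200",), chain[1].prev_hash)
print(is_valid(tampered))
```

In this sketch a change to any block other than the last one is detected, because the next block still carries the old digest. A change to the most recent block would only be caught once a later block, or a separately known digest of the chain head, refers to it.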

Cryptographic hash functions are one of the key components that provide security (e.g., confidentiality, integrity, authenticity) for blockchains. Another major component of blockchain is the notion of a transaction, which is the means of interaction between two parties. Also, it is through transactions that cryptocurrencies are passed between the users of the blockchain. Blockchains use asymmetric key technology, which is essentially public key cryptography. Blockchains may also use network addresses, which are derived from the public keys. At the heart of blockchain is the notion of a ledger, which is a collection of transactions. The transactions are executed in a distributed fashion, and therefore the architecture that supports the blockchain is a distributed ledger. As transactions are executed, blocks get added to the blockchain; these blocks contain information such as a list of validated transactions and the metadata about these transactions. These blocks are chained together to form the blockchain.

There are several other details that pertain to blockchains, such as the consensus model (e.g., the proof of work model), forks and smart contracts. In the proof of work model, a user publishes the next block together with proof that the required work has been completed. Bitcoin uses this model. In some cases, the blockchains may need to change, and such changes are called forks. As stated in [1], a “smart contract is a collection of code and data (sometimes referred to as functions and state) that is deployed using cryptographically signed transactions on the blockchain network.” The nodes in the blockchain execute the smart contracts. Examples include Ethereum and Hyperledger Fabric’s chaincode. More details of how blockchains execute transactions, as well as descriptions of the various components, can be found in [1] and [2].

III. BLOCKCHAIN FOR DATA SCIENCE

With the advent of the web, computing systems are now being used in every aspect of our lives, from mobile phones to autonomous vehicles. It is now possible to collect, store, manage, and analyze vast amounts of sensor and other data emanating from numerous devices and sensors. Such systems collectively are known as the Internet of Things, where multiple autonomous and distributed devices and systems are connected through the web and coordinate their activities. However, security and privacy for the massive data systems within the IoT have become a major concern. Due to the large volumes of heterogeneous data being collected from numerous devices, the traditional cyber security techniques such as encryption are not efficient enough to secure the IoT and related big data systems. More recently, the developments in data science are also being examined for securing such systems. Furthermore, the data science techniques could be attacked and need to be secured. Blockchain is emerging as a key technology for securing the data science techniques. That is, securing the data collection, data processing, data management, data analytics and data sharing activities via blockchain is being examined. It is stated in [3] that “data analysis is possible right from the edge of individual devices. Additionally, data generated through blockchain is validated, structured and immutable. Since the data that is provided by blockchain is ensured of data integrity, it enhances big data.”

As stated in [4], data scientists “are now relying on blockchain to authenticate and track data at every point on a chain. Its immutable security is one of the main drivers for its adoption. This decentralized ledger protects data through multiple signatures, thus preventing data leaks and hacks.” It is also stated that blockchain is becoming key to maintaining trust, improving data quality and securely sharing the data. In many organizations, trust is enforced by a single authority, resulting in a single point of failure. With the decentralized nature of blockchain, trust can be ensured by a collection of processes in the peer-to-peer network. Similarly, for data sharing, blockchain technologies enable multiple parties to access and share the data securely. Blockchain also enables the verification of the integrity of the data at every point of the transactions. The distributed ledger at the heart of blockchain can also determine the provenance of the data, which is an important aspect of data science. Blockchain is also key to keeping track of all the transactions in a supply chain process [5], and this also includes the data supply chain [6].

Other efforts on blockchain for data science include the ones reported in [7] and [8]. For example, in [7], the authors discuss how blockchain can be used in big data analytics to analyze private data. Their approach is based on the Hyperledger Fabric blockchain. In [8], the authors state that data security and privacy cut across all aspects of data science and then discuss how blockchain can provide solutions due to its decentralized infrastructure.

In our recent work [9], we argue that blockchain-based cryptocurrencies have the entire transaction graph accessible to the public (i.e., all transactions can be downloaded and analyzed). We then investigate whether the transaction graphs in blockchains impact the price of
the underlying cryptocurrency. We show that the topological features computed from the blockchain graphs can be used to predict Bitcoin price dynamics. In [10], we have discussed privacy-aware policy-based data lifecycles, which involve data collection, storage, management, analysis and sharing. We believe that blockchain technologies can ensure security for the entire data lifecycle process. We have also examined various aspects of integrating cyber security and data science [11] and explored a data driven approach for the science of cyber security [12]. We need to explore how blockchain can ensure the security of such data lifecycle activities. Finally, there has been a lot of work on multilevel security for database management systems in the 1980s and 1990s [13]. We need to explore the use of blockchains in the designs of such multilevel secure database systems, including in the execution of multilevel distributed transactions.

IV. BLOCKCHAIN FOR CYBER SECURITY

Blockchain technologies were developed mainly to execute secure transactions, including the secure transfer of cryptocurrencies. As stated earlier, at the heart of blockchain is the cryptographic checksum. Therefore, security is at the forefront of blockchain. This section briefly discusses the blockchain applications for security.

Four security-based use cases of blockchain are discussed in [14]. It is stated that centralized storage is not secure due to its single point of entry. With the distributed processing capability of blockchains, the data could be distributed across multiple devices. That is, the distributed ledger-based architecture for blockchains facilitates distributed data storage. Cryptographic checksums are used to ensure security. Furthermore, the key can be revoked at any time, and this way one can enforce dynamic security. Another application of blockchain is in providing IoT security, where billions of devices are connected. Such a system facilitates distributed processing. As a result, blockchain technologies can be used for secure communication between the devices without centralized control. A third area is DNS (Domain Name System). DNS systems are usually centralized, and therefore hackers can break into such systems without difficulty. However, due to the distributed nature of blockchains, hackers will find it more difficult to find a single point of entry. Finally, most messaging systems use end-to-end encryption. However, more recently, these systems are beginning to use blockchain technologies. Again, the distributed processing capabilities provided by blockchains enable a uniform way of communication in messaging systems.

Several other articles have discussed the use of blockchain for security. For example, in [15] it is stated that blockchains “could potentially help enhance cyber-defense as the platform can prevent fraudulent activities via consensus mechanisms, and detect data tampering depending on its underlying characteristics of operational resilience, data encryption, auditability, transparency and immutability.” It also adds that blockchains enhance security by eliminating humans from the authentication process, reducing distributed denial-of-service (DDoS) attacks, providing traceability, and supporting decentralized storage. Similar applications are discussed in several other articles, including [16], which describes the use of blockchains to enhance security, including data confidentiality and integrity.

V. SUMMARY AND DIRECTIONS

In this paper we discussed blockchain technologies and their applications in data science as well as in cyber security. In particular, we discussed various concepts such as blocks, transaction execution, Bitcoin, smart contracts, cryptographic checksums and related blockchain components, and then discussed how blockchain technologies could be used for data science, including data analytics and data sharing. Blockchain provides security for the entire data life cycle process. Finally, we discussed the applications of blockchains for security, including IoT security, storage, DDoS attacks, confidentiality and integrity, as well as authentication.

The next step is to examine various aspects of data science activities, including the privacy-aware policy-based data life cycle process, and explore how blockchain technologies can be securely applied for the various distributed transactions involved in these activities. In addition, smart contracts in supply chains, including the data supply chain, as well as in executing financial transactions need to be explored. Finally, blockchain applications in cyber security need to be explored further, including areas such as ransomware and adversarial machine learning. We believe that blockchain is the glue that integrates data science with cyber security.

ACKNOWLEDGMENT

I thank my PhD students including Dr. Ceren Abay and Mr. Brian Ricks as well as colleagues Profs. Murat Kantarcioglu, Latifur Khan, Cuneyt Akcora, and Yulia Gel for discussions. I also thank Ms. Rhonda Walls for editing this paper.

REFERENCES

[1] Dylan Yaga, et al, National Institute of Standards and Technology, NISTIR 8202 Blockchain Technology Overview; https://nvlpubs.nist.gov/nistpubs/ir/2018/NIST.IR.8202.pdf
[2] Cuneyt Gurcan Akcora, Yulia R. Gel, Murat Kantarcioglu: Blockchain: A Graph Primer. CoRR abs/1708.08749 (2017)
[3] Vibhuthi Viswanathan, Implications of blockchain in data science, July 17, 2019; https://www.itproportal.com/features/implications-of-blockchain-in-data-science/
[4] Catalin Zorzini, Why Data Scientists Are Falling in Love with Blockchain Technology, August 9, 2019; https://www.techopedia.com/why-data-scientists-are-falling-in-love-withblockchaintechnology/2/33356#:~:text=Data%20scientists%20are%20now%20relying,main%20drivers%20for%20its%20adoption.&text=When%20the%20decentralized%20ledger%20is,with%20a%20specific%20cryptographic%20key.
[5] Bernard Marr, How Blockchain Will Transform The Supply Chain And Logistics Industry, https://www.forbes.com/sites/bernardmarr/2018/03/23/how-blockchain-will-transform-the-supply-chain-and-logistics-industry/#4d5dd7fb5fec
[6] Kevin W. Hamlen, Bhavani M. Thuraisingham: Data security services, solutions and standards for outsourcing. Comput. Stand. Interfaces 35(1): 1-5 (2013)
[7] Konstantinos Lampropoulos et al, Using Blockchains to Enable Big Data Analysis of Private Information, Proceedings 2019 IEEE 24th International Workshop on Computer Aided Modeling and Design of Communication Links and Networks (CAMAD); https://ieeexplore.ieee.org/document/8858468
[8] Jiameng Liu, et al, Blockchain for Data Science, Proceedings of the 2020 The 2nd International Conference on Blockchain Technology, https://dl.acm.org/doi/10.1145/3390566.3391681
[9] Nazmiye Ceren Abay, et al, ChainNet: Learning on Blockchain Graphs with Topological Features. ICDM 2019: 946-951
[10] Bhavani M. Thuraisingham, et al, Towards a Privacy-Aware Quantified Self Data Management Framework. SACMAT 2018: 173-184.
[11] Bhavani M. Thuraisingham, et al, Integrating Cyber Security and Data Science for Social Media: A Position Paper. IPDPS Workshops 2018: 1163-1165.
[12] Bhavani M. Thuraisingham, et al, A Data Driven Approach for the Science of Cyber Security: Challenges and Directions. IRI 2016: 1-10.
[13] Bhavani Thuraisingham, Database and Applications Security: Integrating Information Security and Data Management, CRC Press, 2005.
[14] Andrew Arnold, 4 Promising Use Cases Of Blockchain In Cybersecurity, Forbes, January 30, 2019. https://www.forbes.com/sites/andrewarnold/2019/01/30/4-promising-use-cases-of-blockchain-in-cybersecurity/#3a9a73643ac3
[15] Savaram Ravindra, The Role of Blockchain in Cyber Security, InfoSecurity Magazine, https://www.infosecurity-magazine.com/next-gen-infosec/blockchain-cybersecurity/
[16] John Ocampos, Contribution of Blockchain to Cybersecurity, March 23, 2020. https://theblockchainland.com/2020/03/23/contribution-blockchain-cybersecurity/
