
CYBERSECURITY AND MACHINE LEARNING I

Cybersecurity and Machine Learning



Chapter 1 Introduction

1 Background/Introduction

1.1 Problem Statement

Modern life is hard to imagine without technologies such as the internet, email, web access, digital transactions, and smart devices, all of which depend heavily on computer science and information technology. Intruders attack cyberspace to obtain the information they want, including competitors' secrets and customers' details. Organizations use security measures to stop hackers and intruders, but traditional measures are becoming inadequate against current cyber threats. Internet access to personal and social networks is being compromised, and network software is attacked regularly. Both internal and external factors contribute to the theft of information. Organizations and individuals rely on the security provided by vendors, but that security is limited. Hackers keep discovering new methods to penetrate security systems and steal important information about organizations and individuals. Basic security systems cannot monitor the large volume of traffic that passes through an organization's network.

Many advances in information technology help to mitigate cyberattacks. Machine learning is one of the strategic initiatives in cybersecurity and acts as a helping hand to the human security professional. Its great advantage is that it offers round-the-clock protection of the network by analyzing attacks that normal security measures may miss.

1.2 Goal

The goal of this paper is to discuss the role of machine learning as a cybersecurity measure. The paper describes the different types of cyberattacks and discusses machine learning in detail.

1.3 Research Questions

The paper focuses on the following research questions.

a. What are the various tools used by Cybercriminals to invade the host's security?

b. Is Machine learning the ultimate security measure to prevent Cyberattacks?

c. What are the pros and cons of using Machine learning in Cybersecurity?

1.4 Relevance and Significance

Although organizations employ many cybersecurity measures, attackers keep developing advanced cyberattacks that traditional defenses find difficult to handle. Strong, innovative measures are therefore needed to prevent cyberattacks. Machine learning is one such innovation.

1.5 Barriers and Issues

Because machine learning is a relatively new subject, many corporations do not yet employ it, so little data is available about its use in cybersecurity. Not much research has been conducted on how to use machine learning in this field.

Chapter 2 Literature Review

Cybersecurity is a combination of technologies and security systems that protects the networks, computers, data, and information of organizations and individuals from cyberattacks, theft of information, and similar threats. Cybersecurity is of three types: (1) cloud-based security, (2) network security, and (3) application security. Cybersecurity contains certain domains



like (1) access control, (2) telecommunications, (3) network security, (4) software development, (5) the social domain, and (6) operations security. Machine learning is a type of artificial intelligence.

Modern technology helps people, organizations, and businesses access information and reach others through the web, the internet, and social media such as Twitter, Facebook, Instagram, and other applications. Smart devices make it easy to gather data and access information, tasks that were very difficult with ordinary systems, and this has reduced the time needed to obtain information. However, the ease of use of smart devices has increased the security challenges that organizations now face. Because large volumes of data are transferred, security is also being compromised. Most individuals who operate smart devices are unaware of the adverse effects of sharing their information on social media and the Internet of Things. They expose themselves because they do not know the security measures or do not keep up with updated security procedures. As many organizations try to reach the maximum number of customers globally, the amount of data collected is huge, and this creates many security threats. Many incidents of compromised network security have been reported recently. Even though experts are equipped with the most advanced security systems, they have not been able to stop hacking and hijacking.

Cyberattacks have become common. Hackers use malware and viruses to steal important information and disrupt the business of organizations. Interaction through smart devices, the internet, and social media has increased the risk of cyberattacks. The attacks may be interstate or intercompany, or may come from other external sources. Employees who are unhappy with the company they work for may also be involved. Competitors may also try to steal a company's important information.



Cybersecurity is becoming more and more important as cyber threats increase day by day, and protecting data from these threats is a major challenge for cyberspace. Cyberspace may include the IoT, business organizations, government organizations, and individuals. Cybersecurity is a set of technologies and processes designed to protect computers, networks, programs, and data from attacks and unauthorized access, alteration, or destruction. An organization's security system consists of a network security system and a computer security system. Each of these includes firewalls, antivirus software, and intrusion detection systems (IDSs). IDSs help discover, determine, and identify unauthorized system behaviour such as use, copying, modification, and destruction.

Governments and corporations should install the most advanced security systems to protect their data. Online transactions are increasing day by day, and hackers are using the most modern methods to steal the data, so security teams should be provided with the most advanced tools. Because hackers use sophisticated methods for security breaches, rule-based, signature-based, and firewall security fail regularly. Machine learning provides protection where these traditional processes have failed. It identifies an attack and can go deeper into the problem, for example by tracing the source of the attack. Machine learning has become a mainstream technology in cybersecurity applications; examples include malware analysis, threat analysis, and anomaly-based intrusion detection of prevalent attacks on critical infrastructures, among many others. Because signature-based methods are ineffective at detecting zero-day attacks or even slight variants of known attacks, machine learning-based detection is being used in cybersecurity products. Machine learning systems play a major role in detecting cyber threats. ML can be used by both attacker and defender (Sarkar, 2020). The attacker uses ML to try to find a way through the firewall and security system. On the defender's side, ML helps in detecting malware and other threats.

Using machine learning against different types of cyber threats shows very good results compared with older methods. Even though machine learning is very advanced, it can also fail against attackers. More advanced machine learning systems are needed to fight these threats.

Machine learning is one of the AI techniques used for big data analytics. Research in these fields has been going on for some years, and the techniques are being introduced into fields such as biology and chemistry, driven by interest in large data sets and algorithms (Hariri, 2019). Machine learning algorithms are used for classification, regression, and clustering, or to reduce the complexity of tasks associated with large data sets. They extend human abilities and are widely employed in daily life for image and speech recognition, email spam filtering, credit scoring, fraud detection, and so on. They create models for prediction and knowledge discovery, and these models have made data-driven decision-making more efficient.

There are various uncertainties associated with ML. If these uncertainties and the related disadvantages are removed, new developments in these fields will follow. Hariri (2019) presented the uncertainties that occur in machine learning. They are related to incomplete training, inconsistent classification, and learning from low-veracity and noisy data. In natural language processing, uncertainty arises in keyword search, word ambiguity, and classification. Future research must therefore concentrate on these uncertainties and develop new services. Such developments can be illustrated by how machine learning was introduced into solid-state materials science. Experiments must be done to identify the key elements in new materials. New systems are being developed thanks to the increase in computing power and the development of more efficient codes. In an experiment all the codes may work, but in reality there are many hurdles in making the code work.

The Materials Genome Initiative may be used to close the gap between experiment and theory. Experiments use only a small set of data that can be mapped to a large number of CPU hours, so algorithms must be developed that can handle large amounts of data. The two-layer networks in use now are not efficient and need some human intervention, which deep networks can avoid. Neural networks are given additional hidden layers to create deep networks. The idea behind deep networks is to provide different levels of abstraction without human intervention. There are as yet no novel laws or knowledge regarding their use.

Surrogate-based optimization must be used to get correct results on a smaller budget, and further research is needed on this concept. Component prediction can also be used to improve speed (Schmidt, 2019). Deep learning helps in processing large data sets with less code and engineering. Labeled data may not be available in most situations; in these cases, zero-shot learning must be employed. More research must be done on learning from unlabeled data, and more reinforcement learning methods are needed to help resolve this issue. Symbolic and sub-symbolic methods must be combined to produce a good decision-making process (Young, 2018).

Statistical algorithms and machine learning algorithms must be adapted to process huge amounts of data. Several researchers have mentioned that all prototypical tasks must be given importance; there are about 12 prototypical tasks to consider. Artifacts must be developed to target the real world rather than artificial settings. One problem with ML is that it requires a lot of data. ML also needs to be handled by well-trained employees.

Machine learning is widely used in daily life, and a great deal of research on it is under way. People want machines to talk and to work effectively in a short

amount of time. Making this possible involves complex coding and algorithms. The uncertainties and disadvantages of ML can be reduced by modifying the algorithms. In the future, human-computer conversation may evolve further.

Security breaches include external intrusions and internal intrusions. There are three main types of network analysis for IDSs: misuse-based (also known as signature-based), anomaly-based, and hybrid. Misuse-based detection techniques aim to detect known attacks by using the signatures of these attacks. They are used for known types of attacks and do not generate a large number of false alarms. However, administrators often must manually update the database of rules and signatures. New (zero-day) attacks cannot be detected by misuse-based techniques.
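
As a minimal sketch of misuse-based detection, the following Python fragment checks a payload against a small table of known signatures. The signature names and byte patterns are hypothetical and chosen only for illustration; a real IDS rule set is far richer. The second call shows why an unseen (zero-day) payload goes undetected.

```python
# Hypothetical signature table: the names and byte patterns are illustrative,
# not taken from any real IDS rule set.
SIGNATURES = {
    "sql_injection": b"' OR '1'='1",
    "path_traversal": b"../../",
}


def match_signatures(payload):
    """Return the sorted names of all known signatures found in the payload."""
    return sorted(name for name, pattern in SIGNATURES.items() if pattern in payload)


# A payload that contains a known pattern is reported by name.
known = match_signatures(b"GET /item?id=1' OR '1'='1")
# A payload with no known pattern matches nothing, even if it is malicious.
unseen = match_signatures(b"GET /item?id=brand-new-exploit")
print(known, unseen)
```

Every new attack requires a new entry in `SIGNATURES`, which corresponds to the manual update burden described above.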

Anomaly-based techniques study the normal network and system behaviour and

identify anomalies as deviations from normal behaviour. They are appealing because of their

capacity to detect zero-day attacks. Another advantage is that the profiles of normal activity

are customized for every system, application, or network, therefore making it difficult for

attackers to know which activities they can perform undetected. Additionally, the data on

which anomaly-based techniques alert (novel attacks) can be used to define the signatures for

misuse detectors. The main disadvantage of anomaly-based techniques is the potential for

high false alarm rates because previously unseen system behaviours can be categorized as

anomalies.
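
The idea of a per-system profile of normal activity can be sketched with a simple statistical baseline. The example below assumes a single feature (requests per minute) and a z-score threshold of 3; both are illustrative assumptions, not a method taken from the cited literature.

```python
from statistics import mean, stdev


def build_profile(baseline):
    """Summarize normal behaviour as (mean, standard deviation)."""
    return mean(baseline), stdev(baseline)


def is_anomalous(value, profile, threshold=3.0):
    """Flag a value that deviates from the profile by more than `threshold` sigmas."""
    mu, sigma = profile
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Assumed baseline: requests per minute observed during normal operation.
profile = build_profile([100, 104, 98, 101, 97, 103, 99, 102])

print(is_anomalous(102, profile))  # ordinary load
print(is_anomalous(500, profile))  # flood-like load, never seen before
print(is_anomalous(110, profile))  # a modest but unseen burst is also flagged
```

The last call illustrates the false-alarm problem: a harmless burst that was never observed during profiling is still reported as an anomaly.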

Hybrid detection combines misuse and anomaly detection. It is used to increase the

detection rate of known intrusions and to reduce the false-positive rate of unknown attacks.

Most ML/DL methods are hybrids. Machine learning is becoming more popular in tackling

Cyber threats.
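
A hybrid detector can be sketched by running a signature check first and falling back to an anomaly check. The signatures, the request-rate feature, and the threshold below are all illustrative assumptions.

```python
from statistics import mean, stdev

# Hypothetical signatures of known attacks (illustrative only).
KNOWN_BAD = (b"' OR '1'='1", b"../../")


def classify(payload, request_rate, baseline_rates, threshold=3.0):
    """Return 'known_attack', 'anomaly', or 'normal' for one observation."""
    # Misuse stage: known attacks are caught by their signatures.
    if any(signature in payload for signature in KNOWN_BAD):
        return "known_attack"
    # Anomaly stage: unknown attacks may show up as deviations from the baseline.
    mu, sigma = mean(baseline_rates), stdev(baseline_rates)
    if sigma > 0 and abs(request_rate - mu) / sigma > threshold:
        return "anomaly"
    return "normal"


baseline = [100, 104, 98, 101, 97, 103, 99, 102]
print(classify(b"id=1' OR '1'='1", 100, baseline))
print(classify(b"GET /home", 500, baseline))
print(classify(b"GET /home", 101, baseline))
```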

Cyber threats can be divided into security threats and privacy threats.

Security threats

Denial of Service. Denial of Service (DoS) is the most common method of attack. The server is flooded with more requests than it can handle, which leads to a crash. Man-in-the-middle is another threat; it is carried out through spoofing and cloning. Malware is dangerous software that is planted in a system and corrupts it.
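
The flooding behaviour behind a DoS attack can be detected, in the simplest case, by counting requests per source within a sliding time window. The sketch below is a plain rate counter rather than a learned model; the class name, limits, and IP address are hypothetical.

```python
from collections import defaultdict, deque


class FloodDetector:
    """Flags a source that sends more than `max_requests` within `window_seconds`."""

    def __init__(self, max_requests=100, window_seconds=10.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # source -> timestamps inside the window

    def observe(self, source, timestamp):
        """Record one request and return True if the source is now flooding."""
        events = self._events[source]
        events.append(timestamp)
        # Drop timestamps that have fallen out of the sliding window.
        while events and timestamp - events[0] > self.window_seconds:
            events.popleft()
        return len(events) > self.max_requests


detector = FloodDetector(max_requests=3, window_seconds=1.0)
burst = [detector.observe("10.0.0.1", t) for t in (0.0, 0.1, 0.2, 0.3)]
later = detector.observe("10.0.0.1", 5.0)
print(burst, later)
```

Only the fourth request inside the one-second window trips the detector; the request five seconds later starts from an empty window again.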

Privacy threats.

Privacy threats include sniffing and the capture of personal data. In some cases the hacker collects data for a long time but does not act. In others, the hacker collects the data and then starts manipulating the system or acts against the person.

The threat of Bots and Botnets:

A botnet is a network of compromised devices, often called bots or zombies, that an attacker controls remotely. Botnets are introduced into the central network. The internet and IoT devices are becoming more prone to botnet attacks, and hackers now use the latest infrastructure for large-scale attacks on cyberspace. Bot traffic is increasing day by day. Bots are of different kinds; the malicious bot is the most dangerous because it enables the attacker to target many users (Sharma, 2018). Malicious bots work by invading IP addresses, trying user name and password combinations, and bypassing captchas. DDoS attacks are the most common use of botnets. Attackers also use botnets to spread malware to websites by compromising internet security and to obtain confidential documents from organizations. Recently, attackers have been using ML with botnets to gather large amounts of data. Botnet operators keep discovering new techniques and applications for attacking cyberspace. Easy access to the internet and the low cost of using bots are advantages for the attackers. Bots and botnets can also attack captchas and fingerprints. Attackers are adopting new techniques to obtain data. Bots and botnets are dangerous

threats to any organization. Organizations need to adopt new techniques and strategies, such as machine learning, to tackle these threats and protect their data from attackers, and they should train their security personnel in the latest technologies. Machine learning techniques can systematically detect both advanced threats and older ones, and can monitor web applications effectively. Machine learning identifies unusual user behaviour and notifies the administrators (Varadarajan, 2020). If the problem persists, ML can block the incoming traffic and inform the staff. Combined with automation, machine learning can give excellent results in detecting and stopping attacks: it examines the incoming data, checks whether any threat is involved, identifies possible threats, and stops the attack.
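
One bot behaviour mentioned above, trying many user name and password combinations, leaves a recognizable trace: a single source fails to log in under many different user names. The sketch below uses a fixed threshold as a stand-in for a learned model; the function name, threshold, and addresses are hypothetical.

```python
from collections import defaultdict


def suspicious_sources(failed_logins, max_usernames=3):
    """Return the sorted source IPs that failed to log in under too many user names.

    failed_logins: iterable of (source_ip, username) pairs, one per failed attempt.
    """
    usernames_by_source = defaultdict(set)
    for source_ip, username in failed_logins:
        usernames_by_source[source_ip].add(username)
    return sorted(
        ip for ip, names in usernames_by_source.items() if len(names) > max_usernames
    )


# One source cycles through ten user names; another mistypes one password four times.
events = [("203.0.113.5", f"user{i}") for i in range(10)]
events += [("198.51.100.7", "alice")] * 4
flagged = suspicious_sources(events)
print(flagged)
```

A flagged source could first be reported to the administrators and blocked only if the behaviour persists, which matches the escalation described above.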

Machine learning offers excellent protection against bot attacks. It uses external threat intelligence to observe malicious activity and collect data. Administrators feed older data to the machine, which develops a method to nullify the threats. Many products are available that implement machine learning techniques such as supervised machine learning. In supervised machine learning, the analyst feeds labeled data to the machine so that it can inspect large volumes of data, analyze them, and take the necessary action. Machine learning can be employed with cloud technology as well.
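
Supervised learning from labeled historical data can be illustrated with a very small nearest-centroid classifier. The two features (requests per minute, failed logins) and the training samples are invented for illustration; production systems would use an established library and far more data.

```python
def train(samples):
    """Compute one centroid per label from (features, label) pairs."""
    sums, counts = {}, {}
    for features, label in samples:
        totals = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            totals[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [t / counts[label] for t in totals] for label, totals in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest to the feature vector."""
    def squared_distance(label):
        return sum((c - x) ** 2 for c, x in zip(centroids[label], features))
    return min(centroids, key=squared_distance)


# Hypothetical labeled history: (requests per minute, failed logins) -> label.
history = [
    ((10, 0), "benign"),
    ((12, 1), "benign"),
    ((300, 40), "bot"),
    ((280, 35), "bot"),
]
model = train(history)
print(predict(model, (15, 2)), predict(model, (250, 30)))
```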

Machine learning is also useful for detecting insider attacks and improving security measures. Some employees may use their login for malicious purposes. Machine learning examines the user's behaviour and the system from which the user has logged in, and bars the login. Another problem with insider attacks is the time the attacker spends in the network without being detected. The attacker may stay in the network for a very long time, until the required information has been gathered. This is very difficult to detect because the attacker is an insider. The attacker may log in directly and access the information, or may log in by compromising the

entire network, gaining complete access to all communication and acquiring all the information needed. Machine learning is well suited to tracking the traffic of insider attacks, and it can detect remote attackers within a very short period. Protection can be administered by connecting all the security systems and networking systems within an organization. An organization should adopt new steps to embrace all the ML procedures: train the employees, document all the traffic, and keep in line with modern techniques.
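
Checking the behaviour and the system from which a user logs in can be sketched as a per-user profile of previously seen hosts and login hours. This is a simple memory of past logins rather than a trained model, and the class, user, and host names are hypothetical.

```python
class LoginProfile:
    """Remembers the hosts and hours of day from which each user has logged in."""

    def __init__(self):
        self._hosts = {}  # user -> set of host names seen so far
        self._hours = {}  # user -> set of login hours (0-23) seen so far

    def learn(self, user, host, hour):
        """Record one legitimate login for the user."""
        self._hosts.setdefault(user, set()).add(host)
        self._hours.setdefault(user, set()).add(hour)

    def is_unusual(self, user, host, hour):
        """Flag a login from an unseen host or at an unseen hour."""
        return (
            host not in self._hosts.get(user, set())
            or hour not in self._hours.get(user, set())
        )


profile = LoginProfile()
profile.learn("alice", "ws-17", 9)
profile.learn("alice", "ws-17", 10)
print(profile.is_unusual("alice", "ws-17", 9))
print(profile.is_unusual("alice", "db-admin-02", 3))
```

A flagged login would still need review by a person, since a legitimate change of workstation or shift looks the same to this check.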

Another field that is especially vulnerable to cyber threats is the Internet of Things (IoT). The IoT is prone to attacks because of the large amount of data it handles and the numerous products that use smart devices (Sharma, 2018). Smart devices must carry information between two parties. The threats are increasing, whereas the security measures are not sufficient. A breach can affect confidentiality, integrity, and accountability. Machine learning enables customers to build security systems according to their needs so that all the data can be kept safe. The data is divided into small packets and kept safe, and several algorithms have been developed to preserve the privacy of the data.

Security teams fear that machine learning will end the role of humans in cybersecurity, but this is not true. Machine learning will not replace human beings; it makes things easier for them. Implemented correctly, machine learning is a game-changer. When machine learning completes a task, it is the administrator's job to interpret the results and train the machine so that it learns which data is correct and which is wrong.

The limitations of machine learning are:

1. Large amounts of data are required. Collecting the data may take a long time, and the data must be checked for correctness.



2. Security personnel must be trained to interpret the findings.

3. Standard benchmark data sets are not available.

4. The same data set can give different results when it is presented again.

5. Old data sets cannot be used. The latest versions of a data set should be used to get better results.

6. Each machine learning technique has its own time complexity.

7. The cost is higher than that of traditional security models.

Chapter 3 Approach/Methodology

The original plan was to survey the university: staff and students would be surveyed on their knowledge of machine learning and cybersecurity measures. Because the duration of the course is short, articles were reviewed instead, and an analysis of advantages and disadvantages was used to compare the effectiveness of machine learning with other cybersecurity measures. The advantages and disadvantages of machine learning are weighed to conclude whether machine learning is useful.

Chapter 4 Findings, Analysis, Synthesis


The use of information technology, the IoT, and networks is increasing, and with it the threats to cyberspace. Attackers are inventing new methods to gather the information they need, using malware, viruses, botnets, and other tools to defeat traditional security measures. It is found that machine learning can detect and stop many of the threats posed by attackers. Machine learning is a branch of artificial intelligence. It can analyze large volumes of data and traffic, and it can defend networks effectively against traditional attacks as well as newer ones such as botnets and threats to the IoT. Many studies have been conducted and many models tested on

the efficacy of machine learning. Machine learning can also be combined with other technologies to fight cyber threats. Even though there are some limitations in its use, machine learning is the best security measure against cyber threats.

Chapter 5 Conclusion

After reviewing surveys, models, and experiments that test the efficacy of machine learning in cybersecurity, it can be concluded that machine learning is a highly effective protection against cyber threats. To use it well, organizations need to train their employees, be able to collect data to feed the machines, invest more money, and be able to analyze the data and give directions to the machines. Machine learning works in combination with other software against cyber threats. It is also observed that machine learning alone cannot protect cyberspace; it requires the help of a human to complete its task.



References

Abdullah, & Syahirah, R. (2013). Revealing the Criterion on Botnet Detection Technique. International Journal of Computer Science Issues, 10(2), pp. 208–215.

Hariri, R. H., Fredericks, E. M., & Bowers, K. M. (2019). J Big Data, 6, 44. https://doi.org/10.1186/s40537-019-0206-3

Iqbal, S., Yosef, B. A., Fawaz, A., & Asif, I. K. (2020). IntruDTree: A Machine Learning-Based Cyber Security Intrusion Detection Model. Symmetry, 12, 754. doi:10.3390/sym12050754

Liu, D., Li, Y., & Thomas, M. (2017). A Roadmap for Natural Language Processing Research in Information. International Conference on System Sciences.

Sabah, A., & Hong, H. (2018). A Survey of Cloud Computing Detection Techniques against DDoS Attacks. Journal of Information Security, 9(1), pp. 45–69.

Schmidt, J., Marques, M., Botti, S., & Marques, M. (2019). Recent advances and applications of machine learning in solid-state materials science. Computational Materials.

Sharma, D., & Sharma, A. (2019). Botnet and Botnet Detection Techniques. Academic Journal of Information Security, 1(1).

Shahid, A. (2014). A Review Paper on Botnet and Botnet Detection Techniques in Cloud Computing. IEEE Symposium on Computers & Informatics.

Shaukat, K., Luo, S., Varadharajan, V., Hameed, I. A., Chen, S., Liu, D., & Li, J. (2020). Performance Comparison and Current Challenges of Using Machine Learning Techniques in Cybersecurity. Energies, 13, 2509. doi:10.3390/en13102509



Soe, Y. N., Feng, Y., Santosa, P. I., Hartanto, R., & Sakurai, K. (2020). Machine Learning-Based IoT-Botnet Attack Detection with Sequential Architecture. Sensors, 20, 4372. doi:10.3390/s20164372

Waheed, N., He, X., Ikram, M., Usman, M., Hashmi, S. S., & Usman, M. (2020). Security and Privacy in IoT Using Machine Learning and Blockchain: Threats and Countermeasures. ACM Computing Surveys, 53(6).

Young, T., Hazarika, D., Poria, S., & Cambria, E. (2018). Recent trends in Deep Learning Based Natural Language Processing. IEEE Computational Intelligence, 13(3).
