Cybersecurity and Machine Learning
Chapter 1 Introduction
1.1 Background/Introduction
Today it is hard to imagine life without modern technologies, be it the internet,
email, web access, digital transactions, or smart devices. All these modern appliances depend
intruders to get the information they want, such as the secrets of competitors and the details
of customers. To stop hackers and intruders, organizations use security measures.
Traditional security measures are becoming inadequate against cyber threats. Internet
access to personal and social networks is being compromised, and network software is
attacked by hackers regularly. Both internal and external factors contribute to the hijacking
of information. Organizations and individuals rely on the security provided by
vendors, but that security is limited to a certain extent. Hackers keep
discovering new methods to penetrate security systems and steal very important
information about organizations and individuals. Primary security systems cannot monitor
the large volume of traffic that passes through an organization's network.
learning is one of the strategic initiatives for cybersecurity and acts as a helping hand to the
human security professional. Machine learning has a great advantage in that it offers round-
the-clock security of the network by analyzing multiple attacks that may be missed by normal
security measures.
1.2 Goal
The goal of this paper is to discuss the role of Machine Learning as a Cybersecurity
measure. The paper illustrates the different types of Cyberattacks and also discusses Machine
learning in detail.
a. What are the various tools used by Cybercriminals to invade the host's security?
c. What are the pros and cons of using Machine learning in Cybersecurity?
Though organizations employ many cybersecurity measures,
attackers are coming up with advanced cyberattacks that traditional defense measures find
difficult to handle. So, there must be strong and innovations to prevent
As machine learning is a relatively new subject, many corporations do not yet employ it, so
little data is available on the use of machine learning in cybersecurity. There is not much
networks, computers, data, and information of various organizations and individuals from
cyberattacks, hacking of information, etc. Cybersecurity is of three types: 1. Cloud-based,
Artificial Intelligence.
information and reach others through the web, internet, and social media like Twitter,
Facebook, Instagram, and other applications. The development of smart devices has made it easy
to gather data and access information, which was very difficult with ordinary
systems. This has reduced the time needed to obtain the information. The ease of use of
smart devices has increased the security challenges that organizations face at
present. Because a large amount of data transfer is involved, security is also being compromised.
Most of the individuals who operate smart devices do not know the adverse effects caused
by sharing their information on social media and the internet of things. They expose
themselves because they do not know the security measures or do not keep up with updated security
procedures. As many organizations are now trying to reach the maximum number of
customers globally, the data being collected is huge, and this creates many security
threats. Many incidents of compromised network security have been reported recently. Even
though the experts are well equipped with the most advanced security systems they could not
Cyberattacks have become common. Hackers use malware and viruses to steal important
information and affect the business of organizations. Interaction with the help of smart
devices, the internet, and social media has increased the risk of cyberattacks. The attacks may
be interstate or intercompany, or may come from external sources. Employees who are unhappy with the
company they work for may also be involved. Competitors also try hacking the
Cybersecurity is becoming more and more important as cyber threats
increase day by day. So, cyberspace is facing a big challenge to protect the data from
computers, networks, programs, and data from attacks and unauthorized access, alteration, or
destruction. A network security system consists of a network security system and a computer
security system. Each of these systems includes firewalls, antivirus software, and intrusion
detection systems (IDS). IDSs help discover, determine, and identify unauthorized system
Governments and corporations should install the most advanced security systems to protect
their data. Online transactions are increasing day by day and hackers are using the most
modern ways to steal the data. Security teams should be provided with the most advanced
tools. As hackers are using the most sophisticated methods for security breaches, rule-based,
signature-based, and firewall security are failing regularly. Machine learning provides
advanced security where the traditional process has failed. Machine learning identifies the
attack and goes deeper into the problem, for example by tracing the source of the attack. Machine learning has
like malware analysis, threat analysis, anomaly-based intrusion detection of prevalent attacks
methods in detecting zero-day attacks or even slight variants of known attacks, machine
plays a major role in detecting the Cyber threats. ML can be used by both attacker and
defender (Sarkar, 2020). The attacker will try to find a way to penetrate through the firewall
and security system using ML. On the defender side, ML helps in detecting malware and
other threats.
Using machine learning against different types of cyber threats is showing very good
results compared to older methods. Even though machine learning is very advanced, it
can also fail against threats from attackers. There is a need to develop a more advanced machine
Machine learning is one of the AI techniques for big data analytics. Research in
these fields has been going on for some years, and the techniques are being introduced into fields such as
biology and chemistry. These are introduced to maintain interest among the concepts such as
large data sets and algorithms (Hariri, 2019). Machine learning algorithms are used for
classification, regression, and clustering, or to reduce the complexity of tasks associated with large
data sets. They extend human abilities and are widely employed in everyday life for
image and speech recognition, email/spam filtering, credit scoring, fraud detection, and so on.
These create the models to predict and for knowledge discovery. Data-driven decision-
There are various uncertainties associated with ML. If the uncertainties and the
disadvantages related to ML are removed, there will be new developments in these
fields. Hariri (2019) has presented the uncertainties that occur in machine learning.
These are related to incomplete training, inconsistent classification, and learning from
low-veracity and noisy data. In natural language processing, the uncertainties relate to keyword
search, word ambiguity, and classification. So, future research must concentrate on these uncertainties and
develop new services. Such developments can be illustrated by how machine learning was
introduced into solid-state materials science. Experiments must be done to identify the
key elements in new materials. New systems are being developed thanks to the increase in
computing power and the development of more efficient codes. In an experiment, all the
codes may work, but in reality there are many hurdles in making the code work.
The Materials Genome Initiative may be used to close the gap between experiment
and theory. In the experiments, they use only a small set of data that can be mapped to a large
number of CPU hours. So, algorithms must be developed to handle large amounts of
data. The two-layered networks which are used now are not efficient. They need some human
intervention. This can be avoided with deep networks. Neural networks with multiple
hidden layers form deep networks. The concept behind deep networks is that
there must be different levels of abstraction without human intervention. There are no novel
Surrogate-based optimization must be used to get correct results with a smaller budget.
Further research must be done on this concept. Component prediction can also be used
to improve speed (Schmidt, 2019). Deep learning helps in the computation of large data with
less code and engineering. Labeled data may not be available in most situations. In these
cases, zero-shot learning must be employed. More research must be done on reading
unlabeled data. There must be more reinforcement learning methods that help in resolving this
issue. The symbolic and non-symbolic methods must be combined. This makes a good
read a huge amount of data. Several researchers have mentioned that all prototypical tasks
must be given importance. There are about 12 prototypical tasks that must be considered.
Artifacts must be developed to target the real world rather than artificial settings. The problem
with ML is that it requires a lot of data to be fed. ML also needs to be handled by well-trained
employees.
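The surrogate-based optimization mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions: `expensive_objective` is a hypothetical stand-in for a costly simulation, and the surrogate is a simple parabola fitted through three evaluations, not the method of any cited work.

```python
def expensive_objective(x):
    """Hypothetical stand-in for a costly simulation or experiment."""
    return (x - 2.0) ** 2 + 1.0

def surrogate_minimum(points):
    """Fit a parabola through three (x, y) points and return the x of its minimum."""
    (x0, y0), (x1, y1), (x2, y2) = points
    # Newton form of the surrogate: y0 + a*(x - x0) + b*(x - x0)*(x - x1)
    a = (y1 - y0) / (x1 - x0)
    b = ((y2 - y1) / (x2 - x1) - a) / (x2 - x0)
    if b <= 0:
        raise ValueError("the fitted parabola has no minimum")
    # Setting the derivative a + b*(2x - x0 - x1) to zero gives the vertex.
    return (x0 + x1) / 2 - a / (2 * b)

# Only three costly evaluations are spent before the cheap surrogate is minimized.
samples = [(x, expensive_objective(x)) for x in (0.0, 1.0, 4.0)]
x_next = surrogate_minimum(samples)
print(x_next)  # 2.0
```

Because the stand-in objective is itself quadratic, the surrogate is exact here. For a general objective, the proposed point would be evaluated and the fit repeated.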
Machine learning is widely used in everyday life. A huge amount of research is going
on into these concepts. People want machines to talk and do the work effectively in a short
amount of time. Making this possible involves complex coding and algorithms. The
Security breaches include external intrusions and internal intrusions. There are three
main types of network analysis for IDSs: misuse-based (also known as signature-based),
anomaly-based, and hybrid. Misuse-based detection techniques aim to detect known attacks
by using the signatures of these attacks. They are used for known types of attacks without
generating a large number of false alarms. However, administrators often must manually
update the database rules and signatures. New (zero-day) attacks cannot be detected by
misuse-based techniques.
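A minimal sketch of misuse-based detection follows. It assumes that signatures are plain byte substrings. The signature names, patterns, and payloads are hypothetical, and real IDS rule languages are considerably richer.

```python
# Hypothetical signature database: name -> byte pattern known to appear in an attack.
SIGNATURES = {
    "sql_injection": b"' OR '1'='1",
    "path_traversal": b"../../",
}

def match_signatures(payload, signatures=SIGNATURES):
    """Return the sorted names of all known signatures found in the payload."""
    return sorted(name for name, pattern in signatures.items() if pattern in payload)

known = match_signatures(b"GET /item?id=1' OR '1'='1 HTTP/1.1")
novel = match_signatures(b"GET /item?id=1 SOME-NEW-TRICK HTTP/1.1")
print(known)  # ['sql_injection']
print(novel)  # []
```

The second payload produces no alert because its pattern is not in the database. This is the zero-day limitation described above, and it is why the signature list has to be updated by hand.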
Anomaly-based techniques study the normal network and system behaviour and
identify anomalies as deviations from normal behaviour. They are appealing because of their
capacity to detect zero-day attacks. Another advantage is that the profiles of normal activity
are customized for every system, application, or network, therefore making it difficult for
attackers to know which activities they can perform undetected. Additionally, the data on
which anomaly-based techniques alert (novel attacks) can be used to define the signatures for
misuse detectors. The main disadvantage of anomaly-based techniques is the potential for
high false alarm rates because previously unseen system behaviours can be categorized as
anomalies.
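As a rough illustration of the idea, the sketch below learns a profile of normal activity and flags deviations from it. The single feature (requests per minute), the sample values, and the three-standard-deviation threshold are all assumptions made for the example.

```python
from statistics import mean, stdev

def fit_profile(normal_values):
    """Learn a per-system profile (mean, standard deviation) from normal activity."""
    return mean(normal_values), stdev(normal_values)

def is_anomaly(value, profile, threshold=3.0):
    """Flag a value lying more than `threshold` standard deviations from the mean."""
    mu, sigma = profile
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold

# Hypothetical requests per minute observed on one host during normal operation.
baseline = [98, 102, 101, 97, 100, 103, 99, 100]
profile = fit_profile(baseline)

print(is_anomaly(101, profile))  # False
print(is_anomaly(450, profile))  # True
print(is_anomaly(110, profile))  # True
```

The last call shows the false alarm problem: a load of 110 may be legitimate, but it was never seen during profiling, so it is reported as an anomaly.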
Hybrid detection combines misuse and anomaly detection. It is used to increase the
detection rate of known intrusions and to reduce the false-positive rate of unknown attacks.
Most ML/DL methods are hybrids. Machine learning is becoming more popular in tackling
Cyber threats.
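The combination can be sketched as a signature check followed by an anomaly check, together with the two rates mentioned above. The events, the signature, and the profile values below are hypothetical, and `rates` assumes that the data contains both attack and benign events.

```python
def hybrid_detect(event, signatures, profile, threshold=3.0):
    """Classify one event as 'known_attack', 'anomaly', or 'normal'."""
    # Misuse stage: any known signature in the payload settles the verdict.
    if any(sig in event["payload"] for sig in signatures):
        return "known_attack"
    # Anomaly stage: only reached by events that no signature matched.
    mu, sigma = profile
    if sigma > 0 and abs(event["rate"] - mu) / sigma > threshold:
        return "anomaly"
    return "normal"

def rates(labels, predictions):
    """Return (detection rate, false-positive rate) for boolean attack labels."""
    tp = sum(1 for y, p in zip(labels, predictions) if y and p)
    fn = sum(1 for y, p in zip(labels, predictions) if y and not p)
    fp = sum(1 for y, p in zip(labels, predictions) if not y and p)
    tn = sum(1 for y, p in zip(labels, predictions) if not y and not p)
    return tp / (tp + fn), fp / (fp + tn)

signatures = [b"../../"]
profile = (100.0, 2.0)  # assumed mean and standard deviation of requests per minute
events = [
    {"payload": b"GET /../../etc/passwd", "rate": 100, "attack": True},
    {"payload": b"GET /index.html", "rate": 400, "attack": True},
    {"payload": b"GET /index.html", "rate": 101, "attack": False},
    {"payload": b"GET /report.pdf", "rate": 110, "attack": False},
    {"payload": b"GET /about.html", "rate": 99, "attack": False},
]
verdicts = [hybrid_detect(e, signatures, profile) for e in events]
detection_rate, false_positive_rate = rates(
    [e["attack"] for e in events], [v != "normal" for v in verdicts]
)
print(verdicts)
print(detection_rate, false_positive_rate)
```

In this toy data both attacks are caught, one by each stage, while one benign event is still flagged by the anomaly stage.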
Cyber threats can be divided into security threats and privacy threats.
Security threats
Denial of Service. Denial of Service (DoS) is the most common method of attack. In
this attack, the server is flooded with more requests than it can handle, which leads to a
crash. Man-in-the-middle is another threat. It is carried out through spoofing and cloning. Malware is
dangerous software that is planted in a system, thereby corrupting it.
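A request flood of the kind described for DoS can be spotted with a simple sliding-window counter. The sketch below is a plain rate threshold, not a machine learning method. The class name, the limits, and the addresses are hypothetical.

```python
from collections import defaultdict, deque

class FloodDetector:
    """Flag a source that sends more than `max_requests` within `window` seconds."""

    def __init__(self, max_requests=5, window=1.0):
        self.max_requests = max_requests
        self.window = window
        self._seen = defaultdict(deque)  # source -> timestamps still inside the window

    def observe(self, source, timestamp):
        """Record one request and return True if the source now exceeds the limit."""
        times = self._seen[source]
        times.append(timestamp)
        # Drop timestamps that have fallen out of the sliding window.
        while times and timestamp - times[0] > self.window:
            times.popleft()
        return len(times) > self.max_requests

detector = FloodDetector(max_requests=5, window=1.0)
# Hypothetical traffic: one client sends 10 requests in 0.9 s, another sends 3 in 0.8 s.
flood = [detector.observe("10.0.0.9", i * 0.1) for i in range(10)]
calm = [detector.observe("10.0.0.7", i * 0.4) for i in range(3)]
print(any(flood), any(calm))  # True False
```

A per-source counter like this only covers a flood from a single address. A distributed attack would need counts aggregated across sources.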
Privacy threats.
Privacy threats include threats like sniffing and capturing personal data. In privacy
threats, the hacker may collect the data for a long time without acting. Alternatively, the
hacker collects the data and starts manipulating the system or acts against the person.
A botnet is a collection of bots, viruses, and zombies. Botnets are introduced into
the central network. Nowadays the internet and IoT devices are becoming more prone to botnet
attacks. Hackers are now using the latest infrastructure for large-scale attacks on
cyberspace. Such infrastructure is also referred to as bots or botnets. The traffic of bots is increasing day by day.
Bots are of different kinds. A malicious bot is the most dangerous kind, as it enables the attacker
to target many users (Sharma, 2018). Malicious bots work by invading IP
addresses, trying user name and password combinations, and bypassing CAPTCHAs. This
whole process is carried out by a botnet. DDoS attacks are the most common use of botnets. Attackers
use a botnet to spread malware into websites by compromising the security of the internet,
and to obtain confidential papers from organizations. Recently attackers have been using ML with
botnets to gather large amounts of data. Botnet operators are discovering new techniques and applications to
attack cyberspace. Easy access to the internet and the low cost of utilizing bots are
advantages for the attackers. Bots and botnets can also attack CAPTCHAs and fingerprints.
Attackers are adopting new techniques to obtain data. Bots and botnets are dangerous
threats to any organization. Organizations need to adopt new techniques to tackle the
situation. They should implement new strategies such as machine learning to tackle the
threats and protect the data from attackers, and should train their security personnel
in using the latest technologies. Machine learning techniques can systematically detect both
advanced and older threats. Machine learning can monitor a web
application very effectively. It identifies unusual user behaviour
and notifies the administrators (Varadarajan, 2020). If the problem is
persistent, ML can block the incoming traffic and inform the staff. Machine
learning combined with automation can give excellent results in detecting and stopping
attacks. Machine learning examines the incoming data and inspects if any threat is involved.
Learning uses external threat intelligence to observe malicious information and collect the
data. The administrators should feed older data to the machine, and it will develop a
method to nullify the threats. Many products are available to implement machine learning,
for example supervised machine learning. In supervised machine learning, the analyst feeds labeled data
to the machine so that it can inspect large volumes of data, analyze them, and take the necessary action.
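The supervised approach can be sketched with a nearest-centroid classifier: the analyst supplies labeled history, the model learns one average point per class, and new traffic is assigned to the closest class. The two features and all values are hypothetical, and nearest-centroid is only one simple choice of algorithm, not one named by the sources.

```python
from math import dist

def train_centroids(samples, labels):
    """Compute one mean feature vector (centroid) per class from labeled data."""
    grouped = {}
    for features, label in zip(samples, labels):
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }

def predict(centroids, features):
    """Assign the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], features))

# Hypothetical labeled history: (requests per minute, failed logins per minute).
history = [(20, 0), (25, 1), (30, 0), (400, 30), (380, 45), (420, 25)]
labels = ["benign", "benign", "benign", "attack", "attack", "attack"]

model = train_centroids(history, labels)
print(predict(model, (28, 1)))    # benign
print(predict(model, (390, 40)))  # attack
```

The quality of the result depends entirely on the labeled history, which is why the older data fed to the machine matters.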
Machine learning is also useful to detect attacks by insiders and improve the
security measures. Some employees may use their login for malicious purposes.
Machine learning detects the behaviour and the system from which the employee has logged in, and bars the
login. Another problem with insider attacks is the time the attacker spends in the network
without being detected. The attacker will stay in the network for a long time, until he gathers the
information he requires. This is very difficult to detect as the attacker is an insider. The
attacker may directly log in and access the information, or he will log in by compromising the
entire network, where he will have complete access to all the communication and will acquire
all the information he needs. Machine learning is well suited to tracking the traffic of insider
attacks. Machine learning can detect remote attackers within a very short period. The
protection can be administered by connecting all the security systems and networking
systems within an organization. An organization should adopt some new steps to embrace all
the ML proceedings, to give training to the employees, document all the traffic, and be in line
Another field that is more vulnerable to cyber threats is the Internet of Things (IoT). IoT is
more prone to attacks because of the large data it handles and the numerous products that use
smart devices (Sharma, 2018). The use of smart devices requires carrying information
between two parties. The threats are increasing whereas the security measures are not
sufficient. A breach affects confidentiality, integrity, and accountability.
Machine learning enables customers to build security systems as per their needs so that all
the data can be kept safe. The data will be divided into small packets and kept safe. Several
Security teams feel that machine learning will end the role of the human in the
cybersecurity scenario. But this is not true. Machine learning will not replace human beings.
It makes things easier for them. If we implement machine learning correctly, it is a
game-changer. When machine learning completes the task, it is the job of the administrator to
interpret the results and train the machine so that it learns which data is correct and which
is wrong.
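The division of labour described here, where the machine raises alerts and the administrator corrects them, can be sketched as a small feedback loop. The threshold rule, the feature (failed logins per hour), and the values are assumptions chosen only to keep the example short.

```python
def fit_threshold(examples):
    """Place the alert threshold midway between the mean benign and mean attack value."""
    benign = [value for value, is_attack in examples if not is_attack]
    attack = [value for value, is_attack in examples if is_attack]
    return (sum(benign) / len(benign) + sum(attack) / len(attack)) / 2

# Hypothetical labeled data: (failed logins per hour, is_attack).
training = [(2, False), (4, False), (40, True), (60, True)]
threshold = fit_threshold(training)   # (3 + 50) / 2 = 26.5

alert_value = 30
flagged_before = alert_value > threshold

# The administrator reviews the alert, judges it a false alarm, and the
# corrected label is added to the training data before refitting.
training.append((alert_value, False))
threshold = fit_threshold(training)   # (12 + 50) / 2 = 31.0
flagged_after = alert_value > threshold

print(flagged_before, flagged_after)  # True False
```

The machine does the counting, but the decision that the alert was a false alarm comes from the human reviewer.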
1. Large data is required. Collecting the data may take a long time. The data may be correct or
2. Training should be given to the software security personnel to interpret the findings.
5. Old data sets cannot be used. The latest versions of the data set should be used to get better
results.
Chapter 3 Approach/Methodology
The primary approach was to survey the University. The staff and the students were to be
surveyed on their knowledge of machine learning and cybersecurity measures. But as the
duration of the course is short, articles are reviewed instead, and an analysis of advantages and
disadvantages is used to compare the effectiveness of machine learning among the various
weighed, and conclusions will be drawn on whether machine learning is useful or not.
increase in use, the threats to cyberspace are also increasing. Attackers are inventing new
methods to gather the information they need. They are using malware, viruses, botnets, etc. to
defeat the traditional security measures. It is found that machine learning can detect and stop
all the threats posed by the attackers. Machine learning is a branch of artificial intelligence.
Machine learning can handle large data and large traffic. Machine learning can also effectively defend
networks against traditional attacks as well as newer threats such as botnets and threats
to IoT. Many studies were conducted and many models were tested about
the efficacy of Machine learning. Machine learning can also be combined with other
technologies to fight against cyber threats. Even though there are some limitations in the
usage of Machine Learning it is the best security measure against Cyber threats.
Chapter 5 Conclusion
After going through some of the surveys, models, and experiments that test the efficacy of
machine learning in the field of cybersecurity, it can be concluded that machine learning is
the perfect protection against any cyber threats. It should be noted that employees should
be given good training, and that organizations need the capability to collect data to feed the
machines, to invest more money, and to analyze the data and give directions to the machines. Machine
learning works in combination with other software against all cyber threats. It is also observed
that machine learning alone cannot protect cyberspace but requires the help of a human to
References
Abdullah, & Syahirah, R. (2013). Revealing the Criterion on Botnet Detection Technique.
0206-3
Iqbal, S., Yosef, B. A., Fawaz, A., & Asif, I. K. (2020). IntruDTree: A Machine Learning-
doi:10.3390/sym12050754
Liu, D., Li, Y., & Thomas, M. (2017). A Roadmap for Natural Language Processing Research
Sabah, A., & Hong, H. (2018). A Survey of Cloud Computing Detection Techniques against
Schmidt, J., Marques, M., Botti, S., & Marques, M. (2019). Recent advances and applications
Sharma, D., & Sharma, A. (2019). Botnet and Botnet Detection Techniques. Academic Journal
of Information Security, 1, 1.
Shahid, A. (2014). A Review Paper on Botnet and Botnet Detection Techniques in Cloud
Shaukat, K., Luo, S., Varadharajan, V., Hameed, I. A., Chen, S., Liu, D., & Li, J. (2020).
Soe, Y. N., Feng, Y., Santosa, P. I., Hartanto, R., & Sakurai, K. (2020). Machine Learning-
Sensors, 20, 4372. doi:10.3390/s20164372
Waheed, N., He, X., Ikram, M., Usman, M., Hashmi, S. S., & Usman, M. (2020). Security and
Young, T., Hazarika, D., Poria, S., & Cambria, E. (2018). Recent trends in Deep Learning