
Reg. No. :
Name :

GRADE IMPROVEMENT EXAMINATION - February 2024


Programme : ALL Semester : NA
Course Title : Machine Learning in Cyber Security Course Code : CSD6011
Time : 1½ hours Max. Marks : 50

Answer ALL the Questions

Q. No. Question Description Marks


PART A – (30 Marks)
1 (a) Imagine you are a cybersecurity analyst working for a financial institution that 10
processes a vast amount of customer data. The organization is keen to enhance its
security measures using machine learning techniques. Describe a machine learning-
based solution to detect and prevent insider threats within the organization.
How would you design, implement, and fine-tune a machine learning model to
identify suspicious behaviour or unauthorized access by employees or other
insiders? Highlight the key data sources, algorithms, and evaluation metrics you
would consider for this security application.
OR
(b) Considering various attributes in the cybersecurity domain, such as 'Attack Vector,' 10
'Firewall Rules,' 'IDS Alerts,' 'Vulnerability Severity,' and 'Packet Loss Rate,'
Classify each attribute as either qualitative (nominal or ordinal) or quantitative
(interval or ratio) and briefly explain your reasoning. Additionally, for the
quantitative attribute(s), describe which operations are relevant in the context of
cybersecurity analysis. Finally, discuss how the nature of these attributes can impact
cybersecurity strategies and decision-making within an organization.

2 (a) You have a test set consisting of five cyber data points: z1, z2, z3, z4, and z5. When 10
you apply a classifier F to these data points (without labels), you obtain the
following predictions:
F(z1) = 1
F(z2) = 1
F(z3) = -1
F(z4) = -1
F(z5) = 1
Out of these predictions, F correctly classified z1, z4, and z5 but incorrectly
classified z2 and z3. Calculate the following metrics based on these predictions:
i. True Positives (TP)
ii. False Positives (FP)
iii. True Negatives (TN)
iv. False Negatives (FN)
and hence compute the following parameters.
i. Accuracy

ii. Precision
iii. Recall (Sensitivity)
iv. F1 Measure
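
A minimal sketch of how the quantities in Question 2(a) could be tallied, assuming that the label 1 denotes the positive class and -1 the negative class (the question does not state this convention). The names `predictions`, `correctly_classified`, and `confusion_counts` are illustrative only.

```python
# Predictions of classifier F on the five test points, as given in the question.
predictions = {"z1": 1, "z2": 1, "z3": -1, "z4": -1, "z5": 1}
# Points the question states were classified correctly.
correctly_classified = {"z1", "z4", "z5"}


def confusion_counts(predictions, correctly_classified, positive=1):
    """Return (TP, FP, TN, FN). Treating `positive` as the positive
    class label is an assumption, not something the question fixes."""
    tp = fp = tn = fn = 0
    for name, label in predictions.items():
        is_correct = name in correctly_classified
        if label == positive:
            if is_correct:
                tp += 1  # predicted positive, truly positive
            else:
                fp += 1  # predicted positive, truly negative
        else:
            if is_correct:
                tn += 1  # predicted negative, truly negative
            else:
                fn += 1  # predicted negative, truly positive
    return tp, fp, tn, fn


tp, fp, tn, fn = confusion_counts(predictions, correctly_classified)
accuracy = (tp + tn) / (tp + fp + tn + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print("TP, FP, TN, FN:", tp, fp, tn, fn)
print("accuracy, precision, recall, F1:", accuracy, precision, recall, f1)
```

The correctness information, rather than explicit ground-truth labels, is enough here: a correct prediction of 1 is a true positive, an incorrect prediction of 1 is a false positive, and likewise for -1.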

OR
(b) Suppose you have 4 log entries related to user login behaviour, with two features: 10
1. Risk Score (x1)
2. Security Alert Count (x2)
Instance   Risk Score (x1)   Security Alert Count (x2)
A            5                 3
B           -1                 1
C            1                -2
D           -3                -2

Apply the k-means algorithm with k = 2 to identify and categorize different patterns of
user login behaviour, which may be indicative of security risks or operational
concerns.
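
A minimal sketch of the k-means iteration on the four instances, assuming squared Euclidean distance and assuming instances A and B as the initial centroids. The question specifies neither, and a different initialisation may lead to a different partition. The function names are illustrative only.

```python
# The four log entries from the question: (risk score x1, alert count x2).
points = {"A": (5, 3), "B": (-1, 1), "C": (1, -2), "D": (-3, -2)}


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(data, initial_centroids, max_iter=100):
    """Plain k-means: alternate assignment and centroid update until
    the assignment stops changing."""
    centroids = list(initial_centroids)
    assignment = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for each point.
        new_assignment = [
            min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in data
        ]
        if new_assignment == assignment:
            break  # converged
        assignment = new_assignment
        # Update step: each centroid becomes the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, a in zip(data, assignment) if a == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return assignment, centroids


names = list(points)
data = [points[n] for n in names]
# Assumed initial centroids: instances A and B.
assignment, centroids = kmeans(data, [points["A"], points["B"]])

for name, cluster in zip(names, assignment):
    print(name, "-> cluster", cluster)
print("centroids:", centroids)
```

Under these assumptions the procedure stabilises after one centroid update, with A alone in one cluster and B, C, and D in the other.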

3 (a) In the field of cybersecurity, organizations often use clustering algorithms to identify 10
patterns and anomalies in network data.
K-Medoid is a popular clustering algorithm for such purposes. Consider a
cybersecurity scenario where you have a dataset of network traffic logs, and you
want to apply the K-Medoid algorithm for cluster analysis.
i. Discuss the basic concept of the K-Medoid algorithm and how it works in the
context of cybersecurity.
ii. Discuss the importance of selecting an appropriate distance metric for K-
Medoid clustering when dealing with network data.
iii. Suppose you have a dataset of network log entries with various features,
including source IP, destination IP, port, and timestamp. Describe how you
would preprocess this data to make it suitable for K-Medoid clustering.
iv. Provide an example of a situation where K-Medoid clustering can be
valuable in cybersecurity, and explain how the clusters identified by K-
Medoid can assist in improving network security.

OR
(b) Consider the following distance matrix, which represents the pairwise distances 10
between five data points:

\[
D = \begin{bmatrix}
0 & 9 & 3 & 6 & 11 \\
9 & 0 & 7 & 5 & 10 \\
3 & 7 & 0 & 9 & 2 \\
6 & 5 & 9 & 0 & 8 \\
11 & 10 & 2 & 8 & 0
\end{bmatrix}
\]
Perform single linkage clustering iteratively, merging clusters based on the
minimum distance between them, until only one cluster remains. Record the
intermediate clusters at each step.
Draw a dendrogram that illustrates the clustering process step by step.
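
A minimal sketch of single linkage agglomerative clustering on the distance matrix above, assuming the five points are labelled 1 to 5 in row order (the question does not name them). The function `single_linkage` is illustrative only; each recorded merge corresponds to one join in the dendrogram, at the height given by the merge distance.

```python
# Pairwise distance matrix from the question (symmetric, zero diagonal).
D = [
    [0, 9, 3, 6, 11],
    [9, 0, 7, 5, 10],
    [3, 7, 0, 9, 2],
    [6, 5, 9, 0, 8],
    [11, 10, 2, 8, 0],
]


def single_linkage(D):
    """Repeatedly merge the two clusters whose closest members are
    nearest to each other. Returns a list of (left, right, distance)
    merges, with points labelled 1..n."""
    clusters = [frozenset([i]) for i in range(len(D))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(D[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        left = sorted(i + 1 for i in clusters[a])
        right = sorted(i + 1 for i in clusters[b])
        merges.append((left, right, d))
        merged = clusters[a] | clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


merges = single_linkage(D)
for left, right, d in merges:
    print("merge", left, "and", right, "at distance", d)
```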

PART B – (20 Marks)

4 Discuss the application of the Naive Bayes classifier in the context of cybersecurity 10
data analysis. How does the Naive Bayes algorithm work, and what are the
advantages and limitations of using it to classify and detect cyber threats? Provide an
example of a cybersecurity scenario where the Naive Bayes classifier can be
effectively employed.
5 Discuss the role of Machine Learning in profiling network traffic. How can machine 10
learning techniques be applied to analyse and classify network traffic patterns for
security and performance monitoring?


