CSD6011 - Machine Learning For Cyber Security
Name :
2 (a) [10 marks] You have a test set consisting of five cyber data points: z1, z2, z3, z4, and z5. When
you apply a classifier F to these data points (without labels), you obtain the
following predictions:
F(z1) = 1
F(z2) = 1
F(z3) = -1
F(z4) = -1
F(z5) = 1
Of these predictions, F classified z1, z4, and z5 correctly but z2 and z3
incorrectly. Calculate the following metrics based on these predictions:
i. True Positives (TP)
ii. False Positives (FP)
iii. True Negatives (TN)
iv. False Negatives (FN)
Hence, compute the following measures:
i. Accuracy
ii. Precision
iii. Recall (Sensitivity)
iv. F1 Measure
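
The counts and measures asked for above can be checked with a short sketch. It assumes that the label 1 is the positive class and -1 the negative class, which the question does not state explicitly; the names `predictions` and `correct` are illustrative only.

```python
# Predictions of classifier F, as given in the question.
predictions = {"z1": 1, "z2": 1, "z3": -1, "z4": -1, "z5": 1}
# Points the question says F classified correctly.
correct = {"z1", "z4", "z5"}

# Assumption: 1 is the positive class, -1 the negative class.
tp = sum(1 for z, p in predictions.items() if p == 1 and z in correct)
fp = sum(1 for z, p in predictions.items() if p == 1 and z not in correct)
tn = sum(1 for z, p in predictions.items() if p == -1 and z in correct)
fn = sum(1 for z, p in predictions.items() if p == -1 and z not in correct)

accuracy = (tp + tn) / len(predictions)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"TP={tp} FP={fp} TN={tn} FN={fn}")
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

Under that assumption the counts are TP = 2, FP = 1, TN = 1, FN = 1, giving an accuracy of 3/5 and precision, recall, and F1 of 2/3 each.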
OR
(a) [10 marks] Suppose you have four log entries related to user login behaviour, with two features:
1. Risk Score (x1)
2. Security Alert Count (x2)
Instance   Risk Score (x1)   Security Alert Count (x2)
A           5                 3
B          -1                 1
C           1                -2
D          -3                -2
Apply the k-means algorithm with k = 2 to identify and categorize different patterns of
user login behaviour, which may be indicative of security risks or operational
concerns.
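
The k-means iterations for these four points can be traced with a minimal sketch. The question does not specify the initial centroids, so the sketch assumes instances A and B as the starting centroids and uses squared Euclidean distance; a different initialisation may need more iterations or give a different grouping. The helper names `sq_dist` and `kmeans` are illustrative.

```python
points = {"A": (5, 3), "B": (-1, 1), "C": (1, -2), "D": (-3, -2)}

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, centroids, max_iter=100):
    """Plain k-means: alternate assignment and centroid update until stable."""
    centroids = list(centroids)
    assignment = {}
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_assignment = {
            name: min(range(len(centroids)),
                      key=lambda i: sq_dist(p, centroids[i]))
            for name, p in points.items()
        }
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for i in range(len(centroids)):
            members = [points[n] for n, c in assignment.items() if c == i]
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(v) / len(members)
                                     for v in zip(*members))
    return assignment, centroids

# Assumption: A and B are chosen as the initial centroids.
assignment, centroids = kmeans(points, [points["A"], points["B"]])
print(assignment)
print(centroids)
```

With this initialisation the procedure converges after one update to the clusters {A} and {B, C, D}, with centroids (5, 3) and (-1, -1).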
3 (a) [10 marks] In the field of cybersecurity, organizations often use clustering algorithms to identify
patterns and anomalies in network data.
K-Medoid is a popular clustering algorithm for such purposes. Consider a
cybersecurity scenario where you have a dataset of network traffic logs, and you
want to apply the K-Medoid algorithm for cluster analysis.
i. Discuss the basic concept of the K-Medoid algorithm and how it works in the
context of cybersecurity.
ii. Discuss the importance of selecting an appropriate distance metric for
K-Medoid clustering when dealing with network data.
iii. Suppose you have a dataset of network log entries with various features,
including source IP, destination IP, port, and timestamp. Describe how you
would preprocess this data to make it suitable for K-Medoid clustering.
iv. Provide an example of a situation where K-Medoid clustering can be
valuable in cybersecurity, and explain how the clusters identified by
K-Medoid can assist in improving network security.
OR
(b) [10 marks] Consider the following distance matrix, which represents the pairwise distances
between five data points:
\[
D = \begin{bmatrix}
0 & 9 & 3 & 6 & 11 \\
9 & 0 & 7 & 5 & 10 \\
3 & 7 & 0 & 9 & 2 \\
6 & 5 & 9 & 0 & 8 \\
11 & 10 & 2 & 8 & 0
\end{bmatrix}
\]
Perform single linkage clustering iteratively, merging clusters based on the
minimum distance between them, until only one cluster remains. Record the
intermediate clusters at each step.
Draw a dendrogram that illustrates the clustering process step by step.
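
The merge sequence that the dendrogram should show can be checked with a small sketch of single linkage. It assumes the five points are numbered 1 to 5 in the row and column order of D (the question does not name them); the function name `single_linkage` is illustrative.

```python
# Distance matrix D from the question; row/column k is point k (assumed labels 1..5).
D = [
    [0, 9, 3, 6, 11],
    [9, 0, 7, 5, 10],
    [3, 7, 0, 9, 2],
    [6, 5, 9, 0, 8],
    [11, 10, 2, 8, 0],
]

def single_linkage(D):
    """Agglomerative clustering where cluster distance is the minimum pairwise distance."""
    clusters = [frozenset([i]) for i in range(1, len(D) + 1)]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest single-linkage distance.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(D[i - 1][j - 1]
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merged = clusters[a] | clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
        merges.append((sorted(merged), d))  # record the merge and its height
    return merges

merges = single_linkage(D)
for members, height in merges:
    print(f"merge at distance {height}: {members}")
```

Under that labelling, points 3 and 5 merge at distance 2, point 1 joins them at 3, points 2 and 4 merge at 5, and the two remaining clusters join at 6. These four distances are the heights of the dendrogram's branches.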
4 [10 marks] Discuss the application of the Naive Bayes classifier in the context of cybersecurity
data analysis. How does the Naive Bayes algorithm work, and what are the
advantages and limitations of using it to classify and detect cyber threats? Provide an
example of a cybersecurity scenario where the Naive Bayes classifier can be
effectively employed.
5 [10 marks] Discuss the role of machine learning in profiling network traffic. How can machine
learning techniques be applied to analyse and classify network traffic patterns for
security and performance monitoring?