Individual 2022 Assignment New PDF
Individual Assignment
Algorithm and Mathematics of Data Mining
Submission Deadline: June 24, 2022
Note: Only handwritten answer is accepted
Introduction
This assignment is designed to let you assess how much you know about the algorithms and
mathematics of clustering, classification and association rule mining. You are therefore asked to
provide the details (including the steps) for each of the following questions.
Task 1: Clustering.
1) Use the k-means clustering algorithm and Euclidean distance to cluster the following 8 examples
into 3 clusters:
Points X1 X2
P1 10 2
P2 5 2
P3 4 8
P4 8 5
P5 5 7
P6 4 6
P7 2 1
P8 9 4
Assume that points P1, P4 and P7 are initially selected as the cluster centers. The stopping
criterion is that the cluster centers no longer move.
a) Show all the necessary steps until convergence.
b) Show the grouping after convergence.
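A hand calculation for Question 1 can be cross-checked with a short script. The following is a minimal sketch in plain Python, not a reference solution: the function name `kmeans` and the `max_iter` safeguard are illustrative choices, and the loop stops when no point changes cluster, which is equivalent to the centers no longer moving.

```python
from math import dist

# Data from the table in Question 1.
points = {
    "P1": (10, 2), "P2": (5, 2), "P3": (4, 8), "P4": (8, 5),
    "P5": (5, 7), "P6": (4, 6), "P7": (2, 1), "P8": (9, 4),
}

def kmeans(points, centers, max_iter=100):
    """Lloyd's k-means with Euclidean distance; max_iter is only a safeguard."""
    centers = list(centers)
    assignment = None
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point goes to its nearest center.
        new_assignment = {
            name: min(range(len(centers)), key=lambda k: dist(p, centers[k]))
            for name, p in points.items()
        }
        if new_assignment == assignment:
            break  # no point changed cluster, so the centers will not move
        assignment = new_assignment
        # Update step: each center becomes the mean of its members.
        for k in range(len(centers)):
            members = [points[n] for n, c in assignment.items() if c == k]
            if members:
                centers[k] = tuple(sum(v) / len(members) for v in zip(*members))
        print(f"iteration {iteration}: {assignment} centers={centers}")
    return assignment, centers

assignment, centers = kmeans(points, [points["P1"], points["P4"], points["P7"]])
```

Printing the assignment and centers at every iteration gives the intermediate tables that part a) asks for, so each handwritten step can be compared line by line.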
2) Apply agglomerative clustering to group the data in the table above (Question 1) and
show the resulting dendrogram.
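The question does not name a linkage criterion, so the sketch below assumes single linkage (the distance between two clusters is the smallest distance between their members) with Euclidean distance. That choice, the function name `single_linkage` and the `n_clusters` parameter are assumptions made for illustration; a different linkage can produce a different dendrogram.

```python
from itertools import combinations
from math import dist

# Data from the table in Question 1.
points = {
    "P1": (10, 2), "P2": (5, 2), "P3": (4, 8), "P4": (8, 5),
    "P5": (5, 7), "P6": (4, 6), "P7": (2, 1), "P8": (9, 4),
}

def gap(a, b):
    """Single-linkage distance: closest pair of members across two clusters."""
    return min(dist(points[p], points[q]) for p in a for q in b)

def single_linkage(points, n_clusters=1):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [frozenset([name]) for name in points]
    merges = []  # (cluster A, cluster B, merge height), in merge order
    while len(clusters) > n_clusters:
        a, b = min(combinations(clusters, 2), key=lambda pair: gap(*pair))
        merges.append((sorted(a), sorted(b), gap(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters, merges

clusters3, _ = single_linkage(points, n_clusters=3)
_, merges = single_linkage(points)
for left, right, height in merges:
    print(f"merge {left} + {right} at height {height:.3f}")
```

The merge heights are the levels at which the dendrogram branches join. Several pairs are tied at the smallest distance, so the order of the first merges depends on tie-breaking, but the set of merge heights does not.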
3) Plot all the data points from Question 1 on a 10×10 square. Then, if the radius is 2 and the
minimum number of points is 2, which clusters would DBSCAN discover? (Show your
answer by circling all points that belong to the same cluster on the 10×10 square that you created
earlier.)
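The outcome of DBSCAN on this data depends on two conventions that the question leaves open. The sketch below assumes that a point lies in a neighbourhood when its distance is less than or equal to the radius, and that a point counts itself towards the minimum number of points (the convention scikit-learn documents for `min_samples`). The function name `dbscan` and its parameters are illustrative.

```python
from math import dist

# Data from the table in Question 1.
points = {
    "P1": (10, 2), "P2": (5, 2), "P3": (4, 8), "P4": (8, 5),
    "P5": (5, 7), "P6": (4, 6), "P7": (2, 1), "P8": (9, 4),
}

def dbscan(points, eps, min_pts):
    """Return {point name: cluster id}; points missing from the result are noise."""
    # Assumption: neighbourhoods are inclusive (<= eps) and contain the point itself.
    neighbours = {
        a: [b for b in points if dist(points[a], points[b]) <= eps]
        for a in points
    }
    core = {a for a, nb in neighbours.items() if len(nb) >= min_pts}
    labels = {}
    cluster_id = 0
    for start in points:
        if start in labels or start not in core:
            continue
        labels[start] = cluster_id
        stack = [start]
        while stack:
            current = stack.pop()
            if current not in core:
                continue  # border points join a cluster but do not expand it
            for nb in neighbours[current]:
                if nb not in labels:
                    labels[nb] = cluster_id
                    stack.append(nb)
        cluster_id += 1
    return labels

labels = dbscan(points, eps=2, min_pts=2)
noise = sorted(set(points) - set(labels))
print(labels, "noise:", noise)
```

If the course counts only the other points towards the minimum, P4 and P8 would each have a single neighbour and would no longer be core points, so it is worth stating the convention used next to the plot.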
And suppose that both rules satisfy the minimum support and minimum confidence requirements.
Do you think that the rule B → D will also satisfy the minimum confidence and minimum support
requirements? If your answer is yes, justify it with a proof; otherwise, give a counterexample.
3) Given the table of transactions below, show the results of using the Apriori algorithm with
support threshold s = 33.34% and confidence threshold c = 60%. Enumerate all the final frequent
itemsets, following the example provided in the lecture.
Transaction ID Items
T1 H, B, K
T2 H, B
T3 H, C, D
T4 D, C
T5 D, K
T6 H, C, D
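The level-wise search of Apriori on this table can be sketched as follows. This is a minimal illustration, not the lecture's exact procedure: the names `apriori` and `rules` are hypothetical, support is taken as the fraction of transactions containing an itemset, and the threshold is compared with `>=`. Note that with six transactions an itemset that appears twice has support 2/6 ≈ 33.33%, which is just below 33.34%.

```python
from itertools import combinations

# Transactions T1..T6 from the table above.
transactions = [
    {"H", "B", "K"}, {"H", "B"}, {"H", "C", "D"},
    {"D", "C"}, {"D", "K"}, {"H", "C", "D"},
]

def support(itemset):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def apriori(min_support):
    items = sorted(set().union(*transactions))
    level = [frozenset([i]) for i in items if support(frozenset([i])) >= min_support]
    frequent = {}
    k = 1
    while level:
        frequent.update({s: support(s) for s in level})
        k += 1
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must be frequent, then check support.
        level = [
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
            and support(c) >= min_support
        ]
    return frequent

def rules(frequent, min_conf):
    """Rules lhs -> rhs with confidence = support(lhs and rhs) / support(lhs)."""
    found = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(sorted(itemset), r)):
                conf = sup / frequent[lhs]
                if conf >= min_conf:
                    found.append((set(lhs), set(itemset - lhs), conf))
    return found

frequent = apriori(min_support=0.3334)
found_rules = rules(frequent, min_conf=0.60)
print(frequent)
print(found_rules)
```

Lowering `min_support` to 0.3333 lets the itemsets with count 2 through, which is a quick way to see how sensitive the answer is to the exact threshold.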
4) Construct the frequent pattern tree for the transaction data given in question 3.
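A hand-drawn FP-tree can be checked against a small script. The sketch below makes three assumptions that the question does not state: the minimum count is 3 (the smallest count that reaches the 33.34% threshold of Question 3 with six transactions), infrequent items are dropped before insertion, and ties in frequency are broken alphabetically. The `Node` class and `build_fp_tree` function are illustrative names.

```python
from collections import Counter

# Transactions T1..T6 from Question 3.
transactions = [
    {"H", "B", "K"}, {"H", "B"}, {"H", "C", "D"},
    {"D", "C"}, {"D", "K"}, {"H", "C", "D"},
]

class Node:
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}

def build_fp_tree(transactions, min_count):
    counts = Counter(item for t in transactions for item in t)
    frequent = {i: c for i, c in counts.items() if c >= min_count}
    # Assumption: descending frequency, ties broken alphabetically.
    order = sorted(frequent, key=lambda i: (-frequent[i], i))
    root = Node(None)
    for t in transactions:
        node = root
        # Insert the frequent items of the transaction along one path.
        for item in (i for i in order if i in t):
            if item not in node.children:
                node.children[item] = Node(item, node)
            node = node.children[item]
            node.count += 1
    return root, order

def show(node, depth=0):
    """Print the tree with one indented line per node."""
    for child in node.children.values():
        print("  " * depth + f"{child.item}:{child.count}")
        show(child, depth + 1)

root, order = build_fp_tree(transactions, min_count=3)
show(root)
```

A different tie-breaking rule between H and D (both appear four times) yields a differently shaped but equally valid tree, so the ordering used should be written next to the drawing.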
The goal of this task is to classify the data using a Decision Tree classifier.
a) Given the training data, construct the rule (Decision Tree classifier). Draw the flow chart of the
rule at the end.
Training data
Points x1 x2 y
P1 1 0 0
P2 1 1 0
P3 0 0 1
P4 0 0 1
P5 1 1 0
P6 1 1 0
P7 1 0 0
b) Using the rule that you extracted, predict the class label of P9 (0, 1).
P9 0 1 ?
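The task does not say which splitting criterion to use. The sketch below assumes information gain (entropy reduction, as in ID3) and builds the tree recursively; the names `entropy`, `info_gain`, `build` and `predict` are illustrative, and feature values that never appear in the training data are not handled.

```python
from collections import Counter
from math import log2

# Training data as (x1, x2, y), rows P1..P7 from the table above.
rows = [
    (1, 0, 0), (1, 1, 0), (0, 0, 1), (0, 0, 1),
    (1, 1, 0), (1, 1, 0), (1, 0, 0),
]

def entropy(labels):
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def info_gain(rows, col):
    """Entropy of y minus the weighted entropy after splitting on column col."""
    remainder = 0.0
    for value in {r[col] for r in rows}:
        subset = [r[-1] for r in rows if r[col] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([r[-1] for r in rows]) - remainder

def build(rows, cols):
    """Return a class label (leaf) or (column index, {value: subtree})."""
    labels = [r[-1] for r in rows]
    if len(set(labels)) == 1 or not cols:
        return Counter(labels).most_common(1)[0][0]
    best = max(cols, key=lambda c: info_gain(rows, c))
    rest = [c for c in cols if c != best]
    return (best, {
        v: build([r for r in rows if r[best] == v], rest)
        for v in {r[best] for r in rows}
    })

def predict(tree, x):
    while isinstance(tree, tuple):
        col, branches = tree
        tree = branches[x[col]]
    return tree

tree = build(rows, [0, 1])  # column 0 is x1, column 1 is x2
print("gain x1:", info_gain(rows, 0), "gain x2:", info_gain(rows, 1))
print("tree:", tree, "P9 ->", predict(tree, (0, 1)))
```

Comparing the two printed gains shows why one feature is chosen at the root, which is the reasoning the handwritten answer needs to spell out.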