Data Mining Practice Exercises
Data mining (University of Economics and Law, Vietnam National University, Ho Chi Minh City)
Exercise 2
TID  Items
1    T, N, F, N, J
2    U, F, G, H, P, Q
3    Z, N, G, H, I
4    W, E, I
5    W, E, F, K
6    P, U, F
2a. Use the Apriori algorithm to find all frequent itemsets with a minimum support count of minsup = 2.
2b. Use the FP-Growth algorithm to build the final FP-tree and find the association rules formed from frequent 3-itemsets.
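The level-wise search asked for in 2a can be checked with a short script. This is a minimal sketch, not a reference solution: it assumes minsup is an absolute support count, treats each transaction as a set (so the repeated N in transaction 1 counts once), and the function name `apriori` is chosen here for illustration.

```python
from itertools import combinations


def apriori(transactions, minsup):
    """Return {itemset: support count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= minsup}
    frequent = dict(current)

    k = 2
    while current:
        prev = set(current)
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count step: scan the transactions for each surviving candidate.
        current = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t)
            if support >= minsup:
                current[c] = support
        frequent.update(current)
        k += 1
    return frequent


# Transactions 1 to 6 from the table above.
transactions = [
    {"T", "N", "F", "J"},
    {"U", "F", "G", "H", "P", "Q"},
    {"Z", "N", "G", "H", "I"},
    {"W", "E", "I"},
    {"W", "E", "F", "K"},
    {"P", "U", "F"},
]

frequent = apriori(transactions, minsup=2)
for itemset, support in sorted(frequent.items(), key=lambda x: (len(x[0]), sorted(x[0]))):
    print(sorted(itemset), support)
```

The script prints each frequent itemset with its support count, grouped by size, which can be compared against a hand-worked candidate table.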
Exercise 3 Use single-link agglomerative clustering to group the data described by the following
distance matrix. Show the dendrogram.
     A    B    C    D    E    F    G
A    0  150   72   26  100  104  130
B         0   74   36  150   34   20
C              0   50    4   14  106
D                   0   26   34  104
E                        0   24  190
F                             0  158
G                                  0
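The merge order for the dendrogram can be verified with a small pure-Python routine. This is a naive sketch of single-link clustering (cluster distance is the minimum pairwise distance between members). The helper names `dist` and `single_link` are illustrative, and ties are broken by scan order, which can change the order of equal-height merges but not their heights.

```python
# Upper triangle of the distance matrix above.
upper = {
    ("A", "B"): 150, ("A", "C"): 72, ("A", "D"): 26, ("A", "E"): 100,
    ("A", "F"): 104, ("A", "G"): 130,
    ("B", "C"): 74, ("B", "D"): 36, ("B", "E"): 150, ("B", "F"): 34,
    ("B", "G"): 20,
    ("C", "D"): 50, ("C", "E"): 4, ("C", "F"): 14, ("C", "G"): 106,
    ("D", "E"): 26, ("D", "F"): 34, ("D", "G"): 104,
    ("E", "F"): 24, ("E", "G"): 190,
    ("F", "G"): 158,
}
labels = ["A", "B", "C", "D", "E", "F", "G"]


def dist(x, y):
    """Symmetric lookup into the upper-triangular table."""
    return upper[(x, y)] if (x, y) in upper else upper[(y, x)]


def single_link(labels, dist):
    """Return the list of merges as (height, cluster_1, cluster_2)."""
    clusters = [frozenset([label]) for label in labels]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single link: the closest pair of members defines the distance.
                d = min(dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((d, clusters[i], clusters[j]))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


merges = single_link(labels, dist)
for height, left, right in merges:
    print(height, sorted(left), "+", sorted(right))
```

Each printed line is one horizontal bar of the dendrogram: the height at which two clusters join and the members on each side.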
Exercise 4 A document drawn from a collection of 80,000 documents contains the 5 terms given in
the table below. Calculate the TF-IDF value of each term. Which term has the lowest TF-IDF
value?
Term    Frequency in current document    Number of documents containing term
Bird    80                               1000
Tiger   40                               1500
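The calculation can be sketched as follows for the rows that appear in the table above. The exercise does not state which weighting variant to use, so this sketch assumes raw term frequency and a base-10 logarithm, TF-IDF = tf × log10(N / df). A different log base rescales every value by the same factor and does not change the ranking.

```python
import math

N_DOCS = 80_000  # collection size given in the exercise

# term: (frequency in current document, number of documents containing term)
terms = {
    "Bird": (80, 1000),
    "Tiger": (40, 1500),
}


def tf_idf(tf, df, n_docs=N_DOCS):
    """Raw term frequency times log10 inverse document frequency (assumed variant)."""
    return tf * math.log10(n_docs / df)


scores = {term: tf_idf(tf, df) for term, (tf, df) in terms.items()}
for term, score in scores.items():
    print(f"{term}: {score:.2f}")
```

Further terms can be added to the `terms` dictionary in the same (tf, df) form, and `min(scores, key=scores.get)` then gives the term with the lowest value.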
Bias Weights
Layer     Value
Hidden    -0.35
Output    0.25