DWDM MID - 2 Question Paper and Online Bits

DATA WAREHOUSE AND DATA MINING

K.VARA PRASAD
III B.TECH I SEM CSE-1 & 2
MID – II
Answer the following questions: 10 x 3 = 30 Marks

1. (a) What is cluster analysis? Explain the types of cluster analysis.

(b) Suppose that the data mining task is to cluster the following points into three clusters:
A1 = (2, 10), A2 = (2, 5), A3 = (8, 4), B1 = (5, 8), B2 = (7, 5), B3 = (6, 4), C1 = (1, 2), C2 = (4, 9).
The distance function is Euclidean distance. Suppose we initially assign A1, B1, and C1 as the
center of each cluster, respectively. Use the k-means algorithm to show only the three
cluster centers after the first round of execution.
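One round of k-means on the points above can be sketched as follows. This is a minimal illustration, not a model answer: the helper name `kmeans_round` is hypothetical, and the sketch assumes that one "round" means a single assignment step followed by a single center update.

```python
from math import dist

# Points and initial centers taken from the question.
points = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4),
    "B1": (5, 8), "B2": (7, 5), "B3": (6, 4),
    "C1": (1, 2), "C2": (4, 9),
}
initial_centers = [points["A1"], points["B1"], points["C1"]]


def kmeans_round(points, centers):
    """Run one assignment step and one update step (hypothetical helper)."""
    clusters = [[] for _ in centers]
    # Assignment step: each point joins the cluster of its nearest center.
    for name, p in points.items():
        nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
        clusters[nearest].append(name)
    # Update step: each center moves to the mean of its members.
    new_centers = []
    for members, old in zip(clusters, centers):
        if not members:
            new_centers.append(old)  # keep the old center for an empty cluster
            continue
        xs = [points[m][0] for m in members]
        ys = [points[m][1] for m in members]
        new_centers.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return clusters, new_centers


clusters, new_centers = kmeans_round(points, initial_centers)
for members, center in zip(clusters, new_centers):
    print(members, "->", center)
```

Under these assumptions the three centers after the first round come out as (2, 10), (6, 6), and (1.5, 3.5).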
2. The given data is a hypothetical dataset of transactions, with each letter representing an item.
Let the minimum support be 3.
TID   Items
100   f, a, c, d, g, i, m, p
200   a, b, c, f, l, m, o
300   b, f, h, j, o
400   b, c, k, s, p
500   a, f, c, e, l, p, m, n
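The question as printed does not say which mining task to perform on this dataset. Whatever the intended task, a common first step in frequent-pattern mining is to count the support of each single item and keep those that meet the minimum support. The sketch below shows only that step; the variable names are illustrative.

```python
from collections import Counter

# Transactions copied from the table above (TID -> items).
transactions = {
    100: ["f", "a", "c", "d", "g", "i", "m", "p"],
    200: ["a", "b", "c", "f", "l", "m", "o"],
    300: ["b", "f", "h", "j", "o"],
    400: ["b", "c", "k", "s", "p"],
    500: ["a", "f", "c", "e", "l", "p", "m", "n"],
}
MIN_SUPPORT = 3  # minimum support count given in the question

# Count in how many transactions each item appears.
item_counts = Counter(item for items in transactions.values() for item in set(items))

# Keep only the items whose support count reaches the threshold.
frequent_items = {item: n for item, n in item_counts.items() if n >= MIN_SUPPORT}

# Print in descending support order, breaking ties alphabetically.
for item, n in sorted(frequent_items.items(), key=lambda kv: (-kv[1], kv[0])):
    print(item, n)
```

Six items survive the threshold: c and f with a count of 4, and a, b, m, and p with a count of 3.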

3. For the given dataset, apply the Apriori algorithm to discover strong association rules among
the items.
The database has five transactions. Let min_sup = 40% and min_conf = 70%.
Generate association rules from the frequent itemsets. Calculate the confidence of each rule and
identify all the strong association rules.

Transaction ID   Items Bought
1                {Bread, Butter, Milk}
2                {Bread, Butter}
3                {Beer, Cookies, Diapers}
4                {Milk, Diapers, Bread, Butter}
5                {Beer, Diapers}
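The level-wise search that Apriori performs on this database, followed by rule generation, can be sketched as below. It is a minimal illustration under stated assumptions rather than a reference implementation: the function names are hypothetical, support is taken as the fraction of transactions containing an itemset, and the confidence of X -> Y is taken as the support count of X together with Y divided by the support count of X.

```python
from itertools import combinations

# Transactions copied from the table above.
transactions = [
    {"Bread", "Butter", "Milk"},
    {"Bread", "Butter"},
    {"Beer", "Cookies", "Diapers"},
    {"Milk", "Diapers", "Bread", "Butter"},
    {"Beer", "Diapers"},
]
MIN_SUP = 0.40   # minimum support from the question
MIN_CONF = 0.70  # minimum confidence from the question


def support_count(itemset):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def apriori(min_sup):
    """Return {frequent itemset: support count}, found level by level."""
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items
             if support_count(frozenset([i])) / n >= min_sup]
    frequent = {}
    k = 1
    while level:
        for s in level:
            frequent[s] = support_count(s)
        k += 1
        prev = set(level)
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: drop candidates that have an infrequent (k-1)-subset,
        # then check the remaining candidates against the database.
        level = [c for c in candidates
                 if all(frozenset(sub) in prev for sub in combinations(c, k - 1))
                 and support_count(c) / n >= min_sup]
    return frequent


def strong_rules(frequent, min_conf):
    """Return (antecedent, consequent, confidence) for every strong rule."""
    rules = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                a = frozenset(antecedent)
                conf = count / frequent[a]
                if conf >= min_conf:
                    rules.append((a, itemset - a, conf))
    return rules


frequent = apriori(MIN_SUP)
rules = strong_rules(frequent, MIN_CONF)
for a, c, conf in rules:
    print(sorted(a), "->", sorted(c), f"confidence = {conf:.2f}")
```

With these thresholds the sketch finds ten frequent itemsets, the largest being {Bread, Butter, Milk}, and eight strong rules, all with confidence 1.0. Every other candidate rule has confidence 2/3, which falls below the 70% threshold.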
MID – II ONLINE BITS
1. If a rule concerns associations between the presence or absence of items, it is a _______rule.
a) Boolean Association b) Quantitative Association
c) Frequent Association d) Transaction Association

2. If the transaction data is the minimum support count is _________


a) 1 b) 2 c) 3 d) 4

3. Which step could involve huge computations?


a) join step b) prune step c) calculation step d) logical step

4. While __________ predicts class labels, _________ models continuous-valued functions.


a) Prediction – Classification b) Classification – Prediction
c) Speed – Scalability d) Scalability – Speed
5. In _________________ the class label of each training sample is not known, and the number
or set of classes to be learned may not be known in advance.
a) Supervised learning b) Unsupervised learning
c) Authorized learning d) Unauthorized learning

6. Decision trees can easily be converted to _________ rules.


a) IF b) Nested IF c) IF-THEN d) GROUP BY

7. In a decision tree, __________ represents an outcome of the test.


a) internal node b) branch c) leaf node d) root

8. In how many approaches does tree pruning work?


a) 1 b) 2 c) 3 d) 4

9. The agglomerative approach is also called the ____________ approach.


a) top – down b) bottom – up c) Sequential d) Random

10. ______________ methods quantize the object space into a finite number of cells that form a
grid structure.
a) hierarchical methods b) density-based methods
c) grid-based methods d) model-based methods

11. Clustering Large Applications can be abbreviated as ____________


a) CLA b) CLAPP c) CLARA d) CLULA
12. Which one of the following statements about k-means clustering is incorrect?
a) The goal of k-means clustering is to partition n observations into k clusters
b) K-means clustering can be defined as the method of quantization
c) The nearest neighbor is the same as the K-means
d) All of the above

13. The Euclidean distance measure can also be defined as ___________


a) The process of finding a solution for a problem simply by enumerating all possible solutions
according to some predefined order and then testing them
b) The distance between two points as calculated using the Pythagoras theorem
c) A stage of the KDD process in which new data is added to the existing selection.
d) All of the above

14. In ________ algorithm each cluster is represented by the center of gravity of the cluster.
a) k-medoid b) k-means c) STIRR d) ROCK
15. Which data mining technique is appropriate for undirected data mining?
a) Association Rules b) Statistical c) Decision Tree d) Neural Network
16. Which of the following is not a frequent pattern mining algorithm?
a) Apriori b) FP-growth c) Decision trees d) Eclat

17. How do you calculate confidence (A -> B)?


a) Support (A ∩ B) / Support (A)
b) Support (A ∩ B) / Support (B)
c) Support (A ∪ B) / Support (A)
d) Support (A ∪ B) / Support (B)

18. What does the FP-growth algorithm do?


a) It mines all frequent patterns through pruning rules with lesser support
b) It mines all frequent patterns through pruning rules with higher support
c) It mines all frequent patterns by constructing an FP-tree
d) It mines all frequent patterns by constructing itemsets

19. When do you consider an association rule interesting?


a) If it only satisfies min_support
b) If it only satisfies min_confidence
c) If it satisfies both min_support and min_confidence
d) There are other measures to check so

20. Which of the following is not a measure of similarity used in clustering?


a) Euclidean distance
b) Manhattan distance
c) Cosine similarity
d) Entropy
