
Practice Questions

1. Why is tree pruning useful in decision tree induction? What is the drawback of using a
separate set of tuples to evaluate pruning?
2. Why is naïve Bayesian classification called “naïve”? Briefly outline the major ideas of
naïve Bayesian classification.
3. What is Boosting? State why it may improve the accuracy of decision tree induction.
4. Compare the advantages and disadvantages of eager classification (e.g., decision tree,
Bayesian, neural network) versus lazy classification (e.g., k-nearest neighbor, case-based
reasoning).
5. Briefly describe and give examples of each of the following approaches to clustering:
partitioning methods, hierarchical methods, density-based methods, and grid-based
methods.
6. Both the k-means and k-medoids algorithms can perform effective clustering. Illustrate the
strengths and weaknesses of k-means in comparison with the k-medoids algorithm.
7. The support vector machine (SVM) is a highly accurate classification method. However,
SVM classifiers suffer from slow processing when training with a large set of data tuples.
Discuss how to overcome this difficulty and develop a scalable SVM algorithm for
efficient SVM classification in large datasets.
8. Consider the following confusion matrix:

Calculate the following for this matrix (a formula sketch is given after the question list):


a) Accuracy
b) Precision
c) Recall
d) F1 Score
e) G-mean
9.
10. The average class accuracy (harmonic mean)
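
For reference, here is a minimal Python sketch of how the metrics asked for in questions 8 and 10 can be computed from a binary confusion matrix. The counts tp, fp, fn, tn are hypothetical placeholders, since the matrix itself is not reproduced above; substitute the actual values from the question.

from math import sqrt

# Hypothetical confusion-matrix counts (the original matrix is not shown here);
# replace these with the actual values from question 8.
tp, fp, fn, tn = 40, 10, 5, 45

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)                    # sensitivity, recall of the positive class
specificity = tn / (tn + fp)               # recall of the negative class
f1 = 2 * precision * recall / (precision + recall)
g_mean = sqrt(recall * specificity)        # geometric mean of the per-class recalls
# Average class accuracy taken as the harmonic mean of the per-class recalls:
avg_class_accuracy = 2 / (1 / recall + 1 / specificity)

print(f"Accuracy:  {accuracy:.3f}")
print(f"Precision: {precision:.3f}")
print(f"Recall:    {recall:.3f}")
print(f"F1 score:  {f1:.3f}")
print(f"G-mean:    {g_mean:.3f}")
print(f"Average class accuracy (harmonic mean): {avg_class_accuracy:.3f}")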
