
1. Explain the ID3 algorithm for classification.

ANS:
ID3 stands for Iterative Dichotomiser 3. It is so named because the
algorithm iteratively (repeatedly) dichotomizes (divides) the data on a
feature into two or more groups at each step.
Invented by Ross Quinlan, ID3 uses a top-down greedy approach to build a
decision tree. In simple words, top-down means that we start building the
tree from the root, and greedy means that at each iteration we select the
feature that is best at the present moment to create a node, without
revisiting earlier choices.
ID3 is generally used only for classification problems with nominal
(categorical) features.

ID3 Steps:
1. Calculate the Information Gain of each feature.

2. If the rows do not all belong to the same class, split the
dataset S into subsets using the feature for which the Information
Gain is maximum.

3. Make a decision tree node using the feature with the maximum
Information Gain.

4. If all rows belong to the same class, make the current node a leaf
node with that class as its label.

5. Repeat on each subset with the remaining features until we run out
of features or every branch of the decision tree ends in a leaf node.
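
The steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names, the toy weather rows, and the nested-dict tree format are all assumptions made for the example. Information Gain is taken here as the entropy of the class labels minus the weighted entropy of the subsets produced by a split, and when no features remain the sketch falls back to the majority class, which is a common convention but is not stated in the steps above.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, feature):
    """Entropy of the labels minus the weighted entropy after splitting on `feature`."""
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[feature], []).append(label)
    remainder = sum(len(s) / len(labels) * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder

def id3(rows, labels, features):
    """Return a class label (leaf) or a nested dict {feature: {value: subtree}}."""
    if len(set(labels)) == 1:      # step 4: all rows share one class -> leaf
        return labels[0]
    if not features:               # assumption: majority class when features run out
        return Counter(labels).most_common(1)[0][0]
    # Steps 1-3: pick the feature with the maximum Information Gain as the node.
    best = max(features, key=lambda f: information_gain(rows, labels, f))
    remaining = [f for f in features if f != best]
    tree = {best: {}}
    for value in set(row[best] for row in rows):
        pairs = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows = [r for r, _ in pairs]
        sub_labels = [l for _, l in pairs]
        # Step 5: repeat on each subset with the remaining features.
        tree[best][value] = id3(sub_rows, sub_labels, remaining)
    return tree

# Hypothetical toy dataset with two nominal features.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

tree = id3(rows, labels, ["outlook", "windy"])
print(tree)
```

In this toy data "outlook" separates the classes perfectly (gain of 1 bit) while "windy" gives no gain, so the sketch places "outlook" at the root and both branches become leaves.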

2. Discuss various data mining tools.


ANS:
Data mining is the process of finding patterns and relationships in large
amounts of data. It’s an advanced data analysis technique, combining machine
learning and AI to extract useful information, which helps businesses learn
more about customers’ needs, increase revenues, reduce costs, improve
customer relationships, and more.
Various Data Mining Tools:

1. MonkeyLearn: No-code text mining tools
2. RapidMiner: Drag and drop workflows or data mining in Python
3. Oracle Data Mining: Predictive data mining models
4. IBM SPSS Modeler: A predictive analytics platform for data scientists
5. Weka: Open-source software for data mining
6. KNIME: Pre-built components for data mining projects
7. H2O: Open-source library offering data mining in Python
8. Orange: Open-source data mining toolbox
9. Apache Mahout: Ideal for complex and large-scale data mining
10. SAS Enterprise Miner: Solve business problems with data mining

3. Explain the k-means clustering algorithm. Suppose the data for clustering is
{1, 3, 5, 15, 23, 11, 25}. Consider k = 2 and perform clustering using the
k-means algorithm.
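
One way to work this exercise is sketched below in Python. The question does not specify the initial centroids, so the sketch assumes the first two values (1 and 3) are used; a different starting choice can lead to different intermediate steps. The function name `kmeans_1d` and its parameters are hypothetical. Each pass assigns every point to its nearest centroid and then moves each centroid to the mean of its cluster, stopping when the centroids no longer change.

```python
def kmeans_1d(points, centroids, max_iters=100):
    """k-means on 1-D data; returns (clusters, centroids)."""
    for _ in range(max_iters):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]  # keep old centroid if cluster is empty
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:  # no change -> converged
            break
        centroids = new_centroids
    return clusters, centroids

data = [1, 3, 5, 15, 23, 11, 25]
# Assumption: initial centroids are the first two data values.
clusters, centroids = kmeans_1d(data, centroids=[1, 3])
print(clusters)   # [[1, 3, 5], [15, 23, 11, 25]]
print(centroids)  # [3.0, 18.5]
```

Under this assumption the first pass gives {1} and {3, 5, 15, 23, 11, 25} with means 1 and about 13.67; the second pass gives {1, 3, 5} and {15, 23, 11, 25} with means 3 and 18.5; the third pass changes nothing, so the algorithm stops there.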
